JUCS - Journal of Universal Computer Science 24(1): 2-24, doi: 10.3217/jucs-024-01-0002
GerIE - An Open Information Extraction System for the German Language
Akim Bassa‡, Mark Kroll§, Roman Kern§
‡ Unycom GmbH, Graz, Austria
§ Know-Center, Graz, Austria
Open Access
Abstract
Open Information Extraction (OIE) extracts relations from text without requiring domain-specific training data. To date, most OIE research has focused on English, and little or no research has been conducted on other languages, including German. To address this gap, we developed GerIE, an OIE system for the German language. We surveyed the OIE literature to identify concepts that may apply to German. Our system applies a number of handcrafted rules to the output of a German dependency parser to extract propositions. To evaluate the system, we created two dedicated datasets: one derived from news articles and the other from encyclopedia texts. Our system achieves F-measures of up to 0.89 on correctly preprocessed sentences.
Keywords
open information extraction, fact extraction, German language