JUCS - Journal of Universal Computer Science 24(11): 1651-1676, doi: 10.3217/jucs-024-11-1651

Human Language Technologies: Key Issues for Representing Knowledge from Textual Information

Yoan Gutiérrez^‡, Elena Lloret^‡, José M. Gómez^‡

‡ University of Alicante, San Vicente del Raspeig, Spain

Corresponding author: Yoan Gutiérrez (ygutierrez@dlsi.ua.es)

This article is freely available under the J.UCS Open Content License.

Citation: Gutiérrez Y, Lloret E, Gómez JM (2018) Human Language Technologies: Key Issues for Representing Knowledge from Textual Information. JUCS - Journal of Universal Computer Science 24(11): 1651-1676. https://doi.org/10.3217/jucs-024-11-1651

Abstract

Ontologies are appropriate structures for capturing and representing the knowledge about a domain or task. However, designing and subsequently populating them are both difficult tasks, normally addressed manually or semi-automatically. The goal of this article is to define and extend a task-oriented ontology schema that semantically represents the information contained in texts. This information can be extracted using Human Language Technologies, and throughout this work the whole process of designing such an ontology schema is described. We then describe an algorithm to automatically populate ontologies based on our Human Language Technology oriented schema, avoiding the unnecessary duplication of instances and yielding the required information in a more compact and useful format, ready to exploit. Tangible results are provided, such as permanent online access points to the ontology schema, an example bucket (i.e. an ontology instance repository) based on a real scenario, and a documentation Web page.

Keywords

ontology development, ontology population, human language technologies, semantic package, knowledge engineering