Title
Towards data-driven ontologies: A filtering approach using keywords and natural language constructs
Author
de Boer, M.H.T.
Verhoosel, J.P.C.
Publication year
2020
Abstract
Creating ontologies is an expensive task. Our vision is to automatically generate ontologies from a set of relevant documents, giving ontology creation sessions a kick-start. In this paper, we focus on enhancing two often-used methods, OpenIE and co-occurrences (Cooc). We evaluate the methods on two document sets, one about pizza and one about the agriculture domain. The methods are evaluated using two types of F1-score (objective, quantitative) and through a human assessment (subjective, qualitative). The results show that 1) Cooc performs better than OpenIE, both objectively and subjectively; 2) the filtering methods based on keywords and on Word2vec perform similarly; 3) both filtering methods perform better than OpenIE and similarly to Cooc; 4) Cooc-NVP performs best, especially in the subjective evaluation. Although the investigated methods provide a good start for extracting an ontology from a set of domain documents, various improvements are still possible, especially in the natural-language-based methods.
Subject
Knowledge representation
Ontologies
Text mining
Agricultural robots
Co-occurrence
Data driven
Document sets
Filtering method
Human assessment
Natural languages
Relevant documents
Subjective evaluations
Ontology
To reference this document use:
http://resolver.tudelft.nl/uuid:d8b1c45b-ba4f-4312-b123-317a6187f120
TNO identifier
884285
Publisher
European Language Resources Association (ELRA)
ISBN
9791095546344
Source
LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings, 11 May 2020 through 16 May 2020, 2285-2292
Document type
conference paper