Towards data-driven ontologies: A filtering approach using keywords and natural language constructs

conference paper
Creating ontologies is an expensive task. Our vision is that ontologies can be generated automatically from a set of relevant documents to kick-start ontology creation sessions. In this paper, we focus on enhancing two frequently used methods, OpenIE and co-occurrences (Cooc). We evaluate the methods on two document sets, one about pizza and one about the agriculture domain. The methods are evaluated using two types of F1-score (objective, quantitative) and through a human assessment (subjective, qualitative). The results show that 1) Cooc performs better than OpenIE both objectively and subjectively; 2) the filtering methods based on keywords and on Word2vec perform similarly; 3) both filtering methods perform better than OpenIE and similarly to Cooc; 4) Cooc-NVP performs best, especially in the subjective evaluation. Although the investigated methods provide a good start for extracting an ontology from a set of domain documents, various improvements are still possible, especially in the natural-language-based methods.
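To make the co-occurrence idea and the F1-based evaluation concrete, the following is a minimal sketch and not the authors' implementation. It assumes that co-occurrence means two keywords appearing in the same sentence, and that F1 is computed over sets of unordered term pairs against a hand-made gold set; the abstract does not specify either detail. The function names, the sentences, the keyword list, and the gold pairs are all hypothetical toy data.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_pairs(sentences, keywords, min_count=1):
    """Return unordered keyword pairs seen together in at least min_count sentences.

    Assumption: whitespace tokenization and a sentence-level window.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        # Keyword filter: keep only tokens that are in the keyword list.
        present = sorted(tokens & keywords)
        counts.update(combinations(present, 2))
    return {pair for pair, n in counts.items() if n >= min_count}


def f1_score(predicted, gold):
    """Set-based F1 of predicted pairs against gold pairs (an assumed variant)."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy corpus, keywords, and gold relations.
sentences = [
    "margherita pizza has tomato and mozzarella",
    "tomato sauce goes on the pizza base",
    "mozzarella is a cheese",
]
keywords = {"pizza", "tomato", "mozzarella", "cheese"}
gold = {("mozzarella", "pizza"), ("pizza", "tomato"), ("cheese", "mozzarella")}

predicted = cooccurrence_pairs(sentences, keywords)
score = f1_score(predicted, gold)
print(sorted(predicted))
print(round(score, 3))
```

Raising `min_count` trades recall for precision, which is one simple way a filtering step can change the F1 of the extracted candidate relations.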
TNO Identifier
884285
ISBN
9791095546344
Publisher
European Language Resources Association (ELRA)
Source title
LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings, 11 May 2020 through 16 May 2020
Pages
2285-2292
Files
To receive the publication files, please send an e-mail request to TNO Repository.