Towards data-driven ontologies: A filtering approach using keywords and natural language constructs
de Boer, M.H.T.
Creating ontologies is an expensive task. Our vision is to automatically generate ontologies from a set of relevant documents, providing a kick-start for ontology creation sessions. In this paper, we focus on enhancing two frequently used methods, OpenIE and co-occurrences (Cooc). We evaluate the methods on two document sets, one about pizza and one about the agriculture domain. The methods are evaluated using two types of F1-score (objective, quantitative) and through a human assessment (subjective, qualitative). The results show that 1) Cooc performs better than OpenIE both objectively and subjectively; 2) the filtering methods based on keywords and on Word2vec perform similarly; 3) both filtering methods perform better than OpenIE and similarly to Cooc; 4) Cooc-NVP performs best, especially in the subjective evaluation. Although the investigated methods provide a good start for extracting an ontology from a set of domain documents, various improvements are still possible, especially in the natural-language-based methods.
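As a rough illustration of the kind of pipeline the abstract describes, the following minimal sketch counts keyword pairs that co-occur in the same sentence and scores them against a reference set with a set-based F1. It is not the paper's implementation: the function names, the sentence-level window, the whitespace tokenization, the toy pizza sentences, and the reference pairs are all assumptions made for this example, and the abstract does not specify how its two F1 variants are computed.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_pairs(sentences, keywords, min_count=1):
    """Return unordered keyword pairs seen together in at least min_count sentences.

    Hypothetical helper: the window is one sentence and tokens are split on
    whitespace, which is an assumption rather than the paper's setup.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        present = sorted(tokens & keywords)  # sorted so each pair has one canonical order
        counts.update(combinations(present, 2))
    return {pair for pair, n in counts.items() if n >= min_count}


def f1_score(predicted, reference):
    """Set-based F1 between predicted and reference pairs (one possible definition)."""
    if not predicted or not reference:
        return 0.0
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only.
sentences = [
    "a pizza has a base and a topping",
    "mozzarella is a cheese topping",
    "the base is thin",
]
keywords = {"pizza", "base", "topping", "mozzarella", "cheese"}
reference = {("base", "pizza"), ("pizza", "topping"), ("cheese", "mozzarella")}

predicted = cooccurrence_pairs(sentences, keywords)
score = f1_score(predicted, reference)
print(sorted(predicted))
print(round(score, 3))
```

Raising `min_count` acts as a simple frequency filter on the extracted pairs; keyword or embedding-based filtering of the kind the abstract mentions would replace or complement that step.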
European Language Resources Association (ELRA)
LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings, 11 May 2020 through 16 May 2020, pp. 2285-2292