Towards data-driven ontologies: A filtering approach using keywords and natural language constructs
                                            conference paper
                                        
                                    
                                            Creating ontologies is an expensive task. Our vision is to automatically generate ontologies from a set of relevant documents in order to kick-start ontology creation sessions. In this paper, we focus on enhancing two often-used methods, OpenIE and co-occurrences (Cooc). We evaluate the methods on two document sets, one about pizza and one about the agriculture domain. The methods are evaluated using two types of F1-score (objective, quantitative) and through a human assessment (subjective, qualitative). The results show that 1) Cooc performs better than OpenIE both objectively and subjectively; 2) the filtering methods based on keywords and on Word2vec perform similarly; 3) both filtering methods perform better than OpenIE and similarly to Cooc; 4) Cooc-NVP performs best, especially in the subjective evaluation. Although the investigated methods provide a good start for extracting an ontology from a set of domain documents, various improvements are still possible, especially in the natural-language-based methods.
                                        
                                    TNO Identifier
                                        
                                            884285
                                        
                                    ISBN
                                        
                                            9791095546344
                                        
                                    Publisher
                                        
                                            European Language Resources Association (ELRA)
                                        
                                    Source title
                                        
                                            LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings, 11 May 2020 through 16 May 2020
                                        
                                    Pages
                                        
                                            2285-2292
                                        
                                    Files
                                        
                                            
                                                To receive the publication files, please send an e-mail request to TNO Repository.