From Text to Knowledge Graph: Comparing Relation Extraction Methods in a Practical Context
conference paper
Knowledge graphs provide structure and semantic context to unstructured data. Creating them is labour-intensive: it requires close collaboration between graph developers and domain experts. Previous work has therefore attempted to automate (parts of) this process using information extraction methods. This paper presents a comparative analysis of relation extraction methods, with the goal of automated knowledge graph extraction. The contributions of this paper are two-fold: 1) the creation of a small dataset containing different versions of a news message annotated with triples, and 2) a comprehensive comparison of relation extraction methods within the context of this dataset. The primary objective is to assess these methods in a real-life use case, where the resulting graph should approach the quality achievable through manual development. Prior methodologies often relied on automatically extracted datasets and a limited range of relation types, which constrains the expressivity and richness of the resulting graphs. Furthermore, these datasets typically feature short or simplified sentences and fail to reflect the complexity of real-world texts such as news messages or research papers. The results show that GPT models outperform the other relation extraction methods we tested. However, the qualitative analysis performed in addition to the evaluation metrics showed that alternative approaches such as REBEL and KnowGL have strengths in leveraging external world knowledge to enrich the graph beyond the textual content alone. This finding underscores the importance of considering a variety of methods that not only excel at extracting relations directly from text but also incorporate supplementary knowledge sources to enhance the richness and depth of the resulting knowledge graph.
TNO Identifier
1003244
Source title
GeNeSy’24: First International Workshop on Generative Neuro-Symbolic AI, co-located with ESWC 2024, May 26, 2024, Hersonissos, Crete, Greece