Data-driven Update of AGROVOC Using Agricultural Text Corpora
conference paper
AGROVOC is a well-known multilingual controlled vocabulary covering the fields of agriculture, forestry, fisheries, and food. It is used for dataset annotation, literature indexing, and automated text tagging, and its effective use depends on continuous updating. Currently, updates are made manually by a dispersed community of editors. In this paper, we present work towards automated update recommendations using large corpora of agricultural text (such as the AGRIS database). The work is based on extracting agricultural concept mentions from text with custom-trained Named Entity Recognition models, and on using Graph Neural Networks to recommend concept and relation additions, with the aim of predicting future AGROVOC states. The research questions and methodology are presented together with the results of an initial experiment, and the next steps and future research directions are outlined. This work forms part of a PhD research project on monitoring and predicting changes in knowledge graphs utilising textual data. (C) 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
TNO Identifier
980505
ISSN
1613-0073
Publisher
CEUR-WS
Source title
CEUR Workshop Proceedings
Editor(s)
Theodoridis A., Koutsou S.
Pages
260-265
Files
To receive the publication files, please send an e-mail request to TNO Repository.