JUCS - Journal of Universal Computer Science 31(5): 519-549, doi: 10.3897/jucs.137103

Enhancing Knowledge Graph Construction with Automated Source Evaluation Using Large Language Models

Hendrik Hendrik^‡, Silmi Fauziati^‡, Adhistya Erna Permanasari^‡

‡ Universitas Gadjah Mada, Yogyakarta, Indonesia

Corresponding author: Adhistya Erna Permanasari (adhistya@ugm.ac.id)

This is an open access article distributed under the terms of the Creative Commons Attribution-NoDerivatives License (CC BY-ND 4.0). This license allows reusers to copy and distribute the material in any medium or format in unadapted form only, and only so long as attribution is given to the creator. The license allows for commercial use.

Citation: Hendrik H, Fauziati S, Permanasari AE (2025) Enhancing Knowledge Graph Construction with Automated Source Evaluation Using Large Language Models. JUCS - Journal of Universal Computer Science 31(5): 519-549. https://doi.org/10.3897/jucs.137103

Abstract

Knowledge graphs are a powerful way to represent and organize complex knowledge, and they support decision-making and discovery in fields such as healthcare and finance. However, the quality of a knowledge graph depends heavily on the quality of its sources, and current methods for evaluating those sources are often slow and do not scale to the volume of information available online. To address this problem, we developed a tool that uses Large Language Models (LLMs) to assess online sources quickly. The tool evaluates websites against six criteria: credibility, relevance, content quality, coverage, comprehensiveness, and accessibility. We tested it on Halal tourism websites in Japan and compared the LLM evaluations with human expert judgments. Our analysis showed that several LLMs, particularly GPT-3.5-turbo, GPT-4, and Mixtral-8x7B-Instruct-v0.1, correlated strongly with human evaluations. With a temperature setting of 0.4, these models performed consistently and reliably across multiple evaluation runs. Our structured evaluation framework, which incorporates weighted criteria validated through both expert input and statistical analysis, provides a robust foundation for automated source assessment. Although some models performed unevenly across criteria, our findings suggest that careful model selection, and potentially ensemble approaches, could optimize evaluation accuracy. This work contributes to knowledge graph construction by demonstrating the viability of LLM-based source evaluation and by identifying key areas for future research: scalability, cross-domain validation, and automated optimization.
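
As an illustration of the weighted-criteria scoring and the comparison with human judgments described in the abstract, the following minimal Python sketch combines six per-criterion scores into one weighted score and measures agreement between two score lists. It is a sketch under stated assumptions, not the authors' implementation: the equal weights, the function names `weighted_score` and `pearson`, and the choice of the Pearson coefficient are all hypothetical, since the abstract gives neither the actual weights nor the correlation measure used.

```python
from math import sqrt

# The six criteria named in the abstract.
CRITERIA = [
    "credibility",
    "relevance",
    "content_quality",
    "coverage",
    "comprehensiveness",
    "accessibility",
]

# ASSUMPTION: equal weights, for illustration only. The paper's validated
# weights are not given in the abstract.
WEIGHTS = {name: 1.0 / len(CRITERIA) for name in CRITERIA}


def weighted_score(scores, weights=WEIGHTS):
    """Return the weighted mean of per-criterion scores for one website."""
    total_weight = sum(weights[c] for c in CRITERIA)
    return sum(weights[c] * scores[c] for c in CRITERIA) / total_weight


def pearson(xs, ys):
    """Return the Pearson correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)
```

In use, `weighted_score` would be applied to each website's LLM-assigned and expert-assigned criterion scores, and `pearson` to the two resulting lists; a coefficient close to 1 would indicate that the model ranks sources much as the experts do.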

Keywords

Knowledge Graphs, Large Language Models, Automated Evaluation, Quality Control System, Halal Tourism