EVALUATION OF THESAURUS SIMILARITY USING EXPANSION OF QUERY IN INFORMATION RETRIEVAL SYSTEMS SPEAK INDONESIAN

EVALUASI PENGGUNAAN SIMILARITY THESAURUS TERHADAP EKSPANSI KUERI DALAM SISTEM TEMU KEMBALI INFORMASI BERBAHASA INDONESIA

Authors

  • Fridolin Febrianto Paiki

Keywords:

query expansion, similarity thesaurus

Abstract

Terms, or tokens, are the main component of an information retrieval system. Their use as index and
query terms affects the performance of the system. This research observes how a similarity thesaurus
improves the performance of an Indonesian-language information retrieval system through query expansion. Using
a set of 30 queries and 1,000 documents, a series of tests is conducted with different query term weights to
measure the performance of the system before and after query expansion. The terms used in query expansion
are determined using cosine as the similarity measure together with the weights of the query terms. Two
treatments are used: taking the 5 (TH5) and the 10 (TH10) terms that have the highest similarity values with
the query. Overall, query expansion improves the performance of the system compared to the system
without query expansion (NoTH). However, the improvement also depends on the weights of the terms in the query. In three
experiments, each combined with NoTH, TH5, and TH10, the results show that idf is the better choice of weight
for the query terms in improving the performance of the system, both with and without
query expansion.
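
The expansion step described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: it assumes each term is represented by its raw term frequencies across documents, that a candidate term is scored by the weighted sum of its cosine similarity to each query term, and that whitespace tokenization is sufficient. The function names (term_vectors, cosine, expand_query) and the toy documents are hypothetical.

```python
import math
from collections import Counter, defaultdict


def term_vectors(docs):
    """Map each term to a sparse vector {doc_id: term frequency}.

    Assumption: raw term frequency and whitespace tokenization; the
    article may use a different term representation.
    """
    vectors = defaultdict(dict)
    for doc_id, text in enumerate(docs):
        for term, tf in Counter(text.lower().split()).items():
            vectors[term][doc_id] = tf
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[d] for d, w in u.items() if d in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def expand_query(query_weights, vectors, k):
    """Return up to k terms most similar to the weighted query.

    query_weights maps each query term to its weight (for example idf).
    Candidates with zero similarity are dropped; ties are broken
    alphabetically so the result is deterministic.
    """
    scores = {}
    for cand, cand_vec in vectors.items():
        if cand in query_weights:
            continue  # do not re-add terms already in the query
        scores[cand] = sum(
            w * cosine(vectors[q], cand_vec)
            for q, w in query_weights.items()
            if q in vectors
        )
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, score in ranked[:k] if score > 0]


# Toy collection (hypothetical data, not from the article).
docs = [
    "sistem temu kembali informasi",
    "temu kembali dokumen",
    "harga beras naik",
]
vectors = term_vectors(docs)
th5 = expand_query({"temu": 1.0}, vectors, k=5)    # analogous to TH5
th10 = expand_query({"temu": 1.0}, vectors, k=10)  # analogous to TH10
```

In this sketch, TH5 and TH10 differ only in the value of k, and the NoTH baseline corresponds to skipping expand_query and searching with the original query terms alone.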

Published

2023-01-22

How to Cite

Paiki, F. F. (2023). EVALUATION OF THESAURUS SIMILARITY USING EXPANSION OF QUERY IN INFORMATION RETRIEVAL SYSTEMS SPEAK INDONESIAN: EVALUASI PENGGUNAAN SIMILARITY THESAURUS TERHADAP EKSPANSI KUERI DALAM SISTEM TEMU KEMBALI INFORMASI BERBAHASA INDONESIA. JISTECH: Journal of Information Science and Technology, 7(1). Retrieved from https://jurnal.unipa.ac.id/index.php/istech/article/view/130