AMHARIC-GE’EZ CROSS LANGUAGE INFORMATION RETRIEVAL USING NEURAL MACHINE TRANSLATION

dc.contributor.author	GASHAHUN, WERKAFERAHU
dc.date.accessioned	2024-02-07T06:47:34Z
dc.date.available	2024-02-07T06:47:34Z
dc.date.issued	2024-02-07
dc.identifier.uri	http://hdl.handle.net/123456789/7166
dc.description.abstract	Amharic is the working language of the Federal Democratic Republic of Ethiopia (FDRE) and is used by the majority of Ethiopians as a mother tongue. Ge’ez, on the other hand, is an ancient language that was used in Ethiopia. Many cultural, historical, and religious documents written in Ge’ez contain a vast amount of important information. These documents need to be accessible for education, scientific research, and other purposes. Most Amharic speakers can readily formulate queries in Amharic to access documents available in Amharic. However, users who can read and understand Ge’ez have difficulty expressing their information needs as Ge’ez queries. To give such users access to relevant Ge’ez documents, there should be a way to translate their Amharic queries into Ge’ez queries. The purpose of this study is therefore to develop an Amharic-Ge’ez Cross Language Information Retrieval (CLIR) system using Neural Machine Translation (NMT). Neural network-based translation has recently been applied to a range of translation tasks and has achieved better results than statistical machine translation, which is why this technique was selected for this study. To develop the NMT model, parallel sentences were collected from religious books such as the Bible, Wudasie Maryam, Kidasie, and Mezmure Dawit. The query translation model was developed using ONMT, an open-source framework, and achieved a BLEU score of 0.477. The developed CLIR system was tested with 24 queries and 165 randomly selected documents. Both monolingual and bilingual experiments were conducted, and the performance of the system was measured using Mean Average Precision (MAP) and recall in both experiments. 
Accordingly, the monolingual experiment returned a maximum MAP of 0.62 and recall of 0.74, and the bilingual experiment returned a maximum MAP of 0.53 and recall of 0.67. The monolingual experiment thus achieved better results than the bilingual experiment. The lower performance of bilingual retrieval is attributed to errors that arise during query translation. Therefore, the performance of bilingual retrieval can be improved by ad	en_US
dc.description.sponsorship	uog	en_US
dc.language.iso	en_US	en_US
dc.subject	Amharic-Ge’ez Cross-Lingual Information Retrieval, Information Retrieval, neural machine translation, Amharic, Ge’ez	en_US
dc.title	AMHARIC-GE’EZ CROSS LANGUAGE INFORMATION RETRIEVAL USING NEURAL MACHINE TRANSLATION	en_US
dc.type	Thesis	en_US