AMHARIC-GE’EZ CROSS LANGUAGE INFORMATION RETRIEVAL USING NEURAL MACHINE TRANSLATION

DSpace Repository

dc.contributor.author GASHAHUN, WERKAFERAHU
dc.date.accessioned 2024-02-07T06:47:34Z
dc.date.available 2024-02-07T06:47:34Z
dc.date.issued 2024-02-07
dc.identifier.uri http://hdl.handle.net/123456789/7166
dc.description.abstract Amharic is the working language of the Federal Democratic Republic of Ethiopia (FDRE) and is used by the majority of Ethiopians as a mother tongue. Ge’ez, on the other hand, is an ancient language that was used in Ethiopia. Many cultural, historical and religious documents are written in Ge’ez and contain a vast amount of important information. There is a need to access these documents for educational use, scientific research and other purposes. Most Amharic speakers are good at formulating queries in Amharic to access documents available in Amharic. On the other hand, users who can read and understand Ge’ez have difficulty expressing their information need as a Ge’ez query. To give such users access to relevant Ge’ez documents, there should be a way of translating their Amharic queries into Ge’ez queries. The purpose of this study is therefore to develop an Amharic-Ge’ez Cross Language Information Retrieval (CLIR) system using the Neural Machine Translation (NMT) technique. Recently, neural network-based translation has been applied to different translation tasks and has achieved better results than statistical machine translation, which is why this technique was selected for this study. To develop the NMT model, parallel sentences were collected from religious books such as the Bible, Wudasie Maryam, Kidasie and Mezmure Dawit. The query translation model was developed using ONMT, an open-source framework, and achieved a BLEU score of 0.477. The developed CLIR system was tested with 24 queries and 165 randomly selected documents. Both monolingual and bilingual experiments were conducted, and the performance of the system was measured using Mean Average Precision (MAP) and recall for both experiments.
Accordingly, the monolingual experiment returned a maximum MAP of 0.62 and recall of 0.74, and the bilingual experiment returned a maximum MAP of 0.53 and recall of 0.67. The monolingual experiment thus achieved better results than the bilingual experiment. The lower performance of bilingual retrieval is due to errors that arise during query translation. Therefore, the performance of bilingual retrieval can be improved by ad en_US
dc.description.sponsorship uog en_US
dc.language.iso en_US en_US
dc.subject Amharic-Ge’ez Cross-Lingual Information Retrieval, Information Retrieval, neural machine translation, Amharic, Ge’ez en_US
dc.title AMHARIC-GE’EZ CROSS LANGUAGE INFORMATION RETRIEVAL USING NEURAL MACHINE TRANSLATION en_US
dc.type Thesis en_US
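
The abstract reports retrieval quality as Mean Average Precision (MAP) and recall. The following is a minimal sketch of how these two measures are commonly computed from a ranked result list and a set of relevant documents per query. It is not taken from the thesis: the function names, document IDs and example queries are all hypothetical, and the thesis may use a different variant (for example, a rank cutoff).

```python
from typing import Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average precision for one query.

    Sums precision@k at every rank k where a relevant document appears,
    then divides by the total number of relevant documents.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def recall(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of the relevant documents that appear in the result list."""
    if not relevant:
        return 0.0
    return len(set(ranked) & relevant) / len(relevant)


def mean_average_precision(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of the per-query average precision over all queries."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical example: two queries with made-up document IDs.
query_runs = [
    (["d3", "d1", "d7", "d2"], {"d1", "d2", "d5"}),  # 2 of 3 relevant docs found
    (["d4", "d9"], {"d4"}),                          # the only relevant doc is ranked first
]
map_score = mean_average_precision(query_runs)
recalls = [recall(ranked, rel) for ranked, rel in query_runs]
```

In a bilingual run, the same functions would be applied to the results retrieved with the translated Ge’ez query, so the monolingual and bilingual scores are directly comparable.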

