A Neural Network Approach to Create Minangkabau-Indonesia Bilingual Dictionary

Resiandi, Kartika and Murakami, Yohei and Nasution, Arbi Haza (2022) A Neural Network Approach to Create Minangkabau-Indonesia Bilingual Dictionary. In: The 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages (SIGUL2022), 24-25 June 2022, Marseille.

P10.pdf - Published Version

Official URL: http://www.lrec-conf.org/proceedings/lrec2022/work...

Abstract

Indonesia has many ethnic languages, most of which belong to the same language family, Austronesian. Because they share this family, words in Indonesian ethnic languages are very similar to one another. However, research indicates that Indonesian ethnic languages are endangered. To help preserve them, we propose creating bilingual dictionaries between ethnic languages using a neural network approach that extracts transformation rules with character-level embeddings and a Bi-LSTM in a sequence-to-sequence model. The model has an encoder and a decoder. The encoder reads the input sequence character by character, generates context, and extracts a summary of the input. The decoder produces the output sequence one character per time step, with each output character influenced by the previous character. The experiments focus on translation between Minangkabau and Indonesian, using 13,761 word pairs. Model performance is evaluated with 5-fold cross-validation. The character-level seq2seq method (Bi-LSTM as encoder and LSTM as decoder) achieves an average precision of 83.55%, outperforming SentencePiece byte pair encoding (vocabulary size of 32), which achieves an average precision of 79.93%.
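
The data preparation and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the start/end markers, the unshuffled contiguous folds, and the exact-match definition of precision are all assumptions made for illustration, and the word pairs are toy examples rather than entries from the paper's dataset. The Bi-LSTM encoder and LSTM decoder themselves are omitted; the sketch only shows how word pairs might be turned into character-level sequences and how a 5-fold split could be formed.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical start/end markers; the abstract does not state which symbols were used.
START, END = "<s>", "</s>"


def build_char_vocab(words: Sequence[str]) -> Dict[str, int]:
    """Map every character seen in `words` (plus START/END) to an integer id."""
    chars = sorted({ch for word in words for ch in word})
    return {sym: idx for idx, sym in enumerate([START, END] + chars)}


def encode_pair(
    src: str, tgt: str, src_vocab: Dict[str, int], tgt_vocab: Dict[str, int]
) -> Tuple[List[int], List[int], List[int]]:
    """Return (encoder_input, decoder_input, decoder_target) as lists of ids.

    The decoder input is the target shifted right by one step, so the
    character predicted at each time step depends on the previous character.
    """
    enc_in = [src_vocab[ch] for ch in src]
    tgt_ids = [tgt_vocab[ch] for ch in tgt]
    dec_in = [tgt_vocab[START]] + tgt_ids
    dec_out = tgt_ids + [tgt_vocab[END]]
    return enc_in, dec_in, dec_out


def k_fold_indices(n: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split range(n) into k contiguous folds and return (train, test) index lists.

    Assumption: no shuffling is done here; a real experiment would likely
    shuffle the pairs before splitting.
    """
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        folds.append((train, test))
        start += size
    return folds


def exact_match_precision(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of predicted words identical to the reference word.

    Assumed definition; the abstract does not spell out how precision is computed.
    """
    if not predicted:
        return 0.0
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(predicted)


# Toy Minangkabau -> Indonesian pairs for illustration only.
pairs = [("mato", "mata"), ("kudo", "kuda"), ("tigo", "tiga")]
src_vocab = build_char_vocab([s for s, _ in pairs])
tgt_vocab = build_char_vocab([t for _, t in pairs])
enc_in, dec_in, dec_out = encode_pair("mato", "mata", src_vocab, tgt_vocab)

# Five folds over a dataset of the size reported in the abstract.
folds = k_fold_indices(13761, k=5)
```

In a full pipeline, each fold's training indices would be used to fit the seq2seq model and its test indices to score the generated translations, with the per-fold scores averaged at the end.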

Item Type: Conference or Workshop Item (Paper)
Subjects: T Technology > T Technology (General)
Divisions: > Teknik Informatika
Depositing User: Monika Winda Monika
Date Deposited: 19 May 2023 08:37
Last Modified: 19 May 2023 08:37
URI: http://repository.uir.ac.id/id/eprint/21776
