Unsupervised Neural Machine Translation between Myanmar Sign Language and Myanmar Language

Authors

  • Swe Zin Moe
  • Thepchai Supnithi

Keywords:

Machine Translation, Neural Machine Translation, Unsupervised Neural Machine Translation, Myanmar sign language, Myanmar language

Abstract

This paper investigates the utility of unsupervised neural machine translation (U-NMT) for a low-resource language pair: Myanmar sign language (MSL) and Myanmar language. State-of-the-art neural machine translation (NMT) requires a large amount of parallel sentences, which we do not have for the pair we consider. We therefore focus on incorporating two different types of monolingual data, Myanmar translations of primary English sentences and the myPOS corpus, on the Myanmar language side only. We found that incorporating monolingual data achieved higher performance than the baseline approach. We prepared four types of training data for the U-NMT models, and the results show that using the myPOS corpus as the Myanmar language monolingual data achieved the highest BLEU scores among the four training data configurations.
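Since the comparison above rests on BLEU, the following is a minimal sketch of how a corpus-level BLEU score can be computed. It is not the authors' evaluation script: the function names `ngrams` and `corpus_bleu` are hypothetical, and the sketch assumes one reference per hypothesis and input that is already segmented into tokens (how Myanmar text and MSL glosses are segmented is left open here).

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU (0-100), one reference per hypothesis.

    Both arguments are lists of token lists. This is an illustrative
    sketch without smoothing, not a replacement for a standard tool.
    """
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clipped counts: an n-gram is credited at most as often
            # as it occurs in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any n-gram order with zero matches gives BLEU 0.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty for output shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


# Placeholder tokens only; these are not data from the paper.
hyp = [["a", "b", "c", "d"]]
ref = [["a", "b", "c", "d", "e"]]
score = corpus_bleu(hyp, ref)
```

In this placeholder example every n-gram in the hypothesis matches the reference, so the score is reduced only by the brevity penalty. For reported results, an established BLEU implementation is preferable to a hand-written one.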

Published

2024-02-09

How to Cite

1.
Swe Zin Moe, Supnithi T. Unsupervised Neural Machine Translation between Myanmar Sign Language and Myanmar Language. j.intell.inform. [Internet]. 2024 Feb. 9 [cited 2024 Nov. 24];4(Ap). Available from: https://ph05.tci-thaijo.org/index.php/JIIST/article/view/107