New Combined Method to Improve Arabic POS Tagging

Journal: Journal of Autonomous Intelligence DOI: 10.32629/jai.v1i2.30

Mohamed Labidi

LaTICE laboratory, Tunisia

Abstract

Part-of-speech (POS) tagging is one of the important tasks in natural language processing. Many taggers have been proposed for Arabic, but their performance does not yet reach the required level, owing to the complexity of the task and the characteristics of the Arabic language. In this work we study a combination of two different approaches to Arabic POS tagging: the first is a maximum entropy-based tagger, and the second is a statistical/rule-based tagger. Furthermore, we add a knowledge-based method to annotate Arabic particles. The combined method raises accuracy from almost 85% to almost 90%, which seems promising.
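
As an illustration only, the combination described above could be sketched as follows. The abstract does not state how the outputs of the two taggers are reconciled, so the confidence-based tie-break, the lexicon entries (written in Buckwalter transliteration), the tag names, and the function names are all assumptions made for this sketch, not the method evaluated in the paper.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical closed-class lexicon for particles (Buckwalter transliteration).
# Entries and tag names are illustrative assumptions, not taken from the paper.
PARTICLE_LEXICON: Dict[str, str] = {
    "fy": "PREP",      # "in"
    "mn": "PREP",      # "from"
    "lm": "NEG_PART",  # negation particle
}

# A tagger maps a token list to one (tag, confidence) pair per token.
Tagger = Callable[[List[str]], List[Tuple[str, float]]]


def combine_taggers(
    tokens: List[str],
    maxent_tagger: Tagger,
    rule_tagger: Tagger,
    lexicon: Dict[str, str] = PARTICLE_LEXICON,
) -> List[str]:
    """Merge two taggers' outputs, letting the particle lexicon override both.

    Assumed policy: a lexicon hit wins; otherwise the taggers' agreed tag is
    used; on disagreement the more confident tagger wins.
    """
    maxent_out = maxent_tagger(tokens)
    rule_out = rule_tagger(tokens)
    combined: List[str] = []
    for token, (m_tag, m_conf), (r_tag, r_conf) in zip(tokens, maxent_out, rule_out):
        if token in lexicon:
            # Knowledge-based step: particles are tagged from the lexicon.
            combined.append(lexicon[token])
        elif m_tag == r_tag or m_conf >= r_conf:
            combined.append(m_tag)
        else:
            combined.append(r_tag)
    return combined
```

Any pair of taggers that returns a tag and a confidence score per token can be plugged in; the lexicon override reflects the idea that particles form a small closed class that can be listed explicitly.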

Keywords

POS-Tagger, Natural language processing, Arabic language

Copyright © 2019 Mohamed Labidi

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License