Building lexical resources for dialectical Arabic

Sumaya Sulaiman Al Ameri, Abdulhadi Shoufan

Research output: Chapter in Book/Report/Conference proceeding, Chapter, peer-reviewed

2 Scopus citations

Abstract

Natural language processing of Arabic dialects faces a major difficulty: the lack of lexical resources. This problem hinders the market penetration and commercial use of related technologies such as machine translation, speech recognition, and sentiment analysis. Current solutions frequently use lexica that are specific to the task at hand and limited to a particular language variety. Modern communication platforms, including social media, bring together people from different nations and regions, which has increased the demand for general-purpose lexica to support effective natural language processing solutions. This chapter presents a collaborative web-based platform for building a cross-dialectical, general-purpose lexicon for Arabic dialects. The platform was tested by a team of two annotators, a reviewer, and a lexicographer. The lexicon expansion rate was measured and analyzed to estimate the overhead required to reach the desired lexicon size. Inter-annotator reliability was analyzed using Cohen's Kappa.

Original language: British English
Title of host publication: Natural Language Processing for Global and Local Business
Pages: 332-364
Number of pages: 33
ISBN (Electronic): 9781799842415
State: Published - 31 Jul 2020
