The Corpus-Based Study of Ezafe Construction in Persian

Document Type: Research Paper


1 M.A. Graduate in Computational Linguistics, Languages and Linguistics Center, Sharif University of Technology

2 Ph.D. Graduate in Linguistics, University of Tehran

3 Assistant Professor, Department of Computer, Allameh Tabataba’i University


The Ezafe construction is regarded as one of the most important issues in several areas of linguistic theory, including phonetics, morphology, and syntax, and many Iranian linguists have analyzed it from these perspectives. The Ezafe marker is usually not written in Persian text. Its absence not only creates a high degree of ambiguity in reading, analyzing, and understanding Persian documents, but also causes serious difficulties for many natural language processing (NLP) tasks, such as part-of-speech (POS) tagging, named-entity recognition (NER), co-reference resolution, text-to-speech conversion, machine translation, and syntactic parsing. As a result, determining the positions of Ezafe in a given sentence is viewed as a controversial and challenging issue, especially in these applications. The current paper studies Ezafe positions using corpus-based analysis and dependency grammar. Because dependency grammar allows simple parsing, uses little memory, and speeds up computation, it is regarded as one of the important and practical grammars in computational linguistics. Accordingly, this study uses a rule-based method within this framework to recognize Ezafe positions. For this purpose, all Ezafe constructions in the Uppsala Persian Dependency Corpus (UPDC) are analyzed in terms of their dependency relations. In the next step, seven Ezafe rules are formulated, covering the following non-verbal phrases: noun phrases, adjective phrases, prepositional phrases, adverb phrases, phrases with more than one post-modifier, phrases with more than one post-modifier as a phrase, and co-ordinations. The proposed rules can be applied to Persian dependency corpora and to a large number of language processing tasks that are based on dependency relations. In addition, the present research elaborates on Ezafe positions that have not been mentioned in previous theoretical and computational studies.
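
As a rough illustration of what a rule-based check over dependency relations might look like, the following Python sketch marks the token immediately preceding a post-modifier as an Ezafe position. It is not one of the paper's seven rules: the `Token` structure, the `ezafe_positions` function, the relation labels in `POST_MODIFIER_RELS`, and the transliterated example are all assumptions made for illustration and are not taken from UPDC.

```python
from dataclasses import dataclass


@dataclass
class Token:
    idx: int      # 1-based position in the sentence
    form: str     # word form (transliterated here for readability)
    pos: str      # part-of-speech tag
    head: int     # index of the head token, 0 for the root
    deprel: str   # dependency relation to the head


# Hypothetical relation labels treated as post-modifiers;
# the actual label set depends on the corpus annotation scheme.
POST_MODIFIER_RELS = {"amod", "poss"}


def ezafe_positions(tokens):
    """Return the indices of tokens predicted to carry Ezafe.

    Simplified rule: if a token is a post-modifier whose head precedes it,
    the token immediately before the modifier is marked as an Ezafe position.
    """
    positions = set()
    for tok in tokens:
        is_post_modifier = tok.deprel in POST_MODIFIER_RELS
        head_precedes = tok.head != 0 and tok.head < tok.idx
        if is_post_modifier and head_precedes:
            positions.add(tok.idx - 1)
    return sorted(positions)


# Illustrative phrase "ketab bozorg Ali" ("Ali's big book"),
# where both the adjective and the possessor depend on the noun.
example = [
    Token(1, "ketab", "NOUN", 0, "root"),
    Token(2, "bozorg", "ADJ", 1, "amod"),
    Token(3, "Ali", "PROPN", 1, "poss"),
]

print(ezafe_positions(example))
```

In this sketch the chain of post-modifiers yields an Ezafe position on every token but the last, which matches the usual pattern for Ezafe chains; phrasal post-modifiers and co-ordinations would need additional rules.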


Volume 10, Issue 1
June 2019
