Statistical Approaches to Instant Diacritics Restoration for Sindhi Accent Prediction


H. SHAIKH
J. A. MAHAR
M. H. MAHAR

Abstract

Sindhi script abounds in homographic words, which create many complexities for both human readers and machines. Because a single homographic form can carry several meanings, interpreting and understanding the text becomes very difficult. Pronunciation varies even before interpretation, and this variation is the leading cause of the complexity. Diacritics remove such ambiguity and allow the text to be understood easily and accurately. To save time, however, present-day writers rarely include diacritics in routine writing. Beyond the difficulty this causes for human readers, the absence of diacritics also makes machine reading difficult. Text prediction systems provide the basis for the instant diacritics restoration approach. This instant diacritics restoration system is novel work in the field of natural language processing. This research proposes a framework based on N-grams and Maximum Entropy. Using unigram, bigram, trigram and quad-gram models, the system achieves 98.98% accuracy on a Sindhi-language corpus. Instant diacritics restoration is expected to help improve the performance of other natural language and speech processing applications.
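To illustrate how N-gram statistics can drive word-by-word ("instant") restoration, here is a minimal Python sketch. It is not the authors' framework: it uses a greedy bigram lookup with unigram back-off only, and leaves out the trigram, quad-gram and Maximum Entropy components. The function names (`strip_marks`, `train`, `restore_text`) and the toy corpus are hypothetical, and the tokens are Latin placeholders with accents standing in for diacritized Sindhi words.

```python
import unicodedata
from collections import Counter, defaultdict


def strip_marks(word):
    """Return the bare form of a word by dropping all combining marks."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def train(sentences):
    """Count unigrams and bigrams over diacritized tokens (assumed toy data)."""
    unigrams = Counter()
    bigrams = defaultdict(Counter)   # previous token -> next-token counts
    candidates = defaultdict(set)    # bare form -> diacritized forms seen
    for sentence in sentences:
        prev = "<s>"                 # sentence-start marker
        for token in sentence.split():
            token = unicodedata.normalize("NFD", token)
            unigrams[token] += 1
            bigrams[prev][token] += 1
            candidates[strip_marks(token)].add(token)
            prev = token
    return unigrams, bigrams, candidates


def restore_text(text, model):
    """Restore marks left to right, one word at a time, as a typist would."""
    unigrams, bigrams, candidates = model
    prev, restored = "<s>", []
    for token in text.split():
        options = candidates.get(strip_marks(token))
        if options:
            following = bigrams.get(prev, {})
            # Prefer the form seen most often after `prev`;
            # fall back to the overall (unigram) frequency.
            best = max(sorted(options),
                       key=lambda w: (following.get(w, 0), unigrams[w]))
        else:
            best = token             # unseen word: leave it unchanged
        restored.append(best)
        prev = best
    return unicodedata.normalize("NFC", " ".join(restored))


# Hypothetical placeholder corpus: "pér" and "pèr" share the bare form "per".
corpus = ["ka pér su", "lo pèr su", "lo pèr mi"]
model = train(corpus)
```

Calling `restore_text("ka per", model)` selects the form that followed "ka" in the toy corpus, while an unseen context such as `restore_text("zu per", model)` falls back to the most frequent form overall. For Arabic-script text, an explicit list of diacritic code points may be safer than stripping every combining mark, since NFD decomposition can also split some base letters.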

Article Details

How to Cite
H. SHAIKH, J. A. MAHAR, & M. H. MAHAR. (2020). Statistical Approaches to Instant Diacritics Restoration for Sindhi Accent Prediction. Sindh University Research Journal - SURJ (Science Series), 49(2). Retrieved from https://sujo.usindh.edu.pk/index.php/SURJ/article/view/1427
Section
Articles
