Segmentation of Sindhi Handwritten Text

S. A. AWAN
D. N. HAKRO
Z. H. ABRO
A. H. JALBANI

Abstract

Optical Character Recognition (OCR) and Intelligent Character Recognition (ICR) are two emerging areas concerned with understanding document text and converting it into editable text. A change of script in a text image poses various challenges and demands sophisticated algorithms and approaches to overcome them, especially for the Arabic script and the languages that have adopted it, such as Sindhi, Urdu, Pashto and Farsi. Sindhi has a very rich literature and needs powerful OCR and ICR systems to keep pace with languages for which these areas are already mature, such as English, Latin, Russian and Korean. This study presents an algorithm for segmenting handwritten text into lines, words and characters. Input images written by various subjects were scanned, preprocessed and tested with the segmentation algorithm. Line segmentation achieved 100% accuracy and word segmentation 95%. Character segmentation achieved an acceptable accuracy of 81%.
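The abstract does not state which technique the segmentation algorithm uses. As a hedged illustration only, the sketch below shows projection-profile segmentation, a common baseline for splitting a binarized page into lines and then into words. It is an assumption for the sake of example, not the authors' method. The names `find_runs`, `segment_lines`, `segment_words` and the `min_gap` parameter are hypothetical, and the image is assumed to be already binarized (1 = ink, 0 = background).

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed binary image: 1 = ink, 0 = background


def find_runs(profile: List[int], min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) pairs of non-empty runs in a projection profile.

    `end` is exclusive. Runs separated by fewer than `min_gap` empty
    positions are merged into one run.
    """
    runs: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                      # a run of ink begins
        elif value == 0 and start is not None:
            runs.append((start, i))        # the run ends at an empty position
            start = None
    if start is not None:
        runs.append((start, len(profile)))

    merged: List[Tuple[int, int]] = []
    for run in runs:
        if merged and run[0] - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], run[1])  # gap too small: same unit
        else:
            merged.append(run)
    return merged


def segment_lines(image: Image) -> List[Tuple[int, int]]:
    """Split a page into text lines using the horizontal projection profile."""
    profile = [sum(row) for row in image]  # ink pixels per row
    return find_runs(profile)


def segment_words(image: Image, line: Tuple[int, int],
                  min_gap: int = 2) -> List[Tuple[int, int]]:
    """Split one text line into words using the vertical projection profile.

    Column gaps narrower than `min_gap` are treated as gaps inside a word.
    """
    top, bottom = line
    width = len(image[0])
    profile = [sum(image[r][c] for r in range(top, bottom))
               for c in range(width)]      # ink pixels per column
    return find_runs(profile, min_gap)
```

The column ranges come back in left-to-right order, so for a right-to-left script such as Sindhi they would need to be reversed to obtain reading order. Character segmentation in a cursive script is considerably harder than this sketch suggests, since connected letters leave no empty columns between them, which may partly explain why character-level accuracy is the lowest of the three reported figures.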

Article Details

How to Cite
S. A. AWAN, D. N. HAKRO, Z. H. ABRO, & A. H. JALBANI. (2018). Segmentation of Sindhi Handwritten Text. Sindh University Research Journal - SURJ (Science Series), 50(2). https://doi.org/10.26692/surj.v50i2.1333
Section
Articles