Glyph Identification and Character Recognition for Sindhi OCR

Abstract

Today's computers can read and write text in many human languages. Instructions and text can reach a computer through various input methods; OCR (Optical Character Recognition) and handwritten character recognition are the methods that convert a scanned page containing text into editable text. Each language and script poses its own set of recognition challenges, so a change in the language of the scanned text calls for a different recognition algorithm. Latin-script languages pose fewer difficulties than Arabic script and the languages written in it, and OCR systems for Latin-script languages are close to perfect. Very little work has been done on the regional languages of Pakistan. In this paper, the glyphs of Sindhi, one such regional language, are identified, together with the number of characters and connected components on the page. A graphical user interface has been created to carry out this identification of Sindhi glyphs and characters. The glyphs of the characters are successfully identified from a scanned page, and this information can be used to recognize the characters. Glyph identification can also be used to determine the language of a page, so that a suitable recognition algorithm is applied and a higher recognition rate is achieved.
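As an illustration of what counting connected components on a scanned page involves, here is a minimal sketch in Python. It is not the authors' implementation, which the abstract does not describe. It assumes the page has already been binarized into a grid of 0 (background) and 1 (ink), and it uses a generic breadth-first flood fill with 8-connectivity. The function name `find_connected_components` and the toy `page` grid are hypothetical.

```python
from collections import deque


def find_connected_components(image):
    """Return one bounding box (top, left, bottom, right) per 8-connected
    group of foreground pixels (value 1) in a binary image."""
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []

    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # New component: flood-fill outward from this pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy binarized "page" (assumed data): an isolated dot above a stroke,
# plus a second, separate stroke.
page = [
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [1, 1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1, 1],
]
boxes = find_connected_components(page)
print(len(boxes))  # number of connected components found
```

In this sketch a dot that does not touch the stroke beneath it is reported as its own component, so mapping components to characters would need a further grouping step that is not shown here.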

Authors and Affiliations

N. A. Memon, F. Abassi, S. Zardari

  • EP ID EP226281
  • DOI 10.22581/muet1982.1704.18

How To Cite

N. A. Memon, F. Abassi, S. Zardari (2017). Glyph Identification and Character Recognition for Sindhi OCR. Mehran University Research Journal of Engineering and Technology, 36(4), 933-940. https://www.europub.co.uk/articles/-A-226281