Named Entity Recognition for Urdu Language: The UNER System, A Hybrid Approach

  • Saba Rani Faculty of Engineering and Technology, University of Sindh, Jamshoro
  • Hira Fatima Naqvi Institute of Mathematics and Computer Sciences, University of Sindh, Jamshoro
  • Fida Hussain Khoso Dawood University of Engineering and Technology, Karachi
  • Attia Agha Dawood University of Engineering and Technology, Karachi
  • Dil Nawaz Hakro Faculty of Engineering and Technology, University of Sindh, Jamshoro
Keywords: UNER, NLP, NER, Urdu, Recognition, Named Entity

Abstract

Named entity recognition (NER) is a natural language processing (NLP) technique that classifies spans of parsed text into well-known named-entity categories. It identifies the nouns that appear in bulk text and assigns them to predefined groups such as names of people, places, times, dates, and organizations. Because much of the material available online is fragmented, scholars are working on several languages (e.g., Sindhi and English), applying various approaches and techniques depending on their locations, to improve the accessibility of filtered information for online users. NER enhances the quality of NLP applications including automated summarization, semantic web search, information extraction and retrieval, machine translation, question answering, and chatbots. This study designs an efficient framework for extracting noun entities in Urdu using a hybrid approach. The UNER system extracts entities not only by searching through a list of names but also by recognizing phrases in a given text. It is designed to recognize Urdu noun entities in predefined categories such as places, personal names, titled personal names, organizations, object names, trade names, abbreviations, dates and times, measurements, and text names.
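As a rough illustration of what a hybrid of list lookup and phrase recognition can look like, the following minimal Python sketch combines a small gazetteer with two hand-written rules. It is not the UNER implementation: the gazetteer entries, the title words, the category labels, the date format, and the function name tag_entities are all assumptions made for this example.

```python
import re

# Hypothetical gazetteer (name list). Entries and labels are illustrative only.
GAZETTEER = {
    "کراچی": "PLACE",
    "لاہور": "PLACE",
    "پاکستان": "PLACE",
}

# Hypothetical title words that signal a following personal name.
TITLES = {"ڈاکٹر", "پروفیسر"}

# Assumed numeric date format such as 30-09-2022 or 30/09/2022.
DATE_RE = re.compile(r"^\d{1,2}[-/]\d{1,2}[-/]\d{2,4}$")


def tag_entities(text):
    """Return (entity, category) pairs found in whitespace-tokenized text."""
    tokens = text.split()
    entities = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        # Rule-based step: a title followed by any token is a titled person.
        if token in TITLES and i + 1 < len(tokens):
            entities.append((f"{token} {tokens[i + 1]}", "TITLED_PERSON"))
            i += 2
            continue
        # List-based step: exact match against the gazetteer.
        if token in GAZETTEER:
            entities.append((token, GAZETTEER[token]))
        # Rule-based step: pattern match for numeric dates.
        elif DATE_RE.match(token):
            entities.append((token, "DATE"))
        i += 1
    return entities


if __name__ == "__main__":
    sample = "ڈاکٹر احمد کراچی میں 30-09-2022 کو آئے"
    for entity, category in tag_entities(sample):
        print(category, entity)
```

The lookup step handles names already on the list, while the rules can pick up names that are absent from it, such as a personal name that follows a title. A real system would need proper Urdu tokenization and far richer rules than this whitespace split.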

Published
2022-09-30
How to Cite
Saba Rani, Hira Fatima Naqvi, Fida Hussain Khoso, Attia Agha, & Dil Nawaz Hakro. (2022). Named Entity Recognition for Urdu Language: The UNER System, A Hybrid Approach. University of Sindh Journal of Information and Communication Technology, 6(3), 108-114. Retrieved from https://sujo.usindh.edu.pk/index.php/USJICT/article/view/6281
