Named Entity Recognition for Urdu Language: The UNER System, A Hybrid Approach
Named entity recognition (NER) is a natural language processing (NLP) technique that classifies parts of parsed text into well-known named-entity types. In NLP, named entity recognition is used to identify the nouns that appear in bulk text data and place them into predefined groups, such as names of people, places, times, dates, and organizations. Because much of the material and data on the web is fragmented, scholars are working on several languages (e.g., Sindhi and English), using various approaches and techniques suited to their regions, to make filtered information more accessible to online users. NER improves the quality of NLP applications including automated summarization, semantic web search, information extraction and retrieval, machine translation, question answering, and chatbots. This study designs an efficient framework to extract named entities in Urdu using a hybrid approach. The UNER system extracts entities not only by looking them up in a list of names, but also by recognizing phrases in a given text. It is designed to recognize Urdu named entities in predefined categories: places, personal names, titled personal names, organizations, object names, trade names, abbreviations, dates and times, measurements, and text names.
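The hybrid idea described in the abstract, list lookup combined with phrase recognition, can be illustrated with a minimal sketch. This is not the UNER implementation: the gazetteer entries, the title list, the label names, and the function `tag_entities` are all hypothetical, chosen only to show how the two kinds of evidence might be combined on whitespace-separated tokens.

```python
import re

# Hypothetical toy gazetteer (name list); NOT the UNER system's actual lists.
GAZETTEER = {
    "کراچی": "PLACE",    # Karachi
    "لاہور": "PLACE",    # Lahore
    "پاکستان": "PLACE",  # Pakistan
}

# Hypothetical honorifics used by the phrase rule (Doctor, Professor).
TITLES = {"ڈاکٹر", "پروفیسر"}

# Assumed rule: a standalone four-digit token is treated as a date (a year).
YEAR = re.compile(r"\d{4}")


def tag_entities(text):
    """Return (entity, label) pairs found by list lookup and simple rules."""
    tokens = text.split()  # naive whitespace tokenization, an assumption
    entities = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        # Phrase rule: a title followed by any token forms a titled personal name.
        if token in TITLES and i + 1 < len(tokens):
            entities.append((token + " " + tokens[i + 1], "TITLED_PERSON"))
            i += 2
            continue
        # List lookup: exact match against the gazetteer.
        if token in GAZETTEER:
            entities.append((token, GAZETTEER[token]))
        # Pattern rule: the whole token must be four digits.
        elif YEAR.fullmatch(token):
            entities.append((token, "DATE"))
        i += 1
    return entities


if __name__ == "__main__":
    # "Doctor Ahmad went to Karachi in 2022"
    sample = "ڈاکٹر احمد 2022 میں کراچی گئے"
    for entity, label in tag_entities(sample):
        print(label, entity)
```

In this sketch the phrase rule is checked before the list lookup, so a title and the token that follows it are consumed together. A real system would also need proper Urdu tokenization (for example, separating the full stop "۔" from words) and far larger lists and rule sets than the ones assumed here.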
Copyright (c) 2022 University of Sindh Journal of Information and Communication Technology
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
University of Sindh Journal of Information and Communication Technology (USJICT) follows an Open Access Policy under the Creative Commons Attribution-NonCommercial (CC BY-NC) license. Researchers can copy and redistribute the material in any medium or format for non-commercial purposes. Authors can self-archive the publisher's version of the accepted article in digital repositories and archives.
Upon acceptance, the author must transfer the copyright of the manuscript to the Journal for publication on paper, on data storage media, and online, with distribution rights to USJICT, University of Sindh, Jamshoro, Pakistan. Kindly download the copyright form below and attach it as a supplementary file during article submission.