The State of the Art Approaches in Named Entity Recognition

Catherine Omidiji; Emeka Ogbuju; Taiwo Abiodun; Joshua Jimba; Francisca Oladipo

Authors

Catherine Omidiji Federal University Lokoja
Emeka Ogbuju Federal University Lokoja
Taiwo Abiodun Federal University Lokoja
Joshua Jimba Federal University Lokoja
Francisca Oladipo Federal University Lokoja

Keywords:

Named Entity Recognition, Machine Learning, Deep Learning Model, Rule-Based Learning, Hybrid Model

Abstract

Named entity recognition (NER), a pivotal task in Natural Language Processing (NLP), extracts and categorizes entities from unstructured textual data. However, many researchers lack an appropriate method for conducting NER effectively. To address this issue, we conducted a methodical review, sourcing related scientific papers from reputable scientific databases such as Scopus, IEEE Xplore, ScienceDirect and SpringerLink. Our study answered three research questions: which approaches are commonly used for NER, which algorithms are commonly used and how well they have performed, and which state-of-the-art datasets are commonly used for NER. The findings showed a predominant adoption of machine learning, deep learning, hybrid and rule-based approaches. They also showed noteworthy performance by Conditional Random Field (CRF) and Bidirectional Long Short-Term Memory (BiLSTM) models, especially when combined. However, the review identified inconsistencies in the reporting standards for datasets, prompting a call for standardized practices. This paper provides a comprehensive overview of the approaches used for NER and serves as a valuable resource for researchers navigating the evolving landscape of NER methodologies.

Author Biography

Emeka Ogbuju, Federal University Lokoja

Computer Science Department; PhD
