Improving Part of Speech Tagging for Learner English

Authors

  • Taiwo Kolajo, Federal University Lokoja
  • Emeka Ogbuju
  • Aliyu Abubakar
  • Victoria Yemi-Peters
  • Kayode Atteh
  • Francisca Oladipo

Abstract

As advanced algorithms for natural language processing are developed, fundamental tasks such as part of speech tagging remain crucial and challenging. Part of speech tagging assigns each word in a sentence its correct part of speech. Spelling mistakes are common in text typed into computer software, and they degrade part of speech tagging performance. The main goal of this study is to use a spell checker to improve the accuracy of part of speech tagging. In our comparison of four spell checkers (Norvig, Jamspell, Hunspell, and Dummy), Jamspell outperformed the others in terms of accuracy, speed, and the number of words it broke during correction, so we used Jamspell to correct spelling mistakes. The Treebank, Brown, and Conll2000 corpora from NLTK were used for part of speech tagging. We passed an LSTM over the tokens and applied a softmax function to its hidden state to make predictions. The output is a vector of tag scores, and the predicted tag for a word is the one with the highest value in this distribution. The results show an accuracy of 97%, a macro average of 98%, and a weighted average of 97% when tested on 212 tagged sentences after spelling correction was applied.
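
The final prediction step described in the abstract (a softmax over tag scores, then picking the highest value) can be illustrated with a minimal sketch. This is not the authors' implementation: the tag set, the example tokens, and the score vectors are made-up placeholders standing in for the scores an LSTM would produce, and the helper names `softmax` and `predict_tag` are hypothetical.

```python
import math

# Hypothetical tag set, for illustration only (not the paper's tag set).
TAGS = ["NOUN", "VERB", "ADJ"]


def softmax(scores):
    """Turn raw tag scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_tag(scores, tags=TAGS):
    """Return the tag with the highest softmax probability, and that probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return tags[best], probs[best]


# One made-up score vector per token; in a real tagger these would come
# from a linear layer applied to the LSTM hidden state for each token.
sentence = ["dogs", "bark"]
token_scores = [[2.0, 0.5, -1.0], [0.1, 1.7, 0.3]]

predicted = [predict_tag(scores)[0] for scores in token_scores]
print(list(zip(sentence, predicted)))
```

Because softmax preserves the ordering of the scores, taking the highest probability selects the same tag as taking the highest raw score; the softmax matters mainly when the scores are needed as a distribution, for example during training.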

Author Biography

Emeka Ogbuju

Dr Emeka Ogbuju is a senior lecturer in the Department of Computer Science, Federal University Lokoja, Nigeria.

Published

2024-05-26

How to Cite

Kolajo, T., Ogbuju, E., Abubakar, A., Yemi-Peters, V., Atteh, K., & Oladipo, F. (2024). Improving Part of Speech Tagging for Learner English. University of Sindh Journal of Information and Communication Technology, 7(1), 06–13. Retrieved from https://sujo.usindh.edu.pk/index.php/USJICT/article/view/6384
