Mining Emerging News from Text Data

S. A. DAR
M. MUZAMMAL
H. ZAHEER
I. A. KOREJO

Abstract

We live in the information age. So much information appears on the internet that reading all of it is practically impossible. This work focuses on extracting “interesting” information from the web. As a first step, we assume that newspapers report the most interesting information, and we propose a framework that extracts interesting information from the internet using the feeds of news websites. We collect RSS feeds from a set of user-specified sources and obtain the news titles from them. Next, we remove insignificant words from each news title, and a tokenization procedure transforms the remaining keywords into tokens. These tokens are combined to form sets of items. An itemset mining algorithm is implemented to extract the most interesting patterns, and a de-tokenization procedure maps these patterns back to the most interesting news.
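
The pipeline described in the abstract (RSS titles, stop-word removal, tokenization, itemset mining, de-tokenization) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The sample feed, the stop-word list, the function names, and the parameters `min_support` and `max_size` are all assumptions made for illustration. The mining step is a brute-force count of small itemsets standing in for whichever itemset mining algorithm the paper uses.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import combinations

# Hypothetical RSS 2.0 feed used in place of a live news source.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example feed</title>
<item><title>Flood warning issued in the coastal city</title></item>
<item><title>Coastal city braces for flood</title></item>
<item><title>Stock market closes higher</title></item>
<item><title>Flood hits coastal city after heavy rain</title></item>
</channel></rss>"""

# Assumed stop-word list; the abstract does not specify one.
STOP_WORDS = {"a", "an", "the", "in", "on", "of", "to", "for",
              "and", "as", "at", "by", "with", "after"}


def titles_from_rss(xml_text):
    """Return the non-empty <title> text of every <item> in an RSS document."""
    root = ET.fromstring(xml_text)
    titles = (item.findtext("title", default="") for item in root.iter("item"))
    return [t.strip() for t in titles if t.strip()]


def extract_keywords(title):
    """Lowercase a title, strip surrounding punctuation, drop stop words."""
    words = (w.strip(".,:;!?\"'()").lower() for w in title.split())
    return [w for w in words if w and w not in STOP_WORDS]


def tokenize(titles):
    """Map each distinct keyword to an integer token.

    Returns one set of tokens per title, plus the keyword-to-token vocabulary.
    """
    vocab = {}
    transactions = []
    for title in titles:
        keywords = extract_keywords(title)
        transactions.append(frozenset(vocab.setdefault(w, len(vocab)) for w in keywords))
    return transactions, vocab


def frequent_itemsets(transactions, min_support=2, max_size=3):
    """Count every itemset of up to max_size tokens; keep those seen in
    at least min_support titles. Brute force, suitable only for small inputs."""
    counts = Counter()
    for items in transactions:
        ordered = sorted(items)
        for size in range(1, max_size + 1):
            counts.update(combinations(ordered, size))
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


def detokenize(itemset, vocab):
    """Translate a tuple of integer tokens back into keywords."""
    inverse = {token: word for word, token in vocab.items()}
    return [inverse[t] for t in itemset]


titles = titles_from_rss(SAMPLE_FEED)
transactions, vocab = tokenize(titles)
patterns = frequent_itemsets(transactions, min_support=2)

# Prefer the largest itemset, breaking ties by support.
best = max(patterns, key=lambda s: (len(s), patterns[s]))
print(detokenize(best, vocab), patterns[best])
# ['flood', 'coastal', 'city'] 3
```

In this sketch a pattern counts as "interesting" when it is both large and frequent. That ranking is an assumption, and a different interestingness measure would change only the final selection step.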

Article Details

How to Cite
S. A. DAR, M. MUZAMMAL, H. ZAHEER, & I. A. KOREJO. (2016). Mining Emerging News from Text Data. Sindh University Research Journal - SURJ (Science Series), 48(2). Retrieved from https://sujo.usindh.edu.pk/index.php/SURJ/article/view/4886
Section
Articles