Studying the Reduction Techniques for Mining Engineering Datasets
Abstract
Companies around the world often hold huge datasets in their data warehouses. The sheer size of these datasets can make them difficult to analyze, mainly because of their complexity in terms of the number of attributes and the number of cases. One way to overcome this problem is to reduce the dataset to a sufficient number of attributes and cases before mining it. In the data mining field, several methods can be used to reduce the number of attributes and to remove similar cases. This paper presents a study that tests three reduction methods in the engineering domain using five datasets. The three methods are Genetic Algorithm (GA), Principal Component Analysis (PCA), and the Johnson technique. The five datasets were obtained from the UCI machine learning archive. The study examines which reduction method is suitable for datasets in the engineering field by ranking the three methods according to percentage accuracy and the number of selected attributes.
Copyright (c) 2018 University of Sindh, Jamshoro
This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.
University of Sindh Journal of Information and Communication Technology (USJICT) follows an Open Access Policy under Attribution-NonCommercial CC-BY-NC license. Researchers can copy and redistribute the material in any medium or format, for any purpose. Authors can self-archive publisher's version of the accepted article in digital repositories and archives.