IJIRST (International Journal for Innovative Research in Science & Technology) ISSN (online): 2349-6010

Enhancement in Boosting Efficiency via Feature Selection and Clustering of Input Dataset


International Journal for Innovative Research in Science & Technology
Volume 3, Issue 2
Year of Publication: 2016
Authors: Dincy George; Naveen Raja S M; Hafzal Rahman M J

BibTeX:

@article{IJIRSTV3I2055,
     title={Enhancement in Boosting Efficiency via Feature Selection and Clustering of Input Dataset},
     author={Dincy George and Naveen Raja S M and Hafzal Rahman M J},
     journal={International Journal for Innovative Research in Science \& Technology},
     volume={3},
     number={2},
     pages={188--191},
     year={2016},
     url={http://www.ijirst.org/articles/IJIRSTV3I2055.pdf},
     publisher={IJIRST (International Journal for Innovative Research in Science \& Technology)},
}



Abstract:

Learning is an essential aspect of Artificial Intelligence, and supervised learning systems are of great importance in the current era. Boosting is an iterative technique for improving the predictive accuracy of such systems. It works by learning multiple functions in sequence, with the output labels of each function serving as the basis for the next. On complex real-world datasets, boosting still suffers from label noise and overfitting. To mitigate these issues, the dataset is clustered using efficient algorithms that group the most similar data members together. The clustered dataset is then integrated into the boosting process, which improves predictive accuracy and reduces overfitting. This work first analyses the variation in predictive accuracy of popular boosting techniques on clustered and non-clustered datasets. After the analysis, a feature-selection-based approach is proposed to mitigate the identified issues and to enhance the efficiency of the system in terms of both time and memory.
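
As an illustration of the iterative idea behind boosting described above, the following is a minimal sketch of AdaBoost with one-feature decision stumps in plain Python. AdaBoost is used here only as a well-known representative of boosting; the abstract does not name the specific boosting algorithms, clustering method, or feature selection approach that were studied, so none of those are reproduced. The function names (`train_stump`, `stump_predict`, `adaboost`, `predict`) and the toy data are hypothetical and chosen for this example only.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best stump.

    A stump predicts `sign` when row[feature] <= threshold, else `-sign`.
    """
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[j] <= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] <= thr else -sign

def adaboost(X, y, rounds=3):
    """Fit `rounds` stumps; labels in y must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clamp the error so the logarithm below stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest,
        # so the next stump focuses on the previous stump's mistakes.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    """Weighted vote of all stumps in the ensemble."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Toy one-feature dataset that no single threshold can separate.
X = [[0], [1], [2], [3], [4], [5]]
y = [1, 1, -1, -1, 1, 1]

for rounds in (1, 3):
    model = adaboost(X, y, rounds=rounds)
    correct = sum(predict(model, row) == yi for row, yi in zip(X, y))
    print(f"rounds={rounds}: {correct}/{len(X)} training points correct")
```

On this toy data a single stump misclassifies two points, while three rounds fit all six. The sample weights `w` are where each round builds on the previous one, and they are also why boosting is sensitive to label noise: a mislabeled point keeps gaining weight in later rounds.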


Keywords:

Artificial Intelligence, Clustering Algorithms, Label Noise, Supervised Learning Systems

