General Information
    • ISSN: 2301-3559
    • Frequency: Quarterly
    • DOI: 10.18178/LNSE
    • Editor-in-Chief: Prof. Jemal Antidze
    • Executive Editor: Ms. Nina Lee
    • Abstracting/Indexing: EI (INSPEC, IET), Electronic Journals Library, Ulrich's Periodicals Directory, International Computer Science Digital Library (ICSDL), ProQuest, and Google Scholar.
    • E-mail: lnse@ejournal.net

LNSE 2014 Vol.2(4): 375-379 ISSN: 2301-3559
DOI: 10.7763/LNSE.2014.V2.153

Duo Bundling Algorithms for Data Preprocessing: Case Study of Breast Cancer Data Prediction

Janjira Jojan and Anongnart Srivihok
Abstract—Classification of imbalanced datasets is one of the most popular and challenging problems facing researchers today. This paper proposes a two-step approach to improve the quality of class prediction on an imbalanced breast cancer dataset. The approach combines two main techniques: 1) feature selection, to filter unimportant features out of the dataset; and 2) over-sampling, to bring the size of the minority class close to that of the majority class. Three classification algorithms were applied: an artificial neural network (MLP), a decision tree (C4.5), and Naïve Bayes. The classification results indicate that C4.5 was the most suitable for this dataset, achieving the highest accuracy of 83.80%.

Index Terms—Feature selection, over-sampling, classification, imbalanced dataset, breast cancer data.
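
The two preprocessing steps named in the abstract can be illustrated with a minimal sketch. The abstract does not say which feature selection or over-sampling methods the authors used, so everything below is an assumption made for illustration: the filter score (gap between per-class feature means), random duplication as the over-sampling method, the function names `select_features` and `oversample`, and the toy data. The classification step (MLP, C4.5, Naïve Bayes) is omitted.

```python
import random
from collections import Counter


def select_features(rows, labels, k):
    """Step 1 (assumed filter): keep the k features whose per-class means differ most."""
    classes = sorted(set(labels))
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        means = []
        for c in classes:
            vals = [r[j] for r, y in zip(rows, labels) if y == c]
            means.append(sum(vals) / len(vals))
        scores.append(max(means) - min(means))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])  # indices of the retained features
    return keep, [[r[j] for j in keep] for r in rows]


def oversample(rows, labels, seed=0):
    """Step 2 (assumed method): duplicate random minority rows until classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for c, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == c]
        for _ in range(target - n):
            out_rows.append(list(rng.choice(pool)))
            out_labels.append(c)
    return out_rows, out_labels


# Hypothetical toy data: 4 majority rows (label 0), 2 minority rows (label 1).
rows = [
    [1.0, 5.0, 0.2],
    [1.1, 5.1, 0.9],
    [0.9, 4.9, 0.1],
    [1.0, 5.0, 0.8],
    [3.0, 5.0, 0.5],
    [3.2, 5.1, 0.4],
]
labels = [0, 0, 0, 0, 1, 1]

keep, reduced_rows = select_features(rows, labels, k=1)
balanced_rows, balanced_labels = oversample(reduced_rows, labels)

print("kept feature indices:", keep)
print("class sizes after over-sampling:", dict(Counter(balanced_labels)))
```

The sketch runs feature selection before over-sampling, matching the order given in the abstract. When such a pipeline is evaluated, over-sampling is normally applied to the training split only, so that duplicated rows do not leak into the test data.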

Janjira Jojan and Anongnart Srivihok are with the Department of Computer Science, Faculty of Science, Kasetsart University, Thailand (e-mail: g5314401215@ku.ac.th, ajjojan@gmail.com, fcsiang@ku.ac.th).


Cite: Janjira Jojan and Anongnart Srivihok, "Duo Bundling Algorithms for Data Preprocessing: Case Study of Breast Cancer Data Prediction," Lecture Notes on Software Engineering, vol. 2, no. 4, pp. 375-379, 2014.
