Ensemble Learning for Imbalanced E-commerce Transaction Anomaly Classification.

Bibliographic Details
Title: Ensemble Learning for Imbalanced E-commerce Transaction Anomaly Classification.
Authors: Yang, Haiqin; King, Irwin
Source: Neural Information Processing (9783642106767); 2009, p866-874, 9p
Abstract: This paper presents the main results of our ongoing work, one month before the deadline, on the 2009 UC San Diego data mining contest. The contest tasks are to rank the samples in two e-commerce transaction anomaly datasets according to the probability that each sample has a positive label. Performance is evaluated by the lift at 20% on the predicted probabilities for the two datasets. A main difficulty is that the data are highly imbalanced: only about 2% of the samples are labeled positive in both tasks. We first preprocess the categorical features and normalize all features. We then present initial results for several popular classifiers, including Support Vector Machines, Neural Networks, AdaBoost, and Logistic Regression. The objective is to obtain benchmark results for these classifiers without much modification, which will help us select a classifier for future tuning. Based on these results, we observe that the area under the ROC curve (AUC) is a good indicator for improving the lift score. We then propose an ensemble method that combines the above classifiers with the aim of optimizing the AUC score, and obtain significantly better results. We also discuss some treatments of the imbalanced data in the experiments. [ABSTRACT FROM AUTHOR]
Copyright of Neural Information Processing (9783642106767) is the property of Springer Nature / Books.
DOI: 10.1007/978-3-642-10677-4_98
Database: Complementary Index