Researchers at eBay have introduced an automatic method for identifying suspicious card usage patterns. Their paper describing the method, titled “Credit Card Fraud Detection in e-Commerce: An Outlier Detection Approach,” is available on the preprint server arXiv.org. The proposed approach trains a model to recognize “good behavior” in payments and transactions, and then flags any activity that deviates from that normal behavior.

“Often the challenge associated with tasks like fraud and spam detection is the lack of all likely patterns needed to train suitable supervised learning models,” the authors wrote in their paper. “This problem accentuates when the fraudulent patterns are not only scarce, they also change over time … Limited data and continuously changing patterns makes learning significantly difficult. We hypothesize that good behavior does not change with time and data points representing good behavior have consistent spatial signature under different groupings.”

The researchers focused on an “ensemble” of clustering techniques, methods used to find sets of similar items in a dataset, each run with different parameters. In each training run, every data point was assigned to a cluster, and from those assignments a vector (a mathematical representation) was formed for each point. These vectors act as “fingerprints” that can be combined into a distinct signature representation for the point.

To produce a signature representation of “good behavior” (i.e., stability), the researchers combined the per-data-point vectors and weighted them according to the size of the corresponding cluster, arriving at a single score between 0 and 1. A score closer to 0 indicated low stability.
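The idea of scoring each point by how it clusters across many runs can be illustrated with a small sketch. This is not the paper's exact formula: it assumes plain k-means as the clustering technique, varies only the number of clusters and the random seed, and defines a point's score as the average relative size of the cluster it lands in. The function names `kmeans_labels` and `stability_scores`, and the toy data, are hypothetical.

```python
import random


def kmeans_labels(points, k, seed, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)

    def nearest(p):
        # Index of the center with the smallest squared Euclidean distance to p.
        return min(
            range(k),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
        )

    labels = [0] * len(points)
    for _ in range(iters):
        labels = [nearest(p) for p in points]
        # Move each center to the mean of its members; empty clusters stay put.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


def stability_scores(points, ks=(2, 3, 4), seeds=(0, 1, 2)):
    """Assumed score: mean relative size of each point's cluster over all runs."""
    n = len(points)
    totals = [0.0] * n
    runs = 0
    for k in ks:
        for seed in seeds:
            labels = kmeans_labels(points, k, seed)
            sizes = {}
            for lab in labels:
                sizes[lab] = sizes.get(lab, 0) + 1
            for i, lab in enumerate(labels):
                totals[i] += sizes[lab] / n  # weight by cluster size, in (0, 1]
            runs += 1
    return [t / runs for t in totals]


# Toy data (made up): a dense 6x5 grid of "normal" points plus one far-away point.
points = [(i * 0.2, j * 0.2) for i in range(6) for j in range(5)]
points.append((100.0, 100.0))

scores = stability_scores(points)
lowest = scores.index(min(scores))
print("least stable point:", points[lowest])
```

In this toy setup the isolated point keeps ending up in a tiny cluster, so its score sits near 0, while the densely packed points share large clusters and score higher. A real system would need a principled threshold and far more varied clusterings than this sketch uses.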

Compared with traditional AI fraud detection methods, this approach has several advantages. It needs no prior knowledge of which data points are inliers or outliers, and the underlying algorithm is both general and highly scalable. It can also be applied to other clustering problems, including ones in the medical field.

The researchers tested their method on an open-source credit card dataset available on Kaggle, a data science platform. After 10 training runs, the artificial intelligence (AI) algorithm identified 40% of the fraudulent transactions with “high precision”.

It wasn’t perfect, since it also flagged 29 genuine transactions, but the paper calls the result a “huge gain”, taking into account the hundreds of thousands of data points at play.

“Our [technique] can be immensely helpful, as out of 284,807 samples we can safely rule out 139,220 [transactions],” they wrote.
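To put the quoted figures in proportion, the short calculation below works out what share of the dataset is ruled out and how many samples would still need review. It uses only the two numbers from the quote; the variable names are illustrative.

```python
total_samples = 284_807  # dataset size quoted by the authors
ruled_out = 139_220      # samples the authors say can be safely ruled out

remaining = total_samples - ruled_out
share = ruled_out / total_samples

print(f"{share:.1%} ruled out, {remaining:,} left to review")
# prints "48.9% ruled out, 145,587 left to review"
```

In other words, roughly half of the transactions can be set aside before any further checking.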

Users who have recently bought or sold something on eBay may already have seen the system in action. The team noted, somewhat coyly, that the method had also succeeded in identifying fraud in data from an “e-commerce platform”:

“The motivation for [our] approach comes from trying to identify fraudulent consumers on an e-commerce platform … Each time the e-commerce company introduces new consumer aided features or imposes restrictions on certain transactional behaviors, it opens new doors and avenues for some consumers to misuse and abuse the platform. Our algorithm shows tremendous potential in identifying [fraud] … However, due to the confidentiality of the dataset, these results cannot be reported in this paper.”