Online shopping and the quest for ever-increasing convenience have scattered our credit card details across various websites. Over time, we have become desensitised to providing our payment details and have effectively passed the responsibility for storing and protecting our personal and transaction details to e-commerce companies.
Unfortunately, many companies have failed to keep our data safe and secure. According to Statista, over 164 million sensitive records were exposed in 1,473 data breaches in the United States last year.
The exponential growth in e-commerce has attracted even more aggressive growth in illicit activities. Southeast Asia’s e-commerce market will likely exceed Google’s prediction of US$200 billion, driven by a high mobile internet penetration rate, young consumers, and an increase in disposable income.
At the same time, fraud cost the world economy over US$5 trillion in 2019. According to AppsFlyer, APAC’s fraud rate is 60 per cent higher than the global average, and Southeast Asia is the hardest-hit region, especially Indonesia, Malaysia, Thailand, and Vietnam.
Recently, international law enforcement agencies warned of a spike in fraud related to the COVID-19 pandemic, and The Straits Times reported that victims in Singapore lost S$41.3 million in the first quarter of this year, an increase of nearly 30 per cent. Unfortunately, it seems that 2020 will turn out to be a bumper year for fraudsters around the world.
Fraud Detection and Machine Learning
To protect customers, banks and retailers have deployed large-scale fraud detection pipelines that scan transactions in real time. For a long time, fraud detection relied on rule-based expert systems to detect illicit activities. Traditionally, experts analysed transaction logs, identified fraudulent patterns, and implemented hand-coded rules to flag those activities. The rules ranged from simple ones, such as blocking transactions from a compromised credit card number, to more sophisticated ones, such as flagging transactions that deviate from the credit card’s historical patterns.
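The two kinds of hand-coded rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `flag_transaction`, the card numbers, the blocklist, and the threshold of three standard deviations are all hypothetical, and a production rule engine would be far more elaborate.

```python
from statistics import mean, stdev

# Hypothetical blocklist of card numbers known to be compromised.
COMPROMISED_CARDS = {"4000-0000-0000-0002"}


def flag_transaction(card, amount, history, z_threshold=3.0):
    """Return the reason a transaction is flagged, or None if it passes.

    `history` is a list of the card's past transaction amounts.
    `z_threshold` is an assumed cut-off, not an industry standard.
    """
    # Simple rule: block anything from a known-compromised card.
    if card in COMPROMISED_CARDS:
        return "compromised card"

    # More sophisticated rule: flag amounts that lie far from the
    # card's historical spending, measured in standard deviations.
    if len(history) >= 2:
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(amount - mu) / sigma > z_threshold:
            return "deviates from historical pattern"

    return None


past_amounts = [20, 25, 22, 30, 18]
print(flag_transaction("4111-1111-1111-1111", 500, past_amounts))
print(flag_transaction("4111-1111-1111-1111", 24, past_amounts))
```

Every new fraud pattern needs another rule of this kind, written and maintained by hand.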
Over time, as transaction volumes grew exponentially, thousands of increasingly complicated fraud-detection rules emerged and this approach became intractable. Fortunately, machine learning can scan millions of transactions to enable real-time fraud detection. Machine learning is a procedure that enables a computer to learn from data how to perform a certain task. Once a computer learns the task to a sufficient level, human experts are no longer needed to perform it.
Indeed, fraud detection algorithms based on past purchase behaviour have matched human accuracy at identifying anomalous transactions. Meanwhile, the experts’ role has evolved towards analysing and understanding larger issues such as new global fraud trends.
But what happens when purchase behaviour suddenly changes? Therein lies the rub. Because machine learning algorithms learn from historical data, a rapidly changing environment and fast-evolving patterns can wreak havoc on their predictions and forecasts.
The world has drastically changed since COVID-19 wrecked the global economy. Predictably, fraudsters have taken advantage of the uncertainty in the rapidly evolving environment and changed the patterns and targets of attack. For instance, traditional strategies of ticket fraud (i.e. reselling tickets purchased with stolen card information) have migrated, and fraudsters are increasingly turning to well-orchestrated scams directly defrauding unwitting consumers.
In such an environment, machine learning algorithms must be retrained on new data and quickly redeployed. However, because the changes are unfolding in real time, algorithms have far less data from which to learn and adapt to changing fraud attack vectors. Slow-moving trends embedded in increasingly irrelevant historical data must be unlearnt, and machine learning engines must accurately predict newly evolved tendencies using limited data. The problem is compounded by having less validation data with which to judge the effectiveness of the algorithms.
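One common way to make a model "unlearn" stale history is to weight each training example by how recent it is. The sketch below illustrates that idea only; it is not a description of any particular company's pipeline. The function names, the 14-day half-life, and the toy labels are all assumptions made for the example.

```python
def recency_weights(ages_in_days, half_life_days=14.0):
    """Exponentially decaying weights: an example that is one
    half-life old counts half as much as one observed today."""
    return [0.5 ** (age / half_life_days) for age in ages_in_days]


def weighted_fraud_rate(labels, ages_in_days, half_life_days=14.0):
    """Estimate the current fraud rate, trusting recent data more.

    `labels` holds 1 for a fraudulent transaction and 0 otherwise.
    """
    weights = recency_weights(ages_in_days, half_life_days)
    return sum(w * y for w, y in zip(weights, labels)) / sum(weights)


# Toy data (hypothetical): four old legitimate transactions and
# two very recent fraudulent ones.
labels = [0, 0, 0, 0, 1, 1]
ages = [90, 80, 70, 60, 2, 1]

unweighted = sum(labels) / len(labels)
weighted = weighted_fraud_rate(labels, ages)
print(f"unweighted: {unweighted:.2f}, recency-weighted: {weighted:.2f}")
```

The same weights could be supplied to any training routine that accepts per-sample weights. The trade-off is the one described above: the more aggressively old data is discounted, the less data remains to learn from and to validate against.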
As fraudsters change their patterns and targets of attack, automatically retraining machine learning algorithms on new data and redeploying them quickly becomes paramount to thwarting fraud attacks.
Only companies that have significantly invested in talent and robust big-data pipelines are well placed to cope with the current situation. Have the companies you have trusted with your data been so diligent? Maybe it is time to check your bank statement.
Paul Condylis is a head of data science at Tokopedia Singapore. Emir Hrnjic is an adjunct assistant professor at National University of Singapore (NUS) Business School and a co-founder of Block’N’White Consulting. The opinions expressed are those of the writers and do not represent the views and opinions of Tokopedia or NUS.