GBG’s award-winning Machine Learning capability fights identity fraud with consortium data

Synthetic identity fraud, a form of identity fraud in which fabricated credentials are combined into an identity that does not actually exist, is one of the fastest-growing financial crimes. Traditionally, financial institutions have had only their own blacklists and past application data to fall back on, which makes it hard to combat new fraud typologies such as these fake identities.

GBG Machine Learning Brings Innovation to Help Our Customers Stay on Top of Evolving Fraud

Fraudsters typically rely on certain data points or identity markers to get a fraudulent application through. Conventional checks make it difficult to tell a cleverly faked identity from a genuine profile, because banks can only match new applications against their own historical data and blacklists.

This is where a solution like GBG’s Machine Learning Module can turn the tide in the battle against fraudsters. The module enables users to train, validate and deploy machine learning models that enhance the application fraud detection capability of financial institutions.

These models can quickly learn the difference between genuine and fraudulent applications from hundreds of historical applications. They can then apply these learned patterns, using selected attributes, to spot fraud signals and combat emerging fraud threats in new applications automatically.

Study by GBG and CTOS demonstrates the effectiveness of Machine Learning in CTOS IDGuard

GBG conducted a research study with CTOS to demonstrate the effectiveness of machine learning models in improving fraud detection rates. CTOS IDGuard, an application fraud bureau powered by GBG’s fraud and financial crime prevention engine, was developed to combat application fraud in Malaysia.

In this study, GBG worked with member banks of CTOS IDGuard to evaluate home loan and credit card applicants. We used anonymised consortium data (non-personally identifiable information) from the participating member banks to create features that profile applicant behaviour during the development of the machine learning models.

Consortium data refers to a network of anonymised data that can augment the verification of new applicants. It can significantly increase the effectiveness of machine learning models by adding the context of fraud trends within and across market segments.

The behavioural features enabled the machine learning models to analyse vast amounts of data quickly and efficiently, and to identify patterns and trends associated with fraudulent activity. One example is inconsistent applications, where the same applicant applies to multiple banks using different personal details, such as their income level.
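
To make the idea of a cross-bank inconsistency feature concrete, here is a minimal Python sketch. It is an illustration only, not the feature set used in the study: the record layout, the `applicant_key` field (standing in for an anonymised identifier shared across the consortium), the sample values and the function name `inconsistency_features` are all assumptions.

```python
from collections import defaultdict

# Hypothetical anonymised application records. "applicant_key" stands in for
# a hashed identifier shared across the consortium (an assumption).
applications = [
    {"applicant_key": "a1", "bank": "Bank X", "income": 4000, "state": "Selangor"},
    {"applicant_key": "a1", "bank": "Bank Y", "income": 9500, "state": "Johor"},
    {"applicant_key": "a1", "bank": "Bank Z", "income": 4000, "state": "Selangor"},
    {"applicant_key": "b2", "bank": "Bank X", "income": 6000, "state": "Penang"},
]


def inconsistency_features(records, fields=("income", "state")):
    """Per applicant, count the banks applied to and the distinct values
    declared for each field across those applications."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["applicant_key"]].append(rec)

    features = {}
    for key, recs in grouped.items():
        row = {"n_banks": len({r["bank"] for r in recs})}
        for field in fields:
            # More than one distinct value means the applicant declared
            # different details to different banks.
            row[f"n_distinct_{field}"] = len({r[field] for r in recs})
        features[key] = row
    return features


features = inconsistency_features(applications)
```

In this sketch, applicant `a1` applied to three banks with two different income levels and two different states, so the counts for that applicant rise above one. Counts like these could be fed to a model as inputs; a single bank looking only at its own applications would not see the discrepancy.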

Using consortium data provided by participating banks, the Machine Learning enhancement helped reduce false positives by up to 41% for credit card and home loan applications. A false positive occurs when a legitimate application from a genuine applicant is flagged as suspicious. In addition, the machine learning model delivered a 23% uplift in fraud detection for credit card applications without increasing the number of alerts, and therefore without adding to the workload of fraud investigators and reviewers.

How were the machine learning models introduced to CTOS IDGuard?

For this study, two models with different objectives were created and applied to targeted segments to evaluate new applications. A final step then combined their output with the rules.

Missed Fraud Model: Applied to applications that do not trigger a rules-based fraud alert. This model aims to detect additional fraud that the rules miss.

False Positive Model: Applied to applications that do trigger a rules-based fraud alert. This model aims to identify false-positive alerts produced by the rules.

Alert Mapping: The Rules Alert and the Machine Learning Alert are mapped to a Final Fraud Alert.
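
The routing described above can be sketched in a few lines of Python. This is a hedged illustration, not the mapping used in CTOS IDGuard: the function name `final_fraud_alert`, the single `ml_score` argument, the 0.8 threshold and the choice to suppress a rules alert when the False Positive Model is confident are all assumptions made for the example.

```python
def final_fraud_alert(rules_alert, ml_score, threshold=0.8):
    """Map a rules outcome and a model score to a final alert decision.

    Assumed meaning of ml_score (hypothetical):
      - rules fired:        score from the False Positive Model, i.e. the
                            estimated probability the alert is a false positive
      - rules did not fire: score from the Missed Fraud Model, i.e. the
                            estimated probability the application is fraudulent
    """
    if rules_alert:
        # Keep the alert unless the model is confident it is a false positive.
        return ml_score < threshold
    # Raise a new alert only when the model is confident the rules missed fraud.
    return ml_score >= threshold


# Usage: each tuple is (rules_alert, ml_score) for one hypothetical application.
decisions = [final_fraud_alert(r, s) for r, s in [(True, 0.95), (True, 0.2), (False, 0.9), (False, 0.1)]]
```

Under these assumptions, the first application loses its rules alert, the second keeps it, the third gains a machine learning alert, and the fourth stays clear. A real deployment might map the two alerts to priority levels rather than a yes/no outcome.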

Example of fraud alert flagged by Machine Learning

[Figure: Applicant A’s applications, with the high-risk applications highlighted]

  • The numbers show the chronological order of Applicant A’s applications
  • The applications highlighted in yellow are the high-risk applications identified by Machine Learning
  • The applicant applied to multiple banks, and the information provided was inconsistent across the applications, e.g. state, income and occupation
  • This application was flagged because CTOS IDGuard cross-referenced the shared consortium data, detecting fraud that would otherwise have slipped through the cracks

The study demonstrated the benefits of the fraud models that were developed using anonymised consortium data derived from the customer bases of the participating members. Overall, we saw a significant uplift in preventing fraudulent applications relating to forged documents and falsified data.

Most importantly, this underscores how effective consortium data, machine learning models and rules are in combination, giving banks and financial institutions a holistic fraud prevention solution to stay one step ahead of fraudsters.

To learn more about GBG’s Machine Learning Module, visit

GBG won the award for the Best AI or Machine Learning Innovation of the Year at the Asia Risk Awards 2021. To know more visit
