Fraud Detection: Leveraging Large Datasets to Develop Models – Info Risk Manage
This is the fourth in my series on five keys to using AI and machine learning in fraud detection.
Research shows that the depth and breadth of training data matter more to machine learning model performance than the cleverness of the algorithm. Data is the computing equivalent of human experience.
This suggests that, when possible, you can improve predictive accuracy by expanding the dataset from which a machine learning model’s predictive characteristics are derived.
Think about it: There’s a reason why physicians see thousands of patients during their training. It’s this amount of experience, or learning, that allows them to accurately diagnose within their area of specialization. In fraud detection, a model will benefit from the experience gained by ingesting millions or billions of examples, consisting of both legitimate and fraudulent transactions. Superior fraud detection is achieved by analyzing an abundance of transactional data in order to effectively understand behavior, and assess risk, at an individual level.
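To make "behavior at an individual level" concrete, here is a minimal sketch of a per-cardholder spending profile. It is an illustration only, not FICO's method: the `RunningProfile` class, the sample history and the choice of a z-score on transaction amount are all assumptions made for this example. The running mean and variance use Welford's method, so each profile updates one transaction at a time.

```python
import math
from collections import defaultdict


class RunningProfile:
    """Running mean and variance of one cardholder's amounts (Welford's method)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, amount):
        self.n += 1
        delta = amount - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (amount - self.mean)

    def zscore(self, amount):
        """Standard deviations from this cardholder's mean; None if history is too thin."""
        if self.n < 2:
            return None
        std = math.sqrt(self.m2 / (self.n - 1))
        if std == 0:
            return None
        return (amount - self.mean) / std


# Hypothetical history of (cardholder_id, amount) pairs.
history = [("A", 20.0), ("A", 25.0), ("A", 22.0), ("A", 18.0),
           ("B", 900.0), ("B", 1100.0), ("B", 1000.0)]

profiles = defaultdict(RunningProfile)
for card_id, amount in history:
    profiles[card_id].update(amount)

# The same 950.00 purchase is scored against each cardholder's own history.
z_a = profiles["A"].zscore(950.0)  # far outside A's usual range
z_b = profiles["B"].zscore(950.0)  # close to B's usual range
print(f"A: {z_a:.1f}  B: {z_b:.1f}")
```

A cardholder with fewer than two transactions gets no score at all, which is the small-scale version of the point above: the more history a profile has seen, the more it can say about what is normal.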
At FICO, we have performed extensive research on different modeling techniques. Across a variety of use cases, we have found that the volume and variety of training data are more critical to prediction than the type of algorithm used. This research, and similar independent research throughout the AI community, indicates that fraud models developed and trained using data from thousands of institutions will be more accurate than models that rely on a relatively thin dataset.
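One way to examine the effect of data volume on your own models is a learning curve: train the same model on progressively larger samples and score each one on a fixed holdout set. The sketch below is a toy harness, not a reproduction of the research described above. The data is synthetic, the model is a deliberately simple nearest-centroid classifier, and every name in it (`make_transactions`, `learning_curve`, the two generic features) is hypothetical.

```python
import random


def make_transactions(n, fraud_rate, rng):
    """Synthetic (feature_1, feature_2, is_fraud) rows; purely illustrative."""
    rows = []
    for _ in range(n):
        is_fraud = rng.random() < fraud_rate
        center = 1.5 if is_fraud else 0.0
        rows.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0), is_fraud))
    return rows


def fit_centroids(rows):
    """Nearest-centroid model: the mean feature vector of each class seen."""
    sums = {True: [0.0, 0.0, 0], False: [0.0, 0.0, 0]}
    for x, y, label in rows:
        s = sums[label]
        s[0] += x
        s[1] += y
        s[2] += 1
    return {label: (s[0] / s[2], s[1] / s[2])
            for label, s in sums.items() if s[2] > 0}


def predict(centroids, x, y):
    """Return the label of the closest class centroid."""
    return min(centroids,
               key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2)


def balanced_accuracy(centroids, rows):
    """Average of per-class recall, so the rare fraud class counts equally."""
    hits = {True: 0, False: 0}
    totals = {True: 0, False: 0}
    for x, y, label in rows:
        totals[label] += 1
        hits[label] += predict(centroids, x, y) == label
    rates = [hits[k] / totals[k] for k in totals if totals[k] > 0]
    return sum(rates) / len(rates)


def learning_curve(sizes, fraud_rate=0.1, seed=0):
    """Score models trained on growing prefixes of one pool against one holdout."""
    rng = random.Random(seed)
    holdout = make_transactions(5000, fraud_rate, rng)
    pool = make_transactions(max(sizes), fraud_rate, rng)
    return [(n, balanced_accuracy(fit_centroids(pool[:n]), holdout)) for n in sizes]


for n, score in learning_curve([20, 200, 2000, 20000]):
    print(f"{n:>6} training rows -> balanced accuracy {score:.3f}")
```

Balanced accuracy is used instead of plain accuracy because fraud is rare: a model that never flags anything would otherwise look deceptively good. A model this simple levels off quickly, so treat the harness as a way to measure where more data stops helping a given model, not as evidence for any particular result.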
That’s why the FICO Falcon Intelligence Network is such a vital part of our fraud solutions. It pools the fraud and non-fraud data from thousands of card issuers worldwide to build the best training set in the industry for machine learning and AI models.
Key 5 is using adaptive analytics and self-learning AI. Watch for that post, and follow me on Twitter @FraudBird.
Article Prepared by Ollala Corp