Morning Consult: Federated Learning Is The New Thing in AI/ML for Detecting Financial Crimes and Managing Risk
This article originally appeared in Morning Consult.
In September, FinCEN began soliciting comment on a new regulation defining the effectiveness and efficiency of anti-money laundering programs. In January, Congress voted overwhelmingly to pass into law the Anti-Money Laundering Act of 2020. And on March 31, 2021, the financial institutions’ regulators issued a “Request for Information and Comment on Financial Institutions’ Use of Artificial Intelligence, including Machine Learning.” The financial industry is practically screaming, “We know AI changes everything! You have our attention.”
However, the industry can’t just jump into AI/ML without acknowledging the challenges and concerns.
Our current anti-money laundering regime requires radical reform. Machine-learning technologies can provide the long-awaited improvements, and AI/ML technologies abound. Financial institutions have created internal data science teams, and the industry has begun to bring AI/ML systems to the financial sector. However, as recently identified by U.S. regulators, the data-intensive nature of AI/ML poses a risk to any deployment, and is therefore an impediment to realizing the promised and desperately needed improvements.
For various reasons — including privacy, sovereignty and the cost of moving data — the financial industry doesn’t share data between institutions. Data remains siloed within banks, and within departments within banks, limiting the amount of data available for training ML.
Imagine a world of collaboration, populated by communities of interested professionals sharing knowledge and expertise while enhancing privacy. This is the world of federated learning. Pooling data from many financial institutions to train ML models is fraught with challenges. Securely moving the computation to the data, which stays at each financial institution, is far easier for institutions to adopt. This approach is the foundation of federated learning.
Federated learning, the next big thing in ML, solves these data challenges. It will bring AML systems to unprecedented standards of effectiveness and efficiency.
The current system compels the financial industry to spend billions of dollars on compliance but counters less than 1 percent of the problem, making it extremely inefficient. U.S. financial services firms spend over $25 billion on compliance to combat money laundering, according to the Association of Certified Financial Crime Specialists. But according to one 2018 study, compliance costs are more than 100 times greater than recovered criminal funds. The current system also produces ineffective results. Nearly all of the 2.3 million suspicious activity reports filed by AML systems in 2019 provided no actual value. Fewer than 5 percent of all SARs filed warrant further investigation by regulators.
Artificial intelligence and ML are providing the next generation of technologies to transform the detection of money laundering, terrorist financing and other financial crimes, making it both effective and efficient. ML technologies learn by example (inductive learning). Given enough examples of a type of financial behavior, ML algorithms can identify the learned behavior and distinguish it from all others. For example, when trained on images of cats, a machine can distinguish images of cats from dogs and all other non-cat images. Likewise, when an algorithm trains on a specific type of bank customer or financial crime behavior, a machine can distinguish the learned customer behaviors from the behaviors of the rest of the population.
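To make "learning by example" concrete, here is a minimal sketch of a nearest-centroid classifier. It is an illustration only, not how any production AML system works: the two numeric features and the labels `learned_behavior` and `other` are hypothetical, and the four training examples are made-up values.

```python
def centroid(rows):
    """Average each feature across a list of feature vectors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def train(examples):
    """Learn one centroid per label from (features, label) pairs."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(rows) for label, rows in by_label.items()}

def predict(model, features):
    """Return the label whose centroid is closest to the new example."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(features, c))
    return min(model, key=lambda label: dist2(model[label]))

# Hypothetical, made-up examples: two normalized activity features per customer.
examples = [
    ([0.9, 0.8], "learned_behavior"),
    ([0.8, 0.9], "learned_behavior"),
    ([0.1, 0.2], "other"),
    ([0.2, 0.1], "other"),
]
model = train(examples)
print(predict(model, [0.7, 0.75]))
print(predict(model, [0.3, 0.2]))
```

The point of the sketch is the dependence on examples: a label with no training rows has no centroid, so the model cannot recognize that behavior at all.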
For example, money services businesses are required under the Bank Secrecy Act to have safeguards in place; they are also required to have accounts with other regulated banks. Sometimes, MSBs that choose to engage in illicit activities will mislead banks and hide their identities, so training ML models to identify MSBs by their activity, and not by the disclosed business description, will dramatically improve the effectiveness of an AML transaction monitoring system. While the current systems reach efficiencies of no greater than 5 percent, federated learning models can achieve 85 percent efficiency or higher.
The problem with the current AML approach lies in the limitations of any one bank’s data. Collectively, the financial data processed by banks across the United States and around the world is enough to train the world’s most efficient AML machine. But data privacy rules, security issues and technological limitations do not allow for information sharing between enterprises. As a result, banks only have access to the data of their own customers and transactions, and financial criminals are able to leap from one institution to another, obscuring their tracks. For example, if a bank chooses not to provide financial products to MSBs, that bank would not have sufficient examples to train ML models to detect MSBs. This is where federated learning comes in.
Federated learning is a distributed machine learning technique in which models move across banks, learning from each dataset separately and adjusting their parameters as they learn from more and more data. These adjustments, called “model deltas,” are the only data shared across banks; all the training data remains secure and private.
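The exchange of model deltas can be sketched in a few lines. This is a simplified illustration in the style of federated averaging, not the implementation of any particular product: the linear model, the synthetic data generated by `make_bank_data`, the learning rate, and the function names are all assumptions made for the example.

```python
import random

def local_update(global_w, data, lr=0.1, epochs=20):
    """Train on one bank's private data; return only the weight delta."""
    w = list(global_w)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi - lr * g / len(data) for wi, g in zip(w, grad)]
    return [wi - gi for wi, gi in zip(w, global_w)]

def aggregate(deltas, sizes):
    """Average the deltas, weighting each bank by its number of examples."""
    total = sum(sizes)
    dim = len(deltas[0])
    return [sum(d[j] * n for d, n in zip(deltas, sizes)) / total
            for j in range(dim)]

def federated_round(global_w, bank_datasets):
    """One round: every bank trains locally, and only deltas leave the bank."""
    deltas = [local_update(global_w, data) for data in bank_datasets]
    sizes = [len(data) for data in bank_datasets]
    avg = aggregate(deltas, sizes)
    return [g + a for g, a in zip(global_w, avg)]

def make_bank_data(n, seed):
    """Synthetic stand-in for a bank's private data: y = 2 + 3x."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [1.0, rng.uniform(-1, 1)]  # constant term plus one feature
        data.append((x, 2.0 + 3.0 * x[1]))
    return data

banks = [make_bank_data(50, 1), make_bank_data(80, 2), make_bank_data(30, 3)]
w = [0.0, 0.0]
for _ in range(50):
    w = federated_round(w, banks)
print(w)  # should approach the shared pattern [2.0, 3.0]
```

Note that `federated_round` never reads another bank's rows directly in the aggregation step; it sees only the lists returned by `local_update`. In a real deployment each `local_update` call would run inside the bank that owns the data.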
Federated learning differs from the two other leading approaches to collaboration. The first is collecting all data from all banks in one location, which infringes on customers’ privacy. The second is sharing personally identifiable information on a case-by-case basis under Section 314 of the Patriot Act. This approach shares data and is useful for investigations but not for the initial identification of illicit activity.
Federated learning models allow the access and interrogation of data sets in different institutions, databases, and even jurisdictions without ever moving the data or any sensitive customer information. How? With federated learning, the algorithm, not the data itself, is exchanged between decentralized servers at institutions. As the algorithm “trains on” more and more data, it enables risk and compliance professionals to detect illicit activity more accurately within their networks.
However, for this scheme to work, we must address the security and privacy of data and models. First, we must protect the models that are being trained on a bank’s data against snooping and tampering. Second, the model deltas must also be protected when they are aggregated. Finally, the communication links on which the model deltas are sent to the aggregation engine must be robust to protect against tampering and denial of service attacks.
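As one small illustration of the tampering concern, the sketch below attaches a keyed hash (HMAC) to a model delta so that the aggregation engine can reject a delta altered in transit. It is an assumption-laden example, not a description of any deployed system: the shared key, the JSON encoding, and the function names `sign_delta` and `verify_delta` are hypothetical, and the sketch does nothing about snooping or denial of service.

```python
import hashlib
import hmac
import json

def sign_delta(delta, key):
    """Serialize a model delta and compute an HMAC-SHA256 tag over it."""
    payload = json.dumps(delta).encode("utf-8")
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload, tag

def verify_delta(payload, tag, key):
    """Recompute the tag; refuse the delta if it does not match."""
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("model delta failed integrity check")
    return json.loads(payload.decode("utf-8"))

# Hypothetical key shared between one bank and the aggregation engine.
key = b"example-shared-secret"
payload, tag = sign_delta([0.12, -0.05], key)
print(verify_delta(payload, tag, key))
```

A check like this covers only integrity. Keeping the deltas confidential and keeping the links available would need separate measures, such as encrypted channels and protected execution environments.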
The security and privacy inherent to federated learning will enable an increase in effectiveness and an increase in efficiency. This capability has the potential to revolutionize not just the traditional banking industry but also the fintech industry and online payment platforms, by allowing organizations to save costs, redeploy personnel, and prioritize their AML and counter-fraud efforts more effectively.
Nikhil Deshpande, Ph.D., is the director of AI, Security and HPC product innovations at Intel Corp.
Gary M. Shiffman, Ph.D., is the co-founder and CEO of Consilient and the founder and CEO of Giant Oak.