From insideBIGDATA.

In this special guest feature, Gary M. Shiffman, PhD, Co-founder and CEO, Consilient, takes a look at Federated Machine Learning, the branch of machine learning that’s sure to be a revolution for financial crimes and compliance (FCC) professionals by enabling collaboration while preserving privacy. Gary is an applied micro-economist and business executive working to combat organized violence, corruption, and coercion. His past experience includes senior positions at the Pentagon, the U.S. Senate, and the Department of Homeland Security. He is the Founder and CEO of Giant Oak, Inc. and the Co-Founder and CEO of Consilient, Inc., machine learning and artificial intelligence companies building solutions to support professionals promoting national security and combating financial crime. Dr. Shiffman is the author of The Economics of Violence: How Behavioral Science Can Transform Our View of Crime, Insurgency, and Terrorism, published by Cambridge University Press in 2020.

Money laundering has been in the headlines a lot recently, thanks in large part to Putin loyalists and their big yachts. But money laundering has been a massive and persistent problem for as long as people have had a reason to hide their money. Modern money laundering entered common consciousness about 100 years ago thanks to Al Capone and his crew. According to the United Nations Office on Drugs and Crime, 2-5% of global GDP — up to $2 trillion — is laundered through the global financial system each year.

As Financial Crimes and Compliance (FCC) professionals work diligently to make a dent in this massive illicit line of business, the chaos of a thousand good ideas lands on the data teams. Yet when managing massive amounts of highly sensitive data, every data-sharing event carries a significant risk of compliance and privacy violations. Add to that the consumer and reputational risks should that data leak, and the technological challenge of simply moving that much data. Overall, bank operating costs spent on compliance have risen by more than 60% in the last eight years. What are big data professionals at financial institutions to do?

The obvious answer is: don’t move the transaction data. Yet as someone who spent a career in the national security community, I know that sharing data helps discover criminals and terrorists. In fact, this sharing is necessary: across government agencies, across borders with allies, and between public and private institutions. But because financial institutions have been precluded from sharing, collaboration has been slow to non-existent. For this reason, less than 1% of that $2 trillion in laundered proceeds is interdicted each year, in spite of the $50-100bn spent to do so.

Federated Machine Learning Looks to Solve the Challenge

To solve this problem, look to Federated Machine Learning, the branch of machine learning that’s sure to be a revolution for FCC professionals by enabling collaboration while preserving privacy. After all, money launderers are humans and therefore display consistent patterns of behavior. Machine learning (ML) technology, at its core, detects patterns across big data. Many banks have ML teams running experiments today that use ML to detect patterns of money-laundering activity. However, they do so individually, without collaboration, and collaboration is the holy grail of crime fighting.

Federating machine learning allows separate financial institutions to collaborate, looking beyond the four walls of the enterprise without actually moving data. This is the revolution: institutions get the benefits of collaboration without taking on the compliance, privacy, and reputational risks. Through the federation process, each model gets smarter over time, and more models enter the library. Federated machine learning is crushing the 95% false positive rate that most institutions face with their rules-based transaction-monitoring systems.
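To make the idea of collaborating without moving data concrete, here is a minimal sketch. The article does not say which federation algorithm is used, so the sketch assumes federated averaging, a common approach in which each institution trains on its own data and only the resulting model weights are combined. The toy logistic-regression model, the synthetic data, and every name in it (`make_bank`, `local_update`, `federated_average`, `federated_round`) are hypothetical illustrations, not a description of any production system.

```python
import math
import random


def make_bank(seed, n=40):
    """Synthetic stand-in for one institution's private data (illustrative only)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        rows.append([1.0, x1, x2])          # first feature is a bias term
        labels.append(1 if x1 + x2 > 0 else 0)
    return rows, labels


def local_update(weights, rows, labels, lr=0.1, epochs=20):
    """Train logistic regression locally; raw rows never leave this function."""
    w = list(weights)
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            for i, xi in enumerate(x):
                w[i] -= lr * (p - y) * xi   # gradient step on log loss
    return w


def federated_average(updates, sizes):
    """Combine local weights, weighting each institution by its row count."""
    total = sum(sizes)
    dim = len(updates[0])
    return [sum(u[i] * n for u, n in zip(updates, sizes)) / total
            for i in range(dim)]


def federated_round(global_w, institutions):
    """One round: every institution trains locally, then only weights are averaged."""
    updates = [local_update(global_w, rows, labels) for rows, labels in institutions]
    sizes = [len(rows) for rows, _ in institutions]
    return federated_average(updates, sizes)


def accuracy(w, rows, labels):
    """Share of rows whose predicted class matches the label."""
    hits = 0
    for x, y in zip(rows, labels):
        z = sum(wi * xi for wi, xi in zip(w, x))
        hits += int((z > 0) == (y == 1))
    return hits / len(rows)


banks = [make_bank(seed) for seed in (1, 2, 3)]
global_weights = [0.0, 0.0, 0.0]
for _ in range(5):
    global_weights = federated_round(global_weights, banks)

print("shared model weights:", global_weights)
```

In this sketch the only values that cross institutional boundaries are the weight vectors returned by `local_update`; the rows and labels stay where they were created. Real deployments would add protections that the sketch omits, such as secure aggregation of the updates.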

Federated machine learning is more accurate, for a variety of reasons:

  • It eliminates information-sharing constraints. By federating machine learning, collaboration happens without sharing any data. Federated machine learning bypasses the regulatory, privacy, and technological constraints of sharing data. Each institution keeps its data exactly where the data team wants it kept.
  • It discovers more illicit activities. Through collaboration, model effectiveness improves beyond the upper bound of any one bank’s ability to build its own model, ML or otherwise. Beyond the accuracy of any one model, each participating institution also gets access to an ever-increasing library of models.
  • It works faster and at lower cost. Through collaboration, model efficiency, like effectiveness, improves beyond any bank’s capability in the absence of collaboration. Using federated machine learning, banks can send fewer cases to human review and yet surface more actual risk. They can also reduce the amount they pay in fines, which is perhaps the most important way to control costs.

Initial testing of federated machine learning in the FCC space in 2021 and 2022, in which financial institutions collaborated without sharing data, has demonstrated a massive reduction in workload, as measured by three metrics:

  • Hit Rate: of all potential cases, the share sent for human review.
  • Hit Yield: of those sent for human review, the share actually displaying the searched-for behavior.
  • Discovery Rate: of those displaying the searched-for behavior, the share that would have gone undetected without collaboration through federated machine learning.
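The three metrics above are simple ratios, and a short sketch shows how they relate to one another. The field names and the case counts below are hypothetical, chosen only to illustrate the arithmetic; they are not results from the testing described here.

```python
from dataclasses import dataclass


@dataclass
class ReviewCounts:
    """Hypothetical case counts for one monitoring period."""
    potential_cases: int             # every case the system considered
    sent_for_review: int             # cases escalated to a human reviewer
    confirmed: int                   # reviewed cases showing the searched-for behavior
    found_only_with_federation: int  # confirmed cases missed without collaboration


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def hit_rate(c):
    """Of all potential cases, the share sent for human review."""
    return _ratio(c.sent_for_review, c.potential_cases)


def hit_yield(c):
    """Of those sent for review, the share displaying the behavior."""
    return _ratio(c.confirmed, c.sent_for_review)


def discovery_rate(c):
    """Of those displaying the behavior, the share undetected without federation."""
    return _ratio(c.found_only_with_federation, c.confirmed)


# Made-up numbers, for illustration only.
example = ReviewCounts(potential_cases=10_000, sent_for_review=200,
                       confirmed=50, found_only_with_federation=10)
print(hit_rate(example), hit_yield(example), discovery_rate(example))
```

Read together, a lower hit rate combined with a higher hit yield means reviewers see fewer cases but more of them matter, which is the workload reduction the metrics are meant to capture.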

When we talk about banks and their data, we are usually, and rightly, talking about data islands, since the second you move that data beyond the four walls of the enterprise, you create compliance and privacy risks. Federated machine learning has shown that it is ready to change this game. Look out, those of you in the 2-5% of global GDP engaged in money laundering, drug trafficking, human trafficking, and fraud. With the significant investments made in identifying you, and the introduction of federated machine learning into the arena, the easy days of evasion are numbered.