
The Office of Data and Innovation (ODI) built a model that flags electronic benefits transfer (EBT) theft for the California Department of Social Services (CDSS).
Project scope
- Timeline: 10 months
- Team: 3 data scientists, 4 engineers, 1 analytics manager, 3 program managers, 2 research scientists, 1 research analyst
- Reach: 562,000 households
Partner
California Department of Social Services (CDSS)
Methods
- Data engineering
- Data visualization
- Machine learning
- Statistical methods
The opportunity
California Work Opportunity and Responsibility to Kids (CalWORKs) is a critical safety net program overseen by CDSS. It provides temporary cash aid and services to eligible families with children. Families rely on CalWORKs to help pay for essentials like housing, food, and child care. Benefits are loaded onto a debit card. Recipients can withdraw cash at ATMs or stores.
Criminals target CalWORKs debit cards and steal benefits. About $13 million a month was being stolen. It can take weeks to have stolen benefits replaced.
CDSS has been working on stopping benefits theft. One part of their approach is to use transaction data to identify possible thefts. But CDSS had limited access to the data, and getting it involved long turnaround times. They also had trouble identifying which transactions might be theft.

How we helped
ODI built a model that flags benefits theft more quickly and more accurately. During the initial pilot, the tool:
- Shortened the lag time in measuring EBT theft from 2 months to 72 hours (95% faster)
- Collected data automatically (2,160 staff hours saved each year)
- Automatically and correctly identified theft 82% of the time
- Showed where theft happened with precise geographic data
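The percentages above can be checked with simple arithmetic. This is a minimal sketch, assuming "2 months" means roughly 60 days; the variable names are ours and are not part of the ODI tool.

```python
# Assumption: "2 months" is treated as 60 days. The source does not give an exact figure.
HOURS_PER_DAY = 24
old_lag_hours = 60 * HOURS_PER_DAY   # about 2 months, in hours
new_lag_hours = 72                   # lag time after the pilot

# Share of the original lag time that was removed
reduction = (old_lag_hours - new_lag_hours) / old_lag_hours
print(f"Lag time reduced by {reduction:.0%}")  # prints: Lag time reduced by 95%

# 2,160 staff hours a year, expressed per month
hours_per_month = 2160 // 12
print(f"Staff hours saved per month: {hours_per_month}")  # prints: Staff hours saved per month: 180
```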
Our approach
ODI worked with CDSS’s data vendor to set up daily deliveries of raw data. The data went into a secure cloud system run by CDSS. The Office of Technology and Solutions Integration at the California Health and Human Services Agency helped us create a plan to get access to the data.
- We used 4 months of CalWORKs transactions in the study.
- This included 562,000 households that withdrew about $1.4 billion in benefits at 33,000 locations.
- We also included records of $47 million in theft reimbursements paid to about 43,000 households.
We used 2.5 months of these data to train the model. We used the remaining 1.5 months to test its accuracy.
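One way to make a split like this is by date, so the model is tested only on transactions that come after the ones it learned from. The sketch below assumes the split was chronological, which the description suggests but does not state. The start date, the record layout, and the `temporal_split` helper are hypothetical.

```python
from datetime import date, timedelta

def temporal_split(records, cutoff):
    """Split records into (train, test) by date: train is strictly before the cutoff."""
    train = [r for r in records if r["date"] < cutoff]
    test = [r for r in records if r["date"] >= cutoff]
    return train, test

# Hypothetical study window: 120 days (about 4 months), one made-up record per day.
study_start = date(2023, 1, 1)  # placeholder date, not from the project
records = [{"date": study_start + timedelta(days=d), "amount": 100} for d in range(120)]

# About 2.5 months (75 days) for training, the remaining 45 days for testing.
cutoff = study_start + timedelta(days=75)
train, test = temporal_split(records, cutoff)
print(len(train), len(test))  # prints: 75 45
```

Splitting by date rather than at random keeps later transactions out of the training data, which is closer to how the model would be used in practice.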
What we built
ODI built a data pipeline to make the vendor data usable. It automatically:
- Organizes the data so it is easy to use
- Cleans up errors and discrepancies
- Adds geographic coordinates where missing
- Splits records that combine multiple transactions into individual transactions
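The pipeline steps above could be sketched as follows. This is an illustration only: the vendor's real data format is not described here, so every field name, the `LOCATION_COORDS` lookup table, and the sample record are hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional

@dataclass(frozen=True)
class Transaction:
    """One tidy row per withdrawal (hypothetical schema)."""
    card_id: str
    location_id: str
    amount_cents: int
    lat: Optional[float]
    lon: Optional[float]

# Hypothetical lookup used to fill in missing coordinates.
LOCATION_COORDS = {"ATM-001": (38.5816, -121.4944)}

def to_cents(amount: str) -> int:
    """Clean a dollar string such as ' $1,000.50 ' and convert it to cents."""
    cleaned = amount.strip().replace("$", "").replace(",", "")
    return int(Decimal(cleaned) * 100)

def process(raw_records):
    rows = []
    for raw in raw_records:
        # Clean up inconsistent formatting.
        card_id = raw["card_id"].strip()
        location_id = raw["location_id"].strip().upper()
        # Add geographic coordinates where missing.
        lat, lon = raw.get("lat"), raw.get("lon")
        if lat is None or lon is None:
            lat, lon = LOCATION_COORDS.get(location_id, (None, None))
        # Split a record holding several amounts into one row per transaction.
        for amount in raw["amounts"]:
            cents = to_cents(amount)
            if cents <= 0:
                continue  # drop zero or negative amounts as errors
            rows.append(Transaction(card_id, location_id, cents, lat, lon))
    return rows

raw_example = [{"card_id": " 123 ", "location_id": "atm-001",
                "amounts": ["200.00", "$40.00"], "lat": None, "lon": None}]
rows = process(raw_example)
print(len(rows), rows[0].amount_cents, rows[0].lat)  # prints: 2 20000 38.5816
```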

We used a machine learning model called a random forest. Our model focuses on predicting illegal transactions. 99% of all CalWORKs transactions are legitimate. That makes illegal transactions difficult to predict: a model that labeled every transaction as legitimate would be right 99% of the time and still catch no theft. Our model correctly predicts illegal transactions 82% of the time.
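To show why rare illegal transactions are hard to predict, here is a minimal sketch of a random forest trained on synthetic data where about 99% of rows are legitimate. It uses scikit-learn, which is our choice for illustration and not necessarily the library ODI used. The data, features, and settings are all made up, so the scores it prints say nothing about the real model.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: label 1 = illegal, label 0 = legitimate (about 99% of rows).
X, y = make_classification(n_samples=20_000, n_features=8, n_informative=5,
                           weights=[0.99, 0.01], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# class_weight="balanced" makes mistakes on the rare class count for more.
model = RandomForestClassifier(n_estimators=50, class_weight="balanced", random_state=0)
model.fit(X_train, y_train)
pred = model.predict(X_test)

# Accuracy of a useless model that calls every transaction legitimate.
baseline_accuracy = 1.0 - y_test.mean()
# Recall: share of illegal transactions caught. Precision: share of flags that are correct.
recall = recall_score(y_test, pred, zero_division=0)
precision = precision_score(y_test, pred, zero_division=0)

print(f"Always-legitimate baseline accuracy: {baseline_accuracy:.3f}")
print(f"Random forest recall: {recall:.3f}, precision: {precision:.3f}")
```

Because the baseline accuracy is already so high, measures that look only at the illegal class, such as recall and precision, are more informative than overall accuracy.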
We check the model by asking counterfactual questions, such as whether the model would still flag a transaction if one of its details were different. This helps us explain why the model flagged a transaction as illegal or legitimate. This lets researchers explain predictions and build trust with the community.
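One common way to ask a counterfactual question is to change a single detail of a transaction and see whether the model's answer flips. The sketch below shows that idea. The `toy_model` rule is a hypothetical stand-in for a trained classifier, and the feature names and values are made up; the source does not describe ODI's exact method.

```python
from typing import Any, Callable, Dict, List, Tuple

Txn = Dict[str, Any]

def toy_model(txn: Txn) -> bool:
    """Hypothetical stand-in for a trained model. True means 'flagged as illegal'."""
    return txn["amount"] >= 500 and txn["miles_from_home"] > 100

def single_feature_counterfactuals(
    model: Callable[[Txn], bool],
    txn: Txn,
    alternatives: Dict[str, List[Any]],
) -> List[Tuple[str, Any]]:
    """Return every (feature, value) change that flips the model's prediction."""
    baseline = model(txn)
    flips = []
    for feature, values in alternatives.items():
        for value in values:
            changed = {**txn, feature: value}  # copy with one feature changed
            if model(changed) != baseline:
                flips.append((feature, value))
    return flips

txn = {"amount": 800, "miles_from_home": 350}          # flagged by toy_model
alternatives = {"amount": [100, 400], "miles_from_home": [5, 200]}
flips = single_feature_counterfactuals(toy_model, txn, alternatives)
print(flips)  # prints: [('amount', 100), ('amount', 400), ('miles_from_home', 5)]
```

Each change that flips the prediction gives a plain-language explanation, for example "this withdrawal would not have been flagged if it had happened 5 miles from home".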
ODI built the data pipeline and model to integrate with CDSS’s existing systems. CDSS took over the pipeline and model at the end of the project.
“While our breakthrough in measuring and predicting theft in closer to real-time will help CDSS today, this project laid the foundation for even larger impacts in the future. An organization that improves through learning requires good data, people who can extract insight from that data, and leaders that use those insights to take action. CDSS is in the middle of a once-in-a-generation modernization of our data infrastructure along with a push to hire data scientists and engineers. ODI dramatically accelerated our efforts to use our new infrastructure and people to ingest, analyze, and extract insights from vast quantities of data – insights that will improve services and protect benefits for low-income Californians.”
—Joaquin Carbonell, Research Data Supervisor II, CDSS
What’s next
CDSS plans to build on this project by:
- Targeting interventions to protect benefits: At least one theft prevention intervention has been implemented based on this work and more are in development
- Monitoring the impact of upgrades to EBT card technology: California is the first state in the nation to roll out Chip and Tap Cards for EBT benefits
- Testing whether adding the time of each transaction could increase accuracy
- Extending the model to cover CalFresh
- Improving the data pipeline’s geographic data coding
- Identifying compromised cards and where criminals are stealing card information
Learn more about ODI’s work on this project in our technical paper.