Using data to forecast drought in community water outages

A pond of water split by a peninsula of cracked, dry ground

The Office of Data and Innovation (ODI) partnered with the Division of Drinking Water (DDW) to improve a tool that forecasts when drought will affect community water systems.

Project scope

  • Timeline: 4 months
  • Team: 2 data scientists, 2 physical scientists, 1 project manager, 2 program managers
  • Reach: 39 million Californians served by 2,866 community water systems

Partner

Methods

  • Data visualization
  • Machine learning
  • Statistical methods

The opportunity

Map of California with dots for water systems affected by drought. They are distributed throughout the state.
123 community water systems experienced drought in 2023.

DDW, a part of the State Water Resources Control Board (State Water Board), regulates public drinking water systems. It monitors 2,866 community water systems across California. Every year, a small fraction of these water systems runs out of water. In some cases, the State Water Board helps supply these communities with bottled or hauled water.

This prompted DDW to develop a model to forecast the effects of drought on water systems. In early 2022, the model flagged 510 community water systems as likely to experience drought-related problems. Investigating that many systems would take too many resources, so DDW asked ODI to develop a more accurate model.

How we helped

ODI worked with DDW to build a new machine learning model. A machine learning model is a tool that helps make sense of a large amount of information and numbers. DDW has a lot of data about which water systems are and aren’t affected by drought. This made a machine learning model a good solution.

The model runs each spring. It is designed to identify issues that may arise during the dry summer months. This will allow community water systems to anticipate and fix problems before they happen.

The model is easy to use and understand. Researchers can experiment with it by adding more data sources and improving the model. They can also use “what if” questions to explain forecasts. This transparency builds trust with the community.

Our approach

We modeled the flow of:

  • Water throughout the network of water systems in California
  • Groundwater through rocks and soil
  • Surface water through streams, rivers, and reservoirs

Our dataset includes data from 2,866 community water systems from 2021 to 2023. We use 15 features and one outcome to describe each system. Four percent of water systems in our sample experienced some effects due to drought. They:

  • Ran out of water
  • Sustained themselves on bottled or hauled water
  • Experienced a violation for inadequate source capacity
  • Were under a water service connection moratorium
  • Petitioned for exemption from a curtailment order
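
One way to read the outcome is as a single yes/no label per water system: a system counts as affected if any of the five effects above applies. The sketch below illustrates that rule. The field names and the toy records are hypothetical, since the actual data schema is not described here.

```python
from dataclasses import dataclass


@dataclass
class SystemRecord:
    # Hypothetical field names, one per drought effect listed above.
    ran_out_of_water: bool = False
    used_bottled_or_hauled_water: bool = False
    source_capacity_violation: bool = False
    connection_moratorium: bool = False
    curtailment_exemption_petition: bool = False


def drought_affected(record: SystemRecord) -> bool:
    """Return True if at least one of the five drought effects applies."""
    return any((
        record.ran_out_of_water,
        record.used_bottled_or_hauled_water,
        record.source_capacity_violation,
        record.connection_moratorium,
        record.curtailment_exemption_petition,
    ))


def affected_share(records) -> float:
    """Fraction of records labeled as drought-affected."""
    return sum(drought_affected(r) for r in records) / len(records)


# Toy sample (not real data): 24 unaffected systems and 1 under a moratorium.
records = [SystemRecord() for _ in range(24)]
records.append(SystemRecord(connection_moratorium=True))
share = affected_share(records)
print(f"{share:.0%} of the toy sample is labeled drought-affected")
```

With so few affected systems relative to unaffected ones, a label like this is heavily imbalanced, which is worth keeping in mind when judging any model trained on it.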

What we built

We used a type of machine learning model called a support vector machine. It takes 15 features for each water system as input and classifies whether the system is likely to be resilient to drought.
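
To give a feel for how a support vector machine separates two groups, here is a minimal sketch of a linear one, trained by subgradient descent on the hinge loss. It is an illustration only, not the project's implementation: the kernel, training procedure, and features used by the team are not described here, and the two toy features, their values, and all parameter settings below are assumptions. In practice a library such as scikit-learn would normally be used instead.

```python
def decision_score(w, b, x):
    """Signed score: positive means 'drought-affected', negative means 'resilient'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, learning_rate=0.1, reg=0.01, epochs=100):
    """Fit a linear SVM with per-sample subgradient steps on the hinge loss.

    X is a list of feature lists; y holds labels +1 (affected) or -1 (resilient).
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * decision_score(w, b, xi)
            # Regularization step: shrink the weights slightly.
            w = [(1 - learning_rate * reg) * wj for wj in w]
            # Hinge-loss step: only points inside the margin move the boundary.
            if margin < 1:
                w = [wj + learning_rate * yi * xj for wj, xj in zip(w, xi)]
                b += learning_rate * yi
    return w, b


def predict(w, b, x):
    return 1 if decision_score(w, b, x) >= 0 else -1


# Toy data with two hypothetical, pre-scaled features per system
# (the real model uses 15 features that are not listed here).
X = [
    [-1.0, -0.8], [-0.7, -1.0], [-0.9, -0.6],   # drought-affected
    [1.0, 0.8], [0.7, 1.0], [0.9, 0.6],         # resilient
]
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y)
print([predict(w, b, x) for x in X])
```

The model learns one weight per feature plus an offset, and the sign of the resulting score decides the forecast.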

We check the model through “what if” questions. The model makes forecasts based on these hypotheticals. This helps us identify what changes in the input data cause a change in the outcome.

This lets researchers explain forecasts and build trust with the community.
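
A “what if” check can be sketched as follows: change one input for a system, rerun the forecast, and see whether the outcome flips. The stand-in model, its weights, and the feature names below are all hypothetical and exist only to make the example runnable. They are not the project's model or features.

```python
# Hypothetical stand-in for a trained model: a weighted sum of three made-up features.
WEIGHTS = {
    "groundwater_level_change": -1.5,
    "backup_source_count": -1.0,
    "demand_growth": 0.8,
}
BIAS = -0.2


def forecast(system):
    """Return True if the stand-in model forecasts the system as at risk."""
    score = BIAS + sum(WEIGHTS[name] * system[name] for name in WEIGHTS)
    return score > 0


def what_if(system, feature, new_value):
    """Change one feature, rerun the forecast, and report whether it flipped."""
    changed = {**system, feature: new_value}
    before = forecast(system)
    after = forecast(changed)
    return {"before": before, "after": after, "flipped": before != after}


# A made-up system with falling groundwater and no backup source.
system = {
    "groundwater_level_change": -0.6,
    "backup_source_count": 0.0,
    "demand_growth": 0.5,
}

print(what_if(system, "backup_source_count", 2.0))  # adding backup sources
print(what_if(system, "demand_growth", 0.0))        # flattening demand
```

In this toy case, adding backup sources flips the forecast while flattening demand does not, which is the kind of comparison that shows which inputs drive a given forecast.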

The model is designed to be retrained as new data arrives. As more data is added, its forecasts should become more accurate. This will let the model serve the state for years to come.

Recommendations

We gave DDW a roadmap to improve the model’s performance. The steps include:

  • Continue to improve data management. This will reduce manual data collection and make errors easier to catch.
  • Use more data, including non-drought years. This will help researchers make more accurate forecasts across different conditions.

With these improvements, researchers can run rapid experiments. The more tests DDW runs, the better they’ll be able to forecast the impact of drought.

Data and Innovation Fund

This project was part of the Data and Innovation Fund (DIF), one way the Office of Data and Innovation (ODI) helps improve state services for all Californians. ODI works in partnership with the California Department of Technology (CDT), which improves state technology infrastructure through its Technology Modernization Fund (TMF) and Technology Stabilization Fund (TSF). Together, these three funds ensure California state departments innovate by applying human-centered design, data, and IT investments to yield quick and meaningful results.

Email info@innovation.ca.gov to learn more about this project.