Top Questions Utilities Are Asking About AI for Risk Prediction (Part 1)
Data, Accuracy, and Integration

AI, Machine learning, Risk modelling, Predictive Analytics, Pipe Failure Prediction, Capital Planning

As a decision-maker in the water utility sector, you manage high-stakes responsibilities: keeping assets reliable, protecting public safety, and making tough financial decisions that can be costly.

So, when vendors talk about AI predicting pipe failures or optimizing networks, skepticism is not only normal – it’s healthy.

This is part 1 of a 2-part series highlighting questions in the water industry about the readiness to adopt and deploy AI, a technology with the potential to dramatically transform processes and entire industries.

1. What data do we actually need to get started?

The Answer: You need less than you think, but what you have must be accurate. At a minimum, effective AI risk modeling requires:

Pipe inventory data: Material, diameter, installation year, length (from your GIS).

Historical failure records: Break locations, dates, and basic failure modes (from CMMS or work orders).

Pipe segment identification: a consistent identifier for each segment, so failure records can be matched to the pipe inventory.

The good news: most utilities already have 70-80% of what they need. The challenge isn’t usually data availability – it’s data quality and accessibility. Missing installation dates for 15% of your network are manageable; inconsistent material classifications or unreliable failure records are problematic.

The best AI models incorporate additional layers, such as soil type, soil moisture, water quality data, seasonal weather patterns, traffic loading and much more. But this information enhances accuracy rather than enabling basic functionality. The biggest difference in the quality of AI predictions comes from the quality of the data.

Bottom line: If you can produce a decent pipe inventory and a list of when and where breaks have occurred over the past couple of years, you have enough to start.
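As a rough illustration of the minimum dataset described above, the sketch below models the two core tables and joins them on the shared segment identifier. The class names, field names, units, and sample values are all assumptions made for this example; they are not a required schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PipeSegment:
    segment_id: str               # consistent identifier shared with failure records
    material: str
    diameter_mm: float            # unit is an assumption for this sketch
    length_m: float               # unit is an assumption for this sketch
    install_year: Optional[int]   # may be missing for part of the network


@dataclass
class FailureRecord:
    segment_id: str
    failure_date: date
    failure_mode: str


def failures_per_segment(segments, failures):
    """Count recorded breaks per segment and collect records that match no segment."""
    counts = {s.segment_id: 0 for s in segments}
    unmatched = []
    for f in failures:
        if f.segment_id in counts:
            counts[f.segment_id] += 1
        else:
            unmatched.append(f)
    return counts, unmatched


# Made-up sample data for illustration only.
segments = [
    PipeSegment("P-001", "cast iron", 150, 120.0, 1962),
    PipeSegment("P-002", "ductile iron", 200, 85.5, None),
]
failures = [
    FailureRecord("P-001", date(2023, 1, 14), "circumferential break"),
    FailureRecord("P-001", date(2024, 2, 3), "joint leak"),
    FailureRecord("P-999", date(2024, 6, 9), "unknown"),
]
counts, unmatched = failures_per_segment(segments, failures)
```

Failure records that cannot be matched to any segment (here, the hypothetical `P-999`) are a quick indicator of how consistent the identifiers really are.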


2. How accurate are the predictions, really?

The Answer: Modern AI models typically achieve over 85% accuracy* in identifying which pipes will fail within the next year. As noted in the previous section, the quality of the underlying asset and failure data plays a major role in how accurate these predictions can be. But accuracy isn’t the whole story – precision matters more.

Here’s what that means in practice: if the AI flags the top 1% of your network as highest-risk, that segment typically accounts for 40-60%* of actual failures over the next year. With excellent data, some advanced models push this much higher.
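The "top 1% of the network" figure above can be checked against your own history with a simple capture-rate calculation. The sketch below is one way to do it, assuming each pipe already has a risk score and you know which pipes failed in the following year. The function name and the sample data are hypothetical.

```python
def capture_rate(risk_scores, failed, top_fraction):
    """Share of actual failures that fall within the top-ranked fraction of pipes."""
    if not failed:
        return 0.0
    n_flagged = max(1, round(len(risk_scores) * top_fraction))
    # Rank pipe IDs from highest to lowest risk score.
    ranked = sorted(risk_scores, key=risk_scores.get, reverse=True)
    flagged = set(ranked[:n_flagged])
    return len(flagged & failed) / len(failed)


# Made-up example: 200 pipes, scores decreasing from P0 to P199.
risk_scores = {f"P{i}": (200 - i) / 200 for i in range(200)}
failed = {"P0", "P1", "P50", "P120"}   # pipes that actually failed (illustrative)

top_1pct_capture = capture_rate(risk_scores, failed, top_fraction=0.01)
```

In this toy dataset the top 1% is two pipes, both of which failed, so half of the four failures are captured. Real results depend entirely on your data and model.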

The more useful question is precision: how many false alarms will we get? In other words, how often does the model flag a pipe as high-risk when it actually proves to be healthy? Well-calibrated models typically achieve precision in the 80-90% range, meaning 8-9 of every 10 high-risk flags correspond to actual problems.**
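To see why accuracy alone can mislead, it helps to compute accuracy, precision, and recall side by side. The counts below are invented for illustration (10,000 pipes, 100 of which fail) and are not results from any real model.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        # Precision: of the pipes flagged as high-risk, how many actually failed?
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Recall: of the pipes that actually failed, how many were flagged?
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Illustrative model: flags 70 pipes, 60 of which fail; misses 40 failures.
model = classification_metrics(tp=60, fp=10, fn=40, tn=9890)

# A "model" that never flags anything still scores 99% accuracy here,
# because the vast majority of pipes do not fail in a given year.
flag_nothing = classification_metrics(tp=0, fp=0, fn=100, tn=9900)
```

Because failures are rare, the do-nothing baseline looks almost as accurate as the real model. Precision and recall are what separate the two.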

Reality check: AI won’t predict every failure. A water main hit by an excavator or a manufacturing defect that suddenly manifests won’t show up in historical patterns. Expect to catch 50-70%* of failures, which is transformative compared to the essentially random 10-15%* you’d catch with age-based replacement alone.

The accuracy improves over time as the model learns from your specific system and incorporates new failure data. Think of year one as calibration and year two onward as optimization.

3. What if our data quality isn’t great?

The Answer: Imperfect data is normal – every utility has gaps. The question is whether your data is “messy but usable” or “fundamentally unreliable.”

AI can work with:

Missing pipe attributes for some portion of the network (often AI can infer these from the dataset).

Incomplete soil or pressure data (models use regional proxies).

Inconsistent recording practices over time (with data cleaning).

You cannot work with:

Systematically wrong material classifications (calling cast iron “ductile iron”).

Missing failure records for significant time periods.

GIS data that doesn’t match physical reality.

The path forward: Most vendors offer a data assessment phase where they analyze your existing data quality and identify critical gaps. Expect to spend 2-4 months on data cleaning before model deployment. This isn’t a waste of time—it’s infrastructure hygiene that benefits everything digital.
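A first-pass data assessment can be as simple as counting gaps and unrecognized values. The sketch below assumes inventory records are dictionaries with hypothetical `material` and `install_year` keys, and the list of valid materials is a placeholder you would replace with your own classification.

```python
# Placeholder vocabulary; substitute your utility's own material classes.
VALID_MATERIALS = {"cast iron", "ductile iron", "pvc", "asbestos cement", "steel"}


def assess_inventory(records):
    """Report missing installation years and material labels outside the vocabulary."""
    total = len(records)
    missing_year = 0
    unrecognized = set()
    for r in records:
        if r.get("install_year") is None:
            missing_year += 1
        raw = r.get("material")
        normalized = (raw or "").strip().lower()
        if normalized not in VALID_MATERIALS:
            unrecognized.add(raw if raw else "(missing)")
    return {
        "missing_install_year_pct": 100.0 * missing_year / total if total else 0.0,
        "unrecognized_materials": sorted(unrecognized),
    }


# Made-up records for illustration only.
records = [
    {"segment_id": "P-001", "material": "Cast Iron", "install_year": 1962},
    {"segment_id": "P-002", "material": "CI", "install_year": None},
    {"segment_id": "P-003", "material": "ductile iron", "install_year": 1988},
    {"segment_id": "P-004", "material": None, "install_year": 1975},
]
report = assess_inventory(records)
```

A check like this finds inconsistent labels and gaps, the "messy but usable" category. It cannot detect a pipe that is consistently recorded as the wrong material; that requires field verification.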

Pro tip: Start tracking failures with GPS coordinates today, even if you’re not ready for AI. Twelve months of clean failure data dramatically improve model results.


4. How do we integrate this with our existing capital planning process?

The Answer: The best implementations augment rather than replace existing planning workflows. Here’s how utilities typically integrate AI:

Phase 1 – Parallel Running (Months 1-6): Run AI predictions alongside your traditional planning. Use the AI’s high-risk segments as an additional input when developing or reviewing your planned replacements. You’ll often find 30-40% alignment and discover critical gaps in both directions – pipes you were about to replace that are actually low-risk, and high-risk segments not on your radar.
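During parallel running, the comparison between the two lists is a plain set operation. In the sketch below, "alignment" is defined as the share of planned replacements that the AI also flags. That definition is an assumption for this example, as are the function name and the segment IDs.

```python
def compare_plans(ai_high_risk, planned):
    """Compare AI-flagged segments against the existing replacement plan."""
    ai, plan = set(ai_high_risk), set(planned)
    both = ai & plan
    return {
        "aligned": sorted(both),
        # On the plan but not flagged by the AI: candidates to re-examine.
        "planned_but_not_flagged": sorted(plan - ai),
        # Flagged by the AI but not on the plan: potential blind spots.
        "flagged_but_not_planned": sorted(ai - plan),
        "alignment_pct": 100.0 * len(both) / len(plan) if plan else 0.0,
    }


# Made-up example: ten planned segments, five AI-flagged segments.
planned = [f"P-{i:03d}" for i in range(1, 11)]            # P-001 .. P-010
ai_high_risk = ["P-001", "P-004", "P-007", "P-011", "P-012"]

comparison = compare_plans(ai_high_risk, planned)
```

The two difference lists correspond to the gaps "in both directions" mentioned above and are a natural agenda for the planning review.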

Phase 2 – Hybrid Decision-Making (Months 6-18): Use AI risk scores as one factor among many. Many utilities also use planning tools to quickly generate and test different replacement scenarios – adjusting budgets, priorities, or operational constraints – to see how those choices affect the projects that rise to the top. Increasingly, modern platforms allow utilities to move directly from risk insights to project planning within the same workflow.

A typical scoring matrix might include:

AI likelihood of failure (40% weight).

Consequence of failure – criticality of location (30% weight).

Operational considerations – planned street work, grant opportunities (20% weight).

Condition assessment findings (10% weight).
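The scoring matrix above amounts to a weighted sum. The sketch below uses the weights from the list and assumes each factor has already been normalized to a 0-1 scale. The normalization, the factor names, and the sample scores are assumptions for illustration.

```python
# Weights taken from the scoring matrix above.
WEIGHTS = {
    "likelihood_of_failure": 0.40,
    "consequence_of_failure": 0.30,
    "operational": 0.20,
    "condition_assessment": 0.10,
}


def priority_score(factors):
    """Weighted sum of factor scores, each assumed to be on a 0-1 scale."""
    return sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)


# Made-up candidate segments for illustration only.
candidates = {
    "P-001": {"likelihood_of_failure": 0.9, "consequence_of_failure": 0.4,
              "operational": 0.5, "condition_assessment": 0.7},
    "P-002": {"likelihood_of_failure": 0.6, "consequence_of_failure": 0.9,
              "operational": 0.2, "condition_assessment": 0.5},
}

# Rank candidates from highest to lowest priority.
ranked = sorted(candidates, key=lambda pid: priority_score(candidates[pid]), reverse=True)
```

Changing the weights and re-ranking is a lightweight way to test how sensitive the project list is to each factor.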

Phase 3 – AI-Driven Planning (Months 18+): AI risk scores become the primary driver of planning, with human override for special circumstances, e.g., consequence of failure (COF) and business risk exposure (BRE). Field crews validate high-risk predictions through targeted condition assessment.

The key: AI doesn’t make decisions – it informs decision-makers and helps accelerate planning. You still own the capital plan. The difference is you’re now working from data-driven prioritization rather than relying primarily on age-based assumptions or limited historical insight.

5. What happens when the AI is wrong?

The Answer: AI will never be 100% accurate. It will be wrong sometimes. But that is not unique to AI. Utilities already rely on prediction methods – primarily pipe age, break history, and engineering judgment – that also produce false positives and missed failures, but with much lower accuracy than AI. And like any planning framework, the results should be reviewed over time so utilities can learn from missed failures or incorrect predictions and continuously improve their planning.


When AI predicts a failure that doesn’t happen (false positive):

You’ve invested in inspecting or replacing a pipe earlier than necessary.

The cost is usually modest—a wasted inspection or slightly premature replacement.

This becomes valuable data that improves the model.

When AI misses a failure (false negative):

You face an emergency repair, just as you would have without AI.

The difference: you’re catching 50-70% of failures proactively, rather than 10-15%.

This provides learning data that refines the model.

The critical mindset shift: AI doesn’t eliminate risk – it manages it more effectively. Even when the model catches 70% of failures rather than all of them, you’re preventing hundreds of emergency callouts and millions in reactive repairs.

Build in validation protocols: When AI flags a pipe as high-risk, conduct a physical inspection or condition assessment to validate before making a major capital investment. This creates a feedback loop that continuously improves accuracy while protecting against costly mistakes.

Track and analyze misses: Review every failure the AI didn’t predict. Was it truly unpredictable (excavation damage, manufacturing defect), or did the model miss a pattern? This analysis drives model refinement.
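The miss review described above can start as a simple tally. The sketch below assumes each failure record carries a hypothetical `cause` field filled in by the field crew, and treats the two causes named above as unpredictable. The category names and sample data are illustrative.

```python
from collections import Counter

# Causes treated as unpredictable, following the examples given above.
UNPREDICTABLE_CAUSES = {"excavation damage", "manufacturing defect"}


def review_misses(failures, flagged_ids):
    """Tally failures as predicted, unpredictable misses, or possible model gaps."""
    summary = Counter()
    for f in failures:
        if f["segment_id"] in flagged_ids:
            summary["predicted"] += 1
        elif f["cause"] in UNPREDICTABLE_CAUSES:
            summary["missed_unpredictable"] += 1
        else:
            # Worth a closer look: the model may have missed a pattern.
            summary["missed_possible_model_gap"] += 1
    return summary


# Made-up failure log for illustration only.
failures = [
    {"segment_id": "P-001", "cause": "corrosion"},
    {"segment_id": "P-014", "cause": "excavation damage"},
    {"segment_id": "P-020", "cause": "corrosion"},
    {"segment_id": "P-033", "cause": "manufacturing defect"},
]
flagged_ids = {"P-001", "P-002"}

summary = review_misses(failures, flagged_ids)
```

Only the "possible model gap" group feeds model refinement. The unpredictable group is better handled through operational measures than through retraining.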

Stay tuned for Part 2 of this series on questions utilities have on evaluating AI to augment operations and planning.

Sources:
* VODA.ai experience
** “Failure Analysis and Machine Learning-Based Prediction in Urban Drinking Water Systems,” Appl. Sci. 2025, 15(24), 12887

This article is part of our Utility Voices series – where we share real stories, field-tested insights, and trusted perspectives from across the water sector. From frontline engineers to leading consultants, from early questions to proven outcomes, these are the voices shaping the future of water.


Theoktisti Makridou

Theoktisti is Lead Data Scientist at VODA.ai, where she specializes in building machine learning models that help water utilities better understand and predict infrastructure risk. Her work focuses on turning complex utility data into practical insights that support smarter maintenance and capital planning decisions. She holds a Master’s degree in Data and Web Science from Aristotle University of Thessaloniki and previously worked as a data scientist in both research and industry.

Jim Fitchett

Jim Fitchett is an Adjunct Instructor at Harvard University and Co-founder of VODA.ai. An entrepreneur and former CIO of Harvard Medical School, he brings decades of experience in digital transformation, AI, and innovation – both as a business leader and longtime faculty member.

Theoktisti Makridou, Lead Data Scientist VODA.ai

Mar 06, 2026
