In the 1950s, Arthur Samuel, an IBM employee, wanted to teach a computer to play checkers. He wrote the original program on IBM’s first commercial computer, but he kept winning. Next, he wrote a program that let the computer play against itself. The program collected data and built a predictive engine to optimize its play, and then Mr. Samuel started losing.
Artificial Intelligence can be defined as the design of an intelligent “agent” that perceives its environment and makes decisions to maximize its chances of achieving its goal. It has been applied most readily to natural language processing, Machine Learning, robotics, and similar fields. Machine Learning, a category of Artificial Intelligence, gives computers the ability to learn without being explicitly “programmed”. Machine Learning includes Supervised Learning, which covers classification and regression; Unsupervised Learning, which covers clustering and dimensionality reduction; and Reinforcement Learning, which focuses on reward maximization.
Artificial Intelligence, specifically Machine Learning, is a modeling tool that leverages critical components such as advanced computing power, computational heuristics, and mathematical intuition to analyze non-linear and oftentimes dynamic data sets. In traditional data analysis, when predicting a result from an input, one of the foremost methods is linear regression analysis, that is, creating a ‘line of best fit.’ Given the rising amount of data that is created and available today, detecting patterns and correlations within the data is often not as easy as finding a single line of best fit. Sometimes the principal drivers can only be detected by testing many different conditions and several combinations of disparate variables. These are problems that go beyond what traditional data analysis can provide. However, with enough of the right data and computational power, Machine Learning can uncover many of these complex patterns for our use.
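To make the ‘line of best fit’ idea concrete, the following is a minimal sketch of ordinary least squares in plain Python. The data series, function names, and variable names are all illustrative assumptions rather than anything taken from a utility data set. It fits a straight line to a linear series and to a U-shaped series, then reports how much of the variation each line explains.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    y_bar = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical data: a linear trend and a U-shaped (non-linear) pattern.
xs = [-3, -2, -1, 0, 1, 2, 3]
linear_ys = [2 * x + 1 for x in xs]
curved_ys = [x * x for x in xs]

lin_slope, lin_intercept = fit_line(xs, linear_ys)
cur_slope, cur_intercept = fit_line(xs, curved_ys)

linear_r2 = r_squared(xs, linear_ys, lin_slope, lin_intercept)
curved_r2 = r_squared(xs, curved_ys, cur_slope, cur_intercept)
print(f"linear series R^2: {linear_r2:.2f}")
print(f"curved series R^2: {curved_r2:.2f}")
```

In this toy case the straight line explains the linear series completely but none of the variation in the U-shaped one, even though that pattern is perfectly regular. This is the kind of structure a single line of best fit cannot capture.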
Machine Learning uses automated, iterative models to learn patterns in big data, detect anomalies, and identify structure that may be new and previously unknown. Machine Learning is not statistics and is not rules based. Rules are based on what a human knows about the data. When Machine Learning is combined with statistical analysis, it identifies relationships that may otherwise have gone undetected. In making use of very large data sets, Machine Learning can surpass both human capability and conventional software engineering.
Today, just as one would choose a doctor who uses Machine Learning (for disease identification and risk stratification, drawing on every medical paper ever written to diagnose a cancer) over a recent medical school graduate, utilities are turning to asset management consultants who apply Machine Learning in their asset management planning. Machine Learning helps determine the current state of buried linear assets and provides an enhanced asset management tool to address one of the water industry’s most pressing concerns: finding an optimal balance between predictive accuracy, performance, and cost in an accelerated time frame.
Source: US EPA Asset Management: A Best Practice Guide.
Machine Learning models need a large amount of historical and geospatial data. Water main condition assessment data, with its years of historical records, contains all the necessary components for Machine Learning in water utilities. Analyzing this data consistently can uncover trends, provide insight into pipeline health, and support data-driven assessments.
Two companies in the water industry, Fracta and Esri, have partnered to offer a fast, affordable, and accurate Machine Learning model to predict water main Likelihood of Failure (LoF). The results are being used by utilities across the United States, and soon Japan, to target leak detection and valve maintenance efforts, focus preventative maintenance crews, validate capital replacement plans, and align master planning efforts. With 90% of water assets being location based and most water main pipe data stored in GIS files, the intersection of GIS and Machine Learning was inevitable for both analysis and visualization.
Every utility has problems with its water pipe data to varying degrees. Data cleanup is essential for any degree of accuracy when making financial investment decisions. Data acquisition, assessment, and cleaning (also known as pre-processing or data wrangling) make up roughly 60 to 80 percent of the work in any Machine Learning process, with the remainder being the Machine Learning itself. Once the data is assessed, cleaned, and imputed where needed, it is ready to be fed into a Machine Learning model, which is then ‘trained’ to learn the patterns that predict breakage events.
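As a rough illustration of the cleaning and imputation step, the sketch below normalizes material labels, discards implausible install years, and fills missing years with the median for the same material. The record layout, field names, year bounds, and the choice of median imputation are all assumptions made for the example. They are not a description of any vendor’s actual pipeline.

```python
from statistics import median

# Hypothetical pipe records; field names and values are illustrative only.
raw_pipes = [
    {"id": "P1", "material": "Cast Iron ", "install_year": 1952},
    {"id": "P2", "material": "cast iron", "install_year": None},
    {"id": "P3", "material": "CAST IRON", "install_year": 1960},
    {"id": "P4", "material": "pvc", "install_year": 1985},
    {"id": "P5", "material": "PVC", "install_year": 9999},  # placeholder value
    {"id": "P6", "material": "PVC", "install_year": 1991},
]

def clean(pipe, min_year=1850, max_year=2025):
    """Normalize the material label and blank out implausible install years."""
    year = pipe["install_year"]
    if year is not None and not (min_year <= year <= max_year):
        year = None
    return {
        "id": pipe["id"],
        "material": pipe["material"].strip().lower(),
        "install_year": year,
    }

def impute_install_year(pipes):
    """Fill missing install years with the median year of the same material."""
    known = {}
    for p in pipes:
        if p["install_year"] is not None:
            known.setdefault(p["material"], []).append(p["install_year"])
    filled = []
    for p in pipes:
        year = p["install_year"]
        imputed = year is None and p["material"] in known
        if imputed:
            year = int(median(known[p["material"]]))
        # Keep a flag so downstream users can tell measured from imputed values.
        filled.append({**p, "install_year": year, "imputed": imputed})
    return filled

cleaned = impute_install_year([clean(p) for p in raw_pipes])
for row in cleaned:
    print(row)
```

Keeping the `imputed` flag on each record is a deliberate choice: it lets an analyst see which values were measured and which were filled in before the data reaches a model.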
An ever-increasing amount of data strengthens the predictive power of a Machine Learning algorithm, which benefits utilities with large amounts of historical breakage and asset information. Machine Learning can also benefit utilities with limited asset or breakage data, because it can “fill in the gaps.” If a utility has little breakage data, predictions of future breaks can be informed by patterns found for pipes with similar materials, install years, soil compositions, etc. If a utility has little asset data, a similar process can be applied by using ancillary geospatial data to estimate the probability of pipe breakage events. Thus, because Machine Learning draws on many streams of data to make its predictions, it learns patterns that can inform situations where many of the usual data points are not available.
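One simple way to picture “filling in the gaps” is a shrinkage estimate, in which a utility’s own sparse break rate is blended with a pooled rate observed for similar pipes elsewhere. The sketch below is only an illustration of that idea. The pooled rates, the cohort keys, the `prior_weight` value, and the function name are hypothetical, and a production Machine Learning model would learn such relationships from many more variables.

```python
def blended_break_rate(local_breaks, local_miles_years, pooled_rate, prior_weight=50.0):
    """Shrink a sparse local break rate toward a pooled rate for similar pipes.

    local_miles_years is the exposure (pipe miles times years observed).
    prior_weight is a hypothetical tuning value: the exposure at which local
    and pooled evidence count equally.
    """
    return (local_breaks + prior_weight * pooled_rate) / (local_miles_years + prior_weight)

# Hypothetical pooled rates (breaks per mile per year) by (material, install decade).
pooled = {("cast iron", 1950): 0.30, ("pvc", 1980): 0.05}
cohort = ("cast iron", 1950)

# A utility with almost no history for this cohort leans on the pooled rate...
sparse = blended_break_rate(local_breaks=0, local_miles_years=2.0,
                            pooled_rate=pooled[cohort])

# ...while one with a long record is dominated by its own data.
rich = blended_break_rate(local_breaks=100, local_miles_years=1000.0,
                          pooled_rate=pooled[cohort])

print(f"sparse-history estimate: {sparse:.3f}")
print(f"rich-history estimate:   {rich:.3f}")
```

With little local exposure the estimate stays close to the pooled rate for the cohort, and as local history accumulates the utility’s own record takes over.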
Virginia Tech’s Sustainable Water Infrastructure Management program (SWIM) has a national water pipe database project called PipeID, funded by the Bureau of Reclamation. Every US water utility, regardless of size, is encouraged to participate. A recommended best practice is to have Fracta clean the pipe data and provide a water pipe condition assessment using Machine Learning before the data is submitted to the national PipeID database. Fracta is able to complete its work within 4 to 8 weeks, providing valuable insight back to the utility for immediate action, while the PipeID project is expected to have results available in 2020.
The more data a model is trained on, the more robust the model. Utilities constantly collect data over time, such as new breaks and newly installed pipes, and that data can continually be fed into a Machine Learning model. New pipe data strengthens the model’s predictive power. Machine Learning can also benefit utilities with limited asset or breakage data by “filling in the gaps.” It can draw on hundreds of streams of data (climate, environmental, soil, etc.) to make its predictions, and it learns patterns that can inform situations where many of the usual data points are not available, creating a new digital revolution in advanced asset management practices.
Machine Learning is a major trend and is poised to make a significant impact on underground water infrastructure asset management. Machine Learning drives optimization not only in performance but also in business processes and planning. In the water utility industry, because of the multitude of data and variables involved, water main condition assessment (CA) is an ideal use case for this technology. For many municipal water utilities, water main condition assessment is a low-risk way to test drive Machine Learning within their organizations and, in the process, save millions of dollars otherwise spent on replacing pipes that are in no critical or immediate need of replacement and on repairing costly breaks that a proactive plan could have prevented.
Incorporating a Machine Learning condition assessment tool into a buried water main asset management program will help reduce the economic impact of water main breaks and allow water utilities to allocate capital more efficiently. Use of best practices and a more accurate, objective tool will align maintenance with capital repair and replacement strategies to make better use of scarce financial and human resources. These practices also bring financial integrity to the planning process and refine the investment strategy, so a utility will be in a better position to defend its planning efforts and fund needed capital pipe replacement projects.