Before optimizing industrial equipment with AI, optimize your data

Manifold

These days, everything is “smart,” from the IoT toaster to internet-connected toilet paper dispensers. While uses of these devices are limited, their existence points to the increasing availability of resources that enable more important pursuits. Sensor, communication, storage and computing costs are rapidly decreasing, meaning it’s now possible to collect vast amounts of data from sensors attached to expensive equipment like oil and gas rigs, earth-moving tools and factory machinery.

It’s cheap and easy to collect a lot of data, but getting value out of that data is the challenge. In order to use this data to optimize physical assets, machine learning is essential. A recent client engagement of ours demonstrates why.

The client designs, manufactures and leases industrial equipment. The company wanted to monitor its products remotely and alert teams to units at risk of failure before they fail, thereby reducing downtime. We helped our client build a prototype machine learning system to address this problem.

We began with understanding, the first step of our lean AI development process. Understanding has two big components: business understanding and data understanding. On the business front, we quickly learned that reducing downtime and turning unplanned maintenance into planned maintenance were major business drivers for the client. On the data-understanding front, we audited three years of client data.

Going into the project, we thought that we had all the data we needed to build the predictive system. But we discovered that the root cause of many failures was not documented consistently. Because that data was not clean enough, we decided not to use it.
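As an illustration of this kind of audit, here is a minimal sketch that measures how often a root-cause field is missing and how many spellings collapse into the same value once normalized. The record layout, the `root_cause` field name and the sample values are hypothetical, not taken from the client's data.

```python
from collections import Counter

# Hypothetical failure log; field names and values are illustrative only.
failure_log = [
    {"unit_id": "U1", "root_cause": "Engine"},
    {"unit_id": "U2", "root_cause": "engine failure"},
    {"unit_id": "U3", "root_cause": ""},
    {"unit_id": "U4", "root_cause": None},
    {"unit_id": "U5", "root_cause": "ENGINE "},
]


def audit_labels(records, field):
    """Report the share of missing labels and the counts of normalized labels."""
    raw = [r.get(field) for r in records]
    # Treat None, empty strings and whitespace-only strings as missing.
    present = [v for v in raw if v is not None and v.strip()]
    missing = len(raw) - len(present)
    # Normalize case and surrounding whitespace to expose inconsistent spellings.
    normalized = Counter(v.strip().lower() for v in present)
    return {
        "missing_fraction": missing / len(records),
        "normalized_counts": normalized,
    }


report = audit_labels(failure_log, "root_cause")
print(report["missing_fraction"])
print(report["normalized_counts"].most_common())
```

A high missing fraction, or many near-duplicate labels, is a signal that the field may not be reliable enough to train on without cleanup.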

So, we pivoted from trying to predict high-cost failures, like engine breakdowns, to trying to predict longer outages. Although this had slightly less business value, it was solvable and could still improve our client’s operational costs. Taking time to thoroughly understand the data before you begin using it can save you time and spare you wasted effort.

Once we understood what data we had to work with, and the shape it was in, we were ready to move on to the next step in the lean AI process: engineering.

In this project, engineering was largely about extracting, transforming and loading data to make it useful for machine learning. As always, we took a production-first mindset and created a scalable data pipeline that merged several data sources, including real-time streaming sensor data, machine metadata from the ERP and weather data.
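The merge step can be sketched in a few lines. This is a plain-Python illustration under stated assumptions, not the pipeline we built, which ran on big data tools: the field names, the join on unit ID and the hourly weather lookup by site are all hypothetical.

```python
from datetime import datetime

# Hypothetical sample records; every field name here is an assumption.
sensor_readings = [
    {"unit_id": "U1", "ts": datetime(2018, 5, 1, 10, 15), "pressure": 101.2},
    {"unit_id": "U2", "ts": datetime(2018, 5, 1, 10, 40), "pressure": 98.7},
    {"unit_id": "U3", "ts": datetime(2018, 5, 1, 11, 5), "pressure": 99.9},
]
machine_metadata = {
    "U1": {"model": "A100", "site": "north"},
    "U2": {"model": "B200", "site": "south"},
}
weather = {
    ("north", datetime(2018, 5, 1, 10)): {"ambient_c": 18.0},
    ("south", datetime(2018, 5, 1, 10)): {"ambient_c": 27.5},
}


def merge_sources(readings, metadata, weather_by_site_hour):
    """Join each sensor reading with machine metadata and hourly site weather."""
    merged = []
    for reading in readings:
        meta = metadata.get(reading["unit_id"])
        if meta is None:
            # No metadata for this unit: skip it rather than guess.
            continue
        # Truncate the reading timestamp to the hour to match the weather key.
        hour = reading["ts"].replace(minute=0, second=0, microsecond=0)
        wx = weather_by_site_hour.get((meta["site"], hour), {})
        merged.append({**reading, **meta, "ambient_c": wx.get("ambient_c")})
    return merged


rows = merge_sources(sensor_readings, machine_metadata, weather)
for row in rows:
    print(row["unit_id"], row["model"], row["pressure"], row["ambient_c"])
```

The same join logic carries over to a distributed framework; only the execution engine changes when the data no longer fits on one machine.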

The pipeline was developed using a combination of tools, from Apache Spark to Dask to HDF5 files. The use of big data tools was necessary because of the volume of data being processed — both at training and inference time.

We then moved on to the third step of lean AI: modeling. We built a simple baseline random forest model using features hand-engineered from the raw data. Feature engineering incorporates domain expertise into inputs that feed an algorithmic model.

We needed to create appropriate inputs for the time series forecasting problem. We worked alongside the company’s mechanics and engineers to identify features such as pressure and temperature ranges. Once we had our baseline model working, we added features and tried more complex modeling techniques like 1D convolutional neural networks. We found that a random forest with appropriately tuned features outperformed more complex models.
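To make the idea of a range feature concrete, here is a minimal sketch that turns a raw sensor series into trailing-window features. The function name, the window length and the sample pressure values are illustrative assumptions; the actual features were chosen with the client's mechanics and engineers.

```python
from statistics import mean


def window_features(values, window):
    """Compute the range and mean over each full trailing window of readings."""
    features = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        features.append({
            "range": max(chunk) - min(chunk),  # spread of readings in the window
            "mean": mean(chunk),               # average level in the window
        })
    return features


# Hypothetical pressure readings for one unit, oldest first.
pressure = [100.0, 102.0, 101.0, 110.0, 95.0]
feats = window_features(pressure, window=3)
for f in feats:
    print(f)
```

Each resulting row could then be paired with an outcome label and fed to a tree-based model such as a random forest.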

In the final part of the prototyping process, we focused on the fourth step of lean AI: user feedback. Working with our client’s software engineering team, we prototyped a simple spreadsheet tool that mechanical engineers could use to consume the daily failure predictions. Then, we held a number of working sessions with mechanical engineers. We discovered many interesting nuances to the problem that led us to adapt the model and post-process the raw predictions into a more useful state. For example, we found that many units were being operated outside of the recommended operating range, so they were more likely to fail. Showing those at the top of the list was not particularly informative; rather, we found that looking at changes in the baseline failure rate for units was more interesting. We adapted the spreadsheet to highlight significant daily changes instead of absolute failure rates.
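The post-processing idea can be sketched as follows: rather than sorting units by predicted failure probability, compare each unit's prediction with its own baseline and surface only meaningful shifts. The function name, the 0.05 threshold and the sample probabilities are hypothetical, chosen only to illustrate the approach.

```python
def rank_by_change(today, baseline, min_change=0.05):
    """Return (unit_id, change) pairs for significant shifts, largest increase first."""
    changes = []
    for unit_id, probability in today.items():
        if unit_id not in baseline:
            continue  # no history to compare against
        delta = probability - baseline[unit_id]
        if abs(delta) >= min_change:
            changes.append((unit_id, round(delta, 4)))
    return sorted(changes, key=lambda pair: pair[1], reverse=True)


# Hypothetical predicted failure probabilities per unit.
baseline = {"U1": 0.60, "U2": 0.10, "U3": 0.20}
today = {"U1": 0.62, "U2": 0.35, "U3": 0.05}

ranked = rank_by_change(today, baseline)
for unit_id, change in ranked:
    print(unit_id, change)
```

In this example U1 has the highest absolute risk but is not flagged, because its prediction has barely moved; U2 rises to the top because its risk has jumped relative to its own history.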

Our client ended up with a working prototype that gave daily failure predictions, and with the in-house capability to continue developing it.

Although IoT systems can quickly arm you with massive quantities of data, it’s important to remember that there must be a method to the madness. There is value to be had from sensor data, but it’s hard to extract if you shoot from the hip; you’ll waste effort on false starts. By understanding the business problem and your data upfront, and ensuring that they’re well-matched, you’ll save a lot of backtracking. Then, follow a methodical process such as lean AI to get to value quickly.

All IoT Agenda network contributors are responsible for the content and accuracy of their posts. Opinions are of the writers and do not necessarily convey the thoughts of IoT Agenda.