Northern Indiana Public Service Co., a utility known as NIPSCO, has tens of thousands of miles of electrical lines and natural gas pipelines in its distribution networks. Many of the pipelines were installed in the early 1900s, and keeping a close eye on them is a top priority, according to Michael Hooper, senior vice president of major projects and electric field operations at NIPSCO. To help provide the view, it's installing sensors to continually report on pressure, flow and other metrics -- part of a big data and operational analytics initiative that NIPSCO is investing heavily in. "We really need to pull in more real-time data so we can monitor the pipelines and be proactive [in maintaining them]," Hooper said.
Collecting, storing and analyzing data from industrial sensors, network logs and machinery connected to the so-called Internet of Things has become more feasible because of the emergence of big data technologies, and an increasing number of organizations are joining marquee users such as General Electric in looking to take advantage of the available data. In a 2013 survey conducted by Enterprise Management Associates Inc. and 9sight Consulting, 38% of the nearly 600 big data projects being pursued by the 259 respondents involved machine-generated data, up from 24% in a similar survey a year earlier. In another survey conducted in 2013 by The Data Warehousing Institute, 47% of 188 respondents with big data management experience said machine data was part of their deployments.
NIPSCO already collects a lot of data from some points in its gas pipelines -- for example, the compressor stations that keep gas pressurized as it moves through the pipes. But Hooper, who took part in a panel discussion on big data trends at the 2014 Oracle Industry Connect conference in Boston, said in an interview afterward that the utility is in the infancy of the effort to place sensors along the pipelines. It's a big job: Fully outfitting the pipelines as well as the Merrillville, Indiana, company's electrical grid could take 10 to 15 years, he said.
That's partly because of the nature of the natural gas network. Adding new hardware to underground pipelines is an expensive proposition -- in many cases, the most logical approach is to wait until pipes are replaced, Hooper said. There are also data transmission issues to resolve, he added. Much of the piping runs through rural and mountainous areas that currently don't have Wi-Fi connectivity or cellular service.
And NIPSCO needs to build up its technology infrastructure to handle all the data that sensors can capture. "The physical amount of pings you're collecting is just enormous," Hooper said, noting that the utility eventually could gather multiple data points per minute from tens of thousands of sensors.
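To give a rough sense of the scale Hooper describes, the back-of-the-envelope sketch below multiplies it out. The sensor count, reading rate and record size are illustrative assumptions, not NIPSCO figures; the article says only "multiple data points per minute from tens of thousands of sensors."

```python
MINUTES_PER_DAY = 60 * 24


def readings_per_day(sensors: int, readings_per_minute: int) -> int:
    """Total readings collected across all sensors in one day."""
    return sensors * readings_per_minute * MINUTES_PER_DAY


def raw_bytes_per_day(sensors: int, readings_per_minute: int,
                      bytes_per_reading: int) -> int:
    """Uncompressed storage needed for one day of readings."""
    return readings_per_day(sensors, readings_per_minute) * bytes_per_reading


# Hypothetical inputs, chosen only to illustrate the arithmetic.
SENSORS = 20_000            # assumed; "tens of thousands" in the article
READINGS_PER_MINUTE = 2     # assumed; "multiple data points per minute"
BYTES_PER_READING = 16      # assumed size of a timestamp plus a value

daily_readings = readings_per_day(SENSORS, READINGS_PER_MINUTE)
daily_bytes = raw_bytes_per_day(SENSORS, READINGS_PER_MINUTE, BYTES_PER_READING)

print(f"{daily_readings:,} readings per day")
print(f"{daily_bytes / 1e9:.2f} GB per day before compression")
```

Under those assumptions, the total comes to tens of millions of readings a day, and the figure scales linearly with each input.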
Big business benefits in the pipeline
But the potential benefits are big as well. The sensor data could alert pipeline operators to abnormal pressure, flow or temperature conditions that might point to problems in the pipes. The operators could then use remote valve controls that are also being installed to shut down parts of the distribution network and reroute gas flows, enabling what Hooper described as "real-time management of the pipes." That could help avoid major disruptions in gas service and increase pipeline safety, he said.
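One simple way such an alert could work is to compare each new reading with the readings that came just before it. The sketch below is a minimal illustration of that idea, not a description of NIPSCO's system; the function name, window size, threshold and sample pressure values are all hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def find_abnormal(readings, window=5, threshold=3.0):
    """Return the indices of readings that sit more than `threshold`
    standard deviations from the mean of the preceding `window` readings."""
    recent = deque(maxlen=window)
    flagged = []
    for index, value in enumerate(readings):
        if len(recent) == window:
            baseline = mean(recent)
            spread = pstdev(recent)
            # Skip the check when the window is perfectly flat (spread == 0).
            if spread > 0 and abs(value - baseline) > threshold * spread:
                flagged.append(index)
        # Every reading, flagged or not, joins the trailing window.
        recent.append(value)
    return flagged


# Hypothetical pressure readings with one sudden drop at index 6.
pressure = [60.0, 60.2, 59.9, 60.1, 60.0, 60.1, 45.0, 60.0, 59.9]
print(find_abnormal(pressure))
```

A production system would likely need more than this, such as per-sensor baselines and handling for gradual drift, but the core comparison is the same.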
In addition, NIPSCO plans to analyze the accumulated sensor data in hopes of further improving pipeline operations. But another issue that Hooper and other project managers still need to address is how much of the incoming data to retain for analysis. "You never want to feel like you've disposed of a piece of data that might have made a difference in some kind of analysis," he said. "At the same time, there are limits on how much data you can physically store."
The amount of data that sensors and log files generate can be overwhelming for organizations to deal with, said William McKnight, president of McKnight Consulting Group. In addition to the sheer volume, machine-generated data often contains more "noise" than conventional transaction data does. "You're capturing data at such a fine grain," McKnight said, adding that one of the first priorities is filtering the information to figure out what's useful and what isn't.
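As one illustration of the kind of filtering McKnight describes, a "deadband" filter keeps a reading only when it differs meaningfully from the last value that was kept. The sketch below assumes readings arrive as (timestamp, value) pairs; the function name, tolerance and sample data are hypothetical.

```python
def deadband_filter(readings, tolerance):
    """Keep a (timestamp, value) reading only if its value differs from the
    last kept value by at least `tolerance`. The first reading is always kept."""
    kept = []
    last_value = None
    for timestamp, value in readings:
        if last_value is None or abs(value - last_value) >= tolerance:
            kept.append((timestamp, value))
            last_value = value
    return kept


# Hypothetical readings: small jitter around 60, then two real changes.
samples = [(0, 60.0), (1, 60.02), (2, 59.99), (3, 60.6), (4, 60.58), (5, 61.2)]
print(deadband_filter(samples, tolerance=0.5))
```

The trade-off is the one Hooper raises: a wider tolerance discards more noise but also more data that might have mattered in a later analysis.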
A variety of new technologies might be required, according to McKnight and other analysts. That could include Hadoop systems, NoSQL databases, in-memory data grids, real-time data integration tools and stream processing technologies to pull in data, do real-time analysis and store what's being kept for later use. For more detailed analysis downstream, predictive analytics, data mining and big data analytics tools all have possible roles to play.
Technology doesn't hold all the answers
But there's a lot more to the process of harnessing sensor data and log files than throwing technology at the problem, said Gartner Inc. analyst Nick Heudecker. A big data infrastructure designed to capture sensor data can enable organizations to base decisions on real- or near-real-time information -- but Heudecker said a company might have to revise its business processes in order to make effective operational analytics decisions.
John Myers, an analyst at Enterprise Management Associates, said sensor data and the log files produced by websites, computer systems and networks are relatively similar in nature. But he added that collecting data from sensors installed on jet engines, vehicles and equipment in the field can be much more complicated than gathering log data is, partly because of the connectivity and data transmission issues in remote areas that NIPSCO's Hooper mentioned.
For many organizations, though, the business benefits that can be gained by using sensor data to improve the operational efficiency of machinery make the challenges, headaches and required investments worthwhile. For example, jet fuel is the biggest expense in operating airplanes. "If you can shave 1% or 2% off of fuel costs by being able to operate an engine more efficiently, that's a huge savings," Myers said.
Preventive maintenance on industrial equipment is one of the core uses for sensor-driven big data analytics across various industries, Gartner's Heudecker said. He also cited a range of potential industry-specific uses in sectors such as transportation, energy, natural resources management, telecommunications, health care and agriculture -- "anything that's resource-intensive." At a transportation company, for example, "even if you can get drivers to shift more efficiently, you can save a ton of money," Heudecker said.