Wave interference is what happens when two waves meet: the resulting displacement, or superposition, is the combined net effect of both waves. In many ways, IoT data and analytics reflects the superposition of IoT and big data.
IoT is a continuously evolving concept, and some definitions include IoT data and analytics as part of it. Fundamentally, though, the Internet of Things is the network of physical objects, or things, that digitize information about their environment and exchange that data across the existing Internet infrastructure. Big data has likewise attracted various definitions; one of the more commonly applied is McKinsey's, that big data are “datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyse.”1
These two waves, IoT and big data, have started to meet, and their combination has had a significant multiplier effect. The number of connected devices continues to grow, and that growth is accelerating with the demand for more and more data. The value of data has also started to rise, as real-time data sources and data aggregation yield more and more insights.
The combination of IoT and big data has placed its own demands on enabling technologies. Fast data has generated new requirements for data ingestion and in-stream processing. Big data has placed new requirements on data storage and on how schema and queries are managed. Let’s examine these in slightly more detail.
Fast data becomes an important game changer
Big data is an important factor in IoT data and analytics. However, the more fundamental and significant change in data management and analytics has been driven by the speed with which data is now processed and fed back into action in near real time. Where traditional batch processing used historical analytics to drive insight over periods of days and weeks, fast data is about real-time ingestion and in-stream processing, delivering actionable feedback within seconds or milliseconds. Examples of database providers able to meet these requirements are Exasol, SAP HANA, SQream and VoltDB. Fast data does away with the fairly traditional extract-transform-load (ETL) approach and has pushed the analysis of data from a back-end business intelligence activity to a critical front-end “application plus” feature. Here, “application plus” refers to the expected outcomes of such applications, for example predictive maintenance or prescriptive decisions for medication routines, both of which involve a degree of machine learning or advanced analytics.
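To make the contrast with batch processing concrete, here is a minimal Python sketch of in-stream processing. It is an illustration under stated assumptions rather than how any of the named products work: the function name rolling_alerts, the five-second window, the threshold of 80 and the sample temperature stream are all hypothetical. Each reading is evaluated the moment it arrives, and only the readings inside the time window are held in memory.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple


def rolling_alerts(
    readings: Iterable[Tuple[float, float]],
    window_seconds: float = 10.0,
    threshold: float = 80.0,
) -> Iterator[Tuple[float, float]]:
    """Yield (timestamp, rolling_mean) whenever the rolling mean exceeds threshold.

    Each reading is a (timestamp_seconds, value) pair, assumed to arrive in
    time order. Only readings inside the window are kept in memory.
    """
    window = deque()
    total = 0.0
    for ts, value in readings:
        window.append((ts, value))
        total += value
        # Evict readings that have fallen out of the time window.
        while window and window[0][0] <= ts - window_seconds:
            _, old_value = window.popleft()
            total -= old_value
        mean = total / len(window)
        if mean > threshold:
            yield ts, mean


# Hypothetical temperature stream: one reading per second, rising steadily.
stream = [(float(t), 70.0 + 2.0 * t) for t in range(12)]
alerts = list(rolling_alerts(stream, window_seconds=5.0, threshold=80.0))
```

Because the function is a generator, an alert is available as soon as the triggering reading has been consumed, instead of after a nightly batch job has run over the full history.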
Big data is the challenge on the other side of the coin
Big data is not a new phenomenon, but it has become an increasing challenge for many enterprises, and enabling technologies such as Hadoop are really what have driven this new opportunity space. With Hadoop, or more specifically HDFS for distributed file storage and MapReduce for distributed processing, enterprises were finally able to scale their data storage in and out in a more flexible and cost-efficient manner than the traditional “more data, one more server” approach allowed. Examples of Hadoop distribution providers here would include Cloudera, Hortonworks and MapR.
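The MapReduce idea can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle and reduce steps, not Hadoop's actual API; the function names, device identifiers and readings are hypothetical. In a real cluster each chunk would live on a different node and the map phase would run where the data is stored.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Tuple[str, float]  # (device_id, reading)


def map_phase(chunk: Iterable[Record]) -> Iterator[Tuple[str, float]]:
    """Emit one (key, value) pair per record in a chunk of the data set."""
    for device_id, reading in chunk:
        yield device_id, reading


def shuffle(pairs: Iterable[Tuple[str, float]]) -> Dict[str, List[float]]:
    """Group all emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[float]) -> Tuple[str, float]:
    """Collapse all values for one key into a single result (here, the maximum)."""
    return key, max(values)


# Two hypothetical chunks, standing in for blocks stored on different nodes.
chunks = [
    [("pump-1", 3.2), ("pump-2", 4.1)],
    [("pump-1", 5.0), ("pump-3", 1.7)],
]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
max_by_device = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

Adding capacity then means adding more chunks and more workers running the same map and reduce functions, which is what makes the model scale out.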
Big data has also been about the variety of data and not just its volume. Here, NoSQL databases2 or new hybrid databases have pushed boundaries, creating schema on the fly or on read, and dispensing with the more cumbersome and limiting fixed-schema approach of the traditional RDBMS. As the number of connected devices continues to grow, the richness and variety of data sources will continue to expand, and beyond highly structured data, enterprises will need to work with semi-structured as well as completely unstructured data to gain the additional value from data aggregation.
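Schema on read can be illustrated with a short Python sketch. The records, device names and the helper read_with_schema are hypothetical, and the code does not represent any particular NoSQL product. The point is that the raw documents are stored as they arrive, and each query decides which fields it cares about at read time.

```python
import json
from typing import Any, Dict, List, Optional

# Raw records from three hypothetical devices, stored exactly as received.
# No table definition was needed before writing them.
raw_records = [
    '{"device": "thermostat-7", "temp_c": 21.5}',
    '{"device": "tracker-2", "lat": 51.5, "lon": -0.12, "battery": 0.83}',
    '{"device": "camera-4", "note": "motion detected near door"}',
]


def read_with_schema(
    lines: List[str], fields: List[str]
) -> List[Dict[str, Optional[Any]]]:
    """Apply a schema at read time: project each document onto the requested
    fields, filling in None where a device never reported that field."""
    rows = []
    for line in lines:
        doc = json.loads(line)
        rows.append({field: doc.get(field) for field in fields})
    return rows


# Two different "schemas" applied to the same stored data.
temperature_view = read_with_schema(raw_records, ["device", "temp_c"])
location_view = read_with_schema(raw_records, ["device", "lat", "lon"])
```

A new device type with new fields can start sending data without any change to the stored records or to existing queries, which is the flexibility a fixed schema makes difficult.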
The value engine in IoT data and analytics
The creation of value comes from all the components in an end-to-end IoT application. Devices contribute to the value. Connectivity contributes to the value. Applications certainly make a major contribution, as do the data and the analytics. What is interesting to consider, as illustrated by the two-wave model at the start of the article, is the net effect of combining IoT and big data: a superposition or multiplier effect that is greater than the sum of its parts.
Data is a reusable commodity. Value may initially be unlocked from a single data point in real time, but aggregating single data points, both real-time and historical, will also yield additional, valuable insights that were previously unidentified.
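A small Python sketch shows how the same data point can yield different insights alone and in aggregate. The vibration readings, the alarm limit and the variable names are invented for illustration only. Judged alone, the latest reading is unremarkable; combined with the historical readings, it reveals a steady upward drift.

```python
from statistics import mean

ALARM_LIMIT = 8.0  # hypothetical limit for a single vibration reading

history = [4.0, 4.1, 4.3, 4.6, 5.0, 5.5]  # hypothetical historical readings
latest = 6.1                              # the newest real-time reading

# Value from the single data point: is this reading over the limit right now?
single_point_alarm = latest > ALARM_LIMIT

# Value from aggregation: how has the reading moved across the whole series?
series = history + [latest]
steps = [b - a for a, b in zip(series, series[1:])]
average_step = mean(steps)
drifting_upward = all(step > 0 for step in steps)
```

Here single_point_alarm is False while drifting_upward is True, so the trend is visible only once the real-time reading is combined with its history.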
1 McKinsey Global Institute, “Big data: The next frontier for innovation, competition, and productivity,” May 2011
2 For some more information about NoSQL databases, read the Machina Research Research Note, “Why NoSQL are needed for the Internet of Things,” April 2014
All IoT Agenda network contributors are responsible for the content and accuracy of their posts. Opinions are of the writers and do not necessarily convey the thoughts of IoT Agenda.