Many internet-connected devices send metrics to the cloud, where analytics software extracts meaningful data for trend analysis, display and alerting. Time-varying data must be sampled and reported more frequently, which increases network traffic and cloud storage. As IoT deployments increase in scale, this becomes more costly and less practical. For example, an IoT deployment with 10 million devices each reporting 2 kilobytes of data every 30 seconds would generate nearly 60 terabytes of data per day and a database load of over 600,000 IOPS.
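The figures in this example can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 kilobyte = 1,000 bytes, 1 terabyte = 10^12 bytes); the function names are illustrative only. It yields about 57.6 terabytes per day and roughly 333,000 reports per second. The IOPS figure quoted above would follow if each report cost about two database operations, which is an assumption here rather than something stated in the example.

```python
# Back-of-envelope check of the reporting load in the example above.
# Assumes decimal units: 1 KB = 1,000 bytes, 1 TB = 10**12 bytes.
DEVICES = 10_000_000
REPORT_BYTES = 2_000
REPORT_INTERVAL_S = 30
SECONDS_PER_DAY = 86_400


def daily_volume_tb(devices, report_bytes, interval_s):
    """Total data reported per day, in terabytes."""
    reports_per_device = SECONDS_PER_DAY / interval_s
    return devices * report_bytes * reports_per_device / 1e12


def reports_per_second(devices, interval_s):
    """Average number of reports arriving at the cloud each second."""
    return devices / interval_s


print(daily_volume_tb(DEVICES, REPORT_BYTES, REPORT_INTERVAL_S))
print(reports_per_second(DEVICES, REPORT_INTERVAL_S))
```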
The concept behind embedded device analytics is quite simple: split analytics functionality into a part that analyzes only the data provided by a single device and a part that takes a global view across devices, and embed the first of these directly into the device. This has numerous benefits:
- Time-varying data can be sampled frequently without requiring a high volume of reports to be sent to the cloud, reducing network load and storage requirements
- A set of related metrics can be analyzed together in real time, which requires very little computational load for a single device but enables more sophisticated metrics to be sent to the cloud
- These more sophisticated analytics can be used within the device to provide meaningful local feedback, which makes devices more intelligent
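As a minimal sketch of the first benefit, the class below accumulates frequent samples on the device and emits one compact summary per reporting window. The names `LocalAggregator` and `WindowSummary` are hypothetical, and the choice of count, minimum, maximum and mean as the summary is an assumption made for illustration; a real device would pick metrics suited to its data.

```python
from dataclasses import dataclass


@dataclass
class WindowSummary:
    count: int
    minimum: float
    maximum: float
    mean: float


class LocalAggregator:
    """Accumulates frequent samples on the device; emits one summary per window."""

    def __init__(self, samples_per_report):
        self.samples_per_report = samples_per_report
        self._reset()

    def _reset(self):
        self._count = 0
        self._sum = 0.0
        self._min = float("inf")
        self._max = float("-inf")

    def add(self, value):
        """Record a sample; return a WindowSummary when the window is full, else None."""
        self._count += 1
        self._sum += value
        self._min = min(self._min, value)
        self._max = max(self._max, value)
        if self._count < self.samples_per_report:
            return None
        summary = WindowSummary(
            self._count, self._min, self._max, self._sum / self._count
        )
        self._reset()
        return summary
```

With `samples_per_report=60` and one sample per second, the device would send a single four-field summary each minute instead of 60 raw readings.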
There are a number of real-world examples of embedded device analytics at work.
Smart AMI electricity meters sample usage data every five minutes but report every hour, and also support “last gasp” outage reporting. An electrical utility with 2.5 million customers collects about 4 terabytes of usage data per month. However, if we consider reported outages, only about 1 gigabyte of this data is actually used.
Voice over IP devices use embedded agents, such as VQmon, that apply multistate Markov models to learn the distribution of lost and discarded IP packets. Sophisticated analytics then correlate this distribution with models of the codec and playout buffer in order to report accurate quality of experience scores. A report sent at the end of each VoIP call distills the entire call into a set of metrics that reflect user experience and everything affecting it, enabling large deployments of VoIP devices to be managed cost-effectively. This VoIP embedded device analytics model is deployed in over 500 million IP phones, residential gateways and other devices.
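To give a flavor of how a Markov model can compress a call's packet-loss history, the sketch below fits a simple two-state chain (packet received vs. packet lost) to a loss sequence. This is a deliberately simplified stand-in and not a description of VQmon's actual algorithm, which uses multistate models plus codec and playout buffer modeling; the function name and the returned field names are hypothetical.

```python
def fit_two_state_loss_model(lost):
    """Estimate a two-state Markov loss model from a sequence of flags.

    `lost` is a sequence of booleans, True meaning the packet was lost
    or discarded. Returns the two transition probabilities, the overall
    loss rate and the mean loss-burst length implied by the model.
    """
    ok_total = ok_to_loss = 0
    loss_total = loss_to_ok = 0
    for prev, cur in zip(lost, lost[1:]):
        if prev:
            loss_total += 1
            if not cur:
                loss_to_ok += 1
        else:
            ok_total += 1
            if cur:
                ok_to_loss += 1

    p = ok_to_loss / ok_total if ok_total else 0.0      # P(loss | previous ok)
    q = loss_to_ok / loss_total if loss_total else 0.0  # P(ok | previous loss)
    return {
        "p_loss_after_ok": p,
        "p_ok_after_loss": q,
        "loss_rate": sum(1 for x in lost if x) / len(lost) if lost else 0.0,
        # Mean burst length of a two-state chain is 1/q; undefined if q is 0.
        "mean_burst_length": 1.0 / q if q > 0 else None,
    }
```

However long the call, the summary stays the same size: a handful of numbers that still distinguish bursty loss from evenly scattered loss, which a plain loss percentage cannot do.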
The trend is to increase both the scale of deployment and the frequency with which data is sent. For example, smart meters already sample usage every five minutes, but there are proposals to reduce this to six seconds, the rationale being that this would allow usage to be tracked at the level of individual appliances. Reducing the sampling interval from five minutes to six seconds would increase the amount of data stored by a factor of 50. If, however, a Markov model (as used in VoIP analytics) were used to track usage, then usage could be sampled every second; the resulting metrics could be sent less frequently and would be smaller, yet would contain much more detail on usage over time.
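The factor of 50 is simply 300 seconds divided by 6 seconds. The Markov alternative can be sketched as follows, under stated assumptions: per-second power readings are quantized into a few usage states, and the device keeps only a transition-count matrix and the time spent in each state. The thresholds, state count and function names are hypothetical and chosen purely for illustration.

```python
from bisect import bisect_right

# Hypothetical power thresholds in watts that separate usage states
# (for example idle / low / medium / high). Real values would be tuned
# per deployment; these are illustrative assumptions.
THRESHOLDS_W = [100, 500, 2000]


def usage_state(watts):
    """Map a power reading to a state index in 0..len(THRESHOLDS_W)."""
    return bisect_right(THRESHOLDS_W, watts)


def summarize_usage(readings_w):
    """Collapse per-second readings into transition counts and dwell times."""
    n = len(THRESHOLDS_W) + 1
    transitions = [[0] * n for _ in range(n)]  # transitions[from][to]
    seconds_in_state = [0] * n
    prev = None
    for watts in readings_w:
        state = usage_state(watts)
        seconds_in_state[state] += 1
        if prev is not None:
            transitions[prev][state] += 1
        prev = state
    return transitions, seconds_in_state
```

With four states, an hour of per-second readings (3,600 values) collapses into 16 transition counts and 4 dwell totals, while still recording how often usage switched between levels and how long it stayed at each.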
Currently the world stores about 2,500,000 terabytes of data per day, which equates to roughly 300 megabytes of data per person per day; we are storing much more data than we can possibly comprehend. While IoT represents a small proportion of this today, growth in IoT and the desire for more detailed metrics will soon make IoT a major contributor. The use of embedded device analytics can help to reduce data volume, improve the quality and resolution of reported data and economize on storage.