What type of IoT data are we going to keep?

Let’s start by answering this question with a typical IT answer: It depends.

There are many different devices that collect different types of data. In my first post I talked about electricity data, and in my second post I discussed smart fridges and health trackers. Many more examples already exist and are becoming increasingly mainstream, and new types of devices that gather IoT data will keep appearing.

In an ideal world, every little piece of IoT data that is collected would be kept for many years. Availability solutions would be a crucial part of this strategy, and whenever needed, you would have data from a very long time frame at hand.

Unfortunately, we don’t live in an ideal world, and we don’t want to keep all the data for a very simple reason: cost. Holding all that data on disk would cost a fortune. Even if you used tape storage, it would still become rather expensive after a while.

Let’s illustrate this with a simple example and look at the data that is gathered. Let’s take a health tracker that gathers the following data:

  • Heartbeat
  • Number of steps
  • Run exercises + GPS map of the run
  • Sleep pattern

I know that most health trackers gather much more information, but let’s keep it simple for now. Obviously, you want the end user to have an overview of all the statistics during the first week, maybe averaged per hour, along with full details of each exercise. For the next three weeks, your online service will probably show averages per day, and after that per week or even per month.
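As a rough illustration of that tiered averaging, here is a minimal Python sketch. It assumes samples arrive as (timestamp, value) pairs, for example heartbeat readings, and it uses the tiers described above: hourly averages for the first week, daily averages for the next three weeks, and weekly averages after that (monthly averages are left out for brevity). The function names `bucket_key` and `downsample` are hypothetical, not part of any particular product.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def bucket_key(ts, now):
    """Return (resolution, bucket start) for a sample taken at ts."""
    age = now - ts
    if age < timedelta(days=7):
        # First week: one average per hour.
        return ("hour", ts.replace(minute=0, second=0, microsecond=0))
    day = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if age < timedelta(days=28):
        # Next three weeks: one average per day.
        return ("day", day)
    # Older data: one average per week (assumed here to start on Monday).
    return ("week", day - timedelta(days=day.weekday()))


def downsample(samples, now):
    """Average (timestamp, value) samples into age-dependent buckets."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[bucket_key(ts, now)].append(value)
    return {key: mean(values) for key, values in buckets.items()}
```

Calling `downsample(samples, datetime.now())` returns a dictionary that maps each (resolution, bucket start) pair to the average for that bucket. Note that this only computes the averages; whether the raw samples are deleted afterwards is a separate decision.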

You might now think that your job is done: you have decided which data you are going to keep and which data you will average. But you have to think further than that. You save costs because you average the data and throw away the bulk of the raw data. However, that might not be the best choice…

Imagine that you can hold on to the most relevant IoT data and gather it for many years. Imagine that scientists use that data for medical research. Instead of running a long, multi-year study, they have raw data at hand at the start of the project. Instead of a study with a couple of thousand participants (which is costly), they have data from hundreds of thousands of people, maybe even more.

Imagine that your customer base is very loyal and regularly buys your newest health tracker. Imagine that this health tracker has new functionality based on data that the previous model already collected, and that this data can now be put to use…

There could be many more examples of things you can do with that data in the future. Devices that automatically regulate climate or electricity in your house can deliver data that can be used to tailor a service to each customer’s needs.

Smart fridges can enable a service that not only creates a shopping list automatically but also, based on your eating and drinking patterns, delivers health information about what you are consuming or even suggests alternatives. That data, again, could also be used for medical research and, of course, for marketing. Many options become available (but do remember that privacy rules apply to collecting and selling that data…).

To conclude, what IoT data should you keep when collecting information from devices? It depends, but you need to think this through with the future in mind. New services might need that additional data, and the data could be used for other services that make a difference later on. And while you are at it, make sure you develop a strategic availability service for that data, as we discussed in post one, located at the back end of your service.

