The real danger in IoT development is that posed by early, apparent “success”
There’s a huge difference between a prototype implementation that covers a handful of devices and a system that supports a hundred million devices with different software and hardware levels as well as intermittent connectivity. The failures listed below all concern work that doesn’t generate revenue or directly affect user experience, and thus can fall by the wayside in the heat of early success and the piecemeal iterations of agile software development. A further risk is posed by the nonstop nature of IoT. You can’t turn off an IoT deployment once it’s launched, even for a moment, so routine maintenance will be challenging and radical architectural change will be nearly impossible.
The cold, hard truth is that you can be “cool” or you can be “unreliable,” but you can’t be both, and there’s a limit to how many times you can disappoint or inconvenience your customers and remain in business. Here are five foreseeable failures that will hit IoT:
Failure to architect from day zero for massive volumes
Success in IoT will be bigger and more “real time” than anything we’ve seen before. This means that IoT systems will have to scale painlessly to massive deployments. We’re not talking about compound annual growth of “only” 50% or even 100% — success could mean going from 1,000 deployed to 100,000,000 units over a couple of years.
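To put that jump in the same terms as the 50% and 100% figures, here is a rough calculation of the implied compound annual growth rate. It assumes “a couple of years” means exactly two; the function name is made up for this illustration.

```python
def implied_annual_growth(start_units: int, end_units: int, years: float) -> float:
    """Compound annual growth rate as a fraction (1.0 means 100% per year)."""
    return (end_units / start_units) ** (1 / years) - 1


# Assumption: "a couple of years" is taken to be exactly 2.
growth = implied_annual_growth(1_000, 100_000_000, years=2)
print(f"Implied growth: {growth:.0%} per year")
```

Under that assumption the deployment grows by a factor of more than 300 each year, which is the kind of curve the architecture has to survive.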
Almost every cloud vendor claims to be able to support stupendous workloads, and the same claim is made by NoSQL vendors and open source projects.
I’m not going to directly question these claims, but three factors need to be considered if you’re going to treat this like Lego and try to build massive systems by snapping together a herd of diverse technologies:
- Physical constraints: It takes roughly 60 milliseconds to cross the U.S. one way. This means that if your application needs many round trips to the server instead of a single request, latency could climb into the seconds for people using the app far from your single data center. Constraints also apply within the data center: high-volume networking gets interesting at around 10 GiB/sec. And to be really scalable, not only will your app not fit onto one server, no single subsystem of your app will fit onto one server.
- Technology impedance mismatch: No matter how scalable individual technologies are, there’s no guarantee that high levels of performance will be obtained when you glue them together, especially if they scale at different rates.
- Developer experience: How many developers have actually built a system with 100,000,000 users? Do you happen to employ any of them, or are they contractors or consultants? I’m not saying your people aren’t smart enough to do this, but are they going to get it right the first time? If not, will there be a second time?
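The physical-constraints point can be made concrete with a back-of-the-envelope sketch. It uses the 60 millisecond one-way figure quoted above and deliberately ignores server processing time, retries and congestion, so real numbers would be worse. The function name and the round-trip counts are illustrative assumptions.

```python
ONE_WAY_MS = 60  # one-way coast-to-coast figure quoted in the text


def network_latency_ms(round_trips: int, one_way_ms: float = ONE_WAY_MS) -> float:
    """Network time only: each round trip costs two one-way crossings."""
    return round_trips * 2 * one_way_ms


# Hypothetical comparison: a single request versus a "chatty" exchange.
for trips in (1, 10, 25):
    print(f"{trips:>2} round trips: {network_latency_ms(trips):.0f} ms")
```

A protocol that needs ten exchanges to complete one user action spends more than a second on the wire before any server work is done.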
Failure to anticipate unlikely but certain events
A fundamental difference between conventional software and IoT systems is the lack of control you have over the environment in which your creation is deployed. The real world is a strange, confusing and erratic place, and this oddness will impose itself on your system. Bizarre, one-off events will happen frequently. Assuming you have 100,000,000 devices deployed, an annual million-to-one event will be happening roughly twice a week. Coping with this requires a fundamental change of mindset for many developers. Software which is insufficiently paranoid will allow errors to enter the system and spread chaos. Chaos will lead to poor user experience, which will in turn lead to negative perceptions — or worse.
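The “twice a week” figure can be checked with a few lines of arithmetic. This sketch assumes “million-to-one” means a 1-in-1,000,000 chance per device per year and that devices fail independently; the function name is made up for this illustration.

```python
WEEKS_PER_YEAR = 52


def expected_events_per_week(devices: int, one_in: int) -> float:
    """Expected weekly count of an event with a 1-in-`one_in` annual chance per device."""
    events_per_year = devices / one_in
    return events_per_year / WEEKS_PER_YEAR


per_week = expected_events_per_week(devices=100_000_000, one_in=1_000_000)
print(f"Expected occurrences per week: {per_week:.2f}")
```

That works out to 100 occurrences a year across the fleet, or a little under two a week.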
Failure to anticipate and cope with evolving physical complexity
The initial deployment of an IoT system will usually focus on some form of minimum viable product and then, as time goes on, more features will be added. But in addition to this, success will result in acquisitions of competitors, technology partnerships with people who have different worldviews, marketing-inspired “features” that fail and are quietly forgotten, and the inevitable technical debt that will accrue over time.
While in a traditional corporate environment we could replace aging components, in the IoT universe the actual things aren’t owned by you — they are owned by your customers, who cannot be forced to stop using their devices and will expect them to work until they physically fail. This means that once you ship a device you’re stuck supporting it for decades, unless you’re willing to “brick” your devices and antagonize your customers.
Failure to anticipate and cope with evolving logical complexity
As your physical environment becomes more complicated, your software stack will follow it. What might have been a nice, clean deployment will become old and wrinkly over time, with chunks of obsolete code and increasingly convoluted data paths through the system as you try to cope with the unavoidable fact that you can never, ever stop supporting anything you’ve shipped.
Does your JSON have something nasty lurking in it?
Document data stores represent a real problem in this scenario. Because they don’t enforce any rules about what’s stored, every single piece of database interaction code you own needs to be able to understand every single record structure it could ever possibly encounter. In a SQL database, new columns are used to store new data, making this problem much more manageable.
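As a minimal sketch of what this looks like in practice, the reader below has to recognize every record layout that was ever written. The field names (`schema_version`, `temp_f`, `temperature`) and both layouts are hypothetical, invented for this example, and do not come from any particular document store.

```python
import json


def read_temperature_c(raw: str) -> float:
    """Return a temperature in Celsius from any record layout we know about."""
    doc = json.loads(raw)
    # Assumption: the oldest records predate versioning and carry no version field.
    version = doc.get("schema_version", 1)

    if version == 1:
        # Hypothetical v1 layout: {"temp_f": 72.5}
        return (doc["temp_f"] - 32) * 5 / 9

    if version == 2:
        # Hypothetical v2 layout:
        # {"schema_version": 2, "temperature": {"value": 22.5, "unit": "C"}}
        reading = doc["temperature"]
        if reading["unit"] == "C":
            return float(reading["value"])
        if reading["unit"] == "F":
            return (reading["value"] - 32) * 5 / 9
        raise ValueError(f"unknown unit {reading['unit']!r}")

    # Be paranoid: refuse layouts we have never seen instead of guessing.
    raise ValueError(f"unknown schema_version {version!r}")
```

Every new layout adds another branch here and in every other piece of code that touches these records, and none of the old branches can ever be deleted while old documents remain in the store.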
How will you reconcile the need to evolve with the need to remain up at all times?
Much of the apparent clumsiness of highly available (HA) applications is a side effect of needing to anticipate rare but dangerous scenarios. Creating and deploying a truly HA application is a significant technical challenge; keeping it going is a much bigger one in the long term.
Will all your open source components be supported a decade from now?
The only thing sadder than an abandoned shopping mall is an abandoned open source project, especially if you use it. While it’s now easy to build elaborate applications by assembling a stack of open source projects, ongoing support will become challenging over time. Sooner or later one or more components will become orphaned. At this point you will have to choose between architectural change and disruption that isn’t tied to increased revenue, and betting that you will be able to fix any issues that occur yourself. At a minimum, you will need to retain the capability to compile your entire stack.
All IoT Agenda network contributors are responsible for the content and accuracy of their posts. Opinions are of the writers and do not necessarily convey the thoughts of IoT Agenda.