Small World Big Data
Published: 05 Jun 2014
IT departments can benefit from storage vendors eavesdropping on their arrays to help them curb the amount of Internet of Things data inundating their storage shops.
Tired of big data stories? Unfortunately, they're not likely to stop anytime soon, especially from vendors hoping to re-ignite your Capex spending. But there is some great truth to the growing data explosion, and it pays to consider how that will likely cause incredible changes to our nice, safe, well-understood current storage offerings. For some it might feel like watching a train wreck unfolding in slow motion, but for others it just might be an exciting big wave to surf on. Either way, one of the biggest contributions to data growth will come from the so-called Internet of Things.
Basically, the Internet of Things just means that clever new data sensors (and, in many cases, remote controls) will be added to more and more devices that we interact with every day, turning almost everything we touch into data sources. The prime example is probably your smartphone, which is capable of reporting your location, orientation, usage, movement, and even social and behavioral patterns. If you're engineering-minded, you can order a cheap Raspberry Pi computer and instrument any given object in your house today, from metering active energy devices to tracking passive items for security to monitoring environmental conditions. You can capture your goldfish's swimming activity or count how many times someone opened the refrigerator door before dinner.
One challenge is that this highly measured world will create data at an astonishing rate, even if not all the data is or ever will be interesting or valuable. We might resist this in our own homes, but the Internet of Things trend breached the data center walls long ago. Most active IT components and devices have some built-in instrumentation and logging already, and we'll see additional sensors added to the rest of our gear. Naturally, if there are instrumented IT components and devices, someone (probably us) will want to collect and keep the data around for eventual analysis, just in case.
Adding to the potential data overload, there's an emerging big data science principle that says the more data history one has the better. Since we don't necessarily know today all the questions we might want to ask of our data in the future, it's best to retain all the detailed history perpetually. That way we always have the flexibility to answer any new questions we might ever think of, and as a bonus gain visibility over an ever-larger data set as time goes by.
The good news is that there are already some solutions within IT that can tame the Internet of Things. Tools such as Splunk and VMware vCenter Log Insight are great for aggregating log and event data for useful analysis. Splunk has even proven valuable beyond IT when business-relevant "crossover" data sources such as Web user logs are folded into the mix (e.g., marketing might like to know about user behavior on those Web servers).
But it's still not clear how an IT organization can make sense of a flood of disparate data from custom sensors, logs and metering sourced across all of its heterogeneous vendor devices. There aren't any notable industry standards yet, and we don't believe there will be, given that every past stab at creating monitoring standards has essentially failed to enable a consistent interpretation of even the most basic metrics, such as CPU utilization. The truth is that deep, vendor-specific knowledge is required to interpret data sourced from each vendor's devices.
Storage vendors, call home
So what are our storage vendors up to? The most evolved are offering call home support based on analyzing a stream of data generated by your on-site implementation of their storage. Industrial Internet of Things data is sometimes collected directly through wireless or live Internet connectivity, but for IT storage arrays it's often sent as an automated periodic upload of various log, status, configuration and usage reports. By sharing this data, both the vendor and client provide each other with extra value, leading to a more intimate, trusted and valuable relationship.
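To make the periodic upload concrete, here is a minimal Python sketch of what one call home bundle might look like. It is an illustration only: the function name, field names and sample values are our own assumptions and are not drawn from any particular vendor's format.

```python
import json
from datetime import datetime, timezone


def build_call_home_report(array_id, logs, status, config, usage):
    """Bundle one periodic upload. All field names are hypothetical."""
    return {
        "array_id": array_id,
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "logs": logs,
        "status": status,
        "configuration": config,
        "usage": usage,
    }


# Made-up sample values, for illustration only.
report = build_call_home_report(
    array_id="array-001",
    logs=["controller A: fan speed warning"],
    status={"health": "degraded"},
    config={"drives": 24, "firmware": "1.2.3"},
    usage={"capacity_tb": 100.0, "used_tb": 62.5},
)

# Serialize the bundle; a real system would then send it to the vendor.
payload = json.dumps(report, sort_keys=True)
```

Note that nothing in the bundle is business data; it describes only the array itself and how it is being used.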
There are two things a vendor does with call home data. The first is to provide much more effective and even proactive support. The vendor can see directly what's happening on the storage platform, which features are being used, and exactly which components are installed and configured, and it can also compare each customer's situation with the entire pool of other customers with similar implementations. In many call home schemes, the customer can also get their own processed views back to gain insights from the pooled knowledge and vendor analysis.
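As a rough sketch of the pooled comparison, the Python below scores one site's metric against a pool of peers with similar implementations. The function name, the choice of latency as the metric and the sample numbers are all assumptions made for illustration; a real vendor would likely use far richer analysis.

```python
from statistics import mean, pstdev


def compare_to_peers(value, peer_values):
    """Return how many standard deviations a site sits from the peer average."""
    avg = mean(peer_values)
    spread = pstdev(peer_values)
    if spread == 0:
        return 0.0  # every peer reports the same value
    return (value - avg) / spread


# Made-up latency readings (ms) from a hypothetical pool of similar sites.
peer_latency_ms = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
site_latency_ms = 9.0

score = compare_to_peers(site_latency_ms, peer_latency_ms)
```

A site that scores well above its peers is a natural candidate for a proactive support call, before the customer has even opened a ticket.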
The second is that the vendor can do a better job at, well, everything. Product management can better prioritize features, support can spot and address trending errors, account teams can ensure the customer is getting the most value out of what they bought, and marketing can plan more effective promotions. And yes, sales folks can make a well-timed sales call when usage is just about to max out the current license. There are benefits that aren't client-focused, too. For example, a storage vendor could track how OEM components such as flash cards or disk drives are performing in the field to help negotiate new supplier contracts.
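The "well-timed sales call" is, at its simplest, a trend projection. The Python sketch below fits a straight line to daily usage readings and estimates how many days remain before the licensed capacity is reached. The function name, the linear-growth assumption and the sample figures are ours, for illustration only.

```python
def days_until_limit(daily_usage_tb, licensed_tb):
    """Estimate days until usage reaches the licensed capacity.

    Fits a least-squares line through one reading per day and projects
    forward from the latest reading. Returns None if there are too few
    readings or usage is not growing.
    """
    n = len(daily_usage_tb)
    if n < 2:
        return None
    x_mean = (n - 1) / 2
    y_mean = sum(daily_usage_tb) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(daily_usage_tb))
    den = sum((x - x_mean) ** 2 for x in range(n))
    slope = num / den  # growth in TB per day
    if slope <= 0:
        return None
    remaining = max(licensed_tb - daily_usage_tb[-1], 0.0)
    return remaining / slope


# Made-up readings: usage growing 2 TB a day against a 100 TB license.
readings_tb = [80.0, 82.0, 84.0, 86.0, 88.0]
days_left = days_until_limit(readings_tb, licensed_tb=100.0)
```

With these sample readings the projection comes out to six days, which is about when the account team's phone call would be most welcome (or least, depending on your budget).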
Most vendors have thought about implementing a more rigorous call home program, but many have discovered that building a big data Internet of Things analytical platform at scale just isn't a core competency. That's when we'd suggest bringing in a call home solution provider like Glassbeam (which counts several well-known storage vendors and a set of broader industrial/medical Internet of Things device makers among its clients) that can help set up a given product line for fully featured call home services in weeks, instead of the months or years it might take to build something internally.
A big brother can be a good thing
As a storage customer, does the thought of having your arrays metered and tattling back to the vendor seem more like a 1984 Big Brother intrusion than a valuable service? While the data "sent home" could be competitively interesting since it concerns your IT architecture and usage, it's not your actual business data and you'd probably have no problem sending it file by file when working through a specific support issue.
At Taneja Group, we recommend IT storage buyers look specifically for vendors that have intelligently instrumented their solutions so that they can offer proactive call home support and provide holistic "views" back into the buyer's own site. Fundamentally, we think your IT vendors ought to be trusted advisors. If you can't trust the vendors you have, you should look for ones that you can trust.
About the author:
Mike Matchett is a senior analyst and consultant at Taneja Group.