Safeguard AI from security vulnerabilities as IIoT big data grows

Insignary

What IoT does for consumer devices, IIoT does on an industrial scale. After collecting data from an array of nearby and far-flung devices and sensors, IIoT relays important information back to higher-order computing platforms. When machines can communicate with each other through sensors, organizations gain efficiency, cut costs and streamline the entire workflow. So, it comes as no surprise that Accenture projects that by 2030, IIoT revenue will reach more than $14 trillion globally.

The benefits of big data growth

Industrial companies generate and collect huge amounts of data. For example, FedEx has three major businesses: FedEx Ground, FedEx Freight and FedEx Express. In 2017, FedEx handled an average daily volume of almost 6 million packages internationally. In addition to collecting pickup, delivery, cost, payment, weight, barcode and contents information, each package generated multiple data points along its way from sorting facilities to its final destination. Some packages were subject to special handling, like secure custody or specific temperature provisions, throughout the FedEx transportation system. Adding to the data provided by each package was information generated by FedEx’s delivery vehicles, including miles travelled and location coordinates.

To give you an idea about the scale of data we can collect, let’s look at an aircraft engine. Aerospace engine maker Pratt & Whitney’s family of geared turbofan engines is fitted with thousands of sensors that collect up to 10 GB of data per second. At that rate, a twin-engine aircraft with an average 12 hours of flight time can generate more than 800 TB of data. If there are multiple fleets of planes in service, the organization can amass zettabytes of data over very short intervals of time.
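The engine figures can be sanity-checked with simple arithmetic. The sketch below assumes decimal units (1 TB = 1,000 GB) and that the peak rate of 10 GB per second is sustained for the whole flight; the function name and the two-engine default are illustrative assumptions, not vendor data. Under those assumptions one engine produces about 432 TB in 12 hours, so a figure above 800 TB corresponds to an aircraft with two engines.

```python
# Back-of-the-envelope data volume for sensor-equipped engines.
# Assumptions: decimal units (1 TB = 1,000 GB) and a sustained peak rate.

GB_PER_TB = 1_000


def flight_data_tb(rate_gb_per_s, hours, engines=1):
    """Return the data generated over one flight, in terabytes."""
    seconds = hours * 3600
    return rate_gb_per_s * seconds * engines / GB_PER_TB


per_engine = flight_data_tb(10, 12)              # one engine, 12-hour flight
twin_engine = flight_data_tb(10, 12, engines=2)  # assumed twin-engine aircraft

print(f"one engine:  {per_engine:.0f} TB")
print(f"two engines: {twin_engine:.0f} TB")
```

Multiplying the per-flight result by the number of flights per day and aircraft per fleet shows how quickly the totals grow.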

The engine information, if handled correctly, can be used to achieve optimal engine performance and therefore reduce costs. If FedEx can duplicate its system of sensor-connected data gathering and communications throughout its supply chain, it will experience staggering increases in efficiency. But effectively managing all of this data can be problematic for today’s systems. Consequently, many scientists and businesses are turning to artificial intelligence for answers.

The race to integrate AI to manage big data

AI powers Amazon’s Echo devices. Apple’s Siri is also backed by AI. Those are just two examples of consumer-focused AI platforms. As AI becomes more sophisticated, we will see virtual agents become increasingly commonplace, while humans address the more difficult issues that require human, non-automated attention.

Over the next decade, AI is expected to permeate businesses and organizations vertically, beginning with single-task systems and eventually leading to more extensive integration. This means that in 10 years, we can expect AI systems to reach from far-flung IoT devices all the way to an organization’s internal and cloud-based systems. Accenture supports this outlook, estimating that AI has the potential to boost rates of profitability by an average of 38% by 2035, across 16 industries.

How best to secure AI systems

The key goal behind AI systems is to push humans higher and higher in the decision-making process. This means that transactional data collection, cross-system communication and system-to-human or system-to-system interactions will take place in an increasingly automated manner.

But should we proceed with caution? Yes, because while these systems seem hyper-intelligent, AI systems are as susceptible to hacking by malicious actors as any other type of software-based computing platform. Should a hacker — whether a criminal or state-sponsored actor — breach an AI system, the damage from data impairment, loss or misuse could be incredibly debilitating. Additionally, the high-speed, highly autonomous nature of AI systems can hinder the timely discovery of bad actors’ activities. Data theft, loss or misuse from a hacked AI could affect governments, businesses and citizens, and it could impair financial, healthcare, communications, and defense and policing activities. And hackers can do this by exploiting known open source vulnerabilities — a historically favored and frequented avenue used to disrupt and take over entire systems.

Open source: The irreversible trend

Due to its incredible value as an innovation engine, open source is an irreversible trend. Its code is now in widespread use by companies of all sizes, across all industry verticals. More than 90% of all software is either composed entirely of open source or contains open source components. It is used in operating systems, productivity software, and administration and development tools, as well as in code libraries that companies and third-party software vendors use to build their software. Today, it is difficult to find commercial or off-the-shelf software that does not use open source components.

The prevalence of open source components applies to AI systems as well. In fact, there are more than 20 Python-related AI and machine learning projects on GitHub, the largest hosting platform for open source software.

Software vulnerabilities

Whether software code is proprietary or open source, it harbors security vulnerabilities. Supporters of open source argue that the accessibility and transparency of the code allow the “good guys” — corporate quality assurance teams, white hat hackers or open source project groups — to find bugs faster.

Critics contend that more attackers than defenders examine the code, resulting in a net effect of a higher incidence of vulnerability exploits. Even so, the open source community is very good at addressing vulnerability issues. Once open source vulnerabilities are discovered, the community is quick to catalogue their IDs and publicize the corresponding updated open source component version.

Vulnerabilities are dynamically increasing

The raw number of vulnerabilities is growing. The increase in code development means the inadvertent creation of more vulnerabilities. The U.S. government sponsors the Common Vulnerabilities and Exposures (CVE) list, which reported 14,712 new vulnerabilities in 2017. What is more concerning is that, according to recent data, 2018 appears set to break the 2017 record.

Complicating matters is the fact that good open source code is used in many different ways, across a spectrum of different kinds of applications. If a very useful piece of code has a vulnerability, it means that a large number of platforms and software applications containing that code become vulnerable to hackers.

Further compounding this issue is the fact that most known security vulnerabilities sit unnoticed in code that organizations already use. Consequently, users do not know that their code contains security threats that are open to attack.

So, how are these known vulnerabilities able to hide in and pervade applications, platforms and devices that leverage open source? While newer versions of open source components are available without those known vulnerabilities, the challenge for in-house software development teams and third-party developers is to effectively track all open source software components in their respective codebases. To a certain extent, this challenge is caused by the current software development and procurement model. It is also attributable to the fact that development teams often receive third-party software in binary format, which hides the components inside it.
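To make the tracking problem concrete, here is a minimal sketch of an inventory check, assuming a team already has a list of component names and versions. The component names, the version numbers and the FIXED_IN table are hypothetical placeholders; a real check would draw on a maintained vulnerability feed rather than a hard-coded dictionary.

```python
# Minimal sketch: flag inventoried components that predate a known fix.
# All names and versions below are hypothetical illustrations.

# Hypothetical advisory data: component name -> first version with the fix.
FIXED_IN = {
    "examplelib": (1, 2, 0),
    "demo-parser": (3, 0, 1),
}


def parse_version(text):
    """Turn a dotted version string such as '1.2.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def outdated(inventory):
    """Return (name, version) pairs that are older than the fixed version."""
    flagged = []
    for name, version in inventory:
        fixed = FIXED_IN.get(name)
        if fixed is not None and parse_version(version) < fixed:
            flagged.append((name, version))
    return flagged


inventory = [
    ("examplelib", "1.1.9"),   # older than the fix, should be flagged
    ("demo-parser", "3.0.1"),  # already at the fixed version
    ("other-lib", "0.4.0"),    # no advisory recorded
]
print(outdated(inventory))
```

The check is only as good as the inventory behind it, which is why components that arrive inside binaries are so easy to miss.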

Understand your code

Development, security and software provisioning teams can use binary code scanners that rely on code fingerprinting. These tools extract “fingerprints” from the binary to be examined and then compare them to the fingerprints collected from open source components hosted in well-known open source repositories. Once a component and its version are identified through this fingerprint matching process, development and security teams can easily find known security vulnerabilities associated with the component in vulnerability databases, like the National Vulnerability Database, also sponsored by the U.S. government.
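The fingerprint matching idea can be sketched in a few lines. This is a simplified illustration, not how any particular scanner works: it treats printable strings embedded in a binary as fingerprints and scores each known component by the share of its fingerprints found. The fingerprint table, component names and threshold are hypothetical.

```python
import re

# Hypothetical fingerprint database: (component, version) -> known strings.
KNOWN = {
    ("examplelib", "1.1.0"): {b"examplelib_init", b"el_parse_header", b"examplelib 1.1.0"},
    ("examplelib", "1.2.0"): {b"examplelib_init", b"el_parse_header", b"examplelib 1.2.0"},
}


def extract_fingerprints(blob, min_len=6):
    """Collect runs of printable ASCII bytes of at least min_len characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return set(pattern.findall(blob))


def identify(blob, threshold=0.8):
    """Return the best matching (component, version), or None below threshold."""
    found = extract_fingerprints(blob)
    best, best_score = None, 0.0
    for key, prints in KNOWN.items():
        score = len(prints & found) / len(prints)
        if score > best_score:
            best, best_score = key, score
    return best if best_score >= threshold else None


# A stand-in for a compiled binary: readable strings separated by raw bytes.
binary = b"\x00\x01examplelib_init\x00\x7f\x02el_parse_header\x00examplelib 1.1.0\x00\xff"
print(identify(binary))
```

Once a component and version are identified this way, the result can be looked up in a vulnerability database to list the known issues that apply.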

AI promises a great deal. It has the power to positively change the fundamental structure of our economies. However, there is potential for malicious actors to use security vulnerabilities to disrupt or harm AI adoption and use. Due to the speed and scale of these security risks, engineering teams owe it to their customers and employers to discover and address known vulnerabilities before AI systems are placed in the field, and to continue to check them at consistent intervals.

All IoT Agenda network contributors are responsible for the content and accuracy of their posts. Opinions are of the writers and do not necessarily convey the thoughts of IoT Agenda.