David Tetley, Principal Software Engineer, Abaco Systems
High Performance Embedded Computing (HPEC) has evolved tremendously in the last twenty years. Two decades ago, embedded real-time processing for system modeling, simulation, image and signal processing often utilized scaled-down supercomputer architectures: a homogeneous array of identical processors interconnected in a parallel, symmetric topology.
Programming solutions for these architectures were initially fragmented, often using hardware vendor or microprocessor-specific software layers for communication between processing elements. Over time, the need for portability drove the development of new open standards; for example, MPI (Message Passing Interface) evolved to enable developers to create high performance, scalable, and portable applications to run on these homogeneous, parallel architectures.
In the last ten years, new technologies, such as multi-core CPUs, DSPs, GPUs, and FPGAs, have provided orders of magnitude more embedded processing power, and so deployed architectures have become heterogeneous. A system combining a modern FPGA, multi-core CPU, and powerful GPU can now replace a system that had tens to hundreds of processors a decade ago. Also, “System on Chip” (SoC) technology can integrate multiple processing architectures into one piece of silicon with performance to match that of a supercomputer of the late 1990s.
These new high performance, heterogeneous architectures are being deployed in a rapidly expanding raft of industrial embedded applications such as industrial robots, autonomous vehicles, medical devices, and of course the Internet of Things revolution. They are enabling machine learning to be applied to these applications and AI capabilities to be embedded directly in devices.
Existing software standards have not kept up with heterogeneous architectures
Unfortunately, traditional programming models and open standards for communications between processing elements have lagged behind the HPEC hardware revolution. Interconnect topologies have become fragmented and more complex. There is no longer a clear, unified API that can drive all the new communication interfaces introduced by heterogeneous architectures.
The diagram above shows a sampling of the many software APIs that today’s HPEC developer may have to use to communicate between compute elements, often within the same application.
This can be an overwhelming endeavor and limit the innovation that can be achieved with these heterogeneous architectures.
Relatively new standards have evolved to tackle subsets of the problem. For example, MCAPI addresses core-to-core communication, whereas OpenMP offers a higher-level abstraction. HSA abstracts the complexities of a heterogeneous architecture that pairs a CPU with a GPU, but it is not designed for inter-device communication. The bottom line is that there currently isn’t one communication API that fits all needs.
At a fundamental level, communication is about getting data from one endpoint to another. This basic concept is the same for all interconnects and localities. If these communication concepts could be abstracted into a simple, unified, high-level open API without compromising low-level performance and flexibility, that new open standard could be a one-stop shop for developers who want to focus on algorithm development rather than the complexities of inter-processor communication, greatly simplifying HPEC software development. The diagram below shows how this could look.
Figure 2. Simplified HPEC software development
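To make the idea of a single endpoint-to-endpoint abstraction concrete, here is a minimal Python sketch. It is purely illustrative: no such standard exists yet, and every name in it (`Transport`, `QueueTransport`, `SocketTransport`, `roundtrip`) is hypothetical. An in-process queue stands in for a same-device link such as shared memory, and a local socket pair stands in for an inter-device link. The point is only that the application function is written once and does not change with the interconnect underneath it.

```python
import queue
import socket
from abc import ABC, abstractmethod


class Transport(ABC):
    """Hypothetical backend interface: one implementation per interconnect."""

    @abstractmethod
    def send(self, payload: bytes) -> None:
        """Deliver one message to the far endpoint."""

    @abstractmethod
    def recv(self) -> bytes:
        """Return the next message from the far endpoint."""

    def close(self) -> None:
        """Release backend resources (nothing to do by default)."""


class QueueTransport(Transport):
    """Stand-in for a same-device link such as shared memory."""

    def __init__(self) -> None:
        self._q: "queue.Queue[bytes]" = queue.Queue()

    def send(self, payload: bytes) -> None:
        self._q.put(payload)

    def recv(self) -> bytes:
        return self._q.get(timeout=1.0)


class SocketTransport(Transport):
    """Stand-in for an inter-device link such as a network fabric."""

    def __init__(self) -> None:
        self._tx, self._rx = socket.socketpair()
        self._rx.settimeout(1.0)

    def send(self, payload: bytes) -> None:
        # Length-prefix each message so recv() knows where it ends.
        self._tx.sendall(len(payload).to_bytes(4, "big") + payload)

    def recv(self) -> bytes:
        size = int.from_bytes(self._read_exact(4), "big")
        return self._read_exact(size)

    def _read_exact(self, n: int) -> bytes:
        buf = b""
        while len(buf) < n:
            chunk = self._rx.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("peer closed the link")
            buf += chunk
        return buf

    def close(self) -> None:
        self._tx.close()
        self._rx.close()


def roundtrip(link: Transport, payload: bytes) -> bytes:
    """Application code: identical for every backend."""
    link.send(payload)
    return link.recv()


if __name__ == "__main__":
    for link in (QueueTransport(), SocketTransport()):
        print(type(link).__name__, roundtrip(link, b"sensor frame"))
        link.close()
```

A real standard would also have to address zero-copy transfers, endpoint discovery, and latency guarantees, none of which this toy interface attempts. What it does show is how small the application-facing surface could be.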
Complexity hurts safety certifiability
Another significant area of concern with existing communication standards is complexity. With demand growing for high performance, heterogeneous architectures to operate on the factory floor, in automobiles on our roads, and in our homes, meeting safety certifications is critical. To minimize certification costs, APIs must be kept as simple as possible; a small, streamlined communication standard would be a significant help here.
Khronos creates an Exploratory Group and invites your input
In January 2019, The Khronos Group, an open consortium of leading hardware and software companies creating advanced acceleration standards, created an Exploratory Group to determine the industry’s interest in developing a simplified open standard for embedded heterogeneous communications. If there is enough interest, Khronos will form a Working Group and invite all interested parties to collaborate on the development of a multi-vendor standard under Khronos’ proven multi-company governance process.
Developers of aerospace, automotive, robotics, industrial, medical, and Internet of Things (IoT) applications; hardware vendors for silicon, boards, and systems; and embedded software tool and OS vendors now have a unique opportunity to have a voice in the direction of open standards that will affect their industry.