Self-driving cars, text-to-speech systems, AI services and delivery drones are just a few obvious applications of artificial intelligence (AI). To keep fueling the AI gold rush, we've been improving the very heart of AI hardware technology: digital AI cores that power deep learning, the key enabler of AI.
At IBM Research, we've been making strides in adapting to the workload complexities of AI systems while streamlining and accelerating performance. By innovating across materials, devices, chip architectures and the entire software stack, we're bringing closer the next generation of AI computational systems, with cutting-edge performance and unparalleled energy efficiency.
In a new paper presented at the 2021 International Solid-State Circuits Conference (ISSCC), held virtually this year, our team details the world's first energy-efficient AI chip at the vanguard of low-precision training and inference, built with 7 nm technology. Through its novel design, the AI hardware accelerator chip supports a variety of model types while achieving leading-edge power efficiency on all of them.
This chip technology can be scaled and used for many commercial applications, from large-scale model training in the cloud to security and privacy efforts that bring training closer to the edge and data closer to the source. Such energy-efficient AI hardware accelerators could significantly increase compute horsepower, including in hybrid cloud environments, without requiring huge amounts of energy.
AI model sophistication and adoption are quickly expanding: AI is now used for drug discovery, modernizing legacy IT applications and writing code for new applications. But the rapid growth of AI model complexity also increases the technology's energy consumption, and a major challenge has been creating sophisticated AI models without growing the carbon footprint. Historically, the field has simply accepted that if the computational need is big, so too will be the power needed to fuel it.
But we want to change this approach and develop an entirely new class of energy-efficient AI hardware accelerators that will significantly increase compute power without requiring exorbitant energy.
Tackling the problem
Since 2015, we've been consistently improving the power performance of AI chips, boosting it by 2.5 times every year. To do so, we've been creating algorithmic techniques that enable low-precision training and inference without loss of prediction accuracy. We've also been developing architectural innovations and chip designs that allow us to build highly efficient compute engines able to execute more complex workloads with high sustained utilization and power efficiency. And we've been creating a software stack that renders the hardware transparent to the application developer and compatible across hybrid cloud infrastructure, from cloud to edge.
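To make the first of those ideas concrete, here is a minimal sketch of one widely used algorithmic technique in this space: quantization-aware training with a straight-through estimator, written in plain NumPy on a toy linear model. It shows how the forward pass can run through low-precision values while full-precision weights are still updated, so accuracy is preserved. This is an illustration of the general approach under assumed toy parameters, not IBM's specific algorithm; the `fake_quantize` helper is invented for the example.

```python
import numpy as np

def fake_quantize(x, bits=8):
    """Round x onto a symmetric low-precision grid, then return floats."""
    qmax = 2 ** (bits - 1) - 1                     # e.g. 127 for 8 bits
    scale = max(np.max(np.abs(x)) / qmax, 1e-12)   # per-tensor scale
    return np.round(x / scale) * scale             # quantize, then dequantize

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 8))          # toy inputs
true_w = rng.normal(size=8)            # weights we hope to recover
y = X @ true_w

w = np.zeros(8)                        # full-precision "master" weights
lr = 0.05
for _ in range(200):
    wq = fake_quantize(w, bits=8)      # forward pass sees quantized weights
    err = X @ wq - y
    grad = X.T @ err / len(X)          # straight-through: treat d(wq)/dw as 1
    w -= lr * grad                     # update the full-precision copy

print("max |w - true_w| after training:", np.max(np.abs(w - true_w)))
```

The key design point the sketch captures is keeping a full-precision copy of the weights for updates while all forward arithmetic sees only quantized values, which is what lets reduced-precision training avoid accuracy loss.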
We remain leaders in driving reduced precision for AI models [Figure 1], with industry-wide adoption. We've extended reduced-precision formats to 8-bit for training and 4-bit for inference, and we've developed data communication protocols that let the AI cores on a multi-core chip exchange data efficiently with one another.
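As an illustration of what "4-bit for inference" means at the tensor level, the sketch below applies generic symmetric per-tensor integer quantization, mapping float weights to integers in [-8, 7] with a single shared scale. This is the common textbook scheme, not the chip's actual number format; the helper names and tensor sizes are assumptions made for the example.

```python
import numpy as np

def quantize_int4(w):
    """Map float weights to signed 4-bit integers in [-8, 7] plus one scale."""
    scale = max(np.max(np.abs(w)) / 7.0, 1e-12)      # per-tensor scale factor
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the 4-bit integers."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(1)
w = rng.normal(scale=0.1, size=(64, 64)).astype(np.float32)
q, s = quantize_int4(w)                              # 16 levels per weight
w_hat = dequantize(q, s)
print("mean absolute quantization error:", np.mean(np.abs(w - w_hat)))
```

Storing and moving 4-bit integers instead of 32-bit floats is what cuts the memory traffic and multiply-accumulate energy; the per-tensor scale keeps the reconstruction error small enough for inference.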