What processors will be important in robocars?

ARM graphic to make you feel their chips will control cars

Recently chip-maker ARM announced a new processor series aimed at robocars. We've also seen Nvidia get a lot of stock market boost thanks to the presence of their GPUs in most neural network systems. (I own some NVDA.) A long line of chip companies have touted that their chips will be big winners in the robocar race.

Will they? Obviously as cars turn into computers, we are going to need lots of processors. But today there are only 80 million cars sold every year. That number is going to grow a bit with robotaxis -- perhaps in a few years it will double to 160M.

But those are pretty small numbers by consumer electronics standards. 260 million PCs are sold per year, and another 160 million tablets. And 408 million smartphones. Plus gaming devices and industrial computers -- it's a lot. Still, if you could corner the market of all cars, and perhaps sell multiple processors per car, it could be serious revenue.
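
To put those volumes side by side, here is a rough back-of-the-envelope sketch. It uses only the unit counts quoted above and assumes one main processor per PC, tablet and phone. The `chips_per_car` parameter is a hypothetical knob, not a figure from any car maker.

```python
# Annual unit volumes quoted in the text, in millions of units.
CARS_TODAY = 80
CARS_WITH_ROBOTAXIS = 160  # the guess above at a doubled market
PCS = 260
TABLETS = 160
SMARTPHONES = 408


def car_share(cars: float, chips_per_car: int = 1) -> float:
    """Fraction of all main-processor sockets that would be in cars.

    Assumes one main processor per PC, tablet and phone (a simplification).
    """
    car_sockets = cars * chips_per_car
    consumer_sockets = PCS + TABLETS + SMARTPHONES
    return car_sockets / (car_sockets + consumer_sockets)


if __name__ == "__main__":
    for cars in (CARS_TODAY, CARS_WITH_ROBOTAXIS):
        for chips in (1, 4):  # 4 chips per car is purely illustrative
            share = car_share(cars, chips)
            print(f"{cars}M cars x {chips} chips: {share:.1%} of sockets")
```

With one chip per car, today's 80 million cars come to under a tenth of these sockets. Only with a doubled market and several chips per car does the share start to approach half.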

It's good for the chip companies to want to win this, but are they following the right strategy?

Processors

Some companies are hoping to make specialized processors, with a plan that car makers will choose them over the commodity processors made by Intel, AMD, ARM and others. The ARM chip claims some redundancy features for reliability.

It's true that you want a chip that can handle a rougher environment than a desktop, but most chip makers already build that. Redundancy is good, but I suspect most teams would prefer to get their redundancy by just duplicating off-the-shelf parts rather than buying more expensive specialized parts. This is how Google built the most reliable data centers in the world -- they used the cheapest parts they could find, expecting lower grade reliability, and planned for failure and built systems to handle it gracefully. I'm convinced that is the right architecture here. When a system is life critical, you don't design to have parts that don't fail, you must design to properly handle the failure of any part of the system, and even multiple parts.
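
A minimal sketch of the "plan for failure" idea, under the strong assumption that the redundant units fail independently of each other. The function names and the failure probabilities are hypothetical, chosen only to illustrate the arithmetic and the failover pattern.

```python
def prob_all_fail(p_fail: float, replicas: int) -> float:
    """Chance that every replica fails, assuming independent failures."""
    return p_fail ** replicas


def failed_unit():
    """Stand-in for a compute unit that has died."""
    raise RuntimeError("unit down")


def working_unit():
    """Stand-in for a compute unit that still answers."""
    return "ok"


def first_healthy(units):
    """Return the answer from the first unit that responds, skipping dead ones."""
    for unit in units:
        try:
            return unit()
        except RuntimeError:
            continue  # this unit is down, so fall through to the next one
    raise RuntimeError("all redundant units failed")


if __name__ == "__main__":
    # Three cheap units, each with an illustrative 1% failure chance,
    # versus one pricier unit with an illustrative 0.1% failure chance.
    print(prob_all_fail(0.01, 3), "vs", prob_all_fail(0.001, 1))
    print(first_healthy([failed_unit, working_unit]))
```

Under these made-up numbers, three cheap units beat one better unit by a wide margin. The independence assumption is the weak point: anything that takes out several units at once, such as a shared power supply, breaks it.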

In fact, cars face a particular problem. Some of them will fail with one section of the vehicle literally smashed to bits. You can't always survive that, but if you can stay partly online it may make a difference. You don't put all your computing in one physical place.

The good news is that most of the processing for the non-neural-network parts of a robocar has only moderate to high compute requirements. This may change as new algorithms are developed, but the trend is for this to get better and easier as algorithms improve and hardware gets faster and cheaper.

Neural Network Processors

Nvidia got its big boost because everybody is excited by using deep neural networks, particularly for vision processing and object classification. Much of the research behind the deep learning boom was done on Nvidia GPU chips. While GPUs are meant to process images, they are really massively parallel simple multi-core computers. Nvidia developed the CUDA platform to let people use the chips that way and it paid off big time. They have added expertise and tools aimed at self-driving, and are selling their GPU boards to most self-driving car teams.

At the same time, neural networks are the "it" thing in hardware right now. Everybody wants to build neural network hardware. Every chip company, and many startups, are hard at work on making better hardware, to compete with Nvidia. Some of the approaches are radical and promise orders of magnitude improvements over GPUs. GPUs are unlikely to remain the king, though Nvidia might make some of the new chips too.

GPUs have the advantage that they are mass market products. Some type of GPU is found in every computer, phone, tablet or gaming console. This provides tremendous development resources and a much shorter product cycle. High end gamers buy 13 million expensive GPU cards per year and other devices use even more.

The first company to make hardware just for neural networks was actually Google, with their TPU (Tensor Processing Unit). They don't sell the TPU; they use it in their data centers, but it's also available to their sibling company Waymo. Other companies will have dedicated neural net chips soon.

I wrote earlier about Tesla's custom chip for neural networks.

It is important to understand that much more processing power is needed to train neural networks than to run them. The training is not done in the car, it's on servers with other processors. What matters is the chips in the car that run the network to process their perception data.

The algorithms keep changing, and they adapt to the hardware. For example, right now running neural networks is often limited by memory bandwidth and latency, not computation. In addition, sparse neural networks (in which many of the weights are zero) can get hardware acceleration for major speedups. But who knows what the most powerful algorithms of 2021 will be? Chip designers of 2018 need to know.
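
To illustrate why sparsity helps, here is a toy sketch in plain Python of one neuron's weighted sum, computed densely and then with the zero weights skipped. It is only an illustration of the principle, not how any particular chip or library does it, and all the names are made up for this example.

```python
def dense_dot(weights, inputs):
    """Multiply every weight, zeros included. Returns (result, multiplies)."""
    total = 0.0
    for w, x in zip(weights, inputs):
        total += w * x
    return total, len(weights)


def to_sparse(weights):
    """Keep only (index, weight) pairs for the nonzero weights."""
    return [(i, w) for i, w in enumerate(weights) if w != 0.0]


def sparse_dot(sparse_weights, inputs):
    """Touch only the nonzero weights. Returns (result, multiplies)."""
    total = 0.0
    for i, w in sparse_weights:
        total += w * inputs[i]
    return total, len(sparse_weights)


if __name__ == "__main__":
    weights = [0.0, 2.0, 0.0, 0.0, -1.0, 0.0]
    inputs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    print(dense_dot(weights, inputs))
    print(sparse_dot(to_sparse(weights), inputs))
```

Both versions give the same answer, but the sparse one does fewer multiplies and fetches fewer weights from memory, which matters when memory bandwidth is the bottleneck.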

Specialty ASICs and FPGAs

MobilEye, now a unit of Intel, did very well building a special custom chip (ASIC) just for doing machine vision. They were able to turn that into an ADAS device and sell lots of them, and now they are expanding it with Intel to include more powerful tools for self-driving. MobilEye had a bit of luck, because their early chip had hardware on it which turned out to also be useful for neural networks, just as GPUs did. This is designed into the later generations.

The problem with ASICs is you do need to be smart and lucky. It takes years to fully produce a new ASIC, and so you must bet right on what you're going to need down the road. And the world is changing fast.

Another type of general purpose chip with some role is the FPGA. These are chips where you can rewire the logic using software. They can't do as much logic as an ASIC can, but you can change it on a dime. Some neural network tools have found good performance on FPGAs. Don't count them out. Their flexibility is even greater than the GPU's, and could be a key asset.

Several other types of chips will be used as well: network controllers (for the old CAN bus and more modern Ethernet-based networks) and data radio (4G/5G) chips for the external connection. Some companies have proposed independent security chips, and there is a host of others. This article focuses on more general processors.

Microcontrollers

A typical car contains lots and lots of microcontrollers, sometimes known as ECUs in automotive parlance. Some cars have upwards of 100 ECUs in them. In the robocar, you will find them in all the sensors, and the drive-by-wire components, but you'll actually see a lot of them go away, as buttons, dials, pedals, wheels and displays are removed from the car. Adjustable seats, mirrors, steering columns and many other components that used to need ECUs and actuators will be gone.

The electric power train also will be much simpler. On the other hand, the future car is going to be laden with internal sensors to measure temperature, vibration and other factors that affect maintenance. These will use super-cheap ECUs simply to save the cost and complexity of wiring.

Energy use

In electric cars, some of these chips, most notably the high-power GPU boards, are starting to consume serious wattage. Enough to make a difference in the range of the car, which means a difference in the cost of the battery. An electric car in motion might be using about 10 kW, but small and light personal pods might drop that down to 4 kW or less. I've seen processor rigs that draw multiple kilowatts. High-powered lasers can also consume serious power.

In these vehicles, and especially in smaller robots such as delivery robots, power usage of even hundreds of watts can be a serious amount. The mobile phone industry has made the chip industry search for low power solutions, and the small robotaxi market might do so as well. Gasoline cars won't suffer the same penalty of needing a larger battery and will be better able to handle this. The ARM chips claim to use only 15 watts, which will be popular.
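
A rough sketch of the range arithmetic, assuming the drive and the computers draw constant power from the same battery and ignoring everything else (climate control, lights and so on). The 2 kW rig is an illustrative stand-in for "multiple kilowatts"; the 10 kW, 4 kW and 15 W figures are the ones quoted above.

```python
def range_lost(drive_kw: float, compute_kw: float) -> float:
    """Fraction of driving range lost to compute on a shared battery.

    With a fixed battery, range scales with 1 / total power, so the
    loss is compute / (drive + compute). Battery size cancels out.
    """
    return compute_kw / (drive_kw + compute_kw)


if __name__ == "__main__":
    cases = [
        ("10 kW car, 2 kW rig", 10.0, 2.0),
        ("4 kW pod, 2 kW rig", 4.0, 2.0),
        ("4 kW pod, 15 W chip", 4.0, 0.015),
    ]
    for label, drive, compute in cases:
        print(f"{label}: {range_lost(drive, compute):.1%} of range lost")
```

Under these assumptions a 2 kW rig costs a 10 kW car about a sixth of its range and a 4 kW pod a third of it, while a 15 W chip costs the pod well under one percent.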

A server farm has tons of power available (and mostly worries about heat.) A car has a lot, but less than the server. A small robot has less, but still has much more power than a phone or battery powered embedded device. People want chips for all of these.

Who wins?

I don't think any chip company wins. I think that developers will be inclined to build their systems to be portable, able to move across different hardware architectures. Hardware architectures will change rapidly. Neural network chips will be a particularly fast changing field. Companies will constantly evaluate who gives the best combination of price, performance and reliability.
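
One way such portability might look in software is a thin interface between the driving code and whatever chip runs the network. This is a minimal sketch, not any team's actual design. The class and function names are hypothetical, and the "network" is a placeholder that just doubles its inputs.

```python
from typing import List, Protocol, Sequence


class InferenceBackend(Protocol):
    """What the application needs from any chip that runs the network."""

    name: str

    def run(self, inputs: Sequence[float]) -> List[float]: ...


class CpuBackend:
    name = "cpu"

    def run(self, inputs: Sequence[float]) -> List[float]:
        return [2.0 * x for x in inputs]  # placeholder for a real network


class AcceleratorBackend:
    name = "accelerator"

    def run(self, inputs: Sequence[float]) -> List[float]:
        # Same placeholder result; a real backend would call vendor code here.
        return [2.0 * x for x in inputs]


def perceive(backend: InferenceBackend, sensor_data: Sequence[float]) -> List[float]:
    """Application code depends only on the interface, not on the chip."""
    return backend.run(sensor_data)


if __name__ == "__main__":
    for backend in (CpuBackend(), AcceleratorBackend()):
        print(backend.name, perceive(backend, [1.0, 2.0]))
```

If the rest of the stack only ever calls `perceive`, swapping chip vendors means writing one new backend class rather than rewriting the application.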

So if your chips become a value leader in automotive, be happy. But not for long.

Comments

(Adapted from a FB discussion on this post:)

We are post-Dennard scaling now. (But at 7nm TSMC we also have transistor budgets into the tens of billions.) Hopefully the ADAS stuff is 20W and not 200W. So power efficiency critically enables throughput, performance, low latency. So there will be a remarkable degree of hardware customization to run the various parts of the stack. So whatever it is, it will be a very heterogeneous system-in-package with integrated memory such as HBM2.

It will have aspects of the brilliant Apple A12 or similar mobile apps processor. https://www.anandtech.com/.../the-iphone-xs-xs-max.../2 CPU + GPU + "NPU", advanced caches / memory subsystem, integrated with dozens of fixed accelerators, power optimized.

It will have aspects of the just-announced Xilinx Versal ACAP (adaptive compute acceleration platform). https://www.xilinx.com/.../white.../wp505-versal-acap.pdf to do low latency sensor fusion, afford hardware acceleration / energy optimization of new emerging computer vision and ML algorithms (batchsize=1 low latency) and standards, and the incredibly flexible I/O that FPGAs bring (direct connection to myriad MIPI cameras, sensors, networks, transducers), motor control, etc.

It will have a NoC to flexibly and scalably tie it all together.

In all, it will have a remarkably diverse mix of general purpose processors, massively parallel GPU/AI processors, programmable logic, many, many special purpose fixed function accelerators, lots of special purpose memories, lots of cache, HBM2 or later super high bandwidth memory, high speed networking, radio (probably a second device), and possibly a multi-die EMIB or 2.5D stacked die arrangement to rapidly integrate new ASIC accelerators into the overall system.

I bet the software stack will be heterogeneous too -- including secure boot / enclave, hypervisor, application OS (Linux), one or more RTOSs, C++, Java, JS, Python, so many different parallel programming models, so many libraries, so many frameworks. The cloud too will participate in driving you around -- one wonders if a vehicle will go into a degraded mode when/if internet connectivity is lost.

[Brad then clarified "will it make sense to produce specialty chips aimed at the problems a car has, or will more general chips (including general neural net chips) rule the day?" I replied:]

The bet that Xilinx in particular is making with their forthcoming ACAP product line named Versal is to produce a volume standard part in the most advanced process but with flexible and configurable hardware and (more software developer friendly) parallel processors and IO so that its customers can build a "domain specific architecture" overlay on the commodity programmable hardware, optimized to a specific domain. They specifically say they aim to accelerate the whole workload and not just the AI component (a weakness of standalone AI chips).

Here are some slides, a white paper, and a video on this.
https://www.xilinx.com/.../XDF_VERSAL_Press_Presentation...
https://www.xilinx.com/.../white.../wp505-versal-acap.pdf (see esp. ADAS section)
https://www.youtube.com/watch?v=xnNs74tTHZU

For the next few years a broad, versatile, adaptable ACAP-like part should be a sound platform upon which to build the in-vehicle HW/SW stack. Over time, as volumes rise and as workloads mature, more of the ADAS and its ML workload could then migrate into bespoke/customized accelerator blocks on a future ADAS-workload-mix-optimized ACAP device.
