Elon Musk's war on LIDAR: He hates it, everybody else loves it. Who's right?

Tesla tries to get distance from vision while avoiding LIDAR

It wasn't news when Elon Musk declared how much he dislikes LIDAR for robocars at Tesla Autonomy Day last month, but this time he declared it more forcefully and laid out his reasons. Many people are wondering whether Tesla's plan is smart or crazy. In this new article, I outline the two different philosophies and the arguments behind them, to help you figure out who might be first to real robocars on the road.

In short, Tesla feels that self-driving needs superb computer vision to work, and that Tesla will get there first. Once you have superb computer vision, it does everything LIDAR does, and any effort spent on LIDAR was wasted. Most other teams feel that you can get to a real robocar faster by combining not-quite-as-good computer vision with the reliable data of LIDAR.

Read about it in my Forbes site post at Elon Musk's war on LIDAR - who is right and why do they think that?

Comments

Neural nets for LIDAR and radar in self-driving cars? This is the first I've heard of that. Do you know where I can read up on it?

I love your term 'bet your life reliability'. Hope you don't mind if I steal it.

It's weird people harp on LIDAR's cost in robotaxi applications. Even at $5k it's only a penny a mile. Musk is talking about $1/mile fares. It's a rounding error.
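
As a rough sanity check on that penny-a-mile figure (the 500,000-mile robotaxi service life here is my own assumption, not a number from the thread):

```python
# Back-of-envelope amortization of a lidar unit over a robotaxi's life.
# Both numbers are illustrative assumptions, not figures from the article.
lidar_cost_dollars = 5_000
service_life_miles = 500_000  # assumed robotaxi lifetime mileage

cost_per_mile = lidar_cost_dollars / service_life_miles
print(f"${cost_per_mile:.3f}/mile")  # $0.010/mile -- about a penny
```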

Some examples include Frustum PointNets, VoxelNet and PointPillars, all for 3D object detection in lidar point clouds. They are built on recent neural network designs (e.g. PointNet) that can process point clouds directly, and were invented quite recently, around 2017-2018.

Frustum PointNets: http://stanford.edu/~rqi/frustum-pointnets/
VoxelNet: https://arxiv.org/abs/1711.06396
PointNet: http://stanford.edu/~rqi/pointnet/
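
For anyone wondering what "neural nets that can process point clouds" look like, here is a minimal PointNet-style sketch in PyTorch. This is my own illustrative simplification of the PointNet idea (a shared per-point MLP plus an order-invariant max-pool), not code from any of the papers above:

```python
import torch
import torch.nn as nn

class TinyPointNet(nn.Module):
    """Minimal PointNet-style classifier: a shared MLP applied to every
    point, then a max-pool (a symmetric function, so the network is
    invariant to point ordering). Illustrative only; Frustum PointNets,
    VoxelNet and PointPillars add far more machinery."""
    def __init__(self, num_classes=4):
        super().__init__()
        self.point_mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 1024), nn.ReLU(),
        )
        self.classifier = nn.Linear(1024, num_classes)

    def forward(self, points):             # points: (batch, n_points, 3)
        features = self.point_mlp(points)  # per-point features
        global_feature, _ = features.max(dim=1)  # order-invariant pooling
        return self.classifier(global_feature)

# e.g. classify a cluster of 2048 lidar returns as car/pedestrian/cyclist/other
logits = TinyPointNet()(torch.randn(8, 2048, 3))
```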

If lidar actually made a difference and could be had for only $5K, sure. They could cut out the huge batteries, burn gasoline instead, and save well over $5K there, if saving $5K were critical.

There are lots of problems with that, though. Most importantly, lidar doesn't actually make enough of a difference. Not right now, and probably not ever for the type of car that Tesla is building - a car that can go anywhere.

The cost of lidar right now isn't a penny a mile. The cost of lidar right now would be Tesla going out of business. People frame the question as whether or not you can build a robocar without lidar. But the real question is whether or not you can build a general purpose robocar with it.

I can't see how one can say that. If there had been LIDAR in Huang's car or Joshua Brown's car, they would be alive. That's a big difference.

Yes, saving two lives is a big difference. But putting lidar (and software to utilize it) in every single car you sell would put you out of business. You can't expect a manufacturer to spend $2 billion to save two lives. If you could, then lidar would be mandatory on all cars, not just Tesla cars (as well as numerous other things, and all cars would cost a fortune).

Prices of lidar might one day come down to the point where it is mandatory to install it on all cars. Maybe. But I doubt it. If it does, Tesla, and all the other car manufacturers, can start installing it.

This is especially true because there are many much more cost-effective fixes to the problems that killed Huang and Brown. And those fixes will save more lives than wasting time and money on lidar.

Your suggestion in the article about what you call "detailed maps" makes more sense. The car should know how many lanes are on the road, and should be extra cautious if it thinks it sees something different or if there are no maps available. In the article you suggest that Musk is opposed to that. I know Musk is opposed to "high definition maps," which makes sense. But I haven't seen him come out against using maps with details such as the number of lanes.

Huang's death was when autopilot was in a very early stage. Yes, adding lidar can help at that early stage. But the argument that Musk is making is that it doesn't help at the later stages. I'm not sure if he's right or not about that. Maybe it will help enough to add it on, especially if prices come way down. But I think he's right that it is counterproductive to use it during the development phase. Or to put it in his words, it's a crutch.

They do have maps, including maps of the number of lanes, but they oppose highly detailed maps showing the geometry of the lanes and the location of things around the lanes (like those crash barriers.) To use such maps you must localize against them, which I don't think they do.

It would not just be these two lives. There have been a number of crashes. And untold millions of crashes averted because the human driver took over. Of course I don't know it's millions, but there are 400K Autopilots out there, and as an Autopilot user, I have made many interventions. I don't have the tools to tell how many of those interventions were truly needed, and I'm not going to avoid intervening to find out.

The whole point is, tons of companies are out there prepping to make LIDARS well under $1K. With a goal of under $100 by the time you are making them by the millions.

Under $100 cost to the car company, with no start-up costs (or with start-up costs factored into the $100), and maybe it'll be worth it. Maybe not though. Depends how good computer vision is by then, and how good that cheapo lidar is. But hey, if the cost to number of lives saved ratio of lidar can approach that of, say, airbags or antilock brakes, I have no doubt Tesla (and all the other car manufacturers) will add it to their new models.

At $1000, it's still questionable. Yes, there have been crashes, some of them fatalities, but it's not clear how many would have been prevented with lidar. As far as crashes averted because a human took over, again I'd question how many would have been averted without human intervention had Tesla used lidar. But in any case, I think people would much rather pay $1000 less for their car than have a slightly lower rate of interventions. As I understand it, you don't even have full self-driving on your car. Yet you are judging it as though it's a robocar. Lidar right now would bring Tesla from not-a-robocar to...still-not-a-robocar. It might have saved a handful of lives and prevented some crashes. That is, if the money to pay for the enormous cost had fallen from the sky. Instead it probably would have driven Tesla out of business.

Yes, they have maps. And yes, they've come out against "high definition maps." As I understand it those maps are created with lidar, and I'm not sure how useful they'd be for Tesla. Detailed maps, with enough details so that the car can figure out that the area next to it that looks like a lane isn't actually a lane - yeah, they should have that. (There's no need to map the barrier, certainly not in 3D, and in fact that might have confused things because most of the barrier was actually not there that day.)

Maps are really important for situations like the one where Huang was killed. It's easy to see how even a human could mistake that gore area for a lane (I wonder why there are no chevrons in it). The human would probably not crash all the way into the barrier. But they very well might think it's a lane the first time they drive there. At which point they'll add it to their mental map, so they don't make the same mistake next time.

I judge it by Musk's claim that full self driving will be feature complete this year, and that sleeping in your car will work next year (though it may not have regulatory approval.) I do have the "full self drive" option on my car, but of course it is not yet implemented. I can only judge what's out there, and it's not at all in a state that suggests FSD is just 6 or 18 months away.

I drive that ramp all the time, and no, nobody would mistake the gore for a lane. Look at it on streetview if you like. It's a left exit to a carpool flyover lane. A human will understand that. A camera may or may not understand what it says. A LIDAR perception system may also not understand what it sees, but it will not miss that there is a large crash barrier sitting immobile directly in its path. Not even Uber's car would screw that up.

I have looked at it in street view.

Yes, lidar would have prevented this crash. Most importantly it would have prevented the crash because Tesla would have been bankrupt. Which would probably be a net loss for society. Teslas are safer than most cars.

I thought you didn't pay for FSD. Do you have the upgraded computer? Have they pushed out to your car the update to activate it?

As you say, you can only judge what's out there, and what's out there is ADAS. It should be judged as ADAS, and I think you've admitted that, by that standard, it's the best.

Tesla's strategy is to use an ADAS car to collect data to help them build a robocar. It's a brilliant strategy, and wouldn't have worked if they had relied on lidar.

There was never a chance of Tesla putting in any LIDAR except a fixed beam one like the Valeo/ibeo (though that one would have prevented these crashes.)

I did not buy FSD at the time, but Tesla later offered a brief "sale" at $2,000, which I took. $6K or more for a product not yet shipping is too much.

For $2K, aside from the new computer, I will get any other hardware they have to add on (even a LIDAR.) Plus there are some useful things you can do with just cameras, such as traffic jam full self drive, valet parking, etc., that I expect to come for that money.

Hey Brad, why are you capitalizing lidar? You did not write "RADAR" in your article. Just curious if you had a good reason. Good article on the lidar debate. It will be interesting to see who prevails.

It's just a common form for it. Actually, I prefer to write it as lidar or Lidar but most people seemed to be writing it LIDAR. Some call it LADAR still.

Best argument I've heard is that lidar, like radar, has now become a common word, rather than a technical one, so you don't have to capitalize it any more.

That's my excuse, anyway. Real reason is maybe that it's easier to type on my phone.

I think you might as well transition to "lidar" and help people feel like it's an accepted word, just as it is an accepted technology. Previous commenter FKA is right on all points, IMO. I also feel that "Li.D.A.R." is really awkward, which should convince us to avoid acronymizing it. I think just giving it a proper name can help move the technology along.

One advantage of having lidar today, even if it turns out it's not needed tomorrow, is that it provides a lot of accurate depth data to go together with the radar and video data. That in turn can be used to train a vision system (that might not use lidar) in the future. One of the few advantages that Tesla has today is more collected data, but that data is not as rich as what the competitors are collecting. So even if Waymo and the others have to drop lidar in the future, it won't have been a wasted effort, I suspect.

Good point. Tesla showed something similar, but using radar, on Autonomy Day. Lidar obviously gives more accurate data to enable better training.

Does anyone know how much Waymo is asking for their "Honeycomb" lidars? They are offering them to industrial and other non-auto users, but pricing is apparently a state secret. Not unusual in the lidar arena. Seems the only companies who announce pricing are the ones who don't have units in actual production yet.

People training NNs are keen for anything that provides ground truth. This is how comma.ai trained their neural networks, by driving with a LIDAR on board. And also how they tested them -- they made sure the neural network saw what the lidar said was there. I presume many teams have done this, except Tesla.
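
A sketch of what that lidar-as-ground-truth training might look like (a hypothetical illustration of the idea, not comma.ai's or anyone else's actual pipeline): the camera network predicts dense depth, and the co-mounted lidar's sparse returns supervise it.

```python
import torch.nn.functional as F

def lidar_supervised_step(model, optimizer, image, lidar_depth, valid_mask):
    """One training step for a monocular depth network supervised by lidar.
    Hypothetical sketch: `model` maps an RGB image to a dense depth map;
    `lidar_depth` is the lidar point cloud projected into the image plane,
    and `valid_mask` marks the pixels that actually got a lidar return."""
    pred_depth = model(image)                        # (batch, 1, H, W)
    # Penalize only the pixels where lidar provided ground truth.
    loss = F.l1_loss(pred_depth[valid_mask], lidar_depth[valid_mask])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```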

It seems Tesla may have used the same technique, at least to some limited extent, as spotted here in Sept. 2016, shortly before Autopilot 8.0 (HW2.0, sans Mobileye) was introduced:
https://insideevs.com/news/331679/tesla-model-s-with-manufacturers-plate-caught-testing-velodyne-lidar-system-video/

Karpathy indicated at the recent Autonomy Day presentation that the traditional Tesla perception gap [e.g. not braking for stationary objects at >50 mph, since doppler radar only detects moving vehicles] is to be patched up with the HW3 upgrade, which enables running clever, compute-intensive software that was beyond the tapped-out HW2.5, such as the advanced algorithms described in these recent papers:

1. Learning to See in the Dark (neural-network night vision)
https://arxiv.org/pdf/1805.01934

2. Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras
https://arxiv.org/abs/1904.04998

3. Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving
https://arxiv.org/abs/1812.07179

With the on-board compute resources Tesla's HW3 supplies, something like #1 could provide amazingly enhanced passive night vision [see demo https://www.youtube.com/watch?v=bcZFQ3f26pA ], and #2 and #3 can emulate high-resolution LiDAR for 3D mapping in real time; hence the sensor-suite problems are, to quote Karpathy, quite tractable.
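
The core trick in paper #3 is simple enough to sketch: back-project each pixel of a vision-estimated depth map into 3D using the camera intrinsics, yielding a "pseudo-lidar" point cloud that existing 3D detectors can consume. A minimal version (my own sketch of the paper's idea, not its code):

```python
import numpy as np

def depth_to_pseudo_lidar(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (H*W, 3) point cloud
    in camera coordinates, using the pinhole model with focal lengths
    fx, fy and principal point (cx, cy)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

# The resulting points can then be fed to the same 3D object detectors
# (VoxelNet, PointPillars, etc.) normally run on real lidar returns.
```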

This does not detract from the fact that Waymo and others build a very safe AV system relying on LiDAR, but it goes to illustrate that Tesla has now revealed a convincingly feasible technological path to the same L5 goal at probably much lower unit expense. Only time can tell which path pays off first and/or more handsomely, but I think both will probably do very well at leaving the established auto OEMs in the dust.

It is very exciting what can now be achieved with just HDR camera inputs and maximal data-crunching.

Even though Tesla is unlikely to manage a feature-complete L5 FSD candidate system by the end of 2019, as Musk claims, it should still bring an impressive highway-L3 candidate by mid-2020, which will be an extremely useful advance in safety, although it will retain the existing L2 hands-on-steering checks until new approvals are granted.

You can train networks to detect depth cues, and night vision, and uneven lighting and more.

But can you train them to a high enough level of precision and recall, a near perfect level, where you would bet your life, bet your customer's lives and bet your company on it? I am not sure you can even get close to that yet, but I welcome Tesla showing otherwise.
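
To put a rough number on "near perfect" (all rates here are my own assumptions for illustration): suppose the vision system makes ten safety-critical detections per second and you want at most one miss per million vehicle-hours.

```python
# Rough scale of "bet your life" recall, with assumed rates.
detections_per_second = 10                    # assumption
tolerated_misses_per_hour = 1 / 1_000_000     # assumption
detections_per_hour = detections_per_second * 3600

required_miss_rate = tolerated_misses_per_hour / detections_per_hour
print(f"required recall ~ {1 - required_miss_rate:.12f}")
# ~0.999999999972 -- far beyond what vision benchmarks demonstrate today
```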

I see what appears to be an assumption that you can't have all of good vision systems, good lidar, and good maps.
The argument appears to be that lidar is a crutch, and that using it for anything will keep the team from concentrating on getting vision to the point where it is sufficiently good. Presumably this is because it is assumed that a sophisticated understanding of the real world is required to make vision work, and that this understanding is generalizable to the driving task.
If this is correct, then this seems awfully close to requiring a full AI in order to drive a car. If so, I think we will have quite a wait.
If on the other hand we want to just develop a sophisticated system which approximates intelligence only in as much as it pertains to the driving task, then I think we need to combine data from as many sources as possible, and constrain the problem space in any way we can.
I see no reason to suspect that NNs trained on both lidar and vision (and using maps) are a priori less likely to converge on a robust 'understanding' of the real world than one trained only using vision.
Just because a particular type of intelligence evolved in the natural world, does not mean it is the only type of intelligence that could evolve. If evolution had somehow provided animals with lidar, I have no doubt that it could have developed an intelligence that used lidar as an input to understand the world in the human sense.

It's certainly possible, once you have good vision systems, to have good vision systems, good lidar, and good maps. I'd question whether or not good lidar is worth the cost, though, and spending too much money on good lidar probably means that good vision systems will take longer. (There is an argument that use of lidar might be useful in helping to train good vision systems. This is probably true if things are done correctly, but there's also a danger in training your vision systems with a different configuration than your production systems.)

I think good maps are useful. HD maps, maybe not (especially if the car doesn't have lidar).

I think we are awfully close to requiring "a full AI" in order to drive a car outside of tightly geofenced areas. I think that only allowing a car to go in tightly geofenced areas is tantamount to requiring infrastructure, and requiring infrastructure is not part of the vision of the robocar (https://www.templetons.com/brad/robocars/vision.html). (On this, I think it'd probably be less expensive to make infrastructure improvements than to do the mapping and testing required to deploy a robocar in one small area at a time.)

Maybe "close to...a full AI" isn't the right way to describe it. The car "only needs to approximate intelligence as much as it pertains to the driving task." But that's a lot. Driving isn't just following a bunch of rules. To drive as well as a human, you need to predict what other drivers (and pedestrians) are going to do, and in order to do that you need to predict what they predict you are going to do (https://www.youtube.com/watch?v=nRgplHWGbuM). And sometimes you have to communicate with them, not with words or hand signals but with your car, in order to "enhance their prediction." Merging in traffic is very much a task which requires a very sophisticated AI. And that's not the least of it. I've been noticing quite frequently when driving how often it's required to break the rules, and how often it's required to deal with others who break the rules. Not only does that require sophisticated AI, it also requires a lot of guts on the part of the company that releases this car that sometimes intentionally breaks the rules when that's the best thing to do.

I have no idea how long this will take. Maybe it'll be a long time. And maybe we'll wind up going the route of having self-driving cars only in tightly geofenced areas, and expand those areas with infrastructure improvements. Maybe we'll have to give up, for the next decade, on teaching a self-driving car how to cut someone off in LA traffic. If so, there are at least two alternatives. One is to have cars that are level 4 in tightly geofenced areas and level 2 or level 3 outside of them. Tesla, and the other car manufacturers, are well positioned for this. Another alternative would be to provide infrastructure to expand (and connect) the geofenced areas. Of course, the two aren't mutually exclusive. We can do both, and I won't be at all surprised if we do in fact do both.

If lidar were free it'd be a no-brainer. But it's not free. It's not free per unit, it's not free to integrate into a car, and it's not free in terms of making the car less efficient. So the question is, what does it provide, and what is the cost? For a car that can only go in limited places, that equation might make sense. For a car that can go anywhere, I don't think it does. And a car that can go anywhere is, I think, the whole point of building a self-driving vehicle (as opposed to building new infrastructure).

I don't think you need human AI at all. In fact, when it comes to perception and planning, I suspect an insect brain could do that, certainly a bird or mammal brain, if they could understand the rules of the road. The rules of the road you can code into non-AI.

It's well known that in the old days, people would sometimes get on their horses at the bar, fall asleep, and the horse would get them home. That is of course lower speed, and modest traffic, but computers are good at speeding things up and understanding the physics of that.

No one follows the rules of the road. The idea that you can create a self-driving car by just coding "the rules of the road" (as though that's even a well-defined thing) and attaching them to good sensors, would be very dangerous, if anyone of importance actually believed it.

Didn't Waymo's AI figure out on its own how to handle a four-way stop (and not the way the "rules of the road" tell you to handle it)? I thought I remembered reading something like that years ago.

Thing is, that's only one situation, and one that's very common. You can't hand code every situation into a car. Definitely not if you want to be able to drive outside of very limited geofenced areas.

FKA - you seem to be under the illusion that Tesla's neural network makes driving decisions. It merely identifies objects and tries to give path prediction hints.

Do you have access to inside information? At Autonomy Day, they presented how they were training it on paths driven by humans so it could then make those decisions. As to what they do now, I don't recall them having said. However, this is not a central issue, and they probably aren't doing it yet.

Brad, I have no inside info. Just going off what they said on Autonomy Day, what the hackers uncover, Karpathy's other talks, etc. Musk said the NN processes a single frame. Verygreen says they feed it two frames. Either way works well with their convolution accelerator chip and Inception-style NN.

The NN operates on 2D pixels and returns bounding boxes, lines and/or labels. The original labels were car/truck/van/etc. They've since added a distance label, X-Y path prediction label, etc. I believe they re-run the NN multiple times on the same frame but with different weights to get these different labels. One set of weights is trained for lane lines, one for car/truck/van, one for X-Y path prediction, etc. This fits their "2100 frames per second" narrative. It also lets them train a special "adjacent vehicle about to change lanes" NN without messing up the weights that identify pedestrians or estimate distances. You could do it all in a huge monolithic NN, but training and testing would be orders of magnitude harder and IMHO would not work as well.
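
If that description is right, the architecture amounts to a shared backbone with per-task output heads (or per-task weight sets swapped in). A hypothetical sketch of the multi-head variant, purely my own illustration and not Tesla's code:

```python
import torch
import torch.nn as nn

class MultiHeadPerception(nn.Module):
    """One shared convolutional backbone, several small task heads.
    Training one head (e.g. "about to change lanes") leaves the others'
    weights untouched if the heads are trained separately. Illustrative
    only; not a description of Tesla's actual network."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(   # stand-in for an Inception-style CNN
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.lane_head = nn.Conv2d(64, 1, 1)    # lane-line mask
        self.class_head = nn.Conv2d(64, 5, 1)   # car/truck/van/ped/other
        self.path_head = nn.Conv2d(64, 2, 1)    # X-Y path-prediction offsets

    def forward(self, frame):                   # frame: (batch, 3, H, W)
        f = self.backbone(frame)
        return {"lanes": self.lane_head(f),
                "classes": self.class_head(f),
                "path": self.path_head(f)}
```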

Stuart Bowers said they stitch all these NN outputs into a 3-D view of the world. This keeps the NN in 2D pixel space so it doesn't have to know about speed, acceleration, steering angle, radar pings, etc. It also solves other issues, e.g. Roadster can use the same NN and weights as Model X despite cameras being almost 2 feet lower.

Bowers alluded to a separate set of NN outputs that identify a drivable path and obstructions. He said they cross check these obstructions against the bounding box/ID labels. If a NN trained to find drivable space sees an obstruction at a certain pixel location and a NN independently trained to look for pedestrians finds one in that same spot it boosts their confidence level. Especially if there's a radar ping with a pedestrian-sized RCS that matches the distance label and the path prediction NN shows slow (i.e. 1-5 mph) left-to-right motion and the interpretation of all is consistent over multiple frames.
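
That cross-checking is essentially independent-evidence fusion: separately trained detectors (plus radar and frame-to-frame consistency) vote on the same spot. A toy version of such a rule, with made-up weights purely for illustration:

```python
def fused_confidence(drivable_obstruction, pedestrian_score,
                     radar_matches, frames_consistent):
    """Toy evidence-combination rule. Inputs are detector confidences in
    [0, 1] plus two boolean corroborations; the weights are invented."""
    score = 0.5 * pedestrian_score + 0.3 * drivable_obstruction
    if radar_matches:        # radar ping with pedestrian-sized RCS at same range
        score += 0.1
    if frames_consistent:    # same interpretation over multiple frames
        score += 0.1
    return min(score, 1.0)

# Two detectors each ~80% sure, radar agrees, stable across frames:
print(fused_confidence(0.8, 0.8, True, True))   # 0.84 -> high confidence
```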

So heuristic code builds the 3D model, cross checks NN labels against other sensors and for time continuity, then plots the desired course. They could have other NNs - e.g. radar or ultrasonic NNs analogous to the interesting lidar NNs listed above by Charles Qi (thanks Charles). They could also have a "driving NN". But the hackers haven't found anything like that and everything said during Autonomy Day tells me those other types of NNs don't exist. At least not yet.

I'm not sure what I said that gives you that impression.

I do believe Tesla will eventually use its AI chip to make at least some driving decisions.

I also don't think the problem they are trying to solve (level 5 autonomy, give or take some really exotic driving conditions) can be solved without using AI (specifically, machine learning) to determine the "rules of the road." But it's important to note that this statement is separate from my previous one. Much of the machine learning will probably be done "offline," i.e. when not driving.

(What are "path prediction hints"? When I google that term the only hit I get is a message from you here.)

I don't mean the written rules of the road. I mean the practices of driving which we all know, and which we can all write down if pushed to. Including the very much illegal rules most drivers use. Where to put yourself in lanes, how to move between lanes and through intersections, what to do when other drivers act in certain ways etc. The rules Mobileye is putting into RSS, and quite a few more.

I don't think anyone can write down all the practices of driving (to the level of detail where it doesn't require human intelligence to understand what's written down). That's exactly why I think this problem (a computer that can drive almost anywhere humans do) is only solvable by having those practices determined by a computer observing how people actually drive. And that in turn is why I think that Tesla, with its billions of miles making such observations, is very well positioned to solve the problem.

I encounter situations quite regularly when I think to myself, "I wouldn't have thought to tell a self-driving car how to handle that."

If anyone could write down all the practices of driving, why is making a self-driving car so hard?

I'm not sure I could even write down all the practices of driving in a form that only a human could understand. Try not to crash into anything. Try not to harm humans. Try not to harm animals, at least not significant animals. Try to get to your destination quickly. Try not to break the law, at least not laws that are likely to be enforced. Try to be courteous to other drivers.... But the problem is when those goals contradict, which they frequently do. In fact, to some extent they are constantly contradictory (driving always involves some risk, and the quicker you try to get to your destination, the higher the risk).

Do you think the unwritten "rules of the road" are universal? Probably not, but do you even think we could get everyone in a single state to agree on what they are?

Hint: No, we can't. I can point you to heated discussions on the Internet over how you're supposed to make an unprotected left turn. But that's just one example, and it's easy to specially code a single example. (Hopefully we can get all the state legislatures to approve the technique of pulling out into the intersection and then finishing the turn when the light turns red, if you can't finish it sooner.) Even then, there's often intelligence, and sometimes communication, involved in figuring out what the other cars and pedestrians are going to do (not everyone knows that you're supposed to yield to a vehicle already in the intersection, even if you have a green light and they have a red light). And how aggressive you can and should be depends on where you are. Are car companies going to hand code the rules of the road in NYC vs. Chicago vs. Atlanta vs. Austin? Even if I could write all the unwritten rules of the road down for the cities I've regularly driven in, I couldn't write them down for the cities I haven't regularly driven in.

(Another possibility to deal with tricky situations like making an unprotected left turn, or making a U-turn, is to simply avoid them. That's usually possible, and usually possible without adding too much time to your drive. Of course you have to be able to deal with them in the cases where it isn't reasonable to avoid them, but you can be extra cautious during those relatively rare cases.)

Sure, we all run into situations where we wouldn't have thought to program a car to do X.

That's why Waymo goes and drives 15 million miles, around 25 human lifetimes of driving, to find those situations. And as time goes on, they see fewer and fewer of them, until the number is vanishingly small, and the few that crop up later are either handled safely (if imperfectly) by general rules, or by a remote operations center.

I doubt 15 million miles is anywhere near enough, especially not 15 million miles under very limited driving conditions.

Either the "general rules" are intelligent enough to safely handle a one in 10,000 lifetime event, or you've got a level 3 system (with the remote operations center taking over at speed) at best.

A once in 25 lifetimes event needs to be handled.

You have to drive in each area to learn its quirks. But "needs to be handled" does not mean "needs to be handled ideally." Very rare events can be handled badly as long as there are no injuries. Extremely rare events can even have injuries. And the most extremely rare events may result in fatalities. Perfection is not possible or the right goal. That it is extremely rare for people to get hurt is the right goal.

I agree with you that perfection is not the right goal. Driving as safe or better than a safe driver is a better goal. I don't think such a small number of miles will get you there.

I'm not even sure what miles are being counted. 15 million miles of perfect driving with the exact same software as production might be enough for a limited deployment in those areas of testing. 15 million miles over numerous different versions of the software with lots of unsafe incidents occurring along the way isn't even remotely close.

Sorry, I was responding to the article, not a particular comment.

"Google isn’t teaching its computers how to drive. It’s collecting data—its cars have driven 200,000 miles in total, recording everything they see—and letting its algorithms figure out the rules on their own." https://www.wired.com/2012/01/ff_autonomouscars/

"“The data can make better rules. It’s very deep in the roots of almost everything Google does.”" Id.

Maybe they changed that approach. Or maybe the article was wrong. But I doubt it. There doesn't seem to be any other way to do it.

That is the Wired reporter's interpretation, not a quote from the Google team. They have used lots of machine learning for a long time, but it doesn't mean that higher-level decisions are made that way. The example cited, a 4-way stop, is one where Google learned early that you don't just code in the DMV handbook; you learn how people actually drive. You may decide that a machine learning tool is a good one to make the decision about when to go at a 4-way stop, but higher-level code is still in charge of knowing you are at a 4-way stop, and invoking that system.

Letting the AI make all the driving decisions would not be a good idea.
