FSD v12: Tesla's Autonomous Driving Game-Changer w/ James Douma (Ep. 757)
Summary
TLDR: In this engaging discussion, Dave and James delve into the recent developments at Tesla, focusing on the release of FSD V12, the Optimus humanoid robot, and the anticipated robotaxi reveal in August. They share firsthand experiences with FSD V12, noting its impressive capabilities and smoother performance compared to its predecessor. The conversation also explores the potential of Tesla's AI technology, the challenges of scaling up robot production, and the impact of competition in the AI field. The discussion highlights the rapid advancements in AI and the transformative potential of Tesla's upcoming projects.
Takeaways
- 🚗 Tesla's FSD V12 release has shown significant improvements over previous versions, surpassing initial expectations.
- 🌟 The V12 update introduced a drastic rewrite of Tesla's planning architecture, enhancing the overall driving experience.
- 🧠 The neural network's ability to generalize from mimicking human driving behaviors has led to a more natural and smoother ride.
- 🔧 Tesla's approach to developing FSD involves an end-to-end process, which has proven to be more sample-efficient and scalable.
- 🚀 The potential for FSD to reach superhuman driving capabilities is evident as the system continues to learn and improve.
- 🤖 The development of Tesla's humanoid robot, Optimus, is ongoing, with a focus on perfecting the hardware before scaling production.
- 📈 The importance of data gathering in refining AI models like FSD and Optimus cannot be overstated, with real-world variability being crucial for training.
- 🌐 Tesla's strategy for robotaxis involves a phased rollout, starting with select cities and gradually expanding the fleet.
- 🚕 The economic and operational shift of Tesla from a car manufacturer to an AI company is becoming more apparent as software takes center stage.
- 💡 The future of Tesla's products, including FSD and Optimus, hinges on continuous advancements in AI and the ability to scale effectively.
- 🌟 The conversation highlights the rapid evolution of AI in the automotive and robotics industry, showcasing the potential for transformative changes in transportation and manufacturing.
Q & A
What significant update did Tesla release recently?
-Tesla recently released the FSD (Full Self-Driving) V12 update.
What is the significance of the V12 release for Tesla's FSD?
-The V12 release is significant because it represents a drastic rewrite of Tesla's planning architecture approach and a major leap in the capabilities of the FSD system.
What were some of the issues with the previous version of FSD?
-The previous version of FSD had issues related to planning, such as not getting in the right lane, not moving far enough over, not knowing when it was its turn, and stopping in the wrong place.
How did the guest on the podcast describe their experience with the V12 update?
-The guest described their experience with the V12 update as very positive, noting that it exceeded their expectations and that it was much more polished than they anticipated.
What is the robotaxi reveal that was mentioned in the transcript?
-The robotaxi reveal mentioned in the transcript refers to Tesla's planned announcement of its robotaxi service, which is expected to be revealed in August.
What were some of the improvements observed with the V12 update compared to the previous version?
-With the V12 update, improvements were observed in the planning stack, with old failings being addressed and not replaced by new issues. The system also seemed to drive more naturally and made better decisions in various driving scenarios.
What is the expected timeline for Tesla's robotaxi service rollout?
-While a specific timeline was not provided in the transcript, it was suggested that Tesla might start testing unsupervised robotaxis on the streets in the second half of 2025.
What are some of the challenges that Tesla might face with the rollout of the robotaxi service?
-Some challenges that Tesla might face include ensuring the safety and reliability of the robotaxis, navigating regulatory requirements, and managing the transition from a private vehicle manufacturer to a fleet operator.
What was the general sentiment towards the V12 update at the beginning of the podcast?
-The general sentiment towards the V12 update at the beginning of the podcast was cautious optimism. The hosts were excited about the potential of the update but also aware of the challenges that might arise during its initial rollout.
How does the FSD V12 handle unexpected situations compared to the previous version?
-The FSD V12 handles unexpected situations more gracefully compared to the previous version. It is designed to mimic human driving behaviors more closely, which allows it to adapt and react better to new or unforeseen scenarios.
Outlines
🚗 Introducing Tesla's FSD V12 and Optimus
The discussion begins with Dave and James catching up on recent developments, focusing on Tesla's Full Self-Driving (FSD) V12 release and the Optimus robot. James shares his experience of driving with FSD V12 for about three weeks, including a road trip from Los Angeles to Austin, highlighting its impressive capabilities and the significant improvements over V11. They also touch on the potential for the robotaxi reveal in August and the anticipation surrounding it.
🤖 Rethinking Tesla's Planning Stack
Dave and James delve into the technical aspects of Tesla's FSD V12, discussing the shift from heuristics to an end-to-end neural network approach. They explore the challenges of removing guardrails and the surprising lack of major mistakes in V12. The conversation also covers the potential methods Tesla might be using to achieve such polished results, including simulation and data curation.
🚦 Navigating Intersections and Planning
The talk moves to the intricacies of driving behavior, with Dave sharing his observations of FSD V12's handling of intersections and its ability to mimic human driving patterns. They discuss the importance of understanding the severity of different driving mistakes and the evolving nature of the system's learning process.
🌐 Global Perspectives on FSD
Dave and James consider the implications of FSD's global rollout, discussing the need for local adaptations and the potential for cultural differences in driving styles to impact the system. They also speculate on the future of Tesla's development process, including the possibility of using human drivers as data sources.
📈 Data-Driven Improvements in FSD
The conversation focuses on the role of data in refining FSD, with Dave sharing his insights on how Tesla's vast amounts of driving data contribute to the system's improvement. They discuss the potential for generalization and the challenges of addressing rare but critical scenarios.
🚗🤖 Reflecting on FSD and Optimus Developments
Dave and James recap the significant progress made in FSD and the potential impact of the upcoming robotaxi reveal. They discuss the broader implications of Tesla's advancements in autonomy and robotics, considering the future trajectory of the company and its products.
📅 Anticipating the Robotaxi Future
The discussion turns to predictions about Tesla's robotaxi service, with speculation on potential timelines and strategies for implementation. Dave and James consider the challenges of scaling up the service and the potential for Tesla to transition from a car manufacturer to a leader in autonomous transportation.
🤖🏭 Optimus: The Path to Production
Dave and James explore the potential timeline for Tesla's Optimus robot, discussing the challenges of industrializing humanoid robots and the importance of data gathering. They consider various methods for training the robots and the potential for real-world deployment.
🌟 The Future of AI and Tesla
In the final part of their conversation, Dave and James reflect on the broader implications of Tesla's AI developments, considering the potential for the company to evolve into a major player in the AI industry. They discuss the impact of open-source models and the future of AI in consumer products.
Keywords
💡Tesla's FSD V12 release
💡Optimus robot
💡Robotaxi reveal
💡AI and machine learning
💡Human mimicry
💡End-to-end learning
💡Perception stack
💡Heuristics
💡Autonomous driving experience
💡Neural networks
💡Strategic path
Highlights
Discussion on Tesla's FSD V12 release and its improvements
James' experiences with FSD V12 during a cross-country trip
Impressions of FSD V12's capability in rural and urban areas
Comparison of FSD V12 to V11 and the changes in planning architecture
Expectations for FSD V12 and its surprisingly polished performance
Discussion on the potential reasons behind FSD V12's success
The role of neural networks in achieving a more natural driving experience
Thoughts on how Tesla might have achieved the polish in FSD V12
The importance of end-to-end training in neural networks
Discussion on the challenges of removing heuristics from the planning stack
The potential for FSD to exceed human driving capabilities
Expectations for future improvements in FSD based on current trends
The significance of the transition from heuristics to neural networks in FSD
The potential impact of FSD V12 on driver intervention and safety
Speculations on the future of Tesla's Autopilot and FSD
Transcripts
hey it's Dave welcome today I'm joined
by James Douma and we've got a whole host
of things to talk about we've got um
Tesla's FSD V12 release that just
happened this past month we've got um
Optimus to talk about um and this robotaxi
reveal in August so anyway it's
been a long time it's been like at least
a half a year was last August or
something like that so yeah yeah I
remember the last time we met we talked
about V12 cuz they did a demo mhm and um
we were quite excited about the
potential but also a little bit cautious
in terms of how it will first roll out
and how capable but um curious just what
has been your first experiences and
first impressions of V12 how long
have you been driving it for uh I got it
a few Sundays back I think I I got it
the first weekend that it really went
right so I think I've had it three weeks
or something like that maybe four three
probably and uh of course drove it out
here to Austin from Los Angeles drove it
quite a bit in Los Angeles on the way
out here so my my wife has this hobby of
like visiting Superchargers we've never
been to so every cross country trip
turns it's ends up being way longer than
otherwise would be but one of the cool
things about that on the FSD checkout tour
is that we end up driving around all
the cities on the way you know because
you're driving around to the different
Chargers and stuff and so you get a
chance to see what it's like in you know
this town or that town or um different
you know highways are different we drive
a lot of rural areas so I got lots of
rural we uh we did like the whole
backcountry tour coming out here through
across Texas and so feel like it was it
was a good experience for like trying to
compress a whole lot of FSD yeah and I
got to say I'm just like really
impressed like it's I was not expecting
it to be this good because it's a really
like this is not a small change to the
planner was yeah with v11 we had gotten
to a point where the perception stack
was good enough that we just weren't
seeing perception failures I mean they
just but people almost all the
complaints people had had to do with
planning not getting in the right lane
not being able to move far enough over
um not knowing when it was its turn uh
stopping in the wrong place creeping the
wrong way these are all planning
elements they're not uh you know so if
you're going to take a planning stack
that you've been working on for years
you've really invested a lot and you
like literally throwing it away like
they're just not retaining any at least
that's what they tell us they got rid of
300K lines they went end to end it's
harder to actually mix heuristics into
end to end so it makes sense that they
actually got rid of almost everything
anything they have in there that's
heuristic now would be built new from
scratch for the end to end stack and yet
they managed to
outdo in what seems to me like a really
short because they weren't just
developing this they were developing the
way to develop it you know they were
having to figure out what would work
there's all of these layers of stuff
that they had to do so my you know my
expectation was that the first version
that we were going to see was going to
be like on par it would have some
improvements it would have a couple of
meaningful regressions and there would
they would be facing some challenges
with you know figuring out how to
address so because it makes sense that
they want to get it out soon and the
sooner they get it out into the fleet
the faster they learn um but the the
degree of polish on this was yeah in a
much higher than I expected and like you
know Bradford stopped by and I got a
chance to see 12.2.1 as he was coming
through we only had about 40 minutes
together I think I it was just like the
spur of the moment thing and uh and yet
even in because he was kind enough to to
take it places that I knew well that I
had driven on 11 a lot and I think it
took me about three blocks to realize
like right away and after 45 minutes I
just knew that this is going to be
completely different and every
everything that I've experienced since
getting it and
I you know what have I got I'm I must be
at like 50 hours in the seat with it
right now a few thousand miles highly
varied stuff yeah it's super solid yeah
yeah I think um yeah I wanted to dive
into kind of how big of a jump this FSD V12
is because when I drove it I was shocked
um because this is not like a is I think
V12 is a little bit of a misnomer
because this is a drastic you know
rewrite of their whole planning
architecture approach different
different um I mean on their perception
it seems like they probably kept a lot
of their neural nets um in terms of the
perception stack added on as well but in
their planning stack this is where they
pretty much it seemed like they're
starting from I would say scratch
completely but they're taking out all of
the guard rails all their heuristics and
they're taking putting on this end-to-end
neural approach where it's deciding
where and how to navigate right the the
perceived environment but I would have
imagined and this is kind of my
expectation also is like you you would
be better in some ways it would be more
natural Etc but then there would be some
just like weird mistakes or things that
it just doesn't get because all of the
guard rails are off the heuristic ones and so
you're just like it'd be more dangerous
in some other ways right and that on
par though Tesla would wait until it
would be a little more safer before
releasing V12 but what we ended up
getting was we got this V12 that just
seems like really polished you know
we're not it's not easy to catch those
big mistakes in V12 and I'm kind of like
where did all these big mistakes go like
you know that was my expectation at
least and so I'm wondering like like
what was your did that catch you off
guard like just seeing the the the small
number you know of of big mistakes or
seeing how polished this V12 is um and
then I also wanted to go into like how
did Tesla do that in terms of um because
once you take off the heuristics as
guardrails you really have to
like like be confident you need I don't
know like yeah I'm curious to hear
what's your take on how you think they
achieved this with V12 you know the the
the the polish they have well first yeah
it
was well there's two components of like
starting out experience there's like my
sort of abstract understanding of the
system and what I sort of rationally
expected and then there's you know
there's my gut you know because I've got
I've got like 200,000 miles on various
layers of autopilot including you know
maybe I don't know 50,000 miles on FSD
so I have this muscle memory and this
you know sort of sense of the thing and
I expected that to sort of be dislocated
I mean you know going from 10 to 11 and
was also I mean they added a lot this is
not the first time that they've made
pretty substantive changes it's the
biggest change for sure right but I was
expecting it to feel a little bit weird
and uncomfortable but but sort of
intellectually I was expecting all the
old problems to go away and a new set of
problems to come in because it's a
different product
like because the perception was pretty
polished and and the things that people
were most aware of as failings of the
system were essentially baked into this
heuristic code well of course you take
the heuristic code away all those failings go
away too but what do you get with the
new thing right so and you know so that
did happen like all the old failings
went away like rationally right but it
was weird to sit in the seat
and you know there you know there's this
street you've driven over and over and
over again where there was this
characteristic behavior that it had
which is you know maybe not terrible but
not comfortable maybe or less ideal than
you would or slower annoying whatever
the deal and those are just gone like
all of them not just like one or two
they're just like gone all of them so
that was sort of like it was such a big
disconnect that it was kind of
disquieting the first you know week or
two I mean delightful but also
disquieting because now you're like
uncharted territory you know what demons
are lurking here that I'm not prepared
to
you know after you drive the heuristic thing
for a while you kind of got a sense of the
character of the failures I mean even if
you haven't seen it before you know the
kind of thing that's not going to work
and now but I didn't I didn't really
find those like I haven't really found I
haven't seen something and I was
expecting to see a couple of things that
were kind of worrisome and where it
wasn't clear to me how they were going
to go about addressing them and I
just I really haven't right and so like
in that sense I'm really I'm more
optimistic about it than I expected to
be at this point um how do they do it
yeah okay so let me give context to that
question a bit more because I know it
could be open-ended so I would imagine
that if you go end to end with planning
that um driving is is very high stakes
you have one mistake let's say you go
into the center divider aisle or there's
a there's a concrete wall or you there's
a signpost you drive into or a tree or
something it just seems like you have
one second of mistake or even split
second and your car is you know it's
just catastrophic it could be and with
V1 up until v11 you had these guard
rails of like oh stay in the lane and do
this and all that stuff but with those
guard rails off like V12 could when it's
confused just make a bad move you know
and just go into some you know another
car another lane another you know object
or something but what about it is
preventing it you know without the
guardrails is it just the data of
mimicking humans or is there something
else attached on top of that where
they're actually doing some simulation
or stuff where it's showing like what
happens when you go out of the lane into
the next lane you know into oncoming
traffic or if you do something like is
it is are they you know pumping the the
the the neural nets with lots of
examples of bad things also that could
happen if you know if it doesn't you
know follow a certain path like what's
your take on
that um so that question prompts a
couple of thoughts um so one
thought uh okay first of all preface it
all like I don't know what the nuts and
bolts of how they are tuning the system
they've told us it's end to end right so
that basically constrains the things
that they could be doing but when you
train a system you can you don't have
to train it end to end I mean some
training will be done end to end but you
can break it into blocks and you can
pre-train blocks in certain ways and we
know that they can use simulation we
know that they can curate the data set
um so there're you know what's the mix
of stuff that they're doing is really
hard to predict there are going to be a
bunch of you know uh learned methods for
things that work well that are going to
be really hard to predict externally
just from first principles um this whole
field it's super empirical one thing
that we keep learning about neural
networks even like the language models
we can talk about those some if you want
to cuz that's also super exciting but
the they keep surprising us right like
so you take somebody who knows the field
pretty well and you at one point and
they make predictions about what's going
to be the best way to do this and
whatnot and aside from some really basic
things I mean there's some things are
just kind of prohibited by basic
information theory right but when you
start getting into the nuance of oh will
this way of tweaking the system work
better than that way or if I scale it if
I make this part bigger and that part
smaller will that be a win or a loss you
know there's so many small decisions and
the training is like that too like how
do you curate the data set like what in
particular matters what makes data good
like that's a surprisingly subtle thing
we know that good data like some
training sets get you to a good result
much faster than other training sets do
and we have theories about what makes
one good and what makes one bad and
people on some kinds of things like text
databases a lot of work has been done
trying to figure this out and we have
some ideas but at the end of the day this
is super empirical and we don't really
have good theory behind it so for me to
kind of sit here not having seen what
they have going on in the back room and
guess I'm just guessing so just like
frankly like I have ideas about what
they could be
doing um but you know I would expect
them to have many clever things that
never would have occurred to me yeah
that they've discovered are important
and they may be doubling down and we we
actually don't know the fundamental
mechanism of like how they're going
about doing the mimicry like what degree
of we you know we know that the you know
they have told us that the final thing
is photons in controls out as end to end
would be right
but uh so the the final architecture but
like how you get to the result of the
behavior that you want you're going to
break the system down
like I don't know it's it's just like
there are many possibilities that are
credible picking them and they vary a
lot and picking the one that's going to
be the best like that's a hard thing to
do sitting in a chair not knowing um
they are doing it really clearly and
they're getting it to work like the
reason why I I it fascinates me on the
on what type of like um uh kind of
catastrophic scenarios or dangerous
things that they may be putting in like
it it the reason why it fascinates me is
because with driving part of the driving
intelligence is knowing that if your car
is like one foot into this lane and it's
oncoming traffic that that's really
really bad like you know be a huge
accident versus if there's um no cars or
something then it's okay or if there's
or just it the driving intelligence just
requires an awareness of how serious
mistakes are in different situations in
some situations they're really really
bad in some situations the same driving
maneuver is not that dangerous and so it
just seems to me like there has to be
some way to train that right to teach
the the neural nets that so there's an
interesting thing about the driving
system that we have and
people okay first so the failure you're
describing is much more likely with
heuristics like heuristics you build
this logical framework a set of rules
right where um you know when heuristic
frameworks break they break big like
they because you can get something
logically wrong and there's this gaping
hole this scenario that you didn't
imagine where the system does exactly
the opposite of what you intended
because you have some logical flaw in
the reasoning that got you to there
right so you know bugs that crash the
computer that take it out like we you
know computers generally don't fail
gracefully heuristic computers right
neural networks do tend to fail
gracefully so that's one thing right
they they they're less likely to crash
and they're more likely to give you a
slightly wrong answer or a you know to
get almost everything right and have one