FSD v12: Tesla's Autonomous Driving Game-Changer w/ James Douma (Ep. 757)

Dave Lee
12 Apr 2024 · 115:33

Summary

TLDR In this conversation, Dave and James discuss Tesla's FSD V12 release, the Optimus robot, and the robotaxi set to be revealed in August. They share their first experiences and impressions of FSD V12 and dig into how Tesla pulled it off. They also explore Tesla's future as an AI company and the prospects for robotics across different fields.

Takeaways

  • 🚗 Tesla's FSD V12 release brings a marked improvement to the driving experience.
  • 🌟 V12 is a major rewrite of the planning system and a big step up from V11.
  • 🧠 Tesla has reportedly removed some 300,000 lines of heuristic code in favor of an end-to-end neural network.
  • 🛣️ In real-world driving, users find FSD V12 performs well across many scenarios, with very few major mistakes.
  • 🏙️ Tesla's autonomous driving adapts well to both city and rural roads.
  • 🚀 The technology is advancing quickly and may eventually exceed the skill of human drivers.
  • 🤖 Development of Tesla's Optimus robot is progressing steadily, though it is not yet in volume production.
  • 📈 AI is becoming Tesla's core advantage and may change its image as a carmaker.
  • 🌐 Tesla may test a driverless robotaxi service in select cities and expand gradually from there.
  • 🔄 Through continual learning and iteration, FSD is expected to reach higher levels of autonomy within a few years.
  • 🌐 Tesla's global expansion plans may be affected by local regulations and market acceptance.

Q & A

  • When was Tesla's FSD V12 released?

    -FSD V12 began rolling out widely in March 2024.

  • What does James find worth looking forward to in Optimus?

    -As Tesla's robot project, Optimus is notable for its autonomous navigation, its potential to carry out complex tasks, and its possible applications across many settings in the future.

  • What are the notable improvements in FSD V12?

    -V12 is a ground-up rewrite of the planner. It adapts better to varied roads and traffic conditions, relies far less on the old heuristic rules, and improves the overall driving experience and safety.

  • What were James's direct impressions of using FSD V12?

    -James came away deeply impressed. He found the new planner a marked improvement over earlier versions, making reasonable decisions even in situations it had never encountered before.

  • How does Tesla's FSD adapt to complex traffic situations?

    -FSD uses end-to-end deep learning to mimic the behavior and decision-making of human drivers, trained on large amounts of real driving data, which improves its ability to adapt to and handle complex traffic.
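
    To make that training idea concrete, here is a minimal behavior-cloning sketch, assuming a PyTorch-style setup; the model, tensor shapes, and names are illustrative placeholders, not Tesla's actual code. The network sees camera input and is scored purely by how close its predicted controls are to what the human driver did:

```python
import torch
import torch.nn as nn

class TinyDrivingPolicy(nn.Module):
    """Toy stand-in for an end-to-end policy: frames in, controls out."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(16, 2)  # [steering, accel/brake]

    def forward(self, frames):
        return self.head(self.encoder(frames))

policy = TinyDrivingPolicy()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()  # "how close are you to what the human did"

# One training step on a dummy batch: 8 clips' frames and the human's controls.
frames = torch.randn(8, 3, 128, 128)   # stand-in camera input
human_controls = torch.randn(8, 2)     # recorded human actions
pred = policy(frames)
loss = loss_fn(pred, human_controls)   # imitation score (lower = closer)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

    In plain imitation learning the score is only closeness to the human; as James notes in the conversation, it is not "did you crash".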

  • What does James expect from FSD V12 going forward?

    -James expects V12 to be further optimized and improved: as more data accumulates and the algorithms iterate, its performance could come to match or even exceed that of a good human driver.

  • When is more information about the Optimus robot expected?

    -According to Tesla's plans, more information about Optimus may be revealed sometime in August 2024.

  • What problems did FSD mainly solve in V11?

    -In V11, Tesla's FSD work largely solved the perception stack: the car's ability to recognize and understand its surroundings improved to the point that perception failures became rare.

  • How does FSD V12 address safety?

    -V12 relies on its new planner architecture for safety. Although the old heuristic guardrails were removed, extensive training on simulation and real driving data lets the new planner make sensible decisions while staying safe.

  • What is the future direction of Tesla's FSD?

    -FSD will keep moving toward full autonomy, raising the car's ability to drive itself in complex environments, with continued investment in data and algorithm optimization for greater safety and a smoother driving experience.
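
    Later in the conversation James sketches a "layer cake" of training stages: sample-efficient supervised pretraining for perception, a broader supervised stage imitating human drivers, and possibly a small reinforcement-learning pass at the very end. A hedged sketch of that ordering, where every class and function name is an illustrative placeholder rather than Tesla's pipeline:

```python
class StubDrivingModel:
    """Toy stand-in; each method just records that a stage ran."""
    def __init__(self):
        self.stages = []
    def fit_perception(self, clip, labels):   # supervised; most sample-efficient
        self.stages.append("perception")
    def fit_controls(self, clip, controls):   # imitation: match the human driver
        self.stages.append("imitation")
    def reinforce(self, rollout, reward):     # RL: least sample-efficient, used last
        self.stages.append("rl")

def train_layer_cake(model, labeled_clips, human_drives, scored_rollouts):
    for clip, labels in labeled_clips:        # 1. pretrain perception (auto-labeled data)
        model.fit_perception(clip, labels)
    for clip, controls in human_drives:       # 2. imitate human drivers (broad, supervised)
        model.fit_controls(clip, controls)
    for rollout, reward in scored_rollouts:   # 3. small RL polish for specific behaviors
        model.reinforce(rollout, reward)
    return model
```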

Outlines

00:00

🚗 Tesla FSD V12 release and the Optimus discussion

This segment covers the release of Tesla's FSD V12 and, in conversation with James, the Optimus robot. It touches on the V12 driving experience, first impressions of the new system, and the robotaxi reveal coming in August.

05:01

🤖 Expectations and outlook for Tesla's Optimus robot

This segment continues the discussion of expectations for Optimus, along with Tesla's transition toward AI and its future prospects. It mentions Tesla's potential software and hardware innovations and Optimus's potential in real-world applications.

10:04

🧠 Progress in AI and robotics

This segment discusses progress in AI and robotics, particularly natural-language models and robots' ability to learn human behavior. It covers how simulation and learning improve robot performance and how large amounts of data are used to train neural networks.

15:04

🚗 Real-world driving experience with FSD V12

This segment shares real driving experience with FSD V12, including impressions on city and rural roads. It discusses V12's improvements in planning and perception relative to V11 and an optimistic outlook on where V12 is headed.

20:05

🤔 Thinking through how FSD V12 improved

This segment digs into how V12 achieved such a marked improvement, including dropping the old heuristic code and adopting an end-to-end learning approach. It discusses how Tesla trains the system with large amounts of data and simulation, and how it learns from human drivers' behavior.

25:07

🚗 FSD V12's safety and challenges

This segment discusses V12's safety challenges, including how to stay safe after removing the old protective guardrails. It explores how Tesla may use simulation and data to anticipate and handle potentially dangerous situations, and how the system makes decisions across different driving scenarios.

30:09

📈 FSD V12's performance and outlook

This segment sums up V12's performance and looks ahead at Tesla's development. It notes V12's success at mimicking human drivers, how the system improves by learning from human behavior, and how Tesla might use reinforcement learning and other techniques to push the system further.

35:11

🤖 The potential of Tesla's Optimus robot

This segment discusses the potential of Optimus and how it can learn and carry out tasks by imitating human behavior. It covers the robot's ability to handle complex and unpredictable situations and how Tesla can use large amounts of data to improve its performance.

40:12

🌟 Tesla's transition into an AI company

This segment discusses Tesla's transition from carmaker to AI company: its focus on software and its ecosystem, how FSD and the Optimus robot are becoming keys to the company's future, and how the market and investors view the shift.

45:12

📚 The development and application of AI models

This segment discusses the development of AI models, particularly the progress and application of large language models (LLMs). It mentions achievements by various companies and the open-source community, these models' performance on specific tasks, how wrapping models can improve performance, and the future of models on small devices.

50:13

🎥 Progress in text-to-video technology

This segment discusses progress and demos in text-to-video technology such as Stable Diffusion: its current stage of development, likely directions for improvement, and the cost and accessibility questions around making it more affordable.

55:13

🌍 Eclipse viewing and personal experiences

This segment shares a personal experience of watching the solar eclipse and anticipation for the next one. It discusses what makes eclipse viewing unique and personally affecting, how to prepare and choose a good viewing spot, and plans for future eclipse trips.


Keywords

💡Tesla FSD V12 release

The release of Tesla's FSD (Full Self-Driving) V12 is one of the video's main topics. FSD is Tesla's driving-automation system, and V12 is its twelfth major version. The video notes that V12 brings significant improvements and new capabilities, such as better planning and less need for human intervention, marking real progress in Tesla's autonomous-driving effort.

💡Optimus

Optimus is the humanoid robot Tesla plans to bring to market. The speakers discuss their expectations for Optimus and predictions about its future capabilities. Optimus is seen as Tesla's major push into robotics and is expected to perform a wide variety of tasks in both home and industrial settings.

💡Autonomous driving

Autonomous driving refers to technology that lets a vehicle navigate and drive itself without direct human operation. The video discusses the autonomy in FSD V12, emphasizing its improvements in planning and perception and how large amounts of data plus machine-learning algorithms raise the vehicle's self-driving capability.

💡Neural network

A neural network is a computational model loosely inspired by how the brain works, used to recognize patterns and process complex data. In the video, neural networks describe the perception and planning systems in FSD V12, which learn from large amounts of driving data to improve the car's understanding of its environment and its decision-making.

💡End-to-end learning

End-to-end learning is a machine-learning approach in which a system maps directly from input data to output, without hand-designed features or intermediate processing stages. In the video it describes FSD V12's planning stack: the system learns to drive directly from raw perception ("photons in, controls out") rather than relying on predefined rules or heuristics.
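
A toy contrast of the two architectures, as a hypothetical, heavily simplified sketch (these functions are stand-ins, not Tesla's interfaces): a modular stack where a hand-written planner consumes perception outputs, versus an end-to-end network mapping camera frames straight to controls:

```python
def perception_net(frames):
    # learned perception: frames -> detected objects (toy output)
    return [{"kind": "car", "dist_m": 12.0}]

def heuristic_planner(objects):
    # hand-coded rule of the sort V12 reportedly replaced
    lead = min((o["dist_m"] for o in objects if o["kind"] == "car"), default=1e9)
    return {"steer": 0.0, "accel": -1.0 if lead < 15.0 else 0.5}

def driving_net(frames):
    # stand-in for one learned map: frames -> controls directly
    return {"steer": 0.02, "accel": 0.3}

def modular_stack(frames):      # pre-V12 style: perceive, then plan by rules
    return heuristic_planner(perception_net(frames))

def end_to_end_stack(frames):   # V12 style: one learned mapping
    return driving_net(frames)
```

The practical difference the speakers stress is that rule stacks like heuristic_planner fail through logic gaps, while the single learned map tends to fail more gracefully.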

💡Robotaxi

A robotaxi is an autonomous vehicle used to provide taxi service. The video notes that Tesla may develop a robotaxi service in the future, which would change how people get around, improve transport efficiency, and potentially disrupt the traditional taxi industry.

💡Perception stack

The perception stack is the set of technologies an autonomous system uses to recognize and understand its surroundings: sensors such as cameras (and, in some systems, radar or lidar) together with the algorithms that process their data. The video notes that Tesla's perception stack had matured enough to recognize and handle a wide range of road situations accurately, a prerequisite for higher levels of autonomy.

💡Heuristic programming

Heuristic programming builds problem-solving rules and algorithms by hand, from experience and intuition. In autonomous driving, heuristics were long used to write the rules governing vehicle behavior. The video notes that FSD V12 abandons most of this hand-written rule code in favor of a data-driven, end-to-end learned approach, improving the system's adaptability and decision-making.
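
A toy illustration, with invented numbers and names, of the failure mode described in the transcript, where hand-written heuristic frameworks "break big": a rule hit by an unanticipated case does exactly the wrong thing instead of degrading gracefully:

```python
def creep_speed(gap_m):
    """Intended rule: creep slowly (0.5 m/s) when the gap ahead is small."""
    if gap_m < 2.0:
        return 0.5
    return 2.0

# Unanticipated input: a negative gap (e.g., overlapping bounding boxes)
# still falls into the "< 2.0" branch, so the rule keeps the car creeping
# into the obstacle instead of stopping.
print(creep_speed(-0.3))  # 0.5 -- the rule fires, but its premise was wrong
```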

💡Autonomous-driving challenges

These are the technical and non-technical problems encountered while developing and deploying autonomous-driving technology. The video notes that despite V12's marked progress, challenges remain in handling complex traffic scenes and unusual situations, and in assuring the system's safety.

💡Data-driven

Data-driven means guiding decisions and improvements through collecting, analyzing, and applying large amounts of data. In autonomous driving, this means training and optimizing neural networks on data from real driving scenarios to raise the vehicle's capability. The video emphasizes how FSD V12 uses this data-driven approach to achieve better planning and perception.

Highlights

The release and discussion of Tesla's FSD V12 marks a significant step forward for autonomous driving.

The Optimus discussion signals that Tesla may have larger ambitions in AI and robotics.

Tesla's autonomy stack has been tested across many city and rural environments, showing strong adaptability.

V12's planning stack was substantially reworked, improving driving efficiency and safety.

The system performs well even with the old guardrails (heuristic safety rules) removed, a sign of the technology's maturity.

By mimicking human drivers, the system adapts and decides better in complex traffic environments.

The system shows greater intelligence in handling emergencies and avoiding potential hazards.

Training on large amounts of data and simulation improves the system's performance across driving scenarios.

When mimicking human drivers, the system picks up the safety margins people keep in different situations.

Through neural-network learning and optimization, the system handles traffic flow and vehicle navigation better.

The system shows better adaptability and decision-making in congestion and vehicle queues.

Continuous iteration and updates improve the system's stability and reliability across driving environments and conditions.

When mimicking human drivers, the system also picks up context-appropriate courtesy and reasonableness.

Training on large amounts of data and simulation improves the system's handling of traffic signals and signs.

The system shows greater intelligence and adaptability with complex traffic situations and sudden events.

Transcripts

00:00

hey it's Dave welcome today I'm joined

00:02

by James Douma and we've got a whole host

00:04

of things to talk about we've got um

00:07

Tesla's FSD V12 release that just

00:09

happened this past month we've got um

00:12

Optimus to talk about um and this robot

00:15

taxi reveal in August so anyway it's

00:19

been a long time it's been like at least

00:21

a half a year was last August or

00:23

something like that so yeah yeah I

00:24

remember the last time we met we talked

00:26

about V12 cuz they did a demo mhm and um

00:30

we were quite excited about the

00:32

potential but also a little bit cautious

00:35

in terms of how it will first roll out

00:38

and how capable but um curious just what

00:41

has been your first experiences and

00:43

first impressions of you talk how long

00:45

have you been driving it for uh I got it

00:47

a few Sundays back I think I I got it

00:50

the first weekend that it really went

00:52

right so I think I've had it three weeks

00:53

or something like that maybe four three

00:56

probably and uh of course drove it out

00:59

here to Austin from Los Angeles drove it

01:02

quite a bit in Los Angeles on the way

01:04

out here so my my wife has this hobby of

01:07

like visiting supercharges we've never

01:09

been to so every cross country trip

01:11

turns it's ends up being way longer than

01:13

otherwise would be but one of the cool

01:15

things about that on the FSD checkout to

01:16

her is that we end up driving around all

01:19

the cities on the way you know because

01:20

you're driving around to the different

01:21

Chargers and stuff and so you get a

01:23

chance to see what it's like in you know

01:26

this town or that town or um different

01:28

you know highways are different we drive

01:30

a lot of rural areas so I got lots of

01:32

rural we uh we did like the whole back

01:34

Country tour coming out here through

01:36

across Texas and so feel like it was it

01:39

was a good experience for like trying to

01:40

compress a whole lot of FSD yeah and I

01:43

got to say I'm just like really

01:45

impressed like it's I was not expecting

01:47

it to be this good because it's a really

01:50

like this is not a small change to the

01:53

planner was yeah with v11 we had gotten

01:58

to a point where the perception stack

02:00

was good enough that we just weren't

02:02

seeing perception failures I mean they

02:04

just but people almost all the

02:06

complaints people had had to do with

02:07

planning not getting in the right lane

02:09

not being able to move far enough over

02:11

um not knowing when it was its turn uh

02:13

stopping in the wrong place creeping the

02:15

wrong way these are all planning

02:16

elements they're not uh you know so if

02:21

you're going to take a planning stack

02:23

that you've been working on for years

02:25

you've really invested a lot and you

02:26

like literally throwing it away like

02:29

there just not retaining any at least

02:30

that's what they tell us they got rid of

02:32

300K lines they went end to end it's

02:34

harder to actually mix heuristics into

02:37

end to end so it makes sense that they

02:38

actually got rid of almost everything

02:40

anything they have in there that's

02:42

heuristic now would be built new from

02:43

scratch for the end to end stack and yet

02:46

they managed to

02:48

outdo in what seems to me like a really

02:50

short because they weren't just

02:51

developing this they were developing the

02:53

way to develop it you know they were

02:56

having to figure out what would work

02:57

there's all of these layers of stuff

02:59

that they had to do so my you know my

03:04

expectation was that the first version

03:06

that we were going to see was going to

03:08

be like on par it would have some

03:10

improvements it would have a couple of

03:12

meaningful regressions and there would

03:14

they would be facing some challenges

03:15

with you know figuring out how to

03:17

address so because it makes sense that

03:19

they want to get it out soon and the

03:21

sooner they get it out into the fleet

03:23

the faster they learn um but the the

03:26

degree of polish on this was yeah in a

03:29

much higher than I expected and like you

03:33

know Bradford stopped by and I got a

03:36

chance to see 12.2.1 as he was coming

03:39

through we only had about 40 minutes

03:42

together I think I it was just like the

03:44

spur of the moment thing and uh and yet

03:47

even in because he was kind enough to to

03:49

take it places that I knew well that I

03:52

had driven on 11 a lot and I think it

03:55

took me about three blocks to realize

03:58

like right away and after 45 minutes I

04:00

just knew that this is going to be

04:02

completely different and every

04:04

everything that I've experienced since

04:06

getting it and

04:09

I you know what have I got I'm I must be

04:12

at like 50 hours in the seat with it

04:14

right now a few thousand miles highly

04:17

varied stuff yeah it's super solid yeah

04:20

yeah I think um yeah I wanted to dive

04:22

into kind of how big of a jump this FSD V12

04:25

is because when I drove it I was shocked

04:28

um because this is not like a is I think

04:33

V12 is a little bit of a misnomer

04:35

because this is a drastic you know

04:38

rewrite of their whole planning

04:40

architecture approach different

04:43

different um I mean on their perception

04:46

it seems like they probably kept a lot

04:48

of their neural nets um in terms of the

04:51

perception stack added on as well but in

04:54

their planning stack this is where they

04:57

pretty much it seemed like they're

04:59

starting from I would say scratch

05:01

completely but they're taking out all of

05:03

the guard rails all their heuristics and

05:06

they're putting on this end-to-end

05:09

neural approach where it's deciding

05:11

where and how to navigate right the the

05:14

perceived environment but I would have

05:16

imagined and this is kind of my

05:18

expectation also is like you you would

05:21

be better in some ways it would be more

05:22

natural Etc but then there would be some

05:25

just like weird mistakes or things that

05:28

it just doesn't get because all of the

05:30

guard rails are off the heuristic ones and so

05:33

you're just like it's more dangerous

05:34

than some other ways right and that on

05:37

par though Tesla would wait until it

05:39

would be a little more safer before

05:41

releasing V12 but what we ended up

05:43

getting was we got this V12 that just

05:45

seems like really polished you know

05:48

we're not it's not easy to catch those

05:50

big mistakes in V12 and I'm kind of like

05:53

where did all these big mistakes go like

05:55

you know that was my expectation at

05:56

least and so I'm wondering like like

05:59

what was your did that catch you off

06:00

guard like just seeing the the the small

06:03

number you know of of big mistakes or

06:06

seeing how polish this V12 is um and

06:09

then I also wanted to go into like how

06:11

did Tesla do that in terms of um because

06:14

once you take off the heuristic

06:16

guardrails you really have to

06:20

like like be confident you need I don't

06:23

know like yeah I'm curious to hear

06:25

what's your take on how you think they

06:27

achieve this with V12 you know the the

06:29

the the Polish they have well first yeah

06:32

it

06:32

was well there's two components of like

06:36

starting out experience there's like my

06:39

sort of abstract understanding of the

06:42

system and what I sort of rationally

06:44

expected and then there's you know

06:46

there's my gut you know because I've got

06:48

I've got like 200,000 miles on various

06:51

layers of autopilot including you know

06:53

maybe I don't know 50,000 miles on FSD

06:56

so I have this muscle memory and this

06:58

you know sort of sense of the thing and

07:02

I expected that to sort of be dislocated

07:05

I mean you know going from 10 to 11 and

07:10

was also I mean they added a lot this is

07:13

not the first time that they've made

07:14

pretty substantive changes it's the

07:15

biggest change for sure right but I was

07:18

expecting it to feel a little bit weird

07:20

and uncomfortable but but sort of

07:23

intellectually I was expecting all the

07:25

old problems to go away and a new set of

07:27

problems to come in because it's a

07:29

different product

07:31

like because the perception was pretty

07:33

polished and and the things that people

07:35

were most aware of is failings of the

07:38

system were essentially baked into this

07:40

heuristic code well of course you take

07:41

the heuristic code away all those failings go

07:43

away too but what do you get with the

07:45

new thing right so and you know so that

07:49

did happen like all the old failings

07:51

went away like rationally right but it

07:53

was weird to sit in the seat

07:55

and you know there you know there's this

07:57

street you've driven over and over and

07:59

over again where there was this

08:01

characteristic behavior that it had

08:02

which is you know maybe not terrible but

08:04

not comfortable maybe or less ideal than

08:07

you would are slower annoying whatever

08:08

the deal and those are just gone like

08:10

all of them not just like one or two

08:11

they're just like gone all of them so

08:13

that was sort of like it was such a big

08:17

disconnect that it was kind of

08:18

disquieting the first you know week or

08:20

two I mean delightful but also

08:22

disquieting because now you're like

08:24

Uncharted Territory you know what demons

08:27

are looking here that I'm not prepared

08:29

to

08:30

you know after you drive the heuristic thing

08:31

for a while you kind of got a sense of the

08:33

character of the failures I mean even if

08:35

you haven't seen it before you know the

08:36

kind of thing that's not going to work

08:38

and now but I didn't I didn't really

08:40

find those like I haven't really found I

08:43

haven't seen something and I was

08:46

expecting to see a couple of things that

08:49

were kind of worrisome and where I

08:51

wasn't clear to me how they were going

08:52

to get go about addressing them and I

08:54

just I really haven't right and so like

08:56

in that sense I'm really I'm more

08:58

optimistic about it than I expected to

09:00

be at this point um how do they do it

09:04

yeah okay so let me give context to that

09:06

question a bit more because I know it

09:07

could be open-ended so I would imagine

09:10

that if you go end to end with planning

09:12

that um driving is is very high stakes

09:16

you have one mistake let's say you go

09:18

into the center divider aisle or there's

09:21

a there's a concrete wall or you there's

09:23

a signpost you drive into or a tree or

09:26

something it just seems like you have

09:28

one second of mistake or even Split

09:30

Second and your car is you know it's

09:33

just catastrophic it could be and with

09:36

V1 up until v11 you had these guard

09:38

rails of like oh stay in the lane and do

09:40

this and all that stuff but with those

09:42

guard rails off like V12 could when it's

09:47

confused just make a bad move you know

09:50

and just go into some you know another

09:54

car another Lane another you know object

09:55

or something but what about it is

09:58

preventing it you know without the

10:01

guardrails is it just the data of

10:03

mimicking humans or is there something

10:06

else attached on top of that where

10:08

they're actually doing some simulation

10:10

or stuff where it's showing like what

10:12

happens when you go out of the lane into

10:14

the next Lane you know into oncoming

10:16

traffic or if you do something like is

10:17

it is are they you know pumping the the

10:22

the the neural nets with lots of

10:24

examples of bad things also that could

10:26

happen if you know if it doesn't you

10:28

know follow a certain path like what's

10:30

your take on

10:31

that um so that question prompts a

10:34

couple of thoughts um so one

10:37

thought are okay first of all preface at

10:40

all like I don't know what the nuts and

10:43

bolts of how they are tuning the system

10:46

they've told us it's end to end right so

10:49

that basically constrains the things

10:51

that they could be doing but when you

10:53

train in a system you can you don't have

10:56

to train it end to end I mean some

10:57

training will be done endend end but you

10:59

can break it into blocks and you can

11:00

pre-train blocks in certain ways and we

11:02

know that they can use simulation we

11:04

know that they can curate the data set

11:06

um so there're you know what's the mix

11:09

of stuff that they're doing is really

11:10

hard to predict they're going to be a

11:12

bunch of you know uh learned methods for

11:16

things that work well that are going to

11:18

be really hard to predict externally

11:20

just from first principles um this whole

11:22

field it's super empirical one thing

11:25

that we keep learning about neural

11:27

networks even like the language models

11:28

we can talk about those some if you want

11:30

to cuz that's also super exciting but

11:32

the they keep surprising us right like

11:35

so you take somebody who knows the field

11:37

pretty well and you at one point and

11:39

they make predictions about what's going

11:41

to be the best way to do this and

11:42

whatnot and aside from some really basic

11:44

things I mean there's some things are

11:45

just kind of prohibited by basic

11:47

information Theory right but when you

11:50

start getting into the Nuance of oh will

11:52

this way of tweaking the system work

11:54

better than that way or if I scale it if

11:57

I make this part bigger and that part

11:58

smaller will that be a win or a lot you

12:00

know there's so many small decisions and

12:03

the training is like that too like how

12:05

do you curate the data set like what in

12:07

particular matters what makes data good

12:10

like that's a surprisingly subtle thing

12:12

we know that good data like some

12:14

training sets get you to a good result

12:16

much faster than other training sets do

12:18

and we have theories about what makes

12:20

one good and what makes one bad and

12:22

people on some kinds of things like text

12:24

databases a lot of work has been done

12:26

trying to figure this out and we have

12:27

some ideas but at the end the day this

12:29

is super empirical and we don't really

12:31

have good theory behind it so for me to

12:34

kind of sit here not having seen what

12:35

they have going on in the back room and

12:36

guess I'm just guessing so just like

12:38

frankly like I have ideas about what

12:41

they could be

12:41

doing um but you know I would expect

12:45

them to have many clever things that

12:47

never would have occurred to me yeah

12:49

that they've discovered are important

12:51

and they may be doubling down and we we

12:52

actually don't know the fundamental

12:54

mechanism of like how they're going

12:55

about doing the mimicry like what degree

12:58

of we you know we know that the you know

13:00

they have told us that the final thing

13:02

is photons in controls out as end to end

13:04

would be right

13:07

but uh so the the final architecture but

13:09

like how you get to the result of the

13:12

behavior that you want you're going to

13:13

break the system down

13:15

like I don't know it's it's just like

13:17

there are many possibilities that are

13:20

credible picking them and they vary a

13:22

lot and picking the one that's going to

13:24

be the best like that's a hard thing to

13:25

do sitting in a chair not knowing um

13:29

they are doing it really clearly and

13:31

they're getting it to work like the

13:32

reason why I I it fascinates me on the

13:36

on what type of like um uh kind of

13:41

catastrophic scenarios or dangerous

13:44

things that there may be putting in like

13:46

it it the reason why it fascinates me is

13:48

because with driving part of the driving

13:50

intelligence is knowing that if your car

13:54

is like one foot into this Lane and it's

13:56

oncoming traffic that that's really

13:59

really bad like you know be a huge

14:01

accident versus if there's um no cars or

14:05

something then it's okay or if there's

14:08

or just it the driving intelligence just

14:10

requires an awareness of how serious

14:13

mistakes are in different situations in

14:16

some situations they're really really

14:18

bad in some situations the same driving

14:20

maneuver is not that dangerous and so it

14:23

just seems to me like there have to be

14:25

some way to train that right to teach

14:27

the the neural nets that so there's an

14:29

interesting thing about the driving

14:30

system that we have and

14:32

people okay first so the failure you're

14:36

describing is much more likely with

14:37

heuristics like heuristics you build

14:39

this logical framework a set of rules

14:42

right where um you know when heuristic

14:45

Frameworks break they break big like

14:48

they because you can get something

14:50

logically wrong and there's this gaping

14:52

hole this scenario that you didn't

14:53

imagine where the system does exactly

14:55

the opposite of what you intended

14:57

because you have some logical flaw in

14:58

the reasoning that got you to there

15:00

right so you know bugs that crash the

15:03

computer that take it out like we you

15:06

know computers generally don't fail

15:07

gracefully heuristic computers right

15:10

neural networks do tend to fail

15:12

gracefully so that's one thing right

15:13

they they they're less likely to crash

15:16

and they're more likely to give you a

15:18

slightly wrong answer or a you know to

15:21

get almost everything right and have one

15:22

thing be kind of wrong like that's a

15:24

more kind of characteristic thing so

15:27

neural networks

15:29

you know the way that they're going to

15:31

fail is going to be a little bit

15:32

different than heuristic code and

15:33

they're just by their nature they're

15:35

going to be somewhat less apt to that

15:37

kind of failure not that it's impossible

15:39

just that it's not going to be the

15:41

default automatic thing you know if you

15:43

get an if statement wrong in a piece of

15:45

code or something you you know

15:46

catastrophic failures are kind of the

15:48

norm in logical chains so um then

15:52

there's this other thing which is the

15:54

the system that we have is for is it's

15:57

Evol co-evolved with drivers you know

16:00

you uh you know you you learn you

16:03

develop reflexes you read the traffic

16:06

you read the

16:07

environment um you know when the lane

16:09

gets narrow people slow down people sort

16:12

of have a set of reflexes that adapt to

16:16

an environment to try to maximize the

16:19

safety margin they have for what they're

16:20

doing you're when you're driving down a

16:22

row of parked cars if you have space you

16:24

move over to give your safe a little

16:26

more space um you know if you're coming

16:29

up on an intersection and you can't see

16:30

what's coming you may slow down you may

16:32

move over to give yourself more space to

16:33

see what like all of these unconscious

16:36

behaviors right and the road system has

16:39

been developed over a lot of years to

16:41

like take advantage of the strengths of

16:43

people and and minimize the weaknesses

16:46

of people right I mean the way this the

16:48

amount of space that we provide on roads

16:50

and the way that we shape our

16:51

intersection sight lines that kind of

16:53

stuff the rationale for how our our

16:56

traffic controls work and all that kind

16:57

of stuff is

16:59

uh it's evolved to the strengths and

17:01

weaknesses of human beings right so

17:04

human beings are constantly trying to

17:05

within certain margins maximize their

17:07

safety margin give themselves make

17:09

themselves more comfortable that they

17:10

understand what what's going on right so

17:14

and now we have a system that's

17:15

mimicking people right so like there are

17:18

funny things that the that the that the

17:20

car will do that that just really is

17:22

kind of underscore this like you know

17:24

you're in a line of cars and that they

17:26

suddenly slow down and you have a truck

17:28

in front of you so one like one of the

17:29

most natural things is people will pull

17:31

over a little to if they can't see to

17:33

see what's happening up there to help

17:35

them prepare for what might be happening

17:37

to give them more situational awareness

17:39

well you see the cars do this sometimes

17:41

the funny thing about the car is the car

17:43

the the car it like its camera is in

17:45

the center so moving a little to the

17:47

left doesn't let the car see around the

17:48

car ahead of it right it still can't see

17:51

but it still mimics that action so

17:52

similarly coming up to an intersection

17:55

slowing down moving over you know

17:58

preparing yourself so essentially

18:01

there's this interesting characteristic

18:03

that you're going to get out of that is

18:05

it is it the is that the planning system

18:08

is going to mimic the margin you know

18:11

that do the little Preparatory things

18:13

that give you a little more margin a

18:14

little more situational awareness and

18:17

help you prepare give you a little more

18:18

time to react in case something happens

18:21

it's mimicking all those things now so

18:24

uh instead of the heuristics having to

18:26

kind of be perfect instead what the

18:27

system is doing is it's learning to

18:29

mimic PE you know drivers who already

18:32

have all these reflexes and and and

18:34

behaviors in a really complicated

18:36

contextual environment so it's not like

18:39

we're not talking about four or five

18:40

behaviors you know we're talking about

18:42

four or five thousand behaviors the kind

18:43

of things that people were as drivers

18:46

were not even aware that we're doing

18:47

them and the car is mimicking that right

18:50

in the thing and so so so they're going

18:53

to fail more gracefully and they're

18:54

mimicking drivers who are you know who

18:58

are cautious in situations where they

19:00

need to be cautious and they're you know

19:02

they're they're making small adjustments

19:04

to give themselves more margin all the

19:05

time and I think we may have under

19:08

appreciated the degree to which you know

19:12

human drivers with a lot of experience

19:14

have

19:15

reflexively you know developed a lot of

19:18

behaviors that are actually because

19:19

we're talking about Good drivers here

19:21

right uh they've they've unconsciously

19:24

developed a lot of habits that actually

19:26

have a an appreciable impact on their

19:29

safety and and the system is now getting

19:32

those for free kind of because it's

19:33

mimicking drivers right even all the

19:35

little Nuance things that we that kind

19:37

of don't make sense like I said like

19:39

pulling over to see what's ahead of the

19:40

car uh ahead of you or we see the like

19:44

the the behavior where that the very

19:46

Charming Behavior where you know it

19:48

doesn't block the Box you come to an

19:50

intersection and if it's not clear that

19:51

it can get across it stops right like

19:53

nobody had to program that and if you

19:55

look at intersections like when to do

19:57

that and when to not do that that's kind

19:58

of subtle right like is the car ahead of

20:01

you going to move forward Enough by the

20:03

time you cross intersection or is it not

20:05

and if you look at the flow of traffic

20:06

like as a human you're like better than

20:09

even odds there will be space when I

20:11

cross or no I should definitely stop

20:12

here because I don't want to be caught

20:14

in the intersection the cars mimic all

20:15

that yeah even in really complicated

20:18

context I mean I would say I mean

20:21

mimicking it it seems like it goes even

20:23

a little beyond the mimicking at times I

20:26

think this is like the uncharted territory

20:29

which V12 surprises me is it mimics with

20:33

some level of understanding sometimes

20:35

like why it because for example you're

20:38

going you don't know whether to to go

20:41

into the intersection or not or let's

20:42

say you're you're making a

20:45

left turn and pedestrians are here

20:47

every situation is a little bit

20:49

different and so just because in your

20:51

data you have a bunch of examples it's

20:54

like there it might not be the perfect

20:56

like you might not be able to mimic

20:58

perfectly because it's a new situation

21:00

so you've got to infer kind of in this

21:02

new situation what should I do and

21:05

that's where I think it's not just

21:07

mimicry it and it could be just mimicry

21:10

right now but the the the big I guess

21:13

jump in in ability is is is UN is it's

21:17

kind of like llms you know like they

21:19

they can understand to a certain extent

21:22

what you're asking for in a new

21:24

situation or a new you know dialogue I

21:27

think the word you're looking for is

21:28

General
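
(James later describes how this generalization gets checked in training: show the network a clip it has never seen and score it only by how close its output is to what the human did. A minimal sketch of that held-out evaluation, assuming a PyTorch-style policy; all names are illustrative, not Tesla's.)

```python
import torch

def evaluate_on_held_out(policy, held_out_clips):
    """Mean closeness-to-human over never-before-seen clips."""
    policy.eval()
    errors = []
    with torch.no_grad():
        for frames, human_controls in held_out_clips:
            pred = policy(frames)
            errors.append(torch.mean((pred - human_controls) ** 2).item())
    return sum(errors) / len(errors)  # lower = closer to human behavior
```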

21:29

yeah yeah yeah maybe generalize like

21:31

taking that the specific mimicry

21:33

situations that the data provides and

21:35

generalizing those but there's a certain

21:37

level in order to generalize that you do

21:40

need um capability Beyond just mimicry

21:43

right some level of of maybe application

21:47

or so mimicry I mean we talk about

21:50

mimicry mimicry is the training goal

21:53

right do what a human would do in this

21:55

situation that's why we call it mimicry

21:57

right but

21:59

the system it doesn't have the capacity

22:01

to record every single possibility right

22:04

and so it's frequently going to see a

22:06

situation that's kind of a combination

22:08

of situations it's seen before it's not

22:10

a duplicate of any of them and it h and

22:12

you have to kind of figure out how to

22:14

combine what you learned in these other

22:16

situations that were similar but come up

22:18

with something that's different and yet

22:19

somehow it follows the same rules so a

22:22

way you could think about it is that the

22:24

using the block the Box Thing depending

22:26

on how many lanes of traffic there are

22:28

and how aggressive the drivers are and

22:30

what the weather is like what the cross

22:32

traffic is like you know just all of

22:34

these variables you you you as a human

22:37

you come up to the intersection you have

22:38

to make the decision whether you're

22:39

going to cross and maybe get stuck or

22:41

whether you're going to you're going to

22:42

pause and wait for the other car to move

22:44

up you know I saw one i' I've seen one

22:46

where where I had the the block box and

22:49

you could see the light at the end of

22:50

the row of cars right and like this is

22:53

the thing humans do when this light

22:55

turns red you know you have plenty of

22:56

time to cross because it's not going to

22:58

turn GRE you're not going to get stuck

22:59

and you see the next light up there turn

23:00

green well even if you get stuck in the

23:02

Box it doesn't matter I was been in that

23:04

situation twice now and the car moved

23:06

into the intersection even though it

23:07

would block it because it's confident

23:09

that the row of cars well who coded

23:11

nobody coded that right there's now as a

23:15

human I'm describing this thing well

23:16

here's a rule I just made up if this

23:18

light has just turned red you know there

23:20

will be no cross traffic and the light

23:22

ahead turns green while the car is ahead

23:23

they're definitely going to move forward

23:25

almost certainly right unless there's a

23:27

broken down car or something like that

23:28

and so you see humans do this they move

23:30

up because they know they're going to be

23:32

able to and they want to all they want

23:34

to take they want to reserve that space

23:36

behind that car for themselves you know

23:37

to get their priority for crossing the

23:40

intersection so they move forward I see

23:42

the car mimic this Behavior right only

23:45

where it's really

23:47

appropriate so in a sense what I when I

23:50

described that to you what I did was I

23:52

looked at the situation and I figured

23:54

out what the rules were oh this light

23:55

changed that light changed now I have

23:58

time right yeah but when I've done that

24:00

in the past I didn't think about the

24:01

rules consciously I you know I'm not

24:03

checking off this list of things I see

24:05

the conditions where it's safe for me to

24:07

move forward I'm unlikely to block

24:08

anyone and I do it right so a way that

24:11

you can think about what the system is

24:12

doing is it's we're training it to mimic

24:15

it but it has to compress that somehow

24:18

to save that into a set of rules that is

24:20

more General so what the you can think

24:22

of what the system is trying to do is

24:23

trying to figure out what the rules are

24:26

like I've seen these 50 block the Box

24:28

situations what rules say when it's good

24:31

to go and when it's not good to go so if

24:33

it can figure out what those rules are

24:34

like if it's it's essentially getting a

24:37

and you know understanding is a loaded

24:39

word so I don't like to use

24:41

understanding right but it's deriving a

24:44

representation of the rule set if you

24:47

will that humans cross which might you

24:49

know when we write code we want to

24:51

minimize the rules keep the code simple

24:52

so we don't have weird bugs and that

24:53

kind of stuff but neural networks if

24:55

it's if the r if the simple version of

24:57

the rules is 300 rules that's fine like

25:00

300 rules is no problem for them so if

25:02

humans have unconsciously 300 sets of

25:04

rules that we use to decide when we go

25:06

across and it can come to figure out

25:08

what those are well that lets it

25:10

generalize it can now take the same

25:13

principles it's extracting the

25:15

principles unconsciously not rationally

25:18

just reflexively in the same way people

25:20

do it's extracting the principles that

25:21

humans are using to make that decision

25:24

and it's applying those to its own

25:26

actions and so that's where you

25:29

and we it manifest some you know some

25:32

cute behaviors that are irrational for

25:34

the car right perhaps but it also

25:36

captures I mean the fact that for I mean

25:38

you get you know as Pim had said that

25:41

you you get the the puddle avoiding

25:43

for free right you got the u-turns for

25:46

free like when is it to say the U-turn

25:47

or not that's hard to write you just you

25:50

get that for free but you also get the

25:52

oh this guy wants to turn left into the

25:54

parking lot so I'm going to pause back

25:57

here and let him go or somebody behind

25:58

me wants to pass me I'm going to move up

26:00

a couple of feet so they can get in or

26:02

move over you see the cars doing all of

26:04

this stuff right like they're

26:07

not you know the autopilot team they're

26:11

not picking and choosing the behaviors

26:14

that they want it's it's I mean it seems

26:16

clear to me anyway looking at this that

26:18

they're grasping the whole spectrum of

26:21

all the behaviors that people do the

26:23

polite things the impolite things where

26:25

people are irrational I mean one

26:28

thing that I

26:30

do like one of the things I liked before

26:33

because it it it does mimic some things

26:34

that I would prefer it doesn't mimic but

26:36

they're extremely human behaviors and

26:37

that is like when you're on the highway

26:39

humans tend to follow other humans other

26:42

cars too closely in certain situations

26:45

where the traffic is kind of dense and

26:47

whatnot and I've been just using the

26:49

auto speed letting the car pick its own

26:50

spacing and stuff and I notice that you

26:53

know previously there was a heuristic

26:55

this many car lengths and no less and

26:56

you know maybe temporarily for breaking

26:58

and stuff it might go closer but was

27:00

really good at maintaining a really

27:01

comfortable distance and now I notice

27:03

it's kind of it's driving more like

27:05

people and I kind of preferred when it

27:07

was keeping more space like I liked that

27:09

the car's ability to like maintain more

27:11

have a bigger and you know you don't

27:13

pick up rocks from trucks and stuff but

27:15

it's now I'm finding

27:18

it's mimicking human following Behavior

27:20

which I personally find less than ideal

27:23

but that's part of the whole like that's

27:25

definitely something that if you were

27:26

picking and choosing you wouldn't have

27:27

picked to add because it's not a win

27:30

like it's an irrational behavior that

27:32

humans engage in that can lead to

27:34

accidents that reduces your safety

27:36

margin but the car is going to mimic

27:37

that too because you know they're taking

27:39

the good with the bad in order to get

27:41

everything including the stuff that they

27:43

don't necessarily know is in there I was

27:44

suggesting there are all these

27:45

unconscious rules that we follow well

27:47

they're unconscious to the autopilot

27:49

team too like they don't know to go look

27:51

for that so they're and but the net net

27:54

is it's you know the reality is they've

27:56

got this thing it's out there and it's

27:58

just working incredibly well yeah yeah I

28:00

mean it's yeah it's interesting I guess

28:02

on the topic of generalizing so um I

28:07

think that's probably one of the most I

28:08

think promising aspects of V12 is that

28:13

the behaviors that are it's picking up

28:15

um some of it can be unexpected because

28:18

let's say you've got you know 100 you

28:21

know videos on on um on whether or not

28:25

to go in and out of an intersection or

28:27

something at a at a yellow light or

28:29

something or a green light even if it's

28:32

blocked but then um so the neural nets

28:35

are analyzing the training data

28:37

like through billions of parameters and

28:39

analyzing this these these videos

28:42

getting what what it can out of it I

28:43

also wonder I guess it goes back to this

28:45

whole thing is are they adding more

28:47

types of data where it's like are they

28:49

adding onto those video clips or

28:51

providing different stuff of if this car

28:53

actually does this then you know there's

28:55

a crash or does this there's a crash cu

28:58

it seems like if it's if they're only

28:59

providing 100 say video clips of it

29:02

doing well then the signal for the

29:05

negative for the dangerous situation

29:07

isn't as high as if you give it directly

29:09

like so that's useful in reinforcement

29:11

learning where having negative examples

29:13

is really useful because you're trying

29:14

to figure out what the score is and you

29:15

have it good and bad um in the case of

29:18

human mimicking right the score is just

29:21

how close did you get to what like the

29:25

way you rate the how the neural network

29:27

is doing and training is you show it a

29:29

clip it hasn't seen before and you ask

29:30

it what do you do here and you rate it

29:33

just by how close it was to what a human

29:35

did so you take a human recorded example

29:36

that the system isn't trained on has

29:38

never seen before and and when I test it

29:41

to decide these other Clips are they

29:43

helping are they hurting I give it one

29:45

it's never seen before and and wait and

29:47

and good and bad is just how close are

29:49

you to the human it's not did you crash

29:51

it's not there no in reinforcement

29:53

learning you do that you you know you do

29:56

or contrastive learning you know there

29:57

are other things where you do that but

29:59

the simple mimicking at least the way

30:01

that it's done in robotics

30:03

overwhelmingly right is we just we have

30:05

a signal from from a Target that we want

30:08

you to get close to and and your score

30:11

is just how close you are to that so the

30:14

degree to which it mimics a recording of

30:17

a never-before seen good driver Behavior

30:21

that's its score so you don't need the

30:22

crashes so do you think that they're

30:25

only doing that type of mimic training

30:27

versus are they you don't think they're

30:29

adding on different types of contrastive

30:31

or let's say reinforcement learning or

30:33

whatever long term reinforcement

30:35

learning is going to be really useful um

30:38

like you know I mentioned there are

30:40

these various technique there are

30:41

various ways that I can you know when

30:45

fundamentally neural networks you know

30:46

the way they train them is you give them

30:48

an example and then they say what they

30:50

would do in this situation and then you

30:52

give them a score and based on the score

30:54

you you adjust all the weights and you

30:56

just do that over and over again and the

30:57

weights eventually get really good at

30:58

giving you the answer that you're

31:00

looking for okay how do I pose the

31:02

problem um in reinforcement learning

31:04

what you do the the problem is you do

31:06

you play all these steps and then you

31:08

get a score for the game so this is how

31:10

like DeepMind did with Atari games

31:12

and that kind of stuff you do a whole

31:13

bunch of actions and this is the

31:15

challenge in reinforcement learning is

31:16

it's hard to know which you know if you

31:18

if you have to do a 100 things to get a

31:20

point well how do you know which of the

31:22

hundred things you did was important

31:23

which wasn't like that's a big Challenge

31:25

and so reinforcement learning does all

31:27

that but because of this challenge

31:29

reinforcement learning tends to be very

31:30

sample inefficient we say it you need

31:33

lots and lots and lots of games to play

31:36

before in order to learn a certain

31:37

amount of stuff if on the other hand you

31:40

were trying to train Atari right and you

31:43

and your feedback signal was have the

31:45

paddle go exactly where the expert human

31:47

does right then that's more sample

31:51

efficient it learns faster so remember

31:53

we've talked about the AlphaGo example

31:54

before right so when they first started

31:56

training AlphaGo the first step that they

31:58

did was they had it mimic humans they

32:00

took 600,000 expert human games and the

32:02

first stage of training the first

32:04

version of AlphaGo was they just

32:06

trained it via human mimicry do what the

32:08

human did now that got them a certain

32:11

distance right that got them to because

32:13

they had 600,000 games which were decent

32:15

but you know decently good human players

32:17

but they were like amateurs or whatever

32:19

how do you get to the next level well in

32:22

the case of a game like go or chess or

32:24

whatnot a thing you can do is you can

32:26

start doing reinforcement learning now

32:28

reinforcement learning in those kind of

32:30

settings in in chess you've got you know

32:33

16 30 50 moves that choices at any given

32:37

point you have and maybe only 10 of them

32:38

are good choices so you don't you know

32:40

the the tree of possibilities doesn't

32:42

expand that quickly right so

32:46

uh so essentially you can get the

32:49

network that's trying to learn which of

32:50

13 possibilities to converge much faster

32:53

than if the choice is much bigger and in

32:55

the world you know we have these

32:56

continuous spaces where where like you

32:58

can turn the steering wheel to 45° 22°

33:03

13.45 7° you know the space of

33:05

possibilities is is really large and so

33:08

because so this is a real challenge with

33:10

reinforcement learning so people have

33:13

tried to do reinforcement learning with

33:14

cars in games like you know car driving

33:16

video games and that kind of stuff and

33:17

we know it works but we also know it's

33:19

very sample inefficient okay so me

33:22

looking right now at where Tesla is I

33:25

would guess that they're doing human

33:27

mimicry and they might be doing a little

33:29

bit of reinforcement learning training

33:31

on top of that you know maybe there's

33:32

something you want the system to do and

33:35

it's not quite getting there with the

33:37

mimicry stopping at stop signs you know

33:40

um and so you you can layer on a little

33:43

bit of reinforcement learning on top of

33:44

that to just tweak the behavior of the

33:47

system so incidentally this is what this

33:49

is what chat GPT did originally remember

33:51

there with chat gbt there was the the

33:53

basic training then there's instruct

33:55

training where you you tell it don't

33:58

just predict the next token pretend

34:00

you're in a dialogue right and then

34:02

there's one more step after that that

34:04

they do with chat GPT which was the

34:05

reinforcement learning from Human

34:07

feedback right which is where you do at

34:10

that after you get to that point now you

34:12

do a little reinforcement learning and

34:14

you train it don't just pretend you're

34:16

in a dialogue with me but you're in a

34:18

dialogue with me and you want to please

34:20

me these are the answers that humans

34:22

prefer so that last one is the one that

34:25

makes it polite and gives you alignment

34:27

and all that that other stuff now it's a

34:29

tiny fraction of the overall training

34:31

the overwhelming bulk of the training is

34:33

a pre-training just predict the next

34:34

token and then there's a big chunk of

34:36

the instruct okay so you can do a

34:38

similar thing with self-driving and I

34:39

would sort of expect that that's how it

34:41

would evolve that you know there's a ton

34:43

of pre-training for the uh perception

34:46

Network which is just you know they

34:48

already have all this labeled data and

34:49

they can they've got an auto-labeler so

34:51

they can take these recordings they can

34:53

generate you know maps of where all the

34:54

street signs are they can ask the

34:56

perception system tell me where the sign

34:58

is and whatnot so that's a ton of

35:00

training on supervised data which is

35:03

very sample efficient that's the most

35:04

sample efficient kind then they go to

35:06

maybe a more General thing where they're

35:08

mimicking humans that's also supervised

35:10

but it's in a broader domain but it's

35:12

still more sample efficient much more

35:13

sample efficient than reinforcement

35:15

learning so then at the tail end you add

35:17

you know it's this layer cake you build

35:19

the foundational capabilities then you

35:22

do some refinement and add some

35:23

additional capabilities and then maybe

35:25

you fine-tune with yet another kind of

35:27

training at the end of it so if they're

35:29

using reinforcement learning right now

35:32

because of the sample efficiency issue I

35:33

would expect it to be that cherry on top

35:35

kind of thing right at the end the last

35:37

little bit where there's one or two

35:39

things that the mimicking isn't getting

35:41

you or it's mimicking a behavior you

35:43

don't want it to and now you on now you

35:45

come up with a new game for it to play

35:47

where you've got a game and it has to

35:49

get a score and now you're going to do

35:51

reinforcement so you could totally do

35:52

that and eventually they will because if

35:54

you really want to get deeply superhuman

35:56

that's how you did it that you know

35:58

that's what we learned one of the

35:59

examples from go was you know it got to

36:02

play when it first when it was when they

36:04

were first playing Fan Hui you know who

36:06

was the European Champion like it could

36:09

kind of get to his level with that

36:11

mimicry and maybe Monte Carlo search on

36:13

top of that which is basically you know

36:15

not just doing doing the first thing the

36:17

neural network has but exploring a few

36:19

possibilities just heuristically right that got

36:21

him there and they could beat Fan Hui

36:23

but they're not going to beat Lee Sedol

36:24

that way there aren't enough example

36:26

games and you know for it to train on it

36:28

has to play against itself with this

36:30

reinforcement and then then the sky the

36:32

limit how good it is possible to be

36:35

becomes the limit of how good the system

36:37

can be and then they can become truly

36:39

superhuman So eventually we'll see you

36:41

know self-driving systems they will

36:43

they'll do that you know as as we get

36:45

more computers more computer capacity as

36:48

we learn how to do reinforcement

36:50

learning in this domain it will come to

36:51

that and so you know longterm I think

36:53

that's very likely some I mean there are

36:56

things that do the same thing as

36:57

reinforcement learning they're a little

36:58

bit different but one of these

36:59

techniques so it can self-play so that

37:02

it can it can learn to be better than

37:03

humans can ever learn to be um like

37:07

that'll become part of the for but we're

37:08

not there yet right I mean there's still

37:10

the low-hanging fruit of being as good

37:11

as a really good human driver yeah

37:14

because if FSD was was equivalent to a

37:17

really good human driver but it never

37:19

got tired it never got distracted it

37:20

could see in all directions at the same

37:22

time that's a great driver like that's

37:24

superhuman by itself it it's decision

37:27

making doesn't necessarily have to be

37:29

superum but the combination of its

37:30

perception and its in-

37:33

defatigability right all right

37:36

it never gets tired uh the combination

37:39

of those things on top of good human

37:40

decision making like I kind of feel like

37:42

as a near-term goal that's a great goal

37:44

and that will get us tremendous utility

37:46

and you don't necessarily need more than

37:48

human mimicking in order to do that okay

37:50

so on human mimicry so um when Tesla's

37:54

training um and feeding their neural

37:57

nets uh all this you know video of

38:01

good drivers

38:03

driving how is the training working so

38:06

for example is it you're in a situation

38:09

and it's say um is it telling the neural

38:13

network to predict what the human will

38:14

do next and then show what the human

38:17

does next and it it corrects it its

38:19

weights is it something like that

38:21

basically auto training itself off of

38:24

all of the videos right yes okay I would

38:28

guess they're

38:30

probably so you take the human drive and

38:34

you break it down into some variables

38:36

right like positioning timing decisions

38:38

for Lane uh stuff and whatnot to create

38:41

kind of a scoring system for how close

38:44

are you to what the human did is it you

38:47

know do we just look at all the controls

38:49

and we take you know least uh uh mean

38:53

squared error of the car versus that you

38:55

could do that maybe that works great

38:57

maybe uh maybe you go take a step

39:00

further back and you say what was the

39:01

line the human took through the traffic

39:03

and you know what's the distance at each

39:05

point you are off that line maybe that's

39:06

the score or the speed um there might be

39:10

other elements of the score like you

39:12

know how quickly did you respond when

39:13

the light changed when The Pedestrian

39:15

moved I mean you could layer other

39:16

things on top of it you would you would

39:18

start with the simplest thing you know

39:20

this uh mean squared error right and

39:24

then if that didn't work or if you could

39:26

at layer other things on to it to make

39:27

the scoring because having a good

39:29

scoring system is an important part and

39:31

this is all comes down to sample

39:32

efficiency too like you know does my

39:35

super computer run for a week to get me

39:37

a good result does it run for a month

39:39

does it run for a year that's sample

39:40

efficiency like how fast do I get to the

39:43

result I want the system itself will

39:45

constrain how good it can get but a good

39:47

scoring system can get you there faster

39:48

it's economics and so they'll definitely

39:51

there will be a lot of tricks in that

39:53

scoring function that they have we call

39:55

it the loss function mhm and

39:57

uh you know so it would be really like

40:00

as a practitioner I would be really

40:02

curious to know like what they're doing

40:03

but they do have one they've got they've

40:06

come up with a scoring system and it's

40:07

almost certain that you know essentially

40:09

they're taking what the human did they

40:11

have this sort sort of you know ideal

40:14

point you know they have an ideal score

40:16

that you could get any and the and the

40:18

system score is just like how close are

40:20

you to like what what our uh expert

40:23

human did in this situation yeah I mean

40:25

what's exciting about kind of of being

40:28

able to train like that is it reminds me

40:30

of you know the whole Transformer

40:32

Transformer model with ChatGPT it's

40:35

like you could give it so much data and

40:38

it just you

40:39

know takes all that data and and by

40:43

predicting the next token and then and

40:45

then rearrange its own weights it could

40:47

just get better and better and it's just

40:49

it's so scalable in a sense you just

40:50

feed it more data um more parameters it

40:55

just gets better and better um because

40:58

the the training is just it's just such

41:00

a um such an efficient you know usage of

41:02

it a really interesting metaphor is you

41:05

know if if a text model is learning to

41:07

predict the next token right exactly

41:10

okay, well, these tokens, they're all

41:11

written by humans right like all this

41:13

stuff before there were language models

41:15

like all the text was written by human

41:16

beings right we didn't have automated

41:17

systems that generated any meaningful

41:20

amount of the content, so in a sense it's

41:22

just predicting what the human,

41:24

what was the next thing the human put

41:25

it's a kind of human mimicry, right?
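(To make that analogy concrete: if you discretize the next control into "tokens", behavior cloning has exactly the same form as a language model's next-token objective. A toy sketch; the bin count and the model interface are assumptions, not Tesla's architecture.)

```python
import torch.nn.functional as F

N_STEER_BINS = 256  # steering angle discretized into "tokens" (illustrative)

def cloning_loss(model, frames, human_steer_bin):
    """Predict the human's next control from camera frames.

    frames: (B, C, H, W) camera input; human_steer_bin: (B,) class indices.
    Identical in form to a language model's next-token cross-entropy loss.
    """
    logits = model(frames)  # (B, N_STEER_BINS)
    return F.cross_entropy(logits, human_steer_bin)
```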

41:26

Exactly, yeah. But when we look at, if you

41:29

look at what like chat GPT can do

41:31

relative to what a human can do well

41:33

there are things it can't do that a

41:34

human can do still there's forms of

41:36

reasoning and whatnot that it's still poor

41:38

at. But there are a lot of ways it's

41:39

notably superhuman, like its ability to

41:42

remember stuff is just like it's vastly

41:44

superhuman like you can talk to it about

41:46

any of 10,000 Topics in a hundred

41:48

different languages you know it's like

41:50

deeply superhuman in certain respects

41:53

already and so you could expect the same

41:55

thing from the mimicking like if they're

41:57

learning to predict the next

41:59

steering wheel movement predict the next

42:01

brake pedal, then in a sense you

42:02

get a similar kind of thing it's not

42:05

necessarily constrained to just what a

42:07

human could do because its capacities

42:09

are different it's going to learn it a

42:10

different way like it's not a human

42:12

being like human one of the things about

42:15

human beings is we have these really

42:17

terrible working memories right which is

42:19

one of the reasons that our that our

42:20

like thought process is broken into

42:22

these two layers this unconscious thing

42:24

and the conscious thing that because

42:25

consciously we can only keep track of

42:27

like you know a few things at one time

42:30

well you know um FSD doesn't have that

42:33

problem like when a human being comes to

42:36

an intersection one of the challenges

42:38

that you have is you know there's three

42:40

pedestrians and two cars crossing and

42:43

you're turning your head to look at them

42:44

you're paying attention to a couple well

42:47

FSD is simultaneously looking at 100

42:49

pedestrians all the street signs all the

42:52

cars in all directions simultaneously

42:54

like it doesn't have attention the same

42:56

way we do. So even given the same

43:00

set of, you know, ideals, the same target

43:05

to get to because it's getting there in

43:06

a different way there's lots of

43:08

potential for many of its behaviors to

43:10

be greatly superhuman even just in the

43:12

planning sense you know I mean the the

43:14

human being doesn't end up being the

43:15

limit in the same way that the human

43:16

being isn't the limit like for chat GPT

43:19

like the upper bound of how many

43:20

languages ChatGPT can learn is much

43:22

higher than the upper bound of what the

43:24

number of languages a human can be

43:25

fluent in right and similarly you know

43:28

like what can you tell me about you know

43:30

the Wikipedia page on Winston Churchill

43:32

like how many humans are going to know

43:33

that, right? And it does; try it,

43:35

it can tell you. Yeah, yeah, that's

43:37

interesting because, yeah, its

43:41

ability to retain you know like so much

43:44

more information I mean for example

43:46

ChatGPT. And also, if you apply that to FSD

43:50

through the training like if like if a

43:52

human was to be trained like as a

43:54

Transformer model for, like, an LLM, you know,

43:56

we wouldn't retain it, you know, it would

43:57

be like I mean it would just be like

44:00

it's like, for example, the amount of

44:03

data we get from I guess you know just

44:06

looking at video clips ourselves it's

44:08

it's limited we're just looking at one

44:10

aspect maybe like how the person's

44:11

turning a little bit about the

44:12

environment, but a neural net is picking

44:16

up a lot more subtle things that maybe

44:18

we're not completely conscious or aware

44:20

of and retaining that as well um so I

44:24

mean I I think two things one is

44:27

it just seems so scalable you just feed

44:29

it a thousand times more data, you know,

44:31

across a variety of of scenarios and it

44:34

just gets that much better you know it's

44:37

so it's the potential is just crazy

44:39

right um the second thing is is this

44:42

kind of crossover of abilities where it

44:45

does stuff that maybe you didn't expect

44:47

it to do because it's learning from

44:50

other scenarios and other situations and

44:52

kind of generalizing in new new

44:54

scenarios right and so it's kind of like

44:56

these emergent behaviors or abilities that

44:59

you weren't planning or you didn't train

45:00

for originally and I think as you feed

45:03

it more and more data um we're probably

45:06

going to see more and more of that kind

45:08

of thing. People will feel like it's superhuman in

45:10

some ways, it's just a better driver than

45:12

me um and that is going to come out more

45:14

and more right as you know the data

45:16

increases yeah well we're going to see a

45:18

lot of those I mean I

45:20

already have lots of examples. I mean, I've

45:23

only been trying for a few I mean I got

45:25

this on v11 sometimes but I'm getting a

45:27

lot more in V12 where you come to an

45:29

intersection and then it gets a behavior...

45:31

well I like I I told somebody the other

45:33

day that um that on v11 early v11 for

45:38

sure if I

45:39

intervened you know I want to say like

45:41

80% of the time the intervention was the

45:43

right thing to do right and every once

45:46

in a while you'd intervene then you

45:47

realize that the car was right you know

45:49

oh no I needed to take that turn instead

45:50

of this or I intervene because I thought

45:52

it was slowing pointlessly for the

45:54

stop sign and I didn't see the

45:55

pedestrian or I didn't see the speed

45:56

bump you know or whatever the deal was I

45:59

want to say on V12 I'm getting much more

46:01

into the zone where it's like 80/20 the

46:02

other way you know like 80% of the time

46:04

I intervene it was my mistake the car

46:06

saw something it was responding to

46:07

something that I should have that

46:09

ideally I would have seen I would have

46:11

responded to but but I didn't right and

46:15

you know so it's exposing more of my

46:17

failings. When we disagree it's often

46:19

exposing my failings more than the

46:21

system's failings, you know, as that goes

46:23

and I think that's you know we're on the

46:25

trajectory we on right now now we could

46:27

very quickly be getting into a world

46:29

where you know the odds are if you like

46:32

you should still intervene you know it's

46:34

because the system is not perfect but

46:36

but you know 99% of the time you

46:38

intervene the car was right and it was

46:40

you and it's you that's wrong and you

46:41

know so that's that begs a question of

46:43

like at what point do we not let the

46:45

drive right because like is it 99 or

46:47

99.9 like how how much more right does

46:51

the car need to be and of course that's

46:53

going to depend on the weighting of errors

46:54

you know, like if the 99 and the one

46:57

is extreme, you know. But I think, you know,

47:01

I think there's a good chance we're

47:02

going to be there this year yeah at the

47:04

current rate of progress and that's

47:05

going to be really exciting I think what

47:08

what can trick people is you think V12

47:11

is like the next iteration of v11, right,

47:13

so it got you know from v11 to V12

47:16

you're like oh big jump right and so

47:18

you're thinking okay maybe in another

47:20

year we'll have another big jump you

47:22

know v13 or something it'll take another

47:24

year and then you project that but I

47:27

think the tricky part is V12 was largely

47:30

done under the cover as this you know

47:33

stealth project um not released to the

47:36

public or really shown much and it's

47:38

really been like probably you know

47:40

supposedly maybe December of what 202

47:43

it's building on a lot of infrastructure

47:45

that was built for those other projects

47:47

too but yeah so it's a difficult

47:49

comparison to make but it's not unfair

47:52

to say yeah this is a clean sheet for

47:54

the planning part. And

47:57

if you look at the trajectory of how

47:59

fast, let's say, the planning is

48:02

improving and and it's probably you

48:04

could probably map it out with the

48:06

amount of data you're you're putting

48:07

into it and map out the abilities and

48:10

Tesla has probably the ability to see

48:13

into the next 12 months in terms of how

48:15

much compute they have, how

48:16

much data they could feed it and what

48:18

type of abilities they're expecting from

48:20

it you think and I think that would

48:22

surprise a lot of people one one thing

48:24

we don't know what abilities like

48:26

there's some things that are clearly

48:27

have been left like the parking lots

48:29

have been left out at this point right

48:31

the Actually Smart Summon, you know, we're

48:32

waiting on that

48:35

um why are those held back are they held

48:37

back because they had this part working

48:39

well and it's 95% of what people use it

48:42

for and we're going to push it out are

48:43

they holding it back because there's

48:44

something tricky about it and they want

48:45

to get it right and so does that maybe

48:47

indicate that there are some challenges

48:48

there we don't know until it comes out

48:50

uh parking lots are really different

48:52

than driving on surface streets and so

48:54

it wouldn't be surprising if there's

48:55

some novel things problems that occur in

48:57

parking lots at high rates I mean there

48:59

are benefits in parking lots you move

49:01

really slow it doesn't matter if you

49:02

stop you know it's not like driving on a

49:03

Surface Street so I believe you know

49:06

ultimately they're tractable and whatnot

49:08

but you know we don't know that it's

49:10

it's feature incomplete I would say at

49:12

this point and so when when it's feature

49:14

complete then it'll be easier to predict

49:16

what the scaling does. Do you, have you

49:18

heard the expression the bitter lesson

49:20

No, no. Okay, so there's this white paper

49:22

written by a machine learning uh

49:24

researcher named Richard Sutton it's

49:25

kind of famous inside the field right

49:27

Richard Sutton. He basically wrote this thing

49:29

it was an observation about machine

49:30

learning over the decades right and

49:32

especially recently and it basically

49:35

says that what the field has learned

49:37

over and over again is that doing simple

49:39

things that scale that maybe don't work

49:41

great today but which will get better if

49:42

you scale them up always wins over doing

49:45

exotic things that don't scale and the

49:47

temptation as a researcher is always to

49:49

do, is to get the best performance you can

49:50

at

49:52

whatever scale you're working at in your

49:54

lab or whatnot even as a small company

49:57

but Sutton basically observed that

50:00

betting on techniques that scale like

50:02

maybe don't work great today, but

50:04

predictably improve as you scale up,

50:06

they always win, they just always,

50:08

always always win and you know it's he

50:10

called it the bitter lesson because you

50:12

know researchers keep learning that you

50:14

build this beautiful thing but because

50:16

it doesn't scale it falls by the wayside,

50:18

nobody ever uses it and the simple thing

50:20

that everybody's known since, like, you know,

50:22

1920 or whatever that just scales well

50:24

is what just people keep doubling down

50:26

on so yeah this is what models are

50:28

teaching us today right and a thing

50:31

that's the way that this relates back to

50:33

FSD is that heuristics aren't scalable

50:35

you need humans to do it the more

50:37

heuristics you have like if you have

50:38

300,000 lines of heuristics and they

50:40

have a certain number of bugs when you

50:42

get to 600,000 you don't have twice as

50:44

many bugs you have like four times as

50:45

many bugs because the interactions get

50:47

more complicated, right?
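(A back-of-the-envelope way to see that claim: if bugs come mostly from pairwise interactions between rules, the bug count grows roughly with the square of the rule count, so doubling the lines quadruples the bugs.)

```latex
\text{interactions}(n) \approx \binom{n}{2} \sim \frac{n^2}{2},
\qquad
\frac{\text{bugs}(2n)}{\text{bugs}(n)} \approx \frac{(2n)^2}{n^2} = 4
```

So there's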

50:50

like poor scaling: heuristics don't

50:53

scale heuristics written by people don't

50:55

scale but if you if I just take the same

50:58

model and I give it more video and it

50:59

gets better now that scales I just need

51:01

more video and I need more compute time

51:04

and it gets better so the bitter lesson

51:06

would tell us that V12 is way better

51:09

fundamental approach to solving this

51:11

problem than v11 was with its heuristic

51:13

planner and I think if you go all the

51:15

way back you know uh Andre karpathy was

51:18

telling us in his earliest talks about

51:21

this, that he foresaw what he

51:23

was calling Software 2.0, the neural

51:24

network just gradually taking over and I

51:26

think that in you know that's largely

51:29

inspired by the same thing the neural

51:31

networks are going to take over because

51:33

as you get scale they just become the

51:35

right way to do everything right and

51:36

eventually there's nothing left for the

51:38

heuristics. Yeah, yeah, I was thinking

51:39

about that karpathy quote and I think

51:42

you know, the intention was for

51:46

at least the planning stack to

51:48

be more, like, gradual, you know, Software 2.0

51:51

eating away at it. And I think V12, the

51:55

end-to-end approach, is a bit more drastic

51:57

than maybe what I originally you know uh

52:00

intended but it's just to me it makes it

52:02

definitely makes sense and if they can

52:03

get it working which they have it's

52:06

clearly I think going to be the the well

52:09

there's another way to tell this story

52:10

too well like I've people have asked me

52:12

a few times and I think the right way to

52:15

think about this is that Tesla didn't

52:17

suddenly stumble onto the idea of doing

52:19

end to end end to end is obvious right

52:22

sure like if you can make end to end

52:23

work the problem is it just doesn't work

52:24

in really complex domains. Or rather,

52:27

it's not that it doesn't work at all; you have to

52:29

get to a certain scale before it starts

52:31

working right so I think the more

52:33

realistic way of thinking about Tesla's

52:35

relationship with end to end is that

52:38

they were trying it, and it didn't

52:40

work; they tried it, it didn't work, you

52:41

know they would you know so you know it

52:44

may be that the reason that v11 got to

52:47

300,000 lines right is they expected end

52:50

to end to start working a year ago two

52:52

years ago they didn't think they were

52:53

ever going to get to 300,000 lines but

52:55

it took longer

52:56

to get the neural network to do the

52:58

planning part yeah so essentially this

53:01

is like the dam breaking you know when

53:04

they finally find the technique that

53:05

scales that they can do that kind of

53:07

stuff, the dam breaks quickly because it

53:10

it quickly overwhelms the downsides to

53:12

having 300,000 lines of heuristics that

53:15

are guiding your planning yeah I mean

53:17

did you see that tweet by Ashok, like,

53:20

something about the beginning of the end

53:22

or something do you think it's related

53:25

to FS at all

53:28

it's completely speculative, but I

53:30

think it is but yeah I mean what does

53:33

he comment on that's not, ever, right?

53:36

it's

53:37

like it's it's mysterious but you know

53:40

the beginning of the end of uh of uh

53:43

people driving cars is like it's kind of

53:45

the way I look at it. Yeah, I kind of wonder if

53:47

like with the internal metrics and you

53:50

know things that Tesla internally is

53:51

tracking with V12 and you know they're

53:55

they're on their next version to you

53:57

know, v12.4 or whatever, and they're just

54:01

seeing the improvements and then and

54:04

they they know what's coming down the

54:06

line and how much compute and data and

54:08

everything going forward that they just

54:10

they just must be really

54:12

excited right now I think just to see

54:14

the level of you know of improvement

54:17

especially with um 12.3 it was still the

54:21

it was a 2023 build I mean you could

54:24

tell from the firmware number right mhm

54:26

and generally what we saw through uh

54:29

through v11 right was that the things

54:31

that were getting in the customers hands

54:33

were three, four, five, sometimes six months

54:35

old right so Tesla's already looking at

54:39

the one we're going to get in six months

54:41

so I mean they may you know why does it

54:44

take them six months well they do all

54:45

this testing and validation there's

54:47

tweaking there's all these waves of

54:49

rolling it out to be super safe and

54:50

whatnot so the pipe is deep between when

54:53

they first but the but you they're going

54:55

to know the potential yeah you know when

54:58

you know the first couple of weeks after

55:00

they do those initial builds so you know

55:03

they already mostly know what we're

55:05

going to have in six months and so uh

55:08

they don't really have to guess right we

55:10

just you know it takes six months for it

55:11

to get through the safety pipe and

55:13

everything and get to us yeah um so with

55:16

v11 I remember, half fondly, half

55:20

not fondly, when you're at, like, some

55:23

intersection or something you're stopped

55:26

or moving slowly you get this like you

55:28

know jerky uh steering wheel thing it's

55:31

going left going straight going left

55:33

going straight and when I think about

55:35

that I'm like that's going to be

55:37

something I think all pre-V12 beta

55:41

testers will have as their shared

55:44

experience you know this like jerky

55:46

steering. Have you seen... so, V12 has

55:48

this thing where occasionally you'll be

55:49

stopped an intersection and it starts

55:51

you're totally stopped not moving slowly

55:53

you're stopped you're behind another car

55:55

something like that, and it just starts turning.

55:57

yeah it does that yeah yeah I thought it

55:58

was just me I guess it it does a little

56:00

bit no I've seen it two or three times

56:02

the first couple of times I saw it I'm

56:04

like what are you doing you know and

56:06

it's just slowly turning the steering

56:08

wheel right I'm like this will be

56:10

interesting you know the light changes

56:11

it goes and it like whips back straight

56:13

and

56:15

It's like it's bored or something and

56:17

playing with

56:18

this. That's funny. Um, but okay, so

56:22

moving from v11 to V12:

56:25

in v11, I interpreted the

56:29

the steering wheel thing at the

56:30

intersection it's like it's debating

56:32

between two options right it's like oh

56:34

60% this way 40% but then it changes to

56:37

60% this way and then you know goes back

56:39

and forth like literally as it changes

56:41

percentage of of what it should do it's

56:44

it's changing the steering wheel. But

56:46

why in V12 we don't see that behavior

56:49

you know why is it just confidently just

56:51

going in one direction without

56:54

human uh

56:57

Okay. When you have heuristics, you come

56:58

to an intersection your options are you

57:01

got a few options straight right left

57:04

right go don't go they're

57:07

binary so the neural

57:11

network that the output from the neural

57:14

network, there's, you know, you're

57:17

at an intersection and you can go right

57:19

you can go straight or you can turn

57:20

left, right? There is no 45-degree option

57:24

right. Okay, so the neural network,

57:27

in this case it's functioning as a

57:28

classifier you choose this or choose

57:30

that but neural networks to work they

57:33

have to be continuous so there has to

57:35

exist in the system a very low

57:37

probability option between the two right

57:40

this is, you know, you have a

57:42

sigmoid, right? The important parts are the

57:44

zero and the one but it has to be

57:47

continuous because if it's not

57:48

continuous you can't it's not

57:50

differentiable and you can't back

57:52

propagate so this is a fundamental thing

57:54

neural networks have to have has to be

57:57

okay so the system has a set of criteria

58:00

where it's going to go forward and it

58:01

has a set of criteria where it's going

58:03

to go right and you're trying you know

58:06

and you minimize, you know, there's

58:08

a certain probability

58:10

for this, a certain probability for that,

58:11

and they add up to almost one, and there's a

58:13

tiny little bit of remaining probability

58:15

in the stuff in between and it's

58:16

intended to just connect the two states

58:18

so the neural network

58:19

is differentiable, right. Okay, this is

58:22

actually kind of a weakness in a system

58:24

where you have two states right because

58:26

imagine that you get to a set of

58:29

criteria that in you know every once in

58:31

a while you're going to get to a

58:31

situation where the system is balanced

58:33

right on that 45 point right and as the

58:36

Shadows shift and the cars move around

58:38

the contextual cues just shift a little

58:40

bit, you know, the network is going to

58:43

flip to that, because that's a

58:44

choice and this is a choice and the the

58:46

system before, it was built so the

58:47

steering wheel it reflected the choice

58:49

that was upcoming for the intersection

58:52

right so something is flickering back

58:54

and forth and yeah as you say it it's

58:56

it's oscillating there, a very tiny

58:59

little oscillation but you have to have

59:01

that oscillation. You have to have this huge

59:03

disparity between going right and going

59:05

left because going 45 is never an option

59:08

like you have to make that super super

59:10

small so if you're right on the boundary

59:12

it'll hop back and forth between two

59:13

options that to a human being seem very

59:15

disparate, right?
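(A toy illustration of the flicker being described: a hard classifier flips between two distant options as the contextual evidence wobbles across its boundary, while regressing onto the human's continuous wheel angle moves smoothly. All numbers here are illustrative, not anything from Tesla's stack.)

```python
options = {0: 0.0, 1: 90.0}  # "straight" vs. "hard right" wheel angle, degrees

def classifier_plan(evidence):
    # Hard argmax: a tiny change in evidence near 0.5 flips the whole plan.
    return options[int(evidence > 0.5)]

def mimicry_plan(evidence):
    # Regressing onto the human's continuous wheel angle: no forbidden middle.
    return evidence * 90.0

for e in (0.49, 0.50, 0.51):  # shadows shift, cues wobble around the boundary
    print(e, classifier_plan(e), round(mimicry_plan(e), 1))
# classifier: 0.0, 0.0, 90.0 (one big jump); mimicry: 44.1, 45.0, 45.9
```

The thing is, if you're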

59:18

mimicking a human

59:21

being, you no longer have, you know, your

59:24

goal is to just get as close to the

59:26

human being as you can. You don't have

59:27

this classifier thing where you have

59:29

these A/B options, so the system is not

59:32

going to end up in states where it's

59:34

making like it has the option like a

59:36

human being comes to the intersection if

59:38

they're going straight their wheel might

59:39

be here might be here might be here

59:41

right that one it might be here might be

59:43

here. They're fairly

59:44

broad and continuous; it's not perfectly

59:46

straight or here with like a no man's

59:48

land in between like humans will come to

59:50

an intersection they can turn the wheel

59:51

45 degrees let it sit there and then

59:53

when the light changes turn it straight

59:54

and keep going that's not

59:56

that's not a fail for the network it's

59:58

an option so it never gets in these

60:00

situations where it's oscillating

60:02

between two states that the design of

60:05

the neural network has to keep highly

60:07

discrete for safety's sake, right? Because

60:08

it's just mimicking a human being I

60:10

don't know if I'm explaining that very

60:11

well but it it is naturally going to

60:13

fall out of the fact that that they have

60:15

a Target that they're tracking and and

60:18

the goal is to be close not you don't

60:20

have to be right on being being pretty

60:21

close is good enough would you say

60:23

because, say, with FSD end-to-end, the

60:26

neural nets are, because they're

60:28

mimicking they just have so many points

60:30

to mimic along the path and that it's

60:34

just, whereas v11, it's deciding

60:37

between left and right you or I say

60:39

straight and right it's oscillating and

60:41

these are two big decisions to make and

60:44

once you're on them it just it's going

60:45

that certain path so it's that's the big

60:48

decision versus put it this way right

60:50

okay, you're writing digits down, mhm,

60:52

there's a one a two a three there's no

60:55

nothing partway between the one and the

60:56

two like it should either be a one or a

60:59

two there's no in between option that's

61:01

okay um but as a human you can have a

61:05

sloppy one or a two you know I mean if

61:08

you're if what you're doing is mimicking

61:09

the human the target the success Target

61:12

is broad; it's not precisely one or

61:14

precisely two with a No Man's Land

61:16

there's a whole bunch of different ways

61:17

you could write a one a whole bunch of

61:18

ways you could write a two there's not

61:20

really a space in between but but the

61:23

network has the leeway to have

61:26

slightly different ones and still be

61:28

right whereas you know in the classifier

61:31

way you don't have that you've got these

61:33

a very small number of extremely

61:34

distinct decision points and so if

61:36

you're on the boundary between them

61:37

you're going to see oscillation

61:39

interesting um all right so um moving

61:43

forward to

61:44

robotaxi August 8 reveal what are your

61:47

expectations on what Tesla expects,

61:50

like why do you think they're revealing

61:51

it now you know like yeah any any

61:55

thought or any ideas on this it seemed

61:57

kind of forced after that Reuters

61:59

article maybe that was a coincidence I

62:01

don't know um the you know I've seen a

62:05

couple of theories uh my guess is that

62:09

that around August, that rough

62:12

time

62:13

frame, is a good time for them to

62:16

be introducing this vehicle. So there's

62:18

kind of there's the software angle of

62:19

interpreting it there's a hardware angle

62:21

like you know it's about time for them

62:23

to get the hardware out why would they

62:24

need to get the hardware out why

62:25

wouldn't wait for reveal like they did

62:27

with like the y or the three where they

62:29

waited until they were ready to start

62:30

taking I mean the three it was early but

62:32

with the Y they didn't want to Osborne

62:33

the three they waited and they played it

62:35

down until they got there and up until

62:37

now it seems like you know with the

62:38

compact car that they'd been doing a

62:40

similar kind of thing, so as not to

62:42

Osborne the three or the Y

62:44

presumably um if they introduce it in

62:47

August they're they've either greatly

62:49

accelerated the timeline or they're

62:51

doing an introduction well ahead of the

62:53

actual release of the vehicle which kind

62:55

of makes sense for robo taxi because

62:56

people aren't expecting like nobody's

62:58

going to hold off buying a Model 3 because

62:59

they're waiting for the robo taxi right

63:01

I at least that's unlikely to be a thing

63:03

whereas they might wait to buy a model 3

63:05

so maybe it's less of an issue and maybe

63:09

they want to get prototypes out on the

63:11

road to start testing and Gathering data

63:14

like that's a theory I've seen seems

63:16

like not bad so that's one the other

63:18

possibility is that um they think the

63:22

software is getting really close and

63:24

they want to demo the software on a

63:26

platform to start sort of preparing the

63:29

world and Regulators for the fact that

63:31

this is a real thing it's really going

63:32

to happen and here's our status I mean

63:34

that's obviously it's good for the

63:36

company gathers

63:38

attention um it might get investors to

63:41

take it more realistically it might get

63:44

Regulators to start taking it more

63:46

realistically, like, this isn't pie in the sky

63:48

and this isn't us just dreaming, and

63:50

so don't put us at the bottom of your

63:51

stack of work like put it at the top

63:54

because this is we really need to start

63:56

working on the issue of, like, what are

63:59

you going to require before you

64:00

allow us to operate these these things

64:03

so like those all kind of make sense

64:05

yeah, yeah. I wonder if the robotaxi would

64:08

be just Tesla owned right for certain

64:12

urban city environments in the beginning

64:15

at least um I don't see like why would

64:18

they sell it to people initially when

64:20

they have a lot of capacity needs to

64:24

fill this vacuum of ride hailing, because

64:27

the discrepancy between how much, like,

64:30

human ride hailing costs and what robotaxis

64:33

will cost causes such a big gap. Like, Tesla

64:37

could easily use you know the first few

64:40

years of what production maybe 3 million

64:43

Vehicles they could it it's a really

64:46

good question and you know this is this

64:48

is something that it's been debated a

64:51

long time. I have a 10-year standing bet with

64:53

another guy about whether Tesla will

64:55

stop making selling cars to private

64:58

parties when the when they start making

65:00

Robo taxis uh you know you can see it

65:04

going like I've tried to work this a

65:07

couple of ways I can see advantages

65:09

either way. I mean, the robotaxi wholly-

65:11

owned fleet thing, its upside is it's a

65:14

simple model like predicting and

65:16

understanding it are kind of

65:18

straightforward right I don't know like

65:21

I would argue it's not the best model to

65:23

to plan around long term. I also feel

65:26

like when I think about the whole sweep

65:28

of this thing like I've said before that

65:32

you know, I feel like the robotaxi is

65:34

going to go through this period of time

65:35

where a relatively small number of robot

65:37

taxis is really profitable, but as the fleet

65:39

continues to grow and it

65:41

continues to take more miles it becomes

65:43

commoditized now the degree to which it

65:45

becomes commoditized like ultimately

65:48

it's still a profitable business it's

65:49

much bigger business so the total profit

65:51

being generated is bigger but the gross

65:53

margins are a lot lower as you get out

65:55

into building out the fleet and that

65:57

might be a relatively like when I look

65:59

at the numbers I could see that

66:00

transition from being I could see

66:03

they're super profitable you know

66:04

because you're just taking ride-hail

66:06

business and there's a lot of demand and

66:08

you like you basically can't build

66:09

enough cars to fill the demand like that

66:11

could last a couple years easy like will

66:13

it last five years maybe I don't know

66:15

that seems long to me uh and it's

66:18

there's it's not going to abruptly end

66:19

you know there'll be this long taper

66:21

into this long-term thing where like I

66:24

think you know there's I mean what is

66:27

the end State like is it 20 years 50

66:30

years you know you get different windows

66:31

at different things but I the other

66:33

point I like to think about is the point

66:35

where it's commoditized, like, the low-

66:37

hanging fruit of vehicle miles traveled

66:40

for, you know, like, your robo

66:43

taxi, it costs 40 to 50 cents a mile, it shows

66:45

up in 3 minutes it's super convenient

66:47

you can rent a two-seater four- seater

66:50

minivan you know like there's a lot of

66:52

variety lot of

66:54

accessibility and it's less expensive

66:57

than owning your own vehicle and half of

67:00

all miles have moved over to that and so

67:02

why do I say half and not 100% or some

67:04

other number

67:07

um one is human habits change slowly you

67:10

know, so people tend not to make

67:13

transitions to new technologies as soon

67:15

as they could, you know, the tail end

67:17

of the adopter curve and I you know

67:20

there are aspects of the of the robo

67:22

taxi adopter curve like moving off of

67:24

private vehicles on a robo taxis which I

67:25

think for various reasons are likely to

67:27

be more slow than than say uh you know

67:32

moving to cell phones or smartphones off

67:33

of, you know, Galapagos dumb phones was,

67:37

even though that took 10 years plus for

67:39

us to make that transition but it's an

67:41

interesting point to talk about because

67:43

it's that's a point we're definitely

67:45

going to get to we're definitely going

67:46

to get you know when we have 25 million

67:49

Robo taxis on the streets in the United

67:51

States they'll be supplying like half of

67:53

vehicle miles traveled, and I like that

67:55

because it's really hard to argue that

67:57

we won't at least get to that point so

67:59

you can talk about that model you can

68:00

talk about the model when you have 123

68:02

million robotaxis and that sort of

68:05

gives you an overall Spectrum to sort of

68:06

think about what's going on okay in

68:09

state two which I think probably comes

68:13

five years after State one maybe it's a

68:17

bit longer maybe it's 10 years I don't

68:19

think it's 10 years but maybe it is um

68:22

most of the car market is private

68:24

Vehicles it's not Robo taxis because uh

68:27

a smaller number of vehicles saturate

68:29

the robo taxi Market sooner and you know

68:32

if you still have a lot of vehicle miles

68:34

traveled, I mean, because robotaxis drive

68:35

five times as many miles as privately

68:37

owned vehicles do, say five times, that

68:40

means it takes five times as many

68:41

private vehicles to satisfy the same

68:44

demand that that an equivalent number of

68:46

robotaxis could do.
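(The arithmetic behind that: both numbers below are rough assumptions, not Tesla figures, but they show why a comparatively small robotaxi fleet covers a lot of demand.)

```python
US_VMT = 3.2e12          # rough US vehicle-miles traveled per year (assumption)
PRIVATE_MILES = 13_000   # typical annual miles for a privately owned car
ROBOTAXI_MILES = 5 * PRIVATE_MILES  # "five times as many miles"

target = US_VMT / 2                 # half of all miles served by robotaxis
print(target / ROBOTAXI_MILES)      # ~25 million robotaxis
print(target / PRIVATE_MILES)       # ~123 million private cars for the same miles
```

So after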

68:48

you get out of this profitable zone

68:50

where you know you have a small number

68:52

of robotaxis because you're production-

68:54

constrained or jurisdiction-constrained or

68:56

regulation-

68:57

constrained. After you get out of that

69:00

zone, like, the way I see this

69:04

thing is, Tesla is going to have this

69:06

huge demand for robo taxis over some

69:08

window of time and that that's going to

69:10

taper and most of their business you

69:12

know in this longer term is private

69:14

Vehicles again right so how do you

69:16

manage that as a company like you don't

69:18

want to leave anything on the table

69:20

during the Gold Rush when the Robo taxis

69:22

are making a ton of money and you're

69:23

rapidly scaling out the thing but you

69:25

also don't want to gut your long-term

69:28

prospects of continuing to be you know a

69:30

viable manufacturer like you can't walk

69:33

away from the car business for five

69:34

years and feel like you're just going to

69:35

pick it up you know I mean you got a

69:37

supercharger Network to keep going you

69:39

got to keep your service centers going

69:40

you have sales people you have like all

69:42

these channels your manufacturing design

69:44

goals all that kind of stuff they're

69:45

different between the the between the

69:48

two robotaxi I think will be crazy

69:50

profitable through some window of time I

69:51

think it'll be decently profitable and

69:53

huge long term right so that's the arc I

69:56

see for those things but I but I'm

70:01

skeptical about that. There are people who

70:04

feel like the economics of robotaxis are

70:07

so good that they expect a wholesale

70:11

abandonment of private

70:12

ownership is that possible I think it's

70:15

possible I just don't like that's not

70:18

the base case to me of what's going on

70:20

and I think whatever

70:22

strategy Tesla uses has to be prepared

70:26

for both eventualities and the the

70:28

flexible strategy that guarantees your

70:30

future is to keep a foot solidly in the

70:34

retail Camp all the way through this

70:36

transition sure um in terms of timeline

70:39

of when we can get unsupervised um FSD

70:43

or robotaxi starting to roll out I know

70:46

there's going to be different

70:47

municipalities different cities um it's

70:51

going to be a a phase roll out where

70:54

you're going to have start with certain

70:56

places that are more permissible you

70:58

know and it'll be a smaller Fleet to to

71:00

try out, kind of like what Waymo is doing

71:02

for example in a few cities and then you

71:06

you gradually you know roll it out more

71:10

I mean, I imagine Tesla's rollout will be a lot

71:12

faster because I think their rate of

71:14

improvement is going to be tremendously

71:16

fast especially once they get to that

71:18

point but would you say um timeline of

71:23

expectations when do you think the first

71:26

when do you think Tesla will first test

71:28

out kind of unsupervised Robo taxis on

71:32

the streets, kind of like Waymo, in a city

71:36

do you think it's second half of

71:38

2025? Um, testing? Like, I think

71:44

say more than 50 vehicles in a city this

71:48

year with Tesla employees behind the

71:51

wheels. Exactly, but I'm talking about, like, no

71:54

no one in the car and taking passengers

71:56

kind of like what Waymo is doing with no

71:58

one in the car yeah

72:00

that like I wouldn't expect to see them

72:02

doing it this year it's going to you

72:04

know, we're seeing this sort of

72:07

discontinuous sort of rate of

72:09

improvement yeah and you know we don't

72:11

know what the next six months holds

72:13

Tesla has a way better idea than we do

72:15

so it's conceivable that they're

72:17

confident about this and they feel like

72:19

they could try to do that this year um

72:22

like that seems super aggressive to me

72:23

yeah um

72:25

the uh and you know they're gonna just

72:30

as Waymo, Cruise, and Uber did, they're going to

72:33

go through this long period where they

72:34

have employees sitting in the cars

72:36

trying not to touch the wheel anymore

72:37

than they have and they're racking up

72:39

miles and they're getting a sense of how

72:40

well the thing works and I don't think

72:42

that that's going to be 10 cars you know

72:44

I think that's going to be 500 cars kind

72:46

of thing various places maybe various

72:49

countries and they're you know that's

72:51

going to be a way of gathering data, a

72:53

way of providing feedback to the Autopilot

72:55

team about things that have to be done

72:58

uh it's going to be a way for management

73:01

to develop a strategy or get data to

73:04

help inform a strategy for how they're

73:06

going to proceed and I would expect that

73:09

to happen this year interesting um now

73:12

you know what fraction of the drives

73:13

will be totally intervention free will

73:15

it be 99% will it be

73:18

99.99% I mean I think that's open to

73:20

debate and it it very much depends on

73:22

what we haven't seen the slope of

73:25

improvement for V12 yet and so it's hard

73:27

to have an informed decision so do you

73:30

think these uh this test these like test

73:33

of employees in let's say these Robo

73:35

taxis are they going to be picking up

73:37

passengers and driving them or so both

73:40

Cruise and Waymo did a thing where they had

73:42

company internal passengers

73:44

for years, I think. In San Francisco, Cruise

73:48

had company internal for like two years

73:49

or something; Waymo did it for quite a

73:51

while, I think. Waymo is doing that

73:54

with

73:55

employees in Austin now that's like the

73:58

first stage is you your own employees

74:00

get to use it it's you know and then uh

74:03

weo did a thing where they did they had

74:05

a long thing in Chandler Arizona where

74:07

they had you know customers under NDA as

74:10

they were working through and it turned

74:12

out to be long because obviously you

74:13

know they weren't making progress as

74:15

fast as they wanted to you know in terms

74:16

of like polishing off all the things or

74:18

maybe they became more conservative you

74:20

know they were in that window for a

74:21

really long time. Um, like, I don't see why

74:25

that wouldn't be a good idea for Tesla

74:27

to have you know internal people and

74:30

then you have external people just like

74:32

with the safety score thing you know you

74:34

have a population of people you know who

74:36

are who ride as passengers maybe under

74:39

NDA maybe not under NDA and you know you

74:42

just as your confidence builds and you

74:43

have more vehicles on the road and

74:45

whatnot you gradually open up you know

74:48

you let people see what you're doing

74:50

partly because you have to because as

74:52

your scale goes it's too hard to keep

74:53

things you know Under Wraps like I I

74:57

would expect them to be starting that

75:01

process this year and like how quickly

75:03

they move through the various stages of

75:04

like scaling up the vehicles having more

75:06

and more things uh that's um you know

75:11

that's going to depend on the tech. I

75:13

I really do believe the the tech is the

75:15

fundamental thing yeah I mean that's

75:17

interesting because um in the just in

75:20

the Bay Area and in like Austin they

75:23

could roll out you know Tesla or

75:26

employee passengers right and employee

75:29

driver. Palo Alto first, probably. Yeah, Palo

75:31

Alto, Fremont, Austin, factories, whatever

75:35

um that would be I mean they have plenty

75:37

of plenty of people there's plenty of

75:39

employees that they could do I mean they

75:41

how many people do they have that

75:42

commute to their factories every day you

75:44

know I mean imagine you know having a

75:46

fleet that just brings your line

75:48

workers and you know yeah and so you run

75:51

a shuttle service for line workers and

75:53

use Robo taxis yeah I wonder if the

75:55

August 8 reveal will share some of those

75:58

details you know like what do you think

76:01

like been cool if it did it would I I

76:05

I've been like my guess is we won't get

76:07

a ton of detail because they don't we I

76:11

you know battery there occasionally we

76:12

do get a lot of detail right I mean the

76:15

AI days have never given us a ton of

76:17

detail on strategy um the battery day it

76:20

kind of did so there's precedent for

76:22

maybe getting more data so if they if

76:24

they think of Robo taxi as more like but

76:27

the other thing is the like there's this

76:30

variable about um you know to what

76:34

degree do people with

76:36

Teslas get to participate in the Tesla

76:38

Network right you know when when Elon

76:42

first announced the Tesla network was

76:43

going to be a thing, the dedicated

76:46

robot taxi was pretty far away so

76:48

there's a lot of incentive to like get

76:50

um, the other thing is, like, when they

76:53

initially did it they didn't have the

76:54

cash reserves they had now like the idea

76:56

of, like, building your own fleet out

76:58

of your own pockets, or

76:59

borrowing money to do it that would have

77:01

been a lot scarier back when they were

77:03

thinking about that like now they could

77:05

scale moderate size fleets with their

77:08

existing cash reserves and it could

77:10

totally make sense like could be a

77:11

no-brainer of a thing and so my guess is

77:14

like the optimal strategy has probably

77:17

shifted but there are lots of people who

77:19

expect to be able to participate in that

77:21

and we're looking forward to it and I

77:22

like I didn't go back to read what the

77:24

con cont language was when when we

77:27

bought these things but you know that

77:28

was part of the promise that FSD got

77:30

sold on in the early days so I'm still

77:33

expecting that to some extent they

77:35

expect participation now what are the

77:37

terms how many people get involved you

77:40

know that's we like we don't know what

77:42

that is these are very these are knobs

77:43

they can turn to tune the

77:46

strategy I mentioned the the thing like

77:49

I I feel like navigating this boom in

77:51

robotaxi sales and whatnot while

77:52

maintaining your retail business is

77:55

going to be challenging and these are

77:56

knobs that they can turn to try to keep

77:59

the market orderly while all this stuff

78:01

unfolds and you know you

78:04

know, gain as much

78:06

benefit as they can provide as much

78:08

benefit as they can to their consumers

78:10

while not taking on unnecessary risk

78:12

yeah um is there anything about the

78:15

robotaxi like what's the biggest

78:17

difference between the robotaxi in the

78:19

$25,000 vehicle you think I mean like I

78:21

would say self-closing doors you think

78:24

you think that that important you know

78:26

when yeah when I think

78:30

about I you know when when I found out

78:32

they were doing a robot taxi I did a

78:34

couple of clean sheet things like what

78:35

would be a good Robo taxi like if you

78:37

were making it and uh when I think about

78:41

this stuff like what doesn't a model 3

78:43

have or a model y have that you want in

78:45

a robo

78:46

taxi uh you know there's a there's a a

78:49

bunch of things that I think are

78:51

nonobvious that have to do with Fleet

78:52

operation vehicles that that you know

78:55

they make sense, they're totally cost-

78:57

effective in a robotaxi. Like, a self-closing

78:59

door, I feel like, is a highly

79:01

cost- effective thing to put in a

79:03

$25,000 Robo taxi right just so that

79:06

your passenger doesn't walk off and

79:07

leave the door open right or you know

79:09

make sure the door is actually properly

79:11

closed and be able to properly close it

79:13

um but other stuff like you know being

79:15

able to check if somebody left packages

79:18

behind in the car making it so it's easy

79:20

to clean so that you know one of the

79:22

things taxi cabs one of the first things

79:24

wear out as a back seat you know cuz

79:26

people get in and get out so you want to

79:27

be able to you know easily swap out that

79:30

kind of stuff MH um it like I like the

79:35

idea of doing a cybertruck style you

79:37

know kind of really unusual looking

79:40

because it's it's for one thing it's an

79:41

advertisement the thing oh there's one

79:43

of those in the same way the

79:45

cybertruck's an advertisement there's

79:46

one of those Tesla robot tties right but

79:49

also you know being

79:51

dent-resistant, you know, not needing as much

79:54

cleaning

79:55

care um so there's that obviously

79:57

there's sensor Suite stuff you know

79:59

there's um spending more money on the

80:01

sensor Suite spending more money on the

80:03

computer like all that stuff is more

80:05

justifiable in a vehicle that's using

80:07

the sensors and using the computer like

80:10

24/7. So, like, the economic

80:12

tradeoffs of that kind of stuff when

80:14

it's a gimme that you're putting on cars

80:16

and 90% of people aren't using it like

80:17

that's harder to justify than in a

80:19

robotaxi where you know they're going to

80:20

use it, right? Sure. Does it need

80:24

to be four doors like four-seater that's

80:27

a really interesting question so like I

80:28

went back and forth on this a couple

80:30

years back when I was looking at the

80:31

thing. And my, so, a two-seater is attractive:

80:36

like the fundamental economics of the

80:38

two-seater are pretty attractive but you

80:40

do have the thing where like like it

80:42

it's true that most rides are one or two

80:44

people, right? But, like, 10% of the rides

80:46

are more than two people so of course if

80:49

you have two seaters they can take two

80:50

vehicles but like if you have two

80:51

parents traveling with children like are

80:53

they going to be happy with the two

80:54

vehicle kind of thing I uh you know and

80:57

a lot of people if your drive is more

80:58

than short and you're traveling with

80:59

your family you want to travel together

81:01

so you can talk kind of thing I mean

81:02

there's I feel like um from an

81:04

operational flexibility standpoint like

81:06

if you're going to build one vehicle

81:07

that the four- seater is the thing that

81:08

makes the most sense because the

81:10

overhead today the way our streets are

81:13

configured right I mean there's no

81:15

advantage to having a really tiny

81:16

vehicle today you're going to take a

81:17

whole Lane anyway you're not lightening

81:19

up congestion or whatnot you're just

81:20

reducing the cost of the vehicle I feel

81:22

like the four four-door vehicle if

81:24

you're just going to build one vehicle

81:25

and you're not going to make another one

81:26

for 2 or 3 years and this is going to be

81:28

the first one and you're going to start

81:29

scaling your robot taxi I feel like

81:30

there's a lot of argument to be made for

81:33

doing a four seater because it lets you

81:35

cover like

81:36

99.9% of the market or something like

81:38

that as opposed to 90% of the market MH

81:40

interesting um so I was thinking about

81:42

this about the whole idea of Tesla

81:45

becoming an AI company versus let's say

81:47

autom manufacture I was

81:51

thinking, um, it seems like the pure

81:54

auto manufacturer business, it's just,

81:57

I've never thought much of it, it's just very

81:59

cyclical low margins typically it's like

82:02

the software component is is the

82:04

Intriguing part adding extra value where

82:07

as an investor yeah yeah or you know

82:10

rather than the human you know invested

82:14

amount of time and energy and focused

82:15

drive, you're pulling that out, off-

82:17

loading it onto an AI chip, like, that's

82:20

interesting that can drive margins Etc

82:22

um but it seems like this

82:25

transition from Tesla as an auto

82:27

manufacturer to an AI company, like, it's

82:29

been happening over time and I would

82:33

argue that Tesla's like focus and

82:36

priority, best engineers, have all been on this

82:39

you know, kind of AI trajectory, but

82:44

just like for example open AI before

82:47

ChatGPT, sure, they were an AI company, but

82:50

ChatGPT made them kind of like a real AI

82:55

company, an AI company where people use

82:58

their products you know like a company

83:00

that's that's immensely useful right for

83:02

people, as an AI company rather than just

83:04

a research lab right before that in a

83:06

sense and I think in some ways when I

83:08

drive V12, I'm like, oh, it feels like

83:11

Tesla is getting close to this point

83:15

where FSD is going to be really really

83:17

useful right it's like unsupervised FSD

83:20

is going to you know transform people's

83:23

you know, driving experiences, and

83:25

it'll get to this point where Tesla's AI

83:28

products are finally in the hands of of

83:32

lots of people in a very useful way and

83:35

that to me marks kind of like this big

83:38

transformation in Tesla's history when

83:40

we look back 20 years from now you know

83:43

we'll say oh that was kind of the moment

83:46

where everything crossed over like it

83:48

wasn't it's not so much that again like

83:50

OpenAI wasn't an AI company, they were more

83:52

like a research lab MH but then when

83:55

they came out with their product it

83:56

really transformed so in a sense I look

83:58

at Tesla up until now the AI part of

84:01

Tesla it still feels like more

84:04

like lab-ish up to this point, you know, where

84:08

the the real products haven't come out

84:11

you know for for for for Millions to use

84:14

it so it just seems like we're getting

84:15

closer and closer to this pivotal point

84:18

in in Tesla's kind of History I wonder

84:20

if people if their impression will

84:23

change like we don't think of Apple as a

84:26

software company even though software

84:28

and the ecosystem that they build and

84:29

the stores and all that kind of stuff

84:31

arguably add more value than building

84:34

the laptops the phones that kind of

84:35

stuff right um I mean not just the

84:38

software but the ecosystem the software

84:41

enables you know both the cloud stuff

84:43

and the software that goes on the that

84:45

but we still think of Apple as a phone

84:48

company as a laptop company and whatnot

84:50

like the the software becomes like an

84:52

ingredient in the hardware but the

84:53

hardware is thing that you see so you

84:56

know I mean arguably Tesla's already you

84:59

know the the software content of the

85:02

cars is super high and it has all these

85:04

Network features and stuff and yeah the

85:06

world even Tesla fans they don't really

85:09

see them as qualitatively different than

85:11

other cars it's a different kind of car

85:13

and we still view it as a as a car so

85:16

even though the you know the economic

85:17

reality the company and the operational

85:19

reality the company May shift away from

85:22

being more about the car and more about

85:24

like the ecosystem and the services and

85:26

that kind of stuff I don't know that

85:29

that the like I wonder if they will

85:32

change and by extension like will

85:34

investors who you know mostly they're

85:37

ordinary people, they're not experts, yeah,

85:39

right will their perception of the

85:41

company shift it might I I think a big

85:43

part of it is going to come down to uh

85:46

like uh you know we don't think of

85:48

Amazon as a grocery store we still think

85:49

of it as an internet store right because

85:51

we went through this thing but you know

85:53

when the internet companies all took off

85:55

back in the 2000s and Amazon just became

85:57

an internet company, and, you know, Amazon is

86:00

probably way more Hardware than it is

86:01

internet at this point I mean if you put

86:03

aside the AWS part which is a very

86:06

important part of the thing you know I

86:08

mean it's delivery Vans and

86:10

warehouses you know and vast inventories

86:13

of stuff and and there's a you know

86:16

there's this other component too but we

86:17

think of it as an internet company

86:19

that's true so it'll be interesting to

86:21

see, you know, if, and what the trigger is, if

86:25

Tesla ever escapes

86:29

the car maker thing I it's not clear to

86:31

me it ever will I mean I guess Apple

86:33

being like Steve Jobs defined apple as

86:36

more of a device company that's always

86:38

been their thing I it's possible Tesla

86:41

follows in that sense where they're a

86:44

car but also a robot humanoid robot

86:46

company that type of devices in in those

86:49

ways um but yeah on Optimus I wanted to

86:54

ask ask you your latest thoughts on kind

86:57

of where Tesla's at um do you think

87:00

they're going to start some limited

87:03

production run in the next year or so or

87:05

are we still kind of a little bit

87:07

farther out than that that's a good

87:08

question I I

87:12

mean okay so I I think they're still

87:14

getting up the curve on software like

87:16

everybody's still getting up the

87:18

curve on software the thing is the

87:19

humanoid robot software stack is

87:21

evolving like the LLM stack, it's

87:23

just evolving crazy fast.

87:26

um

87:28

The reason that I thought

87:31

Tesla should make humanoid robots right

87:34

is because I see the software as

87:37

happening now you can make it happen

87:39

sooner but the underlying Tech that

87:41

makes the software possible for doing

87:43

those it's just coming we can speed it

87:45

up some but it's coming for sure right

87:47

and the the the ingredient that I

87:48

thought was missing to make humanoid

87:50

robots happen big happen soon was

87:54

that you want to be able to build them

87:56

at scale and you want to be able to

87:58

build them cheap you want good cheap

88:00

robots built at scale and I didn't see

88:03

the industrial infrastructure out there

88:04

in the world or anybody preparing at the

88:07

point that we first talked about this to

88:09

make that in infrastructure and that's

88:11

the long PLL on doing this stuff like

88:13

the software is going to happen it's

88:14

kind of I mean we're going to pull it in

88:16

now that there's a lot of interest it's

88:17

going to happen sooner than it would

88:19

have otherwise but it was going to

88:20

happen these techniques were going to be

88:22

developed right and so so was the fact

88:25

that there were no good robots out there

88:28

going to be the limiter the reason why

88:30

it didn't get adopted and you know in

88:33

2028 as opposed to 2038 became the big

88:36

year that it goes so you know to like

88:40

88:40
So when I look at it through this lens, my sense is that there are people at Tesla who very clearly understand the challenge of industrial stuff at scale, who understand that it's a problem that needs to be properly addressed for this product to really fulfill its potential, and that there's a big first-mover advantage in getting there first. Not just a first-mover advantage but a sustainable advantage, because you get there first and then you don't stop, you keep developing, you always have the best product, so you command the best margin, and you also have the platform that lets your software people move forward the most quickly. It lets you get to scale and keep the scale, because a lot of building things at scale is about harnessing the advantages of scale, and maintaining that advantage means you want to keep the lion's share of the market, because that gets you the scale to hold that position and maintain the margins associated with it. So I look at this and imagine that if Tesla sees it the same way, and a lot of what they say suggests to me that they do, then their focus right now is on getting the hardware down, building stuff and getting it out there. If it helps them get the manufacturing line up, if it helps them understand the product better so they can build a better product, so they can build the product that builds the product better, I think they'll do that.

90:07
But it's a good question. We've just seen so little of Optimus in action, so little detailed information about the way it's built, that it's tough to understand where they are in the process of industrializing it. My sense has been for some years now, and still is, that there's so much really fundamental improvement available in the underlying tech that you can keep turning that crank, and every single year the product you can make is going to be a lot better. So to some extent, timing the scale-up of the manufacturing to when the software is really useful makes sense to me, because if you build the robot a year early, you're not going to have as good a robot as you will a year later. The longer you delay scaling, the better the product design, the more you're going to know, the better the core technology. When Teslas first came out, for years the motors got better and better; sometimes the motor in your car would get better through firmware updates, because they'd figured out something new or they could change the margins. With early Teslas, if you had an early Model 3, the battery capacities would change, because they changed the software. But there's stuff you can't change without actually swapping the hardware out too, and we did see that: the motors that go in the cars today are much better than the motors that went in two years ago, five years ago, and so on, because they're still learning that kind of stuff.

91:43
So yeah, I sort of expect them not to scale until the software is fairly mature, but I expect the focus to be on scaling the industrial capacity.

91:52
I see. So, Elon has said it often takes three generations of a product before it gets really good. They're on gen two, supposedly, so maybe one more generation?

92:05
Well, was the first one a product? I would argue that Bumblebee and Bumble C weren't really, that the first Optimus wasn't really a product. They're making these test mules, essentially, where they're figuring stuff out.

Yeah, but I think they're calling it Gen 2, right?

Sure, but I think a third-generation product means the third generation that goes to customers.

92:25
I see, that's true, that could be the thing. I wonder, though, whether the internal path is developing three prototypes and then starting your first product after that. So maybe we see one more gen-3 prototype, and then we start to see some type of initial production.

92:45
A good question is: when do you get to the point where having more robots accelerates your development? This was a thing with the fleet for FSD cars: once they got to the point where they had a data-ingestion engine and data from the fleet was a major limiter on how fast it could improve, having a bigger fleet became a really big advantage. I would guess Optimus isn't at that point right now. And there's this interesting thing about gathering data. Having Optimus, the platform itself, gather data is pretty useful, to the extent you can do it efficiently. But having humans put on sensor gear and go around and do stuff is actually not an unreasonable mechanism; in certain ways it's better. If you want to do human mimicking, there are kind of two ways you can do it: you can have a human driving an Optimus, or you can have an Optimus mimic a human. Those have different strengths and weaknesses, but they're both things you want to do, and they both involve a human in the loop. So if you've got 50 operators, there's no point in having a thousand Optimi, because you can only use 50 of them at a time. If you get to the point where the software can start doing stuff on its own, then it makes sense to start scaling up, I would guess.

93:55
What do you mean by the point where the software does it on its own?

93:57
Say, for instance, that you have some basic task you can do in a factory that makes sense to do, it's economically useful, or you have some space to do it, and you can set some Optimi aside to work on that thing. Then they can autonomously gather data by doing a task repetitively with some variation. We've seen other robot efforts do this: Google had a robot lab with hundreds of robot arms that were basically doing repetitive tasks over and over, varying them to gather data. So you can do that kind of thing too. I don't know if it's compatible with the way they're trying to do the stack in Optimus right now, but if it is, it would make sense to have a thousand of them and find something for them to do. But that's a question. I think there's going to be a scale-up where they build a bunch of them and use them internally, prior to any customers getting them and them going external. So it's an interesting question when they do that, and it depends on the development path they're on and the strategic path they see. I still see FSD as a significantly more near-term product than Optimus.

95:08
Despite that, how does Tesla scale human mimicry for humanoid robots, with Optimus? Say they need both quantity of data and quality of data. You're talking about a human controlling the robot, but would it be better just to have a suit, or a bunch of sensors on the key body parts, so that you know how the human is doing it?

95:33
We know Tesla already does the direct-control thing; they've demonstrated a guy in a VR rig with hand controls that he's using to do upper-body manipulation, rearranging stuff on a table. We saw the shirt-folding thing; that was how it was being done. In fact, the shirt-folding video might have been somebody in the process of data capture: "I'm folding a shirt with Optimus." You put on your VR rig, you take direct control of an Optimus body, and then you use it to fold. This is a thing that's done; it's known to be an effective and fairly sample-efficient way of gathering data.
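To make the teleoperation idea concrete, here's a minimal sketch of what capturing demonstrations might look like. Everything here is hypothetical (the robot and VR-rig interfaces are stand-ins, not Tesla's actual APIs); the point is just the shape of the data: while the human drives the robot, you log observation/action pairs that a policy network can later be trained to imitate.

```python
import time
from dataclasses import dataclass, field

@dataclass
class Step:
    t: float
    images: list       # camera frames (stand-in values)
    joints: list       # proprioception: measured joint angles
    command: list      # the operator's commanded targets -- the "action" label

@dataclass
class Episode:
    task: str
    steps: list = field(default_factory=list)

class StubRobot:
    """Stand-in for the robot's sensor/actuator interface."""
    def read_cameras(self): return [0.0]
    def read_joints(self): return [0.0] * 28
    def apply(self, cmd): pass

class StubRig:
    """Stand-in for the VR rig the human operator wears."""
    def read_command(self): return [0.0] * 28

def record_episode(robot, rig, task, hz=30, seconds=2):
    """While the human teleoperates, log (observation, action) pairs.

    The human's commands become supervised targets: a policy is later
    trained to map the robot's observations to the actions the human chose.
    """
    ep = Episode(task)
    for i in range(hz * seconds):
        obs_cam, obs_joints = robot.read_cameras(), robot.read_joints()
        cmd = rig.read_command()       # the human stays in control throughout
        robot.apply(cmd)
        ep.steps.append(Step(i / hz, obs_cam, obs_joints, cmd))
        time.sleep(1.0 / hz)
    return ep

demo = record_episode(StubRobot(), StubRig(), task="fold_shirt")
print(len(demo.steps), "steps captured for task:", demo.task)
```

One payoff of using the real body, as discussed next, is that the logged actions are guaranteed to be achievable by the robot's own joints, which a human-worn sensor suit can't promise.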

96:12
Measuring it straight off a human, you can also do that, and people do. It has some strengths, but the exact operating constraints for a human are different, and you just have the hand targets and such; you don't have all the intermediate joint positions and that kind of stuff, so you get less data. But the data-gathering rig is a lot less expensive, so you can give it to a whole bunch of people and they can take it out into the real world: they can go down the street and pick up garbage, they can fold cardboard boxes in a UPS store, because you can just take it someplace. So there are constraints with doing it through Optimus, using the body directly, but there are advantages too, and the trade-off between the two is another one of those empirical things I was mentioning. There are going to be some trade-offs over what the right mix of approaches is.

97:02
And then there's reinforcement learning. Reinforcement learning in simulation for robots is known to work well, and in fact using reinforcement learning to train robots to mimic humans is one of the primary modalities for doing it, because robots have many more operating degrees of freedom than cars do. A robot can mimic a human action in many different ways, some of which are much preferable to others. If the goal is just "move your hand through this arc to pick this thing up," then what is your upper body doing? What is your head doing? These are all free variables. To train a robot in a sample-efficient way, you'd like to constrain all of those to something reasonable. So you have a human control the entire body, so that you capture the human's opinion of what all those things should be doing, and you make those targets too, even though they're not strictly necessary; maybe the real target is just to move the bottle over here, or to pour a drink, or something like that.
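Here's a toy illustration of that last point, a sketch of my own (not Tesla's actual training code): the task loss only constrains the hand joints, so the demonstrator's full-body pose is added as a soft auxiliary target that pins down the otherwise-free joints.

```python
import numpy as np

rng = np.random.default_rng(0)
N_JOINTS, N_FEATS, BATCH = 28, 64, 256

obs = rng.normal(size=(BATCH, N_FEATS))        # observation features (toy data)
human = rng.normal(size=(BATCH, N_JOINTS))     # demonstrator's joint positions
hand = np.arange(6)                            # pretend joints 0-5 drive the hand

W = np.zeros((N_FEATS, N_JOINTS))              # toy linear "policy"
mask = np.zeros(N_JOINTS); mask[hand] = 1.0    # selects the task-relevant joints

def losses(W):
    pred = obs @ W
    task = np.mean((pred[:, hand] - human[:, hand]) ** 2)  # what the task needs
    aux = np.mean((pred - human) ** 2)                     # what the body was doing
    return task, aux

def grad(W, lam=0.1):
    err = obs @ W - human
    g_task = obs.T @ (err * mask) * (2 / (BATCH * len(hand)))
    g_aux = obs.T @ err * (2 / (BATCH * N_JOINTS))
    return g_task + lam * g_aux    # lam trades task fit vs. full-body imitation

for _ in range(200):               # plain gradient descent on the combined loss
    W -= 0.01 * grad(W)

t, a = losses(W)
print(f"task loss {t:.3f}, full-body auxiliary loss {a:.3f}")
```

With `lam = 0`, the non-hand joints would be completely unconstrained; the auxiliary term is what keeps the rest of the body doing something human-like.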

98:02
The reality is that most of these processes get used in various combinations, because they all bring something to the table. As we were talking about, you've got pre-training, then instruct training, and then RLHF and other things that now get used in training large language models, and there are many other stages we're not mentioning. It's not a matter of which one is best; you use all of them to the degree that they contribute to a rapid, reliable solution.

98:36
So do you think, for Tesla to get a product out to people, they need to scale up the data, whether it's human mimicry or whatever? Unless you're doing some specialized factory task, I guess, but even then, if it's so specialized, why do you need a humanoid robot? There needs to be a need for something more generalized.

99:03
A lot of great progress is being made in robotics right now without a huge fleet of robots. But there's the scale thing we were talking about before: scale just wins, if you can scale. For scale to win with Optimus, though, you have to have a wide variety of real-world tasks you're deploying Optimus into, where it's operating either without human supervision, or where the humans supervising it would have been doing the task anyway, because the cost of paying 10,000 people to stand around operating Optimus eight hours a day is super burdensome.

99:48
And more importantly, one of the advantages of operating in the real world is that you want to take advantage of the complexity and entropy of the world. If you've got 10,000 Optimi and they're all standing in white cubicles that are basically the same, just moving the same blocks around, you're missing that: part of the benefit of operating in the world is the long tail of contexts and properties. If you give an Optimus to 10,000 people and tell them, hey, go use this on your farm, try to use it as a carpenter or whatnot, and you find people who are enthusiastic about investing their own time and maybe finding something useful for it to do, now what you're doing is harnessing the variety of all these different people thinking about this stuff in all the different settings and environments. That's where the data really starts to pay off. Having a ton of Optimi in a factory, all in approximately the same context doing approximately the same thing, is not nearly as valuable as having lots of different contexts, because that's what the cars get: each car is serving a different master, doing a different set of tasks on a different set of roads, at different times of day, in different weather and whatnot. So the data it gathers brings all of that variety in, and that variety is really useful for training these things.

101:00
Mm-hmm. That reminds me: Tesla recently had a job listing for ten or so prototype vehicle drivers, in different cities all over the US. Why do you think they need that?

101:13
I assumed it was because they were checking out V12 and maybe gathering training data. You get two things out of having a driver in, say, Adelaide, Australia. One is you get to see whether there's anything weird about Adelaide that breaks what we're doing, that we should be paying attention to. And you get to gather data from Adelaide; as I was saying, variety. Different countries just have things that are different, and they have different driving cultures.

101:40
When Brad Ferguson went to New York, he noticed that FSD was driving like a New Yorker. Humans change their driving behavior depending on context, and some of that is cultural: you drive in Brazil, you drive in Italy, and then you go drive in England or Germany, and the driving cultures, the way people behave, are really different. So just being in those environments and gathering data on the local driving culture can also be useful.

102:05
Why can't they just use their own 100-rated, perfect-safety-score drivers in those different cities instead? Why do they need to hire separate drivers, do you think?

102:17
I wouldn't say they necessarily do.

102:20
Say you want to run a stack that you're not confident is safe yet, and you want to give it control of the vehicle. The first thing I said was: is there something in Adelaide that breaks our current stack? Imagine you wanted to go test V12 but you were four months away from being able to roll it out. You can go there and test it to see whether there are any big problems with it, without taking the risk of giving it to a retail customer, and you can put it on a single vehicle. If you have a professional driver you're paying, you get a ton of data in a small period of time, and you choose the data: you can tell them, we want data from this situation, go there and do this, now go to this other one. Like the drivers doing Chuck's UPL, the unprotected left.

103:03
Sure. Yeah, interesting. LLMs, then: you talked about LLMs a little bit. What's going on, where is this all headed, in the bigger LLM picture? We have OpenAI, who just released an update for GPT-4.

Yeah, we'll see what the capabilities are.

But then Claude Opus has been destroying GPT, at least in my personal use.

It's beating it on benchmarks, and in a lot of people's personal experience. That may be the reason we're getting this GPT-4 Turbo update, because one of OpenAI's points of pride is that they've managed to stay atop the leaderboard quite comfortably for a long time with GPT-4.

103:49
I mean, does this change the LLM game, to have Anthropic being able to challenge OpenAI, at least at this point?

104:02
Well, everybody likes a horse race, and that's why the horse-race aspects of this stuff get played up. It's the game newspaper reporters want to report on, the game the bystanders want; it gets more exciting when the two horses are nose to nose. Does it change the long-term dynamics of the market in an important way? From a technical standpoint, I don't think it does. From a regulatory standpoint, and in terms of the perception of the markets and the breadth of the willingness of a wide range of people to get involved, I think it might, because it'll change people's perceptions, and it might have an impact on the outcomes because it changes people's perceptions. But I think most of it is that people just like a race, and that's part of it. You know, Mixtral 8x22B came out, I don't know if you saw that, yesterday. I'm going to download it tonight.

105:03
So that might be the first GPT-4-class open source model.

105:09
Yeah, and that would be exciting. I doubt it's GPT-4 class, but we'll see. Well, there's a range of GPT-4s, because of the current Turbo. From a benchmark standpoint, it's interesting how the performance on benchmarks and the performance in people's experience have kind of diverged over time. The Turbos, the later versions of GPT-4, continue to get better on benchmarks, but a lot of heavy users' perception is that its performance on the jobs they're doing has degraded. That's really interesting, and I think one of the reasons there's so much enthusiasm about the stuff I'm seeing is that a lot of heavy users, people who are building applications around this, were delighted that Claude Opus wasn't having the problems they'd been experiencing with GPT-4 as they felt it degraded. Now, it's hard to know how much of this is anecdotal and how much represents the real experience of everybody using the tools. Certainly having competitors out there gives you something to compare to, so you get alternatives; having other models out there is definitely good for the field. And it's about time, if you look at the rate of improvement in the open-source models, for us to be getting there. Databricks had a model come out, another mixture of experts, a 4-of-16 mixture of experts, around 130 to 150 billion parameters, that kind of scale, super performant on industrial workloads. We're starting to see the ecosystem diversify, where the people building models have certain types of workloads, certain kinds of applications in mind, so the models can get good at those without necessarily showing up on the benchmarks, though the people who work in that space, on that set of applications, notice.
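As an aside, here's a toy sketch of the "4-of-16" mixture-of-experts idea being described (dimensions are made up for illustration; this is not DBRX's or Mixtral's actual code): a router scores all experts for each token, only the top-k actually execute, and their outputs are blended with the router's renormalized weights. Total parameters scale with the number of experts, while compute scales only with the number that run.

```python
import numpy as np

rng = np.random.default_rng(0)
D, N_EXPERTS, TOP_K = 32, 16, 4            # toy sizes, not a real model's

router_w = rng.normal(size=(D, N_EXPERTS)) * 0.1
experts = [rng.normal(size=(D, D)) * 0.1 for _ in range(N_EXPERTS)]  # toy "FFNs"

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_layer(token):
    scores = softmax(token @ router_w)      # router's affinity for each expert
    chosen = np.argsort(scores)[-TOP_K:]    # only these 4 of 16 experts run
    gates = scores[chosen] / scores[chosen].sum()
    out = np.zeros(D)
    for g, idx in zip(gates, chosen):
        out += g * np.tanh(token @ experts[idx])
    return out

print(moe_layer(rng.normal(size=D)).shape)  # (32,): 16 experts' worth of params,
                                            # 4 experts' worth of compute
```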

107:23
And there's Command R+, which is out, another big open source model that came out in the last couple of weeks. It's optimized for doing RAG applications, back-office type stuff where you use it in agentic ways: you wrap an agent wrapper around it, and it's been specifically trained for those modalities. So we don't really know how good it is, because it's not optimized for the kinds of things on the benchmarks, where it does fine for its size.

107:57
But as Andrew Ng has been pointing out a lot recently, if you wrap a model in an agent you get much higher performance on the same set of tasks, with somewhat lower reliability, and people are gradually figuring out how to deal with that. You can get GPT-4 performance from 7-billion-parameter models if you wrap a good agent around them and direct them at a particular task, as in the sketch below.
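A minimal sketch of that agent-wrapper pattern (my illustration of the general idea, not any specific framework): instead of one-shot prompting, the same small model drafts an answer, critiques its own draft, and revises until the critique passes. `call_model` is a stand-in for whatever completion endpoint you have, local or hosted.

```python
def call_model(prompt: str) -> str:
    """Stand-in for a real completion call to a small local/hosted model."""
    return "PASS" if "Critique" in prompt else "draft answer"

def agent_answer(task: str, max_rounds: int = 3) -> str:
    draft = call_model(f"Task: {task}\nAnswer step by step.")
    for _ in range(max_rounds):
        critique = call_model(
            f"Critique this answer to '{task}'. Reply PASS if it is "
            f"correct and complete.\nAnswer: {draft}"
        )
        if critique.strip().startswith("PASS"):
            break          # the model judges its own draft acceptable
        draft = call_model(
            f"Task: {task}\nPrevious answer: {draft}\n"
            f"Critique: {critique}\nWrite an improved answer."
        )
    return draft

print(agent_answer("Summarize the trade-offs of mixture-of-experts models"))
```

Each extra model call buys quality at the cost of latency and some reliability, which matches the trade-off described above.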

108:20
So it's really exciting to think about what people building agent wrappers around a 150-billion-parameter model, maybe not quite GPT-4 class but getting close, are going to be able to do. And it's an open source model; it's decentralizing the power structure, the knowledge base.

108:41
Yeah, I think it's actually huge, open source getting to GPT-4 level. I think this year we'll get there. It seems like Mistral will probably deliver something; they've been really impressive.

Yeah, they've been super impressive.

108:52
It just seems significant, because the GPT-4 level is kind of the benchmark where LLMs start to get really useful for a lot of things. And once you get that open-sourced, the cost to access that intelligence just drops like crazy, because you can basically download it and run it on your computer, or eventually it will be shrunk down to run locally on different devices. The cost to access that base level of intelligence will basically become negligible.

109:35
I'm impressed with what people demo running on iPhones these days. Apple has this group of researchers inside who developed a platform called MLX, which is basically like CUDA for Apple silicon with a PyTorch-like layer built on top of it. The new Mixtral model came out, they literally just released it for download, and like three hours later people had it running optimized on Apple silicon under MLX. It's designed to make bringing models in easy and performant.
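For a feel of what that looks like, here's a tiny MLX snippet (a minimal sketch assuming `pip install mlx` on an Apple-silicon Mac; check the MLX docs for current APIs). The PyTorch-like flavor is the point: arrays live in unified memory shared by the CPU and GPU, and computation is lazy until you evaluate.

```python
import mlx.core as mx
import mlx.nn as nn

layer = nn.Linear(512, 512)        # weights in unified CPU/GPU memory
x = mx.random.normal((8, 512))     # a toy batch of activations
y = nn.relu(layer(x))              # builds a lazy compute graph
mx.eval(y)                         # forces evaluation
print(y.shape)
```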

So people building on that platform can map it to iPhones and that kind of stuff, and there's a pretty good ecosystem of demos out there where people are taking Whisper and various other models and demonstrating what you can do by quantizing them. Apple themselves released a paper last year that was all about how you change the design of a transformer so you can run it out of flash: you don't even have to load it into DRAM, you just keep most of the weights in flash, and it runs at full speed off the CPU. Over the next year or two we're going to see a lot of performance coming into these small portable devices you can carry around.
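A back-of-envelope illustration of why quantization matters on phones (generic symmetric int8 quantization with one scale per row, my own example rather than any particular runtime's scheme): it cuts weight memory roughly 4x versus float32 at a small reconstruction error.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4096, 4096)).astype(np.float32)   # a float32 weight matrix

scale = np.abs(W).max(axis=1, keepdims=True) / 127.0   # per-row scale factor
W_q = np.clip(np.round(W / scale), -127, 127).astype(np.int8)

W_hat = W_q.astype(np.float32) * scale                 # dequantize on the fly
print(f"float32: {W.nbytes / 2**20:.0f} MiB, int8: {W_q.nbytes / 2**20:.0f} MiB")
print(f"mean abs reconstruction error: {np.abs(W - W_hat).mean():.5f}")
```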

110:55
Yeah, definitely. And it's about time, because Siri sucks.

Yeah. I think Apple's finally going to announce something at WWDC this year. It's so disappointing that it's taken them as long as it has, but they'll do something.

111:07
Sora: what's your take on OpenAI's text-to-video engine, Sora?

111:12
It's a really cool demonstration. I think it's a straightforward extrapolation of the trends we've been seeing, taken to video. It's a cool tool, and it'll be great when more people get to use it. You're still not going to just dump a prompt into Sora and get a movie out; it's a point on the continuum, a nice step forward. But I think it's the kind of place you would have gotten to by throwing a lot of compute at the problem. One of the things about all of this is that you see this arc of progress, and at every point along the arc there's a part where you go, right in line, that's what we expected, but then there's always this "ah, we haven't hit the plateau yet." That's kind of how I feel about Sora: yes, these methods continue to scale, they're going to keep getting better, and the capabilities themselves are kind of in line with the trend.

112:19
It just seems like, with Sora, it's super impressive to me, but I think it's taking a lot of compute to run that thing, and it's not cheap. It's a proof of concept: it shows what's possible, and people are going to be able to do similar things with different methods, and it'll get a lot cheaper over time, and the capabilities will grow, but it'll take some time.

112:45
Yeah. The difference between demoing something and making it economical to give to customers can be pretty large with these things, and I think OpenAI themselves have said they need time to get it to where they can offer it at a reasonable price.

113:02
We're almost wrapped up with our two hours here, here in Austin. How was your eclipse viewing? Where did you see it?

113:08
Terrible. I'm so miserable.

Oh no.

113:12
I saw the 2017 one and it was mind-blowing, so I was super excited about this one. We ended up going out to Kerrville, because I'd looked at the map ahead of time and I was prepared to go wherever. It ended up being a toss-up: you've just got these banks of clouds coming through, and are you going to get lucky and be between two clouds during totality? We picked Kerrville because, the day before, it looked like it had the best odds, and we just totally struck out. It's still cool to be underneath the thing, to see the sky go dark and hear the animals all change; it's definitely interesting. I don't regret going, and I don't feel like we made any bad decisions: going back and looking at the data, it was still the best shot. We just struck out. I was inconsolable the whole day, I was so bummed out.

114:02
Wait, so the clouds were over it the whole time?

114:04
No, no, not the whole time, just during totality. We had the whole thing where you'd see it in between the clouds. There were these couple of different layers of clouds moving back and forth, and they had holes between them, and occasionally you'd get a good view for a few seconds or a minute or something like that. But just before totality this huge thick bank came in, and we didn't get anything. I couldn't even look at eclipse pictures online for like 24 hours, I was so bummed out.

That's funny. That's terrible, it's too bad.

114:38
Well, I've got to tell people: if you've never seen one, it is so worth it. It's just a really incredible experience to be out in an open space under a blue sky and watch the Moon move in front of the Sun. It will change the way you see the world.

114:56
Yeah, definitely. My kids were really into it: I had them watch a bunch of videos, we bought some books on solar eclipses, and they were just so excited. It's fun. A bummer, but now you know.

115:10
So this morning I was like, where's the next one? Australia is going to get a lot of them over the next decade or so; maybe we're going to be going to Australia. I really want to see another one.

115:23
All right, James, thanks for hanging out. This was fun.

Yeah, we'll talk again, hopefully soon. All right, see you guys. Bye.