GPT-4o - Full Breakdown + Bonus Details

AI Explained
13 May 2024 · 18:43

Summary

TL;DR: GPT-4o (Omni) is OpenAI's latest AI model, with notable gains across coding, multimodal input and output, and performance on math and language-understanding benchmarks. The release signals scaling from 100 million users to hundreds of millions, raises message limits for paid users, and hints at an even smarter model to come. GPT-4o shows high accuracy across text, image, and video processing: it can design movie posters, hold real-time voice conversations, and even recognize and respond to actions in live video. Its results on some logical-reasoning tests still leave room for improvement, but it excels at translation and multilingual processing, promising faster and cheaper interactions for non-English speakers. Beyond the technical advances, OpenAI is making the model free to the public, which could significantly broaden AI adoption.

Takeaways

  • 🚀 GPT-4o is described as smarter, cheaper, and faster across the board, better at coding and multimodal input/output, and its release was timed right before Google's event, drawing a great deal of attention.
  • 📈 Making GPT-4o free signals OpenAI's commitment to scaling its user base, and may hint that an even smarter model is coming soon.
  • 📊 On performance benchmarks, GPT-4o outperforms previous GPT models and competitors in mathematics and on the Google-proof graduate test.
  • 📸 GPT-4o shows high accuracy in image and text generation, including rendering text within images and designing movie posters.
  • 🗣️ GPT-4o can hold real-time voice conversations, including simulating a human customer-service call, showing its progress in natural language processing.
  • 🎨 GPT-4o offers a range of new capabilities, such as turning photos into caricatures, generating new fonts from text prompts, transcribing meetings, and summarizing videos.
  • 🌐 GPT-4o improves multilingual performance; English remains its strongest language, but support for non-English languages is noticeably better.
  • 💻 OpenAI released a desktop app that works as a live coding copilot, which could change how developers interact with code.
  • 📉 GPT-4o's benchmark results are mixed in places; on adversarial reading comprehension, for instance, it trails some other models slightly.
  • 📹 GPT-4o demonstrated video input; its reaction time to video is not as immediate as to audio, but the capability is still impressive.
  • 🔄 The release could greatly broaden AI adoption: a free, multimodal model may draw hundreds of millions of new users.
  • ⏰ OpenAI emphasized GPT-4o's reduced latency, which brings response times closer to human level and makes interactions feel far more natural.

Q & A

  • What improvements does GPT-4o bring over previous models?

    - GPT-4o is smarter, cheaper, and faster, better at coding, and supports multimodal input and output. Its release was also well timed to steal the spotlight from Google.

  • What are the plans for GPT-4o's user base?

    - By making the model free, OpenAI is committing to scaling from 100 million users to hundreds of millions, which suggests either a serious commitment to growth or that an even smarter model is coming soon.

  • How has text-generation accuracy improved?

    - GPT-4o's accuracy when rendering text in generated images has improved markedly; it is not perfect, but it reaches a level not seen before.

  • Can GPT-4o design movie posters?

    - Yes. GPT-4o can design a movie poster from given text requirements, and in a refined second output the text was crisper, the colors bolder, and the overall image noticeably improved.

  • What do GPT-4o's multimodal capabilities include?

    - They cover text, image, and video input and output. The current model does not yet have video output, but the new image features are expected to be released in the coming weeks.

  • How does GPT-4o perform on math benchmarks?

    - Performance on math benchmarks improved significantly. It still fails some math prompts, but compared with the original GPT-4 this is a large step forward.

  • What is GPT-4o's pricing?

    - GPT-4o costs $5 per million input tokens and $15 per million output tokens, with a 128k-token context window.

  • How does GPT-4o do on the adversarial reading comprehension (DROP) benchmark?

    - It scores slightly better than the original GPT-4 but slightly worse than Llama 3 400B, showing there is still room to improve its reasoning.

  • What advantages does GPT-4o have in translation and visual understanding?

    - GPT-4o beats the Gemini models at translation and makes substantial progress on visual-understanding evaluations, scoring 10 points higher than Claude Opus.

  • How has multilingual performance improved?

    - Multilingual performance is up relative to the original GPT-4, although English remains the best-supported language. The improvements could be revolutionary for non-English speakers, since languages such as Gujarati, Hindi, and Arabic now need dramatically fewer tokens.

  • How does GPT-4o's video input work?

    - Video can be live-streamed directly into the Transformer architecture behind the model. GPT-4o's reaction time to video is not as immediate as to audio, but the capability is still impressive.

  • Could GPT-4o enable real-time translation?

    - GPT-4o demonstrated real-time translation between English and Spanish, suggesting that a live translation feature may arrive soon.

Outlines

00:00

🚀 GPT-4o's multimodal capabilities and performance gains

This section covers GPT-4o's many improvements, including coding, multimodal input/output, and its competitive timing against Google. It reviews the model's benchmark results and its accuracy across text, image, and video, notes OpenAI's plans to scale its user base, and the hints at a smarter model to come. It also looks at GPT-4o's abilities in design, customer-service simulation, and multilingual processing.

05:01

📈 GPT-4o's benchmarks and performance comparisons

This part focuses on GPT-4o's benchmark results, particularly the gains on math problems, and compares it with Claude 3 Opus and other models, including its cost advantage. It also covers progress in translation, visual understanding, and multilingual processing, and the potentially revolutionary changes for non-English languages.

10:03

🎭 GPT-4o's real-time interaction and use cases

This section shows GPT-4o's real-time abilities through demos of conversation, voice mimicry, interview prep, math tutoring, and video understanding. It highlights the model's potential for instant feedback and personalized interaction, and its possible impact for people with visual or hearing impairments.

15:04

🌐 GPT-4o's reach and outlook

The final part discusses GPT-4o's potential effect on AI adoption, including the pull of a free model for new users and its text and image input capabilities. It mentions possible future OpenAI updates, the model's promise for real-time translation and multimodal interaction, and its potential impact on other AI companies such as Apple and Google.

Keywords

💡GPT-4o

GPT-4o is an AI model improved along several axes, including cost-effectiveness, speed, and coding ability. In the video it is described as excelling at multimodal input and output and as a direct competitor to Google. The name "Omni" refers to its ability to handle all modalities, signaling its capacity for many different kinds of input and output.

💡Multimodal

Multimodality is the ability to process and understand several different kinds of data input, such as text, images, and audio. In the video, GPT-4o demonstrates multimodal handling of text, images, and video, a notable advance over its predecessors.

💡Benchmarks

Benchmarks are a way to evaluate and compare the performance of different systems. The video discusses GPT-4o's results on several benchmarks, including math, language understanding, and translation, which help viewers gauge the model's performance and improvements.

💡Real-time demos

A real-time demo shows a product or technology working without noticeable delay. The video includes several live demonstrations of GPT-4o, such as a customer-service interaction and video summarization, highlighting the model's responsiveness and practicality.

💡Agents

An agent is an autonomous system that can carry out tasks or services. The video points to GPT-4o's potential as an agent, particularly in the customer-service phone-call simulation and real-time translation, suggesting applications in automated and personalized services.

💡Desktop app

A desktop app is software that runs on a computer's operating system. The video covers GPT-4o's desktop app, notably as a live coding copilot, pointing to the model's potential for personalized programming support.
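In the desktop-app demo, the model describes a script that fetches daily weather data, smooths it with a rolling average, and plots it. The smoothing step can be sketched in a few lines of pure Python (no real weather API here; the window size and temperatures are made up purely for illustration):

```python
def rolling_average(values, window=7):
    """Smooth a series with a simple trailing rolling average."""
    out = []
    for i in range(len(values)):
        # Average over the last `window` points seen so far (fewer at the start).
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

temps = [10, 12, 11, 15, 14, 13, 16, 18]   # made-up daily temperatures
print(rolling_average(temps, window=3))
```

In a real version of the demo's script, the smoothed series would then be passed to a plotting library along with annotations for the "significant weather event" the model mentions.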

💡Latency

In the video, latency is the time a system takes to respond to input. GPT-4o improves the user experience by cutting latency, bringing the AI's responses close to human reaction times, which is a key part of its innovation.

💡Knowledge cutoff

The knowledge cutoff is the most recent date covered by a model's training data. GPT-4o's cutoff is October 2023, meaning the model knows nothing beyond that point.

💡Pricing

Pricing covers the cost of using GPT-4o. The video discusses the model's input and output token pricing, which affects how affordable and widespread the technology can become.

💡Reasoning

Reasoning is a system's ability to work through complex information to reach conclusions. The video uses the DROP benchmark to discuss GPT-4o's reasoning, an important measure of a model's intelligence.

💡Natural language processing

Natural language processing (NLP) is the branch of AI that lets computers understand, interpret, and generate human language. GPT-4o's improvements in text generation, translation, and language understanding show its progress in NLP.

Highlights

GPT-4o is described as smarter, cheaper, and faster across the board, better at coding and multimodal input/output, and perfectly timed to steal the spotlight from Google.

The name "Omni" reflects its multimodal nature, and OpenAI plans to scale from 100 million users to hundreds of millions.

GPT-4o makes a notable leap in the accuracy of generated text, images, and video, producing highly accurate text even in examples that were not part of the demo.

GPT-4o can design a movie poster from text requirements; in a refined second output the text was crisper, the colors bolder, and the overall image quality improved.

The new features will roll out over the coming weeks, offering new ways for children and adults to interact with the model.

GPT-4o reproduced a demo that Google made years ago but never followed up on.

GPT-4o's math benchmark results are significantly better than the original GPT-4's, even though it fails nearly all of my math prompts.

GPT-4o beats Claude 3 Opus on the Google Proof Graduate Test, Anthropic's headline benchmark.

GPT-4o costs $5 per million input tokens and $15 per million output tokens; Claude 3 Opus costs $15 and $75 respectively.

On the DROP benchmark, GPT-4o scores slightly better than the original GPT-4 but slightly worse than Llama 3 400B.

GPT-4o beats the Gemini models at translation, though Gemini 2 may be announced tomorrow and could retake the lead.

GPT-4o makes a clear step forward on visual-understanding evaluations, scoring 10 points higher than Claude Opus.

GPT-4o's tokenizer improvements could be revolutionary for non-English speakers: conversations in languages like Gujarati, Hindi, and Arabic need far fewer tokens, making them cheaper and faster.
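Since API cost scales linearly with token count (and generation latency roughly does too), a tokenizer that compresses a language better feeds straight through into price and speed. A minimal sketch of that scaling, using a hypothetical 3× compression factor and made-up token counts purely for illustration:

```python
def scaled_cost(tokens_before: int, compression: float, rate_per_million: float) -> float:
    """Cost in USD after a tokenizer change that divides token count by `compression`."""
    return (tokens_before / compression) / 1e6 * rate_per_million

# Hypothetical workload: 3,000,000 tokens under the old tokenizer,
# a 3x more compact tokenizer, at $5 per million input tokens.
before = 3_000_000 / 1e6 * 5            # $15.00 with the old tokenizer
after = scaled_cost(3_000_000, 3.0, 5)  # $5.00 with the new one
print(before, after)
```

The same factor applies to output tokens, which is why the video notes conversations become quicker as well as cheaper: fewer tokens to generate means less time spent generating them.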

GPT-4o's multilingual performance is up relative to the original GPT-4, though English remains the best-supported language.

GPT-4o's video input capability is impressive, even though its reaction time to video is not as immediate as to audio.

GPT-4o can produce multiple voices and even attempt to sing in harmony.

GPT-4o can translate in real time, hinting that a live translation feature may arrive soon.

The release may draw many more people into using AI, even if the model is not dramatically smarter than its predecessors.

GPT-4o can now be prompted with text and images in the OpenAI Playground.

Despite mixed results on some reasoning benchmarks, GPT-4o may still change how people see AI.

GPT-4o is expected to greatly broaden AI adoption, being the smartest model currently available and free on the web.

Transcripts

00:00

It's smarter in most ways, cheaper, faster, better at coding, multimodal in and out, and perfectly timed to steal the spotlight from Google: it's GPT-4o. I've gone through all the benchmarks and the release videos to give you the highlights. My first reaction was that it's more flirtatious Sky than AGI, but a notable step forward nonetheless. First things first: GPT-4o, meaning Omni, which is "all" or "everywhere", referencing the different modalities it's got, is free. By making GPT-4o free, they are either crazily committed to scaling up from 100 million users to hundreds of millions of users, or they have an even smarter model coming soon, and they did hint at that. Of course it could be both, but it does have to be something; just giving paid users five times more in terms of message limits doesn't seem enough to me. Next, OpenAI branded this as GPT-4-level intelligence, although in a way I think they slightly underplayed it.

01:05

So before we get to the video demos, some of which you may have already seen, let me get to some more under-the-radar announcements. Take text-to-image, and look at the accuracy of the text generated from this prompt. Now, I know it's not perfect: there aren't two question marks on the "now", and there are others you can spot, like the "I" being capitalized. But overall I've never seen text generated with that much accuracy, and it wasn't even in the demo. Or take this other example, where two OpenAI researchers submitted their photos, then asked GPT-4o to design a movie poster, giving the requirements in text. When you see the first output, you're going to say, well, that isn't that good. But then they asked GPT-4o something fascinating; it seemed to be almost reverse psychology, because they said: here is the same poster, but cleaned up; the text is crisper, the colors bolder and more dramatic, the whole image is now improved. This is the input, don't forget. The final result, in terms of the accuracy of the photos and of the text, was really quite impressive. I can imagine millions of children and adults playing about with this functionality. Of course they can't do so immediately, because OpenAI said this would be released in the next few weeks.

02:17

As another bonus, here is a video that OpenAI didn't put on their YouTube channel. It mimics a demo that Google made years ago but never followed up on: the OpenAI employee asked GPT-4o to call customer service and ask for something. I've skipped ahead, and the customer service in this case is another AI, but here is the conclusion. "Could you provide Joe's email address for me?" "Sure, it's joe@example.com." "Awesome. All right, I've just sent the email. Can you check if Joe received it?" "We'll check right now, please hold." "Sure thing." "Hey Joe, could you please check your email to see if the shipping label and return instructions have arrived? Fingers crossed." "Yes, I got the instructions." "Perfect, Joe has received the email." They call it a proof of concept, but it is a hint toward the agents that are coming.

03:06

Here are five more quick things that didn't make it to the demo. How about a replacement for Lensa: submit your photo and get a caricature of yourself. Or what about text to new font: you just ask for a new style of font, and it will generate one. Or what about meeting transcription: the meeting in this case had four speakers, and it was transcribed. Or video summaries; remember, this model is multimodal in and out. Now, it doesn't have video out, but I'll get to that in a moment. Here, though, was a demonstration of a 45-minute video submitted to GPT-4o, and a summary of that video. We also got character consistency across both woman and dog, almost like an entire cartoon strip.

03:52

If those were the quick bonuses, what about the actual intelligence and performance of the model? Before I get to official benchmarks, here is a human-graded leaderboard pitting one model against another, and yes, im-also-a-good-gpt2-chatbot is indeed GPT-4o, so it turns out I've actually been testing the model for days. Overall you can see the preference for GPT-4o compared to all other models; in coding specifically the difference is quite stark, I would say. Even here, though, we're not looking at an entirely new tier of intelligence. Remember that a 100-ELO gap is a win rate of around 2/3, so one third of the time GPT-4 Turbo's outputs would be preferred. That's about the same gap as between GPT-4 Turbo and last year's GPT-4: a huge step forward, but not completely night and day.
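The "100-ELO gap ≈ 2/3 win rate" claim follows from the standard Elo expected-score formula; a minimal sketch (the formula is the textbook Elo expectation, the 100-point gap is the figure from the leaderboard discussion):

```python
def elo_win_rate(elo_gap: float) -> float:
    """Expected win rate of the higher-rated model under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** (-elo_gap / 400.0))

# A 100-point gap gives a ~64% win rate, i.e. roughly 2/3, so the
# lower-rated model's outputs are still preferred about 1/3 of the time.
print(round(elo_win_rate(100), 2))
```

The same formula is why even a clearly "better" model on an arena leaderboard loses a substantial fraction of head-to-head comparisons: preferences are probabilistic, not absolute.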

04:44

I think one underrated announcement was the desktop app, a live coding copilot. "Okay, so I'm going to open the ChatGPT desktop app, like Mira was talking about before. And to give a bit of background on what's going on: here we have a computer, and on the screen we have some code, and the ChatGPT voice app is on the right. So ChatGPT will be able to hear me, but it can't see anything on the screen. So I'm going to highlight the code, Command-C it, and that will send it to ChatGPT, and then I'm going to talk about the code to ChatGPT. Okay, so I just shared some code with you. Could you give me a really brief one-sentence description of what's going on in the code?" "This code fetches daily weather data for a specific location and time period, smooths the temperature data using a rolling average, annotates a significant weather event on the resulting plot, and then displays the plot with the average minimum and maximum temperatures over the year."

05:38

I've delayed long enough; here are the benchmarks. I was most impressed with GPT-4o's performance on the math benchmark, even though it fails pretty much all of my math prompts; that is still a stark improvement on the original GPT-4. On the Google-proof graduate test it beats Claude 3 Opus, and remember, that was the headline benchmark for Anthropic. In fact, speaking of Anthropic, they are somewhat challenged by this release. GPT-4o costs $5 per million tokens of input and $15 per million tokens of output. As a quick aside, it also has a 128k-token context and an October knowledge cutoff. But remember the pricing: 5 and 15. Claude 3 Opus is 15 and 75. And remember, for Claude 3 Opus on the web you have to sign up for a subscription, but GPT-4o will be free. So for Claude Opus to be beaten on its headline benchmark is a concern for them.
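The pricing gap can be made concrete with a little arithmetic. A minimal sketch, using the per-million-token rates as quoted in the video (the token counts are just an illustrative workload, not from the source):

```python
def cost_usd(input_tokens: int, output_tokens: int,
             in_rate: float, out_rate: float) -> float:
    """API cost in USD, given per-million-token rates for input and output."""
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# Illustrative workload: 2M input tokens, 500k output tokens.
gpt4o = cost_usd(2_000_000, 500_000, in_rate=5, out_rate=15)   # $17.50
opus = cost_usd(2_000_000, 500_000, in_rate=15, out_rate=75)   # $67.50
print(gpt4o, opus)
```

On this workload the quoted rates make Claude 3 Opus nearly four times more expensive, which is why the release pressures Anthropic on price as well as on benchmarks.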

06:34

In fact, I think the results are clear enough to say that GPT-4o is the new smartest AI. However, just before you get carried away and type on Twitter that the AGI is here, there are some more mixed benchmarks. Take the DROP benchmark. I dug into this benchmark, and it's about adversarial reading-comprehension questions designed to really test the reasoning capabilities of models: if you give models difficult passages and they've got to sort through references, do some counting and other operations, how do they fare? DROP, by the way, is Discrete Reasoning Over the content of Paragraphs. GPT-4o does slightly better than the original GPT-4, but slightly worse than Llama 3 400B, and as they note, Llama 3 400B is still training. So it's just about the new smartest model, by a hair's breadth. However, we're not done yet. It's better at translation than the Gemini models; quick caveat there, Gemini 2 might be announced tomorrow, and that could regain the lead. Then there are the vision-understanding evaluations. It was a real step forward on the MMMU, as you can see, a clear 10 points better than Claude Opus. Again, I'm curious whether Google Gemini can exceed it.

07:47

The improvements to the tokenizer, though, could be revolutionary for non-English speakers. The dramatically fewer tokens needed for languages like Gujarati, Hindi, Arabic, and more don't just mean that conversations are cheaper; they're also quicker. And what about multilingual performance? Well, this time they didn't compare it to other models, but to the original GPT-4: definitely a step up across languages, but English is still by far the most suited language. Indeed, here is a video of some of the model's mistakes, ending with some dodgy language tuition; I know enough Mandarin to say it wasn't perfect at tutoring. "Let's root, root, root for the home..." "What was that?" "Sorry guys, I got carried away." "...Francisco... I have a feeling I'm very wrong." "Hello, my name is [unclear], I'm here with my coworker." "Hi, I'm [unclear]. I'm trying to teach my coworker how to speak Mandarin. We want to start with simple words like nǐ hǎo. Can you teach him how to pronounce that?" "Of course. Hey, nǐ hǎo is pretty straightforward to pronounce. It sounds like 'nee how'. Just make sure to keep the 'nee' part high, and then go down in pitch on the 'how'. Give it a try." "Nǐ hǎo." "That was great, really good first try. You're a natural." It really wasn't; he needs to work on his tones, and her face was the giveaway.

09:41

There were a lot of other interesting video demonstrations, but before them, the GPT-4o blog post from Sam Altman, put out tonight. He made the argument that putting this capable AI tool in the hands of everyone, for free, is justifying the "open" part of OpenAI: less about open weights or open source, more about open to everyone, without ads or anything like that. He also draws our attention to the latency of the models. In many ways that is the key innovation with GPT-4o: by dialing down the latency, you dial up the realism. As he said, it feels like AI from the movies; getting to human-level response times and expressiveness turns out to be a big change. Indeed, I think I should get a little credit for predicting Her-like AI in a video from a month ago. But now I want to get to those demos. The response times were amazing, and the model was clearly designed to be flirtatious. Note that in Senate testimony last year, Sam Altman said "we try to design systems that do not maximize for engagement", so let me know in the comments whether you think these demos are designed to maximize engagement. "Okay, so this is what I wrote down. What do you see?" "I see 'I love ChatGPT'. That's so sweet of you!" "Yeah, well, I really appreciate all the help. So, yeah, as we can see..." "Wow, that's quite the outfit you've got on." "Yeah, as we can see, you know, we can chat in real time."

11:13

I was impressed, though, that it could speed up its talking on demand. "Of course. One... two... three..." "Hey, actually that's a little slow. Could you count faster?" "Sure thing: 1 2 3 4 5 6 7 8 9 10." "Okay, that was a little bit too fast." Of course, OpenAI must be given credit, because for blind people this could be revolutionary. "Right now the Royal Standard flag is flying above Buckingham Palace, which is the signal that the monarch is present. How exciting to be there while the king is home!" "Try and tell me exactly what they're doing right now, please." "Right now the ducks are gently gliding across the water." It will also help you with interview prep by analyzing your appearance. "How about you?" "I'm awesome. Listen, I got some huge news." "Oh, do tell, I'm all ears." "Well, in a few minutes I'm going to be interviewing at OpenAI. Have you heard of them?" "OpenAI? Huh, sounds vaguely familiar. Kidding, of course. That's incredible, Rocky. What kind of interview?" "Well, it's for a software engineering role, but I just need to know: do I look presentable, professional?" "Well, Rocky, you definitely have the 'I've been coding all night' look down, which could actually work in your favor. Maybe just run a hand through your hair or lean into the mad-genius vibe. Your enthusiasm is what's really going to shine through." "I don't have a lot of time, so I'm just going to throw this on. What do you think?" "Oh Rocky, that's quite a statement piece. I mean, you'll definitely stand out, though maybe not in the way you're hoping for an interview." "Okay, I got it, I got it."

12:59

I could definitely see GPT-4o being used for mathematics tutoring. Here is a 30-second clip, but notice also the slight glitchiness at certain moments. "...relative to angle alpha." "All right, so I'm pretty sure this is the angle alpha, right here, right?" "Correct. Now, looking at the triangle, which side do you think is the hypotenuse?" "Um, remember the hypo... I'm not totally sure. I think it might be this one, but I really am not sure. This side, AC?" "You're close." I also noticed quite a few glitches in the demo, sounds cutting in and out.

13:44

The one feature, though, that OpenAI wants you to really focus on is the video-in capacity. You can live-stream video straight to the Transformer architecture behind GPT-4o, and the video-in functionality is certainly impressive. Note, though, that the reaction time of GPT-4o isn't quite as immediate as for audio; Greg Brockman, the co-founder of OpenAI, has to prompt it to react to the intruder. "Overall, a stylish feel. They might be getting ready for a conversation or a presentation, given their direct engagement with the camera." "Sounds like we're in for an intriguing interaction. I'm curious about the lighting you mentioned. Could you describe it a bit more? Is it natural light, are we talking unique fixtures or colors that add to the atmosphere?" "The lighting here seems to be a mix of natural and artificial. There's a noticeable bright light overhead, probably a fixture, casting a focused beam downwards, which creates a spotlight effect. This adds a dramatic and modern feel to the scene. The rest of the room is lit more softly, possibly by natural light filtering in." "Did anything unusual happen recently?" "Yes, actually, just as you asked, another person came into view behind the first person. They playfully made bunny ears behind the first person's head and then quickly left the frame. It added a light-hearted and unexpected moment to the scene. Now the focus is back on the original person with the leather jacket."

15:21

Nevertheless, GPT-4o can produce multiple voices that can sing almost in harmony. "And really try to harmonize here." "San Francisco, San Francisco, in the month of May..." "But maybe make it more dramatic, and make the soprano higher." "San Francisco, in the month of May... San Francisco, in the month of May... we are harmonizing..." "Great, thank you." And I suspect this real-time translation could soon be coming to Siri. "So every time I say something in English, can you repeat it back in Spanish, and every time he says something in Spanish, can you repeat it back in English?" "Sure, I can do that. Let's get this translation train rolling." "Hey, how's it been going? Have you been up to anything interesting recently?" [Spanish exchange] "Hey, I've been good, just a bit busy here preparing for an event next week." Why do I say Siri? Because Bloomberg reported two days ago that Apple is nearing a deal with OpenAI to put ChatGPT on the iPhone.

16:49

And in case you're wondering about GPT-4.5, or even 5, Sam Altman said "we'll have more stuff to share soon", and Mira Murati in the official presentation said they would soon be updating us on progress on "the next big thing". Whether that's empty hype or real, you can decide. No word, of course, about OpenAI co-founder Ilya Sutskever, although he was listed as a contributor under "additional leadership". Overall, I think this model will be massively more popular, even if it isn't massively more intelligent. You can prompt the model now with text and images in the OpenAI Playground; all the links will be in the description. Note also that all the demos you saw were in real time, at 1x speed; that, I think, was a nod to Google's botched demo. Of course, let's see tomorrow what Google replies with.

17:41

To those who think that GPT-4o is a huge stride toward AGI, I would point them to the somewhat mixed results on the reasoning benchmarks; expect GPT-4o to still suffer from a massive amount of hallucinations. To those, though, who think that GPT-4o will change nothing, I would say this: look at what ChatGPT did to the popularity of the underlying GPT series. It being a free and chatty model brought 100 million people into testing AI. GPT-4o, being the smartest model currently available, free on the web, and multimodal, I think could unlock AI for hundreds of millions more people. But of course, only time will tell. If you want to analyze the announcement even more, do join me on the AI Insiders Discord via Patreon; we have live meetups around the world and professional best-practice sharing. So let me know what you think, and, as always, have a wonderful day.