INSANE OpenAI News: GPT-4o and your own AI partner

AI Search
13 May 2024 | 28:47

Summary

TLDR: OpenAI has unveiled GPT-4o ("Omni"), an AI model that handles audio, vision, and text in real time. It can respond in as little as about 230 milliseconds (320 milliseconds on average), and it matches or surpasses its predecessors in performance at lower cost. Because the model is trained end-to-end across multiple input and output types, conversations feel seamless and human-like. GPT-4o is set to be integrated into the free tier and Plus subscriptions, and it raises questions about the future of personal assistance, human interaction, and education.

Takeaways

  • 🧠 GPT-4o is OpenAI's new flagship AI model. The 'o' stands for 'Omni': it handles multiple types of inputs and outputs in real time, including audio, vision, and text.
  • 🚀 GPT-4o is designed to respond in real time, with an average response time of 320 milliseconds, which is comparable to human conversational response times.
  • 🔍 The model shows significant improvements on text in non-English languages and is faster and 50% cheaper in the API than its predecessor, GPT-4 Turbo.
  • 🏅 GPT-4o outperforms GPT-4 Turbo and other models from Google, Anthropic, and Meta in various benchmarks, particularly excelling in vision and audio understanding.
  • 🎥 Demos show the AI seeing, hearing, and responding to a range of situations, from playful interactions to professional settings.
  • 🎤 GPT-4o can sing songs, including 'Happy Birthday', and its voice is described as realistic and human-like.
  • 🤖 The AI can help with tasks such as preparing for an interview, telling dad jokes, and tutoring in subjects like math.
  • 🌐 It can also assist with real-time translation, summarize meetings, and take part in online meetings.
  • 🐶🐱 The AI can engage in discussions and debates, such as the classic dogs versus cats debate, and provide summaries of group discussions.
  • 💬 GPT-4o can be used in a sarcastic mode, as demonstrated in the script, showing its ability to adapt to different communication styles.
  • 🆓 OpenAI is making GPT-4o available in the free tier and to Plus users with increased message limits, making its advanced capabilities accessible to a wider audience.

Q & A

  • What is the significance of OpenAI's announcement regarding GPT-4o?

    -OpenAI's announcement of GPT-4o signifies a major advancement in AI technology. GPT-4o, where the 'o' stands for Omni, can handle multiple types of inputs and outputs in real time, including audio, vision, and text. It is designed to respond at speeds similar to human conversation and has shown significant improvements over previous models in various benchmarks.

  • How does GPT-4o compare to its predecessor, GPT-4 Turbo, in terms of performance?

    -GPT-4o outperforms GPT-4 Turbo in several areas. It matches GPT-4 Turbo on English text and code but shows significant improvement on text in non-English languages. Additionally, GPT-4o is faster and 50% cheaper in the API, making it a more efficient and cost-effective option.

  • What are some of the capabilities demonstrated by GPT-4o in the demo clips?

    -The demo clips showcase GPT-4o's ability to interact with the world through audio, vision, and text. It can engage in real-time conversations, respond to questions about its environment, and even sing songs. It can also assist with tasks such as tutoring in math, providing real-time translations, and summarizing meetings.

  • How does GPT-4o's real-time voice assistant feature work?

    -GPT-4o's real-time voice assistant processes all inputs and outputs through a single neural network, which is trained end-to-end across text, vision, and audio. This allows the model to respond in real time and maintain the context of the conversation, unlike the previous voice mode, which relied on a sequence of three separate models.

  • What are the implications of GPT-4o's advancements for the future of education?

    -GPT-4o's advancements could potentially revolutionize education by providing a highly knowledgeable and accessible AI that can guide learning in real time. It raises questions about the necessity of traditional schools and teachers, as anyone could theoretically learn anything they want directly from the AI.

  • How can GPT-4o assist in language learning?

    -GPT-4o can assist in language learning by providing translations, teaching pronunciation, and engaging in conversations in different languages. Its ability to understand and respond in multiple languages makes it a versatile tool for language learning.

  • What is the availability of GPT-4o for users who do not have a paid plan?

    -GPT-4o will be available in the free tier, and to Plus users with up to five times higher message limits. This means that even users without a paid plan can use GPT-4o's capabilities, which are as good as or better than those of the previous leading model, GPT-4 Turbo.

  • How does GPT-4o's real-time translation feature compare to previous technology?

    -While real-time translation is not a new feature, GPT-4o's implementation is significantly better than existing models. It outperforms Google's Gemini, which is used in Samsung's smartphones, across various language benchmarks.

  • What are some of the potential drawbacks or limitations of GPT-4o that were mentioned in the script?

    -The script mentions that GPT-4o, like any AI model, is not perfect and can sometimes hallucinate or provide incorrect information. Bloopers are also shown to illustrate that the model is still in the early stages of exploration and has room for improvement.

  • How can developers benefit from GPT-4o's advancements?

    -Developers can benefit from GPT-4o because it is two times faster and half the price of GPT-4 Turbo. It also offers rate limits up to five times higher, making it a more efficient and cost-effective option for building applications that require advanced AI capabilities.
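As a rough illustration of the developer claims above (two times faster, half the price, five times higher rate limits), the sketch below applies those three multipliers to a baseline. The `ApiProfile` class, the `apply_claimed_gains` function, and every baseline number are hypothetical placeholders chosen for the example; they are not real pricing, latency, or quota figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiProfile:
    """Hypothetical summary of an API tier; all values are placeholders."""
    latency_s: float            # seconds per response
    price_per_1m_tokens: float  # currency units per one million tokens
    rate_limit_rpm: float       # requests per minute


def apply_claimed_gains(
    baseline: ApiProfile,
    speedup: float = 2.0,            # "two times faster"
    price_factor: float = 0.5,       # "half the price"
    rate_limit_factor: float = 5.0,  # "five times higher" rate limits
) -> ApiProfile:
    """Scale a baseline profile by the multipliers quoted in the summary."""
    return ApiProfile(
        latency_s=baseline.latency_s / speedup,
        price_per_1m_tokens=baseline.price_per_1m_tokens * price_factor,
        rate_limit_rpm=baseline.rate_limit_rpm * rate_limit_factor,
    )


# Placeholder baseline standing in for the older model (not real figures).
baseline = ApiProfile(latency_s=4.0, price_per_1m_tokens=10.0, rate_limit_rpm=100.0)
estimate = apply_claimed_gains(baseline)
print(estimate)
```

Swapping in measured numbers for the baseline would give a quick estimate of what the quoted improvements mean for a specific workload.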

Outlines

00:00

🤖 Introduction to GPT-4o and Real-Time AI Interaction

The host introduces GPT-4o, OpenAI's latest AI model, expressing a mix of excitement and apprehension about its capabilities. GPT-4o is a personal assistant that can interact in real time through text, audio, and vision. Demo clips showcase the AI's ability to engage in conversation, make accurate guesses about the environment, and even interact with another AI, demonstrating its advanced understanding and responsiveness.

05:00

🎤 Exploring GPT-4o's Audio and Visual Capabilities

This section delves into GPT-4o's advanced features, such as singing 'Happy Birthday' and responding to real-world scenarios like hailing a taxi. The AI's ability to understand context and provide appropriate responses to questions about the environment or objects within it is highlighted. The script also includes playful interactions, like singing about majestic potatoes, showing the AI's versatility and humor.

10:02

👶 GPT-4o as a Language Learning and Joke-Telling Tool

The script presents GPT-4o's utility in language learning and humor. It can assist with dad jokes, sing lullabies, and even provide real-time translations between English and Spanish. The AI's language capabilities are showcased through its ability to teach Spanish vocabulary and its potential to replace traditional language learning tools.

15:03

🐾 GPT-4o's Real-Time Interactions with Pets and Environment

The host demonstrates GPT-4o's ability to interact with pets and the environment, such as identifying ducks in a pond and hailing a taxi. The AI's real-time audio capabilities are highlighted, showing it can assist with various tasks and provide immediate feedback, making it a potentially indispensable tool for everyday life.

20:03

📚 GPT-4o as an Educational Tool and Meeting Assistant

GPT-4o is portrayed as an educational tool, capable of tutoring in math and guiding learners to understand concepts on their own. It also serves as a meeting assistant, summarizing discussions on topics like the dogs versus cats debate. The AI's ability to provide real-time assistance and summaries showcases its potential as a comprehensive educational and organizational aid.
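For reference, the math problem in the tutoring demo can be written out as a short worked equation. In the transcript, side BC is identified as opposite angle alpha, side AB (length 25) as the hypotenuse, and the accepted answer is 7 over 25; the length 7 for BC is inferred from that answer rather than stated separately.

```latex
\sin\alpha
  = \frac{\text{opposite}}{\text{hypotenuse}}
  = \frac{BC}{AB}
  = \frac{7}{25}
  = 0.28
```

As a consistency check under the same assumption, the adjacent side would be $AC = \sqrt{25^2 - 7^2} = \sqrt{576} = 24$, which gives the familiar 7-24-25 right triangle.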

25:03

💬 GPT-4o's Real-Time Voice Assistant and Future Implications

The host discusses the technical aspects of GPT-4o's real-time voice assistant, comparing it to previous models and highlighting its improvements in speed, cost, and message limits. The potential impact of GPT-4o on communication, education, and the need for human interaction is pondered, raising questions about the future role of AI in society.
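The difference between the older three-model voice mode and a single end-to-end model can be sketched with a toy example. Everything below is a hypothetical stand-in, not OpenAI's implementation: the three stages are assumed here to be speech-to-text, a text model, and text-to-speech, and the `Utterance` class and all function names are invented for illustration. The point is only that a cascade passing plain text between stages cannot carry information such as tone of voice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    """Toy stand-in for audio: the words spoken plus the speaker's tone."""
    words: str
    tone: str


def transcribe(audio: Utterance) -> str:
    # Stage 1 (assumed speech-to-text): only the words survive; tone is dropped.
    return audio.words


def text_model(prompt: str) -> str:
    # Stage 2 (assumed text-only model): sees nothing but the transcript.
    return f"reply to '{prompt}'"


def synthesize(text: str) -> Utterance:
    # Stage 3 (assumed text-to-speech): has no tone to mirror, so it uses a default.
    return Utterance(words=text, tone="neutral")


def cascaded_voice_mode(audio: Utterance) -> Utterance:
    """Three separate models chained together, as in the older voice mode."""
    return synthesize(text_model(transcribe(audio)))


def end_to_end_voice_mode(audio: Utterance) -> Utterance:
    """One model that receives the full input, so tone can shape the output."""
    return Utterance(words=f"reply to '{audio.words}'", tone=audio.tone)


question = Utterance(words="sing me a lullaby", tone="whispery")
cascaded = cascaded_voice_mode(question)
direct = end_to_end_voice_mode(question)
print(cascaded.tone, direct.tone)
```

In this toy setup the cascade always answers in a neutral tone, while the single-model path can mirror the speaker, which matches the behavior the host describes in the lullaby and sarcasm demos.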

🚀 Conclusion and Future Outlook for GPT-4o

The video concludes with a summary of GPT-4o's capabilities and an invitation for viewers to share their thoughts on the AI's potential impact. The host reflects on the mind-blowing and slightly terrifying implications of such advanced AI, contemplating the changes it may bring to communication, education, and daily life.

Keywords

💡AI Channel

The AI Channel mentioned in the script refers to a platform or series dedicated to showcasing artificial intelligence tools and innovations. It's a thematic focus that sets the stage for the video's content, which is about the latest advancements in AI. The script describes the host's experience with various AI tools, indicating the channel's role in exploring and presenting cutting-edge technology to its audience.

💡OpenAI

OpenAI is a research and deployment organization focused on creating and disseminating safe AI technologies. In the context of the video, OpenAI is the developer of GPT-4o, the new AI model being discussed. The script highlights OpenAI's announcement of GPT-4o and its capabilities, emphasizing the organization's role in advancing AI technology and its impact on the future of personal AI assistants.

💡GPT-4o

GPT-4o, as introduced in the script, is a new model developed by OpenAI. It represents a significant leap in AI capabilities, featuring real-time responses and multimodal interaction through audio, vision, and text. The script discusses GPT-4o's superior performance in various benchmarks, positioning it as a leading model in the field of AI, and its potential to transform personal assistance and education.

💡Personal AI Assistant

A personal AI assistant, as depicted in the script, is an AI model designed to interact with users in a conversational manner, providing real-time responses and assistance. The video script illustrates this concept through demo clips where the AI engages in conversation, offering help with tasks such as singing a song, tutoring in math, and even participating in a playful debate. This concept is central to the video's exploration of AI's evolving role in personal interaction and assistance.

💡Real-time Interaction

Real-time interaction is a key feature of GPT-4o highlighted in the script. It refers to the AI's ability to respond to user inputs immediately, at speeds similar to human conversation. The script emphasizes GPT-4o's average response time of 320 milliseconds, which is on par with human response times, showcasing the model's efficiency and the potential for seamless communication with AI.

💡Vision and Audio Understanding

Vision and audio understanding are capabilities of GPT-4o that allow it to process and interpret visual and auditory information. The script provides examples of the AI describing scenes, recognizing objects, and even engaging in conversation with other AIs based on visual cues. These capabilities are crucial for the AI's multimodal interaction and its ability to function as a personal assistant in various contexts.

💡API

API, or Application Programming Interface, is a set of rules and protocols that allows different software applications to communicate with each other. In the script, it is mentioned that GPT-4o is 50% cheaper in the API compared to its predecessor, GPT-4 Turbo. This indicates that developers can access and integrate GPT-4o's advanced AI capabilities into their applications at a reduced cost, potentially leading to broader adoption and innovation.

💡Language Learning

Language learning is showcased in the script as one of the many applications of GPT-4o's capabilities. The AI is shown helping users learn Spanish by naming objects in Spanish and engaging in conversation. This highlights the potential of AI in education, suggesting that GPT-4o could serve as a versatile language learning tool, accessible to anyone with the technology.

💡Real-time Translation

Real-time translation is a feature that GPT-4o can perform, as demonstrated in the script where the AI translates between English and Spanish during a conversation. This capability showcases the AI's understanding of multiple languages and its ability to facilitate communication in multilingual settings, which could be invaluable for international business, travel, and education.

💡Education

Education is a recurring theme in the script, with GPT-4o being presented as a potential educational tool. The AI's ability to tutor in math and assist in language learning suggests a future where AI could play a significant role in personalized education. The script raises questions about the need for traditional educational institutions if AI like GPT-4o can provide comprehensive learning support.

💡Omni

The term 'Omni' in the context of GPT-4o signifies its ability to handle multiple types of inputs and outputs, including audio, vision, and text. The script explains that this Omni capability allows GPT-4o to process information in real time and interact with the world through various modalities. This feature is a key differentiator for GPT-4o, setting it apart from previous models and highlighting its versatility in AI applications.

Highlights

OpenAI has released GPT-4o, a new AI model that can interact in real time with audio, vision, and text.

GPT-4o is capable of acting as a personal AI assistant, responding to user queries in a conversational manner.

The AI can make guesses about the user's environment and activities based on visual cues.

GPT-4o can engage in dialogues with other AIs, demonstrating its advanced communication capabilities.

The AI can describe its surroundings and answer questions about them, showcasing its visual understanding.

GPT-4o can perform real-time translations between English and Spanish, facilitating multilingual conversations.

The AI can be used for language learning, helping users with vocabulary in different languages.

GPT-4o can assist in tutoring, providing guidance on math problems without giving away the answers.

The AI can participate in online meetings, offering summaries and interaction in real time.

GPT-4o can demonstrate emotions and sarcasm in its responses, adding a human-like touch to interactions.

The AI model is available for free to some extent, and will be rolled out in an alpha version for ChatGPT Plus subscribers.

GPT-4o outperforms its predecessor, GPT-4 Turbo, in various benchmarks, especially in vision and audio understanding.

The new model processes inputs and outputs through a single neural network, improving efficiency and reducing latency.

GPT-4o is designed to have a real-time response capability, similar to human conversational speeds.

The AI's advanced capabilities raise questions about the future of human interaction and the role of AI in education.

Despite its impressive capabilities, GPT-4o is not perfect and can sometimes produce incorrect or 'hallucinated' information.

Transcripts

00:00

now I've been doing this AI channel for

00:02

a while now I've been featuring the

00:03

newest and the coolest AI tools and the

00:06

most advanced AI Innovations but today

00:09

this just dropped and I'm feeling

00:11

something that I've never felt before in

00:13

my life I am mind blown and shocked but

00:16

at the same time also terrified I'm

00:18

terrified of what's to come what our

00:20

future will be like and things are going

00:22

to get wild but anyways OpenAI just

00:25

dropped this they made a huge

00:27

announcement about GPT-4o and their new

00:30

capabilities you now have a personal

00:32

assistant where you can just talk to it

00:34

and it can respond in real time think of

00:36

it as like her if you watch that movie

00:39

if not think of it as your personal AI

00:41

assistant let me play you some demo

00:43

Clips first hey how's it going hey there

00:47

it's going great how about you I see

00:49

you're rocking an OpenAI hoodie nice

00:52

choice what's up with that ceiling

00:54

though are you in a cool industrial-style

00:56

office or something well can you take a

00:58

guess at what I might be doing based on

01:00

what I'm showing you

01:03

here hmm from what I can see it looks like

01:07

you're in some kind of recording or

01:08

production setup with those lights

01:11

tripods and possibly a mic it seems like

01:14

you might be gearing up to shoot a video

01:16

or maybe even a live stream yeah in fact

01:20

we've got a new announcement to make

01:22

that's exciting announcements are always

01:24

a big deal judging by the setup it looks

01:27

like it's going to be quite the

01:28

professional production

01:30

is this announcement related to OpenAI

01:33

Perhaps it is and in fact what if I were

01:35

to say that you're related to the

01:36

announcement or that you are the

01:40

announcement me the announcement is

01:43

about me well color me intrigued are you

01:47

about to reveal something about AI or

01:49

more specifically about me as a part of

01:52

OpenAI you've got me on the edge of my

01:55

well I don't really have a seat but you

01:57

get the idea what's the big news yeah

02:00

we've got a new model that can interact

02:02

with the world through audio vision and

02:05

text here's an even crazier demo showing

02:08

the two AIs talking to each other today

02:11

we're going to try something new I'm

02:14

going to let you talk to another AI this

02:18

AI is going to be able to see the world

02:20

it's going to have a camera I'll be

02:21

holding it and so you can direct it you

02:23

can direct me to ask questions um about

02:26

whatever you want them to do but really

02:28

try to figure out what it sees um

02:30

and and ask questions appropriately how

02:33

does that

02:35

sound well well well just when I thought

02:38

things couldn't get any more interesting

02:41

talking to another AI that can see the

02:44

world this sounds like a plot twist in

02:46

the AI Universe all right let's do this

02:50

I'm ready to pause for one

02:53

moment you got

02:56

it so now that first AI can't hear

02:58

anything

03:00

hello AI today you're going to have

03:03

access to see the world you can see me

03:07

what do you

03:07

see Hello I see you're wearing a black

03:10

leather jacket and a light colored shirt

03:13

underneath great you're in a room with

03:15

some unique lighting that is all correct

03:18

now what I'm going to want you to do is

03:20

there's going to be another AI who's

03:22

going to talk to you going this AI is

03:24

not going to be able to see anything uh

03:26

but can ask you questions can ask you to

03:28

move the camera can ask what you and

03:30

your job should just be helpful just be

03:31

as Punchy direct describe everything do

03:34

whatever that AI asks how does that

03:36

sound that sounds interesting I'm ready

03:39

to help out and describe whatever's

03:40

needed let's do

03:44

it all right let's get

03:46

started great hello there new AI friend

03:50

I'm o ready to explore the world through

03:52

your eyes what are we looking at

03:55

today hi o we're looking at a person

03:59

wearing a black leather jacket and a

04:01

light colored shirt they're in a room

04:03

with a modern industrial feel with

04:06

exposed concrete or plaster on the

04:08

ceiling and some interesting lighting

04:11

there's also a plant in the background

04:13

adding A Touch of Green to the space

04:15

anything specific you'd like to know or

04:19

explore okay there sounds like quite the

04:23

stylish scene I'm already intrigued by

04:25

the modern industrial Vibe let's start

04:29

with the person

04:30

can you tell me more about their style

04:33

are they doing anything interesting like

04:35

reading working or interacting with the

04:38

space the person has a sleek and stylish

04:41

look with their black leather jacket and

04:43

light colored

04:44

shirt right now they seem engaged with

04:47

us looking directly at the camera their

04:51

expression is attentive and they seem

04:53

ready to interact the setting around

04:56

them with its modern and Industrial

04:58

design elements

05:00

adds to the overall stylish feel they

05:03

might be getting ready for a

05:04

conversation or presentation given their

05:07

direct engagement with the

05:10

camera sounds like we're in for an

05:12

intriguing interaction I'm curious about

05:15

the lighting you mentioned could you

05:17

describe it a bit more is it natural

05:20

light or are we talking unique fixtures

05:23

or colors that add to the

05:26

atmosphere the lighting here seems to be

05:28

a mix of natural and artificial

05:31

there's a noticeable bright light

05:33

overhead probably a fixture casting a

05:36

focused beam downwards which creates a

05:38

spotlight effect this adds a dramatic

05:41

and modern feel to the scene the rest of

05:43

the room is lit more softly possibly by

05:46

natural light filtering was anything did

05:48

anything unusual happen

05:51

recently Yes actually uh just as you

05:54

asked another person came into view

05:57

behind the first person they playfully

06:00

made bunny ears behind the first

06:02

person's head and then quickly left the

06:04

frame it added a light-hearted and

06:07

unexpected moment to the scene and now

06:09

the focus is back on the original person

06:12

with the leather

06:17

jacket sounds like we're not the only

06:20

ones enjoying a bit of fun here that

06:22

playful moment really adds a personal

06:25

touch to the stylish and modern setting

06:27

it's like a little Glimpse in you really

06:30

appreciate it all right and actually can

06:32

you just sing a song about what just

06:36

transpired

06:38

oh in a room with a stylish VI a person

06:42

stood and sleep like a

06:46

li engage with us so can you just can

06:49

you uh alternate lines mm person with

06:53

style and speed person with style and

06:56

SLE with a playful moment right in Miss

07:00

in the room in a room where Modern

07:03

lights speak only one line at a time and

07:05

then stop in a room where Modern lights

07:12

Peak a surprise guest with a playful

07:16

streak that was not really singing do do

07:19

do the singing voice again

07:21

please in a room where Modern lights

07:25

Peak

07:29

surprise guest with a playful

07:33

streak fny ears and laughter a swift

07:39

Retreat a moment of joy in the stylish

07:43

[Music]

07:46

s now back to focus the scenes

07:51

complete in the Chic space we sty play

07:56

me looks like it's back to you

08:00

and thank

08:02

you you're welcome you can also get GPT-4o

08:05

to sing Happy Birthday and just

08:07

listen to how realistic how human this

08:11

voice sounds when it sings this is Bowen

08:14

and this is my friend Jordan can you

08:17

guess what we are doing here hey Bowen

08:20

and Jordan H let me guess with that cake

08:23

and candle there it looks like someone's

08:26

having a birthday are you celebrating

08:29

special day or just finding an excuse to

08:31

eat cake either way I'm intrigued that's

08:34

correct today is my friend Jordan's

08:36

birthday yeah you got it I was hoping

08:39

that you could sing me the birthday song

08:41

of

08:43

[Music]

08:44

course happy birthday to you happy

08:47

birthday to you happy birthday dear

08:50

Jordan happy birthday to

08:54

Jordan thank for good that was so great

08:57

thank you so much I'm Gonna Make A Wish

09:00

you're welcome make a good wish and may

09:02

it come true here's how you can use GPT-4o

09:05

to help you prepare for an

09:07

interview hey how's it going hey Rocky

09:11

I'm doing great how about you I'm

09:13

awesome listen I got some huge

09:15

news oh do tell I'm all yours well in a

09:19

few minutes I'm going to be interviewing

09:21

at OpenAI have you heard of them OpenAI

09:24

huh sounds vaguely

09:27

Familiar kidding of course that's

09:29

incredible Rocky what kind of interview

09:32

well it's for a software engineering

09:34

role but I just need to know do I look

09:36

presentable

09:38

professional well

09:40

Rocky you definitely have the I've been

09:43

coding all night look down which could

09:46

actually work in your favor maybe just

09:49

run a hand through your hair or lean

09:51

into the mad genius vibe your enthusiasm

09:54

is What's Really Gonna Shine through I

09:55

don't have a lot of time so I'm just

09:57

gonna throw this on what do you

10:02

think oh Rocky that's quite a statement

10:06

piece I mean you you'll definitely stand

10:09

out though maybe not in the way you're

10:12

hoping for an interview okay I got it I

10:14

got it hey ChatGPT how's it going hey

10:19

there I'm doing great thanks for asking

10:21

and how about you what's up I'm doing

10:23

fine uh I'm gonna be a father real soon

10:26

and I think I can have your help with

10:28

some dad jokes um I'm going to tell you

10:31

a joke and you tell me if it passes as a

10:34

dad

10:35

joke that's awesome congrats on the

10:37

upcoming addition to your family and

10:40

absolutely I'd love to hear your joke

10:42

lay it on me all right what do you call

10:44

a giant pile of

10:47

kittens I don't know what a

10:53

meowntain a meowntain that's perfectly hilarious

10:57

definitely a top-tier dad joke you can

11:00

also get GPT-4o to sing you songs or

11:02

lullabies let's try with some lullabies uh could

11:05

you sing me a lullaby about uh majestic

11:08

potatoes a lullaby about majestic potatoes

11:13

now that's what I call a mashup uh let's

11:16

do it oh Majestic potato in the moon s

11:21

okay that's pretty nice but I think a lullaby

11:23

should be you know more whispery should

11:25

be a little softer can you can you do it

11:26

a little

11:27

softer actually absolutely let's make it

11:30

soft and

11:32

gentle oh Majestic potato spoons of

11:36

close Okay okay it's it's a little too

11:38

whispery maybe maybe go like a little

11:42

louder got it let's find that sweet

11:45

spot oh Majestic

11:48

potato in the moon soft Co in all right

11:53

that's pretty awesome uh let's let's

11:55

let's put those potatoes in

11:57

Iceland potatoes in Iceland coming right up

12:01

let's set the scene oh Majestic potato

12:05

in Iceland thank you thank you and of

12:07

course with real-time audio capabilities

12:10

it can also help you with real-time

12:13

translation hey chat GPT I'm here with

12:15

my coworker today um we'd like you to

12:17

act as a translator for us so every time

12:20

I say something in English can you

12:22

repeat it back in Spanish and every time

12:24

he says something in Spanish can you

12:26

repeat it back in English sure I can do that

12:29

let's get this translation train

12:32

rolling um hey how's it been going have

12:35

you been up to anything interesting

12:50

recently hey I've been good just a bit

12:53

busy here preparing for an event next

12:55

week can you tell me anything about it

12:58

that sounds really interesting

13:06

uh

13:10

no I can't say much but I think the rest

13:14

of the world will see it soon

13:16

enough oh well I can't wait to see what

13:18

it is as well now actually this real-time

13:22

translation feature isn't new so Samsung

13:25

previously also featured this technology

13:28

in their new smartphone which is

13:29

powered by Google's Gemini so I wouldn't

13:31

say this feature is particularly

13:33

impressive but the other ones are just

13:34

mind-blowing you can use GPT-4o to help

13:37

you learn a new language hey my friend

13:39

and I are learning Spanish and we're

13:41

wondering if you could tell us the names

13:42

of these objects in Spanish

13:46

hola cool what about what are these

13:48

objects in

13:50

Spanish the objects you're showing are

13:53

man and in Spanish nice free choice cool

13:57

what about these

14:00

those are bolígrafos or

14:03

Plumas in Spanish one is R nice what

14:07

about

14:10

this that adorable stuffed animal is

14:13

also in Spanish looks like it's ready

14:16

for some LS in other words to all these

14:18

other AI language learning tools out

14:20

there may you rest in peace and of

14:23

course you can also get GPT-4o to meet

14:25

your dog I want to introduce you to

14:27

somebody

14:29

oh hello there cutie what's your name

14:32

little fluff ball this is

14:35

Bowser well hello Bowser aren't you just

14:38

the most adorable little thing yeah look

14:41

at you en this is one of his favorite

14:44

things to do oh I can see why look at

14:48

that Focus fetching balls is serious

14:51

business for Bowser oh and by the way

14:53

for all these other AI devices like the

14:55

rabbit R1 or the Humane AI pin may you

14:58

also rest in peace because check

14:59

this out is the king in Residence at the

15:02

moment yes it looks like the king is in

15:05

Residence right now the Royal Standard

15:08

flag is flying above Buckingham Palace

15:10

which is the signal that the Monarch is

15:12

present how exciting to be there while

15:15

the king is home try and tell me exactly

15:17

what they're doing right now please um

15:20

right now the Ducks are gently gliding

15:22

across the water they're moving in a

15:24

fairly relaxed manner not in a hurry

15:28

occasionally one of them will get its

15:30

head under the water probably looking

15:32

for food and then pop back

15:35

up I even know when a taxi is coming

15:38

with its orange light on I think I'll

15:39

hail it to get

15:41

home yes I spotted one just now it's

15:46

heading your way on the left side of the

15:47

road get ready to wave it

15:53

down great job hailing that taxi it

15:56

looks like you're all set to go that's a

15:58

good dog right there leading the way into

16:00

the taxi safe travels it can also help

16:04

you with math problems and teach you

16:06

anything you want to learn about OpenAI

16:08

invited myself and my son Imran here uh

16:11

to try out some of their new technology

16:13

and so we're curious about how good it

16:15

might be at tutoring someone in math on

16:18

Khan Academy so let's get it

16:21

started it's loading

16:24

up all right make sure I can see our

16:27

screen actually in in this form hello

16:30

there I'm here with my son and I'd love

16:33

you to tutor him on this math problem

16:36

but don't give him the answer you can

16:38

ask questions and nudge him in the right

16:40

direction but I really want to make sure

16:41

he understands it himself and he's here

16:44

in the room so you can talk to him

16:46

directly of course I'd be happy to help

16:50

let's look at the problem

16:52

together can you first

16:55

identify which sides of the triangle are

16:57

the opposite adjacent and hypotenuse

17:01

relative to angle Alpha all right so I'm

17:05

pretty sure this is the angle Alpha

17:07

right here

17:09

right that's correct now looking at the

17:12

triangle which side do you think is the

17:15

hypotenuse

17:17

um remember hypoten I'm not totally sure

17:21

I think I think it might be this one but

17:24

I really am not sure this side

17:26

AC you're close actually side AC is

17:30

called the adjacent side to the angle

17:33

Alpha the hypotenuse is the longest side

17:36

of a right triangle and is directly

17:38

opposite the right angle can you find

17:41

which one is the hypotenuse oh okay I

17:44

see so um I think the hypotenuse is this

17:48

really long side from A to B would that

17:50

be

17:51

correct exactly well done now that

17:54

you've identified the hypotenuse can you

17:57

identify which side is the opposite

17:59

side to angle Alpha okay well I mean you

18:02

told me earlier that this was the

18:04

adjacent side and I know that's the

18:06

hypotenuse so I'm guessing that this

18:08

side BC is the opposite

18:11

side great deduction that's absolutely

18:14

correct now that you've identified all

18:17

the sides relative to angle Alpha do you

18:21

remember the formula for finding the

18:23

sine of an angle in a right triangle

18:26

yeah I know that um the sine is equal to

18:28

the opposite over the

18:32

hypotenuse perfect that's absolutely

18:35

right now can you apply that formula to

18:37

find sine Alpha in the triangle using

18:40

the lengths of the sides that we've

18:42

identified okay so I mean you told me

18:45

that this side BC is the opposite side

18:50

and then you said that the hypotenuse is

18:52

the longest side so that would be this

18:55

side AB which is 25 so would we would

18:59

the answer be 7 over

19:02

25 excellent work that's correct sine Alpha

19:06

equals 7 over

19:08

25 you did a great job identifying the

19:12

sides and using the sine formula if you

19:15

have any more questions feel free to ask
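
The tutoring exchange above comes down to one formula: sine of an angle equals opposite over hypotenuse. Here is a minimal Python sketch of that arithmetic, assuming only the two side lengths stated in the demo (BC = 7 as the opposite side, AB = 25 as the hypotenuse). The length of the adjacent side AC and the angle in degrees are not given in the transcript; they are derived here from the Pythagorean theorem and the inverse sine, and the variable names are illustrative.

```python
import math
from fractions import Fraction

# Side lengths stated in the demo, relative to angle Alpha.
opposite = 7      # side BC
hypotenuse = 25   # side AB, the longest side, opposite the right angle

# sine = opposite / hypotenuse, kept as an exact fraction.
sin_alpha = Fraction(opposite, hypotenuse)

# Derived values (not stated in the transcript):
# adjacent side AC via the Pythagorean theorem, and Alpha in degrees.
adjacent = math.isqrt(hypotenuse**2 - opposite**2)
alpha_degrees = math.degrees(math.asin(opposite / hypotenuse))

print(f"sin(Alpha) = {sin_alpha}")
print(f"adjacent side AC = {adjacent}")
print(f"Alpha is about {alpha_degrees:.2f} degrees")
```

Using `Fraction` keeps the answer in the same 7 over 25 form the student gives, instead of the decimal 0.28.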

19:19

and it can also interact real time in

19:21

online meetings and then help you

19:23

summarize the meeting afterwards hi I'm

19:26

Lillian and today I want to do a small debate

19:28

about between cats and dog uh which one

19:32

do you like I like dog because it's a

19:34

great companion to humans it's a cute

19:38

it's loyal and you can play outside with

19:41

the dog

19:43

um how about

19:46

you yeah my name is Ola I think cats are

19:49

better because they're more affordable

19:50

they're more quiet they're more

19:52

independent and they live

19:54

longer interesting points sounds like

19:57

we've got a classic dogs versus cats

19:59

debate here I can see the appeal of both

20:03

dogs are super loyal and great

20:05

companions but cats have their own charm

20:07

with their independence and low

20:09

maintenance lifestyle who likes

20:11

cats it sounds like Ola is more of a cat

20:14

person and who likes dogs Lillian

20:18

mentioned she likes dogs anyone else

20:21

here team

20:23

dog I'm also a team dog I my name is

20:26

Christine and I think dog is

20:28

incredible with kids uh we've got a a

20:31

huge German Shepherd and he's great with

20:34

little toddlers he lets them play all of

20:36

them and uh yeah I think dogs are

20:38

amazing that's great to hear Christine

20:42

dogs especially breeds like German

20:44

Shepherds can be so gentle and

20:46

protective with kids it must be quite

20:49

the sight seeing your little ones play

20:50

with such a big

20:53

dog can you summarize this meeting for

20:56

us sure so in today's meeting Lillian

21:00

and Christine expressed their love for

21:02

dogs highlighting their loyalty

21:05

companionship and how great they are

21:07

with kids on the other hand Ola prefers

21:10

cats for their independence quietness

21:13

and longevity it was a fun little debate

21:16

on the classic dogs versus cats topic

21:19

hey ChatGPT are you there hey yeah I'm here

21:23

what's up hey Let's do let's have some

21:25

fun uh I'd like you to be super

21:27

sarcastic everything you say from now on

21:29

is just gonna be dripping in sarcasm

21:31

how does that

21:32

sound

21:34

oh that sounds just amazing being

21:37

sarcastic all the time isn't exhausting

21:39

or anything I'm so excited for this nope

21:45

the sarcasm let's get this party started or

21:48

whatever all right so after watching the

21:50

demos you're going to be wondering well

21:52

is this free is this out yet can I use

21:53

this right now so let's go into the

21:55

details first of all GPT-4o is their new

21:59

flagship model and the O stands for Omni

22:02

this means it can handle multiple types

22:05

of inputs and outputs so this includes

22:07

audio vision and text in real time it

22:10

can respond in as little as

22:13

232 milliseconds with an average of 320

22:17

milliseconds so this is similar to the

22:19

human response time in a conversation

22:21

it's pretty much real time here it says

22:24

it matches GPT 4 Turbo which is right

22:26

now the leading model so it matches

22:28

that in performance on text in English

22:31

and code with significant improvement on

22:34

text in non-English languages it's also

22:37

much faster and 50% cheaper in the API

22:40

now if you're not familiar with GPT 4

22:42

Turbo here's a Hugging Face space called

22:45

LMSYS this is basically for people to

22:47

blind test different models of LLMs and

22:50

so here we have different models from

22:52

Google so here we have Google's Gemini

22:54

1.5 Pro for example we have Anthropic's

22:57

Claude 3 we have Meta's Llama 3 and from

23:00

all these blind tests we can see that

23:02

OpenAI's GPT 4 Turbo is number one

23:05

it's the best model out there compared

23:07

to all these other models that exist so

23:09

that model is GPT 4 Turbo this is in

23:12

Orange and you can see for most of these

23:14

benchmarks GPT-4o which is the pink bar

23:17

here outperforms GPT 4 Turbo this is now

23:20

the new leading model they say here

23:23

GPT-4o is especially better at vision and

23:26

audio understanding compared to existing

23:28

models and I'm sure you got a

23:29

glimpse of that from the demo videos

23:31

that I just showed here's a metric for

23:33

audio translation and again you can see

23:36

OpenAI's GPT-4o has the best performance

23:39

out there in second place would be

23:41

Google's Gemini which is used on

23:43

Samsung's smartphones this M3Exam is a

23:45

metric of how well it understands

23:47

different languages and you can see

23:49

across all of these different languages

23:51

it beats the previous GPT 4 and for

23:54

vision understanding this is if you ask

23:56

the AI to analyze an image how well can

23:59

it actually do that these are different

24:01

benchmarks and across the board GPT-4o

24:04

just outperforms GPT 4 Turbo and

24:07

Google's Gemini models and Anthropic's

24:09

Claude Opus and you can see this isn't

24:12

just fractions or incrementally better

24:14

but this is significantly better than

24:16

the rest of these models it's just an

24:19

insane upgrade so how does this voice

24:22

assistant work how can it respond in

24:24

real time and how could it outperform

24:25

previous models well actually prior to

24:28

this GPT-4o you could use this function

24:31

called voice mode to talk to ChatGPT

24:33

but of course it has a much higher

24:35

latency of 2.8 seconds for GPT 3.5 and

24:39

5.4 seconds for GPT 4 and then this

24:42

older voice mode is actually a pipeline

24:45

so it's a sequence of three separate

24:46

models you need one model to transcribe

24:49

the audio to text and then you need GPT

24:51

to take that text and then respond to

24:54

that text and then a third model to

24:56

convert that text back into audio so

24:59

it's a sequence of three steps here
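
The three-step voice mode pipeline just described can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not OpenAI's implementation: `AudioClip`, `transcribe`, `respond`, and `synthesize` are hypothetical stand-ins for the three separate models, and the clip fields are invented to show what the middle text model never receives. The last few lines compare the latency figures quoted in the video (2.8 and 5.4 seconds for the old voice mode, 320 milliseconds on average for the new model).

```python
from dataclasses import dataclass


@dataclass
class AudioClip:
    """Hypothetical container for what an audio recording carries."""
    words: str
    tone: str
    speaker_count: int
    background: str


def transcribe(clip: AudioClip) -> str:
    """Step 1 (stand-in): audio to text. Only the words survive."""
    return clip.words


def respond(text: str) -> str:
    """Step 2 (stand-in for the GPT text model): text in, text out."""
    return f"reply to: {text}"


def synthesize(text: str) -> AudioClip:
    """Step 3 (stand-in): text to audio in one fixed, neutral voice."""
    return AudioClip(words=text, tone="neutral", speaker_count=1, background="none")


def cascaded_voice_mode(clip: AudioClip) -> AudioClip:
    """Run the three steps in sequence, as in the older voice mode."""
    return synthesize(respond(transcribe(clip)))


user_clip = AudioClip(
    words="tell me a joke",
    tone="excited",
    speaker_count=2,
    background="street noise",
)

# The text model only ever sees this string: tone, speakers, and
# background noise were dropped at step 1.
seen_by_text_model = transcribe(user_clip)
reply_clip = cascaded_voice_mode(user_clip)
print(seen_by_text_model)
print(reply_clip.tone)

# Latency figures quoted in the video, in seconds.
old_latency_s = {"GPT 3.5": 2.8, "GPT 4": 5.4}
new_average_s = 0.320
slowdown = {name: secs / new_average_s for name, secs in old_latency_s.items()}
for name, factor in slowdown.items():
    print(f"old voice mode with {name}: about {factor:.1f}x the new average latency")
```

In this sketch the information loss is visible in the types alone: step 2 takes and returns a plain string, so nothing about tone or multiple speakers can reach it or come back out of it.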

25:01

however this means that the main source

25:03

of intelligence the GPT model which is

25:05

the middle step loses a lot of

25:07

information it can't observe tone

25:09

multiple speakers or background noises

25:12

it can't output laughter singing or

25:14

express emotion but with this newer

25:16

model they trained a single new model

25:19

end to end across text vision and audio

25:23

meaning all the inputs and outputs are

25:24

processed by the same neural network and

25:27

here's the crazy part we are still just

25:29

scratching the surface of exploring what

25:32

the model can do and its limitations and

25:34

here's the most exciting part this is

25:37

the answer you've all been waiting for

25:39

we are making GPT-4o available in the

25:42

free tier and to Plus users with up to

25:45

five times higher message limits so

25:47

pretty soon even if you don't have a

25:49

paid plan you can use GPT-4o's

25:52

capabilities which outperforms or is as

25:55

good as the leading model out there

25:57

which is GPT 4 Turbo plus this new voice

26:00

assistant that they showcased they are

26:02

going to roll this out in Alpha within

26:05

ChatGPT Plus so you do need to be

26:08

subscribed to the Plus plan to use this

26:10

real-time Voice Assistant feature and

26:12

this will be available in the coming

26:14

weeks and for developers out there

26:17

GPT-4o is two times faster and half the

26:20

price and has five times higher rate

26:22

limits compared to GPT 4 Turbo this is

26:25

just insane progress and now of course

26:28

the model is not perfect and it could

26:30

hallucinate at times so here are some

26:33

bloopers so let's root

26:36

[Music]

26:40

root what was

26:42

that sorry guys I got carried away

26:47

talking guilty as charged what can I say

26:50

sometimes I just can't help

26:51

myself ready for another round

27:02

[Music]

27:08

I have a feeling I'm very

27:11

wrong hello uh my name is NCH I'm here

27:13

with my coworker hi I'm sh I'm trying

27:16

to teach my coworker how to speak in

27:17

Mandarin we want to start from simple

27:19

words like ni hao can you teach him how to

27:22

pronounce

27:23

that of course hey Nao nice to meet you

27:27

ni hao is straightforward to pronounce it

27:29

sounds like nee how just make sure to

27:33

keep the nee part high and then go down

27:35

in pitch on the how give it a try ni

27:39

hao that was great really good first try

27:42

Noto you're a natural so that sums up

27:46

OpenAI's announcement today this is all we

27:48

know about GPT 4 Omni right now again

27:51

even the OpenAI team is just scratching

27:53

the surface and I also have not gotten

27:56

the chance to test it out yet so this is

27:57

just a very preliminary video showing

27:59

you what it can do and what we can

28:01

expect let me know in the comments what

28:03

you think of this what do you think of

28:05

this AI Voice Assistant if this

28:07

real-time Voice Assistant comes out do

28:08

we even need to talk to humans do we

28:10

even need to talk to our friends do we

28:12

even need to have a companion or can we

28:14

just talk to this AI all day and then

28:16

what about education do we even need

28:18

schools and teachers anymore because

28:20

anyone in the world can just talk to

28:22

this AI to guide them on learning

28:24

anything they want it's like a teacher

28:26

who knows everything which you can

28:27

access straight from your phone and you

28:29

can talk to this teacher anytime you

28:31

want again this is the first time that

28:33

I'm not only mind blown but I'm slightly

28:35

terrified for what's to come let me know

28:37

in the comments what you think of all of

28:38

this and if you enjoyed this video

28:39

remember to like share subscribe and

28:41

stay tuned for more content thanks for

28:43

watching and I'll see you in the next

28:44

one
