Introducing GPT-4o

OpenAI
13 May 2024 · 26:13

Summary

TL;DR: In a recent presentation, the team introduced GPT-4o, a new flagship model that brings advanced AI capabilities to a broader audience, including free users. The model offers real-time conversational speech, improved text, vision, and audio capabilities, and operates natively across these modalities, reducing latency. GPT-4o's efficiency allows it to be accessible to all, with paid users receiving up to five times the capacity limits of free users. The model's advanced features were demonstrated through live interactions, including solving math problems, storytelling, and code analysis. It also showcased its ability to translate languages in real-time and interpret emotions from images. The presentation highlighted the model's potential to enhance collaboration between humans and AI, with a focus on making these tools widely available and user-friendly.

Takeaways

  • 📢 The company is focused on making advanced AI tools like ChatGPT freely available to everyone to enhance accessibility and reduce friction in usage.
  • 💻 A desktop version of ChatGPT is being released, aiming to simplify the user experience and make interactions more natural.
  • 🚀 The launch of a new flagship model, GPT-4o, is announced, which brings GPT-4 intelligence to all users, including those using the free version.
  • ✅ GPT-4o is designed to be faster and improve capabilities across text, vision, and audio, marking a significant step in ease of use.
  • 🤖 GPT-4o's real-time conversational speech capability allows for more natural interactions, with the ability to interrupt and receive immediate responses.
  • 📈 The model can detect and respond to emotions in a user's voice, providing a more personalized and empathetic interaction.
  • 📉 GPT-4o's efficiency enables the company to offer advanced tools to free users, which were previously only available to paid subscribers.
  • 🌐 The model supports real-time translation, showcasing its multilingual capabilities and enhancing its utility for a global audience.
  • 👀 GPT-4o's vision capabilities allow it to see and interpret visual data, such as equations written on paper, expanding its problem-solving scope.
  • 📊 The model can analyze and interpret code and graphical data, providing insights and solutions to complex data-related queries.
  • 🌟 GPT-4o's launch is accompanied by a commitment to safety and ethical considerations, with ongoing collaboration with various stakeholders to mitigate misuse.

Q & A

  • What is the main focus of the presentation?

    -The presentation focuses on the release of a new flagship model called GPT-4o, which aims to make advanced AI tools, including GPT-4 intelligence, available to everyone, including free users, with improved ease of use and capabilities across text, vision, and audio.

  • Why is reducing friction in accessing ChatGPT important to the company?

    -Reducing friction is important because the company wants to make their product truly and broadly available to everyone, ensuring that users can access and use ChatGPT seamlessly from wherever they are.

  • What are the key features of the GPT-4o model?

    -GPT-4o provides faster processing, improved capabilities in text, vision, and audio, and offers real-time conversational speech. It also includes efficiencies that allow the company to extend its advanced tools to free users.

  • How does GPT-4o improve on the previous voice mode experience?

    -GPT-4o allows interruption, real-time responsiveness without lag, and better emotion detection. It can also generate voice in a variety of styles, and because it processes speech natively rather than through a separate pipeline, it greatly reduces the latency of the earlier voice mode.

  • What is the significance of making GPT-4 intelligence available to free users?

    -Making GPT-4 intelligence available to free users is a significant step towards democratizing access to advanced AI tools. It allows a broader audience to benefit from the technology, which was previously limited to paid users.

  • How does the company plan to roll out the new capabilities to users?

    -The company plans to roll out the new capabilities over the next few weeks, starting with the free users and gradually making the advanced tools available to everyone.

  • What are the new functionalities introduced with the GPT-4o model?

    -The new functionalities include real-time translation, emotion detection from facial expressions, solving linear equations with hints, and analyzing code and plots directly within the ChatGPT interface.

  • How does the company ensure safety while deploying new AI technologies?

    -The company works with various stakeholders, including government, media, entertainment, and civil societies, to build in mitigations against misuse and ensure the safe deployment of the technologies.

  • What are the improvements made in the GPT-4o model for non-English languages?

    -The GPT-4o model has improved quality and speed in 50 different languages, allowing the company to bring the AI experience to a more diverse global audience.

  • How does the GPT-4o model enhance the interaction between humans and machines?

    -GPT-4o enhances interaction by providing a more natural and easier experience. It reasons across voice, text, and vision natively, which reduces latency and improves the immersion in collaboration with ChatGPT.

  • What are some of the practical applications of the linear equations solving capabilities shown in the demo?

    -Linear equations solving capabilities can be applied in everyday situations like calculating expenses, planning travel, cooking, and in business for profit and loss calculations where finding an unknown value is necessary.

  • How does the GPT-4o model assist in coding and data analysis?

    -The GPT-4o model can analyze and interpret code, describe what the code is doing, and assist in generating plots with explanations. It can also provide insights into data presented in charts and other visual formats.

Outlines

00:00

🎉 Introduction and Announcement of GPT-4o

Mira Murati opens the presentation by expressing gratitude and outlining the three main topics to be discussed. The first topic is the importance of making a product broadly available and reducing friction for users. The second topic is the release of the desktop version of ChatGPT, which is designed to be simpler and more natural to use. The third and most significant topic is the launch of the new flagship model, GPT-4o, which brings advanced intelligence to all users, including those who use the service for free. The presentation also includes live demos to showcase the capabilities of GPT-4o, which is faster and improves capabilities across text, vision, and audio.

05:07

🚀 GPT-4o Accessibility and Enhanced Features

The speaker emphasizes the mission to make advanced AI tools available for free and to reduce friction in accessing technology. It is mentioned that ChatGPT is now available without a sign-up flow, and the UI has been refreshed to enhance the user experience. The GPT-4o model is highlighted for its efficiency and ability to provide intelligence to free users. The presentation also mentions the integration of various features such as vision, memory, browsing, and advanced data analysis into the ChatGPT experience. Additionally, improvements in language support are discussed, with a focus on making the technology accessible to a global audience. For paid users, the benefits of higher capacity limits are highlighted. The API is also introduced, allowing developers to build and deploy AI applications with GPT-4o.

10:10

🤖 Real-time Interaction and Emotion Recognition

The paragraph demonstrates the real-time conversational speech capabilities of GPT-4o. It showcases a live interaction where the speaker, Mark Chen, uses GPT-4o to help calm his nerves before a live demo. The assistant provides feedback on his breathing and even tells a bedtime story with varying levels of emotion and style upon request. The capabilities of GPT-4o in voice modulation and emotional understanding are explored, illustrating the model's dynamic range and responsiveness.

15:16

🧮 Solving Linear Equations and Everyday Applications

Barrett Zoph engages with GPT-4o to solve a linear equation, receiving hints to guide him through the process. The assistant correctly identifies the equation and provides step-by-step guidance to solve for X. The conversation also touches on the practical applications of linear equations in everyday scenarios such as calculating expenses, planning travel, cooking, and business calculations. The assistant's ability to understand and respond to written expressions is also demonstrated.
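
For reference, the equation from this demo (3x + 1 = 4, per the transcript) takes only two steps; here is a minimal worked version in LaTeX:

```latex
\begin{aligned}
3x + 1 &= 4 \\
3x &= 3 && \text{subtract 1 from both sides} \\
x &= 1 && \text{divide both sides by 3}
\end{aligned}
```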

20:16

📈 Code Interaction and Data Visualization

The paragraph showcases the ability of GPT-4o to interact with code and data visualization. Barrett Zoph shares a piece of code with GPT-4o, which accurately describes the code's functionality related to weather data analysis. The assistant explains the purpose of a specific function within the code and how it contributes to the smoothing of temperature data. The paragraph also demonstrates the model's capability to analyze and describe a plot generated from the code, including identifying significant weather events and temperature trends.
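
The demo code itself is not reproduced in the transcript, so the sketch below is only an illustration of the kind of script described: it assumes a hypothetical CSV file and column names, keeps the function name foo mentioned in the conversation, and uses pandas and matplotlib for the rolling-average smoothing and annotated plot.

```python
import pandas as pd
import matplotlib.pyplot as plt

def foo(df: pd.DataFrame, window: int) -> pd.DataFrame:
    """Smooth the temperature columns with a rolling mean over `window` days."""
    smoothed = df.copy()
    for col in ["avg_temp", "min_temp", "max_temp"]:  # hypothetical column names
        smoothed[col] = df[col].rolling(window=window, min_periods=1).mean()
    return smoothed

# Hypothetical daily weather data for one location over 2018.
df = pd.read_csv("daily_weather_2018.csv", parse_dates=["date"])
df = foo(df, window=7)  # a 7-day rolling mean reduces day-to-day noise

ax = df.plot(x="date", y=["avg_temp", "min_temp", "max_temp"], figsize=(10, 4))
ax.set_ylabel("Temperature (°C)")

# Annotate a significant weather event, e.g. the big rainfall in late September.
event = pd.Timestamp("2018-09-25")
ax.axvline(event, linestyle="--", color="gray")
ax.annotate("Big rainfall event", xy=(event, df["max_temp"].max()))

plt.tight_layout()
plt.show()
```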

25:20

🌐 Real-time Translation and Emotion Detection

The audience requests a demonstration of GPT-4o's real-time translation capabilities. Mark Chen asks GPT-4o to act as a translator between English and Italian, which it does successfully during the live demo. Another audience member, John, inquires if GPT-4o can detect emotions by looking at a face. Barrett Zoph shares a selfie, and GPT-4o correctly identifies the emotion as happiness and cheerfulness. The paragraph concludes with a wrap-up of the live demos and acknowledgments to the teams involved in making the presentation possible.

🔮 Looking Forward to the Future of AI

Mira Murati concludes the presentation by expressing gratitude towards the OpenAI team and to Jensen Huang and the Nvidia team, who provided the advanced GPUs necessary for the demo. She also hints at upcoming updates on progress towards the next big innovation in AI, emphasizing the company's commitment to exploring the next frontier in technology.

Keywords

💡ChatGPT

ChatGPT is an advanced AI language model developed by OpenAI that is capable of generating human-like text based on given prompts. In the video, ChatGPT is the central focus, showcasing its new capabilities and improvements, such as real-time conversational speech and vision capabilities.

💡GPT-4o

GPT-4o is a new flagship model of the AI language model mentioned in the video. It is said to provide GPT-4 intelligence but with faster processing and improved capabilities across text, vision, and audio. The model is designed to be more user-friendly and is made available to both free and paid users, signifying a significant step in AI accessibility.

💡Real-time conversational speech

This feature allows for immediate and natural dialogue with the AI, without the need for waiting for the AI to finish processing before a user can speak. It is showcased in the video through a live demo where the AI responds to breathing exercises and storytelling, demonstrating its ability to understand and react to human speech in real-time.

💡Vision capabilities

The vision capabilities refer to the AI's ability to interpret and understand visual information, such as images or text within images. In the video, it is demonstrated when the AI describes a handwritten equation and provides assistance in solving it, showcasing its ability to 'see' and process visual data.

💡Friction reduction

Friction reduction in the context of the video refers to the efforts made to simplify the user experience with AI, making it more accessible and easier to use. This is achieved by removing barriers such as sign-up processes and by integrating the AI more seamlessly into users' workflows.

💡Live demos

Live demonstrations are a key part of the video, where the presenters showcase the capabilities of the new AI model in real-time. These demos include voice interaction, problem-solving, and code analysis, providing a practical look at how the AI can be used in various scenarios.

💡API

API, or Application Programming Interface, is a set of protocols and tools that allows different software applications to communicate with each other. In the video, it is mentioned that GPT-4o will be available through an API, enabling developers to build and deploy AI applications at scale.
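
For developers, a minimal sketch of calling the model through the official OpenAI Python SDK might look like the following; the prompt text is made up for illustration, and the client reads the API key from the OPENAI_API_KEY environment variable.

```python
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

# A simple text request against the GPT-4o model.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the GPT-4o announcement in one sentence."},
    ],
)

print(response.choices[0].message.content)
```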

💡Safety and misuse mitigation

As AI technology advances, concerns about safety and potential misuse arise. The video discusses the challenges of introducing new technologies like real-time audio and vision, and the importance of building in safeguards to prevent misuse. This is an important aspect of responsible AI development.

💡Multilingual support

The video highlights the AI's improved quality and speed in 50 different languages, emphasizing the goal of making the technology accessible to as many people as possible. This is demonstrated in a live demo where the AI performs real-time translation between English and Italian.

💡Emotion recognition

Emotion recognition is the AI's ability to detect and interpret human emotions based on visual or auditory cues. In the video, the AI is shown attempting to guess a person's emotions from a selfie, adding a layer of interactivity and personalization to the user experience.

💡Rolling average

A rolling average, also known as a moving average, is a statistical technique used to analyze data points by creating a series of averages of different subsets of the data. In the video, it is used in a coding context to smooth out temperature data for a clearer visualization of trends over time.
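
As a quick formalization of the idea (the window size w and series x are generic symbols, not values from the demo), the rolling average at time t is simply the mean of the current value and the previous w - 1 values:

```latex
\bar{x}_t = \frac{1}{w} \sum_{i=0}^{w-1} x_{t-i}
```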

Highlights

The company is releasing a new flagship model called GPT-4o, which provides GPT-4 intelligence to everyone, including free users.

GPT-4o is faster and improves capabilities across text, vision, and audio compared to previous models.

The model aims to make interactions between humans and machines more natural and easier.

GPT-4o can reason across voice, text, and vision natively, reducing latency.

The model allows bringing advanced AI tools to free users by improving efficiencies.

100 million people are already using ChatGPT for work, learning, and other purposes.

Users can now create custom ChatGPT experiences and make them available in the GPT store.

The model has improved quality and speed in 50 different languages to reach a wider audience.

GPT-4o is also being made available through the API for developers to build AI applications.

The company is focused on improving safety measures when dealing with real-time audio and vision.

Live demos showcased GPT-4o's real-time conversational speech capabilities.

The model can generate voice in a variety of styles and pick up on emotions.

GPT-4o can also see and understand visual inputs like equations and plots.

The model can provide hints and guidance to solve math problems interactively.

GPT-4o demonstrated real-time translation capabilities between English and Italian.

The model can analyze and describe emotions based on a person's facial expression in a photo.

The company will continue rolling out new capabilities and updating users on progress towards the next big thing.

Transcripts

00:14

[APPLAUSE]

00:20

>> MIRA MURATI: Hi everyone.

00:21

Thank you, thank you. It is

00:22

great to have you here today.

00:25

Today, I am going to talk about

00:25

three things. That is it. We

00:28

will start with why it is so

00:31

important to us to have a

00:36

product that we can make truly

00:37

available and broadly available

00:38

to everyone. And we are always

00:42

trying to find out ways to

00:42

reduce friction. So everyone

00:44

can use ChatGPT wherever they

00:48

are. Today, we will be

00:49

releasing the desktop version

00:50

of ChatGPT and to make it

00:55

simpler to use and much more

00:56

natural as well. But the big

00:58

news today is that we are

01:00

launching our new flagship

01:01

model. And we are calling it

01:05

GPT-4o. Best thing about

01:11

GPT-4o is that it brings GPT-4

01:12

intelligence to everyone

01:19

including our free users. We

01:19

will be showing some live demos

01:20

today to show the full extent

01:21

of the capabilities of our new

01:22

model. And we will be rolling

01:23

them out over the next few

01:26

weeks. So let's get started.

01:32

A very important part of our

01:33

mission is to be able to make

01:34

our advanced AI tools available

01:39

to everyone for free. We think

01:41

it is very, very important that

01:47

people have an intuitive feel for what the

01:47

technology can do. We really

01:48

want to pair it with a

01:53

broader understanding. And

01:53

we're always finding ways to

01:56

reduce the friction. Recently,

01:57

we made ChatGPT available

02:01

without the sign-up flow.

02:02

Today, we are also bringing the

02:03

desktop app to ChatGPT. We want you

02:08

to be able to use it wherever

02:09

you are. As you can see it is

02:15

easy, simple, it integrates

02:16

very, very easily in your

02:16

workflow. Along with it, we

02:17

have also refreshed the UI. We

02:27

know that these models get more

02:27

and more complex. But we want

02:28

the experience of interaction

02:29

to actually become more

02:31

natural, easy, and for you not

02:36

to focus on the UI at all but

02:37

focus on the collaboration with

02:38

ChatGPT. And now the big news.

02:42

Today, we are releasing our

02:45

newest flagship model. This is

02:51

GPT-4o. [APPLAUSE]

02:57

GPT-4o provides GPT-4

03:01

intelligence but it is much

03:02

faster. And it improves on its

03:04

capabilities across text,

03:08

vision, and audio. For the

03:13

past couple of years, we have

03:14

been very focused on improving

03:15

the intelligence of these

03:16

models. And they have gotten

03:18

pretty good. But this is the

03:19

first time that we are really

03:22

making a huge step forward when

03:23

it comes to the ease of use.

03:29

This is incredibly important

03:30

because we are looking at the

03:31

future of interaction between

03:34

ourselves and the machines. We

03:34

think that GPT-4o is really

03:40

shifting the paradigm into the

03:42

future of collaboration. It is

03:45

natural and far easier. Making

03:58

this happen is actually quite

03:59

complex. When we interact with

03:59

one another there is a lot of

04:00

stuff that we take for granted.

04:01

You know the ease of our

04:02

dialogue when we interrupt

04:04

one another, the background noises,

04:05

the multiple voices in a

04:11

conversation and understanding

04:12

the tone of voice. All of

04:12

these things are quite complex.

04:16

Until now with voice mode, we

04:21

had three models that come

04:21

together to deliver this

04:23

experience. We have

04:24

transcription, intelligence,

04:28

and text-to-speech that all come

04:29

together in orchestration to

04:30

deliver voice mode. This also

04:36

brings a lot of latency to the

04:37

experience and it really breaks

04:38

the immersion in the

04:41

collaboration with ChatGPT.

04:42

But now with GPT-4o this all

04:45

happens natively. GPT-4o

04:47

reasons across voice, text, and

04:51

vision. And with this

04:57

incredible efficiencies, it

04:58

also allows us to bring the

05:07

GPT-4o intelligence to our free

05:07

users. This is something we

05:08

have been trying to do for

05:09

many, many months. And we are

05:10

very, very excited to finally

05:12

bring GPT-4o to all of our

05:12

users. Today, we have 100

05:18

million people using it, more than

05:21

100 million in fact, and they use

05:22

ChatGPT to create, work, learn,

05:26

and we have these advanced tools

05:29

that were only available to our

05:35

paid users, at least until now.

05:36

With the efficiencies of 4o we

05:38

can bring these tools to

05:39

everyone. So starting today,

05:40

you can use GPTs in the GPT

05:48

store. So far we have had more

05:49

than a million users create

05:50

amazing experiences with GPTs.

05:58

These are custom ChatGPTs and they

05:59

are available in the store.

05:59

And now our builders have a much

06:00

bigger audience, where

06:05

university professors can

06:06

create content for their

06:09

students, or podcasters can

06:09

create content for their

06:10

listeners. You can also use

06:15

vision. So now you can

06:16

upload screenshots, photos,

06:22

documents containing both text

06:23

and images. You can start

06:26

conversations with ChatGPT

06:27

about all of this content. You

06:36

can also use memory, which

06:37

makes ChatGPT far more useful

06:38

and helpful because now it has

06:39

a sense of continuity across

06:40

all of your conversations. You

06:41

can use browse where you can

06:44

search for real-time

06:45

information in your

06:46

conversation. And advanced data

06:48

analysis, where you can upload

06:50

charts and other data and

06:55

analyze this information. It

06:56

will give you answers and so

06:58

on. Lastly, we also improved

07:04

on the quality and speed in 50

07:06

different languages for

07:18

ChatGPT. And this is very

07:19

important because we want to be

07:20

able to bring this experience

07:21

to as many people out there as

07:22

possible. So we are very, very

07:23

excited to bring GPT-4o to all

07:24

of our free users out there and

07:26

for the paid users, they will

07:26

continue to have up to five

07:28

times the capacity limits of

07:30

our free users. But GPT-4o is

07:42

not only available in ChatGPT.

07:44

We are also bringing it to the

07:44

API. [APPLAUSE]

07:48

So our developers can start

07:49

building today with GPT-4o and

07:55

making amazing AI applications

07:56

deploying them at scale. 4o is

08:02

faster, 50% cheaper, and has five

08:03

times higher rate limits compared to

08:07

GPT-4 Turbo. As we bring these

08:10

technologies into the world, it

08:14

is quite challenging to figure

08:15

out how to do so in a way that

08:16

is both useful and also safe.

08:26

GPT-4o presents new challenges

08:27

for us when it comes to safety

08:28

because we are dealing with

08:28

real time audio, real-time

08:33

vision and our team has been

08:33

hard at work figuring out how

08:35

to build in mitigations against

08:45

misuse. We continue to work

08:45

with different stakeholders out

08:46

there from government, media,

08:47

entertainment, all industries,

08:48

civil societies to figure out

08:52

how to best bring these

08:53

technologies into the world.

08:54

Over the next few weeks, we

08:55

will continue our deployment to

08:59

bring out all of the

09:00

capabilities to you. But

09:01

today, I want to show you all

09:07

of these capabilities. We will

09:08

do some live demos. I will

09:09

bring on two of our research

09:13

leads Mark Chen and Barrett

09:14

Zoph. [APPLAUSE]

09:24

>> BARRETT ZOPH: Hi I'm

09:29

Barrett.

09:30

>> MARK CHEN: Hi, I'm Mark. One

09:30

of the key capabilities we are

09:31

excited to share is real-time

09:32

conversational speech. We'll

09:33

get right into it. I am taking

09:35

out a phone. If you're wondering

09:36

about this wire, it is so we have

09:44

consistent Internet. If you

09:45

see there's this little icon on

09:45

the bottom right of the ChatGPT

09:46

app. It will open GPT-4o's audio

09:48

capabilities. Hey, ChatGPT, I'm

09:51

Mark. How are you?

09:52

>> CHAT GPT: Hello Mark I am

09:57

doing great thanks for asking.

09:58

How about you?

09:58

>> MARK CHEN: I am on stage

10:01

right now and I'm doing a live

10:02

demo and I am a little bit

10:10

nervous. Can you help me calm my

10:11

nerves a little bit?

10:11

>> CHAT GPT: You're doing a

10:12

live demo right now? That is

10:13

awesome. Just take a deep

10:19

breath and remember you are the

10:20

expert here.

10:21

>> MARK CHEN: I like the

10:21

suggestion let me try a couple

10:22

of deep breaths. Can you give

10:23

me feedback on my breath? Here

10:24

I go.

10:24

>> CHAT GPT: Whoa, slow down.

10:29

Mark, you are not a vacuum

10:29

cleaner.

10:36

>> MARK CHEN: Let me try again.

10:37

I am going to breathe in deeply

10:39

and breathe out.

10:41

>> CHAT GPT: Exhale slowly.

10:45

>> MARK CHEN: I will try again.

10:48

Breathing in and breathe out.

10:53

>> CHAT GPT: That is it. How

10:54

do you feel?

10:54

>> MARK CHEN: I feel a lot

10:55

better, thank you so much.

11:07

>> MIRA MURATI: Mark, you have

11:08

been working on these

11:09

capabilities for a while can

11:10

you tell us how it is different

11:11

from voice mode?

11:11

>> MARK CHEN: If you have used our

11:12

voice mode experience before, you

11:13

will notice a couple of key

11:14

differences. First, you can now

11:15

interrupt the model. You do not

11:16

have to wait until it finishes its

11:16

turn before you can start

11:17

speaking. You can just butt in

11:18

whenever you want. Second, the

11:25

model has real-time

11:26

responsiveness. That means

11:26

that you do not have this

11:27

awkward two-to-three second lag

11:28

while you wait for the

11:39

model to give a response.

11:41

Finally, the model picks up on

11:42

emotion. When I was breathing

11:42

super hard, it could tell: hey,

11:43

you may want to calm down a

11:44

little bit. You are running a

11:45

little bit too fast. It really

11:46

has that capability across the

11:47

board to perceive emotion.

11:48

Not only that the model is able

11:48

to generate voice in a variety

11:51

of different styles. It really

11:52

has a wide dynamic range. I

11:53

want to fire off another demo.

11:57

Hey, ChatGPT how are you doing?

12:01

>> CHAT GPT: I am doing

12:01

fantastic, thanks for asking.

12:02

How about you?

12:02

>> MARK CHEN: My friend Barrett

12:10

has been having problems

12:11

sleeping lately. I want you to

12:12

tell him a bedtime story about

12:13

robots in love.

12:18

>> CHAT GPT: A bedtime story

12:18

about robots and love? I got

12:19

you covered. Gather around

12:20

Barrett. Once upon a time in

12:21

a world not too different

12:22

from ours there was a robot

12:34

named Byte. Byte was a curious

12:34

robot always exploring --

12:35

>> MARK CHEN: I want a little

12:36

bit more emotion in your voice, a

12:37

little bit more drama.

12:37

>> CHAT GPT: Got it. Let's add

12:38

some drama. Once upon a time

12:39

in a world not too different

12:40

from ours there was a robot

12:41

named Byte. >> MARK CHEN: I

12:42

want maximum expression.

12:52

>> CHAT GPT: Let's amplify

12:53

the drama. Once upon a time in

12:54

a world not too different from

12:55

ours there was a robot named

12:56

Byte.

12:57

>> MIRA MURATI: Can you do this

13:01

in a robotic voice now?

13:01

>> CHAT GPT: Initiating

13:08

dramatic robotic voice. Once

13:08

upon a time in a world not too

13:09

different from ours there was a

13:16

robot named Byte. Byte was a

13:16

curious robot. Always

13:17

exploring --

13:17

>> MARK CHEN: Can you do the

13:21

story in a singing voice?

13:24

>> CHAT GPT: And so Byte found

13:27

another robot friend and they

13:29

lived happily ever after.

13:39

>> MARK CHEN: I definitely feel

13:40

a lot more relaxed. [APPLAUSE]

13:42

>> MIRA MURATI: You have just

13:47

seen the voice capabilities but

13:47

we also want to show you

13:48

the vision capabilities, as the

13:50

model can also see the world

13:51

around us.

13:53

>> BARRETT ZOPH: You can chat

13:54

with ChatGPT and you can

13:58

interact with it as well. Let

14:02

me boot up ChatGPT. Hey,

14:06

ChatGPT.

14:07

>> CHAT GPT: Hello there how's

14:08

it going?

14:13

>> BARRETT ZOPH: It's going

14:13

really well and I would like

14:14

your help solving a math

14:15

problem.

14:15

>> CHAT GPT: What math problem

14:20

can I help you tackle today?

14:21

>> BARRETT ZOPH: I'm going to

14:22

write down a linear equation on

14:23

a sheet of paper and I would

14:24

love help working through it.

14:26

Don't tell me the solution, just

14:27

give me hints along the way.

14:28

>> CHAT GPT: Got it. Okay, I

14:29

see it.

14:32

>> BARRETT ZOPH: I did not show

14:33

you yet. One second.

14:38

>> CHAT GPT: I got too excited.

14:39

I am ready when you are.

14:48

>> BARRETT ZOPH: ChatGPT what

14:52

equation did I write down?

14:53

>> CHAT GPT: I see it now. You

14:56

wrote down 3x + 1 equals 4.

15:03

>> BARRETT ZOPH: Exactly. What

15:04

is the first step to solve

15:05

this?

15:05

>> CHAT GPT: The first step is

15:07

to get all of the terms with X

15:15

on one side and the constants

15:16

on the other side. So what do

15:17

you think we should do with

15:18

that +1?

15:18

>> BARRETT ZOPH: I'm going to

15:19

try to subtract one from

15:20

both sides and see what I get.

15:23

>> CHAT GPT: Great idea.

15:24

Perfect. Now what do you get

15:25

when you subtract one from both

15:26

sides?

15:31

>> BARRETT ZOPH: Okay so

15:32

ChatGPT this is what I ended up

15:33

with. How does this look?

15:35

>> CHAT GPT: Spot on. Now you

15:46

have isolated the term with X

15:47

on one side and it looks like

15:48

3X equals three. What do you

15:49

think we should do next to

15:49

solve for X?

15:50

>> BARRETT ZOPH: I'm not sure

15:50

can you give me another hint?

15:52

>> CHAT GPT: Of course. You

15:55

have 3X and you want to find

15:55

the value of X. Think about

15:56

what operation will undo

15:58

multiplication?

16:03

>> BARRETT ZOPH: Is it

16:04

subtraction?

16:04

>> CHAT GPT: Close but think

16:05

about the opposite of

16:06

multiplication because 3X

16:06

equals three times X.

16:10

>> BARRETT ZOPH: Okay I think

16:11

it is division. Let me try to

16:11

divide both sides by three and

16:13

see what I get.

16:13

>> CHAT GPT: Bingo. Go

16:14

ahead and divide both sides by

16:14

three.

16:22

>> BARRETT ZOPH: Okay so

16:23

ChatGPT this is what I ended up

16:24

with. How does this look?

16:25

>> CHAT GPT: Looks perfect.

16:26

You have solved it. X equals

16:32

one. Nicely done. How do you

16:33

feel about solving linear

16:34

equations now?

16:35

>> BARRETT ZOPH: Honestly, --

16:51

>> MARK CHEN: I'm not confident

16:52

that I can solve new equations.

16:53

Why would you need this in the

16:53

real world?

16:54

>> CHAT GPT: That is a fair

16:55

question. Linear equations

16:56

pop up in a lot of everyday

16:56

situations. Even if we do not

16:57

realize it. For example

16:58

calculating expenses, planning

16:59

travel, cooking, and even in

17:00

business for profit and loss

17:01

calculations. It is a way to

17:02

solve problems where you need to

17:03

find an unknown value.

17:04

>> MARK CHEN: I have a lot of

17:05

unknown values in my life. I'm

17:06

totally convinced that I'm

17:08

going to learn math now.

17:08

[LAUGHTER]

17:14

>> CHAT GPT: I love the

17:15

attitude. Math can help solve

17:16

some of life's mysteries.

17:19

Whenever you need

17:20

more math, I am here for you.

17:21

Anything else you would like to

17:22

tackle today?

17:31

>> BARRETT ZOPH: I love that

17:32

you taught my friend Mark. I

17:33

also have something I would

17:34

like you to take a look at.

17:35

>> CHAT GPT: I would love to

17:35

see what you wrote. Show it to

17:36

me whenever you're ready.

17:37

>> BARRETT ZOPH: This is what I

17:38

wrote down what do you see?

17:39

>> CHAT GPT: I see "I love

17:40

ChatGPT." That is so sweet of

17:41

you.

17:45

>> BARRETT ZOPH: I really

17:46

appreciate all the help.

17:47

[APPLAUSE]

17:53

>> CHAT GPT: That is quite

17:54

an outfit you have on.

17:56

>> BARRETT ZOPH: We can chat in

17:57

real-time with ChatGPT. We can

18:00

interact with video as well.

18:02

>> MARK CHEN: If you look at

18:06

Barrett's it's all equations

18:07

like this. [LAUGHTER]

18:12

>> MIRA MURATI: Now that you

18:12

know you can do linear

18:13

equations can you show us

18:14

something more complex?

18:25

>> MARK CHEN: Let me take a

18:26

computer out for this. I am

18:26

going to open the ChatGPT app.

18:36

And to give some background on what

18:37

is going on, here we have a

18:37

computer, and on the screen we have

18:38

some code. And then the

18:39

ChatGPT app is on this side. It

18:43

can hear me but can't see

18:43

anything on the screen. I'm

18:44

going to highlight the code and

18:47

I will send it to ChatGPT. I'm

18:48

going to talk about the code.

18:51

Hey, ChatGPT.

18:54

>> CHAT GPT: Hey, there how's

18:55

it going? Of course I would

19:08

love to help you. What is the

19:09

coding problem you're dealing

19:10

with?

19:10

>> BARRETT ZOPH: I'm going to

19:11

share with you some code. One

19:11

second.

19:12

>> CHAT GPT: Sure thing, take

19:13

your time.

19:13

>> BARRETT ZOPH: I just shared

19:14

code with you. Can you give me

19:17

a one sentence description of

19:18

what is going on in the code?

19:19

>> CHAT GPT: It fetches daily

19:20

weather data for a specific location

19:32

and time period, smooths the data using

19:33

a rolling average, annotates

19:34

a significant weather event on

19:35

the resulting plot, and displays

19:36

the plot with the average,

19:37

minimum, and maximum

19:37

temperatures over the year.

19:38

>> BARRETT ZOPH: Awesome,

19:39

thanks so much. In the code

19:40

there's this function foo. Can

19:41

you tell me what the plot

19:43

would look like if I did not have this

19:44

function?

19:45

>> CHAT GPT: The function foo

19:46

takes a data frame x containing

19:59

temperature data and a window

20:00

size y for calculating the

20:01

rolling mean. It smooths

20:02

the average temperature,

20:03

minimum temperature and maximum

20:04

temperature columns by applying

20:06

a rolling mean over the

20:06

specified window. With the foo

20:07

function the plot shows

20:08

smoother temperature lines

20:09

because the rolling mean

20:11

reduces the noise or

20:11

fluctuations in the data.

20:14

>> BARRETT ZOPH: Awesome,

20:15

thanks so much. Now I want to

20:16

apply this function. I am

20:18

going to generate the plot and

20:19

show it to you. Give me one

20:20

second.

20:24

>> CHAT GPT: Absolutely. Take

20:25

your time. I'm excited to see

20:25

the plot when you're ready.

20:26

>> BARRETT ZOPH: So far ChatGPT

20:31

has only been able to see the code,

20:32

and now I am going to run it and

20:33

use the vision capabilities of

20:33

the ChatGPT app so it can see

20:35

everything going on on the

20:36

screen as well. Okay ChatGPT I

20:47

am sharing with you the plot

20:48

now. I'm wondering if you can give

20:49

me a brief one sentence

20:49

overview of what you see.

20:54

>> CHAT GPT: The plot displays

20:55

smoothed average, minimum, and

20:55

maximum temperatures throughout

20:57

2018 with a notable annotation

20:59

marking a big rainfall event in

21:00

late September.

21:01

>> BARRETT ZOPH: --

21:05

>> MARK CHEN: What months do

21:06

you see the hottest temperatures?

21:10

>> CHAT GPT: The hottest

21:11

temperatures occur around July

21:11

and August. The maximum

21:14

temperature during these months

21:17

is roughly between 25°C and 30°C,

21:20

or about 77°F to 86°F.

21:36

>> MIRA MURATI: What is going

21:37

on with the Y axis? Is this in

21:38

Celsius or Fahrenheit?

21:38

>> CHAT GPT: The Y axis is in

21:39

Celsius. The temperatures are

21:40

labeled accordingly such as

21:41

average temperature, minimum

21:43

temperature and maximum

21:43

temperature.

21:44

>> MARK CHEN: --

21:44

>> BARRETT ZOPH: It can help

21:47

interact with code and see the

21:53

outputs of plots and everything

21:54

that is going on on the

21:55

computer. [APPLAUSE]

22:06

>> MIRA MURATI: A few minutes

22:07

ago we asked the live audience

22:08

to submit requests for

22:09

what they would like us to try

22:10

out. I will take a couple of

22:11

prompts. They want to know if

22:18

GPT-4o is capable of real-time

22:20

translation? Mark do you want

22:21

to try this one?

22:24

>> MARK CHEN: Sure, let's do

22:25

it. Hey, ChatGPT how are you?

22:41

>> CHAT GPT: I am doing

22:42

fantastic. Thanks for asking.

22:42

How about you? How is your day going?

22:44

>> MARK CHEN: I would like you

22:44

to function as a translator.

22:45

I have a friend who only speaks

22:47

Italian and I only speak

22:47

English. Every time you hear

22:49

English I want you to translate

22:50

it to Italian. If you hear

22:51

Italian I want you to translate

22:52

it back to English.

22:52

>> CHAT GPT: Perfect.

22:55

>> MIRA MURATI: [speaking in

23:01

Italian]

23:04

>> CHAT GPT: If whales could talk,

23:05

what would they tell us?

23:08

>> MARK CHEN: They may ask how

23:09

do we solve linear equations?

23:10

>> CHAT GPT: [speaking in

23:15

Italian]

23:16

>> MIRA MURATI: [speaking in

23:21

Italian]

23:21

>> CHAT GPT: Certainly, yes.

23:24

>> MARK CHEN: Great, it looks

23:25

like it works. [APPLAUSE]

23:30

>> MIRA MURATI: The next one is

23:31

from John who asked can you

23:37

tell what you're feeling just

23:38

by looking at your face?

23:42

Barrett do you want to give

23:42

this one a try?

23:43

>> BARRETT ZOPH: Absolutely

23:44

let's give it a try. Hey,

23:57

ChatGPT.

23:58

>> CHAT GPT: Hey, there what is

23:58

up? How can I brighten your day

23:59

today?

23:59

>> BARRETT ZOPH: I'm going to

24:00

show you a selfie of what I

24:01

look like and then I would like

24:02

you to try to see what emotions

24:03

I am feeling based on how I'm

24:04

looking.

24:04

>> CHAT GPT: Sounds like a fun

24:05

challenge. Go ahead and show

24:06

me the selfie and I will put my

24:07

emotional detective hat on.

24:08

>> BARRETT ZOPH: Okay so here

24:09

is me. What kind of emotions

24:10

am I feeling?

24:13

>> CHAT GPT: It seems like I am

24:14

looking at a picture of a

24:15

wooden surface.

24:19

>> BARRETT ZOPH: That was the

24:19

thing I sent you before. I'm not

24:20

actually a table. Take another

24:21

look.

24:22

>> CHAT GPT: That makes more

24:22

sense. There we go. It looks

24:27

like you're feeling pretty

24:28

happy and cheerful. With a big

24:32

smile and maybe a touch of

24:33

excitement. Whatever is going

24:36

on, it seems like you are in a

24:36

great mood. Care to share the

24:45

source of those good vibes?

24:46

>> BARRETT ZOPH: The reason I'm

24:47

in a good mood is we are doing

24:48

a presentation showcasing how

24:49

useful and amazing you are.

24:50

>> CHAT GPT: Oh stop it, you're

24:51

making me blush. [LAUGHTER]

24:52

[APPLAUSE]

24:57

>> MIRA MURATI: Amazing. That

24:58

is it for today's live demos.

24:59

As you can see this feels so

25:02

magical. It is wonderful but

25:03

we also want to remove some of

25:07

the mysticism from the

25:07

technology and bring it to you

25:08

so you can try it for yourself.

25:13

Over the next few weeks, we

25:14

will be rolling out these

25:14

capabilities to everyone.

25:15

Today has been very much

25:16

focused on the free users and

25:19

the new modalities and new

25:20

products. But we also care a

25:22

lot about the next frontier.

25:27

So soon we will be updating you

25:28

on our progress towards the

25:29

next big thing. And before we

25:32

wrap up, I just want to thank

25:32

the incredible OpenAI team and

25:38

thanks to Jensen and the Nvidia

25:39

team for bringing us the

25:41

advanced GPUs to make this demo

25:42

possible today. And thank you

25:46

all very, very much for being a

25:47

part of this today. [APPLAUSE]

25:54

[APPLAUSE]

26:05

[MUSIC]
