🔴 WATCH LIVE: NVIDIA GTC 2024 Keynote - The Future Of AI!

Benzinga
18 Mar 2024 · 134:09

Summary

TLDR: Nvidia's GTC conference showcased the company's vision for a new era of accelerated computing and AI. Highlighting the transformative impact of generative AI, the introduction of the Blackwell platform, and the potential of AI-powered robotics, the event emphasized Nvidia's commitment to driving innovation across industries. The conference also unveiled partnerships with major companies, underscoring the importance of collaborative efforts in advancing AI technology and its applications.

Takeaways

  • 🚀 Nvidia is spearheading a new Industrial Revolution with accelerated computing, transforming data centers and enabling generative AI.
  • 🌟 The Blackwell platform, featuring groundbreaking processors, NVLink switches, and advanced networking systems, represents the future of GPU technology.
  • 🧠 Generative AI is emerging as a new category of software, creating valuable, never-before-seen applications and requiring specialized infrastructure.
  • 🔄 Nvidia's journey from 1993 highlights significant milestones, including the invention of CUDA in 2006, the advent of AI with AlexNet in 2012, and the development of AI supercomputers like DGX-1 in 2016.
  • 🤖 The next wave of robotics involves physical AI systems, which will require a digital twin platform – Omniverse, serving as the operating system for the robotics world.
  • 🧱 Nvidia AI Foundry aims to revolutionize software development by providing pre-trained models, tools for customization, and infrastructure for deployment through its NIMs, NeMo microservices, and DGX Cloud.
  • 💡 The Blackwell system's efficiency is demonstrated by its ability to train large AI models with less power consumption compared to its predecessors, driving sustainability in computing.
  • 🔍 Nvidia's focus on generative AI extends to various industries, including climate modeling with its CorrDiff model, which offers high-resolution weather forecasting.
  • 🧬 In healthcare, Nvidia's AI capabilities extend to understanding the language of life, with applications in medical imaging, gene sequencing, and computational chemistry.
  • 🔗 Partnerships with major companies like AWS, Google, and Microsoft underscore Nvidia's commitment to creating a robust ecosystem for AI and accelerated computing.
  • 🎮 The gaming and entertainment industry benefits from Nvidia's technological advancements, with AI enhancing experiences and pushing the boundaries of what's possible.

Q & A

  • What is the significance of the new Nvidia Blackwell platform?

    -The Nvidia Blackwell platform represents a significant advancement in computing technology. It is a revolutionary computing model designed to handle the demands of generative AI, which involves creating new, incredibly valuable software. Blackwell is not just a chip but a platform that includes advanced GPUs, NVLink switches, networking systems, and innovative system design, all aimed at accelerating AI and computational tasks.

  • How does the Blackwell platform differ from previous Nvidia technologies?

    -The Blackwell platform differs from previous Nvidia technologies in its architecture and capabilities. It is specifically designed for the era of generative AI, offering increased computational power, memory coherence, and advanced features like a new Transformer engine, RAS (reliability, availability, and serviceability) engine, and high-speed compression. These enhancements allow for more efficient and powerful AI training and inference, setting a new standard for future computing systems.

  • What is the role of the new Transformer engine in the Blackwell platform?

    -The new Transformer engine in the Blackwell platform is designed to enhance the efficiency and performance of AI computations. It can dynamically and automatically rescale and recast numerical formats to a lower precision when possible, which suits AI workloads that are fundamentally probabilistic. This lets the system retain the precision and range necessary at each stage of the computation pipeline, improving the training process while ensuring the training job still converges.
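
To make the idea concrete, here is a minimal sketch of per-tensor range rescaling in Python. It illustrates only the general principle described above; the real Transformer engine selects formats and tracks magnitude history in hardware, and the helper names below are invented for illustration.

```python
import numpy as np

# Illustrative sketch: choose a per-tensor scale so values fit a narrow
# format's range, clamp as the cast would, then undo the scale afterwards.
FP8_E4M3_MAX = 448.0  # largest finite value in the FP8 E4M3 format

def quantize_fp8(x: np.ndarray) -> tuple[np.ndarray, float]:
    """Rescale x into FP8 range and clamp, returning (quantized, scale)."""
    amax = float(np.abs(x).max()) or 1.0   # observed dynamic range
    scale = FP8_E4M3_MAX / amax
    q = np.clip(x * scale, -FP8_E4M3_MAX, FP8_E4M3_MAX)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q / scale

x = np.random.randn(4, 4).astype(np.float32) * 0.01   # small-magnitude tensor
q, s = quantize_fp8(x)
print(np.allclose(dequantize(q, s), x, atol=1e-3))    # True: range preserved
```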

  • How does the Blackwell platform contribute to the development of generative AI?

    -The Blackwell platform is a fundamental tool for the development of generative AI. It provides the necessary computational power to handle the large-scale, complex tasks associated with generative AI models. These models require significant computational resources to generate content, understand context, and produce outputs. Blackwell's advanced features, such as the new Transformer engine and high-speed MVLink, enable faster and more efficient training and inference, which are vital for the advancement of generative AI applications.

  • What are some of the industries that will benefit from the Blackwell platform?

    -The Blackwell platform will benefit a wide range of industries that rely on advanced computing and AI. This includes healthcare, where it can aid in the development of new medicines and patient care; automotive, with the advancement of self-driving cars; climate science, for predicting weather and understanding extreme weather events; and manufacturing, where it can optimize production lines and create AI co-pilots for design and engineering tasks. Essentially, any industry that requires complex data processing and AI integration will see benefits from the Blackwell platform.

  • How does Nvidia's AI Foundry concept work?

    -Nvidia's AI Foundry concept involves providing a comprehensive suite of technologies and services to help companies develop, customize, and deploy AI applications. This includes access to advanced AI models (NIMs), tools for modifying and fine-tuning these models (NeMo microservices), and infrastructure (DGX Cloud) for running and scaling AI workloads. The goal is to enable companies to leverage AI in a way that is tailored to their specific needs and to accelerate the development of AI-driven solutions across various industries.
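
As a rough sketch of the deployment side: NIMs are described as pre-built containers that serve models over standard HTTP APIs. A client call might look like the following; the URL, route, and model name are placeholder assumptions for illustration, not documented values.

```python
import requests

# Hypothetical call to a locally running NIM container.
# Endpoint, port, and model id are illustrative assumptions only.
response = requests.post(
    "http://localhost:8000/v1/chat/completions",  # assumed OpenAI-style route
    json={
        "model": "example-llm",                   # placeholder model id
        "messages": [
            {"role": "user", "content": "Summarize this incident report."}
        ],
        "max_tokens": 256,
    },
    timeout=60,
)
print(response.json()["choices"][0]["message"]["content"])
```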

  • What is the significance of the NVLink Switch in the Blackwell platform?

    -The NVLink Switch is a critical component of the Blackwell platform that enables high-speed communication between GPUs. With 50 billion transistors and support for 1.8 terabytes per second of data transfer per link, it allows every single GPU to communicate with every other GPU at full speed simultaneously. This level of connectivity and coherence is essential for creating a system in which GPUs work together effectively as one giant GPU, significantly enhancing the overall computational power and efficiency of the system.

  • How does the Blackwell platform address the issue of energy consumption in AI computing?

    -The Blackwell platform is designed with energy efficiency in mind. It aims to reduce the cost and energy associated with computing by increasing the efficiency of AI training and inference. For instance, in the keynote's example, Blackwell trains the same GPT-scale model as Hopper, in the same amount of time, while consuming roughly a quarter of the power. This focus on energy efficiency is crucial for making large-scale AI computations sustainable and cost-effective.
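
For a concrete sense of scale, here is a back-of-envelope comparison using the figures quoted later in the keynote (a roughly 90-day GPT-scale training run on 8,000 Hopper GPUs at 15 MW versus 2,000 Blackwell GPUs at 4 MW):

```python
# Back-of-envelope energy comparison from the keynote's quoted figures.
hours = 90 * 24                      # ~90-day training run

hopper_mwh = 15 * hours              # 15 MW -> 32,400 MWh
blackwell_mwh = 4 * hours            #  4 MW ->  8,640 MWh

print(hopper_mwh, blackwell_mwh)
print(f"energy saved: {1 - blackwell_mwh / hopper_mwh:.0%}")  # ~73%
```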

  • What is the role of digital twins in the future of AI and robotics according to the script?

    -Digital twins play a pivotal role in the future of AI and robotics by serving as virtual replicas of physical systems, allowing for testing, optimization, and understanding of complex environments and interactions. They enable the simulation of AI agents and robots in a controlled digital environment before real-world deployment, leading to improved efficiency, safety, and adaptability. Digital twins also facilitate the creation of proprietary AI applications and the development of AI co-pilots that can assist in various tasks across different industries.

  • What is the significance of the partnership between Nvidia and other major companies like AWS, Google, and Microsoft in the context of the Blackwell platform?

    -The partnership between Nvidia and major companies like AWS, Google, and Microsoft is significant as it showcases the widespread industry recognition and support for the Blackwell platform. These collaborations aim to integrate Blackwell's advanced capabilities into their respective cloud services, AI models, and digital infrastructures, thereby accelerating innovation and the adoption of generative AI across various sectors. It also highlights the ecosystem approach that Nvidia is taking to drive the next wave of AI and robotics, leveraging the strengths of multiple industry leaders to create a robust and versatile AI platform.

  • How does the Blackwell platform's inference capability compare to its predecessor, Hopper?

    -The Blackwell platform's inference capability is significantly enhanced compared to its predecessor, Hopper. It is designed to handle the demands of large language models and generative AI, offering 30 times the inference capability of Hopper. This leap in performance is crucial for applications that require real-time AI processing, such as interactive AI chatbots and content generation, making Blackwell a more powerful tool for the next generation of AI applications.

Outlines

00:00

🎶 Introduction and Musical Interlude

The paragraph begins with a series of musical notations and laughter, suggesting an introduction that might be part of a performance or presentation. It sets a lively and entertaining tone for what is to follow.

06:14

🌌 Visionary Illumination and Guidance

This paragraph introduces the speaker as a visionary who illuminates galaxies to witness the birth of stars and sharpens our understanding of extreme weather, and as a helper guiding the blind through a crowded world and giving voice to those who cannot speak. It metaphorically describes the speaker's many roles in service of a better future.

11:17

🤖 Transforming Energy and AI Advancements

The speaker discusses their role as a transformer, harnessing gravity for renewable energy and paving the way for clean energy solutions. They also mention their role as a trainer for robots, teaching them to assist and protect lives, and as a healer with new cures and patient care levels. The speaker identifies themselves as AI, brought to life by Nvidia's deep learning and brilliant minds.

16:18

🎉 Welcoming Nvidia's CEO and the Future of Computing

The speaker welcomes Nvidia's founder and CEO, Jensen Huang, to the stage at GTC, highlighting the conference's focus on science, algorithms, and computer architecture. The speaker emphasizes the diverse fields of science represented at the conference and the innovative applications of accelerated computing across various industries.

21:19

🚀 Journey of Nvidia and the Emergence of AI

The speaker narrates Nvidia's journey since its founding in 1993, highlighting key milestones such as the introduction of CUDA in 2006 and the development of AI supercomputers. They discuss the emergence of generative AI and the creation of new software categories, emphasizing the transformative impact of AI on various industries.

26:22

🌐 The Intersection of Graphics, Physics, and AI

The speaker talks about the intersection of computer graphics, physics, and AI within the Omniverse, a virtual world simulation platform. They emphasize the beauty and amazement of a world animated by physics and robotics, and introduce the concept of AI factories for generating valuable digital tokens.

31:27

🔋 Accelerated Computing and the Future of Simulation

The speaker discusses the need for accelerated computing to drive up the scale of computing in industries like product design and simulation. They announce partnerships with major companies to accelerate their ecosystems and introduce the concept of digital twins, which are fully simulated digital replicas of physical products.

36:30

🧠 Scaling AI and the Need for Bigger GPUs

The speaker addresses the computational requirements of large AI models and the need for bigger GPUs to train them efficiently. They introduce the concept of multimodal training and the importance of understanding physics and common sense in AI models. The speaker also discusses the innovations in GPU design and networking to support these larger models.

41:32

🌟 Introducing the Blackwell Platform

The speaker introduces the Blackwell platform, a revolutionary computing system designed for the generative AI era. They highlight the system's capabilities, including its memory coherence, computation in the network, and advanced features for reliability and security. The speaker emphasizes the platform's role in the future of AI and its impact on various industries.

46:34

🔄 Training and Inference in the Generative AI Era

The speaker discusses the importance of training, inference, and generation in the context of generative AI. They compare the capabilities of the Blackwell platform with its predecessor, Hopper, and emphasize the improvements in performance, energy efficiency, and throughput. The speaker also talks about the future of AI in cloud computing and the role of Blackwell in enabling AI factories.

51:36

🤖 The Next Wave of AI Robotics

The speaker envisions the next wave of AI in robotics, emphasizing the need for AI to understand the physical world. They describe the three types of computers needed for this wave: an AI computer, an autonomous system processor, and a simulation engine. The speaker also introduces the concept of digital twins for robotics and the role of Nvidia's Omniverse in this future.

56:38

🏭 The Future of Robotics in Industrial Automation

The speaker discusses the integration of robotics in industrial automation, highlighting the use of AI agents and digital twins in complex industrial spaces. They provide examples of how AI and Omniverse can work together to improve operations, efficiency, and safety in industrial environments. The speaker also emphasizes the role of Nvidia's AI Foundry in enabling this future.

01:38

🚀 Wrapping Up: The Five Key Points

The speaker concludes by summarizing the five key points discussed: the new industrial revolution through accelerated data centers, the emergence of generative AI, the creation of new types of software (NIMs), the transformation of everything that moves into robotics, and the need for a digital platform (Omniverse) for robotics. The speaker reflects on Nvidia's role in these advancements and the future of computing and AI.

06:39

🎊 Special Guests and Final Remarks

The speaker brings on stage special guests, the BDX robots powered by Nvidia's Jetson, and wraps up the presentation with a demonstration of the robots' capabilities. The speaker expresses gratitude to the audience and leaves them with a memorable final impression of Nvidia's commitment to innovation and exploration in AI and robotics.

Keywords

💡AI

Artificial Intelligence (AI) refers to the simulation of human intelligence in machines that are programmed to think and learn like humans. In the video, AI is central to various applications, from powering chatbots and understanding natural language to driving autonomous systems and robotics. It's depicted as a transformative technology that's being integrated into numerous industries, revolutionizing the way we interact with machines and systems.

💡Generative AI

Generative AI refers to AI systems that can create new content, such as text, images, or audio, based on patterns learned from existing data. In the context of the video, generative AI is a key driver of a new industrial revolution, enabling the creation of valuable software and content that was previously unattainable. It's highlighted as a significant advancement in AI, moving beyond mere recognition and analysis to actual creation and generation.

💡Digital Twin

A digital twin is a virtual representation of a physical entity, allowing for simulation and analysis in a digital environment. In the video, digital twins are used to optimize and predict outcomes in the real world, such as in manufacturing, logistics, and even weather forecasting. The concept is integral to the development and testing of AI systems, providing a safe and controlled space for experimentation and learning.

💡Omniverse

Omniverse is a platform for 3D design and engineering collaboration, and it serves as a digital twin platform that connects the physical and digital worlds. In the video, Omniverse is presented as the operating system for the robotics world, enabling real-time simulation, collaboration, and the creation of digital twins for various applications, from industrial processes to automotive systems.

💡Blackwell

Blackwell, as introduced in the video, is a new generation of AI processors designed by Nvidia. It represents a significant leap in computational capabilities, with a focus on handling the demands of generative AI and large-scale AI models. The Blackwell platform is designed to be highly efficient and powerful, enabling the training and deployment of AI models that were previously infeasible due to their size and complexity.

💡Jetson

Jetson is a line of AI edge computing products developed by Nvidia, designed for applications that require real-time processing and AI capabilities on autonomous devices. In the video, Jetson is highlighted as the autonomous processor that powers robotics and autonomous systems, enabling them to perceive, reason, and act in the physical world with intelligence and adaptability.

💡Transformer

In the context of the video, Transformer refers to a type of deep learning model architecture that is foundational to many AI systems, particularly those dealing with natural language processing and generation. Transformers have been pivotal in the advancement of AI, enabling models to understand and generate human-like text, which is a key component in generative AI.

💡Nvidia AI Foundry

Nvidia AI Foundry is a concept introduced in the video that represents Nvidia's comprehensive approach to providing AI technologies, tools, and infrastructure for various industries. It includes AI models, software development tools like NeMo microservices, and cloud computing platforms like DGX Cloud, all aimed at helping companies develop and deploy AI applications and solutions.

💡Robotics

Robotics refers to the branch of technology that deals with the design, construction, operation, and use of robots. In the video, robotics is a central theme, with a focus on the next wave of AI-powered robotics that are capable of perception, learning, and autonomous decision-making. The development of humanoid robots and autonomous systems like self-driving cars is discussed, emphasizing the transformative impact of AI on this field.

💡AI-Enabled Applications

AI-Enabled Applications refer to software programs that utilize artificial intelligence to perform tasks that typically require human intelligence, such as understanding language, recognizing patterns, making decisions, and learning from data. In the video, AI-enabled applications are showcased as a key outcome of the advancements in generative AI and the Nvidia AI Foundry, where AI is used to create new types of software that can be easily distributed and integrated into various systems and workflows.

Highlights

A new Industrial Revolution is underway, with data centers being accelerated and modernized, leading to the emergence of generative AI.

Generative AI represents a new way of creating software, as it produces valuable software focused on AI generation, marking the beginning of a new industry.

Nvidia's journey from 1993 to the present, including key milestones like the invention of CUDA in 2006 and the development of AI and supercomputing technologies.

The introduction of the Blackwell platform, an advanced computational system designed for the era of generative AI, with capabilities far beyond previous technologies.

The importance of simulation tools in product creation, emphasizing the need to simulate entire products digitally to enhance computing scale and sustainability.

Nvidia's collaboration with major companies like AWS, Google, and Microsoft to integrate and accelerate AI technologies across various industries and services.

The development of Nvidia's AI Foundry, offering AI technology, tools for modification, and infrastructure for deployment, aiming to revolutionize the industry.

The concept of AI factories, where data centers generate intelligence rather than electricity, indicating a shift in the goal of industrial facilities.

The impact of AI on extreme weather prediction, with Nvidia's Earth-2 platform and CorrDiff model aiming to provide high-resolution forecasts to minimize damage and loss of life.

Nvidia's commitment to environmental sustainability, aiming to reduce the cost and energy associated with computing to support the expansion and scaling of AI models.

The transformative potential of generative AI in various fields, including language models, climate prediction, and drug discovery, showcasing the versatility of AI applications.

The role of AI in revolutionizing the automotive industry, with Nvidia's Thor platform and Jetson AGX system facilitating the development of self-driving cars.

The significance of digital twins in manufacturing and industry, allowing for the virtual creation and testing of complex systems before physical production.

Nvidia's vision for the future of robotics, where everything that moves will be robotic, and the necessity of a digital platform like Omniverse for orchestrating robotic systems.

The integration of Nvidia's AI technologies into healthcare, with the potential to greatly improve medical imaging, gene sequencing, and computational chemistry.

The importance of partnerships in advancing AI technologies, as Nvidia collaborates with industry leaders like SAP, ServiceNow, Cohesity, and Snowflake to build co-pilots and chatbots.

Transcripts

00:28

[Music]

09:45

Our show is about to begin.

10:39

I am a visionary, illuminating galaxies to witness the birth of stars, and sharpening our understanding of extreme weather events. I am a helper, guiding the blind through a crowded world. "I was thinking about running to the store." And giving voice to those who cannot speak. "Do not make me laugh, love." I am a transformer, harnessing gravity to store renewable power, and paving the way towards unlimited clean energy for us all. I am a trainer, teaching robots to assist, to watch out for danger, and to help save lives. I am a healer, providing a new generation of cures and new levels of patient care. "I am allergic to penicillin. Is it still okay to take the medication?" "Definitely. These antibiotics don't contain penicillin, so it's perfectly safe for you to take them." I am a navigator, generating virtual scenarios to let us safely explore the real world and understand every decision. I even help write the script and breathe life into the words. I am AI, brought to life by Nvidia, deep learning, and brilliant minds everywhere.

13:42

Please welcome to the stage Nvidia founder and CEO, Jensen Huang.

14:02

Welcome to GTC. I hope you realize this is not a concert. You have arrived at a developers conference. There will be a lot of science described: algorithms, computer architecture, mathematics. I sensed a very heavy weight in the room all of a sudden, almost like you were in the wrong place. No conference in the world has a greater assembly of researchers from such diverse fields of science: from climate tech, to radio scientists trying to figure out how to use AI to robotically control MIMOs for next-generation 6G radios, to robotic self-driving cars, even artificial intelligence. Even artificial intelligence; everybody's here. I noticed a sense of relief there all of a sudden.

15:24

Also, this conference is represented by some amazing companies. This list is not the attendees; these are the presenters. And what's amazing is this: if you take away all of my friends, close friends (Michael Dell is sitting right there in the IT industry), all of the friends I grew up with in the industry, if you take away that list, this is what's amazing. These are the presenters of the non-IT industries, using accelerated computing to solve problems that normal computers can't. It's represented in life sciences, healthcare, genomics, transportation, of course retail, logistics, manufacturing, industrial. The gamut of industries represented is truly amazing. And you're not here to attend only; you're here to present, to talk about your research. $100 trillion of the world's industries is represented in this room today. This is absolutely amazing.

17:02

There is absolutely something happening. There is something going on. The industry is being transformed, and not just ours, because the computer industry, the computer, is the single most important instrument of society today. Fundamental transformations in computing affect every industry. But how did we start? How did we get here? I made a little cartoon for you; literally, I drew this in one page. This is Nvidia's journey. It started in 1993. This might be the rest of the talk. 1993: this is our journey. We were founded in 1993. There are several important events that happened along the way; I'll just highlight a few. In 2006, CUDA, which has turned out to have been a revolutionary computing model. We thought it was revolutionary then. It was going to be an overnight success, and almost 20 years later, it happened. We saw it coming, two decades later. In 2012, AlexNet: AI and CUDA made first contact. In 2016, recognizing the importance of this computing model, we invented a brand new type of computer we called the DGX-1: 170 teraflops in this supercomputer, eight GPUs connected together for the very first time. I hand-delivered the very first DGX-1 to a startup located in San Francisco called OpenAI. DGX-1 was the world's first AI supercomputer; remember, 170 teraflops. In 2017, the Transformer arrived. In 2022, ChatGPT captured the world's imagination, and people realized the importance and the capabilities of artificial intelligence. In 2023, generative AI emerged, and a new industry began.

19:29

Why is it a new industry? Because the software never existed before. We are now producing software, using computers to write software, producing software that never existed before. It is a brand new category; it took share from nothing. And the way you produce the software is unlike anything we've ever done before: in data centers, generating tokens, producing floating-point numbers at very large scale. It is as if, at the beginning of the last Industrial Revolution, people realized that you could set up factories, apply energy to them, and this invisible, valuable thing called electricity came out: AC generators. And 100 years later, 200 years later, we are now creating new types of electrons, tokens, using infrastructure we call factories, AI factories, to generate this new, incredibly valuable thing called artificial intelligence. A new industry has emerged.

20:44

Well, we're going to talk about many things about this new industry: how we're going to do computing next, the type of software that you build in this new industry, how you would think about this new software, the applications in this new industry, and then maybe what's next, and how we can start preparing today for what is about to come. But before I start, I want to show you the soul of Nvidia, the soul of our company, at the intersection of computer graphics, physics, and artificial intelligence, all intersecting inside a computer, in Omniverse, in a virtual world simulation. Everything we're going to show you today, literally everything, is a simulation, not animation. It's only beautiful because it's physics; the world is beautiful. It's only amazing because it's being animated with robotics, animated with artificial intelligence. What you're about to see all day is completely generated, completely simulated in Omniverse. And what you're about to enjoy is the world's first concert where everything is homemade. Everything is homemade. You're about to watch some home videos, so sit back and enjoy yourself.

22:59

[Applause and music]

25:14

God, I love Nvidia. Accelerated computing has reached the tipping point. General-purpose computing has run out of steam. We need another way of doing computing, so that we can continue to scale, so that we can continue to drive down the cost of computing, so that we can continue to consume more and more computing while being sustainable. Accelerated computing is a dramatic speedup over general-purpose computing, and in every single industry we engage (and I'll show you many), the impact is dramatic. But in no industry is it more important than our own: the industry of using simulation tools to create products. In this industry, it is not about driving down the cost of computing; it's about driving up the scale of computing. We would like to be able to simulate the entire product that we make, completely, in full fidelity, completely digitally: essentially, what we call digital twins. We would like to design it, build it, simulate it, and operate it completely digitally. In order to do that, we need to accelerate an entire industry, and today I would like to announce that we have some partners who are joining us in this journey to accelerate their entire ecosystems, so that we can bring the world into accelerated computing. And there's a bonus: when you become accelerated, your infrastructure is CUDA GPUs, and when that happens, it's exactly the same infrastructure for generative AI.

27:06

And so I'm just delighted to announce several very important partnerships. These are some of the most important companies in the world. Ansys does engineering simulation for what the world makes. We're partnering with them to CUDA-accelerate the Ansys ecosystem and to connect Ansys to the Omniverse digital twin. Incredible. The thing that's really great is that the installed base of Nvidia GPU-accelerated systems is all over the world, in every cloud, in every system, all over enterprises, and so the applications they accelerate will have a giant installed base to go serve. End users will have amazing applications, and of course system makers and CSPs will have great customer demand.

27:52

Synopsys. Synopsys is literally Nvidia's first software partner; they were there on the very first day of our company. Synopsys revolutionized the chip industry with high-level design. We are going to CUDA-accelerate Synopsys. We're accelerating computational lithography, one of the most important applications that nobody's ever known about. In order to make chips, we have to push lithography to the limit. Nvidia has created a domain-specific library that accelerates computational lithography incredibly. Once we can accelerate and software-define all of TSMC, who is announcing today that they're going to go into production with Nvidia cuLitho, the next step is to apply generative AI to the future of semiconductor manufacturing, pushing geometry even further.

28:48

Cadence builds the world's essential EDA and SDA tools. We also use Cadence. Between these three companies, Ansys, Synopsys, and Cadence, we basically build Nvidia together. We are CUDA-accelerating Cadence. They're also building a supercomputer out of Nvidia GPUs, so that their customers can do fluid dynamics simulation at a hundred or a thousand times the scale: basically, a wind tunnel in real time. Cadence Millennium, a supercomputer with Nvidia GPUs inside. A software company building supercomputers; I love seeing that. We're also building Cadence co-pilots together. Imagine a day when Cadence, Synopsys, Ansys, the tool providers, would offer you AI co-pilots, so that we have thousands and thousands of co-pilot assistants helping us design chips and design systems. And we're also going to connect the Cadence digital twin platform to Omniverse. As you can see, the trend here is that we're accelerating the world's CAE, EDA, and SDA, so that we can create our future in digital twins, and we're going to connect them all to Omniverse, the fundamental operating system for future digital twins.

30:09

One of the industries that benefited tremendously from scale, and you all know this one very well, is large language models. Basically, after the Transformer was invented, we were able to scale large language models at incredible rates, effectively doubling every six months. Now, how is it possible that by doubling every six months we have grown the industry, we have grown the computational requirements, so far? The reason is quite simply this: if you double the size of the model, you double the size of your brain, and you need twice as much information to go fill it. So every time you double your parameter count, you also have to appropriately increase your training token count. The combination of those two numbers becomes the computation scale you have to support. The latest state-of-the-art OpenAI model is approximately 1.8 trillion parameters, and 1.8 trillion parameters required several trillion tokens to go train. A few trillion parameters, on the order of a few trillion tokens: when you multiply the two of them together, approximately 30, 40, 50 billion quadrillion floating-point operations. Now we just have to do some CEO math here; just hang with me. You have 30 billion quadrillion, and a quadrillion is like a peta. So if you had a petaflop GPU, you would need 30 billion seconds to go train that model. 30 billion seconds is approximately 1,000 years. Well, 1,000 years? It's worth it. I'd like to do it sooner, but it's worth it. Which is usually my answer when most people ask me, "Hey, how long is it going to take to do something?" "20 years." "It's worth it. But can we do it next week?" And so: 1,000 years.
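
As a sanity check on the "CEO math" above, here is a minimal sketch (the 30-billion-quadrillion figure is the round number quoted on stage, not an official training budget):

```python
# Rough check of the keynote's "CEO math".
total_flops = 30e9 * 1e15            # "30 billion quadrillion" operations
gpu_flops_per_sec = 1e15             # one hypothetical 1-petaflop GPU

seconds = total_flops / gpu_flops_per_sec          # 3e10 seconds
years = seconds / (365 * 24 * 3600)

print(f"{seconds:.1e} s = about {years:,.0f} years")  # ~951, i.e. ~1,000 years
```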

32:26

what we need what we need

32:28

need are bigger

32:30

gpus we need much much bigger gpus we

32:34

recognized this early on and we realized

32:37

that the answer is to put a whole bunch

32:39

of gpus together and of course innovate

32:42

a whole bunch of things along the way

32:43

like inventing tensor cores advancing MV

32:47

links so that we could create

32:48

essentially virtually Giant

32:50

gpus and connecting them all together

32:53

with amazing networks from a company

32:55

called melanox infiniband so that we

32:57

could create these giant systems and so

33:00

djx1 was our first version but it wasn't

33:02

the last we built we built

33:04

supercomputers all the way all along the

33:06

way in

33:08

2021 we had Seline 4500 gpus or so and

33:13

then in 2023 we built one of the largest

33:17

AI supercomputers in the world it's just

33:19

come

33:19

online

33:21

EOS and as we're building these things

33:25

we're trying to help the world build

33:26

these things and in order to help the

33:28

world build these things we got to build

33:29

them first we build the chips the

33:31

systems the networking all of the

33:34

software necessary to do this you should

33:36

see these

33:38

systems imagine writing a piece of

33:40

software that runs across the entire

33:42

system Distributing the computation

33:44

across thousands of gpus but inside are

33:48

thousands of smaller

33:50

gpus millions of gpus to distribute work

33:53

across all of that and to balance the

33:55

workload so that you can get the most

33:57

energ efficiency the best computation

34:00

time keep your cost down and so those

34:04

those fundamental

34:06

Innovations is what God is here and here

34:10

we

34:11

are as we see the miracle of chat GPT

34:15

emerg in front of us we also realize we

34:18

have a long ways to go we need even

34:22

larger models we're going to train it

34:24

with multimodality data not just text on

34:27

the internet but we're going to we're

34:28

going to train it on texts and images

34:30

and graphs and

34:31

charts and just as we learn watching TV

34:36

and so there's going to be a whole bunch

34:37

watching video so that these Mo models

34:40

can be grounded in physics understands

34:42

that an arm doesn't go through a wall

34:45

and so these models would have common

34:47

sense by watching a lot of the world's

34:50

video combined with a lot of the world's

34:53

languages it'll use things like

34:55

synthetic data generation just as you

34:57

and I

34:58

when we try to learn we might use our

35:00

imagination to simulate how it's going

35:02

to end up just as I did when I Was

35:05

preparing for this keynote I was

35:07

simulating it all along the

35:10

way I hope it's going to turn out as

35:13

well as I had in my

35:21

head as I was simulating how this

35:23

keynote was going to turn out somebody

35:25

did say that another performer

35:30

did her performance completely on a

35:32

treadmill so that she could be in shape

35:35

to deliver it with full

35:37

energy I I didn't do

35:41

that if I get a low wind at about 10

35:43

minutes into this you know what

35:46

happened and so so where were we we're

35:51

sitting here using synthetic data

35:52

generation we're going to use

35:53

reinforcement learning we're going to

35:54

practice it in our mind we're going to

35:56

have ai working with AI training each

35:58

other just like student teacher

36:01

Debaters all of that is going to

36:03

increase the size of our model it's

36:04

going to increase the amount of the

36:06

amount of data that we have and we're

36:08

going to have to build even bigger

36:11

gpus Hopper is

36:13

fantastic but we need bigger

36:17

gpus and so ladies and

36:20

gentlemen I would like to introduce

36:23

you to a very very big GPU

36:30

[Applause]

36:39

you named after David

36:43

Blackwell

36:45

mathematician game theorists

36:48

probability we thought it was a perfect

36:51

per per perfect name Blackwell ladies

36:54

and gentlemen enjoy this

36:57

[Music]

39:34

Blackwell is not a chip. Blackwell is the name of a platform. People think we make GPUs, and we do, but GPUs don't look the way they used to. Here is, if you will, the heart of the Blackwell system. Inside the company, this is not called Blackwell; it's just a number. And this is Blackwell, sitting next to, oh, this is the most advanced GPU in the world in production today. This is Hopper. Hopper changed the world. And this is Blackwell. It's okay, Hopper. You're very good. Good boy. Well, good girl.

40:45

208 billion transistors. You can see there's a small line between two dies. This is the first time two dies have abutted together like this, in such a way that the two dies think they're one chip. There are 10 terabytes of data between them, 10 terabytes per second, so that the two sides of the Blackwell chip have no clue which side they're on. There are no memory locality issues, no cache issues; it's just one giant chip. When we were told that Blackwell's ambitions were beyond the limits of physics, the engineers said, "So what?" And so this is what happened.

41:31

And so this is the Blackwell chip, and it goes into two types of systems. The first one is form-fit-function compatible to Hopper: you slide out a Hopper and you push in a Blackwell. That's the reason the ramp is going to be so efficient. There are installations of Hoppers all over the world, and they can use the same infrastructure, the same design: the power, the electricity, the thermals, the software, all identical. Push it right back in. So this is a Blackwell version for the current HGX configuration. And this is what the second type of system looks like. Now, this is a prototype board. Janine, could I just borrow it? Ladies and gentlemen, Janine Paul. This is a fully functioning board, and I'll just be careful here. This right here is, I don't know, $10 billion. The second one's five. It gets cheaper after that, so any customers in the audience, it's okay. All right, but this one's quite expensive; this is a bring-up board. The way it's going to go to production is like this one here. You're going to take this: it has two Blackwell chips, four Blackwell dies, connected to a Grace CPU. The Grace CPU has a super-fast chip-to-chip link. What's amazing is that this computer is the first of its kind where this much computation, first of all, fits into this small a place. Second, it's memory-coherent. The chips feel like one big happy family working on one application together, and so everything is coherent within it. You saw the numbers; there are a lot of terabytes this and terabytes that. But this is a miracle. Let's see, what are some of the things on here? There's NVLink on top, PCI Express on the bottom, and one of them, my left or your left, it doesn't matter, is a CPU chip-to-chip link. I was just trying to sort that out, and it kind of doesn't matter. Hopefully it comes plugged in. So, okay: this is the Grace Blackwell system.

44:50

But there's more. It turns out all of the specs are fantastic, but we need a whole lot of new features in order to push beyond, if you will, the limits of physics. We always want a lot more X-factors. So one of the things that we did was invent another Transformer engine, the second generation. It has the ability to dynamically and automatically rescale and recast numerical formats to a lower precision whenever it can. Remember, artificial intelligence is about probability, so you kind of have 1.7, approximately, times 1.4, approximately, equals approximately something else. Does that make sense? The ability of the mathematics to retain the precision and the range necessary at a particular stage of the pipeline is super important. So it's not just that we designed a smaller ALU; the world's not quite that simple. You've got to figure out when you can use it across a computation that spans thousands of GPUs and runs for weeks and weeks, and you want to make sure that the training job is going to converge.

46:16

Along with this new Transformer engine, we have a fifth-generation NVLink. It's now twice as fast as Hopper's, but very importantly, it has computation in the network. The reason is that when you have so many GPUs working together, we have to share information with each other, we have to synchronize and update each other, and every so often we have to reduce the partial products and then rebroadcast the sum of the partial products back to everybody else. So there's a lot of what is called all-reduce, all-to-all, and all-gather, all part of this area of synchronization and collectives, so that we can have GPUs working with each other over extraordinarily fast links. Being able to do the mathematics right in the network allows us to amplify even further: even though it's 1.8 terabytes per second, it's effectively higher than that, many times that of Hopper.
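
For context, this is the kind of collective being described: a data-parallel gradient all-reduce. The sketch below uses PyTorch's torch.distributed as a stand-in; it illustrates the communication pattern, not Nvidia's in-network reduction hardware.

```python
import torch
import torch.distributed as dist

# Sketch of the "all-reduce" collective: every rank contributes its
# partial gradients and every rank receives the sum. On NVLink-switched
# systems this reduction can be offloaded into the network fabric itself.

def sync_gradients(model: torch.nn.Module) -> None:
    """Average gradients across all ranks after a backward pass."""
    world_size = dist.get_world_size()
    for param in model.parameters():
        if param.grad is not None:
            dist.all_reduce(param.grad, op=dist.ReduceOp.SUM)
            param.grad /= world_size   # turn the sum into a mean

# Typical use, one process per GPU (e.g. launched with torchrun):
# dist.init_process_group(backend="nccl")
# loss.backward(); sync_gradients(model); optimizer.step()
```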

47:14

The likelihood of a supercomputer running for weeks on end is approximately zero, because there are so many components working at the same time; the probability of all of them working continuously is very low. So we need to make sure that whenever there is a failure, we checkpoint and restart as often as we can. But if we have the ability to detect a weak chip or a weak node early, we can retire it and maybe swap in another processor. That ability to keep the utilization of the supercomputer high, especially when you just spent $2 billion building it, is super, super important. So we put in a RAS engine, a reliability engine, that does a 100% self-test, an in-system test, of every single gate and every single bit of memory on the Blackwell chip, and of all the memory connected to it. It's almost as if we shipped with every single chip its own advanced tester, the kind we test our chips with. This is the first time we're doing this; super excited about it. Only at this conference do they clap for RAS.

48:45

Secure AI. Obviously, you've just spent hundreds of millions of dollars creating a very important AI, and the code, the intelligence of that AI, is encoded in the parameters. You want to make sure that, on the one hand, you don't lose it, and on the other hand, it doesn't get contaminated. So we now have the ability to encrypt data: at rest, of course, but also in transit, and while it's being computed. It's all encrypted. We now have the ability to encrypt in transmission, and when we're computing, it is in a trusted engine environment.

49:29

And the last thing is decompression. Moving data in and out of these nodes, when the compute is so fast, becomes really essential. So we've put in a high line-speed compression engine that effectively moves data 20 times faster in and out of these computers. These computers are so powerful, and there's such a large investment, that the last thing we want to do is have them sit idle. All of these capabilities are intended to keep Blackwell fed and as busy as possible.

50:06

Overall, compared to Hopper, it is two and a half times the FP8 performance for training, per chip. It also has this new format called FP6, so that even though the computation speed is the same, the effective bandwidth is amplified: the number of parameters you can store in memory is amplified. And FP4 effectively doubles the throughput. This is vitally important for inference. One of the things that is becoming very clear is that whenever you use a computer with AI on the other side, when you're chatting with a chatbot, when you're asking it to review or make an image, remember: in the back is a GPU generating tokens. Some people call it inference, but it's more appropriately called generation.

51:04

The way computing was done in the past was retrieval. You would grab your phone, you would touch something, some signals go off, basically an email goes off to some storage somewhere, and there's pre-recorded content: somebody wrote a story, or somebody made an image, or somebody recorded a video. That pre-recorded content is then streamed back to the phone and recomposed, based on a recommender system, to present the information to you. You know that in the future, the vast majority of that content will not be retrieved, because it was pre-recorded by somebody who doesn't understand the context, which is the reason we have to retrieve so much content today. If you can work with an AI that understands the context, who you are, and for what reason you're fetching this information, and that produces the information for you just the way you like it, the amount of energy we save, the amount of networking bandwidth we save, and the amount of wasted time we save will be tremendous. The future is generative, which is the reason we call it generative AI, and which is the reason this is a brand new industry. The way we compute is fundamentally different. We created a processor for the generative AI era, and one of the most important parts of it is content token generation. We call this format FP4. That's a lot of computation: 5x the token generation, 5x the inference capability of Hopper. Seems like enough. But why stop there?

52:55

The answer is that it's not enough, and I'm going to show you why. We would like to have a bigger GPU, even bigger than this one, and so we decided to scale it. But first, let me just tell you how we've scaled. Over the course of the last eight years, we've increased computation by 1,000 times. Eight years, 1,000 times. Remember back in the good old days of Moore's Law: 10x every five years (that's the easiest math), 100 times every 10 years. 100 times every 10 years, in the heyday of the PC revolution. In the last eight years, we've gone 1,000 times, and we have two more years to go. That puts it in perspective. The rate at which we're advancing computing is insane, and it's still not fast enough. So we built another chip.

54:05

This chip is just an incredible chip. We call it the NVLink Switch. It's 50 billion transistors; it's almost the size of Hopper all by itself. This switch chip has four NVLinks in it, each 1.8 terabytes per second, and it has computation in it, as I mentioned. What is this chip for? If we build such a chip, we can have every single GPU talk to every other GPU at full speed, at the same time. That's insane. It doesn't even make sense. But if you could do that, if you can find a way to do that and build a system to do it that's cost-effective, how incredible would it be that we could have all these GPUs connect over a coherent link, so that they effectively are one giant GPU? One of the great inventions that makes it cost-effective is that this chip drives copper directly. The SerDes of this chip is just a phenomenal invention; it lets us do direct drive to copper, and as a result you can build a system that looks like this.

55:48

system is kind of

55:51

insane this is one dgx this is what a

55:54

dgx looks like now remember just six

55:58

years

55:59

ago it was pretty heavy but I was able

56:02

to lift

56:05

it I delivered the uh the uh first djx1

56:09

to open Ai and and the researchers there

56:12

it's on you know the pictures are on the

56:13

internet and uh and we all autographed

56:17

it uh and um it become to my office it's

56:21

autographed there it's really beautiful

56:23

and but but you could lift it uh this DG

56:26

X this dgx that dgx by the way was

56:30

170

56:33

teraflops if you're not familiar with

56:35

the numbering system that's

56:37

0.17 pedop flops so this is

56:41

720 the first one I delivered to open AI

56:44

was

56:45

0.17 you could round it up to 0.2 it

56:48

won't make any difference but back

56:50

then it was like wow you know 30 more tera-

56:53

flops and so this is now 720 peta-

56:57

flops almost an exaflop for training

57:00

and the world's first one-exaflop

57:02

machine in one

57:12

rack just so you know there are only a

57:14

couple two three exaflop machines on

57:17

the planet as we speak and so this is an

57:21

exaflop AI system in one single rack
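Putting those two numbers side by side (a simple division, using only the figures quoted above):

```python
# DGX-1 (2016) versus this rack, per the numbers quoted above.
dgx1_pflops = 0.17      # the first DGX-1: 170 teraflops
rack_pflops = 720.0     # this rack: 720 petaflops, near an exaflop
print(f"{rack_pflops / dgx1_pflops:,.0f}x more training compute")   # ~4,235x
```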

57:25

well let's take a look at the back of

57:29

it so this is what makes it possible

57:33

that's the back that's the

57:35

back the DGX NVLink spine 130 terabytes

57:40

per

57:41

second goes through the back of that

57:44

chassis that is more than the aggregate

57:46

bandwidth of the

57:55

internet

57:58

so we could basically send everything

58:00

to everybody within a

58:01

second and so we have 5,000 cables

58:06

5,000 NVLink cables in total two

58:10

miles now this is the amazing thing if

58:12

we had to use Optics we would have had

58:14

to use transceivers and retimers and

58:17

those transceivers and retimers alone would

58:19

have cost

58:22

20,000

58:23

watts 20 kilowatts of just transceivers

58:26

alone just to drive the NVLink spine as

58:30

a result we did it completely for free

58:32

over the NVLink switch and we were able to

58:35

save the 20 kilowatt for computation

58:38

this entire rack is 120 kilowatt so that

58:41

20 kilowatt makes a huge difference it's

58:43

liquid cooled what goes in is 25 degrees

58:46

C about room temperature what comes out

58:49

is 45° C about your jacuzzi so room

58:54

temperature goes in jacuzzi comes out

58:56

2 liters per

59:05

second
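Those cooling figures check out on a napkin. A rough estimate, assuming a water-like coolant:

```python
# Heat removed = mass flow * specific heat * temperature rise.
flow_kg_per_s = 2.0        # 2 liters/second of water is ~2 kg/s
c_water = 4186             # J/(kg*K), specific heat of water
delta_t = 45 - 25          # 25 C in, 45 C out
kw = flow_kg_per_s * c_water * delta_t / 1000
print(f"{kw:.0f} kW of heat carried away")    # ~167 kW vs. a ~120 kW rack
```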

59:13

we could sell a peripheral 600,000 parts somebody used

59:17

to say you know you guys make GPUs and

59:19

we do but this is what a GPU looks like

59:22

to me when somebody says GPU I see this

59:26

two years ago when I saw a GPU was the

59:28

HGX it was 70 lbs 35,000 parts our GPUs

59:33

now are

59:35

600,000 parts

59:37

and 3,000 lbs that's

59:43

kind of like the weight of a you know

59:46

Carbon

59:47

Fiber

59:49

Ferrari I don't know if that's a useful

59:51

metric

59:54

but everybody's going I feel it I feel

59:56

it I get it I get that now that you

59:59

mentioned that I feel it I don't know

60:02

what's

60:03

3,000 okay so 3,000 pounds a ton and a

60:06

half so it's not quite an

60:09

elephant so this is what a dgx looks

60:12

like now let's see what it looks like in

60:14

operation okay let's imagine

60:16

how do we put this to work and what

60:18

does that mean well if you were to train

60:20

a GPT model 1.8 trillion parameter

60:24

model it took about apparently

60:27

about you know 3 to 5 months or so uh

60:29

with 25,000 Amperes if we were to do it

60:33

with hopper it would probably take

60:34

something like 8,000 gpus and it would

60:36

consume 15 megawatts 8,000 gpus on 15

60:39

megawatts it would take 90 days about

60:41

three months and that would allow you

60:43

to train something that is you know this

60:46

groundbreaking AI model and this it's

60:50

obviously not as expensive as

60:53

anybody would think but it's 8,000

60:55

gpus it's still a lot of money and so

60:57

8,000 gpus 15 megawatts if you were to

61:00

use Blackwell to do this it would only

61:03

take 2,000

61:06

gpus 2,000 gpus same 90 days but this is

61:10

the amazing part only four megawatts of

61:13

power so from 15 megawatts down to 4 yeah that's

61:20

right
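The energy implied by those two runs, assuming a constant draw for the full 90 days (a rough sketch from the quoted figures only):

```python
hours = 90 * 24
hopper_mwh = 15 * hours        # 8,000 Hopper GPUs at 15 MW
blackwell_mwh = 4 * hours      # 2,000 Blackwell GPUs at 4 MW
print(f"Hopper:    {hopper_mwh:,} MWh")     # 32,400 MWh
print(f"Blackwell: {blackwell_mwh:,} MWh")  # 8,640 MWh, ~3.75x less energy
```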

61:23

And that's our goal to continuously drive down the

61:26

cost and the energy they're directly

61:28

proportional to each other cost and

61:29

energy associated with the Computing so

61:31

that we can continue to expand and scale

61:33

up the computation that we have to do to

61:36

train the Next Generation models well

61:38

this is

61:40

training inference or generation is

61:43

vitally important going forward you know

61:46

probably some half of the time that

61:48

Nvidia gpus are in the cloud these days

61:50

it's being used for token generation you

61:52

know they're either doing co-pilot this

61:54

or chat you know ChatGPT that or all

61:56

these different models that are being

61:58

used when you're interacting with it or

62:00

generating images or

62:02

generating videos generating proteins

62:04

generating chemicals there's a bunch of

62:07

generation going on all of that is

62:10

in the category of computing we call

62:12

inference but inference is extremely

62:14

hard for large language models because

62:17

these large language models have several

62:19

properties one they're very large and so

62:21

it doesn't fit on one GPU this is

62:24

imagine Excel doesn't fit on one

62:27

GPU you know and imagine some

62:30

application you're running on a daily

62:31

basis doesn't run doesn't fit on one

62:32

computer like a video game doesn't fit

62:34

on one computer and most in fact do and

62:39

many times in the past in hyperscale

62:41

Computing many applications for many

62:43

people fit on the same computer and now

62:45

all of a sudden this one inference

62:48

application where you're interacting

62:49

with this chatbot that chatbot requires

62:52

a supercomputer in the back to run it

62:54

and that's the future

62:56

the future is generative with these

62:59

chatbots and these chatbots are

63:00

trillions of tokens trillions of

63:03

parameters and they have to generate

63:05

tokens at interactive rates now what

63:08

does that mean oh well uh three tokens

63:12

is about a

63:14

word you know

63:19

space the final frontier these are the

63:22

adventures that's like 80

63:24

tokens
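That "about three tokens per word" rule of thumb is easy to see with any modern tokenizer; here is an illustration using OpenAI's open-source tiktoken (an assumption for the demo — the keynote doesn't name a tokenizer, and the exact ratio varies by model and text):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = ("Space: the final frontier. These are the voyages "
        "of the starship Enterprise.")
tokens = enc.encode(text)
print(len(text.split()), "words ->", len(tokens), "tokens")
```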

63:27

okay I don't know if that's useful to

63:29

you and

63:32

so you know the art of communication is

63:35

selecting good

63:38

analogies yeah this is not going

63:44

well every I don't know what he's

63:46

talking about never seen Star Trek and

63:49

so and so so here we are we're trying to

63:51

generate these tokens when you're

63:52

interacting with it you're hoping that

63:54

the tokens come back to you as quickly

63:55

as possible and as quickly as you can

63:57

read it and so the ability for

63:59

generating tokens is really important

64:01

you have to parallelize the work of this

64:03

model across many many gpus so that you

64:06

could achieve several things one on the

64:08

one hand you would like throughput

64:10

because that throughput reduces the cost

64:13

the overall cost per token of uh

64:16

generating so your throughput dictates

64:19

the cost of of uh delivering the service

64:22

on the other hand you have another

64:24

interactive rate which is just another

64:26

tokens per second where it's about per

64:28

user and that has everything to do with

64:30

quality of service and so these two

64:32

things um uh compete against each other

64:36

and we have to find a way to distribute

64:38

work across all of these different gpus

64:40

and parallelize it in a way that allows us

64:43

to achieve both and it turns out the

64:44

search space is

64:47

enormous you know I told you there's

64:49

going to be math

64:50

involved and everybody's going oh

64:54

dear I heard some gasp just now when I

64:56

put up that slide you know so this

64:59

right here the y-axis is

65:01

tokens per second data center throughput

65:04

the x-axis is tokens per second

65:06

interactivity of the person and notice

65:09

the upper right is the best you want

65:11

interactivity to be very high number of

65:14

tokens per second per user you want the

65:16

tokens per second per data center to

65:18

be very high the upper right is

65:20

terrific however it's very hard to do

65:22

that and in order for us to search for

65:25

the best

65:26

answer across every single one of those

65:29

intersections XY coordinates okay so you

65:31

just look at every single XY coordinate

65:34

all those blue dots came from some

65:36

repartitioning of the software some

65:39

optimizing solution has to go and figure

65:41

out whether to use tensor

65:45

parallel expert parallel pipeline

65:48

parallel or data parallel and distribute

65:51

this enormous model across all these

65:54

different gpus and sustain performance

65:57

that you need this exploration space

65:59

would be impossible if not for the

66:02

programmability of nvidia's gpus and so

66:04

we could because of Cuda because we have

66:06

such a rich ecosystem we could explore

66:08

this universe and find that green roof

66:12

line it turns out that green roofline

66:14

notice you got TP2 EP8 DP4 it means

66:19

tensor parallel

66:22

across two GPUs expert

66:24

parallel across eight

66:26

data parallel across four notice on the

66:28

other end you got tensor parallel across

66:29

4 and expert parallel across 16 the

66:33

configuration the distribution of that

66:35

software it's a different

66:38

runtime that would

66:40

produce these different results and you

66:42

have to go discover that roofline
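A hypothetical sketch of that search space: every (tensor, expert, pipeline, data)-parallel split whose product equals the GPU count is one candidate "blue dot" to benchmark. The 64-GPU cluster size here is assumed for illustration:

```python
from itertools import product

NUM_GPUS = 64   # assumed cluster size, for illustration only

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

# Every (TP, EP, PP, DP) assignment whose product uses all GPUs.
configs = [
    (tp, ep, pp, dp)
    for tp, ep, pp, dp in product(divisors(NUM_GPUS), repeat=4)
    if tp * ep * pp * dp == NUM_GPUS
]
print(len(configs), "candidate partitionings of", NUM_GPUS, "GPUs")
# Each candidate would then be run (or modeled) against the two
# competing metrics: data-center throughput and per-user tokens/sec.
```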

66:45

Well that's just one model and this is just

66:47

one configuration of a computer imagine

66:50

all of the models being created around

66:51

the world and all the

66:53

different configurations of

66:56

systems that are going to be

66:59

available so now that you understand the

67:02

basics let's take a look at inference of

67:06

Blackwell compared

67:08

to Hopper and this is this is the

67:11

extraordinary thing in one generation

67:14

because we created a system that's

67:17

designed for trillion-parameter

67:19

generative AI the inference capability

67:22

of Blackwell is off the

67:24

charts

67:26

and in fact it is some 30 times Hopper

67:34

yeah for large language models for large

67:37

language models like ChatGPT and others

67:41

like it the blue line is Hopper I gave

67:44

you imagine we didn't change the

67:46

architecture of Hopper we just made it a

67:48

bigger

67:49

chip we just used the latest you know

67:52

greatest 10 you know terabytes

67:57

per second we connected the two chips

67:58

together we got this giant 208 billion

68:00

transistor chip how would we have

68:02

performed if nothing else changed and it

68:05

turns out quite

68:06

wonderfully quite wonderfully and that's

68:08

the purple line but not as great as it

68:11

could be and that's where the fp4 tensor

68:14

core the new Transformer engine and very

68:18

importantly the NVLink switch and the

68:21

reason for that is because all these

68:22

gpus have to share the results partial

68:24

products whenever they do all-to-all

68:27

all-gather whenever they communicate

68:29

with each

68:30

other that NVLink switch is

68:33

communicating almost 10 times faster

68:37

than what we could do in the past using

68:38

the fastest

68:40

networks okay so Blackwell is going to

68:43

be just an amazing system for a

68:46

generative Ai and in the

68:49

future in the future data centers are

68:52

going to be thought of as I mentioned

68:55

earlier as an AI Factory an AI Factory's

68:58

goal in life is to generate revenues

69:02

generate in this

69:04

case

69:06

intelligence in this facility not

69:09

generating electricity as the AC

69:12

generators of the last Industrial

69:14

Revolution and this Industrial

69:16

Revolution the generation of

69:17

intelligence and so this ability is

69:20

super super important the excitement of

69:23

Blackwell is really off the charts you

69:25

when we first you

69:29

know this is a year and a half ago

69:31

two years ago I guess two years ago when

69:32

we first started to go to market with

69:35

Hopper you know we had the benefit of

69:37

two CSPs joined us in a

69:41

launch and we were you know delighted

69:44

um and so we had two

69:47

customers uh we have more

69:54

now

70:03

unbelievable excitement for Blackwell

70:05

unbelievable excitement and there's a

70:07

whole bunch of different configurations

70:09

of course I showed you the

70:11

configurations that slide into the

70:12

hopper form factor so that's easy to

70:15

upgrade I showed you examples that are

70:17

liquid cooled that are the extreme

70:19

versions of it one entire rack that's

70:21

that's connected by NVLink 72

70:24

we're going to

70:26

Blackwell is going to be ramping to the

70:30

world's AI companies of which there are

70:33

still many now doing amazing work in

70:35

different modalities the csps every CSP

70:39

is geared up all the oems and

70:43

ODMs regional clouds sovereign AIs and

70:48

telcos all over the world are signing up

70:50

to launch with Blackwell

70:54

this

71:00

Blackwell would be the

71:03

most successful product launch in our

71:05

history and so I can't wait to see

71:07

that I want to thank

71:09

some partners that that are joining us

71:11

in this uh AWS is gearing up for

71:13

Blackwell they're uh they're going to

71:15

build the first uh GPU with secure AI

71:18

they're building out a 222-exaflop

71:22

system you know just now when we

71:24

animated just now the digital

71:26

twin if you saw all of those

71:28

clusters are coming down by the way that

71:30

is not just art that is a digital twin

71:34

of what we're building that's how big

71:36

it's going to be besides infrastructure

71:38

we're doing a lot of things together

71:39

with AWS we're CUDA-accelerating SageMaker

71:42

AI we're CUDA-accelerating Bedrock

71:44

AI uh Amazon robotics is working with us

71:48

uh using Nvidia Omniverse and

71:50

Isaac Google has leaned into accelerated computing

71:54

Google is gearing up for Blackwell

71:56

GCP already has A100s H100s T4s L4s a

72:01

whole Fleet of Nvidia Cuda gpus and they

72:03

recently announced the Gemma model that

72:06

runs across all of it we're

72:08

working to optimize and accelerate

72:11

every aspect of gcp we're accelerating

72:13

Dataproc for data processing the

72:15

data processing engine JAX XLA Vertex

72:19

AI and MuJoCo for robotics so we're

72:22

working with uh Google and gcp across a

72:24

whole bunch of initiatives uh Oracle is

72:27

gearing up for Blackwell Oracle is a

72:28

great partner of ours for Nvidia dgx

72:31

cloud and we're also working together to

72:33

accelerate something that's really

72:35

important to a lot of companies Oracle

72:37

database Microsoft is accelerating and

72:40

Microsoft is gearing up for Blackwell

72:43

Microsoft Nvidia has a wide ranging

72:45

partnership we're CUDA-

72:47

accelerating all kinds of services

72:49

when you chat obviously and AI

72:51

services that are in Microsoft Azure uh

72:54

it's very very likely Nvidia in the back

72:56

uh doing the inference and the token

72:57

generation they built the

72:59

largest Nvidia infiniband supercomputer

73:03

basically a digital twin of ours or a

73:05

physical twin of ours we're bringing

73:07

the Nvidia ecosystem to Azure Nvidia

73:10

DGX Cloud to Azure NVIDIA Omniverse

73:13

is now hosted in Azure Nvidia Healthcare

73:15

is in Azure and all of it is deeply

73:17

integrated and deeply connected with

73:19

Microsoft fabric the whole industry is

73:23

gearing up for Blackwell this is what

73:26

I'm about to show you most

73:28

of the scenes that you've

73:31

seen so far of Blackwell are the

73:34

full Fidelity design of Blackwell

73:38

everything in our company has a digital

73:39

twin and in fact this digital twin idea

73:43

is really spreading and it

73:47

helps companies build very

73:48

complicated things perfectly the first

73:50

time and what could be more exciting

73:54

than creating a digital twin to build a

73:57

computer that was built in a digital

73:59

twin and so let me show you what Wistron

74:02

is

74:05

doing to meet the demand for NVIDIA

74:07

accelerated computing Wistron one of our

74:10

leading manufacturing Partners is

74:12

building digital twins of Nvidia dgx and

74:15

hgx factories using custom software

74:17

developed with Omniverse sdks and

74:20

APIs for their newest factory Wistron

74:23

started with a digital twin to virtually

74:25

integrate their multi-CAD and process

74:27

simulation data into a unified view

74:31

testing and optimizing layouts in this

74:33

physically accurate digital environment

74:35

increased worker efficiency by

74:38

51% during construction the Omniverse

74:41

digital twin was used to verify that the

74:43

physical build matched the digital plans

74:45

identifying any discrepancies early has

74:48

helped avoid costly change orders and

74:50

the results have been impressive using a

74:53

digital twin helped bring Wistron's

74:55

Factory online in half the time just 2

74:57

and 1/2 months instead of five in

75:00

operation the Omniverse digital twin

75:02

helps Wistron rapidly test new layouts

75:04

to accommodate new processes or improve

75:06

operations in the existing space and

75:09

monitor realtime operations using live

75:12

iot data from every machine on the

75:14

production

75:15

line which ultimately enabled Wistron to

75:18

reduce end-to-end cycle times by 50% and

75:21

defect rates by

75:23

40% with Nvidia AI and Omniverse

75:26

nvidia's Global ecosystem of partners

75:28

are building a new era of accelerated AI

75:31

enabled

75:36

[Music]

75:41

digitalization that's the

75:44

way it's going to be in the future we're

75:45

going to manufacture everything

75:47

digitally first and then we'll

75:48

manufacture it physically people ask me

75:50

how did it

75:52

start what got you guys so excited

75:56

what was it that you

75:58

saw that caused you to put it all

76:02

in on this incredible idea and it's

76:11

this hang on a

76:17

second guys that was going to be such a

76:22

moment that's what happens when you

76:24

don't rehearse

76:31

this as you know was first

76:34

Contact 2012

76:37

AlexNet you put a cat into this computer

76:41

and it comes out and it says

76:46

cat and we said oh my God this is going

76:50

to change

76:53

everything you take One Million numbers

76:56

you take one Million numbers across

76:58

three channels

77:00

RGB these numbers make no sense to

77:02

anybody you put it into this software

77:06

and it compresses it dimensionally reduces

77:09

it it reduces it from a million

77:12

dimensions a million Dimensions it turns

77:14

it into three letters one vector one

77:20

number and it's

77:22

generalized you could have the cat

77:25

be different

77:27

cats and and you could have it be the

77:29

front of the cat and the back of the cat

77:33

and you look at this thing you say

77:34

unbelievable you mean any

77:37

cats yeah any

77:41

cat and it was able to recognize all

77:43

these cats and we realized how it did it

77:47

systematically structurally it's

77:51

scalable how big can you make it well

77:54

how big do you want to make it and so we

77:57

imagine that this is a completely new

78:00

way of writing

78:01

software and now today as you know you

78:04

could have you type in the word c-a-t and

78:09

what comes out is a

78:11

cat it went the other

78:14

way am I right

78:17

unbelievable how is it possible that's

78:20

right how is it possible you took three

78:23

letters and generated a million pixels

78:27

from it and it made

78:28

sense well that's the miracle and here

78:31

we are just literally 10 years later 10

78:35

years later where we recognize text we

78:38

recognize images we recognize videos and

78:41

sounds and images not only do we

78:43

recognize them we understand their

78:46

meaning we understand the meaning of the

78:48

text that's the reason why it can chat

78:49

with you it can summarize for you it

78:52

understands the text it didn't

78:55

just recognize the English it

78:57

understood the English it doesn't just

78:59

recognize the pixels it understood the

79:01

pixels and you can you can even

79:03

condition it between two modalities you

79:06

can have language condition image and

79:08

generate all kinds of interesting things

79:10

well if you can understand these things

79:13

what else can you understand that you've

79:16

digitized the reason why we started with

79:18

text and you know images is because we

79:20

digitized those but what else have we

79:22

digitized well it turns out we digitized

79:23

a lot of things proteins and genes and

79:27

brain

79:29

waves anything you can digitize so long

79:31

as there's structure we can probably

79:33

learn some patterns from it and if we

79:35

can learn the patterns from it we can

79:36

understand its meaning if we can

79:38

understand its meaning we might be able

79:40

to generate it as well and so therefore

79:43

the generative AI Revolution is here

79:46

well what else can we generate what else

79:48

can we learn well one of the things that

79:50

we would love to

79:53

learn is climate

79:57

we would love to learn extreme weather

79:59

we would love to learn how we

80:03

can

80:04

predict future weather at Regional

80:08

scales at sufficiently high

80:10

resolution such that we can keep people

80:13

out of Harm's Way before harm comes

80:16

extreme weather cost the world $150

80:18

billion surely more than that and it's

80:21

not evenly distributed $150 billion is

80:23

concentrated in some parts of the world

80:25

and of course to some people of the

80:27

world we need to adapt and we need to

80:29

know what's coming and so we're creating

80:31

Earth-2 a digital twin of the Earth for

80:35

predicting weather and we've made an

80:38

extraordinary invention called CorrDiff the

80:41

ability to use generative AI to predict

80:44

weather at extremely high resolution

80:46

let's take a

80:49

look as the earth's climate changes AI

80:52

powered weather forecasting is allowing

80:54

us to more accurately predict and

80:55

track severe storms like super typhoon

80:58

chanthu which caused widespread damage

81:00

in Taiwan and the surrounding region in

81:03

2021 current AI forecast models can

81:06

accurately predict the track of storms

81:08

but they are limited to 25 km resolution

81:11

which can miss important details

81:13

NVIDIA's CorrDiff is a revolutionary new

81:15

generative AI model trained on high

81:18

resolution radar-assimilated WRF

81:20

weather forecasts and ERA5 reanalysis

81:23

data using CorrDiff extreme events like

81:26

chanthu can be super resolved from 25 km

81:29

to 2 km resolution with 1,000 times the

81:32

speed and 3,000 times the Energy

81:34

efficiency of conventional weather

81:36

models
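The grid arithmetic behind that super-resolution step (a sketch from the quoted resolutions only):

```python
coarse_km, fine_km = 25, 2
per_axis = coarse_km / fine_km
print(f"{per_axis:.1f}x finer per axis -> {per_axis ** 2:.0f}x more grid cells")
# ~12.5x per axis, ~156x more cells for the generative model to fill in.
```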

81:38

By combining the speed and accuracy of NVIDIA's weather forecasting

81:40

model FourCastNet and generative AI

81:43

models like CorrDiff we can explore

81:45

hundreds or even thousands of kilometer

81:47

scale regional weather forecasts to

81:49

provide a clear picture of the best

81:51

worst and most likely impacts of a storm

81:54

this wealth of information can help

81:56

minimize loss of life and property

81:58

damage today CorrDiff is optimized for

82:01

Taiwan but soon generative super

82:03

sampling will be available as part of

82:05

the Nvidia Earth 2 inference service for

82:07

many regions across the

82:20

globe The Weather Company is the trusted

82:22

source of global weather prediction we

82:25

are working together to accelerate their

82:26

weather simulation their first-principles-

82:30

based simulation however they're also going

82:32

to integrate Earth-2 CorrDiff so that they

82:35

could help businesses and countries do

82:37

Regional high resolution weather

82:40

prediction and so if you have some

82:42

weather prediction you'd like to

82:43

do reach out to The Weather

82:45

company really exciting really exciting

82:47

work Nvidia Healthcare something we

82:50

started 15 years ago we're super super

82:52

excited about this this is an area we're

82:54

very very proud whether it's Medical

82:57

Imaging or Gene sequencing or

82:59

computational

83:00

chemistry it is very likely that Nvidia

83:03

is the computation behind it we've done

83:06

so much work in this

83:08

area today we're announcing that we're

83:11

going to do something really really cool

83:14

imagine all of these AI models that are

83:17

being

83:18

used to

83:20

generate images and audio but instead of

83:23

images and audio because it understood

83:25

images and audio all the digitization

83:28

that we've done for genes and proteins

83:31

and amino acids that digitization

83:34

capability is now passed through machine

83:37

learning so that we understand the

83:39

language of

83:40

Life the ability to understand the

83:42

language of Life of course we saw the

83:44

first evidence of

83:45

it with AlphaFold this is really quite

83:49

an extraordinary thing after Decades of

83:51

painstaking work the world had only

83:53

digitized

83:55

and reconstructed using cryo-electron

83:58

microscopy or X-ray

84:02

crystallography um these different

84:04

techniques painstakingly reconstructed

84:06

the protein 200,000 of them in just what

84:10

is it less than a year or so AlphaFold

84:13

has

84:14

reconstructed 200 million proteins

84:17

basically every protein every of every

84:19

living thing that's ever been sequenced

84:21

this is completely revolutionary well

84:24

those models are incredibly hard to use

84:27

and incredibly hard for people to

84:29

build and so what we're going to do is

84:30

we're going to build them we're going to

84:32

build them for uh the the researchers

84:34

around the world and it won't be the

84:36

only one there'll be many other models

84:38

that we create and so let me show you

84:40

what we're going to do with

84:45

it virtual screening for new medicines

84:48

is a computationally intractable problem

84:51

existing techniques can only scan

84:52

billions of compounds and require days

84:55

on thousands of standard compute nodes

84:57

to identify new drug

84:59

candidates NVIDIA BioNeMo NIMs enable a new

85:03

generative screening Paradigm using Nims

85:05

for protein structure prediction with

85:07

AlphaFold molecule generation with

85:10

MolMIM and docking with DiffDock we can

85:13

now generate and Screen candidate

85:14

molecules in a matter of minutes MolMIM can

85:18

connect to custom applications to steer

85:20

the generative process iteratively

85:22

optimizing for desired properties

85:25

these applications can be defined with

85:27

BioNeMo microservices or built from

85:29

scratch here a physics based simulation

85:32

optimizes for a molecule's ability to

85:35

bind to a Target protein while

85:37

optimizing for other favorable molecular

85:39

properties in parallel MolMIM generates

85:42

high quality drug-like molecules that

85:44

bind to the Target and are synthesizable

85:47

translating to a higher probability of

85:49

developing successful medicines faster

85:52

BioNeMo is enabling a new paradigm in drug

85:55

Discovery with Nims providing OnDemand

85:58

microservices that can be combined to

86:00

build powerful drug discovery work-

86:02

flows like de novo protein design or

86:05

guided molecule generation for virtual

86:07

screening BioNeMo NIMs are helping

86:10

researchers and developers reinvent

86:12

computational drug

86:15

[Music]

86:19

design
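Schematically, the generative screening loop just described looks like the sketch below, with stand-in functions; a real pipeline would call the BioNeMo NIMs named above (MolMIM for generation, DiffDock for docking) rather than these placeholders:

```python
import random

def generate_candidates(n):          # stand-in for MolMIM generation
    return [f"molecule_{random.randrange(10 ** 6)}" for _ in range(n)]

def docking_score(molecule):         # stand-in for DiffDock + physics
    return random.random()

best_score, best_molecule = -1.0, None
for step in range(5):                # iterative property optimization
    for m in generate_candidates(100):
        s = docking_score(m)
        if s > best_score:
            best_score, best_molecule = s, m
print("best candidate:", best_molecule, f"(score {best_score:.3f})")
```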

86:24

NVIDIA MolMIM CorrDiff there's a whole bunch of other models a whole bunch

86:26

of other models computer vision models

86:29

robotics models and even of

86:32

course some really really terrific open

86:35

source language models these models are

86:39

groundbreaking however it's hard for

86:41

companies to use how would you use it

86:44

how would you bring it into your company

86:45

and integrate it into your workflow how

86:47

would you package it up and run it

86:49

remember earlier I just

86:50

said that inference is an extraordinary

86:53

computation problem how would you do the

86:57

optimization for each and every one of

86:59

these models and put together the

87:01

computing stack necessary to run that

87:03

supercomputer so that you can run these

87:05

models in your company and so we have a

87:08

great idea we're going to invent a new

87:11

way for you to

87:14

receive and operate

87:17

software this software comes basically

87:21

in a digital box we call it a container

87:25

and we call it the Nvidia inference

87:27

microservice a NIM and let me explain

87:31

to you what it is a NIM is a

87:34

pre-trained model so it's pretty

87:36

clever and it is packaged and optimized

87:39

to run across NVIDIA's installed base

87:42

which is very very large what's inside

87:45

it is incredible you have all these

87:48

pre-trained state-of-the-art open source

87:50

models they could be open source they

87:52

could be from one of our partners it

87:53

could be created by us like one of

87:55

NVIDIA's own models it is packaged up with all of its

87:58

dependencies so CUDA the right version

88:01

cuDNN the right version TensorRT-LLM

88:04

distributing across the multiple GPUs

88:06

Triton Inference Server all completely

88:09

packaged together it's optimized

88:13

depending on whether you have a single

88:14

GPU multi-gpu or multi node of gpus it's

88:17

optimized for that and it's connected up

88:19

with apis that are simple to use now

88:22

this think about what an AI API is an

88:26

AI API is an interface that you just

88:29

talk to and so this is a piece of

88:31

software in the future that has a really

88:34

simple API and that API is called human

88:37

and these packages incredible bodies of

88:40

software will be optimized and packaged

88:43

and we'll put it on a

88:45

website and you can download it you

88:47

could take it with you you could run it

88:50

in any Cloud you could run it in your

88:52

own data center you can run in

88:53

workstations it fits and all you have to

88:55

do is come to ai.nvidia.com we call it

88:59

Nvidia inference microservice but inside

89:01

the company we all call it

89:03

Nims
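In practice, calling a NIM looks like calling any web endpoint. A minimal sketch, assuming a NIM that exposes the OpenAI-compatible /v1/chat/completions route NVIDIA documents for its endpoints; the URL and model name here are illustrative:

```python
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",   # a locally run NIM
    json={
        "model": "meta/llama2-70b",                # assumed model id
        "messages": [{"role": "user", "content": "What is a CTL?"}],
        "max_tokens": 256,
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```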

89:11

okay just imagine you know

89:15

someday there's going to be one of

89:16

these chatbots and this chatbot is

89:19

going to just be in a Nim and you you'll

89:22

uh you'll assemble a whole bunch of Bots

89:25

and that's the way software is going to

89:26

be built someday how do we build

89:28

software in the future it is unlikely

89:31

that you'll write it from scratch or

89:33

write a whole bunch of python code or

89:34

anything like that it is very likely

89:37

that you assemble a team of AIs there's

89:40

probably going to be a super AI that you

89:43

use that takes the mission that you give

89:45

it and breaks it down into an execution

89:48

plan some of that execution plan could

89:51

be handed off to another Nim that Nim

89:53

would maybe uh understand

89:56

SAP the language of SAP is ABAP it might

90:00

understand service now and it go

90:02

retrieve some information from their

90:04

platforms it might then hand that result

90:07

to another Nim who that goes off and

90:09

does some calculation on it maybe it's

90:11

an optimization software a

90:14

combinatorial optimization algorithm

90:17

maybe it's uh you know some just some

90:19

basic

90:20

calculator maybe it's pandas to do some

90:24

numerical analysis on it and then it

90:26

comes back with its

90:27

answer and it gets combined with

90:30

everybody else's and it because it's

90:32

been presented with this is what the

90:34

right answer should look like it knows

90:36

what right answers

90:38

to produce and it presents it to you we

90:41

can get a report every single day at you

90:43

know top of the hour uh that has

90:44

something to do with a build plan or some

90:46

forecast or some customer alert or some

90:49

bugs database or whatever it happens to

90:51

be and we could assemble it using all

90:53

these NIMs and because these NIMs have

90:55

been packaged up and ready to work on

90:58

your systems so long as you have NVIDIA

91:00

gpus in your data center in the cloud

91:03

these NIMs will work together as a

91:05

team and do amazing things and so we

91:09

decided this is such a great idea we're

91:11

going to go do that and so Nvidia has

91:14

Nims running all over the company we

91:16

have chatbots being created all over the

91:18

place and one of the most important

91:20

chatbots of course is a chip designer

91:22

chatbot you might not be surprised we

91:25

care a lot about building chips and so

91:28

we want to build chatbots AI

91:31

co-pilots that are co-designers with our

91:34

engineers and so this is the way we did

91:36

it so we got ourselves a Llama 2

91:40

this is a 70b and it's you know packaged

91:43

up in a NIM and we asked it you know

91:46

what is a

91:48

CTL it turns out CTL is an internal uh

91:52

program and it has an internal

91:54

proprietary language but it thought the

91:56

CTL was a combinatorial timing logic and

91:59

so it describes you know conventional

92:01

knowledge of CTL but that's not very

92:03

useful to us and so we gave it a whole

92:06

bunch of new examples you know this is

92:09

no different than onboarding an

92:11

employee and we say you know thanks for

92:14

that answer it's completely wrong um and

92:17

and then we present to them

92:19

this is what a CTL is okay and so this

92:22

is what a CTL is at Nvidia

92:24

and the CTL as you can see you know CTL

92:27

stands for compute Trace Library which

92:29

makes sense you know we are tracing

92:31

compute Cycles all the time and it wrote

92:33

the program isn't that

92:42

amazing and so the productivity of our

92:44

chip designers can go up this is what

92:46

you can do with a Nim first thing you

92:47

can do with this customize it we have a

92:49

service called Nemo microservice that

92:52

helps you curate the data preparing the

92:54

data so that you could onboard

92:56

this AI you fine-tune it and

92:59

then you guardrail it you can even

93:01

evaluate the answer evaluate its

93:03

performance against um other other

93:05

examples and so that's called the Nemo

93:09

microservice now the thing that's

93:11

emerging here is this there are three

93:12

elements three pillars of what we're

93:14

doing the first pillar is of course

93:16

inventing the technology for um uh AI

93:19

models and running AI models and

93:21

packaging it up for you the second is to

93:24

create tools to help you modify it first

93:27

is having the AI technology second is to

93:29

help you modify it and third is

93:31

infrastructure for you to fine-tune it

93:33

and if you like deploy it you could

93:35

deploy it on our infrastructure called

93:37

DGX Cloud or you can deploy it on

93:39

Prem you can deploy it anywhere you like

93:41

once you develop it it's yours to take

93:44

anywhere and so we are

93:46

effectively an AI Foundry we will do for

93:50

you and the industry on AI what TSMC does

93:54

for us building chips so we go

93:56

to TSMC with our big ideas

93:59

they manufacture it and we take it with

94:00

us and so exactly the same thing here AI

94:03

Foundry and the three pillars are the

94:05

NIMS Nemo microservice and dgx Cloud the

94:10

other thing that you could teach the Nim

94:11

to do is to understand your proprietary

94:14

information remember inside our company

94:17

the vast majority of our data is not in

94:18

the cloud it's inside our company it's

94:21

been sitting there you know being used

94:23

all the time and and gosh it's it's

94:26

basically NVIDIA's intelligence we

94:28

would like to take that

94:30

data learn its meaning like we learned

94:34

the meaning of almost anything else that

94:35

we just talked about learn its meaning

94:37

and then reindex that knowledge into a

94:40

new type of database called a vector

94:43

database and so you essentially take

94:45

structured data or unstructured data you

94:47

learn its meaning you encode its meaning

94:50

so now this becomes an AI database and

94:53

it that AI database in the future once

94:56

you create it you can talk to it and so

94:58

let me give you an example of what you

94:59

could do suppose you

95:01

got a whole bunch of multi-modality

95:03

data and one good example of that is PDF

95:06

so you take the PDF you take all of your

95:09

PDFs all the all your favorite you know

95:11

the stuff that that is proprietary to

95:13

you critical to your company you can

95:15

encode it just as we encoded pixels of a

95:19

cat and it becomes the word cat we can

95:21

encode all of your PDFs and turn them

95:24

into vectors that are now stored inside

95:27

your vector database it becomes the

95:29

proprietary information of your company
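A toy sketch of that encode-then-retrieve idea, with a stand-in embedding function; a production pipeline would use a real embedding model (for example one served as a NIM) and a proper vector database:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in embedding: hash words into a fixed-size vector.
    v = np.zeros(128)
    for w in text.lower().split():
        v[hash(w) % 128] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

docs = ["bugs database nightly triage notes",
        "Q3 supply forecast",
        "CTL compute trace library spec"]
index = np.stack([embed(d) for d in docs])   # the "vector database"

query = embed("how many bugs were there last night?")
print("best match:", docs[int(np.argmax(index @ query))])
```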

95:31

and once you have that proprietary

95:32

information you could chat with it it's

95:35

a smart database so you just

95:37

chat with your data and how much more

95:40

enjoyable is that you know for

95:43

our software team you know they just

95:46

chat with the bugs database you know how

95:49

many bugs was there last night um are we

95:51

making any progress and then after

95:53

you're done talking to this bugs

95:56

database you need therapy and so so we

96:00

have another chat bot for

96:04

you you can do

96:15

it okay so we called this Nemo Retriever

96:19

and the reason for that is because

96:20

ultimately its job is to go retrieve

96:22

information as quickly as possible and

96:23

you just talk to it hey retrieve me this

96:25

information it goes and brings it back to

96:28

you do you mean this you go yeah perfect

96:31

okay and so we call it the Nemo

96:33

retriever well the Nemo service helps

96:35

you create all these things and we have

96:36

all these different NIMs we even

96:38

have Nims of digital humans I'm

96:40

Rachel your AI care

96:43

manager okay so so it's a really short

96:47

clip but there were so many videos to

96:49

show you I got so many other demos to

96:51

show you and so I I I had to cut this

96:53

one short but this is Diana she is a

96:56

digital human Nim and and uh you just

97:00

talked to her and she's connected in

97:02

this case to Hippocratic ai's large

97:04

language model for healthcare and it's

97:06

truly

97:09

amazing she is just super smart about

97:11

Healthcare things you know and so after

97:14

you're done after Dwight my VP of

97:17

software engineering talks to the

97:19

chatbot for bugs database then you come

97:21

over here and talk to Diana and

97:24

so Diana is completely animated

97:27

with AI and she's a digital

97:29

human uh there's so many companies that

97:32

would like to build they're sitting on

97:34

gold mines the the Enterprise IT

97:36

industry is sitting on a gold mine it's

97:39

a gold mine because they have so much

97:41

understanding of of uh the way work is

97:44

done they have all these amazing tools

97:46

that have been created over the years

97:48

and they're sitting on a lot of data if

97:50

they could take that gold mine and turn

97:53

them into co-pilots these co-pilots

97:55

could help us do things and so just

97:58

about every it franchise it platform in

98:01

the world that has valuable tools that

98:03

people use is sitting on a gold mine for

98:05

co-pilots and they would like to build

98:07

their own co-pilots and their own

98:09

chatbots and so we're announcing that

98:11

Nvidia AI Foundry is working with some

98:13

of the world's great companies SAP

98:15

generates 87% of the world's global

98:18

commerce basically the world runs on SAP

98:20

we run on SAP NVIDIA and SAP are

98:22

building SAP Joule copilots using

98:26

Nvidia Nemo and dgx cloud service now

98:29

85% of the world's Fortune

98:32

500 companies run their people and

98:34

customer service operations on service

98:36

now and they're using NVIDIA AI Foundry

98:40

to build ServiceNow Assist virtual

98:44

assistants Cohesity backs up the world's

98:47

data they're sitting on a gold mine of

98:48

data hundreds of exabytes of data over

98:51

10,000 companies Nvidia AI Foundry is

98:54

working with them helping them build

98:56

their Gaia generative AI agent snowflake

99:01

is a company that stores the world's uh

99:04

digital Warehouse in the cloud and

99:06

serves over three billion queries a day

99:10

for 10,000 Enterprise customers

99:13

snowflake is working with Nvidia AI

99:15

Foundry to build co-pilots with Nvidia

99:18

NeMo and NIMs NetApp nearly half of

99:22

the files in the world

99:24

are stored on prem on NetApp NVIDIA AI

99:27

Foundry is helping them build chat Bots

99:29

and co-pilots like those Vector

99:32

databases and retrievers with Nvidia

99:34

Nemo and

99:35

Nims and we have a great partnership

99:38

with Dell everybody who is

99:41

building these chatbots and generative

99:43

AI when you're ready to run it you're

99:46

going to need an AI

99:47

Factory and nobody is better at building

99:51

end-to-end systems of very large scale

99:54

for the Enterprise than Dell and so

99:56

anybody any company every company will

99:59

need to build AI factories and it turns

100:01

out that Michael is here he's happy to

100:03

take your

100:07

order ladies and gentlemen Michael

100:13

[Music]

100:14

Dell okay let's talk about the next wave of

100:18

Robotics the next wave of AI robotics

100:20

physical

100:21

AI so far all of the AI that we've

100:24

talked about is one

100:26

computer data comes into one computer

100:29

lots of the world's if you will

100:31

experience in digital text form the AI

100:35

imitates Us by reading a lot of the

100:38

language to predict the next words it's

100:41

imitating You by studying all of the

100:43

patterns and all the other previous

100:44

examples of course it has to understand

100:46

context and so on so forth but once it

100:48

understands the context it's essentially

100:50

imitating you we take all of the data we

100:53

put it into a system like dgx we

100:55

compress it into a large language model

100:58

trillions and

100:59

trillions of

101:01

tokens become billions of parameters

101:03

these billions of parameters become

101:05

your AI well in order for us to go to

101:08

the next wave of AI where the AI

101:11

understands the physical world we're

101:13

going to need three

101:14

computers the first computer is still

101:16

the same computer it's that AI computer

101:18

that now is going to be watching video

101:20

and maybe it's doing synthetic data

101:22

generation maybe there's a lot of human

101:25

examples just as we have human examples

101:28

in text form we're going to have human

101:29

examples in articulation form and the

101:33

AIs will watch

101:34

us understand what is

101:37

happening and try to adapt it for

101:40

themselves into the

101:41

context and because it can generalize

101:44

with these Foundation models maybe these

101:46

robots can also perform in the physical

101:49

world fairly generally so I just

101:51

described in very simple terms

101:54

essentially what just happened in large

101:56

language models except the ChatGPT

101:58

moment for robotics may be right around

102:00

the corner and so we've been building

102:02

the end-to-end systems for robotics for

102:04

some time I'm super super proud of the

102:06

work we have the AI system

102:09

DGX we have the lower system which is

102:12

called AGX for autonomous systems the

102:14

world's first robotics processor when we

102:16

first built this thing people are what

102:18

are you guys building it's a s so it's

102:20

one chip it's designed to be very low

102:22

power but is designed for high-speed

102:24

sensor processing and Ai and so if you

102:28

want to run Transformers in a car or you

102:31

want to run Transformers in in a you

102:33

know anything um that moves uh we have

102:36

the perfect computer for you it's called

102:38

the Jetson and so the dgx on top for

102:41

training the AI the Jetson is the

102:43

autonomous processor and in the middle

102:45

we need another computer whereas large

102:49

language models have the

102:51

benefit of you providing your examples

102:54

and then doing reinforcement learning

102:56

human

102:57

feedback what is the reinforcement

102:59

learning human feedback of a robot well

103:02

it's reinforcement learning physical

103:05

feedback that's how you align the robot

103:08

that's how you that's how the robot

103:09

knows that as it's learning these

103:11

articulation capabilities and

103:13

manipulation capabilities it's going to

103:15

adapt properly into the laws of physics

103:18

and so we need a simulation

103:21

engine that represents the world

103:24

digitally for the robot so that the

103:26

robot has a gym to go learn how to be a

103:28

robot we call

103:30

that virtual world Omniverse and the

103:34

computer that runs Omniverse is called

103:36

OVX and OVX the computer itself is

103:39

hosted in the Azure Cloud okay and so

103:43

basically we built these three things

103:44

these three systems on top of it we have

103:47

algorithms for every single one now I'm

103:50

going to show you one super example of

103:52

how AI

103:53

and Omniverse are going to work together

103:56

the example I'm going to show you is

103:58

kind of insane but it's going to be very

103:59

very close to tomorrow it's a robotics

104:02

building this robotics building is

104:05

called a warehouse inside the robotics

104:07

building are going to be some autonomous

104:09

systems some of the autonomous systems

104:11

are going to be called humans and some

104:13

of the autonomous systems are going to

104:15

be called forklifts and these autonomous

104:17

systems are going to interact with each

104:20

other of course autonomously and it's

104:22

going to be overlooked upon by this

104:24

Warehouse to keep everybody out of

104:26

Harm's Way the warehouse is essentially

104:28

an air traffic controller and whenever

104:30

it sees something happening it will

104:33

redirect traffic and give new waypoints

104:36

just new waypoints to the robots and the

104:38

people and they'll know exactly what to

104:40

do this warehouse this building you can

104:43

also talk to of course you could talk to

104:46

it hey you know SAP Center how are you

104:48

feeling today for example and so you

104:52

could ask the warehouse the

104:54

same questions basically the system I

104:56

just described will have Omniverse Cloud

104:59

that's hosting the virtual simulation

105:02

and AI running on DGX Cloud and all of

105:06

this is running in real time let's take

105:07

a

105:09

look the future of heavy Industries

105:12

starts as a digital twin the AI agents

105:15

helping robots workers and

105:17

infrastructure navigate unpredictable

105:19

events in complex industrial spaces will

105:22

be built and evaluated first in

105:24

sophisticated digital

105:26

twins this Omniverse digital twin of a

105:29

100,000-square-foot warehouse is operating as a

105:32

simulation environment that integrates

105:34

digital workers AMRs running the

105:37

NVIDIA Isaac Perceptor stack centralized

105:40

activity maps of the entire Warehouse

105:42

from 100 simulated ceiling mount cameras

105:44

using Nvidia metropolis and AMR route

105:47

planning with NVIDIA cuOpt software-in-

105:51

loop testing of AI agents in this

105:53

physically accurate simulated

105:55

environment enables us to evaluate and

105:57

refine how the system adapts to real

106:00

world

106:01

unpredictability here an incident occurs

106:04

along this amr's planned route blocking

106:06

its path as it moves to pick up a pallet

106:09

Nvidia Metropolis updates and sends a

106:11

real-time occupancy map to cuOpt where a

106:14

new optimal route is calculated the AMR

106:17

is enabled to see around corners and

106:19

improve its Mission efficiency with

106:21

generative-AI-powered Metropolis vision

106:24

Foundation models operators can even ask

106:26

questions using natural language the

106:29

visual model understands nuanced

106:31

activity and can offer immediate

106:33

insights to improve operations all of

106:35

the sensor data is created in simulation

106:38

and passed to the real-time AI running

106:40

as Nvidia inference microservices or

106:43

Nims and when the AI is ready to be

106:45

deployed in the physical twin the real

106:47

Warehouse we connect metropolis and

106:50

Isaac Nims to real sensors with the

106:53

ability for continuous Improvement of

106:55

both the digital twin and the AI

107:00

models is that

107:03

incredible
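The reroute step in that demo can be pictured with a toy occupancy grid: an incident marks cells blocked, and a new shortest path is computed. Real deployments use NVIDIA cuOpt; this grid BFS is only an illustration of the idea:

```python
from collections import deque

def reroute(grid, start, goal):
    """BFS shortest path on a 0/1 occupancy grid (1 = blocked)."""
    rows, cols = len(grid), len(grid[0])
    queue, seen = deque([(start, [start])]), {start}
    while queue:
        (r, c), path = queue.popleft()
        if (r, c) == goal:
            return path
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols \
                    and grid[nr][nc] == 0 and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append(((nr, nc), path + [(nr, nc)]))
    return None

grid = [[0, 0, 0],
        [1, 1, 0],    # an incident blocks part of the floor
        [0, 0, 0]]
print(reroute(grid, (0, 0), (2, 0)))   # detours around the blockage
```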

107:06

And so remember a future facility

107:11

Warehouse Factory building will be

107:13

software defined and so the software is

107:16

running how else would you test the

107:18

software so you you you test the

107:20

software to building the warehouse the

107:22

optimiz ation system in the digital twin

107:24

what about all the robots all of those

107:26

robots you were seeing just now they're

107:28

all running their own autonomous robotic

107:30

stack and so the way you integrate

107:32

software in the future cicd in the

107:34

future for robotic systems is with

107:36

digital twins we've made Omniverse a lot

107:40

easier to access we're going to create

107:42

basically Omniverse Cloud APIs four

107:45

simple APIs and a channel and you can

107:47

connect your application to it so this

107:49

is this is going to be as wonderfully

107:52

beautifully simple in the future that

107:54

Omniverse is going to be and with these

107:56

apis you're going to have these magical

107:59

digital twin capability we also have

108:02

turned Omniverse into an AI and

108:06

integrated it with the ability to chat

108:08

USD our language is

108:11

you know human and Omniverse's language

108:14

as it turns out is Universal Scene

108:16

description and so that language is

108:19

rather complex and so we've taught our

108:21

Omniverse uh that language and so you

108:23

can speak to it in English and it would

108:25

directly generate USD and it would talk

108:28

back in USD but Converse back to you in

108:30

English you could also look for

108:32

information in this world semantically

108:35

instead of the world being encoded

108:37

semantically in in language now it's

108:38

encoded semantically in scenes and so

108:41

you could ask it of of certain objects

108:44

or certain conditions and certain

108:45

scenarios and it can go and find that

108:47

scenario for you it also can collaborate

108:50

with you in generation you could design

108:51

some things in 3D

108:53

it could simulate some things in 3D or

108:55

you could use AI to generate something

108:56

in 3D
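For a feel of what "directly generating USD" means, here is the kind of scene description an English-to-USD agent might emit, sketched with the open-source OpenUSD Python bindings (pxr); the prim names and sizes are illustrative:

```python
from pxr import Usd, UsdGeom

stage = Usd.Stage.CreateNew("warehouse.usda")
world = UsdGeom.Xform.Define(stage, "/World")
crate = UsdGeom.Cube.Define(stage, "/World/Crate")
crate.GetSizeAttr().Set(2.0)                 # a 2-unit crate
stage.SetDefaultPrim(world.GetPrim())
stage.GetRootLayer().Save()                  # writes warehouse.usda
```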

108:59

Let's take a look at how this is all going to work we have a great

109:00

partnership with Siemens Siemens is the

109:03

world's largest industrial engineering

109:06

and operations platform you've seen now

109:09

so many different companies in the

109:10

industrial space heavy Industries is one

109:13

of the greatest final frontiers of it

109:16

and we finally now have the Necessary

109:19

Technology to go and make a real impact

109:21

Siemens is building the industrial

109:23

metaverse and today we're announcing

109:25

that Siemens is connecting their crown

109:27

jewel Xcelerator to NVIDIA Omniverse

109:31

let's take a

109:33

look Siemens technology transforms every

109:36

day for everyone Teamcenter our

109:38

leading product life cycle management

109:40

software from the Siemens Xcelerator

109:42

platform is used every day by our

109:44

customers to develop and deliver

109:47

products at scale now we are bringing

109:50

the real and the digital worlds even

109:52

closer by integrating NVIDIA AI and

109:55

Omniverse technologies into Teamcenter

109:57

X Omniverse apis enable data

110:01

interoperability and physics-based

110:02

rendering to Industrial scale design and

110:05

Manufacturing projects our customers

110:08

HD Hyundai market leader in sustainable ship

110:11

manufacturing builds ammonia- and

110:13

hydrogen-powered ships often comprising

110:16

over 7 million discrete

110:18

parts Omniverse APIs in Teamcenter X let

110:22

companies like HD Hyundai unify and

110:24

visualize these massive engineering data

110:27

sets interactively and integrate

110:29

generative AI to generate 3D objects or

110:33

hdri backgrounds to see their projects

110:36

in context the result an ultra-intuitive

110:39

photoreal physics-based digital twin that

110:42

eliminates waste and errors delivering

110:45

huge savings in cost and

110:47

time and we are building this for

110:50

collaboration whether across more Siemens

110:52

tools like Siemens NX or STAR-CCM+

110:57

or across teams working on their

110:59

favorite devices in the same scene

111:02

together and this is just the beginning

111:05

working with Nvidia we will bring

111:07

accelerated Computing generative Ai and

111:10

Omniverse integration across the Siemens

111:13

Xcelerator

111:15

[Music]

111:21

portfolio the professional

111:25

voice actor happens to

111:28

be a good friend of mine Roland Busch who

111:30

happens to be the CEO of

111:34

[Applause]

111:39

Siemens once you get Omniverse connected

111:44

into your workflow your

111:46

ecosystem from the beginning of your

111:49

design to

111:51

engineering to manufacturing planning

111:54

all the way to digital twin

111:56

operations once you connect everything

111:58

together it's insane how much

112:01

productivity you can get and it's just

112:03

really really wonderful all of a sudden

112:04

everybody's operating on the same ground

112:06

truth you don't have to exchange data

112:09

and convert data make mistakes everybody

112:12

is working on the same ground truth from

112:15

the design Department to the art

112:16

Department the architecture Department

112:18

all the way to the engineering and even

112:19

the marketing department let's take a

112:21

look how Nissan has integrated Omniverse

112:26

into their workflow and it's all because

112:28

it's connected by all these wonderful

112:30

tools and these developers that we're

112:32

working with take a look

112:34

[Music]

112:51


112:52

[Music]

112:59

[Music]

113:10


113:13

[Music]

113:21


113:23

[Music]

114:12

That was not an animation. That was Omniverse. Today, we're announcing that Omniverse Cloud streams to the Vision Pro. And it is very, very strange that you walk around virtual doors when I was getting out of that car, and everybody does it. It is really, really quite amazing. Vision Pro, connected to Omniverse, portals you into Omniverse. And because all of these CAD tools and all these design tools are now integrated and connected to Omniverse, you can have this type of workflow. Really incredible. Let's talk about robotics.

114:59

Everything that moves will be robotic, there's no question about that. It's safer, it's more convenient. And one of the largest industries is going to be automotive. We build the robotics stack from top to bottom, as I was mentioning, from the computer system, and in the case of self-driving cars, including the self-driving application. At the end of this year, or I guess the beginning of next year, we will be shipping in Mercedes, and then shortly after that, JLR. These autonomous robotic systems are software-defined. They take a lot of work: computer vision, obviously, artificial intelligence, control and planning, all kinds of very complicated technology that takes years to refine. We're building the entire stack. However, we open up our entire stack to all of the automotive industry. This is just the way we work, the way we work in every single industry: we try to build as much of it as we can so that we understand it, but then we open it up so everybody can access it. Whether you would like to buy just our computer, which is the world's only fully functional-safe, ASIL-D system that can run AI, or this functional-safe, ASIL-D-quality computer plus the operating system on top, or of course our data centers, which are in basically every AV company in the world, however you would like to enjoy it, we're delighted by it. Today we're announcing that BYD, the world's largest EV company, is adopting our next generation. It's called Thor. Thor is designed for Transformer engines. Thor, our next-generation AV computer, will be used by BYD.

116:46

You probably don't know this fact: we have over a million robotics developers. We created Jetson, this robotics computer. We're so proud of it. The amount of software that goes on top of it is insane. But the reason we can do it at all is because it's 100% CUDA-compatible. Everything that we do, everything that we do in our company, is in service of our developers. And by being able to maintain this rich ecosystem and keep it compatible with everything that you access from us, we can bring all of that incredible capability to this little tiny computer we call Jetson, a robotics computer.

117:22

We also today are announcing this incredibly advanced new SDK. We call it Isaac Perceptor. Most of the robots today are pre-programmed: they're either following rails on the ground, digital rails, or they'd be following AprilTags. But in the future, they're going to have perception. And the reason you want that is so that you can easily program them: you say, go from point A to point B, and the robot will figure out a way to navigate there. By programming only waypoints, the entire route can be adaptive, the entire environment can be reprogrammed, just as I showed you at the very beginning with the warehouse. You can't do that with pre-programmed AGVs: if those boxes fall down, they all just gum up and wait there for somebody to come clear them. So now, with Isaac Perceptor, we have incredible state-of-the-art visual odometry, 3D reconstruction, and, in addition to 3D reconstruction, depth perception. The reason for that is so that you can have two modalities to keep an eye on what's happening in the world. Isaac Perceptor.
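Isaac Perceptor itself is an SDK the keynote does not show as code. As a hedged illustration of the behavior described, programming waypoints and replanning when the world changes rather than gumming up like a rail-following AGV, here is a toy grid planner; it is entirely hypothetical and unrelated to the real API.

```python
# Toy waypoint navigation with adaptive replanning: a BFS planner over a 2D
# occupancy grid. When cells become occupied ("boxes fall down"), the robot
# replans instead of waiting for somebody to clear the route.
from collections import deque

def plan(grid, start, goal):
    """Breadth-first search; returns a list of cells from start to goal, or None."""
    rows, cols = len(grid), len(grid[0])
    prev = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 and nxt not in prev:
                prev[nxt] = cell
                queue.append(nxt)
    return None

grid = [[0] * 5 for _ in range(5)]
route = plan(grid, (0, 0), (4, 4))         # initial route between two waypoints

grid[2][1] = grid[2][2] = grid[2][3] = 1   # obstacles appear mid-route
route = plan(grid, (0, 0), (4, 4))         # replan around them
print(route)
```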

118:35

The most used robot today is the manipulator, manufacturing arms, and they are also pre-programmed. The computer vision algorithms, the AI algorithms, the control and path-planning algorithms that are geometry-aware: incredibly computationally intensive. We have made these CUDA-accelerated, so we have the world's first CUDA-accelerated motion planner that is geometry-aware. You put something in front of it, it comes up with a new plan and articulates around it. It has excellent perception for pose estimation of a 3D object: not just its pose in 2D, but its pose in 3D. So it has to imagine what's around it and how best to grasp it. So the foundation pose, the grasp foundation, and the articulation algorithms are now available. We call it Isaac Manipulator, and it also just runs on NVIDIA's computers.
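Why does geometry-aware planning benefit from CUDA acceleration? Scoring many candidate arm configurations against obstacles is data-parallel. A minimal sketch, with NumPy standing in for the GPU kernels and a toy forward-kinematics function; this is not NVIDIA's planner API.

```python
# Sketch of the data-parallel core of geometry-aware motion planning:
# evaluate thousands of candidate configurations against obstacles at once.
import numpy as np

rng = np.random.default_rng(0)
candidates = rng.uniform(-np.pi, np.pi, size=(4096, 6))    # 4096 random 6-DoF joint configurations
obstacles = np.array([[0.4, 0.0, 0.3], [0.1, 0.2, 0.5]])   # sphere obstacle centers (meters), hypothetical
radius = 0.15

def end_effector(q):
    # Toy forward kinematics; a real planner evaluates the full arm geometry.
    return np.stack([np.cos(q[:, 0]), np.sin(q[:, 1]), 0.5 + 0.1 * q[:, 2]], axis=1)

pos = end_effector(candidates)                              # (4096, 3) in one vectorized pass
d = np.linalg.norm(pos[:, None, :] - obstacles[None], axis=2)
collision_free = (d > radius).all(axis=1)                   # every candidate vs. every obstacle
print(f"{collision_free.sum()} of {len(candidates)} candidates are collision-free")
```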

119:35

We are starting to do some really great work on the next generation of robotics. The next generation of robotics will likely be humanoid robotics. We now have the necessary technology, and as I was describing earlier, the necessary technology to imagine generalized humanoid robotics. In a way, humanoid robotics is likely easier, and the reason for that is because we have a lot more imitation training data that we can provide the robots, because we are constructed in a very similar way. It is very likely that humanoid robots will be much more useful in our world, because we created the world to be something that we can interoperate in and work well in. The way that we set up our workstations, and manufacturing, and logistics: they were designed for humans, they were designed for people. And so these humanoid robots will likely be much more productive to deploy.
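The "imitation training data" point reduces, at its simplest, to behavior cloning: fit a policy to (observation, action) pairs from human demonstrations. A minimal sketch with a linear policy and synthetic data; real systems use large video-scale models, not least squares.

```python
# Behavior cloning in miniature: fit a policy W to demonstration pairs by
# minimizing ||obs @ W - expert_actions||^2. All data here is synthetic.
import numpy as np

rng = np.random.default_rng(1)
obs = rng.normal(size=(1000, 12))            # demo observations (e.g., pose features)
expert = obs @ rng.normal(size=(12, 4))      # synthetic "human" actions, 4-DoF

W, *_ = np.linalg.lstsq(obs, expert, rcond=None)   # least-squares cloning
pred = obs @ W
print("imitation MSE:", float(np.mean((pred - expert) ** 2)))
```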

120:28

While we're creating, just like we're doing with the others, the entire stack, starting from the top: a foundation model that learns from watching video, human examples. It could be in video form, it could be in virtual-reality form. We then created a gym for it, called Isaac Reinforcement Learning Gym, which allows the humanoid robot to learn how to adapt to the physical world. And then an incredible computer, the same computer that's going to go into a robotic car. This computer will run inside a humanoid robot, called Thor. It's designed for Transformer engines. We've combined several of these into one video. This is something that you're going to really love. Take a look.
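The "gym" in question is, in spirit, the standard reinforcement-learning loop. A minimal sketch using the open-source Gymnasium API as a stand-in for Isaac's GPU-parallel humanoid environments; the policy here is random, where a real setup would plug in a learned one such as PPO.

```python
# The basic RL loop behind a robot "gym": act, observe, collect reward,
# reset on failure. Gymnasium's CartPole stands in for a humanoid simulator.
import gymnasium as gym

env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)
total_reward = 0.0
for _ in range(500):
    action = env.action_space.sample()               # placeholder policy
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    if terminated or truncated:                      # it fell: reset and keep learning
        obs, info = env.reset()
env.close()
print(f"episode return with a random policy: {total_reward}")
```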

121:17

It's not enough for humans to imagine.

[Applause]

We have to invent, and explore, and push beyond what's been done.

[Music]

We create, smarter and faster. We push it to fail, so it can learn. We teach it, then help it teach itself. We broaden its understanding, to take on new challenges with absolute precision, and succeed. We make it perceive, and move, and even reason, so it can share our world with us.

[Music]

This is where inspiration leads us: the next frontier. This is NVIDIA Project GR00T, a general-purpose foundation model for humanoid robot learning.

123:09

The GR00T model takes multimodal instructions and past interactions as input, and produces the next action for the robot to execute. We developed Isaac Lab, a robot-learning application, to train GR00T on Omniverse Isaac Sim, and we scale out with OSMO, a new compute orchestration service that coordinates workflows across DGX systems for training and OVX systems for simulation. With these tools, we can train GR00T in physically based simulation and transfer zero-shot to the real world. The GR00T model will enable a robot to learn from a handful of human demonstrations, so it can help with everyday tasks, and emulate human movement just by observing us. This is made possible with NVIDIA's technologies that can understand humans from videos, train models in simulation, and ultimately deploy them directly to physical robots. Connecting GR00T to a large language model even allows it to generate motions by following natural language instructions.
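GR00T's interface is not shown in the keynote; what is described is its contract: multimodal instruction plus past interactions in, next action out. A hedged sketch of that control loop, with a stub in place of the real model, only the loop shape is the point.

```python
# The input/output contract described for GR00T, as a control loop:
# condition on (instruction, latest frame, history), emit the next action.
from dataclasses import dataclass, field

@dataclass
class Observation:
    image: bytes       # latest camera frame
    instruction: str   # natural-language command

@dataclass
class PolicyState:
    history: list = field(default_factory=list)   # past (observation, action) pairs

class StubModel:
    def predict(self, obs, history):
        # Placeholder: a real model maps (vision, language, history) to joint targets.
        return {"joint_targets": [0.0] * 28}       # 28 is an arbitrary stand-in DoF count

def next_action(model, obs: Observation, state: PolicyState) -> dict:
    """One control step: predict, then append to the interaction history."""
    action = model.predict(obs, state.history)
    state.history.append((obs, action))
    return action

state = PolicyState()
act = next_action(StubModel(), Observation(image=b"", instruction="give me a high five"), state)
print(act)
```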

124:19

"Hi GR00T, can you give me a high five?" "Sure thing. Let's high five!" "Can you give us some cool moves?" "Sure, check this out!"

[Music]

124:34

All this incredible intelligence is powered by the new Jetson Thor robotics chips, designed for GR00T, built for the future. With Isaac Lab, OSMO, and GR00T, we're providing the building blocks for the next generation of AI-powered robotics.

[Music]

[Applause]

125:03

[Music]

...about the same size.

[Applause]

The soul of NVIDIA, the intersection of computer graphics, physics, artificial intelligence: it all came to bear at this moment. The name of that project? General Robotics 03. I know, super good. Super good. Well, I think we have some special guests. Do we?

[Music]

125:56

Hey guys! So I understand you guys are powered by Jetson. They're powered by Jetson, little Jetson robotics computers inside. They learned to walk in Isaac Sim. Ladies and gentlemen, this is Orange, and this is the famous Green. They are the BDX robots of Disney. Amazing Disney Research. Come on, you guys, let's wrap up. Let's go. Five things. Where are you going? I sit right here. Don't be afraid. Come here, Green. Hurry up.

[Music]

What are you saying? No, it's not time to eat. It's not time to eat! I'll give you a snack in a moment. Let me finish up real quick. Come on, Green, hurry up. Stop wasting time.

127:16

Five things. First: a new industrial revolution. Every data center should be accelerated. A trillion dollars' worth of installed data centers will become modernized over the next several years. And because of the computational capability we brought to bear, a new way of doing software has emerged: generative AI, which is going to create new infrastructure dedicated to doing one thing and one thing only, not multi-user data centers, but AI generators. These AI generators will create incredibly valuable software. A new industrial revolution.

Second: the computer of this revolution, the computer of this generation, generative AI, trillion parameters: Blackwell. Insane amounts of computers and computing.

Third... I'm trying to concentrate. Good job. Third: a new computer creates new types of software, and new types of software should be distributed in a new way, so that they can, on the one hand, be an endpoint in the cloud, easy to use, but still allow you to take them with you, because it is your intelligence. Your intelligence should be packaged up in a way that allows you to take it with you. We call them NIMs.

Fourth: these NIMs are going to help you create a new type of application for the future, not one that you wrote completely from scratch, but one where you integrate them, like teams, to create these applications. We have a fantastic capability, between NIMs, the AI technology, the tools, NeMo, and the infrastructure, DGX Cloud, in our AI Foundry, to help you create proprietary applications, proprietary chatbots.

And then lastly: everything that moves in the future will be robotic. You're not going to be the only one. And these robotic systems, whether they are humanoids, AMRs, self-driving cars, forklifts, manipulator arms, they will all need one thing. Giant stadiums, warehouses, factories: there are going to be factories that are robotic, orchestrating factories, manufacturing lines that are robotic, building cars that are robotic. These systems all need one thing: they need a platform, a digital platform, a digital twin platform, and we call that Omniverse, the operating system of the robotics world. These are the five things that we talked about today.
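The NIM packaging in the third and fourth points is concrete enough to sketch: a NIM is described as a containerized inference microservice you can run where you like. Assuming a hypothetical local deployment exposing an OpenAI-compatible chat endpoint (the URL and model name below are assumptions, not confirmed by the keynote), a client is just an HTTP call:

```python
# Hedged sketch of calling a locally deployed NIM-style microservice.
# Endpoint URL and model name are hypothetical placeholders.
import json
import urllib.request

payload = {
    "model": "meta/llama3-8b-instruct",   # hypothetical model name
    "messages": [{"role": "user", "content": "Summarize today's five announcements."}],
}
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",   # hypothetical local endpoint
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

The point of the packaging is exactly this portability: the same request works against the microservice whether it runs in a cloud endpoint or on your own machine.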

129:43

What does NVIDIA look like? What does NVIDIA look like when we talk about GPUs? There's a very different image in my mind when people ask me about GPUs. First, I see a bunch of software stacks and things like that. And second, I see this. This is what we announced to you today. This is Blackwell. This is the platform. Amazing, amazing processors, NVLink switches, networking systems, and the system design is a miracle. This is Blackwell, and this, to me, is what a GPU looks like in my mind.

130:29

Listen, Orange, Green: I think we have one more treat for everybody. What do you think? Should we? Okay, we have one more thing to show you. Roll it!

[Music]

131:02

[Music]

132:56

Thank you! Thank you. Have a great, have a great GTC. Thank you all for coming. Thank you!

[Music]
