GTC March 2024 Keynote with NVIDIA CEO Jensen Huang

NVIDIA
18 Mar 2024 · 2:03:05

Summary

TL;DR: Nvidia's GTC conference showcased the company's journey in AI and accelerated computing and its transformative impact on various industries. The introduction of Blackwell, a powerful GPU platform, and the concept of AI factories signal a new industrial revolution, while the focus on generative AI, digital twins with Omniverse, and robotics underscores Nvidia's commitment to advancing technology for the betterment of society and industry.

Takeaways

  • 🚀 Nvidia is leading a new industrial revolution with accelerated computing, transforming data centers and enabling generative AI.
  • 🌟 The introduction of Blackwell, an advanced AI platform, marks a significant leap in computational capabilities, featuring 208 billion transistors and a 10 TB/s chip-to-chip link that lets its two dies behave as a single GPU.
  • 🔄 Nvidia's journey from 1993 highlights key innovations like CUDA in 2006, AI and CUDA's first contact in 2012, the invention of the world's first AI supercomputer DGX-1 in 2016, and the rise of generative AI in 2023.
  • 🤖 The future of software development involves AI 'factories' that generate valuable software, with a focus on generative AI creating new categories of software and industries.
  • 🧠 Generative AI represents a new industry, producing software that never existed before, akin to the early industrial revolution with electricity.
  • 🌐 Nvidia's Omniverse is a critical component for the future of robotics, acting as a digital twin platform that integrates AI, physics, and engineering to simulate and optimize operations.
  • 🔧 Nvidia's AI Foundry aims to democratize AI technology, providing tools like NeMo and DGX Cloud to help companies build, modify, and deploy AI models as microservices (NIMs).
  • 🏭 The next wave of robotics will be software-defined, with AI and robotics working in tandem to create more productive and adaptable systems in industries like manufacturing and logistics.
  • 🚗 Nvidia's commitment to the automotive industry includes a complete autonomous vehicle stack, with Thor, a processor designed for Transformer engines, set to power future self-driving cars.
  • 🤔 Nvidia's vision for AI in healthcare involves leveraging generative AI for drug discovery, with platforms like BioNeMo enabling virtual screening for new medicines and accelerating the development process.

Q & A

  • What is the significance of the new Nvidia Blackwell GPU in the context of AI and generative computing?

    -The Nvidia Blackwell GPU is significant because it represents a leap forward in generative AI capabilities. It is designed to handle the computational demands of large language models and generative AI, offering higher performance and energy efficiency compared to its predecessors. With its advanced features like the new Transformer engine, NVLink Switch, and secure AI capabilities, Blackwell enables the creation and deployment of more sophisticated AI models, which can understand and generate content in ways that were not possible before.
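
To make the Transformer engine's dynamic rescaling concrete, here is a minimal NumPy sketch of per-tensor scaling into an FP8-like range. It is an illustration only: the integer rounding stands in for FP8's coarse mantissa, and the real engine works per pipeline stage, in hardware, tracking statistics over the course of training.

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3

def quantize(tensor):
    """Pick a per-tensor scale so the dynamic range fits FP8, then round.
    The integer rounding here is a stand-in for FP8's coarse mantissa."""
    scale = FP8_E4M3_MAX / np.abs(tensor).max()
    return np.round(tensor * scale), scale

def dequantize(q, scale):
    return q / scale

x = np.random.randn(4, 4).astype(np.float32)
q, s = quantize(x)
print("max abs error after round-trip:", np.abs(x - dequantize(q, s)).max())
```

The point is that a well-chosen scale preserves most of a tensor's information while halving or quartering the bits that must be stored, moved, and multiplied.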

  • How does the Nvidia AI model, CorrDiff, contribute to weather forecasting?

    -Nvidia's CorrDiff is a generative AI model that enhances weather forecasting by using high-resolution, radar-assimilated weather forecasts and reanalysis data. It allows for super-resolved forecasting of extreme weather events, such as storms, by increasing the resolution from 25 km to 2 km. This high-resolution forecasting provides a clearer picture of the potential impacts of severe weather, which can help in minimizing loss of life and property damage.
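
As a back-of-envelope check of what that resolution jump means: refining a 25 km grid to 2 km is a 12.5x refinement per horizontal dimension, so each 2-D field carries roughly 156x more grid cells — which is why generating the fine detail with a trained model is attractive compared to simulating it directly.

```python
# Back-of-envelope grid arithmetic for 25 km -> 2 km super-resolution.
coarse_km, fine_km = 25.0, 2.0
linear = coarse_km / fine_km     # 12.5x finer along each horizontal axis
cells = linear ** 2              # ~156x more cells per 2-D field
print(f"{linear:.1f}x linear refinement, {cells:.0f}x more grid cells")
```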

  • What is the role of the Nvidia Jetson Thor in the field of robotics?

    -The Nvidia Jetson Thor is a robotics computer designed for the next generation of autonomous systems. It is built for Transformer engines and is optimized for running AI models that require high computational power. The Jetson Thor is part of Nvidia's end-to-end system for robotics, which includes the AI system (DGX) for training AI, the autonomous processor (AGX) for low-power, high-speed sensor processing, and the simulation engine (Omniverse) for providing a digital representation of the physical world for robots to learn and adapt.

  • How does the concept of 'generative AI' differ from traditional AI?

    -Generative AI is a form of artificial intelligence that is capable of creating new content or data that did not exist before. Unlike traditional AI, which often focuses on analyzing and recognizing patterns in existing data, generative AI can produce new outputs such as text, images, videos, and even code. This capability is particularly useful in creating new software, simulating environments for training AI models, and generating creative content.

  • What is the significance of the Nvidia inference microservice (Nim) in the context of AI software distribution?

    -The Nvidia inference microservice (NIM) represents a new way of distributing and operating AI software. A NIM is a pre-trained model that is packaged and optimized to run across Nvidia's extensive install base. It includes all necessary dependencies and is optimized for different computing environments, from single GPUs to multi-node GPU setups. NIMs provide a simple API interface, making them easy to integrate into various workflows and applications, and can be deployed in the cloud, data centers, or workstations.
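
To illustrate the "simple API interface" claim, a deployed NIM is called like any other web service. The host, port, and model name below are illustrative assumptions for a hypothetical local deployment, not official values:

```python
import requests

# Hypothetical local NIM deployment; the URL and model id are assumptions.
url = "http://localhost:8000/v1/chat/completions"
payload = {
    "model": "meta/llama3-8b-instruct",   # example model id, not authoritative
    "messages": [{"role": "user", "content": "What is a digital twin?"}],
    "max_tokens": 128,
}
resp = requests.post(url, json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```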

  • How does the Nvidia Omniverse platform contribute to the development of digital twins?

    -Nvidia Omniverse is a platform that enables the creation of digital twins—virtual replicas of physical entities. It provides a physically accurate simulation environment that integrates with real-time AI and sensor data. This allows for the testing, evaluation, and refinement of AI agents and systems in a virtual setting before they are deployed in the real world. Omniverse's digital twins can be used to optimize operations, improve efficiency, and predict potential issues in various industries, from manufacturing to urban planning.
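
Omniverse is built around OpenUSD scene description, so the smallest possible "digital twin" is a USD stage whose transforms are updated from live data. A minimal sketch using the open-source usd-core bindings (the warehouse layout is invented for illustration, and this is plain OpenUSD rather than anything Omniverse-specific):

```python
from pxr import Usd, UsdGeom, Gf  # OpenUSD bindings ("usd-core" on pip)

# Author a tiny warehouse scene; a simulator or sensor feed would update
# the bin's pose every frame to keep the twin in sync with reality.
stage = Usd.Stage.CreateNew("warehouse_twin.usda")
UsdGeom.Xform.Define(stage, "/Warehouse")
bin_geom = UsdGeom.Cube.Define(stage, "/Warehouse/Bin")
bin_geom.AddTranslateOp().Set(Gf.Vec3d(2.0, 0.0, 0.5))  # current bin position
stage.GetRootLayer().Save()
```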

  • What is the role of the Nvidia DGX system in AI model training?

    -The Nvidia DGX system is designed for training advanced AI models. It is a powerful AI computer system that provides the necessary computational capabilities to handle the complex tasks associated with training large neural networks. DGX systems are equipped with multiple GPUs connected through high-speed networking, allowing them to process vast amounts of data and perform intensive computations required for training state-of-the-art AI models.
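
The "multiple GPUs connected through high-speed networking" pattern is what training frameworks expose as data parallelism. A minimal single-node PyTorch sketch of the idea; the one-layer model and synthetic batches are placeholders for a real network and dataset:

```python
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Launch with: torchrun --nproc_per_node=8 train.py  (one process per GPU)
dist.init_process_group("nccl")
rank = dist.get_rank()           # assumes a single node, so rank == GPU index
torch.cuda.set_device(rank)

model = DDP(torch.nn.Linear(1024, 1024).cuda(rank), device_ids=[rank])
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

for step in range(10):                       # toy loop over synthetic batches
    x = torch.randn(32, 1024, device=rank)
    loss = model(x).square().mean()          # placeholder objective
    opt.zero_grad()
    loss.backward()                          # gradients all-reduced across GPUs
    opt.step()

dist.destroy_process_group()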

  • How does the Isaac Sim platform from Nvidia enable robotics development?

    -Isaac Sim is a robotics simulation platform from Nvidia that allows developers to create and test AI agents in a virtual environment. It provides a physically accurate digital twin of real-world spaces where robots can be trained and evaluated. This platform is essential for developing autonomous systems as it enables developers to simulate complex scenarios and refine the robots' behavior and responses without the need for physical prototypes, thus reducing development time and costs.
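
One reason simulated training reduces development time is domain randomization: physics parameters are varied every episode so a behavior cannot overfit a single set of conditions. The loop below is a framework-agnostic toy (a 1-D box push with invented dynamics), not Isaac Sim's API:

```python
import random

def rollout(friction, mass, policy_gain):
    """Toy 1-D 'push a box to the target' episode under randomized physics."""
    pos, vel, dt = 0.0, 0.0, 0.02
    for _ in range(200):
        force = policy_gain * (1.0 - pos)              # naive proportional policy
        vel += (force - friction * vel) / mass * dt    # invented toy dynamics
        pos += vel * dt
    return abs(1.0 - pos)                              # final distance to target

errors = []
for episode in range(100):
    friction = random.uniform(0.1, 1.0)    # randomized every episode
    mass = random.uniform(0.5, 2.0)
    errors.append(rollout(friction, mass, policy_gain=4.0))
print("mean error over randomized physics:", sum(errors) / len(errors))
```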

  • What is the significance of the partnership between Nvidia and companies like AWS, Google, and Microsoft in the context of AI and accelerated computing?

    -The partnerships between Nvidia and major cloud service providers like AWS, Google, and Microsoft are significant as they help in the widespread adoption and integration of accelerated computing and AI technologies. These collaborations focus on optimizing AI workflows, accelerating data processing, and providing access to powerful computing resources. They also involve the integration of Nvidia's AI and Omniverse technologies into the cloud services and platforms offered by these companies, enabling users to leverage these advanced tools for various applications, from healthcare to weather forecasting and beyond.

  • What are the key components of Nvidia's strategy for the future of AI and robotics?

    -Nvidia's strategy for the future of AI and robotics involves several key components: developing advanced AI models and making them accessible through inference microservices (NIMs), providing tools like NeMo for data preparation and model fine-tuning, and offering infrastructure like DGX Cloud for deploying AI models. Additionally, Nvidia is focused on creating a digital platform called Omniverse for building digital twins and developing robotics systems, as well as pushing the boundaries of AI with the development of generative AI and the creation of new AI-powered robots.

Outlines

00:00

🎶 Visionary AI and its Impact on Society

The paragraph introduces AI as a visionary force transforming society: sharpening our understanding of extreme weather events, guiding the blind, and giving voice to those who cannot speak. It highlights AI harnessing gravity to store renewable energy, training robots to assist and watch for danger, providing new cures and patient care, and generating virtual scenarios that let us safely explore the real world and understand every decision. The narration concludes by revealing that the speaker is itself an AI, brought to life by Nvidia's deep learning and brilliant minds, and welcomes the audience to a developers' conference where the future of AI and its applications will be discussed.

05:01

🌐 Diverse Applications of AI Across Industries

This paragraph delves into the widespread application of AI across various industries, emphasizing its role in solving complex problems that traditional computing cannot. It mentions the presence of companies from non-IT sectors like life sciences, healthcare, genomics, transportation, retail, and manufacturing at the conference. The speaker expresses amazement at the diversity of industries represented and the potential for AI to transform these sectors. The narrative then takes a historical view, tracing Nvidia's journey from its founding in 1993 through significant milestones such as the development of CUDA in 2006, the first contact between AI and CUDA in 2012, the invention of the world's first AI supercomputer in 2016, and the emergence of generative AI in 2023. The paragraph highlights the creation of new software categories and the establishment of AI as a new industry, transforming the way software is produced and used.

10:04

🚀 The Future of Computing and AI Factories

The speaker discusses the future of computing, emphasizing the need for a new approach beyond general-purpose computing to sustainably meet increasing computational demands. The concept of AI factories is introduced, where AI is generated in a controlled environment, similar to how electricity was once a valuable new commodity. The speaker then presents Nvidia's role in this new industry, showcasing the intersection of computer graphics, physics, and AI within the Omniverse platform. The paragraph also touches on the importance of simulation tools in product creation, the desire to simulate entire products (digital twins), and the need for accelerated computing to achieve this. The speaker announces partnerships with major companies to accelerate ecosystems and infrastructure for generative AI, highlighting the potential for AI co-pilots in chip design and the integration of digital twin platforms with Omniverse.

15:04

🤖 Advancements in AI and Robotics

The paragraph discusses the rapid advancements in AI and robotics, particularly the development of larger models trained with multimodality data. The speaker talks about the need for even larger models grounded in physics and the use of synthetic data generation and reinforcement learning to expand the capabilities of AI. The introduction of the Blackwell GPU is announced, a significant leap in computing power named after the mathematician David Blackwell. The paragraph details the technical specifications and innovations of the Blackwell platform, including its memory coherence, transformer engines, and secure AI capabilities. The speaker also touches on the importance of decompression and data movement in computing and the potential for Blackwell to revolutionize AI training and inference.
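
Behind the keynote's call for bigger GPUs is simple arithmetic: roughly 1.8 trillion parameters times a few trillion training tokens. A quick check of the numbers quoted on stage, using the common rule of thumb that training cost is about 6 × parameters × tokens (the factor of 6 is an outside assumption; the keynote quotes only the resulting magnitude):

```python
params = 1.8e12                        # ~1.8 trillion parameters
tokens = 3.0e12                        # "several trillion" training tokens
flops = 6 * params * tokens            # ~3.2e25 floating-point operations
seconds = flops / 1e15                 # on a single 1-petaflop/s GPU
years = seconds / (3600 * 24 * 365)
print(f"{flops:.1e} FLOPs -> about {years:,.0f} years on one petaflop/s GPU")
```

This lines up with the "30 billion quadrillion" operations and "approximately 1,000 years on a petaflop GPU" figures in the talk.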

20:06

🌟 The Impact of Generative AI on Content Creation

The speaker explores the impact of generative AI on content creation, predicting a shift from retrieved content to AI-generated content that is personalized and context-aware. This new era of generative AI is described as a fundamentally different approach to computing, requiring new types of processors and a focus on content token generation. The NVLink Switch is introduced as a component that enables every GPU to communicate with every other GPU at full speed, suggesting a future where interconnected AI systems behave as one giant GPU. The paragraph concludes with a discussion of the importance of throughput and interactive rates in AI systems, and how these factors influence cost, energy consumption, and quality of service.
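
The full-speed, all-to-all communication matters because distributed training leans on collectives such as all-reduce, where every GPU must finish holding the sum of every GPU's gradients. A pure-NumPy simulation of the classic ring all-reduce (reduce-scatter, then all-gather) sketches the traffic pattern such a fabric accelerates; the exchanges are only modeled here, not actually transported:

```python
import numpy as np

n = 4                                         # simulated GPUs in a ring
grads = [np.random.randn(8) for _ in range(n)]
expected = sum(grads)                         # what all-reduce must produce

chunks = [np.array_split(g.copy(), n) for g in grads]   # n chunks per GPU

# Reduce-scatter: after n-1 rounds, GPU i holds the fully summed chunk (i+1) % n.
for step in range(n - 1):
    sends = [(i, (i - step) % n, chunks[i][(i - step) % n].copy()) for i in range(n)]
    for src, idx, data in sends:
        chunks[(src + 1) % n][idx] += data    # right neighbor accumulates

# All-gather: circulate the summed chunks until every GPU has all of them.
for step in range(n - 1):
    sends = [(i, (i + 1 - step) % n, chunks[i][(i + 1 - step) % n].copy()) for i in range(n)]
    for src, idx, data in sends:
        chunks[(src + 1) % n][idx] = data     # right neighbor overwrites

assert all(np.allclose(np.concatenate(c), expected) for c in chunks)
print("every simulated GPU now holds the summed gradient")
```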

25:07

🔋 Powering the Future of AI with Blackwell

The speaker discusses the capabilities of the Blackwell GPU in powering the future of AI, emphasizing its significant increase in inference capability compared to its predecessor, Hopper. The paragraph highlights the energy efficiency and reduced power consumption of Blackwell, which allows for the training of large AI models like GPT in a more sustainable manner. The speaker also talks about the excitement around Blackwell and its adoption by various AI companies and cloud service providers. The paragraph concludes with a vision of data centers as AI factories, generating intelligence rather than electricity, and the readiness of the industry for the launch of Blackwell.

30:11

🌍 Digital Twins and the Future of Manufacturing

The speaker talks about the use of digital twins in manufacturing, explaining how they can be used to build complex systems like computers correctly the first time. The concept of a digital twin is shown to be beneficial in reducing construction time and improving operational efficiency. The speaker then introduces the idea of generative AI in predicting weather, with the example of Nvidia's CorrDiff model, which can predict weather at high resolution. The potential of generative AI in understanding and generating content is further discussed, including its application in drug discovery through Nvidia's BioNeMo platform and NIM microservices. The paragraph concludes with the introduction of Nvidia's inference microservice, a new way of delivering and operating software digitally.

35:12

💡 AI as a Service and the Future of Software

The speaker envisions a future where AI is not just a tool but a collaborative partner in software development. The concept of AI microservices, or 'NIMs', is introduced as a way to package pre-trained models with all dependencies, allowing for easy deployment and customization. The speaker discusses the potential for AI to understand and interact with proprietary data, turning it into an AI database that can be queried like a traditional database. The paragraph highlights the role of Nvidia as an AI foundry, offering technology, tools, and infrastructure to help create AI applications. The speaker also touches on the importance of partnerships with companies like SAP, ServiceNow, Cohesity, Snowflake, NetApp, and Dell in building AI factories and deploying AI systems.

40:13

🏭 The Next Wave of Robotics and AI Integration

The speaker discusses the next wave of robotics, where AI will have a deeper understanding of the physical world. The need for three computers in this new wave is outlined: the AI computer for learning from human examples, the autonomous system computer for real-time sensor processing, and the simulation engine for training robots. The speaker introduces the Jetson AGX as the autonomous system processor and the Omniverse as the simulation platform for robotics. The potential for AI to understand and adapt to the physical world is emphasized, with the example of a warehouse management system that integrates AI, robotics, and digital twins. The speaker concludes by discussing the future of software-defined facilities and the role of Omniverse in enabling this future.

45:14

🤖 Humanoid Robotics and the Future of AI

The speaker discusses the potential for humanoid robotics in the future, enabled by AI and the technologies developed by Nvidia. The paragraph introduces Project GR00T, a general-purpose foundation model for humanoid robot learning, and Isaac Lab, an application for training robots. The speaker also mentions the new Jetson Thor robotics chips designed for the future of AI-powered robotics. The potential for robots to learn from human demonstrations and emulate human movement is highlighted. The paragraph concludes with a demonstration of Disney's BDX robots, showcasing the practical applications of AI and robotics in entertainment and beyond.

50:17

🌟 Wrapping Up the Future of AI and Robotics

The speaker concludes the presentation by summarizing the key points discussed. The five key takeaways include the modernization of data centers through accelerated computing, the emergence of generative AI as a new industrial revolution, the creation of new types of software and applications through AI microservices, the transformation of everything that moves into robotics, and the need for a digital platform like Omniverse for the future of robotics. The speaker reiterates Nvidia's role in providing the building blocks for the next generation of AI-powered robotics and emphasizes the importance of collaboration and innovation in this new era of AI and robotics.

Keywords

💡AI (Artificial Intelligence)

AI refers to the simulation of human intelligence in machines that are programmed to think like humans and mimic their actions. In the context of the video, AI is central to various innovations and applications, such as predicting weather, enhancing healthcare, improving energy efficiency, and powering autonomous vehicles and robots. The script describes AI's role in generating virtual scenarios, aiding in decision-making, and even creating software, highlighting its transformative impact across multiple industries.

💡Generative AI

Generative AI involves algorithms that can generate new content or data that is similar but not identical to the training data. This concept is crucial in the video, illustrating how AI can create software, design complex systems, or simulate environments, leading to breakthroughs like the ability to generate realistic simulations, produce new drug compounds, or even create digital twins of physical objects and environments.

💡Deep Learning

Deep learning is a subset of machine learning in AI that mimics the workings of the human brain in processing data for use in detecting objects, recognizing speech, translating languages, and making decisions. The script refers to Nvidia's deep learning technology as a pivotal element in AI advancements, suggesting its role in developing sophisticated AI models and applications showcased throughout the presentation.

💡Nvidia

Nvidia is a technology company known for its graphics processing units (GPUs) for gaming and professional markets, as well as system on a chip units (SoCs) for the mobile computing and automotive market. In the video, Nvidia is highlighted as the driving force behind the AI technologies being discussed, showcasing their innovations in GPU technology, AI applications, and their pivotal role in the advancement of AI and deep learning.

💡Digital Twin

A digital twin is a virtual model designed to accurately reflect a physical object. In the script, digital twins are used extensively in various contexts, such as simulating Earth's climate for better weather forecasting or creating virtual warehouses to optimize logistics. The concept is integral to the theme, demonstrating how virtual simulations can predict real-world outcomes, enhance decision-making, and streamline operations in industries like manufacturing and urban planning.

💡CUDA

CUDA stands for Compute Unified Device Architecture, a parallel computing platform and application programming interface (API) model created by Nvidia. It allows software developers to use a CUDA-enabled graphics processing unit (GPU) for general purpose processing. The script mentions CUDA in the context of its revolutionary impact on computing, enabling accelerated performance in applications ranging from AI to scientific research.
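
As a taste of the programming model the keyword describes, here is a minimal GPU kernel written from Python via Numba's CUDA support. This is a sketch under stated assumptions: it requires an NVIDIA GPU and the numba package, and it leans on Numba's implicit host-to-device copies rather than explicit memory management.

```python
import numpy as np
from numba import cuda

@cuda.jit
def vector_add(a, b, out):
    i = cuda.grid(1)                 # this thread's global index
    if i < out.size:                 # guard the final partial block
        out[i] = a[i] + b[i]

a = np.arange(1 << 20, dtype=np.float32)
b = 2.0 * a
out = np.zeros_like(a)

threads = 256
blocks = (a.size + threads - 1) // threads
vector_add[blocks, threads](a, b, out)   # one GPU thread per element
assert np.allclose(out, a + b)
```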

💡Robotics

Robotics is the branch of technology that deals with the design, construction, operation, and use of robots. The video script explores how Nvidia's technology is being used to teach robots to assist, watch for danger, save lives, and even perform human-like tasks, emphasizing the role of AI and machine learning in advancing robotics technologies.

💡Transformer

In AI, a Transformer is a deep learning model that adopts the mechanism of attention, weighting the influence of different parts of the input data differently. The video discusses the Transformer in the context of its significance in AI development, particularly in generative AI models and large language models like GPT, demonstrating its effectiveness in handling sequential data for various applications.
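
Since the keyword turns on "weighting the influence of different parts of the input," the core computation fits in a few lines of NumPy: single-head scaled dot-product attention, without the learned projections or masking a full Transformer layer adds.

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: each output row is a weighted
    average of the value vectors, weighted by query-key similarity."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # query-key similarity
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)       # softmax over the keys
    return weights @ V                              # context-aware mixture

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
print(attention(Q, K, V).shape)                     # -> (4, 8)
```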

💡Omniverse

Omniverse refers to Nvidia's platform for 3D simulation and design collaboration, designed to create unified, physically accurate simulations. The script highlights Omniverse as a pivotal technology for creating simulations that span from industrial designs to entire virtual worlds, enabling real-time collaboration and development across various industries.

💡Jetson

Jetson is Nvidia's platform for embedded computing, including AI at the edge. The video script refers to Jetson in the context of powering robotics and intelligent edge devices, illustrating its capabilities in processing AI algorithms efficiently in real-world applications, such as autonomous vehicles, drones, and portable medical devices.

Highlights

Nvidia introduces AI technologies that revolutionize various fields including weather forecasting, healthcare, and robotics.

Innovations in AI enable the development of general-purpose humanoid robots, paving the way for advancements in robotic assistance.

Nvidia's AI Foundry offers a platform for developing proprietary AI applications, emphasizing the generation of new software through AI.

The introduction of Blackwell, a new computing platform designed for generative AI, showcases Nvidia's commitment to supporting the computational needs of AI-driven industries.

Nvidia's partnership with major companies like AWS, Google, Oracle, and Microsoft aims to integrate advanced AI capabilities into cloud services.

Nvidia's Project GR00T focuses on developing foundation models for humanoid robots, indicating a step towards creating versatile and adaptable robotic systems.

The launch of Nvidia Inference Microservices (NIMs) facilitates the deployment and management of AI models, making advanced AI accessible to a broader range of applications.

Nvidia Omniverse emerges as a critical platform for creating digital twins, enabling real-time simulations and collaborations across various industries.

The development of Isaac Perceptor SDK empowers robotics with advanced perception capabilities, enhancing autonomous navigation and interaction in complex environments.

Nvidia's initiative to build AI-powered weather forecasting models, like CorrDiff, demonstrates the potential to significantly improve prediction accuracy and efficiency.

The establishment of AI factories, powered by Nvidia's technology, signifies a transformative approach to creating and distributing AI-driven software solutions.

Collaborations with Siemens and other industry leaders underscore Nvidia's role in advancing digital transformation and the creation of the industrial metaverse.

Nvidia's Jetson Thor, a robotics chip, marks a significant advancement in powering humanoid and autonomous systems, underscoring Nvidia's leadership in AI hardware.

BYD's adoption of Nvidia's Thor for electric vehicles highlights the growing impact of AI and autonomous technologies in the automotive industry.

Nvidia's comprehensive approach to AI, from foundational models to deployment platforms like DGX Cloud, showcases the ecosystem's readiness to support next-generation AI applications.

Transcripts

00:03

[Music]

00:29

I am I am a

00:36

Visionary Illuminating galaxies to

00:39

witness the birth of

00:42

[Music]

00:47

stars and sharpening our understanding

00:50

of extreme weather

00:51

[Music]

00:56

events I am a helper

01:01

guiding the blind through a crowded

01:07

world I was thinking about running to

01:10

the store and giving voice to those who

01:13

cannot

01:14

speak do not make me

01:18

laugh I am a

01:22

Transformer harnessing gravity to store

01:25

Renewable

01:27

[Music]

01:28

Power

01:29

[Music]

01:34

and Paving the way towards unlimited

01:36

clean energy for us

01:39

[Music]

01:42

all I am a

01:44

[Music]

01:45

trainer teaching robots to

01:51

assist to watch out for

01:55

[Music]

01:58

danger and help save

02:04

lives I am a

02:08

Healer providing a new generation of

02:12

cures and new levels of patient care

02:16

doctor I am allergic to penicillin

02:18

is it still okay to take the medications

02:20

definitely these antibiotics don't

02:22

contain penicillin so it's perfectly

02:24

safe for you to take

02:26

them I am a navigator

02:31

[Music]

02:33

generating virtual

02:38

scenarios to let us safely explore the

02:41

real

02:43

world and understand every

02:47

[Music]

02:50

decision I even helped write the

02:55

script breathe life into the words

03:01

[Music]

03:13

I am

03:15

AI brought to life by

03:18

Nvidia deep

03:20

learning and Brilliant

03:22

Minds

03:28

everywhere

03:34

please welcome to the stage Nvidia

03:36

founder and CEO Jensen

03:38

[Music]

03:44

[Applause]

03:45

[Music]

03:52

Huang welcome to

03:58

GTC

04:00

I hope you realize this is not a

04:05

concert you have

04:07

arrived at a developers

04:11

conference there will be a lot of

04:13

science

04:14

described algorithms computer

04:18

architecture

04:27

mathematics I sensed a very heavy weight

04:31

in the room all of a

04:33

sudden almost like you were in the wrong

04:36

place no no conference in the

04:40

world is there a great assembly of

04:43

researchers from such diverse fields of

04:46

science from

04:48

climatech to radio Sciences trying to

04:51

figure out how to use AI to robotically

04:54

control MIMOs for next-generation 6G

04:57

radios robotic self-driving cars

05:00

even artificial

05:05

intelligence even artificial

05:07

intelligence

05:10

everybody's first I noticed a sense of

05:13

relief there all of all of a

05:15

sudden also this conference is

05:19

represented by some amazing companies

05:22

this list this is not the

05:26

attendees these are the presenters

05:30

and what's amazing is

05:32

this if you take away all of my friends

05:37

close friends Michael Dell is sitting

05:39

right there in the IT

05:47

industry all of the friends I grew up

05:49

with in the industry if you take away

05:52

that list this is what's

05:55

amazing these are the presenters of the

05:59

non it Industries using accelerated

06:01

Computing to solve problems that normal

06:04

computers

06:05

can't it's

06:09

represented in life sciences health

06:11

care

06:12

genomics Transportation of course retail

06:16

Logistics manufacturing

06:20

industrial the gamut of Industries

06:23

represented is truly amazing and you're

06:25

not here to attend only you're here to

06:28

present to talk about your research $100

06:32

trillion of the world's

06:34

Industries is represented in this room

06:36

today this is absolutely

06:44

amazing there is absolutely something

06:47

happening there is something going

06:50

on the industry is being transformed not

06:54

just ours because the computer industry

06:57

the computer is the single most

07:00

important instrument of society today

07:03

fundamental transformations in Computing

07:05

affects every industry but how did we

07:09

start how did we get here I made a

07:11

little cartoon for you literally I drew

07:14

this in one page this is nvidia's

07:18

Journey started in

07:20

1993 this might be the rest of the

07:24

talk 1993 this is our journey we were

07:27

founded in 1993 there are several

07:29

important events that happen along the

07:30

way I'll just highlight a few in 2006

07:35

CUDA which has turned out to have been a

07:37

revolutionary Computing model we thought

07:40

it was revolutionary then it was going

07:42

to be an overnight success and almost 20

07:44

years later it

07:48

happened we saw it

07:52

coming two decades

07:57

later in 2012

08:00

AlexNet AI and

08:03

CUDA made first

08:06

contact in

08:08

2016 recognizing the importance of this

08:11

Computing model we invented a brand new

08:13

type of computer we called the DGX-1

08:17

170 teraflops in this supercomputer

08:21

eight gpus connected together for the

08:23

very first time I hand delivered the

08:26

very first DGX-1 to a startup

08:30

located in San

08:31

Francisco called Open

08:40

AI DGX-1 was the world's first AI

08:43

supercomputer remember 170 tera

08:47

flops

08:49

2017 the Transformer arrived

08:53

2022 ChatGPT captured the world's

08:56

imagination and helped people realize the

08:58

importance and the capabilities of

09:00

artificial intelligence and

09:04

2023 generative AI

09:07

emerged and a new industry begins

09:12

why why is it a new industry because the

09:15

software never existed before we are now

09:18

producing software using computers to

09:20

write software producing software that

09:23

never existed before it is a brand new

09:26

category it took share from

09:28

nothing it's a brand new category and

09:31

the way you produce the

09:33

software is unlike anything we've ever

09:36

done before in data

09:39

centers generating

09:42

tokens

09:44

producing floating Point

09:46

numbers at very large scale as if in the

09:51

beginning of this last Industrial

09:54

Revolution when people realized that you

09:56

would set up

09:58

factories

09:59

apply energy to it and this invisible

10:03

valuable thing called electricity came

10:05

out AC

10:07

generators and 100 years later 200 years

10:10

later we are now creating new types of

10:14

electrons tokens using infrastructure we

10:18

call factories AI factories to generate

10:21

this new incredibly valuable thing

10:24

called artificial intelligence a new

10:26

industry has

10:28

emerged well well we're going to talk

10:30

about many things about this new

10:33

industry we're going to talk about how

10:34

we're going to do Computing next we're

10:37

going to talk about the type of software

10:39

that you build because of this new

10:41

industry the new

10:43

software how you would think about this

10:45

new software what about applications in

10:48

this new

10:49

industry and then maybe what's next and

10:52

how can we start preparing today for

10:55

what is about to come next well but

10:58

before I start

11:00

I want to show you the soul of

11:03

Nvidia the soul of our company at the

11:07

intersection of computer

11:10

Graphics

11:12

physics and artificial

11:15

intelligence all intersecting inside a

11:19

computer in

11:21

Omniverse in a virtual world

11:24

simulation everything we're going to

11:26

show you today literally everything

11:28

we're going to show you today

11:30

is a simulation not animation it's only

11:34

beautiful because it's physics the world

11:36

is

11:37

beautiful it's only amazing because it's

11:40

being animated with robotics it's being

11:43

animated with artificial intelligence

11:45

what you're about to see all

11:46

day it's completely generated completely

11:50

simulated and Omniverse and all of it

11:53

what you're about to enjoy is the

11:54

world's first concert where everything

11:57

is

11:58

homemade

12:05

everything is homemade you're about to

12:08

watch some home videos so sit back and

12:12

enjoy

12:14

[Music]

12:22

[Music]

12:28

yourself

12:30

[Music]

12:58

m

13:24

[Music]

13:28

what

13:34

[Music]

13:44

[Music]

13:58

a

14:14

[Music]

14:29

[Music]

14:42

[Music]

14:58

God I love it

15:03

Nvidia accelerated Computing has reached

15:07

the Tipping

15:08

Point general purpose Computing has run

15:11

out of steam we need another way of

15:14

doing Computing so that we can continue

15:16

to scale so that we can continue to

15:18

drive down the cost of computing so that

15:20

we can continue to consume more and more

15:23

Computing while being sustainable

15:26

accelerated Computing is a dramatic

15:29

speed up over general purpose Computing

15:32

and in every single industry we engage

15:36

and I'll show you

15:37

many the impact is dramatic but in no

15:40

industry is a more important than our

15:43

own the industry of using simulation

15:46

tools to create

15:49

products in this industry it is not

15:52

about driving down the cost of computing

15:54

it's about driving up the scale of

15:56

computing we would like to be able to

15:58

simulate the entire product that we do

16:02

completely in full Fidelity completely

16:05

digitally in essentially what we call

16:08

digital twins we would like to design it

16:11

build it simulate it operate it

16:15

completely

16:17

digitally in order to do that we need to

16:20

accelerate an entire industry and today

16:24

I would like to announce that we have

16:26

some Partners who are joining us in this

16:27

journey to accelerate their entire

16:30

ecosystem so that we can bring the world

16:33

into accelerated Computing but there's a

16:38

bonus when you become accelerated your

16:42

infrastructure is CUDA GPUs and when

16:45

that happens it's exactly the same

16:47

infrastructure for generative

16:50

Ai and so I'm just delighted to announce

16:54

several very important Partnerships

16:56

there are some of the most important

16:57

companies in the world and Ansys does

17:00

engineering simulation for what the

17:01

world makes we're partnering with them

17:04

to CUDA-accelerate the Ansys ecosystem

17:07

to connect Ansys to the Omniverse digital

17:10

twin incredible the thing that's really

17:13

great is that the install base of Nvidia

17:14

GPU accelerated systems are all over the

17:16

world in every cloud in every system all

17:20

over Enterprises and so the app the

17:22

applications they accelerate will have a

17:24

giant installed base to go serve end

17:27

users will have amazing applications and

17:29

of course system makers and csps will

17:31

have great customer

17:33

demand

17:35

Synopsys Synopsys is Nvidia's literally

17:40

first software partner they were there

17:42

in very first day of our company

17:44

Synopsys revolutionized the chip

17:45

industry with high level design we are

17:49

going to CUDA-accelerate Synopsys we're

17:52

accelerating computational lithography

17:55

one of the most important applications

17:57

that nobody's ever known about

17:59

in order to make chips we have to push

18:01

lithography to the limit Nvidia has created

18:04

a library domain specific library that

18:07

accelerates computational lithography

18:10

incredibly once we can accelerate and

18:13

software Define all of tsmc who is

18:16

announcing today that they're going to

18:18

go into production with Nvidia cuLitho

18:20

once this software defined and

18:22

accelerated the next step is to apply

18:25

generative AI to the future of

18:27

semiconductor manufacturing push in

18:29

Geometry even

18:31

further Cadence builds the world's

18:35

essential EDA and SDA tools we also use

18:38

Cadence between these three companies

18:40

Ansys Synopsys and Cadence we basically

18:43

build Nvidia together we are CUDA

18:46

accelerating Cadence they're also

18:48

building a supercomputer out of Nvidia

18:50

gpus so that their customers could do

18:53

fluid Dynamic simulation at a 100 a

18:57

thousand times scale

18:59

basically a wind tunnel in real time

19:03

Cadence Millennium a supercomputer with

19:05

Nvidia gpus inside a software company

19:08

building supercomputers I love seeing

19:10

that building Cadence co-pilots together

19:13

imagine a

19:14

day when Cadence Synopsys Ansys

19:18

tool providers would offer you AI

19:22

co-pilots so that we have thousands and

19:24

thousands of co-pilot assistants helping

19:27

us design chips Design Systems and we're

19:30

also going to connect Cadence digital

19:32

twin platform to Omniverse as you could

19:34

see the trend here we're accelerating

19:37

the world's CAE EDA and SDA so that we

19:40

could create our future in digital Twins

19:44

and we're going to connect them all to

19:45

Omniverse the fundamental operating

19:47

system for future digital

19:50

twins one of the industries that

19:52

benefited tremendously from scale and

19:55

you know you all know this one very well

19:57

large language model

20:00

basically after the Transformer was

20:02

invented we were able to scale large

20:05

language models at incredible rates

20:08

effectively doubling every six months

20:10

now how is it possible that by doubling

20:13

every six months that we have grown the

20:16

industry we have grown the computational

20:18

requirements so far and the reason for

20:20

that is quite simply this if you double

20:23

the size of the model you double the

20:24

size of your brain you need twice as

20:25

much information to go fill it and so

20:28

every time you double your parameter

20:32

count you also have to appropriately

20:35

increase your training token count the

20:38

combination of those two

20:40

numbers becomes the computation scale

20:43

you have to

20:44

support the latest the state-of-the-art

20:46

OpenAI model is approximately 1.8

20:49

trillion parameters 1.8 trillion

20:52

parameters required several trillion

20:55

tokens to go

20:57

train so so a few trillion parameters on

21:00

the order of a few trillion tokens on

21:03

the order of when you multiply the two

21:05

of them together approximately 30 40 50

21:10

billion quadrillion floating Point

21:14

operations per second now we just have

21:16

to do some CEO math right now just hang

21:18

hang with me so you have 30 billion

21:21

quadrillion a quadrillion is like a peta

21:25

and so if you had a petaflop GPU you

21:28

would need

21:30

30 billion seconds to go compute to go

21:33

train that model 30 billion seconds is

21:35

approximately 1,000

21:38

years well 1,000 years it's worth

21:47

it like to do it sooner but it's worth

21:51

it which is usually my answer when most

21:54

people tell me hey how long how long's

21:55

it going to take to do something 20

21:57

years how it it's worth

22:01

it but can we do it next

22:05

week and so 1,000 years 1,000 years so

22:09

what we need what we

22:12

need are bigger

22:14

gpus we need much much bigger gpus we

22:18

recognized this early on and we realized

22:21

that the answer is to put a whole bunch

22:23

of gpus together and of course innovate

22:26

a whole bunch of things along the way

22:27

like inventing Tensor Cores advancing

22:30

NVLink so that we could create

22:32

essentially virtually Giant

22:34

gpus and connecting them all together

22:36

with amazing networks from a company

22:39

called Mellanox InfiniBand so that we

22:41

could create these giant systems and so

22:43

DGX-1 was our first version but it wasn't

22:45

the last we built we built

22:48

supercomputers all the way all along the

22:50

way in

22:52

2021 we had Selene 4,500 GPUs or so and

22:57

then in 2023 we built one of the largest

23:00

AI supercomputers in the world it's just

23:02

come

23:03

online

23:05

EOS and as we're building these things

23:08

we're trying to help the world build

23:10

these things and in order to help the

23:12

world build these things we got to build

23:13

them first we build the chips the

23:15

systems the networking all of the

23:18

software necessary to do this you should

23:20

see these

23:21

systems imagine writing a piece of

23:24

software that runs across the entire

23:26

system Distributing the computation

23:28

across

23:29

thousands of gpus but inside are

23:31

thousands of smaller

23:34

gpus millions of gpus to distribute work

23:37

across all of that and to balance the

23:39

workload so that you can get the most

23:41

Energy Efficiency the best computation

23:44

time keep your cost down and so those

23:47

those fundamental

23:50

Innovations is what got us here and here

23:54

we

23:55

are as we see the miracle of chat GPT

23:59

emerge in front of us we also realize we

24:02

have a long ways to go we need even

24:06

larger models we're going to train it

24:08

with multimodality data not just text on

24:10

the internet but we're going to we're

24:12

going to train it on texts and images

24:14

and graphs and

24:15

charts and just as we learn watching TV

24:19

and so there's going to be a whole bunch

24:21

of watching video so that these

24:23

models can be grounded in physics

24:26

understands that an arm doesn't go

24:27

through a wall and so these models would

24:30

have common sense by watching a lot of

24:33

the world's video combined with a lot of

24:36

the world's languages it'll use things

24:38

like synthetic data generation just as

24:40

you and I do when we try to learn we

24:43

might use our imagination to simulate

24:46

how it's going to end up just as I did

24:48

when I was preparing for this keynote I

24:50

was simulating it all along the

24:54

way I hope it's going to turn out as

24:57

well as I had it in my

25:05

head as I was simulating how this

25:07

keynote was going to turn out somebody

25:08

did say that another

25:12

performer did her performance completely

25:15

on a

25:16

treadmill so that she could be in shape

25:18

to deliver it with full

25:21

energy I I didn't do

25:25

that if I get a little winded at about 10

25:27

minutes into this you know what

25:30

happened and so so where were we we're

25:34

sitting here using synthetic data

25:36

generation we're going to use

25:37

reinforcement learning we're going to

25:38

practice it in our mind we're going to

25:40

have ai working with AI training each

25:42

other just like student teacher

25:45

Debaters all of that is going to

25:47

increase the size of our model it's

25:48

going to increase the amount of the

25:50

amount of data that we have and we're

25:51

going to have to build even bigger

25:55

gpus Hopper is fantastic but we need

25:59

bigger

26:00

gpus and so ladies and

26:04

gentlemen I would like to introduce

26:07

you to a very very big

26:14

[Applause]

26:23

GPU named after David

26:26

Blackwell mathemat

26:29

ician game theorist

26:32

probability we thought it was a perfect

26:35

per per perfect name Blackwell ladies

26:38

and gentlemen enjoy

26:57

this

27:57

the

28:57

com

29:06

[Applause]

29:17

Blackwell is not a chip Blackwell is the

29:19

name of a

29:20

platform uh people think we make

29:23

gpus and and we do but gpus don't look

29:28

the way they used

29:30

to here here's the here's the here's the

29:33

the if you will the heart of the Blackwell

29:36

system and this inside the company is

29:39

not called Blackwell it's just the

29:40

number and um uh

29:44

this this is Blackwell sitting next to

29:47

oh this is the most advanced GPU in the

29:49

world in production

29:54

today this is

29:56

Hopper this is Hopper Hopper changed the

30:00

world this is

30:11

Blackwell it's okay

30:18

Hopper you're you're very

30:21

good good good

30:24

boy well good

30:26

girl

30:29

208 billion transistors and so so you

30:33

could see you I can see that there's a

30:36

small line between two dies this is the

30:38

first time two dies have abutted like

30:41

this together in such a way that the two

30:44

chip the two dies think it's one chip

30:46

there's 10 terabytes of data between it

30:49

10 terabytes per second so that these

30:52

two these two sides of the Blackwell

30:54

Chip have no clue which side they're on

30:57

there's no memory locality issues no

30:59

cache issues it's just one giant chip and

31:03

so uh when we were told that Blackwell's

31:07

Ambitions were beyond the limits of

31:09

physics uh the engineer said so what and

31:12

so this is what what happened and so

31:14

this is the Blackwell chip and it goes

31:18

into two types of systems the first

31:22

one is form fit function compatible to

31:25

Hopper and so you slide all Hopper and

31:28

you push in Blackwell that's the reason

31:29

why one of the challenges of ramping is

31:32

going to be so efficient there are

31:34

installations of Hoppers all over the

31:36

world and they could be they could be

31:38

you know the same infrastructure same

31:39

design the power the electricity The

31:43

Thermals the software identical push it

31:46

right back and so this is a hopper

31:49

version for the current HGX

31:53

configuration and this is what the other

31:56

the second Hopper looks like this now

31:58

this is a prototype board and um Janine

32:02

could I just

32:04

borrow ladies and gentlemen Jan

32:11

Paul and so this this is the this is a

32:14

fully functioning board and I just be

32:17

careful

32:18

here this right here is I don't know $10

32:26

billion

32:28

the second one's

32:33

five it gets cheaper after that so any

32:36

customers in the audience it's

32:41

okay all right but this is this one's

32:44

quite expensive this is to bring up

32:45

board and um and the the way it's going

32:48

to go to production is like this one

32:50

here okay and so you're going to take

32:52

take this it has two Blackwell dies two two

32:56

Blackwell chips and four Blackwell dies

32:59

connected to a Grace CPU the Grace CPU

33:03

has a super fast chip-to-chip link what's

33:05

amazing is this computer is the first of

33:08

its kind where this much computation

33:11

first of all fits into this small of a

33:14

place second it's memory coherent they

33:18

feel like they're just one big happy

33:20

family working on one application

33:23

together and so everything is coherent

33:25

within it um the just the amount of you

33:29

know you saw the numbers there's a lot

33:31

of terabytes this and terabytes that's

33:33

um but this is this is a miracle this is

33:35

a this let's see what are some of the

33:38

things on here uh there's um uh NVLink

33:42

on top PCI Express on the

33:46

bottom on on uh

33:50

your which one is mine and your left one

33:53

of them it doesn't matter uh one of them

33:56

one of them is a CPU chip-to-chip link is

34:00

my left or your depending on which side

34:01

I was just I was trying to sort that out

34:04

and I just kind of doesn't

34:11

matter hopefully it comes plugged in

34:18

so okay so this is the grace Blackwell

34:26

system

34:31

but there's

34:34

more so it turns out it turns out all of

34:38

the specs is fantastic but we need a

34:40

whole lot of new features uh in order to

34:43

push the limits Beyond if you will the

34:46

limits of

34:47

physics we would like to always get a

34:50

lot more X factors and so one of the

34:52

things that we did was We Invented

34:53

another Transformer engine another

34:56

Transformer engine the second generation

34:58

it has the ability to

35:00

dynamically and automatically

35:03

rescale and

35:06

recast numerical formats to a lower

35:09

Precision whenever it can remember

35:12

artificial intelligence is about

35:13

probability and so you kind of have you

35:16

know 1.7 approximately 1.7 time

35:19

approximately 1.4 to be approximately

35:21

something else does that make sense and

35:23

so so the the ability for the

35:26

mathematics to retain the Precision and

35:29

the range necessary in that particular

35:32

stage of the pipeline super important

35:35

and so this is it's not just about the

35:37

fact that we designed a smaller ALU it's

35:39

not quite the world's not quite that

35:41

simple you've got to figure out when you

35:44

can use that across a computation that

35:48

is thousands of gpus it's running for

35:52

weeks and weeks on weeks and you want to

35:54

make sure that the the uh uh the

35:56

training job is going going to converge

35:59

and so this new Transformer engine we

36:01

have a fifth-generation NV

36:03

Link it's now twice as fast as Hopper

36:06

but very importantly it has computation

36:09

in the network and the reason for that

36:11

is because when you have so many

36:12

different gpus working together we have

36:15

to share our information with each other

36:17

we have to synchronize and update each

36:19

other and every so often we have to

36:21

reduce the partial products and then

36:24

rebroadcast out the partial products the

36:26

sum of the partial products back to

36:28

everybody else and so there's a lot of

36:29

what is called all reduce and all to all

36:32

and all gather it's all part of this

36:34

area of synchronization and collectives

36:36

so that we can have gpus working with

36:38

each other having extraordinarily fast

36:41

links and being able to do mathematics

36:43

right in the network allows us to

36:46

essentially amplify even further so even

36:49

though it's 1.8 terabytes per second

36:51

it's effectively higher than that and so

36:53

it's many times that of Hopper the likelih

36:57

ood of a supercomputer running for weeks

37:01

on end is approximately zero and the

37:05

reason for that is because there's so

37:06

many components working at the same time

37:09

the statistic the probability of them

37:12

working continuously is very low and so

37:14

we need to make sure that whenever there

37:16

is a well we checkpoint and restart as

37:19

often as we can but if we have the

37:22

ability to detect a weak chip or a weak

37:26

node early we could retire it and maybe

37:29

swap in another processor that ability

37:33

to keep the utilization of the

37:34

supercomputer High especially when you

37:37

just spent $2 billion building it is

37:40

super important and so we put in a RAS

37:45

engine a reliability engine that does

37:48

100% self test in system test of every

37:53

single gate every single bit of memory

37:58

on the Blackwell chip and all the memory

38:01

that's connected to it it's almost as if

38:04

we shipped with every single chip its

38:07

own advanced tester that we test our

38:11

chips with this is the first time we're

38:13

doing this super excited about it secure

38:22

AI only at this conference do they clap for

38:26

RAS the

38:28

the uh secure AI uh obviously you've

38:32

just spent hundreds of millions of

38:34

dollars creating a very important Ai and

38:37

the the code the intelligence of that AI

38:40

is encoded in the parameters you want to

38:42

make sure that on the one hand you don't

38:44

lose it on the other hand it doesn't get

38:45

contaminated and so we now have the

38:48

ability to encrypt data of course at

38:53

rest but also in transit and while it's

38:56

being computed

38:58

it's all encrypted and so we now have

39:01

the ability to encrypt in transmission

39:04

and when we're Computing it it is in a

39:06

trusted trusted environment trusted uh

39:09

engine environment and the last thing is

39:13

decompression moving data in and out of

39:15

these nodes when the compute is so fast

39:18

becomes really

39:19

essential and so we've put in a high

39:23

line-speed compression engine and

39:25

effectively moves data 20 times

39:27

faster in and out of these computers

39:29

these computers are are so powerful and

39:32

there's such a large investment the last

39:34

thing we want to do is have them be idle

39:36

and so all of these capabilities are

39:38

intended to keep Blackwell fed and as

39:44

busy as

39:46

possible overall compared to

39:49

Hopper it is two and a half times two

39:53

and a half times the fp8 performance for

39:56

training per chip it also has

40:00

this new format called fp6 so that even

40:03

though the computation speed is the

40:05

same the bandwidth that's Amplified

40:09

because of the memory the amount of

40:11

parameters you can store in the memory

40:12

is now Amplified fp4 effectively doubles

40:16

the throughput this is vitally important

40:19

for inference one of the things that

40:21

that um is becoming very clear is that

40:24

whenever you use a computer with AI on

40:27

the other

40:27

side when you're chatting with the

40:30

chatbot when you're asking it to uh

40:34

review or make an

40:36

image remember in the back is a GPU

40:40

generating

40:41

tokens some people call it inference but

40:45

it's more appropriately

40:48

generation the way that Computing is

40:50

done in the past was retrieval you would

40:53

grab your phone you would touch

40:54

something um some signals go off

40:57

basically an email goes off to some

40:59

storage somewhere there's pre-recorded

41:02

content somebody wrote a story or

41:03

somebody made an image or somebody

41:04

recorded a video that record

41:07

pre-recorded content is then streamed

41:09

back to the phone and recomposed in a

41:11

way based on a recommender system to

41:14

present the information to

41:16

you you know that in the future the vast

41:20

majority of that content will not be

41:22

retrieved and the reason for that is

41:24

because that was pre-recorded by

41:25

somebody who doesn't understand the

41:27

context which is the reason why we have

41:29

to retrieve so much content if you can

41:33

be working with an AI that understands

41:35

the context who you are for what reason

41:37

you're fetching this information and

41:39

produces the information for you just

41:42

the way you like

41:43

it the amount of energy we save the

41:46

amount of networking bandwidth we save

41:48

the amount of waste of time we save will

41:51

be tremendous the future is generative

41:55

which is the reason why we call it

41:56

generative AI which is the reason why

41:59

this is a brand new industry the way we

42:02

compute is fundamentally different we

42:05

created a processor for the generative

42:08

AI era and one of the most important

42:11

parts of it is content token generation

42:14

we call it this format is

42:17

fp4 well that's a lot of computation

42:24

5x the token generation 5x the

42:27

inference capability of Hopper seems

42:32

like

42:35

enough but why stop

42:39

there the answer is it's not enough and

42:41

I'm going to show you why I'm going to

42:43

show you why and so we would like to

42:46

have a bigger GPU even bigger than this

42:48

one and so

42:51

we decided to scale it and notice but

42:54

first let me just tell you how we've

42:55

scaled over the course of the last eight

42:59

years we've increased computation by

43:01

1,000 times in 8 years 1,000 times remember

43:04

back in the good old days of Moore's Law

43:07

it was 2x well 5x every what 10 10x

43:12

every 5 years that's easier easiest math

43:14

10x every 5 years a 100 times every 10

43:17

years 100 times every 10 years at the in

43:21

the middle in the heyday of the PC

43:25

Revolution one 100 times every 10 years

43:29

in the last 8 years we've gone 1,000

43:33

times we have two more years to

43:35

go and so that puts it in

43:41

perspective the rate at which we're

43:43

advancing Computing is insane and it's

43:46

still not fast enough so we built

43:47

another

43:49

chip this chip is just an incredible

43:53

chip. We call it the NVLink Switch.

43:56

it's 50 billion transistors it's almost

43:59

the size of Hopper all by itself this

44:02

switch chip has four NVLinks in

44:05

it each 1.8 terabytes per

44:08

second

44:10

and and it has computation in as I

44:13

mentioned what is this chip

44:16

for if we were to build such a chip we

44:20

can have every single GPU talk to every

44:23

other GPU at full speed at the same

44:27

time that's

44:36

insane it doesn't even make

44:39

sense but if you could do that if you

44:42

can find a way to do that and build a

44:44

system to do that that's cost effective

44:48

that's cost effective how incredible

44:51

would it be that we could have all these

44:53

gpus connect over a coherent link so

44:58

that they effectively are one giant GPU

45:02

well one of one of the Great Inventions

45:04

in order to make a cost effective is

45:05

that this chip has to drive copper

45:09

directly. The SerDes of this chip is

45:11

just a phenomenal invention so that we

45:14

could do direct drive to copper and as a

45:16

result you can build a system that looks

45:19

like

45:25

this

45:30

now this system this system is kind of

45:34

insane. This is one DGX. This is what a

45:38

DGX looks like now. Remember, just six

45:41

years

45:43

ago it was pretty heavy but I was able

45:46

to lift

45:49

it. I delivered the first DGX-1

45:53

to OpenAI and the researchers there —

45:56

it's on you know the pictures are on the

45:57

internet and uh uh and we all

46:00

autographed it uh and um uh if you come

46:04

to my office it's autographed there is

46:06

really beautiful and but but you could

46:08

lift it. This DGX... that DGX, by

46:12

the way was

46:14

170

46:16

teraflops if you're not familiar with

46:18

the numbering system that's

46:21

0.17 petaflops. So this is

46:25

720. The first one I delivered to OpenAI

46:28

was

46:29

0.17 you could round it up to 0.2 won't

46:32

make any difference but and back then

46:34

was like wow you know 30 more teraflops

46:37

and so this is now 720 petaflops,

46:42

almost an exaflop, for training — and the

46:44

world's first one-exaflop machine in

46:47

one

46:55

rack. Just so you know, there are only a

46:58

couple, two, three exaflops machines on

47:00

the planet as we speak and so this is an

47:04

exaflops AI system in one single rack.

47:09

well let's take a look at the back of

47:13

it so this is what makes it possible

47:17

that's the back that's the that's the

47:19

back: the DGX NVLink spine. 130 terabytes

47:24

per

47:25

second goes through the back of that

47:28

chassis that is more than the aggregate

47:30

bandwidth of the

47:40

internet so we we could basically send

47:43

everything to everybody within a second

47:46

And so we have 5,000 cables — 5,000

47:50

NVLink cables, in total 2

47:53

miles. Now this is the amazing thing: if

47:56

we had to use Optics we would have had

47:58

to use transceivers and retimers, and those

48:01

transceivers and retimers alone would have

48:04

cost

48:05

20,000

48:07

watts — 20 kilowatts of just transceivers

48:10

alone, just to drive the NVLink spine. As

48:14

a result we did it completely for free

48:16

over mvlink switch and we were able to

48:19

save the 20 kilowatts for computation. This

48:22

entire rack is 120 kilowatts so that 20

48:25

kilowatts makes a huge difference

48:27

It's liquid cooled. What goes in is 25°C,

48:31

about room temperature what comes out is

48:34

45°C, about your jacuzzi. So room

48:38

temperature goes in jacuzzi comes out 2

48:40

liters per

48:49

second. We could sell a

48:55

peripheral

48:58

600,000 parts. Somebody used to say, you

49:01

know you guys make gpus and we do but

49:05

this is what a GPU looks like to me when

49:07

somebody says GPU I see this two years

49:10

ago, when I saw a GPU, it was the HGX. It was

49:13

70 pounds, 35,000 parts. Our GPUs now are

49:18

600,000 parts

49:21

and 3,000 pounds. 3,000 pounds. That's

49:27

kind of like the weight of a you know

49:30

Carbon

49:31

Fiber

49:33

Ferrari I don't know if that's useful

49:35

metric

49:37

but everybody's going I feel it I feel

49:40

it I get it I get that now that you

49:43

mention that I feel it I don't know

49:46

what's 3,000

49:47

pounds. Okay, so 3,000 pounds — a ton and a half — so

49:51

it's not quite an

49:53

elephant. So this is what a DGX looks

49:56

like now let's see what it looks like in

49:58

operation okay let's imagine what is

50:00

what how do we put this to work and what

50:01

does that mean well if you were to train

50:03

a GPT model 1.8 trillion parameter

50:08

model it took it took about apparently

50:11

about you know 3 to 5 months or so uh

50:13

with 25,000 Amperes. If we were to do it

50:16

with hopper it would probably take

50:17

something like 8,000 gpus and it would

50:20

consume 15 megawatts 8,000 gpus on 15

50:23

megawatts it would take 90 days about 3

50:25

months, and that would allow you to

50:27

train something that is you know this

50:30

groundbreaking AI model and this is

50:34

obviously not as expensive as as um as

50:37

anybody would think but it's 8,000 8,000

50:39

gpus it's still a lot of money and so

50:41

8,000 gpus 15 megawatts if you were to

50:44

use Blackwell to do this it would only

50:47

take 2,000

50:49

gpus 2,000 gpus same 90 days but this is

50:54

the amazing part: only 4 megawatts of power.
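
The energy arithmetic behind that comparison is simple (this sketch assumes the quoted power draw is sustained for the whole 90-day run; real utilization varies):

```python
# Energy used by each training scenario quoted on stage.
def run_energy_gwh(megawatts: float, days: float) -> float:
    return megawatts * 24 * days / 1000   # MW x hours -> MWh -> GWh

hopper_gwh = run_energy_gwh(15, 90)      # 8,000 Hopper GPUs at 15 MW
blackwell_gwh = run_energy_gwh(4, 90)    # 2,000 Blackwell GPUs at 4 MW
print(f"Hopper:    {hopper_gwh:.1f} GWh")
print(f"Blackwell: {blackwell_gwh:.1f} GWh ({hopper_gwh / blackwell_gwh:.2f}x less)")
```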

50:58

So, from 15 megawatts down to 4. Yeah, that's

51:04

right and that's and that's our goal our

51:07

goal is to continuously drive down the

51:10

cost and the energy they're directly

51:11

proportional to each other cost and

51:13

energy associated with the Computing so

51:15

that we can continue to expand and scale

51:17

up the computation that we have to do to

51:20

train the Next Generation models well

51:22

this is

51:23

training. Inference — or generation —

51:27

is vitally important going forward you

51:29

know probably some half of the time that

51:31

Nvidia gpus are in the cloud these days

51:33

it's being used for token generation you

51:36

know they're either doing co-pilot this

51:37

or, you know, ChatGPT that, or all

51:40

these different models that are being

51:41

used when you're interacting with it or

51:44

generating images or

51:46

generating videos generating proteins

51:48

generating chemicals there's a bunch of

51:50

generation going on. All of that is

51:53

in the category of computing we call

51:56

inference

51:57

but inference is extremely hard for

51:59

large language models because these

52:01

large language models have several

52:03

properties one they're very large and so

52:05

it doesn't fit on one GPU this is

52:08

Imagine Excel doesn't fit on one

52:11

GPU you know and imagine some

52:13

application you're running on a daily

52:15

basis doesn't run doesn't fit on one

52:16

computer like a video game doesn't fit

52:18

on one computer and most in fact do and

52:23

many times in the past in hyperscale

52:25

Computing many applic applications for

52:27

many people fit on the same computer and

52:29

now all of a sudden this one inference

52:31

application where you're interacting

52:33

with this chatbot that chatbot requires

52:36

a supercomputer in the back to run it

52:38

and that's the future the future is

52:41

generative with these chatbots and these

52:43

chatbots are trillions of tokens

52:46

trillions of parameters and they have to

52:48

generate

52:49

tokens at interactive rates now what

52:52

does that mean? Well, about three tokens

52:56

is about a

52:58

word. You know, the —

53:01

you know, "Space: the final frontier,

53:05

these are the adventures..." —

53:07

that's like 80

53:09

tokens. Okay, I don't know if that's

53:12

useful to you and

53:16

so you know the art of communications is

53:19

is selecting good

53:22

analogies. Yeah, this is not going

53:25

well

53:28

Everybody's going, I don't know what he's talking

53:30

about, never seen Star Trek. And so

53:34

here we are, we're trying to generate

53:35

these tokens when you're interacting

53:37

with it you're hoping that the tokens

53:38

come back to you as quickly as possible

53:40

and as quickly as you can read it and so

53:42

the ability for Generation tokens is

53:44

really important. You have to parallelize

53:46

the work of this model across many many

53:48

gpus so that you could achieve several

53:51

things one on the one hand you would

53:52

like throughput because that throughput

53:55

reduces the cost

53:57

the overall cost per token of uh

54:00

generating so your throughput dictates

54:03

the cost of of uh delivering the service

54:06

on the other hand you have another

54:08

interactive rate which is another tokens

54:10

per second where it's about per user and

54:13

that has everything to do with quality

54:14

of service and so these two things um uh

54:18

compete against each other and we have

54:20

to find a way to distribute work across

54:23

all of these different GPUs and parallelize

54:25

it in a way that allows us to achieve

54:27

both. And it turns out the search

54:29

space is

54:31

enormous you know I told you there's

54:33

going to be math

54:34

involved and everybody's going oh

54:37

dear I heard some gasp just now when I

54:40

put up that slide you know so so this

54:43

right here: the y-axis is tokens

54:45

per second — data center throughput; the x-

54:48

axis is tokens per second interactivity

54:51

of the person and notice the upper right

54:53

is the best you want interactivity to be

54:56

very

54:56

High number of tokens per second per

54:59

user you want the tokens per second of

55:01

per data center to be very high the

55:02

upper right is terrific. However,

55:05

it's very hard to do that and in order

55:08

for us to search for the best

55:10

answer across every single one of those

55:12

intersections XY coordinates okay so you

55:15

just look at every single XY coordinate

55:17

all those blue dots came from some

55:20

repartitioning of the software some

55:23

optimizing solution has to go and figure

55:25

out whether to use tensor

55:29

parallel expert parallel pipeline

55:32

parallel or data parallel and

55:34

distribute this enormous model across

55:37

all these different GPUs and sustain the

55:40

performance that you need.
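
To make that search concrete, here is a toy sketch — the cost model is invented purely for illustration (the real optimizer, runtimes, and numbers are not shown in the talk): enumerate the ways to split a model across tensor/expert/pipeline/data parallelism and keep the Pareto-best points, which trace out the "roofline" described next.

```python
# Toy parallelism-configuration search (hypothetical cost model).
from itertools import product

NUM_GPUS = 64

def toy_metrics(tp, ep, pp, dp):
    """Invented stand-ins for per-user speed and total throughput."""
    per_user = 100.0 * tp * pp ** 0.5 / (1 + 0.1 * tp)   # tokens/s/user
    total = per_user * dp * ep ** 0.7                    # tokens/s/datacenter
    return per_user, total

configs = []
for tp, ep, pp, dp in product([1, 2, 4, 8], repeat=4):
    if tp * ep * pp * dp == NUM_GPUS:                    # must use all GPUs
        configs.append(((tp, ep, pp, dp), *toy_metrics(tp, ep, pp, dp)))

# Pareto frontier: keep configs no other config beats on both axes.
pareto = [c for c in configs
          if not any(o[1] > c[1] and o[2] > c[2] for o in configs)]
for cfg, per_user, total in sorted(pareto, key=lambda c: c[1]):
    print(f"TP{cfg[0]} EP{cfg[1]} PP{cfg[2]} DP{cfg[3]}: "
          f"{per_user:6.1f} tok/s/user, {total:9.0f} tok/s total")
```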

55:42

This exploration space would be impossible if

55:45

not for the programmability of nvidia's

55:47

GPUs. And so we could — because of CUDA,

55:49

because we have such a rich ecosystem — we

55:51

could explore this universe and find

55:54

that green roofline. It turns out that

55:57

green roofline — notice you got TP2 EP8

56:01

DP4. It means tensor

56:05

parallel across two GPUs,

56:08

expert parallel across eight, data

56:10

parallel across four notice on the other

56:12

end you got tensor parallel across 4 and

56:14

expert parallel across 16 the

56:17

configuration the distribution of that

56:19

software it's a different different um

56:22

runtime that would produce these

56:25

different results and you have to go

56:27

discover that roof line well that's just

56:29

one model and this is just one

56:32

configuration of a computer imagine all

56:34

of the models being created around the

56:35

world and all the different different um

56:38

uh configurations of of uh systems that

56:40

are going to be

56:43

available so now that you understand the

56:46

basics let's take a look at inference of

56:50

Blackwell compared

56:52

to Hopper and this is this is the

56:55

extraordinary thing in one generation

56:58

because we created a system that's

57:01

designed for trillion parameter gener

57:03

generative AI the inference capability

57:06

of Blackwell is off the

57:08

charts and in fact it is some 30 times

57:12

Hopper

57:18

Yeah — for large language models, for large

57:21

language models like ChatGPT and others

57:24

like it the blue line is Hopper I gave

57:28

you imagine we didn't change the

57:30

architecture of Hopper we just made it a

57:32

bigger

57:33

chip we just used the latest you know

57:36

greatest uh 10 terab you know terabytes

57:40

per second we connected the two chips

57:42

together we got this giant 208 billion

57:44

transistor chip. How would we have

57:46

performed if nothing else changed and it

57:48

turns out quite

57:50

wonderfully quite wonderfully and that's

57:52

the purple line but not as great as it

57:55

could be and and that's where the fp4

57:58

tensor core the new Transformer engine

58:01

and, very importantly, the NVLink Switch come in.

58:04

and the reason for that is because all

58:06

these gpus have to share the results

58:08

partial products whenever they do all-to-

58:10

all, all-gather, whenever they

58:12

communicate with each

58:14

other. That NVLink Switch is

58:17

communicating almost 10 times faster

58:20

than what we could do in the past using

58:22

the fastest

58:23

networks Okay so Blackwell is going to

58:27

be just an amazing system for a

58:30

generative Ai and in the

58:33

future in the future data centers are

58:36

going to be thought of as I mentioned

58:38

earlier as an AI Factory an AI Factory's

58:42

goal in life is to generate revenues

58:46

generate in this

58:48

case

58:50

intelligence in this facility not

58:53

generating electricity as in AC

58:55

generator

58:57

but of the last Industrial Revolution

58:59

and this Industrial Revolution the

59:00

generation of intelligence and so this

59:03

ability is super super important the

59:06

excitement of Blackwell is really off

59:08

the charts you know when we first when

59:10

we first um uh you know this this is a

59:14

year and a half ago two years ago I

59:16

guess two years ago when we first

59:17

started to to go to market with hopper

59:20

you know we had the benefit of of uh two

59:22

two uh two csps uh joined us in a lunch

59:26

and and we were you know delighted um

59:28

and so we had two

59:31

customers uh we have more

59:46

now unbelievable excitement for

59:48

Blackwell unbelievable excitement and

59:51

there's a whole bunch of different

59:52

configurations of course I showed you

59:54

the configurations that slide into the

59:56

hopper form factor so that's easy to

59:58

upgrade I showed you examples that are

60:01

liquid cooled that are the extreme

60:03

versions of it one entire rack that's

60:05

that's connected by NVLink 72.

60:08

we're going to Blackwell is going to be

60:12

ramping to the world's AI companies of

60:16

which there are so many now doing

60:18

amazing work in different modalities the

60:21

csps every CSP is geared up all the OEM

60:26

and

60:27

ODMs, regional clouds, sovereign AIs, and

60:32

telcos all over the world are signing up

60:34

to launch with Blackwell

60:43

Blackwell would be the

60:46

the the most successful product launch

60:48

in our history and so I can't wait wait

60:51

to see that um I want to thank I want to

60:53

thank some partners that that are

60:54

joining us in this uh AWS is gearing up

60:57

for Blackwell they're uh they're going

60:59

to build the first uh GPU with secure AI

61:02

They're building out a 222-exaflops

61:06

system you know just now when we

61:08

animated uh just now the digital twin if

61:10

you saw the the all of those clusters

61:12

are coming down by the way that is not

61:16

just art that is a digital twin of what

61:18

we're building that's how big it's going

61:20

to be besides infrastructure we're doing

61:22

a lot of things together with AWS we're

61:24

CUDA accelerating SageMaker AI, we're

61:27

CUDA accelerating Bedrock AI. Amazon

61:30

robotics is working with us uh using

61:32

Nvidia Omniverse and Isaac Sim AWS

61:35

Health has Nvidia Health Integrated into

61:38

it so AWS has has really leaned into

61:42

accelerated Computing uh Google is

61:44

gearing up for Blackwell gcp already has

61:47

A100s, H100s, T4s, L4s — a whole fleet of

61:51

Nvidia Cuda gpus and they recently

61:53

announced the Gemma model that runs

61:55

across all of it uh we're work working

61:58

to optimize uh and accelerate every

62:01

aspect of GCP. We're accelerating Data-

62:03

proc, their

62:05

data processing engine; JAX, XLA, Vertex

62:08

AI, and MuJoCo for robotics. So we're

62:11

working with uh Google and gcp across a

62:14

whole bunch of initiatives uh Oracle is

62:16

gearing up for Blackwell. Oracle is a

62:18

great partner of ours for Nvidia dgx

62:20

cloud and we're also working together to

62:22

accelerate something that's really

62:24

important to a lot of companies Oracle

62:27

database Microsoft is accelerating and

62:30

Microsoft is gearing up for Blackwell

62:32

Microsoft Nvidia has a wide- ranging

62:34

partnership we're accelerating Cuda

62:36

accelerating all kinds of services when

62:38

you when you chat obviously and uh AI

62:41

services that are in Microsoft Azure uh

62:43

it's very very likely Nvidia is in the

62:45

back uh doing the inference and the

62:46

token generation uh we built they built

62:49

the largest Nvidia InfiniBand

62:51

supercomputer basically a digital twin

62:53

of ours, or a physical twin of ours.

62:56

we're bringing the Nvidia ecosystem to

62:58

Azure Nvidia djx cloud to Azure uh

63:01

Nvidia Omniverse is now hosted in Azure

63:03

Nvidia Healthcare is an Azure and all of

63:06

it is deeply integrated and deeply

63:08

connected with Microsoft fabric the

63:11

whole industry is gearing up for

63:13

Blackwell this is what I'm about to show

63:16

you most of the most of the the the uh

63:19

uh uh scenes that you've seen so far of

63:21

Blackwell are the are the full Fidelity

63:25

design of Blackwell everything in our

63:28

company has a digital twin and in fact

63:31

this digital twin idea is it is really

63:34

spreading and it it helps it helps

63:36

companies build very complicated things

63:39

perfectly the first time and what could

63:41

be more exciting

63:43

than creating a digital twin to build a

63:47

computer that was built in a digital

63:49

twin. And so let me show you what Wistron

63:51

is

63:54

doing to meet the demand for NVIDIA

63:57

accelerated computing, Wistron, one of our

63:59

leading manufacturing Partners is

64:01

building digital twins of Nvidia dgx and

64:04

hgx factories using custom software

64:07

developed with Omniverse sdks and

64:10

APIs. For their newest factory, Wistron

64:13

started with a digital twin to virtually

64:15

integrate their multi-CAD and process

64:17

simulation data into a unified view

64:20

testing and optimizing layouts in this

64:22

physically accurate digital environment

64:24

increased worker efficiency by

64:27

51% during construction the Omniverse

64:30

digital twin was used to verify that the

64:32

physical build matched the digital plans

64:35

identifying any discrepancies early has

64:37

helped avoid costly change orders and

64:40

the results have been impressive using a

64:42

digital twin helped bring Wistron's factory

64:44

online in half the time just 2 and 1/2

64:47

months instead of five in operation the

64:50

Omniverse digital twin helps Wistron

64:52

rapidly Test new layouts to accommodate

64:54

new processes or improve operations in

64:57

the existing space and monitor real-time

65:00

operations using live iot data from

65:02

every machine on the production

65:04

line, which ultimately enabled Wistron to

65:07

reduce End to-end Cycle Times by 50% and

65:10

defect rates by

65:12

40% with Nvidia Ai and Omniverse

65:15

nvidia's Global ecosystem of partners

65:17

are building a new era of accelerated AI

65:20

enabled

65:23

[Music]

65:24

digitalization

65:28

[Applause]

65:31

that's how we that's the way it's going

65:34

to be in the future we're going to

65:35

manufacture everything digitally first,

65:37

and then we'll manufacture it physically

65:39

people ask me how did it

65:41

start what got you guys so

65:44

excited what was it that you

65:47

saw that caused you to put it all

65:52

in on this incredible idea and it's

66:00

this hang on a

66:07

second guys that was going to be such a

66:12

moment that's what happens when you

66:14

don't

66:19

rehearse. This, as you know, was first

66:24

contact: 2012 —

66:26

AlexNet. You put a cat into this computer

66:31

and it comes out and it says

66:35

cat and we said oh my God this is going

66:39

to change

66:42

everything you take 1 million numbers

66:45

you take one Million numbers across

66:48

three channels

66:49

RGB these numbers make no sense to

66:52

anybody you put it into this software

66:56

and it compress it dimensionally reduce

66:59

it it reduces it from a million

67:01

dimensions a million Dimensions it turns

67:04

it into three letters one vector one

67:09

number and it's

67:11

generalized you could have the cat be

67:15

different

67:17

cats and and you could have it be the

67:19

front of the cat and the back of the cat

67:22

and you look at this thing you say

67:24

unbelievable you mean any

67:27

cats yeah any

67:30

cat and it was able to recognize all

67:33

these cats and we realized how it did it

67:37

systematically structurally it's

67:41

scalable how big can you make it well

67:44

how big do you want to make it and so we

67:47

imagine that this is a completely new

67:49

way of writing

67:51

software and now today as you know you

67:54

could have — you type in the word c-a-t, and

67:58

what comes out is a

68:00

cat it went the other

68:03

way am I right

68:07

unbelievable how is it possible that's

68:10

right how is it possible you took three

68:13

letters and you generated a million

68:16

pixels from it and it made

68:18

sense well that's the miracle and here

68:21

we are just literally 10 years later 10

68:24

years later

68:26

where we recognize text, we recognize

68:28

images we recognize videos and sounds

68:31

and images not only do we recognize them

68:34

we understand their meaning we

68:37

understand the meaning of the text

68:38

that's the reason why it can chat with

68:39

you it can summarize for you it

68:42

understands the text it understood not

68:44

just recognizes the the English it

68:46

understood the English it doesn't just

68:48

recognize the pixels and understood the

68:51

pixels and you can you can even

68:53

condition it between two modalities you

68:55

can have language condition image and

68:57

generate all kinds of interesting things

69:00

well if you can understand these things

69:02

what else can you understand that you've

69:05

digitized the reason why we started with

69:07

text and you know images is because we

69:09

digitized those but what else have we

69:11

digitized well it turns out we digitized

69:13

a lot of things proteins and genes and

69:17

brain

69:18

waves anything you can digitize so long

69:21

as there's structure we can probably

69:23

learn some patterns from it and if we

69:24

can learn the patterns from it we can

69:26

understand its meaning if we can

69:28

understand its meaning we might be able

69:30

to generate it as well and so therefore

69:32

the generative AI Revolution is here

69:36

well what else can we generate what else

69:37

can we learn well one of the things that

69:39

we would love to learn we would love to

69:42

learn is we would love to learn climate

69:47

we would love to learn extreme weather

69:49

we would love to learn how we

69:52

can

69:54

predict future weather at Regional

69:57

scales at sufficiently high resolution

70:01

such that we can keep people out of

70:02

Harm's Way before harm comes extreme

70:05

weather cost the world $150 billion

70:08

surely more than that and it's not

70:10

evenly distributed $150 billion is

70:13

concentrated in some parts of the world

70:15

and of course to some people of the

70:16

world we need to adapt and we need to

70:19

know what's coming and so we are

70:20

creating Earth-2, a digital twin of the

70:23

Earth for predicting weather we and

70:26

we've made an extraordinary invention

70:29

called CorrDiff, the ability to use generative

70:32

AI to predict weather at extremely high

70:35

resolution let's take a

70:38

look as the earth's climate changes AI

70:41

powered weather forecasting is allowing

70:43

us to more accurately predict and track

70:45

severe storms like Super Typhoon Chanthu,

70:48

which caused widespread damage in Taiwan

70:50

and the surrounding region in 2021

70:53

current AI forecast models can

70:55

accurately predict the track of storms

70:57

but they are limited to 25 km resolution

71:00

which can miss important details. Nvidia

71:03

CorrDiff is a revolutionary new generative

71:06

AI model trained on high resolution

71:08

radar-assimilated WRF weather forecasts

71:10

and ERA5 reanalysis data. Using CorrDiff,

71:14

extreme events like Chanthu can be super-

71:17

resolved from 25 km to 2 km resolution

71:20

with 1,000 times the speed and 3,000

71:22

times the Energy Efficiency of

71:24

conventional weather models by combining

71:27

the speed and accuracy of nvidia's

71:29

weather forecasting model, FourCastNet,

71:31

and generative AI models like CorrDiff, we

71:34

can explore hundreds or even thousands

71:36

of kilometer scale Regional weather

71:38

forecasts to provide a clear picture of

71:40

the best worst and most likely impacts

71:42

of a storm this wealth of information

71:45

can help minimize loss of life and

71:47

property damage. Today CorrDiff is optimized

71:50

for Taiwan but soon generative super

71:53

sampling will be available as part of

71:54

the Nvidia Earth-2 inference service

71:57

for many regions across the

72:02

[Music]

72:09

globe. The Weather Company is the trusted

72:12

source of global weather predictions.

72:14

we are working together to accelerate

72:16

their weather simulation — first-

72:18

principles-based simulation. However,

72:21

they're also going to integrate Earth-2

72:23

CorrDiff so that they could help businesses

72:25

and countries do Regional high

72:28

resolution weather prediction and so if

72:31

you have some weather prediction you'd

72:32

like to know like to do uh reach out to

72:34

the weather company really exciting

72:36

really exciting work Nvidia Healthcare

72:39

something we started 15 years ago we're

72:41

super super excited about this this is

72:43

an area where we're very very proud

72:46

whether it's medical imaging or gene

72:47

sequencing or computational

72:50

chemistry it is very likely that Nvidia

72:53

is the computation behind it

72:55

we've done so much work in this

72:57

area today we're announcing that we're

73:00

going to do something really really cool

73:03

imagine all of these AI models that are

73:06

being

73:07

used to

73:10

generate images and audio but instead of

73:12

images and audio because it understood

73:15

images and audio all the digitization

73:17

that we've done for genes and proteins

73:20

and amino acids that digitalization

73:23

capability is now now passed through

73:26

machine learning so that we understand

73:28

the language of

73:30

Life the ability to understand the

73:32

language of Life of course we saw the

73:34

first evidence of

73:35

it with AlphaFold. This is really quite

73:38

an extraordinary thing after Decades of

73:40

painstaking work the world had only

73:44

digitized and reconstructed, using cryo-

73:47

electron microscopy or X-ray

73:51

crystallography um these different

73:53

techniques painstakingly reconstructed the

73:56

proteins — 200,000 of them. In just, what is

73:59

it, less than a year or so, AlphaFold has

74:04

reconstructed 200 million proteins

74:06

basically every protein every of every

74:09

living thing that's ever been sequenced

74:11

this is completely revolutionary well

74:14

those models are incredibly hard to use

74:16

and incredibly hard for people to

74:18

build and so what we're going to do is

74:20

we're going to build them we're going to

74:21

build them for uh the the researchers

74:24

around the world and it won't be the

74:26

only one there'll be many other models

74:27

that we create and so let me show you

74:29

what we're going to do with

74:34

it virtual screening for new medicines

74:37

is a computationally intractable problem

74:40

existing techniques can only scan

74:42

billions of compounds and require days

74:44

on thousands of standard compute nodes

74:47

to identify new drug

74:48

candidates. Nvidia BioNeMo NIMs enable

74:52

a new generative screening Paradigm

74:54

using Nims for protein structure

74:56

prediction with AlphaFold, molecule

74:58

generation with MolMIM, and docking with

75:01

DiffDock, we can now generate and screen

75:04

candidate molecules in a matter of

75:05

minutes. MolMIM can connect to custom

75:08

applications to steer the generative

75:10

process iteratively optimizing for

75:12

desired properties these applications

75:15

can be defined with BioNeMo

75:17

microservices or built from scratch here

75:20

a physics based simulation optimizes for

75:23

a molecule's ability to bind to a Target

75:25

protein while optimizing for other

75:27

favorable molecular properties in

75:29

parallel, MolMIM generates high-quality

75:32

drug-like molecules that bind to the

75:34

Target and are synthesizable translating

75:37

to a higher probability of developing

75:39

successful medicines

75:41

faster. BioNeMo is enabling a new

75:44

paradigm in drug Discovery with Nims

75:46

providing OnDemand microservices that

75:48

can be combined to build powerful drug

75:51

discovery workflows like de novo protein

75:53

design or guided molecule generation for

75:56

virtual screening. BioNeMo NIMs are helping

76:00

researchers and developers reinvent

76:02

computational drug

76:09

design. Nvidia MolMIM, CorrDiff —

76:13

there's a whole bunch of other models

76:15

whole bunch of other models computer

76:17

vision models robotics models and even

76:21

of

76:22

course some really really terrific open

76:24

source

76:25

language models these models are

76:29

groundbreaking however it's hard for

76:31

companies to use how would you use it

76:33

how would you bring it into your company

76:34

and integrate it into your workflow how

76:36

would you package it up and run it

76:38

remember earlier I just

76:40

said that inference is an extraordinary

76:43

computation problem how would you do the

76:46

optimization for each and every one of

76:48

these models and put together the

76:50

Computing stack necessary to run that

76:52

supercomputer so that you can run the

76:55

models in your company and so we have a

76:58

great idea we're going to invent a new

77:00

way invent a new way for you to receive

77:05

and operate

77:07

software this software comes basically

77:11

in a digital box we call it a container

77:14

and we call it the Nvidia inference micr

77:17

service a Nim and let me explain to you

77:21

what it is a Nim it's a pre-trained

77:24

model so it's pretty

77:25

clever and it is packaged and optimized

77:29

to run across nvidia's install base

77:32

which is very very large what's inside

77:34

it is incredible you have all these

77:37

pre-trained state-ofthe-art open source

77:39

models they could be open source they

77:41

could be from one of our partners it

77:43

could be created by us like Nvidia mull

77:46

it is packaged up with all of its

77:48

dependencies so Cuda the right version

77:50

CNN the right version tensor RT llm

77:54

Distributing across the multiple gpus

77:56

Tred and inference server all completely

77:59

packaged together it's optimized

78:02

depending on whether you have a single

78:04

GPU multi- GPU or multi node of gpus

78:06

it's optimized for that and it's

78:08

connected up with apis that are simple

78:10

to use now this think about what an AI

78:13

API is an AI API is an interface that

78:18

you just talk to and so this is a piece

78:21

of software in the future that has a

78:23

really simple API and that API called

78:25

human and these packages incredible

78:29

bodies of software will be optimized and

78:32

packaged and we'll put it on a

78:34

website and you can download it you

78:37

could take it with you you could run it

78:39

in any Cloud you can run it in your own

78:41

data center you can run in workstations

78:43

if it fit and all you have to do is come

78:45

to ai. nvidia.com we call it Nvidia

78:49

inference microservice but inside the

78:51

company we all call it

78:53

NIMs. Okay.
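
For a sense of what "an API you just talk to" looks like in practice, here is a minimal sketch of calling a NIM's chat endpoint over HTTP (the host, port, and model id are hypothetical; NIMs expose an OpenAI-style API, so the request shape looks roughly like this):

```python
# Minimal sketch of talking to a locally running NIM container (assumed
# endpoint and model id; adjust to wherever the container is serving).
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "meta/llama2-70b",   # hypothetical model id
        "messages": [{"role": "user", "content": "What is a CTL?"}],
        "max_tokens": 256,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```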

79:02

just imagine you know one of some

79:04

someday there there's going to be one of

79:06

these chat Bots and these chat Bots is

79:08

going to just be in a Nim and you you'll

79:12

uh you'll assemble a whole bunch of chat

79:13

Bots and that's the way software is

79:15

going to be be built someday how do we

79:18

build software in the future it is

79:20

unlikely that you'll write it from

79:22

scratch or write a whole bunch of python

79:23

code or anything like that it is very

79:26

likely that you assemble a team of AIs.

79:29

there's probably going to be a super AI

79:32

that you use that takes the mission that

79:34

you give it and breaks it down into an

79:37

execution plan some of that execution

79:39

plan could be handed off to another Nim

79:42

that Nim would maybe uh understand

79:46

SAP — the language of SAP is ABAP. It might

79:50

understand ServiceNow, and it'd go

79:52

retrieve some information from their

79:53

platforms

79:55

it might then hand that result to

79:56

another NIM that goes off and does

79:59

some calculation on it maybe it's an

80:01

optimization software a

80:03

combinatorial optimization algorithm

80:06

maybe it's uh you know some just some

80:08

basic

80:09

calculator. Maybe it's pandas, to do some

80:13

numerical analysis on it and then it

80:15

comes back with its

80:17

answer and it gets combined with

80:19

everybody else's and it because it's

80:21

been presented with this is what the

80:23

right answer should look like it knows

80:25

what answer what an what right answers

80:27

to produce and it presents it to you we

80:30

can get a report every single day at you

80:32

know top of the hour uh that has

80:34

something to do with a build plan, or some

80:36

forecast or uh some customer alert or

80:38

some bugs database or whatever it

80:40

happens to be and we could assemble it

80:42

using all these NIMs. And because these

80:44

NIMs have been packaged up and are ready to

80:48

work on your systems so long as you have

80:50

Nvidia GPUs in your data center or in the

80:51

cloud, these NIMs will work together

80:55

as a team and do amazing things.
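
A toy sketch of that "team of AIs" flow (every name and agent here is hypothetical, and the planner is hard-coded where a real system would use an LLM): a planner splits the mission into steps, each step is routed to a specialist NIM, and the results are combined into one report.

```python
# Hypothetical orchestration of specialist NIMs behind one mission.
def planner(mission: str) -> list[dict]:
    # In reality this would itself be an LLM call; hard-coded to illustrate.
    return [
        {"agent": "sap_nim", "task": "pull open purchase orders"},
        {"agent": "servicenow_nim", "task": "pull open customer tickets"},
        {"agent": "pandas_nim", "task": "summarize both tables"},
    ]

AGENTS = {  # stand-ins for calls to real NIM endpoints
    "sap_nim": lambda task: f"[SAP result for: {task}]",
    "servicenow_nim": lambda task: f"[ServiceNow result for: {task}]",
    "pandas_nim": lambda task: f"[numerical summary for: {task}]",
}

def run(mission: str) -> str:
    results = [AGENTS[step["agent"]](step["task"]) for step in planner(mission)]
    return "\n".join(results)   # a real system would hand this to a report NIM

print(run("produce the morning operations report"))
```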

80:58

And so we decided this is such a great idea,

81:00

we're going to go do that and so Nvidia

81:03

has Nims running all over the company we

81:05

have chatbots being created all over the

81:08

place and one of the mo most important

81:09

chatbots of course is a chip designer

81:12

chatbot you might not be surprised we

81:14

care a lot about building chips and so

81:17

we want to build chatbots AI

81:21

co-pilots that are co-designers with our

81:23

engineers and so this is the way we did

81:26

it. So we got ourselves a Llama 2 —

81:30

this is a 70b and it's you know packaged

81:32

up in a NIM, and we asked it, you know:

81:36

what is a

81:37

CTL? Well, turns out CTL is an internal

81:42

program and it has a internal

81:44

proprietary language but it thought the

81:46

CTL was a combinatorial timing logic and

81:48

so it describes you know conventional

81:50

knowledge of CTL but that's not very

81:52

useful to us and so we gave it a whole

81:56

bunch of new examples you know this is

81:58

no different than employee onboarding an

82:01

employee uh we say you know thanks for

82:03

that answer it's completely wrong um and

82:06

and uh and then we present to them uh

82:09

this is what a CTL is okay and so this

82:11

is what a CTL is at Nvidia and the CTL

82:15

as you can see you know CTL stands for

82:17

Compute Trace Library, which makes sense —

82:20

you know we were tracing compute Cycles

82:22

all the time and it wrote the program

82:24

isn't that

82:32

amazing and so the productivity of our

82:34

chip designers can go up this is what

82:35

you can do with a NIM. First thing you

82:37

can do with it is customize it. We have a

82:39

service called Nemo microservice that

82:41

helps you curate the data preparing the

82:44

data so that you could teach and on-

82:46

board this AI. You fine-tune it, and

82:49

then you guardrail it you can even

82:51

evaluate the answer evaluate its

82:53

performance against um other other

82:55

examples and so that's called the Nemo

82:58

microservice.
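
Here is a toy sketch of those stages — curate, fine-tune, guardrail, evaluate — using plain-Python stand-ins (the real Nemo microservices do this with LLMs and services, not dictionaries; every function name below is invented for illustration):

```python
# Toy stand-ins for the onboard-an-AI pipeline described above.
def curate(raw_docs):
    """Data preparation: keep only usable examples."""
    return [d.strip() for d in raw_docs if d.strip()]

def fine_tune(model, facts):
    """Onboarding: teach the model company-specific meanings."""
    model["knows"].update(facts)

def guardrail(model, banned_topics):
    """Constrain what the deployed model may discuss."""
    model["banned"].update(banned_topics)

def answer(model, question):
    if question in model["banned"]:
        return "I can't discuss that."
    return model["knows"].get(question, "I don't know.")

def evaluate(model, qa_pairs):
    """Score the onboarded model against held-out examples."""
    return sum(answer(model, q) == a for q, a in qa_pairs) / len(qa_pairs)

model = {"knows": {}, "banned": set()}
curate(["  CTL = Compute Trace Library  "])          # 1. prepare the data
fine_tune(model, {"CTL": "Compute Trace Library"})   # 2. fine-tune
guardrail(model, {"unreleased roadmap"})             # 3. guardrail
print("accuracy:", evaluate(model, [("CTL", "Compute Trace Library")]))
```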

83:00

Now the thing that's emerging here is this: there are three

83:02

elements three pillars of what we're

83:03

doing the first pillar is of course

83:06

inventing the technology for um uh AI

83:09

models and running AI models and

83:11

packaging it up for you the second is to

83:13

create tools to help you modify it first

83:16

is having the AI technology second is to

83:19

help you modify it and third is

83:20

infrastructure for you to fine-tune it

83:23

and if you like deploy it you could

83:24

deploy it on our infrastructure called

83:26

DGX Cloud, or you can deploy it on-

83:29

prem; you can deploy it anywhere you like.

83:31

once you develop it it's yours to take

83:33

anywhere and so we are

83:36

effectively an AI Foundry we will do for

83:40

you and the industry, on AI, what TSMC

83:43

does for us building chips. And so we go

83:45

to TSMC with our big

83:48

ideas; they manufacture, and we take it

83:50

with us and so exactly the same thing

83:52

here: AI Foundry. And the three pillars

83:54

are the NIMs, Nemo microservice, and DGX

83:58

Cloud the other thing that you could

84:00

teach the NIM to do is to understand

84:02

your proprietary information remember

84:05

inside our company the vast majority of

84:07

our data is not in the cloud it's inside

84:09

our company it's been sitting there you

84:11

know being used all the time and and

84:14

gosh, it's basically Nvidia's

84:17

intelligence we would like to take that

84:20

data learn its meaning like we learned

84:23

the meaning of almost anything else that

84:24

we just talked about learn its meaning

84:27

and then reindex that knowledge into a

84:30

new type of database called a vector

84:32

database and so you essentially take

84:35

structured data or unstructured data you

84:37

learn its meaning you encode its meaning

84:39

so now this becomes an AI database and

84:43

that AI database in the future once you

84:45

create it you can talk to it and so let

84:47

me give you an example of what you could

84:49

do so suppose you create you get you got

84:51

a whole bunch of multi modality data and

84:53

one good example of that is PDF so you

84:56

take the PDF you take all of your PDFs

84:59

all the all your favorite you know the

85:01

stuff that that is proprietary to you

85:03

critical to your company you can encode

85:05

it just as we encoded pixels of a cat

85:09

and it becomes the word cat we can

85:11

encode all of your PDF and it turns

85:14

into vectors that are now stored inside

85:16

your vector database it becomes the

85:18

proprietary information of your company

85:20

and once you have that proprietary

85:21

information, you can chat with it. It's

85:24

a smart database, and so you just

85:27

chat with data.
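
A toy sketch of that encode-then-retrieve idea (the embed() below is a random stand-in, so the retrieved match is only illustrative of the mechanics; a real system would use a learned embedding model and a proper vector database):

```python
# Toy "vector database": encode texts, store vectors, retrieve by similarity.
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Stand-in encoder: deterministic pseudo-random unit vector per text."""
    rng = np.random.default_rng(abs(hash(text)) % (2 ** 32))
    vec = rng.standard_normal(dim)
    return vec / np.linalg.norm(vec)

docs = ["rack cooling spec PDF", "supply agreement PDF", "onboarding guide PDF"]
index = np.stack([embed(d) for d in docs])   # the "vector database"

query_vec = embed("how is the rack cooled?")
scores = index @ query_vec                   # cosine similarity (unit vectors)
print("retrieved:", docs[int(np.argmax(scores))])
# With a real encoder, the nearest vector is the most relevant document.
```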

85:29

And how much more enjoyable is that? You know, for

85:33

our software team you know they just

85:35

chat with the bugs database you know how

85:38

many bugs was there last night um are we

85:40

making any progress and then after

85:42

you're done talking to this uh bugs

85:45

database you need therapy and so so we

85:49

have another chatbot for

85:53

you

85:55

you can do

86:05

it okay so we call this Nemo Retriever

86:08

and the reason for that is because

86:09

ultimately its job is to go retrieve

86:11

information as quickly as possible and

86:13

you just talk to it hey retrieve me this

86:15

information it goes if brings it back to

86:18

you and do you mean this you go yeah

86:20

perfect okay and so we call it the Nemo

86:22

retriever well the Nemo service helps

86:24

you create all these things and we have

86:26

all all these different Nims we even

86:27

have NIMs of digital humans. "I'm Rachel,

86:31

your AI care

86:33

manager." Okay, so it's a really short

86:36

clip but there were so many videos to

86:39

show you I guess so many other demos to

86:41

show you and so I I had to cut this one

86:43

short but this is Diana she is a digital

86:46

human Nim and and uh you just talked to

86:50

her and she's connected in this case to

86:52

Hippocratic AI's large language model

86:54

for healthcare and it's truly

86:58

amazing she is just super smart about

87:01

Healthcare things you know and so after

87:04

you're done after my my Dwight my VP of

87:07

software engineering talks to the

87:08

chatbot for bugs database then you come

87:11

over here and talk to Diana. And so

87:13

Diana is completely animated

87:17

with AI and she's a digital

87:19

human uh there's so many companies that

87:21

would like to build they're sitting on

87:23

gold mines

87:25

the the Enterprise IT industry is

87:27

sitting on a gold mine it's a gold mine

87:29

because they have so much understanding

87:31

of of uh the way work is done they have

87:34

all these amazing tools that have been

87:36

created over the years and they're

87:37

sitting on a lot of data if they could

87:40

take that gold mine and turn them into

87:43

co-pilots these co-pilots could help us

87:45

do things and so just about every it

87:49

franchise it platform in the world that

87:51

has valuable tools that people use is

87:53

sitting on a gold mine for co-pilots and

87:56

they would like to build their own

87:57

co-pilots and their own chatbots and so

88:00

we're announcing that Nvidia AI Foundry

88:02

is working with some of the world's

88:03

great companies sap generates 87% of the

88:06

world's Global Commerce basically the

88:09

world runs on sap we run on sap Nvidia

88:11

and SAP are building SAP Joule copilots

88:15

uh using Nvidia Nemo and dgx cloud

88:18

ServiceNow — they run 80, 85% of the

88:20

world's Fortune 500 companies run their

88:23

people and customer service operations

88:25

on service now and they're using Nvidia

88:28

AI Foundry to build ServiceNow

88:31

assist virtual

88:33

assistants. Cohesity backs up the world's

88:36

data they're sitting on a gold mine of

88:38

data — hundreds of exabytes of data, over

88:41

10,000 companies Nvidia AI Foundry is

88:44

working with them helping them build

88:46

their Gaia generative AI agent snowflake

88:50

is a company that stores the world's uh

88:53

digital Warehouse in the cloud and

88:55

serves over 3 billion queries a day for

89:01

10,000 Enterprise customers snowflake is

89:03

working with Nvidia AI Foundry to build

89:06

co-pilots with Nvidia Nemo and NIMs.

89:09

NetApp — nearly half of the files in the

89:12

world are stored on-prem on NetApp.

89:16

Nvidia AI Foundry is helping them uh

89:18

build chat Bots and co-pilots like those

89:21

Vector databases and retrievers with

89:23

Nvidia Nemo and

89:25

Nims and we have a great partnership

89:27

with Dell everybody who everybody who is

89:30

building these chat Bots and generative

89:33

AI when you're ready to run it you're

89:35

going to need an AI

89:37

Factory and nobody is better at Building

89:41

end-to-end Systems of very large scale

89:43

for the Enterprise than Dell and so

89:46

anybody any company every company will

89:48

need to build AI factories and it turns

89:51

out that Michael is here he's happy to

89:53

take your order

89:58

ladies and gentlemen Michael

90:04

Dell. Okay, let's talk about the next wave

90:07

of Robotics the next wave of AI robotics

90:09

physical

90:11

AI so far all of the AI that we've

90:14

talked about is one

90:16

computer data comes into one computer

90:18

lots of the world's if you will

90:21

experience in digital text form the AI

90:25

imitates Us by reading a lot of the

90:28

language to predict the next words it's

90:30

imitating You by studying all of the

90:32

patterns and all the other previous

90:34

examples of course it has to understand

90:36

context and so on so forth but once it

90:38

understands the context it's essentially

90:39

imitating you we take all of the data we

90:42

put it into a system like dgx we

90:45

compress it into a large language model

90:47

trillions and trillions of

90:49

tokens become billions of

90:51

parameters, and

90:53

these billions of parameters become

90:54

your AI. Well, in order for us to go to

90:58

the next wave of AI where the AI

91:00

understands the physical world we're

91:02

going to need three

91:03

computers the first computer is still

91:06

the same computer it's that AI computer

91:08

that now is going to be watching video

91:10

and maybe it's doing synthetic data

91:12

generation and maybe there's a lot of

91:14

human examples just as we have human

91:17

examples in text form we're going to

91:18

have human examples in articulation form

91:22

and the AIS will watch us

91:25

understand what is

91:26

happening and try to adapt it for

91:29

themselves into the

91:31

context and because it can generalize

91:33

with these Foundation models maybe these

91:36

robots can also perform in the physical

91:38

world fairly generally so I just

91:41

described in very simple terms

91:44

essentially what just happened in large

91:45

language models, except the ChatGPT

91:47

moment for robotics may be right around

91:49

the corner and so we've been building

91:52

the end to-end systems for robotics for

91:54

some time I'm super super proud of the

91:56

work we have the AI system

91:59

DGX. We have the lower system, which is

92:01

called AGX, for autonomous systems — the

92:04

world's first robotics processor when we

92:06

first built this thing people are what

92:07

are you guys building? It's an SoC — so it's

92:10

one chip it's designed to be very low

92:12

power but it's designed for high-speed

92:13

sensor processing and Ai and so if you

92:17

want to run Transformers in a car or you

92:20

want to run Transformers in a in a you

92:23

know anything

92:24

um that moves uh we have the perfect

92:26

computer for you it's called the Jetson

92:29

and so the dgx on top for training the

92:31

AI the Jetson is the autonomous

92:33

processor and in the middle we need

92:35

another computer whereas large language

92:39

models have the

92:40

benefit of you providing your examples

92:43

and then doing reinforcement learning

92:45

human

92:47

feedback what is the reinforcement

92:49

learning human feedback of a robot well

92:52

it's reinforcement learning

92:54

physical feedback that's how you align

92:56

the robot that's how you that's how the

92:59

robot knows that as it's learning these

93:01

articulation capabilities and

93:02

manipulation capabilities it's going to

93:04

adapt properly into the laws of physics

93:08

and so we need a simulation

93:11

engine that represents the world

93:13

digitally for the robot so that the

93:15

robot has a gym to go learn how to be a

93:18

robot we call

93:19

that virtual world Omniverse and the

93:23

computer that runs Omniverse is called

93:25

OVX. And OVX, the computer itself, is

93:29

hosted in the Azure cloud.
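
A toy version of that "gym with physical feedback" loop (pure-Python stand-in dynamics; frameworks like Isaac Lab provide the real physics, observations, and reward terms — everything below is invented for illustration): the robot's reward is feedback from simulated physics, and we keep whichever policy earns more of it.

```python
# Toy learn-in-simulation loop: reward comes from simulated physics.
import random

def rollout(gain: float, steps: int = 50) -> float:
    """1-D 'robot' whose next state is (1 + gain) * state, plus noise."""
    state, total = 1.0, 0.0
    for _ in range(steps):
        state += gain * state + random.gauss(0, 0.01)  # toy dynamics
        total -= abs(state)                            # physical feedback
    return total

best_gain, best_reward = 0.0, rollout(0.0)
for _ in range(200):                     # crude random search in the "gym"
    cand = best_gain + random.gauss(0, 0.1)
    if (r := rollout(cand)) > best_reward:
        best_gain, best_reward = cand, r
print(f"learned gain {best_gain:.2f} (a gain near -1 keeps the state near 0)")
```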

93:32

Okay, and so basically we built these three things —

93:34

these three systems. On top of it, we have

93:36

algorithms for every single one now I'm

93:39

going to show you one super example of

93:42

how Ai and Omniverse are going to work

93:45

together the example I'm going to show

93:46

you is kind of insane but it's going to

93:49

be very very close to tomorrow it's a

93:51

robotics building this robotics building

93:54

is called a warehouse inside the

93:56

robotics building are going to be some

93:58

autonomous systems some of the

94:00

autonomous systems are going to be

94:01

called humans and some of the autonomous

94:04

systems are going to be called forklifts

94:06

and these autonomous systems are going

94:08

to interact with each other of course

94:10

autonomously and it's going to be

94:12

watched over by this warehouse to

94:14

keep everybody out of Harm's Way the

94:16

warehouse is essentially an air traffic

94:18

controller and whenever it sees

94:21

something happening it will redirect

94:23

traffic and give new waypoints —

94:26

just new waypoints — to the robots and

94:28

the people and they'll know exactly what

94:29

to do this warehouse this building you

94:33

can also talk to of course you could

94:35

talk to it hey you know sap Center how

94:38

are you feeling today for example and so

94:41

you could ask the same the warehouse the

94:43

same questions basically the system I

94:46

just described will have Omniverse Cloud

94:49

that's hosting the virtual simulation

94:52

and AI running on DGX Cloud, and all of

94:56

this is running in real time let's take

94:57

a

94:59

look. The future of heavy industries starts

95:02

as a digital twin the AI agents helping

95:05

robots workers and infrastructure

95:07

navigate unpredictable events in complex

95:10

industrial spaces will be built and

95:12

evaluated first in sophisticated digital

95:15

twins this Omniverse digital twin of a

95:18

100,000-square-foot warehouse is operating as a

95:22

simulation environment that integrates

95:24

digital workers, AMRs running the Nvidia

95:27

Isaac Perceptor stack, centralized

95:29

activity maps of the entire Warehouse

95:31

from 100 simulated ceiling mount cameras

95:34

using Nvidia metropolis and AMR route

95:37

planning with Nvidia cuOpt. Software-in-

95:40

the-loop testing of AI agents in this

95:42

physically accurate simulated

95:44

environment enables us to evaluate and

95:47

refine how the system adapts to real

95:49

world

95:51

unpredictability here an incident occurs

95:53

along this amr's planned route blocking

95:56

its path as it moves to pick up a pallet

95:59

Nvidia Metropolis updates and sends a

96:01

real-time occupancy map to cuOpt, where a

96:03

new optimal route is calculated the AMR

96:06

is enabled to see around corners and

96:08

improve its Mission efficiency with

96:11

generative AI powered Metropolis Vision

96:13

Foundation models operators can even ask

96:16

questions using natural language the

96:18

visual model understands nuanced

96:21

activity and can offer immediate

96:22

insights to improve operations all of

96:25

the sensor data is created in simulation

96:27

and passed to the real-time AI running

96:30

as Nvidia inference microservices or

96:32

Nims and when the AI is ready to be

96:35

deployed in the physical twin the real

96:37

Warehouse we connect metropolis and

96:39

Isaac Nims to real sensors with the

96:42

ability for continuous Improvement of

96:44

both the digital twin and the AI

96:49

models. Isn't that incredible?
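
The replanning loop in that demo can be sketched in a few lines (a toy grid stands in for the occupancy map, and breadth-first search stands in for the solver; cuOpt itself handles far richer routing problems): when the map updates with an incident, the route is simply recomputed.

```python
# Toy occupancy-map replanning: recompute the route when an aisle is blocked.
from collections import deque

def shortest_route(grid, start, goal):
    """Breadth-first search over free (0) cells; returns a path or None."""
    rows, cols = len(grid), len(grid[0])
    prev, frontier = {start: None}, deque([start])
    while frontier:
        cell = frontier.popleft()
        if cell == goal:                 # walk the parent links back to start
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 \
                    and nxt not in prev:
                prev[nxt] = cell
                frontier.append(nxt)
    return None

warehouse = [[0] * 5 for _ in range(5)]      # toy occupancy map, all free
print("planned:  ", shortest_route(warehouse, (0, 0), (4, 4)))
warehouse[2] = [1, 1, 1, 1, 0]               # an incident blocks most of row 2
print("replanned:", shortest_route(warehouse, (0, 0), (4, 4)))
```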

96:52

And

96:55

so remember: a future facility,

97:00

Warehouse Factory building will be

97:03

software defined and so the software is

97:05

running how else would you test the

97:07

software so you you you test the

97:10

software to building the warehouse the

97:12

optimization system in the digital twin

97:14

what about all the robots all of those

97:15

robots you are seeing just now they're

97:17

all running their own autonomous robotic

97:19

stack and so the way you integrate

97:21

software in the future — CI/CD in the

97:23

future for robotic systems is with

97:26

digital twins we've made Omniverse a lot

97:29

easier to access we're going to create

97:31

basically Omniverse Cloud APIs — four

97:34

simple APIs and a channel — and you can

97:37

connect your application to it so this

97:38

is this is going to be as wonderfully

97:41

beautifully simple in the future that

97:44

Omniverse is going to be and with these

97:46

apis you're going to have these magical

97:48

digital twin capability we also have

97:52

turned Omniverse into an AI, and integrated

97:56

it with the ability to chat USD the the

97:59

language of our language is you know

98:01

human, and Omniverse's language, as it

98:04

turns out, is Universal Scene Description.

98:06

and so that language is rather complex

98:09

and so we've taught our Omniverse uh

98:12

that language and so you can speak to it

98:14

in English and it would directly

98:15

generate USD and it would talk back in

98:18

USD, but converse back to you in English.
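
For a flavor of what "directly generate USD" might produce, here is a small sketch using the open-source OpenUSD (pxr) Python bindings (the scene contents and prim names are made up, and this assumes the pxr package is installed):

```python
# A tiny, hypothetical USD scene of the kind such a chat could emit.
from pxr import Usd, UsdGeom

stage = Usd.Stage.CreateNew("warehouse.usda")
UsdGeom.Xform.Define(stage, "/World")
shelf = UsdGeom.Cube.Define(stage, "/World/Shelf")
shelf.GetSizeAttr().Set(2.0)                        # a 2m placeholder shelf
UsdGeom.XformCommonAPI(shelf).SetTranslate((4.0, 0.0, 1.0))
stage.GetRootLayer().Save()
print(stage.GetRootLayer().ExportToString())        # the USD text itself
```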

98:20

you could also look for information in

98:22

this world semantically instead of the

98:25

world being encoded semantically in in

98:27

language now it's encoded semantically

98:29

in scenes and so you could ask it of of

98:32

uh certain objects or certain conditions

98:34

and certain scenarios and it can go and

98:36

find that scenario for you it also can

98:39

collaborate with you in generation you

98:41

could design some things in 3D it could

98:43

simulate some things in 3D or you could

98:45

use AI to generate something in 3D let's

98:47

take a look at how this is all going to

98:49

work we have a great partnership with

98:51

Siemens. Siemens is the world's largest

98:54

industrial engineering and operations

98:56

platform you've seen now so many

98:59

different companies in the industrial

99:01

space. Heavy industry is one of the

99:03

greatest final frontiers of IT, and we

99:06

finally now have the Necessary

99:08

Technology to go and make a real impact

99:11

Siemens is building the industrial

99:13

metaverse and today we're announcing

99:14

that Siemens is connecting their crown

99:17

jewel, Xcelerator, to Nvidia Omniverse.

99:20

let's take a

99:22

look. Siemens technology transforms

99:25

the everyday for everyone. Teamcenter X,

99:28

our leading product life cycle

99:29

management software from the Siemens

99:31

Xcelerator platform, is used every day

99:34

by our customers to develop and deliver

99:36

products at scale now we are bringing

99:39

the real and the digital worlds even

99:41

Closer by integrating Nvidia Ai and

99:44

Omniverse technologies into Teamcenter

99:47

X. Omniverse APIs enable data

99:50

interoperability and physics-based

99:52

rendering to Industrial scale design and

99:55

manufacturing projects. Our customer HD

99:59

Hyundai, market leader in sustainable ship

100:00

manufacturing, builds ammonia- and

100:03

hydrogen-powered ships, often comprising

100:05

over 7 million discrete Parts with

100:08

Omniverse APIs, Teamcenter X lets

100:11

companies like HD Hyundai unify and

100:14

visualize these massive engineering data

100:17

sets interactively and integrate

100:19

generative AI to generate 3D objects or

100:22

HDRI backgrounds to see their projects

100:26

in context. The result: an ultra-intuitive,

100:29

photoreal, physics-based digital twin that

100:32

eliminates waste and errors delivering

100:35

huge savings in cost and

100:37

time and we are building this for

100:39

collaboration whether across more semens

100:41

accelerator tools like seens anex or

100:45

Star CCM Plus or across teams working on

100:49

their favorite devices in the same scene

100:51

together in this is just the beginning

100:54

working with Nvidia we will bring

100:57

accelerated Computing generative Ai and

100:59

Omniverse integration across the Sean

101:03

accelerator

101:11

portfolio the pro the the professional

101:15

The professional voice actor happens to be a good friend of mine, Roland Busch, who happens to be the CEO of Siemens.

101:29

Once you get Omniverse connected into your workflow, your ecosystem, from the beginning of your design, to engineering, to manufacturing planning, all the way to digital-twin operations, once you connect everything together, it's insane how much productivity you can get, and it's just really, really wonderful. All of a sudden, everybody is operating on the same ground truth. You don't have to exchange data and convert data and make mistakes; everybody is working on the same ground truth, from the design department to the art department, the architecture department, all the way to the engineering and even the marketing department. Let's take a look at how Nissan has integrated Omniverse into their workflow, and it's all because it's connected by all these wonderful tools and these developers that we're working with. Take a look.

102:22

[Music]

104:01

That was not an animation; that was Omniverse. Today we're announcing that Omniverse Cloud streams to the Vision Pro. And it is very, very strange: you walk around virtual doors, when I was getting out of that car, and everybody does it. It is really, really quite amazing. Vision Pro, connected to Omniverse, portals you into Omniverse, and because all of these CAD tools and all of these different design tools are now integrated and connected to Omniverse, you can have this type of workflow. Really incredible.

104:48

Let's talk about robotics. Everything that moves will be robotic; there's no question about that. It's safer, it's more convenient, and one of the largest industries is going to be automotive. We build the robotics stack from top to bottom, as I mentioned, from the computer system, but in the case of self-driving cars, including the self-driving application. At the end of this year, or I guess the beginning of next year, we will be shipping in Mercedes, and then, shortly after that, JLR. These autonomous robotic systems are software-defined. They take a lot of work to do: computer vision, obviously artificial intelligence, control and planning, all kinds of very complicated technology that takes years to refine.

We're building the entire stack; however, we open up our entire stack for all of the automotive industry. This is just the way we work, the way we work in every single industry: we try to build as much of it as we can so that we understand it, but then we open it up so everybody can access it. Whether you would like to buy just our computer, which is the world's only fully functional, functionally safe, ASIL-D system that can run AI, this functionally safe, ASIL-D quality computer; or the operating system on top; or, of course, our data centers, which are in basically every AV company in the world; however you would like to enjoy it, we're delighted. Today we're announcing that BYD, the world's largest EV company, is adopting our next generation. It's called Thor. Thor is designed for Transformer engines. Thor, our next-generation AV computer, will be used by BYD.

106:36

You probably don't know this fact: we have over a million robotics developers. We created Jetson, this robotics computer. We're so proud of it. The amount of software that goes on top of it is insane, but the reason we can do it at all is because it's 100% CUDA-compatible. Everything that we do, everything that we do in our company, is in service of our developers, and by being able to maintain this rich ecosystem and make it compatible with everything that you access from us, we can bring all of that incredible capability to this little tiny computer we call Jetson, a robotics computer.

107:12

We're also announcing today an incredibly advanced new SDK. We call it Isaac Perceptor. Most of the robots today are pre-programmed: they're either following rails on the ground, digital rails, or they'd be following AprilTags. But in the future, they're going to have perception, and the reason you want that is so that you can easily program them. You say: would you like to go from point A to point B? And it will figure out a way to navigate its way there. So by only programming waypoints, the entire route can be adaptive; the entire environment can be reprogrammed, just as I showed you at the very beginning with the warehouse. You can't do that with pre-programmed AGVs: if those boxes fall down, they all just gum up, and they just wait there for somebody to come clear them. And so now, with Isaac Perceptor, we have incredible state-of-the-art visual odometry, 3D reconstruction, and, in addition to 3D reconstruction, depth perception. The reason for that is so that you can have two modalities to keep an eye on what's happening in the world. Isaac Perceptor.
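The waypoint idea above is essentially continuous replanning over a perceived map. Here is a toy sketch: a robot given only a start and a goal routes around an obstacle the moment perception marks those grid cells as blocked, where a rail-following AGV would simply stop. The occupancy grid and breadth-first planner are generic stand-ins, not Isaac Perceptor's actual interfaces.

```python
# Toy waypoint navigation over an occupancy grid (0 = free, 1 = blocked).
# Perception updates the grid; the planner simply recomputes the route.
from collections import deque

def plan(grid, start, goal):
    """Breadth-first search from start to goal; returns a cell path or None."""
    rows, cols = len(grid), len(grid[0])
    prev = {start: None}
    frontier = deque([start])
    while frontier:
        cur = frontier.popleft()
        if cur == goal:
            path = []
            while cur is not None:        # walk parent links back to start
                path.append(cur)
                cur = prev[cur]
            return path[::-1]
        r, c = cur
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols \
               and grid[nr][nc] == 0 and nxt not in prev:
                prev[nxt] = cur
                frontier.append(nxt)
    return None                           # currently blocked; wait and replan

# A fallen box blocks the aisle; the route adapts around it.
grid = [[0] * 5 for _ in range(5)]
grid[2][1] = grid[2][2] = grid[2][3] = 1
print(plan(grid, (0, 0), (4, 4)))
```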

108:22

The most used robot today is the manipulator, manufacturing arms, and they are also pre-programmed. The computer vision algorithms, the AI algorithms, the control and path-planning algorithms that are geometry-aware: incredibly computationally intensive. We have made these CUDA-accelerated, so we have the world's first CUDA-accelerated motion planner that is geometry-aware. You put something in front of it, it comes up with a new plan and articulates around it. It has excellent perception for pose estimation of a 3D object: not just its pose in 2D, but its pose in 3D, so it has to imagine what's around it and how best to grab it. So the foundation pose, the grasp foundation, and the articulation algorithms are now available. We call it Isaac Manipulator, and they also just run on NVIDIA's computers.
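To ground the "pose in 2D versus pose in 3D" distinction: a 6-DoF pose is a rotation plus a translation, and a grasp planned in the object's own frame must be composed with the estimated object pose to get a world-frame gripper target. The sketch below is plain NumPy under simplified assumptions; the actual foundation-pose and grasp models are far more involved.

```python
# 6-DoF pose composition: object pose in the world frame, grasp pose in the
# object frame, composed into a world-frame gripper target.
import numpy as np

def pose_matrix(rotation, translation):
    """Assemble a 4x4 homogeneous transform from R (3x3) and t (3,)."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

# Estimated object pose: rotated 90 degrees about z, 40 cm ahead, 10 cm up.
Rz = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)
T_world_obj = pose_matrix(Rz, np.array([0.4, 0.0, 0.1]))

# A grasp defined in the object's own frame: approach 10 cm above its top.
T_obj_grasp = pose_matrix(np.eye(3), np.array([0.0, 0.0, 0.10]))

# Compose to get the gripper target in the world frame.
T_world_grasp = T_world_obj @ T_obj_grasp
print(T_world_grasp[:3, 3])   # world-frame grasp position: [0.4, 0.0, 0.2]
```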

109:21

We are starting to do some really great work in the next generation of robotics. The next generation of robotics will likely be humanoid robotics. We now have the necessary technology and, as I was describing earlier, the necessary technology to imagine generalized humanoid robotics. In a way, humanoid robotics is likely easier, and the reason for that is because we have a lot more imitation training data that we can provide the robots, because we are constructed in a very similar way. It is very likely that humanoid robots will be much more useful in our world, because we created the world to be something that we can interoperate in and work well in, and the way that we set up our workstations and manufacturing and logistics, they were designed for humans, they were designed for people. And so these humanoid robots will likely be much more productive to deploy.

Meanwhile, we're creating, just like we're doing with the others, the entire stack, starting from the top: a foundation model that learns from watching videos of human examples. It could be in video form, it could be in virtual-reality form. We then created a gym for it, called Isaac Reinforcement Learning Gym, which allows the humanoid robot to learn how to adapt to the physical world. And then an incredible computer, the same computer that's going to go into a robotic car: this computer will run inside a humanoid robot, called Thor. It's designed for Transformer engines. We've combined several of these into one video. This is something that you're going to really love. Take a look.

111:07

[Music]

It's not enough for humans to imagine. We have to invent, and explore, and push beyond what's been done. We create, smarter and faster. We push it to fail, so it can learn. We teach it, then help it teach itself. We broaden its understanding, to take on new challenges, with absolute precision, and succeed. We make it perceive, and move, and even reason, so it can share our world with us.

[Music]

112:41

This is where inspiration leads us: the next frontier. This is NVIDIA Project GR00T, a general-purpose foundation model for humanoid robot learning. The GR00T model takes multimodal instructions and past interactions as input, and produces the next action for the robot to execute.
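Taken only from the sentence above, a GR00T-style policy interface might look like the following. Every name here is invented, and no public GR00T API is implied; the sketch just makes "multimodal instructions plus past interactions in, next action out" concrete.

```python
# Hypothetical interface for a GR00T-style policy, written only from the
# description above: multimodal instruction plus interaction history in,
# next action out. All names here are invented.
from dataclasses import dataclass, field
from typing import List
import numpy as np

@dataclass
class Observation:
    rgb: np.ndarray        # camera frame, e.g. shape (224, 224, 3)
    proprio: np.ndarray    # joint positions and velocities
    instruction: str       # natural-language command, e.g. "pick up the cup"

@dataclass
class PolicyState:
    history: List[Observation] = field(default_factory=list)

def next_action(model, state: PolicyState, obs: Observation) -> np.ndarray:
    """One control step: condition on the instruction and recent
    interactions, return the next action for the robot to execute."""
    state.history.append(obs)
    context = state.history[-16:]     # bounded window of past interactions
    return model.predict(context)     # hypothetical model call: joint targets
```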

113:09

We developed Isaac Lab, a robot-learning application, to train GR00T on Omniverse Isaac Sim, and we scale out with OSMO, a new compute-orchestration service that coordinates workflows across DGX systems for training and OVX systems for simulation. With these tools, we can train GR00T in physically based simulation and transfer zero-shot to the real world.
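The train-in-simulation, transfer-to-reality loop is, at its core, ordinary reinforcement learning run at enormous scale. Below is a deliberately tiny stand-in: a toy environment and a hill-climbing policy search, standing in for the thousands of GPU-parallel simulated robots Isaac Lab would actually run. None of the class names are Isaac Lab's.

```python
# Toy gym-style RL loop: roll out a policy in simulation, keep the weights
# that improve the return. Isaac Lab does this with thousands of parallel
# GPU-simulated robots and real policy-gradient algorithms.
import numpy as np

class WalkerEnv:
    """Toy stand-in for a simulated robot: 8 'joints', reward = forward progress."""
    def reset(self):
        self.state = np.full(8, 0.1)
        return self.state

    def step(self, action):
        self.state = np.clip(self.state + 0.1 * action, -1.0, 1.0)
        return self.state, float(self.state[0]), False  # obs, reward, done

env = WalkerEnv()
rng = np.random.default_rng(0)
best_theta, best_ret = np.zeros(8), -np.inf

for trial in range(200):
    theta = best_theta + 0.1 * rng.normal(size=8)    # perturb the policy
    obs, ret = env.reset(), 0.0
    for _ in range(50):
        obs, reward, done = env.step(np.tanh(theta * obs))
        ret += reward
        if done:
            break
    if ret > best_ret:                               # hill-climb: keep improvements
        best_theta, best_ret = theta, ret

print(f"best return after search: {best_ret:.2f}")
```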

113:36

The GR00T model will enable a robot to learn from a handful of human demonstrations, so it can help with everyday tasks, and emulate human movement just by observing us. This is made possible with NVIDIA's technologies that can understand humans from videos, train models in simulation, and ultimately deploy them directly to physical robots. Connecting GR00T to a large language model even allows it to generate motions by following natural-language instructions. "Hi, Go1, can you give me a high five?" "Sure thing, let's high-five." "Can you give us some cool moves?" "Sure, check this out."

114:25

All this incredible intelligence is powered by the new Jetson Thor robotics chips, designed for GR00T, built for the future. With Isaac Lab, OSMO, and GR00T, we're providing the building blocks for the next generation of AI-powered robotics.

[Applause]

114:56

About the same size. The soul of NVIDIA, the intersection of computer graphics, physics, and artificial intelligence: it all came to bear at this moment. The name of that project? General Robotics 003. I know. Super good. Super good.

115:27

Well, I think we have some special guests. Do we? Hey, guys! So I understand you guys are powered by Jetson. They're powered by Jetson, little Jetson robotics computers inside. They learned to walk in Isaac Sim. Ladies and gentlemen, this is Orange, and this is the famous Green. They are the BDX robots of Disney. Amazing Disney Research. Come on, you guys, let's wrap up. Let's go. Five things. Where are you going? I sit right here. Don't be afraid. Come here, Green, hurry up. What are you saying? No, it's not time to eat. It's not time to... I'll give you a snack in a moment. Let me finish up real quick. Come on, Green, hurry up. Stop wasting time.

117:01

Five things, five things. First: a new Industrial Revolution. Every data center should be accelerated; a trillion dollars' worth of installed data centers will become modernized over the next several years. And because of the computational capability we brought to bear, a new way of doing software has emerged: generative AI, which is going to create new infrastructure dedicated to doing one thing and one thing only, not multi-user data centers, but AI generators. These AI generators will create incredibly valuable software. A new Industrial Revolution.

Second: the computer of this revolution, the computer of this generation. Generative AI, trillion parameters, Blackwell, insane amounts of computers and computing.

Third... I'm trying to concentrate. Good job.

118:02

Third: a new computer creates new types of software, and a new type of software should be distributed in a new way, so that it can, on the one hand, be an endpoint in the cloud and easy to use, but still allow you to take it with you, because it is your intelligence. Your intelligence should be packaged up in a way that allows you to take it with you. We call them NIMs.
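What "packaged intelligence you can take with you" looks like in practice: a NIM is a container that exposes a standard HTTP inference endpoint, callable the same way in the cloud or on your own machine. The localhost URL and model name below are placeholders, and while NIM language-model microservices are generally documented as serving OpenAI-compatible routes, check the specific microservice for its actual API.

```python
# Sketch of calling a locally deployed NIM over its HTTP inference endpoint.
# URL and model name are placeholders; verify the route against the specific
# microservice's documentation.
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",   # placeholder NIM endpoint
    json={
        "model": "my-company/custom-llm",          # placeholder model name
        "messages": [{"role": "user",
                      "content": "Summarize today's production defects."}],
        "max_tokens": 256,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```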

118:24

And these NIMs are going to help you create a new type of application for the future, not one that you wrote completely from scratch, but one where you integrate them, like teams, to create these applications. We have a fantastic capability between the NIMs, the AI technology, the tools (NeMo), and the infrastructure (DGX Cloud) in our AI Foundry, to help you create proprietary applications, proprietary chatbots.

And then, lastly: everything that moves in the future will be robotic; you're not going to be the only one. And these robotic systems, whether they are humanoids, AMRs, self-driving cars, forklifts, manipulating arms, they will all need one thing. Giant stadiums, warehouses, factories: there are going to be factories that are robotic, orchestrating factories; manufacturing lines that are robotic, building cars that are robotic. These systems all need one thing: they need a platform, a digital platform, a digital-twin platform, and we call that Omniverse, the operating system of the robotics world.

119:29

These are the five things that we talked about today. What does NVIDIA look like? What does NVIDIA look like when we talk about GPUs? There's a very different image that I have when people ask me about GPUs. First, I see a bunch of software stacks and things like that, and second, I see this. This is what we announced to you today. This is Blackwell. This is the platform. Amazing, amazing processors, NVLink switches, networking systems, and the system design is a miracle. This is Blackwell, and this, to me, is what a GPU looks like in my mind.

120:18

Listen, Orange, Green: I think we have one more treat for everybody. What do you think? Should we? Okay, we have one more thing to show you. Roll it.

[Music]

Thank you! Thank you! Have a great, have a great GTC. Thank you all for coming. Thank you!
