Nvidia 2024 AI Event: Everything Revealed in 16 Minutes

CNET
18 Mar 2024 · 16:00

Summary

TL;DR: The keynote introduces the Blackwell platform, a new GPU architecture with 208 billion transistors and 10 TB/s of die-to-die bandwidth. Through an innovative design, the Blackwell chip joins two dies so seamlessly that memory-locality and cache issues disappear. NVIDIA also unveiled the NVLink Switch chip, with 50 billion transistors and 1.8 TB/s per link, built so every GPU can communicate with every other GPU at full speed. The talk highlighted NVIDIA's partnerships driving the AI era, including work with SAP, Cohesity, and Snowflake, and the launch of NVIDIA AI Foundry, which aims to help enterprises build AI factories. Finally, it showed how tools such as Omniverse and Isaac Sim are applied to robot learning, pointing toward the future of AI and robotics.

Takeaways

  • 🚀 Blackwell is an innovative chip with 208 billion transistors that changes how GPUs are traditionally designed.
  • 🔗 Through a unique design, Blackwell interconnects its two dies at high speed so they behave as a single whole.
  • 🌐 The chip moves 10 TB of data per second between the dies, eliminating memory-locality and cache issues.
  • 💻 Blackwell is compatible with existing Hopper systems and can replace them drop-in, upgrading existing infrastructure.
  • 🔄 Blackwell marks a huge leap in compute capability, especially for the generative-AI era.
  • 🌟 Blackwell adopts a new format called FP4, which is central to its content token generation.
  • 🔧 The launch is backed by multiple industry giants, including AWS, Google, and Microsoft.
  • 🤖 Blackwell will help build more powerful AI systems, such as NVIDIA AI Foundry's collaborations with SAP, Cohesity, and Snowflake.
  • 🌐 Its design and capabilities will advance cloud computing and data centers, improving overall compute efficiency.
  • 📈 The launch underscores NVIDIA's continued leadership in high-performance computing and AI.
  • 🎉 Blackwell is NVIDIA's latest achievement at the intersection of computer graphics, physics, and artificial intelligence.

Q & A

  • What is Blackwell?

    - Blackwell is a platform that changes the traditional form of the GPU. It has 208 billion transistors and is the first chip to join two dies in a way that leaves no memory-locality or cache issues between them; it behaves like one giant chip.

  • How are the two dies of the Blackwell chip connected?

    - A thin line of interconnect joins the two dies; this is the first time two dies have abutted this way. Data moves between them at 10 TB per second, so both sides feel as if they are working on the same chip.

  • How is the Blackwell chip compatible with existing Hopper systems?

    - Blackwell is form-, fit-, and function-compatible with Hopper: you can slide a Hopper out and push a Blackwell in, because the infrastructure, design, power requirements, and software are identical.

  • How does the Blackwell chip achieve memory coherence?

    - The two halves of the chip have no clue which side they are on. With no memory-locality or cache issues between them, the dies are memory-coherent and work together like one big happy family.

  • What processor did NVIDIA create for the generative-AI era?

    - NVIDIA created a processor for the generative-AI era, and one of its most important parts is content token generation, using a format called FP4.

  • What are the features of the NVLink Switch chip?

    - The NVLink Switch chip has 50 billion transistors, almost the size of Hopper by itself. The switch contains four NVLinks, each running at 1.8 TB per second, and it also includes computation.

  • Which companies is NVIDIA partnering with to advance Blackwell?

    - NVIDIA is working with many of the world's top companies, including AWS, Google, Microsoft, Oracle, SAP, Cohesity, Snowflake, and NetApp.

  • What are the three pillar services of NVIDIA AI Foundry?

    - NIMs (NVIDIA Inference Microservices), NeMo microservices, and DGX Cloud.

  • How does NVIDIA help enterprises build AI factories?

    - Through its AI Foundry services: NeMo microservices to curate and prepare data, NIMs for inference, and DGX Cloud for large-scale AI training and deployment.

  • What roles do Omniverse and OVX play in NVIDIA's AI ecosystem?

    - Omniverse is NVIDIA's virtual-world platform for simulating and training AI agents, and OVX is the computer that runs Omniverse, hosted in the Azure cloud, used to create digital twins and evaluate AI agents.

  • What are the features of NVIDIA's Jetson Thor robotics chip?

    - Jetson Thor is designed for the future, supported by Isaac Lab and Project GR00T. It can take multimodal instructions and past interactions as input and produce the robot's next action.
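The control loop described in the last answer, multimodal instructions plus past interactions in, next action out, can be sketched as a toy interface. Everything below (the `Observation` fields and the keyword-matching policy) is hypothetical and illustrative; it is not NVIDIA's actual GR00T API.

```python
from dataclasses import dataclass, field

# Hypothetical types -- illustrative only, not NVIDIA's GR00T interface.
@dataclass
class Observation:
    instruction: str                              # natural-language command
    camera_frame: list                            # stand-in for image pixels
    history: list = field(default_factory=list)   # past (obs, action) pairs

def next_action(obs: Observation) -> str:
    """Toy policy: map an instruction to a discrete action.
    A real foundation model would fuse all input modalities instead."""
    if "pick" in obs.instruction.lower():
        return "grasp"
    if "walk" in obs.instruction.lower():
        return "step_forward"
    return "idle"

# Control loop: feed each action back as history for the next step.
obs = Observation(instruction="Pick up the cup", camera_frame=[0] * 16)
action = next_action(obs)
obs.history.append((obs.instruction, action))
print(action)  # grasp
```

The point of the sketch is the data flow (instruction and history in, one action out per step), not the trivial policy body.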

Outlines

00:00

🚀 Blackwell, a chip for the future

Introduces the Blackwell chip's innovative design and capabilities: 208 billion transistors, 10 TB/s of die-to-die bandwidth, and no memory-locality or cache issues. Covers the two system types Blackwell ships in, along with its compatibility with Hopper and the ramp challenges that compatibility eases. Also introduces the processor created for the generative-AI era, the FP4 content-token-generation format, and another chip built to keep up with compute demand: the NVLink Switch.
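The outline above names FP4 without defining it. A common 4-bit floating-point layout is E2M1 (1 sign, 2 exponent, 1 mantissa bit), whose representable magnitudes are {0, 0.5, 1, 1.5, 2, 3, 4, 6}; the talk does not spell out Blackwell's exact encoding, so treat this round-to-nearest quantization sketch as an assumption:

```python
# E2M1 magnitudes: a common 4-bit float layout (assumption -- the talk
# does not specify Blackwell's exact FP4 encoding).
FP4_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_VALUES = sorted({s * m for m in FP4_MAGNITUDES for s in (1, -1)})

def quantize_fp4(x: float) -> float:
    """Round x to the nearest representable FP4 (E2M1) value,
    clamping to the format's range of [-6, 6]."""
    return min(FP4_VALUES, key=lambda v: abs(v - x))

print([quantize_fp4(x) for x in (0.7, 2.4, 10.0, -1.2)])
# -> [0.5, 2.0, 6.0, -1.0]
```

With only 4 bits per value, weights and activations take a quarter of the memory and bandwidth of FP16, which is why a low-bit token-generation format matters at this scale.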

05:00

🤖 An AI ecosystem built with industry giants

Describes NVIDIA's partnerships with industry giants to advance AI, including projects with Google, AWS, Oracle, and Microsoft: Google's Gemma model, AWS robotics and health projects, Oracle Database, and the NVIDIA ecosystem on Microsoft Azure. Highlights the three pillars of NVIDIA's AI offering, NIMs, NeMo microservices, and DGX Cloud, along with partnership cases with SAP, Cohesity, Snowflake, and NetApp.

10:00

🌐 Omniverse and the future of AI robotics

Discusses the importance of Omniverse as a simulation engine and the hosting of OVX computers in the Azure cloud. Highlights the prospects for digital twins in heavy industry and for AI agents that navigate complex industrial spaces. Covers NVIDIA Project GR00T as a general-purpose foundation model, plus Isaac Sim and OSMO for robot learning. Closes with the Jetson Thor robotics chip and NVIDIA's contributions to AI-driven robotics.

15:02

🎉 The Blackwell milestone

Recaps the main features of Blackwell, an innovative GPU and a marvel of system design, and underscores its importance for what comes next.

Keywords

💡Developer conference

A gathering of software developers, engineers, and technology companies for sharing the latest technical advances and product launches. In the video, the developer conference is the platform for presenting new technology and exchanging scientific ideas, such as the introduction of the Blackwell chip.

💡Blackwell

Blackwell is NVIDIA's new chip platform. It has 208 billion transistors and an innovative design that joins two dies so tightly they behave as a single chip, dramatically improving data throughput and memory coherence.

💡GPU

GPU stands for Graphics Processing Unit, a microprocessor specialized for image and video rendering. In the video, NVIDIA stresses that it does indeed make GPUs, but that the new generation looks and performs nothing like its predecessors.

💡Memory coherence

Memory coherence means that in a multiprocessor system, all processors accessing the same memory see a consistent view of the data. It is a key concept in parallel computing that keeps data synchronized and consistent. In the video, Blackwell's design makes its two dies memory-coherent, greatly improving compute efficiency.

💡NVLink Switch

The NVLink Switch is a high-speed networking chip with 50 billion transistors that delivers 1.8 TB/s per link, letting every GPU communicate with every other GPU at full speed to build an efficient compute fabric.
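The per-switch figures above imply simple aggregate arithmetic: four links at 1.8 TB/s each is 7.2 TB/s through one switch chip. A back-of-the-envelope helper (link counts other than the four stated here are hypothetical inputs):

```python
def aggregate_bandwidth_tbps(links: int, per_link_tbps: float) -> float:
    """Total switch throughput in TB/s, assuming every link
    runs at its full rate simultaneously."""
    return links * per_link_tbps

# Figures from the talk: four NVLinks at 1.8 TB/s each.
print(aggregate_bandwidth_tbps(4, 1.8))  # -> 7.2
```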

💡AI foundry

An AI foundry provides AI technology and services to other companies, helping them build and optimize AI applications. In the video, NVIDIA positions itself as an AI foundry, offering a suite of tools and services, including NIMs, NeMo microservices, and DGX Cloud, to support its partners' AI development.
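NIMs are typically consumed as HTTP endpoints with an OpenAI-style chat-completions schema. The endpoint URL and model name below are placeholders, and this sketch only assembles the request body rather than sending it, so it stays self-contained:

```python
import json

# Placeholder endpoint and model name -- substitute the real values
# for the NIM you deploy. Hosted NIMs follow an OpenAI-style schema.
NIM_URL = "https://example.invalid/v1/chat/completions"

def build_nim_request(prompt: str, model: str = "example/llm") -> dict:
    """Assemble an OpenAI-style chat request body for a NIM endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }

body = build_nim_request("Summarize Blackwell in one sentence.")
payload = json.dumps(body)  # serialized JSON, ready to POST to NIM_URL
print(payload[:30])
```

In practice you would POST `payload` to the endpoint with an API key header; the response parsing follows the same OpenAI-style shape.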

💡Digital twin

A digital twin is a precise digital model of a physical object or system, which can be used to simulate and predict how the real thing behaves. In the video, digital twins are used for robot learning, so robots can be trained and evaluated in a virtual environment.

💡Omniverse

Omniverse is NVIDIA's open, multi-GPU-accelerated simulation and collaboration platform for creating and simulating complex physical and engineering systems, giving designers, engineers, and other professionals a shared virtual environment for efficient collaboration and innovation.

💡Robot learning

Robot learning is the process by which robots use AI algorithms and large amounts of data to learn how to perform tasks, make decisions, and adapt to their environment. In the video, NVIDIA presents Isaac Sim, a robot-learning application, and Project GR00T, a general-purpose foundation model, for training robots to carry out everyday tasks.

💡Jetson Thor

Jetson Thor is NVIDIA's high-performance compute chip designed for robots, combining advanced AI with strong compute to power the next generation of AI-driven robots and automation systems.

💡NeMo microservices

NeMo microservices are an NVIDIA service that helps users curate, prepare, and evaluate data for training and fine-tuning AI models, making it easier to put AI to work effectively.

Highlights

Introduction of the Blackwell platform, which changes the traditional notion of a GPU.

Hopper, with 80 billion transistors, changed the world; Blackwell carries that innovation forward with 208 billion.

Blackwell's unique design: two dies joined tightly, with data moving between them at 10 TB per second.

Blackwell's memory coherence eliminates memory-locality and cache issues.

Blackwell can seamlessly replace existing Hopper systems, keeping infrastructure, design, power, and software identical.

Introduction of the two system types Blackwell ships in, including a version compatible with the current HGX configuration.

A prototype Blackwell board was shown: a fully functioning board.

Introduction of the NVLink Switch chip, with 50 billion transistors and 1.8 TB/s per link.

The NVLink Switch chip lets every GPU talk to every other GPU at full speed, enabling powerful systems.

A DGX system was shown: an exaflops AI system with enormous compute capability in a single rack.

Partners joining Blackwell include companies building GPUs with secure AI and large-scale AI systems.

NVIDIA is partnering with industry giants including AWS, Google, Microsoft, and Oracle to advance AI.

NVIDIA AI Foundry is working with SAP to build SAP Joule copilots using NeMo and DGX Cloud services.

NVIDIA AI Foundry is helping Cohesity build its Gaia generative-AI agent and working with Snowflake on copilots.

NVIDIA and Dell are partnering to give enterprises the ability to build AI factories that run large-scale enterprise systems.

Omniverse serves as the virtual world where robots learn; OVX computers are hosted in the Azure cloud.

NVIDIA Project GR00T is a general-purpose foundation model for humanoid robot learning, trained with Isaac Sim and OSMO.

The Jetson Thor robotics chip powers future AI-driven robots, demonstrated with Disney's BDX robots.

Transcripts

00:01

I hope you realize this is not a concert. You have arrived at a developers conference. There will be a lot of science described: algorithms, computer architecture, mathematics. Blackwell is not a chip; Blackwell is the name of a platform. People think we make GPUs, and we do, but GPUs don't look the way they used to. This is Hopper. Hopper changed the world. This is Blackwell. It's okay. 208 billion transistors. And so you could see, I can see, there's a small line between two dies. This is the first time two dies have abutted like this together, in such a way that the two dies think it's one chip. There's 10 terabytes of data between it, 10 terabytes per second, so that these two sides of the Blackwell chip have no clue which side they're on. There's no memory-locality issues, no cache issues. It's just one giant chip.

01:25

And it goes into two types of systems. The first one is form, fit, function compatible to Hopper, and so you slide Hopper out and you push in Blackwell. That's the reason why one of the challenges of ramping is going to be so efficient: there are installations of Hoppers all over the world, and they could be the same infrastructure, same design; the power, the electricity, the thermals, the software, identical. Push it right back. And so this is a Hopper version for the current HGX configuration, and this is what the second one looks like. Now this is a prototype board, this is a fully functioning board, and I'll just be careful here. This right here is, I don't know, $10 billion. The second one's five. It gets cheaper after that, so any customers in the audience, it's okay.

02:32

The Grace CPU has a super-fast chip-to-chip link. What's amazing is this computer is the first of its kind where this much computation, first of all, fits into this small of a place. Second, it's memory coherent. They feel like they're just one big happy family working on one application together. We created a processor for the generative AI era, and one of the most important parts of it is content token generation. We call it, this format, is FP4. The rate at which we're advancing computing is insane, and it's still not fast enough, so we built another chip.

03:13

This chip is just an incredible chip. We call it the NVLink Switch. It's 50 billion transistors. It's almost the size of Hopper all by itself. This switch chip has four NVLinks in it, each 1.8 terabytes per second, and it has computation in it, as I mentioned. What is this chip for? If we were to build such a chip, we can have every single GPU talk to every other GPU at full speed at the same time. You can build a system that looks like this.

04:03

Now this system, this system is kind of insane. This is one DGX. This is what a DGX looks like now. Just so you know, there are only a couple, two, three exaflops machines on the planet as we speak, and so this is an exaflops AI system in one single rack.

04:23

I want to thank some partners that are joining us in this. AWS is gearing up for Blackwell. They're going to build the first GPU with secure AI. They're building out a 222-exaflops system. We're CUDA-accelerating SageMaker AI, we're CUDA-accelerating Bedrock AI. Amazon Robotics is working with us using NVIDIA Omniverse and Isaac Sim. AWS Health has NVIDIA Health integrated into it. So AWS has really leaned into accelerated computing.

05:00

Google is gearing up for Blackwell. GCP already has A100s, H100s, T4s, L4s, a whole fleet of NVIDIA CUDA GPUs, and they recently announced the Gemma model that runs across all of it. We're working to optimize and accelerate every aspect of GCP: we're accelerating Dataproc, the data processing engine; JAX; XLA; Vertex AI; and MuJoCo for robotics. So we're working with Google and GCP across a whole bunch of initiatives.

05:29

Oracle is gearing up for Blackwell. Oracle is a great partner of ours for NVIDIA DGX Cloud, and we're also working together to accelerate something that's really important to a lot of companies: Oracle Database.

05:43

Microsoft is accelerating, and Microsoft is gearing up for Blackwell. Microsoft and NVIDIA have a wide-ranging partnership. We're accelerating all kinds of services: when you chat, obviously, and AI services that are in Microsoft Azure, it's very, very likely NVIDIA is in the back doing the inference and the token generation. They built the largest NVIDIA InfiniBand supercomputer, basically a digital twin of ours, or a physical twin of ours. We're bringing the NVIDIA ecosystem to Azure, NVIDIA DGX Cloud to Azure. NVIDIA Omniverse is now hosted in Azure, NVIDIA Healthcare is in Azure, and all of it is deeply integrated and deeply connected with Microsoft Fabric.

06:24

A NIM, it's a pre-trained model, so it's pretty clever, and it is packaged and optimized to run across NVIDIA's install base, which is very, very large. What's inside it is incredible. You have all these pre-trained, state-of-the-art open-source models. They could be open source, they could be from one of our partners, they could be created by us. It is packaged up with all of its dependencies: CUDA, the right version; cuDNN, the right version; TensorRT-LLM, distributing across the multiple GPUs; Triton Inference Server; all completely packaged together. It's optimized depending on whether you have a single GPU, multi-GPU, or multi-node of GPUs. It's optimized for that, and it's connected up with APIs that are simple to use.

07:14

These packages, incredible bodies of software, will be optimized and packaged, and we'll put it on a website, and you can download it. You could take it with you, you could run it in any cloud, you could run it in your own data center, you can run it in workstations if it fits. And all you have to do is come to ai.nvidia.com. We call it NVIDIA Inference Microservice, but inside the company we all call it NIMs.

07:40

We have a service called NeMo Microservice that helps you curate the data, preparing the data so that you could teach this, onboard this AI. You fine-tune them, and then you guardrail it. You can even evaluate the answer, evaluate its performance against other examples. And so we are effectively an AI foundry. We will do for you and the industry on AI what TSMC does for us building chips. We go to TSMC with our big ideas, they manufacture, and we take it with us. Exactly the same thing here: AI Foundry, and the three pillars are the NIMs, NeMo Microservice, and DGX Cloud.

08:21

We're announcing that NVIDIA AI Foundry is working with some of the world's great companies. SAP generates 87% of the world's global commerce. Basically, the world runs on SAP; we run on SAP. NVIDIA and SAP are building SAP Joule copilots using NVIDIA NeMo and DGX Cloud. ServiceNow: 80, 85% of the world's Fortune 500 companies run their people and customer-service operations on ServiceNow, and they're using NVIDIA AI Foundry to build ServiceNow Assist virtual assistants.

08:55

Cohesity backs up the world's data. They're sitting on a gold mine of data: hundreds of exabytes of data, over 10,000 companies. NVIDIA AI Foundry is working with them, helping them build their Gaia generative AI agent. Snowflake is a company that stores the world's digital warehouse in the cloud and serves over three billion queries a day for 10,000 enterprise customers. Snowflake is working with NVIDIA AI Foundry to build copilots with NVIDIA NeMo and NIMs. NetApp: nearly half of the files in the world are stored on-prem on NetApp. NVIDIA AI Foundry is helping them build chatbots and copilots, like those vector databases and retrievers, with NVIDIA NeMo and NIMs.

09:47

And we have a great partnership with Dell. Everybody who is building these chatbots and generative AI, when you're ready to run it, you're going to need an AI factory, and nobody is better at building end-to-end systems of very large scale for the enterprise than Dell. And so anybody, any company, every company will need to build AI factories, and it turns out that Michael is here. He's happy to take your order.

10:17

We need a simulation engine that represents the world digitally for the robot, so that the robot has a gym to go learn how to be a robot. We call that virtual world Omniverse, and the computer that runs Omniverse is called OVX, and OVX, the computer itself, is hosted in the Azure cloud. The future of heavy industries starts as a digital twin. The AI agents helping robots, workers, and infrastructure navigate unpredictable events in complex industrial spaces will be built and evaluated first in sophisticated digital twins.

10:55

Once you connect everything together, it's insane how much productivity you can get, and it's just really, really wonderful. All of a sudden everybody's operating on the same ground truth. You don't have to exchange data and convert data, make mistakes. Everybody is working on the same ground truth, from the design department to the art department, the architecture department, all the way to the engineering and even the marketing department.

11:19

Today we're announcing that Omniverse Cloud streams to the Vision Pro, and it is very, very strange that you walk around virtual doors when I was getting out of that car, and everybody does it. It is really, really quite amazing. Vision Pro, connected to Omniverse, portals you into Omniverse, and because all of these CAD tools and all these different design tools are now integrated and connected to Omniverse, you can have this type of workflow. Really incredible.

12:02

This is NVIDIA Project GR00T, a general-purpose foundation model for humanoid robot learning. The GR00T model takes multimodal instructions and past interactions as input and produces the next action for the robot to execute. We developed Isaac Lab, a robot learning application, to train GR00T on Omniverse Isaac Sim, and we scale out with OSMO, a new compute orchestration service that coordinates workflows across DGX systems for training and OVX systems for simulation. The GR00T model will enable a robot to learn from a handful of human demonstrations so it can help with everyday tasks, and emulate human movement just by observing us. All this incredible intelligence is powered by the new Jetson Thor robotics chips, designed for GR00T, built for the future. With Isaac Lab, OSMO, and GR00T, we're providing the building blocks for the next generation of AI-powered robotics.

[Applause] [Music]

13:30

About the same size. The soul of NVIDIA: the intersection of computer graphics, physics, artificial intelligence. It all came to bear at this moment. The name of that project: General Robotics 003. I know, super good, super good. Well, I think we have some special guests, do we?

14:15

[Music] Hey guys. So I understand you guys are powered by Jetson. They're powered by Jetson, little Jetson robotics computers inside. They learned to walk in Isaac Sim. Ladies and gentlemen, this is Orange, and this is the famous Green. They are the BDX robots of Disney. Amazing Disney Research.

14:51

Come on, you guys, let's wrap up. Let's go. Five things. Where are you going? I sit right here. Don't be afraid. Come here, Green. Hurry up. What are you saying? No, it's not time to eat. It's not time to eat. I'll give you a snack in a moment. Let me finish up real quick. Come on, Green. Hurry up. Stop wasting time.

15:34

This is what we announce to you today. This is Blackwell. This is the platform. Amazing. Amazing processors, NVLink switches, networking systems, and the system design is a miracle. This is Blackwell, and this, to me, is what a GPU looks like in my mind.