Nvidia 2024 AI Event: Everything Revealed in 16 Minutes
Summary
TLDR: This keynote introduced the Blackwell platform, a new GPU architecture with 208 billion transistors and 10 TB/s of die-to-die bandwidth. Through an innovative design, the Blackwell chip joins two dies seamlessly, eliminating memory-locality and cache issues. NVIDIA also unveiled the NVLink Switch chip, with 50 billion transistors and 1.8 TB/s per link, designed to let GPUs communicate with one another at full speed. The keynote highlighted NVIDIA's partnerships driving the AI era, including collaborations with SAP, Cohesity, Snowflake, and others, and the launch of NVIDIA AI Foundry, which helps enterprises build AI factories. Finally, it showed how tools such as Omniverse and Isaac Sim are used for robot learning, pointing toward the future of AI and robotics.
Takeaways
- 🚀 Blackwell is an innovative chip with 208 billion transistors that changes how traditional GPUs are designed.
- 🔗 Through a unique design, Blackwell connects its two dies at high speed so that they behave as a single whole.
- 🌐 Blackwell delivers 10 TB/s of die-to-die bandwidth, eliminating memory-locality and cache issues.
- 💻 Blackwell is compatible with existing Hopper systems and can be swapped in seamlessly, upgrading existing infrastructure.
- 🔄 Blackwell's launch signals a huge leap in compute capability, especially for the generative AI era.
- 🌟 Blackwell adopts a new format called FP4, which is central to its content token generation.
- 🔧 Blackwell launches with support from industry giants including AWS, Google, and Microsoft.
- 🤖 Blackwell will help build more powerful AI systems, such as NVIDIA AI Foundry's collaborations with SAP, Cohesity, and Snowflake.
- 🌐 Blackwell's design and capabilities will advance cloud computing and data centers, raising overall compute efficiency.
- 📈 Blackwell's launch marks NVIDIA's continued leadership in high-performance computing and AI.
- 🎉 Blackwell's release is NVIDIA's latest achievement at the intersection of computer graphics, physics, and artificial intelligence.
Q & A
What is Blackwell?
-Blackwell is a platform that changes the traditional form of the GPU. It has 208 billion transistors and is the first chip to join two dies in such a way that there are no memory-locality or cache issues between them; it behaves like one giant chip.
How are the two dies of the Blackwell chip connected?
-The two dies meet along a thin line; this is the first time two dies have been abutted this way. Data moves between them at 10 TB/s, so both sides feel as if they are working on the same chip.
How is the Blackwell chip compatible with existing Hopper systems?
-Blackwell is form-, fit-, and function-compatible with Hopper: you can slide Hopper out and push Blackwell in, because the infrastructure, design, power requirements, and software are identical.
How does the Blackwell chip achieve memory coherence?
-The two halves of the chip have no clue which side they are on. There are no memory-locality or cache issues between them, so they achieve memory coherence and work together like one big happy family.
What kind of processor did NVIDIA create for the generative AI era?
-NVIDIA created a processor for the generative AI era, and one of its most important parts is content token generation, using a format called FP4.
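The keynote doesn't spell out FP4's encoding. Assuming it resembles a 4-bit E2M1 floating-point format (as in the OCP microscaling specification, an assumption, not something stated in the talk), quantizing a value simply means rounding it to the nearest representable level. A minimal Python sketch:

```python
# Hypothetical illustration: round values to a 4-bit E2M1-style grid.
# The exact FP4 encoding used by Blackwell is not stated in the keynote;
# the levels below follow the OCP MX E2M1 format as an assumption.

E2M1_LEVELS = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
GRID = sorted({s * v for v in E2M1_LEVELS for s in (1.0, -1.0)})

def quantize_fp4(x: float) -> float:
    """Return the nearest representable E2M1 value (ties round away from zero)."""
    return min(GRID, key=lambda v: (abs(v - x), -abs(v)))

if __name__ == "__main__":
    for x in (0.3, 2.4, 5.3, -7.0):
        print(x, "->", quantize_fp4(x))
```

With only 16 representable values, out-of-range inputs clamp to ±6.0, which is why such low-precision formats are typically paired with per-block scaling factors in practice.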
What are the features of the NVLink Switch chip?
-The NVLink Switch chip has 50 billion transistors, almost the size of Hopper all by itself. The switch has four NVLinks built in, each running at 1.8 TB/s, and it also contains compute.
Which companies is NVIDIA partnering with to advance Blackwell?
-NVIDIA is partnering with many of the world's top companies, including AWS, Google, Microsoft, Oracle, SAP, Cohesity, Snowflake, and NetApp, to advance Blackwell.
What are the three pillar services of NVIDIA AI Foundry?
-The three pillars of NVIDIA AI Foundry are NIMs (NVIDIA Inference Microservices), NeMo microservices, and DGX Cloud.
How does NVIDIA help enterprises build AI factories?
-Through its AI Foundry service: NeMo microservices to curate and prepare data, NIMs for inference, and DGX Cloud for large-scale AI training and deployment.
What roles do Omniverse and OVX play in NVIDIA's AI ecosystem?
-Omniverse is NVIDIA's virtual-world platform for simulating and training AI agents, while OVX is the computer that runs Omniverse. It is hosted in the Azure cloud and used to create digital twins and evaluate AI agents.
What are the features of NVIDIA's Jetson Thor robotics chip?
-Jetson Thor is designed for the future, supported by Isaac Lab and Project GR00T. It can take multimodal instructions and past interactions as input and produce the next action for the robot.
Outlines
🚀 The Blackwell chip leading the future
Introduces Blackwell's innovative design and capabilities, including its 208 billion transistors, 10 TB/s die-to-die bandwidth, and freedom from memory-locality and cache issues. Highlights the two system types Blackwell goes into, its compatibility with Hopper, and the challenges of ramping. Also covers the processor built for the generative AI era, the FP4 content-token-generation format, and another chip built to meet compute demand: the NVLink Switch.
🤖 An AI ecosystem built with industry giants
Describes NVIDIA's collaborations with industry giants to advance AI, including Google's Gemma model, AWS's robotics and health projects, Oracle Database, and the NVIDIA ecosystem on Microsoft Azure. Highlights the three pillars of NVIDIA AI Foundry, NIMs, NeMo microservices, and DGX Cloud, along with partnerships with SAP, Cohesity, Snowflake, and NetApp.
🌐 Omniverse and the future of AI robotics
Discusses the importance of Omniverse as a simulation engine and the OVX computers hosted in the Azure cloud. Highlights digital twins in heavy industry and the ability of AI agents to navigate complex industrial spaces. Covers NVIDIA Project GR00T as a general-purpose foundation model, Isaac Sim and OSMO for robot learning, and finally the Jetson Thor robotics chip and NVIDIA's contributions to AI-driven robotics.
🎉 The Blackwell milestone
Summarizes Blackwell's main features as an innovative GPU and a marvel of system design, and its significance for future development.
Keywords
💡Developer conference
💡Blackwell
💡GPU
💡Memory coherence
💡NVLink Switch
💡AI Foundry
💡Digital twin
💡Omniverse
💡Robot learning
💡Jetson Thor
💡NeMo microservices
Highlights
Introduction of the Blackwell platform, which changes how people traditionally think about GPUs.
Hopper, with 80 billion transistors, changed the world; Blackwell carries that innovation forward.
Blackwell's unique design: two dies joined so tightly that data moves between them at 10 TB/s.
Blackwell's memory coherence eliminates memory-locality and cache issues.
Blackwell can seamlessly replace existing Hopper systems, keeping infrastructure, design, power, and software the same.
Two system types for Blackwell were introduced, including a version compatible with the current HGX configuration.
A prototype Blackwell board was shown, a fully functioning board.
Introduction of the NVLink Switch chip, with 50 billion transistors and 1.8 TB/s per link.
The NVLink Switch chip lets every GPU talk to every other GPU at full speed, enabling powerful systems.
A DGX system was shown: an exaflops-class AI system with enormous compute capability.
Partners are joining Blackwell, including companies building secure-AI GPUs and large-scale AI systems.
NVIDIA is partnering with industry giants including AWS, Google, Microsoft, and Oracle to advance AI.
NVIDIA AI Foundry is working with SAP, using NeMo and DGX Cloud services, to build SAP Joule copilots.
NVIDIA AI Foundry is helping Cohesity build its Gaia generative AI agent and working with Snowflake to build copilots.
NVIDIA and Dell are giving enterprises the ability to build AI factories to run large-scale enterprise systems.
Omniverse serves as a virtual world where robots learn; the OVX computers that run it are hosted in the Azure cloud.
NVIDIA Project GR00T is a general-purpose foundation model for humanoid robot learning, trained with Isaac Sim and OSMO.
The Jetson Thor robotics chip powers future AI-driven robots, demonstrated with Disney's BDX robots.
Transcripts
I hope you realize this is not a concert. You have arrived at a developers conference. There will be a lot of science described: algorithms, computer architecture, mathematics. Blackwell is not a chip; Blackwell is the name of a platform. People think we make GPUs, and we do, but GPUs don't look the way they used to. This is Hopper. Hopper changed the world. This is Blackwell. It's okay, Hopper.

Blackwell: 208 billion transistors. You can see there's a small line between two dies. This is the first time two dies have abutted like this, in such a way that the two dies think it's one chip. There's 10 terabytes of data between them, 10 terabytes per second, so that these two sides of the Blackwell chip have no clue which side they're on. There are no memory-locality issues, no cache issues; it's just one giant chip.

And it goes into two types of systems. The first one is form-fit-function compatible to Hopper, so you slide Hopper out and you push in Blackwell. That's the reason why ramping is going to be so efficient: there are installations of Hoppers all over the world, and it can be the same infrastructure, same design; the power, the electricity, the thermals, the software: identical. Push it right back in. So this is a Hopper version for the current HGX configuration, and this is what the second system looks like. Now, this is a prototype board, a fully functioning board, and I'll just be careful here. This right here is, I don't know, $10 billion. The second one's five. It gets cheaper after that, so any customer in the audience, it's okay.
The Grace CPU has a super-fast chip-to-chip link. What's amazing is that this computer is the first of its kind where this much computation, first of all, fits into this small of a place. Second, it's memory coherent. They feel like they're just one big happy family working on one application together.

We created a processor for the generative AI era, and one of the most important parts of it is content token generation. We call this format FP4. The rate at which we're advancing computing is insane, and it's still not fast enough, so we built another chip. This chip is just an incredible chip. We call it the NVLink Switch. It's 50 billion transistors. It's almost the size of Hopper all by itself. This switch chip has four NVLinks in it, each 1.8 terabytes per second, and it has computation in it, as I mentioned. What is this chip for? If we were to build such a chip, we could have every single GPU talk to every other GPU at full speed at the same time. You can build a system that looks like this.
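Using only the figures quoted in the talk (four NVLinks per switch chip, each at 1.8 TB/s), a back-of-the-envelope sketch of the aggregate numbers; treating the per-link figure as each GPU's injection rate in the all-to-all case is an assumption for illustration:

```python
# Back-of-the-envelope bandwidth arithmetic from figures quoted in the keynote:
# each NVLink runs at 1.8 TB/s and the switch chip carries four of them.
NVLINK_TBPS = 1.8        # per-link bandwidth quoted in the talk
LINKS_PER_SWITCH = 4     # links per switch chip quoted in the talk

def switch_throughput_tbps(links: int = LINKS_PER_SWITCH,
                           link_tbps: float = NVLINK_TBPS) -> float:
    """Aggregate throughput of a single switch chip across all its links."""
    return links * link_tbps

def all_to_all_aggregate_tbps(num_gpus: int,
                              per_gpu_tbps: float = NVLINK_TBPS) -> float:
    """If every GPU can inject at full bandwidth simultaneously (the talk's
    claim), aggregate bandwidth scales linearly with GPU count.
    Using the per-link rate as the per-GPU rate is an assumption here."""
    return num_gpus * per_gpu_tbps

if __name__ == "__main__":
    print(switch_throughput_tbps())       # 4 links x 1.8 TB/s per switch chip
    print(all_to_all_aggregate_tbps(8))   # illustrative 8-GPU aggregate
```

The point of the linear scaling is that a non-blocking switch fabric keeps per-GPU bandwidth constant as the system grows, which is what makes an exaflops-class rack feasible.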
Now, this system is kind of insane. This is one DGX. This is what a DGX looks like now. Just so you know, there are only a couple, two, three exaflops machines on the planet as we speak, and so this is an exaflops AI system in one single rack.

I want to thank some partners that are joining us in this. AWS is gearing up for Blackwell. They're going to build the first GPU with secure AI. They're building out a 222-exaflops system. We're CUDA-accelerating SageMaker AI, we're CUDA-accelerating Bedrock AI, Amazon Robotics is working with us using NVIDIA Omniverse and Isaac Sim, and AWS Health has NVIDIA Health integrated into it. So AWS has really leaned into accelerated computing.

Google is gearing up for Blackwell. GCP already has A100s, H100s, T4s, L4s, a whole fleet of NVIDIA CUDA GPUs, and they recently announced the Gemma model that runs across all of it. We're working to optimize and accelerate every aspect of GCP: we're accelerating Dataproc, the data processing engine; JAX; XLA; Vertex AI; and MuJoCo for robotics. So we're working with Google and GCP across a whole bunch of initiatives.

Oracle is gearing up for Blackwell. Oracle is a great partner of ours for NVIDIA DGX Cloud, and we're also working together to accelerate something that's really important to a lot of companies: Oracle Database.

Microsoft is accelerating, and Microsoft is gearing up for Blackwell. Microsoft and NVIDIA have a wide-ranging partnership. We're CUDA-accelerating all kinds of services: when you chat, obviously, and with the AI services that are in Microsoft Azure, it's very, very likely NVIDIA's in the back doing the inference and the token generation. They built the largest NVIDIA InfiniBand supercomputer, basically a digital twin of ours, or a physical twin of ours. We're bringing the NVIDIA ecosystem to Azure, NVIDIA DGX Cloud to Azure. NVIDIA Omniverse is now hosted in Azure, NVIDIA Healthcare is in Azure, and all of it is deeply integrated and deeply connected
with Microsoft Fabric.

A NIM: it's a pre-trained model, so it's pretty clever, and it is packaged and optimized to run across NVIDIA's install base, which is very, very large. What's inside it is incredible. You have all these pre-trained, state-of-the-art models. They could be open source, they could be from one of our partners, they could be created by us, like NVIDIA Moment. It is packaged up with all of its dependencies: CUDA, the right version; cuDNN, the right version; TensorRT-LLM, distributing across the multiple GPUs; Triton Inference Server; all completely packaged together. It's optimized depending on whether you have a single GPU, multi-GPU, or multi-node of GPUs; it's optimized for that; and it's connected up with APIs that are simple to use. These packages, incredible bodies of software, will be optimized and packaged, and we'll put them on a website, and you can download them and take them with you. You can run them in any cloud, you can run them in your own data center, you can run them on workstations if they fit. And all you have to do is come to ai.nvidia.com. We call it NVIDIA Inference Microservice, but inside the company we all call it NIMs.
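NIMs are described above as packaged models fronted by simple APIs. As a sketch of what "simple to use" might mean in practice, the snippet below assembles an OpenAI-style chat-completions request; the endpoint URL, model name, and payload shape here are assumptions for illustration, not details from the talk, so consult the actual NIM's documentation before use:

```python
import json

# Hypothetical illustration of preparing a request for a NIM-style
# inference microservice. The base URL, model name, and payload shape
# below are assumptions for this sketch, not official NVIDIA values.

def build_chat_request(base_url: str, model: str, prompt: str, api_key: str):
    """Assemble an OpenAI-style chat-completions request for a NIM endpoint."""
    url = f"{base_url}/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return url, headers, json.dumps(body)

if __name__ == "__main__":
    url, headers, payload = build_chat_request(
        "https://example-nim.local", "example/model", "Hello", "MY_KEY")
    print(url)                     # endpoint the request would be sent to
    print(json.loads(payload))     # the JSON body, round-tripped
```

The payload could then be sent with any HTTP client; the point of the microservice packaging is that the same request shape works whether the NIM runs in a cloud, a data center, or on a workstation.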
We have a service called NeMo Microservice that helps you curate the data, preparing the data so that you can teach and onboard this AI. You fine-tune them, and then you guardrail it. You can even evaluate the answers, evaluate its performance against other examples. And so we are effectively an AI foundry. We will do for you and the industry, on AI, what TSMC does for us building chips. We go to TSMC with our big ideas, they manufacture, and we take it with us. Exactly the same thing here: AI Foundry. And the three pillars are the NIMs, NeMo Microservice, and DGX Cloud.

We're announcing that NVIDIA AI Foundry is working with some of the world's great companies. SAP generates 87% of the world's global commerce. Basically, the world runs on SAP; we run on SAP. NVIDIA and SAP are building SAP Joule copilots using NVIDIA NeMo and DGX Cloud. ServiceNow: 85% of the world's Fortune 500 companies run their people and customer-service operations on ServiceNow, and they're using NVIDIA AI Foundry to build ServiceNow Assist virtual assistants. Cohesity backs up the world's data. They're sitting on a gold mine of data: hundreds of exabytes of data from over 10,000 companies. NVIDIA AI Foundry is working with them, helping them build their Gaia generative AI agent. Snowflake is a company that stores the world's digital warehouse in the cloud and serves over three billion queries a day for 10,000 enterprise customers. Snowflake is working with NVIDIA AI Foundry to build copilots with NVIDIA NeMo and NIMs. NetApp: nearly half of the files in the world are stored on-prem on NetApp. NVIDIA AI Foundry is helping them build chatbots and copilots, like those vector databases and retrievers, with NVIDIA NeMo and NIMs.

And we have a great partnership with Dell. Everybody who is building these chatbots and generative AI, when you're ready to run it, you're going to need an AI factory. And nobody is better at building end-to-end systems of very large scale for the enterprise than Dell. So anybody, any company, every company will need to build AI factories, and it turns out that Michael is here; he's happy to take your order.

We need a simulation engine
engine that represents the world
digitally for the robot so that the
robot has a gym to go learn how to be a
robot we call that
virtual world Omniverse and the computer
that runs Omniverse is called ovx and
ovx the computer itself is hosted in the
Azure Cloud the future of heavy
Industries starts as a digital twin the
AI agents helping robots workers and
infrastructure navigate unpredictable
events in complex industrial spaces will
be built and evaluated first in
sophisticated digital twins once you
connect everything together it's insane
how much productivity you can get and
it's just really really wonderful all of
a sudden everybody's operating on the
same ground
truth you don't have to exchange data
and convert data make mistakes everybody
is working on the same ground truth from
the design Department to the art
Department the architecture Department
all the way to the engineering and even
the marketing department today we're
announcing that Omniverse
Cloud streams to The Vision Pro and
it is very very
strange that you walk around virtual
doors when I was getting out of that
car and everybody does it it is really
really quite amazing Vision Pro
connected to Omniverse portals you into
Omniverse and because all of these cat
tools and all these different design
tools are now integrated and connected
to Omniverse
you can have this type of workflow
really
incredible.

This is NVIDIA Project GR00T, a general-purpose foundation model for humanoid robot learning. The GR00T model takes multimodal instructions and past interactions as input and produces the next action for the robot to execute. We developed Isaac Lab, a robot-learning application, to train GR00T on Omniverse Isaac Sim, and we scale out with OSMO, a new compute orchestration service that coordinates workflows across DGX systems for training and OVX systems for simulation. The GR00T model will enable a robot to learn from a handful of human demonstrations so it can help with everyday tasks, and to emulate human movement just by observing us. All this incredible intelligence is powered by the new Jetson Thor robotics chips, designed for GR00T. Built for the future, with Isaac Lab, OSMO, and GR00T, we're providing the building blocks for the next generation of AI-powered robotics. [Applause] [Music]
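The transcript describes the GR00T model's interface only at a high level: multimodal instructions plus past interactions in, next action out. That loop can be sketched in Python; every class, field, and action name below is invented for illustration and is not an NVIDIA API:

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical sketch of the interface described in the keynote:
# multimodal instruction + interaction history in, next robot action out.
# None of these names are real NVIDIA APIs; they are placeholders.

@dataclass
class Observation:
    instruction: str                                   # natural-language command
    camera_frame: bytes = b""                          # stand-in for image input
    history: List[str] = field(default_factory=list)   # past actions taken

class ToyPolicy:
    """Placeholder policy: maps instruction keywords to canned actions,
    standing in for a learned foundation model."""
    ACTIONS = {"pick": "close_gripper", "walk": "step_forward"}

    def next_action(self, obs: Observation) -> str:
        for keyword, action in self.ACTIONS.items():
            if keyword in obs.instruction.lower():
                return action
        return "idle"

if __name__ == "__main__":
    policy = ToyPolicy()
    obs = Observation(instruction="Pick up the cup", history=["step_forward"])
    print(policy.next_action(obs))
```

A real model would replace the keyword table with a learned mapping trained in simulation (the "gym" role Omniverse and Isaac Sim play in the talk), but the observation-to-action contract is the same shape.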
About the same size. The soul of NVIDIA: the intersection of computer graphics, physics, and artificial intelligence. It all came to bear at this moment. The name of that project: General Robotics 003. I know, super good. Super good.

Well, I think we have some special guests, do we? [Music] Hey, guys. So I understand you guys are powered by Jetson. They're powered by Jetson, little Jetson robotics computers inside. They learned to walk in Isaac Sim. Ladies and gentlemen, this is Orange, and this is the famous Green. They are the BDX robots of Disney. Amazing Disney Research. Come on, you guys, let's wrap up. Let's go. Five things? Where are you going? I sit right here. Don't be afraid. Come here, Green, hurry up. What are you saying? No, it's not time to eat. It's not time to eat. I'll give you a snack in a moment. Let me finish up real quick. Come on, Green, hurry up, stop wasting time.

This is what we announce to you today: this is Blackwell, this is the platform. Amazing processors, NVLink switches, networking systems, and the system design is a miracle. This is Blackwell, and this, to me, is what a GPU looks like in my mind.