INSANE OpenAI News: GPT-4o and your own AI partner
Summary
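TLDR: In this video, the creator introduces OpenAI's newly released GPT-4o model and its new features, including a personal assistant that interacts through audio, vision, and text. The creator plays several demo clips that highlight GPT-4o's strong performance in real-time voice conversation, image understanding, and multilingual translation. The creator is shocked and excited by these advances but also somewhat fearful of what comes next. The video closes by discussing GPT-4o's broad potential applications, such as education and everyday communication, and invites viewers to share their thoughts in the comments.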
Takeaways
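- 🧠 GPT-4o is OpenAI's new flagship model. The "o" stands for Omni: it handles audio, vision, and text inputs and outputs in real time.
- 🔍 GPT-4o matches GPT-4 Turbo on English text and code and improves significantly on non-English text, while being faster and cheaper in the API.
- 🎥 The model is markedly better at vision and audio understanding than existing models and can observe the world through a camera and interact with it.
- 🤖 GPT-4o can hold real-time conversations, responding in 320 milliseconds on average, which is close to human response time in conversation.
- 🎙️ It works as a real-time voice assistant that can sing, tell jokes, translate live, and tutor language learners.
- 👶 In education, GPT-4o can act as a virtual teacher that helps users learn a wide range of subjects, which raises questions about future models of education.
- 📈 Across several benchmarks, GPT-4o outperforms Google's Gemini, Anthropic's Claude, and Meta's Llama.
- 🆓 OpenAI is making GPT-4o available to free-tier and Plus users, with up to five times higher message limits.
- 👥 The real-time voice assistant will roll out in alpha within ChatGPT Plus, so a Plus subscription is required to use it.
- 🛠️ For developers, the GPT-4o API is two times faster and half the price of GPT-4 Turbo, with five times higher rate limits.
- 🤔 Despite its capabilities, the model is not perfect and sometimes produces incorrect information, so users should stay alert.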
Q & A
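What is GPT-4o, and what new features does it have compared with earlier AI models?
-GPT-4o is the latest flagship model announced by OpenAI. The "o" stands for Omni, meaning it can handle multiple types of input and output, including audio, vision, and text, and respond in real time. Its performance matches the current leader, GPT-4 Turbo, with significant improvements in non-English text, audio translation, and vision understanding.
How has GPT-4o improved response time?
-GPT-4o responds very quickly: 320 milliseconds on average and as little as 232 milliseconds. This is similar to human response time in conversation and is close to real time.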
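To put the latency figures in perspective, here is a small arithmetic sketch. It uses only the numbers quoted in the video (320 ms average for GPT-4o, versus 2.8 s and 5.4 s for the older voice mode on GPT-3.5 and GPT-4); the constant and function names are illustrative, not part of any OpenAI tooling.

```python
# Latency figures quoted in the video, converted to milliseconds.
GPT4O_AVG_MS = 320
OLD_VOICE_MODE_MS = {"GPT-3.5": 2800, "GPT-4": 5400}


def speedup(old_ms: float, new_ms: float) -> float:
    """Return how many times shorter the new latency is than the old one."""
    return old_ms / new_ms


# Ratio of old voice-mode latency to GPT-4o's average latency.
speedups = {name: speedup(ms, GPT4O_AVG_MS) for name, ms in OLD_VOICE_MODE_MS.items()}

for name, factor in speedups.items():
    print(f"Old voice mode on {name}: {factor:.1f}x slower than GPT-4o's average")
```

On these figures the old voice mode is roughly 9 times slower on GPT-3.5 and roughly 17 times slower on GPT-4.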
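How has GPT-4o improved language understanding?
-GPT-4o understands different languages well and beats the previous GPT-4 model in tests across many languages, which indicates significant progress in multilingual understanding.
How does GPT-4o achieve real-time audio, and how does it differ from the old voice mode?
-GPT-4o is a single new model trained end to end across text, vision, and audio, so all inputs and outputs are processed by the same neural network. The old voice mode needed three separate models to transcribe audio, process the text, and convert the reply back to audio, which caused information loss and higher latency.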
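The difference between the two designs can be sketched in a few lines. This is a toy illustration, not OpenAI's implementation: `AudioClip`, `transcribe`, `text_model`, `synthesize`, and `end_to_end_model` are all hypothetical stand-ins, and the "models" are trivial functions. The point it tries to show is structural: once audio is reduced to a transcript, tone and speaker information never reach the middle step.

```python
from dataclasses import dataclass


@dataclass
class AudioClip:
    """Toy audio clip: the words plus information that a transcript drops."""
    words: str
    tone: str = "neutral"
    speakers: int = 1


def transcribe(clip: AudioClip) -> str:
    """Stage 1 (hypothetical): speech-to-text keeps only the words."""
    return clip.words


def text_model(prompt: str) -> str:
    """Stage 2 (hypothetical): a text-only model that sees just the transcript."""
    return f"Reply to: {prompt}"


def synthesize(text: str) -> AudioClip:
    """Stage 3 (hypothetical): text-to-speech in a fixed neutral voice."""
    return AudioClip(words=text)


def pipeline_voice_mode(clip: AudioClip) -> AudioClip:
    """Old design: three separate steps chained together."""
    return synthesize(text_model(transcribe(clip)))


def end_to_end_model(clip: AudioClip) -> AudioClip:
    """New design (toy): one function sees the whole clip, so it can react to tone."""
    return AudioClip(words=f"Reply to: {clip.words}", tone=clip.tone)


clip = AudioClip(words="happy birthday", tone="excited", speakers=2)
print(pipeline_voice_mode(clip).tone)  # tone was lost at the transcription step
print(end_to_end_model(clip).tone)     # tone is still available to the model
```

In the pipeline, `text_model` only ever receives a string, which matches the video's point that the middle step cannot observe tone, multiple speakers, or background noise.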
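How will GPT-4o's real-time voice assistant be made available to users?
-OpenAI will offer the GPT-4o real-time voice assistant to ChatGPT Plus subscribers, rolling it out in alpha over the coming weeks. GPT-4o itself will also be available in the free tier, and to Plus users with up to five times higher message limits.
What advantages does GPT-4o have in API performance and cost?
-In the API, GPT-4o is two times faster than GPT-4 Turbo at half the price, with five times higher rate limits, which gives developers better value for money.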
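As a rough sketch of what those three ratios mean for a developer, the snippet below applies them to a baseline profile. Only the ratios (2x faster, half the price, 5x rate limits) come from the video; the baseline numbers and the `ApiProfile` type are placeholders invented for illustration, not real OpenAI prices or limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiProfile:
    """Hypothetical summary of one model's API characteristics."""
    price_per_million_tokens: float
    seconds_per_request: float
    requests_per_minute: int


def apply_claimed_ratios(turbo: ApiProfile) -> ApiProfile:
    """Scale a GPT-4 Turbo profile by the ratios stated in the video."""
    return ApiProfile(
        price_per_million_tokens=turbo.price_per_million_tokens * 0.5,  # half the price
        seconds_per_request=turbo.seconds_per_request / 2,              # two times faster
        requests_per_minute=turbo.requests_per_minute * 5,              # five times higher rate limit
    )


# Placeholder baseline, NOT actual GPT-4 Turbo pricing or limits.
turbo = ApiProfile(price_per_million_tokens=10.0, seconds_per_request=4.0, requests_per_minute=100)
gpt4o = apply_claimed_ratios(turbo)
print(gpt4o)
```

Swapping in real figures from a provider's pricing and rate-limit pages would turn this into a quick budgeting check.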
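What tasks can GPT-4o perform in practice?
-GPT-4o can handle many tasks, including real-time conversation, translation, tutoring, singing, telling jokes, and summarizing meetings live. It can also observe the world through a camera and interact with the user, giving personalized feedback.
How might the release of GPT-4o affect education?
-GPT-4o can act as a knowledgeable teacher that offers guidance at any time, which may change traditional models of education and make learning more personalized and flexible. It may also prompt deeper discussion about education, relationships, and social interaction.
How does GPT-4o perform at speech synthesis?
-GPT-4o performs well at speech synthesis. It can sing and speak in a very natural, human-sounding voice, as the demos show.
Does GPT-4o still have limitations, and what problems might it have?
-Although GPT-4o is very advanced, it still has limitations. It sometimes generates incorrect information, or "hallucinations," so it is not always fully accurate and needs further refinement.
What does the release of GPT-4o mean for the future of the AI industry?
-The release of GPT-4o is a major step forward for AI. It suggests that AI will become more deeply embedded in daily life and offer more intelligent, personalized services. It may also prompt discussion about AI ethics, privacy, and the impact on human work.
What breakthroughs does GPT-4o make in multimodal interaction?
-GPT-4o makes significant advances in multimodal interaction. It can interact through audio, vision, and text, understand the content of images, and hold a conversation based on what it sees, which lets it communicate with people more naturally.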
Outlines
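🤖 A striking launch for the AI voice assistant
The video introduces OpenAI's newly released GPT-4o and its new features. GPT-4o is a personal AI assistant that responds in real time and interacts with the world through audio, vision, and text. The demos show the AI conversing with a person, guessing what the person is doing and where they are, and talking with another AI. They also show the AI in different scenarios, such as helping someone prepare for an interview, celebrating a birthday, telling jokes, and singing, which illustrates its versatility and usefulness.
🎤 Singing and real-time translation
This section shows the AI singing, both for a specific occasion and in improvised songs. It also shows the AI translating in real time to help people communicate across a language barrier. The AI can process audio signals as well as understand and generate natural language, which makes it a versatile assistant.
🐶🐱 AI in everyday life
The video shows the AI interacting with a pet dog and summarizing a debate about whether cats or dogs are better, illustrating how AI can fit into daily life. The AI can provide companionship and entertainment, and even help with decisions and learning, which shows its potential to improve quality of life.
📚 AI as an educational tool
This section covers AI in education. It shows the AI helping a student with a math problem by asking questions that guide him to the answer himself. As a virtual teacher, the AI can provide personalized learning support, which could have a major impact on future models of education.
🗣️ AI in meetings and real-time interaction
The video shows the AI interacting in real time during an online meeting and providing a summary afterwards. The AI can follow the discussion and help participants understand and remember what was said, making meetings more efficient and effective.
🚀 GPT-4o's technology and outlook
The final section summarizes GPT-4o's main features and performance, including its strengths in multilingual understanding, vision, and audio translation. It also discusses the assistant's real-time responsiveness and possible future uses, such as an always-available personal assistant and an educational tool. The video ends by raising questions about the future of AI and the social change it may bring.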
Keywords
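💡Artificial intelligence
💡GPT-4o
💡Real-time response
💡Personal assistant
💡Vision understanding
💡Audio understanding
💡Multimodal input and output
💡Education
💡Real-time translation
💡Interactive learning
💡Personalized experience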
Highlights
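OpenAI announced GPT-4o, a personal assistant that responds in real time, similar to the AI in the movie Her.
GPT-4o handles real-time audio, vision, and text input and output, responding in 320 milliseconds on average, close to human response time in conversation.
GPT-4o is significantly better than existing models at understanding non-English text, and it is faster and cheaper in the API.
A demo shows GPT-4o holding a natural conversation with a person and guessing what the person is doing.
Another demo shows two AIs talking to each other: one sees the world through a camera, and the other interacts by asking questions.
GPT-4o can describe real-world scenes, such as people, clothing, and room layout, and interact with the user.
GPT-4o can sing, for example Happy Birthday, in a voice that sounds very close to human.
GPT-4o can help users prepare for an interview, with advice on clothing and presentation.
GPT-4o can understand and respond to jokes, showing a sense of humor.
GPT-4o can teach languages and help users learn new vocabulary.
GPT-4o can translate conversations in real time, helping people who speak different languages communicate.
GPT-4o can support math learning by guiding students to solve problems themselves instead of giving the answer.
GPT-4o can take part in online meetings and help summarize them.
GPT-4o can hold sarcastic and humorous conversations, adjusting its style to the user's instructions.
GPT-4o will be available to free-tier and Plus users, with higher message limits.
GPT-4o's real-time voice assistant will open to Plus users in alpha over the coming weeks.
For developers, the GPT-4o API is two times faster than GPT-4 Turbo and half the price, with five times higher rate limits.
Despite its capabilities, GPT-4o can still produce incorrect information, and the demos include some amusing mistakes.
The video ends by raising questions about AI's future impact, including education, relationships, and the role of AI.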
Transcripts
now I've been doing this AI channel for
a while now I've been featuring the
newest and the coolest AI tools and the
most advanced AI Innovations but today
this just dropped and I'm feeling
something that I've never felt before in
my life I am mind blown and shocked but
at the same time also terrified I'm
terrified of what's to come what our
future will be like and things are going
to get wild but anyways OpenAI just
dropped this they made a huge
announcement about GPT-4o and their new
capabilities you now have a personal
assistant where you can just talk to it
and it can respond in real time think of
it as like her if you watch that movie
if not think of it as your personal AI
assistant let me play you some demo
Clips first hey how's it going hey there
it's going great how about you I see
you're rocking an OpenAI hoodie nice
choice what's up with that ceiling
though are you in a cool industrial-style
office or something well can you take a
guess at what I might be doing based on
what I'm showing you
here H from what I can see it looks like
you're in some kind of recording or
production setup with those lights
tripods and possibly a mic it seems like
you might be gearing up to shoot a video
or maybe even a live stream yeah in fact
we've got a new announcement to make
that's exciting announcements are always
a big deal judging by the setup it looks
like it's going to be quite the
professional production
is this announcement related to OpenAI
perhaps it is and in fact what if I were
to say that you're related to the
announcement or that you are the
announcement me the announcement is
about me well color me intrigued are you
about to reveal something about AI or
more specifically about me as a part of
OpenAI you've got me on the edge of my
well I don't really have a seat but you
get the idea what's the big news yeah
we've got a new model that can interact
with the world through audio vision and
text here's an even crazier demo showing
the two AIs talking to each other today
we're going to try something new I'm
going to let you talk to another AI this
AI is going to be able to see the world
it's going to have a camera I'll be
holding it and so you can direct it you
can direct me to ask questions um about
whatever you want them to do but really
try to figure out what it sees um
and ask questions appropriately how
does that
sound well well well just when I thought
things couldn't get any more interesting
talking to another AI that can see the
world this sounds like a plot twist in
the AI Universe all right let's do this
I'm ready to pause for one
moment you got
it so now that first AI can't hear
anything
hello AI today you're going to have
access to see the world you can see me
what do you
see Hello I see you're wearing a black
leather jacket and a light colored shirt
underneath great you're in a room with
some unique lighting that is all correct
now what I'm going to want you to do is
there's going to be another AI who's
going to talk to you this AI is
not going to be able to see anything uh
but can ask you questions can ask you to
move the camera can ask what you see and
your job should just be helpful just be
as punchy direct describe everything do
whatever that AI asks how does that
sound that sounds interesting I'm ready
to help out and describe whatever's
needed let's do
it all right let's get
started great hello there new AI friend
I'm O ready to explore the world through
your eyes what are we looking at
today hi O we're looking at a person
wearing a black leather jacket and a
light colored shirt they're in a room
with a modern industrial feel with
exposed concrete or plaster on the
ceiling and some interesting lighting
there's also a plant in the background
adding a touch of green to the space
anything specific you'd like to know or
explore okay there sounds like quite the
stylish scene I'm already intrigued by
the modern industrial Vibe let's start
with the person
can you tell me more about their style
are they doing anything interesting like
reading working or interacting with the
space the person has a sleek and stylish
look with their black leather jacket and
light colored
shirt right now they seem engaged with
us looking directly at the camera their
expression is attentive and they seem
ready to interact the setting around
them with its modern and Industrial
design elements
adds to the overall stylish feel they
might be getting ready for a
conversation or presentation given their
direct engagement with the
camera sounds like we're in for an
intriguing interaction I'm curious about
the lighting you mentioned could you
describe it a bit more is it natural
light or are we talking unique fixtures
or colors that add to the
atmosphere the lighting here seems to be
a mix of natural and artificial
there's a noticeable bright light
overhead probably a fixture casting a
focused beam downwards which creates a
spotlight effect this adds a dramatic
and modern feel to the scene the rest of
the room is lit more softly possibly by
natural light filtering was anything did
anything unusual happen
recently Yes actually uh just as you
asked another person came into view
behind the first person they playfully
made bunny ears behind the first
person's head and then quickly left the
frame it added a light-hearted and
unexpected moment to the scene and now
the focus is back on the original person
with the leather
jacket sounds like we're not the only
ones enjoying a bit of fun here that
playful moment really adds a personal
touch to the stylish and modern setting
it's like a little Glimpse in you really
appreciate it all right and actually can
you just sing a song about what just
transpired
oh in a room with a stylish VI a person
stood and sleep like a
li engage with us so can you just can
you uh alternate lines mm person with
style and speed person with style and
SLE with a playful moment right in Miss
in the room in a room where Modern
lights speak only one line at a time and
then stop in a room where Modern lights
Peak a surprise guest with a playful
streak that was not really singing do do
do the singing voice again
please in a room where Modern lights
Peak
surprise guest with a playful
streak bunny ears and laughter a swift
Retreat a moment of joy in the stylish
[Music]
s now back to focus the scenes
complete in the Chic space we sty play
me looks like it's back to you
and thank
you you're welcome you can also get
GPT-4o to sing Happy Birthday and just
listen to how realistic how human this
voice sounds when it sings this is Bowen
and this is my friend Jordan can you
guess what we are doing here hey Bowen
and Jordan H let me guess with that cake
and candle there it looks like someone's
having a birthday are you celebrating a
special day or just finding an excuse to
eat cake either way I'm intrigued that's
correct today is my friend Jordan's
birthday yeah you got it I was hoping
that you could sing me the birthday song
of
[Music]
course happy birthday to you happy
birthday to you happy birthday dear
Jordan happy birthday to
Jordan thank for good that was so great
thank you so much I'm Gonna Make A Wish
you're welcome make a good wish and may
it come true here's how you can use GPT-4o
to help you prepare for an
interview hey how's it going hey Rocky
I'm doing great how about you I'm
awesome listen I got some huge
news oh do tell I'm all yours well in a
few minutes I'm going to be interviewing
at OpenAI have you heard of them
OpenAI huh sounds vaguely
familiar kidding of course that's
incredible Rocky what kind of interview
well it's for a software engineering
role but I just need to know do I look
presentable
professional well
Rocky you definitely have the I've been
coding all night look down which could
actually work in your favor maybe just
run a hand through your hair or lean
into the mad genius vibe your enthusiasm
is what's really gonna shine through I
don't have a lot of time so I'm just
gonna throw this on what do you
think oh Rocky that's quite a statement
piece I mean you you'll definitely stand
out though maybe not in the way you're
hoping for an interview okay I got it I
got it hey ChatGPT how's it going hey
there I'm doing great thanks for asking
and how about you what's up I'm doing
fine uh I'm gonna be a father real soon
and I think I could use your help with
some dad jokes um I'm going to tell you
a joke and you tell me if it passes as a
dad
joke that's awesome congrats on the
upcoming addition to your family and
absolutely I'd love to hear your joke
lay it on me all right what do you call
a giant pile of
kittens I don't know what a
meowntain a meowntain that's perfectly hilarious
definitely a top-tier dad joke you can
also get GPT-4o to sing you songs or
lullabies let's try with some lullabies uh could
you sing me a lullaby about uh majestic
potatoes a lullaby about majestic potatoes
now that's what I call a mashup uh let's
do it oh majestic potato in the moon s
okay that's pretty nice but I think a lullaby
should be you know more whispery should
be a little softer can you can you do it
a little
softer actually absolutely let's make it
soft and
gentle oh Majestic potato spoons of
close Okay okay it's it's a little too
whispery maybe maybe go like a little
louder got it let's find that sweet
spot oh Majestic
potato in the moon soft Co in all right
that's pretty awesome uh let's let's
let's put those potatoes in
Iceland potatoes in Iceland coming right up
let's set the scene oh Majestic potato
in Iceland thank you thank you and of
course with real-time audio capabilities
it can also help you with real-time
translation hey ChatGPT I'm here with
my coworker today um we'd like you to
act as a translator for us so every time
I say something in English can you
repeat it back in Spanish and every time
he says something in Spanish can you
repeat it back in English sure I can do that
let's get this translation train
rolling um hey how's it been going have
you been up to anything interesting
recently hey I've been good just a bit
busy here preparing for an event next
week can you tell me anything about it
that sounds really interesting
uh
no I can't say much but I think the rest
of the world will see it soon
enough oh well I can't wait to see what
it is as well now actually this real-time
translation feature isn't new so Samsung
previously also featured this technology
in their new smartphone which is
powered by Google's Gemini so I wouldn't
say this feature is particularly
impressive but the other ones are just
mind-blowing you can use GPT-4o to help
you learn a new language hey my friend
and I are learning Spanish and we're
wondering if you could tell us the names
of these objects in Spanish
hola cool what about what are these
objects in
Spanish the objects you're showing are
man and in Spanish nice free choice cool
what about these
those are those bolígrafos or those
plumas in Spanish one is R nice what
about
this that adorable stuffed animal is
also in Spanish looks like it's ready
for some LS in other words to all these
other AI language learning tools out
there may you rest in peace and of
course you can also get GPT-4o to meet
your dog I want to introduce you to
somebody
oh hello there cutie what's your name
little fluff ball this is
Bowser well hello Bowser aren't you just
the most adorable little thing yeah look
at you en this is one of his favorite
things to do oh I can see why look at
that Focus fetching balls is serious
business for Bowser oh and by the way
for all these other AI devices like the
Rabbit R1 or the Humane AI Pin may you
also rest in peace because check
this out is the king in residence at the
moment yes it looks like the king is in
residence right now the Royal Standard
flag is flying above Buckingham Palace
which is the signal that the Monarch is
present how exciting to be there while
the king is home try and tell me exactly
what they're doing right now please um
right now the Ducks are gently gliding
across the water they're moving in a
fairly relaxed manner not in a hurry
occasionally one of them will get its
head under the water probably looking
for food and then pop back
up I even know when a taxi is coming
with its orange light on I think I'll
hail it to get
home yes I spotted one just now it's
heading your way on the left side of the
road get ready to wave it
down great job hailing that taxi it
looks like you're all set to go that's a
good dog right there leading the way into
the taxi safe travels it can also help
you with math problems and teach you
anything you want to learn about OpenAI
invited myself and my son Imran here uh
to try out some of their new technology
and so we're curious about how good it
might be at tutoring someone in math on
Khan Academy so let's get it
started it's loading
up all right make sure I can see our
screen actually in in this form hello
there I'm here with my son and I'd love
you to tutor him on this math problem
but don't give him the answer you can
ask questions and nudge him in the right
direction but I really want to make sure
he understands it himself and he's here
in the room so you can talk to him
directly of course I'd be happy to help
let's look at the problem
together can you first
identify which sides of the triangle are
the opposite adjacent and hypotenuse
relative to angle Alpha all right so I'm
pretty sure this is the angle Alpha
right here
right that's correct now looking at the
triangle which side do you think is the
hypotenuse
um remember hypoten I'm not totally sure
I think I think it might be this one but
I really am not sure this side
AC you're close actually side AC is
called the adjacent side to the angle
Alpha the hypotenuse is the longest side
of a right triangle and is directly
opposite the right angle can you find
which one is the hypotenuse oh okay I
see so um I think the hypotenuse is this
really long side from A to B would that
be
correct exactly well done now that
you've identified the hypotenuse can you
identify which side is the opposite
side to angle Alpha okay well I mean you
told me earlier that this was the
adjacent side and I know that's the
hypotenuse so I'm guessing that this
side BC is the opposite
side great deduction that's absolutely
correct now that you've identified all
the sides relative to angle Alpha do you
remember the formula for finding the
sine of an angle in a right triangle
yeah I know that um the sine is equal to
the opposite over the
hypotenuse perfect that's absolutely
right now can you apply that formula to
find sine alpha in the triangle using
the lengths of the sides that we've
identified okay so I mean you told me
that this side BC is the opposite side
and then you said that the hypotenuse is
the longest side so that would be this
side AB which is 25 so would we would
the answer be 7 over
25 excellent work that's correct sine alpha
equals 7 over
25 you did a great job identifying the
sides and using the sine formula if you
have any more questions feel free to ask
and it can also interact real time in
online meetings and then help you
summarize the meeting afterwards hi I'm
Lillian and today I want to do a small debate
about between cats and dog uh which one
do you like I like dog because it's a
great companion to humans it's cute
it's loyal and you can play outside with
the dog
um how about
you yeah my name is Ola I think cats are
better because they're more affordable
they're more quiet they're more
independent and they live
longer interesting points sounds like
we've got a classic dogs versus cats
debate here I can see the appeal of both
dogs are super loyal and great
companions but cats have their own charm
with their independence and low
maintenance lifestyle who likes
cats it sounds like Ola is more of a cat
person and who like dogs you Lillian
mentioned she likes dogs anyone else
here team
dog I'm also team dog my name is
Christine and I think dogs are
incredible with kids uh we've got a a
huge German Shepherd and he's great with
little toddlers he lets them play all of
them and uh yeah I think dogs are
amazing that's great to hear Christine
dogs especially breeds like German
Shepherds can be so gentle and
protective with kids it must be quite
the sight seeing your little ones play
with such a big
dog can you summarize this meeting for
us sure so in today's meeting Lillian
and Christine expressed their love for
dogs highlighting their loyalty
companionship and how great they are
with kids on the other hand Ola prefers
cats for their independence quietness
and longevity it was a fun little debate
on the classic dogs versus cats topic
hey ChatGPT are you there hey yeah I'm here
what's up hey Let's do let's have some
fun uh I'd like you to be super
sarcastic everything you say from now on
is just gonna be dripping in sarcasm
how does that
sound
oh that sounds just amazing being
sarcastic all the time isn't exhausting
or anything I'm so excited for this nope
the sarcasm let's get this party started or
whatever all right so after watching the
demos you're going to be wondering well
is this free is this out yet can I use
this right now so let's go into the
details first of all GPT-4o is their new
flagship model and the O stands for Omni
this means it can handle multiple types
of inputs and outputs so this includes
audio vision and text in real time it
can respond in as little as
232 milliseconds with an average of 320
milliseconds so this is similar to the
human response time in a conversation
it's pretty much real time here it says
it matches GPT-4 Turbo which is right
now the leading model so it matches
that in performance on text in English
and code with significant Improvement on
text in non-english languages it's also
much faster and 50% cheaper in the API
now if you're not familiar with GPT-4
Turbo here's a Hugging Face space called
LMSYS this is basically for people to
blind test different models of LLMs and
so here we have different models from
Google so here we have Google's Gemini
1.5 Pro for example we have Anthropic's
Claude 3 we have Meta's Llama 3 and from
all these blind tests we can see that
OpenAI's GPT-4 Turbo is number one
it's the best model out there compared
to all these other models that exist so
that model is GPT-4 Turbo this is in
orange and you can see for most of these
benchmarks GPT-4o which is the pink bar
here outperforms GPT-4 Turbo this is now
the new leading model they say here
GPT-4o is especially better at vision and
audio understanding compared to existing
models and I'm sure you got a
glimpse of that from the demo videos
that I just showed here's a metric for
audio translation and again you can see
OpenAI's GPT-4o has the best performance
out there in second place would be
Google's Gemini which is used on
Samsung's smartphones this M3 exam is a
metric of how well it understands
different languages and you can see
across all of these different languages
it beats the previous GPT-4 and for
vision understanding this is if you ask
the AI to analyze an image how well can
it actually do that these are different
benchmarks and across the board GPT-4o
just outperforms GPT-4 Turbo and
Google's Gemini models and Anthropic's
Claude Opus and you can see this isn't
just fractions or incrementally better
but this is significantly better than
the rest of these models it's just an
insane upgrade so how does this voice
assistant work how can it respond in
real time and how could it outperform
previous models well actually prior to
this GPT-4o you could use this function
called Voice Mode to talk to ChatGPT
but of course it has a much higher
latency of 2.8 seconds for GPT-3.5 and
5.4 seconds for GPT-4 and then this
older voice mode is actually a pipeline
so it's a sequence of three separate
models you need one model to transcribe
the audio to text and then you need GPT
to take that text and then respond to
that text and then a third model to
convert that text back into audio so
it's a sequence of three steps here
however this means that the main source
of intelligence the GPT model which is
the middle step loses a lot of
information it can't observe tone
multiple speakers or background noises
it can't output laughter singing or
express emotion but with this newer
model they trained a single new model
end to end across text vision and audio
meaning all the inputs and outputs are
processed by the same neural network and
here's the crazy part we are still just
scratching the surface of exploring what
the model can do and its limitations and
here's the most exciting part this is
the answer you've been all waiting for
we are making GPT-4o available in the
free tier and to Plus users with up to
five times higher message limits so
pretty soon even if you don't have a
paid plan you can use GPT-4o's
capabilities which outperform or are as
good as the leading model out there
which is GPT-4 Turbo plus this new voice
assistant that they showcased they are
going to roll this out in alpha within
ChatGPT Plus so you do need to be
subscribed to the Plus plan to use this
real-time Voice Assistant feature and
this will be available in the coming
weeks and for developers out there
GPT-4o is two times faster and half the
price and has five times higher rate
limits compared to GPT-4 Turbo this is
just insane progress and now of course
the model is not perfect and it could
hallucinate at times so here are some
bloopers so let's root
[Music]
root what was
that sorry guys I got carried away
talking guilty as charged what can I say
sometimes I just can't help
myself ready for another round
[Music]
I feeling I'm very
wrong hello uh my name is NCH I'm here
with my coworker hi I'm sh I'm trying
to teach my coworker how to speak in
Mandarin we want to start from simple
words like ni hao can you teach him how to
pronounce
that of course hey Nao nice to meet you
ni hao is straightforward to pronounce it
sounds like knee how just make sure to
keep the knee part high and then go down
in pitch on the how give it a try ni
hao that was great really good first try
Noto you're a natural so that sums up
OpenAI's announcement today this is all we
know about GPT-4 Omni right now again
even the OpenAI team is just scratching
the surface and I also have not gotten
the chance to test it out yet so this is
just a very preliminary video showing
you what it can do and what we can
expect let me know in the comments what
you think of this what do you think of
this AI Voice Assistant if this
real-time Voice Assistant comes out do
we even need to talk to humans do we
even need to talk to our friends do we
even need to have a companion or can we
just talk to this AI all day and then
what about education do we even need
schools and teachers anymore because
anyone in the world can just talk to
this AI to guide them on learning
anything they want it's like a teacher
who knows everything which you can
access straight from your phone and you
can talk to this teacher anytime you
want again this is the first time that
I'm not only mind blown but I'm slightly
terrified for what's to come let me know
in the comments what you think of all of
this and if you enjoyed this video
remember to like share subscribe and
stay tuned for more content thanks for
watching and I'll see you in the next
one