WHISTLEBLOWER Reveals Complete AGI TIMELINE, 2024 - 2027 (Q*, QSTAR)

TheAIGRID
4 Mar 2024 · 45:05

Summary

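TLDR This video walks through a document, purportedly leaked from inside OpenAI, suggesting that OpenAI plans to achieve artificial general intelligence (AGI) by 2027. The document claims OpenAI has been training a 125 trillion parameter multimodal model that reportedly finished training in December 2023. Although parts of the document are speculative, it offers a fair amount of evidence and detail, raising the suspicion that OpenAI may really be pursuing AGI in secret. The video explores the questions and speculation the document raises.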

Takeaways

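  • 😃 The video says OpenAI has a secret plan to achieve AGI (artificial general intelligence) by 2027.
  • 🤔 The document claims OpenAI began training a 125 trillion parameter multimodal model in August 2022 and finished training it in December 2023.
  • 🔍 The document suggests OpenAI has long been working toward a human-brain-sized AI model (about 100 trillion parameters) as its route to AGI.
  • 📈 According to the Chinchilla scaling laws, even if a 100 trillion parameter model falls slightly short of human performance, training it on far more data could push it past human level.
  • 😲 AI leaders such as Hinton and Hassabis have recently warned that AGI is arriving faster than expected.
  • 🕵️ Microsoft's $10 billion investment in OpenAI could provide the compute needed to train an AGI system.
  • 💰 Sam Altman is reportedly trying to raise $7 trillion, possibly to cover the enormous compute cost of training a large-scale AGI system.
  • ⚠️ There have been calls to pause the training of AI systems more powerful than GPT-4, including the planned GPT-5.
  • 🔐 OpenAI plans to solve the "superalignment" problem by 2027 so that AGI can be released safely.
  • 🔜 The video suggests a new GPT model will be released every year, and that the model after GPT-7 could be an AGI system.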

Q & A

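  • What key information does the document about OpenAI's plan to create AGI by 2027 reveal?

    -The document claims that in August 2022 OpenAI began training a 125 trillion parameter multimodal model, whose first stage was named Q*. The model finished training in December 2023, but its launch was cancelled because of high inference cost. The document also suggests that OpenAI's long-term plan is to release a new model each year and ultimately reach AGI in 2027.

  • What are the "Q*", "Arrakis" and "Gobi" mentioned in the document?

    -Q*, Arrakis and Gobi are mentioned as names of models developed by OpenAI. Development of Arrakis was reportedly dropped because the model did not run efficiently enough. The names are presented as part of OpenAI's internal plans and as indications of the company's research direction and progress.

  • Why would OpenAI cancel or rename the originally planned GPT-5?

    -The document does not explain the specific reason in detail. It implies strategic adjustments and technical challenges during development, possibly related to inference cost, performance expectations, or technical breakthroughs.

  • What effect did Elon Musk's lawsuit against OpenAI have?

    -According to the document, Musk's lawsuit delayed parts of OpenAI's plan, in particular the development and release of the originally planned GPT-6 and GPT-7. The lawsuit rests mainly on the concern that OpenAI has strayed from its open-source goals and from its commitment to make advanced technology available to the public.

  • What is AGI, and how does it differ from current AI technology?

    -AGI (artificial general intelligence) refers to AI that can perform any intellectual task at a level comparable to human intelligence. The main difference from current AI, which usually targets specific tasks, is generality and flexibility: an AGI could understand and carry out a wide range of complex tasks without task-specific training.

  • How does the document define human-level AGI, and what could its arrival mean for society?

    -The document defines human-level AGI as an AI that can do any intellectual task a smart human can. Achieving it could transform society, including the economy, employment, education and technological development, while also raising broad concerns about safety, ethics and social impact.

  • Why is parameter count described as a key predictor of AI performance?

    -Parameter count is treated as a key predictor because, in general, the more parameters a model has, the better it can process and understand complex data. Parameters are compared to synapses in a biological brain and used to estimate a model's complexity and potential level of intelligence.

  • How does OpenAI use parameter count and data volume to approach human-brain performance?

    -By increasing the parameter count of its models and training them on very large amounts of data. The idea presented is that a model with roughly as many parameters as the brain has synapses, trained on massive data, could simulate human cognitive processes.

  • What are the "Chinchilla laws" mentioned in the document, and how do they affect model training?

    -The Chinchilla laws come from DeepMind research showing that earlier training practice used data inefficiently, and that training on more data can significantly improve model performance. The finding prompted OpenAI and other labs to rethink their training strategies and use more data.

  • Why is AI research described as dynamic and fast-moving?

    -Because a large volume of new research and technical breakthroughs is published every day, continually pushing the limits of the technology and changing our understanding of what level of intelligence is achievable. Researchers, developers and other stakeholders have to keep following the latest developments to adapt.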

Outlines

00:00

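📄 Introduction

The video opens by discussing a document that claims to reveal OpenAI's plan to create artificial general intelligence (AGI) by 2027. It stresses that the document contains a lot of speculation and is not entirely factual, but highlights key claims, such as OpenAI starting to train a 125 trillion parameter multimodal model in August 2022 and cancelling its launch because of high inference cost. The presenter asks viewers to treat the information with skepticism while expressing interest in OpenAI's future plans.

05:02

🔍 A closer look at the document

A deeper look at the plans described in the document, including the claims that GPT 4.5 was renamed GPT 5 and that the original GPT 5, planned for release in 2025, was cancelled. The video discusses some of the models mentioned, such as Arrakis and Gobi, and how they relate to published articles about OpenAI. The document also says the release of GPT 6 (now called GPT 7) was delayed by Elon Musk's lawsuit, which leads to a discussion of OpenAI's open-source goals.

10:03

🚀 Moving toward AGI

This section discusses the potential of a 100 trillion parameter model and whether AI performance can be predicted from parameter count. It cites an analysis showing that performance on a range of tasks improves as parameter count grows, with diminishing returns. It also covers how more training data could make up for the performance shortfall of a 100 trillion parameter model, and the engineering technique OpenAI is said to use to bridge that gap.

15:04

🧠 Further insight into the models

A discussion of OpenAI's development plans for GPT 4, including misunderstandings and clarifications about a 100 trillion parameter model. It covers OpenAI getting more performance out of smaller models and the difficulty of training models to act on the internet and handle harder problems. Early leaks about GPT 4, and how OpenAI responded to them, are also discussed.

20:05

🔧 Model testing and expectations

Covers claims that GPT 4 was being tested in October and November 2022 and how those tests line up with earlier leaks. The document also presents OpenAI's official position on a 100 trillion parameter GPT 4 and leaks about the model's video capabilities. The section sets out expectations about AI performance, especially in video generation.

25:06

🤖 Robotics and future outlook

This part discusses the potential of applying AI to robotics, particularly the importance of visual data and the possibility of achieving complex tasks by training large models. It mentions Tesla's decision to rely entirely on vision for its self-driving technology, and how large models trained on visual and language data can handle robotics tasks. It underlines the trend of reaching higher AI performance by training larger models.

30:06

🌐 The road to AGI and superalignment

An analysis of OpenAI's long-term plans and goals for creating AGI, especially reaching closer to human-level performance by increasing parameter count and training data. The section also covers AI research directions and the superalignment problem, as well as well-known AI researchers' concerns about the pace of AI development and its potential risks.

35:07

💡 Overall picture and future outlook

The final section looks at OpenAI's overall strategy for creating AGI, including the expectation of one new GPT model per year until 2027. It discusses OpenAI's expectations for superalignment and how releasing progressively more capable models gives society and the technical ecosystem time to adapt to AGI. It also reflects on the uncertainty around AGI development and possible future trends.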

Mindmap

Speculation on internal knowledge and strategic planning
OpenAI's responsibility in managing AGI's development
Open letters urging a pause in AI development
AI leaders expressing concerns over rapid AI advancement
Speculation on compute requirements and Microsoft's investment
Impact of chinchilla scaling laws on model performance
Extrapolation of performance based on parameter count
Comparison of brain's synapses to AI parameters
Implications of reaching human-level AGI and beyond
Differentiating between emerging, competent, expert, and virtuoso AGI
Trademarking of GPT names and its implications on AGI's naming convention
Elon Musk's lawsuit and its influence on development timeline
Progression from GPT-4 to potential GPT-8 as AGI
Yearly model release plan
Introduction of terms like 'qstar', 'Gobi', 'Arrakis', implying development stages
Plan involves creating a 125 trillion parameter multimodal model
Reminder to take the analysis with caution due to speculative nature
Critical view on the reliability of the document
The Role of OpenAI and its Secrecy
Calls for Caution and Regulation
Scaling Laws and Model Training
Parameter Size and Intelligence Prediction
Levels of AGI
Impact of External Factors on AGI Development
Iterative Releases Leading to AGI
OpenAI's Secret Plan for AGI by 2027
Speculation and Unverified Information
Societal and Ethical Implications
Technical Aspects and Predictive Models
GPT Model Evolution and AGI Timeline
Introduction to the Document
Analysis of OpenAI's Speculated AGI Development by 2027

Keywords

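💡Artificial General Intelligence (AGI)

Artificial general intelligence refers to an AI system able to perform the full range of intellectual tasks a human can. According to the video, OpenAI's goal is to develop a human-brain-scale AGI model by 2027, with human-like reasoning and learning abilities. The video says OpenAI plans to iterate through the GPT model series until GPT-8, or a similar 125 trillion parameter model, reaches AGI level.

💡GPT (Generative Pre-trained Transformer)

GPT stands for Generative Pre-trained Transformer and refers to the series of large language models developed by OpenAI. The video focuses on the development of GPT-4, 5, 6 and 7, treating them as key steps toward AGI. According to the video, OpenAI plans to release a new GPT system each year, gradually increasing parameter count and capability, until an AGI-level GPT-8.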
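The one-model-per-year reasoning in the video can be written out as simple arithmetic. The sketch below is only an illustration of that speculation, assuming GPT-4's 2023 release as the starting point and a strict yearly cadence; the function name `projected_release_year` is hypothetical and nothing here reflects a confirmed OpenAI schedule.

```python
# Hypothetical projection of the video's speculation:
# one GPT release per year, starting from GPT-4 in 2023.
FIRST_MODEL = 4
FIRST_YEAR = 2023


def projected_release_year(model_number: int) -> int:
    """Year GPT-<model_number> would ship under a strict one-per-year cadence."""
    return FIRST_YEAR + (model_number - FIRST_MODEL)


# Build the speculative timeline for GPT-4 through GPT-8.
timeline = {f"GPT-{n}": projected_release_year(n) for n in range(4, 9)}

for name, year in timeline.items():
    print(name, year)
```

Under these assumptions GPT-8 lands in 2027, which is the coincidence the video points to. Any change in cadence (for example two releases in one year) breaks the projection.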

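💡Parameter count

Parameter count is the number of trainable parameters in a neural network. The video stresses that increasing parameter count is one of the key ways to improve AI performance. It cites an analysis suggesting that AI performance reaches human level when the parameter count matches the scale of the human brain (roughly 100 trillion synapses). OpenAI's goal is therefore presented as an AGI system with about 100 trillion parameters.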
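The brain-scale figure comes from back-of-the-envelope arithmetic quoted in the video: about 100 billion neurons with roughly 1,000 connections each. The sketch below reproduces that arithmetic and compares it with the GPT-2 and GPT-3 parameter counts mentioned in the transcript. The function name `estimated_synapses` is hypothetical, and treating one parameter as one synapse is the document's analogy, not an established equivalence.

```python
# Rule of thumb quoted in the video: about 1,000 connections per neuron.
CONNECTIONS_PER_NEURON = 1_000

# Parameter counts quoted in the transcript.
GPT2_PARAMS = 1_500_000_000      # 1.5 billion
GPT3_PARAMS = 175_000_000_000    # 175 billion


def estimated_synapses(neurons: int) -> int:
    """Estimate synapse count as neurons times connections per neuron."""
    return neurons * CONNECTIONS_PER_NEURON


# 100 billion neurons gives the commonly cited 100 trillion synapses.
human_synapses = estimated_synapses(100_000_000_000)

# How much larger a brain-scale model would be than GPT-3.
scale_up = human_synapses / GPT3_PARAMS

print(f"human synapses: {human_synapses:.2e}")
print(f"GPT-3 to brain-scale multiplier: {scale_up:.0f}x")
print(f"GPT-2 to GPT-3 multiplier: {GPT3_PARAMS / GPT2_PARAMS:.0f}x")
```

The multiplier works out to about 571, which the document rounds to "600 times larger".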

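💡Compute

Compute is the processing power needed to train large AI models, including specialized hardware such as GPUs and TPUs. The video notes that larger parameter counts require vast amounts of compute and mentions OpenAI's ties to companies such as Cerebras as one way of obtaining it. The video also suggests that the $7 trillion Sam Altman is reportedly seeking may be related to compute for the AGI project.

💡Data

Data is the indispensable fuel for training AI models. The video cites Sam Altman's view that there is enough data on the internet to train an AGI system. It also introduces the "Chinchilla scaling laws", the idea that training on far more data can make up for the gap between a model's parameter count and that of the human brain, improving performance.

💡Multimodal

Multimodal refers to training AI models on several kinds of data together, such as text, images, video and audio. The video suggests the AGI system OpenAI plans to build will be multimodal and able to handle the many forms of data found on the internet, in sharp contrast to early GPT models trained on text alone.

💡Superalignment

Superalignment refers to the techniques and methods for ensuring that highly intelligent AI systems act in line with human intentions and values. The video says OpenAI plans to solve the "superalignment" problem by 2027 to keep AGI systems safe and controllable. This is seen as one of the key obstacles to AGI.

💡Leaks

The video repeatedly refers to "leaks" about OpenAI's future AI plans. These come from various sources, including insiders, researchers and entrepreneurs, and describe details of the large models OpenAI is said to be developing, such as parameter counts, training progress and expected release dates. The video concedes the leaks may not be fully accurate but argues they line up to some extent with OpenAI's official statements.

💡Scaling laws

Scaling laws describe the relationship between a model's performance and its parameter count, compute and training data. The video says OpenAI's original scaling-law estimates were flawed and underestimated the data and compute needed for AGI. DeepMind's later "Chinchilla scaling laws" research changed that picture, so OpenAI adjusted its plans and decided to greatly increase compute and training data.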
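To give a sense of the scale involved, here is a rough sketch of what a "compute-optimal" 100 trillion parameter model would imply. It relies on two assumptions that are not stated in the video: the widely quoted Chinchilla rule of thumb of about 20 training tokens per parameter, and the common approximation that training costs about 6 FLOPs per parameter per token. The function names are hypothetical and the result is an order-of-magnitude illustration, not a figure from OpenAI or DeepMind.

```python
# Assumption (not from the video): Chinchilla rule of thumb, ~20 tokens per parameter.
TOKENS_PER_PARAM = 20
# Assumption (not from the video): training compute C is roughly 6 * N * D FLOPs.
FLOPS_PER_PARAM_TOKEN = 6


def compute_optimal_tokens(params: int) -> int:
    """Training tokens suggested by the ~20 tokens-per-parameter rule of thumb."""
    return TOKENS_PER_PARAM * params


def training_flops(params: int, tokens: int) -> int:
    """Approximate training compute using C = 6 * N * D."""
    return FLOPS_PER_PARAM_TOKEN * params * tokens


params = 100 * 10**12                      # the 100 trillion parameters discussed
tokens = compute_optimal_tokens(params)    # tokens under the rule of thumb
flops = training_flops(params, tokens)     # approximate training FLOPs

print(f"tokens: {tokens:.1e}")
print(f"training FLOPs: {flops:.1e}")
```

Under these assumptions the model would need on the order of 2 quadrillion tokens, which helps explain why the video links the reported fundraising to compute and data requirements.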

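💡Risk awareness

As AGI draws closer, the video shows prominent figures in AI growing more cautious about its potential risks. Demis Hassabis of DeepMind and Geoffrey Hinton, among others, have issued warnings and urged the industry to proceed carefully. The video also mentions the open letter calling for a 6-month pause on training systems more powerful than GPT-4, and Elon Musk's lawsuit against OpenAI. Together these reflect a mix of worry and anticipation about superintelligent AI.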

Highlights

There was a recent document that apparently reveals OpenAI's secret plan to create AGI by 2027.

The document states that OpenAI started training a 125 trillion parameter multimodal model called 'qstar' in August 2022, which finished training in December 2023 but the launch was cancelled due to high inference cost.

Multiple AI researchers and entrepreneurs claim to have had inside information about OpenAI training models with over 100 trillion parameters, intended for AGI.

OpenAI's president Greg Brockman stated in 2019 that their plan was to build a human brain sized model within 5 years to achieve AGI.

AI leaders like Demis Hassabis and Geoffrey Hinton have recently expressed growing concerns about the potential risks of advanced AI capabilities.

After the release of GPT-4, the Future of Life Institute released an open letter calling for a 6-month pause on training systems more powerful than GPT-4, including the planned GPT-5.

Sam Altman confidently stated that there is enough data on the internet to create AGI.

OpenAI realized their initial scaling laws were flawed and have adjusted to take into account DeepMind's 'Chinchilla' laws, which show vastly more data can lead to massive performance boosts.

While a 100 trillion parameter model may be slightly suboptimal, OpenAI plans to use the Chinchilla scaling laws and train on vastly more data to exceed human-level performance.

Microsoft invested $10 billion into OpenAI in early 2023, providing funds to train a compute-optimal 100 trillion parameter model.

An OpenAI researcher's note reveals they were working on preventing an AI system called 'qstar' from potentially destructive outcomes.

The document theorizes that OpenAI plans to release a new model each year until 2027, aligning with their 4-year timeline to solve the 'super alignment' problem for safe AGI release.

GPT-7 is speculated to be the last pre-AGI model before GPT-8 achieves full AGI capability in 2027.

Sam Altman mentioned OpenAI can accurately predict model capabilities by training less compute-intensive systems, potentially forecasting the path to AGI.

Altman is reportedly trying to raise $7 trillion, likely to fund the immense compute required for training a brain-scale AGI model using the Chinchilla scaling laws.

Transcripts

00:00

so there was a recent document that

00:02

actually apparently reveals OpenAI's

00:05

secret plan to create AGI by 2027 now

00:08

I'm going to go through this document

00:10

with you Page by Page I've read it over

00:12

twice and there are some key things that

00:13

actually did stand out to me so without

00:16

further Ado let's not waste any time and

00:18

of course just before we get into this

00:19

this is of course going to contain a lot

00:23

of speculation remember that this

00:24

document isn't completely 100% factual

00:27

so just take this video with a huge grain of

00:30

salt so you can see here the document

00:32

essentially says revealing OpenAI's

00:34

plan to create AGI by 2027 and that is a

00:37

rather important date which we will come

00:39

back to if we look at this first thing

00:41

you can see there's an introduction okay

00:42

and of course remember like I said there

00:44

is a lot of speculation in this document

00:46

there are a lot of different facts and

00:47

of course like I said anyone can write

00:49

any document and submit it to um you

00:51

know Twitter or Reddit or anything but I

00:53

think this document does contain a

00:55

little bit more than that so it starts

00:57

out by stating that in this document I

00:59

will be revealing information I have

01:01

gathered regarding OpenAI's delayed

01:03

plans to create human level AGI by 2027

01:07

not all of it will be easily verifiable

01:09

but hopefully there's enough evidence to

01:11

convince you summary is basically that

01:13

OpenAI has started training a 125

01:16

trillion parameter multimodal model in

01:19

August of 2022 and the first stage was

01:22

Arrakis also called qstar and the model

01:24

finished training in December of 2023

01:27

but the launch was cancelled due to the

01:29

high inference cost

01:30

and before you guys think it's just

01:31

a document with like just words I'm going

01:33

to show you guys later on like all of

01:35

the crazy kind of stuff that is kind of

01:38

verifiable that does actually um line up

01:40

with some of the stuff that I've seen as

01:42

someone that's been paying attention to

01:43

this stuff so this is literally just the

01:45

introduction um the juicier stuff does

01:47

come later but essentially they actually

01:49

talk about the and this is just like an

01:50

overview so you're going to want to

01:51

continue watching they essentially state

01:53

that you know this is the original GPT 5

01:55

which was planned for release in 2025

01:57

Gobi GPT 4.5 has been renamed to

02:00

GPT 5 because the original GPT 5 has

02:02

been cancelled now I got to be honest

02:03

this paragraph here is a little bit

02:05

confusing um but I do want to say that

02:07

the words Arrakis and the words Gobi are

02:09

definitely models that were referred to

02:12

by several articles that were referring

02:14

to leaks from OpenAI and I think they

02:17

were actually on The Information so this

02:19

is some kind of stuff that I didn't

02:21

really hear that much about but the

02:22

stuff that I did hear was pretty crazy

02:25

so um this Arrakis and this Gobi thing

02:27

although you might not have heard a lot

02:28

about it of course it is like like a

02:30

kind of like half and half leak but like

02:31

I was saying this stuff is kind of true

02:33

so you can see here OpenAI dropped work

02:35

on a new Arrakis model in rare AI setback

02:38

and this one actually just talks about

02:40

um how by the middle of open AI you know

02:42

scrapping an Arrakis launch after it

02:43

didn't run as efficiently so there's

02:45

actually some references to this but a

02:47

lot of the stuff is a little bit

02:48

confusing but we're going to get on to

02:49

the main part of this story now I just

02:51

wanted to include that just to show you

02:52

that you know these names aren't made up

02:54

because if I was watching this video for

02:55

the first time and I hadn't seen some of

02:56

the prior articles before I'd be

02:58

thinking what on Earth is Arrakis what on

03:00

Earth is Gobi I've only heard about qstar

03:02

so essentially let's just take a look

03:04

and it says the next stage of qstar

03:06

originally GPT 6 but since renamed

03:09

GPT 7 originally for release in 2026 has

03:12

been put on hold because of the recent

03:14

lawsuit by Elon Musk if you haven't been

03:16

paying attention to the space

03:17

essentially Elon Musk just filed a lawsuit

03:19

released a video yesterday um stating

03:21

that OpenAI have strayed far too long

03:23

from their goals and if they are

03:25

creating some really advanced technology

03:26

the public do deserve to have it open

03:28

source because that was their goal um

03:30

and essentially you can see here it says

03:32

qstar GPT 8 planned to be released in 2027

03:36

achieving full AGI and one thing that I

03:38

do want to say about this because

03:39

essentially they're stating that you

03:40

know they're doing this up to GPT 7 and

03:42

then after GPT 7 they do get to AGI one

03:45

thing that I do think okay and I'm going

03:47

to come back to this as well is that the

03:48

dates kind of do line up and I say kind

03:51

of because not like 100% because we

03:53

don't know but presuming let's just

03:55

presume okay because GPT 4 was released

03:58

in 2023 right um let's just say you know

04:01

every year release a new model okay um

04:03

that would mean that you know in 2024 we

04:05

would get GPT 5 in 2025 we get GPT 6 in

04:08

2026 we would get GPT 7 and in 2027 we

04:12

would get GPT 8 which is of course AGI

04:14

now one thing I do think about this that

04:16

is kind of interesting and remember I'm

04:17

going to come back to this so pay

04:19

attention Okay what I'm basically saying

04:21

is that if OpenAI are consistent with

04:23

their year releases so for example if

04:25

they are going to release a new model

04:26

every year and if we continue at the

04:28

same rate like a new GPT every single

04:30

year which is possible him stating that

04:31

GPT 7 being the last release before GPT 8

04:35

which is AGI does actually kind of make

04:37

sense because once again and I know you

04:38

guys are going to hate this but if we

04:40

look at the trademarks okay remember

04:41

that they trademarked this around the

04:43

same time okay around that 2023 time

04:45

when all of this crazy stuff was going

04:46

on and I think it's important to note as

04:49

well is that like there's no GPT 8 you

04:52

might argue that if they're going

04:53

to use all the GPT names why wouldn't

04:55

they just trademark GPT 8 and I think

04:57

maybe because like the document States

04:59

the model after GPT 7 could be AGI and

05:02

I'm going to give you guys another

05:03

reason on top of that um another reason

05:05

is and I'm going to show you guys that

05:06

later on in the video but essentially um

05:08

OpenAI's timeline on super alignment

05:10

actually does coincide with this Theory

05:12

which is a little bit coincidental of

05:14

course like I said pure speculation

05:15

could be completely false OpenAI like I

05:17

said before can go ahead and completely

05:19

change their entire plans you know they

05:21

can go ahead and drop two models in one

05:23

year the point I'm trying to make is

05:24

that um certain timelines do align but

05:26

just remember this because I'm going to

05:27

come back to this because of some

05:28

documents stuff that you're going to see

05:30

in this document at the end of the video

05:31

so anyways um you know it says Elon Musk

05:33

caused a delay because of his lawsuit

05:35

this is why I'm revealing the information

05:36

now because no further harm can be done

05:38

so I guess Elon Musk's lawsuit has kind

05:40

of um you know if you wanted bought you

05:43

some time so he says I've seen many

05:45

definitions of AGI artificial general

05:46

intelligence but I will Define AGI

05:48

simply as an artificial general

05:49

intelligence that can do any

05:50

intellectual task a smart human can this

05:52

is how most people Define the term now

05:54

2020 was the first time that I was

05:56

shocked by an AI system so this is just

05:57

some um of course you know talk about

06:00

his experience with you know AI systems

06:02

I'm guessing the person who wrote this

06:03

but you know AGI if you don't know AGI

06:05

is like an AI system that can do any

06:06

task a human can but one thing that is

06:08

important to discern is that you know

06:10

AGI there was a recent paper that

06:11

actually talks about the levels of AGI

06:13

and I think it's important to remember

06:15

that AGI isn't just you know one AI that

06:17

can do absolutely everything there are

06:18

going to be levels to this AGI system

06:20

that we've seen so far and in this paper

06:21

levels of AGI they actually talk about

06:23

how you know we're already at emerging

06:25

AGI which is you know emerging which is

06:27

equal or somewhat better than an

06:28

unskilled human so we are at level one

06:30

AGI and then of course we've got um you

06:33

know competent AGI which is going to be

06:34

at least the 50th percentile of

06:37

skilled adults and that's competent AGI

06:38

that's not yet achieved and then of

06:40

course we've got expert AGI which is

06:41

90th percentile of skilled adults which

06:43

is not yet achieved then we've got

06:45

virtuoso AGI which is not yet achieved

06:47

which is 99th percentile of all skilled

06:50

adults and then we've got artificial

06:52

super intelligence which is just 100% so

06:54

I think it's important to understand

06:55

that there are these levels to AGI

06:57

because once someone says AGI I mean is

06:58

it 90 9 can it do like half you know

07:00

it's like it's just it's just pretty

07:02

confusing but I think this is uh a

07:03

really good framework for actually

07:05

looking at the definition because trust

07:06

me it's an industry standard but it is

07:08

very very confusing so here's where he

07:09

basically says that you know um you know

07:11

GPT 3.5 which powered the famous chat

07:14

GPT and of course GPT 3 which was not

07:16

the successor but the predecessor of 3.5

07:19

it says you know these were a massive

07:20

step forward towards AGI but the note is

07:22

you know GPT 2 and all chatbots since

07:24

Eliza had no real ability to respond

07:26

coherently so why was GPT 3 such a massive

07:28

leap and of course this is where we talk

07:29

about parameter count and of course he

07:31

says deep learning is a concept that

07:32

essentially goes back to the beginning

07:33

of AI research in the 1950s first neural

07:36

network was created in the 50s yada yada yada so

07:38

basically this is where he's giving the

07:39

description of a parameter and he says

07:41

you may already know but to give a brief

07:42

digestible summary it's analogous to a

07:45

synapse in a biological brain which is a

07:47

connection between neurons and each

07:48

neuron in a biological brain has roughly

07:50

a thousand connections to other neurons

07:52

obviously digital neural networks are only analogous to

07:54

biological brains basically saying that

07:55

you know of course we're comparing them

07:56

but different but um how many synapses

07:59

or parameters are in a human brain the

08:01

most commonly cited figure for synapse

08:02

count in the brain is roughly 100

08:04

trillion which would mean each neuron of the

08:06

100 billion in the human brain has

08:08

roughly 1,000 connections and remember um

08:10

this number 100 trillion because it's

08:12

going to actually be a very very big

08:15

number that uh you do need to remember

08:17

so of course you can see here the human

08:18

brain consists of 100 billion neurons and

08:20

over 100 trillion synaptic connections

08:22

okay and essentially this is trying to

08:24

you know um just pair the similarities

08:27

between parameters and synapses so

08:29

essentially stating here that you know if

08:30

each neuron in a brain and trust me guys

08:32

this is just all going to come into

08:34

everything like I know you guys might be

08:35

thinking what is the point of talking

08:36

about this I just want to hear about qstar

08:38

but just trust me all of this stuff it

08:40

does actually make sense like I've read

08:41

this a lot of times so I'm going to skip

08:43

some pages but the pages I'm talking

08:44

about now just trust me guys you're

08:45

going to want to read them it basically

08:47

says here if each neuron in a brain has

08:48

a thousand connections this means a cat has

08:50

roughly 250 billion synapses and a dog

08:53

has roughly 530 billion synapses synapse

08:55

count generally seems to predict

08:56

intelligence with a few exceptions for

08:58

instance elephants technically have a higher

08:59

synapse count than humans but yet

09:01

display lower intelligence of course

09:03

basically here's where he's actually

09:04

talking about how you know the simplest

09:05

explanation for larger synapse counts

09:07

with lower intelligence is a smaller

09:09

amount of quality data and from an

09:10

evolutionary perspective brains are

09:12

quote unquote trained on billions of

09:14

years of epigenetic data and human

09:16

brains evolve from higher quality

09:17

socialization communication data than

09:19

elephants leading to our Superior

09:20

ability to reason but the point he's

09:22

trying to make here is that you know

09:23

while there are nuances that you know

09:25

don't make sense synapse count is

09:27

definitely important and I think we've

09:28

definitely seen that um with the

09:30

similarities in the parameter size with

09:33

the explosion of llms and what we've

09:35

seen in these multimodal models and

09:36

their capabilities and it says again the

09:38

explosion in AI capabilities since the

09:40

early 2010s has been a result of far

09:41

more computing power and far more data

09:43

GPT 2 had 1.5 billion connections which

09:46

is less than a mouse's brain and GPT 3

09:48

had 175 billion connections which gets

09:50

somewhat closer to a cat's brain and

09:52

obviously it's intuitively obvious that

09:54

an AI system the size of a cat's brain

09:56

would be superior to a system the

09:58

size of a mouse's brain so here's

09:59

where things start to get interesting so

10:00

he says in 2020 after the release of the

10:03

175 billion parameter GPT 3 many

10:05

speculated about the potential

10:07

performance of a Model 600 times larger

10:09

at 100 trillion parameters just remember

10:11

this number because this number is about

10:13

to just keep you know repeating in your

10:15

head and of course he says the big

10:16

question is is it possible to predict AI

10:18

performance by parameter count and as it

10:20

turns out the answer is yes as you'll

10:21

see on the next page and this is where

10:22

he actually references this article

10:25

which is called extrapolating GPT-N

10:26

performance by Lanrian and it was not

10:29

score written in 2022 and basically it

10:31

talks about how as you scale up in

10:33

parameter count you approach Optimal

10:34

Performance so essentially this graph

10:36

seems to be illustrating the

10:37

relationship between neural networks

10:39

measured by the number of parameters

10:40

which can be thought of as the strength

10:42

of connections between neurons and their

10:44

performance on various tasks and these

10:46

tasks included language related

10:47

challenges like translation reading

10:49

comprehension and question and answering

10:51

among others and the performance on

10:52

these tasks is measured on the vertical

10:54

axis higher values indicating better

10:56

performance and the graph shows that as

10:58

the number of parameters increases

10:59

the performance on these tasks also

11:01

tends to but of course it does have

11:02

diminishing returns As you move right

11:04

because the curves actually do tend to

11:06

Plateau as they reach the higher

11:07

parameter counts of course the various

11:09

colors on this chart just essentially

11:11

represent different tasks and each dot

11:12

on those lines represents a neural

11:14

network model of a certain size and

11:15

certain parameter count being tested on

11:18

that you can see right down here this is

11:20

where the trained G you can see the GPT

11:22

performance so it says FLOPs used to train GPT 3

11:25

and then of course you can see right

11:26

here this is apparently the number of

11:28

synapses in the brain and just remember the

11:31

number 100 trillion or 200 trillion

11:32

because it's going to be really

11:33

important so it then says as Lanrian

11:37

Illustrated extrapolations show that AI

11:39

performance inexplicably seems to reach

11:41

human level at the same time as a human

11:43

level brain size is matched with the

11:45

parameter count his count for the

11:46

synapses in the brain is roughly 200

11:48

trillion parameters as opposed to the

11:49

commonly cited 100 trillion figure but

11:51

the point still stands 100 trillion

11:53

parameters is remarkably close to

11:55

Optimal by the way an important thing to

11:57

note is that although 100 trillion is

11:58

slightly suboptimal in performance there

12:00

is an engineering technique that OpenAI

12:02

is using to bridge this gap and I'll

12:04

explain this towards the very end of

12:05

this document because it is crucial to

12:06

what OpenAI is building and Lanrian's post is

12:09

one of many similar posts online it's an

12:11

extrapolation of Performance Based on

12:12

the jump between previous models and

12:13

OpenAI certainly has much more detailed

12:16

metrics and they've come to the same

12:17

conclusion as Lanrian as I'll show later

12:20

in this document so if AI performance is

12:22

predictable based on parameter count and

12:25

100 trillion parameters is enough for

12:27

human level performance when will a 100

12:29

trillion parameter AI model be released

12:32

in the future so here's where we go okay

12:35

it says that GPT 5 achieved Proto AGI in

12:38

late 2023 with an IQ of 48 now that is a

12:41

statement that um you know with the IQ

12:43

of 48 I genuinely don't know where

12:45

that's coming from I'm guessing that's

12:46

maybe just based on predictions and stuff

12:47

like that but I genuinely don't know

12:49

where this like in the beginning there

12:50

was this uh you know IQ thing where it

12:53

was like you know IQ for get IQ 96 delay

12:55

I I genuinely don't know where those IQ

12:56

values are coming from it doesn't really

12:57

state in this document unless I missed

12:59

it somewhere but um essentially this is

13:01

a point where things start to get super

13:02

interesting because this is where he's

13:03

referencing a whole bunch of sources but

13:05

it says that you know um of course Jimmy

13:07

Apples did tweet AGI has been achieved

13:09

internally if you didn't watch the video

13:10

that I made on AGI has been achieved

13:11

internally Jimmy Apples wasn't the only

13:13

one that tweeted this um Sam Altman

13:14

actually himself did actually post this

13:16

on Reddit and then he edited the comment

13:18

to say look we were just joking around

13:20

um and then of course this person this

13:21

is a guy who uh runs Runway he said that

13:23

I have been told that GPT 5 is scheduled

13:25

to complete training this December and

13:26

that OpenAI expects it to achieve AGI

13:29

which means we will hotly debate whether

13:30

or not it actually will achieve AGI which

13:32

means it will and of course you can see

13:34

um this guy is CEO of Runway and a bunch

13:36

of other companies and it's actually

13:37

funny enough he actually is followed by

13:39

Sam Altman so um this statement actually

13:41

wasn't like there weren't many

13:42

detractors of the statement which was

13:44

quite interesting considering that

13:45

that's just a bold statement about GPT 5

13:48

nonetheless it was a very interesting

13:49

statement anyways we continue on in this

13:51

document and it states the first mention

13:53

of a 100 trillion parameter model being

13:55

developed by open AI was in the summer

13:57

of 2021 mentioned offhand in a Wired

14:00

interview by the CEO of Cerebras Andrew

14:01

Feldman a company which Sam Altman is

14:03

a major investor of and if you don't

14:05

know what the Cerebras company does uh

14:07

they produce crazy crazy clusters um and

14:11

you can see right here that they have

14:13

the fastest HPC accelerator on Earth you

14:15

can see that the latest GPU can theirs

14:18

is pretty pretty insane and I'm guessing

14:19

that they're looking to deploy these in

14:20

the future based on future AI systems

14:23

essentially this basically talks about

14:24

the first mention of a 100 trillion

14:26

parameter model being deployed which was in

14:28

the summer of 2021 and you can see that

14:30

it's mentioned offhand in a Wired

14:31

interview by the CEO of Cerebras like I

14:33

just talked about which Sam Altman is a

14:34

major investor of this is actually

14:36

pretty true Sam Altman actually has invested I

14:38

think it's between 50 million to 80

14:39

million um and I think OpenAI has actually

14:42

agreed to purchase some of the chips

14:43

some kind of deal but it says from

14:44

talking to OpenAI GPT 4 will be about

14:46

100 trillion parameters that won't be

14:48

ready for several years and remember

14:49

this was of course uh debunked if you

14:52

were in the space at the time if you

14:54

were actually paying attention to things

14:55

but anyways if we continue okay because

14:57

I want to get through some of this stuff

14:58

it says Sam Altman's response to Andrew

15:00

Feldman at an online Meetup at AC10 it's

15:02

crucial to note that Sam Altman admits to

15:04

their plans for 100 trillion parameter

15:05

model so this is essentially an excerpt

15:07

from that blog post interview and it

15:09

says GPT 4 is coming but the current focus

15:11

is on Codex and that's also where the

15:13

available compute is going GPT 4 will be

15:14

a text model as opposed to multimodal

15:16

and it says okay this is where things

15:18

get interesting okay pay attention to

15:19

this it says the 100 trillion parameter

15:21

model won't be GPT 4 and is far off

15:23

they're getting much more performance

15:24

out of smaller models maybe they may

15:27

never need such a big model and it

15:29

then says it's not yet obvious how to

15:30

train a model to do stuff on the

15:32

internet and think long on very

15:33

difficult problems and a lot of work is

15:35

going into making it accurate and tell the

15:36

truth so basically this part is to look

15:39

at the early stages where um Sam Altman

15:41

is basically stating that you know they

15:43

do have plans for a 100 trillion

15:45

parameter model which is essentially

15:46

this brain-like AGI type system now

15:48

here's where we talk about some more

15:49

leaks and of course things are very

15:51

speculative so of course like I said it

15:53

could be just speculation but it says an

15:55

AI researcher Igor made the claim a few

15:57

weeks later that GPT 4 was being trained

15:59

and would be released between December

16:00

and February again I will prove that he

16:02

really did have accurate information as

16:04

a credible Source this will be important

16:06

soon so you can see here he tweets that

16:07

OpenAI started to train GPT 4 release

16:09

is planned for December to February then

16:11

essentially Gwern a famous figure in the AI

16:13

world he's an AI researcher and blogger he

16:15

messaged Igor Baikov on Twitter in

16:17

September 2022 and this is the response

16:19

he received it's important to remember

16:21

the Colossal number of parameters text

16:22

audio images possibly video and

16:24

multimodal and this comes from a

16:25

subreddit called this is the way it will

16:26

be which is a private small sub

16:28

subreddit run by mathematicians with an

16:30

interest in AGI and a few AI enthusiasts

16:33

and it says they use this subreddit to

16:35

discuss topics deeper than what you'll

16:37

find in the mainstream and you can see

16:39

here that he actually does talk about

16:40

how OpenAI started training GPT 4 and

16:43

the training will be completed in a

16:44

couple months and trust me guys I'm

16:45

going to get to the big bit but

16:46

essentially this is someone who had

16:48

early information on this model okay and

16:50

essentially a colossal number of

16:51

parameters it sounds like this guy was

16:53

referencing a 100 trillion parameter model

16:54

as 500 billion parameter models and 1

16:57

trillion parameter models had already

16:58

been trained many times by the time of

16:59

this tweet in summer 2022 making models

17:02

of that size unexceptional and certainly

17:04

not colossal so these tweets from I'm

17:06

just going to say rapu because they

17:08

actually do make a similar claim which

17:10

is pretty interesting okay um so this is

17:12

where stuff gets even more interesting

17:13

okay because this guy okay just call him

17:15

rapu because I don't know how to say rxp

17:17

time but he says he also mentions a 125

17:20

trillion synapse GPT 4 however

17:22

incorrectly states GPT 3's parameter

17:25

count as 1 trillion but essentially he

17:26

states that um you know it was far

17:28

behind the human brain containing 100

17:29

trillion synapses but what he did get

17:31

right was stating that it will be

17:32

introduced to the public at the beginning

17:34

of 2023 and is able to write academic

17:36

pieces and articles that human

17:37

intelligence cannot distinguish so this

17:39

is another person that did have earlier

17:41

inside information now he does cite that

17:43

this is a weaker piece of evidence

17:44

because Roon is a fairly notable Silicon

17:46

Valley AI researcher followed by CEO

17:48

Sam Altman and he did tweet at Roon I'm

17:51

guessing is GPT 4 100 trillion parameters

17:53

as rumored and then the guy thumbs it

17:54

up I'm not I'm not really sure what that

17:55

is referring to that much but

17:57

essentially the information from the

17:59

previous two people who had early

18:01

information about GPT 4 was then

18:04

basically sent to this person okay it

18:06

says in November 2022 I reached out to

18:08

an AI blogger named Alberto and his posts

18:10

seemed to spread pretty far online so I

18:12

was hoping that if I sent him some basic

18:13

GPT 4 info he might be able to do a write up

18:15

so you can see they were actually

18:16

talking and he said of course the info

18:18

necessarily isn't official but this is all

18:20

important just trust me because

18:21

essentially later on in this document

18:23

you'll see how the 100 trillion

18:25

parameter model being talked about now

18:27

is something that's been in the works

18:28

for a while and there are a few things

18:30

that do make sense so you can see here

18:32

that of course Alberto then goes ahead

18:34

to publish this where this rumor

18:36

does start to spread and you can see

18:37

that then the 100 trillion parameter

18:39

leak went viral reaching millions of

18:40

people to the point that OpenAI's

18:42

employees including CEO Sam Altman had to

18:44

respond calling it complete foolishness

18:46

called it factually Incorrect and of

18:48

course this guy of course did claim

18:49

responsibility for the leak because if

18:50

you remember at this time this was a

18:52

really really huge deal where people

18:53

were stating that this is going to be

18:54

really really huge but this isn't the

18:57

caveat it's not about them not getting

18:58

this wrong it's about the fact that GPT

19:00

4 is not going to be 100 trillion

19:02

parameters it's about that this document

19:03

is basically stating that in the future

19:05

GPT 8 whatever model it's going to be

19:08

the one that achieves AGI will actually

19:09

have this count and these guys knew this

19:11

from quite some time ago that's why

19:13

we're um essentially looking at this

19:15

kind of data in terms of the leaks and

19:16

stuff to kind of show you guys how we

19:18

actually got there so remember this

19:19

Igor guy he was also stating that he

19:21

also heard about 100 trillion parameters

19:23

but he found that this was going to be

19:25

ridiculous so he decided not to include

19:27

it with his tweet about GPT 4 because

19:30

when he heard 100 trillion parameters he

19:31

was like wait that doesn't make sense

19:33

for GPT 4 because why would it be that

19:35

much bigger than GPT 3 like it doesn't

19:37

make sense and then you can see this is

19:38

where you start to talk about you know

19:40

2022 where I became convinced that

19:42

OpenAI planned to release a 1 to 2

19:44

trillion parameter subset of GPT 4

19:47

before releasing the full 100 trillion

19:49

parameter model GPT 5 so basically all

19:51

of these people said the same thing all

19:53

of these people who were early on GPT 4

19:55

somehow I'm not entirely sure stated

19:57

that this was all going to to be a model

19:58

in the future that would have 125

20:01

trillion parameters or over 100 trillion

20:03

parameters and essentially bringing it

20:05

back to Igor's leak you can see that

20:07

there were a couple of people that were

20:08

also using a beta version of GPT 4 now

20:11

of course it says the sources here are

20:12

of varying credibility but they all

20:14

inexplicably say the same thing GPT 4 was

20:16

being tested in October November of 2022

20:19

and according to the US military AI

20:20

researcher it was definitely being

20:21

trained in October which again lines up

20:24

with Igor's leak now essentially

20:26

what's crazy here is that it says

20:27

OpenAI's official position as

20:29

demonstrated by Sam Altman himself is that

20:30

the idea of a 100 trillion parameter GPT 4

20:33

is completely false this is half true

20:35

because GPT 4 is a 1 trillion parameter

20:37

subset of the full 100 trillion

20:39

parameter model I'm not sure if I do

20:41

believe that statement because one of

20:42

the problems is that of course GPT 4

20:44

isn't actually confirmed but you know

20:46

According to some articles it has been

20:48

confirmed that GPT 4 does have at least

20:50

1 trillion parameters I mean some people

20:52

have said that is 1.8 I mean some people

20:54

said it's 1.2 but here's where one of

20:56

the leaks that I do remember and this is

20:57

one of the first videos I did make

20:59

on GPT 4 and it was because the CTO of

21:02

Microsoft Germany a week prior to the

21:04

official release of GPT 4 actually did

21:05

slip up and reveal that there exists a

21:07

GPT 4 which does have the ability to

21:09

process videos and it says I imagine he

21:11

was unaware of OpenAI's decision not

21:13

to reveal the video capabilities of the

21:15

system and this completely proves that

21:16

GPT 4/5 was trained on not just text and

21:19

images but also video data and of course

21:21

we can infer that audio data was

21:23

included as well so he says we will

21:24

introduce GPT 4 next week I remember I

21:26

actually made this video if you go back

21:27

to the channel you can actually see this

21:29

and he says we will introduce GPT 4 next

21:30

week there we will have multimodal models

21:32

that will offer completely different

21:34

possibilities for example videos and we

21:36

said the CTO called the LLM a game

21:38

changer because they teach machines to

21:39

understand natural language which then

21:41

understand in a statistical way what was

21:43

previously only readable and

21:45

understandable by humans in the meantime

21:46

the technology has come so far that it

21:48

basically works in all languages you can

21:50

ask a question in German and get an

21:52

answer in Italian and remember this was

21:54

actually a very very credible statement

21:56

because it was the CTO of Microsoft and

21:58

we did actually get GPT 4 a week

22:00

later and the point again is that Igor

22:02

before said that GPT 4 was going to be

22:04

released in january/february and this

22:06

kind of claim was also corroborated by a

22:09

credible entrepreneur who stated that in

22:11

October 2022 GPT 4's release date would be

22:13

between January and February 2023 now

22:15

some people are stating well GPT 4 was

22:17

released in March but of course you can

22:18

see right here although GPT 4 was

22:20

released in March 2023 slightly outside

22:22

the window I think it was done

22:23

intentionally by OpenAI to discredit

22:24

the leak and I I can't lie I actually do

22:26

agree with that because I've seen some

22:28

stuff you know tweeted by people like

22:29

Jimmy apples and stuff like that and I'm

22:31

pretty sure sometimes when leaks have

22:33

gotten pretty big OpenAI have decided

22:35

against it because it just discredits

22:37

the leakers and I mean they're working

22:39

on some top secret stuff here now here's

22:40

where we get into the AGI debate it says

22:42

a note about robotics AI researchers are

22:44

beginning to believe that vision is all

22:46

that's necessary for optimal real world/

22:48

physical performance and in order to

22:50

give one example Tesla completely

22:52

ditched all sensors and fully committed

22:53

to vision for their self-driving cars

22:55

the point is that training a human brain sized

22:57

AI model on all the image and video

22:58

data on the internet will clearly be

23:00

more than enough to handle complex

23:01

robotic tasks that Common Sense

23:03

reasoning is buried in the video data

23:04

and just like it's buried in the text

23:05

data and the text focused GPT 4 is stunningly

23:08

good at Common Sense reasoning and it

23:09

does actually show an example of where

23:11

you know Tesla actually does remove its

23:13

ultrasonic sensors and then I'm going to

23:14

show you another slide where Tesla

23:16

actually do talk about how images are all

23:17

they need SL video and you can see the

23:19

former head of Tesla AI explains why

23:21

they removed the sensors while others

23:22

differ that's Andrej Karpathy I believe and

23:25

this was another example if you remember

23:27

at the time and this is all going to tie

23:29

in So just pay attention essentially

23:30

PaLM-E if you do remember this video

23:32

from around 2023 where essentially it

23:34

was an AI system that learned mainly from

23:37

vision combined with a large language

23:38

model and there were minimal robotics

23:40

data required on top of the language and

23:42

vision training and PaLM-E was a 500

23:44

billion parameter model and it

23:46

asks the question that what happens when

23:47

robotics is trained on a 100 trillion

23:50

parameter model on all the data

23:52

available on the internet and this was a

23:54

pretty crazy system I'm not going to lie

23:55

to you guys like looking at PaLM-E how good

23:57

it was it was definitely a very very

23:59

comprehensive system that could do a lot

24:01

of stuff I remember it was able to deal

24:02

with adversarial disturbances it was

24:05

able to you know get stuff that you

24:06

wanted it to do this was a pretty pretty

24:08

good system um they did upgrade this but

24:10

I'm surprised they haven't dropped any

24:12

updates recently because it was a really

24:14

good system for the time and of course

24:15

this is where they talk about the Tesla

24:16

update where it says you know Optimus

24:18

performed you know its first end-to-end

24:20

learned successful grasp today you can see

24:21

it below at 1X speed this was learned

24:23

from Human demonstrations no specific

24:25

task programming was done this means we

24:27

can now scale quickly to many tasks it

24:29

says join us before it's too late make

24:30

the AI happen with us it says if human

24:32

demonstrations are all that's needed for

24:33

advanced robotics performance a 100

24:35

trillion parameter model trained on all

24:37

the web would certainly be able to

24:38

achieve astonishing robotics performance

24:40

and I wonder what's going to happen in

24:41

the next couple of weeks cuz I'm not too

24:43

clued up on Robotics and how entirely

24:45

that is to train but I do know that

24:46

next week or in fact two to three weeks

24:48

from now we're going to have a huge

24:49

robotics announcement where they're

24:51

basically you know one someone at one of

24:53

the leading robotics companies they

24:55

actually did state that we thought we

24:57

didn't have the data but now we do and

24:58

that they've made a breakthrough so I

25:00

wouldn't be surprised if soon we do get

25:02

a real update because someone said that

25:04

3 weeks from now someone ATL said you

25:05

know there's going to be a huge update

25:07

that's going to completely change

25:08

robotics so before you actually you know

25:10

discredit this area I would say give it

25:12

like 3 weeks because we don't know now

25:14

essentially it says the image on the

25:15

left shows what a 1 trillion

25:17

parameter GPT 4 is capable of in image

25:20

recognition which is pretty crazy if you

25:22

haven't seen this demo before this is in

25:23

the GPT 4 paper where they talked about

25:25

how good this model is at understanding

25:27

things

25:28

and it's already clearer and written

25:31

more well than many humans would have

25:32

been able to come up with so it says so

25:34

again what happens when you train a

25:35

model 100 times larger than GPT 4 which

25:38

is the actual size of the human brain on

25:41

all the data available on the internet

25:42

and then we get to this slide where it

25:43

says important notice how the AI model

25:45

is able to generate multiple angles of

25:46

the same scene with physically accurate

25:48

lighting and in some cases even

25:50

physically accurate fluid and rain

25:51

and if you can generate images and

25:53

videos with accurate Common Sense

25:54

physics you have common sense reasoning

25:56

and if you can generate common sense you

25:58

understand common sense and I mean this

26:00

is once again one of those points where

26:02

it's delving into the argument of if an

26:03

AI system can accurately do lighting if

26:06

it can understand um you know physics

26:08

and rain and all that kind of stuff this

26:10

is meta's video thing it's basically

26:11

saying that does the AI have a common

26:13

sense World model in its head where it

26:15

can truly understand how the world you

26:17

know is and how it interacts and how you

26:19

know um like what kind of reasoning does

26:21

it use in its head to be able to get

26:22

this uh well okay and some people say

26:25

that you know some people say these AI

26:26

systems are just pattern recognition

26:28

it's just regenerating stuff it's seen

26:29

in its training data but some argue

26:32

that it's able to completely understand

26:34

the physical world that we live in and

26:35

it's got some kind of world model it's

26:37

got some kind of Common Sense thinking

26:38

algorithm whichever way you want to put

26:39

it so that it can completely get this

26:41

stuff correctly because it needs to be

26:43

able to if it wants to and I mean that's

26:45

something again that is really debated

26:46

on now here it gives us some talks about

26:49

someone that did give the early leaks

26:51

for GPT 4 again and what's crazy is that

26:54

they do talk about image and audio

26:56

generation would be trained in quarter 3

26:58

of 2023 and apparently If Video

27:00

generation training is

27:02

simultaneous or shortly after this

27:04

actually does line up with Chen's claim

27:06

of GPT 5 being finished training in

27:08

December of 2023 so remember that guy

27:10

before the CEO of Runway stating that

27:12

GPT 5 would be finished training in

27:13

December was now around 3 months ago

27:16

that actually does line up with some of

27:18

the claims made here and this is also

27:20

someone that did have early inside

27:22

access so we can see clearly that there

27:24

are people that do have inside access

27:26

now here's where we get into the crazy

27:28

bit because this is where um we talk

27:30

about long-term planning and of course

27:31

things do change but I think this is

27:33

going to be the most important part of

27:34

the video but it says here the OpenAI

27:36

president Greg Brockman stated that in

27:37

2019 following a $1 billion investment

27:39

from Microsoft at the time OpenAI

27:41

planned to build a human brain sized

27:43

model within 5 years and that was their

27:45

plan on how to achieve AGI remember that

27:47

bit guys a plan to build a human sized

27:49

brain model within 5 years and that was

27:51

their plan for how to achieve AGI you

27:53

can see um you know within 5 years and

27:55

possibly much faster with the aim of

27:56

building a system that can run a

27:57

human brain sized model and of course that's

27:59

why I said before the same number 100

28:01

trillion parameters is the reason that

28:04

number is being cited a lot and that's

28:06

why I'm guessing if OpenAI had

28:08

previously stated in 2019 that their

28:10

investments from Microsoft would

28:12

actually help them to try and build a

28:14

human sized brain within 5 years

28:16

you know leaks about 100 trillion going

28:19

viral and then it's spreading about GPT

28:21

4 could actually be the reason that

28:23

actually occurred because many people

28:24

were confused thinking GPT 4 is going to

28:25

have 100 trillion but many people didn't

28:27

realize that this is actually the future

28:29

models which is what OpenAI have been

28:31

planning all along and I think that

28:33

actually does make sense like if we can

28:36

actually corroborate statements and

28:38

understand what's actually being said

28:40

from actual openai employees I think

28:42

those are the best chances we do have at

28:44

looking at what is realistic you can see

28:46

here it says both of these sources are

28:48

clearly referring to the same plan to

28:49

achieve AGI a human brain sized AI model

28:52

trained on images text and other data

28:54

due to be trained within 5 years of 2019

28:57

so by 2024 and it seems to line up with

29:00

all the other sources I've listed in

29:02

this document so essentially that's what

29:04

Greg Brockman did state which is a

29:06

pretty pretty hefty piece of information

29:08

and yeah that would actually line up

29:09

with 2024 and this is where we start to

29:12

get into the part where certain people

29:13

start to urge caution on AI and this is

29:17

the part where he starts to argue that

29:18

you know suddenly AI leaders are

29:20

starting to sound the alarm almost like

29:22

they know something very very specific

29:24

that the general public doesn't and I

29:26

would argue that yes is kind of true but

29:28

you know AI leaders have been stating

29:30

this from ages ago like Sam Alman has

29:32

been stating since 2015 AI is dangerous

29:34

and Elon Musk has been stating the same

29:36

thing but I would argue as well that

29:38

certain people from Google did actually

29:40

leave recently um and we're going to get

29:43

into that now they actually did leave

29:44

fairly recently and I'm not surprised

29:46

but um they actually did really leave

29:48

recently and maybe it's just a

29:49

coincidence and part of it but it says

29:50

you know in this uncertain climate that

29:52

Hassabis agrees to an interview with a stark

29:54

warning about his growing concerns I

29:56

would advocate not moving fast and

29:57

breaking things he says referring to an

29:59

old Facebook motto and essentially their

30:01

old motto was you know release

30:02

Technologies into the world first and

30:03

then fix any problems that arose later

30:05

and of course you can't do that with

30:06

super intelligence and he says he says

30:08

AI is now on the cusp of being able to

30:09

make tools that could be deeply damaging to human civilization urging

30:12

his competitors to proceed with more caution

30:13

than ever before and of course Geoffrey

30:15

Hinton if you don't know this guy um by

30:17

industry standards the godfather of AI

30:19

that he actually did leave Google in

30:22

2023 because he wanted to actually talk

30:24

about the dangers of AI he said the idea

30:27

that this stuff could actually get

30:28

smarter than people a few people

30:30

believed that Hinton said in the

30:31

interview but most people thought it was

30:32

a way off and I thought it was way off I

30:34

thought it was 30 to 50 years away or

30:35

even longer obviously I no longer think

30:37

that and he thinks super intelligence is

30:39

Coming Far quicker than we did think now

30:41

if you remember as well this is very

30:44

fascinating because at the time this was

30:46

something that was pretty crazy and it

30:48

says shortly after the release of gbt 4

30:50

the Future of Life Institute a highly

30:52

influential nonprofit organization

30:54

concerned with mitigating the

30:55

potential catastrophic risks to the

30:57

world released an open letter calling on all AI

30:59

labs to pause AI development for 6 months

31:01

why because they essentially stated that

31:03

you know including the currently being

31:05

trained GPT 5 and this is kind of crazy

31:07

I didn't actually see that okay it says

31:09

we call on all AI labs to immediately

31:10

pause training systems more powerful

31:12

than GPT 4 including the currently being

31:14

trained GPT 5 signatories include you

31:16

know all of these people that were like

31:18

Elon Musk and some that even worked at Apple and

31:19

apparently the first release version of

31:21

the letter specifically said including

31:22

the currently being trained GPT 5 and

31:24

then apparently it was removed which is

31:25

pretty crazy and then of course there is

31:27

a quote from Sam Altman like I said again um

31:29

and this is pretty crazy because it says

31:30

you know do we have enough information

31:32

on the internet to create AGI and Sam

31:34

Altman's blunt immediate response

31:35

interrupting the man asking the question

31:36

is yes he elaborates yes we're confident

31:38

there is we think about this and we

31:39

measure it quite a lot uh information in

31:42

the internet to create AGI so if you

31:45

contrast it with yes we have a

31:46

continuous video feed to our eyes and um

31:49

on the internet we only have like a

31:51

subset of that yeah we're confident

31:52

there is we think about this and measure

31:54

it quite a lot what gives you that

31:55

confidence is it the

31:57

size of the knowledge base is it

31:59

complexity is it one of the things that

32:01

I think open AI has driven in the field

32:04

um that's been really healthy is that

32:07

you can treat scaling laws as a

32:09

scientific prediction you can do this

32:11

for compute you can do this for data but

32:13

you can measure at small scale and you

32:14

can predict quite accurately how it's

32:16

going to scale up how much data you're

32:18

going to need how much compute you're

32:19

going to need how many parameters you're

32:20

going to need when you know when the

32:23

generated data gets like good enough to

32:24

be helpful um the internet is like

32:26

there's a lot of data out there there's

32:28

a lot of video out there too few more

32:30

questions maybe so I mean you did see

32:31

from that clip how Sam Altman actually did

32:33

talk about it I mean he seems very very

32:36

very confident on that prediction in

32:39

order to get to AGI the kind of data

32:41

that they do need and then here's what

32:42

it says on Sam Altman's Q&A it says that

32:45

you know first of all he seems highly

32:46

highly confident that there exists enough data

32:48

to train an AGI system confident to the point

32:50

that it makes one question if they've

32:52

already done it or in the process of

32:54

doing it which is definitely a bold

32:55

statement to state that they've you know

32:57

basically done it in 2022 in fact that

32:59

interview was actually from the early

33:01

stages of 23 which is quite fascinating
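As a rough illustration of the scaling-law idea from the clip above (measure at small scale, then predict what happens at a larger scale), here is a minimal Python sketch. It assumes loss follows a simple power law in training compute, loss ≈ a × compute^(-b). The data points, the constants a = 10 and b = 0.05, and the function names are made up for illustration; they are not OpenAI's measurements or method, and published scaling laws typically include extra terms such as an irreducible loss.

```python
import math


def fit_power_law(compute, loss):
    """Fit loss ~= a * compute**(-b) by ordinary least squares in log-log space."""
    xs = [math.log(c) for c in compute]
    ys = [math.log(l) for l in loss]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope and intercept of the best-fit line through (log compute, log loss).
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)


def predict_loss(a, b, compute):
    """Extrapolate the fitted power law to a new compute budget."""
    return a * compute ** (-b)


# Synthetic "small-scale runs" generated from an assumed law with a=10, b=0.05.
small_runs = [1e15, 1e16, 1e17, 1e18]
losses = [10.0 * c ** -0.05 for c in small_runs]

a, b = fit_power_law(small_runs, losses)
predicted = predict_loss(a, b, 1e21)  # a budget 1000x larger than any run above
print(f"fitted a={a:.3f}, b={b:.3f}, predicted loss at 1e21 FLOPs={predicted:.3f}")
```

The point of the sketch is only the workflow: fit a curve on cheap runs, then read off the prediction for a run that has not been done yet.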

33:03

but then again um GPT 4 did finish

33:06

training at that time so they would

33:07

definitely be working on other AI

33:09

systems at that point now essentially

33:11

what he talks about here is that um and

33:12

I think this is you know probably the

33:14

thing that you should probably take away

33:15

from this video and it says as I

33:16

mentioned earlier in the document a 100

33:18

trillion parameter model is actually

33:20

slightly suboptimal but there is a new

33:22

scaling paradigm OpenAI is using to

33:24

bridge this Gap and it's based on

33:26

something called the Chinchilla scaling laws

33:27

Chinchilla was an AI model unveiled by

33:29

DeepMind in early 2022 and the

33:31

implication of the Chinchilla research

33:32

paper was that current models are significantly

33:35

undertrained and with far more compute

33:37

meaning more data we would see a massive

33:39

boost in performance without the need to

33:40

increase parameters the point is that while

33:42

an untrained or undertrained 100

33:44

trillion parameter model may be slightly

33:46

suboptimal if it were trained on vastly

33:48

more data it would be able to exceed

33:50

human level performance and the

33:51

Chinchilla paradigm is widely understood

33:54

and accepted in the field of machine

33:56

learning but just to give a

33:57

specific example OpenAI president

33:59

Greg Brockman discusses this in an

34:01

interview on how OpenAI realized their

34:03

initial scaling laws were flawed and

34:04

have since adjusted to take the

34:06

Chinchilla laws into account regulations

34:07

to figure out like hey how big of a

34:09

model do you think you're going to need

34:10

lots of mistakes for sure um like a good

34:13

example of this is the scaling laws so

34:14

we did this study to actually

34:17

start to really scientifically

34:18

understand how do models improve as you

34:21

push on various axes so as you pour more

34:22

compute in as you pour more data in and

34:24

one conclusion that we had at one point

34:26

was that basically uh that there's you

34:29

know sort of a limited amount of of data

34:32

that you want to pour into these models

34:33

and that there's kind of this very this

34:34

very clear curve um and that one thing

34:36

that that I we realized only years later

34:39

was actually that we'd read the curves a

34:40

little bit wrong and you actually want

34:42

to be training for way more tokens way

34:44

more data than anyone had expected and

34:46

that that uh you know there's definitely

34:48

these moments where these things that

34:50

just didn't quite click whereas like it

34:51

just didn't add up that we were training

34:53

for so little and that you know

34:54

something the conclusions that you drew

34:55

Downstream um but then then you realize

34:57

there was a foundational assumption that

34:59

was wrong and suddenly things make way

35:01

more sense I think it's a little bit

35:02

like you know physics in some sense

35:04

where like do you do you doubt physics

35:05

it's like I kind of do I think all of

35:07

physics is wrong right but like only so

35:09

wrong right it's like we clearly haven't

35:10

reconciled like Quantum and relativity

35:12

so there's like something wrong there

35:14

but that that wrongness is actually an

35:16

opportunity it's actually a sign of you

35:18

have this thing it's already useful

35:19

right it really like has affected our

35:21

lives and it's actually like pretty

35:22

great I'm very happy with what physics

35:23

has done but also there's fruit and so I

35:26

think that that for me that's always

35:27

been the feeling that there's something

35:29

here and that you know if we do keep

35:32

pushing and somehow the scaling laws all

35:33

peter out right they suddenly drop off a

35:35

cliff and we can't make any further

35:36

progress like that would be the most

35:38

exciting time in this field because we

35:40

would have finally reached the limit of

35:42

Technology we would have finally finally

35:43

learned something and then we would

35:44

finally have a picture of what the next

35:46

thing to do is and yeah you can clearly

35:48

see that Greg Brockman discusses how

35:50

their initial scaling laws were flawed

35:52

and of course a lot of people were

35:53

stating at this time that training a

35:55

compute optimal 100 trillion parameter

35:57

model would have cost billions of

35:58

dollars and just isn't feasible well

36:01

Microsoft invested $10 billion into

36:03

OpenAI in early 2023 so I guess it

36:06

isn't that ridiculous of a possibility
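To put rough numbers on what compute optimal would mean for a 100 trillion parameter model, here is a back-of-the-envelope Python sketch. It assumes the common rule of thumb drawn from the Chinchilla results of about 20 training tokens per parameter (Chinchilla itself was 70 billion parameters trained on 1.4 trillion tokens) and the standard approximation that training takes about 6 × parameters × tokens floating point operations. The per-accelerator throughput is a hypothetical placeholder, and none of these figures come from the video or the document it discusses.

```python
# Back-of-the-envelope Chinchilla arithmetic (illustrative assumptions only).

TOKENS_PER_PARAM = 20       # assumed rule of thumb: ~20 training tokens per parameter
FLOPS_PER_PARAM_TOKEN = 6   # assumed approximation: training FLOPs ~= 6 * N * D


def compute_optimal_tokens(n_params: int) -> int:
    """Training tokens suggested by the ~20 tokens/parameter rule of thumb."""
    return TOKENS_PER_PARAM * n_params


def training_flops(n_params: int, n_tokens: int) -> int:
    """Approximate total training FLOPs for N parameters and D tokens."""
    return FLOPS_PER_PARAM_TOKEN * n_params * n_tokens


def accelerator_hours(total_flops: int, sustained_flops_per_sec: float) -> float:
    """Hours one accelerator would need at a given sustained throughput."""
    return total_flops / sustained_flops_per_sec / 3600


# Sanity check against Chinchilla itself: 70B parameters -> 1.4T tokens.
chinchilla_tokens = compute_optimal_tokens(70 * 10**9)

# The hypothetical 100 trillion parameter model discussed in the video.
n_params = 100 * 10**12
n_tokens = compute_optimal_tokens(n_params)   # 2 * 10**15 tokens
flops = training_flops(n_params, n_tokens)    # 1.2 * 10**30 FLOPs

# Hypothetical sustained throughput per accelerator, chosen only for illustration.
HYPOTHETICAL_FLOPS_PER_SEC = 4e14
hours = accelerator_hours(flops, HYPOTHETICAL_FLOPS_PER_SEC)

print(f"tokens: {n_tokens:.2e}, FLOPs: {flops:.2e}, accelerator-hours: {hours:.2e}")
```

Under these assumptions the model would want roughly 2 quadrillion training tokens and on the order of 10^30 FLOPs. Changing the two constants or the throughput shows how sensitive any cost estimate is to them.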

36:09

after all the only question I do have

36:11

but then again I don't really have that

36:12

question anymore after reading the

36:13

lawsuit from yesterday was that you know

36:15

of course Microsoft does get access to

36:17

only pre AGI tech but then again OpenAI

36:20

is the one that gets to determine

36:21

whether the technology is AGI or not

36:24

part of me even wonders if you know

36:25

Microsoft will even exist or even some

36:27

of these companies once AGI exists

36:29

because you know AI could potentially

36:31

make competition irrelevant that's

36:32

another question for another video but

36:34

essentially Alberto Romero wrote about

36:36

DeepMind's Chinchilla scaling

36:37

breakthrough and Chinchilla showed us

36:39

that despite being vastly smaller than

36:40

GPT-3 and DeepMind's own Gopher it

36:42

outperformed them as a result of being

36:44

trained on vastly more data just to

36:46

reiterate this one more time although a

36:48

100 trillion parameter model is predicted to

36:49

achieve slightly suboptimal performance

36:51

OpenAI is well aware of the Chinchilla

36:53

scaling laws as is pretty much everyone

36:54

in the AI field and as they're training

36:56

Q Star as a 100 trillion parameter model

36:59

that is compute optimal and trained on

37:01

far more data than they really intended

37:03

they have the funds to do it now through

37:05

Microsoft and this will result in a

37:06

model that far exceeds the

37:08

performance of what they had initially

37:09

planned for their 100 trillion parameter

37:10

model and 100 trillion parameters

37:13

without the Chinchilla scaling laws

37:14

equal roughly human level but slightly

37:16

suboptimal but 100 trillion parameters

37:18

multimodal with Chinchilla scaling laws

37:21

taken into account what on Earth do we

37:22

get and I think I'm not that clued up

37:25

on the Chinchilla scaling laws if I'm

37:26

being completely honest I wasn't

37:27

aware of this but what I do know is that

37:30

if this statement is true and if the

37:32

research is up to date because I do know

37:34

that AI research is a very complex field

37:36

and it is a very Dynamic field and

37:37

research papers every single day will be

37:39

changing how future systems are trained

37:42

and maybe there's a research paper that

37:43

was released you know just 20 minutes

37:45

ago or something there are literally

37:46

like 160 papers released every single

37:48

day which is absolutely insane but this

37:50

is a true fact that there are so many

37:51

research papers released every single

37:52

day but if the AI space hasn't moved on

37:54

and these Chinchilla scaling laws are

37:56

true and if they do now have the funds

37:58

to actually train something that is

37:59

going to be a 100 trillion parameter

38:01

model I'm guessing that this might have

38:03

been what potentially sparked that

38:05

letter that kind of breakthrough that

38:07

you know I guess you could say got Sam

38:09

Altman fired and I'm guessing that

38:11

maybe something like this was

38:12

potentially the ignition to spark a lot

38:15

of stress at OpenAI because they

38:16

actually did state this Sam Altman in the

38:18

interview said look everyone at OpenAI is

38:19

stressed because we're building super

38:20

intelligence I actually made a video on

38:22

that because I was like that's a crazy

38:24

statement to say um and of course at

38:26

around that time you know there was so

38:28

much drama and around that time they did

38:30

you know trademark the models and of

38:31

course you know trademarks could mean

38:32

absolutely nothing but I just find that

38:34

you know everything happening around

38:35

that time I just don't think that that's

38:37

a coincidence now what is crazy as well

38:39

is uh this is where they actually do

38:40

talk about super alignment it says our

38:42

goal is to build a roughly human-level

38:44

automated alignment researcher we can

38:45

then use vast amounts of compute to scale

38:47

our efforts and of course open AI has

38:49

planned to build human level AI by 2027

38:51

and then scale up to superintelligence and

38:53

this has been delayed because of Elon Musk's

38:54

lawsuit but will be coming up shortly and

38:57

essentially he does talk about this and

38:58

I don't think this is that crazy um

39:00

essentially this guy is an OpenAI

39:02

researcher who worked there in the

39:03

summer of 2022 he joined there for one

39:06

year I'm not sure he works there anymore

39:07

but essentially he wrote a letter to

39:09

himself and essentially he said that you

39:11

know the company he's working on and

39:13

this is the kind of note that he wrote

39:14

to himself he said there's an AI company

39:16

building an AI that fills giant rooms

39:18

eats a town's worth of electricity and

39:20

has recently gained an astounding

39:21

ability to converse like people it can

39:23

write essays or poetry on any topic it

39:25

can ace college-level exams it's daily

39:26

gaining new capabilities that the engineers

39:28

who tend to the AI can't even talk about

39:30

in public yet those engineers do

39:32

sit in the cafeteria and debate the

39:34

meaning of what they're creating what

39:36

will it learn to do next week which jobs

39:38

might it render obsolete should they

39:39

slow down or stop so as to not tickle

39:41

the tail of the dragon but wouldn't

39:43

that mean that someone else probably

39:44

someone with less scruples would wake

39:46

the dragon first is there an ethical

39:48

obligation to tell the world about this

39:50

is there an obligation not to I am you

39:52

are spending a year working there my job

39:55

your job is to develop the

39:56

mathematical theory of how to prevent

39:58

the AI and its successors from wreaking havoc that

40:00

essentially means destroying the world

40:01

and essentially he's referring to the

40:03

Q Star multimodal 125 trillion parameter

40:05

beast so essentially this document basically

40:08

just tries to say that look every year

40:10

we're going to get a GPT update till

40:13

2027 and the reason 2027 is the date is

40:16

because that actually does coincide with

40:18

OpenAI's deadline to create super

40:21

alignment and they actually state that

40:23

they want to do this within 4 years and

40:25

since it was 2023 when they said this if

40:27

they do manage to get super alignment

40:29

done within 4 years that would be

40:31

2027 and considering they trademarked all

40:34

the way up to GPT-7 that would mean that

40:37

if a new GPT model is released every

40:39

single year accounting for GPT-5 GPT-6

40:42

and GPT-7 considering that's three more

40:44

years that takes us exactly to 2027 when

40:47

super alignment should be done and then

40:50

once we have you know I guess you could

40:52

say the system that will solve super

40:54

alignment and we can actually solve that

40:56

then they can actually safely realize

40:58

that look we can safely train an AGI

41:00

level system or this 125 trillion

41:03

parameter model Q Star whatever it is

41:05

because then it would actually be safe

41:06

to do so so I'm guessing that kind of

41:09

does make sense with the iterative

41:10

deployment that we did see before I'm

41:12

guessing that if I'm being honest with

41:14

you guys maybe I did miss some key

41:15

things in this document maybe there's

41:17

some stuff that I'm going to miss but I

41:18

do think that certain things do make

41:20

sense like if they are iteratively

41:22

deploying things that does mean that you

41:24

know we're not going to get everything

41:25

that they do have they could have like

41:27

you know crazy crazy models and do remember

41:29

that Sam Altman in that interview he did

41:31

actually talk about how they're able to

41:33

predict the capabilities of future

41:34

models by training a system which is

41:37

much less compute intensive and if you

41:38

remember this was something that I did

41:40

retweet and this was something that I

41:41

did talk about and I'm guessing that

41:43

whatever breakthrough that they had with

41:45

whether it was Q Star whether they realized

41:47

they just need a lot more compute I

41:49

think the elephant in the room is that

41:51

over the next couple of years we're

41:52

going to get iteratively better systems

41:54

which OpenAI is going to release

41:56

well after they've trained the models

41:57

because they stated that society

41:59

needs time to adapt if you looked at the

42:01

Sora paper they did State one of their

42:03

employees in a deleted tweet stated that

42:05

you know we're going to release that

42:07

just to make sure that people can

42:08

reassess their timelines on what AI's

42:10

capabilities are so I'm guessing that

42:12

they just wanted to release that to tell

42:13

us what it's capable of but the point is

42:15

that they know where these AI systems are

42:16

going to go because they have

42:17

predictable scaling and they've

42:19

accurately predicted GPT 4's capability
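A minimal sketch of what "predictable scaling" could look like in practice: fit a trend on small, cheap training runs and extrapolate it to a much larger compute budget. It assumes a pure power law, loss = a * compute^(-b), fitted by least squares in log-log space. The data points are synthetic and the function names are illustrative; the video does not describe OpenAI's actual fitting procedure.

```python
import math


def fit_power_law(compute, loss):
    """Least-squares fit of loss = a * compute**(-b) in log-log space."""
    xs = [math.log(c) for c in compute]
    ys = [math.log(v) for v in loss]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)


def predict_loss(a, b, compute):
    """Extrapolate the fitted power law to a new compute budget."""
    return a * compute ** (-b)


# Synthetic "small run" measurements generated from a known power law (made-up numbers).
small_compute = [1e18, 1e19, 1e20, 1e21]
small_loss = [5.0 * c ** -0.05 for c in small_compute]

a, b = fit_power_law(small_compute, small_loss)
predicted = predict_loss(a, b, 1e24)  # a run 1000x larger than any fitted point
print(f"a={a:.3f}, b={b:.3f}, predicted loss at 1e24 FLOPs: {predicted:.3f}")
```

Real measurements are noisy and usually need an extra constant term for irreducible loss, so an extrapolation like this is only as good as the assumption that the trend continues.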

42:21

and I'm guessing if they've accurately

42:22

predicted the capabilities all the way

42:24

to GPT-5 6 and 7 even though during that

42:27

time things could really change because

42:29

AI development is so rapid I do

42:32

think that some of this stuff kind of

42:33

does make sense considering the fact

42:35

that they want super alignment solved in

42:37

the next four years and that would align

42:38

with 2027 when if we're continuing on

42:41

the release of a new LLM or

42:44

multimodal AI system each year which is

42:46

what they've already trademarked it kind

42:48

of would make sense that we do get an

42:50

AGI system at that level one question I

42:52

do have is what does Sam need the $7

42:55

trillion for which we already know is

42:57

likely for the AGI-type system and I'm

42:59

guessing that if we're going to train

43:01

something that's potentially of course

43:03

you know really really compute intensive

43:06

you know at first I was thinking that

43:07

something as compute intensive as GPT-4

43:09

which was really really compute

43:10

intensive which is why they

43:12

give us the message cap but with these

43:13

Chinchilla laws now being four times

43:16

less I still think that they're going to

43:17

need a ridiculous amount of compute

43:18

because of course there is that compute

43:20

overhang which Sam Altman talked about which

43:22

essentially just means that there isn't

43:23

enough GPUs to go around there isn't

43:25

enough to actually provide the compute

43:27

but I'm guessing that with the $7 trillion

43:29

whatever Sam Altman has talked about in

43:31

these private interviews whatever he's

43:32

going to be you know pitching to these

43:33

investors I'm guessing that potentially

43:36

this could work now one of the caveats

43:38

that I do have about this is that of

43:39

course some of the information is just

43:41

unverifiable like you're not really able

43:42

to verify some of this stuff but I think

43:45

from this what we can take is that they

43:47

probably do know where AGI is in terms

43:49

of being able to predict the

43:50

capabilities and based on the statements

43:52

that they have said it does look like we

43:54

could be getting a model that is

43:56

going to be a large increase in

43:57

parameter count because it does seem to

43:59

be something that OpenAI did state

44:01

and they did previously state that they

44:02

were going to build something that does

44:04

have the same size as a human

44:07

brain however there were some caveats to

44:09

this because recently Google's AI boss

44:11

actually did say that scale only gets

44:13

you so far but then again Sam Altman is

44:15

trying to raise $7 trillion so I'm

44:17

guessing that if scale only

44:20

got you so far Sam Altman wouldn't be

44:22

trying to raise $7 trillion if they

44:23

could just do it with the compute that

44:25

they already have and so I'm guessing

44:27

that they do need scale because they've

44:29

already predicted it with less compute

44:31

and I'm just going to leave it at that

44:32

but let me know what you guys thought

44:33

about this this wasn't meant to

44:35

be that long but I did want to cover

44:36

absolutely everything and I just wanted

44:38

to make this information available to

44:39

you I don't know whether or not this is

44:41

all going to be possible like I said

44:43

OpenAI could go out and they could

44:44

change everything the truth is

44:46

regardless of all of this interesting

44:47

information and regardless of all of

44:49

this speculation the only people that

44:51

know what's going on are OpenAI and

44:53

the people that work there but let me

44:54

know what you guys think about this do

44:55

you think this is going to happen do

44:56

you think this information is completely

44:58

false do you think this guy's got it

44:59

completely wrong do you think he's got

45:01

it completely right I'd love to know

45:02

your thoughts down in the comment

45:03

section below