37% Better Output with 15 Lines of Code - Llama 3 8B (Ollama) & 70B (Groq)

All About AI
24 Apr 2024 · 16:15

Summary

TLDR In this video, the presenter demonstrates an AI technique for improving how user queries against documents are handled. He shows how vague queries can be replaced with better, more specific ones by rewriting the original question with context added from the previous conversation, and demonstrates the improvement in a second version of the system that contains the fix. He also shows how useful JSON is for getting the desired structured output. According to repeated GPT-4 evaluations, the change improved responses by roughly 30% to 50%. He mentions new updates to the Ollama RAG project on GitHub and plans upcoming videos covering Groq and the Llama 70B model.

Takeaways

  • 🚀 **Asking questions with a RAG system**: The video shows how to query documents through a RAG system, for example asking about the training details of Meta's AI model Llama 3.
  • 🔍 **Handling vague questions**: Discusses how to deal with vague or unspecific questions such as "What does that mean?" and shows how to improve them to get better answers.
  • 💡 **Query rewriting solution**: Implements a query rewriting solution that adds context to the question, yielding richer answers.
  • 📈 **Model performance comparison**: Runs the same queries on the 8B and 70B versions of the Llama model and compares their performance.
  • 📚 **Sponsor segment**: The video's sponsor is Brilliant.org, a learning platform offering interactive courses in math, programming, AI, and data analysis.
  • 🔧 **Code and logic walkthrough**: The video digs into the code and logic behind query rewriting, with detailed steps and explanations.
  • 📝 **Use of JSON**: JSON is used to organize and pass query information, ensuring structured output.
  • 🔗 **GitHub resources**: Mentions updates to the GitHub repo, including switching to the Dolphin Llama 3 model and an Ollama embeddings model.
  • 📦 **Model selection**: Different models can now be picked from the terminal, adding flexibility.
  • 🤖 **Groq and Llama 70B test**: The video ends by testing the system with Groq and the Llama 70B model, showing the effect of the rewritten queries.
  • 🎯 **Improved response quality**: Comparing responses with and without query rewriting shows that rewriting improves response quality by 30-50%.

Q & A

  • What problem does the video address?

    - How to improve a document-based question answering system so that it handles vague or unspecific queries better.

  • Roughly how many tokens was Llama 3 trained on?

    - Llama 3 was trained on roughly 15 trillion tokens.

  • What solution does the video present?

    - A rewritten query: the original query is expanded with extra context to make it clearer and more informative.

  • What is the purpose of rewriting a query?

    - To preserve the core intent and meaning of the original query while expanding and clarifying it, so it becomes more specific and informative for retrieving relevant context (see the prompt sketch after this Q&A).

  • Who is the sponsor mentioned in the video?

    - Brilliant.org, an online learning platform offering courses in math, programming, AI, and data analysis.

  • How can Brilliant.org help improve programming skills?

    - Through its interactive courses, users can learn Python and start building programs from day one, covering essential coding elements such as loops, variables, nesting, and conditionals.

  • What is the ollama chat function mentioned in the video?

    - The ollama chat function (rendered as "AMA shat" in the captions) is the part of the system that handles the user's query and generates the rewritten query.

  • Why is the author happy with using JSON?

    - Because it gives a more deterministic output structure, making the output consistent and predictable.

  • What GitHub project is mentioned in the video?

    - A locally running question answering system described as "super easy 100% local Ollama RAG", built on Llama models.

  • Which models does the author compare?

    - Rewritten queries generated by the 8B Llama 3 model versus the 70B Llama model; the 70B model produced better rewritten queries.

  • How was the effect of the rewritten queries evaluated?

    - By repeatedly asking GPT-4 to compare responses produced with and without query rewriting; the rewritten-query responses were typically rated 30% to 50% better.
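For reference, the rewrite prompt the video reads out can be sketched as follows. The exact string and the helper name are reconstructions from the spoken instructions, not the repo's literal code:

```python
# A minimal sketch of the rewrite prompt described in the video.
# Only the listed instructions come from the video; the wording,
# layout, and function name are assumptions.
def build_rewrite_prompt(conversation_history: str, original_query: str) -> str:
    return f"""Rewrite the following query by incorporating relevant context from the conversation history.
The rewritten query should:
- Preserve the core intent and meaning of the original query
- Expand and clarify the query to make it more specific and informative for retrieving relevant context
- Avoid introducing new topics or queries that deviate from the original query
- NEVER answer the original query; instead, focus on rephrasing and expanding it into a new query
Return only the rewritten query text, without any additional formatting or explanations.

Conversation History:
{conversation_history}

Original query: [{original_query}]

Rewritten query:
"""
```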

Outlines

00:00

🔍 Introduction to the Problem and Solution

The speaker begins by introducing a problem they encountered with their AI system, specifically when asked vague questions that did not pull relevant context from documents. They then demonstrate their solution, which involves rewriting queries to provide more context and specificity, thus improving the AI's ability to retrieve relevant information. The speaker also mentions testing this solution on different models of AI, including the 8B and 70B models.

05:00

🛠️ Explaining the Query Rewriting Process

The speaker provides a step-by-step explanation of how they implemented the query rewriting process. This includes receiving user input, parsing JSON, extracting the original query, constructing a prompt for the AI model, and feeding the rewritten query back into the system to retrieve relevant context. The use of JSON ensures a structured and deterministic output, which aids in the clarity and effectiveness of the rewritten queries.
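Assembled end to end, the flow described here might look roughly like the following Python sketch. The function name, JSON key names, and abbreviated prompt are assumptions; only the overall flow (parse the JSON, extract the query, prompt the model with the two previous messages, return the rewritten query as JSON) comes from the video:

```python
import json
import ollama  # assumes the official Ollama Python client

def rewrite_query(user_input_json: str, conversation_history: list, model: str = "llama3") -> str:
    # Steps 1-3: receive the JSON string and extract the original query
    # (the "Query" / "Rewritten Query" key names are assumptions)
    original_query = json.loads(user_input_json)["Query"]

    # Only the two most recent messages are used, per the video
    context = "\n".join(f"{m['role']}: {m['content']}" for m in conversation_history[-2:])

    # Step 4: build the rewrite prompt (abbreviated; see the full template above)
    prompt = (
        "Rewrite the following query by incorporating relevant context from the "
        "conversation history. Preserve the core intent of the original query, "
        "expand and clarify it, and return only the rewritten query text.\n\n"
        f"Conversation History:\n{context}\n\nOriginal query: [{original_query}]\n\nRewritten query:"
    )

    # Steps 5-6: call the model and treat its response as the rewritten query
    response = ollama.chat(model=model, messages=[{"role": "user", "content": prompt}])
    rewritten = response["message"]["content"].strip()

    # Step 7: return the rewritten query as a new JSON string for the caller,
    # which feeds it to the get-relevant-context step instead of the original query
    return json.dumps({"Rewritten Query": rewritten})
```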

10:02

📈 Updates and Testing with the Llama 70B Model

The speaker discusses updates made to their GitHub repository, including changes to the model and embeddings. They share their experience testing the rewritten-query process on the Llama 70B model, noting that it produced better rewritten queries than the smaller 8B model. The speaker demonstrates the improvement by asking questions and showing how the rewritten queries lead to more detailed and informative responses.
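As a rough illustration of these updates, a sketch like the following shows how model selection from the terminal and Ollama embeddings might fit together; the flag name and the embedding model are hypothetical, not the repo's actual code:

```python
import argparse
import ollama

# Hypothetical sketch: pick the chat model from the terminal and embed
# a document chunk with a local Ollama embeddings model.
parser = argparse.ArgumentParser(description="Super easy 100% local Ollama RAG")
parser.add_argument("--model", default="llama3", help="Ollama chat model to use")
args = parser.parse_args()

# Embed one document chunk locally via Ollama (embedding model is an assumption)
result = ollama.embeddings(model="mxbai-embed-large",
                           prompt="Llama 3 was trained on 15 trillion tokens")
print(f"model={args.model}, embedding dims={len(result['embedding'])}")
```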

15:02

📊 Measuring Improvement and Future Plans

The speaker reveals how they measured the improvement in responses: by comparing responses generated with and without rewritten queries using GPT-4 and Opus, which showed an improvement of about 30-50%. They express gratitude for the support they've received and encourage viewers to check out their GitHub for updates. They also hint at future plans to work more with Groq and the Llama 70B model, pending resolution of rate limit issues.
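The measurement itself could be scripted along the following lines. This is only a sketch of the idea described in the video (repeatedly asking GPT-4 to compare the two responses); the judging prompt and helper name are assumptions:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def judge(response_one: str, response_two: str) -> str:
    """Ask GPT-4 how much better response two (rewritten query) is
    than response one (original query). Prompt wording is assumed."""
    prompt = (
        "Compare the two responses below to the same user question. "
        "Estimate, as a percentage, how much better Response 2 is than "
        "Response 1, and briefly justify the estimate.\n\n"
        f"Response 1 (no query rewriting):\n{response_one}\n\n"
        f"Response 2 (with query rewriting):\n{response_two}"
    )
    result = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return result.choices[0].message.content
```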

Keywords

💡RAG system

A RAG (Retrieval-Augmented Generation) system combines information retrieval with text generation. In the video it is used to ask questions about the presenter's documents: the presenter fires up his RAG system and starts asking questions about the information he fed in about Meta's AI model Llama 3.

💡Tokens

In AI, tokens are units of text, such as words or characters, used when training models. The video cites the number of tokens Llama 3 was trained on to illustrate how advanced the model is and how much data went into developing it.

💡Vague question

A vague question is one whose meaning is unclear or underspecified. In the video, the presenter runs into a problem when he asks a vague question such as "What does that mean?": no context is pulled from the documents to help answer it, which is a retrieval failure.

💡Rewritten query

A rewritten query replaces the original query with a new one that carries more context. The video uses this idea to improve vague user queries by adding context from the previous messages, which improves the answer the AI model returns.

💡Ollama

Ollama (rendered as "AMA" in the auto-generated captions) is the tool used to run large language models locally. In the video, the Llama 3 model runs on Ollama, and the ollama chat function handles context retrieval and query rewriting for user queries.

💡Llama 3 Model

Llama 3 is the AI model featured in the video, used to answer queries over the presenter's documents. It was trained on 15 trillion tokens, which indicates the model's scale and its capacity for processing information.

💡Json

JSON (JavaScript Object Notation) is a structured data format. In the video it is used to format the model's inputs and outputs, which guarantees that the desired output arrives in a consistent, organized form.
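Concretely, the JSON round trip looks something like this; the key names are assumptions based on what appears on screen:

```python
import json

user_input_json = '{"Query": "what does that mean"}'    # from the user (step 1)
original_query = json.loads(user_input_json)["Query"]   # parse and extract (steps 2-3)

rewritten = "What does it mean that Llama 3 was trained on 15 trillion tokens?"
print(json.dumps({"Rewritten Query": rewritten}))       # deterministic, structured output
```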

💡Brilliant.org

Brilliant.org is a learning platform focused on building skills in math, programming, and data analysis. In the video it appears as the sponsor and is presented as a good starting point for anyone who wants to learn programming and data analysis.

💡AI model

An AI model is a trained model used to analyze data and extract knowledge. In the video, an AI model is used to rewrite queries and improve the retrieval of context from the documents.

💡GitHub

GitHub is a collaboration platform for software development. In the video it is used to share the source code of the RAG system, and viewers are invited to contribute and comment on the project.

💡Llama 70b model

The Llama 70B model is a larger AI model with 70 billion parameters. The video uses it to show how a bigger model improves the rewritten queries and the answers to user queries.

Highlights

The speaker is introducing a problem they wanted to solve regarding handling vague questions in an AI system.

They demonstrate the AI system's initial inability to provide context for vague queries.

The speaker presents a solution involving a rewritten query to provide more context to vague questions.

The AI model, Llama 3, is shown to provide an answer after the query is rewritten, improving the response.

The speaker explains the process of rewriting queries using conversation history to improve specificity.

A step-by-step explanation of the code and logic behind the query rewriting process is provided.

The use of JSON for structured output is highlighted as a key component of the solution.

The speaker discusses the improvements made to the ollama chat function to incorporate the rewritten query feature.

The impact of using a larger AI model, Llama 70B, on the quality of the rewritten queries is explored.

The speaker shares the results of comparing responses with and without the rewritten query, showing an improvement of 30-50%.

The practical application of the rewritten query feature is demonstrated through a live example using the Llama 70B model.

The speaker provides a humorous estimate of how many books a human would need to read to match Llama 3's training data.

The importance of the project for improving AI's ability to understand and respond to vague human queries is emphasized.

The speaker expresses satisfaction with the current state of the project and its potential for further development.

Updates to the GitHub repository related to the project are mentioned, inviting interested individuals to explore and contribute.

The speaker teases an upcoming video featuring more work with Groq and the Llama 70B model, subject to overcoming rate limit issues.

A call to action for viewers to support the project by starring the GitHub repository is included.

The video concludes with an invitation to join a subsequent live session and well wishes for the viewers' week.

Transcripts

00:00

today I'm going to start by showing you

00:01

the problem I wanted to solve I want to

00:03

show you how I tried to solve it and if

00:05

it was a success and then I'm going to

00:07

explain it to you so you can understand

00:09

it and start using this too so yeah

00:11

let's just get started okay so what you

00:13

see here is me I have fired up my rag

00:16

system so we can start asking questions

00:18

about my documents so I fed in some

00:20

information about meta's AI llama 3

00:23

right I asked the question how many

00:25

tokens was llama 3 trained on and we

00:28

have the context that is pulled from the

00:29

document

00:30

and we use that context to kind of

00:32

answer llama 3 was pretrained on 15

00:34

trillion tokens okay so far so good

00:37

right and here comes kind of my problem

00:40

uh it's not a big problem right if you

00:42

know what you're doing but what happens

00:44

when I say what does that mean a very

00:47

vague question right okay so we say that

00:51

no we don't pull anything from our

00:53

documents right so that means that we

00:56

don't have any relevant context to this

00:59

problem so this is kind of the problem I

01:01

wanted to take a look at today how can

01:03

we kind of improve this so yeah I'm just

01:06

going to show you how I implemented a

01:09

solution for this and how it works okay

01:13

so let's fire up the second version here

01:15

so this is the version that contains my

01:17

solution so we're going to ask the same

01:19

question how many tokens was llama 3

01:21

trained on uh this is running on the 8B

01:24

llama 3 Model on Ollama so it's totally

01:27

locally right and you can see see the

01:30

Llama 3 was trained on over 15 trillion

01:32

tokens so pretty much exactly the same

01:35

answer as before what if we say what

01:38

does that mean so a very vague question

01:41

right so what I implemented was this

01:44

kind of Rewritten query so we take our

01:47

original query and we try to rewrite it

01:51

so can you provide more details about

01:52

the improvements made in llama 3

01:54

compared to its predecessor increasing

01:56

training data code size support for non-English

01:59

languages also how does a tokenizer yeah

02:02

blah blah blah so you can see we added

02:03

much more context to our query uh just

02:07

by putting this through some kind of

02:09

solution I'm going to show you and now

02:11

you can see we get context pull from the

02:14

documents even though our query was kind

02:16

of the same and yeah you can see we get

02:19

a pretty good answer here I'm not going

02:21

to read it but you can pause and read if

02:23

you want to so yeah I'm pretty happy how

02:26

this works out this is of course not

02:28

like a perfect solution but uh for me

02:31

this has improved the responses a bit at

02:35

least in this very small model I haven't

02:37

tried it too much we're going to try it

02:39

on the 70b model later uh this video but

02:42

for now yeah I'm pretty happy with this

02:44

so uh I think we're just going to head

02:46

over and try to explain how this works

02:48

because a lot of you enjoy that in the

02:50

previous video going a bit deeper into

02:52

the code and explaining the logic and

02:55

kind of how this works so yeah let's do

02:57

that but first let's say you are one of

02:58

those that wants to learn more about

02:59

about Python and computer science then

03:02

you should really pay attention to

03:03

today's sponsor brilliant have you ever

03:06

wondered how to make sense of vast

03:07

amounts of data or maybe you're eager to

03:09

learn coding but you don't know where to

03:11

start well brilliant.org the sponsor of

03:13

today's video is the perfect place to

03:15

learn these skills Brilliant is a

03:17

learning platform that is designed to be

03:19

uniquely effective their interactive

03:20

lessons in math programming Ai and data

03:23

analysis are created by a team of

03:26

award-winning teachers professionals and

03:28

researchers if you looking to build a

03:30

foundation in probability to better

03:32

understand the likelihood of events the

03:34

course introduction to probability is a

03:37

great place to start you work with real

03:38

data set from sources like Starbucks

03:40

Twitter Spotify learning to parse and

03:42

visualize massive data sets to make them

03:45

easier to interpret and for those ready

03:46

to level up their programming skills the

03:48

creative coding course is a must you'll

03:51

get familiar with python and start

03:52

building programs on day one learning

03:55

essential coding elements like Loops

03:57

variables nesting and conditionals what

03:59

set brilliant apart is that it helps you

04:01

build critical thinking skills to

04:03

problem solving not just memorizing so

04:05

while you are getting knowledge on

04:06

specific topics you're also becoming a

04:09

better thinker overall to try everything

04:11

brilliant has to offer for free for 30

04:13

days visit brilliant.org/AllAboutAI or

04:16

just click the link in the description

04:18

below you will also get 20% off an

04:20

annual premium subscription a big thanks

04:23

to brilliant for sponsoring this video

04:24

now let's go back to the project okay so

04:27

you can see from the code here these

04:29

lines here and a few lines down here in

04:31

our ollama chat function was kind of all I

04:34

added to try to solve this problem if

04:37

you can call it a problem uh but yeah

04:40

that was something I wanted to try to do

04:42

so uh I'm just going to explain quickly

04:44

how this works or not quickly I'm going

04:46

to go into a bit of a detail but you can

04:48

see we have a pretty long prompt here so

04:51

I'm going to blow this up so you can see

04:52

it better and then we're going to move

04:54

on and kind of go through step by step

04:56

now how this actually works so yeah

04:58

hopefully you can learn learn something

05:00

about this I want to start by explaining

05:02

how I thought about this prompt we used

05:04

for this so uh basically I'm just going

05:06

to go through it and explain it so you

05:08

can see rewrite the following query by

05:10

incorporating relevant context from the

05:13

conversation history so we are actually

05:15

using bits from our conversation history

05:17

the two previous messages to try to

05:20

improve our query so the Rewritten query

05:23

should preserve the core intent and

05:25

meaning of the original query expand and

05:27

clarify the query to make it more

05:29

specific and informative for retrieving

05:31

relevant context avoid introducing new

05:34

topics and queries that deviate from the

05:35

original query don't ever answer the

05:38

original query but instead focus on

05:40

rephrasing and expanding it into a new

05:42

query return only the Rewritten query

05:44

text without any additional formatting

05:46

or explanations then we're going to pass

05:48

in our context so we're going to pass in

05:50

our two previous messages then we're

05:52

going to pass in our original query from

05:54

the user input and then we are want our

05:57

Rewritten query as the output right and

06:00

that is kind of how I set this prompt up

06:03

so of course this is important uh but we

06:05

are using some help from Json here to

06:08

get the structure output we want so that

06:10

is kind of what I wanted to explain to

06:13

you in this step-by-step process here okay

06:15

so let's start on step one here so

06:17

receive user input Json the function

06:20

receives a Json string containing user's

06:21

original query for example query and

06:24

what does that mean right how this is

06:27

being put into a rewrite query function

06:30

is in the ollama chat function here so if

06:33

we take a look here so I kind of set

06:35

this up so the first query we make to

06:39

our rag system is not going to have a

06:41

Rewritten query because I found out that

06:43

was pretty stupid we don't need that but

06:46

from the second query we put in

06:48

everything is going to be Rewritten so

06:50

you can see here is our function and

06:53

we're going to pass in this Json here

06:56

that comes from this one right so we can

06:58

see we have the user input here okay so

07:00

in step two here we're going to parse the

07:02

Json to a dictionary so the Json string

07:04

is converted to a python dictionary

07:06

using json.loads so this could for

07:08

example be user input equals and then we

07:10

have a query and uh yeah the parameter

07:12

for the query could be what does this

07:14

mean okay and then we kind of move on to

07:17

step three that is going to be

07:18

extracting the original query from this

07:21

python dictionary so let's say we have

07:24

this as a python dictionary now we kind

07:26

of want to grab the query right so the

07:30

user input now is equal to what does

07:33

that mean because we grabbed it from

07:35

this python dictionary up here right The

07:37

Next Step then is going to be step four

07:39

and this is preparing the prompt for the

07:40

AI model a prompt is constructed that

07:42

includes a conversation history and

07:43

instruction for rewriting the query we

07:46

already took a look at that right up

07:47

here so we kind of know how this prompt

07:50

Works uh and you can see in step five we

07:52

want to call our AI model so in this

07:54

case this is Ollama running on llama 3 uh

07:58

is called with a prepared prompt the

08:00

model generates a Rewritten version of

08:02

the query and if we move on to step six

08:04

that is going to be extracting the

08:06

Rewritten query so the Rewritten query

08:07

extracted from the models response if

08:09

you take a look at the code here that is

08:11

going to happen yeah here right so

08:14

Rewritten query and we have the response

08:16

from the model right we feed in our

08:19

prompt and we kind of get this json.dumps

08:23

Rewritten query out and here we pass in

08:25

our Rewritten query from the response

08:28

from the model right and that is of

08:30

course going to be step seven so return

08:31

Rewritten query in Json a new Json

08:34

string is constructed containing the

08:35

Rewritten query and return to the

08:37

calling function so this could be for

08:39

example Rewritten query right like we

08:42

saw down here Rewritten query and we can

08:45

maybe the parameters is going to be what

08:47

does it mean that llama 3 has been

08:49

trained on 15 trillion tokens and that

08:51

means that we're ready for kind of our

08:53

final step and that is going to be to

08:55

feed this Rewritten query back to uh or

08:58

to the get relevant context function

09:01

down in our ollama chat function right so

09:05

you can see Rewritten query here and

09:08

this is going to be fed back into the

09:11

get relevant context right so if we go

09:14

down here you can see we are feeding a

09:17

relevant query here into the get

09:20

relevant context function and we are

09:22

skipping all together uh the user the

09:26

original user input or the original user

09:29

query is not going to be fed into

09:31

the relevant context so we only going to

09:34

pass in a Rewritten query right so you

09:36

can see the Rewritten query is passed to

09:38

get to the get relevant context function

09:41

which retrieve relevant context from the

09:43

knowledge Vault based on the Rewritten

09:44

query and that is kind of how I set this

09:47

up so uh like I said the original user

09:51

query here is not going to be even

09:55

consideration even though we print it uh

09:57

that is just to compare it

09:59

if we take a look here you can see we

10:01

print it here but we are not going to

10:04

pass it into any functions so we only

10:07

going to print it so we can kind of

10:08

compare them here side by side just for

10:11

fun I guess so yeah that is kind of how

10:14

I set this up and so far I'm been pretty

10:16

happy with it I hope it was uh okay to

10:20

kind of understand how this works so it

10:22

really helps using Json because that

10:26

gives us a more deterministic output

10:29

so we will always kind of get this very

10:32

structured form and if I tried to use I

10:36

tried to not use Json but that was not a

10:39

great success but you can try that if

10:41

you want to uh but for me this has been

10:43

working pretty okay so this code is

10:45

going to be uh kind of an extension of my

10:48

the GitHub repo you can see on the

10:50

screen here which is the super easy 100%

10:53

local uh Ollama RAG uh so we made some

10:56

updates uh what we updated was we are

10:59

using the Dolphin Llama 3 model now

11:02

uh we have an update that we change our

11:05

embeddings models so we are actually

11:07

using an Ollama embeddings model now and

11:10

that has been working out pretty good we

11:13

have a few other updates so we can kind

11:15

of pick our model from the terminal line

11:17

if we want to do that yeah just some

11:19

issues we had that I got on the GitHub

11:22

that I have implemented uh of course

11:24

this is just a a layout so you can do

11:27

whatever you want with this you can find

11:28

the link in the description I'm probably

11:30

going to put put this video up and

11:32

explaining all the updates to the code

11:35

here so this should be code should be up

11:38

now so you can start playing around with

11:39

it and yeah I hope you kind of enjoyed

11:42

it so I kind of wanted to finish this

11:44

video by I created a local rag version

11:48

here using Groq and the Llama 70B model

11:53

uh I was supposed to do a video today

11:56

using Groq and Llama 70B but I had so

11:58

many issues with the rate limit so I had

12:01

to skip it that might be for Sunday we

12:03

will see but let's just finish up this

12:06

video by testing this using the Groq and

12:10

the Llama 70B so yeah this is basically

12:13

exactly the same uh I found that the

12:15

Rewritten queries were a bit better so

12:17

let's just try the same questions so how

12:19

many

12:20

tokens was this you can see it's pretty

12:23

fast though we're running right uh okay

12:25

so let's do what does that mean okay so

12:28

let's take a look at the Rewritten query

12:30

here what does that mean what does it

12:33

mean that llama 3 was trained on

12:34

enormous data set uh equivalent of two

12:37

billion

12:38

books 15 trillion tokens what is the impact on

12:41

model ability so you can see yeah this is

12:44

a much better Rewritten query right this

12:47

is good so let's see the answer

12:50

here uh okay here is a breakdown of what

12:53

it means 15 trillion tokens refers to

12:56

the massive amount of data used okay T

12:59

stands for

13:00

trillions uh okay so this is great

13:03

tokens are individual units of text such as

13:05

words and characters the model was trained on a

13:08

huge data set wow this is good right so

13:12

we got all of this just by asking what

13:15

does this mean so you can see how

13:17

actually how good this Rewritten query

13:19

is and of course the better model we are

13:22

going to use the better answer we are

13:23

going to get right in summary llama 3 is

13:25

a highly adapt highly Advanced language

13:28

model trained on enormous data set with

13:29

focus on simplicity scalability and high

13:32

quality

13:34

data uh let's do wow

13:38

that's crazy how many books must a human

13:42

read to be this smart that's a bad

13:45

question um what's the equivalent amount

13:49

of human reading in terms of number of

13:51

books that would be required to achieve

13:52

the same level of understanding and knowledge as Llama 3

13:55

trained on 15 trillion tokens of data again a

13:57

very good Rewritten query if you ask me uh

14:01

what a question to put it at scale and

14:04

it goes

14:06

into uh okay so let's

14:09

say to read

14:11

330,000 to 600,000 books it would take

14:16

around

14:17

16,500 to 30,000 years assuming one book

14:21

per week uh around 15,000 years assuming

14:25

two books per week of course this is a

14:29

rough estimate that's meant to be humorous

14:31

this model is so good so yeah you can

14:35

see we have to read uh around how many

14:38

books was it 600,000 books to be this

14:42

smart so yeah uh so I I think this kind

14:45

of shows how good this uh Rewritten

14:49

query kind of is and yeah how good the

14:52

70b model is so really excited about

14:54

llama 3 uh hope you found this

14:57

enjoyable hope you learned something

14:58

from it it that's the most important

15:00

thing the result doesn't matter too much

15:02

but maybe this give you some new ideas

15:04

how you can use embeddings to kind of

15:06

improve stuff how you can use uh the get

15:09

relevant context function to do other

15:11

stuff so yeah so I guess a lot of you

15:13

wondering where I got the 30% better

15:15

response from so what I did is I took

15:17

one response uh without the rewrite

15:21

query and I took like a second response

15:23

with the rewriting query function and I

15:26

asked first GPT-4 to compare them

15:29

and I asked a lot of times and most of

15:32

the times it came in between 30 and 50%

15:35

better response than response one so

15:38

response two is the one with the rewrite

15:40

query and I did the same on Opus and

15:43

here it always landed in 30 40% better

15:47

than response one so response two that

15:49

was with the uh yeah rewrite query

15:52

function so yeah that is where I got the

15:55

37% from just want to say big thank you

15:58

for the support lately it's been awesome

16:00

and give the GitHub a star if you

16:03

enjoyed it other than that come back for

16:06

Sunday probably going to do more Groq and

16:08

llama 70b if the rate limit is okay have

16:12

a great week and I see you again yeah

16:14

Sunday
