37% Better Output with 15 Lines of Code - Llama 3 8B (Ollama) & 70B (Groq)

All About AI
24 Apr 2024 · 16:15

Summary

TL;DR: The video details an approach to improving the effectiveness of AI language models when faced with vague user queries. The creator demonstrates a system that uses a query-rewriting function to produce more detailed and informative responses. By incorporating relevant context from previous messages, the system generates a more specific query, which leads to more accurate and useful information being retrieved from documents. The creator also discusses the use of the Llama 3 8B and 70B models and shares their excitement about the potential of these models. The video concludes with a comparison of the rewritten query function's effectiveness, showing an improvement of around 30-50% in response quality.

Takeaways

  • πŸ“ˆ The speaker developed a solution to improve the handling of vague queries by rewriting them to include more context from previous messages.
  • πŸ” The AI model, Llama 3, was trained on 15 trillion tokens, which is a significant amount of data equivalent to around two billion books.
  • πŸ’‘ The rewritten query function was designed to preserve the core intent of the original query while making it more specific and informative.
  • βœ… The speaker demonstrated the effectiveness of the rewritten query by comparing the responses from the AI model with and without the rewritten query.
  • πŸ“š The use of JSON was emphasized for structured output, ensuring a deterministic format for the rewritten query.
  • πŸ€– The AMA chat function was updated to include a rewritten query step for all user inputs after the first, enhancing the context retrieval.
  • πŸš€ The speaker tested the solution using the Llama 70b model, noting that it provided better rewritten queries and more detailed responses.
  • πŸ“ The speaker mentioned that the rewritten query function improved the response quality by about 30-50% as determined by comparing responses from the AI model.
  • πŸŽ“ The video includes a sponsorship for Brilliant.org, a learning platform for math, programming, AI, and data analysis.
  • πŸ”§ The speaker provided a detailed step-by-step explanation of the code and logic behind the rewritten query function.
  • 🌟 The speaker expressed satisfaction with the improvements made to the Ollama chat function and encouraged viewers to explore and learn from the code.

Q & A

  • What was the problem the speaker initially wanted to solve?

    -The speaker wanted to solve the issue of vague questions not pulling relevant context from documents, which led to less informative responses.

  • How many tokens was Meta's AI, Llama 3, trained on?

    -Llama 3 was trained on 15 trillion tokens.

  • What does the speaker mean by 'Rewritten query'?

    -A 'Rewritten query' is a reformulated version of the user's original query that incorporates relevant context from the conversation history, making it specific and informative enough to retrieve the right passages from the documents.

  • What improvements were made in Llama 3 compared to its predecessor?

    -The improvements mentioned include increased training data and code data, support for non-English languages, and an enhanced tokenizer.

  • How does the speaker's solution handle vague questions?

    -The speaker's solution rewrites vague questions by adding more context, which helps in retrieving relevant information from documents even when the original query is not specific.

  • What is the role of JSON in the speaker's solution?

    -JSON is used to structure the output from the solution, ensuring a deterministic and well-organized format for the rewritten queries and responses.

  • How does the speaker's solution improve responses from the AI model?

    -The solution improves responses by rephrasing and expanding vague queries into more specific ones that can pull relevant context from documents, leading to more informative answers.

  • What is the significance of the 70B model in the speaker's project?

    -The 70B model is a larger, more capable version of Llama 3 that the speaker runs via Groq to test whether the rewritten query solution holds up at a larger scale.

  • What is the 'get relevant context' function in the speaker's project?

    -The 'get relevant context' function retrieves relevant information from the knowledge vault based on the rewritten query, which is more specific and informative due to the solution's processing.

  • How does the speaker evaluate the effectiveness of the rewritten query?

    -The speaker evaluates the effectiveness by comparing responses generated with and without the rewritten query, asking GPT-4 (and Claude Opus) to judge which response is better, and averaging the improvement over many runs; a sketch of this comparison setup follows this list.

  • What is the estimated time it would take for a human to read the equivalent amount of books that Llama 3 was trained on?

    -Assuming a human reads one book per week, it would take around 16,500 to 30,000 years to read the equivalent amount of books that Llama 3 was trained on, which is based on 15 trillion tokens.
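The exact evaluation script is not shown in the video; a minimal sketch of this kind of LLM-as-judge comparison, assuming the OpenAI Python client (v1 API) and hypothetical prompt wording, might look like this:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def judge(question: str, response_1: str, response_2: str) -> str:
    """Ask GPT-4 how much better response 2 is; prompt wording is hypothetical."""
    prompt = (
        "You are judging two answers to the same question.\n"
        f"Question: {question}\n\n"
        f"Response 1 (no query rewriting):\n{response_1}\n\n"
        f"Response 2 (with query rewriting):\n{response_2}\n\n"
        "Estimate, as a percentage, how much better or worse Response 2 is "
        "than Response 1, and briefly justify the estimate."
    )
    result = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return result.choices[0].message.content

# Repeating this over many paired responses and averaging the judged
# percentages is what produces the ~30-50% improvement figure cited above.
```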

Outlines

00:00

πŸš€ Introduction to the AI Query Optimization Project

The speaker introduces a problem they aimed to solve regarding AI query handling. They explain their process of feeding information into an AI system, asking questions, and receiving answers. The issue arises when a vague question is asked, and the AI fails to pull relevant context from the documents. The speaker then demonstrates their solution, which involves rewriting queries to provide more context and improve the AI's responses. They also mention testing the solution on different AI models and express satisfaction with the results.

05:00

πŸ“ Step-by-Step Explanation of Query Rewriting Process

The speaker provides a detailed walkthrough of how they approached rewriting queries. They discuss the structure of the prompt used for the AI model, emphasizing the importance of using conversation history to improve the query. The process involves receiving user input, parsing it into a dictionary, extracting the original query, constructing a prompt for the AI model, and generating a rewritten query. The rewritten query is then used to retrieve relevant context from a knowledge vault, which is a significant improvement over the original user query.

10:02

πŸ” Testing and Updates to the AI System

The speaker shares their experience testing the query rewriting solution and mentions updates made to their GitHub repository. They discuss running a different model, Llama 3 70B, and the benefits of using JSON for structured output. The speaker also covers improvements to the system, such as a new embeddings model and letting users select models from the terminal. They express excitement about the potential of the 70B model and its ability to provide better answers.

15:02

πŸŽ“ Conclusion and Future Plans

The speaker concludes by summarizing the benefits of using rewritten queries and the effectiveness of the Llama 3 70B model. They mention conducting tests to compare the quality of responses with and without the rewritten query feature, which showed an improvement of about 30-50%. The speaker thanks the audience for their support, encourages them to star the GitHub repository, and hints at future videos involving more work with Groq and the Llama 3 70B model, pending resolution of rate limit issues.

Keywords

πŸ’‘RAG system

The RAG system, which stands for Retrieval-Augmented Generation, is a type of AI model that combines retrieval mechanisms with generative capabilities. In the video, the RAG system is utilized to ask questions about documents and retrieve relevant information. It is central to the problem-solving process that the creator is demonstrating, as it forms the basis of the AI's ability to understand and respond to queries.

πŸ’‘Tokens

In the context of AI and natural language processing, tokens refer to the individual units of text, such as words and characters, that are used to train language models. In the video, the creator discusses how the AI model 'Llama 3' was trained on 15 trillion tokens, highlighting the vast amount of data that the model has been exposed to in order to learn and generate human-like text.
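As a rough illustration of tokenization, OpenAI's tiktoken library (not Llama 3's actual tokenizer) shows how a sentence breaks into tokens:

```python
import tiktoken  # pip install tiktoken; illustrative only, not Llama 3's tokenizer

enc = tiktoken.get_encoding("cl100k_base")
text = "Llama 3 was trained on 15 trillion tokens."
token_ids = enc.encode(text)
print(len(token_ids))                        # number of tokens in the sentence
print([enc.decode([t]) for t in token_ids])  # the individual token strings
```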

πŸ’‘Vague question

A vague question is one that lacks specificity and can be open to broad interpretation. In the video, the creator points out the challenge of handling vague questions like 'What does that mean?' when using an AI system. The problem arises because vague questions may not provide enough context for the AI to generate a relevant and informative response.

πŸ’‘Rewritten query

A rewritten query is a reformulated version of the original user's question, which is designed to be more specific and informative. The creator implements a solution that automatically rewrites vague queries to provide more context and clarity. This technique is shown to improve the AI's ability to retrieve relevant information from documents, even when the original query is not very specific.

πŸ’‘AMA (Ask Me Anything)

AMA, short for 'Ask Me Anything,' is a concept where an individual or a system is open to a wide range of questions, often found in forums and Q&A sessions. In the video, the AMA chat function is a part of the system that interacts with the RAG model to facilitate the asking and answering of questions, demonstrating the practical application of the AI's capabilities.

πŸ’‘Llama 3 Model

The Llama 3 Model refers to a specific version of an AI language model used in the video. It is mentioned as being trained on a significant amount of data, which is a key factor in its ability to understand and generate responses. The model's capabilities are tested and demonstrated through the various queries and rewritten queries presented in the video.

πŸ’‘Contextual understanding

Contextual understanding in AI refers to the model's ability to comprehend the meaning of information based on the context in which it is presented. The video emphasizes the importance of context in providing accurate responses. The creator's solution for handling vague queries involves adding context to improve the AI's ability to understand and respond appropriately.

πŸ’‘JSON

JSON, or JavaScript Object Notation, is a lightweight data-interchange format that is easy for humans to read and write, and easy for machines to parse and generate. In the video, JSON is used to structure the input and output data for the AI model, ensuring that the data is formatted consistently and predictably, which aids in the process of rewriting queries and retrieving relevant information.
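For instance, wrapping the rewritten query in JSON gives the calling code a fixed shape to parse, no matter how the model phrases its output; the key name here is illustrative:

```python
import json

# Serialize: a fixed, predictable shape for downstream code.
payload = json.dumps(
    {"Rewritten Query": "What does it mean that Llama 3 was trained on 15 trillion tokens?"}
)

# Parse: the caller can rely on the key being present.
rewritten = json.loads(payload)["Rewritten Query"]
print(rewritten)
```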

πŸ’‘Brilliant.org

Brilliant.org is an online learning platform mentioned in the video as a sponsor. It offers interactive lessons in various subjects, including math, programming, AI, and data analysis. The platform is highlighted for its effectiveness in building critical thinking and problem-solving skills, which are relevant to the video's theme of improving AI's ability to understand and respond to queries.

πŸ’‘Grok and Llama 7B

Grok and Llama 7B refer to a specific AI model and a framework or tool used in the video for testing the system's capabilities. The creator was planning to demonstrate the use of Grok with the Llama 7B model but encountered rate limit issues. This mention shows the continuous exploration and testing of different models and tools in the pursuit of improving AI performance.

πŸ’‘Rate limit

A rate limit is a restriction put in place by systems or services to control the number of requests a user can make within a certain time period. In the video, the creator mentions encountering rate limit issues while attempting to use Grok with the Llama 7B model, which prevented the demonstration. This highlights the practical challenges that can be faced when working with AI models and services.

Highlights

The speaker introduces a problem related to handling vague questions in an AI system and presents a solution to improve the system's responses.

The AI system is demonstrated with a question about Meta's AI, Llama 3, and its training on 15 trillion tokens.

A solution is implemented to rewrite vague queries to provide more context and specificity, leading to better responses from the AI.

The speaker shows the AI's improved ability to answer vague questions by demonstrating a rewritten query that fetches relevant context from documents.

The process of rewriting queries is detailed, explaining how it preserves the core intent while expanding on the original query for more specificity.

The use of JSON is highlighted for its role in structuring the output and ensuring a deterministic format for the rewritten queries.

The speaker discusses the Ollama chat function and how it is updated to include the new query rewriting feature.

The speaker provides a step-by-step explanation of how the query rewriting process works within the Ollama chat function.

A sponsor, Brilliant.org, is introduced for those interested in learning Python and computer science, offering interactive lessons in math, programming, AI, and data analysis.

The speaker shares the GitHub repository link for those interested in the project and its updates.

An update to the system using Groq and the Llama 3 70B model is mentioned, with a demonstration of its capabilities.

The speaker discusses the improved performance of the rewritten query function, with an estimated 30-50% better response compared to the original query.

A humorous comparison is made between the amount of data Llama 3 was trained on and the equivalent amount of human reading required to achieve similar understanding.

The speaker expresses excitement about the potential of Llama 3 and encourages viewers to explore new ideas for using embeddings and the get relevant context function.

The speaker thanks the audience for their support and invites them to give a star on GitHub if they enjoyed the content.

An upcoming video on Sunday is teased, which will likely feature more on Groq and Llama 3 70B, subject to rate limit conditions.

The speaker concludes by emphasizing the importance of learning from the project and looking forward to future interactions.

Transcripts

00:00

Today I'm going to start by showing you the problem I wanted to solve, how I tried to solve it, and whether it was a success, and then I'm going to explain it so you can understand it and start using it too. So let's get started. What you see here is my RAG system fired up, so we can start asking questions about my documents. I fed in some information about Meta's AI, Llama 3, and asked the question: how many tokens was Llama 3 trained on? We have the context that is pulled from the document, and we use that context to answer: Llama 3 was pretrained on 15 trillion tokens. So far so good. And here comes my problem. It's not a big problem if you know what you're doing, but what happens when I say "what does that mean?", a very vague question? We don't pull anything from our documents, which means we don't have any relevant context for it. That is the problem I wanted to take a look at today: how can we improve this? I'm just going to show you how I implemented a solution and how it works.

01:13

Let's fire up the second version, the one that contains my solution, and ask the same question: how many tokens was Llama 3 trained on? This is running on the 8B Llama 3 model on Ollama, so it's totally local. You can see the answer, that Llama 3 was trained on 15 trillion tokens, is pretty much exactly the same as before. Now, what if we say "what does that mean?", that same very vague question? What I implemented is the rewritten query: we take our original query and try to rewrite it, for example into "Can you provide more details about the improvements made in Llama 3 compared to its predecessor, such as the increased training data and code size, support for non-English languages, and the tokenizer?". You can see we added much more context to our query just by putting it through the solution I'm going to show you, and now we get context pulled from the documents even though our question was essentially the same. We get a pretty good answer here; I'm not going to read it, but you can pause and read it if you want to.

02:26

I'm pretty happy with how this worked out. It is of course not a perfect solution, but for me it has improved the responses, at least on this very small model. I haven't tried it too much yet; we're going to try it on the 70B model later in this video. For now, let's head over and explain how this works, because a lot of you enjoyed that in the previous video: going a bit deeper into the code and explaining the logic. So let's do that.

02:57

02:57

But first, if you are one of those who want to learn more about Python and computer science, pay attention to today's sponsor, Brilliant. Have you ever wondered how to make sense of vast amounts of data, or maybe you're eager to learn coding but don't know where to start? Brilliant.org, the sponsor of today's video, is the perfect place to learn these skills. Brilliant is a learning platform designed to be uniquely effective. Their interactive lessons in math, programming, AI, and data analysis are created by a team of award-winning teachers, professionals, and researchers. If you're looking to build a foundation in probability to better understand the likelihood of events, the course Introduction to Probability is a great place to start: you work with real data sets from sources like Starbucks, Twitter, and Spotify, learning to parse and visualize massive data sets to make them easier to interpret. And for those ready to level up their programming skills, the Creative Coding course is a must. You'll get familiar with Python and start building programs on day one, learning essential coding elements like loops, variables, nesting, and conditionals. What sets Brilliant apart is that it helps you build critical thinking skills through problem solving, not just memorizing; so while you're gaining knowledge on specific topics, you're also becoming a better thinker overall. To try everything Brilliant has to offer free for 30 days, visit brilliant.org/AllAboutAI or just click the link in the description below. You will also get 20% off an annual premium subscription. A big thanks to Brilliant for sponsoring this video. Now let's get back to the project.

04:27

04:27

Okay, so you can see from the code here that these lines, plus a few lines further down in our ollama_chat function, were basically all I added to try to solve this problem, if you can call it a problem. I'm going to explain how it works, and not just quickly; I'll go into a bit of detail. We have a pretty long prompt here, so I'm going to blow it up so you can see it better, and then we'll go through, step by step, how this actually works. Hopefully you can learn something from this.

05:00

I want to start by explaining how I thought about the prompt we use for this. Basically, I'll just go through it: "Rewrite the following query by incorporating relevant context from the conversation history." So we are actually using bits of our conversation history, the two previous messages, to try to improve our query. "The rewritten query should: preserve the core intent and meaning of the original query; expand and clarify the query to make it more specific and informative for retrieving relevant context; avoid introducing new topics or queries that deviate from the original query; never answer the original query, but instead focus on rephrasing and expanding it into a new query; return only the rewritten query text, without any additional formatting or explanations." Then we pass in our context, the two previous messages, then our original query from the user input, and then we want the rewritten query as the output. That is how I set this prompt up. Of course the prompt is important, but we are also getting some help from JSON to get the structured output we want, and that is what I want to explain in this step-by-step process.
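In code, assembling that prompt might look like the sketch below. The instruction text is paraphrased from what is shown on screen, and the function and variable names are assumptions rather than the repo's exact code.

```python
# Paraphrased from the on-screen prompt; exact wording in the repo may differ.
def build_rewrite_prompt(context: str, original_query: str) -> str:
    return f"""Rewrite the following query by incorporating relevant context from the conversation history.
The rewritten query should:
- Preserve the core intent and meaning of the original query
- Expand and clarify the query to make it more specific and informative for retrieving relevant context
- Avoid introducing new topics or queries that deviate from the original query
- NEVER answer the original query; instead focus on rephrasing and expanding it into a new query

Return ONLY the rewritten query text, without any additional formatting or explanations.

Conversation History:
{context}

Original query: [{original_query}]

Rewritten query:"""
```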

06:15

06:15

Let's start with step one: receive the user input as JSON. The function receives a JSON string containing the user's original query, for example a "Query" field holding "What does that mean?". Where this gets fed into the rewrite query function is in the ollama_chat function. I set it up so that the first query we make to our RAG system does not get a rewritten query, because I found that was pretty pointless; we don't need it there. But from the second query onwards, everything is rewritten. You can see the function here, and we pass in the JSON that comes from the user input.

07:00

In step two we parse the JSON into a dictionary: the JSON string is converted to a Python dictionary using json.loads. This could for example be a user_input dictionary whose "Query" value is "What does this mean?". Then we move on to step three, extracting the original query from that Python dictionary: we grab the query, so the user input is now equal to "What does that mean?", because we pulled it out of the dictionary. The next step, step four, is preparing the prompt for the AI model: a prompt is constructed that includes the conversation history and the instruction for rewriting the query; we already took a look at that above, so we know how the prompt works. In step five we call our AI model, in this case Ollama running Llama 3, with the prepared prompt, and the model generates a rewritten version of the query.

08:04

Step six is extracting the rewritten query from the model's response. If you look at the code, that happens right here: we feed in our prompt, get the response from the model, and pull the rewritten query out of it. Step seven is returning the rewritten query as JSON: a new JSON string is constructed, via json.dumps, containing the rewritten query, and returned to the calling function. That could for example be a "Rewritten Query" field holding "What does it mean that Llama 3 has been trained on 15 trillion tokens?".

08:53

That means we're ready for the final step: feeding this rewritten query to the get_relevant_context function down in our ollama_chat function. The rewritten query is fed into get_relevant_context, and we skip the original user input altogether; the original user query is not fed into the context retrieval at all, only the rewritten query. So the rewritten query is passed to the get_relevant_context function, which retrieves relevant context from the knowledge vault based on it. And that is how I set this up. The original user query is not in consideration at all; we still print it, but that is just so we can compare the two side by side, just for fun I guess. So far I've been pretty happy with this, and I hope it was reasonably easy to understand how it works.
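Putting those seven steps together, a minimal sketch of such a rewrite function and its call site could look like the following. It assumes the ollama Python client and the build_rewrite_prompt helper sketched earlier; the function names, JSON keys, and history format are illustrative, not necessarily identical to the repo's.

```python
import json
import ollama  # pip install ollama; assumes a local Ollama server is running

def rewrite_query(user_input_json: str, conversation_history: list, model: str) -> str:
    user_input = json.loads(user_input_json)            # step 2: JSON -> dict
    original_query = user_input["Query"]                # step 3: extract the query
    # Step 4: build the prompt from the two previous messages.
    context = "\n".join(
        f"{msg['role']}: {msg['content']}" for msg in conversation_history[-2:]
    )
    prompt = build_rewrite_prompt(context, original_query)
    response = ollama.generate(model=model, prompt=prompt)  # step 5: call the model
    rewritten = response["response"].strip()                # step 6: extract the text
    return json.dumps({"Rewritten Query": rewritten})       # step 7: wrap in JSON

# In the chat loop (sketch): skip rewriting on the very first turn,
# then retrieve context with the rewritten query only.
def handle_turn(user_input, conversation_history, vault_embeddings, vault_content):
    if len(conversation_history) > 1:
        rewritten_json = rewrite_query(
            json.dumps({"Query": user_input}), conversation_history, model="llama3"
        )
        query_for_retrieval = json.loads(rewritten_json)["Rewritten Query"]
    else:
        query_for_retrieval = user_input
    # get_relevant_context is sketched further down.
    return get_relevant_context(query_for_retrieval, vault_embeddings, vault_content)
```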

10:22

10:22

It really helps to use JSON here, because that gives us a more deterministic output: we will always get this very structured form back. I tried not using JSON, but that was not a great success; you can try that if you want to, but for me this has been working pretty well. This code is an extension of my GitHub repo you can see on the screen here, the super easy, 100% local Ollama RAG. We made some updates: we are using the Dolphin Llama 3 model now, and we changed our embeddings model, so we are actually using an Ollama embeddings model, which has been working out pretty well. We have a few other updates too, like being able to pick our model from the terminal, plus fixes for some issues that were raised on GitHub. Of course this is just a starting layout, so you can do whatever you want with it; you can find the link in the description. I'll probably put this video up explaining all the updates to the code, and the code should be up now, so you can start playing around with it. I hope you enjoy it.
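As a reference for that embeddings setup, here is a minimal sketch of what a get_relevant_context-style retrieval function over Ollama embeddings can look like. The embeddings model name and the variable names are assumptions for illustration, not necessarily what the repo uses.

```python
import torch
import ollama

def get_relevant_context(query: str, vault_embeddings: torch.Tensor,
                         vault_content: list, top_k: int = 3) -> list:
    """Return the top_k vault chunks most similar to the (rewritten) query."""
    if vault_embeddings.nelement() == 0:
        return []
    # Embed the query with an Ollama embeddings model (name is illustrative).
    result = ollama.embeddings(model="mxbai-embed-large", prompt=query)
    query_tensor = torch.tensor(result["embedding"])
    # Cosine similarity between the query and every stored chunk embedding.
    scores = torch.cosine_similarity(query_tensor.unsqueeze(0), vault_embeddings)
    k = min(top_k, len(scores))
    best = torch.topk(scores, k=k).indices.tolist()
    return [vault_content[i].strip() for i in best]
```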

11:42

11:42

I wanted to finish this video with a local RAG version I created using Groq and the Llama 3 70B model. I was actually supposed to do a whole video today using Groq and Llama 3 70B, but I had so many issues with the rate limit that I had to skip it; that might be for Sunday, we will see. For now, let's just finish up by testing this with Groq and the Llama 70B. It is basically exactly the same setup, and I found that the rewritten queries were a bit better. Let's try the same questions. "How many tokens was this trained on?" You can see how fast it is, given the rate we're running at. Okay, now let's do "what does that mean?" and take a look at the rewritten query: "What does it mean that Llama 3 was trained on an enormous dataset, equivalent to around two billion books, 15 trillion tokens, and what is the impact on the model's abilities?" You can see this is a much better rewritten query; this is good.

12:50

So let's see the answer: "Here is a breakdown of what it means. 15 trillion tokens refers to the massive amount of data used... T stands for trillions... Tokens are individual units of text, such as words and characters... The model was trained on a huge dataset." Wow, this is good. We got all of this just by asking "what does this mean?", so you can see how good this rewritten query actually is, and of course the better the model we use, the better the answer we are going to get. "In summary, Llama 3 is a highly advanced language model trained on an enormous dataset, with a focus on simplicity, scalability, and high-quality data."

13:34

Let's try: "Wow, that's crazy. How many books must a human read to be this smart?" A deliberately bad question, and the rewritten query becomes: "What is the equivalent amount of human reading, in terms of number of books, that would be required to achieve the same level of understanding and knowledge as Llama 3, trained on 15 trillion tokens of data?" Again, a very good rewritten query if you ask me. And what an answer to put it at scale: to read 330,000 to 600,000 books, it would take around 16,500 to 30,000 years assuming one book per week, or around 15,000 years assuming two books per week. Of course this is a rough estimate and meant to be humorous, but this model is so good. So we would have to read around 600,000 books to be this smart. I think this shows how good the rewritten query is, and how good the 70B model is, so I'm really excited about Llama 3.

14:54

14:54

I hope you found this enjoyable and that you learned something from it; that's the most important thing, the result doesn't matter too much. Maybe it gives you some new ideas for how you can use embeddings to improve things, or how you can use the get_relevant_context function for other purposes. I guess a lot of you are wondering where I got the "better response" percentage from. What I did was take one response without the rewritten query and a second response with the rewritten query function, then ask GPT-4 to compare them. I did this a lot of times, and most of the time it rated response two, the one with the rewritten query, between 30 and 50% better than response one. I did the same with Claude Opus, and there it always landed at 30 to 40% better than response one. That is where I got the 37% from.

15:55

I just want to say a big thank-you for the support lately, it's been awesome, and give the GitHub repo a star if you enjoyed this. Other than that, come back on Sunday; I'm probably going to do more with Groq and Llama 3 70B if the rate limit is okay. Have a great week, and I'll see you again on Sunday.


Related Tags
AI Enhancement, Query Rewriting, Context Retrieval, Llama 3 Model, Ollama Chat, Data Analysis, Python Coding, Machine Learning, Natural Language, Brilliant.org, Groq and Llama