Run your own AI (but private)

NetworkChuck
12 Mar 2024 · 22:13

Summary

TLDR The video shows how to set up a private AI on your own computer, similar to ChatGPT but with a focus on privacy. Using AI models from the Hugging Face community and tools like Ollama, users can easily run all kinds of models locally. The video also explores private AI in the workplace, especially for companies that cannot use public AI for privacy and security reasons. VMware, the video's sponsor, shows how they enable companies to run private AI in their own data centers, with the necessary tools and hardware provided through partners such as NVIDIA. The video closes with an advanced tutorial on connecting a personal knowledge base to a private AI, plus a quiz that rewards the first five perfect scores with free coffee.

Takeaways

  • 🌟 Private AI means running AI models on your local machine, keeping your data private and secure.
  • 🚀 With a simple setup, you can have an AI model running on your own laptop in about five minutes.
  • 📚 Through huggingface.co, you can access and download a wide range of pre-trained AI models.
  • 💡 AI models such as LLMs (large language models) can be fine-tuned to fit specific datasets and use cases.
  • 🔧 Fine-tuning an AI model doesn't require the massive hardware of the original training, just a smaller dataset and far less compute.
  • 🛠️ VMware offers a complete private AI solution, including hardware, software, and tools, so enterprises can run and fine-tune AI models on-prem.
  • 🔗 With RAG (retrieval-augmented generation), an AI model can be connected to a database and consult it before answering, keeping its answers accurate.
  • 🌐 Private AI has broad applications: enterprise knowledge bases, customer service, product information, and more.
  • 💻 On Windows, Linux-based private AI projects can run via WSL (Windows Subsystem for Linux).
  • 🎁 The video ends with a quiz on private AI; the first five perfect scores win free coffee from Network Chuck Coffee.
  • 🔄 Private AI is about choice: companies and individuals can pick the AI solution and partners that fit them.

Q & A

  • What does the video cover?

    -The video shows how to set up and run a private AI model on your own computer, and how to combine personal or company data with that model to create a private AI assistant that can answer questions about your specific information.

  • How is private AI different from public AI models like ChatGPT?

    -A private AI model runs on your own computer, needs no internet connection, and shares no data with any outside company, keeping your data private and secure. A public model like ChatGPT runs on remote servers, where your data may be used for training or other purposes.

  • How do you set up a private AI on your own computer?

    -First, browse huggingface.co to find and download an AI model that suits you. Then use a tool called Ollama to install and run it. Windows users can do this by installing WSL (Windows Subsystem for Linux).

  • Why do some companies not allow employees to use public AI models?

    -Some companies bar public AI models for privacy and security reasons, since using them can expose sensitive data or information. If employees can run their own private AI instead, those problems go away.

  • What has VMware contributed to private AI?

    -VMware offers a solution called VMware Private AI that lets companies run private AI in their own data centers without depending on cloud services. It bundles the necessary hardware, software, and tools so companies can deploy and use private AI easily.

  • How do you combine personal notes or company documents with a private AI?

    -Through a technique called RAG (Retrieval-Augmented Generation): upload the notes or documents into a database, then have the private AI query that database before answering, so the information it gives is accurate and relevant.

  • How can private AI be used at work?

    -Private AI can help employees troubleshoot technical problems, search internal knowledge bases, debug code, and power customer service, all while keeping sensitive company data from leaking out.

  • Why is private AI the future of AI?

    -Private AI delivers more customized service while protecting the privacy and security of user data. As the technology matures, more individuals and companies will choose to run their own private AI so they can control their data and tailor the service to their needs.

  • How do you enter the quiz at the end of the video to win free coffee?

    -Find the link in the video description, sign in to your Network Chuck Academy account, and click the link to take the quiz. The first five participants to score 100% receive free coffee by email.

  • How many AI models does the video mention?

    -505,000, all available on huggingface.co.

  • Which company pre-trained the Llama model?

    -Llama was pre-trained by Meta (formerly Facebook).
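The RAG flow described in that answer can be sketched in a few lines of Python. This is a toy illustration with made-up notes and a naive keyword-overlap score, not how PrivateGPT actually retrieves (real systems use vector embeddings):

```python
# Toy sketch of RAG: score stored documents against a question by keyword
# overlap, then stuff the best match into the prompt as context.
# Illustrative only; production systems use embedding-based vector search.

def retrieve(question, docs, top_k=1):
    q_words = set(question.lower().split())
    ranked = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return ranked[:top_k]

def build_prompt(question, docs):
    context = "\n".join(retrieve(question, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

notes = [
    "The latest version of vSphere is vSphere 8 Update 2.",
    "Pugs are a small breed of dog with a wrinkly face.",
]
print(build_prompt("what is the latest version of vsphere", notes))
```

The LLM never gets retrained here; it just receives the retrieved text as part of its prompt, which is the whole point of RAG.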

Outlines

00:00

🤖 Introducing private AI and its advantages

This section introduces private AI: artificial intelligence that runs on your local computer, unlike ChatGPT, which depends on an internet connection. Its advantage is privacy and security, since all data stays local and is never shared with outside companies. The creator plans to show how quick and easy the setup is, and discusses how private AI can help people at work, especially employees of companies that forbid tools like ChatGPT for privacy and security reasons. VMware, the video's sponsor, supports the technology for running private AI in on-prem data centers.

05:01

🚀 How to install private AI

This section walks through the installation. First, the creator shows how to install WSL (Windows Subsystem for Linux) so Linux applications can run on Windows. Next comes installing a tool called Ollama with a curl command; Ollama can run many different large language models (LLMs), such as Llama 2. The creator stresses the value of an NVIDIA GPU, which dramatically speeds up AI models. Finally, a single command downloads and runs Llama 2, and a few test questions put it through its paces.

10:02

🧠 Where private AI is useful

This section covers real-world uses of private AI, especially inside companies: helping employees with their work, such as help-desk support or code troubleshooting. Through fine-tuning, a company can train the AI on its internal data and knowledge base so it serves the company's specific needs more precisely, protecting data privacy while tailoring the AI's capabilities to the business.

15:02

🛠️ Fine-tuning AI models

This section digs into fine-tuning: taking a pre-trained AI model and training it further on a specific dataset so it fits a particular task. Using VMware as the example, the creator covers what fine-tuning actually involves, including the hardware, tools, and libraries, and points out that it needs nowhere near the compute of pre-training, since only a small fraction of the model's parameters change. The section also introduces a technique called RAG, which lets the model query a database before answering so its information stays accurate.

20:04

🎉 Combining a private GPT with a knowledge base

This section shows a private GPT paired with a personal knowledge base. The creator uploads documents and has the private GPT understand and answer questions about them, turning it into a personal or corporate assistant that surfaces information fast. The section also highlights the convenience of VMware's private AI solution: a complete set of tools and services that makes deploying and using private AI much simpler and more efficient.

Keywords

💡Private AI

Private AI is artificial intelligence run inside an individual's or company's own environment, as opposed to AI offered as a public cloud service, with a focus on data privacy and security. The video shows that running private AI on a local computer keeps data out of outside companies' hands and protects user privacy.

💡Hugging Face

Hugging Face is a community platform for providing and sharing AI models, hosting a huge number of pre-trained models for users to choose from and use.

💡LLM (large language model)

An LLM is a large language model: a model pre-trained on massive amounts of data that can understand and generate natural-language text.

💡Super cluster

A super cluster is a network of many compute nodes used for heavy computational work. In AI, super clusters are typically used to train large models.

💡WSL (Windows Subsystem for Linux)

WSL is a Windows feature that lets users run a Linux environment directly on Windows.

💡Ollama

Ollama is a tool for installing and running different LLMs on a local computer.

💡GPU (graphics processing unit)

A GPU is hardware specialized for graphics and heavy parallel computation; it plays a major role in training and inference for AI models, above all in speeding them up.

💡Fine-tuning

Fine-tuning is additional training of a pre-trained AI model on a specific dataset to adapt it to a particular task or data.

💡RAG (Retrieval-Augmented Generation)

RAG combines information retrieval with text generation: the AI model queries a database before generating an answer, keeping its information accurate.

💡VMware Private AI

VMware Private AI is a VMware solution that lets enterprises deploy and run private AI models inside their own data centers.

Highlights

Introduces Private AI, a locally run artificial intelligence similar to ChatGPT but focused on privacy.

Private AI runs entirely on the local computer, with no internet connection required, keeping data private and secure.

With a simple setup process, users can deploy their own AI model in about five minutes.

Shows how to connect a personal knowledge base, notes, documents, and journal entries to a private GPT and ask it questions about your own material.

Discusses how private AI can help at work, especially in environments where public AI models are off-limits for privacy and security reasons.

VMware, the video's sponsor, shows how they enable companies to run private AI in their own data centers.

Introduces huggingface.co, a community for providing and sharing AI models, with over 505,000 of them.

Explains what an AI model is and how models are created and trained on pre-training datasets.

Covers Llama, a large language model (LLM) developed by Meta (Facebook), including its training process and data volume.

Introduces the Ollama tool, which lets users run many different LLMs on a local computer.

Discusses how to install and run private AI on Windows and Linux.

Shows how to download and run the Llama model with Ollama and interact with it live.

Discusses the performance difference when running AI models without GPU support.

Notes that through fine-tuning, an AI model can be trained to understand and respond with data specific to a company or individual.

VMware provides a complete solution, including the necessary hardware, servers, tools, and libraries, to make fine-tuning easy for companies.

Introduces RAG (Retrieval-Augmented Generation), which lets an LLM query a database before answering so it gives accurate information.

Companies like NVIDIA and Intel provide tools and support so users can customize and deploy their own LLMs.

Provides a bonus advanced tutorial on using Private GPT and RAG to combine personal documents and notes with an AI model.

Transcripts

00:00

I'm running something called private AI. It's kind of like ChatGPT,

00:03

except it's not. Everything about it is running right here on my computer.

00:07

Am I even connected to the internet?

00:08

This is private, contained, and my data isn't being shared with some random

00:12

company. So in this video I want to do two things. First,

00:15

I want to show you how to set this up.

00:16

It is ridiculously easy and fast to run your own AI on your laptop computer or

00:21

whatever. It's free, it's amazing.

00:23

It'll take you about five minutes and if you stick around until the end,

00:26

I want to show you something even crazier, a bit more advanced.

00:28

I'll show you how you can connect your knowledge base, your notes,

00:31

your documents,

00:32

your journal entries to your own private GPT and then ask it questions

00:37

about your stuff. And then second,

00:38

I want to talk about how private AI is helping us in the area we need help most:

00:42

Our jobs, you may not know this,

00:44

but not everyone can use ChatGPT or something like it at their job.

00:47

Their companies won't let them mainly because of privacy and security reasons,

00:51

but if they could run their own private AI, that's a different story.

00:54

That's a whole different ballgame, and VMware is a big reason this is possible.

00:58

They're the sponsor of this video and they're enabling some amazing things that

01:01

companies can do on-prem in their own data center to run their own AI.

01:05

And it's not just the cloud man, it's like in your data center.

01:07

The stuff they're doing is crazy. We're going to talk about it here in a bit,

01:10

but tell you what, go ahead and do this. There's a link in the description.

01:13

Just go ahead and open it and take a little glimpse at what they're doing.

01:16

We're going to dive deeper,

01:16

so just go ahead and have it open right in your second monitor or something or

01:20

on the side or minimize. I don't know what you're doing.

01:22

I dunno how many monitors you have. You have three actually, Bob,

01:25

I can see. Before we get started, I have to show you this.

01:27

You can run your own private AI that's kind of uncensored. Watch this,

01:34

So yeah, please don't do this to destroy me. Also,

01:37

make sure you're paying attention at the end of this video,

01:39

I'm doing a quiz and if you're one of the first five people to get a hundred

01:42

percent on this quiz, you're getting some free coffee. Network Chuck Coffee.

01:46

So take some notes, study up. Let's do this

01:51

now real quick, before we install a private local AI model on your computer,

01:55

what does it even mean? What's an AI model? At its core,

01:58

an AI model is simply an artificial intelligence pre-trained on data we

02:02

provided. One you may have heard of is OpenAI's ChatGPT,

02:05

but it's not the only one out there. Let's take a field trip.

02:08

We're going to go to a website called huggingface.co.

02:11

Just an incredible brand name. I love it so much.

02:14

This is an entire community dedicated to providing and sharing AI models and

02:18

there are a ton. You're about to have your mind blown. Ready?

02:21

I'm going to click on models up here. Do you see that number? 505,000 AI models.

02:26

Many of these are open and free for you to use and pre-trained,

02:30

which is kind of a crazy thing. Let me show you this.

02:32

We're going to search for a model named Llama 2,

02:35

one of the most popular models out there. We'll do Llama 2 7B. Again,

02:39

I love the branding.

02:40

Llama 2 is an AI model known as an LLM, or large language model.

02:45

OpenAI's ChatGPT is also an LLM. Now this LLM,

02:48

this pre-trained AI model, was made by Meta,

02:51

AKA Facebook, and what they did to pre-train

02:54

this model is kind of insane, and the fact that we're about to download this and

02:58

use it even crazier, check this out if you scroll down just a little bit,

03:01

here we go. Training data.

03:03

It was trained on over 2 trillion tokens of data from publicly available

03:07

sources. Instruction datasets: over a million human-annotated examples.

03:11

Data freshness: we're talking July 2023. I love that term.

03:15

Data freshness and getting the data was just step one.

03:18

Step two is insane because this is where the training happens.

03:21

Meta, to train this model, put together what's called a super cluster.

03:25

It already sounds cool, right? This sucker is over 6,000 GPUs.

03:29

It took 1.7 million GPU hours to train this model and it's estimated it

03:34

cost around $20 million to train it, and now Meta is just like,

03:39

here you go kid. Download this incredibly powerful thing.

03:43

I don't want to call it a being yet. I'm not ready for that,

03:46

but this intelligent source of information that you can just download on your

03:50

laptop and ask it questions,

03:51

no internet required and this is just one of the many models we could download.

03:55

They have special models like text to speech, image to image.

03:58

They even have uncensored ones. They have an uncensored version of Llama 2.

04:02

This guy George Sung,

04:04

took this model and fine tuned it with a pretty hefty GPU,

04:08

took him 19 hours and made it to where you could pretty much ask this thing.

04:11

Anything you wanted, whatever question comes to mind,

04:14

it's not going to hold back. Okay,

04:16

So how do we get this fine-tuned model onto your computer? Well,

04:19

actually I should warn you, this involves quite a bit of llamas,

04:22

more than you would expect. Our journey starts at a tool called Ollama.

04:26

Let's go ahead and take a field trip out there real quick.

04:28

We'll go to ollama.ai. All we'll have to do is install this little guy, Mr.

04:32

Ollama,

04:32

and then we can run a ton of different LLMs: Llama 2, Code Llama, told you, lots

04:37

of llamas, and there are others that are pretty fun, like Llama 2 Uncensored or

04:41

Llamas. Tdrl. I'll show you in a second. But first, what do we install Ollama on?

04:46

We can see right down here that we have it available on macOS and Linux,

04:49

but oh bummer, Windows coming soon.

04:52

It's okay because we've got WSL, the Windows Subsystem for Linux,

04:56

which is now really easy to set up.

04:58

So we'll go ahead and click on download right here. For macOS,

05:01

you'll just simply download this and install it like one of your regular

05:04

applications. For Linux, we'll click on this.

05:07

We get a fun curl command that we'll copy and paste. Now, because we're going to

05:09

install WSL on Windows, this will be the same step. So macOS folks,

05:15

go ahead and just run that installer. Linux and Windows folks, let's keep going.

05:19

Now, if you're on Windows,

05:20

all you have to do now to get WSL installed is launch your Windows terminal.

05:23

Just go to your search bar and search for terminal and with one command it'll

05:27

just happen. It used to be so much harder. The command is wsl --install.

05:32

It'll go through a few steps. It'll install Ubuntu as default.

05:35

I'll go ahead and let that do that. And boom, just like that.

05:39

I've got Ubuntu 22.04.3 LTS installed and I'm actually inside of it right

05:44

now. So now at this point, Linux and Windows folks, we converged.

05:47

We're on the same path. Let's install Ollama.

05:49

I'm going to copy that curl command that Ollama gave us,

05:52

jump back into my terminal, paste that in there and press enter.

05:55

Fingers crossed, everything should be great. Like the way it is right now,

05:59

it'll ask for my sudo password and that was it. Ollama is now installed.

06:04

Now this will directly apply to Linux people and Windows people.

06:07

See right here where it says Nvidia GPU installed. If you have that,

06:10

you're going to have a better time than other people who don't have that.

06:13

I'll show you here in a second. If you don't have it, that's fine.

06:15

We'll keep going. Now let's run an LLM. We'll start with Llama 2.

06:18

So we'll simply type in ollama run,

06:22

and then we'll pick one, llama2, and that's it. Ready,

06:26

set go. It's going to pull the manifest.

06:28

It'll then start pulling down and downloading Llama 2.

06:31

And I want you to just realize this: that powerful Llama 2 pre-training

06:34

we talked about all the money and hours spent. That's how big it is.

06:38

This is the 7 billion parameter model, or the 7B.

06:42

It's pretty powerful and we're about to literally have this in the palm of our

06:45

hands in like 3, 2, 1. Oh, I thought I had it. Anyways,

06:49

it's almost done. And boom, it's done.

06:52

We've got a nice success message right here and it's ready for us.

06:56

We can ask it anything. Let's try: what is a pug?
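(A side note for anyone scripting this: besides the interactive prompt, Ollama also serves a local REST API, by default on http://localhost:11434. The sketch below only builds the JSON body you would POST to its /api/generate endpoint; it never contacts a server, so it runs anywhere.)

```python
# Build the JSON body for Ollama's /api/generate endpoint. To actually use
# it, you would POST this to http://localhost:11434/api/generate with the
# Ollama server running and the model already pulled.
import json

def generate_request(model: str, prompt: str) -> str:
    body = {
        "model": model,      # e.g. "llama2", the model we just pulled
        "prompt": prompt,
        "stream": False,     # ask for one JSON response instead of a stream
    }
    return json.dumps(body)

payload = generate_request("llama2", "What is a pug?")
print(payload)
```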

06:59

Now the reason this is going so fast, just like a side note,

07:01

is that I'm running on a GPU, and AI models love GPUs.

07:05

So lemme just show you real quick.

07:06

I did install Ollama on a Linux virtual machine and I'll just demo the

07:10

performance for you real quick. By the way, if you're running a Mac with an M1,

07:13

M2 or M3 processor, it actually works great. I forgot to install it.

07:17

I got to install it real quick and I'll ask it that same question.

07:19

What is a pug? It's going to take a minute, it'll still work,

07:22

but it's going to be slower on CPUs and there it goes. It didn't take too long,

07:25

but notice it is a bit slower.

07:27

Now if you're running WSL and you know have an Nvidia GPU and it didn't show up,

07:31

I'll show you in a minute how you can get those drivers installed. But anyways,

07:34

just sit back for a minute,

07:35

sip your coffee and think about how powerful this is.

07:38

The tinfoil hat version of me stinking loves this because let's say

07:43

the zombie apocalypse happens, right? The grid goes down, things are crazy,

07:47

but as long as I have my laptop and a solar panel,

07:51

I still have AI and it can help me survive the zombie apocalypse.

07:55

Let's actually see how that would work. It gives me next steps.

07:58

I could have it help me with the water filtration system. This is just cool,

08:01

right? It's amazing. But can I show you something funny?

08:04

You may have caught this earlier. Who is Network Chuck?

08:09

What? Dude, I've always wanted to be Rick Grimes.

08:14

That is so fun, but seriously, it kind of hallucinated there.

08:17

It didn't have the correct information.

08:19

It's so funny how it mixed the zombie apocalypse prompt with me.

08:23

I love that so much. Let's try a different model. I'll say bye.

08:27

I'll try a really fun one called Mistral. And by the way,

08:30

if you want to know which ones you can run with Ollama, which LLMs,

08:33

they get a page for their models right here and all the ones you can run,

08:36

including Llama 2 Uncensored and WizardMath.

08:39

I might give that to my kids actually. Let's see what it says.

08:41

Now who is Network Chuck?

08:45

Now my name is not Chuck Davis and my YouTube channel is not called Network

08:50

Chuck on Tech.

08:50

So clearly the data this thing was trained on is either not up to date or just

08:54

plain wrong. So now the question is cool,

08:57

we've got this local private AI, this LLM, that's super powerful,

09:02

but how do we teach it the correct information for us?

09:05

How can I teach it to know that I'm Network Chuck, Chuck Keith, not Chuck Davis,

09:08

and my channel is called Network Chuck.

09:09

Or maybe I'm a business and I want it to know more than just what's publicly

09:13

available because sure, right now if you downloaded this LLM,

09:16

you could probably use it in your job,

09:17

but you can only go so far without it knowing more about your job. For example,

09:22

maybe you're on a help desk.

09:23

Imagine if you could take your help desk's knowledge base, your IT procedures,

09:27

your documentation. Not only that,

09:29

but maybe you have a database of closed tickets, open tickets.

09:31

If you could take all that data and feed it to this LLM and then ask it

09:35

questions about all of that, that would be crazy.

09:38

Or maybe you wanted to help troubleshoot code that your company's written.

09:41

You could even make this LLM public-facing for your customers.

09:44

You feed information about your product and the customer could interact with

09:47

that chat bot you make.

09:49

This is all possible with a process called fine-tuning, where we can train

09:53

this AI on our own proprietary secret private stuff about our

09:58

company or maybe our lives or whatever you want to use it for,

10:00

whatever your use case is,

10:01

and this is fantastic because maybe before you couldn't use a public LLM because

10:05

you weren't allowed to share your company's data with that LLM,

10:08

whether it's compliance reasons or you just simply didn't want to share that

10:10

data because it's secret. Whatever the case,

10:12

it's possible now because this AI is private,

10:15

it's local and whatever data you feed to it,

10:18

it's going to stay right there in the company. It's not going out the door.

10:20

That idea just makes me so excited because I think it is the future of AI and

10:24

how companies and individuals will approach it. It's going to be more private.

10:28

Back to our question though, fine tuning, that sounds cool.

10:31

Training an AI on your own data, but how does that work?

10:34

Because as we saw before with Meta pre-training a model,

10:38

it took them 6,000 GPUs over 1.7 million GPU hours.

10:42

Do we have to have this massive data center to make this happen? No.

10:46

Check this out, and this is such a fun example. VMware, they asked ChatGPT,

10:50

what's the latest version of VMware vSphere?

10:52

Now, the latest ChatGPT knew about was vSphere 7.0,

10:55

but that wasn't helpful to VMware, because the latest version they were working

10:58

on hadn't been released yet,

10:59

so it wasn't public knowledge: vSphere 8 Update 2.

11:02

And they wanted information like this internal information not yet released to

11:06

the public.

11:07

They wanted this to be available to their internal team so they could ask

11:10

something like ChatGPT, hey, what's the latest version of vSphere?

11:14

And they could answer correctly.

11:15

So to do what VMware is trying to do to fine tune a model or train it on new

11:19

data, it does require a lot. First of all,

11:22

you would need some hardware servers with GPUs.

11:24

Then you would also need a bunch of tools and libraries and SDKs like PyTorch

11:29

and TensorFlow, pandas, NumPy, scikit-learn, transformers and fastai.

11:33

The list goes on.

11:34

You need lots of tools and resources in order to fine tune an LLM.

11:37

That's why I'm a massive fan of what VMware is doing right here.

11:40

They have something called VMware Private AI with NVIDIA.

11:44

the gajillion things I just listed off. They include in one package,

11:49

one combo meal, a recipe of AI fine-tuning goodness.

11:53

So as a company it becomes a bit easier to do this stuff yourself locally.

11:57

For the system engineer you have on staff who knows VMware and loves it,

12:00

they could do this stuff,

12:01

they could implement this and the data scientists they have on staff that will

12:04

actually do some of the fine tuning, all the tools are right there.

12:07

So here's what it looks like to fine tune and we're going to kind of peek behind

12:10

the curtain at what a data scientist actually does.

12:12

So first we have the infrastructure and we start here in vSphere, VMware.

12:17

Now if you don't know what vSphere is or VMware, think virtual machines,

12:20

you got one big physical server. The hardware, the stuff you can feel,

12:23

touch and smell. If you haven't smelled a server, I dunno what you're doing.

12:26

And instead of installing one operating system on them like Windows or Linux,

12:29

you install VMware's ESXi,

12:31

which will then allow you to virtualize or create a bunch of additional virtual

12:35

computers. So instead of one computer,

12:37

you've got a bunch of computers all using the same hardware resources.

12:40

And that's what we have right here. One of those virtual computers,

12:43

a virtual machine.

12:44

This by the way is one of their special deep learning VMs that has all the tools

12:49

I mentioned and many, many more pre-installed, ready to go.

12:53

Everything a data scientist could love.

12:55

It's kind of like a surgeon walking in to do some surgery and like their doctor

12:59

assistants or whatever have prepared all their tools.

13:01

It's all in the tray laid out nice and neat to the surgeon.

13:04

All he has to do is walk in and just go scalpel.

13:08

That's what we're doing here for the data scientist.

13:10

Now talking more about hardware,

13:11

this guy has a couple Nvidia GPUs assigned to it or pass through to it through

13:16

a technology called PCIe passthrough. These are some beefy GPUs.

13:20

Notice they are vGPU, for virtual GPU, similar to what you do with the CPU,

13:25

cutting up the CPU and assigning some of that to a virtual CPU on a virtual

13:29

machine. So here we are in the data scientist's world. This is a Jupyter notebook,

13:33

a common tool used by a data scientist,

13:35

and what you're going to see here is a lot of code that they're using to prepare

13:37

the data,

13:38

specifically the data that they're going to train or fine tune the existing

13:42

model on. Now we're not going to dive deep on that,

13:44

but I do want you to see this, check this out.

13:45

A lot of this code is all about getting the data ready. So in VMware's case,

13:48

it might be a bunch of the knowledge base product documentation and they're

13:51

getting it ready to be fed to the LLM. And here's what I wanted you to see.

13:55

Here's the dataset that we're training this model on. We're fine tuning.

13:59

We only have 9,800 examples that we're giving it or 9,800 new prompts or

14:04

pieces of data. And that data might look like this,

14:06

like a simple question or a prompt and then we feed it the correct answer and

14:11

that's how we essentially train AI. But again,

14:14

we're only giving it 9,800 examples,

14:16

which is not a lot at all and is extremely small compared to how the

14:20

model was originally trained.
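To make that concrete, instruction-tuning examples are commonly stored as prompt/answer pairs, one JSON record per line (JSONL). The records below are invented stand-ins, not VMware's actual dataset:

```python
# One common shape for fine-tuning data: JSONL, one JSON record per line,
# each pairing a prompt with the answer we want the model to learn.
# These records are made-up examples, not VMware's real training data.
import json

examples = [
    {"prompt": "What is the latest version of VMware vSphere?",
     "answer": "The latest version is vSphere 8 Update 2."},
    {"prompt": "Who is Network Chuck?",
     "answer": "Network Chuck is Chuck Keith, a tech YouTuber."},
]

jsonl = "\n".join(json.dumps(rec) for rec in examples)
print(jsonl)

# A training script would read it back line by line:
records = [json.loads(line) for line in jsonl.splitlines()]
```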

14:22

And I point that out to say that we're not going to need a ton of hardware or a

14:25

ton of resources to fine tune this model.

14:28

We won't need the 6,000 GPUs Meta needed to originally create this model.

14:32

We're just adding to it,

14:33

changing some things or fine tuning it to what our use case is and looking at

14:37

what actually will be changed when we run this and we train it,

14:41

we're only changing 65 million parameters, which sounds like a lot, right?

14:46

But not in the grand scheme of things of like a 7 billion parameter model.

14:49

We're only changing 0.93% of the model.
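That percentage checks out:

```python
# Checking the math from the video: tuning 65 million parameters of a
# 7-billion-parameter model touches less than one percent of it.
tuned_params = 65_000_000
total_params = 7_000_000_000
fraction = tuned_params / total_params * 100
print(f"{fraction:.2f}% of the model")  # prints "0.93% of the model"
```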

14:52

And then we can actually run our fine tuning,

14:54

which this is a specific technique in fine tuning called prompt tuning where we

14:58

simply feed it additional prompts with answers to change how it'll react to

15:02

people asking it questions.

15:03

This process will take three to four minutes to fine tune it because again,

15:06

we're not changing a lot and that is just so super powerful and I think VMware

15:10

is leading the charge with private AI.

15:12

VMware and Nvidia take all the guesswork out of getting things set up to fine

15:17

tune an LLM. They've got deep learning VMs,

15:19

which are insane VMs that come pre-installed with everything you could want

15:23

everything a data scientist would need to fine-tune an LLM.

15:26

Then NVIDIA has an entire suite of tools centered around their GPUs,

15:29

taking advantage of some really exciting things to help you fine-tune your LLMs.

15:33

Now there's one thing I didn't talk about because I wanted to save it for last.

15:36

For right now it's this right here, this vector database,

15:39

PostgreSQL box here.

15:42

This is something called RAG and it's what we're about to do with our own

15:46

personal GPT here in a bit: retrieval-augmented generation. So, scenario:

15:51

let's say you have a database of product information, internal docs,

15:54

whatever it is, and you haven't fine tuned your LLM on this just yet.

15:58

So it doesn't know about it. You don't have to do that with RAG.

16:01

You can connect your LLM to this database of information,

16:05

this knowledge base and give it these instructions.

16:08

Say whenever I ask you a question about any of the things in this database,

16:11

before you answer, consult the database,

16:13

go look at it and make sure what you're saying is accurate.

16:16

We're not retraining the LLM, we're just saying, Hey, before you answer,

16:20

go check real quick in this database to make sure it's accurate to make sure you

16:23

got your stuff right. Isn't that cool? So yes,

16:25

fine tuning is cool and training an LLM on your own data is awesome,

16:29

but in between those moments of fine tuning,

16:31

you can have RAG set up where it can consult your database,

16:34

your internal documentation and give correct answers based on what you have in

16:38

that database. That is so stinking cool.

16:40

So with VMware Private AI Foundation with NVIDIA,

16:43

they have those tools baked right in to where it just kind of works for what

16:47

would otherwise be a very complex setup. And by the way, this whole RAG thing,

16:51

like I said earlier, we're about to do this,

16:53

I actually connected a lot of my notes and journal entries to a private GPT

16:58

using RAG and I was able to talk with it about me asking it about my

17:03

journal entries and answering questions about my past. That's so powerful. Now,

17:07

before we move on,

17:08

I just want to highlight the fact that NVIDIA, with their NVIDIA AI Enterprise,

17:12

gives you some amazing and fantastic tools to pull the LLM of your choice and

17:17

then fine tune and customize and deploy that LLM. It's all built in right here.

17:21

So VMware Cloud Foundation,

17:22

they provide the robust infrastructure and NVIDIA provides all the amazing AI

17:26

tools you need to develop and deploy these custom LLMs.

17:29

Now it's not just Nvidia, they're partnering with Intel as well.

17:31

So VMware is covering all the tools that admins care about.

17:34

And then for the data scientists, this is for you.

17:36

Intel's got your back: data analytics,

17:38

generative AI and deep learning tools and some classic ML or machine learning.

17:42

And they're also working with IBM, all you IBM fans. You can do this too. Again,

17:46

VMware has the admin's back. But for the data scientist, Watson,

17:49

one of the first AI things I ever heard about. Red Hat and OpenShift,

17:52

and I love this because what VMware is doing is all about choice.

17:55

If you want to run your own local private AI, you can.

17:58

You're not just stuck with one of the big guys out there and you can choose to

18:00

run it with Nvidia and VMware, Intel and VMware, IBM and VMware.

18:04

You got options. So there's nothing stopping you.

18:06

Now it's time for the bonus section of this video, and that's how to run your

18:09

own private GPT with your own knowledge base. Now, fair warning,

18:14

it is a bit more advanced, but if you stick with me,

18:16

you should be able to get this up and running. So take one more sip of coffee.

18:20

Let's get this going. Now, first of all, this will not be using Ollama.

18:23

This will be a separate project called PrivateGPT. Now, disclaimer,

18:26

this is kind of hard to do. Unlike VMware Private AI,

18:29

which they do it all for you,

18:30

it's a complete solution for companies to run their own private local AI.

18:34

What I'm about to show you is not that at all. No affiliation with VMware.

18:37

It's a free side project.

18:39

You can try just to get a little taste of what running your own private GPT with

18:44

RAG tastes like. Did I do that right? I don't know.

18:47

Now, Iván Martínez has a great doc on how to install this. It's a lot,

18:51

but you can do it. And if you just want a quick start,

18:53

he does have a few lines of code for Linux and Mac users. Fair warning,

18:57

this is CPU only. You can't really take advantage of RAG without a GPU,

19:00

which is what I wanted to do. So here's my very specific scenario.

19:03

I've got a Windows PC with an NVIDIA 4090. How do I run this

19:06

Linux-based project? WSL. And I'm so thankful to this guy, Emilien Lancelot.

19:11

He put an entire guide together of how to set this up.

19:14

I'm not going to walk you through every step because he already did that link

19:17

below, but I seriously need to buy this guy a coffee. How do I do that?

19:20

I don't know, Emil, if you're watching this, reach out to me.

19:22

I'll send you some coffee. So anyways,

19:24

I went through every step from installing all the prereqs to installing NVIDIA

19:27

drivers and using Poetry to handle dependencies. Poetry is pretty cool.

19:31

I landed here.

19:32

I've got a local, working PrivateGPT that I can access through my web

19:36

browser and it's using my GPU, which is pretty cool. Now,

19:38

first I try a simple document upload,

19:40

got this VMware article that details a lot of what we talked about in this

19:43

video. I upload it and I start asking it questions about this article.

19:46

I tried something specific like show me something about VMware AI market growth.

19:50

Bam, it figured it out, it told me. Then I'm like,

19:52

what's the coolest thing about VMware Private AI?

19:55

It told me I'm sitting here chatting with a document, but then I'm like,

19:58

let's try something bigger. I want to chat with my journals.

20:00

I've got a ton of journals in markdown format and I want to ask it questions

20:03

about me. Now this specific step is not covered in the article.

20:06

So here's how you do it. First,

20:07

you'll want to grab your folder of whatever documents you want to ask questions

20:10

about and throw it onto your machine.

20:12

So I copied it over to my WSL machine and then I ingested it with this command. Once

20:16

complete, I ran PrivateGPT again.

20:18

here's all my documents and I'm ready to ask it questions.

20:21

So let's test this out. I'm going to ask it: what did I do in Takayama?

20:26

So I went to Japan in November of 2023. Let's see if it can search my notes,

20:31

figure out when that was and what I did.

20:36

That's awesome. Oh my goodness.

20:41

Let's see, what did I eat in Tokyo?

20:45

How cool is that? Oh my gosh, that's so fun. No, it's not perfect,

20:49

but I can see the potential here. That's insane. I love this so much.

20:53

Private AI is the future and that's why we're seeing VMware bring products like

20:57

this to companies to run their own private local AI and then make it pretty

21:01

easy. If you actually did that private GPT thing, that little side project,

21:04

there's a lot to it. Lots of tools you have to install, it's kind of a pain.

21:07

But with VMware,

21:08

they kind of cover everything like that deep learning VM they offer as part of

21:11

their solution. It's got all the tools ready to go. Pre-baked again,

21:15

you're like a surgeon just walking in saying scalpel.

21:17

You got all this stuff right there. So if you want to bring AI to your company,

21:20

check out VMware private AI link below and thank you to VMware by Broadcom for

21:24

sponsoring this video. You made it to the end of the video. Time for a quiz.

21:28

This quiz will test the knowledge you've gained in this video and the first five

21:32

people to get a hundred percent on this quiz will get free coffee from Network

21:36

Chuck Coffee. So here's how you take the quiz right now.

21:38

Check the description in your video and click on this link.

21:41

If you're not currently signed into the academy, go ahead and get signed in.

21:43

If you're not a member, go ahead and click on sign up. It's free.

21:47

Once you're signed in,

21:48

it will take you to your dashboard showing you all the stuff you have access to

21:51

with your free academy account. But to get right back to that quiz,

21:54

go back to the YouTube video,

21:55

click on that link once more and it should take you right to it.

21:58

Go ahead and click on start now and start your quiz. Here's a little preview.

22:03

That's it. The first five to get a hundred percent free coffee.

22:06

If you're one of the five,

22:06

you'll know because you'll receive an email with free coffee.

22:09

You got to be quick, you got to be smart. I'll see you guys in the next video.