Run your own AI (but private)
Summary
TLDR: The video shows how to set up a private AI on your own computer, similar to ChatGPT but focused on privacy. Using AI models from the Hugging Face community and tools like Ollama, you can easily run all kinds of AI models locally. The video also explores how private AI applies in the workplace, especially for companies whose privacy and security requirements rule out public AI. VMware, the video's sponsor, shows how they enable companies to run private AI in their own data centers, with the necessary tools and hardware supplied through partners like NVIDIA. The video closes with an advanced tutorial on connecting a personal knowledge base to a private AI, plus a quiz that rewards the first five perfect scores with free coffee.
Takeaways
- 🌟 Private AI means running AI models on your own computer, keeping your data private and secure.
- 🚀 With a simple setup, you can have an AI model running on your own laptop in about five minutes.
- 📚 Through huggingface.co, you can browse and download a huge range of pre-trained AI models.
- 💡 AI models such as LLMs (large language models) can be fine-tuned to fit specific datasets and use cases.
- 🔧 Fine-tuning an AI model doesn't require the massive hardware of the original training run; it needs far less data and compute.
- 🛠️ VMware offers a complete private AI solution, including the hardware, software, and tools companies need to run and fine-tune AI models on-premises.
- 🔗 With RAG (Retrieval-Augmented Generation), an AI model can be connected to a database and consult it before answering, so its answers stay accurate.
- 🌐 Private AI has broad applications: company knowledge bases, customer service, product information, and more.
- 💻 On Windows, Linux-based private AI projects can run through WSL (the Windows Subsystem for Linux).
- 🎁 The video ends with a quiz on private AI; the first five perfect scores win free coffee from Network Chuck Coffee.
- 🔄 Private AI is about choice: companies and individuals can pick the AI solutions and partners that fit them best.
Q & A
What is the video mainly about?
- The video shows how to set up and run a private AI model on your own computer, and how to combine personal or company data with that model to create a private AI assistant that can answer questions about your specific information.
How does private AI differ from a public AI model like ChatGPT?
- A private AI model runs on your own computer, doesn't depend on an internet connection, and never shares your data with an outside company, which keeps the data private and secure. A public model like ChatGPT runs on remote servers, where your data may be used for training or other purposes.
How do you set up a private AI on your own computer?
- First, browse huggingface.co to find and download a suitable AI model. Then use a tool called Ollama to install and run it. Windows users can get there by installing WSL (the Windows Subsystem for Linux). A minimal sketch of those steps appears below.
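A minimal sketch of those steps, assuming Windows with WSL; the install one-liner is the script ollama.ai points to, so check the site for the current command:

```bash
# In Windows Terminal: install WSL with Ubuntu as the default distro.
wsl --install
# Then, in the Ubuntu shell: install Ollama and run Llama 2.
curl https://ollama.ai/install.sh | sh
ollama run llama2
```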
Why do some companies not allow employees to use public AI models?
- Some companies ban public AI models for privacy and security reasons, since using them can mean exposing sensitive data or information. If employees can run their own private AI instead, those problems go away.
What has VMware contributed to private AI?
- VMware offers a solution called VMware Private AI, which lets companies run private AI in their own data centers rather than relying on cloud services. It includes the necessary hardware, software, and tools, so companies can deploy and use private AI easily.
How do you combine personal notes or company documents with a private AI?
- Through a technique called RAG (Retrieval-Augmented Generation): upload the notes or documents into a database, and have the private AI query that database before answering a question, so the information it provides is accurate and relevant.
What are the workplace applications of private AI?
- In the workplace, private AI can help employees troubleshoot technical problems, answer queries against an internal knowledge base, assist with code debugging, and support customer service, all while keeping the company's sensitive data from leaking out.
Why is private AI called the future direction of AI?
- Private AI provides more customized service while protecting the privacy and security of user data. As the technology develops, more and more individuals and companies will choose to run their own private AI, to keep control of their data and get service that fits their needs.
How do you enter the quiz at the end of the video to win free coffee?
- Find the link in the video description, sign in to a Network Chuck Coffee academy account, and click through to the quiz. The first five participants with a perfect score receive free coffee by email.
How many AI models does the video mention?
- The video mentions 505,000 AI models, all of which can be found on huggingface.co.
Which company pre-trained the Llama model?
- The Llama model was pre-trained by Meta (formerly Facebook).
Outlines
🤖 Introducing private AI and its advantages
This section introduces the concept of private AI: artificial intelligence that runs on a local computer, unlike ChatGPT, which depends on an internet connection. Its advantages are privacy and security, since all data stays local and is never shared with outside companies. The presenter plans to show how quick and easy the setup is, and discusses how private AI can help people at work, especially employees of companies that can't use tools like ChatGPT for privacy and security reasons. VMware, the video's sponsor, supports running private AI in on-premises data centers.
🚀 How to install a private AI
This section walks through the installation process. First, the presenter shows how to install WSL (the Windows Subsystem for Linux) so Linux applications can run on Windows. Next comes a curl command that installs a tool called Ollama, which can run many different large language models (LLMs), such as Llama 2. The presenter stresses the value of having an NVIDIA GPU, which dramatically speeds up the models. Finally, a simple command downloads and runs the Llama 2 model, which is then tested with a few questions.
🧠 Use cases for private AI
This section covers practical applications of private AI, particularly in a company setting. Private AI can help employees solve problems at work, such as help-desk support or code troubleshooting. Through fine-tuning, a company can train the AI on its internal data and knowledge base so it serves the company's specific needs more precisely. This approach protects the company's data privacy while tailoring the AI to its actual situation.
🛠️ Fine-tuning AI models
This section digs into the fine-tuning process: taking an already pre-trained model and training it further on a specific dataset to fit a specific task. The presenter uses a VMware example to show what fine-tuning actually involves, including the hardware, tools, and libraries required. Notably, fine-tuning doesn't need anything like the compute of pre-training; only a small fraction of the model's parameters change. The section also introduces a technique called RAG, which lets an AI model query a database before answering, to keep its information accurate.
🎉 Combining PrivateGPT with a knowledge base
This section shows how to pair a private GPT with a personal knowledge base. The presenter uploads documents and has the private GPT understand and answer questions about them, turning it into a personal or corporate assistant that surfaces information quickly. The section also highlights the convenience of VMware's private AI solution, which gives companies a complete set of tools and services that make deploying and using private AI far simpler.
Keywords
💡 Private AI
💡 Hugging Face
💡 LLM (Large Language Model)
💡 Supercomputing cluster
💡 WSL (Windows Subsystem for Linux)
💡 Ollama
💡 GPU (Graphics Processing Unit)
💡 Fine-tuning
💡 RAG (Retrieval-Augmented Generation)
💡 VMware Private AI
Highlights
Introduces Private AI, a locally run AI setup similar to ChatGPT but focused on privacy.
Private AI runs entirely on a local computer with no internet dependency, keeping data private and secure.
With a simple setup process, users can deploy their own AI model in about five minutes.
Shows how to connect a personal knowledge base, notes, documents, and journal entries to a private GPT so you can ask questions about your own material.
Discusses how private AI helps at work, especially in environments where privacy and security rules block public AI models.
VMware, the video's sponsor, shows how they enable companies to run private AI in their own data centers.
Introduces huggingface.co, a community for providing and sharing AI models, hosting more than 505,000 of them.
Explains what an AI model is and how models are created and trained on pre-training datasets.
Covers the Llama model, a large language model (LLM) built by Meta (Facebook), including its training process and data volume.
Introduces the Ollama tool, which lets users run many different LLMs on a local computer.
Covers installing and running private AI on Windows and Linux operating systems.
Demonstrates downloading and running the Llama model through Ollama and interacting with it live.
Discusses the performance difference when running AI models without GPU support.
Notes that fine-tuning can train an AI model to understand and respond to data specific to a company or individual.
VMware provides a complete solution, with the necessary hardware, servers, tools, and libraries companies need to fine-tune AI.
Introduces RAG (Retrieval-Augmented Generation), which lets an LLM query a database before answering so its information is accurate.
Companies like NVIDIA and Intel supply the tools and support to customize and deploy your own LLM.
Offers a bonus advanced tutorial on using a private GPT with RAG to combine personal documents and notes with an AI model.
Transcripts
I'm running something called private AI. It's kind of like ChatGPT,
except it's not. Everything about it is running right here on my computer.
Am I even connected to the internet?
This is private, contained, and my data isn't being shared with some random
company. So in this video I want to do two things. First,
I want to show you how to set this up.
It is ridiculously easy and fast to run your own AI on your laptop computer or
whatever. This is free, it's amazing.
It'll take you about five minutes and if you stick around until the end,
I want to show you something even crazier, a bit more advanced.
I'll show you how you can connect your knowledge base, your notes,
your documents,
your journal entries to your own private GPT and then ask it questions
about your stuff. And then second,
I want to talk about how private AI is helping us in the area we need help most: our jobs. You may not know this, but not everyone can use ChatGPT or something like it at their job.
Their companies won't let them mainly because of privacy and security reasons,
but if they could run their own private AI, that's a different story.
That's a whole different ballgame, and VMware is a big reason this is possible.
They're the sponsor of this video and they're enabling some amazing things that
companies can do on-prem, in their own data center, to run their own AI.
And it's not just the cloud, man. It's in your data center.
The stuff they're doing is crazy. We're going to talk about it here in a bit,
but tell you what, go ahead and do this. There's a link in the description.
Just go ahead and open it and take a little glimpse at what they're doing.
We're going to dive deeper,
so just go ahead and have it open right in your second monitor or something or
on the side or minimize. I don't know what you're doing.
I don't know how many monitors you have. You have three, actually. Bob, I can see. Before we get started, I have to show you this. You can run your own private AI that's kind of uncensored. Watch this. So yeah, please don't use this to destroy me. Also,
make sure you're paying attention at the end of this video,
I'm doing a quiz, and if you're one of the first five people to get a hundred
percent on it, you're getting some free coffee. Network Chuck Coffee.
So take some notes, study up. Let's do this
now real quick, before we install a private local AI model on your computer,
what does it even mean? What's an AI model? At its core,
an AI model is simply an artificial intelligence pre-trained on data we
provided. One you may have heard of is OpenAI's ChatGPT,
but it's not the only one out there. Let's take a field trip.
We're going to go to a website called huggingface.co.
Just an incredible brand name. I love it so much.
This is an entire community dedicated to providing and sharing AI models and
there are a ton. You're about to have your mind blown. Ready?
I'm going to click on models up here. Do you see that number? 505,000 AI models.
Many of these are open and free for you to use and pre-trained,
which is kind of a crazy thing. Let me show you this.
We're going to search for a model named Llama 2, one of the most popular models out there. We'll do Llama 2 7B. Again, I love the branding. Llama 2 is an AI model known as an LLM, or large language model; OpenAI's ChatGPT is also an LLM. Now this LLM, this pre-trained AI model, was made by Meta, AKA Facebook, and what they did to pre-train this model is kind of insane. The fact that we're about to download it and use it? Even crazier. Check this out: if you scroll down just a little bit, here we go. Training data. It was trained on over 2 trillion tokens of data from publicly available sources. Instruction datasets: over a million human-annotated examples. Data freshness: we're talking July 2023. I love that term, data freshness. And getting the data was just step one.
Step two is insane because this is where the training happens.
Meta, to train this model, put together what's called a super cluster.
It already sounds cool, right? This sucker is over 6,000 GPUs.
It took 1.7 million GPU hours to train this model, and it's estimated it
cost around $20 million. And now Meta is just like,
here you go, kid. Download this incredibly powerful thing.
I don't want to call it a being yet. I'm not ready for that,
but this intelligent source of information that you can just download on your
laptop and ask it questions,
no internet required and this is just one of the many models we could download.
They have special models like text to speech, image to image.
They even have uncensored ones. They have an uncensored version of Llama 2.
This guy, George Sung, took this model and fine-tuned it with a pretty hefty GPU.
It took him 19 hours, and he made it to where you can pretty much ask this thing
anything you want, whatever question comes to mind; it's not going to hold back.
Okay, so how do we get this fine-tuned model onto your computer? Well,
actually, I should warn you: this involves quite a few llamas,
more than you would expect. Our journey starts at a tool called Ollama.
Let's go ahead and take a field trip out there real quick.
We'll go to ollama.ai. All we have to do is install this little guy, Mr.
Ollama, and then we can run a ton of different LLMs: Llama 2, Code Llama
(told you, lots of llamas), and others that are pretty fun, like Llama 2
Uncensored. I'll show you in a second. But first, what do we install Ollama on?
We can see right down here that we have it available on macOS and Linux,
but oh, bummer: Windows coming soon.
It's okay, because we've got WSL, the Windows Subsystem for Linux,
which is now really easy to set up.
So we'll go ahead and click on download right here. For macOS,
you'll simply download this and install it like one of your regular
applications. For Linux, we'll click on this:
we get a fun curl command that we'll copy and paste. Now, because we're going to
install WSL on Windows, this will be the same step. So macOS folks,
go ahead and just run that installer. Linux and Windows folks, let's keep going.
Now, if you're on Windows,
all you have to do to get WSL installed is launch your Windows Terminal.
Just go to your search bar, search for terminal, and with one command it'll
just happen. It used to be so much harder. That command is wsl --install.
It'll go through a few steps and install Ubuntu as the default.
I'll go ahead and let it do that. And boom, just like that,
I've got Ubuntu 22.04.3 LTS installed, and I'm actually inside of it right
now. So at this point, Linux and Windows folks, we converge.
We're on the same path. Let's install Ollama.
I'm going to copy that curl command that Ollama gave us,
jump back into my terminal, paste it in there, and press enter.
Fingers crossed, everything should be great.
It'll ask for my sudo password, and that was it: Ollama is now installed.
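For reference, the Linux one-liner shown on ollama.ai looked like this at the time (check the site for the current command):

```bash
# Downloads and runs Ollama's official install script; it will ask for sudo.
curl https://ollama.ai/install.sh | sh
```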
Now this will directly apply to both Linux people and Windows people.
See right here where it says NVIDIA GPU installed? If you have that,
you're going to have a better time than other people who don't.
I'll show you here in a second. If you don't have it, that's fine.
We'll keep going. Now let's run an LLM. We'll start with Llama 2.
So we'll simply type in ollama run, and then we'll pick one, llama2,
and that's it. Ready, set, go. It's going to pull the manifest.
It'll then start pulling down and downloading Llama 2.
And I want you to just realize this: that powerful Llama 2 pre-training
we talked about, all the money and hours spent, that's how big it is.
This is the 7 billion parameter model, or the 7B.
It's pretty powerful, and we're about to literally have it in the palm of our
hands in, like, 3, 2, 1... oh, I thought I had it. Anyways,
it's almost done. And boom, it's done.
We've got a nice success message right here, and it's ready for us.
We can ask it anything. Let's try: what is a pug?
Now, the reason this is going so fast, just as a side note,
is that I'm running a GPU, and AI models love GPUs.
So let me just show you real quick.
I did install Ollama on a Linux virtual machine, and I'll demo the
performance for you real quick. By the way, if you're running a Mac with an M1,
M2, or M3 processor, it actually works great. I forgot to install it;
let me install it real quick, and I'll ask it that same question:
what is a pug? It's going to take a minute. It'll still work,
but it's going to be slower on CPUs. And there it goes. It didn't take too long,
but notice it is a bit slower.
Now, if you're running WSL and you know you have an NVIDIA GPU and it didn't show up,
I'll show you in a minute how to get those drivers installed. But anyways,
just sit back for a minute,
sip your coffee, and think about how powerful this is.
The tinfoil-hat version of me stinking loves this, because let's say
the zombie apocalypse happens, right? The grid goes down, things are crazy,
but as long as I have my laptop and a solar panel,
I still have AI, and it can help me survive the zombie apocalypse.
Let's actually see how that would work. It gives me next steps.
I could have it help me with a water filtration system. This is just cool,
right? It's amazing. But can I show you something funny?
You may have caught this earlier. Who is Network Chuck?
What? Dude, I've always wanted to be Rick Grimes.
That is so fun. But seriously, it kind of hallucinated there.
It didn't have the correct information.
It's so funny how it mixed the zombie apocalypse prompt with me.
I love that so much. Let's try a different model. I'll say /bye to exit.
I'll try a really fun one called Mistral. And by the way,
if you want to know which LLMs you can run with Ollama,
they've got a page for their models right here with all the ones you can run,
including Llama 2 Uncensored and WizardMath.
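In the terminal, switching models looks something like this (model names as they appear in the Ollama library; double-check the library page):

```bash
# /bye exits the current chat, then launch a different model from the library.
ollama run mistral             # Mistral 7B
ollama run llama2-uncensored   # the uncensored variant mentioned earlier
ollama run wizard-math         # WizardMath
```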
I might give that to my kids actually. Let's see what it says.
Now, who is Network Chuck?
My name is not Chuck Davis, and my YouTube channel is not called Network
Chuck on Tech,
so clearly the data this thing was trained on is either not up to date or just
plain wrong. So now the question is: cool,
we've got this local, private AI, this LLM, that's super powerful,
but how do we teach it the correct information for us?
How can I teach it to know that I'm Network Chuck, Chuck Keith, not Chuck Davis,
and that my channel is called NetworkChuck?
Or maybe I'm a business and I want it to know more than just what's publicly
available. Because sure, right now, if you downloaded this LLM,
you could probably use it in your job,
but you can only go so far without it knowing more about your job. For example,
maybe you're on a help desk.
Imagine if you could take your help desk's knowledge base, your IT procedures,
your documentation. Not only that,
but maybe you have a database of closed tickets, open tickets.
If you could take all that data and feed it to this LLM and then ask it
questions about all of that, that would be crazy.
Or maybe you wanted to help troubleshoot code that your company's written.
You could even make this LLM public-facing for your customers.
You feed it information about your product, and the customer could interact with
that chatbot you make.
All of this is possible with a process called fine-tuning, where we can train
this AI on our own proprietary, secret, private stuff about our
company, or our lives, or whatever the use case is.
And this is fantastic, because maybe before, you couldn't use a public LLM:
you weren't allowed to share your company's data with that LLM,
whether for compliance reasons or simply because the data is secret.
Whatever the case, it's possible now, because this AI is private.
It's local, and whatever data you feed it
stays right there in the company. It's not leaving the door.
That idea just makes me so excited because I think it is the future of AI and
how companies and individuals will approach it. It's going to be more private.
Back to our question, though. Fine-tuning sounds cool,
training an AI on your own data, but how does that work?
Because as we saw before, when Meta pre-trained this model,
it took them over 6,000 GPUs and 1.7 million GPU hours.
Do we have to have a massive data center to make this happen? No.
Check this out, and this is such a fun example. VMware asked ChatGPT:
what's the latest version of VMware vSphere?
Now, the latest version ChatGPT knew about was vSphere 7.0,
but that wasn't helpful to VMware, because the latest version they were working
on, vSphere 8 Update 2, hadn't been released yet, so it wasn't public knowledge.
And they wanted information like this, internal information not yet released to
the public,
to be available to their internal team, so they could ask
something like ChatGPT, hey, what's the latest version of vSphere,
and it would answer correctly.
So to do what VMware is trying to do, to fine-tune a model or train it on new
data, it does require a lot. First of all,
you would need some hardware: servers with GPUs.
Then you would also need a bunch of tools, libraries, and SDKs, like PyTorch,
TensorFlow, pandas, NumPy, scikit-learn, Transformers, and fastai.
The list goes on.
You need lots of tools and resources in order to fine-tune an LLM.
That's why I'm a massive fan of what VMware is doing right here.
They have something called VMware Private AI with NVIDIA.
The gajillion things I just listed off? They include them in one package,
one combo meal, a recipe of AI fine-tuning goodness.
So as a company, it becomes a bit easier to do this stuff yourself, locally.
For the system engineer you have on staff who knows VMware and loves it,
they could implement this,
and for the data scientists on staff who will
actually do the fine-tuning, all the tools are right there.
So here's what it looks like to fine-tune, and we're going to peek behind
the curtain at what a data scientist actually does.
First we have the infrastructure, and we start here in vSphere, VMware.
Now, if you don't know what vSphere or VMware is, think virtual machines.
You've got one big physical server: the hardware, the stuff you can feel,
touch, and smell. (If you haven't smelled a server, I don't know what you're doing.)
And instead of installing one operating system on it, like Windows or Linux,
you install VMware's ESXi,
which then allows you to virtualize, or create, a bunch of additional virtual
computers. So instead of one computer,
you've got a bunch of computers all using the same hardware resources.
a virtual machine.
This by the way is one of their special deep learning VMs that has all the tools
I mentioned and many, many more pre-installed, ready to go.
Everything a data scientist could love.
It's kind of like a surgeon walking in to do surgery, where the assistants
have prepared all the tools.
It's all in the tray, laid out nice and neat, and all the surgeon
has to do is walk in and say, scalpel.
That's what we're doing here for the data scientist.
Now, talking more about hardware:
this VM has a couple of NVIDIA GPUs assigned to it, or passed through to it, via
a technology called PCIe passthrough. These are some beefy GPUs.
Notice they are vGPUs, for virtual GPU; similar to what you do with a CPU,
the physical GPU is cut up and slices of it are assigned to the virtual
machine. So here we are in the data scientist's world. This is a Jupyter notebook,
a common tool used by a data scientist,
and what you're going to see here is a lot of code that they're using to prepare
the data,
specifically the data that they're going to train or fine-tune the existing
model on. Now we're not going to dive deep on that,
but I do want you to see this, check this out.
A lot of this code is all about getting the data ready. So in VMware's case,
it might be a bunch of the knowledge base and product documentation, and they're
getting it ready to be fed to the LLM. And here's what I wanted you to see:
here's the dataset that we're fine-tuning this model on.
We only have 9,800 examples that we're giving it, 9,800 new prompts or
pieces of data. And that data might look like this:
a simple question or prompt, and then we feed it the correct answer.
That's how we essentially train AI. But again,
we're only giving it 9,800 examples,
which is not a lot at all and is extremely small compared to how the
model was originally trained.
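As a rough illustration (this is a hypothetical record, not VMware's actual data format), fine-tuning examples are often stored as JSON lines pairing a prompt with the desired answer:

```bash
# Append one hypothetical training record in JSON Lines format;
# real field names vary by fine-tuning toolkit.
cat >> train.jsonl <<'EOF'
{"prompt": "What is the latest version of VMware vSphere?", "completion": "The latest version is vSphere 8 Update 2."}
EOF
```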
And I point that out to say that we're not going to need a ton of hardware or
resources to fine-tune this model.
We won't need the 6,000-plus GPUs Meta needed to originally create it.
We're just adding to it,
changing some things, tuning it to our use case. And looking at
what will actually be changed when we run this training,
we're only changing 65 million parameters, which sounds like a lot, right?
But not in the grand scheme of a 7 billion parameter model:
65 million out of 7 billion is only about 0.93% of the model.
And then we can actually run our fine-tuning.
This is a specific fine-tuning technique called prompt tuning, where we
simply feed it additional prompts with answers to change how it responds to
people's questions.
This process will take three to four minutes, because again,
we're not changing a lot. That is just so super powerful, and I think VMware
is leading the charge with private AI.
VMware and NVIDIA take all the guesswork out of getting things set up to fine-tune
an LLM. They've got deep learning VMs,
which are insane VMs that come pre-installed with everything you could want,
everything a data scientist would need to fine-tune an LLM.
Then NVIDIA has an entire suite of tools centered around their GPUs,
taking advantage of some really exciting things to help you fine-tune your LLMs.
Now, there's one thing I didn't talk about, because I wanted to save it for last.
It's this right here: this vector database,
this PostgreSQL box.
This is something called RAG, Retrieval-Augmented Generation, and it's what
we're about to do with our own personal GPT here in a bit. So, scenario:
let's say you have a database of product information, internal docs,
whatever it is, and you haven't fine-tuned your LLM on this just yet,
so it doesn't know about it. With RAG, you don't have to do that.
You can connect your LLM to this database of information,
this knowledge base, and give it these instructions:
whenever I ask you a question about anything in this database,
before you answer, consult the database.
Go look at it and make sure what you're saying is accurate.
We're not retraining the LLM; we're just saying, hey, before you answer,
go check real quick in this database to make sure
you've got your stuff right. Isn't that cool? So yes,
fine-tuning is cool, and training an LLM on your own data is awesome,
but in between those moments of fine-tuning,
you can have RAG set up, where the LLM consults your database,
your internal documentation, and gives correct answers based on what's in
that database. That is so stinking cool.
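To make the mechanics concrete, here's a minimal RAG-flavored sketch against a local Ollama instance; this is my own illustration, not VMware's stack. A real setup would retrieve from a vector database instead of grep and would JSON-escape the retrieved text, and notes.txt is a hypothetical knowledge-base file:

```bash
# Crude retrieval step: pull matching lines from a local "knowledge base" file.
QUESTION="What is the latest version of vSphere?"
CONTEXT=$(grep -i "vsphere" notes.txt | tr '\n' ' ')

# Generation step: hand the retrieved context to a local Ollama model.
curl -s http://localhost:11434/api/generate -d "{
  \"model\": \"llama2\",
  \"stream\": false,
  \"prompt\": \"Using only this context: $CONTEXT Answer this question: $QUESTION\"
}"
```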
So with VMware Private AI Foundation with NVIDIA,
they have those tools baked right in, so what would otherwise be a very complex
setup just kind of works. And by the way, this whole RAG thing,
like I said earlier, we're about to do it.
I actually connected a lot of my notes and journal entries to a private GPT
using RAG, and I was able to talk with it about me, asking it about my
journal entries and getting answers about my past. That's so powerful.
Now, before we move on,
I just want to highlight that NVIDIA, with NVIDIA AI Enterprise,
gives you some amazing and fantastic tools to pull the LLM of your choice and
then fine-tune, customize, and deploy that LLM. It's all built in right here.
So VMware Cloud Foundation
provides the robust infrastructure, and NVIDIA provides all the amazing AI
tools you need to develop and deploy these custom LLMs.
Now, it's not just NVIDIA; they're partnering with Intel as well.
So VMware covers all the tools that admins care about,
and then for the data scientists, this is for you:
Intel's got your back with data analytics,
generative AI and deep learning tools, and some classic ML, machine learning.
And they're also working with IBM, all you IBM fans; you can do this too. Again,
VMware has the admins' back, and for the data scientists there's Watson,
one of the first AI things I ever heard about, plus Red Hat and OpenShift.
And I love this, because what VMware is doing is all about choice.
If you want to run your own local private AI, you can.
You're not just stuck with one of the big guys out there, and you can choose to
run it with NVIDIA and VMware, Intel and VMware, or IBM and VMware.
You've got options. So there's nothing stopping you.
Now it's time for the bonus section of this video: how to run your
own private GPT with your own knowledge base. Fair warning,
it is a bit more advanced, but if you stick with me,
you should be able to get this up and running. So take one more sip of coffee;
let's get this going. Now, first of all, this will not be using Ollama.
This will be a separate project called PrivateGPT. Disclaimer:
this is kind of hard to do. Unlike VMware Private AI,
where they do it all for you,
a complete solution for companies to run their own private local AI,
what I'm about to show you is not that at all. It has no affiliation with VMware.
It's a free side project
you can try, just to get a little taste of what running your own private GPT with
RAG tastes like. Did I do that right? I don't know.
Now, Iván Martínez has a great doc on how to install this. It's a lot,
but you can do it. And if you just want a quick start,
he does have a few lines of code for Linux and Mac users. Fair warning:
that path is CPU-only, and you can't really take advantage of RAG without a GPU,
which is what I wanted to do. So here's my very specific scenario:
I've got a Windows PC with an NVIDIA 4090. How do I run this
Linux-based project? WSL. And I'm so thankful to this guy, Emilien Lancelot;
he put an entire guide together on how to set this up.
I'm not going to walk you through every step, because he already did (link
below), but I seriously need to buy this guy a coffee. How do I do that?
I don't know. Emilien, if you're watching this, reach out to me;
I'll send you some coffee. So anyways,
I went through every step, from installing all the prereqs to installing NVIDIA
drivers and using Poetry to handle dependencies (Poetry is pretty cool),
and I landed here:
a working, local, private GPT that I can access through my web
browser, and it's using my GPU, which is pretty cool.
Now, first I tried a simple document upload.
I've got this VMware article that details a lot of what we talked about in this
video. I uploaded it and started asking it questions about the article.
I tried something specific, like: show me something about VMware AI market growth.
Bam, it figured it out and told me. Then I'm like,
what's the coolest thing about VMware Private AI?
It told me. I'm sitting here chatting with a document. But then I'm like,
let's try something bigger: I want to chat with my journals.
I've got a ton of journals in Markdown format, and I want to ask it questions
about me. Now, this specific step is not covered in the article,
so here's how you do it. First,
you'll want to grab the folder of whatever documents you want to ask questions
about and get it onto your machine.
So I copied mine over to my WSL machine and then ingested it with this command.
Once that completed, I ran PrivateGPT again,
and here are all my documents, ready for questions.
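Here's a sketch of that step, assuming PrivateGPT is installed per the guide above. The paths are hypothetical, and the script name and profile variable can differ between PrivateGPT versions, so check the project docs:

```bash
# Copy the documents from Windows into WSL (hypothetical paths).
cp -r /mnt/c/Users/me/journals ~/journals
# Bulk-ingest the folder with PrivateGPT's ingest script.
poetry run python scripts/ingest_folder.py ~/journals
# Relaunch PrivateGPT; the UI is served at http://localhost:8001 by default.
PGPT_PROFILES=local poetry run python -m private_gpt
```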
So let's test this out. I'm going to ask it: what did I do in Takayama?
I went to Japan in November of 2023; let's see if it can search my notes
and figure out when that was and what I did.
That's awesome. Oh my goodness.
Let's see, what did I eat in Tokyo?
How cool is that? Oh my gosh, that's so fun. No, it's not perfect,
but I can see the potential here. That's insane. I love this so much.
Private AI is the future, and that's why we're seeing VMware bring products like
this to companies, letting them run their own private, local AI and making it pretty
easy. If you actually did that PrivateGPT thing, that little side project,
there's a lot to it: lots of tools you have to install; it's kind of a pain.
But VMware covers everything like that,
like the deep learning VM they offer as part of
their solution. It's got all the tools ready to go, pre-baked. Again,
you're like a surgeon just walking in and saying, scalpel;
you've got all the stuff right there. So if you want to bring AI to your company,
check out VMware Private AI (link below), and thank you to VMware by Broadcom for
sponsoring this video. You made it to the end of the video; time for a quiz.
This quiz will test the knowledge you've gained in this video, and the first five
people to get a hundred percent on it will get free coffee from Network
Chuck Coffee. So here's how to take the quiz right now:
check the description of this video and click on the link.
If you're not currently signed in to the academy, go ahead and get signed in.
If you're not a member, click sign up; it's free.
Once you're signed in,
it'll take you to your dashboard, showing all the stuff you have access to
with your free academy account. To get right back to the quiz,
go back to the YouTube video,
click that link once more, and it should take you right to it.
Click start and begin your quiz. Here's a little preview.
That's it: the first five to get a hundred percent get free coffee.
If you're one of the five,
you'll know, because you'll receive an email with free coffee.
You've got to be quick, you've got to be smart. I'll see you guys in the next video.