Stop paying for ChatGPT with these two tools | LMStudio x AnythingLLM
Summary
TLDR The video explains how to install LM Studio and the AnythingLLM desktop app to easily run an LLM locally. It covers downloading models from the Hugging Face repository and running a completions server with the Mistral 7B Q4 model. It then shows how AnythingLLM delivers a complete, fully private chat experience, with the ability to add files and websites to improve the model's understanding and responses. The speaker stresses that the final experience depends on the model used and recommends choosing a popular model such as Llama 2 or Mistral for the best results.
Takeaways
- 😀 Implex Labs is the creator of AnythingLLM and offers an easy way to run an LLM locally on your own computer.
- 🛠️ The tutorial uses two single-click installable tools, LM Studio and AnythingLLM Desktop, to deliver a polished user experience.
- 💻 LM Studio supports three operating systems; the tutorial focuses on the Windows version because that is the machine with a GPU.
- 🔗 AnythingLLM provides a chat interface that can connect to almost anything, includes many useful free features, and is open source for contributions.
- 📥 The tutorial requires downloading an LLM through LM Studio, which can take some time but is the main step to get started.
- 📊 LM Studio has an interactive interface that showcases popular models such as Google's Gemma and supports comparing different models.
- 🔍 Users can try downloaded models in LM Studio's simple built-in chat client, but the focus is on using AnythingLLM to unlock the full power.
- 🔗 The tutorial explains how to connect LM Studio and AnythingLLM by starting the server and copying the appropriate URL.
- 📈 AnythingLLM can use personal content to improve the LLM's responses by adding and embedding documents.
- 📝 Response accuracy depends on the content and information available to the model, underscoring the importance of embedding the right sources.
- 🌐 The end result is a fully private, end-to-end system for chatting with documents using the latest open-source models.
- 📚 The tutorial recommends choosing strong models such as Llama 2 or Mistral for the best chat experience.
Q & A
What is Implex Labs, and how is it related to AnythingLLM?
-Implex Labs is a company founded by Timothy Carat and is the creator of AnythingLLM, an application that makes it easy to run an LLM locally.
What is AnythingLLM?
-AnythingLLM is an all-in-one chat application that can connect to almost anything and offers a wide range of free features.
Why is the AnythingLLM experience better with a GPU?
-A GPU provides a faster and more efficient experience, especially when using larger or more complex models.
Which tools are used to run AnythingLLM locally?
-LM Studio and AnythingLLM Desktop, both of which can be installed with a single click.
How do I download and install AnythingLLM on my machine?
-Go to useanything.com, choose the AnythingLLM desktop download, and select the appropriate operating system.
What does LM Studio provide?
-LM Studio provides an interactive interface for trying out and running different models, and includes a built-in chat client for interacting with them.
What is AnythingLLM's main privacy advantage?
-AnythingLLM is fully open source, which lets users add the integrations they want, and it keeps everything fully private.
How do I use LM Studio with AnythingLLM?
-Start the inference server in LM Studio, copy the server URL, and paste it into AnythingLLM to connect the two.
What steps are needed to get started with AnythingLLM after installation?
-After installing, enter the required information, such as the token context window and the LM Studio base URL, then set up the server and create a new workspace.
How does AnythingLLM help the LLM understand private content?
-AnythingLLM can add private files and websites for the LLM to draw on, improving the accuracy and relevance of its responses.
What does the sentence 'AnythingLLM is an AI business intelligence tool that produces human-like text' mean?
-It means AnythingLLM can analyze and generate human-like text messages, with LLM support and a variety of models for businesses.
How do I know which models work best with AnythingLLM?
-Choose popular models such as Llama 2 or Mistral, which offer a good experience, or look for models specialized in a particular domain.
Why is it recommended to know the details of the LLM you choose?
-Because those details determine the final chat experience; each model has different characteristics and capabilities, so it is important to pick one that fits your needs.
How can I get more information and the links for LM Studio and AnythingLLM?
-The links are in the video description, giving you access to the details and downloads you need.
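The token context window asked about above is simply a budget on how much text the model can see at once. A minimal sketch of how a chat client might keep history within that budget, using a crude characters-per-token heuristic (exact counts require the model's tokenizer, so treat this as an approximation):

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text.
    # Real counts require the model's own tokenizer.
    return max(1, len(text) // 4)

def trim_history(messages: list[str], max_tokens: int = 4096, reserve: int = 512) -> list[str]:
    """Drop the oldest messages until the prompt fits the context window,
    reserving some tokens for the model's reply."""
    budget = max_tokens - reserve
    kept: list[str] = []
    total = 0
    for msg in reversed(messages):  # walk backwards to keep the most recent messages
        t = estimate_tokens(msg)
        if total + t > budget:
            break
        kept.append(msg)
        total += t
    return list(reversed(kept))
```

The 4096 default matches the context window used later in the walkthrough; the 512-token reserve is an arbitrary illustrative choice.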
Outlines
😀 Introduction to Implex Labs and Anything LLM
Timothy Carat, the founder of Implex Labs and creator of Anything LLM, introduces himself and the purpose of the video. He aims to demonstrate the simplest method to run a highly capable, locally hosted, large language model (LLM) application on a laptop or desktop, preferably with a GPU for an enhanced experience. Timothy mentions two tools, LM Studio and Anything LLM Desktop, which are both single-click installable applications. He highlights that Anything LLM is a fully private, open-source chat application that can connect to various platforms and offers many features for free. The tutorial will guide viewers through setting up LM Studio on a Windows machine, exploring its capabilities, and integrating it with Anything LLM to unlock its full potential.
🔧 Setting Up LM Studio and Testing the Chat
The video proceeds with a step-by-step guide on setting up LM Studio on a Windows desktop. The process involves downloading and installing LM Studio and Anything LLM Desktop. Timothy explains that the installation of these two programs completes half of the setup. He then demonstrates how to use LM Studio, focusing on downloading models, such as the Mistral 7B Q4 model, from the Hugging Face repository. He also discusses the importance of GPU offloading for faster token processing and provides a brief tutorial on how to use the chat client within LM Studio. The chat client is used to test the model's response to a simple prompt, like saying 'hello,' and to showcase the metrics provided by LM Studio, such as time to first token.
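The Q4/Q5/Q8 labels mentioned above refer to the bit width of the quantized weights, which is what drives the download size. A rough back-of-the-envelope estimate (the effective bits-per-weight values below are assumptions that fold in quantization metadata; real GGUF files vary):

```python
def approx_model_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough file size: parameter count times bits per weight, converted to GB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# Approximate sizes for a 7B-parameter model at different quantization levels.
# The effective-bit values (4.5 / 5.5 / 8.5) are illustrative assumptions.
for label, bits in [("Q4", 4.5), ("Q5", 5.5), ("Q8", 8.5)]:
    print(f"{label}: ~{approx_model_size_gb(7, bits):.1f} GB")
```

The ~7.4 GB estimate for Q8 is in the same ballpark as the 7.7 GB download mentioned in the walkthrough.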
🤖 Integrating Anything LLM with LM Studio
In this section, Timothy shows how to integrate Anything LLM with LM Studio. He first launches Anything LLM and navigates to the setup for LM Studio, requiring a token context window and the LM Studio base URL. He explains how to start a server in LM Studio to run completions against the selected model. The tutorial continues with instructions on configuring the server, including setting the port, enabling request queuing, and allowing GPU offloading. After starting the server, Timothy demonstrates how to connect LM Studio's inference server to Anything LLM by copying and pasting the necessary URL. He also discusses how to augment the model's knowledge with private documents or by scraping websites, which can then be embedded to improve the model's responses. The video concludes with a demonstration of asking the model a question about Anything LLM and seeing how the response improves after embedding relevant information.
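The LM Studio local server described above speaks an OpenAI-compatible API, which is why AnythingLLM only needs the base URL to connect. A minimal sketch of talking to it directly (the port 1234 default and the `"local-model"` placeholder name are assumptions; use whatever your own server shows):

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234/v1"  # LM Studio's default; the port is configurable

def build_chat_request(prompt: str, model: str = "local-model") -> dict:
    """Build an OpenAI-style chat completion payload for the local server."""
    return {
        "model": model,  # LM Studio serves whichever model is currently loaded
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

def ask(prompt: str) -> str:
    """POST the payload to the chat completions endpoint and return the reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# ask("What is Anything LLM?")  # requires the LM Studio server to be running
```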
🚀 Conclusion and Future Potential
Timothy concludes the tutorial by emphasizing the ease with which local LLM usage can be achieved using tools like LM Studio and Anything LLM Desktop. He points out that these tools demystify the technical aspects of running a local LLM and allow users to have a comprehensive LLM experience without the need for a subscription to services like OpenAI. He also reminds viewers that the choice of model is crucial for the quality of the chatting experience and suggests opting for popular and capable models like Llama 2 or Mistral. The video ends with an invitation for feedback and a promise to include helpful links in the description for further exploration.
Keywords
💡Implex Labs
💡AnythingLLM
💡LM Studio
💡Windows
💡GPU
💡Hugging Face
💡Q4 model
💡CUDA
💡ChatGPT
💡open source
Highlights
Timothy Carat, founder of Implex Labs, introduces a way to run a fully capable large language model (LLM) locally.
The tutorial demonstrates setting up Anything LLM and LM Studio for a private AI chat experience.
LM Studio and Anything LLM are both single-click installable applications.
The process is optimized for systems with GPUs but is also possible with CPUs.
LM Studio supports multiple operating systems, with a focus on Windows in this tutorial.
Anything LLM is an all-in-one chat application that is fully private and open source.
The tutorial guides through downloading and setting up models in LM Studio.
Models from the Hugging Face repository can be downloaded and used in LM Studio.
Different model types like Q4, Q5, and Q8 are explained, with recommendations for usage.
LM Studio's chat client is used for experimenting with models.
The importance of GPU offloading for faster token generation is discussed.
Anything LLM is downloaded and set up to work with LM Studio.
Instructions on configuring the LM Studio server for model completions are provided.
Connecting the LM Studio inference server to Anything LLM is detailed.
The tutorial shows how to enhance the LLM's knowledge with private documents or web scraping.
A demonstration of asking the model about Anything LLM with and without context.
The ability to embed and modify information within Anything LLM is highlighted.
The video concludes with the benefits of using LM Studio and Anything LLM for a private, end-to-end LLM system.
The tutorial emphasizes the ease of setting up a local LLM without technical expertise.
LM Studio and Anything LLM are positioned as core parts of a local LLM stack.
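The document-embedding flow in the highlights above (embed private content, retrieve the most relevant chunk, feed it to the model) can be sketched with a toy bag-of-words retriever. Real systems, AnythingLLM included, use neural sentence embeddings and a vector database, so this is only a conceptual illustration:

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; real embedders produce dense neural vectors."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical document chunks, standing in for scraped or uploaded content.
chunks = [
    "anythingllm is an all-in-one desktop chat application",
    "lm studio runs local models downloaded from hugging face",
]
query = "what is anythingllm"
best = max(chunks, key=lambda c: cosine(embed(query), embed(c)))
# `best` would then be prepended to the prompt as context for the local LLM.
```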
Transcripts
Hey there, my name is Timothy Carat, founder of Implex Labs and creator of AnythingLLM, and today I want to show you possibly the easiest way to get an extremely capable, locally running, fully RAG-enabled, talk-to-anything LLM application running on your laptop or desktop. If you have something with a GPU this will be a way better experience; if all you have is a CPU, it's still possible. We're going to use two tools, both of which are single-click installable applications: one is LM Studio, and the other is, of course, AnythingLLM Desktop.

Right now I'm on lmstudio.ai. They support three different operating systems; we're going to use the Windows one today, because that's the machine I have a GPU for. I'll show you how to set it up, how the chat normally works, and then how to connect it to AnythingLLM to really unlock a lot of its capabilities. If you aren't familiar with AnythingLLM, it's an all-in-one chat-with-anything desktop application. It's fully private, it can connect to pretty much anything, and you get a whole lot for free. AnythingLLM is also fully open source, so if you're capable of programming or have an integration you want to add, you can actually do it, and we're happy to accept contributions. So what we're going to do now is switch over to my Windows machine, and I'm going to show you how to use LM Studio with AnythingLLM, walking through both products so you can get honestly the most comprehensive LLM experience and pay nothing for it.

Okay, so here we are on my Windows desktop, and of course the first thing we want to do is click "LM Studio for Windows". This is version 0.2.16; whatever version you're on, things may change a little bit, but in general this tutorial should be accurate. You're also going to want to go to useanything.com, go to "Download AnythingLLM for Desktop", and select your appropriate operating system. Once you have these two programs installed, you are actually 50% done with the entire process; that's how quick this is. Let me get LM Studio installed and running, and we'll show you what that looks like.

So you've probably installed LM Studio by now. You click the icon on your desktop, and you usually get dropped on this screen. I don't work for LM Studio, so I'm just going to show you some of the capabilities that are relevant to this integration and to really unlocking any LLM you use. They land you on this exploring page, and it's great: it shows you some of the more popular models that exist. Google's Gemma just dropped, for example, and it's already live, which is really awesome. If you click on the bottom, you'll see I've actually already downloaded some models, because this takes time; downloading the models will probably take you the longest out of this entire operation. I went ahead and downloaded Mistral 7B Instruct; the Q4 means a 4-bit quantized model. Now, I'm using a Q4 model, and honestly Q4 is kind of the lowest end you should really go for; Q5 is really, really great, and Q8 if you want.

If you go and look up any model on LM Studio, for example Mistral, you can see there's a whole bunch of models and a whole bunch of different types, all coming from the Hugging Face repository and published by a bunch of different people. You can see how many times a given one has been downloaded; this is a very popular model. Once you click on it, you'll likely get some options. LM Studio will tell you whether the model is compatible with your GPU or your system; this is pretty accurate, though I've found that sometimes it doesn't quite work. One thing you'll be interested in is full GPU offloading, which is exactly what it sounds like: using the GPU as much as you can. You'll get way faster tokens, honestly on the speed level of ChatGPT, if you're working with a small enough model or have a big enough graphics card. I have 12 gigs of VRAM available, and you can see there are all these Q4 models; again, you probably want to stick with at least the Q5 models for the best experience versus size. As you can see, the Q8 is quite hefty at 7.7 gigs, and even if you have fast internet that won't matter, because it takes forever to download something from Hugging Face; if you want to get working on this today, you might want to start the download now. For the sake of this video, I've already downloaded a model.

So now that we have a model downloaded, we're going to want to chat with it. LM Studio actually comes with a chat client inside of it; it's very, very simplistic, though, and really just for experimenting with models. We're going to go to this chat bubble icon, and you can see we have a thread already started. I'm going to pick the one model I have available, and you'll see this loading bar continue. There are some system prompts you can preset for the model; I have GPU offloading enabled and set to max already, and as you can see, I have NVIDIA CUDA already going. There are some tools and some other things you can mess with, but in general that's really all you need to do. So let's test the chat and just say "hello, how are you", and you get the pretty standard response from any AI model. You even get some really cool metrics down here, like time to first token, which was 1.21 seconds, and it shows the GPU layers in use. However, you really can't get much out of this right here: if you wanted to add a document, you'd have to copy-paste it into the entire user prompt. There's really a lot more that can be done to leverage the power of this local LLM I have running, even though it's quite a small one. So to really express how powerful these models can be for your own local use, we're going to use AnythingLLM. I've already downloaded AnythingLLM; let me show you how to get that running and how to get LM Studio to work with AnythingLLM.

I've just booted up AnythingLLM after installing it, and you'll usually land on a screen like this. Let's get started. We already know who we're looking for here: LM Studio. You'll see it asks for two pieces of information: a token context window, which is a property of your model you'd already be familiar with, and the LM Studio base URL. If we open up LM Studio and go to the local server tab on the side, this is a really, really cool part of LM Studio. It doesn't work with multi-model support, so once you have a model selected, that's the model you're going to be using. Here we're going to select the exact same model, but we're going to start a server to run completions against it. The way we do that is to configure the server port (usually it's 1234, but you can change it to whatever you want); you probably want to turn off CORS; allow request queuing, so you can keep sending requests over and over and they don't just fail; and enable logging and prompt formatting, which are all just debugging tools. On the right side, you're still going to want to make sure you have GPU offloading allowed if that's appropriate, but other than that, you just click "Start Server", and you'll see some logs get saved here.

Now, to connect the LM Studio inference server to AnythingLLM, you just want to copy this string right here, up to the "/v1" part, then open AnythingLLM and paste it in here. I know my model's max token window is 4096, so I'll click next. For the embedding preference, we don't really even need one; we can just use the AnythingLLM built-in embedder, which is free and private, and the same goes for the vector database. All of this is going to be running on machines that I own. Then of course we can skip the survey and make our first workspace, which we'll just call "anything llm".

We don't have any documents or anything like that, so if we were to send a chat asking the model about AnythingLLM, we'll either get a refusal response or it will just make something up. So let's ask "what is anything llm". If you go to LM Studio during any part of this, you can actually see that we sent the request to the model and it is now streaming the response: the first token has been generated, and it continues to stream. When AnythingLLM receives that first token of the stream, that's when we start to show it on our side, and you can see we get a response. It just kind of pops up instantly, which was very quick, but it is totally wrong, and it's wrong because we don't have any context to give the model on what AnythingLLM actually is. Now, we can augment the LLM's ability to know about our private documents by clicking and adding them here, or I can just go and scrape a website. So I'm going to scrape the useanything.com homepage, because that should give us enough information, and you'll see that we've scraped the page. Now it's time to embed it, so we'll just run that embedding, and now our LLM should be smarter. Let's ask the same question again, this time knowing that it has information that could be useful. Now you can see we've been given a response that says AnythingLLM is an AI business intelligence tool that forms human-like text messages based on prompts, offering LLM support as well as a variety of enterprise models. This is definitely much more accurate, and we also tell you where the information came from: you can see it cited the useanything.com website, and these are the actual chunks that were used to formulate the response. So now we have a very coherent machine: we can embed and modify, create different threads, and do a whole bunch of stuff from within AnythingLLM, but the core piece of infrastructure, the LLM itself, is running on LM Studio on a machine that we own. We now have a fully private, end-to-end system for chatting with documents privately, using the latest and greatest open-source models available on Hugging Face.

Hopefully this tutorial on how to integrate LM Studio and AnythingLLM Desktop was helpful for you and unlocks a whole bunch of potential for your local LLM usage. Tools like LM Studio, Ollama, and LocalAI make running a local LLM no longer a very technical task. Pair a tool that provides an interface, like LM Studio, with another more powerful tool built exclusively for chatting, like AnythingLLM on your desktop, and now you can have this entire experience and not have to pay OpenAI 20 bucks a month. Again, I do want to reiterate that the model you use will ultimately determine your experience with chatting. There are more capable models, and there are more niche models for programming, so be careful and know about the model you're choosing, or just choose some of the more popular ones like Llama 2 or Mistral, and you'll honestly be great. Hopefully LM Studio plus AnythingLLM Desktop become a core part of your local LLM stack; we're happy to be a part of it and to hear your feedback. We'll put the links in the description. Have fun!