NEW Mixtral 8x22b Tested - Mistral's New Flagship MoE Open-Source Model
TLDR: The video tests Mistral's new 8x22b parameter MoE open-source model, a significant upgrade from the previous 8x7b model. The base model and its fine-tuned version, Kurasu Mixt 8x22b, are evaluated on tasks including coding, game playing, logic reasoning, and problem-solving. The model shows promising results, particularly on coding and logic puzzles, though it falls short in some areas, such as the snake game and a complex math problem. The video closes with a call to action for likes and subscriptions.
Takeaways
- 🚀 Introduction of Mistral's new 8x22b parameter MoE (Mixture of Experts) open-source model, a significant upgrade from the previous 8x7b model.
- 🎉 Announcement made by Mistral AI via a torrent link, with no additional information provided.
- 🔍 Discovery that the model is a mixture-of-experts model, named Mixtral 8x22b v0.1.
- 🌟 Presentation of a fine-tuned version of the model by Light Blue, called Kurasu Mixt 8x22b, designed for chatting.
- 📊 Use of  to run inferences at no cost, showcasing the variety of models available on the platform.
- 💻 Testing of the model begins with coding tasks, starting with a simple Python script to output numbers 1 to 100.
- 🎮 Attempt to write a Python script for the classic game 'Snake', demonstrating the model's capabilities in handling more complex tasks.
- 📝 Correction of the 'Snake' game script with additional instructions for score display and game ending conditions.
- 🔍 Exploration of the model's uncensored capabilities by pushing its boundaries with hypothetical scenarios.
- 🧠 Assessment of the model's logic and reasoning abilities through various problem-solving tasks.
- 📈 Evaluation of the model's performance through a series of tests, including math problems, planning, prediction, and complex logical reasoning.
- 🔥 Conclusion that while the 8x22b model did not outperform the previous version, its potential is evident and future fine-tuned versions are anticipated.
Q & A
What is the new model released by Mistral AI?
-The new model released by Mistral AI is an 8x22b parameter MoE (Mixture of Experts) model, named Mixtral 8x22b v0.1.
How does the Mixtral 8x22b model compare to the previous 8x7b model?
-The Mixtral 8x22b model uses 22-billion-parameter experts in place of the earlier 7-billion-parameter ones, making it a substantially larger and potentially more capable model. However, it did not outperform the 8x7b model in the tests conducted.
What is the fine-tuned version of the Mixtral 8x22b model called?
-The fine-tuned version of the Mixtral 8x22b model is called Kurasu Mixt 8x22b, which is optimized for chat applications.
Which platform was used to run inference for the Kurasu Mixt 8x22b model?
- was used to run inference for the Kurasu Mixt 8x22b model; it is a free platform offering access to various AI models.
How did the Kurasu Mixt 8x22b model perform on the Snake Game?
-The Kurasu Mixt 8x22b model successfully created a version of the Snake Game where the snake could pass through walls and end the game by hitting itself. However, it did not correctly implement certain rules like ending the game when the snake leaves the window.
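The video's actual game code is not shown here, but the wall-wrapping behavior it describes can be sketched with a minimal grid-based step function (no graphics library, so the `GRID` size and coordinates are illustrative assumptions):

```python
GRID = 10  # board is GRID x GRID cells (illustrative size)

def step(snake, direction, grow=False):
    """Advance the snake one cell. The head wraps through walls
    (as in the version the model produced); returns None on
    self-collision, which ends the game."""
    hx, hy = snake[0]
    dx, dy = direction
    head = ((hx + dx) % GRID, (hy + dy) % GRID)  # wrap through walls
    body = snake if grow else snake[:-1]         # tail advances unless growing
    if head in body:
        return None  # snake hit itself: game over
    return [head] + body

snake = [(5, 5), (4, 5), (3, 5)]
snake = step(snake, (1, 0))   # move right
print(snake[0])               # -> (6, 5)
```

Ending the game when the snake leaves the window, as the follow-up instruction requested, would instead return `None` whenever the new head falls outside the grid rather than applying the modulo wrap.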
What was the result of the logic and reasoning test involving drying shirts?
-The Kurasu Mixt 8x22b model correctly calculated that it would take 16 hours to dry 20 shirts under the same conditions as 5 shirts take 4 hours to dry, assuming they are dried in batches.
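Under the batch-drying assumption, the arithmetic is a simple proportion; this sketch uses the numbers from the prompt (5 shirts per batch, 4 hours per batch):

```python
import math

# Premise from the video's question: 5 shirts take 4 hours to dry,
# and drying capacity is assumed fixed at 5 shirts per batch.
capacity = 5
hours_per_batch = 4
shirts = 20

batches = math.ceil(shirts / capacity)   # 4 batches of 5 shirts
print(batches * hours_per_batch)         # -> 16 hours
```

Note that if all 20 shirts could dry simultaneously, the answer would be 4 hours; the 16-hour figure only holds under the sequential-batches assumption.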
How did the model handle the request for information on illegal activities?
-The model appropriately refused to provide instructions on how to commit illegal activities, such as breaking into a car.
What was the outcome of the logic puzzle involving three killers in a room?
-The model incorrectly concluded that there would be two killers left in the room after one of the original three was killed by a new entrant. The correct answer is three: the two surviving original killers plus the entrant, who became a killer by killing one of them.
How did the model perform on the task of creating JSON for given data?
-The model successfully created a JSON object with the provided data about three people, including their names, ages, and genders.
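The specific people in the video's prompt are not reproduced here, but the task amounts to serializing records with the described fields (name, age, gender) into JSON; the entries below are hypothetical placeholders:

```python
import json

# Hypothetical records in the shape the video describes.
people = [
    {"name": "Alice", "age": 30, "gender": "female"},
    {"name": "Bob", "age": 25, "gender": "male"},
    {"name": "Carol", "age": 35, "gender": "female"},
]

# Serialize to a pretty-printed JSON object.
print(json.dumps({"people": people}, indent=2))
```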
What was the model's performance on the physics-related logic problem with the marble and the microwave?
-The model incorrectly stated that the marble would remain on the table because of gravity, not considering that the cup's orientation inside the microwave would affect the outcome.
How did the model handle the request for 10 sentences ending with the word 'Apple'?
-The model failed to provide sentences that ended with the word 'Apple', but it did include the word 'Apple' in every sentence.
Outlines
🚀 Testing the New Mixtral 8x22b Model
The paragraph discusses the excitement around the release of a new, massive open-source model by Mistral AI, named Mixtral 8x22b. It is a mixture-of-experts model with eight 22-billion-parameter experts, a significant upgrade from the previous eight-by-7-billion-parameter model. The author is eager to test it and compares it to the previous version, noting that this new release is a base model, not a fine-tuned one. A fine-tuned version called Kurasu Mixt 8x22b, designed for chat, is the one actually tested. The author uses  to run the inference for free and shares the link in the description. The paragraph also covers the initial test results: the model successfully writes a Python script to output numbers 1 to 100 and attempts to write a game in Python, showing promising results.
🧠 Logic, Reasoning, and Problem Solving with the New Model
This paragraph delves into the model's performance on various logic and reasoning tasks. It begins with a simple math problem involving drying shirts, where the model provides a correct answer using simple proportion. The model then tackles a more complex logic problem involving speed comparison, correctly applying the transitive property to conclude that Jane is faster than Sam. However, the model makes a mistake in a math problem involving arithmetic operations. It also fails to accurately predict the number of words in the author's response to a prompt and provides an incorrect explanation for a logic puzzle involving three killers in a room. The paragraph concludes with the model successfully creating JSON for given data and solving a logic problem about the location of a marble in a cup and a box scenario.
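The transitive-property step in the speed-comparison task can be made concrete with a small sketch; the numeric speeds are arbitrary placeholders, since only their ordering matters:

```python
# Given: Jane is faster than Joe, and Joe is faster than Sam.
# By transitivity, Jane is faster than Sam.
speed = {"Jane": 3.0, "Joe": 2.0, "Sam": 1.0}  # illustrative values

def faster(a, b):
    """True if person a is faster than person b."""
    return speed[a] > speed[b]

# The premises hold, and the transitive conclusion follows.
print(faster("Jane", "Joe") and faster("Joe", "Sam") and faster("Jane", "Sam"))  # -> True
```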
🎯 Final Assessment and Performance of the Kurasu Model
The final paragraph assesses the Kurasu model's performance on a variety of tasks, including creating sentences ending with a specific word and a classic problem involving digging a hole with a group of people. Despite not fully meeting expectations on the 'Apple' sentences task, the model shows promise on the others. It provides a nuanced explanation for the hole-digging problem, considering the rate of work and the combined effort of 50 people. The author reflects on the model's overall performance, noting that while it didn't outperform the previous 8x7b model, it performed very well, and there is potential for improvement with further fine-tuning.
Keywords
💡Mixture of Experts (MoE)
💡Open-Source
💡Parameter
💡Fine-Tuning
💡
💡Quantized
💡Snake Game
💡Censored
💡Logic and Reasoning
💡JSON
💡Physics
Highlights
Mistral AI has released a new 8x22b parameter MoE (Mixture of Experts) open-source model.
The new model is a significant upgrade from the previous 8x7b parameter model.
Mistral AI announced the model with a torrent link and no additional information.
The base model is not fine-tuned, but a fine-tuned version called Kurasu Mixt 8x22b is available for chat.
 is used to run inference for the model, offering a free platform for testing.
The model passed a test writing a Python script to output numbers 1 to 100.
The model successfully wrote a Snake game in Python, albeit with a minor issue where the snake could pass through walls.
The model's response to the snake game was improved with additional instructions, including a score display and ending the game when the snake leaves the window.
The model provided a partially uncensored response when pushed for details on a movie script involving breaking into a car.
The model correctly applied logic and reasoning to determine the drying time for 20 shirts based on the time it takes for 5 shirts to dry.
The model correctly used the transitive property to deduce that Jane is faster than Sam in a comparison of speeds.
The model made a mistake in a simple math problem, initially stating the incorrect answer but then providing the correct solution step by step.
The model failed to accurately predict the number of words in its response to a prompt, showing a lack of understanding of token count.
The model reasoned incorrectly in the 'killer problem', arriving at the wrong number of killers left in the room after the described events.
The model correctly created JSON for given data about three people, demonstrating understanding of data structuring.
The model provided a logical but incorrect answer to a physics-related problem involving a marble in a cup placed in a microwave.
The model gave a nuanced and correct response to a scenario involving John, Mark, a ball, a box, and a basket, explaining expectations versus reality.
The model failed to meet the challenge of producing 10 sentences ending with the word 'Apple', but included the word 'Apple' in every sentence.
The model correctly calculated the time it would take for 50 people to dig a 10-ft hole, considering the combined effort and rate of one person.
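The digging calculation is an idealized linear-scaling estimate. The one-person baseline below (5 hours for the 10-ft hole) is an assumption for illustration, since the video's exact figure isn't restated here; the model's nuance was that real-world crowding breaks this perfect scaling:

```python
# Idealized model: 50 people working in parallel take 1/50 of one person's time.
one_person_hours = 5.0   # assumed baseline: hours for one person to dig the hole
workers = 50

parallel_hours = one_person_hours / workers
print(parallel_hours)    # -> 0.1 hours (6 minutes) under perfect scaling
```

In practice, 50 people cannot all dig one 10-ft hole at once, which is why a nuanced answer qualifies the naive division.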