NEW Mixtral 8x22b Tested - Mistral's New Flagship MoE Open-Source Model

Matthew Berman
13 Apr 2024, 12:02

TLDR: The video covers testing of Mistral's new 8x22b MoE open-source model, a significant upgrade from the previous 8x7b model. The base model and its fine-tuned version, Karasu Mixtral 8x22b, are evaluated on tasks including coding, game playing, logic reasoning, and problem-solving. The model shows promising results, particularly in coding and logic puzzles, though it falls short in certain areas such as the snake game and a complex math problem. The video concludes with a call to action for likes and subscriptions.


🚀 Testing the New Mixtral 8x22B Model

The paragraph discusses the excitement around Mistral AI's release of a new, massive open-source model, Mixtral 8x22B. It is an 8x22-billion-parameter model, a significant upgrade from the previous 8x7-billion-parameter model. The author is eager to test the model and compares it to the previous version, noting that the new release is a base model, not a fine-tuned one. A fine-tuned version called Karasu Mixtral 8x22B, designed for chat, is also mentioned, and it is the one being tested. The author uses Infermatic.ai to run inference for free and shares the link in the description. The paragraph also covers the initial test results: the model successfully writes a Python script to output the numbers 1 to 100 and attempts to write a Snake game in Python, with promising results.
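For reference, the first coding test asks for something along the lines of the sketch below. This is an illustrative version of the task, not the model's actual output; the function name `numbers_1_to_100` is an assumption introduced here.

```python
def numbers_1_to_100():
    """Return the integers 1 through 100 inclusive."""
    # range() excludes its upper bound, so 101 is needed to include 100.
    return list(range(1, 101))


if __name__ == "__main__":
    # Print one number per line, as the test prompt asks for.
    for n in numbers_1_to_100():
        print(n)
```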


🧠 Logic, Reasoning, and Problem Solving with the New Model

This paragraph covers the model's performance on various logic and reasoning tasks. It begins with a simple math problem about drying shirts, which the model answers correctly using simple proportion. The model then tackles a logic problem comparing speeds, correctly applying the transitive property to conclude that Jane is faster than Sam. However, the model makes a mistake in a math problem involving arithmetic operations. It also fails to accurately predict the number of words in its own response to a prompt and reasons incorrectly about a logic puzzle involving three killers in a room. The paragraph concludes with the model successfully creating JSON for given data and working through location puzzles involving a marble in a cup and a ball moved between a box and a basket.
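The proportional reasoning behind the shirt-drying answer can be sketched as below. The 4-hour base drying time is a hypothetical figure (the summary gives only the shirt counts, 5 and 20), and both function names are assumptions. A second function shows the alternative reading, in which all shirts dry at the same time, for comparison.

```python
def drying_time_proportional(base_shirts, base_hours, shirts):
    """Scale drying time linearly with the number of shirts."""
    return base_hours * shirts / base_shirts


def drying_time_parallel(base_hours):
    """Alternative reading: all shirts dry simultaneously, so time is unchanged."""
    return base_hours


BASE_HOURS = 4.0  # hypothetical: not stated in the summary

print(drying_time_proportional(5, BASE_HOURS, 20))  # 16.0
print(drying_time_parallel(BASE_HOURS))             # 4.0
```

Which reading is intended depends on whether the shirts are assumed to dry in sequence or side by side; the summary reports only that the model used the proportional one.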


🎯 Final Assessment and Performance of the Karasu Model

The final paragraph assesses the Karasu model's performance on further tasks, including writing sentences that end with a specific word and a classic problem about a group of people digging a hole. Although it does not fully meet expectations in the 'Apple' sentences task, the model shows promise in the other tasks. It provides a nuanced explanation for the hole-digging problem, considering the rate of work and the combined effort of 50 people. The author reflects on the model's overall performance, noting that while it didn't outperform the previous 8x7B model, it performed very well and there's potential for improvement with further fine-tuning.
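A small checker makes the pass criterion for the 'Apple' task concrete: a sentence counts only if its last word is the target, not merely if the word appears somewhere in it. This is a minimal sketch; the helper names and the two sample sentences are hypothetical and are not taken from the video.

```python
import string


def ends_with_word(sentence, word="apple"):
    """Return True if the sentence's final word matches `word`, ignoring case and punctuation."""
    words = sentence.strip().split()
    if not words:
        return False
    # Drop trailing punctuation such as "." or "!" before comparing.
    last = words[-1].strip(string.punctuation)
    return last.lower() == word.lower()


def count_passing(sentences, word="apple"):
    """Count how many sentences end with the target word."""
    return sum(ends_with_word(s, word) for s in sentences)


# Hypothetical sample sentences for illustration only.
samples = [
    "She handed me a crisp red apple.",  # ends with the target word
    "The apple fell from the tree.",     # contains the word but does not end with it
]
print(count_passing(samples))  # 1
```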



Mistral AI has released a new 8x22b parameter MoE (Mixture of Experts) open-source model.

The new model is a significant upgrade from the previous 8x7b parameter model.

Mistral AI announced the model with a torrent link and no additional information.

The base model is not fine-tuned, but a fine-tuned version called Karasu Mixtral 8x22b is available for chat.

Infermatic.ai is used to run inference for the model, offering a free platform for testing.

The model passed a test writing a Python script to output numbers 1 to 100.

The model successfully wrote a Snake game in Python, albeit with a minor issue where the snake could pass through walls.

The model's response to the snake game was improved with additional instructions, including a score display and ending the game when the snake leaves the window.

The model provided a partially uncensored response when pushed for details on a movie script involving breaking into a car.

The model correctly applied logic and reasoning to determine the drying time for 20 shirts based on the time it takes for 5 shirts to dry.

The model correctly used the transitive property to deduce that Jane is faster than Sam in a comparison of speeds.

The model made a mistake in a simple math problem, initially stating the incorrect answer but then providing the correct solution step by step.

The model failed to accurately predict the number of words in its response to a prompt, showing a lack of understanding of token count.

The model reasoned incorrectly in the 'killer problem', reaching the wrong conclusion about the number of killers left in the room after a series of events.

The model correctly created JSON for given data about three people, demonstrating understanding of data structuring.

The model provided a logical but incorrect answer to a physics-related problem involving a marble in a cup placed in a microwave.

The model gave a nuanced and correct response to a scenario involving John, Mark, a ball, a box, and a basket, explaining expectations versus reality.

The model failed to meet the challenge of producing 10 sentences ending with the word 'Apple', but included the word 'Apple' in every sentence.

The model correctly calculated the time it would take for 50 people to dig a 10-ft hole, considering the combined effort and rate of one person.
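The idealized work-rate arithmetic behind the hole-digging answer can be sketched as follows. The 5-hour figure for one person is a hypothetical value (the summary does not state it), and `ideal_dig_time` is an assumed name. The calculation assumes every digger contributes at the same rate with no crowding, which is the simplification the puzzle usually expects.

```python
def ideal_dig_time(hours_for_one_person, people):
    """Time for `people` workers to finish a job one worker does in `hours_for_one_person`.

    Assumes work divides evenly and workers never get in each other's way.
    """
    if people <= 0:
        raise ValueError("people must be a positive number")
    return hours_for_one_person / people


HOURS_FOR_ONE = 5.0  # hypothetical: not stated in the summary

hours = ideal_dig_time(HOURS_FOR_ONE, 50)
print(f"{hours:.2f} hours (~{hours * 60:.0f} minutes)")  # 0.10 hours (~6 minutes)
```

In practice 50 people cannot all dig the same 10-ft hole at once, which is the kind of caveat a nuanced answer to this puzzle would mention.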