So Mistral AI recently released new models at sizes 3B, 8B, and 14B (sadly not 12B), along with a bigger 675B-A41B MoE model.
The smaller models seem to be the better ones, so I may as well just focus on those.
World knowledge on the 14B model seems to be worse than on Nemo, which was released well over a year ago — disappointing. But it seems like they aren't allowed to train on as much data anymore due to EU regulation on AI. Hopefully the EU will realize this is counterproductive and let AI companies compete against Chinese/US models under more relaxed regulations on training data. Otherwise Chinese models will dominate the market, and I guess that's fine?
Anyway, these models still seem pretty decent. We'll see if training on them works better than training on Nemo. The 3B model is also the only 3B model they have released publicly (they previously had one that was API-only for some odd reason).
Huggingface: https://huggingface.co/collections/mistr...inistral-3
Mistral's Blog post: https://mistral.ai/news/mistral-3