Post History

Current version by Nick Antonaccio

Current VersionJun 05, 2026 at 16:45

This release feels especially important: Minimax M3 is now available on Openrouter https://openrouter.ai/minimax/minimax-m3 , as well as on https://www.minimax.io and Ollama. It's currently 1/2 price until June 7th, and this model is a significant improvement over previous versions.

The M3 version has a 1 million token context limit, and is multi-modal from the ground up.

All the other Minimax LLMs up to M2.7 have been open sourced, and the expectation is that M3 weights will be released soon.

BTW, https://www.minimax.io also offers a suite of other specialized types of models to generate music, video, voice, etc., as well as a coding harness made for their models.

Here are a 3D game and CRUD database example I built with Minimax M3 this morning:

I notice that Minimax M3 performs a lot more structured testing and automatically iterates based upon the results of those tests, and it provides explicitly verbose feedback about what it's doing during the agentic session (I used it in Pi).

Minimax is promoting the fact that their sparce attention routing improves long context performance with 9.7x faster prefill and 15.6x faster decoding at 1 million tokens.

I can't wait to try this locally!

Previous Versions
Version 2Jun 05, 2026 at 16:45

This release feels especially important: Minimax M3 is now available on Openrouter https://openrouter.ai/minimax/minimax-m3 , as well as on https://www.minimax.io and Ollama. It's currently 1/2 price until June 7th, and this model is a significant improvement over previous versions.

The M3 version has a 1 million token context limit, and is multi-modal from the ground up.

BTW, https://www.minimax.io also offers a suite of other specialized types of models to generate music, video, voice, etc., as well as a coding harness made for their models.

Here are a 3D game and CRUD database example I built with Minimax M3 this morning:

I notice that minimax3 performs a lot more structured testing and automatically iterates based upon the results of those tests, and it provides explicitly verbose feedback about what it's doing during the agentic session (I used it in Pi).

Minimax is promoting the fact that their sparce attention routing improves long context performance with 9.7x faster prefill and 15.6x faster decoding at 1 million tokens.

I can't wait to try this locally!

Version 1Jun 01, 2026 at 16:16

This release feels especially important: Minimax M3 is now available on Openrouter https://openrouter.ai/minimax/minimax-m3 , as well as on https://www.minimax.io and Ollama. It's currently 1/2 price until June 7th, and this model is a significant improvement over previous versions.

The M3 version has a 1 million token context limit, and is multi-modal from the ground up.

BTW, https://www.minimax.io also offers a suite of other specialized types of models to generate music, video, voice, etc., as well as a coding harness made for their models.

Here are a 3D game and CRUD database example I built with Minimax M3 this morning:

I notice that minimax3 performs a lot more structured testing and automatically iterates based upon the results of those test, and it provides explicitly verbose feedback about what it's doing during the agentic session (I used it in Pi).

Minimax is promoting the fact that their sparce attention routing improves long context performance with 9.7x faster prefill and 15.6x faster decoding at 1 million tokens.

I can't wait to try this locally!