MiniMax M3: Open-Weight 1M-Context Coding Rival
MiniMax launched M3, a coding model with a 1M-token context via MSA, priced at a fraction of frontier rivals. Open-weight, but the training code stays closed.
Shanghai-based artificial intelligence startup MiniMax unveiled M3, its latest multimodal model, on June 1. Tailored for coding and agentic workflows, M3 leverages a proprietary MSA architecture (MiniMax Sparse Attention) to process context windows of up to one million tokens. The model accepts text, image, and video inputs natively.
Its aggressive pricing structure has drawn immediate industry attention. The API is priced at $0.30 per million input tokens and $1.20 per million output tokens—approximately 5% to 10% of the rates charged by leading proprietary models. Alongside the release of the model's weights, MiniMax claimed that M3 outperforms GPT-5.5 and Gemini 3.1 Pro on critical software engineering benchmarks, including SWE-bench Pro.
The Economics of One-Million-Token Contexts Powered by MSA
Underpinning this cost efficiency is the new MSA architecture. By processing only high-salience segments of an input rather than executing quadratic computations across all tokens, this MSA architecture reduces compute costs within the one-million-token range to approximately one-twentieth of the previous generation's requirements. The architecture reads each key-value block only once, keeping memory access contiguous. Furthermore, under the MSA architecture, decoding speeds have reportedly increased by up to 15.6 times.
While the model supports a full one-million-token context window, it guarantees a minimum threshold of 512,000 tokens depending on deployment configurations. This capacity is particularly significant for ingestion of entire codebases or executing multi-step agentic workflows. Native multimodality—trained on interleaved text, image, and video data—remains a rare offering among open-weight models within this price tier.
The convergence of low inference costs and expanded context windows continues to dominate the AI landscape. While the price reductions initiated by DeepSeek disrupted standard token pricing for proprietary models, M3 intensifies the competition by introducing an open-weight alternative.
Evaluating Performance Across Coding and Agentic Benchmarks
MiniMax's performance claims focus heavily on coding and autonomous agent applications. On SWE-bench Pro, an evaluation measuring real-world software engineering capabilities, M3 recorded a score of 59.0%, exceeding reported metrics for GPT-5.5 and Gemini 3.1 Pro while approaching those of Claude 4.7 Opus. The model also posted a 66.0% success rate on TerminalBench 2.1 and scored 83.5 on BrowseComp for autonomous browsing, marginally outperforming Opus 4.7, which scored 79.3.
However, these benchmarks remain self-reported by the developer. As noted by VentureBeat, while M3 achieves leading results in targeted evaluations, it does not consistently outperform premium proprietary systems, such as Opus 4.8, across a broader spectrum of capabilities. Until independent, third-party evaluations are conducted, industry analysts suggest viewing M3 primarily as a coding AI model specialized for agentic workflows.
The Limits of the Open-Weight Release Model
Upon launch, M3 was made immediately available via API, with support integrated across platforms including OpenRouter and Ollama. The provider also introduced OpenAI- and Anthropic-compatible endpoints to facilitate seamless integration into existing software pipelines. MiniMax is distributing the model's weights through Hugging Face.
However, the openness of the release is constrained. By withholding the training dataset, training code, and proprietary inference operators, the release has drawn criticism for being an open-weight distribution rather than a fully open-source project. While developers can deploy and run M3 locally, replicating the model's training from scratch is impossible.
Nonetheless, the combination of frontier-class coding performance, a one-million-token context window, and native multimodal support at this price point represents a significant market entry. This release increases pressure on proprietary developers to lower their pricing, though the long-term adoption of this coding AI model will depend on how its performance claims hold up under independent, third-party evaluation.
- MiniMax Blog - MiniMax-M3
- VentureBeat - MiniMax M3 debuts, eclipsing GPT-5.5 and Gemini 3.1 Pro on a key benchmark for 5-10% of the cost
- The Decoder - MiniMax M3: open-weight model with a million-token context challenges proprietary leaders
- OpenRouter - MiniMax: MiniMax M3
- Open Source For You - MiniMax challenges AI rivals with M3, but stops short of full open-source commitment