Anthropic’s new Claude Opus 4.7 model landed on April 16, 2026, and developers say it is already reshaping how people must write prompts. The release keeps list prices steady at $5 per million input tokens and $25 per million output tokens, yet a wave of breaking API changes is forcing engineers to rewrite code that worked on Opus 4.6 just days ago.
At a Glance:
- Claude Opus 4.7 released April 16, 2026, across API, Bedrock, Vertex AI and Microsoft Foundry.
- Model scores 87.6% on SWE-bench Verified, up from 80.8% on Opus 4.6.
- Non-default temperature, top_p and top_k values now return a 400 error.
- New tokenizer can consume up to 35% more tokens for identical text.
A Flagship Model That Rewrites The Rules
Anthropic on Thursday announced a new artificial intelligence model, Claude Opus 4.7, which the company said improves on past models but is “less broadly capable” than its most recent offering, Claude Mythos Preview. The model is pitched as the strongest one available to the general public.
Anthropic released Claude Opus 4.7 on April 16, available across the Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry. Pricing stays the same as Opus 4.6: $5 per million input tokens and $25 per million output tokens.
The headline feature is not a benchmark. It is a new pricing reality hidden inside the model’s design. According to Anthropic’s official launch announcement, the model is tuned for long, agentic tasks and pays “precise attention to instructions,” which means sloppy prompts no longer get a free pass.

The API Changes That Break Old Code
Developers who upgraded on day one hit walls fast. Sampling parameters are gone. Setting temperature, top_p, or top_k to any non-default value returns a 400 error. If your agent pipelines set temperature to zero for determinism, that code breaks on upgrade.
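One defensive pattern is to scrub the rejected parameters before sending a request. This is a minimal sketch, assuming requests are assembled as plain dicts of API keyword arguments; the helper name and structure are our own, not part of any SDK.

```python
# Hypothetical helper: drop the sampling parameters that Opus 4.7 now
# rejects with a 400. Assumes the request is a plain dict of keyword args.
REMOVED_SAMPLING_PARAMS = ("temperature", "top_p", "top_k")

def strip_sampling_params(request: dict) -> dict:
    """Return a copy of the request without temperature/top_p/top_k,
    so a payload tuned for Opus 4.6 no longer errors on 4.7."""
    return {k: v for k, v in request.items() if k not in REMOVED_SAMPLING_PARAMS}

legacy = {"model": "claude-opus-4-7", "max_tokens": 1024, "temperature": 0}
print(strip_sampling_params(legacy))  # {'model': 'claude-opus-4-7', 'max_tokens': 1024}
```

Running this over every outgoing payload is cheaper than chasing 400s through an agent loop after the upgrade.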
The second shock involved thinking budgets. Instead of a manually set thinking token budget, adaptive thinking lets Claude decide when and how much extended thinking to use based on the complexity of each request. On Claude Opus 4.7, adaptive thinking is the only supported thinking mode; the manual thinking: {type: "enabled", budget_tokens: N} configuration is no longer accepted.
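A migration shim can translate old configs in one place. This is an illustrative sketch: the manual {type: "enabled", budget_tokens: N} shape comes from the article, but the {"type": "adaptive"} replacement value is our assumption about the wire format, not a documented constant.

```python
# Hypothetical migration: convert a manual Opus 4.6 thinking config to
# adaptive mode. The "adaptive" value is an assumed wire format.
def migrate_thinking(request: dict) -> dict:
    req = dict(request)
    thinking = req.get("thinking")
    if isinstance(thinking, dict) and thinking.get("type") == "enabled":
        # budget_tokens is no longer accepted; let the model decide instead.
        req["thinking"] = {"type": "adaptive"}
    return req

old = {"model": "claude-opus-4-7",
       "thinking": {"type": "enabled", "budget_tokens": 8000}}
print(migrate_thinking(old)["thinking"])  # {'type': 'adaptive'}
```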
A third change hit user interfaces. Starting with Claude Opus 4.7, thinking content is omitted from the response by default. Thinking blocks still appear in the response stream, but their thinking field will be empty unless the caller explicitly opts in.
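UI code that rendered thinking text should therefore stop assuming the field is populated. A minimal sketch, assuming responses arrive as lists of typed content blocks as described above; the helper itself is ours.

```python
# Illustrative guard for UIs that used to render thinking text. On Opus 4.7
# thinking blocks may arrive with an empty `thinking` field unless the
# caller explicitly opts in, so downstream code should tolerate absence.
def visible_thinking(blocks: list) -> str:
    """Join whatever thinking text is actually present, which may be none."""
    return "".join(b.get("thinking", "") for b in blocks
                   if b.get("type") == "thinking")

blocks = [{"type": "thinking", "thinking": ""},   # the 4.7 default
          {"type": "text", "text": "final answer"}]
print(repr(visible_thinking(blocks)))  # ''
```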
Anthropic’s official migration documentation also warns that the model processes text differently: Claude Opus 4.7 uses a new tokenizer that contributes to its improved performance on a wide range of tasks, but may consume roughly 1x to 1.35x as many tokens as previous models for identical text (up to ~35% more, varying by content).
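For budgeting, the worst case from the migration notes can be applied directly. The 1.35 multiplier comes from the documentation cited above; the helper is our own back-of-envelope sketch, not an Anthropic tool.

```python
# Back-of-envelope budgeting for the new tokenizer: identical text may cost
# 1.0x to 1.35x the old token count, so a cautious budget plans for the
# worst case. The multiplier is taken from the migration documentation.
WORST_CASE_INFLATION = 1.35

def worst_case_tokens(old_token_count: int) -> int:
    """Upper-bound estimate of the Opus 4.7 token count for the same text."""
    return int(old_token_count * WORST_CASE_INFLATION)

# A prompt that measured 10,000 tokens on Opus 4.6 could reach 13,500.
print(worst_case_tokens(10_000))  # 13500
```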
Why Bad Prompts Now Cost You Money
The sum of these changes has a name in developer circles: the ambiguity tax. Vague prompts used to lean on temperature=0 for stability. That lever is gone.
Task budgets replace the old hard caps on thinking. max_tokens is a hard ceiling per request, invisible to the model. task_budget is a suggestion across the whole loop that the model is aware of. Use task_budget for self-moderation, and max_tokens as the per-request fence. For open-ended agentic work where you want the best answer, skip the task budget.
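The distinction can be sketched in a request builder. Only the max_tokens/task_budget contrast comes from the text; whether task_budget is a top-level request key, and its exact name on the wire, are assumptions here.

```python
# Sketch of the two levers as described: max_tokens is a hard per-request
# ceiling invisible to the model; task_budget is a soft whole-loop hint the
# model can see. The task_budget key placement is an assumption.
def build_request(prompt: str, *, max_tokens: int, task_budget=None) -> dict:
    req = {
        "model": "claude-opus-4-7",
        "max_tokens": max_tokens,          # hard fence per request
        "messages": [{"role": "user", "content": prompt}],
    }
    if task_budget is not None:
        req["task_budget"] = task_budget   # soft hint across the agent loop
    return req

# Agentic loop with self-moderation:
bounded = build_request("refactor the module", max_tokens=4096,
                        task_budget=50_000)
# Open-ended task: skip the budget, keep only the fence.
open_ended = build_request("find the best fix", max_tokens=4096)
print("task_budget" in bounded, "task_budget" in open_ended)  # True False
```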
Anthropic has quietly shifted the economics of imprecision from free to billable. A fuzzy prompt forces the model to spend reasoning tokens figuring out what the user meant before any real work starts.
A new xhigh effort setting also moves the baseline price up. Anthropic’s “effort” parameter gains an xhigh (extra high) level, positioned between high and max, allowing more granular control over the depth of reasoning the model applies to a specific problem. Internal data shows that while max effort yields the highest scores (approaching 75% on coding tasks), the xhigh setting provides a compelling sweet spot between performance and token expenditure.
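The ordering matters when choosing a default. In the sketch below, only high, xhigh, and max are named in the coverage above; the low and medium levels, the fallback behavior, and the helper itself are illustrative assumptions.

```python
# Illustrative ordering of effort levels with xhigh slotted between high
# and max. Levels other than high/xhigh/max are assumed for illustration.
EFFORT_LEVELS = ["low", "medium", "high", "xhigh", "max"]

def pick_effort(requested: str) -> str:
    """Validate an effort level, falling back to 'high' for unknown input."""
    return requested if requested in EFFORT_LEVELS else "high"

# xhigh sits one notch below max:
print(EFFORT_LEVELS.index("xhigh") == EFFORT_LEVELS.index("max") - 1)  # True
print(pick_effort("xhigh"))  # xhigh
```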
The model also stops guessing at intent, following instructions more literally, particularly at lower effort levels. It will not silently generalize an instruction from one item to another, and will not infer requests you didn’t make. Response length calibrates to perceived task complexity rather than defaulting to a fixed verbosity.
Benchmarks, Coding Wins And A Regression
“Opus 4.7 is a notable improvement on Opus 4.6 in advanced software engineering, with particular gains on the most difficult tasks,” Anthropic said in a blog post.
The numbers back that framing. On Anthropic’s 93-task coding benchmark, Claude Opus 4.7 lifted resolution by 13% over Opus 4.6, including four tasks that neither Opus 4.6 nor Sonnet 4.6 could solve. Opus 4.7 also tops the autonomous-coding SWE-bench Pro leaderboard with a 64.3% success rate, a large leap from the 53.4% achieved by version 4.6.
But the release is not a clean sweep. On the MRCR long-context benchmark, Opus 4.7 scored 32.2% compared to Opus 4.6’s 78.3%. That is not a minor regression. That is a collapse.
How Developers Are Reacting
Early reactions split sharply. GitHub’s Copilot team rolled the model out the same day it launched, reporting that early testing showed stronger multi-step task performance and more reliable agentic execution, building on its predecessor’s coding strengths, along with meaningful improvement in long-horizon reasoning and complex, tool-dependent workflows.
Enterprise customers echoed that view. “Claude Opus 4.7 demonstrates significant efficiency gains while preserving the performance of Claude Opus 4.6,” said Yashodha Bhavnani, Head of AI at Box, citing the company’s evaluations.
Critics argue the release hides a stealth price hike behind a flat sticker price. One widely shared complaint on developer forums called the update “the pre-nerf build of 4.6 dressed in a higher model number,” citing the new tokenizer as the real cost driver.
The companion complaint is usage limits. Max subscribers ($200/month) are reporting they hit weekly caps after a handful of prompts, which didn’t happen on 4.6.
What Comes Next For Anthropic And Its Users
Opus 4.7 is also a safety test bed. Anthropic says it is releasing the model with safeguards that automatically detect and block requests indicating prohibited or high-risk cybersecurity uses, and that what it learns from the real-world deployment of those safeguards will feed its eventual goal of a broad release of Mythos-class models.
Key Takeaway: Opus 4.7 is not a drop-in upgrade. Teams must audit code for removed parameters, retune prompts for literal instruction following, and budget for up to 35% higher token consumption.
Anthropic’s own best practices guide for Claude Code recommends specifying tasks up front and cutting user turns. Well-specified task descriptions that incorporate intent, constraints, acceptance criteria, and relevant file locations give Opus 4.7 the context it needs to deliver stronger outputs. Ambiguous prompts conveyed progressively across many turns tend to reduce both token efficiency and, sometimes, overall quality.
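That advice can be encoded as a reusable template so every task arrives fully specified in one turn. The four fields mirror the guide’s list above; the function and output format are our own sketch, not an Anthropic artifact.

```python
# Illustrative prompt template: state intent, constraints, acceptance
# criteria, and file locations up front in a single well-specified turn.
def build_task_prompt(intent, constraints, acceptance, files) -> str:
    lines = [f"Intent: {intent}", "Constraints:"]
    lines += [f"- {c}" for c in constraints]
    lines.append("Acceptance criteria:")
    lines += [f"- {a}" for a in acceptance]
    lines.append("Relevant files:")
    lines += [f"- {f}" for f in files]
    return "\n".join(lines)

prompt = build_task_prompt(
    intent="Fix the off-by-one error in pagination",
    constraints=["no new dependencies"],
    acceptance=["existing tests pass", "page 1 shows items 1-20"],
    files=["src/pagination.py"],
)
print(prompt.splitlines()[0])  # Intent: Fix the off-by-one error in pagination
```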
Cloud partners are already live. AWS announced availability on Amazon Bedrock the same day, with up to 10,000 requests per minute (RPM) per account per Region available immediately and more upon request.
The broader signal is clear. The Opus 4.7 release surfaces a problem that every serious AI team already knows: prompts are model-specific artifacts. A prompt tuned for 4.6 does not behave identically on 4.7, just as a GPT-5.4 prompt does not translate to Gemini 3.1 Pro without adjustment.
Frequently Asked Questions
When did Claude Opus 4.7 launch?
Anthropic released Claude Opus 4.7 on April 16, 2026, across its API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry.
How much does Claude Opus 4.7 cost?
Pricing holds steady at $5 per million input tokens and $25 per million output tokens, the same as Opus 4.6. Actual costs can rise because the new tokenizer may use up to 35% more tokens.
Why does temperature=0 break on Opus 4.7?
Anthropic removed sampling parameters. Non-default values for temperature, top_p and top_k now return a 400 error and must be stripped from API calls.
Is Opus 4.7 better than Opus 4.6 at coding?
Yes. Opus 4.7 scores 87.6% on SWE-bench Verified versus 80.8% for Opus 4.6, and 64.3% on SWE-bench Pro versus 53.4%.
What is adaptive thinking in Opus 4.7?
Adaptive thinking lets the model decide how long to reason based on task complexity, replacing the old budget_tokens cap. It is the only supported thinking mode on Opus 4.7.
Claude Opus 4.7 shows what a mature AI market looks like. Gains on coding come with sharper edges: breaking API changes, a tokenizer that quietly inflates bills up to 35%, and a model that no longer fills in blanks for lazy prompts. The 13% jump on Anthropic’s 93-task benchmark is real, but so is the migration work. Share your Opus 4.7 experience in the comments below.


