
Anthropic launches Claude Sonnet 5 model

Anthropic unveiled its Claude Sonnet 5 model on June 30, delivering near‑flagship performance at under two‑thirds the price of its Opus 4.8 model. The release briefly lifted the U.S. semiconductor index by nearly 4%, signaling strong market optimism for hardware despite rapidly declining AI model costs.

Sonnet 5 scored 63.2 on the SWE-bench Pro benchmark compared with Opus 4.8’s 69.2, while slightly outperforming it on the GPQA-AAA v2 reasoning test. Promotional pricing is set at $2 per million input tokens and $10 per million output tokens, versus $5 and $25 for Opus 4.8.
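To make the pricing gap concrete, the sketch below applies the promotional per-million-token prices quoted above to a hypothetical workload. The workload size and the dictionary keys are illustrative assumptions, not official model identifiers, and the figures are the ones reported in this article.

```python
# Promotional per-million-token prices quoted in the article (USD).
# The keys are plain labels for this sketch, not official API model names.
PRICES = {
    "sonnet-5": {"input": 2.00, "output": 10.00},
    "opus-4.8": {"input": 5.00, "output": 25.00},
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload at the quoted per-million prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 50 million input tokens and 10 million output tokens.
sonnet_cost = workload_cost("sonnet-5", 50_000_000, 10_000_000)
opus_cost = workload_cost("opus-4.8", 50_000_000, 10_000_000)
cost_ratio = sonnet_cost / opus_cost

print(f"Sonnet 5: ${sonnet_cost:,.2f}")
print(f"Opus 4.8: ${opus_cost:,.2f}")
print(f"Ratio:    {cost_ratio:.2f}")
```

Because the input and output prices are discounted by the same proportion, the ratio does not depend on the mix of input and output tokens in the workload.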

AI inference costs collapse while usage surges

AI inference costs have dropped dramatically since 2022. Stanford’s AI Index Report estimates that GPT-4-level inference is now around 1,000 times cheaper, with prices falling from $0.03 per 1,000 tokens in 2022 to a small fraction of a cent by 2025.
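As a rough check on what a 1,000-fold decline implies, the sketch below converts the 2022 price to a per-million-token figure and derives the average yearly price drop. It assumes a three-year window (2022 to 2025) and a smooth, constant rate of decline, neither of which the report figures above confirm.

```python
def implied_decline(start_per_1k: float, cheaper_factor: float, years: int):
    """Return (start price per million, implied end price per million,
    average yearly cheapening factor) under a constant-rate assumption."""
    start_per_million = start_per_1k * 1_000
    end_per_million = start_per_million / cheaper_factor
    yearly_factor = cheaper_factor ** (1 / years)
    return start_per_million, end_per_million, yearly_factor


# Figures from the article: $0.03 per 1,000 tokens, about 1,000x cheaper.
# The three-year span is an assumption for this sketch.
start_pm, end_pm, yearly = implied_decline(0.03, 1_000, 3)

print(f"2022 price per million tokens:    ${start_pm:.2f}")
print(f"Implied 2025 price per million:   ${end_pm:.4f}")
print(f"Average cheapening per year:      {yearly:.1f}x")
```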

Major developers are accelerating the trend. Google’s video model costs $0.10 per second, while its Nano Banana 2 Lite image model generates 1,000 images for $0.034. DeepSeek pushed input-token costs as low as $0.035 per million tokens.

At the same time, efficiency gains are reducing infrastructure needs. Software optimization has cut GPU demand by more than half in some cases, while hardware reuse and new decoding techniques have lowered costs further and increased inference speed by up to tenfold.

Despite this, total demand is rising sharply. Global enterprise spending on generative AI jumped from $11.5 billion in 2024 to $37 billion in 2025. In practice, companies are deploying more tools, running heavier workloads, and consuming far larger volumes of tokens.

For example, AT&T increased daily token usage from 800 million to 27 billion within 18 months, while a U.S. insurer scaled monthly usage from 3 million to 150 million tokens.
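The scale of these increases is easier to see as growth multiples. The sketch below computes them from the figures above and, for AT&T, the compound monthly growth rate implied by the 18-month window. It assumes smooth compounding between the two endpoints, which is a simplification since only the start and end values are reported. No time span is given for the insurer, so only its multiple is computed.

```python
def growth_multiple(start: float, end: float) -> float:
    """Return how many times larger the end value is than the start value."""
    return end / start


def compound_monthly_rate(start: float, end: float, months: int) -> float:
    """Return the constant monthly growth rate that links start to end."""
    return (end / start) ** (1 / months) - 1


# AT&T: daily token usage, 800 million to 27 billion within 18 months.
att_multiple = growth_multiple(800e6, 27e9)
att_monthly = compound_monthly_rate(800e6, 27e9, 18)

# U.S. insurer: monthly token usage, 3 million to 150 million.
insurer_multiple = growth_multiple(3e6, 150e6)

print(f"AT&T growth multiple:    {att_multiple:.2f}x")
print(f"AT&T implied monthly:    {att_monthly:.1%}")
print(f"Insurer growth multiple: {insurer_multiple:.0f}x")
```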

Chip prices climb as demand intensifies

Falling costs have not reduced hardware demand. Instead, they are fueling it. Memory prices surged alongside adoption, with DRAM and NAND Flash spot prices rising over 300% since the third quarter of 2025. DDR5 prices doubled between late 2025 and early 2026.

Samsung reported over ₩20 trillion in quarterly operating profit from its memory division in late 2025, driven by demand for high-end DRAM and HBM chips used in AI data centers.

A Goldman Sachs report projects $7.6 trillion in global AI infrastructure investment between 2026 and 2031, with annual spending expected to reach $1.6 trillion. Benchmark GPUs are priced around $80,500, with a single supplier accounting for roughly 75% of compute spending.

Even if alternative chips reduce unit costs, total spending is expected to remain elevated as lower prices drive higher consumption.

Jevons paradox returns in AI economy

The trend mirrors the Jevons paradox, in which efficiency gains increase overall resource consumption rather than reduce it. As AI becomes cheaper, new applications, from real-time customer service to personalized content, are becoming economically viable, expanding total usage.
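One simple way to see when cheaper compute raises total spending is a constant-elasticity demand curve. The sketch below is a textbook illustration, not a model taken from the article or fitted to any data: the elasticity values and the tenfold price cut are hypothetical. Total spending rises after a price cut only when demand is elastic (elasticity above 1), which is the condition the paradox relies on.

```python
def demand(price: float, elasticity: float, scale: float = 1.0) -> float:
    """Constant-elasticity demand: quantity = scale * price ** (-elasticity)."""
    return scale * price ** (-elasticity)


def total_spend(price: float, elasticity: float, scale: float = 1.0) -> float:
    """Total spending is price times the quantity demanded at that price."""
    return price * demand(price, elasticity, scale)


# Hypothetical tenfold price cut, from 1.0 to 0.1 (arbitrary units).
old_price, new_price = 1.0, 0.1

for elasticity in (0.5, 1.0, 1.5):
    before = total_spend(old_price, elasticity)
    after = total_spend(new_price, elasticity)
    if abs(after - before) < 1e-9:
        trend = "is unchanged"
    elif after > before:
        trend = "rises"
    else:
        trend = "falls"
    print(f"elasticity {elasticity}: spend {before:.2f} -> {after:.2f} ({trend})")
```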

This creates a feedback loop. Lower costs drive broader adoption, which increases total compute demand and reinforces the need for physical infrastructure such as chips, memory, and data centers.

While software improves rapidly, hardware remains constrained by long production cycles and limited fabrication capacity. As a result, more value is shifting toward semiconductor manufacturers and infrastructure providers.

Decentralized compute networks gain traction

This dynamic is extending into digital infrastructure markets, particularly decentralized physical infrastructure networks (DePIN). By early 2026, the sector’s total market capitalization reached between $9 billion and $10 billion, with leading projects generating tens of millions in monthly on-chain revenue.

Decentralized GPU marketplaces are emerging as a secondary supply layer for AI compute, offering lower-cost and more accessible processing power. Akash Network, for example, reported 428% annual growth in usage, with utilization exceeding 80% entering 2026.

Market data shows a growing linkage between semiconductor stocks and AI-related digital assets, with chip stock movements explaining up to 60% of short-term price action in AI-focused tokens.
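The article does not say which statistic sits behind the 60% figure. One common reading of "explaining" a share of price action is the R-squared of a simple linear regression of token returns on chip-stock returns. The sketch below computes that value under this assumption; the return series are made-up numbers for illustration only and do not come from any market data.

```python
def r_squared(x: list[float], y: list[float]) -> float:
    """Return the R-squared of a simple linear regression of y on x,
    which equals the squared Pearson correlation of the two series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov ** 2 / (var_x * var_y)


# Hypothetical daily returns, invented purely for illustration.
chip_returns = [0.010, -0.020, 0.015, 0.030, -0.010, 0.005]
token_returns = [0.030, -0.025, 0.010, 0.060, -0.040, 0.020]

share_explained = r_squared(chip_returns, token_returns)
print(f"Share of token return variance explained: {share_explained:.0%}")
```

An R-squared measured this way describes co-movement over the sample window; it does not by itself show that chip stocks drive token prices.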

Infrastructure becomes the dominant value layer

As AI adoption expands, value is increasingly shifting away from applications toward infrastructure. Physical and digital compute networks are becoming the foundation of the AI economy.

Analysts estimate decentralized compute networks could generate over $200 million in annualized protocol revenue by early 2026. The key question now is which platforms can sustain real demand for compute resources at scale.

As AI models grow cheaper and more widespread, control over compute capacity, both physical and decentralized, is emerging as the central economic driver of the next phase of industry growth.



