Crypto news

28.06.2026
21:42

Coinbase has halved its AI spending amid explosive growth in token consumption: a strategy worth noting

Coinbase CEO Brian Armstrong has shared the results of the company's effort to optimize its artificial intelligence costs. Although token consumption is growing exponentially, Coinbase has nearly halved its AI spending. The key was not strict limits and bans but careful tuning of the infrastructure.

Armstrong emphasizes that Coinbase engineers are free to choose any model, but the default settings play a decisive role. The company is experimenting with open-source models, such as GLM 5.2 and Kimi 2.7, as the defaults, with access provided through an internal gateway. Notably, 91% of employees never hit their usage limits, so rather than cutting quotas the company switched to cheaper configurations.

Routing, Caching, and Context Savings

The strategy is built on intelligent request routing. Coinbase's internal systems preprocess each request and direct it to the most suitable model, taking cache hits and cost into account. A cutting-edge model makes sense for strategic planning, for example, but is overkill for routine tasks. Armstrong insists that model selection should be handled automatically by AI rather than left to humans.
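To make the routing idea concrete, here is a minimal sketch of a cost-aware router. It is not Coinbase's implementation: the model names, tiers, prices, and the cached-token discount are all hypothetical values chosen for illustration. The router keeps only the models capable enough for the task, then picks the one with the lowest estimated cost after accounting for tokens already in that model's cache.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    tier: int                  # hypothetical scale: 1 = cheap open-source, 2 = frontier
    cost_per_mtok: float       # hypothetical USD per million input tokens
    cached_price_ratio: float  # hypothetical share of the price paid for cached tokens


# Hypothetical catalogue; names and prices are placeholders, not real quotes.
MODELS = [
    Model("open-small", tier=1, cost_per_mtok=0.5, cached_price_ratio=0.1),
    Model("frontier-large", tier=2, cost_per_mtok=15.0, cached_price_ratio=0.1),
]

# Minimum tier each kind of task is assumed to need.
REQUIRED_TIER = {"planning": 2, "routine": 1}


def estimated_cost(model: Model, tokens: int, cached_tokens: int) -> float:
    """Estimate input cost in USD, charging cached tokens at a reduced rate."""
    cached = min(cached_tokens, tokens)
    fresh = tokens - cached
    billable = fresh + cached * model.cached_price_ratio
    return billable * model.cost_per_mtok / 1_000_000


def route(task_kind: str, tokens: int, cached_tokens_by_model: dict) -> Model:
    """Pick the cheapest model whose tier is sufficient for the task."""
    need = REQUIRED_TIER.get(task_kind, 1)
    eligible = [m for m in MODELS if m.tier >= need]
    return min(
        eligible,
        key=lambda m: estimated_cost(
            m, tokens, cached_tokens_by_model.get(m.name, 0)
        ),
    )


routine_choice = route("routine", 10_000, {})
planning_choice = route("planning", 10_000, {})
```

With these placeholder values, a routine request goes to the cheap model and a planning request goes to the frontier model. In a real gateway the task classification itself would presumably come from a model, in line with Armstrong's point about automating the choice.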

Caching gets special attention, since every cache miss means paying again for work that has already been done. At Coinbase, all requests are configured to reuse information that has already been processed. In the LibreChat service, the cache hit rate rose from 5% to 60% after proper configuration. Trimming context also paid off: starting a new session when switching tasks, setting tight limits on file context, and disabling unused tools. As Armstrong sums it up, the goal is not to spend fewer tokens in principle, but to avoid wasting them.
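A rough calculation shows why the jump in cache hit rate matters. The sketch below assumes that cached input tokens are billed at 10% of the normal price; that ratio is an assumption for illustration, not a figure from Armstrong, and actual pricing varies by provider. Only the 5% and 60% hit rates come from the LibreChat example.

```python
def relative_input_cost(hit_rate: float, cached_price_ratio: float) -> float:
    """Input cost relative to a fully uncached workload (1.0 = no caching)."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Misses are billed in full; hits are billed at the discounted ratio.
    return (1.0 - hit_rate) + hit_rate * cached_price_ratio


CACHED_PRICE_RATIO = 0.1  # assumption: cached tokens cost 10% of the normal price

before = relative_input_cost(0.05, CACHED_PRICE_RATIO)  # 5% hit rate
after = relative_input_cost(0.60, CACHED_PRICE_RATIO)   # 60% hit rate

print(f"before: {before:.3f}, after: {after:.3f}")
```

Under that assumed discount, relative input cost drops from about 0.955 to 0.46 of the uncached price. This covers input tokens only and says nothing about how Coinbase's actual savings were split between caching, routing, and context limits.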

Deutscher's "Barbell" Strategy

Analyst Miles Deutscher describes a similar approach, which he calls "token engineering," and proposes a "barbell" strategy to cut AI costs by 50% or more. The idea is simple: the first 10% of the work, including project planning, goes to the most powerful models (Opus, GPT). The middle 80%, the bulk of routine tasks, is handled by cheaper open-source models. The final 10%, including verification of the results, again goes to top-tier models. Deutscher has been using this scheme for several months and considers it the best way to curb excessive AI spending.
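The arithmetic behind the barbell is easy to check. The sketch below assumes that shares of work translate directly into shares of token spend, which is a simplification, and it treats the price of the cheap model as a fraction of the top-tier price. The 10/80/10 split comes from Deutscher's description; the price ratio of 0.1 is a hypothetical input.

```python
def barbell_relative_cost(cheap_price_ratio: float, top_share: float = 0.2) -> float:
    """Cost relative to running everything on the top-tier model (1.0).

    top_share is the fraction of work kept on top-tier models
    (10% planning + 10% verification = 0.2 in the barbell split).
    """
    return top_share + (1.0 - top_share) * cheap_price_ratio


def max_cheap_ratio_for_saving(target_saving: float, top_share: float = 0.2) -> float:
    """Highest cheap/top price ratio that still achieves the target saving."""
    return (1.0 - target_saving - top_share) / (1.0 - top_share)


cost_at_tenth = barbell_relative_cost(0.1)     # cheap model at 10% of top price
threshold = max_cheap_ratio_for_saving(0.5)    # ratio needed for a 50% saving

print(f"relative cost: {cost_at_tenth:.2f}, max ratio for 50% saving: {threshold:.3f}")
```

Under these assumptions, a cheap model priced at a tenth of the top tier brings total cost down to about 28% of the all-top-tier baseline, and the 50% target is reached whenever the cheap model costs no more than about 37.5% of the top-tier price.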

Expert opinion: Coinbase's experience is more than an optimization case study. It shows that effective AI adoption is not a race for the most expensive tool but a matter of sound architecture. For crypto companies, where every dollar counts, this approach is becoming less a luxury than a necessity for surviving in an increasingly competitive environment.