Companies

Google Introduces Flexible Pricing Tiers for Gemini API

By swgoettelman, April 24, 2026

Google on Thursday unveiled new pricing and reliability options for its Gemini API, introducing tiered inference modes designed to give developers more flexibility in managing costs and performance guarantees.

The company announced two new inference tiers — “Flex” and “Priority” — that allow developers to choose between lower-cost access with variable throughput and premium access with guaranteed capacity and faster response times.

The Flex tier offers reduced pricing in exchange for best-effort serving, meaning requests may experience higher latency or be queued during periods of peak demand. The Priority tier provides dedicated throughput and consistent performance, aimed at production workloads where reliability is critical.

“Developers building with the Gemini API have told us they want more control over how they spend on inference,” Google said in a blog post announcing the changes. The tiered approach mirrors pricing models already established by competitors in the large language model API market.

The move reflects a broader industry trend toward more granular API pricing as AI providers seek to attract a wider range of customers, from individual developers running experiments to enterprises deploying models at scale. OpenAI, Anthropic and other major providers have similarly introduced variable pricing structures in recent months.

For developers running batch processing jobs, prototyping applications or handling non-time-sensitive workloads, the Flex tier offers a lower-cost option in exchange for variable latency. Production applications serving end users, by contrast, can opt for the Priority tier to ensure consistent response times.
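As a rough illustration of that trade-off, the sketch below routes workloads to a tier based on whether an end user is waiting on the response. It is not taken from Google's documentation: the `Workload` fields, the `choose_tier` helper and the tier labels `"flex"` and `"priority"` are hypothetical names used only for this example, and the actual request parameter for selecting a tier should be taken from the Gemini API reference.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    # Hypothetical labels for illustration; not confirmed API values.
    FLEX = "flex"
    PRIORITY = "priority"


@dataclass(frozen=True)
class Workload:
    name: str
    user_facing: bool        # is an end user waiting on the response?
    latency_sensitive: bool  # would queuing or slow responses cause harm?


def choose_tier(workload: Workload) -> Tier:
    """Pick Priority for interactive or latency-sensitive work, Flex otherwise."""
    if workload.user_facing or workload.latency_sensitive:
        return Tier.PRIORITY
    return Tier.FLEX


if __name__ == "__main__":
    jobs = [
        Workload("nightly-batch-summaries", user_facing=False, latency_sensitive=False),
        Workload("prototype-eval-run", user_facing=False, latency_sensitive=False),
        Workload("support-chat", user_facing=True, latency_sensitive=True),
    ]
    for job in jobs:
        print(f"{job.name} -> {choose_tier(job).value}")
```

Under these assumptions, the batch and prototype jobs land on the lower-cost tier, while the chat workload is sent to the tier with consistent response times.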

The new tiers are available immediately across Gemini API endpoints. Google said existing API users will not experience any changes to their current service levels and can opt into the new pricing structure at their discretion.

Google has been aggressively expanding the Gemini API’s capabilities and developer tooling as competition intensifies among AI providers for developer adoption. The company’s cloud division reported strong growth in AI-related revenue in its most recent earnings, driven in part by increasing API usage.

Pricing details for both tiers are available in Google’s updated API documentation.

Source

Google AI Blog

