Rodium AI
Google

Google DeepMind

Gemini family — multimodal models with massive context windows and hybrid reasoning.

Available models: 15
Starting rate: 59.7 RODI / 1M
Max context: 1.0M

Google
Lyria 3 Pro Preview (New)
google/lyria-3-pro-preview
Tags: vision, audio, chat, long-context, preview

Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality, 48 kHz stereo audio from text prompts or from images. These models deliver structural coherence, including vocals, timed lyrics, and full instrumental arrangements. Lyria 3 Pro can generate full-length songs with verses, choruses, and bridges.

Price

Pricing unavailable

Context: 1.0M | Speed: Deep | Input: text, image | Output: text, audio
Google
Lyria 3 Clip Preview (New)
google/lyria-3-clip-preview
Tags: vision, audio, chat, long-context, preview

30-second clips are priced at $0.04 per clip. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality, 48 kHz stereo audio from text prompts or from images. These models deliver structural coherence, including vocals, timed lyrics, and full instrumental arrangements. Lyria 3 Clip can generate short clips, loops, and previews.

Price

Pricing unavailable

Context: 1.0M | Speed: Balanced | Input: text, image | Output: text, audio
Google
Veo 3.1
google/veo-3-1
Tags: vision, video

Google's state-of-the-art video generation model, built for maximum visual fidelity in final production cuts. Veo 3.1 generates high-quality 1080p video from text or image prompts with native synchronized audio — including dialogue, ambient effects, and background sound. Supports scene extension (up to 20 chained clips for 140+ second narratives), frames-to-video transitions between two images, vertical video for Shorts, and 4K upscaling.

Price

  • Per second: 238.5 RODI/s (~0.400 USD/s)
  • With audio / s: 238.5 RODI/s (~0.400 USD/s)
  • No audio / s: 119.3 RODI/s (~0.200 USD/s)
Context: 128K | Speed: Deep | Input: text, image | Output: video
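As a rough illustration of what the per-second rates above mean for a long narrative, the sketch below estimates the RODI cost of a Veo 3.1 video. It assumes billing is strictly linear in seconds with no rounding or minimum charge, which this page does not confirm. The function name `video_cost_rodi` and the 7-second clip length are hypothetical, chosen only so that 20 chained clips reach the 140 seconds mentioned in the description.

```python
# Per-second rates for Veo 3.1 in RODI, copied from the price list above.
VEO_3_1_RATES_RODI_PER_S = {"with_audio": 238.5, "no_audio": 119.3}


def video_cost_rodi(seconds: float, with_audio: bool = True) -> float:
    """Estimate the cost in RODI, assuming purely linear per-second billing."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    key = "with_audio" if with_audio else "no_audio"
    return seconds * VEO_3_1_RATES_RODI_PER_S[key]


# Hypothetical narrative: 20 chained clips of 7 seconds each (140 seconds).
total_seconds = 20 * 7
print(f"with audio: {video_cost_rodi(total_seconds):.1f} RODI")
print(f"no audio:   {video_cost_rodi(total_seconds, with_audio=False):.1f} RODI")
```

Under these assumptions a 140-second narrative with audio comes to 33,390 RODI, and about half of that without audio.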
Google
Veo 3.1 Fast
google/veo-3-1-fast
Tags: vision, video

Google's mid-tier video generation model balancing speed and quality. Veo 3.1 Fast generates high-quality video from text or image prompts with native synchronized audio, offering faster turnaround than Veo 3.1 at lower cost. Supports first-frame and last-frame conditioning, multiple resolutions and aspect ratios, and SynthID watermarking.

Price

  • Per second: 59.7 RODI/s (~0.100 USD/s)
  • With audio / s: 59.7 RODI/s (~0.100 USD/s)
  • No audio / s: 47.7 RODI/s (~0.0800 USD/s)
Context: 128K | Speed: Deep | Input: text, image | Output: video
Google
Gemini 2.5 Flash
google/gemini-2-5-flash
Tags: vision, chat, long-context

Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks. It includes built-in "thinking" capabilities, enabling it to provide responses with greater accuracy and nuanced context handling. Additionally, Gemini 2.5 Flash is configurable through the "max tokens for reasoning" parameter, as described in the documentation (https://openrouter.ai/docs/use-cases/reasoning-tokens#max-tokens-for-reasoning).

Price

  • In: 178.9 RODI/M (~0.300 USD/M)
  • Out: 1490.2 RODI/M (~2.50 USD/M)
  • Cached: 17.9 RODI/M (~0.0300 USD/M)
Context: 1.0M | Speed: Fast | Input: document, image, text, audio, video | Output: text
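To make the per-million-token rates above concrete, here is a minimal sketch that estimates the cost of one request. It rests on two assumptions not stated on this page: that cached tokens are a subset of the input tokens and are billed at the cached rate instead of the input rate, and that billing is linear with no rounding. The names `TokenRates` and `request_cost_rodi` are hypothetical. The page also does not say how "thinking" tokens are billed, so the sketch leaves them to the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenRates:
    """Rates in RODI per one million tokens."""
    input: float
    output: float
    cached: float


# Rates copied from the Gemini 2.5 Flash price list above.
GEMINI_2_5_FLASH = TokenRates(input=178.9, output=1490.2, cached=17.9)


def request_cost_rodi(rates: TokenRates, input_tokens: int,
                      output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate one request's cost; cached_tokens is assumed to be part of input_tokens."""
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (fresh_tokens * rates.input
             + cached_tokens * rates.cached
             + output_tokens * rates.output)
    return total / 1_000_000


# Example: 100k input tokens (20k of them cached) and 2k output tokens.
cost = request_cost_rodi(GEMINI_2_5_FLASH, 100_000, 2_000, cached_tokens=20_000)
print(f"{cost:.4f} RODI")
```

The same function works for any other card on this page by swapping in that model's three rates.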
Google
Gemini 2.5 Flash Lite
google/gemini-2-5-flash-lite
Tags: vision, chat, long-context

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance across common benchmarks compared to earlier Flash models. By default, "thinking" (i.e. multi-pass reasoning) is disabled to prioritize speed, but developers can enable it via the [Reasoning API parameter](https://openrouter.ai/docs/use-cases/reasoning-tokens) to selectively trade off c

Price

  • In: 59.7 RODI/M (~0.100 USD/M)
  • Out: 238.5 RODI/M (~0.400 USD/M)
  • Cached: 6.0 RODI/M (~0.0100 USD/M)
Context: 1.0M | Speed: Fast | Input: text, image, document, audio, video | Output: text
Google
Gemini 2.5 Pro
google/gemini-2-5-pro
Tags: vision, chat, long-context

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy and nuanced context handling. Gemini 2.5 Pro achieves top-tier performance on multiple benchmarks, including first-place positioning on the LMArena leaderboard, reflecting superior human-preference alignment and complex problem-solving abilities.

Price

  • In: 745.1 RODI/M (~1.25 USD/M)
  • Out: 5960.8 RODI/M (~10.00 USD/M)
  • Cached: 74.6 RODI/M (~0.125 USD/M)
Context: 1.0M | Speed: Fast | Input: text, image, document, audio, video | Output: text
Google
Gemini 3.1 Flash Lite
google/gemini-3-1-flash-lite
Tags: vision, chat, long-context

Gemini 3.1 Flash Lite is Google’s GA high-efficiency multimodal model optimized for low-latency, high-volume workloads. It supports text, image, video, audio, and PDF inputs, and is designed for lightweight agentic workflows, simple data extraction, and applications where responsiveness and API cost are the primary constraints. Supports full thinking levels (minimal, low, medium, high) for fine-grained cost/performance trade-offs. Priced at half the cost of Gemini 3 Flash.

Price

  • In: 149.1 RODI/M (~0.250 USD/M)
  • Out: 894.2 RODI/M (~1.50 USD/M)
  • Cached: 15.0 RODI/M (~0.0250 USD/M)
Context: 1.0M | Speed: Fast | Input: text, image, video, document, audio | Output: text
Google
Gemini 3.1 Flash TTS Preview (New)
google/gemini-3-1-flash-tts-preview
Tags: audio, preview

Gemini 3.1 Flash TTS Preview is a text-to-speech model from Google, and a substantial generational step up from Gemini 2.5 Flash TTS. It takes text input and produces audio output across 70+ languages — nearly 3× the language coverage of its predecessor. The headline addition is a system of 200+ inline audio tags (e.g. `[whispers]`, `[laughs]`, `[excited]`) that let developers steer delivery, emotion, and pacing mid-sentence, alongside a "director's chair" workflow in Google AI Studio for defin

Price

  • In: 596.1 RODI/M (~1.00 USD/M)
  • Out: 11921.6 RODI/M (~20.00 USD/M)
Context: 8K | Speed: Fast | Input: text | Output: audio
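The inline audio tags mentioned in the description sit directly in the input text, so it can help to separate them from the spoken words, for example to review which tags a script uses or to measure the spoken text on its own. The sketch below is only an illustration: the page shows three example tags and not the full list or grammar, so the bracket pattern is a guess, and the helper names `find_audio_tags` and `strip_audio_tags` are hypothetical.

```python
import re

# Assumed tag shape: lowercase words inside square brackets, e.g. [whispers].
# Only three examples appear on this page, so this pattern is a guess.
TAG_PATTERN = re.compile(r"\[([a-z][a-z ]*)\]")


def find_audio_tags(script: str) -> list[str]:
    """Return the inline tag names in the order they appear."""
    return TAG_PATTERN.findall(script)


def strip_audio_tags(script: str) -> str:
    """Return the script with tags removed and whitespace collapsed."""
    return " ".join(TAG_PATTERN.sub(" ", script).split())


script = "[excited] We shipped it! [whispers] Do not tell anyone yet. [laughs]"
print(find_audio_tags(script))   # ['excited', 'whispers', 'laughs']
print(strip_audio_tags(script))  # We shipped it! Do not tell anyone yet.
```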
Google
Gemini 3.1 Pro Preview (New)
google/gemini-3-1-pro-preview
Tags: vision, chat, long-context, preview

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation of the Gemini 3 series, it combines high-precision reasoning across text, image, video, audio, and code with a 1M-token context window. Reasoning Details must be preserved when using multi-turn tool calling, see our docs here: https://openrouter.ai/docs/use-ca

Price

  • In: 1192.2 RODI/M (~2.00 USD/M)
  • Out: 7153.0 RODI/M (~12.00 USD/M)
  • Cached: 119.3 RODI/M (~0.200 USD/M)
Context: 1.0M | Speed: Fast | Input: audio, document, image, text, video | Output: text
Google
Gemini 3 Flash Preview (New)
google/gemini-3-flash-preview
Tags: vision, chat, long-context, preview

Gemini 3 Flash Preview is a high-speed, high-value thinking model designed for agentic workflows, multi-turn chat, and coding assistance. It delivers near-Pro-level reasoning and tool-use performance with substantially lower latency than larger Gemini variants, making it well suited for interactive development, long-running agent loops, and collaborative coding tasks. Compared to Gemini 2.5 Flash, it provides broad quality improvements across reasoning, multimodal understanding, and reliability.

Price

  • In: 298.1 RODI/M (~0.500 USD/M)
  • Out: 1788.3 RODI/M (~3.00 USD/M)
  • Cached: 29.9 RODI/M (~0.0500 USD/M)
Context: 1.0M | Speed: Fast | Input: text, image, document, audio, video | Output: text
Google
Nano Banana 2 (Gemini 3.1 Flash Image Preview) (New)
google/gemini-3-1-flash-image-preview
Tags: vision, image-gen, preview

Gemini 3.1 Flash Image Preview, a.k.a. "Nano Banana 2," is Google’s latest state-of-the-art image generation and editing model, delivering Pro-level visual quality at Flash speed. It combines advanced contextual understanding with fast, cost-efficient inference, making complex image generation and iterative edits significantly more accessible. Aspect ratios can be controlled with the [image_config API Parameter](https://openrouter.ai/docs/features/multimodal/image-generation#image-aspect-ratio-c

Price

  • In: 298.1 RODI/M (~0.500 USD/M)
  • Out: 1788.3 RODI/M (~3.00 USD/M)
  • Image out: 35761.5 RODI/M (~60.00 USD/M)
Context: 131K | Speed: Fast | Input: image, text | Output: image, text
Google
Nano Banana (Gemini 2.5 Flash Image)
google/gemini-2-5-flash-image
Tags: vision, image-gen

Gemini 2.5 Flash Image, a.k.a. "Nano Banana," is now generally available. It is a state-of-the-art image generation model with contextual understanding. It is capable of image generation, edits, and multi-turn conversations. Aspect ratios can be controlled with the [image_config API Parameter](https://openrouter.ai/docs/features/multimodal/image-generation#image-aspect-ratio-configuration).

Price

  • In: 178.9 RODI/M (~0.300 USD/M)
  • Out: 1490.2 RODI/M (~2.50 USD/M)
  • Cached: 17.9 RODI/M (~0.0300 USD/M)
Context: 33K | Speed: Fast | Input: image, text | Output: image, text
Google
Nano Banana Pro (Gemini 3 Pro Image Preview) (New)
google/gemini-3-pro-image-preview
Tags: vision, image-gen, preview

Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro. It extends the original Nano Banana with significantly improved multimodal reasoning, real-world grounding, and high-fidelity visual synthesis. The model generates context-rich graphics, from infographics and diagrams to cinematic composites, and can incorporate real-time information via Search grounding. It offers industry-leading text rendering in images (including long passages and multilingu

Price

  • In: 1192.2 RODI/M (~2.00 USD/M)
  • Out: 7153.0 RODI/M (~12.00 USD/M)
  • Cached: 119.3 RODI/M (~0.200 USD/M)
Context: 66K | Speed: Fast | Input: image, text | Output: image, text
Google
Veo 3.1 Lite
google/veo-3-1-lite
Tags: vision, video

Google's most cost-effective video generation model, designed for high-volume applications and rapid iteration. Veo 3.1 Lite generates 720p and 1080p video from text or image prompts with native synchronized audio at less than 50% of the cost of Veo 3.1 Fast. Supports 4–8 second clips in landscape (16:9) and portrait (9:16) formats, with SynthID watermarking. Ideal for content platforms, short-form video creation, and automated media generation.

Price

  • Per second: 29.9 RODI/s (~0.0500 USD/s)
  • With audio / s: 29.9 RODI/s (~0.0500 USD/s)
  • No audio / s: 17.9 RODI/s (~0.0300 USD/s)
Context: 128K | Speed: Deep | Input: text, image | Output: video
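The description compares Veo 3.1 Lite's cost with Veo 3.1 Fast, and the listed rates make that comparison easy to check. The sketch below divides the Lite rate by the Fast rate for each billing mode, using the approximate USD figures shown on this page. Those figures are rounded by the page, so the ratios are indicative only; the helper name `cost_ratio` is hypothetical.

```python
# Per-second USD rates as listed on this page (rounded by the page).
VEO_3_1_FAST_USD = {"with_audio": 0.100, "no_audio": 0.0800}
VEO_3_1_LITE_USD = {"with_audio": 0.0500, "no_audio": 0.0300}


def cost_ratio(cheaper: dict[str, float], reference: dict[str, float]) -> dict[str, float]:
    """Return cheaper/reference for every billing mode present in both tables."""
    return {mode: cheaper[mode] / reference[mode]
            for mode in cheaper if mode in reference}


ratios = cost_ratio(VEO_3_1_LITE_USD, VEO_3_1_FAST_USD)
for mode, ratio in ratios.items():
    print(f"{mode}: {ratio:.1%}")
```

On the rounded rates shown here, Lite comes out at about half the Fast price with audio and about 37.5% of it without audio.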