Google Cloud opens supervised fine-tuning for Gemini 3.5 Flash and Pro
Google Cloud has released documentation detailing its supervised fine-tuning process for Gemini models, including Gemini 3.5 Flash and Gemini 2.5 Pro. This guide helps developers customize AI models for various applications, such as text, image, document, audio, and video tuning. The fine-tuning process involves preparing datasets, configuring model details, and managing hyperparameters.
Key Takeaways
- Supported models for supervised fine-tuning include Gemini 3.5 Flash, Gemini 3.1 Flash-Lite, and the Gemini 2.5 Pro and Flash series.
- System supports multimodal tuning datasets for text, image, document, audio, and video applications.
- The us-central1 region offers a preview evaluation service to automatically verify model performance against ground truth datasets post-tuning.
- Training metrics for Gemini 2.0 Flash now include total loss, token accuracy, and prediction count visualizations in Agent Platform Studio.
Why It Matters
This documentation marks a critical step for streaming platforms transitioning from generic AI to domain-specific agents. By allowing supervised fine-tuning on multimodal data—including video—Google is providing the tools necessary for precise automated metadata tagging, content moderation, and scene detection that generic foundation models often fail to execute with high accuracy. This capability challenges Amazon Bedrock’s recent expansion of fine-tuning for Anthropic Claude, as Google emphasizes native video reasoning within the tuning loop. The ability to minimize 'thinking' budgets on tuned models also suggests a path toward reducing inference latency and costs for high-volume streaming workloads. Watch for the general availability of Gemini 3.5 Pro tuning in late June 2026 to see if it sustains reasoning leads over the Flash tier.
Additional Context
The rollout of supervised fine-tuning for the Gemini 3.5 family follows the May 2026 Google I/O announcement where Gemini 3.5 Flash was released to general availability. Per Wavespeed AI (May 2026), the Flash model was specifically optimized for speed and agentic tasks, though it reportedly showed minor regressions in complex reasoning compared to Gemini 3.1 Pro. The subsequent introduction of tuning for Gemini 3.5 Pro in June 2026 is positioned as the primary solution for enterprises requiring frontier-level performance in long-context retrieval and multi-step reasoning. Simultaneously, Google’s naming convention has shifted, with Vertex AI Agent Builder now consolidated under the 'Gemini Enterprise Agent Platform,' per Codersera (May 2026). This move intensifies competition with Amazon Web Services, which expanded its own fine-tuning suite on June 5, 2026. Per AWS reporting, a new Bedrock console experience was launched to streamline model experimentation and scaling across the Claude and GPT families. While AWS facilitates fine-tuning primarily for text-based models like Claude Haiku, Google’s inclusion of native video and audio tuning remains a distinct differentiator for media-heavy industries. Furthermore, Azure OpenAI Service has extended its fine-tuning support for GPT-4o through March 2026, targeting customers who require high-throughput inference for AI agents, as noted by Microsoft (February 2026). In the broader ecosystem, the pressure to offer localized and customizable AI has led to a surge in 'agentic' features. Google recently launched Managed Agents in the Gemini API (May 2026), allowing developers to run autonomous agents in secure, isolated Linux sandboxes. This suggests that fine-tuning is becoming just one component of a larger 'data flywheel' strategy, similar to NVIDIA’s NeMo Microservices, which targets a 1.8x faster post-training cycle on H100 hardware, per NVIDIA (May 2025).
Read full article at docs.cloud.google.com