Google Gemini 3.5 Flash

According to Google, Gemini 3.5 Flash is the next iteration in the Gemini 3 series of highly-capable, natively multimodal, reasoning models. Gemini 3.5 Flash is based on the Gemini 3 Flash reasoning foundation with thinking levels to control the mix of quality, cost and latency.

Model details

Item	Value	Description
Model name	Google Gemini 3.5 Flash	The name of the model.
Model category	Premium	The category of the model: Standard or Premium.
API model name	`google__gemini_3_5_flash`	The name of the model that is used in the . The user must provide this exact name for the API to work.
Compliance	FedRAMP Moderate, FedRAMP High, DoD IL4, DoD IL5	Government compliance frameworks and authorizations applicable to this model.
Hosting layer	Google	The trusted organization that securely hosts the LLM.
Model provider	Google	The organization that provides this model.
Release date	May 19th, 2026	The release date for the model.
Knowledge cutoff date	January 2025	The date after which the model does not get any information updates.
Input context window	1m tokens	The number of tokens supported by the input context window.
Maximum output tokens	64k tokens	The number of tokens that can be generated by the model in a single request.
Empirical throughput	Not specified	The number of tokens the model can generate per second.
Open source	No	Specifies if the model’s code is available for public use.

Additional documentation

For additional information, see official Google Gemini 3.5 Flash documentation.

Last modified on June 2, 2026

Azure text-embedding-ada-002

Google Gemini 3.1 Pro

⌘I

​Model details

​Additional documentation

Model details

Additional documentation