Skip to main content
According to Google, Gemini 3.5 Flash is the next iteration in the Gemini 3 series of highly-capable, natively multimodal, reasoning models. Gemini 3.5 Flash is based on the Gemini 3 Flash reasoning foundation with thinking levels to control the mix of quality, cost and latency.

Model details

ItemValueDescription
Model nameGoogle Gemini 3.5 FlashThe name of the model.
Model categoryPremiumThe category of the model: Standard or Premium.
API model namegoogle__gemini_3_5_flashThe name of the model that is used in the . The user must provide this exact name for the API to work.
ComplianceFedRAMP Moderate, FedRAMP High, DoD IL4, DoD IL5Government compliance frameworks and authorizations applicable to this model.
Hosting layerGoogleThe trusted organization that securely hosts the LLM.
Model providerGoogleThe organization that provides this model.
Release dateMay 19th, 2026The release date for the model.
Knowledge cutoff dateJanuary 2025The date after which the model does not get any information updates.
Input context window1m tokensThe number of tokens supported by the input context window.
Maximum output tokens64k tokensThe number of tokens that can be generated by the model in a single request.
Empirical throughputNot specifiedThe number of tokens the model can generate per second.
Open sourceNoSpecifies if the model’s code is available for public use.

Additional documentation

For additional information, see official Google Gemini 3.5 Flash documentation.