Updated Dec 13
65,536 context
$0.25 / 1M input tokens
$0.50 / 1M output tokens
$2.50 / 1K input images
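As a rough illustration of how the listed rates combine, the sketch below estimates the cost of a single request. It assumes strictly linear billing with no minimum charge, rounding, or discounts, which the listing does not confirm; the function name `estimate_cost` and the constant names are hypothetical and not part of any provider API.

```python
from decimal import Decimal

# Rates copied from the listing above (USD).
PRICE_PER_1M_INPUT_TOKENS = Decimal("0.25")
PRICE_PER_1M_OUTPUT_TOKENS = Decimal("0.50")
PRICE_PER_1K_INPUT_IMAGES = Decimal("2.50")


def estimate_cost(input_tokens: int, output_tokens: int, input_images: int = 0) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption: billing is purely proportional to usage. Actual invoices
    may apply rounding or minimums that this sketch ignores.
    """
    token_cost_in = Decimal(input_tokens) / 1_000_000 * PRICE_PER_1M_INPUT_TOKENS
    token_cost_out = Decimal(output_tokens) / 1_000_000 * PRICE_PER_1M_OUTPUT_TOKENS
    image_cost = Decimal(input_images) / 1_000 * PRICE_PER_1K_INPUT_IMAGES
    return token_cost_in + token_cost_out + image_cost


if __name__ == "__main__":
    # 10,000 input tokens, 2,000 output tokens, 4 images.
    print(estimate_cost(10_000, 2_000, 4))
```

Under these assumptions, images dominate the bill for vision-heavy prompts: one image costs as much as 10,000 input tokens.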
Google’s flagship multimodal model. It accepts images and video alongside text or chat prompts and returns a text or code response.
See the benchmarks and prompting guidelines from DeepMind.
Note: Preview models are offered for testing purposes and should not be used in production apps.
#multimodal