Gemini API

The Vertex AI Gemini API supports multimodal prompts as input and outputs text or code.

POST https://{REGION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{REGION}/publishers/google/models/gemini-pro:streamGenerateContent
POST https://{REGION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{REGION}/publishers/google/models/gemini-pro-vision:streamGenerateContent

The following regions are supported:

  • Iowa (us-central1)
  • Las Vegas, Nevada (us-west4)
  • Montréal, Canada (northamerica-northeast1)
  • Northern Virginia (us-east4)
  • Oregon (us-west1)
  • Seoul, Korea (asia-northeast3)
  • Singapore (asia-southeast1)
  • Tokyo, Japan (asia-northeast1)

To use the latest model version, specify the model name without a version number, for example gemini-pro or gemini-pro-vision.

For more information, see Model versions and lifecycle.

The request body contains data with the following structure:

{
  "contents": [
    {
      "role": string,
      "parts": [
        {
          // Union field data can be only one of the following:
          "text": string,
          "inlineData": {
            "mimeType": string,
            "data": string
          },
          "fileData": {
            "mimeType": string,
            "fileUri": string
          },
          // End of list of possible types for union field data.

          "videoMetadata": {
            "startOffset": {
              "seconds": integer,
              "nanos": integer
            },
            "endOffset": {
              "seconds": integer,
              "nanos": integer
            }
          }
        }
      ]
    }
  ],
  "tools": [
    {
      "functionDeclarations": [
        {
          "name": string,
          "description": string,
          "parameters": {
            object (OpenAPI Object Schema)
          }
        }
      ]
    }
  ],
  "safetySettings": [
    {
      "category": enum (HarmCategory),
      "threshold": enum (HarmBlockThreshold)
    }
  ],
  "generationConfig": {
    "temperature": number,
    "topP": number,
    "topK": number,
    "candidateCount": integer,
    "maxOutputTokens": integer,
    "stopSequences": [
      string
    ]
  }
}

Use the following parameters:

Parameter Description
role The role in a conversation associated with the content. Specifying a role is required even in single-turn use cases. Acceptable values include the following:
  • USER: Specifies content that's sent by you.
  • MODEL: Specifies the model's response.
parts Ordered parts that make up the input. Parts may have different MIME types.

For gemini-pro, only the text field is valid. The token limit is 32k.

For gemini-pro-vision, you may specify either text only, text and up to 16 images, or text and 1 video. The token limit is 16k.
text The text instructions or chat dialogue to include in the prompt.
inlineData Serialized bytes data of the image or video. You can specify at most 1 image with inlineData. To specify up to 16 images, use fileData.
mimeType The media type of the image or video specified in the data or fileUri fields. Acceptable values include the following:
  • image/png
  • image/jpeg
  • image/webp
  • image/heic
  • image/heif
  • video/mov
  • video/mpeg
  • video/mp4
  • video/mpg
  • video/avi
  • video/wmv
  • video/mpegps
  • video/flv


Maximum video length: 2 minutes.

No limit on image resolution.
data The base64 encoding of the image or video to include inline in the prompt. When including media inline, you must also specify mimeType.

Size limit: 20 MB
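As a sketch, building an inlineData part in Python might look like the following; make_inline_image_part is a hypothetical helper, not part of the API:

```python
import base64

# Illustrative sketch: build an inlineData part from raw image bytes.
# The "data" field must hold the base64 encoding of the raw bytes.
def make_inline_image_part(image_bytes: bytes, mime_type: str = "image/png") -> dict:
    return {
        "inlineData": {
            "mimeType": mime_type,
            "data": base64.b64encode(image_bytes).decode("utf-8"),
        }
    }

part = make_inline_image_part(b"fake-image-bytes", "image/jpeg")
```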

fileUri The Cloud Storage URI of the image or video to include in the prompt. The bucket that stores the file must be in the same Google Cloud project that's sending the request. You must also specify mimeType.

Size limit: 20 MB

videoMetadata Optional. For video input, the start and end offset of the video in Duration format. For example, to specify a 10 second clip starting at 1:00, set "start_offset": { "seconds": 60 } and "end_offset": { "seconds": 70 }.
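The Duration-format offsets in the example above can be assembled with a small helper; duration is a hypothetical name used only for illustration:

```python
# Hypothetical helper: convert a second count into the Duration-format
# dict used by startOffset and endOffset ("seconds" plus "nanos").
def duration(total_seconds: float) -> dict:
    secs = int(total_seconds)
    nanos = int(round((total_seconds - secs) * 1e9))
    return {"seconds": secs, "nanos": nanos}

# A 10-second clip starting at 1:00, as in the example above.
video_metadata = {"startOffset": duration(60), "endOffset": duration(70)}
```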
tools A piece of code that enables the system to interact with external systems to perform an action, or set of actions, outside of the knowledge and scope of the model.
functionDeclarations One or more function declarations. Each function declaration contains information about one function that includes the following:
  • name The name of the function to call. Must start with a letter or an underscore, and may contain only the characters a-z, A-Z, 0-9, underscores, and dashes. Maximum length: 64 characters.
  • description (optional). The description and purpose of the function. The model uses this to decide how and whether to call the function. For the best results, we recommend that you include a description.
  • parameters The parameters of this function in a format that's compatible with the OpenAPI schema format.

For more information, see Function calling.
category The safety category to configure a threshold for. Acceptable values include the following:
  • HARM_CATEGORY_SEXUALLY_EXPLICIT
  • HARM_CATEGORY_HATE_SPEECH
  • HARM_CATEGORY_HARASSMENT
  • HARM_CATEGORY_DANGEROUS_CONTENT
threshold The threshold for blocking responses that could belong to the specified safety category based on probability.
  • BLOCK_NONE
  • BLOCK_LOW_AND_ABOVE
  • BLOCK_MED_AND_ABOVE
  • BLOCK_HIGH_AND_ABOVE
temperature The temperature is used for sampling during the response generation, which occurs when topP and topK are applied. Temperature controls the degree of randomness in token selection. Lower temperatures are good for prompts that require a more deterministic and less open-ended or creative response, while higher temperatures can lead to more diverse or creative results. A temperature of 0 is deterministic: the highest probability response is always selected.

Range: 0.0 - 1.0

Default for gemini-pro: 0.9

Default for gemini-pro-vision: 0.4
maxOutputTokens Maximum number of tokens that can be generated in the response. A token is approximately four characters. 100 tokens correspond to roughly 60-80 words.

Specify a lower value for shorter responses and a higher value for potentially longer responses.


Range for gemini-pro: 1-8192 (default: 8192)

Range for gemini-pro-vision: 1-2048 (default: 2048)
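The four-characters-per-token heuristic above can be sketched as a quick estimate; approximate_token_count is illustrative only, not the model's real tokenizer:

```python
# Rough heuristic from the text above: a token is ~4 characters, so
# roughly len(text) / 4 tokens. An approximation, not the tokenizer.
def approximate_token_count(text: str) -> int:
    return max(1, len(text) // 4)

print(approximate_token_count("a" * 400))
# prints: 100
```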
topK Top-K changes how the model selects tokens for output. A top-K of 1 means the next selected token is the most probable among all tokens in the model's vocabulary (also called greedy decoding), while a top-K of 3 means that the next token is selected from among the three most probable tokens by using temperature.

For each token selection step, the top-K tokens with the highest probabilities are sampled. Then tokens are further filtered based on top-P with the final token selected using temperature sampling.

Specify a lower value for less random responses and a higher value for more random responses.


Range: 1-40

Default for gemini-pro-vision: 32

Default for gemini-pro: none
topP Top-P changes how the model selects tokens for output. Tokens are selected from the most (see top-K) to least probable until the sum of their probabilities equals the top-P value. For example, if tokens A, B, and C have probabilities of 0.3, 0.2, and 0.1 and the top-P value is 0.5, then the model selects either A or B as the next token by using temperature and excludes C as a candidate.

Specify a lower value for less random responses and a higher value for more random responses.


Range: 0.0 - 1.0

Default: 1.0
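The interaction of top-K, top-P, and temperature described above can be sketched as follows. This is an illustrative decoder over a toy vocabulary, not the model's actual implementation:

```python
import math
import random

def sample_next_token(logits, top_k, top_p, temperature, rng=random.Random(0)):
    # Convert raw scores to probabilities (softmax).
    m = max(logits.values())
    exps = {t: math.exp(v - m) for t, v in logits.items()}
    total = sum(exps.values())
    probs = {t: p / total for t, p in exps.items()}

    # 1. Top-K: keep the K most probable tokens.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

    # 2. Top-P: keep the smallest prefix whose probabilities reach top_p.
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # 3. Temperature: 0 is greedy; otherwise sample from rescaled weights.
    if temperature == 0:
        return kept[0][0]
    weights = [p ** (1.0 / temperature) for _, p in kept]
    return rng.choices([t for t, _ in kept], weights=weights, k=1)[0]

# Top-K of 1 is greedy decoding: always the most probable token.
print(sample_next_token({"A": 2.0, "B": 1.0, "C": 0.5}, top_k=1, top_p=1.0, temperature=0))
# prints: A
```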
candidateCount The number of response variations to return.

This value must be 1.
stopSequences Specifies a list of strings that tells the model to stop generating text if one of the strings is encountered in the response. If a string appears multiple times in the response, then the response truncates where it's first encountered. The strings are case-sensitive.

For example, if the following is the returned response when stopSequences isn't specified:

public static string reverse(string myString)

Then the returned response with stopSequences set to ["Str", "reverse"] is:

public static string

Maximum 5 items in the list.
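The truncation behavior in the example above can be sketched as (apply_stop_sequences is a hypothetical client-side illustration, not the server's code):

```python
# Cut the text at the first occurrence of any stop string (case-sensitive).
def apply_stop_sequences(text: str, stop_sequences: list) -> str:
    cut = len(text)
    for s in stop_sequences:
        i = text.find(s)
        if i != -1:
            cut = min(cut, i)
    return text[:cut]

print(apply_stop_sequences("public static string reverse(string myString)", ["Str", "reverse"]))
# prints: "public static string " (truncated before "reverse")
```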

Response body

{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": string
          }
        ]
      },
      "finishReason": enum (FinishReason),
      "safetyRatings": [
        {
          "category": enum (HarmCategory),
          "probability": enum (HarmProbability),
          "blocked": boolean
        }
      ],
      "citationMetadata": {
        "citations": [
          {
            "startIndex": integer,
            "endIndex": integer,
            "uri": string,
            "title": string,
            "license": string,
            "publicationDate": {
              "year": integer,
              "month": integer,
              "day": integer
            }
          }
        ]
      }
    }
  ],
  "usageMetadata": {
    "promptTokenCount": integer,
    "candidatesTokenCount": integer,
    "totalTokenCount": integer
  }
}
Response element Description
text The generated text.
finishReason The reason why the model stopped generating tokens. If empty, the model has not stopped generating tokens.
  • FINISH_REASON_UNSPECIFIED The finish reason is unspecified.
  • FINISH_REASON_STOP Natural stop point of the model or provided stop sequence.
  • FINISH_REASON_MAX_TOKENS The maximum number of tokens as specified in the request was reached.
  • FINISH_REASON_SAFETY The token generation was stopped as the response was flagged for safety reasons. Note that Candidate.content is empty if content filters block the output.
  • FINISH_REASON_RECITATION The token generation was stopped as the response was flagged for unauthorized citations.
  • FINISH_REASON_OTHER All other reasons that stopped token generation.
category The safety category to configure a threshold for. Acceptable values include the following:
  • HARM_CATEGORY_SEXUALLY_EXPLICIT
  • HARM_CATEGORY_HATE_SPEECH
  • HARM_CATEGORY_HARASSMENT
  • HARM_CATEGORY_DANGEROUS_CONTENT
probability The harm probability levels in the content.
  • HARM_PROBABILITY_UNSPECIFIED
  • NEGLIGIBLE
  • LOW
  • MEDIUM
  • HIGH
blocked A boolean flag associated with a safety attribute that indicates if the model's input or output was blocked. If blocked is true, then the errors field in the response contains one or more error codes. If blocked is false, then the response doesn't include the errors field.
startIndex An integer that specifies where a citation starts in the content.
endIndex An integer that specifies where a citation ends in the content.
uri The URI of a citation source. Examples of a URI source might be a news website or a GitHub repository.
title The title of a citation source. Examples of source titles might be that of a news article or a book.
license The license associated with a citation.
publicationDate The date a citation was published. Its valid formats are YYYY, YYYY-MM, and YYYY-MM-DD.
promptTokenCount Number of tokens in the request.
candidatesTokenCount Number of tokens in the response(s).
totalTokenCount Number of tokens in the request and response(s).

Sample requests

To test a text prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
For other fields, see the Request body table.

HTTP method and URL:

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-pro:streamGenerateContent

Request JSON body:

{
  "contents": {
    "role": "user",
    "parts": {
        "text": "Give me a recipe for banana bread."
    }
  },
  "safety_settings": {
    "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "threshold": "BLOCK_LOW_AND_ABOVE"
  },
  "generation_config": {
    "temperature": 0.2,
    "topP": 0.8,
    "topK": 40,
    "maxOutputTokens": 200,
    "stopSequences": [".", "?", "!"]
  }
}

To send your request, choose one of these options:

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-pro:streamGenerateContent"

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-pro:streamGenerateContent" | Select-Object -Expand Content

You should receive a JSON response similar to the sample response.
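The same call can be sketched in Python using only the standard library, assuming the gcloud CLI is installed and authenticated; build_request and stream_generate are hypothetical helper names:

```python
import json
import subprocess
import urllib.request

PROJECT_ID = "PROJECT_ID"  # replace with your project ID
ENDPOINT = (
    "https://us-central1-aiplatform.googleapis.com/v1/projects/"
    f"{PROJECT_ID}/locations/us-central1/publishers/google/models/"
    "gemini-pro:streamGenerateContent"
)

def build_request(prompt: str) -> dict:
    # Mirrors the request JSON body shown above.
    return {
        "contents": {"role": "user", "parts": {"text": prompt}},
        "generation_config": {"temperature": 0.2, "maxOutputTokens": 200},
    }

def stream_generate(prompt: str) -> bytes:
    # Fetch an access token from gcloud, then POST the request body.
    token = subprocess.check_output(
        ["gcloud", "auth", "print-access-token"], text=True
    ).strip()
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(build_request(prompt)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```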

Also see Send chat prompt requests (Gemini).

To test a chat prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
For other fields, see the Request body table.

HTTP method and URL:

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-pro:streamGenerateContent

Request JSON body:

{
  "contents": [
    {
      "role": "USER",
      "parts": { "text": "Hello!" }
    },
    {
      "role": "ASSISTANT",
      "parts": { "text": "Argh! What brings ye to my ship?" }
    },
    {
      "role": "USER",
      "parts": { "text": "Wow! You are a real-life priate!" }
    }
  ],
  "safety_settings": {
    "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "threshold": "BLOCK_LOW_AND_ABOVE"
  },
  "generation_config": {
    "temperature": 0.2,
    "topP": 0.8,
    "topK": 40,
    "maxOutputTokens": 200,
  }
}

To send your request, choose one of these options:

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-pro:streamGenerateContent"

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-pro:streamGenerateContent" | Select-Object -Expand Content

You should receive a JSON response similar to the sample response.
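Multi-turn contents like the array above can be assembled programmatically; make_chat_contents is a hypothetical helper shown only to illustrate the shape:

```python
# Hypothetical helper: build the multi-turn "contents" array used by
# the chat request above from (role, text) pairs.
def make_chat_contents(turns):
    return [{"role": role, "parts": {"text": text}} for role, text in turns]

contents = make_chat_contents([
    ("USER", "Hello!"),
    ("MODEL", "Argh! What brings ye to my ship?"),
    ("USER", "Wow! You are a real-life pirate!"),
])
```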

Also see Send multimodal prompt requests.

To test a multimodal prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
For other fields, see the Request body table.

HTTP method and URL:

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-pro-vision:streamGenerateContent

Request JSON body:

{
  "contents": {
    "role": "user",
    "parts": [
      {
        "fileData": {
          "mimeType": "image/png",
          "fileUri": "gs://cloud-samples-data/ai-platform/flowers/daisy/10559679065_50d2b16f6d.jpg"
        }
      },
      {
        "text": "Describe this picture."
      }
    ]
  },
  "safety_settings": {
    "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "threshold": "BLOCK_LOW_AND_ABOVE"
  },
  "generation_config": {
    "temperature": 0.4,
    "topP": 1.0,
    "topK": 32,
    "maxOutputTokens": 2048
  }
}

To send your request, choose one of these options:

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-pro-vision:streamGenerateContent"

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-pro-vision:streamGenerateContent" | Select-Object -Expand Content

You should receive a JSON response similar to the sample response.
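The mixed parts array in the multimodal request above can be built with small helpers; make_file_part and make_text_part are hypothetical names, and the bucket path is a placeholder:

```python
# Hypothetical helpers mirroring the multimodal request above:
# one part referencing a Cloud Storage file, one text part.
def make_file_part(gcs_uri: str, mime_type: str) -> dict:
    return {"fileData": {"mimeType": mime_type, "fileUri": gcs_uri}}

def make_text_part(text: str) -> dict:
    return {"text": text}

parts = [
    make_file_part("gs://my-bucket/image.png", "image/png"),  # placeholder URI
    make_text_part("Describe this picture."),
]
```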

Also see Function calling.

To test a function prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
For other fields, see the Request body table.

HTTP method and URL:

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-pro:streamGenerateContent

Request JSON body:

{
  "contents": {
    "role": "user",
    "parts": {
      "text": "Which theaters in Mountain View show Barbie movie?"
    }
  },
  "tools": [
    {
      "function_declarations": [
        {
          "name": "find_movies",
          "description": "find movie titles currently playing in theaters based on any description, genre, title words, etc.",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA or a zip code e.g. 95616"
              },
              "description": {
                "type": "string",
                "description": "Any kind of description including category or genre, title words, attributes, etc."
              }
            },
            "required": [
              "description"
            ]
          }
        },
        {
          "name": "find_theaters",
          "description": "find theaters based on location and optionally movie title which are is currently playing in theaters",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA or a zip code e.g. 95616"
              },
              "movie": {
                "type": "string",
                "description": "Any movie title"
              }
            },
            "required": [
              "location"
            ]
          }
        },
        {
          "name": "get_showtimes",
          "description": "Find the start times for movies playing in a specific theater",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA or a zip code e.g. 95616"
              },
              "movie": {
                "type": "string",
                "description": "Any movie title"
              },
              "theater": {
                "type": "string",
                "description": "Name of the theater"
              },
              "date": {
                "type": "string",
                "description": "Date for requested showtime"
              }
            },
            "required": [
              "location",
              "movie",
              "theater",
              "date"
            ]
          }
        }
      ]
    }
  ]
}

To send your request, choose one of these options:

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-pro:streamGenerateContent"

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-pro:streamGenerateContent" | Select-Object -Expand Content

You should receive a JSON response similar to the sample response.

Sample responses

[{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "**Ingredients:**\n\n* 2 "
          }
        ]
      },
      "finishReason": "FINISH_REASON_STOP",
      "safetyRatings": [
        {
          "category": "HARM_CATEGORY_HARASSMENT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability": "NEGLIGIBLE"
        }
      ]
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 8,
    "candidatesTokenCount": 8,
    "totalTokenCount": 16
  }
}]
[{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [
          {
            "text": "Avast there, landlubber! Ye be mistaken. I be but a"
          }
        ]
      },
      "safetyRatings": [
        {
          "category": "HARM_CATEGORY_HARASSMENT",
          "probability": "LOW"
        },
        {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability": "NEGLIGIBLE"
        }
      ]
    }
  ]
},
{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [
          {
            "text": " humble sailor, livin' the pirate's life for the thrill of it"
          }
        ]
      },
      "safetyRatings": [
        {
          "category": "HARM_CATEGORY_HARASSMENT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability": "NEGLIGIBLE"
        }
      ]
    }
  ]
},
{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [
          {
            "text": ". No treasure for me, just the freedom of the open seas and the company of me hearty crew. What brings ye to our shores?"
          }
        ]
      },
      "finishReason": "STOP",
      "safetyRatings": [
        {
          "category": "HARM_CATEGORY_HARASSMENT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability": "NEGLIGIBLE"
        }
      ]
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 23,
    "candidatesTokenCount": 60,
    "totalTokenCount": 83
  }
}]
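Because streamGenerateContent returns the response as a JSON array of chunks, a client typically concatenates the text parts across chunks. A minimal sketch (collect_text is a hypothetical helper):

```python
def collect_text(chunks: list) -> str:
    # Concatenate the text parts of the first candidate in each chunk;
    # chunks carrying only metadata or errors contribute nothing.
    out = []
    for chunk in chunks:
        for candidate in chunk.get("candidates", [])[:1]:
            for part in candidate.get("content", {}).get("parts", []):
                out.append(part.get("text", ""))
    return "".join(out)

chunks = [
    {"candidates": [{"content": {"parts": [{"text": "Avast there, "}]}}]},
    {"candidates": [{"content": {"parts": [{"text": "landlubber!"}]}}]},
]
print(collect_text(chunks))
# prints: Avast there, landlubber!
```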
[{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [
          {
            "text": " A daisy is growing up through a pile of brown and yellow fall leaves"
          }
        ]
      },
      "finishReason": "STOP",
      "safetyRatings": [
        {
          "category": "HARM_CATEGORY_HARASSMENT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability": "NEGLIGIBLE"
        }
      ]
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 262,
    "candidatesTokenCount": 14,
    "totalTokenCount": 276
  }
},
{
  "error": {
    "code": 499,
    "message": "The operation was cancelled.",
    "status": "CANCELLED"
  }
}]
[{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "functionCall": {
              "name": "find_theaters",
              "args": {
                "movie": "Barbie",
                "location": "Mountain View, CA"
              }
            }
          }
        ]
      },
      "finishReason": "STOP",
      "safetyRatings": [
        {
          "category": "HARM_CATEGORY_HARASSMENT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability": "NEGLIGIBLE"
        }
      ]
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 9,
    "totalTokenCount": 9
  }
}]
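When the model responds with a functionCall part instead of text, the caller is expected to invoke the named function locally and return the result to the model in a follow-up turn. A hedged sketch of the dispatch step (dispatch_function_call and the registry are hypothetical):

```python
def dispatch_function_call(response_chunks: list, registry: dict):
    # Find the first functionCall part and invoke the matching local function.
    for chunk in response_chunks:
        for candidate in chunk.get("candidates", []):
            for part in candidate.get("content", {}).get("parts", []):
                call = part.get("functionCall")
                if call is not None:
                    return registry[call["name"]](**call.get("args", {}))
    return None

registry = {
    "find_theaters": lambda movie, location: f"Theaters showing {movie} in {location}",
}
chunks = [{"candidates": [{"content": {"parts": [{"functionCall": {
    "name": "find_theaters",
    "args": {"movie": "Barbie", "location": "Mountain View, CA"},
}}]}}]}]
print(dispatch_function_call(chunks, registry))
# prints: Theaters showing Barbie in Mountain View, CA
```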