Getting Started with Gemini Pro API on Google AI Studio
Introduction
Google has recently opened API access to their Gemini Pro models to the public, and the best part is it's free to test. In this article, we will explore how to use the Gemini Pro API through their Python SDK, discuss pricing details, and provide a step-by-step guide on setting up your development environment.
Overview of Gemini Pro API Pricing
Google's Gemini Pro API is available for free to users making fewer than 60 queries per minute. This free tier covers both input and output usage; the only catch is that Google may use the data you provide to improve its products. If you need more than 60 queries per minute, Google plans to offer a pay-as-you-go option, which is not yet available but is expected soon.
Compared to other models such as GPT-3.5 Turbo, the pricing for Gemini Pro is quite attractive. It offers a lower price for both input and output tokens, making it a cost-effective choice.
Using Gemini Pro Models in Google AI Studio
To make the most of Gemini Pro's capabilities, it is recommended to use it within Google AI Studio. Previously known as MakerSuite, Google AI Studio allows you to experiment with the models and test their functionality.
Currently, there are two models available in Google AI Studio: Gemini Pro (a text model) and Gemini Pro Vision (a multimodal model that accepts both text and images). You can explore and experiment with both models in the platform's interface.
However, if you plan to use these models in your own applications, you will need to create an API key. We will provide a step-by-step guide on how to do this later in the article.
Testing Gemini Pro in Google Colab
To test the Gemini Pro API, we can use Google Colab. In this section, we will walk through setting up your development environment, accessing your API key, and generating text responses from the Gemini Pro model.
First, you need to install the google-generativeai package:
!pip install -q google-generativeai
Next, retrieve your API key. If you are using Google Colab, you can use the userdata module from google.colab to read a key stored in the Secrets panel. Otherwise, you can set the API key as an environment variable.
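For example, assuming the key is saved in Colab's Secrets panel under the name GOOGLE_API_KEY (the name here is just an illustration), configuring the SDK might look like this:

from google.colab import userdata
import google.generativeai as genai

# Retrieve the API key from Colab's Secrets panel (the secret name is an example)
GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')
genai.configure(api_key=GOOGLE_API_KEY)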
To initialize the Gemini Pro model, you can use the following code:
model = genai.GenerativeModel('gemini-pro')
To generate a response from the model, use the generate_content method:
prompt = "What is the meaning of life?"
response = model.generate_content(prompt)
print(response.text)
Additionally, you can set various configurations such as temperature, safety settings, and more to control the behavior of the model.
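For example, a minimal sketch that lowers the temperature for more deterministic output might look like this:

# Lower temperature yields more deterministic output; max_output_tokens caps the length
config = genai.types.GenerationConfig(temperature=0.2, max_output_tokens=256)
response = model.generate_content(prompt, generation_config=config)
print(response.text)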
Chat Generation with Gemini Pro
Gemini Pro can also be used as a chat model. You can start a chat session with the start_chat method, optionally passing in the history of a previous conversation. The session object it returns then exposes a send_message method for continuing the conversation.
chat_model = genai.GenerativeModel('gemini-pro')
chat = chat_model.start_chat(history=[])  # Empty list, or a previous conversation history
response = chat.send_message("Hello, how can you help me today?")
print(response.text)
By using the chat mode, you can have interactive conversations with the model, providing prompts and receiving real-time responses.
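For instance, continuing the chat session from above, a follow-up message builds on the earlier turns, and the accumulated conversation can be inspected on the session object:

# Follow-up message in the same session; the model sees the prior turns
response = chat.send_message("Can you expand on that in one sentence?")
print(response.text)

# Inspect the accumulated conversation history
for message in chat.history:
    print(message.role, ':', message.parts[0].text)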
Utilizing the Gemini Vision Model
Gemini Pro Vision is the multimodal counterpart of Gemini Pro, designed to analyze and understand images. Using this model, you can generate responses based on text and image prompts.
To use the Gemini Pro Vision model, you can initialize it in the same way:
vision_model = genai.GenerativeModel('gemini-pro-vision')
You can pass an image as an input and the model will generate a response based on its understanding of the image. It can identify objects, describe scenes, and more.
from PIL import Image

# A PIL image can be passed directly to the model
image = Image.open('image.jpg')
response = vision_model.generate_content(image)
print(response.text)
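You can also combine a text instruction with the image by passing a list of parts, reusing the image object from the snippet above:

# Multimodal prompt: a text instruction plus the image
response = vision_model.generate_content(["What objects are in this picture?", image])
print(response.text)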
Embedding Model for Text Analysis
Google's generative AI package also provides an embedding model for text analysis tasks such as retrieval, similarity, classification, and clustering. The embedding model allows you to encode text into high-dimensional vectors for various applications.
To use the embedding model, you can call the embed_content function, specifying an embedding model and a task type:
text = "This is an example sentence."
embedding = genAI.embed_content(text, task='retrieval')
print(embedding)
The embedding has 768 dimensions and can be used for tasks like document retrieval, semantic similarity, or clustering.
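As a sketch of the similarity use case, two embeddings can be compared with cosine similarity; the cosine_similarity helper below is our own illustration, not part of the SDK:

import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors (helper for illustration)
    a, b = np.array(a), np.array(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

emb_a = genai.embed_content(model='models/embedding-001',
                            content='The cat sat on the mat.',
                            task_type='semantic_similarity')['embedding']
emb_b = genai.embed_content(model='models/embedding-001',
                            content='A cat is sitting on a rug.',
                            task_type='semantic_similarity')['embedding']
print(cosine_similarity(emb_a, emb_b))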
Keywords
Gemini Pro API, Google AI Studio, Gemini Pro models, pricing, Python SDK, development environment, text generation, chat model, embedding model, Gemini Vision model, image processing, text analysis
FAQ
How much does it cost to use the Gemini Pro API? Google offers a free tier for users making fewer than 60 queries per minute. Pay-as-you-go pricing for higher volumes will be available soon.
Can I use Gemini Pro for image processing? Yes, the Gemini Pro Vision model allows you to process images and generate responses based on image prompts.
What are the safety settings in Gemini Pro API? Gemini Pro API provides customizable safety settings for harmful content, including categories like harassment, hate speech, sexually explicit content, and dangerous content. Users can set thresholds to control the content based on their preferences.
Can I use Gemini Pro in my own applications? Yes, by creating an API key and using the provided Python SDK, you can integrate Gemini Pro API into your own projects or applications.
How can I control the behavior of the Gemini Pro models? You can control the model's behavior by adjusting configurations such as temperature, safety settings, and streaming options. These configurations allow you to fine-tune the output to meet your requirements.
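For example, streaming can be enabled directly on generate_content so that chunks are printed as they arrive:

# Stream the response in chunks instead of waiting for the full text
response = model.generate_content("Tell me a short story.", stream=True)
for chunk in response:
    print(chunk.text, end='')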