Implementing Google Gemini AI on ESP32: A Step-by-Step Guide

Introduction

Hello everyone, welcome back to the channel! In today's article, we are going to learn how to use Gemini AI from an ESP32 development board. Exciting, right? For those who might not know, Google Gemini is a family of multimodal large language models developed by Google DeepMind, comparable to the GPT models behind ChatGPT. You send it a prompt and it returns a generated answer. The model itself runs on Google's servers; the ESP32 only sends the prompt over Wi-Fi and prints the reply.

Now, you can access the Gemini API from two places: Google AI Studio and Vertex AI. Both require a Google account. Vertex AI additionally needs a Google Cloud account with billing enabled. But don't worry, we'll be using an API key from Google AI Studio, which has a free tier and is a fantastic starting point for development.

Getting Started with Gemini API

Here is what you need to do:

Go to Google and search for "Gemini API docs".
Click on the first link that appears.
Click on the "Try Gemini in AI Studio" button.
Now click on "Get API key". We will be creating a new API key for our project.
Read through the documentation, check the required checkboxes, and click on "Continue".
Click on "Create API key" and then on "Create API key in new project". Your new API key will be ready for you.

Copy the generated key; this is the API key we'll be using for our project. Treat it like a password and don't share it publicly.

Understanding the API and JSON Format

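This is the JSON format we'll be using in our POST request:

The body has a contents array; each entry in it carries a role and a parts list.
The role field is an optional string ("user" for the prompts we send).
The parts field is a list whose entries can hold text or inline_data. We only send text, so inline_data can be ignored.
The generationConfig field includes several parameters, most of which are optional, such as maxOutputTokens.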
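
To make the field layout concrete, here is a minimal sketch that assembles such a body by hand. It assumes the shape the generateContent REST endpoint accepts (a contents array wrapping role and parts, plus generationConfig with maxOutputTokens). The helper names escapeJson and buildRequestBody are my own, not part of any library, and the code uses plain std::string so you can try it on a desktop before moving it to the board.

```cpp
#include <cstdio>
#include <string>

// Escape the characters that would otherwise break a JSON string literal.
std::string escapeJson(const std::string& in) {
    std::string out;
    for (unsigned char c : in) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {  // remaining control characters become \u00XX
                    char buf[7];
                    std::snprintf(buf, sizeof(buf), "\\u%04x", static_cast<unsigned>(c));
                    out += buf;
                } else {
                    out += static_cast<char>(c);
                }
        }
    }
    return out;
}

// Hypothetical helper: build the request body for a single user prompt.
std::string buildRequestBody(const std::string& prompt, int maxTokens) {
    return "{\"contents\":[{\"role\":\"user\",\"parts\":[{\"text\":\""
           + escapeJson(prompt)
           + "\"}]}],\"generationConfig\":{\"maxOutputTokens\":"
           + std::to_string(maxTokens) + "}}";
}
```

Escaping matters because a prompt containing a double quote or a line break would otherwise produce invalid JSON and the request would be rejected. On the ESP32 itself, letting ArduinoJson serialize the document handles this for you.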

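This is the final request we'll be sending from the ESP32. For testing, you can send the same request from a terminal with the curl command-line tool:

curl -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "role": "user",
      "parts": [{ "text": "Your question here" }]
    }],
    "generationConfig": { "maxOutputTokens": 120 }
  }' \
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?key=YOUR_API_KEY"

Replace YOUR_API_KEY and Your question here accordingly. Note that an AI Studio API key goes in the key query parameter (or an x-goog-api-key header), not in an Authorization: Bearer header. The model name in the URL (gemini-1.5-flash here) is only an example; use whichever model the current Gemini docs list. The answer comes back under candidates[0].content.parts[0].text in the response JSON. You can also use Postman for testing and understanding the behavior.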

Coding Part: Implementing on ESP32

I'll be using the ESP32 development board in this project. Here’s the step-by-step coding process:

Include the necessary header files for the libraries:

#include <WiFi.h>
#include <HTTPClient.h>
#include <ArduinoJson.h>

Define your Wi-Fi SSID, password, and the Gemini API key:

const char* ssid = "Your_SSID";
const char* password = "Your_PASSWORD";
const char* apiKey = "YOUR_API_KEY";
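
Since the key ends up inside the request URL, it helps to see how that URL is put together and how to avoid printing the full key to the serial log. The sketch below is only an illustration in plain C++. buildEndpointUrl and maskKey are hypothetical helper names, and the model name you pass in is an assumption you should check against the current Gemini docs.

```cpp
#include <string>

// Hypothetical helper: build the generateContent URL for a model and API key.
std::string buildEndpointUrl(const std::string& model, const std::string& apiKey) {
    return std::string("https://generativelanguage.googleapis.com/v1beta/models/")
           + model + ":generateContent?key=" + apiKey;
}

// Hypothetical helper: keep only the first four characters of the key for logging.
std::string maskKey(const std::string& apiKey) {
    if (apiKey.size() <= 4) {
        return "****";  // too short to reveal anything safely
    }
    return apiKey.substr(0, 4) + "****";
}
```

For example, buildEndpointUrl("gemini-1.5-flash", apiKey) gives the address to POST to, while maskKey(apiKey) is what you would print when debugging.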

Set the maximum token value:

const int maxTokens = 120; // Adjust according to your needs

Initialize the Wi-Fi in the setup() function:

void setup() {
  Serial.begin(115200);
  WiFi.begin(ssid, password);
  while (WiFi.status() != WL_CONNECTED) {
    delay(1000);
    Serial.println("Connecting to WiFi...");
  }
  Serial.println("Connected to WiFi");
}

Loop function to capture user input and send HTTP request:

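void loop() {
  if (Serial.available() > 0) {
    String userInput = Serial.readStringUntil('\n');
    userInput.trim();  // drop the trailing '\r' some serial monitors send
    if (userInput.length() == 0) return;

    // Build the body with ArduinoJson so quotes in the prompt are escaped.
    DynamicJsonDocument request(1024);
    JsonObject content = request.createNestedArray("contents").createNestedObject();
    content["role"] = "user";
    content.createNestedArray("parts").createNestedObject()["text"] = userInput;
    request["generationConfig"]["maxOutputTokens"] = maxTokens;
    String payload;
    serializeJson(request, payload);

    if (WiFi.status() == WL_CONNECTED) {
      HTTPClient http;
      // The API key goes in the "key" query parameter, not in a Bearer header.
      // The model name is an example; use one listed in the current Gemini docs.
      http.begin(String("https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?key=") + apiKey);
      http.addHeader("Content-Type", "application/json");

      int httpResponseCode = http.POST(payload);
      if (httpResponseCode == HTTP_CODE_OK) {
        String response = http.getString();
        DynamicJsonDocument doc(4096);  // enlarge if deserializeJson reports NoMemory
        DeserializationError err = deserializeJson(doc, response);
        const char* reply = doc["candidates"][0]["content"]["parts"][0]["text"];
        if (!err && reply) {
          Serial.println(reply);
        } else {
          Serial.println("Could not parse the response");
        }
      } else {
        Serial.printf("HTTP request failed, code: %d\n", httpResponseCode);
      }
      http.end();
    }
  }
}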
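
One detail worth handling before anything is sent: text typed into a serial monitor often arrives with a trailing carriage return or stray spaces, and an empty line would waste a request. Here is a small sketch of that clean-up step in plain C++. trimInput and isSendablePrompt are hypothetical names, and the 500-character cap is an arbitrary assumption you can change.

```cpp
#include <cstddef>
#include <string>

// Strip leading and trailing whitespace, including the '\r' some serial monitors send.
std::string trimInput(const std::string& line) {
    const char* whitespace = " \t\r\n";
    const std::size_t first = line.find_first_not_of(whitespace);
    if (first == std::string::npos) {
        return "";  // the line held nothing but whitespace
    }
    const std::size_t last = line.find_last_not_of(whitespace);
    return line.substr(first, last - first + 1);
}

// Hypothetical guard: only non-empty prompts within the length cap get sent.
bool isSendablePrompt(const std::string& prompt, std::size_t maxChars = 500) {
    return !prompt.empty() && prompt.size() <= maxChars;
}
```

In an Arduino sketch, String::trim() and String::length() cover the same ground; the point is simply to clean and check the input before building the request.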

Demo and Conclusion

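Let's compile the code, upload it to the board, and see the demo now.

Open the Serial Monitor at 115200 baud and ask the AI some real-life questions.
Notice there might be some delay before the answer appears, due to multiple factors, including ESP32 performance, Internet connectivity, and the API's own response time.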
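
If you want to see how long a request really takes, you can time it. The sketch below is a generic stopwatch in standard C++. elapsedMs is a hypothetical helper name; on the ESP32 you would more commonly subtract two millis() readings taken before and after the POST.

```cpp
#include <chrono>
#include <utility>

// Run a callable and return how long it took, in milliseconds.
template <typename F>
long long elapsedMs(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}
```

Wrapping only the HTTP call in the timed section separates network and API time from the time spent parsing and printing the reply.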

That's all for today! If you found this article helpful, please consider subscribing to the channel. Thanks for reading!

Keywords

Google Gemini
ESP32
API Key
JSON Format
AI Studio
Wi-Fi SSID
HTTP Request
cURL
Postman
ArduinoJson

FAQ

Q: What is Google Gemini? A: Google Gemini is a family of multimodal large language models developed by Google DeepMind, comparable to the GPT models behind ChatGPT.

Q: Which platforms can you use to access Gemini API? A: You can access the Gemini API from Google AI Studio, which has a free tier, or Vertex AI, which requires a Google Cloud account with billing enabled.

Q: What tools can help test the API requests? A: You can use the curl command-line tool or Postman to test and understand the API behavior.

Q: Why might there be a delay in getting a response? A: Delays can be due to multiple factors, including ESP32 performance, Internet connectivity, and API response time.

Q: How can you update the maximum token value for longer answers in the response? A: Increase the maxTokens constant in the code. It caps how many tokens the model may generate, so a larger value allows longer answers at the cost of a slower, larger response.