
gemini-py

Python SDK for the Gemini API, reverse-engineered from Gemini CLI v0.31.0.

Uses the cloudcode-pa.googleapis.com Code Assist endpoint with your existing Gemini CLI OAuth credentials — no API key required.

Prerequisites

  • Gemini CLI installed and authenticated (run npx @google/gemini-cli and complete the login flow)
  • Credentials at ~/.gemini/oauth_creds.json
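Before using the SDK, it can help to verify the prerequisites programmatically. A minimal stdlib check of the credentials file (the path comes from above; the keys inside the file are not inspected here):

```python
import json
from pathlib import Path

def gemini_creds_path() -> Path:
    """Default location where the Gemini CLI stores OAuth credentials."""
    return Path.home() / ".gemini" / "oauth_creds.json"

def creds_available() -> bool:
    """True if the credentials file exists and parses as JSON."""
    path = gemini_creds_path()
    if not path.is_file():
        return False
    try:
        json.loads(path.read_text())
    except (OSError, json.JSONDecodeError):
        return False
    return True

if __name__ == "__main__":
    print("credentials found" if creds_available()
          else "run the Gemini CLI login first")
```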

Install

pip install -e .

Usage

import asyncio
from gemini import GeminiClient, GeminiOptions

async def main():
    async with GeminiClient() as client:
        # Streaming
        async for chunk in client.send_message_stream("Hello!"):
            print(chunk.text_delta, end="", flush=True)
        print()

        # Non-streaming
        response = await client.send_message("What is 2+2?")
        print(response.text)
        print("Tokens:", response.usage_metadata.total_token_count)

asyncio.run(main())

Custom options

opts = GeminiOptions(
    model="gemini-2.5-flash-lite",   # or "gemini-2.5-pro", "gemini-2.5-flash"
    temperature=0.7,
    max_output_tokens=4096,
    system_prompt="You are a concise assistant.",
    thinking_budget=1024,            # enable thinking mode
)
client = GeminiClient(options=opts)

One-shot helper

from gemini import query

result = await query("Explain quantum computing in one sentence")
print(result.text)

Multi-turn conversations

async with GeminiClient() as client:
    await client.send_message("My name is Alice.")
    r = await client.send_message("What is my name?")
    print(r.text)  # "Your name is Alice."

    client.clear_history()  # start fresh

API

GeminiClient

Method                       Description
send_message(prompt)         Non-streaming; returns GenerateContentResponse
send_message_stream(prompt)  Async generator yielding StreamChunk
clear_history()              Reset the conversation
history                      List of raw message dicts

GenerateContentResponse

Property         Description
.text            Full text response (excludes thoughts)
.thinking        Thinking content (if thinking_budget is set)
.candidates      List of Candidate objects
.usage_metadata  Token counts

GeminiOptions

Field              Default                     Description
model              gemini-2.5-pro              Model name
temperature        1.0                         Sampling temperature
max_output_tokens  32768                       Maximum tokens to generate
thinking_budget    None                        Token budget for thinking (enables thought mode)
system_prompt      None                        System instruction
credentials_path   ~/.gemini/oauth_creds.json  OAuth credentials file
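For reference, the fields and defaults above could be modeled as a plain dataclass. This is a sketch only: the names mirror the table, but the library's actual GeminiOptions definition may differ in detail.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

@dataclass
class GeminiOptionsSketch:
    # Hypothetical stand-in mirroring the table above,
    # not the library's actual class definition.
    model: str = "gemini-2.5-pro"
    temperature: float = 1.0
    max_output_tokens: int = 32768
    thinking_budget: Optional[int] = None  # setting this enables thought mode
    system_prompt: Optional[str] = None
    credentials_path: str = str(Path.home() / ".gemini" / "oauth_creds.json")
```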

How it works

The Gemini CLI uses Google's internal Code Assist API (cloudcode-pa.googleapis.com/v1internal) with OAuth2 credentials. This library:

  1. Reads ~/.gemini/oauth_creds.json (written by Gemini CLI login)
  2. Refreshes the access token when expired
  3. Calls loadCodeAssist to get your project ID
  4. Sends requests to generateContent or streamGenerateContent

The OAuth client credentials are the same as those embedded in the open-source Gemini CLI.
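The first two steps can be sketched in pure Python. The expiry check assumes the credentials file stores an expiry_date Unix timestamp in milliseconds; that mirrors what the Gemini CLI appears to write, but treat it as an illustration, not a guarantee.

```python
import json
import time
from pathlib import Path
from typing import Optional

def load_creds(path: str = "~/.gemini/oauth_creds.json") -> dict:
    """Step 1: read the credentials the Gemini CLI login wrote to disk."""
    return json.loads(Path(path).expanduser().read_text())

def token_expired(creds: dict, now_ms: Optional[int] = None,
                  skew_ms: int = 60_000) -> bool:
    """Step 2 helper: decide whether the access token needs a refresh.

    Assumes an `expiry_date` field holding a Unix timestamp in
    milliseconds (an illustrative assumption, not a documented contract).
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    expiry = creds.get("expiry_date", 0)
    return now_ms >= expiry - skew_ms  # refresh a minute early to be safe

# Steps 3-4 would then POST to loadCodeAssist and to generateContent /
# streamGenerateContent with an `Authorization: Bearer <access_token>`
# header; they are omitted here because they require live credentials.
```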