gemini-py

Python SDK for the Gemini API, reverse-engineered from Gemini CLI v0.31.0.

Uses the cloudcode-pa.googleapis.com Code Assist endpoint with your existing Gemini CLI OAuth credentials — no API key required.

Prerequisites

Gemini CLI installed and authenticated (npx @google/gemini-cli → login)
Credentials at ~/.gemini/oauth_creds.json
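
A quick way to confirm the second prerequisite before using the library is to check that the credentials file exists and parses as JSON. This is a minimal sketch: `credentials_ready` is a hypothetical helper, not part of gemini-py, and it only checks that the file holds a JSON object, not that the tokens inside are valid.

```python
import json
from pathlib import Path

# Default location written by Gemini CLI login (taken from this README).
DEFAULT_CREDS = Path.home() / ".gemini" / "oauth_creds.json"


def credentials_ready(path: Path = DEFAULT_CREDS) -> bool:
    """Return True if the file exists and contains a JSON object."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing or unreadable file, or content that is not valid JSON.
        return False
    return isinstance(data, dict)


if __name__ == "__main__":
    if credentials_ready():
        print("credentials found")
    else:
        print("no usable credentials; run Gemini CLI login first")
```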

Install

pip install -e .

Usage

import asyncio
from gemini import GeminiClient, GeminiOptions

async def main():
    async with GeminiClient() as client:
        # Streaming
        async for chunk in client.send_message_stream("Hello!"):
            print(chunk.text_delta, end="", flush=True)
        print()

        # Non-streaming
        response = await client.send_message("What is 2+2?")
        print(response.text)
        print("Tokens:", response.usage_metadata.total_token_count)

asyncio.run(main())
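
If you want the full reply as a single string while still streaming, you can accumulate the deltas yourself. The sketch below runs without credentials by using a stand-in stream: `FakeChunk` and `fake_stream` are hypothetical and only imitate the `text_delta` attribute shown above. Whether `text_delta` can be empty or `None` on some chunks is an assumption, so the helper guards against it.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Optional


@dataclass
class FakeChunk:
    """Stand-in for StreamChunk; only text_delta is modelled."""
    text_delta: Optional[str]


async def fake_stream() -> AsyncIterator[FakeChunk]:
    # Imitates a streamed reply arriving in pieces.
    for piece in ("Hel", None, "lo", "!"):
        yield FakeChunk(piece)


async def collect_text(stream) -> str:
    """Join every text_delta from an async stream into one string."""
    parts = []
    async for chunk in stream:
        parts.append(chunk.text_delta or "")
    return "".join(parts)


print(asyncio.run(collect_text(fake_stream())))
```

With a real client, the same helper would be called as `await collect_text(client.send_message_stream("Hello!"))` inside the `async with` block.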

Custom options

opts = GeminiOptions(
    model="gemini-2.5-flash-lite",   # or "gemini-2.5-pro", "gemini-2.5-flash"
    temperature=0.7,
    max_output_tokens=4096,
    system_prompt="You are a concise assistant.",
    thinking_budget=1024,            # enable thinking mode
)
client = GeminiClient(options=opts)

One-shot helper

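import asyncio
from gemini import query

async def main():
    # query() is a coroutine, so it must be awaited inside an async function
    result = await query("Explain quantum computing in one sentence")
    print(result.text)

asyncio.run(main())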

Multi-turn conversations

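import asyncio
from gemini import GeminiClient

async def main():
    # The client keeps earlier turns, so the second prompt can refer to the first
    async with GeminiClient() as client:
        await client.send_message("My name is Alice.")
        r = await client.send_message("What is my name?")
        print(r.text)  # e.g. "Your name is Alice."

        client.clear_history()  # start fresh

asyncio.run(main())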
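
Conceptually, a multi-turn exchange works because every request carries the turns that came before it. The sketch below illustrates that accumulation. It assumes the message dicts follow the public Gemini API `contents` shape (`role` plus a list of `parts`); this README does not confirm the exact layout of `client.history`, and `append_turn` is a hypothetical helper.

```python
def append_turn(history: list, role: str, text: str) -> list:
    """Append one turn in the assumed {"role", "parts"} shape."""
    history.append({"role": role, "parts": [{"text": text}]})
    return history


history = []
append_turn(history, "user", "My name is Alice.")
append_turn(history, "model", "Nice to meet you, Alice.")
append_turn(history, "user", "What is my name?")

# All three turns would be sent together, which is how the model
# can answer the last question from the first message.
print([turn["role"] for turn in history])

# Clearing the list corresponds to starting a fresh conversation.
history.clear()
```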

API

`GeminiClient`

Method	Description
`send_message(prompt)`	Non-streaming, returns `GenerateContentResponse`
`send_message_stream(prompt)`	Async generator yielding `StreamChunk`
`clear_history()`	Reset conversation
`history`	List of raw message dicts

`GenerateContentResponse`

Property	Description
`.text`	Full text response (excludes thoughts)
`.thinking`	Thinking content (if `thinking_budget` set)
`.candidates`	List of `Candidate` objects
`.usage_metadata`	Token counts

`GeminiOptions`

Field	Default	Description
`model`	`gemini-2.5-pro`	Model name
`temperature`	`1.0`	Sampling temperature
`max_output_tokens`	`32768`	Max tokens to generate
`thinking_budget`	`None`	Token budget for thinking (enables thinking mode)
`system_prompt`	`None`	System instruction
`credentials_path`	`~/.gemini/oauth_creds.json`	OAuth credentials file

How it works

The Gemini CLI uses Google's internal Code Assist API (cloudcode-pa.googleapis.com/v1internal) with OAuth2 credentials. This library:

Reads ~/.gemini/oauth_creds.json (written by Gemini CLI login)
Refreshes the access token when expired
Calls loadCodeAssist to get your project ID
Sends requests to generateContent or streamGenerateContent

The OAuth client credentials are the same as those embedded in the open-source Gemini CLI.
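
The first two steps and the request addressing can be sketched as follows. This illustrates the flow and is not the library's actual code. It makes two assumptions: that the credentials file stores `expiry_date` as a Unix timestamp in milliseconds, and that methods are addressed as `<base>:<method>`. The helper names are hypothetical, and no network call is made.

```python
import json
import time
from pathlib import Path
from typing import Optional

# Endpoint named in this README.
BASE_URL = "https://cloudcode-pa.googleapis.com/v1internal"


def load_credentials(path: Path) -> dict:
    """Step 1: read the OAuth credentials written by Gemini CLI login."""
    return json.loads(path.read_text(encoding="utf-8"))


def is_expired(creds: dict, now_ms: Optional[float] = None,
               skew_ms: int = 60_000) -> bool:
    """Step 2: decide whether the access token needs a refresh.

    Assumption: "expiry_date" is a Unix timestamp in milliseconds.
    A missing field is treated as expired. skew_ms refreshes a little
    early so a token does not lapse mid-request.
    """
    if now_ms is None:
        now_ms = time.time() * 1000
    return creds.get("expiry_date", 0) - skew_ms <= now_ms


def method_url(method: str) -> str:
    """Steps 3 and 4: build the URL for a method such as loadCodeAssist.

    Assumption: methods are addressed as "<base>:<method>".
    """
    return f"{BASE_URL}:{method}"


for name in ("loadCodeAssist", "generateContent", "streamGenerateContent"):
    print(method_url(name))
```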