gemini-py
Python SDK for the Gemini API, reverse-engineered from Gemini CLI v0.31.0.
Uses the cloudcode-pa.googleapis.com Code Assist endpoint with your existing Gemini CLI OAuth credentials — no API key required.
Prerequisites
- Gemini CLI installed and authenticated (
npx @google/gemini-cli→ login) - Credentials at
~/.gemini/oauth_creds.json
Install
pip install -e .
Usage
import asyncio
from gemini import GeminiClient, GeminiOptions
async def main():
async with GeminiClient() as client:
# Streaming
async for chunk in client.send_message_stream("Hello!"):
print(chunk.text_delta, end="", flush=True)
print()
# Non-streaming
response = await client.send_message("What is 2+2?")
print(response.text)
print("Tokens:", response.usage_metadata.total_token_count)
asyncio.run(main())
Custom options
opts = GeminiOptions(
model="gemini-2.5-flash-lite", # or "gemini-2.5-pro", "gemini-2.5-flash"
temperature=0.7,
max_output_tokens=4096,
system_prompt="You are a concise assistant.",
thinking_budget=1024, # enable thinking mode
)
client = GeminiClient(options=opts)
One-shot helper
from gemini import query
result = await query("Explain quantum computing in one sentence")
print(result.text)
Multi-turn conversations
async with GeminiClient() as client:
await client.send_message("My name is Alice.")
r = await client.send_message("What is my name?")
print(r.text) # "Your name is Alice."
client.clear_history() # start fresh
API
GeminiClient
| Method | Description |
|---|---|
send_message(prompt) |
Non-streaming, returns GenerateContentResponse |
send_message_stream(prompt) |
Async generator yielding StreamChunk |
clear_history() |
Reset conversation |
history |
List of raw message dicts |
GenerateContentResponse
| Property | Description |
|---|---|
.text |
Full text response (excludes thoughts) |
.thinking |
Thinking content (if thinking_budget set) |
.candidates |
List of Candidate objects |
.usage_metadata |
Token counts |
GeminiOptions
| Field | Default | Description |
|---|---|---|
model |
gemini-2.5-pro |
Model name |
temperature |
1.0 |
Sampling temperature |
max_output_tokens |
32768 |
Max tokens to generate |
thinking_budget |
None |
Tokens for thinking (enables thought mode) |
system_prompt |
None |
System instruction |
credentials_path |
~/.gemini/oauth_creds.json |
OAuth credentials file |
How it works
The Gemini CLI uses Google's internal Code Assist API (cloudcode-pa.googleapis.com/v1internal) with OAuth2 credentials. This library:
- Reads
~/.gemini/oauth_creds.json(written by Gemini CLI login) - Refreshes the access token when expired
- Calls
loadCodeAssistto get your project ID - Sends requests to
generateContentorstreamGenerateContent
The OAuth client credentials are the same as those embedded in the open-source Gemini CLI.
