# gemini-py

Python SDK for the Gemini API, reverse-engineered from [Gemini CLI](https://github.com/google-gemini/gemini-cli) v0.31.0.

Uses the `cloudcode-pa.googleapis.com` Code Assist endpoint with your existing Gemini CLI OAuth credentials — no API key required.

## Prerequisites

- Gemini CLI installed and authenticated (`npx @google/gemini-cli` → login)
- Credentials at `~/.gemini/oauth_creds.json`
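
A quick way to confirm the credentials file is in place (a minimal standalone check, not part of the SDK):

```python
from pathlib import Path

# Default location written by the Gemini CLI login flow.
creds = Path.home() / ".gemini" / "oauth_creds.json"
if creds.exists():
    print("credentials found:", creds)
else:
    print("not found - run `npx @google/gemini-cli` and log in first")
```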

## Install

```bash
pip install -e .
```

## Usage

```python
import asyncio
from gemini import GeminiClient, GeminiOptions

async def main():
    async with GeminiClient() as client:
        # Streaming
        async for chunk in client.send_message_stream("Hello!"):
            print(chunk.text_delta, end="", flush=True)
        print()

        # Non-streaming
        response = await client.send_message("What is 2+2?")
        print(response.text)
        print("Tokens:", response.usage_metadata.total_token_count)

asyncio.run(main())
```

### Custom options

```python
opts = GeminiOptions(
    model="gemini-2.5-flash-lite",   # or "gemini-2.5-pro", "gemini-2.5-flash"
    temperature=0.7,
    max_output_tokens=4096,
    system_prompt="You are a concise assistant.",
    thinking_budget=1024,            # enable thinking mode
)
client = GeminiClient(options=opts)
```

### One-shot helper

```python
from gemini import query

# `query` is a coroutine, so await it inside an async function
# (or pass it to asyncio.run).
result = await query("Explain quantum computing in one sentence")
print(result.text)
```

### Multi-turn conversations

```python
async with GeminiClient() as client:
    await client.send_message("My name is Alice.")
    r = await client.send_message("What is my name?")
    print(r.text)  # "Your name is Alice."

    client.clear_history()  # start fresh
```

## API

### `GeminiClient`

| Method / property | Description |
|---|---|
| `send_message(prompt)` | Non-streaming, returns `GenerateContentResponse` |
| `send_message_stream(prompt)` | Async generator yielding `StreamChunk` |
| `clear_history()` | Reset conversation |
| `history` | List of raw message dicts |
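
The raw history dicts likely follow the Gemini API's `Content` shape (a `role` plus a list of `parts`); that layout is an assumption here, so inspect `client.history` yourself before depending on it. A sketch of what a one-round-trip history might look like:

```python
# Hypothetical shape of client.history after one exchange; the
# role/parts layout mirrors the public Gemini API's Content format,
# but verify against your own client before relying on it.
history = [
    {"role": "user", "parts": [{"text": "My name is Alice."}]},
    {"role": "model", "parts": [{"text": "Nice to meet you, Alice!"}]},
]

# Filtering by role is handy when replaying or trimming a conversation.
user_turns = [m for m in history if m["role"] == "user"]
print(user_turns[0]["parts"][0]["text"])
```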

### `GenerateContentResponse`

| Property | Description |
|---|---|
| `.text` | Full text response (excludes thoughts) |
| `.thinking` | Thinking content (if `thinking_budget` set) |
| `.candidates` | List of `Candidate` objects |
| `.usage_metadata` | Token counts |

### `GeminiOptions`

| Field | Default | Description |
|---|---|---|
| `model` | `gemini-2.5-pro` | Model name |
| `temperature` | `1.0` | Sampling temperature |
| `max_output_tokens` | `32768` | Max tokens to generate |
| `thinking_budget` | `None` | Tokens for thinking (enables thought mode) |
| `system_prompt` | `None` | System instruction |
| `credentials_path` | `~/.gemini/oauth_creds.json` | OAuth credentials file |

## How it works

The Gemini CLI uses Google's internal Code Assist API (`cloudcode-pa.googleapis.com/v1internal`) with OAuth2 credentials. This library:

1. Reads `~/.gemini/oauth_creds.json` (written by Gemini CLI login)
2. Refreshes the access token when expired
3. Calls `loadCodeAssist` to get your project ID
4. Sends requests to `generateContent` or `streamGenerateContent`

The OAuth client credentials are the same as those embedded in the open-source Gemini CLI.