
The @ollama chat participant provides a dedicated, history-aware conversational interface to your local Ollama models inside GitHub Copilot Chat.

Invoking @ollama

Type @ollama at the beginning of any Copilot Chat message:

text
@ollama explain the architecture of this TypeScript project

The participant is sticky — once invoked in a conversation thread, it continues handling subsequent messages automatically until you switch to a different participant.

How It Differs from the Model Picker

| | Model Picker | @ollama |
|---|---|---|
| Invocation | Select from dropdown | `@ollama` prefix |
| Sticky | Thread model stays selected | Yes, participant stays selected |
| Default model | Most recently selected | `ollama.completionModel` or first available |
| Tool calling | Yes | Yes |

Context and History

The @ollama participant passes the full conversation history from Copilot Chat to Ollama in each request, so it maintains context across turns:

text
@ollama what does the `activate` function do in this file?
# (reads through the response)
How does it handle errors?
# (follow-up — still using @ollama with full context)
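Conceptually, the history pass-through amounts to flattening prior turns into the messages array that Ollama's `/api/chat` endpoint accepts. A minimal sketch, using illustrative types rather than the extension's actual ones:

```typescript
// Hypothetical shapes; the real extension maps VS Code's chat history
// types, but the flattening logic looks roughly like this.
type Turn = { role: "user" | "assistant"; text: string };
type OllamaMessage = { role: string; content: string };

function buildMessages(
  history: Turn[],
  prompt: string,
  system?: string
): OllamaMessage[] {
  const messages: OllamaMessage[] = [];
  if (system) messages.push({ role: "system", content: system });
  // Replay every earlier turn so the model keeps full context.
  for (const turn of history) {
    messages.push({ role: turn.role, content: turn.text });
  }
  // The current question goes last.
  messages.push({ role: "user", content: prompt });
  return messages;
}
```

Because the entire history is resent on every turn, long threads consume more of the model's context window; starting a fresh thread resets that cost.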

Workspace Context

When VS Code provides workspace context (open files, selected code, terminal output), it is injected as XML-tagged context blocks at the beginning of the conversation. The extension extracts these tags and promotes them to a system message, so they influence replies without cluttering the conversation view.
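The extraction step can be sketched as a regex pass over the first user message. The tag names below are purely illustrative (the actual tags VS Code emits may differ); the point is that matched blocks move into a system message while the visible prompt stays clean:

```typescript
// Hypothetical tag names for illustration only.
const CONTEXT_TAG = /<(attachment|editor-context)\b[^>]*>[\s\S]*?<\/\1>/g;

function promoteContext(userText: string): { system: string; user: string } {
  // Collect the XML-tagged context blocks...
  const blocks = userText.match(CONTEXT_TAG) ?? [];
  // ...and remove them from what the user appears to have typed.
  const user = userText.replace(CONTEXT_TAG, "").trim();
  return { system: blocks.join("\n"), user };
}
```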

Tool Calling

For models with tool-calling capability (🛠 badge), the @ollama participant can invoke VS Code tools:

text
@ollama look at the test file for sidebar.ts and tell me what cases are missing

The tool loop runs inside the extension — the model emits tool call requests, VS Code executes them, and the results flow back for the next response turn. If a model rejects the tool schema (not all models support OpenAI-compatible function calling), the extension automatically retries the request without tools.
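The loop above can be sketched with its two dependencies injected, so neither the VS Code API nor the Ollama client is assumed: `chat` stands in for one model request, `execute` for one VS Code tool invocation.

```typescript
// Illustrative types; the real extension uses VS Code's tool-calling types.
type ToolCall = { name: string; args: unknown };
type ChatResult = { text: string; toolCalls: ToolCall[] };

function runToolLoop(
  chat: (toolResults: string[]) => ChatResult,
  execute: (call: ToolCall) => string,
  maxRounds = 5
): string {
  let results: string[] = [];
  for (let round = 0; round < maxRounds; round++) {
    const reply = chat(results);
    // No tool calls means the model has produced its final answer.
    if (reply.toolCalls.length === 0) return reply.text;
    // Execute each requested tool; results feed the next request.
    results = reply.toolCalls.map(execute);
  }
  return ""; // safety valve: stop after maxRounds
}
```

The bounded round count is a common safeguard against a model that keeps requesting tools indefinitely; the actual limit used by the extension is not documented here.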

Vision

For models with vision capability (👁 badge), you can attach images directly in the chat input:

  • Drag an image file into the chat input
  • Paste an image from clipboard
text
@ollama what is shown in this screenshot?
[image attached]

Images are automatically stripped for models that do not support vision, so you can leave images in your message history without worrying about errors.
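The stripping step amounts to dropping the image field from every message before the request is sent. A minimal sketch, assuming a message shape similar to Ollama's (base64 images in an optional `images` array):

```typescript
type Message = { role: string; content: string; images?: string[] };

// If the model lacks vision, remove image attachments so stale
// screenshots in history cannot trigger request errors.
function stripImages(messages: Message[], modelHasVision: boolean): Message[] {
  if (modelHasVision) return messages;
  return messages.map(({ images, ...rest }) => rest);
}
```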

Thinking Models

For models that expose chain-of-thought reasoning (🧠 badge — DeepSeek-R1, Qwen QwQ, Kimi, etc.), responses are split into two collapsible sections:

  • Thinking — the model's internal reasoning, wrapped in <think>…</think> tags
  • Response — the final answer

This makes it easy to review the reasoning or skip straight to the answer.
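Since the source states the reasoning arrives wrapped in `<think>…</think>` tags, the split can be sketched as a small parser that peels a leading think block off the reply:

```typescript
function splitThinking(reply: string): { thinking: string; response: string } {
  // Match a <think>…</think> block at the start of the reply.
  const m = reply.match(/^\s*<think>([\s\S]*?)<\/think>\s*/);
  if (!m) return { thinking: "", response: reply };
  return { thinking: m[1].trim(), response: reply.slice(m[0].length) };
}
```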

Streaming

Responses stream token-by-token in real time. The extension uses an XML stream filter to safely handle model responses that may emit XML-like system tags inside their output, ensuring they are rendered as plain text rather than being interpreted as injected context.
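One way such a filter can work, sketched here as an assumption rather than the extension's actual implementation: escape `<` in emitted text, and buffer a trailing partial tag in case it is split across chunk boundaries.

```typescript
class StreamFilter {
  private pending = "";

  // Process one streamed chunk; returns text safe to render.
  push(chunk: string): string {
    let text = this.pending + chunk;
    this.pending = "";
    // Hold back a trailing "<..." that might be an unfinished tag
    // continuing in the next chunk.
    const open = text.lastIndexOf("<");
    if (open !== -1 && !text.includes(">", open)) {
      this.pending = text.slice(open);
      text = text.slice(0, open);
    }
    // Neutralize tags so they render as plain text.
    return text.replace(/</g, "&lt;");
  }

  // Emit whatever was held back once the stream ends.
  flush(): string {
    const rest = this.pending.replace(/</g, "&lt;");
    this.pending = "";
    return rest;
  }
}
```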

Error Handling

| Situation | Behavior |
|---|---|
| Ollama unreachable | Error message with "Open Settings" and "Open Logs" options |
| Model not found | Error with model name; use sidebar to pull model |
| Tool schema rejected | Automatic retry without tools |
| Model crash / OOM detected | Warning dialog with model name and "Open Logs" option |
| Rate limit (cloud models) | Error surfaced in chat |

Released under the MIT License