GitHub Copilot · VS Code Guide

Optimize AI Credit Usage

10 practical techniques to get the most out of your monthly Copilot allowance — without sacrificing quality.

10 Tips: Actionable techniques

3 Phases: Plan, implement, monitor

1 Goal: More output, fewer credits


Tip 1 · Model Selection

Match the Model to the Task

More capable models cost more per token. Use the right tool for each job.

Light Models

GPT-4o mini, Claude Haiku

Low

Credit cost per request: 20%

Best for

Quick edits
Boilerplate generation
Straightforward Q&A
Auto-complete tasks

Balanced Models

GPT-4o, Claude Sonnet

Medium

Credit cost per request: 55%

Best for

Code review
Feature implementation
Documentation writing
General debugging

Reasoning Models

o1, Claude Opus

High

Credit cost per request: 90%

Best for

Complex refactoring
Architectural decisions
Multi-step debugging
System design

Auto Model Selection

Let VS Code route each request to an efficient model that balances quality and cost. The model picker in chat shows cost details in the hover menu, including cost per token type and a generic cost tier label.
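
If you pick models by hand instead, the three "Best for" lists above can be read as a lookup table. The sketch below is only an illustration of that idea: the task labels, the `TIER_FOR_TASK` table, and the `pick_tier` helper are hypothetical names and are not part of VS Code or Copilot.

```python
# Hypothetical lookup table built from the "Best for" lists in this guide.
TIER_FOR_TASK = {
    "quick edit": "light",
    "boilerplate": "light",
    "simple question": "light",
    "code review": "balanced",
    "feature implementation": "balanced",
    "documentation": "balanced",
    "general debugging": "balanced",
    "complex refactoring": "reasoning",
    "architectural decision": "reasoning",
    "multi-step debugging": "reasoning",
    "system design": "reasoning",
}


def pick_tier(task: str, default: str = "balanced") -> str:
    """Return the tier this guide suggests for a task, or `default` if the task is unknown."""
    return TIER_FOR_TASK.get(task.strip().lower(), default)


print(pick_tier("Quick edit"))     # light
print(pick_tier("System design"))  # reasoning
```

Falling back to the balanced tier for unknown tasks is a design choice made for this sketch; a stricter version could raise an error instead.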

Tips 2–10

9 More Ways to Extend Your Credits

Each technique independently reduces token consumption. Stack them for maximum efficiency.

Tip 2 · Workflow

Plan Before You Implement

Separate planning and implementation phases. Use a reasoning model for planning, then switch to a faster model for execution once the plan is solidified.

Use the Plan agent → Review & refine → Hand off to implementation agent

Tip 3 · Reasoning

Thinking Effort Defaults

VS Code sets default effort levels with adaptive reasoning. Only increase thinking effort for genuinely complex problems.

Default effort = sufficient for most tasks

Tip 4 · Context

New Chat, New Task

Start fresh when changing topics. Accumulated context from previous messages consumes tokens without improving results.

Ctrl+N (Windows/Linux) · ⌘N (macOS)
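
To see why a long conversation gets expensive, here is a rough sketch of how input tokens add up. It assumes a simplified model in which every request resends the full conversation so far, each turn adds a fixed 500 tokens, and there is no caching or compaction. The function name and the numbers are illustrative, not measurements.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens across `turns` requests when each request
    resends all earlier turns plus the new one (simplified assumption)."""
    # Request k carries k turns of history: 1 + 2 + ... + turns
    return tokens_per_turn * turns * (turns + 1) // 2


# One 20-turn chat versus two fresh 10-turn chats covering the same work.
one_long_chat = cumulative_input_tokens(turns=20, tokens_per_turn=500)
two_short_chats = 2 * cumulative_input_tokens(turns=10, tokens_per_turn=500)

print(one_long_chat)    # 105000
print(two_short_chats)  # 55000
```

Under these assumptions, splitting unrelated work into two chats sends roughly half the input tokens, because history grows with every turn.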

Tip 5 · Branching

Leverage Forking

Fork the conversation to explore alternatives without re-establishing context. Type /fork to branch from the current message.

/fork — branches from current point

Tip 6 · Tools

Disable Unneeded Tools

Every tool call produces output that consumes context window space. Disable tools and MCP servers you don't need for the current task.

Configure Tools button → disable per-request

Tip 7 · Files

Exclude Files from Context

Large generated files, build outputs, and irrelevant directories inflate token usage. Use .gitignore and files.exclude to keep context lean.

.gitignore · files.exclude setting
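
As a starting point, the sketch below prints a `files.exclude` fragment you could adapt for `.vscode/settings.json`. The directory names are hypothetical examples of generated or vendored output; swap in whatever is noisy in your own project.

```python
import json

# Hypothetical directories; replace with the generated or vendored paths in your project.
noisy_paths = ["dist", "build", "coverage", "node_modules"]

# files.exclude maps glob patterns to true (excluded) or false (kept).
settings_fragment = {"files.exclude": {f"**/{path}": True for path in noisy_paths}}

print(json.dumps(settings_fragment, indent=2))
```

Build outputs like these usually belong in `.gitignore` as well, so both mechanisms named above stay in sync.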

Tip 8 · Compaction

Manage Context with /compact

When a conversation grows long, use /compact to summarize older parts and reclaim context window space. Add optional instructions to guide the summary.

/compact focus on the API design decisions

Tip 9 · Monitoring

Monitor Your Usage

View your current Copilot usage in the Status Bar dashboard. It shows the percentage of your monthly AI credit allowance used so far.

Run /chronicle:cost-tips for personalized tips

Tip 10 · Debug

Inspect Token Usage & Caching

Use Agent Debug Logs to understand credit consumption. The Cache Explorer shows prompt cache hit rates — cached tokens cost less.

Summary view · Cache Explorer · cache hit rates

Tip 2 Deep Dive · Workflow

Separate Planning from Implementation

The biggest credit drain is generating code before the approach is right. A two-phase workflow fixes this.

Phase 1

Planning

Model type

Reasoning Model (High cost)

🔍 Use the Plan agent to research the task

📋 Generate a structured implementation plan

✏️ Review and refine the plan

✅ Approve the plan before any code is written

Structured, approved plan

Phase 2

Implementation

Model type

Fast Model (Low cost)

🤝 Hand off the approved plan to the implementation agent

⚡ Agent executes with a faster, efficient model

🔄 Minimal back-and-forth, no rework

🚀 Deliver working code faster, at lower cost

Working code, fewer credits used

Hand off approved plan

Why This Saves Credits

Jumping straight into code generation with a reasoning model for the entire session wastes credits if the approach is wrong. By separating phases, you use the expensive model only for the short planning phase, then switch to a cost-effective model for the longer implementation phase — where most tokens are consumed.

Planning tokens: 15%

Implementation tokens: 85%

85% of tokens are in implementation — use the cheapest model there.
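
A back-of-the-envelope calculation shows the size of the effect. This sketch borrows the relative cost figures from Tip 1 (90% for reasoning models, 20% for light models) and treats them as per-token weights, which is an assumption made for illustration only; actual pricing is not stated in this guide.

```python
# Relative cost figures from Tip 1, treated as per-token weights (an assumption).
REASONING_COST = 0.90
LIGHT_COST = 0.20

# Token split from this section.
PLANNING_SHARE = 0.15
IMPLEMENTATION_SHARE = 0.85


def blended_cost(planning_cost: float, implementation_cost: float) -> float:
    """Weighted relative cost of a session given the model cost used in each phase."""
    return PLANNING_SHARE * planning_cost + IMPLEMENTATION_SHARE * implementation_cost


single_model = blended_cost(REASONING_COST, REASONING_COST)  # reasoning model throughout
two_phase = blended_cost(REASONING_COST, LIGHT_COST)         # reasoning to plan, light to build
savings = 1 - two_phase / single_model

print(f"single model: {single_model:.3f}")  # single model: 0.900
print(f"two-phase:    {two_phase:.3f}")     # two-phase:    0.305
print(f"savings:      {savings:.0%}")       # savings:      66%
```

Under these assumed weights, the two-phase workflow costs about a third of running the reasoning model for the whole session.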

Tips 9 & 10 · Monitor

Know Where Your Credits Go

The Agent Debug Logs and Copilot Status Dashboard give you full visibility into token consumption.

Summary View

Aggregate token usage for the session, total tool calls, and overall duration.

Input tokens: 12,480

Output tokens: 3,210

Cached tokens: 8,150

Tool calls: 24

Cache Explorer

Prompt cache hit rates and how many input tokens were reused from previous requests.

Cache hit rate: 67%

Tokens saved: 8,150

Latency reduction: ~40%

Cost reduction: ~30%

Monthly AI Credit Allowance

Copilot Status Dashboard

Available from the VS Code Status Bar. Shows the percentage of your monthly allowance used. Run /chronicle:cost-tips for personalized recommendations.

Credits used this month: 63%
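
A percentage is easier to act on when you project it forward. The sketch below does a simple linear extrapolation, assuming the allowance resets on the first of the month and that usage continues at the same daily pace. The helper name and the day-18 input are hypothetical; only the 63% figure comes from the dashboard example above.

```python
def projected_month_end_usage(percent_used: float, day_of_month: int, days_in_month: int) -> float:
    """Linearly extrapolate the share of the allowance used by month end."""
    if not 1 <= day_of_month <= days_in_month:
        raise ValueError("day_of_month must be between 1 and days_in_month")
    return percent_used * days_in_month / day_of_month


# 63% is the dashboard example; day 18 of a 30-day month is a made-up input.
projection = projected_month_end_usage(63, day_of_month=18, days_in_month=30)
print(f"{projection:.0f}% projected")  # 105% projected
```

A projection above 100% is a signal to apply the earlier tips before the allowance runs out.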

Start Optimizing Today

Run /chronicle:cost-tips in any chat session to get personalized recommendations based on your recent activity.

/chronicle:cost-tips
