GitHub Copilot · VS Code Guide

Optimize AI Credit Usage

10 practical techniques to get the most out of your monthly Copilot allowance — without sacrificing quality.

10 Tips: Actionable techniques

3 Phases: Plan, implement, monitor

1 Goal: More output, fewer credits

Tip 1 · Model Selection

Match the Model to the Task

More capable models cost more per token. Use the right tool for each job.

Light Models

GPT-4o mini, Claude Haiku

Low
Credit cost per request: 20%

Best for

  • Quick edits
  • Boilerplate generation
  • Straightforward Q&A
  • Auto-complete tasks

Balanced Models

GPT-4o, Claude Sonnet

Medium
Credit cost per request: 55%

Best for

  • Code review
  • Feature implementation
  • Documentation writing
  • General debugging

Reasoning Models

o1, Claude Opus

High
Credit cost per request: 90%

Best for

  • Complex refactoring
  • Architectural decisions
  • Multi-step debugging
  • System design
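
The three cards above amount to a lookup from task type to model tier. A minimal sketch of that lookup follows; the tier names and task lists come from the cards, while the `pick_tier` function and its "balanced" fallback are hypothetical and not part of any Copilot or VS Code API.

```python
# Task-to-tier table copied from the three model cards.
# pick_tier and the "balanced" default are illustrative assumptions.
TIER_BY_TASK = {
    "quick edits": "light",
    "boilerplate generation": "light",
    "straightforward q&a": "light",
    "auto-complete tasks": "light",
    "code review": "balanced",
    "feature implementation": "balanced",
    "documentation writing": "balanced",
    "general debugging": "balanced",
    "complex refactoring": "reasoning",
    "architectural decisions": "reasoning",
    "multi-step debugging": "reasoning",
    "system design": "reasoning",
}


def pick_tier(task: str) -> str:
    """Return the tier listed for a task, or 'balanced' when the task is not listed."""
    return TIER_BY_TASK.get(task.strip().lower(), "balanced")


print(pick_tier("Boilerplate generation"))
print(pick_tier("System design"))
```

Defaulting to the middle tier is only one possible choice; a team that prefers to spend less by default could fall back to "light" and escalate by hand.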

Auto Model Selection

Let VS Code route each request to an efficient model that balances quality and cost. The model picker in chat shows cost details in the hover menu, including cost per token type and a generic cost tier label.

Tips 2–10

9 More Ways to Extend Your Credits

Each technique independently reduces token consumption. Stack them for maximum efficiency.

Tip 2 · Workflow

Plan Before You Implement

Separate planning and implementation phases. Use a reasoning model for planning, then switch to a faster model for execution once the plan is solidified.

Use the Plan agent → Review & refine → Hand off to implementation agent

Tip 3 · Reasoning

Thinking Effort Defaults

VS Code sets default effort levels with adaptive reasoning. Only increase thinking effort for genuinely complex problems.

Default effort = sufficient for most tasks

Tip 4 · Context

New Chat, New Task

Start fresh when changing topics. Accumulated context from previous messages consumes tokens without improving results.

Ctrl+N (Linux) · ⌘N (macOS)
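
To see why a long conversation gets expensive, here is a rough sketch of how input tokens can add up. It assumes every request resends the full history, every turn adds the same number of tokens, and no caching or compaction applies; the function name and the numbers are illustrative only.

```python
def total_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens across a chat in which each request resends all prior turns."""
    # Request k carries k turns of history, so the total is t * (1 + 2 + ... + n).
    return tokens_per_turn * turns * (turns + 1) // 2


# One 20-turn chat versus two fresh 10-turn chats covering the same work.
one_long_chat = total_input_tokens(turns=20, tokens_per_turn=500)
two_fresh_chats = 2 * total_input_tokens(turns=10, tokens_per_turn=500)
print(one_long_chat, two_fresh_chats)  # 105000 55000
```

Under these simplified assumptions, splitting unrelated work into two chats roughly halves the input tokens, because the growth is quadratic in the number of turns.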

Tip 5 · Branching

Leverage Forking

Fork the conversation to explore alternatives without re-establishing context. Type /fork to branch from the current message.

/fork — branches from current point

Tip 6 · Tools

Disable Unneeded Tools

Every tool call produces output that consumes context window space. Disable tools and MCP servers you don't need for the current task.

Configure Tools button → disable per-request

Tip 7 · Files

Exclude Files from Context

Large generated files, build outputs, and irrelevant directories inflate token usage. Use .gitignore and files.exclude to keep context lean.

.gitignore · files.exclude setting
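
As an illustration of the `files.exclude` setting, the sketch below builds a settings fragment that maps glob patterns to `true`, which is the shape VS Code expects for that setting. The helper name and the example patterns are placeholders; swap in whichever directories are noisy in your own repository.

```python
import json


def exclude_fragment(patterns: list[str]) -> dict:
    """Build a settings fragment that maps each glob pattern to true under files.exclude."""
    return {"files.exclude": {pattern: True for pattern in patterns}}


# Placeholder patterns for build output and minified bundles (assumed, not from the guide).
fragment = exclude_fragment(["**/dist", "**/build", "**/*.min.js"])
print(json.dumps(fragment, indent=2))
```

The printed JSON can be merged into a workspace `.vscode/settings.json` by hand.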

Tip 8 · Compaction

Manage Context with /compact

When a conversation grows long, use /compact to summarize older parts and reclaim context window space. Add optional instructions to guide the summary.

/compact focus on the API design decisions

Tip 9 · Monitoring

Monitor Your Usage

View your current Copilot usage in the Status Bar dashboard. It shows the percentage of your monthly AI credit allowance that you have used.

Run /chronicle:cost-tips for personalized tips

Tip 10 · Debug

Inspect Token Usage & Caching

Use Agent Debug Logs to understand credit consumption. The Cache Explorer shows prompt cache hit rates — cached tokens cost less.

Summary view · Cache Explorer · cache hit rates
Tip 2 Deep Dive · Workflow

Separate Planning from Implementation

The biggest credit drain is generating code before the approach is right. A two-phase workflow fixes this.

Phase 1

Planning

Model type: Reasoning Model (High cost)

🔍 Use the Plan agent to research the task
📋 Generate a structured implementation plan
✏️ Review and refine the plan
Approve the plan before any code is written

Result: Structured, approved plan

Phase 2

Implementation

Model type: Fast Model (Low cost)

🤝 Hand off the approved plan to the implementation agent
Agent executes with a faster, efficient model
🔄 Minimal back-and-forth, no rework
🚀 Deliver working code faster, at lower cost

Result: Working code, fewer credits used

Why This Saves Credits

Jumping straight into code generation with a reasoning model for the entire session wastes credits if the approach is wrong. By separating phases, you use the expensive model only for the short planning phase, then switch to a cost-effective model for the longer implementation phase — where most tokens are consumed.

Planning tokens: 15%
Implementation tokens: 85%

85% of tokens are in implementation — use the cheapest model there.
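
The saving can be estimated with simple arithmetic. The sketch below reuses the 15% / 85% token split and the relative cost figures from the model cards (90 for reasoning models, 20 for light models). Treating those per-request figures as if they scaled with token share is an assumption made for illustration, not a published pricing rule.

```python
# Relative cost figures taken from the model cards.
# ASSUMPTION: cost scales linearly with each phase's share of tokens.
REASONING_COST = 90
LIGHT_COST = 20


def blended_cost(planning_share_pct: int, planning_cost: int, implementation_cost: int) -> float:
    """Weighted average cost for a session split into planning and implementation."""
    implementation_share_pct = 100 - planning_share_pct
    weighted = planning_share_pct * planning_cost + implementation_share_pct * implementation_cost
    return weighted / 100


single_model = blended_cost(15, REASONING_COST, REASONING_COST)  # reasoning model throughout
two_phase = blended_cost(15, REASONING_COST, LIGHT_COST)         # reasoning for planning only
saving_pct = round(100 * (1 - two_phase / single_model), 1)
print(single_model, two_phase, saving_pct)  # 90.0 30.5 66.1
```

With these illustrative inputs the two-phase session costs about a third of the single-model session. Real savings depend on actual per-token pricing and on how much rework the plan avoids.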

Tips 9 & 10 · Monitor

Know Where Your Credits Go

The Agent Debug Logs and Copilot Status Dashboard give you full visibility into token consumption.

Summary View

Aggregate token usage for the session, total tool calls, and overall duration.

Input tokens: 12,480
Output tokens: 3,210
Cached tokens: 8,150
Tool calls: 24

Cache Explorer

Prompt cache hit rates and how many input tokens were reused from previous requests.

Cache hit rate: 67%
Tokens saved: 8,150
Latency reduction: ~40%
Cost reduction: ~30%
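
One way to reason about these metrics is sketched below. The guide does not state the formula the Cache Explorer uses, so this sketch assumes cached tokens are a subset of input tokens and are billed at a fixed fraction of the normal price. The function names, the 0.5 price ratio, and the token counts are all hypothetical.

```python
def cache_hit_rate(input_tokens: int, cached_tokens: int) -> float:
    """Fraction of input tokens served from the prompt cache (assumed definition)."""
    if input_tokens <= 0:
        raise ValueError("input_tokens must be positive")
    return cached_tokens / input_tokens


def effective_input_cost(input_tokens: int, cached_tokens: int, cached_price_ratio: float) -> float:
    """Input cost in full-price token units when cached tokens are billed at a reduced ratio."""
    uncached_tokens = input_tokens - cached_tokens
    return uncached_tokens + cached_tokens * cached_price_ratio


# Made-up session: 10,000 input tokens, 6,000 of them cached, cached tokens at half price.
rate = cache_hit_rate(10_000, 6_000)
cost = effective_input_cost(10_000, 6_000, cached_price_ratio=0.5)
print(rate, cost)  # 0.6 7000.0
```

Under these assumptions a 60% hit rate cuts input cost by 30%, which shows why a stable prompt prefix is worth keeping.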

Monthly AI Credit Allowance

Copilot Status Dashboard

Available via the VS Code Status Bar. Shows percentage of monthly allowance used. Run /chronicle:cost-tips for personalized recommendations.

Credits used this month: 63%

Start Optimizing Today

Run /chronicle:cost-tips in any chat session to get personalized recommendations based on your recent activity.

/chronicle:cost-tips