Skip to main content

Performance Optimization Guide

Optimize Claude Code performance for faster responses, reduced token usage, and efficient workflows.

Model Selection Strategies

Choose the Right Model

Different models for different tasks:

{
"model_routing": {
"quick_questions": "claude-3-haiku",
"code_generation": "claude-3-sonnet",
"complex_analysis": "claude-3-opus",
"batch_processing": "claude-3-haiku"
}
}

Dynamic Model Switching

# Switch models during session
claude-code --model=haiku "Quick question about syntax"
claude-code --model=opus "Deep architectural analysis"

# Auto-select based on complexity
claude-code --smart-routing "Implement user authentication"

Model Performance Characteristics

ModelSpeedQualityCostBest For
HaikuFastGoodLowQuick tasks, iterations
SonnetModerateExcellentMediumCode generation, analysis
OpusSlowerSuperiorHighComplex problems, architecture

Context Window Optimization

Efficient Context Usage

Strategic File Reading

# Good: Read specific sections
claude-code read src/auth.ts --lines=50-100

# Better: Use targeted search
claude-code grep "authentication" --type=ts

# Best: Let agent search intelligently
claude-code task "Find authentication implementation"

Context Prioritization

{
"context_priority": {
"high": ["current_files", "error_messages", "user_intent"],
"medium": ["project_structure", "recent_changes"],
"low": ["historical_context", "documentation"]
}
}

Memory Management

Session Segmentation

# Start focused session
claude-code --session=auth-feature --context=minimal

# Segment by feature
claude-code --session=frontend --include="src/components/**"
claude-code --session=backend --include="src/api/**"

Context Reset Strategies

# Automatic reset after threshold
claude-code --auto-reset-at=75%

# Manual reset with summary
claude-code --reset-with-summary

# Selective context clearing
claude-code --clear=historical --keep=current

Workflow Efficiency

Batch Operations

Parallel Tool Execution

// Execute multiple operations concurrently
const results = await Promise.all([
readFile('src/component.tsx'),
grepPattern('useState', '**/*.tsx'),
runTests('component.test.tsx')
]);

Bulk File Operations

# Process multiple files efficiently
claude-code batch-edit \
--pattern="**/*.tsx" \
--operation="add-prop-types" \
--parallel=true

Workflow Caching

Result Caching

{
"caching": {
"tool_results": {
"ttl": 300,
"max_size": "100MB"
},
"file_contents": {
"ttl": 60,
"watch_changes": true
},
"search_results": {
"ttl": 600,
"invalidate_on": ["file_changes"]
}
}
}

Intelligent Invalidation

# Cache with dependencies
claude-code cache set "test-results" \
--depends-on="src/**/*.ts,tests/**/*.ts" \
--ttl=3600

Response Optimization

Concise Communication

Response Templates

{
"response_templates": {
"code_change": "Changed {{file}}:{{line}} - {{description}}",
"test_result": "{{status}} - {{passed}}/{{total}} tests passed",
"build_status": "Build {{status}} in {{duration}}ms"
}
}

Structured Outputs

# Request structured responses
claude-code --format=json "Analyze code quality"
claude-code --format=table "Compare framework options"
claude-code --format=checklist "Review deployment readiness"

Progressive Enhancement

Layered Information

Level 1: Direct answer to question
Level 2: Implementation details (on request)
Level 3: Alternative approaches (on request)
Level 4: Deep technical analysis (on request)

Efficiency Measurement

Performance Metrics

Response Time Tracking

# Enable performance monitoring
claude-code --performance-mode

# View session metrics
claude-code metrics --session=current

# Historical performance data
claude-code metrics --period=7d --breakdown=model

Token Usage Analysis

{
"token_metrics": {
"input_tokens": 1500,
"output_tokens": 800,
"cached_tokens": 300,
"efficiency_ratio": 0.73
}
}

Optimization Recommendations

# Get optimization suggestions
claude-code optimize --analyze-session

# Sample output:
# ✅ Context usage: 68% (good)
# ⚠️ Model selection: Use Haiku for 23% of queries
# ❌ Redundant reads: 5 files read multiple times
# 💡 Suggestion: Enable context caching

Advanced Optimization Techniques

Context Compression

Semantic Compression

{
"compression": {
"method": "semantic",
"preserve": ["code_structure", "error_messages"],
"compress": ["verbose_logs", "repeated_content"],
"ratio": 0.3
}
}

Smart Summarization

# Compress conversation history
claude-code --compress-history --keep-decisions

# Maintain key information
claude-code --summarize --preserve=["architecture", "constraints"]

Workflow Design Patterns

Lazy Loading

// Load expensive data only when needed
const getData = lazy(() => {
return expensiveDataFetch();
});

// Use data when required
if (needsData) {
const data = await getData();
}

Resource Pooling

// Pool expensive resources
const agentPool = new Pool({
create: () => new SpecializedAgent(),
max: 5,
min: 1
});

// Reuse agents efficiently
const agent = await agentPool.acquire();
const result = await agent.process(task);
agentPool.release(agent);

Benchmarking & Profiling

Performance Baselines

# Establish performance baselines
claude-code benchmark \
--tasks="code-generation,file-analysis,test-creation" \
--iterations=10 \
--output=baseline.json

Continuous Monitoring

{
"monitoring": {
"response_time": {
"target": "< 2s",
"alert_threshold": "5s"
},
"token_efficiency": {
"target": "> 0.7",
"alert_threshold": "< 0.5"
},
"context_usage": {
"target": "< 80%",
"alert_threshold": "> 95%"
}
}
}

A/B Testing

# Test optimization strategies
claude-code ab-test \
--strategy-a="current" \
--strategy-b="optimized" \
--metric="response_time" \
--duration=1h

Resource Management

Memory Optimization

Smart Garbage Collection

{
"memory_management": {
"gc_interval": 300000,
"max_heap": "512MB",
"cache_cleanup": {
"unused_for": "1h",
"size_threshold": "50MB"
}
}
}

Context Pruning

# Automatic context pruning
claude-code --auto-prune-context \
--keep-recent=10 \
--keep-important=true \
--prune-duplicates=true

CPU Optimization

Parallel Processing

{
"parallel_processing": {
"max_workers": 4,
"queue_size": 100,
"task_distribution": "round_robin"
}
}

Process Prioritization

# Set process priorities
claude-code --priority=high "Critical bug fix"
claude-code --priority=low "Code documentation"

Network Optimization

Request Batching

// Batch API requests
const batchRequests = [
{ tool: 'read', params: { file: 'a.ts' } },
{ tool: 'read', params: { file: 'b.ts' } },
{ tool: 'grep', params: { pattern: 'error' } }
];

await executeBatch(batchRequests);

Connection Management

{
"network": {
"max_connections": 10,
"connection_timeout": 30000,
"retry_strategy": {
"max_retries": 3,
"backoff": "exponential"
}
}
}

Optimization Checklist

Pre-Session Optimization

  • Choose appropriate model for task complexity
  • Configure context management settings
  • Enable relevant caching
  • Set up performance monitoring

During Session

  • Use batch operations when possible
  • Minimize redundant file reads
  • Leverage context compression
  • Monitor token usage

Post-Session Analysis

  • Review performance metrics
  • Identify optimization opportunities
  • Update configurations based on learnings
  • Share insights with team

Performance Monitoring Dashboard

Key Metrics

{
"dashboard": {
"response_times": {
"p50": "1.2s",
"p95": "3.1s",
"p99": "5.4s"
},
"token_usage": {
"avg_input": 1200,
"avg_output": 650,
"efficiency": 0.71
},
"workflow_success": {
"completion_rate": "94%",
"error_rate": "6%"
}
}
}

Alerts and Notifications

# Configure performance alerts
claude-code alerts set \
--metric="response_time" \
--threshold="5s" \
--action="slack-notify"

# Email reports
claude-code reports schedule \
--frequency=weekly \
--metrics="all" \
--email="team@company.com"

Next Steps