Performance Optimization Guide
Optimize Claude Code performance for faster responses, reduced token usage, and efficient workflows.
Model Selection Strategies
Choose the Right Model
Match the model to the task. A routing configuration might look like this:
{
"model_routing": {
"quick_questions": "claude-3-haiku",
"code_generation": "claude-3-sonnet",
"complex_analysis": "claude-3-opus",
"batch_processing": "claude-3-haiku"
}
}
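As a rough illustration of how a routing table like the one above might be applied, the sketch below maps a task category to a model name. The `selectModel` function and its fallback value are assumptions made for this example, not part of Claude Code; the category keys and model names are taken from the configuration above.

```javascript
// Hypothetical helper: look up a model for a task category.
// The table mirrors the "model_routing" configuration shown above.
const modelRouting = {
  quick_questions: "claude-3-haiku",
  code_generation: "claude-3-sonnet",
  complex_analysis: "claude-3-opus",
  batch_processing: "claude-3-haiku",
};

function selectModel(taskCategory, routing = modelRouting, fallback = "claude-3-sonnet") {
  // hasOwnProperty avoids matching inherited keys such as "toString".
  const known = Object.prototype.hasOwnProperty.call(routing, taskCategory);
  return known ? routing[taskCategory] : fallback;
}
```

Unknown categories fall back to a mid-tier model here; that default is a design choice for the sketch, and a stricter version could throw instead.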
Dynamic Model Switching
# Choose a model per invocation
claude-code --model=haiku "Quick question about syntax"
claude-code --model=opus "Deep architectural analysis"
# Auto-select based on complexity
claude-code --smart-routing "Implement user authentication"
Model Performance Characteristics
| Model | Speed | Quality | Cost | Best For |
|---|---|---|---|---|
| Haiku | Fast | Good | Low | Quick tasks, iterations |
| Sonnet | Moderate | Excellent | Medium | Code generation, analysis |
| Opus | Slower | Superior | High | Complex problems, architecture |
Context Window Optimization
Efficient Context Usage
Strategic File Reading
# Good: Read specific sections
claude-code read src/auth.ts --lines=50-100
# Better: Use targeted search
claude-code grep "authentication" --type=ts
# Best: Let agent search intelligently
claude-code task "Find authentication implementation"
Context Prioritization
{
"context_priority": {
"high": ["current_files", "error_messages", "user_intent"],
"medium": ["project_structure", "recent_changes"],
"low": ["historical_context", "documentation"]
}
}
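One way to read the priority tiers above is as an ordering applied before a token budget is enforced. The following is a minimal sketch under that assumption; `selectContext`, `tierOf`, and the `{ kind, tokens }` item shape are hypothetical and only illustrate the idea.

```javascript
// Hypothetical sketch: order context items by priority tier, then keep
// as many as fit in a token budget. Tier contents mirror "context_priority".
const contextPriority = {
  high: ["current_files", "error_messages", "user_intent"],
  medium: ["project_structure", "recent_changes"],
  low: ["historical_context", "documentation"],
};

const TIER_ORDER = ["high", "medium", "low"];

function tierOf(kind, priority = contextPriority) {
  const index = TIER_ORDER.findIndex((tier) => priority[tier].includes(kind));
  // Unknown kinds sort after every known tier.
  return index === -1 ? TIER_ORDER.length : index;
}

function selectContext(items, tokenBudget, priority = contextPriority) {
  // Array.prototype.sort is stable, so items keep their order within a tier.
  const ordered = [...items].sort(
    (a, b) => tierOf(a.kind, priority) - tierOf(b.kind, priority)
  );
  const kept = [];
  let used = 0;
  for (const item of ordered) {
    if (used + item.tokens <= tokenBudget) {
      kept.push(item);
      used += item.tokens;
    }
  }
  return kept;
}
```

The loop is greedy: an item that does not fit is skipped, and smaller lower-priority items may still be admitted after it.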
Memory Management
Session Segmentation
# Start focused session
claude-code --session=auth-feature --context=minimal
# Segment by feature
claude-code --session=frontend --include="src/components/**"
claude-code --session=backend --include="src/api/**"
Context Reset Strategies
# Automatic reset after threshold
claude-code --auto-reset-at=75%
# Manual reset with summary
claude-code --reset-with-summary
# Selective context clearing
claude-code --clear=historical --keep=current
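The 75% threshold above amounts to a simple ratio check. A minimal sketch, assuming the caller knows how many tokens are in use and how large the window is; `shouldResetContext` and its parameters are hypothetical names, not a Claude Code API.

```javascript
// Hypothetical check: has context usage reached the reset threshold?
function shouldResetContext(usedTokens, windowTokens, threshold = 0.75) {
  if (windowTokens <= 0) {
    throw new RangeError("windowTokens must be positive");
  }
  return usedTokens / windowTokens >= threshold;
}
```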
Workflow Efficiency
Batch Operations
Parallel Tool Execution
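// Run independent operations concurrently rather than one after another.
// readFile, grepPattern, and runTests are placeholders for async helpers.
const results = await Promise.all([
  readFile('src/component.tsx'),
  grepPattern('useState', '**/*.tsx'),
  runTests('component.test.tsx')
]);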
Bulk File Operations
# Process multiple files efficiently
claude-code batch-edit \
--pattern="**/*.tsx" \
--operation="add-prop-types" \
--parallel=true
Workflow Caching
Result Caching
{
"caching": {
"tool_results": {
"ttl": 300,
"max_size": "100MB"
},
"file_contents": {
"ttl": 60,
"watch_changes": true
},
"search_results": {
"ttl": 600,
"invalidate_on": ["file_changes"]
}
}
}
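The `ttl` values above describe time-based expiry. As a sketch of that mechanism, the class below stores an expiry timestamp with each entry and treats expired entries as misses. `TtlCache` is a hypothetical name, and the injectable clock is an assumption added so expiry can be exercised without waiting.

```javascript
// Minimal TTL cache sketch. The clock is injectable for testing.
class TtlCache {
  constructor(ttlSeconds, now = () => Date.now()) {
    this.ttlMs = ttlSeconds * 1000;
    this.now = now;
    this.entries = new Map();
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }

  // Explicit invalidation, for example when a watched file changes.
  invalidate(key) {
    return this.entries.delete(key);
  }
}
```

This sketch does not enforce a size limit such as `max_size`; that would need an eviction policy on top.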
Intelligent Invalidation
# Cache with dependencies
claude-code cache set "test-results" \
--depends-on="src/**/*.ts,tests/**/*.ts" \
--ttl=3600
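Dependency-based invalidation can be sketched by storing a fingerprint of the dependencies alongside the cached value and comparing it on lookup. In this hypothetical example, `versions` maps each dependency path to some version marker (a modification time or content hash, say); the class and function names are illustrative only.

```javascript
// Hypothetical sketch: a cached value stays valid only while the versions
// of its dependencies are unchanged.
function fingerprint(versions) {
  // Sort by path so the result does not depend on key order.
  return Object.keys(versions)
    .sort()
    .map((path) => `${path}@${versions[path]}`)
    .join("|");
}

class DependencyCache {
  constructor() {
    this.entries = new Map();
  }

  set(key, value, versions) {
    this.entries.set(key, { value, stamp: fingerprint(versions) });
  }

  get(key, currentVersions) {
    const entry = this.entries.get(key);
    if (!entry || entry.stamp !== fingerprint(currentVersions)) {
      return undefined; // missing, or a dependency changed
    }
    return entry.value;
  }
}
```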
Response Optimization
Concise Communication
Response Templates
{
"response_templates": {
"code_change": "Changed {{file}}:{{line}} - {{description}}",
"test_result": "{{status}} - {{passed}}/{{total}} tests passed",
"build_status": "Build {{status}} in {{duration}}ms"
}
}
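Filling a template like those above is a matter of substituting each `{{name}}` placeholder. A minimal sketch follows; `renderTemplate` is a hypothetical helper, and leaving unknown placeholders untouched is a choice made for this example.

```javascript
// Replace {{name}} placeholders with values; unknown names are left as-is.
function renderTemplate(template, values) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(values, name)
      ? String(values[name])
      : match
  );
}

const message = renderTemplate(
  "{{status}} - {{passed}}/{{total}} tests passed",
  { status: "PASS", passed: 12, total: 12 }
);
// message is "PASS - 12/12 tests passed"
```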
Structured Outputs
# Request structured responses
claude-code --format=json "Analyze code quality"
claude-code --format=table "Compare framework options"
claude-code --format=checklist "Review deployment readiness"
Progressive Enhancement
Layered Information
Level 1: Direct answer to question
Level 2: Implementation details (on request)
Level 3: Alternative approaches (on request)
Level 4: Deep technical analysis (on request)
Efficiency Measurement
Performance Metrics
Response Time Tracking
# Enable performance monitoring
claude-code --performance-mode
# View session metrics
claude-code metrics --session=current
# Historical performance data
claude-code metrics --period=7d --breakdown=model
Token Usage Analysis
{
"token_metrics": {
"input_tokens": 1500,
"output_tokens": 800,
"cached_tokens": 300,
"efficiency_ratio": 0.73
}
}
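The guide does not define `efficiency_ratio`. One reading that happens to reproduce the sample value is (output + cached) / input, which gives 1100 / 1500, or about 0.73. The sketch below encodes that reading purely as an assumption; treat the formula and the `efficiencyRatio` name as illustrative, not as how any tool computes the metric.

```javascript
// ASSUMPTION: efficiency_ratio is not defined in the source. This formula,
// (output + cached) / input, is only one reading that matches the sample.
function efficiencyRatio({ input_tokens, output_tokens, cached_tokens }) {
  if (input_tokens <= 0) {
    throw new RangeError("input_tokens must be positive");
  }
  return (output_tokens + cached_tokens) / input_tokens;
}
```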
Optimization Recommendations
# Get optimization suggestions
claude-code optimize --analyze-session
# Sample output:
# ✅ Context usage: 68% (good)
# ⚠️ Model selection: 23% of queries could use Haiku
# ❌ Redundant reads: 5 files read multiple times
# 💡 Suggestion: Enable context caching
Advanced Optimization Techniques
Context Compression
Semantic Compression
{
"compression": {
"method": "semantic",
"preserve": ["code_structure", "error_messages"],
"compress": ["verbose_logs", "repeated_content"],
"ratio": 0.3
}
}
Smart Summarization
# Compress conversation history
claude-code --compress-history --keep-decisions
# Maintain key information
claude-code --summarize --preserve="architecture,constraints"
Workflow Design Patterns
Lazy Loading
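// Load expensive data only when needed, and only once
const lazy = (load) => {
  let cached; // promise created by the first call
  return () => (cached ??= Promise.resolve().then(load));
};
const getData = lazy(() => expensiveDataFetch());
// Use data when required (inside an async function or ES module)
if (needsData) {
  const data = await getData();
}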
Resource Pooling
// Pool expensive resources
const agentPool = new Pool({
create: () => new SpecializedAgent(),
max: 5,
min: 1
});
// Reuse agents efficiently; release the agent even if processing throws
const agent = await agentPool.acquire();
let result;
try {
  result = await agent.process(task);
} finally {
  agentPool.release(agent);
}
Benchmarking & Profiling
Performance Baselines
# Establish performance baselines
claude-code benchmark \
--tasks="code-generation,file-analysis,test-creation" \
--iterations=10 \
--output=baseline.json
Continuous Monitoring
{
"monitoring": {
"response_time": {
"target": "< 2s",
"alert_threshold": "5s"
},
"token_efficiency": {
"target": "> 0.7",
"alert_threshold": "< 0.5"
},
"context_usage": {
"target": "< 80%",
"alert_threshold": "> 95%"
}
}
}
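Threshold strings such as "< 2s" or "> 95%" need to be parsed before a metric can be compared against them. The following is a rough sketch of that step. `parseThreshold` and `satisfies` are hypothetical names, and treating a bare value like "5s" as "greater than" is an assumption, since the configuration above gives no operator for it.

```javascript
// Hypothetical parser for threshold strings such as "< 2s", "> 0.7", "> 95%".
// A bare value like "5s" has no operator, so the caller supplies a default.
function parseThreshold(text, defaultOperator = ">") {
  const match = /^\s*([<>]=?)?\s*(\d+(?:\.\d+)?)\s*(s|%)?\s*$/.exec(text);
  if (!match) throw new SyntaxError(`Unrecognized threshold: ${text}`);
  return {
    operator: match[1] || defaultOperator,
    value: Number(match[2]),
    unit: match[3] || "",
  };
}

// True when `value` satisfies the comparison described by `threshold`.
function satisfies(value, threshold) {
  switch (threshold.operator) {
    case "<": return value < threshold.value;
    case "<=": return value <= threshold.value;
    case ">": return value > threshold.value;
    case ">=": return value >= threshold.value;
    default: throw new Error(`Unsupported operator: ${threshold.operator}`);
  }
}
```

With these helpers, an alert would fire when a measured value satisfies the `alert_threshold` expression, and a target is met when the value satisfies the `target` expression. Values are expected in the same unit as the threshold (seconds, percent, or a plain ratio).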
A/B Testing
# Test optimization strategies
claude-code ab-test \
--strategy-a="current" \
--strategy-b="optimized" \
--metric="response_time" \
--duration=1h
Resource Management
Memory Optimization
Smart Garbage Collection
{
"memory_management": {
"gc_interval": 300000,
"max_heap": "512MB",
"cache_cleanup": {
"unused_for": "1h",
"size_threshold": "50MB"
}
}
}
Context Pruning
# Automatic context pruning
claude-code --auto-prune-context \
--keep-recent=10 \
--keep-important=true \
--prune-duplicates=true
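The three pruning options above can be combined in a single pass. This sketch assumes messages are ordered oldest first and shaped as `{ text, important }`; both the shape and the `pruneContext` function are hypothetical, and "duplicate" here means exact text equality.

```javascript
// Hypothetical pruning pass over conversation messages (oldest first).
// Keeps the most recent N and anything flagged important, minus exact duplicates.
function pruneContext(
  messages,
  { keepRecent = 10, keepImportant = true, pruneDuplicates = true } = {}
) {
  const recentStart = Math.max(0, messages.length - keepRecent);
  const seen = new Set();
  const kept = [];
  // Walk newest to oldest so the most recent copy of a duplicate survives.
  for (let i = messages.length - 1; i >= 0; i--) {
    const message = messages[i];
    const retained =
      i >= recentStart || (keepImportant && Boolean(message.important));
    if (!retained) continue;
    if (pruneDuplicates && seen.has(message.text)) continue;
    seen.add(message.text);
    kept.push(message);
  }
  return kept.reverse(); // restore chronological order
}
```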
CPU Optimization
Parallel Processing
{
"parallel_processing": {
"max_workers": 4,
"queue_size": 100,
"task_distribution": "round_robin"
}
}
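Round-robin distribution assigns task i to worker i mod N. The sketch below shows that assignment for the `max_workers` value above; `distributeRoundRobin` is a hypothetical helper and does not model the queue size limit.

```javascript
// Round-robin distribution: task i goes to worker (i mod workerCount).
function distributeRoundRobin(tasks, workerCount = 4) {
  if (!Number.isInteger(workerCount) || workerCount < 1) {
    throw new RangeError("workerCount must be a positive integer");
  }
  const queues = Array.from({ length: workerCount }, () => []);
  tasks.forEach((task, index) => {
    queues[index % workerCount].push(task);
  });
  return queues;
}
```

Round-robin balances task counts, not task durations, so it works best when tasks are roughly equal in cost.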
Process Prioritization
# Set process priorities
claude-code --priority=high "Critical bug fix"
claude-code --priority=low "Code documentation"
Network Optimization
Request Batching
// Batch API requests
const batchRequests = [
{ tool: 'read', params: { file: 'a.ts' } },
{ tool: 'read', params: { file: 'b.ts' } },
{ tool: 'grep', params: { pattern: 'error' } }
];
await executeBatch(batchRequests);
Connection Management
{
"network": {
"max_connections": 10,
"connection_timeout": 30000,
"retry_strategy": {
"max_retries": 3,
"backoff": "exponential"
}
}
}
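Exponential backoff doubles the wait before each successive retry. The sketch below computes the delay schedule and wraps an async operation with it. The base delay, the cap, and the function names are assumptions for illustration; only `max_retries: 3` and the exponential strategy come from the configuration above.

```javascript
// Delay before each retry: baseMs * 2^attempt, capped at maxMs.
function backoffDelays(maxRetries = 3, baseMs = 500, maxMs = 30000) {
  return Array.from({ length: maxRetries }, (_, attempt) =>
    Math.min(baseMs * 2 ** attempt, maxMs)
  );
}

// Retry an async operation: one initial attempt plus up to maxRetries retries.
async function withRetry(operation, { maxRetries = 3, baseMs = 500, maxMs = 30000 } = {}) {
  const delays = backoffDelays(maxRetries, baseMs, maxMs);
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (attempt >= delays.length) throw error; // retries exhausted
      await new Promise((resolve) => setTimeout(resolve, delays[attempt]));
    }
  }
}
```

Production implementations often add random jitter to each delay so that many clients do not retry in lockstep; that is omitted here for clarity.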
Optimization Checklist
Pre-Session Optimization
- Choose appropriate model for task complexity
- Configure context management settings
- Enable relevant caching
- Set up performance monitoring
During Session
- Use batch operations when possible
- Minimize redundant file reads
- Leverage context compression
- Monitor token usage
Post-Session Analysis
- Review performance metrics
- Identify optimization opportunities
- Update configurations based on learnings
- Share insights with team
Performance Monitoring Dashboard
Key Metrics
{
"dashboard": {
"response_times": {
"p50": "1.2s",
"p95": "3.1s",
"p99": "5.4s"
},
"token_usage": {
"avg_input": 1200,
"avg_output": 650,
"efficiency": 0.71
},
"workflow_success": {
"completion_rate": "94%",
"error_rate": "6%"
}
}
}
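Percentile figures such as p50, p95, and p99 can be derived from raw response times. The sketch below uses the nearest-rank method, which is one of several common definitions; the `percentile` function is illustrative and not tied to any particular monitoring tool.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new RangeError("samples must not be empty");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  // Multiply before dividing to avoid floating-point drift in the rank.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}
```

With only a handful of samples, high percentiles collapse onto the maximum, so p95 and p99 become meaningful only once enough requests have been recorded.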
Alerts and Notifications
# Configure performance alerts
claude-code alerts set \
--metric="response_time" \
--threshold="5s" \
--action="slack-notify"
# Email reports
claude-code reports schedule \
--frequency=weekly \
--metrics="all" \
--email="team@company.com"
Next Steps
- Review Advanced Workflows
- Explore Hooks Framework
- Learn about Configuration Options