Kilo Code: Integrating with Qwen Code CLI
In our previous posts, we covered the basics of Kilo Code and how to get it installed on your machine. One of the most powerful features of Kilo Code is its ability to integrate with various AI providers.
Kilo Code Deep Dive Series
This comprehensive series covers Kilo Code (kiro.dev) - the AI-first agentic development platform:
- Part 1: Introduction to Agentic Development - Understanding agents, skills, rules, and workflows
- Part 2: Installation and Setup Guide - Kiro IDE, CLI, and VSCode/JetBrains extensions
- Part 3: Qwen Code CLI Integration - 1M token context with free tier
- Part 4: Understanding Modes and Orchestrator - Specialized agent personas for different tasks
- Part 5: Codebase Indexing with Qdrant - Semantic search across your repository
- Part 6: Spec-Driven Development (SDD) - Structured approach to complex features
- Part 7: Steering and Custom Agents - Persistent instructions and specialized agents
- Part 8: Advanced MCP Integration - Connect to GitHub, filesystem, and external tools
- Part 9: Skills - Extending Agent Capabilities - Create reusable expertise packages
- Part 10: Parallel Agents and Agent Manager - Multi-task workflows with Git worktrees
- Part 11: Checkpoints - Your AI Safety Net - Automatic snapshots and rollback for AI changes
- Part 12: Mastering Codebase Indexing - Semantic search and AI context configuration
Today, we’re focusing on a particularly high-value integration: Qwen Code CLI. By linking these two tools, you can unlock a massive 1 million token context window and a generous free tier of 2,000 requests per day.
Why Qwen Code CLI?
Qwen3-Coder is Alibaba’s state-of-the-art open-weight coding model. Here’s why it’s the perfect companion for Kilo Code:
| Feature | Qwen Code CLI | Alternatives |
|---|---|---|
| Free Tier | 2,000 requests/day | 50-100 requests/day |
| Context Window | 1M tokens | 100K-200K tokens |
| Cost | Free (with API key) | $10-30/month |
| Code Quality | SOTA for open weights | Proprietary models |
| Latency | ~2-5 seconds | ~5-15 seconds |
What You’ll Get
- 2,000 free requests per day (resets daily)
- 1 million token context (entire codebases in one prompt)
- Specialized coding models (Qwen3-Coder-32B, Qwen3-Coder-Plus)
- Low latency responses optimized for code generation
- No credit card required for free tier
Step 1: Install Qwen Code CLI
Method 1: npm (Recommended)
npm install -g @qwen-code/qwen-code
Method 2: bun (Fastest)
bun install -g @qwen-code/qwen-code
Method 3: pip (Python)
pip install qwen-code-cli
Verify Installation
qwen-code --version
# Output: qwen-code version 1.0.0
Step 2: Get Your API Key
Option A: Free Tier (Recommended for Getting Started)
- Visit https://dashscope.aliyun.com
- Sign in with your Alibaba Cloud account (or create one)
- Navigate to API Keys in the dashboard
- Click Create New API Key
- Copy and save your key securely
Free Tier Limits:
- 2,000 requests per day
- 1M tokens per request
- Rate limit: 10 requests/minute
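To get a feel for how these two limits interact, here is a small worked calculation. It uses only the figures listed above; the helper names and the 8-hour working day are illustrative assumptions.

```python
# Free-tier figures quoted in the list above.
DAILY_REQUESTS = 2000
REQUESTS_PER_MINUTE = 10


def minutes_until_quota_exhausted(daily: int = DAILY_REQUESTS,
                                  per_minute: int = REQUESTS_PER_MINUTE) -> float:
    """Minutes of non-stop use at the rate limit before the daily quota runs out."""
    return daily / per_minute


def sustainable_requests_per_hour(daily: int = DAILY_REQUESTS,
                                  working_hours: int = 8) -> float:
    """Hourly budget if the quota is spread evenly over a working day (assumed 8 h)."""
    return daily / working_hours


if __name__ == "__main__":
    print(minutes_until_quota_exhausted())   # 200.0
    print(sustainable_requests_per_hour())   # 250.0
```

In other words, running flat out at the per-minute limit would use up the daily quota in a little over three hours, so long agentic sessions are more likely to hit the daily cap than the per-minute one.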
Option B: Pay-As-You-Go (For Heavy Usage)
If you exceed the free tier, usage is billed per token:
- $0.002 per 1K tokens (input)
- $0.006 per 1K tokens (output)
- Still significantly cheaper than GPT-4 or Claude
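As a rough guide, the per-token rates above can be turned into a per-request estimate. This is a minimal sketch using only the two prices quoted here; the function name is made up for illustration, and real billing may round or tier differently.

```python
# Pay-as-you-go rates quoted above, in USD per 1K tokens.
INPUT_RATE_PER_1K = 0.002
OUTPUT_RATE_PER_1K = 0.006


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of a single request."""
    input_cost = input_tokens / 1000 * INPUT_RATE_PER_1K
    output_cost = output_tokens / 1000 * OUTPUT_RATE_PER_1K
    return round(input_cost + output_cost, 6)


if __name__ == "__main__":
    # A 50K-token prompt with a 2K-token reply.
    print(estimate_cost(50_000, 2_000))      # 0.112
    # A prompt that fills the whole 1M-token window, ignoring output.
    print(estimate_cost(1_000_000, 0))       # 2.0
```

Note that at these rates a prompt that fills the entire 1M-token window costs about $2.00 in input alone, so large-context requests are where costs add up.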
Option C: Self-Hosted (For Enterprises)
Run Qwen locally with Ollama:
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# Pull Qwen model
ollama pull qwen2.5-coder:32b
# Qwen Code CLI will auto-detect Ollama
Step 3: Configure Qwen Code CLI
Initialize Configuration
qwen-code config init
This creates a config file at ~/.qwen-code/config.json:
{
  "api_key": "sk-your-api-key-here",
  "model": "qwen-coder-plus",
  "base_url": "https://dashscope.aliyuncs.com/api/v1",
  "timeout": 120,
  "max_tokens": 100000
}
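Before testing, it can help to sanity-check the file yourself. The sketch below is not part of Qwen Code CLI; it is a hypothetical helper that loads the JSON shown above and flags common mistakes. Treating `api_key`, `model`, and `base_url` as required is an assumption made for this example.

```python
import json
from pathlib import Path

# Assumption: these three keys from the example config are treated as required.
REQUIRED_KEYS = {"api_key", "model", "base_url"}


def load_config(path: Path) -> dict:
    """Read the JSON config file from disk."""
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


def check_config(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means it looks fine."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(config))]
    # The sample file ships with a placeholder key that must be replaced.
    if str(config.get("api_key", "")).startswith("sk-your-"):
        problems.append("api_key still holds the placeholder value")
    timeout = config.get("timeout", 120)
    if not isinstance(timeout, (int, float)) or timeout <= 0:
        problems.append("timeout must be a positive number")
    return problems


if __name__ == "__main__":
    config_path = Path.home() / ".qwen-code" / "config.json"
    if config_path.exists():
        for problem in check_config(load_config(config_path)):
            print(problem)
```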
Test Your Setup
qwen-code "Hello! Can you help me write Python code?"
Expected response:
Hello! I'm Qwen Code, your AI coding assistant. I'd be happy to help you write Python code. What would you like to build?
Step 4: Integrate with Kilo Code
In Kilo Code IDE
- Open Settings (Cmd+, or Ctrl+,)
- Navigate to Kilo Code → AI Provider
- Select Qwen Code CLI from the dropdown
- Enter the path to the qwen-code binary (usually auto-detected)
Settings JSON:
{
  "kiloCode.provider": "qwen-code",
  "kiloCode.qwenCode.path": "/usr/local/bin/qwen-code",
  "kiloCode.qwenCode.model": "qwen-coder-plus",
  "kiloCode.qwenCode.maxTokens": 100000
}
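If auto-detection fails and you are unsure what to put in the path setting, you can look the binary up yourself. This is a small sketch using Python's standard `shutil.which`; the `find_binary` helper is hypothetical and is not how Kilo Code itself necessarily detects the path.

```python
import shutil
from pathlib import Path
from typing import Optional


def find_binary(name: str = "qwen-code", configured: Optional[str] = None) -> Optional[str]:
    """Return a usable path to the binary, or None if it cannot be found.

    A configured path wins if it points at an existing file; otherwise
    fall back to searching the directories on PATH.
    """
    if configured and Path(configured).is_file():
        return configured
    return shutil.which(name)


if __name__ == "__main__":
    path = find_binary(configured="/usr/local/bin/qwen-code")
    print(path or "qwen-code not found; check your PATH or the install step")
```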
In Kilo Code CLI
# Set Qwen as default provider
kilo-code config set provider qwen-code
# Configure model
kilo-code config set model qwen-coder-plus
# Set API key (or use environment variable)
kilo-code config set api_key sk-your-api-key-here
Alternative: Environment Variables
Add to your ~/.zshrc or ~/.bashrc:
export QWEN_CODE_API_KEY="sk-your-api-key-here"
export QWEN_CODE_MODEL="qwen-coder-plus"
export QWEN_CODE_MAX_TOKENS="100000"
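If you write your own scripts around this setup, the same variables can be read in a few lines. The sketch below is an illustration only: the `resolve_settings` helper is hypothetical, and the fallback defaults simply mirror the values exported above.

```python
import os
from typing import Mapping

# Assumption: fall back to the values used throughout this post.
DEFAULTS = {"model": "qwen-coder-plus", "max_tokens": 100000}


def resolve_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Build a settings dict from QWEN_CODE_* environment variables."""
    api_key = env.get("QWEN_CODE_API_KEY", "")
    if not api_key:
        raise RuntimeError("QWEN_CODE_API_KEY is not set")
    return {
        "api_key": api_key,
        "model": env.get("QWEN_CODE_MODEL", DEFAULTS["model"]),
        # Environment values are strings, so convert the token limit explicitly.
        "max_tokens": int(env.get("QWEN_CODE_MAX_TOKENS", DEFAULTS["max_tokens"])),
    }


if __name__ == "__main__":
    try:
        settings = resolve_settings()
        print(settings["model"], settings["max_tokens"])
    except RuntimeError as err:
        print(err)
```

Passing the environment in as a parameter keeps the function easy to test without touching your real shell configuration.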
In VS Code Extension
- Open Command Palette (Cmd+Shift+P or Ctrl+Shift+P)
- Type “Kilo Code: Configure Provider”
- Select Qwen Code CLI
- Follow the setup wizard
Step 5: Test the Integration
Quick Test
Open a project folder and try:
In Kilo Code IDE:
Press Cmd+K and type:
"Create a simple REST API with Express.js"
In Kilo Code CLI:
cd ~/projects/test-app
kilo-code "Create a Python Flask app with user authentication"
Expected Behavior
- Kilo Code sends your request to Qwen Code CLI
- Qwen analyzes your codebase context
- Response appears in the chat/terminal
- Code suggestions are generated
- You can accept, reject, or modify the suggestions
Sample Session
$ kilo-code "Create a function that validates email addresses"
🤖 Kilo Code (powered by Qwen):
I'll create an email validation function with comprehensive checks.
📝 Proposed changes:
--- lib/email_validator.py ---
import re

def validate_email(email: str) -> tuple[bool, str]:
    """
    Validate an email address.

    Returns:
        tuple: (is_valid, error_message)
    """
    if not email:
        return False, "Email is required"
    if len(email) > 254:
        return False, "Email too long"
    pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    if not re.match(pattern, email):
        return False, "Invalid email format"
    return True, "Valid email"
---
Accept these changes? (y/n/diff)
Advanced Configuration
Model Selection
Qwen offers multiple models for different use cases:
| Model | Best For | Context | Speed |
|---|---|---|---|
| qwen-coder-32b | General coding tasks | 256K | Fast |
| qwen-coder-plus | Complex reasoning | 1M | Medium |
| qwen-coder-flash | Quick completions | 128K | Very Fast |
Change model:
kilo-code config set model qwen-coder-plus
Context Window Tuning
For large codebases:
{
  "qwenCode": {
    "maxContextTokens": 1000000,
    "indexingStrategy": "semantic",
    "chunkSize": 4096,
    "overlapSize": 512
  }
}
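To see what `chunkSize` and `overlapSize` mean in practice, here is a minimal sketch of overlapping chunking. It illustrates the general technique, not Kilo Code's actual indexer; the function name is hypothetical and the "tokens" are just list items.

```python
def chunk_with_overlap(tokens: list, chunk_size: int = 4096, overlap: int = 512) -> list:
    """Split tokens into chunks where each chunk repeats the tail of the previous one."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the input
    return chunks


if __name__ == "__main__":
    chunks = chunk_with_overlap(list(range(10_000)))
    print([len(c) for c in chunks])              # [4096, 4096, 2832]
    print(chunks[0][-512:] == chunks[1][:512])   # True
```

With the values above, each window advances by 3,584 tokens (4096 minus 512), so a larger overlap preserves more context across chunk boundaries at the cost of indexing more chunks.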
Rate Limiting
To stay under the free tier's limit of 10 requests/minute, cap the client slightly below it:
{
  "qwenCode": {
    "requestsPerMinute": 8,
    "retryDelay": 5000,
    "maxRetries": 3
  }
}
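The behaviour these three settings describe can be sketched in plain Python. This is an illustration of client-side throttling and fixed-delay retries, not Kilo Code's implementation. It assumes `retryDelay` is in milliseconds, and the `Throttle` class, `call_with_retries` function, and use of `RuntimeError` for a failed request are all hypothetical.

```python
import time


class Throttle:
    """Space calls out so that no more than requests_per_minute are made."""

    def __init__(self, requests_per_minute: int = 8,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / requests_per_minute  # 7.5 s at 8 requests/minute
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self) -> None:
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def call_with_retries(send, max_retries: int = 3,
                      retry_delay_ms: int = 5000, sleep=time.sleep):
    """Call send(); on RuntimeError, wait and retry up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RuntimeError:
            if attempt == max_retries:
                raise  # out of retries, surface the error
            sleep(retry_delay_ms / 1000)
```

At 8 requests per minute the client waits at least 7.5 seconds between calls, which leaves some headroom under the 10 requests/minute limit.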
Troubleshooting
Issue: “API Key Not Found”
# Check if API key is set
echo $QWEN_CODE_API_KEY
# If empty, set it:
export QWEN_CODE_API_KEY="sk-your-key"
# Or update config:
qwen-code config set api_key sk-your-key
Issue: “Model Not Available”
# List available models
qwen-code models list
# Update to available model:
qwen-code config set model qwen-coder-32b
Issue: “Rate Limit Exceeded”
// Reduce request frequency
{
  "qwenCode": {
    "requestsPerMinute": 5,
    "enableCaching": true
  }
}
Issue: “Context Window Exceeded”
// Reduce context size
{
  "qwenCode": {
    "maxContextTokens": 500000,
    "smartContextTrimming": true
  }
}
Cost Optimization Tips
1. Use the Right Model
- qwen-coder-flash: Quick edits, completions (cheapest)
- qwen-coder-32b: Standard development work
- qwen-coder-plus: Complex architecture, debugging (most expensive)
2. Enable Caching
{
  "qwenCode": {
    "enableCache": true,
    "cacheDir": "~/.cache/qwen-code",
    "cacheTTL": 3600
  }
}
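Caching saves requests because an identical prompt within the TTL is answered locally instead of hitting the API. The sketch below shows the idea with an in-memory cache. It is an illustration only: the `TTLCache` class is hypothetical, it assumes `cacheTTL` is in seconds, and it does not write to `cacheDir` as a real implementation presumably would.

```python
import hashlib
import time


class TTLCache:
    """Keep responses for ttl_seconds, keyed by a hash of model and prompt."""

    def __init__(self, ttl_seconds: int = 3600, clock=time.time):
        self.ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    @staticmethod
    def key_for(model: str, prompt: str) -> str:
        """Derive a stable cache key from the model name and the prompt text."""
        return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()

    def get(self, key: str):
        """Return the cached value, or None if it is missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self.ttl:
            del self._entries[key]  # expired, drop it
            return None
        return value

    def put(self, key: str, value) -> None:
        """Store a value together with the time it was cached."""
        self._entries[key] = (self._clock(), value)
```

A typical use is to call `get` before sending a request and `put` after receiving the response, so only cache misses count against the daily quota.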
3. Limit Context
Only index relevant files:
{
  "qwenCode": {
    "indexing": {
      "include": ["src/**/*", "lib/**/*"],
      "exclude": ["node_modules/**", "dist/**", "*.min.js"]
    }
  }
}
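If you want to preview which files such patterns would select, the sketch below applies them to a few sample paths. It rests on two assumptions that may not match Kilo Code's real matcher: exclude rules win over include rules, and a pattern without a slash (such as `*.min.js`) is tested against the file name only. All function names are hypothetical.

```python
import re
from pathlib import PurePosixPath

INCLUDE = ["src/**/*", "lib/**/*"]
EXCLUDE = ["node_modules/**", "dist/**", "*.min.js"]


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob to a regex: ** may cross directories, * may not."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more directory levels
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*")        # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")     # anything within one path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def matches(path: str, pattern: str) -> bool:
    # Assumption: patterns without "/" apply to the file name only.
    target = path if "/" in pattern else PurePosixPath(path).name
    return glob_to_regex(pattern).fullmatch(target) is not None


def is_indexed(path: str) -> bool:
    # Assumption: exclude rules take precedence over include rules.
    if any(matches(path, p) for p in EXCLUDE):
        return False
    return any(matches(path, p) for p in INCLUDE)


if __name__ == "__main__":
    for sample in ["src/utils/date.ts", "src/vendor/jquery.min.js",
                   "node_modules/react/index.js", "README.md"]:
        print(sample, is_indexed(sample))
```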
4. Monitor Usage
# Check daily usage
qwen-code usage
# Output:
# Today's usage: 847 / 2000 requests
# Tokens used: 45.2M / 100M
# Estimated cost: $0.00 (free tier)
What’s Next?
Now that you have Qwen Code CLI integrated, you’re ready to explore Kilo Code Modes - specialized personas for different development tasks.
Coming up: Kilo Code Series #4: Understanding Modes and the Orchestrator
In the next post, we’ll cover:
- Orchestrator mode for task coordination
- Architect mode for system design
- Code mode for implementation
- Debug mode for troubleshooting
- Creating custom modes
Quick Reference
Environment Variables
export QWEN_CODE_API_KEY="sk-..."
export QWEN_CODE_MODEL="qwen-coder-plus"
export QWEN_CODE_MAX_TOKENS="100000"
export QWEN_CODE_BASE_URL="https://dashscope.aliyuncs.com/api/v1"
Configuration Files
| Location | Purpose |
|---|---|
| ~/.qwen-code/config.json | Qwen CLI config |
| .kilocode/config.json | Project-specific config |
| ~/.kilo-code/settings.json | Kilo Code IDE settings |
Useful Commands
# Test Qwen installation
qwen-code "Hello"
# Check usage
qwen-code usage
# List models
qwen-code models list
# Change model
qwen-code config set model qwen-coder-plus
# In Kilo Code
kilo-code config get provider
kilo-code config set model qwen-coder-plus