Kilo Code: Codebase Indexing with Nomic and Qdrant

March 29, 2026

In our previous post, we explored the different modes of Kilo Code. While these modes are powerful, their effectiveness depends on the quality of the context they can access.

Kilo Code Deep Dive Series

This comprehensive series covers Kilo Code (kiro.dev) - the AI-first agentic development platform:

Part 1: Introduction to Agentic Development - Understanding agents, skills, rules, and workflows
Part 2: Installation and Setup Guide - Kiro IDE, CLI, and VSCode/JetBrains extensions
Part 3: Qwen Code CLI Integration - 1M token context with free tier
Part 4: Understanding Modes and Orchestrator - Specialized agent personas for different tasks
Part 5: Codebase Indexing with Qdrant - Semantic search across your repository
Part 6: Spec-Driven Development (SDD) - Structured approach to complex features
Part 7: Steering and Custom Agents - Persistent instructions and specialized agents
Part 8: Advanced MCP Integration - Connect to GitHub, filesystem, and external tools
Part 9: Skills - Extending Agent Capabilities - Create reusable expertise packages
Part 10: Parallel Agents and Agent Manager - Multi-task workflows with Git worktrees
Part 11: Checkpoints - Your AI Safety Net - Automatic snapshots and rollback for AI changes
Part 12: Mastering Codebase Indexing - Semantic search and AI context configuration

Figure: Codebase indexing enables AI to understand your entire project semantically.

This is where Codebase Indexing comes in. Instead of just searching for keywords, Kilo Code can perform semantic search, allowing the agent to understand the “meaning” of your code across your entire project.

Why Codebase Indexing Matters

Traditional search tools like grep or VS Code’s search match text, not meaning. They find a literal string or pattern, so code that does the same job under a different name goes unnoticed:

# grep finds exact matches only
grep -r "validateEmail" src/

# Misses: validate_email, emailValidation, checkEmail, etc.

With semantic search, Kilo Code understands that these are related concepts:

User query: "How do we validate email addresses?"

Semantic search finds:
- validateEmail() function
- email_validation.py module
- EmailValidator class
- check_email_format() utility

This is achieved through vector embeddings: numerical representations of code that capture semantic meaning, so that related snippets end up close together in vector space.
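To make "capturing semantic meaning" a little more concrete: embeddings are typically compared by the angle between them, using cosine similarity. The sketch below uses tiny hand-made 3-dimensional vectors as stand-ins for real embeddings. The vectors and the names `TOY_VECTORS` and `rank_by_similarity` are illustrative assumptions, not output from any model or part of Kilo Code.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)

# Hand-made toy vectors (NOT real embeddings), chosen so that the two
# email-related names point in a similar direction.
TOY_VECTORS = {
    "validateEmail": [0.9, 0.1, 0.0],
    "check_email_format": [0.8, 0.2, 0.1],
    "renderNavbar": [0.0, 0.1, 0.9],
}

def rank_by_similarity(query_vector, vectors):
    """Sort names from most to least similar to the query vector."""
    scored = [(name, cosine_similarity(query_vector, v)) for name, v in vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With a query vector pointing along the first axis, the two email-related entries rank ahead of `renderNavbar`, which is the behaviour a real embedding model aims for at much higher dimensionality.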

Architecture Overview

Kilo Code’s indexing system consists of three components:

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Code Files    │────▶│  Embedding      │────▶│   Qdrant        │
│   (Your Repo)   │     │  Model          │     │   Vector DB     │
│                 │     │  (nomic-embed)  │     │                 │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                               │
                               ▼
                        ┌─────────────────┐
                        │   Ollama        │
                        │   (Local AI)    │
                        └─────────────────┘

Components:

| Component | Purpose | Options |
|---|---|---|
| Embedding Model | Converts code to vectors | nomic-embed-text, mxbai-embed-large |
| Vector Database | Stores and searches embeddings | Qdrant (local), Chroma, Weaviate |
| Runtime | Runs embedding model locally | Ollama, LM Studio |
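The flow in the diagram (files, then embeddings, then the vector store) can be sketched in a few lines. This is a toy model of the data flow only: `InMemoryVectorStore`, `split_on_blank_lines`, and `fake_embed` are hypothetical stand-ins so the example runs without Ollama or Qdrant, and none of it reflects Kilo Code's internal implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class InMemoryVectorStore:
    """Stand-in for the vector database: keeps vector/payload pairs in a list."""
    points: List[dict] = field(default_factory=list)

    def upsert(self, vector: List[float], payload: dict) -> None:
        self.points.append({"vector": vector, "payload": payload})

def index_files(
    files: Dict[str, str],
    chunker: Callable[[str], List[str]],
    embed: Callable[[str], List[float]],
    store: InMemoryVectorStore,
) -> int:
    """Run each file through chunk -> embed -> store and return the chunk count."""
    count = 0
    for path, source in files.items():
        for chunk in chunker(source):
            store.upsert(embed(chunk), {"path": path, "text": chunk})
            count += 1
    return count

def split_on_blank_lines(source: str) -> List[str]:
    """Toy chunker: one chunk per blank-line-separated block."""
    return [part.strip() for part in source.split("\n\n") if part.strip()]

def fake_embed(text: str) -> List[float]:
    """NOT a real embedding: just the text length and the line count."""
    return [float(len(text)), float(text.count("\n") + 1)]
```

Keeping the chunker and embedder as injected functions mirrors the table above: each component can be swapped without touching the others.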

Step 1: Install Ollama

Ollama is a straightforward way to run embedding models locally.

macOS

# Install with Homebrew
brew install ollama

# Or download from https://ollama.com

Linux

# Official install script
curl -fsSL https://ollama.com/install.sh | sh

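# Or use a package manager, if your distribution packages Ollama
# (availability varies by distribution and release; the install script above is the documented route)
sudo apt install ollama  # Ubuntu/Debian
sudo dnf install ollama  # Fedora/RHEL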

Windows

# Download installer from https://ollama.com
# Or use winget
winget install Ollama.Ollama

Verify Installation

ollama --version
# Output: ollama version 0.5.x

ollama serve &
# Starts the Ollama server (port 11434)

Step 2: Download Embedding Model

Kilo Code recommends nomic-embed-text for code indexing:

# Pull the embedding model
ollama pull nomic-embed-text

# Alternative: mxbai-embed-large (larger vectors, but a shorter input limit; see the comparison below)
ollama pull mxbai-embed-large

Model Comparison

| Model | Dimensions | Max Tokens | Speed | Quality |
|---|---|---|---|---|
| `nomic-embed-text` | 768 | 8192 | Fast | Good |
| `mxbai-embed-large` | 1024 | 512 | Medium | Better |
| `all-minilm` | 384 | 512 | Very Fast | Basic |

Recommendation: Use nomic-embed-text for most projects. Of the three, it offers a good balance of speed and quality, and its 8192-token input limit leaves plenty of room for larger chunks.
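One practical consequence of the Max Tokens column: a chunk longer than the model's input limit may be truncated or rejected. A small guard like the one below can catch a mismatched configuration early. The limits are copied from the table; `check_chunk_size` is a hypothetical helper, and it assumes the chunker and the model count tokens in roughly the same way, which is not guaranteed.

```python
# Limits copied from the comparison table above
MODEL_LIMITS = {
    "nomic-embed-text": {"dimensions": 768, "max_tokens": 8192},
    "mxbai-embed-large": {"dimensions": 1024, "max_tokens": 512},
    "all-minilm": {"dimensions": 384, "max_tokens": 512},
}

def check_chunk_size(model: str, chunk_size: int) -> int:
    """Return the remaining token headroom, or raise if the chunk cannot fit."""
    if model not in MODEL_LIMITS:
        raise ValueError(f"unknown model: {model}")
    limit = MODEL_LIMITS[model]["max_tokens"]
    if chunk_size > limit:
        raise ValueError(
            f"chunk size {chunk_size} exceeds the {limit}-token limit of {model}"
        )
    return limit - chunk_size
```

For example, a 512-token chunk fits `nomic-embed-text` with room to spare, while a 1024-token chunk would exceed the limit listed for `mxbai-embed-large`.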

Test the Model

# Generate embeddings for test text
curl http://localhost:11434/api/embeddings -d '{
  "model": "nomic-embed-text",
  "prompt": "function validateEmail(email) { return email.includes(\"@\"); }"
}'

# Returns a 768-dimensional vector
{"embedding": [0.0234, -0.0156, 0.0891, ...]}
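If you would rather script this check, the same request can be made from Python's standard library. This is a minimal sketch that mirrors the curl call above (same endpoint, same `model` and `prompt` fields); the helper names are made up for illustration, and `embed` needs a running Ollama server.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # same endpoint as the curl example

def build_embedding_request(prompt: str, model: str = "nomic-embed-text") -> urllib.request.Request:
    """Build the POST request without sending it."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def embed(prompt: str, model: str = "nomic-embed-text") -> list:
    """Send the request and return the embedding (requires a running Ollama server)."""
    request = build_embedding_request(prompt, model)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["embedding"]
```

Splitting request construction from sending keeps the first half testable offline. With the server running, `len(embed("hello"))` should match the dimension listed for the model in the comparison table.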

Step 3: Install Qdrant

Qdrant is a vector similarity search engine that stores your code embeddings.

Option A: Docker (Recommended)

# Pull Qdrant image
docker pull qdrant/qdrant

# Run Qdrant
docker run -d \
  -p 6333:6333 \
  -p 6334:6334 \
  -v "$(pwd)/qdrant_storage:/qdrant/storage" \
  qdrant/qdrant

Ports:

6333: REST API
6334: gRPC API (faster for large datasets)

Option B: Local Binary

macOS:

brew install qdrant
qdrant

Linux:

# Download binary
wget https://github.com/qdrant/qdrant/releases/latest/download/qdrant-x86_64-unknown-linux-gnu.tar.gz
tar -xzf qdrant-*.tar.gz
./qdrant

Windows:

# Download from GitHub releases
# https://github.com/qdrant/qdrant/releases

Verify Qdrant

# Check if Qdrant is running
curl http://localhost:6333/

# Expected response:
{"title":"qdrant - vector search engine","version":"1.x.x"}
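Once Qdrant responds, a collection is the container that will hold the vectors. An indexer would normally create it for you, but building the request by hand shows what Qdrant needs to know: the vector size and the distance metric. The sketch below targets Qdrant's REST endpoint `PUT /collections/{name}`. The choice of `Cosine` distance and the helper name are assumptions, since this article does not say which metric Kilo Code uses.

```python
import json
import urllib.request

QDRANT_URL = "http://localhost:6333"  # REST port from the Docker command above

def build_create_collection_request(
    name: str, vector_size: int = 768, distance: str = "Cosine"
) -> urllib.request.Request:
    """Build (but do not send) a request that creates a Qdrant collection."""
    body = json.dumps({"vectors": {"size": vector_size, "distance": distance}})
    return urllib.request.Request(
        f"{QDRANT_URL}/collections/{name}",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

def create_collection(name: str, vector_size: int = 768) -> dict:
    """Send the request (requires a running Qdrant instance)."""
    request = build_create_collection_request(name, vector_size)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

The vector size must match the embedding model: 768 for nomic-embed-text, per the comparison table.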

Step 4: Configure Kilo Code Indexing

Create Configuration File

Create .kilocode/indexing.json in your project root:

{
  "indexing": {
    "enabled": true,
    "provider": "qdrant",
    "embedding": {
      "model": "nomic-embed-text",
      "provider": "ollama",
      "endpoint": "http://localhost:11434"
    },
    "vectorStore": {
      "provider": "qdrant",
      "endpoint": "http://localhost:6333",
      "collectionName": "my-project-codebase"
    },
    "chunking": {
      "strategy": "code-aware",
      "chunkSize": 512,
      "overlap": 64
    },
    "filters": {
      "include": [
        "**/*.js",
        "**/*.ts",
        "**/*.py",
        "**/*.go",
        "**/*.rs",
        "**/*.java",
        "**/*.md"
      ],
      "exclude": [
        "**/node_modules/**",
        "**/dist/**",
        "**/build/**",
        "**/*.min.js",
        "**/*.bundle.js",
        "**/vendor/**",
        "**/.git/**",
        "**/test/**",
        "**/*.test.*",
        "**/*.spec.*"
      ]
    }
  }
}
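The `include` and `exclude` lists decide which files are indexed at all, so it helps to reason about how such globs behave. The sketch below is one plausible interpretation, not Kilo Code's actual matcher: a file is indexed if it matches any include pattern and no exclude pattern, `**/` matches zero or more directories, and `*` does not cross a `/`. Paths are assumed to be relative and to use forward slashes.

```python
import re
from typing import Iterable

def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a glob with ** support into an anchored regular expression."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")   # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")         # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")      # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")

def matches_any(path: str, patterns: Iterable[str]) -> bool:
    return any(glob_to_regex(p).match(path) for p in patterns)

def is_indexed(path: str, include: Iterable[str], exclude: Iterable[str]) -> bool:
    """A file is indexed if it matches an include and no exclude pattern."""
    return matches_any(path, include) and not matches_any(path, exclude)

# A subset of the patterns from the configuration above
INCLUDE = ["**/*.js", "**/*.ts", "**/*.py", "**/*.md"]
EXCLUDE = ["**/node_modules/**", "**/test/**", "**/*.test.*"]
```

Under these rules `src/app.js` is indexed, while `src/app.test.js` and anything under `node_modules/` are skipped. Note that excluding `**/test/**` also hides test code from search, which may or may not be what you want.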

Configuration Options Explained

Embedding Settings:

| Option | Description | Default |
|---|---|---|
| `model` | Ollama embedding model | `nomic-embed-text` |
| `provider` | Embedding provider | `ollama` |
| `endpoint` | Ollama API endpoint | `http://localhost:11434` |

Vector Store Settings:

| Option | Description | Default |
|---|---|---|
| `provider` | Vector database | `qdrant` |
| `endpoint` | Qdrant API endpoint | `http://localhost:6333` |
| `collectionName` | Collection name | `{project-name}` |

Chunking Settings:

| Option | Description | Default |
|---|---|---|
| `strategy` | Chunking approach | `code-aware` |
| `chunkSize` | Tokens per chunk | `512` |
| `overlap` | Overlap between chunks | `64` |

Chunking Strategies

Code-Aware (Recommended): Respects code boundaries (functions, classes):

// Keeps function intact
function calculateTotal(items) {
  // Entire function in one chunk
  return items.reduce((sum, item) => sum + item.price, 0);
}
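As an illustration of the code-aware idea, here is a minimal sketch that splits Python source along top-level function and class boundaries using the standard `ast` module. It is a stand-in for whatever parser Kilo Code actually uses (which this article does not describe), and it only handles Python input.

```python
import ast
from typing import List

def chunk_python_by_definition(source: str) -> List[str]:
    """Return one chunk per top-level function or class in Python source."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # get_source_segment returns the exact text spanned by the node
            segment = ast.get_source_segment(source, node)
            if segment is not None:
                chunks.append(segment)
    return chunks

SAMPLE = (
    "import os\n"
    "\n"
    "def a():\n"
    "    return 1\n"
    "\n"
    "class B:\n"
    "    def m(self):\n"
    "        return 2\n"
)
```

For `SAMPLE`, this yields two chunks: the whole of `a()` and the whole of class `B`, with the method kept inside its class. Module-level statements such as the import are ignored in this simplified version.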

Fixed-Size: Splits at an exact token count, even in the middle of a function:

Chunk 1: function calculateTotal(items) {
Chunk 2:   return items.reduce((sum, item) =>
Chunk 3:     sum + item.price, 0);
}

Line-Based: Splits at line boundaries:

Chunk 1: Lines 1-20
Chunk 2: Lines 21-40
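The two simpler strategies are easy to sketch, and doing so shows what `chunkSize` and `overlap` mean in practice: each chunk starts `chunkSize - overlap` tokens after the previous one, so neighbouring chunks share some context. The functions below are illustrative only, and whitespace-separated words stand in for real tokenizer output.

```python
from typing import List, Sequence

def chunk_fixed(tokens: Sequence[str], chunk_size: int, overlap: int) -> List[List[str]]:
    """Split tokens into windows of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(list(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end
        start += step
    return chunks

def chunk_lines(text: str, lines_per_chunk: int) -> List[str]:
    """Split text into consecutive groups of lines_per_chunk lines."""
    if lines_per_chunk <= 0:
        raise ValueError("lines_per_chunk must be positive")
    lines = text.splitlines()
    return [
        "\n".join(lines[i:i + lines_per_chunk])
        for i in range(0, len(lines), lines_per_chunk)
    ]
```

With ten tokens, a chunk size of 4, and an overlap of 1, `chunk_fixed` produces three chunks in which the last token of each chunk is repeated as the first token of the next.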

Step 5: Build the Index

CLI Command

# Build index for current directory
kilo-code index build

# Build with verbose output
kilo-code index build --verbose

# Rebuild from scratch
kilo-code index rebuild

# Build specific paths only
kilo-code index build src/ lib/

Expected Output

$ kilo-code index build

🔍 Kilo Code Indexing Service
━━━━━━━━━━━━━━━━━━━━━━━━━━━━

📁 Scanning project: /Users/paladm/my-project
📊 Found 247 files (1.2 MB total)
⚙️  Applying filters...
   ✓ Included: 189 files
   ✗ Excluded: 58 files (node_modules, dist, tests)

🔄 Chunking code...
   Created 1,247 chunks (avg 423 tokens)

🧮 Generating embeddings...
   [████████████████████] 100% | 1,247/1,247 chunks
   Model: nomic-embed-text
   Time: 2m 34s

💾 Storing in Qdrant...
   Collection: my-project-codebase
   Vectors: 1,247
   Dimensions: 768

✅ Indexing complete!
   Search ready in ~50ms
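The "Storing in Qdrant" step in the output above boils down to upserting points: each point carries an ID, a vector, and a payload describing where the chunk came from. The sketch below builds the JSON body for Qdrant's `PUT /collections/{name}/points` endpoint. The payload fields (`path`, `start_line`, `text`) and the ID scheme are assumptions made for illustration; the article does not document what Kilo Code stores.

```python
import uuid
from typing import Iterator, List, Sequence

def point_id(path: str, start_line: int) -> str:
    """Derive a stable UUID so re-indexing a chunk overwrites its old point."""
    return str(uuid.uuid5(uuid.NAMESPACE_URL, f"{path}#L{start_line}"))

def build_upsert_body(chunks: Sequence[dict], vectors: Sequence[List[float]]) -> dict:
    """Pair each chunk with its vector in the shape the points endpoint expects."""
    if len(chunks) != len(vectors):
        raise ValueError("need exactly one vector per chunk")
    return {
        "points": [
            {
                "id": point_id(chunk["path"], chunk["start_line"]),
                "vector": list(vector),
                "payload": chunk,
            }
            for chunk, vector in zip(chunks, vectors)
        ]
    }

def batched(items: Sequence, size: int) -> Iterator[list]:
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])
```

Deterministic IDs are a design choice here: Qdrant accepts unsigned integers or UUIDs as point IDs, and deriving the UUID from the file path and line means a rebuilt chunk replaces its predecessor instead of piling up duplicates.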

Index Status

# Check index status
kilo-code index status

# Output:
Index Status: Ready
Collection: my-project-codebase
Documents: 1,247 chunks
Last Updated: 2026-03-29 22:30:00
Size: 45.2 MB (vectors + metadata)

Step 6: Use Semantic Search

In Kilo Code IDE

Method 1: Search Panel

Open the Kilo Code sidebar
Click the “Search” tab
Type your query in natural language

Method 2: Chat Integration

User: "Where is the email validation logic?"

Kilo Code (with indexing):
I found several relevant files:

1. src/utils/emailValidator.js (95% match)
   - validateEmail() function
   - checkEmailDomain() function
   
2. src/services/userService.js (78% match)
   - Uses email validation during registration
   
3. src/middleware/validation.js (65% match)
   - Email format middleware

Would you like me to show the code from any of these files?

In Kilo Code CLI

# Semantic search
kilo-code search "user authentication logic"

# Search with file filter
kilo-code search "database connection" --filter "*.py"

# Search with limit
kilo-code search "API endpoints" --limit 5
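Behind a semantic search like the ones above, the query text is embedded and the resulting vector is sent to the vector database. As a rough sketch, this is what a request body for Qdrant's `POST /collections/{name}/points/search` endpoint could look like, including a payload filter comparable to `--filter "*.py"`. The `extension` payload key is hypothetical and would only work if the indexer stored such a field.

```python
from typing import List, Optional

def build_search_body(
    query_vector: List[float], limit: int = 5, extension: Optional[str] = None
) -> dict:
    """Build the JSON body for a Qdrant vector search."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    body = {
        "vector": list(query_vector),
        "limit": limit,
        "with_payload": True,  # return stored metadata along with scores
    }
    if extension is not None:
        # Hypothetical payload key: restrict hits to one file extension
        body["filter"] = {
            "must": [{"key": "extension", "match": {"value": extension}}]
        }
    return body
```

Filtering inside the vector database is usually preferable to filtering afterwards, because the `limit` then applies to matching files only.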

Search Results Format

$ kilo-code search "password hashing"

🔍 Search Results for: "password hashing"
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

1. src/auth/password.js (Score: 0.94)
   ─────────────────────────────────────
   Location: Lines 15-42
   
   function hashPassword(plainPassword) {
     const salt = bcrypt.genSaltSync(12);
     return bcrypt.hashSync(plainPassword, salt);
   }
   
   function verifyPassword(plain, hashed) {
     return bcrypt.compareSync(plain, hashed);
   }
   ─────────────────────────────────────

2. src/models/User.js (Score: 0.87)
   ─────────────────────────────────────
   Location: Lines 28-35 (pre-save hook)
   
   userSchema.pre('save', function(next) {
     if (this.isModified('password')) {
       this.password = hashPassword(this.password);
     }
     next();
   });
   ─────────────────────────────────────

3. docs/security.md (Score: 0.72)
   ─────────────────────────────────────
   Location: Section 3.2
   
   ## Password Storage
   We use bcrypt with 12 salt rounds for
   password hashing. Never store plain text.
   ─────────────────────────────────────

Advanced Configuration

Multi-Project Indexing

For monorepos or multiple related projects:

{
  "indexing": {
    "collections": [
      {
        "name": "frontend",
        "path": "./packages/frontend",
        "filters": { "include": ["**/*.tsx", "**/*.ts"] }
      },
      {
        "name": "backend",
        "path": "./packages/backend",
        "filters": { "include": ["**/*.py"] }
      },
      {
        "name": "shared",
        "path": "./packages/shared",
        "filters": { "include": ["**/*.ts", "**/*.json"] }
      }
    ]
  }
}
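Conceptually, a multi-collection setup like this routes each file to the collection whose `path` contains it. Here is a small sketch of that routing rule, using the three collections from the example above. The function `collection_for` is hypothetical, and the first-match-wins behaviour is an assumption.

```python
from pathlib import PurePosixPath
from typing import List, Optional

COLLECTIONS = [
    {"name": "frontend", "path": "./packages/frontend"},
    {"name": "backend", "path": "./packages/backend"},
    {"name": "shared", "path": "./packages/shared"},
]

def collection_for(file_path: str, collections: List[dict]) -> Optional[str]:
    """Return the name of the first collection whose path contains file_path."""
    target = PurePosixPath(file_path)
    for collection in collections:
        root = PurePosixPath(collection["path"])  # a leading "./" is normalized away
        if target == root or root in target.parents:
            return collection["name"]
    return None
```

Comparing path components instead of raw string prefixes avoids a subtle bug: `packages/frontend-legacy/x.ts` does not belong to the `frontend` collection even though the strings share a prefix.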

Incremental Indexing

For large codebases, enable incremental updates:

{
  "indexing": {
    "incremental": true,
    "watchMode": true,
    "debounceMs": 5000,
    "batchSize": 100
  }
}

How it works:

Initial full index build
Watch for file changes
Re-index only modified files
Update vector database incrementally
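The steps above hinge on knowing which files actually changed. One common way to do that, shown here as a sketch, is to store a content hash per file and compare on each pass. Whether Kilo Code uses hashes, timestamps, or file-system events is not stated in this article, so treat `diff_index` as an illustration of the idea only.

```python
import hashlib
from typing import Dict, List, Tuple

def content_hash(text: str) -> str:
    """Hash file contents so unchanged files can be skipped."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def diff_index(
    previous_hashes: Dict[str, str], current_files: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Return (paths to re-index, paths to delete from the index)."""
    to_reindex = sorted(
        path
        for path, text in current_files.items()
        if previous_hashes.get(path) != content_hash(text)
    )
    to_delete = sorted(path for path in previous_hashes if path not in current_files)
    return to_reindex, to_delete
```

New and modified files land in the first list, and files that disappeared land in the second so their stale vectors can be removed.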

Custom Metadata

Add custom metadata to improve search:

{
  "indexing": {
    "metadata": {
      "includeGitBlame": true,
      "includeFileStats": true,
      "includeDependencies": true,
      "customFields": {
        "team": "platform",
        "service": "api-gateway"
      }
    }
  }
}

Troubleshooting

Issue: Ollama Connection Failed

# Check if Ollama is running
ps aux | grep ollama

# Start Ollama server
ollama serve

# Test connection
curl http://localhost:11434/api/version

# Check which process, if any, is listening on the Ollama port
sudo lsof -i :11434
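If you want the same connectivity check inside a script (for example before kicking off a long indexing run), a plain TCP connection attempt is enough to tell "nothing is listening" apart from other failures. This is a generic sketch using only the standard library; `is_port_open` is a made-up helper name.

```python
import socket

def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False
```

Calling `is_port_open("localhost", 11434)` checks Ollama, and `is_port_open("localhost", 6333)` does the same for Qdrant's REST port.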

Issue: Qdrant Collection Error

# Check Qdrant status
curl http://localhost:6333/

# List collections
curl http://localhost:6333/collections

# Delete problematic collection
curl -X DELETE http://localhost:6333/collections/my-project-codebase

# Rebuild index
kilo-code index rebuild
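A frequent cause of collection errors is a dimension mismatch: the collection was created for one embedding model and is later fed vectors from another. Qdrant reports the configured size in the response to `GET /collections/{name}`. The sketch below reads it from an already-parsed response, assuming the collection uses a single unnamed vector; `needs_rebuild` is a hypothetical helper.

```python
def collection_vector_size(collection_info: dict) -> int:
    """Extract the vector size from a parsed GET /collections/{name} response."""
    return collection_info["result"]["config"]["params"]["vectors"]["size"]

def needs_rebuild(collection_info: dict, model_dimensions: int) -> bool:
    """True when the collection was created for a different vector size."""
    return collection_vector_size(collection_info) != model_dimensions

# Trimmed-down example of the response shape (only the fields used here)
EXAMPLE_INFO = {
    "result": {
        "status": "green",
        "config": {"params": {"vectors": {"size": 768, "distance": "Cosine"}}},
    },
    "status": "ok",
}
```

With `EXAMPLE_INFO`, switching to a 384-dimensional model would require deleting and rebuilding the collection, while staying on a 768-dimensional model would not.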

Issue: Slow Indexing

Symptoms: Indexing takes >30 minutes for medium projects

Solutions:

Reduce chunk size:

{
  "chunking": {
    "chunkSize": 256,
    "overlap": 32
  }
}

Exclude more files:

{
  "filters": {
    "exclude": [
      "**/node_modules/**",
      "**/dist/**",
      "**/*.test.*",
      "**/*.md",
      "**/docs/**",
      "**/coverage/**"
    ]
  }
}

Use a faster model (a different model produces vectors of a different size, so rebuild the index after switching):

ollama pull all-minilm

{
  "embedding": {
    "model": "all-minilm"
  }
}

Issue: Poor Search Results

Symptoms: Search doesn’t find relevant code

Solutions:

Rebuild index:

kilo-code index rebuild --force

Adjust chunking:

{
  "chunking": {
    "strategy": "code-aware",
    "chunkSize": 512,
    "overlap": 128
  }
}

Check embedding model:

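# Embedding models cannot be used with `ollama run`; query the embeddings API instead
curl http://localhost:11434/api/embeddings -d '{"model": "nomic-embed-text", "prompt": "authentication"}'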

Issue: Out of Memory

Symptoms: Ollama or Qdrant crashes during indexing

Solutions:

Limit how much Ollama keeps in memory at once:

# Keep a single model loaded and handle one request at a time
OLLAMA_MAX_LOADED_MODELS=1 OLLAMA_NUM_PARALLEL=1 ollama serve

Reduce batch size:

{
  "indexing": {
    "batchSize": 50
  }
}

Use smaller model:

ollama pull all-minilm:22m

Performance Optimization

Index Size vs. Search Quality

| Configuration | Index Size | Search Speed | Quality |
|---|---|---|---|
| Default | Medium | ~50ms | Good |
| High Quality | Large | ~100ms | Better |
| Fast Search | Small | ~20ms | Basic |
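The index-size column follows largely from arithmetic: raw vector storage is the number of chunks times the vector dimension times the bytes per value. The helper below is a back-of-the-envelope estimate only. It ignores payloads and the search index Qdrant builds on top, so real on-disk sizes will be larger.

```python
def raw_vector_bytes(num_vectors: int, dimensions: int, bytes_per_value: int = 4) -> int:
    """Estimate raw vector storage (float32 by default), excluding payloads and indexes."""
    if min(num_vectors, dimensions, bytes_per_value) < 0:
        raise ValueError("arguments must be non-negative")
    return num_vectors * dimensions * bytes_per_value

# 1,247 chunks, as in the sample build output earlier in this post
CHUNKS = 1247
DEFAULT_BYTES = raw_vector_bytes(CHUNKS, 768)      # nomic-embed-text
HIGH_QUALITY_BYTES = raw_vector_bytes(CHUNKS, 1024)  # mxbai-embed-large
FAST_BYTES = raw_vector_bytes(CHUNKS, 384)         # all-minilm
```

For 1,247 chunks, that works out to roughly 3.8 MB of raw float32 data at 768 dimensions, about 5.1 MB at 1024, and about 1.9 MB at 384.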

High Quality Configuration

{
  "indexing": {
    "embedding": {
      "model": "mxbai-embed-large"
    },
    "chunking": {
      "chunkSize": 512,
      "overlap": 128
    },
    "vectorStore": {
      "quantization": false
    }
  }
}

Fast Search Configuration

{
  "indexing": {
    "embedding": {
      "model": "all-minilm"
    },
    "chunking": {
      "chunkSize": 256,
      "overlap": 32
    },
    "vectorStore": {
      "quantization": true
    }
  }
}

Quantization

Qdrant supports vector quantization to reduce storage:

{
  "vectorStore": {
    "quantization": {
      "type": "scalar",
      "quantile": 0.99,
      "granularity": 0.01
    }
  }
}

Benefits:

Up to 4x smaller vectors (scalar quantization stores each float32 component as an int8)
Faster search
Usually only a small loss in search quality
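To see where the 4x figure comes from, here is a simplified sketch of scalar quantization: each float32 component (4 bytes) is mapped onto one of 256 levels (1 byte) within a chosen range. This illustrates the general technique and is not Qdrant's implementation. The fixed `lo` and `hi` bounds stand in for the range a real system would derive from the data, for instance via a quantile.

```python
from array import array
from typing import List, Sequence

def quantize(vector: Sequence[float], lo: float, hi: float) -> bytes:
    """Map each component to an 8-bit code; values outside [lo, hi] are clipped."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    scale = (hi - lo) / 255
    return bytes(min(255, max(0, round((x - lo) / scale))) for x in vector)

def dequantize(codes: bytes, lo: float, hi: float) -> List[float]:
    """Approximate the original components from their 8-bit codes."""
    scale = (hi - lo) / 255
    return [lo + code * scale for code in codes]

def float32_size(vector: Sequence[float]) -> int:
    """Size in bytes of the vector when stored as float32."""
    return len(array("f", vector).tobytes())
```

A five-component vector takes 20 bytes as float32 and 5 bytes after quantization. For in-range values the round-trip error per component is at most half a quantization step, which is why the effect on ranking is often small.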

Best Practices

1. Index Only What You Need

{
  "filters": {
    "include": ["src/**/*", "lib/**/*"],
    "exclude": ["**/*.test.*", "**/mocks/**", "**/fixtures/**"]
  }
}

2. Use Code-Aware Chunking

Always prefer code-aware chunking for source code:

{
  "chunking": {
    "strategy": "code-aware"
  }
}

3. Schedule Regular Rebuilds

For active projects, rebuild weekly:

# Add to crontab (runs every Sunday at 02:00)
0 2 * * 0 cd /path/to/project && kilo-code index rebuild

4. Monitor Index Health

# Add health check to CI/CD
kilo-code index status --json | jq '.status'

5. Use Collection Namespaces

For multiple projects on the same Qdrant instance:

{
  "vectorStore": {
    "collectionName": "team-project-service"
  }
}

Real-World Example: Large Monorepo

Here’s how to index a large monorepo efficiently:

{
  "indexing": {
    "enabled": true,
    "provider": "qdrant",
    "collections": [
      {
        "name": "web-frontend",
        "path": "./apps/web",
        "filters": {
          "include": ["**/*.tsx", "**/*.ts", "**/*.css"],
          "exclude": ["**/*.test.*", "**/node_modules/**"]
        }
      },
      {
        "name": "api-backend",
        "path": "./apps/api",
        "filters": {
          "include": ["**/*.py"],
          "exclude": ["**/tests/**", "**/__pycache__/**"]
        }
      },
      {
        "name": "shared-libraries",
        "path": "./packages",
        "filters": {
          "include": ["**/*.ts", "**/*.json"],
          "exclude": ["**/dist/**", "**/node_modules/**"]
        }
      }
    ],
    "embedding": {
      "model": "nomic-embed-text",
      "provider": "ollama"
    },
    "vectorStore": {
      "provider": "qdrant",
      "endpoint": "http://localhost:6333"
    },
    "incremental": true,
    "watchMode": true
  }
}

Build all collections:

kilo-code index build --all

Search across selected collections:

kilo-code search "authentication middleware" --collections web-frontend,api-backend
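Searching several collections means running the query against each one and merging the hits into a single ranking. Here is a minimal sketch of that merge step, assuming each hit is a dict with a `score` field and that all collections were built with the same embedding model and distance metric (otherwise the scores are not comparable). `merge_hits` is a hypothetical helper, not a Kilo Code API.

```python
import heapq
from typing import Dict, List

def merge_hits(hits_by_collection: Dict[str, List[dict]], limit: int) -> List[dict]:
    """Combine per-collection hits and keep the `limit` highest-scoring ones."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    merged = [
        dict(hit, collection=name)  # tag each hit with its source collection
        for name, hits in hits_by_collection.items()
        for hit in hits
    ]
    return heapq.nlargest(limit, merged, key=lambda hit: hit["score"])
```

Tagging each hit with its collection keeps the origin visible after merging, which helps when the same concept exists on both the frontend and the backend.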
