Common Pitfalls

Common mistakes made when working with the LLM Platform, BuildKit, and OSSA agents, and how to avoid each one.

Installation & Setup
OSSA Agent Development
BuildKit CLI Usage
Drupal Development
Workflow Orchestration
Production Deployment
Performance & Optimization
Security

Installation & Setup

❌ Pitfall: Using Docker Desktop Instead of OrbStack (macOS)

Problem:

# Slow file sync, high CPU usage, poor performance
docker ps  # Takes 5+ seconds

Solution:

# Switch to OrbStack for faster file sync and lower CPU usage
brew install orbstack

# Uninstall Docker Desktop
# macOS → Applications → Docker → Uninstall

# Verify OrbStack
docker ps  # Should be instant

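Why it matters: On macOS, containers run inside a Linux VM, so every access to a bind-mounted file crosses the VM boundary. Docker Desktop's file sharing can be slow for large codebases. OrbStack uses VirtIO-FS for near-native file access.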

❌ Pitfall: Not Installing DDEV Addons

Problem:

ddev drush status
# Command not found: drush

Solution:

# Install platform-specific DDEV addons (one-time)
cd ~/Sites/LLM/llm-platform
./infrastructure/ddev-addons/install-addons.sh

# Now available:
ddev drush status
ddev tddai check
ddev git-safe commit

❌ Pitfall: Wrong PHP Version

Problem:

composer install
# Your requirements could not be resolved to an installable set of packages
# drupal/core requires php >=8.3

Solution:

# Check PHP version
php --version

# macOS: Install PHP 8.3
brew install php@8.3
brew link php@8.3 --force --overwrite

# Verify
php --version  # Should show 8.3.x

❌ Pitfall: Missing Environment Variables

Problem:

buildkit agents deploy
# Error: GITLAB_TOKEN not set

Solution:

# Store tokens in ~/.tokens/
mkdir -p ~/.tokens
echo "your-gitlab-token" > ~/.tokens/gitlab
chmod 600 ~/.tokens/gitlab

# Set environment variables
export GITLAB_URL="https://gitlab.bluefly.io"
export GITLAB_TOKEN=$(cat ~/.tokens/gitlab)

# Persist in shell profile
echo 'export GITLAB_URL="https://gitlab.bluefly.io"' >> ~/.zshrc
echo 'export GITLAB_TOKEN=$(cat ~/.tokens/gitlab)' >> ~/.zshrc

OSSA Agent Development

❌ Pitfall: Invalid OSSA Manifest

Problem:

# agent.ossa.yaml
ossaVersion: "0.2.4"
agent:
  id: my-agent
  name: My Agent
  # Missing required fields!

Solution:

ossaVersion: "0.2.4"

agent:
  id: my-agent                    # Required: DNS-1123 format
  name: My Agent                  # Required: Human-readable name
  version: "1.0.0"                # Required: Semantic version
  role: worker                    # Required: worker, governor, critic, observer

  runtime:                        # Required
    type: local                   # Required
    node:
      version: "20.x"
      entrypoint: "dist/index.js"

  capabilities:                   # Required: At least one
    - name: process_data
      description: Process data
      input_schema: { type: object }
      output_schema: { type: object }

Validate:

ossa validate agent.ossa.yaml
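
The required fields listed above can also be checked in code before a manifest reaches `ossa validate`. The sketch below is illustrative only: `checkManifest` is a hypothetical helper, not part of the OSSA tooling, and it applies just the rules stated in the manifest comments above. The semantic-version pattern is a simplified assumption.

```javascript
// Hypothetical pre-flight check; not part of the OSSA CLI.
const DNS_1123 = /^[a-z0-9]([-a-z0-9]{0,61}[a-z0-9])?$/;  // DNS-1123 label
const SEMVER = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$/;        // simplified semver
const ROLES = ['worker', 'governor', 'critic', 'observer'];

// Takes the manifest as a parsed object and returns a list of problems.
function checkManifest(manifest) {
  const problems = [];
  const agent = (manifest && manifest.agent) || {};

  if (!DNS_1123.test(agent.id || '')) problems.push('agent.id must be DNS-1123 format');
  if (!agent.name) problems.push('agent.name is missing');
  if (!SEMVER.test(agent.version || '')) problems.push('agent.version must be a semantic version');
  if (!ROLES.includes(agent.role)) problems.push('agent.role must be one of: ' + ROLES.join(', '));
  if (!agent.runtime || !agent.runtime.type) problems.push('agent.runtime.type is missing');
  if (!Array.isArray(agent.capabilities) || agent.capabilities.length === 0) {
    problems.push('agent.capabilities needs at least one entry');
  }
  return problems;
}

// The incomplete manifest from the Problem block, as a parsed object.
const broken = { ossaVersion: '0.2.4', agent: { id: 'my-agent', name: 'My Agent' } };
console.log(checkManifest(broken));
```

For the incomplete manifest this reports four problems: version, role, runtime type, and capabilities.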

❌ Pitfall: Missing Capability Input/Output Schemas

Problem:

capabilities:
  - name: validate_code
    description: Validate code
    # Missing input_schema and output_schema!

Impact: Without schemas, callers cannot tell what a capability accepts or returns, so agent-to-agent calls and workflow orchestration break and there is no type safety.

Solution:

capabilities:
  - name: validate_code
    description: Validate code quality
    input_schema:
      type: object
      properties:
        files:
          type: array
          items:
            type: string
          description: List of file paths
        standards:
          type: string
          enum: [drupal, javascript, python]
      required: [files]
    output_schema:
      type: object
      properties:
        valid:
          type: boolean
        violations:
          type: array
          items:
            type: object
        summary:
          type: string
      required: [valid]
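
To show what the schemas buy you, here is a minimal sketch of checking a request body against `input_schema` before a capability runs. `checkAgainstSchema` is a hypothetical helper that covers only `required` and top-level `type`; it is not how OSSA validates payloads, and a full JSON Schema validator would also enforce `enum`, `items`, and nested properties.

```javascript
// Maps a JavaScript value to the JSON Schema type names used above.
function typeOf(value) {
  if (Array.isArray(value)) return 'array';
  if (value === null) return 'null';
  return typeof value;  // 'string', 'boolean', 'number', 'object', ...
}

// Minimal check: required properties present, top-level types match.
function checkAgainstSchema(schema, input) {
  if (typeOf(input) !== 'object') return ['input must be an object'];
  const errors = [];
  for (const key of schema.required || []) {
    if (!(key in input)) errors.push(`missing required property: ${key}`);
  }
  for (const [key, rule] of Object.entries(schema.properties || {})) {
    if (key in input && rule.type && typeOf(input[key]) !== rule.type) {
      errors.push(`${key} should be ${rule.type}`);
    }
  }
  return errors;
}

// The input_schema from the Solution block, as a parsed object.
const inputSchema = {
  type: 'object',
  properties: {
    files: { type: 'array', items: { type: 'string' } },
    standards: { type: 'string', enum: ['drupal', 'javascript', 'python'] },
  },
  required: ['files'],
};

console.log(checkAgainstSchema(inputSchema, { standards: 'drupal' }));
console.log(checkAgainstSchema(inputSchema, { files: 'a.php' }));
```

The first call reports the missing `files` property; the second reports that `files` should be an array.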

❌ Pitfall: Not Handling Agent Errors

Problem:

// Agent crashes on error
app.post('/capabilities/validate', async (req, res) => {
  const result = await validateCode(req.body.files);  // Might throw!
  res.json(result);
});

Solution:

app.post('/capabilities/validate', async (req, res) => {
  try {
    const { files, standards = 'javascript' } = req.body;

    if (!files || !Array.isArray(files)) {
      return res.status(400).json({
        error: 'Invalid input: files array required',
        code: 'INVALID_INPUT'
      });
    }

    const result = await validateCode(files, standards);
    res.json(result);
  } catch (error) {
    logger.error('Validation failed', { error: error.message, stack: error.stack });
    res.status(500).json({
      error: 'Validation failed',
      code: 'VALIDATION_ERROR',
      message: error.message
    });
  }
});
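
Repeating the try/catch in every route gets tedious. One common alternative, sketched here under the assumption that the agent runs on Express 4 (which does not forward a rejected promise from an async handler to error middleware), is a small wrapper plus a single error handler. `asyncHandler` and `errorMiddleware` are illustrative names, not part of any platform library.

```javascript
// Wraps an async route handler so that a thrown error or rejected promise
// reaches Express's error middleware through next(err).
function asyncHandler(handler) {
  return (req, res, next) => {
    Promise.resolve()
      .then(() => handler(req, res, next))
      .catch(next);
  };
}

// Error middleware: Express recognises it by its four-argument signature.
function errorMiddleware(err, req, res, next) {
  const status = err.status || 500;
  res.status(status).json({
    error: err.message,
    code: err.code || 'INTERNAL_ERROR',
  });
}

// Usage with the route above (validateCode as in the original example):
// app.post('/capabilities/validate', asyncHandler(async (req, res) => {
//   res.json(await validateCode(req.body.files));
// }));
// app.use(errorMiddleware);  // register after all routes
```

Input validation still belongs inside the handler; the wrapper only guarantees that unexpected failures produce a response instead of an unhandled rejection.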

❌ Pitfall: Ignoring Agent Health Checks

Problem:

# Kubernetes kills agent repeatedly
kubectl get pods -n agents
# agent-pod   0/1   CrashLoopBackOff

Solution:

// Add health check endpoint
app.get('/health', (req, res) => {
  const health = {
    status: 'ok',
    agent: 'my-agent',
    version: '1.0.0',
    uptime: process.uptime(),
    memory: process.memoryUsage(),
  };

  res.json(health);
});

// Add readiness check
app.get('/ready', async (req, res) => {
  try {
    // Check dependencies
    await checkDatabaseConnection();
    await checkExternalServices();

    res.json({ ready: true });
  } catch (error) {
    res.status(503).json({ ready: false, error: error.message });
  }
});

Kubernetes config:

livenessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 30
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /ready
    port: 3000
  initialDelaySeconds: 15
  periodSeconds: 5
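
A readiness endpoint that awaits a slow dependency can itself cause probe failures, because a Kubernetes probe times out after 1 second unless `timeoutSeconds` is set. The sketch below bounds each check and reports which one failed. `withTimeout` and `checkReadiness` are hypothetical helpers, and the check functions passed in stand for `checkDatabaseConnection` and `checkExternalServices` from the example above.

```javascript
// Rejects if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Runs every named check in parallel and collects the failures by name.
async function checkReadiness(checks, timeoutMs = 800) {
  const names = Object.keys(checks);
  const results = await Promise.allSettled(
    names.map((name) => withTimeout(Promise.resolve().then(checks[name]), timeoutMs, name))
  );
  const failures = {};
  results.forEach((result, i) => {
    if (result.status === 'rejected') failures[names[i]] = result.reason.message;
  });
  return { ready: Object.keys(failures).length === 0, failures };
}

// Usage inside the /ready route:
// const status = await checkReadiness({
//   database: checkDatabaseConnection,
//   services: checkExternalServices,
// });
// res.status(status.ready ? 200 : 503).json(status);
```

The 800 ms default is an assumption chosen to stay under the 1-second probe timeout; adjust it together with `timeoutSeconds`.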

BuildKit CLI Usage

❌ Pitfall: Not Checking Agent Status Before Deployment

Problem:

buildkit agents deploy my-agent
# Deploys broken agent to production!

Solution:

# Always validate first
buildkit ossa validate agent.ossa.yaml

# Test locally
buildkit agents start my-agent --local

# Run health check
curl http://localhost:3000/health

# Then deploy
buildkit agents deploy my-agent --namespace agents

❌ Pitfall: Hardcoding Configuration

Problem:

// Hardcoded values
const DATABASE_URL = 'postgresql://user:password@localhost:5432/db';
const API_KEY = 'sk-1234567890';

Solution:

// Use environment variables
const DATABASE_URL = process.env.DATABASE_URL;
const API_KEY = process.env.API_KEY;

// Validate at startup
if (!DATABASE_URL || !API_KEY) {
  console.error('Missing required environment variables');
  process.exit(1);
}

Set in Kubernetes:

env:
  - name: DATABASE_URL
    valueFrom:
      secretKeyRef:
        name: db-credentials
        key: url
  - name: API_KEY
    valueFrom:
      secretKeyRef:
        name: api-credentials
        key: key
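
The startup check above stops at the first problem without saying which variable is missing. A slightly more helpful variant, sketched here with a hypothetical `loadConfig` helper, names every missing variable in one error so a misconfigured pod can be fixed in a single pass.

```javascript
// Reads the named variables from `env` and fails fast, listing all that
// are missing or empty.
function loadConfig(env, requiredNames) {
  const missing = requiredNames.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  const config = {};
  for (const name of requiredNames) config[name] = env[name];
  return Object.freeze(config);  // read-only after startup
}

// Typical call at startup:
// const config = loadConfig(process.env, ['DATABASE_URL', 'API_KEY']);
```

Passing `env` as a parameter rather than reading `process.env` inside the function keeps the helper easy to test.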

❌ Pitfall: Not Using BuildKit Golden Commands

Problem:

# Doing manual work that BuildKit automates
grep -r "TODO" src/
find . -name "*.ts" -exec eslint {} \;
git add . && git commit -m "Update"

Solution:

# Use BuildKit golden commands instead
buildkit golden audit          # Comprehensive security + quality audit
buildkit golden fix            # Auto-fix issues
buildkit golden test           # Run all tests
buildkit golden sync           # Sync GitLab (issues + wiki)
buildkit golden deploy --env dev  # Deploy with checks

Drupal Development

❌ Pitfall: Editing Composer-Managed Modules

Problem:

# Editing files in web/modules/custom/
cd ~/Sites/LLM/llm-platform/web/modules/custom/llm
nano llm.module  # Changes will be LOST on composer install!

Solution:

# Edit source files instead
cd ~/Sites/LLM/all_drupal_custom/modules/llm
nano llm.module

# Sync to llm-platform
buildkit drupal sync --modules

# Or manually
cd ~/Sites/LLM/llm-platform
composer update drupal/llm

Why: web/modules/custom/* is managed by Composer and will be overwritten.

❌ Pitfall: Not Clearing Drupal Cache

Problem:

# Made changes but don't see them
# Updated routing, added service, changed config

Solution:

# Always clear cache after changes
ddev drush cr

# If stale state persists, restart the DDEV containers as well
ddev restart

When to clear cache:

- After configuration import (drush cim)
- After module enable/disable
- After routing changes
- After service definition changes
- After pretty much anything!

❌ Pitfall: Skipping Configuration Export

Problem:

# Made configuration changes in UI
# Didn't export to code
# Lost on next deployment!

Solution:

# After any UI configuration changes
ddev drush cex -y

# Commit changes
git add config/
git commit -m "feat: update content type configuration"

Automate with Git hook:

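#!/bin/bash
# .git/hooks/pre-commit (must be executable: chmod +x .git/hooks/pre-commit)
# The shebang has to be the first line of the file.
set -e
ddev drush cex -y
git add config/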

Workflow Orchestration

❌ Pitfall: Missing Workflow Dependencies

Problem:

stages:
  - name: deploy
    # Forgot depends_on!
    steps:
      - name: deploy_to_prod
        agent: deployment-orchestrator

Impact: The deploy stage can run before tests complete and ship broken code.

Solution:

stages:
  - name: validate
    steps: [...]

  - name: test
    depends_on: [validate]
    steps: [...]

  - name: deploy
    depends_on: [test]
    condition: "{{ stages.test.status == 'passed' }}"
    steps: [...]
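
A typo in `depends_on`, or two stages that depend on each other, is easy to miss in YAML. The sketch below shows the general technique for catching both: a depth-first topological sort over the stage list. How the workflow engine itself resolves `depends_on` is not specified here, so treat `orderStages` as a hypothetical lint step, not the engine's algorithm.

```javascript
// Returns stage names in an order that respects depends_on.
// Throws on an unknown dependency or a dependency cycle.
function orderStages(stages) {
  const byName = new Map(stages.map((stage) => [stage.name, stage]));
  const state = new Map();  // name -> 'visiting' | 'done'
  const order = [];

  function visit(name, path) {
    if (state.get(name) === 'done') return;
    if (state.get(name) === 'visiting') {
      throw new Error(`Dependency cycle: ${[...path, name].join(' -> ')}`);
    }
    const stage = byName.get(name);
    if (!stage) throw new Error(`Unknown stage in depends_on: ${name}`);

    state.set(name, 'visiting');
    for (const dep of stage.depends_on || []) visit(dep, [...path, name]);
    state.set(name, 'done');
    order.push(name);  // all dependencies are already in `order`
  }

  for (const stage of stages) visit(stage.name, []);
  return order;
}

// Stages from the Solution block, deliberately listed out of order.
const stages = [
  { name: 'deploy', depends_on: ['test'] },
  { name: 'test', depends_on: ['validate'] },
  { name: 'validate' },
];
console.log(orderStages(stages));  // [ 'validate', 'test', 'deploy' ]
```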

❌ Pitfall: No Timeout Configuration

Problem:

steps:
  - name: run_tests
    agent: test-runner
    # No timeout! Hangs forever if tests freeze.

Solution:

steps:
  - name: run_tests
    agent: test-runner
    capability: run_tests
    timeout: 10m              # Fail after 10 minutes
    retry:
      max_attempts: 2
      backoff: exponential
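
For intuition about what `timeout`, `max_attempts`, and `backoff: exponential` mean together, here is a minimal sketch of the same policy in code. The function names and the 1-second base delay are assumptions for illustration; the workflow engine's actual delays are not documented here.

```javascript
// Delay before the retry that follows attempt number `attempt` (1-based):
// base, 2 * base, 4 * base, ... capped at maxMs.
function backoffDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

// Runs `task` up to maxAttempts times, waiting at most timeoutMs for each
// attempt. The timeout stops the wait; it does not cancel the task itself.
async function runWithRetry(task, { maxAttempts = 2, timeoutMs = 600000, baseMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    let timer;
    const timeout = new Promise((_, reject) => {
      timer = setTimeout(() => reject(new Error(`Timed out after ${timeoutMs}ms`)), timeoutMs);
    });
    try {
      return await Promise.race([task(attempt), timeout]);
    } catch (error) {
      lastError = error;
    } finally {
      clearTimeout(timer);
    }
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, backoffDelay(attempt, baseMs)));
    }
  }
  throw lastError;
}
```

The defaults mirror the YAML above: two attempts and a 10-minute limit per attempt.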

❌ Pitfall: Not Handling Workflow Failures

Problem:

# No failure handling
# Leaves deployments in inconsistent state

Solution:

on_workflow_failure:
  - name: rollback
    agent: deployment-orchestrator
    capability: rollback
    input:
      environment: "{{ env.ENVIRONMENT }}"

  - name: notify_team
    agent: slack-notifier
    capability: send_message
    input:
      channel: "#incidents"
      message: "Deployment failed: {{ workflow.error }}"

Production Deployment

❌ Pitfall: No Resource Limits

Problem:

# Kubernetes deployment without limits
spec:
  containers:
    - name: agent
      image: my-agent:latest
      # No resources! Agent can consume all cluster resources!

Solution:

spec:
  containers:
    - name: agent
      image: my-agent:latest
      resources:
        requests:
          cpu: 250m
          memory: 512Mi
        limits:
          cpu: 1000m
          memory: 2Gi

❌ Pitfall: Missing Health Checks in Kubernetes

Problem:

# Kubernetes doesn't know if pod is healthy
kubectl get pods
# Pod shows Running but agent is crashed inside

Solution:

spec:
  containers:
    - name: agent
      livenessProbe:
        httpGet:
          path: /health
          port: 3000
        initialDelaySeconds: 30
        periodSeconds: 10
        failureThreshold: 3

      readinessProbe:
        httpGet:
          path: /ready
          port: 3000
        initialDelaySeconds: 15
        periodSeconds: 5
        failureThreshold: 3

❌ Pitfall: Not Using Persistent Volumes

Problem:

# Pod restarts, loses all data!
kubectl delete pod my-agent-xyz
# Files, databases, logs - all gone!

Solution:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: agent-storage
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: fast-ssd

---
# Pod template fragment (goes under the Deployment's spec.template)
spec:
  containers:
    - name: agent
      volumeMounts:
        - name: storage
          mountPath: /data
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: agent-storage

❌ Pitfall: No SSL/TLS Configuration

Problem:

# Accessing service via HTTP
curl http://agents.yourcompany.com
# Insecure! Credentials sent in plaintext!

Solution:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: agents-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
    - hosts:
        - agents.yourcompany.com
      secretName: agents-tls
  rules:
    - host: agents.yourcompany.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: agents
                port:
                  number: 80

Performance & Optimization

❌ Pitfall: Not Enabling Caching

Problem:

// Recalculates expensive operation on every request
app.get('/data', async (req, res) => {
  const data = await expensiveCalculation();  // Takes 5 seconds!
  res.json(data);
});

Solution:

import Redis from 'ioredis';
const redis = new Redis(process.env.REDIS_URL);

app.get('/data', async (req, res) => {
  // Check cache first
  const cached = await redis.get('expensive-data');
  if (cached) {
    return res.json(JSON.parse(cached));
  }

  // Calculate and cache
  const data = await expensiveCalculation();
  await redis.setex('expensive-data', 3600, JSON.stringify(data));
  res.json(data);
});
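
One gap in the cache-aside pattern above: if several requests miss the cache at the same moment, each one runs the expensive calculation. The sketch below shows one way to share a single in-flight computation per key. To stay runnable without a Redis server it uses an in-memory `Map` instead of Redis, so it illustrates the de-duplication idea only; `createCache` and `getOrCompute` are hypothetical names.

```javascript
// In-memory stand-in for Redis. `now` is injectable to make expiry testable.
function createCache(now = Date.now) {
  const entries = new Map();   // key -> { value, expiresAt }
  const inFlight = new Map();  // key -> promise of a running computation

  return function getOrCompute(key, ttlMs, compute) {
    const hit = entries.get(key);
    if (hit && hit.expiresAt > now()) return Promise.resolve(hit.value);

    // Concurrent callers for the same key share one computation.
    if (inFlight.has(key)) return inFlight.get(key);

    const pending = Promise.resolve()
      .then(compute)
      .then((value) => {
        entries.set(key, { value, expiresAt: now() + ttlMs });
        return value;
      })
      .finally(() => inFlight.delete(key));  // failures are not cached
    inFlight.set(key, pending);
    return pending;
  };
}

// Usage in the route above:
// const getOrCompute = createCache();
// const data = await getOrCompute('expensive-data', 3600 * 1000, expensiveCalculation);
```

With multiple agent replicas each process keeps its own copy, so a shared store such as Redis is still the better fit for cross-pod caching.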

❌ Pitfall: Blocking Event Loop

Problem:

// Synchronous file operations block event loop
app.post('/process', (req, res) => {
  const files = fs.readdirSync('./large-directory');  // Blocks!
  files.forEach(file => {
    const content = fs.readFileSync(file);  // Blocks!
    processFile(content);
  });
  res.json({ done: true });
});

Solution:

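import fs from 'node:fs';
import path from 'node:path';

app.post('/process', async (req, res) => {
  // Use async operations. readdir returns bare file names, so join each
  // one with the directory before reading it.
  const dir = './large-directory';
  const files = await fs.promises.readdir(dir);
  await Promise.all(
    files.map(async file => {
      const content = await fs.promises.readFile(path.join(dir, file));
      await processFile(content);
    })
  );
  res.json({ done: true });
});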
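
`Promise.all` over a whole directory starts every read at once, which on a large directory can exhaust file descriptors or memory. A minimal sketch of bounding the concurrency follows; `mapWithLimit` is a hypothetical helper, and the limit of 8 in the usage comment is an arbitrary assumption.

```javascript
// Runs `worker` over `items` with at most `limit` calls in flight at once.
// Results keep the order of `items`.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;  // index of the next item to hand out

  // Each runner repeatedly claims the next index until none are left.
  async function runner() {
    while (next < items.length) {
      const index = next;
      next += 1;
      results[index] = await worker(items[index], index);
    }
  }

  const runners = Array.from({ length: Math.min(limit, items.length) }, () => runner());
  await Promise.all(runners);
  return results;
}

// Usage in the route above:
// await mapWithLimit(files, 8, async (file) => {
//   const content = await fs.promises.readFile(path.join(dir, file));
//   await processFile(content);
// });
```

Claiming an index and incrementing `next` happen with no `await` in between, so two runners never process the same item.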

❌ Pitfall: Not Monitoring Memory Usage

Problem:

# Agent crashes with OOM (Out of Memory)
kubectl get pods
# agent-xyz   0/1   OOMKilled

Solution:

// Monitor memory usage
setInterval(() => {
  const usage = process.memoryUsage();
  const mbUsed = Math.round(usage.heapUsed / 1024 / 1024);

  logger.info('Memory usage', { heapUsed: mbUsed });

  if (mbUsed > 1500) {  // 1.5 GB threshold
    logger.warn('High memory usage', { heapUsed: mbUsed });
    // Trigger cleanup, restart, or scale
  }
}, 60000);  // Check every minute
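
Two refinements are worth considering. `heapUsed` excludes buffers and native allocations, so `rss` is usually a closer approximation of what the container limit counts, and a threshold expressed as a share of the limit survives a change to `limits.memory`. The sketch below assumes a 2Gi limit and a 75% warning ratio; `memoryStatus` is a hypothetical helper.

```javascript
// Classifies memory pressure relative to a limit instead of a fixed number.
function memoryStatus(usage, limitBytes, warnRatio = 0.75) {
  const ratio = usage.rss / limitBytes;
  return {
    rssMb: Math.round(usage.rss / 1024 / 1024),
    percentOfLimit: Math.round(ratio * 100),
    level: ratio >= warnRatio ? 'warn' : 'ok',
  };
}

const LIMIT_BYTES = 2 * 1024 ** 3;  // assumed to match limits.memory: 2Gi
console.log(memoryStatus(process.memoryUsage(), LIMIT_BYTES));
```

In a real agent the limit could be supplied through an environment variable rather than hardcoded.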

Security

❌ Pitfall: Committing Secrets to Git

Problem:

# Committed .env file with secrets!
git add .env
git commit -m "Add config"
git push
# Secrets now in Git history forever!

Solution:

# Add to .gitignore
echo ".env" >> .gitignore
echo ".env.*" >> .gitignore
echo "*.pem" >> .gitignore
echo "*.key" >> .gitignore

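# Remove from Git history if already committed.
# Rotate the exposed secret first: rewriting history does not un-leak it.
# (Git's own documentation recommends git filter-repo over filter-branch.)
git filter-branch --force --index-filter \
  "git rm --cached --ignore-unmatch .env" \
  --prune-empty --tag-name-filter cat -- --all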

Store secrets securely:

# Use ~/.tokens/ directory
mkdir -p ~/.tokens
chmod 700 ~/.tokens

echo "secret-value" > ~/.tokens/service-name
chmod 600 ~/.tokens/service-name

Reference in code:

const token = fs.readFileSync(path.join(os.homedir(), '.tokens', 'gitlab'), 'utf8').trim();

❌ Pitfall: Not Validating Input

Problem:

// Vulnerable to injection attacks
app.post('/execute', (req, res) => {
  const command = req.body.command;
  exec(command);  // Command injection!
});

Solution:

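import { execFile } from 'node:child_process';

// Map each allowed name to a fixed program and argument list.
// The npm scripts here are placeholders; substitute your own commands.
const allowedCommands = {
  test: ['npm', ['run', 'test']],
  build: ['npm', ['run', 'build']],
  deploy: ['npm', ['run', 'deploy']],
};

app.post('/execute', (req, res) => {
  const { command } = req.body;

  // Validate input
  if (!command || typeof command !== 'string') {
    return res.status(400).json({ error: 'Invalid command' });
  }

  // Allowlist: accept only known names (own keys, not inherited ones)
  if (!Object.hasOwn(allowedCommands, command)) {
    return res.status(403).json({ error: 'Command not allowed' });
  }

  // execFile runs the program directly, without a shell, so request data
  // never reaches a command line. HTML escaping such as validator.escape
  // does not make a string safe for a shell.
  const [file, args] = allowedCommands[command];
  execFile(file, args, { timeout: 30000 }, (error, stdout) => {
    if (error) {
      return res.status(500).json({ error: error.message });
    }
    res.json({ output: stdout });
  });
});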

❌ Pitfall: Missing Rate Limiting

Problem:

// No rate limiting - vulnerable to DDoS
app.post('/webhook', async (req, res) => {
  await processWebhook(req.body);
  res.json({ received: true });
});

Solution:

import rateLimit from 'express-rate-limit';

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000,  // 15 minutes
  max: 100,                   // Limit each IP to 100 requests per windowMs
  message: 'Too many requests, please try again later',
});

app.post('/webhook', limiter, async (req, res) => {
  await processWebhook(req.body);
  res.json({ received: true });
});
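
To make the `windowMs` and `max` options concrete, here is a minimal sketch of a fixed-window counter, the basic idea behind this kind of limiter. It is not the implementation used by `express-rate-limit`; `createLimiter` is a hypothetical name, and the sketch never evicts old keys, which a production limiter must do.

```javascript
// Fixed-window counter per key (for example, per client IP).
// `now` is injectable so the window logic can be tested without waiting.
function createLimiter({ windowMs, max, now = Date.now }) {
  const windows = new Map();  // key -> { start, count }

  // Returns true if the request is allowed, false if it exceeds the limit.
  return function allow(key) {
    const t = now();
    const current = windows.get(key);
    if (!current || t - current.start >= windowMs) {
      windows.set(key, { start: t, count: 1 });  // open a new window
      return true;
    }
    current.count += 1;
    return current.count <= max;
  };
}

// Usage as Express middleware:
// const allow = createLimiter({ windowMs: 15 * 60 * 1000, max: 100 });
// app.use((req, res, next) =>
//   allow(req.ip) ? next() : res.status(429).json({ error: 'Too many requests' }));
```

Counters held in process memory are per replica, so with several pods the effective limit is multiplied by the replica count unless a shared store is used.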

Quick Reference: Troubleshooting Commands

# DDEV
ddev describe                    # Show DDEV project info
ddev logs                        # View container logs
ddev restart                     # Restart containers
ddev delete -O && ddev start     # Nuclear option: rebuild everything

# BuildKit
buildkit agents status <name>    # Check agent health
buildkit agents logs <name>      # View agent logs
buildkit agents restart <name>   # Restart agent
buildkit ossa validate <file>    # Validate OSSA manifest

# Kubernetes
kubectl get pods -n agents       # List agent pods
kubectl describe pod <pod>       # Pod details
kubectl logs <pod> --follow      # Stream logs
kubectl exec -it <pod> -- /bin/sh  # Shell into pod

# Drupal
ddev drush status                # Drupal status
ddev drush cr                    # Clear cache
ddev drush cex -y                # Export config
ddev drush cim -y                # Import config
ddev drush updb -y               # Run database updates

Next Steps

Review System Requirements for optimal setup
Follow Development Setup for best practices
Learn Production Deployment patterns
