TDD Methodology

Test-Driven Development (TDD) is a fundamental development practice enforced across all LLM platform projects. TDD ensures code quality, maintainability, and reliability through a disciplined RED-GREEN-REFACTOR cycle.

Core Principles

1. Tests First, Code Second

NEVER write implementation code before tests exist.

Write failing tests that define expected behavior
Implement minimal code to make tests pass
Refactor with confidence knowing tests protect against regressions

2. RED-GREEN-REFACTOR Cycle

The foundational workflow of TDD:

graph LR
    A[RED: Write Failing Test] --> B[GREEN: Make Test Pass]
    B --> C[REFACTOR: Improve Code]
    C --> A

RED Phase

Write a test for the next bit of functionality
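The test MUST fail first (a test that has never failed has not shown it can detect the missing behavior)
It should fail for the right reason: the behavior is missing, not a typo or syntax error in the test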

GREEN Phase

Write minimal code to make test pass
Don't worry about perfect code yet
Focus solely on passing the test

REFACTOR Phase

Clean up code while keeping tests green
Improve design, remove duplication
Optimize performance
Tests provide safety net

TDD Workflow Example

Step 1: RED - Write Failing Test

// tests/auth/login.test.ts
import { describe, it, expect } from 'vitest'
import { AuthService } from '../../src/auth/AuthService'

describe('AuthService', () => {
  it('should authenticate user with valid credentials', async () => {
    const auth = new AuthService()
    const result = await auth.login('user@example.com', 'password123')

    expect(result.success).toBe(true)
    expect(result.user).toBeDefined()
    expect(result.token).toBeDefined()
  })
})

Run test: ❌ FAILS (AuthService doesn't exist)

Step 2: GREEN - Make Test Pass

// src/auth/AuthService.ts
export class AuthService {
  async login(email: string, password: string) {
    // Minimal implementation to pass test
    return {
      success: true,
      user: { email },
      token: 'mock-token'
    }
  }
}

Run test: ✅ PASSES

Step 3: REFACTOR - Improve Code

// src/auth/AuthService.ts
// UserRepository and LoginResult are project types assumed to be defined and imported elsewhere.
import bcrypt from 'bcrypt'
import jwt from 'jsonwebtoken'

export class AuthService {
  constructor(private userRepository: UserRepository) {}

  async login(email: string, password: string): Promise<LoginResult> {
    const user = await this.userRepository.findByEmail(email)
    if (!user) {
      throw new Error('Invalid credentials')
    }

    const isValid = await bcrypt.compare(password, user.passwordHash)
    if (!isValid) {
      throw new Error('Invalid credentials')
    }

    const token = jwt.sign({ userId: user.id }, process.env.JWT_SECRET!)

    return {
      success: true,
      user: { id: user.id, email: user.email },
      token
    }
  }
}

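Note that this version does more than refactor: it adds real credential checks and a repository dependency. The constructor now requires a UserRepository, so the Step 1 test must be updated to pass one in (for example, an in-memory fake). Drive each new behavior, such as rejecting an unknown email or a wrong password, with its own failing test first, then refactor again.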
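One way to keep the suite fast and green after this change is to hand the service an in-memory repository. The sketch below is illustrative only: the `User`, `UserRepository`, and `LoginResult` shapes, the `InMemoryUserRepository` fake, and the `createTestAuthService` helper are assumptions that do not appear in the original, and the password check and token signer are injected stand-ins for `bcrypt.compare` and `jwt.sign` so the example runs without those packages.

```typescript
// Hypothetical shapes: the original does not show these definitions.
interface User { id: string; email: string; passwordHash: string }
interface UserRepository { findByEmail(email: string): Promise<User | undefined> }
interface LoginResult {
  success: boolean
  user: { id: string; email: string }
  token: string
}

// Injected stand-ins for bcrypt.compare and jwt.sign (assumption for this sketch).
type PasswordCheck = (password: string, hash: string) => Promise<boolean>
type TokenSigner = (payload: { userId: string }) => string

class AuthService {
  private users: UserRepository
  private checkPassword: PasswordCheck
  private signToken: TokenSigner

  constructor(users: UserRepository, checkPassword: PasswordCheck, signToken: TokenSigner) {
    this.users = users
    this.checkPassword = checkPassword
    this.signToken = signToken
  }

  async login(email: string, password: string): Promise<LoginResult> {
    const user = await this.users.findByEmail(email)
    // Same error for an unknown email and a wrong password, as in the version above.
    if (!user || !(await this.checkPassword(password, user.passwordHash))) {
      throw new Error('Invalid credentials')
    }
    return {
      success: true,
      user: { id: user.id, email: user.email },
      token: this.signToken({ userId: user.id }),
    }
  }
}

// Test double: keeps users in an array instead of a database.
class InMemoryUserRepository implements UserRepository {
  private users: User[]

  constructor(users: User[]) {
    this.users = users
  }

  async findByEmail(email: string): Promise<User | undefined> {
    return this.users.find((user) => user.email === email)
  }
}

// Builds a service with one known user and deterministic fakes.
function createTestAuthService(): AuthService {
  const repository = new InMemoryUserRepository([
    { id: 'u1', email: 'user@example.com', passwordHash: 'hash:password123' },
  ])
  const checkPassword: PasswordCheck = async (password, hash) => hash === `hash:${password}`
  const signToken: TokenSigner = ({ userId }) => `token-for-${userId}`
  return new AuthService(repository, checkPassword, signToken)
}
```

A test can then call `createTestAuthService().login('user@example.com', 'password123')` and assert on the result, while a second test asserts that a wrong password rejects with 'Invalid credentials'.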

TDD Best Practices

Write Tests at Multiple Levels

// Unit Test - Fast, isolated
describe('calculateTotal', () => {
  it('should sum array of numbers', () => {
    expect(calculateTotal([1, 2, 3])).toBe(6)
  })
})

// Integration Test - Multiple components
describe('Order API', () => {
  it('should create order and update inventory', async () => {
    const order = await createOrder({ items: [...] })
    const inventory = await getInventory(order.items[0].productId)
    expect(inventory.quantity).toBe(originalQuantity - 1)
  })
})

// E2E Test - Full user workflow
describe('Checkout Flow', () => {
  it('should complete purchase from cart to confirmation', async () => {
    await page.goto('/cart')
    await page.click('[data-testid="checkout-button"]')
    // ... complete flow
  })
})

Test Names Should Be Descriptive

// ❌ BAD
it('works', () => { ... })
it('test1', () => { ... })

// ✅ GOOD
it('should throw error when email is invalid', () => { ... })
it('should return 404 when user not found', () => { ... })
it('should cache results for 5 minutes', () => { ... })

One Assertion Per Test (When Possible)

// ❌ BAD - Multiple unrelated assertions
it('should handle user operations', () => {
  expect(createUser(...)).toBeDefined()
  expect(deleteUser(...)).toBe(true)
  expect(updateUser(...)).toEqual(...)
})

// ✅ GOOD - Focused tests
describe('User operations', () => {
  it('should create user successfully', () => {
    expect(createUser(...)).toBeDefined()
  })

  it('should delete user successfully', () => {
    expect(deleteUser(...)).toBe(true)
  })

  it('should update user successfully', () => {
    expect(updateUser(...)).toEqual(...)
  })
})

Test Edge Cases and Error Conditions

describe('divide', () => {
  it('should divide two positive numbers', () => {
    expect(divide(10, 2)).toBe(5)
  })

  it('should handle negative numbers', () => {
    expect(divide(-10, 2)).toBe(-5)
  })

  it('should throw error when dividing by zero', () => {
    expect(() => divide(10, 0)).toThrow('Division by zero')
  })

  it('should handle decimal results', () => {
    expect(divide(10, 3)).toBeCloseTo(3.333, 3)
  })
})
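A minimal implementation that these four tests could drive might look like the sketch below. The parameter names are illustrative; the only behavior taken from the tests is the quotient and the 'Division by zero' error message.

```typescript
// Minimal implementation driven by the divide tests above.
function divide(dividend: number, divisor: number): number {
  if (divisor === 0) {
    // Without this guard JavaScript would return Infinity (or NaN for 0 / 0)
    // instead of signalling an error.
    throw new Error('Division by zero')
  }
  return dividend / divisor
}
```

The zero check exists only because a test demanded it, which is the point of writing the edge-case test first.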

TDD Anti-Patterns to Avoid

❌ Writing Tests After Implementation

Problem: Tests become validation of existing code rather than specification of desired behavior.

Solution: Always write tests first.

❌ Testing Implementation Details

// ❌ BAD - Tests internal implementation
it('should call validateEmail method', () => {
  const spy = jest.spyOn(service, 'validateEmail')
  service.register('user@example.com')
  expect(spy).toHaveBeenCalled()
})

// ✅ GOOD - Tests behavior
it('should reject invalid email addresses', () => {
  expect(() => service.register('invalid-email'))
    .toThrow('Invalid email format')
})
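To see why the behavior-focused test is more durable, consider a hypothetical service in which the validation helper is private. The `RegistrationService` class, its `isRegistered` method, and the simple email pattern below are assumptions made for illustration; the original only shows `register` and the 'Invalid email format' message.

```typescript
// Hypothetical service: only register() and isRegistered() are public behavior.
class RegistrationService {
  private registered: string[] = []

  register(email: string): void {
    if (!this.isValidEmail(email)) {
      throw new Error('Invalid email format')
    }
    this.registered.push(email)
  }

  isRegistered(email: string): boolean {
    return this.registered.includes(email)
  }

  // Internal detail: may be renamed, inlined, or replaced by a library
  // without breaking any test that only checks behavior.
  private isValidEmail(email: string): boolean {
    // Deliberately simple pattern for the sketch, not a full address validator.
    return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)
  }
}
```

A spy on the validation method would fail as soon as that helper is renamed or inlined, even though `register` still rejects 'invalid-email' exactly as before.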

❌ Large Test Setup

// ❌ BAD - Massive setup
it('should process order', () => {
  const user = createUser(...)
  const account = createAccount(...)
  const product1 = createProduct(...)
  const product2 = createProduct(...)
  const cart = createCart(...)
  cart.addItem(product1)
  cart.addItem(product2)
  // ... 20 more lines of setup

  expect(processOrder(cart)).toBe(true)
})

// ✅ GOOD - Use factories/fixtures
it('should process order', () => {
  const cart = createTestCart({ itemCount: 2 })
  expect(processOrder(cart)).toBe(true)
})
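A factory such as `createTestCart` could be sketched as follows. The `Product` and `Cart` shapes, the `priceCents` option, and the `createTestProduct` helper are hypothetical; the original only shows the `createTestCart({ itemCount: 2 })` call, and a real cart may be a class with an `addItem` method rather than a plain object.

```typescript
// Hypothetical data shapes for the sketch.
interface Product { id: string; name: string; priceCents: number }
interface Cart { items: Product[] }
interface TestCartOptions { itemCount?: number; priceCents?: number }

// Builds one predictable product so tests never depend on random data.
function createTestProduct(index: number, priceCents: number): Product {
  return { id: `product-${index}`, name: `Test Product ${index}`, priceCents }
}

// Factory with sensible defaults; each test overrides only what it cares about.
function createTestCart(options: TestCartOptions = {}): Cart {
  const { itemCount = 1, priceCents = 1000 } = options
  const items = Array.from({ length: itemCount }, (_, index) =>
    createTestProduct(index + 1, priceCents),
  )
  return { items }
}
```

Defaults keep the common case to a single line, while options make the detail that matters to a given test visible at the call site.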

❌ Slow Tests

Problem: Slow tests discourage running them frequently.

Solution:

Mock external dependencies
Use in-memory databases for integration tests
Parallelize test execution
Keep unit tests fast (<10ms each)
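Replacing an external dependency with a hand-written fake is one way to apply the first point. In the sketch below, the `RateClient` interface, the `convertAmount` function, and the `FakeRateClient` class are all hypothetical names chosen for illustration; the idea is only that the code under test receives its dependency as a parameter, so a test can supply a fake that never touches the network.

```typescript
// Hypothetical boundary to an external service (for example an HTTP API).
interface RateClient {
  getRate(from: string, to: string): Promise<number>
}

// Code under test: depends on the interface, not on a concrete HTTP client.
async function convertAmount(
  client: RateClient,
  amount: number,
  from: string,
  to: string,
): Promise<number> {
  const rate = await client.getRate(from, to)
  // Round to two decimal places.
  return Math.round(amount * rate * 100) / 100
}

// Fake used in unit tests: answers from memory and records how often it was called.
class FakeRateClient implements RateClient {
  calls = 0
  private rates: Record<string, number>

  constructor(rates: Record<string, number>) {
    this.rates = rates
  }

  async getRate(from: string, to: string): Promise<number> {
    this.calls += 1
    const rate = this.rates[`${from}:${to}`]
    if (rate === undefined) {
      throw new Error(`No rate for ${from}:${to}`)
    }
    return rate
  }
}
```

Because the fake resolves immediately, a unit test built on it has no network latency to wait for and cannot fail because a remote service is down.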

TDD in Different Contexts

API Development

# 1. Define OpenAPI spec first
# openapi/auth.yaml
paths:
  /auth/login:
    post:
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                email: { type: string }
                password: { type: string }

// 2. Write contract test
it('should match OpenAPI specification', async () => {
  const response = await request(app)
    .post('/auth/login')
    .send({ email: 'user@example.com', password: 'pass' })

  expect(response.status).toBe(200)
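  // toMatchSchema is not a built-in matcher; this assumes a custom matcher
  // (for example from the jest-json-schema package) has been registered.
  expect(response.body).toMatchSchema(loginResponseSchema)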
})

// 3. Implement endpoint
// 4. Refactor

Drupal Module Development

// 1. Write test first
class MyModuleTest extends KernelTestBase {
  public function testCustomEntityCreation() {
    $entity = MyCustomEntity::create([
      'title' => 'Test Entity',
      'status' => 1,
    ]);
    $entity->save();

    $this->assertNotNull($entity->id());
    $this->assertEquals('Test Entity', $entity->getTitle());
  }
}

// 2. Implement entity
// 3. Run tests: phpunit
// 4. Refactor

LLM Agent Development

// 1. Write test for agent behavior
it('should generate code based on specification', async () => {
  const agent = new CodeGeneratorAgent()
  const spec = loadOpenAPISpec('user-api.yaml')

  const result = await agent.generate({ spec })

  expect(result.files).toContain('src/api/user.ts')
  expect(result.testsGenerated).toBe(true)
  expect(result.coverage).toBeGreaterThan(0.8)
})

// 2. Implement agent
// 3. Run tests
// 4. Refactor

TDD Metrics

Coverage Requirements

Minimum 80% code coverage across all projects:

# Run tests with coverage
npm run test:coverage

# Check coverage report
# ✅ Statements: 85%
# ✅ Branches: 82%
# ✅ Functions: 88%
# ✅ Lines: 84%
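The threshold can be read as a simple gate over the four reported metrics. The sketch below assumes the 80% minimum applies to each metric individually, which the original does not state explicitly; the `CoverageSummary` shape and the `findCoverageFailures` function are hypothetical and are not part of any coverage tool's API.

```typescript
// Hypothetical summary shape mirroring the four figures in the report above.
interface CoverageSummary {
  statements: number
  branches: number
  functions: number
  lines: number
}

const MINIMUM_COVERAGE = 80

// Returns one message per metric that falls below the minimum (empty when all pass).
function findCoverageFailures(
  summary: CoverageSummary,
  minimum: number = MINIMUM_COVERAGE,
): string[] {
  const metrics = Object.keys(summary) as (keyof CoverageSummary)[]
  return metrics
    .filter((metric) => summary[metric] < minimum)
    .map((metric) => `${metric}: ${summary[metric]}% is below ${minimum}%`)
}
```

With the figures shown in the report (85, 82, 88, 84) the function returns an empty list; in practice the test runner's own coverage threshold setting would normally enforce this rather than custom code.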

Test Execution Time Targets

Unit Tests: <10ms per test
Integration Tests: <100ms per test
E2E Tests: <5s per test
Full Suite: <5 minutes

Test Reliability

Flaky Tests: 0% tolerance
False Positives: Immediate investigation
False Negatives: Immediate investigation
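Dependence on the real clock is a common source of flaky tests. One way to remove it, sketched here under assumptions, is to inject the time source. The `TtlCache` class and `Clock` type are hypothetical and serve only to show how a test like "should cache results for 5 minutes" can run deterministically without waiting.

```typescript
// A clock is any function returning the current time in milliseconds.
type Clock = () => number

// Hypothetical cache whose notion of "now" can be replaced in tests.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>()
  private ttlMs: number
  private now: Clock

  constructor(ttlMs: number, now: Clock = Date.now) {
    this.ttlMs = ttlMs
    this.now = now
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs })
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key)
    if (!entry) {
      return undefined
    }
    if (this.now() >= entry.expiresAt) {
      // Expired: drop the entry so it is not returned again.
      this.entries.delete(key)
      return undefined
    }
    return entry.value
  }
}
```

A test passes a fake clock such as `() => currentTime`, stores a value, advances `currentTime` past five minutes, and asserts that `get` returns `undefined`, with no sleeps or timers involved.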

TDD Enforcement

See TDD Enforcement for automated compliance checking.

Resources

Jest Documentation: https://jestjs.io/
Vitest Documentation: https://vitest.dev/
PHPUnit Documentation: https://phpunit.de/
Playwright E2E: https://playwright.dev/
Kent Beck's TDD Book: Test-Driven Development by Example