
Agent Chat - API Reference

REST, WebSocket, and GraphQL Endpoint Documentation

Overview

Agent Chat provides three API interfaces:

1. REST API: HTTP endpoints for chat operations, service management, and CLI
2. WebSocket API: Real-time streaming and collaboration via Socket.IO
3. GraphQL API: Structured queries, mutations, and subscriptions

All APIs support authentication via JWT tokens or session cookies.

Base URLs

Authentication

JWT Bearer Token

Authorization: Bearer eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...

Session Cookie

Cookie: session=abc123...

API Key (Service-to-Service)

X-API-Key: your-api-key-here
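
The three credential styles above can be attached to a request with a small helper. This is a minimal sketch; the header names come from the examples above, and `authHeaders` is a hypothetical helper, not part of the platform SDK.

```javascript
// Build request headers for one of the supported credential styles.
// Exactly one style should be used per request.
function authHeaders({ jwt, sessionCookie, apiKey }) {
  if (jwt) return { Authorization: `Bearer ${jwt}` };          // JWT bearer token
  if (sessionCookie) return { Cookie: `session=${sessionCookie}` }; // session cookie
  if (apiKey) return { 'X-API-Key': apiKey };                  // service-to-service key
  throw new Error('No credentials provided');
}
```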

REST API Endpoints

Health & Info

GET /health

Service health check with ecosystem status.

Response:

{
  "status": "healthy",
  "service": "bluefly-enterprise-chat",
  "version": "0.1.0",
  "timestamp": "2025-11-02T10:00:00Z",
  "ecosystem": {
    "gateway": "http://localhost:4000",
    "vector": "http://localhost:4002",
    "tddai": "http://localhost:3002"
  },
  "voice": {
    "enabled": true,
    "status": "connected"
  },
  "librechat": {
    "enhanced": true,
    "kagent": true,
    "ossa": true,
    "agentStudio": true
  },
  "consumers": {
    "drupal": true,
    "llmPlatform": true
  },
  "rocketship": {
    "enabled": true,
    "features": 13,
    "statusEndpoint": "/api/rocketship/status"
  }
}

GET /api/info

API information and feature list.

Response:

{
  "name": "Bluefly Enterprise Chat Platform",
  "description": "LibreChat-powered enterprise chat with ecosystem integration",
  "version": "0.1.0",
  "features": [
    "LibreChat Integration",
    "TDDAI Cursor Agent Support",
    "Vector Search & RAG",
    "Agentic Workflows",
    "Enterprise Security",
    "Multi-Provider LLM Gateway",
    "Echo Voice Assistant Integration"
  ],
  "endpoints": {
    "health": "/health",
    "chat": "/api/chat",
    "search": "/api/search",
    "agents": "/api/agents",
    "tddai": "/api/tddai",
    "voice": "/api/voice"
  }
}

Chat Endpoints (LibreChat Compatible)

POST /api/chat/completions

Send a chat message and receive a completion.

Request:

{
  "model": "claude-3-5-sonnet-20241022",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "What is the capital of France?"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "stream": false
}

Response:

{
  "id": "msg_abc123",
  "object": "chat.completion",
  "created": 1730566800,
  "model": "claude-3-5-sonnet-20241022",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 8,
    "total_tokens": 28
  }
}
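
A non-streaming call to this endpoint might look like the sketch below (Node 18+ or any environment with `fetch`). The base URL and token source are placeholder assumptions; the request and response shapes match the examples above.

```javascript
// Placeholder base URL; adjust for your deployment.
const BASE_URL = 'http://localhost:3080';

// Build the fetch options for POST /api/chat/completions.
function buildCompletionRequest(messages, { model = 'claude-3-5-sonnet-20241022', temperature = 0.7, maxTokens = 1000 } = {}) {
  return {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.CHAT_JWT ?? ''}`, // assumed env var for the JWT
    },
    body: JSON.stringify({ model, messages, temperature, max_tokens: maxTokens, stream: false }),
  };
}

// Send the request and return the assistant's reply text.
async function complete(messages) {
  const res = await fetch(`${BASE_URL}/api/chat/completions`, buildCompletionRequest(messages));
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}
```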

POST /api/chat/stream

Streaming chat completion via Server-Sent Events (SSE).

Request: Same as /api/chat/completions with "stream": true

Response (SSE stream):

data: {"id":"msg_abc123","choices":[{"delta":{"content":"The"}}]}

data: {"id":"msg_abc123","choices":[{"delta":{"content":" capital"}}]}

data: {"id":"msg_abc123","choices":[{"delta":{"content":" of"}}]}

data: {"id":"msg_abc123","choices":[{"delta":{"content":" France"}}]}

data: {"id":"msg_abc123","choices":[{"delta":{"content":" is"}}]}

data: {"id":"msg_abc123","choices":[{"delta":{"content":" Paris"}}]}

data: {"id":"msg_abc123","choices":[{"finish_reason":"stop"}]}

data: [DONE]
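
On the client, the `data:` lines above can be reassembled into the full completion. A minimal sketch, assuming the event shapes shown in the stream (the parser itself is illustrative, not an SDK function):

```javascript
// Reassemble the streamed completion from raw SSE lines.
function assembleSSE(lines) {
  let text = '';
  for (const line of lines) {
    if (!line.startsWith('data: ')) continue;     // skip blank/comment lines
    const payload = line.slice('data: '.length);
    if (payload === '[DONE]') break;              // end-of-stream sentinel
    const delta = JSON.parse(payload).choices[0].delta;
    if (delta && delta.content) text += delta.content; // final chunk has only finish_reason
  }
  return text;
}
```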

Conversation Management

GET /api/conversations

List user's conversations.

Query Parameters:

- limit (int, default: 20): Number of conversations to return
- offset (int, default: 0): Pagination offset

Response:

{
  "conversations": [
    {
      "id": "conv_123",
      "title": "Help with Python",
      "created_at": "2025-11-01T10:00:00Z",
      "updated_at": "2025-11-01T10:15:00Z",
      "message_count": 12,
      "model": "gpt-4-turbo"
    }
  ],
  "total": 45,
  "limit": 20,
  "offset": 0
}
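
To page through all conversations, advance `offset` by `limit` until the returned `total` is covered. The paging arithmetic can be sketched as a pure helper (hypothetical, for illustration):

```javascript
// Compute the sequence of offsets needed to page through `total` items.
function pageOffsets(total, limit = 20) {
  const offsets = [];
  for (let offset = 0; offset < total; offset += limit) offsets.push(offset);
  return offsets;
}
```

For the example response above (`total: 45`, `limit: 20`), three requests are needed, at offsets 0, 20, and 40.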

GET /api/conversations/:id

Get conversation by ID with full history.

Response:

{
  "id": "conv_123",
  "title": "Help with Python",
  "created_at": "2025-11-01T10:00:00Z",
  "messages": [
    {
      "id": "msg_1",
      "role": "user",
      "content": "How do I read a CSV file in Python?",
      "timestamp": "2025-11-01T10:00:00Z"
    },
    {
      "id": "msg_2",
      "role": "assistant",
      "content": "You can use the csv module...",
      "timestamp": "2025-11-01T10:00:05Z",
      "model": "gpt-4-turbo",
      "usage": {
        "prompt_tokens": 15,
        "completion_tokens": 120,
        "total_tokens": 135
      }
    }
  ]
}

DELETE /api/conversations/:id

Delete a conversation.

Response:

{
  "success": true,
  "message": "Conversation deleted"
}

POST /api/search

Semantic search across conversation history.

Request:

{
  "query": "python csv file reading",
  "limit": 5,
  "min_score": 0.7,
  "filters": {
    "user_id": "user_123",
    "date_range": {
      "start": "2025-10-01",
      "end": "2025-11-01"
    }
  }
}

Response:

{
  "results": [
    {
      "id": "msg_456",
      "content": "You can use the csv module...",
      "score": 0.92,
      "conversation_id": "conv_123",
      "timestamp": "2025-11-01T10:00:05Z",
      "metadata": {
        "model": "gpt-4-turbo",
        "topic": "python-csv"
      }
    }
  ],
  "total": 1,
  "query_time_ms": 45
}

Agent Operations

POST /api/agents/orchestrate

Orchestrate an agent swarm for a complex task.

Request:

{
  "task": "Analyze sales data and generate report",
  "agents": ["data-analyst", "report-writer"],
  "context": {
    "data_source": "sales_q4_2024.csv"
  },
  "options": {
    "parallel": true,
    "timeout": 300000
  }
}

Response:

{
  "orchestration_id": "orch_789",
  "status": "running",
  "agents": [
    {
      "id": "agent_1",
      "name": "data-analyst",
      "status": "running"
    },
    {
      "id": "agent_2",
      "name": "report-writer",
      "status": "pending"
    }
  ],
  "progress_url": "/api/agents/orchestrate/orch_789/progress"
}

Model Management

GET /api/models

List available LLM models.

Response:

{
  "models": [
    {
      "id": "claude-3-5-sonnet-20241022",
      "name": "Claude 3.5 Sonnet",
      "provider": "anthropic",
      "context_window": 200000,
      "max_output": 8192,
      "capabilities": ["chat", "vision", "function_calling"],
      "cost_per_1k_tokens": {
        "input": 0.003,
        "output": 0.015
      }
    },
    {
      "id": "gpt-4-turbo",
      "name": "GPT-4 Turbo",
      "provider": "openai",
      "context_window": 128000,
      "max_output": 4096,
      "capabilities": ["chat", "vision", "function_calling"],
      "cost_per_1k_tokens": {
        "input": 0.01,
        "output": 0.03
      }
    }
  ]
}
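
The `cost_per_1k_tokens` figures combine with a completion's `usage` block to estimate per-request cost. A minimal sketch (the helper is illustrative, not part of the API):

```javascript
// Estimate the cost of one completion from its usage block and a model's
// cost_per_1k_tokens entry, as returned by GET /api/models.
function estimateCost(usage, costPer1k) {
  return (usage.prompt_tokens / 1000) * costPer1k.input +
         (usage.completion_tokens / 1000) * costPer1k.output;
}
```

For the earlier completion example (20 prompt tokens, 8 completion tokens on Claude 3.5 Sonnet), this comes to roughly $0.00018.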

Service Management (CLI API)

POST /api/cli/start

Start LibreChat services.

Request:

{
  "profile": "simple"
}

Response:

{
  "success": true,
  "message": "LibreChat services started successfully",
  "services": ["librechat", "mongodb", "vectordb", "meilisearch"]
}

POST /api/cli/stop

Stop LibreChat services.

Response:

{
  "success": true,
  "message": "LibreChat services stopped"
}

GET /api/cli/status

Show service status.

Response:

{
  "containers": [
    {
      "name": "bluefly-librechat",
      "status": "Up 2 hours",
      "ports": "0.0.0.0:80->3080/tcp",
      "image": "ghcr.io/danny-avila/librechat:latest"
    }
  ]
}

WebSocket API (Socket.IO)

Connection

import io from 'socket.io-client';

const socket = io('http://localhost:3080', {
  auth: {
    token: 'your-jwt-token'
  }
});

Events (Client → Server)

chat_message

Send a chat message.

socket.emit('chat_message', {
  conversation_id: 'conv_123',
  message: 'Hello, how are you?',
  model: 'claude-3-5-sonnet-20241022',
  stream: true
});

voice_command

Send a voice command from the Echo Voice Assistant.

socket.emit('voice_command', {
  transcription: 'Create a new chat session',
  confidence: 0.95
});

subscribe

Subscribe to real-time updates.

socket.emit('subscribe', {
  type: 'agent_progress',
  agent_id: 'agent_123'
});

Events (Server → Client)

chat_response

Receive a streaming AI response.

socket.on('chat_response', (data) => {
  console.log('Token:', data.token);
  console.log('Conversation ID:', data.conversation_id);
  console.log('Finished:', data.finished);
});

Payload:

{
  "conversation_id": "conv_123",
  "message_id": "msg_456",
  "token": "The",
  "finished": false,
  "model": "claude-3-5-sonnet-20241022"
}
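
A client typically accumulates these payloads into the full message, stopping when `finished` is true. A minimal sketch over the payload shape shown above (the accumulator is illustrative):

```javascript
// Accumulate chat_response payloads into the complete assistant message.
function collectTokens(events) {
  let text = '';
  for (const e of events) {
    if (e.token) text += e.token;   // append each streamed token
    if (e.finished) break;          // stop at the final payload
  }
  return text;
}
```

In practice the same logic runs incrementally inside `socket.on('chat_response', ...)`, keyed by `conversation_id`.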

agent_progress

Real-time agent task progress.

socket.on('agent_progress', (data) => {
  console.log('Progress:', data.progress);
  console.log('Status:', data.status);
});

Payload:

{
  "agent_id": "agent_123",
  "task_id": "task_456",
  "progress": 0.45,
  "status": "analyzing_data",
  "message": "Processing sales records..."
}

voice_feedback

Voice assistant response.

socket.on('voice_feedback', (data) => {
  console.log('Speech:', data.speech);
});

Payload:

{
  "speech": "I've created a new chat session for you.",
  "action": "session_created",
  "session_id": "conv_789"
}

GraphQL API

Schema Overview

Access GraphQL Playground: http://localhost:3080/graphql

Queries

conversations

List conversations with pagination.

query GetConversations {
  conversations(limit: 10, offset: 0) {
    id
    title
    createdAt
    updatedAt
    messageCount
    messages {
      id
      role
      content
      model
      timestamp
      usage {
        promptTokens
        completionTokens
        totalTokens
      }
    }
  }
}

searchConversations

Semantic search across conversations.

query SearchConversations($query: String!) {
  searchConversations(query: $query) {
    id
    title
    relevanceScore
    messages {
      id
      content
      timestamp
    }
  }
}

agentHealth

Check agent health status.

query AgentHealth($agentId: ID!) {
  agentHealth(agentId: $agentId) {
    status
    uptime
    requestCount
    errorRate
    metrics {
      latency
      throughput
      tokenUsage
    }
  }
}

Mutations

sendMessage

Send a message to a conversation.

mutation SendMessage($input: MessageInput!) {
  sendMessage(input: $input) {
    id
    content
    model
    timestamp
    usage {
      promptTokens
      completionTokens
      totalTokens
    }
  }
}

Variables:

{
  "input": {
    "conversationId": "conv_123",
    "content": "What is 2+2?",
    "model": "gpt-4-turbo",
    "stream": false
  }
}
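
Over plain HTTP, this mutation is a POST to /graphql with a `query` and `variables` body. A minimal sketch, assuming the endpoint and bearer token shown earlier (the `graphqlBody` helper is illustrative):

```javascript
// The mutation document, as defined above.
const SEND_MESSAGE = `mutation SendMessage($input: MessageInput!) {
  sendMessage(input: $input) { id content model timestamp }
}`;

// Build the standard GraphQL-over-HTTP request body.
function graphqlBody(query, variables) {
  return JSON.stringify({ query, variables });
}

// Usage (placeholder endpoint and token):
// await fetch('http://localhost:3080/graphql', {
//   method: 'POST',
//   headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${token}` },
//   body: graphqlBody(SEND_MESSAGE, {
//     input: { conversationId: 'conv_123', content: 'What is 2+2?', model: 'gpt-4-turbo', stream: false },
//   }),
// });
```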

createConversation

Create a new conversation.

mutation CreateConversation($title: String, $model: String) {
  createConversation(title: $title, model: $model) {
    id
    title
    createdAt
    model
  }
}

Subscriptions

messageStream

Subscribe to a streaming message.

subscription MessageStream($conversationId: ID!) {
  messageStream(conversationId: $conversationId) {
    id
    token
    finished
    model
  }
}

agentProgress

Subscribe to agent progress updates.

subscription AgentProgress($agentId: ID!) {
  agentProgress(agentId: $agentId) {
    agentId
    taskId
    progress
    status
    message
  }
}

Error Responses

Standard Error Format

{
  "success": false,
  "error": "Invalid request",
  "details": "Missing required field: message",
  "code": "VALIDATION_ERROR",
  "timestamp": "2025-11-02T10:00:00Z"
}

HTTP Status Codes

Rate Limiting

When the rate limit is exceeded:

{
  "error": "Rate limit exceeded",
  "limit": 100,
  "window": "1m",
  "retry_after": 45
}

Headers:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1730567100
Retry-After: 45
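
Clients should back off using these headers before retrying. A minimal sketch, assuming lowercased header keys as returned by fetch's `Headers` iteration (the helper itself is illustrative):

```javascript
// Compute how long to wait before retrying, from the rate-limit headers above.
// Prefers Retry-After (seconds); falls back to X-RateLimit-Reset (Unix seconds).
function retryDelayMs(headers, nowMs = Date.now()) {
  const retryAfter = headers['retry-after'];
  if (retryAfter) return Number(retryAfter) * 1000;
  const reset = headers['x-ratelimit-reset'];
  if (reset) return Math.max(0, Number(reset) * 1000 - nowMs);
  return 1000; // default backoff when no header is present
}
```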

OpenAPI Specifications

Full OpenAPI 3.1 specifications are available in the technical guide registry:

Registry Location: https://gitlab.bluefly.io/llm/technical-guide/openapi/agent-chat/


Related Pages:

- Architecture - System design and components
- Integration Guide - Integration examples
- Development - Local development setup

Last Updated: 2025-11-02