Agent Mesh - Home

Distributed Agent Coordination with Tailscale Integration and Zero-Trust Security

Overview

Agent Mesh is a production-grade distributed agent coordination layer that provides automatic agent discovery, intelligent task routing, secure communication, and load balancing across mesh networks. It is built on Tailscale's MagicDNS for zero-configuration service discovery and applies zero-trust security principles.

Core Function: Coordinates autonomous agents across distributed networks, providing service discovery, task distribution, secure transport, and authentication without manual configuration.

Quick Start

# Install
npm install @bluefly/agent-buildkit

# Deploy agent to mesh
buildkit agent:mesh deploy \
  --agent-id worker-1 \
  --agent-name "Task Worker" \
  --agent-type worker \
  --namespace production \
  --capabilities "task-execution,data-processing"

# Check mesh status
buildkit agent:mesh status

# Execute task on mesh
buildkit agent:mesh execute \
  --task-id task-001 \
  --task-type data-processing \
  --payload '{"data": "example"}'

# Discover agents
buildkit agent:mesh discover --namespace production

Key Features

Automatic Service Discovery: Tailscale MagicDNS integration for zero-configuration agent discovery
Intelligent Task Routing: Capability-based agent matching with multiple load balancing strategies
Secure Transport: Agent-to-agent communication over Tailscale encrypted network
Zero-Trust Authentication: JWT-based authentication with ACL policies per agent type
Load Balancing: Round-robin, least-loaded, capability-match, and random strategies
Health Monitoring: Automatic heartbeat checking and health status tracking
Fault Tolerance: Automatic retry logic, failover, and task reassignment
Multi-Namespace Support: Isolated agent groups for different environments

Core Documentation

Architecture - Mesh networking and service discovery architecture
Deployment Guide - Production deployment strategies
Development Guide - Local development and contribution

Advanced Topics

Tailscale Integration - MagicDNS and service discovery
Security Model - Zero-trust authentication and ACL policies
Load Balancing - Task distribution strategies
Agent Types - Orchestrator, worker, monitor, integrator, governor, critic

CLI Reference

Deploy Command - Register agents in the mesh
Status Command - Monitor mesh health and agents
Execute Command - Submit tasks to the mesh
Discover Command - Find agents in the network

Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│                      Agent Mesh Network                      │
├─────────────────────────────────────────────────────────────┤
│                                                               │
│  ┌──────────────┐     ┌──────────────┐     ┌──────────────┐│
│  │ Orchestrator │────▶│ Coordinator  │────▶│   Workers    ││
│  │   Agent      │     │   Service    │     │  (Pool)      ││
│  └──────────────┘     └──────────────┘     └──────────────┘│
│         │                     │                    │         │
│         │                     │                    │         │
│  ┌──────▼──────┐     ┌───────▼──────┐     ┌──────▼──────┐ │
│  │  Discovery  │     │   Transport  │     │     Auth     │ │
│  │   Service   │     │   Service    │     │   Service    │ │
│  └─────────────┘     └──────────────┘     └──────────────┘ │
│         │                     │                    │         │
│  ┌──────▼─────────────────────▼────────────────────▼──────┐│
│  │            Tailscale MagicDNS Network Layer             ││
│  │      (Zero-Config Service Discovery + Encryption)       ││
│  └─────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────┘

Agent Types

Orchestrator

Role: Master coordinator for complex workflows
Permissions: Full access to all resources and actions
Use Cases: Multi-agent task orchestration, workflow management

Worker

Role: Task execution and data processing
Permissions: Read and execute on tasks/results
Use Cases: CPU-intensive tasks, data transformations, batch processing

Monitor

Role: Read-only observability and metrics collection
Permissions: Read-only access to all resources
Use Cases: Health monitoring, metrics aggregation, alerting

Integrator

Role: External system integration
Permissions: Read/write access to integrations and data
Use Cases: API integrations, data pipelines, ETL processes

Governor

Role: Policy and governance management
Permissions: Admin access to policies, ACLs, audit logs
Use Cases: Security policy management, compliance, auditing

Critic

Role: Quality assurance and evaluation
Permissions: Read/write access to reviews and metrics
Use Cases: Output validation, quality scoring, feedback loops

Core Services

Discovery Service

Automatic agent registration and discovery using Tailscale MagicDNS:

- Agent registration with identity and capabilities
- Namespace-based isolation
- Health checking and heartbeat monitoring
- Capability-based filtering
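The registration, heartbeat, and filtering behaviour listed above can be sketched as a small in-memory registry. This is an illustration only, not the Agent Mesh implementation: the `AgentRecord` and `Registry` names, the 30-second heartbeat timeout, and the rule that a stale heartbeat marks an agent unhealthy are all assumptions.

```python
import time
from dataclasses import dataclass, field


@dataclass
class AgentRecord:
    """Identity and capabilities an agent supplies at registration (hypothetical shape)."""
    agent_id: str
    agent_type: str
    namespace: str
    capabilities: set
    last_heartbeat: float = field(default_factory=time.monotonic)


class Registry:
    """In-memory stand-in for a discovery service."""

    def __init__(self, heartbeat_timeout=30.0):
        # Assumed timeout; the real default is not documented here.
        self.heartbeat_timeout = heartbeat_timeout
        self._agents = {}

    def register(self, record):
        # Keying by (namespace, agent_id) keeps namespaces isolated.
        self._agents[(record.namespace, record.agent_id)] = record

    def heartbeat(self, namespace, agent_id, now=None):
        record = self._agents[(namespace, agent_id)]
        record.last_heartbeat = time.monotonic() if now is None else now

    def is_healthy(self, record, now=None):
        now = time.monotonic() if now is None else now
        return now - record.last_heartbeat <= self.heartbeat_timeout

    def discover(self, namespace, capability=None, now=None):
        """Return healthy agents in a namespace, optionally filtered by capability."""
        return [
            record
            for record in self._agents.values()
            if record.namespace == namespace
            and self.is_healthy(record, now)
            and (capability is None or capability in record.capabilities)
        ]
```

Calling `discover("production", capability="task-execution")` on such a registry mirrors what the `discover` CLI command does with `--namespace` and `--capability`. The `now` parameter exists only to make the health check easy to test.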

Coordinator Service

Intelligent task distribution and management:

- Task queue management
- Agent capability matching
- Load balancing strategies (round-robin, least-loaded, capability-match, random)
- Fault tolerance and retry logic
- Task status tracking and result retrieval
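As a rough illustration of the fault-tolerance item, the sketch below reassigns a task to the next candidate agent when a send fails. The `run_with_failover` function, the `send` callback, the result fields, and the default of three attempts are hypothetical and only meant to show the control flow.

```python
from typing import Callable, Sequence


class TaskFailed(Exception):
    """Raised when every attempt to place the task has failed."""


def run_with_failover(
    task: dict,
    agents: Sequence[str],
    send: Callable[[str, dict], dict],
    max_attempts: int = 3,
) -> dict:
    """Try the task on successive agents until one succeeds."""
    if not agents:
        raise TaskFailed("no agents available")
    errors = []
    for attempt in range(max_attempts):
        # Rotate through the candidates so a failed agent is not retried immediately.
        agent = agents[attempt % len(agents)]
        try:
            result = send(agent, task)
        except Exception as exc:  # broad on purpose: any failure triggers reassignment
            errors.append(f"{agent}: {exc}")
            continue
        return {
            "taskId": task["taskId"],
            "agent": agent,
            "attempts": attempt + 1,
            "status": "completed",
            "result": result,
        }
    raise TaskFailed("; ".join(errors))
```

A real coordinator would likely combine this with one of the load balancing strategies to pick the candidate order, and would record each status change so results can be retrieved later.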

Transport Service

Secure agent-to-agent communication:

- HTTP/HTTPS over Tailscale encrypted network
- Request/response patterns
- Streaming support for large payloads
- Broadcast messaging
- Connection pooling and retry logic
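Retry logic at the transport level is commonly implemented with exponential backoff. The sketch below shows that pattern under assumed values (four attempts, 0.1 s base delay, 5 s cap); the source does not state which backoff policy or limits Agent Mesh uses, and `call_with_retry` is a hypothetical helper.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def backoff_delays(attempts: int, base: float = 0.1, factor: float = 2.0, cap: float = 5.0) -> List[float]:
    """Delay before each retry: base, base*factor, ... never exceeding cap."""
    return [min(cap, base * factor ** i) for i in range(attempts - 1)]


def call_with_retry(
    request: Callable[[], T],
    attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(), retrying on ConnectionError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delays = backoff_delays(attempts)
    for i in range(attempts):
        try:
            return request()
        except ConnectionError:
            if i == attempts - 1:
                raise  # out of retries: surface the last error to the caller
            sleep(delays[i])
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting. Only `ConnectionError` is retried here, on the assumption that other errors (for example a rejected token) would not succeed on a second try.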

Auth Service

Zero-trust authentication and authorization:

- JWT-based agent authentication
- ACL policies per agent type
- Permission-based access control
- Security audit logging
- Token revocation

Performance Metrics

Service Discovery: <100ms agent lookup time
Task Routing: <50ms agent selection and assignment
Transport Latency: <10ms agent-to-agent (within Tailscale network)
Failover Time: <5s automatic task reassignment
Concurrent Tasks: 10,000+ tasks per coordinator instance
Agent Scalability: 1,000+ agents per mesh network

CLI Commands

Deploy Agent

buildkit agent:mesh deploy \
  --agent-id worker-1 \
  --agent-name "Task Worker" \
  --agent-type worker \
  --namespace production \
  --capabilities "task-execution,data-processing" \
  --port 3000

Check Status

buildkit agent:mesh status \
  --namespace production \
  --health healthy

Execute Task

buildkit agent:mesh execute \
  --task-id task-001 \
  --task-type data-processing \
  --payload '{"data": "example"}' \
  --priority high \
  --timeout 300000

Discover Agents

buildkit agent:mesh discover \
  --namespace production \
  --agent-type worker \
  --capability task-execution

Generate Auth Token

buildkit agent:mesh auth \
  --agent-id worker-1 \
  --agent-type worker \
  --namespace production \
  --capabilities "task-execution"

View Workload

buildkit agent:mesh workload --agent-id worker-1

Load Balancing Strategies

Round-Robin

Distributes tasks evenly across all agents
Simple and predictable distribution
Best for homogeneous agent pools

Least-Loaded

Routes tasks to agents with fewest active tasks
Optimizes for balanced workload distribution
Best for heterogeneous environments

Capability-Match

Prefers agents with exact capability matches
Optimizes for specialized task execution
Best for diverse task requirements

Random

Randomly selects from available agents
Simple and stateless
Best for testing and development
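The four strategies can be compared side by side in a short sketch. The `Agent` and `Balancer` names are hypothetical, and the reading of capability-match used here (only agents that cover every required capability are eligible, and the one with the fewest extra capabilities wins) is an assumption about what "exact capability match" means.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass
class Agent:
    agent_id: str
    capabilities: frozenset
    active_tasks: int = 0


class Balancer:
    """Illustrative selection strategies over a fixed agent pool."""

    def __init__(self, agents, seed=None):
        self.agents = list(agents)
        if not self.agents:
            raise ValueError("agent pool is empty")
        self._next_index = itertools.cycle(range(len(self.agents)))
        self._rng = random.Random(seed)

    def round_robin(self):
        # Visit agents in order, wrapping around at the end of the pool.
        return self.agents[next(self._next_index)]

    def least_loaded(self):
        # Pick the agent with the fewest active tasks (first one wins ties).
        return min(self.agents, key=lambda a: a.active_tasks)

    def capability_match(self, required):
        required = frozenset(required)
        eligible = [a for a in self.agents if required <= a.capabilities]
        if not eligible:
            raise LookupError(f"no agent offers {sorted(required)}")
        # Zero surplus capabilities means an exact match, which sorts first.
        return min(eligible, key=lambda a: len(a.capabilities - required))

    def random_choice(self):
        return self._rng.choice(self.agents)
```

Round-robin and random need no knowledge of agent state, which is why they suit homogeneous pools and testing. Least-loaded and capability-match depend on the coordinator keeping `active_tasks` and `capabilities` current.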

Security Model

Network Layer

Tailscale Encryption: WireGuard-based end-to-end encryption
Zero-Configuration: MagicDNS automatic service discovery
Network Isolation: Namespace-based logical separation

Application Layer

JWT Authentication: Signed tokens with expiration
ACL Policies: Role-based access control per agent type
Audit Logging: Comprehensive security event tracking
Token Revocation: Immediate access termination
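To make the application-layer items concrete, here is a minimal standard-library sketch of an HS256-signed JWT with expiration and revocation checks. The project itself lists the Node.js `jsonwebtoken` package, so this is a stand-in for illustration: the `agentType` claim, the in-memory `revoked` set keyed by `jti`, and the choice of HS256 are assumptions. A production service should use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_encode(data: bytes) -> str:
    """Base64url without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature for the given claims."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    ]
    signing_input = ".".join(parts)
    return signing_input + "." + _b64url_encode(_sign(signing_input, secret))


def verify_token(token: str, secret: bytes, now=None, revoked=frozenset()) -> dict:
    """Return the claims if the token is authentic, unexpired, and not revoked."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    expected = _sign(header_b64 + "." + payload_b64, secret)
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    now = time.time() if now is None else now
    # A token without an exp claim is treated as already expired.
    if claims.get("exp", 0) <= now:
        raise ValueError("token expired")
    if claims.get("jti") in revoked:
        raise ValueError("token revoked")
    return claims
```

Adding a token's `jti` to the `revoked` set makes the next `verify_token` call fail, which is one simple way to get the immediate access termination described above.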

Default ACL Policies

Agent Type	Allowed Actions	Allowed Resources	Denied Actions
Orchestrator	* (all)	* (all)	None
Worker	read, execute	tasks, results	admin
Monitor	read	* (all)	write, execute, admin
Integrator	read, write	integrations, data	admin
Governor	read, write, admin	policies, acl, audit	None
Critic	read, write	reviews, feedback, metrics	admin
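The table above translates directly into a lookup plus a permission check. In this sketch the policy values are copied from the table, while the evaluation rules are assumptions: an explicit deny overrides an allow, `*` matches anything, and unknown agent types are denied by default. The `Policy` and `is_allowed` names are hypothetical.

```python
from dataclasses import dataclass

ALL = "*"


@dataclass(frozen=True)
class Policy:
    allowed_actions: frozenset
    allowed_resources: frozenset
    denied_actions: frozenset = frozenset()


# Values taken from the default ACL table; keys are lowercase agent types.
DEFAULT_POLICIES = {
    "orchestrator": Policy(frozenset({ALL}), frozenset({ALL})),
    "worker": Policy(frozenset({"read", "execute"}), frozenset({"tasks", "results"}), frozenset({"admin"})),
    "monitor": Policy(frozenset({"read"}), frozenset({ALL}), frozenset({"write", "execute", "admin"})),
    "integrator": Policy(frozenset({"read", "write"}), frozenset({"integrations", "data"}), frozenset({"admin"})),
    "governor": Policy(frozenset({"read", "write", "admin"}), frozenset({"policies", "acl", "audit"})),
    "critic": Policy(frozenset({"read", "write"}), frozenset({"reviews", "feedback", "metrics"}), frozenset({"admin"})),
}


def is_allowed(agent_type: str, action: str, resource: str) -> bool:
    """Check one (action, resource) pair against the agent type's policy."""
    policy = DEFAULT_POLICIES.get(agent_type)
    if policy is None:
        return False  # zero-trust default: unknown agent types get nothing
    if action in policy.denied_actions:
        return False  # explicit deny wins over any allow
    action_ok = ALL in policy.allowed_actions or action in policy.allowed_actions
    resource_ok = ALL in policy.allowed_resources or resource in policy.allowed_resources
    return action_ok and resource_ok
```

For example, `is_allowed("worker", "execute", "tasks")` is true, while `is_allowed("worker", "admin", "tasks")` and `is_allowed("governor", "read", "tasks")` are both false under these rules.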

Integration Examples

Python Agent

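import requests
from flask import Flask, jsonify, request

app = Flask(__name__)

def process_task(payload):
    # Placeholder: replace with the agent's real task logic
    return payload

# Agent registration
response = requests.post('http://agent-mesh:3000/mesh/register', json={
    'agentId': 'python-worker-1',
    'agentName': 'Python Data Processor',
    'agentType': 'worker',
    'namespace': 'production',
    'capabilities': ['python', 'data-processing', 'ml'],
    'port': 8080
}, timeout=10)
response.raise_for_status()  # Fail fast if registration was rejected

# Task execution endpoint
@app.route('/mesh/rpc', methods=['POST'])
def execute_task():
    task = request.json
    result = process_task(task['payload'])
    return jsonify({'success': True, 'result': result})

if __name__ == '__main__':
    # Listen on the port advertised at registration
    app.run(host='0.0.0.0', port=8080)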

Node.js Agent

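import axios from 'axios';
import express from 'express';

const app = express();
app.use(express.json()); // Parse JSON request bodies

// Placeholder: replace with the agent's real task logic
async function processTask(payload) {
  return payload;
}

// Agent registration (top-level await requires an ES module)
await axios.post('http://agent-mesh:3000/mesh/register', {
  agentId: 'node-worker-1',
  agentName: 'Node.js API Processor',
  agentType: 'worker',
  namespace: 'production',
  capabilities: ['nodejs', 'api-integration', 'json'],
  port: 8080
});

// Task execution endpoint
app.post('/mesh/rpc', async (req, res) => {
  const task = req.body;
  const result = await processTask(task.payload);
  res.json({ success: true, result });
});

// Listen on the port advertised at registration
app.listen(8080);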

Technology Stack

Runtime: Node.js 20+, TypeScript 5.0+
Service Discovery: Tailscale MagicDNS
Authentication: JWT (jsonwebtoken)
Validation: Zod schemas
HTTP Client: Axios with connection pooling
CLI: Commander.js with chalk styling
Logging: Structured logging with log levels

Related Packages

@bluefly/agent-router - LLM request routing
@bluefly/agent-protocol - MCP protocol implementation
@bluefly/agent-brain - Knowledge graph and vector search
@bluefly/agent-buildkit - Enterprise agent framework

Quick Links

Repository: https://gitlab.bluefly.io/llm/npm/agent-buildkit
Issues: https://gitlab.bluefly.io/llm/npm/agent-buildkit/-/issues
CI/CD: https://gitlab.bluefly.io/llm/npm/agent-buildkit/-/pipelines
Package Registry: https://gitlab.bluefly.io/llm/npm/agent-buildkit/-/packages

Support

Issues: https://gitlab.bluefly.io/llm/npm/agent-buildkit/-/issues
Documentation: This wiki
Team: LLM Platform Team (llm-platform@bluefly.io)

Last Updated: 2025-11-02
Maintainer: LLM Platform Team
License: MIT