Security Architecture
Comprehensive security architecture for the Bluefly LLM Platform.
Overview
The platform implements defense-in-depth security across multiple layers:
- Authentication: JWT + OAuth 2.0 with GitLab
- Authorization: RBAC with granular permissions
- Encryption: AES-256-GCM at rest, TLS 1.3 in transit
- Compliance: FedRAMP Moderate, NIST 800-53, HIPAA, GDPR
- Audit: Complete audit trails with tamper-proof logging
- Network Security: Zero-trust architecture with mTLS
Security Layers
graph TB
A[Edge Security] --> B[Network Security]
B --> C[Authentication Layer]
C --> D[Authorization Layer]
D --> E[Application Security]
E --> F[Data Security]
F --> G[Audit & Monitoring]
1. Edge Security
Web Application Firewall (WAF)
Features:
- SQL injection prevention
- XSS attack mitigation
- CSRF protection
- Rate limiting
- DDoS protection
Configuration:
waf:
  enabled: true
  rulesets:
    - OWASP_CRS_3.3
    - custom-llm-rules
  blockMode: true
  logAll: true
Content Security Policy (CSP)
Headers:
Content-Security-Policy:
default-src 'self';
script-src 'self';
style-src 'self' 'unsafe-inline';
img-src 'self' data: https:;
connect-src 'self';
font-src 'self';
object-src 'none';
media-src 'self';
frame-src 'none';
Security Headers
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
Referrer-Policy: strict-origin-when-cross-origin
Permissions-Policy: geolocation=(), microphone=(), camera=()
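As a rough illustration of how the headers above could be attached to every response in a Node.js service, consider the sketch below. The names `SECURITY_HEADERS`, `HeaderSink`, and `applySecurityHeaders` are hypothetical and not taken from the platform's code. The values are copied from the list above.

```typescript
// Header values copied from the Security Headers list above.
const SECURITY_HEADERS: Record<string, string> = {
  "Strict-Transport-Security": "max-age=31536000; includeSubDomains; preload",
  "X-Content-Type-Options": "nosniff",
  "X-Frame-Options": "DENY",
  "X-XSS-Protection": "1; mode=block",
  "Referrer-Policy": "strict-origin-when-cross-origin",
  "Permissions-Policy": "geolocation=(), microphone=(), camera=()",
};

// Minimal shape of a response object (assumption). Node's http.ServerResponse
// satisfies it structurally, so the helper works with the built-in server.
interface HeaderSink {
  setHeader(name: string, value: string): unknown;
}

// Hypothetical helper: sets every security header on an outgoing response.
function applySecurityHeaders(res: HeaderSink): void {
  for (const [name, value] of Object.entries(SECURITY_HEADERS)) {
    res.setHeader(name, value);
  }
}
```

Keeping the values in one table means a header change is made in a single place and can be asserted in a unit test.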
2. Network Security
Zero-Trust Architecture
Principles:
- Never trust, always verify
- Least privilege access
- Assume breach mindset
- Verify explicitly
Implementation:
graph LR
A[Client] -->|mTLS| B[API Gateway]
B -->|JWT Validation| C[Auth Service]
C -->|RBAC Check| D[Service Mesh]
D -->|Encrypted| E[Microservices]
mTLS (Mutual TLS)
All service-to-service communication uses mTLS:
Certificate Authority:
- Internal CA for service certificates
- 90-day certificate rotation
- Automatic renewal via cert-manager
Example Configuration:
mtls:
  enabled: true
  mode: STRICT
  certificates:
    ca: /etc/certs/ca.crt
    cert: /etc/certs/service.crt
    key: /etc/certs/service.key
  rotation:
    autoRenew: true
    renewBefore: 720h # 30 days
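To make the configuration above more concrete, here is a minimal sketch of how a Node.js service might translate it into TLS server options and a renewal check. `ServiceCerts`, `buildMtlsServerOptions`, and `needsRenewal` are hypothetical names. Mapping `mode: STRICT` to `requestCert` plus `rejectUnauthorized`, and enforcing TLS 1.3 on mesh traffic, are assumptions about intent rather than the platform's actual implementation.

```typescript
import type { TlsOptions } from "node:tls";

// PEM contents for one service. In the config above these would be read from
// /etc/certs/ca.crt, /etc/certs/service.crt and /etc/certs/service.key.
interface ServiceCerts {
  ca: string;
  cert: string;
  key: string;
}

// Hypothetical helper: builds server options that require and verify a
// client certificate, which is the behavior assumed for `mode: STRICT`.
function buildMtlsServerOptions(certs: ServiceCerts): TlsOptions {
  return {
    ca: certs.ca,
    cert: certs.cert,
    key: certs.key,
    requestCert: true,        // ask the peer for a certificate
    rejectUnauthorized: true, // refuse peers not signed by the internal CA
    minVersion: "TLSv1.3",
  };
}

// Mirrors `renewBefore: 720h`: a certificate is due for renewal once it has
// 720 hours (30 days) or less of validity left.
function needsRenewal(notAfter: Date, now: Date, renewBeforeHours = 720): boolean {
  const msLeft = notAfter.getTime() - now.getTime();
  return msLeft <= renewBeforeHours * 60 * 60 * 1000;
}
```

The options object can be passed to `https.createServer`. In a cluster that uses cert-manager, the renewal check is handled by cert-manager itself, so `needsRenewal` only illustrates the arithmetic.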
Network Segmentation
Zones:
1. DMZ: Edge services (gateway, WAF)
2. Application: Services, APIs
3. Data: Databases, storage
4. Management: Admin tools, monitoring
Firewall Rules:
DMZ → Application: Port 443 (HTTPS), 50051 (gRPC)
Application → Data: Port 5432 (PostgreSQL), 6379 (Redis)
Management → All: Port 22 (SSH, admin only)
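The rules above can be read as an allowlist between zones. The sketch below encodes them as data and checks a connection against them. The names `Zone`, `Rule`, `RULES`, and `isAllowed` are hypothetical, and the default-deny behavior for anything not listed is an assumption that follows from the zero-trust principles rather than something stated in the rules themselves.

```typescript
type Zone = "dmz" | "application" | "data" | "management";

interface Rule {
  from: Zone;
  to: Zone | "all";
  ports: number[];
}

// The three firewall rules listed above, expressed as data.
const RULES: Rule[] = [
  { from: "dmz", to: "application", ports: [443, 50051] }, // HTTPS, gRPC
  { from: "application", to: "data", ports: [5432, 6379] }, // PostgreSQL, Redis
  { from: "management", to: "all", ports: [22] },           // SSH, admin only
];

// Default deny (assumption): traffic passes only if a rule explicitly allows it.
function isAllowed(from: Zone, to: Zone, port: number): boolean {
  return RULES.some(
    (rule) =>
      rule.from === from &&
      (rule.to === "all" || rule.to === to) &&
      rule.ports.includes(port),
  );
}
```

For example, `isAllowed("dmz", "data", 5432)` is `false` because no rule lets the DMZ reach the data zone directly.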
3. Authentication
See JWT Authentication and OAuth 2.0 for details.
Methods:
- JWT: Service-to-service
- OAuth 2.0: User authentication (GitLab)
- API Keys: External integrations
- mTLS: Microservice mesh
Token Security:
- RS256 signing (RSA 2048-bit)
- 1-hour access token lifetime
- 30-day refresh token lifetime
- Automatic key rotation (90 days)
- Token revocation support
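To show what RS256 signing with a 1-hour lifetime involves, here is a minimal sketch built only on Node's `crypto` module. `signToken` and `verifyToken` are hypothetical helpers, not the platform's API. A production service would more likely use a maintained JWT library, and key rotation and revocation are left out.

```typescript
import { createSign, createVerify, generateKeyPairSync, KeyObject } from "node:crypto";

const ACCESS_TOKEN_TTL_SECONDS = 60 * 60; // 1-hour access token lifetime

const encode = (value: unknown): string =>
  Buffer.from(JSON.stringify(value)).toString("base64url");

// Hypothetical helper: issues an RS256-signed token with iat/exp claims.
function signToken(claims: Record<string, unknown>, privateKey: KeyObject, nowSeconds: number): string {
  const header = encode({ alg: "RS256", typ: "JWT" });
  const payload = encode({ ...claims, iat: nowSeconds, exp: nowSeconds + ACCESS_TOKEN_TTL_SECONDS });
  const signature = createSign("RSA-SHA256")
    .update(`${header}.${payload}`)
    .sign(privateKey)
    .toString("base64url");
  return `${header}.${payload}.${signature}`;
}

// Hypothetical helper: returns the claims if the token is well formed,
// signed with RS256 by the expected key, and not expired; otherwise null.
function verifyToken(token: string, publicKey: KeyObject, nowSeconds: number): Record<string, unknown> | null {
  try {
    const parts = token.split(".");
    if (parts.length !== 3) return null;
    const [header, payload, signature] = parts;
    // Pin the algorithm so a token cannot downgrade itself via its header.
    if (JSON.parse(Buffer.from(header, "base64url").toString("utf8")).alg !== "RS256") return null;
    const valid = createVerify("RSA-SHA256")
      .update(`${header}.${payload}`)
      .verify(publicKey, signature, "base64url");
    if (!valid) return null;
    const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
    if (typeof claims.exp !== "number" || claims.exp <= nowSeconds) return null;
    return claims;
  } catch {
    return null; // malformed base64 or JSON
  }
}

// Usage with a freshly generated 2048-bit RSA key pair.
const { privateKey, publicKey } = generateKeyPairSync("rsa", { modulusLength: 2048 });
const token = signToken({ sub: "user-123" }, privateKey, Math.floor(Date.now() / 1000));
const claims = verifyToken(token, publicKey, Math.floor(Date.now() / 1000));
```

Passing the current time in explicitly keeps expiry checks deterministic in tests.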
4. Authorization
See RBAC Configuration for complete details.
Role-Based Access Control (RBAC)
Roles:
- Admin: Full system access
- Developer: Agent execution, workflow creation
- User: Read-only access
- Service: Service account permissions
Permission Model:
permission = resource:action
Examples:
- agent:execute
- workflow:create
- mesh:communicate
- admin:configure
Policy Enforcement
Open Policy Agent (OPA):
package authz

default allow = false

allow {
    input.user.roles[_] == "admin"
}

allow {
    input.user.permissions[_] == concat(":", [input.resource, input.action])
}
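For readers less familiar with Rego, the policy above can be mirrored in TypeScript, which is also handy for unit-testing expected decisions. This is only an illustrative equivalent under the assumption that `input` has the shape shown. `AuthzInput` and `allow` are hypothetical names, and the sketch is not a replacement for evaluating the policy in OPA.

```typescript
// Assumed shape of the OPA input document used by the policy above.
interface AuthzInput {
  user: { roles: string[]; permissions: string[] };
  resource: string;
  action: string;
}

// Mirrors the two `allow` rules: deny by default, allow admins, and allow
// users who hold the exact `resource:action` permission.
function allow(input: AuthzInput): boolean {
  if (input.user.roles.includes("admin")) return true;
  return input.user.permissions.includes(`${input.resource}:${input.action}`);
}
```

A developer holding `agent:execute` is allowed to execute agents but is denied `admin:configure`, while any user with the `admin` role is allowed regardless of permissions.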
Attribute-Based Access Control (ABAC)
Fine-grained access based on attributes:
{
  "user": {
    "id": "user-123",
    "roles": ["developer"],
    "groups": ["llm-platform/ml-team"],
    "clearance": "confidential"
  },
  "resource": {
    "type": "model",
    "classification": "confidential",
    "owner": "llm-platform/ml-team"
  },
  "action": "deploy",
  "context": {
    "time": "2025-01-15T10:00:00Z",
    "location": "us-west-2"
  }
}
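One way such a request could be evaluated is sketched below. The document does not specify the actual ABAC rules, so everything in the sketch is an assumption: the ordering of clearance levels in `LEVELS`, the rule that the user's clearance must be at least the resource's classification, and the rule that the user must belong to the owning group. `AbacRequest` and `isPermitted` are hypothetical names.

```typescript
// Assumed ordering of clearance levels, lowest to highest (hypothetical).
const LEVELS = ["public", "internal", "confidential", "secret"] as const;
type Level = (typeof LEVELS)[number];

// Shape of the request document shown above.
interface AbacRequest {
  user: { id: string; roles: string[]; groups: string[]; clearance: Level };
  resource: { type: string; classification: Level; owner: string };
  action: string;
  context: { time: string; location: string };
}

// Hypothetical rule set: permit only if the user is cleared for the
// resource's classification and is a member of the group that owns it.
function isPermitted(req: AbacRequest): boolean {
  const cleared =
    LEVELS.indexOf(req.user.clearance) >= LEVELS.indexOf(req.resource.classification);
  const inOwningGroup = req.user.groups.includes(req.resource.owner);
  return cleared && inOwningGroup;
}
```

Under these assumed rules the example request above is permitted, because the user's clearance matches the model's classification and the user belongs to `llm-platform/ml-team`. Real policies would typically also consult `action` and `context`.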
5. Data Security
Encryption at Rest
See Encryption at Rest for details.
Algorithms:
- Database: AES-256-GCM
- Files: AES-256-GCM
- Backups: AES-256-GCM with separate keys
Key Management:
- Hardware Security Module (HSM) for key storage
- Key rotation every 90 days
- Key versioning for decryption of old data
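The interaction between AES-256-GCM and key versioning can be sketched as follows. An in-memory `KeyRing` stands in for the HSM here, which is an assumption made purely for illustration, since with a real HSM the key material would not be held by the application. `Envelope`, `encrypt`, and `decrypt` are hypothetical names.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Ciphertext plus everything needed to decrypt it later, including which
// key version was used. Binary fields are base64-encoded.
interface Envelope {
  keyVersion: number;
  iv: string;
  tag: string;
  ciphertext: string;
}

// Key version -> 32-byte AES-256 key (in-memory stand-in for the HSM).
type KeyRing = Map<number, Buffer>;

function encrypt(plaintext: string, keys: KeyRing, currentVersion: number): Envelope {
  const key = keys.get(currentVersion);
  if (!key) throw new Error(`unknown key version ${currentVersion}`);
  const iv = randomBytes(12); // fresh 96-bit nonce for every message
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    keyVersion: currentVersion,
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
    ciphertext: ciphertext.toString("base64"),
  };
}

function decrypt(envelope: Envelope, keys: KeyRing): string {
  // Old data stays readable after rotation because the version travels with it.
  const key = keys.get(envelope.keyVersion);
  if (!key) throw new Error(`unknown key version ${envelope.keyVersion}`);
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(envelope.iv, "base64"));
  decipher.setAuthTag(Buffer.from(envelope.tag, "base64"));
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(envelope.ciphertext, "base64")),
    decipher.final(), // throws if the ciphertext or tag was modified
  ]);
  return plaintext.toString("utf8");
}
```

After a rotation, new writes use the new version while `decrypt` still resolves older envelopes through their recorded `keyVersion`.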
Encryption in Transit
See Encryption in Transit for details.
Protocols:
- TLS 1.3: All HTTP(S) traffic
- gRPC with mTLS: Service mesh
- SSH: Admin access only
Cipher Suites (TLS 1.3):
TLS_AES_256_GCM_SHA384
TLS_CHACHA20_POLY1305_SHA256
TLS_AES_128_GCM_SHA256
Secrets Management
See Secrets Management for details.
Tools:
- HashiCorp Vault: Centralized secrets storage
- Kubernetes Secrets: Encrypted at rest
- Environment Variables: Injected at runtime
Secret Types:
- API keys
- Database credentials
- Encryption keys
- OAuth client secrets
- Service account tokens
6. Application Security
Input Validation
Zod Schemas:
import { z } from 'zod';

// Request schema for chat completions; the bounds reject oversized or malformed input
const ChatCompletionSchema = z.object({
  model: z.string().min(1).max(100),
  messages: z.array(z.object({
    role: z.enum(['system', 'user', 'assistant']),
    content: z.string().min(1).max(10000)
  })).min(1).max(50),
  temperature: z.number().min(0).max(2).optional(),
  // Token counts are whole numbers, so fractional values are rejected
  max_tokens: z.number().int().min(1).max(4096).optional()
});
Output Encoding
HTML Encoding:
import DOMPurify from 'isomorphic-dompurify';

function sanitizeHTML(dirty: string): string {
  return DOMPurify.sanitize(dirty, {
    ALLOWED_TAGS: ['p', 'b', 'i', 'em', 'strong', 'a'],
    ALLOWED_ATTR: ['href']
  });
}
SQL Injection Prevention
Parameterized Queries:
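// ✅ Safe: the value is sent separately from the SQL text
const result = await db.query(
  'SELECT * FROM users WHERE email = $1',
  [email]
);

// ❌ Unsafe: user input is interpolated directly into the SQL text
const unsafeResult = await db.query(
  `SELECT * FROM users WHERE email = '${email}'`
);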
Dependency Security
Automated Scanning:
- npm audit (daily)
- Snyk vulnerability scanning
- Dependabot security updates
Policy:
- High/Critical vulnerabilities: Fix within 7 days
- Medium vulnerabilities: Fix within 30 days
- Low vulnerabilities: Fix in next release
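The remediation policy above maps directly to a due-date calculation, sketched here. `Severity` and `remediationDeadline` are hypothetical names, and returning `null` for low-severity findings (which are tied to the next release rather than a date) is an assumption about how to represent that case.

```typescript
type Severity = "critical" | "high" | "medium" | "low";

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns the date by which a finding must be fixed, or null when the
// policy ties the fix to the next release instead of a fixed deadline.
function remediationDeadline(severity: Severity, foundAt: Date): Date | null {
  switch (severity) {
    case "critical":
    case "high":
      return new Date(foundAt.getTime() + 7 * DAY_MS);
    case "medium":
      return new Date(foundAt.getTime() + 30 * DAY_MS);
    case "low":
      return null;
  }
}
```

A scanner integration could call this for each finding and flag those whose deadline has passed.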
7. Audit & Monitoring
Audit Logging
Events Logged:
- Authentication attempts (success/failure)
- Authorization decisions
- Data access (PII/PHI)
- Configuration changes
- Admin operations
- Security events
Log Format (JSON):
{
  "timestamp": "2025-01-15T10:00:00.000Z",
  "eventType": "authentication",
  "action": "login_success",
  "userId": "user-123",
  "username": "developer@bluefly.io",
  "sourceIP": "192.168.1.100",
  "userAgent": "Mozilla/5.0...",
  "metadata": {
    "mfa": false,
    "provider": "gitlab"
  }
}
Tamper-Proof Logging:
- Hash chain for log integrity
- Append-only storage
- Encrypted at rest
- Retention: 90 days (configurable)
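A hash chain links each log entry to the one before it, so altering or removing an earlier entry invalidates every hash that follows. The sketch below shows the idea with SHA-256. `ChainedEntry`, `append`, and `verifyChain` are hypothetical names, and the all-zero genesis value is an assumption.

```typescript
import { createHash } from "node:crypto";

interface ChainedEntry {
  event: string;    // serialized audit event, e.g. the JSON format shown above
  prevHash: string; // hash of the previous entry, or GENESIS for the first
  hash: string;     // SHA-256 over prevHash + event
}

// Assumed starting value for the chain.
const GENESIS = "0".repeat(64);

function hashEntry(event: string, prevHash: string): string {
  return createHash("sha256").update(prevHash).update(event).digest("hex");
}

// Returns a new log with the event appended; existing entries are never modified.
function append(log: ChainedEntry[], event: string): ChainedEntry[] {
  const prevHash = log.length > 0 ? log[log.length - 1].hash : GENESIS;
  return [...log, { event, prevHash, hash: hashEntry(event, prevHash) }];
}

// Recomputes every hash from the start and checks each link.
function verifyChain(log: ChainedEntry[]): boolean {
  let prev = GENESIS;
  for (const entry of log) {
    if (entry.prevHash !== prev || entry.hash !== hashEntry(entry.event, prev)) {
      return false;
    }
    prev = entry.hash;
  }
  return true;
}
```

A chain on its own cannot reveal that the most recent entries were truncated, which is one reason it is paired with append-only storage.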
Security Monitoring
Tools:
- SIEM: Splunk / ELK Stack
- IDS/IPS: Snort
- File Integrity: AIDE
- Vulnerability Scanning: Nessus
Alerts:
- Failed login attempts (5+ in 5 min)
- Privilege escalation attempts
- Unusual API usage patterns
- Data exfiltration indicators
- Certificate expiration (30 days)
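The failed-login alert (5 or more attempts in 5 minutes) amounts to a sliding-window count. A minimal in-memory sketch follows. `FailedLoginMonitor` is a hypothetical class, and tracking failures per user ID rather than per source IP is an assumption. In practice this rule would normally live in the SIEM.

```typescript
const WINDOW_MS = 5 * 60 * 1000; // 5-minute window
const THRESHOLD = 5;             // alert at 5 or more failures

class FailedLoginMonitor {
  // userId -> timestamps (ms) of recent failed attempts
  private failures = new Map<string, number[]>();

  // Records one failed attempt and returns true if the alert threshold
  // is reached within the window ending at `atMs`.
  recordFailure(userId: string, atMs: number): boolean {
    const recent = (this.failures.get(userId) ?? []).filter(
      (t) => atMs - t < WINDOW_MS, // drop attempts older than the window
    );
    recent.push(atMs);
    this.failures.set(userId, recent);
    return recent.length >= THRESHOLD;
  }
}
```

Five failures within a few seconds trigger the alert, while failures spaced two minutes apart never accumulate enough entries inside one window.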
8. Incident Response
Incident Response Plan
Phases:
1. Detection: Automated alerts + manual reporting
2. Analysis: Determine scope and impact
3. Containment: Isolate affected systems
4. Eradication: Remove threat
5. Recovery: Restore services
6. Post-Incident: Review and improve
Incident Severity:
- P0 (Critical): Data breach, system down
- P1 (High): Security vulnerability, degraded service
- P2 (Medium): Minor security issue
- P3 (Low): Security improvement
Disaster Recovery
RPO/RTO:
- RPO (Recovery Point Objective): 1 hour
- RTO (Recovery Time Objective): 4 hours
Backups:
- Database: Every 6 hours
- Files: Daily
- Configuration: On change
- Retention: 30 days
9. Compliance
FedRAMP Moderate
See FedRAMP Compliance for details.
Controls: 325 controls from NIST 800-53
Certification: FedRAMP Moderate baseline
Audit: Annual assessment
NIST 800-53
See NIST 800-53 Controls for details.
Families Implemented:
- AC (Access Control)
- AU (Audit and Accountability)
- SC (System and Communications Protection)
- IA (Identification and Authentication)
- IR (Incident Response)
HIPAA
Applicable Controls:
- PHI encryption at rest and in transit
- Access controls (RBAC)
- Audit logging
- Data retention policies
- Breach notification procedures
GDPR
Compliance Features:
- Data subject rights (access, deletion, portability)
- Consent management
- Data processing agreements
- Privacy by design
- Breach notification (72 hours)
Security Testing
Penetration Testing
Frequency: Quarterly
Scope: Full platform
Methods:
- Automated scanning (OWASP ZAP)
- Manual testing
- Social engineering
- Physical security
Vulnerability Management
Process:
1. Scan (weekly)
2. Triage (1 business day)
3. Remediate (7-30 days based on severity)
4. Verify fix
5. Document
Security Metrics
Key Metrics:
- Mean Time to Detect (MTTD): < 5 minutes
- Mean Time to Respond (MTTR): < 30 minutes
- Vulnerability patch time: < 7 days (high/critical)
- Failed login rate: < 0.1%
- Audit log retention: 90 days