AutoBench: Production-Grade Prompt Benchmarking & SecOps Shield
AutoBench is an enterprise-grade, real-time platform for prompt profiling, optimization, and security isolation. It evaluates, sanitizes, and distributes LLM prompts using concurrent sandboxed threads, automated red-teaming, a live shadow-traffic split proxy, and multi-region edge mesh latency analysis.
Live Demo: Deployed on Render (may take ~30s to cold-start on the free tier)
🚀 Key Features
📦 Concurrent Sandbox Thread Streamer: Spawns isolated sandbox worktrees in parallel (Sandbox A: Baseline, Sandbox B: Stress/Adversarial, Sandbox C: Codex-Optimized) and streams their logs in real time over a bidirectional WebSocket connection.
🛡️ SecOps Sanitizer Shield: Active input sanitizer that intercepts prompt injections, blocks restricted system execution commands, prevents PII/Credential leakage, and enforces strict JSON output compliance.
🔀 Shadow Traffic Split Routing Proxy: Simulates a live 99/1 production traffic split with an automated LLM-as-a-Judge scoring engine and a dynamic circuit breaker that terminates candidate routing if semantic accuracy drops below a 94% safety threshold.
🌐 Global Telemetry Edge Mesh: Captures high-resolution performance metrics, TTFT (Time-To-First-Token), and percentile latency benchmarks (P50, P95, P99) across three simulated distributed nodes (us-east-1, eu-west-1, and ap-south-1).
💀 Adversarial Red-Team Simulator: Runs an automated suite of 50 multi-vector prompt injection attacks to stress-test prompt alignment and generate an interactive Vulnerability Matrix report with actionable remediation.
🤖 CI/CD Gate Validator: A headless gate validation script verifying target performance against strict enterprise regression limits, exiting with standard exit codes (0 for Pass, 1 for Breach) to fit pipeline environments.
🖥️ Cloud IDE Workspace: In-browser code editor with Monaco-grade features including syntax highlighting, file tree navigation, terminal emulation, and real-time Git diff overlays.
💎 SaaS Subscription Tiers: Interactive billing panel with Starter, Pro, Teams, and Enterprise tiers featuring a secure sandbox checkout flow.
🔗 A2A Mesh Visualizer: Real-time Agent-to-Agent mesh network topology with live latency heatmaps and node health monitoring.
🏠 Enterprise Homepage: SEO-optimized marketing homepage with interactive modals for extensions, security trust center, legal compliance, and enterprise contact.
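The shadow traffic split and circuit breaker described above can be illustrated with a minimal sketch. This is not AutoBench's actual implementation: the class name `ShadowRouter`, its options, and the `minSamples` warm-up are hypothetical, and the judge score is assumed to be a number between 0 and 1. Only the 99/1 split and the 94% threshold come from the feature description.

```typescript
// Illustrative sketch only; all names here are hypothetical.
type Route = "baseline" | "candidate";

interface ShadowRouterOptions {
  candidateShare?: number; // fraction of traffic sent to the candidate (1% in a 99/1 split)
  threshold?: number;      // minimum mean judge score before the breaker trips
  minSamples?: number;     // assumed warm-up: scores required before the breaker may trip
  random?: () => number;   // injectable RNG so routing can be tested deterministically
}

class ShadowRouter {
  private scores: number[] = [];
  private tripped = false;
  private candidateShare: number;
  private threshold: number;
  private minSamples: number;
  private random: () => number;

  constructor(options: ShadowRouterOptions = {}) {
    this.candidateShare = options.candidateShare ?? 0.01;
    this.threshold = options.threshold ?? 0.94;
    this.minSamples = options.minSamples ?? 5;
    this.random = options.random ?? Math.random;
  }

  // Pick a route for one request; once tripped, everything goes to baseline.
  route(): Route {
    if (this.tripped) return "baseline";
    return this.random() < this.candidateShare ? "candidate" : "baseline";
  }

  // Record a judge score (0..1) for a candidate response and re-check the breaker.
  recordJudgeScore(score: number): void {
    this.scores.push(score);
    if (this.scores.length < this.minSamples) return;
    const mean = this.scores.reduce((a, b) => a + b, 0) / this.scores.length;
    if (mean < this.threshold) this.tripped = true;
  }

  isTripped(): boolean {
    return this.tripped;
  }
}
```

In this sketch the breaker is one-way: once the mean score falls below the threshold, candidate routing stays off until a new router is created. A production breaker would more likely use a sliding window and a recovery state.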
🛠️ Tech Stack & Architecture
Frontend Framework: React 19 + TypeScript + Vite
Styling: Tailwind CSS v4 (harmonious dark UI, HSL glow vectors, micro-animations)
Icons: Lucide React
Data Layer: Custom WebSocket stream hook (useAutoBench.ts) + Fetch API
Runtime: Node.js + ES modules
Backend Framework: Express v5 + the standard ws library as WebSocket connection manager
Simulation Engine: Promise-isolated orchestrator + performance micro-benchmarking
Hosting: Render.com (Web Service, free tier)
CI/CD: Auto-deploy from main branch via Render Git integration
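For pipeline use, the CI/CD Gate Validator and the percentile latency benchmarks listed under Key Features can be sketched together as follows. This is a hedged illustration rather than the project's actual script: `percentile`, `GateLimits`, and `gateExitCode` are hypothetical names, the limits are made-up inputs, and the nearest-rank percentile method is an assumption. Only the exit-code convention (0 for Pass, 1 for Breach) comes from the feature description.

```typescript
// Illustrative sketch only; function and type names are hypothetical.

// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no latency samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Assumed regression limits, in milliseconds.
interface GateLimits {
  p95Ms: number;
  p99Ms: number;
}

// Returns 0 when both percentiles are within limits (Pass), 1 otherwise (Breach).
function gateExitCode(samples: number[], limits: GateLimits): 0 | 1 {
  const breach =
    percentile(samples, 95) > limits.p95Ms ||
    percentile(samples, 99) > limits.p99Ms;
  return breach ? 1 : 0;
}
```

In a Node.js script, the result could be assigned to `process.exitCode` so the pipeline step fails on a breach.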