# FaraCUA Backend

The backend server for FaraCUA - a Computer Use Agent (CUA) demo powered by Microsoft's Fara-7B vision-language model and Modal for serverless GPU inference.

## Overview

This backend provides:

- **WebSocket API** - Real-time communication with the React frontend for streaming agent actions
- **REST API** - Model listing, random question generation, and trace storage
- **Fara Agent Integration** - Runs the Fara agent with Playwright for browser automation
- **Modal Integration** - Proxies requests to Modal's vLLM endpoint and trace storage

## Architecture

```
┌─────────────┐     WebSocket     ┌─────────────┐     HTTP      ┌─────────────┐
│  Frontend   │ ◄───────────────► │   Backend   │ ◄───────────► │    Modal    │
│   (React)   │                   │  (FastAPI)  │               │   (vLLM)    │
└─────────────┘                   └─────────────┘               └─────────────┘
                                         │
                                         │ Playwright
                                         ▼
                                  ┌─────────────┐
                                  │   Browser   │
                                  │ (Headless)  │
                                  └─────────────┘
```

## Files

| File | Description |
|------|-------------|
| `server.py` | Main FastAPI server with WebSocket and REST endpoints |
| `modal_fara_vllm.py` | Modal deployment for vLLM inference and trace storage |
| `pyproject.toml` | Python dependencies |
| `.env.example` | Example environment configuration |

## Setup

### 1. Install Dependencies

```bash
# Using uv (recommended)
uv sync

# Or using pip
pip install -e .
```

### 2. Install Playwright

```bash
playwright install chromium
```

### 3. Deploy Modal Endpoints

```bash
modal deploy backend/modal_fara_vllm.py
```

This deploys:

- **vLLM Server** - GPU-accelerated inference for Fara-7B at `https://--fara-vllm-serve.modal.run`
- **Trace Storage** - Endpoint for storing task traces at `https://--fara-vllm-store-trace.modal.run`

### 4. Configure Environment

Copy `.env.example` to `.env` and fill in your values:

```bash
cp .env.example .env
```

Required variables:

| Variable | Description |
|----------|-------------|
| `FARA_MODEL_NAME` | Model name (default: `microsoft/Fara-7B`) |
| `FARA_ENDPOINT_URL` | Modal vLLM endpoint URL (from deploy output) |
| `FARA_API_KEY` | API key (default: `not-needed` for Modal) |
| `MODAL_TOKEN_ID` | Modal proxy auth token ID |
| `MODAL_TOKEN_SECRET` | Modal proxy auth token secret |
| `MODAL_TRACE_STORAGE_URL` | Modal trace storage endpoint URL |

Get Modal proxy auth tokens at: https://modal.com/settings/proxy-auth-tokens

### 5. Run the Server

```bash
# Development mode
uvicorn backend.server:app --host 0.0.0.0 --port 8000 --reload

# Or directly
python -m backend.server
```

## API Endpoints

### WebSocket

- `ws://localhost:8000/ws` - Real-time agent communication
- **Receives**: `user_task`, `stop_task`, `ping`
- **Sends**: `heartbeat`, `agent_start`, `agent_progress`, `agent_complete`, `agent_error`

### REST

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/health` | Health check |
| GET | `/api/models` | List available models |
| GET | `/api/random-question` | Get a random example task |
| POST | `/api/traces` | Store a trace (proxies to Modal) |

## Trace Storage

Task traces are automatically uploaded to Modal volumes for research purposes. Traces include:

- Task instruction and model used
- Step-by-step agent actions with screenshots
- Token usage and timing metrics
- User evaluation (success/failed)

Duplicate traces (same ID and instruction) are automatically overwritten to capture the latest evaluation.

## Docker

The backend is designed to run in Docker alongside the frontend. See the root `Dockerfile` for the combined deployment.

```bash
# Build from root
docker build -t fara-cua .

# Run with env file
docker run -d --name fara-cua -p 7860:7860 --env-file backend/.env fara-cua
```
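
Once the container is up, a quick sanity check is to hit the backend's health endpoint through the mapped port (a minimal sketch; it assumes the combined image serves the backend API on port 7860 as mapped above):

```bash
# Health check against the running container
# (assumes the 7860:7860 port mapping from the run command above)
curl http://localhost:7860/api/health
```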

## Development

### Running Locally

For local development, you can run the backend separately:

```bash
cd backend
uvicorn server:app --host 0.0.0.0 --port 8000 --reload
```

Make sure the frontend is configured to connect to `http://localhost:8000`.

### Testing Modal Endpoints

```bash
# Test vLLM endpoint
modal run backend/modal_fara_vllm.py::test

# Check deployment status
modal app list
```

## License

See the root LICENSE file for license information.