
FaraCUA Backend

The backend server for FaraCUA, a Computer Use Agent (CUA) demo powered by Microsoft's Fara-7B vision-language model and Modal for serverless GPU inference.

Overview

This backend provides:

  • WebSocket API - Real-time communication with the React frontend for streaming agent actions
  • REST API - Model listing, random question generation, and trace storage
  • Fara Agent Integration - Runs the Fara agent with Playwright for browser automation
  • Modal Integration - Proxies requests to Modal's vLLM and trace storage endpoints

Architecture

┌─────────────┐     WebSocket     ┌─────────────┐     HTTP      ┌─────────────┐
│   Frontend  │ ◄───────────────► │   Backend   │ ◄───────────► │    Modal    │
│   (React)   │                   │  (FastAPI)  │               │   (vLLM)    │
└─────────────┘                   └─────────────┘               └─────────────┘
                                         │
                                         │ Playwright
                                         ▼
                                  ┌─────────────┐
                                  │   Browser   │
                                  │  (Headless) │
                                  └─────────────┘

Files

File                 Description
server.py            Main FastAPI server with WebSocket and REST endpoints
modal_fara_vllm.py   Modal deployment for vLLM inference and trace storage
pyproject.toml       Python dependencies
.env.example         Example environment configuration

Setup

1. Install Dependencies

# Using uv (recommended)
uv sync

# Or using pip
pip install -e .

2. Install Playwright

playwright install chromium

3. Deploy Modal Endpoints

modal deploy backend/modal_fara_vllm.py

This deploys:

  • vLLM Server - GPU-accelerated inference for Fara-7B at https://<workspace>--fara-vllm-serve.modal.run
  • Trace Storage - Endpoint for storing task traces at https://<workspace>--fara-vllm-store-trace.modal.run
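
The function names in these URLs follow Modal's https://<workspace>--<app-name>-<function-name>.modal.run convention. For orientation, the deployment file has roughly the following shape. This is a hypothetical sketch only: the image, GPU type, volume name, decorators, and vLLM invocation below are assumptions, not the actual contents of modal_fara_vllm.py.

# Hypothetical sketch only -- the real modal_fara_vllm.py may differ.
import modal

app = modal.App("fara-vllm")

image = modal.Image.debian_slim().pip_install("vllm")                    # assumed image
traces = modal.Volume.from_name("fara-traces", create_if_missing=True)  # assumed volume name

@app.function(image=image, gpu="A100")  # GPU type is an assumption
@modal.web_server(8000)
def serve():
    # Launch vLLM's OpenAI-compatible server for Fara-7B
    import subprocess
    subprocess.Popen("vllm serve microsoft/Fara-7B --port 8000", shell=True)

@app.function(volumes={"/traces": traces})
@modal.fastapi_endpoint(method="POST")  # assumed endpoint decorator
def store_trace(trace: dict):
    # Persist one trace as a JSON file on the shared volume
    import json, uuid
    path = f"/traces/{trace.get('id') or uuid.uuid4().hex}.json"
    with open(path, "w") as f:
        json.dump(trace, f)
    traces.commit()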

4. Configure Environment

Copy .env.example to .env and fill in your values:

cp .env.example .env

Required variables:

Variable                 Description
FARA_MODEL_NAME          Model name (default: microsoft/Fara-7B)
FARA_ENDPOINT_URL        Modal vLLM endpoint URL (from the deploy output)
FARA_API_KEY             API key (default: "not-needed" when using Modal)
MODAL_TOKEN_ID           Modal proxy auth token ID
MODAL_TOKEN_SECRET       Modal proxy auth token secret
MODAL_TRACE_STORAGE_URL  Modal trace storage endpoint URL

Get Modal proxy auth tokens at: https://modal.com/settings/proxy-auth-tokens
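
For reference, a filled-in .env might look like this (all values are placeholders; substitute the URLs printed by modal deploy and your own tokens):

FARA_MODEL_NAME=microsoft/Fara-7B
FARA_ENDPOINT_URL=https://my-workspace--fara-vllm-serve.modal.run
FARA_API_KEY=not-needed
MODAL_TOKEN_ID=<your-proxy-auth-token-id>
MODAL_TOKEN_SECRET=<your-proxy-auth-token-secret>
MODAL_TRACE_STORAGE_URL=https://my-workspace--fara-vllm-store-trace.modal.run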

5. Run the Server

# Development mode
uvicorn backend.server:app --host 0.0.0.0 --port 8000 --reload

# Or directly
python -m backend.server
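
The python -m backend.server form presumably wraps the same uvicorn call; a typical entry point would look like the following (hypothetical, not verbatim from server.py):

# Hypothetical __main__ block; the actual backend/server.py may differ.
if __name__ == "__main__":
    import uvicorn
    uvicorn.run("backend.server:app", host="0.0.0.0", port=8000)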

API Endpoints

WebSocket

  • ws://localhost:8000/ws - Real-time agent communication
    • Receives from the client: user_task, stop_task, ping
    • Sends to the client: heartbeat, agent_start, agent_progress, agent_complete, agent_error
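
A minimal client sketch using the websockets package is shown below. The message types come from the list above, but the exact payload fields (e.g. "task") are assumptions; the authoritative schema is whatever server.py implements.

# Minimal WebSocket client sketch; payload field names are assumptions.
import asyncio
import json
import websockets

async def main():
    async with websockets.connect("ws://localhost:8000/ws") as ws:
        # Ask the agent to run a task (the field name "task" is hypothetical)
        await ws.send(json.dumps({"type": "user_task", "task": "Find the weather in Seattle"}))
        async for raw in ws:
            msg = json.loads(raw)
            print(msg.get("type"), msg)
            if msg.get("type") in ("agent_complete", "agent_error"):
                break

asyncio.run(main())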

REST

Method  Endpoint              Description
GET     /api/health           Health check
GET     /api/models           List available models
GET     /api/random-question  Get a random example task
POST    /api/traces           Store a trace (proxies to Modal)
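
For example, a quick smoke test against a locally running server (the response shapes depend on server.py):

# Quick smoke test of the REST endpoints listed above.
import requests

BASE = "http://localhost:8000"

print(requests.get(f"{BASE}/api/health").json())
print(requests.get(f"{BASE}/api/models").json())
print(requests.get(f"{BASE}/api/random-question").json())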

Trace Storage

Task traces are automatically uploaded to Modal volumes for research purposes. Traces include:

  • Task instruction and model used
  • Step-by-step agent actions with screenshots
  • Token usage and timing metrics
  • User evaluation (success/failed)

Duplicate traces (same ID and instruction) are automatically overwritten to capture the latest evaluation.
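
Put together, a stored trace is a JSON document along these lines (field names are illustrative guesses, not the actual schema):

# Illustrative trace payload for POST /api/traces; the real schema lives in server.py.
trace = {
    "id": "trace-123",                    # used for de-duplication
    "instruction": "Find the weather in Seattle",
    "model": "microsoft/Fara-7B",
    "steps": [
        {"action": "click", "screenshot": "<base64 PNG>", "tokens": 512, "duration_ms": 840},
    ],
    "evaluation": "success",              # user-reported: success or failed
}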

Docker

The backend is designed to run in Docker alongside the frontend. See the root Dockerfile for the combined deployment.

# Build from root
docker build -t fara-cua .

# Run with env file
docker run -d --name fara-cua -p 7860:7860 --env-file backend/.env fara-cua

Development

Running Locally

For local development, you can run the backend separately:

cd backend
uvicorn server:app --host 0.0.0.0 --port 8000 --reload

Make sure the frontend is configured to connect to http://localhost:8000.

Testing Modal Endpoints

# Test vLLM endpoint
modal run backend/modal_fara_vllm.py::test

# Check deployment status
modal app list

License

See the root LICENSE file for license information.