# FaraCUA Backend

The backend server for FaraCUA - a Computer Use Agent (CUA) demo powered by Microsoft's Fara-7B vision-language model and Modal for serverless GPU inference.

## Overview

This backend provides:

- **WebSocket API** - Real-time communication with the React frontend for streaming agent actions
- **REST API** - Model listing, random question generation, and trace storage
- **Fara Agent Integration** - Runs the Fara agent with Playwright for browser automation
- **Modal Integration** - Proxies requests to Modal's vLLM endpoint and trace storage

## Architecture

```
┌─────────────┐     WebSocket     ┌─────────────┐     HTTP      ┌─────────────┐
│  Frontend   │ ◄───────────────► │   Backend   │ ◄───────────► │    Modal    │
│   (React)   │                   │  (FastAPI)  │               │   (vLLM)    │
└─────────────┘                   └─────────────┘               └─────────────┘
                                         │
                                         │ Playwright
                                         ▼
                                  ┌─────────────┐
                                  │   Browser   │
                                  │ (Headless)  │
                                  └─────────────┘
```

## Files

| File | Description |
|------|-------------|
| `server.py` | Main FastAPI server with WebSocket and REST endpoints |
| `modal_fara_vllm.py` | Modal deployment for vLLM inference and trace storage |
| `pyproject.toml` | Python dependencies |
| `.env.example` | Example environment configuration |

## Setup

### 1. Install Dependencies

```bash
# Using uv (recommended)
uv sync

# Or using pip
pip install -e .
```

### 2. Install Playwright

```bash
playwright install chromium
```

### 3. Deploy Modal Endpoints

```bash
modal deploy backend/modal_fara_vllm.py
```

This deploys:

- **vLLM Server** - GPU-accelerated inference for Fara-7B at `https://--fara-vllm-serve.modal.run`
- **Trace Storage** - Endpoint for storing task traces at `https://--fara-vllm-store-trace.modal.run`

### 4. Configure Environment

Copy `.env.example` to `.env` and fill in your values:

```bash
cp .env.example .env
```

Required variables:

| Variable | Description |
|----------|-------------|
| `FARA_MODEL_NAME` | Model name (default: `microsoft/Fara-7B`) |
| `FARA_ENDPOINT_URL` | Modal vLLM endpoint URL (from deploy output) |
| `FARA_API_KEY` | API key (default: `not-needed` for Modal) |
| `MODAL_TOKEN_ID` | Modal proxy auth token ID |
| `MODAL_TOKEN_SECRET` | Modal proxy auth token secret |
| `MODAL_TRACE_STORAGE_URL` | Modal trace storage endpoint URL |

Get Modal proxy auth tokens at: https://modal.com/settings/proxy-auth-tokens

### 5. Run the Server

```bash
# Development mode
uvicorn backend.server:app --host 0.0.0.0 --port 8000 --reload

# Or directly
python -m backend.server
```

## API Endpoints

### WebSocket

- `ws://localhost:8000/ws` - Real-time agent communication
- **Receives**: `user_task`, `stop_task`, `ping`
- **Sends**: `heartbeat`, `agent_start`, `agent_progress`, `agent_complete`, `agent_error`

### REST

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/health` | Health check |
| GET | `/api/models` | List available models |
| GET | `/api/random-question` | Get a random example task |
| POST | `/api/traces` | Store a trace (proxies to Modal) |

## Trace Storage

Task traces are automatically uploaded to Modal volumes for research purposes. Traces include:

- Task instruction and model used
- Step-by-step agent actions with screenshots
- Token usage and timing metrics
- User evaluation (success/failed)

Duplicate traces (same ID and instruction) are automatically overwritten to capture the latest evaluation.

## Docker

The backend is designed to run in Docker alongside the frontend. See the root `Dockerfile` for the combined deployment.

```bash
# Build from root
docker build -t fara-cua .

# Run with env file
docker run -d --name fara-cua -p 7860:7860 --env-file backend/.env fara-cua
```
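
Once the container is up, a quick sanity check is to hit the backend's health endpoint through the mapped port (a minimal sketch; it assumes the combined image serves the backend API on port 7860 as mapped above):

```bash
# Health check against the running container
# (assumes the 7860:7860 port mapping from the run command above)
curl http://localhost:7860/api/health
```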

## Development

### Running Locally

For local development, you can run the backend separately:

```bash
cd backend
uvicorn server:app --host 0.0.0.0 --port 8000 --reload
```

Make sure the frontend is configured to connect to `http://localhost:8000`.

### Testing Modal Endpoints

```bash
# Test vLLM endpoint
modal run backend/modal_fara_vllm.py::test

# Check deployment status
modal app list
```

## License

See the root LICENSE file for license information.