# FaraCUA Backend
The backend server for FaraCUA - a Computer Use Agent (CUA) demo powered by Microsoft's Fara-7B vision-language model and Modal for serverless GPU inference.
## Overview
This backend provides:
- **WebSocket API** - Real-time communication with the React frontend for streaming agent actions
- **REST API** - Model listing, random question generation, and trace storage
- **FARA Agent Integration** - Runs the Fara agent with Playwright for browser automation
- **Modal Integration** - Proxies requests to Modal's vLLM endpoint and trace storage
## Architecture
```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” WebSocket β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” HTTP β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Frontend β”‚ ◄───────────────► β”‚ Backend β”‚ ◄───────────► β”‚ Modal β”‚
β”‚ (React) β”‚ β”‚ (FastAPI) β”‚ β”‚ (vLLM) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β”‚ Playwright
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Browser β”‚
β”‚ (Headless) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```
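The Backend-to-Modal hop speaks the OpenAI-compatible chat-completions protocol that vLLM exposes. A sketch of how such a request might be assembled is shown below; the header names follow Modal's proxy-auth convention and the multimodal message shape is an assumption, so check `server.py` for the actual code:

```python
import os

def build_vllm_request(task: str, screenshot_b64: str) -> tuple[dict, dict]:
    """Build headers and an OpenAI-style chat payload for the Modal vLLM endpoint.

    Header names and payload shape are illustrative assumptions, not the
    server's verbatim implementation.
    """
    headers = {
        # Modal proxy-auth tokens, read from the same env vars as the backend.
        "Modal-Key": os.environ.get("MODAL_TOKEN_ID", ""),
        "Modal-Secret": os.environ.get("MODAL_TOKEN_SECRET", ""),
        "Content-Type": "application/json",
    }
    payload = {
        "model": "microsoft/Fara-7B",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": task},
                    # Fara-7B is a vision-language model, so each turn
                    # carries the current browser screenshot as an image.
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/png;base64,{screenshot_b64}"},
                    },
                ],
            },
        ],
    }
    return headers, payload
```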
## Files
| File | Description |
|------|-------------|
| `server.py` | Main FastAPI server with WebSocket and REST endpoints |
| `modal_fara_vllm.py` | Modal deployment for vLLM inference and trace storage |
| `pyproject.toml` | Python dependencies |
| `.env.example` | Example environment configuration |
## Setup
### 1. Install Dependencies
```bash
# Using uv (recommended)
uv sync
# Or using pip
pip install -e .
```
### 2. Install Playwright
```bash
playwright install chromium
```
### 3. Deploy Modal Endpoints
```bash
modal deploy backend/modal_fara_vllm.py
```
This deploys:
- **vLLM Server** - GPU-accelerated inference for Fara-7B at `https://<workspace>--fara-vllm-serve.modal.run`
- **Trace Storage** - Endpoint for storing task traces at `https://<workspace>--fara-vllm-store-trace.modal.run`
### 4. Configure Environment
Copy `.env.example` to `.env` and fill in your values:
```bash
cp .env.example .env
```
Required variables:
| Variable | Description |
|----------|-------------|
| `FARA_MODEL_NAME` | Model name (default: `microsoft/Fara-7B`) |
| `FARA_ENDPOINT_URL` | Modal vLLM endpoint URL (from deploy output) |
| `FARA_API_KEY` | API key (default: `not-needed` for Modal) |
| `MODAL_TOKEN_ID` | Modal proxy auth token ID |
| `MODAL_TOKEN_SECRET` | Modal proxy auth token secret |
| `MODAL_TRACE_STORAGE_URL` | Modal trace storage endpoint URL |
Get Modal proxy auth tokens at: https://modal.com/settings/proxy-auth-tokens
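A filled-in `.env` might look like this (all values are placeholders; the real endpoint URLs come from the `modal deploy` output and the tokens from the Modal settings page):

```bash
FARA_MODEL_NAME=microsoft/Fara-7B
FARA_ENDPOINT_URL=https://<workspace>--fara-vllm-serve.modal.run
FARA_API_KEY=not-needed
MODAL_TOKEN_ID=your-token-id
MODAL_TOKEN_SECRET=your-token-secret
MODAL_TRACE_STORAGE_URL=https://<workspace>--fara-vllm-store-trace.modal.run
```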
### 5. Run the Server
```bash
# Development mode
uvicorn backend.server:app --host 0.0.0.0 --port 8000 --reload
# Or directly
python -m backend.server
```
## API Endpoints
### WebSocket
- `ws://localhost:8000/ws` - Real-time agent communication
- **Receives**: `user_task`, `stop_task`, `ping`
- **Sends**: `heartbeat`, `agent_start`, `agent_progress`, `agent_complete`, `agent_error`
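All messages on the socket are JSON objects discriminated by a `type` field. A minimal helper for building outbound client messages might look like this (any field beyond `type` is an assumption; `server.py` defines the actual payload schema):

```python
import json

# Message types from the WebSocket API list above.
CLIENT_TO_SERVER = {"user_task", "stop_task", "ping"}
SERVER_TO_CLIENT = {"heartbeat", "agent_start", "agent_progress",
                    "agent_complete", "agent_error"}

def make_message(msg_type: str, **payload) -> str:
    """Serialize an outbound client message as JSON.

    Extra keyword fields (e.g. the task text) are illustrative; the
    server's expected field names may differ.
    """
    if msg_type not in CLIENT_TO_SERVER:
        raise ValueError(f"not a client message type: {msg_type}")
    return json.dumps({"type": msg_type, **payload})
```

A client would send e.g. `make_message("user_task", task="Find the weather in Paris")` over an open connection to `ws://localhost:8000/ws`, then read server messages until it sees `agent_complete` or `agent_error`.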
### REST
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/health` | Health check |
| GET | `/api/models` | List available models |
| GET | `/api/random-question` | Get a random example task |
| POST | `/api/traces` | Store a trace (proxies to Modal) |
## Trace Storage
Task traces are automatically uploaded to Modal volumes for research purposes. Traces include:
- Task instruction and model used
- Step-by-step agent actions with screenshots
- Token usage and timing metrics
- User evaluation (success/failed)
Duplicate traces (same ID and instruction) are automatically overwritten to capture the latest evaluation.
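A trace record covering the items above might be assembled like this. Every field name here is an illustrative assumption; the authoritative schema lives in `server.py` and `modal_fara_vllm.py`:

```python
import time
import uuid

def build_trace(instruction: str, model: str, steps: list[dict], success: bool) -> dict:
    """Assemble a trace record with the contents listed above.

    Field names are assumptions for illustration, not the server's schema.
    """
    return {
        # Traces with the same id and instruction overwrite earlier uploads.
        "id": str(uuid.uuid4()),
        "instruction": instruction,
        "model": model,
        # One entry per agent step: action taken, screenshot, and metrics.
        "steps": steps,
        # User evaluation of the run.
        "evaluation": "success" if success else "failed",
        "created_at": time.time(),
    }

trace = build_trace(
    "Find the weather in Paris",
    "microsoft/Fara-7B",
    steps=[{"action": "click", "screenshot_b64": "...", "tokens": 512, "duration_s": 1.2}],
    success=True,
)
```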
## Docker
The backend is designed to run in Docker alongside the frontend. See the root `Dockerfile` for the combined deployment.
```bash
# Build from root
docker build -t fara-cua .
# Run with env file
docker run -d --name fara-cua -p 7860:7860 --env-file backend/.env fara-cua
```
## Development
### Running Locally
For local development, you can run the backend separately:
```bash
cd backend
uvicorn server:app --host 0.0.0.0 --port 8000 --reload
```
Make sure the frontend is configured to connect to `http://localhost:8000`.
### Testing Modal Endpoints
```bash
# Test vLLM endpoint
modal run backend/modal_fara_vllm.py::test
# Check deployment status
modal app list
```
## License
See the root LICENSE file for license information.