# FaraCUA Backend
The backend server for FaraCUA - a Computer Use Agent (CUA) demo powered by Microsoft's Fara-7B vision-language model and Modal for serverless GPU inference.
## Overview
This backend provides:
- **WebSocket API** - Real-time communication with the React frontend for streaming agent actions
- **REST API** - Model listing, random question generation, and trace storage
- **Fara Agent Integration** - Runs the Fara agent with Playwright for browser automation
- **Modal Integration** - Proxies requests to Modal's vLLM endpoint and trace storage
## Architecture
```
┌─────────────┐   WebSocket   ┌─────────────┐     HTTP     ┌─────────────┐
│  Frontend   │ ◄───────────► │   Backend   │ ───────────► │    Modal    │
│   (React)   │               │  (FastAPI)  │              │   (vLLM)    │
└─────────────┘               └─────────────┘              └─────────────┘
                                     │
                                     │ Playwright
                                     ▼
                              ┌─────────────┐
                              │   Browser   │
                              │ (Headless)  │
                              └─────────────┘
```
## Files
| File | Description |
|------|-------------|
| `server.py` | Main FastAPI server with WebSocket and REST endpoints |
| `modal_fara_vllm.py` | Modal deployment for vLLM inference and trace storage |
| `pyproject.toml` | Python dependencies |
| `.env.example` | Example environment configuration |
## Setup
### 1. Install Dependencies
```bash
# Using uv (recommended)
uv sync
# Or using pip
pip install -e .
```
### 2. Install Playwright
```bash
playwright install chromium
```
### 3. Deploy Modal Endpoints
```bash
modal deploy backend/modal_fara_vllm.py
```
This deploys:
- **vLLM Server** - GPU-accelerated inference for Fara-7B at `https://<workspace>--fara-vllm-serve.modal.run`
- **Trace Storage** - Endpoint for storing task traces at `https://<workspace>--fara-vllm-store-trace.modal.run`
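For orientation, a Modal deployment of this shape typically looks something like the sketch below. This is illustrative only: the function names mirror the endpoint URLs above, but the GPU type, image contents, and volume name are assumptions, not the actual contents of `modal_fara_vllm.py`.

```python
# Illustrative sketch only; the real modal_fara_vllm.py may differ.
import modal

app = modal.App("fara-vllm")

image = modal.Image.debian_slim().pip_install("vllm", "fastapi[standard]")
traces = modal.Volume.from_name("fara-traces", create_if_missing=True)  # assumed name

@app.function(image=image, gpu="A100")
@modal.web_server(port=8000)
def serve():
    # Start vLLM's OpenAI-compatible server for Fara-7B inside the container.
    import subprocess
    subprocess.Popen(["vllm", "serve", "microsoft/Fara-7B", "--port", "8000"])

@app.function(image=image, volumes={"/traces": traces})
@modal.web_endpoint(method="POST")
def store_trace(trace: dict):
    # Persist an uploaded trace JSON to the shared Modal volume.
    import json, uuid
    path = f"/traces/{trace.get('id', uuid.uuid4().hex)}.json"
    with open(path, "w") as f:
        json.dump(trace, f)
    traces.commit()
    return {"status": "ok"}
```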
### 4. Configure Environment
Copy `.env.example` to `.env` and fill in your values:
```bash
cp .env.example .env
```
Required variables:
| Variable | Description |
|----------|-------------|
| `FARA_MODEL_NAME` | Model name (default: `microsoft/Fara-7B`) |
| `FARA_ENDPOINT_URL` | Modal vLLM endpoint URL (from deploy output) |
| `FARA_API_KEY` | API key (default: `not-needed` for Modal) |
| `MODAL_TOKEN_ID` | Modal proxy auth token ID |
| `MODAL_TOKEN_SECRET` | Modal proxy auth token secret |
| `MODAL_TRACE_STORAGE_URL` | Modal trace storage endpoint URL |
Get Modal proxy auth tokens at: https://modal.com/settings/proxy-auth-tokens
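For reference, this is roughly how the server consumes these variables (a sketch, not the literal code in `server.py`; the header names follow Modal's proxy-auth convention):

```python
# Sketch of configuration loading; assumed to match server.py's behavior.
import os

MODEL_NAME   = os.environ.get("FARA_MODEL_NAME", "microsoft/Fara-7B")
ENDPOINT_URL = os.environ["FARA_ENDPOINT_URL"]            # no default: must be set
API_KEY      = os.environ.get("FARA_API_KEY", "not-needed")

# Modal proxy auth tokens are passed as request headers when proxying.
MODAL_AUTH_HEADERS = {
    "Modal-Key":    os.environ["MODAL_TOKEN_ID"],
    "Modal-Secret": os.environ["MODAL_TOKEN_SECRET"],
}
TRACE_STORAGE_URL = os.environ["MODAL_TRACE_STORAGE_URL"]
```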
### 5. Run the Server
```bash
# Development mode
uvicorn backend.server:app --host 0.0.0.0 --port 8000 --reload
# Or directly
python -m backend.server
```
## API Endpoints
### WebSocket
- `ws://localhost:8000/ws` - Real-time agent communication
- **Receives**: `user_task`, `stop_task`, `ping`
- **Sends**: `heartbeat`, `agent_start`, `agent_progress`, `agent_complete`, `agent_error`
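A minimal client for this protocol might look like the following; the payload keys are assumptions inferred from the message names above, so check `server.py` for the exact schema.

```python
# Minimal WebSocket client sketch; requires `pip install websockets`.
import asyncio
import json
import websockets

async def run_task(task: str):
    async with websockets.connect("ws://localhost:8000/ws") as ws:
        # Kick off a task; the exact payload keys are an assumption.
        await ws.send(json.dumps({"type": "user_task", "task": task}))
        async for raw in ws:
            event = json.loads(raw)
            print(event.get("type"), event)
            if event.get("type") in ("agent_complete", "agent_error"):
                break

asyncio.run(run_task("Find today's top story on Hacker News"))
```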
### REST
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/health` | Health check |
| GET | `/api/models` | List available models |
| GET | `/api/random-question` | Get a random example task |
| POST | `/api/traces` | Store a trace (proxies to Modal) |
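A quick smoke test of the REST surface (the response shapes in the comments are illustrative):

```python
# REST smoke test; run while the server is up on port 8000.
import requests

base = "http://localhost:8000"
print(requests.get(f"{base}/api/health").json())           # e.g. {"status": "ok"}
print(requests.get(f"{base}/api/models").json())           # list of available models
print(requests.get(f"{base}/api/random-question").json())  # a random example task
```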
## Trace Storage
Task traces are automatically uploaded to Modal volumes for research purposes. Traces include:
- Task instruction and model used
- Step-by-step agent actions with screenshots
- Token usage and timing metrics
- User evaluation (success/failed)
Duplicate traces (same ID and instruction) are automatically overwritten to capture the latest evaluation.
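As a concrete illustration, a trace POSTed to `/api/traces` might be shaped like this. The field names are assumptions derived from the description above, not the real schema:

```python
# Hypothetical trace payload; see server.py for the actual schema.
import requests

trace = {
    "id": "trace-001",
    "instruction": "Find today's top story on Hacker News",
    "model": "microsoft/Fara-7B",
    "steps": [
        {"action": "click", "screenshot": "<base64 PNG>", "duration_ms": 850},
    ],
    "usage": {"prompt_tokens": 1200, "completion_tokens": 64},
    "evaluation": "success",  # or "failed"
}
requests.post("http://localhost:8000/api/traces", json=trace)
```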
## Docker
The backend is designed to run in Docker alongside the frontend. See the root `Dockerfile` for the combined deployment.
```bash
# Build from root
docker build -t fara-cua .
# Run with env file
docker run -d --name fara-cua -p 7860:7860 --env-file backend/.env fara-cua
```
## Development
### Running Locally
For local development, you can run the backend separately:
```bash
cd backend
uvicorn server:app --host 0.0.0.0 --port 8000 --reload
```
Make sure the frontend is configured to connect to `http://localhost:8000`.
### Testing Modal Endpoints
```bash
# Test vLLM endpoint
modal run backend/modal_fara_vllm.py::test
# Check deployment status
modal app list
```
## License
See the root LICENSE file for license information.