# FaraCUA Backend
The backend server for FaraCUA - a Computer Use Agent (CUA) demo powered by Microsoft's Fara-7B vision-language model and Modal for serverless GPU inference.
## Overview
This backend provides:
- **WebSocket API** - Real-time communication with the React frontend for streaming agent actions
- **REST API** - Model listing, random question generation, and trace storage
- **FARA Agent Integration** - Runs the Fara agent with Playwright for browser automation
- **Modal Integration** - Proxies requests to Modal's vLLM endpoint and trace storage
## Architecture
```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” WebSocket β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” HTTP β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Frontend β”‚ ◄───────────────► β”‚ Backend β”‚ ◄───────────► β”‚ Modal β”‚
β”‚ (React) β”‚ β”‚ (FastAPI) β”‚ β”‚ (vLLM) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β”‚ Playwright
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Browser β”‚
β”‚ (Headless) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```
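The Backend-to-Modal hop speaks the OpenAI-compatible chat-completions protocol that vLLM exposes. A sketch of how such a request might be assembled is shown below; the header names follow Modal's proxy-auth convention and the multimodal message shape is an assumption, so check `server.py` for the actual code:

```python
import os

def build_vllm_request(task: str, screenshot_b64: str) -> tuple[dict, dict]:
    """Build headers and an OpenAI-style chat payload for the Modal vLLM endpoint.

    Header names and payload shape are illustrative assumptions, not the
    server's verbatim implementation.
    """
    headers = {
        # Modal proxy-auth tokens, read from the same env vars as the backend.
        "Modal-Key": os.environ.get("MODAL_TOKEN_ID", ""),
        "Modal-Secret": os.environ.get("MODAL_TOKEN_SECRET", ""),
        "Content-Type": "application/json",
    }
    payload = {
        "model": "microsoft/Fara-7B",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": task},
                    # Fara-7B is a vision-language model, so each turn
                    # carries the current browser screenshot as an image.
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/png;base64,{screenshot_b64}"},
                    },
                ],
            },
        ],
    }
    return headers, payload
```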
## Files
| File | Description |
|------|-------------|
| `server.py` | Main FastAPI server with WebSocket and REST endpoints |
| `modal_fara_vllm.py` | Modal deployment for vLLM inference and trace storage |
| `pyproject.toml` | Python dependencies |
| `.env.example` | Example environment configuration |
## Setup
### 1. Install Dependencies
```bash
# Using uv (recommended)
uv sync
# Or using pip
pip install -e .
```
### 2. Install Playwright
```bash
playwright install chromium
```
### 3. Deploy Modal Endpoints
```bash
modal deploy backend/modal_fara_vllm.py
```
This deploys:
- **vLLM Server** - GPU-accelerated inference for Fara-7B at `https://<workspace>--fara-vllm-serve.modal.run`
- **Trace Storage** - Endpoint for storing task traces at `https://<workspace>--fara-vllm-store-trace.modal.run`
### 4. Configure Environment
Copy `.env.example` to `.env` and fill in your values:
```bash
cp .env.example .env
```
Required variables:
| Variable | Description |
|----------|-------------|
| `FARA_MODEL_NAME` | Model name (default: `microsoft/Fara-7B`) |
| `FARA_ENDPOINT_URL` | Modal vLLM endpoint URL (from deploy output) |
| `FARA_API_KEY` | API key (default: `not-needed` for Modal) |
| `MODAL_TOKEN_ID` | Modal proxy auth token ID |
| `MODAL_TOKEN_SECRET` | Modal proxy auth token secret |
| `MODAL_TRACE_STORAGE_URL` | Modal trace storage endpoint URL |
Get Modal proxy auth tokens at: https://modal.com/settings/proxy-auth-tokens
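A filled-in `.env` might look like this (all values are placeholders; the real endpoint URLs come from the `modal deploy` output and the tokens from the Modal settings page):

```bash
FARA_MODEL_NAME=microsoft/Fara-7B
FARA_ENDPOINT_URL=https://<workspace>--fara-vllm-serve.modal.run
FARA_API_KEY=not-needed
MODAL_TOKEN_ID=your-token-id
MODAL_TOKEN_SECRET=your-token-secret
MODAL_TRACE_STORAGE_URL=https://<workspace>--fara-vllm-store-trace.modal.run
```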
### 5. Run the Server
```bash
# Development mode
uvicorn backend.server:app --host 0.0.0.0 --port 8000 --reload
# Or directly
python -m backend.server
```
## API Endpoints
### WebSocket
- `ws://localhost:8000/ws` - Real-time agent communication
- **Receives**: `user_task`, `stop_task`, `ping`
- **Sends**: `heartbeat`, `agent_start`, `agent_progress`, `agent_complete`, `agent_error`
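All messages on the socket are JSON objects discriminated by a `type` field. A minimal helper for building outbound client messages might look like this (any field beyond `type` is an assumption; `server.py` defines the actual payload schema):

```python
import json

# Message types from the WebSocket API list above.
CLIENT_TO_SERVER = {"user_task", "stop_task", "ping"}
SERVER_TO_CLIENT = {"heartbeat", "agent_start", "agent_progress",
                    "agent_complete", "agent_error"}

def make_message(msg_type: str, **payload) -> str:
    """Serialize an outbound client message as JSON.

    Extra keyword fields (e.g. the task text) are illustrative; the
    server's expected field names may differ.
    """
    if msg_type not in CLIENT_TO_SERVER:
        raise ValueError(f"not a client message type: {msg_type}")
    return json.dumps({"type": msg_type, **payload})
```

A client would send e.g. `make_message("user_task", task="Find the weather in Paris")` over an open connection to `ws://localhost:8000/ws`, then read server messages until it sees `agent_complete` or `agent_error`.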
### REST
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/health` | Health check |
| GET | `/api/models` | List available models |
| GET | `/api/random-question` | Get a random example task |
| POST | `/api/traces` | Store a trace (proxies to Modal) |
## Trace Storage
Task traces are automatically uploaded to Modal volumes for research purposes. Traces include:
- Task instruction and model used
- Step-by-step agent actions with screenshots
- Token usage and timing metrics
- User evaluation (success/failed)
Duplicate traces (same ID and instruction) are automatically overwritten to capture the latest evaluation.
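A trace record covering the items above might be assembled like this. Every field name here is an illustrative assumption; the authoritative schema lives in `server.py` and `modal_fara_vllm.py`:

```python
import time
import uuid

def build_trace(instruction: str, model: str, steps: list[dict], success: bool) -> dict:
    """Assemble a trace record with the contents listed above.

    Field names are assumptions for illustration, not the server's schema.
    """
    return {
        # Traces with the same id and instruction overwrite earlier uploads.
        "id": str(uuid.uuid4()),
        "instruction": instruction,
        "model": model,
        # One entry per agent step: action taken, screenshot, and metrics.
        "steps": steps,
        # User evaluation of the run.
        "evaluation": "success" if success else "failed",
        "created_at": time.time(),
    }

trace = build_trace(
    "Find the weather in Paris",
    "microsoft/Fara-7B",
    steps=[{"action": "click", "screenshot_b64": "...", "tokens": 512, "duration_s": 1.2}],
    success=True,
)
```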
## Docker
The backend is designed to run in Docker alongside the frontend. See the root `Dockerfile` for the combined deployment.
```bash
# Build from root
docker build -t fara-cua .
# Run with env file
docker run -d --name fara-cua -p 7860:7860 --env-file backend/.env fara-cua
```
## Development
### Running Locally
For local development, you can run the backend separately:
```bash
cd backend
uvicorn server:app --host 0.0.0.0 --port 8000 --reload
```
Make sure the frontend is configured to connect to `http://localhost:8000`.
### Testing Modal Endpoints
```bash
# Test vLLM endpoint
modal run backend/modal_fara_vllm.py::test
# Check deployment status
modal app list
```
## License
See the root LICENSE file for license information.