# FaraCUA Backend
The backend server for FaraCUA - a Computer Use Agent (CUA) demo powered by Microsoft's Fara-7B vision-language model and Modal for serverless GPU inference.
## Overview

This backend provides:

- **WebSocket API** - Real-time communication with the React frontend for streaming agent actions
- **REST API** - Model listing, random question generation, and trace storage
- **Fara Agent Integration** - Runs the Fara agent with Playwright for browser automation
- **Modal Integration** - Proxies requests to Modal's vLLM endpoint and trace storage
## Architecture

```
┌──────────────┐   WebSocket    ┌──────────────┐     HTTP      ┌──────────────┐
│   Frontend   │ ─────────────► │   Backend    │ ────────────► │    Modal     │
│   (React)    │                │  (FastAPI)   │               │    (vLLM)    │
└──────────────┘                └──────────────┘               └──────────────┘
                                       │
                                       │ Playwright
                                       ▼
                                ┌──────────────┐
                                │   Browser    │
                                │  (Headless)  │
                                └──────────────┘
```
## Files

| File | Description |
|---|---|
| `server.py` | Main FastAPI server with WebSocket and REST endpoints |
| `modal_fara_vllm.py` | Modal deployment for vLLM inference and trace storage |
| `pyproject.toml` | Python dependencies |
| `.env.example` | Example environment configuration |
## Setup

### 1. Install Dependencies

```bash
# Using uv (recommended)
uv sync

# Or using pip
pip install -e .
```
### 2. Install Playwright

```bash
playwright install chromium
```
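To verify the browser install independently of the agent code, a minimal smoke test can launch headless Chromium and take a screenshot:

```python
# Smoke test for the Playwright/Chromium install; standalone, not agent code.
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    page.goto("https://example.com")
    page.screenshot(path="smoke_test.png")
    print("Page title:", page.title())
    browser.close()
```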
### 3. Deploy Modal Endpoints

```bash
modal deploy backend/modal_fara_vllm.py
```

This deploys:

- **vLLM Server** - GPU-accelerated inference for Fara-7B at `https://<workspace>--fara-vllm-serve.modal.run`
- **Trace Storage** - Endpoint for storing task traces at `https://<workspace>--fara-vllm-store-trace.modal.run`
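For orientation, a Modal deployment of this shape typically looks like the sketch below. This is illustrative, not the contents of `modal_fara_vllm.py`; the GPU type, image contents, and vLLM flags are assumptions:

```python
# Illustrative sketch only -- not the actual modal_fara_vllm.py.
# GPU type, image contents, and vLLM flags are assumptions.
import subprocess
import modal

app = modal.App("fara-vllm")
image = modal.Image.debian_slim().pip_install("vllm")
traces = modal.Volume.from_name("fara-traces", create_if_missing=True)

@app.function(image=image, gpu="A100")
@modal.web_server(port=8000, startup_timeout=600)
def serve():
    # Start an OpenAI-compatible vLLM server inside the container.
    subprocess.Popen(["vllm", "serve", "microsoft/Fara-7B", "--port", "8000"])

@app.function(image=image, volumes={"/traces": traces})
@modal.fastapi_endpoint(method="POST")
def store_trace(trace: dict):
    # Persist the JSON trace to the mounted volume.
    import json, uuid
    path = f"/traces/{trace.get('id', uuid.uuid4().hex)}.json"
    with open(path, "w") as f:
        json.dump(trace, f)
    traces.commit()  # flush volume changes so other containers see them
    return {"stored": path}
```

The function names `serve` and `store_trace` are what produce the `...-serve.modal.run` and `...-store-trace.modal.run` URLs shown above.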
### 4. Configure Environment

Copy `.env.example` to `.env` and fill in your values:

```bash
cp .env.example .env
```

Required variables:

| Variable | Description |
|---|---|
| `FARA_MODEL_NAME` | Model name (default: `microsoft/Fara-7B`) |
| `FARA_ENDPOINT_URL` | Modal vLLM endpoint URL (from deploy output) |
| `FARA_API_KEY` | API key (default: `not-needed` for Modal) |
| `MODAL_TOKEN_ID` | Modal proxy auth token ID |
| `MODAL_TOKEN_SECRET` | Modal proxy auth token secret |
| `MODAL_TRACE_STORAGE_URL` | Modal trace storage endpoint URL |

Get Modal proxy auth tokens at: https://modal.com/settings/proxy-auth-tokens
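The backend might read these settings roughly as follows (an illustrative pattern assuming `python-dotenv`; `server.py` may use a different mechanism):

```python
# Illustrative settings loading; variable names match the table above,
# the loading mechanism is an assumption.
import os
from dotenv import load_dotenv  # python-dotenv

load_dotenv()  # reads .env into the process environment

MODEL_NAME = os.getenv("FARA_MODEL_NAME", "microsoft/Fara-7B")
ENDPOINT_URL = os.environ["FARA_ENDPOINT_URL"]  # required, no default
API_KEY = os.getenv("FARA_API_KEY", "not-needed")
MODAL_TOKEN_ID = os.environ["MODAL_TOKEN_ID"]
MODAL_TOKEN_SECRET = os.environ["MODAL_TOKEN_SECRET"]
TRACE_STORAGE_URL = os.environ["MODAL_TRACE_STORAGE_URL"]
```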
### 5. Run the Server

```bash
# Development mode
uvicorn backend.server:app --host 0.0.0.0 --port 8000 --reload

# Or directly
python -m backend.server
```
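If you are exploring the code, the `app` object that uvicorn loads has roughly this shape (a toy stand-in for `server.py`, not the real implementation; the message schema shown is assumed):

```python
# Toy stand-in illustrating the app that `uvicorn backend.server:app` loads.
from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()

@app.get("/api/health")
def health():
    return {"status": "ok"}

@app.websocket("/ws")
async def ws_handler(ws: WebSocket):
    await ws.accept()
    try:
        while True:
            msg = await ws.receive_json()
            if msg.get("type") == "ping":
                await ws.send_json({"type": "heartbeat"})
            # the real server also handles user_task / stop_task here
    except WebSocketDisconnect:
        pass
```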
## API Endpoints

### WebSocket

`ws://localhost:8000/ws` - Real-time agent communication

- Receives: `user_task`, `stop_task`, `ping`
- Sends: `heartbeat`, `agent_start`, `agent_progress`, `agent_complete`, `agent_error`
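A minimal client interaction, assuming the `websockets` package and a `{"type": ..., "task": ...}` message shape (payload fields beyond `type` are assumptions; check `server.py` for the exact schema):

```python
# Minimal WebSocket client sketch; payload fields beyond "type" are assumed.
import asyncio
import json
import websockets

async def run_task(task: str):
    async with websockets.connect("ws://localhost:8000/ws") as ws:
        await ws.send(json.dumps({"type": "user_task", "task": task}))
        async for raw in ws:  # stream events until the agent finishes
            event = json.loads(raw)
            print(event.get("type"), event)
            if event.get("type") in ("agent_complete", "agent_error"):
                break

asyncio.run(run_task("What's the weather in Seattle?"))
```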
### REST

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/health` | Health check |
| GET | `/api/models` | List available models |
| GET | `/api/random-question` | Get a random example task |
| POST | `/api/traces` | Store a trace (proxies to Modal) |
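The GET endpoints can be exercised with any HTTP client; the response shapes noted in the comments are assumptions:

```python
# Quick REST smoke test against a locally running backend.
import requests

BASE = "http://localhost:8000"
print(requests.get(f"{BASE}/api/health").json())           # e.g. {"status": "ok"}
print(requests.get(f"{BASE}/api/models").json())           # available model names
print(requests.get(f"{BASE}/api/random-question").json())  # a random example task
```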
## Trace Storage
Task traces are automatically uploaded to Modal volumes for research purposes. Traces include:
- Task instruction and model used
- Step-by-step agent actions with screenshots
- Token usage and timing metrics
- User evaluation (success/failed)
Duplicate traces (same ID and instruction) are automatically overwritten to capture the latest evaluation.
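A trace upload through the proxy might look like the following; the field names are inferred from the list above and should be checked against `server.py`:

```python
# Hypothetical trace payload -- field names inferred from the list above.
import requests

trace = {
    "id": "demo-001",
    "instruction": "What's the weather in Seattle?",
    "model": "microsoft/Fara-7B",
    "steps": [
        {"action": "goto", "screenshot": "<base64 png>", "tokens": 512},
    ],
    "evaluation": "success",  # or "failed"
}
resp = requests.post("http://localhost:8000/api/traces", json=trace)
print(resp.status_code)
```

Re-posting the same `id` and `instruction` overwrites the stored trace, per the note above.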
## Docker

The backend is designed to run in Docker alongside the frontend. See the root `Dockerfile` for the combined deployment.

```bash
# Build from root
docker build -t fara-cua .

# Run with env file
docker run -d --name fara-cua -p 7860:7860 --env-file backend/.env fara-cua
```
## Development

### Running Locally

For local development, you can run the backend separately:

```bash
cd backend
uvicorn server:app --host 0.0.0.0 --port 8000 --reload
```

Make sure the frontend is configured to connect to `http://localhost:8000`.
### Testing Modal Endpoints

```bash
# Test vLLM endpoint
modal run backend/modal_fara_vllm.py::test

# Check deployment status
modal app list
```
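You can also hit the deployed vLLM endpoint directly, bypassing the backend. The sketch below assumes the endpoint exposes the standard OpenAI-compatible API that `vllm serve` provides, that `FARA_ENDPOINT_URL` is the bare base URL, and that Modal proxy auth tokens go in the `Modal-Key`/`Modal-Secret` headers:

```python
# Direct call to the Modal vLLM endpoint, bypassing the backend.
import os
import requests

resp = requests.post(
    f"{os.environ['FARA_ENDPOINT_URL']}/v1/chat/completions",
    headers={
        "Modal-Key": os.environ["MODAL_TOKEN_ID"],
        "Modal-Secret": os.environ["MODAL_TOKEN_SECRET"],
    },
    json={
        "model": "microsoft/Fara-7B",
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```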
## License

See the root `LICENSE` file for license information.