# HuggingFace Model Validator OAuth & API Analysis

## Executive Summary

This document analyzes the feasibility of improving OAuth integration and provider discovery in `src/utils/hf_model_validator.py` (lines 49-58), based on the available Gradio OAuth features and Hugging Face Hub API capabilities.

## Current Implementation Issues

### 1. Non-Existent API Endpoint

**Problem**: Lines 61-64 attempt to query `https://api-inference.huggingface.co/providers`, which does not exist.

**Evidence**:
- No documentation for this endpoint
- The code already has a fallback to hardcoded providers
- The Hugging Face Hub API documentation shows no such endpoint

**Impact**: An unnecessary API call that always fails, adding latency and error noise.

### 2. Hardcoded Provider List

**Problem**: Lines 36-48 maintain a static list of providers that may become outdated.

**Current List**: `["auto", "nebius", "together", "scaleway", "hyperbolic", "novita", "nscale", "sambanova", "ovh", "fireworks", "cerebras"]`

**Impact**: New providers won't be discovered automatically, requiring manual code updates.

### 3. Limited OAuth Token Utilization

**Problem**: While the function accepts OAuth tokens, it doesn't fully leverage them for provider discovery.

**Current State**: The token is passed to API calls but not used to discover providers dynamically.

## Available OAuth Features

### Gradio OAuth Integration

1. **`gr.LoginButton`**: Enables "Sign in with Hugging Face" in Spaces
2. **`gr.OAuthToken`**: Automatically passed to functions when the user is logged in
   - Has a `.token` attribute containing the access token
   - Is `None` when the user is not logged in
3. **`gr.OAuthProfile`**: Contains user profile information
   - `.username`: Hugging Face username
   - `.name`: Display name
   - `.profile_image`: Profile image URL

### OAuth Token Scopes

According to the Hugging Face documentation:

- **`inference-api` scope**: Required for accessing the Inference Providers API
  - Grants access to:
    - Hugging Face's own Inference API
    - All third-party inference providers (nebius, together, scaleway, etc.)
    - All models available through the Inference Providers API

**Reference**: https://huggingface.co/docs/hub/oauth#currently-supported-scopes

## Available Hugging Face Hub API Endpoints

### 1. List Models by Provider

**Endpoint**: `HfApi.list_models(inference_provider="provider_name")`

**Usage**:

```python
from huggingface_hub import HfApi

api = HfApi(token=token)
models = api.list_models(inference_provider="fireworks-ai", task="text-generation")
```

**Capabilities**:
- Filter models by a specific provider
- Filter by task type
- Supports multiple providers: `inference_provider=["fireworks-ai", "together"]`
- Get all provider-served models: `inference_provider="all"`

### 2. Get Model Provider Mapping

**Endpoint**: `HfApi.model_info(model_id, expand="inferenceProviderMapping")`

**Usage**:

```python
from huggingface_hub import model_info

info = model_info("google/gemma-3-27b-it", expand="inferenceProviderMapping")
providers = info.inference_provider_mapping
# Returns: {'hf-inference': InferenceProviderMapping(...), 'nebius': ...}
```

**Capabilities**:
- Get all providers serving a specific model
- Includes provider status (`live` or `staging`)
- Includes the provider-specific model ID
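For example, the status and provider-side model ID can be read straight off the mapping. A short sketch (the `status` and `provider_id` attribute names follow the Hub API documentation linked under References; verify them against your installed `huggingface_hub` version):

```python
from huggingface_hub import model_info

# Inspect each provider entry in the mapping, assuming the dict shape
# shown above ({provider_name: InferenceProviderMapping}).
info = model_info("google/gemma-3-27b-it", expand="inferenceProviderMapping")
for provider, mapping in info.inference_provider_mapping.items():
    print(f"{provider}: status={mapping.status}, provider_id={mapping.provider_id}")
```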
### 3. List All Provider-Served Models

**Endpoint**: `HfApi.list_models(inference_provider="all")`

**Usage**:

```python
models = api.list_models(inference_provider="all", task="text-generation", limit=100)
```

**Capabilities**:
- Get all models served by any provider
- Can extract unique providers from model metadata

## Feasibility Assessment

### ✅ Feasible Improvements

1. **Dynamic Provider Discovery**
   - **Method**: Query models with `inference_provider="all"` and extract unique providers from model info
   - **Limitation**: Requires querying multiple models, which can be slow
   - **Alternative**: Use a hybrid approach: query a sample of popular models and extract providers

2. **OAuth Token Integration**
   - **Method**: Extract the token from the `gr.OAuthToken.token` attribute
   - **Status**: Already implemented in `src/app.py` (lines 384-408)
   - **Enhancement**: Better error handling and scope validation

3. **Provider Validation**
   - **Method**: Use `model_info(expand="inferenceProviderMapping")` to validate model/provider combinations
   - **Status**: Partially implemented in `validate_model_provider_combination()`
   - **Enhancement**: Use the provider mapping instead of test API calls

### ⚠️ Limitations

1. **No Public Provider List API**
   - There is no public endpoint that lists all available providers
   - Providers must be discovered indirectly through model queries

2. **Performance Considerations**
   - Querying many models to discover providers can be slow
   - Caching is essential for a good user experience

3. **Provider Name Variations**
   - Provider names in the API may differ from display names
   - Some providers may use different identifiers (e.g., "fireworks-ai" vs "fireworks")

## Proposed Improvements

### 1. Dynamic Provider Discovery

**Approach**: Query a sample of popular models and extract the unique providers from their `inferenceProviderMapping`.

**Implementation**:

```python
import asyncio

from huggingface_hub import HfApi


async def get_available_providers(token: str | None = None) -> list[str]:
    """Get the list of available inference providers dynamically."""
    try:
        # Query a sample of popular models to discover providers
        popular_models = [
            "meta-llama/Llama-3.1-8B-Instruct",
            "mistralai/Mistral-7B-Instruct-v0.3",
            "google/gemma-2-9b-it",
            "deepseek-ai/DeepSeek-V3-0324",
        ]
        providers = {"auto"}  # Always include "auto"
        loop = asyncio.get_running_loop()
        api = HfApi(token=token)
        for model_id in popular_models:
            try:
                # model_info() is blocking, so run it in a thread pool
                info = await loop.run_in_executor(
                    None,
                    lambda m=model_id: api.model_info(m, expand="inferenceProviderMapping"),
                )
                if getattr(info, "inference_provider_mapping", None):
                    providers.update(info.inference_provider_mapping.keys())
            except Exception:
                continue  # Skip models that fail to resolve

        # Fall back to the module's hardcoded list (see issue 2) if discovery fails
        if len(providers) <= 1:  # Only "auto"
            providers.update(KNOWN_PROVIDERS)

        return sorted(providers)
    except Exception:
        return KNOWN_PROVIDERS
```

### 2. Enhanced OAuth Token Handling

**Improvements** (a sketch of the helper appears at the end of this section):
- Add a helper function to extract the token from `gr.OAuthToken`
- Validate the token scope using `api.whoami()` and an inference API test
- Better error messages for missing scopes

### 3. Caching Strategy

**Implementation**:
- Cache the provider list for 1 hour (providers don't change frequently)
- Cache model lists per provider for 30 minutes
- Invalidate the cache on authentication changes

### 4. Provider Validation Enhancement

**Current**: Makes test API calls (slow, unreliable)

**Proposed**: Use `model_info(expand="inferenceProviderMapping")` to check whether the provider is listed for the model, as sketched below together with the cache from section 3.
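A minimal sketch of that mapping-based check with a simple TTL cache (the helper names `providers_for_model` and `is_valid_combination` are hypothetical, and the mapping is assumed to be the dict shape shown in section 2):

```python
import time

from huggingface_hub import model_info

_MAPPING_CACHE: dict[str, tuple[float, set[str]]] = {}
_TTL_SECONDS = 30 * 60  # cache per-model provider mappings for 30 minutes


def providers_for_model(model_id: str, token: str | None = None) -> set[str]:
    """Return the set of providers listed for a model, with TTL caching."""
    cached = _MAPPING_CACHE.get(model_id)
    if cached and time.time() - cached[0] < _TTL_SECONDS:
        return cached[1]

    info = model_info(model_id, expand="inferenceProviderMapping", token=token)
    mapping = getattr(info, "inference_provider_mapping", None) or {}
    # Assumes the dict shape shown in section 2; newer huggingface_hub
    # versions may return a list of mapping objects instead.
    if isinstance(mapping, dict):
        providers = set(mapping.keys())
    else:
        providers = {m.provider for m in mapping}

    _MAPPING_CACHE[model_id] = (time.time(), providers)
    return providers


def is_valid_combination(model_id: str, provider: str, token: str | None = None) -> bool:
    """Check a model/provider pair against the mapping, without a test call."""
    if provider == "auto":
        return True  # "auto" delegates provider selection to the router
    try:
        return provider in providers_for_model(model_id, token=token)
    except Exception:
        return True  # fail open: don't block the user on a lookup error
```

Because `is_valid_combination` fails open, a transient Hub error degrades to the current permissive behavior rather than rejecting a valid pair.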
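And for the token handling in improvement 2, a sketch of the extraction helper (the names `resolve_token` and `check_token` are hypothetical; only `gr.OAuthToken.token` and `HfApi.whoami()` are documented APIs, and `whoami()` alone verifies token validity rather than the `inference-api` scope specifically):

```python
import gradio as gr
from huggingface_hub import HfApi


def resolve_token(oauth_token: gr.OAuthToken | None) -> str | None:
    """Extract the raw access token from Gradio's OAuth object, if present."""
    return None if oauth_token is None else oauth_token.token


def check_token(token: str | None) -> str:
    """Return a user-facing status message for the given token."""
    if token is None:
        return "Not logged in. Use the 'Sign in with Hugging Face' button."
    try:
        user = HfApi(token=token).whoami()  # fails fast on an invalid token
        return f"Logged in as {user.get('name', 'unknown')}."
    except Exception:
        return (
            "Token rejected. Make sure the Space requests the "
            "`inference-api` OAuth scope."
        )
```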
## Implementation Priority

1. **High Priority**: Remove the non-existent API endpoint call (lines 58-73)
2. **High Priority**: Add caching for provider discovery
3. **Medium Priority**: Implement dynamic provider discovery
4. **Medium Priority**: Enhance OAuth token validation
5. **Low Priority**: Add provider status (live/staging) information

## References

- [Hugging Face OAuth Documentation](https://huggingface.co/docs/hub/oauth)
- [Gradio LoginButton Documentation](https://www.gradio.app/docs/gradio/loginbutton)
- [Hugging Face Hub API - Inference Providers](https://huggingface.co/docs/inference-providers/hub-api)
- [Hugging Face Hub Python Client](https://huggingface.co/docs/huggingface_hub/package_reference/hf_api)