---
title: Multi-Modal Knowledge Distillation
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: docker
app_file: app.py
pinned: false
license: mit
short_description: Multi-Modal Knowledge Distillation for AI models
tags:
  - machine-learning
  - knowledge-distillation
  - multi-modal
  - pytorch
  - transformers
  - computer-vision
  - nlp
suggested_hardware: t4-small
suggested_storage: medium
---

# Multi-Modal Knowledge Distillation

Create new AI models through knowledge distillation from multiple pre-trained models across different modalities (text, vision, audio, and multimodal).

## Features

- **Multi-Modal Support**: Distill knowledge from text, vision, audio, and multimodal models
- **Multiple Input Sources**: Upload local files, use Hugging Face repositories, or provide direct URLs
- **Real-Time Monitoring**: Live progress tracking with WebSocket updates
- **Flexible Configuration**: Customizable student model architecture and training parameters
- **Production Ready**: Built with FastAPI, with comprehensive error handling and security measures
- **Responsive UI**: Modern, mobile-friendly web interface
- **Multiple Formats**: Support for PyTorch (.pt, .pth, .bin), Safetensors, and Hugging Face models

## 🆕 New Advanced Features

### 🔧 System Optimization

- **Memory Management**: Advanced memory management for 16GB RAM systems
- **CPU Optimization**: Optimized for CPU-only training environments
- **Chunk Loading**: Progressive loading for large models
- **Performance Monitoring**: Real-time system performance tracking

### 🔑 Token Management

- **Secure Storage**: Encrypted storage of Hugging Face tokens
- **Multiple Token Types**: Support for read, write, and fine-grained tokens
- **Auto Validation**: Automatic token validation and recommendations
- **Usage Tracking**: Monitor token usage and access patterns

### 🏥 Medical AI Support

- **Medical Datasets**: Specialized medical datasets (ROCOv2, CT-RATE, UMIE)
- **DICOM Processing**: Advanced DICOM file processing and visualization
- **Medical Preprocessing**: Specialized preprocessing for medical images
- **Modality Support**: CT, MRI, X-ray, and ultrasound image processing

### 🌐 Enhanced Model Support

- **Google Models**: Direct access to Google's open-source models
- **Streaming Datasets**: Memory-efficient dataset streaming
- **Progressive Training**: Incremental model training capabilities
- **Arabic Documentation**: Full Arabic language support

## How to Use

1. **Select Teacher Models**: Choose 1-10 pre-trained models as teachers
   - Upload local model files (.pt, .pth, .bin, .safetensors)
   - Enter Hugging Face repository names (format: organization/model-name)
   - Provide direct download URLs to model files
   - For private/gated models: add your HF token in the Space settings
2. **Configure Training**: Set up training parameters
   - Student model architecture (hidden size, layers)
   - Training parameters (steps, learning rate, temperature)
   - Distillation strategy (ensemble, weighted, sequential)
3. **Monitor Training**: Watch real-time progress
   - Live progress bar and metrics
   - Training console output
   - Download the trained model when complete

## Setup for Private/Gated Models

To access private or gated Hugging Face models:

1. **Get your Hugging Face token**:
   - Go to [https://huggingface.co/settings/tokens](https://huggingface.co/settings/tokens)
   - Create a new token with "Read" permissions
2. **Add the token to your Hugging Face Space**:
   - Go to your Space settings
   - Add a new secret: `HF_TOKEN` = `your_token_here`
   - Restart your Space
3. **Alternative**: Enter the token in the interface
   - Use the "Hugging Face Token" field in the web interface
   - This is temporary and applies only to the current session

## Supported Formats

- **PyTorch**: .pt, .pth, .bin files
- **Safetensors**: .safetensors files
- **Hugging Face**: Any public repository
- **Direct URLs**: Publicly accessible model files

## Supported Modalities

- **Text**: BERT, GPT, RoBERTa, T5, DistilBERT, etc.
- **Vision**: ViT, ResNet, EfficientNet, SigLIP, etc.
- **Multimodal**: CLIP, BLIP, ALBEF, etc.
- **Audio**: Wav2Vec2, Whisper, etc.
- **Specialized**: Background removal (RMBG), medical imaging (MedSigLIP), etc.

## Troubleshooting Common Models

### SigLIP Models (e.g., google/siglip-base-patch16-224)

- These models may require "Trust Remote Code" to be enabled
- Use the "Test Model" button to verify compatibility before training

### Custom Architecture Models

- Some models use custom code that requires "Trust Remote Code"
- Always test models before starting training
- Check the model documentation on Hugging Face for requirements

### Gemma Models (e.g., google/gemma-2b, google/gemma-3-27b-it)

- **Requires**: a Hugging Face token AND access permission
- **Steps**:
  1. Request access on the model page on Hugging Face
  2. Add your HF token in the Space settings or the interface
  3. Enable "Trust Remote Code" if needed
- **Note**: Gemma 3 models require the latest transformers version

## Technical Details

- **Backend**: FastAPI with async support
- **ML Framework**: PyTorch with Transformers
- **Frontend**: Responsive HTML/CSS/JavaScript
- **Real-Time Updates**: WebSocket communication
- **Security**: File validation, input sanitization, resource limits

## 🚀 Quick Start (Optimized)

### Option 1: Standard Run

```bash
python app.py
```

### Option 2: Optimized Run (Recommended)

```bash
python run_optimized.py
```

The optimized runner provides:

- ✅ Automatic CPU optimization
- ✅ Memory management setup
- ✅ System requirements check
- ✅ Performance recommendations
- ✅ Enhanced logging

### Option 3: Docker (Coming Soon)

```bash
docker run -p 8000:8000 ai-knowledge-distillation
```

## 🔧 Advanced Configuration

### Environment Variables

```bash
# Memory optimization
export OMP_NUM_THREADS=8
export MKL_NUM_THREADS=8
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128

# Cache directories
export HF_DATASETS_CACHE=./cache/datasets
export TRANSFORMERS_CACHE=./cache/transformers

# Token management
export HF_TOKEN=your_token_here
```

### System Requirements

#### Minimum Requirements

- Python 3.9+
- 4GB RAM
- 10GB free disk space
- CPU with 2+ cores

#### Recommended Requirements

- Python 3.10+
- 16GB RAM
- 50GB free disk space
- CPU with 8+ cores
- Intel CPU with MKL support

#### For Medical AI

- 16GB+ RAM
- 100GB+ free disk space
- Fast SSD storage

## 📊 Performance Tips

1. **Memory Optimization**:
   - Use streaming datasets for large medical datasets
   - Enable chunk loading for models >2GB
   - Monitor memory usage in real time
2. **CPU Optimization**:
   - Install the Intel Extension for PyTorch
   - Use optimized BLAS libraries (MKL, OpenBLAS)
   - Set appropriate thread counts
3. **Storage Optimization**:
   - Use SSD for cache directories
   - Regularly clean up old datasets
   - Compress model checkpoints

---

Built with ❤️ for the AI community | مبني بـ ❤️ لمجتمع الذكاء الاصطناعي
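## 📎 Appendix: What "Ensemble Distillation" Means

For readers unfamiliar with the distillation strategies mentioned above (ensemble, weighted, sequential), here is a minimal PyTorch sketch of the core idea: soften each teacher's logits with a temperature, average the resulting distributions (optionally weighted), and train the student to match that average via KL divergence. This is an illustrative sketch of the standard technique, not this app's actual internal API; the function name, signature, and default temperature are all placeholders.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits_list,
                      weights=None, temperature=2.0):
    """Soft-target loss for distilling from an ensemble of teachers.

    Illustrative sketch only -- names and defaults are placeholders,
    not the app's real configuration.
    """
    if weights is None:
        # Plain ensemble: every teacher contributes equally.
        weights = [1.0 / len(teacher_logits_list)] * len(teacher_logits_list)
    # Weighted average of the temperature-softened teacher distributions.
    teacher_probs = sum(
        w * F.softmax(t / temperature, dim=-1)
        for w, t in zip(weights, teacher_logits_list)
    )
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return (F.kl_div(student_log_probs, teacher_probs,
                     reduction="batchmean") * temperature ** 2)

# Example: two teachers, batch of 4, output dimension of 10.
student = torch.randn(4, 10, requires_grad=True)
teachers = [torch.randn(4, 10), torch.randn(4, 10)]
loss = distillation_loss(student, teachers)
loss.backward()  # gradients flow only into the student logits
```

The "weighted" strategy corresponds to passing non-uniform `weights`; a "sequential" strategy would instead apply this loss against one teacher at a time over successive training phases.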