---
title: Multi-Modal Knowledge Distillation
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: docker
app_file: app.py
pinned: false
license: mit
short_description: Multi-Modal Knowledge Distillation for AI models
tags:
  - machine-learning
  - knowledge-distillation
  - multi-modal
  - pytorch
  - transformers
  - computer-vision
  - nlp
suggested_hardware: t4-small
suggested_storage: medium
---

# Multi-Modal Knowledge Distillation

Create new AI models through knowledge distillation from multiple pre-trained models across different modalities (text, vision, audio, and multimodal).

## Features

- **Multi-Modal Support**: Distill knowledge from text, vision, audio, and multimodal models
- **Multiple Input Sources**: Upload local files, use Hugging Face repositories, or provide direct URLs
- **Real-Time Monitoring**: Live progress tracking with WebSocket updates
- **Flexible Configuration**: Customizable student model architecture and training parameters
- **Production Ready**: Built with FastAPI, with comprehensive error handling and security measures
- **Responsive UI**: Modern, mobile-friendly web interface
- **Multiple Formats**: Support for PyTorch (.pt, .pth, .bin), Safetensors, and Hugging Face models

## 🆕 New Advanced Features

### 🔧 System Optimization

- **Memory Management**: Advanced memory management for 16GB RAM systems
- **CPU Optimization**: Optimized for CPU-only training environments
- **Chunk Loading**: Progressive loading for large models
- **Performance Monitoring**: Real-time system performance tracking

### 🔑 Token Management

- **Secure Storage**: Encrypted storage of Hugging Face tokens
- **Multiple Token Types**: Support for read, write, and fine-grained tokens
- **Auto Validation**: Automatic token validation and recommendations
- **Usage Tracking**: Monitor token usage and access patterns

### 🏥 Medical AI Support

- **Medical Datasets**: Specialized medical datasets (ROCOv2, CT-RATE, UMIE)
- **DICOM Processing**: Advanced DICOM file processing and visualization
- **Medical Preprocessing**: Specialized preprocessing for medical images
- **Modality Support**: CT, MRI, X-ray, and ultrasound image processing

### 🌐 Enhanced Model Support

- **Google Models**: Direct access to Google's open-source models
- **Streaming Datasets**: Memory-efficient dataset streaming
- **Progressive Training**: Incremental model training capabilities
- **Arabic Documentation**: Full Arabic language support

## How to Use

1. **Select Teacher Models**: Choose 1-10 pre-trained models as teachers
   - Upload local model files (.pt, .pth, .bin, .safetensors)
   - Enter Hugging Face repository names (format: organization/model-name)
   - Provide direct download URLs to model files
   - For private/gated models: add your HF token in the Space settings
2. **Configure Training**: Set up training parameters
   - Student model architecture (hidden size, layers)
   - Training parameters (steps, learning rate, temperature)
   - Distillation strategy (ensemble, weighted, sequential)
3. **Monitor Training**: Watch real-time progress
   - Live progress bar and metrics
   - Training console output
   - Download the trained model when complete

## Setup for Private/Gated Models

To access private or gated Hugging Face models:

1. **Get your Hugging Face token**:
   - Go to [https://huggingface.co/settings/tokens](https://huggingface.co/settings/tokens)
   - Create a new token with "Read" permissions
2. **Add the token to your Hugging Face Space**:
   - Go to your Space settings
   - Add a new secret: `HF_TOKEN` = `your_token_here`
   - Restart your Space
3. **Alternative**: Enter the token in the interface
   - Use the "Hugging Face Token" field in the web interface
   - This is temporary and applies only to the current session

## Supported Formats

- **PyTorch**: .pt, .pth, .bin files
- **Safetensors**: .safetensors files
- **Hugging Face**: Any public repository
- **Direct URLs**: Publicly accessible model files

## Supported Modalities

- **Text**: BERT, GPT, RoBERTa, T5, DistilBERT, etc.
- **Vision**: ViT, ResNet, EfficientNet, SigLIP, etc.
- **Multimodal**: CLIP, BLIP, ALBEF, etc.
- **Audio**: Wav2Vec2, Whisper, etc.
- **Specialized**: Background removal (RMBG), medical imaging (MedSigLIP), etc.

## Troubleshooting Common Models

### SigLIP Models (e.g., google/siglip-base-patch16-224)

- These models may require "Trust Remote Code" to be enabled
- Use the "Test Model" button to verify compatibility before training

### Custom Architecture Models

- Some models use custom code that requires "Trust Remote Code"
- Always test models before starting training
- Check the model documentation on Hugging Face for requirements

### Gemma Models (e.g., google/gemma-2b, google/gemma-3-27b-it)

- **Requires**: a Hugging Face token AND access permission
- **Steps**:
  1. Request access on the model page on Hugging Face
  2. Add your HF token in the Space settings or the interface
  3. Enable "Trust Remote Code" if needed
- **Note**: Gemma 3 models require the latest transformers version

## Technical Details

- **Backend**: FastAPI with async support
- **ML Framework**: PyTorch with Transformers
- **Frontend**: Responsive HTML/CSS/JavaScript
- **Real-Time Updates**: WebSocket communication
- **Security**: File validation, input sanitization, resource limits

## 🚀 Quick Start (Optimized)

### Option 1: Standard Run

```bash
python app.py
```

### Option 2: Optimized Run (Recommended)

```bash
python run_optimized.py
```

The optimized runner provides:

- ✅ Automatic CPU optimization
- ✅ Memory management setup
- ✅ System requirements check
- ✅ Performance recommendations
- ✅ Enhanced logging

### Option 3: Docker (Coming Soon)

```bash
docker run -p 8000:8000 ai-knowledge-distillation
```

## 🔧 Advanced Configuration

### Environment Variables

```bash
# Memory optimization
export OMP_NUM_THREADS=8
export MKL_NUM_THREADS=8
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128

# Cache directories
export HF_DATASETS_CACHE=./cache/datasets
export TRANSFORMERS_CACHE=./cache/transformers

# Token management
export HF_TOKEN=your_token_here
```

### System Requirements

#### Minimum Requirements

- Python 3.9+
- 4GB RAM
- 10GB free disk space
- CPU with 2+ cores

#### Recommended Requirements

- Python 3.10+
- 16GB RAM
- 50GB free disk space
- CPU with 8+ cores
- Intel CPU with MKL support

#### For Medical AI

- 16GB+ RAM
- 100GB+ free disk space
- Fast SSD storage

## 📊 Performance Tips

1. **Memory Optimization**:
   - Use streaming datasets for large medical datasets
   - Enable chunk loading for models >2GB
   - Monitor memory usage in real time
2. **CPU Optimization**:
   - Install the Intel Extension for PyTorch
   - Use optimized BLAS libraries (MKL, OpenBLAS)
   - Set appropriate thread counts
3. **Storage Optimization**:
   - Use SSD for cache directories
   - Regularly clean up old datasets
   - Compress model checkpoints

---

Built with ❤️ for the AI community | مبني بـ ❤️ لمجتمع الذكاء الاصطناعي
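## 📎 Appendix: What "Ensemble Distillation" Means

For readers unfamiliar with the distillation strategies mentioned above (ensemble, weighted, sequential), here is a minimal PyTorch sketch of the core idea: soften each teacher's logits with a temperature, average the resulting distributions (optionally weighted), and train the student to match that average via KL divergence. This is an illustrative sketch of the standard technique, not this app's actual internal API; the function name, signature, and default temperature are all placeholders.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits_list,
                      weights=None, temperature=2.0):
    """Soft-target loss for distilling from an ensemble of teachers.

    Illustrative sketch only -- names and defaults are placeholders,
    not the app's real configuration.
    """
    if weights is None:
        # Plain ensemble: every teacher contributes equally.
        weights = [1.0 / len(teacher_logits_list)] * len(teacher_logits_list)
    # Weighted average of the temperature-softened teacher distributions.
    teacher_probs = sum(
        w * F.softmax(t / temperature, dim=-1)
        for w, t in zip(weights, teacher_logits_list)
    )
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return (F.kl_div(student_log_probs, teacher_probs,
                     reduction="batchmean") * temperature ** 2)

# Example: two teachers, batch of 4, output dimension of 10.
student = torch.randn(4, 10, requires_grad=True)
teachers = [torch.randn(4, 10), torch.randn(4, 10)]
loss = distillation_loss(student, teachers)
loss.backward()  # gradients flow only into the student logits
```

The "weighted" strategy corresponds to passing non-uniform `weights`; a "sequential" strategy would instead apply this loss against one teacher at a time over successive training phases.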