Real CUDA warmup benchmarking with actual Transformers models. Measure the performance impact of the caching_allocator_warmup function.
caching_allocator_warmup function at transformers/src/transformers/modeling_utils.py:6186. This interactive tool loads models twice - once with warmup disabled and once with warmup enabled - to demonstrate the significant loading time improvements.