| license: apache-2.0 | |
| datasets: | |
| - ILSVRC/imagenet-1k | |
| model-index: | |
| - name: MaskGIT-Tokenizer-10bits | |
| results: | |
| - task: | |
| type: image-generation | |
| dataset: | |
| name: ILSVRC/imagenet-1k | |
| type: ILSVRC/imagenet-1k | |
| metrics: | |
| - name: rFID | |
| type: rFID | |
| value: 1.96 | |
| - name: InceptionScore | |
| type: InceptionScore | |
| value: 178.3 | |
| - name: LPIPS | |
| type: LPIPS | |
| value: 0.331 | |
| - name: PSNR | |
| type: PSNR | |
| value: 18.6 | |
| - name: SSIM | |
| type: SSIM | |
| value: 0.47 | |
| - name: CodebookUsage | |
| type: CodebookUsage | |
| value: 1.0 | |
| This model is the MaskGIT tokenizer with a 10-bit vocabulary (1,024 codebook entries), adapted for use in the MaskBit codebase. It uses a downsampling factor of 16 and was trained on ImageNet at a resolution of 256×256. | |
| You can find more details in the original [repository](https://github.com/google-research/maskgit) and in the [paper](https://arxiv.org/abs/2202.04200). All credit for this model belongs to Huiwen Chang, Han Zhang, Lu Jiang, Ce Liu, and William T. Freeman. |
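
The figures above imply a fixed token budget per image. The following is a minimal sketch of that arithmetic only, assuming a square input whose side is divisible by the downsampling factor. The names `token_grid`, `CODEBOOK_BITS`, and the other constants are hypothetical and are not part of the MaskGIT or MaskBit codebases.

```python
from typing import Tuple

# Values taken from the model description above.
CODEBOOK_BITS = 10        # "10bits" vocabulary
DOWNSAMPLE_FACTOR = 16    # spatial downsampling factor
RESOLUTION = 256          # training resolution (square images assumed)


def token_grid(resolution: int, downsample: int) -> Tuple[int, int]:
    """Return (grid side length, total token count) for a square image.

    Hypothetical helper for illustration; assumes `resolution` is an
    exact multiple of `downsample`.
    """
    if resolution % downsample != 0:
        raise ValueError("resolution must be divisible by the downsampling factor")
    side = resolution // downsample
    return side, side * side


codebook_size = 2 ** CODEBOOK_BITS                      # number of distinct token ids
side, num_tokens = token_grid(RESOLUTION, DOWNSAMPLE_FACTOR)
bits_per_image = num_tokens * CODEBOOK_BITS             # size of the discrete code

print(f"codebook size: {codebook_size}")
print(f"token grid: {side}x{side} = {num_tokens} tokens")
print(f"bits per image: {bits_per_image}")
```

Under these assumptions, each 256×256 image maps to a 16×16 grid of token ids, each drawn from 1,024 possible values.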