Update README.md
README.md
CHANGED
@@ -16,6 +16,7 @@ pipeline_tag: text-generation
 
 **SeqKD-gpt2-340M** is a gpt2-medium (340M) model distilled from [gpt2-xlarge (1.5B)](https://huggingface.co/MiniLLM/teacher-gpt2-1.5B) on [databricks-dolly-15k](https://huggingface.co/datasets/aisquared/databricks-dolly-15k) with sequence-level forward KLD.
 
+It is used as a baseline for [MiniLLM](https://huggingface.co/MiniLLM/MiniLLM-gpt2-340M).
 
 ## Other Baselines
 + [SFT w/o KD](https://huggingface.co/MiniLLM/SFT-gpt2-340M)