Commit b1188a4
Parent(s): 02a80c2
Update README.md
README.md CHANGED
@@ -85,7 +85,13 @@ model = transformers.AutoModelForCausalLM.from_pretrained('mosaicml/mpt-7b', con
 model.to(device='cuda:0')
 ```
 
-
+Although the model was trained with a sequence length of 2048, ALiBi enables users to increase the maximum sequence length during finetuning and/or deployment. For example:
+
+```python
+config = transformers.AutoConfig.from_pretrained('mosaicml/mpt-7b', trust_remote_code=True)
+config.update({"max_seq_len": 4096})
+model = transformers.AutoModelForCausalLM.from_pretrained('mosaicml/mpt-7b', config=config, trust_remote_code=True)
+```
 
 This model was trained with the [EleutherAI/gpt-neox-20b](https://huggingface.co/EleutherAI/gpt-neox-20b) tokenizer.
 
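For context on how the added snippet composes with the loading code already in the README, here is a minimal end-to-end sketch (not part of the commit): it applies the `max_seq_len` override and runs a short generation with the gpt-neox-20b tokenizer mentioned above. The prompt text, `max_new_tokens` value, and the availability of a CUDA device are illustrative assumptions.

```python
# Minimal usage sketch, not part of the commit. Assumes a CUDA device
# and uses the EleutherAI/gpt-neox-20b tokenizer noted in the README.
import torch
import transformers

tokenizer = transformers.AutoTokenizer.from_pretrained('EleutherAI/gpt-neox-20b')

# Override the 2048-token training context via ALiBi, as in the commit.
config = transformers.AutoConfig.from_pretrained('mosaicml/mpt-7b', trust_remote_code=True)
config.update({"max_seq_len": 4096})
model = transformers.AutoModelForCausalLM.from_pretrained('mosaicml/mpt-7b', config=config, trust_remote_code=True)
model.to(device='cuda:0')
model.eval()

# Tokenize an illustrative prompt and generate a short continuation.
inputs = tokenizer('MosaicML is', return_tensors='pt').to('cuda:0')
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```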