BERT Human-like Reward Model

This is a reward model fine-tuned from BERT base uncased (google-bert/bert-base-uncased). It scores text by how human-like it reads.

Inference

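!pip install transformers accelerate

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "entfane/BERT_human_like_RM"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # disable dropout for inference

# Each string is a prompt followed by a candidate response.
messages = [
    "How are you doing? Great",
    "How are you doing? Greetings! I am doing just fine, may I ask you, how are you doing?",
]
# Pad to the longest message in the batch and truncate overly long inputs.
inputs = tokenizer(messages, return_tensors="pt", padding=True, truncation=True).to(model.device)
with torch.no_grad():
    output = model(**inputs)
print(output.logits)  # raw model scores, one row per message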
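
A reward model is usually used to compare candidate responses rather than to read a single score in isolation. The sketch below shows one way to rank messages by score. `rank_by_reward` is a hypothetical helper, not part of this model or of transformers, and the scores are made-up illustration values. It also assumes the model emits a single logit per message, which this card does not confirm; if the head has several labels, pick the relevant column first.

```python
from typing import List, Tuple


def rank_by_reward(messages: List[str], scores: List[float]) -> List[Tuple[str, float]]:
    """Pair each message with its score and sort from highest to lowest reward."""
    if len(messages) != len(scores):
        raise ValueError("messages and scores must have the same length")
    return sorted(zip(messages, scores), key=lambda pair: pair[1], reverse=True)


# Hypothetical scores for illustration only; they are not real model outputs.
# With the inference snippet, and assuming one logit per message, they could
# come from: scores = output.logits.squeeze(-1).tolist()
example_messages = ["response A", "response B", "response C"]
example_scores = [0.2, 1.5, -0.3]

for message, score in rank_by_reward(example_messages, example_scores):
    print(f"{score:+.2f}  {message}")
```

Sorting plain Python floats keeps the helper independent of the tensor library, so it can sit at the end of any scoring pipeline.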


Model details

Model size: 0.1B params
Tensor type: F32 (Safetensors)
Base model: google-bert/bert-base-uncased
