Update app.py
app.py CHANGED
@@ -56,13 +56,22 @@ def chat(message: str, history: list, stm_state: torch.Tensor, llm_history: list
 with gr.Blocks(title="RxT-Beta-Micro-AI 270M (Supervised) Demo") as demo:
     gr.Markdown("""
     # RxT-Beta-Micro-Supervised 290M vs Stateless LLM Reference 275M
-    Compare Experimental Reactive Transformer with Stateless LLM Reference, trained on the same limited
+    Compare the experimental Reactive Transformer with a stateless LLM reference, both trained on the same limited real-world data.
+
+    Both models were pre-trained on 10B tokens from English Wikipedia and FineWeb-Edu, then fine-tuned on 1.1M single interactions
+    and on 30k filtered multi-turn conversations.
+
+    That is a very small amount of pre-training data compared to the 1T/2T tokens used for production small LLMs. The experiment is
+    designed to demonstrate that RxT learns faster and achieves better results, even after very short training.
+
+    Accuracy (next-token prediction) in multi-turn conversation training (validation dataset):
+    - RxT: 88%
+    - LLM: 60%
 
     ## Limitations
     The supervised version of the model is still at an intermediate stage and will be further improved
     in Reinforcement Learning stages (the demo will be updated continuously), so the model may generate
-    inaccurate answers and memory retention is weak.
-    advantages, especially infinite context and no delays (small delays are caused by Spaces ZeroGPU allocation).
+    inaccurate answers and memory retention is weak.
     """)
 
     with gr.Row():
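For orientation, the hunk header above shows the app's `chat` callback signature: it receives the user message, the chat display history, the RxT short-term memory tensor (`stm_state`), and the stateless LLM's accumulated message list (`llm_history`). Below is a minimal sketch of how such a callback could be wired in Gradio. The reply strings are placeholders (not the Space's real generation code), the STM shape passed to `gr.State` is an assumption, and the real app lays out its UI with `gr.Row()` rather than this single-column layout.

```python
import gradio as gr
import torch

def chat(message: str, history: list, stm_state: torch.Tensor, llm_history: list):
    # Placeholder reply -- the real app.py runs the RxT model here. RxT answers
    # from the current message plus its fixed-size short-term memory (STM)
    # tensor, returning an updated STM for the next turn.
    rxt_reply = f"(RxT reply to: {message})"
    # The stateless LLM has no memory of its own: its full message list must be
    # appended to and re-processed on every single turn.
    llm_history = llm_history + [{"role": "user", "content": message}]
    llm_reply = "(LLM reply)"
    llm_history = llm_history + [{"role": "assistant", "content": llm_reply}]
    history = history + [
        {"role": "user", "content": message},
        {"role": "assistant", "content": f"RxT: {rxt_reply}\n\nLLM: {llm_reply}"},
    ]
    # Returning "" clears the textbox; the other outputs update session state.
    return "", history, stm_state, llm_history

with gr.Blocks(title="RxT-Beta-Micro-AI 270M (Supervised) Demo") as demo:
    chatbot = gr.Chatbot(type="messages")
    msg = gr.Textbox()
    # gr.State keeps per-session values alive between turns: the RxT memory
    # tensor (initial shape assumed here) and the stateless LLM's history.
    stm = gr.State(torch.zeros(1, 512))
    llm_hist = gr.State([])
    msg.submit(chat, [msg, chatbot, stm, llm_hist], [msg, chatbot, stm, llm_hist])

demo.launch()
```

The contrast the Markdown text draws is visible directly in the state handling: `stm_state` stays the same size no matter how long the conversation runs, while `llm_history` grows with every turn and must be reprocessed in full.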