AdamF92 committed · Commit f2bf604 · verified · 1 Parent(s): 50dbd4a

Update app.py

Files changed (1)
  1. app.py +12 -3
app.py CHANGED
@@ -56,13 +56,22 @@ def chat(message: str, history: list, stm_state: torch.Tensor, llm_history: list
 with gr.Blocks(title="RxT-Beta-Micro-AI 270M (Supervised) Demo") as demo:
     gr.Markdown("""
     # RxT-Beta-Micro-Supervised 290M vs Stateless LLM Reference 275M
-    Compare Experimental Reactive Transformer with Stateless LLM Reference, trained on the same limited 10B tokens dataset.
+    Compare the experimental Reactive Transformer with a stateless LLM reference, both trained on the same limited real-world data.
+
+    Both models were pre-trained on 10B tokens from English Wikipedia and FineWeb-Edu, then fine-tuned on 1.1M single interactions
+    and 30k filtered multi-turn conversations.
+
+    That is a very small amount of pre-training data compared to the 1T-2T tokens used for production small LLMs. The experiment is
+    designed to show that RxT learns faster and achieves better results, even after very short training.
+
+    Accuracy (next-token prediction) in multi-turn conversation training (validation dataset):
+    - RxT: 88%
+    - LLM: 60%

     ## Limitations
     The supervised version of the model is still at an intermediate stage and will be further improved
     in Reinforcement Learning stages (the demo will be updated continuously), so the model may generate
-    inaccurate answers and memory retention is weak. However, it should still demonstrate the architecture's
-    advantages, especially infinite context and no delays (small delays are caused by Spaces ZeroGPU allocation).
+    inaccurate answers, and memory retention is weak.
     """)

     with gr.Row():
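The accuracy figures added in this commit are next-token prediction accuracy on a validation set. As a rough, hypothetical illustration of that metric (this is not the project's actual evaluation code, and the function name and padding convention are assumptions), token-level accuracy over teacher-forced predictions can be computed like this:

```python
def next_token_accuracy(predicted_ids, target_ids, pad_id=0):
    """Fraction of non-padding positions where the predicted token
    matches the target token (teacher-forced next-token prediction)."""
    correct = 0
    total = 0
    for pred, tgt in zip(predicted_ids, target_ids):
        if tgt == pad_id:  # ignore padding positions in the target
            continue
        total += 1
        if pred == tgt:
            correct += 1
    return correct / total if total else 0.0

# Toy example: 4 of 5 non-padding predictions match the targets.
print(next_token_accuracy([5, 7, 9, 2, 3, 0], [5, 7, 9, 2, 4, 0]))  # → 0.8
```

In a real evaluation the IDs would come from argmax over the model's logits at each position, averaged across the validation batches.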