From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones. https://huggingface.co/papers/2509.25123
- weizechen/RL-Compositionality-Stage1-RFT-Data (118k • 55 • 1)
- weizechen/RL-Compositionality-Stage2-RL-Level1-TrainData (500k • 42 • 1)
- weizechen/RL-Compositionality-Stage2-RL-Level2-TrainData (500k • 42 • 1)
- weizechen/RL-Compositionality-Stage2-RL-Level8-TestData (2.05k • 33 • 1)