suu

Suu

AI & ML interests

None yet

Recent Activity

upvoted 2 papers 10 days ago

Olmo 3

Paper • 2512.13961 • Published 13 days ago • 22

Step-GUI Technical Report

Paper • 2512.15431 • Published 11 days ago • 123

upvoted 2 collections 12 days ago

Olmo 3 Pre-training

All artifacts related to Olmo 3 pre-training • 10 items • Updated 5 days ago • 31

Olmo 3.1

The latest members of the Olmo 3 family: another 3 weeks of RL for 32B Think, the 32B Instruct model, large post-training research datasets... • 9 items • Updated 5 days ago • 36

upvoted 2 collections 17 days ago

Olmo 3

Artifacts for the Olmo 3 release. • 9 items • Updated 5 days ago • 156

Olmo 3 Post-training

All artifacts for post-training Olmo 3. Datasets follow the model that resulted from training on them. • 32 items • Updated 5 days ago • 46

upvoted a paper 21 days ago

Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning

Paper • 2512.05591 • Published 23 days ago • 16

upvoted a paper about 2 months ago

The Path Not Taken: RLVR Provably Learns Off the Principals

Paper • 2511.08567 • Published Nov 11 • 33

upvoted a collection 2 months ago

AEPO

The official datasets and model checkpoints of AEPO • 5 items • Updated 8 days ago • 4

upvoted a paper 2 months ago

Agentic Entropy-Balanced Policy Optimization

Paper • 2510.14545 • Published Oct 16 • 104

upvoted 2 papers 3 months ago

CE-GPPO: Controlling Entropy via Gradient-Preserving Clipping Policy Optimization in Reinforcement Learning

Paper • 2509.20712 • Published Sep 25 • 19

Finedeep: Mitigating Sparse Activation in Dense LLMs via Multi-Layer Fine-Grained Experts

Paper • 2502.12928 • Published Feb 18 • 1

upvoted 2 collections 4 months ago

KlearReasoner

KlearReasoner • 7 items • Updated 20 days ago • 5

RL+reason model

257 items • Updated 3 days ago • 22

upvoted a paper 5 months ago

Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization

Paper • 2508.07629 • Published Aug 11 • 43

upvoted a paper about 1 year ago

CartesianMoE: Boosting Knowledge Sharing among Experts via Cartesian Product Routing in Mixture-of-Experts

Paper • 2410.16077 • Published Oct 21, 2024 • 1

upvoted a paper over 1 year ago

MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts

Paper • 2407.09816 • Published Jul 13, 2024 • 1