-
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
Paper • 2402.15506 • Published • 18 -
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent
Paper • 2404.03648 • Published • 30 -
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts
Paper • 2405.19893 • Published • 33 -
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
Paper • 2405.19888 • Published • 7
Collections
Discover the best community collections!
Collections including paper arxiv:2511.14460
-
ExGRPO: Learning to Reason from Experience
Paper • 2510.02245 • Published • 80 -
A Practitioner's Guide to Multi-turn Agentic Reinforcement Learning
Paper • 2510.01132 • Published • 5 -
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
Paper • 2510.04618 • Published • 125 -
MixReasoning: Switching Modes to Think
Paper • 2510.06052 • Published • 21
-
UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models
Paper • 2410.14059 • Published • 62 -
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Paper • 2503.05179 • Published • 46 -
Token-Efficient Long Video Understanding for Multimodal LLMs
Paper • 2503.04130 • Published • 96 -
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
Paper • 2503.10639 • Published • 53
-
GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization
Paper • 2511.15705 • Published • 92 -
O-Mem: Omni Memory System for Personalized, Long Horizon, Self-Evolving Agents
Paper • 2511.13593 • Published • 24 -
OmniScientist: Toward a Co-evolving Ecosystem of Human and AI Scientists
Paper • 2511.16931 • Published • 6 -
General Agentic Memory Via Deep Research
Paper • 2511.18423 • Published • 158
-
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Paper • 2504.20571 • Published • 98 -
One RL to See Them All: Visual Triple Unified Reinforcement Learning
Paper • 2505.18129 • Published • 61 -
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
Paper • 2503.16219 • Published • 52 -
Performance Trade-offs of Optimizing Small Language Models for E-Commerce
Paper • 2510.21970 • Published • 2
-
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Paper • 2408.03314 • Published • 63 -
TAG: A Decentralized Framework for Multi-Agent Hierarchical Reinforcement Learning
Paper • 2502.15425 • Published • 9 -
EgoLife: Towards Egocentric Life Assistant
Paper • 2503.03803 • Published • 46 -
Visual-RFT: Visual Reinforcement Fine-Tuning
Paper • 2503.01785 • Published • 85
-
GenEx: Generating an Explorable World
Paper • 2412.09624 • Published • 97 -
Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model
Paper • 2501.02790 • Published • 9 -
Who's Your Judge? On the Detectability of LLM-Generated Judgments
Paper • 2509.25154 • Published • 29 -
TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning
Paper • 2509.25760 • Published • 55
-
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
Paper • 2402.15506 • Published • 18 -
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent
Paper • 2404.03648 • Published • 30 -
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts
Paper • 2405.19893 • Published • 33 -
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
Paper • 2405.19888 • Published • 7
-
GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization
Paper • 2511.15705 • Published • 92 -
O-Mem: Omni Memory System for Personalized, Long Horizon, Self-Evolving Agents
Paper • 2511.13593 • Published • 24 -
OmniScientist: Toward a Co-evolving Ecosystem of Human and AI Scientists
Paper • 2511.16931 • Published • 6 -
General Agentic Memory Via Deep Research
Paper • 2511.18423 • Published • 158
-
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Paper • 2504.20571 • Published • 98 -
One RL to See Them All: Visual Triple Unified Reinforcement Learning
Paper • 2505.18129 • Published • 61 -
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
Paper • 2503.16219 • Published • 52 -
Performance Trade-offs of Optimizing Small Language Models for E-Commerce
Paper • 2510.21970 • Published • 2
-
ExGRPO: Learning to Reason from Experience
Paper • 2510.02245 • Published • 80 -
A Practitioner's Guide to Multi-turn Agentic Reinforcement Learning
Paper • 2510.01132 • Published • 5 -
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
Paper • 2510.04618 • Published • 125 -
MixReasoning: Switching Modes to Think
Paper • 2510.06052 • Published • 21
-
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Paper • 2408.03314 • Published • 63 -
TAG: A Decentralized Framework for Multi-Agent Hierarchical Reinforcement Learning
Paper • 2502.15425 • Published • 9 -
EgoLife: Towards Egocentric Life Assistant
Paper • 2503.03803 • Published • 46 -
Visual-RFT: Visual Reinforcement Fine-Tuning
Paper • 2503.01785 • Published • 85
-
UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models
Paper • 2410.14059 • Published • 62 -
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Paper • 2503.05179 • Published • 46 -
Token-Efficient Long Video Understanding for Multimodal LLMs
Paper • 2503.04130 • Published • 96 -
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
Paper • 2503.10639 • Published • 53
-
GenEx: Generating an Explorable World
Paper • 2412.09624 • Published • 97 -
Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model
Paper • 2501.02790 • Published • 9 -
Who's Your Judge? On the Detectability of LLM-Generated Judgments
Paper • 2509.25154 • Published • 29 -
TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning
Paper • 2509.25760 • Published • 55