Apply for community grant: Academic project (gpu)
#1 by Blanca - opened
We are building a leaderboard to assess the capacity of LLMs to generate Critical Questions. You can find the details of this task here: Critical Questions Generation: Motivation and Challenges
The benefit of this leaderboard is that it lets us keep the test set private, which avoids data contamination. The validation data and all the resources we used to build it are already public: https://github.com/hitz-zentroa/Benchmarking_CQs-Gen
A GPU is required because the most reliable way to evaluate this task is with a 9B-parameter LLM-as-a-Judge, as shown in the paper "Benchmarking Critical Questions Generation: A Challenging Reasoning Task for Large Language Models": https://arxiv.org/abs/2505.11341
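To illustrate the evaluation setup, here is a minimal sketch of the LLM-as-a-Judge loop. The prompt wording, label set, and parsing are hypothetical placeholders, not the actual rubric from the paper; in the real pipeline the canned response below would be replaced by a generation call to the 9B judge model running on the GPU.

```python
import re

# Illustrative label set (assumed for this sketch, not taken from the paper).
LABELS = ("Useful", "Unhelpful", "Invalid")

def build_judge_prompt(argument: str, question: str) -> str:
    """Compose a prompt asking the judge model whether a generated
    critical question genuinely challenges the given argument."""
    return (
        "You are evaluating critical questions for argumentative texts.\n"
        f"Argument: {argument}\n"
        f"Candidate question: {question}\n"
        f"Label the question as one of: {', '.join(LABELS)}. "
        "Answer with the label only."
    )

def parse_judge_label(judge_output: str) -> str:
    """Extract the first recognised label from the judge's raw output,
    defaulting to 'Invalid' when no label is found."""
    match = re.search(r"\b(Useful|Unhelpful|Invalid)\b", judge_output, re.IGNORECASE)
    return match.group(1).capitalize() if match else "Invalid"

# Canned judge response standing in for a real model generation:
prompt = build_judge_prompt(
    "We should ban cars downtown because pollution is rising.",
    "Is rising pollution actually caused by downtown car traffic?",
)
label = parse_judge_label("The question is Useful: it probes the causal premise.")
```

This keeps the judging logic (prompting and verdict parsing) separate from model inference, so the same harness can be pointed at any judge model.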