Why RL? Why Now?

2025 is the year (or decade) AI moves from "talking" to "doing." For the last two years, we've optimised for plausibility (does it sound right?). Now, we optimise for verifiability (did it work?).

Reinforcement Learning is the engine of this shift. It is crucial for problems where:

  1. Multiple solutions exist (Creativity > Pattern Matching).
  2. No training data exists (We can't clone human behaviour; we must discover new strategies).
  3. The environment is non-differentiable (Black-box software, compilers, games, biology).

Every project in this hackathon should address some part of the loop: Agent → Action → Environment → Reward → Update
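To make the loop concrete, here is a minimal sketch on a toy multi-armed bandit. Everything in it is an illustrative assumption rather than part of the hackathon stack: the function name `run_loop`, the epsilon-greedy agent, and the reward probabilities are all hypothetical. The environment is deliberately a black box that only hands back a scalar reward.

```python
import random


def run_loop(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy bandit (hypothetical example setup)."""
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    values = [0.0] * n_arms  # agent's running reward estimate per action
    counts = [0] * n_arms    # how many times each action was taken

    for _ in range(steps):
        # Agent -> Action: explore with probability epsilon, otherwise exploit.
        if rng.random() < epsilon:
            action = rng.randrange(n_arms)
        else:
            action = max(range(n_arms), key=lambda a: values[a])

        # Environment -> Reward: the agent only ever sees this scalar.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Update: incremental mean of the rewards observed for this action.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]

    return values, counts


if __name__ == "__main__":
    values, counts = run_loop([0.1, 0.5, 0.9])
    print("estimated values:", [round(v, 2) for v in values])
    print("pull counts:", counts)
```

A real project would swap the bandit for a richer environment and the running mean for a learned policy, but the four stages stay the same.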

Importantly, RL isn’t just about training. Rich environments with realistic, verifiable tasks are the new “gold” for data, and research and development in these areas is just as valuable as training, perhaps even more so. As such, we’ve organised the hackathon around three tracks: Environments, Tasks, and Training.


🏁 The Tracks

Track 1: Building Environments

The model can only be as smart as the world it lives in. This track is about wrapping real software, games, or business logic into Gyms (environments with a step() function).

The main question: How do we create novel, challenging RL environments out of datasets, existing software, or entirely from scratch?
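As a rough sketch of what "an environment with a step() function" can look like, the toy class below wraps a number-guessing game. The class name `GuessNumberEnv`, the observation encoding, and the helper `binary_search_policy` are all hypothetical. The return shapes loosely mirror the convention used by Gymnasium (`reset` returning an observation and an info dict, `step` returning observation, reward, terminated, truncated, info), but the code does not depend on that library.

```python
import random


class GuessNumberEnv:
    """Toy environment: find a hidden integer using higher/lower feedback."""

    def __init__(self, low=0, high=100, max_steps=10, seed=None):
        self.low, self.high, self.max_steps = low, high, max_steps
        self.rng = random.Random(seed)
        self.target = None
        self.steps = 0

    def reset(self):
        """Start a new episode; observation 0 means 'no feedback yet'."""
        self.target = self.rng.randint(self.low, self.high)
        self.steps = 0
        return 0, {}

    def step(self, action):
        """Apply one guess and return (obs, reward, terminated, truncated, info)."""
        self.steps += 1
        terminated = action == self.target
        truncated = not terminated and self.steps >= self.max_steps
        # Observation: -1 if the guess was too low, +1 if too high, 0 if correct.
        obs = (action > self.target) - (action < self.target)
        reward = 1.0 if terminated else 0.0  # verifiable: either it worked or not
        return obs, reward, terminated, truncated, {"steps": self.steps}


def binary_search_policy(env):
    """Hand-written baseline policy that plays one full episode."""
    env.reset()
    lo, hi = env.low, env.high
    total = 0.0
    while True:
        guess = (lo + hi) // 2
        obs, reward, terminated, truncated, info = env.step(guess)
        total += reward
        if terminated or truncated:
            return total, info["steps"]
        if obs > 0:
            hi = guess - 1  # guess was too high
        else:
            lo = guess + 1  # guess was too low


if __name__ == "__main__":
    print(binary_search_policy(GuessNumberEnv(seed=1)))
```

A scripted baseline like `binary_search_policy` is a cheap sanity check that an environment is actually solvable before any training is attempted.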

Ideas to Push:

Track 2: Building Task Curricula

An environment is useless without tasks and their reward functions.

The main question: How can we automate the production of interesting, diverse tasks with progressive difficulty?
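One possible shape for an automated curriculum is a generator that emits tasks level by level, paired with a reward function that can be checked mechanically. The sketch below uses arithmetic purely as a stand-in domain; `Task`, `make_task`, `reward`, and `curriculum` are hypothetical names, and the difficulty rule (more operands and larger numbers per level) is an assumption chosen for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Task:
    prompt: str
    answer: int
    level: int


def make_task(level, rng):
    """Build one task; higher levels get more operands and larger numbers."""
    operands = [rng.randint(1, 10 ** level) for _ in range(level + 1)]
    ops = [rng.choice("+-") for _ in range(level)]
    expr, answer = str(operands[0]), operands[0]
    for op, n in zip(ops, operands[1:]):
        expr += f" {op} {n}"
        answer = answer + n if op == "+" else answer - n
    return Task(prompt=f"Compute {expr}", answer=answer, level=level)


def reward(task, response):
    """Verifiable reward: 1.0 only if the response parses to the exact answer."""
    try:
        return 1.0 if int(response.strip()) == task.answer else 0.0
    except ValueError:
        return 0.0


def curriculum(levels, per_level, seed=0):
    """Yield tasks in order of increasing difficulty."""
    rng = random.Random(seed)
    for level in range(1, levels + 1):
        for _ in range(per_level):
            yield make_task(level, rng)


if __name__ == "__main__":
    for task in curriculum(levels=3, per_level=2):
        print(task.level, task.prompt, "->", task.answer)
```

The same pattern carries over to richer domains as long as each generated task ships with a checker that does not need a human in the loop.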

Ideas to Push:

Track 3: Training Agents

For the Machine Learning Engineers. Take an environment and make a number go up.

The main question: Can you successfully train agents through RL, and how well can you do it with respect to compute limits, sample efficiency, and model size?
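For a feel of what "making a number go up" under a fixed sample budget might look like, here is a small policy-gradient sketch (REINFORCE with a running-mean baseline) on a toy bandit, written in plain Python. The function names, the learning rate, the budget, and the reward probabilities are all illustrative assumptions, not a recommended configuration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_reinforce(reward_probs, budget=3000, lr=0.1, seed=0):
    """Train a softmax policy with REINFORCE using at most `budget` samples."""
    rng = random.Random(seed)
    logits = [0.0] * len(reward_probs)
    baseline = 0.0  # running mean reward, used to reduce gradient variance
    history = []    # reward per environment sample, for learning curves

    for t in range(1, budget + 1):
        probs = softmax(logits)
        action = rng.choices(range(len(probs)), weights=probs)[0]
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # d/d logit_k of log pi(action) is 1[k == action] - probs[k].
        advantage = reward - baseline
        for k in range(len(logits)):
            grad = (1.0 if k == action else 0.0) - probs[k]
            logits[k] += lr * advantage * grad

        baseline += (reward - baseline) / t
        history.append(reward)

    return softmax(logits), history


if __name__ == "__main__":
    probs, history = train_reinforce([0.1, 0.5, 0.9])
    print("final policy:", [round(p, 3) for p in probs])
    print("mean reward, first 200 samples:", sum(history[:200]) / 200)
    print("mean reward, last 1000 samples:", sum(history[-1000:]) / 1000)
```

Reporting reward against the number of environment samples used, rather than the final score alone, is one simple way to make sample-efficiency comparisons between submissions meaningful.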

Ideas to Push:

IMPORTANT NOTE: Although these three tracks are the primary focus of the hackathon, participants are strongly encouraged to pursue any compelling or creative RL-related ideas they’d like to explore. If you have an exciting direction that doesn’t fit neatly into a track but pushes the boundaries of what’s possible in RL, we want to see it!


🧰 The Stack: Recommended Resources

Training Frameworks

Existing Environments