The next Shocklab Seminar will be in person (online component available) on Wednesday, 23 July 2025, at 16:00.

Batsi Ziki will be presenting: "Meta-Learning the Intrinsic Reward Weighting in Curiosity-Driven RL". Swing by if you can; it’s sure to be an engaging talk.

Title: Meta-Learning the Intrinsic Reward Weighting in Curiosity-Driven RL
Speaker: Batsi Ziki
Date: Wednesday, 23 July 2025
Venue: M209 Mathematics Building, University of Cape Town
Time: 16:00-17:00 (GMT +2)
Zoom Meeting Link: https://uct-za.zoom.us/j/92750361177?pwd=QzNiRzBJRjRITVlwa2k5SVNkVmx5UT09

Abstract: Reinforcement learning agents must find a balance between exploitation and exploration to maximise the cumulative sum of extrinsic rewards. However, it is particularly challenging for agents to explore effectively in sparse-reward environments where feedback is rare. Curiosity-driven algorithms can be used to encourage effective exploration. These algorithms generate an additional reward called the intrinsic reward that encourages agents to seek novel situations. The intrinsic rewards are combined with extrinsic rewards using a weighted sum, where λ is the weighting for the intrinsic reward. However, λ is often fine-tuned for each new environment, even when environments are similar. We propose a meta-learning approach that replaces the fixed λ parameter with a neural network that outputs λ values at each time step. This network is trained using evolutionary strategies and can generalise across similar environments without retraining. Our approach highlights the potential for reducing the need for fine-tuning λ across similar tasks.
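
For readers who would like a concrete picture before the talk, the idea in the abstract can be sketched roughly as follows. This is an illustrative sketch only and not the speaker's implementation: the network shape, the sigmoid that bounds λ to (0, 1), the function names (lambda_net, combined_reward, es_step) and the basic evolution-strategies update are all assumptions made for this example. In a real setup, fitness_fn would presumably measure the extrinsic return of an agent trained with the shaped reward.

```python
import numpy as np


def lambda_net(theta, obs, obs_dim, hidden):
    """Hypothetical one-hidden-layer network mapping an observation to lambda.

    theta is a flat parameter vector of size obs_dim*hidden + 2*hidden + 1.
    The sigmoid output keeps lambda in (0, 1); that bound is an assumption.
    """
    i = 0
    w1 = theta[i:i + obs_dim * hidden].reshape(obs_dim, hidden)
    i += obs_dim * hidden
    b1 = theta[i:i + hidden]
    i += hidden
    w2 = theta[i:i + hidden]
    i += hidden
    b2 = theta[i]
    h = np.tanh(obs @ w1 + b1)
    logit = h @ w2 + b2
    return 1.0 / (1.0 + np.exp(-logit))


def combined_reward(r_ext, r_int, lam):
    """Weighted sum from the abstract: extrinsic reward plus lambda times intrinsic."""
    return r_ext + lam * r_int


def es_step(theta, fitness_fn, rng, pop_size=16, sigma=0.1, lr=0.05):
    """One step of a basic evolution-strategies update (assumed variant).

    Samples Gaussian perturbations of theta, scores each with fitness_fn,
    and moves theta along the fitness-weighted average of the perturbations.
    """
    noise = rng.standard_normal((pop_size, theta.size))
    fitness = np.array([fitness_fn(theta + sigma * eps) for eps in noise])
    # Standardise fitness so the step size does not depend on reward scale.
    adv = (fitness - fitness.mean()) / (fitness.std() + 1e-8)
    grad_estimate = noise.T @ adv / (pop_size * sigma)
    return theta + lr * grad_estimate
```

At each time step the agent would call lambda_net on the current observation and pass the result to combined_reward, while an outer loop repeatedly calls es_step to adjust the network parameters.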

Bio: Batsi is a Master's student at the University of Cape Town with interests in curiosity-driven reinforcement learning and meta-reinforcement learning. His research focuses on improving the sample efficiency of reinforcement learning algorithms.

See you there!