Blog (HTML) – Yuki Minai

Series · Pokémon Red RL

3 parts

Reinforcement learning · Tutorial

Training RED with RL to evolve Wartortle — part 3

Introducing a distance-based healing reward and Q-learning with longer look-ahead to complete the full training loop.

Oct 2024 · 4 min read

Reinforcement learning · Tutorial

Training RED with RL to evolve Wartortle — part 2

Adjusting the reward function so RED learns to move between tiles rather than staying frozen in one spot.

Oct 2024 · 4 min read

Reinforcement learning · Tutorial

Training RED with RL to evolve Wartortle — part 1

Building the Pokémon Red gym environment and teaching RED to stay in the grass to encounter wild Pokémon.

Oct 2024 · 5 min read

Series · Gymnasium environments

2 parts

Reinforcement learning · Tutorial

Creating a custom Gymnasium environment — part 2

Building a fully custom Pokémon Red environment from scratch using PyBoy and the Gymnasium framework.

Mar 2024 · 5 min read

Reinforcement learning · Tutorial

Creating a custom Gymnasium environment — part 1

How to edit an existing Gymnasium environment and set up the foundation for a custom RL testbed.

Mar 2024 · 5 min read

Standalone · RL concepts

1 post

Reinforcement learning

A taxonomy of RL algorithms: DQN, actor-critic, PPO

A structured map of model-free RL algorithms and how they relate to each other — useful reference for practitioners.

Aug 2024 · 6 min read

Series · Finite MDPs

3 parts

Reinforcement learning · Tutorial

Solving MDPs — part 3: TD learning

Temporal difference learning — learning value functions mid-episode without waiting for a terminal state.

Nov 2023 · 7 min read

Reinforcement learning · Tutorial

Solving MDPs — part 2: Monte Carlo methods

Learning without a model — estimating value functions from sampled episode returns.

Nov 2023 · 7 min read

Reinforcement learning · Tutorial

Solving MDPs — part 1: dynamic programming

The Bellman equation in practice — using iterative policy evaluation and value iteration on FrozenLake.

Nov 2023 · 7 min read