The Dutch Railways (NS) have a lot of trains. Most of them are in use during peak hours. However at night and off peak hours they have to deal with surplus of this ‘rolling stock’. Simply leaving them on the main rail network is not an option (because it is used for freight trains for instance), and therefore they are stored in a shunting yard. On these yards also maintenance activities take place.
The Train Unit Shunting Problem is a problem the NS faces that encompasses matching incoming trains to outgoing train services, parking the trains on tracks in such order that they can leave when they need to (and no other train is blocking their path), routing the trains to their right positions and planning service activities like cleaning and maintenance tasks. Since the NS is expanding their fleet, it gets increasingly complex to create plans to fit all trains on available tracks, and find feasible solutions.
An simplified example of a (small) shunting yard is shown below.
Current solution techniques include formulating several elements of the problem in one big linear program, and solving it using CPLEX or row/column generating methods. These techniques tend to fail when the problem instance get close to real-life sizes. A more recent, more successful technique uses local search and simulated annealing to find a feasible solution to the integrated problem. This technique is however largely ‘brute-force’ and does not take experience or knowledge into account w.r.t. specific shunting yard layouts.
Recent advantages in Deep Reinforcement Learning, like the work done on creating a system that can play the game Go by Deepmind, gave rise to the impression that using Deep Reinforcement Learning to solve this planning issue could incorporate more information about a specific problem instance into the solution. Just like human planners, the RL algorithm could potentially build up experience about what good moves are, and which moves lead to infeasible solutions.
The hope is that by training a reinforcement learning algorithm to be able to solve both the original planning and all kinds of versions of the planning with small disruptions, we can arrive at solutions which can handle during-the-day disruptions better.
This master project concerns exploring whether indeed deep reinforcement learning can indeed help in solve the Train Unit Shunting Problem. Tim Salimans, a research scientist at OpenAI argues that since the algorithms for reinforcement learning become more powerful, the developing efforts need to shift to developing sufficiently sophisticated tasks and environments in which the RL agent can learn. This project is an effort to see whether the recent breakthroughs in AI and specifically Deep Reinforcement Learning can be applied to a real world problem, amongst others by developing an environment specifically to this practical problem.
“Even very talented people will not develop a great intellect if they are not exposed to the right environment” - Tim Salimans (OpenAI)