TorchCraft: a Library for Machine Learning Research on Real-Time Strategy Games

Gabriel Synnaeve, Nantas Nardelli, Alex Auvolat, Soumith Chintala,
Timothée Lacroix, Zeming Lin, Florian Richoux, Nicolas Usunier
gab@fb.com, nantas@robots.ox.ac.uk
November 7, 2016
Abstract
We present TorchCraft, a library that enables deep learning research on Real-Time
Strategy (RTS) games such as StarCraft: Brood War, by making it easier to control these
games from a machine learning framework, here Torch [9]. This white paper argues for
using RTS games as a benchmark for AI research, and describes the design and components
of TorchCraft.
1 Introduction
Deep Learning techniques [13] have recently enabled researchers to successfully tackle low-level
perception problems in a supervised learning fashion. In the field of Reinforcement Learning this
has translated into the ability to develop agents able to learn to act in high-dimensional input
spaces. In particular, deep neural networks have been used to help reinforcement learning scale
to environments with visual inputs, allowing agents to learn policies in testbeds that previously
were completely intractable. For instance, algorithms such as Deep Q-Network (DQN) [14]
have been shown to reach human-level performance on most of the classic ATARI 2600 games
by learning a controller directly from raw pixels, and without any additional supervision besides
the score. Most of the work spawned in this new area has however tackled environments where
the state is fully observable, the reward function has no or low delay, and the action set is
relatively small. To solve the great majority of real life problems agents must instead be able to
handle partial observability, structured and complex dynamics, and noisy and high-dimensional
control interfaces.
To provide the community with useful research environments, work was done towards
building platforms based on videogames such as Torcs [27], Mario AI [20], Unreal’s BotPrize
[10], the Arcade Learning Environment [3], VizDoom [12], and Minecraft [11], all of which have
allowed researchers to train deep learning models with imitation learning, reinforcement learning
and various decision making algorithms on increasingly difficult problems. Recently there have
also been efforts to unite those and many other such environments in one platform to provide
a standard interface for interacting with them [4]. We propose a bridge between StarCraft:
Brood War, an RTS game with an active AI research community and annual AI competitions
[16, 6, 1], and Lua, with examples in Torch [9] (a machine learning library).
arXiv:1611.00625v2 [cs.LG] 3 Nov 2016
2 Real-Time Strategy for Games AI
Real-time strategy (RTS) games have historically been a domain of interest of the planning and
decision making research communities [5, 2, 6, 16, 17]. This type of game aims to simulate
the control of multiple units in a military setting at different scales and levels of complexity,
usually on a fixed-size 2D map, in duels or in small teams. The goal of the player is to collect
resources which can be used to expand their control over the map, create buildings and units
to fight off enemy deployments, and ultimately destroy the opponents. These games exhibit
durative moves (with complex game dynamics) with simultaneous actions (all players can give
commands to any of their units at any time), and very often partial observability (a “fog of
war”: opponent units not in the vicinity of a player’s units are not shown).
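The fog-of-war rule above can be illustrated with a minimal sketch. It assumes circular sight ranges and hypothetical unit fields (`x`, `y`, `sight`); these names are chosen for illustration and are not TorchCraft's state format.

```lua
-- Hypothetical sketch: which enemy units are visible under fog of war.
-- Unit fields (x, y, sight) are illustrative assumptions.

-- True if the point (x, y) lies within the sight range of any own unit.
local function is_visible(own_units, x, y)
  for _, u in ipairs(own_units) do
    local dx, dy = x - u.x, y - u.y
    if dx * dx + dy * dy <= u.sight * u.sight then
      return true
    end
  end
  return false
end

-- Returns only the enemy units the player is allowed to observe.
local function observe(own_units, enemy_units)
  local seen = {}
  for _, e in ipairs(enemy_units) do
    if is_visible(own_units, e.x, e.y) then
      seen[#seen + 1] = e
    end
  end
  return seen
end

local own = { { x = 0, y = 0, sight = 5 }, { x = 20, y = 0, sight = 3 } }
local enemies = {
  { name = "near", x = 3, y = 4 },   -- exactly at distance 5 from the first unit
  { name = "far", x = 10, y = 10 },  -- outside both sight ranges
}
local seen = observe(own, enemies)
print(#seen, seen[1].name)
```

An agent acting on `seen` rather than on `enemies` has to reason about units it cannot currently observe, which is what makes the setting partially observable.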
RTS gameplay: The components of RTS gameplay are economy and battles (“macro” and
“micro” respectively): players need to gather resources to build military units and defeat their
opponents. To that end, they often have worker units (or extraction structures) that can gather
resources needed to build workers, buildings, military units and research upgrades. Workers
are often also builders (as in StarCraft), and are weak in fights compared to military units.
Resources may be of varying degrees of abundance and importance. For instance, in StarCraft
minerals are used for everything, whereas gas is only required for advanced buildings or military
units, and technology upgrades. Buildings and research define technology trees (directed acyclic
graphs) and each state of a “tech tree” allows for the production of different unit types and the
training of new unit abilities. Each unit and building has a range of sight that provides the
player with a view of the map. Parts of the map not in the sight range of the player’s units are
under fog of war and the player cannot observe what happens there. A considerable part of the
strategy and the tactics lies in which armies to deploy and where.
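As a rough illustration of a tech tree as a directed acyclic graph, the sketch below maps each item to its prerequisites and lists what a given tech state unlocks. The entries borrow names from StarCraft's Terran faction, but the table is a deliberately simplified assumption and not data exposed by TorchCraft.

```lua
-- Illustrative tech tree: each entry lists the prerequisites of a building or unit.
-- Simplified assumption for the example, not an exhaustive rule set.
local requires = {
  command_center = {},
  barracks = { "command_center" },
  factory = { "barracks" },
  marine = { "barracks" },
  vulture = { "factory" },
}

-- Returns true if every prerequisite of `item` is present in the set `owned`.
local function can_produce(item, owned)
  for _, dep in ipairs(requires[item]) do
    if not owned[dep] then return false end
  end
  return true
end

-- Lists, in sorted order, everything the current tech state allows to be produced.
local function unlocked(owned)
  local out = {}
  for item in pairs(requires) do
    if can_produce(item, owned) then out[#out + 1] = item end
  end
  table.sort(out)
  return out
end

local state = { command_center = true, barracks = true }
print(table.concat(unlocked(state), ", "))
```

Adding a building to `state` changes the result of `unlocked`, which is the sense in which each state of the tech tree allows a different set of productions.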
Military units in RTS games have multiple properties which differ between unit types, such
as: attack range (including melee), damage types, armor, speed, area of effects, invisibility,
flight, and special abilities. Units can have attacks and defenses that counter each other in a
rock-paper-scissors fashion, making planning armies an extremely challenging and strategically
rich process. An “opening” denotes the same thing as in Chess: an early game plan for which
the player has to make choices. That is the case in Chess because one can move only one
piece at a time (each turn), and in RTS games because, during the development phase, one is
economically limited and has to choose which tech paths to pursue. Available resources constrain
the technology advancement and the number of units one can produce. As producing buildings
and units also takes time, the trade-off between investing in the economy, in technological
advancement, and in unit production is the crux of the strategy during the whole game.
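The economic constraint described above can be sketched as a build order executed against a limited bank of resources. The item names and costs below are made-up values for illustration, and build times are ignored for brevity.

```lua
-- Illustrative costs (minerals, gas); the numbers are assumptions for the example.
local cost = {
  worker = { minerals = 50, gas = 0 },
  soldier = { minerals = 50, gas = 0 },
  upgrade = { minerals = 100, gas = 100 },
}

-- Tries to pay for `item`; deducts the cost and returns true if it is affordable.
local function buy(bank, item)
  local c = cost[item]
  if bank.minerals < c.minerals or bank.gas < c.gas then
    return false
  end
  bank.minerals = bank.minerals - c.minerals
  bank.gas = bank.gas - c.gas
  return true
end

-- Executes a build order in sequence and stops at the first unaffordable item.
local function run_build_order(bank, order)
  local built = {}
  for _, item in ipairs(order) do
    if not buy(bank, item) then break end
    built[#built + 1] = item
  end
  return built
end

local bank = { minerals = 200, gas = 100 }
local built = run_build_order(bank, { "worker", "upgrade", "soldier", "soldier" })
print(#built, bank.minerals, bank.gas)
```

Here the early investment in a worker and an upgrade leaves resources for only one of the two soldiers, a small instance of the choice between economy, technology, and army.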
Related work: Classical AI approaches, which normally involve planning and search [2, 15,
24, 7], are extremely challenged by the combinatorial action space and the complex dynamics
of RTS games, making simulation (and thus Monte Carlo tree search) difficult [8, 22]. Other
characteristics such as partial observability, the non-obvious quantification of the value of the
state, and the problem of featurizing a dynamic and structured state contribute to making them
an interesting problem and, taken together, an excellent benchmark for
AI. As the scope of this paper is not to give a review of RTS AI research, we refer the reader to
these surveys about existing research on RTS and StarCraft AI [16, 17].
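To give a sense of scale for the combinatorial action space mentioned above, consider a deliberately simplified count. It assumes that a player controls n units and that each unit independently picks one of k commands per decision step; real games have varying n and k, so the numbers are only indicative.

```latex
% Joint actions per decision step, assuming n units with k commands each:
\[
  |\mathcal{A}| = k^{n},
  \qquad n = 10,\; k = 5 \;\Rightarrow\; |\mathcal{A}| = 5^{10} = 9{,}765{,}625 .
\]
% Leaves of a search tree of depth T over one player's joint actions:
\[
  |\mathcal{A}|^{T} = k^{nT},
  \qquad T = 3 \;\Rightarrow\; 5^{30} \approx 9.3 \times 10^{20} .
\]
```

Even with these small values, exhaustive search over joint actions is out of reach after a few steps.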
It is currently tedious to do machine learning research in this domain. Most previous
reinforcement learning research involves simple models or limited experimental settings [26, 23].
Other models are trained on offline datasets of highly skilled players [25, 18, 19, 21]. Contrary
to most Atari games [3], RTS games have much larger action spaces and much more structured
states. Thus, we advocate having not only the pixels as input and keyboard/mouse
for commands, as in [3, 4, 12], but also a structured representation of the game state, as in
[11]. This makes it easier to try a broad variety of models, and may be useful in shaping loss
functions for pixel-based models.

Server (runs in the game engine):

    -- main game engine loop:
    while true do
        game.receive_player_actions()
        game.compute_dynamics()
        -- our injected code:
        torchcraft.send_state()
        torchcraft.receive_actions()
    end

Client (machine learning library or framework):

    featurize, model = init()
    tc = require 'torchcraft'
    tc:connect(port)
    while not tc.state.game_ended do
        tc:receive()
        features = featurize(tc.state)
        actions = model:forward(features)
        tc:send(tc:tocommand(actions))
    end

Figure 1: Simplified client/server code that runs in the game engine (server, first listing) and
in the machine learning library or framework (client, second listing).
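One possible shape for the `featurize` step of Figure 1 is sketched below: a structured state, given as a list of units with named fields, is flattened into a fixed-length numeric vector. The field names, the normalization, and the zero padding are assumptions made for the example and do not reflect TorchCraft's actual state schema.

```lua
-- Hypothetical structured state: a list of units with named fields.
local state = {
  units = {
    { x = 10, y = 20, health = 40, max_health = 40, is_enemy = false },
    { x = 30, y = 25, health = 15, max_health = 60, is_enemy = true },
  },
}

-- Flattens up to `max_units` units into a fixed-length numeric vector,
-- padding with zeros so that a model always receives the same input size.
local function featurize(state, max_units, map_w, map_h)
  local f = {}
  for i = 1, max_units do
    local u = state.units[i]
    if u then
      f[#f + 1] = u.x / map_w                -- normalized horizontal position
      f[#f + 1] = u.y / map_h                -- normalized vertical position
      f[#f + 1] = u.health / u.max_health    -- remaining health fraction
      f[#f + 1] = u.is_enemy and 1 or 0      -- ownership flag
    else
      for _ = 1, 4 do f[#f + 1] = 0 end      -- padding for an empty slot
    end
  end
  return f
end

local features = featurize(state, 3, 100, 100)
print(#features)
```

Because the state is structured, the same unit list could just as easily be turned into a spatial grid or a graph, which is harder to do when only pixels are available.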
Finally, StarCraft: Brood War is a highly popular game (more than 9.5 million copies sold)
with professional players, which provides interesting datasets, human feedback, and a good
benchmark of what is possible to achieve within the game. There also exists an active academic
community that organizes AI competitions.
3 Design
The simple design of TorchCraft is applicable to any video game and any machine learning
library or framework. Our current implementation connects Torch to a low level interface [1]
to StarCraft: Brood War. TorchCraft’s approach is to dynamically inject a piece of code in
the game engine that will be a server. This server sends the state of the game to a client (our
machine learning code), and receives commands to send to the game. This is illustrated in
Figure 1. The two modules are entirely synchronous, but we provide two modalities of
execution based on how we interact with the game:
Game-controlled - we inject a DLL that provides the game interface to the bots, and one that
includes all the instructions to communicate with the machine learning client, interpreted
by the game as a player (or bot AI). In this mode, the server starts at the beginning of the
match and shuts down when the match ends. Between matches it is therefore necessary to
re-establish the connection with the client; however, this makes it extremely easy to set up
multiple learning instances.
Game-attached - we inject a DLL that provides the game interface to the bots, and we
interact with it by attaching to the game process and communicating via pipes. In this
mode there is no need to re-establish the connection with the game every time, and the
control of the game is completely automated out of the box; however, it is currently
impossible to create multiple learning instances on the same guest OS.
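The synchronous exchange and the per-match reconnection of the game-controlled mode can be mimicked with a small mock. Here a Lua coroutine stands in for the injected server; it is not the real TorchCraft server, and `make_server`, `play_match`, and `policy` are hypothetical names introduced for the sketch.

```lua
-- Mock server: yields one state per frame and waits for the client's action.
-- A coroutine stands in for the code injected into the game engine.
local function make_server(n_frames, log)
  return coroutine.create(function()
    for frame = 1, n_frames do
      -- "send" the state, then block until the client replies with an action
      local action = coroutine.yield({ frame = frame, game_ended = false })
      log[#log + 1] = frame .. ":" .. action
    end
    return { frame = n_frames, game_ended = true }
  end)
end

-- Client: a new server (connection) is created for every match, as in the
-- game-controlled mode where the server lives only for one match.
local function play_match(n_frames, policy)
  local log = {}
  local server = make_server(n_frames, log)
  local _, state = coroutine.resume(server)        -- receive the first state
  while not state.game_ended do
    local action = policy(state)
    _, state = coroutine.resume(server, action)    -- send action, receive next state
  end
  return log
end

-- Toy policy: alternate between two commands depending on the frame number.
local function policy(state)
  return state.frame % 2 == 0 and "attack" or "move"
end

for match = 1, 2 do
  local log = play_match(3, policy)
  print(match, table.concat(log, " "))
end
```

Each side runs only while the other waits, which is the lock-step behaviour implied by the two modules being entirely synchronous.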
Whatever mode one chooses to use, TorchCraft is seen by the AI programmer as a library
that provides: connect(), receive() (to get the state), send(commands), and some helper
functions about specifics of StarCraft’s rules and state representation. TorchCraft also provides
an efficient way to store game frame data from past (played or observed) games so that existing
state (“replays”, “traces”) can be re-examined.
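The idea of storing frames for later re-examination can be sketched with a minimal in-memory store. This is only an illustration under simple assumptions (flat state tables, no compression or serialization); the paper does not describe TorchCraft's actual storage format here, and `FrameStore` is a hypothetical name.

```lua
-- Minimal in-memory frame store (illustrative sketch).
local FrameStore = {}
FrameStore.__index = FrameStore

function FrameStore.new()
  return setmetatable({ frames = {} }, FrameStore)
end

-- Copies the (flat) state table so that later mutations do not alter the record.
function FrameStore:append(state)
  local copy = {}
  for k, v in pairs(state) do copy[k] = v end
  self.frames[#self.frames + 1] = copy
end

-- Returns an iterator over the stored frames, in recording order.
function FrameStore:replay()
  local i = 0
  return function()
    i = i + 1
    return self.frames[i]
  end
end

local store = FrameStore.new()
local state = { frame = 0, minerals = 50 }
for t = 1, 3 do
  state.frame = t
  state.minerals = state.minerals + 8
  store:append(state)
end
for f in store:replay() do print(f.frame, f.minerals) end
```

Replaying the stored frames through the same featurization used online is one way such traces could serve as an offline training set.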