DLAI 2017 - Project Work Group 5: Playing StarCraft II with Reinforcement Learning

This is the project repository for group 5 of the DLAI. This project was developed during the Deep Learning for Artificial Intelligence Course at UPC TelecomBCN, Autumn 2017. This was presented in the DLAI session of 2. The slides are accessible through this link.

As defined on the Blizzard website (the company that develops the game):

StarCraft II: Wings of Liberty is the long-awaited sequel to the original StarCraft, Blizzard Entertainment's critically acclaimed sci-fi real-time strategy (RTS) game. StarCraft II: Wings of Liberty is both a challenging single-player game and a fast-paced multiplayer game. In typical real-time strategy games, players build armies and vie for control of the battlefield. The armies in play can be as small as a single squad of Marines or as large as a full-blown planetary invasion force. As commander, you observe the battlefield from a top-down perspective and issue orders to your units in real time. Strategic thinking is key to success; you need to gather information about your opponents, anticipate their moves, outflank their attacks, and formulate a winning strategy.

It combines fast-paced micro-actions with the need for high-level planning and execution. Over the previous two decades, StarCraft I and II have been pioneering and enduring e-sports, with millions of casual and highly competitive professional players. Defeating top human players therefore becomes a meaningful and measurable long-term objective.

From a reinforcement learning perspective, StarCraft II also offers an unparalleled opportunity to explore many challenging new frontiers:

- It is a multi-agent problem in which several players compete for influence and resources.
- It is also multi-agent at a lower level: each player controls hundreds of units, which need to collaborate to achieve a common goal.
- The map is only partially observed via a local camera, which must be actively moved in order for the player to integrate information.

PySC2 is DeepMind's Python component of the StarCraft II Learning Environment (SC2LE). It exposes Blizzard Entertainment's StarCraft II Machine Learning API as a Python reinforcement learning (RL) environment. This is a collaboration between DeepMind and Blizzard to develop StarCraft II into a rich environment for RL research. PySC2 provides an interface for RL agents to interact with StarCraft II, getting observations and rewards and sending actions.

Scheme 1 explains how SC2LE works, combining the StarCraft II API with the Google DeepMind libraries.

Playing the whole game is quite an ambitious goal that is currently only within the reach of scripted agents. However, the StarCraft II learning environment provides several challenges that are well suited to testing the learning capabilities of an intelligent agent. It is our intention to develop an intelligent Deep RL agent that can perform successfully on several mini-games with bounded objectives. Moreover, we want to experiment with the reward system to see how several changes may influence the behaviour of the agent. That is why we define our objectives as:

- Training and evaluating several RL agents.
- Dealing with and trying to improve the reward system (reward "hacking").

Before starting to train the SC2 agents, we went through a series of tutorials that implement in TensorFlow the different RL algorithms applied to the OpenAI Gym environment. The series goes through the following topics:

- Q-Learning with Tables and Neural Networks.
- Visualizing an Agent's Thoughts and Actions.
- Partial Observability and Deep Recurrent Q-Networks.
- Action-Selection Strategies for Exploration.

The algorithm of choice for the most successful implementations of Reinforcement Learning agents for StarCraft II seems to be A3C. We have worked on top of two implementations of A3C: one by Xiaowei Hu and another by Lim Swee Kiat, which is in turn based on Juliani's tutorials on Reinforcement Learning with TensorFlow.

A3C is short for Asynchronous Advantage Actor Critic and belongs to the family of the so-called Actor-Critic (from now on, just AC) algorithms inside Reinforcement Learning.

AC algorithms maintain and update a stochastic policy, that is, a map from the current state to a probability distribution over the available actions.
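To make the idea of a stochastic policy concrete (a map from the current state to a probability distribution over the available actions), here is a minimal sketch in plain Python. It is not taken from the A3C implementations mentioned in this project: the linear scoring function, the `toy_policy` and `sample_action` names, and the example weights are all assumptions made for illustration. A real agent would use a neural network in place of the weight table.

```python
import math
import random


def softmax(logits):
    """Turn raw action scores into a probability distribution."""
    highest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_policy(state, weights):
    """Hypothetical linear policy: one weight row per available action."""
    logits = [sum(w * s for w, s in zip(row, state)) for row in weights]
    return softmax(logits)


def sample_action(probs, rng=random):
    """Draw an action index according to the policy's probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Illustrative values only: a 2-feature state and 3 available actions.
state = [1.0, 0.5]
weights = [[0.2, -0.1], [0.0, 0.3], [-0.4, 0.1]]

probs = toy_policy(state, weights)
action = sample_action(probs, random.Random(0))
print("action probabilities:", probs)
print("sampled action:", action)
```

Because the action is sampled rather than chosen greedily, the agent keeps exploring: actions with lower probability are still selected from time to time.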