Comparing and Evaluating Prioritized Experience Replay Methods in Reinforcement Learning (Thesis-Proposal)

Reinforcement Learning (RL) is a subdomain of machine learning that has developed rapidly in recent years and has become increasingly popular. In reinforcement learning, an agent learns from experience using a scalar reward signal, in contrast to supervised machine learning, where learning is based on examples of labelled data. The agent interacts with an environment by taking actions that influence the environment and thus the observations that form the agent's input.

Experience Replay is a technique that helps agents learn more from a given number of transitions (i.e. experience). While classical incremental online algorithms use only the current transition in a learning step, experience replay stores transitions in a replay memory from which samples can be drawn at random. The idea of prioritized experience replay is that some transitions are more valuable to learn from than others, so they should be sampled with a higher probability. There are several ways to prioritize and multiple implementation choices.

For this thesis you would search the existing literature for such methods, choose some of them (the number depends on the amount of work each one entails), implement them, and compare them with each other and, if time allows, with standard non-prioritized variants of the same underlying algorithms. Criteria will include learning efficiency and computational efficiency, but you should add your own as well.
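
To make the difference between uniform and prioritized replay concrete, here is a minimal Python sketch. It is not a method prescribed by this proposal. It assumes one common choice from the literature, proportional prioritization, in which transition i is sampled with probability p_i^alpha / sum_k p_k^alpha and the priority p_i is the absolute temporal-difference error plus a small constant. The class names, the default values for `alpha`, `beta` and `eps`, and the normalization of the importance-sampling weights by the batch maximum are illustrative assumptions.

```python
import random
from collections import deque


class UniformReplay:
    """Replay memory that samples stored transitions uniformly at random."""

    def __init__(self, capacity):
        self.memory = deque(maxlen=capacity)  # oldest transitions are dropped first

    def add(self, transition):
        self.memory.append(transition)

    def sample(self, batch_size, rng=random):
        # Sampling without replacement; every transition is equally likely.
        return rng.sample(list(self.memory), batch_size)


class ProportionalReplay:
    """Replay memory that samples transition i with probability
    p_i**alpha / sum_k p_k**alpha (assumed prioritization scheme)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity = capacity
        self.alpha = alpha    # 0 gives uniform sampling, 1 fully proportional
        self.eps = eps        # keeps every priority strictly positive
        self.memory = []
        self.priorities = []
        self.next_index = 0   # slot to overwrite once the memory is full

    def add(self, transition):
        # New transitions receive the current maximum priority, which makes
        # it likely that they are replayed before their priority is updated.
        priority = max(self.priorities, default=1.0)
        if len(self.memory) < self.capacity:
            self.memory.append(transition)
            self.priorities.append(priority)
        else:
            self.memory[self.next_index] = transition
            self.priorities[self.next_index] = priority
        self.next_index = (self.next_index + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        n = len(self.memory)
        # Sampling with replacement, proportional to the scaled priorities.
        indices = rng.choices(range(n), weights=scaled, k=batch_size)
        # Importance-sampling weights (n * P(i))**(-beta) reduce the bias
        # caused by non-uniform sampling; normalized here by the batch maximum.
        weights = [(n * scaled[i] / total) ** (-beta) for i in indices]
        max_weight = max(weights)
        weights = [w / max_weight for w in weights]
        batch = [self.memory[i] for i in indices]
        return indices, batch, weights

    def update(self, indices, td_errors):
        # After a learning step, priorities follow the new absolute TD errors.
        for i, error in zip(indices, td_errors):
            self.priorities[i] = abs(error) + self.eps
```

This list-based version recomputes all scaled priorities on every call, so sampling costs time linear in the memory size. Tree-based data structures that bring this down to logarithmic time are one of the implementation choices that a comparison of computational efficiency could cover.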

Planned steps with estimated time:
· Reinforcement Learning foundation (depending on level of knowledge)
· Literature review and understanding (45 days)
· Implementation of chosen algorithms (20 days)
· Evaluation and comparison of implemented algorithms (15 days)
· Writing of the thesis (45 days)

Prerequisites:
· Interest and motivation in machine learning
· Knowledge of and experience with RL are desirable, but not strictly necessary. This will be reflected in the length of the initial learning phase.
· Programming skills (preferably in Python)
· Proficiency in scientific working methods

Start and duration: Anytime, 6 months

Co-supervisor: Felix Grün, M. Sc., Tel.: +49 208 88254 875, felix.gruen@hs-ruhrwest.de
Supervisor: Prof. Dr. Ioannis Iossifidis, Tel.: +49 208 88254806, iossifidis@hs-ruhrwest.de
