Generative Gemini - Utilizing Generational Gap for Exploration in Reinforcement Learning

Type: COMP 579 (Reinforcement Learning) Project

Year: 2023

Teammates: Krishna Maneesha Dendukuri, Hena Ghonia

Key Skills: Python; Deep Reinforcement Learning; Machine Learning Pipeline; Machine Learning Libraries

Additional Information: Course Website

Objectives:

Develop a framework that promotes mindful, intentional exploration and adapts to different environments.

Summary:
One of the greatest challenges in reinforcement learning is insufficient exploration, especially in dynamic and sparse-reward environments. Borrowing ideas from contrastive learning and generative adversarial networks, the proposed approach uses two different generators to influence the decisions of the policy network. The modified decisions are intended to let the agent explore novel regions of the state space, ultimately leading to a better-performing policy. The preliminary results suggest that additional work is required.

PyTorch was used as the primary library for building the codebase.
The Minigrid environment was selected as the test environment.
The DQN model was implemented as the baseline algorithm.
My primary focus was on developing the new algorithm.
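
For readers unfamiliar with the DQN baseline mentioned above, the following is a minimal, framework-free sketch of its two core pieces: epsilon-greedy action selection and the one-step temporal-difference target. It is an illustration under assumptions, not the project's code: the project used PyTorch, and the function names, the discount factor `gamma=0.99`, and the plain-list Q-values here are all hypothetical choices made for readability.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy action.

    q_values is a plain list of estimated action values (an assumption made
    for illustration; a real DQN would read these from a neural network).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step DQN target: r + gamma * max_a' Q(s', a'), or r if terminal."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def td_loss(q_sa, target):
    """Squared TD error for a single transition."""
    return (q_sa - target) ** 2


# Example transition with made-up numbers.
q = [0.1, 0.5, 0.2]
greedy_action = epsilon_greedy(q, epsilon=0.0)          # always greedy
target = td_target(1.0, [0.0, 2.0], done=False, gamma=0.5)
loss = td_loss(1.5, target)
```

With `epsilon=0.0` the agent never explores, which is the failure mode that sparse-reward environments such as Minigrid tend to expose; any exploration method, including the one proposed here, would change how the action is chosen before the TD update is applied.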

Figure (gemini_algorithm): Proposed Algorithm Structure