MAZR_RL_TEST.mp4
This repository contains the code and models for training an AI model to play a retro maze game. The AI model is trained using reinforcement learning techniques to navigate through the maze and reach the goal while avoiding obstacles.
The Retro Maze Game AI project aims to develop an intelligent agent capable of autonomously navigating a maze-like environment. The AI model is trained using the Proximal Policy Optimization (PPO) algorithm implemented in the Stable Baselines3 library. The trained model demonstrates the ability to learn and improve its performance over time.
- Reinforcement Learning: The AI model is trained using the PPO algorithm, which is a popular on-policy reinforcement learning method.
- Maze Environment: The project includes a custom maze environment designed specifically for this game. The environment provides a challenging and dynamic setting for training the AI model.
- Training Code: The repository provides code for training the AI model. It includes the necessary setup, hyperparameters, and training loop to train the model from scratch.
- Testing Code: The repository also includes code for testing the trained model. You can evaluate the model's performance on unseen maze scenarios and observe its decision-making capabilities.
- Model Checkpoints: We have included model checkpoints at different training steps, allowing you to explore and use models trained at various stages of the training process.
- Training Logs: The logs folder contains detailed training logs, documenting the training progress, rewards, and other metrics. These logs provide insights into the model's learning process and performance over time.
To get started with the Retro Maze Game AI project, follow these steps:
- Clone the repository:
git clone [https://github.com/your-username/retro-maze-game-ai.git](https://github.com/Nirab123456/Maze_RL_Research.git)
- Install the required dependencies:
pip install -r requirements.txt
- Train the AI model:
maze.ipynb
- Test the trained model:
trst.ipynb
Please refer to the documentation for more detailed instructions on running the training and testing code.
The trained AI model demonstrates remarkable performance in navigating through the maze environment. Despite a decrease in performance initially, the model shows significant improvement after reaching the 50,000,000 training step milestone. It develops its own strategies to overcome obstacles and reach the goal efficiently.
We provide the trained models at different training steps, allowing you to explore the model's progression and witness its learning process firsthand. Additionally, the training logs provide a comprehensive view of the model's performance, rewards, and training progress.
Contributions to the Retro Maze Game AI project are welcome! If you have any ideas, bug fixes, or improvements, please feel free to open an issue or submit a pull request. We appreciate any contributions that can enhance the project and make it more robust.
This project is intended for research purposes only. You are allowed to use and modify the code, models, and data for academic or personal research. However, please note that this project may include third-party libraries or dependencies that have their own licenses. Make sure to comply with the licenses of those components as well.
You are not granted permission to use this project, in whole or in part, for commercial purposes or to distribute it without proper authorization. The authors and contributors of this project shall not be held liable for any misuse or damages resulting from the use of this project.
If you have any questions regarding the usage or licensing of this project, please reach out to the project maintainers for clarification.
We would like to acknowledge the developers and contributors of the Stable Baselines3 library, which served as the foundation for implementing the reinforcement learning algorithms in this project. Their work has been instrumental in making this project possible.
- Stable Baselines3: https://github.com/DLR-RM/stable-baselines3
- PPO Algorithm: https://arxiv.org/abs/1707.06347