# pytorch-maddpg

**Repository Path**: wpc94/pytorch-maddpg

## Basic Information

- **Project Name**: pytorch-maddpg
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2024-04-30
- **Last Updated**: 2025-03-31

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

#+TITLE: An implementation of MADDPG
#+AUTHOR: xuehy
#+EMAIL: hyxue@outlook.com
#+STARTUP: content

* 1. Introduction

This is a pytorch implementation of [[https://arxiv.org/abs/1706.02275][multi-agent deep deterministic policy gradient algorithm]].

The experimental environment is a modified version of Waterworld based on [[https://github.com/sisl/MADRL][MADRL]]. 

* 2. Environment

The main features (different from MADRL) of the modified Waterworld environment are:

- evaders and poisons now bounce at the wall obeying physical rules
- sizes of the evaders, pursuers and poisons are now the same so that random actions will lead to average rewards around 0.
- need exactly n_coop agents to catch food.

* 3. Dependency

- [[https://github.com/pytorch/pytorch][pytorch]]
- [[https://github.com/facebookresearch/visdom][visdom]]
- =python==3.6.1= (recommend using the anaconda/miniconda)
- if you need to render the environments, =opencv= is required

* 4. Install

- Install [[https://github.com/sisl/MADRL][MADRL]].
- Replace the =madrl_environments/pursuit= directory with the one in this repo.
- =python main.py=

if scene rendering is enabled, recommend to install =opencv= through [[https://github.com/conda-forge/opencv-feedstock][conda-forge]].

* 5. Results

** two agents, cooperation = 2
The two agents need to cooperate to achieve the food for reward 10.

[[PNG/demo.gif]]

[[PNG/3.png]]

the average

[[PNG/4.png]]

** one agent, cooperation = 1

[[PNG/newplot.png]]


* 6. TODO

- reproduce the experiments in the paper with competitive environments.