ROS + DQN + RatSLAM

Training a Deep Q-Learning neural network, based on the ConvNetJS demo, to use sonar range sensors and RatSLAM goals.

ConvNetJS - demo
ROSLibJS
RatSLAM fork (extended ROS integration)

Status

Got busy and distracted. It works well enough for direct goal seeking, and that may be enough to train up an agent which makes pretty maps in RatSLAM (as long as it does not stray too far before turning back). Have some decent experiments with ReinforceJS. Finding goals on the other side of walls and traps (when a goal can be seen on the other side of a trap) will require a different implementation, namely Actor-critic and/or Actor-mimic style architectures, to get around these obstacles.
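As a rough illustration of what "direct goal seeking" can mean in practice, the sketch below computes the distance and heading-relative bearing from the robot to a goal, plus a simple progress-based reward. Everything here is an assumption for illustration: the function names `goalRelative` and `progressReward`, the `{x, y, theta}` pose shape, and the reward definition are hypothetical and are not taken from this repository's code.

```javascript
// Hypothetical helper (not from this repo): distance and bearing to a goal,
// relative to the robot's current heading.
// pose = {x, y, theta} in the map frame (theta in radians); goal = {x, y}.
function goalRelative(pose, goal) {
  const dx = goal.x - pose.x;
  const dy = goal.y - pose.y;
  const distance = Math.hypot(dx, dy);
  // Angle to the goal in the map frame, minus the robot heading.
  const raw = Math.atan2(dy, dx) - pose.theta;
  // Wrap the result into [-pi, pi] so left/right turns stay symmetric.
  const bearing = Math.atan2(Math.sin(raw), Math.cos(raw));
  return { distance, bearing };
}

// Assumed reward shaping: positive when the last step moved the robot
// closer to the goal, negative when it moved away.
function progressReward(prevDistance, distance) {
  return prevDistance - distance;
}
```

A reward of this kind only measures straight-line progress, which is consistent with the limitation noted above: it gives no useful signal when the direct path to a visible goal is blocked by a wall or trap.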

Setup

npm install
bower install

TODO

Teleop.
Integrate IMU/tilt/odom feedback.
Catkin-ise.
Define custom ROS messages.
LTM/STM with long-term sets of "important" experiences.
Save/load DQN experience sets.

Usage

roslaunch kulbu_base sim.launch world:=rat1
roslaunch kulbu_slam rat.launch use_rat_odom:=false topic_odom:=/kulbu/odometry/filtered
rosrun turtlebot_teleop turtlebot_teleop_key /turtlebot_teleop/cmd_vel:=/kulbu/diff_drive_controller/cmd_vel

roslaunch rosbridge_server rosbridge_websocket.launch # ROSLibJS
node src/main.js
node src/main.js --noise # Generate noise on extra sensors.
node src/ratsim.js # Simulate RatSLAM goals for training.
rqt_plot /dqn/reward:epsilon
rqt_plot /dqn/avg_reward:avg_loss

rostopic pub -1 /dqn/status std_msgs/String -- '"{\"learning\": true, \"moving\": true, \"sensors\": false}"' # TODO: Custom message format.
rostopic pub -1 /dqn/save std_msgs/String -- 'file'   # Save DQN as JSON.
rostopic pub -1 /dqn/load std_msgs/String -- 'file'   # Load DQN from JSON.
rostopic pub -1 /dqn/set_age std_msgs/String -- '"100000"' # FIXME: Datatype.
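Because `/dqn/status` and `/dqn/set_age` currently carry their payloads inside `std_msgs/String`, a subscriber has to decode the `data` field itself. The following is a minimal sketch of that decoding, assuming the status payload is a JSON object of boolean flags and the age is a non-negative integer sent as text. The names `parseStatus`, `parseAge` and `DEFAULT_STATUS` are hypothetical and do not come from `src/main.js`.

```javascript
// Hypothetical defaults, mirroring the flags in the example command above.
const DEFAULT_STATUS = { learning: true, moving: true, sensors: false };

// Decode the JSON string carried in a std_msgs/String `data` field.
// Unknown keys and non-boolean values are ignored; malformed input
// leaves the current flags unchanged.
function parseStatus(data, current = DEFAULT_STATUS) {
  let parsed;
  try {
    parsed = JSON.parse(data);
  } catch (err) {
    return { ...current };
  }
  if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
    return { ...current };
  }
  const next = { ...current };
  for (const key of Object.keys(current)) {
    if (typeof parsed[key] === 'boolean') next[key] = parsed[key];
  }
  return next;
}

// Decode an age sent as text, e.g. "100000". Returns null when the
// string is not a plain non-negative integer.
function parseAge(data) {
  const text = String(data).trim();
  if (!/^\d+$/.test(text)) return null;
  return Number(text);
}
```

In a ROSLibJS subscriber callback these would be called with `message.data`. Custom message types, as listed in the TODO, would make both helpers unnecessary.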

Future work

RatSLAM

Reverse goal order and tweak for use on exploration tasks.
Discard experiences with many links.
Quality metric for LV. Don't link low quality experiences.
Reject closures with vastly different magnetic reading?
Implement multi Experience Maps RatSLAM on Humanoids

DQN

Further test Dropout uncertainty.
Implement in a Caffe fork or Theano, if not Torch.
