ROS + DQN + RatSLAM

Trains a Deep Q-Learning neural network, based on the ConvNetJS demo, to use sonar range sensors and RatSLAM goals.

Status

I got busy and distracted. It works well enough for direct goal seeking, and that may be enough to train up an agent which makes pretty maps in RatSLAM (as long as it does not stray too far before turning back). I have some decent experiments with ReinforceJS. Finding goals on the other side of walls and traps will require a different implementation, namely Actor-critic and/or Actor-mimic style architectures, to get around these obstacles (for example, when a goal can be seen on the other side of a trap).

Setup

npm install
bower install

TODO

  • Teleop.
  • Integrate IMU/tilt/odom feedback.
  • Catkin-ise.
  • Define custom ROS messages.
  • LTM/STM with long-term sets of "important" experiences.
  • Save/load DQN experience sets.

Usage

roslaunch kulbu_base sim.launch world:=rat1
roslaunch kulbu_slam rat.launch use_rat_odom:=false topic_odom:=/kulbu/odometry/filtered
rosrun turtlebot_teleop turtlebot_teleop_key /turtlebot_teleop/cmd_vel:=/kulbu/diff_drive_controller/cmd_vel

roslaunch rosbridge_server rosbridge_websocket.launch # ROSLibJS
node src/main.js
node src/main.js --noise # Generate noise on extra sensors.
node src/ratsim.js # Simulate RatSLAM goals for training.
rqt_plot /dqn/reward:epsilon
rqt_plot /dqn/avg_reward:avg_loss

rostopic pub -1 /dqn/status std_msgs/String -- '"{\"learning\": true, \"moving\": true, \"sensors\": false}"' # TODO: Custom message format.
rostopic pub -1 /dqn/save std_msgs/String -- 'file'   # Save DQN as JSON.
rostopic pub -1 /dqn/load std_msgs/String -- 'file'   # Load DQN from JSON.
rostopic pub -1 /dqn/set_age std_msgs/String -- '"100000"' # FIXME: Datatype.
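
The `/dqn/status` and `/dqn/set_age` commands above carry their payloads as strings inside a `std_msgs/String` message (JSON flags in one case, a number in the other). Below is a minimal sketch of how such payloads could be encoded and decoded on the Node side. The helper names `encodeStatus`, `decodeStatus` and `parseAge` are hypothetical and are not taken from `src/main.js`; only the topic payload shapes come from the commands above.

```javascript
// Hypothetical helpers; names are illustrative, not taken from src/main.js.

// Wrap a status object in the shape of a std_msgs/String message ({ data: "..." }).
function encodeStatus(status) {
  return { data: JSON.stringify(status) };
}

// Parse a std_msgs/String message back into a status object,
// keeping only boolean values for the three flags used above.
function decodeStatus(msg) {
  const parsed = JSON.parse(msg.data);
  const status = {};
  for (const key of ['learning', 'moving', 'sensors']) {
    if (typeof parsed[key] === 'boolean') status[key] = parsed[key];
  }
  return status;
}

// Convert the string payload sent to /dqn/set_age into a non-negative integer.
function parseAge(msg) {
  const age = Number.parseInt(msg.data, 10);
  if (!Number.isInteger(age) || age < 0) {
    throw new Error('Invalid age: ' + msg.data);
  }
  return age;
}

const msg = encodeStatus({ learning: true, moving: true, sensors: false });
console.log(msg.data);
console.log(decodeStatus(msg));
console.log(parseAge({ data: '100000' }));
```

Validating the payload at the receiving end is one way to live with the string datatype until the custom ROS messages listed under TODO are defined.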

Future work

RatSLAM

  • Reverse goal order and tweak for use on exploration tasks.
  • Discard experiences with many links.
  • Quality metric for LV. Don't link low quality experiences.
  • Reject closures with vastly different magnetic reading?
  • Implement multi Experience Maps RatSLAM on Humanoids

DQN