The intended command line usage is through the gps_main.py script:
cd /path/to/gps
python python/gps/gps_main.py -h
usage: gps_main.py [-h] [-n] [-t] [-r N] [-p N] experiment

Run the Guided Policy Search algorithm.

positional arguments:
  experiment         experiment name

optional arguments:
  -h, --help         show this help message and exit
  -n, --new          create new experiment
  -t, --targetsetup  run target setup
  -r N, --resume N   resume training from iter N
  -p N, --policy N   take N policy samples (for BADMM only)
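The option list above maps naturally onto Python's standard argparse module. The following is a minimal sketch of a parser that accepts the same flags; it is an illustration built from the help text, not the actual contents of gps_main.py, and the function name build_parser is hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options shown in the help text.

    Illustrative only: the real gps_main.py may define its options differently.
    """
    parser = argparse.ArgumentParser(
        prog='gps_main.py',
        description='Run the Guided Policy Search algorithm.')
    parser.add_argument('experiment', type=str, help='experiment name')
    parser.add_argument('-n', '--new', action='store_true',
                        help='create new experiment')
    parser.add_argument('-t', '--targetsetup', action='store_true',
                        help='run target setup')
    parser.add_argument('-r', '--resume', metavar='N', type=int, default=None,
                        help='resume training from iter N')
    parser.add_argument('-p', '--policy', metavar='N', type=int, default=None,
                        help='take N policy samples (for BADMM only)')
    return parser


# Parse an example command line: resume "my_experiment" from iteration 5.
args = build_parser().parse_args(['my_experiment', '-r', '5'])
print(args.experiment, args.resume)  # my_experiment 5
```

Flags that were not passed keep their defaults (False for -n and -t, None for -r and -p), which makes it easy to tell which mode was requested.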
Usage:
python python/gps/gps_main.py <EXPERIMENT_NAME> -n
Creates a new experiment folder at experiments/<EXPERIMENT_NAME> with a hyperparams file hyperparams.py (and possibly a targets file targets.npz) copied from the most recently created experiment. Edit hyperparams.py to specify the new experiment.
python python/gps/gps_main.py <EXPERIMENT_NAME> -t
(ROS only) Opens the Target Setup GUI, which is used to set targets when running on ROS. See the Target Setup GUI section for details.
python python/gps/gps_main.py <EXPERIMENT_NAME>
Opens the GPS Training GUI and runs the guided policy search algorithm using the hyperparams of the specified experiment. See the GPS Training GUI section for details.
python python/gps/gps_main.py <EXPERIMENT_NAME> -r N
Resumes the guided policy search algorithm, loading the algorithm state from iteration N. (The file experiments/<EXPERIMENT_NAME>/data_files/algorithm_itr_<N>.pkl must exist.)
python python/gps/gps_main.py <EXPERIMENT_NAME> -p N
Takes N policy samples from the most recent algorithm state, to test how the policy behaves. Here N is the number of samples, not an iteration number. (At least one saved algorithm state, experiments/<EXPERIMENT_NAME>/data_files/algorithm_itr_<i>.pkl for some iteration i, must exist.)
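Both the -r and -p options depend on saved algorithm state files being present in data_files/. The sketch below shows one way to check which iterations are available before running either command. It assumes only the algorithm_itr_<N>.pkl naming described above; the helper names (saved_iterations, can_resume, latest_iteration) are hypothetical and not part of the GPS code, and the pattern accepts any run of digits because the text does not say whether N is zero-padded.

```python
import os
import re
import tempfile

# Filename pattern taken from the text: algorithm_itr_<N>.pkl.
# Any run of digits is accepted, so zero-padded names also match (assumption).
ALGORITHM_FILE = re.compile(r'^algorithm_itr_(\d+)\.pkl$')


def saved_iterations(data_files_dir):
    """Return the sorted iteration numbers that have a saved algorithm state."""
    iterations = []
    for name in os.listdir(data_files_dir):
        match = ALGORITHM_FILE.match(name)
        if match:
            iterations.append(int(match.group(1)))
    return sorted(iterations)


def can_resume(data_files_dir, itr):
    """True if a state file exists for iteration itr (what -r N requires)."""
    return itr in saved_iterations(data_files_dir)


def latest_iteration(data_files_dir):
    """Return the highest saved iteration, or None if nothing is saved yet."""
    iterations = saved_iterations(data_files_dir)
    return iterations[-1] if iterations else None


# Demonstration on a temporary folder standing in for data_files/.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('algorithm_itr_00.pkl', 'algorithm_itr_03.pkl',
                 'traj_samples_itr_03.pkl'):
        open(os.path.join(tmp, name), 'w').close()
    print(saved_iterations(tmp))   # [0, 3]
    print(can_resume(tmp, 2))      # False
    print(latest_iteration(tmp))   # 3
```

In practice, data_files_dir would be experiments/<EXPERIMENT_NAME>/data_files.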
For your reference, your experiments folder contains the following:
data_files/ - holds the data files.
data_files/algorithm_itr_<N>.pkl - the algorithm state at iteration N.
data_files/traj_samples_itr_<N>.pkl - the trajectory samples collected at iteration N.
data_files/pol_samples_itr_<N>.pkl - the policy samples collected at iteration N.
data_files/figure_itr_<N>.png - an image of the GPS Training GUI figure at iteration N.
hyperparams.py - the hyperparams used for this experiment. For more details, see this page.
log.txt - the log text of output from the Target Setup GUI and the GPS Training GUI.
targets.npz - the initial and target state used for this experiment (ROS agent only – set for other agents in hyperparams.py)

To shorten gps commands, we suggest that you create an alias in your bashrc:
echo "alias gps='python /path/to/gps/python/gps/gps_main.py'" >> ~/.bashrc
cd /path/to/gps # (only required for MuJoCo experiments)
gps -h # help
gps <EXPERIMENT_NAME> -n # new experiment
gps <EXPERIMENT_NAME> -t # target setup
gps <EXPERIMENT_NAME> # run training
gps <EXPERIMENT_NAME> -r N # resume training
gps <EXPERIMENT_NAME> -p N # test policy