Learning Dexterity End-to-End


How the OpenAI Robotics Team Uses W&B Reports

On the Robotics team at OpenAI, we have adopted W&B Reports heavily into our workflow, and over the last ~6 months they have become our primary means of sharing results within the team. The main selling point for us was the ability to mix real experiment data with context and commentary on the results. Prior to Reports, we would typically create a Google Doc to track all of the "runs" in a given line of experimentation, and would spend a fair amount of time linking these docs to the experimental data (in either W&B or TensorBoard). Reports save us this tedious bookkeeping and also make it easier to share complete views of the data, since the viewer can select which runs to view, drill into specific runs, or even clone the report to add more plots.

Workflow with Reports

Whenever we begin a new line of experimentation (e.g. batch size ablations, architecture search), we tend to use the following workflow with Reports:

  1. Create a new report with text at the top explaining the context, hypotheses under test, and experimental plan; then share this with the team for review. Doing this brings more rigor to the experimental process and helps teammates spot issues or suggest ideas earlier on.
  2. Launch one or more experiments aimed at testing the hypotheses.
     - Tip: log the Git SHA used to kick off each experiment, or use W&B's code logging feature; we tend to include the Git SHA prefix in experiment names for easy access.
  3. Once the experiments are confirmed to be running correctly (i.e. no visible bugs), add each as its own Run Set, and then add plots for all the metrics we care about. Doing this up front makes it easy to monitor in-flight experiments, particularly when many are running concurrently.
     - Tip: give Run Sets descriptive names, and make all runs in each Run Set within a given section the same color.
  4. Monitor the experiments. If enough data has been collected after the first set of experiments, add a conclusion to the top and share with the team. Otherwise, update the hypotheses and experimental plan and return to step 2.
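
As a rough illustration of the Git SHA tip above, the following Python sketch builds an experiment name prefixed with a short commit SHA. The helper names (short_git_sha, experiment_name), the "sha-description" format, and the example project name are assumptions made for illustration, not the team's actual tooling.

```python
import subprocess


def short_git_sha(length: int = 8, default: str = "nogit") -> str:
    """Return the first `length` characters of the current commit SHA.

    Falls back to `default` when git is not installed or the working
    directory is not inside a repository.
    """
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return default
    return result.stdout.strip()[:length]


def experiment_name(description: str, sha: str) -> str:
    """Prefix a descriptive experiment name with the SHA (hypothetical format)."""
    return f"{sha}-{description}"


if __name__ == "__main__":
    sha = short_git_sha()
    name = experiment_name("batch-size-ablation-bs512", sha)
    print(name)
    # With the wandb package installed, the name and SHA could be passed along:
    #   import wandb
    #   wandb.init(project="my-project", name=name, config={"git_sha": sha})
```

Putting the SHA first keeps runs launched from the same commit grouped together when names are sorted, and storing it in the run config as well makes it available for filtering.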

The rest of this Report presents one example of this workflow, applied to the line of research aimed at solving the block reorientation task from our Learning Dexterity release end-to-end (with a bit more context included here than in our internal reports). If you are unfamiliar with this work, I suggest reading through the collapsed "Background" section below.
