Searching through high-dimensional hyperparameter spaces to find the best-performing model can get unwieldy very fast. Hyperparameter sweeps provide an organized and efficient way to conduct a battle royale of models and pick the most accurate one. They do this by automatically searching through combinations of hyperparameter values (e.g. learning rate, batch size, number of hidden layers, optimizer type) to find the best-performing values.
In this project, we see how you can run sophisticated hyperparameter sweeps in 3 easy steps using Weights & Biases.
We train a plethora of convolutional neural networks, and our battle royale surfaces the model that classifies Simpsons characters with the highest accuracy. We worked with this dataset from Kaggle. We also used Weights & Biases to log model metrics, inspect performance, and share findings about the best architecture for the network.
If you'd like to play with Sweeps, please fork the accompanying Colab notebook.
Running a hyperparameter sweep with Weights & Biases is very easy. There are just 3 simple steps:
1. Define the sweep: we do this by creating a dictionary or a YAML file that specifies the parameters to search through, the search strategy, the optimization metric, and so on.
2. Initialize the sweep: with one line of code, we initialize the sweep and pass in the dictionary of sweep configurations: "sweep_id = wandb.sweep(sweep_config)"
3. Run the sweep agent: also accomplished with one line of code, we call wandb.agent() and pass it the sweep_id to run, along with a function that defines your model architecture and trains it: "wandb.agent(sweep_id, function=train)"
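Put together, the three steps might look roughly like the sketch below. The parameter names, value ranges, metric name ("val_accuracy"), and project name ("simpsons-sweep") are illustrative assumptions, not the exact settings used in the notebook, and the body of train() is a placeholder for your own model code.

```python
# Step 1: define the sweep. All names and ranges here are illustrative assumptions.
sweep_config = {
    "method": "random",  # search strategy: "grid", "random", or "bayes"
    "metric": {"name": "val_accuracy", "goal": "maximize"},
    "parameters": {
        "learning_rate": {"min": 0.0001, "max": 0.1},
        "batch_size": {"values": [32, 64, 128]},
        "hidden_layers": {"values": [1, 2, 3]},
        "optimizer": {"values": ["adam", "sgd", "rmsprop"]},
    },
}


def train():
    """Run one trial. The agent calls this once per hyperparameter combination."""
    import wandb  # imported lazily so the config above can be inspected without wandb

    with wandb.init() as run:
        config = run.config  # values chosen by the sweep for this trial
        # Build and fit your model here using config.learning_rate,
        # config.batch_size, and so on (placeholder, not the notebook's model).
        val_accuracy = 0.0  # replace with the accuracy your model reaches
        run.log({"val_accuracy": val_accuracy})


def launch_sweep(count=10):
    """Steps 2 and 3: register the sweep, then start an agent for `count` trials."""
    import wandb

    sweep_id = wandb.sweep(sweep_config, project="simpsons-sweep")  # hypothetical project
    wandb.agent(sweep_id, function=train, count=count)
    return sweep_id
```

Calling launch_sweep() requires the wandb package and a logged-in account. The metric name in sweep_config has to match a key that train() actually logs, otherwise the sweep has nothing to optimize.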
And voila! That's all there is to running a hyperparameter sweep! You can also find the full sweeps docs with all configuration options here.
We highly encourage you to fork the accompanying Colab notebook, tweak the parameters, or try the model with your own dataset!
Use a parallel coordinates chart to see which hyperparameter values led to the best accuracy.
We can adjust the sliders in the parallel coordinates chart to view only the runs that led to the best accuracy values. This can help us home in on ranges of hyperparameter values to sweep over next.
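The same narrowing can be done by hand once you have the results in a list. The sketch below is a plain-Python illustration, not a Weights & Biases feature: the run records are made up, the narrow_ranges helper is hypothetical, and it only handles numeric hyperparameters.

```python
def narrow_ranges(runs, metric="val_accuracy", top_k=3):
    """Return {hyperparameter: (min, max)} over the top_k runs ranked by metric."""
    if not runs:
        return {}
    # Keep only the best-scoring runs, highest metric first.
    best = sorted(runs, key=lambda r: r[metric], reverse=True)[:top_k]
    names = [key for key in best[0] if key != metric]
    # The span of each hyperparameter among the best runs suggests the next range.
    return {n: (min(r[n] for r in best), max(r[n] for r in best)) for n in names}


# Hypothetical sweep results, for illustration only.
runs = [
    {"learning_rate": 0.01, "batch_size": 32, "val_accuracy": 0.71},
    {"learning_rate": 0.001, "batch_size": 64, "val_accuracy": 0.88},
    {"learning_rate": 0.003, "batch_size": 64, "val_accuracy": 0.90},
    {"learning_rate": 0.1, "batch_size": 128, "val_accuracy": 0.42},
    {"learning_rate": 0.0005, "batch_size": 32, "val_accuracy": 0.86},
]

print(narrow_ranges(runs))
# {'learning_rate': (0.0005, 0.003), 'batch_size': (32, 64)}
```

The resulting ranges could then be used as the "min"/"max" or "values" entries of the next sweep configuration.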
Click through to a single run to see more details about that run. For example, on this run page you can see the performance metrics I logged when I ran this script.
You can visualize predictions made at every step by clicking on the Media tab.
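Images show up in the Media tab when they are logged as wandb.Image objects. The sketch below shows one way this could be done, and may differ from how the notebook does it; the "predictions" key and the helper names are assumptions, and log_predictions must be called inside an active run (after wandb.init()).

```python
def prediction_caption(predicted, actual):
    """Build the caption shown under each logged image."""
    return f"pred: {predicted} | true: {actual}"


def log_predictions(images, predicted, actual, step=None):
    """Log a batch of images with predicted and true labels to the current run."""
    import wandb  # imported lazily; requires an active run started with wandb.init()

    media = [
        wandb.Image(img, caption=prediction_caption(p, a))
        for img, p, a in zip(images, predicted, actual)
    ]
    # "predictions" is an arbitrary key chosen for this example.
    wandb.log({"predictions": media}, step=step)
```

Here images can be NumPy arrays, PIL images, or file paths, which are the input types wandb.Image accepts.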
The overview tab picks up a link to the code. In this case, it's a link to the Google Colab. If you're running a script from a git repo, we'll pick up the SHA of the latest git commit and give you a link to that version of the code in your own GitHub repo.
The System tab on the runs page lets you visualize how resource-efficient your model was. It lets you monitor the GPU, memory, CPU, disk, and network usage in one spot.
As you can see, running sweeps is super easy! We highly encourage you to fork this notebook, tweak the parameters, or try the model with your own dataset!