Benchmarks

Weights & Biases is an experiment tracking platform for deep learning. Our tools make collaborative deep learning easy for teams by organizing experiments and notes in a shared workspace, tracking all the code and hyperparameters, and plotting results.

With public benchmarks, we want to extend collaboration to the broader community. Benchmarks are a collaborative way for deep learning practitioners to share the results of training their models and to coordinate their efforts in one place. You can see the code, workflow, and results of other people's experiments, so if you're new to a problem, you don't have to start from scratch.

Our hope is that everyone can see other people's work on similar problems in a single place. We think that the more collaborative and inclusive the process of model building becomes, the safer and better the models we build will be.

Browse and contribute to our current benchmarks. If you have an idea for a benchmark or would like us to host one for you, please reach out.

Drought Watch: Predict drought severity from satellite imagery and ground-level photos
KMNIST: Learn to read classical Japanese handwriting from images
The Witness: Teach an AI to play a video game (spoiler alert for The Witness!)
CATZ Benchmark: Learn to predict cat behavior from consecutive frames in GIFs
Super Resolution: Learn to enhance images without losing quality
Colorizer: Add realistic color to black and white photos