Colorizing Pictures with Deoldify

Boris Dayma

DeOldify is a colorizer made by Jason Antic with fastai.

Colorizing with neural networks traditionally produce dull colors as the model minimize its loss by producing average colors that end up in the brown area.

The use of a GAN pushes the model to use more vivid colors, even if they are not the correct ones, leading to more realistic predictions.

We recently integrated W&B into DeOldify to make it easier to understand what happens behind the scenes.

Selecting a data-set and a model

As with any machine learning project, having a rich data-set will lead to better results. Fortunately, we just need to convert any color picture to gray to create new samples.

I used ImageNet in my experiments but any other source could be used as long as they are high quality and diverse.

We feed produced black & white images and try to predict their original colorized version.

The reference model for outputting images is a U-net.

Fastai lets us easily create custom U-nets that incorporate pre-trained models such as ResNets.

Pre-training Generator & Critic

One interesting advantage of this particular task is that we also have labeled data and can make use of traditional supervised learning.

Instead of directly starting a GAN training, we will pre-train the generator & the critic.

For the generator, we do it in several phases where we gradually increase the dimensions of the image. This typically leads to faster training as we can feed larger batches in the early steps.

While we start getting some colors, we tend to get brown colors when the model is not sure of what to output (bed sheets vs grass).

We then pre-train the critic whose objective is just to identify which images are real and which ones come from the pre-trained generator.

Bringing colors

Finally, a traditional GAN training is performed.

The following sequence of steps is performed several times:

The author decides manually when to stop GAN training by looking at the current generated samples. There is a moment where the quality of generated pictures actually decreases.

In practice, the author would perform the fastidious task of manually deciding the optimal number of updates to perform at each step over several experiments until reaching satisfactory results.

The results

We highly encourage you to play with this repo. You can find the code to reproduce this analysis and the full report here:

Join our mailing list to get the latest machine learning updates.