When I started studying neural networks, one of the most inspiring projects I found was CycleGAN. The results were impressive, but at first I had a hard time understanding how it worked. Now, with a bit more experience and effective experiment tracking tools, I have a better idea of what happens.
In Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, the authors do an amazing job of providing clear, well-documented code with reproducible results. The team presents a method that learns to capture the special characteristics of one image collection and figures out how these characteristics could be translated into a second image collection - all in the absence of any paired training examples. This means CycleGAN can tackle problems where labeled data is limited because labeling is costly, tedious or impossible.
To dive deeper, it is fascinating to see how the network achieves results without being fed any paired “before-and-after” images. For example, to produce the outcome in the example below, I gave it a set of zebra images and a separate set of horse images, and the network learned how to go from zebra to horse while keeping the background and the shape of the animal the same.
This type of style transfer also works on other categories. Here’s another example where I gave the network pictures of winter scenes and summer scenes. I can feed this model a picture of a winter scene and it will change the season in the picture to summer.
Results were quite amazing when combining datasets of Monet paintings with real pictures.
Details suddenly appear and pictures look quite realistic. We can easily guess from the examples that the dataset of real pictures included a lot of sunsets or sunrises.
For CycleGAN, there are quite a few different losses defining the model. The network comprises two generators (one mapping A images to B images, the other mapping B back to A) and two discriminators (one per image collection). Those losses help us understand what the model is doing when mapping A images to B images.
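To make these losses more concrete, here is a minimal sketch of how the generator and discriminator terms fit together. It is not the code from the official repository: it assumes the least-squares adversarial loss and the L1 cycle-consistency loss (weighted by 10) described in the paper, leaves out the optional identity loss, and uses plain Python lists as stand-ins for images. The toy generators and discriminators (`g_ab`, `g_ba`, `d_a`, `d_b`) and all function names are hypothetical, chosen only for illustration.

```python
def l1_loss(a, b):
    """Mean absolute difference between two equally sized pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def lsgan_loss(scores, target):
    """Least-squares GAN loss: mean squared distance of scores to a target."""
    return sum((s - target) ** 2 for s in scores) / len(scores)

def generator_losses(real_a, real_b, g_ab, g_ba, d_a, d_b, lambda_cyc=10.0):
    """Combined loss for both generators (identity loss omitted)."""
    fake_b = g_ab(real_a)  # A -> B
    fake_a = g_ba(real_b)  # B -> A
    # Adversarial term: each generator wants its fakes scored as real (1.0).
    adv = lsgan_loss(d_b(fake_b), 1.0) + lsgan_loss(d_a(fake_a), 1.0)
    # Cycle term: A -> B -> A and B -> A -> B should return the input.
    cycle = l1_loss(g_ba(fake_b), real_a) + l1_loss(g_ab(fake_a), real_b)
    return {"adv": adv, "cycle": cycle, "total": adv + lambda_cyc * cycle}

def discriminator_loss(d, real, fake):
    """A discriminator wants real images scored 1.0 and fakes scored 0.0."""
    return 0.5 * (lsgan_loss(d(real), 1.0) + lsgan_loss(d(fake), 0.0))

# Hypothetical toy models: "images" are lists of pixel values in [0, 1].
g_ab = lambda img: [p + 0.5 for p in img]   # pretend A -> B brightens
g_ba = lambda img: [p - 0.5 for p in img]   # pretend B -> A darkens
d_a = lambda img: [1.0 - p for p in img]    # dark pixels look like A
d_b = lambda img: list(img)                 # bright pixels look like B

real_a = [0.0, 0.25, 0.5]   # unpaired samples: no relation between
real_b = [0.5, 0.75, 1.0]   # real_a and real_b is assumed

losses = generator_losses(real_a, real_b, g_ab, g_ba, d_a, d_b)
loss_d_b = discriminator_loss(d_b, real_b, g_ab(real_a))
print(losses, loss_d_b)
```

Because the two toy generators are exact inverses of each other, the cycle term is zero here and only the adversarial term contributes. In a real run, each of these terms is logged as its own curve, which is why there are so many losses to look at.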
As you can see from the graphs, the loss decreases and oscillates within a range. The best way to see if the model is performing well is to look at sample predictions. The W&B interface makes this easy. You can see some example images in my logged runs:
Here are some of the most amazing samples I obtained going from Monet paintings to real pictures.
And other examples doing the opposite: real picture to Monet painting.
If you’re interested in learning more, check out our fork to easily log CycleGAN experiments with Weights & Biases.