Colorful Image Colorization
Richard Zhang
Phillip Isola
Alexei A. Efros


Example input grayscale photos and output colorizations from our algorithm. These examples are cases where our model works especially well. For randomly selected examples, see the Performance comparisons section below.


How to interpret the results

Welcome! Computer vision algorithms often work well on some images but fail completely on others. Ours is no exception. We believe our work is a significant step forward in solving the colorization problem, but there are still many hard cases, and this is by no means a solved problem. Some failure cases can be seen below and in the figure here.

This is partly because our algorithm is trained on over a million images from the ImageNet dataset, and will thus work well for these types of images, but not necessarily for others. We call this the "dataset bias" problem. We include colorizations of black and white photos by renowned photographers as an interesting "out-of-dataset" experiment and make no claims as to artistic improvements, although we do enjoy many of the results!

Please enjoy our results, and if you're so inclined, try the model yourself (or ask the nearest hacker for help)!



Abstract

Given a grayscale photograph as input, this paper attacks the problem of hallucinating a plausible color version of the photograph. This problem is clearly underconstrained, so previous approaches have either relied on significant user interaction or resulted in desaturated colorizations. We propose a fully automatic approach that produces vibrant and realistic colorizations. We embrace the underlying uncertainty of the problem by posing it as a classification task and explore using class-rebalancing at training time to increase the diversity of colors in the result. The system is implemented as a feed-forward operation in a CNN at test time and is trained on over a million color images. We evaluate our algorithm using a "colorization Turing test", asking human subjects to choose between a generated and ground truth color image. Our method successfully fools humans 20% of the time, significantly higher than previous methods.
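
To make the two key ideas in the abstract concrete (predicting a distribution over quantized ab color bins, and up-weighting rare colors at training time), here is a small illustrative NumPy sketch. It is not the released training code: the bin grid, ab range, λ value, and the use of hard (rather than soft-encoded) targets are simplifying assumptions; the paper quantizes the ab gamut more carefully and keeps only in-gamut bins.

```python
import numpy as np

# Illustrative sketch: colorization as classification over quantized ab bins,
# with class rebalancing. Grid size, ab range, and lambda are assumptions.

GRID = 10          # assumed ab bin width
AB_RANGE = 110     # assumed ab extent, roughly [-110, 110)

def ab_to_bin(ab):
    """Map an (H, W, 2) ab image to integer bin indices on a coarse grid."""
    idx = np.floor((ab + AB_RANGE) / GRID).astype(int)        # (H, W, 2)
    bins_per_axis = int(2 * AB_RANGE / GRID)
    return idx[..., 0] * bins_per_axis + idx[..., 1]          # (H, W)

def rebalancing_weights(prior, lam=0.5):
    """Per-bin weights proportional to ((1 - lam) * p + lam / Q)^-1,
    normalized so the expected weight under the prior is 1.
    `prior` is the empirical bin distribution over the training set."""
    q = prior.size
    w = 1.0 / ((1.0 - lam) * prior + lam / q)
    w /= np.sum(prior * w)
    return w

def rebalanced_cross_entropy(pred_log_probs, target_bins, weights):
    """Pixel-wise cross-entropy, weighting each pixel by the rarity of its
    ground-truth bin. pred_log_probs: (H, W, Q); target_bins: (H, W)."""
    h, w, _ = pred_log_probs.shape
    ll = pred_log_probs.reshape(h * w, -1)[np.arange(h * w), target_bins.ravel()]
    return -np.mean(weights[target_bins.ravel()] * ll)
```

Rare, saturated bins (bright reds, vivid blues) receive large weights under this scheme, which is what pushes the network away from the desaturated averages a plain regression loss tends to produce.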


Try our model


Demo [IPython Notebook]      Caffe [Prototxt] [Model 129MB]
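
If you want a sense of what the demo notebook does before opening it, the sketch below outlines a plausible inference pipeline with the released Caffe model: feed the L channel of the Lab image to the network, read out the predicted ab channels, upsample them, and recombine with the original-resolution L. The file names, blob names ('data_l', 'class8_ab'), and the mean-centering constant are assumptions; the demo notebook is the authoritative reference.

```python
import numpy as np
import caffe
from skimage import color, io, transform

# Rough inference sketch; paths and blob names are placeholders.
net = caffe.Net('colorization_deploy.prototxt', 'colorization.caffemodel', caffe.TEST)

img_rgb = io.imread('grayscale_photo.jpg')
if img_rgb.ndim == 2:                               # replicate channels if the file is single-channel
    img_rgb = np.dstack([img_rgb] * 3)
img_lab = color.rgb2lab(img_rgb)                    # convert to CIE Lab
img_l = img_lab[:, :, 0]                            # keep only the lightness channel

# Resize L to the network's input resolution and mean-center it (assumed offset of 50).
(H_in, W_in) = net.blobs['data_l'].data.shape[2:]
l_rs = transform.resize(img_l, (H_in, W_in), mode='reflect')
net.blobs['data_l'].data[0, 0, :, :] = l_rs - 50

net.forward()

# Predicted ab comes out at a lower resolution; upsample it to the original size.
ab_out = net.blobs['class8_ab'].data[0].transpose((1, 2, 0))            # h x w x 2
ab_us = transform.resize(ab_out, (img_l.shape[0], img_l.shape[1], 2), mode='reflect')

# Recombine with the original L channel and convert back to RGB.
img_out = color.lab2rgb(np.dstack((img_l, ab_us)))
io.imsave('colorized.png', (np.clip(img_out, 0, 1) * 255).astype(np.uint8))
```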


Paper and Supplementary Material

Full paper [10MB]      Additional details and experiments [1MB]

[Bibtex]


Results on legacy black and white photos

We show results on legacy black and white photographs by renowned photographers Ansel Adams and Henri Cartier-Bresson.

Ansel Adams

(hovering shows our results; click for full images)
Extension of Figure 15 from our paper
Henri Cartier-Bresson

(hovering shows our results; click for full images)
Figure 16 from our paper



Performance comparisons

Click the montage to the left to see our results on ImageNet validation photos (this is an extension of Figure 6 from our paper). Click the montage to the right to see results on a test set sampled from SUN (an extension of Figure 12 in our paper). These images are random samples from the test sets and are not hand-selected.

Comparisons on ImageNet

(hovering shows our results; click for additional examples)
Comparisons on SUN

(hovering shows our results; click for additional examples)


We also provide an initial comparison against Cheng et al. 2015 here. We were unable to obtain code or results from the authors, so we simply ran our method on screenshots of the figures in their paper. See Section 3 of the supplementary PDF for further discussion of the differences between our algorithm and that of Cheng et al.


Semantic interpretability of results

Here, we show the ImageNet categories for which our colorization helps and hurts the most on object classification. Categories are ranked by the difference in VGG classification performance on the colorized result versus the grayscale version; a minimal sketch of this ranking procedure appears after the lists below. This is an extension of Figure 6 in the paper.

Click a category below to see our results on all test images in that category.

Top (colorization helps classification most)
  1. Rapeseed
  2. Lorikeet
  3. Cheeseburger
  4. Meat Loaf
  5. Pomegranate
  6. Green Snake
  7. Pizza
  8. Yellow Lady's Slipper
  9. Orange
  10. Goldfinch

Bottom (colorization hurts classification most)
  1. Chain
  2. Wok
  3. Can opener
  4. Water bottle
  5. Modem
  6. Standard Schnauzer
  7. Pickelhaube
  8. Half Track
  9. Barbershop
  10. Military Uniform
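
For concreteness, here is a minimal sketch of how such a per-category ranking could be computed from per-image classification outcomes on colorized versus grayscale inputs. The input format (one record per validation image) is hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def rank_categories(records):
    """records: iterable of (category, correct_on_colorized, correct_on_gray)
    booleans per validation image (hypothetical format). Returns categories
    sorted by (accuracy on colorized - accuracy on grayscale)."""
    totals = defaultdict(lambda: [0, 0, 0])   # category -> [n, hits_color, hits_gray]
    for cat, hit_color, hit_gray in records:
        t = totals[cat]
        t[0] += 1
        t[1] += int(hit_color)
        t[2] += int(hit_gray)
    deltas = {c: (hc - hg) / float(n) for c, (n, hc, hg) in totals.items()}
    # Head of the list = colorization helps most; tail = hurts most.
    return sorted(deltas, key=deltas.get, reverse=True)
```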



Recent Related Work

There have been a number of works in the field of automatic image colorization in the last few months! We would like to direct you to these recent related works for comparison. For a more thorough discussion of related work, please see Section 2 of our full paper.

Concurrent Work

Gustav Larsson, Michael Maire, and Gregory Shakhnarovich. Learning Representations for Automatic Colorization. In arXiv, Mar 2016. [PDF]


Previous Work

Ryan Dahl. Automatic Colorization. Jan 2016. [Website]
Aditya Deshpande, Jason Rock and David Forsyth. Learning Large-Scale Automatic Image Colorization. In ICCV, Dec 2015. [PDF] [Website]
Zezhou Cheng, Qingxiong Yang, and Bin Sheng. Deep Colorization. In ICCV, Dec 2015. [PDF]



Acknowledgements

This research was supported, in part, by ONR MURI N000141010934, NSF SMA-1514512, an Intel research grant, and a hardware donation by NVIDIA Corp. We thank members of the Berkeley Vision Lab for helpful discussions. We thank Aditya Deshpande for providing help with comparisons to Deshpande et al.