HDR image reconstruction from a single exposure using deep CNNs

Abstract

Camera sensors can only capture a limited range of luminance simultaneously, and in order to create high dynamic range (HDR) images a set of different exposures is typically combined. In this paper we address the problem of predicting information that has been lost in saturated image areas, in order to enable HDR reconstruction from a single exposure. We show that this problem is well-suited for deep learning algorithms, and propose a deep convolutional neural network (CNN) that is specifically designed to take into account the challenges in predicting HDR values. To train the CNN we gather a large dataset of HDR images, which we augment by simulating sensor saturation for a range of cameras. To further boost robustness, we pre-train the CNN on a simulated HDR dataset created from a subset of the MIT Places database. We demonstrate that our approach can reconstruct high-resolution, visually convincing HDR results in a wide range of situations, and that it generalizes well to reconstruction of images captured with arbitrary and low-end cameras that use unknown camera response functions and post-processing. Furthermore, we compare to existing methods for HDR expansion, and also show high-quality results for image-based lighting. Finally, we evaluate the results in a subjective experiment performed on an HDR display. This shows that the reconstructed HDR images are visually convincing, with large improvements compared to existing methods.

Overview

We propose a novel method for reconstructing HDR images from low dynamic range (LDR) input images, by estimating missing information in bright image parts, such as highlights, lost due to saturation of the camera sensor. We base our approach on a fully convolutional neural network (CNN) design in the form of a hybrid dynamic range autoencoder:

Teaser: Fully convolutional deep hybrid dynamic range autoencoder network, used for HDR reconstruction. The encoder converts an LDR input to a latent feature representation, and the decoder reconstructs this into an HDR image in the log domain. The skip-connections include a domain transformation from LDR display values to logarithmic HDR, and the fusion of the skip-layers is initialized to perform an addition. The network is pre-trained on a subset of the Places database, and deconvolutions are initialized to perform bilinear upsampling. The specified spatial resolutions are given for a 320 × 320 pixel input image, as used in training, but the network is not restricted to a fixed image size.
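The bilinear initialization of the deconvolutions mentioned in the caption can be illustrated with a small NumPy sketch. This is not taken from the released code: the function names and the (height, width, channels, channels) weight layout are assumptions chosen for illustration, and the construction follows the standard recipe for building a bilinear interpolation kernel.

```python
import numpy as np

def bilinear_kernel(size):
    """Return a (size, size) bilinear interpolation kernel."""
    factor = (size + 1) // 2
    # Kernel center differs for odd and even kernel sizes.
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    rows, cols = np.ogrid[:size, :size]
    return (1 - abs(rows - center) / factor) * (1 - abs(cols - center) / factor)

def bilinear_deconv_weights(size, channels):
    """Hypothetical helper: initial weights for a transposed convolution.

    Each channel is upsampled independently, so only the diagonal
    (c, c) slices receive the bilinear kernel; all others stay zero.
    """
    weights = np.zeros((size, size, channels, channels), dtype=np.float32)
    kernel = bilinear_kernel(size)
    for c in range(channels):
        weights[:, :, c, c] = kernel
    return weights

# A 4x4 kernel is the usual choice for 2x upsampling (stride 2).
w = bilinear_deconv_weights(4, 3)
```

With such an initialization the decoder starts out as plain bilinear upsampling, and training only has to learn the deviation from it.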

For training, we first gather data from a large set of existing HDR image sources in order to create a training dataset. For each HDR image we then simulate a set of corresponding LDR exposures using a virtual camera model. The network weights are optimized over the dataset by minimizing a custom HDR loss function. As the amount of available HDR content is still limited, we use transfer learning, where the weights are pre-trained on a large set of simulated HDR images, created from a subset of the MIT Places database.
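The idea of simulating LDR exposures from an HDR image can be sketched as follows. This is a simplified illustration, not the virtual camera model of the paper: the gamma curve stands in for a camera response function, and the function name, the `exposure` and `gamma` parameters, and the 8-bit quantization are all assumptions.

```python
import numpy as np

def simulate_ldr(hdr, exposure=1.0, gamma=2.2, bits=8):
    """Simulate one LDR exposure from linear HDR values (hypothetical model)."""
    # Scale by the exposure and clip: values above 1 saturate the sensor.
    scaled = np.clip(hdr * exposure, 0.0, 1.0)
    # Apply a simple gamma curve as a stand-in camera response function.
    encoded = scaled ** (1.0 / gamma)
    # Quantize to the given bit depth, keeping the result in [0, 1].
    levels = 2 ** bits - 1
    return np.round(encoded * levels) / levels

# Three simulated exposures of the same linear HDR pixel values.
hdr = np.array([0.0, 0.25, 4.0])
exposures = [simulate_ldr(hdr, exposure=e) for e in (0.5, 1.0, 2.0)]
```

Every value that clips to 1 loses its original magnitude, which is exactly the information the network is trained to predict.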

Expansion of LDR images for HDR applications is commonly referred to as inverse tone-mapping (iTM). Most existing inverse tone-mapping operators (iTMOs) are not very successful at reconstructing saturated pixels. They focus on boosting the dynamic range to look plausible on an HDR display, or on producing the rough estimates needed for image-based lighting (IBL). The proposed method demonstrates a step improvement in the quality of reconstruction, in which the structures and shapes in the saturated regions are recovered. It enables a range of new applications, such as exposure correction, tone-mapping, or glare simulation.
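One way to restrict a reconstruction to the saturated regions is to blend a log-domain prediction with the linearized input using a soft saturation mask. The sketch below is an illustration under stated assumptions rather than the method as specified in the paper: the threshold of 0.95, the gamma-based linearization, and the function names are all hypothetical.

```python
import numpy as np

def saturation_mask(ldr, threshold=0.95):
    """Soft mask that is 0 below the threshold and 1 at full saturation."""
    # Use the brightest color channel of each pixel.
    peak = ldr.max(axis=-1, keepdims=True)
    return np.clip((peak - threshold) / (1.0 - threshold), 0.0, 1.0)

def blend(ldr, log_pred, gamma=2.2, threshold=0.95):
    """Combine linearized input pixels with a log-domain HDR prediction."""
    alpha = saturation_mask(ldr, threshold)
    # Assumed inverse camera response: a plain gamma curve.
    linear = ldr ** gamma
    # Well-exposed pixels keep the input, saturated ones use the prediction.
    return (1.0 - alpha) * linear + alpha * np.exp(log_pred)

# One well-exposed and one saturated pixel, both with a predicted value of 8.
ldr = np.array([[[0.5, 0.5, 0.5], [1.0, 1.0, 1.0]]])
log_pred = np.full(ldr.shape, np.log(8.0))
hdr = blend(ldr, log_pred)
```

The soft ramp between the threshold and full saturation avoids visible seams where predicted and original pixels meet.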

For details on the method, we refer to the paper. Additional results are available in the supplementary document. The complete test set with reconstructions is also provided. Source code for reconstructing arbitrary LDR images can be found on GitHub. All the downloads are listed below.

Downloads

In addition to the paper presented above, we provide a set of supplementary material:

Download	Size	Description
Paper, small	12.6 MB	Paper with JPEG compressed figures.
Paper, large	53.4 MB	Paper with no image compression applied.
Supplementary document	11.5 MB	Supporting document with additional figures complementing those in the paper, as well as a specification of the HDR image sources used for training.
Presentation	95.7 MB	Presentation from SIGGRAPH Asia 2017, Bangkok.
Video	107 MB	The video overview presented above.
Testset reconstructions	581 MB	A zipped archive of the 96 images in the test set, together with corresponding HDR reconstructions in OpenEXR format. There are also JPEGs with example exposures for easy comparison. These examples can be viewed with the accompanying HTML gallery. The gallery is also directly accessible here.
Source code	-	GitHub project with Python scripts for inference and training, implementing the autoencoder CNN using TensorFlow. The code is accompanied by trained parameters.

Related projects

Temporal stability (CVPR 2019)

Reconstructing video material frame by frame most often results in flickering artifacts and other local temporal incoherencies. In order to alleviate this problem, we propose a regularization method in our CVPR 2019 paper (paper, project web). This has been applied to the HDR reconstruction problem; see the GitHub link above for code and trained weights.

Evaluation of single-image HDR image reconstruction (SIGGRAPH 2022)

Properly evaluating single-image HDR reconstruction methods is difficult (see, e.g., here). We recommend using the evaluation protocols proposed in our SIGGRAPH 2022 paper (paper, project web).

Citation

Reference

Gabriel Eilertsen, Joel Kronander, Gyorgy Denes, Rafał K. Mantiuk, Jonas Unger. HDR image reconstruction from a single exposure using deep CNNs. In: ACM Transactions on Graphics (Proc. of SIGGRAPH Asia 2017), 36(6), Article 178, 2017.

BibTeX

@article{EKDMU17,
  author       = "Eilertsen, Gabriel and
                  Kronander, Joel and
                  Denes, Gyorgy and
                  Mantiuk, Rafa{\l} and
                  Unger, Jonas",
  title        = "HDR image reconstruction from a single 
                  exposure using deep CNNs",
  journal      = "ACM Transactions on Graphics (TOG)",
  number       = "6",
  volume       = "36",
  articleno    = "178",
  year         = "2017"
}