Deep learning-based image quality enhancement models have been proposed to improve the perceptual quality of distorted synthesized views impaired by compression and the Depth Image-Based Rendering (DIBR) process in a multi-view video system. Because Multi-view Video plus Depth (MVD) data are scarce, a deep learning-based model that uses additional synthetic Synthesized View Images (SVIs) is proposed, in which a random irregular polygon-based SVI synthesis method simulates the DIBR distortion using existing large-scale RGB/RGBD data. In addition, a DIBR distortion mask prediction network is embedded to further enhance performance.
1. DIBR Distortion Simulation
Figure 1 shows the pipeline of DIBR distortion simulation. Original images from the NYU and DIV2K databases (public RGB/RGBD databases) are first compressed with a codec at a given Quantization Parameter (QP). The associated depth images are either already available (for RGBD images) or can be generated by monocular depth estimation methods. DIBR distortion is then generated along the depth edges, because depth edges are assumed to be the most likely areas for DIBR distortion to reside. Next, the proposed random irregular polygon-based DIBR distortion generation method is applied to the compressed RGB/RGBD data. In this way, a synthetic synthesized view database is constructed, which includes synthetic synthesized images and the corresponding DIBR distortion masks.
Figure 1. Overview of DIBR distortion simulation pipeline.
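As a rough illustration of the depth-edge step in this pipeline, the following Python sketch derives a binary mask from a depth map by thresholding the depth gradient. The function name `depth_edge_mask`, the gradient-threshold detector, and the `threshold` and `dilate` parameters are assumptions made for illustration; the text does not specify how strong depth edges are detected.

```python
import numpy as np

def depth_edge_mask(depth, threshold=0.1, dilate=1):
    """Return a float mask (1.0 = strong depth edge) for a 2-D depth map.

    Hypothetical helper: the detector and its parameters are assumptions.
    """
    depth = np.asarray(depth, dtype=np.float64)
    # Finite-difference gradients along rows (gy) and columns (gx).
    gy, gx = np.gradient(depth)
    mask = np.hypot(gx, gy) > threshold
    # Widen the edge band with a simple 4-neighbour dilation.
    for _ in range(dilate):
        grown = mask.copy()
        grown[1:, :] |= mask[:-1, :]
        grown[:-1, :] |= mask[1:, :]
        grown[:, 1:] |= mask[:, :-1]
        grown[:, :-1] |= mask[:, 1:]
        mask = grown
    return mask.astype(np.float64)
```

The resulting mask can serve as the region in which distortion is injected and, at the same time, as the DIBR distortion mask stored alongside each synthetic image.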
2. Different Local Noise Comparison and Proposed Random Irregular Polygon-Based DIBR Distortion Generation
Figure 2a,b shows SVIs with DIBR distortion from the sequences Lovebird1 and Balloons, such as cracks, fragments, and irregular geometric displacement along the edges of objects. To investigate which kind of noise most resembles DIBR distortion, three noise patterns are compared: Gaussian noise, speckle noise, and patch shuffle-based noise. Gaussian noise is the well-known noise that follows a normal distribution. Speckle noise is a granular noise that often appears in medical ultrasound images and synthetic aperture radar (SAR) images. Patch shuffle is a method that randomly shuffles the pixels within a local patch of images or feature maps during training, and it is used to regularize the training of classification-related CNN models. Taking the DIBR distortion simulation for Lovebird1 as an example, as shown in Figure 2, different synthetic SVIs are obtained by adding Gaussian noise, speckle noise, and patch shuffle-based noise, respectively, to the compressed neighboring captured views along the areas with strongly discontinuous depth. The real SVI is listed as the anchor. Denoting the captured view as $I$, the synthetic synthesized view produced by random noise can be written as

$$\hat{I} = I \odot (\mathbf{1} - M) + I_{n} \odot M,$$

where $\hat{I}$ denotes the synthetic SVI, $I$ denotes the compressed captured view image, $\mathbf{1}$ denotes the matrix with all elements equal to 1, $M$ denotes the mask corresponding to the detected strong depth edges, $\odot$ denotes the element-wise product, and $I_{n}$ denotes the image with random noise added, i.e., Gaussian noise, speckle noise, or the patch-shuffled version of $I$. It can be observed that $\hat{I}$ synthesized with Gaussian noise or speckle noise does not visually resemble synthesis distortion, whereas $\hat{I}$ synthesized with patch shuffle-based noise behaves somewhat similarly, in that the pixels in a local patch appear disordered and irregular.
Figure 2. Comparison of DIBR distortion simulation effects by local random noise. (a,b) are SVIs from sequences Lovebird1 and Balloons, respectively, and the enlarged areas are the representative areas with both compression and DIBR distortion. (c–j) represent the DIBR distortion simulation effects of rectangle areas in (a,b) by Gaussian, speckle, patch shuffle-based, and the proposed random irregular polygon-based noise on compressed captured views of Lovebird1 and Balloons, respectively.
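The masked blending described above can be sketched in Python with NumPy. This is a minimal sketch, assuming float images in [0, 1] and a mask that is 1 on strong depth edges; the function names, the noise strengths, and the non-overlapping patch layout in `patch_shuffle` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def gaussian_noise(img, sigma, rng):
    # Additive zero-mean Gaussian noise, clipped back to [0, 1].
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def speckle_noise(img, sigma, rng):
    # Multiplicative noise: each pixel is scaled by (1 + n), n ~ N(0, sigma^2).
    return np.clip(img * (1.0 + rng.normal(0.0, sigma, img.shape)), 0.0, 1.0)

def patch_shuffle(img, patch, rng):
    """Randomly permute the pixels inside each non-overlapping patch."""
    out = img.copy()
    h, w = img.shape[:2]
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            tile = out[y:y + patch, x:x + patch]
            # Flatten to (pixels, channels) so colour channels stay together.
            pixels = tile.reshape(patch * patch, -1)
            out[y:y + patch, x:x + patch] = rng.permutation(pixels).reshape(tile.shape)
    return out

def blend(img, noisy, mask):
    """Element-wise I * (1 - M) + I_noisy * M."""
    if img.ndim == 3:
        mask = mask[..., None]  # broadcast the mask over colour channels
    return img * (1.0 - mask) + noisy * mask
```

A typical call would be `blend(img, patch_shuffle(img, 4, np.random.default_rng(0)), mask)`; pixels outside the mask are returned unchanged.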
SVI with DIBR distortion can be viewed as tiny movements of texture within random polygonal areas along the depth transition area. To better simulate this irregular geometric distortion, a simple random polygon generation method that can control irregularity and spikiness is introduced as follows. Following an existing random polygon generation method, a random set of points in angularly sorted order is generated first, and the vertices are then connected in that order. Given a center point $P$, a group of points is sampled on a circle around $P$. Random noise is added by varying the angular spacing between sequential points and the radial distance of each point from the center. The process can be formulated as

$$\theta_i = \theta_{i-1} + \Delta\theta_i, \quad \Delta\theta_i \sim U\left(\frac{2\pi}{n} - \varepsilon,\ \frac{2\pi}{n} + \varepsilon\right), \quad r_i \sim \mathcal{N}(R, \sigma),$$

where $\theta_i$ and $r_i$ represent the angle and radius of the $i$-th point relative to the assumed center point, respectively. $\Delta\theta_i$ denotes the random variable controlling the angular spacing between sequential points; it follows a uniform distribution $U$ with smallest value $2\pi/n - \varepsilon$ and largest value $2\pi/n + \varepsilon$, where $n$ denotes the number of vertices. Moreover, $r_i$ follows a Gaussian distribution with the given radius $R$ as the mean and $\sigma$ as the variance. $R$ adjusts the size of the generated polygon. $\varepsilon$ adjusts the irregularity of the generated polygon by controlling the angular variation through the interval size of $U$, and $\sigma$ adjusts its spikiness by controlling the radius variation through the normal distribution. Large $\varepsilon$ and $\sigma$ indicate strong irregularity and spikiness, and vice versa, as shown in Figure 3.
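Figure 3. Examples of generated random polygons. $n$ denotes the number of vertices, $\varepsilon$ denotes irregularity, and $\sigma$ denotes spikiness. (a) $n = 6$, $\varepsilon = 0$, $\sigma = 0$. (b) $n = 6$, $\varepsilon = 0.5$, $\sigma = 0$. (c) $n = 6$, $\varepsilon = 0$, $\sigma = 0.5$. (d) $n = 6$, $\varepsilon = 0.7$, $\sigma = 0.7$. (e) $n = 15$, $\varepsilon = 0.7$, $\sigma = 0.7$.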
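The polygon generation procedure can be sketched with the Python standard library alone. This is a minimal sketch under several assumptions: irregularity and spikiness are taken as values in [0, 1] and rescaled to an angular half-width and a radial standard deviation, the angular steps are renormalized to sum to one full turn, and radii are clipped to [0, 2R]. The function name `random_polygon` and these choices are illustrative, not taken from the text.

```python
import math
import random

def random_polygon(center, radius, n, irregularity, spikiness, rng=None):
    """Return n (x, y) vertices in angular order around `center`."""
    rng = rng or random.Random()
    # Assumed rescaling: half-width of the uniform interval around 2*pi/n.
    eps = irregularity * 2 * math.pi / n
    # Assumed rescaling: standard deviation of the Gaussian radius.
    sigma = spikiness * radius

    # Draw angular steps, then renormalise so they sum to one full turn.
    mean_step = 2 * math.pi / n
    steps = [rng.uniform(mean_step - eps, mean_step + eps) for _ in range(n)]
    scale = 2 * math.pi / sum(steps)
    steps = [s * scale for s in steps]

    angle = rng.uniform(0, 2 * math.pi)  # random starting direction
    vertices = []
    for step in steps:
        # Gaussian radius, clipped so vertices stay within 2 * radius.
        r = min(max(rng.gauss(radius, sigma), 0.0), 2 * radius)
        vertices.append((center[0] + r * math.cos(angle),
                         center[1] + r * math.sin(angle)))
        angle += step
    return vertices
```

With `irregularity=0` and `spikiness=0` the function returns a regular polygon inscribed in a circle of the given radius, which corresponds to the undistorted case in Figure 3a.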
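Thus, the synthetic SVI composed with the proposed random polygon noise can be obtained as

$$\hat{I} = I \odot (\mathbf{1} - M) + I_{V,\mathbf{d}} \odot M,$$

where $\hat{I}$ denotes the synthetic SVI, $V$ denotes the vertex set, located in a local region, generated by the random polygon method, and $\mathbf{d}$ denotes a random vector by which all points of the polygon $V$ are bodily shifted in $I$. The resulting shifted image $I_{V,\mathbf{d}}$ is fused with $I$ in the regions with strong depth edges given by the mask $M$. In Figure 2f,j, it can be observed that the DIBR distortion generated by moving textures within random polygonal areas along the edges visually resembles the real distortion.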
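The polygon-based texture shift can be sketched in Python with NumPy as follows. This is a minimal sketch, assuming integer pixel shifts, an even-odd point-in-polygon test, and a mask that is 1 on strong depth edges; the function names `polygon_mask`, `shift_inside_polygon`, and `polygon_distort` are hypothetical, and the exact warping used in the original method is not given in the text.

```python
import numpy as np

def polygon_mask(vertices, shape):
    """Boolean mask of pixels inside the polygon (even-odd ray casting)."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    inside = np.zeros((h, w), dtype=bool)
    n = len(vertices)
    for i in range(n):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
        if y0 == y1:
            continue  # a horizontal edge never crosses a horizontal ray
        # Does a ray cast to the right of the pixel cross this edge?
        crosses = (y0 > ys) != (y1 > ys)
        x_at_y = x0 + (ys - y0) * (x1 - x0) / (y1 - y0)
        inside ^= crosses & (xs < x_at_y)
    return inside

def shift_inside_polygon(img, vertices, shift):
    """Copy the texture inside the polygon to a position offset by (dx, dy)."""
    h, w = img.shape[:2]
    ys, xs = np.nonzero(polygon_mask(vertices, (h, w)))
    dx, dy = shift  # integer pixel offsets
    ty, tx = ys + dy, xs + dx
    keep = (ty >= 0) & (ty < h) & (tx >= 0) & (tx < w)
    out = img.copy()
    out[ty[keep], tx[keep]] = img[ys[keep], xs[keep]]
    return out

def polygon_distort(img, mask, vertices, shift):
    """Fuse the shifted image with the original inside the depth-edge mask."""
    shifted = shift_inside_polygon(img, vertices, shift)
    m = mask[..., None] if img.ndim == 3 else mask
    return img * (1.0 - m) + shifted * m
```

In practice the call would be repeated for many small polygons whose centers are sampled along the depth edges, with a fresh random shift for each one.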
3. DIBR Distortion Mask Prediction Network Embedding
Existing image quality assessment (IQA) models for SVI demonstrate that determining the position of DIBR distortion is the key procedure for quality assessment, which suggests that knowing and paying more attention to the DIBR distortion position may help synthesized view quality enhancement (SVQE) models improve SVI quality. Therefore, how to incorporate the DIBR distortion position into SVQE models becomes a new issue. The intuitive way is to feed the DIBR distortion position together with the distorted image as a single input; Figure 4a sketches this approach. It can be validated that knowing the DIBR distortion position is helpful for synthesized image quality enhancement. However, the ground truth DIBR distortion position is usually unknown, so the position has to be detected or estimated. Inspired by de-raining and shadow removal, SVI quality enhancement can be regarded as two tasks, i.e., DIBR distortion mask estimation and image restoration/denoising. In these works, there are three main ways to combine the mask estimation and image restoration networks: a successive (series) network, a parallel (multi-task) network, and a parallel interactive network. These are sketched in Figure 4b–d. In addition to different network organizations or designs, attention mechanisms, such as spatial attention, self-attention, or non-local attention, are also considered in existing denoising or restoration networks. Researchers mainly focus on networks that explicitly combine DIBR mask prediction and DIBR distortion elimination, and mainly test the successive (series) network shown in Figure 4b.
Figure 4. Four possible ways of image denoising/restoration networks integrating with DIBR distortion position. (a) Intuitive way of integrating ground truth DIBR distortion position. (b) Successive networks with DIBR distortion prediction. (c) Parallel networks with DIBR distortion prediction. (d) Parallel interactive network with DIBR distortion prediction.
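To make the data flow of the first two arrangements concrete, the following Python sketch wires two placeholder callables together. It only illustrates the wiring: `mask_net` and `restore_net` stand for arbitrary networks, the toy stand-ins below are hypothetical, and passing the mask as an extra input channel is an assumption rather than a detail stated in the text.

```python
import numpy as np

def with_known_mask(restore_net, image, mask):
    """Figure 4a style: the known mask is stacked onto the image as a channel."""
    return restore_net(np.concatenate([image, mask[..., None]], axis=-1))

def successive(mask_net, restore_net, image):
    """Figure 4b style: predict the mask first, then restore with image + mask."""
    mask = mask_net(image)
    restored = restore_net(np.concatenate([image, mask[..., None]], axis=-1))
    return restored, mask

# Toy stand-ins (hypothetical), used only to exercise the data flow.
def toy_mask_net(image):
    # Flags bright pixels; a real network would predict DIBR distortion.
    return (image.mean(axis=-1) > 0.5).astype(np.float64)

def toy_restore_net(stacked):
    # Returns the RGB channels unchanged; a real network would restore them.
    return stacked[..., :3]
```

The successive arrangement returns the predicted mask as well as the restored image, so that both outputs could be supervised when ground truth masks exist, as they do for the synthetic database.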