Voix committed on
Commit
e76fca0
·
verified ·
1 Parent(s): 4185b7c

Update README.md

Files changed (1)
  1. README.md +200 -0
README.md CHANGED
@@ -6,4 +6,204 @@ tags:
6
  - gan
7
  license: mit
8
  ---
9
+ ## ESRGAN (Enhanced SRGAN) [:rocket: [BasicSR](https://github.com/xinntao/BasicSR)] [[Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN)]
10
+
11
+ :sparkles: **New Updates.**
12
+
13
+ We have extended ESRGAN to [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN), which is a **more practical algorithm for real-world image restoration**. For example, it can also remove annoying JPEG compression artifacts. <br> We recommend giving it a try :smiley:
14
+
15
+ In the [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) repo,
16
+
17
+ - You can still use the original ESRGAN model or your re-trained ESRGAN model. See [the model zoo in Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN#european_castle-model-zoo).
18
+ - We provide a handier inference script, which supports 1) **tile** inference; 2) images with an **alpha channel**; 3) **gray** images; 4) **16-bit** images.
19
+ - We also provide a **Windows executable file** RealESRGAN-ncnn-vulkan for easier use without installing the environment. This executable file also includes the original ESRGAN model.
20
+ - The full training code is also released in the [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) repo.
21
+
22
+ You are welcome to open issues or discussions in the [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) repo.
23
+
24
+ - If you have any questions, you can open an issue in the [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) repo.
25
+ - If you have any good ideas or feature requests, please open an issue/discussion in the [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) repo to let me know.
26
+ - If you have images that Real-ESRGAN could not restore well, please also open an issue/discussion in the [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) repo. I will record them (but I cannot guarantee a fix 😛).
27
+
28
+ Here are some examples for Real-ESRGAN:
29
+
30
+ <p align="center">
31
+ <img src="https://raw.githubusercontent.com/xinntao/Real-ESRGAN/master/assets/teaser.jpg">
32
+ </p>
33
+ :book: Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data
34
+
35
+ > [[Paper](https://arxiv.org/abs/2107.10833)] <br>
36
+ > [Xintao Wang](https://xinntao.github.io/), Liangbin Xie, [Chao Dong](https://scholar.google.com.hk/citations?user=OSDCB0UAAAAJ), [Ying Shan](https://scholar.google.com/citations?user=4oXBp9UAAAAJ&hl=en) <br>
37
+ > Applied Research Center (ARC), Tencent PCG<br>
38
+ > Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences
39
+
40
+ -----
41
+
42
+ Because some repos may depend on this ESRGAN repo, we will not modify it (especially the code).
43
+
44
+ The following is the original README:
45
+
46
+ #### The training code is in :rocket: [BasicSR](https://github.com/xinntao/BasicSR). This repo only provides simple testing code, pretrained models and the network interpolation demo.
47
+
48
+ [BasicSR](https://github.com/xinntao/BasicSR) is an **open source** image and video super-resolution toolbox based on PyTorch (will extend to more restoration tasks in the future). <br>
49
+ It includes methods such as **EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR**, etc. It now also supports **StyleGAN2**.
50
+
51
+ ### Enhanced Super-Resolution Generative Adversarial Networks
52
+ By Xintao Wang, [Ke Yu](https://yuke93.github.io/), Shixiang Wu, [Jinjin Gu](http://www.jasongt.com/), Yihao Liu, [Chao Dong](https://scholar.google.com.hk/citations?user=OSDCB0UAAAAJ&hl=en), [Yu Qiao](http://mmlab.siat.ac.cn/yuqiao/), [Chen Change Loy](http://personal.ie.cuhk.edu.hk/~ccloy/)
53
+
54
+ We won first place in the [PIRM2018-SR competition](https://www.pirm2018.org/PIRM-SR.html) (region 3) and achieved the best perceptual index.
55
+ The paper was accepted to the [ECCV2018 PIRM Workshop](https://pirm2018.org/).
56
+
57
+ :triangular_flag_on_post: Add [Frequently Asked Questions](https://github.com/xinntao/ESRGAN/blob/master/QA.md).
58
+
59
+ > For instance,
60
+ > 1. How to reproduce your results in the PIRM18-SR Challenge (with low perceptual index)?
61
+ > 2. How do you get the perceptual index in your ESRGAN paper?
62
+
63
+ #### BibTeX
64
+
65
+ @InProceedings{wang2018esrgan,
66
+ author = {Wang, Xintao and Yu, Ke and Wu, Shixiang and Gu, Jinjin and Liu, Yihao and Dong, Chao and Qiao, Yu and Loy, Chen Change},
67
+ title = {ESRGAN: Enhanced super-resolution generative adversarial networks},
68
+ booktitle = {The European Conference on Computer Vision Workshops (ECCVW)},
69
+ month = {September},
70
+ year = {2018}
71
+ }
72
+
73
+ <p align="center">
74
+ <img src="figures/baboon.jpg">
75
+ </p>
76
+
77
+ The PSNR-oriented **RRDB_PSNR** model, trained on the DF2K dataset (a merged dataset of [DIV2K](https://data.vision.ee.ethz.ch/cvl/DIV2K/) and [Flickr2K](http://cv.snu.ac.kr/research/EDSR/Flickr2K.tar) (proposed in [EDSR](https://github.com/LimBee/NTIRE2017))), is also able to achieve high PSNR performance.
78
+
79
+ | <sub>Method</sub> | <sub>Training dataset</sub> | <sub>Set5</sub> | <sub>Set14</sub> | <sub>BSD100</sub> | <sub>Urban100</sub> | <sub>Manga109</sub> |
80
+ |:---:|:---:|:---:|:---:|:---:|:---:|:---:|
81
+ | <sub>[SRCNN](http://mmlab.ie.cuhk.edu.hk/projects/SRCNN.html)</sub>| <sub>291</sub>| <sub>30.48/0.8628</sub> |<sub>27.50/0.7513</sub>|<sub>26.90/0.7101</sub>|<sub>24.52/0.7221</sub>|<sub>27.58/0.8555</sub>|
82
+ | <sub>[EDSR](https://github.com/thstkdgus35/EDSR-PyTorch)</sub> | <sub>DIV2K</sub> | <sub>32.46/0.8968</sub> | <sub>28.80/0.7876</sub> | <sub>27.71/0.7420</sub> | <sub>26.64/0.8033</sub> | <sub>31.02/0.9148</sub> |
83
+ | <sub>[RCAN](https://github.com/yulunzhang/RCAN)</sub> | <sub>DIV2K</sub> | <sub>32.63/0.9002</sub> | <sub>28.87/0.7889</sub> | <sub>27.77/0.7436</sub> | <sub>26.82/0.8087</sub> | <sub>31.22/0.9173</sub> |
84
+ |<sub>RRDB(ours)</sub>| <sub>DF2K</sub>| <sub>**32.73/0.9011**</sub> |<sub>**28.99/0.7917**</sub> |<sub>**27.85/0.7455**</sub> |<sub>**27.03/0.8153**</sub> |<sub>**31.66/0.9196**</sub>|
85
+
86
+ ## Quick Test
87
+ #### Dependencies
88
+ - Python 3
89
+ - [PyTorch >= 1.0](https://pytorch.org/) (CUDA version >= 7.5 if installing with CUDA. [More details](https://pytorch.org/get-started/previous-versions/))
90
+ - Python packages: `pip install numpy opencv-python`
91
+
92
+ ### Test models
93
+ 1. Clone this GitHub repo.
94
+ git clone https://github.com/xinntao/ESRGAN
95
+ cd ESRGAN
96
+
97
+ 2. Place your own **low-resolution images** in the `./LR` folder. (Two sample images, baboon and comic, are included.)
98
+ 3. Download pretrained models from [Google Drive](https://drive.google.com/drive/u/0/folders/17VYV_SoZZesU6mbxz2dMAIccSSlqLecY) or [Baidu Drive](https://pan.baidu.com/s/1-Lh6ma-wXzfH8NqeBtPaFQ). Place the models in ./models. We provide two models with high perceptual quality and high PSNR performance (see [model list](https://github.com/xinntao/ESRGAN/tree/master/models)).
99
+ 4. Run the test. We provide the ESRGAN model and the RRDB_PSNR model; you can choose between them in `test.py`.
100
+ python test.py
101
+
102
+ 5. The results are in the `./results` folder.
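For orientation, the data handling around the model in a script such as `test.py` can be sketched as follows. This is a minimal sketch under stated assumptions, not the repository's code: the function names `to_model_input` and `to_image` are hypothetical, the model call itself is omitted, and the input is assumed to be an 8-bit BGR image of the kind OpenCV's `cv2.imread` returns.

```python
import numpy as np


def to_model_input(img_bgr_u8: np.ndarray) -> np.ndarray:
    """Convert an HxWx3 uint8 BGR image to a 1x3xHxW float32 RGB array in [0, 1]."""
    img = img_bgr_u8.astype(np.float32) / 255.0
    img = img[:, :, ::-1]                 # BGR -> RGB
    img = np.transpose(img, (2, 0, 1))    # HWC -> CHW
    return np.ascontiguousarray(img[np.newaxis, ...])


def to_image(output: np.ndarray) -> np.ndarray:
    """Convert a 1x3xHxW float RGB array back to an HxWx3 uint8 BGR image."""
    out = np.clip(output[0], 0.0, 1.0)    # drop batch axis, clamp to valid range
    out = np.transpose(out, (1, 2, 0))    # CHW -> HWC
    out = out[:, :, ::-1]                 # RGB -> BGR
    return np.round(out * 255.0).astype(np.uint8)
```

A super-resolution network would sit between the two calls, mapping the `1x3xHxW` input to a larger `1x3x(sH)x(sW)` output before `to_image` is applied.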
103
+ ### Network interpolation demo
104
+ You can interpolate the RRDB_ESRGAN and RRDB_PSNR models with alpha in [0, 1].
105
+
106
+ 1. Run `python net_interp.py 0.8`, where *0.8* is the interpolation parameter; you can change it to any value in [0, 1].
107
+ 2. Run `python test.py models/interp_08.pth`, where *models/interp_08.pth* is the model path.
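The idea behind the interpolation step can be sketched as a per-parameter linear blend of the two models. The snippet below is an illustrative sketch, not the contents of `net_interp.py`: the function name `interpolate_weights` and the toy parameter name `conv.weight` are hypothetical, and plain NumPy arrays stand in for the real checkpoints. It assumes that alpha = 0 corresponds to the RRDB_PSNR model and alpha = 1 to the RRDB_ESRGAN model.

```python
import numpy as np


def interpolate_weights(psnr_weights, esrgan_weights, alpha):
    """Blend two parameter dicts: alpha=0 gives the PSNR model, alpha=1 the ESRGAN model."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if psnr_weights.keys() != esrgan_weights.keys():
        raise ValueError("both models must share the same parameter names")
    return {
        name: (1.0 - alpha) * psnr_weights[name] + alpha * esrgan_weights[name]
        for name in psnr_weights
    }


# Toy example with a single hypothetical 2x2 "layer".
psnr = {"conv.weight": np.zeros((2, 2))}
esrgan = {"conv.weight": np.ones((2, 2))}
blended = interpolate_weights(psnr, esrgan, alpha=0.8)
```

Because both networks share one architecture, the blended dictionary can be loaded into the same model definition as either endpoint.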
108
+
109
+ <p align="center">
110
+ <img height="400" src="figures/43074.gif">
111
+ </p>
112
+
113
+ ## Perceptual-driven SR Results
114
+
115
+ You can download all the results from [Google Drive](https://drive.google.com/drive/folders/1iaM-c6EgT1FNoJAOKmDrK7YhEhtlKcLx?usp=sharing). (:heavy_check_mark: included; :heavy_minus_sign: not included; :o: TODO)
116
+
117
+ HR images can be downloaded from [BasicSR-Datasets](https://github.com/xinntao/BasicSR#datasets).
118
+
119
+ | Datasets |LR | [*ESRGAN*](https://arxiv.org/abs/1809.00219) | [SRGAN](https://arxiv.org/abs/1609.04802) | [EnhanceNet](http://openaccess.thecvf.com/content_ICCV_2017/papers/Sajjadi_EnhanceNet_Single_Image_ICCV_2017_paper.pdf) | [CX](https://arxiv.org/abs/1803.04626) |
120
+ |:---:|:---:|:---:|:---:|:---:|:---:|
121
+ | Set5 |:heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark:| :o: |
122
+ | Set14 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark:| :o: |
123
+ | BSD100 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark:| :o: |
124
+ | [PIRM](https://pirm.github.io/) <br><sup>(val, test)</sup> | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark:| :heavy_check_mark: |
125
+ | [OST300](https://arxiv.org/pdf/1804.02815.pdf) |:heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark:| :o: |
126
+ | Urban100 | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark:| :o: |
127
+ | [DIV2K](https://data.vision.ee.ethz.ch/cvl/DIV2K/) <br><sup>(val, test)</sup> | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark:| :o: |
128
+
129
+ ## ESRGAN
130
+ We improve [SRGAN](https://arxiv.org/abs/1609.04802) in three aspects:
131
+ 1. adopt a deeper model using Residual-in-Residual Dense Block (RRDB) without batch normalization layers.
132
+ 2. employ [Relativistic average GAN](https://ajolicoeur.wordpress.com/relativisticgan/) instead of the vanilla GAN.
133
+ 3. improve the perceptual loss by using the features before activation.
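To make the second point concrete, the relativistic average GAN losses can be sketched as below. This is a minimal NumPy sketch that assumes the standard relativistic average formulation, where the discriminator scores each sample relative to the mean score of the opposite class. The function names are hypothetical, the inputs are raw (pre-sigmoid) discriminator outputs, and actual training code may weight or average the two terms differently.

```python
import numpy as np


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    return -np.logaddexp(0.0, -x)


def ragan_losses(real_logits, fake_logits):
    """Return (discriminator_loss, generator_loss) for the relativistic average GAN."""
    real_rel = real_logits - fake_logits.mean()   # C(x_r) - E[C(x_f)]
    fake_rel = fake_logits - real_logits.mean()   # C(x_f) - E[C(x_r)]
    # log(1 - sigmoid(z)) == log_sigmoid(-z)
    d_loss = -log_sigmoid(real_rel).mean() - log_sigmoid(-fake_rel).mean()
    g_loss = -log_sigmoid(-real_rel).mean() - log_sigmoid(fake_rel).mean()
    return d_loss, g_loss
```

Unlike the vanilla GAN generator loss, `g_loss` here depends on the real samples as well as the generated ones, so the generator receives gradients from both.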
134
+
135
+ In contrast to SRGAN, which claimed that **deeper models are increasingly difficult to train**, our deeper ESRGAN model achieves superior performance and is easy to train.
136
+
137
+ <p align="center">
138
+ <img height="120" src="figures/architecture.jpg">
139
+ </p>
140
+ <p align="center">
141
+ <img height="180" src="figures/RRDB.png">
142
+ </p>
143
+
144
+ ## Network Interpolation
145
+ We propose a **network interpolation strategy** to balance visual quality and PSNR.
146
+
147
+ <p align="center">
148
+ <img height="500" src="figures/net_interp.jpg">
149
+ </p>
150
+
151
+ The animations show the results as the interpolation parameter changes from 0 to 1.
152
+ Interestingly, the network interpolation strategy provides smooth control between the RRDB_PSNR model and the fine-tuned ESRGAN model.
153
+
154
+ <p align="center">
155
+ <img height="480" src="figures/81.gif">
156
+ &nbsp; &nbsp;
157
+ <img height="480" src="figures/102061.gif">
158
+ </p>
159
+
160
+ ## Qualitative Results
161
+ PSNR (evaluated on the Y channel) and the perceptual index used in the PIRM-SR challenge are also provided for reference.
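As a reference for how a Y-channel PSNR can be computed, here is a minimal sketch. The exact evaluation protocol (for example border cropping or the precise color conversion) is not stated here, so the ITU-R BT.601 luma conversion below is an assumption, and the function names `rgb_to_y` and `psnr_y` are hypothetical.

```python
import numpy as np


def rgb_to_y(img_rgb_u8):
    """Luma (Y) of an HxWx3 uint8 RGB image, ITU-R BT.601, range [16, 235]."""
    rgb = img_rgb_u8.astype(np.float64) / 255.0
    return 16.0 + rgb @ np.array([65.481, 128.553, 24.966])


def psnr_y(img_a, img_b):
    """PSNR in dB between the Y channels of two uint8 RGB images."""
    mse = np.mean((rgb_to_y(img_a) - rgb_to_y(img_b)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(255.0 ** 2 / mse)
```

Note that OpenCV loads images in BGR order, so an image read with `cv2.imread` would need its channels reversed before being passed to `rgb_to_y`.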
162
+
163
+ <p align="center">
164
+ <img src="figures/qualitative_cmp_01.jpg">
165
+ </p>
166
+ <p align="center">
167
+ <img src="figures/qualitative_cmp_02.jpg">
168
+ </p>
169
+ <p align="center">
170
+ <img src="figures/qualitative_cmp_03.jpg">
171
+ </p>
172
+ <p align="center">
173
+ <img src="figures/qualitative_cmp_04.jpg">
174
+ </p>
175
+
176
+ ## Ablation Study
177
+ Overall visual comparison showing the effect of each component in
178
+ ESRGAN. Each column represents a model, with its configuration shown at the top.
179
+ The red sign indicates the main improvement compared with the previous model.
180
+ <p align="center">
181
+ <img src="figures/abalation_study.png">
182
+ </p>
183
+
184
+ ## BN artifacts
185
+ We empirically observe that BN layers tend to bring artifacts. These artifacts,
186
+ namely BN artifacts, occasionally appear among iterations and different settings,
187
+ violating the need for stable performance during training. We find that
188
+ the network depth, BN position, training dataset and training loss
189
+ have an impact on the occurrence of BN artifacts.
190
+ <p align="center">
191
+ <img src="figures/BN_artifacts.jpg">
192
+ </p>
193
+
194
+ ## Useful techniques to train a very deep network
195
+ We find that residual scaling and smaller initialization can help to train a very deep network. More details are in the Supplementary File attached to our [paper](https://arxiv.org/abs/1809.00219).
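The two techniques can be illustrated with a toy example. This is a hedged sketch, not the network's actual code: it uses plain linear layers in NumPy instead of convolutions, and the values `beta=0.2` (residual scaling factor) and `scale=0.1` (multiplier on He/MSRA initialization) are illustrative assumptions, as are the function names.

```python
import numpy as np

rng = np.random.default_rng(0)


def scaled_he_init(fan_in, fan_out, scale=0.1):
    """He (MSRA) normal initialization multiplied by a small factor."""
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_in, fan_out)) * scale


def residual_block(x, weight, beta=0.2):
    """y = x + beta * F(x), with F a single linear layer followed by ReLU."""
    residual = np.maximum(x @ weight, 0.0)
    return x + beta * residual


# Stack many blocks; the down-scaled residual branch keeps activations bounded.
x = rng.normal(size=(4, 64))
for _ in range(50):
    x = residual_block(x, scaled_he_init(64, 64))
```

Both choices shrink the contribution of each residual branch at the start of training, so the signal passing through a long stack of blocks stays close to the identity mapping.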
196
+
197
+ <p align="center">
198
+ <img height="250" src="figures/train_deeper_neta.png">
199
+ <img height="250" src="figures/train_deeper_netb.png">
200
+ </p>
201
+
202
+ ## The influence of training patch size
203
+ We observe that training a deeper network benefits from a larger patch size. Moreover, the deeper model achieves more improvement (∼0.12 dB) than the shallower one (∼0.04 dB), since a larger model capacity can take full advantage of
204
+ larger training patch size. (Evaluated on the Set5 dataset with RGB channels.)
205
+ <p align="center">
206
+ <img height="250" src="figures/patch_a.png">
207
+ <img height="250" src="figures/patch_b.png">
208
+ </p>
209