0x90e committed on
Commit 651e715 · 1 Parent(s): 0d5c9d2

Readme fix.

Files changed (1):
  1. README.md +10 -174
README.md CHANGED
@@ -1,174 +1,10 @@
- # ESRGAN (Enhanced SRGAN) [[Paper]](https://arxiv.org/abs/1809.00219) [[BasicSR]](https://github.com/xinntao/BasicSR)
- ## :smiley: Training code is in the [BasicSR](https://github.com/xinntao/BasicSR) repo.
- ### Enhanced Super-Resolution Generative Adversarial Networks
- By Xintao Wang, [Ke Yu](https://yuke93.github.io/), Shixiang Wu, [Jinjin Gu](http://www.jasongt.com/), Yihao Liu, [Chao Dong](https://scholar.google.com.hk/citations?user=OSDCB0UAAAAJ&hl=en), [Yu Qiao](http://mmlab.siat.ac.cn/yuqiao/), [Chen Change Loy](http://personal.ie.cuhk.edu.hk/~ccloy/)
-
- This repo only provides simple testing code, pretrained models, and the network interpolation demo.
-
- ### **For full training and testing code, please refer to [BasicSR](https://github.com/xinntao/BasicSR).**
-
- We won first place in the [PIRM2018-SR competition](https://www.pirm2018.org/PIRM-SR.html) (region 3) with the best perceptual index.
- The paper was accepted to the [ECCV2018 PIRM Workshop](https://pirm2018.org/).
-
- :triangular_flag_on_post: Added [Frequently Asked Questions](https://github.com/xinntao/ESRGAN/blob/master/QA.md).
-
- > For instance,
- > 1. How to reproduce your results in the PIRM18-SR Challenge (with a low perceptual index)?
- > 2. How do you get the perceptual index in your ESRGAN paper?
-
- #### BibTeX
- <!--
-     @article{wang2018esrgan,
-         author={Wang, Xintao and Yu, Ke and Wu, Shixiang and Gu, Jinjin and Liu, Yihao and Dong, Chao and Loy, Chen Change and Qiao, Yu and Tang, Xiaoou},
-         title={ESRGAN: Enhanced super-resolution generative adversarial networks},
-         journal={arXiv preprint arXiv:1809.00219},
-         year={2018}
-     }
- -->
-     @InProceedings{wang2018esrgan,
-         author = {Wang, Xintao and Yu, Ke and Wu, Shixiang and Gu, Jinjin and Liu, Yihao and Dong, Chao and Qiao, Yu and Loy, Chen Change},
-         title = {ESRGAN: Enhanced super-resolution generative adversarial networks},
-         booktitle = {The European Conference on Computer Vision Workshops (ECCVW)},
-         month = {September},
-         year = {2018}
-     }
-
- <p align="center">
- <img src="figures/baboon.jpg">
- </p>
-
- The **RRDB_PSNR** PSNR-oriented model, trained with the DF2K dataset (a merged dataset of [DIV2K](https://data.vision.ee.ethz.ch/cvl/DIV2K/) and [Flickr2K](http://cv.snu.ac.kr/research/EDSR/Flickr2K.tar), proposed in [EDSR](https://github.com/LimBee/NTIRE2017)), is also able to achieve high PSNR performance.
-
- | <sub>Method</sub> | <sub>Training dataset</sub> | <sub>Set5</sub> | <sub>Set14</sub> | <sub>BSD100</sub> | <sub>Urban100</sub> | <sub>Manga109</sub> |
- |:---:|:---:|:---:|:---:|:---:|:---:|:---:|
- | <sub>[SRCNN](http://mmlab.ie.cuhk.edu.hk/projects/SRCNN.html)</sub> | <sub>291</sub> | <sub>30.48/0.8628</sub> | <sub>27.50/0.7513</sub> | <sub>26.90/0.7101</sub> | <sub>24.52/0.7221</sub> | <sub>27.58/0.8555</sub> |
- | <sub>[EDSR](https://github.com/thstkdgus35/EDSR-PyTorch)</sub> | <sub>DIV2K</sub> | <sub>32.46/0.8968</sub> | <sub>28.80/0.7876</sub> | <sub>27.71/0.7420</sub> | <sub>26.64/0.8033</sub> | <sub>31.02/0.9148</sub> |
- | <sub>[RCAN](https://github.com/yulunzhang/RCAN)</sub> | <sub>DIV2K</sub> | <sub>32.63/0.9002</sub> | <sub>28.87/0.7889</sub> | <sub>27.77/0.7436</sub> | <sub>26.82/0.8087</sub> | <sub>31.22/0.9173</sub> |
- | <sub>RRDB (ours)</sub> | <sub>DF2K</sub> | <sub>**32.73/0.9011**</sub> | <sub>**28.99/0.7917**</sub> | <sub>**27.85/0.7455**</sub> | <sub>**27.03/0.8153**</sub> | <sub>**31.66/0.9196**</sub> |
-
- ## Quick Test
- #### Dependencies
- - Python 3
- - [PyTorch >= 0.4](https://pytorch.org/) (CUDA version >= 7.5 if installing with CUDA. [More details](https://pytorch.org/get-started/previous-versions/))
- - Python packages: `pip install numpy opencv-python`
-
- ### Test models
- 1. Clone this GitHub repo.
-     ```
-     git clone https://github.com/xinntao/ESRGAN
-     cd ESRGAN
-     ```
- 2. Place your own **low-resolution images** in the `./LR` folder. (Two sample images, baboon and comic, are included.)
- 3. Download the pretrained models from [Google Drive](https://drive.google.com/drive/u/0/folders/17VYV_SoZZesU6mbxz2dMAIccSSlqLecY) or [Baidu Drive](https://pan.baidu.com/s/1-Lh6ma-wXzfH8NqeBtPaFQ) and place them in `./models`. We provide two models, one with high perceptual quality and one with high PSNR performance (see the [model list](https://github.com/xinntao/ESRGAN/tree/master/models)).
- 4. Run the test. We provide the ESRGAN model and the RRDB_PSNR model. (A minimal sketch of the test procedure follows this list.)
-     ```
-     python test.py models/RRDB_ESRGAN_x4.pth
-     python test.py models/RRDB_PSNR_x4.pth
-     ```
- 5. The results are in the `./results` folder.
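For orientation, here is a minimal inference sketch of what such a test script does: build the RRDB network, load a pretrained `.pth` file, and run one low-resolution image through it. The `RRDB_Net` constructor arguments mirror this repo's `test.py` as best we recall; treat them, and the file paths, as assumptions rather than a definitive implementation.

```python
# Minimal inference sketch; assumes this repo's `architecture` module.
# RRDB_Net arguments are assumptions based on test.py, not authoritative.
import cv2
import numpy as np
import torch

import architecture as arch  # provided by this repo

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = arch.RRDB_Net(3, 3, 64, 23, gc=32, upscale=4, norm_type=None,
                      act_type='leakyrelu', mode='CNA', upsample_mode='upconv')
model.load_state_dict(torch.load('models/RRDB_ESRGAN_x4.pth',
                                 map_location=device), strict=True)
model.eval().to(device)

# Read a BGR uint8 image, convert to an RGB float tensor in [0, 1].
img = cv2.imread('LR/baboon.png', cv2.IMREAD_COLOR).astype(np.float32) / 255.0
lr = torch.from_numpy(img[:, :, [2, 1, 0]].transpose(2, 0, 1)).unsqueeze(0).to(device)

with torch.no_grad():
    sr = model(lr).squeeze(0).clamp(0, 1).cpu().numpy()

# Back to BGR uint8 and save the 4x result.
sr = (sr.transpose(1, 2, 0)[:, :, [2, 1, 0]] * 255.0).round().astype(np.uint8)
cv2.imwrite('results/baboon_rlt.png', sr)
```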
- ### Network interpolation demo
- You can interpolate between the RRDB_ESRGAN and RRDB_PSNR models with an alpha in [0, 1]. (A sketch of the idea follows the steps below.)
-
- 1. Run `python net_interp.py 0.8`, where *0.8* is the interpolation parameter; you can change it to any value in [0, 1].
- 2. Run `python test.py models/interp_08.pth`, where *models/interp_08.pth* is the model path.
-
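The interpolation itself is a per-parameter blend of the two checkpoints, θ_interp = (1 − α) · θ_PSNR + α · θ_ESRGAN. A minimal sketch, with file paths assumed to match the model names above:

```python
# Sketch of network interpolation: blend two checkpoints parameter by parameter.
# Paths and the output name are assumptions; alpha = 0.8 matches the demo above.
import torch

alpha = 0.8
net_psnr = torch.load('models/RRDB_PSNR_x4.pth')
net_esrgan = torch.load('models/RRDB_ESRGAN_x4.pth')

net_interp = {
    k: (1 - alpha) * v_psnr + alpha * net_esrgan[k]
    for k, v_psnr in net_psnr.items()
}
torch.save(net_interp, 'models/interp_08.pth')
```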
- <p align="center">
- <img height="400" src="figures/43074.gif">
- </p>
-
- ## Perceptual-driven SR Results
-
- You can download all the results from [Google Drive](https://drive.google.com/drive/folders/1iaM-c6EgT1FNoJAOKmDrK7YhEhtlKcLx?usp=sharing). (:heavy_check_mark: included; :heavy_minus_sign: not included; :o: TODO)
-
- HR images can be downloaded from [BasicSR-Datasets](https://github.com/xinntao/BasicSR#datasets).
-
- | Datasets | LR | [*ESRGAN*](https://arxiv.org/abs/1809.00219) | [SRGAN](https://arxiv.org/abs/1609.04802) | [EnhanceNet](http://openaccess.thecvf.com/content_ICCV_2017/papers/Sajjadi_EnhanceNet_Single_Image_ICCV_2017_paper.pdf) | [CX](https://arxiv.org/abs/1803.04626) |
- |:---:|:---:|:---:|:---:|:---:|:---:|
- | Set5 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :o: |
- | Set14 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :o: |
- | BSDS100 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :o: |
- | [PIRM](https://pirm.github.io/) <br><sup>(val, test)</sup> | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark: | :heavy_check_mark: |
- | [OST300](https://arxiv.org/pdf/1804.02815.pdf) | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark: | :o: |
- | Urban100 | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark: | :o: |
- | [DIV2K](https://data.vision.ee.ethz.ch/cvl/DIV2K/) <br><sup>(val, test)</sup> | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark: | :o: |
-
- ## ESRGAN
- We improve on [SRGAN](https://arxiv.org/abs/1609.04802) in three aspects (a sketch of the third point follows this list):
- 1. We adopt a deeper model using Residual-in-Residual Dense Blocks (RRDB) without batch normalization layers.
- 2. We employ the [Relativistic average GAN](https://ajolicoeur.wordpress.com/relativisticgan/) instead of the vanilla GAN.
- 3. We improve the perceptual loss by using the features before activation.
-
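To make the third point concrete, here is a minimal sketch built on torchvision's VGG19 (an assumption; this is not this repo's training code). Truncating `vgg.features` at index 35 stops at conv5_4 *before* its ReLU, so the perceptual loss compares pre-activation features:

```python
# Sketch: VGG19 features *before* activation for the perceptual loss.
# In torchvision's vgg19, module 34 is conv5_4 and module 35 is its ReLU,
# so keeping modules [:35] yields pre-activation conv5_4 features.
import torch
import torch.nn as nn
from torchvision.models import vgg19

vgg = vgg19(pretrained=True)  # older torchvision API; newer uses `weights=`
feature_extractor = nn.Sequential(*list(vgg.features.children())[:35]).eval()
for p in feature_extractor.parameters():
    p.requires_grad = False

def perceptual_loss(sr: torch.Tensor, hr: torch.Tensor) -> torch.Tensor:
    """L1 distance between pre-activation VGG features of SR and HR images."""
    return nn.functional.l1_loss(feature_extractor(sr), feature_extractor(hr))
```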
- In contrast to SRGAN, which claimed that **deeper models are increasingly difficult to train**, our deeper ESRGAN model shows superior performance and is easy to train.
-
- <p align="center">
- <img height="120" src="figures/architecture.jpg">
- </p>
- <p align="center">
- <img height="180" src="figures/RRDB.png">
- </p>
-
- ## Network Interpolation
- We propose the **network interpolation strategy** to balance visual quality and PSNR.
-
- <p align="center">
- <img height="500" src="figures/net_interp.jpg">
- </p>
-
- We show a smooth animation with the interpolation parameter changing from 0 to 1.
- Interestingly, the network interpolation strategy provides smooth control between the RRDB_PSNR model and the fine-tuned ESRGAN model.
-
- <p align="center">
- <img height="480" src="figures/81.gif">
- &nbsp; &nbsp;
- <img height="480" src="figures/102061.gif">
- </p>
-
- ## Qualitative Results
- PSNR (evaluated on the Y channel) and the perceptual index used in the PIRM-SR challenge are also provided for reference.
-
- <p align="center">
- <img src="figures/qualitative_cmp_01.jpg">
- </p>
- <p align="center">
- <img src="figures/qualitative_cmp_02.jpg">
- </p>
- <p align="center">
- <img src="figures/qualitative_cmp_03.jpg">
- </p>
- <p align="center">
- <img src="figures/qualitative_cmp_04.jpg">
- </p>
-
- ## Ablation Study
- Overall visual comparisons showing the effects of each component in
- ESRGAN. Each column represents a model with its configuration at the top.
- The red sign indicates the main improvement over the previous model.
- <p align="center">
- <img src="figures/abalation_study.png">
- </p>
-
- ## BN artifacts
- We empirically observe that BN layers tend to introduce artifacts. These artifacts,
- namely BN artifacts, occasionally appear across iterations and different settings,
- undermining the stable performance needed during training. We find that
- network depth, BN position, training dataset, and training loss
- all influence the occurrence of BN artifacts.
- <p align="center">
- <img src="figures/BN_artifacts.jpg">
- </p>
-
- ## Useful techniques to train a very deep network
- We find that residual scaling and smaller initialization can help to train a very deep network. (A sketch follows.) More details are in the supplementary file of our [paper](https://arxiv.org/abs/1809.00219).
-
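As an illustration only (the scale values here are common choices used as assumptions, not the paper's exact hyperparameters), residual scaling multiplies the residual branch by a small constant before the skip connection, and "smaller initialization" shrinks the usual Kaiming weights:

```python
# Illustrative sketch of residual scaling and smaller initialization.
# The 0.2 residual scale and 0.1 init scale are assumed values.
import torch
import torch.nn as nn


class ScaledResidualBlock(nn.Module):
    def __init__(self, channels: int = 64, res_scale: float = 0.2):
        super().__init__()
        self.res_scale = res_scale
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scale the residual branch down before adding the skip connection.
        return x + self.res_scale * self.body(x)


def smaller_init(module: nn.Module, scale: float = 0.1) -> None:
    """Kaiming init, then shrink the weights to ease very deep training."""
    if isinstance(module, nn.Conv2d):
        nn.init.kaiming_normal_(module.weight, a=0.2, mode='fan_in')
        module.weight.data.mul_(scale)
        if module.bias is not None:
            nn.init.zeros_(module.bias)
```

A network would then apply the initialization with `net.apply(smaller_init)`.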
- <p align="center">
- <img height="250" src="figures/train_deeper_neta.png">
- <img height="250" src="figures/train_deeper_netb.png">
- </p>
-
- ## The influence of training patch size
- We observe that training a deeper network benefits from a larger patch size. Moreover, the deeper model achieves more improvement (~0.12 dB) than the shallower one (~0.04 dB), since the larger model capacity can take full advantage of the larger training patch size. (Evaluated on the Set5 dataset with RGB channels.)
- <p align="center">
- <img height="250" src="figures/patch_a.png">
- <img height="250" src="figures/patch_b.png">
- </p>
 
+ ---
+ title: ESRGAN MANGA
+ emoji: 🏃
+ colorFrom: red
+ colorTo: indigo
+ sdk: gradio
+ sdk_version: 3.12.0
+ app_file: app.py
+ pinned: false
+ ---