
Notice

OpenModelDB is still in alpha and actively being worked on. Please feel free to share your feedback and report any bugs you find.

The best place to find AI Upscaling models

OpenModelDB is a community-driven database of AI upscaling models. We aim to provide a better way to find and compare models than existing sources.

MoSR
2x
2x-AnimeSharpV2_MoSR_Sharp
GitHub Link: https://github.com/Kim2091/Kim2091-Models/releases/tag/2x-AnimeSharpV2_Set
This is my first anime model in years. Hopefully you guys can find a good use case for it. Included are 4 models:
1. RealPLKSR (higher quality, slower)
2. MoSR (lower quality, faster)
There are Sharp and Soft versions of both.
When to use each:
- __Sharp:__ For heavily degraded sources. Sharp models have issues with depth of field but are best at removing artifacts.
- __Soft:__ For cleaner sources. Soft models preserve depth of field but may not remove other artifacts as well.
__Notes:__
- MoSR doesn't work in chaiNNer currently. To use MoSR, either use the ONNX version in tools like VideoJaNai, or update spandrel in the latest version of ComfyUI.
- The ONNX version may produce slightly different results than the .pth version. If you have issues, try the .pth model.
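For anyone scripting this outside those tools, here is a minimal Python sketch of running the .pth release with spandrel, assuming a spandrel build recent enough to include the MoSR architecture (file names are illustrative):

```python
# Minimal sketch: run the .pth release with spandrel on CPU.
# Assumes an up-to-date spandrel that recognizes the MoSR arch.
import numpy as np
import torch
from PIL import Image
from spandrel import ModelLoader

model = ModelLoader().load_from_file("2x-AnimeSharpV2_MoSR_Sharp.pth").eval()

img = np.asarray(Image.open("input.png").convert("RGB"), dtype=np.float32) / 255.0
lr = torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0)  # 1x3xHxW in [0, 1]

with torch.inference_mode():
    sr = model(lr)  # 1x3x(2H)x(2W) for this 2x model

out = (sr.squeeze(0).permute(1, 2, 0).clamp(0, 1).numpy() * 255).astype(np.uint8)
Image.fromarray(out).save("output.png")
```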
MoSR
2x
2x-AnimeSharpV2_MoSR_Soft
Same release notes as 2x-AnimeSharpV2_MoSR_Sharp above; all four models are part of the 2x-AnimeSharpV2 set.
RealPLKSR
2x
2x-AnimeSharpV2_RPLKSR_Sharp
Same release notes as 2x-AnimeSharpV2_MoSR_Sharp above; all four models are part of the 2x-AnimeSharpV2 set.
RealPLKSR
2x
2x-AnimeSharpV2_RPLKSR_Soft
Same release notes as 2x-AnimeSharpV2_MoSR_Sharp above; all four models are part of the 2x-AnimeSharpV2 set.
RealPLKSR_dysample
4x
4x-PBRify_RPLKSRd_V3
PBRify Github: https://github.com/Kim2091/PBRify_Remix Release Link: https://github.com/Kim2091/Kim2091-Models/releases/tag/4x-PBRify_RPLKSRd_V3 This update brings a new upscaling model, 4x-PBRify_RPLKSRd_V3. This model is roughly 8x faster than the current DAT2 model, while being *higher quality*. It produces far more natural detail, resolves lines and edges more smoothly, and cleans up compression artifacts better. As a result of those improvements, PBR is also much improved. It tends to be clearer with less defined artifacts. However, this model is currently **only compatible with ComfyUI**. chaiNNer has not yet been updated to support this architecture. More Comparisons
MoSR
2x
2xAoMR_mosr
A 2x MoSR upscaling model for game textures. Link to the Github release with more info on the process. 2xAoMR_mosr Scale: 2 Architecture: MoSR Architecture Option: mosr Author: Philip Hofmann License: CC-BY-4.0 Subject: Game Textures Input Type: Images Release Date: 21.09.2024 Dataset: Game Textures from Age of Mythology: Retold Dataset Size: 13'847 OTF (on the fly augmentations): No Pretrained Model: 4xNomos2_hq_mosr Iterations: 510'000 Batch Size: 4 Patch Size: 64 ## Description: In short: a 2x game texture MoSR upscaling model, trained on and for (but not limited to) Age of Mythology: Retold textures. Since I have been playing Age of Mythology: Retold (casual player), I thought it would be interesting to train a single-image super-resolution model on (and for) game textures of AoMR, but this model should be usable for other game textures as well. This is a 2x model; since the biggest texture images are already 4096x4096, I thought going 4x on those would be overkill (also, 4x game texture upscaling models already exist, so this model can be used for similar cases where 4x is not needed). Model Showcase: Slowpics
RealPLKSR
4x
4xPurePhoto-RealPLSKR
Skilled in working with cats, hair, parties, and creating clear images. Also proficient in resizing photos and enlarging large, sharp images. Can effectively improve images from small sizes as well (300px at smallest on one side, depending on the subject). Experienced in experimenting with techniques like upscaling with this model twice and then reducing it by 50% to enhance details, especially in features like hair or animals.
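The double-pass trick described above is easy to script; a hedged sketch, where `run_model` is a hypothetical callback wrapping whatever inference path you actually use (chaiNNer, spandrel, etc.):

```python
# Sketch of "upscale twice, then halve" to condense extra detail.
# run_model is a hypothetical callback wrapping your actual inference.
from PIL import Image

def enhance_details(path: str, run_model) -> Image.Image:
    img = Image.open(path).convert("RGB")
    img = run_model(img)  # first 4x pass
    img = run_model(img)  # second 4x pass
    # Reduce by 50% with a high-quality filter, as the description suggests.
    return img.resize((img.width // 2, img.height // 2), Image.LANCZOS)
```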
DRCT
4x
4xNomos2_hq_drct-l
A 4x upscaling model. Link to Github Release 4xNomos2_hq_drct-l Scale: 4 Architecture: DRCT Architecture Option: drct_l Author: Philip Hofmann License: CC-BY-4.0 Subject: Photography Input Type: Images Release Date: 08.09.2024 Dataset: nomosv2 Dataset Size: 6000 OTF (on the fly augmentations): No Pretrained Model: DRCT-L_X4 Iterations: 200'000 Batch Size: 2 Patch Size: 64 Description: A DRCT-L 4x upscaling model, similar to the 4xNomos2_hq_atd, 4xNomos2_hq_dat2 and 4xNomos2_hq_mosr models, trained on and meant for non-degraded input, to give good quality output. Model Showcase: Slowpics
ATD
4x
4xNomos2_hq_atd
A 4x upscaling model. Link to Github Release 4xNomos2_hq_atd Scale: 4 Architecture: ATD Architecture Option: atd Author: Philip Hofmann License: CC-BY-4.0 Subject: Photography Input Type: Images Release Date: 05.09.2024 Dataset: nomosv2 Dataset Size: 6000 OTF (on the fly augmentations): No Pretrained Model: 003_ATD_SRx4_finetune Iterations: 180'000 Batch Size: 2 Patch Size: 48 Norm: true Description: An ATD 4x upscaling model, similar to the 4xNomos2_hq_dat2 or 4xNomos2_hq_mosr models, trained on and meant for non-degraded input, to give good quality output.
DAT
4x
4xNomos2_hq_dat2
A 4x upscaling model. Link to Github Release # 4xNomos2_hq_dat2 Scale: 4 Architecture: DAT Architecture Option: dat2 Author: Philip Hofmann License: CC-BY-4.0 Subject: Photography Input Type: Images Release Date: 29.08.2024 Dataset: nomosv2 Dataset Size: 6000 OTF (on the fly augmentations): No Pretrained Model: DAT_2_x4 Iterations: 140'000 Batch Size: 4 Patch Size: 48 Description: A DAT2 4x upscaling model, similar to the 4xNomos2_hq_mosr model, trained on and meant for non-degraded input, to give good quality output. I scored 7 validation outputs of each of the 21 checkpoints (10k-210k) of this model training with 68 metrics. The metric scores can be found in this google sheet, and the corresponding image files for this scoring can be found here. The release checkpoint was selected by looking at the scores, manually inspecting, and then getting responses on discord to this quick visual test (A, B or C, denoting different checkpoints): https://slow.pics/c/8Akzj6rR Checkpoint B is the one released here, but you can also try out Checkpoint A or Checkpoint C if you like them better. ## Model Showcase: Slowpics
RealPLKSR
2x
Text2HD v.1
A 2x model for upscaling very low-quality text to normal quality. The model is specifically designed to enhance lower-quality text images, improving their clarity and readability by upscaling them 2x. It excels at processing moderately sized text, effectively transforming it into high-quality, legible scans. However, the model may encounter challenges with very small text, as its performance is optimized for text above a certain minimum size. For best results, input images should contain text that is not excessively small.
RealPLKSR
2x
VHS2HD
A 2x model for VHS restoration. An advanced VHS restoration model designed to enhance video quality by reducing artifacts such as haloing, ghosting, and noise patterns. Optimized primarily for PAL resolution (NTSC may work well too).
MoSR
4x
4xNomos2_hq_mosr
A 4x upscaling model. Link to Github Release # 4xNomos2_hq_mosr Scale: 4 Architecture: MoSR Architecture Option: mosr Author: Philip Hofmann License: CC-BY-4.0 Subject: Photography Input Type: Images Release Date: 25.08.2024 Dataset: nomosv2 Dataset Size: 6000 OTF (on the fly augmentations): No Pretrained Model: 4xmssim_mosr_pretrain Iterations: 190'000 Batch Size: 6 Patch Size: 64 Description: A 4x MoSR upscaling model, meant for non-degraded input, since it was trained on non-degraded input to give good quality output. If your input is degraded, use a 1x restoration model first; for example, if your input is a .jpg file, you could use a 1x dejpg model first. Model Showcase: Slowpics
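A hedged sketch of that two-stage pipeline in Python with spandrel; the model file names are illustrative, and any 1x restoration model can stand in for the dejpg stage:

```python
# Sketch: 1x restoration pass, then the 4x upscale, both loaded via spandrel.
import torch
from spandrel import ModelLoader

loader = ModelLoader()
dejpg = loader.load_from_file("1xDeJPG_realplksr_otf.pth").eval()
upscale = loader.load_from_file("4xNomos2_hq_mosr.pth").eval()

def restore_then_upscale(lr: torch.Tensor) -> torch.Tensor:
    """lr: 1x3xHxW float tensor in [0, 1]."""
    with torch.inference_mode():
        clean = dejpg(lr)      # 1x pass removes the compression first
        return upscale(clean)  # then the 4x pass sees cleaner input
```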
Compact
1x
1x BW Denoise
A fast upscaler intended as a replacement for standard methods of cleaning noise from the picture. Ideal for BDRemux 720p and 1080p, but only if the noise is monochrome or close to it. Work is underway on a dataset that could eliminate color noise at the same speed.
Compact
1x
1x RGB max Denoise
A fast upscaler intended as a replacement for standard methods of cleaning noise from the picture. Ideal for BDRemux or WEB-DL 720p or 1080p sources. It gets rid of color noise well; the smaller its grain size, the more uniform the result will be across videos. If you need a more stable video, you can also add 2x-DigitalFlim-SuperUltraCompact after this upscaler. My upscaler can slightly simplify the geometry of very small objects.
RealPLKSR
1x
1xDeH264_realplksr
A 1x model for restoration (H264). Link to Github Release # 1xDeH264_realplksr Scale: 1 Architecture: RealPLKSR Architecture Option: realplksr Author: Philip Hofmann License: CC-BY-4.0 Subject: Photography Input Type: Images Release Date: 08.08.2024 Dataset: nomosv2 Dataset Size: 6000 OTF (on the fly augmentations): No Pretrained Model: 1xDeJPG_realplksr_otf Iterations: 210'000 Batch Size: 8 Patch Size: 64 Description: A 1x DeH264 model to remove H264 compression. Showcase: Slowpics Imgsli
RealPLKSR
1x
1xDeNoise_realplksr_otf
A 1x model for restoration (denoise). Link to Github Release # 1xDeNoise_realplksr_otf Scale: 1 Architecture: RealPLKSR Architecture Option: realplksr Author: Philip Hofmann License: CC-BY-4.0 Subject: Photography Input Type: Images Release Date: 08.08.2024 Dataset: nomosv2 Dataset Size: 6000 OTF (on the fly augmentations): Yes Pretrained Model: 1xDeJPG_realplksr_otf Iterations: 200'000 Batch Size: 8 Patch Size: 64 Description: A 1x RealPLKSR model to denoise, trained with the Real-ESRGAN OTF pipeline; it also handles a bit of JPG compression (if stronger JPG compression handling is needed, 1xDeJPG_realplksr_otf can be used).
RealPLKSR_dysample
4x
4xArtFaces_realplksr_dysample
A 4x model for restoration. Link to Github Release # 4xArtFaces_realplksr_dysample Scale: 4 Architecture: RealPLKSR with Dysample Architecture Option: realplksr Author: Philip Hofmann License: CC-BY-4.0 Subject: Art Input Type: Images Release Date: 08.08.2024 Dataset: ArtFaces Dataset Size: 5'630 OTF (on the fly augmentations): No Pretrained Model: 4xNomos2_realplksr_dysample Iterations: 139'000 Batch Size: 6 Patch Size: 64 Description: A Dysample RealPLKSR 4x upscaling model for art / painted faces. Based on my ArtFaces dataset, a curated version of the metfaces dataset for the purpose of training single-image super-resolution models.
ESRGAN
4x
AnalogFrames 1.0
by Muf
Restores frames to their original look as drawings, before the analog transfer. Trained on thousands of 4K HQ drawing scans, the purpose of this model is to restore the frame to its original paper-and-plastic look, with a particular emphasis on textures and on preserving brushstrokes and natural lines. Not to be used with heavily grainy sources. Note: the random white speckles are an unfortunate side effect of not filtering the dataset.
ESRGAN
4x
NumericFrames 2.0
by Muf
Upscaling of Numeric Animation from SD to HD with high texture retention. A serious upgrade over 1.0; much more universal. Can look good on both clean SD and noisy/grainy source frames. It still has a weakness against sources with strong halos, jagged edges and rainbowing (it is recommended to fix these issues beforehand with AviSynth).
ESRGAN
4x
NumericFrames 2.1-Aggressive
by Muf
Upscaling of Numeric Animation from SD to HD with high texture retention. Provides the sharpest results yet, about 30% better than the base 2.0 version. But as the title says, it's the most aggressive version of NF. It does its best on animation with little to no textures. This model, like any version of NF, is not at all suitable for overly deteriorated/blurred, low-quality TVRips and VHSRips of animation.
ESRGAN
4x
NumericFrames 2.5
by Muf
Upscaling of Numeric Animation from SD to HD with high texture retention. A big leap from 2.0, especially as far as texture retention is concerned. It also deals much better with distant characters/shots.
RealPLKSR_dysample
4x
4xHFA2k_ludvae_realplksr_dysample
A 4x model for restoration. Link to Github Release 4xHFA2k_ludvae_realplksr_dysample Scale: 4 Architecture: RealPLKSR with Dysample Architecture Option: realplksr Author: Philip Hofmann License: CC-BY-4.0 Subject: Anime Input Type: Images Release Date: 13.07.2024 Dataset: HFA2k_LUDVAE Dataset Size: 10'272 OTF (on the fly augmentations): No Pretrained Model: 4xNomos2_realplksr_dysample Iterations: 165'000 Batch Size: 12 GT Size: 256 Description: A Dysample RealPLKSR 4x upscaling model for anime single-image super-resolution. The dataset has been degraded using DM600_LUDVAE for more realistic noise/compression. Downscaling algorithms used were imagemagick box, triangle, catrom, lanczos and mitchell. Blurs applied were gaussian, box and lens blur (using chaiNNer). Some images were further compressed using -quality 75-92. Down-up was applied to roughly 10% of the dataset (5 to 15% variation in size). Degradation orders were shuffled to give as many variations as possible. Examples are inferenced with the neosr test script and the released pth file. I include the test images as a zip file in this release, together with the model outputs, so others can test their models against these test images as well to compare. ONNX conversions are static, since Dysample doesn't allow dynamic conversion; I tested the conversions with chaiNNer. Showcase: Slowpics
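Since those ONNX exports are static, a runner has to match the input shape baked in at export time; a hedged onnxruntime sketch (the file name is illustrative, and the shape is read from the session rather than assumed):

```python
# Sketch: run a static ONNX export with onnxruntime.
# The input shape is fixed at export time, so read it from the session
# and pad/tile your frames to match it.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession(
    "4xHFA2k_ludvae_realplksr_dysample.onnx",
    providers=["CPUExecutionProvider"],
)
inp = sess.get_inputs()[0]
print(inp.name, inp.shape)  # static export: all four NCHW dims are fixed ints

lr = np.zeros(inp.shape, dtype=np.float32)  # stand-in input in [0, 1]
sr = sess.run(None, {inp.name: lr})[0]      # 4x the spatial dims for this model
```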
RealPLKSR
1x
1xDeJPG_realplksr_otf
A 1x model for JPEG restoration. Link to Github Release 1xDeJPG_realplksr_otf Scale: 1 Architecture: RealPLKSR Architecture Option: realplksr Author: Philip Hofmann License: CC-BY-4.0 Subject: Photography Input Type: Images Release Date: 09.07.2024 Dataset: nomosv2 Dataset Size: 6000 OTF (on the fly augmentations): Yes Pretrained Model: this one used the 60 model as pretrain, while I used 4xNomos2_realplksr_dysample as a pretrain for the 60 model. Iterations: 196'000 Batch Size: 24 GT Size: 64 Description: A 1x dejpg model, trained with OTF down to JPG quality 40 in both degradation processes; see the yml config file for details. I am also releasing the 1xDeJPG_realplksr_otf_60 model, which was trained with less JPG compression in comparison (down to 60 in both cases), in the hope that it gives a bit higher quality on less compressed inputs. The 40 model is released as the default, since it can handle more compressed inputs in general. Since these are trained with OTF, they will slightly deblur. The examples here are compressed, then recompressed, with JPG 40. Showcase: Slowpics Imgsli
RealPLKSR
4x
4xNomos2_realplksr_dysample
A 4x pretrain model. 4xNomos2_realplksr_dysample Scale: 4 Architecture: RealPLKSR with Dysample Architecture Option: realplksr Github Release Author: Philip Hofmann License: CC-BY-4.0 Subject: Photography Input Type: Images Release Date: 30.06.2024 Dataset: nomosv2 Dataset Size: 6000 OTF (on the fly augmentations): No Pretrained Model: 4xmssim_realplksr_dysample_pretrain Iterations: 185'000 Batch Size: 8 GT Size: 256, 512 Description: A Dysample RealPLKSR 4x upscaling model that was trained with / handles JPG compression down to 70 on the Nomosv2 dataset and preserves DoF. Based on the 4xmssim_realplksr_dysample_pretrain I released 3 days ago. This model affects/saturates colors, which can be counteracted a bit by using wavelet color fix, as in these examples. Showcase: Slowpics
ESRGAN
4x
4xNomos2_otf_esrgan
A 4x model for restoration. 4xNomos2_otf_esrgan Scale: 4 Architecture: ESRGAN Architecture Option: esrgan Github Release Link Author: Philip Hofmann License: CC-BY-4.0 Subject: Photography Input Type: Images Release Date: 22.06.2024 Dataset: Nomos-v2 Dataset Size: 6000 OTF (on the fly augmentations): Yes Pretrained Model: RealESRGAN_x4plus Iterations: 246'000 Batch Size: 8 GT Size: 256 Description: A 4x ESRGAN model for photography, trained using the Real-ESRGAN OTF degradation pipeline. Showcase: Slow Pics 8 Examples
ESRGAN
4x
4xNomosWebPhoto_esrgan
A 4x model for restoration. I wanted to release an ESRGAN model simply because I had not trained one for quite a while and wanted to revisit this older arch for the current series. 4xNomosWebPhoto_esrgan Scale: 4 Architecture: ESRGAN Architecture Option: esrgan Github Release Link Author: Philip Hofmann License: CC-BY-4.0 Subject: Photography Input Type: Images Release Date: 16.06.2024 Dataset: Nomos-v2 Dataset Size: 6000 OTF (on the fly augmentations): No Pretrained Model: RealESRGAN_x4plus Iterations: 210'000 Batch Size: 12 GT Size: 256 Description: A 4x ESRGAN model for photography, trained with realistic noise, lens blur, JPG and WEBP re-compression. The ESRGAN version of 4xNomosWebPhoto_RealPLKSR, trained on the same dataset and in a similar way. For more information look into the 4xNomosWebPhoto_RealPLKSR release and the pdf file in its attachments.
SPAN
2x
AniSD Suite (Multiple Models)
## AniSD Suite (15 models) **Scale:** 2x (1x for AniSD DB) **Architecture:** SPAN / Compact / SwinIR Small / CRAFT / DAT2 / RPLKSR **Dataset:** Anime frames. Credits to @.kuronoe. and @pwnsweet (EVA dataset) for their contributions to the dataset! **Dataset Size:** ~7,000 - ~13,000 AniSD is a suite of 15 (at the time of writing) specialized SISR models trained to restore and upscale standard-definition digital anime from the 2000s onwards, including both WEB and DVD releases. Faithfulness to the source and natural-looking output are the guiding principles behind the training of the AniSD models. This means avoiding oversharpened output (which can look especially absurd on standard-definition sources), minimizing upscaling artifacts, retaining the natural detail of the source and, of course, fixing the standard range of issues found in many DVD/WEB releases (chroma issues, compression, haloing/ringing, blur, dot crawl, banding etc.). Refer to the infographic above for a quick breakdown of the available models, and refer to the Github release for further information.
ATD
4x
4xNomosWebPhoto_atd
A 4x model for restoration. 4xNomosWebPhoto_atd Scale: 4 Architecture: ATD Architecture Option: atd Github Release Link Author: Philip Hofmann License: CC-BY-4.0 Subject: Photography Input Type: Images Release Date: 07.06.2024 Dataset: Nomos-v2 Dataset Size: 6000 OTF (on the fly augmentations): No Pretrained Model: 003_ATD_SRx4_finetune.pth Iterations: 460'000 Batch Size: 6, 2 GT Size: 128, 192 Description: A 4x ATD model for photography, trained with realistic noise, lens blur, JPG and WEBP re-compression. The ATD version of 4xNomosWebPhoto_RealPLKSR, trained on the same dataset and in the same way. For more information look into the 4xNomosWebPhoto_RealPLKSR release and the pdf file in its attachments. Showcase: Slow Pics 18 Examples
ESRGAN
2x
2x Pooh V4
A 2x model for compression removal, noise reduction, line correction, and MPEG2 / LD artifact removal. This is my first model release. It was inspired by a personal project I have been pursuing for some time, which this model aims to address. The model upscales low-resolution hand-drawn animation from the 1970s to 2000. The colors are retained, with effective noise control. Details and textures are maintained to a good degree, considering animation sources. Color spills are also corrected, depending on the colors; shades of white and yellow have been difficult. It also makes the lines slightly sharper and thinner, which could be a plus depending on your source. The model is also temporally stable across my tests, with few observable issues. **Showcase:** Images - https://imgsli.com/MjYwNzY1/12/13 Video Sample - https://t.ly/Jp7-w vs Upscale - https://t.ly/PdsKs
DAT
4x
4x-PBRify_UpscalerDAT2_V1
Project page: https://github.com/Kim2091/PBRify_Remix Yet another model in the PBRify_Remix series. This is a new upscaler to replace the previous 4x-PBRify_UpscalerSIR-M_V2 model. It far exceeds the quality of its predecessor, with far more natural detail generation and better reconstruction of lines and edges. Comparison: https://slow.pics/c/DCjlXPGb
Compact
1x
1xSkinContrast-High-SuperUltraCompact
A 1x model designed for skin contrast (although some backgrounds suffer a little modification), also try the other SkinContrast models
Compact
1x
1xSkinContrast-HighAlternative-SuperUltraCompact
A 1x model designed for skin contrast (although some backgrounds suffer a little modification). Some images may suffer from artifacts; High Alternative is the variant that generates the most artifacts. Also try the other SkinContrast models.
Compact
1x
1xSkinContrast-SuperUltraCompact
A 1x model designed for skin contrast (although some backgrounds suffer a little modification), also try the other SkinContrast models
RealPLKSR
4x
4xNomosWebPhoto_RealPLKSR
A 4x model for restoration. 4xNomosWebPhoto_RealPLKSR Scale: 4 Architecture: RealPLKSR Architecture Option: realplksr Link to Github Release Author: Philip Hofmann License: CC-BY-4.0 Subject: Photography Input Type: Images Release Date: 28.05.2024 Dataset: Nomos-v2 Dataset Size: 6000 OTF (on the fly augmentations): No Pretrained Model: 4x_realplksr_gan_pretrain Iterations: 404'000, 445'000 Batch Size: 12, 4 GT Size: 128, 256, 512 Description: In short: a 4x RealPLKSR model for photography, trained with realistic noise, lens blur, JPG and WEBP re-compression. In full: the newest version of my RealWebPhoto series, this time using the newly released Nomos-v2 dataset by musl. I made 12 different low-resolution degraded folders, using Kim's datasetdestroyer for scaling and compression, my Ludvae200 model for realistic noise, and umzi's wtp_dataset_destroyer with its floating-point lens blur implementation for better control. I then mixed them together into a single LR folder and trained for 460'000 iters, checked the results, and upon Kim's suggestion of using interpolation, I tested and am releasing this interpolation between the 404'000 and 445'000 checkpoints. This model has been trained on neosr using mixup, cutmix, resizemix, cutblur, nadam, unet, multisteplr, mssim, perceptual, gan, dists, ldl, ff, color and lumaloss, and interpolated using the current chaiNNer nightly version. This model took multiple retrainings and reworks of the dataset until I was satisfied enough with the quality to release this version. For more details on the whole process, see the pdf file in the attachment. I am also attaching the 404'000, 445'000 and 460'000 checkpoints for completeness. PS: in general, degradation strengths have been reduced/adjusted in comparison to my previous RealWebPhoto models. Showcase: Slow Pics 10 Examples
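The release itself was interpolated with chaiNNer, but the underlying idea is just a linear blend of checkpoint weights; a hedged PyTorch sketch (file names and the 50/50 ratio are illustrative, and neosr checkpoints may nest their weights under a "params"/"params_ema" key):

```python
# Sketch: linear interpolation between two checkpoints of the same network.
import torch

a = torch.load("net_g_404000.pth", map_location="cpu")
b = torch.load("net_g_445000.pth", map_location="cpu")
# If the files nest weights under a "params"/"params_ema" key, unwrap first.

alpha = 0.5  # illustrative blend ratio between the two checkpoints
merged = {k: (1 - alpha) * a[k] + alpha * b[k] for k in a}

torch.save(merged, "interpolated.pth")
```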
Compact
2x
MLP StarSample V1.0
This is a model for the restoration of My Little Pony: Friendship is Magic, however it also works decently well on all similar art. It was trained in 2x on ground truth 3840x2160 HRs and 1920x1080 LRs of varying compression, so it is able to upscale from 1080p to 2160p, where its detail retention is great, however it may create noticeable artifacting if looked at closely, like areas of randomly coloured pixels along edges. In 1x or 1.5x (2x upscaled and then downscaled back down) it performs extremely well, almost perfectly in fact, in correcting colours, removing compression, and crisping up lines - and this is the way the model is intended to be used (hence the acronym of its name being "SS", or "supersampling"). **Github Release** **Showcase:** https://slow.pics/s/1ixqCSjy
SwinIR
4x
4x-PBRify_UpscalerSIR-M_V2
This is part of my PBRify_Remix project. This is a much more capable model based on SwinIR Medium, which should strike a balance between capacity for learning + inference speed. It appears to have done so :) ### Replaced by 4x-PBRify_UpscalerDAT2_V1
MoSR
4x
4xmssim_mosr_pretrain
A 4x pretrain model. Link to Github Release # 4xmssim_mosr_pretrain Scale: 4 Architecture: MoSR Architecture Option: mosr Author: Philip Hofmann License: CC-BY-4.0 Subject: Photography Input Type: Images Release Date: 25.08.2024 Dataset: nomosv2 Dataset Size: 6000 OTF (on the fly augmentations): No Pretrained Model: None Iterations: 420'000 Batch Size: 6 Patch Size: 64 Description: A 4x MoSR pretrain, trained with mssim only on the Nomos2 dataset, usable as a pretrain when starting to train new MoSR models. Training metrics graphs: see the Github release. Showcase: Slowpics
RGT
4x
4xTextures_GTAV_rgt-s_dither
## 4xTextures_GTAV_rgt-s_dither **Scale:** 4 **Architecture:** RGT **Architecture Option:** RGT-S **Author:** Philip Hofmann **License:** CC-BY-4.0 **Purpose:** Restoration **Subject:** Game Textures **Input Type:** Images **Release Date:** 08.05.2024 **Dataset:** GTAV_512_Textures **Dataset Size:** 7061 **OTF (on the fly augmentations):** No **Pretrained Model:** 4xTextures_GTAV_rgt-s **Iterations:** 128'000 **Batch Size:** 6,4 **GT Size:** 128,256 **Description:** A model to upscale game textures, trained on GTAV textures; handles JPG compression down to 80 and was trained with dithering. Basically the previous 4xTextures_GTAV_rgt-s model, but extended to handle dithering. **Showcase:** Slow Pics 25 Examples
RGT
4x
4xTextures_GTAV_rgt-s
Github Release Link ## 4xTextures_GTAV_rgt-s **Scale:** 4 **Architecture:** RGT **Architecture Option:** RGT-S **License:** CC-BY-4.0 **Purpose:** Restoration **Subject:** Game Textures **Input Type:** Images **Release Date:** 04.05.2024 **Dataset:** GTAV_512_Textures **Dataset Size:** 8492 **OTF (on the fly augmentations):** No **Pretrained Model:** RGT_S_x4 **Iterations:** 165'000 **Batch Size:** 6,4 **GT Size:** 128,256 **Description:** A model to upscale game textures, trained on GTAV textures; handles JPG compression down to 80. **Showcase:** Slow Pics
DRCT
4x
4xRealWebPhoto_v4_drct-l
Link to Github Release ## 4xRealWebPhoto_v4_drct-l **Scale:** 4 **Architecture:** DRCT **Architecture Option:** DRCT-L **Author:** Philip Hofmann **License:** CC-BY-4.0 **Purpose:** Restoration **Subject:** Realistic, Photography **Input Type:** Images **Release Date:** 02.05.2024 **Dataset:** 4xRealWebPhoto_v4 **Dataset Size:** 8492 **OTF (on the fly augmentations):** No **Pretrained Model:** 4xmssim_drct-l_pretrain **Iterations:** 260'000 **Batch Size:** 6,4 **GT Size:** 128,192 **Description:** The first real-world DRCT model (or at least my attempt at one), so I am releasing it; maybe others will be able to get better results than me. I would recommend my 4xRealWebPhoto_v3_atd model over this one if a real-world model for upscaling photos downloaded from the web is desired. This model is based on my previously released DRCT pretrain. Used mixup, cutmix, resizemix augmentations, and mssim, perceptual, gan, dists, ldl, focalfrequency, gradvar, color and luma losses. **Showcase:** Slow.pics
DAT
4x
4xRealWebPhoto_v4_dat2
Link to Github Release ## 4xRealWebPhoto_v4_dat2 **Scale:** 4 **Architecture:** DAT **Author:** Philip Hofmann **License:** CC-BY-4.0 **Purpose:** Compression Removal, Deblur, Denoise, JPEG, WEBP, Restoration **Subject:** Photography **Input Type:** Images **Date:** 04.04.2024 **Architecture Option:** DAT-2 **I/O Channels:** 3(RGB)->3(RGB) **Dataset:** Nomos8k **Dataset Size:** 8492 **OTF (on the fly augmentations):** No **Pretrained Model:** DAT_2_x4 **Iterations:** 243'000 **Batch Size:** 4-6 **GT Size:** 128-256 **Description:** 4x Upscaling Model for Photos from the Web. The dataset consists of only downscaled photos (to handle good quality), downscaled and compressed photos (uploaded to the web and compressed by service provider), and downscale, compressed, rescaled, recompressed photos (downloaded from the web and re-uploaded to the web). Applied lens blur, realistic noise with my ludvae200 model, JPG and WEBP compression (40-95), and down_up, linear, cubic_mitchell, lanczos, gaussian and box downsampling algorithms. For details on the degradation process, check out the pdf with its explanations and visualizations. This is basically a dat2 version of my previous 4xRealWebPhoto_v3_atd model, but trained with a bit stronger noise values, and also a single image per variant so drastically reduced training dataset size. **Showcase:** 12 Slowpics Examples
SPAN
1x
1x-PBRify_NormalV3
This is part of a larger model set, PBRify_Remix. PBRify_Remix is an easy way to upscale and generate PBR textures using existing low quality textures. The dataset consists of ethically sourced textures from websites like ambientCG and Polyhaven, which sets it apart from most other models. It's intended for use with RTX Remix (hence the name) but it'll work for other things as well.
SPAN
1x
1x-PBRify_Height
Part of the PBRify_Remix model set; see 1x-PBRify_NormalV3 above. Note: The height map model is best used after applying a "delighting" pipeline to your game textures. Currently PBRify_Remix does not have a method for this, so you need your own.
SPAN
1x
1x-PBRify_RoughnessV2.pth
Part of the PBRify_Remix model set; see 1x-PBRify_NormalV3 above.
SPAN
4x
4x-PBRify_UpscalerSPANV4
Part of the PBRify_Remix model set; see 1x-PBRify_NormalV3 above.
HMA
4x
4xmssim_hma_pretrains
Link to Github Release Since no official HMA model releases exist yet, I am releasing my hma and hma_medium mssim pretrains. These can be used to speed up and stabilize the early training stages when training new HMA models. Trained with mssim on nomosv2. ## 4xmssim_hma_pretrain **Scale:** 4 **Architecture:** HMA **Architecture Option:** hma **Author:** Philip Hofmann **License:** CC-BY-4.0 **Purpose:** Pretrained **Subject:** Photography **Input Type:** Images **Release Date:** 19.07.2024 **Dataset:** nomosv2 **Dataset Size:** 6000 **OTF (on the fly augmentations):** No **Pretrained Model:** None (=From Scratch) **Iterations:** 205'000 **Batch Size:** 4 **Patch Size:** 96 **Description:** A pretrain to start hma model training. --- ## 4xmssim_hma_medium_pretrain **Scale:** 4 **Architecture:** HMA **Architecture Option:** hma_medium **Author:** Philip Hofmann **License:** CC-BY-4.0 **Purpose:** Pretrained **Subject:** Photography **Input Type:** Images **Release Date:** 19.07.2024 **Dataset:** nomosv2 **Dataset Size:** 6000 **OTF (on the fly augmentations):** No **Pretrained Model:** None (=From Scratch) **Iterations:** 150'000 **Batch Size:** 4 **Patch Size:** 48 **Description:** A pretrain to start hma_medium model training. --- **Showcase:** slow.pics
SPAN
2x
ModernSpanimationV1
Upscale modern animation images/videos. Compression, blur, and un-sharpening degradations were used on the dataset. First model I trained and I think it came out pretty well.
RealPLKSR_dysample
4x
4x-RealPLKSR_dysample_pretrain
Simple pretrain for RealPLKSR Dysample, trained on CC0 content. "Ethical" model
LUDVAE
1x
Ludvae200
A 1x realistic noise degradation model. Github Release Link Name: Ludvae200 License: CC BY 4.0 Author: Philip Hofmann Network: LUD-VAE Scale: 1 Release Date: 25.03.2024 Iterations: 190'000 H_size: 64 n_channels: 3 dataloader_batch_size: 16 H_noise_level: 8 L_noise_level: 3 Dataset: RealLR200 Number of train images: 200 OTF Training: No Pretrained_Model_G: None Description: A 1x realistic noise degradation model, trained on the RealLR200 dataset as released on the SeeSR github repo. Next to the ludvae200.pth model file, I provide a ludvae200.zip file which contains not only the code but also an inference script to run this model on the dataset of your choice. Adapt the ludvae200_inference.py script by adjusting the file paths at the beginning: your input folder, your output folder, the folder holding the ludvae200.pth model, and a folder where you want the text file to be generated. I made the text file generation the same way as in Kim's Dataset Destroyer: each image file is logged with the values used to degrade it, and the resulting text file is append-only, never overwritten. You can also adjust the strength settings inside the inference script to fit your needs. If you generally want less strong noise, for example, lower the temperature upper limit from 0.4 to 0.2 or even further: in line 96, change "temperature_strength = uniform(0.1,0.4)" to "temperature_strength = uniform(0.1,0.2)", just to give an example. These values default to what my last dataset degradation workflow needed, but feel free to adjust them. You can also do what I did and temporarily use deterministic values over multiple runs to determine the min and max noise values you deem suitable for your dataset. See the examples for what this looked like in the last dataset workflow I used this model in.
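For reference, the knob being described is just the sampling range of the noise temperature in ludvae200_inference.py; the quoted line, with the suggested milder alternative:

```python
# From ludvae200_inference.py (line 96, per the description above):
# the temperature range controls how strong the generated noise can get.
from random import uniform

temperature_strength = uniform(0.1, 0.4)  # release default
# Milder noise, as suggested above:
# temperature_strength = uniform(0.1, 0.2)
```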
Compact
2x
Ani4K v2 Compact
## Ani4K v2 Ani4K v2, as the successor to the original Ani4K, retains its predecessor's fantastic detail retention, depth-of-field preservation and faithfulness to the original source. As the name suggests, the model is targeted at modern anime, ranging from high-quality Bluray to crappy WEB releases, for upscaling either to 2K or 4K. An **UltraCompact** version of the model is available on the Github release. The UltraCompact version is faster, without a perceptual hit to quality in most cases. 📌 More comparisons **FAQ** - *How does v2 differ from v1?* I'm so glad you asked! A shortcoming of the v1 models is that they really struggled on sources which were poorly mastered, which is unfortunately still very common even with modern anime. v2 is far more capable of dealing with such sources. - *How does Ani4K v2 differ from JaNai v3? Which one should I pick?* JaNai v3 is a fantastic model, and shares many of the fundamental training objectives behind Ani4K (DOF preservation, faithfulness, etc.). I'd say that the primary difference is one of training philosophy: JaNai seeks to render the source as if it was originally mastered in 4K, whereas Ani4K seeks to produce an upscale that is as close as possible to the source (while cleaning up any issues). Long story short, test both models and see which you prefer. - *What model versions are there?* Ani4K comes in Compact and UltraCompact flavors. Compact is of course the standard option. UltraCompact provides a noticeable performance uplift, without too much impact on quality. I ultimately did not train a SuperUltraCompact variant, as I felt the hit to model quality was far too significant.
ATD
4x
4xNomos8k_atd_jpg
A 4x photo upscaler that handles JPG compression. Link to Github Release Name: 4xNomos8k_atd_jpg License: CC BY 4.0 Author: Philip Hofmann Network: ATD Scale: 4 Release Date: 22.03.2024 Iterations: 240'000 epoch: 152 batch_size: 6, 3 HR_size: 128, 192 Dataset: nomos8k Number of train images: 8492 OTF Training: Yes Pretrained_Model_G: 003_ATD_SRx4_finetune Description: A 4x photo upscaler which handles JPG compression; this model will preserve noise. Trained on the very recently released (~2 weeks ago) Adaptive-Token-Dictionary network. Training details: AdamW optimizer with U-Net SN discriminator and BFloat16. Degraded with OTF JPG compression down to 40, re-compression down to 40, together with resizes and the blur kernels. Losses: PixelLoss using CHC (Clipped Huber with Cosine Similarity Loss), PerceptualLoss using Huber, GANLoss, LDL using Huber, YCbCr Color Loss (bt601) and Luma Loss (CIE XYZ) on neosr. 7 Examples: Slowpics
ATD
4x
4xRealWebPhoto_v3_atd
A 4x upscaler for photos downloaded from the web. Link to Github Release Name: 4xRealWebPhoto_v3_atd License: CC BY 4.0 Author: Philip Hofmann Network: ATD Scale: 4 Release Date: 22.03.2024 Iterations: 250'000 epoch: 10 batch_size: 6, 3 HR_size: 128, 192 Dataset: 4xRealWebPhoto_v3 Number of train images: 101'904 OTF Training: No Pretrained_Model_G: 003_ATD_SRx4_finetune Description: A 4x real web photo upscaler, meant for upscaling photos downloaded from the web. Trained on v3 of my 4xRealWebPhoto dataset, it should be able to handle noise, JPG and WEBP (re)compression, (re)scaling, and just a little bit of lens blur, while also being able to handle good quality input. Trained on the very recently released (~2 weeks ago) Adaptive-Token-Dictionary network. My 4xRealWebPhoto dataset tries to simulate the use case of a photo being uploaded to the web and processed by the service provider (as on a social media platform), i.e. compression/downscaling, then maybe downloaded and re-uploaded by another user, where it was again processed by the service provider. I included different variants in the dataset. The pdf with info on the v2 dataset can be found here, while I simply included what is different in the v3 png. Training details: AdamW optimizer with U-Net SN discriminator and BFloat16. Degraded with OTF JPG compression down to 40, re-compression down to 40, together with resizes and the blur kernels. Losses: PixelLoss using CHC (Clipped Huber with Cosine Similarity Loss), PerceptualLoss using Huber, GANLoss, LDL using Huber, Focal Frequency, Gradient Variance with Huber, YCbCr Color Loss (bt601) and Luma Loss (CIE XYZ) on neosr with norm: true. 11 Examples: Slowpics
RealPLKSR
4x
4xmssim_realplksr_dysample_pretrain
A 4x pretrain model. 4xmssim_realplksr_dysample_pretrain Scale: 4 Architecture: RealPLKSR with Dysample Architecture Option: realplksr Author: Philip Hofmann License: CC-BY-4.0 Subject: Photography Input Type: Images Release Date: 27.06.2024 Dataset: nomosv2 Dataset Size: 6000 OTF (on the fly augmentations): No Pretrained Model: None (=From Scratch) Iterations: 200'000 Batch Size: 8 GT Size: 192, 512 Description: Dysample was recently added to RealPLKSR, which from what I have seen can resolve or help avoid the checkerboard / grid pattern on inference outputs. So with the neosr commits from three days ago (24.06.24), I wanted to create a 4x photo pretrain that I can then use to train more RealPLKSR models with Dysample, specifically to stabilize training at the beginning. Showcase: Imgsli Slowpics
RGT
4x
4xRealWebPhoto_v2_rgt_s
A 4x real web photo upscaler, meant for upscaling photos downloaded from the web. Link to Github Release Name: 4xRealWebPhoto_v2_rgt_s License: CC BY 4.0 Author: Philip Hofmann Network: RGT Network Option: RGT-S Scale: 4 Release Date: 10.03.2024 Iterations: 220'000 epoch: 5 batch_size: 16 HR_size: 128 Dataset: 4xRealWebPhoto_v2 (see details in the attached pdf file in the github release) Number of train images: 1'086'976 (or 543'488 pairs) OTF Training: No Pretrained_Model_G: RGT_S_x4 Description: A 4x real web photo upscaler, meant for upscaling photos downloaded from the web. Trained on v2 of my 4xRealWebPhoto dataset, it should be able to handle realistic noise, JPG and WEBP compression and re-compression, scaling and rescaling with multiple downsampling algorithms, and a little bit of lens blur. Though the examples feature degraded images, this model should also be able to handle good quality input. Details about the approach/dataset I made to train this model (and therefore also what this model is capable of handling) are in the attached pdf in the github release. My previous tries at this dataset, v0 and v1, will get a separate entry, though this version is recommended over them. 12 Examples on Slowpics
DAT
4x
4x-DAT2_mssim_Pretrain
Simple pretrain for DAT2, trained on CC0 content. "Ethical" model
DAT
4x
IllustrationJaNai_V1_DAT2
A 4x model for Illustrations, digital art, manga covers. Model for color images including manga covers and color illustrations, digital art, visual novel art, artbooks, and more. The DAT2 version is the highest quality version but also the slowest. See the ESRGAN version for faster performance. https://slow.pics/c/GfArurPG
DRCT
4x
DRCT-L_X4
Official 4x drct-l Pretrain model, as released on their DRCT Github Page
ESRGAN
4x
IllustrationJaNai_V1_ESRGAN
A 4x model for Illustrations, digital art, manga covers. Model for color images including manga covers and color illustrations, digital art, visual novel art, artbooks, and more. The ESRGAN version is high quality with balanced performance. See the DAT2 version for maximum quality. https://slow.pics/c/GfArurPG
Compact
1x
SwatKats Compact
A 1x model for Upscaling older cartoons. This is yet another retrain of SaurusX's SwatKats_Lite model. The dataset was reprocessed with my Find Misaligned Images script, along with the new ImgAlign update, which drastically reduced artifacts and increased the model's capabilities. This particular model is roughly on par with or slightly behind the original, doing better in some spots and worse in others. Refer to the attached examples to see this. The advantage of this over the original is the speed improvement of Compact over ESRGAN-lite. In a 480p test on an RTX 4090, the original ESRGAN-lite model took 0.28 seconds to process a frame vs Compact's 0.13 seconds. https://slow.pics/s/dF3Icjpv OR <https://imgsli.com/MjQxMzc1/0/1>
RGT
4x
NomosUni rgt multijpg
A 4x universal DoF-preserving upscaler, pair-trained with JPG degradation (down to 40) and multiscale (down_up, bicubic, bilinear, box, nearest, lanczos) in neosr with adamw, unet and pixel, perceptual, gan and color losses. Similar to the last model I released, with the same dataset, but this is a full RGT model in comparison. An FP32 ONNX conversion is provided in the google drive folder for you to run it. 6 Examples (to check JPG compression handling see Example Nr.4, to check depth-of-field handling see Example Nr.1 & Nr.6): Slowpics
ESRGAN
4x
WTP ColorDS
The model was trained for some tests, but maybe someone will need to remove screentone from a color image; in addition to the screentone, it handles small halftones quite well.
ESRGAN
4x
WTP UDS Esrgan
I decided to finish off the previous one: I made the manga line art clearer and corrected the deviation in color temperature by adding a more diverse set of colors.
RGT
4x
NomosUni rgt s multijpg
A 4x universal DoF-preserving upscaler, pair-trained with JPG degradation (down to 40) and multiscale (down_up, bicubic, bilinear, box, nearest, lanczos) in neosr with adamw, unet and pixel, perceptual, gan, color and ldl losses. Examples: Imgsli Slowpics
RealPLKSR
4x
4x-RealPLKSR_Pretrain_V4
Simple pretrain for RealPLKSR, trained on CC0 or self-made content. Don't use V1-V3, they're not stable for training. Dataset consisted of simple resizing: - Lanczos - Linear - Bicubic - Box
ESRGAN
2x
Eva16Lite 201k
A 2x model for upscaling Evangelion episode 16. An ESRGAN-lite model trained on the dataset provided by pwnsweet. I had to transfer average brightness and contrast from the LRs to the HRs to avoid flashing in the final model; otherwise the dataset is as provided. https://imgsli.com/MjM5Njk5
ESRGAN
2x
Eva16Lite 64k
A 2x model for upscaling Evangelion episode 16. An earlier iteration of my other ESRGAN-lite model. After some closer examination of the video output of the other model, I think this one is also worth considering. For being at such an early stage, this model is extremely stable, which makes me think the other model may be exhibiting some overfitting issues. Definitely worth a look.
DAT
2x
Evangelion dat2
A 2x upscaler for Evangelion episodes. Still for the Evangelion upscale project, this time a DAT2 model: a 2x upscaler trained on the Evangelion dataset provided by pwnsweet, which called for model trainers to train models on it. Slowpoke Pics 4 Examples
SPAN
1x
SPANGELION
A 1x model for restoring the non-Blu-ray Evangelion episodes. A restoration model for de-cruddifying Evangelion episodes from their non-Blu-ray source. Has issues with eyes in some cases, where it thinks the whites are haloing. Comparison images are of the interpolated model.
SPAN
1x
SPANGELION INTERPOLATED
A 1x model for restoring the non-Blu-ray Evangelion episodes. A restoration model for de-cruddifying Evangelion episodes from their non-Blu-ray source. Has issues with eyes in some cases, where it thinks the whites are haloing. Comparison images are of the interpolated model.
SPAN
1x
SPANGELION SHARP
A 1x model for restoring the non-Blu-ray Evangelion episodes. A restoration model for de-cruddifying Evangelion episodes from their non-Blu-ray source. Has issues with eyes in some cases, where it thinks the whites are haloing. Comparison images are of the interpolated model.
SPAN
4x
PurePhoto span
A 4x upscaling model for amateur to professional photos (regular). Skilled in working with cats, hair, parties, and creating clear images. Also proficient in resizing photos and enlarging large, sharp images. Can effectively improve images from small sizes as well (300px at smallest on one side, depending on the subject). Experienced in experimenting with techniques like upscaling with this model twice and then reducing it by 50% to enhance details, especially in features like hair or animals. In the examples: the original image, then the image upscaled with the model once, and finally upscaled with the model twice.
DAT
4x
DWTP DP dat2 beta
A 4x model to remove paper texture and screentone from independent manga scans. Well, I don't even know how to describe it: a small breakthrough in my war with self-scans. I'm still not very happy with it, but maybe this groundwork will come in handy one day. https://discord.com/channels/547949405949657098/547949806761410560/1204387665798103050
DAT
4x
DWTP DS dat2 v1
A 4x model to remove paper texture and screentone from independent manga scans. This is my first model for this; it has to be said that it removes them rather poorly. https://discord.com/channels/547949405949657098/1159598245417332796/1185562840304721921
DAT
4x
DWTP DS dat2 v3
A DAT descreentone model, designed to reduce the discrepancies on tiles caused by the excessive loss of the first version, while no longer removing paper texture. https://discord.com/channels/547949405949657098/944307589284495410/1192058535257854042
DAT
4x
DWTP DS dat2 v3 2
A 4x model for removing screentone. In some places an improved version of 4x_DWTP_DS_dat2_v3: one of my friends likes it less than v3 and another likes it more, so here is v3_2. Comparisons: raw - v3 - v3.2. P.S. Sorry for the size of all this and the examples with links; it was just easier with real examples, and I don't have the opportunity to try things with all of them right now.
ESRGAN
2x
AstroManPlus 262k
A 2x model to recreate the upscale process used in the He-Man German Blu-ray release. A follow-up to my previous AstroManLite model. The primary focus is on detail preservation and enhancement of decent DVD material instead of artifact repair. I eliminated the attempt to remove temporal artifacts from the He-Man DVDs, as it proved detrimental to the final results. It will, however, remove MPEG-2 artifacts to great effect.
RealPLKSR
4x
4x_realplksr_gan_pretrain
by musl
## RealPLKSR GAN pretrain **Scale:** 4x **Architecture:** RealPLKSR **Download:** GDrive | Training files: GDrive **Author:** musl **License:** CC0 **Purpose:** Pretrain **Subject:** Multipurpose **Date:** 15 May 2024 **Size:** default Real-PLKSR **I/O Channels:** 3(RGB)->3(RGB) **Dataset:** Nomos-v2 **Dataset Size:** 6000 **OTF (on the fly augmentations):** No **Pretrained Model:** 4x_realplksr_mssim_pretrain **Iterations:** ~450k **Batch Size:** 2-6 **GT Size:** 128-416 **Description:** Pretrained GAN models for RealPLKSR network. Trained on downsampling-only (nearest, bilinear, bicubic, lanczos and mitchell).
SwinIR
4x
4x-SwinIR-M_Pretrain
Simple pretrain for SwinIR Medium, trained on CC0 content. "Ethical" model
Compact
4x
Drawimation compact
A 4x model for upscaling anime from DVD. A 4x anime upscale from DVD, with slightly softer lines and look, but it keeps the colors. https://imgsli.com/MjM3ODEy
SwinIR
4x
4x-SwinIR-S_Pretrain
Simple pretrain for SwinIR Small, trained on CC0 content. "Ethical" model
OmniSR
4x
wtp manga p omni
A 4x pretrain model to speed up your training. I see that many people are trying manga upscale models on OmniSR, so to speed up the process I decided to do preliminary training; the only degradation is resizing ("lanczos", "bicubic", "bilinear", "hamming"). Original link: https://drive.google.com/drive/folders/1ln1ZneTvqPbvi8nsJwzjMoYSDQ_DcHBS?usp=sharing
RealPLKSR
4x
4x_realplksr_mssim_pretrain
by musl
## RealPLKSR mssim pretrains **Scale:** 2x and 4x **Architecture:** RealPLKSR **Links:** **`4x_realplksr_mssim_pretrain.pth`** | GDrive **`2x_realplksr_mssim_pretrain.pth`** | GDrive **Author:** musl **License:** CC0 **Purpose:** Pretrain **Subject:** Multipurpose **Date:** 08 May 2024 **Size:** default Real-PLKSR **I/O Channels:** 3(RGB)->3(RGB) **Dataset:** Nomos-v2 **Dataset Size:** 6000 **OTF (on the fly augmentations):** No **Pretrained Model:** scratch **Iterations:** ~260k **Batch Size:** 2-6 **GT Size:** 128-256 **Description:** Pretrained models for RealPLKSR network. Trained on downsampling-only (nearest, bilinear, bicubic and lanczos).
SPAN
4x
SPANkendata
A 4x model intended as a pretrain; it can also be used to upscale undegraded photos. Original Link: https://github.com/terrainer/AI-Upscaling-Models/tree/main/4xSPANkendata A 4x realistic upscaler that may also work for general-purpose usage. Trained on bicubic-downscaled tiles from very high quality images. Iterations: 500k, fine-tuned with 120k, then 110k, then 75k batch_size: 32, 16, 12, 10 HR_size: 192, 256, 320, 192 Epoch: Unknown, 32, 20, 12 Dataset_size: 58,420 512x512 tiles, downscaled with a variety of traditional algorithms and CAR
DAT
4x
NomosUniDAT2 multijpg ldl
A 4x universal DoF-preserving upscaler, pair-trained with JPG degradation (down to 40) and multiscale (down_up, bicubic, bilinear, box, nearest, lanczos).
DAT
4x
NomosUniDAT2 multijpg ldl sharp
A 4x universal DoF-preserving upscaler, pair-trained with JPG degradation (down to 40) and multiscale (down_up, bicubic, bilinear, box, nearest, lanczos).
Compact
1x
ExposureCorrection compact
A 1x model for exposure correction of photos. This model is an experiment to see if Compact can be trained on photos to exposure-correct them using pixel, perceptual, color and ldl losses. There is no brightness loss; still, it seems to kind of work. 4 Examples: https://slow.pics/s/nuLodV0z
Compact
1x
OverExposureCorrection compact
OverExposureCorrection compact
OverExposureCorrection compact
A 1x model for exposure correction of overexposed photos. This model is meant as an experiment to see if Compact can be trained on overexposed images to exposure-correct them using the pixel, perceptual, color and ldl losses. There is no brightness loss; still, it seems to kinda work. 2 Examples: https://slow.pics/s/KG4ELfGD
Compact
1x
UnderExposureCorrection compact
UnderExposureCorrection compact
UnderExposureCorrection compact
A 1x model for exposure correction of underexposed photos. This model is meant as an experiment to see if Compact can be trained on underexposed images to exposure-correct them using the pixel, perceptual, color and ldl losses. There is no brightness loss; still, it seems to kinda work. 3 Examples: https://slow.pics/s/Lnh2zcuK
ESRGAN
4x
eula digimanga MiA 65k
eula digimanga MiA 65k
eula digimanga MiA 65k
A 4x model for grayscale pencil/sketch-style art with paper textures (Made in Abyss specific). Old model trained for pencil-sketch-style art, good at restoring paper, pencil and other textures, specifically in the Made in Abyss manga, but it should work for art with a similar style. Comparison: https://slow.pics/c/D0MuvwLY
DRCT
4x
4xmssim_drct-l_pretrain
4xmssim_drct-l_pretrain
4xmssim_drct-l_pretrain
Link to Github Release Since no DRCT model releases exist yet, I am releasing my drct-s, drct and drct-l pretrains. These can be used to speed up and stabilize early training stages when training new drct models. Trained with mssim (and augs, and color and lumaloss) on downscaled nomosuni. Training files are also provided additionally. ## 4xmssim_drct-l_pretrain **Scale:** 4 **Architecture:** DRCT **Architecture Option:** DRCT-L **Author:** Philip Hofmann **License:** CC-BY-4.0 **Purpose:** Pretrained **Subject:** Anime, Photography **Input Type:** Images **Release Date:** 28.04.2024 **Dataset:** nomos_uni **Dataset Size:** 2989 **OTF (on the fly augmentations):** No **Pretrained Model:** None (=From Scratch) **Iterations:** 108'000 **Batch Size:** 2-6 **GT Size:** 128-320 **Description:** A pretrain to start drct-l model training; since there are no officially released drct pretrains yet, I trained these myself and release them here.
DRCT
4x
4xmssim_drct_pretrain
4xmssim_drct_pretrain
4xmssim_drct_pretrain
Link to Github Release Since no DRCT model releases exist yet, I am releasing my drct-s, drct and drct-l pretrains. These can be used to speed up and stabilize early training stages when training new drct models. Trained with mssim (and augs, and color and lumaloss) on downscaled nomosuni. Training files are also provided additionally. ## 4xmssim_drct_pretrain **Scale:** 4 **Architecture:** DRCT **Architecture Option:** DRCT **Author:** Philip Hofmann **License:** CC-BY-4.0 **Purpose:** Pretrained **Subject:** Anime, Photography **Input Type:** Images **Release Date:** 28.04.2024 **Dataset:** nomos_uni **Dataset Size:** 2989 **OTF (on the fly augmentations):** No **Pretrained Model:** None (=From Scratch) **Iterations:** 95'000 **Batch Size:** 2-6 **GT Size:** 128-320 **Description:** A pretrain to start drct model training; since there are no officially released drct pretrains yet, I trained these myself and release them here.
DRCT
4x
4xmssim_drct-s_pretrain
4xmssim_drct-s_pretrain
4xmssim_drct-s_pretrain
Link to Github Release Since no DRCT model releases exist yet, I am releasing my drct-s, drct and drct-l pretrains. These can be used to speed up and stabilize early training stages when training new drct models. Trained with mssim (and augs, and color and lumaloss) on downscaled nomosuni. Training files are also provided additionally. ## 4xmssim_drct-s_pretrain.pth **Scale:** 4 **Architecture:** DRCT **Architecture Option:** DRCT-S **Author:** Philip Hofmann **License:** CC-BY-4.0 **Purpose:** Pretrained **Subject:** Anime, Photography **Input Type:** Images **Release Date:** 28.04.2024 **Dataset:** nomos_uni **Dataset Size:** 2989 **OTF (on the fly augmentations):** No **Pretrained Model:** None (=From Scratch) **Iterations:** 75'000 **Batch Size:** 6 **GT Size:** 128-320 **Description:** A pretrain to start drct-s model training; since there are no officially released drct pretrains yet, I trained these myself and release them here.
Compact
2x
AnimeJaNai_HD_V3_Compact
AnimeJaNai_HD_V3_Compact
AnimeJaNai_HD_V3_Compact
Real-time 2x Real-ESRGAN Compact/UltraCompact/SuperUltraCompact models designed for upscaling 1080p anime to 4K. The aim of these models is to address scaling, blur, oversharpening, and compression artifacts while upscaling to deliver a result that appears as if the anime was originally mastered in 4K resolution. Can be set up to run in real-time with mpv on Windows using https://github.com/the-database/mpv-upscale-2x_animejanai The development of the V3 models spanned over seven months, during which over 100 release candidate models were trained and meticulously refined. The V3 models introduce several improvements compared to V2, including: more faithful appearance to the original source; improved handling of oversharpening artifacts, ringing and aliasing; better preservation of intentional blur in scenes using depth of field; more accurate line colors, darkness, and thickness; better preservation of soft shadow edges. Overall, the V3 models yield significantly more natural and faithful results compared to the V2 models.
Compact
2x
AnimeJaNai_HD_V3_SuperUltraCompact
AnimeJaNai_HD_V3_SuperUltraCompact
AnimeJaNai_HD_V3_SuperUltraCompact
Real-time 2x Real-ESRGAN Compact/UltraCompact/SuperUltraCompact models designed for upscaling 1080p anime to 4K. The aim of these models is to address scaling, blur, oversharpening, and compression artifacts while upscaling to deliver a result that appears as if the anime was originally mastered in 4K resolution. Can be set up to run in real-time with mpv on Windows using https://github.com/the-database/mpv-upscale-2x_animejanai The development of the V3 models spanned over seven months, during which over 100 release candidate models were trained and meticulously refined. The V3 models introduce several improvements compared to V2, including: more faithful appearance to the original source; improved handling of oversharpening artifacts, ringing and aliasing; better preservation of intentional blur in scenes using depth of field; more accurate line colors, darkness, and thickness; better preservation of soft shadow edges. Overall, the V3 models yield significantly more natural and faithful results compared to the V2 models.
Compact
2x
AnimeJaNai_HD_V3_UltraCompact
AnimeJaNai_HD_V3_UltraCompact
AnimeJaNai_HD_V3_UltraCompact
Real-time 2x Real-ESRGAN Compact/UltraCompact/SuperUltraCompact models designed for upscaling 1080p anime to 4K. The aim of these models is to address scaling, blur, oversharpening, and compression artifacts while upscaling to deliver a result that appears as if the anime was originally mastered in 4K resolution. Can be set up to run in real-time with mpv on Windows using https://github.com/the-database/mpv-upscale-2x_animejanai The development of the V3 models spanned over seven months, during which over 100 release candidate models were trained and meticulously refined. The V3 models introduce several improvements compared to V2, including: more faithful appearance to the original source; improved handling of oversharpening artifacts, ringing and aliasing; better preservation of intentional blur in scenes using depth of field; more accurate line colors, darkness, and thickness; better preservation of soft shadow edges. Overall, the V3 models yield significantly more natural and faithful results compared to the V2 models.
Compact
2x
NomosUni compact otf medium
NomosUni compact otf medium
NomosUni compact otf medium
A 2x compact fast universal upscaler with medium degradation handling (jpg compression, noise, blur), trained using the Real-ESRGAN training pipeline and based off 2xNomosUni_compact_otf_strong. Handles jpg compression, some noise, and some blur (so it dejpgs, denoises and deblurs). Original Link: https://drive.google.com/drive/folders/1YdOPyb0Ht9YjoRZILf87hSocyAx43CLc Examples: RealPhoto Noisy
SPAN
2x
NomosUni span multijpg ldl
NomosUni span multijpg ldl
NomosUni span multijpg ldl
A 2x span fast universal DoF-preserving upscaler, pair trained with jpg degradation (down to 40) and multiscale downsampling (down_up, bicubic, bilinear, box, nearest, lanczos). Original Link: https://drive.google.com/drive/folders/1_P56B3k1XrBsa-f698zPHAuLaqQ8sSWF Examples: dofwheat digitalart face
ESRGAN
2x
HFA2kShallowESRGAN
HFA2kShallowESRGAN
HFA2kShallowESRGAN
A 2x anime upscaler; a shallow esrgan version of the HFA2kCompact model. This model should be usable with FAST_Anime_VSR using TensorRT for fast inference, as should my 2xHFA2kReal-CUGAN model. Original Link: https://drive.google.com/drive/folders/1_gtYqZGQgrpq55gsGf17MG_QUP4XeFJD Slow Pics examples: Example 1 Example 2 Ludvae1 Ludvae2
GRL
4x
HFA2k VCISR GRLGAN ep200
HFA2k VCISR GRLGAN ep200
HFA2k VCISR GRLGAN ep200
A 4x anime upscaler handling video compression artifacts, trained for 200 epochs. It was trained with otf degradations for "mpeg2video", "libxvid", "libx264" and "libx265" with crf 20-32 and mpeg bitrate 3800-5800 (together with the standard Real-ESRGAN otf pipeline). A faster arch using this otf degradation pipeline would be great for handling video compression artifacts; since this one is a GRL model and therefore slow, it is, as noted by the dev, maybe more for research purposes (or for single images/screenshots). Trained using VCISR for 200 epochs. "This is epoch 200 and the start iteration is 85959 with learning rate 2.5e-05" Slow Pics examples: h264_crf28 ludvae1 ludvae2
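As an illustration of this kind of video-compression degradation (not the exact VCISR pipeline), the codecs and ranges named above can be applied with ffmpeg; the file names are placeholders and ffmpeg must be on PATH.

```python
import random
import subprocess

# Pick one of the codecs named in the model description
codec = random.choice(["libx264", "libx265", "mpeg2video", "libxvid"])
cmd = ["ffmpeg", "-y", "-i", "clip_hr.mp4", "-c:v", codec]
if codec in ("libx264", "libx265"):
    # CRF 20-32, per the description above
    cmd += ["-crf", str(random.randint(20, 32))]
else:
    # MPEG bitrate 3800-5800 kbps, per the description above
    cmd += ["-b:v", f"{random.randint(3800, 5800)}k"]
cmd.append("clip_degraded.mp4")
subprocess.run(cmd, check=True)
```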
SPAN
4x
4xmssim_span_pretrain
4xmssim_span_pretrain
4xmssim_span_pretrain
A 4x pretrain model. Github Release Link Neosr's latest update from yesterday included a new adaptation of the multi-scale ssim loss. This was an experiment to test the difference between making a SPAN pretrain with pixel loss with L1 criterion (as often used in research) vs mssim as its only loss. Models are provided so they can be used for tests or as a pretrain for another SPAN model. --- ## 4xmssim_span_pretrain Scale: 4 Architecture: SPAN Author: Philip Hofmann License: CC-BY-4.0 Subject: Realistic, Anime Date: 10.04.2024 Dataset: nomos_uni Dataset Size: 2989 OTF (on the fly augmentations): No Pretrained Model: None Iterations: 80'000 Batch Size: 12 GT Size: 128 Description: 4x SPAN pretrain trained on neosr's new adaptation of the multi-scale ssim loss from yesterday's update, on the downsampled nomos_uni dataset using Kim's dataset destroyer with down_up, linear, cubic_mitchell, lanczos, gauss, box (while down_up used the same kernels with range = 0.15,1.5). The new augmentations except CutBlur have also been used (since CutBlur is meant for real-world SR and may cause undesired effects if applied to bicubic-only). Config and training log provided for more details. Showcase: 7 Slowpics Examples
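To make the comparison concrete, here is a rough sketch of the two single-loss objectives being contrasted, using the third-party `pytorch_msssim` package as a stand-in for neosr's mssim implementation (an assumption; the actual loss code lives in neosr):

```python
import torch
import torch.nn.functional as F
from pytorch_msssim import MS_SSIM  # third-party stand-in for neosr's mssim

# MS-SSIM needs reasonably large inputs (>160 px with default settings)
ms_ssim = MS_SSIM(data_range=1.0, channel=3)

def pixel_l1_loss(sr: torch.Tensor, hr: torch.Tensor) -> torch.Tensor:
    """Pixel loss with L1 criterion, as in 4xpix_span_pretrain."""
    return F.l1_loss(sr, hr)

def mssim_loss(sr: torch.Tensor, hr: torch.Tensor) -> torch.Tensor:
    """MS-SSIM as a loss: similarity is maximized, so minimize 1 - MS-SSIM."""
    return 1.0 - ms_ssim(sr, hr)
```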
SPAN
4x
4xpix_span_pretrain
4xpix_span_pretrain
4xpix_span_pretrain
A 4x pretrain model. Github Release Link Neosr's latest update from yesterday included a new adaptation of the multi-scale ssim loss. This was an experiment to test the difference between making a SPAN pretrain with pixel loss with L1 criterion (as often used in research) vs mssim as its only loss. Models are provided so they can be used for tests or as a pretrain for another SPAN model. --- ## 4xpix_span_pretrain Scale: 4 Architecture: SPAN Author: Philip Hofmann License: CC-BY-4.0 Subject: Realistic, Anime Date: 10.04.2024 Dataset: nomos_uni Dataset Size: 2989 OTF (on the fly augmentations): No Pretrained Model: None Iterations: 80'000 Batch Size: 12 GT Size: 128 Description: 4x SPAN pretrain trained on pixel loss with L1 criterion (as often used in research) on the downsampled nomos_uni dataset using Kim's dataset destroyer with down_up, linear, cubic_mitchell, lanczos, gauss, box (while down_up used the same kernels with range = 0.15,1.5). The new augmentations except CutBlur have also been used (since CutBlur is meant for real-world SR and may cause undesired effects if applied to bicubic-only). Config and training log provided for more details. Showcase: 7 Slowpics Examples
Compact
2x
HFA2k compact multijpg
HFA2k compact multijpg
HFA2k compact multijpg
A 2x anime upscaler which tries to preserve DoF; a non-otf version of the HFA2kCompact model that preserves DoF by not deblurring. Original Link: https://drive.google.com/drive/folders/1RZbALUTCf4DYDTVMBR74b8Nt-7m6TEyk?usp=drive_link
Compact
2x
HFA2k LUDVAE compact
HFA2k LUDVAE compact
HFA2k LUDVAE compact
A lightweight 2x anime upscaling model with realistic degradations (compression, noise, blur), based on musl's HFA2k_LUDVAE dataset.
SPAN
2x
HFA2k LUDVAE SPAN
HFA2k LUDVAE SPAN
HFA2k LUDVAE SPAN
A lightweight 2x anime upscaling model with realistic degradations (compression, noise, blur), based on musl's HFA2k_LUDVAE dataset. Original Link: https://drive.google.com/drive/folders/1Fe1Uu3-L5YWaVSwPGBTIPG4xU7kK9rZK?usp=drive_link
SPAN
2x
HFA2kSPAN
HFA2kSPAN
HFA2kSPAN
A 2x anime upscaler; a span version of the HFA2k model. Original Link: https://drive.google.com/drive/folders/1Fe1Uu3-L5YWaVSwPGBTIPG4xU7kK9rZK?usp=drive_link
OmniSR
2x
HFA2OmniSR
HFA2OmniSR
HFA2OmniSR
A 2x anime upscaler; an omnisr version of the 2xHFA2kCompact model. Original Link: https://drive.google.com/drive/folders/1QfiPT1uHi7Yxb4Gjhl25M2YI6hWU5hVt?usp=drive_link
Real-CUGAN
2x
HFA2Real CUGAN
HFA2Real CUGAN
HFA2Real CUGAN
A 2x anime upscaler; a Real-CUGAN version of the 2xHFA2kCompact model. Original Link: https://drive.google.com/drive/folders/1-ijn4RE5vSWJuIwoBTZdA3cKlraw8PcY Slow Pics examples: Example 1 Example 2 Ludvae1 Ludvae2
SwinIR
2x
HFA2SwinIR S
HFA2SwinIR S
HFA2SwinIR S
A 2x anime upscaler; a SwinIR-Small version of the 2xHFA2kCompact model. Original Link: https://drive.google.com/drive/folders/1yM2_BkktUXS2-i1sTCYZuHRyfx-ak2o8
ESRGAN
4x
DWTP ds esrgan 5
DWTP ds esrgan 5
DWTP ds esrgan 5
A 4x model for manga screentone removal. I haven't shared models for a long time; I think one of these days I'll upload a dat version and a less accurate esrgan. This version of the model was meant to be an attempt to get as much detail out of the scans as possible while removing the screentone, so some things had to be sacrificed, but overall it is better than all my previous models combined in terms of accuracy and tone. **examples** colab which contains the rest of my models ||In general, this is a colab for convenient upscaling of manga in the RU translators' segment, and there are other people's models there; if I accidentally added someone else's model that has a license, please let me know and I'll correct it||
ESRGAN
2x
NomosUni esrgan multijpg
NomosUni esrgan multijpg
NomosUni esrgan multijpg
A 2x esrgan universal upscaler, pair trained with jpg degradation (down to 40) and multiscale downsampling (down_up, bicubic, bilinear, box, nearest, lanczos). Original Link: https://drive.google.com/drive/folders/12zKVS74mz0NtBKGlkx_r0ytUVoeDiTX3?usp=drive_link Try it out in this huggingface space I made (since the space uses CPU, testing locally on your GPU would be faster). Slow Pics crops examples: ani face dof_wheat
SwinIR
1x
Bendel_Halftone
Bendel_Halftone
Bendel_Halftone
A model trained with the goal of preserving texture while removing halftones. I do plan on making improved versions down the line, as this one didn't come out as well as I'd like.
SRFormer
4x
Frankendata_FullDegradation_SRFormer
Frankendata_FullDegradation_SRFormer
Frankendata_FullDegradation_SRFormer
Description: 4x realistic upscaler that may also work for general purpose usage. It was trained with OTF random degradation with a very low to very high range of degradations, including blur, noise, and compression. Trained with the same Frankendata dataset that I used for the pretrain model.
Compact
2x
Higurashi_v1_compact
Higurashi_v1_compact
Higurashi_v1_compact
This is a compact model I trained to upscale the SD Blu-ray release of Higurashi no Naku Koro ni and other early-2000s shows like Yu-Gi-Oh! GX; it can do line darkening and some restoration. Unfortunately I lost everything related to it, from the dataset to the training settings.
SPAN
1x
SwatKats_SPAN
SwatKats_SPAN
This is SaurusX's original description: Fix vertical blur / split lines / shadowing. A 1x model of my 2xSwatKats. Resolves the same video problems as before, but 1x and faster, and meant for chaining to other 2x models (or whatever). Input MUST be 540 vertical as the blur problem is very resolution sensitive. ------- The goal for this retrain was to get the "magic" of SwatKatsLite into SPAN for much faster video processing. It's about 80% of the way there. There are some artifacts here and there that aren't present in the original, and it has small amounts of color halos in certain areas. I've retrained this model so many times that I've just decided to release it. This particular release was trained to ~450k. The included Family Guy image shows the color ringing issue. Sorry. Update: I've added a second alt model that has less color ringing, but doesn't do quite as well with line repair. I believe I've reached architecture limits.
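Chaining this 1x fix model into a 2x upscaler, as described above, might look roughly like this with spandrel (the model file names are placeholders; check spandrel's README for the exact loading API):

```python
import torch
from spandrel import ModelLoader

# Load the 1x restoration model and any 2x upscaler (paths are placeholders)
fix_1x = ModelLoader().load_from_file("1x_SwatKats_SPAN.pth").eval()
up_2x = ModelLoader().load_from_file("2x_some_model.pth").eval()

def restore_then_upscale(frame: torch.Tensor) -> torch.Tensor:
    """frame: (1, 3, H, W) float in [0, 1]; H must be 540 per the notes above."""
    with torch.no_grad():
        return up_2x(fix_1x(frame))
```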
SPAN
4x
4x-NomosUni_span_multijpg
4x-NomosUni_span_multijpg
4x-NomosUni_span_multijpg
4x fast universal upscaler pair trained with jpg degradation (down to 40) and multiscale (down_up, bicubic, bilinear, box, nearest, lanczos).
SPAN
1x
BroadcastToStudio_SPAN
BroadcastToStudio_SPAN
This is SaurusX's original description: Improvement of low-quality cartoons from broadcast sources. Will greatly increase the visual quality of bad broadcast tape sources of '80s and '90s cartoons (e.g. Garfield and Friends, Heathcliff, DuckTales, etc). Directly addresses chroma blur, dot crawl, and rainbowing. You're highly advised to take care of haloing beforehand in your favorite video editor as the model will not fix it and may make existing halos more noticeable. ------- Sadly this model has some intense artifacts. Thankfully the SPAN re-train reduced these a bit, but they're still problematic. This was just a quick test of SPAN's capabilities. This was trained only to ~66k iterations, compared to the 480k of the original.
SPAN
2x
GarfieldJr_span
GarfieldJr_span
GarfieldJr_span
2x span nf48 model trained with the 2x_Garfield dataset. The output of this model is nowhere near 2x_Garfield, but it can do everything that the esrgan model does to some degree and has the speed advantage. Finally, I would like to thank @SaurusX for providing me with his dataset. Comp: https://slow.pics/c/Yezua7O7
Compact
2x
2x-NomosUni_compact_multijpg
2x-NomosUni_compact_multijpg
2x-NomosUni_compact_multijpg
2x compact fast universal upscaler pair trained with jpg degradation (down to 40) and multiscale (down_up, bicubic, bilinear, box, nearest, lanczos).
SPAN
2x
2x-NomosUni_span_multijpg
2x-NomosUni_span_multijpg
2x-NomosUni_span_multijpg
2x span fast universal upscaler pair trained with jpg degradation (down to 40) and multiscale (down_up, bicubic, bilinear, box, nearest, lanczos).
DAT
4x
4xTextureDAT2_otf
4xTextureDAT2_otf
A 4x texture image upscaler, handling jpg compression, some noise and slight blur. Link to Github Release Name: 4xTextureDAT2_otf Author: Philip Hofmann Release: 13.12.2023 License: CC BY 4.0 Network: DAT Arch Option: DAT2 Scale: 4 Iterations: 125000 batch_size: 6 HR_size: 128 Dataset: GTAV_512_Dataset Number of train images: 30122 OTF Training: Yes Pretrained_Model_G: DAT_2_x4 Description: 4x upscaler for texture images, trained with the Real-ESRGAN otf pipeline, so it handles jpg compression, some noise and slight blur
SPAN
4x
ClearRealityV1
ClearRealityV1
ClearRealityV1
Nice to release a model again! This one is intended for realistic imagery, and works especially well on faces, hair, and nature shots. It should only be used on somewhat clear shots, without a lot of grain. I trained this model on SPAN, which as of the time of release, you'll need chaiNNer-nightly for. I aimed for a softer, more natural look for this model with as few artifacts as possible. In addition to the Normal model, I've included a "soft" model. The Soft model is... softer. Basically it was an earlier version of the model with a more limited dataset. It produces more natural output on games or rendered content, but suffers a bit more with realistic stuff. Note: In shots with DOF (depth of field) or bokeh, unfortunately there will be artifacts. Compatibility: You'll have to use the latest chaiNNer-nightly to use this model
SPAN
4x
Nomos8k_span_otf_medium
Nomos8k_span_otf_medium
Nomos8k_span_otf_medium
I release my span otf series. They come in three variations: weak, medium, and strong. Mainly meant for photos (they can be tried on other things, of course). (Also, there is a non-otf span model I had been working on simultaneously that I will release shortly; it should give better results on less degraded input in comparison to this span otf series.) Basically I trained the otf_strong for 90k iterations and then medium and weak based off that, with some more training to de-learn (tone down) the (too?) strong degradations. Used discriminator resets to correct occurring color loss in all of them. GT size was for the most part 512 with batch 9 (since I hoped it would give better results) at 0.55 it/s training speed (the first 40k at the beginning were GT size 256 with batch 20 at 0.58 it/s).
SPAN
4x
Nomos8k_span_otf_strong
Nomos8k_span_otf_strong
Nomos8k_span_otf_strong
I release my span otf series. They come in three variations: weak, medium, and strong. Mainly meant for photos (they can be tried on other things, of course). (Also, there is a non-otf span model I had been working on simultaneously that I will release shortly; it should give better results on less degraded input in comparison to this span otf series.) Basically I trained the otf_strong for 90k iterations and then medium and weak based off that, with some more training to de-learn (tone down) the (too?) strong degradations. Used discriminator resets to correct occurring color loss in all of them. GT size was for the most part 512 with batch 9 (since I hoped it would give better results) at 0.55 it/s training speed (the first 40k at the beginning were GT size 256 with batch 20 at 0.58 it/s).
SPAN
4x
Nomos8k_span_otf_weak
Nomos8k_span_otf_weak
Nomos8k_span_otf_weak
I release my span otf series. They come in three variations: weak, medium, and strong. Mainly meant for photos (they can be tried on other things, of course). (Also, there is a non-otf span model I had been working on simultaneously that I will release shortly; it should give better results on less degraded input in comparison to this span otf series.) Basically I trained the otf_strong for 90k iterations and then medium and weak based off that, with some more training to de-learn (tone down) the (too?) strong degradations. Used discriminator resets to correct occurring color loss in all of them. GT size was for the most part 512 with batch 9 (since I hoped it would give better results) at 0.55 it/s training speed (the first 40k at the beginning were GT size 256 with batch 20 at 0.58 it/s).
DAT
2x
2x Manga Ora
2x Manga Ora
2x Manga Ora
by Ora and null
Made to restore screentones and remove compression artifacts in manga images with widths between 650 and 900 pixels (can be more than 900).
ESRGAN
1x
1x Manhwa Null
1x Manhwa Null
1x Manhwa Null
by null
Model designed exclusively to remove compression artifacts while preserving noise in Manhwa, Manhua, Webtoons etc.
ATD
4x
003_ATD_SRx4_finetune
003_ATD_SRx4_finetune
Official 4x finetune pretrain from ATD # Adaptive Token Dictionary This repository is an official implementation of the paper "Transcending the Limit of Local Window: Advanced Super-Resolution Transformer with Adaptive Token Dictionary", CVPR, 2024. [Arxiv] [visual results] [pretrained models] By Leheng Zhang, Yawei Li, Xingyu Zhou, Xiaorui Zhao, and Shuhang Gu. > **Abstract:** Single Image Super-Resolution is a classic computer vision problem that involves estimating high-resolution (HR) images from low-resolution (LR) ones. Although deep neural networks (DNNs), especially Transformers for super-resolution, have seen significant advancements in recent years, challenges still remain, particularly in limited receptive field caused by window-based self-attention. To address these issues, we introduce a group of auxiliary **A**daptive **T**oken **D**ictionary to SR Transformer and establish an **ATD**-SR method. The introduced token dictionary could learn prior information from training data and adapt the learned prior to specific testing image through an adaptive refinement step. The refinement strategy could not only provide global information to all input tokens but also group image tokens into categories. Based on category partitions, we further propose a category-based self-attention mechanism designed to leverage distant but similar tokens for enhancing input features. The experimental results show that our method achieves the best performance on various single image super-resolution benchmarks.
Real-CUGAN
2x
wtp manga cover pretrained
wtp manga cover pretrained
wtp manga cover pretrained
A 2x pretrain model to speed up your training. While experimenting with color models for cugan, I ran into problems where the official models are not imported correctly and apply additional processing on top of resizing. As a result, for my project I trained 2x cugans for neosr; only resize degradations were applied to the dataset `("lanczos", "bicubic", "bilinear", "HAMMING", "NEAREST")`
ESRGAN
2x
AstroManLite
AstroManLite
AstroManLite
I used frames from 5 episodes from the PAL DVD sets with differing quality levels against the German Blu-rays in order to reverse engineer an AstroRes model. Detail enhancement is very high though it helps to have a good quality source (of course). The model will remove MPEG-2, some halos, and PAL-conversion artifacts like image trails and faded lines to great effect. Temporal stability is quite good.
Compact
1x
Dehalo_v1_Compact
Dehalo_v1_Compact
Dehalo_v1_Compact
This is a 1x model for removing halos in animation. I was training this a while back and considered it a failed project, but I recently found it useful on a few things I was working on, so I decided to release it. I don't remember any details regarding how it was trained.
DAT
4x
LexicaDAT2_otf
LexicaDAT2_otf
4x AI-generated-image upscaler trained with otf. The 4xLexicaDAT2_hb generated some weird lines on some edges. 4xNomosUniDAT is a different checkpoint of 4xNomosUniDAT_otf (145000); I liked the result a bit more in that example.
DAT
4x
NomosUniDAT_otf
NomosUniDAT_otf
4x universal upscaler trained with otf
RGT
4x
RGT_S
RGT_S
Official 4x rgt-s pretrain from RGT # Recursive Generalization Transformer for Image Super-Resolution Zheng Chen, Yulun Zhang, Jinjin Gu, Linghe Kong, and Xiaokang Yang, "Recursive Generalization Transformer for Image Super-Resolution", ICLR, 2024 [paper] [arXiv] [supplementary material] [visual results] [pretrained models] --- > **Abstract:** Transformer architectures have exhibited remarkable performance in image super-resolution (SR). Due to the quadratic computational complexity of the self-attention (SA) in Transformer, existing methods tend to adopt SA in a local region to reduce overheads. However, the local design restricts the global context exploitation, which is crucial for accurate image reconstruction. In this work, we propose the Recursive Generalization Transformer (RGT) for image SR, which can capture global spatial information and is suitable for high-resolution images. Specifically, we propose the recursive-generalization self-attention (RG-SA). It recursively aggregates input features into representative feature maps, and then utilizes cross-attention to extract global information. Meanwhile, the channel dimensions of attention matrices ($query$, $key$, and $value$) are further scaled to mitigate the redundancy in the channel domain. Furthermore, we combine the RG-SA with local self-attention to enhance the exploitation of the global context, and propose the hybrid adaptive integration (HAI) for module integration. The HAI allows the direct and effective fusion between features at different levels (local or global). Extensive experiments demonstrate that our RGT outperforms recent state-of-the-art methods quantitatively and qualitatively. ---
Compact
2x
AniScale-2-Compact
AniScale-2-Compact
AniScale-2-Compact
AniScale 2 is a versatile and faithful anime model trained for use on a variety of post ~2000 sources. As the name suggests, this is the successor to the original AniScale, and a substantial upgrade in nearly every respect. Superb blur and depth of field handling, thorough WEB and DVD compression repair, and pleasing line art refinement are the hallmarks of AniScale 2. A few notes: - While AniScale 2 is trained first and foremost as an OmniSR model, AniScale 2 is also intended to be a platform to explore multiple SISR archs. For starters, I've trained OmniSR and Compact versions, but more will come. - AniScale 2 also comes with a "refiner" model, creatively named AniScale 2 Refiner (AS2R). AS2R is a 1x Compact model (for maximum speed) trained to supplement and increase the versatility of AniScale 2, without making the base model excessively aggressive. The Refiner is focused on providing light sharpening, fixing line art and line thinning, depending on whether you run the model before or after upscaling with AniScale 2. - For more information, please visit the Github wiki page. **Comparisons** - General Comparisons - Depth of Field/Blur Comparisons (including vs. my favorite model punching bag)
DITN
2x
AniScale-2-DITN
AniScale-2-DITN
AniScale-2-DITN
AniScale 2 is a versatile and faithful anime model trained for use on a variety of post ~2000 sources. As the name suggests, this is the successor to the original AniScale, and a substantial upgrade in nearly every respect. Superb blur and depth of field handling, thorough WEB and DVD compression repair, and pleasing line art refinement are the hallmarks of AniScale 2. A few notes: - While AniScale 2 is trained first and foremost as an OmniSR model, AniScale 2 is also intended to be a platform to explore multiple SISR archs. For starters, I've trained OmniSR and Compact versions, but more will come. - AniScale 2 also comes with a "refiner" model, creatively named AniScale 2 Refiner (AS2R). AS2R is a 1x Compact model (for maximum speed) trained to supplement and increase the versatility of AniScale 2, without making the base model excessively aggressive. The Refiner is focused on providing light sharpening, fixing line art and line thinning, depending on whether you run the model before or after upscaling with AniScale 2. - For more information, please visit the Github wiki page. **Comparisons** - General Comparisons - Depth of Field/Blur Comparisons (including vs. my favorite model punching bag)
ESRGAN
2x
AniScale-2-ESRGAN
AniScale-2-ESRGAN
AniScale-2-ESRGAN
AniScale 2 is a versatile and faithful anime model trained for use on a variety of post ~2000 sources. As the name suggests, this is the successor to the original AniScale, and a substantial upgrade in nearly every respect. Superb blur and depth of field handling, thorough WEB and DVD compression repair, and pleasing line art refinement are the hallmarks of AniScale 2. A few notes: - While AniScale 2 is trained first and foremost as an OmniSR model, AniScale 2 is also intended to be a platform to explore multiple SISR archs. For starters, I've trained OmniSR and Compact versions, but more will come. - AniScale 2 also comes with a "refiner" model, creatively named AniScale 2 Refiner (AS2R). AS2R is a 1x Compact model (for maximum speed) trained to supplement and increase the versatility of AniScale 2, without making the base model excessively aggressive. The Refiner is focused on providing light sharpening, fixing line art and line thinning, depending on whether you run the model before or after upscaling with AniScale 2. - For more information, please visit the Github wiki page. **Comparisons** - General Comparisons - Depth of Field/Blur Comparisons (including vs. my favorite model punching bag)
ESRGAN
2x
AniScale-2-ESRGAN-Lite
AniScale-2-ESRGAN-Lite
AniScale-2-ESRGAN-Lite
AniScale 2 is a versatile and faithful anime model trained for use on a variety of post ~2000 sources. As the name suggests, this is the successor to the original AniScale, and a substantial upgrade in nearly every respect. Superb blur and depth of field handling, thorough WEB and DVD compression repair, and pleasing line art refinement are the hallmarks of AniScale 2. A few notes: - While AniScale 2 is trained first and foremost as an OmniSR model, AniScale 2 is also intended to be a platform to explore multiple SISR archs. For starters, I've trained OmniSR and Compact versions, but more will come. - AniScale 2 also comes with a "refiner" model, creatively named AniScale 2 Refiner (AS2R). AS2R is a 1x Compact model (for maximum speed) trained to supplement and increase the versatility of AniScale 2, without making the base model excessively aggressive. The Refiner is focused on providing light sharpening, fixing line art and line thinning, depending on whether you run the model before or after upscaling with AniScale 2. - For more information, please visit the Github wiki page. **Comparisons** - General Comparisons - Depth of Field/Blur Comparisons (including vs. my favorite model punching bag)
OmniSR
2x
AniScale-2-OmniSR
AniScale-2-OmniSR
AniScale-2-OmniSR
AniScale 2 is a versatile and faithful anime model trained for use on a variety of post ~2000 sources. As the name suggests, this is the successor to the original AniScale, and a substantial upgrade in nearly every respect. Superb blur and depth of field handling, thorough WEB and DVD compression repair, and pleasing line art refinement are the hallmarks of AniScale 2. A few notes: - While AniScale 2 is trained first and foremost as an OmniSR model, AniScale 2 is also intended to be a platform to explore multiple SISR archs. For starters, I've trained OmniSR and Compact versions, but more will come. - AniScale 2 also comes with a "refiner" model, creatively named AniScale 2 Refiner (AS2R). AS2R is a 1x Compact model (for maximum speed) trained to supplement and increase the versatility of AniScale 2, without making the base model excessively aggressive. The Refiner is focused on providing light sharpening, fixing line art and line thinning, depending on whether you run the model before or after upscaling with AniScale 2. - For more information, please visit the Github wiki page. **Comparisons** - General Comparisons - Depth of Field/Blur Comparisons (including vs. my favorite model punching bag)
ESRGAN
4x
DWTP_descreenton_H_esrgan
DWTP_descreenton_H_esrgan
DWTP_descreenton_H_esrgan
I've been experimenting a lot since the last descreentone release to preserve even more detail and to get rid of some of the beginner's mistakes. All three models are not yet perfect and I will keep developing them. The essence of the three models: the omni models are a split; VL removes less of the large screentone, while VH is a model retrained on a modified dataset with VL as a preliminary stage, so that it removes more screentone while not removing all the textures and the very large screentone. The esrgan model was trained on the VH dataset for use in the colab; in some places it works worse than the omni models, as it could not cope with overlapping screentone.
OmniSR
4x
DWTP_descreenton_VH4
DWTP_descreenton_VH4
DWTP_descreenton_VH4
I've been experimenting a lot since the last descreentone release to preserve even more detail and to get rid of some of the beginner's mistakes. All three models are not yet perfect and I will keep developing them. The essence of the three models: the omni models are a split; VL removes less of the large screentone, while VH is a model retrained on a modified dataset with VL as a preliminary stage, so that it removes more screentone while not removing all the textures and the very large screentone. The esrgan model was trained on the VH dataset for use in the colab; in some places it works worse than the omni models, as it could not cope with overlapping screentone.
OmniSR
4x
DWTP_descreenton_VL4
DWTP_descreenton_VL4
DWTP_descreenton_VL4
I've been experimenting a lot since the last descreentone release to preserve even more detail and to get rid of some of the beginner's mistakes. All three models are not yet perfect and I will keep developing them. The essence of the three models: the omni models are a split; VL removes less of the large screentone, while VH is a model retrained on a modified dataset with VL as a preliminary stage, so that it removes more screentone while not removing all the textures and the very large screentone. The esrgan model was trained on the VH dataset for use in the colab; in some places it works worse than the omni models, as it could not cope with overlapping screentone.
CRAFT
4x
craft pretrain
craft pretrain
by musl
A 4x pretrain model to quick-start new models.
DAT
4x
4xNomosUniDAT_bokeh_jpg
4xNomosUniDAT_bokeh_jpg
4xNomosUniDAT_bokeh_jpg
4x multipurpose DAT upscaler. Trained on DAT with Adan, U-Net SN, huber pixel loss, huber perceptual loss, vanilla gan loss, huber ldl loss and huber focal-frequency loss, on paired nomos_uni (a universal dataset containing photographs, anime, text, maps, music sheets, paintings ...) with added jpg compression 40-100 and down_up, bicubic, bilinear, box, nearest and lanczos scales. No blur degradation was introduced in the training dataset, to keep the model from trying to sharpen blurry backgrounds. The three strengths of this model (design purpose): 1. Multipurpose 2. Handles bokeh effect 3. Handles jpg compression This model will not: - Denoise - Deblur
OmniSR
4x
wtp_descreentone
wtp_descreentone
wtp_descreentone
In general, what was the point? The project was conceived for manga upscaling with screentone, using the method of overlaying a new screentone; for this the old screentone must be removed, but what matters most is still the contour. With the screentone overlay method, the contour should be as close to 1-bit as possible, otherwise we will see filled contours, wavy contours, etc. This is the whole ideology of this model.
SRFormer
4x
FrankendataPretainer_SRFormer
FrankendataPretainer_SRFormer
FrankendataPretainer_SRFormer
4x realistic upscaler that may also work for general purpose usage. Trained on Bicubic downscaled tiles from very high quality images.
DAT
4x
4xLSDIRDAT
4xLSDIRDAT
4xLSDIRDAT
A 4x photo upscale DAT model trained with otf (resize, jpg, small blur) on the LSDIR dataset.
DAT
4x
4xNomosUniDAT2_box
4xNomosUniDAT2_box
4xNomosUniDAT2_box
4x general purpose upscaler for non-degraded content. Has been trained on a paired dataset of box downsamples as a DAT2 model, with AdamW and pixel, perceptual, gan, color, ldl and ff losses.
DAT
4x
4xReal_SSDIR_DAT_GAN
4xReal_SSDIR_DAT_GAN
4xReal_SSDIR_DAT_GAN
4x photo upscaler on the SSDIR_Sharp dataset, trained with otf, using the same settings as Real_HAT_GAN. These otf values are very high in my opinion. This was an experiment to see what happens when using the same settings as Real_HAT_GAN (hence the name). This model will denoise very strongly and smooth out a lot (so a lot of detail will be lost, but maybe this effect is beneficial to someone).
DAT
4x
4xSSDIRDAT
4xSSDIRDAT
4xSSDIRDAT
Description: 4x photo upscaler on the SSDIR_Sharp dataset, trained with otf for jpg, resize and small blur. This model sharpens.
SPAN
1x
1x-span_anime_pretrain
1x-span_anime_pretrain
Basic 1x model to use for SPAN anime models. Enjoy
SRFormer
1x
Frankenfixer_SRFormerLight
Frankenfixer_SRFormerLight
Frankenfixer_SRFormerLight
A 1x model designed to reduce artifacts and restore detail to images upscaled by 4xFrankendata_FullDegradation_SRFormer. It could possibly work with other upscaling models too.
Real-CUGAN
4x
cugan_pretrain
cugan_pretrain
by musl
Model trained only on downscale degradation (bicubic, bilinear, nearest, lanczos and mitchell). Can be used to start new Real-CUGAN models. ONNX included in the dir.
DAT
4x
4xFaceUpDAT
4xFaceUpDAT
4xFaceUpDAT
Description: 4x photo upscaler for faces, trained on the FaceUp dataset. These models are an improvement over the previously released 4xFFHQDAT and are its successors. These models are released together with the FaceUp dataset, plus the accompanying youtube video. This model comes in 4 different versions: 4xFaceUpDAT (for good quality input), 4xFaceUpLDAT (for lower quality input, can additionally denoise), 4xFaceUpSharpDAT (for good quality input, produces sharper output, trained without USM but with sharpened input images), 4xFaceUpSharpLDAT (for lower quality input, produces sharper output, trained without USM but with sharpened input images, can additionally denoise). I recommend trying out 4xFaceUpDAT
DAT
4x
4xFaceUpLDAT
4xFaceUpLDAT
4xFaceUpLDAT
Description: 4x photo upscaler for faces, trained on the FaceUp dataset. These models are an improvement over the previously released 4xFFHQDAT and are its successors. These models are released together with the FaceUp dataset, plus the accompanying youtube video. This model comes in 4 different versions: 4xFaceUpDAT (for good quality input), 4xFaceUpLDAT (for lower quality input, can additionally denoise), 4xFaceUpSharpDAT (for good quality input, produces sharper output, trained without USM but with sharpened input images), 4xFaceUpSharpLDAT (for lower quality input, produces sharper output, trained without USM but with sharpened input images, can additionally denoise). I recommend trying out 4xFaceUpDAT
DAT
4x
4xFaceUpSharpDAT
4xFaceUpSharpDAT
4xFaceUpSharpDAT
Description: 4x photo upscaler for faces, trained on the FaceUp dataset. These models are an improvement over the previously released 4xFFHQDAT and are its successors. These models are released together with the FaceUp dataset, plus the accompanying youtube video. This model comes in 4 different versions: 4xFaceUpDAT (for good quality input), 4xFaceUpLDAT (for lower quality input, can additionally denoise), 4xFaceUpSharpDAT (for good quality input, produces sharper output, trained without USM but with sharpened input images), 4xFaceUpSharpLDAT (for lower quality input, produces sharper output, trained without USM but with sharpened input images, can additionally denoise). I recommend trying out 4xFaceUpDAT
DAT
4x
4xFaceUpSharpLDAT
4xFaceUpSharpLDAT
4xFaceUpSharpLDAT
Description: 4x photo upscaler for faces, trained on the FaceUp dataset. These models are an improvement over the previously released 4xFFHQDAT and are its successors. These models are released together with the FaceUp dataset, plus the accompanying youtube video. This model comes in 4 different versions: 4xFaceUpDAT (for good quality input), 4xFaceUpLDAT (for lower quality input, can additionally denoise), 4xFaceUpSharpDAT (for good quality input, produces sharper output, trained without USM but with sharpened input images), 4xFaceUpSharpLDAT (for lower quality input, produces sharper output, trained without USM but with sharpened input images, can additionally denoise). I recommend trying out 4xFaceUpDAT
SPAN
2x
2x-span_anime_pretrain
2x-span_anime_pretrain
This is just a very basic 2x model to start your anime models with on SPAN, nothing special. I tried to keep it as basic as possible
SPAN
4x
span_pretrain
span_pretrain
by musl
Please load it using `strict_load_g: false`. Model trained only on downscale degradation (bicubic, bilinear, nearest, lanczos and mitchell). Can be used to start new SPAN models.
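The `strict_load_g: false` option tells the training framework not to require an exact state-dict match. A rough PyTorch equivalent when loading manually (the network constructor below is a hypothetical placeholder, not the actual SPAN class):

```python
import torch

net = build_span_network()  # hypothetical constructor for the SPAN arch
state = torch.load("4x_span_pretrain.pth", map_location="cpu")
state = state.get("params", state)  # BasicSR-style checkpoints may nest weights
# strict=False mirrors strict_load_g: false -- extra/missing keys are
# reported instead of raising an error
missing, unexpected = net.load_state_dict(state, strict=False)
print("missing keys:", missing)
print("unexpected keys:", unexpected)
```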
DAT
4x
4xFFHQDAT
4xFFHQDAT
4xFFHQDAT
4x photo upscaler for faces with otf jpg compression, blur and resize, trained on the FFHQ dataset. This has been trained on and for faces, but I guess it can also be used for other photos; it might be able to retain skin detail. This is not face restoration, but simply a 4x upscaler trained on faces; therefore input images need to be of good quality if good output quality is desired.
DAT
4x
4xFFHQLDAT
4xFFHQLDAT
4xFFHQLDAT
Since the above 4xFFHQDAT model is not able to handle the noise present in low quality input images, I made a small variant/finetune of it, the 4xFFHQLDAT model. This model might come in handy if your input image is of bad quality/not suited for the previous model. I basically made this model in response to an input image posted in the upscaling-results channel as a request for this upscale model (since 4xFFHQDAT would not be able to handle the noise); see the Imgsli1 example below for the result.
Compact
1x
wtp_descreentone_compact
wtp_descreentone_compact
wtp_descreentone_compact
The model was created as a compact analog of popular models that remove a screentone in the process of overlaying a new screentone. Speed (image: 1441x2048, video card: 3060 12gb): 1x_wtp_descreentone_compact: 1.2 sec.
DCTLSA
4x
dctlsa_pretrained
dctlsa_pretrained
by musl
Model trained only on downscale degradation (bicubic, bilinear, nearest, lanczos and mitchell). Can be used to start new DCTLSA models.
DAT
4x
4xNomos8kDAT
4xNomos8kDAT
4xNomos8kDAT
A 4x photo upscaler with otf jpg compression, blur and resize, trained on musl's Nomos8k_sfw dataset for realistic SR, this time based on the DAT arch, as a finetune of the official 4x DAT model. The 295 MB file is the pth file, which can be run with the DAT repo GitHub code. The 85.8 MB file is an onnx conversion. All files can be found in this google drive folder. If the above onnx file is not working, you can try the other conversions in the onnx subfolder. Examples: Imgsli1 (generated with onnx file) Imgsli2 (generated with onnx file) Imgsli (generated with the testscript of the DAT repo on the three test images in dataset/single with the pth file)
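Running one of these onnx conversions could look like the following sketch with onnxruntime. The NCHW float32 [0, 1] layout is an assumption that matches common SR exports (an fp16 export would need float16 inputs instead), and the file names are placeholders.

```python
import numpy as np
import onnxruntime as ort
from PIL import Image

sess = ort.InferenceSession("4xNomos8kDAT.onnx",
                            providers=["CPUExecutionProvider"])

# Load image, normalize to [0, 1], reshape HWC -> NCHW batch of 1
img = np.asarray(Image.open("input.png").convert("RGB"), np.float32) / 255.0
x = img.transpose(2, 0, 1)[None]

# Feed the model's first input, take its first output
y = sess.run(None, {sess.get_inputs()[0].name: x})[0]

out = (y[0].transpose(1, 2, 0).clip(0.0, 1.0) * 255.0).round().astype(np.uint8)
Image.fromarray(out).save("output.png")
```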
SPAN
2x
No Image
spanx2_ch48
Official 2x pretrain for SPAN.
SPAN
4x
No Image
spanx4_ch48
Official 4x pretrain for SPAN.
OmniSR
1x
DeJPG_OmniSR
DeJPG_OmniSR
DeJPG_OmniSR
1xDeJPG OmniSR model, meant to remove jpg artifacts. The first compression round on the training dataset was quality 40-95, the second round was 80-100. Example: https://slow.pics/c/70nGEvPR PS: these are non-otf models. Otf experiments have not quite done it for me (they would probably need more adjusting), but they can be found in the experiments folder and their outputs in the outputs folder: https://drive.google.com/drive/folders/1RRXZCRVqlqU_iaeF-RGUM9UBWtI0HseY?usp=drive_link
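The two-round compression described above is easy to reproduce for dataset preparation. A minimal Pillow sketch (quality ranges taken from the description; everything else is assumed):

```python
import io
import random

from PIL import Image

def double_jpeg(img: Image.Image) -> Image.Image:
    """Apply two rounds of JPEG: quality 40-95, then 80-100."""
    for lo, hi in ((40, 95), (80, 100)):
        buf = io.BytesIO()
        img.save(buf, format="JPEG", quality=random.randint(lo, hi))
        buf.seek(0)
        img = Image.open(buf).convert("RGB")  # decode back, stay in memory
    return img
```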
SRFormer
1x
DeJPG_SRFormer_light
DeJPG_SRFormer_light
DeJPG_SRFormer_light
1xDeJPG SRFormer light model, meant to remove jpg artifacts. The first compression round on the training dataset was quality 40-95, the second round was 80-100. Onnx conversions are in this folder, which contains all files: https://drive.google.com/drive/folders/1RRXZCRVqlqU_iaeF-RGUM9UBWtI0HseY Example: https://slow.pics/c/70nGEvPR PS: these are non-otf models. Otf experiments have not quite done it for me (they would probably need more adjusting), but they can be found in the experiments folder and their outputs in the outputs folder: https://drive.google.com/drive/folders/1RRXZCRVqlqU_iaeF-RGUM9UBWtI0HseY?usp=drive_link
Compact
2x
smbss-2x (small)
smbss-2x (small)
smbss-2x (small)
A 2x model designed for hand-drawn animation of the 80's, 90's and early 2000s. The Super Mario Bros. Super Show is a hybrid animated and live-action show that debuted in 1989. Each episode featured a cartoon segment sandwiched between live-action segments. The combination of the two formats makes the show very difficult to upscale, and creates the need for multiple model files. These models were trained using footage from the show itself (except for some title images that were recreated in Photoshop) by finding cases where the same image was shown at different resolutions. For the animated footage, I also did some retouching to reduce haloing, color bleed, and other issues. A few notes: - The opening sequence (in particular, the first shot with the giant talking Mario face) is very difficult to upscale due to the picture being degraded from multiple layers of compositing. Ideally, a professional remaster of the series would have access to the individual elements, upscale those, then recomposite them. - Although live-action models are provided, you will likely get better results from commercial upscaling software (I recommend the Proteus model in Topaz Video AI). - Footage must be deinterlaced before processing (I recommend QTGMC for both live-action and animated segments). Full Read Me Sample Animated Result Sample Live Action Result
ESRGAN
2x
smbss-2x (large)
smbss-2x (large)
smbss-2x (large)
A 2x model designed for hand-drawn animation of the 80's, 90's and early 2000s. The Super Mario Bros. Super Show is a hybrid animated and live-action show that debuted in 1989. Each episode featured a cartoon segment sandwiched between live-action segments. The combination of the two formats makes the show very difficult to upscale, and creates the need for multiple model files. These models were trained using footage from the show itself (except for some title images that were recreated in Photoshop) by finding cases where the same image was shown at different resolutions. For the animated footage, I also did some retouching to reduce haloing, color bleed, and other issues. A few notes: - The opening sequence (in particular, the first shot with the giant talking Mario face) is very difficult to upscale due to the picture being degraded from multiple layers of compositing. Ideally, a professional remaster of the series would have access to the individual elements, upscale those, then recomposite them. - Although live-action models are provided, you will likely get better results from commercial upscaling software (I recommend the Proteus model in Topaz Video AI). - Footage must be deinterlaced before processing (I recommend QTGMC for both live-action and animated segments). Full Read Me Sample Animated Result Sample Live Action Result
Compact
2x
smbss-2x (medium)
smbss-2x (medium)
smbss-2x (medium)
A 2x model designed for hand-drawn animation of the 80's, 90's and early 2000s. The Super Mario Bros. Super Show is a hybrid animated and live-action show that debuted in 1989. Each episode featured a cartoon segment sandwiched between live-action segments. The combination of the two formats makes the show very difficult to upscale, and creates the need for multiple model files. These models were trained using footage from the show itself (except for some title images that were recreated in Photoshop) by finding cases where the same image was shown at different resolutions. For the animated footage, I also did some retouching to reduce haloing, color bleed, and other issues. A few notes: - The opening sequence (in particular, the first shot with the giant talking Mario face) is very difficult to upscale due to the picture being degraded from multiple layers of compositing. Ideally, a professional remaster of the series would have access to the individual elements, upscale those, then recomposite them. - Although live-action models are provided, you will likely get better results from commercial upscaling software (I recommend the Proteus model in Topaz Video AI). - Footage must be deinterlaced before processing (I recommend QTGMC for both live-action and animated segments). Full Read Me Sample Animated Result Sample Live Action Result
Compact
2x
AnimeJaNai v2 Compact
AnimeJaNai v2 Compact
AnimeJaNai v2 Compact
Real-time 2x Real-ESRGAN Compact/UltraCompact/SuperUltraCompact models designed for upscaling 1080p anime to 4K. The models correct the inherent blurriness found in anime while preserving details and colors. Optimized for upscaling from 1080p to 4K, but can still produce worthwhile results when upscaling some lower-resolution anime. Can be set up to run in real-time with mpv on Windows using https://github.com/the-database/mpv-upscale-2x_animejanai Most HD anime are not produced in native 1080p resolution but rather have a production resolution between 720p and 1080p. When the anime is distributed to consumers, the video is scaled up to 1080p, causing scaling artifacts and blur in the video. The aim is to address these issues while upscaling to deliver a result that appears as if the anime was originally mastered in 4K resolution. Development of the V2 models spanned over four months, during which over 200 release candidate models were trained and meticulously refined. The V2 models introduce several improvements compared to V1: more accurate "native-res aware" sharpening, so the model works just as well on blurry native 720p sources, sharper native 1080p sources, and everything in between; more accurate colors, including line colors; improved artifact handling; better preservation and enhancement of background details and grain. Overall, the V2 models yield much more natural and faithful results compared to V1.
Compact
2x
AnimeJaNai v2 SuperUltraCompact
AnimeJaNai v2 SuperUltraCompact
AnimeJaNai v2 SuperUltraCompact
Real-time 2x Real-ESRGAN Compact/UltraCompact/SuperUltraCompact models designed for upscaling 1080p anime to 4K. The models correct the inherent blurriness found in anime while preserving details and colors. Optimized for upscaling from 1080p to 4K, but can still produce worthwhile results when upscaling some lower-resolution anime. Can be set up to run in real-time with mpv on Windows using https://github.com/the-database/mpv-upscale-2x_animejanai Most HD anime are not produced in native 1080p resolution but rather have a production resolution between 720p and 1080p. When the anime is distributed to consumers, the video is scaled up to 1080p, causing scaling artifacts and blur in the video. The aim is to address these issues while upscaling to deliver a result that appears as if the anime was originally mastered in 4K resolution. Development of the V2 models spanned over four months, during which over 200 release candidate models were trained and meticulously refined. The V2 models introduce several improvements compared to V1: more accurate "native-res aware" sharpening, so the model works just as well on blurry native 720p sources, sharper native 1080p sources, and everything in between; more accurate colors, including line colors; improved artifact handling; better preservation and enhancement of background details and grain. Overall, the V2 models yield much more natural and faithful results compared to V1.
Compact
2x
AnimeJaNai v2 UltraCompact
AnimeJaNai v2 UltraCompact
AnimeJaNai v2 UltraCompact
Real-time 2x Real-ESRGAN Compact/UltraCompact/SuperUltraCompact models designed for upscaling 1080p anime to 4K. The models correct the inherent blurriness found in anime while preserving details and colors. Optimized for upscaling from 1080p to 4K, but can still produce worthwhile results when upscaling some lower-resolution anime. Can be set up to run in real-time with mpv on Windows using https://github.com/the-database/mpv-upscale-2x_animejanai Most HD anime are not produced in native 1080p resolution but rather have a production resolution between 720p and 1080p. When the anime is distributed to consumers, the video is scaled up to 1080p, causing scaling artifacts and blur in the video. The aim is to address these issues while upscaling to deliver a result that appears as if the anime was originally mastered in 4K resolution. Development of the V2 models spanned over four months, during which over 200 release candidate models were trained and meticulously refined. The V2 models introduce several improvements compared to V1: more accurate "native-res aware" sharpening, so the model works just as well on blurry native 720p sources, sharper native 1080p sources, and everything in between; more accurate colors, including line colors; improved artifact handling; better preservation and enhancement of background details and grain. Overall, the V2 models yield much more natural and faithful results compared to V1.
DAT
4x
DAT_x4
Official 4x pretrain of Dual Aggregation Transformer for Image Super-Resolution (DAT). Transformer has recently gained considerable popularity in low-level vision tasks, including image super-resolution (SR). These networks utilize self-attention along different dimensions, spatial or channel, and achieve impressive performance. This inspires us to combine the two dimensions in Transformer for a more powerful representation capability. Based on the above idea, we propose a novel Transformer model, Dual Aggregation Transformer (DAT), for image SR. Our DAT aggregates features across spatial and channel dimensions, in the inter-block and intra-block dual manner. Specifically, we alternately apply spatial and channel self-attention in consecutive Transformer blocks. The alternate strategy enables DAT to capture the global context and realize inter-block feature aggregation. Furthermore, we propose the adaptive interaction module (AIM) and the spatial-gate feed-forward network (SGFN) to achieve intra-block feature aggregation. AIM complements two self-attention mechanisms from corresponding dimensions. Meanwhile, SGFN introduces additional non-linear spatial information in the feed-forward network. Extensive experiments show that our DAT surpasses current methods.
SRFormer
2x
2xHFA2kAVCSRFormer_light
2x SRFormer_light anime upscale model that handles AVC (h264) compression: h264 crf 20-28 degradation together with bicubic, bilinear, box and lanczos downsampling was applied to musl's HFA2k dataset with Kim's dataset destroyer for training. If you want to run this model with chaiNNer (or another application), you need to use the ONNX files with an ONNX upscale node. All ONNX conversions can be found in the onnx folder on my repo. Example 1: https://imgsli.com/MTkxMTQz Example 2: https://imgsli.com/MTkxMTQ0
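For anyone unsure what an "ONNX upscale node" boils down to, here is a minimal sketch of running one of the provided ONNX files directly with onnxruntime. The filename and the NCHW float32 RGB-in-[0,1] input convention are assumptions about this export, not documented facts; an fp16 export would need float16 input instead.

```python
import numpy as np
import onnxruntime as ort
from PIL import Image

# Hypothetical filename; input/output layout assumed to be 1x3xHxW float32 in [0, 1]
sess = ort.InferenceSession("2xHFA2kAVCSRFormer_light.onnx",
                            providers=["CPUExecutionProvider"])
input_name = sess.get_inputs()[0].name

img = np.asarray(Image.open("frame.png").convert("RGB"), dtype=np.float32) / 255.0
x = img.transpose(2, 0, 1)[None]            # HWC -> 1x3xHxW

(y,) = sess.run(None, {input_name: x})      # single output assumed
out = (y[0].transpose(1, 2, 0).clip(0.0, 1.0) * 255).round().astype(np.uint8)
Image.fromarray(out).save("frame_2x.png")
```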
SwinIR
2x
2xLexicaSwinIR
2x upscaler for AI-generated images. Trained on 43856 images from lexica.art, so it's trained specifically on that model's output, but it should work on AI-generated images in general.
HAT
4x
4xLexicaHAT
4x upscaler for AI-generated images. Trained on 43856 images from lexica.art, so it's trained specifically on that model's output, but it should work on AI-generated images in general.
ESRGAN
4x
4xLSDIR
A normal ESRGAN model without degradations and without any pretrain; simply an RRDBNet model trained on a paired dataset (4x downsampled) of the full LSDIR dataset (84,991 images / 165 GB).
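As an aside on running such a plain RRDBNet .pth outside GUI tools: one option is the spandrel loader, which detects the architecture from the checkpoint itself. A minimal sketch with a hypothetical filename; this is not the author's documented workflow.

```python
import numpy as np
import torch
from PIL import Image
from spandrel import ModelLoader

# Loads the checkpoint and auto-detects RRDBNet; returns an ImageModelDescriptor
model = ModelLoader().load_from_file("4xLSDIR.pth")
model.eval()

img = np.asarray(Image.open("photo.png").convert("RGB"), dtype=np.float32) / 255.0
x = torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0)  # BCHW in [0, 1]

with torch.no_grad():
    y = model(x)  # 4x upscaled BCHW tensor

out = (y[0].permute(1, 2, 0).clamp(0, 1).numpy() * 255).round().astype(np.uint8)
Image.fromarray(out).save("photo_4x.png")
```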
ESRGAN
4x
4xLSDIRplus
Interpolation of 4xLSDIRplusC and 4xLSDIRplusR to handle jpg compression and a little bit of noise/blur
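The interpolation mentioned here is, in the usual community sense, a weighted average of two checkpoints that share an architecture. A minimal sketch, assuming a plain 50/50 blend and flat state dicts (the author's exact weights are not stated here):

```python
import torch

# Real checkpoints are sometimes nested under a "params" key; unwrap first if so.
a = torch.load("4xLSDIRplusC.pth", map_location="cpu")
b = torch.load("4xLSDIRplusR.pth", map_location="cpu")
alpha = 0.5  # assumed blend weight

# Element-wise lerp of every tensor in the two state dicts
merged = {k: alpha * a[k] + (1 - alpha) * b[k] for k in a}
torch.save(merged, "4xLSDIRplus_interp.pth")
```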
ESRGAN
4x
4xLSDIRplusC
The RealESRGAN_x4plus finetuned with the big LSDIR dataset (84,991 images / 165 GB), with manually added jpg compression.
ESRGAN
4x
4xLSDIRplusN
The RealESRGAN_x4plus finetuned with the big LSDIR dataset (84,991 images / 165 GB), no degradation.
ESRGAN
4x
4xLSDIRplusR
The RealESRGAN_x4plus finetuned with the big LSDIR dataset (84,991 images / 165 GB), with jpg compression and noise and blur
HAT
4x
4xNomos8kSCHAT-L
4x photo upscaler with otf jpg compression and blur, trained on musl's Nomos8k_sfw dataset for realistic SR. Provided are an fp16 onnx (154.1MB) download and a pth (316.2MB) download. Since this is a big model, upscaling might take a while.
HAT
4x
4xNomos8kSCHAT-S
4x photo upscaler with otf jpg compression and blur, trained on musl's Nomos8k_sfw dataset for realistic SR. HAT-S version/model. Fp16 onnx (56.5MB) and pth (77.3MB) downloads are provided.
SRFormer
4x
4xNomos8kSCSRFormer
4x photo upscaler with otf jpg compression and blur, trained on musl's Nomos8k_sfw dataset for realistic SR. SRFormer base model/version. Provided are onnx fp16 (45.8MB) and pth (185.4MB) downloads.
Compact
2x
Ani4K
Ani4K is a 2x model for creating natural-looking 2K and 4K upscales of modern anime. The model places emphasis on depth of field and blur effect retention, which are extremely common in modern anime and something that many anime models struggle with. In addition, Ani4K is trained for strong detail retention, WEB/BD compression clean-up, and producing pleasing lineart. More info on the Github page: https://github.com/Sirosky/Sirosky-Upscaling-Models. Blur Retention Comparisons (including vs. xinntao's 2x paper model): https://imgsli.com/MTg2NTg5/4/5 General Comparisons: https://imgsli.com/MTg2ODI2
Compact
2x
2xHFA2kAVCCompact
A 2x Compact anime upscale model that handles AVC (h264) degradation. Applied h264 crf 20-28 degradation together with bicubic, bilinear, box and lanczos downsampling on musl's HFA2k dataset with Kim's dataset destroyer.
EDSR
2x
2xHFA2kAVCEDSR_M
A 2x EDSR M anime upscale model that handles AVC (h264) degradation. Applied h264 crf 20-28 degradation and bicubic, bilinear, box and lanczos downsampling on musl's HFA2k dataset with Kim's dataset destroyer.
OmniSR
2x
2xHFA2kAVCOmniSR
The second released community-trained model on the OmniSR network. Trained with a multiscale discriminator to fix the decreased output brightness occurring with OmniSR. 2x anime upscale that handles AVC (h264) compression: h264 crf 20-28 degradation together with bicubic, bilinear, box and lanczos downsampling was applied to musl's HFA2k dataset with Kim's dataset destroyer.
Compact
2x
2xken-v1-eva-01
I recommend filtering the source before passing it to the model. Note: if the original trainers of the models I used as teachers for this model are not cool with them being used in this way, I will remove this model.
GRL
4x
4xHFA2kLUDVAEGRL_small
4x anime super-resolution with real degradation.
OmniSR
4x
4x_ardo
by musl
This is the first community model using the OmniSR network. I trained it mostly to test this network. Note: the model output decreases brightness. This is a well-known issue we had with OmniSR.
SRFormer
4x
4xHFA2kLUDVAESRFormer_light
4x lightweight anime upscaler with realistic degradations (compression, noise, blur).
SwinIR
4x
4xHFA2kLUDVAESwinIR_light
4x lightweight anime upscaler with realistic degradations (compression, noise, blur).
DITN
4x
ditn_w16_pretrain
by musl
Meant to quick-start new DITN models trained on `neosr`. Trained only on downscale degradation (bicubic, bilinear, nearest, lanczos and mitchell). Note: Flash Attention breaks inference compatibility with the official code.
ESRGAN
2x
2xLexicaRRDBNet
2x upscaler for AI-generated image output. Trained on 43856 images from lexica.art, so it's trained specifically on that model's output, but it should work on AI-generated images in general.
ESRGAN
2x
2xLexicaRRDBNet_Sharp
It's like the 2xLexicaRRDBNet model, but trained somewhat longer with l1_gt_usm and percep_gt_usm set to true, resulting in sharper outputs. I provide both so users can choose based on preference.
ESRGAN
2x
2x_Garfield1_308k
Meant to preserve original 4:3 framing of G&F while upscaling. Trained to handle all seven seasons, which drastically differ in quality. Fixes include rainbowing, dot-crawl, halos, general noise, and of course MPEG-2 artifacts. It should be generally flexible in dealing with different degrees of those artifacts, but as always the cleaner the source the better. The model could be useful with other shows from the '80s and '90s with ugly video sources. Maybe.
ESRGAN
2x
2x_DBZScanLite
Using the Kineko 4K scan, the hope was to clean up the Funi DVD releases and upscale to something comparable. This model degrains the series pretty damn well, but what's revealed beneath is a lot of moving noise that's probably best dealt with in pre- or post-processing with a tool like AviSynth. So while the model is remarkably temporally stable, the DBZ sources certainly aren't. The movies look really nice when upscaled with raw input, though.
ESRGAN
4x
Nomos8kSC
4x photo upscaler with otf jpg compression and blur, trained on musl's Nomos8k_sfw dataset for realistic SR
ESRGAN
4x
HFA2k
4x anime image upscaler with a bit of otf jpg compression and blur. Trained on musl's hfa2k dataset release for Anime SISR, which has been extracted from modern anime films, where the selection criteria were high SNR, no DOF, and high-frequency information. Examples: https://imgsli.com/MTc2NDgx (PS: example input images are small, each around 500x280 px). Example input files: https://drive.google.com/drive/folders/1RI6gGqRy-KxDujbaIrpEMdvFkIWHvfPt Example output files: https://drive.google.com/drive/folders/1GqHwPlFp6bIQl4R1AxmrmJ6vUUD8FUxH All my model files can be found on my github repo https://github.com/Phhofm/models
Compact
2x
Parimg Compact
A 2x photo upscaling compact model based on Microsoft's image pairs. This was one of the earliest models I started training, and I have now finished it for release.
LUDVAE
1x
DM600_LUDVAE
by musl
Create realistic noise and compression artifacts on images. This is meant to synthesize paired datasets for SISR training.
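As a generic illustration of what "synthesizing a paired dataset" means (this sketch does not use the LUDVAE model itself), one can degrade each HR image with a downscale plus JPEG re-compression to produce an aligned LR counterpart. The paths, scale, and quality range below are made up.

```python
import io
import random
from pathlib import Path
from PIL import Image

def make_lr(hr: Image.Image, scale: int = 4) -> Image.Image:
    # Downscale, then round-trip through JPEG to add synthetic compression
    lr = hr.resize((hr.width // scale, hr.height // scale), Image.BICUBIC)
    buf = io.BytesIO()
    lr.save(buf, "JPEG", quality=random.randint(30, 95))
    buf.seek(0)
    return Image.open(buf).convert("RGB")

Path("lr").mkdir(exist_ok=True)
for path in Path("hr").glob("*.png"):
    hr = Image.open(path).convert("RGB")
    make_lr(hr).save(Path("lr") / (path.stem + ".jpg"))
```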
Compact
4x
animerd v1
by musl
Anime super-resolution, with real degradation.
OmniSR
2x
OmniSR 2x DF2K
Omni Aggregation Networks for Lightweight Image Super-Resolution (OmniSR). While lightweight ViT framework has made tremendous progress in image super-resolution, its uni-dimensional self-attention modeling, as well as homogeneous aggregation scheme, limit its effective receptive field (ERF) to include more comprehensive interactions from both spatial and channel dimensions. To tackle these drawbacks, this work proposes two enhanced components under a new Omni-SR architecture. First, an Omni Self-Attention (OSA) block is proposed based on dense interaction principle, which can simultaneously model pixel-interaction from both spatial and channel dimensions, mining the potential correlations across omni-axis (i.e., spatial and channel). Coupling with mainstream window partitioning strategies, OSA can achieve superior performance with compelling computational budgets. Second, a multi-scale interaction scheme is proposed to mitigate sub-optimal ERF (i.e., premature saturation) in shallow models, which facilitates local propagation and meso-/global-scale interactions, rendering an omni-scale aggregation building block. Extensive experiments demonstrate that Omni-SR achieves record-high performance on lightweight super-resolution benchmarks (e.g., 26.95dB@Urban100 ×4 with only 792K parameters). Our code is available at https://github.com/Francis0625/Omni-SR
OmniSR
2x
OmniSR 2x DIV2K
Same description as the OmniSR 2x DF2K entry above.
OmniSR
3x
OmniSR 3x DF2K
Same description as the OmniSR 2x DF2K entry above.
OmniSR
3x
OmniSR 3x DIV2K
Same description as the OmniSR 2x DF2K entry above.
OmniSR
4x
OmniSR 4x DF2K
Same description as the OmniSR 2x DF2K entry above.
OmniSR
4x
OmniSR 4x DIV2K
Same description as the OmniSR 2x DF2K entry above.
Compact
2x
HFA2kCompact
A compact anime 2x upscaling model based on musl's HFA2k dataset. Compact 2x anime upscaler with otf compression and blur. The '2xHFA2kCompact.pth' (4.6 MB) is the original trained model file; the other model files are conversions using chaiNNer. Trained on musl's latest dataset release for Anime SISR, which has been extracted from modern anime films, where the selection criteria were high SNR, no DOF, and high-frequency information. Examples: https://imgsli.com/MTcxNjA4 Example input files: https://drive.google.com/drive/folders/1VSPw8m7VbZO6roM9syE7Nf2QYwwn7GUz Example output files: https://drive.google.com/drive/folders/1NFfomnv6d5RtWy_GwOsKO3uZ_HNOo-2i Future work: started training a 4x RRDBNet version of this (a bigger model, so maybe better results but slower) for still/single anime images. Will release it in the future or drop it, based on achieved results.
Compact
2x
AniScale
2x general purpose anime upscaler, with a focus on cleaning up scuffed sources while enhancing texture detail. While the model was trained on DVDs, it still works well on modern anime. The model is trained to deal with noise, compression artifacts, blur, bleeding, haloing and scuffed line art. Apparently, it also learned to deal with some dotcrawl.
Compact
4x
FatePlus Compact
Images from the Fate/Extra games. Works better on the newer ones. Just a model I trained quickly to test traiNNer-redux and see what I could do with OTF. Works well enough
Compact
4x
LSDIR Compact C3
Upscale compressed photos to x4 their size. Able to handle JPG compression (30-100).
Compact
4x
LSDIRCompactN
Upscale good-quality input photos to x4 their size. The original 4xLSDIRCompact trained a bit more; it cannot handle degradation.
Compact
4x
LSDIR Compact R3
Upscale (degraded) photos to x4 their size. Trained on synthetic data, meant to handle more degradations
DAT
4x
DAT_2_x4
Official DAT2 4x pretrain of Dual Aggregation Transformer for Image Super-Resolution (DAT). Same description as the DAT_x4 entry above.
CUGAN
2x
sudo_shuffle_cugan_9.584.969
This is my attempt at making cugan more efficient and also training it with better loss functions than the official training code. Took me months to train on a 4090; it is pure pain to train this network. Nobody will probably beat my iter record any time soon. I tried to balance speed/vram and quality. The image quality is very close to the cugan architecture while having ultracompact speed. It usually looks way better than compact, but has different kinds of artefacts. If you have high-contrast lines that are one or a few pixels wide, or thin lines with movement, then the model will struggle a bit. That is quite rare though. Exported as onnx for mlrt or vsgan: https://github.com/styler00dollar/VSGAN-tensorrt-docker. I also included the old l1 for pretrain purposes, but be warned, it is hard to train.
HAT
4x
HAT-S_SRx4
Official Paper model. See https://github.com/XPixelGroup/HAT for more details.
Compact
4x
Rybu
by musl
General realistic super-resolution.
ESRGAN
1x
DEPVR
Attempts to remove PVRTC compression artifacts. If 1x_DEPVR is unable to remove some artifacts, you should use the other version, but it can't remove color artifacts properly, so you should overlay the color of the normal version onto it.
ESRGAN
1x
DEPVR Artifact
Same description as the DEPVR entry above.
Compact
4x
LSDIR Compact v2
Upscale photos to x4 their size. 4xLSDIRCompactv2 supersedes the previously released models, it combines all my progress on my compact model. Both CompactC and CompactR had received around 8 hours more training since release with batch size 10 (CompactR had only been up to 5 on release), and these two were then interpolated together. This allows v2 to handle some degradations, while preserving the details of the CompactC model. Examples: https://imgsli.com/MTY0Njgz/0/2
Compact
4x
LSDIR Compact C
Upscale small photos with compression to 4x their size. Trying to extend my previous model to be able to handle compression (JPG 100-30) by manually altering the training dataset, since 4xLSDIRCompact can't handle compression. Use this instead of 4xLSDIRCompact if your photo has compression (like an image from the web).
Compact
4x
LSDIR Compact R
Upscale small photos with compression, noise and slight blur to 4x their size. Extending my last 4xLSDIRCompact model to Real-ESRGAN, meaning trained on synthetic data instead to handle more kinds of degradations, it should be able to handle compression, noise, and slight blur. Here is a comparison to show that 4xLSDIRCompact cannot handle compression artifacts, and that these two models will produce better output for that specific scenario. These models are not ‘better’ than the previous one, they are just meant to handle a different use case.
Compact
4x
LSDIR Compact
Upscale small good-quality photos to 4x their size. My first ever model 😄 Well, it's not the best, but it's something 😉 I provide some 15 examples from the validation set here for you to visually see the generated output (with chaiNNer); photo dimensions are in the name.
Compact
2x
90s Cartoon v1
Upscale 90s and early 2000s American cartoons
Compact
2x
GT-v2-evA
This is the v2 of my 2xGT model; the main purpose is to improve upon the v1 model and make it more generalized. For upscaling videos that have grain, I would recommend denoising and de-haloing before passing them through the model.
ESRGAN
1x
GainRESV4
Eliminate aliasing and general artifacts caused by insufficient resolution. The model knows to add clarity and soften oversharpened areas, giving the image a supersampled look. I had a better version that's more faithful to details, but I lost it.
HAT
4x
HAT-L_SRx4_ImageNet-pretrain
Official Paper pretrain model. See https://github.com/XPixelGroup/HAT for more details.
Compact
2x
HDCube Compact
GameCube and Wii texture dumps. Designed as a faster alternative to the full-size model; it requires much less VRAM and is significantly faster.
Compact
2x
AnimeJaNai Standard v1 Compact
Realtime 2x model intended for high or medium quality 1080p anime, with an emphasis on correcting the inherent blurriness of anime while preserving details and colors. Not suitable for artifact-heavy or highly compressed content, as it will just sharpen artifacts. Also works with SD anime by running the model twice. Can be set up to run with mpv on Windows using https://github.com/the-database/mpv-upscale-2x_animejanai A minimum of an RTX 3080 is recommended for running the UltraCompact model on 1080p in realtime; an RTX 4090 is required to run Compact on 1080p in realtime. SuperUltraCompact should run in realtime on 1080p on some lower cards. The Compact model is recommended for SD content. Samples: https://imgsli.com/MTUxMDYx Comparisons to Anime4K + other compact models and upscalers: https://imgsli.com/MTUxMjY4
Compact
2x
AnimeJaNai Standard v1 SuperUltraCompact
Same description as the AnimeJaNai Standard v1 Compact entry above.
Compact
2x
AnimeJaNai Standard v1 UltraCompact
Same description as the AnimeJaNai Standard v1 Compact entry above.
Compact
2x
AnimeJaNai Strong v1 Compact
Same description as the AnimeJaNai Standard v1 Compact entry above.
Compact
2x
AnimeJaNai Strong v1 UltraCompact
Same description as the AnimeJaNai Standard v1 Compact entry above.
Compact
2x
Futsuu Anime
This model upscales while doing some sharpening and line darkening. It can also clean up some minor artifacts of various types. It is intended to be a good general-purpose upscaler that will work well with most animation.
ESRGAN
4x
RealisticRescaler
This model was made to upscale realistic low-res textures that are compressed by either JPEG or BC1. From my testing, this works rather well on realistic GameCube textures such as the ones from Shrek Extra Large and the board textures from Mario Party 4. This model could also work on some real life images, especially the ones that are taken outdoors.
Compact
2x
LD-Anime Compact
by Skr and Zarxrax
I trained Skr's great LD-Anime model on compact architecture. It upscales while fixing numerous video problems, including: noise/grain, compression artifacts, rainbows, dot crawl, halos and color bleed. This compact version may look slightly worse than Skr's original model, but runs significantly faster and also retains the correct colors better than the original model did.
ESRGAN
4x
escale
A 4x model for Anime / Visual Novel Art. Third iteration of my eroge upscaling model. Discriminator: https://drive.google.com/file/d/18q-4ktFNZ8tPjkszuoG6kVBhCjKl8iFa/view?usp=sharing
Compact
2x
GT-evA
This is my first 2x Compact model; the main purpose is to upscale Dragon Ball GT. For upscaling videos that have grain, I would recommend denoising and dehaloing before passing them to the model, for temporal stability.
ESRGAN
4x
eula_anifilm v1
Upscaling cel animation. Trained this more than a year ago; releasing it because I've got a much better v2 and v3 now.
ESRGAN
4x
HDCube2
GameCube and Wii texture dumps. https://github.com/Venomalia/HDcube/tree/main/v2
ESRGAN
4x
HDCube3
A 4x model for GameCube and Wii textures. Pretrained using 4x_HDcube2. It can be used for all image formats supported by GameCube and Wii hardware and can remove their typical artifacts like CMPR block compression (DXT1 algorithm, also known as BC1), color palette errors, and color reduction up to 8-bit color depth and 1-bit alpha depth.
Compact
1x
Dotzilla Compact
Wipes out dot crawl and rainbows in animation.
Compact
1x
Epsilon-one compact
This model is far from perfect, but it does a decent job of removing dot crawl and dehaloing/deblocking old anime at a fast rate without removing many details. Chaining it with models like 2x-anifilm-compact, 2xLD-ANIME, or 2x_AnimeClassics_UltraLite_510K will give good results, I think.
ESRGAN
1x
Ghibli Grain
A 1x model for realistic Ghibli grain on text or anime. An attempt to get the nostalgic grain feel of classically animated Ghibli movies. Got the idea from researching the making of Ronin's 1x_AnimeFilmGrain28k. This was made by running Neat Video denoising with the preset "more denoising" on Kiki's Delivery Service, then overlaying it at 66.65 percent over the original with DaVinci. On digitally drawn anime it gives a slightly [10%] more organic/sharp feel to black lines. Matches well with content that already has a light digital grain. Run twice for heavy grain. Eternal thanks to all that indulge my wonderings.
ESRGAN
1x
Kim2091 DeJpeg v0
A 1x model. Pretrained using a custom JPEG dataset. A model I forgot to release. This doesn't totally remove JPEG artifacts, but it does a decent job at a fast rate. It seemingly does a better job of retaining detail than some other JPEG models. The model is incomplete; I need to train it further on Compact rather than ESRGAN. This is just a temporary release.
Compact
2x
Bubble AnimeScale Compact v1
A 2x model. Pretrained using 4x_muy4_035_1.pth. This is my first model, so it's not perfect, but I wanted to see if I could train an upscaling model that didn't result in a lot of detail loss and deblurring like the current Compact upscaling models. I believe I accomplished this, but I was unable to reduce contrast shifting. The contrast shifting may cause skin tones to appear incorrect on bright frames, but it's not too bad overall! I'll list a few examples below; more can be found by clicking the Overview link on the Github release page. Pretrained model: Kim2091_CrappyCompactV2.pth
SwinIR
2x
Bubble AnimeScale SwinIR Small v1
2x_Bubble_AnimeScale_SwinIR_Small_v1 was trained to upscale anime frames faithfully without major contrast shifting compared to my compact model. Although much slower compared to my compact model, the results look significantly better! A few example upscales are listed below; more can be found by clicking the Overview link on the Github release page.
ESRGAN
4x
Link
A 4x model for chainmail game textures; alternatively, it can be used to turn plain images into chainmail. A highly generative model for chainmail game textures. Works fairly well on items without a visible chainmail texture (i.e. just a grey area) or items with a clear chainmail texture; less well on items with present but poorly defined chainmail texture. I'm not 100% happy with it, but it is still an improvement over existing models in enough situations to be worth releasing.
Compact
4x
RealESR General WDN x4 v3
Real-ESRGAN aims at developing practical algorithms for general image/video restoration. We extend the powerful ESRGAN to a practical restoration application (namely, Real-ESRGAN), which is trained with pure synthetic data. A tiny model for general scenes. De-noising version, meant to be interpolated with the normal one.
Compact
4x
RealESR General x4 v3
Real-ESRGAN aims at developing practical algorithms for general image/video restoration. We extend the powerful ESRGAN to a practical restoration application (namely, Real-ESRGAN), which is trained with pure synthetic data. A tiny model for general scenes.
ESRGAN
1x
ITF SkinDiffDDS v1
Removes banding, blocking, dithering, aliasing, noise and color tint on DDS Compressed Skin Diffuse Textures. This should work extremely well on most modern DDS compression types. The training set was compressed with BC3/DXT5, BC3/DXT5 Fast, BC2/DXT3, BC2/DXT3 Fast, and a small number of JPEG compressed images to cover outliers. This model is trained to remove the slight green color tint that DDS compression tends to add to skin textures, so the model output will not match the original color tone of the input image. This is the desired result though, as DDS compression shifts the colors to a sickly green tint and this model corrects that to more natural color tones. The training set included faces, body parts, eyes, mouths and hair in a variety of skin types and tones so it should work well on most related diffuse textures. However it's not just limited to skin, many other images and textures can be cleaned with this model. Designed to be used as a first step cleaning pass before applying additional models after. Check out the other ITF Models.
ESRGAN
2x
DigiGradients Lite
A 2x model for digital animation. A very focused model meant for upscaling the TMNT 2003 DVDs. Degradations were added via AviSynth in order to match the video on the TMNT 2003 DVDs and correct the source problems. Problems corrected include aliased red chroma, chroma vertical blur, bad deinterlacing, banding, compression "grain", and poor animation line detail. The AVS scripts for the LRs were run through HCEnc to get authentic low-bitrate MPEG-2 artifacts for fixing. By design, the final model gives a very digital-looking result and does not do a good job of retaining textures, as the style of TMNT 2003 is all flats and gradients.
ESRGAN
4x
SmolFace
A 4x model for Art/People. A sharp upscaler trained specifically for small sprite faces. Does not blend, so avoid using on painted/photo portraits unless you were trying to retain more of the outlines somehow.
ESRGAN
4x
SmolFace Clean
Same description as the SmolFace entry above.
ESRGAN
4x
eula digimanga bw v2 nc1
Vast improvement over v1 in low-frequency detail; moiré and artifacting are reduced significantly, with less random noise from JPEG artefacts in the input. It also now only works on 1-channel images, so it runs slightly faster on average and the resulting images are much smaller, but it might not work on some ESRGAN implementations; I personally recommend using chaiNNer. v1 may still be better in some edge cases. There's also a supplementary 1x model that denoises very low quality LRs and smooths halftones so the image works better with the 4x model. I only trained it to help build the dataset and it's useless for already decent-ish LRs, but it may help you in some situations.
SRFormer
2x
SRFormerLight_SRx2_DIV2K
Official paper pretrain model. SRFormer: Permuted Self-Attention for Single Image Super-Resolution — Yupeng Zhou, Zhen Li, Chun-Le Guo, Song Bai, Ming-Ming Cheng, Qibin Hou (TMCC, School of Computer Science, Nankai University; ByteDance, Singapore). The official PyTorch implementation of SRFormer, which achieves state-of-the-art performance in classical, lightweight, and real-world image SR.

Abstract: In this paper, we introduce SRFormer, a simple yet effective Transformer-based model for single image super-resolution. We rethink the design of the popular shifted window self-attention, expose and analyze several characteristic issues of it, and present permuted self-attention (PSA). PSA strikes an appropriate balance between the channel and spatial information for self-attention, allowing each Transformer block to build pairwise correlations within large windows with even less computational burden. Our permuted self-attention is simple and can be easily applied to existing super-resolution networks based on Transformers. Without any bells and whistles, we show that our SRFormer achieves a 33.86dB PSNR score on the Urban100 dataset, which is 0.46dB higher than that of SwinIR but uses fewer parameters and computations. We hope our simple and effective approach can serve as a useful tool for future research in super-resolution model design. Our code is publicly available at https://github.com/HVision-NKU/SRFormer.

Installation (python 3.8, PyTorch >= 1.7.0):

```
cd SRFormer
pip install -r requirements.txt
python setup.py develop
```

Dataset: the same training and testing sets as SwinIR are used.
- Classical image SR — training: DIV2K (800 training images) or DIV2K + Flickr2K (2650 images); testing: Set5 + Set14 + BSD100 + Urban100 + Manga109
- Lightweight image SR — training: DIV2K (800 training images); testing: Set5 + Set14 + BSD100 + Urban100 + Manga109
- Real-world image SR — training: DIV2K (800 training images) + Flickr2K (2650 images) + OST (10324 images for sky, water, grass, mountain, building, plant, animal); testing: RealSRSet + 5 images

Training: download the dataset corresponding to the task, place it in the folder specified by the training option in /options/train/SRFormer, then:

```
# train SRFormer for classical SR task
./scripts/dist_train.sh 4 options/train/SRFormer/train_SRFormer_SRx2_scratch.yml
./scripts/dist_train.sh 4 options/train/SRFormer/train_SRFormer_SRx3_scratch.yml
./scripts/dist_train.sh 4 options/train/SRFormer/train_SRFormer_SRx4_scratch.yml

# train SRFormer for lightweight SR task
./scripts/dist_train.sh 4 options/train/SRFormer/train_SRFormer_light_SRx2_scratch.yml
./scripts/dist_train.sh 4 options/train/SRFormer/train_SRFormer_light_SRx3_scratch.yml
./scripts/dist_train.sh 4 options/train/SRFormer/train_SRFormer_light_SRx4_scratch.yml
```

Testing:

```
# test SRFormer for classical SR task
python basicsr/test.py -opt options/test/SRFormer/test_SRFormer_DF2Ksrx2.yml
python basicsr/test.py -opt options/test/SRFormer/test_SRFormer_DF2Ksrx3.yml
python basicsr/test.py -opt options/test/SRFormer/test_SRFormer_DF2Ksrx4.yml

# test SRFormer for lightweight SR task
python basicsr/test.py -opt options/test/SRFormer/test_SRFormer_light_DIV2Ksrx2.yml
python basicsr/test.py -opt options/test/SRFormer/test_SRFormer_light_DIV2Ksrx3.yml
python basicsr/test.py -opt options/test/SRFormer/test_SRFormer_light_DIV2Ksrx4.yml
```

Pretrain models can be downloaded from Google Drive; to reproduce the results in the paper, put them in the /PretrainModel folder. This project is released under the Apache 2.0 license. The code is based on BasicSR, Swin Transformer, and SwinIR.
SRFormer
4x
SRFormer_SRx4_DF2K
Official paper pretrain model. Same description as the SRFormerLight_SRx2_DIV2K entry above.
ESRGAN
1x
ITF SkinDiffDetail Lite v1
A 1x model for Skin Upscaling. Adds plausible high frequency detail and removes subtle blur. This is an early unfinished attempt at a x1 Lite model designed specifically for enhancing detail on skin diffuse textures of 3d characters. Even in its current state it works quite well. Best suited for uncompressed or cleaned textures - otherwise it may just enhance any existing compression artefacts too. The training set included faces, body parts, eyes and hair in a variety of skin types and tones so it should work well on most related diffuse textures. However it's not just limited to skin, many other images and textures can be enhanced with this model. The results are subtle, so run multiple times if desired. Pretrained model: 50/50 Interpolation of DIV2K-Lite and SpongeBC1-Lite
ESRGAN
1x
RedImage10000
A 1x model to correct old color photos that are tinted red. The model has only been tested on nature photos.
Compact
2x
Anifilm Compact (2x)
This model is based on a private model by @eula 5600x 3070 named 4x_eula_anifilm_v1_225k. He sent me a copy of the model, and I decided to train a compact model based on it with his permission. This model seems to fix the majority of the issues the original model had while being far faster; it's just a tiny bit softer in some images. The dataset consists of Dragon Ball movies converted to YUV24 with @sgdisk --zap-all /dev/sda's help to reduce artifacts, then upscaled with ArtClarity and eula_anifilm. LRs are the original frames right from DVD. As a result, this model corrects some color space issues. The 2x model's HRs were downscaled by 50% with Lanczos. The 2x and 4x models are pretty close in output despite being trained separately; the 2x model is a bit softer overall. The models in the Real-ESRGAN Compatible folder are the original output from Real-ESRGAN's training code, for compatibility reasons.
Compact
4x
Anifilm Compact (4x)
Same description as the Anifilm Compact (2x) entry above.
ESRGAN
4x
Morrowind 2.0
Morrowind Mod Textures
CAIN YUV
2x
explodV1
A 2x model for Animethemes. I think it's one of the sharpest models for anime. Sample video: https://cdn.discordapp.com/attachments/579685650824036387/1002663424695750686/output.mp4 Plz don't steal without credits, k thx.
ESRGAN
1x
DeRoqBeta-lite
Incomplete lite model to remove ROQ compression
Compact
1x
BleedOut Compact
This model helps repair color bleed and heavy chroma noise that may be present on some older footage, particularly that which was recorded on VHS. It also cleans up rainbows if they are present.
ESRGAN
1x
SheeepIt!
Restores frames from the show "Sheeep". This model was trained to restore "Sheeep" while retaining and enhancing the noise present in the show. The model amazingly finished training in only 8.8k iterations with no pretrain, with a dataset of only 4 image pairs. This model should work well for anime and cartoons with a lot of grain present. There are some slight haloing issues in dark colors unfortunately, but I was unable to fix them.
ESRGAN
2x
Digitoon Lite
A 2x model for Digital Animation. Meant as a versatile model for upscaling high detail digital anime and cartoons. Has debanding, MPEG-2 correction, and halo reduction. Trained to handle both 4:3 and 16:9 DVD material with equal efficacy. Will retain a lot of textures except for the really high freq stuff.
ESRGAN
4x
FatePlus-lite
A 4x model for anime PSP games, Fate Extra. This model was trained as a favor to Demon and the Fate Extra community. It leaves a nice grain on the images and upscales lines and details accurately without looking odd. This model works on most anime-style PSP games. Enjoy! It works best on content with dithering and quantization. NOTICE: I have included both NCNN and ONNX models to make upscaling easier if you rely on either of these. For NCNN, there are two versions: one is FP16 and the other is FP32. FP16 works best on RTX GPUs. Choose FP32 if in doubt about compatibility, or if FP16 doesn't work for you. To use ONNX, download chaiNNer and upscale through there with the ONNX nodes.
ESRGAN
2x
sudo RealESRGAN2x
A 2x model. Pretrained using RealESRGAN_x4plus_anime_6B.pth (sudo_RealESRGAN2x_3.332.758_G.pth). Tried to make the best 2x model there is for drawings. I think I achieved that. And yes, it is nearly 3.8 million iterations (probably a record nobody will beat here); it took me nearly half a year to train. A noisy pattern can sometimes appear along edges; you can use padding/crop for that. I aimed for perceptual quality without zooming in like 400%. Since RealESRGAN is 4x, I downscaled these images with bicubic. I would recommend my VSGAN code though and just load the onnx: https://github.com/styler00dollar/VSGAN-tensorrt-docker I just wanted a good 2x model for animations, but the model can also be used for wallpapers and so on. Before I hear people complaining: the dropout model is a modified architecture. Stuff like cupscale or chaiNNer won't work with the pth; load the onnx with VSGAN or chaiNNer. I did add the model from before I switched to dropout though, which is a normal ESRGAN pth; that one should work everywhere. I also converted everything into onnx, jit and ncnn, so pretty much everything there is. If you want to use ncnn, don't use nihui's code (that also includes cupscale); these codes don't include proper tiling in C++, which is very bad for this model. I think chaiNNer should have overlap/padding with ncnn, so use that instead if you really want ncnn. Plz don't steal without credits, k thx.
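The tiling caveat above deserves a sketch: tiles upscaled without overlap produce visible seams, so each tile should carry extra context that is discarded after upscaling. A generic sketch (not the author's code), assuming an `upscale` function that maps an HxWx3 float array to its scale-times-larger version (e.g. an ONNX session wrapped in a function); the tile and overlap sizes are arbitrary choices.

```python
import numpy as np

def tiled_upscale(img, upscale, scale=2, tile=256, overlap=16):
    h, w, c = img.shape
    out = np.zeros((h * scale, w * scale, c), dtype=np.float32)
    step = tile - 2 * overlap                 # stride between tile cores
    for y in range(0, h, step):
        for x in range(0, w, step):
            # crop with `overlap` pixels of context on each side
            y0, x0 = max(y - overlap, 0), max(x - overlap, 0)
            y1, x1 = min(y + step + overlap, h), min(x + step + overlap, w)
            up = upscale(img[y0:y1, x0:x1])
            # keep only the central region, discarding the padded border
            ch, cw = min(step, h - y), min(step, w - x)
            ty0, tx0 = (y - y0) * scale, (x - x0) * scale
            out[y * scale:(y + ch) * scale, x * scale:(x + cw) * scale] = \
                up[ty0:ty0 + ch * scale, tx0:tx0 + cw * scale]
    return out
```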
ESRGAN
2x
sudo RealESRGAN2x Dropout
Same description as the sudo RealESRGAN2x entry above; this is the modified-architecture dropout variant discussed there.
Compact
1x
AnimeUndeint Compact
AnimeUndeint Compact
AnimeUndeint Compact
This model corrects jagged lines on animation that has been deinterlaced. It handles simple line doubling, line interpolation, and even Yadif-style artifacts. It can also handle sources that were resized after deinterlacing, for example resizing from NTSC to PAL resolutions. If a source has been upscaled after deinterlacing, it will need to be downsized before applying this model.
Compact
1x
HurrDeblur SuperUltraCompact
HurrDeblur SuperUltraCompact
HurrDeblur SuperUltraCompact
This is a sharpening/deblurring model for anime video. It was created with three goals in mind: be blazing fast, avoid enhancing noise and compression artifacts, and avoid over-enhancing intentionally blurred parts of the image. Despite that last point, it is not intended for modern anime that makes heavy use of depth-of-field effects.
Compact
2x
sudo UltraCompact
sudo UltraCompact
A 2x model for realtime animation restoration, doing stuff like deblurring and compression artifact removal. (Teacher: RealESRGANv2-animevideo-xsx2.pth) My first attempt to make a REALTIME 2x upscaling model while also applying teacher-student learning. It beats Anime4K in every way. These benchmarks use a 3060 Ti and show that everything better than a 3060 Ti should be able to handle 1080p input if you create engine files and use my TensorRT code. You can see in the readme how to convert onnx files into engines. The two right bars compare normal Compact2 and UltraCompact in speed; the two on the left showcase older APIs I used, which isn't too important for this showcase. To use this, you need my code: https://github.com/styler00dollar/VSGAN-tensorrt-docker. If you use Manjaro, it is also possible to pipe the data stream directly into mpv, so you can watch it in a video player without rendering a video. Yeah, the model does seem a little noisy if you zoom in a lot, but don't forget that the model itself is only 1.2 MB. I think it does quite well. I'm still trying to improve on fast models, but this is good enough to share as a first model. Plz don't steal without credits, k thx.
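The author's own TensorRT path is the Docker repo linked above. As an illustrative alternative (not the author's method), onnxruntime can also route an ONNX model through TensorRT via its execution provider, assuming an onnxruntime-gpu build with TensorRT support; the model path is a placeholder.

```python
# Illustrative alternative route to TensorRT: onnxruntime's TensorRT
# execution provider (requires an onnxruntime-gpu build with TensorRT).
import onnxruntime as ort

sess = ort.InferenceSession(
    "sudo_UltraCompact_2x.onnx",  # placeholder path
    providers=[
        "TensorrtExecutionProvider",  # tried first; builds/caches an engine
        "CUDAExecutionProvider",      # fallback if TensorRT is unavailable
    ],
)
print(sess.get_providers())  # shows which providers were actually enabled
```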
Compact
2x
DigitalFlim SuperUltraCompact
DigitalFlim SuperUltraCompact
DigitalFlim SuperUltraCompact
This was trained on the dataset that OptimusPrimal used for his DigitalFilm models. This model cleans up the image, removing some noise while upscaling. It is very fast, running at around 200x the speed of a standard ESRGAN model on my system. This is primarily a proof of concept for how fast Real-ESRGAN models can get while still producing nice results.
ESRGAN
4x
HDCube
HDCube
HDCube
A 4x model for GameCube and Wii textures (mainly DXT and 8-bit color compression). It is good at preserving fine details without affecting the original style too much; it is not suitable for pixel art, small icons, or text under 16 pixels. Pretrained model: 4x_NMKD Siax
Compact
4x
RealESR AnimeVideo v3
RealESR AnimeVideo v3
RealESR AnimeVideo v3
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration. We extend the powerful ESRGAN to a practical restoration application (namely, Real-ESRGAN), which is trained with pure synthetic data. We add small models that are optimized for anime videos :-) We have made the following improvements: better naturalness, fewer artifacts, colors more faithful to the original, better texture restoration, and better background restoration.
ESRGAN
4x
Wood BC1
Wood BC1
Wood BC1
A 4x upscaler for BC1-compressed wood textures. The model was trained on wood planks, wood stems, and a bit of tree bark. It performs reasonably well on woody textures and produces sharp outputs. However, it also tends to shift the hue of the image slightly, so results might appear slightly redder. More information: https://github.com/RunDevelopment/ESRGAN-models/blob/main/wood/README.md
Compact
1x
No Image
Compact Pretrain
This is a collection of pretrained models for Real-ESRGAN's Compact architecture. There are 1x, 2x, and 4x models, as well as 1x and 2x "UltraCompact" and "SuperUltraCompact" models (think of these as the equivalent of ESRGAN "lite" models). By using these as pretrains for your models, you can ensure that your models can be interpolated with other Compact models that were trained from these. These pretrains are compatible with most existing Compact models.
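Since interpolation compatibility is the whole point of these pretrains, here is a minimal sketch of what interpolating two Compact models means in practice: a per-weight blend of two state dicts. It assumes both `.pth` files are flat state dicts with identical keys and tensor shapes (which training from a shared pretrain gives you); the file names and the 50/50 ratio are placeholders.

```python
# Minimal sketch of model interpolation: a per-weight blend of two state
# dicts. Assumes flat state dicts with identical keys and tensor shapes;
# file names and the blend ratio are placeholders.
import torch

a = torch.load("model_a.pth", map_location="cpu")
b = torch.load("model_b.pth", map_location="cpu")
alpha = 0.5  # blend factor: 0.0 = pure A, 1.0 = pure B

interp = {k: (1 - alpha) * a[k] + alpha * b[k] for k in a}
torch.save(interp, "model_interp_50_50.pth")
```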
Compact
1x
No Image
SuperUltraCompact Pretrain
This is a collection of pretrained models for Real-ESRGAN's Compact architecture. There are 1x, 2x, and 4x models, as well as 1x and 2x "UltraCompact" and "SuperUltraCompact" models (think of these as the equivalent of ESRGAN "lite" models). By using these as pretrains for your models, you can ensure that your models can be interpolated with other Compact models that were trained from these. These pretrains are compatible with most existing Compact models.
Compact
1x
No Image
UltraCompact Pretrain
This is a collection of pretrained models for Real-ESRGAN's Compact architecture. There are 1x, 2x, and 4x models, as well as 1x and 2x "UltraCompact" and "SuperUltraCompact" models (think of these as the equivalent of ESRGAN "lite" models). By using these as pretrains for your models, you can ensure that your models can be interpolated with other Compact models that were trained from these. These pretrains are compatible with most existing Compact models.
Compact
2x
No Image
Compact Pretrain
This is a collection of pretrained models for Real-ESRGAN's Compact architecture. There are 1x, 2x, and 4x models, as well as 1x and 2x "UltraCompact" and "SuperUltraCompact" models (think of these as the equivalent of ESRGAN "lite" models). By using these as pretrains for your models, you can ensure that your models can be interpolated with other Compact models that were trained from these. These pretrains are compatible with most existing Compact models.
Compact
2x
No Image
SuperUltraCompact Pretrain
This is a collection of pretrained models for Real-ESRGAN's Compact architecture. There are 1x, 2x, and 4x models, as well as 1x and 2x "UltraCompact" and "SuperUltraCompact" models (think of these as the equivalent of ESRGAN "lite" models). By using these as pretrains for your models, you can ensure that your models can be interpolated with other Compact models that were trained from these. These pretrains are compatible with most existing Compact models.
Compact
2x
No Image
UltraCompact Pretrain
This is a collection of pretrained models for Real-ESRGAN's Compact architecture. There are 1x, 2x, and 4x models, as well as 1x and 2x "UltraCompact" and "SuperUltraCompact" models (think of these as the equivalent of ESRGAN "lite" models). By using these as pretrains for your models, you can ensure that your models can be interpolated with other Compact models that were trained from these. These pretrains are compatible with most existing Compact models.
Compact
4x
No Image
Compact Pretrain
This is a collection of pretrained models for Real-ESRGAN's Compact architecture. There are 1x, 2x, and 4x models, as well as 1x and 2x "UltraCompact" and "SuperUltraCompact" models (think of these as the equivalent of ESRGAN "lite" models). By using these as pretrains for your models, you can ensure that your models can be interpolated with other Compact models that were trained from these. These pretrains are compatible with most existing Compact models.
ESRGAN
4x
realesrgan-x4minus
realesrgan-x4minus
realesrgan-x4minus
Basically realesrgan-x4plus without the degradation training. Supposed to help retain more details, but unfortunately due to the dataset (I think) still blurs details adjacent to other objects.
ESRGAN
4x
Normal RG0
Normal RG0
Normal RG0
A 4x upscaler for uncompressed normal maps with a zeroed-out B channel. The input is required to have no alpha and a constant-zero blue channel. The model can handle some light compression artifacts but has trouble with quantization artifacts. The output normals will also have a constant-zero B channel. Use external software or image editing plugins to properly normalize the generated normals and generate the Z component (if necessary); e.g. chaiNNer can do this with the Normalize Normal Map node. Do not rely on this network producing unit vectors. More information: https://github.com/RunDevelopment/ESRGAN-models/blob/main/normals/README.md
ESRGAN
4x
Normal RG0 BC1
Normal RG0 BC1
Normal RG0 BC1
A 4x upscaler for BC1-compressed normal maps with a zeroed-out B channel. The output normals will also have a constant-zero B channel. Use external software or image editing plugins to properly normalize the generated normals and generate the Z component (if necessary); e.g. chaiNNer can do this with the Normalize Normal Map node. Do not rely on this network producing unit vectors. **Dataset transformation:** The LRs have been compressed with various BC1 compression settings (dithering, weighting) using Texconv version 2021.11.8.1. Since the contents of the B channel influence the R and G channels during compression, the LRs were given a B channel that was either the Z component of the normal, a random constant color, or a random texture (usually the G channel of the associated albedo) before compression. The LR + B channel was then compressed, and the resulting BC1-compressed DDS was converted back into a PNG and had its B channel zeroed out. More information: https://github.com/RunDevelopment/ESRGAN-models/blob/main/normals/README.md
ESRGAN
4x
Normal RG0 BC7
Normal RG0 BC7
Normal RG0 BC7
A 4x upscaler for BC7-compressed normal maps with a zeroed-out B channel and no alpha. The input is required to have no alpha and a constant-zero blue channel. BC7-compressed images with an alpha channel should be handled by the BC1 model instead, because BC7 achieves very different quality for images with and without an alpha channel. The output normals will also have a constant-zero B channel. Use external software or image editing plugins to properly normalize the generated normals and generate the Z component (if necessary); e.g. chaiNNer can do this with the Normalize Normal Map node. Do not rely on this network producing unit vectors. **Dataset transformation:** The LRs have been compressed using Texconv version 2021.11.8.1. Since the contents of the B channel influence the R and G channels during compression, the LRs were given a B channel that was either the Z component of the normal, a random constant color, or a random texture (usually the G channel of the associated albedo) before compression. The LR + B channel was then compressed, and the resulting BC7-compressed DDS was converted back into a PNG and had its B channel zeroed out. More information: https://github.com/RunDevelopment/ESRGAN-models/blob/main/normals/README.md
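All three RG0 models ask you to regenerate the Z component and renormalize externally. A numpy sketch of that post-processing, assuming the usual encoding where R/G map X/Y from [-1, 1] into [0, 255]; the file names are placeholders.

```python
# Sketch of the post-processing the RG0 models ask for: rebuild Z from the
# upscaled R/G channels and renormalize. Assumes the standard [-1, 1] to
# [0, 255] encoding; file names are placeholders.
import numpy as np
from PIL import Image

rgb = np.asarray(Image.open("normal_rg0_upscaled.png").convert("RGB"), dtype=np.float32)
xy = rgb[..., :2] / 127.5 - 1.0  # decode R/G to X/Y in [-1, 1]
z = np.sqrt(np.clip(1.0 - np.sum(xy**2, axis=-1), 0.0, None))  # Z of a unit vector

n = np.dstack([xy[..., 0], xy[..., 1], z])
n /= np.linalg.norm(n, axis=-1, keepdims=True)  # guard against non-unit outputs

encoded = ((n + 1.0) * 127.5).round().clip(0, 255).astype(np.uint8)
Image.fromarray(encoded).save("normal_full.png")
```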
RIFE
2x
No Image
sudo rife4 testV1 scale1
A 2x model for animation interpolation. I never really mentioned it in model releases prior since I think not too many care about interpolation here, but I trained a rife4 model for animation some months ago, which is better than rife4 and rife4.2 imo. Thought I should also mention it here as well. I also converted it into ncnn. (nihui's rife ncnn models are only exported with the fastest mode and not the best quality; I exported ncnn models for the most important quality settings. Due to different export/quality settings, there are multiple models. For that reason alone, my ncnn models are much better too, since nihui only exported the fastest one.) My https://github.com/styler00dollar/VSGAN-tensorrt-docker also has the rife ncnn extension, which can use VMAF, dedup, scene detection and so on, which I would recommend. My models are in that extension as well; just select model 10, 11 or 12 and use the dev docker. That test video is done with 2x framerate, ensemble True and FastMode False, combined with scene detection and dedup stuff, tta False. Towards the best quality rife can do. Plz don't steal without credits, k thx. Pretrained model: RifeV4 Sample video: https://cdn.discordapp.com/attachments/579685650824036387/990345004260151296/ngnl_sudorife.mp4
ESRGAN
1x
ReFocus V3
ReFocus V3
ReFocus V3
DeBlur, ReFocus, and Sharpen real-life style images; it will work on anime images too.
ESRGAN
2x
AnimeClassics UltraLite
AnimeClassics UltraLite
AnimeClassics UltraLite
A 2x model for Anime/Pretrained. A 2x Ultra Lite model coming in under 8 MB. Trained with over 15 sets of LRs spanning a wide range of issues. Handles rainbows, dot crawl, and MPEG/H.264 compression, and may even assist in removing halos and fixing blurriness in certain cases. This is my first public model for everyone. Best when used on old anime that is grainy. I can't say what anime it's best suited for, as I have tried multiple series and found it does a good job on almost all of them. I wouldn't say use this for Western animation, but it may work. I have done a few tests that I have shown in the upscale results, but those were chained with other models to achieve such a result. This model is meant to retain the more natural look of a series. There is a color shift in the end result; it's not drastic, but still noticeable. I suggest fixing any color issues in post to give a more polished upscale. Big thanks to @SaurusX for the model name, and for just helping out in general with anything.
ESRGAN
1x
BroadcastToStudio Lite
BroadcastToStudio Lite
BroadcastToStudio Lite
Improvement of low-quality cartoons from broadcast sources. Will greatly increase the visual quality of bad broadcast tape sources of '80s and '90s cartoons (e.g. Garfield and Friends, Heathcliff, DuckTales, etc). Directly addresses chroma blur, dot crawl, and rainbowing. You're highly advised to take care of haloing beforehand in your favorite video editor as the model will not fix it and may make existing halos more noticeable.
ESRGAN
4x
No Image
GameAI 1.0
by Tal
This model is intended mainly to handle PS2 compression and a mixture of realistic and cartoonish textures; it's not meant to be used for very low-resolution textures such as item icons.
ESRGAN
4x
GameAI 2.0
GameAI 2.0
GameAI 2.0
by Tal
This model is intended mainly to handle PS2 compression and a mixture of realistic and cartoonish textures; it's not meant to be used for very low-resolution textures such as item icons.
ESRGAN
1x
DEDXT
DEDXT
DEDXT
To retain details while removing artifacts caused by DXT compression on textures.
ESRGAN
1x
No Image
GainRESV3 (Aggro)
A 1x model for Anti-aliasing / Deblur. To eliminate aliasing and general artifacts caused by insufficient resolution while bringing out details. I'm stopping its training here because it was getting worse; I suspect some alignment issues caused by the game's rendering pipeline plus downscaling...
ESRGAN
1x
GainRESV3 (Natural)
GainRESV3 (Natural)
GainRESV3 (Natural)
A 1x model for Anti-aliasing / Deblur. To eliminate aliasing and general artifacts caused by insufficient resolution while bringing out details. I'm stopping its training here because it was getting worse; I suspect some alignment issues caused by the game's rendering pipeline plus downscaling...
ESRGAN
1x
No Image
GainRESV3 (Passive)
A 1x model for Anti-aliasing / Deblur. To eliminate aliasing and general artifacts caused by insufficient resolution while bringing out details. I'm stopping its training here because it was getting worse; I suspect some alignment issues caused by the game's rendering pipeline plus downscaling...
ESRGAN
4x
AnimeSharp lite
AnimeSharp lite
AnimeSharp lite
This model is a lite version of AnimeSharp. It was trained using student-teacher learning (if I'm using the term properly), where the HRs are LRs upscaled by the full-size AnimeSharp ESRGAN model, and the lite model is trained on those outputs as the HR. It works best on clean or slightly blurry anime. Downscale by 50% first in almost all cases.
ESRGAN
1x
ToonVHS
ToonVHS
ToonVHS
A 1x model for VHS. Best when used on cartoons, though it can also work on anime. Due to the dataset, it does struggle a bit with orange colors and grainy dark spots. This model is meant to clean up the image before using a 2x or 4x model. Example with 2x-BIGOLDIES: https://imgsli.com/OTQ0NjM/1/0.
ESRGAN
4x
eula digimanga bw v1
eula digimanga bw v1
eula digimanga bw v1
Black and white digital manga with halftones.
ESRGAN
1x
ReFocus Cleanly
ReFocus Cleanly
ReFocus Cleanly
DeBlur, ReFocus, and Sharpen manga, anime, and cartoon-style images; it will work on real-life images too.
ESRGAN
4x
AnimeSharp
AnimeSharp
AnimeSharp
Interpolation between 4x-UltraSharp and 4x-TextSharp-v0.5. Works amazingly on anime. It also upscales text, but it's far better with anime content. I rebranded this model on 2/10/22 to 4x-AnimeSharp from 4x-TextSharpV1. Pretrained model: Interpolation between 4x-UltraSharp and 4x-TextSharp-v0.5
ESRGAN
1x
DXTDecompressor Source V3
DXTDecompressor Source V3
DXTDecompressor Source V3
Removes compression artifacts from DXT1-compressed textures. This model was created to remove DXT1 compression artifacts from textures imported into the Source Engine. Compressed textures in the engine sometimes have a green tint, which this model also corrects. The data for this model contained a good mix of diffuse textures and normal maps, which means this model is pretty good at removing compression from normals as well. Creating this model was a real learning experience for me, and I hope someone finds a good use for it.
Compact
2x
No Image
RealESRGANv2 AnimeVideo xs-x2
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration. We extend the powerful ESRGAN to a practical restoration application (namely, Real-ESRGAN), which is trained with pure synthetic data. We add small models that are optimized for anime videos :-)
ESRGAN
4x
RealESRGAN_x4Plus Anime 6B
RealESRGAN_x4Plus Anime 6B
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration. We extend the powerful ESRGAN to a practical restoration application (namely, Real-ESRGAN), which is trained with pure synthetic data. This model is optimized for anime images with a much smaller model size.
Compact
4x
No Image
RealESRGANv2 AnimeVideo xs-x4
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration. We extend the powerful ESRGAN to a practical restoration application (namely, Real-ESRGAN), which is trained with pure synthetic data. We add small models that are optimized for anime videos :-)
ESRGAN
4x
BooruGan 600k
BooruGan 600k
BooruGan 600k
by Tal
This model is designed to mainly upscale anime artworks.
ESRGAN
4x
BooruGan 650k
BooruGan 650k
BooruGan 650k
by Tal
This model is designed to mainly upscale anime artworks. If you have issues with chroma then try the 600k iterations release.
ESRGAN
1x
RoQ_nRoll
RoQ_nRoll
RoQ_nRoll
This model decompresses images and video compressed using RoQ. Config and presets will be added when Mega decides to let me use their site!
ESRGAN
1x
SwatKats Lite
SwatKats Lite
SwatKats Lite
Fixes vertical blur / split lines / shadowing. A 1x lite model of my 2x SwatKats. It resolves the same video problems as before, but at 1x, runs faster, and is meant for chaining with other 2x models (or whatever). Input MUST be 540 pixels tall, as the blur problem is very resolution-sensitive.
ESRGAN
2x
SwatKats
SwatKats
SwatKats
In addition to removing the vertical blur, the model upscales, sharpens, and removes MPEG-2 artifacting and a small amount of rainbowing and dot crawl. Another series afflicted with the vertical blur is Avatar: The Last Airbender, which can be repaired by this model. The video fed into the model MUST be 540 pixels tall for the deblur to work properly.
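Since both SwatKats models are this strict about input height, a small sketch of forcing frames to 540 lines before inference. The bicubic filter here is a guess, not an author recommendation, and file names are placeholders.

```python
# Sketch: force a frame to the 540-line height both SwatKats models require,
# preserving aspect ratio. Bicubic is a guess; file names are placeholders.
from PIL import Image

frame = Image.open("frame.png")
if frame.height != 540:
    new_w = round(frame.width * 540 / frame.height)
    frame = frame.resize((new_w, 540), Image.BICUBIC)
frame.save("frame_540.png")
```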
ESRGAN
1x
No Image
DeBink Lite
This model was trained on lossless video frames of Metal Arms: Glitch in the System, compressed with Bink V1 for the LR frames. It's a lot more efficient than my first DeBink model and also has fewer artifacts. It's not quite as robust, but the compression is barely noticeable in videos after processing with this. Sample video: https://cdn.discordapp.com/attachments/903415274521374750/904129653374078976/XB_Demo_Mov-85600_G.mp4
ESRGAN
4x
UltraSharp
UltraSharp
UltraSharp
A 4x model. Pretrained using 4xESRGAN. This is my best model yet! It generates lots and lots of detail and leaves a nice texture on images. It works on most images, whether compressed or not. It does work best on JPEG compression though, as that's mostly what it was trained on. It has the ability to restore highly compressed images as well! If you want a more balanced output, check out the UltraMix Collection down below. It's a bunch of interpolated models based around UltraSharp and my other models
ESRGAN
1x
No Image
NoiseToner Poisson Detailed
Attempts to remove the damage done by noise. Successor of sorts to Noisetoner_Poisson_150000_G.
SOFVSR
4x
VimeoScale
VimeoScale
VimeoScale
A 4x model for upscaling IRL video content. This model is one of the longest I have trained and tuned; it includes some noise training, but it is not intended for deblocking/decompression. Combined with the fp16 mode of https://github.com/JoeyBallentine/Video-Inference, this should handle most SD and 720p content at or faster than ESRGAN speed, with a significant bump in quality. Pretrained model: Self-Trained Base for the Unet finalisation with vgg_fea discrim
ESRGAN
1x
Focus
Focus
Focus
This model deblurs most images. It was trained mostly on aniso2 and iso blurring (BSRGAN augmentation) with some gaussian mixed in. It performs well on most blurry images, but I'd recommend something like Fatality_Deblur for very strong gaussian blur.
ESRGAN
1x
No Image
Focus Moderate
This model deblurs most images. It was trained mostly on aniso2 and iso blurring (BSRGAN augmentation) with some gaussian mixed in. It performs well on most blurry images, but I'd recommend something like Fatality_Deblur for very strong gaussian blur.
Swift-SRGAN
2x
No Image
Swift-SRGAN 2x
Swift-SRGAN - Rethinking Super-Resolution for real-time inference. In recent years, there have been several advancements in the task of image super-resolution using state-of-the-art Deep Learning-based architectures. Many previously published super-resolution techniques require high-end, top-of-the-line Graphics Processing Units (GPUs) to perform image super-resolution. With the increasing advancements in Deep Learning approaches, neural networks have become more and more compute-hungry. We took a step back and focused on creating a real-time, efficient solution. We present an architecture that is faster and smaller in terms of its memory footprint. The proposed architecture uses Depth-wise Separable Convolutions to extract features, and it performs on par with other super-resolution GANs (Generative Adversarial Networks) while maintaining real-time inference and a low memory footprint. Real-time super-resolution enables streaming high-resolution media content even under poor bandwidth conditions. While maintaining an efficient trade-off between accuracy and latency, we are able to produce a comparable model which is one-eighth (1/8) the size of super-resolution GANs and computes 74 times faster than super-resolution GANs. NOTE: The author used the wrong file extensions for the models *on GitHub*. You will download a `.pth.tar` file. This is not actually a TAR file. Change the file extension to just `.pth` and the model will work.
Swift-SRGAN
4x
No Image
Swift-SRGAN 4x
Swift-SRGAN - Rethinking Super-Resolution for real-time inference. In recent years, there have been several advancements in the task of image super-resolution using state-of-the-art Deep Learning-based architectures. Many previously published super-resolution techniques require high-end, top-of-the-line Graphics Processing Units (GPUs) to perform image super-resolution. With the increasing advancements in Deep Learning approaches, neural networks have become more and more compute-hungry. We took a step back and focused on creating a real-time, efficient solution. We present an architecture that is faster and smaller in terms of its memory footprint. The proposed architecture uses Depth-wise Separable Convolutions to extract features, and it performs on par with other super-resolution GANs (Generative Adversarial Networks) while maintaining real-time inference and a low memory footprint. Real-time super-resolution enables streaming high-resolution media content even under poor bandwidth conditions. While maintaining an efficient trade-off between accuracy and latency, we are able to produce a comparable model which is one-eighth (1/8) the size of super-resolution GANs and computes 74 times faster than super-resolution GANs. NOTE: The author used the wrong file extensions for the models *on GitHub*. You will download a `.pth.tar` file. This is not actually a TAR file. Change the file extension to just `.pth` and the model will work.
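The NOTE on both Swift-SRGAN entries amounts to a one-line rename, since the download was never actually a TAR archive. A small sketch; the file name is a placeholder.

```python
# What the NOTE above amounts to in code: strip the misleading ".tar"
# suffix, then the checkpoint loads as a normal PyTorch file.
# The file name is a placeholder.
from pathlib import Path
import torch

src = Path("swift_srgan_4x.pth.tar")
dst = src.with_suffix("")  # drops the trailing ".tar", leaving ".pth"
src.rename(dst)

state = torch.load(dst, map_location="cpu")  # loads fine; it was never a TAR
```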
CAIN
2x
No Image
cvpv6
A 2x model for Anime - interpolation.
ESRGAN
4x
VolArt
VolArt
VolArt
A 4x model for Game textures/Art. This model upscales artwork for the game Volfoss (2001). The model peaked in quality very quickly. The NR model removes most noise, but has the downside of removing transparent portions. Use the main model in most cases.
ESRGAN
4x
No Image
VolArtNR
A 4x model for Game textures/Art. This model upscales artwork for the game Volfoss (2001). The model peaked in quality very quickly. The NR model removes most noise, but has the downside of removing transparent portions. Use the main model in most cases.
ESRGAN+
4x
Valar
Valar
by musl
Meant as an experiment to test the latest techniques implemented in traiNNer, including: AdaTarget, KernelGAN, UNet discriminator, nESRGAN+ arch, noise patches, camera noise, isotropic/anisotropic/sinc blur, frequency separation, contextual loss, mixup, clipL1 pixel loss, AdamP optimizer, etc. The config file is provided at the download link above. I encourage everybody to mirror the model, and to distribute and modify it in any way you want.
ESRGAN
4x
Fabric
Fabric
Fabric
A 4x model for Fabric. This model set upscales fabric or cloth textures (works on cats too!). The Alt model is just an earlier iteration; it may work better on some images. The images need to be minimally compressed or passed through a decompression model first. It works with DDS compression though.
ESRGAN
4x
Fabric-Alt
Fabric-Alt
Fabric-Alt
A 4x model for Fabric. This model set upscales fabric or cloth textures (works on cats too!). The Alt model is just an earlier iteration; it may work better on some images. The images need to be minimally compressed or passed through a decompression model first. It works with DDS compression though.
SOFVSR
2x
VimeoScale Unet
VimeoScale Unet
VimeoScale Unet
A 2x model for upscaling video content. This model is meant to surpass VEAI 2x while being efficient enough to run quickly with fp16. The real-esrgan/BSRGAN augmentation and Unet should help with videos where the resolution is not ideal, and can reconstruct details without affecting blurs in most cases. This model SHOULD run faster than real-esrgan while matching its resolving power and enabling some multiframe feature extraction. No major denoising/compression/blurring effects (or artifacts) should be found. Pretrained model: Self-Trained Base for the Unet finalisation with vgg_fea discrim
ESRGAN
4x
UniScaleV2 Moderate
UniScaleV2 Moderate
UniScaleV2 Moderate
These models work great on game textures when interpolated 50/50 with UniScale_Restore, and work amazingly on uncompressed images. DO NOT USE FOR COMPRESSED IMAGES; use the original UniScale or UltraSharp for that.
ESRGAN
4x
UniScaleV2 Sharp
UniScaleV2 Sharp
UniScaleV2 Sharp
These models work great on game textures when interpolated 50/50 with UniScale_Restore, and work amazingly on uncompressed images. DO NOT USE FOR COMPRESSED IMAGES; use the original UniScale or UltraSharp for that.
ESRGAN
4x
UniScaleV2 Soft
UniScaleV2 Soft
UniScaleV2 Soft
These models work great on game textures when interpolated 50/50 with UniScale_Restore, and work amazingly on uncompressed images. DO NOT USE FOR COMPRESSED IMAGES; use the original UniScale or UltraSharp for that.
ESRGAN
2x
UniScale CartoonRestore Lite
UniScale CartoonRestore Lite
A 2x model for Animation, Pixel Art. This model has VERY strong compression removal and line restructuring that allows it to restore any heavily compressed drawings, animation, cartoons, or anime. Also works on games as well as DDS compression. It renders frames very quickly and is very viable for restoring videos. Pretrained model: 4x-MMScale (Yes, 4x. mistakes were made)
ESRGAN
4x
UniScale Restore
UniScale Restore
UniScale Restore
UniScale_Restore has strong compression removal that helps with restoring heavily compressed or noisy images. It is intended to compete with BSRGAN. Trained with BSRGAN_Resize and Combo_Noise in traiNNer.
SwinIR
2x
No Image
SwinIR-M-x2 (classicalSR-DF2K-s64w8)
SwinIR: Image Restoration Using Swin Transformer Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image restoration based on the Swin Transformer. SwinIR consists of three parts: shallow feature extraction, deep feature extraction and high-quality image reconstruction. In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection. We conduct experiments on three representative tasks: image super-resolution (including classical, lightweight and real-world image super-resolution), image denoising (including grayscale and color image denoising) and JPEG compression artifact reduction. Experimental results demonstrate that SwinIR outperforms state-of-the-art methods on different tasks by up to 0.14~0.45dB, while the total number of parameters can be reduced by up to 67%.
SwinIR
2x
No Image
SwinIR-M-x2 (classicalSR-DIV2K-s64w8)
SwinIR: Image Restoration Using Swin Transformer Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image restoration based on the Swin Transformer. SwinIR consists of three parts: shallow feature extraction, deep feature extraction and high-quality image reconstruction. In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection. We conduct experiments on three representative tasks: image super-resolution (including classical, lightweight and real-world image super-resolution), image denoising (including grayscale and color image denoising) and JPEG compression artifact reduction. Experimental results demonstrate that SwinIR outperforms state-of-the-art methods on different tasks by up to 0.14~0.45dB, while the total number of parameters can be reduced by up to 67%.
SwinIR
2x
No Image
SwinIR-M-x2 (lightweightSR-DIV2K-s64w8)
SwinIR: Image Restoration Using Swin Transformer Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image restoration based on the Swin Transformer. SwinIR consists of three parts: shallow feature extraction, deep feature extraction and high-quality image reconstruction. In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection. We conduct experiments on three representative tasks: image super-resolution (including classical, lightweight and real-world image super-resolution), image denoising (including grayscale and color image denoising) and JPEG compression artifact reduction. Experimental results demonstrate that SwinIR outperforms state-of-the-art methods on different tasks by up to 0.14~0.45dB, while the total number of parameters can be reduced by up to 67%.
SwinIR
2x
No Image
SwinIR-M-x2-GAN (realSR-BSRGAN-DFO-s64w8)
SwinIR: Image Restoration Using Swin Transformer Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image restoration based on the Swin Transformer. SwinIR consists of three parts: shallow feature extraction, deep feature extraction and high-quality image reconstruction. In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection. We conduct experiments on three representative tasks: image super-resolution (including classical, lightweight and real-world image super-resolution), image denoising (including grayscale and color image denoising) and JPEG compression artifact reduction. Experimental results demonstrate that SwinIR outperforms state-of-the-art methods on different tasks by up to 0.14~0.45dB, while the total number of parameters can be reduced by up to 67%.
SwinIR
3x
No Image
SwinIR-M-x3 (classicalSR-DF2K-s64w8)
SwinIR: Image Restoration Using Swin Transformer Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image restoration based on the Swin Transformer. SwinIR consists of three parts: shallow feature extraction, deep feature extraction and high-quality image reconstruction. In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection. We conduct experiments on three representative tasks: image super-resolution (including classical, lightweight and real-world image super-resolution), image denoising (including grayscale and color image denoising) and JPEG compression artifact reduction. Experimental results demonstrate that SwinIR outperforms state-of-the-art methods on different tasks by up to 0.14~0.45dB, while the total number of parameters can be reduced by up to 67%.
SwinIR
3x
No Image
SwinIR-M-x3 (classicalSR-DIV2K-s64w8)
SwinIR: Image Restoration Using Swin Transformer Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image restoration based on the Swin Transformer. SwinIR consists of three parts: shallow feature extraction, deep feature extraction and high-quality image reconstruction. In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection. We conduct experiments on three representative tasks: image super-resolution (including classical, lightweight and real-world image super-resolution), image denoising (including grayscale and color image denoising) and JPEG compression artifact reduction. Experimental results demonstrate that SwinIR outperforms state-of-the-art methods on different tasks by up to 0.14~0.45dB, while the total number of parameters can be reduced by up to 67%.
SwinIR
3x
No Image
SwinIR-M-x3 (lightweightSR-DIV2K-s64w8)
SwinIR: Image Restoration Using Swin Transformer Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image restoration based on the Swin Transformer. SwinIR consists of three parts: shallow feature extraction, deep feature extraction and high-quality image reconstruction. In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection. We conduct experiments on three representative tasks: image super-resolution (including classical, lightweight and real-world image super-resolution), image denoising (including grayscale and color image denoising) and JPEG compression artifact reduction. Experimental results demonstrate that SwinIR outperforms state-of-the-art methods on different tasks by up to 0.14~0.45dB, while the total number of parameters can be reduced by up to 67%.
SwinIR
4x
No Image
SwinIR-M-x4 (classicalSR-DF2K-s64w8)
SwinIR: Image Restoration Using Swin Transformer Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image restoration based on the Swin Transformer. SwinIR consists of three parts: shallow feature extraction, deep feature extraction and high-quality image reconstruction. In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection. We conduct experiments on three representative tasks: image super-resolution (including classical, lightweight and real-world image super-resolution), image denoising (including grayscale and color image denoising) and JPEG compression artifact reduction. Experimental results demonstrate that SwinIR outperforms state-of-the-art methods on different tasks by up to 0.14~0.45dB, while the total number of parameters can be reduced by up to 67%.
SwinIR
4x
No Image
SwinIR-M-x4 (classicalSR-DIV2K-s64w8)
SwinIR: Image Restoration Using Swin Transformer Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image restoration based on the Swin Transformer. SwinIR consists of three parts: shallow feature extraction, deep feature extraction and high-quality image reconstruction. In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection. We conduct experiments on three representative tasks: image super-resolution (including classical, lightweight and real-world image super-resolution), image denoising (including grayscale and color image denoising) and JPEG compression artifact reduction. Experimental results demonstrate that SwinIR outperforms state-of-the-art methods on different tasks by up to 0.14~0.45dB, while the total number of parameters can be reduced by up to 67%.
ESRGAN
4x
CountryRoads
CountryRoads
CountryRoads
Streets with dense foliage in the background. Outdoor scenes.
SwinIR
4x
No Image
SwinIR-M-x4 (lightweightSR-DIV2K-s64w8)
SwinIR: Image Restoration Using Swin Transformer Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image restoration based on the Swin Transformer. SwinIR consists of three parts: shallow feature extraction, deep feature extraction and high-quality image reconstruction. In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection. We conduct experiments on three representative tasks: image super-resolution (including classical, lightweight and real-world image super-resolution), image denoising (including grayscale and color image denoising) and JPEG compression artifact reduction. Experimental results demonstrate that SwinIR outperforms state-of-the-art methods on different tasks by up to 0.14~0.45dB, while the total number of parameters can be reduced by up to 67%.
SwinIR
4x
No Image
SwinIR-M-x4-GAN (realSR-BSRGAN-DFO-s64w8)
SwinIR: Image Restoration Using Swin Transformer Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image restoration based on the Swin Transformer. SwinIR consists of three parts: shallow feature extraction, deep feature extraction and high-quality image reconstruction. In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection. We conduct experiments on three representative tasks: image super-resolution (including classical, lightweight and real-world image super-resolution), image denoising (including grayscale and color image denoising) and JPEG compression artifact reduction. Experimental results demonstrate that SwinIR outperforms state-of-the-art methods on different tasks by up to 0.14~0.45dB, while the total number of parameters can be reduced by up to 67%.
SwinIR
4x
No Image
SwinIR-L-x4-GAN (realSR-BSRGAN-DFOWMFC-s64w8)
SwinIR: Image Restoration Using Swin Transformer Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image restoration based on the Swin Transformer. SwinIR consists of three parts: shallow feature extraction, deep feature extraction and high-quality image reconstruction. In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection. We conduct experiments on three representative tasks: image super-resolution (including classical, lightweight and real-world image super-resolution), image denoising (including grayscale and color image denoising) and JPEG compression artifact reduction. Experimental results demonstrate that SwinIR outperforms state-of-the-art methods on different tasks by up to 0.14~0.45dB, while the total number of parameters can be reduced by up to 67%.
SwinIR
8x
No Image
SwinIR-M-x8 (classicalSR-DF2K-s64w8)
SwinIR: Image Restoration Using Swin Transformer Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image restoration based on the Swin Transformer. SwinIR consists of three parts: shallow feature extraction, deep feature extraction and high-quality image reconstruction. In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection. We conduct experiments on three representative tasks: image super-resolution (including classical, lightweight and real-world image super-resolution), image denoising (including grayscale and color image denoising) and JPEG compression artifact reduction. Experimental results demonstrate that SwinIR outperforms state-of-the-art methods on different tasks by up to 0.14~0.45dB, while the total number of parameters can be reduced by up to 67%.
SwinIR
8x
No Image
SwinIR-M-x8 (classicalSR-DIV2K-s64w8)
SwinIR: Image Restoration Using Swin Transformer Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image restoration based on the Swin Transformer. SwinIR consists of three parts: shallow feature extraction, deep feature extraction and high-quality image reconstruction. In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection. We conduct experiments on three representative tasks: image super-resolution (including classical, lightweight and real-world image super-resolution), image denoising (including grayscale and color image denoising) and JPEG compression artifact reduction. Experimental results demonstrate that SwinIR outperforms state-of-the-art methods on different tasks by up to 0.14~0.45dB, while the total number of parameters can be reduced by up to 67%.
ESRGAN
2x
BSRGANx2
BSRGANx2
BSRGANx2
by cszn
Designing a Practical Degradation Model for Deep Blind Image Super-Resolution (ICCV, 2021)
ESRGAN
4x
BSRGAN
BSRGAN
BSRGAN
by cszn
Designing a Practical Degradation Model for Deep Blind Image Super-Resolution (ICCV, 2021)
ESRGAN
4x
UniScale Balanced
UniScale Balanced
UniScale Balanced
UniScale strikes a nice balance between sharpness and realism. This model can upscale almost anything well. It was originally intended to upscale game textures, but was expanded into a universal upscaler. Interp is these two models interpolated.
ESRGAN
4x
UniScale Strong
UniScale Strong
UniScale Strong
UniScale strikes a nice balance between sharpness and realism. This model can upscale almost anything well. It was originally intended to upscale game textures, but was expanded into a universal upscaler. Interp is these two models interpolated.
ESRGAN
4x
UniScaleNR Balanced
UniScaleNR Balanced
UniScaleNR Balanced
Version of UniScale trained with camera noise injection (NR = Noise Removal). This model removes noise from images while upscaling.
ESRGAN
4x
UniScaleNR Strong
UniScaleNR Strong
UniScaleNR Strong
Version of UniScale trained with camera noise injection (NR = Noise Removal). This model removes noise from images while upscaling.
ESRGAN+
8x
BoyMeBob Redux
BoyMeBob Redux
BoyMeBob Redux
by Joey
Upscaling cartoons. Pretrained model: 8xBoyMeBob (unreleased), which used 8xESRGAN.
CAIN
2x
No Image
cainliteanime
A 2x model for Anime - interpolation. Sample video: https://cdn.discordapp.com/attachments/724353640521007145/873861876352704533/1.webm
ESRGAN
1x
No Image
ArtClarity
A 1x model. Pretrained using 4xPSNR. A texture-retaining denoiser and sharpener for digital artwork. Sample images: https://1drv.ms/u/s!Aip-EMByJHY28xKFp93VIGydNsoN?e=vN2WYh
ESRGAN
2x
ATLA KORRA
ATLA KORRA
ATLA KORRA
Upscaling of Animation based on The Legend of Korra.
ESRGAN
2x
RealESRGAN_x2Plus
RealESRGAN_x2Plus
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration. We extend the powerful ESRGAN to a practical restoration application (namely, Real-ESRGAN), which is trained with pure synthetic data.
ESRGAN
4x
RealESRGAN_x4Plus
RealESRGAN_x4Plus
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration. We extend the powerful ESRGAN to a practical restoration application (namely, Real-ESRGAN), which is trained with pure synthetic data.
ESRGAN
1x
Filmify4K v2
Filmify4K v2
Filmify4K v2
by Muf
A 1x model for artifacts. This model attempts to make films upscaled to 4K with Topaz Gaia-HQ look more natural and filmic. It sharpens, adds film grain, and smooths out small artefacts from the upscaling process. I recommend adding a tiny amount of grain to the input to seed the model (you can do this in VEAI); otherwise the film grain will remain static across frames that don't move much. Pretrain model used with permission to relicense from Twittman.
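The author suggests seeding the model with a little grain (e.g. in VEAI). For scripted pipelines, a per-frame numpy sketch of the same idea; the strength value is a guess, not the author's recommendation.

```python
# Sketch: add a little per-frame Gaussian grain to seed the model, so the
# generated grain doesn't stay static across near-still frames. The sigma
# (1.5 in 0-255 units) is a guess, not an author-recommended value.
import numpy as np

def seed_grain(frame_u8, sigma=1.5, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    noise = rng.normal(0.0, sigma, frame_u8.shape)
    return np.clip(frame_u8.astype(np.float32) + noise, 0, 255).astype(np.uint8)
```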
ESRGAN
4x
No Image
muy4_035_1
Upscaling of anime art (Specifically visual novel CG art)
CAIN
2x
No Image
cainREALTIME
A 2x model for RL videos - interpolation. Architecture: CAIN (CAIN 1 Group). Trained mostly on ABBA music videos.
ESRGAN
1x
UnResize V3
UnResize V3
UnResize V3
Fixes images that have been arbitrarily / poorly resized, such as non-integer nearest-neighbor upscaling/downscaling. Also acts as an image sharpener/deblurrer when used on slightly soft inputs. Pretrained model: 1x_UnResize_MKII_030000_G.pth
ESRGAN
1x
Plants
Plants
Plants
by Muf
A 1x model for bad upscale. Images of plants, trees or other foliage upscaled with Photoshop Preserve Details 2.0. Sharpens and "subdivides" details and noise so it doesn't look upscaled.
ESRGAN
4x
No Image
Ground
A 4x model for upscaling ground textures.
ESRGAN
1x
No Image
MangaJPEGHQ
Remove JPEG artifacts from manga without destroying screentone and other details. For 80-50 JPEG Quality
ESRGAN
1x
No Image
MangaJPEGHQPlus
Remove JPEG artifacts from manga without destroying screentone and other details. For 95-80 JPEG Quality
ESRGAN
1x
MangaJPEGLQ
MangaJPEGLQ
MangaJPEGLQ
Remove JPEG artifacts from manga without destroying screentone and other details. For 25-5 JPEG Quality
ESRGAN
1x
No Image
MangaJPEGMQ
Remove JPEG artifacts from manga without destroying screentone and other details. For 60-30 JPEG Quality
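The four MangaJPEG variants above split by JPEG quality band, so picking one can be a tiny lookup. The bands come from the descriptions; the file names are hypothetical placeholders, and the cutoffs are one reasonable reading of the slightly overlapping ranges.

```python
# Hypothetical dispatch over the quality bands the four variants cover.
# File names are placeholders; cutoffs are one reading of the stated ranges.
def pick_mangajpeg_model(jpeg_quality: int) -> str:
    if jpeg_quality >= 80:
        return "1x_MangaJPEGHQPlus.pth"  # quality 95-80
    if jpeg_quality >= 50:
        return "1x_MangaJPEGHQ.pth"      # quality 80-50
    if jpeg_quality >= 30:
        return "1x_MangaJPEGMQ.pth"      # quality 60-30
    return "1x_MangaJPEGLQ.pth"          # quality 25-5
```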
ESRGAN
4x
No Image
NXbrz
Basic pixel art upscaling, for people who want a simpler style and a lightweight pixel art upscaling model. Sample images: https://drive.google.com/drive/folders/1mYTMpwDlKQulBmjgKpcxL3-T6xAGXVxl?usp=sharing
ESRGAN
1x
No Image
DeBink v4
This model removes early 2000s Bink and other compression artifacts. Works well on almost any image or video compression type.
ESRGAN
1x
No Image
DeBink v5
This model removes early 2000s Bink and other compression artifacts. Works well on almost any image or video compression type.
ESRGAN
1x
No Image
DeBink v6
This model removes early 2000s Bink and other compression artifacts. Works well on almost any image or video compression type.
ESRGAN
1x
No Image
Loyaldk SharpKeroro
Upscale Anime while keeping as much of the original detail as possible. Isn't among the sharpest models and if the anime is somewhat blurry it will retain that detail but add more pixels to give a smoother edge. Doesn't change the color very much and may make some scene's very slightly darker. Pretrained model: Pony Models
ESRGAN
2x
No Image
Loyaldk Giroro
Upscale Anime while keeping as much of the original detail as possible. Isn't among the sharpest models and if the anime is somewhat blurry it will retain that detail but add more pixels to give a smoother edge. Doesn't change the color very much and may make some scene's very slightly darker. Pretrained model: Pony Models
ESRGAN
2x
No Image
Loyaldk Keroro
Upscale Anime while keeping as much of the original detail as possible. Isn't among the sharpest models and if the anime is somewhat blurry it will retain that detail but add more pixels to give a smoother edge. Doesn't change the color very much and may make some scene's very slightly darker. Pretrained model: Pony Models
ESRGAN
2x
No Image
Loyaldk Kororo
Upscale Anime while keeping as much of the original detail as possible. Isn't among the sharpest models and if the anime is somewhat blurry it will retain that detail but add more pixels to give a smoother edge. Doesn't change the color very much and may make some scene's very slightly darker. Pretrained model: Pony Models
ESRGAN
4x
No Image
Loyaldk Giroro
Upscale Anime while keeping as much of the original detail as possible. Isn't among the sharpest models and if the anime is somewhat blurry it will retain that detail but add more pixels to give a smoother edge. Doesn't change the color very much and may make some scene's very slightly darker. Pretrained model: Pony Models
ESRGAN
4x
No Image
Loyaldk Keroro
Upscale Anime while keeping as much of the original detail as possible. Isn't among the sharpest models and if the anime is somewhat blurry it will retain that detail but add more pixels to give a smoother edge. Doesn't change the color very much and may make some scene's very slightly darker. Pretrained model: Pony Models
ESRGAN
4x
No Image
Loyaldk Kororo
Upscale Anime while keeping as much of the original detail as possible. Isn't among the sharpest models and if the anime is somewhat blurry it will retain that detail but add more pixels to give a smoother edge. Doesn't change the color very much and may make some scene's very slightly darker. Pretrained model: Pony Models
CAIN
2x
No Image
rvpV1
A 2x model for video frame interpolation for anime openings. My ~~first~~ (ok, technically second) attempt to create a video frame interpolation model, and I like how it turned out. To use it, you can either use https://gitlab.com/hubert.sontowski2007/cainapp, https://github.com/styler00dollar/Colab-CAIN, or the bot in [Game Upscale] (the model is called rvpv1 there; just use --model rvpv1). Some demo videos are in pcloud, but you need to download them; the web player seems to play back in low fps. The architecture is mainly the same as the original CAIN, but I modified the padding to be zero padding instead. `.pt` means a JIT model, `.pth` means a normal PyTorch model. The architecture file is in pcloud as well. And no, no cupscale or flowframes. Sample video: https://files.catbox.moe/xiq9vi.mp4
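For anyone unsure what the `.pt` / `.pth` distinction means in practice, this is how each is loaded in PyTorch; file names are placeholders.

```python
# The two release formats the description distinguishes:
import torch

# TorchScript (.pt): self-contained, runnable without the architecture code.
jit_model = torch.jit.load("rvpv1.pt", map_location="cpu")

# Plain checkpoint (.pth): just weights; you need the CAIN architecture
# definition to instantiate a model and load these into it.
state = torch.load("rvpv1.pth", map_location="cpu")
```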
ESRGAN
2x
KemonoScale v2
KemonoScale v2
KemonoScale v2
Upscaling frames from Irodori anime (namely kemono friends) from 540p (the source render resolution) to 1080p, low resolution flat shaded art, de-JPEG of the aforementioned
ESRGAN+
2x
MangaScaleV3
MangaScaleV3
MangaScaleV3
To upscale manga including halftones, instead of trying to smooth them out. Pretrained model: MangaScaleV2 (not released)
ESRGAN
4x
No Image
UltraFArt v3
A 4x model for Art. Illustrations with larger shaped features (?). Pretrained model: 4x_UltraFArt
ESRGAN
4x
No Image
UltraFArt v3 Fine
A 4x model for Art. Illustrations with larger shaped features (?). Pretrained model: 4x_UltraFArt
ESRGAN
4x
No Image
UltraFArt v3 Photo
A 4x model for Art. Illustrations with larger shaped features (?). Pretrained model: 4x_UltraFArt
ESRGAN
4x
No Image
UltraFArt v3 Smooth
A 4x model for Art. Illustrations with larger shaped features (?). Pretrained model: 4x_UltraFArt
ESRGAN
1x
No Image
ThePi7on Solidd Deborutify UltraLite
Sharpening, line darkening and slight line thinning, specifically made for the Boruto anime.
ESRGAN
2x
No Image
Loyaldk SuperPony V2.0
Upscale MLP episodes. Full version. Handles compression better than V1.0 and no longer creates halos/rainbows. Good for vector 2D art, however it converts detail to blobs.
ESRGAN
4x
No Image
Loyaldk MediumPony V2.0
Upscale MLP episodes. Full version. Handles compression better than V1.0 and no longer creates halos/rainbows. Good for vector 2D art, however it converts detail to blobs.
ESRGAN
4x
No Image
Loyaldk SuperPony V2.0
Upscale MLP episodes. Full version. Handles compression better than V1.0 and no longer creates halos/rainbows. Good for vector 2D art, however it converts detail to blobs.
ESRGAN
1x
DitherDeleterV3 Smooth
DitherDeleterV3 Smooth
DitherDeleterV3 Smooth
Attempts to remove the damage done by dithering. For this model, I downscaled all of the HR images to bring their pixel size closer to 1000x1000 using the Box filter, then downscaled them again by 50% using the Point filter. Afterwards, I applied 32-bit Riemersma dithering to every image in the dataset. https://imgsli.com/NTA5MjU/0/1
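A rough sketch of that dataset preparation in Pillow follows. Pillow ships the Box and Point (nearest) filters but has no Riemersma dither, so Floyd-Steinberg during palette quantization stands in for that last step, and the color count here is illustrative:

```python
from PIL import Image

hr = Image.open("hr_image.png").convert("RGB")

# Downscale so the longer side lands near 1000 px, using the Box filter.
scale = 1000 / max(hr.size)
hr = hr.resize((round(hr.width * scale), round(hr.height * scale)),
               resample=Image.Resampling.BOX)

# Downscale again by 50% with the Point (nearest-neighbor) filter.
lr = hr.resize((hr.width // 2, hr.height // 2),
               resample=Image.Resampling.NEAREST)

# Degrade the LR with dithered palette quantization (Riemersma stand-in).
lr = lr.quantize(colors=32, dither=Image.Dither.FLOYDSTEINBERG).convert("RGB")
lr.save("lr_image.png")
```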
ESRGAN
2x
LD-Anime_Skr v1.0
LD-Anime_Skr v1.0
LD-Anime_Skr v1.0
by Skr
A 2x model for Denoise/Dehalo. Denoises, dehalos, and derainbows old anime.
ESRGAN
1x
Bandage Smooth
Bandage Smooth
Bandage Smooth
Attempts to remove the damage done by color banding. For this model, I downscaled all of the HR images to bring their pixel size closer to 1000x1000 using the Box filter, then downscaled them again by 50% using the Point filter. Afterwards, I applied 64-bit color banding to every image in the dataset. Comparisons with other models: https://imgsli.com/NTA1NTk
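To get a feel for the degradation this model reverses, color banding can be simulated by posterizing an image to fewer bits per channel. A minimal sketch; the bit depth is illustrative and the author's exact banding tool isn't stated:

```python
from PIL import Image, ImageOps

hr = Image.open("hr_image.png").convert("RGB")

# Fewer bits per channel turn smooth gradients into visible steps (banding).
banded = ImageOps.posterize(hr, bits=3)
banded.save("lr_banded.png")
```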
ESRGAN
4x
No Image
Loyaldk LitePony V2.0
Good for vector 2D art, however it converts detail to blobs. Quality seems weak with this one, but I'm posting it anyway to see if a use is found.
ESRGAN
2x
No Image
Loyaldk MediumPony V2.0
Upscale MLP episodes. Liter version. Handles compression better than V1.0 and no longer creates halos/rainbows. Good for vector 2D art, however it converts detail to blobs; it leaves more blobs when doing so compared to LitePony.
ESRGAN
1x
DXTless SourceEngine
DXTless SourceEngine
DXTless SourceEngine
This model is made for Source Engine textures. It tries to remove compression artifacts such as blockiness, discoloration, and green tint. It does pretty well on a lot of things, realistic stuff as well, but it was mostly made to work on TF2 textures. It's made to keep as much detail as possible, without any unnecessary denoising/sharpening. Huge thanks to Twittman for assisting me along this journey.
ESRGAN
4x
No Image
Remacri
A 4x model. Pretrained model: none (interpolated). A creation based on BSRGAN with more detail and less smoothing, made by interpolating IRL models such as Siax, Superscale, Superscale Artisoft, Pixel Perfect, etc. This way, things like skin and other details don't become mushy and blurry.
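Model interpolation of the kind described here is a weighted average of two networks' weights. A minimal sketch for two same-architecture ESRGAN checkpoints, assuming plain state-dict .pth files and hypothetical file names:

```python
import torch

a = torch.load("model_a.pth", map_location="cpu")
b = torch.load("model_b.pth", map_location="cpu")
alpha = 0.5  # blend factor: 1.0 keeps A, 0.0 keeps B

# Same architecture means identical keys and tensor shapes in both dicts.
blended = {k: alpha * a[k] + (1 - alpha) * b[k] for k in a}
torch.save(blended, "interpolated.pth")
```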
ESRGAN
2x
No Image
Loyaldk LitePony V2.0
Upscale MLP episodes. Liter version. Handles compression better than V1.0 and no longer creates halos/rainbows. Good for vector 2D art, however it converts detail to blobs.
ESRGAN
4x
UniversalUpscalerV2 Sharp
UniversalUpscalerV2 Sharp
UniversalUpscalerV2 Sharp
General Upscaler Comparison with v1: https://imgsli.com/NDc3ODA/1/2
ESRGAN
2x
No Image
VHS upscale and denoise Film
A 2x model for VHS. For VHS captures of film material, but it may work on VHS recordings of native SD-TV material as well. Also usable on cleaned-up source material.
ESRGAN
8x
No Image
MS Unpainter
Low-resolution MS Paint drawings, general pixel art, and general dithered pixel art; all kinds of pixel art. Pretrained model: a mix of 8x_glasshopper_ArzenalV1.1, 8x_glasshopper_MS-Unpainter, 8x_NMKD-Sphax + Sphax de-dither, and 8x_NMKD-YanderePixelArt4 + Yandere De-Dither, all interpolated with adjustments using the interpolate function in Cupscale
ESRGAN
8x
No Image
MS Unpainter De-Dither
Low-resolution MS Paint drawings, general pixel art, and general dithered pixel art; all kinds of pixel art. Pretrained model: a mix of 8x_glasshopper_ArzenalV1.1, 8x_glasshopper_MS-Unpainter, 8x_NMKD-Sphax + Sphax de-dither, and 8x_NMKD-YanderePixelArt4 + Yandere De-Dither, all interpolated with adjustments using the interpolate function in Cupscale
ESRGAN
2x
No Image
KcjpunkAnime 2.0 Lite
A 2x model for Digital Animation. This is my first attempt at making a light model, so I started with a 2x version. It is much faster and gives better results than my previous one. For upscaling digital animation.
ESRGAN
4x
No Image
KCJPUNK
A 4x model for Digital Animation. Up-scaling Digital Animation
ESRGAN
4x
HellinaCel
HellinaCel
HellinaCel
A 4x model for Traditional Animation. A rougher alternative to 4xCelFrames, with a focus on realistic-looking cels over nice-looking cels. It's trained on DetoriationFrames LRs, so if you give it an image straight from the source, it will go twice as hard on it. This can be used to your advantage, though. I recommend cleaning it up before running DetoriationFrames, or else it will come out rough.
ESRGAN
1x
PixelSharpen
PixelSharpen
PixelSharpen
Restores blurry/upscaled pixel art.
ESRGAN
1x
VHS-Sharpen
VHS-Sharpen
VHS-Sharpen
A 1x model for VHS. Make old VHS footage crispy. This model will **not** work on video and images with noticeable JPEG/Video compression artifacts, noticeable interlacing or haloing, heavy tape distortion/artifacts and scenes with tons of detail. For best results, use a downscaled HD capture of the VHS tape you intend to use it on.
ESRGAN
4x
No Image
Training4Melozard Anime
by Joey
Pretrained model: RRDB_ESRGAN_x4_old_arch.pth
ESRGAN
1x
No Image
BaldrickVHSFix V0.2
A 1x model for VHS. Fixing minor VHS Chroma and Pattern Noise - NOTE: only works on deinterlaced sources
ESRGAN
2x
No Image
Loyaldk LitePony V1.0
Upscale MLP episodes. Liter version.
SRResNet
2x
No Image
Waifaux NL3 SRResNet
by Joey
Emulating Waifu2x at Noise Level 3. NOTE: You can't use this with regular ESRGAN forks or the bot; it has to be run through BasicSR.
ESRGAN
2x
No Image
Waifaux NL3 SuperLite
by Joey
Trained this model to see whether ESRGAN could essentially reproduce the same results as Waifu2x.
ESRGAN
2x
DigitalFilmV5 Lite
DigitalFilmV5 Lite
DigitalFilmV5 Lite
A 2x model for Traditional Animation. Upscaling Dragon Ball Z DBox DVDs (grainy sources). It keeps some grain, but also does some cleaning, sharpening, and fixing.
ESRGAN
4x
No Image
DigitalFake 2.1
by Joey
A 4x model for Digital Animation. Replica of DigitalFrames 2.1 but interpolatable
ESRGAN
8x
No Image
Arzenal v1.1
Smooth general pixel art, Minecraft textures. Pretrained model: interpolated from nmkd's pixel art upscaling models with different interpolation settings and de-dithered versions
ESRGAN
4x
No Image
PocketMonsters-Alpha
by Joey
A 4x model for Pixel Art with Transparency / Alpha Channel. Upscaling pixel art with alpha channels; it should perform better than any other. It should work well on both cartoon and 3D styled content.
ESRGAN
8x
No Image
Sphax Alpha NN
by Joey
An 8x model for Pixel Art with Transparency / Alpha Channel. Replica of Sphax with transparency
ESRGAN
2x
No Image
Gen5 Alpha
by Joey
A 2x model for Pixel Art with Transparency / Alpha Channel.
ESRGAN
2x
NMKD YandereNeo (2x)
NMKD YandereNeo (2x)
NMKD YandereNeo (2x)
by Nmkd
Pretrained model: 4x_DIV2K-Lite_1M (`4x_NMKD-YandereNeo-Lite_320k` for the 2x conversion)
ESRGAN
4x
NMKD YandereNeo (4x)
NMKD YandereNeo (4x)
NMKD YandereNeo (4x)
by Nmkd
Pretrained model: 4x_DIV2K-Lite_1M (`4x_NMKD-YandereNeo-Lite_320k` for the 2x conversion)
ESRGAN
1x
DitherDeleter Smooth
DitherDeleter Smooth
ESRGAN
2x
CGIMaster v1
CGIMaster v1
CGIMaster v1
Mixed 3D/2D CGI animations; on some 2D animations it might sharpen the edges. Its general use is upscaling CGI animations compressed by YouTube.
ESRGAN
4x
SGI
SGI
SGI
Upscaling and dedithering pre-rendered sprites, images and textures made in the 90s. Basically vintage CGI.
ESRGAN
4x
No Image
1ch-Alpha Lite
by Joey
A 4x model for Alpha. Obsoleted by Joey's Fork. For the alpha channels of PNGs.
ESRGAN
4x
FSMangaV2
FSMangaV2
FSMangaV2
Manga-style images with or without dithering - cartoons, maybe pixel art, etc
ESRGAN
2x
BIGOLDIES
BIGOLDIES
BIGOLDIES
Upscaling old anime. Helps to denoise, recover lines, and dehalo.
ESRGAN
2x
SHARP ANIME V2
SHARP ANIME V2
ESRGAN
1x
Sudo Inpaint PartialConv2D
Sudo Inpaint PartialConv2D
Sudo Inpaint PartialConv2D
Experimental PartialConv2D attempt at inpainting with ESRGAN. Took ~10.4 days of training on a P100 and around 1.5 months in total due to Colab limits. Not sure if I will continue training it, since training is very slow, but it may get better. Warning: results can vary with different tile sizes; try not to tile your data. Sample image: https://e.pcloud.link/publink/show?code=kZQOu7ZldzmFyMPUcFNGkEvwqOxQ8Bl3CeX
SOFVSR
4x
No Image
REDSVAL-7f-RRDB Lite
by Joey
A 4x model for IRL videos.
ESRGAN
4x
No Image
Cat_Patch
A 4x model for Cats. Pretrained model: Previous attempt
ESRGAN
4x
No Image
AbeScale
A 4x model for Linework Cartoons.
SOFVSR
4x
No Image
VESRGAN_G
Trained on a REDS-size dataset
SOFVSR
4x
No Image
SOFVSR_REDS_F3 V1
A 4x model for IRL videos. Upscales IRL videos. Use not recommended by the creator. Sample video: https://cdn.discordapp.com/attachments/547949806761410560/813134983236419664/100k_iter.mp4
ESRGAN
4x
PixelPerfectV4
PixelPerfectV4
PixelPerfectV4
A 4x model for Pixel Art/Sprites. Sprite Upscaler Comparison with v3: https://imgsli.com/MzgxMTc/2/1
ESRGAN
1x
No Image
FrankenMapGenerator-CX Lite
by Joey
A 1x model for Map Generation - roughness and displacement maps. This model generates "Franken Maps" (named after Frankenstein), which is a custom material map combination I made. Basically, the Red channel of RGB is just the texture converted to grayscale, the Green channel is the roughness map, and the Blue channel is the displacement map. I had to do this to get around the current limitation of CX loss where it requires a 3 channel output (otherwise I would have just made a 2 channel model, or separate single channel models). As of right now the channels need to be manually split from each other but I will be making a tool for doing this automatically in the coming days. Pretrained model: 1x_DIV2K-Lite_450k.pth
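Until the splitting tool the author mentions is available, the channels can be separated in a few lines of Pillow, per the packing described above (file names hypothetical):

```python
from PIL import Image

franken = Image.open("franken_map.png").convert("RGB")

# R = grayscale texture, G = roughness map, B = displacement map.
gray, roughness, displacement = franken.split()
roughness.save("roughness.png")
displacement.save("displacement.png")
```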
ESRGAN
4x
No Image
Fatality MK2
A 4x model for Pixel Art/Sprites. Pretrained model: a previously attempted MK2.
ESRGAN
4x
NMKD Siax ("CX")
NMKD Siax ("CX")
NMKD Siax ("CX")
by Nmkd
Universal upscaler for clean and slightly compressed images (JPEG quality 75 or better)
ESRGAN
1x
NormalMapGenerator-CX Lite
NormalMapGenerator-CX Lite
NormalMapGenerator-CX Lite
by Joey
A 1x model for Map Generation - normal maps. Generating normal maps from textures Pretrained model: 1x_DIV2K-Lite_450k.pth
ESRGAN
8x
NMKD Typescale
NMKD Typescale
NMKD Typescale
by Nmkd
Low-resolution text/typography and symbols
ESRGAN
1x
DeEdge
DeEdge
DeEdge
Halo Removal + edge removal. This model softens edges without blurring other parts of the image. Comparison and use case: https://imgsli.com/MjgxMDU
ESRGAN
1x
No Image
SpongeBC1 Lite
by Joey
First ever lite BC1/DXT1 model. Probably only useful for cartoony textures like those in spongebob games or other cartoon licensed games. Pretrained model: 1x_DIV2K-Lite_80k
ESRGAN
2x
pokemodel lite
pokemodel lite
pokemodel lite
Upscale old anime like pokemon
ESRGAN
1x
NMKD h264Texturize
NMKD h264Texturize
NMKD h264Texturize
by Nmkd
A 1x model for Texturizing. Tries to reverse heavy h264 compression. Fails. Can be used to texturize images though.
ESRGAN
4x
No Image
Rek's Effeks Photoanime v2
by Rek
A 4x model for Stylization. Photo stylization from JPEGs. Trained on images upscaled by ISO Denoise v2 -> DeJPEG Fatality PlusULTRA -> NMKD Yan2. Essentially a combination of that model chain into one. Use if you're looking for a stylized output, not photo quality. Pretrained model: NMKD Yandere2
ESRGAN
1x
No Image
SpongeColor Lite
by Joey
The first attempt at ESRGAN colorization that produces more than 2 colors. Doesn't work that great but it was a neat experiment.
ESRGAN
4x
NMKD UltraYandere Lite
NMKD UltraYandere Lite
NMKD UltraYandere Lite
by Nmkd
Fast Anime/Art upscaling Pretrained model: 4x_DIV2K-Lite
ESRGAN
2x
No Image
Faithful Lite
by Joey
A "lite" model version of my Faithful model
ESRGAN
2x
No Image
FakeFaith Lite
by Joey
An attempt at recreating the "faithful" style without using the faithful dataset -- aka keeping the "pixel art" style of pixel art.
ESRGAN
4x
No Image
Struzan
A 4x model for Art. Upscaling airbrush/pencil-based artwork Sample images: https://drive.google.com/drive/folders/1fyeIWInDrM6r-xxrCW4U09oafWw08u9S?usp=sharing
ESRGAN
1x
No Image
SBDV-DeJPEG Lite
by Joey
Pretrained model: 1x_DIV2K-Lite_80k.pth
ESRGAN
4x
No Image
RRDB-G_ResNet-D (Both G and D)
by Joey
Pretrained discriminators. Trained on clean bicubic downscales / pretrained models (G/D)
ESRGAN
4x
NMKD UltraYandere
NMKD UltraYandere
NMKD UltraYandere
by Nmkd
A 4x model for Art/Anime. Highly flexible 2D Art upscaling
ESRGAN
1x
NMKD YandereInpaint
NMKD YandereInpaint
ESRGAN
4x
No Image
MeguUp
Upscaling of lossless (uncompressed) anime art. Pretrained model: Interpolated custom, hence the license.
ESRGAN
2x
SHARP ANIME V1
SHARP ANIME V1
SHARP ANIME V1
This model has been trained to work on lines and details. It works well on anime that has fairly fine lines at the base, but the video must be progressive or deinterlaced.
ESRGAN
4x
NMKD PatchySharp
NMKD PatchySharp
NMKD PatchySharp
by Nmkd
A 4x model for Art/CGI. Upscaler for clean images or images with compression artifacts (jpeg quality >75) - Produces very sharp lines/edges due to NN-Filtered HR images. Proven to produce very, very good results on drawings (sharp lines) and CGI, but should also work pretty well for real-world images.
ESRGAN
4x
No Image
Lollypop
A universal model aimed at prerendered images, but it also handles realistic faces, manga, pixel art, and dedithering. Trained using the PatchGAN discriminator with CX loss, CutMixUp, and frequency separation; it produces good results with a slight grain due to PatchGAN and some sharpening from CutMixUp.
ESRGAN
4x
PackCraft v4
PackCraft v4
PackCraft v4
by Joey
A 4x model for Upscaling pack.png. Designed to upscale one specific minecraft screenshot. Results on dissimilar screenshots may be poor. Pretrained model: 4xESRGAN originally, but I used each previous version as a pretrained until I got to v4.
ESRGAN
4x
OLDIES_ALTERNATIVE_FINAL
OLDIES_ALTERNATIVE_FINAL
OLDIES_ALTERNATIVE_FINAL
This model was made for my Captain Tsubasa anime project, so I don't know if it works well for anything else. Just try it ;)
ESRGAN
2x
fidelbd pokemodel
fidelbd pokemodel
fidelbd pokemodel
Made this model to upscale old anime that looks blurry.
SOFVSR
2x
No Image
SBS11 RRDB
by Joey
Pretrained Discriminators
ESRGAN
4x
No Image
Rebout Blend
A 4x model for Pixel Art/Sprites.
SOFVSR
3x
No Image
Video TSSM RRDB
by Joey
Pretrained Discriminators
ESRGAN
4x
No Image
OLDIES_FINAL
I made this model to upscale old anime and denoise it.
SOFVSR
4x
No Image
video_G
Trained on a REDS-size dataset
ESRGAN
4x
No Image
DeIndeo
A 4x model for Indeo Compression Artifacts.
ESRGAN
1x
BCGone Smooth
BCGone Smooth
BCGone Smooth
Attempts to remove the damage done by BC1 compression.
ESRGAN
1x
No Image
DoubleDetoon
by Joey
A 1x model for Detooning. An attempt to detoon images/drawings of people
ESRGAN
4x
No Image
BSDevianceMIP
A 4x model for Art. Upscales digital drawings. It was trained on random drawings found on DeviantArt, mostly landscapes, scenery, and illustrations of characters. It does fairly well and works on many different styles.
ESRGAN
4x
No Image
BSSbeveHarvey
Upscale Steve Harvey, but maybe other things, somehow??
ESRGAN
4x
BigFace
BigFace
BigFace
A 4x model for Art/People.
ESRGAN
4x
BigFace v3
BigFace v3
BigFace v3
A 4x model for Art/People.
ESRGAN
4x
No Image
BigFace V3 Blend
A 4x model for Art/People.
ESRGAN
4x
No Image
BigFace V3 Clear
A 4x model for Art/People.
ESRGAN
4x
No Image
NMKD Superscale
by Nmkd
A 4x model for Clean Real-World Images. Upscaling of realistic images/photos with noise and compression artifacts
ESRGAN
4x
No Image
FuzzyBox
Photographs, artwork, textures, anything really. Tried out a new pixel loss idea based on ensuring that the HR output, once downscaled, matches the LR. Colors are pretty good, as are edges, but generated details seem slightly fuzzy, hence the name.
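That pixel-loss idea can be sketched in PyTorch as an L1 penalty between the downscaled output and the LR input. The bicubic filter and L1 distance here are assumptions, not the author's exact setup:

```python
import torch
import torch.nn.functional as F

def downscale_consistency_loss(sr: torch.Tensor, lr: torch.Tensor,
                               scale: int = 4) -> torch.Tensor:
    """L1 between the SR output downscaled back to LR size and the LR input."""
    sr_down = F.interpolate(sr, scale_factor=1 / scale, mode="bicubic",
                            align_corners=False, antialias=True)
    return F.l1_loss(sr_down, lr)
```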
ESRGAN
4x
Jaypeg90
Jaypeg90
Photos/realistic 3D with JPEG compression, quality 85-95 and 4:2:0 chroma subsampling. Created for Myst3 images, since they all have 4:2:0 chroma subsampling and existing JPEG models did not give good results. Favors smoothing over over-sharpening.
SPSR
2x
No Image
FaithfulSPSR
by Joey
Mainly just a test for SPSR. Seems to work better than the original 2xFaithful32_1316 that I used as a pretrained, even though it uses the same dataset. Pretrained model: 2xFaithful32_1316
SPSR
4x
No Image
BS_ScreenBooster_SPSR
This model is designed to upscale game screenshots (3D Games) by 4 times. The SPSR version is an improvement over the ESRGAN based V2.
ESRGAN
1x
No Image
N64clean
N64 textures use a color depth of 5 bits per channel; this model attempts to clean them, restoring smooth gradients in textures.
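To see the degradation this targets, 5-bit color depth can be simulated by keeping only the top 5 bits of each 8-bit channel. A sketch; the author's actual dataset pipeline isn't stated:

```python
import numpy as np
from PIL import Image

img = np.asarray(Image.open("texture.png").convert("RGB"))

# Keep the top 5 bits per channel, then rescale 0-31 back to 0-255,
# reproducing the stepped gradients of 5-bit color.
q = (img >> 3).astype(np.float32) * (255.0 / 31.0)
Image.fromarray(q.round().astype(np.uint8)).save("texture_5bit.png")
```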
ESRGAN
4x
FSDedither Riven
FSDedither Riven
FSDedither Riven
Fine-tuned 4xFSDedither to upscale images from the game Riven, but should be better in general, particularly on ordered dithering. I adjusted the dataset to have a better variety of dithering parameters, and turned up the HFEN and pixel loss to get better details and color restoration with less noise.
ESRGAN
4x
No Image
NickelbackFS
This model aims to improve further on what was achieved by the old Nickelback, which was itself an improvement attempt over 4xESRGAN and 4xBox. It can upscale most pictures/photos (granted they are clean enough) without destroying as much detail as Box and basic ESRGAN.
ESRGAN
4x
No Image
Morrowind Mixed
Morrowind Mod Textures
ESRGAN
8x
No Image
HugePaint
An 8x model for Digital Illustrations. Trained on a variety of images from ArtStation
ESRGAN
4x
No Image
Fatal_Anime
Trained on Anime and Manga images
ESRGAN
8x
No Image
HugePeeps v1
An 8x model for Art/People. Painted humans
ESRGAN
4x
No Image
Fatal Pixels
A 4x model for Pixel Art/Sprites.
ESRGAN
4x
No Image
BigFArt
Larger-scaled pixels to digital painting
ESRGAN
4x
No Image
BigFArt Bang1
Larger-scaled pixels to digital painting
ESRGAN
4x
No Image
BigFArt Base
Larger-scaled pixels to digital painting
ESRGAN
4x
No Image
BigFArt Blend
Larger-scaled pixels to digital painting
ESRGAN
4x
No Image
BigFArt Detail
Larger-scaled pixels to digital painting
ESRGAN
4x
No Image
BigFArt Fine
Larger-scaled pixels to digital painting
ESRGAN
4x
RealSR DF2K
RealSR DF2K
Real-World Super-Resolution via Kernel Estimation and Noise Injection. Our solution is the winner of the CVPR NTIRE 2020 Challenge on Real-World Super-Resolution in both tracks. Recent state-of-the-art super-resolution methods have achieved impressive performance on ideal datasets regardless of blur and noise. However, these methods always fail in real-world image super-resolution, since most of them adopt simple bicubic downsampling from high-quality images to construct Low-Resolution (LR) and High-Resolution (HR) pairs for training, which may lose track of frequency-related details. To address this issue, we focus on designing a novel degradation framework for real-world images by estimating various blur kernels as well as real noise distributions. Based on our novel degradation framework, we can acquire LR images sharing a common domain with real-world images. Then, we propose a real-world super-resolution model aiming at better perception. Extensive experiments on synthetic noise data and real-world images demonstrate that our method outperforms the state-of-the-art methods, resulting in lower noise and better visual quality. In addition, our method is the winner of the NTIRE 2020 Challenge on both tracks of Real-World Super-Resolution, significantly outperforming other competitors by large margins. This variant is for corrupted images with processing noise.
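Schematically, the degradation framework the abstract describes builds LR images by blurring the HR with an estimated kernel, downsampling, and injecting noise sampled from real photos. A rough sketch, not the paper's code; the kernel and noise patch are assumed to come from the paper's estimation steps:

```python
import numpy as np
from scipy.ndimage import convolve

def realsr_style_degrade(hr: np.ndarray, kernel: np.ndarray,
                         noise_patch: np.ndarray, scale: int = 4) -> np.ndarray:
    """Blur HR with an estimated kernel, downsample, then inject real noise.

    hr: float array (H, W, 3); kernel: 2D estimated blur kernel;
    noise_patch: noise cropped from smooth regions of real photos.
    """
    blurred = np.stack([convolve(hr[..., c], kernel) for c in range(3)], axis=-1)
    lr = blurred[::scale, ::scale]                   # simple stride downsample
    noise = noise_patch[:lr.shape[0], :lr.shape[1]]  # crop noise to LR size
    return np.clip(lr + noise, 0.0, 255.0)
```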
ESRGAN
4x
No Image
RealSR DF2K (JPEG)
Same RealSR model family as the DF2K entry above (winner of the CVPR NTIRE 2020 Challenge on Real-World Super-Resolution in both tracks); see that entry for the full description. This variant is for compressed JPEG images.
ESRGAN
4x
No Image
RealSR DPED
Same RealSR model family as the DF2K entry above (winner of the CVPR NTIRE 2020 Challenge on Real-World Super-Resolution in both tracks); see that entry for the full description. This variant is for real images taken with a cell phone camera.
ESRGAN
4x
ThiefGold
ThiefGold
ThiefGold
Various game textures. Primary wood, metal, stone
ESRGAN
4x
ThiefGoldMod
ThiefGoldMod
ThiefGoldMod
Version of the previous model, but based on the Manga109 pretrained model and with a slightly different dataset. Sometimes it gives better results, especially for wood and metal; sometimes worse. It sometimes generates the same dotted artifacts on very bright/white images.
ESRGAN
4x
FSDedither Manga
FSDedither Manga
FSDedither Manga
Cartoons/pixel art/other non-realistic stuff with dithering
ESRGAN
2x
No Image
Byousoku 5 Centimeter
A 2x model for Anime/Pretrained. Anime landscape upscale. Trained on frames from the Blu-ray of Byousoku 5 Centimeter
ESRGAN
4x
No Image
Sol Levante NTSC2HD
A 4x model for Anime/Pretrained. NTSC DVD-spec encode x4 scale super-resolution for Anime Drawing style content. The dataset has a LOT of data throughout almost every frame, so it had a lot of stuff to learn. The resulting DVD-spec encode also had some blocking at times so it also learned to fight off blocking.
SPSR
4x
4x SPSR
4x SPSR
Structure-Preserving Super Resolution with Gradient Guidance. Structures matter in single image super resolution (SISR). Recent studies benefiting from generative adversarial network (GAN) have promoted the development of SISR by recovering photo-realistic images. However, there are always undesired structural distortions in the recovered images. In this paper, we propose a structure-preserving super resolution method to alleviate the above issue while maintaining the merits of GAN-based methods to generate perceptual-pleasant details. Specifically, we exploit gradient maps of images to guide the recovery in two aspects. On the one hand, we restore high-resolution gradient maps by a gradient branch to provide additional structure priors for the SR process. On the other hand, we propose a gradient loss which imposes a second-order restriction on the super-resolved images. Along with the previous image-space loss functions, the gradient-space objectives help generative networks concentrate more on geometric structures. Moreover, our method is model-agnostic, which can be potentially used for off-the-shelf SR networks. Experimental results show that we achieve the best PI and LPIPS performance and meanwhile comparable PSNR and SSIM compared with state-of-the-art perceptual-driven SR methods. Visual results demonstrate our superiority in restoring structures while generating natural SR images.
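The gradient loss described above can be sketched as a distance between image-gradient maps; SPSR's actual gradient branch is a learned network, so this Sobel-based version is only a schematic of the loss term:

```python
import torch
import torch.nn.functional as F

def gradient_map(x: torch.Tensor) -> torch.Tensor:
    """Per-channel Sobel gradient magnitude for an (N, C, H, W) batch."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                      device=x.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    c = x.shape[1]
    gx = F.conv2d(x, kx.repeat(c, 1, 1, 1), padding=1, groups=c)
    gy = F.conv2d(x, ky.repeat(c, 1, 1, 1), padding=1, groups=c)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)

def gradient_loss(sr: torch.Tensor, hr: torch.Tensor) -> torch.Tensor:
    """Penalize structural differences via the images' gradient maps."""
    return F.l1_loss(gradient_map(sr), gradient_map(hr))
```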
ESRGAN
2x
No Image
BSTexty
As the name might suggest, this model aims to upscale text with less distortion than other models. It seems to do a good job generally, but don't expect it to be a state-of-the-art model that can upscale magazines and such. It makes things more readable, but since it was trained on B/W pictures, it desaturates them.
ESRGAN
4x
No Image
SmoothRealism
by Joey
Pixel art, rocky/grainy textures? Quantization smoothing, adding detail. Pretrained model: 4x_RRDB_PSNR_old_arch.pth
ESRGAN
1x
No Image
BSChroma
ChromaSharpen - makes the colors slightly more vibrant, with a side effect of possibly adding chromatic aberration to the image. I am not sure about the use of this model in real-case scenarios, but anything blurry or with fuzzy colors could work.
ESRGAN
1x
No Image
BSLuma
A 1x model for Luma. Fix luminance issues?? LumaSharpen, ESRGAN edition? This model mostly does what "LumaSharpen" algorithms do. It may help fix images with luminance issues, like old DVD rips? I am not sure; didn't test.
ESRGAN
4x
No Image
BSDeviance
A 4x model for Art. This model upscales digital drawings. It was trained on random drawings found on DeviantArt, mostly landscapes, scenery, and illustrations of characters. It does fairly well and works on many different styles.
ESRGAN
1x
No Image
JPEGDestroyer
This model is meant to reduce or eliminate JPEG compression artifacts without making the original images too smooth or killing detail. It manages to do a fairly good job, but don't expect overly compressed images to work with this. Pretrained model: 1xESRGAN + previous attempts
ESRGAN
4x
No Image
Nickelfront
A 4x model for Coins. Upscale coins. That's it. If you were mad at me because Nickelback doesn't make any sense, now you have the perfect solution to your problems. If you want to upscale nickels, or anything with a similar texture made out of metal, now you can. It works pretty well for a joke.
ESRGAN
4x
No Image
Nickelback
This model aims to improve further on what has been achieved by the regular 4xESRGAN and also 4xBox. It can upscale most pictures/photos (granted they are clean enough) without destroying as much detail as the aforementioned models. It generates fewer moiré-like patterns and keeps details without oversharpening or blurring the image too much.
ESRGAN
2x
No Image
BSWolly
Pixar Movies or Wall-E pictures/frames
ESRGAN
1x
No Image
BS_Colorizer/Vapourizer
B/W | 100% desaturated images. It mostly results in blue and yellow images with slight hints of green, orange, and magenta. You are free to use this as a pretrain to achieve better results.
ESRGAN
4x
FSDedither
FSDedither
FSDedither
For photos/realistic images, but worth trying on other images that have reduced colors and dithering along with fine details. Trained using the ESRGAN-FS code (https://github.com/ManuelFritsche/real-world-sr/tree/master/esrgan-fs/codes) for better details compared to plain ESRGAN.
ESRGAN
4x
MinecraftAlphaUpscaler with Good data
MinecraftAlphaUpscaler with Good data
MinecraftAlphaUpscaler with Good data
A 4x model for Upscaling pack.png. Designed to upscale one specific minecraft screenshot. Results on dissimilar screenshots may be poor.
ESRGAN
4x
Minepack
Minepack
Minepack
A 4x model for Upscaling pack.png. Upscales Minecraft screenshots by 4x. May suffer from haloing, weird patterns on blocks and JPEG-like artifacts.
ESRGAN
16x
No Image
16x PSNR Pretrained Model
A 16x model. Pretrained using RRDB_PSNR_x4.pth. The original RRDB_PSNR_x4.pth model converted to 1x, 2x, 8x and 16x scales, intended to be used as pretrained models for new models at those scales. These are compatible with victor's 4xESRGAN.pth conversions
ESRGAN
1x
No Image
1x PSNR Pretrained Model
A 1x model. Pretrained using RRDB_PSNR_x4.pth. The original RRDB_PSNR_x4.pth model converted to 1x, 2x, 8x and 16x scales, intended to be used as pretrained models for new models at those scales. These are compatible with victor's 4xESRGAN.pth conversions
ESRGAN
2x
No Image
2x PSNR Pretrained Model
A 2x model. Pretrained using RRDB_PSNR_x4.pth. The original RRDB_PSNR_x4.pth model converted to 1x, 2x, 8x and 16x scales, intended to be used as pretrained models for new models at those scales. These are compatible with victor's 4xESRGAN.pth conversions
ESRGAN
4x
No Image
4x PSNR Pretrained Model
A 4x model. Pretrained using RRDB_PSNR_x4.pth. The original RRDB_PSNR_x4.pth model converted to 1x, 2x, 8x and 16x scales, intended to be used as pretrained models for new models at those scales. These are compatible with victor's 4xESRGAN.pth conversions
ESRGAN
8x
No Image
8x PSNR Pretrained Model
An 8x model. Pretrained using RRDB_PSNR_x4.pth. The original RRDB_PSNR_x4.pth model converted to 1x, 2x, 8x and 16x scales, intended to be used as pretrained models for new models at those scales. These are compatible with victor's 4xESRGAN.pth conversions
ESRGAN
4x
No Image
SpongeBob.CEL.2.HD.125ki.499e-PHOENiX
Restorative cel-animation MPEG-1 and MPEG-2 model, specifically crafted for SpongeBob S01.
ESRGAN
4x
No Image
FArtFace
A 4x model.
ESRGAN
8x
TGHQFace
TGHQFace
TGHQFace
Upscales blurry 128px faces; useful for enhancing that someone in a small picture.
ESRGAN
4x
No Image
LADDIER1
Removes noise, grain, box blur, lens blur, and Gaussian blur, and increases overall image quality.
ESRGAN
4x
No Image
FireAlpha
A 4x model for Alpha (4 channel).
ESRGAN
4x
No Image
Face-Ality V1 (Fatality Faces)
Upscales small faces to big faces.
ESRGAN
4x
No Image
Spongebob v6
by Joey
New version of the Spongebob model. Ideally it's a lot sharper and cleaner but I'm still not sure if it works better than the old one. From what I can tell it's better in many cases.
ESRGAN
4x
No Image
Spongebob v6 De-Quantize
by Joey
A 4x model for Animation - Quantized. A model I trained to do both de-quantizing as well as upscaling. The results are pretty blurry but it works decently for what it is.
ESRGAN
4x
No Image
Spongebob v6 Deblur
by Joey
After not being entirely happy with the main Spongebob v6 model, I trained a new one with blurring OTF options and two different downscale types. This one is much better in my opinion.
ESRGAN
1x
No Image
Spongebob De-Quantize
by Joey
A 1x model for Animation - Quantized. Removes color quantization/indexing and dithering from cartoon-style images and textures. Pretrained model: 1x_1xDEDITHER_32_512_126900_G
ESRGAN
1x
DeJpeg Fatality PlusULTRA!
DeJpeg Fatality PlusULTRA!
DeJpeg Fatality PlusULTRA!
Pretrained model: 1x_DeJpeg_Fatality_01_200000_G.pth
ESRGAN
4x
No Image
FatalimiX
Comic and Cartoon style images
ESRGAN
1x
Fatality DeBlur
Fatality DeBlur
Fatality DeBlur
Pretrained model: 1x_DeJpeg_Fatality_01_175000_G.pth
ESRGAN
4x
No Image
deviantPixelHD
Similar to Manga109, can be used as a general digital upscaler as well as with pixel art
ESRGAN
2x
No Image
Faithful
by Joey
A 2x model.
ESRGAN
4x
No Image
DigiPaint
A 4x model for Art. Digital Art Upscaler Pretrained model: 4xfalcoon300(manga).pth
ESRGAN
1x
No Image
BC1 Smooth 2
A model to help remove compression artifacts in BC1-BC3/DXT1-DXT5 compressed images (these all have color encoded the same way)
ESRGAN
1x
No Image
mdeblur
Strong deblurring model
ESRGAN
4x
No Image
Deoldify
Old black and white photo restoration.
ESRGAN
4x
No Image
Fallout Weapons V2
by Bob
Video game textures, mostly metal (rusty, clean, or painted)
ESRGAN
4x
No Image
ArtStation1337
A 4x model for Digital Art/People. Mainly for digital art, but can be used to upscale pixel art.
ESRGAN
4x
No Image
ArtStation1337 Bloom
A 4x model for Digital Art/People. Mainly for digital art, but can be used to upscale pixel art.
ESRGAN
4x
No Image
ArtStation1337 Dedither
A 4x model for Digital Art/People. Mainly for digital art, but can be used to upscale pixel art.
ESRGAN
4x
No Image
ArtStation1337 Dedither v2
A 4x model for Digital Art/People. Mainly for digital art, but can be used to upscale pixel art.
ESRGAN
4x
No Image
ArtStation1337 Diffuse
A 4x model for Digital Art/People. Mainly for digital art, but can be used to upscale pixel art.
ESRGAN
4x
No Image
ArtStation1337 v2
A 4x model for Digital Art/People. Mainly for digital art, but can be used to upscale pixel art.
ESRGAN
1x
No Image
JPG (00-20%)
A 1x model. Pretrained using Custom (CC0 Textures). Gallery with sample images: https://drive.google.com/open?id=1HWokMsUwsR_Mw-NOJgc8OdS8mkdIje_C
ESRGAN
4x
No Image
Fallout Weapons (Fallout 4 Weapons?)
by Bob
Video game textures, mostly metal (rusty, clean, or painted)
ESRGAN
4x
No Image
Rebout
A 4x model for Pixel Art/Sprites. For upscaling character sprites
ESRGAN
1x
No Image
JPG (20-40%)
A 1x model. Pretrained using Custom (CC0 Textures). Gallery with sample images: https://drive.google.com/open?id=1HWokMsUwsR_Mw-NOJgc8OdS8mkdIje_C
ESRGAN
1x
No Image
JPG (40-60%)
A 1x model. Pretrained using Custom (CC0 Textures). Gallery with sample images: https://drive.google.com/open?id=1HWokMsUwsR_Mw-NOJgc8OdS8mkdIje_C
ESRGAN
1x
No Image
JPG (60-80%)
A 1x model. Pretrained using Custom (CC0 Textures). Gallery with sample images: https://drive.google.com/open?id=1HWokMsUwsR_Mw-NOJgc8OdS8mkdIje_C
ESRGAN
1x
No Image
JPG (80-100%)
A 1x model. Pretrained using Custom (CC0 Textures). Gallery with sample images: https://drive.google.com/open?id=1HWokMsUwsR_Mw-NOJgc8OdS8mkdIje_C
ESRGAN
1x
No Image
KDM003 scans
A 1x model for Art. Clean-up model for scanned illustrations. Made to remove moiré patterns, reduce small imperfections, and correct mild compression artifacts in scanned Kingdom Death: Monster illustrations. CMYK printing often shifts colors, so this is intended to reverse that color shifting as well. Pretrained model: failed attempts based on ESRGAN_1x_JPEG_80to100
ESRGAN
4x
No Image
Fatality 1
A 4x model for Pixel Art/Sprites. Upscales medium-resolution sprites, dithered or undithered; it can also upscale manga/anime and Game Boy Camera images.
ESRGAN
4x
No Image
Lady0101
A 4x model for Pixel Art/Paintings. Upscale pixel art/paintings to digital painting style
ESRGAN
4x
No Image
Faces_04_N
Upscales faces, both pixelized and real. Pretrained model: 4x_Faces_N_250000.pth
ESRGAN
4x
No Image
WaifuGAN v3
Upscaling CG-painted anime with variable outlines. Pretrained model: Manga109v2.pth
ESRGAN
1x
No Image
cinepak
Removal of compression artifacts from codecs such as Cinepak, msvideo1, and RoQ
ESRGAN
4x
No Image
detoon
A 4x model for Detooning. A toon-to-realistic shading-style model.
ESRGAN
4x
No Image
detoon alt
A 4x model for Detooning. A toon-to-realistic shading-style model.
ESRGAN
4x
No Image
Box
RRDB_ESRGAN_x4 replacement for stuff that's supposed to look realistic.
ESRGAN
4x
No Image
xbrz+dd
An xBRZ-plus-dedithering style pixel-art upscaling model.
ESRGAN
4x
No Image
scalenx
Scalenx style pixel art upscaler
ESRGAN
4x
No Image
xbrz
Xbrz style pixel art upscaler
ESRGAN
1x
No Image
Anti Aliasing
A 1x model for Anti aliasing / Images with pixelated edges.
ESRGAN
1x
No Image
DeSharpen
A 1x model for Denoise. Pretrained on a 1st attempt at random sharpening with the same dataset at 200,000 iterations, which was in turn trained on a non-random desharpen model; ~600,000 iterations total across 3 models. Made for the rare cases where an image was destroyed by applied sharpening noise, e.g. game textures or badly exported photos. If your image does not have any oversharpening, the model won't hurt it, leaving it as is. In theory, this model knows when to activate and when to skip, and it can successfully remove artifacts even if only some parts of the image are oversharpened, for example an image consisting of several combined images, one of them with sharpening noise.
ESRGAN
1x
No Image
Artifacts BC1 Free alsa
by Alsa
A 1x model. Pretrained using BC1 take 2.
ESRGAN
4x
No Image
Face Focus
Face De-blur - for slightly out-of-focus or blurred images of faces. It is aimed at faces/hair.
ESRGAN
4x
No Image
Trixie
A 4x model for Faces / Game Textures. Character textures for Star Wars games, including the heroes, rebels, Sith, and Imperials, plus a few main aliens… Why is it called Trixie? Because "Jar Jar's Big Adventure" would be too long of a name… It also upscales face textures well for general purposes, as well as basic Star Wars content.
ESRGAN
4x
No Image
Realistic Misc
by Alsa
A 4x model. Pretrained using Manga109Attempt. The Misc model is trained on various pictures shot by myself (Alsa), including bricks, stone, dirt, grass, plants, wood, bark, metal and a few others.
ESRGAN
1x
Normals Generator General
Normals Generator General
Normals Generator General
A 1x model for Map Generation - Normal Maps. This model generates "Franken Maps"; see the FrankenMapGenerator-CX Lite entry above for the channel layout and the note about splitting channels manually.
ESRGAN
1x
No Image
JPG (00-20%)
by Alsa
A 1x model. Pretrained using Custom (Photos / Manga).
ESRGAN
4x
No Image
Map
Map / Old Paper with text
ESRGAN
1x
No Image
JPG (20-40%)
by Alsa
A 1x model. Pretrained using Custom (Photos / Manga).
ESRGAN
1x
No Image
JPG (40-60%)
by Alsa
A 1x model. Pretrained using Custom (Photos / Manga).
ESRGAN
4x
No Image
Forest
Wood / Leaves game textures
ESRGAN
1x
No Image
JPG (60-80%)
by Alsa
A 1x model. Pretrained using Custom (Photos / Manga).
ESRGAN
1x
No Image
JPG (80-100%)
by Alsa
A 1x model. Pretrained using Custom (Photos / Manga).
ESRGAN
16x
No Image
16xESRGAN
A 16x model. Pretrained using RRDB_ESRGAN_x4.pth.
ESRGAN
1x
No Image
1xESRGAN
A 1x model. Pretrained using RRDB_ESRGAN_x4.pth.
ESRGAN
2x
No Image
2xESRGAN
A 2x model. Pretrained using RRDB_ESRGAN_x4.pth.
ESRGAN
4x
4xESRGAN
4xESRGAN
4xESRGAN
A 4x model. Pretrained using RRDB_ESRGAN_x4.pth.
ESRGAN
8x
No Image
8xESRGAN
An 8x model. Pretrained using RRDB_ESRGAN_x4.pth.
ESRGAN
4x
No Image
Skyrim Misc
Skyrim Diffuse Textures
ESRGAN
4x
No Image
Comic Book
Comic / Drawings. Trained on a custom (Spider-Man) dataset
ESRGAN
4x
No Image
Skyrim Armory
by Alsa
Game textures of equipment
ESRGAN
4x
No Image
Skyrim Alpha
A 4x model for Pixel Art with Transparency / Alpha Channel.
ESRGAN
4x
No Image
Manga109Attempt
Pretrained model: 4xPSNR