---
language:
- en
library_name: diffusers
license: apache-2.0
---
# UNet2DModel for Digit Image Generation
![image](digit-7.png)
![image](digit-8.png)
![image](digit-9.png)
## Model Details
- Model Name: UNet2DModel
- Task: Digit Image Generation
- Model Type: Generative Model
- Dataset: MNIST (Handwritten Digit Images)
- Output Image Size: 32x32 pixels
- Image Color: Grayscale (single channel)
## Model Description
The model is a class-conditional UNet trained as a denoising diffusion model (DDPM) on MNIST, a dataset of handwritten digit images. Given a class label from 0 to 9, it generates a realistic-looking grayscale image of that digit at 32x32 pixels. The network is defined as:
```
from diffusers import UNet2DModel
unet = UNet2DModel(
    in_channels=1,
    out_channels=1,
    sample_size=32,
    block_out_channels=(32, 64, 128, 256),
    norm_num_groups=8,
    num_class_embeds=10,
)
```
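As a quick sanity check on this configuration, the sketch below uses plain Python (no `diffusers` import) to verify two constraints that a UNet of this shape generally needs: each block width should be divisible by `norm_num_groups` so group normalization can split the channels evenly, and the input size should halve cleanly at each downsampling stage. It assumes that every block except the last halves the resolution, which is an assumption about the default `UNet2DModel` blocks rather than something stated in this card. The helper name `check_unet_config` is hypothetical.

```python
def check_unet_config(sample_size, block_out_channels, norm_num_groups):
    """Return the feature-map size at each level, raising if a constraint fails.

    Assumption: every block except the last downsamples by a factor of 2.
    """
    # Group normalization needs the channel count to split evenly into groups.
    for channels in block_out_channels:
        if channels % norm_num_groups != 0:
            raise ValueError(
                f"{channels} channels not divisible by {norm_num_groups} groups"
            )

    # Track the spatial size through each downsampling stage.
    sizes = [sample_size]
    for _ in block_out_channels[:-1]:
        if sizes[-1] % 2 != 0:
            raise ValueError(f"size {sizes[-1]} cannot be halved evenly")
        sizes.append(sizes[-1] // 2)
    return sizes


sizes = check_unet_config(
    sample_size=32,
    block_out_channels=(32, 64, 128, 256),
    norm_num_groups=8,
)
print(sizes)  # [32, 16, 8, 4]
```

Under that assumption, the deepest block would operate on 4x4 feature maps with 256 channels.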
## Training
[Training Script](./ddpm-mnist-32.ipynb)
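
The notebook itself is not reproduced here. For orientation, the sketch below shows the forward noising step that DDPM-style training generally relies on, using a linear beta schedule with the `DDPMScheduler` default values (1000 steps, betas from 0.0001 to 0.02). Whether the notebook uses exactly these settings is an assumption, and `noise_image` is an illustrative NumPy helper, not a library function.

```python
import numpy as np

# Assumed schedule: linear betas with the DDPMScheduler default values.
num_train_timesteps = 1000
betas = np.linspace(1e-4, 0.02, num_train_timesteps)
alphas_cumprod = np.cumprod(1.0 - betas)


def noise_image(x0, t, rng):
    """Sample x_t from q(x_t | x_0) and return it with the noise that was added."""
    eps = rng.standard_normal(x0.shape)
    a_bar = alphas_cumprod[t]
    x_t = np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * eps
    return x_t, eps


rng = np.random.default_rng(0)
# Stand-in for one MNIST image scaled to [-1, 1] at 32x32 (not real data).
x0 = rng.uniform(-1.0, 1.0, size=(1, 1, 32, 32))
x_t, eps = noise_image(x0, t=500, rng=rng)

# In noise-prediction training, the UNet would be asked to recover `eps`
# from (x_t, t, class label), typically with a mean-squared-error loss.
print(x_t.shape)
```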
## Limitations
- Single Modality: The model generates black and white digit images and does not capture color information.
- Limited to Digits: The model is specifically trained for digit image generation and may not generalize well to other types of images or objects.
- Resolution: The model generates digit images with a fixed resolution of 32x32 pixels and may not perform well on tasks requiring higher-resolution images.
## Ethical Considerations
- Bias: The model's performance may be influenced by biases present in the MNIST dataset, such as variations in handwriting styles.
- Potential Misuse: The generated digit images should not be used for any malicious or fraudulent purposes, including creating counterfeit documents or impersonating individuals.
- Privacy: The model does not store or process any personal or sensitive information.
## Usage Examples
The snippet below loads the pretrained weights and runs the DDPM sampling loop to generate an image of a chosen digit.
```
import torch
from diffusers import UNet2DModel, DDPMScheduler

device = "cuda" if torch.cuda.is_available() else "cpu"
scheduler = DDPMScheduler()
unet = UNet2DModel.from_pretrained("gnokit/unet-mnist-32", use_safetensors=True, variant="fp16").to(device)

class_to_generate = 8  # 0-9
sample = torch.randn(1, 1, 32, 32).to(device)  # start from pure Gaussian noise
class_labels = torch.tensor([class_to_generate]).to(device)

for t in scheduler.timesteps:
    # Predict the noise at timestep t, conditioned on the class label
    with torch.no_grad():
        noise_pred = unet(sample, t, class_labels=class_labels).sample
    # Take one reverse-diffusion step
    sample = scheduler.step(noise_pred, t, sample).prev_sample

# Map from [-1, 1] to [0, 1]; the result is a tensor of shape (1, 1, 32, 32)
image = sample.clip(-1, 1) * 0.5 + 0.5
```
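The `image` produced by the loop above is still a floating-point tensor. To view or save it, one option is to convert it to 8-bit pixels first. The sketch below does this with NumPy only; `to_uint8_images` is a hypothetical helper, and the random array stands in for a real model output so the snippet runs without downloading weights.

```python
import numpy as np


def to_uint8_images(image):
    """Convert a (N, 1, H, W) array with values in [0, 1] to (N, H, W) uint8."""
    arr = np.asarray(image, dtype=np.float32)
    arr = np.clip(arr, 0.0, 1.0)
    # Drop the channel axis and scale to the 0-255 range.
    return (arr[:, 0] * 255.0).round().astype(np.uint8)


# Stand-in for the `image` tensor from the loop above; with a real result,
# pass `image.float().cpu()` instead (assumed to be a torch tensor).
fake_image = np.random.default_rng(0).uniform(0.0, 1.0, size=(1, 1, 32, 32))
pixels = to_uint8_images(fake_image)
print(pixels.shape, pixels.dtype)  # (1, 32, 32) uint8
```

Each `pixels[i]` is a 2-D grayscale array that `PIL.Image.fromarray` can turn into an image for saving.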
---
license: apache-2.0
datasets:
- mnist
pipeline_tag: unconditional-image-generation
---