---
language:
- en
library_name: diffusers
license: apache-2.0
---
# UNet2DModel for Digit Image Generation

![image](digit-7.png)
![image](digit-8.png)
![image](digit-9.png)

## Model Details
- Model Name: UNet2DModel
- Task: Digit Image Generation
- Model Type: Generative Model
- Dataset: MNIST (Handwritten Digit Images)
- Output Image Size: 32x32 pixels
- Image Color: Black and White (Grayscale)

## Model Description
This model is a class-conditional U-Net (`UNet2DModel` from diffusers) used as the denoising network in a DDPM-style diffusion process. It is trained on the MNIST dataset of handwritten digits and generates 32x32 grayscale images of the digits 0 to 9, with the digit selected through a class label.

```python
from diffusers import UNet2DModel

unet = UNet2DModel(
    in_channels=1,
    out_channels=1,
    sample_size=32,
    block_out_channels=(32, 64, 128, 256),
    norm_num_groups=8,
    num_class_embeds=10,
)
```

## Training

[Training Script](./ddpm-mnist-32.ipynb)

## Limitations
- Single Modality: The model generates black and white digit images and does not capture color information.
- Limited to Digits: The model is specifically trained for digit image generation and may not generalize well to other types of images or objects.
- Resolution: The model generates digit images with a fixed resolution of 32x32 pixels and may not perform well on tasks requiring higher-resolution images.

## Ethical Considerations
- Bias: The model's performance may be influenced by biases present in the MNIST dataset, such as variations in handwriting styles.
- Potential Misuse: The generated digit images should not be used for any malicious or fraudulent purposes, including creating counterfeit documents or impersonating individuals.
- Privacy: The model does not store or process any personal or sensitive information.

## Usage Examples
The following Python snippet loads the pretrained weights and generates an image of a chosen digit.
```python
import torch
from diffusers import UNet2DModel, DDPMScheduler

device = "cuda" if torch.cuda.is_available() else "cpu"
scheduler = DDPMScheduler()
unet = UNet2DModel.from_pretrained("gnokit/unet-mnist-32", use_safetensors=True, variant="fp16").to(device)
class_to_generate = 8  # digit to generate, 0-9

# Start from pure Gaussian noise: (batch, channels, height, width)
sample = torch.randn(1, 1, 32, 32, device=device)
class_labels = torch.tensor([class_to_generate], device=device)

for t in scheduler.timesteps:
    # Predict the noise in the current sample, conditioned on the digit class
    with torch.no_grad():
        noise_pred = unet(sample, t, class_labels=class_labels).sample

    # Take one reverse-diffusion step
    sample = scheduler.step(noise_pred, t, sample).prev_sample

# Map from [-1, 1] to [0, 1]; the result is a tensor of shape (1, 1, 32, 32)
image = sample.clip(-1, 1) * 0.5 + 0.5
```
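
The sampling code leaves the result as a float tensor in the range [0, 1]. One possible way to turn it into an image file is sketched below, using Pillow. The random tensor is only a stand-in for `image` so the snippet runs on its own, and the output filename `digit.png` is a hypothetical choice.

```python
import torch
from PIL import Image

# Stand-in for the `image` tensor produced by the sampling loop:
# shape (1, 1, 32, 32), values in [0, 1].
image = torch.rand(1, 1, 32, 32)

# Drop the batch and channel dimensions, then scale to 8-bit grayscale.
pixels = (image[0, 0] * 255).round().to(torch.uint8).cpu().numpy()

# A 2-D uint8 array is interpreted by Pillow as a grayscale ("L") image.
pil_image = Image.fromarray(pixels)
pil_image.save("digit.png")  # hypothetical output path
```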

---
license: apache-2.0
datasets:
- mnist
pipeline_tag: unconditional-image-generation
---