ayoubkirouane committed
Commit e036794 · 1 Parent(s): 19d53c2

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -30,6 +30,8 @@ The dataset consists of **856 {image: caption}** pairs, providing a substantial
  The model is conditioned on both CLIP image tokens and text tokens and employs a **teacher forcing** training approach. It predicts the next text token while considering the context provided by the image and previous text tokens.
 
 
+ ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/6338c06c107c4835a05699f9/N_yNK2tLabtwmSYAqpTEp.jpeg)
+
  ## Limitations
  + The quality of generated captions may vary depending on the complexity and diversity of images from the 'One-Piece-anime-captions' dataset.
  + The model's output is based on the data it was fine-tuned on, so it may not generalize well to images outside the dataset's domain.
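
The teacher-forcing setup described in the README maps to a training step along these lines. This is a minimal sketch, not the commit's actual code: the linear projection layer, the checkpoint names (`openai/clip-vit-base-patch32`, `gpt2`), and the `training_step` helper are all assumptions.

```python
import torch
from transformers import CLIPVisionModel, GPT2LMHeadModel, GPT2Tokenizer

# Assumed wiring: CLIP patch embeddings are projected into GPT-2's
# embedding space and prefixed to the caption's token embeddings.
clip = CLIPVisionModel.from_pretrained("openai/clip-vit-base-patch32")
gpt2 = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# Bridge from CLIP's hidden size to GPT-2's embedding size
# (the real model may use a different bridging module).
proj = torch.nn.Linear(clip.config.hidden_size, gpt2.config.n_embd)

def training_step(pixel_values, caption):
    # Encode the image: (1, num_patches + 1, clip_hidden) -> GPT-2 space
    img_tokens = proj(clip(pixel_values=pixel_values).last_hidden_state)

    # Embed the ground-truth caption with GPT-2's input embeddings
    ids = tokenizer(caption, return_tensors="pt").input_ids
    txt_embeds = gpt2.transformer.wte(ids)

    # Teacher forcing: the model sees the image tokens plus the *true*
    # caption prefix at every position and is trained to predict the next
    # text token. Image positions get label -100 so the loss ignores them;
    # GPT2LMHeadModel shifts labels internally for next-token prediction.
    inputs_embeds = torch.cat([img_tokens, txt_embeds], dim=1)
    labels = torch.cat(
        [torch.full(img_tokens.shape[:2], -100, dtype=torch.long), ids],
        dim=1,
    )
    return gpt2(inputs_embeds=inputs_embeds, labels=labels).loss
```

At inference time the same image prefix is fed in, but the ground-truth caption is no longer available: the model decodes autoregressively, consuming its own previously generated tokens instead.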