---
license: apache-2.0
datasets:
- PleIAs/common_corpus
language:
- en
- fr
- es
- de
- it
- la
- nl
- pl
---
**Pleias-360m-Preview** is an early preview of a 360-million-parameter base model trained by Pleias on Common Corpus.

Like all Pleias base and specialized models, Pleias-360m-Preview has been trained exclusively on open data that is either out of copyright (public domain) or released under a permissive license.

## Description
Pleias-360m-Preview is a transformer base model, pretrained entirely from scratch, using an architecture similar to Llama/GPT-NeoX for easier deployment and inference.

It includes the following features, which carry over to any responsibly trained variant:
* Trained only on open data under a permissive license, in compliance with the European AI Act. By design, all Pleias models are unable to output copyrighted content.
* Extensive multilingual support for the main European languages.
* A new tokenizer designed for enhanced document processing tasks and better multilingual support.
* An extremely low level of toxicity and problematic content.

Pleias-360m-Preview has demonstrated unusually strong multilingual generation abilities for its size range. Fully supported languages include English, French, Spanish, German, Italian, Dutch, Latin and Polish.

Given its size, Pleias-360m-Preview can run on CPU without lossy compression. We provide a first GGUF variant as part of our release.
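
For CPU inference with the GGUF variant, a minimal sketch using llama-cpp-python might look as follows. The GGUF filename is illustrative (substitute the actual file from the release), and the sampling settings follow the recommendations in the next section:

```python
# Minimal CPU inference sketch with llama-cpp-python and the GGUF variant.
# The model path is illustrative; use the GGUF file shipped with the release.
from llama_cpp import Llama

llm = Llama(model_path="pleias-360m-preview.gguf", n_ctx=2048)

output = llm(
    "Saturn is fallen, am I too to fall?",
    max_tokens=128,
    temperature=0.0,     # greedy decoding for more consistent results
    repeat_penalty=1.1,  # slight repetition penalty
)
print(output["choices"][0]["text"])
```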

## Recommended use
As a base model, Pleias-360m-Preview only handles continuation prompts; it is not instruction-tuned.

Text generation currently supports a range of creative writing tasks in multiple European languages. For more consistent results, we recommend a low or zero temperature with a slight repetition penalty (1.1-1.2).
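
As a minimal sketch with the transformers library, a continuation prompt with these settings might look as follows; the repository id is an assumption based on this card's naming and should be checked against the actual release:

```python
# Continuation-prompt sketch with Hugging Face transformers.
# The repository id below is an assumption based on this card's naming.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PleIAs/Pleias-360m-Preview"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Saturn is fallen, am I too to fall?", return_tensors="pt")

# Greedy decoding (zero temperature) with a slight repetition penalty,
# matching the recommendations above.
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=False,
    repetition_penalty=1.15,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```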

Pleias-360m-Preview has been successfully adapted through continued pretraining and full fine-tuning for document processing tasks such as RAG, translation or OCR correction. Given the model's small size, we do not recommend LoRA-based fine-tuning methods; a full fine-tuning sketch is shown below.
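
As a rough illustration of full fine-tuning (rather than LoRA), the sketch below uses the transformers Trainer on a toy dataset; the repository id, dataset, and hyperparameters are all illustrative assumptions, not settings from this release:

```python
# Illustrative full fine-tuning sketch (no LoRA) with the transformers Trainer.
# Repository id, toy dataset, and hyperparameters are assumptions for illustration.
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "PleIAs/Pleias-360m-Preview"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # padding needed for batching
model = AutoModelForCausalLM.from_pretrained(model_id)

# Toy corpus standing in for a real document processing dataset.
texts = ["Corrected OCR text goes here.", "Another training document."]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

train_dataset = Dataset.from_dict({"text": texts}).map(
    tokenize, batched=True, remove_columns=["text"]
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="pleias-360m-finetuned",
        per_device_train_batch_size=8,  # illustrative hyperparameters
        learning_rate=5e-5,
        num_train_epochs=1,
    ),
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```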

## Examples
With only 360 million parameters, Pleias-360m-Preview is able to generate literary texts in multiple languages:

English poetry:

<blockquote>
<b>Saturn is fallen, am I too to fall?<br/>
Am I to leave this haven of my rest,<br/>
This cradle of my glory, this soft clime,<br/>
This calm luxuriance of blissful light,<br/></b>
And the sweetness that was not in me?<br/>
I have been a slave and an enemy.<br/>
The world has made me its prey ; it's wrong;<br/>
It's cruelty! It's madness! — but you know<br/>
That all things are good for one man or another :<br/>
You see how he can be happy without them:<br/>
He cannot be miserable unless they come<br/>
To his heart with their own poisonous breaths.<br/>
But if there were no other way left us,<br/>
We should never find our happiness again.<br/>
If we could only live on earth alone,<br/>
With nothing else about us than ourselves,<br/>
Our very being would become more real.<br/>
There must be some other place where we may<br/>
Live happily as before, and enjoy life<br/>
As well as now. But what shall we do here?<br/>
What will happen when we die? What will hap-<br/>
                                  [pen then?<br/>
When death comes, let him who knows the end<br/>
</blockquote>

A classic French play in the 17th-century writing style (including the long s):

<blockquote>
<b>Helas! pour mon malheur ie l’ay trop écouté.<br/>
Ie n’ay point du ſilence affecté le myſtère.<br/>
Ie croyois ſans péril pouuoir eſtre ſincere.<br/>
Et ſans armer mes yeux d’vn moment de rigueur,<br/></b>
Je ne voy pas que ce foit vn crime à la nature:<br/>
Mais je vous prie, qu'on me donne quelque recompenſe :<br/>
Pourquoy donc auec tant de gloire &c en vne heure?<br/>
Cela fait voir bien clairement que ma deſpenſe<br/>
Eſt plus grande que celle des autres hommes ;<br/>
Car il eſt certain que les Princes font moins heureux;<br/>
Que ceux qui ont eu leur vie par leurs mains confacrée.<br/>
Il y a deux fortes de perſonnes dans le monde :<br/>
Les uns fe nomment gens de lettres ou de proueſleurz,<br/>
Dont on dit que c'eſt eux-mesme qui nous donnent tousjours<br/>
Leur efprit & leur cœur aux affaires publiques ;<br/>
Ainſi ils s'appellent gens de cour, comme ſi cela<br/>
N'eſtoit autre choſe que ces grands Seigneurs qui gouuernenc<br/>
L'Etat,&qui ſe conſeruent toujours avec honneur.<br/>
Ils ſont fort peu de choſes ſans ambition ;<br/>
</blockquote>

## Training
Pleias-360m-Preview was fully pretrained on the Jean Zay supercomputer, on 64 H100 GPUs for 46 hours, with Nanotron, the pretraining library from Hugging Face. We provide the complete settings as a YAML file as part of our release.

The training schedule comprised 518,000 steps (batch size 1,024) over a filtered and enhanced version of Common Corpus (1,086,324,736,000 tokens).
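
The card does not state a sequence length, but the figures above are mutually consistent with a 2,048-token context; this is an inference from the arithmetic, not an official specification:

```python
# Sanity check: total tokens / (steps * batch size) = per-sample sequence length.
steps = 518_000
batch_size = 1_024
total_tokens = 1_086_324_736_000

print(total_tokens // (steps * batch_size))  # -> 2048
```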

## Update
Pleias-360m-Preview is currently released as an early preview.

The model will undergo several more rounds of post-training to enhance its reasoning capacities and fine-tunability, in anticipation of a generalist instruct version.