mahsaBa76 committed on
Commit a0f50f5 · verified · 1 Parent(s): ba70e96

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
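
Note on the pooling flags: the config above sets only `pooling_mode_cls_token` to true, so the sentence embedding is presumably taken from the first ([CLS]) token's vector rather than averaged over all tokens. The sketch below illustrates that difference in plain Python. It is not the sentence-transformers implementation; `cls_pool`, `mean_pool`, and the toy 2-dimensional vectors are hypothetical names and data chosen for illustration (the model itself uses 768 dimensions).

```python
from typing import List, Sequence


def cls_pool(token_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Return the first token's vector (the [CLS] position) as the sentence embedding."""
    if not token_embeddings:
        raise ValueError("need at least one token embedding")
    return list(token_embeddings[0])


def mean_pool(
    token_embeddings: Sequence[Sequence[float]],
    attention_mask: Sequence[int],
) -> List[float]:
    """Average the vectors of non-padding tokens; shown only for contrast."""
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask excludes every token")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input: three token vectors, the last one being padding (mask 0).
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
mask = [1, 1, 0]

cls_vec = cls_pool(tokens)          # uses only the first token
mean_vec = mean_pool(tokens, mask)  # averages the two unmasked tokens
```

With CLS pooling the padding mask does not affect the result, since only position 0 is read. Mean pooling must exclude padded positions to avoid skewing the average.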
README.md ADDED
@@ -0,0 +1,1645 @@
1
+ ---
2
+ tags:
3
+ - sentence-transformers
4
+ - sentence-similarity
5
+ - feature-extraction
6
+ - generated_from_trainer
7
+ - dataset_size:278
8
+ - loss:MatryoshkaLoss
9
+ - loss:MultipleNegativesRankingLoss
10
+ base_model: BAAI/bge-base-en-v1.5
11
+ widget:
12
+ - source_sentence: How does Bitcoin's P2P network prevent malicious nodes from flooding
13
+ the network with invalid blocks or transactions?
14
+ sentences:
15
+ - 'paper-title: The Bitcoin Lightning Network: Scalable Off-Chain Instant Payments
16
+
17
+
18
+ \subsection*{8.4 Payment Routing}
19
+
20
+ It is theoretically possible to build a route map implicitly from observing 2
21
+ -of-2 multisigs on the blockchain to build a routing table. Note, however, this
22
+ is not feasible with pay-to-script-hash transaction outputs, which can be resolved
23
+ out-of-band from the bitcoin protocol via a third party routing service. Building
24
+ a routing table will become necessary for large operators (e.g. BGP, Cjdns). Eventually,
25
+ with optimizations, the network will look a lot like the correspondent banking
26
+ network, or Tier-1 ISPs. Similar to how packets still reach their destination
27
+ on your home network connection, not all participants need to have a full routing
28
+ table. The core Tier-1 routes can be online all the time - while nodes at the
29
+ edges, such as average users, would be connected intermittently.
30
+
31
+
32
+ Node discovery can occur along the edges by pre-selecting and offering partial
33
+ routes to well-known nodes.
34
+
35
+
36
+ \subsection*{8.5 Fees}
37
+
38
+ Lightning Network fees, which differ from blockchain fees, are paid directly between
39
+ participants within the channel. The fees pay for the time-value of money for
40
+ consuming the channel for a determined maximum period of time, and for counterparty
41
+ risk of non-communication.
42
+
43
+
44
+ Counterparty risk for fees only exist with one''s direct channel counterparty.
45
+ If a node two hops away decides to disconnect and their transaction gets broadcast
46
+ on the blockchain, one''s direct counterparties should not broadcast on the blockchain,
47
+ but continue to update via novation with a new Commitment Transaction. See the
48
+ Decrementing Timelocks entry in the HTLC section for more information about counterparty
49
+ risk.
50
+
51
+
52
+ The time-value of fees pays for consuming time (e.g. 3 days) and is conceptually
53
+ equivalent to a gold lease rate without custodial risk; it is the time-value for
54
+ using up the access to money for a very short duration. Since certain paths may
55
+ become very profitable in one direction, it is possible for fees to be negative
56
+ to encourage the channel to be available for those profitable paths.
57
+
58
+
59
+ \section*{9 Risks}
60
+
61
+ The primary risks relate to timelock expiration. Additionally, for core nodes
62
+ and possibly some merchants to be able to route funds, the keys must be held online
63
+ for lower latency. However, end-users and nodes are able to keep their private
64
+ keys firewalled off in cold storage.
65
+
66
+
67
+ \subsection*{9.1 Improper Timelocks}
68
+
69
+ Participants must choose timelocks with sufficient amounts of time. If insufficient
70
+ time is given, it is possible that timelocked transactions believed to be invalid
71
+ will become valid, enabling coin theft by the counterparty. There is a trade-off
72
+ between longer timelocks and the time-value of money. When writing wallet and
73
+ Lightning Network application software, it is necessary to ensure that sufficient
74
+ time is given and users are able to have their transactions enter into the blockchain
75
+ when interacting with non-cooperative or malicious channel counterparties.
76
+
77
+
78
+ \subsection*{9.2 Forced Expiration Spam}
79
+
80
+ Forced expiration of many transactions may be the greatest systemic risk when
81
+ using the Lightning Network. If a malicious participant creates many channels
82
+ and forces them all to expire at once, these may overwhelm block data capacity,
83
+ forcing expiration and broadcast to the blockchain. The result would be mass spam
84
+ on the bitcoin network. The spam may delay transactions to the point where other
85
+ locktimed transactions become valid.
86
+
87
+
88
+ This may be mitigated by permitting one transaction replacement on all pending
89
+ transactions. Anti-spam can be used by permitting only one transaction replacement
90
+ of a higher sequence number by the inverse of an even or odd number. For example,
91
+ if an odd sequence number was broadcast, permit a replacement to a higher even
92
+ number only once. Transactions would use the sequence number in an orderly way
93
+ to replace other transactions. This mitigates the risk assuming honest miners.
94
+ This attack is extremely high risk, as incorrect broadcast of Commitment Transactions
95
+ entail a full penalty of all funds in the channel.
96
+
97
+
98
+ Additionally, one may attempt to steal HTLC transactions by forcing a timeout
99
+ transaction to go through when it should not. This can be easily mitigated by
100
+ having each transfer inside the channel be lower than the total transaction fees
101
+ used. Since transactions are extremely cheap and do not hit the blockchain with
102
+ cooperative channel counterparties, large transfers of value can be split into
103
+ many small transfers. This attempt can only work if the blocks are completely
104
+ full for a long time. While it is possible to mitigate it using a longer HTLC
105
+ timeout duration, variable block sizes may become common, which may need mitigations.
106
+
107
+
108
+ If this type of transaction becomes the dominant form of transactions which are
109
+ included on the blockchain, it may become necessary to increase the block size
110
+ and run a variable blocksize structure and timestop flags as described in the
111
+ section below. This can create sufficient penalties and disincentives to be highly
112
+ unprofitable and unsuccessful for attackers, as attackers lose all their funds
113
+ from broadcasting the wrong transaction, to the point where it will never occur.'
114
+ - 'paper-title: OmniLedger: A Secure, Scale-Out, Decentralized Ledger via Sharding
115
+
116
+
117
+ Fig. 11: Bootstrap bandwidth consumption with state blocks.\\[0pt]
118
+
119
+ to create the UTXO state. For this experiment, we reconstructed Bitcoin''s blockchain
120
+ [5], [41] and created a parallel OmniLedger blockchain with weekly state blocks.
121
+
122
+
123
+ Figure 11 depicts the bandwidth overhead of a validator that did not follow the
124
+ state for the first 100 days. As we can see, the state block approach is better
125
+ if the validator is outdated for more than 19 days or 2736 Bitcoin blocks.
126
+
127
+
128
+ The benefit might not seem substantial for Bitcoin, but in OmniLedger, 2736 blocks
129
+ are created in less than 8 hours, meaning that for one day-long epochs, the state
130
+ block approach is significantly better. If a peak throughput is required and 16
131
+ MB blocks are deployed, we expect reduced bandwidth consumption close to two orders
132
+ of magnitude.
133
+
134
+
135
+ \section*{IX. Related Work}
136
+
137
+ The growing interests in scaling blockchains have produced a number of prominent
138
+ systems that we compare in Table IV. ByzCoin [32] is a first step to scalable
139
+ BFT consensus, but cannot scale-out. Elastico is the first open scale-out DL,
140
+ however, it suffers from performance and security challenges that we have already
141
+ discussed in Section II. RSCoin [16] proposes sharding as a scalable approach
142
+ for centrally banked cryptocurrencies. RSCoin relies on a trusted source of randomness
143
+ for sharding and auditing, making its usage problematic in trustless settings.
144
+ Furthermore, to validate transactions, each shard has to coordinate with the client
145
+ and instead of running BFT, RSCoin uses a simple two-phase commit, assuming that
146
+ safety is preserved if the majority of validators is honest. This
147
+
148
+
149
+ TABLE IV: Comparison of Distributed Ledger Systems
150
+
151
+
152
+ \begin{center}
153
+
154
+ \begin{tabular}{ccccccc}
155
+
156
+ \hline
157
+
158
+ System & Scale-Out & \begin{tabular}{c}
159
+
160
+ Cross-Shard \\
161
+
162
+ Transaction Atomicity \\
163
+
164
+ \end{tabular} & State Blocks & \begin{tabular}{c}
165
+
166
+ Measured Scalability \\
167
+
168
+ (\# of Validators) \\
169
+
170
+ \end{tabular} & \begin{tabular}{c}
171
+
172
+ Estimated \\
173
+
174
+ Time to Fail \\
175
+
176
+ \end{tabular} & \begin{tabular}{c}
177
+
178
+ Measured \\
179
+
180
+ Latency \\
181
+
182
+ \end{tabular} \\
183
+
184
+ \hline
185
+
186
+ RSCoin [16] & In Permissioned & Partial & No & 30 & N/A & 1 sec \\
187
+
188
+ Elastico [34] & In PoW & No & No & 1600 & 1 hour & 800 sec \\
189
+
190
+ ByzCoin [32] & No & N/A & No & 1008 & 19 years & 40 sec \\
191
+
192
+ Bitcoin-NG [21] & No & N/A & No & 1000 & N/A & 600 sec \\
193
+
194
+ PBFT [9], [11] & No & N/A & No & 16 & N/A & 1 sec \\
195
+
196
+ Nakamoto [36] & No & N/A & No & 4000 & N/A & 600 sec \\
197
+
198
+ OmniLedger & Yes & Yes & Yes & 2400 & 68.5 years & 1.5 sec \\
199
+
200
+ \hline
201
+
202
+ \end{tabular}
203
+
204
+ \end{center}
205
+
206
+
207
+ approach, however, does not protect from double spending attempts by a malicious
208
+ client colluding with a validator.
209
+
210
+
211
+ In short, prior solutions [16], [32], [34] achieve only two out of the three desired
212
+ properties; decentralization, long-term security, and scale-out, as illustrated
213
+ in Figure 1. OmniLedger overcomes this issue by scaling out, as far as throughput
214
+ is concerned, and by maintaining consistency to the level required for safety,
215
+ without imposing a total order.
216
+
217
+
218
+ Bitcoin-NG scales Bitcoin without changing the consensus algorithm by observing
219
+ that the PoW process does not have to be the same as the transaction validation
220
+ process; this results in two separate timelines: one slow for PoW and one fast
221
+ for transaction validation. Although Bitcoin-NG significantly increases the throughput
222
+ of Bitcoin, it is still susceptible to the same attacks as Bitcoin [24], [3].
223
+
224
+
225
+ Other efforts to scale blockchains include: Tendermint [9], a protocol similar
226
+ to PBFT for shard-level consensus that does not scale due to its similarities
227
+ to PBFT, and the Lightning Network [40], an off-chain payment protocol for Bitcoin
228
+ (also compatible to OmniLedger); it limits the amount of information committed
229
+ to the blockchain.'
230
+ - "Datatype: lecture_note, Title: Lecture 4: Peer to Peer Networking for Blockchains\n\
231
+ \nHow does broadcast take only $O(\\log N)$ steps? We first need to understand\
232
+ \ the gossip-flooding-based broadcast protocol. The flooding protocol mimics the\
233
+ \ spread of an epidemic. Once a node is ``infected\", it infects its peers and\
234
+ \ forever stay's infected. It is easy to see that the spread of information will\
235
+ \ happen exponentially; hence the information will take $O(\\log N)$ hops to spread\
236
+ \ to all nodes. To formally understand the spread, we note that $d$-regular graphs\
237
+ \ with $d\\geq 3$ are an \\textit{expander graph} for large sizes ($|V|$) with\
238
+ \ high probability. An expander graph is a connected but sparse graph ($|E|=O(|V|)$)\
239
+ \ with the following property: $|\\partial A| \\geq \\epsilon|A|$ for any connected\
240
+ \ sub-graph $A$ with $|A|<0.5|V|$. Here, $|\\partial A|$ refers to the number\
241
+ \ of vertices outside $A$ with at least one neighbor in $A$. A gossip message\
242
+ \ originates with $A(0)$ as the broadcasting node with $|A(0)|=1$, in the next\
243
+ \ hop, it will spread to $\\partial A(0)$ with $|A(1)|\\geq (1+\\epsilon)|A(0)|$.\
244
+ \ This recursion continues and we have $|A(k)|\\geq(1+\\epsilon)^kA(0)$. Thus,\
245
+ \ the number of steps to reach half the number of nodes is logarithmic in the\
246
+ \ number of nodes. It can be shown that the other half of the nodes can also be\
247
+ \ covered in $O(\\log N)$ time.\n\n\n%Engineering issues (peer discovery, bootstrap,\
248
+ \ churn). Implementation connections (to the lab experiment). Validation of tx,\
249
+ \ blocks. How does that impact networking? What about skipping validation and\
250
+ \ doing cut-through routing? Compact blocks. (RR)\n\n\\section*{Bitcoin P2P network:\
251
+ \ A systems view}\nIn Bitcoin, peers connect to each other and communicate using\
252
+ \ the TCP protocol. The codebase allows for eight outgoing connections and up\
253
+ \ to 117 incoming connections. The network has a high churn rate (rate at which\
254
+ \ users enter/leave the system); hence, the node must be ready to connect to new\
255
+ \ peers. Moreover, to ensure that the peers we are connecting to are chosen randomly,\
256
+ \ the node keeps a large list of nodes running Bitcoin in the form of their (IP,\
257
+ \ port) tuple and establishes a connection to one of them randomly when a slot\
258
+ \ opens up. \n\nHow does a node bootstrap its list of peers? This happens by\
259
+ \ connecting to a set of DNS seed nodes. The seed nodes are not heavily decentralized;\
260
+ \ hence completely relying on the peer list provided by them is not advisable.\
261
+ \ On connecting to the initial set of peers, a node asks its neighbors for their\
262
+ \ peer list using {\\tt getAddr} and {\\tt Addr} messages. The node keeps refreshing\
263
+ \ its peer list regularly by exchanging peer lists with its peers. \n\nTransmission\
264
+ \ of all block and transactions happen through the inventory message {\\tt inv},\
265
+ \ on receiving an {\\tt inv} message the node checks if it has the block or the\
266
+ \ transaction in its local storage. If not, it sends the {\\tt getData} message\
267
+ \ to fetch those blocks and transactions from the peer. Since block sizes are\
268
+ \ relatively large, block transmission can optionally happen in 2 stages. On receiving\
269
+ \ the {\\tt inv} message, the node may ask for headers first using {\\tt getHeaders}\
270
+ \ and ask for complete blocks only if a header chain is established. This header-first\
271
+ \ block transmission increases queries but can decrease the net bandwidth usage.\
272
+ \ It may also prevent nodes from accepting PoW invalid blocks since the node can\
273
+ \ check from the header whether PoW is valid. \n\nWe saw in the previous lecture\
274
+ \ that some nodes might be malicious. A question that may arise is: what stops\
275
+ \ malicious nodes from flooding the network with invalid blocks and transactions\
276
+ \ (i.e., with invalid PoW and/or signatures)? Such flooding will saturate the\
277
+ \ network and increase transmission delay to unacceptable levels. Such an attack\
278
+ \ is prevented by a simple design decision, forward message to peers only after\
279
+ \ validating the message; i.e., a node sends an {\\tt inv} block message to its\
280
+ \ peers only after validating the block. If the adversary creates an invalid block,\
281
+ \ the block will not be propagated beyond one honest node. Additionally, nodes\
282
+ \ maintain their peers' reputation using some predefined heuristics; if a peer\
283
+ \ misbehaves (say by sending a transaction with invalid signatures), its reputation\
284
+ \ is downgraded and after a certain lower threshold is disconnected."
285
+ - source_sentence: How does the blockchain protocol ensure that all honest players
286
+ converge on the same chain?
287
+ sentences:
288
+ - "paper-title: Blockchain CAP Theorem Allows User-Dependent Adaptivity and Finality\n\
289
+ \nDefinition 3 (Potential starting value for period $p$ ). A value $v$ that has\
290
+ \ been next-voted by $t+1$ honest nodes for period $p-1$.\n\nDefinition 4 (Committed\
291
+ \ value for period $p$ ). A value $v$ that has been cert-voted by $2 t+1$ nodes\
292
+ \ for period $p$.\n\nDefinition 5 (Potentially committed value for period $p$\
293
+ \ ). A value $v$ that has been cert-voted by $t+1$ honest nodes for period $p$.\n\
294
+ \nAlthough we slightly altered Algorand BA protocol (which is highlighted in red\
295
+ \ in Appendix A), we note that our modification does not break the safety of the\
296
+ \ protocol or cause any deadlock in Lemma 1 and Lemma 2, At a high level, the\
297
+ \ validity check only causes less soft-votes from honest nodes, which is indistinguishable\
298
+ \ with the case where the leader is malicious and no value receives at least $2\
299
+ \ t+1$ soft-votes in some period. Therefore, the safety and deadlock-free property\
300
+ \ remain.\n\nLemma 1 (Asynchronous Safety, CP0). Even when the network is partitioned,\
301
+ \ the protocol ensures safety of the system so that no two honest nodes will finish\
302
+ \ one iteration of the protocol with different outputs.\n\nProof. The following\
303
+ \ properties hold even during a network partition.\n\n\\begin{itemize}\n \\item\
304
+ \ By quorum intersection, as each honest node only soft-votes one value, then\
305
+ \ at most one value is committed or potentially committed for each period $p$\
306
+ \ in one iteration.\n \\item If a value $v$ is potentially committed for period\
307
+ \ $p$, then only $v$ can receive $2 t+1$ next-votes for period $p$. Thus, the\
308
+ \ unique potential starting value for period $p+1$ is $v$.\n \\item If a period\
309
+ \ $p$ has a unique potential starting value $v \\neq \\perp$, then only $v$ can\
310
+ \ be committed for period $p$. Moreover, honest nodes will only next-vote $v$\
311
+ \ for period $p$, so the unique potential starting value for period $p+1$ is also\
312
+ \ $v$. Inductively, any future periods $p^{\\prime}>p$ can only have $v$ as a\
313
+ \ potential starting value. Thus, once a value is potentially committed, it becomes\
314
+ \ the unique value that can be committed or potentially committed for any future\
315
+ \ period, and no two honest nodes will finish this iteration of the protocol with\
316
+ \ different outputs.\n\\end{itemize}\n\nLemma 2 (Asynchronous Deadlock-freedom).\
317
+ \ As long as messages will be delivered eventually, an honest node can always\
318
+ \ leave period p, either by entering a higher period or meeting the halting condition\
319
+ \ for the current iteration.\n\nProof. We first prove that there can never exist\
320
+ \ $2 t+1$ next-votes for two different non- $\\perp$ values from the same period\
321
+ \ $p$ by induction.\n\nStart with $p=1$. Note that every honest node sets $s t_{i}^{1}=\\\
322
+ perp$ and at most one value (say $v$ ) could receive more than $2 t+1$ soft-votes.\
323
+ \ Therefore only value $v$ and $\\perp$ could potentially receive more than $2\
324
+ \ t+1$ next-votes in period 1 . Note that it is possible that both $v$ and $\\\
325
+ perp$ receive more than $2 t+1$ next-votes: all the honest nodes could next-vote\
326
+ \ for $\\perp$ in Step 4 and then next-vote for $v$ in Step 5 after seeing the\
327
+ \ $2 t+1$ soft-votes for $v$.\n\nAssume that the claim holds for period $p-1(p\
328
+ \ \\geq 2)$ : there exist at most two values each of which has $2 t+1$ next-votes\
329
+ \ for period $p-1$, and one of them is necessarily $\\perp$. Then there are three\
330
+ \ possible cases:"
331
+ - 'paper-title: A Scalable Proof-of-Stake Blockchain in the Open Setting * \\ (or,
332
+ How to Mimic Nakamoto''s Design via Proof-of-Stake)
333
+
334
+
335
+ Common prefix. Our analysis is based on the common prefix analysis of core-chain.
336
+ The core-chain can achieve common prefix as we discussed. The opportunity for
337
+ malicious players to destroy common prefix probability is to generate different
338
+ blockchain for the same core-chain. For the malicious players can sign different
339
+ blocks for one block-core, this will allow him to fork the blockchain. So the
340
+ malicious players can fork the blockchain when they are chosen to generate block.
341
+ However, with the property of hash function, the malicious players can not generate
342
+ two blocks with same hash value. When an honest player is chosen to extend a block,
343
+ he will only support one blockchain. Then all of the honest players will converge
344
+ on one blockchain.\\
345
+
346
+ Corollary 6.4 (Common prefix). Consider the blockchain protocol $\Pi^{\text {main
347
+ }}$. Consider $\alpha^{\star}=\lambda \beta^{\star}$, $\lambda>1$, and $\delta>0$.
348
+ Consider two honest PoS-players, P in round $r$ and $\mathrm{P}^{\prime}$ in round
349
+ $r^{\prime}$, with the local best PoS blockchains $\tilde{\mathcal{C}}, \tilde{\mathcal{C}}^{\prime}$,
350
+ respectively, where $r^{\prime} \geq r$. Then we have $\operatorname{Pr}\left[\tilde{\mathcal{C}}[1,
351
+ \ell] \preceq \tilde{\mathcal{C}}^{\prime}\right] \geq 1-e^{-\Omega(\kappa)}$,
352
+ where $\ell=\operatorname{len}(\mathcal{C})-\Theta(\kappa)$.
353
+
354
+
355
+ Proof. As we discussed, $\tilde{\mathcal{C}}$ and $\tilde{\mathcal{C}}^{\prime}$
356
+ are associated with core-chains $\mathcal{C}$ and $\mathcal{C}^{\prime}$ respectively.
357
+ From Corollary 5.6 we know that $\operatorname{Pr}\left[\mathcal{C}[1, \ell] \preceq
358
+ \mathcal{C}^{\prime}\right] \geq 1-e^{-\Omega(\kappa)}$.
359
+
360
+
361
+ Based on the assumption that $\alpha^{\star}=\lambda \beta^{\star}$ and $\lambda>1$,
362
+ we can have that the malicious players are not able to generate more than $\Theta(\kappa)$
363
+ blocks before an honest player is chosen to generate block with high probability.
364
+ All of the honest players will converge on the same chain. Put them together,
365
+ we have $\operatorname{Pr}\left[\tilde{\mathcal{C}}[1, \ell] \preceq \tilde{\mathcal{C}}^{\prime}\right]
366
+ \geq 1-e^{-\Omega(\kappa)}$ where $\ell=\operatorname{len}(\mathcal{C})-\Theta(\kappa)$.
367
+
368
+
369
+ Chain soundness. A new player will accept a blockchain (in which the corresponding
370
+ corechain is included). The proof idea for achieving chain soundness property
371
+ of our blockchain protocol directly follows that for the core-chain protocol.
372
+ We have the following statement.\\
373
+
374
+ Corollary 6.5 (Chain soundness). Consider the blockchain protocol $\Pi^{\text
375
+ {main }}$. Consider for every round, $\alpha=\lambda \beta, \lambda>1$, and $\delta>0$.
376
+ There are two honest PoS-players, $\mathrm{P}^{\prime}$ and $\mathrm{P}^{\prime
377
+ \prime}$ in round $r$, with the local best PoS blockchains $\tilde{\mathcal{C}}^{\prime}$
378
+ and $\tilde{\mathcal{C}}^{\prime \prime}$, respectively. Let $\mathrm{P}^{\prime}$
379
+ be a new player and $\mathrm{P}^{\prime \prime}$ be an existing player in round
380
+ $r$. Then we have $\tilde{\mathcal{C}}^{\prime}[\neg \kappa] \preceq \tilde{\mathcal{C}}^{\prime
381
+ \prime}$ and $\tilde{\mathcal{C}}^{\prime \prime}[\neg \kappa] \preceq \tilde{\mathcal{C}}^{\prime}$.'
382
+ - "Datatype: lecture_note, Title: Lecture 9: Scaling Latency\n\n\\begin{figure}\n\
383
+ \\begin{center}\n\\includegraphics[width=\\textwidth]{figures/Prism_main.pdf}\n\
384
+ \\end{center}\n\n\\caption{Factorizing the blocks into three types of blocks:\
385
+ \ proposer blocks, transaction blocks and voter blocks.}\n\\label{fig:prism}\n\
386
+ \n\\end{figure}\n\nJust as in {\\sf Prism 1.0}, the \\textit{proposer} blocktree\
387
+ \ in {\\sf Prism} anchors the blockchain. Each proposer block contains a list\
388
+ \ of reference links to \\textit{transaction} blocks that contain transactions,\
389
+ \ as well as a single reference to a parent proposer block. Honest nodes mine\
390
+ \ proposer blocks following the longest chain rule in the proposer tree.\nWe define\
391
+ \ the *level* of a proposer block as its distance from the genesis proposer block,\
392
+ \ and the *height* of the proposer tree as the maximum level that contains any\
393
+ \ proposer blocks. To determine the ordering of proposer blocks (and thus transaction\
394
+ \ blocks and transactions), we elect one \\textit{leader} proposer block from\
395
+ \ each level. The sequence of leader blocks up to the height of the proposer tree\
396
+ \ is called the \\textit{leader sequence}, and is determined by the *voter* chains.\
397
+ \ Note that the leader blocks do not need to follow the chain structure of the\
398
+ \ proposer blocks because otherwise deadlock may occur if conflicting blocks (i.e.,\
399
+ \ two proposer blocks not on one chain) are determined as leader blocks. \n\n\
400
+ In {\\sf Prism}, there are $m$ voter chains, where $m \\gg 1$ is a fixed parameter\
401
+ \ chosen by the system designer. The larger the $m$, the more parallel the voting\
402
+ \ process and hence the shorter the latency of confirmation. In general $m$ is\
403
+ \ chosen as large as network bandwidth and memory management issues are manageable.\
404
+ \ For example, $m=1000$ is chosen in the \\href{https://arxiv.org/pdf/1909.11261.pdf}{full-stack\
405
+ \ implementation} of Prism. New voter blocks are mined on each voter chain according\
406
+ \ to the longest chain rule. A voter block votes for a proposer block by containing\
407
+ \ a reference link to that proposer block, with the requirements that: (1) a vote\
408
+ \ is valid only if the voter block is in the longest chain of its voter tree;\
409
+ \ (2) each voter chain votes for one and only one proposer block at each level;\
410
+ \ (3) each voter block votes for all the proposer levels that have not been voted\
411
+ \ by its parent. The leader block at each level is the one that has the largest\
412
+ \ number of votes among all the proposer blocks at the same level (ties can be\
413
+ \ broken by the hash of the proposer blocks). The elected leader blocks then provide\
414
+ \ a unique ordering of the transaction blocks to form the final ledger. \n\n{\\\
415
+ sf Prism} also uses cryptographic sortition to prevent the adversary from focusing\
416
+ \ its mining power on a specific type of blocks or on a specific voter chain.\
417
+ \ A miner first forms a ``superblock\" containing $m+2$ parts: a transaction block,\
418
+ \ a proposer block and a voter block on the $i$-th voter tree ($1\\leq i \\leq\
419
+ \ m$). We say a superblock is successfully mined if \n\\begin{equation}\n \
420
+ \ Hash({\\sf nonce}, {\\sf superblock}) < T_{\\rm tx} + T_{\\rm prop} + m T_{\\\
421
+ rm v}. \n\\label{eq:sortition}\n\\end{equation}\nFurther, every successfully mined\
422
+ \ superblock is identified as a transaction block, a proposer block or a voter\
423
+ \ block based on the hash output: \n\n\n* identify the superblock as a proposer\
424
+ \ block if the hash output is less than $T_{\\rm prop}$;\n* identify the superblock\
425
+ \ as a transaction block if the hash output is in the range $[T_{\\rm prop}, T_{\\\
426
+ rm tx} + T_{\\rm prop})$;\n* identify the superblock as a voter block on the\
427
+ \ $i$-th voter tree ($1\\leq i \\leq m$) if the hash output is in the range $[T_{\\\
428
+ rm tx} + T_{\\rm prop} + (i-1) T_{\\rm v}, T_{\\rm tx} + T_{\\rm prop} + i T_{\\\
429
+ rm v} )$;"
430
+ - source_sentence: What is the role of the 2/3-GHOST function in the GRANDPA finality
431
+ gadget?
432
+ sentences:
433
+ - 'paper-title: GRANDPA: a Byzantine Finality Gadget
434
+
435
+
436
+ \subsection*{2.3 Preliminaries}
437
+
438
+ Network model : We will be using the partially synchronous network model introduced
439
+ by 7] and in particular the gossip network variant used in [5]. We assume that
440
+ any message sent or received by an honest participant reaches all honest participants
441
+ within time $T$, but possibly only after some Global Synchronisation Time GST.
442
+ Concretely, any message sent or received by some honest participant at time $t$
443
+ is received by all honest participants by time GST $+T$ at the latest.
444
+
445
+
446
+ Voters: For each voting step, there is a set of $n$ voters. We will frequently
447
+ need to assume that for each such step, at most $f<n / 3$ voters are Byzantine.
448
+ We need $n-f$ of voters to agree on finality. Whether or not block producers ever
449
+ vote, they will need to be participants who track the state of the protocol.
450
+
451
+
452
+ Votes: A vote is a block hash, together with some metadata such as round number
453
+ and the type of vote, such as prevote or precommit, all signed with a voter''s
454
+ private key.
455
+
456
+
457
+ Rounds: Each participant has their own idea of what is the current round number.
458
+ Every prevote and precommit has an associated round number. Honest voters only
459
+ vote once (for each type of vote) in each round and do not vote in earlier rounds
460
+ after later ones. Participants need to keep track of which block they see as currently
461
+ being the latest finalised block and an estimate of which block could have been
462
+ finalised in the last round.
463
+
464
+
465
+ For block $B$, we write chain $(B)$ for the chain whose head is $B$. The block
466
+ number, $n(B)$ of a block $B$ is the length of chain $(B)$. For blocks $B^{\prime}$
467
+ and $B$, we say $B$ is later than $B^{\prime}$ if it has a higher block number.
468
+ We write $B>B^{\prime}$ or that $B$ is descendant of $B^{\prime}$ for $B, B^{\prime}$
469
+ appearing in the same blockchain with $B$ later, i.e. $B^{\prime} \in$
470
+ chain $(B)$ with $n(B)>n\left(B^{\prime}\right)$. $B \geq B^{\prime}$ and $B \leq
471
+ B^{\prime}$ are similar except allowing $B=B^{\prime}$. We write $B \sim B^{\prime}$
472
+ or $B$ and $B^{\prime}$ are on the same chain if $B<B^{\prime}, B=B^{\prime}$
473
+ or $B>B^{\prime}$; and $B \nsim B^{\prime}$ or $B$ and $B^{\prime}$ are not on
474
+ the same chain if there is no such chain.
475
+
476
+
477
+ Blocks are ordered as a tree with the genesis block as root. So any two blocks
478
+ have a common ancestor but two blocks not on the same chain do not have a common
479
+ descendant. A vote $v$ for a block $B$ by a voter $V$ is a message signed by $V$
480
+ containing the blockhash of $B$ and meta-information like the round numbers and
481
+ the type of vote.
482
+
483
+
484
+ A voter equivocates in a set of votes $S$ if they have cast multiple different
485
+ votes in $S$. We call a set $S$ of votes safe if the number of voters who equivocate
486
+ in $S$ is at most $f$. We say that $S$ has a supermajority for a block $B$ if
487
+ the set of voters who either have a vote for blocks $\geq B$ or equivocate in
488
+ $S$ has size at least $(n+f+1) / 2$. We count equivocations as votes for everything
489
+ so that observing a vote is monotonic, meaning that if $S \subset T$ then if $S$
490
+ has a supermajority for $B$ so does $T$, while being able to ignore yet more equivocating
491
+ votes from an equivocating voter.
492
+
493
+
494
+ For our finality gadget (GRANDPA) we use the GHOST [13] eventual consensus algorithm
495
+ as $F$. The 2/3-GHOST function $g(S)$ takes a set $S$ of votes and returns the
496
+ block $B$ with highest block number such that $S$ has a supermajority for $B$.
497
+ If there is no such block, then it returns ''nil''. Note that, if $S$ is safe,
498
+ then we can compute $g(S)$ by starting at the genesis block and iteratively looking
499
+ for a child of our current block with a supermajority, which must be unique if
500
+ it exists. Thus we have:
501
+
502
+
503
+ Lemma 2.5. Let $T$ be a safe set of votes. Then'
504
+ - 'paper-title: Zexe: Enabling Decentralized Private Computation
505
+
506
+
507
+ In sum, proofs of predicates'' satisfiability are produced via a SNARK over $E_{\mathrm{BLS}}$,
508
+ and proofs for the NP relation $\mathcal{R}_{\mathrm{e}}$ are produced
509
+ via a zkSNARK over $E_{\mathrm{CP}}$. The matching fields between the two curves
510
+ ensure that the former proofs can be efficiently verified.
511
+
512
+
513
+ Problem 3: Cocks-Pinch curves are costly. While the curve $E_{\mathrm{CP}}$ was
514
+ chosen to facilitate efficient checking of proofs over $E_{\mathrm{BLS}}$, the
515
+ curve $E_{\mathrm{CP}}$ is at least $2 \times$ more expensive (in time and space)
516
+ than $E_{\mathrm{BLS}}$ simply because $E_{\mathrm{CP}}$''s base field has about
516
+ twice as many bits as $E_{\mathrm{BLS}}$''s base field. Checks in the NP relation
518
+ $\mathcal{R}_{\mathrm{e}}$
519
+
520
+ that are not directly related to proof checking are now unnecessarily carried
521
+ over a less efficient curve.\\
522
+
523
+ Solution 3: split relations across two curves. We split $\mathcal{R}_{\mathrm{e}}$
524
+ into two NP relations $\mathcal{R}_{\mathrm{BLS}}$ and $\mathcal{R}_{\mathrm{CP}}$
525
+ (see Fig. 14), with the latter containing just the proof check and the former
526
+ containing all other checks. We can then use a zkSNARK over the curve $E_{\mathrm{BLS}}$
527
+ (an efficient curve) to produce proofs for $\mathcal{R}_{\mathrm{BLS}}$,
528
+ and a zkSNARK over $E_{\mathrm{CP}}$ (the less efficient curve) to produce proofs
529
+ for $\mathcal{R}_{\mathrm{CP}}$. This approach significantly reduces the running
530
+ time of DPC.Execute (producing proofs for the checks in $\mathcal{R}_{\mathrm{BLS}}$
531
+ is more efficient over $E_{\mathrm{BLS}}$ than over $E_{\mathrm{CP}}$), at the
532
+ expense of a modest increase in transaction size (a transaction now includes a
533
+ zkSNARK proof over $E_{\mathrm{BLS}}$ in addition to a proof over $E_{\mathrm{CP}}$).
534
+ An important technicality that must be addressed is that the foregoing split
535
+ relies on certain secret information to be shared across the NP relations, namely,
536
+ the identities of relevant predicates and the local data. We can store this information
537
+ in suitable commitments that are part of the NP instances for the two NP relations
538
+ (doing this efficiently requires some care as we discuss below).'
539
+ - 'paper-title: Ouroboros Praos: An adaptively-secure, semi-synchronous proof-of-stake
540
+ blockchain
541
+
542
+
543
+ where $\alpha_{\mathcal{H}}$ denotes the total relative stake of the honest parties.
544
+ Note that this bound applies to all static adversaries $\mathcal{A}$ that corrupt
545
+ no more than a $1-\alpha_{\mathcal{H}}$ fraction of all stake. With this in mind,
546
+ we define the dominant distribution as follows.\\
547
+
548
+ Definition 13 (The dominant distribution $\mathcal{D}_{\alpha}^{f}$ ). For two
549
+ parameters $f$ and $\alpha$, define $\mathcal{D}_{\alpha}^{f}$ to be the distribution
550
+ on strings $w \in\{0,1, \perp\}^{R}$ that independently assigns each $w_{i}$ so
551
+ that
552
+
553
+
554
+
555
+ \begin{align*}
556
+
557
+ p_{\perp} \triangleq \operatorname{Pr}\left[w_{i}\right. & =\perp]=1-f, \\
558
+
559
+ p_{0} \triangleq \operatorname{Pr}\left[w_{i}\right. & =0]=\phi(\alpha) \cdot(1-f),
560
+ \quad \text { and } \tag{9}\\
561
+
562
+ p_{1} \triangleq \operatorname{Pr}\left[w_{i}\right. & =1]=1-p_{\perp}-p_{0} .
563
+
564
+ \end{align*}
565
+
566
+
567
+
568
+ The distribution $\mathcal{D}_{\alpha}^{f}$ "dominates" $\mathcal{D}_{\mathcal{Z},
569
+ \mathcal{A}}^{f}$ for any static adversary $\mathcal{A}$ that corrupts no more
570
+ than a relative $1-\alpha$ share of the total stake, in the sense that nonempty
571
+ slots are more likely to be tainted under $\mathcal{D}_{\alpha}^{f}$ than they
572
+ are under $\mathcal{D}_{\mathcal{Z}, \mathcal{A}}^{f}$.
573
+
574
+
575
+ To make this relationship precise, we introduce the partial order $\preceq$ on
576
+ the set $\{\perp, 0,1\}$ so that $x \preceq y$ if and only if $x=y$ or $y=1$.
577
+ We extend this partial order to $\{\perp, 0,1\}^{R}$ by declaring $x_{1} \ldots
578
+ x_{R} \preceq y_{1} \ldots y_{R}$ if and only if $x_{i} \preceq y_{i}$ for each
579
+ $i$. Intuitively, the relationship $x \prec y$ asserts that $y$ is "more adversarial
580
+ than" $x$; concretely, any legal fork for $x$ is also a legal fork for $y$. Finally,
581
+ we define a notion of stochastic dominance for distributions on characteristic
582
+ strings, and $\alpha$-dominated adversaries.
583
+
584
+
585
+ Definition 14 (Stochastic dominance). We say that a subset $E \subseteq\{\perp,
586
+ 0,1\}^{R}$ is monotone if $x \in E$ and $x \preceq y$ implies that $y \in E$.
587
+ Let $\mathcal{D}$ and $\mathcal{D}^{\prime}$ be two distributions on the set of
588
+ characteristic strings $\{\perp, 0,1\}^{R}$. Then we say that $\mathcal{D}^{\prime}$
589
+ dominates $\mathcal{D}$, written $\mathcal{D} \preceq \mathcal{D}^{\prime}$, if
590
+ $\operatorname{Pr}{ }_{\mathcal{D}}[E] \leq \operatorname{Pr}_{\mathcal{D}^{\prime}}[E]$
591
+ for every monotone set $E$. An adversary $\mathcal{A}$ is called $\alpha$-dominated
592
+ if the distribution $\mathcal{D}_{\mathcal{Z}, \mathcal{A}}^{f}$ that it induces
593
+ on the set of characteristic strings satisfies $\mathcal{D}_{\mathcal{Z}, \mathcal{A}}^{f}
594
+ \preceq \mathcal{D}_{\alpha}^{f}$.
595
+
596
+
597
+ As noted above, this notion of stochastic dominance is consistent with the chain-theoretic
598
+ definitions of interest, in the sense that failures of the abstract chain properties
599
+ form monotone events. We record this in the lemma below.'
600
+ - source_sentence: What does the paper conclude about the relationship between latency
601
+ and security in the Nakamoto Consensus protocol?
602
+ sentences:
603
+ - 'paper-title: Close Latency-Security Trade-off for the Nakamoto Consensus
604
+
605
+
606
+ Evidently, if the infinite sums in (2) and (10) are replaced by partial sums for
607
+ numerical evaluation, the resulting (tighter) security level remains unachievable.
608
+
609
+
610
+ \subsection*{3.1 Remarks}
611
+
612
+ Theorems 3.5 and 3.6 assume the delay $\Delta>0$. The bounds therein still apply
613
+ if we set $\Delta=0$, but are slightly looser than the bounds in Theorems 3.3
614
+ and 3.4 for the zero-delay case.
615
+
616
+
617
+ It is important to include the time of interest $s$ in Definitions 3.1 and 3.2.
618
+ The "bad events" for security breach depend on $s$ as well as the latency $t$.
619
+ These well-defined events are concerned with block mining times, not how blocks
620
+ form blockchains. ${ }^{3}$
621
+
622
+
623
+ We note that a number of previous analyses on the Nakamoto consensus assume a
624
+ finite lifespan of the protocol [1, 10], that is, a maximum round number is defined,
625
+ at which round the protocol terminates. The probability of consistency depends
626
+ on the maximum round number. In contrast, this paper does not assume a finite
627
+ lifespan. Theorem 3.5 states that, barring a small probability event, confirmed
628
+ blocks remain permanently in all miners'' longest blockchains into the arbitrary
629
+ future.
630
+
631
+
632
+ Even though we provide the same security guarantee for every blockchain after
633
+ the confirmation latency $t$, no one can simultaneously guarantee the same for
634
+ all blocks that will ever be confirmed.
635
+
636
+
637
+ \footnotetext{${ }^{3}$ To be rigorous, we do not make claims such as "the blockchain/protocol/system
638
+ satisfies consistency or liveness properties with probability ..." because those
639
+ properties themselves are not events in the probability space defined here.
640
+
641
+ }
642
+
643
+ \includegraphics[max width=\textwidth, center]{2025_01_02_447c9a776bd74bcc1f99g-04}
644
+
645
+
646
+ Figure 1: Bitcoin''s latency-security trade-off with $\alpha+\beta=$ $1 / 600$
647
+ blocks per second and $\Delta=10$ seconds.
648
+
649
+
650
+ This is a simple consequence of Murphy''s Law: If an adversary keeps trying new
651
+ episodes of attacks, with probability 1 a bad event will eventually occur to revert
652
+ some confirmed honest blocks.
653
+
654
+
655
+ For technical convenience, we regard a block in a miner''s longest blockchain
656
+ to be confirmed after a certain amount of time elapses since the block is mined
657
+ or enters the miner''s view. Nakamoto [22] originally proposed confirming a block
658
+ after it is sufficiently deep in an honest miner''s longest blockchain. We believe
659
+ both confirmation rules are easy to use in practice. And the two confirmation
660
+ rules imply each other in probability (see Appendix A for further discussion).
661
+
662
+
663
+ \subsection*{3.2 Numerical Examples}
664
+
665
+ The latency-security trade-off under several different sets of parameters is plotted
666
+ in Figure 1. The mining rate is set to Bitcoin''s one block per 600 seconds, or
667
+ $\alpha+\beta=1 / 600$ blocks/second. The propagation delay bound is assumed to
668
+ be $\Delta=10$ seconds. The latency upper and lower bounds are computed using
669
+ Theorems 3.5 and 3.6, respectively. In Figure 1, all bounds appear to be exponential
670
+ for all but very small latency and high error probabilities. This implies the
671
+ exponential bound (7) is a good approximation of (5) in Theorem 3.5 for the typical
672
+ range of parameters of interest here.
673
+
674
+
675
+ It is instructive to examine concrete data points in Figure 1: If the adversarial
676
+ share of the total network mining rate is $10 \%$ $(\alpha: \beta=9: 1)$, then
677
+ a confirmation time of four hours is sufficient to achieve $10^{-3}$ security
678
+ level, and a ten-hour confirmation achieves $10^{-9}$ security level. These results
679
+ are about two hours away from the corresponding lower bounds. Also, for every
680
+ additional hour of latency, the security improves by a factor of approximately
681
+ 20. If the adversarial share of the mining rate increases to $25 \%$ $(\alpha: \beta=3:
682
+ 1)$, then 10 hours 40 minutes and 28 hours 45 minutes of confirmation times achieve
683
+ $10^{-3}$ and $10^{-9}$ security levels, respectively, and the gap between the
684
+ upper and lower bounds is between five and seven hours. In general, the gap is
685
+ proportionally insignificant at high security levels but can be otherwise at low
686
+ security levels. For given mining rates, the gaps are similar at different security
687
+ levels. This indicates the lower bound (10) is also approximately exponential
688
+ with a slightly steeper exponent than that of the upper bound.'
689
+ - "paper-title: Ledger Combiners for Fast Settlement\n\n$$\n\\begin{aligned}\n\\\
690
+ delta\\left(\\operatorname{PoW}_{p}^{m}(x), \\mathrm{IPoW}_{p / m}^{m}(x)\\right)\
691
+ \ & =\\frac{1}{2} \\sum_{s \\in\\{0,1\\}^{m}}\\left|\\operatorname{Pr}\\left[\\\
692
+ operatorname{PoW}_{p}^{m}(x)=s\\right]-\\operatorname{Pr}\\left[\\operatorname{IPoW}_{p\
693
+ \ / m}^{m}(x)=s\\right]\\right| \\\\\n& =\\sum_{\\substack{s \\in\\{0,1\\}^{m} \\\
694
+ \\\n\\mathrm{hw}(s)=1}}\\left(\\operatorname{Pr}\\left[\\operatorname{PoW}_{p}^{m}(x)=s\\\
695
+ right]-\\operatorname{Pr}\\left[\\operatorname{IPoW}_{p / m}^{m}(x)=s\\right]\\\
696
+ right) \\\\\n& \\leq m \\cdot\\left[\\frac{p}{m}-\\frac{p}{m}\\left(1-\\frac{p}{m}\\\
697
+ right)^{m-1}\\right] \\leq p[1-(1-p)]=p^{2}\n\\end{aligned}\n$$\n\nas desired,\
698
+ \ where the last inequality follows by Bernoulli inequality.\n\nThe above lemma\
699
+ \ already justifies the use of $\\mathrm{PoW}_{p}^{m}$ for achieving subindependence\
700
+ \ in practical scenarios. To observe this, note that the use of $\\mathrm{IPoW}_{p\
701
+ \ / m}^{m}$ would lead to full independence of the individual PoW lotteries, and\
702
+ \ by Lemma 7 the real execution with $\\mathrm{PoW}_{p}^{m}$ will only differ\
703
+ \ from this ideal behavior with probability at most $Q \\cdot p^{2}$, where $Q$\
704
+ \ is the total number of PoW-queries. With current values of $p \\approx 10^{-22}$\
705
+ \ in e.g., Bitcoin ${ }^{2}$, and the block creation time adjusting to 10 minutes,\
706
+ \ this difference would manifest in expectation in about $10^{18}$ years. Note\
707
+ \ that any future increase of the total mining difficulty while maintaining the\
708
+ \ block creation time would only increase this period.\n\nNonetheless, in Appendix\
709
+ \ F we give a more detailed analysis of $\\mathrm{PoW}_{p}^{m}$ that shows that,\
710
+ \ loosely speaking, $m$ parallel executions of Bitcoin using $\\mathrm{PoW}_{p}^{m}$\
711
+ \ as their shared PoW oracle achieve $\\varepsilon$-subindependence for $\\varepsilon$\
712
+ \ negligible in the security parameter.\n\n\\subsection*{4.2 Realizing Rank via\
713
+ \ Timestamped Blockchains}\nAn important consideration when deploying our virtual\
714
+ \ ledger construction over existing blockchains is how to realize the notion of\
715
+ \ rank. We note that typical Nakamoto-style PoS blockchains (e.g., the Ouroboros\
716
+ \ family, Snow White) assume a common notion of time among the participants and\
717
+ \ explicitly label blocks with slot numbers with a direct correspondence to absolute\
718
+ \ time. These slot numbers (or, preferably, a notion of common time associated\
719
+ \ with each slot number) directly afford a notion of rank that provides the desired\
720
+ \ persistence and liveness guarantees. To formalize this property, we introduce\
721
+ \ the notion of a timestamped blockchain.\n\nDefinition 11. A timestamped blockchain\
722
+ \ is one satisfying the following conventions:\n\n\\begin{itemize}\n \\item Block\
723
+ \ timestamps. Every block contains a declared timestamp.\n \\item Monotonicity.\
724
+ \ In order for a block to be considered valid, its timestamp can be no less than\
725
+ \ the timestamps of all prior blocks in the blockchain. (Thus valid blockchains\
726
+ \ consist of blocks in monotonically increasing order.)\n\\end{itemize}\n\nInformally,\
727
+ \ we say that an algorithm is a timestamped blockchain algorithm if it calls for\
728
+ \ participants to broadcast timestamped blockchains and to \"respect timestamps.\"\
729
+ \ More specifically, the algorithm satisfies the following:\n\n\\begin{itemize}\n\
730
+ \ \\item Faithful honest timestamping. Honest participants always post blocks\
731
+ \ with timestamps determined by their local clocks.\n \\item Ignore future blocks.\
732
+ \ Honest participants ignore blocks that contain a timestamp which is greater\
733
+ \ than their local time by more than a fixed constant. (These blocks might be\
734
+ \ considered later when the local clock of the participant \"catches up\" with\
735
+ \ the timestamp.)\n\\end{itemize}"
736
+ - "paper-title: A Scalable Proof-of-Stake Blockchain in the Open Setting * \\\\\
737
+ \ (or, How to Mimic Nakamoto's Design via Proof-of-Stake)\n\nLet $\\ell$ be the\
738
+ \ length of core-chain $\\mathcal{C}$. In our design, only the elected PoS-players\
739
+ \ are allowed to generate new block-cores (to extend the core-chain). Now, each\
740
+ \ registered PoS-player P will work on the right \"context\" which consists of\
741
+ \ the latest block-core in the longest corechain and the current time; formally\
742
+ \ context $:=\\left\\langle h^{\\text {prev }}\\right.$, round $\\rangle$ where\
743
+ \ $\\mathcal{C}[\\ell]$ is the latest blockcore in the longest core-chain $\\\
744
+ mathcal{C}$, and $h^{\\text {prev }}$ is the identity returned by the functionality\
745
+ \ $\\mathcal{F}_{\\text {rCERT }}$ for $\\mathcal{C}[\\ell]$, and round denotes\
746
+ \ the current time. The PoS-player P may query $\\mathcal{F}_{\\text {rCERT }}$\
747
+ \ by command (Elect, P , context, $\\mathcal{C}$ ) to see if he is selected to\
748
+ \ extend $\\mathcal{C}$. If the PoS-player P is selected (with certain probability\
749
+ \ $p$ ), he would receive a message (Elected, $\\mathrm{P}, h, \\sigma, \\mathrm{~b}$\
750
+ \ ) from $\\mathcal{F}_{\\text {rCERT }}$ such that $\\mathrm{b}=1$. Once receiving\
751
+ \ the signature $\\sigma$ from the functionality, the PoS-player P defines a new\
752
+ \ block-core $B:=\\left\\langle\\left\\langle h^{\\text {prev }}, h\\right.\\\
753
+ right.$, round $\\left.\\rangle, \\mathrm{P}, \\sigma\\right\\rangle$, updates\
754
+ \ his local core-chain $\\mathcal{C}$ and then broadcasts the local core-chain\
755
+ \ to the network. Please refer to Figure 3 for more details of our core-chain\
756
+ \ protocol.\n\nNote that here PoS-players have access to the functionality $\\\
757
+ mathcal{F}_{\\text {rCERT }}$. The players need to register to the functionality\
758
+ \ $\\mathcal{F}_{\\text {rCERT }}$ before querying the functionality.\n\nThe best\
759
+ \ core-chain strategy. Our proof-of-stake core-chain protocol $\\Pi^{\\text {core\
760
+ \ }}$ uses the subroutine BestCore to single out the best valid core-chain from\
761
+ \ a set of core-chains. Now we describe the rules of selecting the best core-chain.\
762
+ \ Roughly speaking, a core-chain is the best one if it is the current longest\
763
+ \ valid core-chain. The BestCore subroutine takes as input, a core-chain set $\\\
764
+ mathbb{C}^{\\prime}$ and the current time information round'. Intuitively, the\
765
+ \ subroutine validates all $\\mathcal{C} \\in \\mathbb{C}^{\\prime}$, then finds\
766
+ \ the valid longest core-chain.\n\nIn more detail, BestCore proceeds as follows.\
767
+ \ On input the current set of core-chains $\\mathbb{C}^{\\prime}$ and the current\
768
+ \ time information round', and for each core-chain $\\mathcal{C}$, the subroutine\
769
+ \ then evaluates every block-core of the core-chain $\\mathcal{C}$ sequentially.\
770
+ \ Let $\\ell$ be the length of $\\mathcal{C}$. Starting from the head of $\\mathcal{C}$,\
771
+ \ for every block-core $\\mathcal{C}[i]$, for all $i \\in[\\ell]$, in the core-chain\
772
+ \ $\\mathcal{C}$, the BestCore subroutine (1) ensures that $\\mathcal{C}[i]$ is\
773
+ \ linked to the previous block-core $\\mathcal{C}[i-1]$ correctly, and (2) tests\
774
+ \ if the\n\n\\section*{Protocol $\\Pi^{\\text {core }}$}\nInitially, a set $\\\
775
+ mathcal{P}_{0}$ of players are registered to the functionality $\\mathcal{F}_{\\\
776
+ text {rCERT }}$, where $\\mathcal{P}_{0} \\subseteq \\mathcal{P}$. Initially,\
777
+ \ for each $\\mathrm{P} \\in \\mathcal{P}$, set $\\mathcal{C}:=\\emptyset$, and\
778
+ \ state $:=\\emptyset$.\n\nUpon receiving message (Input-Stake, P ) from the environment\
779
+ \ $z$ at round round, the PoS-player $\\mathrm{P} \\in$ $\\mathcal{P}$, with local\
780
+ \ state state, proceeds as follows.\n\n\\begin{enumerate}\n \\item Select the\
781
+ \ best local PoS core-chain:\n\\end{enumerate}"
782
+ - source_sentence: What is the difference between absolute settlement and relative
783
+ settlement for transactions in a ledger?
784
+ sentences:
785
+ - 'paper-title: Ledger Combiners for Fast Settlement
786
+
787
+
788
+ Since the above requirements are formulated independently for each $t$, it is
789
+ well-defined to treat $\mathrm{C}[\cdot]$ as operating on ledgers rather than
790
+ dynamic ledgers; we sometimes overload the notation in this sense.
791
+
792
+
793
+ Looking ahead, our amplification combiner will consider $\mathrm{t}_{\mathrm{C}}\left(\mathbf{L}_{1}^{(t)},
794
+ \ldots, \mathbf{L}_{m}^{(t)}\right)=\bigcup_{i} \mathbf{L}_{i}^{(t)}$ along with
795
+ two related definitions of $\mathrm{a}_{\mathrm{C}}$ :
796
+
797
+
798
+ $$
799
+
800
+ \mathrm{a}_{\mathrm{C}}\left(A_{1}^{(t)}, \ldots, A_{m}^{(t)}\right)=\bigcup_{i}
801
+ A_{i}^{(t)} \quad \text { and } \quad \mathrm{a}_{\mathrm{C}}\left(A_{1}^{(t)},
802
+ \ldots, A_{m}^{(t)}\right)=\bigcap_{i} A_{i}^{(t)}
803
+
804
+ $$
805
+
806
+
807
+ see Section 3. The robust combiner will adopt a more sophisticated notion of $\mathrm{t}_{\mathrm{C}}$;
808
+ see Section 5. In each of these cases, the important structural properties of
809
+ the construction are captured by the rank function $r_{C}$.
810
+
811
+
812
+ \subsection*{2.3 Transaction Validity and Settlement}
813
+
814
+ In the discussion below, we assume a general notion of transaction validity that
815
+ can be decided inductively: given a ledger $\mathbf{L}$, the validity of a transaction
816
+ $\mathrm{tx} \in \mathbf{L}$ is determined by the transactions in the state $\mathbf{L}\lceil\mathrm{tx}\rceil$
817
+ of $\mathbf{L}$ up to $\mathrm{tx}$ and their ordering. Intuitively, only valid transactions
818
+ are then accounted for when interpreting the state of the ledger on the application
819
+ level. The canonical example of such a validity predicate in the case of so-called
820
+ UTXO transactions is formalized for completeness in Appendix B. Note that protocols
821
+ such as Bitcoin allow only valid transactions to enter the ledger; as the Bitcoin
822
+ ledger is represented by a simple chain it is possible to evaluate the validity
823
+ predicate upon block creation for each included transaction. This may not be the
824
+ case for more general ledgers, such as the result of applying one of our combiners
825
+ or various DAG-based constructions.
826
+
827
+
828
+ While we focus our analysis on persistence and liveness as given in Definition
829
+ 3, our broader goal is to study settlement. Intuitively, settlement is the delay
830
+ necessary to ensure that a transaction included in some $A^{(t)}$ enters the dynamic
831
+ ledger and, furthermore, that its validity stabilizes for all future times.
832
+
833
+
834
+ Definition 5 (Absolute settlement). For a dynamic ledger $\mathbf{D} \stackrel{\text
835
+ { def }}{=} \mathbf{L}^{(0)}, \mathbf{L}^{(1)}, \ldots$ we say that a transaction
836
+ $\mathrm{tx} \in A^{(\tau)} \cap \mathbf{L}^{(t)}$ (for $\tau \leq t$) is (absolutely)
837
+ settled at time $t$ if for all $\ell \geq t$ we have: (i) $\mathbf{L}^{(t)}\lceil\mathrm{tx}\rceil
838
+ \subseteq \mathbf{L}^{(\ell)}$, (ii) the linear orders $<_{\mathbf{L}^{(t)}}$
839
+ and $<_{\mathbf{L}^{(\ell)}}$ agree on $\mathbf{L}^{(t)}\lceil\mathrm{tx}\rceil$,
840
+ and (iii) for any $\mathrm{tx}^{\prime} \in \mathbf{L}^{(\ell)}$ such that $\mathrm{tx}^{\prime} <_{\mathbf{L}^{(\ell)}}
841
+ \mathrm{tx}$ we have $\mathrm{tx}^{\prime} \in \mathbf{L}^{(t)}\lceil\mathrm{tx}\rceil$.
842
+
843
+
844
+ Note that for any absolutely settled transaction, its validity is determined and
845
+ it is guaranteed to remain unchanged in the future.
846
+
847
+
848
+ It will be useful to also consider a weaker notion of relative settlement of a
849
+ transaction: Intuitively, $\mathrm{tx}$ is relatively settled at time $t$ if we have the
850
+ guarantee that no (conflicting) transaction $\mathrm{tx}^{\prime}$ that is not
851
+ part of the ledger at time $t$ can possibly eventually precede $\mathrm{tx}$ in the ledger
852
+ ordering.'
853
+ - "paper-title: Casper the Friendly Finality Gadget\n\n\\documentclass[10pt]{article}\n\
854
+ \\usepackage[utf8]{inputenc}\n\\usepackage[T1]{fontenc}\n\\usepackage{amsmath}\n\
855
+ \\usepackage{amsfonts}\n\\usepackage{amssymb}\n\\usepackage[version=4]{mhchem}\n\
856
+ \\usepackage{stmaryrd}\n\\usepackage{graphicx}\n\\usepackage[export]{adjustbox}\n\
857
+ \\graphicspath{ {./images/} }\n\\usepackage{hyperref}\n\\hypersetup{colorlinks=true,\
858
+ \ linkcolor=blue, filecolor=magenta, urlcolor=cyan,}\n\\urlstyle{same}\n\n\\title{Casper\
859
+ \ the Friendly Finality Gadget }\n\n\\author{Vitalik Buterin and Virgil Griffith\\\
860
+ \\\nEthereum Foundation}\n\\date{}\n\n\n%New command to display footnote whose\
861
+ \ markers will always be hidden\n\\let\\svthefootnote\\thefootnote\n\\newcommand\\\
862
+ blfootnotetext[1]{%\n \\let\\thefootnote\\relax\\footnote{#1}%\n \\addtocounter{footnote}{-1}%\n\
863
+ \ \\let\\thefootnote\\svthefootnote%\n}\n\n%Overriding the \\footnotetext command\
864
+ \ to hide the marker if its value is `0`\n\\let\\svfootnotetext\\footnotetext\n\
865
+ \\renewcommand\\footnotetext[2][?]{%\n \\if\\relax#1\\relax%\n \\ifnum\\value{footnote}=0\\\
866
+ blfootnotetext{#2}\\else\\svfootnotetext{#2}\\fi%\n \\else%\n \\if?#1\\ifnum\\\
867
+ value{footnote}=0\\blfootnotetext{#2}\\else\\svfootnotetext{#2}\\fi%\n \\else\\\
868
+ svfootnotetext[#1]{#2}\\fi%\n \\fi\n}\n\n\\begin{document}\n\\maketitle\n\n\n\
869
+ \\begin{abstract}\nWe introduce Casper, a proof of stake-based finality system\
870
+ \ which overlays an existing proof of work blockchain. Casper is a partial consensus\
871
+ \ mechanism combining proof of stake algorithm research and Byzantine fault tolerant\
872
+ \ consensus theory. We introduce our system, prove some desirable features, and\
873
+ \ show defenses against long range revisions and catastrophic crashes. The Casper\
874
+ \ overlay provides almost any proof of work chain with additional protections\
875
+ \ against block reversions.\n\\end{abstract}\n\n\\section*{1. Introduction}\n\
876
+ Over the past few years there has been considerable research into \"proof of stake\"\
877
+ \ (PoS) based blockchain consensus algorithms. In a PoS system, a blockchain appends\
878
+ \ and agrees on new blocks through a process where anyone who holds coins inside\
879
+ \ of the system can participate, and the influence an agent has is proportional\
880
+ \ to the number of coins (or \"stake\") it holds. This is a vastly more efficient\
881
+ \ alternative to proof of work (PoW) \"mining\" and enables blockchains to operate\
882
+ \ without mining's high hardware and electricity costs.\\\\[0pt]\nThere are two\
883
+ \ major schools of thought in PoS design. The first, chain-based proof of stake[1,\
884
+ \ 2], mimics proof of work mechanics and features a chain of blocks and simulates\
885
+ \ mining by pseudorandomly assigning the right to create new blocks to stakeholders.\
886
+ \ This includes Peercoin[3], Blackcoin[4], and Iddo Bentov's work[5].\\\\[0pt]\n\
887
+ The other school, Byzantine fault tolerant (BFT) based proof of stake, is based\
888
+ \ on a thirty-year-old body of research into BFT consensus algorithms such as\
889
+ \ PBFT[6]. BFT algorithms typically have proven mathematical properties; for example,\
890
+ \ one can usually mathematically prove that as long as $>\\frac{2}{3}$ of protocol\
891
+ \ participants are following the protocol honestly, then, regardless of network\
892
+ \ latency, the algorithm cannot finalize conflicting blocks. Repurposing BFT algorithms\
893
+ \ for proof of stake was first introduced by Tendermint[7], and has modern inspirations\
894
+ \ such as [8]. Casper follows this BFT tradition, though with some modifications.\n\
895
+ \n\\subsection*{1.1. Our Work}\nCasper the Friendly Finality Gadget is an overlay\
896
+ \ atop a proposal mechanism, a mechanism which proposes blocks ${ }^{1}$. Casper\
897
+ \ is responsible for finalizing these blocks, essentially selecting a unique chain\
898
+ \ which represents the canonical transactions of the ledger. Casper provides safety,\
899
+ \ but liveness depends on the chosen proposal mechanism. That is, if attackers\
900
+ \ wholly control the proposal mechanism, Casper protects against finalizing two\
901
+ \ conflicting checkpoints, but the attackers could prevent Casper from finalizing\
902
+ \ any future checkpoints.\\\\\nCasper introduces several new features that BFT\
903
+ \ algorithms do not necessarily support:"
904
+ - 'paper-title: Bitcoin and Cryptocurrency Technologies
905
+
906
+
907
+ Interestingly, these concerns have an analogy in the realm of voting. It''s illegal
908
+ in the United States and many other nations for individuals to sell their vote.
909
+ Arguably participating in a pool controlled by someone else is akin to selling
910
+ your vote in the Bitcoin consensus protocol.
911
+
912
+
913
+ Technical requirements for pools. Recall that mining pools appear to be an emergent
914
+ phenomenon. There''s no evidence that Satoshi was thinking of mining pools at
915
+ the time of Bitcoin''s original design. It wasn''t apparent for a few years that
916
+ efficient pools could be run between many individuals who don''t know or trust
917
+ each other.
918
+
919
+
920
+ As we saw in Chapter 5, mining pools typically work by designating a pool operator
921
+ with a well-known public key. Each of the participating miners mines as usual
922
+ but sends in shares to the pool operator. These shares are "near misses" or "partial
923
+ solutions" which would be valid solutions at a lower difficulty level. This shows
924
+ the pool operator how much work the miner is performing. Whenever one of the pool
925
+ participants finds a valid block, the pool operator then distributes the rewards
926
+ amongst the pool participants based on the number of shares they have submitted.
927
+ As we discussed in Chapter 5, there are many formulas for dividing the revenue
928
+ up, but all mining pools follow this basic structure.
929
+
930
+
931
+ The existence of pools thus relies on at least two technical properties of Bitcoin.
932
+ The first is that it''s easy for a miner to prove (probabilistically) how much
933
+ work they are doing by submitting shares. By choosing a low enough threshold for
934
+ shares, miners can easily prove how much work they are performing with arbitrary
935
+ precision regardless of the actual difficulty of finding an valid block. This
936
+ facet of mining puzzles appears difficult to change, given that we need a puzzle
937
+ that can be created with arbitrary difficulty.
938
+
939
+
940
+ Second, pool members can easily prove to the pool operator that they''re following
941
+ the rules and working to find valid blocks which would reward the pool as a whole.
942
+ This works because the pool''s public key is committed to in the coinbase transaction
943
+ included in the block''s Merkle tree of transactions. Once a miner finds a block
944
+ or even a share, they can''t change which public key is the recipient of the newly
945
+ minted coins.
946
+
947
+
948
+ Block discarding attacks. There is one weakness in this scheme for implementing
949
+ mining pools: there is nothing to to enforce that participating miners actually
950
+ submit valid blocks to the pool manager in the event that they find them. Suppose
951
+ that there''s a pool member that''s upset with a large mining pool. They can participate
952
+ in the pool by mining and submitting shares just like normal, but in the event
953
+ that they actually find a valid block that would reward the pool they simply discard
954
+ it and don''t tell the pool operator about it.
955
+
956
+
957
+ This attack reduces the pool''s overall mining power as none of the attacker''s
958
+ work is contributing towards finding valid blocks. However the attacker will still
959
+ be rewarded as they appear to be submitting valid shares and simply getting unlucky
960
+ to not find any valid blocks. If the mining pool is designed to be revenue-neutral
961
+ (that is, all mining rewards are redistributed back to participants) then this
962
+ attack can cause the pool to run at a loss.
963
+
964
+
965
+ This attack is sometimes called a vigilante or sabotage attack and is considered
966
+ a form of vandalism because the attack appears to be costly for both the attacker
967
+ and the pool. The attacker loses money because every block they discard would
968
+ have led to some proportion of the block rewards being returned to them. Of course,
969
+ the attacker still gets rewards for other puzzle solutions that are found.
970
+
971
+
972
+ It appears that a rational attacker wouldn''t employ this strategy, since they
973
+ would lose money without gaining anything tangible. It turns out (quite surprisingly)
974
+ that there are cases where this strategy can be profitable, as discussed in the
975
+ box below. But in any case, we want to design an entirely new mining puzzle formulation
976
+ that ensures this strategy is always profitable.
977
+
978
+
979
+ Sidebar: block discarding attacks between pools. People assumed for years that
980
+ it can''t be profitable for a participant to discard valid blocks found on behalf
981
+ of the pool. It turns out this strategy can be profitable if one mining pool uses
982
+ it to attack another. This was proposed apocryphally many times and first thoroughly
983
+ analyzed in a paper by Ittay Eyal in 2015.
984
+
985
+
986
+ Let''s consider a simple case: suppose two mining pools, $A$ and $B$, each have
987
+ $50 \%$ of the total mining capacity. Now suppose B uses half of its mining power
988
+ ( $25 \%$ of the total capacity) to mine as a member in pool A, but discards all
989
+ blocks found. We can show, in a simplified model, that B will now earns $5 / 9$
990
+ of the total rewards, greater than the $50 \%$ it would earn by mining normally.
991
+ In this simple case, dedicating half of its mining power to attacking can be shown
992
+ to be the optimal strategy for pool B.'
993
+ pipeline_tag: sentence-similarity
994
+ library_name: sentence-transformers
995
+ metrics:
996
+ - cosine_accuracy@1
997
+ - cosine_accuracy@3
998
+ - cosine_accuracy@5
999
+ - cosine_accuracy@10
1000
+ - cosine_precision@1
1001
+ - cosine_precision@3
1002
+ - cosine_precision@5
1003
+ - cosine_precision@10
1004
+ - cosine_recall@1
1005
+ - cosine_recall@3
1006
+ - cosine_recall@5
1007
+ - cosine_recall@10
1008
+ - cosine_ndcg@10
1009
+ - cosine_mrr@10
1010
+ - cosine_map@100
1011
+ model-index:
1012
+ - name: SentenceTransformer based on BAAI/bge-base-en-v1.5
1013
+ results:
1014
+ - task:
1015
+ type: information-retrieval
1016
+ name: Information Retrieval
1017
+ dataset:
1018
+ name: dim 768
1019
+ type: dim_768
1020
+ metrics:
1021
+ - type: cosine_accuracy@1
1022
+ value: 0.5
1023
+ name: Cosine Accuracy@1
1024
+ - type: cosine_accuracy@3
1025
+ value: 0.7857142857142857
1026
+ name: Cosine Accuracy@3
1027
+ - type: cosine_accuracy@5
1028
+ value: 0.8571428571428571
1029
+ name: Cosine Accuracy@5
1030
+ - type: cosine_accuracy@10
1031
+ value: 0.8571428571428571
1032
+ name: Cosine Accuracy@10
1033
+ - type: cosine_precision@1
1034
+ value: 0.5
1035
+ name: Cosine Precision@1
1036
+ - type: cosine_precision@3
1037
+ value: 0.26190476190476186
1038
+ name: Cosine Precision@3
1039
+ - type: cosine_precision@5
1040
+ value: 0.17142857142857146
1041
+ name: Cosine Precision@5
1042
+ - type: cosine_precision@10
1043
+ value: 0.08571428571428573
1044
+ name: Cosine Precision@10
1045
+ - type: cosine_recall@1
1046
+ value: 0.5
1047
+ name: Cosine Recall@1
1048
+ - type: cosine_recall@3
1049
+ value: 0.7857142857142857
1050
+ name: Cosine Recall@3
1051
+ - type: cosine_recall@5
1052
+ value: 0.8571428571428571
1053
+ name: Cosine Recall@5
1054
+ - type: cosine_recall@10
1055
+ value: 0.8571428571428571
1056
+ name: Cosine Recall@10
1057
+ - type: cosine_ndcg@10
1058
+ value: 0.7032219246239031
1059
+ name: Cosine Ndcg@10
1060
+ - type: cosine_mrr@10
1061
+ value: 0.6511904761904762
1062
+ name: Cosine Mrr@10
1063
+ - type: cosine_map@100
1064
+ value: 0.6553083095766022
1065
+ name: Cosine Map@100
1066
+ - task:
1067
+ type: information-retrieval
1068
+ name: Information Retrieval
1069
+ dataset:
1070
+ name: dim 512
1071
+ type: dim_512
1072
+ metrics:
1073
+ - type: cosine_accuracy@1
1074
+ value: 0.5714285714285714
1075
+ name: Cosine Accuracy@1
1076
+ - type: cosine_accuracy@3
1077
+ value: 0.7857142857142857
1078
+ name: Cosine Accuracy@3
1079
+ - type: cosine_accuracy@5
1080
+ value: 0.8214285714285714
1081
+ name: Cosine Accuracy@5
1082
+ - type: cosine_accuracy@10
1083
+ value: 0.8571428571428571
1084
+ name: Cosine Accuracy@10
1085
+ - type: cosine_precision@1
1086
+ value: 0.5714285714285714
1087
+ name: Cosine Precision@1
1088
+ - type: cosine_precision@3
1089
+ value: 0.26190476190476186
1090
+ name: Cosine Precision@3
1091
+ - type: cosine_precision@5
1092
+ value: 0.1642857142857143
1093
+ name: Cosine Precision@5
1094
+ - type: cosine_precision@10
1095
+ value: 0.08571428571428573
1096
+ name: Cosine Precision@10
1097
+ - type: cosine_recall@1
1098
+ value: 0.5714285714285714
1099
+ name: Cosine Recall@1
1100
+ - type: cosine_recall@3
1101
+ value: 0.7857142857142857
1102
+ name: Cosine Recall@3
1103
+ - type: cosine_recall@5
1104
+ value: 0.8214285714285714
1105
+ name: Cosine Recall@5
1106
+ - type: cosine_recall@10
1107
+ value: 0.8571428571428571
1108
+ name: Cosine Recall@10
1109
+ - type: cosine_ndcg@10
1110
+ value: 0.7276726753008987
1111
+ name: Cosine Ndcg@10
1112
+ - type: cosine_mrr@10
1113
+ value: 0.6848639455782314
1114
+ name: Cosine Mrr@10
1115
+ - type: cosine_map@100
1116
+ value: 0.6886316064887493
1117
+ name: Cosine Map@100
1118
+ - task:
1119
+ type: information-retrieval
1120
+ name: Information Retrieval
1121
+ dataset:
1122
+ name: dim 256
1123
+ type: dim_256
1124
+ metrics:
1125
+ - type: cosine_accuracy@1
1126
+ value: 0.5714285714285714
1127
+ name: Cosine Accuracy@1
1128
+ - type: cosine_accuracy@3
1129
+ value: 0.7857142857142857
1130
+ name: Cosine Accuracy@3
1131
+ - type: cosine_accuracy@5
1132
+ value: 0.8214285714285714
1133
+ name: Cosine Accuracy@5
1134
+ - type: cosine_accuracy@10
1135
+ value: 0.8571428571428571
1136
+ name: Cosine Accuracy@10
1137
+ - type: cosine_precision@1
1138
+ value: 0.5714285714285714
1139
+ name: Cosine Precision@1
1140
+ - type: cosine_precision@3
1141
+ value: 0.26190476190476186
1142
+ name: Cosine Precision@3
1143
+ - type: cosine_precision@5
1144
+ value: 0.1642857142857143
1145
+ name: Cosine Precision@5
1146
+ - type: cosine_precision@10
1147
+ value: 0.08571428571428573
1148
+ name: Cosine Precision@10
1149
+ - type: cosine_recall@1
1150
+ value: 0.5714285714285714
1151
+ name: Cosine Recall@1
1152
+ - type: cosine_recall@3
1153
+ value: 0.7857142857142857
1154
+ name: Cosine Recall@3
1155
+ - type: cosine_recall@5
1156
+ value: 0.8214285714285714
1157
+ name: Cosine Recall@5
1158
+ - type: cosine_recall@10
1159
+ value: 0.8571428571428571
1160
+ name: Cosine Recall@10
1161
+ - type: cosine_ndcg@10
1162
+ value: 0.7284895986499949
1163
+ name: Cosine Ndcg@10
1164
+ - type: cosine_mrr@10
1165
+ value: 0.6857142857142858
1166
+ name: Cosine Mrr@10
1167
+ - type: cosine_map@100
1168
+ value: 0.6893267651888342
1169
+ name: Cosine Map@100
1170
+ - task:
1171
+ type: information-retrieval
1172
+ name: Information Retrieval
1173
+ dataset:
1174
+ name: dim 128
1175
+ type: dim_128
1176
+ metrics:
1177
+ - type: cosine_accuracy@1
1178
+ value: 0.5
1179
+ name: Cosine Accuracy@1
1180
+ - type: cosine_accuracy@3
1181
+ value: 0.75
1182
+ name: Cosine Accuracy@3
1183
+ - type: cosine_accuracy@5
1184
+ value: 0.8214285714285714
1185
+ name: Cosine Accuracy@5
1186
+ - type: cosine_accuracy@10
1187
+ value: 0.8571428571428571
1188
+ name: Cosine Accuracy@10
1189
+ - type: cosine_precision@1
1190
+ value: 0.5
1191
+ name: Cosine Precision@1
1192
+ - type: cosine_precision@3
1193
+ value: 0.24999999999999997
1194
+ name: Cosine Precision@3
1195
+ - type: cosine_precision@5
1196
+ value: 0.1642857142857143
1197
+ name: Cosine Precision@5
1198
+ - type: cosine_precision@10
1199
+ value: 0.08571428571428573
1200
+ name: Cosine Precision@10
1201
+ - type: cosine_recall@1
1202
+ value: 0.5
1203
+ name: Cosine Recall@1
1204
+ - type: cosine_recall@3
1205
+ value: 0.75
1206
+ name: Cosine Recall@3
1207
+ - type: cosine_recall@5
1208
+ value: 0.8214285714285714
1209
+ name: Cosine Recall@5
1210
+ - type: cosine_recall@10
1211
+ value: 0.8571428571428571
1212
+ name: Cosine Recall@10
1213
+ - type: cosine_ndcg@10
1214
+ value: 0.6935204558400861
1215
+ name: Cosine Ndcg@10
1216
+ - type: cosine_mrr@10
1217
+ value: 0.6395833333333334
1218
+ name: Cosine Mrr@10
1219
+ - type: cosine_map@100
1220
+ value: 0.6425405844155845
1221
+ name: Cosine Map@100
1222
+ - task:
1223
+ type: information-retrieval
1224
+ name: Information Retrieval
1225
+ dataset:
1226
+ name: dim 64
1227
+ type: dim_64
1228
+ metrics:
1229
+ - type: cosine_accuracy@1
1230
+ value: 0.42857142857142855
1231
+ name: Cosine Accuracy@1
1232
+ - type: cosine_accuracy@3
1233
+ value: 0.6785714285714286
1234
+ name: Cosine Accuracy@3
1235
+ - type: cosine_accuracy@5
1236
+ value: 0.75
1237
+ name: Cosine Accuracy@5
1238
+ - type: cosine_accuracy@10
1239
+ value: 0.8214285714285714
1240
+ name: Cosine Accuracy@10
1241
+ - type: cosine_precision@1
1242
+ value: 0.42857142857142855
1243
+ name: Cosine Precision@1
1244
+ - type: cosine_precision@3
1245
+ value: 0.22619047619047614
1246
+ name: Cosine Precision@3
1247
+ - type: cosine_precision@5
1248
+ value: 0.15000000000000005
1249
+ name: Cosine Precision@5
1250
+ - type: cosine_precision@10
1251
+ value: 0.08214285714285716
1252
+ name: Cosine Precision@10
1253
+ - type: cosine_recall@1
1254
+ value: 0.42857142857142855
1255
+ name: Cosine Recall@1
1256
+ - type: cosine_recall@3
1257
+ value: 0.6785714285714286
1258
+ name: Cosine Recall@3
1259
+ - type: cosine_recall@5
1260
+ value: 0.75
1261
+ name: Cosine Recall@5
1262
+ - type: cosine_recall@10
1263
+ value: 0.8214285714285714
1264
+ name: Cosine Recall@10
1265
+ - type: cosine_ndcg@10
1266
+ value: 0.631592589549331
1267
+ name: Cosine Ndcg@10
1268
+ - type: cosine_mrr@10
1269
+ value: 0.5696428571428572
1270
+ name: Cosine Mrr@10
1271
+ - type: cosine_map@100
1272
+ value: 0.5757306413556414
1273
+ name: Cosine Map@100
1274
+ ---
1275
+
1276
+ # SentenceTransformer based on BAAI/bge-base-en-v1.5
1277
+
1278
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
1279
+
1280
+ ## Model Details
1281
+
1282
+ ### Model Description
1283
+ - **Model Type:** Sentence Transformer
1284
+ - **Base model:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) <!-- at revision a5beb1e3e68b9ab74eb54cfd186867f64f240e1a -->
1285
+ - **Maximum Sequence Length:** 512 tokens
1286
+ - **Output Dimensionality:** 768 dimensions
1287
+ - **Similarity Function:** Cosine Similarity
1288
+ - **Training Dataset:**
1289
+ - json
1290
+ <!-- - **Language:** Unknown -->
1291
+ <!-- - **License:** Unknown -->
1292
+
1293
+ ### Model Sources
1294
+
1295
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
1296
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
1297
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
1298
+
1299
+ ### Full Model Architecture
1300
+
1301
+ ```
1302
+ SentenceTransformer(
1303
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
1304
+ (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
1305
+ (2): Normalize()
1306
+ )
1307
+ ```
1308
+
1309
+ ## Usage
1310
+
1311
+ ### Direct Usage (Sentence Transformers)
1312
+
1313
+ First install the Sentence Transformers library:
1314
+
1315
+ ```bash
1316
+ pip install -U sentence-transformers
1317
+ ```
1318
+
1319
+ Then you can load this model and run inference.
1320
+ ```python
1321
+ from sentence_transformers import SentenceTransformer
1322
+
1323
+ # Download from the 🤗 Hub
1324
+ model = SentenceTransformer("mahsaBa76/bge-base-custom-matryoshka")
1325
+ # Run inference
1326
+ sentences = [
1327
+ 'What is the difference between absolute settlement and relative settlement for transactions in a ledger?',
1328
+ 'paper-title: Ledger Combiners for Fast Settlement\n\nSince the above requirements are formulated independently for each $t$, it is well-defined to treat $\\mathrm{C}[\\cdot]$ as operating on ledgers rather than dynamic ledgers; we sometimes overload the notation in this sense.\n\nLooking ahead, our amplification combiner will consider $\\mathrm{t}_{\\mathrm{C}}\\left(\\mathbf{L}_{1}^{(t)}, \\ldots, \\mathbf{L}_{m}^{(t)}\\right)=\\bigcup_{i} \\mathbf{L}_{i}^{(t)}$ along with two related definitions of $\\mathrm{a}_{\\mathrm{C}}$ :\n\n$$\n\\mathrm{a}_{\\mathrm{C}}\\left(A_{1}^{(t)}, \\ldots, A_{m}^{(t)}\\right)=\\bigcup_{i} A_{i}^{(t)} \\quad \\text { and } \\quad \\mathrm{a}_{\\mathrm{C}}\\left(A_{1}^{(t)}, \\ldots, A_{m}^{(t)}\\right)=\\bigcap_{i} A_{i}^{(t)}\n$$\n\nsee Section 3. The robust combiner will adopt a more sophisticated notion of $t_{c}$; see Section 5 . In each of these cases, the important structural properties of the construction are captured by the rank function $r_{C}$.\n\n\\subsection*{2.3 Transaction Validity and Settlement}\nIn the discussion below, we assume a general notion of transaction validity that can be decided inductively: given a ledger $\\mathbf{L}$, the validity of a transaction $t x \\in \\mathbf{L}$ is determined by the transactions in the state $\\mathbf{L}\\lceil\\operatorname{tx}\\rceil$ of $\\mathbf{L}$ up to tx and their ordering. Intuitively, only valid transactions are then accounted for when interpreting the state of the ledger on the application level. The canonical example of such a validity predicate in the case of so-called UTXO transactions is formalized for completeness in Appendix B. Note that protocols such as Bitcoin allow only valid transactions to enter the ledger; as the Bitcoin ledger is represented by a simple chain it is possible to evaluate the validity predicate upon block creation for each included transaction. 
This may not be the case for more general ledgers, such as the result of applying one of our combiners or various DAG-based constructions.\n\nWhile we focus our analysis on persistence and liveness as given in Definition 3, our broader goal is to study settlement. Intuitively, settlement is the delay necessary to ensure that a transaction included in some $A^{(t)}$ enters the dynamic ledger and, furthermore, that its validity stabilizes for all future times.\n\nDefinition 5 (Absolute settlement). For a dynamic ledger $\\mathbf{D} \\stackrel{\\text { def }}{=} \\mathbf{L}^{(0)}, \\mathbf{L}^{(1)}, \\ldots$ we say that a transaction $t x \\in$ $A^{(\\tau)} \\cap \\mathbf{L}^{(t)}($ for $\\tau \\leq t)$ is (absolutely) settled at time $t$ iffor all $\\ell \\geq t$ we have: (i) $\\mathbf{L}^{(t)}\\lceil\\mathrm{tx}\\rceil \\subseteq \\mathbf{L}^{(\\ell)}$, (ii) the linear orders $<_{\\mathbf{L}^{(t)}}$ and $<_{\\mathbf{L}^{(t)}}$ agree on $\\mathbf{L}^{(t)}\\lceil\\mathrm{tx}\\rceil$, and (iii) for any $\\mathrm{tx}^{\\prime} \\in \\mathbf{L}^{(e)}$ such that $\\mathrm{tx}^{\\prime}{<_{\\mathbf{L}}(t)} \\mathrm{tx}$ we have $\\mathrm{tx}^{\\prime} \\in \\mathbf{L}^{(t)}\\lceil\\mathrm{tx}\\rceil$.\n\nNote that for any absolutely settled transaction, its validity is determined and it is guaranteed to remain unchanged in the future.\n\nIt will be useful to also consider a weaker notion of relative settlement of a transaction: Intuitively, tx is relatively settled at time $t$ if we have the guarantee that no (conflicting) transaction $\\mathrm{tx}^{\\prime}$ that is not part of the ledger at time $t$ can possibly eventually precede $t x$ in the ledger ordering.',
1329
+ 'paper-title: Casper the Friendly Finality Gadget\n\n\\documentclass[10pt]{article}\n\\usepackage[utf8]{inputenc}\n\\usepackage[T1]{fontenc}\n\\usepackage{amsmath}\n\\usepackage{amsfonts}\n\\usepackage{amssymb}\n\\usepackage[version=4]{mhchem}\n\\usepackage{stmaryrd}\n\\usepackage{graphicx}\n\\usepackage[export]{adjustbox}\n\\graphicspath{ {./images/} }\n\\usepackage{hyperref}\n\\hypersetup{colorlinks=true, linkcolor=blue, filecolor=magenta, urlcolor=cyan,}\n\\urlstyle{same}\n\n\\title{Casper the Friendly Finality Gadget }\n\n\\author{Vitalik Buterin and Virgil Griffith\\\\\nEthereum Foundation}\n\\date{}\n\n\n%New command to display footnote whose markers will always be hidden\n\\let\\svthefootnote\\thefootnote\n\\newcommand\\blfootnotetext[1]{%\n \\let\\thefootnote\\relax\\footnote{#1}%\n \\addtocounter{footnote}{-1}%\n \\let\\thefootnote\\svthefootnote%\n}\n\n%Overriding the \\footnotetext command to hide the marker if its value is `0`\n\\let\\svfootnotetext\\footnotetext\n\\renewcommand\\footnotetext[2][?]{%\n \\if\\relax#1\\relax%\n \\ifnum\\value{footnote}=0\\blfootnotetext{#2}\\else\\svfootnotetext{#2}\\fi%\n \\else%\n \\if?#1\\ifnum\\value{footnote}=0\\blfootnotetext{#2}\\else\\svfootnotetext{#2}\\fi%\n \\else\\svfootnotetext[#1]{#2}\\fi%\n \\fi\n}\n\n\\begin{document}\n\\maketitle\n\n\n\\begin{abstract}\nWe introduce Casper, a proof of stake-based finality system which overlays an existing proof of work blockchain. Casper is a partial consensus mechanism combining proof of stake algorithm research and Byzantine fault tolerant consensus theory. We introduce our system, prove some desirable features, and show defenses against long range revisions and catastrophic crashes. The Casper overlay provides almost any proof of work chain with additional protections against block reversions.\n\\end{abstract}\n\n\\section*{1. Introduction}\nOver the past few years there has been considerable research into "proof of stake" (PoS) based blockchain consensus algorithms. 
In a PoS system, a blockchain appends and agrees on new blocks through a process where anyone who holds coins inside of the system can participate, and the influence an agent has is proportional to the number of coins (or "stake") it holds. This is a vastly more efficient alternative to proof of work (PoW) "mining" and enables blockchains to operate without mining\'s high hardware and electricity costs.\\\\[0pt]\nThere are two major schools of thought in PoS design. The first, chain-based proof of stake[1, 2], mimics proof of work mechanics and features a chain of blocks and simulates mining by pseudorandomly assigning the right to create new blocks to stakeholders. This includes Peercoin[3], Blackcoin[4], and Iddo Bentov\'s work[5].\\\\[0pt]\nThe other school, Byzantine fault tolerant (BFT) based proof of stake, is based on a thirty-year-old body of research into BFT consensus algorithms such as PBFT[6]. BFT algorithms typically have proven mathematical properties; for example, one can usually mathematically prove that as long as $>\\frac{2}{3}$ of protocol participants are following the protocol honestly, then, regardless of network latency, the algorithm cannot finalize conflicting blocks. Repurposing BFT algorithms for proof of stake was first introduced by Tendermint[7], and has modern inspirations such as [8]. Casper follows this BFT tradition, though with some modifications.\n\n\\subsection*{1.1. Our Work}\nCasper the Friendly Finality Gadget is an overlay atop a proposal mechanism-a mechanism which proposes blocks ${ }^{1}$. Casper is responsible for finalizing these blocks, essentially selecting a unique chain which represents the canonical transactions of the ledger. Casper provides safety, but liveness depends on the chosen proposal mechanism. 
That is, if attackers wholly control the proposal mechanism, Casper protects against finalizing two conflicting checkpoints, but the attackers could prevent Casper from finalizing any future checkpoints.\\\\\nCasper introduces several new features that BFT algorithms do not necessarily support:',
1330
+ ]
1331
+ embeddings = model.encode(sentences)
1332
+ print(embeddings.shape)
1333
+ # (3, 768)
1334
+
1335
+ # Get the similarity scores for the embeddings
1336
+ similarities = model.similarity(embeddings, embeddings)
1337
+ print(similarities.shape)
1338
+ # torch.Size([3, 3])
1339
+ ```
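Since the model was trained with a Matryoshka objective over the dimensions 768, 512, 256, 128 and 64, the leading components of an embedding should remain usable on their own. The sketch below shows the usual post-processing, under the assumption that truncated vectors are re-normalized before cosine similarity is computed. The helper names and toy 4-dimensional vectors are hypothetical. Recent releases of Sentence Transformers also expose a `truncate_dim` argument on the `SentenceTransformer` constructor, which may be more convenient in practice.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components, then rescale the result to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings standing in for full-size model outputs.
query = [0.5, 0.5, 0.5, 0.5]
passage = [0.5, 0.5, -0.5, 0.5]

full_score = cosine_similarity(query, passage)
small_score = cosine_similarity(
    truncate_and_normalize(query, 2),
    truncate_and_normalize(passage, 2),
)
print(full_score, small_score)
```

Scores generally change after truncation, so a similarity threshold tuned at 768 dimensions should be re-checked at the smaller size.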
1340
+
1341
+ <!--
1342
+ ### Direct Usage (Transformers)
1343
+
1344
+ <details><summary>Click to see the direct usage in Transformers</summary>
1345
+
1346
+ </details>
1347
+ -->
1348
+
1349
+ <!--
1350
+ ### Downstream Usage (Sentence Transformers)
1351
+
1352
+ You can finetune this model on your own dataset.
1353
+
1354
+ <details><summary>Click to expand</summary>
1355
+
1356
+ </details>
1357
+ -->
1358
+
1359
+ <!--
1360
+ ### Out-of-Scope Use
1361
+
1362
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
1363
+ -->
1364
+
1365
+ ## Evaluation
1366
+
1367
+ ### Metrics
1368
+
1369
+ #### Information Retrieval
1370
+
1371
+ * Datasets: `dim_768`, `dim_512`, `dim_256`, `dim_128` and `dim_64`
1372
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
1373
+
1374
+ | Metric | dim_768 | dim_512 | dim_256 | dim_128 | dim_64 |
1375
+ |:--------------------|:-----------|:-----------|:-----------|:-----------|:-----------|
1376
+ | cosine_accuracy@1 | 0.5 | 0.5714 | 0.5714 | 0.5 | 0.4286 |
1377
+ | cosine_accuracy@3 | 0.7857 | 0.7857 | 0.7857 | 0.75 | 0.6786 |
1378
+ | cosine_accuracy@5 | 0.8571 | 0.8214 | 0.8214 | 0.8214 | 0.75 |
1379
+ | cosine_accuracy@10 | 0.8571 | 0.8571 | 0.8571 | 0.8571 | 0.8214 |
1380
+ | cosine_precision@1 | 0.5 | 0.5714 | 0.5714 | 0.5 | 0.4286 |
1381
+ | cosine_precision@3 | 0.2619 | 0.2619 | 0.2619 | 0.25 | 0.2262 |
1382
+ | cosine_precision@5 | 0.1714 | 0.1643 | 0.1643 | 0.1643 | 0.15 |
1383
+ | cosine_precision@10 | 0.0857 | 0.0857 | 0.0857 | 0.0857 | 0.0821 |
1384
+ | cosine_recall@1 | 0.5 | 0.5714 | 0.5714 | 0.5 | 0.4286 |
1385
+ | cosine_recall@3 | 0.7857 | 0.7857 | 0.7857 | 0.75 | 0.6786 |
1386
+ | cosine_recall@5 | 0.8571 | 0.8214 | 0.8214 | 0.8214 | 0.75 |
1387
+ | cosine_recall@10 | 0.8571 | 0.8571 | 0.8571 | 0.8571 | 0.8214 |
1388
+ | **cosine_ndcg@10** | **0.7032** | **0.7277** | **0.7285** | **0.6935** | **0.6316** |
1389
+ | cosine_mrr@10 | 0.6512 | 0.6849 | 0.6857 | 0.6396 | 0.5696 |
1390
+ | cosine_map@100 | 0.6553 | 0.6886 | 0.6893 | 0.6425 | 0.5757 |
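In the table above, each `cosine_recall@k` equals the corresponding `cosine_accuracy@k`, and each `cosine_precision@k` equals `cosine_accuracy@k` divided by k. That pattern is what one would expect if every query has exactly one relevant passage, although the card does not state this explicitly. Under that assumption, the metrics reduce to simple functions of the rank at which the relevant passage appears. The sketch below uses hypothetical ranks, not the model's actual evaluation output.

```python
import math

def metrics_at_k(ranks, k):
    """Compute retrieval metrics when each query has a single relevant passage.

    `ranks` holds the 1-based rank of the relevant passage for each query,
    or None when it was not retrieved at all.
    """
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    return {
        "accuracy": accuracy,
        "precision": accuracy / k,  # at most one relevant item among the top k
        "recall": accuracy,         # one relevant item in total per query
        "mrr": sum(1.0 / r for r, hit in zip(ranks, hits) if hit) / n,
        # With a single relevant item the ideal DCG is 1, so NDCG equals DCG.
        "ndcg": sum(1.0 / math.log2(r + 1) for r, hit in zip(ranks, hits) if hit) / n,
    }

# Hypothetical ranks for five queries; the last query misses entirely.
ranks = [1, 1, 2, 4, None]
for k in (1, 3, 5):
    print(k, metrics_at_k(ranks, k))
```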
1391
+
1392
+ <!--
1393
+ ## Bias, Risks and Limitations
1394
+
1395
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
1396
+ -->
1397
+
1398
+ <!--
1399
+ ### Recommendations
1400
+
1401
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
1402
+ -->
1403
+
1404
+ ## Training Details
1405
+
1406
+ ### Training Dataset
1407
+
1408
+ #### json
1409
+
1410
+ * Dataset: json
1411
+ * Size: 278 training samples
1412
+ * Columns: <code>anchor</code> and <code>positive</code>
1413
+ * Approximate statistics based on the first 278 samples:
1414
+ | | anchor | positive |
1415
+ |:--------|:-----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
1416
+ | type | string | string |
1417
+ | details | <ul><li>min: 14 tokens</li><li>mean: 26.06 tokens</li><li>max: 72 tokens</li></ul> | <ul><li>min: 512 tokens</li><li>mean: 512.0 tokens</li><li>max: 512 tokens</li></ul> |
1418
+ * Samples:
1419
+ | anchor | positive |
1420
+ |:-----------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
1421
+ | <code>How does ByzCoin ensure that microblock chains remain consistent even in the presence of keyblock conflicts?</code> | <code>paper-title: Enhancing Bitcoin Security and Performance with Strong Consistency via Collective Signing<br><br>Figure 3: ByzCoin blockchain: Two parallel chains store information about the leaders (keyblocks) and the transactions (microblocks)\\<br>becomes two separate parallel blockchains, as shown in Fig. 3. The main blockchain is the keyblock chain, consisting of all mined blocks. The microblock chain is a secondary blockchain that depends on the primary to identify the era in which every microblock belongs to, i.e., which miners are authoritative to sign it and who is the leader of the era.<br><br>Microblocks. A microblock is a simple block that the current consensus group produces every few seconds to represent newly-committed transactions. Each microblock includes a set of transactions and a collective signature. Each microblock also includes hashes referring to the previous microblock and keyblock: the former to ensure total ordering, and the latter indicating which consensus group window and l...</code> |
1422
+ | <code>What are the primary ways in which Bitcoin users can be deanonymized, and why is network-layer deanonymization particularly concerning?</code> | <code>paper-title: Bitcoin and Cryptocurrency Technologies<br><br>This is is exactly what the Fistful of Bitcoins researchers (and others since) have done. They bought a variety of things, joined mining pools, used Bitcoin exchanges, wallet services, and gambling sites, and interacted in a variety of other ways with service providers, compromising 344 transactions in all.<br><br>In Figure 6.5, we again show the clusters of Figure 6.4, but this times with the labels attached. While our guesses about Mt. gox and Satoshi Dice were correct, the researchers were able to identify numerous other service providers that would have been hard to identify without transacting with them.\\<br>\includegraphics[max width=\textwidth, center]{2025_01_02_05ab7f20e06e1a41e145g-175}<br><br>Figure 6.5. Labeled clusters. By transacting with various Bitcoin service providers, Meiklejohn et al. were able to attach real world identities to their clusters.<br><br>Identifying individuals. The next question is: can we do the same thing for indivi...</code> |
1423
+ | <code>What is the main purpose of the ledger indistinguishability and transaction non-malleability properties in the Zerocash protocol?</code> | <code>paper-title: Zerocash: Decentralized Anonymous Payments from Bitcoin<br><br>Ledger indistinguishability is formalized by an experiment L-IND that proceeds as follows. First, a challenger samples a random bit $b$ and initializes two DAP scheme oracles $\mathcal{O}_{0}^{\text {DAP }}$ and $\mathcal{O}_{1}^{\text {DAP }}$, maintaining ledgers $L_{0}$ and $L_{1}$. Throughout, the challenger allows $\mathcal{A}$ to issue queries to $\mathcal{O}_{0}^{\text {DAP }}$ and $\mathcal{O}_{1}^{\text {DAP }}$, thus controlling the behavior of honest parties on $L_{0}$ and $L_{1}$. The challenger provides the adversary with the view of both ledgers, but in randomized order: $L_{\text {Left }}:=L_{b}$ and $L_{\text {Right }}:=L_{1-b}$. The adversary's goal is to distinguish whether the view he sees corresponds to $\left(L_{\text {Left }}, L_{\text {Right }}\right)=\left(L_{0}, L_{1}\right)$, i.e. $b=0$, or to $\left(L_{\text {Left }}, L_{\text {Right }}\right)=\left(L_{1}, L_{0}\right)$, i.e. $b=1$.<br><br>At eac...</code> |
1424
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
1425
+ ```json
1426
+ {
1427
+ "loss": "MultipleNegativesRankingLoss",
1428
+ "matryoshka_dims": [
1429
+ 768,
1430
+ 512,
1431
+ 256,
1432
+ 128,
1433
+ 64
1434
+ ],
1435
+ "matryoshka_weights": [
1436
+ 1,
1437
+ 1,
1438
+ 1,
1439
+ 1,
1440
+ 1
1441
+ ],
1442
+ "n_dims_per_step": -1
1443
+ }
1444
+ ```
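To make the loss configuration above concrete, the following sketch combines an in-batch ranking loss (in the spirit of `MultipleNegativesRankingLoss`) over each Matryoshka dimension with the listed weights. It is a simplified pure-Python illustration, not the library's code. The function names, the toy vectors and the similarity scale of 20 are assumptions, and `n_dims_per_step: -1` is taken here to mean that every listed dimension contributes at each step.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positive i is the correct match for anchor i
    and every other positive in the batch acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        peak = max(logits)  # subtract the maximum for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the base loss over embeddings truncated to each dimension."""
    loss = 0.0
    for dim, weight in zip(dims, weights):
        truncated_anchors = [v[:dim] for v in anchors]
        truncated_positives = [v[:dim] for v in positives]
        loss += weight * in_batch_ranking_loss(truncated_anchors, truncated_positives)
    return loss

# Toy batch of two (anchor, positive) pairs with 4-dimensional embeddings.
anchors = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
positives = [[0.9, 0.1, 0.3, 0.0], [0.1, 0.8, 0.0, 0.4]]
print(matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1]))
```

With equal weights, as in the configuration above, each truncated size contributes equally to the gradient, which encourages the leading dimensions to carry most of the ranking signal.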
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: epoch
+ - `per_device_train_batch_size`: 32
+ - `gradient_accumulation_steps`: 16
+ - `learning_rate`: 2e-05
+ - `num_train_epochs`: 4
+
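
As a quick worked example of how the batch settings above combine, the sketch below computes the number of examples consumed per optimizer step. The helper name and the single-device default are assumptions; the card does not state how many devices were used.

```python
def effective_batch_size(per_device_batch, accumulation_steps, num_devices=1):
    """Examples contributing to one optimizer step (num_devices=1 is an assumption)."""
    return per_device_batch * accumulation_steps * num_devices


# per_device_train_batch_size=32 and gradient_accumulation_steps=16 from the list above
print(effective_batch_size(32, 16))  # 512
```

Note that gradient accumulation only enlarges the batch seen by the optimizer. With an in-batch-negatives loss, the negatives for each query still come from the 32 examples in a single forward pass, not from all 512.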
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: epoch
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 32
+ - `per_device_eval_batch_size`: 8
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 16
+ - `eval_accumulation_steps`: None
+ - `learning_rate`: 2e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 4
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.0
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
+
+ ### Training Logs
+ | Epoch | Step | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
+ |:-----:|:----:|:----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|
+ | 1.0 | 1 | 0.6975 | 0.6930 | 0.6760 | 0.6960 | 0.6098 |
+ | 2.0 | 2 | 0.7258 | 0.7082 | 0.7062 | 0.6935 | 0.6231 |
+ | 3.0 | 3 | 0.7079 | 0.7270 | 0.7067 | 0.6935 | 0.6184 |
+ | 4.0 | 4 | 0.7032 | 0.7277 | 0.7285 | 0.6935 | 0.6316 |
+
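
The columns in the training log report nDCG@10 per embedding dimension. As a rough illustration of what that metric measures, here is a simplified sketch with hypothetical helper names. It assumes the input is the list of relevance labels of the retrieved documents in ranked order and that every relevant document appears somewhere in that list; the evaluator used during training may compute the ideal ranking differently.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain: gains are discounted by log2 of (1-based rank + 1)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances, k=10):
    """DCG of the given ranking divided by the DCG of the best possible ordering."""
    ideal_dcg = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0


print(ndcg_at_k([1, 0, 0]))  # relevant document ranked first: 1.0
print(ndcg_at_k([0, 0, 1]))  # relevant document ranked third: 0.5
```

A score around 0.7, as in the table, therefore indicates that relevant passages tend to appear near the top of the ranking but not always in first place.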
+
+ ### Framework Versions
+ - Python: 3.10.16
+ - Sentence Transformers: 3.3.1
+ - Transformers: 4.41.2
+ - PyTorch: 2.5.1+cu118
+ - Accelerate: 1.2.1
+ - Datasets: 2.19.1
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MatryoshkaLoss
+ ```bibtex
+ @misc{kusupati2024matryoshka,
+     title={Matryoshka Representation Learning},
+     author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
+     year={2024},
+     eprint={2205.13147},
+     archivePrefix={arXiv},
+     primaryClass={cs.LG}
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "_name_or_path": "BAAI/bge-base-en-v1.5",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.41.2",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.3.1",
+     "transformers": "4.41.2",
+     "pytorch": "2.5.1+cu118"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dbbd0553f28af0c94ea67c9e8873773258334a62eb7f0e500b525b70c81ec187
+ size 437951328
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
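
The three modules listed above form a pipeline: the transformer produces one vector per token, the pooling step reduces them to a single vector (the pooling config in this commit enables `pooling_mode_cls_token`), and the final step L2-normalizes the result. The following is a minimal pure-Python sketch of the last two stages, using hypothetical function names and toy 2-dimensional token vectors in place of real 768-dimensional transformer outputs.

```python
import math


def cls_pool(token_embeddings):
    """CLS pooling: keep only the vector of the first token ([CLS])."""
    return token_embeddings[0]


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def sentence_embedding(token_embeddings):
    """Pooling followed by normalization, mirroring modules 1 and 2 above."""
    return l2_normalize(cls_pool(token_embeddings))


# Toy token vectors: the first row plays the role of the [CLS] token.
toy_tokens = [[3.0, 4.0], [1.0, 1.0], [0.5, -2.0]]
print(sentence_embedding(toy_tokens))
```

Since the output has unit length, a plain dot product between two such embeddings equals their cosine similarity, which matches the `similarity_fn_name` of `cosine` in `config_sentence_transformers.json`.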
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": true
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff