Text Generation · Transformers · ONNX · English · gpt_neox
vriveras committed · Commit 4885ca6 · 1 Parent(s): 2ba3811

Adding optimization section in model card.

Files changed (1): README.md (+21 −2)
README.md CHANGED

@@ -7,7 +7,8 @@ inference: false
 datasets:
 - databricks/databricks-dolly-15k
 ---
-# dolly-v2-7b Model Card
+# dolly-v2-7b Olive Optimized Model Card
+
 ## Summary
 
 Databricks’ `dolly-v2-7b`, an instruction-following large language model trained on the Databricks machine learning platform
@@ -27,6 +28,22 @@ running inference for various GPU configurations.
 
 **Owner**: Databricks, Inc.
 
+## Olive Optimization
+
+This repo hosts model files that may be loaded as an [`ORTModelForCausalLM`](https://github.com/huggingface/optimum/blob/a6951c17c3450e1dea99617aa842334f4e904392/optimum/onnxruntime/modeling_decoder.py#L623) when using Python with [🤗 Optimum](https://huggingface.co/docs/optimum/onnxruntime/overview). Alternatively, the ONNX models may be composed into a custom pipeline in any language that supports ONNX Runtime & DirectML. If you choose to use ONNX Runtime & DirectML outside of Python, you will need to provide your own implementation of the tokenizer.
+
+| Model                                    | Impl                              |
+| ---------------------------------------- | --------------------------------- |
+| **dolly-v2-7b decoder merged with past** | **ONNX Model**                    |
+| Tokenizer                                | `AutoTokenizer` (🤗 Transformers) |
+
+The ONNX model above was processed with the [Olive](https://github.com/microsoft/olive) toolchain using the [Olive + Dolly V2 with DirectML Sample](https://github.com/microsoft/Olive/tree/main/examples/directml/dolly_v2). The Olive sample performs the following steps:
+
+1. Run the [OptimumConversion Pass](https://microsoft.github.io/Olive/api/passes.html#optimumconversion).
+2. Run the [OrtTransformersOptimization Pass](https://microsoft.github.io/Olive/api/passes.html#orttransformersoptimization), which leverages the [ONNX Runtime Transformer Model Optimization Tool](https://onnxruntime.ai/docs/performance/transformers-optimization.html). This step executes several time-consuming graph transformations, such as fusing subgraphs into LayerNorm.
+3. Convert the optimized ONNX models from FLOAT32 to FLOAT16.
+4. Run the [OptimumMerging Pass](https://microsoft.github.io/Olive/api/passes.html#optimummerging) to leverage caching and reduce memory usage by merging decoder_model.onnx and decoder_with_past_model.onnx.
+
 ## Model Overview
 `dolly-v2-7b` is a 6.9 billion parameter causal language model created by [Databricks](https://databricks.com/) that is derived from
 [EleutherAI’s](https://www.eleuther.ai/) [Pythia-6.9b](https://huggingface.co/EleutherAI/pythia-6.9b) and fine-tuned
@@ -173,4 +190,6 @@ but a robust statement as to the sources of these variations requires further study.
 | databricks/dolly-v1-6b  | 0.41  | 0.62963  | 0.643252 | 0.676758 | 0.384812 | 0.773667 | 0.687768 | 0.583431 |
 | EleutherAI/gpt-neox-20b | 0.402 | 0.683923 | 0.656669 | 0.7142   | 0.408703 | 0.784004 | 0.695413 | 0.602236 |
 
-# Happy Hacking!
+# Happy Hacking!
+
+This model is an optimized version of Databricks, Inc.’s [databricks/dolly-v2-7b](https://huggingface.co/databricks/dolly-v2-7b).
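The four steps above are driven by an Olive pass configuration. A heavily abbreviated sketch of what such a config might look like follows; the pass names for steps 1, 2, and 4 come from the links above, while the FP16 pass name and all option values are illustrative assumptions, not copied from the sample:

```json
{
  "passes": {
    "convert":  { "type": "OptimumConversion" },
    "optimize": {
      "type": "OrtTransformersOptimization",
      "config": { "model_type": "gpt_neox" }
    },
    "to_fp16":  { "type": "OnnxFloatToFloat16" },
    "merge":    { "type": "OptimumMerging" }
  }
}
```

Consult the linked DirectML sample for the exact configuration used to produce the files in this repo.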