---
language:
- en
datasets:
- natural_instructions
- the_pile
- cot
- Muennighoff/P3
tags:
- gpt
pipeline_tag: text-generation
inference:
  parameters:
    temperature: 0.1
widget:
- text: "Is this review positive or negative? Review: Best cast iron skillet you will ever buy. Answer:"
  example_title: "Sentiment analysis"
- text: "Where is Zurich? Ans:"
  example_title: "Question Answering"
---
|
|
|
# Quick Start
|
|
|
```python
from transformers import pipeline

# Load GPT-JT-6B-v0 as a text-generation pipeline from the Hugging Face Hub.
pipe = pipeline(model='togethercomputer/GPT-JT-6B-v0')

# Generate a completion for a prompt.
pipe("Where is Zurich? Ans:")
```
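
The hosted inference widget decodes with `temperature: 0.1` (see the metadata above). Below is a minimal sketch of passing equivalent generation settings to the pipeline directly; `do_sample` and `max_new_tokens` are illustrative choices, not values taken from the card:

```python
from transformers import pipeline

pipe = pipeline(model='togethercomputer/GPT-JT-6B-v0')

# temperature=0.1 mirrors the inference settings declared in the card metadata.
# do_sample and max_new_tokens are assumptions for illustration only.
pipe(
    "Is this review positive or negative? Review: Best cast iron skillet you will ever buy. Answer:",
    do_sample=True,
    temperature=0.1,
    max_new_tokens=16,
)
```

With sampling enabled, a temperature of 0.1 keeps the output close to greedy decoding, which suits short classification-style prompts like the widget examples.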
|
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
|
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_togethercomputer__GPT-JT-6B-v0).
|
|
|
| Metric | Value | |
|
|-----------------------|---------------------------| |
|
| Avg. | 38.37 | |
|
| ARC (25-shot) | 42.06 | |
|
| HellaSwag (10-shot) | 67.96 | |
|
| MMLU (5-shot) | 49.34 | |
|
| TruthfulQA (0-shot) | 38.89 | |
|
| Winogrande (5-shot)   | 64.80                     |
|
| GSM8K (5-shot) | 1.21 | |
|
| DROP (3-shot) | 4.31 | |
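
The `Avg.` row appears to be the unweighted mean of the seven task scores; a quick sanity check in Python:

```python
# Unweighted mean of the seven task scores reported above.
scores = [42.06, 67.96, 49.34, 38.89, 64.80, 1.21, 4.31]
print(round(sum(scores) / len(scores), 2))  # 38.37
```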
|
|