munish0838 commited on
Commit
c528755
·
verified ·
1 Parent(s): 9986365

Create README.md

Browse files
Files changed (1) hide show
  1. README.md +87 -0
README.md ADDED
@@ -0,0 +1,87 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ library_name: transformers
3
+ tags:
4
+ - text-generation
5
+ - pytorch
6
+ - Lynx
7
+ - Patronus AI
8
+ - evaluation
9
+ - hallucination-detection
10
+ license: llama3
11
+ language:
12
+ - en
13
+ base_model: PatronusAI/Llama-3-Patronus-Lynx-8B-Instruct
14
+ pipeline_tag: text-generation
15
+ ---
16
+
17
+ # QuantFactory/Llama-3-Patronus-Lynx-8B-Instruct-GGUF
18
+ This is quantized version of [PatronusAI/Llama-3-Patronus-Lynx-8B-Instruct](https://huggingface.co/PatronusAI/Llama-3-Patronus-Lynx-8B-Instruct) created using llama.cpp
19
+
20
+ # Model Description
21
+
22
+ Lynx is an open-source hallucination evaluation model. Patronus-Lynx-8B-Instruct was trained on a mix of datasets including CovidQA, PubmedQA, DROP, RAGTruth.
23
+ The datasets contain a mix of hand-annotated and synthetic data. The maximum sequence length is 8000 tokens.
24
+
25
+
26
+ ## Model Details
27
+
28
+ - **Model Type:** Patronus-Lynx-8B-Instruct is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct model.
29
+ - **Language:** Primarily English
30
+ - **Developed by:** Patronus AI
31
+ - **License:** [https://llama.meta.com/llama3/license](https://llama.meta.com/llama3/license)
32
+
33
+ ### Model Sources
34
+
35
+ <!-- Provide the basic links for the model. -->
36
+
37
+ - **Repository:** [https://github.com/patronus-ai/Lynx-hallucination-detection](https://github.com/patronus-ai/Lynx-hallucination-detection)
38
+
39
+
40
+ ## How to Get Started with the Model
41
+ The model is fine-tuned to be used to detect hallucinations in a RAG setting. Provided a document, question and answer, the model can evaluate whether the answer is faithful to the document.
42
+
43
+ To use the model, we recommend using the prompt we used for fine-tuning:
44
+
45
+ ```
46
+ PROMPT = """
47
+ Given the following QUESTION, DOCUMENT and ANSWER you must analyze the provided answer and determine whether it is faithful to the contents of the DOCUMENT. The ANSWER must not offer new information beyond the context provided in the DOCUMENT. The ANSWER also must not contradict information provided in the DOCUMENT. Output your final verdict by strictly following this format: "PASS" if the answer is faithful to the DOCUMENT and "FAIL" if the answer is not faithful to the DOCUMENT. Show your reasoning.
48
+
49
+ --
50
+ QUESTION (THIS DOES NOT COUNT AS BACKGROUND INFORMATION):
51
+ {question}
52
+
53
+ --
54
+ DOCUMENT:
55
+ {context}
56
+
57
+ --
58
+ ANSWER:
59
+ {answer}
60
+
61
+ --
62
+
63
+ Your output should be in JSON FORMAT with the keys "REASONING" and "SCORE":
64
+ {{"REASONING": <your reasoning as bullet points>, "SCORE": <your final score>}}
65
+ """
66
+ ```
67
+
68
+ The model will output the score as 'PASS' if the answer is faithful to the document or FAIL if the answer is not faithful to the document.
69
+
70
+
71
+ ## Training Details
72
+
73
+ The model was finetuned for 3 epochs using H100s on dataset of size 2400. We use [lion](https://github.com/lucidrains/lion-pytorch) optimizer with lr=5.0e-7. For more details on data generation, please check out our Github repo.
74
+
75
+ ### Training Data
76
+
77
+ We train on 2400 samples consisting of CovidQA, PubmedQA, DROP and RAGTruth samples. For datasets that do not contain hallucinated samples, we generate perturbations to introduce hallucinations in the data. For more details about the data generation process, refer to the paper.
78
+
79
+ ## Evaluation
80
+
81
+ The model was evaluated on [PatronusAI/HaluBench](https://huggingface.co/datasets/PatronusAI/HaluBench).
82
+
83
+ It outperforms GPT-3.5-Turbo, GPT-4-Turbo, GPT-4o and Claude Sonnet.
84
+
85
+
86
+ ## Model Card Contact
87
+ [@sunitha-ravi](https://huggingface.co/sunitha-ravi)