G-AshwinKumar committed
Commit e0f9d93 (verified)
Parent: 38e047f

Update README.md

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -37,9 +37,9 @@ Aloe: A Family of Fine-tuned Open Healthcare LLMs
 ---
 
 
- Llama3.1-Aloe-70B-Beta is an **open healthcare LLM** (released with a permissive CC-BY license) achieving **state-of-the-art performance** on several medical tasks. Aloe Beta is made available in two model sizes: [8B](https://huggingface.co/HPAI-BSC/Llama31-Aloe-Beta-8B) and [70B](https://huggingface.co/HPAI-BSC/Llama31-Aloe-Beta-70B). Both models are trained using the same recipe. All necessary resources and details are made available below.
+ Llama3.1-Aloe-70B-Beta is an **open healthcare LLM** (released with a permissive CC-BY license) achieving **state-of-the-art performance** on several medical tasks. Aloe Beta is made available in two model sizes: [8B](https://huggingface.co/HPAI-BSC/Llama31-Aloe-Beta-8B) and [70B](https://huggingface.co/HPAI-BSC/Llama31-Aloe-Beta-70B). Both models are trained using the same recipe.
 
- Aloe is trained in 20 medical tasks, resulting in a robust and versatile healthcare model. Evaluations show Aloe models to be among the best in their class. When combined with a RAG system ([also released](https://github.com/HPAI-BSC/prompt_engine)), the 8B version gets close to the performance of closed models like MedPalm-2, GPT-4 and Medprompt. With the same RAG system, Aloe-Beta-70B outperforms those private alternatives, producing state-of-the-art results.
+ Aloe is trained on 20 medical tasks, resulting in a robust and versatile healthcare model. Evaluations show Aloe models to be among the best in their class. When combined with a RAG system ([also released](https://github.com/HPAI-BSC/prompt_engine)), the 8B version gets close to the performance of closed models like MedPalm-2 and GPT-4. With the same RAG system, Aloe-Beta-70B outperforms those private alternatives, producing state-of-the-art results.
 
 # Aloe-70B-Beta
 
@@ -349,9 +349,9 @@ We used [OpenRLHF](https://github.com/OpenRLHF/OpenRLHF) library. We aligned the
 To compare Aloe with the most competitive open models (both general purpose and healthcare-specific) we use popular healthcare datasets (PubMedQA, MedMCQA, MedQA, and MMLU restricted to its six medical tasks), together with the new and highly reliable CareQA. However, while MCQA benchmarks provide valuable insights into a model's ability to handle structured queries, they fall short of representing the full range of challenges faced in medical practice. Building upon this idea, Aloe-Beta represents the next step in the evolution of the Aloe Family, designed to broaden the scope beyond the multiple-choice question-answering tasks that define Aloe-Alpha.
 
 
- Benchmark results indicate that the training conducted on Aloe has boosted its performance, achieving results comparable to SOTA models like Llama3-OpenBioLLM, Llama3-Med42, MedPalm-2 and GPT-4. Llama31-Aloe-Beta-70B also outperforms the other existing medical models on the OpenLLM Leaderboard and in the evaluation of other medical tasks, such as Medical Factuality and Medical Treatment recommendations, among others. All these results make Llama31-Aloe-Beta-70B one of the best existing models for healthcare.
+ Benchmark results indicate that the training conducted on Aloe has boosted its performance, achieving results comparable to SOTA models like Llama3-OpenBioLLM, Llama3-Med42, MedPalm-2 and GPT-4. Llama3.1-Aloe-Beta-70B also outperforms the other existing medical models on the OpenLLM Leaderboard and in the evaluation of other medical tasks, such as Medical Factuality and Medical Treatment recommendations, among others. All these results make Llama3.1-Aloe-Beta-70B one of the best existing models for healthcare.
 
- With the help of prompting techniques, the performance of Llama3-Aloe-70B-Beta is significantly improved. Medprompting in particular provides a 4% increase in reported accuracy, after which Llama31-Aloe-Beta-70B outperforms all existing models evaluated without RAG.
+ With the help of prompting techniques, the performance of Llama3.1-Aloe-Beta-70B is significantly improved. Medprompting in particular provides a 4% increase in reported accuracy, after which Llama3.1-Aloe-Beta-70B outperforms all existing models evaluated without RAG.
 
 
 ## Environmental Impact
@@ -369,7 +369,7 @@ With the help of prompting techniques the performance of Llama3-Aloe-70B-Beta is
 
 
 ## Authors
- Aloe Beta has been developed by the [High Performance Artificial Intelligence](https://hpai.bsc.es/) research group at the [Barcelona Supercomputing Center - BSC](https://www.bsc.es/). Main authors are [Jordi Bayarri Planas](https://huggingface.co/JordiBayarri), Ashwin Kumar Gururajan and [Dario Garcia-Gasulla](https://huggingface.co/dariog). Red teaming efforts were led by Adrian Tormos.
+ Aloe Beta has been developed by the [High Performance Artificial Intelligence](https://hpai.bsc.es/) research group at the [Barcelona Supercomputing Center - BSC](https://www.bsc.es/). Main authors are [Jordi Bayarri Planas](https://huggingface.co/JordiBayarri), [Ashwin Kumar Gururajan](https://huggingface.co/G-AshwinKumar) and [Dario Garcia-Gasulla](https://huggingface.co/dariog). Red teaming efforts were led by Adrian Tormos.
 
 mailto:[email protected]
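
For context on the model card edited above, here is a minimal, hypothetical sketch of loading the 70B checkpoint linked in the README with the Hugging Face `transformers` library. Only the model id comes from the README; the precision, device placement, and example prompt are assumptions and are not part of this commit.

```python
# Hypothetical usage sketch for the checkpoint referenced in the README above.
# Model id taken from the README's 70B link; dtype and device_map are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HPAI-BSC/Llama31-Aloe-Beta-70B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed precision; a 70B model needs multiple GPUs or offloading
    device_map="auto",           # let accelerate place the weights
)

# Build a chat-style prompt using the model's bundled chat template.
messages = [{"role": "user", "content": "List common contraindications for metformin."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```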