G-AshwinKumar committed
Commit e0f9d93 (verified)
Parent: 38e047f

Update README.md

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -37,9 +37,9 @@ Aloe: A Family of Fine-tuned Open Healthcare LLMs
 ---
 
 
- Llama3.1-Aloe-70B-Beta is an **open healthcare LLM** (released with a permissive CC-BY license) achieving **state-of-the-art performance** on several medical tasks. Aloe Beta is made available in two model sizes: [8B](https://huggingface.co/HPAI-BSC/Llama31-Aloe-Beta-8B) and [70B](https://huggingface.co/HPAI-BSC/Llama31-Aloe-Beta-70B). Both models are trained using the same recipe. All necessary resources and details are made available below.
+ Llama3.1-Aloe-70B-Beta is an **open healthcare LLM** (released with a permissive CC-BY license) achieving **state-of-the-art performance** on several medical tasks. Aloe Beta is made available in two model sizes: [8B](https://huggingface.co/HPAI-BSC/Llama31-Aloe-Beta-8B) and [70B](https://huggingface.co/HPAI-BSC/Llama31-Aloe-Beta-70B). Both models are trained using the same recipe.
 
- Aloe is trained in 20 medical tasks, resulting in a robust and versatile healthcare model. Evaluations show Aloe models to be among the best in their class. When combined with a RAG system ([also released](https://github.com/HPAI-BSC/prompt_engine)), the 8B version gets close to the performance of closed models like MedPalm-2, GPT-4 and Medprompt. With the same RAG system, Aloe-Beta-70B outperforms those private alternatives, producing state-of-the-art results.
+ Aloe is trained on 20 medical tasks, resulting in a robust and versatile healthcare model. Evaluations show Aloe models to be among the best in their class. When combined with a RAG system ([also released](https://github.com/HPAI-BSC/prompt_engine)), the 8B version gets close to the performance of closed models like MedPalm-2 and GPT-4. With the same RAG system, Aloe-Beta-70B outperforms those private alternatives, producing state-of-the-art results.
 
 # Aloe-70B-Beta
 
@@ -349,9 +349,9 @@ We used [OpenRLHF](https://github.com/OpenRLHF/OpenRLHF) library. We aligned the
 To compare Aloe with the most competitive open models (both general purpose and healthcare-specific) we use popular healthcare datasets (PubMedQA, MedMCQA, MedQA, and MMLU restricted to its six medical tasks), together with the new and highly reliable CareQA. However, while MCQA benchmarks provide valuable insights into a model's ability to handle structured queries, they fall short of representing the full range of challenges faced in medical practice. Building upon this idea, Aloe-Beta represents the next step in the evolution of the Aloe Family, designed to broaden the scope beyond the multiple-choice question-answering tasks that define Aloe-Alpha.
 
 
- Benchmark results indicate that the training conducted on Aloe has boosted its performance, achieving results comparable to SOTA models like Llama3-OpenBioLLM, Llama3-Med42, MedPalm-2 and GPT-4. Llama31-Aloe-Beta-70B also outperforms the other existing medical models on the OpenLLM Leaderboard and in the evaluation of other medical tasks, such as Medical Factuality and Medical Treatment recommendations, among others. All these results make Llama31-Aloe-Beta-70B one of the best existing models for healthcare.
+ Benchmark results indicate that the training conducted on Aloe has boosted its performance, achieving results comparable to SOTA models like Llama3-OpenBioLLM, Llama3-Med42, MedPalm-2 and GPT-4. Llama3.1-Aloe-Beta-70B also outperforms the other existing medical models on the OpenLLM Leaderboard and in the evaluation of other medical tasks, such as Medical Factuality and Medical Treatment recommendations, among others. All these results make Llama3.1-Aloe-Beta-70B one of the best existing models for healthcare.
 
- With the help of prompting techniques, the performance of Llama3-Aloe-70B-Beta is significantly improved. Medprompting in particular provides a 4% increase in reported accuracy, after which Llama31-Aloe-Beta-70B outperforms all existing models evaluated without RAG.
+ With the help of prompting techniques, the performance of Llama3.1-Aloe-Beta-70B is significantly improved. Medprompting in particular provides a 4% increase in reported accuracy, after which Llama3.1-Aloe-Beta-70B outperforms all existing models evaluated without RAG.
 
 
 ## Environmental Impact
@@ -369,7 +369,7 @@ With the help of prompting techniques the performance of Llama3-Aloe-70B-Beta is
 
 
 ## Authors
- Aloe Beta has been developed by the [High Performance Artificial Intelligence](https://hpai.bsc.es/) research group at the [Barcelona Supercomputing Center - BSC](https://www.bsc.es/). Main authors are [Jordi Bayarri Planas](https://huggingface.co/JordiBayarri), Ashwin Kumar Gururajan and [Dario Garcia-Gasulla](https://huggingface.co/dariog). Red teaming efforts were led by Adrian Tormos.
+ Aloe Beta has been developed by the [High Performance Artificial Intelligence](https://hpai.bsc.es/) research group at the [Barcelona Supercomputing Center - BSC](https://www.bsc.es/). Main authors are [Jordi Bayarri Planas](https://huggingface.co/JordiBayarri), [Ashwin Kumar Gururajan](https://huggingface.co/G-AshwinKumar) and [Dario Garcia-Gasulla](https://huggingface.co/dariog). Red teaming efforts were led by Adrian Tormos.
 
 mailto:[email protected]
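
For context on the model card edited above, here is a minimal, hypothetical sketch of loading the 70B checkpoint linked in the README with the Hugging Face `transformers` library. Only the model id comes from the README; the precision, device placement, and example prompt are assumptions and are not part of this commit.

```python
# Hypothetical usage sketch for the checkpoint referenced in the README above.
# Model id taken from the README's 70B link; dtype and device_map are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HPAI-BSC/Llama31-Aloe-Beta-70B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed precision; a 70B model needs multiple GPUs or offloading
    device_map="auto",           # let accelerate place the weights
)

# Build a chat-style prompt using the model's bundled chat template.
messages = [{"role": "user", "content": "List common contraindications for metformin."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```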