---
license: apache-2.0
language:
- en
metrics:
- accuracy
base_model:
- meta-llama/Llama-3.1-8B-Instruct
pipeline_tag: visual-question-answering
---

# MMEvol Model Card

## Model Details

The pretrained projector weights and the instruction-tuned weights are listed below.

| Model | Pretrained Projector | Base LLM | PT Data | IT Data | Download |
| ---------------- | -------------------- | --------- | ------------------------------------------------------------ | ------- | -------- |
| MMEvol-LLaMA3-8B | [mm_projector](https://huggingface.co/Tongyi-ConvAI/MMEvol/tree/main/llama3) | LLaMA3-8B | [LLaVA-Pretrain](https://huggingface.co/datasets/liuhaotian/LLaVA-Pretrain) | MMEvol | [ckpt](https://huggingface.co/Tongyi-ConvAI/MMEvol/tree/main/llama3) |

## Training dataset

- [480k MMEvol Curated Instruction Tuning Data](https://huggingface.co/datasets/Tongyi-ConvAI/MMEvol)

## Performance

### Benchmarks supported by VLMEvalKit (OpenCompass)

| Model | MME_C | MMStar | HallBench | MathVista_mini | MMMU_val | AI2D | POPE | BLINK | RWQA |
| ---------------- | ----- | ------ | --------- | -------------- | -------- | ---- | ---- | ----- | ---- |
| MMEvol-LLaMA3-8B | 47.8 | 50.1 | 62.3 | 50.0 | 40.8 | 73.9 | 86.8 | 46.4 | 62.6 |

### Benchmarks not supported by VLMEvalKit (VQADataSet)

| Model | VQA_v2 | GQA | MIA | MMSInst |
| ---------------- | ------ | ---- | ---- | ------- |
| MMEvol-LLaMA3-8B | 83.4 | 65.0 | 78.8 | 32.3 |

## Paper or resources for more information

- Page: https://mmevol.github.io/
- arXiv: https://arxiv.org/pdf/2409.05840

## License

Llama 3 is licensed under the LLAMA 3 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved.

## Contact us if you have any questions

- Run Luo: r.luo@siat.ac.cn
- Haonan Zhang: zchiowal@gmail.com
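
## Downloading the weights (sketch)

The projector and checkpoint links in the Model Details table both point at the `llama3` folder of the `Tongyi-ConvAI/MMEvol` repository. The following is a minimal sketch of how that folder might be fetched with the `huggingface_hub` package; it is not an official loading script. The helper names `build_download_kwargs` and `download_checkpoint` and the default target directory `./MMEvol` are assumptions introduced for illustration, and the file layout inside `llama3` is not described in this card.

```python
from pathlib import Path

# Repository and subfolder taken from the links in the Model Details table.
REPO_ID = "Tongyi-ConvAI/MMEvol"
SUBFOLDER = "llama3"


def build_download_kwargs(local_dir="./MMEvol"):
    """Build keyword arguments for huggingface_hub.snapshot_download.

    Hypothetical helper: it restricts the download to the llama3/ folder.
    """
    return {
        "repo_id": REPO_ID,
        "repo_type": "model",
        "allow_patterns": [f"{SUBFOLDER}/*"],
        "local_dir": str(Path(local_dir)),
    }


def download_checkpoint(local_dir="./MMEvol"):
    """Download the llama3/ folder and return its local path.

    Hypothetical helper: requires `pip install huggingface_hub` and
    network access. The import is deferred so that defining the helper
    does not require the package.
    """
    from huggingface_hub import snapshot_download

    root = snapshot_download(**build_download_kwargs(local_dir))
    return Path(root) / SUBFOLDER
```

Calling `download_checkpoint()` should leave the files under `./MMEvol/llama3`. Loading them for inference depends on the MMEvol codebase linked from the project page, which this sketch does not cover.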