Abstract
Language models have been effective in a wide range of applications, yet the most sophisticated models are often proprietary. For example, GPT-4 by OpenAI and various models by Anthropic are expensive and consume substantial energy. In contrast, the open-source community has produced competitive models, such as Llama3. Furthermore, niche-specific smaller language models, such as those tailored for legal, medical, or financial tasks, have outperformed their proprietary counterparts. This paper introduces a novel approach that employs functional tokens to integrate multiple open-source models, each optimized for particular tasks. Our newly developed Octopus v4 model leverages functional tokens to direct each user query to the most appropriate vertical model and to reformat the query for the best performance. Octopus v4, an evolution of the Octopus v1, v2, and v3 models, excels at selection, parameter understanding, and reformatting. Additionally, we explore the use of graphs as a versatile data structure that coordinates multiple open-source models by harnessing the capabilities of the Octopus model and functional tokens. Use our open-source GitHub (https://www.nexa4ai.com/) to try the Octopus v4 models (https://huggingface.co/NexaAIDev/Octopus-v4), and contribute to a larger graph of language models. By activating models with fewer than 10B parameters, we achieved a state-of-the-art MMLU score of 74.8 among models of the same size class.
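The routing idea in the abstract can be sketched as a small graph with one master node and several worker nodes. This is a minimal illustration under stated assumptions, not the Octopus v4 implementation: the token names (`<fn_0>`, `<fn_end>`), the router output format `<fn_K>('query')<fn_end>`, the `ModelGraph` class, and the stand-in worker functions are all hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# Assumed (hypothetical) router output format: <fn_K>('reformatted query')<fn_end>
ROUTER_OUTPUT = re.compile(r"^(<fn_\d+>)\('(.*)'\)<fn_end>$", re.DOTALL)


@dataclass
class ModelGraph:
    """Directed graph with one master node and an edge to each worker node."""

    # functional token -> (worker name, callable standing in for the worker model)
    edges: Dict[str, Tuple[str, Callable[[str], str]]] = field(default_factory=dict)

    def add_worker(self, token: str, name: str, model: Callable[[str], str]) -> None:
        """Register a worker node reachable through the given functional token."""
        self.edges[token] = (name, model)

    def dispatch(self, router_output: str) -> Tuple[str, str]:
        """Parse the router output and forward the reformatted query to a worker."""
        match = ROUTER_OUTPUT.match(router_output.strip())
        if match is None:
            raise ValueError(f"unrecognized router output: {router_output!r}")
        token, query = match.groups()
        if token not in self.edges:
            raise KeyError(f"no worker registered for token {token!r}")
        name, model = self.edges[token]
        return name, model(query)


# Stand-in workers; a real deployment would call domain-specific language models.
graph = ModelGraph()
graph.add_worker("<fn_0>", "math", lambda q: f"[math] {q}")
graph.add_worker("<fn_1>", "law", lambda q: f"[law] {q}")

# The string below plays the role of what a router model might emit.
name, answer = graph.dispatch("<fn_0>('What is the derivative of x^3 at x = 2?')<fn_end>")
print(name, answer)
```

In this sketch the router only has to emit a token and a reformatted query; the graph owns the mapping from token to worker, so workers can be added without touching the parsing logic.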
Community
That is really helpful feedback. We will adjust the website soon! Sorry for the inconvenience.
I like this metaphor
What ethical considerations are taken into account when developing and deploying multifaceted models like Octopus v4, particularly in sensitive domains like healthcare and finance?
We are working on safety evaluation, thanks for pointing this out.
Definitely, GGUF format is coming soon
Here's a plain-English rewrite of the paper: https://www.aimodels.fyi/papers/arxiv/octopus-v4-graph-language-models
This is truly impressive. Could you share some insights on the primary challenges you faced in developing Octopus v4, particularly with integrating various open-source models?
Finding the best domain-specific LLM can be challenging. There is no leaderboard that ranks them, so we built our own: https://huggingface.co/spaces/NexaAIDev/domain_llm_leaderboard
The hybrid approach to achieve high MMLU is interesting :>
Quick questions:
- How could I build a graph network following this idea? Do we need to set up query services for all of those model providers?
- Suppose the connected models' parameters are updated from time to time. Will this affect the decisions of your graph search model (v4)? Keeping the model fresh seems challenging.
- We will update our build_graph tutorial at https://github.com/NexaAI/octopus-v4 soon. We have built a token_mapping for you; see https://github.com/NexaAI/octopus-v4/blob/main/utils.py
- In our current design, the orchestration model (Octopus-V4) is trained independently of the worker nodes (domain-specific LLMs), so at the moment updates to the workers have no impact. We will consider this in our future design.
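One way to picture why a worker update need not affect routing is a lookup table from functional tokens to worker models. The sketch below is an illustration only and is not copied from `utils.py`: the token names, the model identifiers, and the helpers `resolve_worker` and `swap_worker` are hypothetical.

```python
from typing import Dict

# Hypothetical mapping from functional token to a worker model identifier.
token_mapping: Dict[str, str] = {
    "<fn_0>": "example-org/math-7b",
    "<fn_1>": "example-org/law-7b",
}


def resolve_worker(token: str, mapping: Dict[str, str]) -> str:
    """Return the worker model registered for a functional token."""
    if token not in mapping:
        raise KeyError(f"no worker registered for token {token!r}")
    return mapping[token]


def swap_worker(mapping: Dict[str, str], token: str, new_model_id: str) -> Dict[str, str]:
    """Return a copy of the mapping with one worker replaced.

    The set of tokens is unchanged, so a router trained to emit those
    tokens would not need retraining in this simplified picture.
    """
    if token not in mapping:
        raise KeyError(f"cannot swap unknown token {token!r}")
    updated = dict(mapping)
    updated[token] = new_model_id
    return updated


# Replace the math worker with a newer (hypothetical) checkpoint.
updated = swap_worker(token_mapping, "<fn_0>", "example-org/math-7b-v2")
print(resolve_worker("<fn_0>", updated))
```

Under these assumptions only the table entry changes when a worker is refreshed; whether routing quality stays the same after a large change in a worker's abilities is a separate question that this sketch does not address.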
The idea and demo are fabulous!
I want to use Octopus v4 from your GitHub for my data science projects. Can you briefly describe what I can use it for?
How does Octopus v4's use of functional tokens enhance handling of multi-domain queries compared to larger, single-model systems?
Great model, looking forward to using it for on-device training!
Octopus v4: Revolutionizing Language Models with Graph-Based AI
Links:
Subscribe: https://www.youtube.com/@Arxflix
Twitter: https://x.com/arxflix
LMNT (Partner): https://lmnt.com/