---
title: README
emoji: 🦀
colorFrom: indigo
colorTo: purple
sdk: static
pinned: false
---
Arm’s AI development resources ensure you can deploy at pace, achieving best performance on Arm by default. Our aim is to make your AI development easier, ensuring integration with all major operating systems and AI frameworks, enabling portability for deploying AI on Arm at scale.
Below are key resources and content from Arm, including our software libraries and tools, that help you optimize for Arm architectures and pass on significant performance uplift for models running on Arm-based devices, from traditional ML and computer vision workloads to small and large language models.
The availability of smaller LLMs that enable fundamental text-based generative AI workloads, such as Llama 3.2 1B and 3B, is critical to enabling AI inference at scale. Running the new Llama 3.2 3B LLM on Arm-powered mobile devices through the Arm CPU-optimized kernel leads to a 5x improvement in prompt processing and a 3x improvement in token generation, achieving 19.92 tokens per second in the generation phase. This means lower latency when processing AI workloads on the device and a far faster overall user experience. Also, the more AI is processed at the edge, the more power is saved by not sending data to and from the cloud, leading to energy and cost savings.
Alongside running small models at the edge, we can also run larger models, such as Llama 3.2 11B and 90B, in the cloud. The 11B and 90B models are a great fit for CPU-based inference workloads in the cloud that generate text and image, as our data on Arm Neoverse V2 shows. When we run the 11B image and text model on the Arm-based AWS Graviton4, we achieve 29.3 tokens per second in the generation phase. Given that human reading speed is around 5 tokens per second, that is nearly six times faster than a person can read.
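To put the quoted throughput figures in context, the comparison with reading speed can be worked through in a few lines of Python. This is only an illustrative sketch: the three numbers are taken from the text above, while the constant and function names are our own and not part of any Arm tool or library.

```python
# Figures quoted in the text above; all names here are illustrative only.
READING_SPEED_TPS = 5.0    # approximate human reading speed (tokens/s)
GRAVITON4_11B_TPS = 29.3   # Llama 3.2 11B on AWS Graviton4, generation phase
MOBILE_3B_TPS = 19.92      # Llama 3.2 3B on Arm-powered mobile, generation phase


def speedup_over_reading(tokens_per_second: float) -> float:
    """Return how many times faster generation is than human reading speed."""
    return tokens_per_second / READING_SPEED_TPS


def seconds_to_generate(num_tokens: int, tokens_per_second: float) -> float:
    """Return the time needed to generate num_tokens at a steady rate."""
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    for label, tps in [
        ("Llama 3.2 11B on Graviton4", GRAVITON4_11B_TPS),
        ("Llama 3.2 3B on mobile", MOBILE_3B_TPS),
    ]:
        print(
            f"{label}: {speedup_over_reading(tps):.2f}x reading speed, "
            f"{seconds_to_generate(100, tps):.1f} s per 100 tokens"
        )
```

On these figures, the cloud result is roughly 5.9 times reading speed and the mobile result roughly 4 times, so in both cases text is generated faster than a user would typically read it.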
Arm Kleidi is a targeted software suite, expediting optimizations for any framework and enabling accelerations for billions of AI workloads across Arm-based devices everywhere. Application developers achieve top performance by default, with no additional work or investment in new skills or tools training required.
Useful Resources on Arm Kleidi:
Our foundation of pervasiveness, flexible performance and energy efficiency means that Arm CPUs are already the hardware of choice for a variety of AI workloads. Alongside Arm-based servers excelling with LLM workloads, the Arm Kleidi software suite and optimizations to our software libraries, combined with the open-source llama.cpp project, enable generative AI to run efficiently on mobile devices.
Our work includes a virtual assistant demo that initially used Meta’s Llama2-7B LLM on mobile via a chat-based application and has since expanded to include the Llama3 model and Phi-3 3.8B. You can learn more about the technical implementation of the demos here.
Find out more about the community contributions that make this happen:
These advancements are also highlighted in our Learning Paths below.
Arm Neoverse platforms give our infrastructure partners access to leading performance, efficiency and unparalleled flexibility to innovate in pursuit of the optimal solutions for emerging AI workloads. The flexibility of the Neoverse platform enables our innovative hardware partners to closely integrate additional compute acceleration into their designs, creating a new generation of built-for-AI custom data center silicon.
Read the latest on AI-on-Neoverse:
Tutorials designed to help you develop quality Arm software faster.
Contribute to our Learning Paths: suggest a new Learning Path or create one yourself with support from the Arm community.
Note: The data collated here is sourced from Arm and third parties. While Arm uses reasonable efforts to keep this information accurate, Arm does not warrant (express or implied) or provide any guarantee of data correctness due to the ever-evolving AI and software landscape. Any links to third party sites and resources are provided for ease and convenience. Your use of such third-party sites and resources is subject to the third party’s terms of use, and use is at your own risk.