Haihao Shen

Haihao

https://github.com/intel/auto-round

AI & ML interests

LLM quantization, sparsity, and acceleration

Recent Activity

reacted to wenhuach's post with 🚀 11 days ago

This week, OPEA Space released several new INT4 models, including nvidia/Llama-3.1-Nemotron-70B-Instruct-HF, allenai/OLMo-2-1124-13B-Instruct, THUDM/glm-4v-9b, AIDC-AI/Marco-o1, and several others. Let us know which models you'd like prioritized for quantization, and we'll do our best to make it happen! https://huggingface.co/OPEA

authored a paper 25 days ago

A dynamic parallel method for performance optimization on hybrid CPUs

upvoted a paper 25 days ago

A dynamic parallel method for performance optimization on hybrid CPUs

Articles

Building Cost-Efficient Enterprise RAG applications with Intel Gaudi 2 and Intel Xeon

Accelerate StarCoder with 🤗 Optimum Intel on Xeon: Q8/Q4 and Speculative Decoding

Papers 10

arxiv:2411.19542

arxiv:2311.16133

arxiv:2311.00502

arxiv:2310.10944

Models

None public yet

Datasets

None public yet