Model Card for Microsoft-phi-4-Instruct-AutoRound-GPTQ-4bit

Model Overview

Model Name: Microsoft-phi-4-Instruct-AutoRound-GPTQ-4bit
Model Type: Instruction-tuned, Quantized GPT-4-based language model
Quantization: GPTQ 4-bit
Author: Satwik11
Hosted on: Hugging Face

Description

This model is a quantized version of the Microsoft phi-4 Instruct model, designed to deliver high performance while maintaining computational efficiency. By leveraging the GPTQ 4-bit quantization method, it enables deployment in environments with limited resources while retaining a high degree of accuracy.

The model is fine-tuned for instruction-following tasks, making it ideal for applications in conversational AI, question answering, and general-purpose text generation.

Key Features

  • Instruction-tuned: Fine-tuned to follow human-like instructions effectively.
  • Quantized for Efficiency: Uses GPTQ 4-bit quantization to reduce memory requirements and inference latency.
  • Pre-trained Base: Built on the Microsoft phi-4 framework, ensuring state-of-the-art performance on NLP tasks.

Use Cases

  • Chatbots and virtual assistants.
  • Summarization and content generation.
  • Research and educational applications.
  • Semantic search and knowledge retrieval.

Model Details

Architecture

  • Base Model: Microsoft phi-4
  • Quantization Technique: GPTQ (4-bit)
  • Language: English
  • Training Objective: Instruction-following fine-tuning
Downloads last month
283
Safetensors
Model size
2.85B params
Tensor type
I32
BF16
FP16
Inference API
Unable to determine this model's library. Check the docs .

Model tree for Satwik11/Microsoft-phi-4-Instruct-AutoRound-GPTQ-4bit

Base model

microsoft/phi-4
Quantized
(82)
this model