Model Card for Model ID

We prune the Phi-2 (2.7B) model to 35% sparsity (1.8B) and then fine-tune on 100K sequences of length 2048 from the C4 dataset (https://huggingface.co/datasets/c4). Our pruning algorithm is described in the paper Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes. Code for the pruning algorithm can be found here.
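
As a quick sanity check on the sizes quoted above, the arithmetic can be sketched as follows. This assumes that "35% sparsity" means 35% of the original parameters are removed, and it uses the rounded 2.7B figure rather than an exact parameter count. The helper name `remaining_params` is illustrative and not part of the released code.

```python
def remaining_params(original_params: float, sparsity: float) -> float:
    """Return the number of parameters left after removing a `sparsity` fraction."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    return original_params * (1.0 - sparsity)


# Rounded figures from the model card; exact counts will differ slightly.
phi2_params = 2.7e9
pruned = remaining_params(phi2_params, sparsity=0.35)
print(f"{pruned / 1e9:.1f}B parameters remain")
```

Under these assumptions the result rounds to about 1.8B, in line with the figure quoted above.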

Model Details

This model is derived by pruning the Phi-2 model.

Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

  • Developed by: Lucio Dery, Steven Kolawole, Jean-François Kagy, Virginia Smith, Graham Neubig, Ameet Talwalkar
  • Model type: Decoder-only
  • Language(s) (NLP): English
  • License: MIT

Training Details

Training Data

Fine-tuned on 100K sequences of length 2048 from the C4 dataset (https://huggingface.co/datasets/c4).
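
To make the size of the fine-tuning set concrete, the sketch below computes the token budget implied by 100K sequences of 2048 tokens and shows one simple way to cut a token stream into fixed-length sequences. The packing strategy (fill a buffer, emit it when full, drop any short remainder) is an assumption for illustration; the model card does not say how the sequences were built. `pack_into_sequences` is a hypothetical helper.

```python
from typing import Iterable, Iterator, List

SEQ_LEN = 2048           # sequence length from the model card
NUM_SEQUENCES = 100_000  # "100K" sequences from the model card


def pack_into_sequences(token_ids: Iterable[int], seq_len: int) -> Iterator[List[int]]:
    """Yield consecutive chunks of exactly `seq_len` tokens.

    A trailing chunk shorter than `seq_len` is dropped (an assumption).
    """
    buffer: List[int] = []
    for tok in token_ids:
        buffer.append(tok)
        if len(buffer) == seq_len:
            yield buffer
            buffer = []


total_tokens = SEQ_LEN * NUM_SEQUENCES
print(f"{total_tokens:,} tokens in total")  # 204,800,000 tokens in total

# Toy demonstration with a short fake token stream and a small length.
chunks = list(pack_into_sequences(range(10), seq_len=4))
print(chunks)  # [[0, 1, 2, 3], [4, 5, 6, 7]]
```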

Training Procedure

Full fine-tuning.

Training Hyperparameters

Distillation KL-Weight : 0.01

Learning Rate : 1e-4

Batch Size : 128

Optimizer : AdamW

Warmup Steps : 5
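
The sketch below shows how these hyperparameters might fit together. Several details are assumptions, since the model card does not state them: the warmup is taken to be linear and followed by a constant rate, the distillation term is taken to be added to the cross-entropy loss as `ce + kl_weight * KL(teacher || student)`, and the step count assumes a single pass over the 100K sequences. `FinetuneConfig`, `warmup_lr`, and `combined_loss` are illustrative names, not the authors' code.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class FinetuneConfig:
    # Values copied from the model card.
    kl_weight: float = 0.01
    learning_rate: float = 1e-4
    batch_size: int = 128
    warmup_steps: int = 5
    optimizer: str = "AdamW"


def warmup_lr(step: int, cfg: FinetuneConfig) -> float:
    """Linear warmup to the peak rate, then constant (assumed shape)."""
    if step < cfg.warmup_steps:
        return cfg.learning_rate * (step + 1) / cfg.warmup_steps
    return cfg.learning_rate


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def combined_loss(ce_loss: float, teacher: Sequence[float],
                  student: Sequence[float], cfg: FinetuneConfig) -> float:
    """Assumed form: cross-entropy + kl_weight * KL(teacher || student)."""
    return ce_loss + cfg.kl_weight * kl_divergence(teacher, student)


cfg = FinetuneConfig()

# Optimizer steps for one pass over 100K sequences (one pass is an assumption).
steps_per_pass = math.ceil(100_000 / cfg.batch_size)
print(steps_per_pass)  # 782

# Learning rate over the first few steps: ramps up, then stays at the peak.
print([warmup_lr(s, cfg) for s in range(6)])

# Toy next-token distributions over a 3-word vocabulary.
print(combined_loss(2.0, teacher=[0.7, 0.2, 0.1], student=[0.6, 0.3, 0.1], cfg=cfg))
```

With a weight of 0.01, the distillation term acts as a light regularizer toward the teacher rather than as the main training signal, at least under the additive form assumed here.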

License

The model is licensed under the MIT license.

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: NVIDIA A6000
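
Such an estimate is, roughly, the energy consumed multiplied by the carbon intensity of the electricity grid. The sketch below illustrates that arithmetic only. Apart from the hardware type, none of the inputs come from this model card: the GPU count, training hours, and grid intensity are hypothetical placeholders, and the 300 W figure is the A6000's rated board power, used here as an upper bound on actual draw.

```python
def estimate_emissions_kg(gpu_power_watts: float, num_gpus: int,
                          hours: float, kg_co2_per_kwh: float) -> float:
    """Estimate kg CO2eq as energy used (kWh) times grid carbon intensity."""
    energy_kwh = gpu_power_watts / 1000.0 * num_gpus * hours
    return energy_kwh * kg_co2_per_kwh


# Placeholder inputs for illustration only.
estimate = estimate_emissions_kg(
    gpu_power_watts=300.0,  # rated board power of an NVIDIA A6000 (upper bound)
    num_gpus=1,             # hypothetical
    hours=10.0,             # hypothetical
    kg_co2_per_kwh=0.4,     # hypothetical grid carbon intensity
)
print(f"{estimate:.2f} kg CO2eq")
```

Substituting the real training duration, GPU count, and regional carbon intensity would give an estimate comparable to the calculator's output.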

Citation

BibTeX:

@misc{dery2024everybody,
  title={Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes},
  author={Lucio Dery and Steven Kolawole and Jean-François Kagy and Virginia Smith and Graham Neubig and Ameet Talwalkar},
  year={2024},
  eprint={2402.05406},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}

Model Card Authors

Lucio Dery: [email protected]

Model Card Contact

[email protected]
