	---
	license: apache-2.0
	datasets:
	- rajpurkar/squad_v2
	metrics:
	- precision
	- f1
	- recall
	- squad_v2
	- meteor
	- bleu
	- rouge
	- exact_match
	base_model:
	- meta-llama/Llama-3.2-1B
	- google/gemma-2-2b-it
	library_name: transformers
	tags:
	- llama
	- squad
	- fine
	- tuned
	---

	1. Overview
	This repository documents the fine-tuning of the Llama-3.2-1B model on SQuAD (the Stanford Question Answering Dataset). The model is trained to answer questions from a given context passage, which adapts the pre-trained Llama model to extractive question answering.

	2. Model Information
	Model Used: meta-llama/Llama-3.2-1B
	Pre-trained Parameters: The model contains approximately 1.03 billion parameters, verified during setup and matching official documentation.
	Fine-tuned Parameters: The parameter count is the same as in the pre-trained model, because fine-tuning updates the values of existing weights and does not add or remove layers.
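
A parameter count like the one above can be checked by summing the element counts of the model's tensors. The helper below is a minimal sketch, not the script used for this model card: `count_parameters` is a hypothetical name, and it assumes a PyTorch-style model whose `parameters()` yields tensors with `numel()` and `requires_grad`.

```python
def count_parameters(model, trainable_only=False):
    """Sum the number of elements across a model's parameter tensors.

    Works with any PyTorch-style module, i.e. an object whose
    ``parameters()`` yields tensors exposing ``numel()`` and ``requires_grad``.
    """
    return sum(
        p.numel()
        for p in model.parameters()
        if p.requires_grad or not trainable_only
    )


# Example usage (assumes `transformers` and `torch` are installed and that
# you have access to the meta-llama/Llama-3.2-1B checkpoint):
#
#   from transformers import AutoModelForCausalLM
#   model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
#   print(f"{count_parameters(model):,} parameters")
```

Running the same helper on the checkpoint before and after fine-tuning is a simple way to confirm that the two counts match.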

	3. Dataset and Task Details
	Dataset: SQuAD (the model card metadata lists rajpurkar/squad_v2)
	The Stanford Question Answering Dataset (SQuAD) is a benchmark dataset designed for extractive question-answering tasks. It contains passages with corresponding questions and answer spans extracted directly from the text.
	Task Objective
	Given a passage and a question, the model is trained to identify the correct span of text in the passage that answers the question.
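
To make the task concrete, the sketch below builds one record in the field layout used by the SQuAD datasets on the Hugging Face Hub (`context`, `question`, and `answers` with parallel `text` and `answer_start` lists) and recovers the answer's character span. The record is hand-written for illustration, not taken from the dataset, and `char_span` is a hypothetical helper.

```python
def char_span(example):
    """Return (start, end) character offsets of the first gold answer, or None."""
    answers = example["answers"]
    if not answers["text"]:
        # SQuAD v2 marks unanswerable questions with empty answer lists.
        return None
    text = answers["text"][0]
    start = answers["answer_start"][0]
    end = start + len(text)
    # In extractive QA the answer must be a literal substring of the context.
    assert example["context"][start:end] == text
    return start, end


context = "Llama is a model developed by Meta AI."
example = {
    "context": context,
    "question": "Who developed Llama?",
    # answer_start is a character offset into the context.
    "answers": {"text": ["Meta AI"], "answer_start": [context.index("Meta AI")]},
}

span = char_span(example)
print(span, repr(context[span[0]:span[1]]))
```

During training, these character offsets are typically mapped to token positions so the model can learn a start index and an end index.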

	4. Fine-Tuning Approach
	Train-Test Split: An 80:20 split was applied to the dataset, ensuring a balanced distribution of passages and questions in the train and test subsets. Stratified sampling was used, with a seed value of 1 for reproducibility.
	Tokenization: Context and question pairs were tokenized with padding and truncation to ensure uniform input lengths (maximum 512 tokens).
	Model Training: Fine-tuning was conducted over three epochs with a learning rate of 3e-5. Gradient accumulation and early stopping were used to enhance training efficiency and prevent overfitting.
	Hardware: Training ran on a GPU to handle the model size and the 512-token input sequences efficiently.
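
The split and the fixed-length inputs described above can be illustrated with the standard library alone. This is a simplified sketch under stated assumptions, not the training code: `seeded_split` and `pad_or_truncate` are hypothetical helpers, the split is a plain seeded shuffle without stratification (the stratification column is not given here), and the pad id of 0 is a placeholder.

```python
import random


def seeded_split(records, test_fraction=0.2, seed=1):
    """Shuffle indices with a fixed seed and split into train and test lists."""
    indices = list(range(len(records)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(records) * test_fraction)
    test_idx = set(indices[:n_test])
    train = [r for i, r in enumerate(records) if i not in test_idx]
    test = [r for i, r in enumerate(records) if i in test_idx]
    return train, test


def pad_or_truncate(token_ids, max_len=512, pad_id=0):
    """Cut a token id list to max_len, or right-pad it with pad_id."""
    clipped = token_ids[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))


train, test = seeded_split(list(range(100)))
print(len(train), len(test))                   # 80 20
print(pad_or_truncate([5, 6, 7], max_len=5))   # [5, 6, 7, 0, 0]
```

With the Hugging Face libraries, the corresponding calls would be `Dataset.train_test_split(test_size=0.2, seed=1)` and a tokenizer call with `truncation=True, padding="max_length", max_length=512`.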

	5. Results and Observations
	Zero-shot vs. Fine-tuned Performance: Without fine-tuning, the pre-trained Llama model demonstrated limited ability to answer questions accurately. Fine-tuning significantly improved the model’s performance on metrics such as F1 score, exact match, and ROUGE.
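
For reference, exact match and F1 for extractive QA are usually computed on normalized answer strings. The sketch below follows the normalization commonly used for SQuAD evaluation (lowercasing, then stripping punctuation, articles, and extra whitespace). It illustrates the metrics and is not the evaluation script behind the results reported here.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(gold))


def f1_score(prediction, gold):
    """Token-level F1 between the normalized prediction and gold answer."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty (e.g. an unanswerable question) counts as a match.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The Meta AI.", "meta ai"))
print(round(f1_score("Meta AI team", "Meta AI"), 2))
```

When a question has several gold answers, the usual convention is to take the maximum score over them.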

	Fine-tuning Benefits: Training on the SQuAD dataset equipped the model with a deeper understanding of context and its relationship to specific queries, enhancing its ability to extract precise answer spans.

	Model Parameters: The parameter count remained unchanged during fine-tuning, underscoring that performance improvements stemmed from the optimization of existing weights rather than structural changes.

	6. How to Use the Fine-Tuned Model
	Install Necessary Libraries:

	pip install transformers datasets torch

	Load the Fine-Tuned Model:

	from transformers import AutoTokenizer, AutoModelForQuestionAnswering

	model_name = "<your-huggingface-repo>/squad-llama-finetuned"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForQuestionAnswering.from_pretrained(model_name)

	Make Predictions:

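	import torch

	context = "Llama is a model developed by Meta AI designed for natural language understanding tasks."
	question = "Who developed Llama?"

	# Truncate to the 512-token limit used during fine-tuning; a single example needs no padding.
	inputs = tokenizer(question, context, return_tensors="pt", truncation=True, max_length=512)

	# Inference only, so disable gradient tracking.
	with torch.no_grad():
	    outputs = model(**inputs)

	start_idx = int(outputs.start_logits.argmax())
	end_idx = int(outputs.end_logits.argmax())

	# Independent argmax can place the end before the start; treat that as an empty answer.
	if end_idx < start_idx:
	    answer = ""
	else:
	    answer = tokenizer.decode(inputs["input_ids"][0][start_idx:end_idx + 1], skip_special_tokens=True)
	print(f"Predicted Answer: {answer}")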
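
Taking the argmax of the start and end logits independently can yield an end position that precedes the start. A common alternative is to score every valid (start, end) pair and keep the best one. The function below is a sketch of that idea on plain Python lists: `best_span` and the `max_answer_len` default are illustrative choices, not part of the fine-tuned model or of the `transformers` API.

```python
def best_span(start_logits, end_logits, max_answer_len=30):
    """Return (start, end, score) for the highest-scoring span with start <= end.

    The score of a span is start_logits[start] + end_logits[end], and spans
    longer than max_answer_len tokens are skipped.
    """
    best = (0, 0, float("-inf"))
    for start, start_logit in enumerate(start_logits):
        last = min(start + max_answer_len, len(end_logits))
        for end in range(start, last):
            score = start_logit + end_logits[end]
            if score > best[2]:
                best = (start, end, score)
    return best


# Toy logits where independent argmax would pick start=1 and end=0 (invalid).
start_logits = [0.1, 2.0, 0.3, 1.0]
end_logits = [4.0, 0.2, 3.0, 0.1]
start, end, score = best_span(start_logits, end_logits)
print(start, end, score)
```

With the model outputs from the snippet above, the inputs could be obtained as `outputs.start_logits[0].tolist()` and `outputs.end_logits[0].tolist()`.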

	7. Key Takeaways
	Fine-tuning Llama on SQuAD adapts it to extractive question answering and improves its F1, exact match, and ROUGE scores over the zero-shot baseline.
	The parameter count does not change during fine-tuning, so the gains come from weight updates rather than architectural modifications.
	The gap between zero-shot and fine-tuned performance shows that task-specific training is needed for this model to perform well on extractive QA.

	8. Acknowledgments
	Hugging Face for providing seamless tools for model fine-tuning and evaluation.
	The creators of the Stanford Question Answering Dataset for providing a robust benchmark for extractive QA tasks.