	---
	datasets:
	- togethercomputer/RedPajama-Data-V2
	language:
	- de
	pipeline_tag: text-generation
	library_name: coremltools
	license: other
	tags:
	- coreml
	- tinyllama
	- german-language-model
	---

	# LLäMmlein 1B CoreML

	This repository contains the CoreML version of [LLäMmlein 1B](https://huggingface.co/LSX-UniWue/LLaMmlein_1B), a German language model trained from scratch with the [TinyLlama](https://github.com/jzhang38/TinyLlama) codebase on the German portion of [RedPajama V2](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-V2).

	## Model Details

	- Model Type: German Language Model based on TinyLlama architecture
	- Language: German
	- Framework: CoreML
	- Original Model: [LSX-UniWue/LLaMmlein_1B](https://huggingface.co/LSX-UniWue/LLaMmlein_1B)
	- Size: 1B parameters
	- Format: CoreML (.mlpackage)
	- Minimum Deployment Target: iOS 16
	- Compute Units: ALL (CPU, GPU, and Neural Engine)
	- Input Sequence Length: 512 tokens

	## Conversion Process

	The model was converted from PyTorch to CoreML using the following steps:

	```python
	import torch
	import numpy as np
	from transformers import AutoModelForCausalLM, AutoTokenizer
	import coremltools as ct

	# Load the PyTorch model and its tokenizer
	model = AutoModelForCausalLM.from_pretrained("LSX-UniWue/LLaMmlein_1B")
	tokenizer = AutoTokenizer.from_pretrained("LSX-UniWue/LLaMmlein_1B")

	# Set model to eval mode
	model.eval()

	# Create example input
	text = "Ein Beispieltext"
	inputs = tokenizer(text, return_tensors="pt")

	# Create a wrapper class for tracing
	class ModelWrapper(torch.nn.Module):
	    def __init__(self, model):
	        super().__init__()
	        self.model = model

	    def forward(self, input_ids):
	        return self.model(input_ids).logits

	# Wrap and trace model
	wrapped_model = ModelWrapper(model)
	traced_model = torch.jit.trace(wrapped_model, inputs.input_ids)

	# Convert to CoreML
	model_mlpackage = ct.convert(
	    traced_model,
	    inputs=[
	        ct.TensorType(
	            name="input_ids",
	            shape=inputs.input_ids.shape,
	            dtype=np.int32,
	        )
	    ],
	    source="pytorch",
	    minimum_deployment_target=ct.target.iOS16,
	    convert_to="mlprogram",
	    compute_precision=ct.precision.FLOAT16,
	    compute_units=ct.ComputeUnit.ALL,
	)

	model_mlpackage.save("LLaMmlein_1B.mlpackage")
	```
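
Because the conversion uses `compute_precision=ct.precision.FLOAT16`, the CoreML logits can be expected to differ slightly from the PyTorch ones. The following is a minimal sketch of a sanity check, assuming the logits for one position have been collected from both models as plain Python lists. The helper names `max_abs_diff` and `same_top_token` and the toy values are illustrative and not part of the original conversion workflow.

```python
def max_abs_diff(reference, converted):
    """Largest element-wise absolute difference between two logit lists."""
    if len(reference) != len(converted):
        raise ValueError("logit lists must have the same length")
    return max((abs(r - c) for r, c in zip(reference, converted)), default=0.0)


def same_top_token(reference, converted):
    """True if both logit vectors select the same highest-scoring token."""
    def argmax(values):
        return max(range(len(values)), key=values.__getitem__)
    return argmax(reference) == argmax(converted)


# Toy values standing in for one position's logits (not real model output).
pytorch_logits = [0.10, 2.50, -1.00]
coreml_logits = [0.11, 2.48, -1.02]

print(max_abs_diff(pytorch_logits, coreml_logits))
print(same_top_token(pytorch_logits, coreml_logits))
```

A small absolute difference combined with an unchanged top token is a reasonable first signal that the conversion preserved the model's behavior. What counts as "small" depends on the application.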

	## Usage

	To use this model on Apple devices:

	```swift
	import CoreML

	// Load the model
	let config = MLModelConfiguration()
	let model = try LLaMmlein_1B(configuration: config)

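	// Prepare input: the generated interface expects an MLMultiArray of token IDs
	let tokenIds: [Int32] = [] // fill with the output of your tokenizer
	let inputIds = try MLMultiArray(shape: [1, NSNumber(value: tokenIds.count)], dataType: .int32)
	for (index, id) in tokenIds.enumerated() {
	    inputIds[index] = NSNumber(value: id)
	}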

	// Make prediction
	let prediction = try model.prediction(input_ids: inputIds)
	```
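
The converted model returns logits (the wrapper in the conversion script returns `.logits`), so turning a prediction into text requires choosing a token from them. The following is a minimal sketch of greedy selection, written in Python to match the conversion script. It assumes the logits have been copied into a nested list shaped `[seq_len][vocab_size]` with the batch dimension removed. The function name `greedy_next_token` and the toy values are illustrative.

```python
def greedy_next_token(logits, last_index):
    """Return the ID of the highest-scoring token at position `last_index`.

    `logits` is a nested list shaped [seq_len][vocab_size].
    """
    row = logits[last_index]
    return max(range(len(row)), key=row.__getitem__)


# Toy logits: 2 positions, vocabulary of 4 tokens (not real model output).
toy_logits = [
    [0.1, 0.3, 0.2, 0.0],
    [1.5, -0.2, 3.1, 0.7],
]

next_id = greedy_next_token(toy_logits, last_index=1)
print(next_id)  # 2
```

The same argmax can be written in Swift over the output `MLMultiArray`. Sampling strategies such as top-k or temperature would replace the `max` call.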

	## Performance Considerations

	- The model is converted with FP16 precision and `ComputeUnit.ALL`, so CoreML can run it on the Apple Neural Engine where available
	- Recommended for iOS 16+ devices
	- Best performance achieved with batch size of 1
	- Maximum sequence length is set to 512 tokens
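
A model with a fixed input length needs its token IDs padded or truncated before prediction. The following is a minimal sketch, assuming the 512-token length listed in this model card and a placeholder pad token ID of 0. The real pad ID should come from the tokenizer, and `pad_or_truncate` is a hypothetical helper name.

```python
MAX_LENGTH = 512   # input sequence length stated in this model card
PAD_TOKEN_ID = 0   # assumption: replace with the tokenizer's real pad token ID


def pad_or_truncate(token_ids, max_length=MAX_LENGTH, pad_id=PAD_TOKEN_ID):
    """Return (fixed_length_ids, last_real_index) for a list of token IDs.

    Sequences longer than max_length keep only their most recent max_length
    tokens; shorter ones are right-padded with pad_id.
    """
    if not token_ids:
        raise ValueError("token_ids must not be empty")
    kept = list(token_ids[-max_length:])
    last_real_index = len(kept) - 1
    padded = kept + [pad_id] * (max_length - len(kept))
    return padded, last_real_index


ids, last = pad_or_truncate([5, 17, 42])
print(len(ids), last)  # 512 2
```

With right padding, the logits to read are those at `last_real_index`. In a causal model, that position attends only to the tokens before it, so the padding that follows should not affect it.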

	## Original Model Information

	The original model was trained on the German portion of RedPajama V2. For more details about the base model:
	- Visit the [project page](https://www.informatik.uni-wuerzburg.de/datascience/projects/nlp/llammlein/)
	- Read the [research paper](https://arxiv.org/abs/2411.11171)
	- Check the [SuperGLEBer benchmark](https://lsx-uniwue.github.io/SuperGLEBer-site/) for evaluation results

	## License

	This model inherits its license from the original LLäMmlein 1B model.

	## Citation

	If you use this model, please cite the original work:

	```bibtex
	@misc{llammlein2024,
	title={LLäMmlein: A German Language Model},
	author={LSX-UniWue},
	year={2024},
	publisher={Hugging Face},
	journal={Hugging Face Hub},
	howpublished={\url{https://huggingface.co/LSX-UniWue/LLaMmlein_1B}},
	}
	```

	For the original model description and evaluation results, see the [original model card](https://huggingface.co/LSX-UniWue/LLaMmlein_1B).