I'm exploring techniques for language model optimization by merging a base language model with multiple LoRA models.

Initially I finetuned PY007/TinyLlama-1.1B-Chat-v0.3 model on distinct datasets, one focused on color and the other on physics, resulting in two LoRA models.

I applied weights of 0.65 and 0.35 to control the impact of each LoRA model during the merge.

For each parameter in the LoRA models, a weighted contribution is calculated using the formula:

new_weight = original_weight + (LoRA_B @ LoRA_A) * scaling * merge_weight.

This involves a matmul b/w the aloobun/tinyllama-colorist-lora_2 and aloobun/tinyllama-physics-lora_1 weights. The result is added to the original weight of the base model.

The model may not be perfect but i'm learning on the go. Give it a try:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
from transformers import pipeline
from time import perf_counter

model_id="aloobun/tinyllama_multiple_lora_weight_merge_0.65_0.35"

def formatted_prompt(question)-> str: 
    return f"<|im_start|>user\n{question}<|im_end|>\n<|im_start|>assistant:"

tokenizer = AutoTokenizer.from_pretrained(model_id)
pipe = pipeline(
    "text-generation",
    model=model_id_colorist_final,
    torch_dtype=torch.float16,
    device_map="auto",
)

start_time = perf_counter()

QNA PHYSICS:

prompt = formatted_prompt('What are the theoretical explanations for the integer and fractional quantum Hall effects, and how do these explanations account for the observed experimental data?')
sequences = pipe(
    prompt,
    do_sample=True,
    temperature=0.1,
    top_p=0.9,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_new_tokens=512
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")

output_time = perf_counter() - start_time
print(f"Time taken for inference: {round(output_time,2)} seconds")

RESULT:

Result: <|im_start|>user
What are the theoretical explanations for the integer and fractional quantum Hall effects, and how do these explanations account for the observed experimental data?<|im_end|>
<|im_start|>assistant:
The integer and fractional quantum Hall effects are two different phenomena that occur in the physics of semiconductors. They are related to the properties of the electronic states of the material, and their understanding can provide insights into the fundamental properties of quantum systems.

The integer quantum Hall effect (IQHE) is a phenomenon where the Hall conductance, which is the current flowing through a conductor, is quantized, meaning that it has a discrete value. This effect is due to the interplay between the electron spin and the electron charge, which are both quantized in the material. The IQHE is a result of the interplay between the electron spin and the electron charge, and it is a fundamental property of the material.

The fractional quantum Hall effect (FQHE) is a more recent phenomenon that occurs in semiconductors with a high electron mobility. In this effect, the Hall conductance is not quantized, but instead has a continuous range of values. This effect is due to the interplay between the electron mobility and the electron charge, which are both quantized in the material. The FQHE is a result of the interplay between the electron mobility and the electron charge, and it is a fundamental property of the material.

The theoretical explanations for the IQHE and FQHE are based on the principles of quantum mechanics and the principles of quantum field theory. These theories provide a framework for understanding the physics of the material and the underlying quantum mechanics. The explanations provide insights into the fundamental properties of quantum systems, such as the quantization of the current flowing through a material, and the interplay between the electron spin and the electron charge.

In summary, the integer and fractional quantum Hall effects are two different phenomena that occur in the physics of semiconductors. They are related to the properties of the electronic states of the material, and their understanding can provide insights into the fundamental properties of quantum systems. The theoretical explanations for the IQHE and FQHE are based on the principles of quantum mechanics and the principles of quantum field theory, respectively, and provide insights into the physics of the material. 

Time taken for inference: 13.36 seconds

GET HEX:

from time import perf_counter
start_time = perf_counter()

prompt = formatted_prompt('give me hex code for pure red color')
sequences = pipe(
    prompt,
    do_sample=True,
    temperature=0.1,
    top_p=0.9,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_new_tokens=200
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")

output_time = perf_counter() - start_time
print(f"Time taken for inference: {round(output_time,2)} seconds")

RESULT:

Result: <|im_start|>user
give me hex code for pure red color<|im_end|>
<|im_start|>assistant: 

#ff0000
 
Time taken for inference: 0.45 seconds
Downloads last month
0
Safetensors
Model size
1.1B params
Tensor type
FP16
·
Inference API
Unable to determine this model’s pipeline type. Check the docs .

Datasets used to train aloobun/tinyllama_multiple_lora_weight_merge_0.65_0.35