|
--- |
|
library_name: peft |
|
base_model: facebook/mcontriever-msmarco |
|
language: |
|
- ko |
|
--- |
|
|
|
# smartPatent-mContriever-lora |
|
|
|
This model is fine-tuned for a custom Korean patent retrieval task.
|
|
|
### Training Data |
|
|
|
|
Two types of data are used for training: queries automatically generated with GPT-4, and patent titles paired with their corresponding patent abstracts.
|
|
|
### Usage |
|
|
|
|
|
|
```python
from transformers import AutoTokenizer, AutoModel
from peft import PeftModel, PeftConfig


def get_model(peft_model_name):
    # Read the LoRA adapter config to locate the base model
    config = PeftConfig.from_pretrained(peft_model_name)
    base_model = AutoModel.from_pretrained(config.base_model_name_or_path)
    # Attach the LoRA adapter and merge its weights into the base model
    model = PeftModel.from_pretrained(base_model, peft_model_name)
    model = model.merge_and_unload()
    model.eval()
    return model


# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained('facebook/mcontriever-msmarco')
model = get_model('hanseokOh/smartPatent-mContriever-lora')
```
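
Once the model is loaded, queries and passages can be encoded and ranked by dot-product similarity. The sketch below follows the standard Contriever recipe of mean pooling over token embeddings; the `mean_pooling` helper and the example query/passage strings are illustrative assumptions, not part of the released code.

```python
import torch

# Mean pooling over token embeddings, ignoring padded positions
# (the standard Contriever pooling recipe).
def mean_pooling(token_embeddings, mask):
    token_embeddings = token_embeddings.masked_fill(~mask[..., None].bool(), 0.0)
    return token_embeddings.sum(dim=1) / mask.sum(dim=1)[..., None]

# Illustrative Korean query and passages: a query about secondary-battery
# anode-material patents, a matching battery abstract, and an unrelated
# solar-panel abstract.
query = "이차전지 음극재 관련 특허"
passages = [
    "리튬 이차전지용 음극 활물질 및 그 제조 방법에 관한 발명이다.",
    "태양광 패널의 효율을 높이는 냉각 장치에 관한 발명이다.",
]

inputs = tokenizer([query] + passages, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

embeddings = mean_pooling(outputs.last_hidden_state, inputs["attention_mask"])

# Dot-product similarity between the query and each passage;
# a higher score means a more relevant passage.
scores = embeddings[0] @ embeddings[1:].T
print(scores)
```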
|
|
|
### Info |
|
- **Developed by:** hanseokOh |
|
- **Model type:** information retriever |
|
- **Language(s) (NLP):** Korean |
|
- **Fine-tuned from model:** mContriever-msmarco
|
|
|
### Model Sources
|
|
|
|
|
|
- **Repository:** https://github.com/hanseokOh/PatentSearch |