Model Card for LargeCodeModelGPTBigCode
Model Overview
LargeCodeModelGPTBigCode is a model designed for code test generation and analysis. It is based on GPTBigCode and is tailored to generating tests for code. The model was trained on a small, manually labeled dataset of code and can be used for tasks related to code analysis and testing.
Features:
- Code test generation.
- Python code analysis and generation.
- Uses a pre-trained GPT2 model integrated with Hugging Face.
How it Works
The model is loaded from an external repository, such as Hugging Face, and is initialized through the class LargeCodeModelGPTBigCode. Several parameters can be specified at initialization to configure the model:
- gpt2_name: The link to the model on Hugging Face
- prompt_string: An additional wrapper for better understanding of the task by the model
- params_inference: Inference parameters (used in self.gpt2.generate(**inputs, **inference_params))
- max_length: The maximum number of tokens in the sequence
- device: The device to run the model on
- saved_model_path: Path to the fine-tuned model
- num_lines: Number of generated lines to keep (the model does not reliably stop generating on its own)
- flag_hugging_face: Flag to enable usage with Hugging Face (default: False)
- flag_pretrained: Flag to initialize the model with pre-trained weights
For proper model usage, download inference_gptbigcode.py, or run git clone https://huggingface.co/4ervonec19/SimpleTestGenerator instead. The same file can also be used to tune the inference parameters.
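Since params_inference is unpacked into the generate call, it is presumably a dictionary of generation keyword arguments. The sketch below shows one way to keep a set of defaults and override single values while tuning. The keys are standard Hugging Face generate arguments, but the values and the helper build_inference_params are illustrative assumptions, not the author's settings.

```python
# Illustrative defaults only; the values are assumptions, not taken from the model card.
DEFAULT_INFERENCE_PARAMS = {
    "do_sample": True,          # sample instead of greedy decoding
    "temperature": 0.7,         # lower values make output more deterministic
    "top_p": 0.95,              # nucleus sampling threshold
    "repetition_penalty": 1.1,  # discourage repeated assertions
}


def build_inference_params(**overrides) -> dict:
    """Return a copy of the defaults with selected values overridden."""
    return {**DEFAULT_INFERENCE_PARAMS, **overrides}


params = build_inference_params(temperature=0.2)
print(params)

# Assumed usage, requires inference_gptbigcode.py to be available:
# CodeModel = LargeCodeModelGPTBigCode(gpt2_name="4ervonec19/SimpleTestGenerator",
#                                      params_inference=params,
#                                      flag_pretrained=True,
#                                      flag_hugging_face=True)
```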
Model Initialization
from inference_gptbigcode import LargeCodeModelGPTBigCode

gpt2bigcode = "4ervonec19/SimpleTestGenerator"
CodeModel = LargeCodeModelGPTBigCode(gpt2_name=gpt2bigcode,
                                     flag_pretrained=True,
                                     flag_hugging_face=True)
Inference Example
Here’s an example of inference where the model is used to generate tests based on a given code snippet:
code_example = '''def equals_zero(a):
    if a == 0:
        return True
    return False'''

tests_generated = CodeModel.input_inference(code_text=code_example)

# Result
print(tests_generated['generated_output'])
Output:
tests_generated is a dict containing the input function and the generated tests, for example:
{'input_function': ('def equals_zero(a):\n if a == 0:\n return True\n return False',),
'generated_output': 'def test_equals_zero():\n assert equals_zero(0) is True\n assert equals_zero(1) is False\n assert equals_zero(0) is True\n assert equals_zero(1.5) is False'}
Model Details
- Architecture: GPT2
- Pretraining: Yes, the model uses a pre-trained GPT2 version for test generation and code generation.
- Framework: PyTorch/HuggingFace
- License: MIT (or another, depending on the model's license)
Limitations
- The model may not always generate correct or optimal tests, especially for complex or non-standard code fragments.
- Some understanding of code structure may be required for optimal results.
- The quality of generated tests depends on the quality of the input code and its context.