---
license: mit
library_name: transformers
tags:
- code
---

## JonBERTa-attn-ft-coco-124L

Model for the paper [**"A Transformer-Based Approach for Smart Invocation of Automatic Code Completion"**](https://arxiv.org/abs/2405.14753).

#### Description

This model is fine-tuned on a code-completion dataset collected from the open-source [Code4Me](https://github.com/code4me-me/code4me) plugin. The training objective is to obtain a small, lightweight transformer model that filters out unnecessary and unhelpful code completions. To this end, we leverage in-IDE telemetry data and integrate it with the textual code data in the transformer's attention module.

- **Developed by:** [AISE Lab](https://www.linkedin.com/company/aise-tudelft/) @ [SERG](https://se.ewi.tudelft.nl/), Delft University of Technology
- **Model type:** [JonBERTa](https://github.com/Ar4l/curating-code-completions/blob/main/modeling_jonberta.py)
- **Language:** Code
- **Finetuned from model:** [`CodeBERTa-small-v1`](https://huggingface.co/huggingface/CodeBERTa-small-v1)

Models are named as follows:

- `CodeBERTa` → `CodeBERTa-ft-coco-[1,2,5]e-05lr`
  - e.g. `CodeBERTa-ft-coco-2e-05lr`, which was trained with a learning rate of `2e-05`.
- `JonBERTa-head` → `JonBERTa-head-ft-[dense,proj,reinit]`
  - e.g. `JonBERTa-head-ft-dense-proj`, where all have a `2e-05` learning rate, but may differ in the head layer in which the telemetry features are introduced (either `dense` or `proj`, with optional `reinit`ialisation of all its weights).
- `JonBERTa-attn` → `JonBERTa-attn-ft-[0,1,2,3,4,5]L`
  - e.g. `JonBERTa-attn-ft-012L`, where all have a `2e-05` learning rate, but may differ in the attention layer(s) in which the telemetry features are introduced (any combination of `0`, `1`, `2`, `3`, `4`, and `5`, followed by `L`).

Other hyperparameters may be found in the paper or the replication package (see below).

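As a minimal sketch of the naming scheme above, the attention layers of a `JonBERTa-attn` model can be read off the digits before the trailing `L`. The helper name `attn_layers_from_name` is hypothetical and not part of the replication package, and the sketch assumes each digit denotes one layer index between 0 and 5.

```python
import re


def attn_layers_from_name(model_name: str) -> list[int]:
    """Return the attention layers encoded in a JonBERTa-attn model name.

    Assumption: the name ends in a run of digits (each 0-5) followed by `L`,
    and each digit is one attention-layer index.
    """
    match = re.search(r"-([0-5]+)L$", model_name)
    if match is None:
        raise ValueError(f"no attention-layer suffix in {model_name!r}")
    return [int(digit) for digit in match.group(1)]


print(attn_layers_from_name("JonBERTa-attn-ft-coco-124L"))  # [1, 2, 4]
print(attn_layers_from_name("JonBERTa-attn-ft-012L"))       # [0, 1, 2]
```

Names without such a suffix, for example `CodeBERTa-ft-coco-2e-05lr`, raise a `ValueError` in this sketch.
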
#### Sources

- **Replication Repository:** [`Ar4l/curating-code-completions`](https://github.com/Ar4l/curating-code-completions/tree/main)
- **Paper:** [**"A Transformer-Based Approach for Smart Invocation of Automatic Code Completion"**](https://arxiv.org/abs/2405.14753)
- **Contact:** https://huggingface.co/Ar4l

To cite, please use:

```bibtex
@misc{de_moor_smart_invocation_2024,
  title = {A {Transformer}-{Based} {Approach} for {Smart} {Invocation} of {Automatic} {Code} {Completion}},
  url = {http://arxiv.org/abs/2405.14753},
  doi = {10.1145/3664646.3664760},
  author = {de Moor, Aral and van Deursen, Arie and Izadi, Maliheh},
  month = may,
  year = {2024},
}
```

#### Training Details

This model was trained with the following hyperparameters; all other settings are left at the `TrainingArguments` defaults. The dataset was prepared identically across all models, as detailed in the paper.

```python
num_train_epochs: int = 3
learning_rate: float = 2e-5
batch_size: int = 16
```

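The following is a rough sketch of how these hyperparameters might be passed to `transformers.TrainingArguments`. It is not taken from the replication package. Mapping `batch_size` to `per_device_train_batch_size` is an assumption, and the `output_dir` value is a hypothetical placeholder.

```python
# Hyperparameters listed in this card; everything else stays at its default.
training_kwargs = {
    "num_train_epochs": 3,
    "learning_rate": 2e-5,
    # Assumption: the card's `batch_size` is the per-device training batch size.
    "per_device_train_batch_size": 16,
}

try:
    from transformers import TrainingArguments

    # `output_dir` is a hypothetical path, not specified in the card.
    training_args = TrainingArguments(output_dir="jonberta-attn-ft", **training_kwargs)
except ImportError:
    # transformers (or its PyTorch/accelerate backend) is not installed.
    training_args = None
```

If the import succeeds, `training_args` can be handed to a `Trainer` together with the model and the prepared dataset.
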
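#### Model Configuration

```python
num_telemetry_features: int = 26

add_feature_embeddings: bool = True
feature_hidden_size: int = num_telemetry_features * 4
feature_dropout_prob: float = 0.1
add_feature_bias: bool = True

add_self_attn: bool = True
# Search space: every combination of one, two, or three of the six attention layers.
# `search` is not defined in this card; it marks the value as a hyperparameter
# search space (presumably provided by the replication package).
# Per the naming scheme, the `124L` suffix of this checkpoint denotes layers [1, 2, 4].
self_attn_layers: list[int] = search(sum(
    [
        [[i, j, k] for i in range(6) for j in range(6) for k in range(6) if i < j < k],
        [[i, j] for j in range(6) for i in range(6) if i < j],
        [[i] for i in range(6)],
    ],
    [],  # start value, so that `sum` concatenates the three lists
))
```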
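
To illustrate the size of the `self_attn_layers` search space, here is a small sketch that enumerates the same layer combinations with `itertools.combinations`. The names `NUM_LAYERS`, `MAX_LAYERS`, and `search_space` are chosen for illustration, and the ordering of the pairs differs slightly from the comprehension in the configuration.

```python
from itertools import combinations

NUM_LAYERS = 6  # matches `range(6)` in the configuration
MAX_LAYERS = 3  # telemetry features enter at most three attention layers

# All strictly increasing layer combinations of size 3, then 2, then 1.
search_space = [
    list(combo)
    for size in range(MAX_LAYERS, 0, -1)
    for combo in combinations(range(NUM_LAYERS), size)
]

print(len(search_space))          # 41 = 20 triples + 15 pairs + 6 singles
print([1, 2, 4] in search_space)  # True
```
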