web3se/SmartBERT-v2

devilyouwei committed on Oct 26, 2024

Commit 8e53fcf · 1 Parent(s): 20847d2

update readme

Files changed (2)

README.md +39 -17
framework.png +0 -0

README.md CHANGED

@@ -1,25 +1,47 @@
-# SmartBERT CodeBERT 16000
-Trained on **16000** (1, 16000) smart contracts
-Evaluated on **4000** (20000, 4000) smart contracts
-Replace all the `\n`, `\t` in the _function_ code with one space.
-Base Model: **CodeBERT-mlm-base**
-Training Setup
 ```python
-    training_args = TrainingArguments(
-        output_dir=OUTPUT_DIR,
-        overwrite_output_dir=True,
-        num_train_epochs=20,
-        per_device_train_batch_size=64,
-        save_steps=10000,
-        save_total_limit=2,
-        evaluation_strategy="steps",
-        eval_steps=10000,
-        resume_from_checkpoint=checkpoint
-    )
 ```

+# SmartBERT V2 CodeBERT
+![SmartBERT](./framework.png)
+## Overview
+SmartBERT V2 CodeBERT is a pre-trained model, initialized with **[CodeBERT-base-mlm](https://huggingface.co/microsoft/codebert-base-mlm)**, designed to transform **Smart Contract** function-level code into embeddings effectively.
+- **Training Data:** Trained on **16,000** smart contracts.
+- **Hardware:** Utilized 2 Nvidia A100 80G GPUs.
+- **Training Duration:** More than 10 hours.
+- **Evaluation Data:** Evaluated on **4,000** smart contracts.
+## Preprocessing
+All newline (`\n`) and tab (`\t`) characters in the function code were replaced with a single space to ensure consistency in the input data format.
+## Base Model
+- **Base Model**: [CodeBERT-base-mlm](https://huggingface.co/microsoft/codebert-base-mlm)
+## Training Setup
 ```python
+from transformers import TrainingArguments
+training_args = TrainingArguments(
+    output_dir=OUTPUT_DIR,
+    overwrite_output_dir=True,
+    num_train_epochs=20,
+    per_device_train_batch_size=64,
+    save_steps=10000,
+    save_total_limit=2,
+    evaluation_strategy="steps",
+    eval_steps=10000,
+    resume_from_checkpoint=checkpoint
+)
 ```
+## How to Use
+To train and deploy the SmartBERT V2 model for Web API services, please refer to our GitHub repository: [web3se-lab/SmartBERT](https://github.com/web3se-lab/SmartBERT).
+## Contributors
+- [Youwei Huang](https://www.devil.ren)
+- [Sen Fang](https://github.com/TomasAndersonFang)
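
Note on the new README: it says the model turns function-level code into embeddings but shows no inference example. The sketch below is one possible way to do that, not the authors' documented method. It assumes the checkpoint loads with `AutoModel` under the id `web3se/SmartBERT-v2` (taken from this repository's name), that mean pooling over the last hidden state is an acceptable embedding, and that a 512-token limit applies. `mean_pool` and `embed_function` are hypothetical helper names.

```python
from typing import List


def mean_pool(token_vectors: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the vectors of tokens whose mask value is 1 (padding is skipped)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    # zip(*kept) iterates over hidden dimensions, one column at a time.
    return [sum(column) / len(kept) for column in zip(*kept)]


def embed_function(code: str, model_id: str = "web3se/SmartBERT-v2") -> List[float]:
    """Hypothetical helper: embed one smart-contract function with mean pooling."""
    # Imported lazily so mean_pool stays usable without torch/transformers installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()

    # Same preprocessing the README describes: newlines and tabs become spaces.
    flat = code.replace("\n", " ").replace("\t", " ")
    # max_length=512 is an assumption, not a value stated in the README.
    inputs = tokenizer(flat, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # shape: (tokens, hidden_size)
    return mean_pool(hidden.tolist(), inputs["attention_mask"][0].tolist())
```

Calling `embed_function("function f() public {}")` downloads the checkpoint on first use. For the supported training and Web API workflow, the GitHub repository linked under "How to Use" remains the reference.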

framework.png ADDED
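
Note on the Preprocessing section of the README: the rule (replace `\n` and `\t` in the function code with a single space) can be sketched in a few lines. The README does not say whether a run of consecutive whitespace collapses to one space, so this sketch assumes the simplest reading and replaces each character one for one. `normalize_function_code` is a hypothetical name, not a function from the SmartBERT repository.

```python
def normalize_function_code(code: str) -> str:
    """Replace every newline and tab with one space each."""
    # str.translate swaps characters one for one, so the length is unchanged.
    return code.translate(str.maketrans({"\n": " ", "\t": " "}))


source = "function add(uint a, uint b) public pure returns (uint) {\n\treturn a + b;\n}"
flat = normalize_function_code(source)
print(flat)
```

Because each character is replaced individually, `"{\n\t"` becomes `"{"` followed by two spaces. If the original pipeline collapsed runs instead, a regular expression such as `re.sub(r"[\n\t]+", " ", code)` would be the matching variant.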