Update README.md (#6), opened by wxyedward
README.md CHANGED
@@ -1,5 +1,5 @@
|
|
1 |
---
|
2 |
-
license:
|
3 |
tags:
|
4 |
- chemistry
|
5 |
- biology
|
@@ -8,8 +8,14 @@ tags:
|
|
8 |
---
|
9 |
# DrugGPT
|
10 |
A generative drug design model based on GPT2.
|
11 |
-
<img src="https://img.shields.io/
|
12 |
-
##
|
|
13 |
1. Clone
|
14 |
```shell
|
15 |
git clone https://github.com/LIYUESEN/druggpt.git
|
@@ -27,8 +33,21 @@ A generative drug design model based on GPT2.
|
|
27 |
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
|
28 |
conda install -c openbabel openbabel
|
29 |
```
|
30 |
-
##
|
31 |
-
|
|
32 |
- If you want to input a protein FASTA file
|
33 |
```shell
|
34 |
python drug_generator.py -f bcl2.fasta -n 50
|
@@ -42,9 +61,18 @@ Run the script with the desired arguments, such as the protein sequence, ligand
|
|
42 |
```shell
|
43 |
python drug_generator.py -f bcl2.fasta -l COc1ccc(cc1)C(=O) -n 50
|
44 |
```
|
|
|
45 |
- Note: If you are running in a Linux environment, you need to enclose the ligand's prompt with single quotes ('').
|
46 |
```shell
|
47 |
python drug_generator.py -f bcl2.fasta -l 'COc1ccc(cc1)C(=O)' -n 50
|
48 |
-
```
|
49 |
-
##
|
50 |
-
|
|
1 |
---
|
2 |
+
license: gpl-3.0
|
3 |
tags:
|
4 |
- chemistry
|
5 |
- biology
|
|
|
8 |
---
|
9 |
# DrugGPT
|
10 |
A generative drug design model based on GPT2.
|
11 |
+
<img src="https://img.shields.io/github/license/LIYUESEN/druggpt"><img src="https://img.shields.io/badge/python-3.7-blue"><img src="https://img.shields.io/github/stars/LIYUESEN/druggpt?style=social">
|
12 |
+
## 🚩 Introduction
|
13 |
+
DrugGPT is a generative pharmaceutical strategy based on the GPT architecture. It aims to bring innovation to drug design by using natural language processing techniques.
|
14 |
+
|
15 |
+
This project applies the GPT model to the exploration of chemical space to discover new molecules with potential binding abilities for specific proteins.
|
16 |
+
|
17 |
+
DrugGPT provides a fast and efficient method for generating drug candidate molecules, trained on up to 1.8 million protein-ligand binding records.
|
18 |
+
## 📥 Deployment
|
19 |
1. Clone
|
20 |
```shell
|
21 |
git clone https://github.com/LIYUESEN/druggpt.git
|
|
|
33 |
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
|
34 |
conda install -c openbabel openbabel
|
35 |
```
|
36 |
+
## 🗝 How to use
|
37 |
+
Use [drug_generator.py](https://github.com/LIYUESEN/druggpt/blob/main/drug_generator.py)
|
38 |
+
|
39 |
+
Required parameters:
|
40 |
+
- `-p` | `--pro_seq`: Input a protein amino acid sequence.
|
41 |
+
- `-f` | `--fasta`: Input a FASTA file.
|
42 |
+
|
43 |
+
> Only one of -p and -f should be specified.
|
44 |
+
- `-l` | `--ligand_prompt`: Input a ligand prompt.
|
45 |
+
- `-e` | `--empty_input`: Enable direct generation mode.
|
46 |
+
- `-n` | `--number`: Minimum number of molecules to generate.
|
47 |
+
- `-d` | `--device`: Hardware device to use. Default is 'cuda'.
|
48 |
+
- `-o` | `--output`: Output directory for generated molecules. Default is './ligand_output/'.
|
49 |
+
- `-b` | `--batch_size`: Number of molecules generated per batch. Reduce this value if you have limited RAM. Default is 32.
|
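The flag list above can be read as an argument-parser declaration. The following is a minimal sketch in Python's `argparse`, not the actual code of `drug_generator.py`: the function name `build_parser`, the argument types, treating `-e` as a boolean switch, and leaving `-l` and `-n` without defaults are all assumptions made for illustration. Only the flag names and the three stated defaults come from the list.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (illustration only)."""
    parser = argparse.ArgumentParser(prog="drug_generator.py")

    # -p and -f are alternatives; argparse rejects a command line that has both.
    source = parser.add_mutually_exclusive_group()
    source.add_argument("-p", "--pro_seq", type=str,
                        help="protein amino acid sequence")
    source.add_argument("-f", "--fasta", type=str,
                        help="path to a FASTA file")

    # Assumption: no default is documented for these two, so none is set here.
    parser.add_argument("-l", "--ligand_prompt", type=str,
                        help="ligand prompt")
    parser.add_argument("-n", "--number", type=int,
                        help="minimum number of molecules to generate")

    # Assumption: "enable" suggests a boolean switch.
    parser.add_argument("-e", "--empty_input", action="store_true",
                        help="direct generation mode")

    # Defaults below are the ones stated in the flag list.
    parser.add_argument("-d", "--device", type=str, default="cuda",
                        help="hardware device to use")
    parser.add_argument("-o", "--output", type=str, default="./ligand_output/",
                        help="output directory for generated molecules")
    parser.add_argument("-b", "--batch_size", type=int, default=32,
                        help="molecules generated per batch")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["-f", "bcl2.fasta", "-n", "50"])
    print(args.fasta, args.number, args.device, args.batch_size)
```

Using a mutually exclusive group is one way to enforce the "only one of -p and -f" rule at parse time; the real script may check this differently.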
50 |
+
## 🔬 Example usage
|
51 |
- If you want to input a protein FASTA file
|
52 |
```shell
|
53 |
python drug_generator.py -f bcl2.fasta -n 50
|
|
|
61 |
```shell
|
62 |
python drug_generator.py -f bcl2.fasta -l COc1ccc(cc1)C(=O) -n 50
|
63 |
```
|
64 |
+
|
65 |
- Note: If you are running in a Linux environment, you need to enclose the ligand's prompt with single quotes ('').
|
66 |
```shell
|
67 |
python drug_generator.py -f bcl2.fasta -l 'COc1ccc(cc1)C(=O)' -n 50
|
68 |
+
```
|
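The quoting note above applies because `(` and `)` are metacharacters in POSIX shells, so an unquoted SMILES string is interpreted by the shell before the script sees it. As a small illustration (a sketch, assuming the command line is being assembled from Python), the standard library's `shlex.quote` produces the same single-quoted form:

```python
import shlex

# SMILES fragment taken from the example command above.
ligand_prompt = "COc1ccc(cc1)C(=O)"

# shlex.quote wraps the string in single quotes because it contains
# characters such as '(' and ')' that a POSIX shell would otherwise interpret.
quoted = shlex.quote(ligand_prompt)

command = f"python drug_generator.py -f bcl2.fasta -l {quoted} -n 50"
print(command)
# prints: python drug_generator.py -f bcl2.fasta -l 'COc1ccc(cc1)C(=O)' -n 50
```

When launching the script with `subprocess.run` and an argument list instead of a shell string, no quoting is needed, since no shell parses the arguments.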
69 |
+
## 📝 How to reference this work
|
70 |
+
DrugGPT: A GPT-based Strategy for Designing Potential Ligands Targeting Specific Proteins
|
71 |
+
|
72 |
+
Yuesen Li, Chengyi Gao, Xin Song, Xiangyu Wang, Yungang Xu, Suxia Han
|
73 |
+
|
74 |
+
bioRxiv 2023.06.29.543848; doi: [https://doi.org/10.1101/2023.06.29.543848](https://doi.org/10.1101/2023.06.29.543848)
|
75 |
+
|
76 |
+
[![DOI](https://img.shields.io/badge/DOI-10.1101/2023.06.29.543848-blue)](https://doi.org/10.1101/2023.06.29.543848)
|
77 |
+
## ⚖ License
|
78 |
+
[GNU General Public License v3.0](https://www.gnu.org/licenses/gpl-3.0.html)
|