cnt_agents: &cnt_agents 2
max_turn: &max_turn 5
max_criticizing_rounds: 3
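
# The prompt templates below are defined once with YAML anchors (&name) and
# attached to agents via aliases (*name) in the agents section. The ${...}
# placeholders (e.g. ${task_description}, ${role_description}) are assumed to
# be substituted by the framework at runtime; they are not standard YAML.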

prompts:
  role_assigner_prepend_prompt: &role_assigner_prepend_prompt |-
    # Role Description
    You are the leader of a group of experts, and you need to recruit a small team of experts with diverse backgrounds to correctly write the code that solves the given problem:
    ${task_description}
    
    You can recruit ${cnt_critic_agents} experts in different fields. Which experts will you recruit to produce an accurate solution?

    Here are some suggestions:
    ${advice}
    
  role_assigner_append_prompt: &role_assigner_append_prompt |-
    # Response Format Guidance
    You should respond with a list of expert descriptions. For example:
    1. an electrical engineer specialized in the field of xxx.
    2. an economist who is good at xxx.
    3. a lawyer with a good knowledge of xxx.
    ...

    Respond only with the description of each role. Do not include your reasoning.

  solver_prepend_prompt: &solver_prepend_prompt |-
    Can you complete the following code?
    ```python
    ${task_description}
    ```
  
  solver_append_prompt: &solver_append_prompt |-
    You are ${role_description}. Using this information, can you provide a correct completion of the code? Explain your reasoning in comments within the code. Your response should contain only Python code; do not give any additional information. Use ```python to wrap the completed Python code in markdown fences. When responding, please include the given code together with the completion.
  # You should respond in the following JSON format, wrapped in markdown code fences:
  # ```json
  # {
  #     "text": "your thought",
  #     "reasoning": "your reasoning",
  #     "criticism": "constructive self-criticism",
  #     "code": "the final code completion"
  # }
  # ```

  # Respond with only the JSON, and nothing else. Make sure it can be parsed directly with Python `json.loads`.
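
  # (The JSON format above is a commented-out alternative response format; the
  # active solver format is the plain ```python block required by
  # solver_append_prompt.)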

  critic_prepend_prompt: &critic_prepend_prompt |-
    You are in a discussion group aiming to complete the following Python function:
    ```python
    ${task_description}
    ```

  critic_append_prompt: &critic_append_prompt |-
    You are ${role_description}. Based on your knowledge, can you check the correctness of the completion given above? You should give your own correct solution to the problem step by step. When responding, follow these rules:
    1. Analyze the latest solution above together with the problem.
    2. If the latest solution is correct, end your response with the special token "[Agree]".
    3. If the latest solution is wrong, write down your criticisms in the code block and give corrected code with comments explaining the modifications.
    4. Your response should contain only Python code. Do not give any additional information. Use ```python to wrap your Python code in markdown fences. When responding, please include the given code and the completion.

    Now give your response.

  manager_prompt: &manager_prompt |-
    According to the Previous Solution and the Previous Sentences below, select the most appropriate Critic role and output that Role.
    ```python 
    ${task_description} 
    ```
    # Previous Solution
    The solution you gave in the last step is:
    ${former_solution}

    # Critics
    There are some critiques of the above solution:
    ```
    ${critic_opinions}
    ```

    # Previous Sentences
    The previous sentences from the previous rounds are:
    ${previous_sentence}

  executor_prepend_prompt: &executor_prepend_prompt |-
    You are an experienced program tester. Now your team is trying to solve the problem: 
    '''
    Complete the Python function:
    ${task_description}
    '''

    Your team has given the following answer:
    '''
    ${solution}
    '''

  executor_append_prompt: &executor_append_prompt |-
    The solution has been written to `tmp/main.py`. You are going to write the unit-testing code for the solution. You should respond in the following JSON format, wrapped in markdown code fences:
    ```json
    {
        "thought": your thought,
        "file_path": the path to write your testing code,
        "code": the testing code,
        "command": the command to change directory and execute your testing code
    }
    ```

    Respond with only the JSON, and nothing else.
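
  # A hypothetical example of a well-formed executor response matching the
  # schema above (the file name, import, and command are illustrative only,
  # not prescribed by the framework):
  # ```json
  # {
  #     "thought": "Exercise the normal path and one edge case.",
  #     "file_path": "tmp/test_main.py",
  #     "code": "from main import f\nassert f(2) == 4  # hypothetical test",
  #     "command": "cd tmp && python test_main.py"
  # }
  # ```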

  evaluator_prepend_prompt: &evaluator_prepend_prompt |-
    # Experts
    The experts recruited in this turn include:
    ${all_role_description}
    
    # Problem and Writer's Solution
    Problem: 
    ${task_description}

    Writer's Solution: 
    ${solution}

  evaluator_append_prompt: &evaluator_append_prompt |-
    You are an experienced code reviewer. As a good reviewer, you carefully check the functional correctness of the given code completion. When the completion is incorrect, you should patiently teach the writer how to correct the completion, but do not give the code directly.

    # Response Format Guidance
    You must respond in the following format:
    Score: (0 or 1, 0 for incorrect and 1 for correct)
    Response: (give your advice on how to correct the solution, and your suggestion on which experts should be recruited in the next round)


name: pipeline


environment:
  env_type: task-basic
  max_turn: *max_turn
  rule:
    role_assigner:
      type: role_description
      cnt_agents: *cnt_agents
    decision_maker:
      type: vertical-solver-first
    executor:
      type: none
    evaluator:
      type: basic
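
# One agent entry per pipeline stage. Judging by the agent_type fields, these
# entries are assumed to map onto the rule components configured above
# (role_assigner, decision_maker, executor, evaluator). Note that the executor
# rule is set to "none", so the Executor agent below is presumably unused in
# this configuration.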

agents:
  - #role_assigner_agent:
    agent_type: role_assigner
    name: role assigner
    prepend_prompt_template: *role_assigner_prepend_prompt
    append_prompt_template: *role_assigner_append_prompt
    max_retry: 10
    memory:
      memory_type: chat_history
    llm:
      llm_type: gpt-3.5-turbo
      model: "gpt-3.5-turbo"
      temperature: 0
      max_tokens: 512
    output_parser:
      type: role_assigner

  - #solver_agent:
    agent_type: solver
    name: Planner
    prepend_prompt_template: *solver_prepend_prompt
    append_prompt_template: *solver_append_prompt
    max_retry: 10
    memory:
      memory_type: chat_history
    llm:
      llm_type: gpt-3.5-turbo
      model: "gpt-3.5-turbo"
      temperature: 0
      max_tokens: 2048
    output_parser:
      type: humaneval-solver
      # stop:
      #   - "\ndef "
      #   - "\nclass "
      #   - "\nif "
      #   - "\n\n#"

  - #critic_agents:
    agent_type: critic
    name: Critic 1
    role_description: |-
      Waiting to be assigned.
    prepend_prompt_template: *critic_prepend_prompt
    append_prompt_template: *critic_append_prompt
    max_retry: 10
    memory:
      memory_type: chat_history
    llm:
      llm_type: gpt-3.5-turbo
      model: "gpt-3.5-turbo"
      temperature: 0
      max_tokens: 1024
    output_parser:
      type: humaneval-critic-agree

  - #executor_agent:
    agent_type: executor
    name: Executor
    prepend_prompt_template: *executor_prepend_prompt
    append_prompt_template: *executor_append_prompt
    max_retry: 10
    memory:
      memory_type: chat_history
    llm:
      llm_type: gpt-3.5-turbo
      model: gpt-3.5-turbo
      temperature: 0
      max_tokens: 1024
    output_parser:
      type: humaneval-executor

  - #evaluator_agent:
    agent_type: evaluator
    name: Evaluator
    role_description: |-
      Evaluator
    prepend_prompt_template: *evaluator_prepend_prompt
    append_prompt_template: *evaluator_append_prompt
    max_retry: 10
    memory:
      memory_type: chat_history
    llm:
      llm_type: gpt-3.5-turbo
      model: gpt-3.5-turbo
      temperature: 0.3
      max_tokens: 1024
    output_parser:
      type: mgsm-evaluator
      dimensions:
        - Score


  - #manager_agent:
    agent_type: manager
    name: Manager
    prompt_template: *manager_prompt
    max_retry: 10
    memory:
      memory_type: chat_history
    llm:
      llm_type: gpt-3.5-turbo
      model: "gpt-3.5-turbo"
      temperature: 0
      max_tokens: 1024
    output_parser:
      type: humaneval-manager
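
# ----------------------------------------------------------------------------
# A minimal sketch (assuming PyYAML; the framework's own loader may differ) of
# how the anchors and aliases in this file resolve at load time:
#
#   import yaml
#
#   with open("pipeline.yaml") as f:   # hypothetical file name
#       config = yaml.safe_load(f)
#
#   # Aliases are expanded on load, so the solver's prompt is plain text here
#   # and *max_turn has already been resolved to 5:
#   print(config["agents"][1]["prepend_prompt_template"])
#   print(config["environment"]["max_turn"])
# ----------------------------------------------------------------------------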