kaikaidai committed on
Commit 0fdff9c · verified · 1 Parent(s): 022688b

Line breaks

Files changed (1)
  1. common.py +5 -4
common.py CHANGED
@@ -35,6 +35,7 @@ Score:
  A score of 1 means that the response's answer meets all of the evaluation criteria.
  A score of 0 means that the response's answer does not meet all of the evaluation criteria.
 
+ Here is the data:
  [BEGIN DATA]
  ***
  [User Query]: {{input}}
@@ -71,11 +72,11 @@ POLICY_CONTENT = """
  # About Atla
 
  Atla is an applied research organization that trains models as evaluators to capture human preferences. We're a team of researchers, engineers, and operational leaders, with experience spanning a variety of disciplines, all working together to build reliable and understandable AI systems. Our research is informed by our experiences conducting AI safety research at the UK AI Task Force, OpenAI and the Stanford Existential Risks Initiative.
-
+ <br><br>
  # Our Mission
 
  By creating advanced evaluation models, we enable AI developers to identify and fix risks, leading to safer, more reliable AI that can be trusted and widely used. Our aim is to surpass the current state-of-the-art evaluation methods by training models specifically for evaluation. AIs will probably become very powerful, and perform tasks that are difficult for us to verify. We want to enable humans to oversee AI systems that are solving tasks too difficult for humans to evaluate. We have written more about [our approach to scalable oversight](https://www.atla-ai.com/post/scaling-alignment) on our blog.
-
+ <br><br>
  # Judge Arena Policy
 
  ## Overview
@@ -120,7 +121,7 @@ Judge Arena is specifically designed to assess AI models that function as evalua
 
  - **Ongoing Revisions**: This policy may be updated to reflect changes in our practices or in response to community feedback.
  - **Notification of Changes**: Policy changes will be communicated to users and stakeholders on this page.
-
+ <br><br>
  # FAQ
 
  **Isn't this the same as Chatbot Arena?**
@@ -138,6 +139,6 @@ Judge Arena is specifically designed to assess AI models that function as evalua
  \n\n**What is Atla working on?**
 
  - We are training a general-purpose evaluator that you will soon be able to run in this Judge Arena. Our next step will be to open-source a powerful model that the community can use to run fast and accurate evaluations.
-
+ <br><br>
  ## Get in touch
  Feel free to email us at [[email protected]](mailto:[email protected]) or leave feedback on our [Github](https://github.com/atla-ai/judge-arena)!"""