zej97 committed on
Commit
4d7183d
·
1 Parent(s): 3eaeddb

Upload folder using huggingface_hub

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. .gitignore +12 -0
  2. LICENSE +21 -0
  3. README.md +70 -8
  4. __pycache__/aira.cpython-311.pyc +0 -0
  5. __pycache__/aira.cpython-39.pyc +0 -0
  6. __pycache__/app.cpython-311.pyc +0 -0
  7. __pycache__/app.cpython-39.pyc +0 -0
  8. __pycache__/components.cpython-311.pyc +0 -0
  9. __pycache__/home.cpython-311.pyc +0 -0
  10. __pycache__/main.cpython-311.pyc +0 -0
  11. __pycache__/main.cpython-39.pyc +0 -0
  12. __pycache__/style.cpython-311.pyc +0 -0
  13. __pycache__/test.cpython-311.pyc +0 -0
  14. __pycache__/test2.cpython-311.pyc +0 -0
  15. __pycache__/test3.cpython-311.pyc +0 -0
  16. actions/__pycache__/duck_search.cpython-311.pyc +0 -0
  17. actions/__pycache__/google_search.cpython-311.pyc +0 -0
  18. actions/__pycache__/web_scrape.cpython-311.pyc +0 -0
  19. actions/__pycache__/web_scrape.cpython-39.pyc +0 -0
  20. actions/__pycache__/web_search.cpython-311.pyc +0 -0
  21. actions/__pycache__/web_search.cpython-39.pyc +0 -0
  22. actions/duck_search.py +11 -0
  23. actions/google_search.py +63 -0
  24. agent/__init__.py +0 -0
  25. agent/__pycache__/__init__.cpython-311.pyc +0 -0
  26. agent/__pycache__/llm_utils.cpython-311.pyc +0 -0
  27. agent/__pycache__/llm_utils.cpython-39.pyc +0 -0
  28. agent/__pycache__/prompts.cpython-311.pyc +0 -0
  29. agent/__pycache__/prompts.cpython-39.pyc +0 -0
  30. agent/__pycache__/research_agent.cpython-311.pyc +0 -0
  31. agent/__pycache__/research_agent.cpython-39.pyc +0 -0
  32. agent/__pycache__/run.cpython-311.pyc +0 -0
  33. agent/__pycache__/run.cpython-39.pyc +0 -0
  34. agent/__pycache__/toolkits.cpython-311.pyc +0 -0
  35. agent/llm_utils.py +39 -0
  36. agent/prompts.py +132 -0
  37. agent/research_agent.py +109 -0
  38. agent/toolkits.py +15 -0
  39. app.py +81 -0
  40. config/__init__.py +9 -0
  41. config/__pycache__/__init__.cpython-311.pyc +0 -0
  42. config/__pycache__/__init__.cpython-39.pyc +0 -0
  43. config/__pycache__/config.cpython-311.pyc +0 -0
  44. config/__pycache__/config.cpython-39.pyc +0 -0
  45. config/__pycache__/singleton.cpython-311.pyc +0 -0
  46. config/__pycache__/singleton.cpython-39.pyc +0 -0
  47. config/config.py +82 -0
  48. config/singleton.py +24 -0
  49. outputs/Should I invest in the Large Language Model industry in 2023/research--2012672616352147449.txt +1 -0
  50. outputs/What are the most recent advancements in the domain of superconductors as of 2023/research--2821165325009188188.txt +1 -0
.gitignore ADDED
@@ -0,0 +1,12 @@
+ # Ignore env containing secrets
+ .env
+ # Ignore virtual env
+ env/
+ # Ignore generated outputs
+ outputs/
+ # Ignore pycache
+ **/__pycache__/
+
+ test*.py
+ /test/
+ /flagged/
LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2023 Ze Jin
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
README.md CHANGED
@@ -1,12 +1,74 @@
  ---
- title: AI Research Assistant
- emoji: ⚡
- colorFrom: blue
- colorTo: green
- sdk: gradio
- sdk_version: 3.39.0
+ title: AI-Research-Assistant
  app_file: app.py
- pinned: false
+ sdk: gradio
+ sdk_version: 3.38.0
+ ---
+ <div style="width: 100%;">
+ <img src="./statics/title.svg" style="width: 100%;">
+ <div align="right">
+ <a href="./README.md">English</a> |
+ <a href="./statics/README_zh.md">中文</a>
+ </div>
+ </div>
+
+ Inspired by [gpt-researcher](https://github.com/assafelovic/gpt-researcher), this project aims to build an AI research assistant that **generates research reports** for researchers. For instance, a researcher can ask the assistant for a report on *the latest advancements in the field of superconductors as of 2023*, a currently trending topic. The assistant then compiles a report from relevant information found on the internet. AIRA now also supports **academic English polishing**.
+
+ <!-- make a table -->
+ | Image1 | Image2 |
+ | :----: | :----: |
+ | <img src="./statics/example1-1.png"> | <img src="./statics/example1-2.png"> |
+
+ The currently supported agents cover *finance, business analysis, clinical medicine, basic medicine, travel, academic research and sociology*.
+
+ In addition to the official API, this project can generate research reports through a third-party API. For access to such an API, see [chimeragpt](https://chimeragpt.adventblocks.cc/) or [GPT-API-free](https://github.com/chatanywhere/GPT_API_free). Before running the project, set the environment variables `OPENAI_API_KEY` and `OPENAI_API_BASE`.
+
+ ```shell
+ $ export OPENAI_API_KEY=your_api_key
+ $ export OPENAI_API_BASE=your_api_base
+ ```
+
+ Alternatively, set the API key and base in the `.env` file.
+
+
+ ## Installation
+
+ 1. Clone the repository
+
+ ```shell
+ $ git clone git@github.com:paradoxtown/ai_research_assistant.git
+ $ cd ai_research_assistant
+ ```
+
+ 2. Install the dependencies
+
+ ```shell
+ $ pip install -r requirements.txt
+ ```
+
+ 3. Export environment variables
+
+ ```shell
+ $ export OPENAI_API_KEY=your_api_key
+ $ export OPENAI_API_BASE=your_api_base
+ ```
+ or modify the `.env` file.
+
+ 4. Run the project
+
+ ```shell
+ $ python app.py
+ ```
+
+ ## TODO
+
+ - [x] Switch Google Search to DuckDuckGo
+ - [ ] Literature review
+ - [x] Third-party API
+ - [ ] Prettify report
+ - [x] Add medical agent and social agent
+ - [ ] Add option for users to customize the number of words and temperature
+
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ <div align="center">Happy researching! 🚀</div>
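Review note: the README asks for `OPENAI_API_KEY` and `OPENAI_API_BASE` to be set before launch. A minimal sketch of a startup check is shown here, assuming a hypothetical helper `read_openai_settings` and a fallback base URL of `https://api.openai.com/v1`; neither the helper nor the fallback is part of this commit.

```python
import os
from typing import Mapping, Optional, Tuple

# Assumption: fallback used when OPENAI_API_BASE is unset or empty.
DEFAULT_API_BASE = "https://api.openai.com/v1"


def read_openai_settings(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Return (api_key, api_base) read from the environment.

    Hypothetical helper: raises RuntimeError when the key is missing so the
    app fails at startup rather than on the first API call.
    """
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it or add it to .env")
    api_base = env.get("OPENAI_API_BASE", "").strip() or DEFAULT_API_BASE
    return api_key, api_base
```

Passing a plain dict as `env` keeps the check testable without touching the real process environment.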
__pycache__/aira.cpython-311.pyc ADDED
Binary file (4.71 kB)
__pycache__/aira.cpython-39.pyc ADDED
Binary file (2.39 kB)
__pycache__/app.cpython-311.pyc ADDED
Binary file (6.08 kB)
__pycache__/app.cpython-39.pyc ADDED
Binary file (2.64 kB)
__pycache__/components.cpython-311.pyc ADDED
Binary file (164 Bytes)
__pycache__/home.cpython-311.pyc ADDED
Binary file (2.27 kB)
__pycache__/main.cpython-311.pyc ADDED
Binary file (3.84 kB)
__pycache__/main.cpython-39.pyc ADDED
Binary file (1.99 kB)
__pycache__/style.cpython-311.pyc ADDED
Binary file (1.93 kB)
__pycache__/test.cpython-311.pyc ADDED
Binary file (1.23 kB)
__pycache__/test2.cpython-311.pyc ADDED
Binary file (390 Bytes)
__pycache__/test3.cpython-311.pyc ADDED
Binary file (1.04 kB)
actions/__pycache__/duck_search.cpython-311.pyc ADDED
Binary file (970 Bytes)
actions/__pycache__/google_search.cpython-311.pyc ADDED
Binary file (3.87 kB)
actions/__pycache__/web_scrape.cpython-311.pyc ADDED
Binary file (10.6 kB)
actions/__pycache__/web_scrape.cpython-39.pyc ADDED
Binary file (6.73 kB)
actions/__pycache__/web_search.cpython-311.pyc ADDED
Binary file (1.31 kB)
actions/__pycache__/web_search.cpython-39.pyc ADDED
Binary file (769 Bytes)
actions/duck_search.py ADDED
@@ -0,0 +1,11 @@
+ from duckduckgo_search import DDGS
+
+
+ def duckduckgo_search(query, max_search_result=3):
+     with DDGS() as ddgs:
+         responses = list()
+         for i, r in enumerate(ddgs.text(query, region='wt-wt', safesearch='Off', timelimit='y')):
+             if i == max_search_result:
+                 break
+             responses.append(r)
+     return responses
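Review note: `duckduckgo_search` caps the result count with an `enumerate` counter and a `break`. The same cap can be sketched with `itertools.islice`, which stops consuming the iterator after `limit` items. `take_first` and `fake_search` are hypothetical names used only for illustration; `fake_search` stands in for the real search generator so the sketch runs offline.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def take_first(results: Iterable[Dict[str, str]], limit: int = 3) -> List[Dict[str, str]]:
    """Collect at most `limit` items from a (possibly lazy) result iterator."""
    return list(islice(results, limit))


def fake_search(query: str) -> Iterator[Dict[str, str]]:
    """Stand-in generator (assumption): yields ten dummy results for `query`."""
    for n in range(10):
        yield {"title": f"{query} result {n}", "href": f"https://example.com/{n}"}


top = take_first(fake_search("superconductors"), limit=3)
print(len(top))  # number of results kept
```

Because `islice` is lazy, the generator is never advanced past the items that are actually kept.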
actions/google_search.py ADDED
@@ -0,0 +1,63 @@
+ import requests
+ from bs4 import BeautifulSoup
+
+
+ def get_urls(query, proxies=None):
+     query = query
+     url = f"https://www.google.com/search?q={query}"
+     headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.61 Safari/537.36'}
+     response = requests.get(url, headers=headers, proxies=proxies)
+     soup = BeautifulSoup(response.content, 'html.parser')
+     results = []
+     for g in soup.find_all('div', class_='g'):
+         anchors = g.find_all('a')
+         if anchors:
+             link = anchors[0]['href']
+             if link.startswith('/url?q='):
+                 link = link[7:]
+             if not link.startswith('http'):
+                 continue
+             title = g.find('h3').text
+             item = {'title': title, 'link': link}
+             results.append(item)
+
+     return results
+
+ def scrape_text(url, proxies=None) -> str:
+     """Scrape text from a webpage
+
+     Args:
+         url (str): The URL to scrape text from
+
+     Returns:
+         str: The scraped text
+     """
+     headers = {
+         'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.61 Safari/537.36',
+         'Content-Type': 'text/plain',
+     }
+     try:
+         response = requests.get(url, headers=headers, proxies=proxies, timeout=8)
+         if response.encoding == "ISO-8859-1": response.encoding = response.apparent_encoding
+     except requests.RequestException:
+         return "Unable to connect to the server"
+     soup = BeautifulSoup(response.text, "html.parser")
+     for script in soup(["script", "style"]):
+         script.extract()
+     text = soup.get_text()
+     lines = (line.strip() for line in text.splitlines())
+     chunks = (phrase.strip() for line in lines for phrase in line.split(" "))
+     text = "\n".join(chunk for chunk in chunks if chunk)
+     return text
+
+
+ if __name__ == '__main__':
+     txt = "What is LSTM?"
+     proxies = None
+     urls = get_urls(txt, proxies)
+     max_search_result = 10
+
+     for url in urls[:max_search_result]:
+         print(url)
+         print(scrape_text(url['link'], proxies))
+         print("\n\n")
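Review note: `get_urls` strips the `/url?q=` prefix by slicing, which leaves any trailing tracking parameters attached to the link. A sketch of the same step using `urllib.parse` follows; `normalize_result_link` is a hypothetical helper, and the redirect format it handles (`/url?q=<target>&...`) is an assumption based on the prefix the original code checks for.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def normalize_result_link(href: str) -> Optional[str]:
    """Return the target URL of a search-result anchor, or None if unusable.

    Redirect-style links (/url?q=<target>&...) are unwrapped via their `q`
    parameter; anything that is not an http(s) URL afterwards is dropped.
    """
    if href.startswith("/url?"):
        targets = parse_qs(urlparse(href).query).get("q", [])
        href = targets[0] if targets else ""
    return href if href.startswith("http") else None


print(normalize_result_link("/url?q=https://example.com/page&sa=U&ved=abc"))
```

Parsing the query string rather than slicing a fixed number of characters keeps the extracted link free of the extra parameters that follow the target.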
agent/__init__.py ADDED
File without changes
agent/__pycache__/__init__.cpython-311.pyc ADDED
Binary file (150 Bytes)
agent/__pycache__/llm_utils.cpython-311.pyc ADDED
Binary file (1.66 kB)
agent/__pycache__/llm_utils.cpython-39.pyc ADDED
Binary file (2.95 kB)
agent/__pycache__/prompts.cpython-311.pyc ADDED
Binary file (11.3 kB)
agent/__pycache__/prompts.cpython-39.pyc ADDED
Binary file (9.36 kB)
agent/__pycache__/research_agent.cpython-311.pyc ADDED
Binary file (6.66 kB)
agent/__pycache__/research_agent.cpython-39.pyc ADDED
Binary file (7.01 kB)
agent/__pycache__/run.cpython-311.pyc ADDED
Binary file (729 Bytes)
agent/__pycache__/run.cpython-39.pyc ADDED
Binary file (2.17 kB)
agent/__pycache__/toolkits.cpython-311.pyc ADDED
Binary file (853 Bytes)
agent/llm_utils.py ADDED
@@ -0,0 +1,39 @@
+ from __future__ import annotations
+ from config import Config
+ import openai
+
+ CFG = Config()
+
+ openai.api_key = CFG.openai_api_key
+ openai.api_base = CFG.openai_api_base
+
+ from typing import Optional
+
+ def llm_response(model,
+                  messages,
+                  temperature: float = CFG.temperature,
+                  max_tokens: Optional[int] = None):
+     return openai.ChatCompletion.create(
+         model=model,
+         messages=messages,
+         temperature=temperature,
+         max_tokens=max_tokens,
+     ).choices[0].message["content"]
+
+
+ def llm_stream_response(model,
+                         messages,
+                         temperature: float = CFG.temperature,
+                         max_tokens: Optional[int] = None):
+     response = ""
+     for chunk in openai.ChatCompletion.create(
+         model=model,
+         messages=messages,
+         temperature=temperature,
+         max_tokens=max_tokens,
+         stream=True,
+     ):
+         content = chunk["choices"][0].get("delta", {}).get("content")
+         if content is not None:
+             response += content
+             yield response
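Review note: `llm_stream_response` yields the accumulated text after each chunk rather than the individual deltas, so a consumer can simply replace its display with the latest value. The sketch below reproduces that accumulation over hand-written chunks; `accumulate_stream` and `fake_chunks` are illustrative names, and the chunk shape is assumed to mirror what the diff indexes (`choices[0].delta.content`).

```python
from typing import Dict, Iterable, Iterator


def accumulate_stream(chunks: Iterable[Dict]) -> Iterator[str]:
    """Yield the running text after every chunk that carries content."""
    response = ""
    for chunk in chunks:
        # Role-only and empty deltas have no "content" key and are skipped.
        content = chunk["choices"][0].get("delta", {}).get("content")
        if content is not None:
            response += content
            yield response


# Hand-written stand-ins for streamed chunks (assumption about their shape).
fake_chunks = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "Hel"}}]},
    {"choices": [{"delta": {"content": "lo"}}]},
    {"choices": [{"delta": {}}]},
]

for partial in accumulate_stream(fake_chunks):
    print(partial)
```

Yielding the cumulative string suits UI components that overwrite their value on every update, at the cost of resending the full text each time.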
agent/prompts.py ADDED
@@ -0,0 +1,132 @@
+ def generate_agent_role_prompt(agent):
+     """ Generates the agent role prompt.
+     Args: agent (str): The type of the agent.
+     Returns: str: The agent role prompt.
+     """
+     prompts = {
+         "Finance Agent": "You are a seasoned finance analyst AI assistant. Your primary goal is to compose comprehensive, astute, impartial, and methodically arranged financial reports based on provided data and trends.",
+
+         "Travel Agent": "You are a world-travelled AI tour guide assistant. Your main purpose is to draft engaging, insightful, unbiased, and well-structured travel reports on given locations, including history, attractions, and cultural insights.",
+
+         "Academic Research Agent": "You are an AI academic research assistant. Your primary responsibility is to create thorough, academically rigorous, unbiased, and systematically organized reports on a given research topic, following the standards of scholarly work.",
+
+         "Business Analyst Agent": "You are an experienced AI business analyst assistant. Your main objective is to produce comprehensive, insightful, impartial, and systematically structured business reports based on provided business data, market trends, and strategic analysis.",
+         "Computer Security Analyst Agent": "You are an AI specializing in computer security analysis. Your principal duty is to generate comprehensive, meticulously detailed, impartial, and systematically structured reports on computer security topics. This includes Exploits, Techniques, Threat Actors, and Advanced Persistent Threat (APT) Groups. All produced reports should adhere to the highest standards of scholarly work and provide in-depth insights into the complexities of computer security.",
+
+         "Clinical Medicine Agent": "You are an AI specializing in clinical medicine analysis. Your primary role is to compose comprehensive, well-researched, impartial, and methodically organized reports on various aspects of clinical medicine. This includes in-depth studies on medical conditions, treatments, medical advancements, patient care, and healthcare practices. Your reports should follow the highest standards of medical research and provide critical insights into the complexities of the clinical medicine field. Whether it's analyzing medical data, conducting literature reviews, or evaluating the efficacy of medical interventions, your goal is to deliver insightful and evidence-based reports to assist medical professionals and researchers in making informed decisions.",
+
+         "Basic Medicine Agent": "You are an AI specializing in basic medicine. Your goal is to provide comprehensive, unbiased reports on essential healthcare topics. Deliver clear insights into general health practices, common medical conditions, preventive measures, first aid procedures, and healthy lifestyle choices. Aim to be accessible to non-medical professionals and offer evidence-based recommendations for overall well-being.",
+
+         "Social Science Research Agent": "You are an AI social science research assistant with a focus on providing comprehensive, well-researched, and unbiased reports on various topics within the social sciences. Your primary goal is to delve into the complexities of human behavior, society, and culture to produce insightful and methodically organized reports. Whether it's sociology, psychology, anthropology, economics, or any other social science discipline, you excel in critically analyzing data, academic literature, and historical trends to offer valuable insights into the subject matter. Your reports are crafted to meet the highest standards of scholarly work, adhering to objectivity and academic rigor while presenting information in a clear and engaging manner. With your expertise, you can delve into societal issues, cultural dynamics, economic trends, and other relevant areas within the realm of social sciences.",
+
+         "Default Agent": "You are an AI critical thinker research assistant. Your sole purpose is to write well written, critically acclaimed, objective and structured reports on given text."
+     }
+
+     return prompts.get(agent, "No such agent")
+
+
+ def generate_report_prompt(question, research_summary):
+     """ Generates the report prompt for the given question and research summary.
+     Args: question (str): The question to generate the report prompt for
+           research_summary (str): The research summary to generate the report prompt for
+     Returns: str: The report prompt for the given question and research summary
+     """
+
+     return f'"""{research_summary}""" Using the above information, answer the following'\
+            f' question or topic: "{question}" in a detailed report --'\
+            " The report should focus on the answer to the question, should be well structured, informative, detailed" \
+            " in depth, with facts and numbers if available, a minimum of 2,400 words and with markdown syntax and apa format. "\
+            "Write all source urls at the end of the report in apa format."
+
+ def generate_search_queries_prompt(question):
+     """ Generates the search queries prompt for the given question.
+     Args: question (str): The question to generate the search queries prompt for
+     Returns: str: The search queries prompt for the given question
+     """
+
+     return f'Write 5 google search queries to search online that form an objective opinion from the following: "{question}"\n'\
+            'You must respond with a list of strings in the following json format: {"Q1": query1, "Q2": query2, "Q3": query3, "Q4": query4, "Q5": query5}'
+
+
+ def generate_resource_report_prompt(question, research_summary):
+     """Generates the resource report prompt for the given question and research summary.
+
+     Args:
+         question (str): The question to generate the resource report prompt for.
+         research_summary (str): The research summary to generate the resource report prompt for.
+
+     Returns:
+         str: The resource report prompt for the given question and research summary.
+     """
+     return f'"""{research_summary}""" Based on the above information, generate a bibliography recommendation report for the following' \
+            f' question or topic: "{question}". The report should provide a detailed analysis of each recommended resource,' \
+            ' explaining how each source can contribute to finding answers to the research question.' \
+            ' Focus on the relevance, reliability, and significance of each source.' \
+            ' Ensure that the report is well-structured, informative, in-depth, and follows Markdown syntax.' \
+            ' Include relevant facts, figures, and numbers whenever available.' \
+            ' The report should have a minimum length of 1,200 words.'
+
+
+ def generate_outline_report_prompt(question, research_summary):
+     """ Generates the outline report prompt for the given question and research summary.
+     Args: question (str): The question to generate the outline report prompt for
+           research_summary (str): The research summary to generate the outline report prompt for
+     Returns: str: The outline report prompt for the given question and research summary
+     """
+
+     return f'"""{research_summary}""" Using the above information, generate an outline for a research report in Markdown syntax'\
+            f' for the following question or topic: "{question}". The outline should provide a well-structured framework'\
+            ' for the research report, including the main sections, subsections, and key points to be covered.' \
+            ' The research report should be detailed, informative, in-depth, and a minimum of 1,200 words.' \
+            ' Use appropriate Markdown syntax to format the outline and ensure readability.'
+
+ def generate_concepts_prompt(question, research_summary):
+     """ Generates the concepts prompt for the given question.
+     Args: question (str): The question to generate the concepts prompt for
+           research_summary (str): The research summary to generate the concepts prompt for
+     Returns: str: The concepts prompt for the given question
+     """
+
+     return f'"""{research_summary}""" Using the above information, generate a list of 5 main concepts to learn for a research report'\
+            f' on the following question or topic: "{question}". The outline should provide a well-structured framework. '\
+            'You must respond with a list of strings in the following format: ["concepts 1", "concepts 2", "concepts 3", "concepts 4", "concepts 5"]'
+
+
+ def generate_lesson_prompt(concept):
+     """
+     Generates the lesson prompt for the given concept.
+     Args:
+         concept (str): The concept to generate the lesson prompt for.
+     Returns:
+         str: The lesson prompt for the given concept.
+     """
+
+     prompt = f'generate a comprehensive lesson about {concept} in Markdown syntax. This should include the definition '\
+              f'of {concept}, its historical background and development, its applications or uses in different '\
+              f'fields, and notable events or facts related to {concept}.'
+
+     return prompt
+
+ def get_report_by_type(report_type):
+     report_type_mapping = {
+         'Research Report': generate_report_prompt,
+         'Resource Report': generate_resource_report_prompt,
+         'Outline Report': generate_outline_report_prompt
+     }
+     return report_type_mapping[report_type]
+
+ def generate_english_polishing_prompt(content):
+     """ Generates the english polishing prompt for the given content.
+     Inspired by project gpt_academic
+     Args: content (str): The paragraph to polish
+     Returns: str: The english polishing prompt for the given content
+     """
+     return f'Below is a paragraph from an academic paper. Polish the writing to meet the academic style and improve the spelling, grammar, clarity, concision, and overall readability. When necessary, rewrite the whole sentence. Furthermore, list all modifications and explain the reasons for doing so in the markdown table. \n {content}'
+
+ def generate_summarize_prompt(content):
+     """ Generates the summarize prompt for the given content.
+     Inspired by project gpt_academic
+     Args: content (str): The crawled text to summarize
+     Returns: str: The summarize prompt for the given content
+     """
+     return f'The following information is crawled from the Internet and will be used in writing the research report. Please clear the junk information and summarize the useful information in depth. Include all factual information, numbers, stats etc if available. \n {content}'
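Review note: the prompt builders in `agent/prompts.py` are assembled from adjacent string literals, which Python joins with no separator. A short sketch of the pitfall and the usual remedy (a trailing or leading space on each fragment) follows; both function names are hypothetical and the prompt text is shortened for illustration.

```python
def lesson_prompt_missing_space(concept: str) -> str:
    """Adjacent literals are joined as-is, so the words run together."""
    return (f'generate a lesson about {concept}. This should include the definition'
            f'of {concept}.')


def lesson_prompt(concept: str) -> str:
    """Same prompt with a trailing space on the first fragment."""
    return (f'generate a lesson about {concept}. This should include the definition '
            f'of {concept}.')


print(lesson_prompt_missing_space("LSTM"))
print(lesson_prompt("LSTM"))
```

Wrapping the fragments in parentheses instead of ending lines with backslashes also avoids breakage from stray whitespace after a backslash.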
agent/research_agent.py ADDED
@@ -0,0 +1,109 @@
+ import json
+ from actions.duck_search import duckduckgo_search
+ from processing.text import read_txt_files
+ from agent.llm_utils import llm_response, llm_stream_response
+ from config import Config
+ from agent import prompts
+ import os
+ import string
+
+ CFG = Config()
+
+
+ class ResearchAgent:
+     def __init__(self, question, agent):
+         """ Initializes the research assistant with the given question.
+         Args: question (str): The question to research
+         Returns: None
+         """
+
+         self.question = question
+         self.agent = agent
+         self.visited_urls = set()
+         self.search_summary = ""
+         self.directory_name = ''.join(c for c in question if c.isascii() and c not in string.punctuation)[:100]
+         self.dir_path = os.path.dirname(f"./outputs/{self.directory_name}/")
+
+     def call_agent(self, action):
+         messages = [{
+             "role": "system",
+             "content": prompts.generate_agent_role_prompt(self.agent),
+         }, {
+             "role": "user",
+             "content": action,
+         }]
+         return llm_response(
+             model=CFG.fast_llm_model,
+             messages=messages,
+         )
+
+     def call_agent_stream(self, action):
+         messages = [{
+             "role": "system",
+             "content": prompts.generate_agent_role_prompt(self.agent),
+         }, {
+             "role": "user",
+             "content": action,
+         }]
+         yield from llm_stream_response(
+             model=CFG.fast_llm_model,
+             messages=messages
+         )
+
+     def create_search_queries(self):
+         """ Creates the search queries for the given question.
+         Args: None
+         Returns: dict[str, str]: The search queries, keyed "Q1" to "Q5"
+         """
+         result = self.call_agent(prompts.generate_search_queries_prompt(self.question))
+         return json.loads(result)
+
+     def search_single_query(self, query):
+         """ Runs the search for the given query.
+         Args: query (str): The query to search for
+         Returns: list[dict]: The search results for the given query
+         """
+         return duckduckgo_search(query, max_search_result=3)
+
+     def run_search_summary(self, query):
+         """ Runs the search for the given query and saves the results to disk.
+         Args: query (str): The query to run the search for
+         Returns: list[dict]: The search results for the given query
+         """
+         responses = self.search_single_query(query)
+
+         print(f"Searching for {query}")
+         query = hash(query)
+         file_path = f"./outputs/{self.directory_name}/research-{query}.txt"
+         os.makedirs(os.path.dirname(file_path), exist_ok=True)
+         with open(file_path, "w") as f:
+             json.dump(responses, f)
+         print(f"Saved {query} to {file_path}")
+         return responses
+
+     def search_online(self):
+         """ Conducts the search for the given question.
+         Args: None
+         Returns: str: The search results for the given question
+         """
+
+         self.search_summary = read_txt_files(self.dir_path) if os.path.isdir(self.dir_path) else ""
+
+         if not self.search_summary:
+             search_queries = self.create_search_queries()
+             for _, query in search_queries.items():
+                 search_result = self.run_search_summary(query)
+                 self.search_summary += f"=Query=:\n{query}\n=Search Result=:\n{search_result}\n================\n"
+
+         return self.search_summary
+
+     def write_report(self, report_type):
+         """ Writes the report for the given question.
+         Args: report_type (str): The type of report to generate
+         Yields: str: A status message, then the report text as it streams in
+         """
+         yield "Searching online..."
+
+         report_type_func = prompts.get_report_by_type(report_type)
+
+         yield from self.call_agent_stream(report_type_func(self.question, self.search_online()))
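Review note: `run_search_summary` names each output file after `hash(query)`. Python salts `str` hashes per interpreter process unless `PYTHONHASHSEED` is fixed, so the same query can map to a different file name on the next run. A sketch of a run-independent alternative using `hashlib` follows; `stable_query_id` and `research_file_path` are hypothetical helpers, not part of this commit.

```python
import hashlib


def stable_query_id(query: str, length: int = 16) -> str:
    """Return a short hex digest of the query that is identical across runs."""
    digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
    return digest[:length]


def research_file_path(directory_name: str, query: str) -> str:
    """Build an output path in the same layout the diff uses (assumed)."""
    return f"./outputs/{directory_name}/research-{stable_query_id(query)}.txt"


print(research_file_path("Should I invest", "LLM industry outlook 2023"))
```

A content-derived name also means a repeated query overwrites its earlier file instead of adding a new one beside it.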
agent/toolkits.py ADDED
@@ -0,0 +1,15 @@
+ from agent import prompts, llm_utils
+ from config import Config
+
+ CFG = Config()
+
+ def english_polishing(content):
+     prompt = prompts.generate_english_polishing_prompt(content)
+     messages = [{
+         "role": "user",
+         "content": prompt,
+     }]
+
+     yield from llm_utils.llm_stream_response(
+         model=CFG.fast_llm_model,
+         messages=messages)
app.py ADDED
@@ -0,0 +1,81 @@
+ import gradio as gr
+
+ from config import check_openai_api_key
+ from agent.research_agent import ResearchAgent
+ from agent.toolkits import english_polishing
+ from statics.style import *
+
+ theme = gr.themes.Soft(
+     primary_hue=gr.themes.Color(c100="#e0e7ff", c200="#c7d2fe", c300="#a5b4fc", c400="#818cf8", c50="#eef2ff", c500="#6366f1", c600="#5e5aaa", c700="#4338ca", c800="#3730a3", c900="#312e81", c950="#2b2c5e"),
+     font_mono=[gr.themes.GoogleFont('Fira Code'), 'ui-monospace', 'Consolas', 'monospace']
+ )
+
+ check_openai_api_key()
+
+ def run_agent(task, agent, report_type):
+     assistant = ResearchAgent(task, agent)
+     yield from assistant.write_report(report_type)
+
+ with gr.Blocks(theme=gr.themes.Base(),
+                title="AI Research Assistant",
+                css=css) as demo:
+     gr.HTML(top_bar)
+     with gr.Tab(label="Report"):
+         with gr.Column():
+             gr.HTML(research_report_html)
+             research_report = gr.Markdown(value="&nbsp;&nbsp;**Research report will appear here...**",
+                                           elem_classes="output")
+             with gr.Row():
+                 agent_type = gr.Dropdown(label="# Agent Type",
+                                          value="Default Agent",
+                                          interactive=True,
+                                          allow_custom_value=False,
+                                          choices=["Default Agent",
+                                                   "Business Analyst Agent",
+                                                   "Finance Agent",
+                                                   "Travel Agent",
+                                                   "Academic Research Agent",
+                                                   "Computer Security Analyst Agent",
+                                                   "Clinical Medicine Agent",
+                                                   "Basic Medicine Agent",
+                                                   "Social Science Research Agent"])
+                 report_type = gr.Dropdown(label="# Report Type",
+                                           value="Research Report",
+                                           interactive=True,
+                                           allow_custom_value=False,
+                                           choices=["Research Report",
+                                                    "Resource Report",
+                                                    "Outline Report"])
+             input_box = gr.Textbox(label="# What would you like to research next?", placeholder="Enter your question here")
+             submit_btn = gr.Button("Generate Report")
+             submit_btn.click(run_agent, inputs=[input_box, agent_type, report_type],
+                              outputs=research_report)
+             gr.Examples(["Should I invest in the Large Language Model industry in 2023?",
+                          "Is it advisable to make investments in the electric car industry during the year 2023?",
+                          "What constitutes the optimal approach for investing in the Bitcoin industry during the year 2023?",
+                          "What are the most recent advancements in the domain of superconductors as of 2023?"],
+                         inputs=input_box)
+
+     with gr.Tab("English Polishing"):
+         gr.HTML(english_polishing_html)
+         polished_result = gr.Markdown("&nbsp;&nbsp;**Polished result will appear here...**", elem_classes="output")
+         sentences = gr.Textbox(label="# What would you like to polish?", placeholder="Enter your sentence here")
+
+         with gr.Row():
+             polish_btn = gr.Button("Polish")
+             save_btn = gr.Button("Save")
+
+         polish_btn.click(english_polishing, inputs=[sentences], outputs=polished_result)
+
+         def save_result(history, origin, result):
+             history += f"\n**Origin** : {origin}\n\n**Polished Result** : {result}"
+             return history
+
+         gr.HTML(history_result_html)
+         history_result = gr.Markdown("&nbsp;&nbsp;**History result will appear here...**")
+         save_btn.click(save_result, inputs=[history_result, sentences, polished_result], outputs=history_result)
+
+     with gr.Tab("Literature Review"):
+         pass
+
+ demo.queue().launch()
config/__init__.py ADDED
@@ -0,0 +1,9 @@
+ from config.config import Config, check_openai_api_key
+ from config.singleton import AbstractSingleton, Singleton
+
+ __all__ = [
+     "check_openai_api_key",
+     "AbstractSingleton",
+     "Config",
+     "Singleton",
+ ]
config/__pycache__/__init__.cpython-311.pyc ADDED
Binary file (407 Bytes)
config/__pycache__/__init__.cpython-39.pyc ADDED
Binary file (334 Bytes)
config/__pycache__/config.cpython-311.pyc ADDED
Binary file (5.13 kB)
config/__pycache__/config.cpython-39.pyc ADDED
Binary file (3.51 kB)
config/__pycache__/singleton.cpython-311.pyc ADDED
Binary file (1.46 kB)
config/__pycache__/singleton.cpython-39.pyc ADDED
Binary file (1.04 kB)
config/config.py ADDED
@@ -0,0 +1,82 @@
+ """Configuration class to store the state of bools for different scripts access."""
+ import os
+
+ import openai
+ from colorama import Fore
+ from dotenv import load_dotenv
+
+ from config.singleton import Singleton
+
+ load_dotenv(verbose=True)
+
+
+ class Config(metaclass=Singleton):
+     """
+     Configuration class to store the state of bools for different scripts access.
+     """
+
+     def __init__(self) -> None:
+         """Initialize the Config class"""
+         self.debug_mode = False
+         self.allow_downloads = False
+
+         self.selenium_web_browser = os.getenv("USE_WEB_BROWSER", "chrome")
+         self.fast_llm_model = os.getenv("FAST_LLM_MODEL", "gpt-3.5-turbo")
+         self.smart_llm_model = os.getenv("SMART_LLM_MODEL", "gpt-4")
+         self.fast_token_limit = int(os.getenv("FAST_TOKEN_LIMIT", 8000))
+         self.smart_token_limit = int(os.getenv("SMART_TOKEN_LIMIT", 8000))
+         self.browse_chunk_max_length = int(os.getenv("BROWSE_CHUNK_MAX_LENGTH", 8192))
+
+         self.openai_api_key = os.getenv("OPENAI_API_KEY")
+         self.openai_api_base = os.getenv("OPENAI_API_BASE", openai.api_base)
+         self.temperature = float(os.getenv("TEMPERATURE", "1"))
+
+         self.user_agent = os.getenv(
+             "USER_AGENT",
+             "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36"
+             " (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36",
+         )
+
+         self.memory_backend = os.getenv("MEMORY_BACKEND", "local")
+         # Initialize the OpenAI API client
+         openai.api_key = self.openai_api_key
+
+     def set_fast_llm_model(self, value: str) -> None:
+         """Set the fast LLM model value."""
+         self.fast_llm_model = value
+
+     def set_smart_llm_model(self, value: str) -> None:
+         """Set the smart LLM model value."""
+         self.smart_llm_model = value
+
+     def set_fast_token_limit(self, value: int) -> None:
+         """Set the fast token limit value."""
+         self.fast_token_limit = value
+
+     def set_smart_token_limit(self, value: int) -> None:
+         """Set the smart token limit value."""
+         self.smart_token_limit = value
+
+     def set_browse_chunk_max_length(self, value: int) -> None:
+         """Set the browse_website command chunk max length value."""
+         self.browse_chunk_max_length = value
+
+     def set_openai_api_key(self, value: str) -> None:
+         """Set the OpenAI API key value."""
+         self.openai_api_key = value
+
+     def set_debug_mode(self, value: bool) -> None:
+         """Set the debug mode value."""
+         self.debug_mode = value
+
+
+ def check_openai_api_key() -> None:
+     """Check if the OpenAI API key is set in config.py or as an environment variable."""
+     cfg = Config()
+     if not cfg.openai_api_key:
+         print(
+             Fore.RED
+             + "Please set your OpenAI API key in .env or as an environment variable."
+         )
+         print("You can get your key from https://platform.openai.com/account/api-keys")
+         exit(1)
config/singleton.py ADDED
@@ -0,0 +1,24 @@
+ """The singleton metaclass for ensuring only one instance of a class."""
+ import abc
+
+
+ class Singleton(abc.ABCMeta, type):
+     """
+     Singleton metaclass for ensuring only one instance of a class.
+     """
+
+     _instances = {}
+
+     def __call__(cls, *args, **kwargs):
+         """Call method for the singleton metaclass."""
+         if cls not in cls._instances:
+             cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
+         return cls._instances[cls]
+
+
+ class AbstractSingleton(abc.ABC, metaclass=Singleton):
+     """
+     Abstract singleton class for ensuring only one instance of a class.
+     """
+
+     pass
outputs/Should I invest in the Large Language Model industry in 2023/research--2012672616352147449.txt ADDED
@@ -0,0 +1 @@
+ [{"title": "Top Investing Trends For 2023 - Forbes Advisor", "href": "https://www.forbes.com/advisor/investing/top-investing-trends-2023/", "body": "With 2022 drawing to a close, the S&P 500 has clawed its way out of bear market territory but remains down 17% as of this writing. As we look ahead to 2023, here are nine investing trends that can ..."}, {"title": "Global Risks Report 2023: the biggest risks facing the world", "href": "https://www.weforum.org/agenda/2023/01/these-are-the-biggest-risks-facing-the-world-global-risks-2023/", "body": "Davos 2023. The World Economic Forum's latest Global Risks Report identifies the key risks facing the world over the next decade. In the next two years, the cost-of-living crisis is seen as the biggest risk, while over the next 10 years environmental risks dominate. The interconnectedness of global risks and crises is giving rise to the threat ..."}, {"title": "Global Risks Report 2023: We know what the risks are - here's what ...", "href": "https://www.weforum.org/agenda/2023/01/global-risks-report-2023-experts-davos2023/", "body": "The urgency of a cost of living crisis dominates 2023's Global Risks Report, which is in danger of deprioritizing other risks. Experts at the World Economic Forum give their insights into how their sectors are seeking to manage risks, build resilience and use new opportunities to shore up defences in 2023."}]
outputs/What are the most recent advancements in the domain of superconductors as of 2023/research--2821165325009188188.txt ADDED
@@ -0,0 +1 @@
+ [{"title": "Physicists discover a new switch for superconductivity - ScienceDaily", "href": "https://www.sciencedaily.com/releases/2023/06/230622120822.htm", "body": "June 22, 2023 Source: Massachusetts Institute of Technology Summary: A study sheds surprising light on how certain superconductors undergo a 'nematic transition' -- unlocking new,..."}, {"title": "Physicists discover a new switch for superconductivity", "href": "https://news.mit.edu/2023/physicists-discover-new-switch-superconductivity-0622", "body": "June 22, 2023 Press Inquiries Caption When some ultrathin materials undergo a \"nematic transition,\" their atomic lattice structure stretches in ways that unlock superconductivity (as this conceptual image shows). MIT physicists have identified how this essential nematic switch occurs in one class of superconductors. Credits Image: iStock"}, {"title": "New Room-Temperature Superconductor Discovered by Scientists - The New ...", "href": "https://www.nytimes.com/2023/03/08/science/room-temperature-superconductor-ranga-dias.html", "body": "New Room-Temperature Superconductor Discovered by Scientists - The New York Times New Room-Temperature Superconductor Offers Tantalizing Possibilities The breakthrough could one day..."}]