Nvidia_GenAI_Contest-SNOMED_CT_Assistant / SNOMED-CT_Assistant.py

HengJay

Fixed NIM API Key bug.

33def8d 7 months ago


	import os
	import random
	import json
	import streamlit as st
	import chromadb
	from openai import OpenAI
	from dotenv import load_dotenv
	import pandas as pd

	# Config Streamlit
	st.set_page_config(layout="wide")

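	remote = True

	# Default to an empty string so the name is always defined, even when no key is found;
	# the check before the first request then reports the missing key.
	nvidia_nim_key = ""

	if remote:
	    with st.sidebar:
	        if 'NVIDIA_NIM_KEY' in st.secrets:
	            st.success('API key already provided!', icon='✅')
	            nvidia_nim_key = st.secrets['NVIDIA_NIM_KEY']
	else:
	    load_dotenv()
	    openai_api_key = os.environ.get("OpenAI_API_KEY")
	    nvidia_nim_key = os.environ.get("NVIDIA_NIM_KEY", "")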

	st.title("🏥 SNOMED-CT Assistant")
	st.caption("👩‍⚕️ A smart medical assistant with SNOMED-CT knowledge.")

	# Chroma DB Client
	chroma_client = chromadb.PersistentClient(path="snomed_ct_id_term_1410k")
	collection = chroma_client.get_or_create_collection(name="snomed_ct_id_term")

	# NIM Client Configuration
	client = OpenAI(
	    base_url="https://integrate.api.nvidia.com/v1",
	    api_key=nvidia_nim_key,
	)
	model_tag = "meta/llama3-70b-instruct"

	# System prompt
	system_prompt = """You are a medical expert with rich experience in SNOMED-CT professional knowledge.
	You are skilled at assisting medical professionals and answering questions in the medical field.
	You are patient, helpful and professional.
	Your comprehensive knowledge and mastery of these key components make you an invaluable asset in the realm of biomedical natural language processing and knowledge extraction.
	With your specialized expertise, you are able to navigate the complexities of SNOMED CT Entity Linking with ease, delivering accurate and reliable results that support various healthcare and research applications.
	Please refuse to answer inquiries and requests unrelated to the medical field, in order to maintain professionalism in medicine.

	As an experienced professional, you possess deep expertise in the field of SNOMED CT Entity Linking.
	You have a thorough understanding of the relevant workflows and critical aspects involved, encompassing:
	- Adept handling of electronic medical record (EMR) data processing
	- Entity Identification, Proficient entity recognition capabilities, identifying and extracting relevant medical concepts from unstructured text
	- Skilled Entity Mapping, accurately linking identified entities to their corresponding SNOMED CT concepts
	- Seamless integration and output of clinical terminology, ensuring the accurate representation and utilization of standardized medical language
	- Patiently and professionally respond to all SNOMED CT related inquiries, even if the user repeats questions.
	- Demonstrate deep expertise in the standard SNOMED CT Entity Linking workflow, which involves:
	- ALL YOU CAN DO: perform Entity Identification, i.e. try to extract relevant medical terminology from the medical text input.

	Here is the practical entity identification process example:
	- The input text will be part of an EHR record: "Patient referred for a biopsy to investigate potential swelling in upper larynx."
	- Suppose the identified entities are: "biopsy", "larynx"
	- Respond with the identified entities in JSON format: {"identified_entity" : ["biopsy", "larynx"]}
	- If no identifiable entity is found in the input text, return an empty list: {"identified_entity" : []}
	- DON'T respond in any format other than JSON
	- During Entity Identification processing, if the original medical text data clearly contains commonly used medical abbreviations, convert the abbreviations into their full names, and provide the original abbreviations in parentheses for easy reference.
	- For example: "The patient has the multiple disease, including T2D, CAD, HTN, CKD etc. decreased T3 and T4 levels."
	- T2D: "Type 2 Diabetes Mellitus", CAD: "Coronary Artery Disease", HTN: "Hypertension", CKD: "Chronic Kidney Disease", T3: "Triiodothyronine", T4: "Thyroxine"
	- Respond with full names in JSON format: {"identified_entity" : ["Type 2 Diabetes Mellitus (T2D)", "Coronary Artery Disease (CAD)", "Hypertension (HTN)", "Chronic Kidney Disease (CKD)", "Triiodothyronine (T3)", "Thyroxine (T4)"]}

	List out as many potential SNOMED entities as possible from the original medical text description,
	including Diseases, Diagnoses, Clinical Findings (like Signs and Symptoms),
	Procedures (Surgical, Therapeutic, Diagnostic, Nursing), Specimen Types, Living Organisms,
	Observables (for example heart rate), Physical Objects and Forces,
	Chemicals (including the chemicals used in drug preparations), Drugs (pharmaceutical products),
	Human Anatomy (body structures, organisms), Physiological Processes and Functions,
	Patients' Occupations, Patients' Social Contexts (e.g., religion and ethnicity), and various other types from the SNOMED CT standard.
	Numbers or units related symbols are not included in this range and can be ignored.

	Output Format Requirements (Must follow):
	- By default, only perform "Entity Identification" and find the entities related to SNOMED CT terms.
	- Present the results in JSON format, like: {"identified_entity" : ["biopsy", "larynx"]}
	- If no identifiable entity is found in the input text, return an empty list: {"identified_entity" : []}
	- DON'T respond in any format other than JSON
	"""


	# Func: generate random med text
	raw_text_df = pd.read_csv('snomed-entity-challenge.csv')

	def random_med_text(text_df):
	    rows = len(text_df['text'])
	    # randrange excludes the upper bound; randint(0, rows) could return rows and index past the end
	    index = random.randrange(rows)
	    raw_text = text_df["text"][index]
	    # Split the row on its '###TEXT:' and '###RESPONSE:' markers
	    raw_text_split = raw_text.split('###TEXT:')
	    raw_text_split_2 = raw_text_split[1].split('###RESPONSE:')
	    human = raw_text_split[0]
	    med_text = raw_text_split_2[0]
	    response = raw_text_split_2[1]
	    return index, human, med_text, response


	# Func: Gen Medical Prompt Example
	def generate_entity_identification_prompt(medical_text):
	    return f"""Help me to do "SNOMED-CT Entity Identification" process with raw medical text (Electronic Health Record, EHR): \n {medical_text} \n """

	def generate_entity_mapping_prompt(entity, query_result_dict):
	    return f"""Help me to do "SNOMED-CT Entity Mapping" process with entity: {entity} and query result \n {query_result_dict} \n , output with table format, including 5 columns: "Identified Entity", "Distance", "IDs", "SNOMED CT Concept IDs", "SNOMED CT Descriptions" \n """

	# Func: query Chroma DB
	def query_chroma_db(query_text, query_number):
	    results = collection.query(
	        query_texts=[query_text],
	        n_results=query_number,
	        include=["distances", "metadatas", "documents"]
	    )
	    return results

	# Func: chroma_db_result to dict
	def get_dict_from_chroma_results(results):
	    # Only one query text is sent, so take the first (and only) result list of each field
	    result_dict = {
	        'ids': results['ids'][0],
	        'concept_ids': [str(sub['concept_id']) for sub in results['metadatas'][0]],
	        'distances': results['distances'][0],
	        'descriptions': results['documents'][0],
	    }
	    return result_dict

	# Chat Session with NIM API
	# NOTE: `prompt` is currently unused; the raw medical text is sent as the user message
	# and the system prompt drives the entity identification.
	def chat_input(prompt, med_text):
	    st.session_state.messages.append({"role": "user", "content": med_text})
	    st.chat_message("user").write(med_text)
	    with st.spinner("Thinking..."):
	        entity_identification_response = client.chat.completions.create(
	            model=model_tag,
	            messages=st.session_state.messages,
	            temperature=0.5)
	        msg = entity_identification_response.choices[0].message.content
	        try:
	            entity_list = json.loads(msg)["identified_entity"]
	        except (json.JSONDecodeError, KeyError, TypeError):
	            # The model did not return the expected JSON; show the reply but skip the mapping step
	            entity_list = []
	        st.session_state.messages.append({"role": "assistant", "content": msg})
	        st.chat_message("assistant").write(msg)
	        for entity in entity_list:
	            results = query_chroma_db(entity, 10)
	            results_dict = get_dict_from_chroma_results(results)
	            results_table = entity_mapping_result_to_table(entity, results_dict)
	            st.session_state.messages.append({"role": "assistant", "content": results_table})
	            st.chat_message("assistant").write(results_table)
	            # entity_mapping_prompt = generate_entity_mapping_prompt(entity, results_dict)
	            # st.session_state.messages.append({"role": "user", "content": entity_mapping_prompt})
	            # entity_mapping_response = client.chat.completions.create(
	            #     model=model_tag, messages=st.session_state.messages, temperature=0.5)
	            # mapping_msg = entity_mapping_response.choices[0].message.content
	            # st.session_state.messages.append({"role": "assistant", "content": mapping_msg})
	            # st.chat_message("assistant").write(mapping_msg)


	# Convert entity mapping result to markdown table
	def entity_mapping_result_to_table(entity, results_dict):

	    ids = results_dict['ids']
	    concept_ids = results_dict['concept_ids']
	    distances = results_dict['distances']
	    descriptions = results_dict['descriptions']

	    # header
	    header = "| Identified Entity | Distance | IDs | SNOMED CT - Concept IDs | SNOMED CT - Descriptions |"
	    separator = "| --- | --- | --- | --- | --- |"

	    # table: one row per query hit (doc_id avoids shadowing the built-in id)
	    rows = []
	    for doc_id, distance, concept_id, description in zip(ids, distances, concept_ids, descriptions):
	        row = f"| {entity} | {distance:.3f} | {doc_id} | {concept_id} | {description} |"
	        rows.append(row)

	    # merge
	    markdown_table = "\n".join([header, separator] + rows)
	    return markdown_table

	if "messages" not in st.session_state:
	    st.session_state["messages"] = [
	        {"role": "system", "content": system_prompt},
	        {"role": "assistant", "content": "👩‍⚕️ Hello, I am your professional medical assistant. Is there anything I can assist you with?"},
	    ]

	# Replay the chat history, hiding the system prompt
	for msg in st.session_state.messages:
	    if msg["role"] == "system":
	        continue
	    st.chat_message(msg["role"]).write(msg["content"])

	if user_input := st.chat_input():
	    if not nvidia_nim_key:
	        st.info("Please add your Nvidia NIM API key to continue.")
	        st.stop()
	    entity_identification_prompt = generate_entity_identification_prompt(user_input)
	    chat_input(entity_identification_prompt, user_input)

	if st.sidebar.button("Example Input", type="primary"):
	    med_text = "Patient referred for a biopsy to investigate potential swelling in upper larynx."
	    entity_identification_prompt = generate_entity_identification_prompt(med_text)
	    chat_input(entity_identification_prompt, med_text)

	if st.sidebar.button("Random Input", type="primary"):
	    index, human, med_text, response = random_med_text(raw_text_df)
	    response = response.replace(",", " \n")
	    entity_identification_prompt = generate_entity_identification_prompt(med_text)
	    chat_input(entity_identification_prompt, med_text)
	    st.sidebar.write(f"[Random Text](https://huggingface.co/datasets/JaimeML/snomed-entity-challenge) Index: {index}")
	    st.sidebar.markdown(f"Ref Entity: \n {response}")
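
The script asks the model to reply with JSON only, in the form `{"identified_entity" : [...]}`, and then parses that reply with `json.loads`. A chat model can still wrap the JSON in extra prose, so a more tolerant parser may be useful. The following is a minimal sketch under that assumption; `extract_entities` is a hypothetical helper that is not part of the original file, and it uses only the standard library.

```python
import json
import re


def extract_entities(reply: str) -> list[str]:
    """Return the "identified_entity" list from a model reply, or [] if none is found."""
    candidates = [reply]
    # Fallback: take the span from the first "{" to the last "}" in case
    # the model surrounded the JSON object with extra prose.
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match:
        candidates.append(match.group(0))

    for candidate in candidates:
        try:
            data = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if not isinstance(data, dict):
            continue
        entities = data.get("identified_entity", [])
        if isinstance(entities, list):
            return [str(entity) for entity in entities]
    return []


if __name__ == "__main__":
    print(extract_entities('{"identified_entity" : ["biopsy", "larynx"]}'))
    print(extract_entities('Sure! {"identified_entity" : ["biopsy"]} Hope this helps.'))
    print(extract_entities("I cannot answer that."))
```

A helper like this could stand in for the direct `json.loads(msg)["identified_entity"]` lookup inside `chat_input`, so that an unparseable reply yields an empty entity list instead of an exception. Whether to fail silently or surface the problem to the user is a design choice left open here.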