Learning from each other

by Alanturner2 - opened

First, I appreciate you sharing this.
I am very curious about your model. I want to develop a chatbot system that can chat with some PDF files, and your results and model motivated me to open this discussion.
How did you train this model? (training infrastructure, time and environment cost, and the training process in more depth)
What was the initial purpose? (only to chat with this book?)
How well does it work?
I would really appreciate your help!

Thank you for your interest in my model development project. I understand your curiosity about creating a chatbot system for PDF interaction, and I'm happy to share my experience. My approach involved fine-tuning a Llama-3.2-1B model using authorized psychological literature. While the initial training was conducted on my laptop with a relatively short training time of approximately 30 minutes, I acknowledge this was just a preliminary phase of development.
As for the infrastructure and environment, I kept it simple for this initial proof of concept. However, I recognize that for more robust implementation, we would need more sophisticated computing resources and longer training times. The training progress focused on specialized psychological literature, which helped create a foundation for clinical and academic applications.
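To give a concrete idea of what a small fine-tune like this can look like, here is a minimal sketch using Hugging Face transformers, datasets, and peft. The LoRA configuration, the placeholder corpus file psych_corpus.txt, and the hyperparameters are illustrative assumptions for this sketch, not my exact training setup.

```python
# Minimal sketch of a LoRA fine-tune for a small causal LM.
# Assumptions: psych_corpus.txt, the LoRA settings, and the hyperparameters
# below are illustrative placeholders, not my actual configuration.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # Llama has no pad token by default

model = AutoModelForCausalLM.from_pretrained(base_model)
# LoRA keeps the number of trainable parameters small enough for a laptop run.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

# One plain-text file standing in for the licensed psychological literature.
dataset = load_dataset("text", data_files={"train": "psych_corpus.txt"})["train"]
dataset = dataset.filter(lambda ex: ex["text"].strip())
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama32-psych",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-4,
    ),
    train_dataset=dataset,
    # mlm=False -> standard next-token (causal) language-modeling objective.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("llama32-psych")
```

The adapter-based approach is what makes a short run on consumer hardware plausible for a 1B-parameter model; a full fine-tune would need considerably more memory and time.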
The initial purpose extends beyond just chatting with a specific book. As both an IT professional and a recent psychology graduate from UNAM (Universidad Nacional Autónoma de México), I identified a gap in AI language models specifically trained for psychological applications. Most psychologists currently rely on commercially trained AI models, which may not always provide the most accurate or contextually appropriate responses for psychological consultation.
The model works well for basic interactions, showing promising results in generating responses based on validated psychological literature. However, I'm currently planning the next development phase, which will include:

A comprehensive evaluation and validation process
Implementation of a larger model for handling more complex responses
More extensive training time and resources
Rigorous testing protocols

For your PDF chatbot system, I would recommend considering a larger model than what I initially used, and potentially implementing a RAG (Retrieval-Augmented Generation) system for dynamic data handling. While my approach worked for static, validated literature, your use case might benefit from more flexible data interaction capabilities.
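To make that concrete, here is a rough sketch of how a LangChain-based RAG pipeline over a PDF could be wired up. The file name, embedding model, chunk sizes, and the final LLM call are placeholders to adapt to your stack; this is not code from my project.

```python
# Rough sketch of a RAG pipeline over a single PDF using LangChain components.
# Assumptions: "your_document.pdf" and the embedding model are placeholders.
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. Load and chunk the PDF so each piece fits in the model's context window.
docs = PyPDFLoader("your_document.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

# 2. Embed the chunks and index them for similarity search.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
index = FAISS.from_documents(chunks, embeddings)

# 3. At question time, retrieve the most relevant chunks and pass them,
#    together with the question, to whichever LLM you prefer.
question = "What does the document say about attachment theory?"
relevant = index.similarity_search(question, k=4)
context = "\n\n".join(doc.page_content for doc in relevant)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# response = your_llm(prompt)  # plug in any chat or instruct model here
print(prompt[:500])
```

The advantage for your use case is that the indexed documents can change at any time without retraining anything; the trade-off is the extra retrieval step on every query.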
I'm still learning and expanding my knowledge in this field, and I appreciate the opportunity to share experiences with fellow developers.

Thank you for your response.
I am researching the same problem, but I started with RAG using LangChain. If you would like to know more about it, I am happy to help. You can also use my demo project (https://huggingface.co/spaces/Alanturner2/Arxiv-pdf-summarization), although the retriever has long latency; it takes a long time to return results.
I have another question: did you train the model on just the one book? Why didn't you use RAG from the start?
I look forward to your next answer!

Hello, I decided not to use the RAG approach because I believe responses are faster when the book content is static and the model has already been trained on that specific data. What I want is a LLaMA 3.2 model specialized in psychology, not a general LLaMA 3.2 that relies on RAG to access psychological data.
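For comparison, this is roughly what querying a specialized fine-tuned model looks like at inference time; the model path llama32-psych is a placeholder. There is no retrieval step between the question and the answer, which is the latency advantage I am describing.

```python
# Minimal sketch of querying a fine-tuned specialist model directly.
# Assumption: "llama32-psych" is a placeholder path for the fine-tuned weights.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("llama32-psych")
model = AutoModelForCausalLM.from_pretrained("llama32-psych")

prompt = "Explain the main ideas of cognitive behavioral therapy."
inputs = tokenizer(prompt, return_tensors="pt")
# Generation goes straight from prompt to answer; there is no document lookup,
# so latency depends only on the model itself.
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```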

Thank you for answering again.
Could you share the code with me?
