Shrirang Mahajan

NotShrirang

https://www.shrirangmahajan.in/

AI & ML interests

Deep Learning, LLMs, Machine Learning, Generative AI

Recent Activity

liked a model 4 days ago

deepseek-ai/DeepSeek-R1-Distill-Llama-70B

liked a model 4 days ago

deepseek-ai/DeepSeek-V3

liked a model 4 days ago

deepseek-ai/DeepSeek-R1

NotShrirang's activity

liked 3 models 4 days ago

@ariG23498 Hey, I read the article again, and it feels a lot easier to read. Kudos on your quick response! I know that changing something you have put effort into is not easy.

Thanks for directly aligning with my preference!
Loss: 📉

reacted to burtenshaw's post with 🔥 13 days ago

We’re launching a FREE and CERTIFIED course on Agents!

We're thrilled to announce the launch of the Hugging Face Agents course on Learn! This interactive, certified course will guide you through building and deploying your own AI agents.

Here's what you'll learn:

- Understanding Agents: We'll break down the fundamentals of AI agents, showing you how they use LLMs to perceive their environment (observations), reason about it (thoughts), and take actions. Think of a smart assistant that can book appointments, answer emails, or even write code based on your instructions.
- Building with Frameworks: You'll dive into popular agent frameworks like LangChain, LlamaIndex and smolagents. These tools provide the building blocks for creating complex agent behaviors.
- Real-World Applications: See how agents are used in practice, from automating SQL queries to generating code and summarizing complex documents.
- Certification: Earn a certification by completing the course modules, implementing a use case, and passing a benchmark assessment. This proves your skills in building and deploying AI agents.

Audience

This course is designed for anyone interested in the future of AI. Whether you're a developer, data scientist, or simply curious about AI, this course will equip you with the knowledge and skills to build your own intelligent agents.

Enroll today and start building the next generation of AI agent applications!

https://bit.ly/hf-learn-agents

commented on Simplifying Alignment: From RLHF to Direct Preference Optimization (DPO) 13 days ago

Great explanation! How they were able to convert an optimization problem into a differentiable equation is just amazing!
I was recently trying to understand what DPO does under the hood, and I watched this video by @hkproj. Great work!

Also, just filling in for newbies like me:

- In the third step of "Reformulating the RLHF Objective", we divide the maximization objective by −β. Because of the negative sign, it becomes a minimization problem.
- In "Introducing the Partition Function", Z(x) is a normalization constant. I wasn't able to understand how the term Z(x) came into the picture and how it is substituted, so I asked ChatGPT and I got this!

This makes a little bit of sense, but I have not verified whether it is correct.
There are some helpful steps in the "Mathematical Derivations" section of the DPO paper: https://arxiv.org/pdf/2305.18290
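
For anyone else stuck on the same two steps, here is a sketch of how the sign flip and Z(x) fit together, based on my reading of the derivation in the DPO paper's appendix. The symbols (π for the policy, π_ref for the reference policy, r for the reward, β for the KL weight, D for the prompt dataset, and π* for the normalized distribution) follow the paper's notation and are assumed to match the article's. Treat it as a summary under those assumptions, not a checked reproduction of the article's exact steps.

```latex
\begin{align*}
&\max_{\pi}\; \mathbb{E}_{x \sim D,\, y \sim \pi(y \mid x)}\big[ r(x, y) \big]
  - \beta\, D_{\mathrm{KL}}\big[ \pi(y \mid x) \,\|\, \pi_{\mathrm{ref}}(y \mid x) \big] \\
% Write the KL term as an expectation of a log-ratio under \pi.
&= \max_{\pi}\; \mathbb{E}_{x \sim D}\, \mathbb{E}_{y \sim \pi(y \mid x)}
  \left[ r(x, y) - \beta \log \frac{\pi(y \mid x)}{\pi_{\mathrm{ref}}(y \mid x)} \right] \\
% Divide by -\beta: the negative sign turns max into min.
&= \min_{\pi}\; \mathbb{E}_{x \sim D}\, \mathbb{E}_{y \sim \pi(y \mid x)}
  \left[ \log \frac{\pi(y \mid x)}{\pi_{\mathrm{ref}}(y \mid x)} - \frac{1}{\beta}\, r(x, y) \right] \\
% Fold the reward into the log, then multiply and divide by Z(x) inside it.
&= \min_{\pi}\; \mathbb{E}_{x \sim D}\, \mathbb{E}_{y \sim \pi(y \mid x)}
  \left[ \log \frac{\pi(y \mid x)}
  {\frac{1}{Z(x)}\, \pi_{\mathrm{ref}}(y \mid x) \exp\!\big( \tfrac{1}{\beta} r(x, y) \big)}
  - \log Z(x) \right],
\end{align*}
where
\[
Z(x) = \sum_{y} \pi_{\mathrm{ref}}(y \mid x) \exp\!\Big( \tfrac{1}{\beta}\, r(x, y) \Big),
\qquad
\pi^{*}(y \mid x) = \frac{1}{Z(x)}\, \pi_{\mathrm{ref}}(y \mid x) \exp\!\Big( \tfrac{1}{\beta}\, r(x, y) \Big).
\]
% Z(x) does not depend on \pi, so the objective is a KL divergence plus a constant:
\[
\min_{\pi}\; \mathbb{E}_{x \sim D}\Big[ D_{\mathrm{KL}}\big( \pi(y \mid x) \,\|\, \pi^{*}(y \mid x) \big) - \log Z(x) \Big].
\]
```

Seen this way, Z(x) is not substituted from anywhere. It is added and subtracted (as log Z(x)) so that the denominator sums to 1 over y and becomes a valid probability distribution π*. Since the KL divergence is zero only when the two distributions are equal, the minimizer is π = π*.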