PlanLLM

Model Details

PlanLLM is a conversational assistant trained to guide users through a recipe from beginning to end and to answer any related requests they might have along the way. The model was also tested on DIY tasks, where it performed similarly.

Training

PlanLLM was trained by fine-tuning a Vicuna model on synthetic dialogues between users and an assistant about a given recipe. The model was first trained with supervised fine-tuning (SFT) and then with Direct Preference Optimization (DPO).
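For readers unfamiliar with DPO, the sketch below computes the standard DPO loss for a single preference pair from summed sequence log-probabilities. It is an illustration of the general objective, not the training code used for PlanLLM: the function name, the argument names, and the value `beta=0.1` are assumptions, since this card does not state the beta that was used.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are summed log-probabilities of each response under the policy
    being trained and under the frozen reference (here, the SFT model).
    beta=0.1 is an illustrative default, not a value from this model card.
    """
    # How much more the policy prefers each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) == log(1 + exp(-x)), evaluated in a numerically stable way.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))

# Identical policy and reference: the loss equals log(2).
loss_neutral = dpo_loss(-10.0, -10.0, -10.0, -10.0)
# Policy favours the chosen response more than the reference does: loss drops.
loss_good = dpo_loss(-5.0, -15.0, -10.0, -10.0)
```

The loss falls below log(2) only when the policy raises the chosen response relative to the rejected one by more than the reference does, which is what pushes the model toward the preferred responses.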

Details

SFT:

  • Train Type: Fully Sharded Data Parallel (FSDP) with 4 A100 40GB GPUs
  • Batch Size: 1
  • Gradient Acc. Steps: 64
  • Train steps: 600
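As a rough guide to what these SFT settings imply, the sketch below computes the effective batch size per optimizer step. It assumes the batch size of 1 is per GPU, which this card does not state explicitly; if it is a global batch size, the result is 64 rather than 256. The function name is hypothetical.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices):
    """Number of training sequences contributing to one optimizer step."""
    return per_device_batch * grad_accum_steps * num_devices

# Assumption: "Batch Size: 1" is per GPU, across the 4 GPUs used for SFT.
sft_effective_batch = effective_batch_size(1, 64, 4)   # 256
# Total sequences processed over the 600 SFT train steps under that assumption.
sft_sequences_seen = sft_effective_batch * 600          # 153600
```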

DPO:

  • Train Type: Low-Rank Adaptation (LoRA) with 1 A100 40GB GPU
  • LoRA Rank: 64
  • LoRA Alpha: 16
  • Batch Size: 1
  • Gradient Acc. Steps: 64
  • Train steps: 350
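To make the LoRA settings concrete, the sketch below computes the scaling factor applied to the low-rank update under the original LoRA formulation (alpha divided by rank) and the number of trainable parameters added to one weight matrix. The 4096 x 4096 matrix is an illustrative assumption (the hidden size of a 7B LLaMA-family model); this card does not say which modules were adapted, and the function names are hypothetical.

```python
def lora_scaling(alpha, rank):
    """Scaling applied to the low-rank update B @ A in standard LoRA."""
    return alpha / rank

def lora_trainable_params(d_out, d_in, rank):
    """Parameters in the two LoRA factors: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)

scale = lora_scaling(alpha=16, rank=64)                    # 0.25
# Assumption: one square 4096 x 4096 projection, chosen for illustration only.
adapter_params = lora_trainable_params(4096, 4096, 64)     # 524288
full_params = 4096 * 4096                                  # 16777216
trainable_fraction = adapter_params / full_params          # 0.03125
```

With rank 64 and alpha 16, the update is scaled by 0.25, and each adapted matrix of this size trains about 3% of the parameters that full fine-tuning would.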

Dataset

PlanLLM was trained on synthetic user-system dialogues in which the role of the system is to help the user complete a predetermined task. In our case, the tasks are recipes.

These dialogues were generated using the user utterances collected from Alexa users who interacted with TWIZ, our entry in the Alexa Prize Taskbot Challenge 1. Using an intent classifier, we mapped each user utterance to a specific intent, which allowed us to collect intent-specific utterances and to build a dialogue graph for each dialogue (with intents as the graph nodes). For the system responses, we used a combination of templates, external knowledge sources, and Large Language Models.

With these resources, we built a pipeline that navigates a dialogue graph, generating a user request and a system response at each turn, to create complete dialogues that follow patterns similar to those of real users.
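The general idea of such a pipeline can be sketched as a random walk over an intent graph. Everything in the example below is hypothetical and chosen for illustration: the intent names, the transition probabilities, the utterance pools, and the response templates are not taken from the actual PlanLLM pipeline or from TWIZ data.

```python
import random

# Hypothetical intent graph: intent -> list of (next intent, probability).
INTENT_GRAPH = {
    "start": [("next_step", 0.7), ("question", 0.3)],
    "next_step": [("next_step", 0.5), ("question", 0.3), ("stop", 0.2)],
    "question": [("next_step", 0.8), ("stop", 0.2)],
}

# Hypothetical intent-specific utterance pools (stand-ins for real user data).
UTTERANCES = {
    "start": ["let's start cooking", "begin the recipe"],
    "next_step": ["next", "what's the next step"],
    "question": ["how long does this take", "can I swap an ingredient"],
    "stop": ["stop", "I'm done"],
}

def respond(intent, step_index, steps):
    """Template-based system response; a placeholder stands in for
    answers that would come from a knowledge source or an LLM."""
    if intent in ("start", "next_step"):
        return f"Step {step_index + 1}: {steps[step_index]}"
    if intent == "question":
        return "[answer from an external knowledge source or LLM]"
    return "Ending the task."

def generate_dialogue(steps, rng, max_turns=20):
    """Walk the intent graph, emitting one user/system pair per turn."""
    dialogue, intent, step_index = [], "start", 0
    for _ in range(max_turns):
        if intent == "next_step":
            # Advance through the recipe, staying on the last step at the end.
            step_index = min(step_index + 1, len(steps) - 1)
        dialogue.append({
            "intent": intent,
            "user": rng.choice(UTTERANCES[intent]),
            "system": respond(intent, step_index, steps),
        })
        if intent == "stop":
            break
        next_intents, weights = zip(*INTENT_GRAPH[intent])
        intent = rng.choices(next_intents, weights=weights, k=1)[0]
    return dialogue

example = generate_dialogue(["Boil water", "Add pasta", "Drain"], random.Random(0))
```

In a real setting, the transition probabilities would presumably be estimated from the dialogue graphs of observed conversations, so that generated dialogues follow the same flow as real ones.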

Details

SFT:

  • Dialogues: 10k (90/5/5 splits)
  • Recipes: 1000

DPO:

  • Dialogues: 3k (90/5/5 splits)
  • Recipes: 1000 (same recipes used for SFT)
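The split sizes implied by these figures can be worked out with a small helper. The sketch assumes the 90/5/5 ratios are in train/validation/test order and that splitting happens at the dialogue level; neither is stated in this card, and the function name and seed are hypothetical.

```python
import random

def split_dialogues(dialogues, percents=(90, 5, 5), seed=0):
    """Shuffle and split into three parts by integer percentages."""
    items = list(dialogues)
    random.Random(seed).shuffle(items)
    n_first = len(items) * percents[0] // 100
    n_second = len(items) * percents[1] // 100
    return (items[:n_first],
            items[n_first:n_first + n_second],
            items[n_first + n_second:])

# Integer ranges stand in for the 10k SFT and 3k DPO dialogues.
sft_train, sft_valid, sft_test = split_dialogues(range(10_000))
dpo_train, dpo_valid, dpo_test = split_dialogues(range(3_000))
sft_sizes = (len(sft_train), len(sft_valid), len(sft_test))   # (9000, 500, 500)
dpo_sizes = (len(dpo_train), len(dpo_valid), len(dpo_test))   # (2700, 150, 150)
```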

License

Same as Vicuna: a non-commercial Apache 2.0 license.

Paper

"Plan-Grounded Large Language Models for Dual Goal Conversational Settings" (Accepted at EACL 2024) Diogo Glória-Silva, Rafael Ferreira, Diogo Tavares, David Semedo, João Magalhães

Cite Us!

@InProceedings{planllm_eacl24,
  author="Glória-Silva, Diogo
          and Ferreira, Rafael
          and Tavares, Diogo
          and Semedo, David
          and Magalhães, João",
  title="Plan-Grounded Large Language Models for Dual Goal Conversational Settings",
  booktitle="European Chapter of the Association for Computational Linguistics (EACL 2024)",
  year="2024",
}