iofu728 committed on
Commit
d64a23a
·
1 Parent(s): c5b556d

Feature(LLMLingua): update the news

Files changed (1)
  1. app.py +7 -2
app.py CHANGED
@@ -7,7 +7,7 @@ INTRO = """
  # LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models (EMNLP 2023) [[paper](https://arxiv.org/abs/2310.05736)]
  _Huiqiang Jiang, Qianhui Wu, Chin-Yew Lin, Yuqing Yang and Lili Qiu_
 
- This is an early demo of the prompt compression method LLMLingua.
+ ### This is an <b>early demo</b> of the prompt compression method LLMLingua and <b>the capabilities are limited</b>, restricted to using only the GPT-2 small size mode.
 
  It should be noted that due to limited resources, we only provide the **GPT2-Small** size language model in this demo. Using the **LLaMA2-7B** as a small language model would result in a significant performance improvement, especially at high compression ratios.
 
@@ -19,10 +19,15 @@ To use it, upload your prompt and set the compression target.
  2. ✅ Set the target_token or compression ratio.
  3. 🤔 Try experimenting with different target compression ratios or other hyperparameters to optimize the performance.
 
- You can check our [repo](https://aka.ms/LLMLingua)!
+ You can check our [project page](https://llmlingua.com/)!
 
  We also has a work to compress long context scenories, using less cost but even improve the downstream performance, LongLLMLingua.<br>
  [LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression](https://arxiv.org/abs/2310.06839) (Under Review).<br>
+
+ ## News
+
+ - 🎈 We launched a [project page](https://llmlingua.com/) showcasing real-world case studies, including RAG, Online Meetings, CoT, and Code;
+ - 👾 LongLLMLingua has been incorporated into the [LlamaIndex pipeline](https://github.com/run-llama/llama_index/blob/main/llama_index/indices/postprocessor/longllmlingua.py), which is a widely used RAG framework.
  """
 
  INTRO_EXAMPLES = '''