This is a `gpt2-large` model, fine-tuned on the Wikitext-103 dataset.
It achieves a perplexity of **10.56** with a "sliding window" context, evaluated using the `run_clm.py` script at [https://github.com/neulab/knn-transformers](https://github.com/neulab/knn-transformers).

| Base LM: | `distilgpt2` | `gpt2` |
| :--- | ----: | ---: |
| base perplexity | 18.25 | 14.84 |
| + kNN-LM | 15.03 | 12.57 |
| + RetoMaton | **14.70** | **12.46** |

This model was released as part of the paper ["Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval"](https://arxiv.org/pdf/2201.12431.pdf) (ICML 2022).

For more information, see: [https://github.com/neulab/knn-transformers](https://github.com/neulab/knn-transformers)

If you use this model, please cite:

```
@inproceedings{alon2022neuro,
  title={Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval},
  author={Alon, Uri and Xu, Frank and He, Junxian and Sengupta, Sudipta and Roth, Dan and Neubig, Graham},
  booktitle={International Conference on Machine Learning},
  pages={468--485},
  year={2022},
  organization={PMLR}
}
```
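
The "sliding window" evaluation mentioned at the top of this card can be illustrated with a minimal sketch. It is not taken from `run_clm.py`; it only shows the general idea of strided perplexity evaluation, where each window sees up to `max_length` tokens of context but only the tokens not covered by an earlier window are scored. The names `sliding_windows`, `perplexity`, `max_length`, and `stride` are hypothetical and chosen for illustration.

```python
import math


def sliding_windows(num_tokens, max_length, stride):
    """Yield (begin, end, num_scored) spans for strided evaluation.

    Each window covers tokens [begin, end). Only the last `num_scored`
    tokens of a window are scored; the earlier ones serve as context.
    This is an illustrative sketch, not the repository's implementation.
    """
    prev_end = 0
    for begin in range(0, num_tokens, stride):
        end = min(begin + max_length, num_tokens)
        num_scored = end - prev_end  # tokens not scored by an earlier window
        yield begin, end, num_scored
        prev_end = end
        if end == num_tokens:
            break


def perplexity(token_nlls):
    """Perplexity is exp of the mean per-token negative log-likelihood."""
    return math.exp(sum(token_nlls) / len(token_nlls))


if __name__ == "__main__":
    # Toy sizes for readability; real evaluations use far larger values.
    for begin, end, num_scored in sliding_windows(10, max_length=4, stride=2):
        print(f"window [{begin}, {end}) scores last {num_scored} tokens")
```

With a smaller `stride`, each scored token tends to have more preceding context, which usually lowers the reported perplexity at the cost of more forward passes.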