Kiran Kamble

kiranr

ki6an

AI & ML interests

nlp,llm

Recent Activity

new activity 8 days ago

Writer/palmyra-base:Adding `safetensors` variant of this model

updated a model 14 days ago

Writer/palmyra-creative-dummy-weights

updated a model 19 days ago

Writer/palmyra-x-4.3-long-cite

kiranr's activity

upvoted a paper 4 months ago

Writing in the Margins: Better Inference Pattern for Long Context Retrieval

Paper • 2408.14906 • Published Aug 27, 2024 • 138

upvoted an article 4 months ago

Using Writer Framework with Hugging Face Spaces

Article • Aug 20, 2024 • 30

upvoted 2 papers 4 months ago

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 58

To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 41

upvoted a paper 5 months ago

DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search

Paper • 2408.08152 • Published Aug 15, 2024 • 52

upvoted a collection 6 months ago

DCLM

Collection

DCLM Models + Datasets • 6 items • Updated Oct 4, 2024 • 24

upvoted a paper 9 months ago

ReALM: Reference Resolution As Language Modeling

Paper • 2403.20329 • Published Mar 29, 2024 • 21

upvoted 5 papers 11 months ago

Linear Transformers with Learnable Kernel Functions are Better In-Context Models

Paper • 2402.10644 • Published Feb 16, 2024 • 79

Speculative Streaming: Fast LLM Inference without Auxiliary Models

Paper • 2402.11131 • Published Feb 16, 2024 • 42

BiLLM: Pushing the Limit of Post-Training Quantization for LLMs

Paper • 2402.04291 • Published Feb 6, 2024 • 48

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 73

Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks

Paper • 2402.04248 • Published Feb 6, 2024 • 30

upvoted a collection 11 months ago

Papers about model merging

Collection

referenced in the mergekit repo: https://github.com/cg123/mergekit • 4 items • Updated Feb 13, 2024 • 14

upvoted a collection about 1 year ago

Llamafied Yi

Collection

Yi base models converted to Llama architecture. • 4 items • Updated Nov 14, 2023 • 9

upvoted a paper about 1 year ago

Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model

Paper • 2310.09520 • Published Oct 14, 2023 • 10

upvoted 2 papers over 1 year ago

Large Language Models as Optimizers

Paper • 2309.03409 • Published Sep 7, 2023 • 75

Personality Traits in Large Language Models

Paper • 2307.00184 • Published Jul 1, 2023 • 20