Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval Mar 22, 2024 • 70
Article Fine-tune ModernBERT for RAG with Synthetic Data By sdiazlor • about 14 hours ago • 10
CodeXEmbed: A Generalist Embedding Model Family for Multiligual and Multi-task Code Retrieval Paper • 2411.12644 • Published Nov 19, 2024 • 3
Article Train 400x faster Static Embedding Models with Sentence Transformers 6 days ago • 113
Article Python Is All You Need? Introducing Dria-Agent-α By andthattoo • 10 days ago • 22
Agentless: Demystifying LLM-based Software Engineering Agents Paper • 2407.01489 • Published Jul 1, 2024 • 58
Trans-Tokenization and Cross-lingual Vocabulary Transfers: Language Adaptation of LLMs for Low-Resource NLP Paper • 2408.04303 • Published Aug 8, 2024 • 15
Article Announcing NVIDIA Cosmos World Foundation Models By mingyuliutw • 14 days ago • 23
KaLM-Embedding: Superior Training Data Brings A Stronger Embedding Model Paper • 2501.01028 • Published 19 days ago • 12
PubMedBERT Embeddings M2V Collection Models distilled with Model2Vec - 100K / 500K / 1M / 2M / 8M parameter variants. • 5 items • Updated 13 days ago • 3
ModernGLiNER Collection GLiNER models based on modern encoder architectures • 2 items • Updated 28 days ago • 6
Article Fine-tune ModernBERT for text classification using synthetic data By davidberenstein1957 • 22 days ago • 23
Granite 3.1 Language Models Collection A series of language models with 128K context length, trained by IBM and licensed under the Apache 2.0 license. • 8 items • Updated Dec 18, 2024 • 48
Article Use Models from the Hugging Face Hub in LM Studio By yagilb • Nov 28, 2024 • 132
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 125