Yedson54's Collections: Architectures
Associative Recurrent Memory Transformer • arXiv:2407.04841 • 32 upvotes
Mixture-of-Agents Enhances Large Language Model Capabilities • arXiv:2406.04692 • 55 upvotes
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality • arXiv:2405.21060 • 63 upvotes
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone • arXiv:2404.14219 • 253 upvotes
Rho-1: Not All Tokens Are What You Need • arXiv:2404.07965 • 88 upvotes
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs • arXiv:2406.16860 • 59 upvotes
Kolmogorov-Arnold Transformer • arXiv:2409.10594 • 39 upvotes
Differential Transformer • arXiv:2410.05258 • 169 upvotes
Selective Attention Improves Transformer • arXiv:2410.02703 • 23 upvotes