my_read_book - a ryanafufu Collection

ryanafufu 's Collections

my_read_book

updated 6 days ago

MambaVision: A Hybrid Mamba-Transformer Vision Backbone

Paper • 2407.08083 • Published Jul 10, 2024 • 28
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 58
The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Paper • 2408.15237 • Published Aug 27, 2024 • 37
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Paper • 2409.11355 • Published Sep 17, 2024 • 28
OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 108
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

Paper • 2409.12183 • Published Sep 18, 2024 • 36
InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning

Paper • 2409.12568 • Published Sep 19, 2024 • 48
Imagine yourself: Tuning-Free Personalized Image Generation

Paper • 2409.13346 • Published Sep 20, 2024 • 68
Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 136
MaskBit: Embedding-free Image Generation via Bit Tokens

Paper • 2409.16211 • Published Sep 24, 2024 • 16
Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published Sep 27, 2024 • 94
FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion

Paper • 2412.09626 • Published 22 days ago • 19
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published 22 days ago • 80
ColorFlow: Retrieval-Augmented Image Sequence Colorization

Paper • 2412.11815 • Published 19 days ago • 26
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search

Paper • 2412.18319 • Published 11 days ago • 33