dynamicjerry's Collections
Papers
Gemini: A Family of Highly Capable Multimodal Models
Paper • 2312.11805 • Published • 44
Unlocking Pre-trained Image Backbones for Semantic Image Synthesis
Paper • 2312.13314 • Published • 7
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
Paper • 2312.11514 • Published • 257
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
Paper • 2312.09911 • Published • 53
Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models
Paper • 2312.09608 • Published • 13
VecFusion: Vector Font Generation with Diffusion
Paper • 2312.10540 • Published • 21
SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing
Paper • 2312.11392 • Published • 19
HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models
Paper • 2312.14091 • Published • 15
Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models
Paper • 2410.02416 • Published • 26
FashionComposer: Compositional Fashion Image Generation
Paper • 2412.14168 • Published • 16
No More Adam: Learning Rate Scaling at Initialization is All You Need
Paper • 2412.11768 • Published • 41
Apollo: An Exploration of Video Understanding in Large Multimodal Models
Paper • 2412.10360 • Published • 136
FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion
Paper • 2412.09626 • Published • 19
SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs
Paper • 2412.08347 • Published • 4
FastVLM: Efficient Vision Encoding for Vision Language Models
Paper • 2412.13303 • Published • 13