James Chang

strategist922

AI & ML interests

Multimodal Learning

Organizations

None yet

strategist922's activity

upvoted a paper 2 months ago

Pyramidal Flow Matching for Efficient Video Generative Modeling

Paper • 2410.05954 • Published Oct 8, 2024 • 38

upvoted a paper 3 months ago

LVCD: Reference-based Lineart Video Colorization with Diffusion Models

Paper • 2409.12960 • Published Sep 19, 2024 • 24

upvoted an article 3 months ago

Welcome FalconMamba: The first strong attention-free 7B model

Article • Aug 12, 2024 • 108

upvoted 2 papers 3 months ago

Hyena Hierarchy: Towards Larger Convolutional Language Models

Paper • 2302.10866 • Published Feb 21, 2023 • 7

Qwen2.5-Coder Technical Report

Paper • 2409.12186 • Published Sep 18, 2024 • 139

upvoted 6 papers 4 months ago

Granite Code Models: A Family of Open Foundation Models for Code Intelligence

Paper • 2405.04324 • Published May 7, 2024 • 22

Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

Paper • 2406.08464 • Published Jun 12, 2024 • 65

Deduplicating Training Data Makes Language Models Better

Paper • 2107.06499 • Published Jul 14, 2021 • 4

Training Compute-Optimal Large Language Models

Paper • 2203.15556 • Published Mar 29, 2022 • 10

The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only

Paper • 2306.01116 • Published Jun 1, 2023 • 32

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation

Paper • 2108.12409 • Published Aug 27, 2021 • 5

upvoted a paper 8 months ago

Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks

Paper • 2402.04248 • Published Feb 6, 2024 • 30

upvoted a paper 9 months ago

Multi-Head Mixture-of-Experts

Paper • 2404.15045 • Published Apr 23, 2024 • 59

upvoted a collection 9 months ago

Idefics2 🐶

Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated May 6, 2024 • 91

upvoted a paper 9 months ago

Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Paper • 2401.09417 • Published Jan 17, 2024 • 59

upvoted 5 papers 10 months ago

LLM Agent Operating System

Paper • 2403.16971 • Published Mar 25, 2024 • 65

Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation

Paper • 2403.16990 • Published Mar 25, 2024 • 25

Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

Paper • 2403.09629 • Published Mar 14, 2024 • 75

Simple linear attention language models balance the recall-throughput tradeoff

Paper • 2402.18668 • Published Feb 28, 2024 • 18

Nemotron-4 15B Technical Report

Paper • 2402.16819 • Published Feb 26, 2024 • 42