Ethan Maxwell

EthanMaxwell

AI & ML interests

None yet

Recent Activity

liked a model about 1 month ago

OpenGVLab/InternVL2_5-26B

liked a model about 1 month ago

utter-project/EuroLLM-9B

liked a model about 1 month ago

NexaAIDev/OmniAudio-2.6B

Organizations

None yet

EthanMaxwell's activity

liked 3 models about 1 month ago

OpenGVLab/InternVL2_5-26B

Image-Text-to-Text • Updated Dec 18, 2024 • 4.95k • 33

utter-project/EuroLLM-9B

Text Generation • Updated Dec 9, 2024 • 1.01k • 62

NexaAIDev/OmniAudio-2.6B

Audio-Text-to-Text • Updated Dec 13, 2024 • 5.55k • 239

liked a model about 2 months ago

NexaAIDev/Qwen2-Audio-7B-GGUF

Audio-Text-to-Text • Updated Nov 25, 2024 • 6.8k • 133

liked 5 models about 2 months ago

NexaAIDev/OmniVLM-968M

Updated Dec 17, 2024 • 1.07k • 497

genmo/mochi-1-preview

Text-to-Video • Updated Dec 18, 2024 • 41.8k • 1.15k

HuggingFaceTB/SmolLM2-1.7B-Instruct

Text Generation • Updated 15 days ago • 82.5k • 475

microsoft/OmniParser

Image-Text-to-Text • Updated Dec 2, 2024 • 1.22k • 1.53k

Etched/oasis-500m

Updated Nov 4, 2024 • 174 • 434

upvoted 11 papers about 2 months ago

SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

Paper • 2411.05007 • Published Nov 7, 2024 • 17

GazeGen: Gaze-Driven User Interaction for Visual Content Generation

Paper • 2411.04335 • Published Nov 7, 2024 • 14

Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks?

Paper • 2411.05000 • Published Nov 7, 2024 • 21

VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos

Paper • 2411.04923 • Published Nov 7, 2024 • 20

Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large Language Model

Paper • 2411.04496 • Published Nov 7, 2024 • 22

DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion

Paper • 2411.04928 • Published Nov 7, 2024 • 49

Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

Paper • 2411.04996 • Published Nov 7, 2024 • 50

TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation

Paper • 2411.04709 • Published Nov 5, 2024 • 25

BitNet a4.8: 4-bit Activations for 1-bit LLMs

Paper • 2411.04965 • Published Nov 7, 2024 • 64

ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning

Paper • 2411.05003 • Published Nov 7, 2024 • 70

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Paper • 2411.04905 • Published Nov 7, 2024 • 113