taesiri PRO

https://taesiri.ai/

AI & ML interests

AGI ... one linear layer at a time

Recent Activity

updated a dataset about 11 hours ago

taesiri/o1-pro-image-difference-captioning

updated a dataset about 12 hours ago

taesiri/PhotoshopRequest-DailyDump-January-2025

liked a dataset about 12 hours ago

PowerInfer/QWQ-LONGCOT-500K

taesiri's activity

upvoted 6 papers about 15 hours ago

MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models

Paper • 2501.00316 • Published 4 days ago • 16

MapQaTor: A System for Efficient Annotation of Map Query Datasets

Paper • 2412.21015 • Published 4 days ago • 6

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

Paper • 2501.00599 • Published 3 days ago • 27

CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Paper • 2501.01257 • Published 1 day ago • 30

VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control

Paper • 2501.01427 • Published 1 day ago • 30

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published 2 days ago • 45

upvoted 3 papers about 23 hours ago

upvoted a paper 1 day ago

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

Paper • 2412.19723 • Published 7 days ago • 63

upvoted 4 papers 4 days ago

Edicho: Consistent Image Editing in the Wild

Paper • 2412.21079 • Published 4 days ago • 19

On the Compositional Generalization of Multimodal LLMs for Medical Imaging

Paper • 2412.20070 • Published 7 days ago • 39

Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization

Paper • 2412.18525 • Published 10 days ago • 59

1.58-bit FLUX

Paper • 2412.18653 • Published 10 days ago • 63

upvoted 2 papers 5 days ago

HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Paper • 2412.18925 • Published 9 days ago • 82

Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models

Paper • 2412.18605 • Published 10 days ago • 17

upvoted a paper 6 days ago

YuLan-Mini: An Open Data-efficient Language Model

Paper • 2412.17743 • Published 11 days ago • 59

upvoted a paper 8 days ago

Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search

Paper • 2412.18319 • Published 11 days ago • 33

upvoted a collection 8 days ago

InternVL2.5-MPO

Collection

Enhancing the Reasoning Ability of MLLMs via Mixed Preference Optimization • 16 items • Updated 4 days ago • 23

upvoted a paper 10 days ago

B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners

Paper • 2412.17256 • Published 12 days ago • 42