Recently, we open-sourced YaFSDP, Yandex’s tool for efficient distributed training of LLMs.
Here are some of the key ideas YaFSDP uses to provide speedups and memory savings over FSDP:
• Allocate just two buffers for all gathered weights and reuse them across the whole transformer, bypassing the torch memory allocator (a sketch follows this list);
• Gather the small normalization layers once at the beginning of the iteration and average their gradients only at the end;
• Move gradient division to the very end of the backward pass.
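To make the first idea concrete, here is a minimal sketch of the double-buffering technique, not YaFSDP's actual implementation: two pre-allocated buffers are reused for the gathered weights of every transformer layer, so no per-layer allocation ever goes through the torch caching allocator. The buffer size, the `gather_layer_weights` helper, and the `shard` argument are hypothetical names introduced only for this example.

```python
import torch
import torch.distributed as dist

# Illustrative sketch only (not YaFSDP's code). Assumes
# dist.init_process_group(...) has already been called and that each rank
# holds a 1-D shard of every transformer layer's parameters.

world_size = dist.get_world_size()
MAX_LAYER_NUMEL = 1_000_000  # hypothetical upper bound on one layer's parameter count

# Two persistent buffers: while layer i computes out of one buffer,
# layer i + 1 can already be all-gathered into the other.
buffers = [
    torch.empty(MAX_LAYER_NUMEL, device="cuda", dtype=torch.bfloat16)
    for _ in range(2)
]

def gather_layer_weights(layer_idx: int, shard: torch.Tensor) -> torch.Tensor:
    """All-gather one layer's parameter shard into a reused buffer."""
    out = buffers[layer_idx % 2][: shard.numel() * world_size]
    dist.all_gather_into_tensor(out, shard)
    return out
```

Because consecutive layers alternate between the two buffers, gathering the next layer's weights can overlap with computing the current one, while peak memory for unsharded weights stays bounded by the two buffers.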
YaFSDP can be used in conjunction with Hugging Face workflows and is up to 25% faster than FSDP.