BigScience Workshop

non-profit

https://bigscience.huggingface.co

bigscienceW

bigscience-workshop


AI & ML interests

A one-year-long research workshop on large language models: the Summer of Language Models 21 🌸

Recent Activity

Jekaterina authored a paper 4 days ago

INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge

DeividasM authored a paper 4 days ago

Bridging the Data Provenance Gap Across Text, Speech and Video

Jekaterina authored a paper 4 days ago

DEPAC: a Corpus for Depression and Anxiety Detection from Speech


bigscience's activity

christopher

in bigscience/bloom-1b1-intermediate 14 days ago

Adding `safetensors` variant of this model

#2 opened 14 days ago by

SFconvertbot

christopher

in bigscience/bloom-7b1-intermediate 15 days ago

Adding `safetensors` variant of this model

#4 opened 15 days ago by

SFconvertbot

soldni

authored 2 papers 15 days ago

RouterRetriever: Exploring the Benefits of Routing over Multiple Expert Embedding Models

Paper • 2409.02685 • Published Sep 4, 2024 • 1

Establishing Task Scaling Laws via Compute-Efficient Model Ladders

Paper • 2412.04403 • Published 29 days ago • 2

yjernite

posted an update 22 days ago

Post

2070

🇪🇺 Policy Thoughts in the EU AI Act Implementation 🇪🇺

There is a lot to like in the first draft of the EU GPAI Code of Practice, especially as regards transparency requirements. The Systemic Risks part, on the other hand, is concerning for both smaller developers and for external stakeholders.

I wrote more on this topic ahead of the next draft. TLDR: more attention to immediate large-scale risks and to collaborative solutions supported by evidence can help everyone - as long as developers disclose sufficient information about their design choices and deployment contexts.

Full blog here, based on our submitted response with @frimelle and @brunatrevelin:

https://huggingface.co/blog/yjernite/eu-draft-cop-risks#on-the-proposed-taxonomy-of-systemic-risks

2 replies

christopher

posted an update 26 days ago

Post

1582

The folks at Foursquare released a dataset of 104.5 million places of interest (foursquare/fsq-os-places) and here's all of them on a plot

3 replies

christopher

posted an update 29 days ago

Post

2331

The Lichess database of games, puzzles, and engine evaluations is now on the Hub: https://huggingface.co/Lichess

Billions of chess data points to download, query, and stream, and we're excited to see what you'll build with it! ♟️ 🤗

- Lichess/positions-datasets-66f50837db5cd3287d60d489
- Lichess/games-datasets-66f508df78f4b43e1bb2d353

paws

authored a paper 29 days ago

Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models

Paper • 2412.02980 • Published about 1 month ago • 12

soldni

authored 2 papers about 1 month ago

TÜLU 3: Pushing Frontiers in Open Language Model Post-Training

Paper • 2411.15124 • Published Nov 22, 2024 • 57

OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs

Paper • 2411.14199 • Published Nov 21, 2024 • 29

albertvillanova

posted an update about 2 months ago

Post

1443

🚨 How green is your model? 🌱 Introducing a new feature in the Comparator tool: Environmental Impact for responsible #LLM research!
👉 open-llm-leaderboard/comparator
Now, you can not only compare models by performance, but also by their environmental footprint!

🌍 The Comparator calculates CO₂ emissions during evaluation and shows key model characteristics: evaluation score, number of parameters, architecture, precision, type... 🛠️
Make informed decisions about your model's impact on the planet and join the movement towards greener AI!

sagot

authored a paper about 2 months ago

CamemBERT 2.0: A Smarter French Language Model Aged to Perfection

Paper • 2411.08868 • Published Nov 13, 2024 • 12

albertvillanova

posted an update about 2 months ago

Post

1530

🚀 New feature of the Comparator of the 🤗 Open LLM Leaderboard: now compare models with their base versions & derivatives (finetunes, adapters, etc.). Perfect for tracking how adjustments affect performance & seeing innovations in action. Dive deeper into the leaderboard!

🛠️ Here's how to use it:
1. Select your model from the leaderboard.
2. Load its model tree.
3. Choose any base & derived models (adapters, finetunes, merges, quantizations) for comparison.
4. Press Load.
See side-by-side performance metrics instantly!

Ready to dive in? 🏆 Try the 🤗 Open LLM Leaderboard Comparator now! See how models stack up against their base versions and derivatives to understand fine-tuning and other adjustments. Easier model analysis for better insights! Check it out here: open-llm-leaderboard/comparator 🌐

christopher

in bigscience/bloomz-mt 2 months ago

Adding `safetensors` variant of this model

#4 opened 2 months ago by

SFconvertbot

christopher

in bigscience/bloom-1b1 2 months ago

Request: DOI

#43 opened 2 months ago by

ovv4thewin

w11wo

authored a paper 2 months ago

GenUP: Generative User Profilers as In-Context Learners for Next POI Recommender Systems

Paper • 2410.20643 • Published Oct 28, 2024

albertvillanova

posted an update 2 months ago

Post

3134

🚀 Exciting update! You can now compare multiple models side-by-side with the Hugging Face Open LLM Comparator! 📊

open-llm-leaderboard/comparator

Dive into multi-model evaluations, pinpoint the best model for your needs, and explore insights across top open LLMs all in one place. Ready to level up your model comparison game?

albertvillanova

posted an update 2 months ago

Post

1227

🚨 Instruct-tuning impacts models differently across families! Qwen2.5-72B-Instruct excels on IFEval but struggles with MATH-Hard, while Llama-3.1-70B-Instruct avoids MATH performance loss! Why? Can they follow the format in examples? 📊 Compare models: open-llm-leaderboard/comparator

albertvillanova

posted an update 2 months ago

Post

1917

Finding the Best SmolLM for Your Project

Need an LLM assistant but unsure which #smolLM to run locally? With so many models available, how can you decide which one suits your needs best? 🤔

If the models you’re interested in are evaluated on the Hugging Face Open LLM Leaderboard, there’s an easy way to compare them with the Comparator tool: open-llm-leaderboard/comparator
Let’s walk through an example👇

Let’s compare two solid options:
- Qwen2.5-1.5B-Instruct from Alibaba Cloud Qwen (1.5B params)
- gemma-2-2b-it from Google (2.5B params)

For an assistant, you want a model that’s great at instruction following. So, how do these two models stack up on the IFEval task?

What about other evaluations?
Both models are close in performance on many other tasks, showing minimal differences. Surprisingly, the 1.5B Qwen model performs just as well as the 2.5B Gemma in many areas, even though it's smaller in size! 📊

This is a great example of how parameter size isn’t everything. With efficient design and training, a smaller model like Qwen2.5-1.5B can match or even surpass larger models in certain tasks.

Looking for other comparisons? Drop your model suggestions below! 👇

lintang

authored a paper 2 months ago

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Paper • 2410.16153 • Published Oct 21, 2024 • 44

Team members 328
