Clémentine Fourrier

clefourrier

http://clefourrier.github.io

AI & ML interests

None yet

Recent Activity

liked a Space about 16 hours ago

mvaloatto/TCTF

updated a Space about 17 hours ago

science/README

new activity about 19 hours ago

open-llm-leaderboard/open_llm_leaderboard:Carbon Dioxide Emissions

Articles

Rethinking LLM Evaluation with 3C3H: AraGen Benchmark and Leaderboard

Introduction to the Open Leaderboard for Japanese LLMs

Letting Large Models Debate: The First Multilingual LLM Debate Competition

Judge Arena: Benchmarking LLMs as Evaluators

Introducing the Open FinLLM Leaderboard

BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks

Falcon 2: An 11B parameter pretrained language model and VLM, trained on over 5000B tokens and 11 languages

CyberSecEval 2 - A Comprehensive Evaluation Framework for Cybersecurity Risks and Capabilities of Large Language Models

Let's talk about LLM evaluation

Introducing the Open Arabic LLM Leaderboard

Introducing the Open Leaderboard for Hebrew LLMs!

Bringing the Artificial Analysis LLM Performance Leaderboard to Hugging Face

Improving Prompt Consistency with Structured Generations

Introducing the Open Chain of Thought Leaderboard

The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare

Introducing the LiveCodeBench Leaderboard - Holistic and Contamination-Free Evaluation of Code LLMs

Introducing the Chatbot Guardrails Arena

Introducing ConTextual: How well can your Multimodal model jointly reason over text and image in text-rich scenes?

TTS Arena: Benchmarking Text-to-Speech Models in the Wild

Introducing the Red-Teaming Resistance Leaderboard

Introducing the Open Ko-LLM Leaderboard: Leading the Korean LLM Evaluation Ecosystem

NPHardEval Leaderboard: Unveiling the Reasoning Abilities of Large Language Models through Complexity Classes and Dynamic Updates

Introducing the Enterprise Scenarios Leaderboard: a Leaderboard for Real World Use Cases

The Hallucinations Leaderboard, an Open Effort to Measure Hallucinations in Large Language Models

A guide to setting up your own Hugging Face leaderboard: an end-to-end example with Vectara's hallucination leaderboard

2023, year of open LLMs

Open LLM Leaderboard: DROP deep dive

Overview of natively supported quantization schemes in 🤗 Transformers

What's going on with the Open LLM Leaderboard?

Introduction to Graph Machine Learning

clefourrier's activity

upvoted 2 articles about 1 month ago

Article

Bridging the Gap Between Physical Numerical Simulations and Machine Learning: Introducing The Well

Dec 2, 2024

• 17

Article

Halo: Open Source Health Tracking with Wearables

Nov 19, 2024

• 99

upvoted an article 2 months ago

Article

Releasing Outlines-core 0.1.0: structured generation in Rust and Python

Oct 22, 2024

• 44

upvoted 2 articles 3 months ago

Article

Democratization of AI, Open Source, and AI Auditing: Thoughts from the DisinfoCon Panel in Berlin

Oct 8, 2024

• 5

Article

A Short Summary of Chinese AI Global Expansion

Oct 1, 2024

• 15

upvoted a collection 3 months ago

Molmo

Artifacts for open multimodal language models. • 5 items • Updated Nov 27, 2024 • 290

upvoted 2 articles 6 months ago

Article

SmolLM - blazingly fast and remarkably powerful

Jul 16, 2024

• 294

Article

Our Transformers Code Agent beats the GAIA benchmark!

Jul 1, 2024

• 49

upvoted a paper 6 months ago

MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures

Paper • 2406.06565 • Published Jun 3, 2024 • 9

upvoted a collection 6 months ago

🎭 Avatars

The latest AI-powered technologies usher in a new era of realistic avatars! 🚀 • 70 items • Updated 11 days ago • 79

upvoted a paper 6 months ago

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Paper • 2406.17557 • Published Jun 25, 2024 • 87

upvoted an article 7 months ago

Article

Space secrets security update

May 31, 2024

• 50

upvoted 3 articles 8 months ago

Article

Evaling llm-jp-eval (evals are hard)

May 18, 2024

• 4

Article

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

Apr 22, 2024

• 80

Article

LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!)

Apr 24, 2024

• 60

upvoted a collection 8 months ago

Granite Code Models

A series of code models trained by IBM licensed under Apache 2.0 license. We release both the base pretrained and instruct models. • 23 items • Updated 17 days ago • 181

upvoted a paper 8 months ago

What matters when building vision-language models?

Paper • 2405.02246 • Published May 3, 2024 • 101

upvoted 2 articles 8 months ago

Article

Bringing the Artificial Analysis LLM Performance Leaderboard to Hugging Face

May 3, 2024

• 13

Article

Improving Prompt Consistency with Structured Generations

Apr 30, 2024

• 58

upvoted an article 9 months ago

Article

A guide to setting up your own Hugging Face leaderboard: an end-to-end example with Vectara's hallucination leaderboard

Jan 12, 2024

• 6