
Mariusz Kurman PRO

mkurman

AI & ML interests

AI Tech Lead | MD

Recent Activity

reacted to openfree's post with 🔥 about 12 hours ago
🧬 Protein Genesis AI: Design Proteins with Just a Prompt
reacted to singhsidhukuldeep's post with 👀 about 12 hours ago
Exciting breakthrough in e-commerce recommendation systems! Walmart Global Tech researchers have developed a novel Triple Modality Fusion (TMF) framework that revolutionizes how we make product recommendations.
new activity about 22 hours ago
mkurman/llama-3.2-MEDIT-3B-o1:space

Organizations

MedIT Solutions · BigScience Biomedical Datasets · SOWA Project

mkurman's activity

reacted to openfree's post with 🔥 about 12 hours ago
# 🧬 Protein Genesis AI: Design Proteins with Just a Prompt

## 🤔 Current Challenges in Protein Design

Traditional protein design faces critical barriers:
- 💰 High costs ($1M - $10M+) & long development cycles (2-3 years)
- 🔬 Complex equipment and expert knowledge required
- 📉 Low success rates (<10%)
- ⏰ Time-consuming experimental validation

## ✨ Our Solution: Protein Genesis AI

Transform protein design through simple natural language input:
"Design a protein that targets cancer cells"
"Create an enzyme that breaks down plastic"


### Key Features
- 🤖 AI-powered automated design
- 📊 Real-time analysis & optimization
- 🔬 Instant 3D visualization
- 💾 Immediate PDB file generation
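
Since the Space is a public Gradio app, it can also be driven from Python. Below is a minimal sketch using gradio_client; the endpoint name and argument shape are assumptions, so list the real interface with view_api() first.

```python
# Minimal sketch of calling the Space programmatically; the predict() call is
# hypothetical and must be adapted to the endpoints printed by view_api().
from gradio_client import Client

client = Client("openfree/ProteinGenesis")
client.view_api()  # prints the Space's actual endpoints and their parameters

# Hypothetical call shape, to adapt after checking view_api():
# result = client.predict("Design a protein that targets cancer cells", api_name="/predict")
```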

## 🎯 Applications

### Medical & Industrial
- 🏥 Drug development
- 💉 Antibody design
- 🏭 Industrial enzymes
- ♻️ Environmental solutions

### Research & Education
- 🔬 Basic research
- 📚 Educational tools
- 🧫 Experimental design
- 📈 Data analysis

## 💫 Key Advantages

- 👨‍💻 No coding or technical expertise needed
- ⚡ Results in minutes (vs. years)
- 💰 90% cost reduction
- 🌍 Accessible anywhere

## 🎓 Who Needs This?
- 🏢 Biotech companies
- 🏥 Pharmaceutical research
- 🎓 Academic institutions
- 🧪 Research laboratories

## 🌟 Why It Matters
Protein Genesis AI democratizes protein design by transforming complex processes into simple text prompts. This breakthrough accelerates scientific discovery, potentially leading to faster drug development and innovative biotechnology solutions. The future of protein design starts with a simple prompt! 🚀

openfree/ProteinGenesis
reacted to singhsidhukuldeep's post with 👀 about 12 hours ago
Exciting breakthrough in e-commerce recommendation systems!
Walmart Global Tech researchers have developed a novel Triple Modality Fusion (TMF) framework that revolutionizes how we make product recommendations.

>> Key Innovation
The framework ingeniously combines three distinct data types:
- Visual data to capture product aesthetics and context
- Textual information for detailed product features
- Graph data to understand complex user-item relationships

>> Technical Architecture
The system leverages a Large Language Model (Llama2-7B) as its backbone and introduces several sophisticated components:

Modality Fusion Module
- All-Modality Self-Attention (AMSA) for unified representation
- Cross-Modality Attention (CMA) mechanism for deep feature integration
- Custom FFN adapters to align different modality embeddings
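
A hedged PyTorch sketch of that fusion module as described (not Walmart's code); the dimensions, single attention layers, and the behavior-token interface are illustrative assumptions.

```python
# Sketch of TMF-style fusion: FFN adapters align modalities to the LLM hidden
# size, AMSA mixes all modality tokens, CMA attends them to user-behavior tokens.
import torch
import torch.nn as nn

def adapter(d_in: int, d_model: int) -> nn.Module:
    # FFN adapter aligning one modality's embeddings to the LLM hidden size
    return nn.Sequential(nn.Linear(d_in, d_model), nn.GELU(), nn.Linear(d_model, d_model))

class TripleModalityFusion(nn.Module):
    def __init__(self, d_vis=512, d_txt=768, d_graph=128, d_model=4096, n_heads=8):
        super().__init__()
        self.vis = adapter(d_vis, d_model)
        self.txt = adapter(d_txt, d_model)
        self.graph = adapter(d_graph, d_model)
        self.amsa = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cma = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, vis, txt, graph, behavior):
        # vis/txt/graph: (B, T_m, d_m) item tokens; behavior: (B, T_b, d_model)
        tokens = torch.cat([self.vis(vis), self.txt(txt), self.graph(graph)], dim=1)
        fused, _ = self.amsa(tokens, tokens, tokens)    # All-Modality Self-Attention
        fused, _ = self.cma(fused, behavior, behavior)  # Cross-Modality Attention
        return fused  # injected into the LLM input as soft item tokens
```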

Advanced Training Strategy
- Curriculum learning approach with three complexity levels
- Parameter-Efficient Fine-Tuning using LoRA
- Special token system for behavior and item representation
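
The PEFT step maps directly onto the peft library. A minimal sketch, assuming the Llama2-7B backbone named above; the rank and target modules are illustrative, not the paper's settings.

```python
# LoRA via peft: freeze the base model, train only low-rank adapters.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # only the low-rank adapters are trainable
```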

>> Real-World Impact
The results are remarkable:
- 38.25% improvement in Electronics recommendations
- 43.09% boost in Sports category accuracy
- Significantly higher human evaluation scores compared to traditional methods

Currently deployed in Walmart's production environment, this research demonstrates how combining multiple data modalities with advanced LLM architectures can dramatically improve recommendation accuracy and user satisfaction.
reacted to Sri-Vigneshwar-DJ's post with 🔥 1 day ago
Combining smolagents with Anthropic's best practices simplifies building powerful AI agents:

1. Code-Based Agents: Write actions as Python code, reducing steps by 30%.
2. Prompt Chaining: Break tasks into sequential subtasks with validation gates.
3. Routing: Classify inputs and direct them to specialized handlers.
4. Fallback: Handle tasks even if classification fails.

https://huggingface.co/blog/Sri-Vigneshwar-DJ/building-effective-agents-with-anthropics-best-pra
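
For reference, this is roughly what pattern 1 looks like with smolagents; the tool and default model choice are illustrative. The agent writes Python code as its actions, which is where the step reduction comes from.

```python
# Minimal code-based agent sketch with smolagents.
from smolagents import CodeAgent, HfApiModel, tool

@tool
def word_count(text: str) -> int:
    """Count the words in a text.

    Args:
        text: The text whose words should be counted.
    """
    return len(text.split())

agent = CodeAgent(tools=[word_count], model=HfApiModel())
agent.run("Use word_count to count the words in 'simple agents write code'.")
```
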
reacted to ezgikorkmaz's post with 🔥 1 day ago
posted an update 1 day ago
I kindly invite you to try my experimental Llama 3.2 3B with o1-like thinking.

It uses Thoughts only when needed, so don't be surprised when it doesn't. It also has a minor bug that requires further fine-tuning (sometimes it starts with <|python_tag|> instead of <Thought>).

Enjoy!

Give some likes and whatever to make me feel better and motivated to keep going 😂

mkurman/llama-3.2-MEDIT-3B-o1
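
A minimal transformers sketch for trying the model; the sampling settings are illustrative, and a <Thought> section should appear only when the model decides it needs to think.

```python
# Hedged usage sketch for mkurman/llama-3.2-MEDIT-3B-o1 via transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mkurman/llama-3.2-MEDIT-3B-o1"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "What is 17 * 23? Think it through."}]
inputs = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
out = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
# keep special tokens visible so any <Thought> markers show up in the output
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=False))
```
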
reacted to reddgr's post with 👀 about 1 month ago
Thought it would only make sense to share this here. Lately, one of my favorite activities has been annotating prompts and putting them into datasets (reddgr/tl-test-learn-prompts, reddgr/rq-request-question-prompts, reddgr/nli-chatbot-prompt-categorization), which I then use to classify and select chatbot conversations for my website. It's quite fun to use this widget on lmsys/lmsys-chat-1m, but I also use it on my two years of talking to chatbots (soon to be a dataset, though there's still a lot of web scraping and ETL work left). This one in the picture was probably one of the first prompts I wrote to an LLM.
posted an update about 1 month ago
How Do I Contribute (HDIC)

Exciting times to come? We are working on a layer self-esteem technique that scores each layer's contribution to the final prediction. For now, it unlocks a lot of knowledge already stored in the weights that we couldn't get the model to extract through further fine-tuning!
reacted to AdinaY's post with 🔥 about 1 month ago
HunyuanVideo 📹 The new open video generation model by Tencent!
👉 tencent/HunyuanVideo
zh-ai-community/video-models-666afd86cfa4e4dd1473b64c
✨ 13B parameters: Probably the largest open video model to date
✨ Unified architecture for image & video generation
✨ Powered by advanced features: MLLM Text Encoder, 3D VAE, and Prompt Rewrite
✨ Delivers stunning visuals, diverse motion, and unparalleled stability
🔓 Fully open with code & weights
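
A hedged quick-start sketch, assuming the diffusers integration and the community diffusers-format mirror of the weights (hunyuanvideo-community/HunyuanVideo); resolution, frame count, and step settings are illustrative and still VRAM-heavy.

```python
# Sketch of text-to-video generation with the (assumed) diffusers pipeline.
import torch
from diffusers import HunyuanVideoPipeline
from diffusers.utils import export_to_video

pipe = HunyuanVideoPipeline.from_pretrained(
    "hunyuanvideo-community/HunyuanVideo", torch_dtype=torch.bfloat16)
pipe.vae.enable_tiling()         # reduce VAE memory while decoding frames
pipe.enable_model_cpu_offload()
frames = pipe(prompt="a cat walks on the grass, photorealistic",
              num_frames=61, height=320, width=512, num_inference_steps=30).frames[0]
export_to_video(frames, "hunyuan.mp4", fps=15)
```
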
reacted to singhsidhukuldeep's post with 🤗 about 1 month ago
Exciting breakthrough in Document AI! Researchers from UNC Chapel Hill and Bloomberg have developed M3DocRAG, a revolutionary framework for multi-modal document understanding.

The innovation lies in its ability to handle complex document scenarios that traditional systems struggle with:
- Process 40,000+ pages across 3,000+ documents
- Answer questions requiring information from multiple pages
- Understand visual elements like charts, tables, and figures
- Support both closed-domain (single document) and open-domain (multiple documents) queries

Under the hood, M3DocRAG operates through three sophisticated stages:

>> Document Embedding:
- Converts PDF pages to RGB images
- Uses ColPali to project both text queries and page images into a shared embedding space
- Creates dense visual embeddings for each page while maintaining visual information integrity

>> Page Retrieval:
- Employs MaxSim scoring to compute relevance between queries and pages
- Implements inverted file indexing (IVFFlat) for efficient search
- Reduces retrieval latency from 20s to under 2s when searching 40K+ pages
- Supports approximate nearest neighbor search via Faiss
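
A hedged sketch of this retrieval stage (not the authors' code): MaxSim sums, over query tokens, each token's best page-token similarity, while an IVFFlat index over pooled page vectors provides the fast approximate shortlist. Pooling pages to single vectors for the coarse search is a simplifying assumption here.

```python
# Late-interaction retrieval sketch: Faiss IVFFlat shortlist + exact MaxSim re-rank.
import faiss
import numpy as np
import torch

def maxsim_scores(q, pages):
    """q: (Tq, d) query token embeddings; pages: (N, Tp, d) page token embeddings."""
    sims = torch.einsum("qd,npd->nqp", q, pages)  # every query-token/page-token dot product
    return sims.max(dim=-1).values.sum(dim=-1)    # (N,) one MaxSim score per page

d = 128                                  # ColPali-style embedding width (assumed)
page_tokens = torch.randn(500, 1024, d)  # 500 pages, ~1K patch embeddings each
page_pooled = page_tokens.mean(dim=1).numpy().astype("float32")

quantizer = faiss.IndexFlatIP(d)
index = faiss.IndexIVFFlat(quantizer, d, 16, faiss.METRIC_INNER_PRODUCT)
index.train(page_pooled)
index.add(page_pooled)
index.nprobe = 4                         # clusters probed per query

query_tokens = torch.randn(16, d)
query_pooled = query_tokens.mean(dim=0, keepdim=True).numpy().astype("float32")
_, candidates = index.search(query_pooled, 50)  # fast approximate shortlist
shortlist = page_tokens[candidates[0]]          # (50, 1024, d)
top_pages = maxsim_scores(query_tokens, shortlist).topk(4).indices  # exact re-rank
```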

>> Question Answering:
- Leverages Qwen2-VL 7B as the multi-modal language model
- Processes retrieved pages through a visual encoder
- Generates answers considering both textual and visual context
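
The answering stage maps onto the standard transformers API for Qwen2-VL. A minimal sketch, with the page image path and question as placeholders:

```python
# Feed a retrieved page image plus the question to Qwen2-VL.
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model_id = "Qwen/Qwen2-VL-7B-Instruct"
model = Qwen2VLForConditionalGeneration.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
processor = AutoProcessor.from_pretrained(model_id)

page = Image.open("retrieved_page.png")  # a page image from the retrieval stage
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "What does the table on this page report?"},
]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[page], return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(out[:, inputs.input_ids.shape[-1]:], skip_special_tokens=True)[0])
```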

The results are impressive:
- State-of-the-art performance on MP-DocVQA benchmark
- Superior handling of non-text evidence compared to text-only systems
- Significantly better performance on multi-hop reasoning tasks

This is a game-changer for industries dealing with large document volumes: finance, healthcare, and legal sectors can now process documents more efficiently while preserving crucial visual context.
reacted to cfahlgren1's post with 🔥 about 1 month ago
You can just ask things 🗣️

"show me messages in the coding category that are in the top 10% of reward model scores"

Download really high quality instructions from the Llama3.1 405B synthetic dataset 🔥

argilla/magpie-ultra-v1.0
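
Queries like that one can also be reproduced locally: DuckDB reads Hub datasets straight from hf:// paths. A sketch, where the column names (category, reward) are assumptions to check against the dataset viewer:

```python
# Filter a Hub dataset to the top 10% of reward scores in one category.
import duckdb

SRC = "hf://datasets/argilla/magpie-ultra-v1.0/**/*.parquet"
top_coding = duckdb.sql(f"""
    SELECT instruction, response
    FROM '{SRC}'
    WHERE category = 'coding'
      AND reward >= (SELECT quantile_cont(reward, 0.90)
                     FROM '{SRC}' WHERE category = 'coding')
""").df()
print(len(top_coding))
```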

replied to their post about 1 month ago

That is an excellent question. I was just googling and searching arXiv. Now I try Elicit, "talk" with papers, and listen to "podcasts" on NotebookLM.

replied to their post about 1 month ago
reacted to AdinaY's post with ❤️ about 1 month ago
The top downloaded (all-time) open models on the Hub in both 2023 and 2024 come from the Chinese community 👀

2023 👉 BGE base by BAAI
BAAI/bge-base-en-v1.5
2024 👉 Qwen 2.5 by Alibaba Qwen
Qwen/Qwen2.5-1.5B-Instruct

Can't wait to see what incredible models the Chinese community will bring in 2025 🚀

✨ Follow https://huggingface.co/zh-ai-community to get the latest updates from the Chinese community
✨ Explore the 2024 Year in Review huggingface/open-source-ai-year-in-review-2024
reacted to prithivMLmods's post with ❤️ about 1 month ago
Milestone for Flux.1 Dev 🔥

💢 The Flux.1 Dev model has crossed 1️⃣0️⃣,0️⃣0️⃣0️⃣ creative public adapters! 🎈
🔗 https://huggingface.co/models?other=base_model:adapter:black-forest-labs/FLUX.1-dev

💢 This includes:
- 266 Finetunes
- 19 Quants
- 4 Merges

💢 Here's the 10,000th public adapter: 😜
+ strangerzonehf/Flux-3DXL-Partfile-0006

💢 Page:
+ https://huggingface.co/strangerzonehf

💢 Collection:
+ prithivMLmods/flux-lora-collections-66dd5908be2206cfaa8519be
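
Loading one of these adapters takes a few lines with diffusers. A sketch using the 10,000th adapter named above; the "3DXL" trigger phrase and generation settings are guesses, not documented values.

```python
# Apply a public LoRA adapter on top of FLUX.1 Dev.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
pipe.load_lora_weights("strangerzonehf/Flux-3DXL-Partfile-0006")
pipe.enable_model_cpu_offload()  # fits on smaller GPUs at some speed cost
image = pipe("3DXL, a chrome robot portrait", num_inference_steps=28, guidance_scale=3.5).images[0]
image.save("robot.png")
```
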
posted an update about 1 month ago
What AI-enhanced research tools would you recommend for searching and analyzing scientific papers?
reacted to nataliaElv's post with 👀 about 1 month ago
We're so close to reaching 100 languages! Can you help us cover the remaining 200? Check if we're still looking for language leads for your language: nataliaElv/language-leads-dashboard
reacted to AdinaY's post with 👀 about 1 month ago
HunyuanVideo 📹 The new open video generation model by Tencent!
reacted to merve's post with 🤗 about 1 month ago
Last week we were blessed with open-source models! A recap 💝
merve/nov-29-releases-674ccc255a57baf97b1e2d31

🖼️ Multimodal
> At Hugging Face we released SmolVLM, a performant and efficient smol vision language model 💗
> Show Lab released ShowUI-2B: new vision-language-action model to build GUI/web automation agents 🤖
> Rhymes AI has released the base model of Aria: Aria-Base-64K and Aria-Base-8K with their respective context lengths
> ViDoRe team released ColSmolVLM: a new ColPali-like retrieval model based on SmolVLM
> Dataset: Llava-CoT-o1-Instruct: new dataset labelled using the Llava-CoT multimodal reasoning model 📖
> Dataset: LLaVA-CoT-100k, the dataset used to train Llava-CoT, released by the creators of Llava-CoT 📕

💬 LLMs
> Qwen team released QwQ-32B-Preview, a state-of-the-art open-source reasoning model that broke the internet 🔥
> Alibaba has released Marco-o1, a new open-source reasoning model 💥
> NVIDIA released Hymba 1.5B Base and Instruct, new state-of-the-art SLMs with a hybrid architecture (Mamba + transformer)

⏯️ Image/Video Generation
> Qwen2VL-Flux: new image generation model based on the Qwen2VL image encoder, T5, and Flux for generation
> Lightricks released LTX-Video, a new DiT-based video generation model that can generate 24 FPS videos at 768x512 res ⏯️
> Dataset: Image Preferences is a new image generation preference dataset made with the DIBT community effort of Argilla 🏷️

Audio
> OuteAI released OuteTTS-0.2-500M, a new multilingual text-to-speech model based on Qwen-2.5-0.5B and trained on 5B audio prompt tokens
reacted to vincentg64's post with 👀 about 1 month ago
LLM 2.0, the New Generation of Large Language Models https://mltblog.com/49ksOLL

I get many questions about the radically different LLM technology that I started to develop two years ago. Initially designed to retrieve information that I could no longer find on the Internet, whether with search, OpenAI, Gemini, Perplexity, or any other platform, it evolved to become the ideal solution for professional enterprise users. Now agentic and multimodal, it automates business tasks at scale with lightning speed, consistently delivers real ROI, and bypasses the costs associated with training and GPUs through zero-weight, explainable AI, tested and developed for a Fortune 100 company.

So, what is behind the scenes, how different is it compared to LLM 1.0 (GPT and the like), how can it be hallucination-free, what makes it a game changer, how did it eliminate prompt engineering, how does it handle knowledge graphs without neural networks, and what are the other benefits?

In a nutshell, the performance comes from building a robust architecture from the ground up at every step, offering far more than a prompt box, relying on home-made technology rather than faulty Python libraries, and being designed by enterprise and tech visionaries for enterprise users.

Contextual smart crawling to retrieve underlying taxonomies, augmented taxonomies, long contextual multi-tokens, real-time fine-tuning, increased security, an LLM router with specialized sub-LLMs, a purpose-built in-memory database architecture to efficiently handle sparsity in keyword associations, contextual backend tables, agents built on the backend, mapping between prompt and corpus keywords, customized PMI rather than cosine similarity, variable-length embeddings, and a scoring engine (the new "PageRank" of LLMs) that returns results along with relevancy scores are but a few of the differentiators.

โžก๏ธ Read the full article, at https://mltblog.com/49ksOLL
reacted to cutechicken's post with 🔥 about 1 month ago
# Tank War: A Cool AI-Generated Game Making Waves on Hugging Face

Hey there! Let me tell you about Tank War, a super interesting HTML5 Canvas game that's currently ranked #11 on Hugging Face Trending. What's really cool about this game is that it was initially whipped up in just one minute using MOUSE-I (VIDraft/mouse1), and then polished up with some human touches to the metadata files. Pretty neat, right?

## What Makes It Fun?

- **Stage-by-Stage Action**: You get 2 stages, each packed with 10 rounds and an epic boss battle
- **Power-Up Shopping**: Grab new tanks and upgrades with your hard-earned gold
- **Two-Gun System**: Switch between a heavy-hitting cannon and a rapid-fire machine gun
- **Air Support**: Call in BF-109 fighters and JU-87 dive bombers to rain down some extra firepower

## The Tech Behind the Magic

1. **AI-Powered Foundation**
- Quick game logic generation through MOUSE-I
- Fine-tuned with custom metadata tweaks

2. **Smooth Canvas Graphics**
- Butter-smooth animations with requestAnimationFrame
- Smart hitbox system for precise combat

3. **Smart Code Structure**
- Well-organized classes for enemies, effects, and support units
- Clever code reuse through inheritance

4. **Cool Game Features**
- Awesome sound effects and background music
- Smart enemy AI that keeps you on your toes
- Support units that know how to pick their targets

This project shows just how far we've come with AI-assisted game development, and its popularity on Hugging Face proves it's onto something good! It's a perfect example of how MOUSE-I's quick prototyping abilities and a developer's careful tweaking can create something really special. Think of it as AI and human creativity teaming up to make something awesome! 🎮✨

cutechicken/tankwar