BigScience Biomedical Datasets

non-profit

AI & ML interests

We aim to unify the schema across many different biomedical NLP resources.

Recent Activity

phlobo updated a dataset 27 days ago
bigbio/craft
phlobo updated a dataset 28 days ago
bigbio/monero
phlobo updated a dataset about 1 month ago
bigbio/neurotrial_ner

bigbio's activity

mkurman posted an update 1 day ago
I kindly invite you to try my experimental Llama 3.2 3B with o1-like thinking.

It uses Thoughts only when needed, so don't be surprised when they don't appear. It also has a minor bug that requires further fine-tuning (sometimes it starts with <|python_tag|> instead of <Thought>).

Enjoy!

Give some likes and whatever to make me feel better and motivated to keep going!

mkurman/llama-3.2-MEDIT-3B-o1
prithivMLmods posted an update 5 days ago
Triangulum Catalogued

Triangulum is a collection of pretrained and instruction-tuned generative models designed for multilingual applications. These models are trained using synthetic datasets based on long chains of thought, enabling them to perform complex reasoning tasks effectively.

+ Triangulum-10B : prithivMLmods/Triangulum-10B
+ Quants : prithivMLmods/Triangulum-10B-GGUF

+ Triangulum-5B : prithivMLmods/Triangulum-5B
+ Quants : prithivMLmods/Triangulum-5B-GGUF

+ Triangulum-1B : prithivMLmods/Triangulum-1B
+ Quants : prithivMLmods/Triangulum-1B-GGUF
prithivMLmods posted an update 15 days ago
prithivMLmods posted an update 18 days ago
Qwen2VL Models: Vision and Language Processing

Fine-tunes: [ LaTeX OCR, Math Parsing, Text Analogy OCRTest ]

Colab Demo: prithivMLmods/Qwen2-VL-OCR-2B-Instruct

Demo : prithivMLmods/Qwen2-VL-2B. The demo includes the Qwen2VL 2B base model.

The Space documents content from the input image and produces standardized plain text. It includes adjustment tools with over 30 font styles, file-format support for PDF and DOCX, text alignment, font-size adjustment, and line-spacing controls.

PDFs are rendered using the ReportLab toolkit.

Models :
+ prithivMLmods/Qwen2-VL-OCR-2B-Instruct
+ prithivMLmods/Qwen2-VL-Ocrtest-2B-Instruct
+ prithivMLmods/Qwen2-VL-Math-Prase-2B-Instruct
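
For anyone who wants to run the OCR fine-tune outside the Space, here is a minimal sketch with transformers (assuming a recent release with Qwen2-VL support; the image path, prompt, and generation settings are illustrative, not the Space's exact pipeline):

```python
# Minimal Qwen2-VL OCR inference sketch (assumptions noted above).
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model_id = "prithivMLmods/Qwen2-VL-OCR-2B-Instruct"
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("document.png")  # placeholder input image
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Extract all text from this document."},
]}]
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=512)
# Strip the prompt tokens so only the generated answer is decoded.
answer = processor.batch_decode(
    output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(answer)
```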

Sample Document :
+ https://drive.google.com/file/d/1Hfqqzq4Xc-3eTjbz-jcQY84V5E1YM71E/view?usp=sharing

Collection :
+ prithivMLmods/vision-language-models-67639f790e806e1f9799979f

.
.
.
@prithivMLmods
prithivMLmods posted an update 19 days ago
Here Before - Xmas

Models
+ [ Xmas 2D Illustration ] : strangerzonehf/Flux-Xmas-Illustration-LoRA
+ [ Xmas 3D Art ] : strangerzonehf/Flux-Xmas-3D-LoRA
+ [ Xmas Chocolate ] : strangerzonehf/Flux-Xmas-Chocolate-LoRA
+ [ Xmas Isometric Kit ] : strangerzonehf/Flux-Xmas-Isometric-Kit-LoRA
+ [ Xmas Realpix ] : strangerzonehf/Flux-Xmas-Realpix-LoRA
+ [ Xmas Anime ] : strangerzonehf/Flux-Anime-Xmas-LoRA

โ„๏ธCollections
+ [ Xmas Art ] : strangerzonehf/christmas-pack-6758b199487adafaddb68f82
+ [ Stranger Zone Collection ] : prithivMLmods/stranger-zone-collections-org-6737118adcf2cb40d66d0c7e

Page
+ [ Stranger Zone ] : https://huggingface.co/strangerzonehf


.
.
.
@prithivMLmods
AtAndDev posted an update 19 days ago
@s3nh Hey man check your discord! Got some news.
prithivMLmods posted an update 23 days ago
prithivMLmods posted an update about 1 month ago
Near 3:2 { 1280*832 } Adapters

The datasets were prepared for a 3:2 aspect ratio by processing images of any dimension (width × height) in line with the adapter's concept. Techniques such as magic expand, magic fill, and outpainting were used to fill in the remaining parts of each image to reach the 3:2 ratio for training. This improved the resulting image quality (up to 2 MB for detailed prompts) and reduced artifacts in images sized at 1280 × 832.

This approach was used instead of cropping down 2x or 3x zoomed regions of the original image; generative filling adjusts the image's aspect ratio proportionally within the dataset.

I used Canva's Magic Expand, Firefly's Generative Fill, and Flux's Outpaint for the aspect-ratio adjustments.
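
To illustrate just the geometry of that adjustment (not the generative fill itself), here is a small Pillow sketch, assuming the 1280 x 832 target mentioned above; the padded margins are the regions that magic expand / outpainting would actually synthesize:

```python
# Letterbox an arbitrary image onto a ~3:2 (1280 x 832) canvas.
# Plain white padding stands in for the generative-fill step described above.
from PIL import Image

def pad_to_3_2(img: Image.Image, target_w: int = 1280, target_h: int = 832) -> Image.Image:
    # Scale to fit inside the target canvas while preserving the original aspect ratio.
    scale = min(target_w / img.width, target_h / img.height)
    resized = img.resize((round(img.width * scale), round(img.height * scale)))
    # Center the resized image; the surrounding margins are what outpainting would fill.
    canvas = Image.new("RGB", (target_w, target_h), "white")
    canvas.paste(resized, ((target_w - resized.width) // 2, (target_h - resized.height) // 2))
    return canvas

# Example usage (hypothetical file names):
# pad_to_3_2(Image.open("sample.png")).save("sample_3x2.png")
```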

โฌ‡๏ธModel DLC :
+ [ Microworld Nft ] : strangerzonehf/Flux-Microworld-NFT-LoRA
+ [ Creative Stocks ] : strangerzonehf/Flux-Creative-Stocks-LoRA
+ [ Icon-Kit ] : strangerzonehf/Flux-Icon-Kit-LoRA
+ [ Claymation ] : strangerzonehf/Flux-Claymation-XC-LoRA
+ [ Super Portrait ] : strangerzonehf/Flux-Super-Portrait-LoRA
+ [ Ghibli Art ] : strangerzonehf/Flux-Ghibli-Art-LoRA
+ [ Isometric Site ] : strangerzonehf/Flux-Isometric-Site-LoRA

Page :
1] Stranger Zone: https://huggingface.co/strangerzonehf

Space :
1] Flux LoRA DLC: prithivMLmods/FLUX-LoRA-DLC

Collections :
1] strangerzonehf/flux-3dxl-engine-674833c14a001d5b1fdb5139
2] prithivMLmods/flux-lora-collections-66dd5908be2206cfaa8519be
3] strangerzonehf/animaker-engine-673714956dec98c400c30cf6
4] strangerzonehf/mixer-engine-673582c9c5939d8aa5bf9533

.
.
.
@prithivMLmods
ImranzamanML posted an update about 1 month ago
Deep understanding of the C-index evaluation measure for better models
Let's start with three patient groups:

Group A
Group B
Group C
For each patient, we will predict a risk score (a higher score means a higher risk of an early event).

Step 1: Understanding the Concordance Index
The Concordance Index (C-index) evaluates how well the model ranks survival times.

Understand with sample data:
Group A has 3 patients with actual survival times and predicted risk scores:

Patient   Actual Survival Time   Predicted Risk Score
P1        5 months               0.8
P2        3 months               0.9
P3        10 months              0.2
Comparable pairs:

(P1, P2): P2 has a shorter survival time and a higher risk score → Concordant ✅
(P1, P3): P3 has a longer survival time and a lower risk score → Concordant ✅
(P2, P3): P3 has a longer survival time and a lower risk score → Concordant ✅
Total pairs = 3
Total concordant pairs = 3

C-index for Group A = Concordant pairs / Total pairs = 3/3 = 1.0

Step 2: Calculate C-index for All Groups
Repeat the process for all groups. For now we can assume:

Group A: C-index = 1.0
Group B: C-index = 0.8
Group C: C-index = 0.6
Step 3: Stratified Concordance Index
The Stratified Concordance Index combines the C-index scores of all groups and focuses on the following:

Average performance across groups (mean of C-indices).
Consistency across groups (low standard deviation of C-indices).
Formula:
Stratified C-index = Mean(C-index scores) - Standard Deviation(C-index scores)

Calculate the mean:
Mean = (1.0 + 0.8 + 0.6) / 3 = 0.8

Calculate the standard deviation:
Standard Deviation = sqrt(((1.0 - 0.8)^2 + (0.8 - 0.8)^2 + (0.6 - 0.8)^2) / 3) ≈ 0.16

Stratified C-index:
Stratified C-index = 0.8 - 0.16 = 0.64

Step 4: Interpret the Results
A high Stratified C-index means:

The model predicts well overall (high mean C-index).
It performs consistently across groups (low standard deviation of C-indices).
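
To make the arithmetic concrete, here is a minimal Python sketch of both calculations (assuming no censoring and no tied survival times, as in the worked example, and using the population standard deviation to match the numbers above):

```python
# C-index and stratified C-index, following the steps above.
from itertools import combinations
import statistics

def c_index(survival_times, risk_scores):
    """Fraction of comparable pairs where the patient with the shorter
    survival time also received the higher predicted risk score."""
    concordant, total = 0, 0
    for i, j in combinations(range(len(survival_times)), 2):
        if survival_times[i] == survival_times[j]:
            continue  # skip ties in this simplified version
        total += 1
        shorter, longer = (i, j) if survival_times[i] < survival_times[j] else (j, i)
        if risk_scores[shorter] > risk_scores[longer]:
            concordant += 1
    return concordant / total

def stratified_c_index(group_c_indices):
    """Mean minus (population) standard deviation of the per-group C-indices."""
    return statistics.mean(group_c_indices) - statistics.pstdev(group_c_indices)

# Group A from the table above
print(c_index([5, 3, 10], [0.8, 0.9, 0.2]))   # 1.0
print(stratified_c_index([1.0, 0.8, 0.6]))    # ~0.64
```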
mkurman posted an update about 1 month ago
How Do I Contribute (HDIC)

Exciting times to come? We are working on a layer "self-esteem" technique that scores each layer's contribution to the final prediction. For now, it unlocks a lot of knowledge already stored in the weights that we couldn't get the model to extract through further fine-tuning!
mkurman posted an update about 1 month ago
What AI-enhanced research tools would you recommend for searching and analyzing scientific papers?
prithivMLmods posted an update about 1 month ago
Milestone for Flux.1 Dev

The Flux.1 Dev model has crossed 10,000 creative public adapters!
https://huggingface.co/models?other=base_model:adapter:black-forest-labs/FLUX.1-dev

This includes:
- 266 Finetunes
- 19 Quants
- 4 Merges

Here's the 10,000th public adapter :
+ strangerzonehf/Flux-3DXL-Partfile-0006

Page :
+ https://huggingface.co/strangerzonehf

Collection :
+ prithivMLmods/flux-lora-collections-66dd5908be2206cfaa8519be
Taylor658 posted an update about 1 month ago
๐ŸŒ The Stanford Institute for Human-Centered AI (https://aiindex.stanford.edu/vibrancy/) has released its 2024 Global AI Vibrancy Tool, a way to explore and compare AI progress across 36 countries.

๐Ÿ“Š It measures progress across the 8 broad pillars of R&D, Responsible AI, Economy, Education, Diversity, Policy and Governance, Public Opinion and Infrastructure. (Each of these pillars have a number of Sub Indices)

๐Ÿ“ˆ As a whole it is not surprising that the USA was at the top in terms of overall score as of 2023 (AI investment activity is a large part of the economic pillar for example and that is a large part of the overall USA ranking) but drilling in to more STRATEGIC Macro pillars like Education, Infrastructure or R&D reveal interesting growth patterns in Asia (particularly China) and Western Europe that I suspect the 2024 metrics will bear out.

๐Ÿค– Hopefully the 2024 Global Vibrancy ranking will break out AI and ML verticals like Computer Vision or NLP and or the AI Agent space as that may also from a global macro level give indications of what is to come globally for AI in 2025.
mkurman posted an update about 1 month ago
We built a new small language model, SmolLM2-MedIT-Upscale-2B, based on SmolLM2-1.7B-Instruct from Hugging Face. The premise was simple: increasing the vector dimension in the attention layers would positively impact the model's capabilities.
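
The post does not spell out the exact upscaling procedure, but one common way to widen a projection matrix while reusing the trained weights is to copy them into the top-left block of a larger layer and initialize the new slots to zero. A rough PyTorch sketch of that idea (dimensions are illustrative, not SmolLM2's actual sizes):

```python
# Hedged sketch: widen an attention projection by growing its dimensions,
# copying the original weights and zero-initializing the newly added slots.
# This illustrates the general idea only, not the authors' exact method.
import torch
import torch.nn as nn

def widen_linear(old: nn.Linear, new_in: int, new_out: int) -> nn.Linear:
    new = nn.Linear(new_in, new_out, bias=old.bias is not None)
    with torch.no_grad():
        new.weight.zero_()
        new.weight[: old.out_features, : old.in_features] = old.weight
        if old.bias is not None:
            new.bias.zero_()
            new.bias[: old.out_features] = old.bias
    return new

# Example: widen a 2048-dim q_proj to 2560 output dims, keeping the input size.
q_proj = nn.Linear(2048, 2048)
q_proj_wide = widen_linear(q_proj, new_in=2048, new_out=2560)
print(q_proj_wide.weight.shape)  # torch.Size([2560, 2048])
```

After upscaling like this, the model still needs fine-tuning (as the post describes) so the newly added parameters learn something useful.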

What did we prove?
In total, not much really, since we don't have the original trained under the same conditions as our upscale. However...

1. We scaled up the model without losing its quality
2. We confirmed that the method we devised works
3. After extremely short fine-tuning, the model achieved much better results in IFEval compared to the original (53.68 vs 64.29) and a higher overall average score in Open LLM Leaderboard (14.75 vs 15.17)

I consider this a big success ๐Ÿ˜‡, since surpassing the original in metrics is often very time-consuming, generates high costs, and doesn't always work out.

Meanwhile, we're moving forward, training SmolLM2 400M Instruct as an upscale of 136M.

We're curious about how increasing the base and intermediate vectors will affect the model's quality. We'll compare it to the original and the 360M Instruct version released by Hugging Face.

License: Apache 2.0

meditsolutions/SmolLM2-MedIT-Upscale-2B
prithivMLmods posted an update about 1 month ago
Fine-Textured [Polygon] Character 3D Design Renders

These adapters provide better lighting control (Bn+, Bn-) and richer textures than previous sets, but require more contextual prompts for optimal performance.

The ideal settings are around 30-35 inference steps, with the best dimensions being 1280 x 832 [ 3:2 ]. However, the adapters also perform well at the default settings of 1024 x 1024 [ 1:1 ].
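
A minimal diffusers sketch for trying one of the adapters listed below at those settings (assuming a recent diffusers with FLUX support, sufficient GPU memory, and access to black-forest-labs/FLUX.1-dev; the prompt is a placeholder, not the adapter's trigger phrase):

```python
# Load FLUX.1-dev, attach one of the 3DXL Partfile LoRAs, and sample at
# the settings mentioned above (~30 steps, 1280 x 832).
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("strangerzonehf/Flux-3DXL-Partfile-0001")

image = pipe(
    "fine-textured polygon character, studio lighting",  # placeholder prompt
    num_inference_steps=30,
    width=1280,
    height=832,
).images[0]
image.save("partfile_render.png")
```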

Models DLC :
+ strangerzonehf/Flux-3DXL-Partfile-0001
+ strangerzonehf/Flux-3DXL-Partfile-0002
+ strangerzonehf/Flux-3DXL-Partfile-0003
+ strangerzonehf/Flux-3DXL-Partfile-0004
+ strangerzonehf/Flux-3DXL-Partfile-C0001

Collections :
1] strangerzonehf/flux-3dxl-engine-674833c14a001d5b1fdb5139
2] prithivMLmods/flux-lora-collections-66dd5908be2206cfaa8519be

Space :
1] prithivMLmods/FLUX-LoRA-DLC

Page :
1] Stranger Zone: https://huggingface.co/strangerzonehf

.
.
.
@prithivMLmods
prithivMLmods posted an update about 1 month ago
HF Posts Receipts

[ HF POSTS RECEIPT ] : prithivMLmods/HF-POSTS-RECEIPT

The one thing that needs to be remembered is the 'username'.

And yeah, thank you, @maxiw, for creating the awesome dataset and sharing it here!

[ Dataset ] : maxiw/hf-posts

.
.
.
@prithivMLmods
Taylor658 posted an update about 1 month ago
Function calling is a key component of agent workflows. To call functions, an LLM needs a way to interact with other systems and run code. This usually means connecting it to a runtime environment that can handle function calls, data, and security.
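
For readers newer to the pattern, here is a toy sketch of the runtime side of function calling (model-agnostic; the JSON tool call is hand-written to stand in for real model output):

```python
# A tiny dispatcher: the model emits a JSON tool call, the runtime validates it
# and executes the matching Python function, then returns the result.
import json

def get_weather(city: str) -> str:
    return f"22 C and sunny in {city}"  # placeholder implementation

TOOLS = {"get_weather": get_weather}

def dispatch(tool_call_json: str) -> str:
    call = json.loads(tool_call_json)
    fn = TOOLS.get(call["name"])
    if fn is None:
        return f"error: unknown tool {call['name']!r}"
    return fn(**call.get("arguments", {}))

# What a function-calling model might return for "What's the weather in Berlin?"
model_output = '{"name": "get_weather", "arguments": {"city": "Berlin"}}'
print(dispatch(model_output))  # 22 C and sunny in Berlin
```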

Per the Berkeley Function-Calling Leaderboard, only 2 of the top 20 models with built-in function calling are fully open source as of 17 Nov 2024 (the other 2 in the top 20 that are not closed source have cc-by-nc-4.0 licenses).
https://gorilla.cs.berkeley.edu/leaderboard.html

The 2 open-source models in the top 20 that currently support function calling are:

meetkai/functionary-medium-v3.1
Team-ACE/ToolACE-8B

This is both a huge disadvantage AND an opportunity for the open-source community as enterprises, small businesses, government agencies, etc. quickly adopt agents and agent workflows over the next few months. Open source will have a lot of catching up to do, as enterprises will be hesitant to later switch from the closed-source models they may initially build their agent workflows on to an open-source alternative.

Hopefully more open source models will support function calling in the near future.