Akarshan Biswas

qnixsynapse

AI & ML interests

NLP, models, quantization

Recent Activity

new activity 4 days ago

google/gemma-2-9b-it:Tool calling support in Gemma 2

liked a Space 4 days ago

webml-community/attention-visualization

reacted to suayptalha's post with 👀 7 days ago

🚀 Introducing the First Hugging Face Integration of minGRU Models, from the paper "Were RNNs All We Needed?"

🖥 I have integrated next-generation RNNs, specifically minGRU, which offer faster performance compared to Transformer architectures, into Hugging Face. This allows users to leverage the lighter and more efficient minGRU models with the "transformers" library for both usage and training.

💻 I integrated two main tasks: MinGRUForSequenceClassification and MinGRUForCausalLM.

MinGRUForSequenceClassification: You can use this class for sequence classification tasks. I also trained a sentiment analysis model with the stanfordnlp/imdb dataset.

MinGRUForCausalLM: You can use this class for causal language modeling tasks, in the style of GPT or Llama. I also trained an example model with the roneneldan/TinyStories dataset. You can fine-tune and use it!

🔗 Links:

Models: https://huggingface.co/collections/suayptalha/mingru-676fe8d90760d01b7955d7ab

GitHub: https://github.com/suayptalha/minGRU-hf

LinkedIn Post: https://www.linkedin.com/posts/suayp-talha-kocabay_mingru-a-suayptalha-collection-activity-7278755484172439552-wNY1

📰 Credits:

Paper Link: https://arxiv.org/abs/2410.01201

I am thankful to Leo Feng, Frederick Tung, Mohamed Osama Ahmed, Yoshua Bengio and Hossein Hajimirsadeghi for their paper.
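For readers unfamiliar with minGRU, the recurrence described in the linked paper can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the code from the minGRU-hf repository: the function names (`sigmoid`, `linear`, `min_gru`) and the list-of-lists weight layout are assumptions made for this example. The idea it shows is that both the gate and the candidate state depend only on the current input, so the hidden state is a simple gated blend of the previous state and the candidate.

```python
import math


def sigmoid(x):
    """Logistic function used for the update gate."""
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, x):
    """Dense layer: `weights` holds one row per output unit (hypothetical layout)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def min_gru(xs, w_z, b_z, w_h, b_h, h0):
    """Run the minGRU recurrence step by step over a list of input vectors.

    z_t       = sigmoid(Linear_z(x_t))            # gate depends on x_t only
    h_tilde_t = Linear_h(x_t)                     # candidate depends on x_t only
    h_t       = (1 - z_t) * h_{t-1} + z_t * h_tilde_t
    """
    h = list(h0)
    states = []
    for x in xs:
        z = [sigmoid(v) for v in linear(w_z, b_z, x)]
        h_tilde = linear(w_h, b_h, x)
        h = [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, h_tilde)]
        states.append(h)
    return states


# Toy usage: a zero gate weight and bias gives z = 0.5 at every step,
# so each new state is the average of the previous state and the input.
states = min_gru(
    xs=[[2.0], [4.0]],
    w_z=[[0.0]], b_z=[0.0],
    w_h=[[1.0]], b_h=[0.0],
    h0=[0.0],
)
print(states)
```

The loop above is sequential for clarity. Because the gate and candidate do not depend on the previous hidden state, the paper notes that the same recurrence can be trained with a parallel scan, which this sketch does not attempt.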

Organizations

None yet

qnixsynapse's activity

liked a Space 4 days ago

🔥 Attention Visualization • Running

Vision Transformer Attention Visualization

liked a model 29 days ago

meta-llama/Llama-3.3-70B-Instruct

Text Generation • Updated 16 days ago • 415k • 1.48k

liked a model about 1 month ago

ruliad/deepthought-8b-llama-v0.01-alpha

Text Generation • Updated 30 days ago • 28.6k • 139

liked a Space about 1 month ago

👁 PR Puppet Sora • Running • 643

liked a model 2 months ago

google/gemma-2-2b

Text Generation • Updated Aug 7, 2024 • 196k • 471

liked 3 models 3 months ago

liked 2 models 4 months ago

princeton-nlp/gemma-2-9b-it-SimPO

Text Generation • Updated Aug 2, 2024 • 15.9k • 138

bartowski/OLMoE-1B-7B-0924-Instruct-GGUF

Text Generation • Updated Sep 17, 2024 • 459 • 8

liked a dataset 4 months ago

SkunkworksAI/reasoning-0.01

Viewer • Updated Sep 14, 2024 • 29.9k • 1.77k • 268

liked a model 4 months ago

G-reen/gpt5o-reflexion-q-agi-llama-3.1-8b

Text Generation • Updated Sep 13, 2024 • 253 • 64

liked a model 5 months ago

homebrewltd/llama3-s-instruct-v0.2

Updated Aug 23, 2024 • 26 • 44

liked 2 Spaces 5 months ago

😻 Llama3.1 S V0.2 Checkpoint 2024 08 20 • Running on Zero • 111

🏎️💨 FLUX.1 [Schnell] • Running on Zero • 3.93k

liked 2 models 5 months ago

google/gemma-scope

Updated Aug 29, 2024 • 148

google/gemma-2-2b-it

Text Generation • Updated Aug 27, 2024 • 376k • 843

liked 3 models 6 months ago

meta-llama/Llama-3.1-8B-Instruct

Text Generation • Updated Sep 25, 2024 • 4.8M • 3.38k

mistralai/Mistral-Nemo-Instruct-2407

Text Generation • Updated Nov 6, 2024 • 3.66M • 1.34k

Groq/Llama-3-Groq-8B-Tool-Use

Text Generation • Updated Aug 27, 2024 • 692 • 269