Kev Daragon's picture
3 9

Kev Daragon

AntleRot
ยท

AI & ML interests

Text & image

Recent Activity

Organizations

None yet

AntleRot's activity

reacted to suayptalha's post with ๐Ÿ‘€ 21 days ago
view post
Post
1859
๐Ÿš€ Introducing ๐…๐ข๐ซ๐ฌ๐ญ ๐‡๐ฎ๐ ๐ ๐ข๐ง๐  ๐…๐š๐œ๐ž ๐ˆ๐ง๐ญ๐ž๐ ๐ซ๐š๐ญ๐ข๐จ๐ง ๐จ๐Ÿ ๐ฆ๐ข๐ง๐†๐‘๐” ๐Œ๐จ๐๐ž๐ฅ๐ฌ from the paper ๐–๐ž๐ซ๐ž ๐‘๐๐๐ฌ ๐€๐ฅ๐ฅ ๐–๐ž ๐๐ž๐ž๐๐ž๐?

๐Ÿ–ฅ I have integrated ๐ง๐ž๐ฑ๐ญ-๐ ๐ž๐ง๐ž๐ซ๐š๐ญ๐ข๐จ๐ง ๐‘๐๐๐ฌ, specifically minGRU, which offer faster performance compared to Transformer architectures, into HuggingFace. This allows users to leverage the lighter and more efficient minGRU models with the "๐ญ๐ซ๐š๐ง๐ฌ๐Ÿ๐จ๐ซ๐ฆ๐ž๐ซ๐ฌ" ๐ฅ๐ข๐›๐ซ๐š๐ซ๐ฒ for both usage and training.

๐Ÿ’ป I integrated two main tasks: ๐Œ๐ข๐ง๐†๐‘๐”๐…๐จ๐ซ๐’๐ž๐ช๐ฎ๐ž๐ง๐œ๐ž๐‚๐ฅ๐š๐ฌ๐ฌ๐ข๐Ÿ๐ข๐œ๐š๐ญ๐ข๐จ๐ง and ๐Œ๐ข๐ง๐†๐‘๐”๐…๐จ๐ซ๐‚๐š๐ฎ๐ฌ๐š๐ฅ๐‹๐Œ.

๐Œ๐ข๐ง๐†๐‘๐”๐…๐จ๐ซ๐’๐ž๐ช๐ฎ๐ž๐ง๐œ๐ž๐‚๐ฅ๐š๐ฌ๐ฌ๐ข๐Ÿ๐ข๐œ๐š๐ญ๐ข๐จ๐ง:
You can use this class for ๐’๐ž๐ช๐ฎ๐ž๐ง๐œ๐ž ๐‚๐ฅ๐š๐ฌ๐ฌ๐ข๐Ÿ๐ข๐œ๐š๐ญ๐ข๐จ๐ง tasks. I also trained a Sentiment Analysis model with stanfordnlp/imdb dataset.

๐Œ๐ข๐ง๐†๐‘๐”๐…๐จ๐ซ๐‚๐š๐ฎ๐ฌ๐š๐ฅ๐‹๐Œ:
You can use this class for ๐‚๐š๐ฎ๐ฌ๐š๐ฅ ๐‹๐š๐ง๐ ๐ฎ๐š๐ ๐ž ๐Œ๐จ๐๐ž๐ฅ tasks such as GPT, Llama. I also trained an example model with roneneldan/TinyStories dataset. You can fine-tune and use it!

๐Ÿ”— ๐‹๐ข๐ง๐ค๐ฌ:
Models: suayptalha/mingru-676fe8d90760d01b7955d7ab
GitHub: https://github.com/suayptalha/minGRU-hf
LinkedIn Post: https://www.linkedin.com/posts/suayp-talha-kocabay_mingru-a-suayptalha-collection-activity-7278755484172439552-wNY1

๐Ÿ“ฐ ๐‚๐ซ๐ž๐๐ข๐ญ๐ฌ:
Paper Link: https://arxiv.org/abs/2410.01201

I am thankful to Leo Feng, Frederick Tung, Mohamed Osama Ahmed, Yoshua Bengio and Hossein Hajimirsadeghi for their papers.
reacted to Jaward's post with โค๏ธ 21 days ago
view post
Post
2991
nanoBLT: Simplified lightweight implementation of a character-level Byte Latent Transformer model (under 500 lines of code). The model is 2x4x2 (n_layers_encoder, n_layers_latent, n_layers_decoder) layer deep trained on ~1M bytes of tiny Shakespeare with a patch size of 4.

Code: https://github.com/Jaykef/ai-algorithms/blob/main/byte_latent_transformer.ipynb
reacted to nyuuzyou's post with ๐Ÿค— 21 days ago
view post
Post
2217
๐ŸŽจ KLING AI Dataset - nyuuzyou/klingai

A collection of 12,782 AI-generated media items featuring:
- High-quality image and video generations at various resolutions
- Complete metadata including user IDs, prompts, and generation parameters
- Content generated using text-to-image, text-to-video, and image-to-video modalities
- Full generation settings and technical parameters
reacted to csabakecskemeti's post with ๐Ÿ”ฅ 21 days ago
view post
Post
1468
I've built a small utility to split safetensors file by file.
The issue/need came up when I've tried to convert the new Deepseek V3 model from FP8 to BF16.
The only Ada architecture GPU I have is an RTX 4080 and the 16GB vram was just wasn't enough for the conversion.

BTW: I'll upload the bf16 version here:
DevQuasar/deepseek-ai.DeepSeek-V3-Base-bf16
(it will take a while - days with my upload speed)
If anyone has access the resources to test it I'd appreciate a feedback if it's working or not.

The tool, is available from here:
https://github.com/csabakecskemeti/ai_utils/blob/main/safetensor_splitter.py
It's splitting every file to n pieces by the layers if possible, and create a new "model.safetensors.index.json" file.
I've tested it with Llama 3.1 8B and multiple split sizes, and validated by using inference pipeline.
use --help for usage
Please note current version expects the model is already multiple file and have a "model.safetensors.index.json" layer-safetensor mapping file.
reacted to hexgrad's post with ๐Ÿ”ฅ 21 days ago
view post
Post
2788
๐Ÿ‡ฌ๐Ÿ‡ง Four British voices have joined hexgrad/Kokoro-82M (Apache TTS model): bf_emma, bf_isabella, bm_george, bm_lewis
reacted to YannisTevissen's post with ๐Ÿค— 21 days ago
reacted to DawnC's post with ๐Ÿค— 21 days ago
view post
Post
1430
๐ŸŒŸ PawMatchAI: Making Breed Selection More Intuitive! ๐Ÿ•
Excited to share the latest update to this AI-powered companion for finding your perfect furry friend! The breed recommendation system just got a visual upgrade to help you make better decisions.

โœจ What's New?
Enhanced breed recognition accuracy through strategic model improvements:
- Upgraded to a fine-tuned ConvNeXt architecture for superior feature extraction
- Implemented progressive layer unfreezing during training
- Optimized data augmentation pipeline for better generalization
- Achieved 8% improvement in breed classification accuracy

๐ŸŽฏ Key Features:
- Smart breed recognition powered by AI
- Visual matching scores with intuitive color indicators
- Detailed breed comparisons with interactive tooltips
- Lifestyle-based recommendations tailored to your needs

๐Ÿ’ญ Project Vision
Combining my passion for AI and pets, this project represents another step toward my goal of creating meaningful AI applications. Each update aims to make the breed selection process more accessible while improving the underlying technology.

๐Ÿ‘‰ Try it now: DawnC/PawMatchAI

Your likes โค๏ธ on this space fuel this project's growth!

#AI #MachineLearning #DeepLearning #Pytorch #ComputerVision
See translation