John Locke

johnlockejrr

AI & ML interests

NLP, OCR, AI

Recent Activity

reacted to singhsidhukuldeep's post with 🚀 15 days ago
liked a Space 25 days ago: sivan22/Ituria

Organizations

None yet

johnlockejrr's activity

reacted to singhsidhukuldeep's post with 🚀 15 days ago
Exciting breakthrough in AI: @Meta's new Byte Latent Transformer (BLT) revolutionizes language models by eliminating tokenization!

The BLT architecture introduces a groundbreaking approach that processes raw bytes instead of tokens, achieving state-of-the-art performance while being more efficient and robust. Here's what makes it special:

>> Key Innovations
Dynamic Patching: BLT groups bytes into variable-sized patches based on entropy, allocating more compute power where the data is more complex. This results in up to 50% fewer FLOPs during inference compared to traditional token-based models.
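
To make the patching idea concrete, here is a minimal sketch of entropy-driven patch boundaries. It assumes a per-byte entropy estimate already computed by a small byte-level language model; the function names and the fixed threshold are illustrative, not Meta's actual implementation.

```python
import math

def shannon_entropy(probs):
    """Entropy (bits) of a next-byte distribution from a small byte LM."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def entropy_patches(byte_seq, entropies, threshold=2.0):
    """Start a new patch wherever the next byte is hard to predict
    (entropy above threshold), so complex regions get more, smaller
    patches and hence more compute from the global transformer."""
    patches, current = [], []
    for b, h in zip(byte_seq, entropies):
        if current and h > threshold:
            patches.append(bytes(current))
            current = []
        current.append(b)
    if current:
        patches.append(bytes(current))
    return patches
```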

Three-Component Architecture (a shape-level sketch follows the list):
• Lightweight Local Encoder that converts bytes to patch representations
• Powerful Global Latent Transformer that processes patches
• Local Decoder that converts patches back to bytes
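
As a rough illustration of how the three components could fit together, here is a toy PyTorch sketch. All dimensions, layer counts, and the mean-pooling step are placeholder assumptions (the paper pools bytes into patches with cross-attention), so treat this as a shape-level sketch rather than the real architecture.

```python
import torch
import torch.nn as nn

class BLTSketch(nn.Module):
    """Shape-level sketch: local encoder -> global latent transformer -> decoder."""
    def __init__(self, d_model=512):
        super().__init__()
        self.byte_embed = nn.Embedding(256, d_model)  # one embedding per byte value
        # Lightweight local encoder: contextualizes bytes within a patch
        self.local_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True),
            num_layers=1)
        # Powerful global latent transformer: operates on one vector per patch
        self.global_latent = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True),
            num_layers=12)
        # Local decoder stand-in: maps patch states back to byte logits
        self.to_byte_logits = nn.Linear(d_model, 256)

    def forward(self, patches):
        # patches: list of LongTensors, each (batch, patch_len)
        reps = [self.local_encoder(self.byte_embed(p)).mean(dim=1)  # pool each patch
                for p in patches]
        latent = self.global_latent(torch.stack(reps, dim=1))  # (batch, n_patches, d_model)
        return self.to_byte_logits(latent)                     # byte logits per patch

# e.g. two patches of different lengths for a batch of 2 sequences
logits = BLTSketch()([torch.randint(0, 256, (2, 5)), torch.randint(0, 256, (2, 9))])
```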

>> Technical Advantages
• Matches performance of Llama 3 at 8B parameters while being more efficient
• Superior handling of non-English languages and rare character sequences
• Remarkable 99.9% accuracy on spelling tasks
• Better scaling properties than token-based models

>> Under the Hood
The system uses an entropy model to determine patch boundaries, cross-attention mechanisms for information flow, and hash n-gram embeddings for improved representation. The architecture allows simultaneous scaling of both patch and model size while maintaining fixed inference costs.
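
The hash n-gram embeddings are easy to sketch: each byte position is embedded as the sum of hashed embeddings of the n-grams ending there. The table size, dimension, and use of Python's built-in hash below are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

N_MAX, TABLE_SIZE, DIM = 8, 4096, 64
rng = np.random.default_rng(0)
# One embedding table per n-gram length; in a real model these are learned
tables = {n: rng.normal(size=(TABLE_SIZE, DIM)) for n in range(1, N_MAX + 1)}

def ngram_hash_embedding(byte_seq: bytes, i: int) -> np.ndarray:
    """Sum hashed embeddings of the 1..N_MAX-grams ending at position i,
    giving the local encoder n-gram context without any tokenizer."""
    vec = np.zeros(DIM)
    for n in range(1, N_MAX + 1):
        if i + 1 >= n:
            gram = byte_seq[i + 1 - n : i + 1]
            vec += tables[n][hash(gram) % TABLE_SIZE]
    return vec
```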

This is a game-changer for multilingual AI and could reshape how we build future language models. Excited to see how this technology evolves!
liked a Space 25 days ago (sivan22/Ituria)
New activity in Gabriel/Qwen2-VL-2B-Instruct 27 days ago: discussion "Model inference" (#1, 1 reply), opened 27 days ago by johnlockejrr
reacted to MohamedRashad's post with ❤️❤️ about 1 month ago
New activity in MohamedRashad/arabic-small-nougat about 1 month ago: discussion "Arabic Small Nougat" (#1, 10 replies), opened 9 months ago by johnlockejrr
reacted to MohamedRashad's post with 🤗🚀 about 1 month ago
upvoted an article about 1 month ago
reacted to jsulz's post with 🔥 about 2 months ago
When the XetHub crew joined Hugging Face this fall, @erinys and I started brainstorming how to share our work to replace Git LFS on the Hub. Uploading and downloading large models and datasets takes precious time. That’s where our chunk-based approach comes in.

Instead of versioning files (like Git and Git LFS), we version variable-sized chunks of data. For the Hugging Face community, this means:

⏩ Only upload the chunks that changed.
🚀 Download just the updates, not the whole file.
🧠 Your files are stored as deduplicated chunks.

In our benchmarks, we found that using content-defined chunking (CDC) to store iterative model and dataset versions led to transfer speedups of ~2x, but this isn’t just a performance boost. It’s a rethinking of how we manage models and datasets on the Hub.
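
For readers new to CDC, here is a minimal sketch of the idea, with a deliberately toy hash (production systems use a proper rolling hash such as gear or Rabin fingerprints, and this is not the XetHub implementation): chunk boundaries depend on content, so an insertion only shifts boundaries locally, and unchanged chunks dedupe by their content hash.

```python
import hashlib

def cdc_chunks(data: bytes, mask: int = 0x3FF, min_len: int = 64):
    """Cut a chunk whenever the low bits of a content-driven hash hit
    zero; mask=0x3FF targets ~1 KiB average chunks. Toy hash for clarity."""
    chunks, start, h = [], 0, 0
    for i, b in enumerate(data):
        h = ((h << 1) ^ b) & 0xFFFFFFFF  # old bytes shift out of the 32-bit window
        if i - start + 1 >= min_len and (h & mask) == 0:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks

def upload_changed(chunks, store):
    """Transfer only chunks whose content hash the store hasn't seen."""
    for c in chunks:
        key = hashlib.sha256(c).hexdigest()
        if key not in store:
            store[key] = c  # only new/changed chunks move over the wire
    return store
```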

We’re planning to bring our new storage backend to the Hub in early 2025 - check out our blog to dive deeper, and let us know: how could this improve your workflows?

https://huggingface.co/blog/from-files-to-chunks
liked a Space 2 months ago