Kuldeep Singh Sidhu

singhsidhukuldeep

AI & ML interests

😃 TOP 3 on HuggingFace for posts 🤗 Seeking contributors for a completely open-source 🚀 Data Science platform! singhsidhukuldeep.github.io

Recent Activity

posted an update 8 days ago
Exciting News in AI: Jina AI Releases JINA-CLIP-v2!

The team at Jina AI has just released a groundbreaking multilingual multimodal embedding model that's pushing the boundaries of text-image understanding. Here's why this is a big deal:

🚀 Technical Highlights:
- Dual encoder architecture combining a 561M parameter Jina XLM-RoBERTa text encoder and a 304M parameter EVA02-L14 vision encoder
- Supports 89 languages with 8,192 token context length
- Processes images up to 512×512 pixels with 14×14 patch size
- Implements FlashAttention2 for text and xFormers for vision processing
- Uses Matryoshka Representation Learning for efficient vector storage

⚡️ Under The Hood:
- Multi-stage training process with progressive resolution scaling (224→384→512)
- Contrastive learning using InfoNCE loss in both directions
- Trained on a massive multilingual dataset including 400M English and 400M multilingual image-caption pairs
- Incorporates specialized datasets for document understanding, scientific graphs, and infographics
- Uses hard negative mining with 7 negatives per positive sample

📊 Performance:
- Outperforms previous models on visual document retrieval (52.65% nDCG@5)
- Achieves 89.73% image-to-text and 79.09% text-to-image retrieval on CLIP benchmark
- Strong multilingual performance across 30 languages
- Maintains performance even with 75% dimension reduction (256D vs 1024D)

🎯 Key Innovation:
The model solves the long-standing challenge of unifying text-only and multi-modal retrieval systems while adding robust multilingual support. Perfect for building cross-lingual visual search systems!

Kudos to the research team at Jina AI for this impressive advancement in multimodal AI!
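For readers who want to see the two training ideas from this post in code, here is a minimal sketch of a bidirectional InfoNCE loss and a Matryoshka-style truncation. It is an illustration only, not Jina AI's implementation: the temperature of 0.07, the function names, and the use of plain in-batch negatives (no mined hard negatives) are all assumptions.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def truncate_embedding(vec, dim):
    """Matryoshka-style truncation: keep the first `dim` values, then re-normalize."""
    return l2_normalize(vec[:dim])

def info_nce(queries, keys, temperature=0.07):
    """One direction of InfoNCE: queries[i] should match keys[i].

    All other keys in the batch act as negatives. The temperature is an
    assumed value, not one taken from the post.
    """
    total = 0.0
    for i, query in enumerate(queries):
        logits = [sum(a * b for a, b in zip(query, key)) / temperature for key in keys]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_z - logits[i]
    return total / len(queries)

def symmetric_info_nce(text_emb, image_emb, temperature=0.07):
    """Average the text-to-image and image-to-text losses."""
    return 0.5 * (
        info_nce(text_emb, image_emb, temperature)
        + info_nce(image_emb, text_emb, temperature)
    )

# Toy batch of two unit-length embeddings per modality.
text = [[1.0, 0.0], [0.0, 1.0]]
aligned_images = [[1.0, 0.0], [0.0, 1.0]]
swapped_images = [[0.0, 1.0], [1.0, 0.0]]

print(symmetric_info_nce(text, aligned_images))  # small: pairs line up
print(symmetric_info_nce(text, swapped_images))  # large: pairs are mismatched
```

Going from 1024 to 256 dimensions with `truncate_embedding(vec, 256)` is the 75% reduction mentioned in the post. Whether the released model re-normalizes after truncation is not stated there, so treat that step as a design choice of this sketch.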

Organizations

MLX Community, Social Post Explorers, C4AI Community

Posts 106

Post
Excited to share insights from Walmart's groundbreaking semantic search system that revolutionizes e-commerce product discovery!

The team at Walmart Global Technology (the team that I am a part of 😬) has developed a hybrid retrieval system that combines traditional inverted index search with neural embedding-based search to tackle the challenging problem of tail queries in e-commerce.

Key Technical Highlights:

• The system uses a two-tower BERT architecture where one tower processes queries and another processes product information, generating dense vector representations for semantic matching.

• Product information is enriched by combining titles with key attributes like category, brand, color, and gender using special prefix tokens to help the model distinguish different attribute types.

• The neural model leverages DistilBERT with 6 layers and projects the 768-dimensional embeddings down to 256 dimensions using a linear layer, achieving optimal performance while reducing storage and computation costs.

• To improve model training, they implemented innovative negative sampling techniques combining product category matching and token overlap filtering to identify challenging negative examples.
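To make the product-text enrichment and the 768-to-256 projection concrete, here is a rough sketch. It is not the production code: the prefix token strings, the brand name, and the random untrained weights are all hypothetical, and using one projection for both towers is a simplification made for brevity.

```python
import math
import random

# Hypothetical prefix tokens: the post does not list the real ones.
PREFIX_TOKENS = {
    "category": "[CATEGORY]",
    "brand": "[BRAND]",
    "color": "[COLOR]",
    "gender": "[GENDER]",
}

def build_product_text(title, attributes):
    """Append each available attribute to the title, tagged with its prefix token."""
    parts = [title]
    for name, token in PREFIX_TOKENS.items():
        value = attributes.get(name)
        if value:
            parts.append(f"{token} {value}")
    return " ".join(parts)

def project(embedding, weights):
    """Bias-free linear layer: `weights` holds one row per output dimension."""
    return [sum(w * x for w, x in zip(row, embedding)) for row in weights]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Random stand-ins for the tower outputs and the learned projection.
rng = random.Random(0)
weights = [[rng.gauss(0, 0.02) for _ in range(768)] for _ in range(256)]
query_vec = project([rng.gauss(0, 1) for _ in range(768)], weights)
product_vec = project([rng.gauss(0, 1) for _ in range(768)], weights)

print(build_product_text("Trail Running Shoes", {"brand": "Acme", "color": "red"}))
print(len(query_vec), cosine_similarity(query_vec, product_vec))
```

The prefix tokens give the product tower an explicit signal about which part of the text is a brand and which is a color, instead of leaving it to infer that from word order.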

Production Implementation Details:

• The system uses a managed ANN (Approximate Nearest Neighbor) service to enable fast retrieval, achieving 99% recall@20 with just 13 ms latency.

• Query embeddings are cached with preset TTL (Time-To-Live) to reduce latency and costs in production.

• The model is exported to ONNX format and served in Java, with custom optimizations like fixed input shapes and GPU acceleration using NVIDIA T4 processors.
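The query-embedding cache with a preset TTL could look something like the sketch below. This is a single-process illustration in Python, not the Java serving stack described in the post. The class name, the `encode` callback, and the lowercase/strip key normalization are assumptions.

```python
import time

class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[key]  # drop the stale entry on read
            return None
        return value

    def put(self, key, value):
        self._store[key] = (value, self.clock() + self.ttl)

def embed_with_cache(query, cache, encode):
    """Return a cached embedding if it is still fresh, else encode and store it."""
    key = query.strip().lower()  # assumed normalization, not from the post
    cached = cache.get(key)
    if cached is not None:
        return cached
    vector = encode(query)
    cache.put(key, vector)
    return vector
```

Popular queries hit the cache and skip the model call entirely, while the TTL bounds how long an embedding from an older model version can linger.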

Results:
The system showed significant improvements in both offline metrics and live experiments, with:
- +2.84% improvement in NDCG@10 for human evaluation
- +0.54% lift in Add-to-Cart rates in live A/B testing
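
For anyone unfamiliar with the offline metric, NDCG@10 can be computed roughly as follows. This sketch uses the exponential-gain variant, which is one common definition. The post does not say which variant was used for the human evaluation, so that choice is an assumption.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k results (exponential gain)."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)
        for rank, rel in enumerate(relevances[:k])
    )

def ndcg_at_k(relevances, k=10):
    """DCG of the given order divided by the DCG of the ideal order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Graded relevance labels in the order the system returned the items.
print(ndcg_at_k([3, 2, 1]))  # already ideally ordered
print(ndcg_at_k([1, 3, 2]))  # best item is not first, so the score drops
```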

This is a fantastic example of how modern NLP techniques can be successfully deployed at scale to solve real-world e-
Post
Groundbreaking Research Alert: Revolutionizing Document Ranking with Long-Context LLMs

Researchers from Renmin University of China and Baidu Inc. have introduced a novel approach to document ranking that challenges conventional sliding window methods. Their work demonstrates how long-context Large Language Models can process up to 100 documents simultaneously, achieving superior performance while reducing API costs by 50%.

Key Technical Innovations:
- Full ranking strategy enables processing all passages in a single inference
- Multi-pass sliding window approach for comprehensive listwise label construction
- Importance-aware learning objective that prioritizes top-ranked passage IDs
- Support for context lengths up to 128k tokens using models like LLaMA 3.1-8B-Instruct

Performance Highlights:
- 2.2-point improvement in NDCG@10
- 29.3% reduction in latency compared to traditional methods
- Significant API cost savings through elimination of redundant passage processing
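
To see where the redundant passage processing comes from, the sketch below counts how many passages an overlapping sliding window feeds to the LLM compared with a single full-ranking call. The window size of 20 and step of 10 are assumed typical settings, not numbers from the post. The count covers input passages only, not tokens or pricing.

```python
def sliding_window_cost(num_passages, window=20, step=10):
    """Return (llm_calls, passages_read) for one back-to-front sliding window pass."""
    calls = 0
    passages_read = 0
    end = num_passages
    while True:
        start = max(0, end - window)
        calls += 1
        passages_read += end - start
        if start == 0:  # the window has reached the top of the list
            break
        end -= step  # overlapping windows re-read `window - step` passages
    return calls, passages_read

def full_ranking_cost(num_passages):
    """Full ranking reads every passage exactly once, in a single call."""
    return 1, num_passages

print(sliding_window_cost(100))  # (9, 180)
print(full_ranking_cost(100))    # (1, 100)
```

Under these assumed settings, the sliding window reads 180 passages to rank 100, because most passages appear in two consecutive windows.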

Under the hood, the system leverages advanced long-context LLMs to perform global interactions among passages, enabling more nuanced relevance assessment. The architecture incorporates a novel importance-aware loss function that assigns differential weights based on passage ranking positions.
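
As a rough illustration of what an importance-aware loss might look like, the sketch below weights the negative log-likelihood of each passage ID by its position in the target ranking, so mistakes near the top cost more. The weight formula `1 + 1 / log2(rank + 1)` is a hypothetical choice for this example. The post does not give the paper's exact formula.

```python
import math

def position_weight(rank):
    """Hypothetical weight that decays with the 1-based rank in the target list."""
    return 1.0 + 1.0 / math.log2(rank + 1)

def importance_aware_loss(target_probs):
    """Weighted negative log-likelihood over the target ranking.

    target_probs[i] is the probability the model assigns to the correct
    passage ID at rank i + 1.
    """
    weighted = [
        position_weight(i + 1) * -math.log(p)
        for i, p in enumerate(target_probs)
    ]
    return sum(weighted) / len(weighted)

# The same two probabilities, but the uncertain prediction sits at a different rank.
print(importance_aware_loss([0.5, 0.9]))  # uncertain at rank 1: larger loss
print(importance_aware_loss([0.9, 0.5]))  # uncertain at rank 2: smaller loss
```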

The research team's implementation demonstrated remarkable versatility across multiple datasets, including TREC DL and BEIR benchmarks. Their fine-tuned model, RankMistral, showcases the practical viability of full ranking approaches in production environments.

This advancement marks a significant step forward in information retrieval systems, offering both improved accuracy and computational efficiency. The implications for search engines and content recommendation systems are substantial.

models

None public yet

datasets

None public yet