Post
227
Excited to share insights from Walmart's groundbreaking semantic search system that revolutionizes e-commerce product discovery!
The team at Walmart Global Technology(the team that I am a part of š¬) has developed a hybrid retrieval system that combines traditional inverted index search with neural embedding-based search to tackle the challenging problem of tail queries in e-commerce.
Key Technical Highlights:
ā¢ The system uses a two-tower BERT architecture where one tower processes queries and another processes product information, generating dense vector representations for semantic matching.
ā¢ Product information is enriched by combining titles with key attributes like category, brand, color, and gender using special prefix tokens to help the model distinguish different attribute types.
ā¢ The neural model leverages DistilBERT with 6 layers and projects the 768-dimensional embeddings down to 256 dimensions using a linear layer, achieving optimal performance while reducing storage and computation costs.
ā¢ To improve model training, they implemented innovative negative sampling techniques combining product category matching and token overlap filtering to identify challenging negative examples.
Production Implementation Details:
ā¢ The system uses a managed ANN (Approximate Nearest Neighbor) service to enable fast retrieval, achieving 99% recall@20 with just 13ms latency.
ā¢ Query embeddings are cached with preset TTL (Time-To-Live) to reduce latency and costs in production.
ā¢ The model is exported to ONNX format and served in Java, with custom optimizations like fixed input shapes and GPU acceleration using NVIDIA T4 processors.
Results:
The system showed significant improvements in both offline metrics and live experiments, with:
- +2.84% improvement in NDCG@10 for human evaluation
- +0.54% lift in Add-to-Cart rates in live A/B testing
This is a fantastic example of how modern NLP techniques can be successfully deployed at scale to solve real-world e-
The team at Walmart Global Technology(the team that I am a part of š¬) has developed a hybrid retrieval system that combines traditional inverted index search with neural embedding-based search to tackle the challenging problem of tail queries in e-commerce.
Key Technical Highlights:
ā¢ The system uses a two-tower BERT architecture where one tower processes queries and another processes product information, generating dense vector representations for semantic matching.
ā¢ Product information is enriched by combining titles with key attributes like category, brand, color, and gender using special prefix tokens to help the model distinguish different attribute types.
ā¢ The neural model leverages DistilBERT with 6 layers and projects the 768-dimensional embeddings down to 256 dimensions using a linear layer, achieving optimal performance while reducing storage and computation costs.
ā¢ To improve model training, they implemented innovative negative sampling techniques combining product category matching and token overlap filtering to identify challenging negative examples.
Production Implementation Details:
ā¢ The system uses a managed ANN (Approximate Nearest Neighbor) service to enable fast retrieval, achieving 99% recall@20 with just 13ms latency.
ā¢ Query embeddings are cached with preset TTL (Time-To-Live) to reduce latency and costs in production.
ā¢ The model is exported to ONNX format and served in Java, with custom optimizations like fixed input shapes and GPU acceleration using NVIDIA T4 processors.
Results:
The system showed significant improvements in both offline metrics and live experiments, with:
- +2.84% improvement in NDCG@10 for human evaluation
- +0.54% lift in Add-to-Cart rates in live A/B testing
This is a fantastic example of how modern NLP techniques can be successfully deployed at scale to solve real-world e-