Case Study: Building a RAG-Powered Product Search for E-Commerce

The Challenge

ModaHaus, a Berlin-based online fashion retailer with 12,000+ SKUs, had a search problem. Their Elasticsearch-powered search worked fine for exact queries like "blue Nike running shoes size 42" — but fell apart when customers searched the way they naturally think:

"something to wear to a summer wedding"
"cozy work-from-home outfit under €100"
"similar to that green dress Zendaya wore at the Oscars"

Their search conversion rate (searches that led to a purchase) sat at 8.2% — well below the industry benchmark of 12–15%. They were leaving money on the table.

The Solution: RAG-Powered Semantic Search

ModaHaus posted a project on Workia.dev looking for an AI freelancer who could modernize their search without a full platform rewrite. They hired Elena, a Madrid-based AI developer specializing in RAG systems and e-commerce.

Elena's approach had four components:

1. Product Embedding Pipeline

Every product in ModaHaus's catalog was processed through a pipeline that:

Combined the product title, description, category tags, customer reviews, and style notes into a single "product document"
Generated embeddings using OpenAI's text-embedding-3-large model
Stored the embeddings in a Qdrant vector database, alongside the original product metadata

This pipeline ran nightly to capture new products and updated descriptions.
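
The case study does not publish the pipeline code, so the following is only a minimal sketch of the "product document" step under stated assumptions. The field names, the label format, the review cap, and the `embed` callable are all hypothetical; in production `embed` would wrap a call to the embedding model, and the resulting records would be written to the vector database together with their metadata.

```python
from dataclasses import dataclass, field
from typing import Callable, Sequence

# Hypothetical signature: takes a batch of texts, returns one vector per text.
Embedder = Callable[[Sequence[str]], list[list[float]]]


@dataclass
class Product:
    sku: str
    title: str
    description: str
    category_tags: list[str] = field(default_factory=list)
    reviews: list[str] = field(default_factory=list)
    style_notes: str = ""


def build_product_document(product: Product, max_reviews: int = 3) -> str:
    """Merge the text fields of one product into a single string to embed."""
    fields = [
        ("Title", product.title),
        ("Description", product.description),
        ("Categories", ", ".join(product.category_tags)),
        ("Style notes", product.style_notes),
        # Cap the reviews so a popular product does not drown out its own description.
        ("Reviews", " | ".join(product.reviews[:max_reviews])),
    ]
    # Skip empty fields so they add no noise to the embedding.
    return "\n".join(f"{label}: {value}" for label, value in fields if value)


def build_index_records(products: Sequence[Product], embed: Embedder) -> list[dict]:
    """Embed every product document and pair each vector with its metadata."""
    documents = [build_product_document(p) for p in products]
    vectors = embed(documents)
    return [
        {"sku": p.sku, "vector": v, "payload": {"title": p.title, "tags": p.category_tags}}
        for p, v in zip(products, vectors)
    ]
```

Passing the embedder in as a parameter keeps the document-building logic testable without network access, and makes it easy to swap embedding models later.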

2. Hybrid Search Architecture

Instead of replacing Elasticsearch entirely, Elena built a hybrid system:

Keyword search (Elasticsearch): Still handles exact matches, SKU lookups, and structured filters (size, color, price range)
Semantic search (Qdrant): Handles natural language queries by embedding the search query and finding the closest product embeddings
Result fusion: A reciprocal rank fusion algorithm combines results from both systems, weighted 40% keyword / 60% semantic

This hybrid approach meant ModaHaus didn't lose their existing search quality while gaining natural language understanding.
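
As an illustration of the fusion step, here is one plausible way to apply a 40/60 weighting inside reciprocal rank fusion. It is a sketch, not ModaHaus's actual code: the function name is hypothetical, the smoothing constant `k = 60` is simply the value commonly used with this algorithm, and multiplying each list's contribution by its weight is an assumption about how the weighting was implemented.

```python
from collections import defaultdict
from typing import Sequence


def weighted_rrf(
    keyword_ranked: Sequence[str],
    semantic_ranked: Sequence[str],
    w_keyword: float = 0.4,
    w_semantic: float = 0.6,
    k: int = 60,
    top_n: int = 10,
) -> list[str]:
    """Fuse two ranked lists of SKUs into one list using weighted RRF."""
    scores: dict[str, float] = defaultdict(float)
    for weight, ranking in ((w_keyword, keyword_ranked), (w_semantic, semantic_ranked)):
        for rank, sku in enumerate(ranking, start=1):
            # Each list contributes weight / (k + rank); items missing from a list get nothing from it.
            scores[sku] += weight / (k + rank)
    # Highest fused score first; ties are broken by SKU to keep the order deterministic.
    return sorted(scores, key=lambda sku: (-scores[sku], sku))[:top_n]


keyword_hits = ["A", "B", "C"]   # hypothetical SKUs, best match first
semantic_hits = ["C", "D", "A"]
print(weighted_rrf(keyword_hits, semantic_hits))  # ['C', 'A', 'D', 'B']
```

Because the fusion works on ranks rather than raw scores, the two engines never need to be calibrated against each other, which is part of what makes it practical to bolt onto an existing Elasticsearch setup.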

3. Query Understanding Layer

Before the query hits the search engines, a lightweight LLM call (Claude Haiku) analyzes it:

Extracts structured attributes (occasion, budget, style, season)
Detects intent (browsing vs. specific item search)
Expands the query with relevant terms (e.g., "summer wedding" → "sundress, linen suit, cocktail dress, fascinator")

This pre-processing step dramatically improved result relevance for vague queries.
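
A minimal sketch of how such a layer could be structured, assuming the model is asked to answer in JSON. The prompt wording, the JSON keys, the `QueryAnalysis` class, and the `llm` callable are all hypothetical; in production `llm` would wrap the call to the model. The important design point is the fallback: if the model returns something unparseable, the raw query still goes through to search.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical prompt; the real one is not published in the case study.
PROMPT_TEMPLATE = (
    "Analyze this fashion search query and reply with JSON only, using the keys "
    '"occasion", "budget_eur", "style", "season", "intent" '
    '("browsing" or "specific_item") and "expanded_terms" (a list of strings).\n'
    "Query: {query}"
)


@dataclass
class QueryAnalysis:
    raw_query: str
    occasion: Optional[str] = None
    budget_eur: Optional[float] = None
    style: Optional[str] = None
    season: Optional[str] = None
    intent: str = "browsing"
    expanded_terms: list[str] = field(default_factory=list)


def analyze_query(query: str, llm: Callable[[str], str]) -> QueryAnalysis:
    """Ask the model for structured attributes; fall back to the raw query on bad output."""
    try:
        data = json.loads(llm(PROMPT_TEMPLATE.format(query=query)))
    except (json.JSONDecodeError, TypeError):
        return QueryAnalysis(raw_query=query)
    if not isinstance(data, dict):
        return QueryAnalysis(raw_query=query)

    budget = data.get("budget_eur")
    terms = data.get("expanded_terms")
    intent = data.get("intent")
    return QueryAnalysis(
        raw_query=query,
        occasion=data.get("occasion"),
        budget_eur=float(budget) if isinstance(budget, (int, float)) else None,
        style=data.get("style"),
        season=data.get("season"),
        intent=intent if intent in ("browsing", "specific_item") else "browsing",
        expanded_terms=[t for t in terms if isinstance(t, str)] if isinstance(terms, list) else [],
    )
```

The extracted budget can then be passed to Elasticsearch as a structured price filter, while the expanded terms are appended to the text that gets embedded for the semantic search.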

4. Conversational Follow-Up

If the initial search returns results but the customer wants to refine them, a lightweight chat widget (powered by Claude Sonnet) supports follow-up requests such as:

"Show me these but in blue"
"Anything cheaper?"
"I like the third one — what goes well with it?"

The chat maintains context from the original search and uses the RAG system to retrieve relevant products in real time.
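
How the widget stores that context is not described, so the following is only a sketch of one way to do it. The `SearchContext` class, the filter keys, and the SKUs are hypothetical, and it assumes the LLM has already turned each follow-up message into a small dictionary of filter changes or a result position.

```python
from dataclasses import dataclass, field, replace
from typing import Any


@dataclass(frozen=True)
class SearchContext:
    """State carried across turns of one search conversation."""
    original_query: str
    filters: dict[str, Any] = field(default_factory=dict)
    shown_skus: tuple[str, ...] = ()


def apply_follow_up(ctx: SearchContext, updates: dict[str, Any]) -> SearchContext:
    """Merge filter changes from a follow-up; a value of None removes that filter."""
    merged = {**ctx.filters, **updates}
    merged = {key: value for key, value in merged.items() if value is not None}
    return replace(ctx, filters=merged)


def resolve_position(ctx: SearchContext, position: int) -> str:
    """Map "the third one" (position=3) to the SKU shown in that slot."""
    if not 1 <= position <= len(ctx.shown_skus):
        raise IndexError(f"no result at position {position}")
    return ctx.shown_skus[position - 1]


ctx = SearchContext(
    original_query="something to wear to a summer wedding",
    filters={"occasion": "wedding", "season": "summer"},
    shown_skus=("MH-101", "MH-204", "MH-317"),
)
ctx = apply_follow_up(ctx, {"color": "blue"})        # "Show me these but in blue"
ctx = apply_follow_up(ctx, {"max_price_eur": 80})    # "Anything cheaper?"
anchor_sku = resolve_position(ctx, 3)                # "I like the third one"
```

Each follow-up produces a new context rather than a new search from scratch, so the occasion and season from the first query keep constraining later results.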

The Results

After a 4-week build and 2-week A/B test:

Search-to-purchase conversion: 8.2% → 11.0% (+2.8 percentage points, a 34% relative lift)
Average order value: +12% (the conversational follow-up drove cross-sells)
Zero-result searches: down 67% (semantic search finds relevant results even for unusual queries)
Customer satisfaction (post-search survey): +28 NPS points
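
The conversion figure above is a relative lift rather than a change in percentage points, and the two are easy to confuse. A quick check of the arithmetic, using only the numbers reported here (the helper name is ours):

```python
def relative_lift(before: float, after: float) -> float:
    """Relative change from `before` to `after`, as a fraction of `before`."""
    return (after - before) / before


before, after = 8.2, 11.0  # search-to-purchase conversion, in percent
print(f"+{after - before:.1f} percentage points, +{relative_lift(before, after):.0%} relative")
# prints: +2.8 percentage points, +34% relative
```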

Technical Stack

Embeddings: OpenAI text-embedding-3-large
Vector DB: Qdrant (self-hosted on Hetzner)
LLM: Claude Haiku (query understanding), Claude Sonnet (conversational search)
Backend: Python FastAPI service, sitting between the Next.js frontend and the search engines
Existing infrastructure: Elasticsearch (retained), PostgreSQL product catalog

Project Timeline & Cost

Duration: 6 weeks (4 build + 2 testing)
Freelancer cost: €14,400 (€800/day × 18 billable days)
Infrastructure cost: ~€200/month additional (Qdrant server + LLM API calls)
ROI: Based on the conversion lift, ModaHaus estimates the project paid for itself within the first month
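
The cost figures above can be checked with a few lines of arithmetic. The sketch below uses only the numbers reported in this section; the constant and function names are ours. ModaHaus's revenue figures are not published here, so the payback estimate itself cannot be reproduced, only the amount it would need to cover.

```python
DAY_RATE_EUR = 800
BILLABLE_DAYS = 18
INFRA_EUR_PER_MONTH = 200  # approximate, per the figures above

freelancer_cost = DAY_RATE_EUR * BILLABLE_DAYS


def total_cost(months: int) -> int:
    """One-off build cost plus the added infrastructure cost over `months`."""
    return freelancer_cost + INFRA_EUR_PER_MONTH * months


print(freelancer_cost)  # 14400
print(total_cost(1))    # 14600
print(total_cost(12))   # 16800
```

For the first-month payback estimate to hold, the extra profit from the conversion lift in that month would need to reach roughly €14,600.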

Key Takeaways

You don't need to rip and replace — hybrid search architectures let you add AI capabilities on top of existing systems
Query understanding is the secret weapon — pre-processing queries with an LLM is cheap and dramatically improves relevance
Conversational search drives revenue — customers who used the follow-up chat had 3x higher conversion than those who didn't
Right-size your models — Elena used Haiku for the fast, cheap query preprocessing and Sonnet only for the conversational feature where quality mattered most

If you're looking to add AI-powered search to your e-commerce platform, find a specialist on Workia.dev.