The Challenge
ModaHaus, a Berlin-based online fashion retailer with 12,000+ SKUs, had a search problem. Their Elasticsearch-powered search worked fine for exact queries like "blue Nike running shoes size 42" — but fell apart when customers searched the way they naturally think:
- "something to wear to a summer wedding"
- "cozy work-from-home outfit under €100"
- "similar to that green dress Zendaya wore at the Oscars"
Their search conversion rate (searches that led to a purchase) sat at 8.2% — well below the industry benchmark of 12–15%. They were leaving money on the table.
The Solution: RAG-Powered Semantic Search
ModaHaus posted a project on Workia.dev looking for an AI freelancer who could modernize their search without a full platform rewrite. They hired Elena, a Madrid-based AI developer specializing in RAG systems and e-commerce.
Elena's approach had four components:
1. Product Embedding Pipeline
Every product in ModaHaus's catalog was processed through a pipeline that:
- Combined the product title, description, category tags, customer reviews, and style notes into a single "product document"
- Generated embeddings using OpenAI's text-embedding-3-large model
- Stored the embeddings in a Qdrant vector database, alongside the original product metadata
This pipeline ran nightly to capture new products and updated descriptions.
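The pipeline steps above can be sketched roughly as follows. This is a minimal illustration, not Elena's actual code: the `Product` fields, the `build_product_document` and `embed_catalog` helpers, the review cap, and the batch size are all assumptions. The `embed` callable stands in for whatever wraps the embedding model call, and the returned records stand in for what would be written to the vector database.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Product:
    # Hypothetical record shape; the real catalog schema is not given in the source.
    sku: str
    title: str
    description: str
    category_tags: list[str] = field(default_factory=list)
    reviews: list[str] = field(default_factory=list)
    style_notes: str = ""


def build_product_document(p: Product, max_reviews: int = 3) -> str:
    """Merge the text fields of one product into a single string to embed."""
    parts = [p.title, p.description]
    if p.category_tags:
        parts.append("Tags: " + ", ".join(p.category_tags))
    if p.reviews:
        # Cap the number of reviews so one popular product does not
        # produce an oversized document (the cap is an assumption).
        parts.append("Reviews: " + " | ".join(p.reviews[:max_reviews]))
    if p.style_notes:
        parts.append("Style notes: " + p.style_notes)
    return "\n".join(part for part in parts if part)


def embed_catalog(
    products: list[Product],
    embed: Callable[[list[str]], list[list[float]]],
    batch_size: int = 64,
) -> list[dict]:
    """Return one record per product: SKU, vector, and metadata payload."""
    records = []
    for start in range(0, len(products), batch_size):
        batch = products[start:start + batch_size]
        docs = [build_product_document(p) for p in batch]
        vectors = embed(docs)  # one vector per document, same order
        for p, vec in zip(batch, vectors):
            records.append({
                "sku": p.sku,
                "vector": vec,
                "payload": {"title": p.title, "tags": p.category_tags},
            })
    return records
```

Passing the embedding function in as a parameter keeps the nightly job testable without network access; a stub that returns fixed vectors is enough to exercise the batching logic.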
2. Hybrid Search Architecture
Instead of replacing Elasticsearch entirely, Elena built a hybrid system:
- Keyword search (Elasticsearch): Still handles exact matches, SKU lookups, and structured filters (size, color, price range)
- Semantic search (Qdrant): Handles natural language queries by embedding the search query and finding the closest product embeddings
- Result fusion: Weighted reciprocal rank fusion (RRF) merges the two ranked result lists, with the keyword ranking weighted at 40% and the semantic ranking at 60%
This hybrid approach meant ModaHaus didn't lose their existing search quality while gaining natural language understanding.
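As an illustration of the result fusion step, here is a minimal sketch of weighted reciprocal rank fusion. The 0.4 / 0.6 weights come from the description above; the function name, the smoothing constant `k = 60` (a commonly used default for RRF), and the tie-breaking rule are assumptions, not details confirmed by the project.

```python
from collections import defaultdict


def weighted_rrf(
    keyword_ranked: list[str],
    semantic_ranked: list[str],
    w_keyword: float = 0.4,
    w_semantic: float = 0.6,
    k: int = 60,
) -> list[str]:
    """Fuse two ranked lists of product IDs into one ranking.

    Each list contributes weight / (k + rank) to a product's score,
    where rank starts at 1 for the top result.
    """
    scores: dict[str, float] = defaultdict(float)
    for weight, ranked in ((w_keyword, keyword_ranked), (w_semantic, semantic_ranked)):
        for rank, product_id in enumerate(ranked, start=1):
            scores[product_id] += weight / (k + rank)
    # Highest score first; ties broken by product ID for a stable order.
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


fused = weighted_rrf(["a", "b", "c"], ["c", "d", "a"])
```

Because RRF works on ranks rather than raw scores, the Elasticsearch relevance scores and the vector similarity scores never need to be normalized against each other. In the example, products that appear in both lists ("a" and "c") end up ahead of those that appear in only one.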
3. Query Understanding Layer
Before the query hits the search engines, a lightweight LLM call (Claude Haiku) analyzes it:
- Extracts structured attributes (occasion, budget, style, season)
- Detects intent (browsing vs. specific item search)
- Expands the query with relevant terms (e.g., "summer wedding" → "sundress, linen suit, cocktail dress, fascinator")
This pre-processing step dramatically improved result relevance for vague queries.
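A rough sketch of what such a query understanding step might look like is shown below. The prompt wording, the JSON keys, the `QueryAnalysis` type, and the `understand_query` function are all hypothetical. The `call_llm` parameter is a stand-in for the model call, which is deliberately left abstract here.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical prompt; the real prompt used in the project is not published.
PROMPT_TEMPLATE = (
    "Analyze this fashion search query and reply with JSON only, using the keys "
    "occasion, budget_max_eur, style, season, intent, expanded_terms.\n"
    "intent must be 'browsing' or 'specific_item'.\n"
    "Query: {query}"
)


@dataclass
class QueryAnalysis:
    original_query: str
    occasion: Optional[str] = None
    budget_max_eur: Optional[float] = None
    style: Optional[str] = None
    season: Optional[str] = None
    intent: str = "browsing"
    expanded_terms: list[str] = field(default_factory=list)

    def search_text(self) -> str:
        """Original query plus expansion terms, ready to embed or match."""
        return " ".join([self.original_query, *self.expanded_terms])


def understand_query(query: str, call_llm: Callable[[str], str]) -> QueryAnalysis:
    """Ask the model for structured attributes; fall back to the raw query."""
    try:
        data = json.loads(call_llm(PROMPT_TEMPLATE.format(query=query)))
    except (json.JSONDecodeError, TypeError):
        return QueryAnalysis(original_query=query)
    if not isinstance(data, dict):
        return QueryAnalysis(original_query=query)

    budget = data.get("budget_max_eur")
    intent = data.get("intent")
    terms = data.get("expanded_terms")
    return QueryAnalysis(
        original_query=query,
        occasion=data.get("occasion"),
        budget_max_eur=float(budget) if isinstance(budget, (int, float)) else None,
        style=data.get("style"),
        season=data.get("season"),
        intent=intent if intent in ("browsing", "specific_item") else "browsing",
        expanded_terms=[t for t in terms if isinstance(t, str)] if isinstance(terms, list) else [],
    )
```

The fallback path matters in practice: if the model returns malformed output, the search still runs on the customer's original text instead of failing.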
4. Conversational Follow-Up
If the initial search returns results but the customer wants to refine them, a lightweight chat widget (powered by Claude Sonnet) accepts follow-up requests such as:
- "Show me these but in blue"
- "Anything cheaper?"
- "I like the third one — what goes well with it?"
The chat maintains context from the original search and uses the RAG system to retrieve relevant products in real time.
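The source does not describe how the chat keeps its context, so the following is only one plausible shape for it. It assumes the model turns each follow-up into a small set of structured updates (for example `{"color": "blue"}`), which are merged into the filters of the previous search. `SearchContext`, `refine`, and `resolve_ordinal` are hypothetical names.

```python
from dataclasses import dataclass, field, replace
from typing import Optional


@dataclass(frozen=True)
class SearchContext:
    query: str                                   # the original search text
    filters: dict = field(default_factory=dict)  # active structured filters
    last_results: tuple = ()                     # product IDs in displayed order


def refine(ctx: SearchContext, updates: dict) -> SearchContext:
    """Return a new context with the follow-up's filter changes applied.

    A value of None removes that filter; all other filters carry over.
    """
    merged = {**ctx.filters, **updates}
    merged = {key: value for key, value in merged.items() if value is not None}
    return replace(ctx, filters=merged)


def resolve_ordinal(ctx: SearchContext, position: int) -> Optional[str]:
    """Map a reference like "the third one" (position=3) to a product ID."""
    if 1 <= position <= len(ctx.last_results):
        return ctx.last_results[position - 1]
    return None


ctx = SearchContext("summer wedding", {"price_max": 100}, ("p1", "p2", "p3"))
in_blue = refine(ctx, {"color": "blue"})   # "Show me these but in blue"
third = resolve_ordinal(ctx, 3)            # "I like the third one"
```

Keeping the context immutable and returning a new object for each turn makes it straightforward to store the conversation per session and to step back to an earlier refinement.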
The Results
After a 4-week build and a 2-week A/B test:
- Search-to-purchase conversion: 8.2% → 11.0% (+34%)
- Average order value: +12% (the conversational follow-up drove cross-sells)
- Zero-result searches: down 67% (semantic search finds relevant results even for unusual queries)
- Customer satisfaction (post-search survey): +28 NPS points
Technical Stack
- Embeddings: OpenAI text-embedding-3-large
- Vector DB: Qdrant (self-hosted on Hetzner)
- LLM: Claude Haiku (query understanding), Claude Sonnet (conversational search)
- Backend: Python FastAPI service, sitting between the Next.js frontend and the search engines
- Existing infrastructure: Elasticsearch (retained), PostgreSQL product catalog
Project Timeline & Cost
- Duration: 6 weeks (4 build + 2 testing)
- Freelancer cost: €14,400 (€800/day × 18 billable days)
- Infrastructure cost: ~€200/month additional (Qdrant server + LLM API calls)
- ROI: Based on the conversion lift, ModaHaus estimates the project paid for itself within the first month
Key Takeaways
- You don't need to rip and replace — hybrid search architectures let you add AI capabilities on top of existing systems
- Query understanding is the secret weapon — pre-processing queries with an LLM is cheap and dramatically improves relevance
- Conversational search drives revenue — customers who used the follow-up chat had 3x higher conversion than those who didn't
- Right-size your models — Elena used Haiku for the fast, cheap query preprocessing and Sonnet only for the conversational feature where quality mattered most
If you're looking to add AI-powered search to your e-commerce platform, find a specialist on Workia.dev.