RAG Document Processing System
Insurance Technology
Overview
At Insly, I led development of a RAG (Retrieval-Augmented Generation) system that gives insurance brokers fast, context-aware answers about policy details. The system combines traditional search with vector embeddings to handle complex queries across 23 different insurance providers.
Business Context
Insly is a leading insurance technology company serving brokers across Central Europe. As the insurance industry digitizes, brokers need quick access to policy information from dozens of providers. The company had accumulated thousands of policy documents but lacked an efficient way for brokers to search and retrieve relevant information. Manual document lookup was taking brokers 15-30 minutes per query, significantly impacting customer service response times.
Challenge
Insurance brokers needed to quickly find relevant information across thousands of policy documents from 23 different insurers, each with unique formats and terminology.
- Thousands of policy documents in various formats (PDF, Word, HTML)
- Complex insurance terminology requiring domain-specific understanding
- Need for both Polish and English language support
Solution
We built a hybrid search system combining Elasticsearch for keyword matching with Qdrant vector database for semantic understanding, powered by Polish-optimized embeddings.
- Built RAG system with hybrid search (Elasticsearch + Qdrant vector DB)
- Implemented Polish-optimized embeddings for semantic search
- Integrated AWS Bedrock and local Ollama for flexible LLM backends
Approach & Methodology
We started with a thorough analysis of how brokers currently search for information and what types of questions they need answered. Using Event Storming workshops, we mapped the document lifecycle and identified key retrieval patterns. We then designed a hybrid search architecture that combines traditional keyword search for exact matches with semantic vector search for conceptual queries. The system was built iteratively, with weekly demos to gather broker feedback.
Implementation Details
Hybrid Search Architecture
Combined Elasticsearch BM25 scoring with Qdrant vector similarity for optimal retrieval. Documents are chunked intelligently to preserve context while enabling precise matching.
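To illustrate the hybrid idea, here is a minimal pure-Python sketch of fusing a BM25 ranking with a vector-similarity ranking. The source does not specify the fusion method, so reciprocal rank fusion (a common choice), the `k` constant, and the example document IDs are all assumptions; in the real system the two lists would come from Elasticsearch and Qdrant respectively.

```python
# Sketch: reciprocal rank fusion (RRF) of two ranked hit lists.
# The fusion method and k=60 are assumptions, not the project's documented choice.

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) across the lists it appears in,
    so documents ranked well by BOTH keyword and semantic search rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical hit lists from the two backends:
bm25_hits = ["policy-17", "policy-03", "policy-42"]    # Elasticsearch BM25
vector_hits = ["policy-03", "policy-42", "policy-17"]  # Qdrant similarity

fused = rrf_fuse([bm25_hits, vector_hits])
# "policy-03" wins: it ranks near the top in both lists.
```

RRF needs no score normalization, which matters when one backend emits BM25 scores and the other cosine similarities on incompatible scales.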
Polish Language Embeddings
Implemented custom Polish language embeddings optimized for insurance domain vocabulary, significantly improving semantic search accuracy for Polish-language documents.
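Once documents are embedded, semantic lookup reduces to nearest-neighbor search over chunk vectors. The sketch below shows that mechanic with cosine similarity; the tiny hand-made vectors and chunk IDs are placeholders for what a Polish-tuned embedding model and Qdrant would actually produce.

```python
import math

# Illustrative only: real vectors come from the embedding model and are
# searched in Qdrant; these 3-dimensional stand-ins just show the math.

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec: list[float], chunk_vecs: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the k chunk IDs most similar to the query vector."""
    ranked = sorted(chunk_vecs, key=lambda cid: cosine(query_vec, chunk_vecs[cid]), reverse=True)
    return ranked[:k]

# Hypothetical embedded chunks of policy text:
chunks = {
    "oc-limits": [0.9, 0.1, 0.0],
    "ac-theft":  [0.2, 0.8, 0.1],
    "gap-terms": [0.1, 0.2, 0.9],
}
query = [0.85, 0.2, 0.05]  # embedding of a broker's question
best = top_k(query, chunks)
```

Domain-tuned embeddings improve this step because insurance terms that look unrelated lexically end up close in vector space.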
Flexible LLM Backend
Designed pluggable LLM architecture supporting both AWS Bedrock (Claude) for production and local Ollama instances for development and cost-sensitive use cases.
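A pluggable backend like this can be expressed as a small interface that the RAG pipeline depends on, with one implementation per provider. The class and method names below are illustrative, not the project's actual API, and the network calls are stubbed out.

```python
from typing import Protocol

# Sketch of the pluggable-backend idea. Real implementations would call
# AWS Bedrock (boto3) and a local Ollama HTTP endpoint; both are stubbed here.

class LLMBackend(Protocol):
    def complete(self, prompt: str) -> str: ...

class BedrockBackend:
    """Production backend: would invoke Claude via AWS Bedrock."""
    def complete(self, prompt: str) -> str:
        return f"[bedrock] answer for: {prompt}"

class OllamaBackend:
    """Dev / cost-sensitive backend: would call a local Ollama server."""
    def complete(self, prompt: str) -> str:
        return f"[ollama] answer for: {prompt}"

def answer(question: str, context: str, backend: LLMBackend) -> str:
    """Assemble retrieved context into a prompt and delegate to any backend."""
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return backend.complete(prompt)
```

Because the pipeline only sees `LLMBackend`, switching between cloud and local models is a configuration change rather than a code change.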
Key Decisions
- Chose Qdrant over Pinecone for vector storage due to self-hosting requirements and GDPR compliance
- Implemented hybrid search (BM25 + vectors) rather than pure semantic search for better handling of exact policy numbers and codes
- Built pluggable LLM backend to support both cloud (AWS Bedrock) and local (Ollama) models for cost flexibility
Lessons Learned
- Domain-specific embeddings significantly outperform generic models for specialized vocabularies like insurance terminology
- Chunking strategy is critical: chunks that are too small lose context, while chunks that are too large reduce precision. We found 512 tokens with a 50-token overlap optimal
- Early user involvement is essential - brokers helped identify edge cases we wouldn't have discovered through testing alone
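The chunking lesson above amounts to a sliding window with overlap. A minimal sketch, using whitespace-split words as a stand-in for the real tokenizer's tokens:

```python
# Sliding-window chunking with overlap. The production system chunks by
# tokenizer tokens (512 with 50 overlap); plain words stand in for tokens here.

def chunk(tokens: list[str], size: int = 512, overlap: int = 50) -> list[list[str]]:
    """Split a token list into windows of `size`, each sharing `overlap`
    tokens with its predecessor so context survives across boundaries."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # last window already covers the tail
    return chunks

words = [f"w{i}" for i in range(1200)]  # a synthetic 1200-token document
parts = chunk(words)  # -> 3 chunks, each overlapping the next by 50 tokens
```

The overlap means a sentence straddling a chunk boundary still appears whole in at least one chunk, which is what preserves answer quality at retrieval time.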
Project Information
Timeline
6 months (ongoing improvements)
Team
2 developers + 1 product owner
Results
- 90% faster document lookup
- 25+ projects managed in monorepo
- 23 insurers integrated via Calcly