RAG Document Processing System

Insurance Technology

  • 90% faster document lookup
  • 25+ projects managed in monorepo
  • 23 insurers integrated via Calcly

Overview

At Insly, I led development of a RAG (Retrieval-Augmented Generation) system that gives insurance brokers fast, context-aware answers about policy details. The system combines traditional search with vector embeddings to handle complex queries across 23 different insurance providers.

Business Context

Insly is a leading insurance technology company serving brokers across Central Europe. As the insurance industry digitizes, brokers need quick access to policy information from dozens of providers. The company had accumulated thousands of policy documents but lacked an efficient way for brokers to search and retrieve relevant information. Manual document lookup was taking brokers 15-30 minutes per query, significantly impacting customer service response times.

Challenge

Insurance brokers needed to quickly find relevant information across thousands of policy documents from 23 different insurers, each with unique formats and terminology.

  • Thousands of policy documents in various formats (PDF, Word, HTML)
  • Complex insurance terminology requiring domain-specific understanding
  • Need for both Polish and English language support

Solution

We built a hybrid search system combining Elasticsearch for keyword matching with Qdrant vector database for semantic understanding, powered by Polish-optimized embeddings.

  • Built RAG system with hybrid search (Elasticsearch + Qdrant vector DB)
  • Implemented Polish-optimized embeddings for semantic search
  • Integrated AWS Bedrock and local Ollama for flexible LLM backends

Approach & Methodology

We started with a thorough analysis of how brokers currently search for information and what types of questions they need answered. Using Event Storming workshops, we mapped the document lifecycle and identified key retrieval patterns. We then designed a hybrid search architecture that combines traditional keyword search for exact matches with semantic vector search for conceptual queries. The system was built iteratively, with weekly demos to gather broker feedback.

Implementation Details

Hybrid Search Architecture

Combined Elasticsearch BM25 scoring with Qdrant vector similarity for optimal retrieval. Documents are chunked intelligently to preserve context while enabling precise matching.
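A minimal sketch of the fusion step, assuming each engine returns a best-first list of document IDs. Reciprocal rank fusion (RRF) is one common way to merge a BM25 list with a vector-similarity list; the function and the sample IDs below are illustrative, not the production code.

```python
def reciprocal_rank_fusion(keyword_hits, vector_hits, k=60):
    """Merge two best-first lists of document IDs into one ranking.

    A document scores 1 / (k + rank + 1) in each list it appears in,
    so items ranked highly by either engine float to the top.
    """
    scores = {}
    for hits in (keyword_hits, vector_hits):
        for rank, doc_id in enumerate(hits):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from Elasticsearch (BM25) and Qdrant (vectors):
bm25_hits = ["policy-17", "policy-03", "policy-42"]
vector_hits = ["policy-03", "policy-99", "policy-17"]
print(reciprocal_rank_fusion(bm25_hits, vector_hits))
```

A document found by both engines ("policy-03" above) outranks one found by only a single engine, which is exactly the behaviour that makes hybrid search robust to exact codes and to paraphrased queries alike.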

Polish Language Embeddings

Implemented custom Polish language embeddings optimized for insurance domain vocabulary, significantly improving semantic search accuracy for Polish-language documents.
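At retrieval time, semantic search reduces to nearest-neighbour ranking over embedding vectors. The sketch below uses toy 3-dimensional vectors in place of real Polish embedding output; the document IDs and values are invented for illustration only.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy 3-dimensional vectors standing in for real embedding output.
doc_vecs = {
    "oc-policy": [0.9, 0.1, 0.0],  # e.g. "ubezpieczenie OC" (liability insurance)
    "ac-policy": [0.8, 0.3, 0.1],
    "travel":    [0.0, 0.2, 0.9],
}

def semantic_search(query_vec, doc_vecs, top_k=2):
    """Rank documents by cosine similarity to the query embedding."""
    ranked = sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]),
                    reverse=True)
    return ranked[:top_k]

print(semantic_search([1.0, 0.0, 0.0], doc_vecs))
```

In production a vector database such as Qdrant performs this ranking with approximate nearest-neighbour indexes; the point of domain-tuned embeddings is that related Polish insurance terms land close together in this space even when they share no keywords.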

Flexible LLM Backend

Designed pluggable LLM architecture supporting both AWS Bedrock (Claude) for production and local Ollama instances for development and cost-sensitive use cases.
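One way such a pluggable backend can look in Python, assuming a single `complete()` method as the shared interface. The class and method names are illustrative, and the actual Bedrock and Ollama calls are stubbed out rather than reproduced.

```python
from typing import Protocol

class LLMBackend(Protocol):
    """Minimal interface every backend must satisfy."""
    def complete(self, prompt: str) -> str: ...

class BedrockBackend:
    """Production backend; real code would call AWS Bedrock via boto3."""
    def complete(self, prompt: str) -> str:
        raise NotImplementedError("requires AWS credentials and boto3")

class OllamaBackend:
    """Local backend; real code would POST to the Ollama HTTP API."""
    def __init__(self, model: str = "llama3"):
        self.model = model

    def complete(self, prompt: str) -> str:
        return f"[{self.model}] stub answer for: {prompt}"

def answer(question: str, backend: LLMBackend) -> str:
    # Caller code is identical regardless of which backend is plugged in.
    return backend.complete(question)
```

Because callers depend only on the `LLMBackend` protocol, switching between cloud and local models is a configuration change, not a code change, which is what makes the cost flexibility mentioned above practical.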

Key Decisions

  • Chose Qdrant over Pinecone for vector storage due to self-hosting requirements and GDPR compliance
  • Implemented hybrid search (BM25 + vectors) rather than pure semantic search for better handling of exact policy numbers and codes
  • Built pluggable LLM backend to support both cloud (AWS Bedrock) and local (Ollama) models for cost flexibility

Tech Stack

Python, FastAPI, Elasticsearch, Qdrant, AWS Bedrock, Go, PostgreSQL


Lessons Learned

  • Domain-specific embeddings significantly outperform generic models for specialized vocabularies like insurance terminology
  • Chunking strategy is critical: chunks that are too small lose context, while chunks that are too large reduce precision. We found 512 tokens with a 50-token overlap optimal
  • Early user involvement is essential - brokers helped identify edge cases we wouldn't have discovered through testing alone
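The chunking lesson above can be sketched as a sliding window over a token list. The 512/50 figures come from the text; the function itself is an illustrative reconstruction, not the production implementation.

```python
def chunk_tokens(tokens, size=512, overlap=50):
    """Split a token list into overlapping windows.

    The step between window starts is size - overlap, so consecutive
    chunks share `overlap` tokens of context across the boundary.
    """
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reached the end
    return chunks

# A 1000-token document yields 3 chunks: [0..511], [462..973], [924..999].
chunks = chunk_tokens(list(range(1000)))
```

The overlap is what prevents a policy clause from being split in half with no chunk containing it whole, at the cost of indexing roughly 10% more tokens.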

Project Information

Timeline

6 months (ongoing improvements)

Team

2 developers + 1 product owner

Results

  • 90% faster document lookup
  • 25+ projects managed in monorepo
  • 23 insurers integrated via Calcly

Have a Similar Challenge?

Let's discuss how I can help your project succeed with proven architecture and AI solutions.