PDF RAG Assistant
A retrieval-augmented generation system that ingests PDF documents, indexes them in a persistent Chroma vector store, and answers questions using hybrid search, reranking, and strict context-only prompting to reduce hallucinations. The stack pairs a FastAPI backend for ingestion and query APIs with a Streamlit chat UI, OpenAI embeddings and chat models, and optional BM25 keyword retrieval fused with semantic search.
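The "strict context-only prompting" described above can be sketched as a small prompt builder. This is a minimal illustration, not the project's actual prompt; the helper name, refusal string, and chunk fields (`file`, `page`, `text`) are assumptions:

```python
# Hypothetical sketch of context-only prompt assembly with a refusal fallback.
REFUSAL = "I don't know based on the provided documents."

SYSTEM_PROMPT = (
    "Answer ONLY from the context below. "
    f"If the context is insufficient, reply exactly: {REFUSAL}"
)

def build_messages(question: str, chunks: list[dict]) -> list[dict]:
    """Assemble chat messages; each chunk carries file/page for citation."""
    if not chunks:
        # No retrieved context: short-circuit with the refusal instead of guessing.
        return [{"role": "assistant", "content": REFUSAL}]
    context = "\n\n".join(
        f"[{c['file']} p.{c['page']}] {c['text']}" for c in chunks
    )
    return [
        {"role": "system", "content": f"{SYSTEM_PROMPT}\n\nContext:\n{context}"},
        {"role": "user", "content": question},
    ]
```

Embedding the file and page tag next to each chunk is what lets the model emit source citations in its answer.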
Challenges
- Balancing retrieval breadth with precision across diverse PDFs
- Keeping answers strictly grounded while remaining helpful
- Managing first-load latency for the embedding and reranker models
Solutions
- Hybrid fusion plus reranking to tighten context before generation
- Strict system prompts with explicit "I don't know" fallbacks
- Modular code structure (ingestion, retriever, reranker, LLM) for clarity
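The hybrid-fusion step can be illustrated with Reciprocal Rank Fusion (RRF), a common way to merge semantic and keyword rankings; the doc IDs and the `k` constant below are illustrative assumptions, not the project's actual values:

```python
# Sketch of hybrid fusion via Reciprocal Rank Fusion (RRF).
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked doc-id lists: score(d) = sum over lists of 1/(k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["d3", "d1", "d2"]   # vector-search order (illustrative)
keyword  = ["d1", "d4", "d3"]   # BM25 order (illustrative)
fused = rrf_fuse([semantic, keyword])  # d1 ranks first: high in both lists
```

Documents appearing near the top of both lists accumulate the largest scores, which is why fusion tightens the candidate set before the reranker sees it.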
Key Features
- PDF upload, chunking, deduplication, and persistent Chroma indexing
- LLM query expansion and hybrid retrieval with document filters
- Cross-encoder reranking for top-k context selection
- Streaming answers with source citations (file and page)
- Anti-hallucination prompts with explicit insufficient-context handling
- REST API plus interactive Streamlit frontend
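The deduplication step in the features above can be sketched with content-hash chunk IDs; the normalization (strip + lowercase) and the 16-character digest are assumptions for illustration:

```python
import hashlib

def dedupe_chunks(chunks: list[str]) -> list[tuple[str, str]]:
    """Return (chunk_id, text) pairs, skipping exact duplicates.

    Deriving the ID from a content hash also makes re-ingesting the
    same PDF idempotent when the IDs are used as vector-store keys.
    """
    seen: set[str] = set()
    unique: list[tuple[str, str]] = []
    for text in chunks:
        digest = hashlib.sha256(text.strip().lower().encode()).hexdigest()[:16]
        if digest in seen:
            continue  # duplicate chunk: already indexed
        seen.add(digest)
        unique.append((digest, text))
    return unique
```

Keying the persistent index by content hash means uploading the same document twice adds no new vectors.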
Learnings
- Designed end-to-end RAG pipelines from PDF ingestion to streamed responses
- Combined vector (MMR) and BM25 retrieval for stronger recall
- Applied reranking to improve context quality before LLM generation
- Practiced production-minded API design, configuration, and structured logging
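The MMR selection mentioned above can be sketched over toy vectors; the two-dimensional embeddings and the lambda value below are illustrative assumptions, not the project's actual OpenAI embeddings:

```python
# Sketch of Maximal Marginal Relevance (MMR) over toy cosine similarities.
def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

def mmr(query: list[float], docs: list[list[float]], k: int = 2, lam: float = 0.3) -> list[int]:
    """Greedily pick k doc indices, trading query relevance against redundancy."""
    selected: list[int] = []
    remaining = list(range(len(docs)))
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda i: lam * cosine(query, docs[i])
            - (1 - lam) * max((cosine(docs[i], docs[j]) for j in selected), default=0.0),
        )
        selected.append(best)
        remaining.remove(best)
    return selected
```

With a low lambda, a near-duplicate of an already-selected chunk is penalized enough that a less similar but novel chunk wins the second slot, which is how MMR broadens recall across diverse PDFs.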
Highlights
Hybrid Retrieval
Cross-Encoder Rerank
Grounded Answers