Rupesh Sharma

I built DocQuify because I kept running into the same problem: I'd have a 60-page PDF open and need one specific thing from it. Ctrl+F only gets you so far when you don't know the exact wording.

The idea was straightforward: let people just ask their documents questions in normal language and get answers pulled from the actual content. Not summaries, not guesses. Specific answers with the context behind them.

The technical part that took the most thought was the chunking strategy. Splitting a PDF into pieces sounds simple until you realize that meaning often spans paragraph boundaries, and a naive split cuts it in half. I ended up using overlapping chunks, where each chunk repeats the tail of the previous one, so context doesn't get cut off at boundaries. That made a real difference in answer quality.
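To make the overlap idea concrete, here is a minimal sketch of a word-based overlapping splitter. The function name `chunk_text`, the word-level splitting, and the default sizes are assumptions for illustration; DocQuify's actual chunk size, overlap, and unit (characters, tokens, or sentences) are not stated here.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words.

    chunk_size and overlap are illustrative defaults, not DocQuify's real settings.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text,
        # so we don't emit a trailing chunk made only of repeated words.
        if start + chunk_size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = " ".join(str(i) for i in range(10))
    for chunk in chunk_text(sample, chunk_size=4, overlap=2):
        print(chunk)
```

With a window of 4 words and an overlap of 2, each chunk starts halfway through the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk, as long as it is shorter than the overlap.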

From there, each chunk gets embedded via OpenAI and stored in a vector DB. When someone asks a question, their query gets embedded too and matched semantically against the stored chunks. The closest chunks go into the prompt as context, and the model answers from those rather than from general knowledge.
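The matching step can be sketched as a brute-force cosine-similarity search. This is a stand-in under stated assumptions: in the real pipeline the vectors would come from the embedding model and the search would be handled by the vector DB, whose similarity metric is not specified here. The names `cosine_similarity` and `top_k` and the tiny hand-written vectors are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float],
          indexed: list[tuple[str, list[float]]],
          k: int = 3) -> list[tuple[float, str]]:
    """Return the k (score, chunk) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), chunk) for chunk, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings"; real embedding vectors are much larger.
    index = [
        ("chunk about refunds", [1.0, 0.0]),
        ("chunk about shipping", [0.0, 1.0]),
        ("chunk about both", [1.0, 1.0]),
    ]
    for score, chunk in top_k([1.0, 0.1], index, k=2):
        print(f"{score:.3f}  {chunk}")
```

A linear scan like this is fine for a handful of chunks; a vector DB exists precisely to avoid scanning every stored vector as the collection grows.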

Challenge

The obvious solution when people think "chat with PDF" is to just dump the whole document into the prompt. That breaks down quickly with anything longer than a few pages: you hit context limits, cost climbs, and the model loses focus in a sea of text.

The real challenge was building retrieval that's actually reliable. Getting the chunking wrong means the right answer exists in your database but never gets retrieved. Getting the embedding model wrong means semantic search returns irrelevant chunks. Either way the user gets a bad answer and blames the product.

Solution

Overlapping text chunks for context continuity across splits. OpenAI embeddings for semantic indexing. Top-k vector retrieval at query time, injected as grounded context into the prompt.

The model only sees what's relevant to the question, which keeps answers accurate and prevents it from drifting into general knowledge territory.
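As a rough illustration of injecting retrieved chunks as grounded context, the sketch below assembles a prompt from a question and a list of chunks. The function name `build_grounded_prompt` and the instruction wording are assumptions; the actual prompt DocQuify sends to the model is not shown here.

```python
def build_grounded_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a prompt that asks the model to answer only from the given chunks.

    The instruction text is illustrative, not DocQuify's real prompt.
    """
    # Number each chunk so an answer can point back to its source passage.
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


if __name__ == "__main__":
    prompt = build_grounded_prompt(
        "What is the notice period?",
        ["The notice period is 30 days.", "Fees are due monthly."],
    )
    print(prompt)
```

Telling the model explicitly what to do when the context lacks the answer is one common way to discourage it from falling back on general knowledge.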
