A reading instrument

Close reading, at corpus scale.

Does the document talk about a topic — and does it talk about it in the right place? Document Lens turns a folder of PDFs, a keyword list, and the axes you care about into coverage heatmaps, scores, trend lines, and concordances. Built for researchers studying corporate disclosure; at home in any keyword-driven study of unstructured text.

Local-first. Your documents and every analysis live in SQLite on this device.

The analysis workflow

01

Import & Analyze

Import PDFs and run document-level analysis: readability, writing quality, word frequency. A shared library means each document is processed once, reused everywhere.
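One way to get "processed once, reused everywhere" is to key each document by a hash of its contents. The sketch below is an assumption about how such a library could work, not Document Lens's actual schema: the `documents` table, the `import_document` function, and the placeholder `analyze` step are all hypothetical.

```python
import hashlib
import sqlite3


def analyze(data: bytes) -> int:
    # Hypothetical stand-in for the real document-level analysis:
    # it only counts whitespace-separated tokens.
    return len(data.split())


def import_document(conn: sqlite3.Connection, data: bytes) -> tuple[str, bool]:
    """Store a document once, keyed by content hash.

    Returns (sha256 hex digest, True if the document was newly processed).
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "sha256 TEXT PRIMARY KEY, word_count INTEGER NOT NULL)"
    )
    digest = hashlib.sha256(data).hexdigest()
    row = conn.execute(
        "SELECT 1 FROM documents WHERE sha256 = ?", (digest,)
    ).fetchone()
    if row is not None:
        return digest, False  # already in the library: reuse the stored result
    conn.execute(
        "INSERT INTO documents (sha256, word_count) VALUES (?, ?)",
        (digest, analyze(data)),
    )
    conn.commit()
    return digest, True


conn = sqlite3.connect(":memory:")  # a file path would keep the library on disk
first = import_document(conn, b"annual report text")
second = import_document(conn, b"annual report text")
```

Importing the same bytes a second time returns the same digest and skips the analysis step, so any project that references the document shares one stored result.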

02

Keyword Search

Search your framework terms across every document at once. Taxonomy hierarchies give you tier-level analysis; polarity separates genuine delivery from performative language.
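As a rough illustration of tier-level counts split by polarity, the sketch below tags each framework term with a tier and a polarity, then totals matches per pair. The terms, the tier names, the polarity labels, and the `tier_polarity_counts` function are hypothetical examples, not the tool's built-in framework.

```python
import re
from collections import Counter

# Hypothetical framework: term -> (tier, polarity). Labels are illustrative only.
FRAMEWORK = {
    "renewable energy": ("Biosphere", "delivery"),
    "emissions reduced": ("Biosphere", "delivery"),
    "committed to": ("Biosphere", "performative"),
    "scholarships": ("Society", "delivery"),
}


def tier_polarity_counts(text: str) -> Counter:
    """Count whole-phrase, case-insensitive matches per (tier, polarity)."""
    counts = Counter()
    for term, key in FRAMEWORK.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        counts[key] += len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


sample = (
    "We are committed to renewable energy. Renewable energy use doubled, "
    "and 40 scholarships were awarded."
)
result = tier_polarity_counts(sample)
```

For the sample sentence, the Biosphere tier collects two delivery matches and one performative match, which is the kind of contrast the polarity split is meant to surface.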

03

N-gram Discovery

Find the two- and three-word phrases the corpus actually uses, so the vocabulary you search for is the vocabulary that's there — not just the one you started with.
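Phrase discovery of this kind can be sketched with a plain n-gram counter. This is a minimal version under simple assumptions: lowercase tokens, no stop-word filtering, and n-grams that may span sentence boundaries. The `ngram_counts` name and the sample corpus are made up for illustration.

```python
import re
from collections import Counter


def ngram_counts(texts, n):
    """Count every run of n consecutive tokens across a list of texts."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z0-9']+", text.lower())
        # Zip n staggered views of the token list to form the n-grams.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


corpus = [
    "Net zero target set. Net zero target met.",
    "Our net zero plan.",
]
bigrams = ngram_counts(corpus, 2)
trigrams = ngram_counts(corpus, 3)
```

Here `bigrams.most_common(1)` puts `("net", "zero")` first, a phrase worth adding to the keyword list even if the starting framework only had "carbon neutral".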

04

Visualize

Generate charts comparing keyword usage, trends, and document coverage — then export the tables and figures straight into your paper.

Nine workflows over one project

Measure

Coverage heatmaps for every keyword × document pair. A single Wedding Cake score per report when you need one number, year-over-year tracking when you need the trend, and rankings when you need the comparison.
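The data behind a coverage heatmap is a keyword × document matrix of counts. The sketch below builds one with case-insensitive substring counting. The `coverage_matrix` function and the sample reports are hypothetical, and it stops at raw counts because the Wedding Cake scoring rule is not specified here.

```python
def coverage_matrix(
    documents: dict[str, str], keywords: list[str]
) -> dict[str, dict[str, int]]:
    """Return {keyword: {document name: occurrences}}.

    Matching is a case-insensitive substring count, the simplest possible rule.
    """
    return {
        kw: {
            name: text.lower().count(kw.lower())
            for name, text in documents.items()
        }
        for kw in keywords
    }


docs = {
    "report_2022": "Biodiversity plan launched. Biodiversity audit pending.",
    "report_2023": "Gender equality policy adopted.",
}
matrix = coverage_matrix(docs, ["biodiversity", "gender equality"])
```

Each inner dictionary is one heatmap row, and a zero marks a document that never mentions the keyword.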

Interrogate

Audit whether each keyword is used in the right context — anomalies and confirmations side by side. Then open the concordance and read what the document actually says, with the PDF right alongside.
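A concordance is a keyword-in-context listing: every hit shown with a window of surrounding text. A minimal sketch, assuming a fixed character window and whole-word, case-insensitive matching (the `concordance` function and its `width` parameter are illustrative names):

```python
import re


def concordance(text: str, keyword: str, width: int = 30) -> list[str]:
    """Return each match of keyword with `width` characters of context per side."""
    pattern = r"\b" + re.escape(keyword) + r"\b"
    lines = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Brackets mark the matched keyword inside its context.
        lines.append(f"{left}[{m.group(0)}]{right}")
    return lines


text = "The campus cut waste by half. Waste audits continue each term."
hits = concordance(text, "waste", width=10)
```

Reading the two hits side by side shows one use reporting a result and one describing an ongoing process, a distinction a bare count would hide.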

Map

Cross two axes — SDG × Function, Pillar × section, any pair you define — and see how each document distributes its attention. Talking about a topic is one thing; talking about it in the right place is the finding.
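Crossing two axes amounts to a cross-tabulation. In the sketch below, each axis maps labels to trigger terms, and a passage adds one to every cell whose two labels it matches. The axis contents, the passage list, and the `cross_axes` function are assumptions made for illustration.

```python
from collections import Counter


def cross_axes(passages, axis_a, axis_b):
    """Count passages that match a label on axis_a and a label on axis_b."""

    def labels(text, axis):
        low = text.lower()
        return [
            label
            for label, terms in axis.items()
            if any(term in low for term in terms)
        ]

    cells = Counter()
    for passage in passages:
        for a in labels(passage, axis_a):
            for b in labels(passage, axis_b):
                cells[(a, b)] += 1
    return cells


# Hypothetical axes: label -> lowercase trigger terms.
sdg = {"SDG 4": ["education"], "SDG 13": ["climate"]}
function = {"Teaching": ["curriculum", "course"], "Research": ["grant", "laboratory"]}
passages = [
    "Climate modules were added to the curriculum.",
    "A new grant funds climate modelling.",
    "Education quality improved in every course.",
]
table = cross_axes(passages, sdg, function)
```

In this toy corpus, SDG 13 appears under both Teaching and Research while SDG 4 appears only under Teaching, so the empty SDG 4 × Research cell is itself a finding.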

Local-first, by design

Research corpora are often confidential, embargoed, or simply yours. Document Lens is built so your documents and your analysis never leave your machine.

For disclosure researchers

The original use case: sustainability reporting in university annual reports. Keyword polarity supports both narratives from one list — terms that signal delivery, and counter-terms that signal greenwashing — with the Wedding Cake score grounding it all in the SDG model.

For any keyword-driven study

Policy documents, curricula, submissions, transcripts: if you have a corpus, a framework, and a question about who says what and where, the same projects, axes, and scoring rules apply. Bring your own taxonomy.