Flagship Project

Truth Preservation Network

A civilizational project to digitize, verify, and preserve historical documents — building the verified data foundation that AI desperately needs.


The Problem

History Is Disappearing

Physical documents deteriorate. Libraries are underfunded. AI models trained on unverified data are more prone to hallucinate. The Truth Preservation Network (TPN) exists to address all three at once.

📚

Documents Deteriorating

Newspapers, manuscripts, and books from the 19th and 20th centuries are decaying faster than institutions can preserve them.

🏛️

Libraries Underfunded

Public libraries lack the resources to digitize their collections at scale. Community participation is the only viable solution.

🤖

AI Needs Verified Data

Current AI models hallucinate in part because they are trained on unverified sources. Grounded, provenance-backed data gives them a verifiable record to draw on.

The Solution

The TPN Pipeline

A physical page enters one end. A verified, searchable, hashed digital record exits the other.

📱

Scan

Volunteer photographs document with phone

→

🔤

OCR

Tesseract extracts text at 94%+ confidence

→

🧹

Clean

Python normalizes and corrects OCR output

→

🔐

Hash

SHA-256 fingerprint proves content integrity

→

🔍

Search

Semantic API serves verified archive

Current Status

Phase 1 — Foundation Complete

Built in public. Every milestone documented. Every component validated before moving forward.

✅ Complete

OCR Engine

Tesseract-powered text extraction achieving 94%+ confidence on historical documents.
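
One way a confidence figure like this could be computed is by averaging Tesseract's per-word confidence scores. The sketch below assumes the pytesseract wrapper and Pillow are installed; the function names and the choice of a plain mean are illustrative assumptions, not necessarily how TPN measures confidence.

```python
def mean_confidence(ocr_data):
    """Average per-word confidence from a pytesseract-style result dict.

    Entries with confidence -1 are layout boxes rather than words, so they
    are skipped, as are entries whose text is empty or whitespace.
    """
    scores = []
    for text, conf in zip(ocr_data["text"], ocr_data["conf"]):
        conf = float(conf)  # conf may arrive as str, int, or float
        if conf >= 0 and text.strip():
            scores.append(conf)
    return sum(scores) / len(scores) if scores else 0.0


def ocr_page(image_path):
    """Run Tesseract on one image and return (text, mean confidence)."""
    # Imported lazily so mean_confidence works without Tesseract installed.
    from PIL import Image
    from pytesseract import Output, image_to_data

    data = image_to_data(Image.open(image_path), output_type=Output.DICT)
    text = " ".join(word for word in data["text"] if word.strip())
    return text, mean_confidence(data)
```

A page whose mean falls below a chosen threshold (94 would match the figure quoted here) could be flagged for a rescan or manual review.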

✅ Complete

Cleaning Engine

Python pipeline normalizes OCR output, corrects errors, and standardizes formatting.
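
A minimal sketch of what such a cleaning pass might look like, using only the Python standard library. The specific rules (Unicode folding, rejoining hyphenated line breaks, whitespace collapsing) are assumptions chosen for illustration; the actual TPN rules are not listed here.

```python
import re
import unicodedata


def clean_ocr_text(raw):
    """Normalize raw OCR output into tidy paragraphs."""
    text = unicodedata.normalize("NFKC", raw)  # folds ligatures such as "ﬁ" into "fi"
    text = text.replace("\r\n", "\n").replace("\r", "\n")  # unify line endings
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)  # rejoin words split across lines
    text = re.sub(r"[ \t]+", " ", text)  # collapse runs of spaces and tabs
    text = re.sub(r" ?\n ?", "\n", text)  # trim spaces around line breaks
    text = re.sub(r"\n{3,}", "\n\n", text)  # at most one blank line between paragraphs
    return text.strip()
```

The hyphen rule is deliberately naive: it would also merge a genuine compound that happens to break at a line end, so a production cleaner would likely check candidates against a dictionary first.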

✅ Complete

SHA-256 Provenance

Every document receives a SHA-256 fingerprint of its content, so any later alteration to the text can be detected.
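
The fingerprinting step can be sketched with Python's standard hashlib module. The helper names are hypothetical, and hashing the UTF-8 bytes of the cleaned text (rather than the original image file) is an assumption.

```python
import hashlib


def fingerprint(text):
    """Return the SHA-256 hex digest of the text's UTF-8 bytes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def verify(text, expected_digest):
    """True if the text still matches a previously recorded fingerprint."""
    return fingerprint(text) == expected_digest
```

Changing even one character of the text yields a different digest, so a stored fingerprint lets anyone confirm that a record has not been altered since it was hashed.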

✅ Complete

Semantic Embeddings

384-dimensional vectors enable natural language search across the entire archive.
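
To illustrate how vector search of this kind generally works, the sketch below ranks documents by cosine similarity to a query vector. It assumes the embeddings have already been computed (the embedding model is not named here) and uses pure Python for clarity; the function names and the in-memory dictionary are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, archive, top_k=3):
    """Return the top_k (doc_id, score) pairs, most similar first.

    `archive` maps each document id to its embedding vector.
    """
    scored = [
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, vector in archive.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

The same code applies unchanged to 384-dimensional vectors; at archive scale, a vector index would typically replace the linear scan.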

✅ Complete

Search API

FastAPI serving four endpoints — process, retrieve, search, and health check.
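
A rough sketch of how four such endpoints could be laid out. The route paths, handler names, in-memory store, and keyword search are all assumptions for illustration; only the use of FastAPI and the four endpoint roles come from the description above.

```python
import hashlib

# In-memory stand-in for the archive; a real deployment would use a database.
ARCHIVE = {}


def health():
    return {"status": "ok"}


def process(text: str):
    """Store text under its SHA-256 digest and return the new record."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    ARCHIVE[digest] = text
    return {"id": digest, "length": len(text)}


def retrieve(doc_id: str):
    text = ARCHIVE.get(doc_id)
    if text is None:
        # A production handler would raise fastapi.HTTPException(status_code=404).
        return {"error": "not found"}
    return {"id": doc_id, "text": text}


def search_documents(q: str):
    """Keyword placeholder; a semantic version would compare embeddings."""
    hits = [doc_id for doc_id, text in ARCHIVE.items() if q.lower() in text.lower()]
    return {"query": q, "results": hits}


def build_app():
    """Wire the handlers into FastAPI (imported lazily; needs fastapi installed)."""
    from fastapi import FastAPI

    app = FastAPI(title="TPN sketch")
    app.add_api_route("/health", health, methods=["GET"])
    # A plain str parameter is read from the query string, even on POST.
    app.add_api_route("/process", process, methods=["POST"])
    app.add_api_route("/documents/{doc_id}", retrieve, methods=["GET"])
    app.add_api_route("/search", search_documents, methods=["GET"])
    return app
```

Keeping the handlers as plain functions means they can be exercised directly in tests, with `build_app()` called only where the web server is actually needed.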

⬜ Week 3

Real World Validation

First real phone scan from a Detroit library. Pipeline tested against genuine historical documents.