Built in Public

The Journey

Every milestone documented. Every lesson learned in public. This is what building a company from scratch actually looks like.

Where We Are Right Now

8 Build Sessions
94% OCR Confidence
4 API Endpoints Live
384 Embedding Dimensions
1,851+ Lines of Code
1 Phase Complete

Every Step. Documented.

No highlight reel. No polished retrospectives. The real build, as it happened.

2026-03-06 — Session 1
Foundation Built
Project structure established. Master project prompt created. Pipeline overview documented. Philosophy and mission locked. Folder architecture defined for the entire project.
✅ Complete · Folder Structure · Documentation · Pipeline Design
2026-03-07 — Session 2
Pipeline Engines Built
Python virtual environment configured. Tesseract 5.3.4 installed. OCR engine built and validated at 94.04% confidence. Cleaning engine built; three bugs found and fixed along the way. Metadata engine built, with SHA-256 hashing working. Pipeline runner built: the full pipeline now runs in one command. First document processed: TPN-20260307-806040a5.
✅ Complete · OCR · Python · SHA-256 · Tesseract
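For readers curious what the metadata step might look like, here is a minimal sketch of SHA-256 hashing and ID generation using only the Python standard library. The function names are hypothetical, and the idea that the eight-character suffix in an ID like TPN-20260307-806040a5 comes from the start of the content hash is an assumption; this log does not say how the real suffix is produced.

```python
import hashlib
from datetime import date


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the raw document bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def make_document_id(data: bytes, processed_on: date, prefix: str = "TPN") -> str:
    """Build an ID shaped like TPN-YYYYMMDD-xxxxxxxx.

    Assumption: the suffix is the first 8 hex characters of the content hash.
    The real pipeline may derive it differently.
    """
    digest = sha256_hex(data)
    return f"{prefix}-{processed_on:%Y%m%d}-{digest[:8]}"


# Placeholder bytes stand in for a real scanned page.
doc_id = make_document_id(b"example scanned page bytes", date(2026, 3, 7))
print(doc_id)
```

Deriving the suffix from the content hash would make the ID reproducible: the same bytes processed on the same day always map to the same ID, which is one simple way to spot duplicate uploads.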
2026-03-07 — Session 3
API Skeleton Built
FastAPI skeleton built with three endpoints — process, retrieve, and health check. Import mismatch bug debugged and resolved by tracing function signatures across all pipeline modules. All endpoints tested and confirmed working. Week 1 complete.
✅ Complete · FastAPI · API · Debugging
2026-03-07 — Session 4
Embeddings and Search Live
Form() bug found and fixed: the library, scanner, and document type fields now save correctly. Embedding engine built on the sentence-transformers model all-MiniLM-L6-v2, which produces 384-dimensional vectors. Semantic search endpoint added to the API. All four endpoints confirmed working from the browser via Swagger UI.
✅ Complete · Embeddings · Semantic Search · Bug Fix
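The core idea behind semantic search over embeddings can be sketched in a few lines: embed the query, compare it with each stored vector by cosine similarity, and return the closest matches. The sketch below is an illustration, not the project's actual search code. It uses tiny three-dimensional toy vectors and hypothetical document IDs so it runs with the standard library alone; real vectors from all-MiniLM-L6-v2 would have 384 dimensions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(
    query_vec: Sequence[float],
    index: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (document ID, score) pairs, best match first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy index: hypothetical IDs mapped to made-up 3-dimensional vectors.
toy_index = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.0, 1.0, 0.0],
    "doc-c": [0.7, 0.7, 0.0],
}
results = search([0.9, 0.1, 0.0], toy_index, top_k=2)
print(results)
```

A linear scan like this is fine for a small archive; a larger collection would usually move to a vector index, which is outside the scope of this log.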
2026-03-07 — Session 5
GitHub and Professional README
Git initialized. Private GitHub repository created. 1,851 lines of code pushed in first commit. Professional README written with mission, pipeline, setup instructions, and roadmap. Swagger UI API docs confirmed working. All four endpoints tested from browser.
✅ Complete · GitHub · Git · Documentation
2026-03-07 — Session 6
Website Rebuilt
virtuallaborforce.com rebuilt from scratch with a professional design system. Five pages: Home, TPN, Journey, About, Contact. Clean, modern aesthetic with an inspirational tone. Connected to GitHub for automatic deployment.
✅ Complete · Website · HTML/CSS · Netlify
2026-03-08 — Session 7
First Real Phone Scans — Pipeline Optimized
vFlat X used to capture 6 real pages from a physical book. Initial confidence scores ranged from 37% to 69%, so every page was flagged for review. Three targeted fixes were identified and implemented: adaptive threshold parameters tuned, medianBlur replaced with fastNlMeansDenoising, and Tesseract's page segmentation mode (PSM) changed from 6 (single uniform block of text) to 3 (fully automatic page segmentation). Result: 4 of 6 pages ACCEPTED at 85%+ confidence. Peak confidence achieved: 91.95%. Scanning guide created for future volunteers.
✅ Complete · Real Scans · OCR Optimization · 90%+ Confidence · vFlat X
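The accept-or-review decision described in this session can be sketched as a simple gate on mean word confidence. This is a sketch under stated assumptions, not the pipeline's actual code: the function names, the "REVIEW" label, and the sample scores are hypothetical, and the 85.0 cutoff is inferred from the "ACCEPTED at 85%+" result above. Tesseract reports a confidence of -1 for layout rows that contain no recognized word, so those are skipped before averaging.

```python
from statistics import mean
from typing import Iterable

# Assumption: inferred from "ACCEPTED at 85%+ confidence" in the session notes.
ACCEPT_THRESHOLD = 85.0


def page_confidence(word_confidences: Iterable[float]) -> float:
    """Mean confidence over recognized words, ignoring Tesseract's -1 placeholder rows."""
    scores = [c for c in word_confidences if c >= 0]
    return mean(scores) if scores else 0.0


def review_status(confidence: float, threshold: float = ACCEPT_THRESHOLD) -> str:
    """Label a page ACCEPTED at or above the threshold, otherwise REVIEW."""
    return "ACCEPTED" if confidence >= threshold else "REVIEW"


# Made-up per-word confidences for two pages (not real scan data).
clean_page = [-1, 96, 91, 88, -1, 93]
noisy_page = [-1, 40, 62, 55]

for name, words in [("clean", clean_page), ("noisy", noisy_page)]:
    score = page_confidence(words)
    print(name, round(score, 2), review_status(score))
```

Keeping the threshold as a single named constant makes it easy to tighten or loosen the gate once more real scans have been measured.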
Coming — Week 4
Eastpointe Library — Historical Documents
Planned: first visit to the Eastpointe Library's historical section. Real historical newspapers, manuscripts, and local records will be scanned and processed through the TPN pipeline, and the community contribution tracker will launch.
⬜ Upcoming · Eastpointe Library · Historical Documents · Community Layer

Watch This Project Being Built

Every session adds a new milestone to this page. If you want to collaborate, contribute, or invest, this log is your due diligence.