Dating Methodology

How the system determines and classifies manuscript and patristic dating

Dating System Overview

The system stores dating at two precision levels: **centuries** (century_min / century_max) and **years** (year_min / year_max / year_best).

Centuries are used for cumulative coverage calculations — they determine which manuscripts contribute to each century's coverage statistics. Years enable precise temporal analysis and timeline visualization.

Not all records have year-level dating. When it is unavailable, the system falls back to century-level data, which is always present for cataloged manuscripts.
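
The fallback can be illustrated with a minimal Python sketch. The `DatingRecord` class and both helper functions are hypothetical names, and the source does not say how a century is converted to years when the fallback applies; the sketch assumes the full span of the century range (2nd century = 101 to 200).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DatingRecord:
    # Century-level dating: always present for cataloged manuscripts.
    century_min: int
    century_max: int
    # Year-level dating: optional.
    year_min: Optional[int] = None
    year_max: Optional[int] = None


def century_to_years(century: int) -> Tuple[int, int]:
    """Return the first and last year of an AD century (assumed convention)."""
    return (century - 1) * 100 + 1, century * 100


def effective_year_range(record: DatingRecord) -> Tuple[int, int]:
    """Prefer year-level dating; otherwise widen to the full century span."""
    if record.year_min is not None and record.year_max is not None:
        return record.year_min, record.year_max
    start, _ = century_to_years(record.century_min)
    _, end = century_to_years(record.century_max)
    return start, end
```

A record with only century data therefore still yields a usable, if wide, range for timeline purposes.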

Manuscript Dating Sources

The primary source for manuscript dating is the **NTVMR** (New Testament Virtual Manuscript Room), maintained by the Institut für Neutestamentliche Textforschung at the University of Münster.

The NTVMR API provides origEarly and origLate fields for each manuscript. These are the years assigned by academic cataloging, based on paleographic analysis and scholarly consensus.

These values are preserved as yearMin / yearMax with datingConfidence = HIGH. NTVMR is the world's primary academic database for Greek New Testament manuscript cataloging.

NTVMR data is never overwritten by AI-generated estimates.
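
A minimal Python sketch of this mapping and of the overwrite guard follows. The function names, the dictionary shapes, and the `"ntvmr"` value for datingSource are assumptions made for illustration; only origEarly, origLate, yearMin, yearMax, and datingConfidence = HIGH come from the description above.

```python
def apply_ntvmr_dating(record: dict, ntvmr_entry: dict) -> dict:
    """Copy origEarly / origLate into yearMin / yearMax with HIGH confidence."""
    updated = dict(record)  # leave the caller's record untouched
    updated["yearMin"] = ntvmr_entry["origEarly"]
    updated["yearMax"] = ntvmr_entry["origLate"]
    updated["datingSource"] = "ntvmr"  # assumed label, not confirmed by the source
    updated["datingConfidence"] = "HIGH"
    return updated


def can_overwrite_with_ai(record: dict) -> bool:
    """AI-generated estimates must never replace NTVMR dating."""
    return record.get("datingSource") != "ntvmr"
```

Checking `can_overwrite_with_ai` before any AI write is one simple way to enforce the rule; the actual system may enforce it differently.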

Church Father Dating Sources

Unlike manuscripts, there is no single public API with standardized dating for all Church Fathers. Academic databases exist but lack uniform machine-readable date fields.

To fill this gap, the system uses the **OpenAI API** (gpt-4o-mini model) to obtain dating estimates. A structured prompt requests the birth year, death year, floruit period, and an academic reference for the dating.

All AI-generated dating receives datingConfidence = LOW until manually reviewed and validated by an administrator.

This process is **triggered manually by administrators** — it is never automatic or scheduled.

Confidence Levels

Every year-level dating record carries a confidence indicator that reflects the reliability of its source:

| Level | Source | Description |
|---|---|---|
| HIGH | NTVMR / primary academic source | Paleographic analysis and scholarly consensus from established institutions |
| MEDIUM | Curated seed data | Hand-curated data from reliable secondary references, reviewed before insertion |
| LOW | OpenAI (AI-generated) | LLM-generated estimates awaiting manual validation by administrators |

Confidence levels are **never auto-promoted**. LOW data cannot become MEDIUM or HIGH through any automated process — only manual administrator review can elevate a record's confidence.
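
The no-auto-promotion rule can be sketched as a guard on confidence changes. The enum, the `set_confidence` function, and the `reviewed_by_admin` flag are hypothetical; the source only states that raising a level requires manual administrator review.

```python
from enum import Enum


class DatingConfidence(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def set_confidence(record: dict, new_level: DatingConfidence,
                   reviewed_by_admin: bool) -> dict:
    """Return a copy of the record with a new confidence level.

    Raising the level is refused unless an administrator reviewed the record.
    """
    current = DatingConfidence[record["datingConfidence"]]
    if new_level.value > current.value and not reviewed_by_admin:
        raise PermissionError(
            "confidence can only be raised after manual administrator review"
        )
    return {**record, "datingConfidence": new_level.name}
```

Any automated job would call this with `reviewed_by_admin=False`, so it could never raise a record above its current level.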

The yearBest Field

The yearBest field represents the scholarly consensus year for a given record. For example, Irenaeus has yearBest = 180 (floruit circa 180 AD).

This field is only filled when the source **explicitly provides a specific accepted year**. It is **never calculated as a mathematical average** of yearMin and yearMax.

When no scholarly consensus exists for a specific year, the field remains empty (null). This avoids presenting misleading precision for records where only a date range is known.
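
As a small Python sketch of this rule, the hypothetical helper below takes yearBest only from what the source supplied and never derives it from the range. The function name and parameter names are assumptions.

```python
from typing import Optional


def build_year_fields(year_min: int, year_max: int,
                      source_year_best: Optional[int] = None) -> dict:
    """Assemble year-level fields without inventing a best year."""
    return {
        "yearMin": year_min,
        "yearMax": year_max,
        # Deliberately NOT (year_min + year_max) // 2: a midpoint would
        # suggest precision that the source never claimed.
        "yearBest": source_year_best,
    }
```

When the source gives no specific year, the result keeps `yearBest` as `None`, which maps to null in storage.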

Enrichment Process

The dating enrichment process is a manual, administrator-triggered operation that fills in missing year-level dating for Church Fathers:

  1. Administrator triggers POST /admin/enrich-dating
  2. System identifies records without year_min
  3. For each record, sends a structured prompt to OpenAI requesting birth year, death year, floruit, and academic reference
  4. Validates the response (yearMin > 0, yearMax ≥ yearMin)
  5. Saves with datingSource = openai and datingConfidence = LOW
  6. Every result is logged for audit, including both successful enrichments and failures
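
Steps 2 through 6 can be sketched in Python as follows. This is an illustration under stated assumptions, not the system's implementation: `enrich_dating`, `is_valid_estimate`, and the `fetch_estimate` callback are hypothetical, the callback stands in for the OpenAI request and is assumed to return a dictionary that already contains yearMin and yearMax, and records are plain dictionaries keyed by the camelCase field names.

```python
import logging
from typing import Callable, List

logger = logging.getLogger("dating_enrichment")


def is_valid_estimate(estimate: dict) -> bool:
    """Step 4: yearMin > 0 and yearMax >= yearMin."""
    year_min = estimate.get("yearMin")
    year_max = estimate.get("yearMax")
    if not isinstance(year_min, int) or not isinstance(year_max, int):
        return False
    return year_min > 0 and year_max >= year_min


def enrich_dating(records: List[dict],
                  fetch_estimate: Callable[[dict], dict]) -> List[dict]:
    """Fill missing year-level dating and return the records that were enriched."""
    enriched = []
    for record in records:
        if record.get("yearMin") is not None:
            continue  # step 2: only records without year-level dating
        try:
            estimate = fetch_estimate(record)  # step 3: stand-in for the OpenAI call
        except Exception as exc:
            logger.error("enrichment failed for %s: %s", record.get("name"), exc)
            continue
        if not is_valid_estimate(estimate):
            logger.error("invalid estimate for %s: %r", record.get("name"), estimate)
            continue
        # Step 5: AI-generated dating is always stored as LOW confidence.
        record.update(
            yearMin=estimate["yearMin"],
            yearMax=estimate["yearMax"],
            datingSource="openai",
            datingConfidence="LOW",
        )
        logger.info("enriched %s", record.get("name"))  # step 6: audit log
        enriched.append(record)
    return enriched
```

Passing the API call in as a callback keeps the validation and logging logic testable without network access.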