Unsterwerx

canonical

Normalizes every document whose status is "ingested" into the Universal Data Set. Each document is parsed, converted to a canonical representation, written to the content-addressed store (CAS), and added to the FTS5 full-text search index.
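
To make the flow concrete, here is a minimal Python sketch of the same idea: select documents by status, canonicalize, store under a content hash, and index with SQLite FTS5. It is not the Unsterwerx implementation. The `canonicalize` and `run_canonical` functions, the `docs_fts` table, the in-memory dict used as the CAS, the SHA-256 hash, and the "canonical" and "failed" status names are all assumptions made for illustration. It also assumes the local SQLite build has FTS5 enabled.

```python
import hashlib
import sqlite3


def canonicalize(raw: str) -> str:
    """Stand-in canonical form: collapse whitespace (hypothetical, not the real parser)."""
    text = " ".join(raw.split())
    if not text:
        raise ValueError("nothing to extract")
    return text


def run_canonical(docs, cas, index):
    """Process every document whose status is 'ingested' and return summary counters."""
    summary = {"processed": 0, "extracted": 0, "failed": 0, "words": 0}
    for doc in docs:
        if doc["status"] != "ingested":
            continue  # already handled on an earlier run
        summary["processed"] += 1
        try:
            text = canonicalize(doc["raw"])
        except ValueError:
            doc["status"] = "failed"  # assumed status name
            summary["failed"] += 1
            continue
        # Content addressing: the key is the SHA-256 of the canonical bytes.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        cas[digest] = text
        index.execute(
            "INSERT INTO docs_fts (digest, body) VALUES (?, ?)", (digest, text)
        )
        doc["status"] = "canonical"  # assumed status name
        summary["extracted"] += 1
        summary["words"] += len(text.split())
    return summary


# In-memory stand-ins for the FTS5 index and the CAS.
index = sqlite3.connect(":memory:")
index.execute("CREATE VIRTUAL TABLE docs_fts USING fts5(digest UNINDEXED, body)")
cas = {}
docs = [
    {"raw": "Quarterly  report\n2024", "status": "ingested"},
    {"raw": "   ", "status": "ingested"},  # empty after parsing, so it fails
]

first = run_canonical(docs, cas, index)
second = run_canonical(docs, cas, index)  # nothing is left in 'ingested' status
print(first)
print(second)
```

Because each run only picks up documents still in "ingested" status, a second run in this sketch has nothing to do, which mirrors the idempotent behavior shown in the examples below.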

Usage

unsterwerx canonical

Examples

Extract pending documents

unsterwerx canonical
Canonical Summary
══════════════════════════════════
  Processed:        12
  Extracted:        11
  Failed:            1
  Elements:        847
  Words:         31204
══════════════════════════════════

No pending documents (idempotent)

unsterwerx canonical
Canonical Summary
══════════════════════════════════
  Processed:         0
  Extracted:         0
  Failed:            0
  Elements:          0
  Words:             0
══════════════════════════════════
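
The summary block is regular enough to read from a script, for example to fail a batch job when any document could not be extracted. The Python sketch below parses the `Label: number` lines shown above into a dictionary. The `parse_summary` helper is hypothetical, and it assumes the output keeps the layout shown in these examples, which this page does not guarantee as a stable interface.

```python
import re


def parse_summary(output: str) -> dict:
    """Collect 'Label:   <int>' lines from the summary block into a dict."""
    counts = {}
    for line in output.splitlines():
        match = re.fullmatch(r"\s*(\w+):\s+(\d+)\s*", line)
        if match:
            counts[match.group(1).lower()] = int(match.group(2))
    return counts


# Sample text copied from the first example on this page.
sample = """Canonical Summary
══════════════════════════════════
  Processed:        12
  Extracted:        11
  Failed:            1
  Elements:        847
  Words:         31204
══════════════════════════════════
"""

counts = parse_summary(sample)
print(counts)
if counts["failed"] > 0:
    print(f"{counts['failed']} document(s) failed extraction")
```

The title and rule lines contain no `Label: number` pair, so the pattern skips them and only the five counters are returned.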

Notes