Methodology — transparency and scope

How the archive works

CagGPT is a public-interest document search and archive project designed to make released source material easier to explore, trace, and review. It exists to help people work from evidence, follow document trails more clearly, and understand what is available in the archive as it continues to grow.

Source documents

The archive is built around released source documents published as grouped PDF collections. These files form the public document shelf behind the site and provide the underlying material for review and reference.

Search and traceability

The aim of the archive is not just to surface text, but to help people trace material back to its source. As search is expanded, results are intended to remain connected to the original documents and page references wherever possible.
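
As an illustration of what keeping results "connected to the original documents and page references" could look like, here is a minimal sketch in Python. It is not the archive's actual implementation: the `Page` and `SearchHit` types, the `search` function, and the example file names are all hypothetical, and a plain substring match stands in for whatever search method the site uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """One page of extracted text, tied to its source PDF (hypothetical shape)."""
    source_file: str   # the PDF collection the page came from
    page_number: int   # 1-based page number within that PDF
    text: str


@dataclass(frozen=True)
class SearchHit:
    """A search result that always carries its source reference."""
    source_file: str
    page_number: int
    snippet: str

    def citation(self) -> str:
        return f"{self.source_file}, p. {self.page_number}"


def search(pages: list[Page], query: str, context: int = 40) -> list[SearchHit]:
    """Case-insensitive substring search; every hit keeps its provenance."""
    hits: list[SearchHit] = []
    needle = query.lower()
    if not needle:
        return hits
    for page in pages:
        # Offsets assume lowercasing preserves length (true for ASCII text).
        pos = page.text.lower().find(needle)
        if pos == -1:
            continue
        start = max(0, pos - context)
        end = min(len(page.text), pos + len(query) + context)
        hits.append(SearchHit(page.source_file, page.page_number, page.text[start:end]))
    return hits
```

The design point is that a hit cannot be created without a file name and page number, so a reader can always move from the snippet back to the page it came from.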

Coverage and limitations

This archive is still being expanded. The source documents have been scanned and converted into searchable text to make the material easier to explore. In some places, content has been held back because the scan quality is poor or the extracted text needs further checking. This is intentional: unclear material should not be presented as settled or complete.
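
The hold-back rule described above might be sketched as follows. This is an assumption-laden illustration, not the project's real pipeline: the `ExtractedPage` type, the `confidence` score, the `split_by_quality` function, and the 0.85 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedPage:
    source_file: str
    page_number: int
    text: str
    confidence: float  # hypothetical 0.0-1.0 quality score from text extraction


def split_by_quality(pages, min_confidence=0.85):
    """Return (published, held_back); held-back pages await further checking."""
    published, held_back = [], []
    for page in pages:
        # Publish only pages that have text and meet the assumed threshold.
        if page.confidence >= min_confidence and page.text.strip():
            published.append(page)
        else:
            held_back.append(page)
    return published, held_back
```

Held-back pages are kept rather than discarded, so they can be added later once a clearer copy is available or the text has been checked.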

What this means for visitors

CagGPT is intended as a research and navigation tool. It is designed to help people find material more easily, understand where it sits within the broader archive, and move back to the underlying documents for closer review.

Not every source is complete, and not every document set has been processed to the same depth yet. As clearer copies become available and additional material is reviewed, further content will be added.

Visitors should treat this site as a guide to the record rather than a substitute for reading the original source documents. Where a matter is important, the source material should be reviewed directly.