Language detection
The current version applies separate phrase sets and thresholds for English and Bulgarian. Unsupported or mixed-language samples are flagged with lower confidence.
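A minimal sketch of how per-language rule selection could work. Everything here is an assumption for illustration: the names `RULES`, `detect_language`, and `select_rules`, the example phrases, the threshold values, and the script-counting heuristic are hypothetical and not taken from the tool. Counting Cyrillic versus Latin letters is a crude stand-in for real language detection; it catches mixed-script samples but cannot tell Bulgarian from other Cyrillic languages, or English from other Latin-script ones.

```python
import re

# Hypothetical per-language rules; phrases and thresholds are illustrative only.
RULES = {
    "en": {"phrases": ("in conclusion", "it is important to note"), "threshold": 0.5},
    "bg": {"phrases": ("в заключение", "важно е да се отбележи"), "threshold": 0.4},
}


def detect_language(text):
    """Guess 'en' or 'bg' from the share of Latin vs. Cyrillic letters.

    Returns (language, share), where share is the fraction of letters
    belonging to the dominant script. Returns ("unknown", 0.0) when the
    text contains no Latin or Cyrillic letters at all.
    """
    cyrillic = len(re.findall(r"[\u0400-\u04FF]", text))
    latin = len(re.findall(r"[A-Za-z]", text))
    total = cyrillic + latin
    if total == 0:
        return "unknown", 0.0
    language = "bg" if cyrillic > latin else "en"
    return language, max(cyrillic, latin) / total


def select_rules(text, min_share=0.9):
    """Pick a rule set and a confidence label ('normal' or 'low')."""
    language, share = detect_language(text)
    if language == "unknown":
        return None, "low"  # nothing to match against
    confidence = "normal" if share >= min_share else "low"  # mixed scripts
    return RULES[language], confidence
```

With this sketch, a sample that mixes both scripts still gets a rule set, but the caller is told to treat the result as low confidence.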
Estimate AI-writing risk with English and Bulgarian rules, optional author baseline comparison, and OCR fallback for scan-heavy PDFs.
When a PDF does not expose enough embedded text, the extractor can fall back to OCR so scan-heavy documents remain analyzable.
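The fallback decision can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's actual extractor: `extract_text`, the two injected callables `extract_embedded` and `run_ocr`, and the `min_chars_per_page` cutoff of 50 are all hypothetical. The callables stand in for whatever PDF text extractor and OCR engine are in use.

```python
def extract_text(pdf_path, extract_embedded, run_ocr, min_chars_per_page=50):
    """Return (text, source), where source is 'embedded' or 'ocr'.

    extract_embedded(pdf_path) is assumed to return a list of per-page strings.
    run_ocr(pdf_path) is assumed to return the recognized text as one string.
    """
    pages = extract_embedded(pdf_path)
    # Count only non-whitespace-padded text so blank pages do not inflate the total.
    embedded_chars = sum(len(page.strip()) for page in pages)
    if pages and embedded_chars / len(pages) >= min_chars_per_page:
        return "\n".join(pages), "embedded"
    # Too little embedded text: treat the file as scan-heavy and use OCR.
    return run_ocr(pdf_path), "ocr"
```

Returning the source alongside the text lets later stages record that a result came from OCR, which tends to be noisier than embedded text.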
You can attach a previous text from the same author to measure style drift without turning the tool into a hard authorship classifier.
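One way to picture a drift measurement is the sketch below. The feature set (average sentence length, average word length, type-token ratio) and the names `style_features` and `style_drift` are assumptions chosen for illustration; the tool's real features are not described here. The function returns a distance between 0 and 1 rather than a same-author or different-author verdict, which matches the intent of measuring drift without classifying authorship.

```python
import re


def style_features(text):
    """Compute a few simple stylometric features, or None if the text is empty."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text.lower())
    if not sentences or not words:
        return None
    return {
        "avg_sentence_len": len(words) / len(sentences),
        "avg_word_len": sum(len(w) for w in words) / len(words),
        "type_token_ratio": len(set(words)) / len(words),
    }


def style_drift(baseline_text, sample_text):
    """Mean relative difference across features: 0.0 is identical, near 1.0 is far apart."""
    baseline = style_features(baseline_text)
    sample = style_features(sample_text)
    if baseline is None or sample is None:
        return None  # not enough text to compare
    diffs = [
        abs(baseline[key] - sample[key]) / max(baseline[key], sample[key])
        for key in baseline
    ]
    return sum(diffs) / len(diffs)
```

A high drift value only says the two texts differ on these surface features; short samples or a change of genre can produce the same effect.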