
AI Content Detector

$19

Python AI content detector using perplexity analysis, burstiness scoring, and statistical text analysis.

📁 11 files
Markdown, Python

📄 Product Preview


📁 File Structure 11 files

ai-content-detector/
├── LICENSE
├── README.md
├── examples/
│   ├── basic_usage.py
│   └── sample_texts/
│       ├── ai_generated.txt
│       └── human_written.txt
├── free-sample.zip
├── guide/
│   ├── 01_features.md
│   ├── 02_cli-reference.md
│   └── 03_important-disclaimer.md
├── index.html
└── src/
    └── ai_content_detector.py

📖 Documentation Preview README excerpt

AI Content Detector

Python AI content detector: perplexity analysis, burstiness scoring, vocabulary richness, statistical text analysis, and confidence-scored reports. All math from scratch. Zero dependencies.

Part of the AI Toolkit collection by [CodeVault](https://ai-toolkit.codevault.dev).

Features

  • Perplexity estimation: a bigram language model measures how predictable the text is (lower perplexity is treated as more AI-like)
  • Burstiness scoring: measures variation in sentence length (AI text tends to be more uniform)
  • Vocabulary richness: type-token ratio (unique words divided by total words) as a measure of word diversity
  • Transition word density: share of transition words such as "Furthermore", "Additionally", and "Moreover", which AI text tends to overuse
  • Repetition detection: flags repeated n-grams, a pattern common in AI output
  • Readability scoring: Flesch reading ease and syllable analysis
  • Confidence reports: weighted ensemble verdict with a per-signal breakdown
  • Batch analysis: analyze entire directories of text files
  • JSON export: machine-readable reports for integration into workflows
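
Several of the signals listed above reduce to a few lines of standard-library Python. The sketch below is an illustration, not the product's implementation: the function names, the tokenizer, the sentence splitter, the three-word transition list, and the choice of add-one smoothing against a separate reference text are all assumptions.

```python
import math
import re
import statistics
from collections import Counter

# Hypothetical list: only these three words are named in the feature list.
TRANSITION_WORDS = {"furthermore", "additionally", "moreover"}


def tokenize(text: str) -> list[str]:
    """Lowercase word tokens made of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break on whitespace that follows '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def burstiness(text: str) -> float:
    """Coefficient of variation (stdev / mean) of sentence lengths in words."""
    lengths = [len(tokenize(s)) for s in split_sentences(text)]
    lengths = [n for n in lengths if n > 0]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths) / statistics.mean(lengths)


def type_token_ratio(text: str) -> float:
    """Unique tokens divided by total tokens."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def transition_density(text: str) -> float:
    """Fraction of tokens that appear in TRANSITION_WORDS."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(t in TRANSITION_WORDS for t in tokens) / len(tokens)


def bigram_perplexity(text: str, reference: str) -> float:
    """Perplexity of `text` under an add-one-smoothed bigram model fit on `reference`."""
    ref = tokenize(reference)
    tokens = tokenize(text)
    if len(tokens) < 2:
        return float("inf")
    vocab_size = len(set(ref) | set(tokens))
    pair_counts = Counter(zip(ref, ref[1:]))   # count of each (w1, w2) pair
    first_counts = Counter(ref[:-1])           # count of w1 as the first word of a pair
    pairs = list(zip(tokens, tokens[1:]))
    log_prob = 0.0
    for w1, w2 in pairs:
        p = (pair_counts[(w1, w2)] + 1) / (first_counts[w1] + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(pairs))


if __name__ == "__main__":
    sample = "Hi. This is a much longer sentence than the first one."
    print("burstiness:", burstiness(sample))
    print("type-token ratio:", type_token_ratio(sample))
```

Each function returns a raw metric only. How the product turns these into a verdict is not shown in the preview.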

Quick Start


# Run demo with AI and human text samples
python src/ai_content_detector.py --demo

# Analyze inline text
python src/ai_content_detector.py --text "Your text to analyze goes here..."

# Analyze a file
python src/ai_content_detector.py --file document.txt

# Analyze with JSON export
python src/ai_content_detector.py --file document.txt --export report.json

# Batch analyze a folder
python src/ai_content_detector.py --batch essays/ --export results.json

# Quick verdict only
python src/ai_content_detector.py --file document.txt --quiet
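
The commands above imply a command-line interface with six flags. A minimal `argparse` sketch of such an interface follows. The flag names come from the Quick Start; the `build_parser` name, the help strings, and the mutually exclusive grouping of input sources are assumptions, not the product's actual code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts the flags shown in the Quick Start."""
    parser = argparse.ArgumentParser(
        prog="ai_content_detector.py",
        description="Statistical AI-text detection (illustrative parser only).",
    )
    # Assumption: exactly one input source is given per run.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--demo", action="store_true",
                        help="run demo with AI and human samples")
    source.add_argument("--text", metavar="TEXT", help="analyze inline text")
    source.add_argument("--file", metavar="PATH", help="analyze a single file")
    source.add_argument("--batch", metavar="DIR", help="analyze a folder of text files")
    parser.add_argument("--export", metavar="PATH", help="write a JSON report")
    parser.add_argument("--quiet", action="store_true", help="print the verdict only")
    return parser


if __name__ == "__main__":
    # Parse a fixed argument list so the example runs without real input files.
    args = build_parser().parse_args(["--file", "document.txt", "--export", "report.json"])
    print(args)
```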

Project Structure


ai-content-detector/
├── README.md
├── LICENSE
├── src/
│   └── ai_content_detector.py    # Core engine (~470 lines)
└── examples/
    ├── basic_usage.py             # Programmatic usage example
    └── sample_texts/              # AI and human text samples
        ├── ai_generated.txt
        └── human_written.txt

CLI Reference

| Flag | Description |
| --- | --- |
| `--demo` | Run demo with AI and human samples |

... continues with setup instructions, usage examples, and more.

📄 Code Sample .py preview

src/ai_content_detector.py

#!/usr/bin/env python3
"""
AI Content Detector — AI Toolkit (DataNest)

Detect AI-generated text using statistical analysis: perplexity estimation,
burstiness scoring, vocabulary richness metrics, sentence pattern analysis,
and confidence-scored reports.

All math from scratch — zero external dependencies. Python 3.10+ stdlib only.

Usage:
    python ai_content_detector.py --text "The text to analyze..."
    python ai_content_detector.py --file document.txt
    python ai_content_detector.py --file document.txt --export report.json
    python ai_content_detector.py --demo
    python ai_content_detector.py --batch folder/ --export results.json
"""

from __future__ import annotations

import argparse
import json
import logging
import math
import re
import statistics
import string
import sys
from collections import Counter
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any

# ---------------------------------------------------------------------------
# Logging
# ---------------------------------------------------------------------------
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
)
logger = logging.getLogger("ai_content_detector")

# ---------------------------------------------------------------------------
# Constants — tuned from empirical observation of AI vs. human text
# ---------------------------------------------------------------------------

# Perplexity thresholds (lower = more predictable = more likely AI)
PERPLEXITY_AI_THRESHOLD: float = 35.0     # below this → likely AI
PERPLEXITY_HUMAN_THRESHOLD: float = 70.0  # above this → likely human

# Burstiness thresholds (AI text has LOW burstiness — very uniform sentence lengths)
BURSTINESS_AI_THRESHOLD: float = 0.3      # below this → likely AI
BURSTINESS_HUMAN_THRESHOLD: float = 0.6   # above this → likely human

# ... 614 more lines ...
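
The two threshold pairs in the code sample mark where a raw metric is read as likely AI or likely human. One way such thresholds could feed a weighted verdict is linear interpolation between them. This is a hedged sketch only: the `ai_likelihood` and `weighted_verdict` helpers, the interpolation itself, and the 0.6/0.4 weights are assumptions, since the preview does not show how the product combines its signals.

```python
# Threshold values copied from the code sample.
PERPLEXITY_AI_THRESHOLD: float = 35.0
PERPLEXITY_HUMAN_THRESHOLD: float = 70.0
BURSTINESS_AI_THRESHOLD: float = 0.3
BURSTINESS_HUMAN_THRESHOLD: float = 0.6


def ai_likelihood(value: float, ai_threshold: float, human_threshold: float) -> float:
    """Map a raw metric to [0, 1]: 1.0 at or below the AI threshold,
    0.0 at or above the human threshold, linear in between."""
    if value <= ai_threshold:
        return 1.0
    if value >= human_threshold:
        return 0.0
    return (human_threshold - value) / (human_threshold - ai_threshold)


def weighted_verdict(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-signal scores, normalized by the weights used."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


if __name__ == "__main__":
    # Hypothetical raw metrics for one document.
    scores = {
        "perplexity": ai_likelihood(52.5, PERPLEXITY_AI_THRESHOLD, PERPLEXITY_HUMAN_THRESHOLD),
        "burstiness": ai_likelihood(0.45, BURSTINESS_AI_THRESHOLD, BURSTINESS_HUMAN_THRESHOLD),
    }
    weights = {"perplexity": 0.6, "burstiness": 0.4}  # assumed weights
    print(scores, weighted_verdict(scores, weights))
```

Normalizing by the weights actually used means a missing signal does not drag the verdict toward zero.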