Thin Content Detector

v1.0 documentation

Identifies thin, low-quality, or duplicate content across pages. Flags pages that fall below a word-count threshold, have a low text-to-HTML ratio, or contain excessive boilerplate.
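
The two numeric checks named above can be sketched roughly as follows. This is an illustration only, not the tool's actual code: the function name `thin_content_metrics`, the `min_ratio` default of 0.10, and the use of the standard library's `html.parser` (instead of beautifulsoup4) are assumptions made to keep the example dependency-free. Boilerplate and duplicate detection are not covered.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def thin_content_metrics(html, min_words=300, min_ratio=0.10):
    """Hypothetical helper: word count and text-to-HTML ratio for one page.

    min_ratio is an assumed threshold, not a value taken from the tool.
    """
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so layout indentation does not inflate counts.
    text = " ".join(" ".join(parser.chunks).split())
    word_count = len(text.split())
    ratio = len(text) / len(html) if html else 0.0
    return {
        "word_count": word_count,
        "text_to_html_ratio": round(ratio, 3),
        "below_min_words": word_count < min_words,
        "low_ratio": ratio < min_ratio,
    }
```

Calling `thin_content_metrics(page_html, min_words=300)` returns a small dict per page; a page would be a candidate for review when `below_min_words` or `low_ratio` is true.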

Features: URL input, file input, XLSX export
Script: thin_content_detector.py (107 lines, 4 parameters, Python 3.8+)

Quick start

1. Install

    pip install -r requirements.txt

2. Run

    python thin_content_detector.py --urls https://site.com/p1 https://site.com/p2 --min-words 300
    python thin_content_detector.py --files *.html --output thin_report.xlsx

3. Export

Add --output report.xlsx to save results as a spreadsheet.

Parameters

Flag          Description
--urls        URLs to check. Multiple values allowed.
--files       HTML files to check. Multiple values allowed.
--min-words   Minimum word count (integer).
--output      Save results as XLSX.

Help:

    python thin_content_detector.py --help
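
For readers who want to see how four flags like these are typically declared, here is a minimal `argparse` sketch. It is a guess at the shape of the interface, not the script's real source: the function name `build_parser`, the help strings, and the `--min-words` default of 300 (borrowed from the quick start example) are all assumptions.

```python
import argparse


def build_parser():
    """Hypothetical reconstruction of the four documented flags."""
    parser = argparse.ArgumentParser(
        prog="thin_content_detector.py",
        description="Flag thin or low-quality pages.",
    )
    # nargs="+" lets each flag accept several space-separated values.
    parser.add_argument("--urls", nargs="+", default=[], help="URLs to check")
    parser.add_argument("--files", nargs="+", default=[], help="HTML files to check")
    # The default of 300 is an assumption; the real default is not documented.
    parser.add_argument("--min-words", type=int, default=300,
                        help="Minimum word count (integer)")
    parser.add_argument("--output", help="Path of the XLSX report to write")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Note that with `--files *.html` the shell usually expands the glob before Python sees it, so the script receives a plain list of file names.
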
Use cases

Content audit: Analyze existing content to find what needs updating, merging, or removing. Export results and create a content maintenance plan.

Pre-publish check: Run before publishing new content to ensure it meets quality thresholds. Fix issues before they go live.

Competitive analysis: Compare your content against top-ranking competitors. Identify gaps and opportunities to improve.

Dependencies

Requires: beautifulsoup4, pandas, requests. All included in requirements.txt.


AAIO Inc — aaioinc.com/tools/thin_content_detector/