TF-IDF Keyword Extractor
Extract the top TF-IDF terms from one or more documents. Useful for discovering which terms define a page's content.
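To make the idea concrete, here is a minimal sketch of TF-IDF term ranking using only the Python standard library. It is not the tool's own code: the tool depends on scikit-learn, whose weighting and normalization differ in detail. The function names `tokenize` and `top_tfidf_terms`, the regex tokenizer, and the smoothed IDF formula are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text, ngram_min=1, ngram_max=1):
    """Lowercase the text, keep alphanumeric words, and emit word n-grams."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    terms = []
    for n in range(ngram_min, ngram_max + 1):
        for i in range(len(words) - n + 1):
            terms.append(" ".join(words[i:i + n]))
    return terms


def top_tfidf_terms(documents, top=30, ngram_min=1, ngram_max=1):
    """Return, for each document, its `top` (term, score) pairs by TF-IDF."""
    tokenized = [tokenize(doc, ngram_min, ngram_max) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: how many documents contain each term.
    doc_freq = Counter()
    for terms in tokenized:
        doc_freq.update(set(terms))

    results = []
    for terms in tokenized:
        counts = Counter(terms)
        total = len(terms) or 1  # avoid dividing by zero on empty documents
        scores = {}
        for term, count in counts.items():
            tf = count / total
            # Smoothed IDF (an assumption): terms present in every document
            # still receive a positive weight.
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1
            scores[term] = tf * idf
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        results.append(ranked[:top])
    return results


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "the dog chased the cat",
        "the bird sang a song",
    ]
    for index, ranked in enumerate(top_tfidf_terms(docs, top=3)):
        print(index, [(term, round(score, 3)) for term, score in ranked])
```

Terms that are frequent in one document but rare across the others rise to the top, which is why the output tends to surface what is distinctive about each page.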
Install
pip install -r requirements.txt
Run
python tfidf_extractor.py --url https://example.com/page --top 30
python tfidf_extractor.py --files page1.html page2.html --output tfidf_results.xlsx
python tfidf_extractor.py --file article.txt --ngram-range 1 3
Export
Add --output report.xlsx to save results as a spreadsheet.
| Flag | Description |
|---|---|
| --url | Single URL to analyze |
| --urls | Multiple URLs. Multiple values allowed |
| --file | Single file |
| --files | Multiple files. Multiple values allowed |
| --top | Top N terms per document. Default: 30 (integer) |
| --ngram-min | Ngram min (integer) |
| --ngram-max | Ngram max (integer) |
| --output | Save results as XLSX |
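
For readers who want to see how a flag set like the one in the table could be wired up, the following is a hedged sketch using `argparse`. It is not taken from `tfidf_extractor.py`. The `build_parser` name is hypothetical, and the default of 1 for both n-gram bounds is an assumption, since the table documents no defaults for them.

```python
import argparse


def build_parser():
    """Build a parser mirroring the flags listed in the table (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="tfidf_extractor.py",
        description="Extract top TF-IDF terms from one or more documents.",
    )
    parser.add_argument("--url", help="Single URL to analyze")
    # nargs="+" accepts one or more values after the flag.
    parser.add_argument("--urls", nargs="+", help="Multiple URLs")
    parser.add_argument("--file", help="Single file")
    parser.add_argument("--files", nargs="+", help="Multiple files")
    parser.add_argument("--top", type=int, default=30,
                        help="Top N terms per document")
    # Assumed defaults: the table does not state any for the n-gram bounds.
    parser.add_argument("--ngram-min", type=int, default=1,
                        help="Minimum n-gram length")
    parser.add_argument("--ngram-max", type=int, default=1,
                        help="Maximum n-gram length")
    parser.add_argument("--output", help="Save results as XLSX")
    return parser


if __name__ == "__main__":
    # Parse a fixed argument list so the sketch runs without a command line.
    args = build_parser().parse_args(
        ["--files", "page1.html", "page2.html", "--top", "10"]
    )
    print(args.files, args.top, args.ngram_min, args.ngram_max)
```

Declaring `--top` with `type=int` and `default=30` matches the table's "Default: 30 (integer)", and `nargs="+"` is one way to get the "Multiple values allowed" behavior.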
python tfidf_extractor.py --help
Run across all your blog posts to score quality. Sort by score in the XLSX export, then prioritize rewrites for the lowest-scoring pages.
Before publishing freelance content, run this tool to check quality signals. Use specific metrics as concrete feedback for writers.
Include the analysis in your SEO audit report. Clients appreciate data-backed recommendations over subjective opinions.
Combine with other tools for a complete workflow:
Requires: beautifulsoup4, pandas, requests, scikit-learn. All included in requirements.txt.
AAIO Inc — aaioinc.com/tools/tfidf_extractor/