Python & command-line tool to gather text on the Web: Crawling & scraping, content extraction, metadata. TXT, Markdown, CSV & XML output.
Updated Jul 16, 2024 - Python
reader is for your command line what the “readability” view is for modern browsers: A lightweight tool offering better readability of web pages on the CLI.
SmartReader is a library to extract the main content of a web page, based on a port of the Readability library by Mozilla.
For Notion, OneNote, Bear, Yuque, Joplin. Clip anything to anywhere.
Go package that cleans an HTML page for better readability.
Extractum is a PHP library that extracts information from web pages.
A very simple Python script that strips clutter from the Montreal Gazette readability web page for people with disabilities.
A little online tool that takes a URL and converts it to Markdown.
A tool to render YAML into InnoSetup .iss files.
🇺🇸 3 Rs of Software Architecture for iOS
📗 Score text readability using a number of formulas: Flesch-Kincaid Grade Level, Gunning Fog, ARI, Dale Chall, SMOG, and more
Reduce content complexity
ESLint plugin for John Resig-style micro template, Lodash's template, Underscore's template and EJS.
Chrome Extension to Summarize or Chat with Web Pages/Local Documents Using locally running LLMs. Keep all of your data and conversations private. 🔐
Extract clean(er), readable text from web pages via Mercury Web Parser.
Code, models, and data for "Strategies for Arabic Readability Modelling". ArabicNLP 2024, ACL.
A code-golfing language experience that has aspects of traditional programming languages - terse, elegant, readable.
A Python library for calculating a large variety of metrics from text
An HTTP proxy that parses only text, links, and pictures from pages, reducing internet bandwidth usage and removing ads and heavy scripts.