rawler: Crawl your website and find broken links with Ruby
Need a quick-and-dirty way to find broken links on your website? Rawler from Oscar Del Ben is a Ruby gem that gives you a command-line tool to crawl your site, looking for errors.
Install via RubyGems:
gem install rawler
For usage, just run rawler --help:
~ » rawler --help
Rawler is a command line utility for parsing links on a website
Usage:
rawler http://example.com [options]
where [options] are:
--username, -u <s>: HTT Basic Username
--password, -p <s>: HTT Basic Password
--version, -v: Print version and exit
--help, -h: Show this message
Point Rawler to your URL and you’ll get a list of followed links and their HTTP status codes:
~ » rawler https://changelog.com
301 - https://changelog.com/episodes
200 - https://changelog.com/archive
200 - https://changelog.com/
200 - https://chrome.google.com/extensions/detail/oiaejidbmkiecgbjeifoejpgmdaleoha
301 - http://github.com
200 - http://stylebot.me
200 - http://twitter.com/stylebot
200 - https://changelog.com/tagged/css
200 - https://github.com/handlino/CompassApp
...
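Since the list mixes healthy and broken links, it can help to filter it. Here is a minimal Ruby sketch that keeps only error responses, assuming each line has the `STATUS - URL` shape shown above. The `broken_links` helper and the 404 sample line are hypothetical and are not part of Rawler.

```ruby
# Assumed line shape, taken from the sample output above: "200 - https://..."
LINE_PATTERN = /\A(\d{3}) - (\S+)\z/

# Hypothetical helper (not part of Rawler): returns [status, url] pairs
# for every line whose status is a client or server error (>= 400).
def broken_links(output)
  output.each_line.filter_map do |line|
    match = LINE_PATTERN.match(line.strip)
    next unless match # skip anything that is not a status line

    status = match[1].to_i
    [status, match[2]] if status >= 400
  end
end

# Illustrative input; the 404 entry is made up for the example.
sample = <<~OUT
  200 - https://changelog.com/
  301 - http://github.com
  404 - https://example.com/missing
OUT

broken_links(sample).each { |status, url| puts "#{status} #{url}" }
```

Redirects (3xx) are deliberately not treated as broken here; lower the threshold if you want to see them too. `filter_map` needs Ruby 2.7 or newer.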
The roadmap includes:
- Follow redirects, but still inform about them
- Respect robots.txt
- Export to html
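To give a feel for the first roadmap item, here is a rough sketch of following redirects while still reporting each hop. It is not Rawler's code: `trace_redirects`, `DEFAULT_FETCH`, and the hop limit of 5 are assumptions made for illustration, built on Ruby's standard `Net::HTTP`.

```ruby
require "net/http"
require "uri"

# Default fetcher: performs a real GET and returns [status, Location header].
DEFAULT_FETCH = lambda do |url|
  response = Net::HTTP.get_response(URI(url))
  [response.code.to_i, response["location"]]
end

# Hypothetical helper: follows up to `limit` hops and records every one,
# so a redirect is both followed and reported. `fetch` can be swapped
# out, which makes the function usable without network access.
def trace_redirects(url, limit: 5, fetch: DEFAULT_FETCH)
  hops = []
  limit.times do
    status, location = fetch.call(url)
    hops << [status, url]
    break unless (300..399).cover?(status) && location

    url = URI.join(url, location).to_s # resolve relative Location headers
  end
  hops
end
```

Calling `trace_redirects("http://github.com").each { |status, url| puts "#{status} - #{url}" }` would print one line per hop in the same format as Rawler's output. The hop limit keeps a redirect loop from running forever.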
If you want to help out, fork the project and contribute.