LinkChecker is a utility written in Python for scanning and checking web page links, usually used for finding invalid or outdated links that need to be updated. The LinkChecker project is in a bit of flux right now because the original project (GitHub wummel/linkchecker) has gone completely quiet, and presumably the original author is no longer interested in maintaining it. Luckily, a new group of volunteers is rallying around a new fork (GitHub linkcheck/linkchecker).
The project has a variety of packaged downloads, but not all of them have been updated from the newest source tree yet. On my Mac system I always had trouble making the old project work (usually getting an error like ImportError: No module named requests). Switching to the new LinkChecker source and using Virtualenv have solved my problems! These are my steps for making this work; it’s pretty straightforward if you have some experience with Python-based utilities.
Prerequisites
- Python
- Virtualenv
- Git (used below to clone the source)
First Time Installation
The first step is to create a working directory for LinkChecker and set up the virtual Python environment:
mkdir ~/linkchecker
cd ~/linkchecker
virtualenv env
source env/bin/activate
python --version
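If you want to double-check that the environment really is active before installing anything, here is a small sketch. It relies on the VIRTUAL_ENV variable that the activate script sets; the in_virtualenv function name is just something I made up for illustration.

```shell
# Hypothetical helper (name is my own): succeeds only when a virtualenv
# is active, i.e. when the activate script has set VIRTUAL_ENV.
in_virtualenv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if in_virtualenv; then
    echo "virtualenv active: $VIRTUAL_ENV"
else
    echo "no virtualenv active; run: source env/bin/activate"
fi
```

If the second message shows up, run the source command again from the ~/linkchecker directory before continuing.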
Next we’ll clone the latest LinkChecker and install it in the virtual Python environment:
git clone https://github.com/linkcheck/linkchecker.git .
python setup.py sdist --manifest-only
python setup.py build
python setup.py install
Next, confirm that it’s installed and ready to run:
linkchecker
linkchecker --help
Finally, start using the tool and check some websites, for example:
linkchecker --timeout 5 --check-extern https://tweetfave.com/
linkchecker -r 1 --timeout 5 --check-extern http://www.cantoni.org/2017/07/27/podcast-update-feed-reader
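If you have several sites to check, the same options can be wrapped in a loop. This is only a rough sketch: the check_sites function and the LINKCHECKER_CMD override are names I invented (the override is there so you can do a dry run with another command), and it assumes linkchecker exits with a non-zero status when it finds problems.

```shell
# Hypothetical batch helper: run the checker against each URL argument
# and return non-zero if any run reported a problem.
# LINKCHECKER_CMD is an assumed override hook; it defaults to linkchecker.
check_sites() {
    failed=0
    for url in "$@"; do
        if ! "${LINKCHECKER_CMD:-linkchecker}" --timeout 5 --check-extern "$url"; then
            echo "problems reported for: $url" >&2
            failed=$((failed + 1))
        fi
    done
    echo "sites with problems: $failed" >&2
    # Succeed only if every site came back clean.
    [ "$failed" -eq 0 ]
}
```

For example, with the virtualenv active: check_sites https://tweetfave.com/ http://www.cantoni.org/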
Running LinkChecker
The steps above are only needed the first time. After that, you just need to activate the Virtualenv first:
cd ~/linkchecker
source env/bin/activate
linkchecker --help
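To avoid typing those lines every time, something like the following could go in your shell profile. It is a sketch under my own assumptions: the lc function name and the LINKCHECKER_HOME variable are made up, and the path defaults to the ~/linkchecker directory used above.

```shell
# Hypothetical wrapper: run linkchecker inside its virtualenv without
# leaving the environment active in the calling shell.
# LINKCHECKER_HOME is an assumed variable; it defaults to ~/linkchecker.
lc() {
    lc_home="${LINKCHECKER_HOME:-$HOME/linkchecker}"
    activate="$lc_home/env/bin/activate"
    if [ ! -f "$activate" ]; then
        echo "virtualenv not found at $lc_home/env" >&2
        return 1
    fi
    # The subshell keeps the activation from leaking into the caller.
    ( . "$activate" && linkchecker "$@" )
}
```

With that in place, lc --help should behave like the three commands above. If you activate the environment by hand instead, running deactivate returns the shell to normal.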