I wrote a quick Python script that searches a file or remote address for URLs and returns the HTTP status code for each one. It’s quick and dirty, and the regex needs some tweaking, but for the most part it works. I didn’t just use a link checker because I was actually testing RSS feeds, so the script grabs URLs from anywhere in the feed rather than only from A tags. It lists anything in the 40x range (for example, 404 Not Found).
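To give a rough idea of the approach, here is a minimal sketch using only the standard library. It is not the original script. The regex, the function names (`extract_urls`, `load_source`, `get_status`, `is_client_error`, `report_client_errors`), and the 10-second timeout are all illustrative assumptions, and it reads "40x" as the whole 400-499 client error class.

```python
import http.client
import re
import urllib.error
import urllib.request

# Assumed pattern: an http/https URL runs until whitespace, a quote, or an
# angle bracket. Like the original regex, it will need tweaking.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def extract_urls(text):
    """Return unique URLs found anywhere in the text, in first-seen order."""
    seen = []
    for url in URL_PATTERN.findall(text):
        url = url.rstrip(".,;)")  # drop trailing sentence punctuation
        if url not in seen:
            seen.append(url)
    return seen


def load_source(source):
    """Read text from a remote address (http/https) or a local file path."""
    if source.startswith(("http://", "https://")):
        with urllib.request.urlopen(source, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")
    with open(source, encoding="utf-8", errors="replace") as handle:
        return handle.read()


def get_status(url):
    """Return the HTTP status code, or None if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is on the exception.
        return error.code
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        return None


def is_client_error(status):
    """True for codes in the 400-499 range (assumed meaning of '40x')."""
    return status is not None and 400 <= status < 500


def report_client_errors(source):
    """Return (url, status) pairs for every URL that answers with a 4xx code."""
    results = []
    for url in extract_urls(load_source(source)):
        status = get_status(url)
        if is_client_error(status):
            results.append((url, status))
    return results
```

Calling `report_client_errors` with a feed path or address (say, a hypothetical `feed.xml`) returns the broken links as `(url, status)` pairs. Because the regex runs over the raw text, it picks up URLs in `<link>`, `<guid>`, and description bodies alike, which is the point of not relying on A tags. Note that `urlopen` follows redirects, so a 301 that ends at a 404 is reported as 404.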