WebCLI: A Command Line Interface for Automating the Extraction of Web Content

For the most part, we have one way to interact with web content: the browser.

I often found myself wishing that I could access web content through a command line interface. Why should it be difficult to grab a collection of images or .zip files from a web page? The command line is an effective way to automate tasks like these. That is why I spent a few weeks writing a program that enables this type of interaction: a web command-line interface (WebCLI).

WebCLI was implemented as an interactive Python interpreter, closely imitating the look and feel of the traditional UNIX shell.
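
To give a rough idea of how a shell like this can be put together in Python, here is a minimal sketch built on the standard library's cmd and html.parser modules. It is not WebCLI's actual source. The names AnchorCollector, list_anchors, and WebShell are made up for illustration, and the page is passed in as a plain HTML string instead of being fetched over the network. The only command it supports is 'ls -A', which prints the href of each anchor element in the page.

```python
import cmd
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the href of every <a> element seen while parsing."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; anchors without href are skipped.
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.hrefs.append(href)


def list_anchors(html):
    """Return the href values of all anchors in an HTML string, in document order."""
    parser = AnchorCollector()
    parser.feed(html)
    parser.close()
    return parser.hrefs


class WebShell(cmd.Cmd):
    """Hypothetical shell: treats one HTML page as the 'current directory'."""

    prompt = "web$ "

    def __init__(self, html="", **kwargs):
        super().__init__(**kwargs)
        self.html = html  # assumption: the page was already downloaded elsewhere

    def do_ls(self, arg):
        """ls -A: list the anchor targets in the current page."""
        if arg.split() == ["-A"]:
            for href in list_anchors(self.html):
                print(href, file=self.stdout)
        else:
            print("usage: ls -A", file=self.stdout)

    def do_exit(self, arg):
        """exit: leave the shell."""
        return True  # a true return value stops cmd.Cmd's command loop
```

Calling WebShell(html=page).cmdloop() with some HTML in page starts the interactive prompt. The cmd module takes care of reading input and dispatching each command to the matching do_ method, which is what makes the shell-like feel cheap to get.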

I used this application for the HTML tag visualization I did last year. WebCLI lets users work with familiar command-line file system utilities like 'cd', 'cp', and 'ls', along with a set of custom flags that apply only to web content. For example, 'ls -A' shows the user all HTML anchor elements within the current page. Here's a screenshot: