World Wide Web Wanderer (Complete History)
The idea of the World Wide Web was conceived in the spring of 1989 by Tim Berners-Lee, a physicist by training who was working at CERN, but it did not gain widespread popular use until the NCSA Mosaic web browser was introduced at the beginning of 1993.
In the spring of 1993, just months after the release of Mosaic, Matthew Gray, who studied physics at the Massachusetts Institute of Technology (MIT) and was one of the three members of the Student Information Processing Board (SIPB) who set up the site www.mit.edu, decided to write a program called the World Wide Web Wanderer to systematically traverse the Web and collect sites. The Wanderer was first functional in the spring of 1993 and became the first automated Web agent (also known as a spider or web crawler). It certainly did not reach every site on the Web, but it was run with a consistent methodology, which should yield consistent data on the growth of the Web.
Matthew was initially motivated primarily by the wish to discover new sites, as the Web was still a relatively small place: in early 1993 there were only about 100 websites in the whole world. As the Web started to grow rapidly after 1993, the focus quickly changed to charting that growth. By June 1995 the number of Web servers had increased to the point where about one in every 270 machines on the Internet was a Web server. The first report compiled from the data collected by the Wanderer (see the table below) covers the period from June 1993 to June 1995.
| Month/Year | Number of Web sites | % of .com sites | Hosts per Web server |
The Wanderer was written in Perl, and while crawling the Web it generated an index called Wandex, the first web database. Initially the Wanderer counted only Web servers, but shortly after its introduction it started to capture URLs as it went along.
Matthew Gray’s Wanderer created quite a controversy at the time, partly because early versions of the program ran rampant through the Web and caused noticeable degradation of network performance. The degradation occurred because the program would access the same page hundreds of times a day. The Wanderer soon mended its ways, but the controversy over whether spiders were good or bad for the Internet remained for some time.
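To make the mechanics concrete, here is a minimal sketch of a crawler that counts servers, records URLs, and fetches each page only once. It is not the Wanderer's code, which was written in Perl and is not reproduced in this article. The sketch is in Python, the `TOY_WEB` link graph and the `crawl` and `toy_fetch` names are hypothetical, and no real network access takes place. The `seen` set is one simple way to avoid the repeated page requests described above.

```python
from collections import deque
from urllib.parse import urlsplit

# Hypothetical link graph standing in for the live Web:
# each URL maps to the URLs linked from that page.
TOY_WEB = {
    "http://a.example/": ["http://a.example/docs", "http://b.example/"],
    "http://a.example/docs": ["http://a.example/", "http://c.example/index"],
    "http://b.example/": ["http://a.example/", "http://b.example/"],
    "http://c.example/index": [],
}


def toy_fetch(url):
    """Return the links found on a page (assumed stand-in for an HTTP fetch)."""
    return TOY_WEB.get(url, [])


def crawl(start_url, fetch_links):
    """Breadth-first traversal that fetches every URL at most once."""
    seen = {start_url}          # URLs already queued, so none is fetched twice
    queue = deque([start_url])
    urls = []                   # URLs in the order they were fetched
    hosts = set()               # distinct server host names
    while queue:
        url = queue.popleft()
        urls.append(url)
        hosts.add(urlsplit(url).hostname)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return urls, hosts


urls, hosts = crawl("http://a.example/", toy_fetch)
print(len(urls), "URLs on", len(hosts), "hosts")
```

Without the `seen` check, the cycle between `http://a.example/` and `http://a.example/docs` would keep the loop requesting the same two pages indefinitely, which is the kind of load a real server would notice.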
The Wanderer was certainly not the Internet’s first search engine; that distinction belongs to Archie, created by Alan Emtage. The Wanderer was, however, the first web robot and, with its index Wandex, clearly had the potential to become the first general-purpose Web search engine, years before Yahoo and Google. Matthew Gray does not make this claim, and he has always stated that this was not the program's purpose. In any case, the Wanderer inspired a number of programmers to follow up on the idea of web robots.