World Wide Web Wanderer / Wandex
|Country of Origin|
|Robot:||World Wide Web Wanderer (Source)|
|Older Version||Internet Archive / WebCite|
|wiseGEEK: »In 1993, not long after the creation of the World Wide Web, Matthew Grey [sic] developed the World Wide Web Wanderer, which was the first web robot. The World Wide Web Wanderer indexed all of the websites that existed in the internet by capturing their URLs, but didn’t track any of the actual content of the websites. The index associated with the Wanderer, which was an early sort of search engine, was called Wandex.« Source|
|SalientMarketing: »The first web robot was the creation of Massachusetts Institute of Technology (MIT) physics student Matthew Gray in 1993. Gray’s World Wide Web Wanderer was designed to track the growth of the then-infant Web.
“I wrote the Wanderer to systematically traverse the Web and collect sites,” Gray wrote of his invention. “I was initially motivated primarily to discover new sites, as the Web was still a relatively small place. The Wanderer was the primary tool for collection of data to measure the growth of the Web. It was the first automated Web agent or “spider.” The Wanderer was first functional in spring of 1993 and performed regular traversals of the Web from June 1993 to January 1996.”
During its three-year run, the Wanderer tracked the growth in web sites from 130 in June 1993, to more than 100,000 in January 1996 and an estimated 230,000 just six months later.
Gray extended the scope of the Wanderer from tracking the Web’s size to capturing individual URLs into Wandex, the first web database. Gray’s good intentions also created controversy as early versions of the Wanderer were also known to not just crawl the Web, but slow traffic on the Web to a crawl as the program repeatedly accessed the same pages hundreds of times a day. The problem was fixed in later versions.«
|Wikipedia: »The World Wide Web Wanderer, also referred to as just the Wanderer, was a Perl-based web crawler that was first deployed in June 1993 to measure the size of the World Wide Web. The Wanderer was developed at the Massachusetts Institute of Technology by Matthew Gray, who, as of 2017, has spent a decade as a software engineer at Google. The crawler was used to generate an index called the Wandex later in 1993. While the Wanderer was probably the first web robot, and, with its index, clearly had the potential to become a general-purpose WWW search engine, the author does not make this claim and elsewhere it is stated that this was not its purpose. The Wanderer charted the growth of the web until late 1995.« Source|
|Matthew Gray (30 Jun 93): »I have written a perl script that wanders the WWW collecting URLs, keeping
tracking of where it's been and new hosts that it finds. Eventually,
after hacking up the code to return some slightly more useful information
(currently it just returns URLs), I will produce a searchable index of this.
There is a complete list of all the sites it has found at
A complete list of sites found by the W4 (World Wide Web Wanderer)
I'll announce here when we get this index properly running, however it probably
won't be until sometime in August, as I am going on vacation. Until then...«|
|Matthew Gray (30 Jun 93): »Ok, how "big" is the Web. Here is what W4 has found out.
Actually, first I'd better explain a little bit about what the wanderer does.
It does a simple depth first search, with an added feature I call 'getting
bored'. That is, if it finds a number of documents that have the same
URL, up to the last field (eg http://foo/bar/blah, http://foo/bar/baz,
http://foo/bar/more) it will eventually get 'bored' and skip it. This makes
it go a little quicker. Of course, it potentially is losing some documents
here, but probably not.
W4 took many hours (maybe 20) to run, but I don't remember exactly, because it
saves state so I could kill it and restart it whenever I wanted. Well, in
total, the W4 found more than 17,000 http documents (didn't follow any other
kinds of links) and more than 125 unique hosts. In the current version,
it *only* retrieved the URL of the document.
In the next version, I hope to have it do the following other things.
o Get the |
Features & Functionality
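Gray's post of 30 June 1993 describes the traversal as a simple depth-first search that "gets bored" once it has seen a number of documents sharing the same URL up to the last field, and that records only URLs and hosts. The following is a minimal sketch of that idea, not Gray's code: the original was a Perl script, whereas this is Python, and the names `wander`, `url_prefix`, and `boredom_limit`, the limit of 3, and the in-memory `links` dictionary (a stand-in for actually fetching pages) are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def url_prefix(url: str) -> str:
    """Return the URL up to, but not including, its last path field."""
    parts = urlsplit(url)
    directory = parts.path.rsplit("/", 1)[0]
    return f"{parts.scheme}://{parts.netloc}{directory}"


def wander(start: str, links: dict[str, list[str]], boredom_limit: int = 3):
    """Depth-first traversal that skips a URL prefix once it gets 'bored'.

    `links` maps a URL to the URLs it links to (hypothetical stand-in for
    retrieving a document and extracting its links). `boredom_limit` is an
    assumed threshold; the source does not state the number Gray used.
    """
    seen: set[str] = set()
    hosts: set[str] = set()
    prefix_counts: dict[str, int] = {}
    visited: list[str] = []
    stack = [start]

    while stack:
        url = stack.pop()
        if url in seen:
            continue
        prefix = url_prefix(url)
        if prefix_counts.get(prefix, 0) >= boredom_limit:
            continue  # bored: enough documents seen under this prefix
        prefix_counts[prefix] = prefix_counts.get(prefix, 0) + 1
        seen.add(url)
        visited.append(url)
        hosts.add(urlsplit(url).netloc)
        # Push in reverse so the first link on a page is explored first.
        stack.extend(reversed(links.get(url, [])))

    return visited, hosts
```

Called with a small link graph, `wander("http://foo/index", links)` returns the visited URLs in traversal order together with the set of unique hosts. As Gray notes, the shortcut trades completeness for speed: documents under a prefix that has already hit the limit are never retrieved.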
References & further Publications
|Wikipedia (EN): n.a.|
|Wikipedia (Others): n.a.|
|SalientMarketing: The first web robot - 1993 URL: http://www.salientmarketing.com/seo-resources/search-engine-history/web-robot.html|
|WiseGEEK: What Was the First Search Engine? URL: http://www.wisegeek.com/what-was-the-first-search-engine.htm|
|Gray, Matthew (1996): Web Growth Summary URL: http://www.mit.edu/~mkgray/net/printable/web-growth-summary.html|
|Georgi Dalakov: World Wide Web Wanderer of Matthew Gray URL: http://history-computer.com/Internet/Conquering/Wanderer.html|
|Gray, Matthew (1996): Internet Growth Summary URL: http://www.mit.edu/~mkgray/net/printable/internet-growth-summary.html|