Humble Beginnings
In the Merriam-Webster Online Thesaurus, the word “search” shares meaning space with “hunt” and “quest”. This is appropriate.
As the World Wide Web has evolved into the virtual information universe that it is today, man has struggled to find what he is looking for in the most efficient way possible. Often, information is sought to sate a hunger for knowledge. It is entirely possible that the degree of success in obtaining this particular knowledge determines the success or failure of a school report, a business acquisition, or a job search. In extreme cases, failure in the hunt for information could have the same effects as failure in the hunt for food 10,000 years ago. The tribe or family that was unsuccessful in the hunt could be at risk of malnutrition. The breadwinner cannot win bread if his informative hunt for a job is met with failure.
The consequences of a failed search are usually not so dire. Sometimes the search is more like a quest. Just as Arthurian legend describes the search for the Holy Grail, so do some quest to find the paths to data that holds great importance in industry and academia. As man grows with the Web, his virtual problems grow to parallel what had previously been “real” problems only. In a discussion of “(s)earch systems” that “can alleviate the need to organize … information”, Cutrell, Dumais, and Teevan rightly point out that “(k)nowing many details about what is being sought (including the fact that it exists) makes it all the more frustrating when we are unable to find it.” (58)
Whether you call it “hunt”, “quest”, or “search”, the imperative to improve the methods thereof grows by the moment. With information on the Web expanding at an exponential rate, we are “losing” more data by the week than we could collectively learn in a year just two decades ago. One side of the double-edged sword that Tim Berners-Lee wields when advocating a free Web for all is the fact that there are no standards for virtual ontological categorization of data. There are no neatly labeled shelves in cyberspace on which man is required to deposit his contributions to the Web’s information coffers. Or rather, nothing compels contributors to categorize their donations using one of the myriad database structures on the Web. We can throw our treasures into cyberspace and let them fall where they may, sometimes being picked up and labeled by search spiders or directory editors, and sometimes lying unnoticed like a shipwreck at the shallow bottom of a
Were he alive today, Benjamin Franklin would no doubt be dismayed to no end by this complete lack of order. Among the 13 moral virtues by which he attempted to live his life, he listed “Order” third. He described it thus: “Let all your things have their places; let each part of your business have its time.” (
In response to online humanity’s notable lack of the virtue “Order”, “Search” has fortunately evolved. The notion of associative search is well documented as early as 1945, when the United States Director of the Office of Scientific Research and Development, Dr. Vannevar Bush, outlined his ideas for a conceptual machine named a “memex” (Bush 107). The unique supposition of his proposal was not the notion of automated search, which had been around for many years, but the notion of automated associative search. Until then, most forward thinkers who recognized the potential for computational machines to automate search worked under the assumption that search would be linear and possible only on well-ordered systems with extensive indexing incorporated as part of the framework. Bush asserted that machine associative search was not only possible, but would be the evolutionary victor in the survival of the fittest search methods. Sixty years later, his assertion stands.
It was 45 years after Bush’s visionary narrative that ARCHIE was born; by many accounts, ARCHIE was the first associative search engine. The name was originally intended to be ARCHIVES, but UNIX standards at the time required the abbreviated form. ARCHIE was essentially a database of filenames with a regular expression matching algorithm built in. It was the search engines’ venture out of the virtual primordial ooze. VERONICA and JUGHEAD followed in ARCHIE’s wake over the next twelve months, both supporting the Gopher protocol, which was emerging as a standard at the time. (Wall)
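The idea of a filename database queried by regular expression can be illustrated with a minimal sketch. This is not ARCHIE's actual code or data format; the hosts, paths, and the function name search_filenames are hypothetical, invented here only to show the principle of matching a pattern against an index of file names.

```python
import re

# Hypothetical sample index of (host, path) pairs, invented for illustration.
# A real index would be gathered from the listings of many file servers.
FILE_INDEX = [
    ("ftp.example.edu", "/pub/gnu/emacs-18.59.tar.Z"),
    ("ftp.example.org", "/mirrors/tex/latex.tar.Z"),
    ("ftp.example.net", "/pub/docs/README.emacs"),
    ("ftp.example.edu", "/pub/games/nethack.tar.Z"),
]


def search_filenames(pattern, index=FILE_INDEX):
    """Return the (host, path) entries whose file name matches the regex."""
    regex = re.compile(pattern, re.IGNORECASE)
    matches = []
    for host, path in index:
        # Match against the file name only, not the directories above it.
        filename = path.rsplit("/", 1)[-1]
        if regex.search(filename):
            matches.append((host, path))
    return matches


if __name__ == "__main__":
    for host, path in search_filenames(r"^emacs"):
        print(host, path)
```

Note what such a search cannot do: it sees only file names, never file contents, so the searcher must already have a good guess at what the file is called.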
At this time, there was still no efficient means of indexing newly emerging file names or content in cyberspace. It wasn’t until 1993 and the Web’s first robot program, or bot, that this milestone was achieved. The bot was called the World Wide Web Wanderer, and the database it populated was called Wandex. (Wall)
Other submission-based and indexing search systems came and went during this time, but none made the next evolutionary leap until Excite, which used “statistical analysis of word relationships to make searching more efficient.” (Wall)
The next two years saw an explosion of competition in the search arena. Webcrawler, Lycos, Infoseek, AltaVista, LookSmart, Inktomi, and AskJeeves all raced to be the best at searching and returning valid results to Webizens in need. Commercialism was beginning to rear its head, but had not completely taken over... yet. (Wall)
It should be mentioned that, during this explosion, two friends who had structured a portal to bring Web users to a collection of favorite sites had to reorganize their collection into a searchable directory. Thus Yahoo was born. (Wall)