Humble Beginnings

In the Merriam-Webster Online Thesaurus, the word “search” shares meaning space with “hunt” and “quest”. This is appropriate.

As the World Wide Web has evolved into the virtual information universe that it is today, man has struggled to find what he is looking for in the most efficient way possible. Oftentimes, information is sought to sate a hunger for knowledge. It is entirely possible that the degree of success in obtaining this particular knowledge determines the success or failure of a school report, a business acquisition, or a job search. In extreme cases, failure in the hunt for information could have the same effects as failure in the hunt for food 10,000 years ago. The tribe or family that was unsuccessful in the hunt could be at risk of malnutrition. The breadwinner cannot win bread if his hunt for information about a job is met with failure.

The consequences of a failed search are usually not so dire. Sometimes the search is more like a quest. Just as Arthurian legend describes the search for the Holy Grail, so do some quest to find paths to data that hold great importance in industry and academia. As man grows with the Web, his virtual problems grow to parallel what were previously only “real” problems. In a discussion of “(s)earch systems” that “can alleviate the need to organize … information”, Cutrell, Dumais, and Teevan rightly point out that “(k)nowing many details about what is being sought (including the fact that it exists) makes it all the more frustrating when we are unable to find it.” (58)

Whether you call it “hunt”, “quest”, or “search”, the imperative to improve its methods grows by the moment. With information on the Web expanding at an exponential rate, we are “losing” more data by the week than we could collectively learn in a year just two decades ago. One edge of the double-edged sword that Tim Berners-Lee wields when advocating a free Web for all is the fact that there are no standards for virtual ontological categorization of data. There are no neatly labeled shelves in cyberspace on which man is required to deposit his contributions to the Web’s information coffers. Or rather, nothing compels contributors to categorize their donations using one of the myriad database structures on the Web. We can throw our treasures into cyberspace and let them fall where they may, sometimes being picked up and labeled by search spiders or directory editors, and sometimes lying unnoticed like a shipwreck at the shallow bottom of the Caribbean Sea.

Were he alive today, Benjamin Franklin would no doubt be dismayed to no end by this complete lack of order. Among the 13 moral virtues by which he attempted to live his life, he listed “Order” as the third. He described it thus: “Let all your things have their places; let each part of your business have its time.” (Franklin) He also indicated that this was the virtue with which he had the most trouble, so perhaps his dismay would be accompanied by sympathy. And so it seems that man at large fails in this virtue, just as Franklin struggled with it.

In response to online humanity’s notable lack of the virtue “Order”, “Search” has fortunately evolved. The notion of associative search is well documented as early as 1945, when the United States Director of the Office of Scientific Research and Development, Dr. Vannevar Bush, outlined his ideas for a conceptual machine named a “memex”. (Bush 107) The unique supposition of his proposal was not the notion of automated search, which had been around for many years, but the notion of automated associative search. Until then, most forward thinkers who recognized the potential for computational machines to automate search worked under the assumption that search would be linear and only possible on well-ordered systems with extensive indexing incorporated as part of the framework. Bush asserted that machine associative search was not only possible, but would be the evolutionary victor in the survival of the fittest search methods. Sixty years later, his assertion stands.

It was 45 years after Bush’s visionary narrative that ARCHIE was born; by many accounts, ARCHIE was the first associative search engine. It was originally intended to be named ARCHIVES, but UNIX standards at the time required the abbreviated form. ARCHIE was essentially a database of filenames with a regular expression matching algorithm built in. ARCHIE was the search engine’s venture out of the virtual primordial ooze. VERONICA and JUGHEAD followed in ARCHIE’s wake over the next twelve months, both supporting the Gopher protocol, which was emerging as a standard at the time. (Wall)
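For readers who want a concrete picture of “a database of filenames with a regular expression matching algorithm”, the idea can be sketched in a few lines of Python. This is only an illustration of the concept and not ARCHIE’s actual code: the host names, file paths, and the function name search_filenames are all invented for the example.

```python
import re

# Hypothetical index of (host, path) pairs, invented for illustration only.
FILENAME_INDEX = [
    ("ftp.example.edu", "/pub/gnu/emacs-18.59.tar.Z"),
    ("ftp.example.edu", "/pub/docs/README"),
    ("archive.example.org", "/mirrors/tex/latex.tar.Z"),
    ("archive.example.org", "/pub/games/rogue.tar.Z"),
]


def search_filenames(pattern, index=FILENAME_INDEX):
    """Return the (host, path) entries whose file name matches the pattern.

    Only the final path component (the file name) is tested, mirroring the
    idea of a search over names rather than over file contents.
    """
    regex = re.compile(pattern)
    return [
        (host, path)
        for host, path in index
        if regex.search(path.rsplit("/", 1)[-1])
    ]


if __name__ == "__main__":
    # List every compressed tar archive in the toy index.
    for host, path in search_filenames(r"\.tar\.Z$"):
        print(host, path)
```

The sketch makes the limitation plain: a search of this kind can only find a file if the searcher can guess something about its name, since nothing about the file’s content is ever examined.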

At this time, there were still no efficient means for indexing newly emerging file names or content in cyberspace. It wasn’t until 1993 and the Web’s first robot program, or bot, that this milestone was achieved. The bot was called the World Wide Web Wanderer and the database it populated was called Wandex. (Wall)

Other submission-based and indexing search systems came and went during this time, but none made the next evolutionary leap until Excite, which used “statistical analysis of word relationships to make searching more efficient.” (Wall)

The next two years saw an explosion of competition in the search arena. Webcrawler, Lycos, Infoseek, AltaVista, LookSmart, Inktomi, and AskJeeves all raced to be the best at searching and returning valid results to Webizens in need. Commercialism was beginning to rear its head, but had not completely taken over... yet. (Wall)

It should be mentioned that two friends who had structured a portal to bring Web users to a collection of favorite sites had to reorganize their collection into a searchable directory during this explosion. Thus Yahoo was born. (Wall)