Dean's Digital World - Sources 45

By Dean Tudor

"How do you organize effective research as high tech keeps accelerating, old hierarchies are displaced by more productive heterarchies, and disciplines yield the best results through hybrid vigor? One answer is to fling corporate sponsors, scientists, and students into seething heaps and see what emerges. Better questions get asked faster, and a lot of just-in-time learning can converge on projects" -- Stewart Brand, Wired, March 1999.

The Internet, now a world-wide network of networks running on different types of computers, had its roots in linked computers before 1960. This then evolved into Project INTREX at MIT, a computerized network for scientists to input the latest results of their research so that other scientists could have instant access to those findings. By 1967 this had changed to ARPANet, a computer network deliberately designed to function despite severe power outages or nuclear warfare. For many years, it was the sole domain of the military and government researchers. It was eventually joined by BITNET, a separate network opened to academics and students (BIT="because it's time") for E-mail messaging, file transfers, and telnet remote logins.

At the same time (about 1978), UseNet and "Bulletin Board Systems" began to operate as true electronic postings of messages onto a publicly accessible site -- for everyone with a computer to see. CompuServe and the Source (proprietary BBSes) emerged by 1983. Then, the Gopher software program (University of Minnesota) allowed for transfers of information in plain-text ASCII format. By 1991, the concept of the hypertexted link (popularized by Apple Computer and used in CD-ROMs) had spread to various computer networks, and the World Wide Web was born, albeit in text form. The Lynx software program (University of Kansas) allowed for transfers of hypertexted, linked information in plain-text ASCII format. Lynx was also the first true browser, since it could read E-mail and UseNet postings, as well as do file transfers and printing.

Two developments in 1994 changed everything. First, the Internet Society decided to open its network to commercial network ventures, such as CompuServe and America Online, as well as to regular businesses. What had previously been reserved for researchers, academics, students, and not-for-profits was now open to the entire globe. Second, the Mosaic software program (University of Illinois) allowed for transfers of hypertexted, linked graphics, audio, movies, and binary files to a Windows platform. Technology continued to evolve as well, with faster computers, larger bandwidth fibre optic cables, faster modems, greater storage, and easy-to-learn programs. From Mosaic came Netscape and Internet Explorer, two programs that can also handle E-mail and messaging systems.

There are two basic modes of data on the Internet. One is E-mail: electronic messages sent, archived, and retrieved by various programs. "Listservs" will send posted messages to all (free) subscribers. "UseNet newsgroups" (also known as "NetNews newsgroups") are true bulletin boards on which posted messages remain for electronic visitors to view. "Web forum" is an electronic bulletin board for messages posted to a particular Website. "IRC" is a program (Internet Relay Chat) for multiple online discussion, like a conference call, familiarly called "chat forums". E-mail is the most used part of the Internet.

The other basic mode is the World Wide Web: computer sites with openly accessible files of data that can be read by a browser, with hypertexted (hidden) links to other computer sites. An earlier method of file transfer was "ftp" (file transfer protocol), which enabled researchers to get copies of files from another computer system. "Gopher" was an interface program developed to make textual FTP easier. Netscape and Internet Explorer made retrieval easier still, and everything on the Web was also linked. The Web will have over 800 million documents, mostly in English, by 2000. With the development of the Web and all-in-one browsers such as Netscape, there has been a decline in proprietary services and bulletin board activity. Most of these were on a "charged for" basis, while the Web is essentially free. BBSes are dying, and services such as America Online are now just basic portals to the Web, charging only for access. The necessary links and linkages for E-mail lists and Web sites can be found at

Now, there is also a certain matter of convergence. Browsers such as Internet Explorer and Netscape can handle the Web, E-mail, ftp, telnet, UseNet, Gopher, wide area information systems (WAIS) -- all at once. While you are visiting a Web site, a browser can send your E-mail. While you are reading your E-mail, a browser can load and visit a Web site listed in the E-mail message.

What makes the Internet extremely useful is this matter of converging technologies: because of its setup, the Internet can be -- simultaneously -- a mass medium communication device (sending many messages from one source), a personal device (sending messages from one reader to another reader), a networked device (sending many messages to one person), and a computer device (sending many different messages to many different people). Everyone can be a publisher. The Internet is an essentially flat device, with chaos happening all at once, not really structured in time patterns or quality levels. Each person can produce or receive personal or mass messages through the E-mail functions; data can be provided by many people and accessed by the masses, or archived for groups or individuals to select and retrieve (either now or later). Every person, business, country, government is equal on the Internet, each having the same E-mail and Web site options: for personal communications and interactivity, mass messages, information storage, data processing and fact retrieval.

The average person is comfortable surfing the Internet, but only the researcher can plough through medical committee findings, legal journals, or annual reports -- and make sense of the contents. The chances are decreasing that the average person will stumble across the specific information needed. The answer is to generate specific questions when using the Internet, whether you are using it for business or for personal information.

It actually matters little to the researcher whether the required data is in print form or in electronic form, for the substance is the same. Only the format is different (and hence the searching patterns). For purposes of "finding answers", the researcher needs to know only what is out on the Internet and how to get it and to use it. Most everything recent is on the Internet, and you get it and use it just as you would data from a CD-ROM. Indeed, many Web sites are just collections of CD-ROMs, assembled for ease of use and continual access.

So what, then, is different about the Internet?

* The Internet is free, after paying for access charges, except for a few for-profit sites which charge for normal online access anyway, such as DIALOG, NewsScan, Lexis-Nexis, and InfoGlobe.

* The Internet is freely available for usage from a researcher's home, 24 hours a day. Most trips to a physical library or office have been made redundant by the Internet. Researchers can work in a comfortable environment at any hour of the day.

* The Internet is used by the community of researchers: all scholars, academic or independent, are connected via Web sites and E-mail. They are as near as the keyboard, and can be reached instantaneously if you are already online. This means that online research can now include experts along with documents and references in one bundle.

* The Internet is also larger and faster than CD-ROMs and online systems, in three ways:

- larger site computers process and return requests faster than a personal computer can;

- the Internet tracks breaking stories in news and sports, unlike online systems (and impossible with fixed CD-ROMs);

- the Internet has graphics, sound and motion, which online services such as Lexis-Nexis or DIALOG cannot produce.

* The Internet is a vast resource for quick and (mostly) free "information", often for data that was never conveniently available before, such as:

- breaking wire stories from news sources (there is also a place to get a quick overview of what all the major newspapers and broadcasters are leading with: 1stHeadlines, which monitors the leads from 200 newspaper, broadcaster, and online sources, with quicklinks to the full text, as well as quicklinks to coverage of hot political, business, and international stories);

- business wires with stock market reports on a 15-second delay (instant access is available, but it costs money);

- weather and sports from around the world;

- enormous reference tools via online CD-ROMs (all free-text searchable), and some of these, such as The Canadian Encyclopedia, are regularly updated. The Encyclopedia Britannica in late 1999 made an announcement that it too was now "free" (but driven by adverts), and its Web site crashed from the immense number of immediate hits from researchers and curious people.

- searchable databases of magazines, newspapers, research studies, company profiles, library catalogues, for low or no cost;

- general searching for solutions;

- press releases from businesses and governments;

- entertainment information and listings;

- searching for people (phone numbers, addresses, reverse lookups);

- genealogy and family matters (e.g., adoption searches);

- government-related information and documents;

- health and medical issues, including alternative medicine;

- legal research (court decisions, legislation, statutes, regulations, records such as land titles, licenses, assessment, taxes);

- fun and practical data -- hobbies, games, computer programs, planning holidays, collecting, music, comparative shopping, home maintenance, job hunting, product information, networking, education, auctions;

- photographs, graphics, video, audio, and so forth, which can add to a research report (these are not available through online services);

- expert advice via E-mail, reference queries, authors, Sources Web site.

The advent of electronic searching (online or offline) has brought about some dislocating changes in the world of information:

* Many publications have ceased to appear in paper form. For example, the Canadian case digests published by Canada Law Book have been available on the online service QL Systems for decades. The print version was finally discontinued in 1994.

* Many online services have merged, to counteract the impact of the Internet. For example, Globe Information Services (Globe and Mail, Thomson) and Dow Jones Interactive (Wall Street Journal) have combined as InfoGlobe-Dow Jones Interactive. Now Canadian data can be easily obtained by international researchers. And Canadians have access to expert analyses of the mutual fund market, corporate information on publicly traded companies, securities price and volume data, price charts and data. Lexis-Nexis combined with Butterworths Canada to present over one billion documents online from over 20,000 sources.

* Many free information services have been launched, mainly by governments, on the Internet. For example, the Canadian Centre for Occupational Health and Safety has its information service on its Web site.

* Newly merged reference products are available in one bundle, such as Electric Library Canada (Rogers MultiMedia), which has three separate interfaces for libraries, consumers and businesses. Canadian content includes all the Maclean Hunter publications, The Canadian Encyclopedia, various dictionaries and quotation books, and the Toronto Star.

* Unfortunately for libraries, many government periodicals and annual reports are no longer available in paper format. Some good news from this action is that the legislative debates, while no longer in daily paper form, are posted on the Internet more quickly than paper publication once allowed, and they are also now fully searchable word documents.

* Most of the Internet (and online information in general) is dominated by the United States. English is the "lingua franca" of the Internet, and spellings tend to be the American version. Mass media Web sites have a heavy US bias, and because of the US government attitude to Freedom of Information, many databases are available for low or no cost: reverse phone lookups, reward money programs, credit bureaux, federal license plate checks, US tax courts, home price searches, drug test results, skiptracer databases, deadbeat parent locator service, sex offender registries, environmental investigations, national driver registry, bankruptcy courts, injury claims, casino fraud, child abuser lists, patents and trademarks, missing persons, military personnel records, physicians masterfile, various police reports, genealogy searches.

Here in Canada, municipal assessments were put on the Internet in Victoria, BC, and then almost immediately withdrawn because of privacy concerns. Given time, the privacy issues will be resolved. For example, the National Archives of Canada put up veterans' records from World War I, data on more than 630,000 Canadians -- data now more than 82 years old and more readily available than it has ever been.

* It takes twice as long to find relevant documents on the free Web as it does on pay-for online services. Studies have shown that while the Web and the online services yielded the same number of valuable documents, the Web also returned twice the number of useless documents and also had problems with broken links. These slowed down research time. Choosing the free Web or a pay-for service depends on whether the researcher prefers to spend money or time.

* Studies have also shown that while searching Web sites is slightly faster at finding information than searching databases and print in a library, the library searches were more accurate. In the studies, no real evaluation of the Web sites was made, in the interest of speed, and this resulted in inaccuracies. Web sites can be used for quick information on certain subjects and definitely for finding updated information. But the library's collections can be just as quick, and the information is more trustworthy.

Factor in the Internet's 24-hour accessibility and working from home, and the bottom line is definitely skewed towards the Web sites. Pluses for the libraries: they are better organized than Web sites, and older information is just as good as newer data if it doesn't have to be current. Pluses for the Internet: it is more current and it has more primary sources available from around the world.

* Research materials are only as good as the researcher. Lots of people are attracted to the Internet and to Web sites because of the glamour and the open accessibility. Yet they don't pay attention to how search engines work. Web sites can be distracting, with music, video, advertising, and other flashy effects. It is very easy to get caught up and lose focus. Some caveats: not all Web sites are created equal, and just because Web sites have a lot of information doesn't mean they have the exact information that researchers want. A little knowledge can be a dangerous thing, particularly if misinformation is repeated.
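
The earlier point about the free Web returning the same number of valuable documents but twice the number of useless ones can be made concrete with a little arithmetic. The sketch below is only an illustration: the document counts are hypothetical and do not come from the studies mentioned, and the function name `precision` is introduced here for convenience.

```python
def precision(valuable: int, useless: int) -> float:
    """Share of returned documents that are actually valuable."""
    total = valuable + useless
    if total == 0:
        raise ValueError("no documents returned")
    return valuable / total


# Hypothetical counts, chosen only for illustration (not from the studies).
valuable = 20                      # same number of valuable documents on both
useless_online = 20                # useless documents from a pay-for service
useless_web = 2 * useless_online   # the free Web returns twice as many

online = precision(valuable, useless_online)
web = precision(valuable, useless_web)

print(f"pay-for online service: {online:.0%} of results are valuable")
print(f"free Web: {web:.0%} of results are valuable")
```

With these assumed counts, half of the pay-for results are worth reading against a third of the free Web results, so the researcher on the free Web has more to sift through for the same yield.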

Next time: more on search strategies and Web site evaluations...

Dean Tudor is Sources Informatics Consultant and a professor of Journalism and Information Science at Ryerson University. He can be reached at

Published in Sources, Number 45, Winter 2000.

Sources, 489 College Street, Suite 201, Toronto, ON M6G 1L9.
Phone: (416) 964-7799 FAX: (416) 964-8763
