Dean's Digital World - Sources
By Dean Tudor
"How do you organize effective research as high tech keeps
accelerating, old hierarchies are displaced by more productive heterarchies,
and disciplines yield the best results through hybrid vigor? One
answer is to fling corporate sponsors, scientists, and students
into seething heaps and see what emerges. Better questions get asked
faster, and a lot of just-in-time learning can converge on projects"
-- Stewart Brand, Wired, March 1999.
The Internet, now a world-wide network of networks running on different
types of computers, had its roots in linked computers before 1960.
This then evolved into Project INTREX at MIT, a computerized network
for scientists to input the latest results of their research so
that other scientists could have instant access to those findings.
By 1967 this had changed to ARPANet, a computer network deliberately
designed to function despite severe power outages or nuclear warfare.
For many years, it was the sole domain of the military and government
researchers. Eventually it became BITNET, a network opened to academics
and students (BIT="because it's time") for E-mail messaging,
file transfers, and telnet remote logins.
At the same time (about 1978), UseNet and "Bulletin Board
Systems" began to operate as true electronic postings of messages
onto a publicly accessible site -- for everyone with a computer
to see. CompuServe and the Source (proprietary BBSes) emerged by
1983. Then, the Gopher software program (University of Minnesota)
allowed for transfers of information in plain-text ASCII format.
By 1991, the concept of the hypertexted link (developed by Apple
Computer and used in CD-ROMs) had spread to various computer networks,
and the World Wide Web was born, albeit in text form. The Lynx software
program (University of Kansas) allowed for transfers of hypertexted,
linked information in plain-text ASCII format. Lynx was also the
first true browser, since it could read E-mail and UseNet postings,
as well as do file transfers and printing.
Two developments in 1994 changed everything. First, the Internet
Society decided to open its network to commercial network ventures,
such as CompuServe and America Online, as well as to regular businesses.
What had previously been reserved for researchers, academics, students,
and not-for-profits was now open to the entire globe. Second, the
Mosaic software program (University of Illinois) allowed for transfers
of hypertexted, linked graphics, audio, movies, and binary files
to a Windows platform. Technology continued to evolve as well, with
faster computers, larger bandwidth fibre optic cables, faster modems,
greater storage, and easy-to-learn programs. From Mosaic came Netscape
and Internet Explorer, two programs that can also handle E-mail
and messaging systems.
There are two basic modes of data on the Internet. One is E-mail:
electronic messages sent, archived, and retrieved by various programs.
"Listservs" will send posted messages to all (free) subscribers.
"UseNet newsgroups" (also known as "NetNews newsgroups")
are true bulletin boards on which posted messages remain for electronic
visitors to view. "Web forum" is an electronic bulletin
board for messages posted to a particular Website. "IRC"
is a program (Internet Relay Chat) for multiple online discussion,
like a conference call, familiarly called "chat forums".
E-mail is the most used part of the Internet.
The other basic mode is the World Wide Web: computer sites with
openly accessible files of data that can be read by a browser, with
hypertexted (hidden) links to other computer sites. Earlier versions
of file transfers were "ftp" (file transfer protocol),
which enabled researchers to get copies of files from another computer
system. "Gopher" was an interface program developed to
make textual FTP easier. Netscape and Internet Explorer made it
easier still, and everything on the Web was also linked. The Web will have
over 800 million documents, mostly in English, by 2000. With the
development of the Web and all-in-one browsers such as Netscape,
there has been a decline in proprietary service and bulletin board
activities. Most of these were on a "charged for" basis,
while the Web is essentially free. BBSes are dying, and services
such as America Online are now just basic portals to the Web, charging
only for access. The necessary links and linkages for E-mail lists
and Web sites can be found at http://www.ryerson.ca/journal/megasources.html
Now, there is also a certain matter of convergence. Browsers such
as Internet Explorer and Netscape can handle the Web, E-mail, ftp,
telnet, UseNet, Gopher, wide area information systems (WAIS) --
all at once. While you are visiting a Web site, a browser can send
your E-mail. While you are reading your E-mail, a browser can load
and visit a Web site listed in the E-mail message. What makes the
Internet extremely useful is this matter of converging technologies:
because of its setup, the Internet can be -- simultaneously -- a
mass medium communication device (sending many messages from one
source), a personal device (sending messages from one reader to
another reader), a networked device (sending many messages to one
person), and a computer device (sending many different messages
to many different people). Everyone can be a publisher. The Internet
is an essentially flat device, with chaos happening all at once,
not really structured in time patterns or quality levels. Each person
can produce or receive personal or mass messages through the E-mail
functions; data can be provided by many people and accessed by the
masses, or archived for groups or individuals to select and retrieve
(either now or later). Every person, business, country, government
is equal on the Internet, each having the same E-mail and Web site
options: for personal communications and interactivity, mass messages,
information storage, data processing and fact retrieval.
The average person is comfortable surfing the Internet, but only the researcher
can plough through medical committee findings, legal journals, or
annual reports -- and make sense of the contents. The chances are
decreasing that the average person will stumble across the specific
information needed. The answer is to generate specific questions
when using the Internet, whether you are using it for business or
for personal information. It actually matters little to the researcher
whether the required data is in print form or in electronic form,
for the substance is the same. Only the format is different (and
hence the searching patterns). For purposes of "finding answers",
the researcher needs to know only what is out on the Internet and
how to get it and to use it. Most everything recent is on the Internet,
and you get it and use it just as you would data from a CD-ROM.
Indeed, many Web sites are just collections of CD-ROMs, assembled
for ease of use and continual access.
So what, then, is different about the Internet?
* The Internet is free, after paying for access charges, except
for a few for-profit sites which charge for normal online access
anyway, such as DIALOG, NewsScan, Lexis-Nexis, and InfoGlobe.
* The Internet is freely available for usage from a researcher's
home, 24 hours a day. Most trips to a physical library or office
have been made redundant by the Internet. Researchers can work in
a comfortable environment at any hour of the day.
* The Internet is used by the community of researchers: all scholars,
academic or independent, are connected via Web sites and E-mail.
They are as near as the keyboard, and can be reached instantaneously
if you are already online. This means that online research can now
include experts along with documents and references in one bundle.
* The Internet is also larger and faster than CD-ROMs and online
systems, in three ways:
- larger site computers process and return requests faster than
a personal computer can;
- the Internet tracks breaking stories in news and sports, unlike
online systems (and which is impossible with fixed CD-ROMs);
- the Internet has graphics, sound and motion, which online services
such as Lexis-Nexis or DIALOG lack.
* The Internet is a vast resource for quick and (mostly) free "information",
often for data that was never conveniently available before, such as:
- breaking wire stories from news sources (and there is also a
place to go where you can get a quick overview of what ALL the major
newspapers and broadcasters are leading with -- it's called 1stHeadlines,
and it monitors the leads from 200 newspaper, broadcaster, and online
sources, with quicklinks to the full text. There are also quicklinks
to find coverage on hot political, business, and international stories)
<www.1stheadlines.com>;
- business wires with stock market reports on a 15-second delay
(instant access is available, but it
- weather and sports from around the world;
- enormous reference tools via online CD-ROMs (all free-text searchable),
and some of these, such as The Canadian Encyclopedia, are regularly
updated. The Encyclopedia Britannica in late 1999 made an announcement
that it too was now "free" (but driven by adverts), and
its Web site crashed from the immense number of immediate hits from
researchers and curious people.
- searchable databases of magazines, newspapers, research studies,
company profiles, library catalogues, for low or no cost;
- general searching for solutions;
- press releases from businesses and governments;
- entertainment information and listings;
- searching for people (phone numbers, addresses, reverse lookups);
- genealogy and family matters (e.g., adoption searches);
- government-related information and documents;
- health and medical issues, including alternative medicine;
- legal research (court decisions, legislation, statutes, regulations,
records such as land titles, licenses, assessment, taxes);
- fun and practical data -- hobbies, games, computer programs,
planning holidays, collecting, music, comparative shopping, home
maintenance, job hunting, product information, networking, education,
- photographs, graphics, video, audio, and so forth, which can
add to a research report (these are not available through online services);
- expert advice via E-mail, reference queries, authors, Sources
Web site <www.sources.com>.
The advent of electronic searching (online or offline) has brought
about some dislocating changes in the world of information:
* Many publications have ceased to appear in paper form. For example,
the Canadian case digests published by Canada Law Book have been
available on the online service QL Systems for decades. The print
version was finally discontinued in 1994.
* Many online services have merged, to counteract the impact of
the Internet. For example, Globe Information Services (Globe and
Mail, Thomson) and Dow Jones Interactive (Wall Street Journal) have
combined as InfoGlobe-Dow Jones Interactive. Now Canadian data can
be easily obtained by international researchers. And Canadians have
access to expert analyses of the mutual fund market, corporate information
on publicly traded companies, securities price and volume data,
price charts and data. Lexis-Nexis combined with Butterworths Canada
to present over one billion documents online from over 20,000 sources.
* Many free information services have been launched, mainly by
governments, on the Internet. For example, the Canadian Centre for
Occupational Health and Safety has its information service on its
* Newly merged reference products are available in one bundle,
such as Electric Library Canada (Rogers MultiMedia), which has three
separate interfaces for libraries, consumers and businesses. Canadian
content includes all the Maclean Hunter publications, The Canadian
Encyclopedia, various dictionaries and quotation books, and the
Toronto Star. <www.elibrary.ca>
* Unfortunately for libraries, many government periodicals and
annual reports are no longer available in paper format. Some good
news from this action is that the legislative debates, while no
longer in daily paper form, are posted on the Internet more quickly
than paper publication once allowed, and they are also now fully
searchable word documents.
* Most of the Internet (and online information in general) is dominated
by the United States. English is the "lingua franca" of
the Internet, and spellings tend to be the American version. Mass
media Web sites have a heavy US bias, and because of the US government
attitude to Freedom of Information, many databases are available
for low or no cost: reverse phone lookups, reward money programs,
credit bureaux, federal license plate checks, US tax courts, home
price searches, drug test results, skiptracer databases, deadbeat
parent locator service, sex offender registries, environmental investigations,
national driver registry, bankruptcy courts, injury claims, casino
fraud, child abuser lists, patents and trademarks, missing persons,
military personnel records, physicians masterfile, various police
reports, genealogy searches.
Here in Canada, municipal assessments were put on the Internet
in Victoria, BC, and then almost immediately withdrawn because of
concerns of privacy. Given time, the privacy issues will be resolved.
For example, the National Archives of Canada put up veterans' records
from World War I, data on more than 630,000 Canadians -- data now
more than 82 years old and more readily available than it has ever been.
* It takes twice as long to find relevant documents on the free
Web as it does on pay-for online services. Studies have shown that
while the Web and the online services yielded the same number of
valuable documents, the Web also returned twice the number of useless
documents and also had problems with broken links. These slowed
down research time. Choosing the free Web or a pay-for service depends
on whether the researcher prefers to spend money or time.
* Studies have also shown that while searching Web sites is slightly
faster at finding information than searching databases and print
in a library, the library searches were more accurate. In the studies,
no real evaluation of the Web sites was made, in the interest of
speed, and this resulted in inaccuracies. Web sites can be used
for quick information on certain subjects and definitely for finding
updated information. But the library's collections can be just as
quick, and the information is more trustworthy.
Factor in the Internet's 24-hour accessibility and working from
home, and the bottom line is definitely skewed towards the Web sites.
Pluses for the libraries: they are better organized than Web sites
and older information is just as good as newer data if it doesn't
have to be current. Pluses for the Internet: it is more current
and it has more primary sources available from around the world.
* Research materials are only as good as the researcher. Lots of
people are attracted to the Internet and to Web sites because of
the glamour and the open accessibility. Yet they don't pay attention
to how search engines work. Web sites can be distracting, with music,
video, advertising, and other flashy effects. It is very easy to
get caught up and lose focus. Some caveats: not all Web sites are
created equal, and just because Web sites have a lot of information
doesn't mean they have the exact information that researchers want.
A little knowledge can be a dangerous thing, particularly if misinformation
Next time: more on search strategies and Web site evaluations...
Dean Tudor is Sources Informatics Consultant and a professor
of Journalism and Information Science at Ryerson University. He
can be reached at email@example.com.
Published in Sources,
Number 45, Winter 2000.
Sources, 489 College
Street, Suite 201, Toronto, ON M6G 1L9.
Phone: (416) 964-7799 FAX: (416) 964-8763