September/October 1993
Getting to know Archie, or finding files through Internet
This article, the sixth in THE INSTITUTE's e-mail series, looks at a powerful technique for
finding information, wherever it is stored, on the computer networks of the worldwide Internet. It
is assumed that you have an electronic mail account and a personal computer/modem or some
other means with which to log on [July/August, p.12].
Learning to use the technology takes time plus an investment in computer hardware and software.
But the payback is ample: higher productivity, possibly enhanced employability, and the
satisfaction of keeping up with new technology.
E-mail is the sending and receiving of either simple e-mail messages composed on-line or more
complex messages that have been uploaded after being written and reviewed off-line. Even word
processor or spreadsheet files can be sent by e-mail by means of encoding routines [March/April,
p. 12].
Beyond e-mail are other tools of electronic communication. You may, for example, log on to a
remote computer from your e-mail (or host) computer and copy files using the ftp command
[May/June, p. 12]. Or you may work on that remote computer using the telnet command; for
example, you might want to access a catalog program at a distant library to search for a
publication.
Given the thousands of computer networks and the millions of computers, there is both a fine
opportunity to access useful information and the very real challenge of finding it. This is where
Archie and its friends come in.
ARCHIE. Archie is a service that permits you to find files stored on computers equipped with ftp
servers and connected to the Internet. You define a string (which is a set of contiguous
characters). Archie matches your string with any similar strings that exist in these ftp directories
and returns the address of the computer and the complete file name.
There is an excellent description of Archie in chapter 9 of Ed Krol's book "The Whole Internet"
[July/August, p. 12]. Some of the U.S. Archie servers and their locations are:
- archie.rutgers.edu (NJ)
- archie.sura.net (MD)
- archie.unl.edu (NE)
- archie.ans.net (NY)
Some of the servers outside the United States are:
- archie.uqam.ca Canada
- archie.au Australia
- archie.switch.ch Switzerland
- archie.doc.ic.ac.uk United Kingdom
- archie.wide.ad.jp Japan
The very first Archie server was located at McGill University, in Montreal; unfortunately, it
is currently out of service. Archie servers regularly scan all the known ftp servers and copy their
current ftp directory listings. Archie also provides tools for searching these directory listings for
the files you want and sends you a list of the addresses of the computers where these files are
stored. All Archie servers have the same information, so you are asked to use the one closest to
you.
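The idea behind this database can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the way Archie itself is built: the server addresses, the file paths, and the function name find_files are invented for the example, and the comparison is a plain case-insensitive substring test against the file name.

```python
# Hypothetical copies of ftp directory listings, keyed by server address.
# The hosts and paths below are invented for illustration only.
LISTINGS = {
    "ftp.example.edu": ["/pub/net/archie-guide.txt", "/pub/mail/uuencode.c"],
    "ftp.example.org": ["/docs/Archie.README", "/tools/zip/unzip.tar"],
}


def find_files(search_string, listings=LISTINGS):
    """Return (server address, complete file name) pairs whose
    file name contains the search string, ignoring case."""
    wanted = search_string.lower()
    hits = []
    for host, paths in sorted(listings.items()):
        for path in paths:
            # Compare against the last path component, the file name proper.
            filename = path.rsplit("/", 1)[-1]
            if wanted in filename.lower():
                hits.append((host, path))
    return hits


for host, path in find_files("archie"):
    print(host, path)
```

Each hit pairs a server address with a complete file name, which is the information you need later for an ftp session.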
There are three ways to access Archie - by telnet, by e-mail, or by using the Archie command on
your host machine (if Archie has been installed). Archie via telnet is discussed next. Even if you
do not intend to use telnet access, read this section for some of the explanations you may need for
e-mail or command usage.
ARCHIE VIA TELNET. If you have telnet capability and want to access Archie interactively
(instead of sending an e-mail request), use the command "telnet archie.au" (please substitute the
address of the Archie server closest to you). If Archie has too many requests, you will get a
message indicating that you cannot connect to Archie. Try again later.
When you are connected, respond to the log-in prompt with "archie". You should then get the
prompt "archie>" and are ready to enter Archie commands. Some of these are:
- help              for more information
- quit              to leave Archie
- show search       display rule for matching string to filename
- set search x      set rule for matching string to filename, where:
    x=exact         for an exact match (example: set search exact)
    x=regex         for a match using the Unix regular expression rule
    x=sub           for matching a substring (case insensitive)
    x=subcase       for matching a substring (case sensitive)
- whatis qqq        list filenames with keywords that match qqq
- prog zzz          list servers with filenames matching zzz
- mail              send the result of last search to host computer
- mail xxx@yyy      send the result of last search to xxx@yyy instead of to the host
- servers           list all known Archie servers
- site sss          list all files on ftp server named sss
The essence of Archie is to search for a match between the string you define (for example, xxx)
and the files stored in the copies of the ftp directory listings for each computer. Setting the search
matching rule is the first step. Unix is case sensitive, so you have to decide whether you want to make
your search case-insensitive (so that, for example, Archie = archie = ARCHIE). The regular
expression rule in Unix means you can use characters with special meanings, such as a period to
match any single character.
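The four matching rules can be compared side by side with a short Python sketch. It is an approximation rather than Archie's own code: the function name matches is hypothetical, and Python's re module stands in for the Unix regular expression rule, whose syntax may differ in detail.

```python
import re


def matches(rule, search_string, filename):
    """Apply one of the four 'set search' rules to a single file name."""
    if rule == "exact":
        # The whole file name must equal the string, including case.
        return filename == search_string
    if rule == "regex":
        # Assumption: Python's regular expressions approximate the Unix rule.
        return re.search(search_string, filename) is not None
    if rule == "sub":
        # Substring match, case insensitive.
        return search_string.lower() in filename.lower()
    if rule == "subcase":
        # Substring match, case sensitive.
        return search_string in filename
    raise ValueError("unknown search rule: " + rule)


for rule in ("exact", "regex", "sub", "subcase"):
    print(rule, matches(rule, "archie", "Archie.README"))
```

With the string "archie" and the file name "Archie.README", only the sub rule reports a match, because the other three rules respect the capital A. A regular expression such as "^[Aa]rchie" would match under the regex rule.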
The second step is to choose between a one-step search of the filenames using prog or a two-step
process using whatis and then prog. In the two-step process, whatis finds all file names with
keywords that match your string. Then you inspect these file names and decide which you want.
The advantage of using whatis is that it compares your string with strings in keyword descriptors,
which tend to be more informative than the file names. Then you use prog to locate the computer
that has your desired file. Once you locate the file, you leave Archie and copy the file using the ftp
command.
-- Robert T.H. Alden
Robert T.H. (Bob) Alden is the chair of the IEEE E-mail Committee, and a
former IEEE vice president. He welcomes your input via
.