September/October 1993

Getting to know Archie, or finding files through Internet

This article, the sixth in THE INSTITUTE's e-mail series, looks at a powerful technique for finding information, wherever it is stored, on the computer networks of the worldwide Internet. It is assumed that you have an electronic mail account and a personal computer/modem or some other means with which to log on [July/August, p.12].

Learning to use the technology takes time plus an investment in computer hardware and software. But the payback is ample: higher productivity, possibly enhanced employability, and the satisfaction of keeping up with new technology.

E-mail is the sending and receiving of either simple e-mail messages composed on-line or more complex messages that have been uploaded after being written and reviewed off-line. Even word processor or spreadsheet files can be sent by e-mail by means of encoding routines [March/April, p. 12].

Beyond e-mail are other tools of electronic communication. You may, for example, log on to a remote computer from your e-mail (or host) computer and copy files using the ftp command [May/June, p. 12]. Or you may work on that remote computer using the telnet command; for example, you might want to access a catalog program at a distant library to search for a publication.

Given the thousands of computer networks and the millions of computers, there is both a fine opportunity to access useful information and the very real challenge of finding it. This is where Archie and its friends come in.

ARCHIE.   Archie is a service that permits you to find files stored on computers equipped with ftp servers and connected to the Internet. You define a string (a sequence of contiguous characters). Archie matches your string against the file names in these ftp directories and returns the address of each computer that holds a match, along with the complete file name.

There is an excellent description of Archie in chapter 9 of Ed Krol's book "The Whole Internet" [July/August, p. 12]. Some of the U.S. Archie servers and their locations are:

  • archie.rutgers.edu     (NJ)
  • archie.sura.net     (MD)
  • archie.unl.edu     (NE)
  • archie.ans.net     (NY)

Some of the servers outside the United States are:

  • archie.uqam.ca     Canada
  • archie.au     Australia
  • archie.switch.ch     Switzerland
  • archie.doc.ic.ac.uk     United Kingdom
  • archie.wide.ad.jp     Japan

The very first Archie server was located at McGill University, in Montreal; unfortunately, it is currently out of service. Archie servers regularly scan all the known ftp servers and copy their current ftp directory listings. Archie also provides tools for searching these directory listings for the files you want and sends you a list of the addresses of the computers where these files are stored. All Archie servers have the same information, so you are asked to use the one closest to you.
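The idea of searching copied directory listings can be illustrated with a minimal Python sketch. It is not how Archie itself is implemented; the host names, paths, and the build_index helper are all hypothetical, chosen only to show how a file name can be mapped back to the computers that store it.

```python
# Hypothetical data: these host names and paths are invented for illustration.
listings = {
    "ftp.example.edu": ["/pub/tools/zip.tar", "/pub/docs/README"],
    "ftp.example.org": ["/mirror/zip.tar", "/mirror/notes.txt"],
}

def build_index(listings):
    """Map each bare file name to the (host, full path) pairs where it appears."""
    index = {}
    for host, paths in listings.items():
        for path in paths:
            name = path.rsplit("/", 1)[-1]  # keep only the part after the last "/"
            index.setdefault(name, []).append((host, path))
    return index

index = build_index(listings)
hits = index.get("zip.tar", [])  # every computer that stores a file named zip.tar
```

Each entry in hits pairs a computer's address with the complete file name, which is the kind of answer the article says an Archie search returns.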

There are three ways to access Archie: by telnet, by e-mail, or by using the Archie command on your host machine (if Archie has been installed). Archie via telnet is discussed next. Even if you do not intend to use telnet access, read this section for some of the explanations you may need for e-mail or command usage.

ARCHIE VIA TELNET.   If you have telnet capability and want to access Archie interactively (instead of sending an e-mail request), use the command "telnet archie.au" (please substitute the address of the Archie server closest to you). If Archie has too many requests, you will get a message indicating that you cannot connect to Archie. Try again later.

When you are connected, respond to the log-in prompt with "archie". You should then get the prompt "archie>" and are ready to enter Archie commands. Some of these are:

  • help     for more information
  • quit     to leave Archie
  • show search     display rule for matching string to filename
  • set search x     set rule for matching string to filename, where:
      x=exact     for an exact match (example: set search exact)
      x=regex     for a match using the Unix regular expression rule
      x=sub     for matching a substring (case insensitive)
      x=subcase     for matching a substring (case sensitive)
  • whatis qqq     list filenames with keywords that match qqq
  • prog zzz     list servers with filenames matching zzz
  • mail     send the result of last search to host computer
  • mail xxx@yyy     send the result of last search to xxx@yyy instead of to the host
  • servers     list all known Archie servers
  • site sss     list all files on ftp server named sss

The essence of Archie is to search for a match between the string you define (for example, xxx) and the file names stored in the copies of the ftp directory listings for each computer. Setting the search matching rule is the first step. Unix is case sensitive, so you have to decide whether you want to make your search case insensitive (so that, for example, Archie, archie, and ARCHIE all match). The regular expression rule in Unix means you can use characters with special meanings; a period, for example, stands for any single character.
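To make the four matching rules concrete, here is a rough Python sketch. The matches function is hypothetical and only imitates the behavior described above; in particular, Python's re module stands in for the Unix regular expression rule, and its syntax differs in some details from what an Archie server accepts.

```python
import re

def matches(rule, pattern, filename):
    """Return True if filename matches pattern under the named search rule."""
    if rule == "exact":
        return filename == pattern                        # whole name, same case
    if rule == "regex":
        return re.search(pattern, filename) is not None   # special characters allowed
    if rule == "sub":
        return pattern.lower() in filename.lower()        # substring, case insensitive
    if rule == "subcase":
        return pattern in filename                        # substring, case sensitive
    raise ValueError("unknown search rule: " + rule)

# A few sample checks using an invented file name.
name = "xarchie-1.3.tar"
loose = matches("sub", "ARCHIE", name)       # case is ignored
strict = matches("subcase", "ARCHIE", name)  # case must agree
```

With these sample values, loose is True and strict is False, which is the practical difference between set search sub and set search subcase.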

The second step is to choose between a one-step search of the filenames using prog or a two-step process using whatis and then prog. In the two-step process, whatis finds all file names with keywords that match your string. Then you inspect these file names and decide which you want.
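The two-step process can be pictured with another small Python sketch. The whatis and prog functions below are hypothetical stand-ins for the Archie commands of the same names, and the descriptions, hosts, and paths are invented sample data rather than real server contents.

```python
# Hypothetical sample data; every name, host, and path here is invented.
DESCRIPTIONS = {
    "xarchie": "X11 client for the Archie service",
    "gzip": "file compression program",
}
LISTINGS = {
    "ftp.example.edu": ["/pub/x11/xarchie-1.3.tar", "/pub/gnu/gzip.tar"],
    "ftp.example.org": ["/mirror/xarchie-1.3.tar"],
}

def whatis(keyword):
    """Step 1: names whose keyword description contains the string."""
    key = keyword.lower()
    return sorted(n for n, d in DESCRIPTIONS.items() if key in d.lower())

def prog(name):
    """Step 2: (host, path) pairs whose file name contains the string."""
    key = name.lower()
    return [(host, path)
            for host, paths in LISTINGS.items()
            for path in paths
            if key in path.rsplit("/", 1)[-1].lower()]

candidates = whatis("archie")  # inspect these names and pick one
sites = prog(candidates[0])    # then find the computers that hold it
```

Here the description search turns up a name that a search on file names alone might not have suggested, and the second step reports where copies of that file are stored.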

The advantage of using whatis is that it compares your string with strings in keyword descriptors, which tend to be more informative than the file names. Then you use prog to locate the computer that has your desired file. Once you locate the file, you leave Archie and copy the file using the ftp command. -- Robert T.H. Alden


Robert T.H. (Bob) Alden is the chair of the IEEE E-mail Committee, and a former IEEE vice president.   He welcomes your input via .

by Bob Alden