The guide by Mark Burgess principally covers the style of Unix use most prevalent into the early 1990s, and so provides only thin coverage of the use of more recent developments. In particular:
Disregarding these specifics, however, it's still a worthy read.
Copyright (C) 1996/7 Mark Burgess Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided also that the section entitled "GNU General Public License" is included exactly as in the original, and provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that the section entitled "GNU General Public License" may be included in a translation approved by the author instead of in the original English.
This is a revised version of the UNIX compendium which is available in printed form and online via the WWW and info hypertext readers. It forms the basis for a one or two semester course in UNIX. The most up-to-date version of this manual can be found at
http://www.iu.hioslo.no/~mark/unix/unix.html.
It is a reference guide which contains enough to help you to find what you need from other sources. It is not (and probably can never be) a complete and self-contained work. Certain topics are covered in more detail than others. Some topics are included for future reference and are not intended to be part of an introductory course, but will probably be useful later. The chapter on X11 programming has been deleted for the time being.
Comments to Mark.Burgess@iu.hioslo.no
Oslo, June 1996
If you are coming to unix for the first time, from a Windows or Macintosh environment, be prepared for a rather different culture from the one you are used to. Unix is not about `products' and off-the-shelf software; it is about open standards, free software and the ability to change just about everything.
You should approach Unix the way you should approach any new system: with an open mind. The journey begins...
In this manual the word "host" is used to refer to a single computer system -- i.e. a single machine which has a name termed its "hostname".
Unix is one of the most important operating systems in use today, perhaps even the most important. Since its invention around the beginning of the 1970s it has been an object of continual research and development. UNIX is not popular because it is the best operating system one could imagine, but because it is an extremely flexible system which is easy to extend and modify. It is an ideal platform for developing new ideas.
Much of the success of UNIX may be attributed to the rapid pace of its
development (a development to which all of its users have been able to
contribute), its efficiency at running programs, and the many powerful
tools which have been written for it over the years, such as the C
programming language, make, the shell, lex, yacc and
many others. UNIX was written by programmers for programmers. It is
popular in situations where a lot of computing power is required and for
database applications, where timesharing is critical. In contrast to
some operating systems, UNIX performs equally well on large scale
computers (with many processors) and small computers which fit in your
suitcase!
All of the basic mechanisms required of a multi-user operating system are present in UNIX. During the last few years it has become ever more popular and has formed the basis of newer, though less mature, systems like NT. One reason for this is that computers have now become powerful enough to run UNIX effectively. UNIX places burdens on the resources of a computer, since it expects to be able to run potentially many programs simultaneously.
If you are coming to UNIX from DOS you may well be used to using applications software or helpful interactive utilities to solve every problem. UNIX is not usually like this: the operating system has much greater functionality and provides the possibilities for making your own, so it is less common to find applications software which implements the same things. UNIX has long been in the hands of academics who are used to making their own applications or writing their own programs, whereas the DOS world has been driven by businesses who are willing to spend money on software. For that reason commercial UNIX software is often very expensive and therefore not available at this college. On the other hand, the flexibility of UNIX means that it is easy to write programs and it is possible to fetch gigabytes of free software from the internet to suit your needs. It may not look like what you are used to on your PC, but then you have to remember that UNIX users are a different kind of animal altogether.
Like all operating systems, UNIX has many faults. The biggest problem for any operating system is that it evolves without being redesigned. Operating systems evolve as more and more patches and hacks are applied to solve day-to-day problems. The result is either a mess which works somehow (like UNIX) or a blank refusal to change (like DOS or Macintosh). From a practical perspective, Unix is important and successful because it is a multi-process system which
Unix has some problems: it is old, and it contains a lot of rubbish which no one ever bothered to throw away. Although it develops quickly (at light speed compared to either DOS or Macintosh) the user interface has been the slowest thing to change. Unix is not user-friendly for beginners; it is user-friendly for advanced users: it is made for users who know about computing. It sometimes makes simple things difficult, but above all it makes things possible!
The aim of this introduction is to
To accomplish this task, we must first learn something about the shell (the way in which UNIX starts programs). Later we shall learn how to solve more complex problems using Perl and C. Each of these is a language which can be used to put UNIX to work. We must also learn when to use which tool, so that we do not waste time and effort. Typical uses for these different interfaces are
Much of UNIX's recent popularity has been a result of its networking abilities: unix is the backbone of the internet. No other widely available system could keep the internet alive today.
Once you have mastered the unix interface and philosophy you will find that i) the PC and Macintosh window environments seem to be easy to use, but simplistic and primitive by comparison; ii) UNIX is far from being the perfect operating system--it has a whole different set of problems and flaws.
The operating system of the future will not be UNIX as we see it today, nor will it be DOS or Macintosh, but one thing is for certain: it will owe a lot to the UNIX operating system and will contain many of the tools and mechanisms we shall describe below.
Unix is not a single operating system. It has branched out in many different directions since it was introduced by AT&T. The most important `fork()' in its history happened early on when the University of California, Berkeley created the BSD (Berkeley Software Distribution), adding network support and the C-shell.
Here are some of the most common implementations of unix.
This programming guide is something between a user manual and a tutorial. The information contained here should be sufficient to get you started with the unix system, but it is far from complete.
To use this programming guide, you will need to work through the basics from each chapter. You will find that there is much more information here than you need straight away, so try not to be overwhelmed by the amount of material. Use the contents and the indices at the back to find the information you need. If you are following a one-semester UNIX course, you should probably concentrate on the following:
The only way to learn UNIX is to sit down and try it. As with any new thing, it is a pain to get started, but once you are started, you will probably come to agree that UNIX contains a wealth of possibilities, perhaps more than you had ever thought was possible or useful!
One of the advantages of the UNIX system is that the entire UNIX manual is available on-line. You should get used to looking for information in the online manual pages. For instance, if you do not remember how to create a new directory, you could do the following:
nexus% man -k dir
dir       ls (1)       - list contents of directories
dirname   dirname (1)  - strip non-directory suffix from file name
dirs      bash (1)     - bash built-in commands, see bash(1)
find      find (1)     - search for files in a directory hierarchy
ls        ls (1)       - list contents of directories
mkdir     mkdir (1)    - make directories
pwd       pwd (1)      - print name of current/working directory
rmdir     rmdir (1)    - remove empty directories
The `man -k' command looks for a keyword in the manual and lists all the references it finds. The command `apropos' is completely equivalent to `man -k'. Having discovered that the command to create a directory is `mkdir' you can now look up the specific manual page on `mkdir' to find out how to use it:
man mkdir
Some but not all of the UNIX commands also have a help option which is activated with the `-h' or `--help' command-line option.
dax% mkdir --help
Usage: mkdir [OPTION] DIRECTORY...
-p, --parents no error if existing, make parent directories as needed
-m, --mode=MODE set permission mode (as in chmod), not 0777 - umask
--help display this help and exit
--version output version information and exit
dax%
There are some things that you should never do in UNIX. Some of these will cause you more serious problems than others. You can make your own list as you discover more.
Once a file has been deleted with rm it is impossible
to recover it! Don't use wildcards with rm without thinking
quite carefully about what you are doing! It has happened to very many
users throughout the history of UNIX that one tries to type
rm *~
but instead, by a slip of the hand, one writes
rm * ~
Unix then takes these wildcards in turn, so that the first command is
rm *
which deletes all of your files! BE CAREFUL!
Do not give a program of your own the name test. There is a UNIX command
which is already called test and chances are that when you try to
run your program you will start the UNIX command instead. This
can cause a lot of confusion because the UNIX command doesn't seem
to do very much at all!
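One cautious habit is to preview what a wildcard expands to with echo before handing the same pattern to rm. The following is only a sketch, written in Bourne-shell syntax; the scratch directory and the file names are invented for illustration, and mktemp is assumed to be available.

```shell
#!/bin/sh
# Work in a throwaway directory so that nothing real is at risk.
# (Directory and file names are made up for this sketch.)
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch notes.txt notes.txt~ prog.c prog.c~

# Preview: echo shows exactly the command line that rm would receive.
preview=$(echo rm *~)
echo "$preview"

# Only when the preview looks right do we run the real command.
rm *~
remaining=$(echo *)
echo "left: $remaining"

# Tidy up the scratch directory.
cd / && rm -r "$scratch"
```

Had the pattern been mistyped as `* ~', the preview line would have listed every file in the directory, which is a clear warning before anything is lost.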
The core of unix is the library of functions (written in C) which access the system. Everything you do on a unix system goes through this set of functions. However, you can choose your own interface to these library functions. Unix has very many different interfaces to its libraries in the form of languages and command interpreters.
You can use the functions directly in C, or you can use command programs like `ls', `cd' etc. These commands just provide a simple user interface to the C calls. You can also use a variety of `script' languages: C-shell, Bourne shell, Perl, Tcl, scheme. You choose the interface which solves your problem most easily.
With the exception of a few simple commands which are built into the command interpreter (shell), all unix commands and programs consist of executable files. In other words, there is a separate executable file for each command. This makes it extremely simple to add new commands to the system. One simply makes a program with the desired name and places it in the appropriate directory.
Unix commands live in special directories
(usually called bin for binary files). The location of
these directories is recorded in a variable called path or
PATH which is used by the system to search for
binaries. We shall return to this in more detail in later chapters.
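As a rough illustration of how simple it is to add a command, the sketch below creates a small executable file and places its directory on the search path. It uses Bourne-shell syntax; the command name hello and the temporary directory are hypothetical, and mktemp is assumed to be available.

```shell
#!/bin/sh
# A private directory for our own commands; mktemp picks the name here.
mybin=$(mktemp -d)

# Create a tiny program called "hello" (a made-up command name).
cat > "$mybin/hello" <<'EOF'
#!/bin/sh
echo "hello from my own command"
EOF
chmod +x "$mybin/hello"     # the file must be executable

# Put the directory at the front of the search path (Bourne-shell syntax).
PATH="$mybin:$PATH"
export PATH

# The shell now finds the new command by name alone.
greeting=$(hello)
echo "$greeting"

rm -r "$mybin"
```

In the C-shell the equivalent step would be to add the directory to the path variable instead of PATH.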
Since users cannot command the kernel directly, UNIX has a command language known as the shell. The word shell implies a layer around the kernel. A shell is a user interface, or command interpreter.
There are two main versions of the shell, plus a number of enhancements.
The program tcsh is a public-domain enhancement of the csh and is
in common use. Two improved versions of the Bourne shell also exist:
ksh, the Korn shell and bash, the Bourne-again shell.
Although the shells are mainly tools for typing in commands (which are executable files to be loaded and run), they contain features such as aliases, a command history, wildcard-expansions and job control functions which provide a comfortable user environment.
Most of the unix kernel and daemons are written in the C programming
language (1). Calls to the kernel and to
services are made through functions in the standard C library. The
commands like chmod, mkdir and cd are all C
functions. The binary files of the same name /bin/chmod,
/bin/mkdir etc. are just trivial "wrapper" programs for these C
functions.
Until Solaris 2, the C compiler was a standard part of the UNIX operating system; thus C is the most natural language for programming in a UNIX environment. Some tools are provided for C programmers:
Unix has three logical streams or files which are always open and are available to any program.
The names stdin, stdout and stderr are part of the C standard I/O library and are declared as
pointers of type FILE *.
#include <stdio.h>

/* FILE *stdin, *stdout, *stderr; */

fprintf(stderr,"This is an error message!\n");
The names are `logical' in the sense that they do not refer to a particular device, or a particular place for information to come from or go. Their role is analogous to the `.' and `..' directories in the filesystem. Programs can write to these files without worrying about where the information comes from or goes to. The user can personally define these places by redirecting standard I/O. This is discussed in the next chapter.
A separate stream is kept for error messages so that error output does not get mixed up with a program's intended output.
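The benefit of a separate error stream can be seen from the shell by sending the two outputs to different places. This is a minimal sketch in Bourne-shell syntax; the function report and the file names are invented for the example.

```shell
#!/bin/sh
# report is a made-up function which writes to both output streams.
report() {
    echo "result: 42"              # intended output, goes to stdout
    echo "warning: low disk" >&2   # diagnostic, goes to stderr
}

scratch=$(mktemp -d)

# Send each stream to its own file.
report > "$scratch/out.txt" 2> "$scratch/err.txt"

out=$(cat "$scratch/out.txt")
err=$(cat "$scratch/err.txt")
echo "stdout held: $out"
echo "stderr held: $err"

rm -r "$scratch"
```

Because the warning never enters out.txt, another program could read that file without having to filter out diagnostics.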
When logged onto a UNIX system directly, the user whose name is
root has unlimited access to the files on the system. root
can also become any other user without having to give a
password. root is reserved for the system administrator or
trusted users.
Certain commands are forbidden to normal users. For example, a regular
user should not be able to halt the system, or change the ownership
of files (see next paragraph). These things are reserved for the
root or superuser.
In a networked environment, root has no automatic authority on
remote machines. This is to prevent the system administrator of one
machine in Canada from being able to edit files on another in China. He
or she must log in directly and supply a password in order to gain
access privileges. On a network where files are often accessible in
principle to anyone, the username root gets mapped to the user
nobody, who has no rights at all.
Unix has a hierarchical filesystem, which makes use of directories and sub-directories to form a tree. The root of the tree is called the root filesystem or `/'. Although the details of where every file is located differ for different versions of unix, some basic features are the same. The main sub-directories of the root directory together with the most important file are shown in the figure. Their contents are as follows.
mknod. Logical
devices are UNIX's official entry points for writing to devices. For instance,
/dev/console is a route to the system console, while /dev/kmem
is a route for reading kernel memory. Device nodes enable devices to be
treated as though they were files.
/home by some convention decided by the system administrator.
/usr/spool contains spool queues and system
data. /var/spool and /var/adm etc are used for holding
queues and system log files.
Every unix directory contains two `virtual' directories marked by a single dot and two dots.
ls -a
. ..
The single dot represents the directory one is already in (the current directory). The double dots mean the directory one level up the tree from the current location. Thus, if one writes
cd /usr/local
cd ..
the final directory is /usr. The single dot is very useful in
C programming if one wishes to read `the current directory'. Since
this is always called `.' there is no need to keep track of what the
current directory really is.
`.' and `..' are `hard links' to the true directories.
A symbolic link is a pointer or an alias to another file. The command
ln -s fromfile /other/directory/tolink
makes the file fromfile appear to exist at /other/directory/tolink
simultaneously. The file is not copied, it merely appears to be a
part of the file tree in two places. Symbolic links can be made to
both files and directories.
A symbolic link is just a small file which contains the name of the real
file one is interested in. It cannot be opened like an ordinary file,
but may be read with the C call readlink(). See section lstat and readlink.
If we remove the file a symbolic link
points to, the link remains -- it just points nowhere.
A hard link is an additional directory entry which refers to the same inode as the original file name, and is in every way equivalent to it. If a file is pointed to by a hard link, its contents are not removed until the link is removed too. If a file has n hard links, all of them must be removed before the file itself is removed. The number of hard links to a file is stored in the filesystem index node (inode) for the file.
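The difference between the two kinds of link can be tried out safely in a scratch directory. The sketch below uses Bourne-shell syntax with invented file names; it assumes that ls -i prints the inode number as the first field, as it does on the systems we know of.

```shell
#!/bin/sh
scratch=$(mktemp -d)
cd "$scratch" || exit 1

echo "some data" > original
ln -s original soft      # symbolic link: a small file holding the name "original"
ln original hard         # hard link: a second name for the same inode

# Compare inode numbers of the two names.
inode_a=$(ls -i original | awk '{print $1}')
inode_b=$(ls -i hard | awk '{print $1}')
if [ "$inode_a" = "$inode_b" ]; then same_inode=yes; else same_inode=no; fi

rm original              # remove the first name

# The data survives under the hard link...
hard_data=$(cat hard)
# ...but the symbolic link now points nowhere.
if [ -e soft ]; then soft_state=valid; else soft_state=dangling; fi
# -h tests the link itself, whether or not its target exists.
if [ -h soft ]; then soft_exists=yes; else soft_exists=no; fi

echo "same inode: $same_inode"
echo "hard link still reads: $hard_data"
echo "symbolic link is: $soft_state (link file present: $soft_exists)"

cd / && rm -r "$scratch"
```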
If you have never met unix, or another multiuser system before, then you might find the idea daunting. There are several things you should know.
Each time you use unix you must log on to the system by typing a username and a password. Your login name is sometimes called an `account' because some unix systems implement strict quotas for computer resources which have to be paid for with real money(2).
login: mark
password:
Once you have typed in your password, you are `logged on'. What happens then depends on what kind of system you are logged onto and how. If you have a colour monitor and keyboard in front of you, with a graphical user interface, you will see a number of windows appear, perhaps a menu bar. You then use a mouse and keyboard just like any other system.
This is not the only way to log onto unix. You can also log in
remotely, from another machine, using the telnet or rlogin
programs. If you use these programs, you will normally only get a text
or command line interface (though graphics can still be arranged).
Once you have logged in, a short message will be printed (called Message of the Day or motd) and you will see the C-shell prompt: the name of the host you are logged onto followed by a percent sign, e.g.
SunOS Release 5.5 Version Generic [UNIX(R) System V Release 4.0]
Copyright (c) 1983-1995, Sun Microsystems, Inc.
Please report problems to sysadm@iu.hioslo.no
dax%
Remember that every unix machine is a separate entity: it is not like logging onto a PC system where you log onto the `network' i.e. the PC file server. Every unix machine is a server. The network, in unix-land, has lots of players.
The first thing you should do once you have logged on is to set a reliable password. A poor password might be okay on a PC which is not attached to a large network, but once you are attached to the internet, you have to remember that the whole world will be trying to crack your password. Don't think that no one will bother: some people really have nothing better to do. A password should not contain any word that could be in a list of words (in any language), or be a simple concatenation of a word and a number (e.g. mark123). It takes seconds to crack such a password. Choose instead something which is easy to remember. Feel free to use the PIN from your bank card in your password! This will leave you with fewer things to remember (e.g. Ma9876rk). Passwords can be up to eight characters long.
Some sites allow you to change your password anywhere. Other sites require you to log onto a special machine to change your password:
dax% passwd
Change your password on host nexus
You cannot change it here
dax% rlogin nexus
password: ******
nexus% passwd
Changing password for mark
Enter login password: ********
Enter new password: ********
Reenter new passwd: ********
You will be prompted for your old password and your new password twice. If your network is large, it might take the system up to an hour or two to register the change in your password, so don't forget the old one right away!
Unix window systems assume a mouse with three buttons. On some PCs running GNU/Linux or some other PC unix, there are only two, but the middle mouse button can be simulated by pressing both mouse buttons simultaneously. The mouse buttons have the following general functions. They may also have additional functions in special software.
On a left-handed system right and left are reversed.
Reading electronic mail on unix is just like any other system, but there are many programs to choose from. There are very old programs from the seventies such as
and there are fully graphical mail programs such as
tkrat mailtool
Choose the program you like best. Not all of the programs support modern
multimedia extensions because of their age. Some programs like
tkrat have immediate mail notification alerts. To start a mail
program you just type its name. If you have an icon-bar, you can click
on the mail-icon.
Inexperienced computer users often prefer to use file-manager
programs to avoid typing anything. With a mouse you can click
your way through directories and files without having to type
anything (e.g. the fmgr or tkdesk programs).
More experienced users generally find this to be slow and
tedious after a while and prefer to use written commands.
Unix has many short cuts and keyboard features which make
typed commands extremely fast and much more powerful than
use of the mouse.
If you come from a DOS environment, the unix commands can be a little strange. Because they stem from an era when keyboards had to be hit with hammer force, and machines were very slow, the command names are generally as short as possible, so they seem pretty cryptic. Some familiar ones which DOS borrowed from unix include
cd
mkdir
which change to a new directory and make a new directory respectively. To list the files in the current directory you use,
ls
To rename a file, you `move' it:
mv old-name new-name
Text editing is one of the things which people spend most time doing on any computer. It is important to distinguish text editing from word processing. On a PC or Macintosh, you are perhaps used to Word or WordPerfect for writing documents.
Unix has a Word-like program called lyx, but for the most part
Unix users do not use word processors. It is more common in the unix
community to write all documents, regardless of whether they are
letters, books or computer programs, using a non-formatting text
editor. (Unix word processors like FrameMaker do exist, but they
are very expensive. A version of MS-Word also exists for some unices.)
Once you have written a document in a normal text editor, you call up
a text formatter to make it pretty. You might think this strange, but
the truth of the matter is that this two-stage process gives you the
most power and flexibility--and that is what most unix folks like.
For writing programs, or anything else, you edit a file by typing:
emacs myfile
emacs is one of dozens of text-editors. It is not the simplest or
most intuitive, but it is the most powerful and if you are going to
spend time learning an editor, it wouldn't do any harm to make it this
one. You could also click on emacs' icon if you are relying on a window
system. Emacs is almost certainly the most powerful text editor that
exists on any system. It is not a word-processor and it is not for formatting
printed documents, but it can be linked to almost any other program in
order to format and print text. It contains a powerful programming
language and has many intelligent features. We shall not go into the
details of document formatting in this book, but only mention that
programs like troff and TeX or LaTeX are used for
this purpose to obtain typeset-quality printing. Text formatting is an
area where Unix folks do things differently to PC folks.
Unix began as a timesharing mainframe system in the seventies, when the only terminals available were text based teletype terminals or tty-s. Later, the Massachusetts Institute of Technology (MIT) developed the X-windows interface which is now a standard across UNIX platforms. Because of this history, the X-window system works as a front end to the standard UNIX shell and interface, so to understand the user environment we must first understand the shell.
A shell is a command interpreter. In the early days of unix, a shell was the only way of issuing commands to the system. Nowadays many window-based application programs provide menus and buttons to perform simple commands, but the UNIX shell remains the most powerful and flexible way of interacting with the system.
After logging in and entering a password, the unix process init starts a shell for the user logging in. Unix has several different kinds of shell to choose from, so that each user can pick his/her favourite command interface. The type of shell which the system starts at login is determined by the user's entry in the passwd database. On most systems, the standard login shell is a variant of the C-shell.
Shells provide facilities and commands which
The shell does not contain any more specific functions--all other commands, such as programs which list files or create directories etc., are executable programs which are independent of the shell. When you type `ls', the shell looks for the executable file called `ls' in a special list of directories called the command path and attempts to start this program. This allows such programs to be developed and replaced independently of the actual command interpreter.
Each shell which is started can be customized and configured by editing a setup file. For the C-shell and its variants this file is called `.cshrc', and for the Bourne shell and its variants it is called `.profile'. (Note that files which begin with leading dots are not normally visible with the `ls' command. Use `ls -a' to view these.) Any commands which are placed in these files are interpreted by the shell before the first command prompt is issued. These files are typically used to define a command search path and terminal characteristics.
On each new command line you can use the cursor keys to edit the line. The up-arrow browses back through earlier commands. CTRL-a takes you to the start of the line. CTRL-e takes you to the end of the line. The TAB key can be used to save typing with the `completion' facility. See section Command/filename completion.
Shell commands are commands like cp, mv,
passwd, cat, more, less, cc,
grep, ps etc.
Very few commands are actually built into the shell command line
interpreter, in the way that they are in DOS -- commands are mostly
programs which exist as files. When we type a command, the shell
searches for a program with the same name and tries to execute it. The
file must be executable, or a Command not found error will
result. To see what actually happens when you type a command like
gcc, try typing in the following C-shell commands directly
into a C-shell. (We shall discuss these commands soon.)
foreach dir ( $path ) # for every directory in the list path
if ( -x $dir/gcc ) then # if the file is executable
echo Found $dir/gcc # Print message found!
break # break out of loop
else
echo Searching $dir/gcc
endif
end
The output of this command is something like
Searching /usr/lang/gcc
Searching /usr/openwin/bin/gcc
Searching /usr/openwin/bin/xview/gcc
Searching /physics/lib/framemaker/bin/gcc
Searching /physics/motif/bin/gcc
Searching /physics/mutils/bin/gcc
Searching /physics/common/scripts/gcc
Found /physics/bin/gcc
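For comparison, roughly the same search can be written in Bourne-shell syntax, where the search path is the colon-separated variable PATH. This is a sketch only: find_in_path is a hypothetical helper name, and the example looks for sh because that program can be expected to exist on any system.

```shell
#!/bin/sh
# find_in_path is a made-up helper, not a standard command.
find_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:                          # PATH entries are separated by colons
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.     # an empty entry means the current directory
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            IFS=$old_ifs
            echo "$dir/$name"      # report the first match and stop
            return 0
        fi
    done
    IFS=$old_ifs
    return 1                       # nothing found in any directory
}

found=$(find_in_path sh)
echo "Found $found"
```

Like the C-shell loop, the function stops at the first match, so later directories holding a program of the same name are never reported.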
If you type
echo $path
you will see the entire list of directories which are searched by the shell. If we had left out the `break' command, we might have discovered that UNIX often has several programs with the same name, in different directories! For example,
/bin/mail
/usr/ucb/mail
/bin/Mail
/bin/make
/usr/local/bin/make
Also, different versions of unix have different conventions for placing the
commands in directories, so the path list needs to be different for
different types of unix machine. In the C-shell a few basic commands
like cd and kill are built into the shell (as in DOS).
You can find out which directory a command is stored in using the
which
command. For example
nexus% which cd
cd: shell built-in command.
nexus% which cp
/bin/cp
nexus%
which only searches the directories in $path and quits after
the first match, so if there are several commands with the same name,
you will only see the first of them using which.
Finally, in the C-shell, the which command is built in. In the
Bourne shell it is a program:
nexus% which which
which: shell built-in command.
nexus% sh
$ which which
/bin/which
$ exit
nexus%
Take a look at the script /usr/ucb/which. It is a script written
in the C-shell.
Environment variables are variables which the shell keeps. They are
normally used to configure the behaviour of utility programs like
lpr (which sends a file to the printer) and mail (which
reads and sends mail) so that special options do not have to be typed in
every time you run these programs.
Any program can read these variables to find out how you have configured your working environment. We shall meet these variables frequently. Here are some important variables
PATH             # The search path for shell commands (sh)
TERM             # The terminal type (sh and csh)
DISPLAY          # X11 - the name of your display
LD_LIBRARY_PATH  # Path to search for object and shared libraries
HOST             # Name of this unix host
PRINTER          # Default printer (lpr)
HOME             # The path to your home directory (sh)

path             # The search path for shell commands (csh)
term             # The terminal type (csh)
noclobber        # See below under redirection
prompt           # The default prompt for csh
home             # The path to your home directory (csh)
These variables fall into two groups. Traditionally the first group always have names in uppercase letters and are called environment variables, whereas variables in the second group have names with lowercase letters and are called shell variables-- but this is only a convention. The uppercase variables are global variables, whereas the lower case variables are local variables. Local variables are not defined for programs or sub-shells started by the current shell, while global variables are inherited by all sub-shells.
The Bourne-shell and the C-shell use these conventions differently and not
always consistently. You will see how to define these below. For now
you just have to know that you can use the following commands from the
C-shell to list these variables. The command env can be used
in either C-shell or Bourne shell to see all of the defined environment
variables.
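The difference between local and global variables can be observed by starting a sub-shell and asking it what it can see. The sketch below uses Bourne-shell syntax, in which a variable becomes global when it is exported; both variable names are invented for the example.

```shell
#!/bin/sh
# Both variable names are made up for this sketch.
local_only="not exported"          # known only to this shell
GLOBAL_ONE="exported"
export GLOBAL_ONE                  # placed in the environment

# A child shell inherits exported variables only.
# ${var-unset} substitutes the word "unset" when var is not defined.
child_sees_local=$(sh -c 'echo "${local_only-unset}"')
child_sees_global=$(sh -c 'echo "${GLOBAL_ONE-unset}"')

echo "child sees local_only: $child_sees_local"
echo "child sees GLOBAL_ONE: $child_sees_global"
```

The single quotes matter here: they stop the parent shell from substituting the variables itself, so that the answer really comes from the child.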
Sometimes you want to be able to refer to several files in one go. For
instance, you might want to copy all files ending in `.c' to a new
directory. To do this one uses wildcards. Wildcards are characters
like * ? which stand for any character or group of characters.
In card games the joker is a `wild card' which can be substituted for
any other card. Use of wildcards is also called filename substitution
in the unix manuals, in the sections on sh and csh.
The wildcard symbols are,
ls /etc/rc.????
ls /etc/rc.*
ls [abc].C
Here are some examples and explanations.
ls /etc/rc.???? lists the files in /etc whose names begin with rc. and are 7 characters long.
It is important to understand that the shell expands wildcards. When
you type a command, the program is not invoked with an argument that
contains * or ?. The shell expands the special characters
first and invokes commands with the entire list of files which
match the patterns. The programs never see the wildcard characters, only
the list of files they stand for. To see this in action, you can type
echo /etc/rc*
which gives
/etc/rc /etc/rc.boot /etc/rc.ip /etc/rc.local /etc/rc.local% /etc/rc.local~ /etc/rc.single /etc/rc~
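To convince yourself that it is the shell and not the program which does the expanding, you can hand the same pattern to a command which merely reports its arguments. This is a sketch in Bourne-shell syntax; show_args and the file names are invented stand-ins.

```shell
#!/bin/sh
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch rc rc.boot rc.local rc.single    # made-up file names

# show_args stands in for any program: it reports what it was given.
show_args() {
    echo "$# arguments: $*"
}

expanded=$(show_args rc.*)      # the shell expands the pattern first
quoted=$(show_args 'rc.*')      # quoting suppresses the expansion

echo "$expanded"
echo "$quoted"

cd / && rm -r "$scratch"
```

Only in the quoted case does the command ever see the asterisk; otherwise it receives three ordinary file names.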
All shell commands are invoked with a command line of this form. This has an important corollary. It means that multiple renaming cannot work!
Unix files are renamed using the mv command. In many microcomputer
operating systems one can write
rename *.x *.y
which changes the file extension of all files ending in `.x' to the same name with a `.y' extension. This cannot work in unix, because the shell expands everything before passing the arguments to the command.
The local shell variable noglob switches off wildcard expansion in
the C shell, but you still cannot rename multiple files using mv.
Some free-software programs make this possible.
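Multiple renaming can also be done with a short loop, since the shell can hand the matching files to mv one at a time. The following is a sketch in Bourne shell syntax, not the only way to do it; the file names are invented and `mktemp -d' is assumed to exist.

```shell
# Work in a scratch directory with two sample files.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch one.x two.x

# The shell expands *.x to the list of files; the loop renames each in turn.
for f in *.x; do
    [ -e "$f" ] || continue      # skip the literal pattern if nothing matched
    mv "$f" "${f%.x}.y"          # ${f%.x} strips the trailing .x from the name
done

renamed=$(echo *)
echo "$renamed"
```

Each file is renamed by a separate call to mv, so the expansion of `*.x' causes no trouble.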
The wildcards belong to the shell. They are used for matching filenames. UNIX has a more general and widely used mechanism for matching strings, this is through regular expressions.
Regular expressions are used by the egrep utility, by text editors
like ed, vi and emacs, and by sed and awk.
They are also used in the C programming language
for matching input, as well as in the Perl programming language and the lex
tokenizer. Here are some examples using the egrep command
which print lines from the file /etc/rc which match certain
conditions. The construction `(...)' is part of egrep. Everything
in between these symbols is a regular expression. Notice that the
special shell symbols ! * & have to be preceded with a backslash
\ in order to prevent the shell from expanding them!
# Print all lines beginning with a comment #
egrep '(^#)' /etc/rc
# Print all lines which DON'T begin with #
egrep '(^[^#])' /etc/rc
# Print all lines beginning with e, f or g.
egrep '(^[efg])' /etc/rc
# Print all lines beginning with uppercase
egrep '(^[A-Z])' /etc/rc
# Print all lines NOT beginning with uppercase
egrep '(^[^A-Z])' /etc/rc
# Print all lines containing ! * &
egrep '([\!\*\&])' /etc/rc
# All lines containing ! * & but not starting #
egrep '([^#][\!\*\&])' /etc/rc
Regular expressions are made up of the following `atoms'.
These examples assume that the file `/etc/rc' exists.
If it doesn't exist on the machine you are using, try to
find the equivalent by, for instance, replacing
/etc/rc with /etc/rc* which will try to
find a match beginning with the rc.
You can find a complete list in the unix manual pages. The square brackets above are used to define a class of characters to be matched. Here are some examples,
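The character classes can be tried out on a small file of your own. This sketch uses `grep -E', which on most current systems is the same thing as egrep, and forces the C locale so that the range `[A-Z]' means exactly the uppercase ASCII letters. The contents of the sample file are invented.

```shell
# Build a small sample file to search.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# a comment line
echo starting
Foo=bar
go now
EOF

# [efg] matches any one of e, f or g; ^ anchors the match to the line start.
efg_count=$(LC_ALL=C grep -E -c '^[efg]' "$sample")

# [A-Z] is a range: lines beginning with an uppercase letter.
upper_line=$(LC_ALL=C grep -E '^[A-Z]' "$sample")

# A ^ just inside the brackets negates the class: lines NOT starting with #.
not_comment=$(LC_ALL=C grep -E -c '^[^#]' "$sample")

echo "lines starting with e, f or g: $efg_count"
echo "line starting with uppercase:  $upper_line"
echo "lines not starting with #:     $not_comment"
```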
The backwards apostrophes `...` can be used in all shells and also in the programming language Perl. When these are encountered in a string the shell tries to execute the command inside the quotes and replace the quoted expression by the result of that command. For example:
unix% echo "This system's kernel type is `/bin/file /vmunix`"
This system's kernel type is /vmunix: sparc executable not stripped
unix% foreach file ( `ls /etc/rc*` )
? echo I found a config file $file
? echo Its type is `/bin/file $file`
? end
I found a config file /etc/rc
Its type is /etc/rc: executable shell script
I found a config file /etc/rc.boot
Its type is /etc/rc.boot: executable shell script
I found a config file /etc/rc.ip
Its type is /etc/rc.ip: executable shell script
I found a config file /etc/rc.local
Its type is /etc/rc.local: ascii text
I found a config file /etc/rc.local~
Its type is /etc/rc.local~: ascii text
I found a config file /etc/rc.single
Its type is /etc/rc.single: executable shell script
I found a config file /etc/rc~
Its type is /etc/rc~: executable shell script
This is how we insert the result of a shell command into a text string or variable.
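Here is a smaller example that does not depend on any particular system files. It is a sketch in Bourne shell syntax. Note that the second form, `$(...)', belongs to the Bourne family of shells (sh, ksh, bash) and is not understood by the C-shell.

```shell
# `...` runs the command and substitutes its output into the string.
greeting="Two plus three is `expr 2 + 3`"
echo "$greeting"

# $(...) is the newer spelling of the same thing; it is easier to nest.
where="This directory is called $(basename "$(pwd)")"
echo "$where"
```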
bash
csh
jsh
ksh
sh
sh5
tcsh
zsh
xterm
shelltool, cmdtool
screen
rlogin
rsh
telnet
ed
vi
ed. This is the only "standard"
unix text editor supplied by vendors.
emacs
xemacs
pico
xedit
textedit
ls (like dir on other systems).
cp
mv
touch
rm, unlink
mkdir, rmdir
cat
lp, lpr
lpq, lpstat
more
less
mc
fmgr
chmod
chown, chgrp
chown allows both these operations to be performed together using the syntax chown owner.group file.
acl
cut
paste
sed
awk
rmcr
find
locate
whereis
du
df
users
finger
who
w
write
talk
irc
mail
Mail
elm
pine
mailtool
rmail
netscape mail
zmail
tkrat
ftp
ncftp
cc
CC
gcc
g++
ld
ar
dbx
gdb
xxgdb
ddd
perl
tcl
scheme
mercury
ps
vmstat
netstat
rpcinfo
showmount
uname
hostname
domainname
nslookup
archie, xarchie
xrn, fnews
netscape, xmosaic
tex, latex
texinfo
xdvi
dvips
ghostview, ghostscript
xv
xv -quit to place a picture on your root window.
xpaint
xfig
xsetroot
date
ispell
xcalc
dc,bc
xclock
ping
In order to communicate with a user, a shell needs to have access to a terminal. Unix was designed to work with many different kinds of terminals. Input/output commands in Unix read and write to a virtual terminal. In reality a terminal might be a text-based Teletype terminal (called a tty for short) or a graphics based terminal; it might be 80 characters wide or it might be wider or narrower. Unix takes these possibilities into account by defining a number of instances of terminals in a more or less object oriented way.
Each user's terminal has to be configured before cursor based input/output will work correctly. Normally this is done by choosing one of a number of standard terminal types from a list which is supplied by the system. In practice the user sets the environment variable `TERM' to an appropriate name. Typical examples are `vt100' and `xterm'. If no standard setup is found, the terminal can always be configured manually using UNIX's most cryptic and opaque of commands: `stty'.
The job of configuring terminals is much easier now that hardware is more standard. Users' terminals are usually configured centrally by the system administrator and it is seldom indeed that one ever has to choose anything other than `vt100' or `xterm'.
Because UNIX originated before windowing technology was available, the user-interface was not designed with windowing in mind. The X window system attempts to be like a virtual machine park, running a different program in each window. Although the programs appear on one screen, they may in fact be running on unix systems anywhere in the world, with only the output being local to the user's display. The standard shell interface is available by running an X client application called `xterm' which is a graphical front-end to the standard UNIX textual interface.
The `xterm' program provides a virtual terminal using the X windows graphical user interface. It works in exactly the same way as a tty terminal, except that standard graphical facilities like copy and paste are available. Moreover, the user has the convenience of being able to run a different shell in every window. For example, using the `rlogin' command, it is possible to work on the local system in one window, and on another remote system in another window. The X-window environment allows one to cut and paste between windows, regardless of which host the shell runs on.
The X11 system is based on the client-server model. You might wonder why a window system would be based on a model which was introduced for interprocess communication, or network communication. The answer is straightforward.
The designers of the X window system realized that network communication was to be the paradigm of the next generation of computer systems. They wanted to design a system of windows which would enable a user to sit at a terminal in Massachusetts and work on a machine in Tokyo -- and still be able to get high quality windows displayed on their terminal. The aim of X windows from the beginning was to create a distributed window environment.
When I log onto my friend's Hewlett Packard workstation to use the text
editor (because I don't like the one on my EUNUCHS workstation) I want
it to work correctly on my screen, with my keyboard -- even though my
workstation is manufactured by a different company. I also want the
colours to be right despite the fact that the HP machine uses
completely different video hardware to my machine. When I press the
curly brace key {, I want to see a curly brace, and not some
hieroglyphic because the HP station uses a different keyboard.
These are the problems which X tries to address. In a network environment we need a common window system which will work on any kind of hardware, and hide the differences between different machines as far as possible. But it has to be flexible enough to allow us to change all of the things we don't like -- to choose our own colours, and the kind of window borders we want etc. Other windowing systems (like Microsoft windows) ignore these problems and thereby lock the user to a single vendor's products and a single operating system. (That, of course, is no accident.)
The way X solves this problem is to use the client server model. Each program which wants to open a window on somebody's computer screen is a client of the X window service. To get something drawn on a user's screen, the client asks a server on the host of interest to draw windows for it. No client ever draws anything itself -- it asks the server to do it on its behalf. There are several reasons for this:
In X, the window manager is a different program to the server which does the drawing of graphics -- but the client-server idea still applies, it just has one more piece to its puzzle.
The X windows system is large and complex and not particularly user friendly. When you log in to the system, X reads two files in your home directory which decide which applications will be started and what they will look like. The files are called
#!/bin/csh
#
# .xsession file
#
#
setenv PATH /usr/bin:/bin:/local/gnu/bin:/usr/X11R6/bin
#
# List applications here, with & at the end
# so they run in the background
#
xterm -T NewTitle -sl 1000 -geometry 90x45+16+150 -sb &
xclock &
xbiff -geometry 80x80+510+0 &
# Start a window manager. Exec replaces this script with
# the fvwm process, so that it doesn't exist as a separate
# (useless) process.
exec /local/bin/fvwm
xterm*background: LightGrey
Emacs*background: grey92
Xemacs*background: grey92
In the terminology used by X11, every client program has to contact a display in order to open a window. A display is a virtual screen which is created by the X server on a particular host. X can create several separate displays on a given host, though most machines only have one.
When an X client program wants to open a window, it looks in the UNIX environment variable `DISPLAY' for the name or IP address of a host which has an X server it can contact. For example, if we wrote
setenv DISPLAY myhost:0
the client would try to contact the X server on `myhost' and ask for a window on display number zero (the usual display). If we wrote
setenv DISPLAY 198.112.208.35:0
the client would try to open display zero on the X server at the host with the IP address `198.112.208.35'.
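The two parts of the `DISPLAY' value can be picked apart with the shell's own string operations. This is only an illustrative sketch in Bourne shell syntax (where `DISPLAY=...; export DISPLAY' plays the role of setenv); the variable names `display_host' and `display_number' are invented.

```shell
# Example value taken from the text; a real session sets this at login.
DISPLAY=myhost:0
export DISPLAY

# Everything before the last colon is the host; what follows it is the
# display number.
display_host=${DISPLAY%:*}
display_number=${DISPLAY##*:}

echo "X server host:  $display_host"
echo "display number: $display_number"
```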
Clearly there must be some kind of security mechanism to prevent just anybody from opening windows on someone's display. X has two such mechanisms:
The command xhost yourhost would allow anyone using yourhost to
access the local display. This mechanism is only present for backward
compatibility with early versions of X windows. Normally one should
use the command xhost - to exclude all others from accessing the
display.
The command xauth is an
interactive utility used for controlling the contents of the `.Xauthority'
file. See the `xauth' manual page for more information.
The window paradigm has been very successful in many ways, but anyone who has used a window system knows that the screen is simply not big enough for all the windows one would like! Unix has several solutions to this problem.
One solution is to attach several physical screens to a terminal. The X window system can support any number of physical screens of different types. A graphical designer might want a high resolution colour screen for drawing and a black and white screen for writing text, for instance. The disadvantage with this method is the cost of the hardware.
A cheaper solution is to use a window manager such as `fvwm' which creates a virtual screen of unlimited size on a single monitor. As the mouse pointer reaches the edge of the true screen, the window manager replaces the display with a new "blank screen" in which to place windows. A miniaturized image of the windows on a control panel acts as a map which makes it possible to find the applications on the virtual screen.
Yet another possibility is to create virtual displays inside a single window. In other words, one can collapse several shell windows into a single `xterm' window by running the program `screen'. The screen command allows you to start several shells in a single window (using CTRL-a CTRL-c) and to switch between them (by typing CTRL-a CTRL-n). It is only possible to see one shell window at a time, but it is still possible to cut and paste between windows and one has a considerable saving of space. The `screen' command also allows you to suspend a shell session, log out, log in again later and resume the session precisely where you left off.
Here is a summary of some useful screen commands:
To prevent all users from being able to access all files on the system, unix records information about who creates files and also who is allowed to access them later.
Each user has a unique username or loginname together with
a unique user id or uid. The user id is a number, whereas the login
name is a text string -- otherwise the two express the same information.
A file belongs to user A if it is owned by user A. User A then
decides whether or not other users can read, write or execute the
file by setting the protection bits or the permission of the
file using the command chmod.
In addition to user identities, there are groups of users. The idea of
a group is that several named users might want to be able to read
and work on a file, without other users being able to access it.
Every user is a member of at least one group, called the login
group and each group has both a textual name and a number (group id).
The uid and gid of each user are recorded in the
file /etc/passwd (see chapter 6). Membership of other groups
is recorded in the file /etc/group or on some systems /etc/logingroup.
The following output is from
the command ls -lag executed on a SunOS type machine.
lrwxrwxrwx  1 root wheel        7 Jun  1  1993 bin -> usr/bin
-r--r--r--  1 root bin     103512 Jun  1  1993 boot
drwxr-sr-x  2 bin  staff    11264 May 11 17:00 dev
drwxr-sr-x 10 bin  staff     2560 Jul  8 02:06 etc
drwxr-sr-x  8 root wheel      512 Jun  1  1993 export
drwx------  2 root daemon     512 Sep 26  1993 home
-rwxr-xr-x  1 root wheel   249079 Jun  1  1993 kadb
lrwxrwxrwx  1 root wheel        7 Jun  1  1993 lib -> usr/lib
drwxr-xr-x  2 root wheel     8192 Jun  1  1993 lost+found
drwxr-sr-x  2 bin  staff      512 Jul 23  1992 mnt
dr-xr-xr-x  1 root wheel      512 May 11 17:00 net
drwxr-sr-x  2 root wheel      512 Jun  1  1993 pcfs
drwxr-sr-x  2 bin  staff      512 Jun  1  1993 sbin
lrwxrwxrwx  1 root wheel       13 Jun  1  1993 sys->kvm/sys
drwxrwxrwx  6 root wheel      732 Jul  8 19:23 tmp
drwxr-xr-x 27 root wheel     1024 Jun 14  1993 usr
drwxr-sr-x 10 bin  staff      512 Jul 23  1992 var
-rwxr-xr-x  1 root daemon 2182656 Jun  4  1993 vmunix
The first column is a textual representation of the protection bits for
each file. Column two is the number of hard links to the file (see exercises
below). The third and fourth columns are the user name and group name
and the remainder show the file size in bytes and the date of last modification.
Notice that the directories /bin, /lib and /sys are
symbolic links to other directories.
There are sixteen protection bits for a UNIX file, but only twelve of them can be changed by users. These twelve are split into four groups of three. Each three-bit group corresponds to one octal digit.
The leading four invisible bits give information about the type of file: is
the file a plain file, a directory or a link. In the
output from ls this is represented by a single character:
-, d or l.
The next three bits set the so-called s-bits and t-bit which are explained below.
The remaining three groups of three bits set flags which indicate whether a file can be read `r', written to `w' or executed `x' by (i) the user who owns the file, (ii) the other users who are in the group the file is marked with, and (iii) any user at all.
For example, the permission
Type  Owner  Group  Anyone
d     rwx    r-x    ---
tells us that the file is a directory, which can be read and written to by the owner, can be read by others in its group, but not by anyone else.
Note about directories. It is impossible to cd to a
directory unless the x bit is set. That is, directories must be
`executable' in order to be accessible.
Here are some examples of the relationship between binary, octal and the textual representation of file modes.
Binary  Octal  Text
001     1      x
010     2      w
100     4      r
110     6      rw-
101     5      r-x
-       644    rw-r--r--
It is well worth becoming familiar with the octal number representation of these permissions.
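If you want to check your understanding of the table, the conversion from octal to text can be written out as a small shell function. This is a sketch in Bourne shell syntax; the function names `digit_to_text' and `mode_to_text' are invented for the example and only three-digit modes are handled.

```shell
# Convert one octal digit (0-7) into its three-character rwx form.
digit_to_text() {
    r=-; w=-; x=-
    if [ $(( $1 & 4 )) -ne 0 ]; then r=r; fi    # bit value 4 = read
    if [ $(( $1 & 2 )) -ne 0 ]; then w=w; fi    # bit value 2 = write
    if [ $(( $1 & 1 )) -ne 0 ]; then x=x; fi    # bit value 1 = execute
    printf '%s%s%s' "$r" "$w" "$x"
}

# Convert a three-digit octal mode such as 644 into text such as rw-r--r--.
mode_to_text() {
    owner=${1%??}          # first digit
    rest=${1#?}            # last two digits
    group=${rest%?}        # second digit
    other=${1#??}          # third digit
    printf '%s%s%s\n' "$(digit_to_text "$owner")" \
                      "$(digit_to_text "$group")" \
                      "$(digit_to_text "$other")"
}

mode_to_text 644
mode_to_text 755
```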
The chmod command changes the permission or mode of a file. Only
the owner of the file or the superuser can change the permission.
Here are some examples of its use. Try them.
# make writable for everyone
chmod a+w myfile
# add the 'execute' flag for directory
chmod u+x mydir/
# open all files for everyone
chmod 755 *
# set the s-bit on my-dir's group
chmod g+s mydir/
# descend recursively into directory opening all files
chmod -R a+r dir
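The effect of chmod can be checked straight away with ls -l. The following sketch works on a scratch file (created with `mktemp -d', which is assumed to be available) and keeps only the first ten characters of the ls output, i.e. the file type and the nine permission characters.

```shell
scratch=$(mktemp -d)
touch "$scratch/myfile"

# Octal form: rw for the owner, r for group and others.
chmod 644 "$scratch/myfile"
before=$(ls -l "$scratch/myfile" | cut -c1-10)

# Symbolic form: add execute permission for the owner only.
chmod u+x "$scratch/myfile"
after=$(ls -l "$scratch/myfile" | cut -c1-10)

echo "$before"
echo "$after"
```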
When a new file gets created, the operating system must decide what
default protection bits to set on that file. The variable umask decides this.
umask is normally set by each user in his or her .cshrc
file (see next chapter). For example
umask 077 # safe
umask 022 # liberal
According to the UNIX documentation, the value of umask is
`XOR'ed (exclusive `OR') with a value of 666
for plain files or 777 for directories in order to find
out the standard protection. Actually this is not quite true: `umask'
only removes bits, it never sets bits which were not already set
in 666 (or 777). For instance
umask  Permission
077    600 (plain)
077    700 (dir)
022    644 (plain)
022    755 (dir)
The correct rule for computing permissions is not XOR but `NOT AND': the permission is the default value (666 or 777) AND'ed with the complement (NOT) of umask.
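The rule can be checked with shell arithmetic and then compared with what the system really does. This is a sketch in Bourne shell syntax; the function name `apply_umask' is made up, and the arithmetic relies on the shell treating numbers with a leading zero as octal.

```shell
# Permission = default mode AND (NOT umask), computed in octal.
# Arguments: the default mode (666 or 777) and the umask, as octal digits.
apply_umask() {
    printf '%03o\n' $(( 0$1 & ~0$2 ))
}

apply_umask 666 077    # plain file, safe umask
apply_umask 777 022    # directory, liberal umask

# Compare with a real file, in a subshell so that the umask of the
# calling shell is left alone.
scratch=$(mktemp -d)
actual=$( umask 077; touch "$scratch/newfile"; ls -l "$scratch/newfile" | cut -c1-10 )
echo "$actual"
```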
A unix program is normally executed by typing its pathname.
If the x execute bit is not set on the file, this will generate
a `Permission denied' error. This protects the system from
interpreting nonsense files as programs. To make a program executable
for someone, you must therefore ensure that they can execute
the file, using a command like
chmod u+x filename
This command would set execute permissions for the owner of the file;
chmod ug+x filename
would set execute permissions for the owner and for any users in the same group as the file. Note that script programs must also be readable in order to be executable, since the shell has to interpret them by reading.
The commands chown and chgrp change the ownership and the group ownership of a file. Only the superuser can change the ownership of a file on most systems. This is to prevent users from being able to defeat quota mechanisms. (On some systems, which do not implement quotas, ordinary users can give a file away to another user but not get it back again.) The same applies to group ownership.
Normally users other than root cannot define their own groups. This is a weakness in Unix from older times which no one seems to be in a hurry to change. At Oslo College, Computer Science, we use a local solution whereby users can edit a file to create their own groups. This file is called `/iu/nexus/local/iu/etc/iu-group'. The format of the group file is:
group-name::group-number:comma-separated-list-of-users
The s and t bits have special uses. They are described
as follows.
Octal  Text        Name
4000   chmod u+s   Setuid bit
2000   chmod g+s   Setgid bit
1000   chmod +t    Sticky bit
The effect of these bits differs for plain files and directories and
differs between different versions of UNIX. You should check the manual
page man sticky to find out about your system! The following is
common behaviour.
For executable files, the setuid bit tells UNIX that regardless of
who runs the program it should be executed with the permissions and
rights of the owner of the file. This is often used to allow normal users
limited access to root privileges. A setuid-root program
is executed as root for any user. The setgid bit sets the group
execution rights of the program in a similar way.
In BSD unix, if the setgid bit is set on a directory then any new files created in that directory assume the group ownership of the parent directory and not the logingroup of the user who created the file. This is standard policy under system 5.
A directory for which the sticky bit is set restricts the deletion of
files within it. A file or directory
inside a directory with the t-bit set can
only be deleted or renamed by its owner or the superuser. This is
useful for directories like the mail spool area and /tmp
which must be writable to everyone, but should not allow a user
to delete another user's files.
(Ultrix) If an executable file is marked with a sticky bit, it is held in the
memory or system swap area. It does not have to be fetched from
disk each time it is executed. This saves time for frequently
used programs like ls.
(Solaris 1) If a non-executable file is marked with the sticky bit, it will not be held in the disk page cache -- that is, it is never copied from the disk and held in RAM but is written to directly. This is used to prevent certain files from using up valuable memory.
On some systems (e.g. ULTRIX), only the superuser can set the sticky bit. On others (e.g. SunOS) any user can create a sticky directory.
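The s and t bits show up in the output of ls in place of the `x' characters. The sketch below sets them on scratch files using the octal values from the table. It assumes a system on which ordinary users may set these bits on their own files and directories, which, as noted above, is not true everywhere.

```shell
scratch=$(mktemp -d)
mkdir "$scratch/shared"
touch "$scratch/prog"

chmod 1777 "$scratch/shared"    # world-writable directory with the sticky bit, like /tmp
chmod 4755 "$scratch/prog"      # setuid bit on an ordinary file

sticky_mode=$(ls -ld "$scratch/shared" | cut -c1-10)
setuid_mode=$(ls -l "$scratch/prog" | cut -c1-10)

echo "$sticky_mode"    # the final x is shown as t
echo "$setuid_mode"    # the owner's x is shown as s
```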
The C shell is the command interpreter which you use to run programs and utilities. It contains a simple programming language for writing tailor-made commands, and allows you to join together unix commands with pipes. It is a configurable environment, and once you know it well, it is the most efficient way of working with unix.
Most users run the C-shell `/bin/csh' as their login environment,
or these days, preferably the `tcsh' which is an improved version
of csh. When a user logs in to a UNIX system the C-shell starts by
reading some files which configure the environment by defining
variables like path.
With the advent of the X11 windowing system, this has changed slightly. Since the window system takes over the entire login procedure, users never get to run `login shells', since the login shell is used up by the X11 system. On an X-terminal or host running X the `.login' file normally has no effect.
With some thought, the `.login' file can be eliminated entirely,
and we can put everything into the .cshrc file.
Here is a very simple example `.cshrc' file.
#
# .cshrc - read in by every csh that starts.
#
# Set the default file creation mask
umask 077
# Set the path
set path=( /usr/local/bin /usr/bin/X11 /usr/ucb /bin /usr/bin . )
# Exit here if the shell is not interactive
if ( $?prompt == 0 ) exit
# Set some variables
set noclobber notify filec nobeep
set history=100
set prompt="`hostname`%"
set prompt2 = "%m %h>" # tcsh, prompt for foreach and while
setenv PRINTER myprinter
setenv LD_LIBRARY_PATH /usr/lib:/usr/local/lib:/usr/openwin/lib
# Aliases are shortcuts to unix commands
alias passwd yppasswd
alias dir 'ls -lg \!* | more'
alias sys 'ps aux | more'
alias h history
It is possible to make a much more complicated .cshrc file than this. The advent of distributed computing and NFS (Network file system) means that you might log into many different machines running different versions of unix. The command path would have to be set differently for each type of machine.
We have already seen in the examples above how to define variables in C-shell. Let's formalize this. To define a local variable -- that is, one which will not get passed on to programs and sub-shells running under the current shell, we write
set local = "some string"
set myname = "`whoami`"
These variables are then referred to by using the dollar `$' symbol, i.e. the value of the variable `local' is `$local'.
echo $local $myname
Global variables, that is variables which all sub-shells inherit from the current shell are defined using `setenv'
setenv GLOBAL "Some other string"
setenv MYNAME "`who am i`"
Their values are also referred to using the `$' symbol. Notice that
set uses an `=' sign while `setenv' does not.
Variables can also be created without a value. The shell uses this method to switch on and off certain features, using variables like `noclobber' and `noglob'. For instance
nexus% set flag
nexus% if ($?flag) echo 'Flag is set!'
Flag is set!
nexus% unset flag
nexus% if ( $?flag ) echo 'Flag is set!'
nexus%
The operator `$?variable' is `true' if variable exists and `false' if it does not. It does not matter whether the variable holds any information.
The commands `unset' and `unsetenv' can be used to undefine or delete variables when you don't want them anymore.
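For comparison, the same test can be made in the Bourne shell, which has no `$?variable' operator of this kind. A common idiom, sketched here, is `${flag+yes}', which expands to `yes' if the variable exists and to nothing otherwise. The variable names are invented for the example.

```shell
# Create a variable with no value, as `set flag' does in the C-shell.
flag=

# ${flag+yes} expands to "yes" if flag exists (even when empty), else to nothing.
if [ -n "${flag+yes}" ]; then state_before="set"; else state_before="unset"; fi

unset flag
if [ -n "${flag+yes}" ]; then state_after="set"; else state_after="unset"; fi

echo "before unset: $state_before"
echo "after unset:  $state_after"
```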
A useful facility in the C-shell is the ability to make arrays out of strings and other variables. The round parentheses `(..)' do this. For example, look at the following commands.
nexus% set array = ( a b c d )
nexus% echo $array[1]
a
nexus% echo $array[2]
b
nexus% echo $array[$#array]
d
nexus% set noarray = ( "a b c d" )
nexus% echo $noarray[1]
a b c d
nexus% echo $noarray[$#noarray]
a b c d
The first command defines an array containing the elements `a b c d'. The elements of the array are referred to using square brackets `[..]' and the first element is `$array[1]'. The last element is `$array[4]'. NOTE: this is not the same as in C or C++ where the first element of the array is the zeroth element!
The special operator `$#' returns the number of elements in an array. This gives us a simple way of finding the end of the array. For example
nexus% echo $#path
23
nexus% echo "The last element in path is $path[$#path]"
The last element in path is .
To find the next-to-last element we need to be able to do arithmetic. We'll come back to this later.
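The plain Bourne shell has no array variables, but its positional parameters behave in much the same way and are also numbered from one. The following sketch shows the rough equivalents of `$array[1]' and `$#array'. Note that `set --' replaces whatever arguments the shell or script already had.

```shell
# The positional parameters serve as a simple list.
set -- a b c d

first=$1          # elements are numbered from 1, as in the C-shell
count=$#          # number of elements, like $#array

# Walk the list; when the loop ends, last holds the final element.
for last in "$@"; do :; done

echo "first=$first count=$count last=$last"
```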
The symbols
< > >> << | &
have a special meaning in the shell. By default, most commands take their input from the file `stdin' (the keyboard) and write their output to the file `stdout' and their error messages to the file `stderr' (normally, both of these output files are defined to be the current terminal device `/dev/tty', or `/dev/console').
`stdin', `stdout' and `stderr', known collectively as `stdio', can be redefined or redirected so that information is taken from or sent to a different file. The output direction can be changed with the symbol `>'. For example,
echo testing > myfile
produces a file called `myfile' which contains the string `testing'. The single `>' (greater than) sign always creates a new file, whereas the double `>>' appends to the end of a file, if it already exists. So the first of the commands
echo blah blah >> myfile
echo Newfile > myfile
adds a second line to `myfile' after `testing', whereas the second command writes over `myfile' and ends up with just one line `Newfile'.
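The whole sequence can be replayed on a scratch file to see the difference between `>' and `>>'. These two symbols work the same way in the Bourne shell, which is used for this sketch; `mktemp -d' is assumed to be available.

```shell
scratch=$(mktemp -d)
myfile=$scratch/myfile

echo testing > "$myfile"         # create (or truncate) the file
echo blah blah >> "$myfile"      # append a second line
lines_after_append=$(wc -l < "$myfile" | tr -d ' ')

echo Newfile > "$myfile"         # overwrite: only one line remains
final_contents=$(cat "$myfile")

echo "lines after >>: $lines_after_append"
echo "contents after >: $final_contents"
```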
Now suppose we mistype a command
ehco test > myfile
The command `ehco' does not exist and so the error message `ehco: Command not found' appears on the terminal. This error message was sent to stderr -- so even though we redirected output to a file, the error message appeared on the screen to tell us that an error occurred. Even this can be changed. `stderr' can also be redirected by adding an ampersand `&' character to the `>' symbol. The command
ehco test >& myfile
results in the file `myfile' being created, containing the error message `ehco: Command not found'.
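The `>&' form belongs to the C-shell. In the Bourne shell the usual way of writing the same thing is `> file 2>&1'. The sketch below uses ls on a file name which does not exist as a reliable source of an error message; the file names are invented.

```shell
scratch=$(mktemp -d)

# ls complains on stderr because the name does not exist.
# With > alone only stdout goes to the file, so the file stays empty
# and the message still appears on the terminal.
ls "$scratch/no-such-file" > "$scratch/out_only" || true

# "> file 2>&1" sends stdout to the file, then makes stderr a copy of it.
ls "$scratch/no-such-file" > "$scratch/out_and_err" 2>&1 || true

out_only_size=$(wc -c < "$scratch/out_only" | tr -d ' ')
if [ -s "$scratch/out_and_err" ]; then captured=yes; else captured=no; fi

echo "bytes captured with > only: $out_only_size"
echo "error captured with 2>&1:   $captured"
```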
The input direction can be changed using the `<' symbol for example
/bin/mail mark < message
would send the file `message' to the user `mark' by electronic mail. The mail program takes its input from the file instead of waiting for keyboard input.
There are some refinements to the redirection symbols. First of all, let us introduce the C-shell variable `noclobber'. If this variable is set with a command like
set noclobber
then files will not be overwritten by the `>' command. If one tries to redirect output to an existing file, the following happens.
unix% set noclobber
unix% touch blah # create an empty file blah
unix% echo test > blah
blah: File exists.
If you are nervous about overwriting files, then you can set `noclobber' in your `.cshrc' file. `noclobber' can be overridden using the pling `!' symbol. So
unix% set noclobber
unix% touch blah # create an empty file blah
unix% echo test >! blah
writes over the file `blah' even though `noclobber' is set.
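Bourne-type shells which follow the POSIX standard have the same protection under a different spelling: `set -C' switches it on and `>|' overrides it. The following sketch assumes such a shell and runs the protected commands in subshells so that the setting does not leak out.

```shell
scratch=$(mktemp -d)
blah=$scratch/blah
echo original > "$blah"

# With noclobber on, a plain > refuses to overwrite an existing file.
if ( set -C; echo test > "$blah" ) 2> /dev/null; then
    plain_result="overwritten"
else
    plain_result="refused"
fi
after_plain=$(cat "$blah")

# >| overrides noclobber, like >! in the C-shell.
( set -C; echo test >| "$blah" )
after_forced=$(cat "$blah")

echo "$plain_result: file still contains '$after_plain'"
echo "forced: file now contains '$after_forced'"
```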
Here are some other combinations of redirection symbols
The last of these commands reads from the standard input until it finds a line which contains only the given marker word. It then feeds all of this input into the program concerned. For example,
nexus% mail mark <<quit
nexus 1> Hello mark
nexus 2> Nothing much to say...
nexus 2> so bye
nexus 2>
nexus 2> quit
Sending mail...
Mail sent!
The mail message contains all the lines up to, but not including, the marker word (`quit' in this example). This method can also be used to print text verbatim from a file without using multiple echo commands. Inside a script one may write:
cat << "marker";
MENU
1) choice 1
2) choice 2
...
marker
The cat command writes directly to stdout and the
input is redirected and taken directly from the script file.
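The quotes around the marker make a difference which is easy to miss. In this Bourne shell sketch the first text is copied verbatim because the marker is quoted, while in the second the variable is substituted. The marker words, file names and the variable `name' are invented for the example.

```shell
scratch=$(mktemp -d)
name=mark

# Quoted marker: the text is taken literally, $name is not substituted.
cat > "$scratch/menu.txt" <<'END_OF_MENU'
MENU for $name
1) choice 1
2) choice 2
END_OF_MENU

# Unquoted marker: variables inside the text are substituted.
cat > "$scratch/greeting.txt" <<END_OF_GREETING
Hello $name
END_OF_GREETING

menu_first_line=$(head -n 1 "$scratch/menu.txt")
menu_lines=$(wc -l < "$scratch/menu.txt" | tr -d ' ')
greeting=$(cat "$scratch/greeting.txt")

echo "$menu_first_line"
echo "$greeting"
```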
A very useful construction is the `pipe' facility. Using the `|' symbol one can feed the `stdout' of one program straight into the `stdin' of another program. Similarly with `|&' both `stdout' and `stderr' can be piped into the input of another program. This is very convenient. For instance, look up the following commands in the manual and try them.
ps aux | more
echo 'Keep on sharpenin them there knives!' | mail henry
vmstat 1 | head
ls -l /etc | tail
Note that when piping both standard output and standard error to another program, the two files do not mix synchronously. Often `stderr' appears first.
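Here is a pipeline small enough that you can predict the result before running it. The sketch uses Bourne shell syntax, in which there is no `|&' symbol; writing `2>&1 |' has the same effect of sending both output files down the pipe.

```shell
# Each | connects the stdout of the command on its left to the stdin
# of the command on its right.
smallest=$(printf 'pear\napple\nplum\n' | sort | head -n 1)
echo "$smallest"

# Send both stdout and stderr of the bracketed commands into wc,
# which counts the lines it receives.
both=$( { echo "to stdout"; echo "to stderr" >&2; } 2>&1 | wc -l | tr -d ' ' )
echo "$both"
```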
Occasionally you might want to have a copy of what you see on your terminal sent to a file. `tee' and `script' do this. For instance,
find / -type l -print | tee myfile
sends a copy of the output of `find' to the file `myfile'. `tee' can split the output into as many files as you want:
command | tee file1 file2 ....
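Since tee also passes its input on to its own standard output, it can sit in the middle of a pipeline. The following sketch writes two copies of some text and still counts the lines at the end of the pipe; the file names are invented and `mktemp -d' is assumed to be available.

```shell
scratch=$(mktemp -d)

# tee copies its stdin to every named file and also to its own stdout,
# so the pipeline can carry on after it.
shown=$(printf 'one\ntwo\n' | tee "$scratch/file1" "$scratch/file2" | wc -l | tr -d ' ')

# cmp -s compares the two copies silently.
if cmp -s "$scratch/file1" "$scratch/file2"; then copies="identical"; else copies="different"; fi

echo "lines passed on: $shown"
echo "file1 and file2 are $copies"
```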
You can also choose to record the output of an entire shell session using the `script' command.
nexus% script mysession
Script started, file is mysession
nexus% echo Big brother is scripting you
Big brother is scripting you
nexus% exit
exit
Script done, file is mysession
The file `mysession' is a text file which contains a transcript of the session.
The history feature in C-shell means that you do not have to type commands over and over again. In the `tcsh' version of the C shell, and the `bash' version of the Bourne shell, you can use the UP ARROW key to browse back through the list of commands you have typed previously.
In the normal C-shell (`csh') there are three main commands.
The first of these simply repeats the last command. The second counts backwards from the last command to three commands-ago. The final command gives an absolute number. The absolute command number can be seen by typing `history'.
In the `tcsh' extension of the C-shell, you can save hours worth of typing errors by using the completion mechanism. This feature is based on the TAB key.
The idea is that if you type half a filename and press TAB, the shell will try to guess the remainder of the filename. It does this by looking at the files which match what you have already typed and trying to fill in the rest. If there are several files which match, the shell sounds the "bell" or beeps. You can then type CTRL-D to obtain a list of the possible alternatives. Here is an example: suppose you have just a single file in the current directory called `very_long_filename', typing
more TAB
results in the following appearing on the command line
more very_long_filename
The shell was able to identify a unique file. Now suppose that you have two files called `very_long_filename' and `very_big_filename', typing
more TAB
results in the following appearing on the command line
more very_
and the shell beeps, indicating that the choice was not unique and a decision is required. Next, you type CTRL-D to see which files you have to choose from and the shell lists them and returns you to the command line, exactly where you were. You now choose `very_long_filename' by typing `l'. This is enough to uniquely identify the file. Pressing the TAB key again results in
more very_long_filename
on the screen. As long as you have written enough to select a file uniquely, the shell will be able to complete the name for you.
Completion also works on shell commands, but it is a little slower since the shell must search through all the directories in the command path to complete commands.
Two kinds of quotes can be used in the shell apart from the backward quotes we mentioned above. The essential difference between them is that certain shell commands work inside double quotes but not inside single quotes. For example
nexus% echo /etc/rc.*
/etc/rc.boot /etc/rc.ip /etc/rc.local
nexus% echo "/etc/rc.*"
/etc/rc.*
nexus% echo "`who am i` -- my name is $user ???"
nexus!mark ttyp7 Jul 13 10:16 -- my name is mark ???
nexus% echo '`who am i` -- my name is $user ???'
`who am i` -- my name is $user ???
We see that the single quotes prevent variable substitution and sub-shells. Wildcards do not work inside either single or double quotes.
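The same three cases can be tried in the Bourne shell. This is only a minimal sketch, written for `sh' rather than the C-shell: the scratch directory, the three file names and the variable `name' are invented for the illustration.

```shell
# Quoting sketch for the Bourne shell (sh). The scratch directory, the
# file names and the variable `name' are made up for this illustration.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch rc.boot rc.ip rc.local
name=mark

unquoted=$(echo rc.*)                    # wildcard is expanded by the shell
double=$(echo "rc.* belongs to $name")   # variable expanded, wildcard not
single=$(echo 'rc.* belongs to $name')   # nothing is expanded

echo "$unquoted"
echo "$double"
echo "$single"

cd / && rm -rf "$dir"                    # remove the scratch directory
```

Only the unquoted form lists the three files. The double-quoted form substitutes the variable but leaves `rc.*' alone, and the single-quoted form prints the text exactly as typed.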
So far we haven't mentioned UNIX's ability to multitask. In the Bourne shell (`sh') there are no facilities for controlling several user processes (4). C-shell provides some commands for starting and stopping processes. These originate from the days before windows and X11, so some of them may seem a little old-fashioned. They are still very useful nonetheless.
Let's begin by looking at the commands which are true for any shell. Most programs are run in the foreground or interactively. That means that they are connected to the standard input and send their output to the standard output. A program can be made to run in the background, if it does not need to use the standard I/O. For example, a program which generates output and sends it to a file could run in the background. In a window environment, programs which create their own windows can also be started as background processes, leaving standard I/O in the shell free.
Background processes run independently of what you are doing in the foreground.
A background process is started using the special character `&' at the end of the command line.
find / -name '*lib*' -print >& output &
The final `&' on the end of this line means that the job will be run in the background. Note that this is not confused with the redirection operator `>&' since it must be the last character on the line. The command above looks for any files in the system containing the string `lib' and writes the list of files to a file called `output'. This might be a useful way of searching for missing libraries which you want to include in your environment variable `LD_LIBRARY_PATH'. Searching the entire disk from the root directory `/' could take a long time, so it pays to run this in the background.
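In the Bourne shell the idea is the same, but the C-shell redirection `>& output' is written `> output 2>&1'. The following is a rough sketch only: a short `sleep' stands in for the slow `find', and the temporary file name is chosen by `mktemp' rather than being called `output'.

```shell
# Background-job sketch for the Bourne shell (sh). The stand-in command
# and the temporary file are invented for this example.
out=$(mktemp)

# A stand-in for a slow command such as find; stdout and stderr both
# go to the file, and the final & puts the job in the background.
( sleep 1; echo finished ) > "$out" 2>&1 &
pid=$!                    # $! holds the PID of the last background job

echo "started background job with PID $pid"
# ... the shell is free for other work here ...

wait "$pid"               # block until the background job has exited
result=$(cat "$out")
echo "job wrote: $result"
rm -f "$out"
```

The prompt (or here, the script) continues immediately after the `&' line. `wait' is only needed at the point where the result is actually required.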
If we want to see what processes are running, we can use the `ps' command. `ps' without any arguments lists all of your processes, i.e. all processes owned by the user name you logged in with. `ps' takes many options, for instance `ps auxg' will list all processes in gruesome detail. (The "g" is for group, not gruesome!) `ps' reads the kernel's process tables directly.
Processes can be stopped and started, or killed once and for all. The `kill' command does this. There are, in fact, two versions of the `kill' command. One of them is built into the C-shell and the other is not. If you use the C-shell then you will never care about the difference. We shall nonetheless mention the special features of the C-shell built-ins below. The `kill' command takes a number called a signal as an argument and another number called the process identifier, or PID for short. `kill' sends signals to processes. Some of these are fatal and some are for information only. The two commands
kill -15 127
kill 127
are identical. They both send signal 15 to PID 127. This is the normal termination signal and it is often enough to stop any process from running.
Programs can choose to ignore certain signals by trapping signals with a special handler. One signal they cannot ignore is signal 9.
kill -9 127
is a sure way of killing PID 127. Even though the process dies, it may not be removed from the kernel's process table if it has a parent (see next section).
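The difference between signals 15 and 9 can be seen with a small experiment. This is a sketch under stated assumptions: it uses Bourne shell syntax, the child process is an invented loop whose only purpose is to ignore `TERM', and the signals are given by name rather than by number.

```shell
# Signal-trapping sketch for the Bourne shell (sh). The child below is
# invented for the example: it ignores signal 15 (TERM) and loops.
sh -c 'trap "" TERM; while :; do sleep 1; done' &
pid=$!
sleep 1                          # give the child time to install its trap

kill -TERM "$pid"                # ignored by the child
sleep 1
if kill -0 "$pid" 2>/dev/null; then   # signal 0 only tests for existence
    survived=yes
else
    survived=no
fi
echo "still running after TERM: $survived"

kill -KILL "$pid"                # signal 9 cannot be trapped or ignored
status=0
wait "$pid" 2>/dev/null || status=$?
echo "wait status after KILL: $status"   # above 128 means death by signal
```

The empty string given to `trap' tells the shell to ignore the signal altogether. No such arrangement is possible for `KILL'.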
Here is the complete list of unix signals which the kernel sends to processes in different circumstances. The numbers are those of a BSD-style system and some of them differ on other versions of unix.
 1 "SIGHUP",          /* hangup */
 2 "SIGINT",          /* interrupt */
 3 "SIGQUIT",         /* quit */
 4 "SIGILL",          /* illegal instruction (not reset when caught) */
 5 "SIGTRAP",         /* trace trap (not reset when caught) */
 6 "SIGIOT/SIGABRT",  /* IOT instruction */
 7 "SIGEMT",          /* EMT instruction */
 8 "SIGFPE",          /* floating point exception */
 9 "SIGKILL",         /* kill (cannot be caught or ignored) */
10 "SIGBUS",          /* bus error */
11 "SIGSEGV",         /* segmentation violation */
12 "SIGSYS",          /* bad argument to system call */
13 "SIGPIPE",         /* write on a pipe with no one to read it */
14 "SIGALRM",         /* alarm clock */
15 "SIGTERM",         /* software termination signal from kill */
16 "SIGURG",          /* urgent condition on IO channel */
17 "SIGSTOP",         /* sendable stop signal not from tty */
18 "SIGTSTP",         /* stop signal from tty */
19 "SIGCONT",         /* continue a stopped process */
20 "SIGCHLD/SIGCLD",  /* to parent on child stop or exit */
21 "SIGTTIN",         /* to readers pgrp upon background tty read */
22 "SIGTTOU",         /* like TTIN for output if (tp->t_local&LTOSTOP) */
23 "SIGIO/SIGPOLL",   /* input/output possible signal */
24 "SIGXCPU",         /* exceeded CPU time limit */
25 "SIGXFSZ",         /* exceeded file size limit */
26 "SIGVTALRM",       /* virtual time alarm */
27 "SIGPROF",         /* profiling time alarm */
28 "SIGWINCH",        /* window changed */
29 "SIGLOST",         /* resource lost (eg, record-lock lost) */
30 "SIGUSR1",         /* user defined signal 1 */
31 "SIGUSR2"
We have already mentioned 15 and 9 which are the main signals for users. Signal 1, or `HUP' can be sent to certain programs by the superuser. For instance
kill -1 <inetd>
kill -HUP <inetd>
which forces `inetd' to reread its configuration file. Sometimes it is useful to suspend a process temporarily and then restart it later.
kill -18 <PID>   # suspend process <PID>
kill -19 <PID>   # resume process <PID>
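Signal names can be used in place of numbers, as with `-HUP' above, which avoids depending on the numbering of one particular system. The sketch below makes some assumptions of its own: it is written for the Bourne shell, it uses `STOP' (the stop signal which cannot be caught) instead of signal 18, and a long `sleep' stands in for a real job.

```shell
# Suspend/resume sketch for the Bourne shell (sh), using signal names
# instead of numbers. The sleep command is a stand-in for a real job.
sleep 30 &
pid=$!

kill -STOP "$pid"       # suspend the process; STOP cannot be caught
kill -CONT "$pid"       # resume it where it left off

if kill -0 "$pid" 2>/dev/null; then
    alive=yes           # the process survived being stopped and continued
else
    alive=no
fi
echo "process $pid alive after STOP and CONT: $alive"

kill -TERM "$pid"       # tidy up the stand-in job
wait "$pid" 2>/dev/null || true
```

A stopped process keeps its place in the process table and its memory. It simply receives no CPU time until it is continued.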
When you start a process from a shell, regardless of whether it is a background process or a foreground process, the new process becomes a child of the original shell. Remember that the shell is just a unix process itself. Moreover, if one of the children starts a new process then it will be a child of the child (a grandchild?)! Processes therefore form hierarchies. Several children can have a common parent.
If we kill a parent, then (unless the child has detached itself from the parent) all of its children die too. If a child dies, the parent is not affected. Sometimes when a child is killed, it does not disappear but becomes "defunct" or a zombie process. This means that the child has terminated, but its parent has not yet taken note of its death. If the parent has not yet been informed that the child has died, for example because it has been suspended itself, then the dead child is not removed from the kernel's process table. When the parent wakes up and receives the message that the child has terminated, the process entry for the dead child can be removed.
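The parent and child relationship can be observed from within the shell. The following is a small sketch in Bourne shell syntax, relying on two standard shell variables: `$$' (the PID of the shell itself) and `$PPID' (the PID of its parent). The variable names `child_pid' and `grandchild_parent' are invented here.

```shell
# Parent/child sketch for the Bourne shell (sh).
# The outer sh -c is the child: it prints its own PID ($$) and then
# starts a grandchild shell which prints the PID of its parent ($PPID).
# The trailing `:' keeps the child alive while the grandchild runs.
set -- $(sh -c 'echo $$; sh -c "echo \$PPID"; :')
child_pid=$1
grandchild_parent=$2

echo "PID of the child shell:        $child_pid"
echo "parent PID seen by grandchild: $grandchild_parent"
```

The two numbers printed are the same, since the grandchild's parent is the child shell which started it.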
Now let's look at some commands which are built into the C-shell for starting and stopping processes. C-shell refers to user programs as `jobs' rather than processes -- but there is no real difference. The added bonus of the C-shell is that each job has a job number in addition to its PID. The job numbers are simpler and are private to the shell, whereas the PIDs are assigned by the kernel and are often very large numbers which are difficult to remember. When a command is executed in the shell, it is assigned a job number. If you never run any background jobs then there is only ever one job number: 1, since every job exits before the next one starts. However, if you run background tasks, then you can have several jobs "active" at any time. Moreover, by suspending jobs, C-shell allows you to have several interactive programs running on the same terminal -- the `fg' and `bg' commands allow you to move commands from the background to the foreground and vice-versa.
Take a look at the following shell session.
nexus% emacs myfile &
[1] 4990
nexus% ( other commands ... , edit myfile and close emacs )
[1] Exit 70 emacs myfile
When a background job is done, the shell prints a message at a suitable moment between prompts.
[1] Done emacs myfile
This tells you that job number 1 finished normally. If the job exits abnormally then the word `Done' may be replaced by some other message. For instance, if you kill the job, it will say
unix% kill %12
[12] Terminated textedit file
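Messages such as `Exit 70' and `Terminated' report how the job ended: either with an exit status of its own choosing or because of a signal. A script can obtain the same information with `wait'. This is only a sketch in Bourne shell syntax, and both background jobs are invented stand-ins.

```shell
# Exit-status sketch for the Bourne shell (sh). Both background jobs
# are invented stand-ins.
( exit 70 ) &                       # a job which exits with status 70
exit_status=0
wait $! || exit_status=$?           # wait returns the job's exit status

sleep 30 &                          # a job which we kill before it ends
pid=$!
kill "$pid"                         # default signal is TERM (15)
kill_status=0
wait "$pid" 2>/dev/null || kill_status=$?   # above 128: ended by a signal

echo "job that called exit 70: $exit_status"
echo "job that was killed:     $kill_status"
```

A status of zero corresponds to the plain `Done' message, and any other value signifies an abnormal ending of one kind or another.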
You can list the jobs you have running using the `jobs' command. The output looks something like
[1]  + Running    textedit c.tex
[3]    Running    textedit glossary.tex
[4]    Running    textedit net.tex
[5]    Running    textedit overview.tex
[6]    Running    textedit perl.tex
[7]    Running    textedit shell.tex
[8]    Running    textedit sysadm.tex
[9]    Running    textedit unix.tex
[10]   Running    textedit x11.tex
[11] - Running    shelltool
[15]   Suspended  emacs myfile
To suspend a program which you are running in the foreground you can type CTRL-z (this is like sending a `kill -18' signal from the keyboard). (5) You can suspend any number of programs and then restart them one at a time using `fg' and `bg'. If you want job 5 to be restarted in the foreground, you would type
fg %5
When you have had enough of job 5, you can type CTRL-z to suspend it and then type
fg %6
to activate job 6. Provided a job does not want to send output to `stdout', you can restart any job in the background, using a command like
bg %4
This method of working was useful before windows were available. Using `fg' and `bg', you can edit several files or work on several programs without having to quit to move from one to another.
See also the related commands for batch processing: `at', `batch', `atq' and `cron'.
NOTE: CTRL-c sends a `kill -2' signal, which sends a standard interrupt message to a program. This is always a safe way to interrupt a shell command.