This second session on the UNIX operating system, called Intermediate UNIX, assumes a basic working knowledge of simple UNIX commands and an understanding of the concepts discussed in a previous seminar, An Introduction To UNIX. In particular, an understanding of the "rooted tree" file layout and of the basic file and directory handling commands is needed.
The UNIX tools and facilities discussed in this document and tutorial are intended to help users to customize their working environment, simplify some complex tasks, and to do more sophisticated management of their files.
An outside reference that may be of some interest on this subject material is "The UNIX Programming Environment" by Kernighan and Pike.
Throughout this document the following conventions will be used:
Text appearing in the Helvetica Normal and Bold fonts, such as this, is instruction and narrative.
Text appearing in the Courier font such as
-rw-r--r-- 1 dela 4872 Nov 4 16:05 slides.txt
represents text which the computer system prints.
Text appearing in the Courier Italic font such as
ln -s /usr/tmp symlink
represents text which you type at the computer keyboard.
The symbols [RETURN] and [CR] both mean that you are to press the carriage return key on your keyboard. The symbol [ESCAPE] means that you are to press the escape key on your keyboard.
The shell is a command interpreter. It is the program which prompts you for your next command, and executes it. Shells can also function as programming languages, which will be covered in the upcoming tutorial, Programming Tools. In addition, the more advanced shells have features designed to make it easier for you to accomplish your work.
There are two shells available with virtually every version of UNIX. The first is called the Bourne Shell ("sh"). The Bourne Shell is about as old as UNIX itself. Very few individuals use the Bourne Shell as an interactive shell.
The more popular shell is called the C Shell ("csh"), so named because its syntax is reminiscent of the C Programming Language in which UNIX is written. All users in the College of Engineering should have this as their shell, unless you have explicitly changed it yourself.
Numerous other shells are available, including the Bourne Again Shell ("bash") and a variety of csh-lookalikes ("tcsh" is available on most SEAS UNIX hosts, and is often preferred over the "vanilla" csh shell).
All discussion in this tutorial will cover the C Shell. Other shells have similar features, which you can learn about from their man pages.
The "command line" is the command that you type to the shell prompt, which is typically hostname%. The first word specifies the command to be executed. All words following the command are called its arguments. Several command lines can be given at once by separating them with a semicolon (;).
Any command has what are called its arguments, its input, and its output. The arguments are specified on the command line, as described above. The input is generally given from the keyboard, and the output is displayed on your screen.
The input and output of a command can be "redirected". For instance, if you wish to redirect the output of the "ls" command into a file, you can do so with the command line
ls > filename
Then, the file "filename" will contain the output of the "ls" command. To append to the file rather than overwrite it, use
ls >> filename
You can also redirect the input of a program. For instance, if you have prepared a file which you would like to mail to someone, you can say
mail address < filename
And lastly, if you wish to take the output of one program and pass it as the input to another, you can do that with a pipe (|). For instance, to take the output of "ls" and view it with a pager, you might use
ls | more
Several commands can be strung together with pipes. This is called a "pipeline".
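The redirection and pipe operators described above can be tried safely in a scratch directory. The following is a minimal sketch; the directory name /tmp/unixdemo_redirect and the file name "notes" are arbitrary choices made for this illustration, and the commands are limited to forms that behave the same way in the C Shell and the Bourne Shell.

```sh
# Scratch directory for the experiment (the name is an arbitrary choice).
mkdir -p /tmp/unixdemo_redirect
cd /tmp/unixdemo_redirect

echo first  > notes        # ">" creates notes, or overwrites it if it exists
echo second >> notes       # ">>" appends a line to the end of notes
wc -l < notes              # "<" makes wc read its input from notes
sort -r notes | head -n 1  # a pipeline: sort feeds its output to head
```

After these commands, "notes" holds two lines, and the final pipeline prints the line that sorts last.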
There are three types of quotes: single-quotes (' '), double-quotes (" "), and backquotes (` `). All three have different meanings to the C Shell.
The simplest form of quoting is the single-quote. Anything within a pair of single-quotes is literal; no substitution of any kind is done. Within double quotes, variable and command substitution are still done, but filename substitution is not.
Backquotes are another story entirely. When used on the command line, the string between the backquotes is taken to be a command. The output of that command is substituted in the place of the backquoted string. For instance, the command
mail `users`
means to send mail to the usernames returned by the command "users". The "users" command displays the login names of everyone logged in to the system.
All special characters can be "escaped", meaning used literally instead of performing their special functions. This can be done either by enclosing the word in quotes, or by placing a backslash character before each special character. For instance, if for some reason you wanted a filename with an asterisk in it, you could use
vi file\*name
to get a file named "file*name". The use of the backslash as an escape is also sometimes called quoting.
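The difference between the three kinds of quotes and the backslash is easiest to see with "echo". This is a small sketch rather than an exhaustive demonstration; it relies only on the HOME environment variable and on the "date", "ls" and "wc" commands.

```sh
echo 'literal: $HOME `date`'       # single quotes: printed exactly as typed
echo "my home is $HOME"            # double quotes: $HOME is substituted
echo "entries here: `ls | wc -l`"  # backquotes: replaced by the command's output
echo file\*name                    # backslash: the asterisk is not expanded
```

Only the first and last lines print their text unchanged; the middle two show substitution taking place inside double quotes.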
The C Shell has significant pattern-matching ability for filenames. The process of matching filenames using these characters is called "Filename Substitution". It is also sometimes referred to as "globbing". When expanded, all matches are sorted alphabetically.
The asterisk (*) matches zero or more characters. Unlike DOS, it will match dots; however, it will not match a dot at the beginning of a filename, so that "hidden" files remain hidden.
The question-mark (?) matches exactly one character.
The [] characters match any single character in the enclosed list or range. For instance, [A-Z] matches any upper-case letter. [A-Za-z] matches any letter. [0-9] and [0123456789] both match any digit.
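The three pattern characters above can be checked with "echo", which simply prints whatever the shell expanded. This is a minimal sketch; the scratch directory /tmp/unixdemo_glob and the file names in it are invented for the illustration.

```sh
# Scratch directory with a known set of files (names chosen for illustration).
mkdir -p /tmp/unixdemo_glob
cd /tmp/unixdemo_glob
touch .hidden a1 a2 b1 notes.txt

echo *        # a1 a2 b1 notes.txt   (.hidden is not matched)
echo a?       # a1 a2
echo [ab]1    # a1 b1
echo *.txt    # notes.txt
echo .h*      # .hidden              (a leading dot must be typed explicitly)
```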
History substitution allows you to use words from previous command lines in the command line you are typing. This simplifies spelling corrections and the repetition of complicated commands or arguments. The command "history" will list the last several commands you've run.
A history substitution begins with a !, and may occur anywhere on the command line. The power of history substitution is impressive. There are too many ways of using it to explain here, but complete details can be found in the csh man page. Some of the simpler forms are:
!!      Refer to the previous command.
!n      Refer to command-line n.
!-n     Refer to the current command-line minus n.
!str    Refer to the most recent command starting with str.
!?str   Refer to the most recent command containing str.
You can also refer to individual words of a previous command. If necessary, you can separate the event specification (above) from the word specification.
!!:0    The first word of the previous command line
!!:n    The nth argument of the previous command
!!:$    The last argument of the previous command
!!:*    All of the arguments of the previous command
Further modification to the commands can be done, including editing particular words, but we won't go into further detail here. Suffice it to say that there's probably a way to do whatever it is you are trying to accomplish with history substitution.
Quick substitution can be done, if all you want to do is change a single string from the previous command. For instance, if you just sent mail to dela with "mail dela", and now want to send mail to kaser, you can say
^dela^kaser
and the string "dela" will be replaced with "kaser".
A simple example of history substitution at work: You create a file containing a list of your files, and then you wish to edit that file. You run the following two commands in sequence:
ls > filename
vi !!:$
The {} characters expand to each string or pattern in the comma-separated list between them. For instance, "/usr/{bin,lib,include}" expands to "/usr/bin /usr/lib /usr/include". The strings may be patterns for pattern-matching, as described above. They may also be nested {} constructs.
The ~ character evaluates to your home directory. If followed by a username, as in "~dela", it evaluates to that user's home directory.
The first word in any command line, the command itself, may be an "alias". An alias is a command you create yourself to represent another command. For instance, if you are too lazy to type the command "clear" all the time and would rather just say "c", you could
alias c clear
and from then on, whenever you ran the command c, the csh would actually run the command clear. An alias can also take arguments, as only the first word on the command line is substituted. You can list your current aliases with the "alias" command, using no arguments.
In general, whenever you run a command, you are starting a process running on the computer. You already have at least one process running, which is your shell. If you run the command "ls", that starts up another process.
You can list the processes you have running with the "ps" command. This displays a list of processes and various information about each one. Here is some sample output from the "ps" command:
  PID TT STAT  TIME COMMAND
17840 co R     0:00 ps
21587 co S     0:24 -csh (csh)
The first column, labeled "PID", is the Process ID of that process. Every process on the system has a unique ID. The second column is the name of the terminal, or tty, that the process is running on. Here, the processes are running on "co", the console. STAT shows the status of the process. R means running, S means sleeping. Others might be I for idle, T for stopped, D when waiting for a disk, P when waiting for available memory, or Z while the process is exiting. There are other possibilities in the STAT field, which will not be covered here. The TIME column shows the amount of time, in minutes and seconds, that the CPU has spent running this process. The final column shows the command itself.
If for some reason you need to terminate a process you have running, you can do so with the "kill" command. For instance, to kill that "ps" process above, you might say
kill 17840
Both the "ps" and "kill" commands take several different types of arguments. Read their manual pages for more details.
It is possible to have several processes running at once, or to have some stopped while you work on one. Suppose you wish to run a command which will take a long time, but you wish to continue working while that command is running. A process that is running without user interaction is considered to be running "in the background". You can put a process in the background by ending its command line with an ampersand (&), as in
ls > filename &
The C Shell will display a line with the job number in brackets and the Process IDs of all the associated processes:
[2] 17871
When you start a process in the background, the C Shell will immediately give you a prompt for your next command, no matter how long the command takes. You will be notified when the process terminates.
You can stop a running job by hitting [Control-Z]. For instance, if you are currently reading the csh manual page, you may hit [Control-Z], and the shell will give you a new prompt for your next command. Such a process is called "stopped". You can put this stopped job in the background with the "bg" command.
To see the current list of jobs, use the "jobs" command. The job most recently stopped is referred to as the "current job", and is indicated with a +. The previous job is indicated with a -; when the current job is terminated or moved to the foreground, this job takes its place as the current job. If you give a "-l" argument to the jobs command, the PID of the processes will also be displayed. For instance:
[1]  + 17850 Stopped    man csh
[2]  - 17871 Running    ls > filename
You can bring the current job to the foreground with the "fg" command. In this case, this would restart your "man csh" process, picking up where you left off. You can specify a different job to the "fg" command as an argument, if you wish. References to jobs begin with a % character.
%+      The current job
%-      The previous job
%j      Refer to a particular job. "j" can either be a job number or a unique string matching the beginning of the command line. For instance, %1 refers to the man job, and %ls refers to the second job.
%?str   Refer to the job which contains the string "str".
Job references such as these can also be passed as arguments to the "kill" command, as described above. There are several other commands and mechanisms for manipulating jobs which are not covered here; see the csh manual page.
One powerful feature of the C Shell is called "filename completion". Using this feature, you need not type the entire name of a file or of a user. When an unambiguous partial filename is followed by an [ESCAPE] character, the shell fills in the remaining characters of a matching filename.
For instance, if you have a file called "results.october" in your directory, and you type "cat res[ESCAPE]", then the shell will complete the filename for you.
If a partial filename is followed by [Control-D] instead of [ESCAPE], the shell will list all filenames that match, and then prompt once again, supplying the incomplete command line typed in so far. When the partial word begins with a tilde (~), the shell attempts completion with a user name, rather than a file. The terminal will beep in the case of an error or an ambiguous match.
All systems in CEAS are set up to have this feature enabled by default. Other machines can be set up this way by setting the "filec" variable (see below).
When you log in, the shell processes two initialization files, hidden in your home directory. The first of these is called ".cshrc", pronounced "dot C-Shell R.C.". This file is run whenever the C Shell starts up. Shells can start more often than when you log in, such as when you use the "script" command described below, or run certain programs.
The second file is called ".login", and is run by your login shell when you log in, after processing your .cshrc. If you change these files, the changes will not take effect until the files are read again. You can do this by logging out and logging back in, or using the "source" command, as "source .cshrc" or "source .login".
The C Shell can be customized by setting some special variables. Some of these are maintained automatically by the shell, though they can be overridden by hand or in the .cshrc. These include:
cwd       The current working directory
home      Your home directory
path      The directories to search for programs
prompt    The prompt
user      Your username
Other variables are not set by default, but can be set to get a desired effect. Some of the more relevant are:
filec       Enable filename and username completion
history     The number of past commands to remember
ignoreeof   Don't let me log out by hitting [Control-D]
notify      Tell me immediately when background jobs end
term        Your terminal type
C Shell variables are set using the "set" command, as in
set history = 40
While some have values, others are booleans and do not need to be set to any particular value. For instance
set notify
You may also set variables that have no meaning to the shell, that you might use yourself later. Variables can be referenced using the dollar-sign character ($). For instance
set prompt = "$user% "
You can view the values of the shell variables with the "set" command, using no arguments.
Environment variables, unlike the shell variables above, are not particular to the shell; in fact, most of them have no relevance to the shell at all. The set of environment variables is called "the environment". All programs that you run will run in the same environment, meaning that the environment is passed from program to program. Therefore, environment variables need be set only once, and this is normally done in the .login file.
By convention, environment variables are written in all upper-case letters, while shell variables are traditionally all lower-case. They are referenced the same way, with the dollar-sign ($).
You can set any environment variable you like. Some of the more common ones are mentioned here; some programs may pay attention to others.
EDITOR     Default editor to use
MANPATH    Directories to search for manual pages
PRINTER    Default printer to use
VISUAL     Default visual editor to use
Environment variables are set using the "setenv" command. Note that the syntax is different from the C Shell variables, in that there is no equals sign (=).
setenv VISUAL vi
You can view your environment with the "setenv" command, using no arguments.
In addition to the various tools built into whichever shell you use, UNIX normally has a variety of programs to help you get your work done. These programs are often called "tools", since you may not be able to accomplish your entire task with one, but a collection of them will often help you achieve your goal.
Here, we'll introduce some of the more commonly used tools. Remember, these tools are often best used when combined together using the pipe mechanism described earlier.
For more detail, consult the man page for these commands.
Tee forms a "T" fitting in pipes (|). It will take whatever is fed to it, copy it to a file, and also feed the same data to its standard output. Thus you can keep a record of whatever is flowing through some section of your plumber's nightmare.
Use it as: "tee filename"
For example:
ls | sort | tee sorted.list | less
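To see that tee really does both jobs, a pipeline can be run whose last stage only counts lines, and the saved copy can be inspected afterwards. This is a sketch under stated assumptions: the directory /tmp/unixdemo_tee and the three input words are made up for the illustration.

```sh
# Scratch directory (arbitrary name).
mkdir -p /tmp/unixdemo_tee
cd /tmp/unixdemo_tee

# tee saves the sorted data in sorted.list and also passes it on to wc.
printf 'pear\napple\nfig\n' | sort | tee sorted.list | wc -l

cat sorted.list    # the copy that tee kept: apple, fig, pear (one per line)
```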
Script is used to make a log file of your session. When you issue the command:
script filename
A new session will be started for you, and every character that's displayed on your terminal (including your typing that's echoed to the screen) will go into the file. Type "exit" or [Control-D] to end the log file.
Grep is one of the classic UNIX tools. It will search through its input, and write to its standard output any lines which contain text which matches a string you give it. This allows you to quickly search a file, or a group of files for something.
The key to using grep is the regular expression, which is similar to the wildcards described above. A regular expression is a "formula" which describes what a text string must contain in order for a "match" to occur. Here are some of the operators which make up such a "formula":
char          - just match a single character
string        - match an occurrence of string
.             - match (almost) ANY character (once)
[string]      - match any character in string (once)
[char1-char2] - match any character in ASCII collating sequence between character char1 and character char2
*             - match anything which has zero or more occurrences of the preceding expression
^             - make the "formula" match only if it's at the start of a line
$             - make the "formula" match only if it's at the end of a line
\             - use a \ if you want to use special characters, such as "[" or "*"
To use grep, type:
grep expression filename
Where expression is a regular expression as described above, and filename is a filename, or a shell wildcarded filename.
For example:
grep #include *.c : list all the include lines in *.c.
ntp.c:#include
ntp.c:#include
ntp.c:#include
test.c:/*#include "ntp.h"*/
grep ^#include *.c : more precise way to do above.
ntp.c:#include
ntp.c:#include
ntp.c:#include
grep ^#include *.c | less : pipe to a pager
ps -axu | grep "r.*t" | less : find anything with an "r" followed eventually by a "t"
root 112 0.0 0.0 28 0 ? I Nov 1 0:00 (nfsd)
root 53 0.0 0.0 68 0 ? IW Nov 1 0:04 portmap
root 2 0.0 0.0 0 0 ? D Nov 1 10:47 pagedaemon
dela 2213 0.0 0.0 40 0 co IW Nov 1 0:00 /usr/openwin/bin/xinit -
dela 2321 0.0 0.0 36 0 co IW Nov 1 0:00 rsh augustus.me.rocheste
Finally, grep -v will list every line except those that match the regular expression. The following example is just like the one above, but skips entries which include "root".
ps -axu | grep "r.*t" | grep -v root | less
dela 2213 0.0 0.0 40 0 co IW Nov 1 0:00 /usr/openwin/bin/xinit -
dela 2321 0.0 0.0 36 0 co IW Nov 1 0:00 rsh augustus.me.rocheste
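The effect of the ^ anchor and of the -v option can be reproduced on a small file of known contents. The sketch below builds such a file first; the directory /tmp/unixdemo_grep, the file sample.c and its three lines are invented for the illustration. The expressions are quoted so that the shell passes them to grep untouched.

```sh
# Scratch directory and a three-line sample file (both hypothetical).
mkdir -p /tmp/unixdemo_grep
cd /tmp/unixdemo_grep
cat > sample.c << 'EOF'
#include "ntp.h"
/* #include "old.h" */
int total = 0;
EOF

grep '#include' sample.c     # both lines that contain the string
grep '^#include' sample.c    # only the line that starts with it
grep -v 'include' sample.c   # every line that does NOT match
```

The first command prints two lines, the second prints only the uncommented include line, and the third prints only the declaration of "total".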
Diff will list the lines which are different between two files. Typically you'll use this to look at two versions of the same file to see how it has changed. The output is somewhat cryptic: it shows you the "ed" commands to change the first file into the second file, followed by the affected lines from the two files.
Invoke diff via:
diff file1 file2
For example:
diff ntp_proto.c.original ntp_proto.c
176c176
< pkt->status = sys.leap | NTPVERSION_1 | peer->hmode;
---
> pkt->status = sys.leap.year | NTPVERSION_1 | peer->hmode;
Sort will sort the contents of a file. By default it will use the ASCII collating sequence (which is wrong for numbers, 101 will sort before 12). The two options of interest are "-n", which will sort numerically, and "+#" where # is the number of the word in each line (start at zero) to sort on. For example
ls -l
produces:
-rw-r--r-- 1 dela 5692 Nov 4 16:34 #slides.txt#
-rw-r--r-- 1 dela 849 Oct 30 17:15 outline.txt
-rw-r--r-- 1 dela 4872 Nov 4 16:05 slides.txt
-rw-r--r-- 1 dela 4872 Nov 4 16:12 slides.txt1
ls -l | sort -n +3 | less
produces:
-rw-r--r-- 1 dela 849 Oct 30 17:15 outline.txt
-rw-r--r-- 1 dela 4872 Nov 4 16:05 slides.txt
-rw-r--r-- 1 dela 4872 Nov 4 16:12 slides.txt1
-rw-r--r-- 1 dela 5692 Nov 4 16:34 #slides.txt#
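The difference between character order and numeric order is easy to reproduce on a two-column file. Note one assumption in this sketch: many current versions of sort no longer accept the zero-based "+3" form and instead expect the "-k" option, with fields counted from one, so "+3" corresponds to "-k 4". The directory /tmp/unixdemo_sort and the file "sizes" are made up for the illustration.

```sh
# Scratch directory and a small name/size table (both hypothetical).
mkdir -p /tmp/unixdemo_sort
cd /tmp/unixdemo_sort
cat > sizes << 'EOF'
slides.txt 4872
backup.txt 5692
outline.txt 849
EOF

sort -k 2 sizes       # character order on field 2: 4872, 5692, 849
sort -n -k 2 sizes    # numeric order on field 2:   849, 4872, 5692
```

Without -n, 849 sorts last because the character "8" comes after "4" and "5"; with -n it sorts first because it is the smallest number.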
Wc will count the words in a file. It also reports how many lines and characters there are in a file.
wc slides.txt
204 987 6244 slides.txt
Head and tail are two programs which will show the beginning and the end of their input respectively. They are often used in pipes as well. Both commands will take a numeric argument which determines how many lines to show (the default is 10).
ls -l | sort -n +3 | head -2
-rw-r--r-- 1 dela 849 Oct 30 17:15 outline.txt
-rw-r--r-- 1 dela 4872 Nov 4 16:12 slides.txt1
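Head and tail can also be combined to pick out a range of lines from the middle of their input. The sketch below uses five numbered lines generated with printf, and the "-n count" spelling of the line count, which is assumed here to be accepted alongside the older "-count" form.

```sh
printf '1\n2\n3\n4\n5\n' | head -n 2              # first two lines: 1 and 2
printf '1\n2\n3\n4\n5\n' | tail -n 2              # last two lines: 4 and 5
printf '1\n2\n3\n4\n5\n' | head -n 4 | tail -n 2  # lines 3 and 4 only
```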
Less is a pager which you use just like more, but it's better. Like more, less will page through your document when you hit [space], but unlike more you can page backwards by hitting the "b" key. Also "g" will move you to the start of the file, "G" will move you to the end of the file. Hit the "h" key when running less for help.
Enscript is a formatting program for output that's going to the printer. It will allow you to specify different fonts, page orientations and number of columns while printing plain text files. Some of the options to enscript are:
-G : print in a gaudy format: page headings, page numbering, etc. in a fairly bold style
-2 : print in two columns
-r : rotate to landscape mode (page on its side)
-f<font> : use font <font>, e.g. Courier24
-P<printer> : use printer <printer>
Look at the man page for more options.
Vgrind is a formatter for program listings. It knows about the syntax of most programming languages, and will format a program listing to emphasize structure and readability. Usually, keywords are in bold, comments are italicized and the program will be nicely indented. Vgrind generates troff output, so vgrind is usually used in a pipe with lpr -t.
Some useful options to vgrind are:
-t : send output to standard output, not typesetter
-l<language> : which language the program is written in (the default is C)
For example:
vgrind -t ntp_proto.c | lpr -t
will print a formatted listing of ntp_proto.c.
vgrind -t -lf geometry.f | lpr -t
will print a formatted listing of the Fortran program geometry.f.
As discussed in the first seminar, Introduction To UNIX, all files have several characteristics, including name, ownership, group-ownership and permissions. Some of the UNIX utility programs used to manipulate these characteristics are chmod, chgrp and chown. These programs allow the permissions, group-ownership, and ownership of a file to be changed.
The latter two are most often used by the systems administrator to establish the ownership of a file, and are rarely needed otherwise. They won't be discussed here, but information on their use is available in the on-line man pages.
The first, chmod, is used to change the permissions (or mode) of a file or a set of files. Only the owner of a file (or the super-user) may change its mode.
The basic use of the chmod command is
chmod mode filename
or
chmod -R mode dirname
where the second form causes chmod to traverse the directory dirname and perform the chmod on it and its contents.
'mode' is either an absolute or a symbolic mode. An absolute mode is an octal number constructed from the addition of the following modes:
400    Read by owner.
200    Write by owner.
100    Execute (search in directory) by owner.
040    Read by group.
020    Write by group.
010    Execute (search) by group.
004    Read by others.
002    Write by others.
001    Execute (search) by others.
A symbolic mode has the form: [who] op permission
who is a combination of:
u    User's permissions.
g    Group permissions.
o    Others.
a    All, or ugo.
op is one of:
+    To add the permission.
-    To remove the permission.
=    To assign the permission explicitly.
permission is a combination of:
r    Read.
w    Write.
x    Execute.
X    Give execute permission if the file is a directory or if there is execute permission for one of the other user classes.
Multiple symbolic modes, separated by commas, may be given, and operations are performed in the order specified.
Consider the following three examples where chmod is used in absolute mode on the file /tmp/textfile. The first gives "read" permission to all. The second gives read and write to the owner and read permission to all. The last grants read write and execute to the owner, read and execute to the group and no permission for others.
chmod 444 /tmp/textfile
chmod 644 /tmp/textfile
chmod 750 /tmp/textfile
Now consider the following three examples where chmod is used in symbolic mode to achieve more or less the same thing.
chmod a+r /tmp/textfile
chmod a+r,u+w /tmp/textfile
chmod u+rwx,g+rx /tmp/textfile
Note well, however, that these operations granted but did not remove any permissions. A better effort might be as follows:
chmod a=r /tmp/textfile
chmod a=r,u+w /tmp/textfile
chmod u=rwx,g=rx,o= /tmp/textfile
Both symbolic and absolute forms have their uses and proper place. Which form you use depends on the task at hand and your level of familiarity with each form.
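The correspondence between the absolute and symbolic forms can be confirmed by changing the mode of a scratch file and looking at the first column of "ls -l". This is a minimal sketch; the directory /tmp/unixdemo_chmod and the file name "textfile" are chosen only for the illustration.

```sh
# Scratch directory and an empty file to experiment on (arbitrary names).
mkdir -p /tmp/unixdemo_chmod
cd /tmp/unixdemo_chmod
touch textfile

chmod 750 textfile         # absolute: 400+200+100 + 040+010 = 750
ls -l textfile             # mode column begins -rwxr-x---

chmod a=r,u+w textfile     # symbolic: read for all, then add write for owner
ls -l textfile             # mode column begins -rw-r--r--, the same as 644
```

Because "a=r" assigns rather than adds, the execute bits left over from the first chmod are cleared by the second.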
The ln program (short for "link") is used to make hard or symbolic links to files, allowing one file to be called by multiple names. When UNIX utilities encounter such a link, they will transparently follow it to access the file to which it points. Any number of links can be assigned to a file. The number of links does not affect other file attributes such as size, protections, data, etc.
The basic use of ln is
ln [ -s ] filename linkname
or
ln [ -s ] pathname... directory
filename is the name of the original file or directory. linkname is the new name to associate with the file or directory.
If the last argument is the name of a directory, links are made in that directory for each pathname argument; ln uses the last component of each pathname as the name of each link in the named directory.
A hard link (the default) is a standard directory entry just like the one made when the file was created. Hard links can only be made to existing files. Hard links cannot be made across file systems (disk partitions, mounted file systems). To remove a file, all hard links to it must be removed, including the name by which it was first created.
A symbolic link, made with ln -s, is a special directory entry that points to another file. Symbolic links can span file systems and point to directories. In fact, you can create a symbolic link that points to a file that is currently absent from the file system; removing the file that it points to does not affect or alter the symbolic link itself.
When using cd to move to a directory through a symbolic link, you wind up in the pointed-to location within the file system. This means that the parent of the new working directory is not the parent of the symbolic link, but rather, the parent of the pointed-to directory.
galaxy-deke 1% pwd
/home/systaff/deke/linktest
galaxy-deke 2% ln -s /usr/tmp symlink
galaxy-deke 3% cd symlink
galaxy-deke 4% cd ..
galaxy-deke 5% pwd
/usr
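The different behavior of hard and symbolic links when the original name is removed can be observed directly. The following is a sketch under stated assumptions: the directory /tmp/unixdemo_ln and the names "original", "hardlink" and "symlink" are invented for the illustration.

```sh
# Scratch directory; clear out names left over from an earlier run.
mkdir -p /tmp/unixdemo_ln
cd /tmp/unixdemo_ln
rm -f original hardlink symlink
echo data > original

ln original hardlink      # hard link: a second name for the same file
ln -s original symlink    # symbolic link: a pointer to the name "original"

rm original               # remove the name the file was created under
cat hardlink              # still prints "data": one hard link remains
ls -l symlink             # the link itself survives, but its target is gone
```

The file's contents stay reachable through "hardlink" until that name is removed as well, while "symlink" is left pointing at a name that no longer exists.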
In order for UNIX to make a file available to you, the disk partition that the file resides on must be mounted. This means that UNIX has been told to associate the disk partition with a given directory path. For example to make the partition /dev/c0t1d0s4 available as /home/seas on the server to which that disk is attached, the partition /dev/c0t1d0s4 must be mounted as /home/seas. This can only be done by the system manager.
A much more powerful capability is the ability to mount disk partitions across the network. Using a facility called NFS (Network File System), a computer can "mount", or establish a connection, between a pathname and a disk partition on a disk which is physically attached to another machine. NFS allows disk partitions to be shared across the network, and is fundamental to our local configuration.
Beginning in the Fall of 2006, we are moving from individual departmental file servers for home directories to a single large school-wide server (twelve 400 GB disks arranged in two RAID arrays) for home directories and other materials. This will provide home directory service (via NFS) to all UNIX hosts on the production network in all SEAS departments.
This is one of the mechanisms which allows you to log in to any SEAS computer and have access to your home directory. Although NFS is very efficient, if you expect to run a very I/O intensive program, you may be better off directing its input and output to a disk local to the machine on which you are running the program (e.g., use /var/tmp, or on many systems, /data as your work area).
When logged onto the console of a graphics workstation you have the option of running a window system. A window system provides you with a nicer environment for doing your work, and in some cases provides the graphics capabilities you must have to do your work. Most of the newer workstations offer only the graphics interface (normally) at the console.
Most UNIX windowing systems available on most SEAS workstations are based upon The X Window System. Many window managers are available to provide different looks and feels. Gnome, CDE and KDE are the most common window managers in use at this time.
One of the underlying philosophies of UNIX is that difficult jobs can be accomplished by using a set of small tools. We have presented only a few of the available tools here and encourage you to explore others described in both on-line and hard copy documentation.
passwd     change password information
man        display the on-line manual pages
pwd        print the current working directory
cd         change directories
ls         list the contents of a directory
mkdir      make new directories
rmdir      remove directories
cat        display the contents of a file
more       browse or page through a text file
cp         copy files
mv         move or rename files
rm         remove (delete) files
lpr        print files
lpq        check the print queue
lprm       remove entries from the print queue
Mail       send and receive electronic mail
logout     end login session on a computer
finger     get user information
vi         visual text editor
ln         establish a link to a file
script     make a log of an interactive session
grep       search for a string or a regular expression
wc         count lines, words and characters
less       an enhanced pager
chmod      change the permissions mode of a file
tee        replicate the standard output
diff       display line-by-line differences between pairs of text files
sort       sort and collate lines
uniq       remove or report adjacent duplicate lines
head       display first few lines of specified files
tail       display last few lines of specified files
pr         prepare file(s) for printing, perhaps in multiple columns
fmt        simple text formatter
troff      typeset or format documents
enscript   convert text files to POSTSCRIPT format for printing
vgrind     pretty print program listing to printer
spell      report spelling errors
look       find words in the system dictionary or lines in a sorted list
Prepared by Bob Kaser and Deke Kassabian
Last modified: Friday, 18-Aug-2006 11:15:35 EDT