Friday, September 28, 2012

Introducing the os Module

Tools in the os Module

Table 2-1. Commonly used os module tools
Tasks                     Tools
Shell variables           os.environ
Running programs          os.system, os.popen, os.execv, os.spawnv
Spawning processes        os.fork, os.pipe, os.waitpid, os.kill
Descriptor files, locks   os.open, os.read, os.write
File processing           os.remove, os.rename, os.mkfifo, os.mkdir, os.rmdir
Administrative tools      os.getcwd, os.chdir, os.chmod, os.getpid, os.listdir, os.access
Portability tools         os.sep, os.pathsep, os.curdir, os.path.split, os.path.join
Pathname tools            os.path.exists('path'), os.path.isdir('path'), os.path.getsize('path')

 

Administrative Tools

>>> import os
>>> os.getpid()
7980
>>> os.getcwd()
'C:\\PP4thEd\\Examples\\PP4E\\System'

>>> os.chdir(r'C:\Users')
>>> os.getcwd()
'C:\\Users'

 

Portability Constants

>>> os.pathsep, os.sep, os.pardir, os.curdir, os.linesep
(';', '\\', '..', '.', '\r\n')

 

Common os.path Tools

>>> os.path.isdir(r'C:\Users'), os.path.isfile(r'C:\Users')
(True, False)
>>> os.path.isdir(r'C:\config.sys'), os.path.isfile(r'C:\config.sys')
(False, True)
>>> os.path.isdir('nonesuch'), os.path.isfile('nonesuch')
(False, False)

>>> os.path.exists(r'c:\Users\Brian')
False
>>> os.path.exists(r'c:\Users\Default')
True
>>> os.path.getsize(r'C:\autoexec.bat')
24
 
>>> os.path.split(r'C:\temp\data.txt')
('C:\\temp', 'data.txt')

>>> os.path.join(r'C:\temp', 'output.txt')
'C:\\temp\\output.txt'

>>> name = r'C:\temp\data.txt'                            # Windows paths
>>> os.path.dirname(name), os.path.basename(name)
('C:\\temp', 'data.txt')

>>> name = '/home/lutz/temp/data.txt'                     # Unix-style paths
>>> os.path.dirname(name), os.path.basename(name)
('/home/lutz/temp', 'data.txt')

>>> os.path.splitext(r'C:\PP4thEd\Examples\PP4E\PyDemos.pyw')
('C:\\PP4thEd\\Examples\\PP4E\\PyDemos', '.pyw') 

 

Running Shell Commands from Scripts

Other os module Exports

Introducing the sys Module

Platforms and Versions

C:\...\PP4E\System> python
>>> import sys
>>> sys.platform, sys.maxsize, sys.version
('win32', 2147483647, '3.1.1 (r311:74483, Aug 17 2009, 17:02:12) ...more deleted...')

>>> if sys.platform[:3] == 'win': print('hello windows')
...
hello windows

 

The Module Search Path

>>> sys.path
['', 'C:\\PP4thEd\\Examples', ...plus standard library paths deleted... ]
>>> sys.path.append(r'C:\mydir')
>>> sys.path
['', 'C:\\PP4thEd\\Examples', ...more deleted..., 'C:\\mydir'] 

The Loaded Modules Table

>>> sys.modules
{'reprlib': <module 'reprlib' from 'c:\python31\lib\reprlib.py'>, ...more deleted...

>>> list(sys.modules.keys())
 ['reprlib', 'heapq', '__future__', 'sre_compile', '_collections', 'locale', '_sre',
'functools', 'encodings', 'site', 'operator', 'io', '__main__', ...more deleted... ]

>>> sys
<module 'sys' (built-in)>
>>> sys.modules['sys']
<module 'sys' (built-in)>

Exception Details

>>> import traceback, sys
>>> def grail(x):
...     raise TypeError('already got one')
...
>>> try:
...     grail('arthur')
... except:
...     exc_info = sys.exc_info()
...     print(exc_info[0])
...     print(exc_info[1])
...     traceback.print_tb(exc_info[2])
...
<class 'TypeError'>
already got one
  File "<stdin>", line 2, in <module>
  File "<stdin>", line 2, in grail

 

Programming Python

2 System Programming

  • System scripting overview
  • Introducing the sys Module
  • Introducing the os Module

System Scripting Overview

  • dir, dir(sys), dir(os), dir(os.path)
  • import, import sys, import os
  • sys.__doc__, print(sys.__doc__)
  • help(sys)
  • Splitting lines:
    >>> line = 'aaa\nbbb\nccc\n'
    >>> line.split('\n')
    ['aaa', 'bbb', 'ccc', '']
    >>> line.splitlines()
    ['aaa', 'bbb', 'ccc']
  • String Method Basics
    • mystr.find('XXX')
      >>> mystr.find('SPAM')            # return first offset
      3
    • mystr.replace('AA', 'XXX')
      >>> mystr.replace('aa', 'SPAM')   # global replacement
      'xxSPAMxxSPAM'
    • 'XXX' in mystr
      >>> 'SPAM' in mystr               # substring search/test
      True

Saturday, September 8, 2012

Task 7-2

You want to start a long job in the background (so that your terminal is freed up) and save both standard output and standard error in a single log file. Write a script that does this.
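
One possible answer to the task, as a minimal sketch. The function name start_logged and its log-file argument are assumptions made for this illustration and do not come from the book. It sends standard output to the log file, duplicates standard error onto the same file with 2>&1, and uses & to run the job in the background.

```bash
#!/bin/bash
# start_logged LOGFILE COMMAND [ARG...]   (hypothetical helper name)
# Runs COMMAND in the background with stdout and stderr both saved in LOGFILE.
start_logged() {
    local logfile=$1
    shift
    # "> file" must come before "2>&1": standard error is duplicated onto
    # whatever standard output points at when 2>&1 is processed.
    "$@" > "$logfile" 2>&1 &
}
```

A call such as start_logged build.log make all would return at once and free the terminal; the built-in wait can be used later to block until the job has finished.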

Friday, September 7, 2012

7.1.2. File Descriptors

The next few redirectors in Table 7-1 depend on the notion of a file descriptor. Like the device files used with <>, this is a low-level UNIX I/O concept that is of interest only to systems programmers—and then only occasionally. You can get by with a few basic facts about them; for the whole story, look at the entries for read( ), write( ), fcntl( ), and others in Section 2 of the UNIX manual. You might wish to refer to UNIX Power Tools by Shelley Powers, Jerry Peek, Tim O'Reilly, and Mike Loukides (O'Reilly).
File descriptors are integers starting at 0 that refer to particular streams of data associated with a process. When a process starts, it usually has three file descriptors open. These correspond to the three standards: standard input (file descriptor 0), standard output (1), and standard error (2). If a process opens additional files for input or output, they are assigned to the next available file descriptors, starting with 3.
By far the most common use of file descriptors with bash is in saving standard error in a file. For example, if you want to save the error messages from a long job in a file so that they don't scroll off the screen, append 2> file to your command. If you also want to save standard output, append > file1 2> file2.
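
The two forms mentioned above can be tried out with a small sketch. The function long_job is a hypothetical stand-in for a real long-running command, and the file names under the temporary directory are likewise chosen only for this illustration.

```bash
#!/bin/bash
# Hypothetical stand-in for a long job: one line to stdout, one to stderr.
long_job() {
    echo "progress report"
    echo "something went wrong" >&2
}

tmpdir=$(mktemp -d)

# Save only the error messages; standard output still goes to the terminal.
long_job 2> "$tmpdir/errors"

# Save standard output and standard error in two separate files.
long_job > "$tmpdir/file1" 2> "$tmpdir/file2"

errors_only=$(cat "$tmpdir/errors")
file1_text=$(cat "$tmpdir/file1")
file2_text=$(cat "$tmpdir/file2")
rm -r "$tmpdir"
```

After this runs, errors_only and file2_text hold the error line and file1_text holds the progress line.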

Task 7-1


The s file command in mail saves the current message in file. If the message came over a network (such as the Internet), then it has several header lines prepended that give information about network routing. Write a shell script that deletes the header lines from the file.

We can use ed to delete the header lines. To do this, we need to know something about the syntax of mail messages; specifically, that there is always a blank line between the header lines and the message text. The ed command 1,/^[ ]*$/d does the trick: it means, "Delete from line 1 until the first blank line." We also need the ed commands w (write the changed file) and q (quit). Here is the code that solves the task:
ed "$1" << EOF
1,/^[ ]*$/d
w
q
EOF

The shell does parameter (variable) substitution and command substitution on text in a here-document, meaning that you can use shell variables and commands to customize the text. A good example of this is the bashbug script, which sends a bug report to the bash maintainer (see Chapter 11).
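
The bashbug listing itself is not reproduced in these notes. As a separate, hypothetical illustration of substitution inside a here-document, the sketch below defines two made-up functions: make_report, whose here-document is expanded by the shell, and make_literal, whose quoted label ('EOF') turns expansion off.

```bash
#!/bin/bash
# make_report USER: the here-document text is customized by the shell.
make_report() {
    local user=$1
    cat << EOF
User: $user
Upper: $(printf '%s' "$user" | tr 'a-z' 'A-Z')
EOF
}

# Quoting the label suppresses parameter and command substitution.
make_literal() {
    cat << 'EOF'
User: $user
EOF
}
```

Calling make_report alice prints "User: alice" and "Upper: ALICE", while make_literal prints the text exactly as written, dollar sign included.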

7.1.1. Here Documents

The << label redirector essentially forces the input to a command to be the shell's standard input, which is read until there is a line that contains only label. The input in between is called a here-document. Here-documents aren't very interesting when used from the command prompt. In fact, it's the same as the normal use of standard input except for the label. We could use a here-document to simulate the mail facility. When you send a message to someone with the mail utility, you end the message with a dot (.). The body of the message is saved in a file, msgfile:
$ cat >> msgfile << .
  > this is the text of
  > our message.
  > .

Here-documents are meant to be used from within shell scripts; they let you specify "batch" input to programs. A common use of here-documents is with simple text editors like ed.
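
As a small sketch of "batch" input inside a script, the hypothetical function below feeds three lines to sort through a here-document. The label END_OF_NAMES is arbitrary; input stops at the line that contains only that label.

```bash
#!/bin/bash
# sorted_names: hypothetical example of feeding fixed input to a command.
sorted_names() {
    sort << END_OF_NAMES
hatter
duchess
alice
END_OF_NAMES
}
```

Running sorted_names prints alice, duchess, and hatter on separate lines.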

Chapter 7.1. I/O Redirectors

Table 7-1. I/O redirectors
Redirector     Function
cmd1 | cmd2    Pipe; take standard output of cmd1 as standard input to cmd2.
> file         Direct standard output to file.
< file         Take standard input from file.
>> file        Direct standard output to file; append to file if it already exists.
>| file        Force standard output to file even if noclobber is set.
n>| file       Force output to file from file descriptor n even if noclobber is set.
<> file        Use file as both standard input and standard output.
n<> file       Use file as both input and output for file descriptor n.
<< label       Here-document; see text.
n> file        Direct file descriptor n to file.
n< file        Take file descriptor n from file.
n>> file       Direct file descriptor n to file; append to file if it already exists.
n>&            Duplicate standard output to file descriptor n.
n<&            Duplicate standard input from file descriptor n.
n>&m           File descriptor n is made to be a copy of the output file descriptor m.
n<&m           File descriptor n is made to be a copy of the input file descriptor m.
&>file         Direct standard output and standard error to file.
<&-            Close the standard input.
>&-            Close the standard output.
n>&-           Close the output from file descriptor n.
n<&-           Close the input from file descriptor n.
n>&word        If n is not specified, the standard output (file descriptor 1) is used. If the digits in word do not specify a file descriptor open for output, a redirection error occurs. As a special case, if n is omitted, and word does not expand to one or more digits, the standard output and standard error are redirected as described previously.
n<&word        If word expands to one or more digits, the file descriptor denoted by n is made to be a copy of that file descriptor. If the digits in word do not specify a file descriptor open for input, a redirection error occurs. If word evaluates to -, file descriptor n is closed. If n is not specified, the standard input (file descriptor 0) is used.
n>&digit-      Moves the file descriptor digit to file descriptor n, or the standard output (file descriptor 1) if n is not specified.
n<&digit-      Moves the file descriptor digit to file descriptor n, or the standard input (file descriptor 0) if n is not specified. digit is closed after being duplicated to n.
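
A few of the redirectors in Table 7-1 can be exercised with a short sketch. The temporary directory, the file names, and the use of the exec built-in to open and close a descriptor for the whole script are choices made for this illustration only.

```bash
#!/bin/bash
tmpdir=$(mktemp -d)

# n> file: open file descriptor 3 for writing, write through it with >&3,
# then close it with n>&-.
exec 3> "$tmpdir/fd3"
echo "written via fd 3" >&3
exec 3>&-

# n>&m: make standard error (2) a copy of standard output (1),
# after standard output has already been pointed at the file.
{ echo "out"; echo "err" >&2; } > "$tmpdir/both" 2>&1

# > truncates the file; >> appends to it.
echo "first" > "$tmpdir/log"
echo "second" >> "$tmpdir/log"

# < file: take standard input from a file.
read -r first_line < "$tmpdir/log"

fd3_text=$(cat "$tmpdir/fd3")
both_text=$(cat "$tmpdir/both")
log_text=$(cat "$tmpdir/log")
rm -r "$tmpdir"
```

The order of redirections matters: writing 2>&1 before > file would copy standard error onto the terminal rather than the file.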

Chapter 7. Input/Output and Command-line Processing

In this chapter, we switch the focus to two related topics. The first is the shell's mechanisms for doing file-oriented input and output. We present information that expands on what you already know about the shell's basic I/O redirectors.
Second, we'll "zoom in" and talk about I/O at the line and word level. This is a fundamentally different topic, since it involves moving information between the domains of files/terminals and shell variables. echo and command substitution are two ways of doing this that we've seen so far.
Our discussion of line and word I/O will lead into a more detailed explanation of how the shell processes command lines. This information is necessary so that you can understand exactly how the shell deals with quotation, and so that you can appreciate the power of an advanced command called eval, which we will cover at the end of the chapter.

Thursday, September 6, 2012

6.4. Arrays

There are several ways to assign values to arrays. The most straightforward way is with an assignment, just like any other variable:
names[2]=alice
names[0]=hatter
names[1]=duchess

This assigns hatter to element 0, duchess to element 1, and alice to element 2 of the array names.
Another way to assign values is with a compound assignment:
names=([2]=alice [0]=hatter [1]=duchess)

This is equivalent to the first example and is convenient for initializing an array with a set of values. Notice that we didn't have to specify the indices in numerical order. In fact, we don't even have to supply the indices if we reorder our values slightly:
names=(hatter duchess alice)

bash automatically assigns the values to consecutive elements starting at 0. If we provide an index at some point in the compound assignment, the values get assigned consecutively from that point on, so:
names=(hatter [5]=duchess alice)

assigns hatter to element 0, duchess to element 5, and alice to element 6.
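
The assignment forms above can be compared side by side. In this sketch the array names a, b, c, and d are placeholders chosen for the illustration.

```bash
#!/bin/bash
# Element-by-element assignment, in any order.
a[2]=alice
a[0]=hatter
a[1]=duchess

# Compound assignment with explicit indices.
b=([2]=alice [0]=hatter [1]=duchess)

# Compound assignment without indices: consecutive elements starting at 0.
c=(hatter duchess alice)

# An explicit index restarts the numbering from that point on.
d=(hatter [5]=duchess alice)

echo "${a[0]} ${b[1]} ${c[2]}"    # prints: hatter duchess alice
echo "${d[0]} ${d[5]} ${d[6]}"    # prints: hatter duchess alice
```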
An array is created automatically by any assignment of these forms. To explicitly create an empty array, you can use the -a option to declare. Any attributes that you set for the array with declare (e.g., the read-only attribute) apply to the entire array. For example, the statement declare -ar names would create a read-only array called names. Every element of the array would be read-only.
An element in an array may be referenced with the syntax ${array[i]}. So, from our last example above, the statement echo ${names[5]} would print the string "duchess". If no index is supplied, array element 0 is assumed.
You can also use the special indices @ and *. These return all of the values in the array and work in the same way as for the positional parameters; when the array reference is within double quotes, using * expands the reference to one word consisting of all the values in the array separated by the first character of the IFS variable, while @ expands the values in the array to separate words. When unquoted, both of them expand the values of the array to separate words. Just as with positional parameters, this is useful for iterating through the values with a for loop:
for i in "${names[@]}"; do
    echo "$i"
done
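
The difference between the quoted * and @ indices shows up most clearly when a value contains a space. This is a minimal sketch; count_words is a hypothetical helper that reports how many arguments it received, and the colon assigned to IFS is chosen only to make the separator visible.

```bash
#!/bin/bash
names=(hatter "mad duchess" alice)

# Hypothetical helper: prints the number of arguments it was given.
count_words() { echo $#; }

IFS=:                                       # first character of IFS joins "${names[*]}"
joined="${names[*]}"
star_count=$(count_words "${names[*]}")     # one word
at_count=$(count_words "${names[@]}")       # one word per element
unset IFS                                   # back to default word splitting

echo "$joined"                   # prints: hatter:mad duchess:alice
echo "$star_count $at_count"     # prints: 1 3
```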

Any array elements which are unassigned don't exist; they default to null strings if you explicitly reference them. Therefore, the previous looping example will print out only the assigned elements in the array names. If there were three values at indexes 1, 45, and 1005, only those three values would be printed.
If you want to know what indices currently have values in an array then you can use ${!array[@]}. In the last example this would return 1 45 1005.[17]
[17] This is not available in versions of bash prior to 3.0.
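
The sparse-array case described above (values at indexes 1, 45, and 1005) can be sketched as follows. The array name sparse and its values are made up for the illustration, and ${!sparse[*]} requires bash 3.0 or later, as the footnote says.

```bash
#!/bin/bash
sparse[1]=one
sparse[45]=forty-five
sparse[1005]=one-thousand-five

values="${sparse[*]}"       # only the three assigned elements
indices="${!sparse[*]}"     # the indices that currently have values
missing="${sparse[2]}"      # unassigned element: expands to the null string

echo "$values"      # prints: one forty-five one-thousand-five
echo "$indices"     # prints: 1 45 1005
```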
A useful operator that you can use with arrays is #, the length operator that we saw in Chapter 4. To find out the length of any element in the array, you can use ${#array[i]}. Similarly, to find out how many values there are in the array, use * or @ as the index. So, for names=(hatter [5]=duchess alice), ${#names[5]} has the value 7, and ${#names[@]} has the value 3.
Reassigning to an existing array with a compound array statement replaces the old array with the new one. All of the old values are lost, even if they were at different indices to the new elements. For example, if we reassigned names to be ([100]=tweedledee tweedledum), the values hatter, duchess, and alice would disappear.
You can destroy any element or the entire array by using the unset built-in. If you specify an index, that particular element will be unset. unset names[100], for instance, would remove the value at index 100; tweedledee in the example above. However, unlike assignment, if you don't specify an index the entire array is unset, not just element 0. You can explicitly specify unsetting the entire array by using * or @ as the index.
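
The length operator, compound reassignment, and unset can be followed step by step in a short sketch. The variable names that capture the intermediate results are assumptions for this illustration; the array contents are the ones used in the text.

```bash
#!/bin/bash
names=(hatter [5]=duchess alice)
elem_len=${#names[5]}                  # length of "duchess": 7
count=${#names[@]}                     # number of assigned elements: 3

names=([100]=tweedledee tweedledum)    # replaces the whole array
after_reassign="${!names[*]}"          # remaining indices: 100 101

unset 'names[100]'                     # remove one element (quoted to avoid globbing)
after_unset="${names[*]}"              # tweedledum

unset names                            # no index: the entire array is removed
final_count=${#names[@]}               # 0
```

Quoting names[100] in the unset call is a defensive habit rather than something the text requires; it keeps the shell from treating the brackets as a filename pattern.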
Let's now look at a simple example that uses arrays to match user IDs to account names on the system. The code takes a user ID as an argument and prints the name of the account plus the number of accounts currently on the system:
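# Build an array indexed by user ID (field 3) that holds account names (field 1).
for i in $(cut -f 1,3 -d: /etc/passwd) ; do
   array[${i#*:}]=${i%:*}    # index: text after the colon; value: text before it
done

echo "User ID $1 is ${array[$1]}."
echo "There are currently ${#array[@]} user accounts on the system."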

We use cut to create a list from fields 1 and 3 in the /etc/passwd file. Field 1 is the account name and field 3 is the user ID for the account. The script loops through this list using the user ID as an index for each array element and assigns each account name to that element. The script then uses the supplied argument as an index into the array, prints out the value at that index, and prints the number of existing array values.

Wednesday, September 5, 2012

"$*", "$@", "$#"

Two special variables contain all of the positional parameters (except positional parameter 0): * and @. The difference between them is subtle but important, and it's apparent only when they are within double quotes.
"$*" is a single string that consists of all of the positional parameters, separated by the first character in the value of the environment variable IFS (internal field separator), which is space, TAB, and NEWLINE by default. On the other hand, "$@" is equal to "$1" "$2"... "$N", where N is the number of positional parameters. That is, it's equal to N separate double-quoted strings, which are separated by spaces. If there are no positional parameters, "$@" expands to nothing. We'll explore the ramifications of this difference in a little while.
The variable # holds the number of positional parameters (as a character string). All of these variables are "read-only," meaning that you can't assign new values to them within scripts.
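
A minimal sketch of the difference, using two hypothetical functions: count_args prints how many arguments it received, and compare records how "$*" and "$@" expand for its own positional parameters. The comma assigned to IFS is chosen only so the separator is easy to see.

```bash
#!/bin/bash
# Hypothetical helper: prints the number of arguments it was given.
count_args() { echo $#; }

# Records how "$*" and "$@" expand for this function's own parameters.
compare() {
    local IFS=,
    star_value="$*"                     # one string, joined by the first IFS character
    star_count=$(count_args "$*")       # always a single word
    at_count=$(count_args "$@")         # one word per positional parameter
    param_count=$#
}

compare alice "mad hatter" duchess
echo "$star_value"                          # prints: alice,mad hatter,duchess
echo "$star_count $at_count $param_count"   # prints: 1 3 3

compare                                     # no positional parameters
echo "$star_count $at_count"                # prints: 1 0
```

The last line shows the special case from the text: with no positional parameters "$@" expands to nothing at all, while "$*" still yields one (empty) word.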

Tuesday, September 4, 2012

6.2. Typed Variables

Table 6-1. Declare options
Option   Meaning
-a       The variables are treated as arrays
-f       Use function names only
-F       Display function names without definitions
-i       The variables are treated as integers
-r       Makes the variables read-only
-x       Marks the variables for export via the environment
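
Some of the options in Table 6-1 can be tried with a short sketch. All variable names here are made up for the illustration, and the read-only check runs in a subshell so that the failed assignment cannot stop the script.

```bash
#!/bin/bash
declare -i total=0       # -i: values assigned to total are evaluated arithmetically
total=3+4                # stored as 7
plain=3+4                # an ordinary variable keeps the string "3+4"

declare -a queue         # -a: explicitly create an (empty) array
queue[0]=first

declare -r limit=10      # -r: read-only
if ( limit=20 ) 2>/dev/null; then
    readonly_enforced=no
else
    readonly_enforced=yes   # the assignment inside the subshell failed
fi

declare -x greeting=hello                   # -x: exported to child processes
child_sees=$(sh -c 'echo "$greeting"')      # the child shell can read it
```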

Monday, September 3, 2012

6.1.3. getopts

getopts takes two arguments. The first is a string that can contain letters and colons. Each letter is a valid option; if a letter is followed by a colon, the option requires an argument. getopts picks options off the command line and assigns each one (without the leading dash) to a variable whose name is getopts's second argument. As long as there are options left to process, getopts will return exit status 0; when the options are exhausted, it returns exit status 1, causing the while loop to exit.
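
The while loop referred to above is not shown in these notes, so here is a minimal sketch of one. The function parse_opts, the option letters -a, -b, and -c, and the result variables are all assumptions for the illustration. OPTARG and OPTIND are standard getopts variables: OPTARG holds the argument of the current option, and OPTIND is the index of the next argument to be processed.

```bash
#!/bin/bash
# parse_opts: hypothetical function; -a and -b are flags, -c takes an argument.
parse_opts() {
    local opt OPTIND=1               # reset so the function can be called repeatedly
    flag_a=no flag_b=no c_value=
    # getopts returns 0 while options remain, so the loop ends when they run out.
    while getopts "abc:" opt; do
        case $opt in
            a ) flag_a=yes ;;
            b ) flag_b=yes ;;
            c ) c_value=$OPTARG ;;   # the colon after c means "requires an argument"
            \? ) echo "usage: parse_opts [-a] [-b] [-c arg] args..." >&2
                 return 1 ;;
        esac
    done
    shift $((OPTIND - 1))            # drop the options that were processed
    remaining=("$@")                 # what is left are the ordinary arguments
}

parse_opts -a -c blue file1 file2
echo "$flag_a $flag_b $c_value ${remaining[*]}"   # prints: yes no blue file1 file2
```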

Sunday, September 2, 2012

Chapter 6. Command-Line Options and Typed Variables

  In particular, if you are an experienced UNIX user, it might have occurred to you that none of the example scripts shown so far have the ability to handle options preceded by a dash (-) on the command line. And if you program in a conventional language like C or Pascal, you will have noticed that the only type of data that we have seen in shell variables is character strings; we haven't seen how to do arithmetic, for example.