Thursday, October 11, 2012

test

R page

Vectors

Vectors are one-dimensional arrays that can hold numeric data, character data, or logical data.

Matrices

A matrix is a two-dimensional array where each element has the same mode (numeric, character, or logical).

Arrays

Arrays are similar to matrices but can have more than two dimensions.

Data frames

A data frame is more general than a matrix in that different columns can contain different modes of data (numeric, character, etc.).

Factors

As you've seen, variables can be described as nominal, ordinal, or continuous. In R, categorical (nominal) and ordered categorical (ordinal) variables are called factors.

Lists

Lists are the most complex of the R data types. Basically, a list is an ordered collection of objects (components).

Sunday, October 7, 2012

Cleaning data

Dates written in the Japanese era style, such as "H22.1.4", have to be converted to the Western-calendar form "2010.1.4".

The converted string can then be turned into a date value with the as.Date() function.

For the as.Date() conversion, I want to turn "H21" into "2009" and "1.4" into "01.04".

So, what should I do?

Using Python

I will try to do this with Python.

My Python program should read a file (say, "something.csv") and rewrite the date strings in it in the format described above.

Friday, September 28, 2012

Introducing the os Module

Tools in the os Module

Table 2-1. Commonly used os module tools
Tasks                    Tools
Shell variables          os.environ
Running programs         os.system, os.popen, os.execv, os.spawnv
Spawning processes       os.fork, os.pipe, os.waitpid, os.kill
Descriptor files, locks  os.open, os.read, os.write
File processing          os.remove, os.rename, os.mkfifo, os.mkdir, os.rmdir
Administrative tools     os.getcwd, os.chdir, os.chmod, os.getpid, os.listdir, os.access
Portability tools        os.sep, os.pathsep, os.curdir, os.path.split, os.path.join
Pathname tools           os.path.exists('path'), os.path.isdir('path'), os.path.getsize('path')

 

Administrative Tools

>>> import os
>>> os.getpid()
7980
>>> os.getcwd()
'C:\\PP4thEd\\Examples\\PP4E\\System'

>>> os.chdir(r'C:\Users')
>>> os.getcwd()
'C:\\Users'

 

Portability Constants

>>> os.pathsep, os.sep, os.pardir, os.curdir, os.linesep
(';', '\\', '..', '.', '\r\n')

 

Common os.path Tools

>>> os.path.isdir(r'C:\Users'), os.path.isfile(r'C:\Users')
(True, False)
>>> os.path.isdir(r'C:\config.sys'), os.path.isfile(r'C:\config.sys')
(False, True)
>>> os.path.isdir('nonesuch'), os.path.isfile('nonesuch')
(False, False)

>>> os.path.exists(r'c:\Users\Brian')
False
>>> os.path.exists(r'c:\Users\Default')
True
>>> os.path.getsize(r'C:\autoexec.bat')
24
 
>>> os.path.split(r'C:\temp\data.txt')
('C:\\temp', 'data.txt')

>>> os.path.join(r'C:\temp', 'output.txt')
'C:\\temp\\output.txt'

>>> name = r'C:\temp\data.txt'                            # Windows paths
>>> os.path.dirname(name), os.path.basename(name)
('C:\\temp', 'data.txt')

>>> name = '/home/lutz/temp/data.txt'                     # Unix-style paths
>>> os.path.dirname(name), os.path.basename(name)
('/home/lutz/temp', 'data.txt')

>>> os.path.splitext(r'C:\PP4thEd\Examples\PP4E\PyDemos.pyw')
('C:\\PP4thEd\\Examples\\PP4E\\PyDemos', '.pyw') 

 

Running Shell Commands from Script

Other os module Exports

Introducing the sys Module

Platforms and Versions

C:\...\PP4E\System> python
>>> import sys
>>> sys.platform, sys.maxsize, sys.version
('win32', 2147483647, '3.1.1 (r311:74483, Aug 17 2009, 17:02:12) ...more deleted...')

>>> if sys.platform[:3] == 'win': print('hello windows')
...
hello windows

 

The Module Search Path

>>> sys.path
['', 'C:\\PP4thEd\\Examples', ...plus standard library paths deleted... ]
>>> sys.path.append(r'C:\mydir')
>>> sys.path
['', 'C:\\PP4thEd\\Examples', ...more deleted..., 'C:\\mydir'] 

The Loaded Modules Table

>>> sys.modules
{'reprlib': <module 'reprlib' from 'c:\python31\lib\reprlib.py'>, ...more deleted...

>>> list(sys.modules.keys())
 ['reprlib', 'heapq', '__future__', 'sre_compile', '_collections', 'locale', '_sre',
'functools', 'encodings', 'site', 'operator', 'io', '__main__', ...more deleted... ]

>>> sys
<module 'sys' (built-in)>
>>> sys.modules['sys']
<module 'sys' (built-in)>

Exception Details

>>> import traceback, sys
>>> def grail(x):
...     raise TypeError('already got one')
...
>>> try:
...     grail('arthur')
... except:
...     exc_info = sys.exc_info()
...     print(exc_info[0])
...     print(exc_info[1])
...     traceback.print_tb(exc_info[2])
...
<class 'TypeError'>
already got one
  File "<stdin>", line 2, in <module>
  File "<stdin>", line 2, in grail

 

Programming Python

2 System Programming

  • System scripting overview
  • Introducing the sys Module
  • Introducing the os Module

System Scripting Overview

  • dir, dir(sys), dir(os), dir(os.path)
  • import, import sys, import os
  • sys.__doc__, print(sys.__doc__)
  • help(sys)
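  • String splitting: split('\n') leaves an empty string after the final newline, splitlines() does not
    >>> line = 'aaa\nbbb\nccc\n'
    >>> line.split('\n')
    ['aaa', 'bbb', 'ccc', '']
    >>> line.splitlines()
    ['aaa', 'bbb', 'ccc']
  • String Method Basics
    • mystr.find('XXX') returns the first offset of the substring: with mystr = 'xxxSPAMxxx', mystr.find('SPAM') returns 3
    • mystr.replace('AA', 'XXX') replaces every occurrence: with mystr = 'xxaaxxaa', mystr.replace('aa', 'SPAM') returns 'xxSPAMxxSPAM'
    • 'XXX' in mystr is a substring test: with mystr = 'xxxSPAMxxx', 'SPAM' in mystr is True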

Saturday, September 8, 2012

Task 7-2

You want to start a long job in the background (so that your terminal is freed up) and save both standard output and standard error in a single log file. Write a script that does this.
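One possible shape for such a script is sketched below. It assumes the job and its arguments are handed over as parameters; the function name start_bg and the temporary log file are made up for the illustration.

```bash
# start_bg LOGFILE COMMAND [ARGS...]   (hypothetical helper)
# Runs COMMAND in the background and sends both output streams to LOGFILE.
start_bg() {
    local logfile=$1
    shift
    # "> file" has to come before "2>&1": standard error is pointed at
    # whatever standard output refers to at that moment.
    "$@" > "$logfile" 2>&1 &
}

logfile=$(mktemp)
start_bg "$logfile" sh -c 'echo "to stdout"; echo "to stderr" >&2'
wait                 # only here so that the log can be read straight away
cat "$logfile"
```

In real use the wait would be left out, since the point is to get the terminal back while the job runs.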

Friday, September 7, 2012

7.1.2. File Descriptors

The next few redirectors in Table 7-1 depend on the notion of a file descriptor. Like the device files used with <>, this is a low-level UNIX I/O concept that is of interest only to systems programmers—and then only occasionally. You can get by with a few basic facts about them; for the whole story, look at the entries for read( ), write( ), fcntl( ), and others in Section 2 of the UNIX manual. You might wish to refer to UNIX Power Tools by Shelley Powers, Jerry Peek, Tim O'Reilly, and Mike Loukides (O'Reilly).
File descriptors are integers starting at 0 that refer to particular streams of data associated with a process. When a process starts, it usually has three file descriptors open. These correspond to the three standards: standard input (file descriptor 0), standard output (1), and standard error (2). If a process opens additional files for input or output, they are assigned to the next available file descriptors, starting with 3.
By far the most common use of file descriptors with bash is in saving standard error in a file. For example, if you want to save the error messages from a long job in a file so that they don't scroll off the screen, append 2> file to your command. If you also want to save standard output, append > file1 2> file2.
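A small sketch of these descriptor redirections follows. The function noisy_job and the temporary files are stand-ins invented for the example, not part of the text above.

```bash
# noisy_job stands in for a long-running command (hypothetical).
noisy_job() {
    echo "result line"
    echo "warning line" >&2
}

out_file=$(mktemp)
err_file=$(mktemp)
extra_file=$(mktemp)

# Descriptor 1 (standard output) and 2 (standard error) go to separate files.
noisy_job > "$out_file" 2> "$err_file"

# A further file can be opened on the next free descriptor, here 3,
# for the duration of the braced group.
{ echo "written through descriptor 3" >&3; } 3> "$extra_file"
```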

Task 7-1


The s file command in mail saves the current message in file. If the message came over a network (such as the Internet), then it has several header lines prepended that give information about network routing. Write a shell script that deletes the header lines from the file.

We can use ed to delete the header lines. To do this, we need to know something about the syntax of mail messages; specifically, that there is always a blank line between the header lines and the message text. The ed command 1,/^[ ]*$/d does the trick: it means, "Delete from line 1 until the first blank line." We also need the ed commands w (write the changed file) and q (quit). Here is the code that solves the task:
ed "$1" << EOF
1,/^[ ]*$/d
w
q
EOF

The shell does parameter (variable) substitution and command substitution on text in a here-document, meaning that you can use shell variables and commands to customize the text. A good example of this is the bashbug script, which sends a bug report to the bash maintainer (see Chapter 11).
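The substitution behaviour can be seen in a few lines. This is not the bashbug script, only a minimal sketch; the variable sender and the file names are assumptions made for the example. It also shows that quoting the label switches the substitutions off.

```bash
sender="alice"                 # hypothetical value
report=$(mktemp)

# Unquoted label: parameter and command substitution happen in the body.
cat > "$report" << EOF
From: $sender
File: $(basename /tmp/notes.txt)
EOF

# Quoted label: the body is taken literally.
cat >> "$report" << 'EOF'
Literal: $sender
EOF

cat "$report"
```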

7.1.1. Here Documents

The << label redirector essentially forces the input to a command to be the shell's standard input, which is read until there is a line that contains only label. The input in between is called a here-document. Here-documents aren't very interesting when used from the command prompt. In fact, it's the same as the normal use of standard input except for the label. We could use a here-document to simulate the mail facility. When you send a message to someone with the mail utility, you end the message with a dot (.). The body of the message is saved in a file, msgfile:
$ cat >> msgfile << .
  > this is the text of
  > our message.
  > .

Here-documents are meant to be used from within shell scripts; they let you specify "batch" input to programs. A common use of here-documents is with simple text editors like ed.

Chapter 7.1. I/O Redirectors

Table 7-1. I/O redirectors
Redirector     Function
cmd1 | cmd2    Pipe; take standard output of cmd1 as standard input to cmd2.
> file         Direct standard output to file.
< file         Take standard input from file.
>> file        Direct standard output to file; append to file if it already exists.
>| file        Force standard output to file even if noclobber is set.
n>| file       Force output to file from file descriptor n even if noclobber is set.
<> file        Use file as both standard input and standard output.
n<> file       Use file as both input and output for file descriptor n.
<< label       Here-document; see text.
n> file        Direct file descriptor n to file.
n< file        Take file descriptor n from file.
n>> file       Direct file descriptor n to file; append to file if it already exists.
>&n            Duplicate standard output to file descriptor n.
<&n            Duplicate standard input from file descriptor n.
n>&m           File descriptor n is made to be a copy of the output file descriptor m.
n<&m           File descriptor n is made to be a copy of the input file descriptor m.
&>file         Directs standard output and standard error to file.
<&-            Close the standard input.
>&-            Close the standard output.
n>&-           Close the output from file descriptor n.
n<&-           Close the input from file descriptor n.
n>&word        If n is not specified, the standard output (file descriptor 1) is used. If the digits in word do not specify a file descriptor open for output, a redirection error occurs. As a special case, if n is omitted, and word does not expand to one or more digits, the standard output and standard error are redirected as described previously.
n<&word        If word expands to one or more digits, the file descriptor denoted by n is made to be a copy of that file descriptor. If the digits in word do not specify a file descriptor open for input, a redirection error occurs. If word evaluates to -, file descriptor n is closed. If n is not specified, the standard input (file descriptor 0) is used.
n>&digit-      Moves the file descriptor digit to file descriptor n, or the standard output (file descriptor 1) if n is not specified.
n<&digit-      Moves the file descriptor digit to file descriptor n, or the standard input (file descriptor 0) if n is not specified. digit is closed after being duplicated to n.
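Two of the less familiar entries in the table, >| and &>, can be tried out as follows. This is only a sketch; the temporary files and the variable names are invented for the example.

```bash
target=$(mktemp)
echo "first" >| "$target"        # mktemp has already created the file
set -o noclobber                 # from now on ">" refuses to overwrite

# Plain ">" fails on an existing file; the braces only hide the error text.
if { echo "second" > "$target"; } 2> /dev/null; then
    overwritten=yes
else
    overwritten=no
fi

echo "third" >| "$target"        # ">|" overrides noclobber
set +o noclobber

# "&> file" sends standard output and standard error to the same file.
both=$(mktemp)
{ echo "out"; echo "err" >&2; } &> "$both"
```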

Chapter 7. Input/Output and Command-line Processing

In this chapter, we switch the focus to two related topics. The first is the shell's mechanisms for doing file-oriented input and output. We present information that expands on what you already know about the shell's basic I/O redirectors.
Second, we'll "zoom in" and talk about I/O at the line and word level. This is a fundamentally different topic, since it involves moving information between the domains of files/terminals and shell variables. echo and command substitution are two ways of doing this that we've seen so far.
Our discussion of line and word I/O will lead into a more detailed explanation of how the shell processes command lines. This information is necessary so that you can understand exactly how the shell deals with quotation, and so that you can appreciate the power of an advanced command called eval, which we will cover at the end of the chapter.

Thursday, September 6, 2012

6.4. Arrays

There are several ways to assign values to arrays. The most straightforward way is with an assignment, just like any other variable:
names[2]=alice
names[0]=hatter
names[1]=duchess

This assigns hatter to element 0, duchess to element 1, and alice to element 2 of the array names.
Another way to assign values is with a compound assignment:
names=([2]=alice [0]=hatter [1]=duchess)

This is equivalent to the first example and is convenient for initializing an array with a set of values. Notice that we didn't have to specify the indices in numerical order. In fact, we don't even have to supply the indices if we reorder our values slightly:
names=(hatter duchess alice)

bash automatically assigns the values to consecutive elements starting at 0. If we provide an index at some point in the compound assignment, the values get assigned consecutively from that point on, so:
names=(hatter [5]=duchess alice)

assigns hatter to element 0, duchess to element 5, and alice to element 6.
An array is created automatically by any assignment of these forms. To explicitly create an empty array, you can use the -a option to declare. Any attributes that you set for the array with declare (e.g., the read-only attribute) apply to the entire array. For example, the statement declare -ar names would create a read-only array called names. Every element of the array would be read-only.
An element in an array may be referenced with the syntax ${array[i]}. So, from our last example above, the statement echo ${names[5]} would print the string "duchess". If no index is supplied, array element 0 is assumed.
You can also use the special indices @ and *. These return all of the values in the array and work in the same way as for the positional parameters; when the array reference is within double quotes, using * expands the reference to one word consisting of all the values in the array separated by the first character of the IFS variable, while @ expands the values in the array to separate words. When unquoted, both of them expand the values of the array to separate words. Just as with positional parameters, this is useful for iterating through the values with a for loop:
for i in "${names[@]}"; do
    echo "$i"
done

Any array elements which are unassigned don't exist; they default to null strings if you explicitly reference them. Therefore, the previous looping example will print out only the assigned elements in the array names. If there were three values at indexes 1, 45, and 1005, only those three values would be printed.
If you want to know what indices currently have values in an array then you can use ${!array[@]}. In the last example this would return 1 45 1005.[17]
[17] This is not available in versions of bash prior to 3.0.
A useful operator that you can use with arrays is #, the length operator that we saw in Chapter 4. To find out the length of any element in the array, you can use ${#array[i]}. Similarly, to find out how many values there are in the array, use * or @ as the index. So, for names=(hatter [5]=duchess alice), ${#names[5]} has the value 7, and ${#names[@]} has the value 3.
Reassigning to an existing array with a compound array statement replaces the old array with the new one. All of the old values are lost, even if they were at different indices to the new elements. For example, if we reassigned names to be ([100]=tweedledee tweedledum), the values hatter, duchess, and alice would disappear.
You can destroy any element or the entire array by using the unset built-in. If you specify an index, that particular element will be unset. unset names[100], for instance, would remove the value at index 100; tweedledee in the example above. However, unlike assignment, if you don't specify an index the entire array is unset, not just element 0. You can explicitly specify unsetting the entire array by using * or @ as the index.
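The index, length, and unset rules described so far can be checked with a short session like this one. It reuses the names array from the text; the other variable names are made up, and the default IFS is assumed.

```bash
names=(hatter [5]=duchess alice)      # elements 0, 5 and 6

indices="${!names[*]}"                # 0 5 6  (needs bash 3.0 or later)
count=${#names[@]}                    # 3 assigned elements
length=${#names[5]}                   # 7, the length of "duchess"

unset 'names[5]'                      # quoted so the shell cannot glob it
after_unset="${names[*]}"             # hatter alice

names=([100]=tweedledee tweedledum)   # replaces the whole array
new_indices="${!names[*]}"            # 100 101

unset names                           # removes the entire array
final_count=${#names[@]}              # 0
```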
Let's now look at a simple example that uses arrays to match user IDs to account names on the system. The code takes a user ID as an argument and prints the name of the account plus the number of accounts currently on the system:
for i in $(cut -f 1,3 -d: /etc/passwd) ; do
   array[${i#*:}]=${i%:*}
done
     
echo "User ID $1 is ${array[$1]}."
echo "There are currently ${#array[@]} user accounts on the system."

We use cut to create a list from fields 1 and 3 in the /etc/passwd file. Field 1 is the account name and field 3 is the user ID for the account. The script loops through this list using the user ID as an index for each array element and assigns each account name to that element. The script then uses the supplied argument as an index into the array, prints out the value at that index, and prints the number of existing array values.

Wednesday, September 5, 2012

"$*", "$@", "$#"

Two special variables contain all of the positional parameters (except positional parameter 0): * and @. The difference between them is subtle but important, and it's apparent only when they are within double quotes.
"$*" is a single string that consists of all of the positional parameters, separated by the first character in the value of the environment variable IFS (internal field separator), which is space, TAB, and NEWLINE by default. On the other hand, "$@" is equal to "$1" "$2"... "$N", where N is the number of positional parameters. That is, it's equal to N separate double-quoted strings, which are separated by spaces. If there are no positional parameters, "$@" expands to nothing. We'll explore the ramifications of this difference in a little while.
The variable # holds the number of positional parameters (as a character string). All of these variables are "read-only," meaning that you can't assign new values to them within scripts.
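The difference is easiest to see by counting words. In this sketch the helper count_words and the sample parameters are invented for the illustration.

```bash
# count_words prints how many arguments it received (hypothetical helper).
count_words() { echo $#; }

set -- "mad hatter" duchess alice    # three positional parameters

old_ifs=$IFS
IFS=,
star="$*"                            # mad hatter,duchess,alice
IFS=$old_ifs

n_star=$(count_words "$*")           # 1: one single string
n_at=$(count_words "$@")             # 3: one word per parameter
n_unquoted=$(count_words $*)         # 4: "mad hatter" is split in two
```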

Tuesday, September 4, 2012

6.2. Typed Variables

Table 6-1. Declare options
Option  Meaning
-a      The variables are treated as arrays
-f      Use function names only
-F      Display function names without definitions
-i      The variables are treated as integers
-r      Makes the variables read-only
-x      Marks the variables for export via the environment
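A brief sketch of the -i, -r, and -a options in use. The variable names and values are arbitrary examples.

```bash
declare -i total        # integer: the right-hand side is evaluated as arithmetic
total=2+3*4             # total is 14

plain=2+3*4             # an ordinary variable keeps the literal text

declare -r limit=10     # read-only
# Assigning to limit fails; a subshell keeps the failure away from this script.
if ( limit=20 ) 2> /dev/null; then
    changed=yes
else
    changed=no
fi

declare -a names        # explicitly created, still empty array
names[0]=hatter
```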

Monday, September 3, 2012

6.1.3. getopts

getopts takes two arguments. The first is a string that can contain letters and colons. Each letter is a valid option; if a letter is followed by a colon, the option requires an argument. getopts picks options off the command line and assigns each one (without the leading dash) to a variable whose name is getopts's second argument. As long as there are options left to process, getopts will return exit status 0; when the options are exhausted, it returns exit status 1, causing the while loop to exit.
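The while loop mentioned above might look roughly like this. The function parse_opts, its options -v and -o, and the variables it sets are all assumptions made for the sketch. OPTARG and OPTIND are the variables getopts itself maintains for the option argument and the index of the next parameter.

```bash
# parse_opts: hypothetical function; -v is a flag, -o takes an argument.
parse_opts() {
    local opt OPTIND=1
    verbose=no
    outfile=
    while getopts "vo:" opt; do
        case $opt in
            v)  verbose=yes ;;
            o)  outfile=$OPTARG ;;
            \?) echo "usage: parse_opts [-v] [-o file] args..." >&2
                return 1 ;;
        esac
    done
    shift $((OPTIND - 1))      # drop the options that were processed
    rest=("$@")                # remaining, non-option arguments
}

parse_opts -v -o out.txt one two
echo "verbose=$verbose outfile=$outfile rest=${rest[*]}"
```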

Sunday, September 2, 2012

Chapter 6. Command-Line Options and Typed Variables

In particular, if you are an experienced UNIX user, it might have occurred to you that none of the example scripts shown so far have the ability to handle options preceded by a dash (-) on the command line. And if you program in a conventional language like C or Pascal, you will have noticed that the only type of data that we have seen in shell variables is character strings; we haven't seen how to do arithmetic, for example.

Tuesday, August 28, 2012

5.1.4.1. String comparison

Table 5-1. String comparison operators
Operator      True if...
str1 = str2   str1 matches str2
str1 != str2  str1 does not match str2
str1 < str2   str1 is less than str2
str1 > str2   str1 is greater than str2
-n str1       str1 is not null (has length greater than 0)
-z str1       str1 is null (has length 0)
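A sketch of these operators inside an if construct. The function compare_strings is invented for the example. Note that within [ ... ] the < and > operators would need a backslash so they are not taken as redirections, which is why [[ ... ]] is used for that test here.

```bash
# compare_strings A B: prints how A sorts relative to B (hypothetical helper).
compare_strings() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "empty"
    elif [ "$1" = "$2" ]; then
        echo "equal"
    elif [[ $1 < $2 ]]; then     # inside [ ], < would have to be written \<
        echo "less"
    else
        echo "greater"
    fi
}

compare_strings alice hatter
compare_strings hatter alice
```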

5.1.4.2. File attribute checking

Table 5-2. File attribute operators
Operator         True if...
-a file          file exists
-d file          file exists and is a directory
-e file          file exists; same as -a
-f file          file exists and is a regular file (i.e., not a directory or other special type of file)
-r file          You have read permission on file
-s file          file exists and is not empty
-w file          You have write permission on file
-x file          You have execute permission on file, or directory search permission if it is a directory
-N file          file was modified since it was last read
-O file          You own file
-G file          file's group ID matches yours (or one of yours, if you are in multiple groups)
file1 -nt file2  file1 is newer than file2 [6]
file1 -ot file2  file1 is older than file2

[6] Specifically, the -nt and -ot operators compare modification times of two files.
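A few of the file attribute operators combined in one function, as a sketch. The function describe and the temporary directory are assumptions for the illustration.

```bash
# describe PATH: prints a one-word description (hypothetical helper).
describe() {
    if [ ! -e "$1" ]; then
        echo "missing"
    elif [ -d "$1" ]; then
        echo "directory"
    elif [ -f "$1" ] && [ ! -s "$1" ]; then
        echo "empty-file"
    elif [ -f "$1" ]; then
        echo "file"
    else
        echo "other"
    fi
}

workdir=$(mktemp -d)
: > "$workdir/empty"           # create a zero-length file
echo data > "$workdir/full"

describe "$workdir"
describe "$workdir/empty"
describe "$workdir/full"
describe "$workdir/nonesuch"
```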

5.2. for

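In Pascal, a for loop counts through a range of numbers:
for x := 1 to 10 do
begin
    statements...
end

The bash for loop instead steps through a list of words:
for name [in list]
do
    statements that can use $name...
done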
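Both forms of the bash loop, with and without the optional in list part, are sketched below. The variable and function names are made up for the example. When in list is left out, the loop runs over the positional parameters.

```bash
# With a list: name takes each word in turn.
visited=
for name in hatter duchess alice; do
    visited="$visited<$name>"
done

# Without "in list", the loop runs over the positional parameters.
count_args() {
    local n=0 arg
    for arg; do
        n=$((n + 1))
    done
    echo "$n"
}

count_args a "b c" d
```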

5.1.5. Integer Conditionals

Table 5-3. Arithmetic test operators
Test  Comparison
-lt   Less than
-le   Less than or equal
-eq   Equal
-ge   Greater than or equal
-gt   Greater than
-ne   Not equal
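A short sketch of the arithmetic tests, with a reminder of why they are separate from the string operators. The function classify and the variable names are invented for the example.

```bash
# classify N: compares an integer against zero (hypothetical helper).
classify() {
    if [ "$1" -lt 0 ]; then
        echo negative
    elif [ "$1" -eq 0 ]; then
        echo zero
    else
        echo positive
    fi
}

# Integer and string comparison can disagree about the same operands:
[ 10 -gt 9 ] && numeric="10 is greater than 9"
[[ 10 < 9 ]] && lexical="10 sorts before 9"
```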

Monday, August 27, 2012

Task 4-2


You are writing a graphics file conversion utility for use in creating a web page. You want to be able to take a PCX file and convert it to a JPEG file for use on the web page.
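The file name part of this task can be handled with a pattern-matching operator. The sketch below only derives the output name; the function name jpg_name is an assumption, and the actual image conversion is left out because it depends on which tools are installed.

```bash
# jpg_name FILE: derives the JPEG name for a PCX file (hypothetical helper).
jpg_name() {
    local filename=$1
    echo "${filename%.pcx}.jpg"     # drop a trailing .pcx, then add .jpg
}

jpg_name alice.pcx
jpg_name pictures/tea.party.pcx
```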

${...}, $(...)

${...}:  String Operator
  • String Substitution
  • String Pattern Matching
  • String Length
$(...):  Command Substitution
Here are some simple examples:
  • The value of $(pwd) is the current directory (same as the environment variable $PWD).
  • The value of $(ls $HOME) is the names of all files in your home directory.
  • The value of $(ls $(pwd)) is the names of all files in the current directory.
  • The value of $(< alice) is the contents of the file alice with any trailing newlines removed.[10]
    [10] Not available in versions of bash prior to 2.02.
  • To find out detailed information about a command if you don't know where its file resides, type ls -l $(type -path -all command-name). The -all option forces type to do a pathname look-up and -path causes it to ignore keywords, built-ins, etc.
  • If you want to edit (with vi) every chapter of your book on bash that has the phrase "command substitution," assuming that your chapter files all begin with ch, you could type:
          vi $(grep -l 'command substitution' ch*)

  • The -l option to grep prints only the names of files that contain matches.
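Some of these substitutions can be tried safely in a scratch directory. The directory and the variable names below are invented for the sketch; the file is called alice to match the example in the list.

```bash
workdir=$(mktemp -d)                       # the command's output becomes the value
printf 'hatter\nduchess\n\n\n' > "$workdir/alice"

listing=$(ls "$workdir")                   # alice
contents=$(< "$workdir/alice")             # trailing newlines are removed
# Substitutions can be nested, as in $(ls $(pwd)):
parent=$(basename "$(dirname "$workdir/alice")")
```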

Sunday, August 26, 2012

Ch. 5 Flow Control

5.1.  if/else

The if construct has the following syntax:
if condition
then
    statements
[elif condition
    then statements...]
[else
    statements]
fi




5.1.1.  Exit Status
5.1.2.  Return
5.1.3.  Combination of Exit Status
5.1.4.  Condition Tests
5.1.5.  Integer Conditional
 
 

4.4. Command Substitutions

4.4.1. 

4.3. String Operators

4.3.1.  Syntax of String Operators

Table 4-1. Substitution operators
Operator                   Substitution
${varname:-word}           If varname exists and isn't null, return its value; otherwise return word.
                           Purpose: Returning a default value if the variable is undefined.
                           Example: ${count:-0} evaluates to 0 if count is undefined.
${varname:=word}           If varname exists and isn't null, return its value; otherwise set it to word and then return its value. Positional and special parameters cannot be assigned this way.
                           Purpose: Setting a variable to a default value if it is undefined.
                           Example: ${count:=0} sets count to 0 if it is undefined.
${varname:?message}        If varname exists and isn't null, return its value; otherwise print varname: followed by message, and abort the current command or script (non-interactive shells only). Omitting message produces the default message parameter null or not set.
                           Purpose: Catching errors that result from variables being undefined.
                           Example: ${count:?"undefined!"} prints "count: undefined!" and exits if count is undefined.
${varname:+word}           If varname exists and isn't null, return word; otherwise return null.
                           Purpose: Testing for the existence of a variable.
                           Example: ${count:+1} returns 1 (which could mean "true") if count is defined.
${varname:offset:length}   Performs substring expansion. It returns the substring of $varname starting at offset and up to length characters. The first character in $varname is position 0. If length is omitted, the substring starts at offset and continues to the end of $varname. If offset is less than 0 then the position is taken from the end of $varname. If varname is @, the length is the number of positional parameters starting at parameter offset.
                           Purpose: Returning parts of a string (substrings or slices).
                           Example: If count is set to frogfootman, ${count:4} returns footman. ${count:4:4} returns foot.
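The operators in the table can be exercised one after another. This is a sketch using the count variable from the examples; the single-letter variable names and the variable missing are made up.

```bash
unset count
a=${count:-0}          # 0; count itself stays unset
b=${count:+1}          # empty, because count is unset
: "${count:=5}"        # sets count to 5 (":" is a do-nothing command)
after=$count
c=${count:+1}          # 1, now that count has a value

count=frogfootman
d=${count:4}           # footman
e=${count:4:4}         # foot
f=${count: -3}         # man; the space keeps ":-" from being read as the default operator

# ":?" aborts a non-interactive shell, so it is tried in a subshell here.
unset missing
if ( : "${missing:?not set}" ) 2> /dev/null; then g=ok; else g=aborted; fi
```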

4.3.2.  Patterns and Pattern Matching

Table 4-2. Pattern-matching operators
Operator                      Meaning
${variable#pattern}           If the pattern matches the beginning of the variable's value, delete the shortest part that matches and return the rest.
${variable##pattern}          If the pattern matches the beginning of the variable's value, delete the longest part that matches and return the rest.
${variable%pattern}           If the pattern matches the end of the variable's value, delete the shortest part that matches and return the rest.
${variable%%pattern}          If the pattern matches the end of the variable's value, delete the longest part that matches and return the rest.
${variable/pattern/string}
${variable//pattern/string}   The longest match to pattern in variable is replaced by string. In the first form, only the first match is replaced. In the second form, all matches are replaced. If the pattern begins with a #, it must match at the start of the variable. If it begins with a %, it must match with the end of the variable. If string is null, the matches are deleted. If variable is @ or *, the operation is applied to each positional parameter in turn and the expansion is the resultant list.
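Applied to a path name, the operators behave as follows. The path and the variable names are arbitrary choices for this sketch.

```bash
path=/home/alice/long.file.name

short_front=${path#/*/}       # alice/long.file.name   (shortest match at the start)
long_front=${path##/*/}       # long.file.name         (longest match at the start)
short_back=${path%.*}         # /home/alice/long.file  (shortest match at the end)
long_back=${path%%.*}         # /home/alice/long       (longest match at the end)

first_only=${path/./_}        # /home/alice/long_file.name
all_matches=${path//./_}      # /home/alice/long_file_name
```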

4.3.3.  Length Operator

${#varname} returns the length of the value of the variable as a character string.

4.3.4.  Extended Pattern Matching

Table 4-3. Pattern-matching operators
Operator Meaning
*(patternlist) Matches zero or more occurrences of the given patterns.
+(patternlist) Matches one or more occurrences of the given patterns.
?(patternlist) Matches zero or one occurrences of the given patterns.
@(patternlist) Matches exactly one of the given patterns.
!(patternlist) Matches anything except one of the given patterns.
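A sketch of the extended patterns in a case statement. It assumes the extglob shell option, which bash needs to have switched on for these operators; the function kind and the sample names are invented.

```bash
shopt -s extglob      # must be on before the patterns below are parsed

# kind NAME: classifies a file name (hypothetical helper).
kind() {
    case $1 in
        *.@(jpg|jpeg|png)) echo image ;;
        +([0-9]))          echo number ;;
        !(*.*))            echo no-extension ;;
        *)                 echo other ;;
    esac
}

kind photo.jpeg
kind 2012
kind README
kind notes.txt
```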