UNIX

sed - stream editor; applies the same edits to numerous files
awk - easy manipulation of structured data, for formatted reports

both are good for quick-fix scripts
both are stream oriented: file in -> script -> output
change many documents at once

ed - one-line commands, for example:
    p                               prints the current line
    d                               deletes the current line
    g/regular/d                     deletes every line containing 'regular' in the entire file
    s                               substitute: [address]s/pattern/replacement/flags
    /regular/s/regular/complex/     on lines matching 'regular', replaces the first occurrence of 'regular' with 'complex'
    /regular/s/regular/complex/g    same, but the g flag replaces every occurrence on the line

sed cannot be used interactively
sed writes its result to standard output, ed changes the original file
ed defaults to one line (the current line), sed defaults to all lines
sed output goes to the screen if no output file is chosen
"-n" suppresses automatic output
output = > (cannot write to the original file: > truncates it before sed gets to read it)

    awk -F, '{ print $1; print $2 }' list
(-F changes the field separator to a comma)

piping - output from sed piped to awk..

    #! /bin/sh
    awk -F, '{ print $4 ", " $0 }' $* |    # pipe
    sort |                                  # pipe
    awk -F, '
    $1 == LastState { print "\t" $2 }
    $1 != LastState {
        LastState = $1
        print $1
    }'

regular expressions:
    .          wildcard: any single character
    *          zero or more of the preceding character
    ...        multiple wildcards (here, any three characters)
    ^          start of a line
    $          end of a line or string
    \{n,m\}    between n and m occurrences of the preceding character
    \          escapes the character that follows (if you are searching for a string with a period, etc.)

writing regular expressions: evaluated by hits, misses, omissions, and false alarms.

    .[!?;:,".]  .
= any character, followed by an exclamation mark or question mark or semicolon or colon or comma or quotation mark or period, and then followed by two spaces and any character.

a hyphen can specify a range, e.g. [a-z]

match dates: MM-DD-YY or MM/DD/YY
    [0-1][0-9][-/][0-3][0-9][-/][0-9][0-9]

^ as the first character inside brackets excludes: [^aeiou] matches any character that is not a lowercase vowel.

POSIX - prespecified character classes.
    [:alnum:]     Alphanumeric characters
    [:alpha:]     Alphabetic characters
    [:blank:]     Space and tab characters
    [:cntrl:]     Control characters
    [:digit:]     Numeric characters
    [:graph:]     Printable and visible (non-space) characters
    [:lower:]     Lowercase characters
    [:print:]     Printable characters (includes whitespace)
    [:punct:]     Punctuation characters
    [:space:]     Whitespace characters
    [:upper:]     Uppercase characters
    [:xdigit:]    Hexadecimal digits

any string inside quotation marks: ".*"
called the "look no-hands" approach (careful: .* takes the longest match, so it runs to the last quote on the line)

caution:
1. Think through what you want to do before you do it.
2. Describe, unambiguously, a procedure to do it.
3. Test the procedure repeatedly before committing to any final changes.

commands are carried out in order
commands are applied to all lines unless an address is specified
commands are applied to a copy of the input, never to the original file
sed reads one line at a time, and thus can easily handle large files
changes accumulate: each command sees the line as the previous command left it, so s/pig/cow/ followed by s/cow/horse/ turns both pigs and cows into horses.
addressing occurs at the beginning of a command

-----------------------
address rules:
If no address is specified, then the command is applied to each line.
If there is only one address, the command is applied to any line matching the address.
If two comma-separated addresses are specified, the command is performed on the first line matching the first address and all succeeding lines up to and including a line matching the second address.
If an address is followed by an exclamation mark (!), the command is applied to all lines that do not match the address.
-----------------------

{} is used to group commands.
use the "diff" program to compare original and modified files.
write scripts "one step at a time"
commands: d (delete), a (append), i (insert), and c (change).
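The address rules above can be tried out on a few lines of throwaway input. This is only a sketch: the five-word sample data and the helper name address_demo are made up for illustration, and the commands assume a POSIX-style sed.

```shell
#!/bin/sh
# address_demo is a hypothetical helper: it feeds five sample lines
# (one, two, three, four, five) to sed with whatever arguments it is given.
address_demo() {
    printf 'one\ntwo\nthree\nfour\nfive\n' | sed "$@"
}

# One address: the command applies to every matching line.
address_demo '/two/d'           # prints: one three four five (one per line)

# Two addresses: first match of /two/ through the next match of /four/.
address_demo '/two/,/four/d'    # prints: one five

# Address followed by !: the command applies to lines that do NOT match.
address_demo '/two/!d'          # prints: two

# Braces group several commands under one address; -n suppresses the
# automatic output, so only the explicit p prints.
address_demo -n '/two/,/four/{
s/o/0/
p
}'                              # prints: tw0 three f0ur
```

Running each variant and comparing against the unedited input with diff is a quick way to confirm which lines an address actually selected.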
syntax: [address]command
also grouped:
    address{
        command
        command
    }
commands should be separated by newlines
comment with #
\n in the replacement recalls the nth substring saved with \( and \): \2 is the second saved substring
& stands for the matched string: (&) puts parentheses around the matched string
    $a\
    End of file
= appends "End of file" at the end of the file.
list (l) shows 'hidden' (nonprinting) characters, also shows 'special effects' ****PROJECT IDEA
transform: replaces each character in the first list with the character at the same position in a second list of equal length:
    y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
- replaces lowercase with uppercase.
print - p (prints line to screen)
= prints line number

    /^\.H1/{
    n
    /^$/d
    }
You can read this script as follows: "Match any line beginning with the string `.H1', then print that line and read in the next line. If that line is blank, delete it."

r (read) and w (write) work with files.
q quits at the addressed line.
    s/@f1(\(.*\))/\\fB\1\\fR/g
- matches all characters between @f1( and ) and puts them between \fB and \fR.

the hold space - a set-aside buffer used for temp. space
hold (h or H) copies the pattern space into the hold space; get (g or G) retrieves items from the hold space.
    /1/{        # match
    h           # hold
    d           # delete
    }
    /2/{        # match
    G           # get / print
    }
reverses ones and twos: a line matching 1 is held and deleted, then comes out after the next line matching 2.

use for sed - making an index! skimming

    :top
    command1
    command2
    /pattern/b top
    command3
The script executes command3 only if the pattern doesn't match. All three commands will be executed, although the first two may be executed multiple times.

    /\.Rh 0/{
    s/"\(.*\)" "\(.*\)" "\(.*\)"/"\1" "\2" "\3"/
    t
    s/"\(.*\)" "\(.*\)"/"\1" "\2"/
    t
    s/"\(.*\)"/"\1"/
    }
(t branches to the end of the script once a substitution has succeeded, so only the first s that matches is applied.)

i like the way sed scripts look.

AWK
first written in 1977. uses c style functions
"the rules":
    awk [-v var=value] [-Fre] [--] 'pattern { action }' var=value datafile(s)
    awk [-v var=value] [-Fre] -f scriptfile [--] var=value datafile(s)
not too bad, right? yes.

hello world:
    $ echo 'this line of data is ignored' > test
    $ awk '{ print "Hello, world" }' test
    Hello, world
(executes the print statement once for every line of input; the content of the "data is ignored" line is never used.)
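The two places where "the rules" allow var=value behave differently, which a small experiment makes visible. A sketch only: the function name awk_vars_demo and the variable names early and late are invented here, and the trailing - operand is assumed to mean standard input, as it does in POSIX awk.

```shell
#!/bin/sh
# awk_vars_demo is a hypothetical helper comparing -v assignments with
# var=value operands placed before the data file.
awk_vars_demo() {
    printf 'first line\nsecond line\n' |
    awk -v early=set '
        # -v assignments are made before BEGIN runs.
        BEGIN { print "BEGIN: early=" early ", late=" late }
        # An operand assignment takes effect only when awk reaches it
        # in the argument list, just before the next file is read.
        { print NR ": early=" early ", late=" late }
    ' late=set -
}

awk_vars_demo
# BEGIN: early=set, late=
# 1: early=set, late=set
# 2: early=set, late=set
```

The action block runs once per input record, the same way the hello world action does, so two input lines give two numbered lines of output.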
unintentional tip: echo 'this line of data is ignored' > test creates a file containing that line
main input loop - awk runs in a loop, once per input record, until it's done
BEGIN and END - actions to be performed before and after the main loop.
    /^$/ { print "This is a blank line." }
prints that message for every blank line (pattern matching). ctrl-D to terminate when typing the input.
idea - make a game in AWK.
IMPORTANT: each line is a 'record', each word (separated by tab or space) is a 'field'.

awk script for the phone list:
    {
        print ""        # output blank line
        print $1        # name
        print $2        # company
        print $3        # street
        print $4, $5    # city, state zip
    }
FS can change the field separator!
    BEGIN { FS = "," }    # comma-delimited fields
    { print $1 ", " $6 }

    $ awk -f phonelist.awk names
    John Robinson, 696-0987
    Phyllis Chapman, 879-0900
also could match patterns:
    BEGIN { FS = "," }    # comma-delimited fields
    /MA/ { print $1 ", " $6 }
- all names and numbers of people in MA.
!~ is like ! in c++ (does not match)

EXPRESSIONS (escape sequences):
    \a       Alert character, usually ASCII BEL character
    \b       Backspace
    \f       Formfeed
    \n       Newline
    \r       Carriage return
    \t       Horizontal tab
    \v       Vertical tab
    \ddd     Character represented as 1 to 3 digit octal value
    \xhex    Character represented as hexadecimal value
    \c       Any literal character c (e.g., \" for ")

assigning variables:
    a = "cab"              assigns "cab" to a
    a = "cab" "awesome"    assigns "cabawesome" to a (strings written side by side are concatenated)
awk can do arithmetic, same as c++ (++, --, +=, etc.)
saw a red light at 7:50. also my back hurts.
    BEGIN { FS = "\n"; RS = "" }
- multiline records: each line is a field, and records are separated by a blank line.
OFS - output field separator, ORS - output record separator
AWK IS LIKE C++ and SED IS LIKE NOTHING I'VE EVER SEEN
~ match, !~ no match.
REMEMBER - piping in UNIX - '|'. such as
    ls -l $* | awk '{ print $5, "\t", $9 }'

printf format characters:
    Character    Description
    c            ASCII character
    d            Decimal integer
    i            Decimal integer (added in POSIX)
    e            Floating-point format ([-]d.precisione[+-]dd)
    E            Floating-point format ([-]d.precisionE[+-]dd)
    f            Floating-point format ([-]ddd.precision)
    g            e or f conversion, whichever is shortest, with trailing zeros removed
    G            E or f conversion, whichever is shortest, with trailing zeros removed
    o            Unsigned octal value
    s            String
    x            Unsigned hexadecimal number. Uses a-f for 10 to 15
    X            Unsigned hexadecimal number. Uses A-F for 10 to 15
    %            Literal %

you can define variables from the command line:
    awk -f scriptfile high=100 low=60 datafile

awk conditional syntax:
    if ( expression )
        action1
    [else
        action2]
a newline is like a ;

WHILE:
    i = 1
    while ( i <= 4 ) {
        print $i
        ++i
    }
ARRAYS:
    array[subscript] = value

glenn gould @ 8:13. was just thinking about making a classical skate video.

how many students got an A - example:
    if ( grade == "A" )
        ++gradeA
    else if ( grade == "B" )
        ++gradeB
headache @ 8:15

awk can make text-based programs easily. awk ASCII art.
    OOOOOOOOOOOO
    OOOOOOOOOOOO
    OOXOOOOOOOOO
    OOOOOOOOOOOO
    OOOOOOOOOOOO
    OOOOOOOOOOOO
    OOOOOOOOOOOO
    OOOOOOOOOOOO
    OOOOOOOOOOOO
    OOOOOOOOOOOO
    OOOOOOOOOOOO
    OOOOOOOOOOOO
pass in numbers at command line.

FUNCTIONS IN AWK:
arithmetic functions! ex: rand(), cos(), atan2(), etc. AWK programming for lottery.
and STRING FUNCTIONS!
custom functions:
    function name (parameter-list) {
        statements
        return expression
    }
ex:
    function insert(STRING, POS, INS, before_tmp) {
        before_tmp = substr(STRING, 1, POS)
        after_tmp = substr(STRING, POS + 1)
        return before_tmp INS after_tmp
    }
    # main routine
    {
        print "Function returns", insert($1, 4, "XX")
        print "The value of $1 after is:", $1
        print "The value of STRING is:", STRING
        print "The value of before_tmp:", before_tmp
        print "The value of after_tmp:", after_tmp
    }
(STRING and before_tmp are parameters, so they are local to the function and print as empty in the main routine; after_tmp is not in the parameter list, so it is global and keeps its value.)

quote from google search:
"Although you may experience painful eyestrain and dry eye while using a computer, computer screens do not permanently damage vision."
"Blinking frequently while using the computer can also help dry eye symptoms."
"But one of the most effective ways of looking after your eyes is simply to stop for regular brief breaks." OKKKKK sucessfully google searched without checking email and back to oreilly by 8:30. good sort function: function sort(ARRAY, ELEMENTS, temp, i, j) { for (i = 2; i <= ELEMENTS; ++i) { for (j = i; ARRAY[j-1] > ARRAY[j]; --j) { temp = ARRAY[j] ARRAY[j] = ARRAY[j-1] ARRAY[j-1] = temp } } return } CRAZY FUNCTIONS (dubbed 'the bottom drawer') getline < "data" #input from file 'data' date gets todays date! good for spam! eg: To: Peabody From: Sherman Date: Sun., Feb 18, 2007 I am writing you on Sun., Feb 18, 2007 to remind you about our special offer. from: /@date/ { "date +'%a., %h %d, %Y'" | getline today gsub(/@date/, today) } { print } # replaces @date with date. close() - closes file. system() - runs system command! maybe even other awk programs - maybe even print > cab.txt exports to fillllee print | vlc pipes to program. built in variables: FILENAME Current filename FS Field separator (a blank) NF Number of fields in current record NR Number of the current record OFMT Output format for numbers (%.6g) OFS Output field separator (a blank) ORS Output record separator (a newline) RS Record separator (a newline) print error: function printerr (message) { # print message, record number and record printf("ERROR:%s (%d) %s\n", message, NR, $0) > "/ dev/tty" } differences between awks. GNU, BELL labs, etc. systime function: systime() CHAPTER 12 two fullfeatured programs extensivley commented. correction: index program might only correct index formatting. CHAPTER 13. user contributed scripts. AWK used for formatting bills for sure. note: lookup troff. lesson learned - awk programs are easier to read than sed scripts. another lesson learned high-level and low-level programming is relative. sub(r, s, t) Substitute s for first match of the regular expression r in the string t. Return 1 if successful; 0 otherwise. If t is not supplied, defaults to $0.