csh and tcsh shell scripting language

C Shell (csh) or tcsh:

C shell was created by Bill Joy at UC Berkeley (UCB) in 1978, around the same time as the Bourne shell (sh). It was designed to make the coding style similar to the C language (since C was the most used programming language at that time), and to make the shell easier to use. Later, an improved version of csh called tcsh was developed, which borrowed a lot of useful concepts from the TENEX operating system, hence the "t". On most Linux systems, tcsh is the shell actually used (even though it says csh, it's actually tcsh, as csh is usually a soft link to the tcsh binary).

NOTE: Even though we say csh everywhere, it's really tcsh that we are talking about. Extensions still say *.csh. We'll call it csh even though we mean tcsh.

This is the official website for tcsh: https://www.tcsh.org/

This is a good website for all shell/Linux related stuff (the author lists a lot of reasons why csh shouldn't be used): https://www.grymoire.com/Unix/Csh.html

Bill Joy's csh intro paper here: http://www.kitebird.com/csh-tcsh-book/csh-intro.pdf

csh syntax: https://www.mkssoftware.com/docs/man1/csh.1.asp

NOTE: C shell is not a recommended shell. Bash is the one that should be used. Almost all of the Linux scripts that you find in your Linux distro are in bash. Csh is introduced here since some scripts in the corporate IT world are still written in csh, which you may need to work on from time to time. Csh is rife with bugs, and you can find a lot of articles on why csh should not be used. One reason why csh is so buggy and unpredictable is that it doesn't have a true parser like other shells. Bill Joy famously admitted that he wasn't a very good programmer when he wrote csh. To make matters worse, there is no elaborate documentation of csh (unlike bash, which has very detailed documentation on the tldp.org website). The doc that shows up most frequently on internet searches is that "Bill Joy's csh" paper, which is decades old (shown in link above). This makes it hard to systematically learn csh. What I've documented below is just bits and pieces from different websites, as well as my own trial and error with csh cmds. In spite of all this, csh became very popular, which just bewilders me. Hope I've convinced you enough not to read thru the csh doc below.

Csh startup files: uses 3 startup files:

1. .cshrc: It's sourced every time a new C shell starts. The shell does a "source .cshrc". csh scripts also source this file, unless the "-f" (i.e fast) option is used on the cmd line or on the 1st line of the csh script.

ex: #!/bin/csh -f => -f option on 1st line of script causes csh to start up in fast mode, i.e it reads neither the .cshrc nor the .login

2. .login: If you are in a login shell, then this is the 2nd file sourced, after the .cshrc file.

3. .logout: This is the last file executed on logging out (only from a login shell)
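As a rough illustration of how the startup files get used, a minimal ~/.cshrc might look like the sketch below. The history size, the alias name "ll" and the prompt string are only illustrative choices (assumptions, not required settings). The $?prompt test is a common way to keep interactive-only settings out of scripts, since the prompt var is normally set only in interactive shells.

```csh
# ~/.cshrc : sourced by every new csh, unless csh is started with -f

# Settings wanted in every shell
# (200 is an illustrative value)
set history = 200

# Interactive-only settings: prompt is normally defined only in interactive shells
if ($?prompt) then
    # "ll" is a hypothetical alias name
    alias ll 'ls -l'
    # tcsh prompt format, %/ is the current working dir
    set prompt = "[%/] > "
endif
```

Login-only work (for example printing a welcome message) would go in ~/.login instead, as that file is read only by login shells.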

Simple csh script example: similar to bash, except for 1st line. Save it as test.csh.

#!/bin/csh -f

echo "hello"; => this cmd followed by args. Each line of csh is "cmd followed by args"

Run this script by doing "chmod 755 test.csh", and then typing ./test.csh.

csh itself has many options that can be provided on cmdline to control its behaviour.

ex: csh -f -v -x ... => many more options possible. -x is an important option that echoes cmds immediately before execution (this is useful for debug)

Csh Syntax:

csh is very similar to C in syntax. Similar to bash, we'll look at reserved keywords/cmds, variables and special char. On a single line, different tokens are separated by whitespace (space or tab). The following char don't need a space to be identified as a separate token: &, &&, |, ||, <, <<, >, >>, ;, (, ). Many other special char, such as +, do need spaces around them, since w/o a space csh is not able to parse them as separate tokens.

As in bash, each line in csh is a cmd followed by args for that cmd. Separate lines are separated by newline (enter key), semicolon (;) or control characters (similar to bash).

A. Commands: similar to bash, csh may have simple or complex cmds.

1. simple cmd: just as in bash, they can be built in or external cmd. Few ex:

  • alias: same as in bash, but syntax is a little different. Aliases are usually put in initialization files like ~/.cshrc etc so that the shell sees all the aliases (instead of typing them each time or sourcing the alias file). One caveat to note is that aliases are not inherited by child processes. So, a csh script started with "-f" (which skips ~/.cshrc) will not see these aliases until you redefine them in your script.
    • ex: alias e 'emacs'. => NOTE no "=" sign here (diff than bash where we had = sign). Now we can type "e" on cmd line and it will get replaced by emacs and so emacs will open.

2. complex/compound cmd: They are cmds as if-else, while, etc. They are explained later.

B. Variable: csh needs "set" cmd to set variables, unlike bash which just uses "=". Everything else is same. $ is used to get value of a variable assigned previously. There are 2 kinds of var: global and local var

1. local var: local vars are assigned using set and "=" sign, i.e set var = value. NOTE: we can use spaces around = sign, unlike bash where no spaces were allowed. This is because "set" keyword is the assignment cmd, and the rest is arg. However, use spaces either on both sides of = or on neither side (set var = value or set var=value). A space on only one side does not do what you expect. In bash, the whole assignment thing was a cmd.

set libsdir = "/sim/proj/"$USER"/digtop_$1" => spaces around = sign. $ inside double quotes or outside, is always expanded.
echo "myincalibs = " $libsdir => prints: myincalibs = /sim/proj/ashish/digtop_abcd

2. global var: For global var, we need to use "setenv" and no = sign. i.e: setenv HOME /home/ashish. No separate "export" cmd needed (as in bash). setenv cmd by itself does export too, so that the var is accessible to subshells. To see list of global var, we can use cmd "setenv" by itself with no args (we can also use env and printenv cmds from bash). Most of the global vars such as HOME, PATH, etc are same as those in bash.

setenv RTL_FILES "${DIGWORK}_user/design/svfiles.f" => {} allows the var inside braces only to be used for $ subs. NOTE: no = sign. setenv (rather than set) is used for global var (we used setenv so that this variable can be used in other scripts running from same shell).
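The difference b/w "set" and "setenv" can be seen with a small sketch like the one below. The var names local_var and GLOBAL_VAR are made up for this example. Each "csh -f -c" starts a child shell, so only the var created with setenv should be visible there. The second child is expected to fail with an "Undefined variable" error, which is the point of the demo.

```csh
#!/bin/csh -f
# Local (shell) var: known only to this shell
set local_var = "only_here"
# Global (environment) var: passed on to child processes
setenv GLOBAL_VAR "visible_to_children"

# Child shell can see the global var
csh -f -c 'echo "child sees GLOBAL_VAR = $GLOBAL_VAR"'

# Child shell cannot see the local var, so this one errors out
csh -f -c 'echo "child sees local_var = $local_var"'

# The parent still has both
echo "parent sees local_var = $local_var"
```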

$? => this checks for existence of a var. i.e $?HOME will return 1 (implying var exists), while $?temp123 will return 0 (implying var doesn't exist). This has a different meaning in bash, where $? expands to the exit status of the most recently executed foreground pipeline. So, in bash, $?HOME will return 0HOME if the cmd run just before this was a success, or return 127HOME if the cmd run just before this was in error, where 127 is the error code.
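A typical use of $? is to guard a var before using it, since csh errors out on an undefined var. The sketch below assumes a made-up var name MY_TOOL_DIR and a made-up default path, and only sets the default when nothing has defined the var yet.

```csh
#!/bin/csh -f
# MY_TOOL_DIR and /tmp/my_tool are hypothetical, used only for this sketch
if (! $?MY_TOOL_DIR) then
    # var is not defined anywhere, so provide a default
    setenv MY_TOOL_DIR /tmp/my_tool
endif

# Safe to use now, it's defined either by the caller or by the default above
echo "MY_TOOL_DIR = $MY_TOOL_DIR"
```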

prompt: Terminals with csh usually show a "%" sign as prompt. The prompt is held in the shell var "prompt" (instead of PS1 as in bash), so it's set using "set", same as other local vars. However, the format sequences that work inside the prompt string are not the same in all C shells.

echo $prompt => on my centos laptop when in csh, it shows "%%[%n@%m %c]%#"
set prompt = " $cwd %h$ " => doesn't work at some corporations, as csh is actually tcsh there (even though echo says it's /bin/csh, the installed pkg is tcsh, and csh is just a soft link to tcsh), so use tcsh syntax. tcsh is meant to be backward compatible with csh, but somehow this cmd doesn't work. So, correct way would be:
set prompt = "[%/] > " => this worked on my m/c, but not guaranteed to work on others.

Data types: variables can be of diff data types. As in bash, the primitive data type here is the char string.

1. String: Everything is a char string. It's internally interpreted as an integer depending on context.

ex: set a=12; set b=13; @ c = $a + $b; echo $c => prints 25 (i.e 12+13), since @ treats those 2 var as integers. @ is a special cmd used to assign a calculated value. For + to be treated as an arithmetic operator, there has to be a space on both sides of +, else it will error out. NOTE: it doesn't use "set" cmd to assign an arithmetic calculated value.

ex: set a=12; set b=13; set c=$a+$b; echo $c will print "12+13", since it will treat the whole thing on RHS as a string (those 2 var and + are all string). NOTE: it uses "set" cmd here for string assignment.

ex: set d=jim => this assigns a string "jim" to var "d" even though it's not enclosed in double quotes. We would have needed to use single or double quotes if we had special char inside the RHS string as space, $, etc.
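The two assignment examples above can be put side by side in a small script. This is only a sketch using the same values as above.

```csh
#!/bin/csh -f
set a = 12
set b = 13

# "@" evaluates the right hand side as an integer expression
@ c = $a + $b
# "set" just stores the text it is given
set d = $a+$b

# prints 25
echo $c
# prints 12+13
echo $d
```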

2. array: array contains multiple values. Syntax is similar to bash, except that "set" is needed for assignment. However, here index starts from 1 instead of 0 (in bash, index starts from 0). Also, associative arrays don't seem to work in csh.

ex: set my_array=(one two three) => this assigns my_array[1]=one, my_array[2]=two and so on.

ex: echo $my_array[2] => this prints "two" even though curly braces are not used. This works because csh treats [ ] right after a var name as an index. NOTE: in bash, echo ${my_array[2]} was needed to print correctly. csh accepts ${my_array[2]} too, so to be on safe side, always use curly braces around var to remove ambiguity. "echo $my_array[0]" would print nothing as there is no index 0.

ex: echo $my_array => this would print the whole array, i.e "one two three", unlike bash, where it prints just the 1st element of array, i.e "one"

ex: set my_array[2]=one => If we do "echo $my_array[2]" it will print "one" as the array got overwritten at index=2.

ex: set me[1]="pat" => this doesn't work, as arrays can only be created via ( ... ). We can later change the value of any index of an existing array using this syntax, but we can't define a new array using this. If we do "echo ${me[1]}" then it gives an error "me: Undefined variable" as csh looks for array "me" defined using parenthesis ( ... ) and tries to get the value at index=1. In this case, "me" was never defined to be an array.

ex: set my_array[name]='ajay' => associative arrays don't work in csh
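The array operations above can be tried out with a short script like this one. It's a sketch that reuses the my_array name from the examples. $#my_array (number of elements) and the [3-4] range form are standard csh syntax, though not covered above.

```csh
#!/bin/csh -f
# Arrays are created with "set" and parentheses
set my_array = (one two three)

# $#name gives the number of elements: prints 3
echo $#my_array
# Index starts at 1: prints two
echo $my_array[2]

# An existing element can be overwritten in place
set my_array[2] = TWO
# prints one TWO three
echo $my_array

# Append by rebuilding the list from the old one
set my_array = ($my_array four)
# A range of elements: prints three four
echo $my_array[3-4]
```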

C. Special Characters or metacharacters: This is mostly same as in bash. cmd line editing in bash was basically taken from csh, so all cmd line edit keys from bash work in csh.

special character usage:  special char are mostly same as in bash.

1. # => comment. This is single line comment. If you want to comment multiple lines, use "if" cmd explained later. As explained in "bash" scripting section, comment line is not completely ignored in csh, in contrast to bash. The "\" character at the end of comment line is looked at to figure out if the newline at the end of comment should be escaped or not. Comment is still ignored. So, very important to carefully look at any comment line, and put a backslash at end of it, if a regular cmd there would have needed a backslash. Look at backslash description in bullet 2 below. Let's look at an ex below:

ex: foreach dirname ( dir1 \
dir2 \
#dir3 \
)

In above ex, dir3 is commented out and has a "\" at end. So, the whole line until "\" is ignored. "\" at end escapes newline, so contents of next line are treated as part of same line. This is how it looks after expanding:

foreach dirname ( dir1 dir2 ) => Here "dir3" is completely ignored as it's in a comment, except for "\" which causes line 4 (closing bracket) to be continued on line 3 itself. NOTE: \ at end is NOT continuation of comment, i.e it's not like this: foreach dirname ( dir1 dir2 #dir3 ) => this would have caused an error as closing bracket won't be seen by csh interpreter. This is not what happens in csh.

ex: The below ex causes a syntax error "Too many ('s". This is because closing bracket ) is seen on another line, so it's like this foreach dirname ( dir1 dir2 => so ) is not on same line resulting in error.

foreach dirname ( dir1 \
dir2 \
#dir3
)

 

2. " ' \ => same as in bash, they hide special char from shell.

I. Double Quotes " " : weak quoting

II. Single Quotes ' ': strong quoting

III. Backslash \ : hides all special characters from the shell, but it can hide only one character at a time. So, it can be used to hide newline character at end of line by putting backslash at end of line (this allows next line to be seen as part of current line). There is slight variation to this for the comment line as shown in example above (backslash at end of comment line is not seen as continuation of comment line).
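The 3 quoting styles can be compared with a small sketch like the one below (the var name and strings are arbitrary).

```csh
#!/bin/csh -f
set name = world

# Double quotes (weak): $name is still expanded, prints hello world
echo "hello $name"

# Single quotes (strong): nothing is expanded, prints hello $name
echo 'hello $name'

# Backslash hides just the next character, prints hello $name
echo hello \$name

# Backslash at end of line hides the newline, prints one two
echo one \
     two
```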

3. End of cmd: Same as in bash, a newline character (by pressing enter/return) is used to denote end of 1 cmd line. For multiple cmds on same line, semicolon (;) can be used to separate them. No space is needed around the semicolon, as ; is one of the char that is always seen as a separate token (see list in "Csh Syntax" above), though adding a space makes the line easier to read.

Control operators: Same as in bash. Pipe cmd and list work same way.

4. source cmd: same as in bash. However, the "." shortcut from bash is not available in csh, only "source" works.

5. backquote or backtick (`): same as in bash. However, the o/p here is stored in an array, instead of a simple string

ex: set a=`ls`; echo $a => This will print the array a (as o/p of this is stored in array). To see individual elements of array, we can do $a[1], $a[2], etc (NOTE: $a[0] is not valid as arrays start with index=1 in csh)
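The array behaviour of backquotes is easier to see with a cmd whose o/p is known in advance. The sketch below uses echo instead of ls for that reason. It also shows that putting the backquotes inside double quotes keeps each o/p line as a single word.

```csh
#!/bin/csh -f
# Unquoted: o/p is split into words, giving an array
set words = `echo one two three`
# prints 3
echo $#words
# prints two
echo $words[2]

# Inside double quotes, each o/p line becomes a single word
set line = "`echo one two three`"
# prints 1
echo $#line
```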

6A. user interaction: All languages provide some way of getting i/p from a user and dumping o/p. In csh, we can use these to do this:

Output cmds: echo cmd supported.

Input cmds: csh has "$<" for reading i/p.

ex: below cmd reads i/p from terminal and prints the o/p

echo -n Input your number:
set input = $<
echo You entered $input

6B.  IO redirection: same as in bash.

7. Brackets [ ] , braces { } and parenthesis ( ) : same as in bash. [] and {} are used in pattern matching using glob. All [], {}, () are used in pattern matching in BRE/ERE. See in regular expression section. However, they are used in other ways also:

I. single ( ) { } [ ]:

( ) => parenthesis are used to group cmds, to be executed as a single unit. (list) causes all cmds in list to be executed in a separate subshell. Unlike bash, csh has no { list; } form for grouping cmds in the same shell.

{ } => Braces { } are used to unambiguously identify variables. They protect the var within {} as one var. { ..} is optional for simple parameter expansion (i.e $name is actually simplified form of ${name})

{ } can also be used inside an expr to run a cmd and test its exit status: { cmd } evaluates to 1 (true) if cmd succeeds, and to 0 otherwise. Spaces should be used around the braces here. ex: if ( { grep -q foo myfile } ) echo "found"

[ ] => square brackets are used for globbing as explained above.

[ ] are also used to denote array elements as explained in array section above.

arithmetic operators: One of the most useful features of csh is that it allows direct arithmetic operations. No "expr" cmd is needed to do numeric arithmetic (bash offers the same via $(( )), but plain sh does not). These arithmetic operations can appear in @, if, while and exit cmds. Many of the operators below return boolean "true" or "false", but there is no inbuilt boolean type, i.e if (true) ... is invalid. An expr has to be used that evaluates to true (non zero) or false (zero).

  • number arithmetic: +, -, *, /, %, and id++/id-- (post inc/dec, only with the @ cmd). NOTE: unlike bash, csh has no ** (exponent) operator and no pre inc/dec (++id/--id).
  • bitwise: &, |, ^(bitwise xor), ~(bitwise negation), <<(left shift), >>(right shift). ~ is also used as expansion to home dir name.
  • logical: &&, ||, !(logical negation)
  • string comparison: ==(equality), !=(inequality). These are not arithmetic comparisons but comparisons on strings. Here RHS has to be a string and not a pattern.
  • <=, >=, <, > => these operate on numbers (unlike bash, where these operate on string, and -lt, -ge, etc were used instead to compare numbers)
  • assignment: =(assigns RHS to LHS), *=, /= %= += -= <<= >>= &= ^= |= => these are assignments (used with the @ cmd) where RHS is operated on by the operator before =, and then assigned to LHS (i.e @ a *= $b is same as @ a = $a * $b).
  • matching: =~ this is a matching operator (similar to perl syntax), but here the string on RHS is considered a glob pattern (not an ERE), and is matched with string on LHS. ex: if ($line =~ *ab*) ... => true if the string contains "ab"
    • .* for ERE is not honored, but rather just plain * as used in glob. ex: $name =~ raj_.* doesn't match raj_kumar, but $name =~ raj_* (w/o the dot) does match raj_kumar
  • non matching: !~ this is the non matching operator, where string on RHS is again considered a glob pattern. Returns true if the string doesn't match the pattern.
  • conditional evaluation: expr ? expr1 : expr2 (as in C) => not available in csh expr. Use if-then-else instead.
  • comma: comma as separator b/w expr (as in C and bash) is also not available in csh expr.

ex: @ c=$a + $b; => @ needed to do arithmetic. No "set" cmd needed.

ex: @ num = 2 => this assigns numeric value 2 to num. If we used "set num = 2" then it assigns num to string 2. It may then still be interpreted as num or string depending on how it's used later.

ex: @ i++ => increments var i by 1

ex: if ($a < $b) echo "a < b"
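A few of the operators above are put together in the sketch below. The values are arbitrary. It sticks to operators that are commonly used in csh scripts (+, /, %, ++, +=, numeric comparison, && and the =~ / !~ glob match).

```csh
#!/bin/csh -f
set a = 7
set b = 2

@ sum = $a + $b
@ quot = $a / $b
@ rem = $a % $b
# prints 9 3 1 (division is integer division)
echo $sum $quot $rem

# post increment, then add 10
@ a++
@ a += 10
# prints 18
echo $a

# Numeric comparison combined with a logical operator
if ($a > $b && $b != 0) echo "a is bigger"

# =~ and !~ match the left side against a glob pattern
set name = raj_kumar
if ($name =~ raj_*) echo "name starts with raj_"
if ($name !~ *.txt) echo "name does not end in .txt"
```

All 3 "if" lines are expected to print their message with these values.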

II. double (( )) {{ }} [[ ]] => no known usage in csh

8. pattern matching: same as in bash

9. looping constructs: These 2 cmds are used to form loops: foreach and while. "break" and "continue" builtins are used to control loop execution, and have same behaviour as in bash. break exits the loop, not the script. continue goes on to the next iteration of the loop w/o going thru the remaining stmt in the loop that are after continue.

  • foreach: It's a loop where the variable name is successively set to each member of wordlist, and the sequence of commands until the matching end statement is executed. Both foreach and end must appear alone on separate lines. syntax:
    • foreach name (wordlist)
        commands
      end
    • ex:
      foreach color (red orange yellow green blue)
        echo $color
      end
  • while: Just like foreach, it's a loop. Statements within the while/end loop are executed repeatedly as long as the expression evaluates to true. Both while and end must appear alone on separate lines. syntax:
    • while (expression)
        commands
      end
    • ex:
      set word = "anything"
      while ($word != "")
        echo -n "Enter a word to check (Return to exit): "
        set word = $<
        if ($word != "") grep $word /usr/share/dict/words
      end
  • break / continue: these are used in foreach or while loop statements above. "break" terminates execution of loop, and transfers control to stmt after the end stmt, while "continue" transfers control to the end stmt. So, "break" forces the pgm to exit the loop, while "continue" keeps on continuing with next iteration of loop (while skipping stmt in the loop that come after continue).
    • ex:
      foreach number (one two three exit four)
        if ($number == exit) then
          echo reached an exit
          break => or use continue. break takes ctl to stmt right after "end" stmt, while continue takes it back to beginning of loop, to start with next iteration
        endif
        echo $number
      end
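The loop cmds above can be combined with the @ cmd to build a counting loop. The sketch below is one way to do it, with arbitrary limits. It skips one value with "continue" and stops early with "break".

```csh
#!/bin/csh -f
@ i = 1
while ($i <= 5)
    if ($i == 2) then
        # skip 2, but increment first or the loop would never move on
        @ i++
        continue
    endif
    # stop the loop completely when 4 is reached
    if ($i == 4) break
    echo "i = $i"
    @ i++
end
```

This prints "i = 1" and "i = 3". NOTE: the counter has to be incremented before "continue", since "continue" skips the increment at the bottom of the loop.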

10. Conditional constructs: same as in bash, syntax slightly different. Also these are closer to C lang in syntax, and have switch.

  • if-else: There are 2 variants of if cmd: A. if without else B. if with else
    •  if => if (expr) command [arguments] => here cmd must be a simple cmd, not piped cmds. There is no else clause.
      • ex: if ($#argv == 0) echo There are no arguments => all in one line
      • ex: if (-d $dirname) echo "dir exists" => this checks for existence of dir. options supported for existence of files/dir are same as those in bash
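    •  if-then-else => if (expr) then (cmd1) else (cmd2) endif . There may be multiple "else if" clauses here. "then" and "endif" are required in this form of if-else. "then" must be on the same line as "if", while "else" and "endif" must start on a line of their own.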
      • ex: if ($number < 0) then  
                   @ class = 0  
                else if (0 <= $number && $number < 100) then  
                   @ class = 1      
                else  
                   @ class = 3  
                endif
      • if ($dat == $vrfir) then => Note no semicolon used. == used (as in C)
          echo foo
        else => optional
          echo bar!
        endif
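      • if-then may be used for multi line comments, by wrapping the lines in an if whose expr is false. ex: set debug=0; if ($debug == 1) then ........ endif => the lines inside are skipped until debug is set to 1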
      • if (!(-e ${AMS_DIR}/net.v)) then ... else ... endif => checks for existence of a file. same options as in bash.
  • switch case: The switch structure permits you to set up a series of tests and conditionally executed commands based upon the value of a string. If none of the labels match before a `default' label is found, then the execution begins after the default label.  Each case has "breaksw" cmd at end of case that causes execution to continue after the endsw. Otherwise control may fall through case labels and default label may execute if there is no "breaksw". Also, the patterns for each case may contain ? and * to match groups of characters or specific characters. syntax is as follows:
    • switch (string)
        case pattern1:
          commands...
          breaksw
        case pattern2:
          commands...
          breaksw
        default:
          commands...
          breaksw
      endsw
    • ex: if ($#argv == 0 ) then
              echo "No arguments supplied...exiting"
              exit 1
           else 
              switch ($argv[1])
              case [yY][eE][sS]:
                echo Argument one is yes.
                breaksw
              case [nN][oO]:
                echo Argument one is no.
                breaksw
              default:
                echo Argument one is neither yes nor no.
                breaksw
              endsw
           endif
  • goto: The goto statement transfers control to the statement beginning with label:
    • ex:
      if ($#argv != 1) goto error1
      goto OK
      error1:
        echo "Invalid - wrong number or no arguments"
        echo "Quitting"
        exit 1
      OK:
        echo "Argument = $argv[1]"
        exit 0
    • ex:
      if (.$RUN == .ams) goto ams_run => the leading dot on both sides keeps the comparison valid even when $RUN is empty
      ....
      ams_run: => control transferred here
      ....
      exit

    • goto is usually used to print usage info of a script when number of args is insufficient
      if ($#argv < 3) then
        goto usage
      else ... endif => process cmd line args
      usage:
      echo " shows usage of cmd" => This prints usage info for that script when cmd is typed
      exit
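As a small sketch of the conditional constructs above working together with a loop, the script below classifies a few names using glob patterns in switch. The file names are made up for the example and don't need to exist.

```csh
#!/bin/csh -f
# The names below are made up inputs for this sketch
foreach f (notes.txt run.csh Makefile)
    switch ($f)
    case *.txt:
        echo "$f is a text file"
        breaksw
    case *.csh:
        echo "$f is a csh script"
        breaksw
    default:
        echo "$f is something else"
        breaksw
    endsw
end
```

This should report notes.txt as a text file, run.csh as a csh script and Makefile as something else. NOTE: avoid writing "$f:" inside double quotes, as csh treats ":" right after a var name as the start of a modifier. Use "${f}:" in that case.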

 
Advanced csh cmds: Read on csh links on top for more on this.