C programming lang: One of the most popular languages of the last 50 years. A must-learn language, which can be used to write any simple or complex pgm


C library ref guide: http://www.acm.uiuc.edu/webmonkeys/book/c_guide/index.html

C language has its own syntax, which may not seem very easy for a person who is used to more modern languages like python. We'll start with a very basic "Hello World" pgm which prints "Hello World" on the screen. This pgm is explained under the "gcc" section, but I'm repeating it here. Name the pgm hello.c

#include <stdio.h> //include files: explained under gcc article

int main (void) { //main function explained below

  printf ("Hello, world!\n");

  return 0;

}

 
main function: Every C pgm needs a main function. This is where the pgm starts executing, line by line.

arguments to the main() function are optional. The argc and argv variables store the parameters passed on the cmd line of the shell when running the pgm. On POSIX compliant systems, char *envp[] contains a vector of the program's environment var (i.e. SHELL environment var that are passed to the program).

argc=num of arguments in cmd line including cmd name,

argv[]= vector storing arguments in cmd line, argv[0]=cmd_name (1st argument), argv[1]=2nd argument (1st option), and so on

envp[] = vector storing ENVIRONMENT variables

argc, argv[] and envp[] can be any names, i.e. main(int cnt, char *arr_option[], char *arr_env[]); is equally valid, though using argc, argv and envp is standard

ex: a.out 12 myname 17cd => argc=4 (4 arguments on cmd line), argv[] = {"a.out", "12", "myname", "17cd"}; Note that argv is an array of strings, so even numbers are stored as strings. A char string such as "12" can be converted to a number using atoi() (returns int) or atol() (returns long), i.e. y=atoi(argv[1]); => this will assign integer 12 to y. Plain casting using (int) will not work, as (int)"12" just converts the pointer (the addr of the string) to an integer; it does not convert the characters '1','2' into the value 12.
 
int main (int argc, char *argv[]) { // int before main refers to the return value of main. In the early days of C, there was no int before main, as it was implied. Today, omitting the return type is considered an error (even if main doesn't return any value).

int main (void) { => no arguments to main
...
exit (1); //return value from main. Can also do: return 1;
}
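A minimal sketch tying argc/argv together (the file name args.c is just an example name): it echoes its cmd line arguments and converts argv[1] to a number with atoi().

#include <stdio.h>
#include <stdlib.h>  // needed for atoi()

int main (int argc, char *argv[]) {
  int i;
  for (i = 0; i < argc; i++)            // argv[0] is the cmd name itself
    printf ("argv[%d] = %s\n", i, argv[i]);
  if (argc > 1) {
    int y = atoi (argv[1]);             // convert string "12" to integer 12
    printf ("argv[1] as int = %d\n", y);
  }
  return 0;
}

gcc args.c; ./a.out 12 myname 17cd => prints each argv entry, then "argv[1] as int = 12"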

Compile C pgm: We can't run a C pgm directly by typing hello.c on the terminal. We need to compile it first, which generates an executable "a.out", and then run "a.out" on the terminal. To compile the above pgm, we use the gcc compiler. Type the below cmd on the terminal. Look in the "gcc" section for details.

gcc hello.c => generates a new file a.out. Now type ./a.out on terminal, and it runs the pgm

Syntax of C pgm:

1. comments: start with // for a single line comment. For a multi line comment, use /* comment */

2. Each stmt ends with semicolon ;. We don't need semicolons for blocks

3. main() function is needed. We can define other functions as needed.

4. All variables need to be defined before using them. We define the var and specify their type. Types are explained below.

5. For loops, if-else, etc. are supported. Many reserved keywords are defined for these, and they can't be used as variable names.

Std Input/Output functions:

printf and scanf are 2 std functions that are going to be used the most. They are inbuilt library functions in C programming language which are available in C library by default. These functions are declared and related macros are defined in “stdio.h” which is a header file in C language. That's why we have to include “stdio.h” file.

1. printf: printf() function is used to print the (“character, string, float, integer, octal and hexadecimal values”) onto the output screen.

ex:

int var1=5; // var1 used in printf below. We define it to be of integer type and assign it a value of 5.

printf ("Value is = %d \n",var1); // %d specifies that var1 is int type, and should be printed in decimal format. var1 needs to be defined before it can be used. var1 is put outside " .. ", while %d is inside " ...". \n is used to print a newline.

2. scanf: scanf() function is used to read character, string, numeric data from keyboard.

ex:

char ch; //var ch is defined of type "character", and is used to store the i/p entered by the user.

printf("Enter any character \n"); //This printf is same as before. We can't print anything from within scanf. It can only take input. So, we usually precede scanf with printf.

scanf("%c", &ch); //Same as in printf, we have to specify the type of i/p. Here %c says it's character type, and whatever character user enters on prompt is stored in var "ch". Here var "ch" has to be defined as a "char type" before using it. We use an & with ch. That is needed, and will be xplained in pointer section below. No & is needed for var in printf function.



Type:  In a C pgm, we have to define the data type of all variables, else the C compiler will error out. There are 2 categories of data types for any var: primary data types and derived data types
-----

I. Primary data type: These are fundamental data types in C namely integer(int), floating point(float), character(char) and void.
arithmetic type: 4 basic types: char, int, float, double and 4 optional type specifiers (signed, unsigned, short, long). void is a data type for nothing.

1. char: 8 bits.
 char, signed char, unsigned char => signed char stored as signed byte. If we use "char" to store character then ascii code for that character is stored. In that case signed or unsigned doesn't matter. However, if we use char to store 8 bit number, then signed unsigned matters, as it represents 8 bit signed or unsigned integer. As "int" stores integers in 16 bit or larger, the only way to store 8 bit integers is by using signed/unsigned char. signed char is from -128 to +127, while unsigned char is from 0 to +255.
 Normal char (no signed/unsigned) are represented in ASCII codes. See in ASCII part of "bash shell scripting language" section.

2. int: 16 bits to 64 bits.
 A. short integer type. at least 16 bits in size => short, short int, signed short, signed short int, unsigned short, unsigned short int.
 B. basic integer type. at least 16 bits in size => int, signed, signed int, unsigned, unsigned int. usually 32 bits.
 C. long integer type. at least 32 bits in size => long, long int, signed long, signed long int, unsigned long, unsigned long int. usually 64 bits on 64-bit systems, but on embedded arm uP, these are 32 bits.
 D. long long integer type. at least 64 bits in size => long long, long long int, signed long long, signed long long int, unsigned long long, unsigned long long int. Usually exactly 64 bits, including on embedded arm uP.

3. float: IEEE 754 single precision floating point format (32 bit: 1 sign bit, 8 exponent bits and 24 bits of significand precision (23 explicitly stored since 1 bit is hidden)).

4. double: IEEE 754 double precision floating point format (64 bit: 1 sign bit, 11 exponent bits and 53 bits of significand precision (52 explicitly stored since 1 bit is hidden)). "long double" is an extended precision floating-point type. It can be the 80 bit floating point format or some non-IEEE format.
 
NOTE:  The above types don't have a fixed size as part of the C lang, as the size is target processor dependent. So, their limits are provided via macro constants in two headers: the limits.h header defines macros for integer types and the float.h header defines macros for floating-point types.
limits.h: (look in http://www.acm.uiuc.edu/webmonkeys/book/c_guide/2.5.html)
-------
In cortex M0, char size is defined in /apps/arm/rvds/4.0-821/RVCT/Data/4.0/400/include/unix/limits.h as follows:
#define CHAR_BIT 8 => Number of bits in a byte. The actual values depend on the implementation, but user should not define these to be lower than what's put here.
#define CHAR_MIN 0 => minimum value for an object of type char
#define CHAR_MAX 255 => maximum value for an object of type char

similarly signed char is defined as -128 to +127 (8 bits), unsigned char is 0 to 255, signed short int or int is -32768 to +32767 (16 bits), signed long int is -2147483648 to +2147483647(32 bits).

float.h: (look in http://www.acm.uiuc.edu/webmonkeys/book/c_guide/2.4.html)
--------
for float(32 bits), double(64 bits), long double(80 bits).
ex: In cortex M0 float.h, we have:
#define FLT_MAX  3.40282347e+38F

Compiler uses these files to determine size of these primary data types. So, the same pgm may compile differently on different m/c if these files have diff values.
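A small sketch to check these limits and sizes on your own compiler/machine (the printed values will differ across systems, which is exactly the point):

#include <stdio.h>
#include <limits.h>
#include <float.h>

int main (void) {
  printf ("CHAR_BIT=%d CHAR_MIN=%d CHAR_MAX=%d\n", CHAR_BIT, CHAR_MIN, CHAR_MAX);
  printf ("INT_MAX=%d LONG_MAX=%ld\n", INT_MAX, LONG_MAX);
  printf ("FLT_MAX=%e DBL_MAX=%e\n", FLT_MAX, DBL_MAX);
  printf ("sizeof: char=%zu short=%zu int=%zu long=%zu float=%zu double=%zu\n",
          sizeof(char), sizeof(short), sizeof(int), sizeof(long), sizeof(float), sizeof(double));
  return 0;
}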

II. Derived data types: Derived data types are nothing but primary data types grouped together, like array, structure, union and pointer.

Boolean type: _Bool (true/false). It's an unsigned byte. False has value 0 while true has value 1. It can be seen as derived from int.

In the cortex M0 stdbool.h, bool is defined as _Bool, true as 1 and false as 0 (the _Bool type and stdbool.h were introduced in the C99 std; C++ has a built-in bool type instead of _Bool).
#define bool _Bool
#define true 1
#define false 0

pointer type:

Pointer type var are a new class of var that store addr. Addr could have been stored as type int, but a special "pointer" type was declared to store the addr. For every type T (i.e char, int, etc) there exists a type pointer to T. Variables can be declared as being pointers to values of various types, by means of the * type declarator.
eg:
char v; declare var v as of type char. To get addr location of any variable, use &. So, addr of v is &v
char *pv; (can also be written as char* pv; or char * pv;) => declares a variable "pv" which is a pointer to a value of type char. pv doesn't store a char, but rather an addr. That addr stores a char var. *pv refers to the contents stored at the memory pointed to by this addr (the addr stored in var pv). &pv is the addr of this var pv, just like &v is the addr of var v. Since var v stores a char, it is just 1 byte in length, while var pv stores an addr, which is 32 bits on a 32 bit system. We can initialize pv to any addr, including the special value "NULL" (pv=NULL;):
char *pv=0x1000; => assigns pv the addr value 0x1000. This is similar to char *pv; pv=0x1000;. However, this will error out with this msg: "a value of type "int" cannot be used to initialize an entity of type char *". Reason: although the C language allows any integer type to be converted to a pointer type, such conversions cannot be done implicitly; an explicit cast is required for this non-portable conversion. So, we have to do: char *pv = (char *)0x1000; => This casts 0x1000 to a pointer type pointing to a char var, which is exactly what pv is, so both sides become the same type.

char v='a';
char *pv=&v; => assigns pv to addr loc of variable v. If we print various values of v and pv, this is what we might get:

printf ("v=%c, &v=%x, pv=%x, &pv=%x, *pv=%c",v,&v,pv,&pv,*pv); => v and *pv both store char, so we use %c, while pv, &v and &pv store addr, so we use %x

v=a, &v=f98ca5cf, pv=f98ca5cf, &pv=f98ca5c0, *pv just happens to store the char


pv = &v; => assigns addr loc of v to pv. If pv wasn't declared as a pointer, we could not do this.
pv = (char *)0x1000; => pv is assigned addr value of 1000. The addr 0x1000 stores a char type.
*pv = 0; => store 0 in contents stored at addr pointed by pv.

initializing ptr var: We can initialize ptr var to any addr val. However we can also initialize the contents of ptr var via this:

char *p="Name"; => Here p is created as a ptr and values assigned from addr p, p+1 and so on. So, *p='N', *(p+1)='a', and so on to *(p+4)='\n'. *(p+5) is not initialized to anything, and contains garbage value.


Below 2 assignments are used extensively for rd/wrt of reg mem locations:
1. Rd from given mem loc: 0x1000 or "#define addr 0x1000"
data = *((char *)0x1000); => grab contents of addr loc 0x1000 and assign it to data.
2. Wrt to given mem loc: wrt 1 or "0x01" (since char, so 1 byte only, so 0x01. If short int, use "0x0001")
*((char *) 0x1000) = 1; => This stores 0x01 at addr 0x1000. (Note: it is not the char '1', as char '1' has ascii value 0x31.) Here we cast 0x1000 to type "pointer to char" and then the "*" in front of it refers to the contents stored at addr 0x1000. So, we store 1 at that addr. We can omit the extra brackets and do: *(char *) 0x1000 = 1; NOTE: if we try to do: *0x1000=1; then we get the error "operand of "*" must be a pointer", as 0x1000 is an int and not a pointer, so an explicit conversion is required.
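A hedged sketch of how such accesses are typically wrapped in macros on embedded targets (the addr 0x1000 and the macro names are made up; dereferencing a raw addr like this would fault on a hosted OS). volatile is added so the compiler re-reads/re-writes the addr every time (see the type-qualifier section below):

#define REG_ADDR 0x1000                                   /* made-up peripheral reg addr */
#define REG8     (*(volatile unsigned char *)REG_ADDR)    /* treat the addr as an 8-bit reg */

void wr_reg (void) {
  REG8 = 1;                     /* write 0x01 to addr 0x1000 */
}

unsigned char rd_reg (void) {
  return REG8;                  /* read contents of addr 0x1000 */
}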

double pointer: When the pointer stores addr of other pointer (instead of storing addr of char, int, etc), it's pointer to a pointer or double pointer

ex:

int **dpv; => here dpv is a double ptr to type int. dpv stores the addr of a pointer (pv for ex), where the pointer pv points to data of type int

int *pv; => pv is the ptr pointing to data type int.

int v; => regular var declared as int

pv = &v => single ptr pv is assigned addr of var "v".

dpv = &pv; => double ptr dpv is assigned addr of ptr pv. If dpv wasn't declared a double ptr, then &pv couldn't be assigned to dpv. If dpv was declared a single ptr (i.e int *dpv), then dpv could store addr of regular var "v" and not addr of single ptr pv.

use of double ptr: We use it as an arg of functions, when we want to change, from inside the function, the contents of a pointer var which is defined outside the function.

1. We use it in arg of main func:

ex: int main(int argc, char **argv) => here we could use char **argv or char *argv[]. It represents an array of strings (char pointers).

2. We use it as an arg of a regular func, as shown in the sketch below:
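A minimal sketch (the function and var names are made up): to change what a pointer outside the function points to, from inside the function, the function must receive a double ptr.

#include <stdio.h>

void set_ptr (int **dp, int *new_target) {
  *dp = new_target;          /* modifies the caller's pointer, not a local copy */
}

int main (void) {
  int a = 5, b = 7;
  int *pv = &a;
  set_ptr (&pv, &b);         /* pass the addr of the pointer itself */
  printf ("%d\n", *pv);      /* prints 7: pv now points to b */
  return 0;
}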

 

Array: An array is defined as finite ordered collection of homogenous data, stored in contiguous memory locations.

Array types are introduced along with pointers because, in C, an array name in most contexts "decays" to a pointer to the first element of the array in memory. In simple words, array names are converted to pointers. That's the reason why you can use a pointer with the same name as the array to manipulate elements of the array.

ex: char p[10] = "Name"; => This declares an array of 10 char type var = p[0] to p[9]. It is initialized to value "Name", so p[0]='N', p[1]='a',..p[4]='\n', and p[5] to p[9] are uninitialized. p[0] to p[9] are stored in continuous mem locations, and *p or p[0] refers to first element of array *(p+1) or p[1] to second element of array and so on. The way compiler translates array is that it stores it as stores "p" as ref addr, and then uses it to figure out p[0] = *p, p[1] = *(p+1) and so on. This is similar to char *p="Name";

ex: printf("ptr: p addr in hex= %x p addr in dec= %d &p= %x &p[0]=%x *p=%c &p[1]=%x *(p+1)=%c\n",p,p,&p,&p[0],*p,&p[1],*(p+1));

o/p => ptr: p addr in hex= 23fccc60 p addr in dec= 603769952 &p= 23fccc60 &p[0]=23fccc60 *p=N &p[1]=23fccc61 *(p+1)=a //as can be seen, p, &p and &p[0] all refer to addr of start of array p. We can use any of them to access array p. p is used as a ptr to start of array p[9:0].

There are some special cases when an array doesn't decay into a pointer, e.g. when used as the operand of sizeof or the & operator, or when a string literal is used to initialize an array: char str[] = "Name"; here the literal initializes the elements of the array directly instead of decaying to a pointer.

2D array: a true 2D array (e.g. char p[2][3]) is stored as one contiguous block and is NOT the same thing as a double pointer; its name decays to a "pointer to a row" (char (*)[3]), not to char **. However, 2D data is often represented with pointers instead. Consider a 1D array, where the first element is reached via ptr *p, and p stores the addr of p[0]. For 2D data, p[0][0] to p[0][n] is the first row, p[1][0] to p[1][n] is the second row, and so on. To store each row, we can have a ptr *p for the 1st row, a ptr *q for the next row and so on, or better, an array of pointers char *p[] where p[0] points to the 1st element of the 1st row, p[1] points to the 1st element of the 2nd row, and so on. Since p[0], p[1], etc. are themselves pointers to char (and not chars), the whole thing is declared either as an array of single pointers char *p[] or as a double ptr char **p. This pointer-to-pointer form is what argv uses; the sketch below contrasts it with a true 2D array.
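A short sketch contrasting the two forms (values are arbitrary):

#include <stdio.h>

int main (void) {
  char grid[2][3] = { {'a','b','c'}, {'d','e','f'} };   /* true 2D array: 6 contiguous chars */
  char row0[] = "abc", row1[] = "def";
  char *rows[2] = { row0, row1 };      /* array of pointers: each row is a separate array */
  char **p = rows;                     /* double ptr works with the pointer form */
  /* char **q = grid; would be wrong: grid decays to char (*)[3], not char ** */
  printf ("%c %c\n", grid[1][2], p[1][2]);   /* prints: f f */
  return 0;
}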

struct:

Struct type is a collection of different kinds of data which we want stored as a single entity. For ex, a person's account may contain name, age, income, etc. We can obviously make 3 different arrays: char type for name, int type for age, float type for income, etc. But that would not be easy to manage. However, we can group all these items together in a structure named "account", and then access each of these. That way, we have to create an array of just this "account" type, and it's much easier to manage the data.


struct pointer:
------
struct reg { int a; char ch;};
struct reg arr[] = {{1,'s'},{2,'f'}}; //defines arr[0],arr[1] with values of reg type
struct reg *my0 = &arr[0]; //my0 is ptr to type "reg" and is assigned to addr of arr[0] which is of type "reg" too.
struct reg *my1 = &arr[1];

function pointers: http://denniskubes.com/2013/03/22/basics-of-function-pointers-in-c/
-----------------
function names actually refer to addr of func. So, we can have pointers to functions by using func name as addr (no need to use & with func name for addr).
char *pv=&v; => as explained above, this assigns pv to addr loc of variable v. *pv refers to contents of pv.
but for a func ptr, we need to have parentheses around the pointer name. Also, instead of char, we need to have the return type of the func, and we also need to provide all the arg types of the func (if no args, we can write "void" or leave it empty). i.e.
void (*FuncPtr)() = HelloFunc; //HelloFunc is a function defined with no i/p or o/p. OR
void (*FuncPtr)(void) = HelloFunc;
or:
void (*FuncPtr)(void); => define the ptr function
FuncPtr = HelloFunc; => assign ptr to addr of HelloFunc.

Now, to reference any pointer value, we use *pv to read the contents (a=*pv reads the contents pointed to by pv and stores them in a). For a function, we do the same, but we need to have *FuncPtr inside parentheses to indicate it's a func ptr. Also, we need (), since the () operator is needed to call a function (with args in it, if any). i.e.
(*FuncPtr)(); //This calls the func HelloFunc() => equiv to calling HelloFunc(); directly.
FuncPtr(); //This also works and is exactly equiv to the above line. This is because a func name is converted to a pointer to the function. This means that func names can be used wherever function pointers are required as input. &func and *func yield the same pointer value as plain func, so instead of using &func or *func, we should just use func directly.

ex: (with args)
int (*FuncCalcPtr)(int, int) = FuncCalc; //FuncCalc, *FuncCalc, &FuncCalc, **FuncCalc are all the same here. The FuncCalc fn subtracts 2 ints and returns an int.
int y = (*FuncCalcPtr)(10,2); //or FuncCalcPtr(10,2) is the same. returns 10-2=8, and stores it in y.

Func ptr can also be used as parameters to pass into another func. This is primary use of func ptr. ex:
int domath(int (*mathop)(int, int), int x, int y) {
  return (*mathop)(x, y);
}
in  main, do: int a = domath(add, 10, 2); //this calls domath func with ptr to "add" func.
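Putting the pieces above into one compilable sketch (add/sub are made-up helper funcs):

#include <stdio.h>

int add (int x, int y) { return x + y; }
int sub (int x, int y) { return x - y; }

int domath (int (*mathop)(int, int), int x, int y) {
  return (*mathop)(x, y);                  /* call whichever func was passed in */
}

int main (void) {
  int (*FuncCalcPtr)(int, int) = sub;      /* func name used directly as its addr */
  printf ("%d\n", FuncCalcPtr (10, 2));    /* prints 8 */
  printf ("%d\n", domath (add, 10, 2));    /* func ptr passed as a parameter: prints 12 */
  return 0;
}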

typedef with pointers:
----
Above ex needed extra typing everytime fn ptr was defined. typedef can simplify this.
ex: Using typedef with pointers:
typedef int *intptr; => intptr is a new alias for the pointer type int *. "typedef int dist" creates "dist" as a synonym for int; similarly "intptr" is a synonym for "pointer pointing to int"
intptr ptr; => this defines a var "ptr" of type int * (pointer to int). So, ptr is a pointer which can point to memory holding an int. We could have also written it as "int *ptr;".

ex: Using typedef with fn pointers:
typedef int (*FuncCalcPtr_def)(int, int); => creates "FuncCalcPtr_def" as a synonym for a pointer to a fn of 2 int args that returns an int
FuncCalcPtr_def FuncCalcPtr; => this defines a var "FuncCalcPtr" with type "FuncCalcPtr_def", which is a ptr to a fn. We could have written this as "int (*FuncCalcPtr)(int, int);"

--------------------
size type:  http://www.acm.uiuc.edu/webmonkeys/book/c_guide/2.11.html
----------
size_t and ptrdiff_t were defined as separate types, since they are mem related. Existing arithmetic types were deemed insufficient, since their size is defined according to the target processor's arithmetic capabilities, not its memory capabilities, such as the available address space. Both of these types are defined via typedefs in the stddef.h header.

size_t: used to represent the size of any object (including arrays) in the particular implementation. It is used as the result type of the sizeof operator and is an unsigned integer type.
ptrdiff_t:  used to represent the difference between pointers.

extended integer data type: *_t.
-------------------
to make code portable across diff systems, since existing int types have various sizes depending on the system. The new types are especially useful in embedded environments, where hardware usually supports only a few types and that support varies from system to system. For ex: int N; may be 16 bits with a certain compiler/processor, while it may be 32 bits with another. Generally we don't care, but if these are being used in a structure (as in MMIO), and we use a pointer to refer to various elements of that structure, then the size of int does matter. One solution is to define a new type like this:
typedef unsigned short uint16; => Here, we still need to know the sizes of char/short/int/long on that compiler/processor and modify this defn as needed.

an example depicting this problem is this:
typedef struct {
 volatile uint16_t CTRL1;           // control register
 volatile uint16_t CTRL2; }  CWT_TypeDef;
#define CWT                     ((volatile CWT_TypeDef*)  0x50000000)

Now in main.c, we do:
CWT->CTRL1=0x0012; CWT->CTRL2=0xFF55; => since compiler understands uint16_t to be of 2 bytes, it adds 2 to base addr to store CTRL2.
STRH     r0,[r1,#0];   => stores 0x0012 at base addr
STRH     r0,[r1,#0x2]; => stores 0xFF55 at base_addr+2

However, if we compiled the same code for some arch which has "unsigned short" implemented as 32 bits, then this code would incorrectly write 0x0012 into the lower 2 bytes of CTRL1 and then 0xFF55 into the upper 2 bytes of CTRL1; CTRL2 would not get written. To get rid of this problem, we only have to change the typedef in stdint.h to the correct 16 bit integer, and we don't need to change our C code anywhere else. We recompile and generate the correct binary for the new arch. It saves us from changing code in multiple places.

C99 std of ANSI-C, defined all additional data types as ending in _t, and user is asked not to define new types ending in _t. All new types are to be defined in inttypes.h header and also in stdint.h header by the compiler vendors. We can define types for exact width, least/max width type, etc. Exact width integer types are guaranteed to have the same number N of bits across all implementations. Included only if it is available in the implementation = intN_t, uintN_t. eg: uint8_t = unsigned 8-bit, uint32_t, etc.
The sizeof operator can be used to find out the size of int, short, uint32_t, etc.

In cortex M0 stdint.h, int16_t is defined as type of "signed short int" =>  typedef signed short int int16_t;
similarly for uint8_t: typedef unsigned char uint8_t; => Note that in C, integers are represented in 16 bits or larger, so the only way to rep integer in 8 bits is by using signed/unsigned char.

---------------------
keywords & variables: http://www.acm.uiuc.edu/webmonkeys/book/c_guide/1.2.html#variables
---------------------
1. keywords: reserved keywords that can't be used as a variable identifier. ex: for, char, const, extern, etc.
---------
char short int long float double short signed unsigned => type and type specifier   
void
volatile const => type qualifier


2. variables: used to store values of a particular type
--------------------
names of identifiers: The use of two underscores (`__') in identifiers is reserved for the compiler's internal use according to the ANSI-C standard. Underscores (`_') are often used in names of library functions (such as "_main" and "_exit") and are reserved for libraries. In order to avoid collisions, do not begin an identifier with an underscore. Having __ both before and after a variable name (eg __Symbol__) almost guarantees that there would be no name collision, as such identifiers with double underscores are extremely rare in user code.

A variable is defined by the following: <storage-class-specifier> <type-qualifier> <type-specifier> <type> variable-names,...
ex: extern const volatile unsigned long int rt_clk; => defines real time clk variable rt_clk

I. storage-class-specifier: storage class reflects data's lifespan during program execution.
1. typedef: The symbol name "variable-name" becomes a type-specifier of type "type-specifier". No variable is created.
ex: typedef long int mytype_t; => declares a new type mytype_t to be long int. From here on, we can use mytype_t instead of long int.
typedef most commonly used with struct to reduce cumbersome typing.
ex:
struct MyStruct {
    int data1;
    char data2;
};
with no typedef, we define var "a" of type struct MyStruct as follows: struct MyStruct a;
however with typedef, we can just define a new type as follows: typedef struct MyStruct newtype;
or we can directly do typedef with struct defn as follows:
typedef struct  MyStruct {  
    int data1;
    char data2;
} newtype;
then we can define var "a" as being of type "newtype" as follows: newtype a;

2. extern: Indicates that the variable is defined outside of the current file. This brings the variable's scope into the current scope. No variable is created.
3. static (permanent): Causes a variable that is defined within a function to be preserved across subsequent calls to the function. Variables declared outside the body of any function have global scope and static duration (for ex, vars declared outside main() are static, as they are not within any function). Initial values may be assigned to global vars; if not, they are zero-initialized by default. Since main() itself is a function, all vars defined in it are local and auto, so we use "static" on such vars to make them permanent. Variables declared outside main() are global for all functions in that file. static vars are not released from mem on exit of the function, so they consume mem space for the life of the pgm.
NOTE: we sometimes define function itself as "static".  In C, a static function is not visible outside of its translation unit, which is the object file it is compiled into. In other words, making a function static limits its scope. You can think of a static function as being "private" to its *.c file.
4. auto (temporary): Causes a local variable to have a local lifetime (default). Any variables declared within body of a function, including main(), have local scope and auto duration.
5. register: Requests that the variable be accessed as quickly as possible. This request is not guaranteed. Normally, the variable's value is kept within a CPU register for maximum speed.

II. type-qualifier: any declaration, including those of variables, struct/union, enum, etc. can also have a type-qualifier (volatile, const), which qualifies the decl instead of specifying it.
1. volatile: added to the C language later. It causes the value to be fetched from memory every time it's referenced. It tells the compiler that the object is subject to sudden change for reasons which cannot be predicted from a study of the program itself, and forces every reference to such an object to be a genuine reference. This is used for defining variables stored in peripherals, so that the uP doesn't reuse a stale copy of the variable that it might have cached in a CPU register a while back.
ex: volatile int j;

2. const: const means that something is not modifiable, so a data object that is declared with const as a part of its type specification must not be assigned to in any way during the run of a program.
ex: const int ci = 123; => declares a simple constant ci which always has a value of 123.
ex: const int *cpi; => declares a pointer "cpi" to a constant. cpi is an ordinary, modifiable pointer, but the thing that it points to must not be modified through cpi. The compiler will check that what cpi points to is never modified via cpi.
ex: int *const cpi; => declares a pointer "cpi" which is itself constant. It means that cpi is not to be modified, although whatever it points to can be modified: the pointer is constant, not the thing that it points to.

III. type-specifier: void, all arithmetic types, boolean type, struct, etc.

3. enumerated tags:
-----------------

4. Arrays:
--------
array is defined as: <type-of-array> <name-of-array> [<number of elements in array>];
ex: int arr[10]; => defines an array of 10 integers. Array elements are arr[0], arr[1], ... arr[9]. Each integer is 4 bytes (let's assume), so the total size of the array = 4*10 = 40 bytes.

to initialize arrays:
we can either use for loop in main pgm, or init it at time of declaration.
int arr[] = {1,2,3,4,5}; => this inits an array of 5 integers as follows: arr[0]=1, arr[1]=2 and so on. NOTE: there is no need to mention any value in the subscript []. The size will automatically be calculated from the number of values; in this case, the size will be 5. (Also note: using char literals like '1' here would store their ascii codes, not the numbers 1 to 5.)
to init array with string 2 ways:
A. char arr[] = {'c','o','d','e','\0'};
B. char arr[] = "code"; => equiv to ex A above. We don't need an explicit null char here, since double quotes do that for us.

To access values in an array:
int j=arr[2]; => assigns the value of arr[2], which is 3, to j.

We can also define array of structures, as well as array elements within structures.

5. structures and union:
----------------------
struct st{
    int a;
    char c[5];
};
int main()
{
    struct st st_arr[3]; // Declare an array of 3 structure objects
    struct st st_any[] = { {0,'c'},{1,'f'}}; //declares an array of 2 struct obj with values assigned    
    struct st st_obj0; // first structure object
    st_obj0.a = 0;
    st_obj0.c[0] = 'a'; // c is an array, so assign to an element (use strcpy() to copy a whole string); writing st_obj0.c = 'a' would not compile
}

6. const:

7. strings: simply an array of characters encapsulated in double quotes. At the end of the string a null character is appended.
ex: char x="\x41" or "A" are the same string. x[0]=A, x[1]=null character.

8. define, ifdef: preprocessor directives; the enclosed code is kept or removed by the preprocessor depending on the directive
#define NEW 0
#ifdef NEW => since NEW is defined this portion is kept by compiler
#define var ab
#else => this portion is removed by compiler
#define var cd
#endif

ex: this is very commonly used in header files (include guards), so that we don't redefine something that may get included from multiple files
#ifndef CHAR_T => this piece of code can be included from multiple files, but its body will be processed only the first time it's seen in a given compilation
#define CHAR_T 0x45
#endif
 
-----------------------------------

random number gen:
------------------
srand((unsigned) time(&t)); => inits rand num gen with time (in sec since epoch)
rand() % 50; => generates rand num b/w 0 to 49 using above seed. otherwise rand() will always gen same rand num seq since it will always use same seed (if we don't use srand).
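A minimal compilable sketch of the above (prints 5 random numbers b/w 0 and 49; time(NULL) is used as the seed):

#include <stdio.h>
#include <stdlib.h>   /* srand, rand */
#include <time.h>     /* time */

int main (void) {
  int i;
  srand ((unsigned) time (NULL));     /* seed with current time so the seq differs each run */
  for (i = 0; i < 5; i++)
    printf ("%d\n", rand () % 50);    /* rand num b/w 0 and 49 */
  return 0;
}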

Python:

Python is a general-purpose interpreted, interactive, object-oriented, and high-level programming language. Similar to perl.
Python provides interfaces to all major commercial databases. Python supports GUI applications that can be created and ported to many system calls, libraries and windows systems. It can be easily integrated with C, C++, Java, etc.

Python has so much support from the community that almost everything can be done using python's vast library of modules. You can build a website, write a program for a raspberry pi, build games with a gui, etc. In fact, it's one language that you can learn and get by doing everything, without learning any more languages. I've avoided python in the past because of the huge confusion over python2 vs python3, but now this issue looks settled.

 

Python 2 Vs Python 3:


Python 3 is the latest version; python 2 reached EOL in 2020. So, switch to python 3 for all coding. There are significant differences b/w python2 and python3, so don't waste your time learning python2. However, we may still want to install both python2 and python3, as many pgms are still written in python2, and not having python2 will cause those pgms to error out. When we install python 2, it is installed as both python (python is just a soft link to python2) and python2, while python 3 is installed as python3. Since python is just a soft link, we might be tempted to change it to point to python3. However, that's risky: many system programs are written in python2 and rely on the python link pointing to python2, so they may suddenly start failing. So, we keep the soft links for python, python2 and python3 as is: leave python pointing to python2 (change it to python3 only temporarily if really needed), and use the separate python3 link for python3.

python3 itself has several versions as python3.4, python3.6, etc. Latest stable version is python3.8 as of 2020. 

NOTE: On any linux distro, after installing latest version of python3, change the soft link of python3 to point to python3.8 or whatever is the latest version. That way, your latest python version would be available for your programs, by just typing python3.

cmd: cd /usr/bin; rm python3; ln -s python3.8 python3; => now calling python3 calls python3.8

All the discussion below is for python3, unless explicitly specified. I'll point out the differences where applicable. Be careful when searching for python on internet, as many of them are for python2, and may not work for python3.

Official doc is on python.org site: https://docs.python.org/

geeksforgeeks also has very elaborate and fantastic tutorial here: https://www.geeksforgeeks.org/python-programming-language/

 

Python installation:

On any Linux distro, to install python3, there are 2 ways: one thru package mgmt, while other by manually downloading and installing.

1. Pckage mgmt:

CentOS: For CentOS, we install using yum. Below are the ways to install python2 and python3.

A1. rpm pkg for python2: sudo yum install python => by default, it installs python2. It specifically installs python 2.7 in /usr/bin/python2.7. A soft link python2 is created in the /usr/bin dir (/usr/bin/python2 -> python2.7), and another soft link "python" is made to python2 (/usr/bin/python -> python2).

A2. rpm pkg for python 3.4: sudo yum install python34 => installs python 3.4 in /usr/bin/python3.4. A soft link python3 is created in the /usr/bin dir (/usr/bin/python3 -> python3.4). python and python2 remain soft links to the already installed python2.7.

A3. rpm pkg for python 3.6: sudo yum install python36 => installs python 3.6 in /usr/bin/python3.6. A soft link is created to python3 in /usr/bin dir (/usr/bin/python3 -> python3.6). python and python2 are soft links to python2.7 already installed

A4: rpm pkg for python 3.7: sudo yum install python37 => Although latest version of python is python3.7, yum repo still doesn't have it, and gives an error that "no such package found". Run "yum info python37" to find out if python 3.7 available or not.

NOTE: one very important thing to note is that "yum" is written in python2. So if you change the soft link python to point to python3 (after installing python3), then yum will not work and will throw this error:
  File "/usr/bin/yum", line 30
    except KeyboardInterrupt, e:
 SyntaxError: invalid syntax

To fix this, do one of 2 things:

1. change python version being called in yum to python2: In /usr/bin/yum file, change first line from "#!/usr/bin/python" to "#!/usr/bin/python2". This will force python2 soft link to be used, instead of using python link.

2. change the softlink python to point to python2. This will cause yum to still work, as the softlink python points to python2. However, this may cause other pgms to fail, which rely on python3 and need the python link to point to python3. To fix those, for any pgm that needs python3, change the first line in that pgm to point to python3 instead of python.

First choice is preferred, as python3 is the step forward, so keeping soft link python pointing to python3 is going to work for most pgms.

Linux Mint: On LinuxMint, we install using apt. Latest python is 3.8 as of June 2021.

A1. sudo apt install python3.8 => This installs python version 3.8. Look in /usr/bin/ dir to make sure you see python3.8 over there.

 

2. manual: not tried yet. It's not the recommended way, as it requires a lot more effort, and there's no reason to do it (all linux distros allow you to install via pkg mgmt).

 

Python syntax:

1. comment: a Python comment is anything after a # at the start of a line or at the end of a line. Multiline comments are not supported, but can be mimicked by putting the comment within triple quotes, i.e. ''' .... multi line comment '''

ex: a=2 #comment


2. case sensitive: Python is case sensitive. So, var and Var are different.


3. End of line: Each stmt ends with a newline \n. So, no ; is needed (this is in contrast to other languages which use ; etc. to indicate end of a stmt). However, for multiple stmts on a single line, we need ;. In cases where we need a line to continue, we use the line continuation char \. Recall that \ hides the metacharacter immediately following it and treats it as literal, so the newline metacharacter is hidden from the python interpreter; what the interpreter sees is just a continuation of the same logical line.
ex:
total = item_one + \
        item_two + \
        item_three


4. Blocks of code: no braces are used; instead, all statements within a block must be indented by the same amount. This is a unique feature of python, and also confusing at first, as other languages use brackets or keywords to mark the beginning or end of a block, but never rely on spaces or indentation. The indentation is conventionally a tab or 4 spaces for a block. 2 tabs or 8 spaces signify another block nested within the outer parent block; similarly, 3 tabs signify yet another nested block within the outer 2 blocks, and so on. We could use just 1 space to indent a block, but for readability we keep it as 4 spaces or 1 tab (most editors automatically convert a tab into 4 spaces, so it's the same thing). All of the code with the same number of spaces at the start of the line is considered part of one block. NOTE: we can't use 0 spaces to identify a nested block, as that will error out; we do need some indentation.


ex:
if True: => header lines begin with a keyword (if, else, while, etc), terminate with : and are followed by a suite
   print("True") => this group of stmts in a single code block is called a suite. This is indented by a tab or 4 spaces, so it's part of the if block

   print("I'm here") => This is part of the if block too, as it has the same indentation.
else:
   print("False") => this is part of the else block, as it's indented by a tab

print("end") => this is not part of the if-else block, as it's not indented at all.

NOTE: these are 2 of the most distinct departure from other languages.

  1. One is the absence of end of line character (i.e no semicolon etc, just a newline marks end of cmd in a line). We can always add a semicolon at end of line and python will work just fine, but correct way is to not put a semicolon.
  2. Second is the use of tabs or spaces to identify blocks of code. Usually high level languages don't rely on spaces for correct functionality, but python is all about spaces. Most other languages use curly braces  { ... }  to define scope of loops, functions, etc.

 

5. reserved keywords: Like any other pgm lang, python has reserved keywords, which can't be used as var names or for any other purpose. ex:

1. if else,
2. and, not, or
3. for, while, break, continue
4. try, return, assert, class, exec (python2), etc. NOTE: print was a keyword in python2 but is a built-in function in python3; it's the most used function and is explained later under the "Functions" section.

6. quotes: single quotes and double quotes have same meaning, and so are interchangeable in python. We use one or the other when it's absolutely needed, i.e use double-quotes if your string contains a single-quote

Running python:

We can run python interactively or run a python pgm via a cmd line

python --version => returns version num. If it's 2.x it's older, if 3.x it's newer. We can also run "python -V" to get version num.

1. interactively:

typing python brings up python shell. Prompt is >>>. We can type any python cmd in it. Type "Ctrl + D" to exit shell.

>>> print("Hello")

prints Hello on screen

2. via cmd line:

file: test.py => here we are specifying python3 as the interpreter instead of python (since python is usually set as a soft link to python2)

#!/usr/bin/python3
print ("Hello, Python!") #this is a comment:
# single line comment


> type ./test.py to run above file. (do chmod 755 test.py to make it a executable file).

> python3 test.py => This also runs the above python file. We could do "python test.py" too. This will work as long as syntax in test.py is python2 syntax.

 

Data Types and variables:

 

I. Variables:

As in any programming language, we need to define variables which store data, which may be of multiple data types supported by the language. var do not need explicit declaration of data type. This declaration happens automatically when you assign a value to a variable using = sign (i.e var2=1.1 => assigns float num 1.1 to var2)

variable names or Identifiers: start with a letter (a-z, A-Z) or _. Variables are not declared beforehand to be of a particular type (unlike in C); this is common practice in most scripting languages. The type is figured out by python during assignment.

II. Data types: Python is a strongly typed language, meaning we need to explicitly convert one data type to another to mix them, else it will give an error.

A. primitive data types: In python, we have 4 primitive data types:

1. numbers: numbers may be of 4 types (NOTE: the long type exists only in python2; in python3, plain int has unlimited precision):

 A. int (signed integers) ex: var1=10
 B. long (long int, can also be oct or hex) ex: var2=-579678L; var3=0xDEADBEEF
 C. float (fp real) ex: var4=15.2; var5=32.3e18
 D. complex (complex num) ex: 3.14j, 4.5-3j

2. Strings: strings are a continuous set of chars in double quotes "...." or single quotes '....'. In both of these quotes, values are not substituted but printed as is. There is special formatting available that allows substitution within single or double quotes, explained later. This is different from other scripting languages and common programming languages, which treat single and double quotes differently. There is also a triple quote in python that allows a string to span multiple lines (so newline, tab etc. are treated as part of the string).

ex: var1 = "my name"

ex: address = ''' my house is at => due to triple quotes, everything on this line and below is part of the string, including newlines. print(address) will print all 3 lines as is.

1207 goog ln,

los angeles '''


Subsets of strings can be taken using the slice operator ([ ] and [:] ) with indexes starting at 0 in the beginning of the string and working their way from -1 at the end.
The plus (+) sign is the string concatenation operator and the asterisk (*) is the repetition operator.
ex: str='My world'; print(str[0]) => prints M; print(str[2:5]) => prints " wo" (chars at index 2, 3 and 4; the stop index 5 is not included); print(str+"TEST") => prints My worldTEST

There are 2 types of string in python2: the traditional str type (1 byte or 8 bit chars), and the newer unicode type (where a char is no longer limited to 1 byte); in python3, str is unicode by default. On any string literal, we can put a char "u" in front of the string to indicate that it's unicode type. UTF-8 is a widely used encoding where each char takes a variable length from 1 to 4 bytes (1 byte could only store ASCII chars and can't handle the millions of other chars out there; UTF-8 stays compatible with 1 byte ASCII).

There are many other prefixes besides "u" to indicate how the string is going to be interpreted. "r" means raw string type (so that anything inside the string is going to be treated as literal and not interpreted. ex: r"me\n" => this is not going to treat \n as new line but instead as 2 literals \ and n.

str=u'Me .op' => this string is now unicode type (since u precedes the string), so characters are no longer limited to 8 bits. u'text' is just a shortcode for calling unicode('text') in python2; in python3 the u prefix is accepted but redundant.

formatted strings: In version 3.6 of python, formatted string literals were introduced. So far, no substitutions happened for any characters inside strings, but with formatted string (or f-string), we can provide replacement fields by enclosing them within { ... }. Any python expr is allowed within these curly braces.

ex: name = "Fred"

a = f"He said his name is {name}." => This substitutes name with variable "name"., since we have f in front of string.

a = "He said his name is {name}."=> no substitution occurs

NOTE: char: There is no char var type in python. Chars are represented by strings of length one.

There are many string methods available to operate on strings. Look in python link for such methods.

ex: str.upper() returns a copy of string, with all letters uppercased. "My name".upper() returns string "MY NAME"

3. boolean: 2 values: True and False.

 

B. Compound data types:

4. List: most versatile. Similar to arrays in C, except that items belonging to a list can be of diff data types. NOTE: there is no array data type in core Python. Lists are a superset of arrays, so we use a list in its place. Lists have the same indexing syntax as arrays. On the internet, lots of articles talk about arrays in python; in reality they are usually talking about lists.

list contains items separated by commas and enclosed within [].

1D Lists:

ex: mylist = [] => this defines an empty list (since [] used, it implies a list). However, the size of list is not defined here, i.e if the list has 10 entries or 100 entries isn't mentioned, so it's not possible for compiler to reserve memory for this list at this point in time.
ex: list1 = ['A',1,"john", 23]; print(list1[1:3]) => prints [1, "john"] => Here, we specify entries of the list. So, here compiler/interpretor reserves memeory for list depending on how many entires are in the list, and the size of each entry. NOTE: the range specified includes item with index=1,2 but NOT index=3, as range is up to index-1. Also, commas preserved when we print the list
ex: list1[3]=102 => this updates value 23 with new value 102, not possible with tuple since it's read only
ex: for x in [1, 2, 3]: print(x, end=' ') => prints 1 2 3 (in python2 this was written as: for x in [1, 2, 3]: print x,)

Assigning values to list: We saw one way to assign initial values to list. Let's see if we can assign initial values to a list in other way.

my_list[0]=4 => Here my_list is being indexed for the 1st time, with the 0th entry given the value 4. Previously, we assigned list values as my_list=[4], which worked. This, however, gives "NameError: name 'my_list' is not defined", because the name my_list was never assigned before, so python doesn't know of any list by that name. So, let's define an empty list first.

my_list = []; my_list[0]=4; my_list[1]=2; => This gives "IndexError: list assignment index out of range". This is because the list is empty, so index 0 doesn't exist yet; a list can't be grown by assigning past its end. If we had assigned values to this list as my_list = [4,2], then python knows the size of the list as 2, and assigns my_list[0]=4 and my_list[1]=2. Then we can access values as my_list[0].

One way to resolve the above issue is to define the list with its size specified. ex: my_list = [0]*4; => This defines a list with 4 elements [0,0,0,0]. Now we can do my_list[0]=4. All 4 elements start out as the same value (0 here); they can be reassigned to values of any type later. See the sketch below.
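A short sketch of the above, plus the more common way of growing a list one element at a time with the built-in append() method:

my_list = [0] * 4      # preallocate: [0, 0, 0, 0]
my_list[0] = 4         # index assignment now works
print(my_list)         # [4, 0, 0, 0]

other = []             # empty list
other.append(4)        # grow it instead of indexing past the end
other.append(2)
print(other)           # [4, 2]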

2D lists:

2D lists are an extension of 1D lists, i.e. each element of a 2D list is itself a 1D list.

ex: my_arr = [ [300, 200,100, 900], [600, 500, 400, 700] ]; => This is a 2D list, where each list element is 1D list.

Accessing list elements: We access it the same way as in 1D list, except that we provide the index of 1D list also.

print(my_arr[1][0:2]) => prints [600, 500]. This is called slicing of an array/list/tuple. The format is [start_index : stop_index : step]; the stop index itself is not included. See the numpy module section for more details. So, my_arr[0][3:1:-1] = [900, 100]

ex: print(my_arr[:]) => This prints the entire 2D list, since a blank start means start from the 1st index and a blank stop means stop at the last index. o/p is: [[300, 200, 100, 900], [600, 500, 400, 700]]. This applies to a list of any dimension: arr[:] returns all elements of the list. Note that slicing across multiple indices doesn't work on nested lists the way it does in numpy: my_arr[1:3][0:5] just slices the outer list twice, returning [[600, 500, 400, 700]], not a row/column sub-block.

We define a 2D list the same way as a 1D list, i.e. list_2d = []. However, we can't do something like list_2D[0][0]=5 without having this list already populated (for the same reasons as the 1D list above) with values such as: list_2D=[[67,34],[35,67]]. Now we can do: list_2D[0][0]=5.

We can initialize a 2D list as: my_list = [[0]*2]*3; => This creates a 3x2 list with all values 0, i.e. [[0,0],[0,0],[0,0]]. Careful: all 3 rows are references to the same inner list, so changing one row changes all of them (see the sketch below).
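A short sketch of the aliasing pitfall mentioned above, and the usual list-comprehension fix:

shared = [[0] * 2] * 3                # 3 references to the SAME inner list
shared[0][0] = 5
print(shared)                         # [[5, 0], [5, 0], [5, 0]] -> all rows changed

safe = [[0] * 2 for _ in range(3)]    # 3 independent inner lists
safe[0][0] = 5
print(safe)                           # [[5, 0], [0, 0], [0, 0]]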

list operators: There are multiple operators/functions for manipulating lists. Some of them are: len(list3); list4.sort(); sorted(list4). (cmp(list1, list2) existed in python2 but was removed in python3.)

Arrays: Lists behave almost same as arrays, but are not efficient. Lists are more generic than array (in that they allow multiple data types, while array allow same data type only), but they also get less efficient for storing/retrieving, computing, etc. Most scientific computations can be easily carried out with arrays, since they usually work on only one kind of data (i.e int, float, etc). Python doesn't enforce typing (i.e one particular type of data as int, etc), so they never created an array data type in Python. For most basic uses, lists serves our purpose, and we don't care about speed. However, if performance becomes critical because of large amount of data to work with, then Arrays are needed.

We said previously that python doesn't have arrays. However, python supports modules which allow us to use arrays. 2 ways to create arrays in Python

A. array module: Python has module "array" that can be imported to get arrays.  We specify the type of data elements, and all elements of array have to be of that data type. There are many functions available to operate on array. This method is not recommended method for creating arrays, use 2nd method using numpy module.

ex: import array as arr => We don't need to install any module for this. More details about array module can be found on internet

my_array = arr.array('i', [2, 4, 6]); print(my_array) => prints array('i', [2, 4, 6]) => NOTE: everything in the array, including the type code, is printed. Again, commas are preserved while printing the array (same as with lists)

B. numpy module: There is NumPy module that can be used to create arrays. It's not included by default with Python distribution, so will need to be installed (see in NumPy section). This is the recommended method for creating arrays.

5. tuples: similar to list, specified using (). however they cannot be updated (i.e read only). We can apply slicing across tuples also. Used very rarely in simple codes.
ex: tuple1 = ('ab', 2.7)

6. sets: sets are similar to sets in maths where we can take union, intersection, etc. Sets defined using curly braces { .. }. They contain any number of objects, and of different types. Sets are unordered: the original order, as specified in the definition, is not necessarily preserved. Additionally, duplicate values are only represented in the set once. set elements must be immutable. For example, a tuple may be included in a set, as it's immutable. However lists and dictionaries are mutable, so they can’t be set elements. Other way to create set is using the set() function.

ex: x = {'foo', 'bar', 'baz', 'foo', 'qux', 12, (1,2), None}

print(x) => {None, 'foo', 12, (1, 2), 'bar', 'baz', 'qux'} => NOTE: duplicate entries are removed, and the order of elements is not preserved

Many operators as union, intersection, difference, |, &, ^, etc are allowed on sets. sets are also very rarely used in simple programs.


7 dictionary: They are like hashes in perl. They are also known as associative arrays. They consist of key-value pair. key/values can be any data type.
Dictionaries are enclosed by { }; values can be assigned and individual elements accessed using [ ... ]. Since both sets and dictionaries use { }, we distinguish b/w the two via the presence of ":". Since { } is used to rep an empty dictionary, we can't use {} to rep an empty set (the python interpreter would have no way of knowing if the object is a set or a dictionary); in that case, we use the set() func with no args to create an empty set. We use ":" to assign the key:value pair for each element

1D dictionary: Just like 1D list, we have 1D dictionary:

ex:

tinydict = {'name': 'john','code':6734, 'dept': 'sales'} => Assigns key value pairs as follows: name->john, code->6734, etc. print(tinydict.keys()) prints dict_keys(['name', 'code', 'dept']) while print(tinydict.values()) prints dict_values(['john', 6734, 'sales']) (python2 printed these as plain lists, e.g. ['dept', 'code', 'name'])

tinydict['name'] prints "john", tinydict['code'] prints "6734" and so on

Assigning values to dict: There are 2 ways to assign dict key/value pairs.

A. We can assign dict key/value pair as we did in 1D list, and as shown in ex above.

ex: tinydict = {"name": "john",5:9}

B. We can also assign dict values in array form as shown below. This is different than in 1D list, where we weren't allowed to do dict[0]=5 and so on.
dict = {} => initialize the dict. This is needed for a dictionary, as w/o this there is no way for the python compiler/interpreter to know whether dict[0]=1 is a list assignment or a dictionary assignment.

dict[0]=5 => Now we are allowed assignments like these. NOTE: 0 is a key here, and not an index number. It just happens to be an integer key, as 0 is not enclosed in quotes. The value is also an integer, as it's not enclosed in quotes.
dict['one'] = "This is one" => print (dict['one']) prints "This is one". Here both key and value are strings.
dict[2]     = "This is two"

2D dictionary: Just like 2D lists, we can have higher dim dicts as 2D, 3D, etc. However, for a 2D dict we can't directly do something like dict_2D['a']['b']='xyz'. The reason is that dict_2D['a'] doesn't exist yet, so the lookup raises a KeyError before the inner assignment can happen. So, we have to first create the inner 1D dict, and then use that 1D dict as an element of the 2D dict.

ex: dict1D['age']=35; dict1D['salary']=300;

dict2D['ramesh']=dict1D => Now dict2D['ramesh']['age']=35, dict2D['ramesh']['salary']=300 and so on. dict2D['mohan']={'age':50,'salary':500}. So 2D dict are just an array of 1D dict.

So, 2D dict are little cumbersome to write as you will first need to form 1D dict and then use that as elements of 2D. It would have been nice to just directly assign elements to 2D dict.
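A short sketch of the two-step approach above; collections.defaultdict (a std library class) can hide the first step if desired:

from collections import defaultdict

# manual two-step approach from the text
dict2D = {}
dict2D['ramesh'] = {}                 # inner dict must exist first
dict2D['ramesh']['age'] = 35
dict2D['ramesh']['salary'] = 300

# defaultdict creates the missing inner dict automatically on first access
auto2D = defaultdict(dict)
auto2D['mohan']['age'] = 50
auto2D['mohan']['salary'] = 500
print(dict2D['ramesh']['age'], auto2D['mohan']['age'])   # 35 50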


Operators:

Just like in other langs, we have various operators to operate on variables. Mostly, operators are used on the number data types (int, float, etc.), but some of them can be used on other data types too. How an operator behaves depends on the data type of its operands.

1. arithmetic: +, -, *, /, etc. ex: a+b. + and * are used in strings to concatenate or repeat strings.
2. comparison: ==, !=, >, >=, etc ex: (a<b)
3. assignment: =, +=, -=, etc ex: c=a+b;
4. bit wise : &, |, ^, ~, <<, >>, etc ex: a=16, b=2, a&b
5. logical: not, or, and

Control statements:

1. if elif else: This is the same as the if stmt in other languages. elif is a substitute for "else if". Both elif and else are optional. An if ... elif ... elif ... sequence is a substitute for the switch or case statements found in other languages.

ex: The below if .. elif .. else stmt needs appropriate indentation for each block of code. NOTE: if, elif and else are at the start of the line with no indentation.

if x < 0:

  x = 0

  print('Negative changed to zero')

elif x == 0:

  print('Zero')

else:

  print('More')

ex: if ( var == 100 ) : print ("val is 100") #for single line suite, it can be on same line


2. for: the for stmt differs from that in C. There is no start, end or iteration index. Python's for statement iterates over the items of any sequence (a list or a string), in the order that they appear in the sequence.

ex: below iterates over the list and prints each word and length

words = ['cat', 'window', 'defenestrate']

for w in words:

  print(w, len(w))

ex: to iterate over a seq of numbers just as we do in a for loop in a C pgm, we can use the range() function. The syntax of range is (start, stop, step), where stop is the required parameter, while start/step are optional. The stop value is not included in the range (i.e. the range is up to stop-1). range(10) generates 10 values, from 0 to 9 (doesn't include 10). range(5,9) generates 4 values = 5,6,7,8. range(0,10,3) indicates a step value of 3, so it generates 4 values = 0, 3, 6, 9. So, by using the range() function, we can achieve what we do using for loops in a C pgm.

for i in range(5):

  print(i) => prints 0,1,2,3,4

ex: To iterate over the indices of a sequence, you can combine range() and len() as follows:

for i in range(len(words)):

  print(i, words[i]) => This prints indices 0,1,2 along with the 3 words

3. while: The while statement is used for repeated execution as long as an expression is true.

ex: infinite loop below since expr is set to "True"

while True:

  print("infinite loop")

4. break, continue, else: break, continue and else clauses can be used in loops such as "for" and "while". "break" breaks out of the innermost enclosing for or while loop, while "continue" continues with the next iteration of the loop. An else clause can also be attached to "for" and "while" loops: a loop's else clause runs only when no break occurs. Look for more details in the python website link above.
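A minimal sketch of break/continue and the loop else clause (the list values are just illustrative):

for n in [3, 5, 8, 11]:
    if n % 2 == 0:
        print(n, "is even, stopping")
        break                      # leaves the loop, so the else clause does NOT run
    if n == 5:
        continue                   # skip the rest of this iteration
    print(n, "is odd")
else:
    print("no break happened")     # runs only if the loop finished without a break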


Functions: Function syntax is similar to those of other lang. All functions require parenthesis and optional args inside it.

1. Builtin: Python provides many builtin functions as print(), int(), abs(), open(), sorted(), etc.

A. print( ) : print function is one of the most used functions to o/p something on screen. It wasn't a function in python2 (it was just a statement), so no ( ) were required with print, but it's a function in python3, so it needs ( ). i.e: print("Hello, Python!"); However () works in python 2 also. So, it's preferred to use print as a func with parenthesis ( .... )

Python2: print "The answer is", 2*2, "My name=", name, "var=", 2*var1

Python3: print("The answer is", 2*2, "My name=", name, "var=",2*var1) => this will work in python2 also as parenthesis work in python2. Anything within quotes is printed as literal string, anything outside quotes is computed if it can be computed based on data types, or the value is just printed if it's a var. A newline is appended by default, but if we put a comma at the end of args 9i.e just before closing parenthesis), it suppresses newline.

We can use strings, list, var, etc to be printed using print. With List and tuples, full list will be printed, w/o requiring us to iterate over each element of the list.

ex: formatted string and other string type can be used inside print

name = "Fred"; print(f"He said his name is {name}." ) => This substitutes name with variable "name"., since we have f in front of string.

% operator for strings: String objects have one unique built-in operation: the % operator (modulo). This is also known as the string formatting or interpolation operator. Given format % values (where format is a string), % conversion specifications (as d, s, f, etc) in format are replaced with zero or more elements of values. The effect is similar to using the sprintf() in the C language.

ex: name="afg"; age=2;

my_format = "his name is %s, his age is %2d"; my_values =  (name, age) => NOTE: my_values need parenthesis since they are tuples (not curly braces or square brackets)

print(my_format % my_values) => Here %s and %2d in format string are replaced with values in var "name" and "age".NOTE: the whole thing here can be treated as a string, that is put inside print function. Whatever is the o/p of this formatting operator is passed to print func as an argument.

o/p is => his name is afg, his age is  2

ex: print( ' %(language)s has %(number)03d quote types.' % {'language': "Python", "number": 2}) => outputs "Python has 002 quote types". Here "s" after %(language) is a conversion spec saying convert the 'language' value into a string using str(). Similarly the 03d spec converts "number" into a signed integer padded to 3 digits. Here the values are not a tuple but a dict, so curly braces are used. NOTE: there is no comma after the quotes of the string, as it's "format % values" that is being used inside the print function, and not the typical "string followed by variable" syntax

ex:  We can use % operator on string inside print func, along with other regular args, as strings, var, etc to be printed. The whole format string is just another string arg to print func.

var2=23; var3 = "my stuff"
print('The value of pi is approximately %5.3f.' % math.pi, var2, "good", var3) => Here math.pi (this needs "import math") is formatted with a minimum field width of 5 and 3 digits of precision (%5.3f means width=5, precision=3).

o/p is => The value of pi is approximately 3.142 23 good my stuff

format method: above are older ways of formatting print o/p. Now, we use format method to format strings.

ex: print('{0} and {1}'.format('Geeks', 'Portal')) => {0} is replaced by the string in position 0 which is 'Geeks' and {1} is replaced by the string in position 1 which is 'Portal', so o/p is => Geeks and Portal. NOTE: there is no comma here after the quotes but a dot, since .format() is a method called on the string itself, not a separate argument to print.

B. input( ): input function is other widely used function to get input from user. There are diff b/w how this func behaved in python2 vs python3.

Python 2:

python2: str = raw_input("Enter your input: "); => raw_input() reads 1 line from std i/p and returns it as string w/o the newline
python2: str = input("Enter your cmd: "); => same as above except that valid python expr can be provided, and it will return result. result is still stored as string.
  Enter your cmd: [x*5 for x in range(2,10,2)]
  Received input is :  [10, 20, 30, 40] => str stores this list

Python 3:

python3: the raw_input() function from python2 has been removed and replaced by the input() func. So, no python expr can be provided (the input is always returned as a string).

python3: the input() function of python2 is gone; the stmt eval(input()) must be used to get the same behaviour as python2's input(). We don't use this stmt much; instead the plain input() func above is used.

With all these input functions above, the result is stored as string, so in order to do numeric computation, we have to do data conversion using func below. Also, no expr are allowed, i.e expr will be treated as strings, and won't be computed.

ex: here 2 numbers are provided as i/p, but have to be converted to int in order to add them

num1=input("1st number")

num2=input("2nd number")

sum=int(num1)+int(num2)

print("Sum is", sum); #Here if i/p is 1 and 2, then o/p is 3. If we just did "sum=num1+num2", then it would concatenate the 2 strings and print "12"

C1. type(): type is an inbuilt func to find data type of any var or object (in case of OOP discussed later):

ex: age=50; print(type(age)) => prints <class 'int'>.

ex: type_var = type(tinydict) => assigns the type object dict to type_var (printing it shows <class 'dict'>), as tinydict defined above is of type dict


C2. data conversion: data can be converted from one type to other by casting. Some of the casting functions are:
ex: int(x), str(x), list(y), hex(x), dict(d)

ex: python3: var_int = int(input("Enter any number: ")); var1=var_int+1; => here, var_int stores an integer (any number entered is read as a string, but the int() func converts it to int, so that we can do arithmetic computation on it)

C3. isinstance(): The isinstance() function returns True if the specified object is of the specified type, otherwise False.

ex: if (isinstance("Hello", str) ): print("true") => This checks if "Hello" is of type string. It returns True since anything within ".." is a string

ex: my_num=4.7; var1=isinstance(my_num, (str,list,dict,tuple)); print(var1) => this prints "False", since my_num is of type "float", while the allowed types that this func is checking for are str, list, dict and tuple.

D. Maths: abs(), max(), pow() are true builtins; log(), cos(), radians() and the constants pi, e come from the math module, and random() from the random module, so those need an import (see the sketch after the list below).
ex: abs(x); log(x); max(x1,x2,...); pow(x,y);
ex: random()
ex: cos(x); radians(x);
ex: constants: pi, e
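A short sketch of these (only math and random need an import):

import math, random

print(abs(-4), max(3, 9), pow(2, 5))                  # 4 9 32  (builtins, no import needed)
print(math.log(math.e), math.cos(math.radians(60)))   # 1.0 and approximately 0.5
print(math.pi)                                        # 3.141592653589793
print(random.random())                                # some float in [0.0, 1.0)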

E. File functions: Python has file functions for reading/writing files just as in other lang.

file read/write ex shown below:
fo = open("foo.txt", "w+") => opens file for both rd/wrt, ptr at start of file. w=wrt_only, r=rd_only, (a=append_mode, ptr at end of file)
fo.write( "Python is a great language.\nYeah its great!!\n");
str = fo.read(10); => reads 10 bytes from file, if no arg provided, then reads whole file until EOF
print "Read String is : ", str
fo.close
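The same read/write is usually written with a "with" block, which closes the file automatically (foo.txt is just an example name):

with open("foo.txt", "w+") as fo:
    fo.write("Python is a great language.\nYeah its great!!\n")
    fo.seek(0)                      # move back to the start before reading
    data = fo.read(10)              # first 10 characters
    print("Read String is :", data)
# the file is closed here automatically, even if an exception was raised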

exception: when script encounters a situation that it cannot cope with, it raises an exception. An exception is a Python object that represents an error. exception must be handled, else pgm terminates.
ex:
try:
   fh = open("testfile", "r")
   fh.write("This is my test file for exception handling!!") => trying to wrt to a file opened rd-only raises an exception (IOError, which in python3 is an alias of OSError)
except IOError: => std exception raised when an I/O operation fails
   print("Error: can't find file or read data") => This gets printed when an IOError exception happens in the try block
except ...: => handlers for other exception classes can be put here for a diff exception raised. There are dozens of built-in exception types that we can specify

except: => a bare "except" stmt w/o any exception class catches any exception; it must be the last except clause
else:
   print("Written content in the file successfully") => If no exception occurs, then run this block

Assert: An assertion is a sanity-check. An expression inside assert stmt is tested, and if the result comes up false, an exception is raised. Assertions were added in Python 1.5. They are usually placed inside function definition to check for valid inputs or to check for valid outputs. AssertionError exceptions can be caught and handled like any other exception using the try-except statement, but if not handled, they will terminate the program and produce a traceback. Assertions are very useful in exposing bugs, and should always be used extensively.

assert (Temperature >= 0),"Colder than absolute zero!" => This checks that the Temperature variable is >= 0 (non-negative). If negative, an AssertionError is raised carrying the message "Colder ..." and the pgm terminates.

assert(isinstance(b, float) or isinstance(b, int)) => Here on failure of assertion (i.e b is neither float nor int), no stmt is printed, but pgm terminates with traceback. If there are many assertions in pgm, it may be tedious to figure out which assertion failed, so it's good practice to have "text" following assert keyword.

 

2. User defined: Besides the built in functions provided by python, we may define our own function also. There are 2 kinds of function defined in python:

A. Normal function: These are regular function definition (as is common in other pgm lang)

defining a func:
def functionname( parameters ): => i/p param or args
   "function_docstring" => optional: explains what this func does
   function_suite
   return [expression] => If no expr provided, it returns none

ex:
def printme( str ):
   "This prints a passed string into this function"
   print(str)
   return

printme("I'm first call to user defined function!") => calls printme func

NOTE: Arguments in Python are passed by object reference. If you mutate a mutable argument (e.g. append to a list) inside the function, the change is visible in the calling function; but if you rebind the parameter to a new object, the caller's variable is unaffected.
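A small sketch of that behaviour (changeme is just an illustrative function name):

def changeme(mylist):
    mylist.append(4)        # mutating the object: visible to the caller
    mylist = [0, 0]         # rebinding the name: NOT visible to the caller

nums = [1, 2, 3]
changeme(nums)
print(nums)                 # [1, 2, 3, 4]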

If var defined within func, then they are local to func, and are diff from same var declared outside the func.
total = 0; # This is a global variable.
def sum( arg1, arg2=10 ): # default val of arg2 is 10
   total = arg1 + arg2; # Here total is a local variable.
   return total; # here 30 is stored in the local total and returned.

# Now you can call sum function
total1 = sum( arg1=10, arg2=20 ); # here total1 is 30. Using keyword args (arg1=10, arg2=20) lets us place args out of order.
print(total) # => here total is printed as 0, as it's the untouched global var

Passing func as an arg: We can also pass a func as an arg to another func

ex:

def shout(text): 
    return text.upper() 
def greet(func1): => Arg of greet function is func1
    greeting = func1("hi") => func1 is called with arg specified
    print(greeting)
  
greet(shout) => This calls greet func with arg "shout", which is itself a func. shout gets called with arg "hi", so o/p returned is HI.

B. anonymous function: These are functions w/o a name, and are faster way of implementing simple one line functions. "lambda" keyword is used to create anonymous functions. This function can have any number of arguments but only one expression, which is evaluated and returned. It's also called as lambda func and can also have another function as an argument. 

ex: square = lambda x1: x1 * x1 => Here we define an anonymous func with one arg "x1" that computes its square, and assign it to the var "square", which now points to the lambda func.

print(square(5)) => This calls the var pointing to func "square" with arg =5. It returns 25.

ex: cube = lambda func1:func1**3 => here func1 is an arg to lambda func.

print(cube(square(2))) => here the cube func is called with arg "square(2)". square is called with arg 2 first, which returns 4; this 4 is then cubed, giving 64.
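Lambdas are also commonly passed as the key argument of builtins like sorted(); a small sketch:

pairs = [(2, 'two'), (1, 'one'), (3, 'three')]
print(sorted(pairs, key=lambda p: p[0]))    # sort by the first element of each tuple
# o/p: [(1, 'one'), (2, 'two'), (3, 'three')]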

More Topics: More advanced topic are in next section.

Perl

perl stands for practical extraction and report language. It is designed for tasks that are too heavy for shell, and too complicated to code in C.

perl is highly portable. It runs on any unix like system that has a C compiler. It runs on most platforms, since the package comes with a configuration script that pokes around looking for things it requires, and adjusts include files and defined symbols accordingly. Perl originated in the late 1980s, became very popular, but is now getting overshadowed by python. Before perl came, the awk and sed scripting languages were used. Perl was a big improvement over these, which contributed to its rise. However, syntax wise, python is easier for beginners than perl. I've included perl on this site, as many legacy programs at various companies are still written in perl, which you may need to debug, so knowing a little perl is going to be useful. However, if you are looking to learn a new scripting language, move to python. Python has a lot more support than perl, and is increasingly preferred for future scripts.

Unlike a shell pgm, perl is not a true interpreter. It compiles the file before executing any of it. Thus it is both compiler and interpreter (just like awk and sed).

Link for beginners (There are lot of other useful link for beginners on this site.): http://perl-begin.org/tutorials/perl-for-newbies/ 

Official perl documentation: https://perldoc.perl.org/

perl version: very important to verify version before starting to work, as syntax/features changes a lot b/w versions. Perl version 5 and beyond have lot more changes compared to earlier versions.

perl -v => returns v5.18.4 on centOS 6 running on my machine.

perl -V => (note capital V). This shows lot more details as compiler, library, flags, etc used for perl on this system

simple perl pgm: test.pl => we name the file with extension .pl as a convention. Unix doesn't care about file extensions, as it's not used for anything.

#!/usr/bin/perl
use strict;
use warnings;

print "Hello ARGS = @ARGV $0 \n";

Save file above as test.pl, then type:

chmod 755 test.pl => This makes the file executable.

./test.pl cat dog => this gives "Hello ARGS = cat dog ./test.pl"

Basic Syntax:

Just like any other programming language, perl has variables to store diff data types, has reserved keywords or commands and special characters. These allow the language to do all sorts of tasks.


1. comments: start with # till the end of line. No multi line comments


2. semicolon ; => all stmt terminated by ;


3. whitespace (spaces, tabs, newline, returns) => optional. whitespace is mandatory only if putting 2 tokens together can be mistaken for another token, else not needed. However, as we have seen with other scripting languages, we should always put whitespace to avoid ambiguity

4. curly braces {} => curly braces are used to group bunch of stmt into 1 block. Mostly used with control stmt.


5. parenthesis () => parenthesis for built in functions like print are optional. ex: print ("Hello\n"); print "Hello";

6. use <mod_name> <LIST>; => This function imports all the functions exported by MODULE, or only those referred to by LIST, into the name space of the current package. LIST is optional, but saves time and memory, when all functions in MODULE are not needed

ex: use Cwd qw(cwd chdir); => imports functions "cwd" and "chdir" from module Cwd

ex: use Time::HiRes "gettimeofday"; => imports function "gettimeofday" from module Time::HiRes

use strict; => this is a perl pragma that requires all var to be declared before being used (all var will need to be declared with "my" or "our", else it will generate an error). A pragma is a directive to the compiler/interpreter on how to process its i/p (sort of like cmd line options). A "use" stmt applies the pragma at compile time, before the pgm starts running. Note: strict is NOT on by default; it's enabled automatically only if you ask for a recent perl with "use v5.12;" (or later), so it's still good practice to write it explicitly.
ex: my $a=2; => "my" declares a lexically scoped var, visible only inside the enclosing block/file; once code leaves that scope the var disappears (temporarily restoring a global's value is what "local" does). This helps prevent conflicts from having the same name in multiple places, and is especially useful in subroutines, as all var are global (package vars) by default. Since we used "strict" above, if "my" wasn't used to declare $a, then any reference to $a would be an error (i.e $a=2 is an error). This helps to find typing errors.

ex: our $q; => an "our" var can be accessed from any code that uses or requires that file/package by prepending the appropriate namespace, e.g. $pkg1::q (if we had used "my $q", this would have given an error, as q wouldn't be accessible outside the pkg)

use warnings; => This turns on warnings, introduced in perl 5.6. It has the same effect as -w on the 1st line of the perl script (#! ... -w)
reserved words are almost always lowercase. So, use uppercase for user defined var. var names are case sensitive, so var and VAR are 2 diff names.
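A tiny sketch of how these fit together (the names are just illustrative):

#!/usr/bin/perl
use strict;
use warnings;

our $count = 0;          # package (global) var, still allowed under strict
my  $name  = "ajay";     # lexical var, visible only in this file/block

sub bump { my ($by) = @_; $count += $by; }   # my() keeps $by local to the sub
bump(3);
print "$name bumped count to $count\n";      # prints: ajay bumped count to 3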

Data types:

Perl has 3 data types: scalar, array of scalar and hash of scalar. Any var not assigned has undef value (0 for num and empty string for string). print function can be used to print scalar and array data type directly, while hash needs each element to be printed separately.

1. scalar: preceded by $ sign, and then a letter followed by letters,digits,_. It's single unit of data which may be int, float, char, string or a reference.  Data itself is scalar, so var storing it is scalar.
operators (as + or concatenate) can be used on scalars to yield scalars.
ex: $salary=25.1; $name="John Adf"; $num = 5+ 4.5; #perl interprets scalars to get correct computation. Note: 5+4.5 with no spaces also works.
ex: $camels = '123'; print $camels + 1; => prints 124 as scalars are interpreted automatically depending on operator

Scalar comes in 3 different flavors: number, string or reference.

A. Numbers: Though numbers can be int or float, internally perl deals only with double precision float. i.e int are converted to float internally. number literal can be 1.25, -12e24, 3485, etc. These are called as constants in other pgm languages. perl supports octal and hex literals also. Numbers starting with 0 are octal, while those starting with 0x are hex.

$num ="129.7"; => here even though number is in double quotes and a string, it will be converted to number if numeric operator (i.e +) used


B. Strings: seq of char, where each char is an 8 bit value from the entire 256 character set. string literals come in these 2 flavors:
  I. single quoted strings: anything inside ' ... ' is treated as it is (similar to bash where it hides all special char from shell interpretation), except for 2 exceptions = backslash followed by single quote and backslash followed by backslash. backslash followed by anything else is still treated as it is.
    ex: 'don\'t' => this is converted to string don't. 'don't' gives a syntax error, as it treats don as a string and sees t' later which is not valid token.
    ex: 'hello\\n' => this is treated as hello\n. 'hello\n' will be treated as it is.
    ex: $cwd = 'pwd' => here $cwd gets the literal string "pwd" instead of the working dir, since single quotes do no interpolation or cmd execution (backticks `pwd` would run the cmd)
  II. double quoted strings: acts like a C string. Similar to double quotes in bash, most special char are still interpreted. Here backslash takes full power to specify special char as \n=newline, \x7f=hex 7f, etc. Also, variables as $x are interpolated in "...", while they aren't in ' .. '.
    ex: "coke\tsprite" => coke tab sprite => tab space is added b/w coke and sprite

NOTE: print function can have both single quotes or double quotes, and they are treated same way as above.

ex: $a='my name'; print $a; => prints var a, which is "my name"

ex: print "$a";=> will print "my name" as substitution done within " ... "

ex: print '$a'; => will print "$a" as special char are treated as is within ' ... '

C. Reference: This is explained below after array and hash.


2. array: preceded by @ sign and stores ordered list of scalars. array list @var also accessed via $var[0], $var[1], etc
ex: @ages = (25,30,40); print "$ages[0] $ages[1] $ages[2]" => 25 30 40
ex: $#ages => gives index value of last element of @ages, in this case 2 (since 0,1,2)
ex: $ages = (25,30,40); => careful: a plain list in scalar context yields its last element, so $ages gets 40, not the length. Assigning the array itself gives the length: $ages=@ages gives 3. So, scalar(@ages)=$#ages+1 => always true, since an array in scalar context returns its length
ex: @names = ("john W", @ages, "amy r12", 1);
ex: print "@names"; => This will print the whole array (no need to separate it out into individual elements (as $names[0], etc)
ex: ($me,$lift)=@names; => sets $me="john W", $lift="amy r12"
ex: ($alpha, $omega) = ($omega, $alpha); => this swaps the 2 values, occurs in parallel (not like C)

various builtin functions available for array:
A. push/pop, shift/unshift, reverse, sort
ex: sort("Fred", "Ben", "Dino") => returns Ben, Dino, Fred.
ex: @guys=("Fred", "Ben"), others is a func that returns Dino. sort(@guys, others()) => returns same sorted list as above.
B. chomp(@ages); => removes a trailing newline (if present) from each element of the array; chop() would remove the last char unconditionally

C. qw => quote word function creates a list from the non-whitespace parts between ( ). Instead of parentheses, we can also use other delimiters as {..}, /../, etc.
ex: @names = qw(john amy beth); => creates list "john","amy","beth". no " ... " or , required. list built by splitting on whitespace.

D. q => this returns a single quoted string, no whitespace separation or interpolation of any var is done. The whole string is returned as if it were in single quotes.

ex: $a = q(I am good $NAME is); => $a = "I am good $NAME is"

E. scalar(@arr); => returns num of elements in the array. See ex above.
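A short sketch exercising a few of these (values are illustrative):

my @ages = (25, 30, 40);
push @ages, 50;                 # @ages is now (25, 30, 40, 50)
my $last  = pop @ages;          # $last = 50, @ages back to (25, 30, 40)
my $first = shift @ages;        # $first = 25, @ages = (30, 40)
unshift @ages, 20;              # @ages = (20, 30, 40)
my @sorted = sort @ages;        # default string sort; use sort { $a <=> $b } for numeric
print "last=$last first=$first sorted=@sorted\n";
print scalar(@sorted), "\n";    # 3 = number of elements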

3. hash: preceded by % sign, and used to store sets of key/value pair. key/value pair may have single quotes, double quotes or no quotes as needed.
ex: %data = ('john p', 1, 'lisa', 25); print "$data{'john p'}"; => prints value 1. "john p" is the key.
    @dat1 = %data; => This assigns hash data to array dat1. So @dat1 = ('john p', 1, 'lisa', 25); We can also assign one hash to other: %dat2=%data. %data=@dat1 converts array to hash (odd entries are key, while even entries are values)
ex: %map=(); => clear the hash
ex: %map = (red=>0xff; green=>0x07; ...); #other way of assigning hash values
ex: $global{Name} = "Ben"; => Name is key, while Ben is value
ex: $GLOBAL{"db_path"} = "$GLOBAL{db_root}/$GLOBAL{version}/verification"; => substitution happens to provide complete path

ex: print %map; => This prints the key/value pairs run together with no separators (and "%map" inside double quotes is not interpolated at all), so it's not very useful. Use "each" (with a while loop) or "foreach" as below to print a hash readably. However, if a hash is passed into a sub, it gets flattened into the array @_, and printing @_ will print the list (as arrays can be printed directly).

various builtin functions available for hash:
1. keys: keys(%ages) => returns all keys, i.e the elements sitting in the key positions (1st, 3rd, 5th, etc) when the hash is flattened to a list. Note the order is arbitrary (hashes are unordered).
2. values: @num = values %ages; => returns all values, i.e the elements in the value positions. note no brackets needed, as they are optional
3. each: returns key/value pair for all elements of list
ex: while (($name, $age) = each(%data)) { print "$name $age"; } => each key/val pair assigned to the vars in turn.
ex: foreach $key (keys %ages) {print $ages{$key};} => other way to access key/val pair

NOTE: => is used in hash, but one other use is as fat comma. It's a replacement for comma.

ex: Readonly my $foo => "my car"; #here the Readonly module (an alternative to constant) is called. Its syntax is 'Readonly(my $foo, "my_car")' to assign "my_car" to $foo as a constant. Since ( ) around args of a sub are optional, this can be written as 'Readonly my $foo, "my_car"'. Here , can be replaced with =>, resulting in ' Readonly my $foo => "my_car" '.


NOTE: hash can be converted to array which can be converted to scalar($age). Array is just collection of scalars stored in index 0 onwards ($age[0]), while hash is collection of scalars stored in array whose index values are arbitrary scalars($age{john}).
perl maintains every var type in a separate namespace. So, $foo, @foo and %foo are stored as 3 different var, so no conflict.

typeglob: Perl uses an internal type called a typeglob to hold an entire symbol table entry. The type prefix of a typeglob is a * , because it represents all types. This used to be the preferred way to pass arrays and hashes by reference into a function, but now that we have real references (see below in subroutine section), this is seldom needed.

The main use of typeglobs in modern Perl is to create symbol table aliases.

ex: *this = *that; => this makes $this an alias for $that, @this an alias for @that, %this an alias for %that, &this an alias for &that, etc. Much safer is to use a reference, as shown below.

ex: *var1{ARRAY} is the same as \@var1; both are refs to the array @var1 (the bare glob *var1 covers all of var1's types at once)

Another use for typeglobs is to pass filehandles into a function or to create new filehandles. If you need to use a typeglob to save away a filehandle, do it this way:

$fh = *STDOUT; #here we copy the STDOUT glob into scalar $fh. We can also take a real reference like this: $fh = \*STDOUT; Either way, $fh can now be used instead of STDOUT, i.e

ex: print $fh "print this line"; #instead of "print STDOUT "print ...""

ex: use LogHandle; $fh = LogHandle->hijack(\*STDOUT); $fh->mute(); *fh->autoflush(); #here we are passing ref to STDOUT to LogHandle::hijack module. We get as return value a scalar $fh. We can call functions using $fh or *fh. This is perfectly valid.

1. Scalar: we talked about numbers and strings in scalars, but there's a third kind of scalar called reference.

C. Reference: references are similar to pointers in C. They hold the location of another value which could be a scalar, array, or hash. Because of its scalar nature, a reference can be used anywhere a scalar can be used; since an addr is a single value, a reference is stored in a $var. References can be static or dynamic.

1. static reference is one where changes made to reference change the original value. To create a static reference to any var, precede that var by backslash (\).

$scalarref = \$foo; => Here we create ref (addr) for var $foo, by preceeding it with \. Now, $scalarref has addr of $foo

$arrayref = \@ARGV; => for array

$hashref = \%ENV; => for hash

$coderef = \&handler; => function/subroutine

$globref = \*foo; => globref for foo (foo may be scalar, array, hash, func, etc.)

Function reference: ex: sub print_m { ... }; $print_ref = \&print_m; &$print_ref(%hash); => calling func by ref. Useful in OO pgm.

Function ref(): This returns the var type of any reference. So, ref($mapref) returns HASH (since $mapref is referencing a hash). It can return SCALAR, ARRAY, HASH, CODE, etc. If the arg is not a reference, it returns false (an empty string).

2. Dynamic references are ones where a copy of the data is made (an anonymous array or hash is created), so changes made through this reference don't change the original var. To create a dynamic reference, enclose the values within [ .. ] for an array and within { .. } for a hash. Mostly used with constants (i.e when we don't have a var assigned to store these values)

array: Use [ ... ]

@ages = (25,30,40);=> stores array in var @ages.

$ages = (25,30,40); => a plain list in scalar context yields its last element, so $ages=40 here; use $ages=@ages to get the size, 3

$agesref = [25,30,40, ['q','a']]; => Since square brackets used, it creates copy of this and stores ref of that array in $agesref

$agesref = [ @ages ]; => this creates dynamic ref to array @ages

hash: Use { ... }

%data = ('john p', 1, 'lisa', 25); => stores hash in var %data.

$dataref = {'john p', 1, 'lisa', 25}; => Since curly brackets used, it stores ref of this hash in $dataref

%map = (red=>0xff, green=>0x07, blue=>{magenta=>45, ...}, ..); => other way to store a hash. NOTE: the nested level must use { ... } (an anonymous hash ref); writing blue=>(magenta=>45) would just flatten into the outer list.

my $mapref = {red=>0xff, green=>0x07, blue=>{...}, other=>[ ...], ...}; => Since curly brackets used, it creates copy of this ref and stores ref of this hash in $mapref. Note that inside we can have multi level hash/array (i.e other is a arrayref in this ex, since it has array in [ ... ]).

$hashref = { %data }; => this creates a dynamic ref to a copy of hash %data. ([ %data ] would instead give an array ref holding the flattened key/value list.)

DeReference of var: Dereferencing means getting the var back from the addr. It's the same for both static and dynamic refs. Use $, @ or % in front of the ref var. We can use { ... } around the ref for clarity or when the expression inside is complex.

$scalarderef = $$scalarref; or ${$scalarref}; => putting a $ in front of the ref var, gets the value pointed to by that ref var. (or using ${$scalarref} is the same thing)

@arrayderef = @$arrayref; or @{$arrayref}; => gets the whole array back. print "@$arrayref" (inside double quotes) prints the whole array space separated; note that assigning to a scalar ($x = @$arrayref) would give the element count instead

%hashderef = %$hashref; or %{$hashref}; => gets the whole hash back; print %$hashref dumps the key/value pairs run together (no spaces)

&$coderef(args); => function call using reference. To call function via indirect way, we do "&handler(args)". See in Function section below.

arrow operator ->: An arrow operator is used in C pgm to access individual elements of struct pointer (reference to struct). i.e for struct person *p_ptr with element age, we do "p_ptr->age". We use similar concept in perl to access elements of array or hash reference.

1. array: use -> followed by [ .. ]. Inside [ ], enter Number "n" which will get the value of nth element of array.

ex: $cont = [ 1, 2, 'ab', 'cd' ]; $cont->[3] refers to the 4th element = cd

2. hash: use -> followed by { ... }. Inside { }, enter the "key", which will get the value corresponding to the key

ex: $cont = {"title me"=>a, name=>"john c", addr=>{city=>aus, zip=>12231} }; $cont->{"title me"} gets value "a". $cont->{addr}->{city} gets value "aus"

3. mixture of array and hash: use [ ]  for array and { }  for hash. For multilevel, we may omit subsequent -> after the first one.

ex: $cont = {"title me"=>a, name=>"john c", addr=>{city=>aus, zip=>12231, addr=>[{street=>"main"}, {house=>201}] } }; print "$cont->{addr}->{addr}->[1]->{house}" gets value "201"

4. class/subroutine (or object/method): use -> followed by ( ... ) => We use ( ) for the args of the method, and not for the method itself. This is used in the OOP section later. A class is treated as a reference to its data and subroutines.

ex: $class1->new("matt",10); #Here class1 is a package named "class1". We are calling subroutine named "new" in this class.

ex: $obj->{name}; #here $obj is a ref to an object of class "class1", and has hash data type. So, it's similar to case 2 above, where we get the value corresponding to key "name". NOTE: { .. } used here since it's referring to a hash element.


operators: these operate on scalars or lists, and return a scalar or a list. ( ) defines precedence of operations in case of ambiguity.

scalar operators:
1. for numbers: arithmetic: +,-,*,/,**(exponent),%, comparison (returns true/false): <,>,<=,>=,==,!=
2. for strings:
   A. concatenation (.). ex: "hello"."world" => "helloworld"
   B. comparison: eq, ne, lt, gt, le, ge. ex: 7 lt 30 gives false, as 7 and 30 are treated as strings, and string "30" comes before string "7" as 3 has lower ascii code than 7. If numeric operator < was used, then it would return true as literals would be converted to numbers, and 7<30 is true.
   C. string repetition: consists of the single lowercase letter x. ex: "fred" x 3 => "fredfredfred"
      ex: (3+2) x 4 => "5555" as 3+2=5 is treated as string since there's a string operator on it.

perl converts numbers to strings or vice versa depending on the operator. If the operator is numeric, strings are converted to numbers, and if the operator is a string operator, numbers are converted to strings. If a string can't be converted to a number for a numeric operator, perl (under "use warnings") prints a warning and treats it as 0. So, even though perl doesn't have types for scalars, it uses the operator type to figure out whether a literal is a number or a string.
ex: $name="john"; print $name+1; => here john can't be converted to a number, so the warning "Argument "john" isn't numeric in addition (+)" is issued, john is treated as 0, and 1 is printed
ex: $name="123"; print $name + 1; => here 123 can be converted to numeric, so + is carried out and 124 printed. NOTE: spaces don't matter

= is also an operator.  $a=17; a gets value=17, but this whole expression is also given the value of $a (which is 17)
 ex: $a= ($b=15); => b is assigned 15, but then a is assigned value of ($b=15) which is $b which is again 15.

shorthand operators:
$a += 3; => $a=$a+3;
$str .= "me"; => $str = $str . "me";
$e = ++$a; => $a=$a+1;$e=$a; => prefix version of autoincrement, $e gets incremented value
$e = $a++; =>  $e=$a;$a=$a+1; => suffix version of autoincrement, $e gets the non-incremented value

defined operator: scalar can be defined or undefined. undefined scalar, or scalar with null string("" , i.e nothing within the string) or number 0 are all interpreted as FALSE when scalar is used in Boolean expr, while anything else is treated as TRUE.

ex: if (defined($args)) { ... } #We can omit brackets around args of function. so "if (defined $args)" is also valid


control stmt: control expr is evaluated as string.  If empty "" or "0" string, treated as false, everything else is true

1. if/else:
if ($ready) { $a=1;}
elsif ($cond2) { ...}
else { ... }

2. while/until: while => repeat while expr is true. until => repeat until expr is false
while ($tcks <100) { $sum += ... }
while (@ARGV) { process(shift @ARGV); }

3. do/while: with while, if the cond is false, the loop will not execute even once. do/while causes the loop to execute at least once
do { ... } while ($cnt <100);

4. unless:
unless ($dest eq $home) {print ...;}

5. for/foreach: These can be converted into equiv while stmt.
for ($sold=0; $sold<100; $sold++) { ... }
foreach $user (@name) { if ($user ...) {...} } => here each element of @name is assigned to $user and the loop runs for each element. Modifying $user modifies the original list (since it's an alias and NOT a copy)

6. next/last: next allows to skip to next iteration, while last allows to skip to end of block, outside of loop
foreach $user (@user) {
  if ($user eq "root") {next;} #skip to next iteration
  if (...)             {last;} #comes out of loop
}
If we label a loop, then we can specify which loop to break out of by giving the label name.
LINE: while ($line = <FILE1>) { # this loop is labeled LINE
       last LINE if $line eq "\n"; => we get out of loop LINE when we encounter 1st blank line
       next LINE if $line =~ /^#/; => skip comment line

      do something .....
      }

7. goto: ex: goto LINE;

8. switch: For switch cmd to work, "Switch" module needs to be used, which requires some other modules to be installed. syntax same as in other languages.

use Switch;

switch(arg) {

 case "a"  {print "name"; .... }

 case /\w+/ {print "..."}

 else { print ...}

}


Built in functions: perl provides a lot of built in functions that are very helpful. Most of the time you can use these functions to write more complex ones

1. chop($x) => it takes a scalar var, and removes last char from string value of that var
ex: $x="hello"; $y=chop($x); => $x becomes hell. $y gets assigned the chopped char "o".

2. chomp($x); => removes only the newline char at the end if present, else does nothing.

3. print/printf: printf is C like providing formatted o/p
ex: printf "a=%15s b=%5d c=%10.2f \n",$a,$b, $c; => string $a is printed in 15 char field, decimal number $b in 5 char field, fp num $c in 10 char field with 2 decimal places

4. split/join:
ex:@fields = split(/:/,$line); => split $line using : as delimiter and assign it to $fields[0], etc.
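A short sketch showing split and its inverse join (the sample line mimics /etc/passwd):

my $line   = "root:x:0:0:root:/root:/bin/bash";
my @fields = split(/:/, $line);            # @fields = ("root", "x", "0", ...)
print "user=$fields[0] shell=$fields[6]\n";
my $joined = join("-", @fields);           # glue the pieces back together with '-'
print "$joined\n";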

5. system cmds: any linux cmd can be executed using system or backtick

A. system cmd: running unix cmds via "system" should be avoided where a perl module exists, as it makes the script less portable and may break for other users or other linux machines. cmds passed to "system" as a single string are run through a shell, so behavior can vary with the environment (PATH, shell version, etc). Also, the return status of the system cmd is 0 on success (any non-zero value indicates a failure, which is the opposite of how most perl functions behave). Another problem is that system has different calling forms (single string vs list of args), and it behaves differently depending on which one is used. So prefer perl builtins/modules such as mkdir, chdir, etc when they exist.


system("date"); #
$status=system($cmd); #runs whatever $cmd is. $status is assigned 0 on success
system "grep fred in.txt >output";
system "cc -o @options $files"; #var substitution occurs

B. backtick or qx: any cmd inside backtick or qx is executed. backtick is an operator.
my $output = `script.sh --option`; #using backticks, the cmd within `` is executed and its STDOUT is captured and returned (here assigned to $output)
my $output = qx/script.sh --option/; #similar to above as qx/.../ same as ``

We have system cmds for cd, pwd, etc that we can execute using "system" or backtick. However, it's preferred to use perl provided modules for doing this, as they work across all platforms. These modules eventually end up making the system call, but do it cleanly.

1. getcwd: This gets current working dir. same as unix "pwd" cmd.

use Cwd qw(getcwd);

$cur_dir = `pwd`; => this returns unix pwd but has a trailing newline at end. This is stored in var $cur_dir

$cur_dir = getcwd; => same as above, except no newline at end

2. chdir: This changes dir to specified dir. same as unix "cd" cmd.
use Cwd qw(chdir);

$save_pwd_dir = `pwd`;

chomp $save_pwd_dir;

$status=chdir($save_pwd_dir); => chdir returns 1 (true) on success and 0 (false) on failure. Without the chomp above, the trailing newline from `pwd` would make chdir fail (return 0, no cd happens); after chomp it works and returns 1 (success)

$cur_dir= getcwd;

chdir($cur_dir); => This cmd works and returns status of 1, since there's no newline in $cur_dir (since getcwd cmd was used)

chdir("/home/ajay"); => Here we change to given dir, by directly specifying the name

6. here: It's not a function. It's the same "here document" as in bash scripts. syntax is "<<IDENTIFIER; .... Any Stmts .... IDENTIFIER". The same effect can be achieved with print stmts, but that would need multiple print cmds, one for each line. A bare or double-quoted identifier (<<"IDENTIFIER") interpolates variables in the text; use single quotes (<<'IDENTIFIER') to suppress interpolation.

ex:below ex in perl script will print stmt1 and stmt2 on screen, since default for print is STDOUT. newlines if present in text are automatically printed.

print <<Foo;

My name is

You are ill)

Foo

ex:below will print the stmt in $file1 handle which is opened in write mode

open my $file1, '>', "file.txt" or die $!;

print $file1 <<My_text;

this is test;

My_text

 
subroutines:

Declare: ex: sub NAME1; #forward declaration of a subroutine NAME1. If we want a prototype, use sub NAME1(PROTO); All subs get their args in the default array @_, which can be used inside the body of the sub (individual args are $_[0], $_[1], ...). @_ is private to that invocation of the sub (i.e local copies are made), so nested subs can be called w/o these values getting overwritten.
To declare and define all in one place, just add the block to it, i.e sub NAME1 {BLOCK} or sub NAME1 (PROTO) {BLOCK} => NOTE: the prototype of a sub goes in ( ... ), but the body is in { ... }. A prototype/arg list is optional even if the caller passes args, since @_ always receives them; most subs are written with no arg list at all.


ex: sub say_hello { print "hello $what"; return $a+$b; } => any var used within a sub are global (package vars) by default (diff than conventional C pgm). To make a var local, declare it with my() i.e: my($sum, @arr, %a); my($n,@values)=0; my $a; "local" can also be used to declare a local var. The return value is what's specified, or the last expression evaluated. We can return any data type, i.e. scalar, array, hash. If no return value is provided, then the last calc performed becomes the return value (if print is the last calc done, then 1 is the return value). If we return more than 1 array or hash or a combo, then their separate identities are lost. In such cases we use references.

To call subroutine, 2 ways
1. direct calling: Here we call by directly providing name of sub with optional args. ex: NAME1; NAME1(list); NAME1 LIST; => any of these 3 ways is fine.
ex: $a=3+say_hello(); => here sub returns value of $a+$b
ex:
sub bg { my(@values) = @_; my @result; foreach my $v (@values) { push @result, $v*2; } return @result; }
@val = bg(1,2,3); => any number of args can be provided, since @_ collects them all (note the sub above doesn't declare an arg list; it's implied). The return value is stored in array @val, here (2,4,6).
 
my $cont = get_contents(); => this stores return value from func in scalar $cont. If return value is array or hash, then conversion happens, as explained in array/hash section above.


2. indirect calling: Here we call by prefixing the name with & (old style, common in Perl 4/early Perl 5 but not recommended now), or by taking a reference to the function (i.e a pointer to the addr of the func) and calling through it. ex: &NAME1;
ex: &bg(1,2,3); => same o/p as above; the leading & is just the old-style explicit way of calling a sub.

ex: $func_ref = \&bg; &$func_ref(1,2,3); => same o/p as above. Here the addr of func "bg" is stored in $func_ref, and we then call "bg" by dereferencing $func_ref.

Passing args to functions: Args can be any data type, and they can be passed via value or via reference. We pass them via reference, when we want to alter the original arg itself.

ex: $tailm = my_pop(\@a, \@b); Here array @a,@b are passed by reference, so whatever we do to @_ inside my_pop func, modifies @a and @b too.
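A small sketch of that idea; my_pop here is just an illustrative sub name (not a builtin), shown with a body so the example is runnable:

sub my_pop {
    my ($aref, $bref) = @_;        # two array references
    push @$aref, 99;               # modifies the caller's @a
    return pop @$bref;             # removes and returns the last element of the caller's @b
}
my @a = (1, 2);
my @b = (7, 8, 9);
my $tailm = my_pop(\@a, \@b);
print "a=@a b=@b tail=$tailm\n";   # a=1 2 99  b=7 8  tail=9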

 

module / package:

  1. module: A Perl module is a reusable collection of related variables and subroutines that perform a set of programming tasks. There are a lot of Perl modules (>100K) available on the Comprehensive Perl Archive Network (CPAN). You can find modules in a wide range of categories such as network, XML processing, CGI, database interfacing, etc. Each perl module is put in its own separate file, e.g. File1.pm, having the same syntax as a perl file. It can be loaded by other pgms or modules by using do, require or use.
  2. Package: Packages are perl term for namespaces. Namespaces enable the programmer to declare several functions or variables with the same name, and use all of them in the same code, as long as each one was declared in a different namespace. Packages are the basis for Perl's objects system (explained later). Our main perl script itself is in "main" package (so all var can be referenced as main::a, or just plain "a"). We switch package using "package" keyword. Then our namespace changes to package_name (until the end of file). Now we can use var in this package using new package namespace.

Diff b/w module and package: Although package and module are used interchangeably, they are different things. A package is a container (a separate namespace), while a module is a perl file that can contain any number of namespaces. It doesn't need to have any kind of pkg declaration inside it. To load a module in another file, we use any of the do/require/use keywords. "use dir1::File1" just loads a file named dir1/File1.pm. To remove confusion b/w these, perl programmers obey these 2 laws, so that package and module can be treated as the same thing:

  1. A Perl script (.pl file) must always contain exactly zero package declarations.
  2. A Perl module (.pm file) must always contain exactly one package declaration, corresponding exactly to its name and location. So, every module goes with same package name.

I. writing your own module: Filelog.pm => pm means perl module

package Filelog; => makes the Filelog module a package. We adhere to law 2 above (name of the module file exactly same as the package name, else callers won't find the subs). So, now the namespace is "Filelog" instead of main; all package var/sub from here on will be in namespace Filelog (with "use strict" below, var still need to be declared with my or our).

use strict;

my $LEVEL = 1; # file-scoped var $LEVEL set to 1, so that any subroutine in this file can access it

sub open_my{ .... $a = shift; ... } => write subroutines for diff functions to do

1; => this is required so the module returns a true value to the calling pgm when it's loaded (a module that doesn't return true fails to load); keep it for compatibility.

II. Using above module in other pgm: pgm1.pl (we do not need separate file for package, we can put all code for package "Filelog" in pgm1.pl too)
    
#!/usr/bin/perl
use strict;
use warnings;
 
use Filelog; => load the Filelog module. We could use any 1 of these 3 stmts: do, require, use. Since there is also a package declaration with the same name, the new namespace "Filelog" can be used. (Note: the name is case sensitive, so it must match the package exactly.)
 
Filelog::open_my("logtest.log"); # subs in modules are called using the namespace separator (::). args within brackets are passed to subroutine "open_my"
 
Filelog::log(1,"This is a test message"); # sub "log" in namespace "Filelog" with 2 args

$STDERR = LogHandle->hijack(\*STDERR); #this is other way of calling sub in package "LogHandle". See in package section later
 

Read cmd line args: All languages have a way of reading cmd line args. We can write our own code to get args or use a perl module for that. Getopt is a very popular module to get args from the cmd line.

1. Regular way: All cmd line args in perl are stored in the @ARGV array (after the name of the script). $#ARGV is the subscript of the last element of the @ARGV array, so num of args = $#ARGV+1. $0 stores the name of the script that we are running

ex: ./test.pl cat dog => here @ARGV stores "cat dog" array. so, $ARGV[0]=cat, $ARGV[1]=dog and so on. $#ARGV=1 (since num of args=2). $0 stores ./test.pl

2. test.pl

use Getopt::Std; #load Getopt/Std.pm module

my %options=(); => we declare empty hash "options"
getopts("hj:", \%options); => We store args in ref to hash "options". here we are capturing arg values specified via flags -h -j. : indicates that there is more stuff coming after -j. So our cmd line is something like this "./test.pl -h -j my_help". There are many different ways of storing args via getopts. Look in perl doc.
print "option $options{h} , $options{j}\n";

run: ./test.pl -h -j amit => prints "option 1 , amit"

Signal trap pragma:

use sigtrap qw(handler my_handler normal-signals); => This pragma is a simple i/f for installing signal handlers, so that when the program abruptly quits, we can do a graceful exit, by having a sub execute on receiving an interrupt. Here the "my_handler" sub is called on getting an interrupt. There are many signals as INT, ABRT, TRAP, etc that cause a perl script to terminate. The last arg "normal-signals" says to employ this handler only for the normal signals INT, TERM, PIPE and HUP, and not for other signals.

sub my_handler {

   my $signal = shift; #gets the signal causing the pgm to terminate

   die " Pgm killed with signal $signal";

}

special code blocks:

There are five specially named code blocks that are executed at the beginning and at the end of a running Perl program, if present in the pgm. These are the BEGIN, UNITCHECK, CHECK, INIT, and END blocks. These code blocks are not subroutines, even though they look like it. "BEGIN" is executed at the very beginning of the script, while the "END" block is run at the very end, just before the interpreter exits. Multiple blocks of the same type can be in the same pgm; BEGIN blocks run in the order they appear in the code, while END blocks run in reverse order. Usually 1 BEGIN and 1 END block suffice.

ex:

END {
  my $program_exit_status = $?; #Inside END block, $? contains the value that the program is going to pass to exit()

  print "Exit status is: $program_exit_status"; #we can have an stmt here that we want to be executed at end

}


Format: Perl supports formatting so that scripting languages "sed" and "awk" may no longer be needed, as perl supports more complex formatting.

format => defines a format, and writes data in that format
ex: defining a format. keyword format NAME = <some format> . => . at end is important
format LABEL1 =
 ==========
 | @<<<<< | => @<<<<< specifies a left justified text field 6 char wide (the @ plus five <)
 $name
 | @< |
 $state
 ==========
 .
open(LABEL1, ">file.txt"); => filehandle name needs to be same as format name
($name,$state) = ...;
write(LABEL1); => this writes into file.txt

Regular expressions: These are same as ERE we studied in Linux section. However, perl RE have slight variation from POSIX ERE. Perl RE have become so widely used, that when people say RE, they usually mean Perl RE. Perl RE basics are best explained here: https://perldoc.perl.org/perlre

Perl RE are way of describing a set of strings w/o having to list all strings in the set. All ERE regex still valid. Following are the Perl RE metacharacters:

  • dot . => matches any single char except newline
  • * => matches 0 or more of the preceding char
  • + => matches 1 or more of the preceding char
  • ? => matches 0 or 1 of the preceding char.
  • \ => backslash to escape next metachar
  • ^, $ => matches beginning or end of line
  • (), {}, [] => () is for grouping subexpressions, {m,n} and [abc], same as in ERE. These are treated as metachar, so use backslash to use them as literals
  • <> => used for named capture groups, as in (?<name>...). This is different than ERE, which doesn't use this (BRE uses <> but for a different purpose)
  • | => Or or alteration. Used inside (), but may be used without () too.
  • - => used to indicate range inside []
  • # => comment

Above metachar are used for pattern matching, substitution, spliting, etc
ex: /foo/ => // is pattern matching operator looking for foo
while ($line = <FILE2>) {
 if ($line =~ /http:/) { print $line; } => matches pattern http:. =~ is the pattern binding operator, binding the match to $line
}
while (<FILE1>) { print if /http:/ ;} => the default var is $_; with no explicit binding, the pattern match applies to $_. o/p exactly same as above

quantifier:
{min,max} => preceding item can match min number of times up to max number of times
+ => {1,} matches one or more of the preceding item
* => {0,} matches zero or more of the preceding item
? => {0,1} matches zero or one of the preceding item

common patterns:
/[a-zA-Z]+/ => matches one or more of alphabets
/[\t\n\r\f]/ => matches any of tab, newline etc. Instead of this, we can also use /[\s]/
/[0-9]/ => matches any digit. Same as /\d/. /\d+/ matches any number of digits
/\d{7,11}/ => matches min 7 digits but no more than 11 digits. ex: telephone number
/[a-zA-Z_0-9]/ => matches any single word char. equiv to /\w/. /\w+/ matches an entire word
/./ => matches any char whatsoever (except a newline). needs to be at least 1 char
/a./ => matches a followed by . => a followed by any char after that => matches all strings that have a in them, and "a" is not the last char
/\S\W\D/ => uppercase provides negation. \D means any non digit char
/(\d+)/=> match as many digits as possible and put it in var $1. If more (), they are stored in $2,$3 etc.
/\bFred\b/ => \b matches at word boundary. So, this matches "the Fred Linc", but not "Fredricks"
/^Fred/ => matches lines beginning with Fred. ^ is anchor for beginning of line, while $ for end of line
/Fred|Wilma|Bren/ => matches any of 3 names
/(..):(..)/ => matches 2 colon separated fields each of which is 2 char long
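A short sketch tying a few of these patterns together (the sample line is just illustrative):

my $line = "phone: 512-555-0199 user: fred_77";
if ($line =~ /(\d{3})-(\d{3})-(\d{4})/) {
    print "area code = $1\n";               # captures go into $1, $2, $3
}
print "has a word char\n"  if $line =~ /\w+/;
print "starts with phone\n" if $line =~ /^phone/;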

pattern matching/substitution:

m/pattern/gimosx => m=matching; the m is optional, as pattern matching is the default. gimosx are optional modifiers, e.g. g=match globally (find all occurrences), i=case insensitive matching.
ex: ($key,$val) = ($_ =~ m/(\w+) = (\w+)/); => extracts a key/value pair from $_ (in list context a match returns its captures)

s/pattern/replacement/egimosx => s=substitution
ex: $paragraph =~ s/Miss\b/Mrs/g => substitute Miss with Mrs globally in $paragraph. W/o an explicit binding, s/// works on $_ by default

ex: $MAILBOX->{'ScriptName'} =~ s/.*\/// => it substitutes the script path name with just the script name (strips out everything before the last /). So, "../dir1/./file.txt" will return "file.txt". Useful when trying to get name of script from cmd line.


Files:
----
To read/write files, we need to create an IO channel called filehandle. 3 automatic file handles provided: STDIN, STDOUT, STDERR corresponding to 3 std IO channels.
open (HANDLE1, "file1.txt") or "Cannot open for Read: $! \n"; => Read file. $! contains error msg returned by OS
open (HANDLE1, "<$file1"); => same as above. Read file
open (HANDLE1, ">$file1"); => create file and write to it
open (HANDLE1, ">>$file1"); => append to existing file
close (HANDLE1); => need to close file

File test operators: these take a filename (or filehandle) and return true/false. -e is one such operator:
-e $a => true if file named in $a exists
-r $a => true if file named in $a is readable, -w=writable, -x=executable, -d=is_directory

ex:
$name="index.html";
if(-e $name) {print "EXISTS";} else {print "ABSENT";};

<> = Line reading operator
print STDOUT "type number";
$num = <STDIN>; reads a complete text line from std i/p up to the first newline. That string is assigned to $num (including the \n). <> returns undef when there's no more data to read (as at end of file). STDIN can be omitted here (since the default is STDIN)
chomp($num); print STDOUT "num is $num"; => the \n is removed by chomp before printing
chomp($num = <STDIN>); => this also works; the assignment happens first, and chomp then acts on $num
@num = <STDIN>; => This stores all lines of input in array until CTRL+D is pressed (i,e EOF). Each line is stored separately in $num[0], $num[1] and so on ..

ex:
while (<>) { print $_; } => $_ is the default storage var, when no var specified. this is equiv to
while (defined($_ = <STDIN>)) { .. } => At end of file, when there are no more lines to read, <> returns undef

ex:
#! /usr/local/bin/perl -w => -w for turning ON warning
$num_args = $#ARGV + 1;
if ($num_args != 1) {
  print "\nUsage: def_report_nets.pl  name_of_def_file\n";
  exit;
}
open (DEF, "$ARGV[0]") || die "Cannot open $ARGV[0] for Read ...";
while (<DEF>) { # or while ($_ = <DEF>)
  if (/count : (\d+) ;/) {
    $count = $1; # $1 is assigned whatever matches the first ( ). Here $1=(\d+)
    $count_sum += $count;
    print DEF1 "count = $count, sum=$count_sum"; => write this into the DEF1 file handle (assuming it's open for write)
 }
}

Object Oriented (OO):

perl is unique in being both a procedural language as well as an OO language. OO is not the best soln for every problem. It's particularly useful in cases where the system design is already OO, and is very large and expected to grow. OO concepts in perl are similar to those in other languages. An OO system is either prototype based (as in JavaScript) or class based (as in most other languages, as C++, Java, Python, etc). Inheritance, overloading, polymorphism, garbage collection are all provided in perl OO similar to other languages. Perl's built-in OO is very limited, and many OO systems have been built on top of it, which are typically used (Moose is one such ex). For our purpose, the built-in OO of perl is good enough. class, method, object and attributes are the 4 concepts related to OO. I've put this OO section in to get some basics, but if you need to do OO, python is preferred (python is preferred in general over perl)

class: Class is a name for a category (like phones, files, etc). package explained above declares a class (i.e package Person; declares class "Person"). In Perl, any package can be a class. The difference between a package which is a class and one which isn't is based on how the package is used.

attribute: these are data var associated with the class. An instantiation of class, known as object, assigns values to these attributes.

method: This class (package) has var and sub that work on these var. Sub used within this package are called methods.

class and method in OO term are thus package and subroutine that we studied earlier.

object: Let's create an instance of this class, which is known as object. When we create an object, we actually create a reference to attr/method in the class. All objects belong to a specific class (i.e we can define an object "LG_phone" belonging to class "phone"). We can have multiple objects for a given class. An object is a data structure that bundles together data and subroutines which operate on that data. An object's data is called attributes, and its subroutines are called methods. You can use any kind of Perl variable (scalar, array, hash) as an object in Perl. Most Perl programmers choose either references to arrays or hashes (ref to hashes are most common).

ex: Person.pm => this *.pm file name has to be same as package name, as package is searched for looking for file with name = file_name.pm

package Person; #creates class "Person"

sub my_new { #sub for creating an instance of this class. This is called a constructor, and is usually named "new", but can be anything. This constructor is just like any other method (most OOP languages have special syntax for constructors, but not for perl)

my $class = shift; #First arg passed to any method call is the method's invocant. Whenever we call Person->my_new(...), perl automatically passes the class name "Person" as the first arg. So $class holds the class name for this call.

my $self = { Name => shift, ssn => shift }; # this class has 2 attr: Name and ssn. Here the object of this class is of ref to hash data type, but could have been any type = scalar, array, hash, etc. These 2 attr get their values from the remaining args to this method call. $self is a scalar storing a ref to hash

#instead of shift, we could have also used @_. my ($class, $name, $ssn) = @_; my $self = { Name => $name, ssn => $ssn }; 

print "class is $class\n"; print "Name is $self->{Name}\n"; print "SSN is $self->{ssn}\n"; #print values: class=Person, Name=matt, ssn=1224

bless $self, $class; #Turning a plain data structure into an object is done by blessing that data structure using Perl's bless function. W/O this, the data structure won't become an obj. 1st arg to the bless func is the reference to the data, while 2nd arg is the class. So, ref $self is blessed into class "Person". Otherwise $self remains a ref to hash data, just like any regular hash ref.

#we can also combine, $self and bless in same line as below
#my $self = bless { Name => $args->{Name}, ssn => $args->{ssn} }, $class;

return $self; #we return the ref to hash. This is a scalar ref. This becomes the ref of the new object being created.

}

sub setName {

my ( $self, $Name ) = @_; #1st arg is always object ref, so we store in $self

$self->{Name} = $Name if defined($Name); #Now, we can access attr of object

return $self->{Name};

}

sub getName {

my( $self ) = @_;

return $self->{Name};

}

1;

test.pl => this file uses the above package

use Person; #Person package is now included
 
my $object = Person->my_new("Mary",22345); #we are passing args as a list (scalar,array,hash) and not as a reference. "Person" is also passed as an arg so that the object created is associated with class "Person". $object is now a reference to hash data type containing 2 key/value pairs. It's associated with class "Person", so it is a little different from a ref to a regular hash, but can still be treated as a "reference to hash data type" for most purposes.

#my $object = my_new Person("Mary",22345); #this is another way to create the object (indirect object syntax, generally discouraged)

my $name = $object->{Name}; => This references the object "$object" and gets the value for key "Name" which is "Mary"

print $name; => prints Mary. Note that a hash-element deref like "$object->{Name}" does interpolate inside double quotes (it also prints Mary); what does NOT interpolate is a method call, i.e. print "$object->getName()" prints "Person=HASH(0x3f4578A0)->getName()", since only $object (an addr) is expanded inside the quotes and the -> call is left as-is.
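ex: a small sketch contrasting what does and doesn't interpolate inside double quotes (getName is the getter defined in Person.pm above):

print "$object->{Name}\n";        #hash-element deref does interpolate => prints Mary
print "$object->getName()\n";     #method call does NOT interpolate => prints something like Person=HASH(0x3f4578A0)->getName()
print $object->getName() . "\n";  #so call methods outside the quotes => prints Mary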

$name = $object->setName("James"); => sets $object->{Name} to "James". Could have done it directly too via: $object->{Name} = "James", however using subroutines as part of a class is preferred, as it keeps the object organized, by having everything related to an object in 1 place.

Inheritance: Object inheritance is a common concept in OOP, so that any class can be derived from any other class. This is useful if we want to add a few more data or sub to an existing class. Instead of modifying an existing class or duplicating everything in the existing class to create a new class, we inherit the old class, and just add new code in the new class. The @ISA array achieves that.

package Bar; => new package Bar declared

use foo; => existing class foo

@ISA=qw(foo); => inherit foo into this package Bar

sub my_add { .... }; => we now add new subroutines to package Bar. All methods of the original package "foo" can also be called on Bar objects, since method lookup falls through to foo via @ISA.

1; => a module file must end by returning a true value so that "use"/"require" succeeds
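ex: a minimal usage sketch of the inherited class ("new" and "get_val" are hypothetical subs assumed to be defined in foo; my_add is the sub added in Bar above):

use Bar;
my $obj = Bar->new("abc");     #"new" not found in Bar, so perl searches @ISA and finds it in foo
print $obj->get_val(), "\n";   #hypothetical method inherited from foo
$obj->my_add(5);               #method defined in Bar itself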

Misc modules:

1. reading excel files: This is very commonly used to import excel sheet data into perl program:
ex: read excel sheet from libre office .xlsx files => this makes use of OOP. Uses Spreadsheet module available in std modules.

#!/apps/perl/5.14.2/bin/perl
use lib "/apps/perl/modules-1503/lib"; => this adds lib path to existing lib paths to search for modules
use Spreadsheet::XLSX; #here the perl module Spreadsheet::XLSX is loaded (a perl module is a reusable package defined in a file)

my $spreadsheet = "$ENV{VERIFICATION}/my_testlist"; #here my_testlist is the open office excel sheet
if (! -e "$spreadsheet") { #checking for existence of spreadsheet
    print "Spreadsheet $spreadsheet not found. Please try again.\n";
    exit 0;
}

my $excel = Spreadsheet::XLSX -> new ($spreadsheet, $converter); #$converter is an optional text-converter object (may be left undefined)
foreach $sheet (@{$excel -> {Worksheet}}) {
    printf("Sheet: %s\n", $sheet->{Name});
    foreach $row (($sheet -> {MinRow} +1) .. $sheet -> {MaxRow}) { #skipping 1st row=title row
        $testname           = ($sheet -> {Cells} [$row][0]) -> {Val}; #0 means 1st col
        $rtl_count          = ($sheet -> {Cells} [$row][4]) -> {Val}; #4 means 5th col
        ... #do more processing
    }
}
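A natural follow-on (not part of the original script) is to collect the extracted values into a hash keyed by testname, so they can be used later in the pgm; a small sketch (hash name %test_data is hypothetical, $testname/$rtl_count are from the loop above):

$test_data{$testname} = $rtl_count;   #inside the $row loop above: map each testname to its rtl_count
...
foreach $t (sort keys %test_data) { print "$t => $test_data{$t}\n"; }  #after the loops: dump the collected data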

Some useful perl cmds:

1. perl cmd to substitute and replace one pattern with some other pattern in multiple files (below cmds can be run on the cmd line in a bash shell, as long as perl is installed):
perl -pi -e 's/old_pattern/new_pattern/g' dir1/subdir1/*.tcl => does it for one dir only
perl -pi.backup -e 's/old_pattern/new_pattern/g' $(find dir1 -type f) => does it for all directories and files in dir1 (-pi with .backup creates a backup of each original file with a .backup extension). Works only in bash shell as $(find dir1 -type f) is bash syntax
ex: perl -pi -e 's/1p0/2p0/g' $(find . -type f) => replaces 1p0 with 2p0 in all subdirs starting with the current dir. Works only in bash shell.
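A couple more generic one-liner patterns in the same spirit (file names are just placeholders):
perl -ne 'print if /ERROR/' run.log => prints only the lines matching ERROR (like grep)
perl -lane 'print $F[0]' data.txt => -a autosplits each line into @F, so this prints the 1st field of every line (-l handles newlines)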

*******************************************
For Running Place n Route in Synopsys ICC (IC compiler):
---------------------------------------------------------------------

Synopsys ICC uses Milkyway(MW) ref lib, MW tech/RC Model files (TLU+) for physical data to do PnR.
Note, Cadence Encounter uses .lef file for physical data.

For logical data (during synthesis, timing, etc), synopsys and cadence both use .lib or .db timing files.
For running DC in topo mode, we need the MW lib, so that DC can estimate cell placement, and we need TLU+ files so that DC can calc wire delays from tech file instead of based on wire load models.

MW ref lib structure: It's a unix dir, and has binary files in it. It has 3 views for any ref lib (i.e stdcells)
-----
1. CEL view: It's in subdir CEL, and contains actual layout data for the cell. It's not used by the router. It's used for signoff extraction and signoff DRC/LVS checks. We don't really need this view as extraction files (eg *.spef) have only routing extraction info (R,C of nets), and no extraction from the physical lib cell. Timing data in the .lib file for all these std cells is used for delay calc to do timing analysis. It might be useful for DRC/LVS checks, assuming the .lef file had some incorrect blkg, etc.

2. FRAM view: It's frame view (similar to lef file). It has pin, blkg, via, dimension, symmetry, etc which is used by PnR tool.

3. LM view (optional): It's logical model view and has all timing info. These are same as .lib/.db files used for timing during synthesis. These are specified using "target_library" and "link_library" as during synthesis.

:1, :2 etc denote the version number for that particular cell.

To create MW ref lib, we need to create one using MilkyWay tool:
/apps/synopsys/milkyway/2010.03/bin/AMD.64/Milkyway => brings up a GUI
In the gui, goto: cell_library->lef_in (appears on 2nd row). This opens a read_lef box. Specify "MW lib name", where we want to store all stdcells (ie pml48MwRefLibs), specify the tech lef file (/db/.../*tech_6layer.lef), and the stdcell lef file (/db/.../*core_2pin.lef). Click ok. This converts all cells in the LEF file to their equiv FRAM view and adds them as subdirs in the MW lib dir name specified above.

---
Instead of working from the gui, we can use the cmd line i/f as follows (after opening the mw gui):
;# step 1 create a milkyway library from the tech file
cmCreateLib
setFormField "Create Library" "Library Name" "pml48_ref_libs/CORE" => since the dir to be created by MW is CORE, we need to have dir pml48_ref_libs already existing, or else mw will fail.
setFormField "Create Library" "Technology File Name" "../gs40.6lm.tf"
setFormField "Create Library" "Set Case Sensitive" "1"
formOK "Create Library"

;# step 2 read the lef into CEL view and model it into FRAM view
read_lef
setFormField "Read LEF" "Library Name" "pml48_ref_libs/CORE"
setFormField "Read LEF" "Cell LEF Files" "/db/pdk/1533e035/rev1/diglib/pml48/r2.4.0/vdio/lef/pml48_1533c035_core_2pin.lef"
setFormField "Read LEF" "Cell Options" "Make New Cell Version"
formOK "Read LEF"

Ex: /db/DAYSTAR/design1p0/HDL/Milkyway => In this dir, we create mw ref lib (both for regular and Chameleon cells). We also put .tf file and mapping file in here to generate tlu+ files. Steps for doing this are shown below in tlu+ section.

--------
create/open design MW lib
--------------
Once we are done creating mw ref lib, we create mw design lib using DC. We run DC in topo mode, and create our design MW lib. We need to create the design lib only once, then we need to only open it for any subsequent run.

create_mw_lib -technology <tech_file> -mw_reference_library <ref_lib> my_mw_design_lib => creates design my_mw_design_lib with top level dir my_mw_design_lib, and subdir lib, lib_1, lib_bck within it
open_mw_lib my_mw_design_lib  => opens design my_mw_design_lib so that we can run cmds on it.

We can combine create and open MW in one cmd: create_mw_lib ... digtop -open

ex: create_mw_lib -technology /db/DAYSTAR/.../Milkyway/gs40.6lm.tf -mw_reference_library "/db/DAYSTAR/.../Milkyway/pml48MwRefLibs/CORE /db/DAYSTAR/.../Milkyway/pml48ChamMwRefLibs/CORE" -open my_mw_design_lib => done only once, when mw design lib doesn't exist. mw ref lib is the one created above using MilkyWay tool.
open_mw_lib my_mw_design_lib => just open mw lib for any subsequent run, as mw lib already exists.


#synthesize design, or do whatever we want to do on this mw design, then save MW design using this cmd: (save_mw_cel cmd doesn't work here, as it's supported only in ICC). MW db has netlist, synth constraints and optional fp, place, route data (if they exist).
write_milkyway -output digtop => this creates my_mw_design_lib/CEL dir, which has digtop:1 file. here :1 is the version number. If we use the write_milkyway cmd more than once, it creates an additional design file and increments the version number. You must make sure you open the correct version in Milkyway; by default Milkyway opens the latest version. To avoid creating an additional version, use the -overwrite switch to overwrite the current version of the design file and save disk space.

-----
To load TLU+ file:
---
TLU+ is tech lookup table binary file, which is used by ICC to calc interconnect R,C values based on net geometry.
cmd: set_tlu_plus_files -max_tluplus <max_tluplus_file> -tech2itf <mapping_file> => sets pointers to tlu+ files assuming they have already been generated. The tech2itf map file is needed to map names from the .tf (technology file) to the .itf (interconnect technology format) file. We need the mapping file, as we used the .tf file above in the create_mw_lib cmd. That .tf file may have layer names different from the .itf file. Since .itf files are used to generate the tlu+ files, names in .tlup files may be diff than the ones in the .tf file, so the mapping file resolves this.
ex: set_tlu_plus_files \
    -max_tluplus /db/DAYSTAR/design1p0/HDL/Milkyway/tlu+/gs40.6lm.maxc_maxvia.wb2tcr.metalfill.spb.nlr.tlup \
    -min_tluplus /db/DAYSTAR/design1p0/HDL/Milkyway/tlu+/gs40.6lm.minc_minvia.wb2tcr.metalfill.spb.nlr.tlup \
    -tech2itf    /db/DAYSTAR/design1p0/HDL/Milkyway/mapping.file

To generate TLU+ files:
---
Generally, we have tech file (.tf) which is similar to lef tech file used in vdio. .tf file has all metal/via rules, complex drc rules, all layers, etc and is very elaborate. This is what we had at AMD.
For an ex, look in /db/DAYSTAR/design1p0/HDL/Milkyway/gs40.6lm.tf. It has following in it:
Technology      { name="gs40" unitLengthName="micron" ...  }
Tile    "unit"  { width=0.4250 height=3.4000 }
Layer   "MET1"  { layerNumber=10 minWidth=0.175 minSpacing=0.175 ... } => many more layers as poly, tox, hvt, bondwire, etc.
Layer   "VIA2"  { layerNumber=13 ... }
FringeCap 17    { number=17 layer1="MET6" layer2="MET1" minFringeCap=0.000010 maxFringeCap=0.000010 } =>b/w any 2 layers
DesignRule      { layer1="VIA1" layer2="VIA2" minSpacing=0 }
ContactCode "VIA23" { contactCodeNumber=2 cutLayer="VIA2" lowerLayer="MET2" upperLayer="MET3" ... }
and many more layers ...

Generally vendors provide only .itf (interconnect technology format). These .itf contain desc of process, thickness and phy attr of conductor and dielectric layers, via layers, etc. These are used to extract RC values for interconnects. These .itf are used to generate TLU+ files to be used by ICC by this cmd:
grdgenxo -itf2TLUPlus -i <abc.itf> -o <abc.tluplus> => -itf2TLUPlus option generates tlu+ file instead of nxtgrd file (nxtgrd file are used in star-rcxt tool. this is needed when running ICC in signoff mode)
ex: grdgenxo -itf2TLUPlus -i .../gs40.6lm.maxc_maxvia.wb2tcr.metalfill.spb.nlr.itf.eval -f /testcase/di3/techfiles/sp_di1/sr60/TLUPlus/6lmalcap/itfs/c021.format -o gs40.6lm.maxc_maxvia.wb2tcr.metalfill.spb.nlr.itf.eval.tlup

These tlu+ files have layer names same as those in .itf files. Since these names may not match the names in .tf files, we use mapping file that maps .tf layer/via names to .itf layer/via names. It's called as .map file or mapping.file or any other name. For an ex, look in /db/DAYSTAR/design1p0/HDL/Milkyway/mapping.file => It has all "capital letter" layer names mapped to "small letter" layer names. It also removes all layers except active, poly, met and via layers, as they are not needed.

------------------------
DC synthesis topo mode: So, the complete flow for DC synthesis in topo mode looks like this:
----

In .synopsys_dc.setup file in Synthesis dir, set search path to wherever you have .db files.
Then run dc_shell in topo mode: dc_shell-t -2010.03-SP5 -topo -f tcl/top.tcl | tee logs/top.log
 
#In dc_shell, run initial setup/analyze the normal way
source tcl/setup.tcl
source tcl/analyze.tcl

elaborate      $DIG_TOP_LEVEL
current_design $DIG_TOP_LEVEL
link
set_operating_conditions -max W_125_1.35 -library {PML48_W_125_1.35_COREL.db PML48_W_125_1.35_CTSL.db} => points to lib in search path
#set auto_wire_load_selection true => commented as no wlm (as we use tlu+ and net geometry to calc res/cap values)
#set_wire_load_mode enclosed => commented as no wlm

#### start of special cmds for running in topo mode ####
#open/create mw lib
set lib_exist [file exists my_mw_design_lib]
if {$lib_exist != 1} {
create_mw_lib -technology /db/DAYSTAR/design1p0/HDL/Milkyway/gs40.6lm.tf \
              -mw_reference_library "/db/DAYSTAR/design1p0/HDL/Milkyway/pml48MwRefLibs/CORE /db/DAYSTAR/design1p0/HDL/Milkyway/pml48ChamMwRefLibs/CORE" -open my_mw_design_lib
}
open_mw_lib my_mw_design_lib

#Enable Cell area and footprint checks (so that area of cell and footprint of cell are consistent) between logical(in link_library) and physical library(in MW db)
set_check_library_options -cell_area -cell_footprint
check_library

#set tlu+ file instead of WLM
set_tlu_plus_files \
    -max_tluplus /db/DAYSTAR/design1p0/HDL/Milkyway/tlu+/gs40.6lm.maxc_maxvia.wb2tcr.metalfill.spb.nlr.tlup \
    -min_tluplus /db/DAYSTAR/design1p0/HDL/Milkyway/tlu+/gs40.6lm.minc_minvia.wb2tcr.metalfill.spb.nlr.tlup \
    -tech2itf    /db/DAYSTAR/design1p0/HDL/Milkyway/mapping.file

check_tlu_plus_files => performs sanity checks on TLU+ files to ensure correct tlu+ and map file
#### end of special cmds for running in topo mode ####

#start with normal flow
set_driving_cell -lib_cell IV110 [all_inputs]
set_load 2.5 [all_outputs]

source tcl/dont_use.tcl
source tcl/dont_touch.tcl
...
compile_ultra -scan ...
...
#save final design from mem to MW lib (MW stores physical info of design), and name it as digtop
set mw_design_library my_mw_design_lib => to make sure design lib is set correctly
write_milkyway -output digtop -overwrite => Overwrites existing version of the design under the CEL view.

exit

-----------------------------------

Running ICC:
-----------
Just like in DC, cp the .synopsys_dc.setup file from the synthesis dir to the dir where you are running ICC. It has all the same settings as DC, i.e. it sources other tcl files from the admin area, sets search_path to /db/../synopsys/bin, target_library and link_library to PML*_CTS.db, and other parameters for snps ICC.

run ICC:
icc_shell -2011.09-SP4 -f tcl/top.tcl | tee logs/my.log => starts up icc

icc_shell> start_gui => to start gui from icc_shell. 2 gui may open: One is the ICC main window from where we can enter cmd on icc_shell built within this window. The other is ICC layout window, which opens up whenever we open/import design. From this window, we control and view PnR.

We can run ICC in 2 modes. Choose from File->Task in ICC layout window.
1. Design planning: Full chip planning/feasibility/partitioning is done. Visibility is turned OFF for cells and cell contents. Top panel shows fp, preroute, place, partition, clk, route, pin assgn, timing, etc. Once we are satisfied, we partition the top level design into blocks and do block level impl as shown next.
2. Block implementation: actual impl at block level is done. Visibility is turned ON for cells and cell contents. Top panel shows fp, preroute, place, clk, route, signoff, finish, eco, verification, power, rail, timing, etc.

#reset_design => removes all attr and constraints (dont_touch, size_only, ...)

top.tcl:
-------
#source some other files (same as in DC) => In this file set some variables, i.e "set RTL_DIR /db/dir" "set DIG_TOP_LEVEL  digtop" or any other settings

#create is needed only for the first time design is created in ICC. From next time, we just need to open the design.
create_mw_lib -technology /db/DAYSTAR/design1p0/HDL/Milkyway/gs40.6lm.tf \
              -mw_reference_library "/db/DAYSTAR/design1p0/HDL/Milkyway/pml48MwRefLibs/CORE /db/DAYSTAR/design1p0/HDL/Milkyway/pml48ChamMwRefLibs/CORE" -open my_mw_design_lib

open_mw_lib my_mw_design_lib => to open mw lib

#ICC can also directly open a mw db written by DC (as in DC topo), so no need to create/open new mw or import any netlist.
#open_mw_lib ../../Synthesis/digtop/my_mw_design_lib

set_tlu_plus_files \
    -max_tluplus /db/DAYSTAR/design1p0/HDL/Milkyway/tlu+/gs40.6lm.maxc_maxvia.wb2tcr.metalfill.spb.nlr.tlup \
    -min_tluplus /db/DAYSTAR/design1p0/HDL/Milkyway/tlu+/gs40.6lm.minc_minvia.wb2tcr.metalfill.spb.nlr.tlup \
    -tech2itf    /db/DAYSTAR/design1p0/HDL/Milkyway/mapping.file

check_tlu_plus_files

set mw_logic0_net "VSS"
set mw_logic1_net "VDD"

#read in verilog, vhdl or ddc format
#read_verilog -netlist ../Synthesis/netlist/digtop.v
#current_design $DIG_TOP_LEVEL
#uniquify
#link
#save_mw_cel -as $DIG_TOP_LEVEL

#all of the above can be replaced by this one liner
import_designs ../Synthesis/digtop/netlist/digtop.v -format verilog -top $DIG_TOP_LEVEL

#If we imported mw db from DC, then instead of importing netlist, we can open mw cel directly
#open_mw_cel $DIG_TOP_LEVEL => opens mw cel digtop written by DC. No need to specify path of mw lib, as path is set, whenever we open mw lib.

IO pad/pin placement:
--------------------
set_pad_physical_constraints => Before creating fp, we should create placement and spacing settings for I/O pads. These IO pads refer to analog buf cells that connect I/O pins to internal logic.
set_pin_physical_constraints => To constrain Pins. ICC checks to make sure constraints for both pads and pins are consistent.
set_fp_pin_constraints => sets global constraints for a block. If a conflict arises between the individual pin constraints and the global pin constraints, the individual pin constraints have higher priority.

To save pin/pad constraints:
write_pin_pad_physical_constraints <const_file> => saves all const applied using the pad and pin constraint cmds above

To read pin/pad constraints:
read_pin_pad_physical_constraints <const_file> => to read all pin/pad const

In our case, these pads are at top level, so we don't need pad const in digtop. We need pin const, only when the actual pin io file is not available. Once we get real io file with pin placement, we don't need to run this section. We just read in the pin def file.
 
create_floorplan: create a floorplan similar to what we do in VDI (for older ICC versions, use initialize_floorplan as create_floorplan is only supported from 2011 onwards)
------------------
create_floorplan => creates block shape, size and std cell placement rows (based on target util, aspect ratio, core size, bdry to core spacing, etc). Placement rows are visible on zooming. Constrained pads are placed first; any unconstrained pads are placed next, using any available pad location. Then pins are placed. If pin locations are not specified, then pins are placed randomly and evenly dist along the 4 sides of the block.

#-control_type aspect_ratio | width_and_height | row_number | boundary => The default control type is aspect_ratio, which indicates that the core area of the floorplan in the current Milkyway CEL is determined by the ratio of the height divided by the width. The width_and_height control type indicates that the core area is determined by the exact width and height.
#-core_width <width> -core_height <height> => Specifies the width and height of the core area in user units. This option is valid only if you specify the -control_type width_and_height option.
#-left_io2core <x1> -right_io2core <x2> -bottom_io2core <y1> -top_io2core <y2> => Specifies the distance between the left/right/bot/top side of the core area and the right/left/top/bot side of the closest terminal or pad.
create_floorplan => creates fp with default options, to fit in all cells.
create_floorplan -control_type width_and_height -core_width 180 -core_height 200 -left_io2core 8.5 -right_io2core 8.5 -bottom_io2core 8.5 -top_io2core 8.5 => specify fp size and spacing

#initialize_rectilinear_block => only for rectilinear blocks (L,T,U or cross-shaped). In this, pins are not touched at all.

##defining routing tracks. create_track to create tracks. report_track shows all tracks (usr or def). Generally, we'll see all metal layers in both X and Y dirn.

#write_def or write_floorplan to save fp into def or mw.
write_def -output fp_for_DC_topo.def => writes fp def, so that we can use this fp info in DC topo, to get a better synthesized netlist.

#read_def or read_floorplan to import in a fp def file, which has some/all pwr routes, i/o pins and chip dimensions.
read_def chip.def => it adds the physical data in the DEF file to the existing physical data in the design. To replace rather than add to existing data, use the -no_incremental option

#pg connections
derive_pg_connection -power_net VDD -ground_net VSS => creates logical connection b/w pg nets in design to pg pins on stdcells
check_physical_constraints => check that logical lib (.db) and physical lib (mw) match. we see warnings about missing pg nets in fp
report_cell_physical -connection => reports all pin connections for all stdcells

Virtual flat placement:  This is for design planning/feasibility purpose only.
----------------------
Helps you decide on the locations, sizes, and shapes of the top-level physical blocks. This placement is “virtual” because it temporarily considers the design to be entirely flat, without hierarchy. After you decide on the shapes and locations of the physical blocks, you restore the design hierarchy and proceed with the block-by-block physical design flow.

set_fp_placement_strategy => sets parameters that control the create_fp_placement and legalize_fp_placement commands. These settings are not applicable to other placement commands or other parts of the flow.
create_fp_placement => performs a virtual flat placement of standard cells and hard macros. It provides you with an initial placement for creating a floorplan to determine the relative locations and shapes of the top-level physical blocks.


power planning: optional, only needed if we need to create straps/rings.
-------------
#set_fp_rail_constraints => defines PNS (Power network synthesis) constraints
set_fp_rail_constraints -add_layer -layer MET2 -direction vertical -max_strap 20 -min_strap 10 -min_width 0.4 -spacing minimum => -add_layer says to add 10-20 power straps on MET2 in vert dirn, with min_width of 0.4 units. -spacing says that spacing b/w pwr and gnd nets can be min spacing. Sometimes we want to route signals in b/w these pwr and gnd nets, so we may choose "-spacing distance" to specifically specify the distance.
set_fp_rail_constraints -add_layer -layer MET3 -direction horizontal -max_strap 20 -min_strap 10 -min_width 0.4 -spacing minimum => this adds horz straps in MET3

#set_fp_block_ring_constraints => defines the constraints for the power and ground rings that are created around plan groups and macros, when pg n/w is synthesized. This may not be needed for our purpose, since we don't have macros, around which we want to create rings
set_fp_block_ring_constraints -add -horizontal_layer METAL5 -vertical_layer METAL6 -horizontal_width 3 \
-vertical_width 3 -horizontal_offset 0.600 -vertical_offset 0.600 -block_type master -nets {VDD VSS} -block { RAM210 }

#synthesize_fp_rail command => synthesizes the power network based on the set_fp_rail_constraints cmd.
synthesize_fp_rail -power_budget 800 -voltage_supply 1.32 -output_directory powerplan.dir -nets {VDD VSS} -synthesize_power_plan => synthesizes fp rail

commit_fp_rail => commit the power plan to convert the virtual power straps and rings to actual power wires, ground wires, and vias.

create views:
-----------
#specifying min/max timing lib => "link_library" or "target_library" in .synopsys_dc.setup has the max lib only. We are not allowed to specify the min lib there. If more than 1 .db file is specified in link/target library, the tool just looks through these .db files, and stops the first time it finds the required cell. That's why we specify just the max lib files for both CORE and CTS cells.
#So, to specify a min lib for min delay analysis, we need to use the "set_min_library" cmd => it associates a min lib with a max lib, i.e to compute min dly, the tool first consults the library cell from the max library. If a library cell exists with the same name, the same pins, and the same timing arcs in the min library, the timing information from the min library is used. If the tool cannot find a matching cell in the min library, the max library cell is used.

set_min_library PML48_W_125_1.35_COREL.db -min_version PML48_S_-40_1.65_COREL.db => for core cells
set_min_library PML48_W_125_1.35_CTSL.db -min_version PML48_S_-40_1.65_CTSL.db => for cts cells

list_libs => shows all min/max lib. m=min, M=max. Make sure all paths, etc are correctly reported.

###setting mmmc flow: ICC uses multi scenario method to analyze and optimize these designs across all design corners and modes of operation.
A scenario is a combination of modal constraints (test mode or standby mode) and corner specifications (operating conditions of various PVT). create_scenario defines one such mode/corner. In multicorner-multimode designs, DC/ICC uses a scenario or a set of scenarios as the unit for analysis and optimization. The current scenario is the focus scenario; when you set modal constraints or corner specifications, these typically apply to the current scenario. The active scenarios are the set of scenarios used for timing analysis and optimization.
Specify the TLUPlus libraries, operating conditions, and constraints that apply to the scenario. In general, when you specify these items, they apply to the current scenario.

###create scenario func_max, with max dly lib, and max rc tlu+.
create_scenario func_max => creates scenario, makes that scenario current and active
current_scenario => display the current scenario
current_scenario func_max => current scenario is set to func_max

#set_operating_conditions => defines op cond under which to time or optimize the design
set_operating_conditions W_125_1.35 -library {PML48_W_125_1.35_COREL.db PML48_W_125_1.35_CTSL.db}

#create_operating_conditions -name typ_lib_set -lib {PML48_N_25_1.5_COREL.db PML48_N_25_1.5_CTSL.db} -proc 0 -temp 25 -volt 1.8 => creates new op cond which may not be present. NOT needed for our purpose

#tlu+ set to max rc for both max/min corner
set_tlu_plus_files \
    -max_tluplus /db/DAYSTAR/design1p0/HDL/Milkyway/tlu+/gs40.6lm.maxc_maxvia.wb2tcr.metalfill.spb.nlr.tlup \
    -min_tluplus /db/DAYSTAR/design1p0/HDL/Milkyway/tlu+/gs40.6lm.maxc_maxvia.wb2tcr.metalfill.spb.nlr.tlup \
    -tech2itf    /db/DAYSTAR/design1p0/HDL/Milkyway/mapping.file

check_tlu_plus_files

#read sdc file that has constraints from DC. this replaces all lines in DC starting from "set_op_cond" to dont_use/touch, i/o dly, max_transition, create_clock, false_path/multicycle_path, disable_timing etc.
read_sdc
read_sdc ../../Synthesis/digtop/sdc/constraints.sdc

#check
check_timing => all paths should be constrained. If there are unconstrained paths, these should all be false paths as defined in false path file. run report_timing_requirements cmd to verify that.

###create scenario func_min, with min dly lib, and min rc tlu+.
create_scenario func_min
current_scenario => displays the current scenario
current_scenario func_min => current scenario is set to func_min

set_operating_conditions S_-40_1.65 -library {PML48_S_-40_1.65_COREL.db PML48_S_-40_1.65_CTSL.db}
set_tlu_plus_files \
    -max_tluplus /db/DAYSTAR/design1p0/HDL/Milkyway/tlu+/gs40.6lm.minc_minvia.wb2tcr.metalfill.spb.nlr.tlup \
    -min_tluplus /db/DAYSTAR/design1p0/HDL/Milkyway/tlu+/gs40.6lm.minc_minvia.wb2tcr.metalfill.spb.nlr.tlup \
    -tech2itf    /db/DAYSTAR/design1p0/HDL/Milkyway/mapping.file

check_tlu_plus_files

read_sdc ../../Synthesis/digtop/sdc/constraints.sdc

#reporting scenarios
all_scenarios => displays all the defined scenarios
report_scenarios => reports all the defined scenarios
set_active_scenarios {s1 s2} => sets s1,s2 to active. -all makes all scenarios active
all_active_scenarios => display the currently active scenarios
remove_scenario => remove the specified scenarios from memory (-all removes all scenarios)
check_scenarios => check all scenarios for any issues

place:
-----
#insert_port_protection_diodes => add diodes to the specified ports to your netlist to prevent antenna violations. should be done after fp and before place. report_port_protection_diodes reports the port protection diodes that are inserted in your design.

#pg connections
preroute_standard_cells -fill_empty_rows => Generates physical PG rails for standard logic cells. It connects all pwr/gnd rails in stdcells together, and then connects them to straps and pwr rings. "-fill_empty_rows" switch fills the CORE area or specified area with empty PG rails where cells can be subsequently placed, so that the entire region has PG rails.

#set active scenario to run setup opt for func_max and hold opt for func_min
set_scenario_options -setup true -hold false -scenarios func_max => Sets the scenario options for func_max to do opt on setup but not on hold (by default, it does opt on both setup and hold timing)
set_scenario_options -setup false -hold true -scenarios func_min => Sets the scenario options for func_min to do opt on hold but not on setup

set_active_scenarios {func_max func_min} => set both these scenario active. NOTE: .lib doesn't have process set to 1 for min lib (process=-3), so check_scenarios will warn. place, route, etc won't run. So, set active scenario to "func_max" only =>  set_active_scenarios {func_max}

# Add set_propagated_clock
set_propagated_clock [all_clocks]

###checks to be done prior to running place, so that any issues can be identified
check_design => check_design -summary cmd automatically runs on every design that is compiled. However, you can use the check_design cmd explicitly to see warning messages. Potential problems detected by this cmd include unloaded input ports or undriven output ports, nets without loads or drivers or with multiple drivers, cells or designs without inputs or outputs, mismatched pin counts between an instance and its ref, tristate buses with non-tristate drivers, and so forth.

check_timing => checks timing and issues warnings. This cmd without any options performs the checks defined by the timing_check_defaults variable. Redefine this variable to change the value.

check_physical_design -stage pre_place_opt => does phy design checks on design data for place. use "-stage pre_clock_opt" for pre cts, and "-stage pre_route_opt" for pre route.

# Perform timing analysis before placement (only run setup). When we do report_timing for setup, it reports setup for active scenarios. If one of those active scenarios doesn't have "setup=true", then nothing is reported. So, we provide the scenario name "func_max" during report_timing (as "func_min" is only valid for hold)
set rptfilename [format "%s/%s" timingReports ${DIG_TOP_LEVEL}_pre_place.rpt]
redirect $rptfilename {echo "digtop pre place setup run : [date]"}
redirect -append $rptfilename {report_timing -delay_type max -path full_clock_expanded -max_paths 100 -scenarios {func_max}}

#### place
# Add I/O Buffers
set_isolate_ports -driver BU120 -force [all_inputs] => force BU120 cell on all i/p ports
set_isolate_ports -driver BU120 -force [all_outputs] => force BU120 cell on all o/p ports
report_isolate_ports => reports all i/o ports and their isolation cells.

#place_opt -area_recovery  => Performs coarse placement, high-fanout net synthesis, physical opt, and legalization. Doesn't touch the clk n/w.
#-area_recovery => min area target
#-cts => enables quick cts, opt and route within place_opt, when designs are large. Should always run clock_opt eventually.
#-spg => uses Design Compiler's Physical Guide information to guide optimization. We can use either mw, .ddc or def file from DC, all of which have physical info. However, the guidance feature is only available in DC gui, so -spg will work only if DC mw or ddc has been gen using this feature. Also fp def from ICC should be imported into DC, so that DC can better synthesize the netlist based on fp. Just using DC topo mode doesn't mean that placement info can be read into ICC.

place_opt -area_recovery => may need to be run multiple times with diff options to fix violations.

#reports/checks
report_constraint
report_design
report_placement_utilization
create_qor_snapshot -name post_place => stores design qor in set of report files in dir "snapshot"
report_qor_snapshot => used to retrieve the qor rpt.

# Perform timing analysis after placement
set rptfilename [format "%s/%s" timingReports ${DIG_TOP_LEVEL}_post_place.rpt]
redirect $rptfilename {echo "digtop post place setup run : [date]"}
redirect -append $rptfilename {report_timing -delay_type max -path full_clock -max_paths 100}

# Add Spares. for scan, add flops with scan, otherwise non-scan flops.
insert_spare_cells -lib_cell {IV120L NA210L} -cell_name spares -num_instances 10 -tie => inserts spare cells group specified (IV120L,NA210L) 10 times spread uniformly across design with input pins tied to 0.
all_spare_cells => list all spare cells in design

#check and save
check_design
save_mw_cel -as post_place => we'll see a post_place:1 file in my_mw_design_lib/CEL dir.
write_def -output digtop_post_place.def
write_verilog ./netlist/digtop_post_place.v

#Post-placement optimization
psynopt => performs timing optimization and design rule fixing, based on the max cap and max transition settings while keeping the clock networks untouched. It can also perform power optimizations. It can remove dangling cells (to prevent that, use "set_dont_touch" cmd to apply dont_touch attr on required cells)

CTS
----
Prereq for CTS are:
1. check_legality -verbose => to verify that the placement is legal
2. pwr/gnd nets should be prerouted
3. High-fanout nets, such as scan enables, should already be synthesized with buffers.
4. By default, CTS cannot use buf/inv that have the dont_use attribute to build the clock tree. To use these cells during CTS, you can either remove the dont_use attribute by using the remove_attribute command or you can override the dont_use attribute by specifying the cell as a clock tree reference by using the set_clock_tree_references cmd.

CTS traces thru all comb cells (incl clk gating cells). However, it doesn't trace thru seq arcs or 3 state enable arcs.
check_physical_design -for_cts => checks if design is placed, clk defined and clk root are not hier pins.
check_clock_tree => checks and warns if clk src pin is hier, incorrect gen clk, clk tree has no sync pins, and if there are multiple clks per reg.

#set_clock_tree_options => sets clk tree options
#-clock_trees clock_source
#-target_early_delay insertion_delay => by default, min insertion delay is set to 0.
#-target_skew skew
#-max_capacitance capacitance => by default, max cap is set to 0.6pf.(if not specified for design or not specified using switch here)
#-max_transition transition_time => By default, the max transition time is 0.5 ns
set_clock_tree_options -clock_trees sclk_in -target_early_delay 0 -target_skew 0.5 -max_transition 0.6 => set skew and tran

4 kinds of pins that are used in CTS. A pin may belong to more than 1 of these:
1. STOP pins: pins that are endpoints of clk tree. eg. clk pins of cells, clk pins of IP.
2. NONSTOP pins: pins that would normally be stop pins, but are not. The clock pins of sequential cells driving generated clocks are implicit NONSTOP (not STOP) pins, as clk tree balancing needs to be done thru these pins. NOTE: this default behaviour is different than EDI, where ThroughPin has to be used in the .ctstch file to force CTS thru the generated clks.
3. FLOAT pins: similar to STOP pins, but have special insertion delay requirements (have extra delay on clk pins). ICC adds the float pin delay (positive or negative) to the calculated insertion delay up to this pin. Usually, IP/Macro pins are defined as FLOAT pins so that we can add appr delay to the pin, equal to dly in the clk tree inside the IP/Macro.
4. EXCLUDE pins: clock tree endpoints that are excluded from CTS. implicit exclude pins are clk pins going to o/p ports or pins on IP/macro that are not defined as clk pins(i.e they are treated as data pins. We have to explicitly set these pins to stop_pins), or data pins of seq cells. During CTS, ICC isolates exclude pins (both implicit and explicit) from the clock tree by inserting a guide buffer before the pin. Beyond the exclude pin, ICC never performs skew or insertion delay optimization, but does perform design rule fixing. NOTE: In EDI, we use ExcludedPin in .ctstch file to specify exclude pins

#set_clock_tree_exceptions => sets clk tree exceptions on the pins above. We don't need this.
#-clocks clk_names => clks must be ones defined by "create_clock" and NOT by "create_generated_clock".
#-stop_pins stop_pin_collection
#-non_stop_pins non_stop_pin_collection
#-exclude_pins exclude_pin_collection
#-float_pins float_pin_collection => additional options for max/min_delay_rise/fall should be used.

#set_clock_tree_references => Specifies  the buffers, inverters, and clock gates to be used in CTS.
#-clock_trees clock_names => by default, it applies to all clks
#-references ref_cells => Specifies the list of buffers, inverters,  and  clock  gates for CTS.
set_clock_tree_references -references "CTB02B CTB15B CTB201B CTB20B CTB25B CTB30B CTB35B CTB40B CTB45B CTB50B CTB55B CTB60B CTB65B CTB70B" => In EDI, equiv cmd was "Buffer" used in .ctstch file


clock_opt => Performs clock tree synthesis, routing of clock nets, extraction, optimization, and hold-time violation fixing. Uses default wires (default routing rules) to route clk trees. We can define non default routing rules using the "define_routing_rule" cmd, and use these routing rules with the "set_clock_tree_options -routing_rule" cmd. NDR rules define what wires, routing layers, clk shielding to use. Shielding is done using the "create_zrt_shield" cmd, after doing clock_opt.
Prior  to the clock_opt command, use the set_clock_tree_options command to control the compile_clock_tree command. Briefly, it runs the following cmds under the hood:
o Runs the compile_clock_tree cmd => run multiple times using diff options
o Runs the optimize_clock_tree cmd
o Runs the set_propagated_clock command for all clocks from the root pin, but keeps the clock object as ideal; performs interclock delay balancing, if enabled (using the set_inter_clock_delay_options command); sets the clock buffers as fixed; updates latency on clock objects with their insertion delays obtained after compile_clock_tree, if enabled (using the set_latency_adjustment_options command)
o Runs the "route_group -all_clock_nets" cmd to route clk nets. The "-no_clock_route" switch disables routing of clock nets.

#running clock_opt in these steps is more flexible than clock_opt alone.
#clock_opt -only_cts -no_clock_route => performs CTS with opt only with no routing of nets
#clock_opt -only_psyn -no_clock_route => performs opt only with no routing of nets. This is used in a user-customized CTS flow where CTS is performed outside of the clock_opt command
#route_group -all_clock_nets

clock_opt

## Post CTS optimization
clock_opt -only_psyn

route:
------
zroute is the default router for ICC. Even though it's a grid-based router, it allows nets to go off grid to connect to pins. Prereqs for running zroute are: pwr/gnd nets must be routed and CTS should have been run.
We can run prerouter to preroute signal nets, before running zroute. zroute doesn't reroute these nets, but only fixes DRC.

check_routeability => to verify that design is ready for routing

#define routing guides
#create_route_guide -coordinate {0.0 0.0 100.0 100.0} -no_signal_layers {MET3 MET4 MET5 MET6}
#set_route_zrt_common_options -min_layer_mode hard -max_layer_mode hard => min/max layers are set to hard constraints, instead of soft constraints.
set_ignored_layers -min_routing_layer MET1 -max_routing_layer MET3 => max/min routing layers, by default these are hard constraints.
#define_routing_rule => to define nondefault routing rules (width,spacing,etc), both for routing and for shielding. These rules are assigned diff names, and then they are applied either on clk nets using "set_clock_tree_options" during CTS, or on signal nets and clk nets after CTS using "set_net_routing_rule".

#displays current settings for all routing options
set_route_zrt_common_options -verbose_level 1
report_route_zrt_common_options

#3 methods to route signal nets:
1. route_zrt_global => performs global routing. route_zrt_track => to perform track assignment. route_zrt_detail => to perform detail routing. Useful in cases where we want to customize routing flow
2. route_zrt_auto => performs all tasks in method 1 above. Runs fast so useful for analyzing routing congestion, etc.
3. route_opt => performs everything in method 2 above + postroute opt. To skip opt, add "-initial_route_only". Used for final routing.

The 3 substeps of routing are as follows:
-----------------
1. global routing:
----------
The global router divides the design into global routing cells (GRC). By default, the width of a GRC is the same as the height of a standard cell and is aligned with the standard cell rows. For each global routing cell, the routing capacity is calculated according to the blockages, pins, and routing tracks inside the cell. Although the nets are not assigned to the actual wire tracks during global routing, the number of nets assigned to each global routing cell is noted. The tool calculates the demand for wire tracks in each global routing cell and reports the overflows, which are the number of wire tracks that are still needed after the tool assigns nets to the available wire tracks in a global routing cell.
Global routing is done in two phases:
phase 0 = initial routing phase, in which the tool routes the unconnected nets and calculates the overflow for each global routing cell
phase 1 = The rerouting phases, in which the tool tries to reduce congestion by ripping up and rerouting nets around global routing cells with overflows. It does it several times (-effort minimum causes this phase to run once while -effort high causes it to run 4 times)

routing report:
phase3. Both Dirs: Overflow = 453 Max = 4 GRCs = 449 (0.02%) => there are 453 wires in design that don't have corresponding track available. The Max value corresponds to the highest number of overutilized wires in a single GRC. The GRCs value is the total number of overcongested global routing cells in the design

2. track assignment:
------
The main task of track assignment is to assign routing tracks for each global route. During track assignment, Zroute performs the following tasks:
• Assigns tracks in horizontal partitions.
• Assigns tracks in vertical partitions.
• Reroutes overlapping wires.
After track assignment finishes, all nets are routed but not very carefully. There are many violations, particularly where the routing connects to pins. Detail routing works to correct those violations.

routing report: reports a summary of the wire length and via count.

3. detail routing:
------
The detail router uses the general pathways suggested by global routing and track assignment to route the nets, and then it divides the design into partitions and looks for DRC violations in each partition. When the detail router finds a violation, it rips up the wire and reroutes it to fix the violation. During detail routing, Zroute concurrently addresses routing design rules and antenna rules and optimizes via count and wire length.
Zroute uses the single uniform partition for the first iteration to generate all DRC violations for the chip at the same time. At the beginning of each subsequent iteration, the router checks the distribution of the DRC violations. If the DRC violations are evenly distributed, the detail router uses a uniform partition. If the DRC violations are located in some local areas, the detail router uses nonuniform partitions. It Performs iterations until all of the violations have been fixed, maximum number of iterations has been reached or It cannot fix any of the remaining violations.

routing report: reports DRC violations summary at the end of each iteration. a summary of the wire length and via count.

route_opt => does all 3 stages of routing + opt.

report_design_physical -verbose => to view PnR summary rpt.
verify_zrt_route => checks for routing DRC violations, unconnected nets, antenna rule violations, and voltage area violations on all nets in the design, except those marked as user nets or frozen nets.

extract_rc -coupling_cap => explicitly performs postroute RC extraction, with coupling cap. RC estimation is already done, when route_opt or any report_* cmd is run.

#report setup/hold timing, write def/verilog

#post route opt if needed
#route_opt -incremental
#route_opt -skip_initial_route -xtalk_reduction

STA:
----

#set all scenarios active
set_scenario_options -setup true -hold true -scenarios {func_max func_min scan_max scan_min}
set_active_scenarios {func_max func_min scan_max scan_min}

#report timing

#opt if needed
#for fixing DRV
set routeopt_drc_over_timing true
route_opt -effort high -incremental -only_design_rule

#for fixing hold
route_opt -only_hold_time

#for si
set_si_options -delta_delay true -route_xtalk_prevention true -route_xtalk_prevention_threshold 0.35
route_opt -skip_initial_route -xtalk_reduction

#focal_opt

Signoff:
--------
from routed db, we can do signoff driven design closure by 2 ways:
1. signoff_opt => auto flow. runs analysis and optimization.
2. run_signoff => manual flow. runs analysis

During analysis in signoff, StarRC is used to perform a complete parasitic extraction and stores the results as a Synopsys Binary Parasitic Format (SBPF) file or SPEF file. For timing, PT is run, and the timing info is passed back to ICC. When not in signoff, the ICC internal engine is used for both extraction and timing.

set_primetime_options -exec_dir /apps/synopsys/pt/2011.12/amd64/syn/bin
set_starrcxt_options -exec_dir /apps/synopsys/star-rcxt/2011.12/amd64_starrc/bin

report_primetime_options
report_starrcxt_options

#scenarios
set_starrcxt_options -max_nxtgrd_file $max_grd_file -map_file /db/DAYSTAR/design1p0/HDL/Milkyway/mapping.file

#NOTE: still get errors, when running signoff_opt =>
#Information: Use StarRCXT path /apps/synopsys/star-rcxt/2011.12/amd64_starrc/bin. (PSYN-188)
#Error: The star_path option can only be used in conjunction with the star_max_nxtgrd_file option(s). (UIO-18)
#Error: The star_path option can only be used in conjunction with the star_map_file option(s). (UIO-18)

signoff_opt => run signoff optimization by ICC, based on results from signoff tool: starRC and PT.

#report_timing
#report_constraint -all_violators

save_mw_cel -as signoff

#if inc opt needed (to fix drv, hold time, si => use additional options with signoff_opt)
signoff_opt -only_psyn

#check_signoff_correlation => check the correlation between ICC and PT, and between ICC and StarRC.

Filler:
-------
# we insert filler cells before running signoff, so as to catch any issues
#insert_stdcell_filler => Fills empty spaces in standard cell rows with filler cells. the tool adds the filler cells in the order that you specify, so specify them from the largest to smallest. Run after placement.
#-cell_without_metal <lib_cells> or -cell_with_metal <lib_cells> => specify filler cells that don't contain metal or those that contain metal. Tool doesn't check for DRC if "cell_without_metal" is used.
insert_stdcell_filler -cell_without_metal {FILLER_DECAP_P12L FILLER_DECAP_P6L}
 
final_checks:
------------
#need to find checks for drc, antenna, connectivity

signoff_drc => performs signoff design rule checking. IC validator, or Hercules license reqd.

export_final:
------------
write_parasitics -format SPEF -output final_files/digtop_starrc.spef => writes spef file. If there are min and max operating conditions, parasitics for both conditions are written. In mmmc flow, the tool uses the name of the tluplus file and the temperature  associated  with the corner, along with the file name you specified, to derive the file name of the parasitic file (<tluplus_file_name>_<temperature>[_<user_scaling>].<output_file_name>).

write_def -version 5.5 -output final_files/digtop_final_route.def => writes def version 5.5

write_verilog final_files/digtop_final_route.v

--------------------------

*******************************************
For Running Place n Route in VDI:
---------------------------------------------------------------------
NOTE: our designs are in terms of dbu
1 dbu=1 um before shrink. For LBC7, shrink=0.9, so 1dbu=0.9um.  For LBC8, shrink=0.35, so 1dbu=0.35um.

Cadence Encounter VDI (Virtuoso Digital Implementation):

Dir: /db/Hawkeye/design1p0/HDL/Autoroute/digtop/vdio

run Encounter VDI:
encounter -9.1 -vdi -log logs/encounter.log => brings up gui
encounter -9.1_USR2_s159 -vdi -log logs/encounter.log => use this version to avoid manufacturing grid issues.
For bsub: bsub -q gui -Is -R "linux" "encounter ......"

Help for encounter :
/apps/cds/edi/9.1/doc/soceUG/soceUG.pdf
/apps/cds/edi/9.1/doc/fetxtcmdref/fetxtcmdref.pdf
/apps/cds/edi/9.1/doc/encounter/encounter.pdf

On command line, type man for that cmd, or help cmd.
type exit to exit encounter.

script: run_encounter => brings up gui (removes all previous log files and dbs)
Then in tcl/top.tcl, you have multiple scripts for different phases of PnR.
-------------
Import Design
---------------
import_design.tcl => Import design => set up design for import into Encounter Digital Impl system (EDI).
On gui: file->import design->basic
# Import LEF/Cap Tables/LIB/Netlist/Constraints => this file sets rda_Input(*) for various parameters.
loadConfig  /db/Hawkeye/design1p0/HDL/Autoroute/digtop/vdio/scripts/import.conf

Important parameters are :
ui_netlist => structural verilog netlist
ui_timelib.min/max =>min/max timing lib (ex: /db/pdk/lbc8/rev1/diglib/pml30/r2.5.0/synopsys/src/PML30_S_-40_1.95_CORE.lib)
ui_timingcon_file (constraints.sdc file) => same as pulled from DC (i.e set_load, set_driving_cell, set_dont_touch)
ui_*_footprint => provides names so that such cells can easily be identified.
ui_leffile => provide leffile for both tech and std cells.
Tech file: if its 3 layer metal, file will have pitch,width,spacing.etc for MET1/2/3 and various vias for VIa12 and VIA23. ex: /db/pdk/lbc8/rev1/diglib/pml30/r2.5.0/vdio/lef/pml30_lbc8_tech_3layer.lef
Std cell file: /db/pdk/lbc8/rev1/diglib/pml30/r2.5.0/vdio/lef/pml30_lbc8_core_2pin.lef

ui_core_* => core width,height,row_height,utilization,etc. => these values are bogus and not used for anything.
ui_captbl_file => lookup res/cap tables for typ,worst,best for M1/2/3 (cap for various width and space, min W=0.2um,S=0.2um, Ctot=0.35ff/um. It provides total cap, Coupling cap, Area cap and fringing cap. There's also an extended cap table) and for CONTACT/VIA1/2 (via resistance is about 5ohms. For M1/M2/M3 res is about 0.1ohm/um. Res is usually higher for M1 as it's thinner than top layers).We specify minC_minVia / maxC_maxVia cap table file.  NOTE: If QRC techfile specified, then that is used, and captbl file is ignored by tool.
ui_pwrnet,ui_gndnet => set pwr nets to VDD/VSS (for multi pwr domains, put all pwr supplies for that net). This will get connected to pwr/gnd pins found in stdcells lef file. Lef file has pin names and attribute to identify it as pwr/gnd pin.
ex: for VDD pin in lef file for a stdcell
PIN VDD => pin name is VDD
DIRECTION INOUT ;
USE POWER ; => pin is a pwr pin (if it were a gnd pin, it would be USE GROUND).

eg: set rda_Input(ui_pwrnet) {VDD VDD_WL EXTVREF} => specifies 3 pwr nets (NOT pins) with names VDD, VDD_WL, EXTVREF. These are the nets that are routed during "sroute". These will get connected to pins in stdcells with "USE POWER" attribute, provided name of nets match the pin name (in lef) from stdcell. If names are different, then use "globalNetConnect" cmd explained below later. We can also specify nets with high terminal connections (large fanout) to get some default delay, load, etc  to save runtime.

#from Enc version 11 and onwards, import design looks different:
--------
source /db/Hawkeye/design1p0/HDL/Autoroute/digtop/vdio/scripts/import.conf
init_design => this loads parameters from import.conf
Important parameters in import.conf are :
set defHierChar {/}
set init_top_cell {DIG_TOP}
set init_verilog {../input/DIG_TOP.preroute.v} => same as ui_netlist
set init_pwr_net {V1P8D} , set init_gnd_net {DGND} => pwr/gnd nets specified
set init_lef_file {../input/MSL445_4lm_tech.lef  ../input/MSL445_CORE_2pin.lef ../input/MSL445_CTS_2pin.lef ../input/sshdbw00096016020.lef} => all lef files
set init_mmmc_file {mmmc.view} => optional: everything in "create views" section (in create_views.tcl) below is specified here.
-------

Remove assign from netlist => To remove assign from synthesized netlist or final PnR netlist, use this cmd:
setDoAssign  on -buffer BU110 => this places buffer BU110 wherever assign are found. Buffers are placed only if needed, else it will just move nets up/down the hier to get rid of assign. So, final netlist will be free of assigns. This cmd can also be placed in import.conf above.

NOTE: if there are any HardIP, .lef and .lib should be provided for those. If .lib is missing for any cell, Enc doesn't generate any error/warning, treats that cell as a black box, and makes the paths going in/out of that cell unconstrained. This is very dangerous, as these paths will not be optimized for timing and will show up as "unconstrained paths" in report_timing.

# Save Design after Import. this saves design so that it can be restored later from enc.dat/* (encounter database). After various phases of PnR, EDI puts files here in appropriate dir.
saveDesign ./dbs/import/import.enc -def  => we save it in import dir. Def file in "dbs/import/import.enc.dat/digtop.def.gz" has die area (initial area in import.conf file), initial rows, tracks, gcellgrids (gcell grid and tracks are taken to be equal to M2 pitch), NO vias, components (just the names of all components from synthesized verilog netlist with no placement info), unplaced pins(pin names derived from synthesized verilog netlist), unplaced special nets VDD/VSS and all unplaced nets.
import.enc has this line: restoreDesign ./dbs/import/import.enc.dat digtop

#dir structure of dbs:
dbs has a dir for each step run. Within each dir, it has a .dat subdir which has multiple files. For ex, dbs/import/import.enc.dat/ has these files:
1. digtop.conf: same as import.conf, except that "ui_netlist" verilog netlist is now pointing to digtop.v.gz in import dir. If we are in route dir, then this netlist is set to digtop.v.gz in route dir. ui_core_height/width etc are also changed to the latest value depending on if floorplan has been run or not.
2. digtop.def.gz: has def file
3. digtop.v.gz: verilog generated after import (same as initial verilog from synthesis).
4. digtop.fp.gz: derived from  digtop.def.gz.
5. digtop.fp.spr.gz: just has vias/vdd/vss coords in it.
6. digtop.globals: sets global values for encounter to use
7. digtop.mode, digtop_power_constraints.tcl, enc.pref.tcl, digtop.opconds: all set*mode, pwr_mode encounter cmd, enc pref settings put here to be used later
8. digtop.place.gz, digtop.route.gz: intermediate place and route info to be used by enc.

#on screen o/p
On screen, we see VDI reads in .lef, .lib, and digtop.v netlist from synthesis tool. It reports total no. of cells and modules in verilog netlist. Then it reads .lib files and reports all cells found [all comb cells, seq cells, usable buffers (BU*), unusable delaycells/buffers (delay cells as BU112, clk tree buf as CTB* etc which are marked as dont_use)]. Reads in cap tables, sets few default parameters, and then saves verilog netlist and def file. Def file in "dbs/import/import.enc.dat/digtop.def.gz" has the initial floorplan size, rows, tracks, gcellgrid, Vias, all components(from digtop.v netlist), pins(all ports), special nets(VDD/VSS) and all other nets in the digtop.v netlist.

#freeDesign => used to remove lib and design-specific data from the Encounter session. It can be used as a shortcut in place of exiting and re-starting Encounter.
When you specify the freeDesign command, the Encounter software does not free collections but only invalidates them. For ex, after saveDesign, if we do freeDesign, it invalidates import.enc file, so that we can do loadConfig to load import.conf or do source *.enc to load any other file we wish.

#source => we can use this to source design from a particular step
source ./dbs/cts/cts_opt.enc => sources design from cts step => or restoreDesign ./dbs/cts/cts_opt.enc.dat digtop

#update_* => this can be used to update some variable that you set to some wrong value before.

Create Floorplan
-------------------
create_floorplan.tcl => add spacing b/w rows, define fp boundary, create ring, read pin locations,check fp, and then save design

# Add spacing between two rows => default is VDD then VSS then VDD and so on. 13.6dbu is spacing height and 2 says after every 2 rows. So, it would be VDD VSS VDD space VDD VSS VDD space VDD ... Keep spacing as 1 row height i.e 13.6dbu for LBC8
setFPlanRowSpacingAndType 13.6 2

# Define Die, IO, Core boundaries => die is whole chip, IO is inside die where we want IO pins, CORE is inside IO where we want logic to be placed. space b/w DIE/IO and CORE boundary can be used for power rings or left empty for signals to be routed. IO pins can be placed on DIE or IO boundary. CoreMargins are spacing b/w core-to-IO or core-to-die
#NOTE: core height needs to be a multiple of std row height (which in turn is a multiple of M1 pitch). Core width needs to be a multiple of M2 pitch. Boundary around core also needs to be multiple of M1 pitch for top/bottom and M2 pitch for left/right. For LBC8: M1/M2 pitch is 1.7du, so boundary around core needs to be a multiple of 1.7.

#( -b <die_x1> <die_y1> <die_x2> <die_y2> (co-ord of die) <io_x1> <io_y1> <io_x2> <io_y2> (co-ord of outside edge of I/O box) <core_x1> <core_y1> <core_x2> <core_y2> (co-ord of outside edge of core box) ) => all co-ord in du. so power ring gets into that area b/w die edge and I/O box edge
#-s <core_box_Height> <core_box_Width> <coreToLeft> <coreToBottom> <coreToRight> <coreToTop> => <coreTo*> specifies margin from outside edge of core box to left/right/bottom/top DIE/IO.
#-d <die_box_Height> <die_box_Width> <coreToLeft> <coreToBottom> <coreToRight> <coreToTop> => <coreTo*> specifies margin from outside edge of core box to left/right/bottom/top DIE/IO.

#-d is most convenient to use as you specify the outermost size. -s is convenient when we have power rings. -b is only used when we want to have much finer control.
floorPlan -site CORESITE -b 0.0 0.0 2700 1452 14 14 2686 1438 14 14 2686 1438 => draw die (0,0,2700,1452), then inside it draw the IO box leaving 14dbu space on all sides (14,14,2700-14,1452-14), then inside it we have CORE box (CORE box in this case is same size as IO box)
floorPlan -site CORESITE -s 2100.0 2100.0 14 14 14 14 => draw 2100 dbu size CORE and leave space of 14dbu on all sides b/w DIE/IO to core.
floorPlan -site CORESITE -d 2100.0 2100.0 14 14 14 14 => draw 2100 dbu size DIE and leave space of 14dbu on all sides b/w DIE/IO to core.  Full floorplan is 2100x2100, but stdcells can only be placed in core which is smaller by 14 on all sides.

#for rectilinear shape
setObjFPlanPolygon 0 0 0 750 600 750 600 900 1000 900 1000 0 0 0 => draws rectilinear shape starting from (0,0) to (0,750) to (600,750) to (600,900) to (1000,900) to (1000,0) to (0,0). Run this cmd after floorPlan cmd above. Then it modifies the fp area according to the polygon shape.
loadFPlan DIGTOP_mod_rect.fp => We can also use this to load rectilinear fplan. This loads the floorplan from fp file which has rows(DefRow), Track and GCellGrid defined. This fp file is generated first time by Tool after we manually adjust the boundary, and then it can be saved and then used for future use.

reportDesignUtil => It reports stdcell area utilization (area where stdcells are placed divided by allocated area of die, excluding placement blockages). This can approach 80% or more for a dense design. It's always < 100% as the outer area of the die is for VDD/VSS lines, so no stdcells can ever be there. It also reports Core and Chip utilization (area of core where stdcells can be placed divided by area of die).
We can also get same utilization report thru GUI: goto Place->Query_density->Query_place_density

#To manually edit VDD/VSS routes, we use setedit cmd. Else we can use addRing cmd to automatically create rings.
#setedit: Updates the Edit Route form and the design display area. many options available:
setEdit -shape RING => Specifies the shape associated with the wire you draw. here, wire drawn will be always RING shape.
setEdit -use_wire_group {0|1} => Groups multiple wires from the same net, which decreases resistance. default is 0, meaning wires are not grouped.
setEdit -width_horizontal 3.5 -spacing_horizontal 1.2 => Specifies the width and spacing for horizontal wires.
setEdit -width_vertical   3.5 -spacing_vertical   1.2 => Specifies the width and spacing for vertical wires.
setEdit -nets {VSS VDD} => Specifies one or more nets for editing. Here we are going to edit only nets VDD and VSS.
setEdit -layer_vertical MET2 => specifies the layer for vertical wires.
setEdit -layer_horizontal MET3 => specifies the layer for horizontal wires.
setEdit -close_polygons {0|1} => Specifies whether to close a special route structure toward itself, using the Escape key. For the closing to complete, the ending wire segments must be drawn towards the start wire segments, but do not have to touch them. default is 0, meaning do not close.

#routes can now be added and committed using these 2 cmds: editAddRoute creates wire segments that start and stop at the specified points. The wire ends at the point specified by editCommitRoute.
editAddRoute x1,y1 => Specify (x,y) of centerline for the start point or end point of the wire segment. Continue doing this with more editAddRoute cmds for each corner, until we are about to come back to the start point. At that time, do
editCommitRoute x1,y1 => route is closed at x1,y1 which is the start point for a rectangular shape.
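#ex: a minimal sketch of drawing a rectangular VDD ring manually with the cmds above (coords/widths are made-up values, and the coordinate style simply follows the description above):
setEdit -nets {VDD} -shape RING -layer_horizontal MET3 -layer_vertical MET2 -width_horizontal 3.5 -width_vertical 3.5
editAddRoute 20,20 => start at bottom-left corner of the ring centerline
editAddRoute 2680,20 => bottom-right corner
editAddRoute 2680,1430 => top-right corner
editAddRoute 20,1430 => top-left corner
editCommitRoute 20,20 => close the ring back at the start point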

# Create Ring  (get metal layer names from /db/pdk/lbc*/.../vdio/lef/*.lef file). Power pin names (VDD,VSS) are the pin names that appear in std cell lef files, so we specify those names so that sroute connects all of them.
#-nets => first net specifies the first net around the core, 2nd net specifies the second net around the core and so on. So {VDD VSS} means first put VDD around the core and then VSS (so VDD is inside while VSS is outside)
#-type core_rings => Creates core rings that follow the contour of the core boundary or the I/O boundary.
#-center 1 => center the core rings b/w IO pads and core bdry. If -center 0, then we need to specify the 4 offsets: offset_top, offset_bottom, offset_left, offset_right. Offset is from edge of the inner ring to Core/IO bdry
#-layer_*  Specifies which layer to use for each side of the ring or rings being created.
#-spacing_* Specifies the edge-to-edge spacing between rings for each side of the ring
#-width_* Specifies the width of the ring segments for each side of the ring
#-follow core|io => specifies whether to follow core or io bdry (default is core)
#-skip_side {top bottom} => skips putting ring on top and bottom as regular VDD/VSS lines will anyway get added there.
#NOTE: in Encounter versions before -9.1_USR2_s159, core bdry top is taken as the last VDD net if closest power ring is VDD, or VSS net if closest power ring is VSS. So, this causes offset in power rings. Even in later encounter versions, rings may get offset (with -center 1). just add an extra row in such cases, so that vdd/vss gets lined up correctly.

#ex with ring centered
addRing -nets {VDD VSS} -type core_rings -center 1 -layer_top MET1 -layer_bottom MET1 -layer_right MET2 -layer_left MET2 -width_top 4 -width_bottom 4 -width_left 4 -width_right 4 -spacing_top 1 -spacing_bottom 1 -spacing_right 1 -spacing_left 1

#ex with ring not centered, allows more control. use this to avoid spacing b/w i/o bdry and ring, so that no routes are inserted there. -offset specifies spacing from the edge of the inner ring to the boundary of the referenced object for each side of the ring.
addRing -nets {VDD VSS} -type core_rings -center 0 -offset_top 5 -offset_bottom 5 -offset_left 5 -offset_right 5  -layer_top MET1 -layer_bottom MET1 -layer_right MET2 -layer_left MET2 -width_top 4 -width_bottom 4 -width_left 4 -width_right 4 -spacing_top 1 -spacing_bottom 1 -spacing_right 1 -spacing_left 1 => offset 0 gets the ring starting from boundary of core.

NOTE: in newer versions, we can use "layer, width, spacing, offset" in array style for each side. The above way is obsolete.
i.e instead of "-offset_top 5 -offset_bottom 4 -offset_left 3 -offset_right 2", we do "-offset {left 3 bottom 4 top 5 right 2}"
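#ex: a hedged sketch of the not-centered ring cmd above rewritten in the newer array style (same values as that example, assuming layer/width/spacing/offset each take the per-side array form shown for -offset):
addRing -nets {VDD VSS} -type core_rings -center 0 -layer {top MET1 bottom MET1 left MET2 right MET2} -width {top 4 bottom 4 left 4 right 4} -spacing {top 1 bottom 1 left 1 right 1} -offset {top 5 bottom 5 left 5 right 5}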

###############################################
# Add stripe => Creates power stripes within the specified area. These stripes connect all the way down to horizontal VDD/VSS lines on stdcells so that pwr supply to these regions in the center of the core is still robust, preventing huge IR drop.
#-block_ring_top_layer_limit = Specifies the highest layer that stripes can switch to when encountering a block ring
#-block_ring_bottom_layer_limit = Specifies the lowest layer that stripes can switch to when encountering a block ring.

addStripe -block_ring_top_layer_limit MET3 -max_same_layer_jog_length 1.6 -padcore_ring_bottom_layer_limit MET1 -number_of_sets 1 -stacked_via_top_layer MET4 -padcore_ring_top_layer_limit MET3 -spacing 1 -xleft_offset 1345 -merge_stripes_value 0.85 -layer MET2 -block_ring_bottom_layer_limit MET1 -width 4 -nets {VSS VDD } -stacked_via_bottom_layer MET1 => width, layer, spacing and x-offset provided for the stripes. First VSS put then VDD starting from x=0.

#global net connect => used to connect pins/nets in inst to a specified global net (required only if we have more than 1 pwr net or gnd net, or names of pwr/gnd nets don't match with those of pwr/gnd pins in stdcells). Type of pin needs to be specified; it can be one of any 4 types - tiehi, tielo, pgpin, net. 3 use scenarios for this cmd:
1. Connecting pins in a single instance to a global net:
ex: globalNetConnect NET123 -type pgpin -pin VDD -singleInstance Ictrl/FF_0_reg => connects pin VDD of flop to NET123
2. Connecting pins in a single/multiple instance to a global net:
ex: globalNetConnect VDD123 -type tiehi => tie "1'b1" in netlist to net VDD123.
ex: globalNetConnect VDD456 -type tiehi -pin OEN -inst PAD* -module {} => tie "1'b1" on -pin OEN of all PAD* inst to net VDD456.
3. Connecting nets to a global net:
ex: globalNetConnect NET123 -type net -net net1 {-hierarchicalInstance Ictrl/I_Reg | -all} => connects net1 to NET123
ex: globalNetConnect VDD123 -type pgpin -pin VDD -all => connects pg pin VDD of all instances to global net VDD123.

NOTE: "globalNetConnect -type tiehi|tielo" cmd connects 1'b1 or 1'b0 directly to power rails, and NOT to tie high/low cells. Ususally we want to isolate the input pins of cells from the power grid. This reduces noise coming from the power grid and reduces the possibility of damaging the gate oxide of the pin. To make connections to tie high/low cells, look in "warnings" section below.

#clearGlobalNets => clear everything
#globalNetConnect VDD_1P8 -type pgpin -pin VDD -inst * -module {} => adds new global net VDD_1P8 (1st arg) to pg pin VDD (2nd arg) found in all physical instances and modules of design. -type pgpin specifies that pwr/gnd pins listed with "-pin" param should be connected to global net VDD_1P8. VDD_1P8 is the Power ring around die specified as pwr_net in import.conf.
#globalNetConnect DGND -type pgpin -pin VSS -inst * -module {} => adds new global net DGND (1st arg) to pg pin VSS
#globalNetConnect VDD_WL_1P8 -type pgpin -pin VDD_WL -inst fram -module {} => connects for fram inst of any module. fram module lef file has VDD_WL as a power pin with multiple ports around fram bdry. VDD_WL_1P8 is the net in import.conf

#createRouteBlk => Creates a routing blockage object that prevents routing of specified metal layers, signal routes, and hierarchical instances in this area
createRouteBlk -box <llx lly urx ury> -layer {MET1 MET2} -exceptpgnet -name blk_1 => creates routing blkg named blk_1 to be applied on routing layers MET1 and MET2, in coords specified. -exceptpgnet Specifies that the routing blockage is to be applied on a signal net routing and not on power or ground net routing. usually needed on pwr rings so that VDIO doesn't route any signal nets there
ex: createRouteBlk -box 59.950 0.000 61.000 147.000 -layer {1} => created routing blkg on met1

#createPlaceBlockage => To prevent tool from putting any instance in this area. Usually done around HardIP.
createPlaceBlockage -box 779.3500 659.1000 1062.0500 813.8000 => placement blkg size will adjust automatically so that the blkg always starts from a row boundary (i.e a row cannot be partially blocked; it's either completely blocked or completely unblocked)

# sroute => (special routes) Routes power structures. Use this command after creating power rings  and  power  stripes. Throws some warnings related to def file that was created during import. sroute knows cell row height from CORESITE size in std cell lef file, so it routes VDD/VSS at CORESITE height.
#-nets {VDD VSS} => nets to sroute.
#-stripeSCpinTarget boundaryWithPin => extends unconnected stripes and standard cell pins to design boundary and creates a new power pin along the design boundary. Any overlaps with existing I/O pins at the design boundary are flagged as violations after the extension. This option is helpful, since Layout at top level connects to these power routes, so extending them all the way to the edge makes it easier to connect to the global power supply.
sroute uses both layer changes and jogging to avoid DRC viol.
#-allowJogging 1 => jogs are allowed during routing to avoid DRC violations. If 0, then jogs are avoided as much as possible.
#-allowLayerChange 1 => Allows connections to targets on different layers. If jogs do occur, it says that preferred routing dirn should be used, wherever possible.

sroute -verbose => normal routing where power structures stop at core boundary or at power rings.
sroute -stripeSCpinTarget boundaryWithPin -allowJogging 0 -allowLayerChange 1 => routes power structures all the way to IO/die boundary.

#create power pins (not needed). -geom creates physical pin at specified co-ord, else only logical pin created.
#createPGPin -geom <layerId> <llx> <lly> <urx> <ury> -net <net_name> <pg_Pin_name>=> layerId is number 4,5,etc.

# Read pin locations => we load the io location file saved from cadence after it places the pins in some order the first time. This ensures that the next time we invoke VDI, we get the same pin locations. Goto File->save->I/O file => save as digtop.save.io in current dir (select "locations" for now). This is done after PnR has been run the first time; then we get pin locations, and we save them in this file.
#To move pin placement the first time VDI generates it, we can goto edit->Pin editor. then choose which pin to be placed where, and then save it using file->save->i/o file. Choose Save IO as "locations" and select "generate template IO file"
loadIoFile scripts/digtop.save.io (In this pin file we specify pin name, offset, width(thickness) and depth(length) and metal layer of pins. offset specified for left/right side is wrt bottom edge while for top/bot is wrt left edge, so even if size increases in x or y dirn, we don't need to change this file. Pins are always put at boundary of die. This is in contrast to def file, which have absolute coords.)
ex: (pin name="CLK10MHZ"    offset=3.2500 layer=3 width=0.2500 depth=1.4000 ) => this is for an iopin offset by 3.25 dbu

# Read pin locations (these i/o pin loc comes from top level design from layout person. for 1st pass, comment it)
#defIn /db/BOLT/design1p0/HDL/Autoroute/digtop/Files/input/digtop_pins.def

NOTE: left/right pins (horizontal pins) are usually on MET3 (not MET1 which is lowest layer), while top/bot pins are on MET2. Pins are usually on top 2 metal layers for that block, as that allows more efficient routing.

# Set Fix IO so that placement does not move pins around (comment it for 1st pass, as these are put arbitrarily initially, so we don't want to fix these)
#fixAllIos => changes the status of all I/O pins and I/O cells to a FIXED state. -pinOnly option changes the status of all I/O pins only to a FIXED state, while -cellOnly changes the status of all I/O cells only to a FIXED state.

# Check Floorplan
setDrawView fplan => Sets the design view in the design display area to amoeba, fplan or place
checkFPlan -reportUtil -outFile ./dbs/floorplan/check_fp.rpt => Checks the quality of the floorplan. This should be run on the initial fp and the final fp (and also during intermediate steps for debug purposes). Checks that can be performed are -feedthrough (feedthrough buffer insertion), -place (placement), -powerDomain (checks pwr domain) and -reportUtil (reports target util and effective utilization). Look in check_fp.rpt for any issues (like pins not on tracks which result in inefficient layout, etc)
#utilization = stdcell_area/total_area (total_area is total area of die including empty rows, power rings, etc)
#density = stdcell_area/alloc_area (alloc_area is area of core where stdcells can be placed, so if we have power lines where there's not enough height for a row, we don't count that in alloc_area. Similarly empty rows, power rings, area b/w core/die not counted. stdcell_area is sum total of all stdcells+IP_Blocks. Area of std_cells and IP_blocks is taken from lef file.)
#So, utilization is always a lower number than density.
#NOTE: to get additional info, use below 2 cmds:
reportGateCount => can be used to report total no. of cells, and their area in terms of nd2x1 as well as absolute area.
checkDesign -noHtml -all -outfile ./dbs/floorplan/check_design_fp.rpt => run this after each step to get detailed info. checks design for missing cells, etc and is very comprehensive check. (-all performs all checks as dangling nets, floorplan errors, I/O pads/cells, nets, physical lib, placement errors, pwr/gnd connections, tieHi/Lo and if cells used have been defined in timing lib). It shows a concise report on screen and a detailed report in the outfile.
checkDesign report:
1. design summary (on screen): shows total no. of stdcells used, and their area.
2. design stats: On screen, it shows total no. of instances and nets, while in report it shows all cell types (as nand, or, sparefill, etc) used in design
3. LEF/LIB integrity check (in reports): checks whether cells used in design have correct lef/timing info.
4. netlist check: On screen, it shows IO port summary (total no of ports), while in reports, it shows Floating ports, ports connected to multiple pads (pads are what is on the bdry of chip, ports are connected to these pads inside the chip), Ports connected to core instances (in our case, no. of ports connected to core cells should equal total no. of io ports (minus any floating ports) as each i/o port has an IO buffer, so it's connected to just one inst). There should be 0 o/p pins connected to pwr/gnd net (since nothing should be connected to PG directly, it's thru TieOff cells). Under "Instances with multiple input pins tied together", we see those gates whose i/p pins are tied to same net. Here we see all spare cells, as well as some other cells whose i/p are tied together for opt. "Floating Instance terminals" and "Floating IO terms" should be 0. Note that "Floating terminals" only reports a terminal as floating if it's not connected to any net. If it's connected to a net which is floating, then the terminal would still be considered as not floating, but the net will be considered as floating, which will be reported in next section as "undriven net" => very important to check these for floating input on gates.
5. net DRC: On screen, we see no. of floating pins, and other DRC on pins, while in reports, we see "No Fanin", "no fanout" and "High FO" nets. We may have "no fanin" nets for modules which have i/o ports, but they aren't being used inside the module. So, such ports get connected to "FE_UNCONNECTED_*" nets by encounter. These floating nets get carried from synthesized netlist, where they couldn't be removed because they were part of a bus, or because they were tied to 0/1, which is no longer needed (optimized away during PnR). NOTE: very important to check all "No Fanin" nets as any floating nets will be reported here, which may be input of gates.
6. IO pin check: In reports, we see all IO pins connected to which inst (all pins should be connected to BUF), "Instance with no net defined for any PGPin" (basically all inst, starting from instances in digtop and then in modules as they are referenced in digtop [no. of inst reported in design stats], are reported here as we don't have PGPin for inst; PG pins only exist in lef file of inst, not in verilog model).
7. Top level Floorplan check: in reports, it shows tracks which are offgrid, IO pins offtrack (some pins may get reported here as pin def file from layout folks may not have all pins on track, though they will still be on mfg grid), "Floating/Unconnected IO Pins" (these are also pins offtrack, but not sure why it gets reported in this section), etc. Look at final numbers for "Floating/Unconnected IO Pins" and "IO Pin off track" given at the end of report. That's the correct number.

NOTE: checkDesign should be run at each stage, as it gives valuable information about the design. "checkNetlist -includeSubModule" is by default included as part of checkDesign (it only includes "netlist check" section from "checkDesign" and is a good concise report). Run  checkDesign after final netlist is generated to see full report.

# Save design after floorplan
saveDesign ./dbs/floorplan/floorplan.enc -def => note, this time we save it in floorplan dir. Def file in "dbs/floorplan/floorplan.enc.dat/digtop.def.gz" has new area, rows, tracks, gcellgrids, vias, components, placed pins(from io def file), placed special nets VDD/VSS (as sroute is done) and all nets.

place blocks: This is needed only if we have hard macros that we want to instantiate.
------------
#instantiate hard macro at specified loc
#setObjFPlanBox <objectType> <objectName> <llx> <lly> <urx> <ury> => Defines the bounding box of a specified object, even outside the core boundary. <objectType> can be Bump, Cell, Group, Instance, I/O cell, I/O pin , Layershape, Module, Net, etc.
#flipInst <Inst> {MX | MY} => flips inst. MX -> Flip with Mirror on X axis, MY -> Flip with Mirror on Y axis
#orientateInst <Inst> {R90 | R180 | R270 | MX | MY} => orientate. R -> Rotate, M -> Mirror
ex: setObjFPlanBox Module abc 100.00 200.00 400.00 500.00 => bounding box for module abc with lower left x=100, lower left y=200, upper right x=400, upper right y=500.
ex: setObjFPlanBox Instance fram 10 10 30 40 => bounding box for instance fram present in digtop.v netlist.
ex: orientateInst fram  R90 => rotate inst fram by 90 degrees.

#add halo to block. A halo is an area that prevents the placement of blocks and standard cells within the specified halo distance from the edges of a hard macro, black box, or committed partition in order to reduce congestion.
addHaloToBlock 5  10  95 15 fram => adds halo to fram instance (in um). <from left edge=5> <bottom edge=10> <right edge=95> <top edge=15>

#cutRow => Cuts site rows that intersect with the specified area or object. Needed so that there will be no rows over that area or object, and the router won't route VDD/VSS lines over it. If no options are specified, the cutRow command automatically cuts all blocks and all rows around the placement blockage. Instead of "cutRow", we can also do "sroute" after placing these IP blocks, so that sroute will automatically not put VDD/VSS lines over these IP.
#-area <box_coords> => Specifies the x and y coordinates of the box area in <llx> <lly> <urx> <ury> in which rows will be deleted.
#-selected => only rows interfering with selected objects will be cut
#-halo <space> => Specifies the additional space to be provided on the top, bottom, left, and right sides of the specified or selected object.

selectInst fram
cutRow -selected -halo 1 => specs that additional space of 1um should be provided on all sides of selected obj (fram). Also, all rows around placement blkg are deleted.

#we can also place an instance using these cmd:
selectInst I_ram/fram_inst => selects fram instance in digtop. here it's hard IP as felb800432
placeInstance I_ram/fram_inst 805 563 R0 => places at x,y =(805,563) with R0 orientation (cut sign on bot left of IP)
addRing .. -nets {DGND V1P8D} ... => adds power ring around fram
deselectAll => deselects the inst so that new cmds can be applied to whole design

#connect pwr pins on Blocks with power rings around them
sroute -connect blockPin  -blockPin all\
    -blockPinRouteWithPinWidth -jogControl { preferWithChanges preferDifferentLayer } \
    -nets { DGND V1P8D } -blockPinMinLayer 2 -blockPinMaxLayer 4

Create views
---------------
create_views.tcl => creates views for various operating modes (scan, functional, etc) of design with various operating conditions (PVT). Called mmmc: multi mode multi corner. We specify bc/wc std cell library delay, and bc/wc Res/cap values. Then we "create_delay_corner" based on cell+wire delay. Then on top of that we create constraint_mode based on sdc files for func/scan/other modes. Then various "analysis_view" are created based on "delay corner" + "constraint_mode". Then we set the appropriate analysis view for setup and hold corner.

# Create Library Sets => for worst case (P=weak, T=150C, V=1.65V), best case (P=strong, T=-40C, V=1.95V)
create_library_set -name wc_lib_set -timing [list /db/pdk/lbc8/rev1/diglib/pml30/r2.4.3/synopsys/src/PML30_W_150_1.65_CORE.lib \
                                                  /db/pdk/lbc8/rev1/diglib/pml30/r2.4.3/synopsys/src/PML30_W_150_1.65_CTS.lib]
#                                    -si     [list ../cdb/cdb_files/max.cdb]
create_library_set -name bc_lib_set -timing [list /db/pdk/lbc8/rev1/diglib/pml30/r2.4.3/synopsys/src/PML30_S_-40_1.95_CORE.lib \
                                                  /db/pdk/lbc8/rev1/diglib/pml30/r2.4.3/synopsys/src/PML30_S_-40_1.95_CTS.lib]
#                                    -si     [list ../cdb/cdb_files/min.cdb]

# Create Operating Conditions => just use ones in .lib files

# Create RC Corners to use in delay corner after this. Cap tables are specified to be used for extraction, when running this RC corner (default is to use Enc internal rules to extract RC). T is specified to derate R values in cap table (it overrides the value of Temperature in cap table). QRC tech file is used for sign-off RC extraction.  
create_rc_corner -name max_rc -cap_table    /db/pdk/lbc8/rev1/diglib/pml30/r2.4.3/vdio/captabl/4m_maxC_maxvia.capTbl -T 150 \
                              -qx_tech_file /db/pdk/lbc8/rev1/rules/parasitic_data/qrc/2009.06.01.SR6/4m/maxC_maxvia/qrcTechFile
create_rc_corner -name min_rc -cap_table    /db/pdk/lbc8/rev1/diglib/pml30/r2.4.3/vdio/captabl/4m_minC_minvia.capTbl -T -40 \
                              -qx_tech_file /db/pdk/lbc8/rev1/rules/parasitic_data/qrc/2009.06.01.SR6/4m/minC_minvia/qrcTechFile

# Create min/max Delay Corner. specifies lib set, rc corner and operating condition for this corner. -opcond specifies the op cond found in .lib file.
operating_conditions (W_125_2.5) { process : 3;
    temperature : 125;
    voltage : 2.5;
    tree_type : "balanced_tree";
  }
#-opcond_library Specifies the internal library name for the library in which the operating condition is defined. Every .lib file has a library name at the top. Note: this is NOT the file name, but library within that file. See liberty.txt file for info.
library ( MSL270_W_125_2.5_CORE.db ) {
 ...
}
So, in the lib set, if we specified multiple lib files, then for setup/hold analysis, tool picks up default op cond in each lib set when it's called. But if we want to force a particular op cond, we specify the opcond_library where it will look for opcond, and then use that P,V,T cond for particular corner.  We specify *CORE.db but could have specified *CTS.db too, since both of them have that op cond.
create_delay_corner -name max_delay_corner -library_set wc_lib_set -opcond_library PML30_W_150_1.65_CORE.db -opcond W_150_1.65 -rc_corner max_rc
create_delay_corner -name min_delay_corner -library_set bc_lib_set -opcond_library PML30_S_-40_1.95_CORE.db -opcond S_-40_1.95 -rc_corner min_rc

# Create Constraint Mode => for this netlist, we create two modes: functional and scan. NOTE: all files same as from synthesis.
create_constraint_mode -name functional -sdc_files \
    [list /db/Hawkeye/design1p0/HDL/Synthesis/digtop/tcl/env_constraints.tcl \ => env constraints (i/o load, i/p driver)
          /db/Hawkeye/design1p0/HDL/Synthesis/digtop/tcl/dont_use.tcl \ => dont_use (optional)
      /db/Hawkeye/design1p0/HDL/Synthesis/digtop/tcl/dont_touch.tcl \ => dont_touch (optional)
          /db/Hawkeye/design1p0/HDL/Synthesis/digtop/tcl/clocks.tcl \ => clk defn
          /db/Hawkeye/design1p0/HDL/Synthesis/digtop/tcl/constraints.tcl \ => all design constraints = i/o delays
          /db/Hawkeye/design1p0/HDL/Synthesis/digtop/tcl/gen_clocks.tcl \ => generated clk defn
          /db/Hawkeye/design1p0/HDL/Synthesis/digtop/tcl/case_analysis.tcl \ => scan_mode set to 0 (only if scan present)
      /db/Hawkeye/design1p0/HDL/Synthesis/digtop/tcl/false_paths.tcl \
      /db/Hawkeye/design1p0/HDL/Synthesis/digtop/tcl/multicycle_paths.tcl]
    
create_constraint_mode -name scan -sdc_files [list ./scripts/scan.sdc]
#scan.sdc has env constraints (i/p driver, o/p load), clk defn for scan clk (on port designated for scan clk) with a slower cycle than func clk, case analysis with scan_mode set to 1, and all design constraints (i/o delay redefined wrt scan clk). Only difference in scan sdc  (compared to functional sdc) is that no false path file is needed as there is only single clk (scan clk) when scan mode is set to 1. Also, i/o delay specified here is wrt scan clk, whereas in func, all i/o delay were wrt func clk

NOTE: instead of using all these *.tcl files from Synthesis dir, we can use .sdc file generated in sdc/constraints.sdc using write_sdc command. Be careful though to remove "set_ideal_network", "set_false_path -from scan_enable", "set_clock_uncertainty", "set_resistance" from internal nets, "set_load" from internal nets, etc from this sdc file so that it can be used in PnR. Or a safer approach is to just use all constraints tcl files separately and not rely on the sdc file. In DC-topo, constraints file has resistance/load on each net of design, causing EDI to pick these up, instead of calc res/cap for each net. See Synthesis_DC.txt. Also the set_units cmd causes different cap/time units to be used in cdns/snps tools, so be careful. See in sdc.txt. Also, sdc generated by synopsys has "-library *.db" for set_driving_cell cmd. This causes a warning as "Could not locate cell IV110 in any library for view MIN" in encounter, as when reading sdc file for MIN corner, there's no MAX corner db file available, causing those warnings. Best approach is to manually remove any reference to lib/db files, so that same sdc files can be used for all MIN/MAX/NOM corners.

# Create Analysis Views => now create 4 views: func(max/min) and scan(max/min)
create_analysis_view -name func_max -delay_corner max_delay_corner -constraint_mode functional
create_analysis_view -name func_min -delay_corner min_delay_corner -constraint_mode functional
create_analysis_view -name scan_max -delay_corner max_delay_corner -constraint_mode scan
create_analysis_view -name scan_min -delay_corner min_delay_corner -constraint_mode scan

#NOTE: update_library_set, update_rc_corner, update_delay_corner, update_constraint_mode, update_analysis_view can be used to update any of these variables.
#report_case_analysis can be done to see what values of pins are associated with diff analysis views. This is useful to verify that all views are correct. Sometimes, tools don't pick up constraints in *.tcl files and just ignore them, if it's not the expected syntax
#report_path_exceptions can be used to see list of all false paths used by VDIO.
#report_ports -pin [all_output] => to report caps+external_delay on all o/p ports. Similarly for i/p ports with [all_input]. This can be used to verify if sdc files were loaded properly in all views (check these values for both func_mode and scan_mode)
#report_ports -pin [get_ports {ENABLE_PORT1}] => to report for a specific port

# Save design after creating views => save in views dir
saveDesign ./dbs/views/views.enc

Place
------
place.tcl => apply more constraints, set view, perform timing analysis, check design, attach bufs,  then do placeDesign, and check and save
# Apply additional constraints
set timing_enable_genclk_edge_based_source_latency false => Controls  how  the  software  chooses generated clock source latency paths. When set to false, the software does not check paths for the correct cause-effect relationship. We should set it to "true" so that we can see if all generated clocks have correct rise/fall relation with source clk. latency for generated clock is chosen as "0" for gen clk edges which don't have correct relationship with source clk. Ex: if gen clk is div by 1 and it's a +ve edge clk, then fall edge of gen clk will generate error (and hence have 0 latency) as fall edge of source clk can't generate fall edge of gen clk.

# Do placement, CTS and Route in Functional mode. We use scan mode briefly during CTS (to get clk tree) and then again go to func mode. We goto scan mode after place during STA/SIGNOFF timing analysis.
set_analysis_view -setup func_max -hold func_min => Defines  the  analysis views (func_max and func_min only) to use for setup and hold analysis and optimization. Here cap tables are used for wire res/cap. On screen, it shows what files it used for each view. sdc file is read here for the first time, so any errors/warnings found in sdc file syntax are reported here.
#since views are set to func mode, all_constraint_mode active at this time are only func mode. scan mode is not active. We can type "all_constraint_modes -active" to see all active modes.

### print some useful reports before doing placement
#report_clocks => This reports all clks (gen too) with their waveforms. If something is incorrect, it needs to be fixed
#check_timing -verbose timingReports/check_timing.rpt => This reports any problems with any clks. It shows all flops with no clks, timing loops, unconstrained paths, ideal clks and problem b/w master and gen clks. It can be used to find out any mismatches b/w PT timing run and VDIO run (especially if some paths still have setup/hold violations in PT, it's most likely due to unconstrained paths in VDIO)
#report_path_exceptions can be used to see list of all false paths used by VDIO.

# Perform timing analysis before placement. timeDesign Runs Trial Route, extraction, and timing analysis, and generates detailed timing reports. The generated timing reports are saved in ./timingReports directory or the directory that you specify using the -outDir parameter. It saves reports for setup/hold for reg2reg, reg2out, in2out, clkgate (for paths ending in clk gating). Options are -prePlace | -preCTS | -postCTS | -postRoute [-si] | -signoff [-si] | -reportOnly. Only -signoff uses QRC extraction and SignalStorm delay calc, others use native extraction. -si can only be used with the -postRoute and -signoff options. It generates glitch violation report and incremental SDF for timing analysis.
timeDesign -prePlace  -prefix digtop_pre_place => running setup, so uses func_max view.

# check design before placement
#check_timing -verbose timingReports/check_timing.rpt

checkPlace ./dbs/place/check_place_pre_place.rpt => Checks FIXED and PLACED cells for violations, and generates violation rpt in file specified. If no o/p file specified, summary report is shown, which shows placed and unplaced instances and density.
#On the screen (and also in log/encounter.log file), it shows total no. of unplaced instances, which should equal the no. of instances in the *_scan.v netlist generated from DC, which is fed into VDI (in file script/import.conf as ui_netlist). There is a script in  ~/scripts/count_instances.tcl to count total no. of leaf cells in DC. The gate count from this script should equal no. of unplaced instances in VDI.
#Other way, if you don't want to use the script, is to goto DC reports and look at reports/digtop.scan.area.rpt file which shows total cell area in terms of nd2x1 gates. In VDI log/encounter.log file, look at placement density numerator area. Divide this by area of nd2x1 gate (by looking at nd2x1 gate area from lef file), and you get the total no. of gates in terms of nd2x1.
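#ex: a minimal tcl sketch of the 2nd method above (both numbers are hypothetical; read the real nd2x1 area from the lef file and the density numerator area from log/encounter.log):
set nd2x1_area    9.8      ;# hypothetical nd2x1 gate area from lef file (um^2)
set stdcell_area  125000.0 ;# hypothetical placement density numerator area from encounter.log (um^2)
puts "approx gate count in nd2x1 equivalents: [expr {$stdcell_area / $nd2x1_area}]"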

checkDesign -noHtml -all -outfile ./dbs/place/check_design_pre_place.rpt => checks design for everything.

#add placement obstruction in case we need to add diodes or other IP. After we are done with the obstruction, we can delete it
createObstruct <x1 y1 x2 y2> -name ANT_RESV => block standard cell placements in box formed by co-ords provided, and given name ANT_RESV. This is so that any subsequent placement doesn't place any cells here.
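ex (made-up coords): createObstruct 100 100 300 200 -name ANT_RESV => reserves the box (100,100) to (300,200) so that subsequent placement leaves it empty for diodes/IP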

#add antenna diodes to i/o pins. We do it before placing anything, since we specify exact location where we want to add diodes.
#2 ways: one by using below script (doesn't work with arrays), and other by using attachDiode cmd explained later.  
script by cadence to add diodes to all input + output pins: <EDI_install>/share/fe/gift/scripts/tcl/userAddDiodesToIOs.tcl
script by cadence to add diodes to all input pins:          <EDI_install>/share/fe/gift/scripts/tcl/userAttachIoDiodesToInputs.tcl
These scripts have procedures, which can be called as below:
encounter > userAttachIoDiodesToInputs AP001L => adds AP001L to all inputs near to where the input ports are.
3 main cmds in these scripts:
1. addInst -cell AP001L -inst I_GPIO_user_added => add an instance of AP001L and name it (still unplaced)
2. placeInstance I_GPIO_user_added 1550 8 -placed => place instance of AP001L at (1550,8)
3. attachTerm I_GPIO_user_added A I_GPIO[7] => A is diode pin while I_GPIO[7] is i/p port. Connect these terminals

# Add I/O Buffers
#attachIOBuffer => Adds  buffers  to  the I/O pins of a block and places the buffers near the I/O pins. Buffers are attached and then some of them are flipped to match row orientation (for VDD/VSS hookup).
#IMP: we need to use -markFixed with attachIOBuffer before running place, else place will remove many of them.
#-in or -out => specifies cell name of input or output buffer from the lib.
#-markFixed =>Marks newly-inserted buffers as Fixed.
#-port =>Prepends the port name to the name of the net or instance created.
#-suffix <suffxName> =>Appends a string to name of the net or instance created.
#-selNetFile <selNetFileName>=> Specifies the file that contains the names of nets (or ports in our case) to include in the buffer attachment operation.
#-excNetFile <selNetFileName>=> Specifies the file that contains the names of nets (or ports in our case) to exclude in the buffer attachment operation.
This is useful when we want to add one set of buffers to a few nets, and another set of buffers to all other nets. With exclude, we can use just 1 file.
# Add BU140 on all inputs that do not go to the scan isolation gate (as scan iso already has 4x and gates to its inputs)
attachIOBuffer -port -suffix "_buf" -in  BU140  -markFixed -selNetFile ./scripts/in_bu140_list.txt

# Add 10X buffer on select outputs
attachIOBuffer -port -suffix "_buf" -out BU1A0M  -markFixed -selNetFile ./scripts/out_bu1a0m_list.txt

# Insert BU140 on all outputs that do not have the 10X buffer
attachIOBuffer -port -suffix "_buf" -out BU140  -markFixed -selNetFile ./scripts/out_bu140_list.txt

#To just attach buffers to all i/o ports, don't use any netfile.
attachIOBuffer -port -suffix "_buf" -in  BU140L  -markFixed
attachIOBuffer -port -suffix "_buf" -out  BU140L  -markFixed
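#ex: a hedged sketch of the exclude-file flavor described above (the exclude file name is hypothetical): attach BU140 to all input ports except the ones listed in the file:
attachIOBuffer -port -suffix "_buf" -in  BU140  -markFixed -excNetFile ./scripts/in_exclude_list.txt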

# Set Fix IO so that placement does not move pins around
fixAllIos

# at this stage, reportGateCount should show cell count to be equal to gates from synthesis + IO buffers added.
reportGateCount

# Scan Trace
#specifyScanChain chain1 -start sdi_in -stop U19/B =>Specifies  a  scan  chain  or  group in a design, and gives it a name (ex: chain1 here). -start/stop specifies starting and stopping scan pin names (or inst i/p or o/p pin names).
#scanTrace -lockup -verbose => Traces  the  scan chain connections and reports the starting and ending scan points and the total number of elements in the scan chain. -lockup implies that tracing detects lockup latches automatically. -verbose prints cell inst names of scan chain.  used after specifyScanChain cmd.

# Place standard cells and spares. placeDesign first deletes the buffer tree to get rid of unwanted buffers/inverters. Reads all analysis views, reports total stdcell count (after deleting buffers), does spec file integrity check, moves/flips instances, and then runs placeDesign cmd which does a trial route. It looks for obstruction in Vertical/Horizontal dirn, and shows final congestion distribution. It does resizing, buffering, other DRV fixes (max cap, max tran, etc), calculates delays, fixes timing and then reclaims area by deleting/downsizing cells. It keeps on refining placement and building a congestion distribution map, until it reaches an efficient placement.

setPlaceMode -timingDriven true -reorderScan false -congEffort high -clkGateAware true -modulePlan true => set placement mode options before running placeDesign.
placeDesign -inPlaceOpt -prePlaceOpt => Places  standard  cells based on the global settings for placement, RC extraction, timing analysis, and trial routing. pre-placed buffer tree, etc are removed and optimized.
#-inPlaceOpt  = Performs timing-driven placement with optimization. enables the in-place optimization flow
#-noPrePlaceOpt = Disables the pre-placed buffer tree removal ( or pre-place optimization during the placement run). same as -incremental

# check and save design after placement
checkPlace ./dbs/place/check_place.rpt
checkDesign -noHtml -all -outfile ./dbs/place/check_design_place.rpt
saveDesign ./dbs/place/place.enc

# Perform timing analysis after placement
timeDesign -preCTS -prefix digtop_post_place
-----------
# Add Spares => repeat the steps of creating a spare module (containing some gates) and then placing it repeatedly. Find names of available gates from the ATD page
#-clock <net_name> => specifies clk net to connect to clk pins of seq cells in spare module. Usually we do this to offer balanced clk tree even when spare flops are added during eco. Otherwise, extra load on clk net due to these spare flops may cause some other paths to fail hold/setup, which may not be fixable by metal only change.
#-reset <net_name>:<pin_name> =>  specifies reset net to connect to reset pins of seq cells. If this option is not used, then tieLo option should be used to tie reset pins, else they will be left floating.
#-tie <tie-cell-name> => specifies tie-hi and tie-low cells to add to spare module. w/o this, all pins are connected to 1'b0 or 1'b1 instead of being connected to tie-hi/tie-lo cell o/p.
#-tieLo <pin_names> => default is to tie pins high, unless specified using tieLo.
createSpareModule -cell  {IV120 IV120 IV120 IV120 BU120 BU120 BU120 BU120 AN220 AN220 AN220 AN220 NA220 NA220 NA220 NA220 OR220 OR220 NO220 NO220 EX220 EX220 MU121 MU121 LAL20 TDB21 TDB21 TDB21}  -tie TO010 -tieLo {TDB21:CLRZ  LAL20:CZ} -moduleName spare_mod1
#-area gives the total area coords where we want to place spares. -util is an obsolete parameter
placeSpareModule -moduleName spare_mod1 -offsetx 50 -offsety 300 -stepx 400 -stepy 700 -area { 15 15 2700 1400 }

NOTE: there are designs where we have spare module in RTL itself. In such case, we don't need to create spare module or place it separately in encounter. We run this: (we can use "specifySpareGate" cmd in eco script too, as that is where we need this spare gate info to do eco gate substitution)
#specifySpareGate -inst *Spare* => This lets encounter understand that this instance is a spare module and all gates in it are spare cells, so that it can be treated accordingly.
#specifySpareGate -inst I_scan_iso_out/g1453 => This adds "spare" property on this gate (which is not in spare module) so that it can be used as spare gate during eco.
#set_dont_touch *Spare*/* true => this is so that the tool doesn't remove the gates in Spare module.

# check and save design after placement with spares
checkPlace ./dbs/place/check_place_spares.rpt
checkDesign -noHtml -all -outfile ./dbs/place/check_design_place_spares.rpt
saveDesign ./dbs/place/place_spares.enc
-----------
NOT NEEDED
#optimizations: optDesign optimizes setup time (for worst -ve slack path, and then tries to reduce total -ve slack), corrects drv (for max_tran and max_cap viol), then if specified, corrects holdtime, opt useful skew, opt lkg power and reclaim area. In MMMC mode, it opt all analysis views concurrently. It uses techniques as add/delete buffer, resize gate, remap logic, move instance, apply useful skew.
#optDesign -preCTS|-postCTS|-postRoute -drv|-incr|-hold -prefix <fileNamePrefix> -outDir <dir_name> => w/o any options, it fixes setup and drv violations. -incr can only be used after running optDesign by itself to fix setup viol. -drv fixes drv, while -hold fixes hold viol. -drv|-incr|-hold can only be used one at a time. Default dir is timingReports for writing timing reports. In MMMC mode, optimizes all analysis views concurrently.

# Post-placement optimization => only if needed, repeat steps
setOptMode -effort high => effort level (default is high)
setOptMode -simplifyNetlist false => if true, simplifies netlist by removing dangling o/p, useless/unobservable logic, spares, etc.
setOptMode -fixFanoutLoad true => causes max FanOut design rule violations to be repaired (by default, drv don't fix these)
optDesign -preCTS -prefix digtop_post_place_opt => repairs design rule violations and setup violations  before clock tree is built. -prefix specifies a prefix for optDesign report file in timingReports/<prefix>_hold.summary, etc.
#optDesign -preCTS -drv -prefix digtop_post_place_opt => -drv (design rule violation) corrects max_cap and max_tran violations
#optDesign -preCTS -incr -prefix digtop_post_place_opt => -incr performs setup opt

# check and save design after post-place optimization
checkPlace ./dbs/place/check_place_opt.rpt
checkDesign -all -outfile ./dbs/place/check_design_place_opt.rpt
saveDesign ./dbs/place/place_opt.enc

# Perform timing analysis after placement
timeDesign -preCTS -prefix digtop_post_place_opt

#Save netlist post-placement optimization
#saveNetlist => this saves netlist from top level to leaf cells. Options:
#-excludeCellInst {SPAREFILL4 DECAP10 ..} => excludes specified logical or physical cells. put cell names in {...} or "...".
#-includePhysicalInst : Includes physical instances, such as fillers. Fillers are present in top level module. Physical cells are not present in .liberty files but only in .lef, so by default they are not included in netlist. This is how EDI figures out which cells are physical-only: by looking for cells in .lef which are missing from liberty files. These cells, if put in verilog netlist, will not run timing as there is no timing info for these cells. However, diodes and some other cells are present in liberty files, even though they are physical only cells. This helps them be in netlist so that we can run lvs for schematic vs layout when imported into icfb. Filler cells are just cap, so lvs complains about missing DCAP in schematic, which we then manually add to schematic to make it lvs clean.
#-includePhysicalCell {FILLER5 FILLER10 ..} includes the mentioned physical cell instances in the netlist.
#-excludeLeafCell => writes all of the netlist, but excludes leaf cell definitions in the netlist. This is how the netlist normally looks.
#-includePowerGround => Includes power and ground connections in the netlist file. This will add pwr nets (VDD/VSS) to all cells.

saveNetlist ./netlist/digtop_post_place_opt.v => this saves netlist from top level to leafcells.
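#ex: a hedged sketch combining the options above into one cmd (output file name is made up; cell names are the ones mentioned above, adjust for your lib): write a netlist with pwr/gnd connections and filler instances, but exclude decap/sparefill cells:
saveNetlist ./netlist/digtop_post_place_opt_pg.v -includePowerGround -includePhysicalInst -excludeCellInst {SPAREFILL4 DECAP10}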
-------

CTS => inserts clock tree, synthesize scan clk tree and mclk clk tree.
-----
to view clktree in gui, 2 options:
1. to view tree in text tree format: goto clock->browse clk tree ->set clock to spi_clk or whatever, select Preroute and then OK. Shows the whole hier in tree like structure.
2. to view the actual layout of clktree in gui, goto Clock->Display->Display_clock_tree. Choose "all clocks", display "all level" or start with "selected level 1" and then move to 2nd level and so on.

# Start Clean
#freeDesign

# Import post-place design
#source ./dbs/place/place_opt.enc

#creating clk tree spec file: we can either manually create this file or tool can create one for us from the SDC constraints in effect (here func view is in effect, so func.sdc used).
createClockTreeSpec -file func_clktree.ctstch => SDC mapping to CTS is done as follows. (SDC cmd -> CTS cmd)
#create_clock -> AutoCTSRootPin
#set_clock_transition ->  SinkMaxTran/BufMaxTran  (default is 400ps)     
#set_clock_latency -> MaxDelay(default=clock period), MinDelay(default=0)
#set_clock_uncertainty -> MaxSkew (default=300ps)
#create_generated_clock -> ThroughPin (adds necessary ThroughPin stmt)

# Insert Clock Tree => we have 4 separate clk trees here, but we use spi_clk to build CTS in scan mode, so that only 1 clk tree is built. This covers all clks. If we weren't in scan_mode, then we'd need to build 4 separate clk trees.
set_case_analysis 1 scan_mode_in

#setCTSMode is used in lieu of putting these settings in clk tree spec file. This cmd should be run before running specifyClockTree. Settings in clk tree spec file (in specifyClockTree) take priority.
setCTSMode -useLibMaxCap true => set  all setCTSMode parameters before running the specifyClockTree command.
#-useLibMaxCap true => Uses the maximum capacitance values specified in the timing library.
#-routeBottomPreferredLayer 4 => Specifies the bottom preferred metal layer for routing non leaf-level nets.Default= 3
#-routeTopPreferredLayer 6 => Specifies the top preferred metal layer for routing non leaf-level nets.Default= 4
#-routeShielding VSS => shield nonleaf-level clk nets with net named VSS
#-routePreferredExtraSpace 3 => provide extra spacing of 3 tracks b/w clk and VSS, when routing nonleaf-level nets. Default=1
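#ex: a hedged sketch combining the options above into a single setCTSMode call (values taken from the per-option comments above):
setCTSMode -useLibMaxCap true -routeBottomPreferredLayer 4 -routeTopPreferredLayer 6 -routeShielding VSS -routePreferredExtraSpace 3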

specifyClockTree -file ./scripts/func_clktree.ctstch => Loads  the  clock  tree specification file.
#scripts/func_clktree.ctstch: embed each clk between AutoCTSRootPin and END. Specify Period, MaxSkew, Buffer, ThroughPin.  ThroughPin is used for generated clks, so that skew requirements are maintained for generated clk to master clk. This helps in getting rid of hold violation between flops in master clk to flops in generated clks.
 Ex:
AutoCTSRootPin spi_clk => root pin spi_clk
Period         100ns => default=10ns
MaxDelay       10ns => max delay allowed from clock port of chip to any sink. default=10ns
MinDelay       0ns  => min delay allowed from clock port of chip to any sink. default=0ns
MaxSkew        2000ps => max skew between clk pins of any 2 flops. large value here implies fewer buffers will be injected in clk tree. 2ns allows only 1 or 2 levels of clk tree to be built. hold delays if any will be fixed by adding buffers in data path (burns less power). If we put skew of 200ps, we'll get 4 or 5 levels of clk tree.
SinkMaxTran    600ps => max transition (rise/fall) allowed at sink
BufMaxTran     600ps => max transition (rise/fall) allowed at i/p of any clk tree buffer
Buffer         CTB02B CTB15B CTB201B CTB20B CTB25B CTB30B CTB35B CTB40B CTB45B CTB50B CTB55B CTB60B CTB65B CTB70B => buffer cells to use during automatic, gated CTS
NoGating       NO => trace through clock gating logic. default=NO. If "rising/falling" used => Stops tracing through a gate (including buffers and inverters) and treats the gate as a rising/falling-edge-triggered flip-flop clock pin
DetailReport   YES
ForceMaxTran   YES
#AddSpareFF DTB10 5 => add max of 5 spare DTB10 FF to lowest level of clock tree. i/p of FF are tied to 0 and o/p left floating. These can be used during ECO without disturbing the existing clk tree network.
#SetDPinAsSync  NO => treat Data pin of FF as sync/excluded (default=NO => treat it as excluded pin, i.e don't try to balance to it, YES => try to balance it if CTS is able to trace to it)
#SetIoPinAsSync NO => treat I/O pin as sync/excluded (default=NO => treat it as excluded pin, YES => try to balance it if CTS is able to trace to it)
RouteClkNet     Yes => runs globalDetailRoute on clk tree using nanoroute. (by default, "setCTSMode -routeClkNet true" is set inside clockDesign/ckSynthesis, so globalDetailRoute is always run)
#PostOpt        YES => turns on opt => resizes buffers or inverters or gating components, refines placement, and corrects routing for signal and clock wires. default=YES.
#OptAddBuffer   NO => Controls whether CTS adds buffers during optimization.
#RouteType      specialRoute
#LeafRouteType  regularRoute
ThroughPin => traces thru the pin, even if pin is clk pin.
 + Iclk_rst_gen/clk_count_reg_1/CLK => div by 4 clk generated using this flop. Causes CTS to get clk thru this pin for balancing clk.
 + Iclk_rst_gen/clk_count_reg_2/CLK => div by 8 clk generated using this flop. Causes CTS to get clk thru this pin for balancing clk.
ExcludedPin => to exclude some pins for CTS purpose. CTS will not try to balance clk thru this pin.
END

#NOTE: when we use ThroughPin for a clk pin, then that clk pin which we trace through is treated as an excluded pin, and cts will not try to balance that clk pin with other clk leaf pins. It will actually try to balance the final leaf flops that are connected after going thru that clk pin (i.e connected to Q o/p of such a flop). DynamicMacroModel can be used to balance skew for such flops (see: encounter CTS documentation).
 
#CTB buffers are added as part of clk tree with suffix __L1_ (or L2, L3, etc). Apart from these, __Exclude_0 (or 1, 2, etc) buffers are added by the CTS engine (b/w driver and exclude pin) to exclude the pins specified in the clock tree specification file. This is needed so the driver(s) of the excluded pin(s) does not see a large load if they are located a significant distance apart. All clk pins of flops driven by rootclk are sync pins, and are balanced by CTS. If we use "ThroughPin", then clk pins of these flops driven by divided clk are also treated as sync pins for CTS and will be balanced. All other pins are treated as "exclude" pins, meaning they are async and CTS doesn't consider them when doing CTS. So, throughpins for clk above will be treated as async and any buffers added to drive clk pins of these will be marked as __Exclude_. These exclude buffers as well as the flops connected to them don't show up in clock tree browser (in VDIO). They are not considered part of normal clk tree. So, total number of flops shown by CTS may not be the same as the total number of flops in the design, due to "excluded" flops.
#To see list of all flops not in any clk tree, open clk tree browser from VDIO panel, and click on Tool->List->FF not in clk tree. Over here, apart from spare flops, we'll see all "throughpin" flops on which exclude buffers have been added, all spi flops and regfile flops.

#actual clk tree synthesis: cksynthesis resizes inv,buf and clk gating elements unless they have been marked as dont touch. Clock gating components consist of buffers, inverters, AND gates, OR gates, and any other logical element  (defined  in the library) that appears in the clock tree before CTS synthesis inserts any buffers or inverters. Then globalDetailRoute is run to route clk nets
#-forceReconvergent=> Forces CTS to synthesize a clock with self-reconvergence or clocks with crossover points. Without this option, CTS halts and issues errors. To synthesize clocks with crossover points, list such clocks together in the clock tree specification file.
ckSynthesis -report ./dbs/cts/clockt.report -forceReconvergent => Builds  clock trees, routes clock nets, and resizes instances, depending on the parameters you specify.  These routes/placement are not touched again during signal routing.
#clockDesign -specFile Clock.ctstch -outDir ./dbs/cts => optional: this 1 cmd replaces the 2 cmds above (specifyClockTree and ckSynthesis; these 2 cmds are called in the background). It provides clock_report in ./dbs/cts/clock.report

#CTS reports: On screen, first we see res/cap tables being read for all views (MAX/MIN), then it reads clktree spec file, then it runs ckSynthesis. It does various checks for clk pins, then it builds clk tree, shows subtree 0 (tree from clk i/p port), subtree 1 (tree from first driving gate of clk) and so on. It tries to satisfy the constraints in clktree spec file across all active views. It then does routing and again tries to satisfy all constraints.
#In CTS report, we'll see many subtrees, each of which corresponds to bunch of flops driven directly by the driver.
Ex: on the main screen, we see reports like this for one of the subtrees:
SubTree No: 5 => represents that it is subtree 5, and has all flops driven by driver shown below
Input_Pin:  (Iclk_rst_gen/clk_gate_reg/latch/CLK) => i/p pin of driver
Output_Pin: (Iclk_rst_gen/clk_gate_reg/latch/GCLK) => o/p pin of driver
Output_Net: (Iclk_rst_gen/n27) => net name of clk that is driving bunch of flops on this clk tree.
*** Find 2 Excluded Nodes. => there are 2 excluded nodes on this clktree, which aren't going to be part of CTS.
**** CK_START: TopDown Tree Construction for Iclk_rst_gen/n27 (5-leaf) (1 macro model) (mem=491.2M) => no. of leaf elements is 5, this includes flops as well as buf/clk-gaters for another subtree.
Total 2 topdown clustering.
Trig. Edge Skew=725[532,1257] N5 B0 G2 A0(0.0) L[1,1] score=900 cpu=0:00:00.0 mem=491M
**** CK_END: TopDown Tree Construction for Iclk_rst_gen/n27 (cpu=0:00:00.0, real=0:00:00.0, mem=491.2M)

#set_interactive_constraint_modes {<list_of_constraint_modes>} => Puts  the  software  into  interactive  constraint entry mode for the specified multi-mode multi-corner constraint mode objects. Any timing constraints that you specify after this command take effect  immediately  on  all  active  analysis views that are associated with the specified constraint modes. The  software  stays in interactive mode until you exit by specifying the set_interactive_constraint_modes command with an empty list: set_interactive_constraint_modes { }

set_case_analysis 0 scan_mode_in => we exit out of scan mode back to normal func mode

# Check and save design after clocktree insertion
checkPlace ./dbs/cts/check_place.rpt
checkDesign -noHtml -all -outfile ./dbs/cts/check_design_cts.rpt
saveDesign ./dbs/cts/cts.enc

# Add set_propagated_clock by entering interactive mode
set_interactive_constraint_modes [all_constraint_modes -active]
set_propagated_clock [all_clocks] => propagates delay along clk n/w (accurate only after CTS) from clk source to reg clk pin. We can also specify clk src latency (latency from external src to clk port) using set_clock_latency. Total latency is sum of clk src latency and propagated delay. To specify uncertainty for external src latency, use -early or -late, and tools choose worst one for setup/hold. To specify internal uncertainty (for skew or variation in the successive edges of clk wrt exact clk), use set_clock_uncertainty.
set_interactive_constraint_modes { }
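#Hedged sketch (not part of this flow): the set_clock_latency/set_clock_uncertainty cmds mentioned above can be entered in the same interactive mode. Clock name "clk" and the numbers are illustrative only:
#set_interactive_constraint_modes [all_constraint_modes -active]
#set_clock_latency -source -late  1.5 [get_clocks clk] => clk src latency from external src to clk port (late value used for setup)
#set_clock_latency -source -early 1.2 [get_clocks clk] => early value used for hold
#set_clock_uncertainty -setup 0.2  [get_clocks clk] => internal uncertainty (skew/jitter) eats into setup margin
#set_clock_uncertainty -hold  0.05 [get_clocks clk]
#set_interactive_constraint_modes { }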

# Timing Analysis after CTS before optimization
timeDesign -postCTS -prefix digtop_post_cts
timeDesign -postCTS -hold -prefix digtop_post_cts -numPaths 50

# Post CTS optimization
#setOptMode -effort high
#setOptMode -simplifyNetlist false
#setOptMode -fixCap true -fixTran true -fixFanoutLoad false => this says which drv viol need to be fixed (usually FO fix not needed)
#-postCTS repairs design rule violations and setup violations after clk tree has been built. -hold will fix hold violations also. -incr performs setup opt if needed further.
optDesign -postCTS -prefix digtop_post_cts_opt => fixes drv and setup viol only (if -hold added here, then it fixes hold viol only. if -drv, then it fixes drv viol only)
optDesign -postCTS -hold -prefix digtop_post_cts_opt => fixes hold viol only.

IMP: all viol should be fixed by now, as from here on, no gates can be added. So, only minor viol related to routing can be fixed. If any gross viol remains at this point (post CTS opt), it will never get fixed in the later route/post-route steps.

# Check and save design after clocktree insertion post optimization
#checkDesign -all -outfile ./dbs/cts/check_design_cts_opt.rpt
#saveDesign ./dbs/cts/cts_opt.enc

# Timing analysis after optimization
#timeDesign -postCTS -prefix digtop_post_cts_opt
#timeDesign -postCTS -hold -prefix digtop_post_cts_opt

# Save netlist post CTS optimization
saveNetlist ./netlist/digtop_post_cts_opt.v

Route => runs nanoroute to route it, cmd is routeDesign
------
# Import Post CTS design => file from post CTS opt step above
#source ./dbs/cts/cts_opt.enc

#setting SI (noise) driven and Timing driven to true enables the SMART algo (Abbreviation for Signal integrity, Manufacturing Awareness, Routability, and Timing). by default, nanoroute takes into account both timing and SI while routing. If timing driven is set to false, it uses an older algo. Use the options -routeSiEffort and -routeTdrEffort to adjust the effort level for SI and Timing Driven routing, respectively. These options fine-tune the priorities the router assigns to timing, signal integrity, and congestion. All these options can be selected using gui: route->nanoroute->route.

setNanoRouteMode -routeWithTimingDriven true => minimize timing violation by causing most crit nets to be routed first.
#setNanoRouteMode -routeTdrEffort 0:10 => effort level with tdr (timing driven route). 0 => fully congestion driven while 10 => fully timing driven
setNanoRouteMode -routeWithSiDriven true =>  minimize crosstalk violation by wire spacing, layer hopping, net ordering and minimizing the use of long parallel wires.
#setNanoRouteMode -routeSiEffort {high | medium | low } => default is high when timing driven is set to true else default is low. set to high for congested designs (since congested designs have SI problems), low for non congested.

#specify top and bottom routing layers (by default bot/top routing layers are ones specified in tech lef file).
setNanoRouteMode -quiet -routeBottomRoutingLayer default => specifies lowest layer nanoroute uses for routing. Layers can be specified using the LEF layer names or layer ID numbers. default is lowest layer specified in lef file. range is 1-15 => 1 means metal1 and so on. If POLY is defined as routing layer in tech lef file, then POLY is assigned layer id 1, METAL1 is layer id 2 and so on.
setNanoRouteMode -quiet -routeTopRoutingLayer default => Specifies the highest layer the router uses for routing. default is the highest layer specified in lef file. range is 1-15.

#specify iterations for nanoroute. nanoroute first does global route, then starts detail routing from iteration 0 to 20(max) in steps. Iterations after 0 do not run routing. Instead, they run search and repair. Iterations after 20 run post route opt. start and end iterations are set by default to 0.
setNanoRouteMode -drouteStartIteration default => Specifies the first pass in a detailed routing step.
setNanoRouteMode -drouteEndIteration default => Specifies the last pass in a detailed routing step. set to default (which implies run post route opt). If set to some number, antenna violations will not get fixed

#Pitch/Mgrid options
#setNanoRouteMode -quiet -drouteOnGridOnly none|via|all => we use this option to control off-grid (off-track) routing. Note: grid means track in nanoroute which is Metal1 pitch. 3 options:
 all => no off grid routing of vias or wires, via => no off grid routing of vias (wires may still go off grid), none => no restriction, off grid routing allowed (default)
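#ex (illustrative usage of the option above): to keep vias on track but still allow off-track wires, we could do:
#setNanoRouteMode -quiet -drouteOnGridOnly via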
#OBSOLETE: setNanoRouteMode -drouteHonorMinFeature true => This is to honor Manufacturing Grid. this is set by default to true if MANUFACTURINGGRID is set in tech lef file. In future releases, this is not needed as nanoroute is always going to honour MGrid.

#antenna violation options
#setNanoRouteMode -routeIgnoreAntennaTopCellPin => Ignores antenna violations on top-level I/O block pins, but repairs antenna violations elsewhere. default is true, so no need to set it.

#antenna violations can be fixed by 2 ways: 1. layer hopping 2. Antenna diode insertion.
1. layer hopping:
setNanoRouteMode -drouteFixAntenna True => This can be used when antenna viol are the only violations, and we want to just fix these. Do a "setNanoRouteMode -reset" before running this
2. Antenna diode insertion:
setNanoRouteMode -routeInsertAntennaDiode true => nanoroute searches in LEF for cells of type ANTENNACELL specified in the LEF MACRO statement. These cells will be used for diode insertion, provided diffusion area is specified for the antenna cell (ANTENNADIFFAREA) so Nanoroute understands that adding this cell to the net will reduce the process antenna effects for the gates connected to it. (First Nanoroute will use layer hopping, and if violations still remain, it will do diode insertion)
#NOTE: Antenna diodes are not inserted for ECO routing. Also by default, antenna diodes are not inserted on clock nets, since clock nets are don't touch (and also clk nets are routed first, so lot of flexibility in layer hopping allows all antenna viol to be fixed). To have antenna diodes on clock nets, use:
#setNanoRouteMode -routeInsertDiodeForClockNets true

#setNanoRouteMode -reset => resets all setNanoRouteMode parameters to their default values
#getNanoRouteMode => displays everything that's set for nanoroute. good for sanity check.

#Nanoroute
routeDesign -globalDetail => (equiv to "globalDetailRoute") runs global and detailed routing (by default). It's timing and SI driven by default, but we can set both of these false.
#global routing is the initial phase, where the tool plans global interconnect and produces a congestion map. During this phase, NanoRoute breaks the design into rectangles called global routing cells (gcells). It finds connections for the regular nets defined in the NETS section of the DEF file by assigning them to the gcells. The goals of global routing are to distribute and minimize congestion and to minimize the number of gcells that have more nets assigned than routing resources available.
#detail routing is when NanoRoute builds the detailed routing database. Then it routes the wires that connect the pins to their corresponding nets, following the global routing plan. During the search-and-repair phase of detailed routing, NanoRoute repairs design rule violations. The primary goal of detailed routing is to complete the interconnect without creating shorts or spacing violations. Tech lef file has all DRC rules (which are mostly spacing rules for vias and metal lines).
#from VDI gui, we can run routing using route->nanoroute->route. choose timing driven and set scale to 5. (for congestion driven, set scale to 0)

#to add antenna diodes manually to internal nets which still have violations, after routing is done, use this:
attachDiode -prefix <custom_diode> -diodeCell <diodeCellName> -pin <instName> <termName> => adds antenna diode to named pin of named inst.
ex: attachDiode -prefix custom_diode_input -diodeCell AP001 -pin inst1/reg_gater_4 PREZ => adds diode cell AP001 to pin PREZ of instance inst1/reg_gater_4. Names it with prefix "custom_diode_input" so that it's easier to recognize it.

# Check and Save design after route
checkPlace ./dbs/route/check_place.rpt
checkDesign -all -outfile ./dbs/route/check_design_route.rpt -noHtml
saveDesign ./dbs/route/route.enc

# Remove any interactive constraints entered before (set_propagated to respecify)
update_constraint_mode -name functional -sdc_files \
    [list /db/BOLT/design1p0/HDL/Autoroute/digtop/Files/input/bolt_constraints.sdc \
          /db/BOLT/design1p0/HDL/Synopsys/digtop/tcl/clocks.tcl \
          /db/Hawkeye/design1p0/HDL/Synthesis/digtop/tcl/case_analysis.tcl \
          /db/BOLT/design1p0/HDL/Synopsys/digtop/tcl/false_paths.tcl \
          /db/BOLT/design1p0/HDL/Synopsys/digtop/tcl/constraints.tcl \
          /db/BOLT/design1p0/HDL/Synopsys/digtop/tcl/multicycle_paths.tcl \
          ./scripts/case_analysis.sdc \
          ./scripts/pre_place_constraints.sdc]

# Add set_propagated_clock by entering interactive mode
set_interactive_constraint_modes [all_constraint_modes -active]
set_propagated_clock [all_clocks]
set_interactive_constraint_modes { }

# extractRC options - PostRoute, Non-Coupled, Native Extractor (low effort). Here, we have switched from cap table lookup (*.capTbl in vdio dir in pdk) to RC extractor for more accurate delays.
#setExtractRCMode: sets rc extraction mode for extractRC cmd.
setExtractRCMode -reset => all setExtractRCMode parameters are reset to default value.
setExtractRCMode -engine postRoute => postroute uses postroute engine where RC extraction is done by detailed measurement of distance b/w wires, and coup cap is reported. preroute uses preroute engine where RC extraction is done by fast density measurement of surrounding wires, and coup cap is not reported. use option -engine postroute with -effortLevel <high or signoff> to achieve greatest accuracy.
setExtractRCMode -coupled false => false implies coupling cap is lumped to ground, typically used for STA. For SI analysis, this should be set to true so that coupling cap is output separately from gnd cap.

setExtractRCMode -effortLevel low => low invokes native extraction engine (lowest accuracy), medium invokes TQRC (Turbo QRC), high invokes IQRC (Integrated QRC), while signoff invokes standalone QRC (highest accuracy). Version of QRC to be used is fixed for a particular Encounter version, but we can change it by specifying it in .amerc file in vdio dir as follows:
.amerc: ext-10.1.2_HF1 => add this line for extractRC to pick up this version of RC extractor

#setExtractRCMode [-total_c_th, -relative_c_th, -coupling_c_th] <value> => there are 3 separate parameters: total_c_th, relative_c_th, coupling_c_th. These determine the threshold at which the coupling cap of nets will be grounded. We don't set this option in our flow, as the default values based on process node (using setDesignMode -process command) take care of it.
#setExtractRCMode -total_c_th <cap> => If total cap for nets < total_c_th, coupling cap is grounded (default=5ff but adjusted based on process node).
#setExtractRCMode -coupling_c_th <cap> => If coupling cap (NOT total cap) for nets < coupling_c_th, coupling cap is grounded (default=3ff but adjusted based on process node),
#setExtractRCMode -relative_c_th <ratio> => If the total coupling cap b/w  a pair of nets is less than the percentage (specified with this parameter) of the total cap of the net with the smaller total cap in the pair, the coupling cap b/w these two nets will be grounded (default=0.03).

#setExtractRCMode -capFilterMode  <relOnly | relAndCoup | relOrCoup> => this option is used only when -coupled is set to true above. default is relAndCoup for process node below 130nm, else default is relOnly. process node is set using setDesignMode -process command.
#any setting => if net's cap < total_c_th, then coupling cap grounded regardless of the -capFilterMode setting.
#relOnly => if net's coupling cap < relative_c_th, then coupling cap grounded.
#relAndCoup => if net's coupling cap < relative_c_th and coupling_c_th, then coupling cap grounded. most restrictive.
#relOrCoup => if net's coupling cap < relative_c_th or coupling_c_th, then coupling cap grounded.
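#Worked example (hypothetical numbers, assuming the 150nm defaults: total_c_th=5ff, coupling_c_th=3ff, relative_c_th=0.03): net A has total cap 20ff and a 2ff coupling cap to net B (total cap 30ff). Total cap of A (20ff) > total_c_th (5ff), so nothing is grounded on that basis. The 2ff coupling cap is < coupling_c_th (3ff), but it is NOT < relative_c_th (0.03*20ff = 0.6ff, using the smaller total cap of the pair). So with relOnly or relAndCoup the 2ff stays as coupling cap, while with relOrCoup it gets lumped to gnd.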

#setDesignMode    -process 150 => implies process is 150nm and above. We adjust this based on what nm process we are using so that the tool automatically adjusts coupling cap thresholds. For 150nm, total_c_th=5, relative_c_th=0.03 and coupling_c_th=3. For lower nm tech, the thresholds are adjusted so that smaller coupling caps are kept as coupling instead of being lumped to gnd.

#extractRC => not needed to run, since "setExtractRCMode" automatically invokes extractRC when timedesign is run below.

#instead of using setExtractRCMode, we can also use setDelayCalMode
#setDelayCalMode -engine Aae -SIAware false

# Timing Analysis after route
timeDesign -postRoute -prefix digtop_post_route
timeDesign -postRoute -hold -prefix digtop_post_route -numPaths 140

# Post-route optimization => we need this, since routing may have introduced some hold and drv violations.
setOptMode -effort high
setOptMode -maxDensity 0.98 =>Specifies the maximum value for area utilization. optdesign does not grow the netlist above this value.
setOptMode -holdTargetSlack 0.1 -setupTargetSlack 0.05
setOptMode -simplifyNetlist false
#-postRoute repairs design rule violations and setup violations after routing is done. -hold will fix hold too. usually need to fix hold and drv
#optDesign -postRoute -prefix digtop_post_route_opt => to fix setup and drv
optDesign -postRoute -hold -prefix digtop_post_route_opt => fix hold
optDesign -postRoute -drv -prefix digtop_post_route_opt => fix drv

# Timing Analysis after route opt
timeDesign -postRoute -prefix digtop_post_route_opt
timeDesign -postRoute -hold -prefix digtop_post_route_opt

# Check and Save design after optimization
checkPlace ./dbs/route/check_place_opt.rpt
checkDesign -all -outfile ./dbs/route/check_design_route_opt.rpt
saveDesign ./dbs/route/route_opt.enc

# Save netlist post-route optimization
saveNetlist ./netlist/digtop_post_route_opt.v

STA: Here we run timing in all modes
----
set_analysis_view -setup {func_max func_min scan_max scan_min} -hold {func_max func_min scan_max scan_min} => imp to run timing in all modes as there might be setup/hold paths in other views which may show up in PT, but may never get optimized in VDIO. i.e there may be hold paths in func_max and setup paths in func_min which will need to be fixed here.

Run timing, Repeat opt step if necessary as in route step, rerun timing, then check and save.
#-postRoute repairs design rule violations and setup violations after routing is done. -hold will fix hold too. usually need to fix hold and drv after running STA, since some paths might start failing because we have enabled timing for SCAN mode also, so new paths may pop up.
#optDesign -postRoute -prefix digtop_post_route_sta_opt

Then check area:

set dbgSitesPerGate 5 => /db/pdk/lbc7/.../lef/msl270_lbc7_core_2pin.lef  leffile defines coresite size at the top of lef file. This coresite shows the min x dimension that any gate can have. It's basically M2 pitch, as we allow gate widths to be in multiple of M2 pitch. We take x dimension of  nand2 x1 (NA210) gates, which is usually 4 or 5 times of this M2 pitch and set dbgSitesPerGate to that number. For this case, CORESITE size is 0.9x11.0, while NA210 has size 4.5x11.0, so dbgSitesPerGate is 4.5/0.9 = 5. This number is very important and changes with process tech. For LBC8, it's 6.8/1.7 = 4.

#If you look at layout of NA210 in lbc7_2pin, it's 4.5x11um (it's in um) with Lmin=0.4um (400nm). It has 3 metal1 lines (min W=0.3um) and 2 poly lines running vertically. So, width and spacing of these 5 lines sets the x dimension of the cell. In contrast, IV110 has area of 3.6x11um. It has 2 metal1 lines and 1 poly line. Reason for such a large area is to leave space for routing

#gatecount of imported design, and what VDI has currently (we can also use cmd "checkFPlan -util" or "checkPlace" instead of "reportGateCount" to see current design's stats. reportGateCount should be used instead of reportDesignUtil as it's supported cmd)
reportGateCount -level 5 -outfile gatecount_sta.rpt => gives size of the imported design in terms of gatecount. Physical cells (as FILLER, etc) are not reported in this. -stdCellOnly reports stdcells only (no IP_blocks / IO cells reported). -module <modulename> reports gate count for named module. -level reports gate counts for sub-hier upto that level deep. So very useful to see where size increase is coming from.
For gatecount, this is the formula used by VDI: gateCount = moduleArea / gateSize, where
moduleArea  is the area of the module (sum of the areas of all instances inside of the module, including standard cells, blocks, and I/O cells),
gateSize = dbgStdCellHgt x dbgDBUPerIGU x dbgSitesPerGate
dbgStdCellHgt is the standard cell row height, dbgDBUPerIGU is the M2 layer pitch, dbgSitesPerGate is a user-defined global variable that determines the gate size the software assumes when calculating the gate count. For example, the default value of 3 means the assumed gate size is equal to 1 standard cell row height and 3 M2 layer pitch widths. So, gatesize is basically in terms of M2 pitch, so we set "dbgSitesPerGate" parameter above to get gatesize in terms of NA210 size.

for our case, gatesize = 11.0x0.9x5=49.5um^2 (size of a nd2x1 gate). Note: we got M2 pitch by looking in vdio/lef/msl270_lbc7_tech_2layer.lef. (To confirm the area of the nd2x1 gate, we can also look in vdio/lef/*_2pin.lef and get the exact X-Y dimensions of the nd2 gate)
So, this reports total no. of gates in terms of equiv nd2x1 (NA210) gates. It reports total cell area (area occupied by module), and total gates =  total_area/nd2x1_area = 106584/49.5 = 2153 gates. It also reports total no of cells placed (cells mean instances of stdcells, i.e flop is 1 cell). It also gives density which is calc as area occupied by cells divided by the total area of the block.
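#A minimal Tcl sketch of the gatecount formula above (for illustration only; numbers are the LBC7 values quoted above and change per process):
#set m2_pitch    0.9  ;# dbgDBUPerIGU = M2 pitch from tech lef
#set row_height 11.0  ;# dbgStdCellHgt = stdcell row height
#set na210_w     4.5  ;# X dimension of NA210 (nd2x1) from 2pin lef
#set dbgSitesPerGate [expr {round($na210_w / $m2_pitch)}]          ;# 4.5/0.9 = 5 (same var set at the top of this step)
#set gateSize [expr {$row_height * $m2_pitch * $dbgSitesPerGate}]  ;# 49.5 um^2
#puts "equiv nd2x1 gates = [expr {106584.0 / $gateSize}]"          ;# ~2153 gates for the module area above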

#checkPlace => reports placement density and no. of placed and unplaced instances.

#NOTE: DC report_area gives area by looking at area field in .lib file (in synopsys/src dir) for each cell.  For our case it's in terms of nd2x1 gate, since NA210 is assigned an area of 1, and all other stdcells have an area relative to this. So, if it says "Total cell area: 1750" => total area is 1750 nd2x1 gates or 1750*49.5 um^2 = 86625 um^2. to compare gate area, we just compare DC cell area with VDI Gates count (both of which are in terms of nd2x1). This shows us what are the extra no. of gates added post route.

SIGNOFF
---------
#set extract rc mode to signoff, extract RC.
setExtractRCMode -effortLevel    signoff => signoff is used here as it's the most accurate
setExtractRCMode -coupled        true => coupling cap to be kept
setExtractRCMode -capFilterMode  relAndCoup

setDesignMode    -process 150 => implies process is 150nm and above.

extractRC => Extracts  resistance  and  capacitance  for  the interconnects and stores the results in an RC database. done after routing

#time design
timeDesign -signoff -reportOnly       -prefix digtop_post_route_signoff
timeDesign -signoff -reportOnly -hold -prefix digtop_post_route_signoff

SIGNAL INTEGRITY
----------------
Cadence CeltIC is the signal integrity analyzer in the Encounter platform. It performs noise analysis (impact of noise on both delay and functionality) and feeds repairs back into PnR. Noise libs (.cDB) are created for efficiently characterizing cells

setExtractRCMode -reset
setExtractRCMode -engine         postRoute
setExtractRCMode -effortLevel    signoff
setExtractRCMode -coupled        true
setExtractRCMode -lefTechFileMap ./scripts/qrc_layer_map.ccl
setExtractRCMode -capFilterMode  relAndCoup

setDesignMode    -process 150

#set view, propagate clk and set ocv/cprr as during route.
set_analysis_view -setup {func_max func_min scan_max scan_min} -hold {func_max func_min scan_max scan_min}

set_interactive_constraint_modes [all_constraint_modes -active]
set_propagated_clock [all_clocks]
set_interactive_constraint_modes { }

setAnalysisMode -analysisType onChipVariation -cppr both

#delay calc mode: used when optimizing design
setDelayCalMode -engine signalStorm -signoff true => for signoff, use signalstorm delay calculator. default is feDc which is EDI delay calculator. -signoff enables signoff quality (highest accuracy) delay calc mode.

#set SI mode
setSIMode       -reset => resets all param to default
setSIMode       -analysisType default -acceptableWNS same => analysis type resets parameters to default or pessimistic settings. acceptableWNS Specifies the worst negative slack (WNS) that is acceptable for the design. same means keep slack same as before SI, usually 0. Or we can provide the WNS value.

setSIMode -insCeltICPreTcl  { source scripts/pre_celtic.tcl } => Changes the default environment variable values to the specified values. sets these parameters. message_handler -set_msg_level ALL; message_handler -set_msg_level ALL
setSIMode -insCeltICPostTcl { source scripts/post_celtic.tcl} => Executes the specified CeltIC NDC commands after the SI analysis engine performs noise analysis. It runs these cmd: generate_clock_report -reverse_slope_limit -1 -nworst 10 -file timingReports/clock_report.rpt, generate_report -txtfile timingReports/noise.rpt.

#timeDesign Runs trial route, extraction and timing analysis. also generates detailed timing reports. -signoff calls QRC for extraction. -si generates glitch violation report and incremental sdf (backannotates an incr.sdf) to calc WNS due to noise. runs SI timing in MMMC mode (all active views), and shows worst case timing.
timeDesign -signoff -si -prefix si_setup
timeDesign -signoff -si -hold -prefix si_hold

# Report Timing including incremental delays for setup/hold. This reports timing for all analysis views that are in effect at this point using "set_analysis_view" cmd (which is FUNC_MAX/MIN, SCAN_MAX/MIN for both setup/hold)
#set_analysis_view -setup {func_max func_min scan_max scan_min} -hold {func_max func_min scan_max scan_min} => change analysis view if you need timing only for a particular view i.e FUNC_MAX.

setAnalysisMode -checkType setup => default is setup.
report_timing -nworst 1 -max_points 500 -check_type setup -net -path_type full_clock -format {instance arc cell slew delay incr_delay arrival required} > timingReports/report_timing_setup.rpt

setAnalysisMode -checkType hold
report_timing -nworst 1 -max_points 500  -check_type hold -net -path_type full_clock -format  {instance arc cell slew delay incr_delay arrival required} > timingReports/report_timing_hold.rpt

reportDelayCalculation -from Itimergen/U185/Y -to Itimergen/U1925/A1

#fixing SI.
setOptMode -effort high
setOptMode -maxDensity 0.98
setOptMode -usefulSkew false
setOptMode -holdTargetSlack 0.1 -setupTargetSlack 0.05

#optDesign: -postRoute fixes both setup(incr) and drv if nothing specified. -hold fixes hold violations also. -si corrects glitch and setup violations caused by incremental delays due to coupling cap. -si can only be used with -postroute.
#optDesign -signoff -postRoute -si
optDesign -signoff -postRoute -hold -si -incr

#NOTE: after optdesign finishes it shows setup/hold slack without SI. When we run timeDesign or report_timing, it shows slack with SI. So, the slack with SI will always be lower than what optDesign reports.

timeDesign -signoff -si       -prefix si_setup_opt
timeDesign -signoff -si -hold -prefix si_hold_opt

#setup
setAnalysisMode -checkType setup => mode has to be "setup" or else report_timing won't report timing for setup. default is setup, so this cmd not needed.
report_timing -nworst 1 -max_points 500 -check_type setup -net -path_type full -format {instance arc cell slew delay incr_delay arrival required} > timingReports/report_timing_si_opt_setup.rpt
#hold
setAnalysisMode -checkType hold => mode has to be "hold" or else report_timing won't report timing for hold.
report_timing -nworst 1 -max_points 500  -check_type hold -net -path_type full -format  {instance arc cell slew delay incr_delay arrival required} > timingReports/report_timing_si_opt_hold.rpt

setDelayCalMode -considerMillerEffect true
setUseElmoreDelayLimit 300
set_global timing_cppr_self_loop_mode true
set_global timing_disable_bidi_output_timing_checks false
set soceSupportWireLoadModel 1

checkPlace ./dbs/si_fix/check_place.rpt
checkDesign -all -noHtml -outfile ./dbs/si_fix/check_design_sta.rpt
saveDesign ./dbs/si_fix/si_fix.enc


FILLER => add filler cells. Fillers maintain continuity of VDD/VSS and of NWELL/PWELL. after running filler, placement density will go to 100%, so you can't place anything more. Go back to the post route step to do any opt.
#NOTE: filler cells are not defined in .lib file, as they don't have any function or timing. So, when we add filler cells, these don't get saved in verilog netlist (as only the cells in .lib are used for verilog netlist), but are saved in the def file.
-------
There are 2 filler cell flows:
1. Normal filler cells: Here, filler cells are just poly.
addFiller -cell SPAREFILL1 SPAREFILL2 FILLER_DECAP_P6 -prefix FILLER_NORMAL => -prefix adds a prefix to all these cells so it's easy to identify this.

2. ECO filler cells: Here, filler cells are eco cells (gate array cells) which can be converted to any desired gate by just altering metal layers (they require extra CONT mask too). We fill with ECO filler cells and then with normal filler cells.
addFiller -cell  FILLER5LL FILLER10LL FILLER15LL FILLER20LL FILLER25LL FILLER30LL FILLER40LL FILLER50LL FILLER55LL -prefix FILLER_ECO => ECO cells added first so that we can add as many of these cells. ECO cells width are multiple of X-grid, so there may be single grid gaps in design after placing ECO cells which can be filled by normal filler cells.
addFiller -cell  SPAREFILL1LL SPAREFILL2LL SPAREFILL4LL SPAREMOSCAP3LL SPAREMOSCAP4LL SPAREMOSCAP8LL -prefix FILLER_NORMAL => normal cells added later so that any remaining space not filled by ECO cells will be filled with these normal filler cells.
#To see ECO filler cells only on the gui: do
selectInst *FILLER_ECO* => selects all filler eco on gui, to help us see if they are uniformly placed.
#To find total num of FILLER cells used, goto Tools->DesignBrowser. On new window do find=Instance and then search for *FILLER_50* => This will show all fillers which are FILLER50. Select all of them from the list below (by using left mouse) and they will be highlighted on gui. We can also count the num of fillers this way to see how many of them are there for ECO purpose. Repeat for other filler cells. Filler cells are numbered sequentially for ECO fillers and NORMAL fillers, so it's easy to count them. These filler cells cannot be counted by using any script, as these filler cells don't exist in the verilog netlist of the enc database (enc.dat).
checkFiller => reports any gaps found inside the core area where there are no filler cells. shows up on gui on all such missing places. Make sure these gaps are OK
 
# Check and save design
checkPlace ./dbs/filler/check_place.rpt
checkDesign -all -noHtml -outfile ./dbs/filler/check_design_filler.rpt
saveDesign ./dbs/filler/filler.enc

Final Check => does final checks
-------------
# Start Clean
freeDesign

###############################################
# Import post sta design
source ./dbs/filler/filler.enc

###############################################
# Verify Connectivity/Geometry/Antenna

#verifyConnectivity => Detects  conditions such as opens, unconnected wires (geometric antennas), unconnected pins, loops, partial routing, and unrouted nets; verify connectivity can also be chosen from Gui thru top panel: Verify-> Verify connectivity. Choose Net type to "all" to check all types of nets (regular/special) or "Regular only" to exclude special nets as PG nets (-noSoftPGConnect also disables checking of soft Power/Ground connects). -geomConnect uses geometric model instead of centerline model so that if the wires overlap at any point, they are considered to be connected, they do not have to connect at the center line. For check types, click appr box. Provide name/path for conn rpt.
verifyConnectivity -type all -error 1000 -warning 50 -report ./dbs/final_check/connectivity.rpt => checks for all net types, all nets and all default checks.

#verifyGeometry => checks for width, spacing, shorts, off routing/manufacturing grid, via enclosure, min cut, and internal geometry of objects and the wiring between them. Many options can be added on cmd line or using gui. -allowRoutingBlkgPinOverlap allow routing obstructions to overlap pins.
verifyGeometry -allowRoutingBlkgPinOverlap -report  ./dbs/final_check/geomtry.rpt

#verifyProcessAntenna => Verifies process antenna effect (PAE) and maximum floating area violations. -pgnet checks tie-high and tie-low nets also for AE. -noIOPinDefault specifies that ANTENNAINPUTGATEAREA, ANTENNAINOUTDIFFAREA, ANTENNAOUTPUTDIFFAREA keywords from lef file are not applied to IO pins. These options can be chosen from GUI too.
verifyProcessAntenna -error 1000 -reportfile ./dbs/final_check/antenna.rpt -leffile ./dbs/antenna.lef

#check for max_cap/max_tran/max_fanout violations
reportTranViolation => reports transition vio on all nets (>4ns or limit specified in .lib file)
reportCapViolation => reports cap vio on all nets (>150ff or limit specified in .lib file)
reportFanoutViolation

#optional: additional checks
verifyPowerVia
checkTieHiLowTerm
checkAssignStatement
checkPhyInst
checkFloatingInput
checkFeedbackLoop
checkSpareCell
checkNetCollision
checkLECDir

summaryReport -noHtml -outfile summaryReport.rpt -outdir ./dbs/final_check => reports stats for entire design.

Look in dbs/final_check/*.rpt for conn,ant,geom violations, and summaryReport.rpt for all other report. Also look in checkPlacement.rpt and checknetlist.rpt.
 
Export => exports design
-------
Need to give .def file (for place and route info) and .v file (for running simulation on top level). Also, need to give spef file to digital simulation team (for cap,res, other extracted parameters to run gate level simulation with these parasitics back annotated).

SPEF: standard parasitic exchange format. part of "IEEE 1481-1998" std for IC delay and Power calculation system. Part of Open Verilog International's delay-calculation-system (DCS) standard. Based primarily on SPF (std parasitic format [includes DSPF and RSPF], useful in Spice sims), SPEF has extended capability and a smaller format. represents parasitic data of wires in a chip in ASCII format for parasitic parameters R (ohm), C (farad) and L (henry) for RC (or RLC) timing modeling. Used after layout to back-annotate timing for STA & simulation

SDF: standard delay format. while spef contains actual RLC values, these are annotated in STA tools (like PT) and wire delays calculated. These wire delays (from spef file) along with cell delays  (from liberty files used during synthesis) are then put in sdf file (no info abt RC here), which can then be used by STA tools to generate timing. RC extraction tools generate spef file, while STA tools use this to generate SDF file.

#export native or/and QRC coupled min/max spef file (native is a crude extractor using a cap look-up table, while QRC is the assura extractor which solves Maxwell's equations in 3D).
NOTE: extractRC has been run many times previously, but we never generated spef files. So, we run it again to make sure we get clean extraction. All extract settings remain in effect unless overwritten here.
//native
setExtractRCMode -effortLevel    low => invokes native extractor
extractRC
rcOut -rc_corner max_rc -spef ./dbs/final_files/digtop_native_max_coupled.spef
rcOut -rc_corner min_rc -spef ./dbs/final_files/digtop_native_min_coupled.spef

//qrc
setExtractRCMode -reset
setExtractRCMode -engine         postRoute
setExtractRCMode -effortLevel    signoff => invokes highest accuracy qrc extractor
setExtractRCMode -coupled        true => if set to false, coupling caps are lumped to gnd.
setExtractRCMode -lefTechFileMap ./scripts/qrc_layer_map.ccl
setExtractRCMode -capFilterMode  relAndCoup
setDesignMode    -process 150 => Based on process node specified (here it's 150nm), various coupling thresholds are chosen.

extractRC

rcOut -rc_corner max_rc -spef ./dbs/final_files/digtop_qrc_max_coupled.spef
#delayCal -sdf ../output/digtop_max.sdf => to gen max sdf from QRC extractor
rcOut -rc_corner min_rc -spef ./dbs/final_files/digtop_qrc_min_coupled.spef
#delayCal -sdf ../output/digtop_min.sdf => to gen min sdf from QRC extractor

# Export DEF
set dbgDefOutLefVias 1 => This ensures that all Vias (std, custom or using viarule) will be defined in the def file itself. Vias are represented by patterns, so there is no problem of whether matching vias exist in pdk or not, when importing these into icfb. This is important, else these will be vias referencing other vias/via-rules which may not be present in pdk, causing import errors.
set dbgLefDefOutVersion 5.5 => If Def is set to 5.6 or 5.7, then viarule is still present in def file. If matching viarule is not there in pdk, then def import into icfb will cause errors. So, use def 5.5 to avoid this issue.
defOut -floorplan -netlist -routing ./dbs/final_files/digtop_final_route.def

# Export Netlist
#saveNetlist digtop_final.v  -includePhysicalCell {SPAREFILL1 SPAREFILL2 SPAREMOSCAP4 FILLER5} -excludeLeafCell -includePowerGround => This creates netlist which has VDD/VSS ports on all stdcells and module, and includes all physical cells specified (If no physical cells specified, then all filler cells included). netlist will have additional lines like "FILLER5 FILLER_INST_24 ();". Tool figures out physical cells based on "addFiller" cmd used previously, as there's no special property in Filler cells lef file to identify them as filler cells. "-excludeLeafCell" excludes leaf cell defn (i.e defn of AN210 etc) to be written to netlist
saveNetlist ./dbs/final_files/digtop_final_route.v => doesn't have VDD/VSS ports, nor any physical cells in it.

NOTE: final netlist above (netlist: digtop_final_route.v) has the format shown below.

1. First all modules in RTL are defined in terms of gate level components (structural netlist) with the same module name as in RTL. If the same module is called 4 times, then there will be 4 defn of this RTL module with 4 different names. This uniquification is done so that separate optimization can be done on each instance of such a module.
Ex: module module_name (i/o port defn) ... endmodule.
Note if scan test ports were added during dft step in synthesis, then the module is renamed as module_name_test_1

2. Then all such modules are instantiated in the top level module "digtop"(see bullet 5). The instance name is kept same as the defn name. However for modules with *_test_1 defn name, test_1 is dropped and RTL name is kept for instance names. signal_name to connect module_defn_pin_name are kept the same as in RTL as much as possible.
Ex: module_defn_name module_instance_name (.module_defn_pin_name(signal_name), ...)

3. For instances where dft was added, 3 new pins are added => test_si, test_so and test_se. For multiple test chains, we may see more than 1 si/so. i.e test_si1, test_si2, .. and test_so1, test_so2, ...etc. test_si connects to first scannable flop's SD pin, test_se connects to all scannable flop's S pin, Q pin of this flop connects to SD pin of next flop and so on forming a scan chain, and the final o/p pin is test_so pin which is just a buffered version of Q o/p pin of the last flop. Note there may be logic b/w Q o/p pin of the last flop and the PO pin of block, but scan chain connects just the o/p of flop to SD i/p of next flop.

4. a module spares is also there, which has all spare cells in it. spare modules don't have any i/o ports. If there were multiple spares then there would be multiple spares def as module spares_1 (..) endmodule, module spares_2 (...) endmodule, etc. Pins of spare gates are tied to 0/1. These 0/1 come from Tieoff gates (TO*) in spare module, which provide a zero o/p and one o/p. Sometimes o/p of these tieoff cells are buffered inside the module to provide signal to other cells, while other times different spare gates (inv,nd2,etc) are tied to different 0/1 from different TO* gates. NOTE: spare cells inside spare cell module have o/p pins omitted in their instantiation. Reason might be to avoid having floating o/p nets as the o/p pins of spare cells are not used anyway.

5. Top module "digtop" is defined at the end. It has buffers for i/p signals (BU*), for clk signals (CTB*), for o/p signals (BU*), tieOff (TO*). It instantiates all the other modules defined in module defn. Top level module has extra scan pins added: sdata_in and sdata_out.

# Export Gds (Do: source ./dbs/filler/filler.enc after opening encounter, before you do streamOut. Then filler.enc db is used for gds)
streamOut ./dbs/final_files/digtop_final_route.gdsii => Creates a GDSII Stream file version of the current database. By default, the Encounter software creates a Version 3 GDSII file.
#-libName <libname> Specifies the library to convert to GDSII format. Default: Name is DesignLib.
#Note: we can also use Gui: file->Save->GDS/OASIS

Report specific timing paths to match b/w PT/ETS and Encounter:
----------------------
set_analysis_view -setup {func_max} -hold {func_min} => change analysis view if you need timing only for other view.
setAnalysisMode -checkType hold => default checktype is setup.
report_timing -check_type hold -from u_DIG/flop1_r_reg -to u_dsp/sync1_reg -path_type full_clock => shows detailed clock path too.

OA design exchange process:
--------------------------
Instead of using defin for design exchange, we can directly write an OA database. In conventional flow, we take in floorplan def (pins def) and generate DEF or GDS. We use abstract LEF file for stdcells. In OA flow, we take in floorplan OA db directly and generate OA db. We use abstract OA db for stdcells. Abstract OA db doesn't have physical layout, just an abstract view. Steps:
1. Use encounter 9.1 or later. Add these to scripts/import.conf file in vdio dir (/db/NOZOMI_NEXT_OA/design1p0/HDL/Autoroute/digtop/vdio):
 A. set rda_Input(ui_oa_oa2lefversion) {5.6}
 B. set rda_Input(ui_oa_reflib) "pml30_lbc8_2pin lbc8" => provide name of stdcell and tech lib
 C. set rda_Input(ui_oa_abstractname) {abstract}
 D. set rda_Input(ui_oa_layoutname) {layout}
2. For importing floorplan: In VDI gui, goto File->Load->OA Cellview. Provide library=HAYATE_dig1p0, cell=digtop, view=layout (or on cmd line: oaIn HAYATE_dig1p0 digtop layout). Not needed for our purpose, since we don't do floorplan import.
3. After going thru the flow, and running export_final.tcl, we are ready for OA db creation. In VDI gui, goto File->Save Design. choose data type=OA, library=HAYATE_dig1p0, cell=digtop, view=layout (or on cmd line: saveOaDesign HAYATE_dig1p0 digtop layout).

This creates an OA db in vdio dir. Wherever we are trying to save the OA db, we need to have a cds.lib file which needs to have these 5 lines:
SOFTINCLUDE $CIC_HOME/tools/dfII/local/cds.lib
DEFINE lbc8 /data/pdkoa/lbc8/2011.12.15/cdk/lbc8
DEFINE pml30_lbc8_2pin /data/pdkoa/lbc8/mcache/diglib/pml30/DIGLIB-PML30-RELEASE-r2.5.1_2_f/pml30_lbc8_2pin
DEFINE avTech /apps/artisan_cds/assura/3.2_EHF2_OA/tools/assura/etc/avtech/avTech
DEFINE HAYATE_dig1p0 HAYATE_dig1p0

In vdio dir, OA db is created under HAYATE_dig1p0 dir, which has "digtop" subdir, data.dm and tech.db files. "digtop" dir has "layout" dir which has layout.oa file, master.tag, digtop.conf and multiple other files. Make sure, digtop.conf file has same parameters as import.conf file. This dir structure is exactly the same as in "/db/NOZOMI_NEXT_OA/cds/HAYATE_dig1p0" which has digtop subdir, data.dm and tech.db files along with other subdir for schematic modules. "digtop" dir has "layout" dir (along with schematic and symbol dir) which has layout.oa file and master.tag in layout dir.

4. Now, we need to import this data in virtuoso. open icfb where we saved the OA library (/db/NOZOMI_NEXT_OA/design1p0/HDL/Autoroute/digtop/vdio). In lib mgr, we should see our "HAYATE_dig1p0" lib. Open digtop layout. We see that design is saved as OA abstract view, so we need to save it as layout view. To do that goto Tools-> Remaster Instances. Leave library and cell name empty. enter "search for" viewname as "abstract" and "update to" viewname as "layout". click OK, and the physical layout appears. Now, we can add pin labels the way we do it normally, and then save the design.

NOTE: this whole process is only for layout transfer (substitute for Def import). We still have to do schematic/symbol transfer using Verilog import, exactly the way we used to do it normally. So, the OA db process only saves us the time of DefIn.

------------------------------------------------------

#Mask formats: (all these formats are hier formats). Files easily over 100GB in size. OPC done on gdsii and oasis files and 90% of mask data files are manipulated and refractured, and inspected before going into actual mask.
--------------------
GDSII (graphics database system 2): Now owned by Cadence. It's used for exchange of IC layout data and also given to Fab for IC fabrication. It consists of different layer patterns and shows all the different layers, with each layer numbered as layer 1, layer 2, etc. It doesn't know which layer is what as it's just showing patterns. In order to map these layer numbers to actual layer names in pdk, we need a layer map file. This layer map file is in the pdk dir. For 1533eo35, it's in: /db/pdk/1533e035/current/cdk446/current/doc/stream.map. This has cds Layer name mapped to a gds layer number. For ex: layer 1 is mapped to NWELL, layer 2 to ACTIVE, etc. k2_viewer from cadence can be used to view gds files. see in cadence_virtuoso.txt for generating gds from layout.

OASIS (Open Artwork system Interchange standard for Photomasks) format: successor to GDSII. Owned by trade and std org  SEMI (Semiconductor Equipment and Materials International). Open std format to rep physical and mask layout data. It reduces the size of files by 10x. OASIS.MASK further reduces it by half. It allows the same datafile to be used for pattern generation, metrology and inspection.

MEBES format: Design layout files, in the form of either GDSII or Oasis data formats, are transferred to Mebes format for transmission to photomask shops. Mebes is a proprietary mask data format from Applied Materials Inc. It is regarded as the de facto industry standard for exchanging fractured photomask data. commonly used format for electron beam lithography and photomask production tools. Inspection tools inspect these files and perform MRC (manufacturing rule check) which is DRC-like check on post fractured data. Mebes files are generally much more data-heavy than either GDSII or Oasis formats because of the addition of resolution enhancement technique (RET) features and the need to provide essentially flat data--with a very limited amount of hierarchy--to e-beam photomask pattern generation tools.

LAFF format: seems like it's internal TI format. Look in eco.txt for more details.

-------------------------------------------------
=============================================

Done with all required steps. do si_check (for signal integrity, if needed) and si_signoff for final signoff checks.

**************************************************************************

---------------------------------------
Encounter Warnings and errors:
------------------------------------
A. reading .lib files during reading config file:
-----------------------------------------
Log:
**************
Reading max timing library '/db/pdk/lbc8/rev1/diglib/pml30/r2.5.0/synopsys/src/PML30_W_150_1.65_CORE.lib' ...

*WARN: (TECHLIB-436):  Attribute 'fanout_load' on output/inout pin 'CO' of cell 'AD210' is not defined in the library. Either define this value in the library or use set_default_timing_library to pick these values from a default library.
*************
Reason: fanout_load not present. default is set to 1.
-------------------------------

B. On running verifyGeometry or during nanoRoute:
----------------------------------
verifyGeometry: *WARN: (ENCVFG-47):    Pin of Cell mldd_env_thrsh_out_4_I_buf at (15.300, 1062.300), (32.300, 1066.100) on Layer MET1 is not connected to any net.
NanoRoute: #WARNING (NRDB-1005) Can not establish connection to PIN S at (558.900 206.100) on METAL1 for NET net1. The NET is considered partially routed.

These warnings say that a pin is not connected to any wire/net. Usually, after issuing these warnings globalDetailRoute will complete the connection of the previously partially connected nets. In summary, this warning shows that there might be some issue (mentioned above) but if the issue in the design is not real then globalDetailRoute will complete the connection of these partially connected nets. When we get it during verifyGeometry, check that location to make sure it's connected properly. Most of the time, it throws this warning for VDD/VSS pin of some cells.
Ex: during optDesign we see these warnings. optDesign is free to move instances around during placement, but fixed clock wires connected to their pins cannot be moved at this stage. That is the reason some pins are left unconnected due to instance movement by optDesign, resulting in this warning. Also, if the driver driving an o/p port is moved, then since the port can't move, it results in this warning being issued. Nanoroute will try to fix it by adding extra routing during a later stage.

For debugging it to see if the issue is real or is just a warning while doing nanoRoute, some verification can be performed as below :
1. checkPlace -checkpinAccess
2. verifyConnectivity
3. grid check (Sometimes the pins are not properly on grid)
4. Proper Layout connection.

------------------
C. **WARN: (ENCDB-2136):For instance 'IShootCtrl/g22236', its Input term 'A' does not connect to a 'TieLo' net. It is floating.
 
This happens when i/p pins get connected to 1'b1 or 1'b0. Router doesn't know what to connect it to, since they may be connected to one of the pwr grids or to tieoff cell o/p. This usually happens in 2 scenarios:
1. when an existing cell becomes a spare cell, because the i/p to that cell got connected to something else. In such case, i/p pin of this cell has no connection and hence tool connects it to 1'b1 or 1'b0.
2. Other scenario where it happens is when an existing cell's o/p was driving the i/p of some other cell, but then the eco change caused that cell to be used as a spare cell. So, now the i/p and o/p of that existing cell have different connections. So, o/p of this existing cell can't be used to drive i/p of that other cell. So, the tool connects it to 1'b1 or 1'b0.

Detailed soln at this link:
http://support.cadence.com/wps/myportal/cos/COSHome/viewsolution/!ut/p/a1/nY9NDoIwEEbPwgFMp1AoLOtPQGggKkbKxkBsTCMUguDC0wvGxJWaOLuZvHkzH8pRhnJd3NS56FWji2rqc-e4xiuCgwRC32MLYEA34T7CZkTtERAjAB-Kwa_9A8qfyBeDGE_Qt8Pl3APmR842oKkFCUV73XT1-Oxucp2kbLnSFyT6bpDT5NpUwxQnHupSdkhgTDwbU_IS28GSQAg4TOYmBRakxCcxx5CYf4vbOoOZqF3LVuWdGcYDE0SB8g!!/dl5/d5/L2dBISEvZ0FBIS9nQSEh/

we need to use this flow to fix the issue:

A. NON ECO design: Do it after placement as it's easy to add cells:
   1. restoreDesign
   2. placeDesign # Run placement before inserting tie high/low cells
   3. setTieHiLoMode -cell {TIEHI TIELO} # Specify tie high/low cells to use
   4. addTieHiLo # Insert the tie high/low cells. We need to add these cells as they are removed during placement in step 2 above. Appropriate Tiehi/Tielo cells will be inserted in every module that needs it and 1'b1 and 1'b0 will be connected to these.

B. ECO design (all layer): Add Tiehi/Tielo cells and then do eco Place/Route:
   1. addTieHiLo -cell "TIEHI TIELO"
   2. ecoPlace
   3. ecoRoute

C. ECO design (metal only): If the TIEHI/TIELO cells were already present in the netlist, route them using NanoRoute.
  1. selectNet <tielo_signal_o/p_from_tieoff_cell>
  2. setNanoRouteMode -routeSelectedNetOnly true
  3. detailRoute -select

D. If routing tiehi/tielo signals to the pin doesn't work, we can just connect any of the other pins to the floating pin. that way there's no extra routing (as pins are close together, so most of the times little bit of MET1 routing inside stdcell will suffice). This usually works for spare cells (or cells whose o/p is not used for functional purpose, so tying i/p pin to any signal will work). Steps to do this are as follows:
  1. attachTerm IShootCtrl/g21 B1 IShootCtrl/n513 => connect pin B1 of gate g21 to net n513 (which is connected to pin B2 of g21). This only connects logically, physical connection will be done later
  2. ecoRoute => actual routing done. ecoRoute cmd used to minimize any routing changes.

Run below cmds on any final design to make sure there are no 1'b1 or 1'b0 in netlist:
To ensure that all your tiehi/lo connections have tie cells (and are not connected to a rail instead), run the following dbGet commands:

  dbGet top.insts.instTerms.isTieHi 1
  dbGet top.insts.instTerms.isTieLo 1

The previous commands should return "0x0" if all connections have tie cells. If "1"s are returned, use the following commands to find the terms that still need a tie cell:

  dbGet [dbGet -p top.insts.instTerms.isTieHi 1].name
  dbGet [dbGet -p top.insts.instTerms.isTieLo 1].name

---------------------------------------------