JavaScript (JS):

Now that you have learned the DOM basics, JavaScript is easier to pick up, as the DOM and JavaScript go hand in hand.

This is one of the best tutorial series on JavaScript, on the DigitalOcean website: https://www.digitalocean.com/community/tutorial_series/how-to-code-in-javascript

JavaScript or JS is one of the core languages, besides HTML and CSS, that go into your website. Support for JavaScript is built into the browser, so no extra plugin is required to run it. JavaScript runs locally on your computer: the browser downloads the script from the specified location and then executes it locally. JavaScript makes web pages interactive, and so has become very popular (HTML and CSS just provide static webpages and were never designed for dynamic interaction).

console: In the DOM section, we saw how to open the console in Firefox and type JavaScript cmds on it. This is also called the "JavaScript console". This console is like a JavaScript shell, where we can try any JavaScript cmd in real time.

console.log() => To output anything on this console, we can use the console cmd below. This cmd is useful while debugging JavaScript to dump out values.

>> console.log("Hello") => This will print "Hello" on javascript console

>> console.log(4+5); => This will print "9" on console.

alert(): This is another function used to print values in a pop up box. This is also used to dump out values.

>> alert("hello") => brings a pop up box with "hello" written on it.

Adding JS into HTML doc:

JS is added into an HTML doc by adding the tag <script> ... </script>. Just as with CSS, scripts can be inserted inline or as a separate file. Generally JS is put into the <HEAD> section, signalling the browser to run the JavaScript before loading the rest of the page. However, if your script needs to run at a certain point within a page's layout, then it has to be put at the point where it is to be called, generally in the <BODY> section. Putting JS in the <HEAD> section is preferred, as it keeps JS separate from the main HTML content, but it's not always possible.

1. Inline: Here, JS code is put directly within <script> tags. In the ex below, the JS function alert() is called in the <head> section, so a popup with the msg "Hello" shows up on the browser even before the rest of the page loads. This is because the <body> section hasn't even run yet, as alert() is in the <head> section.

<head>

  <script>

    alert("Hello")

  </script>

</head>

2. separate file: Here, JS code is put in a separate file, so that it's cleaner. In the ex below, test.js is a separate JS file containing the JS function call alert("Hello");

<head>

  <script src="/test.js"></script>

</head>

Syntax: Just like other programming languages, JS has variables, reserved keywords (if, else, etc) and special characters (for doing arithmetic, signalling the end of a statement, etc).

1. Whitespace: Whitespace in JavaScript consists of spaces, tabs, and newlines. Whitespace is used to separate out tokens. Extra whitespace is ignored. Some tokens may be recognized even without whitespace; however, it's preferred to put whitespace.

2. semicolon: A JS program consists of multiple statements. Each statement is terminated by a newline or a semicolon ( ; ). A semicolon is mandatory when there are multiple statements on a single line. It's possible to write an entire JS program in a single line by using semicolons and spaces, but that would be unreadable. Indentation (or using multiple spaces) is a good way to make your pgm readable.

ex: alert("Me"); var a="ASD";

3. comments: JS comments are written in the same style as in the C language. Single line comments are written using //, and multiline comments within /* this is a multi line comment */.

4. variables / identifiers: The name of a variable, function, or property is known as an identifier in JavaScript. Identifiers consist of letters and numbers, but they cannot include any symbol outside of $ and _, and cannot begin with a number. They are case sensitive. They are declared using "var", "let" or "const". Initially JS only allowed the keyword "var" to declare variables, but now the 2 additional keywords "let" and "const" are used. That is why in a lot of legacy JS code, only "var" is seen. However it's recommended to use "let" and "const" instead of var. "let" is used for variables that can be reassigned, as in loops, etc, while "const" is used everywhere else, where vars are fixed and never modified or reassigned. The scope of variables may be local or global (i.e. variables may or may not be accessible outside the function or block where they are defined). Both "let" and "const" are block scoped, while "var" is function scoped (or global when declared outside any function). See the digitalocean link for more details.

 var myname = "Tom"; => Here myname is an identifier which stores the value "Tom", which is of string type.

let tr=23;

const Me="Ajay";

let ToVar; => We can also declare a var but not initialize it to any value. Then we can assign it later, as ToVar=1;
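
A minimal sketch of the scope difference (function and variable names are illustrative):

// var is function scoped: i is still visible after the loop block.
function countWithVar() {
    for (var i = 0; i < 3; i++) { }
    console.log(i);    // prints 3 - i "leaked" out of the loop block
}

// let is block scoped: j exists only inside the loop.
function countWithLet() {
    for (let j = 0; j < 3; j++) { }
    // console.log(j); // ReferenceError: j is not defined
}

const PI = 3.14159;
// PI = 3;             // TypeError: Assignment to constant variable.
countWithVar();
countWithLet();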

Data Types: variables can be of different data types. In JS, just as in other scripting languages, we don't declare the data type beforehand. Variables are accessed directly by their identifier (no need to prefix it with any special character, unlike $var in some other scripting languages). These are the primitive data types in JS:

I. number: Both integers and floating point numbers can be declared. Scientific notation is also supported.

let t = 16;         // t is a number.

var t = 14.23; // t is a floating point number

console.log(t) => prints the value of var t. Looks for a var named t and if found prints its value. It knows t is a var, since it's not a number, not a string (not enclosed within quotes) and not a boolean.

var t; // here t is undefined. "undefined" is its own value/type, distinct from null.

Arithmetic operators: such as +, -, *, /, +=, ++, etc can be applied on the number data type.

ex: 10 +20 => returns 30

ex: let a=4.2; let b=2.1; let c=a / b; => here c is assigned 4.2/2.1=2

ex: let x=a++; ++ and -- are increment and decrement operators. Used in for loops

II. string: A string is a sequence of one or more characters (letters, numbers, symbols). There are 3 ways to write a string: single quotes ( ' ... ' ), double quotes ( " ... " ) and backticks ( ` ... ` ). There's no difference b/w single and double quotes. Backtick quoting is the newest method and is also called a template literal. It allows substitution of variables inside the quotes. "+" is the concatenation operator for strings (i.e. it joins 2 strings).

ex: let t = "Mother Teresa";   // t is a string

let T='teresa ' + t; //this concatenates 2 strings and assigns the result to T. So T = "teresa Mother Teresa" (note the space inside the first quotes; without it, the 2 strings would be joined with no space in between)

backtick: Above we had to use + to concatenate strings, since anything inside single or double quotes is interpreted as a literal string, so we could not have used our variable inside the quotes. However, backticks allow us to do that by using ${}.

ex: let T=`teresa ${t}` => here t is seen as a var, and it's value is substituted.

NOTE: when having single or double quotes as part of a string, we can use \' or \" to treat them as part of the string, or use backticks to store that string (since then single or double quotes won't be recognized as the end of the string). Also, \n is recognized as a newline in a string. Only a backtick (template literal) string can be written across multiple lines, with the newlines becoming part of the string; a single or double quoted string cannot span lines (it's a syntax error).
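
ex (a small sketch of the above; all the string values are illustrative):

let s1 = 'It\'s here';            // escape the single quote
let s2 = "She said \"hi\"";       // escape the double quotes
let s3 = `It's "both" at once`;   // backtick: no escaping needed
let s4 = `line one
line two`;                        // template literal may span lines
console.log("a\nb");              // \n prints as a newline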

string manipulation: There are many methods available to manipulate strings. Note that string is treated as an array of characters starting with index 0.

const originalString = "How are you?"; => Here index [0]=H, [1]=o, [2]=w, [3]="space" or "blank", [4]=a and so on ....

var char5 = originalString[5] => this returns index 5 of above array of char, which is char "r".

originalString.charAt(5) => returns index 5, which is "r"

"My name".indexOf("n") => return index number of char "n", which is 3

//slice of string

"How are you?".slice(8, 11); => returns "you"

originalString.slice(8) => returns everything starting from index 8 to the end of the string, so this returns "you?"


// Split string by whitespace character
const splitString = originalString.split(" "); => this splits the string on whitespace, so var splitString is now an array [ 'How', 'are', 'you?' ] (the "?" stays attached to "you")

console.log(splitString); => we can now access various elements of array too, via splitString[0], etc

There are various other methods to convert cases, find the length, trim, replace, etc. These can be found in the digitalocean link at the top.
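
A quick sketch of a few of those methods (values are illustrative):

const s = "  How are you?  ";
console.log(s.trim());              // "How are you?" (spaces removed from both ends)
console.log(s.length);              // 16 (length is a property, includes the spaces)
console.log("abc".toUpperCase());   // "ABC"
console.log("How are you?".replace("you", "we")); // "How are we?"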

III. Boolean: The Boolean data type can be one of two values, either true or false.
let t = true;       // t is a Boolean

let myBool = 5 > 8; // var myBool is assigned a value "false"

IV. array: An array can hold multiple values within a single variable. Square brackets are used to hold the elements of an array. Arrays can store any data type such as numbers, strings or objects, and a single array can have mixed data types. Arrays can contain arrays within them. Arrays are similar to strings, since strings are arrays of characters.

ex: let fish = ["shark", "cuttlefish", "clownfish", "eel", 7, ["meen", 9], null, 0]; => declares an array of string, number, etc.

ex: fish; => this will print the contents of entire array named fish. We can access items in an array via index, i.e fish[0] refers to 1st element of array, etc.

Array methods: We can use many methods to access various elements of array, or remove/add elements, modify elements, etc.

There are 3 kinds of array methods available:

1. mutator methods: these modify the array itself (i.e adding, removing elements, etc).

Array methods are properly written out as Array.prototype.method(), as Array.prototype refers to the Array object itself.

isArray(): returns true if the var is an array. NOTE: this one is a static method called on Array itself, not on the array instance. ex: Array.isArray(fish) => returns boolean true

pop(): removes the last element from the end of an array. ex: fish.pop() => removes the last element 0 from the fish array.

Similarly other methods such as push(), shift(), unshift(), sort(), etc (see the sketch below). Because of these methods, it's very easy to work with arrays.
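
A small sketch of these mutator methods (array contents are illustrative):

let fish = ["shark", "eel"];
fish.push("tuna");     // add to end    => ["shark", "eel", "tuna"]
fish.unshift("cod");   // add to front  => ["cod", "shark", "eel", "tuna"]
fish.shift();          // remove front  => ["shark", "eel", "tuna"]
fish.sort();           // sort in place => ["eel", "shark", "tuna"]
console.log(fish);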

 2. accessor methods: these return a new value or representation. Methods such as concat, slice, indexOf, etc.

ex: let newfish = fish.slice(2,4); => creates a new array with elements from index 2 to index 3. newfish = ["clownfish", "eel"]; NOTE: the end index is exclusive, i.e. it has to be one more than the last index you want. So, in this case, we did (2,4), even though we wanted elements 2 and 3.

 3. iteration methods: These are used to operate on every item in an array, one at a time. These methods are closely associated with loops. 

To access all elements of array, we can use for loop below (discussed later).

for (let i = 0; i < fish.length; i++) { //here length is a property of array, not a method
  console.log(i, fish[i]);
}

Other way, using a method: This prints each element of the array (same as above), but using the method forEach. Here the forEach method calls a function, which we'll learn about later.
fish.forEach(individualFish => {
    console.log(individualFish);
})

Similarly many other methods available, as find, reduce, etc

V. objects: The JavaScript object data type can contain many values as name:value pairs. These pairs provide a useful way to store and access data, and the values can be of any data type. The object literal syntax is made up of name:value pairs (a colon between each name and value, pairs separated by commas) with curly braces { } on either side. The name:value pairs are called the properties of the object. We can also have methods or functions as part of an object. As a data type, an object is stored in a variable.

2 ways to construct an object:

1. Using object literal: This is preferred method. This uses curly braces { ... }

ex: the ex below creates an object var named "sammy", which contains the values indicated. We can have functions too. The whole statement can also be on the same line, however that makes it unreadable.

let sammy = {
    firstName: "Sammy",
    lastName: "Shark",
    color: "blue",

    greet: function() {
        return `Hi, my name is ${this.firstName}!`; //this keyword inside an object refers to current object, in this case to "sammy"
    },
    location: "Ocean"
};

We can print object data by just typing the name of the object.

ex: sammy; => prints all data

{firstName: "Sammy", lastName: "Shark", etc ....}

We can also create empty object.

ex: const sammy = { }; // creates an empty object named "sammy"

2. Using object constructor: This is one other way to create object, by using "new" keyword

ex: const sammy = new Object();

Properties and methods:

Objects can have properties and methods as discussed above.

There are two ways to access an object’s properties.

  • Dot notation: using a dot (.): ex: sammy.firstName => returns "Sammy". Dot notation is more widely used, but has its limitations.
  • Bracket notation: using square brackets [ ] : ex: sammy["lastName"] => "Shark". Bracket notation is used if an object’s property contains any sort of special character.

An object's methods can be accessed the same way as we access any function; here it's just attached to the object variable.

ex: sammy.greet() => returns "Hi, my name is Sammy!"

Modifying or adding to an object's properties and methods is easy to do via the assignment operator (=). We can use either dot notation or bracket notation.

ex: sammy.firstName = "Akbar" => this changes the value for key=firstName

We can add methods by just attaching a new function to the object.

ex:

sammy.hello = function() {
    return `Hi, my location is ${this.location}!`;
};

Removing object's properties is done via delete cmd:

ex: delete sammy.firstName => this removes key value pair from object "sammy"

We can loop thru object's properties via "for ... in" loop. NOTE: this is different than "for ...of" loop that is used for arrays. "for ...in" loop is specifically meant for iterating over the properties of an object.

// Get keys and values of sammy properties
for (let my_key in sammy) { //my_key is a var
  console.log(my_key.toUpperCase() + ':', sammy[my_key]);
}

Built in objects: There are many built in objects and associated methods in Javascript.

1. date: The Date object is a built-in object in JavaScript that stores the date and time. There are many methods associated with it.

ex: // Set variable to current date and time
const now = new Date();
now; // prints current date and time. returns => Wed Oct 18 2017 12:41:34 GMT+0000 (UTC)

now.getTime(); // getTime is a method of the Date object. Gets the current timestamp (number of milliseconds since Jan 1, 1970).
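
A few more Date methods (a sketch; the o/p depends on when you run it):

const now = new Date();
console.log(now.getFullYear()); // e.g. 2017
console.log(now.getMonth());    // month as 0-11 (January is 0)
console.log(now.getDate());     // day of the month, 1-31
console.log(now.getDay());      // day of the week, 0-6 (Sunday is 0)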

 

Finding data type: To find the type of any var, we can use the typeof operator:

ex: typeof originalString => this returns "string", since it was defined as a string above (by enclosing it in double quotes)

ex: let t;              // t is undefined

typeof t => returns "undefined", since t was declared but never assigned a value

ex: var myNum= 1 => this assigns it to number type

ex: var a = "my world" => this assigns it to string type

Converting data type:

When using operators, JS coerces data into a particular type if needed. This is implicit conversion.

ex: "3" - "2" => even though 3 and 2 are strings here, JS comverts them to number since operator "minus" works on numbers, Returns result as 1

We should always explicitly convert data types to remove ambiguity, by using the methods below:

String(49) => converts number 49 to string "49" and returns "49"

(1776).toString() => converts number 1776 to string "1776"

let str="17";

Number(str) => converts string "17" to Number 17.

Boolean(0) => converts number to boolean, returns false

Boolean("man") => returns boolean true, as any non zero number or string is treated as true

Conditional Statements:

1. if-else: same as if else in other languages. "else" and "else if" are optional.

if (condition a) {
    // code that will execute if condition a is true
} else if (condition b) {
    // code that will execute if condition b is true
} else if (condition c) {
    // code that will execute if condition c is true
} else {
    // code that will execute if all above conditions are false
}

ternary operator: similar to ? : in the C lang, it's an alternative form of if-else.

ex:
let age = 20;
const oldEnough = (age >= 21) ? "enter." : "not enter.";

2. switch: same as switch case in other languages.

switch (expression) {
    case x:
        // execute case x code block
        break;
    case y:
        // execute case y code block
        break;
    default:
        // execute default code block
}
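
A small sketch with concrete values (the var "day" is illustrative). Note that cases without a break fall through, which lets us group cases:

const day = "Saturday";
switch (day) {
    case "Saturday":      // no break, so this falls through to the next case
    case "Sunday":
        console.log("Weekend!");
        break;
    default:
        console.log("Weekday.");
}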

Loop statements:

1. while, do while: runs a loop based on a condition. if-else and switch stmts run only once, while a while loop keeps running as long as the condition is true.

A. while loop: If we put "true" inside the while condition (i.e. while (true) { ... }), then the loop runs infinitely. If we put "false", then the loop never runs.
let fish = 0;         // the vars used by the loop must be declared first
const popLimit = 10;
while (fish < popLimit) {
    fish++;
    console.log("There's room for " + (popLimit - fish) + " more fish.");
}

B. do while loop: this loop will always execute once, even if the condition is never true.

do {
    // execute code
} while (condition);
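
A minimal sketch showing the "runs at least once" behavior:

let i = 0;
do {
    console.log(i);  // prints 0 once, even though the condition is false
    i++;
} while (false);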

2. for loop: 3 types of for loop: for, for .. of, for ... in

A. for loop: uses 3 optional exprs that control execution of the loop. All 3 exprs are optional and the for loop will still run. However the loop may become infinite if there is no expr to control the loop var's value, so a break stmt inside the loop might be needed. for loops can be used to modify or read values of arrays.

ex:
for (let i = 0; i < 4; i++) {
    // Print each iteration to the console
    console.log(i);
}

B. for .. in loop: used to iterate over the properties of an object. See the object section above.

C. for .. of loop: This is used to iterate over iterable objects like arrays and strings. The for ... of statement is a newer feature as of ECMAScript 6. ECMAScript (or ES) is a scripting-language specification created to standardize JavaScript. We looked at the iteration methods above to access all elements of an array, which used a regular for loop. Here, we use the for ... of loop.

 ex: This prints all elements of array.
let sharks = [ "great white", "tiger", "hammerhead" ];

for (let shark_name of sharks) {
    console.log(shark_name);
}

 A string can be iterated through in the same way as an array (as a string is essentially an array of chars).

ex: This prints each char of string

let sharkString = "sharks";
for (let shark of sharkString) {
    console.log(shark);
}

Functions:

Functions are the same as in other pgm languages. A function can have optional args.

3 types of function syntax allowed:

1. Regular function defn: Here we define functions the regular way

ex:  Here we first define function "greet"  with 3 args. Args are optional. If no args, then just do "function greet( ) { ... }".
function greet(name, a, b) {
    console.log(`Hello, ${name}!`);

   return a+b; //return is optional.
}

greet("Sammy", 5, 6); // Invoke greet function with "Sammy", 5, 6 as arguments. Since function returns a value, so 5+6=11 is returned. This value is returned at the point in pgm, that this function is called. It prints 11 on screen, even though we didn't print out the value explicitly by using console.log. This return value can be used immediately or stored in a variable.

2. Function expression: Here, we assign a function to a var.

ex:
const sum = function add(x, y) { //function is named add, but the function itself is stored in var "sum"
    return x + y;
}

sum(20, 5); //Here we invoke function via the var (function name not needed)

Anonymous functions: Since the function name is not really needed for anything when using the var to store the func, we can omit the func name altogether. These are called anonymous functions.

ex: here function name "add" is omitted, although parenthesis and args still needed.
const sum = function(x, y) {
    return x + y;
}

sum(100, 3);

3. Arrow functions: This is a newer, more concise method of defining a function, known as an arrow function expression, introduced in ECMAScript 6. Arrow functions are represented by "=>". Arrow functions are always anonymous functions and a type of function expression.

ex: Here we omit the keyword "function" and instead use the arrow "=>" after the args. If there are no args, empty parentheses are needed. However, if there is only 1 arg, the parentheses are not required.


const multiply = (x, y) => {
    return x * y;
}

multiply(30, 4);

NOTE: if the function consists of a return statement only, arrow functions allow the syntax to be reduced even further. If the function is only a single line return, both the curly brackets and the return statement can be omitted, as seen in the example below.

ex:
const square = x => x * x; //NOTE: no parenthesis or curly braces needed. No "return" keyword needed either.

square(10);

Events:

Events are actions that take place in the browser, initiated by either the user or the browser itself. We learned a little bit about events in the DOM section. They are an integral part of JavaScript, as we can code responses to events in the browser, which makes the page responsive.

Most common events are:

1. mouse events: They refer to events that involve clicking buttons on the mouse or hovering and moving the mouse pointer. 

 ex: click: Fires when the mouse is pressed and released on an element

2. form events: actions that pertain to forms, such as input elements being selected or unselected, and forms being submitted.

ex: submit: Fires when a form is submitted (i.e submit button is clicked on form element)

3. keyboard events: used for handling keyboard actions, such as pressing a key, lifting a key, and holding down a key.

ex: keypress: Fires continuously while a key is pressed

Event handlers: Whenever an event is fired (by the user clicking the mouse, etc), a JS function can be made to run. This function is called an event handler. Button, form, etc are elements of an html page. These elements have properties or attributes such as onsubmit, onclick, etc (i.e. a button has an onclick attribute). We can assign events to these elements.

There are three ways to assign events to elements:

  • Inline event handlers
  • Event handler properties
  • Event listeners

1. Inline event handlers: Here we set the event handler on an attribute of an element: we assign as the attribute value the JS function that we want to call. It's called inline as we add handlers directly to every element that we want to respond to events. This is very cumbersome/inefficient, and so should not be used at all.
ex: In the ex below, we assign the value of the "onclick" (NOT onClick, i.e. all small letters) attribute of the button element to the "changeText()" JS function. In js/test.js, we define this function to change the text of the p element in the html.

<button onclick="changeText()">Click me</button>
<script> src="/js/test.js"  </script>

js/test.js => Below file has function defined to modify the text content of the paragraph
const changeText = () => {
    const p = document.querySelector('p');
    p.textContent = "I changed because of an inline event handler.";
}

Forms: With forms too, we can have inline event handlers. A form has the attribute onsubmit, on which we can apply an event handler.

ex: <form action="script/action_page.php" onsubmit="return ValidateForm()" .. > => Here unless the value returned by the function "ValidateForm" is true, the action "script/action_page.php" won't be taken (note the "return" in the attribute value; without it, the return value of the function is ignored and the form submits anyway).
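
A minimal sketch of such a validation function ("ValidateForm" is the name used above; the input field name "username" is an assumption for illustration):

function ValidateForm() {
    const name = document.forms[0]["username"]; // assumes an <input name="username"> in the first form
    if (name.value === "") {
        alert("Name must be filled out");
        return false;  // the form's action won't be taken
    }
    return true;       // the form submits normally
}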

 2. Event handler properties: Very similar to an inline handler, except we're setting the property of an element in JavaScript instead of the attribute in the HTML. In the above ex, we set the attribute "onclick" to a JS func in the HTML itself, but here we set the property "onclick" to a JS func reference in JS code. This also should not be used, as we have a better 3rd way of handling events.

ex:

<button>Click me</button> => NOTE: we didn't assign any value to onclick attribute.
<script> src="/js/test.js"  </script>

js/test.js => Now we access the above "button" element in JS itself (just like we access any other element of HTML in JS), and assign a JS function reference to its property "onclick". NOTE: onclick is a property of the button element here, instead of an attribute.
const changeText = () => {
    const p = document.querySelector('p');
    p.textContent = "I changed because of an event handler property.";
}

// Add event handler as a property of the button element.
const button = document.querySelector('button');
button.onclick = changeText; => NOTE: when passing a function reference to the onclick property, we do not include parentheses, as we are not invoking the function in that moment, but only passing a reference to it.

3. Event Listeners:  An event listener watches for an event on an element. Instead of assigning the event directly to a property on the element (as in example above), we use the addEventListener() method to listen for the event.  addEventListener() takes two mandatory parameters — the event it is to be listening for, and the listener callback function.

Event listeners are the newest and preferred way to handle events. They have the advantage that multiple listeners can be added to same event and element, while with the "event handler property" method above, last property set on an element overwrites all previous properties. Furthermore, you can use addEventListener() on the document and window object too.

ex:

<button>Click me</button> => NOTE: we didn't assign any value to onclick attribute.
<script> src="/js/test.js"  </script>

js/test.js => Here, we define 2 functions, and add 2 event listeners for the same click event on the same button:
const changeText = () => {
    const p = document.querySelector('p');

    p.textContent = "I changed because of an event listener.";
}

const alertText = () => {
    alert("Am I here?");
}


// Here we access the addEventListener() method of the "button" element. We listen for the "click" event on the button, and then call the appropriate function.
const button = document.querySelector('button');
button.addEventListener('click', changeText);

button.addEventListener('click', alertText);

// An anonymous function can also be added on an event listener. Anonymous functions are functions that are not named. Often, anonymous functions are used instead of a function reference on an event listener. 
button.addEventListener('click', () => { // NOTE: no separate func called; the func body is defined here itself.
    const p = document.querySelector('p'); // p must be looked up here, since it's not in scope from changeText
    p.textContent = "Will I change?";
});

NOTE: With the first two methods, a click event was referred to as onclick, but with event listeners it is referred to as click. Every event listener drops the on from the word. 

 

Event Objects: We talked about various events earlier. These events can be accessed via objects. The Event object consists of properties and methods that all events can access. In addition to the generic Event object, each type of event has its own extensions, such as KeyboardEvent and MouseEvent.

The Event object is passed through a listener function as a parameter. It is usually written as event or e. We can access the code property of the keydown event to replicate the keyboard controls of a PC game.

ex: Access the code property of "keydown" event. "keydown" event is captured in "event" object. We then access the "code" property of that event.
document.addEventListener('keydown', event => {
    console.log('code: ' + event.code); //This returns the key pressed, as "KeyA", "KeyB", etc
});

 

 

DOM: Document Object Model. Before learning JavaScript, we need to learn the DOM.

A good link here on digitalocean. Go thru the whole series; it's very neatly explained. You should not need to look anywhere else, as it's comprehensive.

https://www.digitalocean.com/community/tutorial_series/understanding-the-dom-document-object-model

Mozilla docs are also a very good source for anything internet/browser related. Sometimes they appear more technical for beginners, but they do start with the basics:

https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model

We already read about html syntax, and saw that it contains various tags, many of them nested inside each other. XML and SVG documents are also similar to html in that they have various tags. These kinds of documents can easily be represented by a tree. The DOM is that tree rep of the document. This is how browsers store html pages in memory, so that it's easier for other pgm languages to access the html doc, as well as for the browser to carry out various operations on it. If browsers or any programming language worked directly on the html doc, it would be very inefficient to parse it and collect all reqd info about the doc every time. The DOM is just a more efficient rep of that same HTML page, with all the info in it in tree form. The DOM is an object-oriented representation of the web page, which can be modified with a scripting language such as JavaScript.

DOM level 0 was released in 1995. During 1997, intermediate DOM versions were released, which added support for DHTML (Dynamic HTML), where HTML is modified on the user's computer thru JavaScript. However, these DOM versions were largely incompatible b/w IE and Netscape. This led W3C to release a standard known as DOM level 1. Then DOM level 2 was published in 2000, which introduced the getElementById function as well as an event model and support for XML namespaces and CSS. The latest, DOM level 4, was released in 2015.

The layout engine of a browser parses HTML into the DOM tree. The DOM represents the document as nodes and objects. These nodes are organized in a tree structure called the DOM tree, with the topmost node named the "Document object". Every element in an HTML document — the document as a whole, the head, tables within the document, table headers, text within the table cells — is part of the document object model for that document, so they can all be accessed and manipulated using the DOM and a scripting language like JavaScript. Initially the DOM and JavaScript were tightly intertwined, but eventually they evolved into separate entities.

So, a typical DOM tree for HTML page would be something as shown on this wiki page (see top right pic): https://en.wikipedia.org/wiki/Document_Object_Model

DOM programming for HTML:

window and document objects:

The topmost object in the DOM is the window object, which represents something like the browser tab, and the document object is the root of the HTML document itself (the HTML doc that you see in your browser currently). Each separate HTML doc in a separate tab or window has its own DOM tree, and they can't talk to each other. There are many methods and properties associated with these objects (aka the API) that we can use to get access to various items in the HTML doc, as well as modify them locally.

DOM tree: In DOM tree, top level "window" object has "document" object under it. Then this document object has <HTML> object which in turn has 2 objects: <HEAD> and <BODY> . <HEAD> object has <TITLE> object under it, while <BODY> object has various objects as <a>, <p>, <h1>, etc.

All items in the DOM are defined as nodes. There are many types of nodes, but there are three main ones that we work with most often:

  • Element nodes : In above DOM tree, all these objects as <HEAD>, <p> etc are elements, so they form element node.
  • Text nodes : Any text in the document is a text node, including the whitespace b/w elements. NOTE: text within an element is also a text node — it is simply a child node of that element, i.e. the text within an <h1> tag is a text node whose parent is the "h1" element.
  • Comment nodes : Anything within comments is a comment node.

In addition to these three node types, the document itself is a document node, which is the root of all other nodes. window is the parent of the document node. The DOM consists of a tree structure of nested nodes, which is often referred to as the DOM tree. Ultimately, everything in an HTML doc is present in this DOM tree. This DOM tree is what is used in JavaScript to manipulate the html file that is loaded into the browser.

Properties: To access a method or property of an object, we do it as in other Object Oriented languages: the object name, followed by a dot, and then the method/property name. So, to access the contents of the body element, we can use "document.body", where document is the object and body is the property. This contains the whole <body> element, and so everything is printed out on the console. Similarly we can change properties of an element via something like: document.body.style.backgroundColor = 'red';

We can follow the example given in digitalocean, by writing a file "index.html" and then opening it in the firefox browser by entering this in the address bar: "file:///home/ashish/index.html" (or whatever the path is to your file). Once the o/p is seen on the browser, you can go to the "open menu" on the top right side of firefox, click on "Web Developer" and then on "Web Console". This will bring up a console (or something to type on, with >>) at the bottom of the page. Now we can type any JavaScript commands on the console (NOTE: all cmds below are JavaScript cmds):

>> window => this prints the "window" object. Window is the global, top-level object representing a tab in the browser. The window object has access to such information as the toolbar, height and width of the window, prompts, and alerts. The document object consists of what is inside of the inner window.

>> document => document object is the property of window object. this prints the document object. It shows the whole index.html file in HTMLDocument node which has 2 child nodes: <!DOCTYPE html> and <html>. There is also whole lots of other info displayed as properties, attributes, elements, etc. Name shown after the solid arrow on 1st line (that you expand by clicking on) shows the node name. Here, HTMLDocument is the node name of this node.

>> document.head => This refers to "head" property of object "document". head is an object as well as property of document object. This prints the head element and everything inside it. The head node shown on top of the tree has 3 child nodes: #text, <title> and #text. #text are blank for head node, as these refer to text outside of any element, before and after the title element. indentation b/w elements is counted as "text" node, even though this may not be visible. children for head is just one: <title> element

>> document.body => This refers to "body" property of object "document". body is an object as well as property of document object. This prints the body element and everything inside it. The body node has 3 child nodes: #text, <h1> and #text. #text are blank for body node. children for body is just one: <h1> element

>> document.documentElement => This refers to "html" property of object "document". html is an object as well as property of document object. This prints the html element and everything inside it. The html node has 3 child nodes: <head>, #text and <body>. #text is blank for html node. children for html are 2 nodes: <head> and <body> elements. There is no "document.html", instead "document.documentElement" used.

>> document.title => this prints the text inside the title element: "Learning the DOM". Note that "document.head.title" doesn't work, as title is not a property of head.

>> document.h1 => this is undefined. Similarly document.body.h1 is undefined

>> document.body.nodeType => This gives the node type of the <body> element, which is an "element" type. The "element" node type has value=1. We can also print nodeValue (returns "null" as it has no text value to return) or nodeName (returns "BODY").

There are many more properties to traverse DOM tree, add/delete nodes, and modify the DOM itself. 

Methods: Above we called various properties for given objects. We can also call methods instead of properties. Few imp methods listed below:

0. console.log( ): This can be used to print anything out on the console. This is useful during debug:

ex: >> console.log("Aamir"); => prints string "Aamir"

ex: >> console.log(var1); => This looks for a var named "var1" and if found, prints the value of that var.

These are the 5 ways to access HTML elements in the DOM — by ID, by class, by HTML tag name, and by CSS selector (either the first match or all matches).

1. getElementById( ): This gets the single element in the document whose id matches the given id (an id is meant to be unique within a document).

ex: <a id="nav" href="/index.html">  Home </a>

>> document.getElementById('nav') => this returns the whole <a> element, i.e. <a> .... </a>. We can put double quotes also around "nav" here, but since double quotes are used in html itself, we use single quotes here so that this code can work when it's embedded within html (else the html parser will mistake the double quotes in here for the ending double quotes of the html attribute and give erroneous results).

>> let navLink = document.getElementById('nav'); => here we assigned a var to store the <a> element. Just typing "navLink" on console, prints the entire <a> element

>> navLink.href = 'https://www.wikipedia.org'; => We can modify href attribute via this. On the webpage, we will see href change to this new value.

>> navLink.textContent = 'Navigate to Wikipedia'; => We can modify the textContent property via this. On the webpage, we will see the link text change to this new value.

2. getElementsByClassName ( ): This returns an HTML collection of all elements in the document whose class name matches the given name. Since there may be many elements matching the class name, this returns an array-like collection, so we need to use an index number to access each individual entry.

>> const demoClass = document.getElementsByClassName('demo'); => here demoClass is an array-like collection, as there may be various instances with class = "demo"

>> demoClass[0].style.border = '1px solid orange'; => NOTE: this changes values for index 0 of the demoClass var, implying changing the value for the 1st occurrence of class = demo.

3. getElementsByTagName( ): This returns an HTML collection of elements with the given tag name. This also returns an array, similar to getElementsByClassName ( ).

4. querySelector ( ): This returns a single element matching the given class, id, tag, etc in CSS selector format.

>> const demoClass = document.querySelector('.demo'); => This is same as 2 above, except that it returns only first matching element. NOTE the dot in .demo, which indicates class=demo

5. querySelectorAll ( ): This returns all elements matching the given class, id, tag, etc in CSS selector format.

>> const demoClass = document.querySelectorAll('.demo'); => This is same as 2 above, except it returns all matching elements in CSS selector format. We can visit all matching elements by using the method forEach(), as in the sketch below.
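
ex (a sketch, assuming the page has elements with class="demo"):

const demoAll = document.querySelectorAll('.demo');
demoAll.forEach(el => {
    el.style.border = '1px solid orange';  // applied to every match, not just the first
});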
 

Traversing the DOM tree: We can traverse the tree by using these parent, child and sibling properties.

I. parent node: We can get parent of any node by using parentNode property of that object.

>>  let navLink = document.getElementById('nav') => gets the element <a> in var "navLink"

>> navLink.parentNode =>This returns the parent of <a> which is <body> node

>> navLink.parentNode.parentNode => this returns the parent of parent of <a> which is <html> node

II. children node: We can get children nodes of any node by using childNodes property of that object. There are many other properties also available. The array of child nodes or elements can be traversed by using "for ... of" loop.

>> navLink.childNodes => This returns all the child nodes of <a>, which is a #text node. Usually it returns an array or list, as there are multiple child nodes.

III. sibling node: The siblings of a node are any nodes on the same tree level in the DOM. Siblings do not have to be the same type of node - text, element, and comment nodes can all be siblings. There are various sibling properties such as previousSibling, nextSibling, etc., as in the sketch below.
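
A small sketch of these properties (assumes the page has an <h1> element):

const h1 = document.querySelector('h1');
console.log(h1.previousSibling);      // often a blank #text node (indentation)
console.log(h1.nextSibling);          // the node right after <h1> on the same level
console.log(h1.parentNode.children);  // element children only (no #text nodes)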

Changing the DOM: A major purpose of JavaScript cmds is to modify the DOM tree. We can do this using various methods:

1. createElement ( ): Creates a new element node. Then we can add text to the element by using textContent property.

>> const paragraph = document.createElement('p'); => creates new element <p>, which creates <p></p>

>> paragraph.textContent = "I'm a brand new paragraph."; => this adds text to the above element. So the paragraph var is now "<p>I'm a brand new paragraph.</p>"

2. appendChild ( ): Adds a node as the last child of a parent element. insertBefore, replaceChild, removeChild are other methods available.
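
Continuing the createElement example above, a sketch of appendChild:

const paragraph = document.createElement('p');
paragraph.textContent = "I'm a brand new paragraph.";
document.body.appendChild(paragraph);   // <p> becomes the last child of <body>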

There are various other methods available to alter the DOM by modifying styles, classes, and other attributes of HTML element nodes. 
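A short sketch of a few of those (assumes a <p> element exists on the page):

const p = document.querySelector('p');
p.setAttribute('id', 'intro');      // set an attribute
p.classList.add('highlight');       // add a CSS class
p.style.color = 'blue';             // set an inline style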

Events: The above methods change the DOM by manually applying changes. However, we also have methods that get triggered on certain events, initiated either by the user or by the browser itself. Then the webpage becomes responsive to user actions, which makes the page very interactive.

When a user clicks a button or presses a key, an event is fired. These are called a click event or a keypress event, respectively.

An event handler is a JavaScript function that runs when an event fires.

An event listener attaches a responsive interface to an element, which allows that particular element to wait and “listen” for the given event to fire.

There are three ways to assign events to elements:

  • Inline event handlers
  • Event handler properties
  • Event listeners

1. Inline event Handlers: These should not be used.

 2. addEventListener(): It listens for an event, and performs an action when the event happens

ex: <button id="my_button">Change Background Color</button>

>>  let my_but = document.getElementById('my_button'); => this returns "button" element above, and assigns it to var "my_but"

>> my_but.addEventListener('click', () => { => This adds an event listener on this my_but button element, so that when it's clicked, the background color of the whole body is changed to red.
>>   document.body.style.backgroundColor = 'red';
>> });

 -----------------------------------------------

 

 

SDC Clock cmds:

Here are some of the imp clock cmds that are part of the SDC spec. Many of these are used in synthesis also.


create_clock:

creates a clock on the specified source object (usually clk ports, but may be pins or nets too. When a net is used as the source, the first driver pin of the net is the actual source used in creating the clk. If a clk is defined on an internal pin, maybe because it's a PLL or osc o/p inside the block, we may see a warning in PT (UITE-130 warning), since "create_generated_clock" is the cmd for clks on internal pins/nets. When "create_clock" is used on an internal pin, any upstream timing info is lost for that clock, i.e. any logic before that pin is not considered for timing. The clk starts at time 0 from that pin, so we may need to add latency for this clk to account for extra delay before this pin). If no source object is specified, then a virtual clk is created with the given name. A virtual clk is usually used to rep an off-chip clk for input/output delay specs (i.e. set_input_delay, set_output_delay); see the sketch below. For each clk specified, by default, a new path group is created for that clk. A path group is all the paths in the design that have that clk as the endpoint. This is used in the cost calc func for optimization. See the "group_path" cmd for more info.
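
A sketch of a virtual clk used for I/O delay constraints (all names and values are illustrative):

# no source object given => virtual clk, exists only as a reference waveform
create_clock -name v_sys_clk -period 10
# constrain i/p and o/p ports relative to this off-chip clk
set_input_delay  2.5 -clock v_sys_clk [get_ports data_in]
set_output_delay 3.0 -clock v_sys_clk [get_ports data_out]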

This new clk has ideal latency (zero delay at the source of the clk), 0 transition time (i.e. a slope of 90 degrees for rise/fall) and 0 propagated delay. That means that even if there are muxes, gates, clk gaters etc in the design, they are all assumed ideal with 0 delay. To enable propagated delay, use the "set_propagated_clock" cmd. To enable latency, use the "set_clock_latency" cmd. To enable transition, use the "set_clock_transition" cmd. The "set_clock_uncertainty" cmd is used to account for clk edge variation from cycle to cycle. This, in a way, makes the clk more pessimistic, and we have to meet tighter timing requirements. These 4 cmds are detailed below later.

syntax: create_clock -name <clk_name> -period <period_value> <other_options> <src_pin_or_port>

<other_options>:

  • -add: sometimes multiple clks are specified on the same src for simultaneous analysis with multiple clks. In such cases, using -add adds the clock on that source on top of whatever clks were already specified on that src. W/o this option, the new clk would overwrite the other clocks.
  • -waveform {1st_rising_edge_after_time_0 1st_falling_edge_after_time_0 2nd_rising_edge_after_time_0 ... }. There must be even num of edges. W/o this option, rise is assumed at 0, and fall at period/2.
    • ex: create_clock -name "spi_clk" -period 50 -waveform { 2 27 } [get_ports spi_clk] => 20 MHz clk (period is in the specified lib time units), rising edge at 2ns and falling edge at 27ns. To get an inverted clock which starts high (falls at 2ns, rises at 27ns), we should do "-waveform {27 52}" (edge times must be listed in increasing order; the 52 wraps around to 2 in the next period). -waveform {2 25 40 45} indicates 2 high pulses in 1 period (goes high at 2, low at 25, then goes high at 40, low at 45).

ex: create_clock -name "clkm" -period 50 {cla/Z clb/Y} => creates a clk named clkm, with multiple sources "cla/Z, clb/Y", rise at 0, fall at 25; equiv to -waveform {0 25}. This is different than "-add", where multiple clks were defined on the same source. Here, it's the same clk defined on multiple sources. Basically, it's saying that the same clk waveform exists on both pins. They are connected externally, but here they appear as 2 separate pins. We define the same clk on both these ports to indicate they have the same waveform.



create_generated_clock:

Creates a generated clk which is synchronous with another clock. The syntax is almost the same as that of create_clock. Here we specify the gen_clk along with its master clk, which is the source of this gen_clk.

So, these 2 options are required:

  • gen_clk and on which pin/port/net it is on - i.e the name of the port, pin, etc where this gen clk is defined. PT starts the gen clk from this point onwards.
  • what is the source master pin (aka src pin) for this gen_clk? This is specified via the -source option. There's a different "-master_clock" option, which specifies a clk and NOT a pin. See below for details.

Clks start from their origin, and by default, keep flowing thru combo logic until they hit seq pins, where they stop. A clk stops on a sequential pin for an obvious reason - clks usually don't go thru flops, as the clk pin of a flop is the final destination (sink point) for a clk. This becomes an issue when we want to have divided clks. Divided clks are mostly generated at the o/p of a flop. To make this work, sdc allows us to use this cmd. We define the gen_clk on the o/p pin of this flop.

If the gen_clk is on the o/p pin of a flop, then the timing tool cannot trace the clk back without help. Here we help the tool by defining the src clk of this gen clk. -source is used to specify the master or source of the gen_clk, which is generally the clk pin of the seq logic. The tool starts tracing the path from the src clk going thru combo logic (assuming the src clk is defined a little further up from the clk pin of the flop). When it comes to the flop clk pin, it needs to see a path from the input clk to the generated clk, meaning a timing arc has to exist between these 2 pins (i.e. for a flop, it sees the arc from clk pin to Q pin [CLK->Q arc, i.e. Q pin has "related pin" as CLK]). If it sees that arc, it continues and is able to calc the clk latency for the generated clk from the src clk. If there's no arc from CLK->Q pin, then we can't have the CLK pin as the source for a clk on the Q pin of the flop, as PT can't calc delay w/o any timing arc (we get a PTE-075 error; see below for details). The src clk itself may be a generated clk, which eventually will lead to a real clk created using the "create_clock" cmd.

Also, the tool doesn't know the logic, to figure out the waveform relation between the clk pin and o/p pin of the seq logic. So, we need to define the relationship too. A gen_clk can either be div_by, mul_by or the same clk as the master (i.e. div_by 1, specified by using -combinational), or an edge derived clk. A gen clk can only have edges corresponding to src clk edges, i.e. it's not possible for a gen_clk to have an edge at 1.5ns when the src clk is changing edges only at 1ns increments. That is why mul_by or div_by can only be integer numbers and can't be decimals. In short, a generated clk edge must always come from some edge of the src clk. We do not define a period for the gen_clk, as it's deduced automatically from these options.

NOTE: Although a gen_clk can be specified on any port/pin/net, we should generally specify it on a driving pin. When we define it on a net, the tool automatically uses the driver pin of the net as the src pin of the clk. We should avoid defining it on a module boundary pin, since that causes the net to be segmented, so xtalk can't be calc'ed on that net, resulting in 0 delta delay for that net (PT issues a UITE-136 warning). A tcl procedure can be used to get the driving pin for any net or port of a block:

proc get_driving_pin {pin} {
    return [get_pins -leaf -of [all_connected [get_pins $pin]] \
        -filter {pin_direction =~ *out}]
}

create_clock ... [get_driving_pin CLKGEN/CLKOUT] => preferred way

create_clock ... [get_pins CLKGEN/CLKOUT] => not recommended since the hierarchical pin name may change from one synthesis to another.

syntax: create_generated_clock -name <clk_name> -source <master_pin> -divide_by/multiply_by/combinational <other_options> <gen_clk_pin_or_port>

<other_options>:

  • -name => name of clock,
  • -divide_by/-multiply_by => divides the frequency of the src clk by that number (-multiply_by multiplies the freq; -duty_cycle is used to specify the duty cycle for multiplied clks). These can only be integers, as explained above. We don't specify a period or frequency for the generated clk, as it's derived via this option.
    • -combinational => This is a special option for "-divide_by 1" gen_clks. It tells the tool to include only combo logic when calc'ing the delay from the master clk to the gen_clk, and not to include delay thru latch data pins, flop clk pins or src pins of other gen_clks. This is done when logically the gen clk is not really a divided clk, but we define a new gen clk anyway for some other STA reason.
  • -source <master_pin> => Specifies the name of the source master pin from which the new clock is to be derived (delays for the gen clk are computed using the source clk. A path should exist from the src clk to the gen clk, or else the tool will report a PTE-075 error). If multiple clks exist on the master src pin, then use the "-add" and "-master_clock" (or -master in short form) options to add more gen clks on the same pin/port (see the sketch after this list). If there are 10 clks specified on the master source pin for this gen clk, then there must be 10 gen_clks defined, 1 for each src master clk, with the -add option and the -master_clock option indicating the master clk for that gen clk. The -add option is required, else existing gen clks on that pin will be replaced with the latest clk defined. It's 10 diff gen_clks (since 10 diff names of gen clks), each with its own clock waveform on it. A lot of bogus launch/capture paths will show up when we define multiple clks on the same src pin, so use FP (or set_clock_groups -exclusive) to remove such paths. If we want to define just 1 gen clk (instead of 10) on the gen pin, then we should choose the src master clk that we want to use as the parent of this gen_clk. In this case, the tool will ignore the other 9 clks and won't have any paths from these other 9 clks to the gen clk.
    • NOTE: -source specifies a pin, while -master_clock specifies a "clock" in the design and not a pin (as the same pin may have multiple clks, since there may be a mux before the source pin). If we define the gen_clk source to be before all the muxes in the clk path, then no -master_clock is needed, since there is usually a single clk defined on the input port. For single clk cases, just defining with "-source <master_pin>" suffices, and the tool is able to figure out the clk on that pin. However, if we do use -master_clock in such a case, it's harmless, since it will anyway trace to that clk. It's a good practice to use -master for all gen_clks, irrespective of how many clks are on the master pin. If we don't specify a -master_clock when there are multiple clks in the path of the CGC, then it seems like PT picks up one of the clks as the master (we can see that when we do report_clocks). So, always specify a -master when having multiple clks on the source.
    • NOTE: It is always safe to use a port name, the o/p pin of a gate, or a module pin name as the -source of a generated clock. Do not use an internal net name, as timing tools (such as Cadence VDIO) may just ignore such a clock. Also, if you want to use a clkname as the source, use "-master_clock <source_clock_name>" instead of -source, else VDIO may ignore it.
    • One more subtle issue on how -source is treated for generated clks. If we define a gen_clk on the o/p of a mux for multiple clks coming into the mux, then the behaviour of the -source option causes PT to take paths that we don't expect. This is because -source is only used to determine the master clk traced back thru the src pin (it doesn't force the path to go thru that -source pin). So, when we define a gen_clk at the o/p of a mux, it considers all possible paths from the master_clk of that gen clk. To prevent this, we should define the gen_clk directly on the o/p of the divider, so that the gen_clk only sees one path thru that master clk. See the article below for details =>
      • See this solvnet article (link missing as of June, 2025, filed ticket): https://solvnetplus.synopsys.com/s/article/Specifying-MUXed-Clocks-in-PrimeTime-1576002495523
        • From above link: The -source option of the create_generated_clock command determines the identity of the master clock, and the sense (non-inverted or inverted) of that master clock. It does not steer the source latency path. Once the master clock's identity and sense are determined, all possible paths back to that clock's source are considered for the source latency path.
      • A consequence of the above is that we will have multiple clk paths from the master clk to the generated clks. All of the paths will be part of the clk tree n/w and have the attr "is_gclock_source_network_pin true". This attr is only set for pins in the path between the master and generated clk, and is otherwise false for everything else (i.e. the clk path beyond the generated clk defn point doesn't have this attr set). The launch and capture paths will be chosen amongst these possible paths based on which gives the worst timing for setup and hold. This may not be real, so we need to declare them as async clks so that such paths aren't traced. The other consequence is that the attribute "clocks" and the attr "is_clock_network" are set to false for part of the logic between the seq cell and the point of defn of the generated clk. This is because clk tracing stops passing these attrs once it encounters a sequential cell in the path. This may seem confusing, as we rely on the "is_clock_network" attr to trace the clk tree. To avoid all these issues, we should always define divided clks or gen clks right at the o/p pin of the flop. This resolves the issue of multiple bogus paths being traced by PT, and also the issue of attrs set to false.
  • -invert => inverts the gen clk (only in case of div_by or mul_by). This is needed in cases where there is an inverter on the path from the master clk to the gen clk, which causes the polarity of the gen_clk to be opposite to that of the master clk.
    • Warning: Not using the -invert option when the design has an inverter b/w the source and gen clk will cause the warning "no src latency defined for gen_clk". See below for details.
  • -edges {1 2 4 5 7} -edge_shift {5 10 5 10 5} => This option is provided for generated clocks to change the edges of the gen clk, e.g. for a 25-75 duty cycle. The numbers represent the edges of the source clock that are to form the edges of the generated clock. The number of edges must be odd and >=3 to make 1 full clk cycle. edge_shift represents the amount of shift that the specified edges undergo to yield the final gen clk. These shifts are logical shifts. Delay due to gate latency will automatically be handled by the tool. Usually, edge_shifts are not needed, but may be useful to model clks which are not integer divided. NOTE: no "-waveform" is used for gen clks, as waveforms are automatically derived based on the src clk or by using -edges. Also, the num of edges needs to be odd (at least 3: for the 1st rise, 1st fall and 2nd rise), while for -waveform it needs to be even (at least 2: for the 1st rise and 1st fall).
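
A sketch of the -add/-master_clock usage from the list above — 2 clks reach the same divider flop thru a mux, so we define 1 gen clk per master (all instance and clk names are illustrative):

# func_clk and scan_clk both reach the divider flop via the mux clkmux
create_generated_clock -name gen_func -divide_by 2 -add \
    -source [get_pins clkmux/Z] -master_clock func_clk [get_pins clkdiv_reg/Q]
create_generated_clock -name gen_scan -divide_by 2 -add \
    -source [get_pins clkmux/Z] -master_clock scan_clk [get_pins clkdiv_reg/Q]
# remove the bogus launch/capture paths b/w the 2 gen clks
set_clock_groups -logically_exclusive -group {gen_func} -group {gen_scan}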


ex: create_generated_clock -name "reg_clk" -divide_by 1 -source [get_ports clock_12m_port] [get_pins clk_rst_gen/reg_clk_mux/clock_out] => Creates  a new clock signal from the clock waveform of a given pin in the design, and binds it with the pins or hierarchical pins in the <target_pin_list> argument. Here target_pin_list is clk_rst_gen/reg_clk_mux/clock_out. Here gen_clk is same as master_clk. There's just a mux in b/w to select clk_12m_port (normal mode) or scan_clk_port (scan mode).

ex: create_generated_clock -name "reg_clk" -edges {1 7 13} -source [get_ports clock_12m] [get_pins clkgen/reg_clk_mux/clock_out] => Same as above, except we are using -edges here. creates "reg_clk" from src_clk clock_12m using 1st edge (rising or falling) of src_clk as 1st rising edge of gen_clk, 7th edge (rising or falling) of src_clk as 1st falling edge of gen_clk and 13th edge (rising or falling) of src_clk as 2nd rising edge of gen_clk => that completes 1 clk cycle of gen_clk. gen_clk is high for 3 clk cycles of src_clk and low for next 3 clk cycles of src_clk, so effectively it's a divide by 6 clock (could have used -divide_by 6 also). Atleast 3 edges need to be specified to form 1 cycle.  we use this edge based way when we want to have asymmetric clk where high and low time are different.

ex: create_generated_clock -edges {1 1 3} -edge_shift {0 2.5 0} -invert -source [get_pins CLK] {u_COR_FRQ/.../CLKOUT} => This creates a falling edge pulse (since inverted) of 2.5 units triggered by the rising edge of its master clock, with a period of 10 (since 1st rising and 1st falling are both from the same rising edge of the source clk, with a shift of 2.5 time units). The gen clk object may be a list of port/pin/net (if a net is used, then the first driver pin of the net is the actual source used in creating the generated clock). Here CLKOUT is a net, so its driver pin is used as the gen clk object. -invert is used since there's an inverter in the design b/w the src master clk and the gen clk.

WARNING: We should also watch out for the warning "no src latency defined for gen_clk" as this may indicate that the clk is not traced back to the master clk. I've seen this happening with a gen_clk which was inverted in the design, but I didn't use the -invert option. This caused the gen_clk not to be traced back, as the tool was looking for a rise->rise or fall->fall arc based on the clk definition, but instead got a rise->fall or fall->rise path in the design. It couldn't find a rise->rise or fall->fall arc in the inverter and so just dropped the path before the gen clk pin. Gen_clk started from the source pin (i.e Q pin of flop) with 0ns delay and continued forward. This is incorrect, as all setup/hold times calculated will be wrong if the gen_clk is not traced back to include src latency.

NOTE: create_clock and create_generated_clock are the 2 most imp cmds for creating clks during Synthesis as well as STA. All the clks in the design need to be defined using these 2 cmds. Whenever we define a generated clk on any pin, the clk from that point onward is this generated clk. i.e, let's say we define a gen clk on the o/p pin of a clk gater; then the regular clk that was propagating thru the clk gater (since it's a clk gater, PT can propagate clks thru it) will not propagate any more beyond the o/p pin of that clk gater. On any clk path before the o/p pin of the clk gater, we'll see regular clocks, but on the o/p pin of the clk gater, we'll see the regular clock + our newly defined gen clk, and on any pin coming after this clk gater pin, we will only see the gen clk (and not any of the regular clocks). So, the o/p pin of the clk gater where we defined our gen clk is kind of odd in that it reports both clocks. We should not look at this pin, as it may confuse us, but always look at pins in the fanout of this clk gater o/p.

  • From Synopsys solvnet: When a generated clock is created at a pin, all other clocks arriving at that pin are blocked unless they too have generated clock versions created at that pin.
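As a sketch of that (instance/clk names hypothetical), if both func_clk and scan_clk reach the clk gater cg0, we'd create a div-by-1 gen clk version of each at its o/p pin so that neither gets blocked:

ex: create_generated_clock -name gated_func_clk -divide_by 1 -source [get_pins cg0/CK] -master_clock func_clk [get_pins cg0/GCLK]
ex: create_generated_clock -name gated_scan_clk -divide_by 1 -source [get_pins cg0/CK] -master_clock scan_clk -add [get_pins cg0/GCLK] => -add keeps the 1st gen clk, instead of overwriting it.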

 

CLK on Ports/Pads of Chip:

Most of the times, pads on chip are bidirectional. They are set as "input only" pads by setting the "OE" (Output Enable) line of the Pad to "zero", or as "output only" pads by setting the "IE" (Input Enable) line of the Pad to "zero". These pads are designed as bidirectional so that they may be used for multiple functions. For ex, in scan mode, we may want to use the pad as scanclk (as an input pad), while in functional mode, we may want to drive data out on the pin. In this case, the pad being bidirectional helps, and we can make it input or output by setting a bit via software.

The bidir PAD has 3 pins. One is the bidir PAD (seen as a pin on chip by the outside world). The other 2 pins are the input pin (IN) and output pin (OUT) inside the chip. The i/p IN pin takes incoming data from the PAD and passes it to the chip. The o/p OUT pin takes data generated from the logic on the chip, and passes it to the PAD.

Let's say we have a case, where the PAD acts as the clk. There are 3 cases possible:

  1. Chip as a Driver of Clk: Here the clk is driven from the chip to the outside world. We have a create_clock/create_generated_clock on some internal node of the logic driving this clk. This clk will flow to the PAD as long as there is a combo path from the internal clk to the PAD. The clk will flow from the OUT pin to the PAD. There's no need to define any separate CC or CGC on the PAD. If we report the clock attribute on the PAD, we'll see the clk on the PAD, meaning it propagated correctly.
  2. Chip as the Receiver of Clk: Here the clk is driven from the outside world to the chip. We have a create_clock cmd on the PAD to generate a clk. Now this clk will flow from the PAD to the IN pin and then into the internal logic. We need to have a CC cmd on the PAD, else there will be no clk on the PAD.
  3. Chip as both Driver and Receiver of Clk: This is the weird case, where we want the clk to be both coming in and going out of the chip. You may wonder, if we have case 1 above, will the clk going out from the chip to the PAD come back into the chip? There are 2 timing arcs => one from the OUT pin to the PAD, and the other from the PAD to the IN pin. Usually there's no timing arc from the OUT to IN pin in the .lib of the PAD cell. It's possible that a timing path takes the route from OUT to PAD and then from PAD to IN, thus creating a path from the OUT to IN pin. Checking in PT, this does NOT seem to be the case. If you report the clock attribute on the OUT pin and the PAD, you will see the clock attr, but you will see no clock attribute on the IN pin. There will also be no timing path from the OUT to IN pin. The reason is that the path breaks at the bidirectional pin/port. If we want the clk to propagate from the OUT pin to the PAD to the IN pin, then we have to do a create_generated_clock on the PAD, with the "source" pin as the OUT pin or some other pin on the internal logic of the OUT pin path. Then the tool sees the generated clock on the PAD (in addition to whatever clocks were previously there on the PAD), and this newly generated clock is now seen on the IN pin of the PAD. This can be checked via the "clocks" attribute on the IN pin.
    1. NOTE: Remember to do this CGC on the PAD if you do want to loop back a clock from the OUT to IN pin. This is sometimes needed, as there are many high speed interfaces which have a loop back clk or data path for debug purposes (see sketch below).
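A minimal sketch of such a loop back constraint (pad instance and port names hypothetical):

ex: create_generated_clock -name pad_loop_clk -divide_by 1 -source [get_pins u_pad/OUT] [get_ports CLK_PAD] => the clk arriving at the OUT pin is re-declared as a gen clk on the bidir PAD port, so it now propagates from the PAD to the IN pin and back into the chip. Verify via the "clocks" attr on the IN pin.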

We have associated remove_*, report_*, get_* cmds for clks.

  • remove_clock <clk_list> => It takes a list or collection containing clks or patterns, and removes them from the design (only ones that were previously created using the "create_clock" cmd). "remove_clock -all" removes all clocks from the design. All associated input/output delays wrt these clks are also removed. If there was a path group with only these clks in it, then those path groups are also removed. NOTE: If you are in PT_SHELL, and use this cmd to remove a clk, and then add the same clk back with any changes, then all previous clk grps are gone, and since we don't define any new clk grps, this clk will be synchronous to all existing clks. So, be sure to re-add appr clk grps, etc.
    • ex: remove_clock clk1 => removes previously defined clk1 from design.
  • remove_generated_clock <clk_list> => It's similar to remove_clock except that it applies to generated clks only. "remove_generated_clock -all" removes all generated clocks from design.
  • report_clock => always run this cmd after creating all clks to see all clks (incl generated_clks), their period, waveform, attribute (i.e generated/propagated) and their source(port or pin where it's defined). For generated clk, it shows additional info as generated src pin, master clk, master src pin(or port) and waveform modification (i.e div by 1 , combinational, etc). To see details of a specific clk, just provide the name of the clock. If no clk name provided, all clks shown
    • ex: report_clock -nosplit => Most used cmd when all clk creation is done. This dumps all clk info for all clks in design. -nosplit prevents line splitting so that it's easier to read and parse the file by scripts.
    • ex: report_clock [get_clocks clk1] => provides details of clk1 (i.e it's master_clk, src_pin, period, etc)
    • ex: report_clock -skew [get_clocks clk1] => Using option "-skew" reports clk uncertainty for the given clk for all edges of the clk. So, if we used the 24 clk uncertainty values as shown below, we'll see all those values here (including any other clk to which we defined interclk uncertainty wrt this clk). This cmd also reports src clk latency for the given clk under both setup and hold conditions (for early and late arrivals).
  • get_clocks => Just like other get_* cmds, it returns a collection of clk objects that can be used within the tool for further processing.
  • get_clock_network_objects => returns a collection of clock network objects of certain type (i.e cells, nets, etc) that belong or relate to one or several clock domains. A clock network is a special logic part of the design that propagates the clocks from the clock sources to the clock pins of latches, flip-flops (that function as anything but propagating clocks) or black-box IPs. The propagation also stops at design output ports, dangling pins or nets, or the sources of other clocks (i.e src pin of generated clk). So, basically this cmd retrieves certain types of objects from the direct clock network (including the latches, flip-flops and black-box IPs driven by the clock network). If you specify to retrieve objects from some master clk n/w, then this stops at generated clk, as that's considered part of other clk n/w. Similarly, if you specify to retrieve objects from some generated clk n/w, then this doesn't trace back to src of that generated clk, as that's considered part of other clk n/w.
    • ex: get_clock_network_objects -type cell -include_clock_gating_network [all_clocks] => This reports all cells (buffers, inverters, combo cells, etc) on all clks in the design. By default, if the clk list is not specified, then all clks in the design are considered (so [all_clocks] is not strictly required). Option -include_clock_gating_network causes clk gating cells and their fanout to be included in the clk n/w. Otherwise, a clk gating cell is a latch, and is considered an endpoint for determining the clk n/w.

 




set clock uncertainty => For clk uncertainty details, see the "clock tree" section. We can specify clk uncertainty (amount of variation in arrival times of successive clock edges, in library time units) b/w edges of same clk, or b/w edges of 2 diff clks. Usually it's set for edges of same clk, both for setup and hold. During synthesis, clk uncertainty is used to model skew which is going to occur during CTS in PnR. It's equiv to reducing clk period by skew amount.

syntax: set_clock_uncertainty <options> <uncertainty_number> <object_list>=> The object list is either a list of clocks, ports, or pins. For a clock object, the uncertainty applies to capturing latches clocked by that clock. For a port or pin, the uncertainty applies to capturing latches whose clock pins are in the fanout of that port or pin. It's preferred to use get_clocks, get_ports or get_pins for object_list, but we can use the patterns directly too. Clock uncertainty number is usually +ve (-ve numbers not encouraged). 

options:

-setup/-hold => By default uncertainty applies to both setup and hold checks, unless we specifically say -setup or -hold only.

-from/-rise_from/-fall_from <src_clock>, -to/-rise_to/-fall_to <dest_clock> => Specifies the source and destination clock for clk uncertainty for specified edges of the clock. For 1 clock, src and dest clk are the same, while for 2 clks, src and dest clks are different. We need these options for a single clk when we specify uncertainty for half cycle paths. We can use these options for full cycle paths too.

We can specify interclock uncertainty or simple uncertainty. These are mutually exclusive. For same clk, we can't use both options below.

  • for simple uncertainty: Here we specify uncertainty b/w edges of the same clk. These are all possible combos of simple uncertainty for a given clk (6 cmds shown below, covering all 8 edge combinations):
    • SETUP clk uncertainty:
      • clk1 -> clk1 Full cycle setup uncertainty (R->R, F->F) => Here clk jitter, but no clk dcd.  
        • ex: set_clock_uncertainty -setup 0.8 clk1 (this takes care of all cases, R->R, F->F, R->F, F->R. However, R->F and F->R are overwritten by next 2 cmds). We could also write 2 separate cmds for r->R and F->F as we did in inter clock uncertainty.
      • clk1 -> clk1 Half cycle setup uncertainty (R->F, F->R)         => Here both clk jitter and clk dcd (since it's half cycle).
        • ex: set_clock_uncertainty -setup 1.2 -rise_from clk1 -fall_to clk1
        • ex: set_clock_uncertainty -setup 1.2 -fall_from clk1 -rise_to clk1
    • HOLD clk uncertainty:
      • clk1 -> clk1 Full cycle hold uncertainty (R->R, F->F) => Here no clk jitter, and no clk dcd as it's a 0 cycle path. However, a small hold value is provided as margin. 
        • ex: set_clock_uncertainty -hold 0.05 clk1 (this takes care of all cases, R->R, F->F, R->F, F->R. However, R->F and F->R are overwritten by next 2 cmds)
      • clk1 -> clk1 Half cycle hold uncertainty (R->F, F->R)         => Here both clk jitter and clk dcd (since it's half cycle). Hold half cycle jitter and dcd are very similar to setup half cycle jitter and dcd values, since the paths are the same.
        • ex: set_clock_uncertainty -hold 1.21 -rise_from clk1 -fall_to clk1
        • ex: set_clock_uncertainty -hold 1.21 -fall_from clk1 -rise_to clk1
  • for interclock uncertainty: This is used to specify uncertainty b/w edges of different clks which are async to each other. Most of the times, paths b/w 2 unrelated clks are declared as false paths. However, if we really want to time such paths, we can use this cmd to set uncertainty b/w the 2 clks. The worst uncertainty of the 2 clks is assigned as the interclock uncertainty (as that's the worst deviation of a clk edge. Adding the uncertainties from the 2 clocks is not right, as one of the edges is going to be fixed). Interclock uncertainty is direction specific. So, to apply clk uncertainty in the other direction, we have to specify the clk uncertainty cmd in the other direction too. These are all possible combos of interclock uncertainty for a given clk pair (8 uncertainty values provided as shown below for clk1->clk2). Similarly we've 8 such uncertainty values for clk2->clk1 (they are usually the same values, as the clk paths remain the same).
    • SETUP clk uncertainty from clk1 to clk2:
      • clk1 -> clk2 Full cycle setup uncertainty (R->R, F->F) => Here clk jitter, but no clk dcd, as it's assumed that rise (fall) from one clk to rise (fall) of the other clk is a full cycle path. That may not be true if a clk is inverted or some other relationship exists that makes rise to rise (or fall to fall) a half cycle path.
        • ex: set_clock_uncertainty -setup 0.8 -rise_from clk1 -rise_to clk2
        • ex: set_clock_uncertainty -setup 0.8 -fall_from clk1 -fall_to clk2
      • clk1 -> clk2 Half cycle setup uncertainty (R->F, F->R) => Here both clk jitter and clk dcd (since it's half cycle). again it's assumed that R->F or F->R implies half cycle path (may not be true)
        • ex: set_clock_uncertainty -setup 1.2 -rise_from clk1 -fall_to clk2
        • ex: set_clock_uncertainty -setup 1.2 -fall_from clk1 -rise_to clk2
    • HOLD clk uncertainty from clk1 to clk2:
      • clk1 -> clk2 Full cycle hold uncertainty (R->R, F->F) => Here no clk jitter, and no clk dcd as it's assumed to be a 0 cycle path. However, a small hold value is provided as margin. 
        • ex: set_clock_uncertainty -hold 0.05 -rise_from clk1 -rise_to clk2
        • ex: set_clock_uncertainty -hold 0.05 -fall_from clk1 -fall_to clk2
      • clk1 -> clk2 Half cycle hold uncertainty (R->F, F->R) => Here both clk jitter and clk dcd (since it's half cycle). Hold half cycle jitter and dcd very similar to setup half cycle jitter and dcd values since paths are the same.
        • ex: set_clock_uncertainty -hold 1.21 -rise_from clk1 -fall_to clk2
        • ex: set_clock_uncertainty -hold 1.21 -fall_from clk1 -rise_to clk2

 

So, in above ex we see that we need 8 uncertainty values for simple uncertainty and 16 uncertainty values for interclock uncertainty (8 for clk1->clk2 and 8 for clk2->clk1).

ex: set_clock_uncertainty 1.0  [get_clocks clkosc] => sets uncertainty of 1ns to clkosc successive rise or fall edges. uncertainty is same for setup runs and hold runs

ex: set_clock_uncertainty -setup 0.8 -hold 0.2  clkosc => here uncertainty is set differently for setup (0.8ns) and hold (0.2ns). Usually hold clk uncertainty is much lower as it's launch/capture on same edge.

ex: set_clock_uncertainty  -setup 1.3 -fall_from s_clk -fall_to d_clk => setup uncertainty is 1.3 from falling edge of s_clk to falling edge of d_clk. Most of the times, we also want to have same clk uncertainty in other direction too. So, we have to specify "set_clock_uncertainty  -setup 1.3 -fall_from d_clk -fall_to s_clk" too.
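Since interclock uncertainty is direction specific, a small Tcl loop (a sketch, using the clk names from the ex above) avoids writing each cmd twice:

foreach pair {{s_clk d_clk} {d_clk s_clk}} {
    set_clock_uncertainty -setup 1.3 -fall_from [get_clocks [lindex $pair 0]] -fall_to [get_clocks [lindex $pair 1]]
}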

report_clock_uncertainty => There's no such cmd. To see clk uncertainty, we have to use report_clock with the -skew option. See in the report_clock section above.

remove_clock_uncertainty => removes clk uncertainty set by the set_clock_uncertainty cmd. We use the same syntax as above. Note that interclock uncertainty can be removed by providing -from/-to, while simple uncertainty can be removed just by providing <object_list>. We can't remove clk uncertainty by providing a cmd with different syntax than the one that was used to set the clk uncertainty.

ex: remove_clock_uncertainty [all_clocks] => This removes simple uncertainty for all clocks

ex: remove_clock_uncertainty -from s_clk -to d_clk => This removes inter clk uncertainty that was set above. We don't have to use "-fall_from s_clk -fall_to d_clk" as this remove uncertainty cmd is a superset and removes uncertainty for all edges rise->rise and fall->fall. If we had uncertainty b/w rise->fall or fall->rise, then we would have to use the more targeted cmd: remove_clock_uncertainty -rise_from s_clk -fall_to d_clk

 



set_clock_transition -min -rise 0.4 [get_clocks SCLK] => specifies slew rate for ideal clocks (slew rates for gen clocks are calculated by tool). Here, it's 0.4ns rise time slew at min(fast) process corner.
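Usually both min and max analysis conditions get a transition; a sketch with made-up values:

ex: set_clock_transition -max 0.6 [all_clocks] => 0.6ns slew assumed at max (slow) corner for all ideal clks
ex: set_clock_transition -min 0.2 [all_clocks] => 0.2ns slew assumed at min (fast) corner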

 



set_clock_latency 0.4 -source -rise [get_clocks  SYSCLK] => rise latency of 0.4 for SYSCLK. This latency is used to model off-chip clk latency. clk network latency is internal to design and is still propagated as it should.
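Off-chip src latency is often bounded with early/late values; a sketch with made-up numbers:

ex: set_clock_latency -source -early 1.0 [get_clocks SYSCLK] => earliest the clk edge can arrive at the clk defn point
ex: set_clock_latency -source -late 1.4 [get_clocks SYSCLK] => latest arrival. The 0.4ns spread acts like extra skew b/w launch and capture paths.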

 




set_propagated_clock [all_clocks] => Specifies that delays be propagated through the clock network to determine latency at register clock pins. If not specified, ideal clocking is assumed. Ideal clocking means clock networks have a specified latency (from the set_clock_latency command), or zero latency by default.
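A typical flow sketch: clks stay ideal (with estimated latency) through synthesis, then switch to propagated once the real clk tree is built after CTS:

# pre-CTS: ideal clks, tree delay modeled via estimated latency
set_clock_latency 0.4 [all_clocks]
# post-CTS: real tree exists, so remove the estimate and propagate actual delays
remove_clock_latency [all_clocks]
set_propagated_clock [all_clocks]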

 




set_clock_groups (added in sdc 1.7): all create/gen clk cmds above create sync clocks, so all paths are analyzed b/w different clocks. To specify different behaviour of clks, use set_clock_groups.

Clks can be sync or async in terms of timing relationship. They can also be exclusive in terms of their functionality (i.e only 1 clk or set of clocks can be active at a time. An ex might be a mux on a clk path that selects 1 of 2 clocks to propagate; in this case the 2 clks are exclusive). Timing paths are analyzed only b/w sync clocks. Paths b/w clocks which are declared async or exclusive are considered equiv to declaring false paths b/w these clocks. So, no timing paths are analyzed b/w these clks. Within exclusive, we have 2 varieties: logically exclusive and physically exclusive. Logically exclusive is when clocks are logically exclusive, but physically they can still fire at the same time. An ex would be when there is a mux choosing b/w 2 clocks, so only 1 clk is active downstream. However, these 2 clks are active at the same time upstream of the mux. Physically exclusive is when these clocks can never physically interact with each other. In the above ex, if the mux is placed right at the i/p port of the chip, then these 2 clocks will have no interaction b/w each other. Thus physically exclusive is more restrictive than logically exclusive. The only diff b/w these 3 kinds of paths (async, logically exclusive, physically exclusive) is in how they are handled during crosstalk analysis (noise runs); otherwise for regular timing runs, they are all treated the same (i.e treated as false paths). For crosstalk analysis b/w clocks, depending on whether clocks are sync, async, logically exclusive or physically exclusive, we do different things:

  • sync: For sync clocks, clocks have finite timing window based on when they arrive, so they can couple with each other or other signals within that timing window. This is the default behaviour for all clocks. Timing paths are analyzed for all paths b/w sync clocks.
  • async: For async clocks, clocks are assumed to have infinite window relationship, so they can couple with each other at any time. Timing paths are not analyzed b/w async clocks, but they are still analyzed for everything else, so these coupling are considered.
  • logically exclusive: For logically exclusive clks, there might be coupling b/w clock, so crosstalk computation is done normally (i.e for sync, we have finite window while for async, we have infinite window)
  • physically exclusive: For physically exclusive clks, there is no coupling b/w clocks, so only one of these clks can be an aggressor or victim for noise purpose.

Usually we can just declare false paths b/w clks which are async or exclusive. This will work OK as long as we don't run any noise runs (i.e noise impact on timing, or noise bumps). For noise runs, we do need these clk group cmds, as "set_false_path" doesn't say anything about the relationship b/w clocks, so all clks will be considered sync, resulting in optimistic timing runs (as sync clks have a finite timing window, while in reality an async clk can fire at any time, and hence cause worst case coupling at the worst possible time).


ex: set_clock_groups -asynchronous -group {reg_clk} -group {spi_clk} => reg_clk and spi_clk are async, so don't analyze any path b/w these. equiv to declaring false paths b/w these.

ex: set_clock_groups -logically_exclusive -group {clk1 clk3} -group {clk2 clk4} => Here, clk1/clk2 are 2 i/p to the same mux, and clk3/clk4 are 2 i/p to another mux. If they share the same select signal, then we can use this style to say that the 2 groups are exclusive to each other (they don't interact with each other as they are muxed so only one of them is ON), so don't analyze any paths b/w these (paths that start in one group and end in the other group, i.e fp b/w clk1 and clk2/clk4, and fp b/w clk3 and clk2/clk4), equiv to declaring false paths b/w these. However, no relationship is specified b/w clk1 and clk3, or clk2 and clk4.

#set_clock_groups -physically_exclusive -group [get_clocks {clk1 clk22*}] -name grp1 => when we specify only 1 clk group, that implies this group (clk1, clk22*) is physically exclusive to all other clocks in the design. Here, a separate default "other" group is created for this single group. However, no relationship is specified b/w clk1 and clk22*, so these clks are sync to each other unless specified via a separate "set_clock_groups" cmd.

remove_clock_group => This is used to remove clk grp set by the above cmd. You must specify either the -exclusive (or -logically_exclusive or -physically_exclusive too depending on how clk grp was set) or -asynchronous option. You must specify either the name_list or -all option. -all removes all clk grps while providing a name removes only that clk grp.

  • remove_clock_group -exclusive -name grp1 => removes exclusive clk grp named grp1
  • remove_clock_group -asynchronous -all => removes all asynchronous clock groups from the current design

 

set_clock_exclusivity (SCE) => This is another way to set different clks exclusive to each other, which makes our job much easier. As we saw in the example of the mux above, all clks coming into a mux will be mutually exclusive to each other, provided the mux select lines are not dynamically changing. Previously, we had to define multiple gen_clks and then make them physically exclusive to each other. This cmd makes it easy for us by making all clks that traverse from the i/p to the o/p of the cell mutually exclusive to each other. We specify the cell, its o/p pin and the i/p pins whose clks we want to be exclusive. All clocks at the specified inputs are considered mutually physically exclusive clocks beyond the output. So, no Signal Integrity (SI, aka crosstalk) amongst these clks will be considered beyond the o/p pin of the exclusivity mux. SI impact will still be considered amongst these clks if the clks after the mux interact with clks before the mux.

This link has a link to presentation which goes over all the details of this cmd => https://solvnetplus.synopsys.com/s/article/PrimeTime-Automated-Clock-MUX-Constraints

Link to presentation  => https://solvnetplus.synopsys.com/apex/pdfViewer?ContentDocumentId=0694w00000Gu9stAAB

Automatic clk exclusivity: If we set the var "set timing_enable_auto_mux_clock_exclusivity true", PT will automatically put the SCE cmd on all simple muxes which have the "is_mux" cell attr set in .lib. For all other muxes, we'll have to write this cmd manually. To selectively disable SCE on a specific mux (after you have enabled SCE globally), use "set_disable_auto_mux_clock_exclusivity [get_pins MUX25/Z]".

syntax: set_clock_exclusivity -output <output_pin> [-type mux | user_defined] [-inputs <input_pin_list>] => If the cell is a Mux, we can use -type as mux, and then we don't need to define i/p pins. But for other non-mux cells as and gate etc, we need to use -type "user_defined" and specify all i/p pins.

ex: set_clock_exclusivity -type mux -output MUX25/Z => mux25 cell is defined as point of exclusivity

ex: set_clock_exclusivity -type user_defined -output AND3/Z -inputs {AND3/A AND3/B} => Here the AND3 cell is defined as the point of exclusivity. Since it's not a mux type, we had to specify the input pins explicitly.

IMP: We shouldn't have a gen_clk on the o/p pin of the mux or any path downstream, else the set_clock_exclusivity cmd won't work. Also, SCE works for cascaded muxes when we use the auto enable feature, since clk exclusivity propagates downstream. If we are manually specifying exclusivity, then we have to use the SCE cmd for each of the muxes all the way to the final mux of the cascade (see sketch below). If there's a non-exclusivity relation on the i/p pin of a mux, that non-exclusivity will be preserved downstream.
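A sketch of manual SCE on a 2 level mux cascade (instance names hypothetical); each mux gets its own cmd, all the way to the final mux:

ex: set_clock_exclusivity -type mux -output MUX1/Z => 1st level mux
ex: set_clock_exclusivity -type mux -output MUX_FINAL/Z => final mux of the cascade, one of whose i/p is driven by MUX1/Z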

Reporting clk exclusivity: 2 ways to report clk exclusivity:

  1. report_clock -exclusivity => shows all exclusivity points along with associated clks. We can do "report_timing -groups *" to report timing b/w all clk grps, or specify "-from clk1 -to clk2" and vice versa, and make sure we see only intended paths.
  2. Attr "is_clock_exclusivity": This attr can be queried on any pin to find out if SCE is set. ex: get_att [get_pins Mux2/Z] is_clock_exclusivity => returns true or false.

remove_clock_exclusivity =>To remove exclusivity set above

 

 



set_clock_gating_check => Whenever a clk encounters any logic on its path, other than a buf/inv, the other leg of that logic has the potential to cause glitches on the clk that is coming out of this logic. To prevent such glitches from happening on the o/p clk of the logic, we have a clk gating check. As an ex, if we have an AND gate, and one input is a clock, while the other leg of the AND gate is some "data" signal from a flop, then we have to make sure that the output of this AND gate is free of glitches. An AND/OR gate may not provide a glitch free clock output; it depends on the timing of the clk gating "data" signal. That is why latch based clk gaters were introduced, which guarantee a glitch free clk o/p.

CG CHECK: STA tools perform the clk gating check only on simple gates as AND/NAND, OR/NOR. For any complex gate as XOR, MUX, etc, PT doesn't perform the clk gating check and issues a PTE-060 warning. That's OK, as most of the times the presence of a MUX implies that 2 clks are coming in and only 1 will be active at a time, and the select signal is quasi static (i.e. the select signal fires only once in a while, and that too when the clk is inactive). However, if you have a design where the select signal of the MUX is actively switching while the clk is active, then you need to design more complex logic to go with the mux that can achieve that. That's a totally separate topic that we'll touch on under "glitchless clock mux design". For our case, STA tools as PT only do CG checks on simple gates. If we have case analysis or constant values on complex gates such that the complex gates can be reduced to simple gates, then PT will infer CG, and do the CG check. So, when PT is doing a CG check on a complex gate, it's a red flag that something in the mux is tied off or has a set_case_analysis. You may want to trace that, and make sure that is what you want.

Solvnet article on MUX CG: https://solvnetplus.synopsys.com/s/article/Clock-Gating-Checks-on-Multiplexers-1576002513927

Nice article on CG check: https://solvnetplus.synopsys.com/s/article/How-Are-Clock-Gating-Checks-Inferred-1576020186140

By default, latch based clk gaters have setup/hold checks in their .lib. However, other logic gates as AND/NAND, OR/NOR, MUX, OR-AND on the clock path do not have any setup/hold checks specified in .lib (even though you may have special clock gates as CKAND2, their timing might be just in->out timing). A gating check is required on any pin gating a clock signal. This cmd specifies a setup and hold time check on those pins gating the clk (by default, the setup and hold time checks are 0). The -high and -low options help PT in situations where it can't determine whether the enable pin needs to be checked against "high clk" or "low clk". This happens for complex cells as MUX, OR_AND, AOI. For AND/NAND, the check is performed against the high clk, while for OR/NOR, the check is against the low clock.

Below cmd forces tool to perform clk gating check on cells where by default tool doesn't do CG check, or to put specific setup/hold values to be applied for CG check.

syntax: set_clock_gating_check <options> <object_list>

options: -setup/-hold, -rise/-fall, -low/-high

object list to be specified here is optional. If nothing specified, clk gating check is applied to all of design. Objects specified can be cell, pins or clocks. If cell specified, then all i/p pins of that cell are affected, while if particular pin specified, then clk gating check applied only to that pin. If clock specified, then clk gating check applied to all gates gating that clk. clk is most commonly used, since it's simpler to cover all gaters in the path.

ex: set_clock_gating_check -setup 0.2 -hold 0.4 [get_clocks CK1] => specifies a setup time of 0.2 and a hold time of 0.4 for all gates in the clock network of clock CK1.

ex: set_clock_gating_check -setup 0.5 -hold 0.2 [get_cells and1] => specifies a setup time of 0.5 and hold of 0.2 on the and1 cell
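For complex gates where PT can't figure out the clk polarity, -high/-low tell it which state of the clk the enable is checked against; a sketch (pin name hypothetical):

ex: set_clock_gating_check -high [get_pins mux1/S] => forces the gating check on select pin S to be performed as for an AND style gate (i.e against the high phase of the clk), since PT can't infer the polarity on a mux by itself.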

remove_clock_gating_check => removes clk gating check set using above cmd. Use same options as what was used in "set_clock_gating_check" cmd.

report_clock_gating_check => reports all clk gating checks done by PT on cells. user specified high/low options for clk gating checks shown by *, while PT inferred ones show no *

set_disable_clock_gating_check => Disables the clock gating check for specified cells/lib_cells and pins in the current design. This command will only disable auto-inferred clock gating checks. Clock gating checks from library will not be disabled.

ex: set_disable_clock_gating_check [get_lib_cells {"class/AND1"}] => It disables clk gating check on this AND1 lib cell. When the checking through a cell is disabled, all gating checks in the cell are disabled.
ex: set_disable_clock_gating_check mod1/AND2/A => It disables clk gating check on A pin of this clk gating instantiated cell. When the checking through a pin is disabled, any gating check is disabled if it uses the disabled pin as a gating clock pin or a gating enable pin.

Most of the cells have just one clk gating check, where there's a clk pin and an en pin. So, specifying the cell works for most of the cases.

 



set_clock_sense / set_sense => set_clock_sense has been deprecated and replaced by set_sense. Generally, we have simple gaters on clk paths, and PT is able to propagate clks thru these gates. For AND/NAND, OR/NOR, it's able to figure out the clk dirn at the o/p of the gate (known as unateness => +ve unate means if clk rises, o/p of gate rises, and if clk falls, o/p of gate falls, while -ve unate means the opposite). This cmd is generally used on more complex gaters where the tool cannot figure out the unateness of the clk at that pin. If we know what the unateness of the clk is going to be at that pin (based on functionality), we can specify it via this cmd. It restricts unateness at the pin (to positive or negative unate) with respect to the clock source. However, the specified unateness only applies within the non-unate clock network (if PT can figure out unateness by itself, it will issue a warning).

This cmd is also used to stop clks from propagating forward. This is useful in cases where we have defined multiple clks on the same net, but we don't want one or more of these clks to propagate to some part of the design.

syntax: set_sense -type <clock|data> -clocks <clock_list> <more_options> <object_list>

options:

  • -positive/-negative => applies +ve/-ve unateness to all pins in object list wrt clk src.
  • -clocks => by default, all clocks passing through the given pin objects are considered for unateness. However, we can restrict it to a given set of clocks by specifying it via -clocks.
  • -stop_propagation => Stops the propagation of specified clocks in the clock_list (via -clocks) from the specified pins or cell timing arcs in the object_list. This option is not used with -positive/-negative, as we aren't specifying any unateness here, but instead stopping clk propagation all together.
  • object list can be pins, ports, or cell timing arcs.

ex: set_sense -positive -clocks [get_clocks CLK1] XOR/Z => specifies a positive unateness for the XOR/Z pin with respect to the CLK1 clock. If -clocks is omitted, then +ve unateness is applied on XOR/Z pin for all clocks.

ex: set_sense -stop_propagation -clocks [get_clocks {my_clk your_clk}] [get_pins  {chip/mod/reg1/Q ... }] => this stops propagation of my_clk and your_clk on pin Q. Instead of {my_clk your_clk} we may also provide it as a list => [list my_clk your_clk]

ex: set_sense -stop_propagation [get_pins $mod/gateer1/CLK_OUT] => This stops propagation of all clocks from the CLK_OUT pin onwards. The CLK_OUT pin still has its clk attribute set to true, but pins in the fanout of this clk gater will have the clk attribute set to false.

remove_sense => This cmd undoes the effect of set_sense. The syntax here is a little different than set_sense: we don't repeat the unateness options (-positive/-negative), but otherwise the cmd needs to have the same clocks and pin names as what was used in the set_sense cmd.

syntax: remove_sense -type <clock|data> -clocks <clock_list> <object_list> => -type specifies whether sense is to be removed from the clock (default) or data network. -clocks is used only if -clocks was used with the set_sense cmd. <object_list> should be the same as what was used with the set_sense cmd.

The other option "-all" is used to remove all the unateness/clk_prop info from the clk network. If we specify -type data along with -all, then it removes unateness info from the data network too.

ex:

set_sense -positive -clocks [get_clocks CLK1] XOR/Z
remove_sense -clocks [get_clocks CLK1] XOR/Z => Here we didn't use -positive option as that's not in the syntax of remove_sense cmd.

 


 

PT Clk Propagation Errors (PTE-075):

NOTE: There are 2 cmds that are used to stop clks from propagating in PT. One is the "set_case_analysis" cmd which forces a constant 0 or 1 value on a pin, and the other is the "set_sense -stop_propagation" cmd. PT errors as PTE-025 or PTE-075, where the tool is not able to trace a generated clk back to its master clk, are usually due to one of these cmds in the sdc file (assuming the netlist by itself is structurally correct, i.e there is a correct clk net connection). One of the best ways to debug such PT errors is to use the "all_fanout" or "all_fanin" cmd (see in PT cmds section for details). Use "-to generated_clk_pin" with all_fanin (or "-from master_clk_pin" with all_fanout), and check if the other end of the clk path shows up in the reported fanin/fanout. Then keep on doing it, until you come to a point where it doesn't show any path. However, when the issue is due to the "set_sense" cmd, then all_fanout/all_fanin will still show the path as correct, but on reporting the clk attribute, we will see no such attribute on the gates in the clk tree path.

Procedure to debug PTE-075 Error (Error: Generated clock 'my_gen_clk' has no path to its master clock. (PTE-075)). Bring up PT_SHELL by running Primetime and run below cmds:

  1. Report generated clk to make sure it's reported as expected. This is a sanity check to make sure our gen_clk definition is correct.
    • pt_shell> report_clock my_gen_clk => This will show master_clk (my_clk), as well as generated src (as top/.../INVX1/Z). If not, then debug why by looking at your clk defn.
  2. Now check connectivity b/w the gen src and the master src. We can use one of the 2 cmds below (note: all_fanin only takes -to, and all_fanout only takes -from). If the path exists, the other end of the path will show up in the reported list of cells.
    • pt_shell> all_fanin -to top/.../INVX1/Z -flat => This reports all the cells in the fanin cone of the gen_clk src, tracing backward; the master src (CLK_PORT) should show up in the list if the path exists.
    • pt_shell> all_fanout -from CLK_PORT -flat => This reports all the cells in the fanout of the master_clk, tracing forward; the gen_clk src (top/.../INVX1/Z) should show up in the list if the path exists.
  3. If the all_fanin/all_fanout cmd in bullet 2 above shows a valid path, then the issue is probably due to clk cells not having the clk attribute on them. Not having the "clocks" attribute on relevant pins of all the cells in the clk path breaks the clk path and causes the "PTE-075" error, even though the path exists both logically and physically. This issue commonly happens due to the "set_sense -stop_propagation" cmd being used on some of the cells in the sdc constraints file written by the user. Debug it using the "get_att" cmd recursively from the gen_clk src all the way to the master_clk src.
    • pt_shell> get_att [get_pins top/.../INVX1/Z] clocks => This cmd is run on o/p pin of the gate, where gen_clk is defined. It should show both gen_clk and master_clk {my_clk my_gen_clk}
    • pt_shell> get_att [get_pins top/.../INVX1/I] clocks => Now we keep moving backward from gen_clk to master_clk. We move to the i/p pin of the buffer where we defined the gen_clk (gen_clk is defined on the o/p pin). This should show only the master_clk {my_clk}. Every gate pin going all the way to the master_clk pin should show master_clk. We stop when the "get_att" cmd shows this msg => "Warning: Attribute 'clocks' does not exist on pin 'top/.../INVX1/Z' (ATTR-3)". That is where the clk is being stopped from getting propagated due to the "set_sense" cmd. Remove that cmd from the constraints to fix this.
  4. If the all_fanin/all_fanout cmd above in bullet 2 doesn't show any path, then we need to rerun all_fanin/all_fanout cmds with option "-trace_arcs all". If that shows the path, then there's a set_case_analysis or tie off somewhere that will need to be debugged by using other options "-trace_arcs".
  5. Once we've narrowed down the issue in bullet 3 or 4 above, we can fix it in PT_SHELL itself by running "remove_sense" (bullet 3) or "remove_case_analysis" (bullet 4). Then we rerun the "check_timing" cmd to get PT to update timing. On the screen, we shouldn't see any PTE-075 error anymore while "check_timing" is running.
  6. We can also run report_timing cmd to see the clk path.
    • pt_shell> report_timing -from CLK_PORT -thr top/.../INVX1/Z -exceptions all => This should show the clk path if structurally and logically the clk path exists (even though "set_sense" cmd may be blocking the clk from propagating, but seems like report_timing doesn't care about that).

 

Exception to PTE-075 Error: We sometimes don't get a PTE-075 Error, but we still see a generated clk which shows up as if it's generated using "create_clock". It doesn't trace back to it's master clock, and generated clk starts from the point where it's defined (with 0 delay).

 

 


misc clk attribute cmds:

1. get_attribute => this cmd can be used to get value of various attributes for cells, nets, and other design objects. For more details on syntax, see in PrimeTime Commands section. In this section, we will only discuss about clk attributes, which can be used to debug clk errors. 

Clk Attr on clk object:

To see list of all attr for clock object, run list_attr cmd: ex: list_attribute -application -class clock -nosplit => (details in PT cmds section). Below are few of such attr for clks which are commonly used.

A. clk period for clock object:

ex: get_attribute [get_clocks] period => get_clocks returns a collection of all clocks in the design. Then get_attribute, which works on collections, looks at each object of the collection and returns the attribute "period" for each of them. get_clocks actually returns a pointer to the collection of clock objects (echo [get_clocks] = _sel1245). get_attribute, which expects its arg to be a collection, gets this pointer, and grabs the period from it. If we directly provide the clk name as the arg, it will error out, as that is not a collection (i.e pointer). The return value is a list of periods, i.e [12, 17.4, 81, ..]. We can do any list operation on this, i.e lsort, etc, or we can also pass this list thru foreach_in_collection to get each individual period (as if they were a collection).

ex: get_attribute [get_clocks clk1] period => returns valid period of clk1 as 20, as [get_clocks clk1] returns a collection of object "clk1"

ex:  get_attribute clk1 period => Warning: Nothing implicitly matched 'clk1' (SEL-003) => this happens because clk1 is not a valid collection here
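The foreach_in_collection usage mentioned above might look like this sketch, printing the name and period of each clk:

pt_shell> foreach_in_collection c [get_clocks] { echo "[get_attribute $c full_name] period = [get_attribute $c period]" }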

B. clk source pin or master clk for clock object:

ex: get_attribute [get_clocks clk1] sources => this returns the source pin of the clk => {"mod1/clk_buffer/I_clkinv/ZN"}

ex: get_attribute [get_clocks clk1] master_clock => this returns the master clk for this generated clk which is {"clk2"} here.

C. clk network pins for clock object: Returns a collection of pin (both i/p and o/p pins of cells on the path) and port objects in the propagation path of the clock (both master and generated clks). It traces forward and stops at clk sink points (at clk pins of seq cells, ports, etc). As noted above, if a generated clk is not defined directly on the flop o/p, then the paths of logic in between the master clk and gen clk won't show in this attr (as the master clk stops at the clk pin of the flop, while the gen_clk doesn't start from the Q pin, but at some point after the Q pin, which becomes the starting point of the clk n/w for the gen clk).

To report pins in the latency path of a generated clk, we can use attr clock_source_latency_pins => This returns a collection of pins and ports in the source latency network of a generated clock. This attribute is undefined for master clocks. This shows pins of all cells on all possible path from master src clk to gen clk (clk pin of flops is also included in this along with Q pin, as clk pin is the pin where master clk stops, which is the src of this gen clk).

So both these attr combined cover all pins in clk path of all clks.

ex: get_attribute [get_clocks clk1] clock_source_latency_pins => {"cnt_1/Q", "cnt_1/Clk", "buf_0/I", "buf_0/Z", ....}

D. Misc attr: Many misc attr for getting more clk attr on clk objects.

  • clock_latency* => latency reported which were set by using set_clock_latency cmd.
  • min/max_delay*, max_cap*, max_tran* => These attr are reported which were set by set_min/max_delay, set_max_transition, set_max_capacitance, etc.

 

Clk attr on any object (These attr are defined on any cell, pin, net, etc. Not necessarily on clk objects only). So, these can be used to determine if given object has any clk attr.

C. clocks on pins: The "clocks" attr returns a collection of clock objects that propagate through the pin. It is undefined if no clocks are present. This attr only works on pins of leaf cells. If you apply it on non-leaf objects (i.e a net, module port, etc), it will return an error "no such attr found".

ex: set clkName [get_attr [get_pins $ModName/GATE_latch_0/CLK] clocks -quiet] => here we are retrieving all clocks on the CLK pin of a latch. Then this collection is put into a var named clkName. This is a very imp cmd to see if all clocks are seen on the CLK pins of various flops.
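A sketch that builds on this to flag any register clk pin with no clk on it (a common sanity check; all_registers, get_attribute and foreach_in_collection are std PT cmds):

foreach_in_collection p [all_registers -clock_pins] {
    set clks [get_attribute -quiet $p clocks]
    if {$clks eq ""} { echo "no clock on [get_attribute $p full_name]" }
}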

D. clock, is_clock_pin and is_clock_source_network attr on pins: (Looks like the "clock" attr is not valid anymore; the "clocks" attr above (plural, with an s) gives all valid clocks on that pin)

clock and is_clock_pin => The clock attribute indicates that the pin is a clock pin in the library cell definition (i.e "pin (CK) {clock: true}" in lib cell defn) whereas the is_clock_pin attribute indicates that the library or instance pin is an active and valid clock pin in the design (i.e it is reached by a clock signal and the sequential cell instance containing this pin is not disabled by disabled timing arcs or by case analysis). For tied off pin, or clock pins which are not reachable via any clk in that mode, the is_clock_pin attr will be seen as set to false.

ex: get_attr [get_pins $ModName/GATE_latch_0/Q] is_clock_source_network => Returns true if the pin is part of a clock source latency network. This is a very useful cmd to see where on the clk tree the clk stops propagating. i.e if we see that this attr is false on the o/p of a clk gater, but is true on the i/p CLK pin of that clk gater, that means that there is possibly a "set_case_analysis" or "stop propagation" on that clk gater.

 


 

startup process:

After powerup, m/c goes thru different steps before the login prompt appears. A lot of useful info here: https://www.thegeekstuff.com/2011/02/linux-boot-process

and here: https://www.tldp.org/LDP/sag/html/boot-process.html

and here: https://www.linuxnix.com/linux-booting-process-explained/

and here: https://wiki.archlinux.org/index.php/Arch_boot_process

and here: http://tldp.org/LDP/khg/HyperNews/get/tour/tour.html

In short these are the steps:


1. Powerup: As soon as the power button is pressed, the system powers up. The CPU is released off reset. The 80x86 CPU puts addr 0xFFFF0 on the addr lines, and this is the very first addr that the CPU will read from. This addr (historically) happens to be an addr on the EEPROM (a separate chip on the motherboard that just stores read only memory. EE stands for Electrically Erasable (and Programmable). We use EEPROM instead of ROM so that manufacturers can push out updates easily). This 0xFFFF0 addr contains another addr to which the CPU jumps (known as indirection). The EEPROM stores a piece of code starting from that addr. That code is known as BIOS/UEFI or boot firmware code (UEFI is referred to as BIOS or UEFI BIOS, even though UEFI is different than BIOS). There are still 12 more bytes at addr 0xFFFF4 and beyond. We may store a soft power up (i.e reset or warm boot) addr at that location, and likewise at other top addrs. That way, we may jump to some other addr in ROM code depending on the powerup scenario.

A. BIOS/UEFI: This is the code stored in EEPROM. On older system, BIOS used to be stored here. On newer computers, UEFI is stored instead of BIOS. UEFI is just different code than BIOS.

BIOS/UEFI consist of 2 parts. The 1st part is stored in EEPROM. This part does similar things for BIOS/UEFI in initializing the system. The 2nd part is stored on the hard disk or some other disk. The 2nd part is called the 1st stage boot loader.

For BIOS systems, BIOS code first does POST (power on self test). It then initializes the remaining hardware, detects the connected peripherals (mouse, keyboard, pendrive etc.) and checks if all connected devices are healthy. You might remember it as a 'beep' that desktops used to make after POST is successful. Finally, the firmware code cycles through all storage devices and looks for a boot-loader. These storage devices may be a floppy disk, hard drive, usb stick, etc. It will then choose a disk drive, read its very first sector, and load it at memory addr 0x7C00. This first sector is called the boot sector on floppy disks. On a hard disk, it's called the master boot record (MBR), since a HD can have several partitions, each with its own boot sector. The MBR is the 1st 512 bytes on the HD. The code in the MBR then transfers control to other code, which does the main booting.

For UEFI systems, UEFI code first initializes hardware to the point it can detect keyboard keys pressed, display something on screen, and access the EFI partitions of detected drives. EFI partition is a separate partition for UEFI booting. So, the 1st part of UEFI code stored on EEPROM also contains a file system driver to read that EFI partition as only then it can actually load EFI extensions from there. EFI partition is like a 1st stage boot loader.

These boot loaders (either on MBR or EFI partition) are called 1st stage boot loader.

Difference b/w BIOS and UEFI: This link explains it nicely: https://www.freecodecamp.org/news/uefi-vs-bios/

BIOS: BIOS stands for Basic Input Output System. In short, BIOS is stored in EEPROM and all of the BIOS related code is in this memory. It runs in 16 bit mode, which is why it doesn't have any graphics support, and we have to use arrow and enter keys to navigate thru the options. It is the screen that shows up when you press F1, F2 or some other key (depending on your laptop manufacturer) on computer startup.

UEFI: UEFI stands for Unified Extensible Firmware Interface. It does the same job as a BIOS, but with one basic difference: it stores all data about initialization and startup in an .efi file, instead of storing it on the firmware. This .efi file is stored on a special partition called the EFI System Partition (ESP) on the hard disk. This ESP partition also contains the bootloader. UEFI was designed to overcome many limitations of the old BIOS such as drive size, slow boot time, etc. UEFI is fast and it runs in 32/64 bit mode (that's why it can have a GUI). All modern systems support UEFI, so if you bought a laptop in the last 5 years, you most certainly have UEFI on it.

Secure boot: One of the most important features added by UEFI is a security feature called "Secure Boot", which prevents the computer from booting from unauthorized/unsigned applications. This helps in preventing rootkits, but also hampers dual-booting, as it treats other OS as unsigned applications. Currently, only Windows and Ubuntu are signed OS. This is a big pain when installing any Linux OS. One way to get around it is to disable the "Secure Boot" feature (see in Linux Installation section).

B. MBR/GPT: After the code in EEPROM runs, control is transferred to the MBR (for BIOS startup) or to the EFI partition (for UEFI). These usually reside on the hard disk (see in Filesystems section for details on hard disks, partitions, etc).

MBR (Master Boot Record): It is the first 512 bytes on the hard drive (the first 446 bytes is code, the remaining 64 bytes is partition info; read more about MBR in the File systems section). First BIOS executes the code in the MBR, which then identifies the bootable partition (on that hard disk), reads the boot sector of that partition, and then starts the code in that boot sector. This code (in the boot sector of that partition) reads in the kernel from the partition and starts it. Ideally, the kernel image would be stored at sequential addrs, and the code could read it sequentially. However, this would require a separate partition for the kernel image, which is not practical. Instead the kernel image is stored in a File system (FS), so the code will need to figure out the sectors where the FS has stored this kernel image. In a FS, the image may be scattered over different physical sectors, so there's no requirement for the kernel image to be in consecutive locations. The most common way to achieve this is using a bootloader such as GRUB or LILO.

GPT (GUID Partition Table): MBR is only 512 bytes which is too small. So, a newer scheme called GPT was introduced. It is a standard for the layout of the partition table on a physical hard disk, using globally unique identifiers (GUID). Usually, MBR and BIOS (MBR + BIOS), and GPT and UEFI (GPT + UEFI) go hand in hand. This is compulsory for some systems (eg Windows), while optional for others (eg Linux). UEFI supports the traditional MBR too, while BIOS may support modern GPT too.

3. GRUB: GRUB is a bootloader pgm used in the Linux world. grub is now replaced by grub2 as of 2020 in all new linux distros. grub2 was written from scratch in C (its config scripts use a bash-like syntax). Legacy grub files are not found anymore on any linux distro, so we'll be talking about grub2 (even though we may write grub, we mean grub2 unless mentioned specifically).

For systems still using classical BIOS on powerup, Grub replaces the code in the MBR (1st 512 bytes) with its own code. This is done by any Linux installer without you doing anything. So, when the system is powered up in BIOS mode, and the code in the MBR is executed, it's the GRUB bootloader code that gets executed (instead of the default code in the MBR that was there before grub replaced it, which was windows code). For systems using UEFI, grub is copied onto the EFI partition (again, it's done by any Linux installer without you doing anything). Grub is read directly from the EFI System Partition. GRUB has the advantage of being able to read ext2, ext3, and ext4 partitions and load its configuration file. Grub uses hd for hard disk, instead of sd used by the Linux File System.

 
GRUB for BIOS:

Grub for BIOS is slightly different than Grub for UEFI. Grub for BIOS is not really used in any system after 2020, so you can skip the section below.

Grub for BIOS intro here: https://www.dedoimedo.com/computers/grub.html

Grub (for BIOS) works in 2 stages. This 2 stage approach allows the large code in GRUB to be executed w/o the limitation of 512 bytes in the MBR.

  • 1st stage: This is located in the MBR and mainly points to Stage 2, since the MBR is too small to contain all of the needed data.  
  • 2nd stage: This points to its configuration file, which contains all of the complex user interface and options we are normally familiar with when talking about GRUB. Stage 2 can be located anywhere on the disk. If Stage 2 cannot find its configuration table, GRUB will cease the boot sequence and present the user with a command line for manual configuration.  

If we do an auto install of any linux distro, GRUB gets installed by default. To manually install it, we need to first log into a linux OS, and from within there, execute the grub cmd to get a grub prompt. Next we need to place GRUB Stage 1 in the first sector of the hard disk (MBR or Partition Table). For this, we first find all possible grubs on the system (there may be more than 1 grub if more than 1 linux distro is installed). Then we choose the one which we want to be copied to the MBR.

  • grub> find /boot/grub/stage1 => this will possibly return (hd0,1) for SUSE, (hd1,2) for Ubuntu, etc (assuming system has multiple linux OS installed). That means it found stage 1 grub in these partitions.
  • grub> root (hd1,2) => This is saying that choose grub from disk1 (2nd disk) and partition 2 (3rd partition in 2nd disk) to copy to MBR.
  • grub> setup (hd0) => Write above grub stage 1 from (hd1,2) to hd0 (disk 1 MBR)
  • grub> quit

Instead of the above 4 step process, we can just use a single cmd stating where to copy grub stage1 to => grub-install /dev/hd0 => Here we copy grub from one of the available grub stage1 locations to the hd0 MBR.

GRUB files are in 2 main dir => /boot/grub and /usr/lib/grub.

  • /boot/grub/ => GRUB cfg (or menu) is located on the root partition. For legacy grub, we have menu.lst file, while for grub2, we have grub.cfg.
    •  Legacy grub has /boot/grub/menu.lst. menu.lst lists all partitions and marks the active one. It lists partitions one by one. Format is:
      • 1st partiton is windows OS and is marked as active. Since this OS is not understood by GRUB, chainloader is used.
        • title Windows 95/98/NT/2000 => title says what OS it is (for human understanding)
        • rootnoverify (hd0,0) => This specifies the root (/) partition of Windows. In this instance, the boot image is on (hd0,0) => hd0 means the 1st hard disk, and the 2nd 0 means the 1st partition on that disk. rootnoverify implies GRUB cannot understand the Windows OS, i.e. no multi-boot compliance. The job of mounting the boot image is left to the chainloader (see below).
        • makeactive => this sets the active partition to this partition, i.e (hd0,0)
        • chainloader +1 => This feature is used for OS such as Windows that cannot be booted directly. They are booted by the method of chainloading (GRUB passes the control of the boot sequence to another bootloader, located on the device to which the menu entry points).
      • 2nd partition is Linux OS
        • title Linux => implies it's Linux OS
        • root (hd0,1) => root specifies where the root (/) partition is. Here / is on (hd0,1), i.e hard disk 0, partition 1.
        • kernel /vmlinuz root=/dev/hda3 ro => loads the kernel image /vmlinuz, passing it boot args: root FS is on /dev/hda3, initially mounted read-only (ro).
    • grub2 has /boot/grub/grub.cfg. This is autogenerated at install via grub-mkconfig using templates from /etc/grub.d and settings from /etc/default/grub (see the regeneration cmd right after this list).
      • /etc/grub.d/ => It has separate files for each OS entry in the grub menu (the options that we see to choose the OS when grub is loaded), i.e 00_header, 10_linux, 20_memtest86+, 40_custom, etc. 40_custom is where you add your own custom menu entries.
  • /usr/lib/grub/ => It has stage1 and stage2 files under /usr/lib/grub/i386-pc/. In grub2, you see *.mod files here; they are modules for various things such as boot, sleep, etc.
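Since grub.cfg is auto-generated, you never edit it by hand. Instead, you edit /etc/default/grub (e.g. change GRUB_TIMEOUT=5) and regenerate the cfg:

sudo grub-mkconfig -o /boot/grub/grub.cfg => regenerates grub.cfg from the /etc/grub.d templates + /etc/default/grub settings. On Debian/Ubuntu, "sudo update-grub" is just a wrapper for this exact cmd.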

GRUB for EFI:

GRUB cfg files for EFI are in /boot/efi/EFI/<linux_OS_name>/grub.cfg

This grub.cfg is a very small wrapper which takes us back to /boot/grub/grub.cfg.

If we have multiple Linux OS installed, then every time we install a new OS, its GRUB will overwrite the prior GRUB in the MBR (BIOS) or ESP (UEFI). So, we won't be able to access the other Linux OS, as the latest Linux OS only sees itself and Windows (assuming Windows is there). To prevent this, we don't allow the later Linux OS to install GRUB (don't do an auto install of the Linux OS, but instead choose "manual installation" and turn off "grub installation"). This way, we leave the GRUB from the prior Linux OS installation untouched.
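On Debian-family distros, it is the os-prober tool that scans the disks for other installed OS when the grub cfg is regenerated. So, from the distro that owns GRUB, you can re-detect everything after installing or removing another OS:

sudo os-prober => lists other bootable OS found on the disks
sudo update-grub => regenerates grub.cfg with a menu entry for each OS found (note: on newer GRUB versions, os-prober may be disabled by default via GRUB_DISABLE_OS_PROBER in /etc/default/grub)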

User Selection for OS:

Once the second stage boot loader (from the chosen partition in BIOS, or the EFI partition in UEFI) is in memory, it presents the user with a graphical screen showing the different operating systems or kernels it has been configured to boot (when you update the kernel, the boot loader configuration file is updated automatically). On this screen, a user can use the arrow keys to choose which operating system or kernel they wish to boot and press Enter. If no key is pressed, the boot loader loads the default selection after a configurable period of time has passed. Depending on what the user chose, it loads Windows or Linux or any other OS listed in the menu. Assuming Linux is selected, it locates the corresponding kernel binary in the /boot/ directory. The kernel binary is named using the format /boot/vmlinuz-<kernel-version> (where <kernel-version> corresponds to the kernel version specified in the boot loader's settings). On my system, the file is /boot/vmlinuz-3.10.0* (a single file, size ~7MB; there is also a rescue copy of it with the same size).
 
The boot loader then places one or more appropriate initramfs images into memory. The initramfs is used by the kernel to load drivers and modules necessary to boot the system. Once the kernel and the initramfs image(s) are loaded into memory, the boot loader hands control of the boot process to the kernel. Boot loaders only need to support the file system on which kernel and initramfs reside (the file system on which /boot is located). 
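You can see the kernel and initramfs images that the boot loader will use by listing /boot (file names vary by distro):

ls -lh /boot/vmlinuz-* /boot/initrd.img-* => Ubuntu/Debian naming
ls -lh /boot/vmlinuz-* /boot/initramfs-*.img => CentOS/RHEL naming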
 

 

4A. Kernel load: First, the kernel is loaded into RAM, where it remains until shutdown.

The very first part of the Linux kernel is written in 8086 assembly language (boot/bootsect.S). When run, it moves itself to absolute address 0x90000, loads the next 2 kBytes of code from the boot device to address 0x90200, and the rest of the kernel to address 0x10000. The message "Loading..." is displayed during system load. Control is then passed to the code in boot/setup.S, another real-mode assembly source.

The setup portion identifies some features of the host system and the type of VGA board. If requested to, it asks the user to choose the video mode for the console. It then moves the whole system from address 0x10000 to address 0x1000, enters protected mode and jumps to the rest of the system (at 0x1000).

4B. Kernel decompress: The Linux kernel is installed compressed, so it will first uncompress itself. The beginning of the kernel image contains a small program that does this. The code at 0x1000 comes from zBoot/head.S, which initializes registers and invokes decompress_kernel(), which in turn is made up of zBoot/inflate.c, zBoot/unzip.c and zBoot/misc.c. The decompressed data goes to address 0x100000 (1 MB = 2^20), and this is the main reason why Linux can't run with less than 2 MB of RAM.

Decompressed code is executed, and eventually, the routine start_kernel is invoked. start_kernel executes a wide range of initialization functions, including unpacking the initramfs (initial RAM filesystem), which becomes the initial root filesystem. The purpose of the initramfs is to bootstrap the system to the point where it can access the real root filesystem.

The source for the above operations is in boot/head.S. start_kernel() resides in init/main.c, and never returns. Everything from now on is coded in C, aside from interrupt management and system call enter/leave. The kernel then executes init as the first process, and the early userspace starts. Until this point, the initial root filesystem in RAM was being used. At the final stage of early userspace, the real root is mounted (i.e. the / root dir, and all dirs under it, are set up), which replaces the initial root filesystem. This is done via a call to pivot_root(), which unmounts the temporary root file system and replaces it with the real one. The memory used by the temporary root file system is then reclaimed. If the mounting of the root filesystem fails, for example because you didn't remember to include the corresponding filesystem driver in the kernel, the kernel panics and halts the system.
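The kernel's messages from all of these early steps are kept in a ring buffer, which you can read after boot:

dmesg | head -20 => the first lines show the kernel version, the cmd line it was booted with (including root=...), memory setup, etc.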


5. Init: Once the real file system is mounted, the kernel finishes its own part of the boot process. The kernel then tries to execute /sbin/init, which is the first user pgm, with process id=1. (The initrd/initramfs has already been used by this point, during kernel load.) The first thing init runs is a script at /etc/rc.d/rc.sysinit, which checks all the system properties, hardware, display, SELinux, loads kernel modules, does file system check, file system mounting, etc. Now, the /etc/inittab file is read to determine the run level, and pgms for that run level are run. Once all the pgms for that run level have run, init runs one more file, /etc/rc.local, which contains the last commands run in the initialization (or booting) process. Once everything is completed, control is given back to the kernel.

On newer systems using systemd, init is replaced by systemd. So, 1st user pgm with process id=1 is systemd (and NOT init) in such a case. See details under "init vs systemd" section.
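An easy way to check which one your system runs is to look at what process id 1 is:

ps -p 1 -o pid,comm => prints "1 systemd" on systemd-based distros, and "1 init" on older ones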

After exec()ing the init program above, the kernel has no direct control on the program flow. Its role, from now on is to provide processes with system calls, as well as servicing asynchronous events (such as hardware interrupts). Multitasking has been setup, and it is now init which manages multiuser access by fork()ing system daemons and login processes.


6. Prompt: init (or systemd) is responsible for 2 programs: getty (get terminal) and login (for user login). init forks getty for each terminal, and getty in turn starts login once a username is typed. These allow the user to log into the system.

A. getty: getty is called once for each virtual terminal (typically six of them); it initializes each tty and brings up a "login:" prompt asking for a username. Without this getty pgm, communication via a terminal can't happen. The init program starts other, getty-like programs for networked connections. For example, sshd, telnetd, and rlogind are started to service logins via ssh, telnet, and rlogin, respectively. Instead of being tied directly to a specific physical terminal or modem line, these programs connect users' shells to pseudo ttys, devices that emulate terminals over network connections. getty listens at the terminal and waits for the user to indicate that they are ready to log in (this usually means that the user must type something). When it notices a user, getty outputs a welcome message (stored in /etc/issue), and prompts for the username. On my CentOS laptop, /etc/issue has the 2 lines shown in the ex below. \S, \r, \m are escape sequences that getty expands into the corresponding system info (\S=OS name, \r=kernel release, \m=machine/hardware type; there are others, e.g. \n=hostname, \l=tty line).

ex: /etc/issue => this has the 2 lines shown below (on Ubuntu, it instead has a single line with the name of the linux distro: "Ubuntu 22.04.2 LTS \n")

\S

Kernel \r on an \m

The above 2 lines print this message, followed by the "login:" prompt:

CentOS Linux 7

Kernel 3.18xxx on an x86_64

DESKTOP-ASHISH login:

Now this login prompt may be CLI or GUI. If we did a basic installation of Linux with no graphical interface, then we will see a CLI on all virtual terminals. Else, if X11 was installed, a display manager (dm) is started on one of these virtual terminals (the default virtual terminal), which brings up a graphical login screen (you will see a gdm pgm running, under user "root"). Once the username is provided, the getty pgm disappears. But before it disappears, it starts the login pgm at /bin/login to complete the login process.

B. login: The login pgm gets the username as a parameter from the getty pgm, and prompts the user for the password. It checks them against /etc/passwd and /etc/shadow.

The file /etc/passwd was originally supposed to store the encrypted passwords too (which it did in the old days), but this file is readable by everyone. So, for security reasons, the password field now just holds "x" in this file, and the real encrypted password is stored in the /etc/shadow file, which is readable by the super user (su) only. So, even the encrypted password is not visible to anyone besides su.

On successful matching of username/password, the login program begins a session for the user by setting environment variables and starting the user's shell, based on /etc/passwd.

The /etc/passwd file contains one line for each user of the system. That line specifies, among other things, the login name, home directory, and program to start up when that user logs in. There are seven fields on each line, separated by colons. The exact syntax of this file is explained here: https://www.cyberciti.biz/faq/understanding-etcpasswd-file-format/. The seven fields are shown below.
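The 7 colon-separated fields, in order, are:

login_name:password(x):UID:GID:GECOS(comment):home_directory:login_program(usually a shell)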

The last bit of information (the program to start up) is stored after the last colon of each line. If nothing follows the last colon, the standard shell /usr/bin/sh is assumed by default. The following are typical lines from my /etc/passwd file:

  • root:x:0:0:root:/root:/bin/bash => this is used when we type "root" at the login prompt; the login shell started for the root user is bash
  • gdm:x:42:42::/var/lib/gdm:/sbin/nologin => in RHEL, every process runs under a particular user. Users corresponding to certain processes don't have the need to log in. So, they have been assigned a nologin shell. 
  • ashish:x:1000:1000:ashish:/home/ashish:/bin/bash => this is the line used when user "ashish" logs in. After the password match, a bash shell is started. NOTE: we don't have to start a shell. We could run any program, i.e. /home/my_test.tcl is an equally valid program that can be mentioned here, and the login pgm will start that pgm instead of a shell.

The blinking cursor that appears when typing password in CLI is the login program running. No shell has been started yet.


7. login: The login program displays the contents of /etc/motd (message of the day) after a successful login, and then it executes the pgm mentioned in /etc/passwd, which is usually a shell. This is called a login shell. The login shell may be any one of the many shells supported, per the entry in the /etc/passwd file as indicated above. The default shell on most linux distros is bash. Once the user's login shell is started, it will typically run a runtime configuration file, such as .bashrc or .cshrc, before presenting a prompt to the user. For the bash shell, /etc/profile is executed; in addition, it executes .profile in the user's home directory. For csh, .login is executed. However, if the login was done graphically via a dm, then a different set of runtime configuration files is called, to bring up the graphical windows i/f. On the other virtual terminals (accessible by pressing ctrl+alt+F2, ctrl+alt+F3 and so on), we still see the text login shell after entering username and password.
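For reference, a typical read order for a bash login shell is shown below (exact behavior can vary slightly by distro):

/etc/profile => system-wide, read first by all login shells
~/.bash_profile (or ~/.bash_login or ~/.profile, whichever is found first) => per-user login config
~/.bashrc => usually sourced from ~/.bash_profile; non-login interactive shells (e.g. a new terminal window) read this file directly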


8. xinit: For the graphical windows i/f, the login pgm calls some other configuration files after a successful login. These runtime configuration files will call startx or xinit. xinit runs the user's xinitrc runtime configuration file, which normally starts a window manager (wm). When the user is finished and exits the window manager, xinit, startx, the shell, and login will terminate in that order, returning to getty. NOTE: xinit may not be used any longer on linux distros that have switched to systemd (instead of init).

startx: This cmd is a wrapper for xinit, and is implemented differently on different OS variants. More details here: https://www.computerhope.com/unix/startx.htm. startx is a standalone cmd used to start the wm under full user control. This is not the cmd the OS itself uses to start the wm, but whatever "xinit" script the OS uses achieves similar results.


 

steps to download and run programs in Linux = Using Package Manager:

Once linux is installed and running, you will want to download and install new software on your system. One way to install any new software is to download the source code from the internet, then compile it on the local machine using the Makefile provided, then install it in the appropriate dir. This process is very cumbersome, and is suitable for advanced users only. For regular users, this process can all be automated by using the package managers that come preinstalled on linux OS. However, for these package managers to be able to download, compile and install the software, the software needs to be packaged in a special format. The creator of the software should provide their software in the specific format for that package manager, or else the package manager can't be used. Fortunately, most of the software that you will ever encounter is available in multiple package formats, so it can be used with almost all package managers.

Package Manager:

A package manager or package management system is a collection of software tools that automates the process of installing, upgrading, configuring, and removing computer programs for an OS. A package manager deals with packages, which are distributions of software and data in archive files. Packages contain metadata, such as the software's name, a description of its purpose, version number, vendor, checksum, and a list of dependencies necessary for the software to run properly. Package managers are designed to eliminate the need for manual installs and updates. Running any package manager requires admin rights, so you have to precede cmds with "sudo" (see the "linux cmds" section for more details on sudo).

Following are the package managers and package formats supported by various linux distros. As you can see, just 2 formats - .deb and .rpm - cover nearly all linux users.

1. Deb (Debian) format: Debian and its derivatives such as Ubuntu and Linux Mint support the .deb package format. The package manager is APT (Advanced Package Tool). APT resolves dependency problems and retrieves requested packages from designated package repositories. APT delegates the actual installation and removal of packages to dpkg, so instead of directly typing dpkg cmds, APT handles it for us. We type apt commands on the command line to interact with APT. NOTE: APT (the pkg mgr) is different from the apt cmd line tool. There are various cmd line tools to interact with APT; some examples are apt-get, apt-cache, aptitude, apt, etc. Until 2016, apt-get and apt-cache were the 2 most popular cmd line tools for APT. However, the cmd line tool "apt" was introduced in Ubuntu 16.04 to combine cmds from both of these into one, and make the cmds simpler. apt is now the recommended cmd to use with APT. You will still see lots of documentation for apt-get and apt-cache online, and you can continue to use them (they still work), but apt is the path forward. So, I'll just show "apt" cmds from now on.

Few of the important apt cmds here: https://itsfoss.com/apt-command-guide/

  • sudo apt update => updates the pkg database repository, so that APT knows if there are any newer packages available, for software already installed on your system. You will see pkg info being retrieved from various servers. You need to run this cmd, before installing any new pkg, so that you get the latest in the repo.
  • sudo apt upgrade => once the pkg database is updated using the above cmd, we can upgrade all installed pkgs with this cmd. If the update cmd is not run before upgrade, APT may not know that newer versions are available, and will say everything is up to date.
  • apt search php => this is a neat way to search for any packages containing the specified term; here it searches for any pkg containing the term "php". It will generally show hundreds of pkgs with that term, so it's usually hard to determine which pkg to choose, i.e. if we see 100's of pkgs with the name "php", we don't know which one to download. Some of these are base pkgs, while some are additional modules for that pkg, etc. Usually a search on the internet will show what all to download. NOTE: no sudo reqd, as it doesn't make any changes to your system.
  • apt show php => will show additional info about pkg "php". Useful to know, before you install that pkg.
  • sudo apt install <pkg_name> => most used cmd. installs required pkg, like apache2, tcl, etc
  • sudo apt remove <pkg_name> =>removes the pkg.

These are some important directories associated with APT:

/etc/apt has the apt configuration folders and files. All files/folders in /etc/apt are:

  • sources.list: Locations to fetch packages from. It lists the sources (which might be on the network or on removable storage, for example) from which APT retrieves packages and obtains information about available (but not installed) packages. On my system, this file has 1 line as follows:
    • #deb cdrom:[Linux Mint 19 _Tara_ - Release amd64 20180717]/ bionic contrib main non-free => This line is commented out. So, apt automatically searches other files in sources.list.d for location of packages:
  • sources.list.d/official-package-repositories.list => This has the location of where to fetch packages from:
    • deb http://packages.linuxmint.com tara main upstream import backport => We see many more links from ubuntu, canonical, etc.
  • To use the official debian repository, which contains about 50K packages, we could insert a line like this in any of the above files, and then run "apt update":
    • deb http://ftp.debian.org/debian stable main contrib non-free => Now, APT will use this link to download stable pkg
  •  apt.conf.d => APT configuration file fragments.
  • preferences: version preferences file. This is where you would specify a preference to get certain packages from a separate source or from a different version of a distribution (the "apt policy" cmd shown after this list reflects these settings).
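To see which repo a given pkg would actually be fetched from (based on the sources.list entries and preferences above):

apt policy php => shows the installed and candidate versions of "php", and the repo URL each version comes from. No sudo needed, as it's read-only.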

/var/cache/apt/archives/: storage area for retrieved package files. It has all the .deb files that were downloaded. (.deb files are binary archives, not plain text; "dpkg-deb --info <file>.deb" or "dpkg-deb --contents <file>.deb" can be used to inspect them.) This is a good place to see what was downloaded for, let's say, the "sql" package.

/var/lib/apt/lists/: storage area for state information for each package resource specified in sources.list

Ex: to install a new scripting language called tcl (which is not installed by default on Linux Mint), we run these cmds from any dir in a terminal:

sudo apt install tcl => sudo is needed since root permissions are required to run apt installs. It asks for the user's password, then shows what's going to get installed, and asks if you want to continue. On pressing Y, it downloads the pkg and shows lines like this ...

Get:1 http://archive.ubuntu.com/ubuntu bionic/main amd64 libtcl8.6 amd64 8.6.8+dfsg-3 [881 kB] => download link for tcl pkg

Preparing to unpack .../libtcl8.6_8.6.8+dfsg-3_amd64.deb ... => the .deb file for tcl, downloaded from the link above
Unpacking libtcl8.6:amd64 (8.6.8+dfsg-3) ... => pkg contents are unpacked

Setting up tcl8.6 (8.6.8+dfsg-3) ... => This finally installs tcl on your m/c.
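To verify the install went through:

dpkg -l | grep tcl => lists the installed tcl pkgs with their versions
tclsh => starts the tcl shell; type "info patchlevel" to print the version, and "exit" to quit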

NOTE: Instead of using the command line above, Linux Mint provides graphical tools called "Synaptic Package Manager" and "Software Manager". We can use these too to install new software, just by searching for it and clicking "install". However, if either of these is open and you try to run an apt cmd on the command line in a terminal, you may get a lock error. So, close them before using the cmd line i/f in a terminal.

 

2. RPM (Red Hat Package Manager) format: Fedora and related distros such as CentOS, RHEL, SUSE and OpenSUSE support the .rpm package format. An RPM is delivered as a single file, either as .src.rpm (for a source pkg) or as .<arch>.rpm (for a binary pkg). The package manager is yum (on newer Fedora and RHEL/CentOS 8+, yum has been replaced by dnf; see the note at the end of this section).

Yum (Yellowdog Updater, Modified). A few of the imp cmds:

  • sudo yum update => unlike "apt update", this refreshes the pkg metadata AND installs available updates for all installed pkgs in one step ("yum check-update" just refreshes and lists available updates, without installing). The -y option answers yes w/o asking for your confirmation.
  • sudo yum upgrade => essentially the same as "yum update", except it also removes obsolete pkgs during the upgrade. 
  • sudo yum upgrade httpd -y =>this upgrades specific pkg named "httpd". -y upgrades w/o asking for confirmation
  • yum search httpd => searches for any pkg containing term "httpd" in it
  • yum info httpd => shows additional info about pkg "httpd".
  • sudo yum install <pkg_name> => most used cmd. installs required pkg, like apache2, tcl, etc
  • sudo yum remove <pkg_name> =>removes the pkg. This will figure out all files associated with that pkg, and cleanly remove them. Linux "rm" cmd should NOT be used to remove packages.
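A couple of other handy yum cmds:

yum list installed httpd => checks whether "httpd" is installed and shows its version (no sudo needed)
sudo yum history => shows past install/remove transactions with their ids; "sudo yum history undo <id>" rolls a transaction back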

 ex: upgrading firefox:

  1. sudo yum check-update => refreshes the local repo metadata and lists pkgs that have updates available.
  2. sudo yum info firefox => This gives info on the installed firefox pkg, as well as the latest one available. For me, it showed the installed firefox version as 80.xx, while the available one showed as 93.0.0. That meant my pkg was out of date. However, for the latest available pkg to show, the repo metadata must be fresh (step 1), else yum will show the latest based on when the repo was last refreshed.
  3. sudo yum upgrade firefox => This finally upgrades firefox to latest 93.0.0

On top of individual pkg, there are also groups, of which there are 2 kinds:

1. environment groups: each consists of a group of packages that defines how the machine is to be built; it can be a Web Server, a Minimal installation, or a Server with a graphical interface.

2. groups: all other groupings of software, like "Development Tools" or "System Administration Tools". We can apply yum cmds to these groups too:

  • yum groups => prints summary of groups. On my centOS laptop, it shows 12 Environment groups and 25 groups.
  • yum grouplist => prints all available groups with their group names. Env groups are Cinnamon Desktop, Mate Desktop, Basic Web Server, etc. Groups are Cinnamon, Mate, Sound and Video, etc.
  • yum groupinfo "Cinnamon Desktop" => provides more info about this DE (Desktop Env) pkg. 
  • sudo yum groupinstall "GNOME desktop" => installs GNOME3 DE
  • sudo yum groupremove "Security Tools" => removes this group named "security tools"
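Note on dnf: as mentioned above, newer Fedora (22+) and RHEL/CentOS 8+ replace yum with dnf, but the cmds carry over nearly unchanged:

sudo dnf install httpd
sudo dnf groupinstall "Development Tools"

On those systems, "yum" is usually just a symlink to dnf, so all the yum cmds above keep working as-is.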

FIXME: yum details FIXME