ARM: The company was initially called Advanced RISC Machines Ltd, and later changed its name to ARM Ltd. It licenses CPU cores (Cortex lineup), as well as GPU cores (Mali lineup). NOTE: ARM primarily licenses processor cores, and NOT complete microprocessors or microcontrollers. However, it also provides licenses at system level and software level for companies which want the full package. Its cores can be used anywhere, although they are mostly used in microcontrollers. ARM offers two kinds of license: a core license, which covers a ready-made core design, and an architectural license, which covers the instruction set, so that the licensee can build its own design any way it likes. Apple and Qualcomm use architectural licenses from ARM to design their mobile processors. With a core license, the company receives ARM's synthesizable core IP in Verilog (i.e. processor RTL). Such IP is called soft IP, as it is delivered as RTL. Synthesis and PnR are carried out by the company holding the license.

ARM was formed in 1990, and introduced the ARM6 processor family in 1991. It was based on the ARMv3 architecture. Then came ARM7, the most popular member of which was ARM7TDMI, based on the ARMv4T architecture. The ARMv5TE arch was introduced with the ARM9E processor families (E stands for enhanced, which added DSP instructions for multimedia processing). Then came the ARM11 processor family based on the v6 arch. At this time, it was decided to branch processor families into different types, based on their use profile. Also, each processor family would be based on the arch most suited for that application type. This resulted in a new product portfolio from ARM called the Cortex family of processors. These were subdivided into 3 profiles based on application usage. The 3 profiles were A, R, M (discussed later). A new v7 arch was introduced for the Cortex family. Each of these profiles had its own tailored arch, namely v7-A, v7-R and v7-M.

Though ARM had its classic cores starting from the 1990's, its more popular "Cortex" cores started appearing from 2004. These are the cores that you see everywhere in designs. The classic cores (ARM6, ARM7, etc) were used in older devices, and have disappeared from the mass market. One reason for the success of the ARM Cortex lineup is that the cores became very cheap, with some Cortex M0 based microcontrollers sold for as low as 25 cents. This allowed these 32 bit cores to replace 8 and 16 bit microcontrollers. Let's talk about the ARM ISA before delving into profiles and arch.

 

VNC: Virtual Network Computing

VNC is a graphical desktop sharing system, used to remotely control another PC. It is similar to other software such as Chrome Remote Desktop, TeamViewer, etc, which allow you to control a remote PC. VNC is very popular among enterprises, and the original code is open source under the GNU GPL license. VNC was originally developed in the UK. Many other commercial or open source products have been developed based on the original VNC source code. In 2002, the VNC R&D center was closed. Its developers formed RealVNC, which developed an open source as well as a commercial product under the same name. Most of the time when people say VNC, they mean RealVNC.

Intro material on Wiki: https://en.wikipedia.org/wiki/Virtual_Network_Computing

 

VNC server/client model:

VNC software has 2 parts: a server software, and a client software. You install the server software on the desktop which you want to control. You install the client software on the desktop from where you want to control it. The client software knows how to connect to the server software, and displays the screen of the remote desktop which is running the VNC server.

Installation:

Install RealVNC from the links below. You will need to install both RealVNC server and RealVNC client, on different computers: the server on the computer to be controlled, and the client on the computer that controls the server computer. Choose the appropriate OS and then download it.

RealVNC server: https://www.realvnc.com/en/connect/download/vnc/

RealVNC client: https://www.realvnc.com/en/connect/download/viewer/

Running VNC:

Once installed, you can start VNC server on the server desktop by clicking on the RealVNC icon or typing "vncserver" on a terminal. Once started, VNC server always starts on powerup. When vncserver is running, it shows the IP address of the computer on which it is running. The message looks something like this:

$vncserver

......

New desktop is raspberrypi (192.168.1.109)

....

Now on the client machine (where you have the VNC client software installed), you enter this IP address in the address section (here, it's 192.168.1.109). Once entered, it brings up an icon, on which you click, and you can see the remote desktop screen (where VNC server is running). Now if you work on this screen, it seems as if you are directly working on the remote desktop. The screen refreshes amazingly fast: keystrokes from client to server, as well as pixels from server to client, are transmitted pretty fast, especially if both client and server are on a high speed network.

vncserver command on Linux terminal can be used with a lot of options to set the display options. One helpful option is:

vncserver -geometry 2560x1024 -depth 24

NOTE: when we say display, we mean the physical screen that is on the monitor of the remote desktop. However, the pixels of that display are stored in memory, and the monitor is just displaying whatever is stored in that memory. So, we can have another display which has pixels stored in memory only and doesn't go to any monitor. This is called a "virtual display". VNC allows these virtual displays to be created on the server machine, and then be accessed using VNC viewer. Thus we can have tens of displays on a single server desktop, where one of them is the real display connected to a physical monitor, while all the others are virtual displays.

Headless servers, i.e. servers which don't have any monitor connected, don't start the GUI program for the display. In such cases, VNC server has nothing to display, since it always shows the physical display by default. So, in turn, the VNC server program doesn't start at startup. In such cases, we start VNC server by logging into the server machine via ssh. Then VNC server creates a virtual display, and this virtual display can be seen via a VNC client.

On the top center of the VNC session, there is a menu to kill VNC or set many options. Look thru them if you need to set anything else. For ex, if you have 2 monitors and it's not working, try setting "UseAllMonitors" to true over there.

NOTE: we have a .vnc dir in the home dir. Inside this dir is a config file, which controls how the vnc desktop should look. To get full screen extended on the desktop, we add these lines to the config file, so that we don't have to type them every time on cmd line:
-geometry 2560x1024
-depth 24

Guest Access: In VNC, allowing guest access to others is easy. Steps:

  • Run "vncconfig &" on cmd line. This cmd has to be run on lindesk terminal and not on lsf terminal
  • On pop up box, click commands->options. On new pop up box, choose advanced.
  • Change guest access to "Interactive" and click apply.
  • On main pop-up box, if we click on options, we should see a "tick mark" on Guest Login. If not, tick that by clicking.
  • Now, anyone can connect using login "guest" and no password.
  • When a user requests access, a new box appears on the bottom. Click on "accept" to allow guest access to your vnc m/c.

 

PuTTY:

 If we are on a Windows machine, and don't have a terminal to connect from, we can use a program called PuTTY, which supports several protocols such as SSH, Telnet, etc. It has a GUI, and is a lot easier to use.

First download PuTTY. Then use PuTTY to SSH to the above machine. That brings up a terminal on the remote machine, in which you work in the usual way. When done, log out and close the PuTTY window.


-----------------------------------------------------

 

ALL JUNK BELOW. NEED TO MOVE ELSEWHERE FIXME ???

LSF:


Any jobs can now be run only on an lsf machine. So open xterm on lsf
Open Xterm on an LSF machine: bsub -Is -R "linux&&bit64" "xterm" & => OBSOLETE
Open Konsole on an LSF machine: bsub -Is -R "linux&&bit64" "konsole" & => OBSOLETE
Open Konsole on an LSF machine on RHEL6 OS: bsub -Is -R "select[ws60]" "konsole" & => use ws40 for RHEL4 (ws60 is latest). ws60 provides latest AME tools.
Open Konsole on an LSF machine on SUSE11 OS: bsub -Is -R "select[sles11]" "konsole" & => this was needed to get latest AME tools, but not anymore. SUSE not used anymore
Run <tool_name> -ame on both OS to see which gives you latest tools. Some newer versions may be available on 1 OS and not on the other.

OS for Artisan:
Artisan 5.2.1 and earlier will only run on the legacy SuSE11 OS.
Artisan 5.3 will run on both SuSE11 and RHEL6.
The upcoming Artisan 5.4 will only run on RHEL6.

Run icfb on suse m/c: bsub -R "select[sles11]" -Is icfb -artisan-2.91p1 &

NOTE: to get around check and save issues, run icfb on suse m/c: icfb -artisan-5.2.1


For LSF jobs submitted, if we want to know what OS job got submitted on, look in the log file (i.e irun.log) to find name of lsf m/c. Then run:
ex: /home/kagrawal/ > lshosts machine1.com => last 2 RHS entries show OS

HOST_NAME      type    model  cpuf ncpus maxmem maxswp server RESOURCES
dlewz2732.d LIN_X64 p4x_3400 417.0    12 262047M 262145M    Yes
(bit64 cs dc X64 linux srvClass01 maxmem32G linux26 maxmem64G p4x maxmem128G warm maxmem256G sles suse sles11p2 !sles11) => OS is sles11.2
#Preventing jobs from getting killed in lsf:
jobexclude --add <jobid> => to add a job
jobexclude --list => to list all added jobs


#snapshot
In any dir, there is a .snapshot dir, within which there are dirs named by timestamp. Just cd into the appropriate dir, and cp the stuff that is to be retrieved.

dssc cmds:
---------
dssc -help => lists all options
dssc <cmd_name> -help => lists syntax of a specific cmd

checkin:
checkin for 1st time: dssc ci -new <file/dir> -com "comments_here" -rec => -rec needed for recursive
checkin a file after editing with lock: dssc ci <filename>
dssc ls -report status -rec => shows revs of all checked out files/dir (current rev vs .)
dssc diff "file1.v" "file1.v;Latest" => diffs b/w current modified checked out file against one in repository.
dssc retire -force => to completely remove it from database.
dssc ci -new <file/dir> -skip => use ci with -skip after retiring a file in db. Else, old retired file will be checked in.

checkout:
checkout all files recursively in read mode: dssc pop -rec => does it starting from current dir
checkout a file in readmode: dssc co <filename>
checkout a file in editmode with lock: dssc co -lock <filename>
checkin a file after editing with lock: dssc ci <filename>

dssc cancel:
dssc cancel -force <filename> => this is to cancel checked out file (even with edits), and to repop with original version.
checkout a file in editmode without lock: dssc co -get <filename> => This gives unlocked copy which can be modified. Do a chmod to 664 or 755.
Then to populate the original file (and discard the current modified file), do:
dssc pop -force <filename>
dssc unlock <filename> => To unlock files (i.e remove lock). This can be useful when files can't be checked out or something else gone bad.

C++ programming lang:

C++ is largely backward compatible with C, which means you can use almost all your C code in your C++ pgm, and the pgm will still compile fine (a few C constructs, such as implicit conversion from void *, need small changes). So, you can already write a C++ pgm if you know C. In fact, most C pgms that you have can be renamed as C++. However, a lot of object oriented features were added in C++, which programmers take advantage of by modifying existing C code. So, C++ allows you to take incremental steps from a C pgm. All C functions like printf, malloc, etc are available in C++ too.


http://www.learncpp.com/

sample pgm: hello.cpp (C++ files can also be named as .cxx, it's an old style extension that's still used)
----
#include <iostream> //needed for std io func, newer system header files do not have .h extension
// #include "myfile.h" //for user defined header files, use " " (commented out here, as hello.cpp has no such header)

void printA(int x, int y); //func prototype for forward declaration. Needed since printA is defined after main(). Also used if printA is defined in a separate file by itself

// Definition of main()
int main()
{
    int a;
    std::cout << "Enter number "; //cout to print out on screen. << sends the RHS to the stream on the LHS, so the string is sent to cout
    std::cin >> a; //cin to take i/p from keyboard. >> sends the value from the stream on the LHS to the var on the RHS, so cin transfers val to a
    std::cout << "Hello World num is " << a << std::endl; //endl to put newline. multiple prints can be in same stmt as long as separated by <<
    std::cout << "Starting main()" << std::endl;
    printA(1, a);
    std::cout << "Ending main()" << std::endl;
    return 0; //0 means success, any nonzero num returned means error
}

void printA(int x, int y) //each func needs to be defined separately
{
    std::cout << "A " << x << " " << y << std::endl; //std:: prefix says that cout is in std namespace
}


compile C++:

g++ ~/hello.cpp
execute: ./a.out

Enter number 1
Hello World num is 1
Starting main()
A 1 1
Ending main()

------------------------

Keywords : C++ reserves a set of 84 words (as of C++14) for its own use.
ex: char, int, for, case, while, struct, void, ...

identifiers: The name of a variable, function, type, or other kind of object in C++ is called an identifier. An identifier can only be composed of letters (lower or upper case), digits, and the underscore character, and it can't begin with a digit. Identifiers are case sensitive.

Literals: A literal is a fixed value that has been inserted (hardcoded) directly into the source code, such as 5, or 3.14159. Literals always evaluate to themselves.

operands: Literals, variables, and function calls that return values are all known as operands.

operators: Operators tell the expression how to combine one or more operands to produce a new result.

data types:
boolean: bool (true or false). true is stored as int 1, false as int 0
character: char, char16_t (C++11 onwards), char32_t (C++11 onwards). char16_t and char32_t store a char in 16 or 32 bits as a UTF-16 or UTF-32 code unit. ex: 'c'. char is stored as a 1 byte int (signedness is implementation defined, usually signed). ASCII is a 7 bit code, so ASCII char numbers are b/w 0 and 127. Char codes 0 to 31 are control chars, some of which are written with escape sequences: \n=newline (code 10), \t=tab (code 9). Char code 27 is ESC (escape).
floating point: float, double, long double. always signed
integer: short, int, long, long long (C++11 onwards). signed by default (avoid unsigned)
void: for functions that do not take any param or return a value. In C++, empty param are allowed
 ex: int Value(void) { ...} same as int Value() {...}
sizeof(char) => returns size of char in bytes (always 1). C++ guarantees a min size for each integer data type, actual size may be bigger. int is min 2 bytes, while float is typically 4 bytes. Fixed size int types were added in C++11 (header <cstdint>) inside std namespace, i.e int8_t, uint8_t, int16_t, ...

3 ways to init a var
A. int nval = 5; bool b1=true; //copy initialization
B. int nval(5); //direct initialization
C. int nval{5}; //uniform (brace) initialization, works for all data types, but only from C++11 onwards. Note: curly braces instead of parentheses. recommended to use this style.
 int value{}; // value initialization to 0

var assgn (not init):
int nValue;
nValue = 5; // copy assignment (no way to do direct or uniform assignment)
const double gravity { 9.8 }; => initializes a const val. It can't be changed afterwards, so a const must be given its value at definition
const int maxNameLength { 30 }; => initialized to const 30
-----------

preprocessor:
ex: #define NUM 7
ex: (each preprocessor directive must be on its own line)
#ifdef PRINT_J
std::cout << "joe";
#endif
ex: header guards
#ifndef SQUARE_H
#define SQUARE_H
....
#endif

---------
namespace:
sample pgm:
ex: constants.h
namespace constants
{
    const double pi(3.14159);
    const double avogadro(6.0221413e23);
    // ... other related constants
}
ex: myfile.cpp
#include "constants.h"
double circumference = 2 * radius * constants::pi;

-------------

C programming lang: Most popular language of last 50 years. Must learn language which can be used to write any simple or complex pgm


C library ref guide: http://www.acm.uiuc.edu/webmonkeys/book/c_guide/index.html

C language has its own syntax, which may not seem very easy for a person who is used to more modern languages like python. We'll start with a very basic "Hello World" pgm which prints "Hello, world!" on screen. This pgm is explained under "gcc" section, but I'm repeating it here. Name the pgm hello.c

#include <stdio.h> //include files: explained under gcc article

int main (void) { //main function explained below

  printf ("Hello, world!\n");

  return 0;

}

 
main function: Every C pgm needs a main function. This is where the pgm starts executing from line by line.

Arguments to main() function are optional. argc and argv variables store the parameters passed in cmd line of the shell when running the pgm. On POSIX compliant systems, char *envp[] contains a vector of the program's environment var (i.e SHELL environment var that are passed to the program).

argc=num of arguments in cmd line including cmd name,

argv[]= vector storing arguments in cmd line, argv[0]=cmd_name (1st argument), argv[1]=2nd argument (1st option), and so on

envp[] = vector storing ENVIRONMENT variables

argc, argv[] and envp[] can be any names, i.e main(int cnt, char * arr_option[], char *arr_env[]); is equally valid, though using argc, argv and envp is standard

ex: a.out 12 myname 17cd => argc=4 (4 arguments on cmd line), argv[] = {"a.out", "12", "myname", "17cd"}; Note that argv is an array of strings (char *), so even numbers are stored as strings. Char string "12" can be converted to a number using atoi() (returns int) or atol() (returns long). i.e y=atoi(argv[1]); => This will assign integer 12 to y. Plain casting using (int) will not work, as (int)"12" converts the address of the string (a pointer), not its contents, to an integer.
 
int main (int argc, char *argv[]) { // int before main refers to the return type of main. In early days of C, int before main could be omitted, as it was implied. Since C99 this implicit int is not allowed, so always write int main (even if you don't care about the return value).

int main (void) { => no arguments to main
...
exit (1); //exits the pgm with status 1 (needs #include <stdlib.h>). Inside main we can also do: return 1;
}

Compile C pgm: We can't run a C pgm directly by typing hello.c on terminal. We need to compile it first, which generates an executable "a.out", and then run "a.out" on terminal. To compile above pgm, we use gcc compiler. Type below cmd on terminal. Look in "gcc" section for details.

gcc hello.c => generates a new file a.out. Now type ./a.out on terminal, and it runs the pgm

Syntax of C pgm:

1. comments start with // for single line comment, For multi line comment, use /* comment */

2. Each stmt ends with semicolon ;. We don't need semicolons after blocks enclosed in { }.

3. main() function is needed. We can define other functions as needed.

4. All variables need to be defined before using them. We define the var and specify their type. Types are explained below.

5. For loops, if-else, etc are supported. Many reserved keywords are defined for these, and they can't be used as variable names.

Std Input/Output functions:

printf and scanf are 2 std functions that are going to be used the most. They are inbuilt library functions in C programming language which are available in C library by default. These functions are declared and related macros are defined in “stdio.h” which is a header file in C language. That's why we have to include “stdio.h” file.

1. printf: printf() function is used to print the (“character, string, float, integer, octal and hexadecimal values”) onto the output screen.

ex:

int var1=5; // var1 used in printf below. We define it to be of integer type and assign it a value of 5.

printf ("Value is = %d \n",var1); // %d specifies that var1 is int type, and should be printed in decimal format. var1 needs to be defined before it can be used. var1 is put outside " .. ", while %d is inside " ...". \n is used to print a newline.

2. scanf: scanf() function is used to read character, string, numeric data from keyboard.

ex:

char ch; //var ch is defined of type "character", and is used to store the i/p entered by the user.

printf("Enter any character \n"); //This printf is same as before. We can't print anything from within scanf. It can only take input. So, we usually precede scanf with printf.

scanf("%c", &ch); //Same as in printf, we have to specify the type of i/p. Here %c says it's character type, and whatever character user enters on prompt is stored in var "ch". Here var "ch" has to be defined as a "char type" before using it. We use an & with ch. That is needed, and will be explained in pointer section below. No & is needed for var in printf function.



Type:  In C pgm, we have to define data type of all the variables, else C compiler will error out. There are 2 categories of data types for any var: primary data types and derived data types.
-----

I. Primary data type: These are fundamental data types in C namely integer(int), floating point(float), character(char) and void.
arithmetic type: 4 basic types: char, int, float, double and 4 optional type specifiers (signed, unsigned, short, long). void is a data type for nothing.

1. char: 8 bits.
 char, signed char, unsigned char => signed char stored as signed byte. If we use "char" to store character then ascii code for that character is stored. In that case signed or unsigned doesn't matter. However, if we use char to store 8 bit number, then signed unsigned matters, as it represents 8 bit signed or unsigned integer. As "int" stores integers in 16 bit or larger, the only way to store 8 bit integers is by using signed/unsigned char. signed char is from -128 to +127, while unsigned char is from 0 to +255.
 Normal char (no signed/unsigned) are represented in ASCII codes. See in ASCII part of "bash shell scripting language" section.

2. int: 16 bits to 64 bits.
 A. short integer type. atleast 16 bits in size => short, short int, signed short, signed short int, unsigned short, unsigned short int.
 B. basic integer type. atleast 16 bits in size => int, signed, signed int, unsigned, unsigned int. usually 32 bits.
 C. long integer type. atleast 32 bits in size => long, long int, signed long, signed long int, unsigned long, unsigned long int. usually 64 bits, but on embedded arm uP, these are 32 bits.
 D. long long integer type. atleast 64 bits in size => long long, long long int, signed long long, signed long long int, unsigned long long, unsigned long long int. Usually 64 bits, both on desktop m/c and on embedded arm uP.

3. float: IEEE 754 single precision floating point format (32 bit: 1 sign bit, 8 exponent bit and 24 significand precision(23 explicitly stored since 1 bit is hidden)).

4. double: IEEE 754 double precision floating point format (64 bit: 1 sign bit, 11 exponent bit and 53 significand precision(52 explicitly stored since 1 bit is hidden)). "long double" is extended precision floating-point type. It can be 80 bit floating point format or some non-IEEE format.
 
NOTE:  The above types don't have particular size as part of C lang, as it's target processor dependent. So, their size is provided via macro constants in two headers: limits.h header defines macros for integer types and float.h header defines macros for floating-point types.
limits.h: (look in http://www.acm.uiuc.edu/webmonkeys/book/c_guide/2.5.html)
-------
In cortex M0, char size is defined in /apps/arm/rvds/4.0-821/RVCT/Data/4.0/400/include/unix/limits.h as follows:
#define CHAR_BIT 8 => Number of bits in a byte. The actual values depend on the implementation, but the C standard requires them to be at least as large in magnitude as the minimums listed in the reference above.
#define CHAR_MIN 0 => minimum value for an object of type char
#define CHAR_MAX 255 => maximum value for an object of type char

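Similarly, signed char is defined as -128 to +127 (8 bits), unsigned char is 0 to 255, signed short int is -32768 to +32767 (16 bits), signed long int is -2147483648 to +2147483647 (32 bits). The C standard only requires int to cover at least the 16 bit range of short; on cortex M0, int is 32 bits, same as long.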

float.h: (look in http://www.acm.uiuc.edu/webmonkeys/book/c_guide/2.4.html)
--------
for float(32 bits), double(64 bits), long double(80 bits on x86, implementation dependent elsewhere).
ex: In cortex M0 float.h, we have:
#define FLT_MAX  3.40282347e+38F

These files describe the sizes and ranges that the compiler uses for the primary data types on that target. So, the same pgm may behave differently on different m/c if these files have diff values.

II. Derived data types: Derived data types are nothing but primary datatypes grouped together like array, structure, union and pointer.

Boolean type: _Bool (true/false). It's an unsigned byte. False has value 0 while true has value 1. It can be seen as derived from int.

In cortex M0 stdbool.h, bool is defined as _Bool, true as 1 and false as 0. _Bool was added in the C99 std, so C compilers older than C99 don't have it. C++ has a built-in bool type instead of _Bool.
#define bool _Bool
#define true 1
#define false 0

pointer type:

Pointer type var are a new class of var that store addr. Addr could have been stored as type int, but a special "pointer" type was declared to store the addr. For every type T (i.e char, int, etc) there exists a type pointer to T. Variables can be declared as being pointers to values of various types, by means of the * type declarator.
eg:
char v; declare var v as of type char. To get addr location of any variable, use &. So, addr of v is &v
char *pv; (can also be written as char* pv; or char * pv;)=> declares a variable "pv" which is a pointer to value of type char. pv doesn't store a char, but rather an addr. That addr stores a char var. *pv refers to contents stored at memory pointed to by this addr (the addr stored in var pv). &pv is the addr of this var pv, just like &v is the addr of var v. Since var v stores a char, it is just 1 byte in length, while var pv stores an addr, which is 32 bits on a 32 bit system. We can initialize pv to any addr, including special value "NULL" (pv=NULL;):
char *pv=0x1000; => assigns pv to addr value 0x1000. This is similar to char *pv;pv=0x1000; However, this will error out with this msg: "a value of type "int" cannot be used to initialize an entity of type char *". Reason is that although C language allows any integer type to be converted to a pointer type, such conversions cannot be done implicitly; an explicit cast is required to do this non-portable conversion. So, we have to do: char *pv = (char *)0x1000; => This casts 0x1000 to pointer type pointing to a char var, which is exactly what pv is, so both sides become same type.

char v='a';
char *pv=&v; => assigns pv to addr loc of variable v. If we print various values of v and pv, this is what we might get:

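printf ("v=%c, &v=%p, pv=%p, &pv=%p, *pv=%c",v,(void *)&v,(void *)pv,(void *)&pv,*pv); => v and *pv both store char, so we use %c, while pv, &v and &pv are addr, so we use %p (with a cast to void *). %x expects an unsigned int, so it is not portable for printing addr.

v=a, &v=0xf98ca5cf, pv=0xf98ca5cf, &pv=0xf98ca5c0, *pv=a => addr values differ from run to run. &v and pv are the same addr, and *pv gives back the char stored in v.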


pv = &v; => assigns addr loc of v to pv. If pv wasn't declared as a pointer, we could not do this.
pv = (char *)0x1000; => pv is assigned addr value of 0x1000. The addr 0x1000 stores a char type.
*pv = 0; => store 0 in contents stored at addr pointed by pv.

initializing ptr var: We can initialize ptr var to any addr val. However we can also initialize the contents of ptr var via this:

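char *p="Name"; => Here p is created as a ptr and is assigned the addr of the string literal "Name", whose chars are stored at addr p, p+1 and so on. So, *p='N', *(p+1)='a', and so on to *(p+4)='\0' (the terminating null char, not newline '\n'). *(p+5) lies outside the string, so reading it is undefined behavior. The string literal itself must not be modified thru p.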


Below 2 assignments are used extensively for rd/wrt of reg mem locations:
1. Rd from given mem loc: 0x1000 or "#define addr 0x1000"
data = *((char *)0x1000); => grab contents of addr loc 0x1000 and assign it to data.
2. Wrt to given mem loc: wrt 1 or "0x01" (since char, so 1 byte only, so 0x01. If short int, use "0x0001")
*((char *) 0x1000) = 1; => This stores 0x01 at addr 0x1000 (Note: it is not char '1', as char '1' is ascii value 0x31). Here we cast 0x1000 to type "pointer to char" and then "*" in front of it refers to contents stored at addr 0x1000. So, we store "1" at that addr. We can omit extra brackets and do: *(char *) 0x1000 = 1; NOTE: if we try to do: *0x1000=1; then we get this error "operand of "*" must be a pointer" as 0x1000 is an int and not a pointer, so explicit conversion is required.

double pointer: When the pointer stores addr of other pointer (instead of storing addr of char, int, etc), it's pointer to a pointer or double pointer

ex:

int **dpv; => here dpv is double ptr to type int. dpv stores addr of pointer pv for ex, where pv pointer points to data of type int

int *pv; => pv is the ptr pointing to data type int.

int v; => regular var declared as int

pv = &v; => single ptr pv is assigned addr of var "v".

dpv = &pv; => double ptr dpv is assigned addr of ptr pv. If dpv wasn't declared a double ptr, then &pv couldn't be assigned to dpv. If dpv was declared a single ptr (i.e int *dpv), then dpv could store addr of regular var "v" and not addr of single ptr pv.

use of double ptr: We use it as an arg of a function when we want to change, from inside the function, a pointer var which lives outside the function (just as a single ptr arg lets the function change a regular var).

1. We use it in arg of main func:

ex: int main(int argc, char **argv) => here we could use char **argv or char *argv[]. It represents an array of char sequences.

2. We use it  in arg of regular func:

 

Array: An array is defined as finite ordered collection of homogenous data, stored in contiguous memory locations.

Array types are introduced along with pointers because in C the two are closely related. An array is not itself a pointer, but in most contexts an array name "decays" to a pointer to the first element of the array. In simple words, the array name is converted to a pointer. That's the reason why you can use pointer syntax with the array name to manipulate elements of the array.

ex: char p[10] = "Name"; => This declares an array of 10 char type var = p[0] to p[9]. It is initialized to value "Name", so p[0]='N', p[1]='a',..p[4]='\0' (the terminating null char), and p[5] to p[9] are also set to 0. p[0] to p[9] are stored in contiguous mem locations, and *p or p[0] refers to first element of array, *(p+1) or p[1] to second element of array and so on. The way compiler translates an array is that it uses "p" as the ref addr, and then uses it to figure out p[0] = *p, p[1] = *(p+1) and so on. Access looks the same as with char *p="Name"; but here the chars are stored in the array itself and can be modified.

ex: printf("ptr: p=%p &p=%p &p[0]=%p *p=%c &p[1]=%p *(p+1)=%c\n",(void *)p,(void *)&p,(void *)&p[0],*p,(void *)&p[1],*(p+1));

o/p => ptr: p=0x23fccc60 &p=0x23fccc60 &p[0]=0x23fccc60 *p=N &p[1]=0x23fccc61 *(p+1)=a //as can be seen, p, &p and &p[0] all hold the addr of start of array p (though &p has type "pointer to array of 10 char"). p is used as a ptr to the first element of array p[0] to p[9]. Addr values differ from run to run.

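There are some special cases when an array doesn't decay into a pointer: when it is the operand of sizeof or of &, and when a string literal is used to initialize a char array. ex: char str[] = "Name"; Here "Name" is not converted to a pointer. Its 5 chars (including the terminating '\0') are copied into str, and the size of str is deduced as 5.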

2D array: A 2D array such as char p[3][4] is an array of arrays. All rows are stored one after another in contiguous memory: p[0][0] to p[0][3] is the first row, p[1][0] to p[1][3] is the second row and so on. When used in an expression, it decays to a pointer to its first row, i.e type char (*)[4], and not to a double pointer char **. A related but different structure is an array of pointers. To store each row, we could have a ptr p for 1st row, ptr q for next row and so on. Or we can have an array of ptr, char *p[], where p[0] points to 1st element of 1st row, p[1] points to 1st element of 2nd row and so on. So, it becomes a 1D array of pointers. Any 1D array name decays to a pointer to its first element, and the elements here are themselves pointers to char, so this array can be passed around as a double ptr char **p, or equivalently as char *p[] in a function parameter. The rows of such a structure need not be contiguous or of the same length, which is why it is used for argv.

struct:

Struct type is a collection of different kinds of data which we want stored as a single entity. For ex, a person's account may contain name, age, income, etc. We could make 3 different arrays: char type for name, int type for age, float type for income. But that would not be easy to manage. Instead, we can group all these items together in a structure named "account", and then access each of them thru it. That way, we have to create an array of just this "account" type, and it's much easier to manage data.


struct pointer:
------
struct reg { int a; char ch;};
struct reg arr[] = {{1,'s'},{2,'f'}}; //defines arr[0],arr[1] with values of reg type
struct reg *my0 = &arr[0]; //my0 is ptr to type "reg" and is assigned to addr of arr[0] which is of type "reg" too.
struct reg *my1 = &arr[1];

function pointers: http://denniskubes.com/2013/03/22/basics-of-function-pointers-in-c/
-----------------
function names actually refer to addr of func. So, we can have pointers to functions by using func name as addr (no need to use & with func name for addr).
char *pv=&v; => as explained above, this assigns pv to addr loc of variable v. *pv refers to contents of pv.
but for func, we need to have parentheses around pointer. Also, instead of char, we need to have return type of func, and also need to provide all arg types of func (if no args, write "void"; an empty () also compiles, but before C23 it means "args unspecified" rather than "no args"). i.e
void (*FuncPtr)() = HelloFunc; //HelloFunc is function defined with no i/p or o/p. OR
void (*FuncPtr)(void) = HelloFunc;
or:
void (*FuncPtr)(void); => define the ptr function
FuncPtr = HelloFunc; => assign ptr to addr of HelloFunc.

Now to reference any pointer value, we use *pv to read contents. (a=*pv reads contents of pv and stores in a). For function, we do the same, but we need to have *pv inside parentheses to indicate it's a func. Also, we need (), since () operator is needed to call a function (with args in it if any). i.e
(*FuncPtr)(); //This calls the func HelloFunc() => equiv to calling HelloFunc(); directly.
FuncPtr(); //This also works and is exactly equiv to above line. This is since a func name (label) is converted to a pointer to itself. This means that func names can be used where function pointers are required as input. &func and *func are same and refer to func name, which is a pointer to itself. So, instead of using &func or *func, we should use func or func() directly

ex: (with args)
int (*FuncCalcPtr)(int, int) = FuncCalc; //or *FuncCalc, &FuncCalc, **FuncCalc all are same. FuncCalc fn subtracts 2 int and returns an int.
int y = (*FuncCalcPtr)(10,2); //or FuncCalcPtr(10,2) is the same. returns 10-2=8, and stores it in y.

Func ptr can also be used as parameters to pass into another func. This is primary use of func ptr. ex:
int domath(int (*mathop)(int, int), int x, int y) {
  return (*mathop)(x, y);
}
in  main, do: int a = domath(add, 10, 2); //this calls domath func with ptr to "add" func.

typedef with pointers:
----
Above ex needed extra typing everytime fn ptr was defined. typedef can simplify this.
ex: Using typedef with pointers:
typedef int *intptr; => intptr is a new alias for pointer type int*. "typedef int dist" creates "dist" as a synonym for int. similarly "intptr" is a synonym for "pointer pointing to int"
intptr ptr; => this defines a var "ptr" with type intptr, i.e int*. So, ptr is a pointer which can point to a memory with int type. We could have also written it as "int *ptr".

ex: Using typedef with fn pointers:
typedef int (*FuncCalcPtr_def)(int, int); => creates "FuncCalcPtr_def" as a synonym for a pointer to a fn of 2 int args that returns an int
FuncCalcPtr_def FuncCalcPtr; => this defines a var "FuncCalcPtr" with type "FuncCalcPtr_def", which is a ptr to fn. We could have written this as "int (*FuncCalcPtr)(int, int);"

--------------------
size type:  http://www.acm.uiuc.edu/webmonkeys/book/c_guide/2.11.html
----------
size_t and ptrdiff_t were defined as separate type, since they are mem related. existing arithmetic types were deemed insufficient, since their size is defined according to the target processor's arithmetic capabilities, not the memory capabilities, such as available address space. Both of these types are defined in the stddef.h header as "typedef size_t", "typedef ptrdiff_t".

size_t: used to represent the size of any object (including arrays) in the particular implementation. It is used as the return type of the sizeof operator and is an unsigned integer type (often unsigned int or unsigned long, depending on the implementation).
ptrdiff_t:  used to represent the difference between pointers.

extended integer data type: *_t.
-------------------
to make code portable across diff OS, since existing int types have various sizes depending on system. The new types are especially useful in embedded environments where hardware usually supports only a few types and that support varies from system to system. For ex: int N; may be 16 bit with certain compiler/processor, while it may be 32 bit with other. Generally we don't care, but if these are being used in structure (as in MMIO), and we use a pointer to refer to various elements of that structure, then size of int does matter. One solution is to define new type as this:
typedef unsigned short uint16; => Here, we still need to know size of char,short,int,long on that compiler/processor and modify this defn as needed.

an example depicting this problem is this:
typedef struct {
 volatile uint16_t CTRL1;           // control register
 volatile uint16_t CTRL2; }  CWT_TypeDef;
#define CWT                     ((volatile CWT_TypeDef*)  0x50000000)

Now in main.c, we do:
CWT->CTRL1=0x0012; CWT->CTRL2=0xFF55; => since compiler understands uint16_t to be of 2 bytes, it adds 2 to base addr to store CTRL2.
STRH     r0,[r1,#0];   => stores 0x0012 at base addr
STRH     r0,[r1,#0x2]; => stores 0xFF55 at base_addr+2

However, if we rebuilt the same source for some compiler/arch where the type behind our 16 bit typedef (say "unsigned short") was actually 32 bits wide, then the struct layout would change: each store would be 4 bytes wide and CTRL2 would be placed at base_addr+4, so the writes would no longer land on the 16 bit registers at base_addr and base_addr+2. To get rid of this problem, only the typedef has to map to the correct 16 bit integer type for that target (the stdint.h supplied with the compiler already does this for uint16_t), and then we don't need to change our C code anywhere else. We recompile and generate correct binary for new arch. It saves us from changing code in multiple places.

C99 std defined its additional integer types with names ending in _t, and users are advised not to define new types ending in _t (POSIX reserves this suffix). The new types are defined in the stdint.h header (inttypes.h includes it and adds printf/scanf format macros), which is supplied by the compiler vendors. There are types for exact width, least width, fastest width, max width, etc. Exact width integer types intN_t, uintN_t are guaranteed to have the same number N of bits across all implementations, and are included only if the implementation can provide them. eg: uint8_t = unsigned 8-bit, uint32_t, etc.
sizeof operator can be used to find out the size of int, short, uint32_t, etc.

In cortex M0 stdint.h, int16_t is defined as type of "signed short int" =>  typedef signed short int int16_t;
similarly for uint8_t: typedef unsigned char uint8_t; => Note that in C, integers are represented in 16 bits or larger, so the only way to rep integer in 8 bits is by using signed/unsigned char.

---------------------
keywords & variables: http://www.acm.uiuc.edu/webmonkeys/book/c_guide/1.2.html#variables
---------------------
1. keywords: reserved keywords that can't be used as a variable identifier. ex: for, char, const, extern, etc.
---------
char short int long float double signed unsigned => type and type specifier
void
volatile const => type qualifier


2. variables: used to store values of a particular type
--------------------
names of identifiers: Identifiers starting with two underscores (`__'), or with an underscore followed by a capital letter, are reserved for the compiler's internal use according to the ANSI-C standard. Underscores (`_') are often used in names of library functions (such as "_main" and "_exit") and are reserved for libraries. In order to avoid collisions, do not begin an identifier with an underscore. Having __ both before and after a name (eg __Symbol__) is a convention used by compilers and system headers for exactly this reason: such identifiers are extremely rare in user code, so name collisions are almost guaranteed not to happen. User code should leave these names to the implementation.

A variable is defined by the following: <storage-class-specifier> <type-qualifier> <type-specifier> <type> variable-names,...
ex: extern const volatile unsigned long int rt_clk; => defines real time clk variable rt_clk

I. storage-class-specifier: storage class reflects data's lifespan during program execution.
1. typedef: The symbol name "variable-name" becomes a type-specifier of type "type-specifier". No variable is created.
ex: typedef long int mytype_t; => declares a new type mytype_t to be long int. From here on, we can use mytype_t instead of long int.
typedef most commonly used with struct to reduce cumbersome typing.
ex:
struct MyStruct {
    int data1;
    char data2;
};
with no typedef, we define var "a" of type MyStruct struct as follows: struct MyStruct a;
however with typedef, we can just define a new type as follows: typedef struct MyStruct newtype;
or we can directly do typedef with struct defn as follows:
typedef struct  MyStruct {  
    int data1;
    char data2;
} newtype;
then we can define var "a" as being of type "newtype" as follows: newtype a;

2. extern: Indicates that the variable is defined outside of the current file. This brings the variables scope into the current scope. No variable is created.
3. static (permanent): Causes a variable that is defined within a function to be preserved in subsequent calls to the function. Variables declared outside the body of any function have global scope and static duration (for ex var declared outside main() are static, as they are not within any function). Initial values may be assigned to such var; if no initial value is given, var with static duration are initialized to 0 (unlike auto var, which start with indeterminate values). Since main() itself is a function, all var defined in it are local and auto, so we use "static" for these var to make them permanent. Variables declared outside main() are global for all functions in that file. static var are not released from mem on exit of function, so they consume mem space for the whole run of the pgm.
NOTE: we sometimes define function itself as "static".  In C, a static function is not visible outside of its translation unit, which is the object file it is compiled into. In other words, making a function static limits its scope. You can think of a static function as being "private" to its *.c file.
4. auto (temporary): Causes a local variable to have a local lifetime (default). Any variables declared within body of a function, including main(), have local scope and auto duration.
5. register: Requests that the variable be accessed as quickly as possible. This request is not guaranteed. Normally, the variable's value is kept within a CPU register for maximum speed.

II. type-qualifier: any declaration, including those of variables, struct/union, enum, etc can also have a type-qualifier (volatile, const) which qualifies the decl, instead of specifying it.
1. volatile: added to C later. It causes the value to be fetched from memory every time it's referenced. It tells the compiler that the object is subject to sudden change for reasons which cannot be predicted from a study of the program itself, and forces every reference to such an object to be a genuine reference. This is used for defining variables that map to peripheral registers, so that the compiler doesn't reuse a stale copy of the value that it kept in a CPU register a while back.
ex: volatile int j;

2. const: const means that something is not modifiable, so a data object that is declared with const as a part of its type specification must not be assigned to in any way during the run of a program.
ex: const int ci = 123; => declares a simple constant ci which always has a value of 123.
ex: const int *cpi; => declares a pointer "cpi" to a constant. cpi is an ordinary, modifiable pointer, but the thing that it points to must not be modified through it. compiler will reject any attempt to write through cpi (eg *cpi = 5;).
ex: int *const cpi => declares a pointer "cpi" which is constant. It means that cpi is not to be modified, although whatever it points to can be modified, i.e the pointer is constant, not the thing that it points to.

III. type-specifier: void, all arithmetic types, boolean type, struct, etc.

3. enumerated tags:
-----------------

4. Arrays:
--------
array is defined as: <type-of-array> <name-of-array> [<number of elements in array>];
ex: int arr[10] => defines an array of 10 integers. Array elements are arr[0], arr[1], ... arr[9]. Each integer is 4 bytes (let's assume), so total size of array=4*10=40 bytes.

to initialize arrays:
we can either use for loop in main pgm, or init it at time of declaration.
int arr[] = {1,2,3,4,5}; => this init an array of 5 integers as follows: arr[0]=1, arr[1]=2 and so on. NOTE: there is no need to mention any value in the subscripts []. The size will automatically be calculated from the number of values. In this case, the size will be 5. (Writing the values in single quotes, eg '1', would store the character codes instead, i.e 49 for '1' in ASCII.)
to init array with string 2 ways:
A. char arr[] = {'c','o','d','e','\0'};
B. char arr[] = "code"; => equiv to ex A above. We don't need an explicit null char here, since double quotes do that for us.

To access values in an array:
int j=arr[2]; => assigns j to arr[2] value which is 3.

We can also define array of structures, as well as array elements within structures.

5. structures and union:
----------------------
struct st{
    int a;
    char c[5];
};
int main()
{
    struct st st_arr[3]; // Declare an array of 3 structure objects
    struct st st_any[] = { {0,'c'},{1,'f'}}; //declares an array of 2 struct obj with values assigned    
    struct st st_obj0; // first structure object
    st_obj0.a = 0;
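    st_obj0.c[0] = 'a'; // c is a char array, so assign to one element (or use strcpy). "st_obj0.c = 'a';" won't compile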
}

6. const:

7. strings: simply an array of characters encapsulated in double quotes. At the end of the string a null character is appended.
ex: char x[]="\x41" or "A" are the same string. x[0]='A', x[1]=null character.

8. define, ifdef: preprocessor directives, which keep or remove portions of code before compilation, depending on the directive
#define NEW 0
#ifdef NEW => since NEW is defined this portion is kept by compiler
#define var ab
#else => this portion is removed by compiler
#define var cd
#endif

ex: this is very commonly used with defines, so that we don't redefine something when the same code is pulled in multiple times
#ifndef CHAR_T => this piece of code can be placed in multiple files, but CHAR_T gets defined only once in each compiled file.
#define CHAR_T 0x45
#endif
 
-----------------------------------

random number gen:
------------------
srand((unsigned) time(&t)); => inits rand num gen with time (in sec since epoch)
rand() % 50; => generates rand num b/w 0 to 49 using above seed. otherwise rand() will always gen same rand num seq since it will always use same seed (if we don't use srand).

Python:

Python is a general-purpose interpreted, interactive, object-oriented, and high-level programming language. Similar to perl.
Python provides interfaces to all major commercial databases. Python supports GUI applications that can be created and ported to many system calls, libraries and windows systems. It can be easily integrated with C, C++, Java, etc.

Python has so much support from community, that almost everything can be done by using python's vast library or modules. You can build a website, write a program for raspberry pi, build games with gui, etc. In fact, it's one language that you can learn and get by doing everything without learning any more languages. I've avoided python in the past because of its huge confusion over python2 vs python3, but now this issue looks settled.

Python 3 is latest version, python 2 reached EOL on Jan 1, 2020. So, switch to python 3 for all coding. There are significant differences b/w python2 and python3, so don't waste your time learning python2. However, we may still want to install both python2 and python3, as many pgms are still written in python2. So, not having python2 will cause those pgms to error out. When we install python 2, it is installed as both python (python is just a soft link to python2) and python2, while python 3 is installed as python3. Since python is just a soft link to python2, we can change it at any time to point to either python2 or python3. This helps reduce confusion, as pgms written in python 2 may suddenly start failing if python3 is installed as python. So, we keep soft links for all of python, python2 and python3. We can change the soft link of "python" to point to "python3" if needed. However, it's advisable to leave python soft link pointing to python2, and have a separate link for python3.

python3 itself has several versions as python3.4, python3.6, etc. Latest stable version is python3.8 as of 2020. 

NOTE: On any linux distro, after installing latest version of python3, change the soft link of python3 to point to python3.8 or whatever is the latest version. That way, your latest python version would be available for your programs, by just typing python3.

cmd: cd /usr/bin; rm python3; ln -s python3.8 python3; => now calling python3 calls python3.8

All the discussion below is for python3, unless explicitly specified. I'll point out the differences where applicable. Be careful when searching for python on internet, as many of them are for python2, and may not work for python3.

Official doc is on python.org site: https://docs.python.org/

geeksforgeeks also has very elaborate and fantastic tutorial here: https://www.geeksforgeeks.org/python-programming-language/

 

Python installation:

On any Linux distro, to install python3, there are 2 ways: one thru package mgmt, while other by manually downloading and installing.

1. Package mgmt:

CentOS: For CentOS, we install using yum. Below are the ways to install python2 and python3.

A1. rpm pkg for python2: sudo yum install python => by default, it installs python2. It specifically installs python 2.7 in /usr/bin/python2.7. A soft link is created to python2 in /usr/bin dir (/usr/bin/python2 -> python2.7). Another soft link "python" is made to python2 (/usr/bin/python -> python2).

A2. rpm pkg for python 3.4: sudo yum install python34 => installs python 3.4 in /usr/bin/python3.4. A soft link is created to python3 in /usr/bin dir (/usr/bin/python3 -> python3.4). python and python2 are soft links to python2.7 already installed

A3. rpm pkg for python 3.6: sudo yum install python36 => installs python 3.6 in /usr/bin/python3.6. A soft link is created to python3 in /usr/bin dir (/usr/bin/python3 -> python3.6). python and python2 are soft links to python2.7 already installed

A4: rpm pkg for python 3.7: sudo yum install python37 => Although latest version of python is python3.7, yum repo still doesn't have it, and gives an error that "no such package found". Run "yum info python37" to find out if python 3.7 available or not.

NOTE: one very important thing to note is that "yum" is written in python2. So if you change soft link of python to change to python3 (after installing python3), then yum will not work and will throw this error:
  File "/usr/bin/yum", line 30
    except KeyboardInterrupt, e:
 SyntaxError: invalid syntax

To fix this, do one of 2 things:

1. change python version being called in yum to python2: In /usr/bin/yum file, change first line from "#!/usr/bin/python" to "#!/usr/bin/python2". This will force python2 soft link to be used, instead of using python link.

2. change softlink in python to python2. This will cause yum to still use python2 as softlink python is pointing to python2. However, this may cause other pgms to fail, which may rely on python3, and need python link to point to python3. To fix this, for any pgm that needs python3, change first line in that pgm to point to python3 instead of python

First choice is preferred, as python3 is the step forward, so keeping soft link python pointing to python3 is going to work for most pgms.

Linux Mint: On LinuxMint, we install using apt. Latest python is 3.8 as of June 2021.

A1. sudo apt install python3.8 => This installs python version 3.8. Look in /usr/bin/ dir to make sure you see python3.8 over there.

 

2. manual: not tried yet. It's not the recommended way, as it requires a lot more effort, and there's no reason to do it (as all linux distro allow you to install via pkg mgmt)

 

Python syntax:

1. comment: Python comment is anything after a #, either at start of line or after a stmt on the same line. There is no separate multiline comment syntax, but it can be mimicked by putting the text within triple quotes, i.e """ .... multi line comment """ (this is really a string literal that is never used).

ex: a=2 #comment


2. case sensitive: Python is case sensitive. So, var and Var are different.


3. End of line: Each stmt ends with newline \n. So, no ; needed (this is in contrast to other languages which use ; etc to indicate end of stmt). However for multi stmt in single line, we need ; to separate them. In cases where we need a stmt to continue on the next line, we use line continuation char \ as the very last char of the line. Similar to how \ hides the metacharacter immediately following it in a shell, here the newline is hidden from the python interpreter, so it sees both lines as one logical line.
ex:
total = item_one + \
        item_two + \
        item_three
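A small sketch of the continuation rule above, for python3. The values assigned to item_one etc are made up just for illustration. Besides the backslash, python also continues a stmt automatically while a parenthesis is still open, which is shown as an alternative:

```python
# Illustrative values only; the names mirror the example above.
item_one, item_two, item_three = 1, 2, 3

# Explicit continuation: backslash is the last char on each line
total_backslash = item_one + \
                  item_two + \
                  item_three

# Implicit continuation: an open parenthesis keeps the stmt going
total_parens = (item_one +
                item_two +
                item_three)

# Several stmts on one line, separated by semicolons
a = 1; b = 2

print(total_backslash, total_parens, a + b)
```

Both forms give the same result; the parenthesized form avoids problems caused by a stray space after the backslash.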


4. Blocks of code: no braces provided, instead all statements within the block must be indented the same amount. This is a distinctive feature of python, and can be confusing at first, as most other languages use brackets or keywords to mark begin or end of block and don't rely on spaces or indentation. By convention, indentation is 4 spaces per level: 4 spaces signify a block, 8 spaces signify another block nested within the outer parent block, 12 spaces yet another nested block and so on. Python itself accepts any amount (even 1 space, or a tab), as long as all lines of the block use the same indentation, but for readability we keep it as 4 spaces (most editors can be set to insert 4 spaces for a tab). Python 3 gives an error if tabs and spaces are mixed inconsistently, so stick to one. All of the code with same number of spaces at start of line is considered part of one block. NOTE: we can't have 0 spaces to identify a block, as that will error out. We do need some indentation.
ex:
if True: => header line begin stmt with keyword (if, else, while, etc), terminate with : and are followed by suite
   print("True") => this group of stmt in single code block called suite. This is indented relative to "if", so it's part of if block

   print("I'm here") => This is part of if block too, as it's same indentation.
else:
   print("False") => this is part of else block, as it's indented the same way

print("end") => this is not part of if-else block as it's not indented at all.
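To see nested blocks in runnable form, here is a minimal python3 sketch. The function name "describe" and its return strings are hypothetical, picked only to show 2 levels of indentation inside a function body:

```python
def describe(n):
    # Hypothetical helper, used only to show nested indentation levels.
    if n > 0:                      # outer block starts after the colon
        if n % 2 == 0:             # nested block: one more indentation level
            result = "positive even"
        else:
            result = "positive odd"
    else:
        result = "zero or negative"
    return result                  # back at function-body level


print(describe(4))
print(describe(3))
print(describe(-1))
```

Each extra level of nesting adds 4 more spaces; going back to fewer spaces closes the inner block.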

NOTE: these are 2 of the most distinct departure from other languages.

I. One is the absence of end of line character (i.e no semicolon etc, just a newline marks end of cmd in a line). We can always add a semicolon at end of line and python will work just fine, but correct way is to not put a semicolon.

II. Second is the use of tabs or spaces to identify blocks of code. Usually high level languages don't rely on spaces for correct functionality, but python is all about spaces. Most other languages use curly braces  { ... }  to define scope of loops, functions, etc.

5. reserved keywords: Like any other pgm lang, python has reserved keywords, which can't be used as var names or for any other purpose. ex:

1. if else,
2. and, not, or
3. for, while, break, continue
4. try, return, assert, class, def, import. NOTE: print and exec were keywords in python2, but in python3 they are regular built-in functions. print function is most used function and is explained later under "Functions" section.

6. quotes: single quotes and double quotes have same meaning, and so are interchangeable in python. We use one or the other when it's absolutely needed, i.e use double-quotes if your string contains a single-quote

Running python:

We can run python interactively or run a python pgm via a cmd line

python --version => returns version num. If it's 2.x it's older, if 3.x it's newer. We can also run "python -V" to get version num.

1. interactively:

typing python brings up python shell. Prompt is >>>. We can type any python cmd in it. Type "Ctrl + D" to exit shell.

>>> print("Hello")

prints Hello on screen

2. via cmd line:

file: test.py => here we are specifying python3 as interpreter instead of python (since python is usually set as a soft link to python2)

#!/usr/bin/python3
print ("Hello, Python!") #this is a comment:
# single line comment


> type ./test.py to run above file. (do chmod 755 test.py to make it a executable file).

> python3 test.py => This also runs the above python file. We could do "python test.py" too. If python is a soft link to python2, this will work only as long as syntax in test.py is also valid python2 syntax.

 

Data Types and variables:

 

I. Variables:

As in any programming language, we need to define variables which store data, which may be of multiple data types supported by the language. var do not need explicit declaration of data type. This declaration happens automatically when you assign a value to a variable using = sign (i.e var2=1.1 => assigns float num 1.1 to var2)

variable names or Identifiers: start with a letter (a-z, A-Z) or _, followed by letters, digits or _. Variables are not declared beforehand to be of a particular type (as in C); this is common practice in most scripting languages. The type is figured out by python during assignment.

II. Data types: Python is strongly typed language, meaning values are not silently converted b/w unrelated types. We need to explicitly convert one data type to other to use it (eg "1"+1 gives a TypeError, while int("1")+1 works).
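A minimal sketch of what strong typing means in practice (python3). The variable names and values here are made up for illustration:

```python
count = 3            # int, type inferred from the assigned value
label = "items: "    # str

error_name = None
try:
    message = label + count        # str + int is not allowed
except TypeError as err:
    message = None
    error_name = type(err).__name__

converted = label + str(count)     # explicit int -> str conversion works
total = int("39") + count          # explicit str -> int conversion works

print(error_name, converted, total)
```

The same name can later be re-assigned to a value of another type (count = "three" is legal), since the type belongs to the value and not to the variable name.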

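A. primitive data types: In python, we have these primitive data types:

1. numbers: numbers may be of 3 types in python3:

 A. int (signed integers of unlimited size, can also be written in oct or hex) ex: var1=10; var2=-579678; var3=0xDEADBEEF
 B. float (fp real) ex: var4=15.2; var5=32.3e18
 C. complex (complex num) ex: 3.14j, 4.5-3j

NOTE: python2 had a separate "long" type (written with L suffix, ex: -579678L). In python3, int and long are merged into int, and the L suffix is a syntax error.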
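A quick sketch of the number types, assuming python3 (where there is a single int type of unlimited size and no separate long). Variable names are arbitrary:

```python
var1 = 10              # int
var2 = 0xDEADBEEF      # hex literal, still an int
var3 = 2 ** 100        # ints grow as needed, no overflow
var4 = 32.3e18         # float
var5 = 4.5 - 3j        # complex

print(type(var1).__name__, type(var3).__name__,
      type(var4).__name__, type(var5).__name__)
print(var2, var3)
print(var5.real, var5.imag)   # real and imaginary parts are floats
```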

2. Strings: contiguous set of char in "...." or '....'. In both of these quotes, variable values are not substituted but printed as is. There is special formatting available that allows substitution within single or double quotes, which is explained later. This is different from shell and other scripting languages, which treat single and double quotes differently. There is also a triple quote in python that allows string to span multiple lines (so newline, tab etc are treated as part of string).

ex: var1 = "my name"

ex: address = ''' my house is at
1207 goog ln,
los angeles '''

=> due to triple quotes, everything from the opening quotes to the closing quotes is part of string, including new lines. print(address) will print all 3 lines as is.


Subsets of strings can be taken using the slice operator ([ ] and [:] ) with indexes starting at 0 in the beginning of the string and working their way from -1 at the end.
The plus (+) sign is the string concatenation operator and the asterisk (*) is the repetition operator.
ex: str='My world'; print(str[0]) => prints M; print(str[3:7]) => prints worl (index 7 is excluded); print(str+"TEST") => prints My worldTEST
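The slicing rules are easier to check with the index of every char written out. A small sketch (the variable is named "text" here, to avoid hiding python's built-in str type):

```python
text = "My world"       # indices: M=0, y=1, ' '=2, w=3, o=4, r=5, l=6, d=7

first = text[0]         # single char at index 0
middle = text[3:7]      # indices 3..6, index 7 is excluded
last = text[-1]         # negative index counts from the end
tail = text[3:]         # blank stop means "till the end"
joined = text + "TEST"  # + concatenates
repeated = "ab" * 3     # * repeats

print(first, middle, last, tail, joined, repeated)
```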

There are 2 types of string in python2: traditional str type (1 byte or 8 bit char), and newer unicode type, which is selected by putting a char "u" (or "U") in front of the string. In python3, every str is already a unicode string, so the "u" prefix is still accepted but makes no difference, and raw 8 bit data is kept in a separate "bytes" type (written as b'...'). UTF-8 is an encoding used to store or send unicode text as bytes, where each char takes a variable length from 1 byte to 4 bytes. (UTF-8 is used widely now, since 1 byte could only store ASCII char and can't handle the huge number of other char out there. UTF-8 is compatible with 1 byte ASCII code)

There are many other prefixes besides "u" to indicate how the string is going to be interpreted. "r" means raw string type (so that anything inside the string is going to be treated as literal and not interpreted). ex: r"me\n" => this is not going to treat \n as new line but instead as 2 literals \ and n.

str=u'Me .op' => in python2, this string is unicode type (since u precedes the string), and u'text' is just a shortcut for calling unicode('text'). In python3, this is the same as str='Me .op', as all strings are unicode.
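A short sketch of how string prefixes behave, assuming python3 (where str is always unicode and bytes is a separate type). The sample word is arbitrary, chosen because it has one non-ASCII char:

```python
plain = "Me .op"
prefixed = u"Me .op"             # u prefix is accepted, but changes nothing in python3

word = "na\u00efve"              # 5 char, the 3rd one (i with 2 dots) is non-ASCII
encoded = word.encode("utf-8")   # bytes object; the non-ASCII char takes 2 bytes
decoded = encoded.decode("utf-8")

raw = r"me\n"                    # raw string: backslash and n stay as 2 separate char

print(plain == prefixed, len(word), len(encoded), len(raw))
```

len() of a str counts characters, while len() of the encoded bytes counts bytes, which is why the two numbers differ for non-ASCII text.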

formatted strings: In version 3.6 of python, formatted string literals were introduced. So far, no substitutions happened for any characters inside strings, but with formatted string (or f-string), we can provide replacement fields by enclosing them within { ... }. Any python expr is allowed within these curly braces.

ex: name = "Fred"

a = f"He said his name is {name}." => This substitutes name with variable "name"., since we have f in front of string.

a = "He said his name is {name}."=> no substitution occurs
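A minimal f-string sketch, needs python 3.6 or later. The "age" variable and its value are made up for illustration:

```python
name = "Fred"
age = 41   # made-up value for illustration

a = f"He said his name is {name}."
b = "He said his name is {name}."            # no f prefix: braces stay literal
c = f"Next year {name} will be {age + 1}."   # any expression is allowed in { }
d = f"{3.14159:.2f}"                         # optional format spec after a colon

print(a)
print(b)
print(c)
print(d)
```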

NOTE: char: There is no char var type in python. char are rep with string with length of one.

There are many string methods available to operate on strings. Look in python link for such methods.

ex: str.upper() returns a copy of string, with all letters uppercased. "My name".upper() returns string "MY NAME"

3. boolean: 2 values: True and False.

 

B. Compound data types:

4. List: most versatile. similar to arrays in C, except that items belonging to list can be of diff data types. NOTE: there is no built-in array data type in Python. Lists are a superset of arrays, so we use list in its place. Lists have syntax same as that of array. On internet, a lot of articles talk about array in python. In reality they are not talking about array, but list.

list contains items separated by commas and enclosed within [].

1D Lists:

ex: mylist = [] => this defines an empty list (since [] used, it implies a list). The size of list is not fixed here or anywhere else: python lists grow and shrink dynamically, so entries can be added later (using append, etc).
ex: list1 = ['A',1,"john", 23]; print(list1[1:3]) => prints [1, 'john'] => Here, we specify entries of the list, and the list is created with those 4 entries. NOTE: the range specified includes item with index=1,2 but NOT index=3, as range is up to index-1. Also, commas are preserved when we print the list
ex: list1[3]=102 => this updates value 23 with new value 102, not possible with tuple since it's read only
ex: for x in [1, 2, 3]: print(x, end=" ") => prints 1 2 3

Assigning values to list: We saw one way to assign initial values to list. Let's see if we can assign initial values to a list in other way.

my_list[0]=4;=> Here we try to define my_list for the 1st time with 0th entry having value 4. Previously, we assigned list values as my_list=[4] which worked. This will give a Name Error: "NameError: name 'my_list' is not defined". This is because we are accessing an index of a list that doesn't exist yet. So, let's define an empty list.

my_list = [];  my_list[0]=4; my_list[1]=2; => This will give an Error: "IndexError: list assignment index out of range". This is because the list is empty (size 0), and assigning to an index only works for entries that already exist. If we assigned values to this list as my_list = [4,2] => then the list has size 2, with my_list[0]=4 and my_list[1]=2. Then we can access value as my_list[0]. To add new entries to an existing list, use my_list.append(4).

One way to resolve above issue is to define the list with size specified. ex: my_list = [0]*4; => This defines a list with 4 elements [0,0,0,0]. Now we can do my_list[0]=4. Here *4 just repeats the contents of the list 4 times, and the entries can later be overwritten with values of any type.
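The different ways of filling a list discussed above can be sketched as follows (python3). The names prefilled, mixed and squares are arbitrary:

```python
my_list = []
failed = False
try:
    my_list[0] = 4           # empty list has no index 0
except IndexError:
    failed = True

my_list.append(4)            # append grows the list by one entry
my_list.append(2)

prefilled = [0] * 4          # fixed-size list filled with a placeholder value
prefilled[0] = 4             # existing index, so assignment works

mixed = ["a", 1] * 2         # repetition works for entries of any type
squares = [i * i for i in range(4)]   # list comprehension builds the list in one go

print(failed, my_list, prefilled, mixed, squares)
```

For lists whose size isn't known in advance, starting from [] and calling append is the usual approach; the [0]*n form fits when the size is known up front.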

2D lists:

2D lists are an extension of 1D list, i.e each element of a 2D list is in itself a 1D list.

ex: my_arr = [ [300, 200,100, 900], [600, 500, 400, 700] ]; => This is a 2D list, where each list element is 1D list.

Accessing list elements: We access it the same way as in 1D list, except that we provide the index of 1D list also.

print(my_arr[1][0:2]) => prints [600, 500]. This is called slicing of array/list/tuple. format is [start_index:stop_index:increment of index], where the entry at stop_index itself is not included. See in numpy module section for more details. so, my_arr[0][3:1:-1] =  [900, 100]

ex: print(my_arr[:]) => This prints entire 2D array since blank start means start from 1st index and blank end means stop at last index. The slice selects all rows, and each row is printed in full, so o/p is: [[300, 200, 100, 900], [600, 500, 400, 700]]. This applies to any dimension array. arr[:] will return all elements of the array. NOTE: slicing across multiple indices doesn't work the way it does in numpy, i.e in my_arr[1:3][0:5] the 2nd slice is applied to the list of rows returned by the 1st slice, and not to the columns. So it returns [[600, 500, 400, 700]].
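A small sketch of slicing on the same my_arr list (python3), including one possible way to slice columns of a plain nested list by applying the column slice to each row. The names row_part, chained, cols etc are arbitrary:

```python
my_arr = [[300, 200, 100, 900], [600, 500, 400, 700]]

row_part = my_arr[1][0:2]           # row 1, columns 0..1
reversed_part = my_arr[0][3:1:-1]   # row 0, index 3 down to 2 (index 1 excluded)

rows = my_arr[1:3]                  # slice of rows: only row 1 exists in this range
chained = my_arr[1:3][0:5]          # 2nd slice again selects rows, not columns

# To slice columns, apply the column slice to each selected row
cols = [row[0:2] for row in my_arr[0:2]]

print(row_part, reversed_part, rows, chained, cols)
```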

We define a 2D list same way as 1D list. i.e list_2D = []. However, we can't do something like list_2D[0][0]=5 without having this list already specified (for same reasons as 1D list above) with values as: list_2D=[[67,34],[35,67]]. Now we can do: list_2D[0][0]=5.

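We can initialize 2D list as: my_list = [[0]*2]*3; => This will create 2D list of 3 rows x 2 columns with all values as 0, i.e [[0,0],[0,0],[0,0]]. NOTE: all 3 rows here refer to the same inner list, so my_list[0][0]=5 changes every row. To get independent rows, use: my_list = [[0]*2 for i in range(3)]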
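One thing to watch out for with the [[0]*2]*3 form is that the outer *3 copies references to the same inner list, not the list itself. A minimal sketch of the difference (python3, names "shared" and "independent" are arbitrary):

```python
shared = [[0] * 2] * 3
shared[0][0] = 5            # shows up in every row, since all rows are one object

independent = [[0] * 2 for _ in range(3)]   # a new inner list is built per row
independent[0][0] = 5       # only row 0 changes

print(shared)
print(independent)
print(shared[0] is shared[1], independent[0] is independent[1])
```

The list comprehension form is a common choice for 2D lists that are going to be modified element by element.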

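list operators: There are multiple functions and methods for manipulating lists. Some of them are: len(list3), which returns the number of elements; list4.sort(), which sorts the list in place; sorted(list4), which returns a new sorted list. cmp(list1, list2) exists only in python2; in python3, compare lists directly with ==, <, >, etc.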

Arrays: Lists behave almost the same as arrays, but are less efficient. Lists are more generic than arrays (they allow mixed data types, while an array allows only one data type), and that generality makes them slower for storing/retrieving, computing, etc. Most scientific computations can be carried out with arrays, since they usually work on only one kind of data (i.e int, float, etc). Python doesn't enforce typing (i.e restricting a var to one particular type as int, etc), so the core language has no built-in array data type. For most basic uses, lists serve our purpose, and we don't care about speed. However, if performance becomes critical because of the large amount of data to work with, then arrays are needed.

We said above that python has no built-in array type. However, python supports modules which allow us to use arrays. There are 2 ways to create arrays in Python:

A. array module: Python has a module "array" that can be imported to get arrays. We specify the type of the data elements, and all elements of the array have to be of that data type. There are many functions available to operate on arrays. This is not the recommended method for creating arrays; use the 2nd method, the numpy module.

ex: import array as arr => We don't need to install any module for this. More details about array module can be found on internet

my_array = arr.array('i', [2, 4, 6]); print(my_array) => prints array('i', [2, 4, 6]) => NOTE: everything in the array, including the data type, is printed. Again, commas are preserved while printing the array (same way as in lists)
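A small sketch of the array module in use, showing that an array accepts only elements of its declared type. The type code 'i' (signed int) is from the example above; the variable names total and rejected are made up here.

```python
import array as arr

my_array = arr.array('i', [2, 4, 6])   # 'i' = signed integer elements
my_array.append(8)                     # fine: 8 is an int
total = sum(my_array)                  # arrays can be iterated like lists

try:
    my_array.append(1.5)               # a float doesn't fit an 'i' array
    rejected = False
except TypeError:
    rejected = True

print(my_array)
print(total, rejected)
```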

B. numpy module: There is NumPy module that can be used to create arrays. It's not included by default with Python distribution, so will need to be installed (see in NumPy section). This is the recommended method for creating arrays.

5. tuples: similar to lists, but specified using ( ). However, they cannot be updated once created (i.e they are read only). We can apply slicing to tuples also. Used very rarely in simple codes.
ex: tuple1 = ('ab', 2.7)

6. sets: sets are similar to sets in maths, where we can take union, intersection, etc. Sets are defined using curly braces { .. }. They can contain any number of objects, of different types. Sets are unordered: the original order, as specified in the definition, is not necessarily preserved. Additionally, duplicate values are represented in the set only once. Set elements must be immutable. For example, a tuple may be included in a set, as it's immutable. However, lists and dictionaries are mutable, so they can't be set elements. Another way to create a set is using the set() function.

ex: x = {'foo', 'bar', 'baz', 'foo', 'qux', 12, (1,2), None}

print(x) => {None, 'foo', 12, (1,2), 'bar', 'baz', 'qux'} => NOTE: duplicate entries are removed, and the order of elements is not preserved (the order you see may differ from this)

Many operators as union, intersection, difference, |, &, ^, etc are allowed on sets. sets are also very rarely used in simple programs.
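The set operators mentioned above can be sketched as follows. The sets a and b are sample data made up for this illustration.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b            # same as a.union(b)
common = a & b           # same as a.intersection(b)
only_in_a = a - b        # same as a.difference(b)
in_one_only = a ^ b      # same as a.symmetric_difference(b)

empty_set = set()        # {} would create an empty dictionary instead
unique_words = set(['foo', 'bar', 'foo'])   # duplicates collapse

print(union, common, only_in_a, in_one_only)
print(type(empty_set), type({}))
```

A common use of sets in simple programs is removing duplicates from a list, as in the last assignment.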


7. dictionary: They are like hashes in perl. They are also known as associative arrays. They consist of key-value pairs. Values can be of any data type; keys must be immutable (as strings, numbers or tuples).
Dictionaries are enclosed by { }, and values are assigned and individual elements are accessed using [ ... ]. Since both sets and dictionaries use { }, we distinguish b/w the two via the presence of ":". Since { } is used to rep an empty dictionary, we can't use { } to rep an empty set (since then the python interpreter has no way of knowing if the object is a set or a dictionary). In that case, we use the set() func with no args to create an empty set. We use ":" to assign a key:value pair for each element

1D dictionary: Just like 1D list, we have 1D dictionary:

ex:

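tinydict = {'name': 'john','code':6734, 'dept': 'sales'} => Assigns key value pairs as follows: name->john, code->6734, etc. In python3, print(tinydict.keys()) prints dict_keys(['name', 'code', 'dept']) while print(tinydict.values()) prints dict_values(['john', 6734, 'sales']), with insertion order kept (python 3.7+). In python2, these return plain lists in arbitrary order, as ['dept', 'code', 'name'].

tinydict['name'] returns "john", tinydict['code'] returns 6734 and so on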

Assigning values to dict: There are 2 ways to assign dict key/value pairs.

A. We can assign dict key/value pair as we did in 1D list, and as shown in ex above.

ex: tinydict = {"name": "john",5:9}

B. We can also assign dict values one key at a time, as shown below. This is different than in 1D list, where we weren't allowed to do my_list[0]=5 on an empty list.
my_dict = {} => initialize dict. This is needed, as w/o this the name my_dict is undefined, and it's the { } that tells python that my_dict[0]=1 is a dictionary assignment and not a list assignment. NOTE: avoid naming the var "dict", as that hides the builtin dict() func.

my_dict[0]=5 => Now we are allowed assignments like these. NOTE: 0 is a key here, and not an index number. It just happens to be an integer key here, as 0 is not enclosed in quotes. The value is also an integer as it's not enclosed in quotes.
my_dict['one'] = "This is one" => print(my_dict['one']) prints "This is one". Here both key and value are strings.
my_dict[2]     = "This is two"

2D dictionary: Just like 2D list, we can have higher dim dict as 2D, 3D, etc. However, for 2D dict, we can't directly do something like dict_2D['a']['b']='xyz' on an empty dict. Python first looks up dict_2D['a'], and since key 'a' doesn't exist yet, it raises KeyError. So, we have to first define a 1D dict, and then use that 1D dict as an element of the 2D dict.

ex: dict1D={}; dict1D['age']=35; dict1D['salary']=300;

dict2D={}; dict2D['ramesh']=dict1D => Now dict2D['ramesh']['age'] is 35, dict2D['ramesh']['salary'] is 300 and so on. dict2D['mohan']={'age':50,'salary':500}. So a 2D dict is just a dict whose values are 1D dicts.

So, 2D dict are a little cumbersome to write, as you first need to form the 1D dict and then use that as an element of the 2D dict. It would have been nice to just directly assign elements to a 2D dict. The setdefault() method comes close: dict2D.setdefault('ramesh', {})['age']=35 creates the inner dict on 1st use.
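A sketch of three ways to fill a 2D dict is shown below. collections.defaultdict is a standard library class that isn't covered in these notes; it is brought in here as one possible way to get the "direct assignment" wished for above. The name auto2D is made up for this illustration.

```python
from collections import defaultdict

# Plain dict: the inner dict must exist before we assign into it.
dict2D = {}
try:
    dict2D['ramesh']['age'] = 35       # key 'ramesh' doesn't exist yet
    failed = False
except KeyError:
    failed = True
    dict2D['ramesh'] = {}              # create the inner dict first
    dict2D['ramesh']['age'] = 35

# setdefault() creates the inner dict only if the key is missing.
dict2D.setdefault('mohan', {})['age'] = 50
dict2D.setdefault('mohan', {})['salary'] = 500

# defaultdict(dict) creates a missing inner dict automatically.
auto2D = defaultdict(dict)
auto2D['ramesh']['salary'] = 300

print(dict2D)
print(dict(auto2D))
```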


Operators:

Just like in other lang, we have various operators to operate on variables. Mostly, operators are used for number data types (int, float, etc), but some of them can be used on other data types too. How an operator behaves depends on the data type of its operands.

1. arithmetic: +, -, *, /, etc. ex: a+b. + and * are used in strings to concatenate or repeat strings.
2. comparison: ==, !=, >, >=, etc ex: (a<b)
3. assignment: =, +=, -=, etc ex: c=a+b;
4. bit wise : &, |, ^, ~, <<, >>, etc ex: a=16, b=2, a&b
5. logical: not, or, and
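The operator groups listed above can be tried out with a sketch like the one below. It assumes python3, where / always gives a float and // gives the floor division result.

```python
a, b = 16, 2

# arithmetic: true division, floor division, remainder and power included
arith = (a + b, a - b, a * b, a / b, a // b, a % b, a ** b)

# bit wise: 16 is 0b10000 and 2 is 0b00010, so they share no set bits
bits = (a & b, a | b, a ^ b, a << 1, a >> 2)

# + and * on strings concatenate and repeat
text = "ab" + "cd" * 2

# comparison and logical operators give True/False
logic = (a > b) and not (a == b)

# assignment operators update a variable in place
c = a
c += b

print(arith, bits, text, logic, c)
```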

Control statements:

1. if elif else: This is the same as the if stmt in other languages. elif is a substitute for "else if". Both elif and else are optional. An if ... elif ... elif ... sequence is a substitute for the switch or case statements found in other languages.

ex: The if .. elif .. else stmt below needs the same indentation (tab or spaces) for all lines within each block of code. NOTE: if, elif and else are at the start of the line with no indentation.

if x < 0:

  x = 0

  print('Negative changed to zero')

elif x == 0:

  print('Zero')

else:

  print('More')

ex: if ( var == 100 ) : print ("val is 100") #for single line suite, it can be on same line


2. for: The for stmt differs from that in C. There is no start, end or iteration index. Python's for statement iterates over the items of any sequence (as a list or a string), in the order that they appear in the sequence.

ex: below iterates over the list and prints each word and length

words = ['cat', 'window', 'defenestrate']

for w in words:

  print(w, len(w))

ex: to iterate over a seq of numbers just as we do in a for loop in C pgm, we can use the range() function. Syntax of range is (start,stop,step), where stop is a required parameter, while start/step are optional. The stop value is not included in the range (i.e range is upto stop-1). range(10) generates 10 values, from 0 to 9 (doesn't include 10). range(5,9) generates 4 values = 5,6,7,8. range(0,10,3) indicates a step value of 3, so it generates 4 values = 0, 3, 6, 9. So, by using the range() function, we can achieve what we do using for loops in C pgm.

for i in range(5):

  print(i) => prints 0,1,2,3,4

ex: To iterate over the indices of a sequence, you can combine range() and len() as follows:

for i in range(len(words)):

  print(i, words[i]) => This prints index 0,1,2 along with the 3 words
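The range() variants above can be seen by converting them to lists, as sketched below. The enumerate() builtin used at the end is an alternative to range(len(...)) that isn't mentioned in the notes above; it gives the index and the item together.

```python
words = ['cat', 'window', 'defenestrate']

r1 = list(range(10))          # 0 to 9, stop value not included
r2 = list(range(5, 9))        # 5, 6, 7, 8
r3 = list(range(0, 10, 3))    # 0, 3, 6, 9
r4 = list(range(5, 0, -1))    # negative step counts down: 5, 4, 3, 2, 1

# Index and word together, using range() and len()
pairs_by_index = []
for i in range(len(words)):
    pairs_by_index.append((i, words[i]))

# Same result using enumerate()
pairs_by_enumerate = []
for i, w in enumerate(words):
    pairs_by_enumerate.append((i, w))

print(r1, r2, r3, r4)
print(pairs_by_enumerate)
```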

3. while: The while statement is used for repeated execution as long as an expression is true.

ex: infinite loop below since expr is set to "True"

while True:

  print("infinite loop")

4. break, continue, else: break, continue and else clauses can be used in "for" and "while" loops. "break" breaks out of the innermost enclosing for or while loop, while "continue" skips the rest of the current iteration and continues with the next iteration of the loop. A loop's else clause runs when the loop finishes w/o a break occurring. Look for more details in the python website link above.
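A sketch of break, continue and the loop else clause working together. The function first_even() and its messages are hypothetical, made up only to show the control flow.

```python
def first_even(numbers):
    """Return a message about the first non-negative even number, if any."""
    for n in numbers:
        if n < 0:
            continue                  # skip negative numbers
        if n % 2 == 0:
            result = "found %d" % n
            break                     # leave the loop; else is skipped
    else:
        # Runs only when the loop ended without hitting break.
        result = "no even number"
    return result

print(first_even([3, -4, 7, 8, 10]))
print(first_even([1, 3, 5]))
```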


Functions: Function syntax is similar to those of other lang. All functions require parenthesis and optional args inside it.

1. Builtin: Python provides many builtin functions as print(), int(), abs(), open(), sorted(), etc.

A. print( ) : print function is one of the most used functions to o/p something on screen. It wasn't a function in python2 (it was just a statement), so no ( ) were required with print, but it's a function in python3, so it needs ( ). i.e: print("Hello, Python!"). With a single arg, the ( ) form also works in python2, where the parenthesis just act as grouping. So, it's preferred to use print as a func with parenthesis ( .... )

Python2: print "The answer is", 2*2, "My name=", name, "var=", 2*var1

Python3: print("The answer is", 2*2, "My name=", name, "var=",2*var1) => In python2, this same line prints a tuple, as ('The answer is', 4, ...), unless the file starts with "from __future__ import print_function". Anything within quotes is printed as a literal string; anything outside quotes is evaluated first, and the resulting value is printed. Args are separated by a space in the o/p. A newline is appended by default. To suppress the newline in python3, pass end="" as the last arg, i.e print("abc", end=""). In python2, a comma at the end of the print stmt did the same.

We can use strings, list, var, etc to be printed using print. With List and tuples, full list will be printed, w/o requiring us to iterate over each element of the list.

ex: formatted string and other string type can be used inside print

name = "Fred"; print(f"He said his name is {name}." ) => This substitutes {name} with the value of variable "name", since we have f in front of the string. These formatted strings (f-strings) need python 3.6 or later.

% operator for strings: String objects have one unique built-in operation: the % operator (modulo). This is also known as the string formatting or interpolation operator. Given format % values (where format is a string), % conversion specifications (as d, s, f, etc) in format are replaced with zero or more elements of values. The effect is similar to using the sprintf() in the C language.

ex: name="afg"; age=2;

my_format = "his name is %s, his age is %2d"; my_values =  (name, age) => NOTE: my_values need parenthesis since they are tuples (not curly braces or square brackets)

print(my_format % my_values) => Here %s and %2d in the format string are replaced with the values in var "name" and "age". NOTE: the whole "format % values" expr evaluates to a single string. Whatever is the o/p of this formatting operator is passed to print func as an argument.

o/p is => his name is afg, his age is  2

ex: print( ' %(language)s has %(number)03d quote types.' % {'language': "Python", "number": 2}) => outputs " Python has 002 quote types." (the leading space is part of the format string). Here "s" after %(language) is a conversion spec saying convert the 'language' object into a string using function str(). Similarly, the 03d spec asks it to convert "number" into a signed integer padded with zeros to 3 digits. Here values are not tuples, but a dictionary (hash), so curly braces are used. NOTE: there is no comma after the closing quote of the string, as it's "format % value" that is being used inside the print function, and not the typical "string followed by variable" syntax

ex:  We can use % operator on string inside print func, along with other regular args, as strings, var, etc to be printed. The whole format string is just another string arg to print func.

var2=23; var3 = "my stuff"
print('The value of pi is approximately %5.3f.' % math.pi, var2, "good", var3) => Needs "import math" first. Here math.pi is formatted with %5.3f, which means min width=5 chars (including the decimal point) and precision=3 digits after the decimal point.

o/p is => The value of pi is approximately 3.142. 23 good my stuff

format method: The % operator above is the older way of formatting print o/p. The format method of strings is a newer way, and the f-strings shown earlier are newer still.

ex: print('{0} and {1}'.format('Geeks', 'Portal'))=> {0} is replaced by the string in position 0, which is 'Geeks', and {1} is replaced by the string in position 1, which is 'Portal', so o/p is => Geeks and Portal. NOTE: there is no comma here after the closing quote but a dot, since format is a method called on the string itself. Its result is the single arg passed to print.
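The three formatting styles discussed above give the same result for the same spec, as the sketch below shows. The variable names (percent_style, etc) are made up for this comparison, and the f-string line assumes python 3.6 or later.

```python
import math

name = "afg"
age = 2

# Same text produced in three ways; width 2 pads the age with a space
percent_style = "his name is %s, his age is %2d" % (name, age)
format_style = "his name is {0}, his age is {1:2d}".format(name, age)
fstring_style = f"his name is {name}, his age is {age:2d}"

# Width and precision work the same way in all three styles
pi_percent = "%5.3f" % math.pi
pi_format = "{:5.3f}".format(math.pi)

# Named fields, similar to %(language)s with a dictionary
named = "{language} has {number:03d} quote types".format(language="Python", number=2)

print(percent_style)
print(pi_percent, pi_format)
print(named)
```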

B. input( ): input function is the other widely used function, to get input from the user. There are differences b/w how this func behaves in python2 vs python3.

Python 2:

python2: str = raw_input("Enter your input: "); => raw_input() reads 1 line from std i/p and returns it as string w/o the newline
python2: str = input("Enter your cmd: "); => same as above, except that the typed text is evaluated as a python expr, and the result of that expr is returned (so the result is not necessarily a string).
  Enter your cmd: [x*5 for x in range(2,10,2)]
  Received input is :  [10, 20, 30, 40] => str stores this list

Python 3:

python3: raw_input() function from python2 has been renamed to input(). It always returns a string, so no python expr is evaluated.

python3: input() function of python2 has been removed, and instead stmt eval(input()) must be used to get the same behaviour as input() func of python2. eval runs whatever the user types, so it should be used only with trusted i/p. We don't use this stmt much; instead input() func above is used.

With raw_input() of python2 and input() of python3, the result is stored as a string, so in order to do numeric computation, we have to do data conversion using func below. Also, no expr are allowed, i.e expr will be treated as strings, and won't be computed.

ex: here 2 numbers are provided as i/p, but have to be converted to int in order to add them

num1=input("1st number")

num2=input("2nd number")

sum=int(num1)+int(num2)

print("Sum is", sum); #Here if i/p is 1 and 2, then o/p is 3. If we just did "sum=num1+num2", then it would concatenate the 2 strings and print "12"

C1. type(): type is an inbuilt func to find data type of any var or object (in case of OOP discussed later):

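ex: age=50; print(type(age)) => prints <class 'int'> in python3 (<type 'int'> in python2).

ex: type_var = type(tinydict) => assigns the class "dict" itself (not a string) to type_var, as tinydict defined above is of type "dict". type_var.__name__ gives the string "dict".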


C2. data conversion: data can be converted from one type to another by casting. Some of the casting functions are:
ex: int(x), str(x), list(y), hex(x), dict(d)

ex: python3: var_int = int(input("Enter any number: ")); var1=var_int+1; => here, var_int stores an integer (i.e any number entered is a string, but then int() func converts it to int, so that we can do arithmetic computation on it). If the text entered is not a valid integer, int() raises ValueError.
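A sketch of the conversions above, written with plain strings in place of input() so that it runs w/o a user typing anything. The helper names add_inputs() and to_int_or_none() are hypothetical, chosen for this illustration.

```python
def add_inputs(text1, text2):
    """Convert two strings (as returned by input()) to int and add them."""
    return int(text1) + int(text2)

def to_int_or_none(text):
    """Return the int value of text, or None if it isn't a valid integer."""
    try:
        return int(text)
    except ValueError:
        return None

added = add_inputs("1", "2")      # numeric addition
joined = "1" + "2"                # string concatenation

print(added, joined)
print(to_int_or_none("42"), to_int_or_none("abc"))
print(str(4.7), list("abc"), hex(255))
```

Checking the conversion this way avoids a traceback when the user types something that isn't a number.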

C3. isinstance(): The isinstance() function returns True if the specified object is of the specified type, otherwise False.

ex: if (isinstance("Hello", str) ): print("true") => This checks if "Hello" is of type string. It returns True since anything within ".." is a string

ex: my_num=4.7; var1=isinstance(my_num, (str,list,dict,tuple)); print(var1) => this prints "False", since my_num is of type "float", while the allowed types that this func is checking for are str, list, dict and tuple.
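A short sketch comparing type() and isinstance(), using the same sample values as above. The last line shows one difference b/w the two that the notes don't cover: isinstance() also accepts subclasses, and bool is a subclass of int.

```python
age = 50
my_num = 4.7
tinydict = {'name': 'john', 'code': 6734}

age_type = type(age)                      # the class int itself
dict_type_name = type(tinydict).__name__  # the string "dict"

is_number = isinstance(my_num, (int, float))                 # True
is_container = isinstance(my_num, (str, list, dict, tuple))  # False

# type() checks the exact class, isinstance() accepts subclasses too
exact_vs_subclass = (type(True) is int, isinstance(True, int))

print(age_type, dict_type_name)
print(is_number, is_container, exact_vs_subclass)
```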

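D. Maths: abs(), max() and pow() are builtins. Most other maths functions and constants are in the "math" module, and random numbers are in the "random" module, so those need an import first.
ex: abs(x); max(x1,x2,...); pow(x,y); => builtins, no import needed
ex: import random; random.random() => returns a float in range [0.0, 1.0)
ex: import math; math.log(x); math.cos(x); math.radians(x);
ex: constants: math.pi, math.e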

E. File functions: Python has file functions for reading/writing files just as in other lang.

file read/write ex shown below:
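fo = open("foo.txt", "w+") => opens file for both rd/wrt, removing any existing content, with ptr at start of file. w=wrt_only, r=rd_only, (a=append_mode, ptr at end of file)
fo.write("Python is a great language.\nYeah its great!!\n")
fo.seek(0) => after the write, ptr is at end of file, so move it back to the start before reading
data = fo.read(10) => reads 10 chars from file. If no arg is provided, then it reads the whole file until EOF
print("Read String is : ", data) => prints "Python is "
fo.close() => parenthesis are needed to actually call close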
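The same file operations can be written with the "with" stmt, which closes the file automatically at the end of the block. This is a sketch, not part of the notes above; it writes to a temporary directory (via the standard tempfile module) so that it doesn't touch any real file.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "foo.txt")

# "w" creates the file (or empties an existing one) for writing
with open(path, "w") as fo:
    fo.write("Python is a great language.\nYeah its great!!\n")

# "a" appends at the end of the existing content
with open(path, "a") as fo:
    fo.write("Appended line\n")

# "r" reads; read(10) returns the first 10 characters
with open(path, "r") as fo:
    first_ten = fo.read(10)

# Iterating over the file object gives one line at a time
with open(path) as fo:
    lines = [line.rstrip("\n") for line in fo]

print(first_ten)
print(lines)
print(fo.closed)      # file is closed once the with block ends
```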

exception: When a script encounters a situation that it cannot cope with, it raises an exception. An exception is a Python object that represents an error. An exception must be handled, else the pgm terminates with a traceback.
ex:
try:
   fh = open("testfile", "r")
   fh.write("This is my test file for exception handling!!") => trying to wrt to a file opened rd only raises an exception (IOError in python2; in python3 it's io.UnsupportedOperation, which "except IOError" also catches). open() itself raises IOError if the file doesn't exist
except IOError: => std exception raised when an I/O operation fails
   print("Error: can't find file or read data") => This gets printed when IOError exception happens in try block
except ... => some other exception name can be put here for a diff exception raised. There are about 30-40 different exception errors that we can specify

except: => "except" stmt w/o any exception name catches any exception that was not caught by the except clauses above it
else:
   print("Written content in the file successfully") => If no exception, then run this block
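A runnable sketch of the try/except/else flow above, written for python3. The function try_write() and its return strings are made up for this illustration, and a "finally" clause (not covered above) is added to close the file in every case. In python3, IOError is just another name for OSError, which is what the sketch catches.

```python
import os
import tempfile

def try_write(path):
    """Open path read-only, attempt a write, and report what happened."""
    try:
        fh = open(path, "r")
    except OSError:
        return "cannot open"          # ex: file doesn't exist
    try:
        fh.write("This is my test file for exception handling!!")
    except OSError:
        return "cannot write"         # file was opened rd only
    else:
        return "written"              # runs only if no exception occurred
    finally:
        fh.close()                    # runs in every case

# Set up one existing file and one missing file in a temp directory
tmp_dir = tempfile.mkdtemp()
existing_path = os.path.join(tmp_dir, "testfile")
with open(existing_path, "w") as f:
    f.write("some text\n")
missing_path = os.path.join(tmp_dir, "no_such_file")

print(try_write(existing_path))
print(try_write(missing_path))
```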

Assert: An assertion is a sanity-check. An expression inside an assert stmt is tested, and if the result comes up false, an AssertionError exception is raised. Assertions were added in Python 1.5. They are usually placed inside a function definition to check for valid inputs or to check for valid outputs. AssertionError exceptions can be caught and handled like any other exception using the try-except statement, but if not handled, they will terminate the program and produce a traceback. Assertions are very useful in exposing bugs. NOTE: assert stmts are skipped when python is run with the -O option, so they shouldn't be used for checks that must always run (as validating user i/p).

assert (Temperature >= 0),"Colder than absolute zero!" => This checks for the Temperature variable to be >= 0. If it's -ve, then the text following the assert expr ("Colder ...") is shown as the message of the AssertionError, and the pgm terminates.

assert(isinstance(b, float) or isinstance(b, int)) => Here on failure of assertion (i.e b is neither float nor int), no stmt is printed, but pgm terminates with traceback. If there are many assertions in pgm, it may be tedious to figure out which assertion failed, so it's good practice to have "text" following assert keyword.
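A sketch of assertions placed inside a function to check its inputs, with the AssertionError caught by the caller. The function kelvin_to_celsius() is a hypothetical example built around the Temperature check above. It assumes python is run w/o the -O option.

```python
def kelvin_to_celsius(temperature):
    """Convert a temperature in Kelvin to Celsius, checking the input first."""
    assert isinstance(temperature, (int, float)), "temperature must be a number"
    assert temperature >= 0, "Colder than absolute zero!"
    return temperature - 273.15

try:
    kelvin_to_celsius(-5)
    message = None
except AssertionError as err:
    message = str(err)        # the text given after the assert expr

print(message)
print(kelvin_to_celsius(273.15))
```

Since each assert has its own text, the message tells us which of the two checks failed.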

 

2. User defined: Besides the built in functions provided by python, we may define our own function also. There are 2 kinds of function defined in python:

A. Normal function: These are regular function definition (as is common in other pgm lang)

defining a func:
def functionname( parameters ): => i/p param or args
   "function_docstring" => optional: explains what this func does
   function_suite
   return [expression] => If no expr is provided (or there is no return stmt at all), the func returns None

ex:
def printme( str ):
   "This prints a passed string into this function"
   print(str)
   return

printme("I'm first call to user defined function!") => calls printme func

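NOTE: All parameters (arguments) in Python are passed by object reference, i.e the function receives a reference to the same object that the caller has. So, if the function modifies a mutable object in place (ex: appends to a list that was passed in), the change is also seen in the calling function. However, if the function assigns a new object to the parameter name, only the local name changes, and the caller's variable is not affected.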
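The difference b/w modifying an argument in place and re-assigning the parameter name can be sketched as below. The function names append_item() and rebind() are made up for this illustration.

```python
def append_item(items):
    items.append(99)          # modifies the caller's list in place

def rebind(items):
    items = [0, 0, 0]         # only the local name now points elsewhere
    return items

my_list = [1, 2]
append_item(my_list)          # my_list is now [1, 2, 99]
new_list = rebind(my_list)    # my_list stays [1, 2, 99]

print(my_list)
print(new_list)
```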

If var are defined within a func, then they are local to the func, and are diff from var of the same name declared outside the func.
total = 0 # This is global variable.
def sum( arg1, arg2=10 ): # default val of arg2 is 10
   total = arg1 + arg2 # Here total is local variable.
   return total # for the call below, 30 is stored in local total and returned.

# Now you can call sum function
total1 = sum( arg1=10, arg2=20 ) # here total1 is 30. We use arg1 to specify that 10 is for arg1, so on. This allows to place args out of order
print(total) => here total is printed as 0, as it's the global var

Passing func as an arg: We can also pass a func as an arg to another func

ex:

def shout(text): 
    return text.upper() 
def greet(func1): => Arg of greet function is func1
    greeting = func1("hi") => func1 is called with arg specified
    print(greeting)
  
greet(shout) => This calls greet func with arg "shout", which is itself a func. shout gets called with arg "hi", so o/p returned is HI.

B. anonymous function: These are functions w/o a name, and are a shorter way of writing simple one line functions. "lambda" keyword is used to create anonymous functions. Such a function can have any number of arguments but only one expression, which is evaluated and returned. It's also called a lambda func, and it can also take another function as an argument.

ex: square = lambda x1:x1 * x1 => Here, we define square as a lambda func with one arg "x1". It computes the square. Here the lambda func is assigned to a var "square", which points to the lambda func

print(square(5)) => This calls the var pointing to func "square" with arg =5. It returns 25.

ex: cube = lambda x1:x1**3 => here x1 is the arg to the lambda func. It's a plain value to be cubed (it may be the result returned by another func).

print(cube(square(2))) => here cube func is called with arg "square(2)". First, square func is called with arg 2, which returns 4. This 4 is then cubed to get the final answer, 64.
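Lambda functions are most often passed directly as args to other functions. The sketch below uses the builtins sorted(), map() and filter() for this; these uses aren't covered in the notes above, and the sample data and names (pairs, apply_twice) are made up for this illustration.

```python
pairs = [('john', 35), ('ramesh', 50), ('mohan', 28)]

# key= takes a func; here a lambda picks the age for sorting
by_age = sorted(pairs, key=lambda p: p[1])

# map() applies the lambda to each element, filter() keeps matching ones
squares = list(map(lambda x: x * x, [1, 2, 3]))
evens = list(filter(lambda x: x % 2 == 0, range(6)))

# A lambda that takes another function as its first argument
apply_twice = lambda func, x: func(func(x))
result = apply_twice(lambda x: x + 3, 10)

print(by_age)
print(squares, evens, result)
```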

More Topics: More advanced topic are in next section.