Physics:

In this section, we'll deal with physics.

Feynman Lectures: There is a very good lecture series by Nobel Prize winner Richard Feynman that describes fundamental physics in layman's terms.

http://www.feynmanlectures.caltech.edu/

Libretexts.org: another very good website for learning science-related topics, including physics. Link => https://phys.libretexts.org

Go to Bookshelves and then choose the appropriate physics level (i.e. college level, university level, etc.). The sections are listed at this link => https://phys.libretexts.org/Bookshelves?readerView

Physics sections include:

  • solid state physics
  • mechanics
  • thermodynamics
  • electricity and magnetism
  • ...

---------------
Cortex M0-plus (CM0+):
---------------
Here, we use only the minimal required C code files from ARM.
Steps:
1. make: build a 32-bit bin file from the C code. Only these files are needed:
   *.c: test1.c, init.c, boot.c, IRQService.c
   *.s: export.s
   *.h: address_def.h, defines.h
   *.txt: scatter.txt
2. In irun, load this bin file into ROM memory and start the sim.

make:
---
make -f /db/.../Makefile all

Makefile:
-------
CCFLAGS = --cpu=Cortex-M0 -c --asm --interleave --apcs=interwork -Otime -D RAM_ABSOLUTE_BASE_ADDR -D main=__main -g
LDFLAGS = --cpu=Cortex-M0 --map --no_debug --datacompressor=off --map --symbols --info=inline
IKDEPS = /sim/.../Makefile

all: clear test1_32.dat clean

############ rm files at start
clear:
      rm -rf *.o *.sym *.list *.bin *.elf *.sec *.dis *.map *.ihex *.obj *.log *.inc *.dat

###########start compilation .dat<=.ihex<=.elf<=.o<=.c
test1_32.dat : test1.ihex
        @echo " *** building $@ ***" => building test1_32.dat
        ihex2dat32 -input $< -output $@ => o/p is a 32 bit mem image file

test1.ihex : test1.elf
        fromelf --i32 $< --output $@

test1.elf : boot.o IRQService.o init.o export.o test1.o
        armlink -o $@  boot.o IRQService.o init.o export.o test1.o $(LDFLAGS) --scatter scatter.txt

test1.o: test1.c $(IKDEPS)
       armcc $(CCFLAGS) -c -o $@ $<

#get .o files for boot.c, IRQService.c, init.c, export.s
boot.o: boot.c $(IKDEPS) => similarly for IRQService.c, init.c
      armcc $(CCFLAGS) -c -o $@ $<
IRQService.o: ...
init.o: ....
export.o: export.s $(IKDEPS) => export.c isn't there, use export.s
      armasm --cpu=Cortex-M0 --keep -o $@ $<


##### remove files at end
clean:
        rm -rf *.obj

------------------------
output of running this:
#clear (rm files at start)
rm -rf *.o *.sym *.list *.bin *.elf *.sec *.dis *.map *.ihex *.obj *.log *.inc *.dat

#compile files
armcc --cpu=Cortex-M0 -c --asm --interleave --apcs=interwork -Otime  -D RAM_ABSOLUTE_BASE_ADDR -D main=__main -g -c -o boot.o /sim/.../boot.c
armcc ... IRQService.c
armcc ... init.c
armasm --cpu=Cortex-M0 --keep -o export.o /sim/.../export.s
armcc ... test1.c

armlink -o test1.elf boot.o IRQService.o export.o init.o test1.o --cpu=Cortex-M0 --map --no_debug --datacompressor=off --map --symbols --info=inline --scatter /sim/.../scatter.txt
=> Here armlink uses --scatter, so ro-base, rw-base and zi-base come from scatter.txt. For sporsho, we passed --ro-base and --rw-base on the armlink cmdline instead of using a scatter file.

#show inline results => symbol table, memory map,
Image Symbol Table ....
Memory Map of the image ....

#generate .ihex and .dat
fromelf --i32 test1.elf --output test1.ihex
ihex2dat32 -input test1.ihex -output test1_32.dat => final *.dat file generated

#clean
rm -rf *.obj

-------------------------------------------
#contents of various files

###### boot.c ######
typedef void (*intr_handler)(void);
extern void IRQService (void); // Default interrupt handler
void BaseIRQService0 (void); //similarly for IRQ0 to IRQ31
intr_handler IRQService0   = IRQService;

//define func for all intr handlers, either as C/asm code or via calling other func.
void Reset_Handler (void) {
        init (); //this calls init func, defined in init.c
}

void BaseIRQService0 (void) { //irq0.
    IRQService0 (); //this ultimately calls IRQService
}

typedef const int * vect_t; //vect_t is a ptr to type "const int"
const int * __Vectors[] __attribute__ ((section("vectors"))) = { //this matches contents of .dat binary file below
        (vect_t) (&USR_STACK),     //addr=0x0000, stores MSP value, here it stores 0x20046F00
        (vect_t) Reset_Handler,    //addr=0x0004
        (vect_t) NMI_Handler,       //addr=0x0008
        (vect_t) HardFault_Handler,//addr=0x000C
        (vect_t) 0xFFFFFFFC,       //addr=0x0010    
        (vect_t) 0xFFFFFFFC,       //addr=0x0014
        (vect_t) 0xFFFFFFFC,       //addr=0x0018
        (vect_t) 0xFFFFFFFC,       //addr=0x001C
        (vect_t) 0xFFFFFFFC,       //addr=0x0020
        (vect_t) 0xFFFFFFFC,       //addr=0x0024
        (vect_t) 0xFFFFFFFC,       //addr=0x0028
        (vect_t) SVC_Handler,       //addr=0x002C, stores addr of SVC intr handler, which in this case is 0x00000115
        (vect_t) 0xFFFFFFFC,       //addr=0x0030
        (vect_t) 0xFFFFFFFC,       //addr=0x0034
        (vect_t) PendSV_Handler,   //addr=0x0038, stores addr of Pend intr handler, which in this case is 0x00000127
        (vect_t) SysTick_Handler,  //addr=0x003C, stores addr of systick intr handler, which in this case is 0x00000139
        (vect_t) BaseIRQService0,  // similarly for IRQ0 to IRQ31, addr=0x0040 to 0x00BC (the vector table ends just before addr 0x00C0). These entries store handler addrs from 0x00000141 to 0x000001FB; each BaseIRQServiceN handler is 6 bytes.
       //(vect_t) IRQService,      //we could directly point to the IRQService routine. That's what we used for the compiled code below
};

asm code for boot
----------
AREA ||.text||, CODE, READONLY, ALIGN=2 => this is text section

Reset_Handler PROC
000000  b510              PUSH     {r4,lr}
000002  f7fffffe          BL       init => calls init. The target addr is not yet finalized here, so the BL is encoded as f7ff_fffe (a placeholder that the linker patches). Once the init addr is known in the final link of the *.o files, the two halfwords become f000, f827 (as seen in the *.dat file below), which implies S=0, imm10=0x000, imm11=0x027 => offset=0x027<<1=0x04e => target=pc+offset=0x0c6+0x04e=0x114 (the init section starts at 0x114, so it's correct). Here pc = addr of BL + 4 = 0x0c2+4 = 0x0c6. LR is loaded with the addr of the next inst (0x0c6, with bit 0 set for Thumb), so the return from init() comes right here.
000006  bd10              POP      {r4,pc}
ENDP

similarly for other proc handlers
NMI_Handler PROC
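000008  4805              LDR      r0,|L1.32| => loads the word at label |L1.32| into r0 (PC-relative load). 4805 => reg=r0, imm8=0x05 => offset=0x05<<2=0x14 (dec=20). base=Align(pc,4) where pc=inst addr+4=0x08+4=0x0c, so addr=0x0c+0x14=0x20 (dec=32, i.e. |L1.32|). Note that the offset is resolved here itself: the label is within this file, so the assembler knows how much to add to reach it.
00000a  6800              LDR      r0,[r0,#0]  ; NMIService, r0=value stored at NMIService (addr of IRQService)
00000c  4700              BX       r0 => pc=r0 (jumps to IRQService)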
ENDP

HardFault_Handler PROC
|L1.14|
00000e  bf00              NOP
000010  e7fd              B        |L1.14| => while (1) { __nop (); } e7fd => imm11=0x7fd=-3 => offset=-3<<1=-6, so target=pc+offset=(0x010+4)-6=0x00e (prev inst)
ENDP

SVC_Handler PROC
|L1.18|
000012  bf00              NOP
000014  e7fd              B        |L1.18| => while (1) { __nop ();
ENDP

PendSV_Handler PROC
|L1.22|
000016  bf00              NOP
000018  e7fd              B        |L1.22|
ENDP

SysTick_Handler PROC
|L1.26|
00001a  bf00              NOP
00001c  e7fd              B        |L1.26|
ENDP

00001e  0000              DCW      0x0000
|L1.32| => literal pool entry: holds the addr of ||.data||, which is where NMIService is stored
                          DCD      ||.data||
##data section for boot.c starts from here
                          AREA ||.data||, DATA, ALIGN=2 => area named .data
                  NMIService
                          DCD      IRQService
                          AREA ||area_number.5||, DATA, ALIGN=2 => area named .area_number.5
                          EXPORTAS ||area_number.5||, ||.data||
                  IRQService0
                          DCD      IRQService
                         AREA vectors, DATA, ALIGN=2 => area named vectors
                  __Vectors
                          DCD      0x20046f00
                          DCD      Reset_Handler
                          DCD      NMI_Handler
                          DCD      HardFault_Handler
                          DCD      0xfffffffc
                          DCD      0xfffffffc
                          DCD      0xfffffffc
                          DCD      0xfffffffc
                          DCD      0xfffffffc
                          DCD      0xfffffffc
                          DCD      0xfffffffc
                          DCD      SVC_Handler
                          DCD      0xfffffffc
                          DCD      0xfffffffc
                          DCD      PendSV_Handler
                          DCD      SysTick_Handler
                          DCD      IRQService
                ....
                          DCD      IRQService (32 of these)

---------------

######## init.c ###########
void init (void) {
 main(); //this calls "main" func in user code, defined in test1.c
}

asm code for init:
----------
AREA ||.text||, CODE, READONLY, ALIGN=1
init PROC

000000  b510              PUSH     {r4,lr}
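000002  f7fffffe          BL       __main => BL is a 32 bit inst. In the final image init starts at 0x114, so LR is loaded with the addr of the next inst (0x114+0x06=0x11a, with bit 0 set for Thumb), and the return from main() comes right here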
000006  bd10              POP      {r4,pc}

ENDP
----------

######## IRQService.c ###########
void IRQService (void) { //default IRQ handler: flags a failure to the testbench, then returns
 (*(volatile uint32 *)(TB_RSVD_ADDR + 0x0F98))++; //FAIL_LOC++
 (*(volatile uint32 *)(TB_RSVD_ADDR + 0x0FEC)) = 0xD0D0D0D0; //DISPLAY(0xD0D0D0D0)
 (*(volatile uint32 *)(TB_RSVD_ADDR + 0x0F90)) = 0xB000B000; //FAIL;   //writes to some reserved mem section to indicate to verilog tb that it's in IRQ
 (*(volatile uint32 *)(TB_RSVD_ADDR + 0x0FAC)) = 0x50; __nop(); (*(volatile uint32 *)(TB_RSVD_ADDR + 0x0FAC)) = 0x11; //EXIT; wrt to BOOTCODE_ADDR
 return; //just returns back
}

asm code:
-------
000000  4807              LDR      r0,|L1.32|
000002  6981              LDR      r1,[r0,#0x18] => FAIL_LOC++;
000004  1c49              ADDS     r1,r1,#1
000006  6181              STR      r1,[r0,#0x18]
000008  4a07              LDR      r2,|L1.40| => DISPLAY(0xD0D0D0D0);
00000a  4906              LDR      r1,|L1.36|
00000c  62d1              STR      r1,[r2,#0x2c]
00000e  4907              LDR      r1,|L1.44| => FAIL
000010  6101              STR      r1,[r0,#0x10]
000012  2150              MOVS     r1,#0x50 => EXIT
000014  62c1              STR      r1,[r0,#0x2c]
000016  bf00              NOP
000018  2111              MOVS     r1,#0x11
00001a  62c1              STR      r1,[r0,#0x2c]
00001c  4770              BX       lr => return
00001e  0000              DCW      0x0000
                  |L1.32|
                          DCD      0x20046f80
                  |L1.36|
                          DCD      0xd0d0d0d0
                  |L1.40|
                          DCD      0x20046fc0
                  |L1.44|
                          DCD      0xb000b000
------

######## test1.c ###########
int main (void) {
  int i=0;
    while(i<1) i=i+1;
    return(0);
}

asm code:
---------
000000  2000              MOVS     r0,#0 => r0=0
|L1.2|
000002  1c40              ADDS     r0,r0,#1 => while(i<1) i=i+1;
000004  2801              CMP      r0,#1
000006  dbfc              BLT      |L1.2|
000008  2000              MOVS     r0,#0 => return(0)
00000a  4770              BX       lr
ENDP
-----------


######## export.s ###########
  IMPORT ||Image$$ROM_EXEC2$$RO$$Limit||  => ROM_EXEC2 region (in scatter.txt below)
  IMPORT ||Image$$RAM_EXEC1$$RW$$Base||   => RAM_EXEC1 region
  IMPORT ||Image$$RAM_EXEC1$$RW$$Length||
  IMPORT ||Image$$RAM_EXEC2$$ZI$$Base||   => RAM_EXEC2 region
  IMPORT ||Image$$RAM_EXEC2$$ZI$$Length||
        END

######## scatter.txt #########
A scatter file has one or more load regions for the different obj (*.o) files. Each load region has RO and RW/ZI execution regions. The addr and placement of these regions are specified in the scatter file, as shown below.

ROM_LOAD 0x0 {
    ROM_EXEC1 0x0 { => start from addr=0x0000
        boot.o (vectors, +First) => boot.c "vectors" section is put here (RO)
        * (InRoot$$Sections)
    }
    ROM_EXEC2 0xc0 { => addr=0xC0 is where the vector section has ended. We put all other code starting here (RO)
        *.o
    }
    OTP_EXEC3 +0 {
        *.o (.sec_text)
    }
    RAM_EXEC1 0x00043000 { //let's say this has 148 bytes (=0x94 bytes)
      *.o (.data,.RW)
    }
    RAM_EXEC2 +0 { //then starting addr for this is 0x00043094
      *.o (.bss)
    }
}


#contents of .dat (32 bit):
Section 1 => ROM_EXEC1 (192 bytes)
---------
 @00  20046F00 000000C1 000000C9 000000CF => addr 0 (MSP) loaded with 2004_6f00. Reset=0xC0/NMI=0xC8/HardFault=0xCE (bit 0 of each vector entry is set to 1 to indicate Thumb code, hence C1/C9/CF)
 @10  FFFFFFFC FFFFFFFC FFFFFFFC FFFFFFFC
 @20  FFFFFFFC FFFFFFFC FFFFFFFC 000000D3 => 0xD2=SVC handler
 @30  FFFFFFFC FFFFFFFC 000000D7 000000DB => 0xD6=PendSV, 0xDA=SysTick
 @40  000000E5 000000E5 000000E5 000000E5  => IRQ0-IRQ31 all goto same IRQService
 @50  000000E5 000000E5 000000E5 000000E5
 @60  000000E5 000000E5 000000E5 000000E5
 @70  000000E5 000000E5 000000E5 000000E5
 @80  000000E5 000000E5 000000E5 000000E5
 @90  000000E5 000000E5 000000E5 000000E5
 @a0  000000E5 000000E5 000000E5 000000E5
 @b0  000000E5 000000E5 000000E5 000000E5
--- All vector table ends before here ----

Section 2 => ROM_EXEC2 (104 bytes)
---------
 @c0  => boot.c starts here. reset handler starts, then NMI, HardFault, SVC, PendSV, SysTick  here
----
F000B510 => reset handler (B510 => PUSH {r4,lr}; F000 => 1st halfword of BL)
BD10F827 => F827 => 2nd halfword of BL, i.e. F000_F827 => BL init (imm11=0x027 => target=pc+(0x27<<1)=0xc6+0x4e=0x114); BD10 => POP {r4,pc}
 @c8 => NMI handler
68004805 BF004700 (BF00 = NOP from HardFault handler)
 @d0 BF00E7FD BF00E7FD BF00E7FD 0000E7FD => HardFault, SVC, PendSV, SysTick (last 0000 is for next inst DCW 0x0000)
 @e0  00043000 => DCD ||.data|| => 0x0004_3000 is addr of RAM region

 @e4 => IRQservice.c starts here
---
69814807 61811C49 49064A07
 @f0  490762D1 21506101 BF0062C1 62C12111
@100  
00004770 => 4770=BX lr, 0000=DCW 0x0000
20046F80 D0D0D0D0 20046FC0 => all DCD
@110  
B000B000 => all DCD

@114 => init.c starts here
---
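F000B510 => B510 => PUSH {r4,lr}; F000 => 1st halfword of BL
BD10F801 => F801 => 2nd halfword of BL, i.e. F000_F801 => BL __main (imm11=0x001 => target=pc+(0x001<<1)=0x11a+0x2=0x11c); BD10 => POP {r4,pc}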

@11c => test1.c (main) starts here
---
1C402000 =>  main starts from here (__main)
DBFC2801
47702000 => end of test1.c

@128
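000000E5 000000E5 => extra 8 bytes, not sure for what. Probably the initial values of boot.o's RW data (NMIService and IRQService0, both = addr of IRQService = 0xE5), which the linker stores in the load region after the RO code.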

--------------
seq of exec of code above:
1. HCLK starts running. 1 cycle later, HRESET_N gets released.
2. HADDR goes from 0x00->0x04->0xC0,0xC4(reset_handler)->0x114,0x118(init)->0x11c,0x120,0x124(main)->

----------------------------------------
Section 3: RAM_EXEC1
---------
remove below lines ---

4770BF30 => BF30 => WFI, 4770 => BX lr (Reset_Handler itself starts at F000B510 further below)
4770B662
4770B672 4770BF00
 @d0  8F5FF3BF F3BF4770 47708F4F 8F6FF3BF
 @e0  00004770
F000B510 => Reset_Handler starts here (B510 => PUSH {r4,lr})
BD10F827 => F827_F000 => BL init, (BD10 => POP {r4,pc})
68004805
 @f0  BF004700 BF00E7FD BF00E7FD BF00E7FD
 @100  0000E7FD 00043000 69814807 61811C49
 @110  49064A07 490762D1 21506101 BF0062C1
 @120  62C12111 00004770 20046F80 D0D0D0D0
 @130  20046FC0 B000B000 F000B510 BD10F81D
 @140  F7FFB510 BD10FFC5 F7FFB510 BD10FFC4
 @150  F7FFB510 BD10FFC3 F7FFB510 BD10FFB7
 @160  F7FFB510 BD10FFAF F7FFB510 BD10FFAD
 @170  F7FFB510 BD10FFA5
 @12C
1C402000 => main starts from here __main
DBFC2801
47702000 => ENDP
 @138
000000E5 000000E5

-------------------------

------------------------------
compiling c files:
------------------------------
create file hellow.c. Put some code in it: int main () {while (1); }. Then compile hellow.c file using armcc, armasm, armlink, fromelf:
A. compiler: creates obj files (*.o) from .c files
   armcc: obj file is created using armcc(ARM C compiler): path is /apps/ame/bin/armcc: run "armcc -help" for options.
    ex: armcc -4.0-821 --cpu=Cortex-M0 -I ./CMSIS/Core/CM0 -c --data_reorder --diag_suppress=2874 --asm --interleave -D MSC_CLOCK -D ITERATIONS=5 -O2 -Otime -Ono_autoinline -o hellow.o hellow.c
       -I <dir> => Include <directory> on the #include search path
       -c => Compile only, do not link
       --cpu => selects cpu for which to generate code,
       --asm => generates assembly code too (hellow.s).
       --interleave => interleaves C source with assembly code in hellow.txt file.
       -O0 => Minimum optimization, -O2 => high optimization
       -Otime => Optimize for maximum performance
       -D <symbol>  =>   Define <symbol> on entry to the compiler
B. linker: combines the contents of one or more object files with selected parts of one or more object libraries to produce executable images, partially linked object files, or shared object files.
   armlink: create object files for any other C files if needed. Then create ELF (Executable and linkable format) file using armlink:
    ex: armlink -4.0-821 --map --ro-base=0x0 --rw-base=0x20000020 --first='boot_debugdriver.o(vectors)' --datacompressor=off --info=inline -o hellow.elf hellow.o boot_debugdriver.o system_cm0ikdebugdriver.o retarget_cm0ikdebugdriver.o
      -o => creates output as hellow.elf
      --datacompressor off => Do not compress RW data sections.
      --ro-base n   =>  Set exec addr of region containing RO sections.
      --rw-base n   =>  Set exec addr of region containing RW/ZI sections.
      --map         =>  Display memory map of image.
      --info topic  =>  List misc. information about image, depending on topic.
C. fromelf: the executable image, which is the image loaded into memory, is created using fromelf:
    ex: fromelf -4.0-821 --bin -o hellow.bin hellow.elf => creates plain binary hellow.bin from hellow.elf

-----------------
RunIk for hellow test case:
-----------------
EX: when we make hellow.c, this is the seq:
cmd: RunIK -build -make hellow

1. Makefile rules specified in Makefile. For TGT=hellow, hellow.bin is needed which in turn is generated from hellow.elf, which in turn is from hellow.o:
 - hellow -> hellow.bin -> hellow.elf -> hellow.o:
   - DEP: hellow.c CMSIS/Core/CM0/core_cm0.h CMSIS/Core/CM0/core_cm0.c cm0ikmcu.h IKtests.h IKtests.c IKConfig.h debug_i2c.h sporsho_tb.h Makefile debugdriver (debugdriver has a rule specified below, so if debugdriver is out of date, armcc(ACTION) doesn't run yet)
   - ACTION: armcc -4.0-821 --cpu=Cortex-M0 -I ./CMSIS/Core/CM0 -c --data_reorder --diag_suppress=2874 --asm  -D MSC_CLOCK -D ITERATIONS=5 -O2 -Otime -Ono_autoinline -Ono_inline -o hellow.o hellow.c

2. debugdriver make rule specified.
 - debugdriver -> debugdriver.bin -> debugdriver.elf -> debugdriver.o:  (We compile debugdriver, since this code always runs on the Cortex-M0 core within the debug driver module in the Integration Kit testbench)
   - DEP: debugdriver.c CMSIS/Core/CM0/core_cm0.h CMSIS/Core/CM0/core_cm0.c cm0ikdebugdriver.h Makefile debugdriver.h IKConfig.h
   - ACTION: armcc -4.0-821 --cpu=Cortex-M0 -I ./CMSIS/Core/CM0 -c --data_reorder --diag_suppress=2874 -o debugdriver.o debugdriver.c => action

3. ELF file generated using linker.
 - debugdriver.elf:
   - DEP: debugdriver.o boot_debugdriver.o system_cm0ikdebugdriver.o retarget_cm0ikdebugdriver.o
   - ACTION: armlink -4.0-821  --map --ro-base=0x0 --rw-base=0x20000020 --first='boot_debugdriver.o(vectors)' --datacompressor=off --info=inline -o debugdriver.elf debugdriver.o boot_debugdriver.o system_cm0ikdebugdriver.o retarget_cm0ikdebugdriver.o (boot_debugdriver.o system_cm0ikdebugdriver.o retarget_cm0ikdebugdriver.o have their DEP and ACTION defined for each, which uses .c files for each to generate .o files)

4. shows Memory Map of the image after running armlink
  Image Entry point : 0x000000c1
  Load Region LR_1 (Base: 0x00000000, Size: 0x00000ea8, Max: 0xffffffff, ABSOLUTE)
    Execution Region ER_RO (Base: 0x00000000, Size: 0x00000e2c, Max: 0xffffffff, ABSOLUTE)

   Base Addr    Size         Type   Attr      Idx    E Section Name        Object

    0x00000000   0x000000c0   Data   RO            6    vectors             boot_debugdriver.o
    0x000000c0   0x00000008   Code   RO           18  * !!!main             __main.o(c_p.l)
    0x000000c8   0x0000003c   Code   RO          173    !!!scatter          __scatter.o(c_p.l)
    0x00000104   0x0000001a   Code   RO          175    !!handler_copy      __scatter_copy.o(c_p.l)
    0x0000011e   0x00000002   PAD
    0x00000120   0x0000001c   Code   RO          177    !!handler_zi        __scatter_zi.o(c_p.l)
    0x0000013c   0x00000006   Code   RO           53    .ARM.Collect$$libinit$$00000000  libinit.o(c_p.l)
    0x00000142   0x00000000   Code   RO           62    .ARM.Collect$$libinit$$00000006  libinit2.o(c_p.l)
    ...
    0x00000144   0x00000002   Code   RO          119    .ARM.Collect$$libshutdown$$00000000  libshutdown.o(c_p.l)
    ...
    0x00000160   0x00000006   Code   RO          106    .ARM.Collect$$rtexit$$00000004  rtexit2.o(c_p.l)
    0x00000166   0x00000002   PAD
    0x00000168   0x00000c0c   Code   RO            1    .text               debugdriver.o
    0x00000d74   0x00000020   Code   RO            5    .text               boot_debugdriver.o
    0x00000d94   0x00000002   Code   RO           14    .text               use_no_semi_2.o(c_p.l)
    0x00000d96   0x00000020   Code   RO           16    .text               llushr.o(c_p.l)
    0x00000db6   0x00000002   Code   RO           20    .text               use_no_semi.o(c_p.l)
    0x00000db8   0x0000003e   Code   RO           42    .text               sys_stackheap_outer.o(c_p.l)
    0x00000df6   0x0000000c   Code   RO           45    .text               exit.o(c_p.l)
    0x00000e02   0x00000002   PAD
    0x00000e04   0x00000008   Code   RO           54    .text               libspace.o(c_p.l)
    0x00000e0c   0x00000020   Data   RO          171    Region$$Table       anon$$obj.o
    
    Execution Region ER_RW (Base: 0x20000020, Size: 0x0000007c, Max: 0xffffffff, ABSOLUTE)

    Base Addr    Size         Type   Attr      Idx    E Section Name        Object

    0x20000020   0x0000007c   Data   RW            3    .data               debugdriver.o


    Execution Region ER_ZI (Base: 0x2000009c, Size: 0x00000078, Max: 0xffffffff, ABSOLUTE)

    Base Addr    Size         Type   Attr      Idx    E Section Name        Object

    0x2000009c   0x00000018   Zero   RW            2    .bss                debugdriver.o
    0x200000b4   0x00000060   Zero   RW           55    .bss                libspace.o(c_p.l)

5. generate bin file from .elf:
 - debugdriver.bin:
   - DEP: debugdriver.elf
   - ACTION: fromelf -4.0-821 --bin -o debugdriver.bin debugdriver.elf

6. Now, since debugdriver is made, hellow.o gets generated running armcc (STEP 1). Once that is done, linker runs to gen hellow.elf
 - hellow.elf:
   - DEP: hellow.o boot.o system_cm0ikmcu.o retarget_cm0ikmcu.o IKtests.o debug_i2c.o sporsho_tb.o
   - ACTION: armlink -4.0-821 --map --ro-base=0x0 --rw-base=0x20000020 --symbols --first='boot.o(vectors)' --datacompressor=off --info=inline -o hellow.elf hellow.o boot.o system_cm0ikmcu.o retarget_cm0ikmcu.o IKtests.o debug_i2c.o sporsho_tb.o (boot.o system_cm0ikmcu.o retarget_cm0ikmcu.o IKtests.o debug_i2c.o sporsho_tb.o have their own DEP and ACTION defined and so make runs for each)

7. It shows the image symbol table at this point:
Image Symbol Table
    Local Symbols
    Symbol Name                              Value     Ov Type        Size  Object(Section)

    vectors                                  0x00000000   Section       96  boot.o(vectors)
    ../../angel/boardlib.s                   0x00000000   Number         0  boardshut.o ABSOLUTE
    similarly for kernel.s, startup.s, sys.s. armsys.c, libinit.s, signal.c, hellow.c, boot.c, etc
    .text                                    0x000000ec   Section        0  hellow.o(.text)
    .text                                    0x000001b0   Section        0  boot.o(.text)
    .text                                    0x000001f0   Section        0  sporsho_tb.o(.text)
    .text                                    0x000006e8   Section        2  use_no_semi_2.o(.text)
    .text                                    0x000006ec   Section        0  strlen.o(.text)
    .text                                    0x00000730   Section        2  use_no_semi.o(.text)
    .text                                    0x00000732   Section       62  sys_stackheap_outer.o(.text)
    .text                                    0x00000770   Section        0  exit.o(.text)
    .text                                    0x0000077c   Section        8  libspace.o(.text)
    .bss                                     0x20000020   Section       96  libspace.o(.bss)

    Global Symbols
    Symbol Name                              Value     Ov Type        Size  Object(Section)
    
    __ARM_use_no_argv                        0x00000000   Number         0  hellow.o ABSOLUTE
    __Vectors                                0x00000000   Data          96  boot.o(vectors)
    _printf_flags                            0x00000000   Number         0  printf_stubs.o ABSOLUTE
    _init_alloc                               - Undefined Weak Reference
    __main                                   0x00000061   Thumb Code     8  __main.o(!!!main)
    main                                     0x000000ed   Thumb Code   116  hellow.o(.text)
    NMI_Handler                              0x000001b1   Thumb Code     2  boot.o(.text)
    __temporary_stack_top$libspace           0x20000080   Data           0  libspace.o(.bss)

8. shows Memory Map of the image
  Image Entry point : 0x00000061
  Load Region LR_1 (Base: 0x00000000, Size: 0x00000794, Max: 0xffffffff, ABSOLUTE)
    Execution Region ER_RO (Base: 0x00000000, Size: 0x00000794, Max: 0xffffffff, ABSOLUTE)

    Base Addr    Size         Type   Attr      Idx    E Section Name        Object
    0x00000000   0x00000060   Data   RO            4    vectors             boot.o
    0x00000060   0x00000008   Code   RO           22  * !!!main             __main.o(c_p.l)
    0x00000784   0x00000010   Data   RO          175    Region$$Table       anon$$obj.o

    Execution Region ER_RW (Base: 0x20000020, Size: 0x00000000, Max: 0xffffffff, ABSOLUTE)
    **** No section assigned to this execution region ****

    Execution Region ER_ZI (Base: 0x20000020, Size: 0x00000060, Max: 0xffffffff, ABSOLUTE)
    Base Addr    Size         Type   Attr      Idx    E Section Name        Object
    0x20000020   0x00000060   Zero   RW           59    .bss                libspace.o(c_p.l)

9. Generate the binary executable hellow.bin, which gets loaded into fram. (This loading is done by using the preload feature of the memory model to write the desired hex values into fram. It is done in a verilog "initial begin .. end" block.)
 - hellow.bin:
   - DEP: hellow.elf
   - ACTION: fromelf -4.0-821 --bin -o hellow.bin hellow.elf

10. After this the simulator runs.
---------------------------------------



---------------------------------------------------------------
cortex M0 assembly:
---------------------------------------------------------------

ARM assembly code: => http://www.keil.com/support/man/docs/armasm/armasm_babdagdi.htm
-----------------------------------------

ASM syntax: The syntax depends on the assembler being used. Here the syntax of the ARM assembler (armasm) is discussed. The general form of source lines in assembly language is:

{symbol} {instruction|directive|pseudo-instruction} {;comment}

All three sections of the source line are optional.

symbol is usually a label. In instructions and pseudo-instructions it is always a label. In some directives it is a symbol for a variable or a constant. The description of the directive makes this clear in each case.

Instructions and pseudo-instructions make up the code a processor uses to perform tasks. They are of the form: opcode operand1, operand2, ... => the 1st operand is usually the destination; the remaining operands depend on the type of inst.

double bars || ... || : If you use the same name as an instruction mnemonic or directive, use double bars to delimit the symbol name.

Directives provide important information to the assembler that either affects the assembly process or affects the final output image. They are used by assembler and are not part of m/c code to be generated.

Data definition directives allow insertion of constants inside assembly code. 3 kinds generally used:

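1. DCI (define constant inst): used to code an inst when the assembler can't generate the exact inst that we want, and we know the binary code of that inst. ex: DCI 0xBE00 ; Thumb encoding of BKPT #0

2. DCB (define constant byte): allocates one or more bytes of memory and defines their initial contents.

3. DCD (define constant word): allocates one or more 4-byte words of memory, aligned on 4-byte boundaries, and defines their initial contents. ex: DCD 0x20046f80 (as in the literal pools of the listings above)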

ex: MOV R0, #0x12; set R0=0x12 (in hex)

ex: ADD R0, R1; R0=R0+R1, written in traditional Thumb syntax (not to be used)

ex: ADD R0, R0, R1; R0=R0+R1, written in UAL syntax, which is what is preferred


ELF sections:
-------------
AREA: In a source file, the AREA directive marks the start of a section. This directive names the section and sets its attributes.
END: The END directive instructs the assembler to stop processing this source file. Every assembly language source module must finish with an END directive on a line by itself.

----------------------------------------------
example assembly file generated by armcc
------------------------------------------------
This is the c code:
---------------------
#include "cm0ikmcu.h"
#include "sporsho1.h"
#include "sporsho_tb.h"

volatile uint16_t err = 1;

int main() {
  SYSTEM->MAGICN = 0x4A454449;
  SYSTEM->FRAM_WAIT = 0;
  SYSTEM->MAGICN = 0x0;
  msg_disp("kail_rtsc:\n");
  RTSC->CTRL = RTSC_IE_EN;
  NVIC_EnableIRQ(RTSC_IRQn);
  while (err);
  return err;
}
void RTSC_IRQHandler()
{
  NVIC_DisableIRQ(RTSC_IRQn);
  RTSC->CTRL = RTSC_IE_DIS;
  err--;
  NVIC_EnableIRQ(RTSC_IRQn);
  RTSC->CTRL = RTSC_IE_EN;
}
-------------------------------
this is the asm code:
---------------------
; generated by ARM C/C++ Compiler, RVCT4.0 [Build 821]
; commandline armcc [-c --asm -okail_rtsc.o --cpu=Cortex-M0 -Otime --data_reorder -Ono_autoinline -Ono_inline --diag_suppress=2874 -I./CMSIS/Core/CM0 -DMSC_CLOCK -DITERATIONS=5 kail_rtsc.c]
        THUMB
        REQUIRE8
        PRESERVE8

----- defines code section below -----
        AREA ||.text||, CODE, READONLY, ALIGN=2 => The AREA directive instructs the assembler to assemble a new code or data section. Sections are independent, named, indivisible chunks of code or data that are manipulated by the linker. Syntax: "AREA sectionname{,attr}{,attr}...". |.text| is used for code sections produced by the C compiler, or for code sections otherwise associated with the C library. Attrs are: CODE => contains machine instructions (READONLY is the default). DATA => contains data, not instructions (READWRITE is the default). ALIGN=expression => aligns on a 2^expression byte boundary, e.g. ALIGN=2 => 2^2=4 byte boundary, ALIGN=10 => 2^10=1 KB boundary. By default, ELF sections are aligned on a 4-byte boundary.
    
-----------------------
#NVIC_EnableIRQ PROC is defined in Testbenches/digtop/tc/c/CMSIS/Core/CM0/core_cm0.h. C code is below:
static __INLINE void NVIC_EnableIRQ(IRQn_Type IRQn)
{
  NVIC->ISER[((uint32_t)(IRQn) >> 5)] = (1 << ((uint32_t)(IRQn) & 0x1F));                             /* enable interrupt */
}

#NVIC is a struct of type NVIC_Type
#define     __I     volatile const            /*!< defines 'read only' permissions      */
#define     __O     volatile                  /*!< defines 'write only' permissions     */
#define     __IO    volatile                  /*!< defines 'read / write' permissions   */

typedef struct
{
  __IO uint32_t ISER[1];                      /*!< Interrupt Set Enable Register            */
       uint32_t RESERVED0[31];
  __IO uint32_t ICER[1];                      /*!< Interrupt Clear Enable Register          */
       uint32_t RSERVED1[31];
  __IO uint32_t ISPR[1];                      /*!< Interrupt Set Pending Register           */
       uint32_t RESERVED2[31];
  __IO uint32_t ICPR[1];                      /*!< Interrupt Clear Pending Register         */
       uint32_t RESERVED3[31];
       uint32_t RESERVED4[64];
  __IO uint32_t IP[8];                        /*!< Interrupt Priority Register              */
}  NVIC_Type;

#define SCS_BASE            (0xE000E000)                              /*!< System Control Space Base Address    */
#define NVIC_BASE           (SCS_BASE +  0x0100)                      /*!< NVIC Base Address =  0xE000E100      */
#define NVIC                ((NVIC_Type *)          NVIC_BASE)        /*!< NVIC configuration struct            */
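
The register addresses used later (ISER at 0xE000E100, ICER at 0xE000E180) follow directly from the struct layout above. A small host-side sketch, assuming a copy of the struct named NVIC_Layout (a hypothetical name, so that it does not clash with the real CMSIS type) and a made-up helper nvic_reg_addr:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side copy of the NVIC_Type layout, only used to compute offsets. */
typedef struct {
    volatile uint32_t ISER[1];
    uint32_t RESERVED0[31];
    volatile uint32_t ICER[1];
    uint32_t RESERVED1[31];
    volatile uint32_t ISPR[1];
    uint32_t RESERVED2[31];
    volatile uint32_t ICPR[1];
    uint32_t RESERVED3[31];
    uint32_t RESERVED4[64];
    volatile uint32_t IP[8];
} NVIC_Layout;

/* Absolute address of a register, given its offset in the struct.
 * 0xE000E100 is NVIC_BASE = SCS_BASE + 0x0100. */
uint32_t nvic_reg_addr(size_t offset)
{
    assert(offset < sizeof(NVIC_Layout));
    return 0xE000E100u + (uint32_t)offset;
}
```

Each 1-word register is followed by 31 reserved words, so the registers sit 0x80 bytes apart: nvic_reg_addr(offsetof(NVIC_Layout, ICER)) is 0xE000E180.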

---------------------
NVIC_EnableIRQ PROC asm code below=> r0 is already loaded with 7, which is IRQn
        LSLS     r2,r0,#27 => r2:= r0 << 27 (logical left shift by 27, shifting in 0)
        LSRS     r2,r2,#27 => r2:= r2 >> 27 (logical right shift by 27, shifting in 0), so [31:5]=0 => r2 stores lower 5 lsb of r0 => r2=7. => corresponds to r2= ((uint32_t)(IRQn) & 0x1F))
        MOVS     r1,#1       => r1:= 1
        LSLS     r1,r1,r2  => r1:= r1 << r2, In r1, 1 is placed at bit location 7 (which is stored in [4:0] of r0). corrsponds to r1= (1 << ((uint32_t)(IRQn) & 0x1F))
        LSRS     r0,r0,#5  => r0:= r0 >> 5, since r0 was 7, so now r0=0. => corresponds to r0=((uint32_t)(IRQn) >> 5)
        LDR      r2,|L1.112| => loads word from this label. r2:= 0xe000e100(4 bytes stored in this mem loc) (label needs to be word aligned). This addr=0xe000e100 is in system segment (top 512MB) as NVIC->ISER (interrupt set enable register). It stores "set" bits for IRQ31 to IRQ0.
        LSLS     r0,r0,#2  => r0:= r0 << 2 , r0=r0*4 (r0=0)
        ADDS     r0,r0,r2  => r0:= r0 + r2 , r0=0 + 0xe000e100 (since initial r0=0, so r0=0xe000e100)
        STR      r1,[r0,#0]=> [r0+0]:=r1, stores word from reg r1 (which has bit set for IRQ7) to mem addr [r0]=0xe000e100.  corresponds to NVIC->ISER[((uint32_t)(IRQn) >> 5)] = r1 => [0xe000e100]=IRQ7 set.
        BX       lr        => PC:=lr & 0xFFFF_FFFE, so, pc=lr (with lsb=0), HW aligned
        ENDP
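
The asm above can be mirrored step by step in C to confirm that the LSLS/LSRS pair is the same as "& 0x1F". This is a sketch for a host PC, not code from the project; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* LSLS r2,r0,#27 followed by LSRS r2,r2,#27: keeps only bits [4:0]. */
uint32_t low5_by_shifts(uint32_t x)
{
    return (x << 27) >> 27;
}

/* MOVS r1,#1 ; LSLS r1,r1,r2 => bit mask for the IRQ within its word. */
uint32_t nvic_bit_mask(uint32_t irqn)
{
    return 1u << (irqn & 0x1Fu);
}

/* LSRS r0,r0,#5 ; LSLS r0,r0,#2 ; ADDS r0,r0,r2 => byte addr of ISER[irqn >> 5]. */
uint32_t nvic_iser_addr(uint32_t irqn)
{
    return 0xE000E100u + ((irqn >> 5) << 2);
}
```

For IRQn=7 this gives mask 0x80 written to 0xE000E100, same as the STR in the listing.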

--------------------
#NVIC_DisableIRQ PROC clears the int bit.
static __INLINE void NVIC_DisableIRQ(IRQn_Type IRQn)
{
  NVIC->ICER[((uint32_t)(IRQn) >> 5)] = (1 << ((uint32_t)(IRQn) & 0x1F));    /* disable interrupt */
}

-----------
NVIC_DisableIRQ PROC asm code below:
        LSLS     r2,r0,#27
        LSRS     r2,r2,#27
        MOVS     r1,#1
        LSLS     r1,r1,r2
        LSRS     r0,r0,#5
        LDR      r2,|L1.116| => r2:= 0xe000e180
        LSLS     r0,r0,#2
        ADDS     r0,r0,r2
        STR      r1,[r0,#0]
        BX       lr
        ENDP

------------
int main() {

  SYSTEM->MAGICN = 0x4A454449;
  SYSTEM->FRAM_WAIT = 0;
  SYSTEM->MAGICN = 0x0;

  msg_disp("kail_rtsc:\n");
  RTSC->CTRL = RTSC_IE_EN;

  NVIC_EnableIRQ(RTSC_IRQn);
  while (err);

  return err;
}
-----------
main PROC
        PUSH     {r4,lr}      => push r4 lr on stack
        LDR      r0,|L1.124|  => r0:=0x50007000 (addr of SYSTEM->MAGICN)
        LDR      r1,|L1.120|  => r1:=0x4a454449 (value of SYSTEM->MAGICN)
        STR      r1,[r0,#0]   => [r0+0]:=r1, stores 0x4a454449 into 0x50007000(SYSTEM->MAGICN = 0x4A454449;)
        MOVS     r1,#0           => r1:=0
        STR      r1,[r0,#0xc] => stores 0 into addr 0x50007000+0xc (SYSTEM->FRAM_WAIT = 0;)
        STR      r1,[r0,#0]   => stores 0 into addr 0x50007000+0x0 (SYSTEM->MAGICN = 0x0;)
        ADR      r0,|L1.128|  => load label into r0. r0:=label |L1.128| which stores "kail_rtsc:\n",0 (0=null char to signify end of string)
        BL       msg_disp     => branch to msg_disp. arg of msg_disp is a pointer to char, so starting addr stored in r0. There's strlen() fn called within msg_disp that figures out the length of the string (string is the arg to msg_disp)
        MOVS     r0,#1          => r0:= 1
        LDR      r1,|L1.140|  => r1:= 0x50006000
        LSLS     r0,r0,#11    => r0:= r0 << 11 , r0=1*2048, bit[11]=1, everything else is 0
        STRH     r0,[r1,#0]   => sets bit[11]=IE into 0x50006000(RTSC_reg)(RTSC->CTRL = RTSC_IE_EN;)
        MOVS     r0,#7        => r0:=7
        BL       NVIC_EnableIRQ => jumps to EnIRQ, which uses r0=7, LR=addr of next inst
        LDR      r0,|L1.144|  => r0:=.data section. .data section has mem addr that is determined later. It stores data variable "err" which has initial value=0x0001.
|L1.76|                       => following 3 inst are for while (err);
        LDRH     r1,[r0,#0]  ; err => LDRH calc addr=r0+#0 and loads halfword from mem and 0 extends it to form word and writes into r1. here, r1:=load contents from .data section, which stores variable "err" with value=0x0001. so, r1:=0x0001
        CMP      r1,#0        => subtracts 0 from contents of r1, updates condition flag (z bit, nbit, cbit, vbit in APSR) based on result and discards the result.
        BNE      |L1.76|    => if Z flag (Zero flag in APSR) == 0, i.e. r1 != 0, jump back to |L1.76| (2 stmt above). Falls through once err becomes 0
        MOVS     r0,#0        => else r0:=0
        POP      {r4,pc}    => pop r4 and pc from stack. So r4 gets its initial value back, while pc gets the lr value that was pushed at entry (return to caller)
        ENDP
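
The CMP/BNE pair in the while(err) loop can be modelled like this. Only the Z flag is modelled (N, C, V are ignored), and the function names are made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CMP rn,#imm: computes rn - imm, keeps only the flags. Z=1 when the result is 0. */
bool cmp_z_flag(uint32_t rn, uint32_t imm)
{
    return (uint32_t)(rn - imm) == 0u;
}

/* BNE is taken when Z == 0 (operands were not equal). */
bool bne_taken(bool z)
{
    return !z;
}
```

With err=1 the branch is taken (loop continues); once the IRQ handler decrements err to 0, Z becomes 1 and the loop exits.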

---------------
void RTSC_IRQHandler()
{
  NVIC_DisableIRQ(RTSC_IRQn);
  RTSC->CTRL = RTSC_IE_EN;
  err--;
}
------------
RTSC_IRQHandler PROC asm code below         => this proc is not called from anywhere inside the pgm. The core jumps to it through the IRQ7 entry of the vector table when the interrupt fires
        PUSH     {r4,lr}    => push r4 lr on stack just as in main proc
        MOVS     r0,#7        => r0:=7
        BL       NVIC_DisableIRQ =>jump to DisIRQ
        MOVS     r0,#1          => r0:=1. following inst to set IE
        LDR      r1,|L1.140|    => r1:= 0x50006000
        LSLS     r0,r0,#11    => r0:=r0<<11 => 11th bit of r0=1.
        STRH     r0,[r1,#0]     => stores bit[11]=IE=1 into 0x50006000 (RTSC->CTRL = RTSC_IE_EN;)
        LDR      r0,|L1.144|    => r0:=.data section (start of err--; )
        LDRH     r1,[r0,#0]  ; err => same as main process above. r1 stores err value:=0x0001,
        SUBS     r1,r1,#1    => r1:=r1-1 =>  err--;
        STRH     r1,[r0,#0]    => .data section start addr:=err=0
        POP      {r4,pc}    => pop r4 pc from stack.
        ENDP

|L1.112|
        DCD      0xe000e100 => DCD directive allocates one or more words of memory, aligned on 4-byte boundaries, and defines the initial runtime contents of the memory. & is a synonym for DCD. Syntax: {label} DCD{U} expr{,expr}. Here it defines 1 word (4 bytes) of mem containing 0xe000e100 = mem addr of NVIC->ISER.
|L1.116|
        DCD      0xe000e180 => mem addr of NVIC->ICER.
|L1.120|
        DCD      0x4a454449
|L1.124|
        DCD      0x50007000
|L1.128|
        DCB      "kail_rtsc:\n",0 => DCB is same as DCD except that it allocates one or more bytes of memory, = is a synonym for DCB. It has 12 bytes here (incl null char at end), so next mem loc has offset 128+12=140 => |L1.140|
|L1.140|
        DCD      0x50006000
|L1.144|
        DCD      ||.data||

----- defines data section with name=arm_vfe_header below -----
        AREA ||.arm_vfe_header||, DATA, READONLY, NOALLOC, ALIGN=2

        DCD      0x00000000

----- defines data section with name=.data below -----
        AREA ||.data||, DATA, ALIGN=1 => |.data| is used for data sections produced by the C compiler. Initialized global and static variables are stored in the .data segment. The .data section is as big as the sum of sizes of the initialized variables.
#below code is for this declaration of var: volatile uint16_t err = 1;
err
        DCW      0x0001 => DCW is same as DCD except that it allocates one or more halfwords (2 bytes, as err is uint16_t) of memory. Here label "err" stores 0x0001.

__ARM_use_no_argv EQU 0

        EXPORT __ARM_use_no_argv => The EXPORT directive declares a symbol that can be used by the linker to resolve symbol references in separate object and library files. GLOBAL is a synonym for EXPORT. Use EXPORT to give code in other files access to symbols in the current file.
        EXPORT main [CODE]
        EXPORT RTSC_IRQHandler [CODE]
        EXPORT err [DATA,SIZE=2]

        IMPORT ||Lib$$Request$$armlib|| [CODE,WEAK] => The IMPORT directive provides the assembler with a name that is not defined in the current assembly, i.e. a symbol name defined in a separately assembled source file, object file, or library. The name is resolved at link time to a symbol defined in a separate object file. The symbol is treated as a program address. WEAK = prevents the linker generating an error message if the symbol is not defined elsewhere. It also prevents the linker searching libraries that are not already included.
        IMPORT msg_disp [CODE] => msg_disp symbol defined in another file

        KEEP NVIC_EnableIRQ => The KEEP directive instructs the assembler to retain local symbols in the symbol table in the object file.
        KEEP NVIC_DisableIRQ

        ATTR FILESCOPE
        ATTR SETVALUE Tag_ABI_PCS_wchar_t,2 => sets value of Tag_ABI_PCS_wchar_t to 2
        ATTR SETVALUE Tag_ABI_enum_size,1
        ATTR SETVALUE Tag_ABI_optimization_goals,2
        ATTR SETSTRING Tag_conformance,"2.06"
        ATTR SETVALUE AV,18,1

        ASSERT {ENDIAN} = "little" => The ASSERT directive generates an error message during the second pass of the assembly if a given assertion is false. Use ASSERT to ensure that any necessary condition is met during assembly.
        ASSERT {INTER} = {TRUE}
        ASSERT {ROPI} = {FALSE}
        ASSERT {RWPI} = {FALSE}
        ASSERT {IEEE_FULL} = {FALSE}
        ASSERT {IEEE_PART} = {FALSE}
        ASSERT {IEEE_JAVA} = {FALSE}
        END => denotes end of assembly file

----------------------------------------
hexdump of kail_rtsc.bin file:
------------------------------------------
This file is hexdump of kail_rtsc.bin
----------------------------
below is memory map of image:
Image Entry point : 0x00000061 => actually it's 00000060 (since HW aligned) which is from where the actual code starts. Before this addr is vector table.

Load Region LR_1 (Base: 0x00000000, Size: 0x00000794, Max: 0xffffffff, ABSOLUTE) => this indicates the total size of binary file. It has 3 execution regions:
1. Execution Region ER_RO (Base: 0x00000000, Size: 0x00000790, Max: 0xffffffff, ABSOLUTE) => this is read only section. Section names are .text for user binary and .ARM,.* for other binaries. It is primarily code section, but has data section in 2 places: vector table (0x00000000 to 0x0000060) and Region table (0x00000770 to 0x00000790)

2. Execution Region ER_RW (Base: 0x20000020, Size: 0x00000004, Max: 0xffffffff, ABSOLUTE) => this is read write ".data" section for storing global and static variables (which should be initialized as per C lang convention). It comes from object kail_rtsc.o. Here it stores "err", which is a variable defined in kail_rtsc.c.

NOTE: The base addr of ER_RO and ER_RW comes from the Makefile during the armlink step, where ro-base=0x0 and rw-base=0x20000020. The size of these regions is figured out by the linker from the sections placed in them. Remember that addr 0 of the binary image file has MSP stored in it, so that should be modified to point to the sram area. It's in boot.c in the vector table section.

3. Execution Region ER_ZI (Base: 0x20000024, Size: 0x00000060, Max: 0xffffffff, ABSOLUTE) => this is the read write ".bss" section, which starts right at the end of the 2nd execution region (0x20000020 + 4 bytes). It is that part of the data section that contains statically allocated variables without an initializer: uninitialized data are stored in the .bss segment. It is zero initialized and comes from libspace.o in the library file (c_p.l)
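
The numbers in the memory map above can be cross-checked with simple arithmetic. A sketch only: struct exec_region and the helper names are made up here, and the rule "load size = RO size + RW size" is what the figures in this image show (ER_ZI takes no space in the binary since it is zeroed at startup).

```c
#include <assert.h>
#include <stdint.h>

/* One execution region from the map file: base addr and size in bytes. */
struct exec_region {
    uint32_t base;
    uint32_t size;
};

/* First address after the region. */
uint32_t region_end(struct exec_region r)
{
    return r.base + r.size;
}

/* Bytes stored in the binary: RO contents plus the init values for RW. */
uint32_t load_image_size(struct exec_region ro, struct exec_region rw)
{
    return ro.size + rw.size;
}
```

With ER_RO={0x0,0x790} and ER_RW={0x20000020,0x4}, region_end(ER_RW) is 0x20000024 (the ER_ZI base) and load_image_size is 0x794 (the LR_1 size).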


----------------------------
1. Execution Region ER_RO
----------------------------

-----------------------
######### vector table
-----------------------
#0x00000000 to 0x0000060 is vector table (It's data section named "vectors" in object boot.o with type=DATA)
code in boot.c:
vect_t __Vectors[]
__attribute__ ((section("vectors"))) = {
  (vect_t)(0x00001FF0),     // Top of Stack - for 8KB FRAM
  (vect_t)Reset_Handler,    // Reset Handler
  (vect_t)NMI_Handler,      // NMI Handler
  (vect_t)HardFault_Handler,// Hard Fault Handler
  0,                        // Reserved
  0,                        // Reserved
  0,                        // Reserved
  0,                        // Reserved
  0,                        // Reserved
  0,                        // Reserved
  0,                        // Reserved
  (vect_t)SVC_Handler,      // SVCall Handler
  0,                        // Reserved
  0,                        // Reserved
  (vect_t)PendSV_Handler,   // PendSV Handler
  (vect_t)SysTick_Handler,  // SysTick Handler

  // External Interrupts 0 - 7
  (vect_t)CWT_IRQHandler,
  (vect_t)TMR0_IRQHandler,
  (vect_t)TMR12_IRQHandler,
  (vect_t)GPIO_IRQHandler,
  (vect_t)SIO_IRQHandler,
  (vect_t)ADC_IRQHandler,
  (vect_t)CPSW_IRQHandler,
  (vect_t)RTSC_IRQHandler,
};
----
asm code in boot.s is:
--
        AREA vectors, DATA, READONLY, ALIGN=2

__Vectors
        DCD      0x00001ff0
        DCD      Reset_Handler
        DCD      NMI_Handler
        DCD      HardFault_Handler
        DCD      0x00000000
        DCD      0x00000000
        DCD      0x00000000
        DCD      0x00000000
        DCD      0x00000000
        DCD      0x00000000
        DCD      0x00000000
        DCD      SVC_Handler
        DCD      0x00000000
        DCD      0x00000000
        DCD      PendSV_Handler
        DCD      SysTick_Handler
        DCD      CWT_IRQHandler
        DCD      TMR0_IRQHandler
        DCD      TMR12_IRQHandler
        DCD      GPIO_IRQHandler
        DCD      SIO_IRQHandler
        DCD      ADC_IRQHandler
        DCD      CPSW_IRQHandler
        DCD      RTSC_IRQHandler
----
#0000000 (starting from addr 0, these are the values in each mem addr)
03f0 2000 => addr 0(msp) is loaded with 2000_03f0 (this value hardcoded in boot.c)

NOTE: all vector handler addr below have 1 as LSB (not HW aligned), since bit[0] of the vector addr sets the T bit in EPSR (implies Thumb code is being executed) on that particular exception entry. If we have 0 in LSB, then the T bit gets cleared. Attempting to execute instructions when the T bit is 0 results in a HardFault or lockup (since inst are Thumb code and NOT ARM code).
01bd 0000 => addr 4(reset vector) = 0000_01bd (addr of reset vector handler). since 01bd is not HW aligned, jump to addr=01bc
019d 0000 => addr 8 NMI fault handler addr, 0000_019d gets aligned to 0000_019c
019f 0000 => addr c Hard Fault handler addr, 0000_019f gets aligned to 0000_019e

#0000010 to 0000002b is reserved
0000010 0000 0000 0000 0000 0000 0000 0000 0000
0000020 0000 0000 0000 0000 0000 0000

01a1 0000 => addr 2C SVcall, 01a1 gets aligned to 01a0

#0000030 to  0000037 is reserved
0000030 0000 0000 0000 0000

01a3 0000 => addr 38 PendSV, 01a3 gets aligned to 01a2
01a5 0000 => addr 3C systick handler vector
01a9 0000 => addr 40 IRQ0
01ab 0000 => addr 44 IRQ1
01b1 0000 => addr 48 IRQ2
01a7 0000 => addr 4c IRQ3
01b3 0000 => addr 50 IRQ4
01b5 0000 => addr 54 IRQ5
01b7 0000 => addr 58 IRQ6
015f 0000 => addr 5c IRQ7 (Total 8 IRQ)
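
The vector table addressing above follows a simple rule: entry for exception number N sits at byte addr 4*N, and external IRQn is exception number 16+n. A host-side sketch with made-up helper names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Byte address of the vector table entry for a given exception number. */
uint32_t exception_vector_addr(uint32_t exc_num)
{
    return 4u * exc_num;
}

/* External interrupts start at exception number 16. */
uint32_t irq_vector_addr(uint32_t irqn)
{
    return exception_vector_addr(16u + irqn);
}

/* Bit[0] of the entry is the Thumb bit, not part of the address. */
bool vector_is_thumb(uint32_t entry)
{
    return (entry & 1u) != 0u;
}

/* Address actually fetched from: entry with bit[0] cleared. */
uint32_t vector_target(uint32_t entry)
{
    return entry & ~1u;
}
```

So IRQ7 (RTSC) is at 0x5c, and the reset entry 0x1bd means "Thumb code at 0x1bc".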


-----------------------
######### files from c_p.l
-----------------------
#An initialization sequence executes to set up the system before the main task is executed. These object files are part of the C library provided in c_p.l. __main is the image entry point. It copies code, copies or decompresses RW data, and zeroes uninit data. It then calls __rt_entry which sets up stack and heap, inits lib functions, and then calls user main() code. Once user code finishes, it returns from user code and calls __rt_exit to exit from the app.

#0x0000060 to 0x00000108 is __main.o, __scatter.o, libinit.o, rtentry.o, rtexit.o etc (all in /apps/arm/rvds/4.0-821/RVCT/Data/4.0/400/lib/armlib/c_p.l). Note that all std headers such as stdio.h, etc are in /apps/arm/rvds/4.0-821/RVCT/Data/4.0/400/include/unix/*

#0000060: __main starts here
f000 f802 => 0000060 jumped to this addr1 from Reset_Handler: f000f802  =  BL       {pc} + 0x8  ; 0x68
f000 f840

#0000068: __scatter.o starts here
a00c = 00000068 a00c =      ADR      r0,{pc}+0x34  ; 0x9c
c830 = 0000006a c830 =      LDM      r0!,{r4,r5}
3808 = 0000006c 3808 =      SUBS     r0,r0,#8
1824 = 0000006e 1824        ADDS     r4,r4,r0
182d = 00000070 182d        ADDS     r5,r5,r0
46a2 = 00000072 46a2        MOV      r10,r4
1e67 = 00000074 1e67        SUBS     r7,r4,#1
46ab = 00000076 46ab        MOV      r11,r5
4654 = 00000078 4654        MOV      r4,r10
465d = 0000007a 465d        MOV      r5,r11
42ac = 0000007c 42ac        CMP      r4,r5
d101 = 0000007e d101        BNE      {pc} + 0x6  ; 0x84
f000 f832 = 00000080 f000f832    BL       {pc} + 0x68  ; 0xe8 => jumps to rtentry.o only after loading of .data/.bss is done.
467e = 00000084 467e        MOV      r6,pc
3e0f = 00000086 3e0f        SUBS     r6,r6,#0xf
cc0f = 00000088 cc0f        LDM      r4!,{r0-r3} => Load Multiple, Increment After (LDM) loads multiple registers from consecutive memory locations, starting at the address in the base register. "!" means write-back: the base register is updated to point past the last word loaded. The lowest numbered reg is loaded from the lowest mem addr and the highest numbered reg from the highest mem addr. Here r0,r1,r2,r3 are loaded from mem starting at the addr in r4. The 1st time r4=0x00000770, so r0=(0x770)=00000790, r1=(0x774)=20000020, r2=(0x778)=00000004, r3=(0x77c)=000000a4, and r4 becomes 0x00000780. So the 1st region table entry (ER_RW) for .data is loaded. The next time it comes here r4=0x00000780, so it loads the 2nd region table entry (ER_ZI) and r4 becomes 0x00000790.

46b6 = 0000008a 46b6        MOV      lr,r6
2601 = 0000008c 2601        MOVS     r6,#1
4233 = 0000008e 4233        TST      r3,r6
d000 = 00000090 d000        BEQ      {pc} + 0x4  ; 0x94
1afb = 00000092 1afb        SUBS     r3,r7,r3
46a2 = 00000094 46a2        MOV      r10,r4
46ab = 00000096 46ab        MOV      r11,r5
4333 = 00000098 4333        ORRS     r3,r3,r6
4718 = 0000009a 4718        BX       r3 => jump to scatter_copy.o the first time. Next time it jumps to scatter_zi.o.
06d4
0000
00000a0 06f4 0000

#000000a4: __scatter_copy.o starts here
3a10 = 000000a4 3a10        SUBS     r2,r2,#0x10
d302 c878 c178 d8fa 0752
00000b0 d301 c830 c130 d501
6804 = 000000b8 6804        LDR      r4,[r0,#0]
600c = 000000ba 600c        STR      r4,[r1,#0] => stores 1 into mem addr=2000_0020 (init err=1)
4770 = 000000bc 4770        BX       lr => jump to 0x78
0000 => 2 byte padding

#000000c0: __scatter_zi.o
00000c0 2300 2400 2500 2600 3a10 d301 c178 d8fb
00000d0 0752 d300 c130 d500 600b 4770

#00000dc: libinit.o,
b51f = 000000dc b51f        PUSH     {r0-r4,lr}
46c0 = 000000de 46c0        MOV      r8,r8
46c0 = 000000e0 46c0        MOV      r8,r8

#00000e2: libinit2.o,
bd1f = 000000e2 bd1f        POP      {r0-r4,pc} => pc is loaded with 000000f2 which has inst to jmp to main proc (few inst below)

#000000e4: libshutdown.o
b510 = 000000e4 b510        PUSH     {r4,lr}

#000000e6: libshutdown2.o
bd10 = 000000e6 bd10        POP      {r4,pc}

#000000e8: rtentry.o, rtentry2.o, rtentry4.o
f000 fb19 = 000000e8 f000fb19    BL       {pc} + 0x636  ; 0x71e => jumps to sys_stackheap_outer.o
4611      = 000000ec 4611        MOV      r1,r2

#000000ee: rtentry2.o
f7ff fff5 = 000000ee f7fffff5    BL       {pc} - 0x12  ; 0xdc => jump few inst above
f000 f81d = 000000f2 f000f81d    BL       {pc} + 0x3e  ; 0x130 => jump to main proc in kail_rtsc.c
f000 fb31 = 000000f6 f000fb31    BL       {pc} + 0x666  ; 0x75c

#000000fa: rtexit.o
b403 = 000000fa b403        PUSH     {r0,r1}

#000000fc: rtexit2.o
f7ff fff2 = 000000fc f7fffff2    BL       {pc} - 0x18  ; 0xe4
bc03 = 00000100 bc03        POP      {r0,r1}
f000 f862 = 00000102 f000f862    BL       {pc} + 0xc8  ; 0x1ca
0000 = 2 byte padding
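
The BL targets quoted in the listing ("{pc} + offset") can be re-derived from the two halfwords of the instruction. A minimal sketch, assuming the standard 32-bit Thumb BL encoding (hw1 = 11110 S imm10, hw2 = 11 J1 1 J2 imm11); thumb_bl_target is a hypothetical helper name, not part of any ARM tool. Note that "{pc}" in the disassembly is the address of the BL itself, while the hardware adds the offset to that address + 4.

```c
#include <assert.h>
#include <stdint.h>

/* Decode the branch target of a Thumb BL located at inst_addr.
 * hw1/hw2 are the two halfwords in the order they appear in the hexdump. */
uint32_t thumb_bl_target(uint32_t inst_addr, uint16_t hw1, uint16_t hw2)
{
    assert((hw1 & 0xF800u) == 0xF000u);   /* 11110 S imm10 */
    assert((hw2 & 0xD000u) == 0xD000u);   /* 11 J1 1 J2 imm11 */

    uint32_t s     = (hw1 >> 10) & 1u;
    uint32_t imm10 = hw1 & 0x3FFu;
    uint32_t j1    = (hw2 >> 13) & 1u;
    uint32_t j2    = (hw2 >> 11) & 1u;
    uint32_t imm11 = hw2 & 0x7FFu;
    uint32_t i1    = (~(j1 ^ s)) & 1u;    /* I1 = NOT(J1 EOR S) */
    uint32_t i2    = (~(j2 ^ s)) & 1u;    /* I2 = NOT(J2 EOR S) */

    /* 25-bit signed byte offset S:I1:I2:imm10:imm11:0 */
    uint32_t imm = (s << 24) | (i1 << 23) | (i2 << 22) | (imm10 << 12) | (imm11 << 1);
    if (s) {
        imm |= 0xFE000000u;               /* sign-extend from bit 24 */
    }
    return inst_addr + 4u + imm;          /* unsigned wrap-around acts as signed add */
}
```

For the first instruction of __main, thumb_bl_target(0x60, 0xf000, 0xf802) gives 0x68, and for the call to sys_stackheap_outer.o, thumb_bl_target(0xe8, 0xf000, 0xfb19) gives 0x71e.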


-----------------------
######### user code
-----------------------

#0x0000108 to 0x0000019c is kail_rtsc.o. It starts with inst in NVIC_EnableIRQ (since that process is at the top of the file) and then has NVIC_DisableIRQ proc, then main proc, then RTSC_IRQHandler proc, followed by the constants they use.
#In asm code for kail_rtsc.s, we see that this is marked as .text section: AREA ||.text||, CODE, READONLY, ALIGN=2
0000108: start of NVIC_EnableIRQ:
06c2 = 00000108 06c2        LSLS     r2,r0,#27
0ed2 = 0000010a 0ed2        LSRS     r2,r2,#27
2101 = 0000010c 2101        MOVS     r1,#1
4091 = 0000010e 4091        LSLS     r1,r1,r2
0940 = 00000110 0940        LSRS     r0,r0,#5
4a19 = 00000112 4a19        LDR      r2,[pc,#100]  ; [0x178] => pc reads as (inst addr+4) aligned down to a word: (0x112+4)&~3=0x114. 0x114+0d100=0x114+0x0064=0x178, which has NVIC->ISER addr=0xe000e100
0080 = 00000114 0080        LSLS     r0,r0,#2
1880 = 00000116 1880        ADDS     r0,r0,r2
6001 = 00000118 6001        STR      r1,[r0,#0]
4770 = 0000011a 4770        BX       lr

000011c: start of NVIC_DisableIRQ
06c2 = 0000011c 06c2        LSLS     r2,r0,#27
0ed2 = 0000011e 0ed2        LSRS     r2,r2,#27
2101 = 00000120 2101        MOVS     r1,#1
4091 = 00000122 4091        LSLS     r1,r1,r2
0940 = 00000124 0940        LSRS     r0,r0,#5
4a15 = 00000126 4a15        LDR      r2,[pc,#84]  ; [0x17c] => pc reads as (inst addr+4) aligned down to a word: (0x126+4)&~3=0x128. 0x128+0d84=0x128+0x0054=0x017c, which has NVIC->ICER addr=0xe000e180
0080 = 00000128 0080        LSLS     r0,r0,#2
1880 = 0000012a 1880        ADDS     r0,r0,r2
6001 = 0000012c 6001        STR      r1,[r0,#0]
4770 = 0000012e 4770        BX       lr
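
The "[pc,#imm]" addresses in the listing can be recomputed from the 16-bit opcode. A minimal sketch, assuming the Thumb "LDR Rt,[pc,#imm8*4]" encoding (01001 Rt imm8); the helper names are made up. The base is not simply "next pc": it is (inst addr + 4) with the low 2 bits cleared.

```c
#include <assert.h>
#include <stdint.h>

/* Destination register Rt of a Thumb LDR (literal) opcode. */
unsigned ldr_literal_rt(uint16_t opcode)
{
    assert((opcode & 0xF800u) == 0x4800u);   /* 01001 Rt imm8 */
    return (opcode >> 8) & 0x7u;
}

/* Address of the literal loaded by the LDR located at inst_addr. */
uint32_t ldr_literal_addr(uint32_t inst_addr, uint16_t opcode)
{
    assert((opcode & 0xF800u) == 0x4800u);
    uint32_t imm8 = opcode & 0xFFu;
    uint32_t base = (inst_addr + 4u) & ~3u;  /* pc value, word aligned */
    return base + (imm8 << 2);               /* offset is imm8 words */
}
```

For opcode 4a19 at 0x112 this gives r2 and 0x178, for 4a15 at 0x126 it gives r2 and 0x17c.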

0000130: start of main proc. jumped here from addr=0x000000f2 which is in rtentry2.o
b510 = 00000130 b510        PUSH     {r4,lr}
4814 = 00000132 4814        LDR      r0,[pc,#80]  ; [0x184]
4912 = 00000134 4912        LDR      r1,[pc,#72]  ; [0x180]
6001 = 00000136 6001        STR      r1,[r0,#0]
2100 = 00000138 2100        MOVS     r1,#0
60c1 = 0000013a 60c1        STR      r1,[r0,#0xc]
6001 = 0000013c 6001        STR      r1,[r0,#0]
a012 = 0000013e a012        ADR      r0,{pc}+0x4a  ; 0x188
f000 f855 = 00000140 f000f855    BL       {pc} + 0xae  ; 0x1ee => branch to msg_disp
2001 4913 02c0 8008 2007 f7ff
ffdb
4811 = 00000152 4811        LDR      r0,[pc,#68]  ; [0x198] => loads r0 with data at 0x198 (which is .data section storing 0x2000_0020). So, r0=0x2000_0020
8801 = 00000154 8801        LDRH     r1,[r0,#0] => r0 stores mem addr=20000020, which has variable "err". so r1=err
2900 = 00000156 2900        CMP      r1,#0
d1fc = 00000158 d1fc        BNE      {pc} - 0x4  ; 0x154
8800 = 0000015a 8800        LDRH     r0,[r0,#0]
bd10 = 0000015c bd10        POP      {r4,pc}

000015e: start of RTSC_IRQHandler
b510 = 0000015e b510        PUSH     {r4,lr}
2007 = 00000160 2007        MOVS     r0,#7
f7ff ffdb = 00000162 f7ffffdb    BL       {pc} - 0x46  ; 0x11c => jump to NVIC_DisableIRQ
2001 = 00000166 2001        MOVS     r0,#1
490a = 00000168 490a        LDR      r1,[pc,#40]  ; [0x194]
02c0 = 0000016a 02c0        LSLS     r0,r0,#11
8008 = 0000016c 8008        STRH     r0,[r1,#0]
480a = 0000016e 480a        LDR      r0,[pc,#40]  ; [0x198]
8801 = 00000170 8801        LDRH     r1,[r0,#0]
1e49 = 00000172 1e49        SUBS     r1,r1,#1
8001 = 00000174 8001        STRH     r1,[r0,#0]
bd10 = 00000176 bd10        POP      {r4,pc} => end of process for RTSC_IRQHandler.

data variables start below. NOTE: this is NOT the .data section for main(). It is still the .text section, and it stores all fixed constants. The .data section is put separately at the end, as we keep the .text sections of all .o files together, and the .data sections of all .o files together. Note that the order of the halfword pairs is reversed, since data is stored in mem as [31:0], while data in the hexdump is [15:0],[31:16]
e100 e000 = 00000178 = |L1.112| DCD      0xe000e100 =>  allocates 1 word (4 bytes) of mem with initial contents = 0xe000e100. This initial content is actually mem addr of NVIC->ISER.
e180 e000 = 0000017c = |L1.116| DCD      0xe000e180 => mem addr of NVIC->ICER
4449 4a45 = 00000180 = |L1.120| DCD      0x4a454449
7000 5000 = 00000184 = |L1.124| DCD      0x50007000
616b 6c69 725f 7374 3a63 000a = 00000188 = |L1.128| DCB      "kail_rtsc:\n",0 => each halfword in the hexdump shows the higher-addr byte first, so the bytes in mem order are 6b_61_69_6c_5f_72_74_73_63_3a_0a_00 => 6b=k, 61=a, 69=i, 6c=l ...  3a=:, 0a=line feed (\n), 00=NULL. A string is always terminated by a NULL character. That is how the end of the string is identified.
6000 5000 = 00000194 = |L1.140| DCD      0x50006000
0020 2000 = 00000198 = |L1.144| DCD      ||.data|| => .data section starts at 2000_0020
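
The byte/halfword reordering between the hexdump and memory can be reproduced on a host PC. This is a sketch with made-up helper names, assuming the dump prints little-endian 16-bit units (as the values above show).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rebuild a 32-bit word from two consecutive hexdump halfwords.
 * The first halfword printed is bits [15:0], the second is bits [31:16]. */
uint32_t word_from_dump(uint16_t first, uint16_t second)
{
    return ((uint32_t)second << 16) | first;
}

/* Expand hexdump halfwords into bytes in memory order:
 * the low byte of each halfword sits at the lower address. */
void bytes_from_dump(const uint16_t *hw, size_t count, uint8_t *out)
{
    assert(hw != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        out[2 * i]     = (uint8_t)(hw[i] & 0xFFu);
        out[2 * i + 1] = (uint8_t)(hw[i] >> 8);
    }
}
```

word_from_dump(0xe100, 0xe000) gives 0xe000e100, and feeding the 6 halfwords 616b 6c69 725f 7374 3a63 000a to bytes_from_dump gives the bytes of "kail_rtsc:\n" followed by the NULL.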

-----------------------
######### boot file:
-----------------------
#boot.c defines all of these: fault handlers as infinite while loop, reset handler which calls __main, vector_table which has msp, and addr of fault handlers and _sys_exit function.
#0x000019c to 0x000001dc is boot.o (look in boot.c or boot.s)
000019c to 000001bb: all fault and IRQ handlers (except reset handler), most of which branch to themselves (infinite loop)
e7fe = 000019c: NMI_Handler e7fe        B        {pc} => branches to itself, infinite loop, while(1);
e7fe = 000019e: HardFault_Handler
e7fe = 00001a0: SVC_Handler
e7fe = 00001a2: PendSV_Handler
e7fe = 00001a4: SysTick_Handler
e7fe = 00001a6: GPIO_IRQHandler
e7fe CWT_IRQHandler
e7fe TMR0_IRQHandler
4770 TMR1_IRQHandler BX       lr => just return back to caller. return;
4770 TMR2_IRQHandler

00001b0
e7fe TMR12_IRQHandler
e7fe SIO_IRQHandler
e7fe ADC_IRQHandler
e7fe CPSW_IRQHandler
e7fe RTSC_IRQHandler
e7fe Default_IRQHandler

#000001bc to 000001c3: Reset_Handler PROC starts here: it just calls __main
void Reset_Handler(void) {  __main(); }

b510 = 000001bc b510        PUSH     {r4,lr}
f7ff ff4f = 000001be f7ffff4f    BL       {pc} - 0x15e  ; 0x60 => jump to addr 0x60 which is the __main process
bd10 = 000001c2 bd10        POP      {r4,pc}

#__user_initial_stackheap PROC: this is the c code in boot.c:
__value_in_regs struct __initial_stackheap __user_initial_stackheap(unsigned hb, unsigned sb, unsigned hl, unsigned sl) { // struct is returned in r0-r3, see asm below
  struct __initial_stackheap s;    
  s.heap_base   = hb;
  s.stack_base  = sb;
  s.heap_limit  = s.stack_base;
  s.stack_limit = s.heap_base;
  return s;
}

460a = 000001c4 460a        MOV      r2,r1 => r2 assigned 200003e8 = stack base
4603 = 000001c6 4603        MOV      r3,r0 => r3 assigned 20000020 = heap base
4770 = 000001c8 4770        BX       lr

#_sys_exit PROC: defn of TB_COM in sporsho_tb.h
typedef struct
{
  __IO uint8_t    STB;    // [0] strobe bit
  __IO uint8_t    CMD;    // [4:0] command bit
  __IO uint8_t    RUN;    // [0] dbg_run 0: idle, 1: run
  __IO uint8_t    ERR;    // [0] dbg_err 0: pass, 1: fail
} TB_DBG_TypeDef; //total size of TB_DBG is 1+1+1+1=4

typedef struct
{
  __IO uint16_t   ERR;    // Error code depend on each test case
  __IO uint8_t    STAT;   // Cleared by 0 during simulation startup phase.
              // Reset signal will not change this data
  __IO uint8_t    COM;    // Command for verilog test bench and data byte
              // Upper 4bit is byte num, Lower 4bit is command
  __IO uint32_t   DATA1;  // 4 byte data for verilog test bench
  __IO TB_DBG_TypeDef  DBG;
  __IO uint32_t   FLUSH;  // dummy for flush write buffer
} TB_COM_TypeDef; //total size of TB_COM is 2+1+1+4+4+4=16 bytes
#define TB_COM                  ((volatile TB_COM_TypeDef*) SRAM1K_BASE    )

#This is the c code in boot.c.
void _sys_exit(int return_code) {
  TB_COM->ERR   = return_code; // No error is 0x0000
  TB_COM->COM   = TB_COM_EXIT; // If this bit is asserted, the Verilog testbench catches this flag and calls $finish.
  TB_COM->FLUSH = 0x00;        // Flush SRAM write buffer. If you don't, then TB_COM_EXIT bit is not carried to Verilog testbench
  while(1); //infinite loop, so pgm runs for ever
}
2101 = 000001ca 2101        MOVS     r1,#1
0749 = 000001cc 0749        LSLS     r1,r1,#29
8008 = 000001ce 8008        STRH     r0,[r1,#0]
2001 = 000001d0 2001        MOVS     r0,#1
70c8 = 000001d2 70c8        STRB     r0,[r1,#3]
2000 = 000001d4 2000        MOVS     r0,#0
60c8 = 000001d6 60c8        STR      r0,[r1,#0xc]
e7fe = 000001d8 e7fe        B        {pc}  ; 0x1d8
0000 = 000001da is 2 byte PADDING
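
The offsets used by the _sys_exit asm (#0, #3, #0xc) can be checked against the struct definitions with offsetof. A host-side sketch: the *_Layout type names are made up copies of the typedefs above, and the base value 0x20000000 for SRAM1K_BASE is only inferred from "MOVS r1,#1; LSLS r1,r1,#29" in the listing, not taken from a header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side copies of TB_DBG_TypeDef / TB_COM_TypeDef, for offset checks only. */
typedef struct {
    volatile uint8_t STB;
    volatile uint8_t CMD;
    volatile uint8_t RUN;
    volatile uint8_t ERR;
} TB_DBG_Layout;

typedef struct {
    volatile uint16_t ERR;
    volatile uint8_t  STAT;
    volatile uint8_t  COM;
    volatile uint32_t DATA1;
    TB_DBG_Layout     DBG;
    volatile uint32_t FLUSH;
} TB_COM_Layout;

/* Base address built by MOVS r1,#1 ; LSLS r1,r1,#29 (assumed = SRAM1K_BASE). */
uint32_t tb_com_base(void)
{
    return 1u << 29;
}
```

ERR is at offset 0 (STRH r0,[r1,#0]), COM at offset 3 (STRB r0,[r1,#3]) and FLUSH at offset 0xc (STR r0,[r1,#0xc]).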

data area vectors is defined after this in boot.s, which has vectors starting from addr=0x00 to 0x60

-----------------------
######### user lib file for common functions
-----------------------

#0x000001dc to 0x000006d4 is sporsho_tb.o (generated from sporsho_tb.c)
00001dc: start of putchar function
2101 0749
7108 2200 70ca 2312 70cb 60ca 4770

00001ee to 0006d3: start of msg_disp function
00001ee b5f8
00001f0 4604 f000 fa71 2100 460a 2800 dd17 2700
...
00006d0 4040 5000

-----------------------
######### additional files from c_p.l
-----------------------

#0x000006d4 to 0x00000790 is use_no_semi_2.o, strlen.o, exit.o, etc (in c_p.l)

#00006d4: use_no_semi_2.o
4770
0000 = 2 byte PAD

#000006d8: strlen.o
b530 1c44 e005 7801
00006e0 1c40 2900 d101 1b00 bd30 0781 d1f7 4b0a
00006f0 01dd c804 1ad1 4391 4029 d0fa 1b00 060a
0000700 d001 1ec0 bd30 040a d001 1e80 bd30 0209
0000710 d0fc 1e40 bd30 0000 0101 0101

#0000071c: use_no_semi.o
4770

#0000071e: sys_stackheap_outer.o
4675
0000720 f000 f822 46ae 0005 4669 4653 08c0 00c0
0000730 4685 b018 b520
f7ff fd45 =  00000736 f7fffd45    BL       {pc} - 0x572  ; 0x1c4
bc60 2700 0849
46b6 = 00000740 46b6        MOV      lr,r6
2600 = 00000742 2600        MOVS     r6,#0
c5c0 = 00000744 c5c0        STM      r5!,{r6,r7}
c5c0
c5c0
c5c0
c5c0
c5c0
c5c0
c5c0 = 00000752 c5c0        STM      r5!,{r6,r7}
3d40 = 00000754 3d40        SUBS     r5,r5,#0x40
0049 = 00000756 0049        LSLS     r1,r1,#1
468d = 00000758 468d        MOV      sp,r1
4770 = 0000075a 4770        BX       lr

#0000075c: exit.o
4604 = 0000075c 4604        MOV      r4,r0
46c0 = 0000075e 46c0        MOV      r8,r8
46c0 = 00000760 46c0        MOV      r8,r8
4620 = 00000762 4620        MOV      r0,r4
f7ff fcc9 = 00000764 f7fffcc9    BL       {pc} - 0x66a  ; 0xfa

#00000768: libspace.o
4800 4770 0024 2000

-----------------------
######### Region Table for .data/.bss
-----------------------

#00000770 to 00000790 is Region Table: anon$$obj.o. It's a .data (not .text) section but read only. The startup code (__scatter) uses it to copy the .data init values to the appropriate SRAM location, so that the variables can be RW instead of RO.
0000770 => details about .data section (execution Region ER_RW). object is kail_rtsc.o
addr 770 = 0790 0000 = source location (from fram) to be copied. addr 770 stores 00000790, which is the starting addr of all .data values. It stores the value of err
addr 774 = 0020 2000 = dest location (in sram) to be init. addr 774 stores base addr for .data as 2000_0020 corresponding to err (execution Region ER_RW starts at 2000_0020)
addr 778 = 0004 0000 = no. of bytes to be init. addr 778 stores size of ER_RW as 4 bytes, even though err is only 2 bytes in size (as it's uint16_t). The extra 2 bytes are padding, since the copy routine here moves whole words (LDR/STR at 0xb8).
addr 77c = 00a4 0000 = addr of the routine that handles this entry (0xa4 = __scatter_copy).

0000780 => details about .bss section (execution Region ER_ZI). object is libspace.o(c_p.l)
addr 780 = 0794 0000 = source location (from fram) to be copied. addr 780 stores 00000794, which stores nothing as it's .bss data (uninitialized)
addr 784 = 0024 2000 = dest location (in sram) to be init. addr 784 stores base addr for .bss as 2000_0024 (execution Region ER_ZI starts at 2000_0024)
addr 788 = 0060 0000 = no. of bytes to be init. addr 788 stores size of ER_ZI as 96 bytes (hex=0x60 bytes)
addr 78c = 00c0 0000 = addr of the routine that handles this entry (0xc0 = __scatter_zi).

-----------------------
######### actual init values to be stored for .data section
-----------------------

#0000790: this stores the initial values of all .data sections (.bss needs no values here, as it is zero initialized).
0001 => init value of err. err is defined as 16 bit in kail_rtsc.c (volatile uint16_t err = 1;)
0000 => PADDING.
           
-region table above looks like this:
-------------
0x770: ER_RW = src_addr (in fram) | dest_addr (in sram) | no_of_bytes | handler_addr (0xa4 = __scatter_copy)
0x780: ER_ZI = src_addr (in fram) | dest_addr (in sram) | no_of_bytes | handler_addr (0xc0 = __scatter_zi)
0x790: DATA  = var1, var2, .... varn => this stores initial values of all var to be init in ER_RW section.
------------
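
What __scatter does with this table can be sketched in portable C. This is a simplified host-side model, not the library code: struct region_entry, region_copy, region_zero and run_region_table are made-up names, pointers are used instead of 32-bit addresses, and the real routines work on words with extra flag bits in the handler address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Handler called for one table entry (plays the role of r3 in "BX r3"). */
typedef void (*region_fn)(const void *src, void *dst, size_t nbytes);

/* One 4-field table entry: src | dest | no_of_bytes | handler. */
struct region_entry {
    const void *src;
    void       *dst;
    size_t      nbytes;
    region_fn   handler;
};

/* Models __scatter_copy: copy init values from "fram" to "sram". */
void region_copy(const void *src, void *dst, size_t nbytes)
{
    memcpy(dst, src, nbytes);
}

/* Models __scatter_zi: zero the destination, src is unused. */
void region_zero(const void *src, void *dst, size_t nbytes)
{
    (void)src;
    memset(dst, 0, nbytes);
}

/* Walk the table from first to last entry, like the loop at 0x78..0x9a. */
void run_region_table(const struct region_entry *table, size_t count)
{
    assert(table != NULL);
    for (size_t i = 0; i < count; i++) {
        table[i].handler(table[i].src, table[i].dst, table[i].nbytes);
    }
}
```

With a 2-entry table (copy 4 bytes holding 0x0001 plus padding, then zero 0x60 bytes) the "sram" copy of err ends up as 1 and the .bss model is all zeros, which is the state main() expects.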

-----------------------
######### end of binary file. No data below. 0000793 is last addr, so, size of file is 0x794 bytes
-----------------------


--------------------------------------
inst seq during execution on M0:
--------------------------------
At start, the HRESET line is low. The processor is in reset, with the AHB bus idle. HTRANS[1]=0(idle), HADDR[31:0]=4, HMASTER=1(=> src of transaction is NOT core but slave, so basically it refers to idle spot). HPROT[3:0]=1010=a (cacheable, not bufferable, privileged, opcode). HSIZE=0(1 byte).

Now, when HRESET goes high, then 2 cycles later, AHB starts firing.
3rd HCLK cycle: HADDR=0, HMASTER=0, HSIZE=2(1 Word), HTRANS=2. HPROT[3:0]=1011=b (cacheable, not bufferable, privileged, data) => read from addr 0. Loading msp.
4th HCLK cycle: HADDR=4 => read from addr 4 (addr of reset vector). HPROT[0]=1=data. It's read as 10D (HRDATA[31:0]=10C)
5th HCLk cycle: HADDR=4, HMASTER=1, HSIZE=0(1 byte), HTRANS=0. => dumy cycle. inst to jump is being prepared.
6th HCLK cycle: HADDR=10C, HMASTER=0, HSIZE=2(1 Word), HTRANS=2, HPROT[0]=0=opcode => jumps to reset ISR and fetches 1st inst from there.
7th HCLK cycle: still processing previous inst. 1st inst in any function call is "PUSH     {r4,lr}" which is being processed here.
8th-10th HCLK cycle: actual push inst executes here.  This pushes r4 onto the lower mem and lr onto higher mem. SP is updated to point to the lowest loaded mem. In 8th HCLK, next seq inst at addr 110 gets fetched. In 9th HCLK, r4 is written to mem SP-8. In 10th HCLK, lr is written to mem SP-4, SP is updated to SP-8 and next inst is fetched which is "branch to 0x60 = __main process".
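
A small C sketch of two things seen in the sequence above: decoding the 4 HPROT bits, and turning the 2 words read from addr 0 and addr 4 into the initial MSP and the first fetch addr. The names hprot_t, decode_hprot, reset_state_t and reset_fetch are made up for illustration; this models the waveform, not the core's RTL.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int data_access;  /* HPROT[0]: 1 = data, 0 = opcode fetch */
    int privileged;   /* HPROT[1] */
    int bufferable;   /* HPROT[2] */
    int cacheable;    /* HPROT[3] */
} hprot_t;

static hprot_t decode_hprot(uint8_t hprot)
{
    assert(hprot < 16);
    hprot_t p = { hprot & 1, (hprot >> 1) & 1, (hprot >> 2) & 1, (hprot >> 3) & 1 };
    return p;
}

typedef struct {
    uint32_t msp;   /* initial main stack pointer */
    uint32_t pc;    /* addr of 1st inst fetch */
    int      thumb; /* bit 0 of the reset vector */
} reset_state_t;

/* vector_table[0] = word at addr 0, vector_table[1] = word at addr 4 */
static reset_state_t reset_fetch(const uint32_t vector_table[2])
{
    assert(vector_table);
    reset_state_t s;
    s.msp   = vector_table[0];
    s.thumb = (int)(vector_table[1] & 1u);
    s.pc    = vector_table[1] & ~1u;  /* Thumb bit is not part of the addr */
    return s;
}
```

For HPROT=0xA this gives cacheable, not bufferable, privileged, opcode. For a reset vector of 0x10D it gives a fetch addr of 0x10C with the Thumb bit set.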

----------------------------------------

retarget.c: (in veridian) => This file Retargets printf and scanf statements to AHB Snooper or UART. printf fn calls fputc/fgetc (via compiled code), and we intercept the fputc/fgetc to retarget to AHB or UART
----------

#include <stdio.h>

#ifdef UART //for UART
int fputc(int ch, FILE *f) {
  volatile int delay; //volatile, so that compiler doesn't optimize the delay loop away
  int codereturn;
  codereturn = UartPutc(ch); //UartPutc defined as {USART_Transmit(USART,ch); return(ch);} => In USART_Transmit, ch assigned to tx buffer. pusart->UTBUF = ch;
  for(delay = 0; delay < 2000; delay++); //crude delay after each char
  return (codereturn);
}
int fgetc(FILE *f) {
  return (UartPutc(UartGetc())); //received char is echoed back. UartGetc defined as {unsigned short int data = 0; USART_Receive(USART, &data); return data;} => In USART_Receive, rx buffer is read into data. *pdata |= (pusart->URBUF);
}
#else //for AHB
int fputc(int ch, FILE *f) {
  (*(volatile int *)(0x20002F10)) = ch; //string i/p on this addr
  (void)(*(volatile int *)(0x20002F14)); //string display on this addr. NOTE: this is just reading that addr, but not assigning read value to anything.
  return ch;
}
int fgetc(FILE *f) {
  return 0; //does nothing as we don't process any scanf inputs
}
#endif
----
display_monitor.v => This snoops for addr and outputs string in verilog
----
//set flag
always @ (posedge HCLK or negedge HRESETN) begin //this snoops for above addr on AHB bus, and sets flag
    if(~HRESETN) begin
        string_input_flag   <= 0;
        string_display_flag <= 0;
    end
    else begin //flag is high for 1 clk after each addr match (i.e. during data phase), then it's reset again
        string_input_flag   <= (i_haddr == 32'h20002F10) & `hready & `htrans[1];
        string_display_flag <= (i_haddr == 32'h20002F14) & `hready & `htrans[1];
    end
end

//store string
reg [7:0] string_store;
always @ (posedge HCLK) begin
    if(string_input_flag)
        string_store <= hwdata[7:0]; //on each addr match, data on next clk is stored here (only lower 8 bits, as it's ASCII char)
end

//display string
always @ (posedge HCLK) begin
    if(string_display_flag)
        $write("%s",string_store); //In this case, in fputc function, string display is right after string input, so only 1 char is stored at a time. %s used to display ASCII equiv for hex value.
end
------

-------------------------------
Cortex M0 (ARMv6-M):
-------------------------------
Cortex M0 is among the simplest and smallest ARM cores, and is very widely used in devices worldwide. NOTE: all registers are 32 bit wide here, even though most inst are only 16 bit wide.

For Cortex M0 inst set, Look in ARMv6-M arch reference manual. (For Cortex M3, look in ARMv7-M arch reference manual, chapter A4 and A5 (page 85 - page 417) for details and encoding)

ARMv6-M supports a subset of T32 ISA: all inst are 16 bit, except for these 32 bit inst: BL, DMB, DSB, ISB, MRS and MSR. Not all inst in T32 ISA are supported by v6-M (e.g. it doesn't support CBZ, CBNZ and IT inst from T32 ISA, which are supported by v7-M). As can be seen, T32 ISA (or even the subset of T32 ISA shown below) has a lot more inst than Thumb1 ISA.

These groups of inst are supported by M0: (Total inst = 86 inst listed in ARM cortex-M0 reference manual)
1. Arithmetic: ADD, ADDS, ADCS (Add with carry), SUB, SUBS, SBCS (subtract with carry), MULS, RSBS (same as NEG in Thumb1)
2. Branch: B, BAL(unconditional), BL, BX, Bxx (conditional, various flavors = BEQ, BNE, BCS/BHS, BCC/BLO, BMI, BPL, BVS/BVC, BHI, BLS, BGE/BGT/BLE/BLT), BLX (link and exchange)
3. Data Xfer (load/store): MOV/MOVS/MVNS, MSR/MRS (MRS=rd special reg, MSR=wrt special reg), Unsigned load/store (LDR/STR, LDRH/STRH, LDRB/STRB), signed load (LDRSH,LDRSB),Load/store multiple (LDM/STM, LDMIA/STMIA, LDMFD, STMEA, PUSH/POP)
4. Logical: ANDS, ORRS, EORS, BICS (bit clear), ASRS (arithmetic shift), logical shift (LSLS/LSRS), RORS (rotate right), TST, reverse bytes (REV/REV16/REVSH)
5. Bit oriented (compare): CMP/CMN
6. Pack/unpack: SXTB, SXTH, UXTB, UXTH => these inst are not there in 8051.

7. barrier: DMB/DSB/ISB (memory barrier)

8. hint: SEV (send event), WFE (wait for event), WFI (wait for interrupt. this inst puts processor in sleep until wakeup event happens), NOP (Hint inst)
9. Misc: ADR, CPSIE/CPSID (enable/disable interrupt), BKPT (breakpoint), SVC (supervisor call). NOTE: none of these inst are present in Thumb1 ISA (except SVC, which is the new name for Thumb1 SWI).

Inst:
----
B (branch):
-----------
causes branch to target addr = PC + offset, where PC reads as addr of the branch inst + 4. offset is always an even number (lsb=0), since all inst are 16 bit aligned. Encoding is 16 bit in T1/T2 but 32 bit in T3/T4. ARMv6-M supports only T1 and T2; T3 and T4 need v7-M.
 T1 encoding: 16 bit [15:0]: [15:12]=1101, [11:8]=cond, [7:0]=8 bit imm value (imm8 itself is -128 to +127, but since it's shifted left by 1 (lsb=0 appended), offset becomes -256 to +254).
 T2 encoding: 16 bit [15:0]: [15:11]=11100, [10:0]=11 bit imm value (offset = -2048 to +2046).
 T3 encoding: 32 bit [31:0]: offset = {S, J2, J1, imm6, imm11, 0} = -2^20 to (+2^20-2)
   HW1 (lower 16 bits: [15:0]): [15:11]=11110, [10]=S, [9:6]=cond, [5:0]=imm6
   HW2 (upper 16 bits: [31:16]): [15:14]=10, [13]=J1, [12]=0, [11]=J2, [10:0]=imm11
 T4 encoding: 32 bit [31:0]: offset = {S, NOT(J1 EOR S), NOT(J2 EOR S), imm10, imm11, 0} = -2^24 to (+2^24-2)
   HW1 (lower 16 bits: [15:0]): [15:11]=11110, [10]=S, [9:0]=imm10
   HW2 (upper 16 bits: [31:16]): [15:14]=10, [13]=J1, [12]=1, [11]=J2, [10:0]=imm11
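
A short C sketch of decoding the 2 16 bit encodings (T1 and T2) into a branch target. The function names are made up for illustration. It assumes the usual Thumb rule that PC reads as addr of the branch inst + 4.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `bits` bits of v to 32 bits. */
static int32_t sign_extend(uint32_t v, unsigned bits)
{
    assert(bits > 0 && bits < 32);
    uint32_t m = 1u << (bits - 1);
    v &= (1u << bits) - 1u;
    return (int32_t)((v ^ m) - m);
}

/* Returns 1 and sets *target if hw is a T1 (conditional) or T2
 * (unconditional) B located at addr; returns 0 for anything else. */
static int thumb_b_target(uint16_t hw, uint32_t addr, uint32_t *target)
{
    int32_t off;
    if ((hw & 0xF000u) == 0xD000u && ((hw >> 8) & 0xFu) < 0xEu)
        off = sign_extend((uint32_t)(hw & 0xFFu) << 1, 9);    /* T1: imm8:0  */
    else if ((hw & 0xF800u) == 0xE000u)
        off = sign_extend((uint32_t)(hw & 0x7FFu) << 1, 12);  /* T2: imm11:0 */
    else
        return 0;
    *target = addr + 4u + (uint32_t)off;  /* PC = inst addr + 4 */
    return 1;
}
```

Ex: 0xE7FE has imm11=0x7FE, i.e. offset -4, so it branches to itself ("B ."). cond values 1110 and 1111 in the T1 slot are not branches, so they are rejected.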

-------------------------------------------------
3 addressing modes: [Rn, offset]: Rn=base reg
1. offset addressing: offset value added/sub from addr in base reg and used as the addr for mem access. base reg is unaltered.
2. Pre-index addressing: offset value added/sub from addr in base reg and used as the addr for mem access. but here base reg is written with the new addr.
3. Post-index addressing: addr in base reg is used as the addr for mem access. but here base reg is written with the new value which is offset value added/sub from addr in base reg.  
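
The 3 modes differ only in which addr is used for the access and whether the base reg is written back. A minimal C model (addr_mode_t and effective_addr are names made up here). As far as I know, ARMv6-M ld/st inst use only the plain offset form; pre/post-index forms belong to the larger ISAs.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { ADDR_OFFSET, ADDR_PRE_INDEX, ADDR_POST_INDEX } addr_mode_t;

/* Returns the addr used for the mem access, and updates *rn (base reg)
 * when the mode calls for writeback. */
static uint32_t effective_addr(uint32_t *rn, int32_t offset, addr_mode_t mode)
{
    assert(rn);
    uint32_t sum    = *rn + (uint32_t)offset;
    uint32_t access = (mode == ADDR_POST_INDEX) ? *rn : sum;
    if (mode != ADDR_OFFSET)
        *rn = sum;  /* pre and post index both write the new addr to base reg */
    return access;
}
```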



Alignment:
---------
For ARMv6-M, instruction fetches are always halfword-aligned and data accesses are always naturally aligned. So, inst fetch can only be done from addr whose bottom bit=0. However, in compiled code we'll still see some addresses with the bottom bit set. The bottom bit is used to show that the destination address is a Thumb instruction. It is not treated as part of the address.

Each inst fetch brings in 4 bytes of inst. Since each inst is min 2 bytes, each fetch can bring in 2 16 bit inst, or 1 32 bit inst, or 1 16 bit inst and part of 1 32 bit inst, or parts of 2 32 bit inst. Inst prefetches are done in advance whenever there is an empty slot on AHB bus. Addr of inst to execute next is calculated as (address_of_current_instruction) + (size_of_executed_instruction) after each inst. If that inst is already prefetched, it's executed, else inst fetch is done for that address.

Inst fetches are done with HWRITE line set low. Data accesses are done with HWRITE set high for writing to mem (inst STRB, STRH, STR), and set low for reading from mem (inst LDRB, LDRH, LDR).

Naturally aligned means that an access will be aligned to its size. So a 4-byte access will be on a 4-byte boundary. So, for STRB/LDRB, STRH/LDRH, STR/LDR, compiler will generate ld/st with addr aligned on byte, halfword or word boundary respectively. It does this by padding the data layout: if a 16 bit struct member would otherwise land on addr 0xF1 in C code, the compiler places it at addr 0xF2 to make it HW aligned (it doesn't use addr 0xF0, as the compiler only pads forward, never backward). Should software attempt an unaligned data access (e.g. via a cast pointer), a HardFault will be generated.

For C structs, the ARM ABI (AAPCS) gives a struct the same alignment as its most aligned member (the C standard only requires it to be at least that strict). So, if the struct has a member defined as uint32_t, then its starting addr will be aligned to Word. Individual members of the struct are then aligned as per each member's data size.
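
A small C check of the alignment rules above. struct sample and is_naturally_aligned are made up for illustration; the offsets in the comments are what typical ABIs (incl. ARM's) produce, and the static asserts fail the build on any target that lays the struct out differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sample {
    uint8_t  a;  /* offset 0 */
    uint16_t b;  /* offset 2: 1 padding byte inserted after a */
    uint32_t c;  /* offset 4 */
};

/* Layout checks, evaluated at compile time (C11). */
_Static_assert(offsetof(struct sample, b) == 2, "b is halfword aligned");
_Static_assert(offsetof(struct sample, c) == 4, "c is word aligned");
_Static_assert(sizeof(struct sample) == 8, "struct size is a multiple of 4");

/* An access of `size` bytes (1, 2 or 4) is naturally aligned if addr is
 * a multiple of size. */
static int is_naturally_aligned(uint32_t addr, uint32_t size)
{
    assert(size == 1 || size == 2 || size == 4);
    return (addr & (size - 1u)) == 0;
}
```

So a halfword access at 0xF1 is unaligned, while 0xF2 is fine for a halfword but not for a word.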

 

Cortex M0 pins:
-------------
i/p pins
----
reset pins: PORESETn(for jtag/sw), DBGRESETn, HRESETn, SYSRESETREQ(o/p)
clks: FCLK(for WIC), SCLK, HCLK, DCLK
irq: IRQ[31:0]
scan: SE (scan enable H during shifting and L during normal op) RSTBYPASS (bypasses internal reset sync so that ATPG tool can have controllability of reset flops)

o/p pins:
----
HALTED: indicates that uP is in debug state.
LOCKUP: indicates that uP is in architected lockup state, as the result of an unrecoverable exception.
SLEEPING: indicates that uP is idle, waiting for interrupt on either IRQ[31:0], NMI or internal SysTick (sleep entered due to WFI), or HIGH on RXEV (sleep entered due to WFE)
DEEPSLEEP: active when SLEEPDEEP bit in SCR set to 1. and SLEEPING is HIGH.
WAKEUP:

AHB bus:
o/p: HADDR[31:0], HBURST[2:0], HMASTLOCK, HPROT[3:0], HSIZE[2:0], HTRANS[1:0], HWDATA[31:0], HWRITE
i/p: HRDATA[31:0], HREADY, HRESP

M0 debug i/f can be configured for SW i/f or JTAG i/f but not both.
-----
debug: CDBGPWRUPACK(i/p), CDBGPWRUPREQ(o/p)
serial wire i/f: SWCLKTCK(sw clk), SWDITMS(sw data), SWDO(o/p), SWDOEN(swd o/p pad ctl signal),
JTAG i/f:
i/p: nTRST, TDI, SWCLKTCK(jtag TCK), SWDITMS(jtag TMS)
o/p: TDO, nTDOEN(jtag TDO o/p ctrl signal)

 

ISA:

Any ISA needs a minimum of 3 inst types: load/store inst, conditional branch and logical (and|or|not) function. Load/Store inst needed to move contents from one place to other. Branch inst needed to implement if-else conditions which form the basis of doing intelligent things based on conditions. Logical functions are needed to implement any arithmetic/logical operations as add, multiply, etc (since basic gates as and, or, not are sufficient to implement any logic function). However, such an ISA would require very large code size to do simple operations like ADD, SUB, CMP, etc. So, we extend the ISA to include more commonly used inst, so that code size and compute cycles are reduced.

ARM has supported multiple inst sets, and across so many products this became very confusing. Details of the inst sets are provided in the ARM Architecture Reference Manual (ARM ARM).

Initially ARM inst set (called ARM ISA or A32) was the only ISA in ARM processors, which was a fixed 32 bit inst set. Bits 27 to 20 stored various opcodes, while other fixed bits stored source/dest reg numbers. There are 16 registers in user space (R0-R15). However, code size for ARM ISA is large due to inst being 32 bit, even though it has good perf with low power. So, later 16 bit inst (called Thumb ISA) were added to improve code density. 1st popular chip to include Thumb ISA was ARM7TDMI (T in name implies Thumb ISA). Initially Thumb ISA included only 16 bit inst, but later more inst were added to both ARM ISA and Thumb ISA, resulting in a few 32 bit inst in Thumb ISA. ARM ISA, with its fixed inst size, is classic RISC style, while Thumb ISA, with both 16 and 32 bit inst sizes, has the variable length encoding more typical of CISC (though it's still a load/store ISA).

Thumb ISA: these were mostly 16 bit. To be able to use Thumb inst in ARM processors (which support ARM ISA), there is a decompressor in ARM hardware, which decompresses 16 bit Thumb inst into 32 bit ARM inst, which are then passed onto the ARM instruction decoder. Each 16 bit Thumb inst had an equiv 32 bit ARM inst. The main motivation for introducing Thumb ISA was to reduce code size, by encoding the most commonly used 32 bit ARM instructions in 16 bit. This was very useful in embedded designs, where memory is limited and expensive. Thus the inst length became variable: while most inst were 16 bit, a few inst were encoded in 32 bit, where 16 bit encoding wasn't possible (but the inst was needed to improve cycle time). There were 2 sets of Thumb ISA introduced.

  • Thumb1: had 35 inst of which only 'BL' inst is 32 bit. Most 16 bit inst can only access the lower 8 registers (R0-R7), since there are only 3 bits to encode each register. Since most inst were 16 bit, this reduced code size by 30% and reduced I-cache misses, but increased cycle counts. Solution was to blend Thumb1 and ARM inst, where critical sections of code were written in ARM ISA and the rest in Thumb1 ISA. This gave rise to Thumb2 ISA. Loosely speaking, Thumb ISA usually refers to Thumb1 ISA.
  • Thumb2: Thumb2 ISA had all of Thumb1 ISA + some new 16 bit inst for code size wins. A few new 32 bit inst were also added (DMB, DSB, ISB, MRS, MSR, BL/BLX). On top of this, 32 bit equiv inst for the corresponding 16 bit inst were also provided. Thus Thumb2 provided most of the ARM inst too. This resulted in a total of 100's of inst. However, virtually all inst available in ARM ISA (with the exception of a few) were now available in Thumb2, which allowed a unified assembly language (UAL) for ARM and Thumb2, which can then be compiled to generate binaries for either ISA. Thumb2 is also known as T32 ISA (Thumb 32 bit ISA). T32 gives programmers the flexibility to code sections of their pgm in 16 bit as well as 32 bit inst, depending on whether code size or performance is more important. There is no mode to switch b/w 16 bit and 32 bit: all T32 inst (whether they are 16 bit or 32 bit) are decoded by the ARM core the same way, to generate internal ARM 32 bit inst. Since Thumb2 is a confusing name, we'll use T32 for it instead. NOTE: T32 or Thumb2 includes all 16/32 bit Thumb inst, as well as equiv 32 bit ARM inst. Thumb2 first appeared in ARM1156T2-S and is the only ISA of Cortex M3, which was the 1st Cortex processor. Thumb2 kind of unified Thumb1 ISA and ARM ISA into one, which allowed Cortex-M processors to run in 1 operation state, instead of switching b/w ARM state (when running 32 bit ARM ISA) and Thumb state (when running 16 bit Thumb ISA). This was a huge advantage in terms of perf for Cortex cores.

UAL (unified assembly language): This assembly language syntax is for ARM assembly tools. T32 had its own assembly language syntax, while A32 had its own. This caused confusion. Since most inst were almost the same b/w T32 and A32, UAL was developed to allow both ISA to have the same assembly language syntax. This allowed easier porting b/w the 2. We'll use the new UAL syntax for any assembly language code. "THUMB" directive in assembly file indicates that the code is in UAL syntax ("CODE16" directive implies it's in traditional Thumb syntax). Since most inst in T32 have 16 bit and 32 bit variants, compilers choose which variant to use to generate assembly code. Suffix ".W" after any inst indicates it's a 32 bit inst (W=wide), while ".N" indicates it's a 16 bit inst (N=narrow). If no suffix is provided, then assembler can choose b/w 16 bit or 32 bit (but usually defaults to 16 bit to get smaller code size).


Since Thumb2 was required to be backward compatible with Thumb1 (meaning any code in Thumb1 should run on a Thumb2 m/c), the 16 bit inst from Thumb1 could not be changed. Trick was to make processor recognize new 16 bit inst as well as new 32 bit inst. On looking at original Thumb1 ISA, it was seen that bits [15:13] of only 2 inst were 111. These 2 inst were "B" (unconditional branch), which was a 16 bit inst, and "BL" (long branch with link), which was a 32 bit inst. So, the 3 MSB [15:13] of the first halfword were used to indicate a possible 32 bit inst. Process was as follows:

Look at bits [15:13] of first HalfWord (HW): If it's anything other than "111", it's an existing 16 bit Thumb inst. If it's "111" => it may be B, BL or some other new inst. Now, look at bits [12:11]:

1. 00 => If bits [12:11]=00, it's the existing Thumb1 unconditional Branch (B), which is a 16 bit inst.

2. 01, 10, 11 => If bits [12:11]=anything else, it's a Thumb2 32 bit instruction (including BL, which was the Thumb1 32 bit inst).

New Thumb2 inst which were 16 bit were encoded in the 16 bit encodings still unused.


NOTE: Thumb instruction execution enforces 16-bit alignment on all inst.  This means that 32-bit inst are treated as two halfwords, hw1 and hw2, with hw1 at the lower address. So, 32 bit inst is as follows:
Data: 31:24  23:16  15:8  7:0 => HW2=[31:16], HW1=[15:0]
Addr:  A+3    A+2    A+1   A  => Addr can only be 16 bit aligned, so lsb of Addr is always 0.
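
The length rule above (bits [15:13] and [12:11] of the first halfword) and the HW1/HW2 layout can be sketched in C as below. Function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Returns inst size in bytes (2 or 4), given the first halfword. */
static unsigned thumb_inst_size(uint16_t hw1)
{
    unsigned top3  = hw1 >> 13;          /* bits [15:13] */
    unsigned b1211 = (hw1 >> 11) & 3u;   /* bits [12:11] */
    if (top3 == 7u && b1211 != 0u)
        return 4;                        /* 11101, 11110, 11111 => 32 bit */
    return 2;                            /* everything else, incl. 11100 (B) */
}

/* Combines 2 halfwords as laid out in memory (hw1 at the lower addr)
 * into the 32 bit value shown above: HW2=[31:16], HW1=[15:0]. */
static uint32_t thumb_fetch32(const uint16_t *p)
{
    assert(thumb_inst_size(p[0]) == 4);
    return ((uint32_t)p[1] << 16) | p[0];
}
```

Ex: 0xE7FE (B) and 0xB510 (PUSH {r4,lr}) are 16 bit, while a first halfword of 0xF7FF (BL) marks a 32 bit inst.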

Thumb2 introduced CBZ (compare and branch if zero) inst, which previously required 2 separate inst. It also introduced predication via the if-then inst (IT), which makes the next 1-4 inst in memory conditional. Thumb2 performance was 98% of ARM perf, and code size was 30% less than ARM ISA. So, Thumb2 became the ISA of choice.

Later ARM ISA for 64 bit (aka A64) was added. So, in a nutshell, A64 ISA is for 64 bit processors, while A32/T32 ISA are for 32 bit processors. When we talk about Thumb ISA, we'll mean Thumb2 ISA, or refer to it as T32. There is no more Thumb1 ISA; it's all Thumb2 or T32. A32 is the other 32 bit ISA used in A and R profiles. A32 and T32 are almost the same ISA, with the processor decompressing T32 into A32 internally. T32 ISA is just the compressed version of A32 ISA to save memory space (where some 32 bit inst from A32 were encoded as 16 bit inst in T32). Thus there is not much diff b/w T32 and A32 ISA.

Thumb1 Instructions:

There are 35 total Thumb1 inst (34 are 16 bit while 1 is 32 bit). Out of 16 bits, a few msb bits are used for opcode encoding, while lower bits specify reg, const, etc. There are 19 instruction formats for these 35 inst. Instruction format refers to the opcode and reg num location (i.e the same opcode ADD may appear in 3 or 4 formats, depending on whether it's adding 2 reg, or adding a constant number, etc). Below we list all 35 inst based on their type, and NOT on their format (all inst are listed in ARM7TDMI manual):

1. Arithmetic: 6 inst = ADD, ADC (add with carry bit), SUB, SBC (sub with carry bit), MUL (multiply 2 reg), NEG (2's complement, Rd=-Rs). These arithmetic inst can be b/w reg or b/w reg and constant.

ADD has 4 formats:

- add 9 bit signed constant to stack pointer

- add 10 bit constant to either PC or SP, and load resulting addr into a reg. So, this is a "load addr" instead of "load data"

- add 8 bit constant to one reg and store in another reg

- add 2 reg

NEG: do negative of 1 reg and store in other reg, i.e Rd = -Rs

2. load from mem: 7 inst = LDRB (load byte), LDRH (load half word or 2 bytes), LDR (aka LDRW or load full word or 4 bytes), LDM/LDMIA (load multiple), LDSB (load sign extended byte), LDSH (load sign extended half word), POP. NOTE: load/store inst is not there in 8051. Move inst in 8051 does load/store func. move inst in ARM does move from reg to reg, and not from/to mem.

LDMIA: load multiple reg  (only 8 reg possible from R0 to R7, since 8 bits allocated in inst) from contents of mem, specified by addr contained in a base reg (3 bits allocated for base reg, 000=R0 ... 111=R7).

POP: pop reg specified by the list (optionally LR also, depending on opcode bit for LR), from the stack in mem (i.e load contents from stack mem to reg). Only 8 reg possible in the list from R0 to R7, since 8 bits allocated in inst. used during function/subroutine calls

3. store to mem: 5 inst = STRB, STRH, STR (aka STRW or store word), STM/STMIA (store multiple), PUSH. These store don't have sign extended version as in load. These inst same as those of load above, except they store to mem (instead of loading from mem)

PUSH: same as pop, except it does push of reg contents to the stack (i.e store contents from reg to stack mem). used during function/subroutine calls
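
A minimal C model of the 16 bit PUSH, matching the "PUSH {r4,lr}" behavior seen in the M0 waveform notes: lowest numbered reg goes to the lowest addr, LR goes to the highest, and SP ends at the lowest stored word. cpu_t, stack_word, push and the STACK_TOP value are assumptions made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_TOP   0x20001000u  /* hypothetical initial SP */
#define STACK_WORDS 64u

typedef struct {
    uint32_t r[16];               /* r[13]=SP, r[14]=LR */
    uint32_t stack[STACK_WORDS];  /* the words just below STACK_TOP */
} cpu_t;

/* Maps a stack addr to its word in the host buffer. */
static uint32_t *stack_word(cpu_t *c, uint32_t addr)
{
    uint32_t base = STACK_TOP - 4u * STACK_WORDS;
    assert(addr >= base && addr < STACK_TOP && (addr & 3u) == 0);
    return &c->stack[(addr - base) / 4u];
}

/* reg_list: bits 0-7 select R0-R7, bit 8 selects LR. */
static void push(cpu_t *c, uint16_t reg_list)
{
    unsigned n = 0;
    for (unsigned i = 0; i <= 8; i++)
        n += (reg_list >> i) & 1u;
    uint32_t addr = c->r[13] - 4u * n;  /* lowest addr to be written */
    c->r[13] = addr;                    /* SP = lowest stored word */
    for (unsigned i = 0; i <= 8; i++) {
        if ((reg_list >> i) & 1u) {
            *stack_word(c, addr) = (i == 8) ? c->r[14] : c->r[i];
            addr += 4u;
        }
    }
}
```

POP is the mirror image: it loads the same words back in the same order and moves SP up by 4 bytes per reg.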

4. move from reg to reg: 2 inst

MOV (move one reg to another reg, or move constant to another reg),

MVN (move NOT of one reg to another reg),

5. logical: 8 inst = AND, ORR (or), EOR (xor), LSL (logical shift left, <<), LSR (logical shift right >>), ASR (arithmetic shift right), ROR (rotate right), BIC,  TST (AND test)

BIC: bit clear = AND NOT of 2 reg, i.e Rd = Rd AND NOT Rs

TST: (AND test) = set condition code (N,Z,C,V flag in PSR reg) on Rd AND Rs, so similar to AND, but the result is discarded and only the condition code is updated

6. Branch: 4 inst = B, Bxx, BL, BX. branch may be conditional or unconditional.

B: unconditional PC relative branch. offset is bit[10:0], so 11 bits, but it's shifted left (<< 1) by one, since addr is always HW aligned. So, the offset actually becomes 12 bit 2's complement offset, so range of addr that can be jumped to is PC +/- 2048 bytes.

BX: branch indirect. performs unconditional branch to addr specified in LO or HI reg.

BL: long unconditional branch with link. This is the only 32 bit inst in Thumb1. This is same as B, except that offset is 23 bit 2's complement, where upper 11 bits of offset are stored in 1st 16 bit inst, and next 11 bits are stored in 2nd 16 bit inst (and then shifted left by one, so offset becomes 23 bits, giving a range of PC +/- 4 MB). Addr of inst following the BL is placed in LR, so that after end of branch, PC can return back to where it was.

Bxx: This performs conditional branch depending on state of CPSR condition code. condition code in N,Z,C,V can be each bit set or clear. So, 8 opcodes are allocated for each bit (N,Z,C,V) either set or clear. Remaining 6 opcodes are allocated to combinations of set/clear bits. There are 4 bits for opcode from 0000 to 1111, but only 14 opcodes are coded (BEQ, BNE, BCS, BCC, etc).
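
A C sketch of the 2 branch details above: building the 23 bit BL offset from its 2 halfwords, and evaluating the 14 Bxx conditions from the N,Z,C,V flags. Names are made up for illustration; the condition numbering (0=EQ, 1=NE, ... 13=LE) is the standard ARM one.

```c
#include <assert.h>
#include <stdint.h>

/* Thumb1 BL at addr: hw1 = 11110 + offset[22:12], hw2 = 11111 + offset[11:1]. */
static uint32_t thumb1_bl_target(uint16_t hw1, uint16_t hw2, uint32_t addr)
{
    assert((hw1 & 0xF800u) == 0xF000u && (hw2 & 0xF800u) == 0xF800u);
    uint32_t off = ((uint32_t)(hw1 & 0x7FFu) << 12) | ((uint32_t)(hw2 & 0x7FFu) << 1);
    if (off & (1u << 22))     /* sign bit of the 23 bit offset */
        off |= 0xFF800000u;
    return addr + 4u + off;   /* PC = inst addr + 4 */
}

typedef struct { unsigned n, z, c, v; } flags_t;

/* Returns 1 if branch with condition `cond` (0..13) is taken. */
static int cond_passed(unsigned cond, flags_t f)
{
    assert(cond < 14);
    int r;
    switch (cond >> 1) {
    case 0:  r = f.z != 0; break;                /* EQ / NE */
    case 1:  r = f.c != 0; break;                /* CS / CC */
    case 2:  r = f.n != 0; break;                /* MI / PL */
    case 3:  r = f.v != 0; break;                /* VS / VC */
    case 4:  r = f.c && !f.z; break;             /* HI / LS */
    case 5:  r = (f.n == f.v); break;            /* GE / LT */
    default: r = (f.n == f.v) && !f.z; break;    /* GT / LE */
    }
    return (cond & 1u) ? !r : r;  /* odd cond = inverse of the even one */
}
```

Ex: the pair F7FF FFFE encodes offset -4, i.e. a BL to itself.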

7. compare b/w 2 reg: 2 inst = CMP, CMN.

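CMP: this inst is used in 3 ways: compare b/w 2 low reg, compare involving a high reg, or compare b/w reg and an 8 bit constant. All of these compute Rd - Rs (or Rd - constant), set condition flags (N,Z,C,V flag in PSR reg) on the result, and discard the result.

CMN: adds 2 reg, sets condition flags on the result, and discards the result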
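
How the N,Z,C,V flags come out of CMP and CMN can be modeled in C as below. This follows the usual ARM convention that a subtract is done as x + NOT(y) + 1, so C=1 means "no borrow". nzcv_t and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { unsigned n, z, c, v; } nzcv_t;

static nzcv_t add_with_carry(uint32_t x, uint32_t y, unsigned carry_in)
{
    assert(carry_in <= 1u);
    uint64_t usum = (uint64_t)x + y + carry_in;               /* unsigned sum */
    int64_t  ssum = (int64_t)(int32_t)x + (int64_t)(int32_t)y
                    + (int64_t)carry_in;                      /* signed sum */
    uint32_t res  = (uint32_t)usum;
    nzcv_t f;
    f.n = res >> 31;                          /* msb of result */
    f.z = (res == 0);
    f.c = (unsigned)((usum >> 32) & 1u);      /* carry out of bit 31 */
    f.v = (ssum != (int64_t)(int32_t)res);    /* signed overflow */
    return f;
}

/* CMP: flags of rd - rs. CMN: flags of rd + rs. Result is discarded. */
static nzcv_t cmp_flags(uint32_t rd, uint32_t rs) { return add_with_carry(rd, ~rs, 1u); }
static nzcv_t cmn_flags(uint32_t rd, uint32_t rs) { return add_with_carry(rd, rs, 0u); }
```

Ex: CMP of 2 equal values gives Z=1 and C=1, which is why BEQ and BCS/BHS are both taken after it.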

8. software interrupt: 1 inst = SWI (It's not hardware interrupt). It causes the processor to switch to ARM state and enter supervisor (SVC) mode. It loads SWI vector addr (addr 0x08 in vector table) into the PC. NOTE: on Cortex-M, offset 0x08 of the vector table holds the non-maskable interrupt (NMI) vector instead, and SWI is replaced by SVC. This 16 bit inst has an 8 bit comment field which can be used by the SWI handler; it is ignored by the processor.

Thumb2 instructions: These include all Thumb1 ISA + new 16 bit inst + new 32 bit inst + 32 bit equiv inst for all 16 bit inst. Total inst count is over 100. Some inst from here are part of v7-M arch only (i.e they are not supported in v6-M arch)

1. Arithmetic:16 bit

ADR

SDIV/UDIV = signed/unsigned divide

CPY

RSB

REV/REV16/REVSH => reverses byte order (individual bytes are not reversed or modified). REV is for full word, REV16 is for half words (bytes in both half words are reversed separately), while REVSH is for reversing the lower half word, and then sign extending the result with MSB.

RBIT => reverses bit order in data word. Useful for processing serial bit streams in data communication, where the entire stream needs to be reversed

BFC/BFI => bit field clear(BFC), bit field insert (BFI)

SBFX/UBFX => signed and unsigned bit field extract

SXTB/SXTH, UXTB/UXTH = used to extend a byte/HW into a Word. S=sign extend with MSB (bit [7] for byte or bit [15] for HW), U=unsigned, value is 0 extended to 32 bits
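
The byte reverse and extend inst above are easy to model in C. The functions below are illustrative models of what each inst computes (names chosen to mirror the mnemonics); they are not compiler intrinsics.

```c
#include <assert.h>
#include <stdint.h>

/* UXTB/UXTH: zero extend the low 8 or 16 bits. */
static uint32_t uxt(uint32_t x, unsigned bits)
{
    assert(bits == 8 || bits == 16);
    return x & ((1u << bits) - 1u);
}

/* SXTB/SXTH: sign extend the low 8 or 16 bits. */
static uint32_t sxt(uint32_t x, unsigned bits)
{
    uint32_t v = uxt(x, bits);
    uint32_t m = 1u << (bits - 1);
    return (v ^ m) - m;
}

/* REV: reverse all 4 bytes of the word. */
static uint32_t rev(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* REV16: reverse the 2 bytes inside each halfword. */
static uint32_t rev16(uint32_t x)
{
    return ((x >> 8) & 0x00FF00FFu) | ((x << 8) & 0xFF00FF00u);
}

/* REVSH: reverse bytes of the lower halfword, then sign extend. */
static uint32_t revsh(uint32_t x)
{
    return sxt(((x >> 8) & 0xFFu) | ((x & 0xFFu) << 8), 16);
}
```

Ex: for 0x12345678, REV gives 0x78563412 and REV16 gives 0x34127856.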

2. barrier instructions: new 32 bit inst, to force a memory/inst barrier. It forces all mem access/inst before it to complete, before allowing mem access/inst coming after it to complete. This may be needed in complex mem systems, when out of order execution can cause race conditions. All 3 inst below can't be coded in high level language, so these can be accessed via functions defined in CMSIS compliant device driver library. i.e void __DMB(void); //function defn for DMB inst

DMB: data mem barrier. It forces all mem access before it to complete, before new mem access can be done. This is helpful in multi processor systems, where shared mem is used.

DSB: data sync barrier. It forces all mem access before it to complete, before allowing inst coming after it to complete.

ISB: inst sync barrier. It forces all inst before it to complete, before allowing inst coming after it to complete.

3. move inst to rd/wrt special reg:

MRS: move contents of special reg (i.e APSR, IPSR, PSR, MSP, PSP, etc) to general purpose reg. This causes rd of special purpose reg.

MSR: move contents of general purpose reg to special reg. This causes wrt of special purpose reg. MRS is used in conjunction with MSR as part of rd-modify-wrt seq (ex: to update a PSR to clear Q flag)

4. hint inst: 16 bit

SEV = send event, causes an event to be signaled to all processors within a multiprocessor system. It also sets local event reg to 1.

WFE = sleep and wait for event,

WFI = sleep and wait for interrupt. this inst puts processor in sleep until wakeup event happens,

NOP = no operation

5. branch: 16 bit

CBZ

CBNZ

BLX = branch indirect with link. Only the register form (BLX Rm) is supported here; the immediate form, which existed in traditional ARM processors, is unsupported as it switches to ARM state.

IT = If then. allows up to 4 succeeding inst to be conditionally executed. It avoids branch penalties, as there is no change to pgm flow.

6. misc : 16 bit

SVC = supervisor call. causes SVC exception

BKPT = breakpoint

CPS (CPSIE/CPSID) = change processor state

7. All 32 bit equiv inst for 16 bit inst above (these include 16 bit inst from Thumb1 as well as from Thumb2. ex: if total number of 16 bit inst were 50, then there are 50 equiv 32 bit inst in Thumb2 ISA)

ARM instructions:

A32 is pretty much similar to T32, except that it has no 16 bit inst. So, it can be considered a subset of T32 with some minor changes. A64 is more complex ISA, and competes with x86_64.