Theatres in USA

Theatres in the USA are quite expensive. The average ticket price is about $8/person (incl. tax), with cheaper tickets for shows before 6PM. However, since this is a country where deals abound, you can save at least half the money while still enjoying movies at these theatres. Many theatres also put up Bollywood movies if a significant number of Indians live in the area, and some big cities like Dallas, Houston and Atlanta have dedicated Indian theatres. Most theatres are digital, and there are some IMAX theatres too. If you are here, you should try a real IMAX theatre at least once. There are lots of IMAX theatres, but the real ones are the ones with huge screens in a dedicated building. If you are seeing an IMAX screen inside a multiplex, it's not a real IMAX. The sad part is that these "IMAX certified" theatres end up charging you a lot of money while giving you an experience only slightly better than a regular 3D show. You can find more details of American movie theatres on this Wiki Link

Different Theatres

  1. Cinemark. Cinemark is the 2nd largest movie theatre chain in the USA. They have normal theatres, digital and 3D theatres, as well as IMAX theatres. Their tickets are usually $8, but they also have so-called discount theatres, which offer tickets for $1 or $2. The discount theatres are exactly the same as normal theatres, so you don't miss anything. The only catch is that discount theatres do not show the latest releases, only movies that are at least a couple of months old. Cinemark tries to have at least 1 discount theatre in each big city, so it's the cheapest way to watch movies anywhere in the USA. These discount theatres are usually referred to as "Movies 6", "Movies 8", "Movies 10", "Movies 14", etc. or "dollar cinema" theatres on the Cinemark website. Look through the "theatre and showtime" page on the website to find a theatre near you. I'm listing some theatres below (Movies 8, etc.) which are "dollar" theatres (<$3/ticket). There are other theatres ("Movies 14", etc.) with slightly higher prices (<$6/ticket) but newer releases. You should not have to pay more than $6 for any movie show, since almost all cities have one of these higher-priced theatres with the latest releases.
    • Austin, TX: Cinemark Movies 8 (Round Rock, TX) => PERMANENTLY CLOSED (as of 2021). Regular movies = $2.50/ticket, 3D = $4.25/ticket (discounted to $3.75 on Wednesdays), mostly 6 month old movies
    • Austin, TX: Cinemark Movies 14 (Round Rock, TX) => regular movies = $5.75/ticket, mostly newer movies
    • Dallas, TX: Cinemark Hollywood USA Movies 15 (Garland, TX) => regular movies = $2/ticket, 3D = $4/ticket (discounted on Tuesdays), mostly 6 month old movies
  2. AMC Theatres. AMC is the largest movie theatre chain in the USA (almost twice the size of Cinemark). This chain gives you the best value for money, with movie halls similar to Cinemark's. Although their normal ticket price is also about $8, they charge $4 (in some places $5 or $6) for movies on Friday, Saturday and Sunday that start before noon. This is the cheapest way to watch movies in places that don't have a Cinemark discount theatre.
  3. Bollywood Theatres: There are a lot of Bollywood theatres operated by Indians, which exclusively show desi movies. However, these exist only in big cities. This website is the best I was able to find: it lists all the Bollywood cinemas in the USA and has links to all the Bollywood movies that are released. However, most of these theatres charge $8/ticket, and some have deals for 1 day a week. Still too expensive, and not worth the money, since most of these movies can be watched online for free or at a very low cost. See the "online movie serial" link on the left.
  4. IMAX: IMAX theatres are the only ones truly worth going out to a theatre for. A decent projector (<$500) can give you an experience similar to a regular movie theatre, but a real IMAX theatre gives you an experience that is impossible to capture with a home projector. A typical home projector screen is 10ft x 10ft = 100 sq ft, which is just 5% of a typical IMAX screen.

    Probably best to try the world's three largest IMAX screens:

    1. Sydney IMAX, Sydney, Australia - Held the world record for 15 years and was recently upgraded
    Dimensions: 97x117ft (29.7x35.7m) - 11,350 sq ft, 1,060 m2

    2. Melbourne IMAX, Melbourne, Australia - Had the world's largest 3D screen prior to the Sydney upgrade
    Dimensions: 75x104ft (23x32m) - 7,800 sq ft, 736 m2

    3. Prasad IMAX, Hyderabad, India - The world's busiest and 3rd largest screen
    Dimensions: 72x95ft (22x29m) - 6,840 sq ft, 638 m2

    To put these screens into perspective, most IMAX screens are 2,500 sq ft or less, which is less than 22% of the Sydney screen size.

How to watch Hollywood movies for free in Theatres

Probably you have heard about movie screenings. These are the first shows of a movie, put on for journalists, reporters and selected guests; this is where they get to review a movie before it is released to the general public. But here in the US, these screenings are open to the general public too. There are many ways to find out about free movie screenings: one is to ask the theatre when they have one; another is to look in newspapers. However, the best and most reliable source is the Wild About Movies website. This link https://www.wildaboutmovies.com/free-movie-screenings/ takes you to the movies that are available for free screening. Click on the movie you are interested in and see if it is available in your city; most of the big US cities are covered. Fill in the form, and you will get a confirmation in email. Print your pass and go to the theatre on the designated date. Remember that not everyone gets passes; only a few randomly selected entries get a confirmation email with a pass.

One thing to note is that seating is on a first-come, first-served basis, and the number of passes issued to the general public is more than the number of seats. So you have to get to the theatre at least half an hour before the scheduled movie time; for a heavily hyped movie, you might have to go hours before it starts. You will have to stand in a queue, so don't be surprised if you see a long one. Unless the theatre administrator comes out and announces that NOT everyone will be admitted, you will most likely get in even if you are at the end of the queue. Come full, since popcorn/drinks are not cheap in theatres.

Kids Summer Movies

A lot of theatres in the USA run very cheap movie shows during the summer. The movies are usually older kids' movies, shown as the first show of the day when the theatres are empty anyway. They also avoid weekends and hold these shows on weekdays to minimize their losses. Very few theatres participate, so check the website to see which theatre is closest to you. In spite of all these limitations, it's a nice way to get kids to watch movies on the big screen for a dollar or two. AMC, Cinemark and Regal all have their own version of a "Kids Summer Movie Club".

  1. Cinemark: They call it the "Summer Movie Clubhouse". For about 10 weeks in summer, they show a different movie every week. The movies are shown from Monday-Wednesday @10am or before.
    1. 2024 => June 10-Aug 15 (10 weeks). Link => https://www.cinemark.com/series-events-info-pages/summer-movie-clubhouse/

 

 

******************************************
For running synthesis in Cadence RC (RTL Compiler):
-------------------------------------------------------------------

RC does global optimization, which separates timing-critical and non-timing-critical paths before mapping them to gates. This results in a better design than tools that do local/incremental optimization, where the design is mapped to gates first and only then is timing optimized.

RC:
---
Create a dir: /db/NOZOMI_NEXT_OA/design1p0/HDL/RCompiler/digtop/

NOTE: everything in RC is stored in a virtual dir starting at / . So, we see / after many of the cmds, which specifies that the cmd applies to the whole design (/ implies the top-level dir of the design. It's NOT a continuation character for the next line)
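Once a design is loaded, the virtual dir can be browsed like a unix filesystem from within the rc shell (a quick sketch; the exact subdirs depend on the design and tool rev):

rc:/> cd /designs/digtop => move into the design object
rc:/> ls => lists virtual subdirs such as instances_hier, instances_seq, nets, ports_in, ports_out
rc:/> cd / => back to the root dir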

cp .synth_init file from some other dir. It is similar to .synopsys_dc.setup and has search paths, lib path and other variables setup. RC searches for this file first in installation dir as master.synth_init file, then in home dir for .cadence/.synth_init file, and finally in current dir for .synth_init file. It has the following and many more settings:

#set_attribute <attr_name> <attr_value> <object> => sets value of an attribute. In RC, there are predefined attr associated with objects. We can set only those attr on objects, which are read-write. Also, some attr can only be set at root (/) level, while some can be set on "designs" objects only.
ex: set_attribute lp_clock_gating_exclude true /designs/digtop => setting attr on designs/digtop object

#we can also create our own attribute:
set_user_attribute <attr_name> <attr_value> <object>

#get_attribute  <attr_name> <object> => gets attr value on single object only.
ex: get_attr load /libraries/slow/inx1/A

#set library paths to max delay lib. When we set this attr, RC scans these .lib files and reports errors/warnings in these files such as unsupported constructs, etc. It also reports unusable cells (marked as "dont_use" in these lib files; usually all CTS cells and dly cells are marked as dont_use). Then it sets the attribute of root "/": library = PML30_W_150_1.65_CORE.lib PML30_W_150_1.65_CTS.lib
set_attribute lib_search_path {/db/pdkoa/lbc8/2011.06.26/diglib/pml30/synopsys/src} /
set_attribute library {"PML30_W_150_1.65_CORE.lib" "PML30_W_150_1.65_CTS.lib"} / => max library. library attr is a root attr and so it's applied at root dir. This cmd populates the /libraries virtual dir.

#WLM, PLE, spatial or Physical RC can be used for wire modeling. WLM is worst and Physical is best. WLM is default.
#RC-WLM: info in .lib file. has WLM models. look in liberty.txt for details. WLM provides the same res/cap for all layers (Res=0, cap=1pf/unit_length). In reality, res=0.2ohm/um and cap=0.2ff/um for the LBC7 process, so WLM is overly optimistic for net delays, and effectively treats net delays as 0.
set_attr interconnect_mode wireload /
set_attr wireload_mode top /

#RC-PLE (physical layout estimation), RC-spatial, RC-physical (RCP): these need tech lef and std cell lef files in addition to .lib files. cap_table and floorplan def files are optional for PLE and spatial, but a floorplan .def file is required for physical as it has pin locations, macro placement, etc. PLE does a good job modeling local interconnects since the physical cell sizes as well as metal layer info are present. Providing cap table info gives a better estimate of cap/res, as actual cap/res is taken for each layer. spatial models longer wires better, as it does coarse placement under the hood. Providing a floorplan def helps a lot in RC-spatial.

set_attr interconnect_mode ple / => not needed, as specifying lef files applies PLE.
#when setting the attr for lef lib, RC scans these files and reports the number of routing layers, number of logic/seq cells, and any warnings/errors etc. It also checks consistency b/w tech lef and cap table for width of layers, etc. Then it sets attrs "lef_library", "cap_table" for root "/" to the named files below.
set_attribute lef_library {"/db/pdk/lbc7/rev1/diglib/msl270/r3.0.0/vdio/lef/msl270_lbc7_core_iso_2pin.lef" "/db/pdk/lbc7/rev1/diglib/msl270/r3.0.0/vdio/lef/msl270_lbc7_tech_3layer.lef" } / => both tech and std cell lef files provided. stored in compiler memory at /libraries
set_attribute cap_table_file {/db/pdk/lbc7/rev1/diglib/msl270/r3.0.0/vdio/captabl/3lm_maxC_maxvia.capTbl} / => helpful. This shows res and cap for various width and spacing for each layer and vias.

For running spatial or physical, include it in "synthesize" cmd as follows when running rc:
synthesize -to_mapped -spatial -effort [low|medium|high] => spatial
synthesize -to_placed => physical. It runs First encounter (FE) placeDesign, trialroute, extractRC, buffers long wires, brings in physical timing and performs inc opt. Then we can do: write_encounter digtop. We can output a def file which is fully placed legal design pre-CTS. We can then start from the CTS step in FE.


##Default undriven/unconnected setting is 'none'. These attrs connect each i/p, o/p or internal undriven signal (wire/reg) to the specified value. none implies the undriven signal remains undriven. The post-elaboration netlist will have appr gates and assign stmts to support the driven value.
#set_attribute hdl_unconnected_input_port_value 0 | 1 | x | none /
#set_attribute hdl_undriven_output_port_value   0 | 1 | x | none /
#set_attribute hdl_undriven_signal_value        0 | 1 | x | none /
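#ex of concrete settings (a sketch; which values to pick is a project convention, not a tool requirement):
set_attribute hdl_unconnected_input_port_value 0 / => tie unconnected instance i/p ports to 1'b0
set_attribute hdl_undriven_signal_value x / => leave undriven internal signals as x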

#naming style in verilog netlist generated. %s is variable name, %d is individual bit
set_attribute hdl_array_naming_style %s_%d /  
set_attribute bus_naming_style %s_%d /
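#ex: with the %s_%d style above, a bus coded as reg [2:0] state in RTL ("state" is just an illustrative name) appears in the gate netlist as state_2, state_1, state_0. With a bracketed style such as %s[%d], it would stay as state[2], state[1], state[0].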

#Selects the Verilog style for unconnected instance pins. default is to write out dummy wires for unconnected instance pins. ex: for this line in original RTL: DELAY1 DL (.A(A2)); //DL module has 1 i/p port and 1 o/p port which is not coded in RTL.
#full => Put UNCONNECTED for nets connecting unconnected instance pins in gate netlist. ex: DELAY1 DL(.A (A2), .Z (UNCONNECTED));
#partial => Put the unconnected instance pins in gate netlist, but no wire to connect to it. ex: DELAY1 DL(.A (A2), .Z ());
#none => do nothing. ex: DELAY1 DL (.A(A2));
set_attribute write_vlog_unconnected_port_style  partial / => remove  UNCONNECTED nets from pins.

#set_attribute tns_opto true / => turn ON TNS
 
##set_attribute wireload_mode <value> /
set_attribute information_level 7 /

set_attribute hdl_track_filename_row_col true / => To include the RTL file name and line number  at  which  the  DFT violation occurred in the messages produced by check_dft_rules

#clk gating set for 3 or more flops
set_attribute lp_insert_clock_gating true /
set_attribute lp_clock_gating_min_flops 3 /
set_attribute lp_clock_gating_prefix CLK_GATE /

#do not merge equiv flops and latches
set_attribute optimize_merge_flops false /
set_attribute optimize_merge_latches false /

#optimize const flops. By default, set to true so that const 0/1 can be propagated thru flops, thus allowing removal of flops.
#set_attr optimize_constant_0_flops false
#set_attr optimize_constant_1_flops false

#use_tiehilo_for_const: consts are tied to hi/lo cells. This doesn't connect all 1'b1/1'b0 to tiehi/lo cells, so we use another cmd after synthesize to fix the remaining 1'b1/1'b0 problem. options:
#duplicate => Allows each constant assignment to be replaced with a tie cell.
#unique => Allows  only  one unique tie cell in the netlist. Treatment of the remaining constant assignments depends on settings of the remove_assigns and set_remove_assign_options
#none => Prevents the  replacement  of constants in the netlist with tie cells
set_attr use_tiehilo_for_const unique / => only 1 unique tie cell should be added.

Various other attrs can be set in the .synth_init file before running rc cmds.

------------------
run RC: script run_rc
rc -10.1-s202 -over -f ./tcl/top.tcl -logfile ./logs/top.log -cmdfile ./logs/top.cmd

#IMP: for getting help with cmds on rc,
rc:/> cdnshelp => brings up cdns help browser for that rev of tool
rc:/> man or help <cmd_name>. Tab key shows all possible completions.
rc:/> man lib_search_path => this will show the man page for the attr "lib_search_path"

# write_template => template script can be generated by running write_template with various options
write_template -outfile run.tcl -full => creates script with all basic cmd, dft, power, retiming. -simple creates a simple script.

#running scripts within RC: do a source or include with script file name.

top.tcl:
-------
#initial setup: (it has set SCAN_EXISTS 0 => choose b/w scan vs non-scan design)
#include/source other files
include tcl/setup.tcl => sets variables such as DESIGN, SYN_EFF, lib path, dir, etc. All lib paths, cap_table etc. are put in the .synth_init file, but can also be put here.
#source tcl/setup.tcl => we can also use source to include the file

#source tcl/analyze.tcl
#read verilog/vhdl/systemVerilog files, elaborate, check design, uniquify and then check for uniquification
#read_hdl <-v1995 | -v2001 | -sv | -vhdl> [list "$RTL_DIR/global.v"  ... " " ] => default lang is the one specified by the hdl_language attribute. The default value of hdl_language is -v1995. For -vhdl, the hdl_vhdl_read_version root attribute specifies the vhdl version; by default it's set to VHDL-1993.
read_hdl -v2001 [list "$RTL_DIR/global.v"  ... "$RTL_DIR/digtop.v" ]

#read_netlist design_struct.v => to read gate level netlist

elaborate $DIG_TOP_LEVEL => $DIG_TOP_LEVEL is set to digtop above. This elaborates the top-level design and all its references; we only specify the top level. It builds data structures, infers registers, performs HDL opt, and identifies clk gating and operand isolation candidates.

check_design -unresolved => checks for design problems such as unresolved references. Using -all checks for undriven/multidriven ports/pins, unloaded ports/pins, constant connected ports/pins and any assign stmts.

#uniquify not needed as design is uniquified by default.
/*
uniquify $DIG_TOP_LEVEL
#task to make sure design is uniquified
proc is_design_uniquified {} {
    foreach subd [find /des*/* -subdesign *] { => look in designs dir for all sub designs
        if {[llength [get_attr instance $subd]] > 1 } {
            puts "ERROR: design is NOT uniquified"
            return
        } else { return "design is uniquified" }
    }
}
is_design_uniquified => calling the actual procedure
*/

#provide constraints in SDC: 2 options: read sdc file directly by using read_sdc or enter constraints as in DC. eg.
#option 1: read_sdc ./tcl/env_constraints.tcl => reads all DC sdc cmds directly without any prefixing. Useful as the same file can be used in EDI/DC, etc. IMP: we have to write ./tcl and not tcl/, since RC assumes its virtual dir structure, so with tcl/ it looks for a tcl dir in the virtual dir, which isn't there, and it complains. With ./tcl, it looks in the unix tcl dir under the current dir.
#option 2: prefix all dc cmds with dc::, or change them to the RC equiv cmd. ex: dc::set_load ..... We can read these cmds anytime within the RC shell or put them in a file and source it: source tcl/env_constraints.tcl. However, the same file can't be used in synopsys tools as "dc::" is not an sdc cmd.

#env constraints: (see in sdc.txt for cmd details: some cmds in DC sdc file aren't std sdc cmd, so they have to be replaced with appr RC cmds).
option 1: read_sdc ./tcl/env_constraints.tcl => same file can be used in EDI
option 2: prepend sdc cmds with dc::.

env_constraints.tcl file: op_cond (PVT), load (both i/p and o/p), drive (only on i/p), fanout (only on o/p) and WLM. Of these, op_cond and WLM are already specified in the .synth_init file. dont_touch, dont_use directives are also provided here.
------
#i/p driver: use "set_driving_cell" as it's std sdc cmd
#set_attribute external_driver [find [find "MSL270_W_125_2.5_CORE.db" -libcell IV110] -libpin Y] [all_inputs] => DC cmd
set_driving_cell -lib_cell IV110 [all_inputs] => sdc cmd. use this for both RC/DC

#o/p load: use "set_load" as it's std sdc cmd. However, to automatically use i/p cap for IV110 as load cap for o/p ports, we need to use diff cmd in DC vs RC. Then we can use set_load.
if {$RUN_PNR ==1} {
set output_load 0.005
} else {
#set output_load [get_attribute capacitance "MSL270_W_125_2.5_CORE.db/IV110/A"] => get_attribute is native cmd for both RC/DC with different syntax, so it gives an error in RC. Also, it can't be used in EDI. For RC, we use "get_liberty_attribute" which is simpler.
set output_load [get_liberty_attribute capacitance [find [find "MSL270_W_125_2.5_CORE.db" -libcell IV110] -libpin A]] => get_liberty_attribute isn't supported in EDI. use this in RC only.
#set output_load  [get_attribute max_capacitance [find [find / -libcell MSL270_W_125_2.5_CORE/IV110] -libpin A]] => here get_attribute is used and full path of libcell is provided since we start search from top level virtual dir "/".
}

set output_load_4x [expr 4 * $output_load]
set_load -pin_load $output_load_4x [all_outputs]

write_set_load > ${_OUTPUTS_PATH}/net_cap.txt => shows load values for all nets in the design in set_load format. Since set_load is an sdc cmd, values are shown in pF. Run this in RC to make sure the units are shown correctly.

#set_dont_use
read_sdc ./tcl/dont_use.tcl

#set_dont_touch
read_sdc ./tcl/dont_touch.tcl

#write out HDL in cadence primitives, before doing synthesis
write_hdl    > ./netlist/${DESIGN}.PreSyn.v

#initial synthesis
synthesize -to_generic -eff low -no_incr => opt mux and datapath and stops before mapping. It contains tech independent components. It does const propagation, resource sharing, logic speculation, mux opt, CSA (carry save adder) opt. -no_incr allows it to opt logic from scratch.
synthesize -to_mapped  -eff low -no_incr => maps design to cells in tech lib and optimizes it. It evaluates every cell in design and resizes to improve area and power. If -incr option is used, then it runs DRC, timing, area cleanup and critical region resynthesis to meet timing. -incr preserves current impl and performs opt only if there is an improvement in overall cost of design. -to_mapped is default option.

#when we synthesize with map, we see "global mapping target info" on screen and in log file. In each cost group, RC will estimate a target slack number based on the design structure, libraries, and design constraints. This slack number is the estimated slack on the worst path of a cost group seen before mapping. During mapping, RC will try to structure logic, and select cells to bring this target slack number close to 0.

#puts "Runtime & Memory after initial synthesize"
#timestat MAPPED

generate_reports -outdir $_REPORTS_PATH -tag ${DESIGN}.initial => reports area, gate, timing in separate files.
#report area > $_REPORTS_PATH/${DESIGN}.initial_area.rpt => no need of this cmd, as area already reported by above cmd

write_hdl  > ${_NETLIST_PATH}/${DESIGN}_initial.v

#### design constraints => case_analysis, i/p,o/p delays, clocks/generated clocks, false/multicycle paths
if {$SCAN_EXISTS} {
read_sdc ./tcl/case_analysis.tcl => set_case_analysis only if scan exists, to force the part into functional mode. We want simple functional timing paths, and no paths for scan_mode. Strictly speaking, this stmt is not required.
#case_analysis.tcl
#set_case_analysis 0 scan_mode_in => force scan_mode to 0 so that we see timing paths b/w diff clocks. We are not interested in timing when the part is in scan mode.
}
read_sdc ./tcl/constraints.tcl => has i/p, o/p delays
#constraints.tcl
set_input_delay  0.2 -clock clk1 [all_inputs]
set_output_delay 0.4 -clock clk1 [all_outputs]

#clocks (set_drive and create_clock/create_generated_clock for all clks).
read_sdc ./tcl/clocks.tcl
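#for reference, a minimal clocks.tcl might look like this (a sketch; the port/pin names are made up, and the period value assumes a ps-based timing lib, since sdc periods follow the .lib time unit):
create_clock -name clk1 -period 40000 [get_ports OSC_CLK] => 25MHz main clk (40ns period)
create_generated_clock -name clk1_div2 -source [get_ports OSC_CLK] -divide_by 2 [get_pins Iclkgen/clk_div_reg/Q] => div-by-2 clk at the divider flop o/p
set_drive 0.1 [get_ports OSC_CLK] => drive res on the clk port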

#we don't set uncertainty in clocks.tcl, since we use that file in EDI, where we want to use real clk delays)
set_clock_uncertainty $SPI_SCK_skew SPI_SCK

#turn off clk gating if not wanted (in .synth_init we set clk gating to true)
set_attribute lp_insert_clock_gating false /

#read false_paths/multi-cycle paths
read_sdc ./tcl/false_paths.tcl
read_sdc ./tcl/multicycle_paths.tcl

#to prevent any logic changes on instances of specified cells.
#map_size_ok => Allows  resizing, unmapping, and remapping of a mapped sequential inst during opt,  but not renaming or deleting it.
#size_ok     => Allows resizing a mapped inst during  opt, but not deleting, renaming, or remapping it.
    
set_attr preserve map_size_ok [find I_S1_CONTROL -instance  instances_seq/sm_reg*]
#set_attr preserve true <object> => Prevents logic changes to the specified object during opt. Use only where needed.

# Incremental Compile with high effort
source tcl/compile.tcl

###compile.tcl has following cmds.
#report worst case timing by setting this variable:
set_attribute map_timing true /

## Define cost groups (clock-clock, clock-output, input-clock, input-output)
if {[llength [all::all_seqs]] > 0} {
  define_cost_group -name I2C -design $DESIGN
  define_cost_group -name C2O -design $DESIGN
  define_cost_group -name C2C -design $DESIGN
  path_group -from [all::all_seqs] -to [all::all_seqs] -group C2C -name C2C
  path_group -from [all::all_seqs] -to [all::all_outs] -group C2O -name C2O
  path_group -from [all::all_inps]  -to [all::all_seqs] -group I2C -name I2C
}

define_cost_group -name I2O -design $DESIGN
path_group -from [all::all_inps]  -to [all::all_outs] -group I2O -name I2O

#report all failed cmds when reading sdc
echo "failed sdc cmds" > $_REPORTS_PATH/${DESIGN}.after_constrain.rpt
echo $::dc::sdc_failed_commands >> $_REPORTS_PATH/${DESIGN}.after_constrain.rpt

echo "The number of exceptions is [llength [find /designs/$DESIGN -exception *]]" >> $_REPORTS_PATH/${DESIGN}.after_constrain.rpt

report timing -lint -verbose >> $_REPORTS_PATH/${DESIGN}.after_constrain.rpt => reports possible timing problems in the design, such as ports that have no external delays (unclocked primary I/O), unclocked flops, multiple clocks propagating to the same clock pin, timing exceptions that cannot be satisfied, timing exceptions overwriting other timing exceptions, constraints that may have no impact on the design, and so on.

#incremental synthesis
synthesize -to_mapped -eff high -incr

#IMP: we might have 1'b0 and 1'b1 in logic at this time. To connect them to tiehi/tielo cells, run this:
insert_tiehilo -all -hilo TO020L -verbose [find -design *] => for both hi/lo connections, same cell used. verbose shows info on screen, as to which 1'b1/1'b0 are still not tied. -all does it for all including scan cells. If we put "-hi TO020 -lo TO020", then tool connects hi connections to one instance of TO020 (to HI pin. LO pin is left floating) and lo connections to another instance of TO020 (to LO pin. HI pin is left floating). So, this results in 2 copies of TO020 cells. By using "-hilo TO020", we use same instance for hi and lo connections.

#reports
generate_reports -outdir $_REPORTS_PATH -tag ${DESIGN}.incremental
summary_table -outdir $_REPORTS_PATH

report timing  -num_paths 500 >> $_REPORTS_PATH/${DESIGN}.all_timing.rpt
foreach cg [find / -cost_group -null_ok *] {
  report timing -cost_group [list $cg] -num_paths 100 > $_REPORTS_PATH/${DESIGN}.[basename $cg]_timing.rpt
}

report area > $_REPORTS_PATH/${DESIGN}.compile.rpt
report design_rules >> $_REPORTS_PATH/${DESIGN}.compile.rpt
report summary >> $_REPORTS_PATH/${DESIGN}.compile.rpt => reports area, timing and design rules.

#optional reports
report messages >> $_REPORTS_PATH/${DESIGN}.compile.rpt => reports summary of error msg
report qor     >> $_REPORTS_PATH/${DESIGN}.compile.rpt
report gates -power >> $_REPORTS_PATH/${DESIGN}.compile.rpt => reports libcells used, total area and instance count
report clock_gating >> $_REPORTS_PATH/${DESIGN}.compile.rpt
report power -depth 0 >> $_REPORTS_PATH/${DESIGN}.compile.rpt
report datapath >> $_REPORTS_PATH/${DESIGN}.compile.rpt => datapath resource report

#write results
write_design -basename ${_OUTPUTS_PATH}/${DESIGN}
write_script > ${_OUTPUTS_PATH}/${DESIGN}.script
write_hdl  > ${_NETLIST_PATH}/${DESIGN}.v => final non-scan netlist

####### Insert Scan
if {$SCAN_EXISTS} { => see synthesis_DC.txt for details on this
set_ideal_network [get_ports scan_en_in]
set_false_path -from scan_en_in

source tcl/insert_dft.tcl
}

#insert_dft.tcl has following
source ./tcl/scan_constraints.tcl

#scan_constraints has following:
set_attribute dft_dont_scan true [ list Idigcore/IResetGen/nReset_meta1_reg \
                                        Idigcore/IResetGen/nReset_meta2_reg ]

set_attr dft_scan_style muxed_scan / => muxed_scan style
set_attribute dft_prefix DFT_ / => prefix dft with DFT_

# For VDIO customers, it is recommended to set the value of the next two attributes to false.
set_attribute dft_identify_top_level_test_clocks false /
set_attribute dft_identify_test_signals false /

set_attribute dft_identify_internal_test_clocks false /
set_attribute use_scan_seqs_for_non_dft false /

set_attribute dft_scan_map_mode tdrc_pass "/designs/$DESIGN"
set_attribute dft_connect_shift_enable_during_mapping tie_off "/designs/$DESIGN"
set_attribute dft_connect_scan_data_pins_during_mapping loopback "/designs/$DESIGN"
set_attribute dft_scan_output_preference auto "/designs/$DESIGN"
set_attribute dft_lockup_element_type preferred_level_sensitive "/designs/$DESIGN"
#set_attribute dft_mix_clock_edges_in_scan_chains true "/designs/$DESIGN"

---
### define clocks, async set/reset, SDI, SDO, SCAN_EN and SCAN_MODE.
##all dft cmds have these common options:
#define_dft <test_mode | test_clock | shift_enable | scan_chain> -name <testObject> <port or pin name> -create_port -hookup_pin <pin_name> -hookup_polarity <inverted|non_inverted> -shared_in -shared_out

#<port or pin name>: we provide the driving port_or_pin_name. However, that works only if we code the RTL in a way where the top-level port can directly be used as SE, SCLK, SDI, SDO. In many cases, functional pins are multiplexed to serve as scan pins, so directly using the port name would be incorrect. For ex, if spi_cs_n is used as scan_shift_en (during scan_mode), spi_cs_n is anded with scan_mode to generate scan_shift_en, which is then connected to the SE pin of all flops. In this case, the internal scan_shift_en needs to be used for SE, so we add the option "-hookup_pin B/scan_shift_en_int" so that the tool connects this pin to the SE of all flops. When you specify this option, the RC-DFT engine does not validate the controllability of any logic between the top-level test-mode signal and its designated hookup pin under the test-mode setup (i.e. whether the hookup pin can be set to the desired value by toggling the i/p port or not). The way RTL is coded in our group, we drive the pin out and then drive it back in as a dedicated pin for scan purposes (for ex scan_enable_out and scan_enable_in pins); then we don't need the -hookup_pin option. Look in DFT compiler notes (pg 1 back).
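#putting the hookup discussion together in one cmd (a sketch; spi_cs_n and B/scan_shift_en_int are the hypothetical names from the multiplexed shift-enable example above):
define_dft shift_enable -name scan_en -active high -hookup_pin B/scan_shift_en_int spi_cs_n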

#-shared_in is used to indicate that i/p port is used for functional port also. similarly -shared_out is used to indicate that o/p port is used for functional port also. By default, the signal applied to the specified driving pin or port is considered to be a dedicated test signal. By specifying these, we ensure that these test signals will not get constrained in the write_do_lec dofile. Not specifying this option for a shared test signal will result in overconstraining the write_do_lec dofile (by forcing that input port to inactive state) which can lead to false EQs.

#-no_ideal marks the test signal as non-ideal which allows buffering in RC. By default, it's treated as ideal.

----
#force pins for test mode: i.e async set/reset need to be in inactive state, while SCAN_MODE needs to be high.
#define_dft test_mode -name <testModeObject> -active <high|low> -no_ideal -scan_shift <port_or_pin_name> [-create_port] [-shared_in] -hookup_pin <pin_name> -hookup_polarity <inverted|non_inverted>
define_dft test_mode -name scan_mode -active high scan_mode_in
#define_dft test_mode -name scan_reset -active high n_reset => we don't define async set/reset since we force them to 0 when scan_mode=1 (in the RTL itself). If we need to toggle n_reset during scan test for more coverage, then we need the -scan_shift option, which holds the scan signal to its test-mode active value during the scan shift operation of the tester cycle, but otherwise allows it to pulse during the capture cycle (the test signal will be treated as a non-scan clock signal by the ATPG tool). The -scan_shift option is also needed to generate a correct lec.do file, else the n_reset pin will get constrained, which will lead to false EQs.

#now define scan_clk, scan_shift_en, scan_data_in and scan_data_out for each chain. Note that these scan pins are multiplexed with normal functional pins, so the -hookup_pin option is used.
#define_dft test_clock -name <testClockObject> -domain <testClockDomain> -period <delay in pico sec, default 50000> -rise <integer> -fall <integer> <portOrpin> -hookup_pin <pin_name> -controllable => Defines a test clock and associates a test clock waveform with the clock. If you do not define test clocks, the DFT rule checker automatically analyzes the test clocks and creates these objects with a default waveform. -hookup_pin specifies the core-side hookup pin to be used for the top-level test clock during DFT synthesis.
#-controllable => when specifying an internal pin for a test clock, this option indicates that the internal clock pin is controllable in test mode (for example, Built-In Self-Test (BIST)). If you do not specify this option, the rule checker must be able to trace back from the internal pin to a controllable top-level clock pin. If you specify an internal pin as being controllable, you need to ensure that this pin can be controlled for the duration of the test cycle. The tool will not validate your assumption.
#-domain => specifies the DFT clock domain associated with the test clock. Clocks belonging to the same domain can be mixed in a chain. If you omit this option, a new DFT clock domain is created and associated with the test clock. Flip-flops belonging to different test clocks in the same domain can be mixed in a chain; lockup elements can be added between the flip-flops belonging to different test clocks.

define_dft test_clock -name scan_clk -domain scan_clk -period 100000 -rise 40 -fall 80 SCLK => scan_clk defined at port SCLK with a period of 100ns (10 MHz). Rise happens at 40% of the clk period and fall at 80%, so rise is at 40ns and fall at 80ns, assuming the clk starts at 0ns. This test clk can be referred to as scan_clk from now on (the name is helpful to search for the test clk, look it up in reports, etc). We don't specify hookup_pin since in RTL we force the i/p clk pin to go to all flops in scan_mode (by using a mux).
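The waveform arithmetic above (rise/fall given as percentages of the period, period in ps) can be sanity-checked in plain Python; this sketch just restates the note's own numbers, nothing here is tool syntax:

```python
# Convert define_dft test_clock -rise/-fall percentages into absolute edge times.
# Assumes, as the note above does, that -rise/-fall are percent offsets from the
# start of the clock period, and that -period is in picoseconds.

def edge_times_ps(period_ps: int, rise_pct: int, fall_pct: int) -> tuple[int, int]:
    """Return (rise_ps, fall_ps) measured from the start of the period."""
    return period_ps * rise_pct // 100, period_ps * fall_pct // 100

rise, fall = edge_times_ps(100_000, 40, 80)   # -period 100000 -rise 40 -fall 80
freq_mhz = 1_000_000 / 100_000                # MHz = 1e6 / period_ps
print(rise, fall, freq_mhz)                   # 40000 80000 10.0
```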

#define_dft shift_enable -name <shiftEnableObject> -active <high|low> <portOrpin_name> -hookup_pin <pin_name> [-create_port] => specifies the name and active value for the shift_en signal. The active value is propagated during dft rule checking. The input signal can be defined on a top-level port or an internal driving pin. hookup_pin is the internal pin which is the actual scan_en that should go to all flops.
define_dft shift_enable -name scan_enable  -active high SCAN_EN_IN => SCAN_EN_IN is defined as shift_enable and referred to as "scan_enable". Here, the RTL is coded so that scan_en_out comes back in as an input port named SCAN_EN_IN, so no need for hookup_pin.

#define_dft scan_chain -name <ChainName> -sdi <topLeveLSDIPort> -sdo <topLevelSDOPort> [-hookup_pin_sdi <coreSideSDIDrivingPin>] [-hookup_pin_sdo <coreSideSDOLoadPin>] [-shift_enable <ShiftEnableObject>] [-shared_output | -non_shared_output] [-terminal_lockup <level | edge>] => -hookup_pin_sdi/sdo specifies the core side hookup pin to be used for the scan data input/output signal during scan chain connection. -shift_enable designates a chain-specific SE signal, else the default shift_enable signal is used. -shared_output specifies that a mux be inserted in the scan data path by the connect_scan_chains cmd, since a functional o/p port is being used as the SDO port.

define_dft scan_chain -name chain1 -sdi spi_mosi  -sdo spi_miso -shared_output => sdi and sdo defined

###end of scan_constraints.tcl file
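As an aside: once chains are defined, scan test time is dominated by the longest chain. A rough back-of-the-envelope sketch of the usual cycle count (the flop/pattern numbers below are hypothetical, and the overlap formula is the standard textbook approximation, not tool output):

```python
import math

def scan_test_cycles(num_flops: int, num_chains: int, num_patterns: int) -> int:
    """Rough tester-cycle count: per pattern, shift in through the longest chain
    plus one capture cycle; shift-out of each response overlaps the next
    pattern's shift-in, leaving one final full shift-out at the end."""
    max_chain_len = math.ceil(num_flops / num_chains)
    return num_patterns * (max_chain_len + 1) + max_chain_len

# Hypothetical numbers: 2000 scan flops, 2 chains, 500 patterns
print(scan_test_cycles(2000, 2, 500))  # -> 501500
```

Doubling the chain count roughly halves the shift length, which is why tools balance chains.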

# DFT DRC Checking
check_dft_rules       > $_REPORTS_PATH/${DESIGN}_dft.rpt => look at hdl_track_filename_row_col attr.

report dft_registers >> $_REPORTS_PATH/${DESIGN}_dft.rpt
report dft_setup     >> $_REPORTS_PATH/${DESIGN}_dft.rpt

check_design -multidriven
check_dft_rules -advanced                                     >> $_REPORTS_PATH/${DESIGN}_dft.rpt
report dft_violations -tristate -xsource -xsource_by_instance >> $_REPORTS_PATH/${DESIGN}_dft.rpt

#fix dft violations before proceeding (either by modifying RTL or using auto fixing)
fix_dft_violations

# To turn off sequential merging on the design, use the following attributes.
set_attribute optimize_merge_flops false /
set_attribute optimize_merge_latches false /

#synthesize to map regular FFs to scan FFs (the define_dft cmds above force the synthesize cmd to use scan FFs instead of non-scan FFs; there is no separate scan option to synthesize)
synthesize -to_mapped -no_incr -auto_identify_shift_register => shift registers are auto identified so that they are not replaced by scan cells

#Build the full scan chains.
connect_scan_chains -preview => shows how the scan chains will be connected, but makes no changes yet to the netlist.
connect_scan_chains -auto_create_chain => connects scan FFs which pass DFT rules into scan chains. The -auto_create_chain option allows the tool to create more chains, if needed, than what has been specified thru the define_dft cmd.
report dft_chains > $_REPORTS_PATH/${DESIGN}_SCAN_Chain.txt

delete_unloaded_undriven -force_bit_blast -all digtop => remove unconnected ports in the design
set_attribute remove_assigns true => remove assigns & insert tiehilo cells during Incremental synthesis
set_attribute use_tiehilo_for_const duplicate

#incremental synthesis only if needed
#synthesize -to_mapped -eff low -incr

#IMP: we might have 1'b0 and 1'b1 in logic after scan synth. To connect them to tiehi/tielo cells, run this:
insert_tiehilo -all -hilo TO020L -verbose [find -design *]

#reports after scan insertion
report dft_setup > $_REPORTS_PATH/${DESIGN}-DFTsetup_final
write_scandef > ${DESIGN}-scanDEF
#write_atpg [-stil|mentor|cadence] > ${DESIGN}-ATPG
write_atpg -stil > ${DESIGN}-ATPG
write_dft_abstract_model > ${DESIGN}-scanAbstract
write_hdl -abstract > ${DESIGN}-logicAbstract
write_script -analyze_all_scan_chains > ${DESIGN}-writeScript-analyzeAllScanChains
## check_atpg_rules -library <Verilog simulation library files> -compression -directory <Encounter Test workdir directory>
## write_et_bsv -library <Verilog structural library files> -directory $ET_WORKDIR
## write_et_mbist -library <Verilog structural library files> -directory $ET_WORKDIR -bsv -mbist_interface_file_dir <string> -mbist_interface_file_list <string>
## write_et_atpg -library <Verilog structural library files> -compression -directory $ET_WORKDIR
write_et_atpg -library  /db/pdk/lbc7/rev1/diglib/msl270/r3.0.0/verilog/models/*.v  -directory $ET_WORKDIR

#final reports
generate_reports -outdir $_REPORTS_PATH -tag ${DESIGN}.scan
summary_table -outdir $_REPORTS_PATH

report timing  -num_paths 500 >> $_REPORTS_PATH/${DESIGN}.all_timing.scan.rpt
foreach cg [find / -cost_group -null_ok *] {
  report timing -cost_group [list $cg] -num_paths 100 > $_REPORTS_PATH/${DESIGN}_scan.[basename $cg]_timing.rpt
}

report area > $_REPORTS_PATH/${DESIGN}.scan.compile.rpt
report design_rules >> $_REPORTS_PATH/${DESIGN}.scan.compile.rpt
report summary >> $_REPORTS_PATH/${DESIGN}.scan.compile.rpt => reports area, timing and design rules.

#optional reports
report messages > $_REPORTS_PATH/${DESIGN}.scan.compile.rpt
report qor >> $_REPORTS_PATH/${DESIGN}.scan.compile.rpt
report gates -power >> $_REPORTS_PATH/${DESIGN}.scan.compile.rpt
report clock_gating >> $_REPORTS_PATH/${DESIGN}.scan.compile.rpt
report power -depth 0 >> $_REPORTS_PATH/${DESIGN}.scan.compile.rpt
report datapath > $_REPORTS_PATH/${DESIGN}.scan.compile.rpt

write_design -basename ${_OUTPUTS_PATH}/${DESIGN}_scan
write_script > ${_OUTPUTS_PATH}/${DESIGN}_scan.script
write_hdl  > ${_NETLIST_PATH}/${DESIGN}_scan.v => final scan netlist

-- end of insert_dft.tcl

#write sdc and do files
write_sdc > sdc/constraints.sdc

#Write do file for LEC where the RTL is compared with the final netlist. Only revised is specified, since RTL is taken as golden; otherwise we need to specify "-golden_design <RTL_files>".
if {$SCAN_EXISTS} {
write_do_lec -revised_design ${_NETLIST_PATH}/${DESIGN}_scan.v -logfile ${_LOG_PATH}/rtl2final.lec.log > ${_OUTPUTS_PATH}/rtl2final.lec.do
write_et_atpg -library /db/pdkoa/1533c035/current/diglib/pml48/verilog/models => write Encounter Test ATPG scripts in et_scripts dir to generate patterns
} else {
write_do_lec -revised_design ${_NETLIST_PATH}/${DESIGN}.v -logfile ${_LOG_PATH}/rtl2final.lec.log > ${_OUTPUTS_PATH}/rtl2final.lec.do
}

puts "Final Runtime & Memory."
timestat FINAL
puts "============================"
puts "Synthesis Finished ........."
puts "============================"

#################################
#for scan mapping, use this section
#################################
define_dft test_mode -shared_in -active high $TESTSCANMODE
set_attribute dft_dont_scan true [find / -inst I_WRAPPER/scanmode_r*]
set_attribute dft_dont_scan true [find / -inst I_WRAPPER/clked_nt_result*]

#NOTE: bus indices are braced ({I_GPIO_Y[1]}) so that Tcl doesn't treat [1] as command substitution.
define_dft shift_enable  -name SE \
                         -active high \
                         -hookup_pin [find / -pin I_WRAPPER/SCANEN] \
                         [find / -port {I_GPIO_Y[1]}]
define_dft test_clock    -name SCANCLOCK \
                         -period 100000 -fall 40 -rise 60 \
                         [find / -port {I_GPIO_Y[0]}]
#define_dft test_mode     -scan_shift -name RESET -active high \
#                         [find / -port I_XRESET]

define_dft scan_chain    -name chain1 \
                         -sdi [find / -port {I_GPIO_Y[2]}] \
                         -sdo [find / -port {O_GPIO_A[3]}] \
                         -hookup_pin_sdi [find / -pin I_WRAPPER/SI1] \
                         -hookup_pin_sdo [find / -pin I_WRAPPER/SO1] \
                         -shared_output

define_dft scan_chain    -name chain2 \
                         -sdi [find / -port {I_GPIO_Y[4]}] \
                         -sdo [find / -port {O_GPIO_A[5]}] \
                         -hookup_pin_sdi [find / -pin I_WRAPPER/SI2] \
                         -hookup_pin_sdo [find / -pin I_WRAPPER/SO2] \
                         -shared_output


set_attribute dft_min_number_of_scan_chains 2 [find / -design $DIGTOPLEVEL]
#set_attribute dft_mix_clock_edges_in_scan_chains true [find / -design $DIGTOPLEVEL]
################################################################################
## dft_drc is used instead of check_test command
################################################################################
check_dft_rules > ./reports/check_dft_rules.rpt

############################################



Tool messages you may see around scan mapping:

Scan mapping: converting flip-flops that pass TDRC.
Scan mapping: bypassed. You have to either
  1) set attribute 'dft_scan_map_mode' to 'tdrc_pass' and run 'check_dft_rules', or
  2) set attribute 'dft_scan_map_mode' to 'force_all'.

Scan mapping bypassed because no TDRC data is available: either command 'check_dft_rules' has not been run or TDRC data has been subsequently invalidated.

#for scan
connect_scan_chains

---------------------------------------------------------------------------

For synthesis which involves multiple power domains:
----------

read_power_intent -module TOP -cpf "../TOP.cpf"
redirect chk.cpf.detailed.rpt "check_cpf -detail"
commit_power_intent
verify_power_structure -lp_only -pre_synthesis -detail > $_REPORTS_PATH/digtop_verify_power.rpt

write_cpf -output_dir ${_OUTPUTS_PATH} -prefix ${DESIGN}_
write_power_intent -base_name ${_OUTPUTS_PATH}/TOP_m -cpf -design TOP


DC (Design Compiler):  This is the synthesis tool from Synopsys, which takes RTL as input and generates a synthesized netlist.


For running synthesis in Design Compiler:
-------------------------------------------------------------------
In synthesis, clk and scan_enable are set as ideal networks, so they don't get buffered (they get buffered in PnR). Reset and all other pins are buffered as needed to meet DRC. The reset tree built in DC is rebuilt in PnR during placement to make sure it meets recovery/removal checks.

steps in DC synthesis are as follows:


1. RTL opt: HDL Compiler compiles the HDL (performs translation and architectural opt of the design). DC translates the HDL description to components extracted from the GTECH (generic tech) and DW (DesignWare) libs; this is called RTL opt. GTECH consists of basic logic gates and flops, while DW contains complex cells such as adders, comparators, etc. These are technology independent.
2. Logic opt: DC then does logic opt. First it does structuring, which adds intermediate variables and logic structure to the GTECH netlist. Then it does flattening, which converts combo logic paths into a 2-level SOP representation. At this stage, all intermediate variables and their associated logic structure are removed.
3. Gate opt: optimizes and maps the GTECH design to a specific tech lib (known as the target lib). It's constraint driven: it does delay opt, design rule fixing and area opt. Power Compiler is used if static/dynamic power opt is done.
4. Add DFT: next, test synthesis is done using DFT Compiler, which integrates test logic into the design.
5. Place and Route (PnR): PnR is done next, from which delays can be back-annotated to the design. DC can then resynthesize to generate a better netlist.

The operating condition for any chip is defined via 3 parameters: Process (P), Voltage (V) and Temperature (T). Since these 3 uniquely determine the speed of a transistor, we choose a particular PVT corner for running synthesis. Usually we define 3 PVT corners (the ex below is for a design in 250nm). The terms max, min, etc. refer to delay, so the max corner means the corner with maximum gate delay, i.e. the slowest corner.

NOM: P=TYP, V=1.8V, T=25C (TYP) => This is the typical or nominal corner, where the chip is supposed to run at nominal speed. Here PVT is specified as nominal process, 1.8V and room temperature.
MAX: P=WEAK, V=1.65V, T=150C (WC) => This is the worst case corner, where the chip runs at its slowest speed. Here PVT is specified as weak (slow) process, 1.65V and high temperature. MAX implies this PVT gives you maximum delay (i.e. slowest speed). Note that the voltage is about 10% below typ. This is generally a safe voltage to choose, as the supply is not supposed to fluctuate by more than +/- 10% even in worst case scenarios (voltages are usually controlled by a PMU, which holds voltage levels very tight; most of the voltage fluctuation comes from IR drop on and off chip).
MIN: P=STRONG, V=1.95V, T=-40C (BC) => This is the best case corner, where the chip runs at its fastest speed. Here PVT is specified as strong (fast) process, 1.95V and low temperature. MIN implies this PVT gives you minimum delay (i.e. fastest speed). The voltage here is usually ~10% above typ.

Since we want our design to be able to run in worst possible scenario, we choose WC (MAX) corner to synthesize our design. Then, our design is guaranteed to work across all OP conditions.
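The corner selection logic above can be restated as data (PVT values copied from the 250nm example; the relative_delay numbers are made-up placeholders, only there to rank the corners):

```python
# PVT corners from the 250nm example above. The relative_delay values are
# hypothetical placeholders (not from the source), only there to rank corners.
corners = {
    "NOM": {"process": "TYP",    "voltage": 1.80, "temp_c":  25, "relative_delay": 1.00},
    "MAX": {"process": "WEAK",   "voltage": 1.65, "temp_c": 150, "relative_delay": 1.40},
    "MIN": {"process": "STRONG", "voltage": 1.95, "temp_c": -40, "relative_delay": 0.70},
}

# Synthesize at the corner with maximum gate delay (slowest silicon): a design
# that meets setup there is guaranteed to meet it at NOM and MIN as well.
synthesis_corner = max(corners, key=lambda c: corners[c]["relative_delay"])
print(synthesis_corner)  # -> MAX
```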

run DC:


dc_shell-t -2010.03-SP5 => brings up the DC shell. dc_shell is shell mode (DC's own shell), while dc_shell-t is tcl mode (a dc shell which can accept tcl cmds too). dc_shell-xg-t is XG mode, which uses optimized memory mgmt to reduce run time.
dc_shell-t -2010.03-SP5 -f ./tcl/top.tcl | tee logs/my.log => runs with cmds in top.tcl and keeps all info printing on screen to my.log
dc_shell-t -2010.03-SP5 -t topo -f ./tcl/top.tcl => To bring dc-shell-t in topo mode. This requires MilkyWay (MW) db. See section in synopsys ICC (PnR_ICC.txt)

When we run in DC shell above, it's a text based shell. We can also have GUI.
Design Vision is GUI for synopsys synthesis env. symbol lib is needed to generate design schematic. To start gui, either run "dc_shell -gui", or from within dc_shell, run "gui_start".
DC family:
1. DC Expert (compile cmd used).
2. DC Ultra (compile_ultra cmd used).

Help in DC: type "help" or "cmd_name -help" or "man cmd_name"


setup file for DC:


We have a setup file that DC reads before bringing up the DC shell. This file is .synopsys_dc.setup and is usually put in the dir from where DC is invoked. It sets up search paths, lib paths and other variables. Note this file can be copied from some other project by using: cp dir1/.synopsys_dc.setup dir2/.

.synopsys_dc.setup => This file can have all the common settings that you want to apply to your design. It can source other tcl files or set parameters for DC. At a minimum, it needs to set search_path, target_library and link_library.


set search_path "$search_path /db/pdk/tech45nm/.../synopsys/bin" => adds this path to default path to search for design and lib files

set target_library TECH_W_125_1.6_STDCELLS.db => this lib, which should be present in the search path above, is used during compile to generate the gate level netlist. The worst case (wc) lib is chosen, as we try to meet setup for the wc corner. target_library is what the opt engine maps the design to, so it should have all stdcells required for mapping.

set link_library {* TECH_W_125_1.6_STDCELL.db }  => link_library (or link_path) is a superset of target_library. resolves references. First looks in DC mem (* means DC mem which has design files), then in specified .db (same as target_library files) for matching lib cell name and then any other libraries which are not target for opt, but may be present in design (as Macro, RAM cells). In DC, we don't need Clock cells (i.e buffers, inverters specifically made for clk tree), so in many companies, clk cells are all put in a separate library, so that we don't have to load unnecessary library cells during synthesis.

link libraries are synopsys .db files (liberty files in db format), and our designs are *.ddc/*.db files. We put *, so that on top of the liberty files, DC searches all the designs already loaded in memory (i.e. for a module named A in top.db, it searches in A.db before it looks for A in the .lib files). If we omit *, it will cause link failures, as hierarchical designs contain submodules that the tool would no longer be able to resolve.

NOTE: Most lib/db files have a file name matching the library name declared within the file, i.e. the library "TECH_W_125_1.6_STDCELL" is defined within the file "/db/tech45/.../TECH_W_125_1.6_STDCELL.db". "target_library" and "link_library" refer to the file names, which are resolved via search_path. We can also provide the full path name of the file so that search_path is not needed for finding target and link libraries.

ex: set target_library "/db/tech/.../TECH_W_125_1.6_STDCELLS.db"


NOTE: In PT, we use PnR netlist which has Clk cells, so we add db for clk cells also when running PT.

NOTE: if we have hard IP blocks, then db files for those blocks should be included in link_library, and paths for those should be in search path. That way, we don't have to provide RTL code for that IP. DC sees that cell name in the db file present in any of target and link lib, and on finding them there, it doesn't complain about missing cell.

Ex: sram2048x32 (sram cell). We instantiate "sram2048x32" in RTL file digtop.v and also have an rtl file (sram2048x32.v) for this module. Then, when running DC, we don't analyze and synthesize the rtl file "sram2048x32.v" (i.e. this verilog file is not provided in the list of RTL files). DC looks at the module name "sram2048x32" and tries to find this cell in the link_library. It finds the "cell (sram2048x32)" stmt in the "sram2048x32_W_125_1.65.db" file, which is present in the link library. At this point the tool is happy; otherwise it would search for a "sram2048x32" module in the other rtl files. This is similar to what happens if we instantiate a latch (LATCH10) directly in RTL: DC looks for that cell in target_library and link_library. It finds it in the "TECH_W_125_1.6_STDCELL.db" file as "cell (LATCH10)" and hence doesn't complain; otherwise it would look for a LATCH10 module in the RTL files being analyzed and, on not finding the module, it would complain.
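The linking story above can be sketched as a toy lookup. The names mirror the sram2048x32/LATCH10 example; the resolution order (designs in memory first, then link-library cells) is the point — this is an illustration, not Synopsys' actual algorithm:

```python
# Toy model of DC reference resolution: designs already loaded in memory are
# searched first (that's the "*" entry in link_library), then cells from the
# link_library .db files; anything else is an unresolved reference at link time.
# All names here mirror the example above and are illustrative.

def resolve_reference(ref: str, designs_in_memory: set, link_lib_cells: set) -> str:
    if ref in designs_in_memory:
        return "design-in-memory"
    if ref in link_lib_cells:
        return "link-library-cell"
    return "UNRESOLVED"

designs   = {"digtop", "utils"}                     # already analyzed/elaborated
lib_cells = {"sram2048x32", "LATCH10", "IV110"}     # from target/link .db files

print(resolve_reference("utils", designs, lib_cells))        # design-in-memory
print(resolve_reference("sram2048x32", designs, lib_cells))  # link-library-cell
print(resolve_reference("pll_core", designs, lib_cells))     # UNRESOLVED
```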


#symbol_library => defines symbols for schematic viewing of design.
#synthetic_library => to specify any specially licensed DW lib. Std. DW lib are included by default.

NOTE: only .db library can be read in DC. If we have library in .lib format, then we need to convert it to .db using cmds below and then use those.
#read_lib /db/.../rom_W_150_1.65.lib => This file will be read and stored as *.db file in mem. list_libs will now show this lib too as .db
write_lib rom_W_150_1.65.db -f db -o /db/.../rom_W_150_1.65.db => optional. This saves file in path specified so that next time .db file are directly available to be read by DC (saves run time??).

#if we want to do max/min timing using max/min lib, then we need to do as explained in create_views of PnR_ICC.txt.
#list_libs => lists which lib are used as max lib (denoted by M), and which for min lib (denoted by m). We should see all db library, and in which db file they are. dw_foundation.sldb, gtech and standard.sldb lib are also shown with their paths.
#report_lib => reports contents of lib as units, op cond, WLM and cells. Use this to see library units for cap, resistance, etc present in the library.
#which abc.db => shows the absolute path for this .db file that is being used currently.

Difference between target library and link library, and why do we need both?

Target libs are the libs that you target for mapping your RTL to gates. These are std cells provided as the target. DC chooses from amongst this set a subset of cells to optimize the final mapped design. On the other hand, link libs resolve references in the design by linking the instances/references in the RTL with the link libraries. So, in the link lib, we provide the target lib plus any IP such as memories, PLLs, analog blocks etc., which are needed strictly for linking. These IP libs are not needed for optimizing, just for linking (each typically contains just the 1 cell that we force to link). So, link libs contain target libs + extra macro libs.

So, the question is: why do we need both, when we specify the same libraries in target and link? The reason is that it's easier for the tool to have different lib settings for "OPTIMIZATION-MAPPING" & "LINKING". That way it knows what to pick from for optimizing and mapping, and what to use for strict one-to-one linking.


DC script:

Below is a sample DC script that can be used to run synthesis. We start with the top most file known as top.tcl.

top.tcl: this is the main tcl file that is sourced by the DC tool from cmd line. All DC cmds are in this file, and DC starts running cmds from this file until it reaches end of this file. These are the various sections of this script in tcl:


1. Read all RTL files, and link the library of cells/IP.


#source some other files
#NOTE: for source to work, the file path has to start with ./ so that it looks for that file in the unix dir, else DC will look for that file in its memory, which doesn't have that file, so it will error out.
source ./setup.tcl => In this file set some variables, i.e "set RTL_DIR /db/dir" "set DIG_TOP_LEVEL  digtop" or any other settings

#this is to suppress warnings during analyze/elaborate
suppress_message {"LINT-1" "LINT-2" "LINT-10" "LINT-33" "LINT-8" "LINT-45" "VER-130" }

#read verilog/vhdl/systemVerilog files. DC can also read in .ddc & .db (snps internal format, .ddc recommended), equation (snps equation format), pla (berkeley espresso PLA format) and st (snps state table format). 2 ways:
1. Analyze and elaborate => analyzes (compiles, checks for errors and creates an intermediate format) and elaborates the HDL design, and stores it in snps lib format for reuse. All subdesigns below the current design are analyzed, and then elaboration is performed only at the top level. During elaboration, DC builds data structures, infers registers in the design, performs high level HDL optimization, and checks semantics. It translates the design into a technology-independent (GTECH) design from the intermediate files produced during analysis, replaces HDL arithmetic operators in the code with DesignWare components, and automatically executes the link command, which resolves design references.
After elaboration, DC has internally created a data structure for the whole design, on which it can perform operations. cmds:

analyze -format verilog|vhdl [list a.v b.v] => on doing analyze, WORK dir created which has .pvl, .syn and .mr file for each verilog module. Runs PRESTO HDL Compiler for RTL files, and then loads all .lib files.
analyze -autoread [list a.v b.v c.vhd] => to auto analyze mix of verilog and vhdl files

elaborate <top level verilog module name, VHDL entity or VHDL configuration> => ex: elaborate digtop => loads gtech.db and standard.sldb from the synopsys lib, and the link libraries *_CORE.db and *_CTS.db from user defined libs, and then builds all modules. It infers memory devices (flip-flops; _reg is appended to the name of the net storing the value, i.e. net <= value) and analyzes case stmts (full case [all possible branches specified, so combinational logic is synthesized; for a non-full case, a latch is synthesized], parallel case [case items don't overlap, so a mux is synthesized; for a non-parallel case, priority checking logic is synthesized]).

2. read_file -f <verilog|vhdl|db|edif> filename => we can also use read_verilog, read_vhdl, read_db and read_edif, instead of specifying the file type in read_file. This can also be used to read in gate level netlists that are mapped to a specific tech. It also performs analysis and elaboration on HDL designs written in RTL format, but it elaborates every design read, which is unnecessary; only the top level design needs to be elaborated. read_file is useful if I want to reuse previously synthesized logic in my design.

#We use 1st way shown above: do analyze and elaborate and then set current_design
analyze -format verilog [list "/db/.../global.v" "/db/.../utils.v" ... "/db/.../digtop.v" ]
elaborate      digtop => since digtop is top level module
current_design digtop => current design always needs to be set to top level

#for design references during linking, DC uses the system variables link_library and search_path along with the design attribute local_link_library to resolve design references. The link library has library cells (from .lib) as well as subdesigns (modules inside the top level module) that the link cmd uses.
link => resolves references. and connects the located references to the design.

#To see the reference names, use the following command:
#get_references AN* => returns coll of instances that refer to AN2, AN3 etc. ex o/p = {U2 U3 U4}
dc_shell> report_cell [get_references AN*]  => shows references for AN2, AN3, etc for cells and the library to which it's linked. At this stage, lib is GTECH and all references are from this GTECH library. so, use * to see all references.
dc_shell> report_cell [get_references *]  => this shows all ref for cells present in top level design. If there is any logic stmt (i.e assign = A&B; etc) in top level, then it gets mapped to GTECH cells as GTECH_OR, GTECH_AND, etc and gets reported too.
Cell                      Reference       Library             Area  Attributes
--------------------------------------------------------------------------------
B_0                       GTECH_BUF       gtech           0.000000  c, u
C29                       *SELECT_OP_2.1_2.1_1            0.000000  s, u
C54                       GTECH_AND2      gtech           0.000000  u
ccd_top                   ccd_top                         4.000000  b, h, n, u
revid_tieoff              TO010           PML30_W_150_1.65_CORE.db  1.750000
--------------------------------------------------------------------------------
Total 42 cells                                            172.500000


2. specify constraints: env constraints (PVT), design constraints (area/max_fanout) & timing constraints (clks/false_paths)

constraints: IMP: all constraints are specified in sdc format. see sdc.txt for details of constraints. 2 sets of constraints:
1. env constraints = i/p driver, o/p load, i/p delay, o/p delay, dont_touch, dont_use
2. design constraints:
   A. design rule const: max_fanout, max_transition, max_cap
   B. optimization constraints:
      I. timing constraints = clks, generated clks, false paths, multicycle paths (if false_paths refer to the gate level netlist, then an initial mapped netlist is needed)
      II. power constraints = max_power
      III. area constraints = max_area


A. environment constraints: as op cond (PVT), load (both i/p and o/p), drive (only on i/p), fanout(only on o/p) and WLM.

#set_operating_conditions: see in PT OCV section for details of this cmd.
set_operating_conditions -max W_150_1.65 -library STD_W_150_1.65_CELL.db (instead of set_operating_conditions we can also use "set_max_library STD_W_150_1.65_CELL.db") => Here, we are using our max delay library for both setup/hold runs. We can check this by looking in reports/digtop.min_timing.rpt.

FIXME # LBC8/PML30 lib uses 1.8V PCH_D_1 and NCH_D_1 (Lmin=0.6um drawn), cell height=13.6um, 8routing tracks available, with 3,4,5 Layer for metal routing. 1X inv has i/p cap of 6ff. Power is about 0.1uW/gate/Mhz (CV^2f= 6ff*1.8^2*10^6/MHz = 0.15uW/MHz for inx1) FIXME

#WLM: wire load model: used only when design is not in physical mode.
set auto_wire_load_selection true
#set_wire_load_model "6K_3LM" => sets the wire load model on the current design to something other than the default one set in the .lib file. Usually for larger designs we set the WLM manually, since the default WLM may be for smaller designs, and so too optimistic.

# Setting enclosed wire load mode. mode may be top|enclosed|segmented
set_wire_load_mode enclosed => Here, multiple WLM are specified for various sub-modules, so for a net which traverses multiple sub-modules, WLM of that higher level module used which completely encompasses the net. When mode is "top", then WLM of top level module used for all nets in design. Since WLM is defined only for top level design above, WLM for lower level sub-modules are chosen as default when mode=enclosed or segmented.
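The "enclosed" selection rule can be illustrated with a toy common-prefix walk over hierarchical pin paths (paths and module names below are hypothetical; DC's real selection is internal to the tool):

```python
import os

# Toy illustration of "set_wire_load_mode enclosed": a net spanning several
# submodules gets the WLM of the lowest module that encloses all of its pins,
# i.e. the deepest common prefix of the pins' hierarchical paths.
# Instance paths and module names are hypothetical.

def enclosing_module(pin_paths):
    """Deepest hierarchy scope containing every pin of the net."""
    return os.path.commonpath(pin_paths)

net_pins = ["top/u_core/u_alu/U1/A", "top/u_core/u_dec/U7/Z"]
print(enclosing_module(net_pins))  # -> top/u_core (u_core's WLM would be used)
```

With mode "top", the answer would simply always be the top module's WLM.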

report_design => see the PT OCV section for details of this cmd. Shows all libs used, op cond used (PVT), WLM used, etc.

##### DC TOPO flow starts: see in PnR_ICC.txt for details. comment out the WLM portion above for DC-TOPO.

#create MW lib if one doesn't exist already. From next time, we can just open the created design lib.
create_mw_lib -technology /db/DAYSTAR/design1p0/HDL/Milkyway/gs40.6lm.tf \
    -mw_reference_library "/db/DAYSTAR/design1p0/HDL/Milkyway/pml48MwRefLibs/CORE /db/DAYSTAR/design1p0/HDL/Milkyway/pml48ChamMwRefLibs/CORE" \
    -open my_mw_design_lib
open_mw_lib my_mw_design_lib
set_check_library_options -cell_area -cell_footprint
check_library

#set TLU+ files instead of WLM.
set_tlu_plus_files \
    -max_tluplus /db/DAYSTAR/design1p0/HDL/Milkyway/tlu+/gs40.6lm.maxc_maxvia.wb2tcr.metalfill.spb.nlr.tlup \
    -min_tluplus /db/DAYSTAR/design1p0/HDL/Milkyway/tlu+/gs40.6lm.minc_minvia.wb2tcr.metalfill.spb.nlr.tlup \
    -tech2itf    /db/DAYSTAR/design1p0/HDL/Milkyway/mapping.file
check_tlu_plus_files

######DC-TOPO flow ends

#naming convention for lib objects varies b/w vendors, but for SNPS, it's "[file:]library/cell/[Pin]" (file and pin are optional). Ex: to access AND2 cell: set_dont_touch /usr/designs/Count_16.ddc:Count_16/U1/U5.

#i/p drives
set_driving_cell -lib_cell IV110 [all_inputs] => all i/p ports driven by IV110
#set_drive/set_input_transition

#i/p and o/p loads. (i/p load needed when there is extra load due to wire or extra fanout not captured in input gate cap)
set output_load    [get_attribute [get_lib_pins {"PML30_W_150_1.65_CORE.db/IV110/A"}] capacitance]
set output_load_4x [expr 4 * $output_load]
set_load $output_load_4x [all_outputs] => setting FO=4 load on all o/p pins. (set_load can be used on any net, port)
#NOTE: If we set the o/p load to be very high (i.e. 1pf), then all o/p ports will get driven by the largest INV/BUF, as other logic gates don't have the drive capability for such a high load. So, on such ports, isolation buffers may not be needed in the PnR flow, as buffers are already there in the synthesized netlist (if we do put buffers in the PnR flow, then we will have 2 buffers back to back on each port, resulting in area wastage).
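Spelling out the FO4 arithmetic from the set_load snippet above (the 6 fF figure is the 1X inverter input cap quoted earlier in these notes; treat it as illustrative):

```python
# FO4 output load, mirroring the "set_load [expr 4 * $output_load]" computation
# above. The 6 fF input cap is the 1X inverter figure quoted earlier in these
# notes and is illustrative only; real values come from the .lib pin capacitance.

def fo4_load_ff(input_cap_ff: float, fanout: int = 4) -> float:
    return fanout * input_cap_ff

print(fo4_load_ff(6.0))  # -> 24.0 (fF placed on each output port via set_load)
```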

global constraints:
-----------
#set_dont_use
set_dont_use PML30_W_150_1.65_CORE.db/LA* => don't use latches from lib
set_dont_use PML30_W_150_1.65_CORE.db/DTB* => don't use D-flops with preset and clr

#set_dont_touch => prevents specified object (net, instance, etc) from being modified during optimization.
Ex: set_dont_touch [get_cells {TWA/FF1}] => prevents the specified instance from being modified
Ex: set_dont_touch [get_nets -of_objects [get_cells {TWA/FF1}]] => i/o preserved for that cell.
set_dont_touch scan_inp_iso => prevents module instance from being modified.

B. design constraints: design rule and optimization constraints. For initial synthesis, we only provide env_constraint and not design constraints (as we just need gate mapping for RTL to write our false path file)
----------------
1. design rule constraints: usually provided in .lib. typical constraints are set_max_transition, set_max_fanout, set_max_capacitance. These constraints are associated with pins of the cells in lib, but eventually end up constraining nets of the design. DC prioritizes these over opt constraints, and tries not to violate them. clk nets and constant nets have design rule fixing disabled by default; scan nets do not. i/p ports of the design have max_cap determined by the cells driving them (via set_driving_cell in the sdc file), while o/p ports have max_cap determined by the cells driving them internally (the size of the cell driving an o/p port is picked based on the load on that port, set via set_load in the sdc file). o/p port max_cap is seldom violated because DC picks the right size gate to drive the o/p load. However, i/p port max_cap may be violated if we didn't pick the right size buffer to drive heavily loaded pins (such as i/p clk pin, reset pin, etc).
NOTE: A bidir pin is treated as both an i/p and an o/p pin, so it has a driver as well as a load. That makes it harder to meet the max_cap requirement of the external driver if the external driving buffer is not chosen properly while a large cap load is placed on the pin (it may easily meet the internal driver's max_cap requirement, as the tool can size the internal driver appropriately). It may also fail max_transition: if max_cap gets violated, then depending on how badly it failed, the external driving buffer's timing may need to be extrapolated for the excess cap load, resulting in a max_transition violation. To avoid this, choose an appropriate external driver for the bidir pin.
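As a minimal sketch of the fix suggested above (port and cell names here are hypothetical, not from this design): model a strong external driver and a realistic external load on the bidir pin, so the external max_cap/max_transition checks are realistic.

```tcl
#sketch: constrain a hypothetical bidir port "sda" (names are placeholders)
set_driving_cell -lib_cell BU140 [get_ports sda]  ;# model a strong external driver
set_load 0.5 [get_ports sda]                      ;# external cap load (lib units)
```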
###design rule const:  We don't set any DRC as all these are picked as per .lib

2. opt constraints: opt const for timing provided later during incremental compile.
set_max_area       0

----
#set_fix_multiple_port_nets: sets "fix_multiple_port_nets" attr on the design specified.
#This attribute controls whether compile inserts extra logic into the design to ensure that there are no feedthroughs, or that there are no two output ports connected to the same net at any level of hierarchy. The default is not to add any extra logic into the design to fix such cases. Certain three-state nets cannot be buffered, because this changes the logic functionality of the design.
#-all: insert buffers for o/p directly connected to i/p(-feedthrough), inserts buffers if a driver drives multiple output ports(-outputs) and duplicate logic constants so that constants drive only 1 o/p port
# -buffer_constants: buffers logic constants instead of duplicating them.
set_fix_multiple_port_nets -all -buffer_constants [get_designs *]

#set_isolate_ports: Specifies the ports that are to be isolated from internal fanouts of their driver nets.
#-driver BU140 => BU140 or other size buffer used to isolate. By using -force, we force the driver to be the size specified (i.e BU140 only, no other size allowed), and also force isolation to be done on all ports specified, even if they don't need isolation.
#we don't put isolation cells during synthesis, as we do it during PnR.
#set_isolate_ports -driver BU140 -force [all_inputs]
#set_isolate_ports -driver BU140 -force [all_outputs]

----------------------------
# Uniquify after applying constraints
current_design $DIG_TOP_LEVEL
link
uniquify => NOT necessary, since this step is done as part of compile. Removes multiple-instantiated hierarchy in the current design by creating a unique design for each cell instance. So, if you do get_designs * => it now shows multiple instances of clk_mux with clk_mux_1, clk_mux_2, etc. So, each of these clk_mux_* have the same rtl, but they can now be optimized separately.

#Provide physical info (area, placement, keepout, routing tracks, etc) about floorplan if in DC-TOPO mode. 3 ways:
1. write_def within ICC, and then import it into DC by using extract_physical_constraints cmd.
ex: extract_physical_constraints {design1.def ... design2.def}
2. write_floorplan cmd in ICC which generates a tcl script, and then read that file using read_floorplan.
ex: read_floorplan -echo criTop.all.fp.tcl => this tcl file is generated by write_floorplan cmd in ICC, and used here in DC.
3. Manually provide physical info. Put these constraints(die area, port locations, macro, keepout, etc) in a tcl file and source it. these constraints are the one that we use in ICC to force the tool to generate desired placement.

##### opt const (speed): clk related info here (set_input_delay, set_output_delay provided during incremental compile)
create_clock -name "spi_clk" -period 50 -waveform     { 2 27 } [get_ports spi_clk] => 20M clk, rising edge at 2ns and falling edge at 27ns.

set_clock_uncertainty 0.5 spi_clk => adds 0.5 units of skew to clk to model skew during CTS in PnR.

# generated Clock: NOTE: this cmd sometimes requires the presence of synthesized netlist, as the target pin list may be o/p of flops, etc so, we use this cmd after the initial compile.
create_generated_clock -name "reg_clk" -divide_by 1  -source [get_ports clock_12m] [get_pins clk_rst_gen/pin1] => apply waveform on pin "clk_rst_gen/pin1"

#optional ideal attr => not needed for DC. clk nets are ideal nets by default.
#set_ideal_network -no_propagate {clk1 clk2} => marks a set of ports or pins  in  the  current  design  as sources  of an ideal network. compile command treats all nets, cells, and pins on the transitive fanout of these objects  as ideal (i.e no delay). transition time of the driver is set to 0ns. Propagation  traverses through combinational cells but stops at sequential cells. In  addition  to disabling timing updates and timing optimizations, all cells and nets in the ideal network have the dont_touch attribute  set. "-no_propagate" indicates that the ideal network is not propagated through logic gates (i.e logic gates encountered are treated as non-ideal with non-zero delay). By default, ideal property is propagated thru gates. NOTE: during report_timing, we see transition time on these ports/pins as 0ns, resulting in no "max_transition_time" violations.  
#set_ideal_latency 2 clk1
#set_dont_touch_network [get_clocks *]
#set_propagated_clock [all_clocks]

#set ideal n/w for scan_enable, so that they don't get buffered in DC, will be buffered in PnR
#set_ideal_network -no_propagate {POR_N I_CLK_GEN/POR_N_SYNCED} => NOT needed. POR_N port only goes to 2-3 flops as it gets synced first, then the synced version goes to all flops. We don't set any of these ports to ideal as that will prevent the tool from putting buffers on these paths. These paths result in max_cap, max_transition viol (not timing viol as async paths aren't checked for timing in DC), so DC will buffer these to prevent those viol. If we do not want to buffer the reset tree in DC, we can use this cmd to prevent buffering in DC, and then buffer it during PnR. However, the sdc file exported to the PnR tool should have this cmd removed so that the PnR tool can buffer it. Also, no false_path should be set starting from "POR_N_SYNCED" pin as it's a real path. We can set false path starting from "POR_N", but even that's not required

#set_ideal_network -no_propagate I_CLK_GEN/SCANRESET => NOT needed as it feeds in same reset tree.
#set_ideal_network -no_propagate scan_en => this done during scan stitching. Here, scan_en not set to ideal is OK, as this net is not connected to any flop (it's a floating net at this stage, and scan_enable/scan_data pin of all flops is tied to 0 or 1). So, no opt takes place on this net. Later during dft scan stitching step, scan_en gets tied to pin of all flops, that is where we set it to ideal, so that it doesn't get buffered.

# Specify clock gating style => Sets  the  clock-gating  style  for the clock-gate insertion and replacement
#-sequential_cell none | latch => 2 styles. A. latch free ( no latch, just and/or gate, specify none). B. latch based (latch followed by and/or, default)
#-positive_edge_logic {cell list | integrated} => for gating +ve FF inferred from RTL. For latch based, cell list must be AND/NAND (can also specify latch in cell list). For latch-free, cell list must be OR/NOR. integrated => Uses a single special integrated clk gating cell from lib instead of the clock-gating circuitry (i.e latch followed by and/nand). With integrated option, we can say whether enable signal is active low and if clk is inverted within the integrated cell. When using integrated clk gating cells, setup/hold are specified in lib, so separate -setup/-hold options are not required. Tool identifies clk gating cells in lib by looking for clock_gating_integrated_cell.
For CGP, it's "latch_posedge", and for CGPT, it's "latch_posedge_precontrol".
For CGN, it's "latch_negedge", and for CGNT, it's "latch_negedge_precontrol".
#-negative_edge_logic {cell list} => same as above except that for latch based, cell list must be OR/NOR (can also specify latch in cell list). For latch-free, cell list must be AND/NAND.
#-control_point none | before | after => Final_En = (En | Scan_en). Before or after determines whether to put the OR gate before or after the latch. The  tool  creates  a new  input  port to provide the test signal.  The control points must be hooked up to the design level test_mode  or  scan_enable port using the insert_dft command.
#-control_signal scan_enable | test_mode => Specifies the test control signal. If an input port is created and the argument is scan_enable, the name of the port is determined by the test_scan_enable_port_naming_style variable, while for test_mode, the name of the port is determined by the test_mode_port_naming_style variable. test_mode signal is the one that is asserted throughout scan testing, while scan_enable signal is asserted only during scan shifting (All FFs have scan_enable as the select line of their internal mux). Usually it's set to scan_enable.

set_clock_gating_style -control_point       before \
                       -control_signal      scan_enable \
                       -positive_edge_logic integrated \
                       -negative_edge_logic integrated

3. synthesize/compile design (initial stage):


#2 types of compile strategy:
A. top-down: top level design and all its subdesigns are compiled together. Takes care of interblock dependencies, but not practical for large designs, since all designs must reside in memory at the same time.
B. Bottom-up: individual subdesigns are constrained and compiled separately. After successful compilation, the designs are assigned the dont_touch attribute to prevent further changes to them during subsequent compile phases. Then the compiled subdesigns are assembled to compose the designs of the next higher level of the hierarchy, and those designs are compiled iteratively until the top level design is synthesized.
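A rough bottom-up sketch of the steps above (block names are hypothetical; real scripts also re-apply per-block constraints before each compile):

```tcl
#bottom-up sketch: compile each subdesign, freeze it, then compile the top
foreach blk {blk_spi blk_regfile} {
    current_design $blk
    link
    compile_ultra -scan
    set_dont_touch [get_designs $blk]  ;# prevent changes during later compiles
}
current_design $DIG_TOP_LEVEL
link
compile_ultra -scan  ;# top-level compile with subdesigns preserved
```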

# Initial Compile
#-scan: replaces normal flops with scan version. connects scan pins to tiehi or tielo (doesn't do actual stitching of scan pins here)
#DC uses design rule cost and opt cost to determine cost fn. use -no_design_rule to disable design rule cost (max_tran, max_fo, etc) and -only_design_rule to disable opt rule cost (delay, power, area, etc). hold violations are fixed only if set_fix_hold and set_min_delay is specified for design. otherwise, only max_delay (not min_delay) is part of cost fn. We can reorder priority of design/opt constraints to get new cost fn by using set_cost_priority.
#-gate_clock: enables clk gating opt as per options set by set_clock_gating_style cmd. clk gates inserted are wrapped inside a clk_gating module which has CG* cell.
#-no_autoungroup: all user hier are preserved (i.e ungrouping is disabled). Required, else ungrouping removes hier boundaries and flattens the netlist to optimize across modules. Without this, some hierarchies were being flattened to implement clock gating
compile_ultra -scan -no_design_rule -gate_clock -no_autoungroup

#check_design => checks synthesized design for consistency.
#check_timing
#report_clocks

# reports after compile
set rptfilename [format "%s%s" $mspd_rpt_path $DIG_TOP_LEVEL.initial_area.rpt ]
#redirect => Redirects the output of a command to a file. -append appends o/p to target
redirect $rptfilename {echo "digtop compile.tcl run : [date]"}
redirect -append $rptfilename {report_area -hier} => reports total area for combo & non-combo logic. also reports total no of cells at top level of design (in module digtop, counting 1 for each sub-module and 1 for each stdcell), no. of I/O ports and no of nets (total no. of wires in digtop). -hier reports it for all hier modules. area is taken from area in .lib file (RAM/ROM IP usually have incorrect area, as they aren't scaled in terms of NAND2 equiv size)
redirect -append $rptfilename {report_reference -hier} => reports all references in current instance (if current inst set) or current design (default). It reports all instances in top level module of current design, which has subdesigns (as other modules), as well as some std cells connecting these modules together. -hierarchy option goes thru the hier of sub modules and reports all leaf cell references.

#NOTE: we can use above 2 cmds for any netlist to report total number of gates. For ex, to find out total gates in routed netlist, do:
read_verilog ./DIGTOP_routed.v
current_design DIG_TOP  
report_area -hierarchy

#NOTE: no constraints of any sort (i/p, o/p delay, false paths, etc) are applied above, as we just want to get a verilog netlist mapped from RTL. Even clk wasn't required to be declared, as we aren't running any timing on this netlist.
# Initial Compile
-----------------------------------------------------------------------------
# Clean up
# removes unconnected ports from a list of cells or instances, perform link and uniquify before this command
# -blast_buses -> if a bus has an unconnected port, the bus is removed.
#find => Finds a design or library object. -hierarchy means at any hierarchy of design. Ex: remove_unconnected_ports -blast_buses find( -hierarchy cell, "*")
remove_unconnected_ports -blast_buses [find -hierarchy cell *]

#to ensure name consistency b/w netlist and other layout tools => define_name_rules and change_names cmds used to convert names. define_name_rules defines our own rules, and change_names applies the change to the netlist for the particular rule. There are already std rules for verilog/vhdl. Sometimes, we keep these cmds in .synopsys_dc.setup, so that they are always applied.
# change_names of ports, cells, and nets in a design. -hierarchy Specifies that all names in the design hierarchy are to be modified. (report_name_rules shows rules_names are sverilog,verilog,verilog_1995 and vhdl). This cmd should always be applied before writing netlist, as naming in the design database file is not Verilog or VHDL compliant.
#report_names => shows effects of change_names w/o making the changes.
change_names -rules verilog -hierarchy => std verilog rule applied to all hier of netlist.

#define_name_rules <rule_name> -map { {{string_to_be_replaced, new_replaced_string}} } -type cell
define_name_rules     reduce_underscores   -map { {{"_$", ""}, {"^_", ""}, {"__", "_"}} } => names a rule which removes trailing underscore, starting underscore and replaces double underscore with a single underscore.
change_names -rules   reduce_underscores   -hierarchy => rule applied

define_name_rules    reduce_case_sensitive   -case_insensitive
change_names -rules  reduce_case_sensitive   -hierarchy -verbose

#not sure ???
apply_mspd_name_rules_noam

------------------------------------------------------------------------------
#DC doesn't automatically save designs loaded in memory. So, save the design before exiting.
#save design using write: saves in .ddc, .v, .vhdl format
#save design using write_milkyway: writes to a milkyway database.
write -format ddc     -hierarchy -output ./netlist/${DIG_TOP_LEVEL}_initial.ddc => preferred, .ddc is internal database format
write -format verilog -hierarchy -output ./netlist/${DIG_TOP_LEVEL}_initial.v => verilog format (also supports systemverilog (svsim) and VHDL format o/p)

4. synthesize/compile design (incremental stage):


# Apply constraints for func mode, when scan exists
source tcl/case_analysis.tcl => specify which scan related pins need to be tied for func mode. It has these stmt:
#set_case_analysis => Sets  constant  or transitional values to a list of pins or ports and prop thru logic for use by the timing engine. The specified constants or transitional values are valid only during timing analysis and do not alter the  netlist.

set_case_analysis 0 scan_mode_in => we force Scan_mode to 0, as we want to see timing paths b/w diff clocks. false paths take care of bogus paths b/w clock domains. forcing it to 1 will cause all clocks to be the same clock (i.e scan_clk), so, we won't be able to see inter clock paths. If we don't force scan_mode at all, then both scan_mode=0 and scan_mode=1 timing analysis is run.
#set_case_analysis 0 scan_en_in => we should NOT force this to 0, as we want timing for scan shift paths also.

# all constraints for all i/o pins here (opt constraints)
source tcl/constraints.tcl => Put all i/p o/p delays here

#we may want to leave out setting i/p delays on some pins, so that in PT they show up as "endpoints not constrained" warnings. This helps us see which pins are going thru a meta flop.
set_input_delay 0.2 -clock clk1 [remove_from_collection [all_inputs] [get_port {clk1}]] => sets 0.2 unit delay on all i/p pins (except clk1 port) relative to clk1.
set_output_delay 0.4 -clock clk1 [remove_from_collection [all_outputs] [get_port {spi_miso}]] => 0.4 delay for all o/p ports except spi_miso

#create generated clocks here since it may refer to pins of flops, etc which may only be present in synthesized netlist
source tcl/gen_clocks.tcl
#all clks treated as div by 4 clk, since very large divided clks will cause longer run time.
#we don't have long delay paths, so even if we define very fast clks as div-by-4 clks, we should not see any failing setup paths. It does mess up hold time calc, though, since PT determines the hold check edge from the number of clk edges that lie within one setup path.
# Div 4 clk
create_generated_clock -name "clk_1600k"        -divide_by 4  -source [get_ports clkosc] [get_pins Iclk_rst_gen/clk_count_reg_1/Q]
create_generated_clock -name "clk_100k"         -divide_by 4  -source [get_ports clkosc] [get_pins Iclk_rst_gen/clk_count_reg_5/Q]

#ram latch clock (since clk signal, generated as a pulse, may be o/p of flop). We can do div by 1 also.
create_generated_clock -name "clk_latch_reg"    -divide_by 2  -source [get_ports clkosc] [get_pins Iregfile/wr_strobe_spi_sync_reg/Q]

#gated clocks
#create_generated_clock -name "spi_clk_gated"   -divide_by 1  -source [get_ports spi_clk]   [get_pins spi/spi_clk_gate/Q]

# Propagate clocks. NOTE: we don't propagate clk, since we don't have any buffers in DC netlist. clks treated as ideal
#set_propagated_clock [all_clocks]

# Apply false-paths
# NOTE: false paths only related to setup timing are checked here. If the log report indicates an ERROR in any line, it doesn't take any false path from that line into consideration (i.e it doesn't expand the wildcards to choose paths that match and drop paths that don't exist; however, the PnR tool does expand the wildcards and choose paths that match and drop paths that don't exist, without reporting any error, so be careful). Hold, async, clk-gating paths aren't checked (unconstrained paths) during synthesis, however they are checked during PnR. So, we might have to add extra false paths when running PnR. These added paths may give ERROR when synthesis is re-run, however we can just ignore such errors. Or instead of adding these extra false paths in DC, we can create a new false path file in PnR, and add this file to the existing false path file from DC.
#set_disable_timing [get_cells {test_mode_dmux/*}]

source -echo tcl/false_paths.tcl
#set_false_path -from {POR_N POR_N_SYNCED SCAN_RESET} => Not needed as these cause recovery/removal violations which DC doesn't check for. Only scan_en pin needs to be set to false_path as it causes real timing violation due to large transition time. However if scan_en pin is set to ideal_network, then false path not needed for scan_en as transition time=0ns.
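Typical contents of false_paths.tcl look something like this (clock and pin names here are hypothetical examples, not this design's actual names):

```tcl
#async clk domain crossings that go thru 2-flop synchronizers => not real timing paths
set_false_path -from [get_clocks spi_clk]   -to [get_clocks clk_1600k]
set_false_path -from [get_clocks clk_1600k] -to [get_clocks spi_clk]
#static config bits: written once over spi, stable during functional operation
set_false_path -through [get_pins u_regfile/cfg_reg*/Q]
```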

# Apply multi-cycle paths
source tcl/multicycle_paths.tcl
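A typical multicycle_paths.tcl entry (pin names hypothetical): note that relaxing setup to 2 cycles also moves the hold check edge, so the companion -hold 1 is needed to pull the hold check back to the launch edge.

```tcl
#give a slow (e.g. wide-adder) path 2 clk cycles for setup
set_multicycle_path 2 -setup -from [get_pins u_alu/op_reg*/CLK] -to [get_pins u_alu/res_reg*/D]
#move the hold check back by 1 edge (else hold is checked 1 cycle after launch)
set_multicycle_path 1 -hold  -from [get_pins u_alu/op_reg*/CLK] -to [get_pins u_alu/res_reg*/D]
```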

# Incremental Compile with high effort
source tcl/compile.tcl
This has:
#here we do design rule and opt rule fixing
#-incremental performs only gate level opt and not logic level opt. Resulting design is same or better than original.
#-map_effort high => default effort is medium. high effort causes restructuring and remapping the logic around critical paths. It changes the starting point, so that local minima problem is reduced. It goes to the extreme, so is very CPU intensive.
compile_ultra  -incremental -scan -area_high_effort_script -gate_clock -no_autoungroup

5. generate reports:


#generate reports:
report_area, report_reference => in area.rpt
report_timing -delay max -max_paths 500 => report setup in max_timing.rpt (this has timing with scan_mode=0, so has all interclock paths)
report_timing -delay min -max_paths 500 => report hold in min_timing.rpt => this report should be clean if no clk_uncertainty is defined. This is because c2q delay for flops is greater than hold requirement of flops, so with ideal clock (no clk delays/buffering anywhere), all flops will pass holdtime req.
check_design, report_clock_gating, report_clocks, check_timing, report_disable_timing => in compile.rpt
report_clock_gating => reports no of registers clk gated vs non-clk gated. It also shows how many CG* cells got added to do clk gating.
report_constraint => lists each constraint, and whether met/violated, also max delay and min delay cost for all path groups. -all_violators only reports violating constraints.
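One way to collect the reports above, following the redirect style used earlier (file names are just examples):

```tcl
#dump post-compile reports into per-topic files under $mspd_rpt_path
redirect         $mspd_rpt_path/max_timing.rpt {report_timing -delay max -max_paths 500}
redirect         $mspd_rpt_path/min_timing.rpt {report_timing -delay min -max_paths 500}
redirect         $mspd_rpt_path/compile.rpt    {check_design}
redirect -append $mspd_rpt_path/compile.rpt    {report_clock_gating}
redirect -append $mspd_rpt_path/compile.rpt    {report_constraint -all_violators}
```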

#cleanup netlist and then write netlist
cleanup as done in compile_initial (remove_unconnected_ports, define_name_rules, change_names)
write -format verilog -hierarchy -output ./netlist/digtop.v

6. Insert Scan:


#SCAN: DFT compiler (separate license) is invoked for scan. DFTMAX license is needed for scan compression.
#Insert Scan
set_ideal_network [get_ports scan_en_in] => scan_en_in pin is used during scan shifting (defined in setup.tcl). By setting it as ideal, DFT compiler doesn't buffer the signal as all cells/nets on this n/w have "dont_touch" attr set. To buffer it, put driving cell using set_driving_cell on this port, so that DFT Compiler can buffer it appropriately. In DC, we don't buffer this signal, as we buffer it during PnR. NOTE: if scan_en_in is an internal net, we should do this:
set_ideal_network -no_propagate {u_DIA_DIG/u_DMUX_DIG/U5/Y} => We should choose gate o/p pin and not the o/p scan_en net ({u_DIA_DIG/u_DMUX_DIG/scan_en}). Else ideal network is not applied to gate o/p, so that gate has large transition time, so tons of viols.

#set_false_path -from scan_en_in => setting any path flowing thru Scan_en and ending at clk as false. Since we have set scan_en as ideal n/w, we don't need false path for this as transition time=0ns. However if ideal n/w is not set for scan_en, then this takes care of cases where scan_en i/p delay causes any setup/hold violation at clk. If we don't do this, we see paths with scan_en failing as they have high FO and high cap, so large delay and large transition time.  We can NOT take care of this by setting small max_delay for setup and large min_delay for hold, because the high FO will cause it to fail timing nonetheless. NOTE: this is not equiv to setting case analysis for scan_en=0, as that removes scan paths (flop to flop) from any timing analysis. We want to see timing for both scan/non-scan paths.
#IMP NOTE: both false_path and ideal_network for scan_en pin should be removed in EDI (if using sdc constraints generated from DC), since the scan_en path is real: we have clk toggling within a cycle of scan_en toggling in TetraMAX patterns, so it had better meet timing. We need to run gate sims on TetraMAX patterns to make sure we meet timing.

source tcl/insert_dft.tcl => this file has following lines in it:
#sets all scan configuration details as shown below:
#set test timing variables. leave these at default values if TetraMAX is used to gen test patterns.
#set test_default_delay 0
#set test_default_bidir_delay 0
#set test_default_strobe 40
#set test_default_period 100 => default scan clk period is 100ns (10MHz).

######define test protocol using set_dft_signal cmd.
#set_dft_signal=>Specifies the DFT signal types for DRC and DFT insertion.
#-view existing_dft | spec => existing_dft implies that the  specification  refers  to  the existing  usage  of  a port, while spec (the default value) implies that the specification refers to ports that the tool must use during DFT insertion. spec view is prescriptive and specifies actions that must be taken. It indicates that signal n/w doesn't yet exist and insert_dft cmd must add it. An example of this is ScanEn signal (even though ScanEn port exists, the signal n/w is not there, so it's prescriptive). existing_dft view is descriptive and describes an existing signal n/w. An example is system clk that is used as Scan Clk (here clk n/w already exists since system clk is used as scan clk, so descriptive). So, scan_clk, reset, scan_mode are existing_dft as n/w is already there, but SDI, SDO, ScanEn are spec as that n/w needs to be built. view is used for many DFT cmds as set_dft_signal and set_scan_path.
Ex: set_dft_signal -view existing_dft -port A -type ScanEnable => when working with dft inserted design, indicates that port A is used as scan enable. indicates that ScanEnable n/w does exist and should be used. This is NOT true in most designs as n/w never exists for scan_en pin. In this case, tool will create a new port "A" and connect it to scan_en pin of all flops
Ex: set_dft_signal -view spec -port A -type ScanEnable => when preparing a design for dft insertion, specifies that port A  is used as scan enable. indicates that ScanEnable n/w doesn't exist yet. This is true for most designs, so use this for scan_en pin.
#-type specifies signal type, as Reset, constant, SDI, SDO, ScanEn, TestData, TestMode. Constant is a continuously applied value to a port.
#-active_state => Specifies the active states for the following signal types: ScanEnable, Reset, constant, TestMode, etc. active sense high or low

#define clocks, async set/reset, SDI, SDO, SCAN_EN and SCAN_MODE. We don't need to define SCAN_CLK as SCAN_MODE forces clock to scan clk (mux chooses b/w SCAN_CLK or FUNC_CLK), and it is traced all the way to the i/p port (as later during create_test_protocol, we say -infer_clock). We don't define async set/reset as we force them to 0, when scan_mode=1 (in RTL itself).
set_dft_signal -view existing_dft -port scan_mode_in -type Constant -active_state 1  => scan_mode_in pin is used during scan mode and is high throughout scan (defined in setup.tcl). existing_dft states that scan_mode n/w exists so the tool doesn't need to do anything to add the n/w.
set_dft_signal -view spec  -port  spi_mosi       -type ScanDataIn => SDI is spi_mosi
set_dft_signal -view spec  -port  spi_miso       -type ScanDataOut => SDO is spi_miso
set_dft_signal -view spec  -port  scan_en_in     -type ScanEnable  -active_state 1 => SE (for shifting during scan) is scan_en_in, and it needs to be "1" for shifting to take place. If scan_en is an internal pin, then we do:
 set_dft_signal -view spec   -hookup_pin {u_DIG_sub/scan_enable} -type ScanEnable  -active_state 1 => pin may be port of sub-module or o/p pin of a gate as "u_DIG_sub/u_sub2/U5/Y". Preferred to use port of sub-module as gate name may change every time tool is run.
#NOTE: in all cmds above, if above scan ports don't exist, then tool creates new ports.

#if needed, set_dft_signal for scan clk, scan reset and scanmode => NOTE these are "exist" view and not "spec" view
set_dft_signal -view exist -type ScanClock   -port SPI_SCK -timing [list 10 [expr ($SCANCLK_PERIOD/2)+10]] => changed freq of scan clk, so that design runs slower
set_dft_signal -view exist -type Constant    -port SCANRESET      -active_state 1
set_dft_signal -view exist -type Constant    -hookup_pin {u_SPT_DIG/auto/Scan_Mode_reg/Q}  -active_state 1 =>  Preferred to use port of sub-module for hookup_pin as in extreme case, flop name may change every time tool is run.

####do all scan related configuration.
#set_scan_element: excludes seq cells from scan insertion, reducing fault coverage
set_scan_element false [find cell test_mode/scan_mode_out_reg] => default is true which means all  nonviolated  sequential  cells  are replaced with equivalent scan cells. when false, no scan replacement done on objects (objects may be cells[as FF/LAT], hier cells, lib cells, ref, design). Sequential cells violated by dft_drc are not replaced by equivalent scan cells, regardless of their scan_element attribute values.

# Specify scan style: 4 styles: multiplexed_flip_flop, clocked_scan, lssd, scan_enabled_lssd. set_scan_configuration or test_default_scan_style can be used to set scan style.
#set_scan_configuration -style [multiplexed_flip_flop | clocked_scan | lssd |  aux_clock_lssd  | combinational | none] => By default, insert_dft uses the scan style value specified by environment variable test_default_scan_style in your .synopsys_dc.setup file
set_scan_configuration -style multiplexed_flip_flop

#count
set_scan_configuration -chain_count 1 => number of chains that insert_dft is to build. Here it's 1. If not specified, insert_dft builds the minimum number of scan chains consistent with clock mixing constraints.

#set_scan_configuration -clock_mixing [no_mix | mix_edges | mix_clocks | mix_clocks_not_edges] => Specifies  whether  insert_dft  can include cells from different clock domains in the same scan chain.
no_mix                 The default; cells must be clocked by the same edge of the same clock.
mix_edges              Cells must be clocked by the same clock, but the clock edges can be different.
mix_clocks_not_edges   Cells must be clocked by the same clock edge, but the clocks can be different.
mix_clocks             Cells can be clocked by different clocks and different clock edges.

set_scan_configuration -clock_mixing mix_clocks => we use mix_clocks even though during scan_mode, we have only 1 clk. Reason is that lockup element can only be added if mix_clocks option is used.

#lockup
set_scan_configuration -add_lockup true => Inserts  lockup  latches (synchronization element) between clock domain  boundaries  on  scan  chains,  when  set  to  true  (the default). If the scan specification  does  not  mix clocks on chains, insert_dft ignores this option.

#lockup_type [latch | flip_flop] => The default lock-up type is a level-sensitive latch.  If you  choose  flip_flop  as  the  lock-up type, an edge-triggered flip-flop is used as  the  synchronization  element.

#set_scan_configuration -internal_clocks [single|none|multi] => An internal clock is defined as an internal signal driven  by  a multiplexer (or multiple input gate) output pin (excludes clk gating cells). Applies  only  to  the  multiplexed flip-flop scan style, and is ignored for other scan styles.  It's used to avoid problems when placing gating logic on the clock lines (which might result in hold issues).
none (the default) - insert_dft does not treat internal clocks as separate clocks.
single - insert_dft treats any internal clocks in the design as separate clocks for the purpose of scan chain architecting. The single value stops at the first buffer or inverter driving the flip-flop's clock.
multi - insert_dft treats any internal clocks in the design as separate clocks for the purpose of scan chain architecting. The multi value jumps over any buffers and inverters, stopping at the first multi-input gate driving the flip-flop's clock.

set_scan_configuration -internal_clocks multi => for our design, we set it to multi.

#set_scan_link scan_link_name [Wire | Lockup] => Declares  a  scan link for the current design.  Scan links connect scan cells, scan segments, and scan ports within scan chains.  DFT  Compiler supports scan links that are implemented as wires (type Wire) and scanout lock-up latches (type Lockup).
set_scan_link LOCKUP Lockup => we name the scanlink LOCKUP

#set_scan_path specifies scan path.
set_scan_path chain1 => Specifies a name for the scan chain, here it's called chain1

#set_scan_state [unknown|test_ready|scan_existing] => sets the scan state status for the current design. Use this command only on a design that has been scan-replaced using, for example, compile -scan, so that the Q outputs of scan flip-flops are connected to the scan inputs and the scan enable pins are connected to logic zero. If there are nonscan elements in the design, use set_scan_element false to identify them.
unknown=>the  scan  state  of  the  design  is  unknown,
test_ready=>the design is scan-replaced,
scan_existing=>the design is scan-inserted.

set_scan_state test_ready

--- End of scan_constraints.tcl file.

###### configure your design for scan testing by generating a test protocol using create_test_protocol. Test protocol files are written in spf (STIL procedure file) format, which are then input to pattern generation tools such as TetraMax or Encounter Test to generate a pattern file in STIL format.
#create_test_protocol [-infer_asynch, -infer_clock, -capture_procedure single_clock | multi_clock] => creates a test protocol for the current design based on user specifications which were issued prior to  running  this command.  The   specifications   were  made  using  commands  such  as set_dft_signal, etc. The  create_test_protocol command should be executed before running the dft_drc command because design rule checking requires a test  protocol.
-infer_asynch => Infers asynchronous set and reset signals in the design, and places them at off state  during scan shifting.
-infer_clock => Infers test clock pins from the design, and pulses them during scan shifting.
-capture_procedure [single_clock | multi_clock] => Specifies the capture procedure type.  The multi_clock type creates a protocol file that uses generic  capture  procedures  for all  capture  clocks.   The single_clock type creates a protocol file that uses the legacy 3-vector capture  procedures  for  all capture clocks. The default value is multi_clock.

create_test_protocol -infer_clock => -infer_clock not needed if scan_clock defined above.

####DFT DRC Checking
dft_drc checks for these 3 classes of violations:
--
1. Violations That Prevent Scan Insertion: caused by 3 conditions:
 A. FF clk is uncontrollable. => clk at FF should toggle due to test clk toggling, and clk at FF should be in a known state at time=0 (sometimes clk gating causes this issue)
 B. latch is enabled at the beginning of the clock cycle.
 C. async controls of registers are uncontrollable or are held active. => if set/reset of FF/latch can't be disabled by PI of design.

2. Violations That Prevent Data Capture: caused due to these:
 A. clk used as data i/p to FF,
 B. o/p of black box feeds in clk of reg (clk may or may not fire depending on logic),
 C. src reg launch before dest reg capture,
 D. registered clk gating ckt (caused by clk gating implemented incorrectly)
 E. three-state contention
 F. clk feeding multiple i/p of same reg (i.e. clk signal feeding into clk pin and async set/reset)

3. Violations That Reduce Fault Coverage:
 A. combo feedback loops (i.e if loops are used as a latch, replace them with a latch)
 B. Clocks That Interact With Register Input
 C. Multiple Clocks That Feed Into Latches and FF. Latches should be transparent, and latches must be enabled by one clock or by a clock ANDed with data derived from sources other than that clock.
 D. black boxes: logic surrounding BB is unobservable or uncontrollable.
 ---
#dft_drc [-pre_dft|-verbose|-coverage_estimate|-sample percentage] => checks the current design against the test design rules of the scan test implementation specified  by  the  set_scan_configuration -style  command. If design rule violations are found, the appropriate messages are  generated. Perform  test  design  rule  checking on a design before performing any other DFT Compiler operations, such as insert_dft, and after creating a valid test protocol.
-pre_dft => Specifies  that  only  pre-DFT  rules (D rules) are checked. By default, for scan-routed designs, post-DFT  rules  are checked; otherwise pre-DFT rules are checked.
-verbose => Controls  the  amount  of detail when displaying violations. every violation instance is displayed.
-coverage_estimate => Generates a test coverage estimate at the  end  of  design  rule checking.
-sample percentage =>  Specifies a sample percent of faults to be considered when estimating test coverage.

dft_drc -verbose => check for violations here, before proceeding. Shows black box violations for macros, non-scan flop violations because of set_scan_element being set to false, and non-scan flops present in the design (flops that didn't get replaced by their scan equiv because of set_scan_element being set to false or other issues). It shows final scan flops and non-scan flops.

#preview_dft => Previews,  but does not implement, scan style, the test points, scan chains, and on-chip clocking control logic to be added  to  the  current design. The command first generates information on the scan  architecture  that  will be implemented.  In the case of a DFTMAX insertion, preview_dft provides information about the  compressor  being  created,and for basic scan, the specific chain information for the design. Next,  the command generates and displays a scan chain design that satisfies scan specifications on the current  design. This design is exactly the scan chain design that is presented to the insert_dft command for synthesis.
preview_dft -show all => Reports  information  for all objects in scan chain design.
preview_dft -test_points all => Reports all test points information, in addition to the  summary report  the preview_dft command produces by default.  The information displayed includes names assigned  to  the  test  points,locations  of  violations  being fixed, names of test mode tags, logic states that enable  the  test  mode,  and  names  of  data sources or sinks for the test points.

# Insert Scan and build the scan chain.
insert_dft => adds internal-scan or boundary-scan circuitry to the current design. By default, insert_dft performs only scan insertion and routing. Steps:
 1. Populates a flattened representation of the entire design.
 2. Architects scan chains into the design. By default, insert_dft constructs as many scan chains as there are clocks and edges. By setting the -clock_mixing option, we can control the scan chains created. Scan cells are ordered on scan chains based on some criteria.
 3. Applies a scan-equivalence process to all cells.
 4. Adds generic disabling logic where necessary: it finds and gates all pins that do not hold the values they require during scan shift.
 5. Having disabled three-state buses and configured bidirectional ports, builds the scan chains, and identifies scan_in and scan_out ports.
 6. Routes global signals (including either or both scan enable and test clocks) and applies a default technology mapping to all new generic logic (including disabling logic and multiplexers). It introduces dedicated test clocks for clocked_scan, lssd, and aux_clock_lssd scan styles.
Typically, at this point, the insert_dft command has violated compile design rules and constraints, and it now begins minimizing these violations and optimizing the design.
The insert_dft command automatically updates the test protocol after inserting scan circuitry into the design, and dft_drc can be executed  afterward without rerunning create_test_protocol.

# DFT DRC Checking after insertion
write_test_protocol -output digtop_scan.spf => Writes a test protocol file to file_name specified.
dft_drc -verbose -coverage_estimate => verbose rpt with coverage post scan to make sure no violations. Coverage reported here is inferior to that reported by TetraMax/ET, as those tools are more accurate. Also, the scan_reset pin, if present, is not considered in coverage here, as we never provided scan_reset pin info to the tool. It's just tied to its inactive state here.
#test coverage = detected_faults / (total_faults - undetectable_faults) => This is the important one to look at
#fault coverage = detected_faults / (total_faults) => this is always lower than or equal to test_coverage, since undetectable faults remain in the denominator. Not so important to look at.
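As a numeric sketch of the two coverage formulas above (all fault counts are made up for illustration):

```python
# Hypothetical fault counts, for illustration only.
total_faults = 10000
undetectable_faults = 400   # e.g. faults on logic tied off in test mode
detected_faults = 9200

# test coverage: undetectable faults are excluded from the denominator
test_coverage = detected_faults / (total_faults - undetectable_faults)
# fault coverage: all faults stay in the denominator, so it can never exceed test coverage
fault_coverage = detected_faults / total_faults

print(round(test_coverage * 100, 2))   # 95.83
print(round(fault_coverage * 100, 2))  # 92.0
```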

#report on scan structure:
report_scan_chain > reports/scan.rpt
report_scan_path -view existing_dft -chain all >> reports/scan.rpt
report_scan_path -view existing_dft -cell all >> reports/scan.rpt

# reports after scan insertion => redirect cmd used to create scan.area/max_timing/min_timing/compile rpts in reports/scan.compile.rpt/timing.rpt files. Then cleanup done, and verilog file written in *_scan.v file.

#we don't run inc compile after scan, as scan cells are only getting stitched, so each Q pin sees a little bit of extra load due to the SD pin of the next flop. That little extra load causes <0.1ns timing change, so scan timing and func timing are almost the same (scan mode is set to 0 for both, scan_enable/shift_enable is not forced). Sometimes, the wire from Q to SDI may be significant because the next flop in the chain may be in another block very far away, resulting in large timing violations on such paths in DC (due to large transition time). When we see such paths in DC, we should ignore them as the PnR tool will buffer and fix them. NOTE: this scan timing is different from scan timing in PT, as scan timing in PT reflects timing in scan_mode=1. We can run such timing in DC too (by setting scan_mode=1, defining a single scan_clk, and setting all i/o delays wrt scan_clk). However, we'll mostly see only hold violations here, related to scan_shift_en paths (setup path failures would mostly be the same as those of functional paths).

#NOTE: dft compiler adds a mux whenever a functional pin is used as SDO pin. Select pin of mux is tied to ScanEnable pin. "0" input is functional o/p, while "1" i/p is connected to o/p pin of last scan chain. That's why after routing scan chains, we see extra mux in digtop.scan.v compared to digtop.v. We need this as functional flop and SDO flop may not be same flop. compiler may decide to have sdo_out from a different flop than func pin flop. If SDO pin of last scan chain is connected directly to func o/p port, then this mux is not reqd.

-----------------------------------------------------------------
# Write out SDC (synopsys design constraints) script file in functional mode. This script contains commands that can be used with PrimeTime or with Design Compiler. This sdc file combines all constraint files (user or auto generated) in func mode and so is used in AutoRoute during func mode.
write_sdc sdc/func_constraints.sdc

#count total instances in DC netlist
/home/kagrawal/scripts/count_instances_EDI.tcl => reports all gates in reports/instance_count.rpt

#use exit or quit
exit

#Final log file is in logs/top.log. Look in this file for any errors/warnings.
#Final reports: are in reports dir. Look in
#digtop.after_constrain.rpt => all false path, other constraint errors.
#digtop.compile/area/max/min for reports with no scan.
#digtop.scan.compile/area/max/min for reports with scan.

*******************************************
path groups: Look in PT notes (manually written ones) for more details.
----------
by default, DC/PT group paths based on the clock controlling the endpoint (all paths not associated with a clock are in the default path group). We'll see Path Group with "clock_name" in timing reports.

#control opt of paths: We can create our own path groups so that DC can optimize chosen critical paths.
group_path -name group3 -from in3 -to FF1/D -weight 2.5 => creates group3 path from i/p in3 to FF, and assigns a weight of 2.5 to this path group. default weight is 1 for all paths in a path group. weight can range from 0 to 100.

#opt near critical path: by default, only path with WNS is opt. but by specifying critical range, DC can opt all paths that are within that range.
group_path -critical_range 3 => opt all paths within 3 units of WNS (i.e if WNS = -15, then paths with -12ns and worse are all opt). can also use "set_critical_range 3.0 $current_design".

#opt all paths: create path group for each end point. then DC opt each path group.
set endpoints [add_to_collection [all_outputs] [all_registers -data_pins]] => all o/p ports and D pins of all FF added.
foreach_in_collection endpt $endpoints {
 set pin [get_object_name $endpt]
 group_path -name $pin -to $pin
}

---------------------------
#useful cmds:

#To remove design from DC mem, we can do this (instead of closing and opening DC). Can also be used once the design has been saved, so that we can start a new run without exiting dc_shell.
dc_shell> remove_design -all => removes current design as well as all subdesigns from memory

#Restarting dc shell again. We can read in a previous .ddc file by using the read_ddc cmd. This is helpful when we close dc-shell, but want to open the previous design again.
dc_shell> read_ddc netlist/digtop_scan.ddc

#reporting any net connectivity
report_net -conn -v -nosplit net1 => reports (v for verbose, nosplit to get all in one line) all pins connected to the net and detailed report of cap. useful for High FO nets.

#reporting Fanout for all nets above a certain threshold
report_net_fanout -v -threshold 100 => reports (v for verbose) all nets with FO > 100

-------------------

General Synthesis Flow:

Synthesis transforms RTL into a gate netlist. Since the goal of the synthesis tool is not only to map the RTL into gates, but also to optimize the logic to meet timing, power and area requirements, it needs a few other inputs to do the job.

Input = RTL (with pragmas), constraints (sdc) and timing libraries in Liberty format(.lib) needed.

Output = gate level verilog netlist.

A well-synthesized netlist is needed because when die utilization approaches 95% to 100% (red zone), meeting timing becomes difficult. So, a 3-5% reduction in area keeps the design away from the red zone.

Synthesis Tools:

2 most widely used tools for synthesis are provided by Synopsys and Cadence. Synopsys provides DC (design compiler), while Cadence provides RC (RTL compiler)

Synthesis Inputs:

1. RTL:

We write RTL in an HDL language such as verilog, system verilog or VHDL, and all synthesis tools are able to synthesize it.

Synthesis pragmas: These are special helper comments. They are put inside comments (of the verilog or vhdl file), preceded by the word synopsys or cadence so that DC/RC can identify them.

cadence pragmas: 2 places to put it in
// cadence pragma_name => single line comment
/* cadence pragma_name */ => multiline comment

synopsys pragmas: Similarly Synopsys pragmas can also be put in 2 ways:

//synopsys pragma_name => single line comment

/* synopsys pragma_name */ => multiline comment

pragma names:


I. parallel_case: used in a case stmt to specify that the case items are non-overlapping (one-hot).
ex:
case(1'b1) //cadence parallel_case
sel[0]: out = A[0]; //when sel[1:0]=01 or 11, out=A[0], as 1st matching case stmt is executed.
sel[1]: out = A[1]; //when sel[1:0]=10, out=A[1]. when sel[1:0]=00, then latch inferred, since no default case defined.
endcase

if the pragma wasn't there, then priority logic would be built, since if both sel[0] and sel[1] are 1, then the first matching case stmt is executed, so out=A[0] in such a case. All case stmt are treated as non-parallel for synthesis purposes, since that is how RTL is simulated.
out= (sel[0] and A[0]) or (!sel[0] and sel[1] and A[1]);

however, since the pragma is there, no priority logic is built, as shown below. So if sel[1:0]=11, out=A[0] or A[1]. So, having the pragma saves unneeded priority logic and keeps the gate count lower.
out= (sel[0] and A[0]) or (sel[1] and A[1]); => this may result in mismatches in formal verification or simulation if sel[1:0]=11 is applied.
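The two equations above can be checked with a quick behavioral model (a Python sketch, not tool output; the helper function names are made up for illustration):

```python
# Priority decode (no pragma): first matching case item wins.
def out_priority(sel0, sel1, a0, a1):
    if sel0:
        return a0
    if sel1:
        return a1
    return None  # no default case: a latch would hold the previous value

# Parallel decode (with the parallel_case pragma): plain OR of AND terms.
def out_parallel(sel0, sel1, a0, a1):
    return (sel0 and a0) or (sel1 and a1)

# One-hot sel values agree...
for sel0, sel1 in [(1, 0), (0, 1)]:
    assert out_priority(sel0, sel1, False, True) == out_parallel(sel0, sel1, False, True)

# ...but sel[1:0]=11 (not one-hot) diverges: priority picks A[0], parallel ORs A[0] and A[1].
print(out_priority(1, 1, False, True))  # False (A[0])
print(out_parallel(1, 1, False, True))  # True  (A[0] | A[1])
```

This is exactly the simulation-vs-synthesis mismatch the note warns about when sel is not truly one-hot.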

II. map_to_mux or infer_mux: used in case and if-then-else stmt to force RC to use MUX from library.
case (sel) //map_to_mux => forces mux, meaning RC doesn't optimize this logic to seek other logic
2'b00: out = A;
2'b01: out = B;
2'b10: out = C;
2'b11: out = D;
endcase

III. infer_multi_bit pragma => maps registers, multiplexers and three-state drivers to multibit library cells.


2. Timing Library (in liberty or other proprietary format):

The gate library that we use during synthesis is the timing gate library. It's in liberty format. It has timing info for each gate, as well as the functionality of each gate. Using the functionality info, the synthesis tool is able to map the gates to RTL logic, and using the timing info, it's able to check if it's meeting the timing requirement of the design. The question is which timing library should we use? Should we use timing for the typical corner or the max or min corner? Since we want to design our chip such that it meets timing even in the worst possible scenario, we choose the "worst case" timing library, which is the max delay library.


Example of max delay lib: Let's assume the chip runs at 1.8V typical. Since we design the chip so that it should also run at +/-10% voltage swings (due to IR drop, overshoot, etc.), our worst case PVT corner would be Process=weak, Voltage=1.65V and Temperature=150C (since high temp slows transistors). So, a liberty file such as W_150C_1.65V.lib would be used. Not all lib cells may be in one library. So, we may use multiple libraries. As an ex, all core cells may be in *CORE.lib, while all Clock tree cells may be in *CTS.lib

NOTE: no tech/core lef or cap tables provided, as net delay estimated based on WLM (wire load model) which has resistance/cap defined per unit length (length is estimated based on Fanout). If physical synthesis is done, which tries to do physical placement during synthesis itself, then WLM is not used (as in RC PLE or DC-topo, both of which do physical based synthesis). In such a case, core lef file and cap table files are provided.
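A toy sketch of how a WLM turns fanout into an estimated net delay (the table values, R/C coefficients and the lumped-RC formula here are all hypothetical, not from any real library):

```python
# Hypothetical WLM table: fanout -> estimated wire length (um).
fanout_to_length = {1: 10.0, 2: 18.0, 3: 25.0, 4: 31.0}

R_PER_UM = 0.05    # ohm per um of wire (made-up coefficient)
C_PER_UM = 0.0002  # pF per um of wire (made-up coefficient)

def wlm_net_delay_ps(fanout, driver_res_ohm=1000.0):
    # Extrapolate linearly beyond the table, as WLMs typically do.
    length = fanout_to_length.get(fanout, 31.0 + 6.0 * (fanout - 4))
    r_wire = R_PER_UM * length
    c_wire = C_PER_UM * length
    # Crude lumped-RC estimate: driver + wire resistance charging the wire cap.
    return (driver_res_ohm + r_wire) * c_wire  # ohm * pF = ps

print(round(wlm_net_delay_ps(2), 2))  # 3.6
```

The point is only the structure: fanout gives length, length gives R and C, and delay is estimated from that, with no placement information at all.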

3. Constraints (in sdc format):
In constraints file, we specify all the constraints that our synthesis tool will try to honor. Constraints are of 2 types: Environment constraints and Design constraints. Both of these are provided via an SDC file.

Timing constraints: One of the most important design constraints in sequential digital design is clock frequency. The tool tries to meet timing once the clk freq or clk waveform is given. For input/output ports, we also provide the IO delay.

Invalid paths: We provide false paths or multicycle paths for paths that are not valid 1-cycle paths. In false_paths, we define all false paths on the gate level netlist.

While synthesizing, Synthesis tool optimizes setup for all data paths and clk gating paths. No hold checks or async recovery/removal checks done.
Once the tool synthesizes RTL, and meets setup time, it's done. No clk propagation done, and no hold fix done (although both setup and hold timing reports are produced). Hold rpt should have no failure as clk is ideal, and c2q of flop is enough to meet hold time as hold time for most flops is -ve (This is because of extra delay in data path, which makes setup time more +ve and hold time less +ve. worst case for hold time is very small +ve number. NOTE: more delay in data path inc setup time, dec hold time while more delay in clk path dec setup time, inc hold time).
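A minimal slack calculation for the setup/hold note above (all delays are hypothetical, in ns):

```python
# Ideal-clock setup/hold checks at a capture flop.
t_period   = 10.0
t_clk2q    = 0.3    # launch flop clock-to-Q
t_comb_max = 6.0    # longest combinational data-path delay (used for setup)
t_comb_min = 0.1    # shortest combinational data-path delay (used for hold)
t_setup    = 0.2    # capture flop setup time
t_hold     = -0.05  # capture flop hold time (negative, as the note says is common)

# Setup: data launched at edge N must arrive t_setup before edge N+1.
setup_slack = t_period - (t_clk2q + t_comb_max + t_setup)
# Hold: data launched at edge N must not arrive earlier than t_hold after edge N.
# With an ideal (zero-skew) clock, clk2q alone usually covers a negative hold time.
hold_slack = (t_clk2q + t_comb_min) - t_hold

print(round(setup_slack, 2))  # 3.5
print(round(hold_slack, 2))   # 0.45
```

This is why the pre-layout hold report is clean: with an ideal clock there is no skew term to subtract, so even the minimum-delay path clears a small or negative hold requirement.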


Optimization priority: Not all constraints that we specify have equal priority. Highest priority is given to constraints that can make a chip malfunction (i.e. timing constraints), while lowest priority is given to constraints that are good to meet, but don't make a chip malfunction (i.e. power constraints).

Below are the various cost types for the various constraints. Basically all these constraints end up as some cost in a big cost function, and the tool's job is to minimise this cost. DC from synopsys uses cost types to optimize the design. Cost types are design rule cost and optimization cost. By default, highest priority goes to design rule cost (top one), and priority goes down as we move to the bottom ones.
1. design rule cost         => constraints are DRC (max_fanout, max_trans, max_cap, connection class, multiple port nets, cell degradation)
2. optimization cost:
 A. delay cost          => constraints are clk period, max_delay, min_delay
 B. dynamic power cost         => constraints are max dynamic power
 C. leakage power cost         => constraints are max lkg power
 D. area cost              => constraints are max area

Power optimization:

Above 90nm, power opt used to be a low priority. But with leakage power increasing, and the desire to have chips last longer on battery power, optimizing chip power has become a high priority for chips going into handheld devices. These are a few of the techniques for reducing power:

1. Clock gating: Here clock gating logic is inserted for register banks (i.e. a collection of flops). This reduces switching of the clk every cycle, since we disable the clk when data is not being written into the registers. Clock gating is inserted either when the RTL has clock gating coded, or the tool can automatically infer clock gating logic and insert it.

ex: See clk gaters below

2. Leakage power opt: Lkg power is becoming a larger portion of overall power for low nm tech (<90nm). Multiple threshold voltages are used to reduce lkg power.

3. Dynamic power opt: Dynamic power consists of 2 components: 1. short circuit power 2. switching power due to charging/discharging of net/gate caps (due to transistors switching)

4. Advanced Power management techniques: Here we employ advanced power techniques. These techniques are captured in a UPF/CPF file (see Power intent and standards).

  • MSV: Using multiple supply voltages (MSV) in design: This technique is most widely used. We use lower voltages to power parts of the design which don't need to run that fast, while using higher voltages for logic that is performance critical. This can result in huge power savings as dynamic power varies as the square of the voltage.
  • PSO: Using power shut off (PSO) methodology: Here, some parts of design are switched on and off internally depending on their usage at that time. This saves both leakage and dynamic power.
  • DVFS: Using Dynamic voltage frequency scaling (DVFS): Here, voltage and frequency of parts of the chip, or the whole chip, are scaled down when peak perf is not required. DVFS can be seen as a special case of MSV design operating in multiple design modes.
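The MSV saving quoted above follows from the standard dynamic power relation, P_dyn ≈ α·C·V²·f. A quick sketch (capacitance, frequency and voltages are made-up numbers):

```python
# Dynamic power scales with C * V^2 * f (activity factor folded into f here).
def dynamic_power_w(c_farads, v_volts, f_hz):
    return c_farads * v_volts ** 2 * f_hz

p_hi = dynamic_power_w(1e-9, 1.8, 100e6)  # block kept on the 1.8V rail
p_lo = dynamic_power_w(1e-9, 1.2, 100e6)  # same block moved to a 1.2V island

# (1.2/1.8)^2 = 0.444..., i.e. roughly 56% of the dynamic power is saved
print(round(p_lo / p_hi, 3))  # 0.444
```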


FLOW:

Below is the flow for running synthesis. Specific flow scripts will be explained in detail in the sections for DC and RC/Genus. This is a more general explanation.

  1. In init file, specify lib, lef, etc. Set other attr/parameters.
  2. Read RTL, elaborate, and check design.
  3. Set Environment Constraints using SDC file => op_cond (PVT), load (both i/p and o/p), drive (only on i/p), fanout (only on o/p) and WLM. dont_touch, dont_use directives also provided here.
  4. Do Initial synthesis with low effort, since we just need gate netlist from RTL to write our false path file. Write initial netlist.
  5. Set design constraints using SDC file => case_analysis, i/p,o/p delays, clocks/generated clocks, false/multicycle paths. We use case_analysis to set part in func mode (set scan_mode to 0, since we are not interested in timing when part is in scan mode). Strictly speaking, this is not required, but then reports may become difficult to read. So, over here we set scan_mode to 0 to see func paths only. Later during PnR, we run timing separately with scan_mode set to 1, so that we see timing paths during scan_mode. Thus we are covered for both cases of scan_mode.
          IMP: Do NOT force scan_en to 0, as that's a real path and we want to see paths both during scan_capture mode as well as during scan_shift mode. If we force scan_en to 0, then scan_shift paths are removed from analysis altogether. Many of these paths fail hold time, so it's OK in the synthesis flow, but in the PnR flow, we want these paths to be fixed for both setup and hold violations. Since we use the same case_analysis file in PnR, we don't want to set scan_en to 0.
  6. Do Final synthesis with high effort. Report timing, area and other reports. Write Final non-scan netlist.
  7. For SCAN designs, we need to add scan pins, convert flops to scan flops, stitch them, and spit out a scan netlist. Below are the additional steps needed.
    1. set below scan related settings:
            A. set ideal_network attr for scan_en_in pin, so that DC/RC doesn't buffer it. We let PnR tool buffer it.
            B. set false_path from scan_en_in pin (ending at clk of all flops). Otherwise large tr on scan_en_in causes huge setup/hold viol.
            C. set other dft attr. define test protocol, and define scan_clk, async set/reset, SDI, SDO, SCAN_EN and SCAN_MODE.
            D. do dft DRC checking and fix all violations.
    2. Replace regular FF with scan flops, connect chain, do dft DRC checking, print timing, area and other reports. Write Final scan netlist. Synthesize again if needed (not needed since timing is usually met).

NOTE: clock, reset and scan_enable should not be buffered in DC/RC, as that's taken care of in PnR much better since layout is available. However, most of the time, synthesis scripts end up buffering the reset path during synthesis, which is not a good practice.

 

Library cells:

Below are some examples of RTL and their synthesized netlist. This or a very similar netlist would most likely be spit out by any synthesis tool. I generated the netlist using the Synopsys DC tool.

1. Flop:

RTL:
module flop (input Din, input clk, output reg Qout);
always @(posedge clk) Qout<=Din;
endmodule

Synthesized Gate:
module flop (input Din, input clk, output Qout); //NOTE: Qout is no more a reg, it's a wire.
FLOP2x1 Qout_reg (.D(Din), .CLK(clk), .Q(Qout), .QZ()); //name of flop is output port followed by _reg
endmodule

2. Clk Gaters:

RTL:
always @(posedge clk) begin
 if (En) Qout <= Din;
end

Synthesized Gate:
module SNPS_CLOCK_GATE_HIGH_spi_0 ( CLK, EN, ENCLK, TE );
  input CLK, EN, TE;
  output ENCLK;
  CGPx2 latch ( .TE(TE), .CLK(CLK), .EN(EN), .GCLK(ENCLK) );
endmodule

module AAA ( ... );
SNPS_CLOCK_GATE_HIGH_spi_0 clk_gate_Qout_reg ( .CLK(clk), .EN(En), .ENCLK(n38), .TE(n_Logic0) ); //Test Enable tied to 0, since non-scan design
FLOP2x1 Qout_reg ( .D(Din), .CLK(n38), .Q(Qout) );
endmodule

3. Adders:

RTL: Z = A + B; //assume 6 bits

Synthesized Gate:
module add_unsigned_310(A, B, Z);
  input [5:0] A;
  input [6:0] B;
  output [7:0] Z;
  wire [5:0] A;
  wire [6:0] B;
  wire [7:0] Z;
  wire n_0, n_2, n_4, n_6, n_8;
  assign Z[7] = 1'b0;
  FA320 g97(.A (n_8), .B (B[5]), .CI (A[5]), .CO (Z[6]), .S (Z[5])); //Full adder, S = (A EXOR (B EXOR CI)) and CO = (A & B) + (B & CI) + (CI & A), 2X Drive
  FA320 g98(.A (n_6), .B (B[4]), .CI (A[4]), .CO (n_8), .S (Z[4]));
  FA320 g99(.A (n_4), .B (B[3]), .CI (A[3]), .CO (n_6), .S (Z[3]));
  FA320 g100(.A (n_2), .B (B[2]), .CI (A[2]), .CO (n_4), .S (Z[2]));
  FA320 g101(.A (n_0), .B (A[1]), .CI (B[1]), .CO (n_2), .S (Z[1]));
  HA220 g102(.A (A[0]), .B (B[0]), .CO (n_0), .S (Z[0])); //Half adder, S = (A EXOR B), CO = (A & B), 2X Drive
endmodule

module aaa ( ... );
add_unsigned_310 add_115_47(.A (A[5:0]), .B ({1'b0,B[5:0]}), .Z ({UNCONNECTED1,Z[6:0]}));
endmodule
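The HA/FA ripple chain in add_unsigned_310 can be modeled bit by bit (a behavioral Python sketch using the S/CO equations from the comments above) to confirm the structure computes A+B:

```python
def half_adder(a, b):            # HA220: S = A ^ B, CO = A & B
    return a ^ b, a & b

def full_adder(a, b, ci):        # FA320: S = A ^ B ^ CI, CO = majority(A, B, CI)
    return a ^ b ^ ci, (a & b) | (b & ci) | (ci & a)

def ripple_add(a, b, width=6):
    abits = [(a >> i) & 1 for i in range(width)]
    bbits = [(b >> i) & 1 for i in range(width)]
    z = []
    s, carry = half_adder(abits[0], bbits[0])   # g102 handles bit 0
    z.append(s)
    for i in range(1, width):                   # g101..g97 ripple the carry up
        s, carry = full_adder(carry, bbits[i], abits[i])
        z.append(s)
    z.append(carry)                             # final CO becomes Z[6]
    return sum(bit << i for i, bit in enumerate(z))

# exhaustive check over all 6-bit operands
assert all(ripple_add(a, b) == a + b for a in range(64) for b in range(64))
print(ripple_add(63, 63))  # 126
```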

4. Division:

RTL: Y = A / B; assume A[15:0], B[5:0], Y[15:0]

Synthesized Gate: Not yet done. FIXME??

-------------------


Difference in DC(design compiler) vs EDI(encounter digital implementation):
-----------------------
1. many of the cmds work on both DC and EDI. Biggest difference is in the way they show o/p. in all the cmds below, if we use tcl set command to set a variable to o/p of any of these cmds, then in DC it contains the actual object while in EDI, it contains a pointer and not the actual object. We have to do a query_objects in EDI to print the object. DC prints the object by using list.

2. Unix cmds don't work directly in EDI, while they do in DC. So, for EDI, we need to have "exec" tcl cmd before the linux cmd, so that it's interpreted by tcl interpreter within EDI.

3. Many new tcl cmd like "lassign", etc don't work in EDI.

4. NOTE: a script written for EDI will always work for DC as it's written as pure tcl cmds.

Design compiler:
---------------------

Register inference: (https://solvnet.synopsys.com/dow_retrieve/F-2011.06/dcrmo/dcrmo_8.html?otSearchResultSrc=advSearch&otSearchResultNumber=2&otPageNum=1#CIHHGGGG)
-------
On doing elaborate on a RTL, HDL compiler (PRESTO HDLC for DC) reads in a Verilog or VHDL RTL description of the design, and translates the design into a technology-independent representation (GTECH). During this, all "always @" stmt are looked at for each module.  Mem devices are inferred for flops/latches and "case" stmt are analyzed. After that, top level module is linked, all multiple instances are uniqified (so that each instance has unique module defn), clk-gating/scan and other user supplied directives are looked at. Then pass 1 mapping and then opt are done. unused reg, unused ports, unused modules are removed.

#logic level opt: works on opt GTECH netlist. consists of 2 processes:
A. structuring: subfunctions that can be factored out are optimized. Also, intermediate logic structure and variables are added to design
B. Flattening: comb logic paths are converted to 2 level SOP, and all intermediate logic structure and variables are removed.

This generic netlist has following cells:
1. SEQGEN cells for all flops/latches (i/p=clear, preset, clocked_on, data_in, enable, synch_clear, synch_preset, synch_toggle, synch_enable, o/p= next_state, Q)
2A. ADD_UNS_OP for all unsigned adders/counters comb logic(i/p=A,B, o/p=Z). these can be any bit adders/counters. DC breaks large bit adders/counters into small bit (i.e 8 bit counter may be broken into 2 counters: 6 bit and 2 bit). Note that flops are still implemented as SEQGEN. Only the combinatorial logic of this counter/adder (i.e a+b or a+1) is impl as ADD_UNS_OP, o/p of which feeds into flops.
2B. MULT_UNS_OP for unsigned multiplier/adder?
2C. EQ_UNS_OP for checking unsigned equality b/w two set of bits, GEQ_UNS_OP for greater than or equal (i/p=A,B, o/p=Z). i/p may be any no. of bits but o/p is 1 bit.
3. SELECT_OP for Muxes (i/p=data1, data2, ..., datax, control1, control2, ..., controlx, o/p=Z). May be any no. of i/p,o/p.
4. GTECH_NOT(A,Z), GTECH_BUF, GTECH_TBUF, GTECH_AND2/3/4/5/8(A,B,C,..,Z), GTECH_NAND2/3/4/5/8, GTECH_OR2/3/4/5/8, GTECH_NOR2/3/4/5/8, GTECH_XOR2/3/4, GTECH_XNOR2/3/4, GTECH_MUX*, GTECH_OAI/AOI/OA/AO, GTECH_ADD_AB(Half adder: A,B,S,COUT), GTECH_ADD_ABC(Full adder: A,B,C,S,COUT), GTECH_FD*(D FF with clr/set/scan), GTECH_FJK*(JK FF with clr/set/scan), GTECH_LD*(D Latch with clr), GTECH_LSR0(SR latch), GTECH_ISO*(isolation cells), GTECH_ONE/ZERO, for various cells. DesignWare IP (from synopsys) use these cells in their implementation. NOTE: in DC gtech netlist, we commonly see GTECH gates as NOT, BUF, AND, OR, etc. Flops, latches, adders, mux, etc are rep as cells shown in bullets 1-4 above.
5. All directly instantiated lib components in RTL.
6. If we have a DesignWare license, then we also see DesignWare elements in the netlist. All DesignWare elements are rep as DW*. For ex: DW adder is DW01_add (n bit width, where n can be passed as defparam or #). Maybe *_UNS_OP above are DesignWare elements.
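The GTECH cell types in bullets 1-6 can be tied back to RTL constructs. Below is a small hypothetical RTL sketch (module/signal names are made up) annotated with the GTECH cells each construct would typically elaborate to:

```verilog
//hypothetical module: comments show the GTECH cell each construct maps to
module gtech_demo (
  input            clk, sel,
  input      [7:0] a, b,
  output reg [7:0] q
);
  wire [7:0] sum = a + b;         //unsigned add      => ADD_UNS_OP (i/p=A,B, o/p=Z)
  wire       eq  = (a == b);      //unsigned equality => EQ_UNS_OP (8 bit i/p, 1 bit o/p)
  wire [7:0] d   = sel ? sum : a; //mux               => SELECT_OP
  always @(posedge clk)
    q <= eq ? 8'h0 : d;           //8 flops           => 8 SEQGEN cells, named q_reg[7:0]
endmodule
```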

#gate level opt: works on the generic netlist created by logic level opt to produce a technology-specific netlist. consists of 4 processes:
A. mapping: maps gates from tech lib to gtech netlist. tries to meet timing/area goal.
B. Delay opt: fix delay violations introduced during mapping. does not fix design rule or opt rule violations
C. Design rule fixing: fixes Design rule by inserting buffers or resizing cells. If necessary, it can violate opt rules.
D. Opt rule fixing: fixes opt rule, once the above 3 phases are completed. However, it won't fix these, if it introduces delay or design rule violations.
-------

In GTECH, both registers and latches are represented by a SEQGEN cell, which is a technology-independent model of a sequential element as shown in Figure 8-1. SEQGEN cells have all the possible control and data pins that can be present on a sequential element.

FlipFlop or latch is inferred based on which pins are actually present on the SEQGEN cell. Register here means a latch or FF. A D-Latch is inferred when the resulting value of the o/p is not specified under all conditions (as in an incompletely specified IF or CASE stmt). SR latches and master-slave latches can also be inferred. A D-FF is inferred whenever the sensitivity list of an always block or process includes an edge expression (rising/falling edge of a signal). JK FF and Toggle FF can also be inferred.
#_reg is added to the name of the reg from which ff/latch is inferred. (i.e count <= .. implies count_reg as name of the flop/latch)


o/p: Q and QN (for both flop and latch)
i/p:
1. Flop:  clear(asynch_reset), preset(async_preset), next_state(sync data Din),  clocked_on(clk),  data_in(1'b0),           enable(1'b0 or en), synch_clear(1'b0 or sync reset), synch_preset(1'b0 or sync preset), synch_toggle(1'b0 or sync toggle), synch_enable(1'b1)
2. Latch: clear(asynch_reset), preset(async_preset), next_state(1'b0),           clocked_on(1'b0), data_in(async_data Din), enable(clk),       synch_clear(1'b0),                synch_preset(1'b0),                synch_toggle(1'b0),                synch_enable(1'b0)

Ex: Flop in RTL:
always @(posedge clkosc or negedge nreset)
      if (~nreset) Out1 <= 'b0;
      else         Out1 <= Din1;

Flop replaced with SEQGEN in DC netlist: clear is tied to net N35 (derived from nreset). preset=0, since no async preset. data_in=0 since it's not a latch. sync_clear/sync_preset/sync_toggle are also 0. synch_enable=1 means it's a flop, so enable, if used, is sync with the clock. enable=0 as there is no enable in this logic.
 \**SEQGEN**  Out1_reg ( .clear(N35), .preset(1'b0), .next_state(Din1), .clocked_on(clkosc), .data_in(1'b0), .enable(1'b0), .Q(Out1), .synch_clear(1'b0), .synch_preset(1'b0), .synch_toggle(1'b0), .synch_enable(1'b1) );

Ex: Latch in RTL
always @(*)
  if (~nreset)  Out1   <= 'b0;
  else  if(clk) Out1   <= Din1;     
Latch replaced with SEQGEN in DC netlist: all sync_* signals set to 0 since it's a latch. synch_enable=0 as enable is not sync with clk in a latch. enable=clk since it's a latch.
  \**SEQGEN**  Out1_reg ( .clear(N139), .preset(1'b0), .next_state(1'b0), .clocked_on(1'b0), .data_in(Din1), .enable(clk), .Q(Out1), .synch_clear(1'b0), .synch_preset(1'b0), .synch_toggle(1'b0), .synch_enable(1'b0) );

NOTE: a flop SEQGEN has separate enable and clk ports, and synch_enable is set to 1 for a flop (and 0 for a latch). That means lib cells can have Enable and clk integrated into the flop. If we have RTL as shown below, it will generate a warning if there is no flop with integrated enable in the lib.
ex: always @(posedge clk) if (en) Y <= A; //This is a flop with enable signal.
warning by DC: The register 'Y_reg' may not be optimally implemented because of a lack of compatible components with correct clock/enable phase. (OPT-1205). => this will be implemented with Mux and flop as there's no "integrated enable flop" in library.
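When no integrated-enable flop exists in the lib, the mapper effectively builds the recirculating-mux structure below (a sketch of the equivalent logic, not actual DC output):

```verilog
//equivalent of "always @(posedge clk) if (en) Y <= A;" when lib has no enable-flop:
//a mux (SELECT_OP) recirculates Q back to D when en=0, feeding a plain flop (SEQGEN)
always @(posedge clk)
  Y <= en ? A : Y;   //mux on D i/p + regular D-FF
```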

#Set the following variable in HDL Compiler to generate additional information on inferred registers:
set hdlin_report_inferred_modules verbose

Example 8-1   Inference Report for D FF with sync preset control (for a latch, type changes to latch)
==================================================================
| Register Name | Type      | Width | Bus | MB | AR | AS | SR | SS | ST |
==================================================================
| Q_reg         | Flip-flop |   1   |  N  | N  | N  | N  | N  | Y  | N  |
==================================================================
Sequential Cell (Q_reg)
Cell Type: Flip-Flop
Width: 1
Bus: N (since just 1 bit)
Multibit Attribute: N (if it is multi bit ff, i.e each Q_reg[x] is a multi bit reg. in that case, this ff would get mapped to cell in .lib which has ff_bank group)
Clock: CLK (shows name of clk. For -ve edge flop, CLK' is shown as clock)
Async Clear(AR): 0
Async Set(AS): 0
Async Load: 0
Sync Clear(SR): 0
Sync Set(SS): SET (shows name of Sync Set signal)
Sync Toggle(ST): 0
Sync Load: 1

#Flops can have sync reset (there's no concept of sync reset for latches). Design Compiler does not infer synchronous resets for flops by default. It treats the sync reset signal as combo logic, and builds combo logic (an AND gate at the D i/p of the flop) to implement it. To indicate to the tool that it should use an existing flop (with sync reset), use the sync_set_reset Synopsys compiler directive in the Verilog/VHDL source files. HDL Compiler then connects these signals to the synch_clear and synch_preset pins on the SEQGEN, in order to communicate to the mapper that these are the synchronous control signals and that they should be kept as close to the register as possible. If the library has a reg with sync set/reset, then that is mapped; else the tool adds extra logic on the D i/p pin (an AND gate) to mimic this behaviour.
ex:  //synopsys sync_set_reset "SET" => this put in RTL inside the module for DFF. This says that pin SET is sync set pin, and SEQGEN cell with clr/set should be used.
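A minimal sketch of a flop with sync set using this directive (module/pin names SET, D, Q are just for illustration):

```verilog
module dff_sync_set (input clk, SET, D, output reg Q);
  //synopsys sync_set_reset "SET"
  always @(posedge clk)
    if (SET) Q <= 1'b1;  //no edge of SET in sensitivity list => sync set
    else     Q <= D;
endmodule
```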

#Latches and Flops can have async reset. DC is able to infer async reset for a flop (by choosing a SEQGEN cell with async clear and preset connected appr), but for latches, it's not able to do it (it chooses a SEQGEN cell with async clear/preset tied to 0). This is because it sees the clear/preset signal as any other combo signal, and builds combo logic to support it. DC maps the SEQGEN cell (with clr/preset tied to 0) to a normal latch (with no clr/set) in the library, and then adds extra logic to implement async set/reset. It actually adds an AND gate on D with the other pin connected to clr/set, and an inverter on the clr/set pin followed by an OR gate (with the other pin of the OR gate tied to clk). So, basically we lose the advantage of having an async latch in .lib. To indicate to the tool that it should use an existing latch (with async reset), use the async_set_reset Synopsys compiler directive in the Verilog/VHDL source files.
ex: //synopsys async_set_reset "SET" => this says pin SET is async set/reset pin, and SEQGEN cell with clr/set should be used.
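A minimal sketch of a latch with async clear using this directive (names hypothetical); without the directive, DC would build the AND/OR workaround described above:

```verilog
module lat_async_clr (input en, CLR, D, output reg Q);
  //synopsys async_set_reset "CLR"
  always @(*)
    if (CLR)     Q <= 1'b0;  //async clear => maps to latch's clear pin in .lib
    else if (en) Q <= D;     //no final else => Q holds => latch inferred
endmodule
```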


#stats for case stmt: the inference report shows full/parallel for each case stmt. "auto" means the tool determined it to be full/parallel.
A. full case: all possible branches of the case stmt are specified; otherwise a latch is synthesized. Non-full cases happen for state machines when the number of states is not a power of 2. In such cases, unused states are opt as don't care.
B. parallel case: only one branch of the case stmt is active at a time (i.e case items do not overlap). Overlap may happen when case items have "x" in the selection, or when multiple select signals can be active at the same time (case (1'b1) sel_a:out=1; sel_b: out=0;). If more than 1 branch can be active, then priority logic is built (sel_a given priority over sel_b); else a simple mux is synthesized. RTL sim may differ from gate sim for a non-parallel case.
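A sketch of a non-full case (sel assumed 2 bits, names made up): the missing 2'b11 branch means out holds its value, so a latch is inferred; adding a default (or the missing branch) makes it full and yields a pure mux:

```verilog
always @(*)
  case (sel)          //non-full: 2'b11 branch missing => latch inferred on out
    2'b00: out = a;
    2'b01: out = b;
    2'b10: out = c;
    //default: out = a; => uncommenting this makes the case full => combo mux only
  endcase
```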


#The report_design command lists the current default register type specifications (if we used  "set_register_type" directive to set flipflop/latch to something from library) .
dc_shell> report_design
 ...
Flip-Flop Types:
    Default: FFX, FFXHP, FFXLP

#MUX_OPs: listed in report_design. MUXOPs are multiplexers with built in decoders. Faster than SELECT_OPs as SELECT_OPs have decoding logic outside.
ex:
reg [7:0] flipper_ram[255:0]; => 8 bit array of ram from 0 to 255
assign    p1_rd_data_out = flipper_ram[p1_addr_in]; => rd 7 bits out from addr[7:0] of ram. equiv to rd_data[7:0] = ram[addr[7:0] ].
this gives the following statistics for MUX_OPs generated from previous stmt. (MUX_OPs are used to implement indexing into a data variable, using a variable address)

===========================================================
| block name/line  | Inputs | Outputs | # sel inputs | MB |
===========================================================
|  flipper_ram/32  |  256   |    8    |      8       | N  |
===========================================================
=> 8 bit o/p (rd_data), 8 bit select (addr[7:0]), 256 i/p (i/p refers to the distinct i/p terms the mux chooses from, so here there are 256 terms to choose from; the no. of bits in each term is already indicated in o/p (8 bit o/p))

#list_designs: list the names of the designs loaded in memory, all modules are listed here.
#list_designs -show_file : shows the path of all the designs (*.db in main dir)


------------------------

#terminology within Synopsys.  https://solvnet.synopsys.com/dow_retrieve/F-2011.06/dcug/dcug_5.html

#designs => ckt desc using verilog HDL or VHDL. Can be at logic level or gate level. can be flat designs or hier designs. It consists of instances(or cells), nets (connects ports to pins and pins to pins), ports(i/o of design) and pins (i/o of cells within a design). It can contain subdesigns and library cells. A reference is a library component or design that can be used as an element in building a larger circuit. A design can contain multiple occurrences of a reference; each occurrence is an instance. The active design (the design being worked on) is called the current design. Most commands are specific to the current design.

#to list the names of the designs loaded in memory
dc_shell> list_designs
a2d_ctrl                digtop (*)              spi   etc => * shows that digtop is the current design

dc_shell> list_designs -show_file => shows memory file name corresponding to each design name
/db/Hawkeye/design1p0/HDL/Synthesis/digtop/digtop.db
digtop (*)
/db/Hawkeye/design1p0/HDL/Synthesis/digtop/clk_rst_gen.db
clk_rst_gen

#The create_design command creates a new design.
dc_shell> create_design my_design => creates new design but contains no design objects. Use the appropriate create commands (such as create_clock, create_cell, or create_port) to add design objects to the new design.
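A hedged sketch of building up a new design with the create commands mentioned above (design/port names are made up; create_cell would additionally need a lib cell reference):

```
dc_shell> create_design my_design
dc_shell> current_design my_design
dc_shell> create_port -direction in {clk din}
dc_shell> create_port -direction out dout
dc_shell> create_clock -period 10 [get_ports clk]
```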

History of Simulators

Verilog-XL (from Gateway Design) was the 1st and only Verilog simulator available for signoff in the early 1990's. Cadence bought it, but ended support at Verilog-1995. It developed its own compiled code simulator (NC-Verilog). Docs from Cadence still refer to Verilog-XL when talking about NC-Verilog. The modern version of the NCsim family is IES and is recommended for newer projects. However, as of 2018, IES is replaced by an even newer simulator, Xcelium. VCS (Verilog Compiled code Simulator, 1st SystemVerilog simulator) from Synopsys and ModelSim (ModelTech simulator, 1st VHDL simulator) from Mentor Graphics are the other two qualified for ASIC signoff. All 3 support V2001, VHDL-2002 and SV2005. ModelSim is implemented as an interpreter, so it's much slower compared to VCS and NC-Verilog, which are compiled.

Cadence Simulator: Incisive Enterprise Simulator (IES) 9.2, verilog-XL (ncverilog) 9.2 from Cadence was the latest simulator (as of 2019). As of 2021, Xcelium from Cadence is widely used.

Cadence IES simulator:

Cadence Incisive sim (IES) is based on Cadence's interleaved native compiled code arch (INCA, an extension of the native compiled code arch (NCA)). With INCA, we can verify multiple languages (verilog, VHDL, SV, Specman, SystemC, Verilog-AMS, VHDL-AMS, C, C++, SPICE files, etc), multiple levels (behavioral, rtl, gates), multiple paradigms (event driven, cycle based), and mixed signals (digital, analog). It provides the speed of compiled code with the accuracy of event driven simulation (found in interpreted and compiled code tech).
In an NCC simulator, a parser produces an intermediate representation of the input source text. This intermediate representation is then processed by a code generator that produces relocatable machine code that runs directly on the host processor. For example, in a Verilog/VHDL configuration, both the Verilog and VHDL compilers are used to generate code for the Verilog and VHDL portions of the design, respectively. During an elaboration process similar to the linking used in computer programming, the Verilog and VHDL code segments are combined into a single code stream. This single executable is then directly executed by the host processor.
For RTL designs, a min of 64Mb is required while for gate simulation of 150K gates, min of 128Mb mem reqd.

Simulator supports IEEE 1364-2001 std for verilog, OVI 2.0, and verilog-XL. SystemVerilog extensions to verilog as defined in IEEE P1800 std are also implemented. We use the compiler (ncvlog) and then the elaborator (ncelab), which are integrated into IES. When we compile and elaborate a design, all internal rep of cells and views reqd by the simulator are contained in a single file stored in the lib dir. The compiler will automatically create a default work library called worklib in a directory called INCA_libs, which is under the current directory. All design units are compiled into this library.

Cadence Xcelium simulator:

Early simulators processed verilog code in a single thread, managing a single active queue of events. This serial method resulted in significant run time. Xcelium simulator is basically the same as IES, except that it can be run in single core or multi core configuration. Multi core configuration can shorten runtime considerably, by breaking RTL/gate designs into indep parts, and simulating these parts using independent threads on parallel processors. Xcelium partitions the design into accelerated (ACC) and non accelerated (NACC) regions. The ACC region contains the RTL/gate design, which can be run as parallel threads, while the NACC region contains behavioural portions such as testbench, behavioural (model) memories, etc, which are run by the single core engine. This multi core engine compiler is invoked by passing option "-mcebuild". The compiler will automatically create a default work library called worklib in a directory called xcelium.d, which is under the current directory. All design units are compiled into this library, as well as other libs explained later.

Example of simple design, testbench and testcase:

//simple verilog code that will compile and run: tb.v. To run it, use cmd: irun tb.v
module tb();
 int a;
 initial begin
   $display("a=%d",a);
   //$finish; => this not needed as there's only this file with initial, so nothing is running forever
 end
endmodule

//to run a simple module, create a tb, and change signals at module i/p pins using initial block.
// To run it, use cmd: irun tb.v Top_module.v +access+r -timescale 1ns/1ps => access option needed so that waveforms can be dumped.
module tb(); //brackets () are optional
 int a;
 reg b,c; //reg needed as wire can't be assigned in always blocks
 
 Top_module I_top (.IN1(b), .IN2(c)); //top module connections => preferred way
 //assign Top_module.IN1 = b; assign Top_module.IN1 = c; => instead of instantiating Top_module as in above line, we can also directly connect pins to nets. NOTE: since IN1,IN2 are nets, "always *" won't work, since it needs regs. so, we use assign.
 initial begin //to apply i/p stimuli and to end sim. Usually this whole block is placed in tc_1.v file, so that we can apply diff stimuli for each testcase
   #100 b=1'b1; #200 c=1'b0;
   $display("b=%d, c=%d",b,c);
   $finish; //this should be the last stmt, as after this stmt the tool exits
 end

//dump waveform in vcd format. To dump fsdb (novas proprietary format, but used by almost all vendors), we need other system task defined later.
 initial begin //to dump vcd files for all modules. Does not matter in which module it's placed, it still dumps for all modules.
   $dumpfile("tmp.vcd"); //NOTE: $dumpfile must be called before $dumpvars
   $dumpvars;
   $dumpoff;
   #3150us; //dump vcd starting from 3150us
   $dumpon;
   #600us; //end dump at 3750us
   $dumpoff;
 end

initial begin //other way to dump
   $dumpfile("/sim/ACE/.../tmp.vcd"); //file name must be set before $dumpvars
   #1000; //start of dump
   $dumpvars;
   #2000;
   $dumpflush; //end of dump
end

endmodule


Running simulator: 2 ways.

  1. Multi-step: First compile (different compilers for diff src files), then elaborate then run simulator. Here all these steps are run separately. Not recommended.
    1. Compiler: We have different compilers for VHDL and Verilog. ncvhdl is VHDL compiler, while ncvlog is Verilog compiler.
      • ncvhdl cmd: ncvhdl vhdl_src_files => ncvhdl is VHDL compiler. run ncvhdl -help to get other options
        • ex: ncvhdl -V200X -messages -smartorder a.vhd b.vhd => enables V1993 and V2001 features (use -V93 to enable only VHDL 1993 features), print informative msg, and compile in order independent mode
      • ncvlog cmd: ncvlog verilog_src_files => analyzes and compiles verilog src. performs syntax check on HDL design and generates intermediate representation, in lib database file called inca.architecture.lib_version.pak (architecture=lnx86)
    2. Elaborator: Elaborates the design. ncelab is the elaborator provided by Cadence that elaborates the design compiled by compiler above.
      • ncelab cmd:  ncelab top_level_design_unit => the elaborator takes the lib cell:view of the top level as i/p, constructs the design hier, establishes connectivity, and computes the initial values for all of the objects in the design. It creates m/c code and a snapshot where the access level is no rd, wrt or connectivity access to simulation objects. That means we won't be able to probe these objects outside of HDL, which is OK in regression mode, but we need to set it to rd access in debug mode.
    3. Simulator: Simulates the design using the test case or patterns provided.
      • ncsim cmd: ncsim snapshot_name => The simulator loads the snapshot generated by the elaborator, as well as other objects that the compiler and elaborator generate that are referenced by the snapshot. The simulator may also load HDL source files, script files, and other data files as needed.
        • ex: ncsim -run worklib.top:module => NOTE: Using -gui option with ncsim starts simVision. That brings up Design browser and Console. Then we can run ncsim cmds on the Console.
  2. Single step: Here all the steps from above are run as part of one cmd. This is much more convenient. There are 3 different variants, depending on the simulator you have from Cadence: either use ncverilog, or use irun/xrun (irun for IES, xrun for Xcelium). NC-Verilog is run in single step by using ncverilog on the cmd line. irun/xrun is very similar to ncverilog, but in addition to verilog/system verilog, it can also accept vhdl, systemC, AMS, etc. Since irun/xrun run all steps, they have a lot more options, each specific to the tool being invoked. So, refer to those tools (i.e ncelab, xmsim, etc) for the specific options supported. irun/xrun options are not case dependent (i.e -nolog same as -NoLoG). Also, short versions of cmd line options are allowed (i.e -nowarn same as -now; options vary in the min num of chars required to be recognized in their short form).
    1. ncverilog: ncverilog does what multi step simulation does by invoking ncvlog, ncelab and ncsim for you. It lets us run the NC-Verilog simulator exactly the same way that we ran Verilog-XL (verilog-XL was run using cmd "verilog" on the cmd line). All cmd line args are the same as those of verilog-XL. On top of this, ncverilog also allows us to include ncvlog, ncelab and ncsim options on the cmd line in the form of + options. It also supports many more + options than verilog-XL.
    2. irun: It's for use with IES simulator. specifies all files on single cmd line. In ex below, top.v and sub.v are compiled by ncvlog using option -ieee1364, middle.vhdl is compiled by ncvhdl using option -v93, verify.e is recognized as specman e file and compiled using sn_compile.sh. After compiling all these, ncelab elaborates design using -access +r option (to provide rd access to simulation object, else in vcd/fsdb dump file, we won't see all wires,reg,etc) and generates sim snapshot. ncsim is then invoked with both SimVision (comprehensive debug env which includes design browser, waveform viewer, src code browser, signal flow browser,etc) and Specview gui.
      • ex: irun -ieee1364 -v93 +access+r +neg_tchk -gui verify.e top.v middle.vhd sub.v
      • ex: irun a.v b.v top.v tb.v => simplest cmd to run all rtl and tb files
    3. xrun: very similar to irun. It's for use with Xcelium simulator. However, compilers here are xmvlog, xmvhdl, sn_compile.sh. xmelab elaborates design, while xmsim simulates the design (xm means xcelium, while nc meant ncverilog which was used earlier in IES). xrun uses xmsc_run compiler i/f to compile c/c++ files. These compiled files, along with any other object files provided on cmd line, are then linked into single dynamic library, that is then automatically loaded before elaboration of design.
      • ex: xrun -ieee1364 -v93 +access+r +neg_tchk -gui verify.e top.v middle.vhd sub.v => NOTE: how all args are same as those of irun

 

Sequence of steps when running the Simulator:


NOTE: Both irun and ncverilog finally run ncsim which runs simulation cmds. Using -gui option brings up SimVision on ncsim cmd prompt. When running irun/ncverilog, this is what appears on screen:


1. ncvlog/ncvhdl: analyzes and compiles each source file. => done only when any file changes, else it's skipped
ex:     file: ../models/CFILTER.v
        module worklib.CFILTER:v
                errors: 0, warnings: 0


2. ncelab: elaborates files and constructs design hier from top level design units. It auto figures out top level design units based on if they are referenced elsewhere. Usually digtop_tb and testcase_name_tc are top level design units as they aren't referenced anywhere else. Then it generates native compiled code for each module and then provides design hier summary.  It finally writes the simulation snapshot, which is a file that has all info for sim to run on it (w/o needing any info from anywhere else). elaboration step is run only when any file changes, else it's skipped
ex:   Elaborating the design hierarchy:
        Top level design units:
                digtop_tb
                S1_main_hunt_tc
        Building instance overlay tables: .................... Done
        Generating native compiled code:
                S1.AFE_AGC_S1:v <0x17bc2126>
                        streams:  28, words: 11022 < and so on for each module ....>
        Building instance specific data structures.   
        Loading native compiled code:     .................... Done
        Design hierarchy summary:   
                             Instances  Unique
                Modules:         1       1     
                Registers:       3       3  
                Initial blocks:  1       1
        Writing initial simulation snapshot: worklib.tb:sv   
Loading snapshot worklib.tb:sv .................... Done        
     
3. ncsim: loads the snapshot generated above and runs it. The ncsim prompt appears. It first sources the ncsimrc file (this file sets up aliases for ncsim cmds). Then it issues the "run" cmd, and then, on encountering $finish in any module, or on reaching the end of all "initial" blocks with no "always" or other infinite loops running, it issues the "exit" cmd to exit ncsim.
ex: ncsim> source /apps/cds/incisiv/12.20.018p2/tools/inca/files/ncsimrc => this file aliases run as "." and exit as "quit", so that . will also work instead of run, and quit will also work instead of exit.
    ncsim> run .... (displays stmt which have $display ...)
    ncsim> exit

----------------

NOTE: In verilog-XL (ncverilog) and irun, many options of ncvlog, ncelab and ncsim which are preceded by "-" are replaced by "+".
ex: ncvlog -define arg1 => in ncverilog/irun, it's irun +define+arg1

Help:
>irun -helphelp
>irun -helpall

NOTE: to get help on any error that we see on running irun, we can type this:
Ex: error ncelab: *E,CUVRFA: blah ... shows up. To get more info type: nchelp ncelab CUVRFA
Ex: If error happened in ncvlog, type: nchelp ncvlog CUVRFA

 


 

RTL and Gate Simulation setup:

Dir: /db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/
3 subdir:
--------
tb: testbench dir. It has top level tb file (digtop_tb.v). digtop_tb.v defines a top level module digtop_tb, includes file all_tasks.v & xfilter.v, does initial begin .. end, and then instantiates module digtop and calls this dut, and connects all tb_* signals to appropriate digtop pins.

tc: testcase dir. It has test cases for different tests. i.e for interrupt block, it has interrupt_tc.v. Remember, any signal that you specify in tc should be an i/o port of a module or block, as internal net names may get renamed in gate synthesis, so even though the testcase may run on RTL, it'll fail to run on gate netlist.

sims: This is the main dir to run gatesims or RTL sims.

RTL:
-----
Build RTL dir:

run_rtl_sims (verilog) => script to run verilog RTL sims
----------------------
#we need to be able to run Debussy to debug, so we provide a link to the compiled lib provided by Debussy (the PLI app from Debussy has already been compiled into a dynamic shared lib, as is the case here) for bootstrap dynamic linking. The user defined bootstrap fn can then be accessed using load* options (loadpli1, loadvpi, etc) in irun (or the NC simulator). This PLI defines functions such as $fsdbDumpvars and $fsdbDumpfile, which are needed for dumping fsdb files (note: functions for vcd dump don't require this PLI, since they are supported by default by all simulators).
#for linux OS
set DEBUSSY_PLI     = "+loadpli1=/apps/novas/debussy/5.2_v21/share/PLI/nc_xl/LINUX/xl_shared/libpli.so:deb_PLIPtr"
#for SOLARIS OS
#set DEBUSSY_PLI     = "+loadpli1=/apps/novas/debussy/5.2_v20/share/PLI/nc_xl/SOL2/xl_shared/libpli.so:deb_PLIPtr"

irun -9.20.039-ius \ => specifying the version of irun is optional; a default is chosen if nothing is specified (running "irun -version" returns the version of irun being used)
$DEBUSSY_PLI \ => loads debussy PLI
-y /db/pdk/lbc8/rev1/diglib/pml30/r2.5.0/verilog/models \ => -y for dir. All gate verilog included incase we've any stdcells instantiated in RTL (usually clk gaters and mux/logic on clk/reset are hard instantiated)
#+incdir+../../tb/ \ => incdir option is used when we have `include "file1" in some other verilog file2. Then we have to include the whole dir where file1 resides, else while compiling file2, we'll get an error about file1 not found. We don't need to compile file1, as `include causes file1 contents to be included in file2. Note that if we try to compile file1 by itself, it may not compile, as any verilog file to be compiled needs proper syntax (i.e the file should have "module", "endmodule", etc. Many times such include files just have some verilog stmts, which is fine as these are just included in the main file2 which already has module etc).
-y /db/Hawkeye/design1p0/HDL/Source/golden \ => instead of this, we could also use "-f rtl_files.f" which would have paths for each RTL file to be included
/db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/tb/digtop_tb.v \
/db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/tc/$argv[1]_tc.v \
-coverage ALL -covdut digtop -covoverwrite -covworkdir ./coverage/cov_$1 => puts coverage results in dir "./coverage/cov_$1/". Says the top level dut used for coverage should be the "digtop" instance (we can also limit coverage to a particular sub-module by using the hier path for that instance (not the defn of the module, but an instance of it)). It generates binary coverage data files (UCD) and coverage model files (UCM). Coverage types can be code (block, expr, fsm, toggle) or functional (assertion, covergroup). "ALL" enables all coverage types listed (B=>Block, E=>Expression, F=>FSM, T=>Toggle, U=>fUnctional, A=>All. ex: we can write "-coverage BEFT" to enable all code coverage).
#NOTE: instead of using coverage cmds, we can also pass a .ccf cfg file which can have all cmds in there. i.e -covfile config.ccf. sample coverage.ccf file
select_coverage -all -module * => selects all coverage
set_libcell_scoring => IMP: sometimes we get no coverage results. Reason is coverage stops at libcells. Sometimes all modules treated as libcells whenever irun calls source dir with -y option (-y option is usually used with libcell dir). So, this "set_libcell_scoring" option forces coverage to be reported for all libcells too.

-l ./rtl_logs/$argv[1].log \ => -l (small letter L) is to specify logfile instead of default irun.log. We can also use /$1.log (as $1 and $argv[1] are same)
+nclibdirname+"$argv[1]_INCA_libs" \
+access+r \ => rd access so that all wires,reg etc can be accessed in vcd/fsdb files
+libext+.v \ => specifies extension of files referenced by -y option (+libext+extension). If this option is not used, then files referenced by -y should not have a file extension, else they will be ignored (very imp to use this with -y)
+licq \
#+sv \ => with -sv option, all verilog type files are compiled as SystemVerilog.
+notimingchecks \ => do not execute timing checks for $setup, $recrem, etc
-input dump.tcl \ => optional. needed for shm db dump. see in simvision section below for more details
+define+TI_functiononly \
+define+FSDBFILE=\\\"/sim/HAWKEYE_DS/kagrawal/digtop/rtl/$argv[1].fsdb\\\" \ => important to have \\\ before "
+define+FSDB \
+define+IMGFILE=\\\"/sim/.../a.img\\\" \ => this can be used in tb.v file or any other verilog file, to assign value from cmdline. i.e
 `ifdef IMGFILE defparam tb.block1.PRELOADFILE=`IMGFILE; `endif
-svseed random \ => assigns random seed to all $urandom fn
+nctimescale+1ns/1ps => default timescale to use if no timescale defined anywhere

#-work: by default, irun compiles all design units in HDL files in work library called worklib (located within INCA_libs dir). We can change work lib name by using -work.
#dir structure is:
INCA_libs/irun.nc/xllibs/models,golden => for models dir, golden dir, etc specified with -y above stored in xllibs
INCA_libs/worklib/.inca*db, inca*pak   => contains all compiled units as one file in .pak lib database. within worklib dir, we have subdir for std,ieee,worklib,synopsys,etc which have their own .pak database.

#-linedebug: to get debugging info

run_rtl_sims (mixed: tb is in verilog but src files are in vhdl/verilog)
----------------------
remains same as above (i.e same as running verilog rtl sims)
The only difference is that novas fsdb dump doesn't work on vhdl src files (i.e it only shows signals for verilog files in waveform, but not for vhdl files). Option is to dump vcd file, as vcd file will always have all signals. Other option is to set DEBUSSY_PLI to newer version of novas (in run_rtl_sims file) as follows: (doesn't seem to work ?)
DEBUSSY_PLI     = "+loadpli1=/apps/novas/debussy/2010.04/share/PLI/IUS/LINUX/boot/debpli:novas_pli_boot"

run vhdl rtl sims: /db/MOTGEMINI_DS/design1p0/HDL/Testbenches/digtop/sims/run_rtl_sims
-----------------
#for fsdb dump
In tb/tb_spi.vhd file, put "use WORK.novas.all;" at the top before entity declaration, and also add this directive:
process
begin
`ifdef FSDB
        fsdbDumpvars(0,":");
        fsdbDumpfile("test.fsdb");
`endif
end process;

#above code, always dumps fsdb file as dump.fsdb in current dir. So, we can instead run this to dump into specific file:
#create file nc.do and then call this file from irun cmd line by adding this option: -input nc.do \
call fsdbDumpfile /sim/HAWKEYE_DS/kagrawal/digtop/rtl/SPI.fsdb
call fsdbDumpvars 0 :
run => if we don't add this line, then ncsim stops at cmd prompt, and we have to type run on the prompt to continue

-----------
#run_rtl_sims (vhdl):
#LD_LIBRARY_PATH needs to be set
#solaris
#setenv LD_LIBRARY_PATH /apps/novas/debussy/5.2_v21/share/PLI/nc_vhdl/SOL2:$LD_LIBRARY_PATH
#linux
setenv LD_LIBRARY_PATH /apps/novas/debussy/5.2_v21/share/PLI/nc_vhdl/LINUX:$LD_LIBRARY_PATH

#note, here we specified debussy_pli with path separately defined above, while for verilog, it was all in one line.
set DEBUSSY_PLI     = "-loadpli1 debpli:novas_pli_boot"
#we may also add -loadcfc option above, to get rid of some system errors:
#set DEBUSSY_PLI     = "-loadpli1 debpli:novas_pli_boot -loadcfc debcfc:novas_cfc_boot"

#irun (same as for verilog, except -top,relax,V93 options used)
irun \
$DEBUSSY_PLI \
-y /db/pdk/lbc7/rev1/diglib/msl270/r3.0.0/verilog/models \
/apps/novas/debussy/5.2_v21/share/PLI/nc_vhdl/LINUX/novas.vhd \
#/apps/novas/debussy/2011.01/share/PLI/IUS/LINUX/boot/novas.vhd \
/db/MOTGEMINI_DS/design1p0/HDL/Source/spi_typedefs.vhd \
/db/MOTGEMINI_DS/design1p0/HDL/Source/spi_control.vhd \
/db/MOTGEMINI_DS/design1p0/HDL/Source/spi_regs.vhd \
/db/MOTGEMINI_DS/design1p0/HDL/Source/spi.vhd \
/db/MOTGEMINI_DS/design1p0/HDL/Testbenches/digtop/tb/tb_spi.vhd \
-top E \ => for vhdl, top entity has to be declared (this top entity is in tb/tb_spi.vhd)
-relax \ => to relax strict vhdl requirements
-V93 \ => since our vhdl is 1993 format
-input nc.do \ => use this, if we call the fsdb cmds in nc.do instead of the fsdb cmds in tb_spi.vhd
-l ./rtl_logs/$argv[1].log  ... => other options same as those for verilog

-------
#for vhdl and SystemC files, you have to specify the top level with the -top option, as the simulator does not automatically determine top-level VHDL/SystemC design units. However, with this option, autodetection of top-level verilog modules is disabled. (-vhdltop and -sctop specify the VHDL top level and SC top level, but don't disable auto detection of verilog top-level units)
-top [lib].cell[:view] => specifies top level unit, can use multiple -top options to specify multiple top-level units
Ex: -top E \ => entity E is defined in top level testbench file tb_spi.vhd, which calls top level source entity spi.

#for vhdl, the IEEE 1076 standard does not allow multiple choices (i.e. 0=>'1', OTHERS=>'0') in an array aggregate that is not locally static (i.e. VECTOR(size-1 downto 0) has a variable range). If you make the range of the array static (e.g. VECTOR(3 downto 0)) or provide only one choice (e.g. OTHERS=>'0'), then the code will compile correctly. Cadence has adjusted ncvhdl with a switch named '-relax' which relaxes a variety of LRM rules, and allows such code to compile.
-relax \
#we can also use option -V93 to force irun to compile with VHDL93 syntax.
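A minimal sketch of the aggregate rule described above (entity/signal names are made up for illustration; 'size' is assumed to be a generic):

```vhdl
architecture rtl of e is
  signal v : std_logic_vector(size-1 downto 0);  -- range not locally static ('size' is a generic)
  signal w : std_logic_vector(3 downto 0);       -- locally static range
begin
  -- v <= (0 => '1', others => '0');  -- rejected under strict VHDL-93; compiles with -relax
  v <= (others => '0');               -- single choice: always legal
  w <= (0 => '1', others => '0');     -- multiple choices OK here: range is locally static
end architecture;
```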

GATE:
----
gate sims run on the gate level netlist, which has all nets as "wire". If a net is an i/o port of a module, it has to be connected through a "wire" at a higher level to another i/o port of some module, or to an i/o port of the top level module. All these "wire" nets have parasitics associated with them in the spef file, and hence delays associated with them in the sdf file. Some nets appear as "wire", but during optimization they are not used for connections (e.g. instead of the Q pin of a flop, the QZ pin is sometimes used, which leaves the net associated with the Q pin floating). Such nets, even though listed as "wire", don't have any parasitics and are reported as "unannotated nets" during sdf file generation (in PT).

We do timing checks when running gate sims. This may cause non-convergence in simulator for cases where there are -ve setup/hold times or -ve rec/rem values in sdf file. see in verilog.txt.

--------------------------------
GateSim (for verilog testbench):
--------------------------------
For gatesims, we do xfiltering for meta flops, and we do sdf annotation for all nets/cells. We add this in digtop_tb.v in b/w "module ... endmodule", whenever SDF_MAX or SDF_MIN is defined.
digtop_tb.v:
1A. xfilter: include "../tb/xfilter.v" => In this file, we define the Xon parameter for all meta flops to be 0. On doing this, the setup/hold check is turned off for that flop, so that we don't see these warnings: "Warning!  Timing violation $setuphold<setup> ( posedge CLKIN:65071 PS, posedge EN:65077 PS,  0.248 : 248 PS,  0.041 : 41 PS );... Time: 6548 PS" for that flop. Here, the numbers shown are a setup of 248ps (min:typ:max) and a hold of 41ps (min:typ:max). When only 2 values are shown instead of a triplet, that means the sdf file had only 2 values. Here CLK and EN come within 6ps (65077-65071), causing a viol.
ex: defparam testbench.Idigtop.Ideglitch.mota_itrip_deg.sig_meta_reg.Xon = 0; => This Xon parameter = 1 in the model of the flop (in the ifdef TI_verilog section of the DTCD2.v flop). So, by default X is propagated, but if we set Xon=0, then X is not propagated. The X value in that meta flop is forced to whatever the RTL is modeling. That means whatever is the i/p of the flop right before the clk edge is passed. If the i/p changes right on the clk edge, then the coding sequence determines which happens first, the i/p change or the clk edge. If we don't set Xon=0, then X's will eventually get propagated to all logic, and all our test cases will fail. By setting Xon=0, we force the o/p of the flop to always be 0 or 1.
Next, in the filtered_logs dir, we copy all log files from the gate_logs dir, and search for any "Warning" msg using the filter_warnings.pl script. We should not see any warnings, as meta flops are the only ones that should have setup/hold viols. Any other viol is real, and should be fixed in the design. Since we were timing clean, we should investigate if we had mistakenly set that path to a false path in PT/ETS.
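The actual filtering is done by filter_warnings.pl; a rough Python sketch of the same idea (the meta-flop instance names below are made-up examples, not the real design's) could look like:

```python
# instance names of known meta (synchronizer) flops whose setup/hold
# violations are expected and safe to ignore -- made-up examples
META_FLOPS = ["sig_meta_reg", "genblk1_S_sync1"]

def filter_warnings(log_lines):
    """Return timing-violation warnings that are NOT from known meta flops."""
    real = []
    for line in log_lines:
        if "Timing violation" not in line:
            continue
        if any(flop in line for flop in META_FLOPS):
            continue  # expected viol on a synchronizer flop
        real.append(line)
    return real

log = [
    "Warning!  Timing violation $setuphold ( posedge CLK ... ) sig_meta_reg",
    "Warning!  Timing violation $setuphold ( posedge CLK ... ) data_out_reg",
    "ncsim: info: simulation complete",
]
print(filter_warnings(log))
```

Any warning that survives this filter points at a real viol that needs to be investigated in the design or in the PT/ETS constraints.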

1B. instead of the xfilter.v file, we can also turn off timing checks by using the tcheck cmd, specified via a tfile on the irun cmd line. (valid for irun versions 14.2 or later)
    +nctfile+gate.tfile => arg to irun (no space in b/w "+")
   ex: In gate.tfile, we put 1st sync flop for all synchronizers to be filtered out for x propagation. This also prevents tool from generating "Warning! Timing violation $setuphold ...". option 1A above may still generate warnings depending on library model written.
       PATH tb_digtop.dut.sync_*.genblk1_S_sync1 -tcheck => turns off timing check for flop genblk1_S_sync1. Not sure, if it turns off all timing checks or just setup/hold.
   NOTE: if running an older version of irun, the tool doesn't pick up these tchecks and will throw this warning: "ncelab: *W,TFANOTU (gate.tfile) tfile node ... was not used by design". This means the tool discarded the tcheck, due to the old version, etc.

1C. we can also provide timing check file via "-input tcheck_off.tcl", which will have " tcheck -off" cmd for 1st stage of all sync flops.
ex: tcheck -off veridian_tb...i_sync_flops.u_sync.tiboxv_sync_2s_acn_sync_0

2. sdf_annotation: $sdf_annotate( .... ) for both max/min. see in sdf annotator section below.

Dir: /db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/gatesims
run_gate_sims_max => script to run gatesims for max delay
----------------------
#same as run_rtl_sims except netlist is gate level, neg_tchk, max_delays, define+TI_verilog used

set DEBUSSY_PLI     = "+loadpli1=/apps/novas/debussy/5.2_v21/share/PLI/nc_xl/LINUX/xl_shared/libpli.so:deb_PLIPtr"
irun \
$DEBUSSY_PLI \
-y /db/pdk/lbc8/rev1/diglib/pml30/r2.5.0/verilog/models \
../../../Source/global.v \
../../../FinalFiles/digtop/digtop_final_route.v \ => gate netlist
../tb/digtop_tb.v \
../tc/$argv[1]_tc.v \
-l ./gate_logs/$argv[1]_max.log \
+nclibdirname+"$argv[1]_INCA_libs" \
+access+r \
+libext+.v \
+licq \
+sv \
+neg_tchk \  => allows neg values in $setuphold and $recovery timing checks in the Verilog description and in SETUPHOLD and RECREM timing checks in SDF annotation. This is needed because, by default, tools zero out -ve timing check numbers, as they may not converge and can cause large performance issues. see in verilog.txt for more info on -ve timing checks.
+max_delays \ => Apply the maximum delay value if a timing triplet in the form min:typ:max is provided in the Verilog description or in the SDF annotation.
-input dump_gate.tcl => optional. same format as for rtl sims.
-SDF_CMD_FILE sdf_max.cmd => optional. see sdf section below for details.
+nctfile+gate.tfile => optional. turns off timing checks for specified gates. see above section for details.
+define+TI_verilog \ => TI_verilog uses models with delays
+define+FSDB \
+define+FSDBFILE=\\\"/sim/NOZOMI_NEXT_OA/kagrawal/digtop/gate/$argv[1]_max.fsdb\\\" \
#+define+VCD \
+define+VCDFILE=\\\"/sim/NOZOMI_NEXT_OA/kagrawal/digtop/gate/$argv[1]_max.vcd\\\" \
+define+SDF_MAX \ => SDF_MAX annotation used in top level module
+nowarnCUVWSP \
+nctimescale+1ns/1ps
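The min:typ:max triplet selection that +max_delays/+min_delays perform can be sketched as follows (illustration only, not the simulator's actual code):

```python
def pick_delay(triplet, mode="max"):
    """Pick one value from an SDF/Verilog 'min:typ:max' delay triplet.

    A 2-value entry (as sometimes seen in sdf files) is treated as min:max.
    """
    vals = [float(v) for v in triplet.split(":")]
    if mode == "min":
        return vals[0]          # what +min_delays selects
    if mode == "max":
        return vals[-1]         # what +max_delays selects
    return vals[len(vals) // 2]  # typ

print(pick_delay("0.248:0.260:0.270", "max"))  # -> 0.27
print(pick_delay("0.248:0.260:0.270", "min"))  # -> 0.248
```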

run_gate_sims_min => script to run gatesims for min delay.
----------------------
same as for max, except +min_delays, +define+SDF_MIN used.

NOTE: after running gatesims with sdf_annotate, look in sdf_max.log to make sure it has no errors or warnings. Else, sdf is not correctly annotated.

----------------------------
GateSim (for vhdl testbench):
----------------------------
Dir: /db/MOTGEMINI_DS/design1p0/HDL/Testbenches/digtop/kagrawal/gatesims
run_gate_sims_max => script to run gatesims for max delay

#sdf compiled file generation (see below in sdf annotation)
ncsdfc /db/MOTGEMINI_DS/design1p0/HDL/FinalFiles/digtop/digtop_max.pt.sdf -output ./digtop_max.pt.sdf.X

setenv LD_LIBRARY_PATH /apps/novas/debussy/5.2_v21/share/PLI/nc_vhdl/LINUX:$LD_LIBRARY_PATH
set DEBUSSY_PLI     = "-loadpli1 debpli:novas_pli_boot -loadcfc debcfc:novas_cfc_boot"

irun \
$DEBUSSY_PLI \
-y /db/pdk/lbc8/rev1/diglib/pml30/r2.5.0/verilog/models \
/apps/novas/debussy/5.2_v21/share/PLI/nc_vhdl/LINUX/novas.vhd \
/db/Hawkeye/design1p0/HDL/Source/golden/global.v \
/db/Hawkeye/design1p0/HDL/FinalFiles/digtop/digtop_final_route.v \ => gate netlist
/db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/tb/digtop_tb.v \
/db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/tc/$argv[1]_tc.v \
-l ./gate_logs/$argv[1]_max.log \
-input nc_max.do \ => look above in rtl sim for vhdl (it calls fsdb dump functions)
+nclibdirname+"$argv[1]_INCA_libs" \
+access+r \
+libext+.v \
+licq \
#+sv \
+neg_tchk \ =>allows neg values in $setuphold and $recovery timing checks in the Verilog description and in SETUPHOLD and RECREM timing checks in SDF annotation.
+max_delays \ => Apply the maximum delay value if a timing triplet in the form min:typ:max is provided in the Verilog
description or in the SDF annotation.
+define+TI_verilog \ => TI_verilog uses models with delays
+define+FSDB \
+define+FSDBFILE=\\\"/sim/HAWKEYE_DS/kagrawal/digtop/gate/$argv[1]_max.fsdb\\\" \
+define+GATE \
+define+SDF_MAX \ => SDF_MAX annotation used in top level module
+nowarnCUVWSP \
+nctimescale+1ns/1ps

run_gate_sims_min => script to run gatesims for min delay.
----------------------
same as for max, except +min_delays, +define+SDF_MIN used.

 




Waveform viewer and debugging system:

Many waveform viewers are available to view the results of simulation. Some popular ones are below:

  1. SimVision from Cadence: comprehensive debug env which includes design browser, waveform viewer, src code browser, signal flow browser, etc. It uses the *.shm waveform database to store waveforms. Expensive license ($50K)
  2. Debussy from Novas (purchased by SpringSoft in 2008): The Knowledge-Based Debugging System. Debussy is cheaper ($5K), but its superset Verdi, a behaviour-based debugger, is generally used. It uses fsdb and vcd waveform databases. All the cmds of Debussy are valid in Verdi. Debussy is invoked by typing: debussy -f <cmd_file>. We use debussy, Release 2008.10, Linux x86_64/64bit. (though it says it's using the verdi 2008.10 version with 64 bits).
  3. Verdi from Novas (Verdi was a product of Novas, but was purchased by Synopsys): Verdi is a superset of Debussy, costs more but has a lot more features. invoked by typing: verdi -f <cmd_file>. Verdi is the recommended tool to use (instead of debussy).

All these waveform viewers need the waveform in some format to display it. The two most common waveform formats supported are below.

  1. VCD: (value change dump), an ASCII format for waveform dumpfiles. defined by IEEE std 1364-2001 and supports the 6-value VCD format (orig 4-valued logic: 0,1,Z,X, with signal strength and direction added later). widely used. The VCD file comprises a header section with date, simulator, and timescale information; a variable definition section; and a value change section, in that order.
  2. FSDB: (fast signal database), which is Novas' proprietary waveform dump format.  It is much more compressed than the standard VCD format generated by most simulators.  Novas provides a set of object files (using +loadpli) that link with all common commercial simulators to generate an FSDB file directly.
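A minimal hand-written VCD file showing the three sections in order (header, variable definitions, value changes); the module/signal names here are made up:

```
$date  November 11, 2012  $end
$timescale 1ns $end
$scope module digtop_tb $end
$var wire 1 ! clk $end
$var wire 4 " count $end
$upscope $end
$enddefinitions $end
#0
0!
b0000 "
#5
1!
b0001 "
```

Each `$var` line assigns a short identifier code (here `!` and `"`) to a signal; the value change section then records time stamps (`#0`, `#5`) followed by the new value of each signal that changed.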



SimVision:
--------
Using -gui option with ncsim or irun/ncverilog brings up SimVision.
> irun -gui -f run.f -access RWC -linedebug (add "-uvmlinedebug" if running with uvm)
NOTE: "-access +r" or "-access RWC" is needed, else waveform dump won't show any signals (as they don't have read permission, r=read, w=write, c=connectivity to help with x propagation). Also, ncsim cmds for dumping waveform into cadence database (waves.shm) is needed in input script or on ncsim prompt. See below for details.

We can also directly type simvision to bring up simvision. We can then open "waves.shm" database.
simvision &
simvision -waves waves.shm -input digtop.svcf & => This will open up waves.shm database, with signal file digtop.svcf (similar to rc file in nWave). We can do "File->Source command script" to load svcf file or "save command script" to save svcf file.
 
SimVision has a Design Browser and a Console.
1. Design Browser-simvision: It allows you to browse the design. It shows modules, RTL, etc. NOTE: If we select signals here, they won't show up in the waveform window automatically. We have to do "send to waveform" to see them in the waveform viewer.

2A. waveform-simvision: To invoke the waveform viewer, click on "send to waveform" in the design browser of simvision (It's the 2nd button after the + sign on the top right side). Imp cmds:
send to: => used to send values from waveform to RTL or schematic and vice versa
= => this zooms to fit waveform

2B. On waveform, to see delta time delay: take mouse to "yellow pulse shape", hold right click for a second, and a pop up comes. choose "expand time"->All_time. Then on waveform we see blue shaded area. The blue area shows what happens in delta delay time (you will see that time remains same in blue area, but numbers in brackets change implying delta delay)

3. Console-simvision: It's used to run ncsim cmds. It has the ncsim prompt on the simulator tab (It has 2 tabs at the bottom: Simvision and simulator). When we write the "run" cmd on it, that is when it starts running sims. When we are not in Simvision gui mode, run is automatically placed on the ncsim cmd prompt, so that our simulation runs to completion. Then when completed, exit is automatically placed on the ncsim cmd prompt to exit the sim. If we want to stop the sim when in cmd line mode, we can add "-tcl" to the cmd line, and then the tool will stop at the ncsim prompt. We'll have to type "run" on the ncsim prompt to continue. ex: irun -tcl -f run.f (stops at ncsim prompt)
Ex of ncsim cmd:
ncsim > database -event -open waves -into waves.shm => create shm database named waves.shm (which contains .dsn and .trn files, which are waveform dump). waves is the scope. "-event" provides zero time events to be seen on any signal, which is otherwise not possible to see. This helps detect edges happening with 0 width)
ncsim > probe -create -all -depth all -tasks -functions -memories -database waves -name probe_a => probe all signals, all depth, and for all tasks/functions too. It does not probe memories (2-d, 3-d arrays) by default, so we have to put -memories also. (also, if we run gui mode, w/o using -tcl, then memories are automatically added to the probe). Put this probe data into database waves. If no name is provided for the probe, then ncsim will name it probe 1, probe 2, etc. NOTE: in the design browser, select Scope as "waves", and then you will see all signals with values. By default, scope is "all available data" which shows the simulator scope also (which may not have any probe data).

NOTE:To get extended vcd (which shows port dirn too), do this: (evcd needed to generate tdl files)
ncsim> database -open waves -evcd -into myvcd.vcde
ncsim> probe -create testbench.dut -evcd -database waves
Instead of the above 2 cmds, we can also do this in the Tb.sv file: initial $dumpports(UVMTb.I_dut, "sim.vcde");

ncsim > run => runs ncsim till it terminates. pgm terminates when $finish is reached in any module.
ncsim > run 2.5 ms => runs ncsim for 2.5ms
ncsim > exit => exits ncsim.
ncsim> reset => resets ncsim, so that we can run simulation again starting from time 0
NOTE: To rerun new rtl after modification, we can either close simvision and rerun the simulation again, or from the Console window click Simulation->Reinvoke Simulator. This reruns the new rtl and loads the new waveform.

NOTE: we can provide the -input option with irun, specifying the input file, which gets loaded at the ncsim prompt. This saves us from manually typing the ncsim cmds on the cmd line. If we don't provide cmds for "database -open .." or "probe -create ...", then no cadence database is created. To create a vcd/fsdb database, we have to provide the system task "$dumpvars .." within an "initial begin ... end" block to dump the waveform database.
Ex: irun -access +r -f rtl_files.f -input dump.tcl .... => -access +r is needed to see signals in waveform dump
dump.tcl has these lines:
database -open waves -into /sim/bellatrix/kagrawal/waves.shm -default
probe -create -emptyok -database waves -all -memories -depth 10 digtop_tb => var in function/task not dumped by default. To dump those, use -variables.
probe -create -emptyok -database waves -all           -depth 3  Silver_top.Xosc.I1 => This type of probe used for ams sims to dump voltages upto 3 levels deep
probe -create -emptyok -database waves             -flow -ports Silver_top.Xosc.AVDD => This probes current at the AVDD port of the Xosc block. valid for ams sims, since digital blocks (which are modeled as verilog) do not consume any current.
probe -create -emptyok -database waves -all -flow     -depth 3  Silver_top.Xosc.I1 => This probes current for all nets upto 3 levels deep.
probe -create -emptyok -database waves -all -memories -depth 10 -domain digital => This is helpful in ams sims, where we do not need to specify path of digital block. It does probing upto 10 level deep of all nodes which are digital in nature (i.e have verilog models)
run
quit => this is executed after run has finished

----------

Xcelium (xrun):

--------

As discussed earlier, xrun is used to run designs on the Xcelium Simulator. It works similarly to irun. All of the options for xrun are the same as those for irun. 2 imp help cmds for xrun:

> xrun -helpshowsubject => shows list of subjects as xmvlog, xmvhdl, xmelab, xmsim, etc

> xrun -helpsubject xmvlog => shows all options for subject xmvlog, as -assert, -ams, etc

> xrun -helpall -helpalias => -helpall displays list of every supported option, while -helpalias displays different ways to enter an option (ones entered using -/+ signs. irun/xrun use both "-" and "+" for cmd line options)

ex: xrun top.v test.c obj1.so -y ./libs -y ./models -l run1.log ... (source files can be in any format as .v, sv, .vhd, .e, .vams, .c, .cpp, .s, .o, .so, etc)

This is how the dir structure looks when you run xrun: ex:

xcelium.d => instead of INCA_libs, this build dir is created. Contents in this dir are automatically checked (timestamp, snapshot info, etc) on a rerun of xrun, to determine if recompilation or re-elaboration is needed. It has the following subdirs:

1. xcelium.d/run.<platform>.<xrun_version>.d (ex: xcelium.d/test_sim.lnx8664.19.01.d; instead of run, we created test_sim as a custom name by using option -snapshot test_sim). A soft link named test_sim.d is created by default, pointing to this dir. Within this dir is the xllibs subdir, which has a subdir for each -y library and -v library file (i.e run.d/xllibs/<libs> and run.d/xllibs/<models> when the cmd is "xrun top.v -y ./libs -y ./models ... ")

2. worklib => design units contained in the HDL design files (as in top.v) are compiled into this dir. Using option "-work <worklib_name>" changes the name of this worklib dir. Within this dir is a library database file called "xlm.lnx8664.066.pak", which stores all intermediate objects required by the Xcelium core tools. These .pak files are large and so are usually compressed by using the -zlib option

3. history => There is a history file which records all prev cmds run

options:

-64/-64bit => runs 64bit version of xrun

-top chipTb => defines the top level module (can have multiple such options, since there are typically multiple top level modules from uvm, design, etc). This option is not needed for v/sv top level modules, but is required for vhdl/systemC top level modules. By default, top level design units are automatically determined for v/sv, but are not automatically inferred for vhdl/systemC; in such cases, this option is required

-l <logfile> => by default, log is written to xrun.log in same dir where xrun was invoked

-v libfile.v => old scheme of lib mgmt. xrun scans this file for module/udp defn that can't be resolved in normal src files specified. -v option causes module/udp in these files to be parsed, only if they have the same name as unresolved module/udp. Otherwise they are not parsed, which saves time. If we omit -v, then these module/udp in these files will always be parsed

-y <lib_dir> => specifies path to library dir, where files containing defn of module/udp are to be found

-define foo=2 => -define similar to using `define compiler directive in verilog. same as irun, can use +define+ also. If there's no value to assign, we can also do "-define foo".

-compile => parse and compile source files, but do not elaborate

-elaborate => parse and compile source files, elaborate the design and generate a simulation snapshot, but do not simulate. If the -compile/-elaborate options are not used, then all steps run (compile/elaborate/simulate)

-hal => this runs HAL (HDL analysis) on the snapshot instead of running the simulator. This is used to check for any errors/warnings etc on the design files.

-snapshot <snapshot_name> => generate a sim snapshot with the given name (-name and -snapshot are the same) in xcelium.d/worklib/<snapshot_name>/*. By default, snapshot names are xcelium.d/worklib/run/*. This option also changes the name of xcelium.d/run.lnx8664.19.01.d to xcelium.d/<snapshot_name>.lnx8664.19.01.d.

-r <snapshot_name> => load and simulate the specified snapshot, w/o doing any kind of checking. By providing "-input file1.tcl", we can provide diff tcl cmd i/p files to run multiple diff sims with the same snapshot. -R (w/o any snapshot name) is used to simulate the last snapshot generated by the xrun cmd.

-xmlibdirname <xcelium_dirname> => to have a custom dir name instead of xcelium.d. When running the simulator only (using the -r or -R option), we need to provide this if the snapshot is not in the default dir path or has a non-default name.

-clean => this forces removal of the xmlibdirname or xcelium.d dir and starts fresh. This causes xrun to recompile, re-elaborate and recreate the dir. In the absence of this option, automatic checks are done to determine if this dir can be reused

-hdlvar /home/.../my_hdl.var => This var file is a configuration file that can have all cmd line options and args in 1 place (i.e DEFINE XRUNOPTS -ieee1364  -access +rw etc) . That way, the regular xrun cmd won't look lengthy and complex

-f <args_file> => We can also provide additional argument file that can have any args in it, name of source file, and everything else needed with xrun, which will be added to xrun existing args (i.e -clean source.v ...)

uvm cmd line options supported by xrun:

-uvm => enable support for uvm

-uvmhome /UVM/.../uvm-1.2 => specifies loc of uvm installation. By default, uvm is installed in <install_dir>/tools/methodology

-uvmexthome .../CDNS-1.2 => loc of cadence extensions to uvm. By default, uvm extensions are installed in <uvmhome>/additions/sv

+UVM_TESTNAME=<test_name> => specify name of test. The run_test() task in the top level module calls this test to run

 

 


Debussy:
---------
used to see the waveform dump, and annotate it to rtl/gate so that debug is easier. It is also used to see the schematic rep of rtl or gate, which helps to see connectivity. The gate schematic especially helps during ECO, as we don't have to manually go thru the verilog text file of digtop_final_route.v.
Debussy has following tools as part of the suite.

nTrace:
-------
gui that comes up to traverse the design hier. Can trace load, driver, connectivity. Can change src code by choosing your editor: tools->preferences->editor, and then choosing source->edit source file.
to import a design, goto file->import design. Select "from file", set Virtual Top as "digtop", default dir as "/db/Hawkeye/.../FinalFiles/digtop", then in the bottom LHS panel, goto dir "/db/Hawkeye/.../FinalFiles/digtop", then click on the synthesized netlist "digtop_final_route.v" in the RHS, and click Add. Then it shows up in Design Files. Click OK. Now, you can see the whole netlist in the top panel
active annotation: allows us to view verification results in the context of src code. But before using this, we need to load sim results (in the FSDB file) using file->load simulation results. Then in the hier browser, double click the instance that you want, choose source->goto->line, enter the line number and OK. Then choose source->active annotation (or the x key after putting the cursor in the source code pane) to activate active annotation. values associated with each signal are then displayed at time 0. Now we can search forward/backward for signals to change time.

nSchema
----------
gui that shows schematic.
Once you have imported the design, goto tools->new schematic->current scope. Then the schematic is drawn for whatever is selected as current scope in the panel (the current scope name also shows in the top window bar; it's set to whatever instance is selected, i.e digtop or interrupt etc).
In new schematic window, goto view->high contrast. This turns ON contrast for better viewing.
 
nWave
------
gui that shows waveform viewer:
nWave -ssf test1.fsdb => This loads the fsdb file directly
Load fsdb file: do file->open. then type name of dir containing fsdb file in white box. That shows the dir and files in that dir in two windows below. Select appropriate fsdb file in RHS window. click on Add, and then OK. This load the fsdb file.
get signals: click on "get signals" (next to open file drawing)
important settings:
1. Waveform->Snap cursor to transitions. when this is set, then when we click on any signal waveform, then the cursor goes to the next edge. Useful when doing active annotation in debussy, since the change shows up in rtl signal values.
2. Tools->Preferences. It has almost all settings for the GUI. These settings persist even on quitting nWave. Goto View Options->Waveform Pane. Check box "Highlight selected signals". This highlights selected signals.
3. To search for a signal name, enter it with the right hier and right case in "Find Signal". To search all hier, enter * at the end in "Scope"; then it searches for everything under that hier. For ex: if you are in the digtop_tb hier, you will see "digtop_tb" in Scope. Just enter * after that, i.e: /digtop_tb/*
4. To set an alias file for a state machine, etc, first select the signal on the waveform viewer that you want the alias to apply to. Then select the alias file as: waveform->Signal_value_radix->Add_alias_from_file, then choose the alias file and hit OK. alias file syntax is: states_timergen.alias
ALIAS timergen_sm
 PT_RESET          4'b0000
 PT_XG_INC         4'b0001
ENDALIAS

Verdi: superset of Debussy, has a lot more tools available.
------
    nCompare - Waveform compare (compare rtl and gate level waveforms).
    nSchema - Schematic browser(delay annotation).
    nState - State Diagram Debugger (Displays the Bubble Diagram of state machines)
    nAnalyzer - Debug clock tree, clock and reset analysis, view multiple clock domains.
    nEco - Evaluate the changes made on the fly and validate them.
    SVTB - Gives the System Verilog Test Bench Inheritance view, class variables can be viewed synchronously with other signals on nWave.
    Assertion Evaluator - Evaluates System Verilog assertions off line without the simulator.
    Power Manager - Debug the UPF and CPF files and visualize the different power domains in the design
    Temporal Flow View - Brings time, value and hierarchy onto the same window


Running Debussy:
--------------------
Dir: /db/Hawkeye/design1p0/HDL/Debussy/

#Before we can run debussy, we need to generate the fsdb file and do sdf_annotation (for gate sims) in irun. fsdb generation is not strictly necessary, since debussy can convert a vcd into fsdb on the fly. sdf annotation is also not a necessity, since we can always run gatesims w/o sdf annotation, but then it's not very useful.

#generate fsdb: add following lines in top level verilog code. (+loadpli option should be used on irun cmd)
#File: /db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/tb/digtop_tb.v

   initial
     begin
`ifdef FSDB => note FSDB was defined in cmd line of irun, so this section is valid. It generates fsdb which is proprietary.
        $fsdbDumpvars;
        $fsdbDumpfile(`FSDBFILE);

      #5000; //below cmds needed only if do not want dumping for all of sim time. Similar to vcd system task

      $fsdbDumpon; // This starts dumping

     #1000; //Dumps for 1000 time units starting from 5000 time units after sim starts

     $fsdbDumpoff; //this stops dumping
`endif
-----
#NOTE: In $fsdbDumpvars, we can also provide 2 arguments. The 1st arg is the name of the block from which you want to dump fsdb, and the 2nd arg implies whether we just want to dump for this block (1) or for all the hierarchy below it (0).
ex: the code below dumps fsdb for digtop_tb (only the top level, since the 2nd arg is 1), then dumps fsdb for digtop_00 which is a block within digtop_tb (all levels below it, since the 2nd arg is 0). The combined fsdb dump is in fsdbfile. So, in nWave, we'll see only digtop_tb. digtop_tb will contain the digtop_00 module. The digtop_00 module will contain all modules below it.
 $fsdbDumpvars(digtop_tb, 1);
 $fsdbDumpvars(digtop_00, 0);
 $fsdbDumpfile(`FSDBFILE);


-----

`ifdef VCD => if we need VCD (value change dump) which is std waveform database. Can be used with Novas Debussy as it supports both VCD and FSDB. See in verilog.txt for details on these system tasks.
        $dumpvars;
        $dumpfile(`VCDFILE);
`endif
     end // initial begin

SDF annotation: (for gate sims only)
--------------
annotator:
--------
The SDF file is brought into the analysis tool through an annotator. The job of the annotator is to match data in the SDF file with the design description and the timing models. Each region in the design identified in the SDF file must be located and its timing model found. Data in the SDF file for this region must be applied to the appropriate parameters of the timing model. SDF annotation is performed during elaboration, and can only take place at time 0.
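For reference, a small hand-written SDF fragment of the kind the annotator matches against the design; the cell/instance names below are made up, and the delay values are min:typ:max triplets as described elsewhere in these notes:

```
(DELAYFILE
  (SDFVERSION "3.0")
  (DESIGN "digtop")
  (TIMESCALE 1ns)
  (CELL
    (CELLTYPE "AN210")
    (INSTANCE digtop_00.u_and1)
    (DELAY (ABSOLUTE
      (IOPATH A Y (0.10:0.12:0.15) (0.09:0.11:0.14))
    ))
  )
)
```

The annotator locates instance digtop_00.u_and1 in the elaborated design, finds the A->Y path in the specify section of the AN210 verilog model, and applies the triplet (rise delay first, fall delay second) to that arc.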
2 ways to do sdf annotation:
-----------------------
A. $sdf_annotate utility:
The simulator only reads the compiled SDF file (sdf_filename.X). The SDF src file is provided in $sdf_annotate, and then it's compiled by the ncsdfc utility within the elaborator to generate the sdf_filename.X file, which is used by verilog-XL. Once the *.X file is there, it can be used by the simulator for subsequent runs.
for SDF annotation, we need to do the same thing as for the fsdb/vcd dump file in the top level module (digtop_tb). $sdf_annotate can only be in an initial block in verilog code, as it always takes place at time 0 only.

initial begin
      $sdf_annotate("/db/DRV9401/design1p1/HDL/FinalFiles/digtop_VDIO_Max_aligned.sdf", digtop_00,,"logs/sdf_max.log", "MAXIMUM"); // 7 args to sdf_annotate = name of sdf file, top level module inst name, cfgfile, logfile, MINIMUM/TYPICAL/MAXIMUM, scale_factor, scale_type.
//for min sdf ann:
//$sdf_annotate("/db/DRV9401/design1p1/HDL/FinalFiles/digtop_VDIO_Min_aligned.sdf", digtop_00,,"logs/sdf_min.log", "MINIMUM"); => if sdf_annotate were called in some other module, we would have to specify the full hier, i.e. dut.digtop_00
end

NOTE: after running gatesim with sdf_annotate, look in sdf_max.log to make sure it has no errors or warnings. Else, sdf is not correctly annotated, or not annotated at all, for such paths. Usually we get warnings like "ncelab: *W,SDFNEP: Unable to annotate to non-existent path ..." => this indicates that an arc was there in the verilog model file (i.e. in AN210.v) for which no corresponding arc was found in the sdf file. This usually happens with flops, where verilog models of a flop (i.e. SDC210.v) may have setup and hold arcs separate, while the sdf file may have both combined as $setuphold, which may cause this warning. Arcs in the sdf file came from the .lib file, while sdf annotation matches those arcs against the std cell verilog model file. So, basically every arc in the .lib file should match an arc in the specify section of the verilog model file. Sometimes we have conditional arcs in verilog (i.e. arc from S->Y for MUX2). Corresponding arcs in the .lib file are written with "sdf_cond : "!A&&B";" etc. "ifnone" arcs in verilog are written with no "sdf_cond" in .lib files; these arcs are written as "CONDELSE" in sdf files. Sometimes, some of these conditional arcs missing in .lib files can cause sdf files to be missing those arcs too. PT/ETS run using .lib files, so they may also have incorrect timing: timing tools choose the arc with the worst/best possible timing, so if the missing arc had the worst/best timing, the reported timing doesn't reflect that arc, resulting in incorrect timing.
NOTE: when generating sdf file, always use correct options, or some of the arcs might get removed from sdf file even though present in .lib files. One such example is using "CONDELSE" combo path arcs.
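A sketch of how those conditional arcs line up across the verilog model, .lib, and sdf (cell name, pin names, and delay values below are invented for illustration; real library models differ):

```verilog
// Hypothetical 2:1 mux cell, invented for illustration.
// Each "if" arc in the specify block corresponds to a .lib arc with
// "sdf_cond"; the "ifnone" arc corresponds to a .lib arc with no
// sdf_cond, which shows up as CONDELSE in the sdf file.
module MUX2_SKETCH (input A, B, S, output Y);
  assign Y = S ? B : A;
  specify
    if (!S)      (A *> Y) = (0.1, 0.1);   // .lib: sdf_cond : "!S"
    if (S)       (B *> Y) = (0.1, 0.1);   // .lib: sdf_cond : "S"
    if (A && !B) (S *> Y) = (0.1, 0.1);   // .lib: sdf_cond : "A&&!B"
    if (!A && B) (S *> Y) = (0.1, 0.1);   // .lib: sdf_cond : "!A&&B"
    ifnone       (S *> Y) = (0.1, 0.1);   // .lib: no sdf_cond => CONDELSE in sdf
  endspecify
endmodule
```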

For ex: flop in SDC10.v has this in specify section:
     (CLK *> Q  ) = (0.100000:0.100000:0.100000 , 0.100000:0.100000:0.100000);
In SDF, 1st case shown below will pass while second will fail:
IOPATH CLK Q (1.0:1.0:1.0) (0.8:0.8:0.8) => pass
IOPATH (posedge CLK) Q (1.0:1.0:1.0) (0.8:0.8:0.8) => fail, since there is no negedge/posedge clause in the verilog model
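For reference, the 2nd form would only annotate if the specify arc itself were edge-qualified, e.g. (a sketch, not the actual SDC10.v contents):

```verilog
// Sketch: an edge-sensitive path arc that the "(posedge CLK)" IOPATH
// form could annotate to (the actual SDC10.v uses the unqualified form).
// D here is the data source expression required by edge-sensitive paths.
(posedge CLK => (Q +: D)) = (0.1:0.1:0.1, 0.1:0.1:0.1);
```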

Other warnings:
1. *W,NTCNNC: Non-convergence of negative timing check values in instance I_xyz/reg_5 => -ve timing check couldn't converge. see in verilog.txt for more details
2. *W,SDFNDP: Annotation resulted in a negative delay value or pulse limit to specify path or interconnect delay, setting to 0 => This happens when there are -ve values for delay in sdf file. Since simulator can't go back in time, it has to use 0 or +ve values. So, it sets all these -ve delay values to 0.
3. *W,SDFNEP: Unable to annotate to non-existent path (COND readcond (IOPATH CLK Q[24])) of instance DIG_TOP...U234 of module sshdbw00056025020 <../input/DIG_TOP_routed.fromPT.Min.sdf, line 169701> => This indicates that an arc was found in sdf but not in verilog model file. This usually happens with RAM/ROM IP, which may have intentional blackbox verilog models, which don't have any arcs.
NOTE: any of the above warnings do NOT cause missing annotations, as simulator runs with verilog arcs, and uses the default delay or the sdf delay for that arc. So extra arcs in sdf file are OK. Only when arcs are present in verilog but absent from sdf, is when we see unannotated arcs.

More options for sdf reporting:
1. -sdf_verbose: We can use option "-sdf_verbose" with the irun cmd to print a more detailed report in the sdf.log file. With "-sdf_verbose", we'll see each cell instance and the arcs annotated to it. It will have warnings (*W,SDFNEP) if, while annotating a cell from the sdf file, it's not able to find a corresponding arc in the verilog model file. Once all the cell arc annotation is done, we'll see "ABSOLUTE PORT:" delays, which show the interconnect delay for getting to an i/p pin of each instance. This is taken from the "INTERCONNECT" delay section of the sdf file. The reason we only see i/p pins of cells and NOT the o/p pins is that the interconnect delay only needs to be applied at each i/p pin to form the full path: even though INTERCONNECT entries in the sdf file are written point-to-point (o/p of one gate to i/p of the other gate), the delay is ultimately annotated at the destination i/p pin.
2. -sdfstats: If we want to have more sdf stats for unannotated arcs, we can run irun with options "-sdf_verbose -clean -sdfstats sdf_unannotated.txt". Then it shows a list of unannotated arcs with their corresponding cells. Arcs that are in verilog model, but not in sdf are the arcs that are left unannotated (and shows up as less than 100% annotation). In that case, simulator takes the default delay of such arcs from the verilog model file.

B. Cmd file:
Instead of using annotator cmd ($sdf_annotate), we can do sdf annotation using these 3 steps:
1. generate compiled sdf file using this cmd on the unix shell:
ncsdfc SPI.sdf -output SPI.blah => compiles the SDF; if no output file is specified with -output, it generates SPI.sdf.X in the current dir.
2. write sdf cmd file: There are seven statements, which correspond to the seven arguments of the $sdf_annotate system task. Only one statement is required: the COMPILED_SDF_FILE statement, which specifies the compiled SDF file that you want to use. The others are optional. (create cmd file named: myfile.sdf_cmd) Note: the file has to be terminated with a ;
COMPILED_SDF_FILE = digtop_func_W_125_1.62.sdf.X,
SCOPE = :pm7324_inst, => annotate to the VHDL scope :pm7324_inst, which may contain Verilog blocks. For us, it's :UUT or tb_digtop.dut.
LOG_FILE = "pm7324_flat.sdf.log", =>log
MTM_CONTROL = "TYPICAL", => min/typ/max. Indicates which triplet will be used.
SCALE_FACTORS = "1.0:1.0:1.0", => optional. mult factor for min/typ/max
SCALE_TYPE = "FROM_MTM"; => optional. scales timing specs FROM_MINIMUM/FROM_TYPICAL/FROM_MAXIMUM/FROM_MTM. i.e it indicates which of the 3 triplets will be used. For ex: if MTM_CONTROL = "TYPICAL", then we specify SCALE_TYPE = "FROM_TYPICAL".
3. #for ncelab, use ncelab -sdf_cmd_file filename option to include the SDF command file.
ncelab -sdf_cmd_file myfile.sdf_cmd worklib.top
#For irun, we can use the same option: irun .... -sdf_cmd_file myfile.sdf_cmd -sdf_verbose ...

When running irun, we see annotation message like this:
     Reading SDF file from location "/vobs/.../digtop_func_QC_NOM_1.8_ATD-N_25_1.8-.sdf"
     Writing compiled SDF file to "/sim/.../../digtop_func_QC_NOM_1.8_ATD-N_25_1.8-.sdf.X".
    Annotating SDF timing data:  ....    
    Annotation completed successfully...
    SDF statistics: No. of Pathdelays = 29695  Annotated = 100.00% -- No. of Tchecks = 38702  Annotated = 99.99% => Path_delays/Tchecks refer to ones in verilog model for cells, while Annotated refer to ones in sdf
                        Total        Annotated      Percentage
         Path Delays           29695           29695          100.00 => path delays refer to IOPATH in cell, and not to interconnect delay. Here verilog model IOPATH(under Total) for all cells match sdf IOPATH(under Annotated). Reason for mismatch would be when there's an extra gate in netlist but not in sdf file
             $period               2               2          100.00
              $width            6942            6942          100.00
             $recrem            4506            4506          100.00
             $setuphold           27252           27250           99.99 => 2 setuphold arcs in verilog for which the annotator didn't find a corresponding arc or timing in sdf. This needs to be fixed as they should match exactly at 100%.
NOTE: missing interconnect delays will be reported separately as "ncelab: *W,SDFINC: interconnect ... not connected to ..."
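A minimal sketch for checking the annotation summary automatically (the table format is assumed from the sample log above; real log spacing and tool versions may differ):

```python
import re

# Scan the annotation summary table that irun/ncelab prints and flag
# categories annotated below 100%. Format assumed from the sample log
# above: <name> <total> <annotated> <percentage>.
def find_unannotated(log_text):
    """Return (category, percentage) pairs for rows annotated below 100%."""
    row = re.compile(r'^\s*(\S.*?)\s+(\d+)\s+(\d+)\s+([\d.]+)\s*$')
    flagged = []
    for line in log_text.splitlines():
        m = row.match(line)
        if m and float(m.group(4)) < 100.0:
            flagged.append((m.group(1), float(m.group(4))))
    return flagged

sample = """\
                    Total        Annotated      Percentage
     Path Delays           29695           29695          100.00
         $period               2               2          100.00
          $width            6942            6942          100.00
         $recrem            4506            4506          100.00
      $setuphold           27252           27250           99.99
"""
print(find_unannotated(sample))   # [('$setuphold', 99.99)]
```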

NOTE: If we provide non-existent sdf file in $sdf_annotate, then irun doesn't give any warnings. We don't see any annotation messages as shown above. Instead delays from verilog models (ex 0.01ns for gates when TI_verilog is defined) are taken, and annotation is done using those delays. As a result, we may see tons of timing violations for cells. Best way to find out is to pull up waveform and check delay for buffers/inverters and make sure they match those from sdf files.

------------
SDF file format is below in another section.

----------------
#Then we run Ncverilog or irun with loadpli1 (pointing to verdi PLI), and we get waveform dump. Then we start running debussy in separate dir to debug this waveform.

script: create_symbols for debussy/verdi:
-----------------------
creating symbols: Debussy/Verdi can display gate-level schematics using the proper symbols for the cells used in the netlist.  To enable this, you must set up a Debussy/verdi symbol library for the target cell library.  The symbol library can be created by running the utility syn2SymDB on the equivalent Synopsys Liberty (.lib) library.
syn2SymDB -o foo_u foo.lib foo1.lib =>
     -o:  Specifies output library name
      foo.lib:  Synopsys library name. Other lib can be added separating them by space
 This creates symbol library (directory) called foo_u.lib++.

NOTE: we can also run "vericom" compiler by synopsys to generate foo_u.lib++
cmd: vericom -2013.09 -sv -f list_rtl.f -lib VerdiLib => reads rtl files to generate VerdiLib.lib++

ex: just typing syn2SymDB may not work, so type the whole path
/apps/novas/debussy/2010.04/platform/LINUXAMD64/bin/syn2SymDB -o symbol \
/db/pdk/lbc8lv/current/diglib/msl445/current/synopsys/src/MSL445_N_25_1.8_CORE.lib \
/db/pdk/lbc8lv/current/diglib/msl445/current/synopsys/src/MSL445_N_25_1.8_CTS.lib \
/db/pdk/lbc8lv/current/diglib/msl445/current/synopsys/src/MSL445_N_25_1.8_ECO.lib
=> creates symbol.lib++ dir.

You must reference this symbol library by setting the following two environment variables:
     setenv TURBO_LIBS "foo_u"
     setenv TURBO_LIBPATHS <path to the directory containing the symbol library directory>

We can also include these 2 variables in novas.rc file as:
   TurboLibs = symbol
   TurboLibPaths = /data/VIKING_OA_DS3/a0783809/debussy/lib
=> novas.rc gets loaded anytime debussy is invoked, so it looks in "lib" dir for "symbol.lib++" and adds all those symbols.

#Invoke Debussy and compile/load your netlist.
debussy -2012.04 /data/.../DIG_TOP_routed.v => This loads PnR netlist so that we can see schematic of this. (2012 version shows old gui, while later ones show new gui)
verdi /data/.../DIG_TOP_routed.v -upf2.0 Top.upf -upftop digtop => Loads PnR netlist into verdi (-upf loads upf to show various power domains in design. If loading upf, top module name for upf needs to be provided)


debussy quick tips:
------------
0. clicking on the AND gate symbol (2nd row 3rd col on gui) brings up the schematic.
1. When tracing loads, click on any net and click "Trace Load". Then from top, do tools->New Schematic->From Trace Results. This brings a new window which only shows net and all loads. This is helpful to see all loads on any net.
2. click Schematic->Find (or Caps A), and put name of nets/instance and it will show all. Select one that you need and click "c" to change color of that net.

script: run_debussy_rtl/run_debussy_gate: for gate runs and rtl runs
----------------------
run_debussy_rtl:

#in our dir, we see PML30.lib++. So, we set these var as follows and invoke debussy. NOTE: we don't really need these symbols since rtl only has clk gaters instantiated from library, so those will show as square box.
setenv TURBO_LIBS PML30
setenv TURBO_LIBPATHS /db/Hawkeye/design1p0/HDL/Debussy/
debussy -f list_rtl.f -vtop vtop.map -2001 -autoalias & => we can also just use "debussy &"

list_rtl.f
-------
-f /db/DRV9401/design1p1/HDL/Source/digtop_rtl.f => has paths to all rtl files from source area: /db/DRV9401/design1p1/HDL/Source/digtop.v, global.v, etc
/db/DRV9401/design1p1/HDL/Testbenches/kagrawal/digtop/tb/digtop_tb.v => has path to top level tb block

run_verdi_rtl:

----------------

#invoke verdi to load RTL

vericom -2013.09 -sv -f list_rtl.f -lib VerdiLib => invoke vericom to create VerdiLib.lib++ from rtl files. (for some reason, this gives lots of errors when reading verilog packages. options "-2012 -ssv -ssy" seem to resolve all these errors. -2012 enables system verilog constructs (probably same as -sv), while "-ssv -ssy" enables the verdi database for library cells.)

verdi -lib VerdiLib -top digtop => Here we are loading VerdiLib.lib++, no need to specify RTL files, as lib++ already has lib built from rtl from earlier step (when running vericom)

verdi -f list_rtl.f => This loads list_rtl.f directly instead of generating lib thru vericom. For some reason, this gives lots of errors with packages.

vtop.map: debussy accesses already dumped fsdb files. The map file maps hier in fsdb to that in RTL.
-------
digtop = digtop_tb.digtop_00 => this provides the hier path to the dut (digtop_00 is the instance name of digtop [digtop is top level RTL module] instantiated in digtop_tb)

run_debussy_gate: same as with rtl except that we run it directly on gate netlist:
---------------
list_gate.f
/db/DRV9401/design1p0/HDL/FinalFiles/digtop_VDIO.v => path to gate level netlist
../Testbenches/digtop/tb/digtop_tb.v => path to top level tb

vtop.map
-------
digtop_top = digtop_tb.digtop_00 (If top level module in gate netlist is called digtop_top, then that is what we specify. This digtop_top ties gate level netlist with tb file)

NOTE: If we get *.vcd file from analog team, then to run debussy, we need to map the hier from .vcd file to our gate netlist. so, in vtop.map:
digtop_top =  zorro_toplevel_sch.I3.I7.I0 (here I0 at the end refers to the inst of "digtop_top" module in gate level netlist digtop_VDIO.v. zorro_toplevel_sch is the schematic name within which we have I3 top level block, which contains digital wrapper I3 within which we have digital block I0)

running debussy when debugging RTL:
----------------------------------
Bring up Debussy nTrace. goto source-> mark Parameter annotation and active annotation.
Now, open nWave by going to Tools->New Waveform.
Now, we can drag and drop signals from nTrace to nWave and vice versa, and observe signals.
1. We can click on a clk edge in nWave and that will show which values changed.
2. We can click on signal names in nTrace and it will backtrace them.
3. We can click "c" on any net, and set the net to a chosen color.
4. We can open 2 nWave windows from nTrace by going to Tools->New Waveform. We can then go to the nWave "window" button and turn ON sync waveform view. We do it for both windows so that clicking on the cursor in either one will affect the other (if we do it for only one of them, then clicking on the cursor in that window will affect the other, but not the other way around). Then the 2 nWave windows will be synced in time, so that it's easier to compare results (for ex b/w RTL and gate).
5. NOTE: when we open nWave using Debussy and do active annotation, we will see the name of the fsdb file on the top panel of the debussy window. That is the fsdb file that is actively annotated with the current RTL that we see in the RHS of the debussy main window. If we open any other nWave window and any other fsdb file, it will NOT be actively annotated with that RTL. To actively annotate the other fsdb file, go to the nWave window of the new fsdb and click Window->change to primary. This changes the new nWave window to be actively annotated with the current RTL (we will see the name of this new fsdb file on the top panel of the debussy window). So, we can switch back and forth b/w multiple nWave windows.
6. NOTE: sometimes when we load a new fsdb from the nWave window, it may not get annotated properly with rtl. So, the best way to open a new fsdb is to do it from the Debussy panel. In debussy, go to File->close simulation results. This kills the current fsdb, but retains all the signals, so we don't have to save them. Now, do File->load simulation results and open the new fsdb. This is the correct way to view a new fsdb.

running debussy when looking at gate netlist for ECO:
----------------------------------------------------
run_debussy_eco: here we are just looking at schematic of gate netlist, so we invoke debussy with just gate level netlist.

#in our dir, we see PML30.lib++. So, we set these var as follows and invoke debussy.
setenv TURBO_LIBS PML30
setenv TURBO_LIBPATHS /db/Hawkeye/design1p0/HDL/Debussy/
debussy -f list_gate.f & => list_gate.f has path to gate level netlist /db/DRV9401/design1p0/HDL/FinalFiles/digtop_VDIO.v
#debussy & => If we call debussy w/o -f option, then we have to do File->Import design, Put the file name (/db/DRV9401/design1p0/HDL/FinalFiles/digtop_VDIO.v) in bottom box, and then click Add, then OK.

Then click on Tools->New Schematic->Current scope

patgen files:
-----------
Ex: /db/MOTGEMINI_DS/design1p0/HDL/Testbenches/digtop/patgen

verilog models used: (for lbc7)
--------------------
A. MODEL_functiononly: timescale is 1ps/1ps. It has following delays specified:
 1. gates (AN2,etc) = 0
 2. clk gating cells (CG*) = 0
 3. flops, c2q delay = 1ps. For ex: in DTP20.v (in lbc7), "buf" and "not" gates are specified delay of #1(1ps), so final o/p Q/QZ have delay of 1ps.

B. MODEL_verilog: timescale is 1ns/1ps. It has following delays specified:
 1. gates = 10ps(#0.01) (in specify section =>)
 2. clk gating cells (CG*) = 100ps(#0.1) (in specify section =>). Also does setup/hold checks.
 3. flops, c2q delay = 100ps(#0.1) (in specify section =>). Also does setup/hold checks.

C. If nothing defined (neither MODEL_functiononly nor MODEL_verilog). timescale is 1ns/1ps. same as MODEL_verilog except no checks done:
 1. gates = 10ps(#0.01) (in specify section =>)
 2. clk gating cells (CG*) = 100ps(#0.1) (in specify section =>). NO setup/hold checks done.
 3. flops, c2q delay = 100ps(#0.1) (in specify section =>). NO setup/hold checks done.

NOTE: we put delays in the "specify" section instead of hardcoding them with "#" so that sdf annotation overrides the delays in the specify section. If we hardcoded delays with #, the delay would be double counted, as sdf annotation would happen on top of the existing delay in the verilog model.
NOTE: only the delay numbers in the specify section are overridden; all arcs (c2q, setup/hold, rec/rem, width, etc.) are still honored, and transitions are passed to the appropriate notifier in the verilog model (using delay numbers from the sdf file).
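A sketch of the difference (cell name invented; delay values illustrative):

```verilog
// Invented buffer cell showing why library models keep delays in the
// specify block instead of hardcoding them on primitives.
module BUF_SKETCH (input A, output Y);
  buf (Y, A);               // no hardcoded #delay on the primitive
  // If this were "buf #0.01 (Y, A);", sdf annotation would stack the
  // annotated IOPATH delay on top of the 0.01, double counting the path.
  specify
    (A *> Y) = (0.01, 0.01);  // default delay; replaced by sdf annotation
  endspecify
endmodule
```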

NOTE: When we run AMS sims, we run toplevel sims directly on the digital schematic, which is a gate level netlist. We don't have an sdf file to annotate delays for gates. So, we set "MODEL_functiononly" as that will cause no setup/hold issues. Flops will always have 1ps delay, so they will always have enough setup time, and since all comb gates on clk have "0" delay, there will be no hold issues. If we run it with MODEL_verilog (or with nothing defined), then hold issues may show up which may not actually be present in the ckt. This hold will show up if we have (c2q + data_path_delay < clk_path_delay). Usually the clk path has < 10 clk buffers, so no hold issues. However, if even 1 clk gating cell gets added on the clk path, then hold will get violated, as clk will change before data (if the no. of clk buffers is greater than the no. of gates in the data path) or at the same time as data (if the no. of clk buffers is the same as the no. of gates in the data path).
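A toy numeric check of the hold condition above, using the MODEL_verilog-style delays (c2q=100ps, comb gate=10ps, clk gater=100ps); the gate counts are invented for illustration:

```python
# Toy numbers illustrating the hold condition described above.
c2q       = 0.100   # flop clk->q delay, ns (MODEL_verilog)
gate      = 0.010   # comb gate delay, ns
clk_gater = 0.100   # one clk gating cell on the clk path, ns

data_arrival   = c2q + 2 * gate          # data path: flop c2q + 2 comb gates
clk_no_gater   = 3 * gate                # clk path: 3 clk buffers
clk_with_gater = 3 * gate + clk_gater    # same path + 1 clk gating cell

# hold is met when data changes after the capture clk edge arrives
print(data_arrival > clk_no_gater)       # True: no hold issue
print(data_arrival > clk_with_gater)     # False: clk arrives after data -> hold risk
```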

modeling delays in simulations:
------------------------------
By default, verilog gate level model delays and interconnect delays are always simulated as transport delays, but they behave as if they were pure inertial delays (they don't allow glitches shorter than the prop delay to pass thru). This is because, by default, pulse_r and pulse_e are set to 100%. These are verilog cmd line switches that can be used to alter this behaviour for gate level sims. Delays inside specify blocks are affected when these cmd line switches are passed to the simulator: (add +transport_path_delays also)
A. +pulse_r/R% : switch forces pulses that are shorter than R% (R=0 to 100) of the propagation delay of the device being tested to be "rejected" or ignored.
B. +pulse_e/E% : switch forces pulses that are shorter than E% (E=0 to 100) but longer than R% of the propagation delay of the device being tested to be an "error", causing unknowns (X's) to be driven onto the output of the device. Any pulse greater than E% of the propagation delay of the device being tested will propagate to the output of the device as a delayed version of the expected output value.
scenarios are as below:
  0% ------  R%  -------   E% ------  100%
 --- reject  --  error(x)  -- output  --- => So, glitches can be rejected, output an x or get out as normal delayed version depending on settings.

Ex: vcs -RI +v2k tb.v delaybuf.v +pulse_r/0 +pulse_e/0 +transport_path_delays => causes pulses shorter than 0% to be rejected, and pulses greater than 0% to be propagated to the o/p. => all pulses are passed, no matter how small.
Ex: +pulse_r/0 +pulse_e/100 => causes no glitches to be rejected, but o/p x, for glitches shorter than propagation delay.
Ex: +pulse_r/100 +pulse_e/100 => models inertial delays, where all pulses shorter than propagation delay are ignored.
Ex: +pulse_r/20 +pulse_e/20 => causes  glitches <20% to be rejected, but glitches >20% to be passed.

NOTE: when we run gate sim, we may start seeing "glitch suppression" warnings (many times after adding pulse_r/pulse_e switches).
EX: Warning!  Glitch suppression
           Scheduled event for delayed signal of net "GVC_D_D" at time 1027453294 PS was canceled!
            File: /db/pdkoa/lbc8lv/current/diglib/msl458/PAL/CORE/verilog/SDP10B_LL.v, line = 92
           Scope: tb_digtop.dut.I_i2c_top.I_bellatrix_i2c_slave.I_meson_i2c_fsm.bitCnt_reg_2
            Time: 1027453096 PS

Glitch suppression: This happens when there are -ve timing values, which cause the simulator to use delayed signals. When a delay with two values is calculated, there is the possibility that an event on the input net may cancel a scheduled event on the internal signal driven by the delay. This is called glitch suppression. Because glitch suppression can hide input events from a timing check's input, the simulator generates a glitch suppression timing violation if an event on a delayed signal is canceled.
To suppress the warnings due to the glitch suppression algorithm, use the -nontcglitch simulation option  

NOTE: the above cmd line switches only valid for delays in specify block, not for delays using SDF annotation. For sdf delays, we need to have these in absolute numbers within sdf file for each cell.
NOTE: to specify reject/error, we need to have extra parentheses, like this:
ex: (IOPATH A Y ((rise_delay) (rise_reject) (rise_error)) ((fall_delay) (fall_reject) (fall_error)) ) => extra parentheses; empty parentheses for reject/error imply that reject/error is set equal to the delay value => inertial delay model
ex: (IOPATH A Y (rise_delay) (fall_delay)) => no extra brackets, so values are delay values. no reject/error values.
ex:
(CELL
  (CELLTYPE "IV110")
  (INSTANCE U32)
  (DELAY
    (ABSOLUTE
    (IOPATH A Y ((0.066:0.066:0.066) (0.015:0.015:0.015) (0.019:0.019:0.019)) ((0.059:0.059:0.059) (0.012:0.012:0.012) (0.017:0.017:0.017))) => 66ps for o/p rise delay, 15ps is rise reject limit while 19ps is rise error limit. 59ps for o/p fall delay, 12ps is fall reject limit while 17ps is fall error limit.
    )
  )
)
Ex: we can also use "PATHPULSEPERCENT" keyword in sdf file to specify reject and error limits in % terms.
    (IOPATH A Y (0.066:0.066:0.066) (0.059:0.059:0.059))
    (PATHPULSEPERCENT A Y (25) (35)) => 25=pulse reject limit in %, 35=pulse error limit in %
-----------------------

SDF file syntax: ( /db/Hawkeye/design1p0/HDL/Primetime/digtop/sdf/digtop_max.pt.sdf)
-----------------
OVI (Open Verilog International) developed the SDF v3 syntax. Timing calc tools (PT, etc.) are responsible for generating SDF.

syntax:
------
(DELAYFILE
(SDFVERSION "OVI 3.0")
(DESIGN "digtop")
(DATE "Thu Jul 21 20:22:34 2011")
(VENDOR "PML30_W_150_1.65_CORE.db PML30_W_150_1.65_CTS.db")
(PROGRAM "Synopsys PrimeTime")
(VERSION "D-2010.06")
(DIVIDER /) => hier divider is / (by default, it's .) a/b/c
// OPERATING CONDITION "W_150_1.65" => // is for comment
//triplets are always in form - min:typ:max for delay
(VOLTAGE 1.65:1.65:1.65) => best:nom:worst
(PROCESS "3.000:3.000:3.000") => best:nom:worst
(TEMPERATURE 150.00:150.00:150.00) => best:nom:worst
(TIMESCALE 1ns) => implies all delay values are to be multiplied by 1ns
//delays specified in CELLS for both interconnect and cell delay.
//interconnect delays => we may have the block below repeated many times as only some wires may be in each block. It's easier for readability. interconnect delays are always between 2 points => o/p of one gate to i/p of other gate.
(CELL => interconnect delays specified here. interconnect delays are of the order of ps (very small), while cell delays are of the order of ns. All INTERCONNECT delays are only specified for the top level module (digtop). For wires which are not in digtop, hier names are used.
  (CELLTYPE "digtop")
  (INSTANCE) //no instance specified, implying it's interconnect delay
  (DELAY =>
    (ABSOLUTE => delay can be ABSOLUTE or INCREMENT
    (INTERCONNECT scan_out_iso/U282/Y em_out_31_I_bufx4/A (0.008:0.008:0.008) (0.008:0.008:0.008)) //rise/fall (min:typ:max) delays. min:typ:max are the same delays for one sdf file as we use separate sdf files for min/typ/max corners. However, if we use newer tools such as tempus to generate sdf, we may see (0.41::0.62), which indicates that for sdf generated at a particular corner (say NOM.sdf), we may have different values for min,typ,max. In timing tools for OCV runs, for a given corner (say NOM), the min value in the triplet is used for clk and max for data (for setup check), and vice versa for hold. However for gatesim, it takes only one value for all paths, and we specify which triplet value to use (by stating "MAXIMUM","TYPICAL" or "MINIMUM" in sdf_annotate). So, ideally, we should run gate sims with the "MAX" triplet with QC_MAX.sdf, and the "MIN" triplet with QC_MIN.sdf. "MAX" and "MIN" triplets with QC_NOM.sdf are not really needed as they will be bounded by MAX/QC_MAX.sdf and MIN/QC_MIN.sdf.
    (INTERCONNECT scan_out_iso/U164/Y a2d_trg_out_I_bufx8/A (0.001:0.001:0.001)) //same delay for rise/fall (NOTE: hier names used)
    ...
    )
  )
)   
//cell delays
(CELL => delay for cells: delays for each instance defined separately, since it may be diff based on load.
  (CELLTYPE "NA210") =>nand gate
  (INSTANCE test_mode_dmux/U85) => in test_mode_dmux module. since instance specified, it's cell delay
  (DELAY
    (ABSOLUTE
    (IOPATH A Y (0.129:0.129:0.129) (0.170:0.170:0.170)) //rise/fall for Y (min:typ:max)delays. We don't specify rise/fall for A as it's automatically decided based on direction of Y.
    (IOPATH B Y (0.157:0.157:0.157) (0.158:0.158:0.158))
    (COND !A&&!B (IOPATH Y S  (0.630::0.641) (1.470::1.476))) //some complex cells(adders, etc) will have cond delay arcs.
    )
  )
)
(CELL => flop delay. flops will have delay arcs as well as timing check arcs.
  (CELLTYPE "TDC10")
  (INSTANCE spi/data_reg_15)
  (DELAY
    (ABSOLUTE
    (IOPATH CLK QZ (0.622:0.622:0.622) (0.624:0.624:0.624)) => NOTE: sdf doesn't say rise/fall of CLK in IOPATH. Only rise/fall of QZ. However, model file specifies QZ delay wrt posedge or negedge CLK. So, there's always this discrepancy b/w verilog model file and sdf file for all IOPATH.
    (IOPATH CLK Q (1.308:1.308:1.308) (0.874:0.874:0.874))
    (IOPATH CLRZ QZ (0.936:0.936:0.936) ())
    (IOPATH CLRZ Q () (1.217:1.217:1.217))
    )
  )
  (TIMINGCHECK => checks
    (WIDTH (posedge CLK) (0.176:0.176:0.176)) => min allowable time for +ve(high) pulse of clk
    (WIDTH (negedge CLK) (0.692:0.692:0.692)) =>  min allowable time for -ve(low) pulse of clk
    (WIDTH (negedge CLRZ) (0.330:0.330:0.330)) => min allowable time for -ve(low) pulse of clrz
    (SETUPHOLD (posedge D) (posedge CLK) (0.437:0.437:0.437) (-0.263:-0.263:-0.263)) => setup and hold checks for rising edge of D wrt +ve clk. first triplet(0.437) is for setup, while second(-0.263) is for hold. triplets are min:typ:max delays. SETUP and HOLD can also be separated by using SETUP and HOLD keywords. NOTE: setup is +ve, while hold is -ve (typically true for flops as data lines inside flops have extra gates before they hit clk logic)
    (SETUPHOLD (negedge D) (posedge CLK) (0.716:0.716:0.716) (-0.288:-0.288:-0.288)) => similarly for falling edge of D
    (SETUPHOLD (posedge SCAN) (posedge CLK) (0.954:0.954:0.954) (-0.592:-0.592:-0.592))
    (SETUPHOLD (negedge SCAN) (posedge CLK) (0.659:0.659:0.659) (-0.538:-0.538:-0.538))
    (SETUPHOLD (posedge SD) (posedge CLK) (0.472:0.472:0.472) (-0.317:-0.317:-0.317))
    (SETUPHOLD (negedge SD) (posedge CLK) (0.756:0.756:0.756) (-0.332:-0.332:-0.332))
    (RECREM (posedge CLRZ) (posedge CLK) (0.405:0.405:0.405) (0.084:0.084:0.084)) => recovery check is like a setup check for clrz, where it should go inactive some time before the clk so that the flop i/p can get flopped. Removal check is like a hold check for clrz, where it should go inactive some time after the clk so that the flop i/p doesn't get flopped that cycle, but the next cycle. RECREM combines RECOVERY and REMOVAL checks in one. 1st triplet(0.405) is recovery, 2nd(0.084) is removal.
  )
)
(CELL => clkgater delay
  (CELLTYPE "CGPT40")
  (INSTANCE hwk_regs/clk_gate_ccd_brightness_out_reg/latch)
  (DELAY
    (ABSOLUTE
    (IOPATH CLK GCLK (0.642:0.642:0.642) (0.610:0.610:0.610))
    )
  )
  (TIMINGCHECK
    (WIDTH (negedge CLK) (0.538:0.538:0.538))
    (SETUPHOLD (posedge TE) (posedge CLK) (0.701:0.701:0.701) (-0.448:-0.448:-0.448))
    (SETUPHOLD (negedge TE) (posedge CLK) (0.795:0.795:0.795) (-0.508:-0.508:-0.508))
    (SETUPHOLD (posedge EN) (posedge CLK) (0.468:0.468:0.468) (-0.214:-0.214:-0.214))
    (SETUPHOLD (negedge EN) (posedge CLK) (0.721:0.721:0.721) (-0.430:-0.430:-0.430))
  )
)
(CELL => latch delay
  (CELLTYPE "LAH11")
  (INSTANCE flipper_top/flipper_ram/flipper_ram_reg_185_5)
  (DELAY
    (ABSOLUTE
    (IOPATH C Q (0.826:0.826:0.826) (1.097:1.097:1.097))
    (IOPATH D Q (0.622:0.622:0.622) (1.250:1.250:1.250))
    )
  )
  (TIMINGCHECK
    (WIDTH (posedge C) (0.733:0.733:0.733))
    (SETUPHOLD (posedge D) (negedge C) (0.464:0.464:0.464) (-0.339:-0.339:-0.339))
    (SETUPHOLD (negedge D) (negedge C) (1.116:1.116:1.116) (-1.079:-1.079:-1.079))
  )
)

(CELL //hard IP
    (CELLTYPE  "ophdll00032008040") => otp
    (INSTANCE  I_i2c_top/I_bellatrix_i2c_otp/I_otp_32x8)
      (DELAY
    (ABSOLUTE
    (IOPATH CLK Q[0]  (27.7495::27.7495) (7.7705::7.7705)) ...
    (IOPATH CLK Q[7]  (27.7482::27.7482) (7.7696::7.7696))
    (COND WRITECOND (IOPATH CLK BUSY  () (18.6512::18.9356)))
    (COND READCOND (IOPATH CLK BUSY  (5.5594::5.5594) (73.8027::73.8027)))
    (IOPATH PROG BUSY  (17.0202::17.2431) (18.7960::19.0702))
    )
      )
      (TIMINGCHECK
    (SETUPHOLD (posedge READ) (posedge CLK) (23.4053::23.4053) (4.3649::4.3649)) ...
    (WIDTH (COND WRITECOND (posedge CLK)) (50000.0000::50000.0000)) ... => This WRITECOND should be there in verilog model of otp else tool will complain about missing "WRITECOND". This "WRITECOND" initially came from .lib file.
    (PERIOD (COND WRITECOND (posedge CLK)) (50202.0000::50202.0000)) ...
    (SETUPHOLD (posedge D[0]) (posedge CLK) (0.1406::0.1406) (5.4447::5.4447)) ...
    (SETUPHOLD (negedge A[4]) (posedge CLK) (0.7298::0.7298) (4.0518::4.0518))
    (SETUPHOLD (negedge PROG) (posedge READ) (163.5221::163.5221) ())
    (SETUPHOLD (negedge CLK) (posedge READ) (163.5160::163.5160) ())
      )
)

-------

SDF supports both a pin-to-pin and a distributed delay modeling style. We use pin to pin.
SDF supports setup, hold, recovery, removal, maximum skew, minimum pulse width, minimum period and no-change timing checks.
interconnect delay: SDF supports two styles of interconnect delay modeling.
A. The SDF INTERCONNECT construct allows interconnect delays to be specified on a point-to-point basis from o/p port of one device to i/p port of other device. This is the most general method of specifying interconnect delay.
B. The SDF PORT construct allows interconnect delays to be specified as equivalent delays occurring at cell input ports. This results in no loss of generality for wires/nets that have only one driver. However, for nets with more than one driver, it will not be possible to represent the exact delay.
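A hedged sketch of the PORT style (instance/pin names invented), for contrast with the INTERCONNECT entries shown earlier:

```
(CELL
  (CELLTYPE "digtop")
  (INSTANCE)
  (DELAY
    (ABSOLUTE
    (PORT U2/A (0.008:0.008:0.008) (0.008:0.008:0.008)) //rise/fall delays lumped at i/p port U2/A, covering all drivers of that net
    )
  )
)
```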

cell delay: SDF supports 2 types of cell delay.
A. IOPATH implies delay from i/p port of device to o/p port of same device. We use this for all simple cells.
B. COND implies conditional i/p to o/p path delay. We use this for complex cells (adders, etc).

************************************************