liberty
- Last Updated: Friday, 25 October 2024 19:07
- Published: Wednesday, 26 September 2018 05:12
Liberty file format (.lib): These are standard files for representing timing info for stdcells such as gates, flops, etc. They contain all timing arcs for all stdcells, as well as the functionality of these stdcells. That is why synthesis tools are able to map RTL to gates: they use the functionality information for all stdcells present in these liberty files, and they use the timing info from the same files to figure out the optimal gates to meet timing.
The most common liberty files in use are the ones used for higher node tech (>22nm). These have simple look up table (LUT) delays specified for all cells. This is the conventional NLDM (non linear delay model). The other, more accurate one is the CCS (composite current source) model, which is employed for tech 22nm and below to give accuracy within 2% of spice simulations. CCS will be discussed later.
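As a rough illustration of what "look up table delay" means, here is a minimal Python sketch of a 2D NLDM lookup with bilinear interpolation. The table is the cell_rise table from the NAND example later on this page (index_1 = o/p load in pf, index_2 = i/p slew in ns). The function names are made up for this sketch, and the interpolation/extrapolation rule is an assumption: real timing tools may use a different scheme.

```python
from bisect import bisect_right

# cell_rise table copied from the NAND example later on this page.
LOAD_PF = [0.0054, 0.0162, 0.0324, 0.0486, 0.0864]   # index_1: o/p load
SLEW_NS = [0.04, 0.1, 0.4, 0.8, 1.5, 3.5]            # index_2: i/p slew
CELL_RISE = [
    [0.2670, 0.2874, 0.3779, 0.4501, 0.5348, 0.6757],
    [0.3734, 0.3939, 0.4843, 0.5573, 0.6425, 0.7908],
    [0.5278, 0.5482, 0.6387, 0.7130, 0.7971, 0.9461],
    [0.6806, 0.7011, 0.7922, 0.8666, 0.9508, 1.0993],
    [1.0361, 1.0568, 1.1484, 1.2227, 1.3087, 1.4557],
]


def _segment(axis, x):
    """Index i of the segment [axis[i], axis[i+1]] used for x.

    Points outside the axis reuse the first/last segment, which amounts
    to linear extrapolation (an assumption of this sketch).
    """
    i = bisect_right(axis, x) - 1
    return min(max(i, 0), len(axis) - 2)


def nldm_lookup(table, index_1, index_2, x1, x2):
    """Bilinear interpolation: rows follow index_1, columns follow index_2."""
    i = _segment(index_1, x1)
    j = _segment(index_2, x2)
    t = (x1 - index_1[i]) / (index_1[i + 1] - index_1[i])
    u = (x2 - index_2[j]) / (index_2[j + 1] - index_2[j])
    return ((1 - t) * (1 - u) * table[i][j]
            + t * (1 - u) * table[i + 1][j]
            + (1 - t) * u * table[i][j + 1]
            + t * u * table[i + 1][j + 1])


# Delay for a 10.8ff load and a 0.07ns i/p slew (both between table points).
delay_ns = nldm_lookup(CELL_RISE, LOAD_PF, SLEW_NS, 0.0108, 0.07)
```

A query that lands exactly on a table point returns the table entry itself; anything in between is a weighted average of the four surrounding entries.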
syntax:
A very good resource is the official Liberty user guide and reference manual uploaded here: liberty.pdf
General syntax of a test.lib file is as follows:
1st stmt names the library. stmts that follow are library level attributes that apply to the whole lib, as tech type, defn, defaults, etc. then every cell in lib has separate cell description.
stmts are building blocks of lib. Main types and commonly used constructs:
1. group stmt: {} used to enclose contents of group. Ex:
pin(A) {
related_pin: B; //pin group stmt
cap1_rise (cap_template) { index_1 ("..."); values (" ..."); } // groups may be nested recursively here
}
2. Attribute stmt: attribute_name: attribute_value; => attribute value sometimes enclosed in double quotes. Attributes explained in detail later.
pin (A){
direction : output;
function : "X+Y"; => this is used by synthesis tool, to figure out which gate to use for given RTL logic.
}
3. define stmt: to create new attribute. syntax is: define (attr_name, group_name, attr_type);
Ex: to define a new string attribute called bork, which is valid in a pin group, use
define (bork, pin, string) ;
You give the new attribute a value using the simple attribute syntax:
bork : "nimo";
4. wire load: define the estimated wire length as a function of fanout. You can also define scaling factors to derive wire resistance, capacitance, and area from a given length of wire.
wire_load("3K_2LM") { //name => implies it's for 2 metal layer and for design whose size is < 3K.
resistance : 0; //res in ohms/unit length. Res=0 implies no resistance.
capacitance : 1; //cap in cap_unit/unit length. Note unit is in pf, so cap=1pf per unit length
area : 0; //area/unit length
slope : 0.0118413; //characterizes linear fanout length behavior beyond the scope of the longest length described by the fanout_length attributes.
fanout_length( 1, 0.005469 ) ; //for fanout=1, estimated wire length is 0.005 units
fanout_length( 2, 0.00943588 ) ;
....
fanout_length( 19, 0.259363 ) ; //for fanout=19, estimated wire length is 0.26 units (simple linear scaling from FO=1 would give 0.005469*19=0.104, so actual wire length is higher than linear scaling suggests)
}
wire_load("3K_3LM") {//name => implies it's for 3 metal layer and for design whose size is < 3K. similarly for 3K<6K, 6K<16K, so on.
resistance : 0;
...
}
#wire load selection criteria is given below which selects from one of the wire load models above
wire_load_selection (2LM) {
wire_load_from_area (0, 3000, "3K_2LM" ); => specs that if 0 < area_of_design < 3000, choose 3K_2LM wire load model.
wire_load_from_area (3000, 6000, "6K_2LM" ); => choose 6K_2LM for 3000 < area_of_design < 6000. 6K_2LM usually has longer length for a given FO than 3K_2LM, as the bigger the design, the longer the nets for a particular FO. Similarly 6K_3LM has lower length for a given FO compared to 6K_2LM, as 3 metal layers provide more routing resource, so longer wires are not needed.
}
wire_load_selection (3LM) {
...
}
default_wire_load : "6K_3LM"; => by default, wire_load(6K_3LM) is chosen.
default_wire_load_selection : 3LM ; => by default, wire_load_selection(3LM) is chosen, and within this 6K_3LM is chosen.
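The wire load numbers above can be turned into a small worked example. This is only a Python sketch using the values shown for wire_load("3K_2LM") and wire_load_selection (2LM). The function names are invented for illustration, the fanout entries that the listing elides with "...." are intentionally not filled in, and the extrapolation rule (last entry plus slope per extra fanout) and the area boundary handling are assumptions about how a tool would apply these attributes.

```python
# Values copied from the wire_load("3K_2LM") listing above.
WLM_3K_2LM = {
    "resistance": 0.0,       # ohms per unit length
    "capacitance": 1.0,      # cap units (pf here) per unit length
    "slope": 0.0118413,      # extra length per fanout beyond the last entry
    "fanout_length": {1: 0.005469, 2: 0.00943588, 19: 0.259363},
}

# (min_area, max_area, model_name) triples from wire_load_selection (2LM).
SELECTION_2LM = [(0, 3000, "3K_2LM"), (3000, 6000, "6K_2LM")]


def wire_length(wlm, fanout):
    """Estimated wire length for a net with the given fanout."""
    table = wlm["fanout_length"]
    if fanout in table:
        return table[fanout]
    last = max(table)
    if fanout > last:
        # Assumed rule: continue linearly from the last entry using slope.
        return table[last] + wlm["slope"] * (fanout - last)
    raise KeyError(f"no fanout_length entry for fanout={fanout}")


def wire_rc(wlm, fanout):
    """(resistance, capacitance) of the net, scaled from its length."""
    length = wire_length(wlm, fanout)
    return length * wlm["resistance"], length * wlm["capacitance"]


def select_wire_load(selection, area):
    """Pick a wire load model name from the design area."""
    # Assumption: lower bound inclusive, upper bound exclusive.
    for lo, hi, name in selection:
        if lo <= area < hi:
            return name
    return None
```

For example, wire_length(WLM_3K_2LM, 21) extends the FO=19 entry by two slope steps, and select_wire_load(SELECTION_2LM, 4000) picks "6K_2LM".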
5. include_file(file_name); => This includes that file from the dir specified in search path.
6. fanout_load: this specifies fanout load for each i/p pin of cell. If not specified, default_fanout_load defined at top of lib file is used.
This may be some number as 1 for smallest size gate (invx1), and then defined appropriately for bigger gates. This will be used by synthesis tool, when we specify max_fanout_load, then all the fanout_load attached to the o/p pin are added to calculate total fanout_load.
7. function: used to represent function of o/p pins of a cell
function : "A&B"; => rep that o/p pin is AND of i/p pins.
Note that simple combinatorial gates can be represented by "function:" stmt, but with seq logic, it's not easy. For latches/flops, we use special keywords. In .lib file, "latch" group is used to describe latches and "ff" group is used to describe flops. In GTECH (during synthesis in Synopsys DC), both registers and latches are represented by a SEQGEN cell, which has many i/p and o/p pins. Any type of flop/latch can be configured from this SEQGEN cell by tying its various inputs and outputs.
Example.lib: The lib example below can be applied to any cell, a std cell, or an IP, e.g. a memory module. Just as for a std cell we specify setup/hold arcs or delay arcs for all i/p pins, we do the same in an IP lib file for all its i/p pins. When lib files are created for an IP, they are called ETM (Extracted Timing Model). These ETM hide the internal details of an IP, and just show the arcs on all i/p and o/p pins. ETM are also used in big SOC: there we run timing at block level, and when moving to a higher level, we generate ETM of the lower level blocks. That way STA runs much faster at the higher module level. We take this approach all the way to chip level, where all top modules are ETM. In some cases where SOC have 10B+ transistors, it's not even possible to run STA flat on the chip level gate netlist, since it would take weeks to complete. On the other hand, top chip level runs with ETM of lower level blocks can complete in less than a day.
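The fanout_load bookkeeping described in item 6 can be sketched in a few lines of Python. The defaults of 1 and 20 are the default_fanout_load and default_max_fanout values from the example library further down this page; the function names are hypothetical, and the check is a simplified view of what a synthesis tool does for a max fanout design rule.

```python
DEFAULT_FANOUT_LOAD = 1.0   # default_fanout_load in the example library
DEFAULT_MAX_FANOUT = 20.0   # default_max_fanout in the example library


def total_fanout_load(pin_loads, default=DEFAULT_FANOUT_LOAD):
    """Sum fanout_load over all i/p pins driven by one o/p pin.

    pin_loads holds one entry per driven i/p pin: the pin's explicit
    fanout_load, or None when the pin relies on the library default.
    """
    return sum(default if load is None else load for load in pin_loads)


def max_fanout_violated(pin_loads, max_fanout=DEFAULT_MAX_FANOUT):
    """True when the summed fanout load exceeds the driver's max_fanout."""
    return total_fanout_load(pin_loads) > max_fanout
```

A net driving one x1 gate (load 1), one bigger gate (load 2) and one pin with no explicit value sees a total fanout load of 4.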
library (LIB_W_150_2.5_STDCELL.db) { /* name of library. name can be with .db or w/o it. entire lib desc, lib level attr desc below */
/* general lib attr */
technology (cmos); /* tech tools used, default name is cmos*/
delay_model : table_lookup; /* which delay model to use in delay calc. generic_cmos is default, which is simplest model. 4 others are table_lookup, piecewise_cmos, dcm, polynomial. table_lookup is most common. table_lookup is aka Non linear Delay model (NLDM), and this is the one which is shown in this example below*/
bus_naming_style : "Bus%sPin%d"; /* naming convention for buses */
routing_layers ("routing_layer_1, routing_layer_2"); /* all routing layers available for PnR */
/* delay and slew attr */
//define various slew and delay attr like thresholds for measuring delay and slew ..
input_threshold_pct_fall : 46; // threshold of 46% fall at i/p pin of receiver for measuring delay
input_threshold_pct_rise : 46;
output_threshold_pct_fall : 46; // threshold of 46% fall at o/p pin of driver for measuring delay
output_threshold_pct_rise : 46;
slew_lower_threshold_pct_fall : 20; //slew starting point is at 20% rise/fall
slew_lower_threshold_pct_rise : 20;
slew_upper_threshold_pct_fall : 80; //slew ending point is 80% rise/fall. This start/end points are used to get the linear slope of waveform
slew_upper_threshold_pct_rise : 80;
/* define units */
time_unit: "10ps"; /* to identify physical time unit in lib. most common is 1ns*/
voltage_unit: "100mv"; /* to scale i/p, o/p voltage groups. most common is 1V*/
current_unit: "1mA"; /* drive current unit generated by o/p pads, or pull-up/pull-down transistor */
pulling_resistance_unit: "10ohm"; /* res for pull-up/pull-down transistor */
capacitive_load_unit (1,pf); /* unit for all caps*/
leakage_power_unit: 100uW; /* unit of power values. Power units are usually not reported, and calculated from V, I, C. However, lkg is added for Synopsys DesignPower*/
voltage_map (VDD, 0.5); //These map var VDD to 0.5V. Similarly map other voltages as VPP, VBB, VSS. These mappings are needed if these var are used later.
...
default values /* env defn*/
nom_process : 3;
nom_temperature : 150;
nom_voltage : 2.5;
default_fanout_load : 1; => by default, each i/p pin assigned a fanout load of 1. we override this by assigning explicit FO on each i/p pin of all cells (by using fanout_load : 1;)
default_max_fanout : 20; => max_fanout set at 20 for all o/p pins. we don't specify it explicitly for o/p pins except for tie_hi/tie_lo pins of TIE cell.
default_input_pin_cap : 1; => default is 1 unit. however, each i/p pin assigned explicit cap (by using capacitance : 0.004)
default_inout_pin_cap : 1;
default_output_pin_cap : 0; => default is 0 unit. however, each o/p pin assigned explicit cap (which is again very close to 0, as src/drn cap is negligible)
operating_conditions (W_150_2.5) { //just one op cond specified for particular lib. name is W_150_2.5 (W=weak, T=150C, V=2.5V) but can be anything as "SlowSlow_0p9v_m25c". Here op cond is WCCOM (worst case cond). Usually there is only 1 op cond specified in single lib, but there may be multiple too, in which case we choose the one we want. Other op cond BCCOM (best case cond) may be defined in some other lib. This section is used in PT/synthesis to set operating condition. More details on "set_operating_conditions" specified in "PT - OCV" section.
process : 3; => Process is usually defined as a number where some process number=nom. Any number below nom is considered fast process, while number above nom is considered slow process.
temperature : 150; => This defines Temperature for this lib
voltage : 2.5; => This defines voltage for this lib
tree_type : "balanced_tree"; //interconnect model for calc interconnect delay. During Synthesis, "compile" cmd uses the model from here to select a formula for calc interconnect delays. 3 models available: best_case_tree (loads assumed adjacent to driver, so wire cap is seen but wire resistance is ignored), worst_case_tree (lumped RC model, all loads see full wire resistance and full wire cap) and balanced_tree (all loads share wire resistance and cap evenly). Here, model is "balanced_tree".
} //end of operating_conditions group
voltage_map(VDD_HIGH, 0.540); => This is latest liberty cmd, that is used to map voltage for PG (power / ground) pin of block. This is the voltage that this PG pin is mapped to for this corner. The flow defaults to this voltage, when no other voltages are set on this pin. It also issues error/warning, when the voltage set on this pin is not within a certain range of this voltage. We can specify voltage map for all PG pins of this block. NOTE: operating_conditions also specifies voltage for a block, but it specifies for whole of the block, not for each individual power pin of block. More usage of this is explained in "PT - DSLG flow" section.
lu_table_template (name) { //Liberty keyword for NLDM delay/slew templates is lu_table_template. name may be delay_template_5x6 or something descriptive. There may be multiple templates, including separate ones for power (power_lut_template), driver_waveform, etc
//lookup table template info. Below info says that when table is 2D with 2 indices, then 1st index is o/p load cap, while 2nd index is i/p slew, and the value reported in lut is the "delay" value corresponding to this o/p cap and this i/p slew.
index_1 ("1,2,3,4,5"); //index values here may be real values too
variable_1 : total_output_net_capacitance; //o/p net cap used for table look up. variable_1 corresponds to index_1
index_2 ("1,2,3,4,5,6");
variable_2 : input_net_transition; //i/p net transition on that pin used for table look up. variable_2 corresponds to index_2
//NOTE: each row in the 2D table reported later is for index_1 (so 5 rows), while each column in the 2D table refers to index_2 (so 6 columns) for the entries that we see in table. var1 and var2 may be other way around too
}
//wire load models
wire_load("3K_2LM") { ...}
wire_load("3K_3LM") { ...}
wire_load("zwlm") { resistance: 1; capacitance:0; fanout_length(1,0) ... } //this is zero wire load model which says that res=1ohm and cap=0pf per unit length of wire, and for fanout=1, assume length to be 0, FO=2, assume length to be 0 and so on until FO=20. So, essentially, RC delay is going to be 0 for all wires, as wire length is assumed to be 0 for all connections
...
wire_load_selection (2LM) { wire_load_from_area(0, 110300, "zwlm"); } //each of these wire load selection chooses a particular wire load model from above based on area of design. here it says that if area of design is between 0 and 110300 units then choose zwlm.
wire_load_selection (3LM) { ... }
default_wire_load : "6K_3LM";
default_wire_load_selection : 3LM ;
default_wire_load_mode: segmented ;
//////// All cells power/delay data ////////
cell (name1) { /* cell defn */
//general info for each cell. All these attributes are defined by liberty syntax. We can have as many attributes for each cell as we want.
version : 1.0;
cell_leakage_power : 4.579760E+01;
area : 1.25;
cell_footprint : AN2;
pg_pin (VDD) { pg_type: primary_power; voltage_name:VDD; related_bias_pin: VPP; } => optional. All pg_pins as VDD, VSS, VPP, VBB specified here
//optional: lkg pwr for each combo of i/p pins. default lkg pwr is the one above (when none of below conditions occur). This is needed only if we want to model very accurate leakage power (<22nm)
leakage_power () {
value : 4.112860E+01;
when : "A&!B"; //lkg pwr when A=1,B=0. similarly we define for other combo of A,B
related_pg_pin: VDD; //if we have multiple Power pins, then we can define power consumption for each pin separately. If we have bias voltage for nwell as VPP, then we have separate lkg power related to that pin.
}
//info for each i/p pin.
pin (A) { //similarly for pin B and other i/p pin
capacitance : 0.0027; //i/p cap on pin A (may have rise_cap and fall_cap also listed separately for low tech node (<22nm), however, rise/fall cap are very close to regular cap of pin)
receiver_capacitance () { //apart from simple values above, we can specify i/p slew dependent cap values to be used in receiver model in CCS model. More details in CCS section.
when: "!B&SI"; //cap can be different based on i/p pin state. So, we can specify condition based cap
receiver_capacitance1_rise (receiver_cap_template_8x8) { //we have 4 such values for cap1_rise, cap1_fall, cap2_rise and cap2_fall
index_1 ("0.00340741, 0.0126433, 0.031115, 0.0681482, 0.142125, 0.290168, 0.586164, 1.17816"); //i/p slew
values ( \
"0.000288178, 0.000321005, 0.000330344, 0.000333917, 0.000335357, 0.000336019, 0.000336375, 0.000336634" \ //cap1_rise values for diff i/p slew rate
);
}
} //closes receiver_capacitance group
max_transition : 4.00; //max transition tolerated on i/p pin A. this max transition is there since the timing table for o/p pin has look up values upto tran time of 4ns. Any transition greater than 4ns has to be extrapolated by the timing tool to come up with delay for the cell, which may be inaccurate.
direction : input;
fanout_load : 1; //this pin assigned FO=1. This number is used by tool to estimate wireload for net connecting to this pin. This FO is also used to calc total FO load on each net for max FO Design rule violation. For bigger gates, we may assign FO=2,3,etc.
related_power_pin: VDD; //when we have pwr/gnd pins, we assign related power, gnd and bias pins (3 separate stmt)
//internal_power arcs for i/p pin usually don't exist, since internal pwr is already captured in o/p pin. But when we have multiple i/p pins, it's possible that some internal pwr gets consumed, when i/p pin changes even when o/p pin doesn't change. This happens due to redistribution of cap on internal nodes, due to i/p pin switching. Note that if o/p pin toggles due to i/p pin toggling, then it gets reported as internal pwr on o/p pin. Pwr consumed here is small, so most libs do not care about internal pwr on i/p pins of stdcells. Only used for lower nm tech where we want to model power accurately
internal_power () { //when pin A is toggling. similarly for pin B.
when : "!B"; //this when condition is necessary, since this internal pwr only gets consumed for NAND gate when other pin=0. This forces o/p pin to 1. So, pin A toggling doesn't cause o/p to change in this case, resulting in internal pwr on pin A only
related_pg_pin: "VDD"; //if we have multiple pwr pins like VDD, VPP, then we define pwr separately for each pg_pin, so that we can separate out current thru each of these pins. So, for 2 PG pins as VDD, VPP, we repeat this internal power table for pin A 2 times
rise_power (inpower_template_8x1) { //there is only index_1 which has i/p transition time on it. There is no index for cap here
}
fall_power (inpower_template_8x1) {
}
}
} //closes pin (A)
//info for o/p pin
pin (Y) {
//general info
capacitance : 0.0000; //drn cap on o/p pin is 0 (we may also omit this)
max_capacitance : 0.15; //max cap tolerated on o/p pin Y. this max cap is there since timing table for o/p pin has look up values upto max cap of 150ff. Any cap load of greater than 150ff has to be extrapolated by the timing tool to come up with delay for the cell, which may be inaccurate. We may also specify a min_cap which refers to the smallest cap present in LUT
direction : output;
function : "A&B"; //used by tools to know functionality of cells !=NOT, +=OR, &=AND
power_down_function: "!VDD + !VPP + VSS + VBB"; // "+" is OR, so this says that cell is powered down when VDD=0 or VPP=0 or VSS=1 or VBB=1 (0=not present, 1=present)
related_bias_pin: VPP; //when we have pwr/gnd pins, we assign related power, gnd and bias pins (3 separate stmt)
//timing arcs for o/p pin rise/fall delay and rise/fall transition wrt to all i/p pins
timing () { //timing wrt to i/p pin A. similarly for timing wrt i/p pin B
transport : "NO";
related_pin : "A";
timing_type : combinational;
timing_sense : positive_unate;//+ve means o/p goes in same dirn as i/p
when: "A1&!A2"; //optional, specifies that this timing arc is to be used when A1=1,A2=0. We also specify a sdf condition (sdf_cond: "A1==1'b1 && A2==1'b0") that is used when generating sdf file.
mode(my_mode, "scan_2"); //optional. We can specify each timing arc to be valid for specific conditions only. We achieve this via "mode" attribute. A mode attribute pertains to an individual timing arc. We specify a mode_name and mode_value, and this timing arc is active only when mode is set to that value. Here, we set our variable "my_mode" to value="scan_2", so this timing arc will be picked only when "my_mode" is set to "scan_2" mode. Here my_mode is not just a variable that can be set via "set my_mode scan_2", but rather a mode variable, set via PT cmd "set_mode" in synthesis/STA scripts. See details of this cmd in PT cmds section.
rise_transition (transitiondelayload6slew7_6x7) { } //NLDM LUT for o/p slew, similarly for fall_transition
cell_rise (celldelayload6slew7_6x7) { } //NLDM LUT for o/p delay, similarly for cell_fall
}
timing () { //timing wrt i/p pin B
}
//power arcs for o/p pin rise/fall wrt to all i/p pins. arcs similar to those of timing
//assumption is that both pins will never change at exactly the same time. so we can calc power wrt 1 pin toggling, then wrt other pin toggling
internal_power () { //when pin A is toggling. similarly for pin B.
related_pin : "A";
related_pg_pin: "VDD"; //if we have multiple pwr pins like VDD, VPP, then we define pwr separately for each pg_pin, so that we can separate out current thru each of these pins. So, for 2 PG pins as VDD, VPP, we repeat this internal power table for pin A 2 times
rise_power (outputpower_cap4_trans5) {
}
fall_power (outputpower_cap4_trans5) {
}
} //closes internal_power for pin A
internal_power () { //power when pin B is toggling
}
}
//internal power can be for i/p pins as well as o/p pins as we saw above. For std IP as SRAM, etc we have internal power for i/p pins instead of o/p pins as power for IP varies based on whether it's enabled, and whether in rd/wrt mode. This power number accounts for all the power for that IP in various modes
ex:
pin (CLK) { ...
internal_power() {
power_level : "VDD";
when : "(WZ&!EZ)"; //similarly power for other modes as wrt=(!WZ&!EZ), idle=(EZ)
power(inputpower_slew3){
index_1("0.008,0.1500,0.600"); //i/p pin "CLK" slew rate (0.6ns is max slew rate)
values(\
"49.327, 49.315, 49.325"); //energy in pJ for whole IP when in rd=(WZ&!EZ)
}
}
//cell info for other cells
cell (name2) { /* cell defn */
cell1 info
}
type (name) {
bus type name
}
input_voltage (name) {
input voltage information
}
output_voltage (name) {
output voltage information
}
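The threshold attributes in the example library above (input/output_threshold_pct_* = 46, slew_lower/upper_threshold_pct_* = 20/80) define where on a waveform delay and slew are measured. Below is a minimal Python sketch of that measurement on piecewise-linear waveforms. The waveforms are hypothetical ideal ramps chosen for illustration, the function names are made up, and any slew derating a real library may apply is ignored.

```python
def crossing_time(times, volts, level):
    """Time at which a piecewise-linear waveform first crosses `level`."""
    points = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if v0 == level:
            return t0
        if (v0 - level) * (v1 - level) < 0:
            # Linear interpolation inside the segment that straddles the level.
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    if volts[-1] == level:
        return times[-1]
    raise ValueError("waveform never crosses the requested level")


def rise_delay_and_slew(vdd, in_wave, out_wave,
                        in_pct=46, out_pct=46, lower_pct=20, upper_pct=80):
    """Cell delay (i/p threshold to o/p threshold) and o/p slew (lower to upper)."""
    t_in = crossing_time(*in_wave, vdd * in_pct / 100)
    t_out = crossing_time(*out_wave, vdd * out_pct / 100)
    t_lo = crossing_time(*out_wave, vdd * lower_pct / 100)
    t_hi = crossing_time(*out_wave, vdd * upper_pct / 100)
    return t_out - t_in, t_hi - t_lo


# Hypothetical ramps: (times in ns, voltages in V), supply of 2.5V.
VDD = 2.5
IN_WAVE = ([0.0, 1.0], [0.0, 2.5])             # i/p rises 0 -> 2.5V in 1ns
OUT_WAVE = ([0.0, 0.5, 2.5], [0.0, 0.0, 2.5])  # o/p starts rising at 0.5ns
delay, slew = rise_delay_and_slew(VDD, IN_WAVE, OUT_WAVE)
```

With these ramps the i/p crosses 46% at 0.46ns and the o/p at 1.42ns, so the delay is 0.96ns; the o/p takes 1.2ns to go from 20% to 80%.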
INTERNAL PIN: apart from i/p and o/p pins, we can define internal pins also. This is needed in cases where there's a complex IP, and it has clocks generated internally that time i/p and o/p ports. In such cases, we define an internal pin, which is some divided version of i/p clk, and characterize its timing based on i/p clk rise/fall. We can have all timing arcs here as setup/hold delay arcs as well as min_pulse and min_period arcs, etc.
ex:
pin("clk_pll_checkpin_int") {
direction : internal ;
clock : true ;
capacitance : 0.000000 ;
timing() {
related_pin : "clk1_ext" ; //this is the i/p clk pin of the IP, which serves as the master source of this internal clock pin. We define timing for gen clk wrt master clk, so that gen clk can be timed correctly based on master clk i/p slew
timing_type : combinational ;
cell_rise (....);
}
timing() {
related_pin : "clk_pll_checkpin_int" ; //this is related to itself as min_pulse_width/min_period types are defined on the pin itself
timing_type : min_pulse_width ;
rise_constraint (....);
}
pin("IN1") { direction: input; ... timing() {
related_pin :"clk_pll_checkpin_int"; //Here i/p port IN1 has timings related to the internal clk defined above.
...... }
We can also use "generated_clock" directive to define internal generated clocks. This is so that we don't have to write "create_generated_clock" cmd ourselves to create internal generated clocks. This may be useful in some cases. However, more often we remove these internal clocks inside the phy, and it's preferred to write your own "create_generated_clock" cmd to create clks inside the phy. That way we have more control on what we want.
ex: generated_clock(my_int_clk) { /* This internal clk is defined as div by 2 of master clk, and it's defined on "port1_clk" pin of the IP. There still needs to be a path via "arcs" from gen_clk to master clk, for this gen clk to be created, else PT will give PTE-075 error "gen clk has no path to master clk"*/
clock_pin : port1_clk ;
master_pin : ext_800m_clk ;
divided_by : 2 ;
}
CHECKPIN: Timing tools such as PT create their own internal pins for certain arcs even when the .lib being read doesn't have any internal pins with that name. PT creates an internal pin with name "*checkpin*" whenever a pin has both a combinational and a sequential delay timing arc. This is done to separate the two types of arcs. For ex: consider a cell which has a clk->q arc and a combo clk->gated_clk arc. Here 1st arc is seq, while 2nd arc is combo. We could have written both arcs with related pin as "clk". But PT chooses to create a "checkpin" for the seq arc, so clk->q is now referenced as clkcheckpin1->q, and the other setup/hold seq arcs are also referenced wrt the checkpin. clk->gated_clk is still referenced wrt original "clk". A new combo arc clk->clkcheckpin1 is created with 0 delay. All of this internal "checkpin" creation is done when reading in .lib or .db. So, don't be surprised if you see arcs referencing checkpin when you have no such internal clks. It's something peculiar to PT only.
Details of this is on solvnet => https://solvnetplus.synopsys.com/s/article/Internal-Checkpins-Created-in-Some-Library-Cells-1576002481225
Attributes:
As we saw above, we have various attributes for cells, pins, etc. 1 of the most important attribute in "timing" group is "timing_type" attribute. It's used by timing tools to determine timing paths. timing_sense attribute is used along with this. Also, we have related and constrained pin concept that these attr apply to:
Constrained pin: This is the pin which is being constrained. When we write timing arcs, this is the -to pin. For ex: EN pin of a clk gater is a constrained pin. This is the pin that you will see in .lib as "pin(PIN_1) { ... }".
Related pin: Any constrained pin may be constrained wrt multiple pins. When we write timing arcs, this is the -from pin. For ex: EN pin of a clk gater may be constrained wrt clk pin, wrt clear pin, wrt set pin, etc. All of these pins as clk, clear, set, etc are called related pins. These are the pins that appear within the constrained pin section in .lib as "timing() { related_pin : "clk"; ... }", "timing() { related_pin : "set"; ... }", etc.
1. timing_sense attribute can be unate or non_unate. Unate is when o/p dirn is determined by i/p dirn (e.g. inverter o/p always moves opposite to inverter i/p). Non_unate is when o/p dirn has no fixed relationship to i/p dirn (e.g. flop o/p pin Q can be rising or falling with no relationship to i/p pin D dirn). This attr is needed since timing tools can't determine the sense as they can't see the guts of logic. Unate can be +ve unate or -ve unate.
positive_unate : if rising/falling change on i/p causes o/p to rise/fall (same polarity),
negative_unate: if rising/falling change on i/p causes o/p to fall/rise (opposite polarity).
2. timing_type attribute: distinguishes b/w comb and seq cell. If this attr is not defined, cell is considered combinatorial. values defined for following timing arcs:
- I. comb arc: timing arc attached to an o/p pin, and related pin is either i/p or o/p pin. timing arc has rise/fall_transition and cell_rise/fall for o/p pin wrt each i/p pin. It's used for all combo gates as AND, OR, etc. An arc from Clk to Q pin of a flop is NOT a combo arc (explained in seq arc)
- A. combinational: means o/p can rise or fall. for positive_unate, arc is for R->R,F->F. for negative_unate, arc is for R->F,F->R. for non_unate, arc is for {R,F}->{R,F}
- B. combinational_rise: rise means o/p is rising only. +ve_unate(R->R), -ve_unate(F->R), non_unate({R,F}->R)
- C. combinational_fall: fall means o/p is falling only. +ve_unate(F->F), -ve_unate(R->F), non_unate({R,F}->F)
- II. seq arc: It's either delay arc (clk and o/p data) or constraint arc (clk and i/p data). It's used for flops/latches, etc. The seq arc is from "related" pin to the "constrained" pin.
- A. rising/falling_edge: arc whose timing o/p pin is sensitive to rising/falling signal at i/p pin. An ex is CLK->Q arc of a flop. Here when clk rises, o/p pin may rise or fall. It looks like a combo arc (i.e. delay from i/p to o/p), but it's actually a seq arc, as the arc breaks here. We have a new timing arc start from clk pin to q pin. Another reason it's not a combo arc is because o/p value changes only on +ve edge of clk and not on -ve edge (for a +ve flop). So, to differentiate this CLK->Q arc from pure combo clk->gclk arc, we write it as seq arc.
- B. preset/clear: arc affects only the rise/fall arrival time of o/p pin. logic 1/0 is asserted on o/p pin. EX: SR latch has clear arc on "Q" pin wrt "RZ" pin, and preset arc on "Q" wrt "SZ" pin.
- C. hold_rising/falling: designates rising/falling edge of related pin for hold check.
- D. setup_rising/falling: designates rising/falling edge of related pin for setup check.
- E. recovery_rising/falling: uses rising/falling edge of related pin for recovery check. clk is rising/falling edge triggered.
- F. removal_rising/falling: used when the cell is low-enable latch or rising-edge triggered FF (for removal_rising) or the cell is high-enable latch or falling-edge triggered FF (for removal_falling). intrinsic_rise/fall attr used along with this.
- G. min_pulse_width: together with minimum_period value, specifies min pulse width for clk pin. Can also be specified for other pins as set/reset, etc. Both *_high/low are defined for clk pins, while *_high is defined for active high set,reset pins and *_low is defined for active low set,reset pins. Both high and low pulses need to have min width for clk, since there's a rising edge on both of them, and it may be missed if it happens in a very small time (low pulse while clk is high, or high pulse while clk is low). If we want min_pulse_width to be specified in same format as other timing attributes, then we need to have related_pin set to same pin as i/p pin, and timing_type as "min_pulse_width". Then to specify min_pulse_width_high, we can specify rising transition with rise_constraint and have different values of high pulse width for different rising transition of pin. Similarly fall_constraint means min_pulse_width_low. Usually min_pulse_width should be greater than a gate delay in that tech, since the clk pulse passes thru several gates inside the flop, so a pulse less than a gate delay may be swallowed by the gate itself (i.e. pulse may start dying before it even rose to 100%, since the delay is more than pulse width)
- III. nonseq arc: when setup/hold are specified on data pin with a non-clk pin as the related pin. The signal of a pin must be stable for a specified period of time before and after another pin of the same cell change state, for the cell to function as expected. Called nonseq since related pin is not clk. 4 possible arcs are non_seq_setup/hold_rising/falling. rising/falling edge are meant for related pin. These are called data to data paths.
- Ex: SR latch has non_seq_setup/hold_rising arcs on "RZ"(data) rising wrt "SZ"(clk as related pin) rising and vice versa. This arc exists since when both RZ/SZ go inactive, o/p Q is uncertain depending on which pin went inactive first. Similar arcs for clrz wrt prez and vice versa for all flops/latches which have clrz and prez pins on them.
- IV: nochange arc: used for latch devices with latch enable signals. 4 possible arcs of nochange_high/low_high/low indicate +ve/-ve pulse on constrained pin and +ve/-ve pulse on related pin.
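The timing_sense values described in item 1 above can be derived mechanically for a purely combinational function. The Python sketch below toggles one i/p pin for every combination of the other pins and records which way the o/p moves. It is only an illustration of the definition: the helper and gate function names are made up, and real characterization tools do not necessarily work this way.

```python
from itertools import product


def timing_sense(func, inputs, pin):
    """Classify the arc pin -> o/p of a combinational function.

    func takes a dict {pin_name: bool} and returns the o/p value.
    """
    others = [p for p in inputs if p != pin]
    same_dirn = opposite_dirn = False
    for bits in product([False, True], repeat=len(others)):
        env = dict(zip(others, bits))
        out_lo = func({**env, pin: False})
        out_hi = func({**env, pin: True})
        if out_hi and not out_lo:
            same_dirn = True        # i/p rise -> o/p rise (and fall -> fall)
        if out_lo and not out_hi:
            opposite_dirn = True    # i/p rise -> o/p fall (and fall -> rise)
    if same_dirn and opposite_dirn:
        return "non_unate"
    if same_dirn:
        return "positive_unate"
    if opposite_dirn:
        return "negative_unate"
    return None                     # o/p does not depend on this pin


def and2(v):
    return v["A"] and v["B"]


def nand2(v):
    return not (v["A"] and v["B"])


def xor2(v):
    return v["A"] != v["B"]
```

Here timing_sense(and2, ["A", "B"], "A") gives positive_unate, nand2 gives negative_unate, and xor2 gives non_unate since the o/p dirn depends on the state of the other pin.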
stdcells and their .lib arcs:
In PT, we can see all the arcs for a particular cell by typing: report_lib <args> (see in PT_ETS.txt for more details). We'll use this cmd when looking at arcs for cells below. This will ensure our cell timing arc understanding is consistent with what Timing tool sees. Below are different kind of stdcells discussed, along with their timing arcs.
1. comb logic: combinatorial gates as AND, OR, etc. arcs are for o/p pin with related i/p pin. o/p pin rise/fall wrt each i/p pin. positive_unate/negative_unate indicates whether o/p moves in the same or opposite dirn as the i/p pin. 3 kinds:
A. Data path: Adders, comparators, etc. AD2 (half adder, S=A^B, CO=A&B), AD3 (full adder, S=A^B^CI, CO=A&B+A&CI+B&CI), SU2 (subtractor/comparator)
B. Gates: AN21/NA21 (2/3/4 i/p and/nand gate), BF09/BH03 (2 to 7 i/p Boolean functions), EN21 (2 i/p EX-NOR), EX22 (2/3 i/p EX OR), BU10/IV10 (buffers,tri-state buffers, inverters), OR31/NO31 (2/3/4 i/p or/nor gate)
C. Multiplexer: MU111 (multiplexer). If multiplexer is implemented using pass gates then it's no more comb, so special attributes have to be placed for such 1 hot mux.
Example arc for NAND gate: NOTE: AND has 2 gates in it (nand followed by inv). So, better to look at an nand.
cell (NA210) {
version : 1.0;
cell_leakage_power : 3.75; //avg (default) lkg power in pW (unit defined in top)
area : 1.40;
cell_footprint : AN2;
leakage_power () {//lkg power for A=1, B=0
value : 6;
when : "A&!B";
}
leakage_power () {//lkg power for A=0, B=1
value : 7;
when : "!A&B";
}
pin (A) {
capacitance : 0.0065;//cap in pf. 0.0065pf=6.5ff
max_transition : 3.50; //max slew rate allowed on i/p pin is 3.5ns (for all cells)
direction : input;
fanout_load : 1; //fanout load defined as 1 for i/p pin (for all cells). this fanout load is used when calc FO at any o/p pin (FO load for all i/p pins at receiver added to get FO load at o/p of driver)
}
pin (B) { //for i/p pin B
capacitance : 0.0063;
max_transition : 3.50;
direction : input;
fanout_load : 1;
}
pin (Y) { //for o/p pin Y
capacitance : 0.0000;
max_capacitance : 0.11; //max cap allowed on pin Y is set to 110ff. assume pmos/nmos same size = x. So, i/p cap for EFO purpose = 1/1.5(n)+1(p)=1.66*6ff/2=5ff. max EFO=110/5=22. it's same as for invx1, as all x1 gates have same driving strength. When we goto size x2, max cap is set to 0.22 (since i/p drv strength is twice [i/p cap is 12ff], so max EFO is still 22)
direction : output;
function : "(A&B)'"; //nand function (' indicates inversion)
timing () {
transport : "NO";
related_pin : "A"; => related pin says with respect to which i/p pin the o/p delay is specified. For flops with pin D, related pin would be CLK for setup or hold checks.
timing_type : combinational; => for edge based timing types, the edge refers to related pin dirn (for ex, if it's hold_rising, then rising refers to related pin dirn)
timing_sense : negative_unate; => nand o/p moves in opposite dirn to i/p pin A
rise_transition (transitiondelayload5slew6) { //o/p slew rate
index_1 ("0.0054,0.0162,0.0324,0.0486,0.0864");//o/p load in pf. NOTE: max cap in table here is 86.4ff, while max cap is set to 110ff. So, extrapolation is done.
index_2 ("0.04,0.1,0.4,0.8,1.5,3.5");//i/p slew in ns (max i/p slew is 3.5ns)
values (\
"0.1665, 0.1666, 0.1707, 0.1765, 0.1864, 0.2188",\ => 1st row is for index_1, entry 1
"0.3182, 0.3189, 0.3203, 0.3243, 0.3283, 0.3477",\ => each column is index_2 entry 1-6
"0.5517, 0.5515, 0.5516, 0.5549, 0.5566, 0.5669",\
"0.7859, 0.7859, 0.7844, 0.7864, 0.7886, 0.7946",\
"1.3322, 1.3305, 1.3300, 1.3316, 1.3326, 1.3363"); => 5th row is for index_1, entry 5
}
cell_rise (celldelayload5slew6) { //delay thru cell
index_1 ("0.0054,0.0162,0.0324,0.0486,0.0864");
index_2 ("0.04,0.1,0.4,0.8,1.5,3.5");
values (\
"0.2670, 0.2874, 0.3779, 0.4501, 0.5348, 0.6757",\
"0.3734, 0.3939, 0.4843, 0.5573, 0.6425, 0.7908",\
"0.5278, 0.5482, 0.6387, 0.7130, 0.7971, 0.9461",\
"0.6806, 0.7011, 0.7922, 0.8666, 0.9508, 1.0993",\
"1.0361, 1.0568, 1.1484, 1.2227, 1.3087, 1.4557");
}
fall_transition (transitiondelayload5slew6) { ... }
cell_fall (celldelayload5slew6) { ... }
//similarly for pin B
timing () { ... }
internal_power () {
related_pin : "A";
rise_power (outputpower_cap3_trans4) { //pwr in pW when o/p pin Y is rising
index_1 ("0.0108,0.0432,0.0864");
index_2 ("0.1000,0.5000,1.2000,3.8000");
values (\
"0.0246, 0.0242, 0.0270, 0.0409",\
"0.0255, 0.0244, 0.0257, 0.0367",\
"0.0257, 0.0248, 0.0251, 0.0338");
}
fall_power (outputpower_cap3_trans4) { //pwr when o/p pin Y is falling
index_1 ("0.0108,0.0432,0.0864");
index_2 ("0.1000,0.5000,1.2000,3.8000");
values (\
"0.0077, 0.0013, 0.0024, 0.0157",\
"0.0087, 0.0063, 0.0032, 0.0109",\
"0.0090, 0.0077, 0.0062, 0.0084");
}
}
internal_power () { ... } //similarly for pin B.
}
}
}
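A minimal Python sketch of how a timing tool might read a delay off the cell_rise LUT above for a load/slew point that is not on the grid. It assumes plain bilinear interpolation between the 4 surrounding table entries, and linear extrapolation from the last segment when the point is outside the table (as for a load b/w 86.4ff and the 110ff max cap). The actual interpolation/extrapolation scheme is tool dependent, and the function names here are hypothetical.

```python
from bisect import bisect_right

# cell_rise table copied from the NA210 example above
LOAD_PF = [0.0054, 0.0162, 0.0324, 0.0486, 0.0864]  # index_1: o/p load in pf
SLEW_NS = [0.04, 0.1, 0.4, 0.8, 1.5, 3.5]           # index_2: i/p slew in ns
CELL_RISE = [
    [0.2670, 0.2874, 0.3779, 0.4501, 0.5348, 0.6757],
    [0.3734, 0.3939, 0.4843, 0.5573, 0.6425, 0.7908],
    [0.5278, 0.5482, 0.6387, 0.7130, 0.7971, 0.9461],
    [0.6806, 0.7011, 0.7922, 0.8666, 0.9508, 1.0993],
    [1.0361, 1.0568, 1.1484, 1.2227, 1.3087, 1.4557],
]


def _segment(axis, x):
    """Index i of the segment [axis[i], axis[i+1]] used for x.

    The index is clamped, so a point outside the axis range is
    extrapolated linearly from the first or last segment.
    """
    i = bisect_right(axis, x) - 1
    return max(0, min(i, len(axis) - 2))


def lut_lookup(table, index_1, index_2, x1, x2):
    """Bilinear interpolation in a 2D table: rows follow index_1, columns index_2."""
    i = _segment(index_1, x1)
    j = _segment(index_2, x2)
    t1 = (x1 - index_1[i]) / (index_1[i + 1] - index_1[i])
    t2 = (x2 - index_2[j]) / (index_2[j + 1] - index_2[j])
    # interpolate along index_2 in the two bracketing rows, then along index_1
    top = table[i][j] + t2 * (table[i][j + 1] - table[i][j])
    bot = table[i + 1][j] + t2 * (table[i + 1][j + 1] - table[i + 1][j])
    return top + t1 * (bot - top)


# delay for 10.8ff load and 0.07ns i/p slew (b/w first 2 rows and first 2 columns)
print(lut_lookup(CELL_RISE, LOAD_PF, SLEW_NS, 0.0108, 0.07))
# delay at the 110ff max cap, which is beyond the last index_1 entry
print(lut_lookup(CELL_RISE, LOAD_PF, SLEW_NS, 0.11, 0.1))
```

The same function works for rise_transition/fall_transition and for the internal_power tables, since they share the index_1/index_2/values layout.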
2. seq logic: Flops and latches. The name of flop/latches in libraries is such that it allows to distinguish b/w scan/no_scan, +ve/-ve, Clrz/Prez/both pins. as an example name XYZ=> X=D(no scan),T(scan). Y=N(-ve),T(+ve), Z=B(both),C(clr),P(preset),N(none). clr/preset are active low.
A. no scan flops: DNB10/DTB10(-ve/+ve, clr/preset), DNC10/DTC10(-ve/+ve, clr), DNN10/DTN10(-ve/+ve, none), DTP10(+ve, preset).
ex: Negative edge triggered D-FF, async active low clear, both Q and QZ outputs, 4X Drive
cell (DNC40) {
...//ff group: describes either a single stage or master-slave Flip Flop. ff_bank used to rep multi-bit flip-flop.
ff ("IQ","IQZ") { => IQ defines state of non-inverting o/p, while IQZ defines inverting output state (internal states of cross coupled inverters within the flop). These can be named anything except name of a pin in the cell being described.
next_state : "D"; => required, it's a logic eqn written in terms of i/p pins or 1st state variable (IQ)
clocked_on : "CLK'"; => required, identify active edge of clock signal (here CLK' indicates it's -ve edge triggered device). all pins listed here are treated as clocks by DC. For ex, for ff with CE pin, we can write clocked_on: "CLK & CE", but then we define clock attribute as true for CLK and false for CE.
clear : "CLRZ'"; => optional, gives active value for clear input. here it's CLRZ' => clrz bar
preset : "xx"; =>optional, gives active value for preset input
clear_preset_var1 : L; => this is there if both clrz,prez pins there. implies IQ=L if both clrz,prez active.
clear_preset_var2 : L; => this is there if both clrz,prez pins there. implies IQZ=L if both clrz,prez active.
}
pin (CLK) {
min_pulse_width_high : 0.9572;
min_pulse_width_low : 0.7352;
capacitance : 0.0152;
max_transition : 4.10;
direction : input;
fanout_load : 1;
clock : true; => clock attribute needs to be set to true, so that DC treats this as clock.
...
}
pin (CLRZ) {
min_pulse_width_low : 0.6865; //clrz low pulse can't be < 0.68ns. This translates into $width check when running PT/gate_sims. No check for high pulse as high is inactive, so even if there's a high glitch, it's ok as o/p will still be low.
capacitance : 0.0129;
max_transition : 4.10;
direction : input;
fanout_load : 1;
...
//timing: 4 arcs = recovery_falling/removal_falling wrt CLK (implies clk falling edge), non_seq_setup_rising/non_seq_hold_rising wrt PREZ. see top of this file for details on various arcs for all cells. Since related pin is "CLK" so timing arc is -from "CLK" pin -to specified pins (i.e -to CLRZ/PREZ etc). This is how seq timing arcs are written. They are always from "related" pin to "constrained" pin.
timing() { //timing for removal_falling related to clk pin (clk pin falling since it's -ve edge flop)
related_pin : "CLK"; //since related pin is CLK, arc is: -from CLK -to CLRZ
timing_type : removal_falling;
rise_constraint (constraint_slewref_6slewdata_6) { //note that i/p pins use word "constraint" for timing arcs instead of cell_rise, etc as used for o/p pins. This has rise_constraint only as recovery/removal are for active to inactive edge only
}
}
timing() { //timing for recovery_falling related to clk pin
related_pin : "CLK";
timing_type : recovery_falling;
rise_constraint (constraint_slewref_6slewdata_6) { //note this has rise_constraint only
}
}
timing () { //CLRZ rising (rise_constraint) should setup some time before PREZ rising (non_seq_setup_rising)
related_pin : "PREZ"; //since related pin is PREZ, arc is: -from PREZ -to CLRZ
timing_type : non_seq_setup_rising; //setup arc
rise_constraint (constraint_slewref_6slewdata_6) {
}
}
timing () { //CLRZ rising (rise_constraint) should hold for some time after PREZ rising (non_seq_hold_rising)
related_pin : "PREZ";
timing_type : non_seq_hold_rising; //hold arc
rise_constraint (constraint_slewref_6slewdata_6) {
}
}
}
pin (PREZ) { //similar arcs for PREZ as for CLRZ
}
pin (D) {
capacitance : 0.0054;
max_transition : 4.10;
direction : input;
fanout_load : 1;
...
//pin D has 2 arcs, setup/hold wrt clk falling
timing () { //pin D needs to setup with clk falling
related_pin : "CLK"; //since related pin is CLK, arc is: -from CLK -to D
timing_type : setup_falling;
rise_constraint (constraint_slewref_6slewdata_6) { //pin D rising edge setup. setup/hold arcs are dependent on D and CLK pin slew rates, and do not have dependence on o/p load. So, 2D table has index1 as clk_slew and index2 as data_slew
}
fall_constraint (constraint_slewref_6slewdata_6) { //pin D falling edge setup
}
}
timing () { //pin D needs to hold with clk falling
related_pin : "CLK";
timing_type : hold_falling;
rise_constraint (constraint_slewref_6slewdata_6) { //same for hold
}
fall_constraint (constraint_slewref_6slewdata_6) {
}
}
}
pin (Q) {
capacitance : 0.0000;
max_capacitance : 0.77;
direction : output;
function : "IQ";
...
//Q pin has 4 arcs: delay arcs wrt PREZ falling (Q rising), CLRZ falling (Q falling), CLRZ rising (Q rising), and CLK falling (Q rising/falling)
timing () { //Q rising wrt PREZ falling
transport : "NO";
related_pin : "PREZ";
timing_type : preset;
timing_sense : negative_unate;
rise_transition (transitiondelayload6slew7) {
}
cell_rise (celldelayload6slew7) {
}
}
timing () {//Q falling wrt CLRZ falling
transport : "NO";
related_pin : "CLRZ";
timing_type : clear;
timing_sense : positive_unate;
fall_transition (transitiondelayload6slew7) {
}
cell_fall (celldelayload6slew7) {
}
}
timing () {//Q rising wrt CLRZ rising. this happens since clrz has priority, so when both clrz,prez are low, then Q=L. But if clrz goes high, then Q goes high as prez is still active.
transport : "NO";
related_pin : "CLRZ";
timing_type : preset;
timing_sense : positive_unate;
rise_transition (transitiondelayload6slew7) {
}
cell_rise (celldelayload6slew7) {
}
}
timing () {//Q rise/fall wrt clk falling
transport : "NO";
related_pin : "CLK";
timing_type : falling_edge;
rise_transition (transitiondelayload6slew7) {
}
fall_transition (transitiondelayload6slew7) {
}
cell_fall (celldelayload6slew7) {
}
cell_rise (celldelayload6slew7) {
}
}
}
pin (QZ) { //same arcs as those of Q
capacitance : 0.0000;
max_capacitance : 0.77;
direction : output;
function : "IQZ";
..
}
} => end of cell
In PT, for a regular flop with D, CLK and Q pins, we see these 5 arcs (numbered 0 to 4). NOTE that the setup/hold and clk->Q arcs are "-from" CP pin (related pin) "-to" D or Q pin (constrained pin), while the pulse width arcs are from CP to CP. Always keep that in mind when considering arcs.
pt_shell> report_lib -timing TSM_LIB {DFLOP_SVT}
****************************************
Arc Arc Pins
Lib Cell Attributes # Type/Sense From To When
----------------------------------------------------------------------------
s 0 hold_clk_rise CP D
1 setup_clk_rise CP D
2 clock_pulse_width_high
CP CP D
3 clock_pulse_width_low
CP CP D
4 rising_edge CP Q
B. scan flops: TNB11/TDB11(-ve/+ve, clr/preset), TNC10/TDC10(-ve/+ve, clr), TNN10/TDN10(-ve/+ve, none), TNP/TDP(-ve/+ve, preset). All these scan flops have test_cell group to identify them as scan cells.
arcs for TDB11 are for:
I. prez pin: 4 arcs. 2 arcs are with clk as related pin, recovery_rising/removal_rising(implies clk rising) for prez rising. no falling edge arc as recovery/removal checks are only for async signal going from active to inactive. other 2 arcs are with clrz as related pin, non_seq_setup/hold_rising(implies clrz rising) for prez rising. again no falling edge arcs here.
II. clrz pin: 4 arcs same as for prez pin. recovery_rising/removal_rising with clk as related pin, and non_seq_setup/hold_rising with prez as related pin.
III. Data pin: 4 arcs. with clk as related pin, setup/hold_rising(implies clk rising) for data pin rising and falling.
IV: Q pin: 5 arcs. 1 arc is with prez as related pin, "preset" arc for Q rising. 2 arcs are with clrz as related pin, "clear" arc for Q falling, and "preset" arc for Q rising. Note that for clrz related pin, we have "preset" arc also. this is because clrz has priority over prez, so when both clrz/prez are low, and then clrz goes high, then Q goes high. so, we have "preset" arc for Q rising with clrz as related pin. 2 arcs with clk as related pin, "rising_edge"(implies clk rising) for Q rising/falling.
V: SD pin: 4 arcs. with clk as related pin, setup/hold_rising(implies clk rising) for SD rising/falling. same as Data pin arcs.
VI: SCAN pin: 4 arcs. with clk as related pin, setup/hold_rising(implies clk rising) for SCAN rising/falling. same as Data pin arcs.
ex: Scan flop
cell (TDN10) { ...
ff ("IQ","IQZ") {
next_state : " (D SCAN') + (SD SCAN) "; => states that next state is D when scan=0, and SD when scan=1
clocked_on : "CLK";
}
//test_cell group: added to the cell desc to identify it as scan cell. this group defines only the non-test mode fn of scan cell.
test_cell () { => identifies this cell as scan cell
ff ("IQ","IQZ") { => model only the non-test cell behaviour here.
next_state : "D"; => in no-test, next state=D
clocked_on : "CLK";
}
pin (D) {
direction : input;
}
pin (CLK) {
direction : input;
}
pin (SD) {
direction : input;
signal_type : test_scan_in; => scan_data_in
}
pin (SCAN) {
direction : input;
signal_type : test_scan_enable; => scan_enable
}
pin (Q) {
function : "IQ";
direction : output;
signal_type : test_scan_out; => scan_data_out
}
pin (QZ) {
function : "IQZ";
direction : output;
signal_type : test_scan_out_inverted; => inverted scan_data_out
}
}
C. latch (no scan): LAB10( nand SR latch), LAL10/LAH10(active low/high), LAH27(active high with clr/preset), LAH2B(active high with clr)
arcs for LAH27 are for: (clk pin has no arc but has "min_pulse_width_high" check and is tagged as "clock : true"). Note, an active high latch essentially behaves as a -ve flop, so all arcs are same as those for flop, except for Q pin comb arc from D->Q.
I. prez pin: 4 arcs. 2 arcs are with clk as related pin, recovery_falling/removal_falling(implies clk falling) for prez rising. clk falling edge taken as latch turns off at falling edge of clk. 2 arcs are with clrz as related pin, non_seq_setup/hold_rising(implies clrz rising) for prez rising.
II. clrz pin: 4 arcs same as for prez pin. recovery_falling/removal_falling with clk as related pin, and non_seq_setup/hold_rising with prez as related pin.
III. Data pin: 4 arcs. with clk as related pin, setup/hold_falling(implies clk falling) for data pin rising and falling.
IV: Q pin: 8 arcs. 2 arcs with prez as related pin, "preset" arc for Q rising (prez falling) and "clear" arc for Q falling (prez rising). 2 arcs with clrz as related pin, "clear" arc for Q falling (clrz falling) and "preset" arc for Q rising (clrz rising). Note that prez has priority over clrz here, so with prez as related pin "clear" arc exists for Q falling. But irrespective of that, whenever clrz or prez go high (while clk is high), then i/p Data will flow to Q, so with clrz rising, "preset" arc exists for Q rising, and for prez rising, "clear" arc exists for Q falling. So, with prez as related pin, "clear" arc exists for Q falling in 2 ways:
A. clrz=0, clk=0or1, and prez rises => Q falls (case of prez having priority)
B. clrz=1, clk = 1, and prez rises => Q falls (case of D->Q path while clk active)
2 arcs with clk as related pin, "rising_edge"(implies clk rising) for Q rising/falling. 2 arcs with Data as related pin, "combinational" for Q rising/falling.
ex: active high D-latch, async active low clear/preset, both Q and QZ outputs, 4X Drive
cell (LAH21) {
...//latch group below: describes level sensitive storage device. latch_bank used to rep multi-bit latch.
latch ("IQ","IQZ") { => IQ defines state of non-inverting o/p, while IQZ defines inverting output state (internal states of cross coupled inverters within the latch). These can be named anything except name of a pin in the cell being described.
enable: "CLK"; => optional. specify enable (active high)
data_in: "D"; => optional, data
preset : "PREZ'"; => preset is active low (note ' at end of PREZ to indicate bar)
clear : "CLRZ'"; => clr is active low
clear_preset_var1 : H; => IQ (var1) =H when both preset and clear are active
clear_preset_var2 : H; => IQZ (var2) =H when both preset and clear are active
}
pin (CLK) { .... clock: true; => clock attribute needs to be set to true, so that DC treats this as clock. No timing arcs.
pin (D) { .. } => 2 timing arcs, setup_falling/hold_falling wrt CLK falling and D pin rise/fall constraint
pin (CLRZ) or (PREZ) => these don't have any special attr. just treated as normal pins. They have 4 arcs: recovery_falling/removal_falling wrt CLK pin falling and CLRZ rising (rise_constraint), and non_seq_setup/hold_rising for pin CLRZ wrt pin PREZ rising (or for PREZ pin: non_seq_setup/hold_rising for pin PREZ wrt pin CLRZ rising)
pin (Q) { ...//4 arcs: wrt clrz rise/fall, prez fall, clk fall and combinatorial arc for D rise/fall.
function: "IQ"; => Q has same value as var IQ above. IQ=H when both clrz/prez active, so prez has priority
pin (QZ) { ...
function: "IQZ"; => QZ has same value as var IQZ above. IQZ=H when both clrz/prez active, so clrz has priority
D. latch(with scan): ADD DETAILS
3. clock cells: cells on clk path. CGN4/CGP4 (clk gaters), CTB20 (clk tree buffer)
arcs for CGP40 are for: (CG* cells have statetable instead of function, and then o/p pin uses "state_function" to define functionality)
I. EN: 4 arcs with CLK as related pin, setup/hold_rising(clk rising) for EN rising and falling. clk rising since active low latch is present. Note that arc has to consider path up to the "and" gate to calc setup/hold, since just meeting setup/hold to the latch i/p doesn't guarantee that EN signal will meet setup/hold to "and" gate.
II. GCLK: state_function: "CLK * ENL", where CLK and ENL(internal node) values are in statetable. 2 "comb" arcs with clk as related pin, for o/p rise/fall.
ex: clk tree buffer
cell (CTB70) { ...
cell_footprint : CTNIBUF; //Use this attribute to assign the same footprint class to all cells that have the same layout boundary. Cells with the same footprint class are considered interchangeable and can be swapped during in-place optimization. Cells without cell_footprint attributes are not swapped during in-place optimization. NOTE that all CTB are assigned same footprint, even though they have different layout boundary. similarly for CG*, AN2*, etc. all cells from same class are assigned a footprint in TI lib files.
dont_touch : true; //marked as don't touch, so that some opt step doesn't touch/remove it
dont_use : true; //marked as don't use so that they are not used during normal logic design (use only for clk tree)
...}
NOTE: cell_footprint is set to "NIBUF" (non inverting buf) for all buffers (BU110, BU120, etc) and set to "DELAYBUF" for all delay cells (BU112, BU113, BU116, etc). Tool identifies buffers/delay cells by looking at function stmt of cell which is "function : "A";". All delay cells are marked as "dont_use", so normal logic design doesn't use these delay cells to fix hold time.
ex: clk gating cell: CGP10 (passes EN when CLK is Low)
cell (CGP10) {
version : 1.0;
cell_leakage_power : 2.204898E+01;
area : 4.00;
dont_use : true;
dont_touch : true;
cell_footprint : CGP;
clock_gating_integrated_cell : "latch_posedge"; => this attribute tells the synthesis tool that it's an integrated clk gating cell.
statetable (" CLK EN","ENL") { //("i/p node names", "internal node names")CLK, EN are input pins, ENL is defined as internal node. statetable is used to define fn of complex seq cells
table : "L L : - : L ,\ => "i/p values : current internal value : next internal values". When clk=L, EN=L, ENL current value is - (whatever it's supposed to be), and ENL next value is L.
L H : - : H ,\ => here also ENL is same as EN (as CLK is Low=active)
H - : - : N "; => no change in ENL
}
pin (ENL) { //internal node ENL used to define statetable above
direction : internal;
internal_node : "ENL";
}
pin (CLK) {
...
clock : true;
clock_gate_clock_pin : true; //clk gating attr defined
internal_power () { .... }
}
pin (EN) {
...
clock_gate_enable_pin : true; //clk gating attr defined
internal_power () { ... }
//2 timing arcs: setup and hold for EN pin wrt CLK rising (note: arcs are for when clk goes inactive).
timing () { //hold check for EN rise/fall
related_pin : "CLK";
timing_type : hold_rising;
rise_constraint (constraint_slewref_7slewdata_7) { ... }
fall_constraint (constraint_slewref_7slewdata_7) { ... }
}
timing () { //setup check for EN rise/fall
related_pin : "CLK";
timing_type : setup_rising; ...
}
}
pin (GCLK) {
capacitance : 0.0000;
max_capacitance : 0.19;
direction : output;
clock_gate_out_pin : true; //clk gating attr defined
state_function : " CLK * ENL "; //o/p is product of internal node ENL (defined above) and CLK. When CLK=0, o/p=0, but when CLK=1, o/p=ENL
timing () { //c2q delay
transport : "NO";
related_pin : "CLK";
timing_type : combinational;
timing_sense : positive_unate;
rise_transition (transitiondelayload8slew9) { ... }
fall_transition (transitiondelayload8slew9) { ... }
cell_fall (celldelayload8slew9) { ... }
cell_rise (celldelayload8slew9) { ... }
}
internal_power () { ... }
}
}
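The statetable and state_function of CGP10 above can be checked with a small behavioural model. This is only a sketch under the assumptions stated in the cell (ENL follows EN while CLK is low, holds its value while CLK is high, and GCLK = CLK * ENL); it has no delays, and the function names and sample waveform are hypothetical.

```python
def next_enl(clk, en, enl):
    """One statetable evaluation.

    CLK=L: ENL takes the value of EN (rows "L L" and "L H").
    CLK=H: ENL is unchanged (row "H -" with next value N).
    """
    return en if clk == 0 else enl


def simulate(samples, enl=0):
    """samples is a list of (CLK, EN) values; returns GCLK for each sample."""
    gclk = []
    for clk, en in samples:
        enl = next_enl(clk, en, enl)
        gclk.append(clk & enl)  # state_function : "CLK * ENL"
    return gclk


# EN falls while CLK is high (3rd sample) and rises while CLK is high (6th sample)
SAMPLES = [(0, 1), (1, 1), (1, 0), (0, 0), (1, 0), (1, 1), (0, 1), (1, 1)]
print(simulate(SAMPLES))  # prints [0, 1, 1, 0, 0, 0, 0, 1]
```

In the 3rd sample GCLK stays high even though EN has fallen, and in the 6th sample GCLK stays low even though EN has risen, so EN changes during CLK high never chop or create a partial clock pulse. This is why the setup/hold arcs for EN are wrt CLK rising.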
4. special cells:
A. PB110 (3 state bus holder) => no function specified as attribute "driver_type: bus_hold" is defined, indicating it's bi-dir pin, and it holds the last logic value when no-one is driving.
B. TO010 (tie-off cell) : constant nets are tied to the o/p pins of these cells. tie-off cells are identified by looking at function : "0" or function : "1" in the pin group.
DC will tie any constant net to this cell unless "set_direct_power_rail_tie" is used for that particular net. Then, that net will be left floating during synth, but will be connected directly to vdd/vss during PnR.
cell (TO010) {
area : 1.75;
cell_footprint : TO010;
pin(LO) {
max_fanout : 50;
max_capacitance : 100.04;
direction : output ;
function : " 0 " ; => this identifies it as tieoff cell for constant logic "0"
}
pin(HI) {
max_fanout : 50;
max_capacitance : 100.04;
direction : output ;
function : " 1 " ; => this identifies it as tieoff cell for constant logic "1"
}
}
5. missing cells: antenna, decap, filler, tap cells.
A. decoupling cells, filler cells and tap cells: decap cells are cells that have a capacitor placed between the power rail and the ground rail to overcome dynamic voltage drop; filler cells are used to fill the gaps between the cells after placement; and tap cells are physical-only cells that have power and ground pins and do not have signal pins. Tap cells are well-tied cells that bias the silicon infrastructure of n-wells or p-wells (to connect body/substrate of all devices). All of these are identified by using these attributes for cells:
cell (cell_name) {
...
is_decap_cell : <true | false>;
is_filler_cell : <true | false>;
is_tap_cell : <true | false>;
...
}
NOTE: since these are physical only cells (no logic function or timing), we usually don't put these cells in .lib file. They only exist in *.lef file. Some Synopsys tools will complain about this, since they don't find the correct attribute on the cell (as it's missing in .lib). However, we can create a physical only .lib, and we can put all these cells in there (especially the decap cells). Then we don't see the warnings. Or, we should not put these cells in netlist during synthesis.
ex: decap cell in *PHYS.lib
cell (SPAREMOSCAP) {
area : 0.75; //no other attribute besides area.
}
B. antenna cells => used to fix antenna violations. It just has an nmos whose gate is tied to vss, and src/drn are tied to i/p A.
NOTE: function is not defined for Antenna Protection cell.
cell (AP001) {
version : 1.0;
cell_leakage_power : 3.828184E+00;
area : 1.00;
dont_use : true;
dont_touch : true;
cell_footprint : DIODE;
leakage_power () {
value : 3.884488E+00;
when : "A";
}
leakage_power () {
value : 3.771880E+00;
when : "!A";
}
pin (A) {
capacitance : 0.0028;
direction : input;
fanout_load : 1;
internal_power () {
rise_power (inputpower_trans5) {
index_1 ("0.0100,0.2000,1.0000,2.0000,4.0000");
values ("-0.0001, -0.0001, -0.0001, -0.0001, -0.0001");
}
fall_power (inputpower_trans5) {
index_1 ("0.0100,0.2000,1.0000,2.0000,4.0000");
values ("0.0001, 0.0001, 0.0001, 0.0001, 0.0001");
}
}
}
}
----------------
delay models:
To calculate any delay thru a path, timing tool must accurately calculate the delay and slew (transition time) at each stage of each timing path. A stage consists of a driving cell, the annotated RC network at the output of the cell, and the capacitive load of the network load pins. Models are employed for driver, wire network and receiver load. The driver model models any cell as a driver (which may be current or voltage source). The wire network is modeled as reduced RC network. Reduced RC network should behave same as original RC network at all frequencies, but needs a lot less computation to calculate delays (PT uses Arnoldi reduction method). The receiver is simply a capacitance. However, the cap may vary depending on rise/fall transition on receiver, min/max condition, miller effect (cap changing due to coupling b/w i/p and o/p, where o/p is changing simultaneously while input is changing), etc. So, a receiver model is also used to capture this cap as accurately as possible. 2 delay models widely in use:
1. NLDM: (non linear delay model)
For simple NLDM model, driver is a linear voltage ramp in series with a resistor. This is captured via a lookup table, instead of having equations which are more time consuming. The simple LUT model (aka NLDM) employed above works for 22nm tech and above. It specifies delay at midpoint (at 50% rise or fall). We specify o/p delay + o/p transition time for different i/p slew rate and different o/p load, via a LUT. So, slew rate (b/w 20% to 80% rise or fall with linear slope) and delay (b/w 50% rise/fall to 50% rise/fall) are 2 important parameters that define the shape of o/p waveform (o/p load and i/p slope are used as indexes). However in this simple table, we do not capture the exact waveform of input or output of cell. It's a fixed o/p transition slew rate. This starts adding inaccuracies in delays when compared to spice models. Using a more complex CCS model allows us to capture the waveform more accurately, which is needed for tech < 22nm to get timing results within 2%-5% of spice results. The receiver model NLDM uses is a single cap value for a given timing path. However, cap values may be different based on rise/fall or min/max conditions.
2. CCS: (composite current source model)
CCS model was developed to reduce inaccuracies at 20nm and lower tech. Instead of a voltage ramp, it models the driver as a time varying (not constant) current source. It can handle high resistive nets (driven by fast drivers), which is a problem for NLDM. CCS receiver model uses 2 cap values for each timing arc. It uses cap C1 for receiver voltage going up to the midpoint of VDD, and then uses C2 for going from the midpoint of VDD to the end. This models miller cap more accurately. For receiver cap, we specify 2D tables for both rise/fall at i/p of receiver. 2D tables are for 4 parameters: receiver_capacitance1_rise, receiver_capacitance1_fall, receiver_capacitance2_rise, receiver_capacitance2_fall. We see at 7nm and below that C1 and C2 themselves differ by up to 20%, and they also vary by as much as 50% across different i/p slew rate and o/p load. So, that signifies the importance of having these receiver models in CCS across diff slew rate and load.
Representing Composite Current Source (CCS) Driver Information: In the Liberty syntax, using CCS model, you can represent nonlinear delay information at the pin level by specifying a current lookup table at the timing group level that is dependent upon input slew and output load. CCS describes each CCS driver switching current waveform by adaptively sampling data points. So basically we take the 2D lookup table from NLDM, and instead of specifying single transition time for each i/p slew and o/p load, we provide current value at different points in time.
To define your lookup tables, use the following groups and attributes:
1. output_current_template group in the library group level
2. output_current_rise and output_current_fall groups in the timing group level
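A rough Python sketch of why sampled current is enough to get delay and slew: with I = C dV/dt, the o/p voltage is the running integral of the driver current divided by the load cap, and the 50% crossing of that voltage gives the delay point. This assumes a single lumped load cap with no wire resistance and uses trapezoidal integration; the waveform numbers and function names are made up for illustration and are not from any library. Units are chosen so that mA * ns / pf = V.

```python
def voltage_from_current(times, currents, cap, v0=0.0):
    """Integrate sampled i(t) into a lumped cap: V = v0 + (1/C) * integral(i dt)."""
    volts = [v0]
    for k in range(1, len(times)):
        # charge delivered in this time step (trapezoidal rule)
        dq = 0.5 * (currents[k] + currents[k - 1]) * (times[k] - times[k - 1])
        volts.append(volts[-1] + dq / cap)
    return volts


def crossing_time(times, volts, level):
    """First time the piecewise linear rising waveform reaches level, else None."""
    for k in range(1, len(times)):
        if volts[k - 1] < level <= volts[k]:
            frac = (level - volts[k - 1]) / (volts[k] - volts[k - 1])
            return times[k - 1] + frac * (times[k] - times[k - 1])
    return None


# hypothetical rising o/p: triangular current pulse into a 0.02pf (20ff) load
TIMES_NS = [0.0, 0.1, 0.2, 0.3, 0.4]
CURRENT_MA = [0.0, 0.04, 0.08, 0.04, 0.0]
CAP_PF = 0.02

VOLTS = voltage_from_current(TIMES_NS, CURRENT_MA, CAP_PF)
print(VOLTS)                                 # o/p voltage at each time sample
print(crossing_time(TIMES_NS, VOLTS, 0.4))   # time at which o/p reaches 0.4V
```

A real tool solves the current source against the reduced RC network and the receiver caps rather than one lumped cap, but the principle of deriving the voltage waveform from i(t) is the same.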
Example of cell:
cell (AOI21_LVT) {
pin(A) { //group for i/p pin. similarly for all other i/p pins
direction: input; // many other attributes defined
receiver_capacitance () { ... } => tables for different index, and for different cond (when: "!A1&A2")
internal_power () { ... } => tables
}
pin(Z) { //group for o/p pin
direction: output; // many other attributes defined as function, etc
internal_power () { ... } => tables for each related i/p pin for diff condition
timing () { //for each related_pin, there may be more timing groups for each condition
related_pin: "A"; //similarly for related_pin B, etc
when: "!A&B"; //similarly for diff condition
cell_rise (delay_8x8) { ... } // similarly for cell_fall, rise_transition, fall_transition
ocv_sigma_cell_rise (delay_8x8) { sigma_type: early; ... } //EARLY: similarly for ocv cell_fall, rise_transition, fall_transition
ocv_sigma_cell_rise (delay_8x8) { sigma_type: late; ... } //LATE:
ccsn_first_stage () {
stage_type: both; //many more attr as "when", etc
dc_current (ccsn_dc_template) { ... }
output_voltage_fall () { vector (template1) { ... } vector (template1) { ... } ...} //similarly for o/p voltage rise
propagated_noise_high () { vector (template1) { ... } vector (template1) { ... } ...} //similarly for noise_low
}
receiver_capacitance1_rise () { ... } //similarly for cap1_fall, cap2_rise, cap2_fall
output_current_fall () { vector (template1) { ... } vector (template1) { ... } ...} //similarly for o/p current rise. These tables are big as they have current values for lot of time samples for each i/p slew and o/p load
} //end of timing group
timing () {
related_pin: "B";
Example of lib:
library (new_lib) {
...
output_current_template (CCT) { //template for CCS => o/p current waveform wrt 3 var below
variable_1: input_net_transition;
variable_2: total_output_net_capacitance;
variable_3: time;
}
lu_table_template (ccsn_prop_template) { //template for noise
variable_1 : input_noise_height;
variable_2 : input_noise_width;
variable_3 : total_output_net_capacitance;
variable_4 : time;
}
dynamic_current () { => this models dynamic current at power pins (VDD/VSS) of a gate (here inverter) with both rise/fall at i/p. This can be used to calculate dynamic peak IR more accurately. In absence of this, we use "fixed current" at power pins throughout the switching, which is not so accurate.
related_inputs : "I";
related_outputs : "Z";
switching_group () {
input_switching_condition (fall);
output_switching_condition (rise); //o/p is rising, so current waveform is primarily thru VDD as it charges cap, however some short circuit current also flows thru VSS
pg_current (VDD) {
vector (ccsp_template2) {
reference_time : 0.00138;
index_1 ("0.0023"); //slew rate at i/p of gate
index_2 ("0.00023"); //load on o/p of gate
index_3 ("0, 0.0005314, 0.001445, 0.002875, 0.00306608, 0.00314519, 0.00327175, 0.00976063, 0.0144103, 0.0188013, 0.0204694, 0.0249671, 0.0381518, 1.44866"); // these are time delay from reference point of 0.00138 units
values ( \
"8.7941e-07, 0.0714941, 0.0554155, 0.109314, 0.0744151, 0.0740751, 0.0750564, 0.0518026, 0.0122771, 0.00195741, 0.000952697, 0.000121253, 1.55532e-07, 1.73258e-06" \ //as can be seen, current is almost 0 at start and end, but goes thru a peak in between. +ve values imply current is getting pulled out of VDD.
);
}
vector (ccsp_template2) { // we repeat above table multiple times for different slew rates and load. they may end up with different reference time depending on delay thru cell
}
} //end of pg_current (VDD)
pg_current (VSS) { //similarly for VSS pin. NOTE that for VSS, current values are -ve (implying current is pushed into VSS), and they are of much smaller magnitude than VDD current, as it's only small amount of short circuit current
vector (ccsp_template2) {
reference_time : 0.00138;
index_1 ("0.0023");
index_2 ("0.00023");
index_3 ("0, 0.000670518, 0.00175046, 0.002875, 0.00300388, 0.00345785, 0.00367968, 0.00392382, 0.00409207, 0.00437108, 0.00472153, 0.00507438, 0.00524245, 0.00675625, 0.00699046, 0.00995021, 0.0131331, 0.0144103, 0.015075, 0.0167913, 0.0188013, 0.0204694, 0.0225021, 0.0249671, 0.0467915, 1.44078, 1.44866");
values ( \
"-8.82073e-07, 0.0923592, 0.0406992, 0.0189937, -0.00807851, -0.0116778, -0.0111549, -0.012948, -0.0129895, -0.0129045, -0.0145467, -0.0128898, -0.0144827, -0.0128716, -0.0132787, -0.00995613, -0.00380439, -0.00217233, -0.00167868, -0.000805295, -0.000346744, -0.000158811, -7.18249e-05, -1.5272e-05, 5.02506e-06, -8.37263e-06, 5.12667e-06" \
);
}
} //end of pg_current (VSS)
} //end of switching_group
switching_group () { //repeat above group for other dirn, i.e rise at i/p
input_switching_condition (rise);
output_switching_condition (fall);
pg_current (VDD) { ... } //NOTE: current values for VDD are -ve here, while VSS are -ve too (implying current is pushed into both VDD and VSS here, maybe because of ripple at o/p which causes o/p voltage to be higher than VDD)
pg_current (VSS) { .. } //similarly for VSS. VSS current lot higher than VDD current as only small amount of short circuit current flows thru VDD
}
}//end of dynamic current section
...
pin(Z) { ...
timing() { //For CCS, timing section has extra CCS LUT
cell_rise (delay_tem...) { .... } //regular NLDM LUT is also present here, so that NLDM will be used if specified in the tool
ccsn_first_stage () { //This specs CCS for first stage of gate (channel connected block or CCB) if gate has multiple stages inside it. For ex, AND gate has nand followed by inverter. So, we repeat this section for last_stage too.
is_inverting : true;
is_needed : true;
when: "A&!SE|SD"; //all CCS values below can be defined condition based
miller_cap_fall : 0.000207711;
miller_cap_rise : 0.000205185;
stage_type : both;
dc_current (ccsn_dc_template) { //2D dc current table which lists the DC current measured at CCB o/p node, with indexes specifying i/p node and o/p node voltage
index_1 ("-0.95, -0.475, -0.19, -0.095, 0, 0.0475, 0.095, 0.1425, 0.19, 0.2375, 0.285, 0.3325, 0.38, 0.4275, 0.475, 0.5225, 0.57, 0.6175, 0.665, 0.7125, 0.76, 0.8075, 0.855, 0.9025, 0.95, 1.045, 1.14, 1.425, 1.9"); //i/p voltage
index_2 ("-0.95, -0.475, -0.19, -0.095, 0, 0.0475, 0.095, 0.1425, 0.19, 0.2375, 0.285, 0.3325, 0.38, 0.4275, 0.475, 0.5225, 0.57, 0.6175, 0.665, 0.7125, 0.76, 0.8075, 0.855, 0.9025, 0.95, 1.045, 1.14, 1.425, 1.9"); //o/p voltage
values ( "0.436551, 0.363591, 0.349409, 0.343731, 0.337281, 0.333664, ... ", ) //and so on ..
}
output_voltage_rise() { //voltage waveforms are not important in CCS, as currents are used to come up with delay and slew at o/p (I = C*dV/dt, so deltaV can be calculated from i(t) and C). So, we see very few vectors for voltage waveform, but a lot for current waveform
vector (ccsn_vout_template) {
index_1 ("0.02306"); => i/p tran
index_2 ("0.0018245"); => o/p cap
index_3 ("0.0215692, 0.0266808, 0.03194, 0.0380183, 0.0470364"); => time
values ( \
"0.095, 0.28, 0.475, 0.66, 0.82" \ => provides sample points of o/p voltage. voltage is 0.095V at ~21.6ps, then 0.28V at ~26.7ps, and so on ..
);
}
vector (ccsn_vout_template) { ... } //this is repeated for diff slew rates and load
}
output_voltage_fall() { ... }
output_current_rise() { //most important section for CCS. It provides the detailed current waveform at every i/p slew and o/p load point. So, for 7x8 NLDM LUT, there would be about 56 (7*8) vectors here. So, this section is usually long
vector(CCT) {
reference_time : 0.05; => time at which the i/p waveform crosses its delay threshold
index_1("0.1"); => i/p tran
index_2("2.1"); => o/p cap
index_3("1.0, 1.5, 2.0, 2.5, 3.0"); => time
values("0.0003, 0.007, 0.022, 0.027, 0.028" ); => current values of the driver model for o/p rising. NOTE: current is not in the shape of a bell curve here, not sure why, maybe the rise is very sharp, so the falling part of the curve is not captured here
}
vector(next1) { .. } //for other slew rates and load
}
}
}
output_current_fall() { ... }
propagated_noise_high () { // This is needed for noise analysis. It propagates noise thru the cell, and shows how the o/p waveform looks for different i/p noise waveforms
vector (ccsn_prop_template) { //similarly for other vectors
index_1 ("0.595548"); => i/p noise height
index_2 ("0.283096"); => i/p noise width
index_3 ("0.0018245"); => o/p cap
index_4 ("0.141002, 0.154822, 0.183612, 0.20954, 0.226069"); => time
values ( \
"0.810785, 0.727257, 0.671571, 0.727257, 0.810785" \ => waveform of o/p noise sampled at various times. At t=0.14, V=0.81V (close to VDD), then it dips to 0.67V, then goes back up to 0.81V. For noise_low, it will be a bump up from VSS, then back to VSS
);
}
propagated_noise_low () { ... } //for low noise
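receiver_capacitance1_rise (delay_template_7x7_0) { //NOTE: these cap tables sit under the timing arc of the o/p pin, but in this arc-based receiver model they describe the cap seen at the related i/p pin, as a function of i/p slew (index_1) and o/p load (index_2). This is in addition to the cap values given under the i/p pin itself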
index_1 ("0.00205853, 0.00859214, 0.0216594, 0.0477043, 0.0998837, 0.204153, 0.412781");
index_2 ("0.00023, 0.00081, 0.00196, 0.00426, 0.00887, 0.01807, 0.03649");
values ( \
"0.000400186, 0.000424999, 0.000440668, 0.000447652, 0.000451636, 0.000453669, 0.000454712", \
....
"0.000527481, 0.000507608, 0.000493188, 0.000483650, 0.000477769, 0.000473495, 0.000472111" \
);
}
receiver_capacitance2_rise (delay_template_7x7_0) { .. }
receiver_capacitance1_fall (delay_template_7x7_0) { .. }
receiver_capacitance2_fall (delay_template_7x7_0) { .. }
} //end of ccsn_first_stage
ccsn_last_stage () { .... } //repeat whole section above for last stage if more than 1 stage is present in stdcell. NOTE: last stage is the important one for any stdcell, as we care about what comes out at the o/p of the cell, and not much about what happens on internal nodes. Usually, if stdcell has only 1 stage, we only have values for ccsn_first_stage (which is actually the last stage). If stdcell has multiple stages, then ccsn_first_stage is very small (has only voltage and noise waveforms, no other groups)
internal_power () {
related_pin : "I";
related_pg_pin : VDD;
rise_power (power_template_7x7_0) { .. } //tables for both rise and fall power. Only shown for VDD pin, as power is delivered via VDD only
fall_power (power_template_7x7_0) { .. }
...
}
}
}
}
NOTE: there may be too many such vectors needed to represent current adequately at each slew rate and load. So, .lib also has a compact CCS representation, so that the .lib file doesn't grow tremendously.
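The I = C*dV/dt relation mentioned in the listing above can be sketched in a few lines of Python. This is only a rough illustration, not how any particular timing tool does it: it assumes a single lumped constant cap at the o/p, starts the o/p voltage at 0 at the first sample, and uses the short output_current_rise vector shown above. The function name voltage_from_current is made up for this example, and the results are in whatever units the library header declares.

```python
# Sketch: recover an o/p voltage waveform from one CCS current vector using I = C*dV/dt.
# Assumptions (not from the .lib): lumped constant load cap, v = 0 at the first sample.

def voltage_from_current(times, currents, cap, v0=0.0):
    """Integrate i(t) with the trapezoidal rule and divide by C to get v(t)."""
    volts = [v0]
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        dq = 0.5 * (currents[k] + currents[k - 1]) * dt  # charge delivered in this step
        volts.append(volts[-1] + dq / cap)
    return volts

times = [1.0, 1.5, 2.0, 2.5, 3.0]                # index_3 of the vector above
currents = [0.0003, 0.007, 0.022, 0.027, 0.028]  # values of the vector above
cap = 2.1                                        # index_2 (o/p cap)

volts = voltage_from_current(times, currents, cap)
for t, v in zip(times, volts):
    print(f"t={t:.1f}  v={v:.6f}")
```

Delay and slew would then be read off this v(t) curve at the library's threshold points. A real tool also has to deal with the receiver cap, Miller cap and the interconnect, which this sketch ignores.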
Variations in process parameters: To account for these, new extensions were added to liberty
Liberty Variation Format (LVF): This is an extension to the lib format. It is used to specify variation parameters which are needed for OCV timing analysis. Many new groups are defined for LVF. We can use these groups in regular .lib files, as long as the tools support reading these LVF groups.
timing () {
cell_rise (delay_temp_8x8) { //regular cell delay for rise
index_1 ("0.0019, 0.0058, 0.0137, 0.0295, 0.061, 0.1241, 0.2502, 0.5025");
index_2 ("0.00016, 0.00088, 0.00232, 0.00519, 0.01093, 0.02241, 0.04538, 0.09131");
values ( \
"0.00814129, 0.0102127, 0.0141374, 0.0218088, 0.0370626, 0.067524, 0.128439, 0.250268", \
...
"0.17358, 0.190633, 0.214257, 0.245593, 0.287164, 0.341309, 0.429989, 0.585219" \
);
}
ocv_sigma_cell_rise (delay_temp_8x8) { //sigma values for cell delay rise. Each value specifies 1 sigma delta from nominal delay value above. Used in POCV analysis. Here sigma value is different for different slew/load.
sigma_type : early;
index_1 ("0.0019, 0.0058, 0.0137, 0.0295, 0.061, 0.1241, 0.2502, 0.5025");
index_2 ("0.00016, 0.00088, 0.00232, 0.00519, 0.01093, 0.02241, 0.04538, 0.09131");
values ( \
"0.000311555, 0.000401142, 0.000580422, 0.000937877, 0.00165292, 0.00308312, 0.00594014, 0.011653", \ => Here, 0.00031 is the 1 sigma offset from the mean of 0.0081 specified above for the same load/slew point. So, offset is about 4% of the mean, which can be significant when added across multiple gates. Also, note that sigma offset as a % of mean delay is diff for diff load/slew points, so a single sigma offset value would have been inaccurate.
....
"0.0101789, 0.0101964, 0.0102317, 0.0103037, 0.0104537, 0.0107767, 0.0143185, 0.021458" \
);
}
ocv_sigma_cell_rise (delay_temp_8x8) { //same table, but with late sigma values
sigma_type : late;
index_1 ("0.0019, 0.0058, 0.0137, 0.0295, 0.061, 0.1241, 0.2502, 0.5025");
index_2 ("0.00016, 0.00088, 0.00232, 0.00519, 0.01093, 0.02241, 0.04538, 0.09131");
values ( \
"0.000391953, 0.000508076, 0.000740406, 0.00120356, 0.00212998, 0.00398288, 0.00765485, 0.0149973", \
...
"0.0104575, 0.010507, 0.0106064, 0.0108063, 0.0112123, 0.0120466, 0.0167537, 0.026188" \
);
}
... }
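To see how these sigma tables might be used, here is a small Python sketch using the first table point above (nominal 0.00814129, early sigma 0.000311555, late sigma 0.000391953). It is a simplification under stated assumptions: the 3 sigma multiplier is just a common choice and is set in the tool, not in the .lib; the path calculation assumes stage variations are independent, so sigmas add as root-sum-square while means add linearly. The function names are made up for this example.

```python
import math

N_SIGMA = 3.0  # assumption: number of sigmas is a tool setting, not part of the .lib

def corner_delay(nominal, sigma, n_sigma, late):
    """Late delay = nominal + n*sigma, early delay = nominal - n*sigma."""
    return nominal + n_sigma * sigma if late else nominal - n_sigma * sigma

def path_late_delay(stage_nominals, stage_sigmas, n_sigma):
    """Means add linearly; independent sigmas add as root-sum-square."""
    mean = sum(stage_nominals)
    sigma = math.sqrt(sum(s * s for s in stage_sigmas))
    return mean + n_sigma * sigma

nominal = 0.00814129       # cell_rise, first table point
sigma_early = 0.000311555  # ocv_sigma_cell_rise, sigma_type early
sigma_late = 0.000391953   # ocv_sigma_cell_rise, sigma_type late

early = corner_delay(nominal, sigma_early, N_SIGMA, late=False)
late = corner_delay(nominal, sigma_late, N_SIGMA, late=True)
print(f"single gate: early={early:.6f} late={late:.6f}")

# Hypothetical path of 4 identical stages, all at the same table point
path = path_late_delay([nominal] * 4, [sigma_late] * 4, N_SIGMA)
print(f"4 stage path, late: {path:.6f}")
```

With root-sum-square, the 4 stage path sigma is only 2x the single stage sigma, not 4x. That is why a statistical approach is less pessimistic on long paths than applying a flat derate to every gate.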