voltus

Last Updated: Wednesday, 01 May 2019 13:36; Published: Wednesday, 01 May 2019 13:36

voltus:
------
power analysis tool (cadence). Cadence® Voltus IC Power Integrity Solution is a full-chip, cell-level power signoff tool that provides high-capacity analysis and optimization technologies to designers for debugging, verifying, and fixing IC chip power consumption, IR drop, and electromigration (EM) constraints and violations.

voltus replaced EPS 13.2 (Encounter Power System)
version 16.1: analyze pkg data with Sigrity, Innovus gui. Main features:
- pwr calc for static and dynamic
- EM/IR on Power grid (PG) for static and dynamic
- static timing due to IR drop impact
- decap opt: insertion/removal
- power gating switches: analysis of ramp-up and steady state

A=activity (net switching from 0->1 or 1->0 in 1 clk cycle). A of clk = (1+1)/1 = 2
D=duty cycle, % of time net has value 1 during sim time
TD=transition density = num of times signal toggles from 0->1 or 1->0 in 1 sec. TD=A*F. So, for clk=1MHz, TD=(1+1)/1us=2e+06/sec
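The three definitions above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration and are not Voltus commands.

```python
def activity(rise_count, fall_count, clk_cycles):
    """A: average number of toggles (0->1 plus 1->0) per clk cycle."""
    return (rise_count + fall_count) / clk_cycles

def duty_cycle(time_high_s, sim_time_s):
    """D: fraction of sim time the net spends at value 1."""
    return time_high_s / sim_time_s

def transition_density(act, clk_freq_hz):
    """TD: toggles per second, TD = A * F."""
    return act * clk_freq_hz

# The clk rises once and falls once in every cycle.
a_clk = activity(rise_count=1, fall_count=1, clk_cycles=1)
td_clk = transition_density(a_clk, clk_freq_hz=1e6)   # 1 MHz clk
d_clk = duty_cycle(time_high_s=0.5e-6, sim_time_s=1e-6)  # assumes a 50% duty clk

print(a_clk, td_clk, d_clk)
```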

static pwr: avg activity is calc for each net, and pwr is reported for the whole sim window as 1 pwr number. So pwr at each point in time is not known; it's avg pwr over the whole window, hence called static. (k-factor pwr scaling parameters for PVT can be provided, which will scale the pwr number accordingly)
- switching pwr = C*V^2*f = 1/2*C*V^2*F*A = 1/2*C*V^2*TD (factor of 1/2 is needed since A=2 for clk, i.e. TD=2*F, while only 1 charge/discharge cycle happens per clk period)
- Int pwr = from .lib (both int switching and int feed thru pwr). The .lib value is energy (Joules), not power (Watts). If lkg power is in pW, then energy is in pJ. Int pwr can be on both i/p pin (mostly for macros) and o/p pin (mostly for stdcells). i/p pin pwr can be state dependent (depends on what state other i/p pins are at), o/p pin pwr is based on an i/p slew rate + cap load LUT. Since P=Energy*Freq*AF, P=1/2*(Erise + Efall)*TD. For reference, on each charge the supply provides C*V^2: 1/2*C*V^2 is stored in the cap and an equal amount is lost in the resistor. The stored 1/2*C*V^2 is then lost in the resistor on discharge, so 1 charge and discharge of the cap dissipates C*V^2 in resistors.
- lkg pwr = from .lib (state dependent lkg, generic lkg pwr also provided for states not defined)
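A small numeric sketch of the switching and internal power formulas above. The cap, voltage and energy values are hypothetical, picked only to show that 1/2*C*V^2*TD agrees with C*V^2*f for a clk net.

```python
def switching_power(cap_f, vdd_v, td_per_s):
    """Net switching power in W: 1/2 * C * V^2 * TD."""
    return 0.5 * cap_f * vdd_v ** 2 * td_per_s

def internal_power(e_rise_j, e_fall_j, td_per_s):
    """Internal power in W from rise/fall energies: 1/2 * (Erise + Efall) * TD."""
    return 0.5 * (e_rise_j + e_fall_j) * td_per_s

# Hypothetical clk net: 10 fF load, 1.35 V supply, 1 MHz clk -> TD = 2e6 toggles/sec.
p_sw = switching_power(cap_f=10e-15, vdd_v=1.35, td_per_s=2e6)

# Same net computed as C*V^2*f; the 1/2 cancels the factor of 2 in TD = 2*F.
p_sw_cv2f = 10e-15 * 1.35 ** 2 * 1e6

# Hypothetical cell arc with Erise = Efall = 0.006 pJ at the same TD.
p_int = internal_power(e_rise_j=0.006e-12, e_fall_j=0.006e-12, td_per_s=2e6)

print(f"switching = {p_sw:.3e} W, internal = {p_int:.3e} W")
```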

ex: Int pwr: for 2 i/p AND, we see int_pwr showing on o/p Y (related to pin A, and then for pin B). We calc TD(A)=tran density on pin A, TD(B), and calc int_pwr of Y = 1/2*(Erise(A)+Efall(A))*TD(A) + 1/2*(Erise(B)+Efall(B))*TD(B) => see pg 153 of voltus user guide.
If TD(A)=1000/sec, TD(B)=2000/sec, then B's contribution to pwr at Y is 2X that from A.
ex: see pg 190 of voltus user guide. If gate is XOR gate, then TD(Y) != TD(A)+TD(B). since int_pwr is mostly reported for o/p pin, we need to look at TD(Y) to calc pwr at pin Y. We calc pwr(A->Y)(assuming Y switching coming exclusively from A switching) and pwr(B->Y)(assuming Y switching coming exclusively from B switching). Then we divide that pwr depending on TD(A) and TD(B) as follows:
Int_pwr(Y) = [pwr(A->Y)*TD(A) + pwr(B->Y)*TD(B)] / [TD(A)+TD(B)]
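The two formulas above (per-arc sum for the AND gate, TD-weighted average for the XOR case) can be sketched as follows. The energies and per-arc powers are hypothetical numbers; only the TD values 1000/sec and 2000/sec come from the example.

```python
def int_power_sum(arcs):
    """AND-gate style: sum of 1/2*(Erise+Efall)*TD over i/p arcs.

    arcs: list of (e_rise_j, e_fall_j, td_per_s), one entry per i/p pin.
    """
    return sum(0.5 * (e_r + e_f) * td for e_r, e_f, td in arcs)

def int_power_weighted(arc_powers):
    """XOR style: per-arc powers weighted by the TD of the related i/p pin.

    arc_powers: list of (pwr_w, td_per_s) with pwr_w = pwr(X->Y).
    """
    total_td = sum(td for _, td in arc_powers)
    return sum(pwr * td for pwr, td in arc_powers) / total_td

# AND gate, identical 1 pJ rise/fall energy on both arcs (hypothetical).
contrib_a = int_power_sum([(1e-12, 1e-12, 1000.0)])
contrib_b = int_power_sum([(1e-12, 1e-12, 2000.0)])
and_total = int_power_sum([(1e-12, 1e-12, 1000.0), (1e-12, 1e-12, 2000.0)])

# XOR gate with hypothetical pwr(A->Y) = 3 nW and pwr(B->Y) = 6 nW.
xor_total = int_power_weighted([(3e-9, 1000.0), (6e-9, 2000.0)])

print(contrib_b / contrib_a, and_total, xor_total)
```

With equal energies, pin B contributes 2X the power of pin A, as stated above.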

ex: Int pwr for ram macro: int_pwr is defined on clk pin only, for the 3 conditions below. (clk pin is chosen for power reporting since EZ/WZ signals on clk edge decide what mode macro is in. We monitor pwr for both high/low of clk, but since EZ/WZ don't change on -ve edge of clk, pwr is calc correctly: EZ/WZ remain stable for 1 full clk cycle, so whatever the EZ/WZ status is on +ve edge of clk is what is used for pwr calc.)
1. idle: when: "(EZ)"; for various i/p slew rates on CLK = 4pJ
2. read: when: "(WZ&!EZ)"; for various i/p slew rates on CLK = 50pJ
3. wrt: when: "(!WZ&!EZ)"; for various i/p slew rates on CLK = 57pJ
Int_pwr for macro is calc similar to AND gate above. TD(clk) is calc, then we find the probability (fraction of time) that each of these 3 cond appears, and then calc pwr:
Int_pwr = TD(clk)*[P(1)*E(1) + P(2)*E(2) + P(3)*E(3)]/[P(1)+P(2)+P(3)]
ex: TD(clk) = 5 toggles in 20us time window chosen from VCD. For EZ=1 (idle the whole time), E=4pJ => P=E*TD=4pJ*5/20us=1uW. Voltus reports .001mW in .rpt, which matches our calc.
ex: TD(clk) = 4 toggles in 20us time window chosen from VCD. For 8.3us it's in idle, and for the remaining 11.7us it's in wrt. So P=[E(1)*P(1)+E(3)*P(3)]/[P(1)+P(3)]*TD=(4pJ*8.3/20+57pJ*11.7/20)*4/20us=7uW. voltus reports 6.7uW => close enough
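The two hand calcs above can be reproduced with a short script. The energies (4/50/57 pJ) and the time windows are from these notes; the function and dict names are made up for illustration.

```python
# Per-state clk-pin energies in Joules, from the 3 conditions listed above.
ENERGY_J = {"idle": 4e-12, "read": 50e-12, "wrt": 57e-12}

def macro_int_power(td_clk_per_s, state_prob, state_energy_j=ENERGY_J):
    """Int_pwr = TD(clk) * sum(P(i)*E(i)) / sum(P(i)).

    state_prob maps a state name to the fraction of time spent in it.
    """
    total_prob = sum(state_prob.values())
    avg_energy = sum(p * state_energy_j[s] for s, p in state_prob.items()) / total_prob
    return td_clk_per_s * avg_energy

# Case 1: 5 toggles in 20us, idle (EZ=1) for the whole window.
p_idle = macro_int_power(5 / 20e-6, {"idle": 1.0})

# Case 2: 4 toggles in 20us, 8.3us idle and 11.7us wrt.
p_mixed = macro_int_power(4 / 20e-6, {"idle": 8.3 / 20, "wrt": 11.7 / 20})

print(f"idle only: {p_idle * 1e6:.2f} uW, idle+wrt: {p_mixed * 1e6:.2f} uW")
```

Case 1 comes out at 1uW and case 2 at about 7uW, matching the hand calcs (voltus reported .001mW and 6.7uW respectively).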

dynamic pwr: It analyzes current over a specified time, and calc pwr at each point in time. It can be either vectored or vectorless. Since VCD can be very large, dynamic pwr analysis should be limited to at most 5 clk cycles of fastest/dominant clk with max switching activity. Vectorless approach generates worst case activity by propagating user supplied activity, and so gives pessimistic results.

cmd:
---
voltus -file run.tcl -no_gui => for no gui

run.tcl:
---
1. load design/library
- read_lib -max {CORE/liberty/MSL700_W_125_1.35_CORE_iso_pg.lib liberty/ssbwmv3m04096033080_W_125_1.35.lib ...} => these pg.lib files have PG pins (VDD/VSS) for all cells, and related power pins for each i/p, o/p pin (A,Y). .lib files also have lkg pwr and internal power (for o/p pin) in pW. internal pwr includes crowbar (short-circuit) power.
ex: inv1 => lkg pwr =25pW, int pwr = 0.006pW
- read_lib -lef {MSL700_tech.lef} => This has all metal/via rules and shapes.
- read_verilog netlist.v => this reads gate level netlist
- set_top_module dig_top => This reads all libs/verilog above

- read_def digtop.def => def file needed for layout
- #read_power_domain -cpf cpf1 => optional
- read_spef -rc_corner QC_MAX_1.5 {QC_MAX_1.5_ATD_W_125_1.35_maxC_maxvia_decoupled_125.spef.gz} => spef file to get RC for interconnects. WLM can also be provided if spef not available. -rc_corner is optional.
- read_sdc func.sdc => sdc file for func mode

- source scripts/view_definition.tcl => this is from MMMC flow in PnR. create rc_corner, library_set, delay_corner, constraint_mode and analysis_view for multiple corners.
- set_analysis_view -setup [list func_QC_MAX_1.5_ATD_W_125_1.35] -hold [list func_QC_MAX_1.5_ATD_W_125_1.35] => set view for max corner. same corner used for setup and hold as we want same kind of delays on both setup/hold. Pwr view can be set to only 1 view at a time, either setup or hold. Use "set_analysis_mode -checkType setup" to choose setup view as power view.
- #update_timing => optional

2. setup:
- setup switching activity
A. vectorless: transition density and duty cycle of nets specified. At the least, sw of PI should be provided. Tool can propagate transition activity thru combo logic. activity prop thru seq cells is hard, as they mostly have loops. so, best to provide AF at o/p of seq cell. similarly, AF at en pin of clk gaters should be provided, since activity propagation for clk en pin may prop incorrectly. For macros, AF at rd/wrt i/p pins should be provided.
     - Set up defaults for inputs/flops
              set_default_switching_activity -duty 0.5 -seq_activity 0.2 -input_activity 0.1 => 0.2 at o/p of flops
              set_default_switching_activity -global_activity 0.2 -clock_gates_output_ratio 1
     - Can also set specific activities on specific pins
              set_switching_activity -activity 0.2 -duty 0.5 -inst flopA
B. vectored: provide VCD file from gate/RTL sim, or TCF file which provides toggle counts for each net.
     - read_activity_file -format VCD design_mode.vcd.gz -start 461us -end 639us -scope usbpd_testbench_bga0/usbpd_digtop_0 => specify start/end time within vcd that is to be used (by default, entire time window is used). -scope specifies module within vcd to be used. The scope path comes from these lines in the vcd:
$scope module usbpd_testbench_bga0 $end => line in vcd that specifies scope
$scope module usbpd_digtop_0 $end => line in vcd

3. setup and analyze power
static/dynamic power:
A.setup
GUI: pwr_and_rail->set_pwr_analysis_mode(set analysis=Static)
set_power_analysis_mode -method static \ => use -method dynamic_vectorbased for dynamic pwr
                        -analysis_view func_QC_MAX_1.5_ATD_W_125_1.35 \
                        -create_binary_db true \ => save plot data power.db
                        -use_encounter_db false \
                        -transition_time_method max \
                        -write_static_currents true \
#                        -disable_static false \
#                        -ignore_control_signals false \
#                        -read_rcdb true

B. analyze power
GUI: pwr_and_rail->run_pwr_analysis (fill tabs for basic, activity, power, advanced)
GUI: pwr_and_rail->text_rpts(pwr_analysis) => to see reports
GUI: pwr_and_rail->pwr_rail_plots (select pwr, load power.db) => will show all pwr/activity
#dynamic current plot
GUI: pwr_and_rail->dynamic results->waveforms (select pwr waveform, choose pwr db, add waveform file dynamic_VDD.ptiavg, select any inst and click plot. To see current for all of VDD, choose "total current" from composite waveform menu) => shows dynamic current in simvision. Current (NOT pwr) is shown. Current shows up from time 0 to end time in vcd file. We should see current spikes around clk edges. We can also plot current for all clks only.
#pwr profiling plot
If we have pwr profiling going on, then we can choose "profiling histograms" in above case, choose same pwr db, then add *.rpt.trn waveform file (*.trn file gets generated automatically for pwr profiling), select that *.trn waveform file and click Plot. On the plot, we will see pwr (NOT current), in histogram (bars) form from start time to end time of vcd file. We see pwr histogram in widths of step (if step=1us, then for 10us vcd run time, we see 10 bars with width of 1us each). It shows total pwr as well as switching, lkg and int pwr. It shows only for top level of hier. Also, pwr number here is for each separate time step, so to calculate avg pwr over the whole window, we have to multiply pwr of each step by its step width, add these up, and then divide by total time. Note that this avg should equal the static pwr number, as static pwr is just avg of pwr over whole time domain.
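The averaging described above is a time-weighted mean of the histogram bars. A minimal sketch, with hypothetical per-step pwr values (in practice they would be read off the *.trn plot or the profiling report):

```python
def average_power(step_powers_w, step_widths_s):
    """Time-weighted avg: sum(P_i * dt_i) / sum(dt_i)."""
    if len(step_powers_w) != len(step_widths_s):
        raise ValueError("need one width per power step")
    total_time = sum(step_widths_s)
    energy_j = sum(p * dt for p, dt in zip(step_powers_w, step_widths_s))
    return energy_j / total_time

# Hypothetical 10us vcd window profiled with step=1us -> 10 bars.
powers_w = [1e-3] * 5 + [3e-3] * 5   # 1 mW for 5 steps, then 3 mW for 5 steps
widths_s = [1e-6] * 10

avg_w = average_power(powers_w, widths_s)
print(f"avg pwr over window = {avg_w * 1e3:.3f} mW")
```

With equal step widths this reduces to a plain mean of the bars; the weighting only matters if the last step is shorter than the others.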

set_power_output_dir my_dir
report_power -outfile pwr.rpt => this dumps results in my_dir/pwr.rpt
report_power -outfile dir1/pwr.rpt => this dumps results in dir1/pwr.rpt. overrides output_dir set above.
report_power -no_wrap \ => reports static/dynamic pwr depending on settings above
         -output staticReports \ => o/p dir that stores all pwr rpt
         -report_prefix design.power \ => o/p files prefixed with "design.power".*.rpt
         -view func_QC_MAX_1.5_ATD_W_125_1.35 \ => pwr analysis is run on only 1 view at a time
         -instances {*} \ => specifies inst to include in pwr rpt. shows rpt for each inst in instpwr.rpt
#         -cell {CTB*} => specifies cells to include in pwr rpt.
#            -format {simple|detailed} => reports pwr consumed by all nets in simple/detailed
#         -hierarchy {all} => reports for all hier level starting from top to leaf. shows rpt in hierpwr.rpt. hier level of 0 reports only for top level, while 2 will report for 2 levels below top level. default is all.
#         -net -nworst 100 => reports net switching pwr for each net in design. -nworst 100 reports only 100 nets with highest net sw pwr. useful for debug
         -create_power_db true => creates power database

#report_power -hierarchy 3 -outfile hier.rpt => reports pwr 3 levels down
#report_power -instances all -outfile inst.rpt =>
#report_power -net -nworst 1000 -outfile net.rpt =>

#report_instance_power inst1 -outfile inst1.rpt => Generate detailed report on power calculation for specific instance. very powerful cmd, shows internal power calc method. (for dynamic pwr runs, we have to run this too: set_power_include_file)

#restore_power_database -file power.db => to restore old power db results from prev run

#vector profiling => identifies windows with max activity and power which then drives dynamic vector based pwr analysis.
2 types of vector profiling:
1. avg vector profiling: vector profiler computes average toggle density within each step to compute and display average power profile of a VCD/FSDB file. default step size is 2 times the fastest clock period.
2. event-based profiling: vector profiler computes power profile of every event on each net. This accurately capture vectors that could produce peak power using very small resolution. default step size is 1ps.
report_vector_profile -event_based_peak_power -write_profiling_db true -detailed_report true -outfile func_power.rpt -step 1000 => -average_power does avg vector profiling. step size here is specified as 1000ns=1us. This will report intervals of 1us with max power. NOTE: step size is calc automatically if start/stop time for vcd is provided (ignores -step in that case). Then we choose time window with max power, and use that time window in vcd file to once again do vector profiling with -step 1 (1ns step). This gives us max pwr for that 1ns time window. Or, we can do this to do it all in one run:
report_vector_profile -event_based_peak_power -write_profiling_db true -detailed_report true -outfile func_power.rpt -step 1 => 1ns window
read_activity_file -reset
read_activity_file -start $worst_power_window_start -end $worst_power_window_end => stores the worst power window in these var
report_power -outfile worst_pwr.rpt
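Conceptually, choosing the worst power window from a profile is just picking the step with max pwr. The sketch below is NOT how voltus does it internally; it only illustrates the idea on a hypothetical list of per-step pwr values, with a made-up function name.

```python
def worst_power_window(step_powers_w, step_ns, start_ns=0.0):
    """Return (start, end) in ns of the profiling step with max pwr."""
    if not step_powers_w:
        raise ValueError("empty profile")
    worst_idx = max(range(len(step_powers_w)), key=lambda i: step_powers_w[i])
    win_start = start_ns + worst_idx * step_ns
    return win_start, win_start + step_ns

# Hypothetical coarse profile with -step 1000 (1us steps).
profile_w = [1e-3, 4e-3, 2e-3]
worst_start_ns, worst_end_ns = worst_power_window(profile_w, step_ns=1000.0)
print(worst_start_ns, worst_end_ns)
```

The resulting start/end would then play the role of $worst_power_window_start and $worst_power_window_end in the cmds above.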

view_analysis_results => Ability to script loading power results without navigating GUI menus
view_dynamic_waveform -type profile -waveform_files func_power.rpt.trn => runs simvision to display dynamic power. *.trn is dumped auto, when "-write_profiling_db true"
write_tcf top.tcf => Dump out toggle count for every net or pin in design. Useful for comparing toggle propagation between different setups

----
effective resistance: calc eff res b/w 2 nodes on PG, or from any node/inst to voltage src. does it for all inst in design in 2 modes:
1. net based: analyze_resistance -net <net_name> (o/p=effr.rpt)
2. domain based => analyze_resistance -domain <domain_name> (o/p=domain_effr.rpt). gives Rvdd+Rvss for all nets in design

#tcl file
set_pg_nets -net VDD -voltage 1.10 -threshold 0.99 -tolerance 0.3 -force
set_pg_nets -net VSS -voltage 0.00 -threshold 0.11 -tolerance 0.3 -force
set_rail_analysis_mode -ignore_shorts true -work_directory_name work.zx -method static -accuracy hd -enable_manufacturing_effects true -power_grid_library ../accurate_stdcells.cl -temp_directory_name ./tmp.zx -cell_ignore_file ../fill.list => accuracy=xd is used for relaxed accuracy (based on lef files), while hd is used for high accuracy (based on gds files)
set_rail_analysis_domain -name PD -pwrnets VDD -gndnets VSS => needed for domain based
set_power_pads -net VDD -format xy -file ../VDD.pp => pwr pad location has to be specified
set_power_pads -net VSS -format xy -file ../VSS.pp   => pwr pad location has to be specified
set_package -spice ../pkg.spi -mapping ../pkg.map
analyze_resistance -net VDD => for net based Reff. -node_list can be specified for exact coord where we want Reff to be measured. -node_list {{90 11 M4} {75 11 M4}}. -instance_list can be used to specify inst where we want Reff to be measured. -instance_list {{INV1 vdd} {INV2 vdd}}
analyze_resistance -domain PD => for domain based Reff

------------
IR drop/ gnd bounce: IR_drop = drop in VDD, gnd_bounce=inc in VSS. causes timing problems.
---
IR drop inc if more cells switch together (i.e more I), or if line more resistive (i.e more R).

Static IR drop: avg current draw is used to calc IR drop. Usually peak I is much higher than avg I when looked at over small time windows, but adding enough decoupling caps smoothens out this peak I, so that static IR and dynamic IR get close enough. Typical limit for IR drop is 2-5%.
dynamic IR drop: peak current draw is used to calc IR drop. Waveforms show transient I. Decoupling caps reduce dynamic IR drop. Many decoupling caps are inbuilt (gate cap, diffusion cap, parasitic cap b/w pwr/gnd), while other decoupling caps are inserted on purpose to reduce dyn IR. Too much decap will cause more pwr, as current leaks thru decap, so voltus tries to reposition existing decaps more effectively before adding new decaps.
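A back-of-envelope sketch of static vs dynamic IR drop as a % of VDD. VDD=1.10V matches the set_pg_nets lines in these notes; the grid resistance and the avg/peak currents are hypothetical, and a real grid is a distributed network rather than 1 lumped resistor.

```python
def ir_drop_percent(current_a, resistance_ohm, vdd_v):
    """IR drop (I*R) as a percentage of nominal supply."""
    return 100.0 * current_a * resistance_ohm / vdd_v

VDD_V = 1.10          # nominal voltage, as in set_pg_nets -voltage 1.10
R_EFF_OHM = 2.0       # hypothetical lumped effective grid resistance
I_AVG_A = 0.010       # hypothetical avg current (static analysis)
I_PEAK_A = 0.040      # hypothetical peak current (dynamic analysis)
LIMIT_PERCENT = 5.0   # upper end of the typical 2-5% limit

static_drop = ir_drop_percent(I_AVG_A, R_EFF_OHM, VDD_V)    # about 1.8%
dynamic_drop = ir_drop_percent(I_PEAK_A, R_EFF_OHM, VDD_V)  # about 7.3%

for name, drop in (("static", static_drop), ("dynamic", dynamic_drop)):
    status = "OK" if drop <= LIMIT_PERCENT else "VIOLATION"
    print(f"{name} IR drop = {drop:.2f}% of VDD -> {status}")
```

Here the static number passes while the dynamic one fails, which is the case where adding or repositioning decaps is expected to help.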

symptoms of IR drop:
- logical malfunction. may be timing failures. Inc voltage usually resolves it.
- data dependent failure. When some data pattern causes excessive activity, resulting in large IR drop. Dec clk freq may resolve it.
- clock jitter. 5% IR drop on clk buffer can reduce its speed by 15%. The drop not only reduces the logical High voltage of gates, but also slows the charging/discharging of logic as lower voltage is available now.

For 130nm and below, manufacturing effects on wire widths, etc are modeled correctly in tools. Dishing, slotting, cladding affect wire width. Erosion affects wire thickness. Density rules for metal help reduce erosion. Metal fills (either floating or tied to gnd) used to be done at the foundry step, but now they are done in design stage to model changes in cap, etc. gnd metal fill causes higher cap than floating metal fill.

#tcl file:
set_pg_nets -net VDD -voltage 1.10 -threshold 0.99 -tolerance 0.3 -force
set_pg_nets -net VSS -voltage 0.00 -threshold 0.11 -tolerance 0.3 -force
set_rail_analysis_mode -method static -accuracy hd -power_grid_library {stdcells.cl mem.cl} => for dynamic use -method dynamic
#set_rail_analysis_mode -report_power_in_parallel true => this allows pwr analysis to be run in parallel with rail analysis. No need to run separate pwr analysis
set_rail_analysis_domain -name PD -pwrnets VDD -gndnets VSS => needed for domain based
set_power_pads -net VDD -format xy -file ../VDD.pp => pwr pad location has to be specified
set_power_pads -net VSS -format xy -file ../VSS.pp   => pwr pad location has to be specified
set_power_data -format current -scale 1 {static_VDD.ptiavg static_VSS.ptiavg} => o/p reports
analyze_rail -type domain -results_directory static_rail PD => runs static IR analysis on domain PD defined above

From the results, we can get plots for IRdrop, grid_res, resistor_current, current_density, etc
read_power_rail_results -rail_directory ALL_25C_avg_1/VSS
report_power_rail_results -plot ir -filename VSS.irdrop.report => by default, all text reports are generated

------------
EM: caused by movement of atoms in wire because of high current. In pwr grids, which have redundant wires, EM shows up as higher Res, while in signal wires, which provide unique connectivity, EM causes total failure. Shorts to neighboring wires may also cause total failure.
----
2 phenomena cause EM:
1. wearout: metal becomes narrower at places where metal atoms start moving, eventually causing wire to break. To reduce this, metal wires are built in a sandwich structure, with top and bottom layers made of a metal which is more resistant to EM, and the central layer being the real conducting metal (for ex: TiN = titanium nitride around Aluminum metal). This prevents total wire failure. Cu is increasingly used for metal as it not only offers lower Res, but also higher resistance to EM wearout.
2. Joule heating: high ac currents may cause excessive heating resulting in thermal expansion and temperature induced EM.

EM modeled using Black's eqn. MTTF obtained using this is used to calc prob of failure for a wire. Then using prob failure for each wire, failure prob for whole chip is calc.
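Black's eqn in its usual form is MTTF = A * J^(-n) * exp(Ea/(k*T)), where J is current density, T is temperature in Kelvin, Ea is activation energy and k is Boltzmann's constant. The sketch below uses illustrative values for A, n and Ea (real ones come from the foundry), and the chip-level roll-up assumes wire failures are independent, which is a simplification and not necessarily what voltus does.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def black_mttf(a_const, j_density, n_exp, ea_ev, temp_k):
    """Black's eqn: MTTF = A * J^(-n) * exp(Ea / (k*T))."""
    return a_const * j_density ** (-n_exp) * math.exp(ea_ev / (BOLTZMANN_EV_PER_K * temp_k))

def chip_failure_prob(wire_fail_probs):
    """P(chip fails) = 1 - prod(1 - p_i), assuming independent wire failures."""
    survive = 1.0
    for p in wire_fail_probs:
        survive *= (1.0 - p)
    return 1.0 - survive

# Illustrative only: A=1 (arbitrary units), n=2, Ea=0.7eV, T=398K (125C).
mttf_1x = black_mttf(1.0, 1e10, 2.0, 0.7, 398.0)
mttf_2x = black_mttf(1.0, 2e10, 2.0, 0.7, 398.0)  # double the current density

print(f"MTTF ratio for 2X current density: {mttf_2x / mttf_1x:.3f}")
print(f"chip fail prob for 2 wires at 10% each: {chip_failure_prob([0.1, 0.1]):.2f}")
```

With n=2, doubling J cuts MTTF to 1/4, which is why current density limits dominate EM signoff.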

--------------
pwr network optimization (PNO) and ESD analysis/opt also done by voltus.
