PVT effects and variations

Delay thru the Transistor:

A very important metric in digital design is the delay thru any gate. This determines the speed of the chip: the smaller the delay thru a transistor, the faster the gate, and the less time it takes for a signal to go from one flop to the next, resulting in a chip that can run faster.

In the previous section on "solid state devices", we saw the eqn for transistor current. Since this current determines the delay thru a transistor, a change in any of the parameters of the eqn could cause a change in current and hence a change in delay. These are the 3 input conditions that could affect transistor delay:

  1. Process: Any change in process parameters µ, Cox, W, L or VTH could cause a change in delay thru transistor. These 5 process parameters vary depending on the fabrication process used. The first 4 process parameters affect transistor current linearly (inversely for L), while Threshold Voltage enters thru the squared gate overdrive term and so has a more pronounced effect. A "fast" or "hot" process corner is one where these parameters change in a way that makes the transistor run faster. The converse is true for a "slow" or "cold" process corner. Fabs can usually target their process to their customer's needs.
  2. Voltage: Any change in the voltage applied to the terminals of a transistor could cause a change in delay. Here we show Vgs only, but Vds could cause a change too. Both of these voltage values are eventually dependent on supply voltage (VDD), so the supply voltage at the transistor terminals could impact delay thru it. The higher the voltage, the higher the transistor current and the lower the delay.
  3. Temperature: From the above eqn, it's not apparent that Temperature could cause any change in current. But if we look carefully, we notice that some process parameters are actually dependent on Temperature. 2 such params are Mobility (µ) and Threshold Voltage (VTH).
    1. Mobility (µ): Mobility of electrons or holes is determined by how fast they are able to move thru any medium. As we saw in the Resistance and Capacitance section, it's the change in avg speed per unit change in electric field. Recall that mobility of any charged particle is q*t/m, where t is the avg time between collisions. So as a charge travels longer before colliding with anything, it gets to a higher speed, and hence its mobility is higher. In a lattice structure of a compound or element, how far these electrons or holes travel depends on the lattice structure and size of atoms around these moving electrons/holes. In general, as Temp increases, these electrons/holes get more energy and are more agitated. So, they travel with faster speed, but also hit the lattice structure more often. The final effect is that mobility decreases with higher Temperature. Current decreases with lower mobility. So, the transistor gets slower with higher temperature.
    2. Threshold Voltage (VTH): Threshold Voltage of a transistor was explained in the "solid state device" section. It's basically the barrier that electrons/holes in the conduction band have to clear. As temp increases, more of these electron/hole pairs get into the conduction band, due to higher energy. This allows more electrons/holes to cross the hump, resulting in higher current, or effectively lower Threshold Voltage. In general, as Temp increases, Threshold Voltage gets lower as carrier concentration increases. Current increases with lower VTH, as current has a square dependency on gate overdrive voltage. So, the transistor gets faster with higher temperature.
    3. Net Effect: So, we see that these 2 mechanisms have opposing effects with increase in Temp: mobility causes transistors to slow down, while Threshold Voltage causes transistors to speed up. Net effect is hard to gauge w/o knowing the exact relation of these 2 factors with Temp. In the past, for designs at 180nm and above, increasing temperature used to make transistors slower, meaning mobility won over Threshold Voltage. We can see from the transistor current eqn that if VDD is a lot more than VTH, then even with square dependence, the effect of a VTH change will be muted. As an ex, consider the case where VTH is 10% of VDD. Now as Temp goes up, VTH will come down. A 20% reduction in VTH (from 0.1VDD to 0.08VDD) will cause a change of ((VDD-0.08VDD)/(VDD-0.1VDD))^2 = (0.92/0.9)^2 = 1.045, or about a 4.5% increase in drive current. This is assuming VTH goes down linearly with inc in Temp. However, a 20% reduction in µ will cause a 20% decrease in drive current (assuming µ goes down linearly with inc in Temp) over the same Temp increase. So, the net effect is 0.8 * 1.045 = 0.84, i.e. drive current drops by about 16% and the transistor gets slower as Temp increases in that range. This was always the expected behaviour.
      • Temperature Inversion: We saw above that inc in Temp resulted in slowdown of the transistor for 180nm and above. However, with sub 180nm designs, the trend started inverting, and transistors started running faster at higher temperatures, especially at lower voltages. This was due to the fact that VDD came down significantly with scaling of transistors, but VTH came down only a little. So, now VTH was about 50% of VDD. With increasing Temp, a 20% reduction in VTH (from 0.5VDD to 0.4VDD) will cause a change of ((VDD-0.4VDD)/(VDD-0.5VDD))^2 = (0.6/0.5)^2 = 1.44, or a 44% increase in drive current. This is assuming VTH goes down linearly with inc in Temp. However, a 20% reduction in µ will still cause the same 20% decrease in drive current (assuming µ goes down linearly with inc in Temp) over the same Temp increase. So, the net effect is 0.8 * 1.44 = 1.15, i.e. drive current goes up by about 15% and the transistor gets faster as Temp increases in that range. This phenomenon of transistors getting faster at higher temp was an anomaly and came to be known as temperature inversion.
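
The two worked examples above can be checked with a few lines of Python. This is only a sketch: it assumes the simple square-law model (drive current proportional to µ*(Vgs-VTH)^2, with Vgs = VDD), and the 20% drops in VTH and µ are the same illustrative numbers used above, not measured data. The function name drive_current_ratio is made up for this ex.

```python
def drive_current_ratio(vth_frac, vth_drop=0.20, mu_drop=0.20):
    """Ratio of hot to cold drive current under the square-law model.

    vth_frac: VTH as a fraction of VDD at the cold temperature.
    vth_drop: fractional reduction in VTH over the temperature rise (assumed).
    mu_drop:  fractional reduction in mobility over the same rise (assumed).
    """
    overdrive_cold = 1.0 - vth_frac                     # (VDD - VTH) / VDD, cold
    overdrive_hot = 1.0 - vth_frac * (1.0 - vth_drop)   # same, after VTH drops
    vth_gain = (overdrive_hot / overdrive_cold) ** 2    # square dependence on overdrive
    return (1.0 - mu_drop) * vth_gain                   # linear dependence on mobility


older_node = drive_current_ratio(0.10)   # VTH is 10% of VDD
scaled_node = drive_current_ratio(0.50)  # VTH is 50% of VDD

print(f"VTH = 10% of VDD: hot/cold current ratio = {older_node:.3f}")
print(f"VTH = 50% of VDD: hot/cold current ratio = {scaled_node:.3f}")
```

A ratio below 1 means the transistor gets slower when hot (the classic behaviour), while a ratio above 1 means temperature inversion.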

Delay thru R and C:

The above 3 conditions not only affect the delay thru a transistor, but also affect the delay thru wires, which have resistance and capacitance in them. Thus we have to consider the effect of PVT on Resistance (R) and Capacitance (C) too. When process is making a transistor weaker, there's no rule that says that R, C will get slower too (i.e. more resistance and higher capacitance). We'll have to look at equations for R, C to see their dependency on Process, Voltage and Temperature.

  1. Process: Process impacts R, C both ways, however its precise correlation with the transistor is hard to gauge. We usually get a range of R, C and use those limits to bound the box for R, C. Note that R and C usually move in opposite directions. For ex, a process that increases R because it's making the wires thinner will decrease C, as wires will have more distance between them. So, the product R*C may not change much across process variations.
    1. With lower nm tech, variations in metal/via R,C are significant. There are also a lot more R,C process corners than just Rmin,Cmin and Rmax,Cmax. With 2 or more masks on the same metal layer (in FinFETs <16nm), the variations are even more pronounced, as the 2 masks may shift relative to each other on the same metal layer. Most of the time, it's not possible to run timing tools for all R,C corners. So, we just pick a few R,C corners and then apply a BEOL margin to account for other corners which we may not have run, but which may show worse performance. This margin is only applied for hold timing, as hold is more critical (failing to meet hold timing will result in the chip not working).
  2. Voltage: Voltage has negligible impact on R, C to first order. Need to have an equation FIXME ?
  3. Temperature: Resistance increases with Temperature. However, capacitance doesn't have a clear relation with Temperature and will go up or down depending on dielectrics involved. Need to find more about C Vs T ? FIXME ?
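
To see why the R*C product moves much less than R or C alone, here is a toy model in Python. It's a rough sketch under loud assumptions that are not from any real process: wires sit at a fixed pitch, R is taken as proportional to 1/width, and C is taken as only the sidewall coupling, proportional to 1/spacing (ground capacitance and metal thickness are ignored). All names and numbers are made up for illustration.

```python
def wire_rc(width, pitch=1.0):
    """Toy per-unit-length R and C for a wire at fixed pitch (arbitrary units).

    Assumed model: R ~ 1/width, C ~ 1/spacing, spacing = pitch - width.
    """
    spacing = pitch - width
    r = 1.0 / width      # thinner wire -> higher resistance
    c = 1.0 / spacing    # thinner wire -> more spacing -> lower coupling cap
    return r, c


r_nom, c_nom = wire_rc(0.50)    # nominal: width equals spacing
r_thin, c_thin = wire_rc(0.45)  # process makes the wire 10% thinner

rc_change = (r_thin * c_thin) / (r_nom * c_nom) - 1.0

print(f"R change:   {r_thin / r_nom - 1.0:+.1%}")
print(f"C change:   {c_thin / c_nom - 1.0:+.1%}")
print(f"R*C change: {rc_change:+.1%}")
```

In this toy model R goes up and C goes down by roughly 10% each, while the product moves by only about 1%. Real wires have other capacitance components, so the cancellation is not this clean.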

Final Delay through a path involving Transistors and Wires:

Final delay thru a path depends on P, V, T. For a "weak" P, we assume transistors get weak and R, C get weak too. We don't mention R,C separately, as it's assumed that N (normal) process means typical transistor, typical R, and typical C. However, in reality we may want to consider variants where, for a strong process, the transistor may be strong but R, C may not be as strong.

PVT ranges:

The 3 PVT inputs that affect delay of circuits are very important in determining proper functioning of circuits. In digital circuits, they are used to check if all the paths in digital circuits meet timing. We run timing tools on our design to make sure our design meets timing. We check timing at various PVT corners. More details are in STA section.

We run timing at the extreme PVT corners that our design can possibly be exposed to. We also have a typical corner that the design is expected to see in a typical environment, but usually we don't run STA on this typical corner. Let's see the range of these PVT corners:

Process: For process we define a fast process corner and a slow process corner. Fast process corner is where all transistors are supposed to be running faster, while slow corner is one where all transistors are supposed to be running slower. However, how fast is the fast corner really? For that we use a metric called 3 sigma variation. We draw a plot of all transistors across various dies, with current on X axis and number of transistors on Y axis. This gives us a gaussian plot. From this plot, we take 3 sigma variation from the mean. The -3 sigma point gives us the slow corner, while the +3 sigma point gives us the fast corner. 99.7% of the transistors lie within the -3 sigma to +3 sigma range. So, we are willing to sacrifice the remaining 0.3% of the chips if they don't work in real silicon. Since we have both PMOS and NMOS, we define fast and slow for PMOS and NMOS separately. So, we have 4 combinations:

  1. fast fast (FF): This is the corner where both NMOS and PMOS are fast
  2. slow slow (SS): This is the corner where both NMOS and PMOS are slow
  3. fast slow (FS): This is the corner where NMOS is fast but PMOS is slow. This doesn't really happen in real silicon by itself, though it's sometimes done on purpose.
  4. slow fast (SF): This is the corner where NMOS is slow but PMOS is fast. This doesn't really happen in real silicon by itself, though it's sometimes done on purpose.
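
The 99.7% figure for the 3 sigma range can be reproduced from the Gaussian error function. The sketch below also places the slow and fast corners at -3 and +3 sigma. The mean and sigma of the drive current are hypothetical numbers picked only for illustration, and the function names are made up for this ex.

```python
import math


def fraction_within(n_sigma):
    """Fraction of a Gaussian population within +/- n_sigma of the mean."""
    return math.erf(n_sigma / math.sqrt(2.0))


def corner_currents(mean, sigma, n_sigma=3.0):
    """Drive current at the slow (-n_sigma) and fast (+n_sigma) corners."""
    return mean - n_sigma * sigma, mean + n_sigma * sigma


inside = fraction_within(3.0)

# Hypothetical distribution of transistor drive current, in uA.
slow, fast = corner_currents(mean=100.0, sigma=5.0)

print(f"within +/-3 sigma: {inside:.2%}, outside: {1.0 - inside:.2%}")
print(f"slow corner: {slow} uA, fast corner: {fast} uA")
```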

Voltage: When we run STA at a certain voltage, we always mean the voltage at the transistor pins. It's not the voltage at chip pins. For smaller chips or ones which don't draw a whole lot of current for the digital block, the difference in voltage b/w chip pins and transistor pins is not much and can be ignored. However, for digital SOCs which have billions of transistors and run at 1V or below, the voltage difference can be substantial. We usually run some sims to figure out the voltage at transistor pins. Once we know the voltage at transistor pins, we apply some margin for PMU voltage overshoot and undershoot. Chip pins are usually driven by a PMU, whose whole job is to keep the voltage fixed at the specified level. Even then we account for some voltage overshoot/undershoot. As a rule of thumb, we apply +/-10% voltage overshoot and undershoot for chips that have a small digital core with less than a million transistors, running at > 1V. This +/-10% also accounts for the IR drop that may occur on chip. This 10% rule of thumb is true only for small digital cores. For large digital SOCs, we run more detailed simulations.
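
As a quick sketch of the +/-10% rule of thumb, the helper below turns a nominal supply into the min and max voltages used for timing. The 1.8V nominal is a hypothetical value for a small digital core, and voltage_corners is a name made up for this ex.

```python
def voltage_corners(v_nominal, margin=0.10):
    """Min and max supply voltage at the transistor pins for a given margin.

    margin is the fractional overshoot/undershoot (0.10 for the +/-10% rule).
    """
    return v_nominal * (1.0 - margin), v_nominal * (1.0 + margin)


# Hypothetical small digital core running from a 1.8V supply.
v_min, v_max = voltage_corners(1.8)

print(f"max delay corner voltage: {v_min:.2f} V")
print(f"min delay corner voltage: {v_max:.2f} V")
```

The lower voltage goes with the max delay corner and the higher voltage with the min delay corner, since higher voltage means more current and less delay.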

Temperature: For temperature, we usually consider a range of -40C to +150C, depending on what kind of temperature extremes we think the chip may be exposed to. The ambient temperature (temperature of the environment) may not go to such extremes, but the temperature of the transistor itself may. -40C to +150C provides us enough buffer for such temperature extremes. -25C to +85C is another temperature range that's seen in smaller chips which aren't consuming too much power (i.e. embedded chips), so a smaller range suffices for those. Lower temperatures are limited to ambient temp, as the temp of the chip can't go below ambient temp (chips will usually generate heat). But for higher temperatures, we go much higher than ambient temps. That guarantees that nothing will break on the chip at higher temps. Of course, for people living in very cold climates, there's no guarantee that the chip will work :(

PVT Corners: We define 3 PVT corners.

1. typ: This is the TYP corner, where PVT is at its typical value. So, Process = TT which means NMOS and PMOS are at their typical process value (i.e. typical speed), Voltage = typical voltage that the design is supposed to run at, and Temperature = typical room temperature, which is taken as 27C. Here we take R, C at their typical values, even though we know that if NMOS/PMOS are at their typ values, R,C may not necessarily be at their typ values.

2. min: This is the MIN delay corner, where transistors are supposed to be at their minimum delay (i.e. fastest). So, Process = FF which means NMOS and PMOS are at their fast process value (i.e. fast speed), Voltage = maximum voltage that the design is supposed to be exposed to (maximum PMU voltage overshoot), and Temperature = lowest temperature, which is taken as -40C. However, for lower nm nodes (<180nm) operating at very low voltages (< 1V), temperature inversion may occur. Since the min corner is run at the highest voltage, it's possible that temperature inversion may not occur there, so the lowest temp may still be ok for getting min delay. However, the behaviour may be different for different Vth transistors, so some paths may have min delay at one temp, while others may have it at some other temp (depending on the High Vth and Low Vth mix of cells in the path). So, a set of temperatures should be used at the highest voltage to make sure that all possible extremes of min delay are captured. Here we take R, C at their min values (even though R,C may not necessarily be at their min).

3. max: This is the MAX delay corner, where transistors are supposed to be at their maximum delay (i.e. slowest). This is just the opposite of the MIN corner. So, Process = SS which means NMOS and PMOS are at their slow process value (i.e. slow speed), Voltage = minimum voltage that the design is supposed to be exposed to (maximum PMU voltage undershoot), and Temperature = highest temperature, which is traditionally taken as +150C. Again there is the temperature inversion and voltage dependency problem discussed above. Since we are at the lowest voltage, temp inversion is very likely to happen, so the lowest temp should be used here instead. So, with temp inversion, we end up using the lowest temp for both the min and max delay corners. Here we take R, C at their max values (even though R,C may not necessarily be at their max).

Temperature turned out to be not so straightforward at low nm tech. With further scaling to <14nm, the trend with temp inversion gets even more hotch-potch: depending on the voltage and the Vth of the transistors (high Vth or low Vth), transistors get faster or slower at lower temperatures. So, now there is no clear trend on what temperatures to use. Best is to run the max and min delay corners across a set of temperatures.
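
The corner definitions above, plus the advice to sweep temperature, can be written down as a small table generator. This is just a sketch: the Corner class, the build_corners function, the 1.0V typical supply, the 10% voltage margin and the choice of 3 temperatures are all assumptions for illustration, not something a real STA flow dictates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Corner:
    name: str       # label for the timing run
    process: str    # TT, FF or SS
    voltage: float  # volts at the transistor pins
    temp_c: int     # temperature in degrees C
    rc: str         # typ, min or max wire R,C


def build_corners(v_typ, margin=0.10, temps=(-40, 27, 150)):
    """Return the typ corner plus min and max corners at every temperature."""
    v_hi = round(v_typ * (1.0 + margin), 3)
    v_lo = round(v_typ * (1.0 - margin), 3)
    corners = [Corner("typ", "TT", v_typ, 27, "typ")]
    for t in temps:
        corners.append(Corner(f"min_{t}C", "FF", v_hi, t, "min"))  # fastest
        corners.append(Corner(f"max_{t}C", "SS", v_lo, t, "max"))  # slowest
    return corners


corners = build_corners(v_typ=1.0)  # hypothetical 1.0V design
for c in corners:
    print(c)
```

Sweeping temperature multiplies the number of timing runs, which is the price paid for not knowing up front which temp gives the worst delay.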

 

Global variation vs local variation:

When we talked about PVT corners above, we assumed that the same PVT corner applies to all transistors on a single die. For a different die, a different PVT corner would apply. The assumption is that across multiple wafers and multiple dies on each wafer, all dies would be bounded by the max and min PVT corners. So, when we run STA at the max and min corners, we have kind of guaranteed that timing will be met for all these dies, no matter what the process, voltage or temperature is. So, if Process is fast, Voltage is low, and Temperature is high, this particular PVT point is bounded by our max and min PVT corners, and so will pass timing as long as max and min timing are passing.

However, a question that immediately comes to mind is: what about the PVT variations across multiple transistors within a die? For ex, on a given die, not all transistors will be fast-fast at the same speed. They will have local variations, and some transistors will be slower than that "fast" corner, while some might be even faster. Similarly for voltage, not all transistors on the same die will see exactly the same voltage. Some transistors may see a little higher voltage while some others might see a little lower voltage depending on IR drop. The same goes for temperature. Since the temperature of a transistor is heavily affected by its surroundings, it's possible that some transistors which are ON most of the time and running at high frequency may see a higher temperature than other transistors which are OFF most of the time. This will affect the delay of transistors differently, and depending on the path, the timing will need to be recalculated with these more precise values of PVT. This is called on chip variation (OCV). This will be discussed in the "OCV section".

What if we don't want to deal with OCV, since we have no clue how to measure these PVT variations within a die? In that case we could use the "max corner" for one path and the "min corner" for the other path on the same die. This guarantees that our chip will meet timing no matter what. However, this is way more pessimistic than what real silicon would see. So, we end up unnecessarily putting a lot of margin in the design, which wastes area and power. We'll study all of this in the "OCV section", which is the next one.
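
To get a feel for how much margin this costs, here is a small sketch of a setup check. It assumes the usual setup relation (slack = clock period + capture clock delay - launch clock delay - data delay - setup time), which belongs to the STA section rather than this one, and all delays are hypothetical numbers in ns.

```python
def setup_slack(period, launch_clk, data, capture_clk, setup):
    """Setup slack in ns; negative slack means the path fails timing."""
    return (period + capture_clk) - (launch_clk + data + setup)


# Hypothetical delays (ns) for the same clock tree and data path at 2 corners.
CLK_MAX, CLK_MIN = 1.0, 0.6   # clock tree delay at max and min corner
DATA_MAX = 3.2                # data path delay at max corner
PERIOD, SETUP = 4.0, 0.1

# Whole die at the max corner: launch and capture clocks see the same delay.
same_corner = setup_slack(PERIOD, CLK_MAX, DATA_MAX, CLK_MAX, SETUP)

# Pessimistic: launch clock and data at max corner, capture clock at min corner.
mixed_corner = setup_slack(PERIOD, CLK_MAX, DATA_MAX, CLK_MIN, SETUP)

print(f"slack, single corner: {same_corner:.2f} ns")
print(f"slack, max vs min:    {mixed_corner:.2f} ns")
```

The gap between the two slacks is the full max-to-min spread of the clock tree, which is far more than transistors on the same die would actually differ by.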