Published by edn.com
The push for higher performance at lower power and cost has driven the VLSI industry towards System-on-Chip (SoC) integration resulting in designs with multiple clocks. It is common to see blocks that share the same clock source having synchronous interactions when the frequency relation is an integral multiple of two. Often, these could be on the critical paths in the design from a timing perspective. In such situations, you must test these interactions for transition-type faults to achieve test coverage and DPPM (defective parts per million) targets.
To put it into perspective, a path that has its launch flip-flop in one clock domain and its capture flip-flop in another synchronous clock domain is called a Synchronous Cross-Clock Domain (SCCD) path. In Figure 1, the combinational cloud between FF1 and FF2 is an intra-clock domain path, while the cloud between FF1 and FF3 is an inter-clock domain path.
Figure 1. Intra- and inter-clock domain paths can introduce faults in the form of delays.
Clock Filtering Circuits (CFCs), used for transition fault testing, filter out the required clock pulses from the clock source. Typical CFCs have limitations and can’t be used for testing transition faults across these synchronous clock domains. What problems arise when you try? We will explain those limitations and propose enhancements to the CFC that make testing of SCCD paths, such as the inter-clock domain path shown in Fig. 1, feasible.
Figure 2 represents a typical clock filtering circuit, which has three primary components: a two-stage synchronization cell, an n-stage programmable shift register, and a clock-gating cell (ICG).
Figure 2. A typical Clock Filtering Circuit may not be sufficient for testing transition faults across synchronous clock domains.
At-speed fault testing involves two steps: shift mode and capture mode. In shift mode, registers are initialized to a known value by shifting through the scan chain while the scan enable (SE) signal is high. In capture mode, the response of the functional path is captured in the registers while SE is low.
In capture mode, the CFC is used to generate the required clock pulses for launch and capture cycles of at-speed testing.
When SE is strobed, it arrives at the CFC after some delay. In the CFC, it is then synchronized by the two-stage synchronization cell of the receiving clock domain. The synchronized SE signal triggers the n-stage programmable shift register, which provides the enable signal that lets the ICG filter out the required clock pulses. As a result, clock pulses emerge from the CFC only some time after SE arrives at the CFC. That delay is mainly due to the synchronization cell.
Figure 3. Typical CFC output waveform for at-speed testing clock.
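The SE-to-pulse delay described above can be illustrated with a minimal Python sketch. It is an idealized model, not the actual circuit: it assumes integer time units, clock rising edges at multiples of `period`, a synchronizer that settles exactly `sync_stages` rising edges after SE changes, and one shift register bit consumed per clock cycle after that. The function name and parameters are hypothetical.

```python
def cfc_pulse_times(se_time, period, pattern, sync_stages=2):
    """Rising-edge times of the clock pulses passed by an idealized CFC.

    Assumptions (not from the article): integer time units, clock edges at
    0, period, 2*period, ..., and bit i of `pattern` gating the clock edge
    that occurs (i + 1) cycles after the synchronized SE is available.
    """
    # First rising clock edge strictly after SE changes.
    first_edge = (se_time // period + 1) * period
    # The synchronized SE is available after `sync_stages` rising edges.
    sync_done = first_edge + (sync_stages - 1) * period
    # Each set bit in the programmable shift register lets one pulse through.
    return [sync_done + (i + 1) * period
            for i, bit in enumerate(pattern) if bit]


# SE changes at t=3, clock period 10, shift register programmed as 0110.
pulses = cfc_pulse_times(se_time=3, period=10, pattern=[0, 1, 1, 0])
print(pulses)
```

In this model the two consecutive set bits produce the back-to-back launch and capture pulses used for intra-clock domain testing, and both are offset from SE by the synchronizer delay.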
For testing faults within a clock domain (intra-clock domain faults), this CFC works fine, as shown in Figure 3. To test SCCD transition faults, however, you need to generate the launch and capture pulses as shown in Figure 4. To achieve such waveforms, we generally use two independent CFCs. Each clock domain requires its own CFC because the pulse width is different for each clock and the pulses therefore need to be generated by a different CFC. Fig. 4 gives examples of the different launch and capture conditions that you can achieve.
Figure 4. Typical launch and capture pulse combinations for testing inter-clock domain faults include both fast launch and slow capture, and slow launch and fast capture pulses.
When employed to test faults in synchronous inter-clock domains (Fig. 1), the same CFC encounters the following challenges:
Edge Misalignment: When testing faults between two SCCDs, each clock domain has its own CFC. Because each CFC has its own inherent synchronization delay, the outputs of the two CFCs fall out of cycle alignment. For example, assume two synchronous clocks of frequency F and F/2, each with a CFC whose programmable shift register has length 4. The two shift registers are triggered at different times, which results in dissimilar delays on the CFC outputs. With a two-stage synchronizer, clock domain F/2 takes twice as long as clock domain F. Figure 5 shows the clock output waveforms of both CFCs. It’s important to note that there are two types of misalignment. One is due to the synchronization delay itself, which is shown in Fig. 5.
Figure 5. Clock waveform for generic CFC output, showing misaligned edges.
The other reason for misalignment is clock skew. The skew on each clock causes additional misalignment between the clock outputs of the two CFCs. As shown in Figure 6, the clock output CFC_OUT_F is skewed with respect to CFC_OUT_F/2. This shrinks the functional timing window available to capture the launched signal, compromising test quality and validity.
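The synchronization-delay component of the misalignment can be reproduced with a small Python sketch. It is only an illustration under stated assumptions: integer time units, rising edges of both clocks at multiples of their periods, SE arriving at both CFCs at the same instant, and no clock skew. The helper name `sync_done_time` is hypothetical.

```python
def sync_done_time(se_time, period, stages=2):
    """Time at which an idealized `stages`-deep synchronizer outputs SE.

    Assumes rising clock edges at 0, period, 2*period, ... and that SE is
    sampled on the first rising edge strictly after it changes.
    """
    first_edge = (se_time // period + 1) * period
    return first_edge + (stages - 1) * period


PERIOD_F = 10        # clock F (arbitrary time units)
PERIOD_F2 = 20       # clock F/2, twice the period

done_f = sync_done_time(0, PERIOD_F)     # trigger time of the F-domain shift register
done_f2 = sync_done_time(0, PERIOD_F2)   # trigger time of the F/2-domain shift register
misalignment = done_f2 - done_f          # offset between the two triggers

print(done_f, done_f2, misalignment)
```

Under these assumptions the F/2 domain needs twice as long as the F domain to synchronize SE, so the two programmable shift registers start at different times and their output pulses are offset before skew is even considered.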
Missing Clock Pulses: Fig. 4 shows a subset of the launch and capture pulses required for at-speed testing of inter-clock domain faults. Referring to Fig. 5, it can be deduced that a CFC with a programmable shift register of length 4 can’t hit all combinations of launch and capture pulses. For example, the first combination in Fig. 4 can be created by using the two CFCs, but the second cannot. This problem can be addressed by increasing the shift register length.
As shown in Fig. 4, two launch and capture pulse combinations are needed to successfully test the SCCD paths. The pulses should be separated by the clock-timing window. Due to clock skew and synchronization cell delays, however, this is difficult to achieve.
Multi-Cycle Path (MCP) Testing: Because of the missing clock pulses, testing is limited by the length of the shift register. The shift register won’t be long enough to cover all combinations of launch and capture, and the problem gets worse for MCPs because you must wait for a cycle of one clock before pulsing the other. A programmable shift register should be wide enough to create the required MCP launch and capture pulses and test the path. Figure 7 shows two MCP scenarios. Again, only the first condition is possible with this CFC and not the second, because it relies on the pulses a generic CFC can generate.
Figure 7. Multi-cycle paths need a shift register long enough to produce an effective launch.
Challenges in Physical Implementation and STA: The difference between delays D1 and D2, shown in Figure 8, needs to be as small as possible. A small difference ensures that the clock skew between the two clocks in the test condition is similar to the clock skew seen in the functional condition. Designers should provide additional physical and timing constraints to achieve a small difference. For example, you can use SDC constraints to enable clock balancing between the D1 and D2 paths.
Figure 8. Minimize the difference between D1 and D2 to reduce clock skew.
Pattern Generation: Even for the conditions that can be tested, designers need to find the correct sequence of bits to constrain in the programmable shift register. Based on these values, CFCs generate the correct sequence of launch and capture pulses for single-cycle paths and multi-cycle paths.
A clock-filtering circuit used to test inter-clock domain transition faults should ensure the following:
Figure 9 illustrates a modified CFC that we use at Open-Silicon.
Figure 9. CFC modifications include programmable shift registers and a separate synchronization cell.
Common Synchronization Cell: The synchronization cell is brought out of the CFCs and made common to all SCCD CFCs. Doing so assures that triggers to the programmable shift registers arrive at the same time. The separate cell removes the misalignment of edges caused by differences in synchronization cell delays.
As Figure 10 shows, to achieve output clock pulses with aligned rising edges, you should generate the synchronized SE at the previous edge of the SCCD’s fastest clock. You can achieve that by using a combination of multiple SCCD clocks in the synchronization cell. To attain this alignment with minimum logic, the slowest and fastest clocks of the SCCD are used in the synchronization cell.
Figure 10. A valid SE synchronization edge produces proper alignment of all SCCD clocks.
Shift Register Clocking: The programmable shift registers of SCCD CFCs are modified to operate at the frequency of that SCCD’s fastest clock. This creates a timing path between the fastest clock (f) of that SCCD and the operating clock (f/2), as highlighted in Figure 11. The modification simplifies the physical implementation and Static Timing Analysis (STA) effort by achieving a tighter clock skew during the Clock Tree Synthesis (CTS) stage. It also removes the need for additional constraints for physical implementation and STA.
Figure 11. The SCCD timing path contains a synchronization cell and programmable shift registers.
Shift Register Width: The width of the shift register needs to be increased because it’s operating at a higher frequency than the CFC operating frequency. For example, to produce four pulses of the f/2 clock, the shift register should be of length 8 because it’s working at frequency f. The shift register’s width depends on a ratio of the frequency of the fastest clock to that of the slowest clock in a given SCCD. The equation below represents the parameters on which shift register width varies:
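The equation itself is not reproduced here. One reading that is consistent with the example above (four pulses of the f/2 clock need a length of 8) is width = number of pulses × (fastest frequency / slowest frequency). The Python sketch below encodes that reading as an assumption, not as the original equation; the function and parameter names are hypothetical.

```python
def shift_register_width(pulses, fastest_hz, slowest_hz):
    """Shift register bits needed when the register runs on the fastest clock.

    Assumed relation (inferred from the article's worked example, not taken
    from its equation): width = pulses * (fastest_hz / slowest_hz), with the
    clocks related by an integer ratio.
    """
    ratio, remainder = divmod(fastest_hz, slowest_hz)
    if remainder != 0:
        raise ValueError("clocks must be related by an integer ratio")
    return pulses * ratio


# Four pulses of an f/2 clock with the register clocked at f = 400 MHz.
width_f2 = shift_register_width(4, 400_000_000, 200_000_000)
# The same four pulses of an f/4 clock need twice as many bits.
width_f4 = shift_register_width(4, 400_000_000, 100_000_000)
print(width_f2, width_f4)
```

Sizing every CFC in the SCCD for the slowest clock, as this sketch does, gives a width that is also sufficient for the faster domains.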
Figure 12 portrays the output clock pulses of the SCCD CFCs for three clocks related as F, F/2, and F/4. Using the programmable shift register, you can configure each CFC to generate the required launch and capture clock pulses.
Figure 12. The output of the CFCs in a synchronous clock domain shows all clock edges aligned without any delay.
In Fig. 5, we showed that with the typical CFC, you can’t generate all the combinations of launch and capture clock pulses to test faults in the inter-clock domain. With the enhanced CFC, you can generate all the possible combinations of launch and capture pulses, including those required for MCP testing.
Pattern Generation: Pattern generation for the enhanced CFC involves programming the shift register’s bits to the value that corresponds to the desired launch and capture pulses. With enhanced CFCs, the programmable shift register runs on the fastest clock, so generating one pulse of a slower operating clock requires constraining more than one bit of the shift register.
Assume an operating clock of frequency f/4 and a programmable shift register clock of frequency f. Four bits of the shift register need to be programmed to generate one pulse of operating clock frequency f/4 (Figure 13). Furthermore, the Named Capture Procedure (NCP) Automatic Test Pattern Generation (ATPG) methodology is used to run the fault simulation.
Figure 13. Programmable shift register bit values of two CFCs generate one launch and capture combination.
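The bit programming described above can be sketched in Python. The sketch assumes that the gating enable must stay asserted for a full period of the operating clock, so each desired pulse occupies `ratio` consecutive shift register bits, where `ratio` is the fastest-clock frequency divided by the operating-clock frequency. That assumption, and the helper name `expand_pattern`, are illustrative and not taken from Figure 13.

```python
def expand_pattern(operating_bits, ratio):
    """Expand a per-operating-clock-cycle pattern into shift register bits.

    `operating_bits` has one entry per cycle of the operating clock
    (1 = let the pulse through). Because the shift register is assumed to
    run `ratio` times faster, every entry is repeated `ratio` times.
    """
    return [bit for bit in operating_bits for _ in range(ratio)]


# One pulse of an f/4 operating clock, register clocked at f: four bits set.
f4_bits = expand_pattern([0, 1, 0], ratio=4)
# The same pulse request for a CFC whose operating clock is f itself.
f_bits = expand_pattern([0, 1, 0], ratio=1)
print(f4_bits)
print(f_bits)
```

An ATPG flow would then constrain the shift register cells to values such as these for each launch and capture combination it needs to exercise.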
At Open-Silicon, such enhanced CFCs have been implemented in designs and achieved an at-speed coverage improvement of 1.07%. Though the number itself may look small, the additional faults lie on critical paths that were essential to test.
Testing SCCD faults in complex SoCs containing multiple clocks is never easy, but it’s important for yield enhancement. To overcome these issues, a novel method of modifying the existing CFC to achieve higher coverage has been proposed. With this approach, a robust CFC has been realized with minimal changes to the original CFC and implemented without affecting the design schedule or adding complexity. Testing such synchronous inter-clock domain faults lets the designer achieve higher at-speed test coverage and meet low DPPM targets. It also helps in screening Integrated Circuit (IC) parts, with the additional benefit of enabling speed binning.
This test methodology gives system engineers higher confidence for a successful product, especially in cases where the design contains many such SCCD paths.