Many emerging and future communication applications require a significant amount of high throughput data processing and operate with decreasing power budgets. This need for greater energy efficiency and improved performance of electronic devices demands a joint optimization of algorithms, architectures, and implementations.
Low Density Parity Check (LDPC) decoding has received significant attention due to its superior error correction performance, and has been adopted by recent communication standards such as 10GBASE-T 10 Gigabit Ethernet. Currently high performance LDPC decoders are designed to be dedicated blocks within a System-on-Chip (SoC) and require many processing nodes. These nodes require a large set of interconnect circuitry whose delay and power are wire-dominated circuits. Therefore, low clock rates and increased area are a common result of the codes' inherent irregular and global communication patterns. As the delay and energy costs caused by wires are likely to increase in future fabrication technologies new solutions dealing with future VLSI challenges must be considered.
Three novel message-passing decoding algorithms, Split-Row, Multi-Split and Split-Row Threshold are introduced, which significantly reduce processor logical complexity and local and global interconnections. One conventional and four Split-Row Threshold LDPC decoders compatible with the 10GBASE-T standard are implemented in 65 nm CMOS and presented along with their trade-offs in error correction performance, wire interconnect complexity, decoder area, power dissipation, and speed. For additional power saving, an adaptive wordwidth decoding algorithm is proposed which switches between a 6-bit Normal Mode and a reduced 3-bit Low Power Mode depending on the SNR and decoding iteration.
A 16-way Split-Row Threshold with adaptive wordwidth implementation achieves improvements in area, throughput and energy efficiency of 3.9x, 2.6x, and 3.6x respectively, compared to a MinSum Normalized implementation, with an SNR loss of 0.25 dB at BER = 10^-7. The decoder occupies a die area of 5.10~mm^2, operates up to 185 MHz at 1.3 V, and attains an average throughput of 85.7 Gbps with early-termination. Low power operation at 0.6 V gives a worst case throughput of 9.3 Gbps--above the 6.4 Gbps 10GBASE-T requirement, and an average power of 31 mW.
Tinoosh Mohsenin, "Algorithms and Architectures for Efficient Low Density Parity Check (LDPC) Decoder Hardware" Technical Report ECE-CE-2010-4, Computer Engineering Research Laboratory, ECE Department, University of California, Davis, 2010.
@phdthesis{Tinoosh:phdthesis, author = {Tinoosh Mohsenin}, title = {Algorithms and Architectures for Efficient Low Density Parity Check (LDPC) Decoder Hardware}, school = {University of California}, year = 2010, address = {Davis, CA, USA}, month = Nov, note = {\url{http://www.ece.ucdavis.edu/vcl/pubs/theses/2010-4}} }