
Advanced Signal Processing Resources in FPGAs


Digital signal processing (DSP) is an area of continuous, significant advances in both software approaches and hardware platforms. Some of the most common functions in this domain are digital filters, encoders, decoders, correlators, and mathematical transforms such as the fast Fourier transform (FFT). Most DSP functions and algorithms are quite complex and involve a large number of variables, coefficients, and stages. The key basic operation is usually the multiply-accumulate (MAC) operation, provided by MAC units. Since high operating frequency and/or throughput are usually required, it is often necessary to use digital signal processors (DSPs), whose hardware and instruction set are optimized for the execution of MAC operations or other features such as bit-reverse addressing (as discussed in Section 1.3.4.2). CPUs in DSPs are also designed to execute instructions in fewer clock cycles than general-purpose processors. For many years, DSPs were the only platforms capable of efficiently implementing DSP algorithms. However, in recent years, FPGAs have emerged as serious contenders in this area because of their intrinsic parallelism, their ability to implement arithmetic operations very efficiently, and the huge amount of logic resources available.

Since the advent of the first FPGAs in the 1980s, one of the main goals of vendors has been to ensure their devices are capable of efficiently implementing binary arithmetic operations (mainly addition, subtraction, and multiplication). This requires not only specific logic resources but also specialized interconnection resources, for example, for propagating carry signals or for chaining logic blocks (LBs), so that propagation delays are minimized and the number of bits of the operands can be parameterized.
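The role of carry propagation can be illustrated with a small bit-level software model of a ripple-carry adder. This is only a sketch for intuition, not a description of any vendor's carry logic; the function name `ripple_carry_add` and its `width` parameter are hypothetical. Each bit position has to wait for the carry from the previous one, which is why a fast, dedicated carry path matters as the operand width grows.

```python
def ripple_carry_add(a, b, width):
    """Add two unsigned integers bit by bit, as a chain of full adders would.

    Returns (sum modulo 2**width, carry out of the most significant bit).
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    carry = 0
    result = 0
    for i in range(width):
        bit_a = (a >> i) & 1
        bit_b = (b >> i) & 1
        # Full adder: the sum bit depends on the carry from the previous stage.
        sum_bit = bit_a ^ bit_b ^ carry
        carry = (bit_a & bit_b) | (carry & (bit_a ^ bit_b))
        result |= sum_bit << i
    return result, carry


if __name__ == "__main__":
    print(ripple_carry_add(5, 3, 4))    # 4-bit operands
    print(ripple_carry_add(15, 1, 4))   # overflow: the carry out is set
```

The operand width is just a parameter of the model, mirroring the parameterizable operand size mentioned above.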

As FPGAs became increasingly popular, new application niches appeared that required new specialized hardware resources. The availability of embedded memory blocks was particularly useful for the implementation of data acquisition and control circuits, avoiding (or at least mitigating) the need for external memories and reducing memory access times. Subsequently, many other specialized hardware blocks were progressively included in each new family of devices, as described in detail in Chapter 2.

ALUs in conventional DSPs usually include from one to four MAC units operating in parallel. Their rigid architectures do not allow, for instance, the number of bits of the operands in a multiplication to be parameterized. Therefore, parallelism and bandwidth are inherently limited in these platforms, and increasing the operating frequency is, in most cases, the only way of improving performance.

Let us consider as an example the implementation of an N-stage finite impulse response (FIR) filter in a DSP with four MAC units. From Figure 4.1a, it can be concluded that the algorithm has to be executed N/4 times (rounded up when N is not a multiple of four) for valid output data to be produced.
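The N/4 figure can be checked with a minimal software model of the four-MAC case. The sketch below is an illustration under stated assumptions, not the structure of Figure 4.1a: the function name `fir_output_with_macs`, the one-accumulator-per-unit arrangement, and the final summation of the accumulators are all hypothetical choices. Each pass of the outer loop stands for one execution of the algorithm, in which every MAC unit handles one tap.

```python
def fir_output_with_macs(window, coeffs, num_macs=4):
    """Compute one FIR output sample using num_macs MAC units per pass.

    window[k] holds x[n-k] and coeffs[k] holds h[k].
    Returns (output sample, number of passes needed).
    """
    n_taps = len(coeffs)
    acc = [0] * num_macs  # one accumulator per MAC unit (assumed arrangement)
    passes = 0
    for start in range(0, n_taps, num_macs):
        # Within one pass, the MAC units process consecutive taps in parallel.
        for unit in range(num_macs):
            k = start + unit
            if k < n_taps:
                acc[unit] += coeffs[k] * window[k]
        passes += 1
    # The partial sums are combined once all taps have been processed.
    return sum(acc), passes


if __name__ == "__main__":
    h = [1, 2, 3, 4, 5, 6, 7, 8]
    x = [1] * 8
    print(fir_output_with_macs(h_window := x, h))  # 8 taps, 4 MACs: 2 passes
```

With 10 taps the same model needs three passes, the last one using only two of the four units.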

How can the same problem be solved using FPGAs? Thanks to the availability of abundant logic resources and the possibility of configuring them to operate in parallel, several approaches are feasible, from a fully serial architecture (requiring N clock cycles to generate new output data) to a fully parallel one, like the one shown in Figure 4.1b, capable of generating new output data every clock cycle, or intermediate serial-parallel solutions. This gives the designer the flexibility to define different performance–complexity trade-offs by choosing a particular degree of parallelism. In addition, by using design techniques such as pipelining or retiming, extremely high-performance signal processing systems can be obtained.
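A fully parallel filter that delivers one output per clock cycle can be modeled in software as follows. This sketch uses the transposed form of the FIR filter, which is one common way to obtain a pipelined, fully parallel structure; it is an assumption made for illustration and not necessarily the structure drawn in Figure 4.1b. The names `fir_fully_parallel` and `fir_reference` are hypothetical. Each loop iteration stands for one clock cycle in which all N multiplications take place at once.

```python
def fir_fully_parallel(samples, coeffs):
    """Transposed-form FIR model: one new output per simulated clock cycle."""
    n_taps = len(coeffs)
    regs = [0] * (n_taps - 1)  # pipeline registers between the adders
    outputs = []
    for x in samples:  # one iteration models one clock cycle
        # All N multipliers operate on the same input sample in parallel.
        products = [h * x for h in coeffs]
        outputs.append(products[0] + (regs[0] if regs else 0))
        # Each register stores its own product plus the next register's value.
        regs = [
            products[i + 1] + (regs[i + 1] if i + 1 < len(regs) else 0)
            for i in range(len(regs))
        ]
    return outputs


def fir_reference(samples, coeffs):
    """Direct evaluation of y[n] = sum over k of h[k] * x[n-k], for checking."""
    return [
        sum(coeffs[k] * samples[n - k] for k in range(len(coeffs)) if n - k >= 0)
        for n in range(len(samples))
    ]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    h = [1, 1, 1]
    print(fir_fully_parallel(x, h))
    print(fir_reference(x, h))
```

The cost of this throughput is one multiplier and one adder per tap, which is the complexity side of the trade-off described above.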


FIGURE 4.1 (a) FIR filter implemented with four MAC units and (b) fully parallel FIR filter.

Other advantages of the FPGA approach are the possibility of parameterizing the size (number of bits per operand) of the arithmetic operators and the availability of different hardware structures to implement the MAC units.
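One practical consequence of parameterizable operand sizes is that the accumulator width can be chosen to match the operands and the number of taps. The sketch below assumes signed two's-complement operands and computes a width that is always sufficient: the sum of the two operand widths for the product, plus ceil(log2(N)) guard bits for accumulating N products. The function names are hypothetical, and this is a conservative sizing rule rather than the tightest possible bound.

```python
def mac_accumulator_bits(data_bits, coeff_bits, n_taps):
    """Accumulator width that cannot overflow when summing n_taps products
    of signed data_bits x coeff_bits operands (two's complement assumed)."""
    product_bits = data_bits + coeff_bits
    # ceil(log2(n_taps)) computed with integers only; 0 for a single tap.
    guard_bits = (n_taps - 1).bit_length()
    return product_bits + guard_bits


def worst_case_fits(data_bits, coeff_bits, n_taps):
    """Check that the largest possible sum fits in the computed width."""
    bits = mac_accumulator_bits(data_bits, coeff_bits, n_taps)
    # Largest magnitude: every product equals (-2**(a-1)) * (-2**(b-1)).
    worst = n_taps * (1 << (data_bits - 1)) * (1 << (coeff_bits - 1))
    return worst <= (1 << (bits - 1)) - 1


if __name__ == "__main__":
    print(mac_accumulator_bits(16, 16, 64))  # 32 product bits + 6 guard bits
    print(worst_case_fits(16, 16, 64))
```

In a fixed-architecture processor these widths are set by the hardware; in an FPGA design they can be sized per filter.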

The basic FPGA implementation of MAC units consists of building adders and multipliers using distributed logic and combining them with embedded memory blocks, which act as accumulators and store the coefficients. However, in many cases, this solution requires many LBs, resulting not only in high resource consumption but also in long propagation delays, which limit the operating frequency. Because of these issues, current FPGAs include specialized hardware blocks oriented to DSP applications, which are analyzed in the following sections. The simplest among these are hardware multipliers, but more complex ones (often referred to as DSP blocks) are also available.

