Dual-core FFT Processor Based on FPGA

Can Zhao\textsuperscript{a}, Jun Yang\textsuperscript{b*}

School of Information Science and Engineering, Yunnan University, Kunming, China
\textsuperscript{a}781831625@qq.com, \textsuperscript{b}junyang@ynu.edu.cn

Keywords: processor; FPGA; dual core; FFT

Abstract: With the steady increase in processor speed, FFT processing has also become faster than in the past. However, FPGA-based FFT processor designs are still dominated by single-core architectures\textsuperscript{[1]}. This design addresses how a single-core FFT processor can be parallelized on an FPGA in the multi-core era. The system is described in Verilog HDL and is placed, routed, and tested on the Quartus II 8.0 development platform, resulting in a dual-core FFT processor implemented on an FPGA.

1. Introduction

With the rapid growth of computer technology, digital signal processing has spread deeply into many other disciplines. Many digital signal processing algorithms, such as filtering, correlation, convolution, and spectral estimation, can be implemented by conversion to the discrete Fourier transform (DFT)\textsuperscript{[2]}. The fast Fourier transform (FFT) can be implemented in two main ways: in software or in hardware. A software implementation is convenient to call as a mathematical function, but it lacks high-speed, real-time characteristics.

With the rapid growth of FPGA technology and parallel computing, it has become necessary to realize the FFT algorithm with parallel features, using an implementation with better flexibility and faster processing speed\textsuperscript{[3]}. Although FFT processing speed has improved as processor speeds continue to rise, FPGA-based FFT processor designs are still dominated by single-core architectures. Studying how to parallelize the single-core FFT processor on an FPGA in the multi-core era therefore has both theoretical and practical significance.

2. Introduction to FFT algorithms

In digital signal processing, the Fourier transform decomposes a time-domain signal into a sum of sine or cosine signals of different frequencies\textsuperscript{[4]}. The Fourier transform can be compared to a glass prism, since their effects are similar. A prism splits light into colors of different frequencies, which makes the light easier to observe\textsuperscript{[5]}; likewise, the Fourier transform splits an input time-domain signal or function into its frequency components, so that researchers can analyze a function or time-domain signal in terms of frequency.

According to the continuity and periodicity of the signal, Fourier transforms can be divided into four categories, as shown in Table 1. For three of the types in Table 1 (the Fourier transform, the Fourier series, and the discrete-time Fourier transform), the signal is continuous in at least one of the time domain and the frequency domain, so they are not suitable for digital signal processing. Only the discrete Fourier transform (DFT) meets the requirement that the signal be discrete in both the time domain and the frequency domain, which is why the DFT is the main object of study in this field. The following is a discussion of the DFT and its fast algorithm, the FFT.

For the DFT to be meaningful, the sequence must be periodic or of finite length. Assume a finite sequence \( x(n) \) of length \( M \); applying the Fourier series formula then yields the Fourier transform of the finite-length sequence.
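
In its standard textbook form, the DFT of a length-\( M \) sequence \( x(n) \) is \( X(k) = \sum_{n=0}^{M-1} x(n)\, e^{-j 2\pi nk/M} \) for \( k = 0, 1, \dots, M-1 \). The following minimal sketch is a software illustration only and not the hardware design of this paper. It evaluates that definition directly and compares it with a recursive radix-2 decimation-in-time FFT. The function names `dft` and `fft_radix2` and the test vector are illustrative assumptions, and \( M \) is assumed to be a power of two.

```python
import cmath

def dft(x):
    """Direct evaluation of the DFT definition, O(M^2) operations."""
    M = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / M) for n in range(M))
            for k in range(M)]

def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    M = len(x)
    if M == 1:
        return list(x)
    even = fft_radix2(x[0::2])   # FFT of the even-indexed samples
    odd = fft_radix2(x[1::2])    # FFT of the odd-indexed samples
    out = [0j] * M
    for k in range(M // 2):
        w = cmath.exp(-2j * cmath.pi * k / M)  # rotation (twiddle) factor W_M^k
        out[k] = even[k] + w * odd[k]          # butterfly, upper output
        out[k + M // 2] = even[k] - w * odd[k] # butterfly, lower output
    return out

# Illustrative 8-point input; both methods should agree to rounding error.
x = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, -2.0, 1.0]
max_err = max(abs(a - b) for a, b in zip(dft(x), fft_radix2(x)))
print(max_err < 1e-9)
```

The direct form needs on the order of \( M^2 \) complex multiplications, while the radix-2 form needs on the order of \( M \log_2 M \), which is the reason hardware designs build on the FFT rather than on the definition.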

<table>
<thead>
<tr>
<th>Signal Type</th>
<th>Transform Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>Aperiodic continuous signal</td>
<td>Fourier Transform</td>
</tr>
<tr>
<td>Periodic continuous signal</td>
<td>Fourier Series</td>
</tr>
<tr>
<td>Aperiodic discrete signal</td>
<td>Discrete Time Fourier Transform</td>
</tr>
<tr>
<td>Periodic discrete signal</td>
<td>Discrete Fourier Transform</td>
</tr>
</tbody>
</table>

3. Design of Dual-Core FFT Based on FPGA

The dual-core FFT processor design must solve the problem that a single-core design runs slowly and limits the performance of the entire system; the key to this design is obtaining good accuracy and speed with fewer resources\textsuperscript{[6]}. The performance of an FFT implemented in software is generally measured by its computational complexity; for a high-performance dual-core FFT processor designed in hardware, structural complexity and hardware resource usage are also indispensable considerations. The design of the dual-core FFT processor needs to consider the following aspects:

1). The processor structure

A pipeline structure improves operating speed: in the FFT algorithm each stage has its own storage space, so the output of one stage can be passed directly to the next stage without waiting for the whole computation to complete. To raise FFT speed while saving resources, a pipeline structure is generally adopted; the dual-core FFT processor in this design combines a pipelined structure with a fully parallel structure.

2). The effective organization of memory

Processing with multiple butterfly units in parallel is a good way to increase the speed of the dual-core FFT, but the parallelism of data access must be increased at the same time. As the degree of parallelism rises, data access speed becomes the system bottleneck and, if it is not addressed, seriously degrades system performance. It is therefore necessary to organize the memory efficiently, increase the speed at which data are written to and read from memory, and arrange the operands in memory so that accesses do not conflict.
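
One well-known way to obtain conflict-free operand access, shown here only as an illustration and not necessarily the memory organization used in this design, is to split the data memory into two banks and select the bank by the parity of the address bits. In an in-place radix-2 FFT the two operands of a butterfly differ in exactly one address bit, so they always fall into different banks and can be read in the same cycle. The helper names `bank` and `conflict_free` are hypothetical.

```python
def bank(addr):
    """Bank select: parity (XOR of all bits) of the address."""
    return bin(addr).count("1") & 1

def conflict_free(n_points):
    """Check that every butterfly of an in-place radix-2 FFT hits two different banks."""
    stages = n_points.bit_length() - 1      # log2(n_points) for a power of two
    for s in range(stages):
        span = 1 << s                       # operand distance in stage s
        for a in range(n_points):
            if a & span == 0:               # a is the first operand of a butterfly
                b = a | span                # partner differs from a in bit s only
                if bank(a) == bank(b):
                    return False
    return True

print(conflict_free(1024))  # prints True
```

Because flipping a single bit always flips the parity, the check passes for every stage and every power-of-two length.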

3). Generation and implementation of the rotation factor

The Coordinate Rotation Digital Computer (CORDIC) method and the look-up table method are the two most common ways to generate rotation factors in real-time processing. CORDIC is an iterative algorithm that converts the multiplications of a vector rotation into shift-and-add iterations. The look-up table method calculates the rotation factors in advance, stores them in memory, and reads them out of memory during the complex multiplication.
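
The two approaches can be compared with a small floating-point model. This is a sketch under stated assumptions, not the generator used in this design: the FFT length `N = 1024` follows the test case in Section 4, while the iteration count `ITER = 16` and all function names are assumptions. In hardware the multiplications by \( 2^{-i} \) would be arithmetic right shifts on fixed-point values.

```python
import cmath
import math

N = 1024   # FFT length (1024-point case)
ITER = 16  # number of CORDIC iterations (assumed)

# Look-up table: precompute W_N^k = exp(-j*2*pi*k/N) for k = 0 .. N/2-1.
LUT = [cmath.exp(-2j * math.pi * k / N) for k in range(N // 2)]

# CORDIC constants: elementary angles atan(2^-i) and the gain correction.
ANGLES = [math.atan(2.0 ** -i) for i in range(ITER)]
GAIN = 1.0
for i in range(ITER):
    GAIN /= math.sqrt(1.0 + 2.0 ** (-2 * i))

def cordic_cos_sin(theta):
    """Rotation-mode CORDIC for |theta| <= pi/2; returns (cos, sin)."""
    x, y, z = GAIN, 0.0, theta  # starting at GAIN pre-compensates the CORDIC gain
    for i in range(ITER):
        d = 1.0 if z >= 0.0 else -1.0
        # Micro-rotation by +/- atan(2^-i); only shifts and adds in hardware.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ANGLES[i]
    return x, y

def twiddle_cordic(k):
    """W_N^k for 0 <= k < N/2, folding the angle into [-pi/2, pi/2] first."""
    theta = -2.0 * math.pi * k / N
    if theta < -math.pi / 2:
        c, s = cordic_cos_sin(theta + math.pi)
        return complex(-c, -s)  # cos(t) = -cos(t + pi), sin(t) = -sin(t + pi)
    c, s = cordic_cos_sin(theta)
    return complex(c, s)

# Largest deviation between the CORDIC result and the table entry.
worst = max(abs(twiddle_cordic(k) - LUT[k]) for k in range(N // 2))
print(worst < 1e-3)
```

The trade-off is the usual one: the table costs memory proportional to the FFT length but delivers a factor per read, whereas CORDIC needs no table of rotation factors but takes one iteration per bit of precision.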

4. Simulation test

Verification of data correctness: when the dual-core FFT system completes the operation, the data are output through the serial port. The output data are converted to single-precision floating point, and the power spectrum is then computed in Matlab; the results are shown in Figure 2. The result is consistent with that of Matlab's fft() function in Figure 2, which indicates that the design of the dual-core FFT system is correct and that it reflects the frequency-domain characteristics of the signal.
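
The comparison described above can be sketched as follows. This is a hypothetical reconstruction of the check, not the authors' Matlab script: the byte layout of the serial data (little-endian float32 pairs of real and imaginary parts), the 64-point frame, the test tone, and the power-spectrum normalization are all assumptions, and the "received" bytes are simulated from a reference DFT because no captured data are available.

```python
import cmath
import math
import struct

def reference_dft(x):
    """Reference spectrum computed directly from the DFT definition."""
    M = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * n * k / M) for n in range(M))
            for k in range(M)]

def decode_float32_pairs(raw):
    """Decode bytes as little-endian float32 (re, im) pairs (assumed format)."""
    vals = struct.unpack("<%df" % (len(raw) // 4), raw)
    return [complex(vals[i], vals[i + 1]) for i in range(0, len(vals), 2)]

def power_spectrum(X):
    """|X(k)|^2 / M; the normalization is chosen for illustration."""
    return [abs(v) ** 2 / len(X) for v in X]

# Test signal: a single tone in bin 5 of a 64-point frame.
M, tone_bin = 64, 5
x = [math.cos(2 * math.pi * tone_bin * n / M) for n in range(M)]
ref = reference_dft(x)

# Stand-in for the bytes received from the serial port: the reference result
# rounded to single precision. A real capture would replace this line.
raw = b"".join(struct.pack("<2f", v.real, v.imag) for v in ref)

got = power_spectrum(decode_float32_pairs(raw))
want = power_spectrum(ref)
peak = max(range(M // 2), key=lambda k: got[k])  # strongest bin below Nyquist
max_rel_err = max(abs(g - w) for g, w in zip(got, want)) / max(want)
print(peak, max_rel_err < 1e-5)
```

A check of this kind confirms two things at once: the spectral peak appears in the expected bin, and the single-precision output agrees with the reference to within rounding error.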

After compilation, the dual-core FFT processor system was simulated in the Quartus II integrated software. Fig. 3 shows the timing simulation of the system, including the clock system and the data read and write signals; it clearly shows the speed advantage of the dual-core FFT system in data operations, as well as its more complex timing control.

In terms of performance, the simulation results show that the dual-core FFT processor takes 102.4 μs to process 1024 data points, whereas the single-core FFT processor takes 192 μs. The number of clock cycles required by the dual-core FFT system is thus almost half that of the single-core FFT system, which shows that the performance of the dual-core FFT system is nearly twice that of the single-core FFT system.
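
Worked out from the two simulated times, the speedup \( S \) and the parallel efficiency \( E \) on two cores (symbols introduced here for illustration) are:

\[
S = \frac{T_{\text{single}}}{T_{\text{dual}}} = \frac{192\ \mu\text{s}}{102.4\ \mu\text{s}} = 1.875,
\qquad
E = \frac{S}{2} \approx 0.94 .
\]

Per data point this corresponds to \( 192\ \mu\text{s} / 1024 = 187.5\ \text{ns} \) for the single-core system and \( 102.4\ \mu\text{s} / 1024 = 100\ \text{ns} \) for the dual-core system.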

To sum up, this dual-core FFT system achieves our goal and fully meets the design requirements for functionality and efficiency.

![Frequency content of y](image)

**Figure 3** Matlab `fft()` function power spectrum

5. Summary

This paper describes the process of designing a complete dual-core FFT processor, in which core modules such as the clock system, address generator, and angle generator are adjusted according to the principle of multi-core parallelism so that the dual-core FFT system works normally according to our design goal. After the system design was completed, the whole design, together with the relevant peripherals, was subjected to module-level and overall simulation tests in the Quartus II software. The results show that the design improves the efficiency of the FFT system through pipelining and full parallelization. The design is feasible, has good real-time performance, uses few resources, produces correct results, and has theoretical and practical value.

References


