An FPGA Implementation of a Parallel Column Sort Algorithm with Off-chip DRAMs
The main contribution of this paper is an FPGA implementation of a parallel sorting algorithm that uses off-chip DRAMs. The implementation is based on the idea of column sort: multiple data sets stored in distinct DRAMs are sorted concurrently by FIFO-based pipeline sorters in the FPGA. We have implemented the proposed circuit on a Xilinx Virtex UltraScale+ FPGA (XCVU9P-L2FLGA2104E) with eight off-chip DRAMs. Experimental results show that the proposed implementation achieves a speed-up factor of 84 over a sequential CPU implementation using quicksort.
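For readers unfamiliar with column sort, the following is a minimal software sketch, assuming the term refers to Leighton's columnsort. It is not the proposed circuit: the function names `sort_columns` and `column_sort`, the flat column-major layout, and the use of Python's built-in `sorted` in place of a FIFO-based pipeline sorter are all illustrative assumptions. The point it shows is that every sorting step acts on whole columns independently, which is what allows columns held in distinct memories to be sorted concurrently.

```python
import math


def sort_columns(flat, r):
    """Sort each length-r column of a column-major flat list independently."""
    out = []
    for start in range(0, len(flat), r):
        # Stand-in for one hardware sorter; columns do not interact here.
        out.extend(sorted(flat[start:start + r]))
    return out


def column_sort(values, r, s):
    """Leighton's columnsort on an r-row, s-column matrix (column-major).

    Requires len(values) == r * s, r divisible by s, and r >= 2 * (s - 1) ** 2.
    """
    n = r * s
    if len(values) != n or r % s != 0 or r < 2 * (s - 1) ** 2:
        raise ValueError("need len == r*s, r % s == 0, r >= 2*(s-1)^2")

    # Step 1: sort every column.
    flat = sort_columns(list(values), r)

    # Step 2: read in column-major order, write back in row-major order.
    spread = [None] * n
    for k, v in enumerate(flat):
        spread[(k % s) * r + (k // s)] = v

    # Step 3: sort every column.
    flat = sort_columns(spread, r)

    # Step 4: inverse of step 2.
    flat = [flat[(k % s) * r + (k // s)] for k in range(n)]

    # Step 5: sort every column.
    flat = sort_columns(flat, r)

    # Steps 6-8: shift by half a column using sentinels, sort, shift back.
    half = r // 2
    padded = [-math.inf] * half + flat + [math.inf] * (r - half)
    padded = sort_columns(padded, r)
    return padded[half:half + n]
```

As a usage example, `column_sort(data, 20, 4)` sorts 80 values arranged as four columns of 20. Under the assumption above, each call to `sort_columns` corresponds to a phase in which all columns could be processed at the same time.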