An FPGA Implementation of a Parallel Column Sort Algorithm with Off-chip DRAMs

Naoaki Harada, Koji Nakano, Yasuaki Ito


This paper presents an FPGA implementation of a parallel sorting algorithm that uses off-chip DRAMs. The implementation is based on the idea of column sort: multiple data sets stored in distinct DRAMs are sorted concurrently by FIFO-based pipeline sorters in the FPGA. We have implemented the proposed circuit on a Xilinx Virtex UltraScale+ family FPGA XCVU9PL2FLGA2104E with eight off-chip DRAMs. Experimental results show that the proposed implementation achieves a speed-up factor of 84 over a sequential CPU implementation of quicksort.
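For readers unfamiliar with column sort, the following is a minimal software sketch of the classical eight-step columnsort (due to Leighton), not the circuit described in the paper. It assumes the data form an r x s matrix stored column by column, with r even, s dividing r, and r >= 2s^2 (a conservative form of the usual size condition). The function names `sort_columns` and `column_sort` are hypothetical, and Python's built-in `sorted` merely stands in for whatever sorter processes each column; in the paper that role is played by FIFO-based pipeline sorters working on data held in separate DRAMs.

```python
import random


def sort_columns(a, r):
    """Sort each length-r column of a flat, column-major list in place."""
    for start in range(0, len(a), r):
        a[start:start + r] = sorted(a[start:start + r])


def column_sort(values, r, s):
    """Sort r*s values with the classical eight-step columnsort.

    Assumptions of this sketch: r is even, s divides r, and r >= 2*s*s.
    The columns are independent in every sorting step, so they could be
    sorted concurrently; here they are simply sorted one after another.
    """
    n = r * s
    assert len(values) == n and r % 2 == 0 and r % s == 0 and r >= 2 * s * s

    a = list(values)
    sort_columns(a, r)                                  # step 1: sort columns

    # Step 2: pick items up in column-major order, set them down row-major.
    b = [None] * n
    for k in range(n):
        b[(k % s) * r + k // s] = a[k]
    sort_columns(b, r)                                  # step 3: sort columns

    # Step 4: inverse of the step 2 permutation.
    a = [b[(k % s) * r + k // s] for k in range(n)]
    sort_columns(a, r)                                  # step 5: sort columns

    # Step 6: shift by r/2, padding with -inf in front and +inf at the end.
    half = r // 2
    padded = [float("-inf")] * half + a + [float("inf")] * half
    sort_columns(padded, r)                             # step 7: sort columns

    return padded[half:half + n]                        # step 8: unshift


if __name__ == "__main__":
    data = [random.randrange(1000) for _ in range(18 * 3)]
    assert column_sort(data, 18, 3) == sorted(data)
```

The point of the sketch is that every sorting step touches only whole columns, which is what makes the method a natural fit for several sorters running side by side on separate memories.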


parallel sorting algorithm; hardware algorithm; FPGA; DRAM
