Parallel Particle Swarm Optimization on Graphical Processing Unit for Pose Estimation


In this paper, we present a parallel implementation of Particle Swarm Optimization (PSO) on the GPU using CUDA. By fully utilizing the processing power of graphics processors, our implementation achieves a speedup of 215x over a sequential implementation on the CPU. This speedup is significantly higher than those reported in recent papers, and it results from a few simple optimizations that better adapt the parallel algorithm to the specific architecture of NVIDIA GPUs. We then apply our parallel PSO to the problem of 3D pose estimation of a bomb in free fall, reducing the time needed to analyze 120 images to about 1 s, a speedup of 140x over the sequential version on the CPU.

Key-Words: CUDA, graphics processing units, particle swarm optimization, parallel implementation, 3D pose estimation
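
For orientation, the sequential baseline that such a GPU version is compared against can be sketched as follows. This is a minimal, generic global-best PSO under stated assumptions, not the authors' implementation: the objective `sphere`, the parameter values (`w`, `c1`, `c2`, swarm size, bounds), and all function names are hypothetical choices made for illustration. The inner loop over particles only reads the shared global best, which is why it is a natural candidate for a one-thread-per-particle mapping on a GPU.

```python
import random


def sphere(x):
    """Toy objective (an assumption for this sketch): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def pso(objective, dim=3, n_particles=32, iters=200,
        w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0, seed=0):
    """Sequential global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)

    # Random initial positions inside [lo, hi], zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]

    # Global best is the best of the personal bests.
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        # Each particle's update depends only on its own state and the shared
        # global best, so these iterations are independent of one another.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest_val[i] = val
                pbest[i] = pos[i][:]

        # Reduction step: the global best is refreshed once per iteration,
        # after all particles have moved (synchronous update).
        g = min(range(n_particles), key=pbest_val.__getitem__)
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g][:], pbest_val[g]

    return gbest, gbest_val


if __name__ == "__main__":
    best, value = pso(sphere)
    print(best, value)
```

In a parallel version, the per-particle loop and the objective evaluation would typically run concurrently, while the global-best update becomes a reduction across particles; how the paper organizes these steps on the GPU is not stated in the abstract.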


