    Author(s)
    Huiyang Xiong, Nanjing University
    Bohang Xiong, Nanjing University
    Wenhao Wang, Nanjing University
    Jing Tian, Nanjing University
    Hao Zhu, Nanjing University
    Zhongfeng Wang, Nanjing University, China
    Abstract

    Internet of Things (IoT)-centric applications, such as augmented reality and self-driving cars, require real-time task processing, large bandwidth, and low data transmission latency. FPGA-based edge computing is considered an effective solution to these challenges. Nonlinear optimization methods are a valuable tool in these applications, but they involve computation-intensive operations with strong data dependencies, which limits their use in real-time settings. The limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm ranks among the most efficient algorithms for large-scale optimization problems. In this paper, we propose, for the first time, a highly parallel FPGA-based architecture for the two key parts of the L-BFGS algorithm: the search direction computation and the line search. Compared with a CPU implementation, the FPGA implementations of the search direction computation and the line search achieve speedups of 39.73× and 5.50×, respectively. Compared with a straightforward GPU implementation, the search direction computation on FPGA achieves a speedup of 31.03×.
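
For readers unfamiliar with the two parts named in the abstract, the following is a minimal software sketch of them in plain Python. It is not the paper's architecture. It assumes the textbook two-loop recursion for the search direction and a simple backtracking (Armijo) line search, since the abstract does not say which line-search variant the authors use. All function names and parameters (`lbfgs_direction`, `backtracking_line_search`, `lbfgs_minimize`, `m`, `c1`, `shrink`) are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def lbfgs_direction(grad: Vector, s_hist: List[Vector], y_hist: List[Vector]) -> Vector:
    """Two-loop recursion: returns d = -H*grad from the stored (s, y) pairs.

    s_hist and y_hist are ordered oldest to newest.
    """
    q = list(grad)
    coeffs = []
    # First loop: newest pair to oldest. Each step depends on the previous q.
    for s, y in zip(reversed(s_hist), reversed(y_hist)):
        rho = 1.0 / dot(y, s)
        alpha = rho * dot(s, q)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
        coeffs.append((alpha, rho))
    # Scale by gamma = s^T y / y^T y (identity scaling when history is empty).
    gamma = 1.0
    if s_hist:
        gamma = dot(s_hist[-1], y_hist[-1]) / dot(y_hist[-1], y_hist[-1])
    r = [gamma * qi for qi in q]
    # Second loop: oldest pair to newest, reusing alpha and rho.
    for (s, y), (alpha, rho) in zip(zip(s_hist, y_hist), reversed(coeffs)):
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return [-ri for ri in r]


def backtracking_line_search(f: Callable[[Vector], float], x: Vector, grad: Vector,
                             d: Vector, c1: float = 1e-4, shrink: float = 0.5,
                             max_iter: int = 50) -> float:
    """Assumed line search: halve the step until the Armijo condition holds."""
    fx = f(x)
    slope = dot(grad, d)
    step = 1.0
    for _ in range(max_iter):
        x_new = [xi + step * di for xi, di in zip(x, d)]
        if f(x_new) <= fx + c1 * step * slope:
            break
        step *= shrink
    return step


def lbfgs_minimize(f: Callable[[Vector], float], grad_f: Callable[[Vector], Vector],
                   x0: Vector, m: int = 5, tol: float = 1e-8,
                   max_iter: int = 100) -> Vector:
    """Driver loop keeping at most m correction pairs."""
    x, g = list(x0), grad_f(x0)
    s_hist: List[Vector] = []
    y_hist: List[Vector] = []
    for _ in range(max_iter):
        if dot(g, g) ** 0.5 < tol:
            break
        d = lbfgs_direction(g, s_hist, y_hist)
        step = backtracking_line_search(f, x, g, d)
        x_new = [xi + step * di for xi, di in zip(x, d)]
        g_new = grad_f(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        if dot(y, s) > 1e-12:  # store only pairs with positive curvature
            s_hist.append(s)
            y_hist.append(y)
            if len(s_hist) > m:
                s_hist.pop(0)
                y_hist.pop(0)
        x, g = x_new, g_new
    return x
```

In this sketch, each iteration of the two loops in `lbfgs_direction` depends on the vector produced by the previous iteration, while the inner products and vector updates inside one iteration are independent across elements. That combination of sequential dependency and per-element parallelism is the kind of structure the abstract refers to when it mentions data-dependent operations and a highly parallel architecture.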