Pudov S.G.  

Intel® Math Kernel Library Inspector-Executor Sparse BLAS API for iterative computations

In the Intel® Math Kernel Library (Intel® MKL) up to and including version 11.2, the Sparse BLAS functionality is based on the NIST* Sparse BLAS C implementation. This API uses a single function call for each compute operation and does not allow optimization information to be passed between calls. This rules out certain aggressive optimizations, such as load balancing based on the matrix sparsity pattern, matrix reordering, and even changes of matrix format. Such optimizations are expensive relative to a single sparse matrix-vector multiplication and pay off only when many operations are performed with the same matrix, as in iterative solvers. Intel MKL 11.3 Beta introduces an inspector-executor API, which splits computation into two steps. The analysis stage inspects the matrix sparsity pattern and applies changes to the matrix structure. Subsequent calls then use the information from the analysis stage to perform computations with higher performance. The API offers consistent support for C- and Fortran-style data layouts (row- and column-major) and indexing (zero-based and one-based), as well as combinations of these. It supports the key sparse matrix storage formats: CSR (CSC), COO and BSR. I will discuss optimizations made to support iterative solvers with matrix-vector multiplications and triangular solvers, aimed at achieving scalability on Intel® Xeon® and Intel® Xeon Phi™ processors.
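The two-step idea can be illustrated with a minimal sketch. This is not the Intel MKL API: the names `InspectedCSR`, `inspect` and `execute_mv` are hypothetical, and the only "optimization" shown is an assumed one, splitting rows into chunks with roughly equal numbers of nonzeros. The analysis runs once, and every later matrix-vector product reuses its result.

```python
from bisect import bisect_left


class InspectedCSR:
    """Hypothetical handle: a zero-based CSR matrix plus hints stored by inspect()."""

    def __init__(self, n_rows, row_ptr, col_idx, values):
        self.n_rows = n_rows
        self.row_ptr = row_ptr    # length n_rows + 1, row_ptr[-1] == number of nonzeros
        self.col_idx = col_idx
        self.values = values
        self.chunks = None        # filled in by inspect()


def inspect(a, n_chunks):
    """Analysis stage: choose row ranges holding roughly equal nonzero counts."""
    nnz = a.row_ptr[-1]
    bounds = [0]
    for k in range(1, n_chunks):
        target = k * nnz / n_chunks
        # First row boundary whose cumulative nonzero count reaches the target.
        r = min(bisect_left(a.row_ptr, target), a.n_rows)
        bounds.append(max(r, bounds[-1]))
    bounds.append(a.n_rows)
    a.chunks = list(zip(bounds[:-1], bounds[1:]))


def execute_mv(a, x):
    """Execution stage: y = A * x, reusing the row ranges found by inspect()."""
    chunks = a.chunks if a.chunks is not None else [(0, a.n_rows)]
    y = [0.0] * a.n_rows
    for lo, hi in chunks:         # in a parallel code, one chunk per thread
        for i in range(lo, hi):
            s = 0.0
            for p in range(a.row_ptr[i], a.row_ptr[i + 1]):
                s += a.values[p] * x[a.col_idx[p]]
            y[i] = s
    return y


# One dense row followed by three sparse rows: splitting by row count would be
# unbalanced, splitting by nonzeros gives two chunks of four nonzeros each.
A = InspectedCSR(4, [0, 4, 5, 6, 8],
                 [0, 1, 2, 3, 1, 2, 0, 3],
                 [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
inspect(A, 2)                     # pay the analysis cost once
x = [1.0, 1.0, 1.0, 1.0]
for _ in range(3):                # stand-in for iterations of a solver
    y = execute_mv(A, x)
```

In an iterative solver the loop at the end would run many times, which is what amortizes the cost of the analysis stage.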

