Thursday, April 4, 2013

Spotlight: Blaze C++ math library

My collaborator Yucheng Low asked me to take a look at the Blaze math library. I did a quick review, and here are my findings.

Blaze is an interesting effort with a relatively easy programming interface (see, for example, the CG code under "A complex example").
The main guy behind the software is Klaus Iglberger, a PhD from Germany. Two papers were published about Blaze:
  • K. Iglberger, G. Hager, J. Treibig, and U. Rüde: Expression Templates Revisited: A Performance Analysis of Current Methodologies (Download). SIAM Journal on Scientific Computing, 34(2): C42--C69, 2012
  • K. Iglberger, G. Hager, J. Treibig, and U. Rüde: High Performance Smart Expression Template Math Libraries (Download). Proceedings of the 2nd International Workshop on New Algorithms and Programming Models for the Manycore Era (APMM 2012) at HPCS 2012
Regarding performance, Blaze does very well, at least on the tested primitives. It seems to work especially well for small matrices (width below 100), even relative to MKL. For larger matrices, its performance is similar to MKL's. The performance tests cover only dense matrices, not sparse ones.

What is missing, IMHO, is an algorithmic suite like the one that exists in Eigen: linear solvers, SVD, etc. So currently Blaze focuses mainly on matrix-vector operations.

It seems Blaze uses a single-core implementation (it does not exploit multicore parallelism).

Overall, it seems like a very interesting project to keep track of. Once it supports some additional functionality I would consider using it.

To dig a little deeper, I sent some questions to Klaus Iglberger, who was kind enough to reply promptly:

> We were looking for a good math library to replace Eigen and we liked Blaze API. But we still have some missing functionality I wanted to ask you about.
We released the Blaze library only recently, in August 2012. Therefore Blaze is obviously much younger than Eigen and cannot compete in terms of features. Currently it can only compete in terms of performance (it seems to be the most efficient C++ math library for many operations) and in terms of software architecture and design. The software design and architecture is one of our personal interests and will therefore always play a major role in the development of Blaze. However, due to that effort, I feel that Blaze can be used more naturally than the other C++ math libraries (including Eigen).

Since you are asking about features, let me give you an idea of our current roadmap. We are currently working on views (which you can for instance use to work on submatrices, extract parts of the result from vectors and matrices, etc.), special purpose matrices (banded matrices, upper and lower triangle matrices, etc.) and shared memory parallelization. These will be the next big features, starting with views in Blaze 1.2.

Whereas in direct comparison Blaze cannot compete in the total number of features, Blaze still offers a small number of unique features. Probably the most important is the support of the Intel MIC architecture (Xeon Phi). Second is the support of the AVX instruction set, which is still not available in most other C++ math libraries. Third, Blaze is probably the only library that allows a completely hierarchic nesting of matrix and vector data types without performance penalties. For instance, you can define block structured matrices very conveniently:

typedef CompressedMatrix< DynamicMatrix<double,rowMajor>, rowMajor>  BlockStructuredMatrix;

BlockStructuredMatrix A, B, C;
// … Initializing the matrices
C = A * B;

In this matrix multiplication you can still count on every single matrix multiplication to be executed at maximum performance (see also the answer to your third question). And last but not least, with the introduction of views, Blaze will offer an extremely versatile feature to restrict the computation to the "parts" you are interested in:

DynamicMatrix<double,rowMajor> A, B;
DynamicVector<double,columnVector> x;

// Restricts the computation of the matrix multiplication to the fourth column and still considers the most efficient way
// to compute the result although both matrices are stored in a row-wise fashion.
x = column( A * B, 4 );
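
To see why restricting a product to one column pays off, here is a minimal plain C++ sketch. It does not use Blaze and says nothing about how Blaze implements views; the RowMajorMatrix type and the columnOfProduct function are hypothetical names made up for this illustration. The idea is simply that column j of A * B equals A times column j of B, so the full product never has to be formed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal row-major dense matrix (an assumption for this sketch,
// not Blaze's DynamicMatrix).
struct RowMajorMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> data;  // element (i, j) lives at data[i * cols + j]

    RowMajorMatrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}

    double& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Computes only column j (zero-based) of the product A * B.
// Work is proportional to rows(A) * cols(A), instead of
// rows(A) * cols(A) * cols(B) for the full product.
std::vector<double> columnOfProduct(const RowMajorMatrix& A,
                                    const RowMajorMatrix& B,
                                    std::size_t j) {
    assert(A.cols == B.rows);
    assert(j < B.cols);

    // Gather column j of B once, so the inner loop below reads
    // contiguous memory even though B is stored row-wise.
    std::vector<double> bj(B.rows);
    for (std::size_t k = 0; k < B.rows; ++k) {
        bj[k] = B(k, j);
    }

    // x = A * bj, one dot product per row of A.
    std::vector<double> x(A.rows, 0.0);
    for (std::size_t i = 0; i < A.rows; ++i) {
        double sum = 0.0;
        for (std::size_t k = 0; k < A.cols; ++k) {
            sum += A(i, k) * bj[k];
        }
        x[i] = sum;
    }
    return x;
}
```

Calling columnOfProduct(A, B, j) returns a vector with one entry per row of A. Copying the column of B into a temporary first is one possible way to cope with row-wise storage; other strategies may be faster depending on the matrix sizes.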

> 1) Is there a plan to support sparse matrix algorithms like solving a linear system, eigen decomposition etc.
We currently don't plan to extend our linear system solvers, but hope that they can be added easily based on the data structures that we provide.

> 2) What is the level of support for parallelism? Namely, is the library fully serial or do you have some support for parallelism when there are multiple cores.
Until now, Blaze is completely serial, except for the vectorization (which is of course also a level of parallelization). So unfortunately, currently a single operation can only use a single core. But, as already stated, we are currently working on shared memory parallelization, but it will take some time until we release this feature.

> 3) According to the performance plots, on large matrices blaze performance aligns with MKL. Is there some mechanism which sends the computation to MKL once the problem is big enough and otherwise uses blaze code?
Blaze tries to detect several characteristics about the involved matrices. One of these characteristics is the size, which is used to determine which algorithm is most beneficial for performance. Whereas the MKL offers by far the best performance for large matrices, for small matrices the performance is less favorable due to optimizations that only work well for large matrices and therefore cause a performance penalty for small matrices. Therefore Blaze provides a couple of special algorithms tailored for small matrices (for instance, these algorithms don't use blocking strategies and use the data a little differently). The threshold for this algorithm switching can be configured in one of the configuration files: blaze/config/Threshold.h. With these you can tune Blaze to a specific target platform.
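
The size-based switching described in this answer can be sketched in plain C++ as follows. This is only an illustration of the general idea under my own assumptions: the constants kSmallMatrixThreshold and kBlockSize, the Kernel enum, and all function names are hypothetical, and the values are arbitrary. They are not taken from blaze/config/Threshold.h, and the two kernels are textbook loops, not Blaze's or MKL's.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tunables (assumed values, not Blaze's actual thresholds).
constexpr std::size_t kSmallMatrixThreshold = 64;
constexpr std::size_t kBlockSize = 32;

enum class Kernel { Small, Blocked };

// Picks a kernel from the matrix dimension alone.
Kernel chooseKernel(std::size_t n) {
    return n < kSmallMatrixThreshold ? Kernel::Small : Kernel::Blocked;
}

// Square n x n matrices stored row-major in a flat vector.
using Matrix = std::vector<double>;

// Kernel for small matrices: plain loops, no blocking overhead.
void multiplySmall(const Matrix& A, const Matrix& B, Matrix& C, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k) {
            const double a = A[i * n + k];
            for (std::size_t j = 0; j < n; ++j)
                C[i * n + j] += a * B[k * n + j];
        }
}

// Kernel for large matrices: works on kBlockSize x kBlockSize tiles
// so that the data being reused stays in cache.
void multiplyBlocked(const Matrix& A, const Matrix& B, Matrix& C, std::size_t n) {
    for (std::size_t ii = 0; ii < n; ii += kBlockSize)
        for (std::size_t kk = 0; kk < n; kk += kBlockSize)
            for (std::size_t jj = 0; jj < n; jj += kBlockSize) {
                const std::size_t iEnd = std::min(ii + kBlockSize, n);
                const std::size_t kEnd = std::min(kk + kBlockSize, n);
                const std::size_t jEnd = std::min(jj + kBlockSize, n);
                for (std::size_t i = ii; i < iEnd; ++i)
                    for (std::size_t k = kk; k < kEnd; ++k) {
                        const double a = A[i * n + k];
                        for (std::size_t j = jj; j < jEnd; ++j)
                            C[i * n + j] += a * B[k * n + j];
                    }
            }
}

// Dispatches to one of the two kernels based on the size threshold.
Matrix multiply(const Matrix& A, const Matrix& B, std::size_t n) {
    assert(A.size() == n * n);
    assert(B.size() == n * n);
    Matrix C(n * n, 0.0);
    if (chooseKernel(n) == Kernel::Small) {
        multiplySmall(A, B, C, n);
    } else {
        multiplyBlocked(A, B, C, n);
    }
    return C;
}
```

Both kernels compute the same product; only the traversal order differs. On a real target the cutoff would be found by benchmarking, which is presumably why it is exposed as a configuration value.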

> 4) Is there support for serialization for writing and loading matrices from file?
We unfortunately neglected the support for writing to file and loading from file a little. Currently, only the DynamicVector class supports this feature. The member functions you can use are called 'read' and 'write', respectively. However, I have added this to our list of features for the next release, since this will not take too much time to implement.
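
Until such support exists for every type, a simple binary round trip is easy to write by hand. The sketch below is plain C++ on a std::vector<double>; it is not Blaze's 'read'/'write' and does not use Blaze's file format. The layout (a 64-bit element count followed by the raw doubles) and the names writeVector, readVector and roundTrip are assumptions made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <vector>

// Hypothetical layout (not Blaze's format): a 64-bit element count,
// then the raw doubles in host byte order.
void writeVector(std::ostream& out, const std::vector<double>& v) {
    const std::uint64_t n = v.size();
    out.write(reinterpret_cast<const char*>(&n), sizeof(n));
    out.write(reinterpret_cast<const char*>(v.data()),
              static_cast<std::streamsize>(n * sizeof(double)));
    if (!out) {
        throw std::runtime_error("writeVector: stream error");
    }
}

std::vector<double> readVector(std::istream& in) {
    std::uint64_t n = 0;
    in.read(reinterpret_cast<char*>(&n), sizeof(n));
    if (!in) {
        throw std::runtime_error("readVector: missing header");
    }
    std::vector<double> v(n);
    in.read(reinterpret_cast<char*>(v.data()),
            static_cast<std::streamsize>(n * sizeof(double)));
    if (!in) {
        throw std::runtime_error("readVector: truncated data");
    }
    return v;
}

// Writes v to an in-memory binary stream and reads it back.
std::vector<double> roundTrip(const std::vector<double>& v) {
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    writeVector(buffer, v);
    return readVector(buffer);
}
```

The same two functions work with a std::ofstream or std::ifstream opened with std::ios::binary. Because the doubles are written in host byte order, a file produced this way is not portable between machines with different endianness.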

> Thanks a lot for your time!
You're very welcome. Please don't hesitate to contact me again if you have further questions (or possible feature requests) or if you have suggestions of how to improve Blaze. Hopefully you consider Blaze for your work, even if some features are currently missing. Please keep me posted on your decision.


  1. I am trying to figure out which matrix library to adopt for a new project. Since speed will be extremely important, Blaze was the front runner. However, it looks like Eigen is catching up speed-wise, since support for AVX and FMA is being added through this branch. Do you expect Blaze to keep an edge over Eigen in the future?

    1. Very hard question. We mainly use Eigen since it is compatible with any system and does not require BLAS or LAPACK. It may be that for very specific needs other libraries are faster.