Large Scale Machine Learning and Other Animals: The quiet rise of Gaussian Belief Propagation (GaBP)

Gaussian Belief Propagation is an inference method on a Gaussian graphical model which is related to solving a linear system of equations, one of the fundamental problems in computer science and engineering. I have published my PhD thesis on applications of GaBP in 2008.
When I started working on GaBP, it was absolutely useless algorithm with no documented applications.
Recently, I am getting a lot of inquiries from people who applying GaBP on real world problems. Some examples:
* Carnegie Mellon graduate student Kyung-Ah Sohn, working with Eric Xing, is working on regression problem for finding causal genetic variants of gene expressions, considered using GaBP for computing matrix inverses.
* UCSC researcher Daniel Zerbino using suing GaBP for smoothing genomic sequencing measurements with constraints.
* UCSB graduate student Yun Teng is working on implementing GaBP as part of the KDT (knowledge discovery toolbox package).

Furthermore, I was very excited to find out today from Noam Koenigstein, a Tel Aviv university graduate about Microsoft Research Cambridge project called MatchBox, which is using Gaussian BP for collaborative filtering and being actually deployed in MS. Some examples to other conversations I had are:
* Wall Street undisclosed company (that asked to remain private) who is using GaBP for parallel computation of linear regression of online stock market data.
* A gas and oil company was considering to use GaBP for computing the main diagonal
of the inverse of a sparse matrix.

I thought about writing on the common questions I am getting so my answers can be useful for a wider audience. Here are some of the questions:

Q: What is the difference between GaBP and other iterative methods for solving a linear system of equations like the Jacobi method?
A: GaBP typically converges in a fewer number of iterations relative to Jacobi, since it deploys information from the second derivative (the precision/variance of the Gaussian). However, each iteration is more computation intensive since for each non-zero entry of the linear relation matrix a Jacobi like computation is done. Thus, the algorithms works well for sparse matrices.

Q: When should I use GaBP?
A: I believe there is no silver bullet and the answer depends on the data. I recommend to always try several algorithms like Jacobi, conjugate gradients etc. and compare performance to GaBP. On some cases GaBP works very well. For example: matrices with zero, -1, 1 entries. Matrices with same number of non zeros in each row and column. CDMA. 2D Poisson problem.

Q: Can GaBP be applied to dense matrices?
A: Yes, there is an algorithm variants for dense matrices - see section 3.3 of my PhD thesis.

Q: Does GaBP always converge?
A: No, like the Jacobi method, its converges is relates to the data matrix. However, it is always possible to force converges, at the cost of higher computation. See http://arxiv.org/abs/0901.4192

Q: What is the best way to understand GaBP better?
A: I recommend downloading the Matlab code from here, and experimenting with it on some toy problems.

Q: Can GaBP converge in cases where Jacobi does not converge?
A: Yes, in some cases. See example in equation 57 in here.

Q: Can the main diagonal of the sparse inverse matrix be computed using GaBP?
A: Yes, but only an approximation of it. The main inverse if approximated by the
output precision entries.

Q: can GaBP be used on non-square/non-symmetric matrices?
A: Yes, see: section III in here.

Q: Can GaBP be used for computing the full inverse of a sparse matrix?
A: Yes. The way to compute the full inverse is by solving n systems of linear equations where
the observation vector b_i = e_i, namely the all zero vector with 1 in its i-th entry. This can be done efficiently in parallel. Note that while the linear system may be sparse, the inverse solution is typically not sparse.

Q: Is there an efficient multicore implementation of GaBP?
A: Yes! as a part of the GraphLab linear solver library.

Q: What is the relation to linear belief functions?
A: Linear belief functions describe a joint Gaussian distribution, combine them, marginalize, etc. Compared to representing a joint Gaussian by its covariance matrix, linear belief functions allows to have infinite variance in some directions, which corresponds to no belief about the value in that direction. The same can be done in GaBP, by having a zero precision value of that node. However, careful handling of the ordering of computation is needed to avoid division by zero.

Q: I need to solve Ax=b where A = UU' and U is given and not sparse.
The problem is that I would like to avoid computation of UU'.
A: First you need to solve U\b using GaBP for getting the observation
and then you need to solve U'\(U\b).

Q: Recently i found your PHD thesis about solving systems of linear equations using GaBP and then the GraphLab. I have two questions, what's the advantage of GaBP over algorithms like sparseLM in solving linear systems of equations?

A: I am not familiar with sparseLM algorithm - do you mean http://www.ics.forth.gr/~lourakis/sparseLM/ ? It seem to be a non-linear algorithm. When the cost function is linear (like in a linear system) there is no benefit in using a non-linear method.

Q: Is it possible to incrementally solve the "Ax=b" equation as new measurements are added to the matrices "A" and "b"?
A: No. The GaBP is an offline algorithm. The scenario you are describing sounds like a Kalman filter problem - where new measurements are constantly introduced into the model. You can take a look
at my paper: "Distributed Kalman filter via Gaussian Belief Propagaion". You may also want to take a look at rank one update algorithms: where matrix A is updated based on new observations.

If you have any other interesting questions - please email me I am always happy to answer!

Large Scale Machine Learning and Other Animals

Saturday, August 6, 2011

The quiet rise of Gaussian Belief Propagation (GaBP)

No comments:

Post a Comment

Labels

GraphLab Users Google Group

pagerank

google analytics

syntax