## Thursday, April 18, 2013

### Distributed Dual Decomposition (DDD) in GraphLab

Our collaborator Dhruv Batra, from Virginia Tech, has kindly contributed DDD code to GraphLab. Here is an explanation of the method and how to deploy it.
The full documentation is found here.

# Distributed Dual Decomposition

Dual Decomposition (DD), also called Lagrangian Relaxation, is a powerful technique with a rich history in Operations Research. DD solves a relaxation of difficult optimization problems by decomposing them into simpler subproblems, solving these simpler subproblems independently and then combining these solutions into an approximate global solution.
More details about DD for solving Maximum A Posteriori (MAP) inference problems in Markov Random Fields (MRFs) can be found in the following:
```D. Sontag, A. Globerson, T. Jaakkola.
Introduction to Dual Decomposition for Inference.
Optimization for Machine Learning, editors S. Sra, S. Nowozin, and S. J. Wright: MIT Press, 2011.
```
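To make the idea concrete, here is a minimal sketch of dual decomposition by projected subgradient ascent, in plain Python/NumPy rather than the GraphLab code. It solves a toy one-variable MAP problem whose cost is split across two subproblems that must agree; the cost vectors `f1` and `f2` are made-up illustrative data, not from the contributed code.

```python
import numpy as np

# Toy MAP problem: one variable with 3 states, with the total cost split
# into two subproblem cost vectors f1 and f2 (hypothetical data).
# The exact minimum of f1 + f2 is 2.0, attained at state 1 (or 2).
f1 = np.array([0.0, 2.0, 1.0])
f2 = np.array([3.0, 0.0, 1.0])

def dual_decomposition(f1, f2, iters=500):
    """Subgradient ascent on the dual of
       min_x f1[x] + f2[x], rewritten as min f1[x1] + f2[x2] s.t. x1 = x2."""
    lam = np.zeros_like(f1)              # Lagrange multipliers, one per state
    best_dual, best_primal = -np.inf, np.inf
    for t in range(1, iters + 1):
        x1 = int(np.argmin(f1 + lam))    # solve subproblem 1 independently
        x2 = int(np.argmin(f2 - lam))    # solve subproblem 2 independently
        # The dual value is always a lower bound on the true minimum.
        best_dual = max(best_dual, (f1 + lam)[x1] + (f2 - lam)[x2])
        # Decode a primal (approximate global) solution from either subproblem.
        best_primal = min(best_primal, f1[x1] + f2[x1], f1[x2] + f2[x2])
        if x1 == x2:                     # agreement certifies optimality
            break
        # Subgradient step: penalize x1 in subproblem 1, reward it in
        # subproblem 2, pushing the two subproblems toward agreement.
        step = 1.0 / t
        lam[x1] += step
        lam[x2] -= step
    return best_dual, best_primal

dual, primal = dual_decomposition(f1, f2)
print(dual, primal)  # dual approaches 2.0 from below; primal reaches 2.0
```

The primal-dual gap `primal - dual` is exactly the quantity that the `--pdgapthres` convergence option below refers to: when it reaches zero, the relaxation is tight and the decoded solution is provably optimal.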

## Running DDD

The input MRF graph is assumed to be in the standard UAI file format. For example, a 3x3 grid MRF can be found here: grid3x3.uai.
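For readers unfamiliar with the format, here is a minimal hand-written illustration of a UAI model file, for a hypothetical MRF with two binary variables, two unary factors, and one pairwise factor (this is not the grid3x3.uai file above):

```MARKOV
2
2 2
3
1 0
1 1
2 0 1

2
 0.4 0.6

2
 0.5 0.5

4
 1.0 0.2 0.2 1.0
```

The preamble declares the network type, the number of variables and their cardinalities, and the scope of each factor; the blocks that follow give each factor's table size and entries.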
The program can be run like this:
```> ./dd --graph grid3x3.uai
```
Other arguments are:

* `--help` Display the help message describing the list of options.
* `--output` The output directory in which to save the final predictions.
* `--dualimprovthres` (Optional, default 0.00001) The amount of change in the dual objective (in log-space) that will be tolerated at convergence.
* `--pdgapthres` (Optional, default 0.1) The tolerance level for a zero primal-dual gap.
* `--maxiter` (Optional, default 10000) The maximum number of dual update iterations.
* `--engine` (Optional, default asynchronous) The engine type to use when executing the vertex-programs.
  * synchronous: All LoopyBP updates are run at the same time (Synchronous BP). This engine exposes greater parallelism but is less computationally efficient.
  * asynchronous: LoopyBP updates are run asynchronously with priorities (Residual BP). This engine has greater overhead and exposes less parallelism but can substantially improve the rate of convergence.
* `--ncpus` (Optional, default 2) The number of local computation threads to use on each machine. This should typically match the number of physical cores.
* `--scheduler` (Optional, default sweep) The scheduler to use when running with the asynchronous engine. The default is typically sufficient.
* `--engine_opts` (Optional, default empty) Any additional engine options. See `--engine_help` for a list of options.
* `--graph_opts` (Optional, default empty) Any additional graph options. See `--graph_help` for a list of options.
* `--scheduler_opts` (Optional, default empty) Any additional scheduler options. See `--scheduler_help` for a list of options.
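Combining several of these options, a run with the synchronous engine on four cores might look like the following (the output directory name is illustrative):

```> ./dd --graph grid3x3.uai --engine synchronous --ncpus 4 --output predictions
```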
Anyone who tries to run it - please let us know!