Monday, September 27, 2021

Colossal - The Future of DNA Editing?

I found some recent news about Colossal, a new startup that wants to revive the extinct woolly mammoth to fight global warming. Fighting global warming is one of the best things we can do, and notably one of the co-founders is Prof. George Church from Harvard Medical School, a very credible authority on gene editing. Church is one of the inventors of CRISPR, a gene-editing tool that can cut and paste any desired segment of DNA and thus make whatever changes we like.

Here is my take on it:

  • Their website is amazing; a lot of effort was invested on that front, backing up the pretty wild idea and drawing a lot of attention to this work. The $15M raised is tiny considering the amount of lab effort, equipment, materials, etc.
  • Global warming sounds like an awkward excuse to fund the research they really want to do.
    Ben Lamm, CEO of Colossal, told The Washington Post in an email that the extinction of the woolly mammoth left an ecological void in the Arctic tundra that Colossal aims to fill. The eventual goal is to return the species to the region so that they can reestablish grasslands and protect the permafrost, keeping it from releasing greenhouse gases at such a high rate.
  • Sending a wild mammoth to eat grass somewhere frozen, in the hope of reducing gas emissions, is likely the most complicated way to fight global warming I can imagine. But it is a sexy way of drawing news attention.
  • Mammoth DNA and human DNA are most likely around 90% similar. Thus the ability to revive an extinct mammoth would also enable reviving people. Recently, Israeli research has shown the possibility of raising mouse embryos outside the womb, so raising a mammoth outside the womb, as they plan to do, may be doable.
  • Christopher Preston, a professor of environmental ethics and philosophy at the University of Montana, questioned Colossal’s focus on climate change, given that it would take decades to raise a herd of woolly mammoths large enough to have environmental impacts.
  • So, the real applications of this technology may be to humans. For example, what if I wanted to revive my dead grandfather? What if I wanted a baby with blond hair and blue eyes? My guess is that there is a huge market for this technology in real life.
I wonder why all the news and media attention ignores the actual use cases of this technology.

Monday, September 6, 2021

How can we visualize attention?

A nice and recent paper from Lior Wolf's lab at Tel Aviv University, by Hila Chefer, Shir Gur and Lior Wolf. The problem is very simple: given a transformer encoder/decoder network, we would like to visualize the effect of attention on the image. While the problem is simple, the answer is pretty complicated: we need to take into account attention matrices from multiple layers at once. The paper suggests an iterative way to add up all those attention layers into one coherent image.
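To get a feel for why multiple layers must be combined, here is a sketch of a simpler, earlier baseline, "attention rollout" (Abnar & Zuidema) — not the paper's exact propagation rule, just the basic idea of folding per-layer attention maps into one:

```python
import numpy as np

def attention_rollout(attn_layers):
    """Combine per-layer attention matrices into one token-to-token map.
    attn_layers: list of (tokens, tokens) row-stochastic attention matrices
    (heads already averaged). Adds the identity to model residual
    connections, renormalizes, then multiplies through the layers."""
    rollout = np.eye(attn_layers[0].shape[0])
    for A in attn_layers:
        A_res = A + np.eye(A.shape[0])                  # residual path
        A_res = A_res / A_res.sum(axis=-1, keepdims=True)
        rollout = A_res @ rollout                       # fold in this layer
    return rollout
```

The output stays row-stochastic, so each row can be read as a distribution over input tokens — the paper's contribution is a more principled (relevance-based) alternative to this naive product.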

Figure 4 shows that the result is very compelling vs. previous art: the top row is the new paper and the bottom row is prior work for comparison.

Thursday, September 2, 2021

Gaussian Belief Propagation Tutorial

I have stumbled upon this nice tutorial, which interactively visualizes Gaussian Belief Propagation. What is nice about it is that the authors spent time to make an interactive tutorial that you can play with.

As a grad student I was totally excited about Gaussian Belief Propagation and spent a large chunk of my PhD thesis on it. In a nutshell, it is an iterative algorithm for solving a set of linear equations (for a PSD square matrix). The algorithm is very similar to the Jacobi iterative method but uses second-order information (namely an approximation of the Hessian) to improve convergence speed at the cost of additional memory and computation. In deep learning terminology, this is related to adding Adam/Momentum/ADMM etc. From personal experience, when people get excited about speeding up convergence of an iterative algorithm they completely neglect the fact that there is no free lunch: when you speed up convergence in terms of the number of iterations, you typically pay in something else (computation/communication).
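For reference, the plain Jacobi baseline mentioned above can be sketched in a few lines (a minimal sketch; assumes a diagonally dominant matrix so the iteration converges, and `iters` is an illustrative parameter):

```python
import numpy as np

def jacobi(A, b, iters=200):
    """Jacobi iteration for A x = b: split A = D + R (diagonal + rest)
    and repeat x <- D^{-1} (b - R x)."""
    D = np.diag(A)              # diagonal entries of A
    R = A - np.diag(D)          # off-diagonal remainder
    x = np.zeros_like(b, dtype=float)
    for _ in range(iters):
        x = (b - R @ x) / D     # element-wise divide by the diagonal
    return x
```

Each step uses only first-order (diagonal) information about A, which is exactly what GaBP improves on by also tracking precision (curvature) messages.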

The complexity of the algorithm derivation comes from the fact that it arises from probabilistic graphical models, where the notation of the problem is cumbersome, as it can be presented as either a factor graph or an undirected graphical model. A factor graph is a bipartite graph with evidence nodes (the input) on one side and functions aggregating the nodes on the other side. It is very similar to a single dense layer in deep learning, where the input comes from the left and the summation plus activation is done on the right. However, unlike deep learning, the factor graph has only a single layer, and the messages propagate back to the variable (input) nodes, back and forth. So the factor graph is the great-grandfather of deep learning.

To make it totally confusing, the seminal paper by Prof. Weiss uses pairwise notation, which is a third way of presenting the same model: instead of a single linear system of equations, it is a collection of multiple sets of sparse linear equations, where each set has only two variables.

Any differentiable function can be locally approximated to first order around a point by computing the gradient. That is why we often see linear modeling when modeling complex problems, including in deep learning, where each dense layer is linear. This is why solving linear models is relevant in so many domains.
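The first-order idea above can be sketched numerically (a minimal sketch for a scalar function; the helper name `linearize` and the finite-difference gradient are my own illustration):

```python
import numpy as np

def linearize(f, x0, eps=1e-6):
    """First-order Taylor approximation of f around x0:
    returns the function x -> f(x0) + f'(x0) * (x - x0),
    with f'(x0) estimated by a central finite difference."""
    g = (f(x0 + eps) - f(x0 - eps)) / (2 * eps)  # gradient estimate
    f0 = f(x0)
    return lambda x: f0 + g * (x - x0)
```

Near x0 the linear model tracks the original function closely — e.g. linearizing sin around 0 gives the familiar approximation sin(x) ≈ x.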

Another nice property of the algorithm is that besides the marginals (the solution to the linear system of equations), we get an approximation to the main diagonal of the inverse matrix of the linear system. This is often useful when inverting the full matrix is too computationally heavy.
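Both properties can be seen in a minimal sketch of pairwise GaBP (synchronous message updates; convergence holds e.g. for diagonally dominant matrices, and on tree-structured systems the variances are exact):

```python
import numpy as np

def gabp(A, b, iters=50):
    """Pairwise Gaussian Belief Propagation for A x = b (A symmetric).
    Returns (x, var) where var approximates diag(inv(A))."""
    n = len(b)
    P = np.zeros((n, n))            # P[i, j]: precision of message i -> j
    mu = np.zeros((n, n))           # mu[i, j]: mean of message i -> j
    Pii = np.diag(A).astype(float)  # node priors from the diagonal
    mii = b / Pii
    for _ in range(iters):
        P_new, mu_new = np.zeros_like(P), np.zeros_like(mu)
        for i in range(n):
            for j in range(n):
                if i == j or A[i, j] == 0.0:
                    continue
                # aggregate node i's prior and all incoming messages except j's
                P_ij = Pii[i] + P[:, i].sum() - P[j, i]
                mu_ij = (Pii[i] * mii[i] + P[:, i] @ mu[:, i]
                         - P[j, i] * mu[j, i]) / P_ij
                P_new[i, j] = -A[i, j] ** 2 / P_ij
                mu_new[i, j] = P_ij * mu_ij / A[i, j]
        P, mu = P_new, mu_new
    prec = Pii + P.sum(axis=0)                     # marginal precisions
    x = (Pii * mii + (P * mu).sum(axis=0)) / prec  # marginal means = solution
    return x, 1.0 / prec                           # ~ diag(inv(A))
```

The returned means solve the system, while the reciprocal marginal precisions give the diagonal of the inverse — without ever forming the full inverse.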