Hi Danny, I've seen some of your posts regarding Lanczos and thought I'd ask you.
I have a problem where I need to compute the first few hundred eigenvectors/eigenvalues in the PCA of a large, dense dataset, roughly 10,000 × 5,000,000 (it's an AVI of a fluorescence microscopy experiment). For the larger datasets I might not even be able to hold the entire set in memory. The standard approach would be to form the 10,000 × 10,000 matrix A*A' and then diagonalize it, but building that matrix alone entails tens of millions of dot products between 5M-element vectors. So I was hoping to find a method that computes the eigenvectors iteratively, without such explicit evaluations. Is Lanczos numerically stable enough at these sizes?
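The matrix-free setup described above can be sketched with SciPy's `scipy.sparse.linalg.svds`, whose default backend is ARPACK's implicitly restarted Lanczos. It only ever asks for products A·x and A'·x, so A*A' is never formed and A itself can be streamed from disk (e.g. via `np.memmap`). The sizes below are small stand-ins so the sketch runs; they are my illustration, not the real data:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, svds

# Illustrative sizes (the real data would be ~10000 x 5e6, memory-mapped).
m, n, k = 200, 1000, 10
rng = np.random.default_rng(0)
A = rng.standard_normal((m, n))  # in practice: np.memmap over the frame data

# Lanczos only needs matvec/rmatvec, so A*A' is never built explicitly.
op = LinearOperator((m, n),
                    matvec=lambda x: A @ x,
                    rmatvec=lambda x: A.T @ x)

U, s, Vt = svds(op, k=k)  # top-k singular triplets, singular values ascending
```

For a true PCA you would also fold mean-subtraction into the operator (A·x − μ·(1'x)) instead of densifying the centered matrix.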
The incremental SVD papers that I am thinking of have been written by Matthew Brand. He has several papers on this topic, but the one that I have been using the most is this one: http://www.merl.com/
Unfortunately I am not aware of a software package which implements Brand's method. You will probably have to implement it yourself.
An additional resource I found is another paper by Brand:

Brand, M. E., "Incremental Singular Value Decomposition of Uncertain Data with Missing Values", European Conference on Computer Vision (ECCV), Lecture Notes in Computer Science, Vol. 2350, pp. 707–720, May 2002.
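If you do implement it yourself, a minimal sketch of a Brand-style rank-k column update (ignoring the uncertain/missing-data machinery of the paper) looks like the following. The function name `svd_append_column` and all sizes are my own illustration:

```python
import numpy as np

def svd_append_column(U, s, Vt, c, k):
    """Append column c to a rank-k SVD A ~ U @ diag(s) @ Vt, then re-truncate to rank k."""
    p = U.T @ c                # projection of c onto the current left subspace
    r = c - U @ p              # residual orthogonal to U
    rho = np.linalg.norm(r)
    if rho > 1e-10:
        # c has a component outside span(U): extend the basis by one direction.
        K = np.block([[np.diag(s),            p[:, None]],
                      [np.zeros((1, len(s))), np.array([[rho]])]])
        Uk, sk, Vtk = np.linalg.svd(K)
        U_new = np.hstack([U, (r / rho)[:, None]]) @ Uk
    else:
        # c already lies in span(U): only the small core matrix changes.
        K = np.hstack([np.diag(s), p[:, None]])
        Uk, sk, Vtk = np.linalg.svd(K, full_matrices=True)
        U_new = U @ Uk
    n_cur = Vt.shape[1]
    V_old = np.block([[Vt.T,                      np.zeros((n_cur, 1))],
                      [np.zeros((1, Vt.shape[0])), np.ones((1, 1))]])
    V_new = V_old @ Vtk.T
    # Truncate back to rank k (this is where the method stays cheap).
    return U_new[:, :k], sk[:k], V_new[:, :k].T

# --- tiny demo on an exactly rank-3 matrix (sizes are illustrative) ---
rng = np.random.default_rng(1)
m, n, k = 20, 15, 3
A = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))   # exact rank k
U, s, Vt = np.linalg.svd(A[:, :k], full_matrices=False)         # seed with first k columns
for j in range(k, n):
    U, s, Vt = svd_append_column(U, s, Vt, A[:, j], k)
print(np.max(np.abs(U @ np.diag(s) @ Vt - A)))  # near machine precision
```

The per-column cost is dominated by the two thin products with U (O(m·k)) plus an SVD of a (k+1)×(k+1) core matrix, so the 5M-column case never touches anything larger than the rank you keep.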