My searches on factor analysis of time series have led me to a group of
articles, mostly applied to the analysis of functions and the like,
which use some form of entropy minimization to find independent factors.
I wonder how they fit into the picture.
I'm thinking in particular of two papers:
Gustavo Deco and Bernd Schurmann, Learning time series evolution by
unsupervised extraction of correlations, Phys. Rev. E, Vol. 51, No. 3
(1995).

A. Norman Redlich, Supervised Factorial Learning, Neural Computation 5,
750-766 (1993).
It seems to me there might be similarities in the approach, though I'm
only guessing. If the approaches are in fact different, how well do you
think these methods would work in the case of language?
Rob Freeman
>steve finch's comment about using some sort of gradient approach
>combined with a mixed multinomial model is exactly right. i
>implemented a version of the EM algorithm some years ago to do just
>this, but have never gotten back around to testing it on realistic
>sized corpora. my impression on test cases was that convergence
>happened *very* quickly, but very careful numerical coding was
>required to handle large examples without running into a variety of
>problems related to underflow and round-off error.
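The quoted approach (EM for a mixture of multinomials, with careful
numerics) can be sketched roughly as follows. This is my own minimal
illustration, not the implementation described above; the function name
and the log-space formulation are my assumptions, and the usual trick
for avoiding the underflow problems mentioned is to keep everything in
log space via a stable log-sum-exp.

```python
import numpy as np

def logsumexp(a, axis=None, keepdims=False):
    # Numerically stable log(sum(exp(a))): subtract the max before
    # exponentiating so large negative log-probabilities don't underflow.
    m = np.max(a, axis=axis, keepdims=True)
    out = m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))
    return out if keepdims else np.squeeze(out, axis=axis)

def em_mixture_multinomial(X, K, n_iter=100, seed=0):
    """EM for a mixture of K multinomials (an illustrative sketch).

    X : (N, V) matrix of counts, e.g. word counts per context.
    Returns log mixing weights (K,) and log component parameters (K, V).
    """
    rng = np.random.default_rng(seed)
    N, V = X.shape
    log_pi = np.full(K, -np.log(K))            # uniform mixing weights
    log_theta = np.log(rng.dirichlet(np.ones(V), size=K))

    for _ in range(n_iter):
        # E-step, entirely in log space: responsibilities r[n, k].
        # The multinomial coefficient is independent of k, so it cancels
        # when the rows are normalized and can be dropped.
        log_r = log_pi + X @ log_theta.T       # (N, K)
        log_r -= logsumexp(log_r, axis=1, keepdims=True)
        r = np.exp(log_r)

        # M-step: re-estimate weights and parameters, with a tiny floor
        # so empty components never produce log(0).
        Nk = r.sum(axis=0)
        log_pi = np.log(Nk / N + 1e-12)
        counts = r.T @ X + 1e-10
        log_theta = np.log(counts / counts.sum(axis=1, keepdims=True))

    return log_pi, log_theta
```

On toy data drawn from two well-separated multinomials this converges in
a handful of iterations, consistent with the quick convergence the
poster reports, though realistic corpora would of course stress the
numerics far more.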