i don't think it is.
it really doesn't matter whether you are using factor analysis in an
exploratory or a confirmatory manner; the problem with using least
squares is still there.
steve finch's comment about using some sort of gradient approach
combined with a mixed multinomial model is exactly right. i
implemented a version of the EM algorithm some years ago to do just
this, but have never gotten around to testing it on realistically
sized corpora. my impression from test cases was that convergence
happened *very* quickly, but very careful numerical coding was
required to handle large examples without running into a variety of
problems related to underflow and round-off error.
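a minimal sketch of what such an EM implementation might look like, assuming
a document-term count matrix as input. the function name and smoothing
constant are my own; the log-space E-step (a log-sum-exp style shift) is one
standard way to sidestep the underflow problems mentioned above:

```python
import numpy as np

def em_multinomial_mixture(counts, k, iters=50, seed=0):
    """EM for a mixture of k multinomials over document-term counts.

    counts: (n_docs, n_words) nonnegative count matrix.
    returns (pi, phi, resp): mixing weights, component word
    distributions, and per-document posterior topic probabilities.
    """
    rng = np.random.default_rng(seed)
    n, v = counts.shape
    pi = np.full(k, 1.0 / k)
    phi = rng.dirichlet(np.ones(v), size=k)            # (k, v) word dists
    for _ in range(iters):
        # E-step in log space to avoid underflow on long documents
        log_lik = counts @ np.log(phi).T + np.log(pi)  # (n, k)
        log_lik -= log_lik.max(axis=1, keepdims=True)  # shift before exp
        resp = np.exp(log_lik)
        resp /= resp.sum(axis=1, keepdims=True)        # posteriors p(k|doc)
        # M-step: reestimate mixing weights and word distributions
        pi = resp.mean(axis=0)
        phi = resp.T @ counts + 1e-9                   # smooth: keep log finite
        phi /= phi.sum(axis=1, keepdims=True)
    return pi, phi, resp
```

the `resp` matrix here is exactly the reduced-dimensional representation
discussed below: one row of topic probabilities per document.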
it should be noted that the result of such an exercise is a *very*
different animal from the result of factor analysis. the reduced
dimensional representation becomes a list of probabilities that a
document was produced by each of the derived multinomial
distributions. this means that the dot product between the reduced
representations of two documents can be interpreted as the probability
that the documents are about the same topic, which is very nice. it is
also possible to perform a sort of likelihood ratio test to
determine which words or other features are significantly more or less
common in a particular multinomial distribution.
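both ideas can be sketched in a few lines. the function names are mine, and
the word-level score below is a simplified per-word log-likelihood ratio
against a background distribution, not necessarily the exact test the
original implementation used:

```python
import numpy as np

def same_topic_prob(r1, r2):
    # r1, r2: posterior topic probabilities for two documents.
    # assuming each document was generated by one component,
    # sum_k p(k|d1) * p(k|d2) is the probability that both
    # documents came from the same component.
    return float(np.dot(r1, r2))

def distinctive_words(phi_k, background, counts_k, n_top=10):
    # per-word log-likelihood ratio of component k against a
    # corpus-wide background distribution; large positive scores
    # mark words markedly more common in that component.
    llr = counts_k * (np.log(phi_k) - np.log(background))
    return np.argsort(llr)[::-1][:n_top]
```

for example, two documents assigned to the same topic with certainty give a
dot product of 1, while two documents with uniform posteriors over k topics
give 1/k.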
another interpretation of this method is as a fuzzy form of
k-means clustering with an unusual metric.