Clustering algorithms for web applications: A Solution to Plato's Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge

The subject of this paper, published in 1997, is at first impression perhaps not as directly relevant to my project as the papers mentioned previously. Its emphasis is on learning and the acquisition of knowledge, viewed from the standpoint of philosophical science; however, it openly employs Latent Semantic Analysis (LSA) for its experiments. In doing so, Landauer and Dumais provide a thorough review of LSA and its application in real experiments.

What is LSA? It is a “high-dimensional linear associative model that embodies no human knowledge beyond its general learning mechanism” (p.211). Alternatively, LSA can be described in its “bare mathematical formalism” as the “singular-value-decomposition matrix model” (p.218).

The singular-value-decomposition (SVD) realization of LSA is a “general method for linear decomposition of a matrix into independent principal components” (p.218). There are a number of ways to weight the matrix entries before SVD is applied; for the experiments conducted for this report, tf-idf was employed.
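As a rough sketch of what that decomposition looks like, here it is in standard linear-algebra notation. The symbols X, U, Σ, V and the truncation rank k are my own choice of notation, not necessarily the paper's:

```latex
% X is the term-by-context matrix (m terms, n contexts), r its rank.
% U and V have orthonormal columns; \Sigma is diagonal with
% singular values \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0.
X = U \Sigma V^{\mathsf{T}}

% Keeping only the k largest singular values (k < r) gives the
% reduced-dimension approximation that LSA works with:
X_k = U_k \Sigma_k V_k^{\mathsf{T}} = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{\mathsf{T}}
```

Among all matrices of rank at most k, X_k is the closest to X in the least-squares sense, which is why dropping the small singular values acts as a form of compression.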

Term-frequency inverse-document-frequency (tf-idf) is a weighting formula that scores how important a term is within a document relative to how common that term is across the whole corpus. It should be noted that only terms that occurred in “at least two samples” are of interest (p.218). This reduces the size of the data that is passed to SVD.
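To make the weighting concrete, here is a minimal sketch of one common tf-idf variant (length-normalised term frequency multiplied by a natural-log inverse document frequency), together with the "at least two samples" filter. The function name `tf_idf`, the parameter `min_doc_count`, the toy sentences, and the documents-as-rows layout are all my own assumptions for illustration; this is not the exact formula or data used in the paper.

```python
import math
from collections import Counter


def tf_idf(docs, min_doc_count=2):
    """Return (vocab, matrix) of tf-idf weights, one row per document.

    docs is a list of token lists. Terms that appear in fewer than
    min_doc_count documents are dropped before weighting.
    """
    n_docs = len(docs)
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # Keep only terms seen in enough documents (hypothetical threshold).
    vocab = sorted(t for t, df in doc_freq.items() if df >= min_doc_count)

    matrix = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid dividing by zero on an empty document
        row = [
            (counts[t] / total) * math.log(n_docs / doc_freq[t])
            for t in vocab
        ]
        matrix.append(row)
    return vocab, matrix


# Toy corpus, invented purely for illustration.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat chased the dog".split(),
]
vocab, weights = tf_idf(docs)
```

In this toy corpus "mat", "log" and "chased" each occur in only one document, so they are filtered out, while "the" occurs in every document and therefore receives a weight of zero. The resulting matrix is what would then be handed to SVD.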

LSA is then employed in numerous experiments as a means of mimicking human learning.

Tuesday, July 31, 2007
