The Author-Topic Model

The author-topic model is a generative model for authors and documents that reduces the generation of a document to a simple series of probabilistic steps. Each author is associated with a mixture over topics, and each topic is a multinomial distribution over words. The words in a collaborative paper are assumed to be generated from a mixture of the authors' topic mixtures. The results presented on this webpage are extracted from a single MCMC sample for each of three models: a 300-topic model for CiteSeer, a 200-topic model for Enron emails, and a 100-topic model for NIPS papers.
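
The probabilistic steps described above can be illustrated with a minimal Python sketch. It is not the code used to produce the results on this page. It assumes that, for each word, one of the document's authors is chosen uniformly at random, a topic is drawn from that author's topic mixture, and a word is drawn from that topic's word distribution. The function name generate_document and the toy authors, topics, and probabilities are hypothetical and chosen only for illustration.

```python
import random

# Toy parameters, invented for illustration only (not taken from any data set).
# Each topic is a multinomial distribution over words.
TOPIC_WORDS = [
    {"network": 0.5, "learning": 0.3, "neuron": 0.2},   # topic 0
    {"email": 0.6, "meeting": 0.3, "schedule": 0.1},    # topic 1
]
# Each author is a mixture over the topics above.
AUTHOR_TOPICS = {
    "author_a": [0.9, 0.1],
    "author_b": [0.2, 0.8],
}
AUTHORS = ["author_a", "author_b"]


def generate_document(authors, author_topics, topic_words, num_words, rng=None):
    """Sample (author, topic, word) triples for one document.

    Assumption: the author responsible for each word is chosen uniformly
    at random from the document's author list.
    """
    if rng is None:
        rng = random.Random()
    tokens = []
    for _ in range(num_words):
        # Step 1: choose one of the document's authors (uniform, by assumption).
        author = rng.choice(authors)
        # Step 2: draw a topic from that author's mixture over topics.
        weights = author_topics[author]
        topic = rng.choices(range(len(weights)), weights=weights)[0]
        # Step 3: draw a word from that topic's distribution over words.
        vocab = list(topic_words[topic])
        probs = [topic_words[topic][w] for w in vocab]
        word = rng.choices(vocab, weights=probs)[0]
        tokens.append((author, topic, word))
    return tokens


if __name__ == "__main__":
    for token in generate_document(AUTHORS, AUTHOR_TOPICS, TOPIC_WORDS, 10):
        print(token)
```

Fitting the model runs this process in reverse: given only the words and author lists of each document, the author and topic assignments for every word are inferred, here by MCMC sampling.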

Information about the Data Sets

Author-Topic Modeling Results

Applications of the Author-Topic Model