The Author-Topic Model
The author-topic model is a generative model for authors and documents
that reduces the generation of documents to a simple series of probabilistic
steps. Each author is associated with a mixture over topics, where topics
are multinomial distributions over words. The
words in a collaborative paper are assumed to be generated from a mixture of
its authors' topic mixtures.
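The generative steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used to produce the results on this page. It assumes that each word's author is chosen uniformly from the paper's authors, and that the author-topic and topic-word distributions are drawn from symmetric Dirichlet priors. The vocabulary, author names, hyperparameter values, and function names are all hypothetical.

```python
import random


def sample_dirichlet(rng, alpha, size):
    """Draw from a symmetric Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(rng, doc_authors, author_topics, topic_words, vocab, n_words):
    """Generate one document as a list of (author, topic, word) triples."""
    topic_ids = range(len(topic_words))
    tokens = []
    for _ in range(n_words):
        # Step 1: pick one of the document's authors (assumed uniform).
        author = rng.choice(doc_authors)
        # Step 2: pick a topic from that author's mixture over topics.
        topic = rng.choices(topic_ids, weights=author_topics[author])[0]
        # Step 3: pick a word from that topic's distribution over words.
        word = rng.choices(vocab, weights=topic_words[topic])[0]
        tokens.append((author, topic, word))
    return tokens


# Toy setup: every name and value below is made up for illustration.
rng = random.Random(0)
vocab = ["neuron", "kernel", "prior", "email", "meeting", "query"]
authors = ["author_a", "author_b"]
n_topics = 3

# One mixture over topics per author, one distribution over words per topic.
author_topics = {a: sample_dirichlet(rng, 0.5, n_topics) for a in authors}
topic_words = [sample_dirichlet(rng, 0.5, len(vocab)) for _ in range(n_topics)]

doc = generate_document(rng, authors, author_topics, topic_words, vocab, n_words=20)
for author, topic, word in doc[:5]:
    print(author, topic, word)
```

Fitting the model reverses this process: given only the observed words and author lists, the author and topic assignment of each word is inferred, for example by MCMC sampling.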
The results presented on this webpage are extracted from a
single MCMC sample, for a 300-topic model for CiteSeer, a 200-topic model for Enron
emails, and a 100-topic model for NIPS papers.
Information about the Data Sets
Author-Topic Modeling Results
Applications of the Author-Topic Model
References
- Finding scientific topics, T. Griffiths and M. Steyvers, Proceedings of the National Academy of Sciences, 2004.
- The author-topic model for authors and documents, M. Rosen-Zvi, T. Griffiths, M. Steyvers, P. Smyth, Proceedings of the 20th Annual Conference on Uncertainty in Artificial Intelligence, 2004.
- Probabilistic author-topic models for information discovery, M. Steyvers, P. Smyth, M. Rosen-Zvi, T. Griffiths, Proceedings of the Tenth ACM SIGKDD Conference, 2004.