The Author-Topic Model

The author-topic model is a generative model for authors and documents that reduces the generation of a document to a simple series of probabilistic steps. Each author is associated with a mixture of topics, and the words of a multi-author paper are assumed to be drawn from a mixture of its authors' topic mixtures. The model is applied to a collection of 1.7K NIPS conference papers and 160K CiteSeer abstracts. This webpage provides an online query interface to the model for interactive exploration, for example asking which topics a given author writes about, along with other applications. Most of the data currently presented on this webpage is extracted from a single MCMC sample per dataset: one solution of 300 topics from the CiteSeer dataset and one solution of 100 topics from the NIPS dataset. Both samples are available for queries in the browser.
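The generative steps described above can be illustrated with a small sketch. It is not the project's code: the function names (`generate_document`, `top_topics`), the toy authors, and the toy probabilities are all hypothetical, and it assumes that each word's author is picked uniformly at random from the paper's author list. The sketch samples words for a two-author paper and then answers the kind of query the interface supports, namely which topics a given author writes about.

```python
import random


def generate_document(doc_authors, author_topics, topic_words, n_words, seed=0):
    """Sample (author, topic, word) triples for one document.

    Assumption: each word's author is chosen uniformly from doc_authors.
    """
    rng = random.Random(seed)
    triples = []
    for _ in range(n_words):
        # Step 1: pick one of the document's authors.
        author = rng.choice(doc_authors)
        # Step 2: pick a topic from that author's topic mixture.
        weights = author_topics[author]
        topic = rng.choices(range(len(weights)), weights=weights)[0]
        # Step 3: pick a word from that topic's word distribution.
        word_dist = topic_words[topic]
        word = rng.choices(list(word_dist), weights=list(word_dist.values()))[0]
        triples.append((author, topic, word))
    return triples


def top_topics(author, author_topics, k=3):
    """Return the indices of the k most probable topics for an author."""
    weights = author_topics[author]
    ranked = sorted(range(len(weights)), key=lambda t: weights[t], reverse=True)
    return ranked[:k]


# Toy parameters (hypothetical): two authors, three topics.
author_topics = {
    "author_a": [0.9, 0.1, 0.0],
    "author_b": [0.0, 0.2, 0.8],
}
topic_words = [
    {"neuron": 0.6, "spike": 0.4},
    {"bayes": 0.5, "prior": 0.5},
    {"kernel": 0.7, "margin": 0.3},
]

doc = generate_document(["author_a", "author_b"], author_topics, topic_words, n_words=8)
favourite = top_topics("author_a", author_topics, k=2)
```

In the real model the author-topic and topic-word distributions are not fixed by hand as they are here; they are estimated from the document collection, and the page reports results from a single MCMC sample.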
References

Finding Scientific Topics
The Author-Topic Model for Authors and Documents
Probabilistic author-topic models for information discovery

Credits

This is a joint research project by Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, and Thomas Griffiths. We would like to thank Steve Lawrence and C. Lee Giles for kindly providing us with the CiteSeer data used.

Last updated: 2008-09-28. For comments and questions, contact Michal Rosen-Zvi, email: michal at il.ibm.com