Run a set of Latent Dirichlet Allocation models

For a given dataset consisting of counts of words across multiple documents in a corpus, fit multiple Latent Dirichlet Allocation (LDA) models (using the Variational Expectation Maximization (VEM) algorithm; Blei et al. 2003) to account for [1] uncertainty in the number of latent topics and [2] the impact of initial values in the estimation procedure.

LDA_set is a list wrapper of LDA in the topicmodels package (Grun and Hornik 2011).

check_LDA_set_inputs checks that all of the inputs are proper for LDA_set (that the table of observations is conformable to a matrix of integers, the number of topics is an integer, the number of seeds is an integer and the controls list is proper).
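To make "conformable to a matrix of integers" concrete, the following is a minimal base R sketch of that kind of test. The helper `is_integer_conformable` is hypothetical and written only for illustration; it is not the package's `check_document_term_table`, whose exact rules may differ.

```r
# Hypothetical helper (an assumption for illustration, not part of the package):
# TRUE when x can be coerced to a numeric matrix of whole numbers without NAs.
is_integer_conformable <- function(x) {
  m <- tryCatch(as.matrix(x), error = function(e) NULL)
  if (is.null(m) || !is.numeric(m) || anyNA(m)) {
    return(FALSE)
  }
  all(m %% 1 == 0)
}

# Made-up counts: rows are documents, columns are terms.
counts <- data.frame(sp1 = c(3, 0, 5), sp2 = c(1, 2, 0))

ok_counts  <- is_integer_conformable(counts)      # whole-number counts
bad_counts <- is_integer_conformable(counts / 2)  # contains fractional values
```

Under these assumptions, a table stored as doubles still qualifies as long as every entry is a whole number, while fractional or non-numeric entries do not.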

LDA_set(document_term_table, topics = 2, nseeds = 1,
  control = list())

check_LDA_set_inputs(document_term_table, topics, nseeds, control)

Arguments

document_term_table	Table of observation count data (rows: documents, columns: terms). May be of class `matrix` or `data.frame` but must be conformable to a matrix of integers, as verified by `check_document_term_table`.
topics	Vector of the number of topics to evaluate for each model. Must be conformable to `integer` values.
nseeds	Number of seeds (replicate starts) to use for each value of `topics`. Must be conformable to an `integer` value.
control	A `list` of parameters to control the running and selecting of LDA models. Values not input assume default values set by `LDA_set_control`. Values for running the LDAs replace the defaults in `LDAcontrol` (see `LDA`); if `seed` is given, it will be overwritten, so use `iseed` instead.
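Because `nseeds` replicate starts are used for each value of `topics`, the size of the model set grows as the product of the two. The base R sketch below only enumerates the combinations; the column names `seed_index` and `k` are illustrative assumptions, and the actual seed values the package assigns are not shown here.

```r
# Illustrative inputs (not defaults of LDA_set).
topics <- 2:4
nseeds <- 2

# One row per (replicate start, number of topics) combination.
runs <- expand.grid(seed_index = seq_len(nseeds), k = topics)

# Number of models implied by these inputs.
n_models <- nrow(runs)
```

Here `n_models` equals `length(topics) * nseeds`, which is a quick way to gauge run time before fitting a large set.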

Value

LDA_set: list (class: LDA_set) of LDA models (class: LDA_VEM). check_LDA_set_inputs: an error message is thrown if any input is improper, otherwise NULL.

References

Blei, D. M., A. Y. Ng, and M. I. Jordan. 2003. Latent Dirichlet Allocation. Journal of Machine Learning Research 3:993-1022.

Grun B. and K. Hornik. 2011. topicmodels: An R Package for Fitting Topic Models. Journal of Statistical Software 40:13.

Examples

  data(rodents)
  lda_data <- rodents$document_term_table
  r_LDA <- LDA_set(lda_data, topics = 2, nseeds = 2)
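As a possible extension of the example above, the sketch below passes a vector to `topics` so that several numbers of topics are evaluated in one call. It assumes the package providing `LDA_set` and `rodents` is LDATS and guards on its availability; the object name `r_LDA_multi` is made up for illustration.

```r
# Assumption: LDA_set and the rodents data come from the LDATS package.
has_ldats <- requireNamespace("LDATS", quietly = TRUE)

if (has_ldats) {
  data(rodents, package = "LDATS")
  lda_data <- rodents$document_term_table

  # Two candidate numbers of topics, two replicate starts each.
  r_LDA_multi <- LDATS::LDA_set(lda_data, topics = 2:3, nseeds = 2)

  # Inspect what came back: a list of fitted models.
  print(class(r_LDA_multi))
  print(length(r_LDA_multi))
}
```

Checking the class and length of the result is a simple way to confirm that the set contains the models you expected before moving on to model selection.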
