prep_cpts initializes each of the chains using a draw of change points
from the available times (i.e., assuming a uniform prior). The best-fitting
draw (by likelihood) is put in the focal chain, and each successively
worse fit is placed into the next hotter chain. update_cpts
updates the change points after every iteration of the ptMCMC algorithm.
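The initialization described above can be sketched in base R. This is a minimal illustration, not the package's implementation: toy_loglik is a hypothetical stand-in for the model likelihood, and the number of chains (ntemps) and the simulated data are assumed values chosen for the example.

```r
# Hypothetical stand-in for the model log-likelihood (NOT the LDATS likelihood):
# scores a set of change points by the negative within-segment sum of squares.
toy_loglik <- function(cps, time, y) {
  seg <- cut(time, breaks = c(-Inf, sort(cps), Inf))
  -sum(tapply(y, seg, function(v) sum((v - mean(v))^2)), na.rm = TRUE)
}

set.seed(1)
time <- 1:100                          # available times
y <- c(rnorm(50, 0), rnorm(50, 3))     # toy series with a shift at time 50
ntemps <- 6                            # number of chains (assumed value)
nchangepoints <- 1

# One uniform draw of change points per chain
draws <- replicate(ntemps, sort(sample(time, nchangepoints)), simplify = FALSE)
lls <- vapply(draws, toy_loglik, numeric(1), time = time, y = y)

# Best fit goes first (focal chain); worse fits go to progressively hotter chains
ord <- order(lls, decreasing = TRUE)
changepts <- matrix(unlist(draws[ord]), nrow = nchangepoints, ncol = ntemps)
lls <- lls[ord]
```

After this runs, column 1 of changepts holds the best-scoring draw and lls is in non-increasing order.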
prep_cpts(data, formula, nchangepoints, timename, weights,
  control = list())
update_cpts(cpts, swaps)
Arguments
data |
data.frame including [1] the time variable (indicated
in timename), [2] the predictor variables (required by
formula), and [3] the multinomial response variable (indicated in
formula), as verified by check_timename and
check_formula. Note that the response variables should be
formatted as a data.frame object named as indicated by the
response entry in the control list, such as gamma
for a standard TS analysis on LDA output.
|
formula |
formula defining the regression relationship between
the change points; see formula. Any
predictor variable included must also be a column in
data, and any (multinomial) response variable must be a set of
columns in data, as verified by check_formula.
|
nchangepoints |
integer corresponding to the number of
change points to include in the model. 0 is a valid input (corresponding
to no change points, so a single time series model), and the current
implementation can reasonably include up to 6 change points. The
number of change points is used to dictate the segmentation of the data
for each continuous model and each LDA model.
|
timename |
character element indicating the time variable
used in the time series. Defaults to "time". The variable must be
integer-conformable or a Date. If the variable named
is a Date, the input is converted to an integer, resulting in a
timestep of 1 day, which is often not the desired behavior.
|
weights |
Optional numeric vector of weights for each
document. Defaults to NULL, translating to an equal weight for
each document. When using multinom_TS in a standard LDATS
analysis, it is advisable to weight the documents by their total size,
as the result of LDA is a matrix of
proportions, which does not account for size differences among documents.
For most models, scaling the weights so that their average is 1 is
most appropriate, and this is accomplished using
document_weights. |
control |
A list of parameters to control the fitting of the
Time Series model, including the parallel tempering Markov Chain
Monte Carlo (ptMCMC) controls. Values not input assume defaults set by
TS_control. |
cpts |
The existing matrix of change points. |
swaps |
Chain configuration after among-temperature swaps. |
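Two of the argument notes above (Date conversion for timename and average-1 scaling for weights) can be illustrated with base R alone. This is a sketch under stated assumptions: the dates and counts are made up, and the scaling shown is the arithmetic the weights description implies, not the source of document_weights.

```r
# A Date converts to a count of days, so consecutive monthly samples end up
# 28 to 31 time steps apart rather than 1 step apart.
sample_dates <- as.Date(c("2020-01-01", "2020-02-01", "2020-03-01"))
day_gaps <- diff(as.integer(sample_dates))   # 31 29

# Made-up document-term counts: 2 documents (rows) by 3 terms (columns)
toy_counts <- matrix(c(10, 5, 5,
                       30, 20, 10), nrow = 2, byrow = TRUE)

# Weight each document by its total size, scaled so the average weight is 1
doc_sizes <- rowSums(toy_counts)                 # 20 60
scaled_weights <- doc_sizes / mean(doc_sizes)    # 0.5 1.5
```

If a 1-day timestep is not wanted, one option is to supply an integer index of sampling periods as the time variable instead of the Date itself, as the Examples do with newmoon.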
Value
list of [1] matrix of change points (rows) for each temperature
(columns) and [2] vector of log-likelihood values for each of the chains.
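The shape of this value, and the way a post-swap configuration can replace it, can be sketched with a toy object. The element names changepts and lls, the swap outcome, and toy_update_cpts are all assumptions made for illustration; they are not taken from the package source.

```r
# Toy state for 3 chains (columns) and 1 change point (row).
# Element names "changepts" and "lls" are assumed for this sketch.
cpts <- list(changepts = matrix(c(40, 55, 70), nrow = 1),
             lls = c(-120, -135, -150))

# Hypothetical outcome of a swap step: chains 1 and 2 exchange states.
new_order <- c(2, 1, 3)
swaps <- list(changepts = cpts$changepts[, new_order, drop = FALSE],
              lls = cpts$lls[new_order])

# Toy update: carry the post-swap configuration into the next iteration.
toy_update_cpts <- function(cpts, swaps) {
  list(changepts = swaps$changepts, lls = swaps$lls)
}
cpts <- toy_update_cpts(cpts, swaps)
```

The point of the sketch is only that the change points and their log-likelihoods move together, so each column of the matrix stays aligned with its entry in the log-likelihood vector.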
Examples
# \donttest{
data(rodents)
document_term_table <- rodents$document_term_table
document_covariate_table <- rodents$document_covariate_table
LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
data <- document_covariate_table
data$gamma <- LDA_models@gamma
weights <- document_weights(document_term_table)
data <- data[order(data[,"newmoon"]), ]
saves <- prep_saves(1, TS_control())
inputs <- prep_ptMCMC_inputs(data, gamma ~ 1, 1, "newmoon", weights,
                             TS_control())
cpts <- prep_cpts(data, gamma ~ 1, 1, "newmoon", weights, TS_control())
ids <- prep_ids(TS_control())
for(i in 1:TS_control()$nit){
  steps <- step_chains(i, cpts, inputs)
  swaps <- swap_chains(steps, inputs, ids)
  saves <- update_saves(i, saves, steps, swaps)
  cpts <- update_cpts(cpts, swaps)
  ids <- update_ids(ids, swaps)
}
# }