The data are simulated from a second order DLM:
\[\begin{align} Y_t &= F \textbf{x}_t + v_t, \quad v_t \sim \mathcal{N}(0, V), \\ \textbf{x}_t &= G \textbf{x}_{t-1} + \textbf{w}_t, \quad \textbf{w}_t \sim \textrm{MVN}(0, W), \\ \textbf{x}_0 &\sim \textrm{MVN}(\textbf{m}_0, C_0). \end{align}\]
The state is two dimensional, so the system noise covariance matrix \(W\) is a \(2 \times 2\) matrix. The observation and system evolution matrices do not depend on time; the observation matrix is \(F = (1 \quad 0)\) and the system evolution matrix is:
\[G = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.\]
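Writing the model out componentwise, with state \(\textbf{x}_t = (x_{1,t}, x_{2,t})^\top\), shows that the first state component follows a locally linear trend whose slope is the second component; the second order polynomial DLM is also known as the linear growth model:
\[\begin{align} Y_t &= x_{1,t} + v_t, \\ x_{1,t} &= x_{1,t-1} + x_{2,t-1} + w_{1,t}, \\ x_{2,t} &= x_{2,t-1} + w_{2,t}. \end{align}\]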
To examine the properties of this model, we first simulate a time series of values from it:
import com.github.jonnylaw.dlm._
import breeze.linalg.{DenseMatrix, DenseVector, diag}

// A second order polynomial DLM, with F = (1 0) and G as above
val mod = Dlm.polynomial(2)

// Parameters: observation variance V, system noise covariance W and
// the mean m0 and covariance C0 of the initial state
val p = DlmParameters(
  DenseMatrix(3.0),
  diag(DenseVector(2.0, 1.0)),
  DenseVector(0.0, 0.0),
  diag(DenseVector(100.0, 100.0))
)

// Simulate 1,000 observations at a regular unit time interval
val data = Dlm.simulateRegular(mod, p, 1.0).
  steps.
  take(1000).
  toVector
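Each element of `data` is a pair of a simulated observation and the latent state that generated it, which is why `data.map(_._1)` is used below to keep only the observations. A quick sanity check of the simulation:

// Print the first few simulated (observation, state) pairs
data.take(5).foreach(println)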
The figure below shows simulated values from the second order DLM with parameters \((V, W, \textbf{m}_0, C_0) = (3.0, \operatorname{diag}(2.0, 1.0), (0.0, 0.0)^\top, \operatorname{diag}(100.0, 100.0))\).
Kalman filtering can be performed to learn the posterior distribution of the states given the observations:
import cats.implicits._

// Run the Kalman filter over the simulated observations
val filtered = KalmanFilter.filterDlm(mod, data.map(_._1), p)
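For reference, the Kalman filter consists of the standard prediction and update recursions. Given the posterior at time \(t-1\), \(\textbf{x}_{t-1} \mid y_{1:t-1} \sim \textrm{MVN}(\textbf{m}_{t-1}, C_{t-1})\):
\[\begin{align} \textbf{a}_t &= G \textbf{m}_{t-1}, & R_t &= G C_{t-1} G^\top + W, \\ f_t &= F \textbf{a}_t, & Q_t &= F R_t F^\top + V, \\ \textbf{m}_t &= \textbf{a}_t + R_t F^\top Q_t^{-1} (y_t - f_t), & C_t &= R_t - R_t F^\top Q_t^{-1} F R_t. \end{align}\]
The filtering distribution at each time is then \(\textbf{x}_t \mid y_{1:t} \sim \textrm{MVN}(\textbf{m}_t, C_t)\).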
The parameter posterior distributions can be learned using Gibbs sampling. The state evolution distribution and the observation distribution are Gaussian, with unknown covariance \(W\) and unknown variance \(V\) respectively. The system noise covariance \(W\) is assumed to be a diagonal \(2 \times 2\) matrix, so each unknown variance is given an Inverse Gamma prior; the Inverse Gamma distribution is conjugate to the Normal distribution with known mean and unknown variance, so the full conditionals have a closed form (written out after the code below). To perform Gibbs sampling using the Bayesian DLMs package:
// Gibbs sample the parameter posterior, placing Inverse Gamma priors
// on the observation variance and the diagonal elements of W
val iters = GibbsSampling.sample(
  mod,
  InverseGamma(4.0, 9.0),
  InverseGamma(5.0, 8.0),
  p,
  data.map(_._1))
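As noted above, conditional on a sampled state path \(\textbf{x}_{0:n}\) the Inverse Gamma priors are conjugate, so each variance has a closed-form full conditional. For example, with prior \(V \sim \textrm{InverseGamma}(\alpha, \beta)\) and \(n\) observations:
\[V \mid \textbf{x}_{0:n}, y_{1:n} \sim \textrm{InverseGamma}\left(\alpha + \frac{n}{2},\; \beta + \frac{1}{2} \sum_{t=1}^{n} (y_t - F \textbf{x}_t)^2 \right),\]
and analogously for each diagonal element of \(W\), using the state increments \(\textbf{x}_t - G \textbf{x}_{t-1}\).

The sampler returns a lazy stream of iterations; to analyse the chain in R below, the sampled variances can be written to a CSV file. This is a minimal sketch: the `steps` accessor mirrors the simulation code above, but the field names `p`, `v` and `w` on each iteration are assumptions about the sampler's state type, not confirmed package API:

import java.io.PrintWriter

// Draw 10,000 iterations and write the sampled variances to CSV
// (s.p.v and s.p.w are assumed field names; check the package API)
val out = new PrintWriter("examples/data/second_order_dlm_gibbs.csv")
out.println("V,W1,W2")
iters.steps.take(10000).foreach { s =>
  out.println(s"${s.p.v(0, 0)},${s.p.w(0, 0)},${s.p.w(1, 1)}")
}
out.close()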
The iterations are then summarised in R, using coda, ggmcmc, patchwork and the tidyverse:

library(tidyverse)
library(coda)
library(ggmcmc)
library(patchwork)

# Read in the Gibbs iterations and record the true parameter values
gibbs_iters = read_csv("../examples/data/second_order_dlm_gibbs.csv")
actual_values = tibble(
  Parameter = c("V", "W1", "W2"),
  actual_value = c(3.0, 2.0, 1.0)
)
# Convert the iterations into a long-format tibble for plotting
params = ggs(mcmc(gibbs_iters))
# Traceplots of each parameter, with the true values marked in red
p1 = params %>%
  inner_join(actual_values) %>%
  ggplot(aes(x = Iteration, y = value)) +
  geom_line() +
  geom_hline(aes(yintercept = actual_value), colour = "#ff0000") +
  facet_wrap(~Parameter, scales = "free_y")

# Autocorrelation of each parameter chain
p2 = params %>%
  ggs_autocorrelation()

# Stack the two diagnostic plots vertically
p1 + p2 + plot_layout(ncol = 1)
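If the chains have converged, the traceplots should oscillate stably around the true values marked in red, and the autocorrelation of each chain should decay quickly with increasing lag; slowly decaying autocorrelation would suggest running the sampler for longer or thinning the output.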