MirrorVI.jl
Documentation for MirrorVI.jl
MirrorVI.ConstantDistribution — Type

Univariate constant distribution.
MirrorVI.compute_logpdf_prior — Method

compute_logpdf_prior(theta::ComponentArray; params_dict::OrderedDict)

Compute the sum of the log-pdf of the prior distributions.

# Arguments
- `theta::ComponentArray`: ComponentArray with the components of the parameters sample, one component for each prior.
- `params_dict::OrderedDict`: OrderedDict generated using the MirrorVI.utils functions, containing the prior details.

# Returns
- `Real`: The sum of the log-pdf components.
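The idea can be sketched with plain Distributions.jl building blocks. This is an illustrative sketch, not MirrorVI's internals: the `priors` OrderedDict below is a hypothetical stand-in for the prior information carried by `params_dict`.

```julia
# Illustrative sketch (not MirrorVI's internals): sum prior log-pdfs
# over the named components of a parameter sample.
using Distributions, ComponentArrays, OrderedCollections

# Hypothetical prior specification: one distribution per parameter block.
priors = OrderedDict(
    "beta" => Normal(0.0, 1.0),
    "sigma" => Exponential(1.0),
)

theta = ComponentArray(beta = [0.1, -0.3], sigma = 0.5)

# Sum the log-pdf over every scalar entry of each component.
logprior = sum(
    sum(logpdf.(dist, getproperty(theta, Symbol(name))))
    for (name, dist) in priors
)
```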
MirrorVI.cyclical_polynomial_decay — Function

cyclical_polynomial_decay(n_iter::Int64, n_cycles::Int64=2)

Generate a cyclical polynomial decay schedule over n_iter iterations, divided into n_cycles cycles.

This function creates a learning rate schedule where the polynomial decay is applied cyclically. Each cycle consists of steps_per_cycle = n_iter / n_cycles steps, and the polynomial_decay function is applied within each cycle.

# Arguments
- `n_iter::Int64`: The total number of iterations for the schedule.
- `n_cycles::Int64=2`: The number of cycles to divide the iterations into. Default is `2`.

# Returns
- `Vector{Float32}`: A vector of decayed values representing the learning rate schedule.

# Examples
```julia
julia> schedule = cyclical_polynomial_decay(10, 2)  # 10 iterations, 2 cycles
10-element Vector{Float32}:
 0.17782794
 0.125
 0.09765625
 0.080566406
 0.06871948
 0.17782794
 0.125
 0.09765625
 0.080566406
 0.06871948

julia> length(schedule)  # Total number of iterations
10
```
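The cyclical construction can be sketched in a few lines. This is an illustrative re-implementation under stated assumptions (in particular, how the step index is mapped inside each cycle), not the package's exact code, so the values it produces need not match the example output above:

```julia
# Sketch: repeat a per-cycle polynomial decay across n_cycles cycles.
# The within-cycle indexing (t = 1:steps_per_cycle) is an assumption.
poly_decay(t; a=1f0, b=0.01f0, gamma=0.75f0) = a * (b + t)^(-gamma)

function cyclical_schedule(n_iter::Int, n_cycles::Int=2)
    steps_per_cycle = div(n_iter, n_cycles)
    # Same polynomial decay within each cycle, then restart (repeat).
    cycle = Float32[poly_decay(t) for t in 1:steps_per_cycle]
    return repeat(cycle, n_cycles)
end

schedule = cyclical_schedule(10, 2)
length(schedule)  # 10
```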
MirrorVI.elbo — Method

elbo(
z::AbstractArray;
y::AbstractArray,
X::AbstractArray,
ranges_z::AbstractArray,
vi_family_array::AbstractArray,
random_weights::AbstractArray,
model,
theta_axes::Tuple,
log_likelihood,
log_prior=zero,
n_samples::Int64=1,
n_repeated_measures::Int64=1
)
Compute the Evidence Lower Bound (ELBO) for a variational inference problem.

The ELBO is a key quantity in variational inference, used to approximate the posterior distribution of the model parameters. This function computes the ELBO by:
- Sampling from the variational distribution.
- Evaluating the log-likelihood and log-prior of the model, optionally accounting for repeated measurements when required.
- Adding the entropy of the variational distribution.
# Arguments
- `z::AbstractArray`: The variational parameters used to define the variational distribution.
- `y::AbstractArray`: The observed data (target values).
- `X::AbstractArray`: The input data (features).
- `ranges_z::AbstractArray`: Specifies how `z` is divided among the parameters of the variational distribution.
- `vi_family_array::AbstractArray`: An array of functions defining the variational family for each parameter.
- `random_weights::AbstractArray`: Boolean array of the same dimension as `theta`, stating whether each parameter is random or not.
- `model`: A function representing the model, which takes parameters and input data `X` (keyword argument) and returns a tuple with the predictions.
- `theta_axes::ComponentArrays.Axes`: The axes for constructing a `ComponentArray` from the sampled parameters.
- `log_likelihood`: A function that computes the log-likelihood of the observed data given the model predictions.
- `log_prior=zero`: A function that computes the log-prior of the parameters. Default is `zero` (no prior).
- `n_samples::Int64=1`: The number of Monte Carlo samples to use for approximating the ELBO. Default is `1`.
- `n_repeated_measures::Int64=1`: The number of repeated measurements (e.g., for longitudinal data). Default is `1`.

# Returns
- `Float64`: The negative ELBO value (to be minimized).
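The overall shape of the computation can be illustrated with a self-contained toy version for a mean-field Gaussian variational family. Everything here (`toy_neg_elbo`, the linear model, the closed-form entropy term) is a hedged sketch, not MirrorVI's `elbo`:

```julia
# Monte Carlo ELBO sketch: ELBO ≈ (1/S) Σ_s [log p(y|θ_s) + log p(θ_s)] + H[q],
# with q a diagonal Gaussian parameterised by z = (μ, logσ).
function toy_neg_elbo(z; y, X, log_likelihood, log_prior, n_samples=1)
    d = length(z) ÷ 2
    μ, logσ = z[1:d], z[d+1:end]
    σ = exp.(logσ)
    elbo = 0.0
    for _ in 1:n_samples
        θ = μ .+ σ .* randn(d)                      # reparameterised sample
        elbo += log_likelihood(y, X * θ) + log_prior(θ)
    end
    elbo /= n_samples
    elbo += sum(logσ) + 0.5d * log(2π * ℯ)          # diagonal-Gaussian entropy
    return -elbo                                    # negative ELBO, to minimise
end

# Toy usage: linear-Gaussian model with a standard normal prior.
X = randn(20, 2); θ_true = [1.0, -2.0]; y = X * θ_true .+ 0.1 .* randn(20)
loglik(y, pred) = -0.5 * sum(abs2, y .- pred) / 0.1^2
logpri(θ) = -0.5 * sum(abs2, θ)
z0 = vcat(θ_true, fill(-2.0, 2))
nelbo = toy_neg_elbo(z0; y=y, X=X, log_likelihood=loglik, log_prior=logpri, n_samples=4)
```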
MirrorVI.get_parameters_axes — Method

get_parameters_axes(params_dict::OrderedDict)

Generate the parameter axes as needed by the ComponentArrays library. This function processes a dictionary of parameters (`params_dict`) and constructs a prototype array with the same structure as the parameters. The axes of this prototype array are returned and can be used to initialize a `ComponentArray` with the correct structure.

# Arguments
- `params_dict::OrderedDict`: OrderedDict generated using the MirrorVI.utils functions, containing the prior details.

# Returns
- `Tuple`: The axes of the prototype array, which can be used to initialize a `ComponentArray` with the same structure as the parameters.
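For context, this is how such axes are typically used with ComponentArrays.jl (a generic sketch; the `proto` layout below is hypothetical, since MirrorVI builds the prototype from `params_dict` internally):

```julia
using ComponentArrays

# A hypothetical prototype with the parameter structure.
proto = ComponentArray(beta = zeros(3), sigma = 0.0)
theta_axes = getaxes(proto)          # the Tuple of axes

# Rebuild a structured view from a flat parameter vector:
flat = [0.1, 0.2, 0.3, 1.5]
theta = ComponentArray(flat, theta_axes...)
theta.beta   # the first three entries, as the `beta` block
theta.sigma  # the last entry
```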
MirrorVI.hybrid_training_loop — Method

hybrid_training_loop(;
z::AbstractArray,
y::AbstractArray,
X::AbstractArray,
params_dict::OrderedDict,
model,
log_likelihood,
log_prior=zero,
n_iter::Int64,
optimiser::Optimisers.AbstractRule,
save_all::Bool=false,
use_noisy_grads::Bool=false,
elbo_samples::Int64=1,
lr_schedule=nothing,
n_repeated_measures::Int64=1,
dropout::Bool=false,
start_dropout_iter::Int=0
)
Run a training loop for variational inference, combining gradient-based optimization with optional noise injection and dropout (experimental).
This function performs variational inference by minimizing the Evidence Lower Bound (ELBO) using gradient-based optimization. It supports:
- Noisy gradients for exploration.
- Dropout for regularization.
- Cyclical learning rate schedules.
- Saving intermediate results for analysis.
# Arguments
- `z::AbstractArray`: Initial variational parameters.
- `y::AbstractArray`: Observed data (target values).
- `X::AbstractArray`: Input data (features).
- `params_dict::OrderedDict`: Ordered dictionary defined through the MirrorVI.utils functions, containing configuration for the variational family, parameter ranges, and other settings. Must include:
  - `"vi_family_array"`: Array of functions defining the variational family for each parameter.
  - `"ranges_z"`: Specifies how `z` is divided among the parameters of the variational distribution.
  - `"random_weights"`: Weights used for sampling from the variational distribution.
  - `"noisy_gradients"`: Standard deviation of the noise added to gradients (if `use_noisy_grads=true`).
- `model`: A function representing the model, which takes parameters and input data `X` and returns predictions.
- `log_likelihood`: A function that computes the log-likelihood of the observed data given the model predictions.
- `log_prior=zero`: A function that computes the log-prior of the parameters. Default is `zero` (no prior).
- `n_iter::Int64`: Number of training iterations.
- `optimiser::Optimisers.AbstractRule`: Optimiser to use for updating `z` (e.g., `DecayedADAGrad`).
- `save_all::Bool=false`: If `true`, saves the trace of `z` across all iterations. Default is `false`.
- `use_noisy_grads::Bool=false`: If `true`, adds noise to the gradients during training. Default is `false`.
- `elbo_samples::Int64=1`: Number of Monte Carlo samples to use for approximating the ELBO. Default is `1`.
- `lr_schedule=nothing`: Learning rate schedule (e.g., from `cyclical_polynomial_decay`). Default is `nothing`.
- `n_repeated_measures::Int64=1`: Number of repeated measurements (e.g., for time-series data). Default is `1`.
- `dropout::Bool=false`: If `true`, applies dropout to `z` during training. Default is `false`.
- `start_dropout_iter::Int=0`: Iteration at which to start applying dropout. Default is `0`.
# Returns
- `Dict`: A dictionary containing:
  - `"loss_dict"`: A dictionary with keys:
    - `"loss"`: Array of ELBO values across iterations.
    - `"z_trace"`: Trace of `z` across iterations (if `save_all=true`).
  - `"best_iter_dict"`: A dictionary with keys:
    - `"best_loss"`: Best ELBO value achieved.
    - `"best_z"`: Variational parameters corresponding to the best ELBO.
    - `"final_z"`: Final variational parameters after training.
    - `"best_iter"`: Iteration at which the best ELBO was achieved.
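The mechanics of the loop (gradient step, optional gradient noise, learning-rate schedule, best-iteration tracking) can be sketched with a toy objective. `toy_training_loop` and its loss are hypothetical stand-ins; only the Optimisers.jl calls (`setup`, `update`, `adjust!`) are the real API:

```julia
using Optimisers

loss(z) = sum(abs2, z .- 3.0)          # toy objective in place of the ELBO
grad(z) = 2 .* (z .- 3.0)              # its gradient, computed by hand

function toy_training_loop(z; n_iter=200, use_noisy_grads=false,
                           lr_schedule=nothing, noise_sd=0.1)
    state = Optimisers.setup(Optimisers.Adam(0.1), z)
    best_loss, best_z, best_iter = Inf, copy(z), 0
    losses = Float64[]
    for i in 1:n_iter
        g = grad(z)
        if use_noisy_grads
            g = g .+ noise_sd .* randn(length(g))      # exploration noise
        end
        if lr_schedule !== nothing
            Optimisers.adjust!(state, lr_schedule[i])  # e.g. cyclical decay
        end
        state, z = Optimisers.update(state, z, g)
        l = loss(z)
        push!(losses, l)
        if l < best_loss                               # best-iteration tracking
            best_loss, best_z, best_iter = l, copy(z), i
        end
    end
    return Dict("loss" => losses, "best_loss" => best_loss,
                "best_z" => best_z, "best_iter" => best_iter, "final_z" => z)
end

out = toy_training_loop(zeros(2))
```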
MirrorVI.polynomial_decay — Method

polynomial_decay(t::Int64; a::Float32=1f0, b::Float32=0.01f0, gamma::Float32=0.75f0)

Compute the polynomial decay value at step `t` using the formula: `a * (b + t)^(-gamma)`.
This function is commonly used in optimization algorithms (e.g., learning rate scheduling) to decay a value polynomially over time.
# Arguments
- `t::Int64`: The current step or time at which to compute the decay.
- `a::Float32=1f0`: The initial scaling factor. Default is `1.0`.
- `b::Float32=0.01f0`: A small constant to avoid division by zero. Default is `0.01`.
- `gamma::Float32=0.75f0`: The decay rate. Controls how quickly the value decays. Default is `0.75`.
# Returns
- `Float32`: The decayed value at step `t`.
# Examples
```julia
julia> polynomial_decay(10) # Default parameters
0.17782794f0
julia> polynomial_decay(100, a=2.0f0, b=0.1f0, gamma=0.5f0) # Custom parameters
0.19990008f0
```
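Writing the formula out directly makes the arithmetic easy to check (a standalone re-implementation of the documented formula, not the package function):

```julia
# a * (b + t)^(-gamma), re-implemented from the formula above.
poly_decay(t; a=1f0, b=0.01f0, gamma=0.75f0) = a * (b + t)^(-gamma)

v1 = poly_decay(10)                                   # 10.01^(-0.75) ≈ 0.1778
v2 = poly_decay(100, a=2.0f0, b=0.1f0, gamma=0.5f0)   # 2 / sqrt(100.1) ≈ 0.1999
```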
MirrorVI.MyOptimisers.DecayedADAGrad — Type

DecayedADAGrad(; η = 0.1, pre = 1.0, post = 0.9)

Implements a decayed version of AdaGrad, with parameter-specific learning rates based on how frequently each parameter is updated. Does not really need tuning. References: ADAGrad optimiser.

# Arguments
- `η=0.1`: learning rate.
- `pre=1.0`: weight of the new gradient norm.
- `post=0.9`: weight of the history of gradient norms.

# Returns
- `DecayedADAGrad`: an optimiser rule that can be passed to `hybrid_training_loop`.
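The recurrence these parameters suggest is the standard decayed accumulation of squared gradients. The sketch below is an assumption about the update rule (the function name and the stabilising `ϵ` are invented for illustration), not the package's exact implementation:

```julia
# Decayed-AdaGrad sketch: `post` discounts the accumulated squared gradients,
# `pre` weights the new one, giving parameter-specific effective step sizes.
function decayed_adagrad_step!(z, g, acc; η=0.1, pre=1.0, post=0.9, ϵ=1e-8)
    @. acc = post * acc + pre * g^2      # decayed accumulation of squared grads
    @. z -= η * g / (sqrt(acc) + ϵ)      # per-parameter scaled gradient step
    return z, acc
end

# Toy usage: minimise sum(abs2, z) from z = [1, -2].
z = [1.0, -2.0]
acc = zeros(2)
for _ in 1:200
    g = 2 .* z                           # gradient of the toy objective
    decayed_adagrad_step!(z, g, acc)
end
z  # both entries end up near 0
```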