Dear BayesFlow team,
I am trying to get a feel for how to work with the codebase by using a very simple example related to the real inference I want to work on. The simple exercise consists of training a neural network (NN) to perform inference over the parameters of a negative binomial distribution. There are many moving parts and options that I do not understand, and none of the examples in the documentation helped me overcome the bump. Here is the basic setup:
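For reference, my imports and random number generator are roughly the following (the exact seed should not matter, I think):
import numpy as np
import bayesflow as bf

# Random number generator used by the prior and the simulator
RNG = np.random.default_rng()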
With those in place, I define my prior function as
def prior_fn():
    # Prior on p (success probability) of the negative binomial
    p = RNG.beta(1, 10)
    # Prior on the log of the r (dispersion) parameter
    logr = RNG.normal(0, 1)
    # Return the prior draws as a float32 array
    return np.float32(np.array([p, logr]))
with the corresponding simulation.Prior object
# Names of the two parameters (used for plots and diagnostics)
param_names = ["p", "log r"]
prior = bf.simulation.Prior(
    prior_fun=prior_fn,
    param_names=param_names,
)
Next, I define the likelihood function, in which I convert the integer outputs to float32. For now, I use a fixed number of observations (100), but I would like to later understand how to train the NN with a varying number of observations (my current guess at how to do this is sketched after the generative model below).
def likelihood_fn(params, n_obs=100):
    # Unpack parameters
    p = params[0]
    r = np.exp(params[1])
    # Sample from the negative binomial distribution
    u = RNG.negative_binomial(r, p, size=n_obs)
    # Add a trailing dimension so each simulated dataset has shape (n_obs, 1);
    # BayesFlow then adds the batch dimension
    u = np.expand_dims(u, axis=1)
    return np.float32(u)
Again, I define the corresponding Simulator and GenerativeModel objects
# Define Likelihood simulator function for BayesFlow
simulator = bf.simulation.Simulator(simulator_fun=likelihood_fn)
# Build generative model
model = bf.simulation.GenerativeModel(prior, simulator)
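For the varying number of observations I mentioned above, my current guess (based on my reading of the docs, so this is an untested sketch) is to draw n_obs through a ContextGenerator and let the simulator pass it on to likelihood_fn as its second argument:
# Untested sketch: draw a random number of observations per batch
def random_n_obs(min_obs=50, max_obs=200):
    # Number of observations drawn uniformly between min_obs and max_obs
    return int(RNG.integers(min_obs, max_obs + 1))

# The context generator supplies n_obs to likelihood_fn(params, n_obs)
context_gen = bf.simulation.ContextGenerator(non_batchable_context_fun=random_n_obs)
simulator_var = bf.simulation.Simulator(
    simulator_fun=likelihood_fn, context_generator=context_gen
)
model_var = bf.simulation.GenerativeModel(prior, simulator_var)
Is this the intended way to handle a varying number of observations?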
Next, I define the summary network as a DeepSet (but I have also tried a SetTransformer without much success)
# Define the summary network as a DeepSet
summary_net = bf.networks.DeepSet(summary_dim=4)
For the inference network, I don't have a good intuition about which arguments I should modify. So far, one of my failed attempts looks something like this:
# Define the conditional invertible network with affine coupling layers
inference_net = bf.inference_networks.InvertibleNetwork(
    num_params=prior(1)["prior_draws"].shape[-1],
    num_coupling_layers=6,
    coupling_design="affine",
    permutation="learnable",
    use_act_norm=True,
    coupling_settings={
        "mc_dropout": True,
        "dense_args": dict(units=32, activation="relu"),
    },
)
I put everything together
# Assemble the amortizer that combines the summary network and the
# inference network
amortizer = bf.amortizers.AmortizedPosterior(
    inference_net, summary_net,
)
# Assemble the trainer with the amortizer and generative model
trainer = bf.trainers.Trainer(amortizer=amortizer, generative_model=model)
and train the model using
# Train the model
history = trainer.train_online(
    epochs=20,
    iterations_per_epoch=128,
    batch_size=64,
    validation_sims=200,
)
However, as shown in the figure below, no matter what I try (admittedly with very little understanding of what each argument does), the loss explodes to nonsensical values and then quickly comes back down.
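In case it matters, I produce the loss figure with something along these lines:
# Plot the training and validation loss trajectories
f = bf.diagnostics.plot_losses(history["train_losses"], history["val_losses"])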
I am sure I am doing something naively wrong, but I have struggled with this long enough to ask for help. I hope you can point me to what I'm doing wrong here.
Thanks in advance for your invaluable help!