Problems on Priors for Hierarchical Model

I cannot fully reproduce your situation with the given code, but here is some minimal simulator code as a starting point:

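import numpy as np

def minimal_hierarchical_simulator(theta, num_obs=20, rng=None):
    """Simulate dummy data for local parameters theta of shape
    (batch_size, num_participants, num_params)."""
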
    if rng is None:
        rng = np.random.default_rng()

    batch_size = theta.shape[0]
    num_participants = theta.shape[1]

    out = np.zeros((batch_size, num_participants, num_obs, 3))
    for i in range(batch_size): # loop over data sets
        for j in range(num_participants): # loop over participants
            for k in range(num_obs): # loop over observations
                # dummy code, fill with ddm trial simulator
                out[i, j, k, :] = rng.normal(loc=theta[i, j, 0], scale=1, size=3)

    return out

You do not need to pass num_participants explicitly, since it is already determined by the shape of local_parameters. Right now, local_prior_fun always returns shape (10, 5), i.e., 5 parameter values each for 10 participants, but I guess this is not what you intended?
Once you connect everything via TwoLevelGenerativeModel, it should return 4D tensors of shape (batch_size, num_participants, num_observations, num_variables).
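
As a quick sanity check on these shapes, here is a small sketch with a vectorized stand-in for the dummy simulator. The name dummy_simulator and the concrete sizes (8 data sets, 10 participants, 5 parameters) are made up for illustration and are not part of BayesFlow; the point is only that num_participants is read off the parameter array and the output is 4D.

```python
import numpy as np

def dummy_simulator(theta, num_obs=20, rng=None):
    """Vectorized stand-in: theta has shape (batch_size, num_participants, num_params)."""
    rng = np.random.default_rng() if rng is None else rng
    # num_participants is inferred from theta, not passed as an argument
    batch_size, num_participants = theta.shape[:2]
    # Use the first local parameter as the mean and broadcast it over
    # the observation and variable axes
    loc = theta[:, :, 0][:, :, None, None]
    return rng.normal(loc=loc, scale=1.0,
                      size=(batch_size, num_participants, num_obs, 3))

rng = np.random.default_rng(0)
theta = rng.normal(size=(8, 10, 5))   # hypothetical local parameter draws
data = dummy_simulator(theta, num_obs=20, rng=rng)
print(data.shape)  # (8, 10, 20, 3)
```

This only works because the dummy likelihood is a plain normal draw; a real DDM trial simulator usually cannot be vectorized this easily.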

Note that this requires a fixed num_obs within each batch (you can vary both num_participants and num_obs between batches via context). You will also have to pass your empirical data as a 4D tensor of shape (num_datasets, num_participants, num_observations, num_variables). Therefore, if the number of observations varies across participants, we recommend a masking approach as described here and in section 4.3.3 of this paper, so you can still use a single 4D tensor (with a fixed num_obs) for all participants.
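
To make the masking idea a bit more concrete, here is a rough sketch of one possible variant: pad each participant's observations with zeros up to a fixed num_obs and append a binary indicator column that marks real observations. The helper pad_and_mask and the indicator-column encoding are my own assumptions for illustration, so please check the linked resources for the exact scheme they recommend; the simulator would need to produce the same encoding during training.

```python
import numpy as np

def pad_and_mask(participant_data, num_obs):
    """Stack per-participant arrays of shape (n_j, num_variables) into one
    4D tensor of shape (1, num_participants, num_obs, num_variables + 1).
    The last column is 1.0 for real observations and 0.0 for padding."""
    num_variables = participant_data[0].shape[1]
    out = np.zeros((len(participant_data), num_obs, num_variables + 1))
    for j, obs in enumerate(participant_data):
        n = obs.shape[0]
        if n > num_obs:
            raise ValueError(f"participant {j} has {n} > {num_obs} observations")
        out[j, :n, :num_variables] = obs   # real observations
        out[j, :n, -1] = 1.0               # mask indicator
    return out[np.newaxis]                 # add the num_datasets axis

# Two hypothetical participants with 3 and 5 observations of 2 variables each
example = [np.ones((3, 2)), np.ones((5, 2))]
padded = pad_and_mask(example, num_obs=5)
print(padded.shape)  # (1, 2, 5, 3)
```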

A tutorial for these functionalities is definitely on our to-do list! Our tutorial on hierarchical model comparison may also provide some further information about the usage of hierarchical data structures in BayesFlow (it just uses custom functions instead of the wrappers, so the simulator does a bit more work by also sampling local parameters before simulating).

Lastly, the triple for loop is of course not very efficient. One way to speed things up would be to use numba for the DDM trial simulator.
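
Here is a rough sketch of what a numba-compiled trial simulator could look like. Everything about the model is an assumption on my side (a basic DDM with drift v, boundary separation a, non-decision time t0, an unbiased starting point, and a simple Euler-Maruyama discretization), so treat it as a template rather than as your actual model. If numba is not installed, the snippet falls back to plain Python.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # fall back to uncompiled Python if numba is missing
    def njit(func):
        return func

@njit
def ddm_trial(v, a, t0, dt=0.001, max_steps=10000):
    """Simulate a single trial and return (response time, choice)."""
    x = 0.5 * a                      # unbiased starting point (assumption)
    n_steps = 0
    noise_scale = np.sqrt(dt)
    # Accumulate evidence until a boundary is hit or max_steps is reached
    while x > 0.0 and x < a and n_steps < max_steps:
        x += v * dt + noise_scale * np.random.normal(0.0, 1.0)
        n_steps += 1
    rt = t0 + n_steps * dt
    # 1.0 = upper boundary; lower boundary and timed-out trials get 0.0 here
    choice = 1.0 if x >= a else 0.0
    return rt, choice

rt, choice = ddm_trial(1.0, 1.5, 0.3)
```

You would call ddm_trial in the innermost loop in place of the dummy line; compiling the surrounding loops as well usually gives a further speed-up. Note that seeding works differently inside numba-compiled code, so the rng argument of the simulator would not control these draws.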

Hope that helps, hierarchical models are definitely challenging!
