Problems with Priors for a Hierarchical Model

Hi, I would like to fit a hierarchical Ratcliff drift diffusion model with BayesFlow. The parameters I have are:

\delta_1, \delta_2: two drift rates
\alpha: boundary separation
\beta: response bias
\tau: non-decision time

The function I would like to use is:
bayesflow.simulation.TwoLevelPrior(hyper_prior_fun: callable, local_prior_fun: callable, shared_prior_fun: callable = None, local_context_generator: callable = None)

I first define the hyper_prior_fun:

def hierarchical_prior_fun(rng=None):

    if rng is None:
        rng = np.random.default_rng()

    mu_delta1 = rng.normal(1, 0.25)
    mu_delta2 = rng.normal(-1, 0.25)
    mu_alpha = rng.normal(3, 0.25)
    mu_beta = rng.uniform(0.25, 0.75)
    mu_tau = rng.uniform(0.2, 0.5)
    sigma_delta1 = rng.uniform(0.01, 100)
    sigma_delta2 = rng.uniform(0.01, 100)
    sigma_alpha = rng.uniform(0.01, 100)
    sigma_beta = rng.uniform(0.01, 100)
    sigma_tau = rng.uniform(0.01, 100)
    return np.concatenate([np.r_[mu_delta1, mu_delta2, mu_alpha, mu_beta, mu_tau, 
                            sigma_delta1, sigma_delta2, sigma_alpha, sigma_beta, sigma_tau]])

And then the local_prior_fun:

def local_prior_fun(rng=None, hyper_theta=hierarchical_prior_fun(), num_groups=2, dim = 5):

    if rng is None:
        rng = np.random.default_rng()

    delta1 = rng.normal(hyper_theta[0], hyper_theta[5], size = (num_groups, dim))
    delta2 = rng.normal(hyper_theta[1], hyper_theta[6], size = (num_groups, dim))
    alpha = rng.normal(hyper_theta[2], hyper_theta[7], size =  (num_groups, dim))
    beta = rng.normal(hyper_theta[3], hyper_theta[8], size =  (num_groups, dim))
    tau = rng.normal(hyper_theta[4], hyper_theta[9], size =  (num_groups, dim))
    
    return np.concatenate([np.r_[delta1, delta2, alpha, beta, tau]])

Wrapping the Prior:

prior = bf.simulation.TwoLevelPrior(hyper_prior_fun =  hierarchical_prior_fun(), local_prior_fun = local_prior_fun())
prior(batch_size=1)

And I got the following errors:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[97], line 2
      1 prior = bf.simulation.TwoLevelPrior(hyper_prior_fun =  hierarchical_prior_fun(), local_prior_fun = local_prior_fun())
----> 2 prior(batch_size=1)

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\simulation.py:446, in TwoLevelPrior.__call__(self, batch_size, **kwargs)
    442     local_context = {}
    444 for b in range(batch_size):
    445     # Draw hyper parameters
--> 446     hyper_params = self.draw_hyper_parameters(**kwargs.get("hyper_args", {}))
    448     # Determine context types for local parameters
    449     if local_context.get(DEFAULT_KEYS["batchable_context"]) is not None:

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\simulation.py:483, in TwoLevelPrior.draw_hyper_parameters(self, **kwargs)
    480 def draw_hyper_parameters(self, **kwargs):
    481     """TODO"""
--> 483     params = self.hyper_prior(**kwargs)
    484     return params

TypeError: 'numpy.ndarray' object is not callable

It would be super nice if I could get some advice from you, and thanks a lot for developing such a cool tool!

Hi Yufei,

that sounds like an interesting project, I’m excited to see the outcomes!

The arguments of TwoLevelPrior expect callable objects, so pass the functions themselves rather than their return values by simply removing the parentheses:

prior = bf.simulation.TwoLevelPrior(hyper_prior_fun = hierarchical_prior_fun, local_prior_fun = local_prior_fun)

After that, the code still does not run because of local_prior_fun: you do not have to call the hyperprior here, but simply need a placeholder argument that receives the draws from the hyperprior:

def local_prior_fun(hyper_theta, rng=None, num_groups=2, dim=5):

BayesFlow will handle the connection of the priors for you (see draw_local_parameters in the simulation functions).
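
For illustration, a minimal corrected setup could look roughly like this (just a sketch; drawing one row of the five parameters per group is only one possible layout):

def local_prior_fun(hyper_theta, rng=None, num_groups=2, dim=5):
    if rng is None:
        rng = np.random.default_rng()
    # hyper_theta[:dim] holds the group-level means, hyper_theta[dim:] the group-level SDs;
    # draw one row of dim local parameters per group
    return rng.normal(hyper_theta[:dim], hyper_theta[dim:], size=(num_groups, dim))

prior = bf.simulation.TwoLevelPrior(
    hyper_prior_fun=hierarchical_prior_fun,  # the functions themselves, no parentheses
    local_prior_fun=local_prior_fun,
)
prior(batch_size=1)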

Note that some of the functionalities for hierarchical parameter estimation are still work in progress afaik - maybe @KLDivergence can provide some context?

Cheers,
Lasse

Hi Lasse,

Thanks a lot for the help!

That worked for me, and now I have another question about using the TwoLevelGenerativeModel() function.

with

prior = bf.simulation.TwoLevelPrior(hyper_prior_fun =  hierarchical_prior_fun, local_prior_fun = local_prior_fun)
prior(batch_size=1)

I got a nice output:

{'hyper_parameters': array([[ 0.89370657, -0.96871042,  3.05425995,  0.38803703,  0.21136131,
          1.34940303,  0.21571783,  2.47985008,  0.01550302,  0.96481163]]),
 'local_parameters': array([[[ 0.50782794, -0.94445938,  3.615876  ,  0.39021517,
          -0.53776428],
         [ 0.21586491, -0.74669511,  0.29463958,  0.39685427,
          -1.03224035],
         [-0.83923065, -1.10874365,  2.35926946,  0.38828451,
          -0.02567072],
         [ 2.41993515, -0.79318584,  4.87912848,  0.38581586,
          -0.07069425],
         [-0.12979605, -1.2496352 ,  8.83532658,  0.39258417,
           1.23848594],
         [ 0.27289896, -0.9996007 ,  0.63003071,  0.3998397 ,
          -0.19455999],
         [ 3.60272041, -1.31273021, -0.09149035,  0.39084008,
          -0.89633564],
         [-0.45215141, -0.48763598,  2.06188668,  0.37621888,
           0.56736861],
         [-0.55060549, -0.92329827,  5.19240955,  0.39227571,
          -0.56255196],
         [-0.38835137, -0.95382018,  2.51740657,  0.38353564,
           0.18797084]]]),
 'batchable_context': None,
 'non_batchable_context': None}

However, I am afraid I do not understand how to write a proper simulator for my hierarchical DDM.

What I wrote is:

def hierarchical_simulator(theta, design_matrix, num_obs, num_par=10, rng=None, *args):

    out = []
    # num_par is the number of participants in every simulated dataset
    for i in range(0, num_par):
        v = theta[i][: NUM_CONDITIONS]
        v = np.split(v, NUM_CONDITIONS)
        out_new = np.zeros((num_obs[i], 3))
        
        # num_obs is an array that contains the number of observations for each participant
        for n in range(0, num_obs[i]):
            # here diffusion_trial() is a pre-defined function that simulates one trial of the DDM
            out_new[n, :] = diffusion_trial(v[design_matrix[i][n]], theta[i][-3], theta[i][-2], theta[i][-1],i)
            
        out.append(out_new)
        
    return out

simulator = bf.simulation.Simulator(batch_simulator_fun=hierarchical_simulator, context_generator=context_gen)
model = bf.simulation.TwoLevelGenerativeModel(prior=prior, simulator=simulator, name="Hierarchical DDM")

Here I assumed that theta is prior['local_parameters'], which contains all the local parameter draws. I used a for loop with index i representing each participant and tried to extract the information from theta. However, I am not sure whether this is what TwoLevelGenerativeModel() wants when the prior is wrapped by TwoLevelPrior. A small example simulator for this function would be super helpful.

I hope my question is more or less clear, please let me know if anything is confusing here. Thanks a lot!

Best,
Yufei

I cannot fully reproduce your situation with the given code, but here is some minimal simulator code as a starting point:

def minimal_hierarchical_simulator(theta, num_obs=20, rng=None):

    if rng is None:
        rng = np.random.default_rng()

    batch_size = theta.shape[0]
    num_participants = theta.shape[1]

    out = np.zeros((batch_size, num_participants, num_obs, 3))
    for i in range(batch_size): # loop over data sets
        for j in range(num_participants): # loop over participants
            for k in range(num_obs): # loop over observations
                # dummy code, fill with ddm trial simulator
                out[i, j, k, :] = rng.normal(loc=theta[i, j, 0], scale=1, size=3)

    return out

You do not need to explicitly provide num_participants as it is already determined via the shape of local_parameters. Right now, local_prior_fun always returns shape (10, 5), i.e., 5 parameter values each for 10 participants, but I guess this is not what you intended?
Once you connect everything via TwoLevelGenerativeModel, it should return 4D tensors of shape (batch_size, num_participants, num_observations, num_variables).
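
For instance, wiring everything together could look roughly like this (just a sketch; it reuses the TwoLevelPrior from above and assumes the generative model can be called with a batch_size argument, as in your prior example):

simulator = bf.simulation.Simulator(batch_simulator_fun=minimal_hierarchical_simulator)
model = bf.simulation.TwoLevelGenerativeModel(prior=prior, simulator=simulator, name="Hierarchical DDM")

out = model(batch_size=2)
# expected shape: (2, num_participants, num_obs, 3)
print(out["sim_data"].shape)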

Note that this requires a fixed num_obs in each batch (you can vary both num_participants and num_obs between batches via context). But you will also have to pass your empirical data as a 4D tensor of shape (num_datasets, num_participants, num_observations, num_variables). Therefore, if you have a variable number of observations per participant, we recommend a masking approach like the one described here and in section 4.3.3 of this paper, so that you can still have a single 4D tensor (with a fixed num_obs) for all participants.
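
As a rough illustration of such a masking scheme (plain numpy, not a BayesFlow API): pad every participant to the same number of trials and add a binary indicator column that marks real observations.

def pad_and_mask(trials_per_participant, max_obs):
    """trials_per_participant: list of arrays of shape (n_i, num_variables)."""
    num_participants = len(trials_per_participant)
    num_vars = trials_per_participant[0].shape[1]
    out = np.zeros((num_participants, max_obs, num_vars + 1), dtype=np.float32)
    for j, trials in enumerate(trials_per_participant):
        n = trials.shape[0]
        out[j, :n, :num_vars] = trials   # real observations
        out[j, :n, num_vars] = 1.0       # mask column: 1 = real trial, 0 = padding
    return out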

A tutorial for these functionalities is definitely on our to-do list! Our tutorial on hierarchical model comparison may also provide some further information about the usage of hierarchical data structures in BayesFlow (it just uses custom functions instead of the wrappers, so the simulator does a bit more work by also sampling local parameters before simulating).

Lastly, the triple for loop is of course not very efficient. One possible way to speed up computations would be to use numba for the single-trial DDM simulator.
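
As a hypothetical example, a numba-compiled single-trial simulator could look roughly like this (an Euler-Maruyama sketch; the function name and parameterization are illustrative and may differ from your diffusion_trial):

from numba import njit

@njit
def diffusion_trial_numba(v, a, beta, tau, dt=0.001, s=1.0, max_steps=10000):
    """Simulate one DDM trial; returns (rt, choice)."""
    x = beta * a            # starting point as a fraction of the boundary separation
    n_steps = 0
    sqrt_dt = np.sqrt(dt)
    while x > 0.0 and x < a and n_steps < max_steps:
        x += v * dt + s * sqrt_dt * np.random.normal()
        n_steps += 1
    rt = n_steps * dt + tau
    choice = 1.0 if x >= a else 0.0
    return rt, choice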

Hope that helps, hierarchical models are definitely challenging!

Hi Lasse,

the minimal simulator code is very helpful; now I can simulate the desired output with TwoLevelGenerativeModel().
With 4 participants per dataset and 4 observations per participant, the output looks like this for batch_size = 2:

{'sim_data': array([[[[1.29470631, 1.        , 0.        ],
          [2.78270631, 1.        , 0.        ],
          [3.39770631, 1.        , 0.        ],
          [6.39470631, 0.        , 0.        ]],
 
         [[1.68407352, 1.        , 1.        ],
          [1.85707352, 1.        , 1.        ],
          [1.11507352, 0.        , 1.        ],
          [2.77207352, 0.        , 1.        ]],
 
         [[1.36106071, 1.        , 2.        ],
          [2.99906071, 1.        , 2.        ],
          [4.23206071, 1.        , 2.        ],
          [4.13506071, 0.        , 2.        ]],
 
         [[0.87416584, 1.        , 3.        ],
          [0.78516584, 1.        , 3.        ],
          [1.36316584, 0.        , 3.        ],
          [0.67516584, 0.        , 3.        ]]],
 
 
        [[[1.54510175, 0.        , 0.        ],
          [4.51710175, 1.        , 0.        ],
          [1.68410175, 0.        , 0.        ],
          [0.81410175, 0.        , 0.        ]],
 
         [[2.34517544, 0.        , 1.        ],
          [1.99617544, 0.        , 1.        ],
          [0.82817544, 0.        , 1.        ],
          [1.50817544, 1.        , 1.        ]],
 
         [[1.30738178, 1.        , 2.        ],
          [2.85238178, 1.        , 2.        ],
          [1.80738178, 0.        , 2.        ],
          [1.57038178, 0.        , 2.        ]],
 
         [[0.57009528, 0.        , 3.        ],
          [3.31809528, 0.        , 3.        ],
          [0.51009528, 0.        , 3.        ],
          [0.69309528, 0.        , 3.        ]]]]),
 'hyper_prior_draws': array([[ 1.41857089, -0.93169315,  3.31628374,  0.45483618,  0.20133162,
          0.44102315,  0.69175527,  0.79266394,  0.0595174 ,  0.10295194],
        [ 0.60878216, -1.39706776,  2.78171975,  0.47483292,  0.39313015,
          0.11375621,  0.16225779,  0.7928784 ,  0.03337419,  0.13652142]]),
 'local_prior_draws': array([[[ 0.82724444,  0.40367882,  4.02537849,  0.35320437,
           0.13170631],
         [ 0.90479767, -0.68222203,  2.49417105,  0.41947897,
           0.12407352],
         [ 0.88580755,  0.05465446,  3.15031844,  0.54067406,
           0.16306071],
         [ 0.9926855 , -0.18758255,  2.97067231,  0.41571828,
           0.16816584]],
 
        [[ 0.58973456, -1.01946977,  2.24953471,  0.39817652,
           0.24910175],
         [ 0.55350562, -1.36531887,  1.57962135,  0.41177689,
           0.62417544],
         [ 0.58416283, -1.30227951,  3.34284877,  0.49790414,
           0.52538178],
         [ 0.56352598, -1.26160684,  2.31466931,  0.43792337,
           0.16509528]]]),
 'shared_prior_draws': None,
 'sim_batchable_context': [array([[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]),
  array([[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]])],
 'sim_non_batchable_context': array([4, 4, 4, 4]),
 'prior_batchable_context': None,
 'prior_non_batchable_context': None}

I set up the summary and inference networks as follows:

summary_net = bf.summary_networks.HierarchicalNetwork([
    bf.networks.DeepSet(), 
    bf.networks.DeepSet(summary_dim=64)
])

inference_net = bf.networks.InvertibleNetwork(
    num_params=15,
    coupling_settings={"dense_args": dict(kernel_regularizer=None), "dropout": False},
    name="ddm_inference",
)

Then, I tried to set up a configurator that organizes the output of TwoLevelGenerativeModel, following https://github.com/stefanradev93/BayesFlow/blob/master/examples/LCA_Model_Posterior_Estimation.ipynb.

from tensorflow.keras.utils import to_categorical


def configurator(forward_dict):
    """Configure the output of the GenerativeModel for a BayesFlow setup."""

    # Prepare placeholder dict
    out_dict = {}

    # Extract simulated response times
    data = forward_dict["sim_data"]

    # Convert the list of condition indicators to an array and add a trailing
    # dimension of 1, so the shape becomes (batch_size, num_groups, num_obs, 1)
    # We need this in order to easily concatenate the context with the data
    context = np.array(forward_dict["sim_batchable_context"])[..., None]

    # One-hot encoding of integer choices
    categorical_resp = to_categorical(data[:, :, :, 1], num_classes=2)

    # Concatenate rt, resp, context
    out_dict["summary_conditions"] = np.c_[data[:, :, :, :3], categorical_resp, context].astype(np.float32)

    # Make inference network aware of varying numbers of trials
    # We create a vector of shape (batch_size, 1) by repeating the sqrt(num_obs)
    vec_num_obs = forward_dict["sim_non_batchable_context"] * np.ones((data.shape[0], 1))
    out_dict["direct_conditions"] = np.sqrt(vec_num_obs).astype(np.float32)

    # Get data generating parameters
    out_dict["local_parameters"] = forward_dict["local_prior_draws"].astype(np.float32)

    # Get data generating parameters
    out_dict["hyper_parameters"] = forward_dict["hyper_prior_draws"].astype(np.float32)

    return out_dict

This configurator returns a dictionary like the one below. In “summary_conditions”, the columns are:

  1. Reaction time,
  2. Response,
  3. Subject index,
  4, 5. One-hot encoding of the two conditions.

{'summary_conditions': array([[[[ 2.6474695 ,  1.        ,  0.        ,  0.        ,
            1.        ,  0.        ],
          [ 3.3264694 ,  1.        ,  0.        ,  0.        ,
            1.        ,  0.        ],
          [ 7.2974696 ,  0.        ,  0.        ,  1.        ,
            0.        ,  1.        ],
          [ 3.8964696 ,  1.        ,  0.        ,  0.        ,
            1.        ,  1.        ]],
 
         [[ 1.9811877 ,  1.        ,  1.        ,  0.        ,
            1.        ,  0.        ],
          [ 1.8581877 ,  1.        ,  1.        ,  0.        ,
            1.        ,  0.        ],
          [ 2.4871876 ,  1.        ,  1.        ,  0.        ,
            1.        ,  1.        ],
          [ 2.5161877 ,  1.        ,  1.        ,  0.        ,
            1.        ,  1.        ]],
 
         [[-0.03797305,  1.        ,  2.        ,  0.        ,
            1.        ,  0.        ],
          [-0.03797305,  1.        ,  2.        ,  0.        ,
            1.        ,  0.        ],
          [-0.03797305,  1.        ,  2.        ,  0.        ,
            1.        ,  1.        ],
          [-0.03797305,  1.        ,  2.        ,  0.        ,
            1.        ,  1.        ]],
 
         [[ 1.405282  ,  1.        ,  3.        ,  0.        ,
            1.        ,  0.        ],
          [ 2.2722821 ,  1.        ,  3.        ,  0.        ,
            1.        ,  0.        ],
          [ 1.968282  ,  0.        ,  3.        ,  1.        ,
            0.        ,  1.        ],
          [ 1.265282  ,  0.        ,  3.        ,  1.        ,
            0.        ,  1.        ]]],
 
 
        [[[ 1.8850831 ,  1.        ,  0.        ,  0.        ,
            1.        ,  0.        ],
          [ 1.4160831 ,  1.        ,  0.        ,  0.        ,
            1.        ,  0.        ],
          [ 1.3750831 ,  1.        ,  0.        ,  0.        ,
            1.        ,  1.        ],
          [ 0.7910831 ,  1.        ,  0.        ,  0.        ,
            1.        ,  1.        ]],
 
         [[ 2.1057036 ,  1.        ,  1.        ,  0.        ,
            1.        ,  0.        ],
          [ 1.2727035 ,  1.        ,  1.        ,  0.        ,
            1.        ,  0.        ],
          [ 4.5987034 ,  1.        ,  1.        ,  0.        ,
            1.        ,  1.        ],
          [ 2.9247036 ,  0.        ,  1.        ,  1.        ,
            0.        ,  1.        ]],
 
         [[ 1.2369583 ,  1.        ,  2.        ,  0.        ,
            1.        ,  0.        ],
          [ 1.2579583 ,  1.        ,  2.        ,  0.        ,
            1.        ,  0.        ],
          [ 1.6269583 ,  0.        ,  2.        ,  1.        ,
            0.        ,  1.        ],
          [ 2.2309582 ,  0.        ,  2.        ,  1.        ,
            0.        ,  1.        ]],
 
         [[ 0.87938637,  0.        ,  3.        ,  1.        ,
            0.        ,  0.        ],
          [ 1.9493864 ,  1.        ,  3.        ,  0.        ,
            1.        ,  0.        ],
          [ 1.1853864 ,  1.        ,  3.        ,  0.        ,
            1.        ,  1.        ],
          [ 2.1333864 ,  0.        ,  3.        ,  1.        ,
            0.        ,  1.        ]]]], dtype=float32),
 'direct_conditions': array([[2., 2., 2., 2.],
        [2., 2., 2., 2.]], dtype=float32),
 'local_parameters': array([[[ 0.9636896 ,  0.04489963,  4.8279986 ,  0.40029538,
           0.2964695 ],
         [ 0.8681931 ,  1.1129119 ,  5.250871  ,  0.41632465,
           0.3461877 ],
         [ 1.0855117 , -2.1465156 ,  2.145133  ,  1.4483211 ,
          -0.03797305],
         [ 0.9783554 , -1.4380984 ,  2.4775946 ,  0.5278079 ,
           0.51528203]],
 
        [[ 1.309698  ,  1.1134202 ,  2.546568  ,  0.42023334,
           0.35108307],
         [ 1.0092192 , -0.3970489 ,  3.036948  ,  0.43358582,
           0.46970353],
         [ 0.9460304 , -0.23147029,  2.9617784 ,  0.43887368,
           0.42495826],
         [ 1.2671711 , -0.43410122,  2.7386088 ,  0.42810172,
           0.47738636]]], dtype=float32),
 'hyper_parameters': array([[ 0.9527935 , -1.1164383 ,  3.3564892 ,  0.27710044,  0.2009128 ,
          0.09594686,  1.336043  ,  1.1092404 ,  0.52335066,  0.3671519 ],
        [ 1.2016746 , -0.7607997 ,  3.1105556 ,  0.42581084,  0.43157616,
          0.14350495,  1.0815748 ,  0.2312061 ,  0.01252569,  0.08018385]],
       dtype=float32)}

However, I got the following error message while passing all the info to the Trainer():

INFO:root:Performing a consistency check with provided components...
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\trainers.py:1314, in Trainer._check_consistency(self)
   1313 logger.info("Performing a consistency check with provided components...")
-> 1314 _ = self.amortizer.compute_loss(self.configurator(self.generative_model(_n_sim)))
   1315 logger.info("Done.")

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\amortizers.py:209, in AmortizedPosterior.compute_loss(self, input_dict, **kwargs)
    208 # Get amortizer outputs
--> 209 net_out, sum_out = self(input_dict, return_summary=True, **kwargs)
    210 z, log_det_J = net_out

File ~\AppData\Roaming\Python\Python311\site-packages\keras\src\utils\traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     68     # To get the full stack trace, call:
     69     # `tf.debugging.disable_traceback_filtering()`
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\amortizers.py:181, in AmortizedPosterior.call(self, input_dict, return_summary, **kwargs)
    180 # Compute output of inference net
--> 181 net_out = self.inference_net(input_dict[DEFAULT_KEYS["parameters"]], full_cond, **kwargs)
    183 # Return summary outputs or not, depending on parameter

KeyError: "Exception encountered when calling layer 'hierarchical_ddm_amortizer' (type AmortizedPosterior).\n\nparameters\n\nCall arguments received by layer 'hierarchical_ddm_amortizer' (type AmortizedPosterior):\n  • input_dict={'summary_conditions': 'tf.Tensor(shape=(2, 4, 4, 6), dtype=float32)', 'direct_conditions': 'tf.Tensor(shape=(2, 4), dtype=float32)', 'local_parameters': 'tf.Tensor(shape=(2, 4, 5), dtype=float32)', 'hyper_parameters': 'tf.Tensor(shape=(2, 10), dtype=float32)'}\n  • return_summary=True\n  • kwargs={'training': 'None'}"

During handling of the above exception, another exception occurred:

ConfigurationError                        Traceback (most recent call last)
Cell In[345], line 1
----> 1 trainer = bf.trainers.Trainer(
      2     generative_model=model, amortizer=amortizer, configurator=configurator)

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\trainers.py:220, in Trainer.__init__(self, amortizer, generative_model, configurator, checkpoint_path, max_to_keep, default_lr, skip_checks, memory, **kwargs)
    218 # Perform a sanity check with provided components
    219 if not skip_checks:
--> 220     self._check_consistency()

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\trainers.py:1317, in Trainer._check_consistency(self)
   1315     logger.info("Done.")
   1316 except Exception as err:
-> 1317     raise ConfigurationError(
   1318         "Could not carry out computations of generative_model ->"
   1319         + f"configurator -> amortizer -> loss! Error trace:\n {err}"
   1320     )

ConfigurationError: Could not carry out computations of generative_model ->configurator -> amortizer -> loss! Error trace:
 "Exception encountered when calling layer 'hierarchical_ddm_amortizer' (type AmortizedPosterior).\n\nparameters\n\nCall arguments received by layer 'hierarchical_ddm_amortizer' (type AmortizedPosterior):\n  • input_dict={'summary_conditions': 'tf.Tensor(shape=(2, 4, 4, 6), dtype=float32)', 'direct_conditions': 'tf.Tensor(shape=(2, 4), dtype=float32)', 'local_parameters': 'tf.Tensor(shape=(2, 4, 5), dtype=float32)', 'hyper_parameters': 'tf.Tensor(shape=(2, 10), dtype=float32)'}\n  • return_summary=True\n  • kwargs={'training': 'None'}"

I first changed the key “local_parameters” in the configurator to “parameters”, and then got some other errors:

INFO:root:Performing a consistency check with provided components...
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\trainers.py:1314, in Trainer._check_consistency(self)
   1313 logger.info("Performing a consistency check with provided components...")
-> 1314 _ = self.amortizer.compute_loss(self.configurator(self.generative_model(_n_sim)))
   1315 logger.info("Done.")

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\amortizers.py:209, in AmortizedPosterior.compute_loss(self, input_dict, **kwargs)
    208 # Get amortizer outputs
--> 209 net_out, sum_out = self(input_dict, return_summary=True, **kwargs)
    210 z, log_det_J = net_out

File ~\AppData\Roaming\Python\Python311\site-packages\keras\src\utils\traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     68     # To get the full stack trace, call:
     69     # `tf.debugging.disable_traceback_filtering()`
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\amortizers.py:181, in AmortizedPosterior.call(self, input_dict, return_summary, **kwargs)
    180 # Compute output of inference net
--> 181 net_out = self.inference_net(input_dict[DEFAULT_KEYS["parameters"]], full_cond, **kwargs)
    183 # Return summary outputs or not, depending on parameter

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\inference_networks.py:178, in InvertibleNetwork.call(self, targets, condition, inverse, **kwargs)
    177     return self.inverse(targets, condition, **kwargs)
--> 178 return self.forward(targets, condition, **kwargs)

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\inference_networks.py:215, in InvertibleNetwork.forward(self, targets, condition, **kwargs)
    214 for layer in self.coupling_layers:
--> 215     z, log_det_J = layer(z, condition, **kwargs)
    216     log_det_Js.append(log_det_J)

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\coupling_networks.py:612, in CouplingLayer.call(self, target_or_z, condition, inverse, **kwargs)
    611 if not inverse:
--> 612     return self.forward(target_or_z, condition, **kwargs)
    613 return self.inverse(target_or_z, condition, **kwargs)

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\coupling_networks.py:637, in CouplingLayer.forward(self, target, condition, **kwargs)
    636 if self.act_norm is not None:
--> 637     target, log_det_J_act = self.act_norm(target)
    638     log_det_Js += log_det_J_act

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\helper_networks.py:369, in ActNorm.call(self, target, inverse)
    368 if not inverse:
--> 369     return self._forward(target)
    370 else:

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\helper_networks.py:376, in ActNorm._forward(self, target)
    374 """Performs a forward pass through the layer."""
--> 376 z = self.scale * target + self.bias
    377 ldj = tf.math.reduce_sum(tf.math.log(tf.math.abs(self.scale)), axis=-1)

InvalidArgumentError: Exception encountered when calling layer 'act_norm_42' (type ActNorm).

{{function_node __wrapped__Mul_device_/job:localhost/replica:0/task:0/device:CPU:0}} Incompatible shapes: [15] vs. [2,4,5] [Op:Mul] name: 

Call arguments received by layer 'act_norm_42' (type ActNorm):
  • target=tf.Tensor(shape=(2, 4, 5), dtype=float32)
  • inverse=False

During handling of the above exception, another exception occurred:

ConfigurationError                        Traceback (most recent call last)
Cell In[347], line 1
----> 1 trainer = bf.trainers.Trainer(
      2     generative_model=model, amortizer=amortizer, configurator=configurator)

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\trainers.py:220, in Trainer.__init__(self, amortizer, generative_model, configurator, checkpoint_path, max_to_keep, default_lr, skip_checks, memory, **kwargs)
    218 # Perform a sanity check with provided components
    219 if not skip_checks:
--> 220     self._check_consistency()

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\trainers.py:1317, in Trainer._check_consistency(self)
   1315     logger.info("Done.")
   1316 except Exception as err:
-> 1317     raise ConfigurationError(
   1318         "Could not carry out computations of generative_model ->"
   1319         + f"configurator -> amortizer -> loss! Error trace:\n {err}"
   1320     )

ConfigurationError: Could not carry out computations of generative_model ->configurator -> amortizer -> loss! Error trace:
 Exception encountered when calling layer 'act_norm_42' (type ActNorm).

{{function_node __wrapped__Mul_device_/job:localhost/replica:0/task:0/device:CPU:0}} Incompatible shapes: [15] vs. [2,4,5] [Op:Mul] name: 

Call arguments received by layer 'act_norm_42' (type ActNorm):
  • target=tf.Tensor(shape=(2, 4, 5), dtype=float32)
  • inverse=False

Could you give me some advice on the configurator, and possibly on the summary and inference networks in case there is a problem with them?

I don’t know how to organize the data so that it makes sense to the Trainer() function, and a piece of example code for a configurator for hierarchical data would be very helpful!

Thanks a lot for your time!!

Best,
Yufei

Hi Yufei,

great to hear about your progress! Before we take a deeper dive: your output mentions AmortizedPosterior; do you use AmortizedPosterior or TwoLevelAmortizedPosterior for your amortizer? :slight_smile:

Cheers,
Lasse

Hi Lasse,

Thanks a lot for the reminder! Clearly I missed the TwoLevelAmortizedPosterior() and used the AmortizedPosterior().

Now I have made some changes:

For the summary network, inference networks, and amortizers, I set:

summary_net = bf.summary_networks.HierarchicalNetwork([
    bf.networks.DeepSet(summary_dim=20), 
    bf.networks.DeepSet(summary_dim=15)
])

local_inference_net = bf.networks.InvertibleNetwork(
    num_params=5,
    coupling_settings={"dense_args": dict(kernel_regularizer=None), "dropout": False},
    name="local_ddm_inference"
)

hyper_inference_net = bf.networks.InvertibleNetwork(
    num_params=10,
    coupling_settings={"dense_args": dict(kernel_regularizer=None), "dropout": False},
    name="hyper_ddm_inference"
)

local_amortizer = bf.amortizers.AmortizedPosterior(local_inference_net, name="local_ddm_amortizer")
hyper_amortizer = bf.amortizers.AmortizedPosterior(hyper_inference_net, name="hyper_ddm_amortizer")
twolevel_amortizer = bf.amortizers.TwoLevelAmortizedPosterior(summary_net = summary_net,
                                                            local_amortizer = local_amortizer,
                                                            global_amortizer = hyper_amortizer)

And the configurator:

from tensorflow.keras.utils import to_categorical

# A configurator transforms the output of the generative model into the format
# that the neural networks expect (transformations of the data and parameters).
# Try this with an example simulation first.
def configurator(forward_dict):
    """Configure the output of the GenerativeModel for a BayesFlow setup."""

    # Prepare placeholder dict
    out_dict = {}

    # Extract simulated response times
    data = forward_dict["sim_data"]

    # Convert the list of condition indicators to an array and add a trailing
    # dimension of 1, so the shape becomes (batch_size, num_groups, num_obs, 1)
    # We need this in order to easily concatenate the context with the data
    context = np.array(forward_dict["sim_batchable_context"])[..., None]

    # One-hot encoding of integer choices
    categorical_resp = to_categorical(data[:, :, :, 1], num_classes=2)

    # Concatenate rt, resp, context
    out_dict["summary_conditions"] = np.c_[data[:, :, :, :3], categorical_resp, context].astype(np.float32)

    # Make inference network aware of varying numbers of trials
    # We create a vector of shape (batch_size, 1) by repeating the sqrt(num_obs)
    vec_num_obs = forward_dict["sim_non_batchable_context"] * np.ones((data.shape[0], 1))
    out_dict["direct_local_conditions"] = np.sqrt(vec_num_obs).astype(np.float32)

    # Get data generating parameters
    out_dict["local_parameters"] = forward_dict["local_prior_draws"].astype(np.float32)

    # Get data generating parameters
    out_dict["hyper_parameters"] = forward_dict["hyper_prior_draws"].astype(np.float32)

    return out_dict


The output of the configurator is:

{'summary_conditions': array([[[[0.23502871, 0.        , 0.        , 1.        , 0.        ,
           0.        ],
          [0.23602872, 0.        , 0.        , 1.        , 0.        ,
           0.        ],
          [0.23402871, 0.        , 0.        , 1.        , 0.        ,
           1.        ],
          [0.23402871, 0.        , 0.        , 1.        , 0.        ,
           1.        ]],
 
         [[1.7234102 , 0.        , 1.        , 1.        , 0.        ,
           0.        ],
          [0.8444103 , 1.        , 1.        , 0.        , 1.        ,
           0.        ],
          [1.5314103 , 1.        , 1.        , 0.        , 1.        ,
           1.        ],
          [6.2644105 , 0.        , 1.        , 1.        , 0.        ,
           1.        ]],
 
         [[0.293461  , 1.        , 2.        , 0.        , 1.        ,
           0.        ],
          [0.293461  , 1.        , 2.        , 0.        , 1.        ,
           0.        ],
          [0.293461  , 1.        , 2.        , 0.        , 1.        ,
           1.        ],
          [0.293461  , 1.        , 2.        , 0.        , 1.        ,
           1.        ]],
 
         [[1.434916  , 1.        , 3.        , 0.        , 1.        ,
           0.        ],
          [0.31991604, 1.        , 3.        , 0.        , 1.        ,
           0.        ],
          [1.2419161 , 0.        , 3.        , 1.        , 0.        ,
           1.        ],
          [2.2129161 , 0.        , 3.        , 1.        , 0.        ,
           1.        ]]],
 
 
        [[[4.0679054 , 1.        , 0.        , 0.        , 1.        ,
           0.        ],
          [3.1409054 , 1.        , 0.        , 0.        , 1.        ,
           0.        ],
          [9.545905  , 0.        , 0.        , 1.        , 0.        ,
           1.        ],
          [6.2359056 , 0.        , 0.        , 1.        , 0.        ,
           1.        ]],
 
         [[1.1551435 , 1.        , 1.        , 0.        , 1.        ,
           0.        ],
          [2.4151435 , 1.        , 1.        , 0.        , 1.        ,
           0.        ],
          [3.4001436 , 0.        , 1.        , 1.        , 0.        ,
           1.        ],
          [8.354143  , 0.        , 1.        , 1.        , 0.        ,
           1.        ]],
 
         [[1.272507  , 1.        , 2.        , 0.        , 1.        ,
           0.        ],
          [4.5155067 , 1.        , 2.        , 0.        , 1.        ,
           0.        ],
          [6.363507  , 0.        , 2.        , 1.        , 0.        ,
           1.        ],
          [3.880507  , 0.        , 2.        , 1.        , 0.        ,
           1.        ]],
 
         [[3.9988506 , 1.        , 3.        , 0.        , 1.        ,
           0.        ],
          [3.7818506 , 1.        , 3.        , 0.        , 1.        ,
           0.        ],
          [6.5608506 , 0.        , 3.        , 1.        , 0.        ,
           1.        ],
          [6.0908504 , 0.        , 3.        , 1.        , 0.        ,
           1.        ]]]], dtype=float32),
 'direct_local_conditions': array([[2., 2., 2., 2.],
        [2., 2., 2., 2.]], dtype=float32),
 'local_parameters': array([[[ 0.9620098 , -1.7165476 ,  3.405874  ,  0.00586221,
           0.23302871],
         [ 0.1588431 , -1.3296039 ,  3.2937753 ,  0.560999  ,
           0.3134103 ],
         [ 1.2310443 , -1.2390847 ,  3.2818716 ,  1.1094904 ,
           0.293461  ],
         [ 0.8047062 , -1.8772503 ,  3.3627822 ,  0.70007056,
           0.17391604]],
 
        [[ 1.0138699 , -0.7760718 ,  7.719133  ,  0.6055134 ,
           0.48590544],
         [ 1.0411495 , -0.8074007 ,  5.006719  ,  0.6599075 ,
           0.30314353],
         [ 0.91515446, -0.7879659 ,  4.5611696 ,  0.5836236 ,
           0.4195069 ],
         [ 0.955317  , -0.7899395 ,  6.1566954 ,  0.5403657 ,
           1.0858506 ]]], dtype=float32),
 'hyper_parameters': array([[ 1.1409779 , -1.5284885 ,  3.3699188 ,  0.63659275,  0.2107306 ,
          0.5449599 ,  0.23720397,  0.06172697,  0.36356974,  0.06822886],
        [ 1.0060467 , -0.79305816,  3.0460436 ,  0.60686755,  0.42614746,
          0.0510627 ,  0.03500815,  3.6632185 ,  0.03509955,  0.27787185]],
       dtype=float32)}

Then, with:

trainer = bf.trainers.Trainer(
    generative_model=model, amortizer=twolevel_amortizer, configurator=configurator)

The error message I got is:

INFO:root:Performing a consistency check with provided components...
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\trainers.py:1314, in Trainer._check_consistency(self)
   1313 logger.info("Performing a consistency check with provided components...")
-> 1314 _ = self.amortizer.compute_loss(self.configurator(self.generative_model(_n_sim)))
   1315 logger.info("Done.")

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\amortizers.py:1084, in TwoLevelAmortizedPosterior.compute_loss(self, input_dict, **kwargs)
   1082 """Compute loss of all amortizers."""
-> 1084 local_summaries, global_summaries = self._compute_condition(input_dict, **kwargs)
   1085 local_inputs, global_inputs = self._prepare_inputs(input_dict, local_summaries, global_summaries)

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\amortizers.py:1178, in TwoLevelAmortizedPosterior._compute_condition(self, input_dict, **kwargs)
   1177 # Obtain needed summaries
-> 1178 local_summaries, global_summaries = self._get_local_global(input_dict, **kwargs)
   1180 # At this point, add globals as conditions

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\amortizers.py:1209, in TwoLevelAmortizedPosterior._get_local_global(self, input_dict, **kwargs)
   1208 if input_dict.get("direct_local_conditions") is not None:
-> 1209     local_summaries = tf.concat([local_summaries, input_dict.get("direct_local_conditions")], axis=-1)
   1210 if input_dict.get("direct_global_conditions") is not None:

File ~\AppData\Roaming\Python\Python311\site-packages\tensorflow\python\util\traceback_utils.py:153, in filter_traceback.<locals>.error_handler(*args, **kwargs)
    152   filtered_tb = _process_traceback_frames(e.__traceback__)
--> 153   raise e.with_traceback(filtered_tb) from None
    154 finally:

File ~\AppData\Roaming\Python\Python311\site-packages\tensorflow\python\framework\ops.py:5888, in raise_from_not_ok_status(e, name)
   5887 e.message += (" name: " + str(name if name is not None else ""))
-> 5888 raise core._status_to_exception(e) from None

InvalidArgumentError: {{function_node __wrapped__ConcatV2_N_2_device_/job:localhost/replica:0/task:0/device:CPU:0}} ConcatOp : Ranks of all input tensors should match: shape[0] = [2,4,20] vs. shape[1] = [2,4] [Op:ConcatV2] name: concat

During handling of the above exception, another exception occurred:

ConfigurationError                        Traceback (most recent call last)
Cell In[509], line 1
----> 1 trainer = bf.trainers.Trainer(
      2     generative_model=model, amortizer=twolevel_amortizer, configurator=configurator)

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\trainers.py:220, in Trainer.__init__(self, amortizer, generative_model, configurator, checkpoint_path, max_to_keep, default_lr, skip_checks, memory, **kwargs)
    218 # Perform a sanity check with provided components
    219 if not skip_checks:
--> 220     self._check_consistency()

File ~\AppData\Roaming\Python\Python311\site-packages\bayesflow\trainers.py:1317, in Trainer._check_consistency(self)
   1315     logger.info("Done.")
   1316 except Exception as err:
-> 1317     raise ConfigurationError(
   1318         "Could not carry out computations of generative_model ->"
   1319         + f"configurator -> amortizer -> loss! Error trace:\n {err}"
   1320     )

ConfigurationError: Could not carry out computations of generative_model ->configurator -> amortizer -> loss! Error trace:
 {{function_node __wrapped__ConcatV2_N_2_device_/job:localhost/replica:0/task:0/device:CPU:0}} ConcatOp : Ranks of all input tensors should match: shape[0] = [2,4,20] vs. shape[1] = [2,4] [Op:ConcatV2] name: concat

I am not sure whether this is a problem with my summary/inference networks or with the configurator. Could you give me some advice on this?

Best,
Yufei

Hi Yufei,

concerning the nets: I would choose a higher number of summary dimensions for the second network (depending on the number of participants in your sample) to avoid a bottleneck. The first network only compresses the information within each participant, while the second network further compresses the information across all participants.
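
For example (just a sketch; the exact dimensions depend on your design and sample size):

summary_net = bf.summary_networks.HierarchicalNetwork([
    bf.networks.DeepSet(summary_dim=32),  # compresses trials within each participant
    bf.networks.DeepSet(summary_dim=64),  # compresses across participants, so keep this larger
])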

Looking at your error message, the error seems to happen at

 local_summaries = tf.concat([local_summaries, input_dict.get("direct_local_conditions")], axis=-1)

where local_summaries has shape (2, 4, 20) but direct_local_conditions has shape (2, 4), so the concatenation along the last axis fails. It seems to me that direct_local_conditions needs to have a 3D shape (e.g., (2, 4, 1), obtained via [:, :, np.newaxis]) so it can be concatenated properly. Does it work then? As I have not used these functionalities myself yet, I will also get in touch with other project members about this :slight_smile:
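
For instance, only adjusting the offending entry in your configurator (a rough sketch, assuming vec_num_obs already has shape (batch_size, num_groups)):

    # (batch_size, num_groups) -> (batch_size, num_groups, 1)
    out_dict["direct_local_conditions"] = np.sqrt(vec_num_obs).astype(np.float32)[:, :, np.newaxis]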

Cheers,
Lasse

Yes, direct_local_conditions needs to be 3D and your problem seems to be somewhere in the configurator.

Here is an overview of the required shapes:

For the generator (this should already work okay, just to double check):
sim_data: (batch_size, num_groups, num_obs, num_dims)
hyper_prior_draws: (batch_size, num_hyper(x num_dims))
shared_prior_draws: (batch_size, num_shared(x num_dims))
local_prior_draws: (batch_size, num_groups, num_dims)
sim_batchable_context: list of length batch_size
sim_non_batchable_context: None
prior_batchable_context: list of length batch_size
prior_non_batchable_context: None

For the configurator:
hyper_parameters: (batch_size, num_hyper(x num_dims))
shared_parameters: (batch_size, num_shared(x num_dims))
local_parameters: (batch_size, num_groups, num_dims)
summary_conditions: (batch_size, num_groups, num_obs, num_dims)
direct_global_conditions: (batch_size, num_global_conditions)
direct_local_conditions: (batch_size, num_groups, num_local_condition)

The local amortizer needs the number of observations (so the last dimension of direct_local_conditions typically has size 1), and for the global amortizer you want to add the number of groups and number of observations (so the last dimension of direct_global_conditions typically has size 2).
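
To illustrate, a configurator fragment for these condition tensors could look roughly like this (a sketch with illustrative names; it assumes data holds sim_data in the 4D shape listed above and out_dict is the configurator's output dict):

    batch_size, num_groups, num_obs = data.shape[0], data.shape[1], data.shape[2]

    # local amortizer: one condition per group, e.g. sqrt(num_obs) -> (batch_size, num_groups, 1)
    out_dict["direct_local_conditions"] = np.full((batch_size, num_groups, 1), np.sqrt(num_obs), dtype=np.float32)

    # global amortizer: number of groups and observations -> (batch_size, 2)
    out_dict["direct_global_conditions"] = np.tile(
        np.sqrt(np.array([num_groups, num_obs], dtype=np.float32)), (batch_size, 1)
    )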

If this is not helpful, could you maybe share your simulator (or a minimal version)? That would make it a bit easier to see what’s going on.
