Hey everyone,
Thank you for your response; you have already helped me a lot!
Scatter plots
The scatter plots in my first post depict only the first two dimensions of the summary statistics. I agree that this is not very helpful when the actual summary space has 32 dimensions, so I updated the plot (see attachment). If I understood your paper (Schmitt et al., 2022) correctly, there is a tradeoff between parameter recovery and model misspecification detection that depends on the number of summary dimensions. Is there a rule of thumb, or a relation to the number of parameters (we estimate five), that helps to find a suitable number of dimensions that is not too sensitive to model misspecification? I read about using at least as many dimensions as estimated parameters, but I wonder whether five dimensions are sufficient or whether there is a sweet spot between 5 and 32 dimensions.
The loss history (attached) shows that a loss of 0 is reached after about 40k iterations, so I think early stopping is a good idea in this case. I suspect that the loss history indicates that our model is overfitted. Could this also contribute to a significant model misspecification?
Offline training
Thank you for your advice regarding the training phase. I adapted my script and tried the trainer.train_offline() function:
simulations_dict = model(100)
h = trainer.train_offline(
    simulations_dict=simulations_dict,
    epochs=10,
    batch_size=10,
    validation_sims=200,
    save_checkpoint=True,
    early_stopping=True,
)
but received this error message:
Traceback (most recent call last):
File "/Applications/DataSpell.app/Contents/plugins/python-ce/helpers/pydev/pydevconsole.py", line 364, in runcode
coro = func()
File "<input>", line 1, in <module>
File "/Users/simonschaefer/anaconda3/envs/bf/lib/python3.10/site-packages/bayesflow/trainers.py", line 551, in train_offline
data_set = SimulationDataset(simulations_dict, batch_size)
File "/Users/simonschaefer/anaconda3/envs/bf/lib/python3.10/site-packages/bayesflow/helper_classes.py", line 67, in __init__
self.data = tf.data.Dataset.from_tensor_slices(tuple(slices)).shuffle(buffer_size).batch(batch_size)
File "/Users/simonschaefer/anaconda3/envs/bf/lib/python3.10/site-packages/tensorflow/python/data/ops/dataset_ops.py", line 825, in from_tensor_slices
return from_tensor_slices_op._from_tensor_slices(tensors, name)
File "/Users/simonschaefer/anaconda3/envs/bf/lib/python3.10/site-packages/tensorflow/python/data/ops/from_tensor_slices_op.py", line 25, in _from_tensor_slices
return _TensorSliceDataset(tensors, name=name)
File "/Users/simonschaefer/anaconda3/envs/bf/lib/python3.10/site-packages/tensorflow/python/data/ops/from_tensor_slices_op.py", line 38, in __init__
self._structure = nest.map_structure(
File "/Users/simonschaefer/anaconda3/envs/bf/lib/python3.10/site-packages/tensorflow/python/data/util/nest.py", line 122, in map_structure
return nest_util.map_structure(
File "/Users/simonschaefer/anaconda3/envs/bf/lib/python3.10/site-packages/tensorflow/python/util/nest_util.py", line 1068, in map_structure
return _tf_data_map_structure(func, *structure, **kwargs)
File "/Users/simonschaefer/anaconda3/envs/bf/lib/python3.10/site-packages/tensorflow/python/util/nest_util.py", line 1135, in _tf_data_map_structure
return _tf_data_pack_sequence_as(structure[0], [func(*x) for x in entries])
File "/Users/simonschaefer/anaconda3/envs/bf/lib/python3.10/site-packages/tensorflow/python/util/nest_util.py", line 1135, in <listcomp>
return _tf_data_pack_sequence_as(structure[0], [func(*x) for x in entries])
File "/Users/simonschaefer/anaconda3/envs/bf/lib/python3.10/site-packages/tensorflow/python/data/ops/from_tensor_slices_op.py", line 39, in <lambda>
lambda component_spec: component_spec._unbatch(), batched_spec) # pylint: disable=protected-access
File "/Users/simonschaefer/anaconda3/envs/bf/lib/python3.10/site-packages/tensorflow/python/framework/tensor.py", line 1199, in _unbatch
raise ValueError("Unbatching a tensor is only supported for rank >= 1")
ValueError: Unbatching a tensor is only supported for rank >= 1
Our simulation data has a structure similar to the one in this BayesFlow script: the non-batchable context holds the number of observations, and the batchable context holds an np.array of congruency conditions. I tested the functions mentioned in the traceback and found that this call:
tf.data.Dataset.from_tensor_slices(tuple(slices)).shuffle(buffer_size).batch(batch_size)
cannot handle our non-batchable context (<class 'int'>) because it expects a shape corresponding to the other elements of slices. The problem seems to be caused by the non-batchable context, since changing it from an integer to an array of shape (number of simulations, 1) works fine. If I do so, I also have to change the configurator so that its "direct conditions" output matches the shape of the non-batchable context. Unfortunately, this leads to tensors with mismatched shapes and an error message when instantiating bf.trainers.Trainer():
Traceback (most recent call last):
File "/Users/simonschaefer/anaconda3/envs/bf/lib/python3.11/site-packages/bayesflow/trainers.py", line 1314, in _check_consistency
_ = self.amortizer.compute_loss(self.configurator(self.generative_model(_n_sim)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/simonschaefer/anaconda3/envs/bf/lib/python3.11/site-packages/bayesflow/amortizers.py", line 209, in compute_loss
net_out, sum_out = self(input_dict, return_summary=True, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/simonschaefer/anaconda3/envs/bf/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/Users/simonschaefer/anaconda3/envs/bf/lib/python3.11/site-packages/bayesflow/amortizers.py", line 174, in call
summary_out, full_cond = self._compute_summary_condition(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/simonschaefer/anaconda3/envs/bf/lib/python3.11/site-packages/bayesflow/amortizers.py", line 410, in _compute_summary_condition
full_cond = tf.concat([sum_condition, direct_conditions], axis=-1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tensorflow.python.framework.errors_impl.InvalidArgumentError: Exception encountered when calling layer 'dmc_amortizer_model21beta2wide_test_transf_offline' (type AmortizedPosterior).
{{function_node __wrapped__ConcatV2_N_2_device_/job:localhost/replica:0/task:0/device:CPU:0}} ConcatOp : Dimension 0 in both shapes must be equal: shape[0] = [2,32] vs. shape[1] = [699,1] [Op:ConcatV2] name: concat
Call arguments received by layer 'dmc_amortizer_model21beta2wide_test_transf_offline' (type AmortizedPosterior):
• input_dict={'summary_conditions': 'tf.Tensor(shape=(2, 699, 4), dtype=float32)', 'direct_conditions': 'tf.Tensor(shape=(699, 1), dtype=float32)', 'parameters': 'tf.Tensor(shape=(2, 5), dtype=float32)'}
• return_summary=True
• kwargs={'training': 'None'}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Applications/DataSpell.app/Contents/plugins/python-ce/helpers/pydev/pydevconsole.py", line 364, in runcode
coro = func()
^^^^^^
File "<input>", line 1, in <module>
File "/Users/simonschaefer/anaconda3/envs/bf/lib/python3.11/site-packages/bayesflow/trainers.py", line 220, in __init__
self._check_consistency()
File "/Users/simonschaefer/anaconda3/envs/bf/lib/python3.11/site-packages/bayesflow/trainers.py", line 1317, in _check_consistency
raise ConfigurationError(
bayesflow.exceptions.ConfigurationError: Could not carry out computations of generative_model ->configurator -> amortizer -> loss! Error trace:
Exception encountered when calling layer 'dmc_amortizer_model21beta2wide_test_transf_offline' (type AmortizedPosterior).
{{function_node __wrapped__ConcatV2_N_2_device_/job:localhost/replica:0/task:0/device:CPU:0}} ConcatOp : Dimension 0 in both shapes must be equal: shape[0] = [2,32] vs. shape[1] = [699,1] [Op:ConcatV2] name: concat
Call arguments received by layer 'dmc_amortizer_model21beta2wide_test_transf_offline' (type AmortizedPosterior):
• input_dict={'summary_conditions': 'tf.Tensor(shape=(2, 699, 4), dtype=float32)', 'direct_conditions': 'tf.Tensor(shape=(699, 1), dtype=float32)', 'parameters': 'tf.Tensor(shape=(2, 5), dtype=float32)'}
• return_summary=True
• kwargs={'training': 'None'}
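To make the two shape requirements from the tracebacks concrete, here is a minimal NumPy-only sketch of how I currently understand them. It does not use BayesFlow or TensorFlow. The helper names make_context and concat_conditions are hypothetical and only for illustration, the sizes are taken from the error messages above, and the zero array is merely a stand-in for the summary network output. I am assuming that the direct conditions need one row per simulation in the batch, which is what the ConcatOp message seems to suggest.

```python
import numpy as np


def make_context(n_sim, n_obs):
    """Hypothetical helper: repeat the number of observations once per simulation."""
    return np.full((n_sim, 1), n_obs, dtype=np.float32)


def concat_conditions(summary_out, direct_conditions):
    """Mimic the concatenation along the last axis reported in the traceback."""
    return np.concatenate([summary_out, direct_conditions], axis=-1)


n_sim, n_obs, n_summary, batch_size = 100, 699, 32, 10

# A plain int has rank 0, so there is no leading axis to slice per simulation.
print(np.asarray(n_obs).ndim)  # 0

# One row per simulation: this can be sliced and batched like the other entries.
context = make_context(n_sim, n_obs)

# Stand-in for the summary network output of one batch (assumed shape).
summary_out = np.zeros((batch_size, n_summary), dtype=np.float32)
full_cond = concat_conditions(summary_out, context[:batch_size])
print(full_cond.shape)  # (10, 33)

# The failing case from the second traceback: 2 simulations vs. 699 rows.
try:
    concat_conditions(np.zeros((2, n_summary)), np.zeros((n_obs, 1)))
except ValueError as err:
    print("shape mismatch:", err)
```

If this reading is right, the "direct conditions" returned by the configurator would need shape (batch size, 1) rather than (number of observations, 1), but I am not sure this is the intended way to handle it.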
Is there an easy way to adapt the simulation data (the non-batchable context in particular) so that it works with train_offline()?
Thank you in advance. I appreciate any thoughts and remarks.