Time Series Dimension Errors in HierarchicalNetwork

Hi! I’ve recently started trying out BayesFlow for the first time for a use case at my organization (it seems really cool!), but I’ve run into an error when using the HierarchicalNetwork with some hierarchical time series data.

I’ve set up my configurator to output data in the shape (n_batches, n_groups, n_observations, n_dims). As far as I can tell from the docstring on the HierarchicalNetwork class, this seems like the correct format.
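
For concreteness, here’s roughly what the summary input coming out of my configurator looks like (the sizes and the variable name are made up for this example):

import numpy as np

# made-up sizes: 32 batches, 15 groups, 10 time points per group, 5 dims per observation
n_batches, n_groups, n_observations, n_dims = 32, 15, 10, 5
summary_conditions = np.random.normal(
    size=(n_batches, n_groups, n_observations, n_dims)
).astype(np.float32)
print(summary_conditions.shape)  # (32, 15, 10, 5)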

Docstring
def call(self, x, return_all=False, **kwargs):
        """Performs the forward pass through the hierarchical network,
        transforming the nested input into learned summary statistics.

        Parameters
        ----------
        x          : tf.Tensor of shape (batch_size, ..., data_dim)
            Example, hierarchical data sets with two levels:
            (batch_size, D, L, x_dim) -> reduces to (batch_size, out_dim).
        return_all : boolean, optional, default: False
            Whether to return all intermediate outputs (True) or just
            the final one (False).

        Returns
        -------
        out : tf.Tensor
            Output of shape ``(batch_size, out_dim) if return_all=False`` else a tuple
            of ``len(outputs) == len(networks)`` corresponding to all outputs of all networks.
        """

The observations are indexed by time. Based on that, I’ve tried to set up my components as follows:

summary_net = bf.networks.HierarchicalNetwork(
    [bf.networks.SequenceNetwork(summary_dim=32), bf.networks.DeepSet(summary_dim=128)]
)
local_amortizer = bf.amortizers.AmortizedPosterior(
    inference_net=bf.networks.InvertibleNetwork(num_params=N_LOCAL_PARAMS)
)
global_amortizer = bf.amortizers.AmortizedPosterior(
    inference_net=bf.networks.InvertibleNetwork(num_params=N_GLOBAL_PARAMS)
)
amortizer = bf.amortizers.TwoLevelAmortizedPosterior(
    local_amortizer=local_amortizer, global_amortizer=global_amortizer, summary_net=summary_net
)

Unfortunately, this summary network definition raises an error from the recurrent network during the consistency check triggered when creating the trainer:

trainer = bf.trainers.Trainer(amortizer=amortizer, generative_model=model, configurator=configure_inputs)
ConfigurationError: Could not carry out computations of generative_model ->configurator -> amortizer -> loss! Error trace:
 Exception encountered when calling layer 'sequence_network_4' (type SequenceNetwork).

Input 0 of layer "lstm_9" is incompatible with the layer: expected ndim=3, found ndim=4.

Interestingly, the consistency check passes if I swap out the SequenceNetwork for a DeepSet. Does anyone have a suggestion on what I might be doing wrong, or whether this is perhaps a bug?

Thanks!

Tech Details

macOS 15.1.1 Sequoia
Python 3.11.11

bayesflow==1.1.6
tensorflow==2.15.1
tensorflow-probability==0.23.0

Hi, welcome to the forum!

Currently, some of the summary networks are written to expect inputs of shape (n_batches, n_observations, n_dims). We can verify this by creating a SequenceNetwork directly and passing some random data through it:

import bayesflow as bf
import tensorflow as tf

seq_net = bf.networks.SequenceNetwork(summary_dim=32)

seq_net(tf.random.normal((32, 15, 10, 5)))  # fails: the LSTM expects 3D input
seq_net(tf.random.normal((32, 10, 5)))      # output shape: (32, 32)

What we want to do instead is to treat the group dimension like an additional batch dimension. One way to do this is to wrap the inner network in a tf.keras.layers.TimeDistributed layer (see the TensorFlow documentation for tf.keras.layers.TimeDistributed; the timesteps dimension there corresponds to our group dimension).

import bayesflow as bf
import tensorflow as tf

seq_net = tf.keras.layers.TimeDistributed(bf.networks.SequenceNetwork(summary_dim=32))
seq_net(tf.random.normal((32, 15, 10, 5)))  # output shape: (32, 15, 32)

There is one additional caveat: If you later want to amortize over the number of observations, the above will fail because of how the TimeDistributed layer initializes the shapes.

seq_net(tf.random.normal((32, 15, 8, 5)))
# fails because TimeDistributed is not happy about how the observation dimension changed after building

To remedy this, we can build the network manually with flexible input shapes:

seq_net = tf.keras.layers.TimeDistributed(bf.networks.SequenceNetwork(summary_dim=32))
seq_net.build((None, None, None, 5))

# seq_net(tf.random.normal((32, 15, 10, 5))) works
# seq_net(tf.random.normal((32, 15, 8, 5))) afterwards now also works

# put seq_net into the HierarchicalNetwork:
summary_net = bf.networks.HierarchicalNetwork([seq_net, bf.networks.DeepSet(summary_dim=128)])
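
As a quick sanity check with the same example sizes as above, a full hierarchical batch should now reduce to a single summary vector per data set (the docstring's (batch_size, D, L, x_dim) -> (batch_size, out_dim) case, where out_dim is the DeepSet's summary_dim):

import bayesflow as bf
import tensorflow as tf

# assembled as above: TimeDistributed SequenceNetwork per group, DeepSet across groups
seq_net = tf.keras.layers.TimeDistributed(bf.networks.SequenceNetwork(summary_dim=32))
seq_net.build((None, None, None, 5))
summary_net = bf.networks.HierarchicalNetwork([seq_net, bf.networks.DeepSet(summary_dim=128)])

x = tf.random.normal((32, 15, 10, 5))  # (batch, groups, observations, dims)
print(summary_net(x).shape)            # expected: (32, 128)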

This is admittedly quite unintuitive and will be much nicer in the upcoming BayesFlow release 🙂


Thank you for the explanation, that is very helpful!

By the way, the main resource I’ve used to get myself up to speed on the API thus far has been the example notebooks, though I notice that they don’t seem to cover this particular aspect of the library. I’ve created a notebook with a contrived dataset that follows this nested time series structure in order to teach myself the functionality. If I were to polish it up, do you know if the team would be interested in including that in the example set? I think it would be similar in scope to the Two Moons example.


Hi Noah, I would be happy to include your example notebook on the landing page!
