Hi,
In some experiments with stochastic ODE models, we are finding that it takes a lot of data and training for the summary networks (either TimeSeriesTransformer or SequenceNetwork) to learn the process and measurement noise parameters, even though these parameters have rather clear and intuitive signatures in the outputs, which we can inspect visually. That motivates the idea of augmenting the automated summary statistics with generic (for time series) manually crafted ones. What is the best way to do that? SplitNetwork? Is there an example out there of how to do that?
We are currently working on a generic and sophisticated solution to this problem. For now, you can achieve what you want with a simple configurator that returns the manually crafted summary statistics under the direct_conditions dictionary key. The raw data stays under the summary_conditions key. The two will be combined automatically. Here is an example with the toy model from GitHub:
import numpy as np
import bayesflow as bf

def simulator(theta, n_obs=50, scale=1.0):
    # Gaussian observations centered on the parameter vector theta
    return np.random.default_rng().normal(loc=theta, scale=scale, size=(n_obs, theta.shape[0]))

def prior(D=2, mu=0.0, sigma=1.0):
    # Gaussian prior over the D model parameters
    return np.random.default_rng().normal(loc=mu, scale=sigma, size=D)

def configurator(input_dict):
    # Example hand-crafted statistics: sample average of shape (batch_size, D)
    stats = np.mean(input_dict['sim_data'], axis=1).astype(np.float32)
    # Raw data will still be processed by the summary network
    raw_data = input_dict['sim_data'].astype(np.float32)
    output_dict = {
        'summary_conditions': raw_data,
        'direct_conditions': stats,
        'parameters': input_dict['prior_draws'].astype(np.float32)
    }
    return output_dict
generative_model = bf.simulation.GenerativeModel(prior, simulator)
# Inspect output
configurator(generative_model(batch_size=3))
# Workflow as usual...
Don’t forget to pass your custom configurator to the Trainer.
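For instance, along these lines (a minimal sketch; the specific networks and sizes are illustrative, not prescribed):

summary_net = bf.networks.SequenceNetwork()
inference_net = bf.networks.InvertibleNetwork(num_params=2)
amortizer = bf.amortizers.AmortizedPosterior(inference_net, summary_net)

# The custom configurator is hooked in here
trainer = bf.trainers.Trainer(
    amortizer=amortizer,
    generative_model=generative_model,
    configurator=configurator
)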
Hi Hazhir,
I also encountered a situation like yours. In my case, the simulation budget is limited because the forward simulation is very expensive. What I did was manually transform the time-series data into frequency-domain data, for example extracting the natural frequency from acceleration time series. Using the natural frequency as a summary statistic to train the model is very efficient and requires less training data.
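In case it is useful to others, the extraction step can be as simple as a peak pick on the FFT magnitude spectrum (a rough sketch; the sampling interval and the single-dominant-peak assumption are illustrative):

import numpy as np

def dominant_frequency(acc, dt=0.01):
    # Estimate the dominant (natural) frequency of an acceleration
    # time series from its FFT magnitude spectrum.
    # dt is the sampling interval in seconds (illustrative default).
    spectrum = np.abs(np.fft.rfft(acc - np.mean(acc)))
    freqs = np.fft.rfftfreq(len(acc), d=dt)
    # Skip the zero-frequency bin and return the peak location
    return freqs[1:][np.argmax(spectrum[1:])]

A statistic like this would then go under direct_conditions in the configurator above.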
I have a question on this. Does this mean that the hand-crafted summary statistics (direct_conditions) are passed to the summary network and learned by the neural net too?
Direct conditions are not passed to the summary network. Only the “summary conditions” go into the summary network. Then, the output of the summary network is concatenated with the direct conditions, and that’s the conditioning input to the normalizing flow.
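Schematically (shapes only, not BayesFlow's actual internals):

import numpy as np

batch_size, summary_dim, stats_dim = 32, 10, 2
learned_summaries = np.zeros((batch_size, summary_dim))  # output of the summary network
direct_conditions = np.zeros((batch_size, stats_dim))    # hand-crafted statistics

# Concatenated along the last axis -> conditioning input to the normalizing flow
conditions = np.concatenate([learned_summaries, direct_conditions], axis=-1)
assert conditions.shape == (batch_size, summary_dim + stats_dim)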