Preferred way to deal with time series with non-equidistant time steps

nw95 · May 23, 2024, 4:14pm

I am currently testing BayesFlow with different ODE-based models. Recently I turned my attention to models whose outputs have a varying number of observations at varying time points. For example, one model output could be observed at times t=1,3,4,5 and another at times t=2,3. Is there a preferred way to deal with such data? Extending the data to uniform time steps (t=1,2,3,4,5 in this example) might lead to undesirable behavior. Has anyone encountered a similar problem and can offer some advice?

marvinschmitt · May 23, 2024, 4:32pm

Hi, welcome to the BayesFlow Forums!

I would suggest adding the original time information (t=1,3,4,...) as a dimension on the last axis of your data, and then using a TimeSeriesTransformer as a summary network. If you need support setting this up, you can provide a minimal reproducible example and I’m happy to help.
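As a rough illustration of the "time as an extra dimension" idea, here is a minimal NumPy sketch. The helper name `attach_times` and the observation values are made up for this example; only the time grids t=1,3,4,5 and t=2,3 come from the question above.

```python
import numpy as np

def attach_times(values, times):
    """Stack observation times onto the last axis of a series.

    values: (num_obs,) or (num_obs, data_dim) observed values
    times:  (num_obs,) time stamp of each observation
    returns (num_obs, data_dim + 1), with the time stamp in the last column
    """
    values = np.asarray(values, dtype=np.float32)
    times = np.asarray(times, dtype=np.float32)
    if values.ndim == 1:
        values = values[:, None]  # treat a flat series as data_dim = 1
    return np.concatenate([values, times[:, None]], axis=-1)

# Two series observed on different, non-equidistant grids.
# The observation values are invented purely for illustration.
x = attach_times([0.9, 2.7, 4.1, 5.2], times=[1, 3, 4, 5])  # shape (4, 2)
y = attach_times([1.8, 3.1], times=[2, 3])                  # shape (2, 2)
```

Each row then carries its own time stamp, so the summary network sees when a value was observed rather than assuming a uniform grid.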

Cheers,
Marvin

PS: You can also experiment with custom time encodings (e.g., sinusoidal or some learned time embedding), but I wouldn’t recommend starting with that if the simple time+transformer approach works.
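For reference, a sinusoidal time encoding could look roughly like the sketch below. The function name and the parameters `num_frequencies` and `max_period` are assumptions chosen for this example, not BayesFlow settings. The frequency ladder follows the usual Transformer positional-encoding recipe, applied to real-valued time stamps instead of integer positions.

```python
import numpy as np

def sinusoidal_time_encoding(times, num_frequencies=4, max_period=100.0):
    """Map time stamps (num_obs,) to features (num_obs, 2 * num_frequencies)."""
    times = np.asarray(times, dtype=np.float64)[:, None]
    # Geometrically spaced frequencies, from 1 down towards 1 / max_period.
    freqs = max_period ** (-np.arange(num_frequencies) / num_frequencies)
    angles = times * freqs[None, :]
    # Sine features first, cosine features second.
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

# Works for irregular (and non-integer) time stamps alike.
enc = sinusoidal_time_encoding([1, 3, 4, 5])  # shape (4, 8)
```

These features would replace (or be appended next to) the raw time column on the last axis of the data.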

nw95 · May 23, 2024, 5:53pm

Thanks for the welcome and the quick response! This is really helpful. I think I didn’t get this across in my initial post, but my output consists of two time series with potentially different time points: one time series with t=1,3,4,5 and one with t=2,3. Is it possible to adapt the TimeSeriesTransformer for this purpose? A minimal reproducible example (if possible) would indeed be nice.

marvinschmitt · May 24, 2024, 9:27am

Thanks for the clarification. Just to make sure: Given one parameter vector \theta, your model outputs:

x with t_x=1,3,4,5 and also
y with t_y=2,3,

making your simulator have a signature like \theta\mapsto[x,y]?

Or is it \theta\mapsto x but sometimes t_x=1,3,4,5 and sometimes t_x=2,3?

If you indeed have 2 time series outputs [x, y] from a single call to the simulator, then you could make use of our recent work on multimodal fusion for neural posterior estimation (Link to preprint). Once we have completely clarified your exact problem setting, I’m happy to help more!

nw95 · May 25, 2024, 6:10pm

For now I want to restrict my application to the former case. That is, a single call to the simulator indeed returns two outputs x and y with t_x = 1,3,4,5 and t_y = 2,3. I have taken a look at the preprint, and it seems to be exactly what I need. A small example would also be very helpful!

Thank you for taking the time to help me!

marvinschmitt · May 30, 2024, 10:09am

Hi, apologies for the delay. I set up a small example for you. The setup is very simple:

2-dimensional target parameter \theta\sim\mathrm{Normal}(0, 1)
source x: random walk with drift \theta, noise \sigma=0.3, 10 discretization steps, but only 5 randomly selected observations are kept
source y: random walk with drift \theta, noise \sigma=0.8, 10 discretization steps, but only 3 randomly selected observations are kept

Shapes

The shapes for a batch of forward samples are:

\theta: (batch_size, 2)
x: (batch_size, 5, 3)
y: (batch_size, 3, 3)

Both x and y are 3-dimensional in the last axis because we have two data dimensions (2D random walk) and one time dimension which contains the actual time stamps of the observations.
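To make the shapes concrete, here is a minimal NumPy sketch of such a simulator. It is not the notebook's code: the function name `simulate_source`, the time stamps 1..10 (one per discretization step), and drawing the observed steps independently for every batch element are all assumptions made for illustration.

```python
import numpy as np

def simulate_source(theta, sigma, num_obs, num_steps=10, rng=None):
    """Random walk with drift theta, observed at num_obs random time steps.

    theta: (batch_size, dim) drift parameters
    returns (batch_size, num_obs, dim + 1); the last column holds the time stamps
    """
    rng = np.random.default_rng() if rng is None else rng
    batch_size, dim = theta.shape
    # Each increment is the drift plus Gaussian noise; cumsum gives the walk.
    noise = sigma * rng.normal(size=(batch_size, num_steps, dim))
    walk = np.cumsum(theta[:, None, :] + noise, axis=1)
    out = np.empty((batch_size, num_obs, dim + 1))
    for b in range(batch_size):
        # Keep a random, time-ordered subset of the discretization steps.
        idx = np.sort(rng.choice(num_steps, size=num_obs, replace=False))
        out[b, :, :dim] = walk[b, idx]
        out[b, :, dim] = idx + 1  # assumed time stamps 1..num_steps
    return out

rng = np.random.default_rng(0)
theta = rng.normal(size=(32, 2))                            # (batch_size, 2)
x = simulate_source(theta, sigma=0.3, num_obs=5, rng=rng)   # (32, 5, 3)
y = simulate_source(theta, sigma=0.8, num_obs=3, rng=rng)   # (32, 3, 3)
```

Both sources share the same `theta`, which is what makes the setting multimodal: two differently sampled views of one parameter vector.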

Summary networks

Both sources are processed with a TimeSeriesTransformer, each of which learns 10 summary dimensions. 10+10=20 summaries are likely overkill, but that’s not the point here.
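The shape bookkeeping of this setup can be sketched without any deep learning library. In the snippet below, `toy_summary` is a deliberately crude stand-in (a fixed random projection followed by mean pooling), not a TimeSeriesTransformer, and joining the two summaries by concatenation is an assumption inferred from the 10+10=20 count.

```python
import numpy as np

def toy_summary(series, weights):
    """Stand-in for a learned summary network (NOT a transformer).

    series:  (batch_size, num_obs, 3) values plus time stamp
    weights: (3, summary_dim) fixed projection matrix
    returns  (batch_size, summary_dim)
    """
    # Project every time step, then mean-pool over the observation axis so
    # the output size does not depend on the number of observations.
    return np.tanh(series @ weights).mean(axis=1)

rng = np.random.default_rng(1)
x = rng.normal(size=(32, 5, 3))  # placeholder batch for source x
y = rng.normal(size=(32, 3, 3))  # placeholder batch for source y

w_x = rng.normal(size=(3, 10))   # one separate "network" per source
w_y = rng.normal(size=(3, 10))

# One summary vector per source, joined into a single conditioning vector.
summaries = np.concatenate([toy_summary(x, w_x), toy_summary(y, w_y)], axis=-1)
```

The resulting `(32, 20)` array is what an inference network would be conditioned on. Because each source has its own summary function, x and y may have different lengths and time grids.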

General thoughts

I don’t know about the exact structure of your data, but it might make sense to share information between the summary networks. Also, the specifics of the networks (number of layers, units per layer, …) and regularization (in case of offline learning on a fixed data set) are the first things I’d change for an actual application. If you want additional eyes on your specific modeling example, let me know (here or via email).

Link to notebook

Here’s the Google Colab link for you to clone and play with it: Google Colab

nw95 · June 2, 2024, 2:16pm

Thank you very much! I was on a short trip so also apologies for the delay. I will play around with it and if I have questions I might have to impose on you again.

marvinschmitt · August 6, 2024, 9:15am

Hi,

How did your experiments go? Let me know if you have any follow-up questions and I’m happy to help anytime.

nw95 · August 6, 2024, 11:07am

Hey,

It is going great so far, and I have no follow-up questions on this topic at the moment. Thank you for your help!

Right now, I just have to fix some problems regarding identifiability of the model parameters. In theory, the output trajectories are enough to identify the parameter but in practice this sadly is not the case.

KLDivergence · August 6, 2024, 11:25am

Hi Nils, if some parameters are only weakly identifiable, you may want to play around with the summary network hyperparameters a bit and see if you can scrape some additional performance.
Cheers,
Stefan

nw95 · August 8, 2024, 5:25pm

Hi Stefan,

I tried playing around with the summary network a bit, but sadly this didn’t change much. This is not much of a surprise: plotting the model output trajectories shows that they can be very close to each other for certain parameters.

However, I noticed that BayesFlow was able to infer some algebraic relations between the input parameters, so I used that as a kind of dimension reduction.
