Attention mask for the TimeSeriesTransformer summary network

leo · December 12, 2023, 2:59pm

Hi!
I am trying to use time series data of different lengths with the TimeSeriesTransformer as the summary network. To avoid having to build batches of equal-length time series, I was thinking of padding them to a fixed length and then using an attention mask. Is there an easy way to pass an attention mask to the TimeSeriesTransformer summary network, or would I have to rewrite the class?

Cheers,
Leonardo
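
For reference, the padding idea described above could be sketched in plain NumPy roughly as follows. This is only an illustration under assumptions: `pad_and_mask` is a hypothetical helper, not a BayesFlow function, and the convention that `True` marks a real (non-padded) time step is a choice made here.

```python
import numpy as np


def pad_and_mask(series_list, max_len=None, pad_value=0.0):
    """Pad series of shape (T_i, D) to one array of shape (B, max_len, D).

    Hypothetical helper for illustration. Returns the padded batch and a
    boolean mask of shape (B, max_len), where True marks a real time step.
    """
    longest = max(len(s) for s in series_list)
    if max_len is None:
        max_len = longest
    if longest > max_len:
        raise ValueError("max_len is shorter than the longest series")

    num_dims = series_list[0].shape[-1]
    padded = np.full((len(series_list), max_len, num_dims), pad_value, dtype=np.float32)
    mask = np.zeros((len(series_list), max_len), dtype=bool)

    for i, series in enumerate(series_list):
        steps = len(series)
        padded[i, :steps] = series  # copy the real observations
        mask[i, :steps] = True      # everything after `steps` stays padded
    return padded, mask
```

The mask then has to reach the attention layers of the summary network, which is the part the question is about.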

KLDivergence · December 13, 2023, 6:30pm

Hi Leonardo,

This may be tricky to do currently, since it requires some modifications so that masks are allowed in the configured inputs. Let’s discuss the possibility of adding this functionality.

@marvinschmitt @elseml @paul.buerkner @valentin What do you think about:

inference_mask: ...,
summary_mask: ...,

as optional configurator keys, which, when present, are propagated to the appropriate networks?

@leo To get you started quickly, you can do a custom modification to the existing time series transformer.
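
As a rough picture of what such a custom modification has to achieve: padded time steps should receive zero attention weight. The NumPy sketch below shows this for a single attention head without learned projections. It is not the BayesFlow implementation; `masked_self_attention` and the `True` = real step convention are assumptions made for illustration, and every series is assumed to contain at least one real step.

```python
import numpy as np


def masked_self_attention(x, mask):
    """Single-head self-attention that ignores padded keys.

    x:    (B, T, D) padded batch, used as queries, keys, and values.
    mask: (B, T) boolean, True for real time steps.
    Returns the attended outputs (B, T, D) and the weights (B, T, T).
    """
    dim = x.shape[-1]
    scores = x @ x.transpose(0, 2, 1) / np.sqrt(dim)  # (B, T, T)

    # Push scores of padded keys to a large negative value before the softmax.
    scores = np.where(mask[:, None, :], scores, -1e9)

    # Numerically stable softmax over the key axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ x, weights
```

Output rows at padded query positions are still computed and should be dropped or masked again before pooling. If the attention blocks are built on the Keras `MultiHeadAttention` layer, its call accepts an `attention_mask` argument of shape (batch, query length, key length) that serves the same purpose.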

paul.buerkner · December 14, 2023, 8:50am

Can you write out some example code to showcase the syntax you have in mind?

KLDivergence · December 14, 2023, 3:07pm

@paul.buerkner The syntax will be the one above. Currently, our configuration dictionaries have some combination of the keys:

conf = {
    "parameters": ...,
    "summary_conditions": ...,
    "direct_conditions": ...,
}

I am proposing to allow for additional optional keys:

conf = {
    "parameters": ...,
    "summary_conditions": ...,
    "direct_conditions": ...,
    "inference_mask": ...,
    "summary_mask": ...,
}

which can hold masks for each of the networks’ outputs and are propagated to the associated networks by the Amortizer, if they are present.
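
One way to read "propagated if present" is sketched below in plain Python. This is not the Amortizer code: `route_masks` and the `mask` keyword it produces are hypothetical names chosen for illustration, and only the two proposed dictionary keys come from this thread.

```python
def route_masks(conf):
    """Collect optional mask entries as keyword arguments for each network.

    Hypothetical helper: returns (summary_kwargs, inference_kwargs). A
    missing or None mask yields an empty dict, so existing configurations
    without masks behave exactly as before.
    """
    summary_kwargs = {}
    inference_kwargs = {}

    if conf.get("summary_mask") is not None:
        summary_kwargs["mask"] = conf["summary_mask"]
    if conf.get("inference_mask") is not None:
        inference_kwargs["mask"] = conf["inference_mask"]

    return summary_kwargs, inference_kwargs
```

An amortizer could then call the summary network with `**summary_kwargs` and the inference network with `**inference_kwargs`, provided both networks accept such a keyword.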

paul.buerkner · December 14, 2023, 4:02pm

Thanks for the details. That looks reasonable to me!

elseml · December 15, 2023, 9:04am

Agree, optional keys for specific use cases are an elegant solution.
