Hi!
I am trying to use time series data of different lengths with the TimeSeriesTransformer as the summary network. To avoid having to build batches of time series of the same length, I was thinking of padding them to a fixed length and then using an attention mask. Is there an easy way to pass an attention mask to the TimeSeriesTransformer summary network, or would I have to rewrite the class?
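For context, here is a minimal sketch of the padding part of the idea in plain NumPy, independent of BayesFlow. The helper name `pad_and_mask` is made up for illustration; it pads each series to the longest length in the batch and records which steps are real:

```python
import numpy as np

def pad_and_mask(series_list, pad_value=0.0):
    """Pad a list of (T_i, D) arrays to (N, T_max, D) and return a
    (N, T_max) boolean mask that is True on real (non-padded) steps."""
    max_len = max(s.shape[0] for s in series_list)
    dim = series_list[0].shape[1]
    batch = np.full((len(series_list), max_len, dim), pad_value, dtype=np.float32)
    mask = np.zeros((len(series_list), max_len), dtype=bool)
    for i, s in enumerate(series_list):
        batch[i, : s.shape[0]] = s
        mask[i, : s.shape[0]] = True
    return batch, mask

# Example: three series of lengths 5, 8, and 3 with 2 channels each.
series = [np.random.randn(t, 2) for t in (5, 8, 3)]
x, mask = pad_and_mask(series)
print(x.shape, mask.shape)  # (3, 8, 2) (3, 8)
```

The open question is then how to get `mask` into the attention layers of the summary network.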
This is tricky to do at the moment and would require some modifications so that masks can be passed through as configured inputs. Let's discuss the possibility of adding this functionality.
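To make the discussion concrete, here is a rough sketch of the kind of modification that would be needed, written as a standalone Keras self-attention block rather than BayesFlow's actual TimeSeriesTransformer internals. The class and its `mask` argument are hypothetical; the point is only that a per-step padding mask can be expanded into the `attention_mask` that `keras.layers.MultiHeadAttention` already accepts:

```python
import keras

class MaskedSelfAttentionBlock(keras.layers.Layer):
    """Self-attention over time steps that ignores padded positions (illustrative only)."""

    def __init__(self, num_heads=4, key_dim=32, **kwargs):
        super().__init__(**kwargs)
        self.attention = keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=key_dim)
        self.norm = keras.layers.LayerNormalization()

    def call(self, x, mask=None):
        # x: (batch, time, features); mask: (batch, time) boolean, True = real step.
        attention_mask = None
        if mask is not None:
            # Expand the per-step mask to (batch, time, time): position t may
            # attend to position s only if both are real (non-padded) steps.
            attention_mask = keras.ops.logical_and(
                keras.ops.expand_dims(mask, 1),  # (batch, 1, time)
                keras.ops.expand_dims(mask, 2),  # (batch, time, 1)
            )
        attended = self.attention(x, x, attention_mask=attention_mask)
        return self.norm(x + attended)

# Usage with the padded batch and mask from the snippet above:
block = MaskedSelfAttentionBlock()
out = block(x, mask=mask)
print(out.shape)  # (3, 8, 2)
```

The main design question for the library would be how such a mask travels from the configured inputs down to each attention layer, rather than the masking arithmetic itself.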