Training data with additive noise

Jice · March 13, 2024, 10:12pm

Hi, all,
I added noise (e.g., Gaussian noise) to my training data in BayesFlow, but the trained model performs worse than one trained without noise. I have tried tuning the model for a while, but the issue persists. After reading some papers, I wonder whether this is caused by model misspecification. The real data contain noise at different levels, so accounting for noise during training is necessary. Can anyone suggest methods to improve model performance in the case of additive noise?
Thanks!

Jice

ali · March 14, 2024, 11:37pm

Others should have better and more systematic solutions, but from my experience, introducing more data and reducing the batch size (e.g., batch_size=16) would be helpful. More data would increase the chance of the neural network being exposed to more distinct modes of behavior, enabling it to understand the role of the noise parameter. A smaller batch size would possibly let the network escape local minima, but it can increase the training time significantly.

elseml · March 15, 2024, 9:37am

Hi Jice,

Are you referring to performance on simulated data or real-world data?
For diagnostics on simulated data, model misspecification should not be an issue since your test data is simulated from the same model that you train your network on. Adding noise makes the inverse task that the network learns harder, so I would try making the network more expressive (= bigger), letting it train longer while monitoring potential overfitting, and tuning the learning rate.
For real-world data, we recently proposed using the variation between the predictions of a neural network ensemble to measure the impact of model misspecification. It would be interesting to see whether adding noise to your model reduces the sensitivity induced by misspecification by improving the reliability of the networks.
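
As an illustration of the ensemble idea, here is a minimal sketch of one simple way to quantify variation between ensemble members. It is not necessarily the measure used in the work mentioned above. The function name `ensemble_disagreement` and the array layout `(n_networks, n_draws, n_params)` are assumptions made for this example.

```python
import numpy as np


def ensemble_disagreement(posterior_draws):
    """Spread of posterior means across ensemble members for ONE data set.

    posterior_draws: array of shape (n_networks, n_draws, n_params), holding
    posterior samples from each independently trained network (assumed layout).
    Returns an array of shape (n_params,).
    """
    draws = np.asarray(posterior_draws, dtype=float)
    if draws.ndim != 3:
        raise ValueError("expected shape (n_networks, n_draws, n_params)")
    # Posterior mean of each parameter, separately for every network
    member_means = draws.mean(axis=1)        # (n_networks, n_params)
    # Sample standard deviation of those means across the ensemble
    return member_means.std(axis=0, ddof=1)  # (n_params,)


# Toy usage with synthetic draws standing in for real network outputs
rng = np.random.default_rng(0)
toy_draws = rng.normal(size=(5, 1000, 2))
print(ensemble_disagreement(toy_draws))
```

Large values on real data, compared with values on simulated data, would suggest that the networks disagree more than expected, which is the kind of signal described above.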

Jice · March 18, 2024, 6:49am

Hi Elseml,
Thanks for your reply. It is true that adding noise to the observations makes inference more difficult. My training data are pairs (x, y), where x are the parameters and y are the observations. The neural network approximates the posterior well given a new y*. However, if I add noise to y* (i.e., y* + noise), since true observations always contain noise, the performance degrades. I guess this is because the training data do not include any noise, which could be called model misspecification, as y in the training data differs from (y* + noise).
I found a paper, "Noise-Net: Determining physical properties of H ii regions reflecting observational uncertainties". It implements noise training that accounts for noise during cINN training, based on Soft Flow. I found that the BayesFlow framework has a soft flow option for the inference network. But the cINN with soft flow perturbs x, not the observations: it uses x + noise and concatenates y and the noise, [y, noise]. Do I understand this correctly? If so, the soft flow part in BayesFlow should be modified as below if I want to train a Noise-Net like the one in the paper:

        # Perturb data with noise (will broadcast to all dimensions) (we should not perturb x)
        if len(shape_scale) == 2 and len(target_shape) == 3:
            targets += tf.expand_dims(noise_scale, axis=1) * 0
        else:
            targets += noise_scale * 0

        # Augment condition with noise scale variate (we should perturb the condition with noise)
        condition += noise_scale * tf.random.normal(shape=condition_shape)

Thanks for your discussion with me!

Jice

marvinschmitt · March 18, 2024, 7:17am

Hi Jice,

Thanks for kicking off that interesting discussion!

re misspecification/robustness:

Your observation that posterior inference gets worse when we encounter unexpected noise is in line with findings from our paper on model misspecification in amortized Bayesian inference: https://www.dagm-gcpr.de/fileadmin/dagm-gcpr/pictures/2023_Heidelberg/Paper_MainTrack/030.pdf
Here’s a paper that also adds noise during training to achieve more robust inference in noisy real-data settings: Simulation-based Inference for Cardiovascular Models (arXiv:2307.13918)

re softflow <> noise:

I don’t think you need softflow in this case. As you said, softflow acts on the parameter domain and not on the data domain. Adding noise during training is independent of softflow.

re noise during workflow:

Where in the data flow are you currently adding the noise? My naïve take would be to add the noise in the configurator.
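
To make the configurator suggestion concrete, here is a minimal sketch of a configurator that adds fresh Gaussian noise every time it is called. The dictionary keys (`prior_draws`, `sim_data`, `parameters`, `summary_conditions`) are assumed to follow BayesFlow 1.x conventions and should be checked against your version. The function name `noisy_configurator` and the noise level `NOISE_SD` are hypothetical.

```python
import numpy as np

RNG = np.random.default_rng(2024)
NOISE_SD = 0.1  # hypothetical noise level; match it to the real measurement noise


def noisy_configurator(forward_dict, noise_sd=NOISE_SD, rng=RNG):
    """Map raw simulator output to network inputs, adding new noise on every call.

    forward_dict is assumed to hold "prior_draws" (parameters) and "sim_data"
    (clean observations), both with the batch dimension first.
    """
    sim_data = np.asarray(forward_dict["sim_data"], dtype=np.float32)
    # A new noise realization per call, so the network rarely sees the same noisy y twice
    noise = rng.normal(0.0, noise_sd, size=sim_data.shape).astype(np.float32)
    return {
        "parameters": np.asarray(forward_dict["prior_draws"], dtype=np.float32),
        "summary_conditions": sim_data + noise,  # only the observations are perturbed
    }


# Toy usage: 8 simulations, 2 parameters, 20 observations of dimension 1
batch = {"prior_draws": np.ones((8, 2)), "sim_data": np.zeros((8, 20, 1))}
inputs = noisy_configurator(batch)
print(inputs["summary_conditions"].shape)
```

Because the noise is drawn inside the function rather than baked into the stored data set, the clean simulations stay untouched and the parameters are never perturbed.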

Cheers,
Marvin

Jice · March 18, 2024, 4:53pm

Hi Marvin,
Thank you very much for the suggestions. You are right, soft flow perturbs the parameter domain by adding noise to X and does not act on the observations Y.
I have tried adding the noise directly to the training data (all training data are already at hand) and then training BayesFlow on those noisy training data (as you said, adding the noise in the configurator). But the model performance is bad.
What I want to do is to perturb Y in a way similar to soft flow, as the paper does (Noise-Net: Determining physical properties of H ii regions reflecting observational uncertainties, Monthly Notices of the Royal Astronomical Society). So I modified the soft flow part of the BayesFlow code as follows:

    # not perturb x in soft flow
    if len(shape_scale) == 2 and len(target_shape) == 3:
        targets += tf.expand_dims(noise_scale, axis=1) * 0
    else:
        targets += noise_scale * 0

    # perturb the condition with noise
    condition += noise_scale * tf.random.normal(shape=condition_shape)

What do you think? Is it feasible?
Thanks!

Jice

marvinschmitt · March 20, 2024, 9:05am

Hi Jice,

Thanks for clarifying. Where exactly are you doing the condition +=noise_scale * tf.random.normal(shape=condition_shape) part? I don’t see why you would need to modify “the softflow part”.

Cheers,
Marvin

Jice · March 22, 2024, 2:15am

Yes, you are right. It actually has nothing to do with the softflow part. I simply sample an error from a uniform distribution and add it to the observations, and this is repeated at every training iteration. A model trained this way now performs very stably under different noise levels.
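
For readers who want to try something similar, here is a minimal sketch of one possible reading of this procedure. It is not the exact code used above. Drawing a separate error amplitude per simulation is an assumption made so that the network sees several noise levels. The names `add_uniform_error` and `max_amplitude` are hypothetical.

```python
import numpy as np


def add_uniform_error(observations, max_amplitude, rng):
    """Add uniform error to a batch of observations (batch dimension first).

    Each simulation gets its own amplitude a ~ U(0, max_amplitude), and every
    element of that simulation receives an error drawn from U(-a, a).
    Returns the noisy observations and the per-simulation amplitudes.
    """
    obs = np.asarray(observations, dtype=float)
    batch_size = obs.shape[0]
    # One amplitude per simulation, shaped to broadcast over the remaining axes
    amp_shape = (batch_size,) + (1,) * (obs.ndim - 1)
    amplitude = rng.uniform(0.0, max_amplitude, size=amp_shape)
    error = rng.uniform(-1.0, 1.0, size=obs.shape) * amplitude
    return obs + error, amplitude.reshape(batch_size)


# Toy usage: the clean batch stays fixed, the error is redrawn at every iteration
rng = np.random.default_rng(0)
clean_y = np.zeros((4, 5))
for step in range(3):
    noisy_y, amp = add_uniform_error(clean_y, max_amplitude=0.5, rng=rng)
    print(step, np.abs(noisy_y).max())
```

Redrawing the error at every iteration means the stored training set never needs to be duplicated for different noise realizations.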

elseml · March 25, 2024, 11:21am

That’s great to hear!
