Training data with additive noise

Hi, all,
I added noise to the training data in BayesFlow (e.g., Gaussian noise), but I found that the model performance after training is not as good as without noise. I tried tuning the model for a while, but the issue is still there. I read some papers and wonder whether this results from model misspecification. The real data contains noise at different levels, so accounting for noise in the training process is quite necessary. Can anyone suggest methods to improve model performance in the case of additive noise?
Thanks!

Jice


Others may have better and more systematic solutions, but in my experience, introducing more data and reducing the batch size (e.g., batch_size=16) can be helpful. More data increases the chance that the neural network is exposed to more distinct modes of behavior, enabling it to learn the role of the noise parameter. A smaller batch size may let the network escape local minima, but it can increase training time significantly.

Hi Jice,

Are you referring to performance on simulated data or real-world data?
For diagnostics on simulated data, model misspecification should not be an issue since your test data is simulated from the same model that you train your network on. Adding noise makes the inverse task that the network learns harder, so I would try making the network more expressive (= bigger), letting it train longer while monitoring potential overfitting, and tuning the learning rate.
For real-world data, we recently proposed using the variation between the predictions of a neural network ensemble to measure the impact of model misspecification. It would be interesting to see whether adding noise to your model reduces the sensitivity induced by misspecification by improving the reliability of the networks.
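To make the ensemble idea concrete, here is a minimal NumPy sketch of measuring the variation between ensemble members' predictions. The array shapes and the use of posterior means as the summary statistic are my assumptions for illustration, not the exact method from the paper:

```python
import numpy as np

def ensemble_disagreement(posterior_draws):
    """Quantify the variation between the predictions of a network ensemble.

    posterior_draws: array of shape (n_networks, n_samples, n_params),
    i.e., posterior samples from each network in the ensemble for the
    same observed data set.
    Returns the per-parameter standard deviation of the posterior means
    across the ensemble; large values hint at misspecification.
    """
    posterior_means = posterior_draws.mean(axis=1)  # (n_networks, n_params)
    return posterior_means.std(axis=0)              # (n_params,)
```

If the ensemble members agree (small spread), the networks are likely operating in a regime they were trained on; large spread on real data but not on simulated data is a warning sign.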


Hi Elseml,
Thanks for your reply. It is true that adding noise to the observations makes the inference more difficult. My training data is (x, y), where x are parameters and y are observations. The neural network performs well at posterior approximation given a new y*. However, if I add noise to y*, i.e., y* + noise (since real observations always contain noise), the performance degenerates. I guess this is because the training data does not include any noise, which we could call model misspecification, since the y in the training data differs from y* + noise.
I found the paper "Noise-Net: determining physical properties of H II regions reflecting observational uncertainties". It implements noise training, i.e., it accounts for noise during cINN training, based on SoftFlow. I found that the BayesFlow framework has a SoftFlow option for the inference network. But the SoftFlow cINN perturbs x, not the observations (i.e., x + noise), and concatenates y with the noise scale, [y, noise]. Do I understand that right? If so, the SoftFlow part in BayesFlow would need to be modified as below to train a Noise-Net like the paper:

Do not perturb the targets x (multiply the noise by zero; it would otherwise broadcast to all dimensions):

        if len(shape_scale) == 2 and len(target_shape) == 3:
            targets += tf.expand_dims(noise_scale, axis=1) * 0
        else:
            targets += noise_scale * 0

Perturb the condition (the observations) with noise instead of only appending the noise scale:

        condition += noise_scale * tf.random.normal(shape=condition_shape)

Thanks for your discussion with me!

Jice

Hi Jice,

Thanks for kicking off that interesting discussion!

re softflow <> noise:

I don’t think you need softflow in this case. As you said, softflow acts on the parameter domain and not on the data domain. Adding noise during training is independent of softflow.

re noise during workflow:

Where in the data flow are you currently adding the noise? My naïve take would be to add the noise in the configurator.
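For illustration, a minimal configurator sketch that adds noise to the simulated observations before they reach the networks. The forward_dict keys "prior_draws" and "sim_data" follow the BayesFlow 1.x convention (adapt them to your simulator's output), and the noise-scale range is purely an assumption:

```python
import numpy as np

def noisy_configurator(forward_dict, noise_scale_range=(0.0, 0.5), rng=None):
    """Configurator sketch: perturb the observations (not the parameters)
    with additive Gaussian noise of a randomly drawn scale, so the
    network sees many noise levels during training."""
    rng = np.random.default_rng() if rng is None else rng
    y = forward_dict["sim_data"].astype(np.float32)

    # One noise scale per simulated data set, broadcast over the
    # remaining (time/feature) dimensions.
    scales = rng.uniform(*noise_scale_range,
                         size=(y.shape[0],) + (1,) * (y.ndim - 1))
    y_noisy = y + scales * rng.standard_normal(y.shape)

    return {
        "parameters": forward_dict["prior_draws"].astype(np.float32),
        "summary_conditions": y_noisy.astype(np.float32),
    }
```

Because the configurator runs on every batch, each training iteration sees freshly drawn noise, which is exactly the augmentation you want.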

Cheers,
Marvin

Hi Marvin,
Thank you very much for the suggestions. You are right: SoftFlow perturbs the parameter domain by adding noise to x; it does not act on the observations y.
I have tried directly adding noise to the training data (all training data are available beforehand) and then training BayesFlow on those noisy data (as you said, adding the noise in the configurator). But the model performance was bad.
What I want to do is perturb y in a SoftFlow-like way, as the paper does (Noise-Net: determining physical properties of H ii regions reflecting observational uncertainties | Monthly Notices of the Royal Astronomical Society | Oxford Academic). So I modified the SoftFlow code in BayesFlow as follows:

Do not perturb x in the SoftFlow part:

    if len(shape_scale) == 2 and len(target_shape) == 3:
        targets += tf.expand_dims(noise_scale, axis=1) * 0
    else:
        targets += noise_scale * 0

Perturb the condition with noise instead:

    condition += noise_scale * tf.random.normal(shape=condition_shape)

What do you think? Is it feasible?
Thanks!

Jice

Hi Jice,

Thanks for clarifying. Where exactly are you adding the condition += noise_scale * tf.random.normal(shape=condition_shape) part? I don't see why you would need to modify "the SoftFlow part".

Cheers,
Marvin

Yes, you are right. Actually, it has nothing to do with the SoftFlow part. I just sampled an error from a uniform distribution and then added that error to the observations. This process repeats during training. I trained a model like this, and the model now performs very stably under different noise levels.
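For readers landing on this thread later, the procedure described above can be sketched like this. The uniform bounds, the function names, and the user-supplied train_step are illustrative placeholders, not BayesFlow API:

```python
import numpy as np

def train_with_noise(train_step, x, y, iterations=10, low=0.0, high=0.3,
                     seed=1):
    """Sketch: at every iteration, draw an error level from a uniform
    distribution, add it as Gaussian noise to the observations y, and
    pass the perturbed batch to the (user-supplied) train_step."""
    rng = np.random.default_rng(seed)
    for _ in range(iterations):
        # One noise level per data set, broadcast over the other dims.
        sigma = rng.uniform(low, high,
                            size=(y.shape[0],) + (1,) * (y.ndim - 1))
        y_noisy = y + sigma * rng.standard_normal(y.shape)
        train_step(x, y_noisy)  # fresh noise every iteration
```

The key point is that the noise is redrawn each iteration rather than baked into the data set once, so the network amortizes over a whole range of noise levels.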

That’s great to hear!