Can BayesFlow diffusion models infer high-dimensional sparse binary parameters?

Hi everyone,

I am applying BayesFlow to a posterior inference problem of the form p(\theta|y), where y is the observed data and \theta is the target parameter.

In my case, \theta is high-dimensional and binary. For example, each \theta can be viewed as a 25 \times 25 binary image, where most cells are 0 and only a few cells are 1. I flatten this image into a 625-dimensional vector and train a conditional diffusion model to infer \theta from y.
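For concreteness, here is a minimal sketch of how I construct such a parameter (the grid size is as described; the number of active cells is just illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: a 25x25 binary grid where only a few cells are 1.
grid_size = 25
n_active = 5  # hypothetical number of active cells

theta_img = np.zeros((grid_size, grid_size), dtype=np.int64)
active = rng.choice(grid_size * grid_size, size=n_active, replace=False)
theta_img.ravel()[active] = 1

# Flattened 625-dimensional vector fed to the conditional diffusion model
theta_flat = theta_img.reshape(-1)
print(theta_flat.shape, theta_flat.sum())  # (625,) 5
```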

However, the results are not good. The model tends to predict almost all entries as 0 and fails to recover the sparse 1-valued cells. I suspect this may be related to the severe class imbalance in \theta, since most entries are 0.

I am wondering:

  1. Can the current BayesFlow diffusion model handle high-dimensional sparse binary parameters?
  2. Are there recommended ways to model binary/discrete \theta in this setting?
  3. Are there examples of using BayesFlow for image-like binary posterior inference problems?

Any suggestions would be greatly appreciated. Thank you!

Hi Jice, diffusion models (and other free-form models such as consistency models or flow matching, for that matter) are not suitable for discrete problems out of the box. For one, you should not flatten data with spatial structure; use a suitable subnet (e.g., a U-Net) instead. See our tutorial on image generation.

Second, to properly model binary targets, you should either find a way to dequantize the problem or model the underlying continuous logits or probabilities giving rise to the binary values.
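As a generic sketch of the first option (not BayesFlow-specific), uniform dequantization adds U(0, 1) noise to the 0/1 values during training, so a continuous model has a proper density to learn, and flooring recovers the binary values at inference:

```python
import numpy as np

rng = np.random.default_rng(1)

# Sparse binary parameter (illustrative sparsity level)
theta = rng.binomial(1, 0.02, size=625)

# Uniform dequantization: map 0 -> [0, 1) and 1 -> [1, 2),
# giving a continuous target a diffusion model can handle.
theta_cont = theta + rng.uniform(0.0, 1.0, size=theta.shape)

# At inference, flooring inverts the dequantization exactly.
theta_rec = np.floor(theta_cont).astype(int)
assert np.array_equal(theta_rec, theta)
```

The logit/probability route works analogously: let the generative model produce a continuous field and apply a sigmoid plus thresholding (or Bernoulli sampling) as a final deterministic step.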

Let me know how that goes!

Thanks for the suggestions! I will first try subnet architectures that are more suitable for image-like inference tasks, such as a U-Net.

My understanding from your comment is that diffusion models may not be the best choice for binary targets. Do you have any advice or experience regarding which generative models may be more appropriate for this type of task, especially for high-dimensional sparse binary parameters?
Many thanks for your help!

I would assume something along the lines of Bernoulli diffusion, e.g., BerDiff: Conditional Bernoulli Diffusion Model for Medical Image Segmentation (arXiv:2304.04429; code: github.com/takimailto/BerDiff) or Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (arXiv:2310.16834), but we don't have a Keras3 implementation.
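For intuition, the Bernoulli forward process corrupts each bit toward a fair coin flip; here is a minimal sketch of one noising step (my simplified notation, not the paper's implementation — the reverse, denoising model is what BerDiff actually learns):

```python
import numpy as np

rng = np.random.default_rng(2)

def bernoulli_noise_step(x, beta):
    """One forward step of a Bernoulli diffusion:
    keep each bit with probability (1 - beta), otherwise
    resample it as a fair coin. Equivalently,
    x_t ~ Bernoulli((1 - beta) * x_{t-1} + beta / 2)."""
    p = (1.0 - beta) * x + beta / 2.0
    return rng.binomial(1, p)

x0 = rng.binomial(1, 0.02, size=625)  # sparse binary image, flattened
x1 = bernoulli_noise_step(x0, beta=0.1)
print(x1.shape)
```

Repeating this step drives the distribution toward independent Bernoulli(0.5) noise, which keeps the state space binary throughout, unlike Gaussian diffusion.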

I appreciate the recommended references. I will take a look. Thanks!

Hi, as you suggested, I switched to a U-Net architecture for image-like data generation. It does improve the results compared with using flattened data, but the performance is still not good enough. This may still be due to the discrete nature of the data, which traditional continuous diffusion models may not handle efficiently. I will next try a discrete variant of the diffusion model for this task.
Thanks for your advice again!