Announcement_7

Now accepted to NeurIPS 2025: Multi-modal contrastive learning adapts to intrinsic dimensions. We present a theoretical analysis of CLIP showing how optimizing the temperature enables adaptation to the intrinsic dimension of the features shared across modalities.
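For readers unfamiliar with the role of the temperature, here is a minimal sketch of the standard symmetric CLIP-style contrastive loss, in which the temperature scales the similarity logits. It is a generic illustration and not the analysis from the paper; the function names `log_softmax` and `clip_loss` and the toy inputs are assumptions made for this example.

```python
import numpy as np

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def clip_loss(image_emb, text_emb, temperature):
    # Hypothetical helper: symmetric InfoNCE loss over a batch of paired embeddings.
    # Row i of image_emb and row i of text_emb are assumed to form a matched pair.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    # Cosine similarities divided by the temperature.
    logits = image_emb @ text_emb.T / temperature
    idx = np.arange(logits.shape[0])
    # Image-to-text direction: softmax over each row.
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Text-to-image direction: softmax over each column.
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_i2t + loss_t2i)

# Toy usage: four perfectly aligned, mutually orthogonal pairs.
emb = np.eye(4)
for tau in (1.0, 0.1):
    print(tau, clip_loss(emb, emb, tau))
```

On this toy input, a smaller temperature sharpens the softmax and lowers the loss. In CLIP the temperature is typically treated as a learnable parameter and optimized together with the encoders.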

A more recent work proposes IndiSeek, which learns modality-specific features that are independent of the shared features.