CALIP introduces a parameter-free attention module that enhances CLIP's zero-shot performance by letting visual and textual features interact through cross-modal attention. Because CLIP's contrastive pre-training has already narrowed the distance between the two modalities, the attention can dispense with learnable parameters, keeping the whole process training-free. CALIP improves on CLIP's zero-shot accuracy across various benchmarks for 2D image and 3D point cloud classification, and a variant with a few added linear layers outperforms existing methods under few-shot settings.
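A minimal sketch of the kind of parameter-free cross-modal attention the summary describes: with no learned projections, the attention map is simply the similarity matrix between the (CLIP-style, L2-normalized) visual and textual features, and each modality is updated by attending over the other. The function name, the temperature `tau`, and the shapes are illustrative assumptions, not CALIP's exact formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def parameter_free_cross_attention(F_v, F_t, tau=1.0):
    """Parameter-free cross-modal attention (illustrative sketch).

    F_v: (N_v, D) visual features; F_t: (N_t, D) textual features,
    assumed L2-normalized. No learnable weights: the attention scores
    are the raw dot-product similarities between the two modalities.
    """
    A = F_v @ F_t.T                             # (N_v, N_t) similarity scores
    F_v_new = softmax(A / tau, axis=1) @ F_t    # visual features attend to text
    F_t_new = softmax(A.T / tau, axis=1) @ F_v  # textual features attend to vision
    return F_v_new, F_t_new

# Toy usage with random features standing in for CLIP embeddings.
rng = np.random.default_rng(0)
F_v = rng.standard_normal((5, 8))
F_t = rng.standard_normal((3, 8))
F_v /= np.linalg.norm(F_v, axis=1, keepdims=True)
F_t /= np.linalg.norm(F_t, axis=1, keepdims=True)
F_v_new, F_t_new = parameter_free_cross_attention(F_v, F_t)
```

In practice the attended features would be blended with the originals before computing the zero-shot logits, so the module refines rather than replaces CLIP's representations.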