Researchers have introduced MARCO, a new model designed to improve semantic correspondence by addressing the generalization limitations of existing dual-encoder architectures. MARCO uses a training framework that combines a coarse-to-fine objective for spatial precision with a self-distillation approach that extends supervision beyond annotated areas. The resulting model is smaller and faster than diffusion-based alternatives and achieves state-of-the-art performance on several benchmarks, particularly in fine-grained localization and generalization to unseen data.
Summary written by gemini-2.5-flash-lite from 1 source.