Semantic representation learning for a mask-modulated lensless camera by contrastive cross-modal transferring

Ya Ti Chang Lee*, Chung Hao Tien

*Corresponding author for this work

Research output: Article › peer-review

3 Citations (Scopus)

Abstract

Lensless computational imaging, a technique that combines optical-modulated measurements with task-specific algorithms, has recently benefited from the application of artificial neural networks. Conventionally, lensless imaging techniques rely on prior knowledge to deal with the ill-posed nature of unstructured measurements, which requires costly supervised approaches. To address this issue, we present a self-supervised learning method that learns semantic representations for the modulated scenes from implicitly provided priors. A contrastive loss function is designed for training the target extractor (measurements) from a source extractor (structured natural scenes) to transfer cross-modal priors in the latent space. The effectiveness of the new extractor was validated by classifying the mask-modulated scenes on unseen datasets, showing accuracy comparable to the source modality (contrastive language-image pre-trained [CLIP] network). The proposed multimodal representation learning method has the advantages of avoiding costly data annotation, being more adaptive to unseen data, and being usable in a variety of downstream vision tasks with unconventional imaging settings.
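The abstract does not specify the exact form of the contrastive objective. The sketch below illustrates one plausible reading, assuming a symmetric InfoNCE-style loss that aligns a trainable target extractor (fed mask-modulated measurements) with a frozen source extractor (CLIP image embeddings of the paired natural scenes). The encoder architecture, temperature, and all names (`TargetExtractor`, `cross_modal_contrastive_loss`) are illustrative assumptions, not the authors' configuration.

```python
# Hypothetical sketch of cross-modal contrastive transfer: a trainable
# target extractor embeds lensless measurements; paired embeddings from a
# frozen CLIP image encoder act as the source modality. An InfoNCE-style
# loss pulls matched pairs together and pushes mismatched pairs apart.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TargetExtractor(nn.Module):
    """Toy CNN mapping mask-modulated measurements into the CLIP embedding space."""
    def __init__(self, embed_dim: int = 512):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 32, 7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )

    def forward(self, x):
        return self.backbone(x)

def cross_modal_contrastive_loss(z_target, z_source, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired target/source embeddings."""
    z_t = F.normalize(z_target, dim=-1)
    z_s = F.normalize(z_source, dim=-1)
    logits = z_t @ z_s.t() / temperature           # (B, B) similarity matrix
    labels = torch.arange(z_t.size(0), device=z_t.device)  # diagonal = positives
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))

# One illustrative training step on a paired batch. `measurements` stands in
# for mask-modulated captures; `clip_embeds` stands in for precomputed CLIP
# embeddings of the corresponding natural scenes (both dummy tensors here).
model = TargetExtractor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
measurements = torch.randn(8, 1, 128, 128)
clip_embeds = torch.randn(8, 512)
loss = cross_modal_contrastive_loss(model(measurements), clip_embeds.detach())
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Because the source extractor is frozen, only the target branch is optimized; the contrastive objective lets the measurement encoder inherit CLIP's semantic structure without any labeled lensless data, which matches the label-free transfer the abstract describes.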

Original language: English
Pages (from-to): C24-C31
Journal: Applied Optics
Volume: 63
Issue number: 8
DOIs
Publication status: Published - 10 Mar 2024
