Abstract
Recent vision foundation models, e.g. the Segment Anything Model (SAM), have shown great potential in various downstream 2D tasks. However, their adaptability to 3D vision remains largely unexplored. In this paper, we propose a novel generative framework, namely Eyeing3D, that integrates generative vision models of multiple purposes (including SAM and Neural Radiance Fields) to emulate humans' uncanny capability to perceive and interpret the 3D structure of a visual object, even when it is depicted in only a single 2D image. In particular, a user can select any visual object of interest in the input 2D image with a simple click or bounding box, triggering the reconstruction of its 3D model, whose visual style and viewing angle can then be freely manipulated. Experiments demonstrate the effectiveness of the proposed Eyeing3D, showing improved performance on image-based 3D reconstruction tasks.
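The abstract does not include code, but the interactive selection stage it describes maps naturally onto SAM's public prompting API. Below is a minimal sketch of that first stage, assuming the official `segment_anything` package; the click coordinates, checkpoint filename, and the downstream `reconstruct_nerf` call are illustrative placeholders, not part of the authors' method.

```python
# Sketch of the click/box object-selection stage, using the official
# `segment_anything` package. Checkpoint path, click coordinates, and the
# NeRF step are illustrative assumptions, not the authors' released code.
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

# Load SAM (ViT-H backbone) and prepare the predictor.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

# Read the single input image (BGR -> RGB; SAM expects RGB uint8).
image = cv2.cvtColor(cv2.imread("input.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# Prompt option 1: a single foreground click on the object of interest.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[420, 310]]),  # (x, y) pixel of the user's click
    point_labels=np.array([1]),           # 1 = foreground point
    multimask_output=True,
)
mask = masks[np.argmax(scores)]  # keep the highest-scoring candidate mask

# Prompt option 2: a bounding box drawn around the object.
# masks, scores, _ = predictor.predict(box=np.array([x0, y0, x1, y1]))

# Isolate the selected object as an RGBA cutout; this masked image would
# then feed the single-image NeRF reconstruction stage (hypothetical call).
object_rgba = np.dstack([image, (mask * 255).astype(np.uint8)])
# model3d = reconstruct_nerf(object_rgba)  # hypothetical downstream stage
```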
Original language | English |
---|---|
Pages (from-to) | 120-121 |
Number of pages | 2 |
Journal | IET Conference Proceedings |
Volume | 2023 |
Issue number | 35 |
DOIs | |
Publication status | Published - 2023 |
Event | 2023 IET International Conference on Engineering Technologies and Applications, ICETA 2023 - Yunlin, Taiwan; Duration: 21 Oct 2023 → 23 Oct 2023 |