
SwinScale-LFVS: Parallel Feature Integration for Light Field View Synthesis

Zubair, M. Z. ; Nunes, P. ; Conti, C. ; Soares, L. D.

SwinScale-LFVS: Parallel Feature Integration for Light Field View Synthesis, Proc. IEEE International Conference on Image Processing (ICIP), Anchorage, United States, pp. 1942-1947, September 2025.

Digital Object Identifier: 10.1109/ICIP55913.2025.11084705


Abstract
Light Field (LF) view synthesis aims to synthesize a dense set of LF views from a sparse set of input views. Although many recent learning-based methods have shown promising results in this task, they often rely on deep residual networks or on multiple LF representations to extract dense features, without fully exploiting the geometric structure of the LFs. In this paper, we introduce SwinScale-LFVS, a novel framework that combines the strengths of the Swin Transformer and the Multi-Scale Convolutional Network in parallel streams. The first stream uses a Swin Transformer to model local and global features using a geometry-aware Angular Mutual Self Attention (AMSA) network, and the second stream uses multi-scale 3D convolutions to extract dense features and to ensure spatial-angular consistency in synthesized LF views. The outputs from these streams are integrated and processed by an LF View Synthesis (LFVS) network to synthesize high-quality dense LF views. Extensive experiments show that SwinScale-LFVS outperforms existing methods on both real-world and synthetic datasets. The code is publicly available at https://github.com/MSP-IUL/SwinScale-LFVS.
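The parallel-stream idea described in the abstract can be illustrated with a toy NumPy sketch. This is not the authors' implementation (see the repository linked above for that). It rests on several assumptions: plain self-attention across views stands in for the AMSA network, fixed 3D mean filters at two kernel sizes stand in for learned multi-scale 3D convolutions, the two streams are integrated by simple channel stacking, and the LFVS synthesis network is omitted. All function names (`angular_self_attention`, `box_conv3d`, `parallel_features`) and the input shape are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def angular_self_attention(views):
    """Plain self-attention across views (a stand-in, not the paper's AMSA)."""
    a, h, w = views.shape
    tokens = views.reshape(a, h * w)               # one token per view
    scores = tokens @ tokens.T / np.sqrt(h * w)    # (A, A) view-to-view similarity
    scores -= scores.max(axis=1, keepdims=True)    # subtract row max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return (weights @ tokens).reshape(a, h, w)


def box_conv3d(volume, k):
    """3D mean filter of odd size k with edge padding; output shape equals input shape."""
    pad = k // 2
    padded = np.pad(volume, pad, mode="edge")
    windows = sliding_window_view(padded, (k, k, k))  # (A, H, W, k, k, k)
    return windows.mean(axis=(-3, -2, -1))


def parallel_features(views, scales=(3, 5)):
    """Run both streams on the same input and stack their outputs as channels."""
    attn = angular_self_attention(views)                # stream 1: attention
    multi = [box_conv3d(views, k) for k in scales]      # stream 2: multi-scale 3D filters
    return np.stack([attn, *multi], axis=0)             # (1 + len(scales), A, H, W)


rng = np.random.default_rng(0)
sparse_views = rng.random((4, 8, 8))   # assumed toy input: 4 views of 8x8 pixels
features = parallel_features(sparse_views)
print(features.shape)  # (3, 4, 8, 8)
```

In a learned model, the stacked features would be passed to a synthesis network that produces the dense set of views; here the sketch stops at the integration step.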