A Lightweight 3D Convolutional Autoencoder Architecture for Temporal Coherence in 3D Video
Abstract
We propose a novel lightweight 3D convolutional autoencoder architecture designed to efficiently encode and decode spatiotemporal information from 3D video data while preserving temporal coherence between frames. We present a theoretical analysis of the stability of temporal feature representations and validate the approach on a synthetic 3D video dataset of moving volumetric shapes. Experimental results demonstrate that our method reconstructs 3D videos with high fidelity and smooth temporal transitions, highlighting its potential for real-world 3D video processing applications.
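The abstract summarizes the architecture without giving its layer configuration, so the following is a minimal PyTorch sketch of one plausible realization, not the authors' implementation: the channel widths, kernel sizes, per-frame weight sharing across time, and the latent smoothness penalty are all illustrative assumptions.

import torch
import torch.nn as nn

class Lightweight3DAutoencoder(nn.Module):
    """Hypothetical per-frame volumetric autoencoder with weights shared across time."""

    def __init__(self, in_channels: int = 1, base: int = 8):
        super().__init__()
        # Encoder: two strided 3D convolutions halve each spatial dimension twice,
        # compressing every volumetric frame into a compact latent code.
        self.encoder = nn.Sequential(
            nn.Conv3d(in_channels, base, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(base, base * 2, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        # Decoder mirrors the encoder with transposed convolutions.
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(base * 2, base, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose3d(base, in_channels, kernel_size=4, stride=2, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, video: torch.Tensor):
        # video: (batch, time, channels, depth, height, width)
        b, t = video.shape[:2]
        frames = video.flatten(0, 1)  # fold the time axis into the batch axis
        codes = self.encoder(frames)
        recon = self.decoder(codes)
        # restore the time axis on both the reconstruction and the latent codes
        return recon.unflatten(0, (b, t)), codes.unflatten(0, (b, t))

def temporal_coherence_loss(codes: torch.Tensor) -> torch.Tensor:
    # Penalize frame-to-frame jumps in latent space; one simple way to
    # encourage the smooth temporal transitions the abstract describes.
    return (codes[:, 1:] - codes[:, :-1]).pow(2).mean()

model = Lightweight3DAutoencoder()
clip = torch.rand(2, 8, 1, 16, 16, 16)  # two clips of eight 16^3 volumetric frames
recon, codes = model(clip)
loss = nn.functional.mse_loss(recon, clip) + 0.1 * temporal_coherence_loss(codes)

Sharing the encoder across frames keeps the model lightweight, while the coherence term couples neighboring latents; a joint 4D convolution over space and time would be an alternative, but PyTorch has no native Conv4d, so the per-frame factorization is the simpler assumption here.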
Article Details

This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/