Casually Creating Content for VR and Immersive Experiences
VR Content Creation from Everyday Images and Videos
We are exploring novel ways that allow users to casually capture 3D representations and content for replay in Virtual and Augmented Reality environments.
This work was supported by the New Zealand Marsden Fund through Grant UOO1724 (Interactive 3D computational videography).
Research Topics
Thanks to the ubiquity of devices capable of recording and playing back video, the number of video recordings is growing at a rapid rate. Most of us now have video recordings of the major events in our lives. However, to date these videos are captured mainly in 2D and are intended primarily for screen-based replay, which makes them poorly suited to immersive viewing in VR or AR. In this research, we investigate methods that work towards solving this problem.
PanoSynthVR
In this project, we investigate how real-time, 360-degree view synthesis can be achieved on current virtual reality hardware from a single panoramic image input. We introduce a light-weight method to automatically convert a single panoramic input into a multi-cylinder image representation that supports real-time, free-viewpoint rendering for virtual reality.
We apply an existing convolutional neural network trained on pinhole images to a cylindrical panorama with wrap padding to ensure agreement between the left and right edges. The network outputs a stack of semi-transparent panoramas at varying depths which can be easily rendered and composited with over blending. Quantitative experiments and a user study show that the method produces convincing parallax and fewer artifacts than a textured mesh representation.
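To make the representation concrete, here is a minimal sketch in Python/NumPy of the two operations described above: horizontal wrap padding of the cylindrical panorama before it is fed to the network, and back-to-front "over" compositing of the resulting semi-transparent layers. The function names, array shapes, and layer ordering are illustrative assumptions, not the exact implementation.

```python
import numpy as np

def wrap_pad(panorama, pad):
    """Horizontally wrap-pad a cylindrical panorama so a CNN trained on
    pinhole images sees consistent content across the left/right seam."""
    left = panorama[:, -pad:]
    right = panorama[:, :pad]
    return np.concatenate([left, panorama, right], axis=1)

def composite_multi_cylinder(layers_rgba):
    """Composite a stack of semi-transparent cylindrical layers with
    'over' blending, ordered from farthest to nearest.

    layers_rgba: list of (H, W, 4) float arrays in [0, 1], where the last
    channel is alpha. Shapes and ordering are illustrative assumptions.
    """
    h, w, _ = layers_rgba[0].shape
    out = np.zeros((h, w, 3), dtype=np.float32)
    for layer in layers_rgba:  # far -> near
        rgb, alpha = layer[..., :3], layer[..., 3:4]
        out = alpha * rgb + (1.0 - alpha) * out
    return out
```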
John Conrad Waidhofer, Richa Gadgil, Anthony Dickson, Stefanie Zollmann, Jonathan Ventura
PanoSynthVR: Toward Light-weight 360-Degree View Synthesis from a Single Panoramic Input
IEEE International Symposium on Mixed and Augmented Reality (IEEE ISMAR), 2022
Richa Gadgil, Reesa John, Stefanie Zollmann, Jonathan Ventura
PanoSynthVR: View Synthesis From A Single Input Panorama with Multi-Cylinder Images
ACM SIGGRAPH 2021 Posters (SIGGRAPH '21), Association for Computing Machinery, New York, NY, USA, Article 24, 1–2, 2021. https://doi.org/10.1145/3450618.3469144
VRVideos
Recent advances in Neural Radiance Field (NeRF)-based methods have enabled high-fidelity novel view synthesis for video with dynamic elements. However, these methods often require expensive hardware, can take days to process a single second of video, and do not scale well to longer videos.
We create an end-to-end pipeline for creating dynamic 3D video from a monocular video that can be run on consumer hardware in minutes per second of footage, not days. Our pipeline handles the estimation of the camera parameters, depth maps, 3D reconstruction of dynamic foreground and static background elements, and the rendering of the 3D video on a computer or VR headset. We use a state-of-the-art visual transformer model to estimate depth maps, which we use to scale the COLMAP poses and enable RGB-D fusion with the estimated depth data. In our preliminary experiments, we rendered the output in a VR headset and visually compared the method against ground-truth datasets and state-of-the-art NeRF-based methods.
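One step of this pipeline, aligning the scale of the COLMAP reconstruction with the estimated depth maps, can be sketched as follows. The helper below, including its names and the use of a median depth ratio, is an illustrative assumption rather than the pipeline's actual code.

```python
import numpy as np

def estimate_scale(sparse_points_cam, predicted_depth, intrinsics):
    """Estimate a single scale factor that aligns COLMAP's up-to-scale
    depths with a dense predicted depth map for one frame.

    sparse_points_cam: (N, 3) triangulated points in camera coordinates.
    predicted_depth:   (H, W) depth map from the transformer model.
    intrinsics:        (3, 3) camera matrix.
    All names and shapes here are illustrative assumptions.
    """
    valid = sparse_points_cam[:, 2] > 1e-6
    pts = sparse_points_cam[valid]
    # Project the sparse points into the image plane.
    uvw = (intrinsics @ pts.T).T
    uv = uvw[:, :2] / uvw[:, 2:3]
    h, w = predicted_depth.shape
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, w - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
    # Robust ratio between predicted depth and SfM depth.
    ratios = predicted_depth[v, u] / pts[:, 2]
    return float(np.median(ratios))
```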
Anthony Dickson, Jeremy Shanks, Jonathan Ventura, Ali Knott, Stefanie Zollmann,
VRVideos: A flexible pipeline for Virtual Reality Video Creation
IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR), 2022
CasualVRVideos
Traditional videos are not optimized for playback on stereoscopic displays or tracked Virtual Reality devices. In this work, we present CasualVRVideos, a first approach that works towards solving these issues by extracting spatial information from video footage recorded in 2D, so that it can later be played back on VR displays to increase immersion.
We focus in particular on the challenging scenario in which the camera itself is not moving.
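As an illustration of how estimated depth can turn a 2D frame into a stereoscopic pair, the sketch below applies simple depth-image-based rendering (forward pixel shifting by disparity). This is a generic technique and not necessarily the exact method used in CasualVRVideos; the baseline and focal-length values are placeholder assumptions.

```python
import numpy as np

def render_stereo_pair(rgb, depth, baseline=0.064, focal=500.0):
    """Very simplified depth-image-based rendering: shift pixels
    horizontally by a disparity derived from estimated depth to obtain
    left/right views for stereoscopic playback.

    rgb:   (H, W, 3) frame from the original 2D video.
    depth: (H, W) estimated depth in arbitrary but consistent units.
    baseline and focal are illustrative defaults, not values from the paper.
    """
    h, w, _ = rgb.shape
    disparity = (baseline * focal) / np.maximum(depth, 1e-6)
    xs = np.arange(w)[None, :].repeat(h, axis=0)
    ys = np.arange(h)[:, None].repeat(w, axis=1)
    left = np.zeros_like(rgb)
    right = np.zeros_like(rgb)
    x_left = np.clip(xs + (disparity / 2).astype(int), 0, w - 1)
    x_right = np.clip(xs - (disparity / 2).astype(int), 0, w - 1)
    left[ys, x_left] = rgb    # forward splatting; holes would need
    right[ys, x_right] = rgb  # inpainting in practice
    return left, right
```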
Stefanie Zollmann, Anthony Dickson, Jonathan Ventura
CasualVRVideos: VR videos from casual stationary videos
VRST '20: 26th ACM Symposium on Virtual Reality Software and Technology, 2020
Casual Stereo Panoramas
Hand-held capture of stereo panoramas involves spinning the camera in a roughly circular path to acquire a dense set of views of the scene. However, most existing structure-from-motion pipelines fail when trying to reconstruct such trajectories, due to the small baseline between frames.
In this work, we evaluate the use of spherical structure-from-motion for reconstructing handheld stereo panorama captures. The spherical motion constraint introduces a strong regularization on the structure-from-motion process which mitigates the small-baseline problem, making it well-suited to the use case of stereo panorama capture with a handheld camera.
We demonstrate the effectiveness of spherical structure-from-motion for casual capture of high-resolution stereo panoramas and validate our results with a user study.
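The key idea of the spherical motion constraint is that each camera center lies on a sphere, so the relative pose between two views is determined by their rotations alone. The sketch below computes an essential matrix under this constraint for outward-facing cameras; the parameterization is an assumption for illustration and may differ from the one used in the paper.

```python
import numpy as np

def skew(v):
    """Cross-product (skew-symmetric) matrix of a 3-vector."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def spherical_essential_matrix(R_i, R_j):
    """Essential matrix between two outward-facing cameras whose centers
    are constrained to the unit sphere, so each pose is fully determined
    by its rotation (an assumed parameterization for illustration)."""
    t = np.array([0.0, 0.0, -1.0])  # fixed translation under the constraint
    R_rel = R_j @ R_i.T             # relative rotation between the two views
    t_rel = t - R_rel @ t           # relative translation induced by rotation
    return skew(t_rel) @ R_rel
```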
6DIVE
Editing VR videos with standard video editing tools is challenging, especially for non-expert users. There is a large gap between the 2D interface used for traditional video editing and the immersive VR environment used for replay.
Within this project, we investigate methods for immersive video editing that bridge this gap. In particular, we propose 6DIVE, a 6 degrees-of-freedom (DoF) immersive video editor that allows users to edit 6DoF videos directly in an immersive VR environment.