Content creation for VR

VR Content Creation from Everyday Images and Videos

We are exploring novel ways that allow users to casually capture 3D representations and content for replay in Virtual and Augmented Reality environments.
This work was supported by the New Zealand Marsden Fund through Grant UOO1724 (Interactive 3D computational videography).


Research Topics

Thanks to the ubiquity of devices capable of recording and playing back video, the number of video recordings is growing at a rapid rate. Most of us now have video recordings of the major events in our lives. To date, however, these videos are captured mainly in 2D and are largely limited to screen-based replay. In this research, we investigate methods that turn such everyday footage into 3D content that can be replayed immersively in Virtual and Augmented Reality.



PanoSynthVR

In this project, we investigate how real-time, 360-degree view synthesis can be achieved on current virtual reality hardware from a single panoramic image. We introduce a light-weight method that automatically converts a single panoramic input into a multi-cylinder image representation supporting real-time, free-viewpoint rendering for virtual reality.


We apply an existing convolutional neural network trained on pinhole images to a cylindrical panorama with wrap padding to ensure agreement between the left and right edges. The network outputs a stack of semi-transparent panoramas at varying depths which can be easily rendered and composited with over blending. Quantitative experiments and a user study show that the method produces convincing parallax and fewer artifacts than a textured mesh representation.
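
To make the compositing step concrete, here is a minimal Python/NumPy sketch of the "over" operator applied to a far-to-near stack of semi-transparent layers, together with the horizontal wrap padding that keeps the left and right edges of a cylindrical panorama consistent. The function names and array conventions are illustrative assumptions, not the released PanoSynthVR code.

import numpy as np

# Illustrative sketch only: "over" compositing of semi-transparent layers and
# wrap padding for a cylindrical panorama. Names and shapes are assumptions.

def composite_over(layers_rgba):
    """layers_rgba: list of (H, W, 4) float arrays ordered far -> near,
    RGB and alpha in [0, 1]. Each nearer layer is blended over the result."""
    out_rgb = np.zeros_like(layers_rgba[0][..., :3])
    out_alpha = np.zeros_like(layers_rgba[0][..., 3])
    for layer in layers_rgba:
        rgb, alpha = layer[..., :3], layer[..., 3]
        out_rgb = rgb * alpha[..., None] + out_rgb * (1.0 - alpha[..., None])
        out_alpha = alpha + out_alpha * (1.0 - alpha)
    return out_rgb, out_alpha

def wrap_pad(panorama, pad):
    """Pad a panorama horizontally with pixels copied from the opposite edge,
    so a convolutional network sees consistent content across the seam."""
    return np.concatenate([panorama[:, -pad:], panorama, panorama[:, :pad]], axis=1)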

John Conrad Waidhofer, Richa Gadgil, Anthony Dickson, Stefanie Zollmann, Jonathan Ventura
PanoSynthVR: Toward Light-weight 360-Degree View Synthesis from a Single Panoramic Input
IEEE International Symposium on Mixed and Augmented Reality (IEEE ISMAR), 2022
Richa Gadgil, Reesa John, Stefanie Zollmann, Jonathan Ventura
PanoSynthVR: View Synthesis From A Single Input Panorama with Multi-Cylinder Images
ACM SIGGRAPH 2021 Posters (SIGGRAPH '21), Association for Computing Machinery, Article 24, 1–2. https://doi.org/10.1145/3450618.3469144

VRVideos

Recent advances in Neural Radiance Field (NeRF)-based methods have enabled high-fidelity novel view synthesis for video with dynamic elements. However, these methods often require expensive hardware, can take days to process a second-long video, and do not scale well to longer videos.


We create an end-to-end pipeline for creating dynamic 3D video from a monocular video that can be run on consumer hardware in minutes per second of footage, not days. Our pipeline handles the estimation of the camera parameters, depth maps, 3D reconstruction of dynamic foreground and static background elements, and the rendering of the 3D video on a computer or VR headset. We use a state-of-the-art vision transformer model to estimate depth maps, which we use to scale the COLMAP poses and enable RGB-D fusion with the estimated depth data. In our preliminary experiments, we rendered the output in a VR headset and visually compared the method against ground-truth datasets and state-of-the-art NeRF-based methods.
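
As an illustration of the pose-scaling step, the sketch below shows one common way to align COLMAP's arbitrary-scale reconstruction with the depth maps predicted by a monocular network: estimate a robust scale from the depths of sparse COLMAP points and apply it to the camera translations. The median-ratio estimator and the function names are assumptions for illustration, not the exact implementation used in the pipeline.

import numpy as np

# Illustrative sketch: align COLMAP's arbitrary scale with predicted depth so
# that poses and depth maps can be fused consistently. Names are assumptions.

def estimate_scale(colmap_depths, predicted_depths):
    """Depths of COLMAP feature points in one frame vs. the network's depth
    sampled at the same pixels. Returns a robust median-ratio scale factor."""
    ratios = predicted_depths / np.maximum(colmap_depths, 1e-8)
    return float(np.median(ratios))

def rescale_poses(cam_to_world, scale):
    """Apply the scale to the translation of each 4x4 camera-to-world pose."""
    rescaled = []
    for T in cam_to_world:
        T = T.copy()
        T[:3, 3] *= scale
        rescaled.append(T)
    return rescaled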

Anthony Dickson, Jeremy Shanks, Jonathan Ventura, Ali Knott, Stefanie Zollmann,
VRVideos: A flexible pipeline for Virtual Reality Video Creation
IEEE International Conference on Artificial Intelligence & Virtual Reality (AIVR), 2022

CasualVRVideos

Traditional videos are not optimized for playback on stereoscopic displays or tracked Virtual Reality devices. In this work, we present CasualVRVideos, a first approach that works towards solving these issues by extracting spatial information from video footage recorded in 2D, so that it can later be played back on VR displays to increase immersion.


We focus in particular on the challenging scenario in which the camera itself is not moving.
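
As a rough illustration of what extracting spatial information can look like, the sketch below lifts a single 2D frame into a 3D point cloud using a per-pixel depth map and pinhole camera intrinsics. This is a generic depth-based unprojection assumed for illustration; it is not the exact CasualVRVideos pipeline.

import numpy as np

# Illustrative sketch: unproject an RGB frame with an estimated depth map into
# a camera-space point cloud for stereoscopic replay. Names are assumptions.

def unproject_frame(rgb, depth, fx, fy, cx, cy):
    """rgb: (H, W, 3) image, depth: (H, W) per-pixel depth.
    Returns (N, 3) camera-space points and (N, 3) per-point colours."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    colours = rgb.reshape(-1, 3)
    return points, colours
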
Stefanie Zollmann, Anthony Dickson, Jonathan Ventura
CasualVRVideos: VR videos from casual stationary videos
VRST '20: 26th ACM Symposium on Virtual Reality Software and Technology, 2020


Casual Stereo Panoramas

Hand-held capture of stereo panoramas involves spinning the camera in a roughly circular path to acquire a dense set of views of the scene. However, most existing structure-from-motion pipelines fail when trying to reconstruct such trajectories, due to the small baseline between frames.


In this work, we evaluate the use of spherical structure-from-motion for reconstructing handheld stereo panorama captures. The spherical motion constraint introduces a strong regularization on the structure-from-motion process, which mitigates the small-baseline problem and makes it well-suited to the use case of stereo panorama capture with a handheld camera.
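
To illustrate the constraint, the minimal sketch below shows how, under spherical motion with an outward-facing camera, a pose is determined by its rotation alone: the translation is held fixed in camera coordinates, so the camera centre always stays on a sphere around the origin. The radius and axis conventions are illustrative assumptions rather than the paper's exact parameterization.

import numpy as np

# Illustrative sketch of the spherical-motion constraint. Conventions assumed.

def spherical_pose(R, radius=1.0):
    """Build a 4x4 world-to-camera pose [R | t] for an outward-facing camera
    moving on a sphere of the given radius around the origin. With t fixed to
    [0, 0, -radius] in camera coordinates, the camera centre
    c = -R.T @ t = R.T @ [0, 0, radius] lies on the sphere for any rotation R,
    so only the rotation remains as a free parameter per view."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = np.array([0.0, 0.0, -radius])
    return T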

We demonstrate the effectiveness of spherical structure-from-motion for casual capture of high-resolution stereo panoramas and validate our results with a user study.

Lewis Baker, Steven Mills, Stefanie Zollmann, Jonathan Ventura
CasualStereo: Casual Capture of Stereo Panoramas with Spherical Structure-from-Motion
IEEE Conference on Virtual Reality and 3D User Interfaces (IEEE VR), 2020, Honourable Mention Award.

VRVideo Editing

Editing VR videos using standard video editing tools is challenging, especially for non-expert users. There is a large gap between the 2D interface used for traditional video editing and the immersive VR environment used for replay.


Within this project, we look into methods for immersive video editing that bridge this gap. In particular, we propose 6DIVE, a 6 degrees-of-freedom (DoF) immersive video editor that allows users to edit 6DoF videos directly in an immersive VR environment.

Ruairi Griffin, Tobias Langlotz, Stefanie Zollmann
6DIVE: 6 Degrees-of-Freedom Immersive Video Editor
Frontiers in Virtual Reality 2, 75

Applications

Experience VR videos in a VR headset or in your browser

RGBD Video VR Player

Render your own RGBD videos in Virtual Reality (only Google Chrome is supported at the moment). Click on Single or 360 to load your video.

Check out the embedded content on the left or visit the external website.

Team Members

Our team consists of Computer Science researchers from several institutions.

Stefanie Zollmann

AR and Visualization expert. School of Computing, University of Otago.

Jonathan Ventura

Computer vision, augmented/virtual reality, and machine learning expert. Department of Computer Science, Cal Poly.

Lewis Baker

Postgraduate researcher with a focus on Computer Vision, tracking, and localisation. Graduated with a PhD in 2021 (thesis).

Wei Hong Lo

PhD student with a focus on Visualisation and User Experience. Graduated with a PhD in 2022 (thesis).

Anthony Dickson

PhD student working on 3D reconstruction from single-view videos.

Ali Knott

Professor in Artificial Intelligence at Victoria University of Wellington. Co-supervisor of Anthony Dickson.