With the emergence of consumer-grade devices providing 3D geometry information, such as Microsoft Kinect sensors, Lytro light field cameras, Leap Motion sensors and new camera phones, the way visual data is captured and transmitted is changing. Most often, these new devices provide additional geometry information about the visual scene besides the usual 2D texture information, through the clever use of novel optics, sensors (e.g. infrared), illumination (e.g. reflected light) and signal processing techniques. Thus, efficient techniques to represent this geometry information are urgently needed, and 3D geometry coding techniques have received renewed interest. While in the past 3D geometric data originated mostly from computer graphics (e.g. mesh-based representations), it is now more common to have geometry information acquired in real time by new 3D-geometry-enabled devices. Currently, a significant amount of research on depth map and mesh data coding is under way; however, other representations, such as point-sampled geometry, can bring significant advantages for representing 3D scene geometry and enable a whole new set of applications, such as free-viewpoint video, advanced content manipulation (e.g. refocusing or relighting), natural user interaction (e.g. controlling devices by gestures) and gaming. Although there has been some work in the computer graphics community targeted at efficient point-sampled geometry coding, many of the techniques typically used by the image and video coding community still have the potential to greatly increase its coding efficiency, thus providing a better quality of experience in several applications.
Nowadays, unified fully 3D geometry representations of visual content can be classified into two main categories: polygon meshes and point-sampled geometry. A 3D polygon mesh represents real-world surfaces with a piecewise linear approximation and contains vertices as well as their connections; typically, static triangular meshes are considered. Several contributions to static 3D mesh coding have been proposed in the past, which is reflected in several publications and an already available MPEG coding standard, notably MPEG-4 FAMC (Frame-based Animated Mesh Compression). However, such a representation requires time-consuming and complex signal processing algorithms, which must consider the constraints defined by the connectivity information. Recently, point-sampled geometry (or the 3D point cloud) has received much attention, since it is considered an attractive alternative to polygon-mesh-based representations with several advantages: i) no connectivity information needs to be coded or processed, which allows bitrate savings, especially for complex and highly detailed 3D models; and ii) the computational overhead of surface reconstruction (such as triangulation) is avoided, leading to a simpler and more natural way to render and manipulate objects with complex topologies. A 3D point cloud represents real-world surfaces as a set of 3D point locations and may provide interesting additional benefits, such as a viewpoint-independent representation, more compact storage, improved fidelity, progressive transmission, and enhanced image analysis (e.g. object detection and recognition). This project targets the development of a novel simplified point cloud representation model and efficient methods to code this type of promising 3D representation data.
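The storage contrast between the two representations can be illustrated with a minimal sketch. This is purely illustrative and not the project's code: the data layout (32-bit floats per coordinate, 32-bit triangle indices) and all names are assumptions. It simply shows that a mesh carries connectivity on top of vertex positions, while a point cloud carries positions only.

```python
# Illustrative comparison (assumed layout, not the project's actual format):
# a polygon mesh stores vertex positions PLUS connectivity (triangle indices),
# while a point cloud stores only point positions.

# A tetrahedron as a triangular mesh: 4 vertices, 4 triangular faces.
mesh_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                 (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
mesh_triangles = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]  # connectivity

# The same geometry as a point cloud: just the 3D point locations.
cloud_points = list(mesh_vertices)

def mesh_bytes(vertices, triangles):
    # 3 coordinates x 4 bytes per vertex + 3 indices x 4 bytes per triangle
    return 12 * len(vertices) + 12 * len(triangles)

def cloud_bytes(points):
    # 3 coordinates x 4 bytes per point; no connectivity overhead at all
    return 12 * len(points)

print(mesh_bytes(mesh_vertices, mesh_triangles))  # 96 bytes
print(cloud_bytes(cloud_points))                  # 48 bytes
```

For a closed triangular mesh the number of triangles is roughly twice the number of vertices, so under this simple uncompressed layout the connectivity alone costs about as much as the positions, which is why dropping it (advantage i above) matters most for highly detailed models.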
|Start Date: 01-06-2016|
|End Date: 31-05-2019|
|Team: João Miguel Duarte Ascenso, Catarina Isabel Carvalheiro Brites Ascenso, Fernando Manuel Bernardo Pereira|
|Groups: Multimedia Signal Processing – Lx|
|Local Coordinator: João Miguel Duarte Ascenso|