Project: Progressive Point Cloud Representation

Acronym: PCR
Main Objective:
With the emergence of consumer-grade devices providing 3D geometry information, such as Microsoft Kinect sensors, Lytro light field cameras, Leap Motion sensors and new camera phones, the way visual data is captured and transmitted is changing. Most often, these new devices provide geometry information about the visual scene in addition to the usual 2D texture information, through the clever use of novel optics, sensors (e.g. infrared), illumination (e.g. reflected light) and signal processing techniques. Thus, efficient techniques to represent this geometry information are urgently needed, and 3D geometry coding techniques have received renewed interest. While in the past 3D geometric data originated mostly from computer graphics (e.g. mesh-based representations), it is now more common for geometry information to be acquired in real time by new 3D-geometry-enabled devices. Currently, a significant amount of research on depth map and mesh data coding is under way; however, other representations, such as point-sampled geometry, can bring significant advantages in representing the 3D scene geometry and enable a whole new set of applications, such as free-viewpoint video, advanced content manipulation (e.g. refocusing or relighting), natural user interaction (e.g. controlling devices by gestures) and gaming. Although there has been some work in the computer graphics community targeted at efficient point-sampled geometry coding [21]-[28], many of the techniques typically used by the image and video coding community still have the potential to greatly increase its coding efficiency, thus providing a better quality of experience in several applications.
Nowadays, unified, fully 3D geometry representations of visual content can be classified into two main categories: polygon mesh and point-sampled geometry. A 3D polygon mesh represents the surfaces of the real world with a piecewise linear approximation and contains vertices as well as their connections; typically, triangular static meshes are considered. Several contributions to static 3D mesh coding have been proposed in the past, as reflected by several publications and an already available MPEG coding standard, notably MPEG-4 FAMC – Frame-based Animated Mesh Compression. However, such a representation requires time-consuming and complex signal processing algorithms, which must respect the constraints defined by the connectivity information. Recently, point-sampled geometry (or 3D point cloud) has received much attention, since it is considered an attractive alternative to polygon-mesh-based representations with several advantages: i) no connectivity information needs to be coded or processed, which allows bitrate savings, especially for complex and highly detailed 3D models; and ii) the computational overhead of surface reconstruction (such as triangulation) is avoided, leading to a simpler and more natural way to render and manipulate objects with complex topologies. A 3D point cloud represents real-world surfaces as a set of 3D point locations and may provide interesting additional benefits such as viewpoint-independent representation, more compact storage, improved fidelity, progressive transmission, and enhanced image analysis (e.g. object detection and recognition). This project targets the development of a novel simplified point cloud representation model and efficient methods to code this promising type of 3D representation data.
Reference: PTDC/EEI-PRO/7237/2014
Funding: FCT
Start Date: 01-06-2016
End Date: 31-05-2019
Team: João Miguel Duarte Ascenso, Catarina Isabel Carvalheiro Brites Ascenso, Fernando Manuel Bernardo Pereira
Groups: Multimedia Signal Processing – Lx
Partners: IT
Local Coordinator: João Miguel Duarte Ascenso

Associated Publications