Creating and sharing knowledge for telecommunications

Time-evolving O-D matrix estimation using high-speed GPS data streams

Gama, J. Gama ; Ferreira, M. ; João Mendes-Moreira, JMM ; Damas, L.D. ; Matias, L.

Expert Systems with Applications Vol. 44, Nº ., pp. 275 - 288, February, 2016.

ISSN (print): 0957-4174
ISSN (online):

Scimago Journal Ranking: 1,34 (in 2016)

Digital Object Identifier: 10.1016/j.eswa.2015.08.048

Portable digital devices equipped with GPS antennas are ubiquitous sources of continuous information for location-based Expert and Intelligent Systems. The availability of these traces on the human mobility patterns is growing explosively. To mine this data is a fascinating challenge which can produce a big impact on both travelers and transit agencies.
This paper proposes a novel incremental framework to maintain statistics on the urban mobility dynamics over a time-evolving origin-destination (O-D) matrix. The main motivation behind such task is to be able to learn from the location-based samples which are continuously being produced, independently on their source, dimensionality or (high) communicational rate. By doing so, the authors aimed to obtain a generalist framework capable of summarizing relevant context-aware information which is able to follow, as close as possible, the stochastic dynamics on the human mobility behavior. Its potential impact ranges Expert Systems for decision support across multiple industries, from demand estimation for public transportation planning till travel time prediction for intelligent routing systems, among others.
The proposed methodology settles on three steps: (i) Half-Space trees are used to divide the city area into dense subregions of equal mass. The uncovered regions form an O-D matrix which can be updated by transforming the trees’leaves into conditional nodes (and vice-versa). The (ii) Partioning Incremental Algorithm is then employed to discretize the target variable’s historical values on each matrix cell. Finally, a (iii) dimensional hierarchy is defined to discretize the domains of the independent variables depending on the cell’s samples.
A Taxi Network running on a mid-sized city in Portugal was selected as a case study. The Travel Time Estimation (TTE) problem was regarded as a real-world application. Experiments using one million data samples were conducted to validate the methodology. The results obtained highlight the straightforward contribution of this method: it is capable of resisting to the drift while still approximating context-aware solutions through a multidimensional discretization of the feature space. It is a step ahead in estimating the real-time mobility dynamics, regardless of its application field.