16/06/2021

Background

  • Transportation presents a major challenge to curbing climate change.

  • Better informed policymaking requires up-to-date empirical data with good quality, low cost, and easy access.

  • Emerging data sources enable deep and new insights from large-scale collection of human movement and transport systems.

What is human mobility?

The geographic displacement of human beings in space and time, seen as individuals or groups (Barbosa et al., 2018).

Individual movements

Trips from groups of people

How is human mobility supported by transport systems?

Through a variety of transport modes, e.g., ride-sourcing, public transit, and private car.

Transport modal disparities

  • Carbon intensity

  • Spatiotemporal distributions of travel time and trips

Research questions and present work

RQ1 What are the potentials and limitations of using emerging data sources for modelling mobility?

RQ2 How can new data sources be properly modelled for characterising transport modal disparities?

RQ | Paper | Scope | Paper title
1 | I | Population heterogeneity | From individual to collective behaviours: exploring population heterogeneity of human mobility based on social media data
1 | II | Travel demand | Feasibility of estimating travel demand using geolocations of social media data
1 | III | Travel demand | A mobility model for synthetic travel demand from sparse individual traces
2 | IV | Travel time | Disparities in travel times between car and transit: spatiotemporal patterns in cities
2 | V | Modal competition | Ride-sourcing compared to its public-transit alternative using big trip data

Methodology

RQ1 What are the potentials and limitations of using geotagged tweets for modelling mobility?

RQ1 Geotagged tweets, pros and cons

Geotagged tweets


Tweets that carry precise location information (GPS coordinates), attached when Twitter users actively choose to tag their location.

  • Why Twitter?

    • Easy access, low cost, large spatial and population coverage.

  • Limitations of geotagged tweets

    • Biased population: young, highly educated, urban residents.

    • Sparse sampling of the actual mobility.

    • Behaviour bias of reporting geolocations.

RQ1 Limitations: sparse sampling of the actual mobility

(Paper III)

  • Twitter users DO NOT geotweet every day.

  • Twitter users DO NOT geotweet every location visited.
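One simple way to quantify this sparsity is the share of days on which a user posts at least one geotagged tweet. The sketch below is only an illustration: the function name `active_day_share`, the record layout, and the sample records are assumptions, not taken from Paper III.

```python
from collections import defaultdict
from datetime import datetime


def active_day_share(records):
    """Return, per user, the share of days with at least one geotagged tweet.

    `records` is an iterable of (user_id, ISO timestamp) pairs. Each user's
    observation window runs from their first to their last tweet day.
    """
    days = defaultdict(set)
    for user_id, ts in records:
        days[user_id].add(datetime.fromisoformat(ts).date())
    shares = {}
    for user_id, d in days.items():
        span = (max(d) - min(d)).days + 1  # inclusive window length in days
        shares[user_id] = len(d) / span
    return shares


# Made-up records for illustration only.
records = [
    ("u1", "2019-03-01T08:15:00"),
    ("u1", "2019-03-01T21:40:00"),
    ("u1", "2019-03-10T12:00:00"),
    ("u2", "2019-03-05T09:00:00"),
]
shares = active_day_share(records)
```

A low share means most days in a user's window carry no location record at all, which is the gap a mobility model has to fill.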

RQ1 Limitations: behaviour bias of overly reporting leisure/night activities

(Paper I)

Users geotag uncommon places and leisure activities more often than regularly visited places, e.g., home and workplace.

RQ1 Limitations: not for commuting travel demand estimation

(Paper II)

The reliability of estimated commuting trips using geotagged tweets is low.

Commuting trip distance distributions

RQ1 Potentials at individual level: population heterogeneity on mobility

(Paper I)

Four types of travellers

  • Local vs. Global: where the traveller visits

    • Local: nearby locations.

    • Global: more distant locations.

  • Returner vs. Explorer: how the traveller explores around

    • Returner: one centralised location.

    • Explorer: decentralised locations that are distant from each other.
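A common measure of how spread out a person's visited locations are is the radius of gyration. The sketch below is a generic illustration of that measure, assuming planar coordinates in kilometres; it is not necessarily the metric used in Paper I to separate the four traveller types, and the sample points are made up.

```python
from math import sqrt


def radius_of_gyration(points):
    """Root-mean-square distance of visited points from their centroid.

    `points` are (x, y) pairs in km in a projected coordinate system.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)


# Made-up visits: one traveller stays within 1 km, the other spans 100 km.
rg_local = radius_of_gyration([(0, 0), (1, 0), (0, 1), (1, 1)])
rg_global = radius_of_gyration([(0, 0), (100, 0), (0, 100), (100, 100)])
```

A small value points to a traveller who stays near one centre, a large one to a traveller whose visits are far apart.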

RQ1 Potentials at population level: travel demand modelling

(Paper II)

spatial scale

Twitter data are more suitable for city level than national level (in Sweden).

sampling method

User-based data collection works better than area-based data collection.

sample size

The larger the number of geotagged tweets, the more complete the picture of travel demand.

RQ1 Extending the use by innovative approaches

(Paper II)

A density-based approach is proposed to increase sample size for estimating travel demand.

Trip-based approach in the literature (Lee et al., 2019)

Our density-based approach

increases the accuracy of the estimation by 10%–60%.
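For reference, the trip-based baseline mentioned above is commonly described as pairing consecutive geotagged tweets of the same user. The sketch below illustrates that general idea only. The 4-hour gap, the zone IDs, and the function name `trip_based_od` are hypothetical, and it does not reproduce the density-based approach of Paper II.

```python
from collections import Counter
from datetime import datetime, timedelta


def trip_based_od(tweets, max_gap=timedelta(hours=4)):
    """Count origin-destination trips from consecutive geotagged tweets.

    `tweets` is a list of (user_id, ISO timestamp, zone_id) tuples. Two
    consecutive tweets by one user form a trip when they fall in different
    zones and are at most `max_gap` apart (a hypothetical threshold).
    """
    od = Counter()
    ordered = sorted(tweets, key=lambda t: (t[0], datetime.fromisoformat(t[1])))
    for prev, curr in zip(ordered, ordered[1:]):
        same_user = prev[0] == curr[0]
        gap = datetime.fromisoformat(curr[1]) - datetime.fromisoformat(prev[1])
        if same_user and prev[2] != curr[2] and gap <= max_gap:
            od[(prev[2], curr[2])] += 1
    return od


# Made-up tweets for illustration only.
tweets = [
    ("u1", "2019-03-01T08:00:00", "A"),
    ("u1", "2019-03-01T09:30:00", "B"),
    ("u1", "2019-03-01T20:00:00", "A"),  # gap too long, not counted
    ("u2", "2019-03-01T10:00:00", "B"),
    ("u2", "2019-03-01T10:20:00", "B"),  # same zone, not a trip
    ("u2", "2019-03-01T11:00:00", "C"),
]
od = trip_based_od(tweets)
```

Because only closely spaced tweet pairs in different zones count, this kind of approach discards most of a sparse dataset, which is the sample-size problem noted above.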

RQ1 Extending the use by innovative approaches

(Paper III)

  • An individual-based mobility model is proposed to fill the gaps in sparse mobility data.

  • The model is designed to correct the behaviour bias and the sparsity issue.

Input: sparse mobility traces.

Output: synthesised mobility converted to daily trips.

RQ1 Extending the use by innovative approaches

(Paper III)

  • The model-synthesised results have good performance.

  • An application: characterising the domestic trip distance distributions of residents across global regions.

A summary of answers to RQ1

Potentials and limitations of geotagged tweets for modelling mobility

  • ⭕️ ❗️ Easy access, low cost, but with biased population, behaviour bias, and sparsity issue.

  • 👤 At the individual level, fundamental patterns are preserved.

  • 👥 At the population level:

    • a reasonably good source for estimating overall travel demand, but not commuting demand.

    • spatial scale, sampling method, and sample size need careful consideration.

  • 🔧 Innovative approaches for correcting the biases and increasing available data.

RQ2 How can new data sources be properly modelled for characterising transport modal disparities?

RQ2 Spatiotemporal patterns of travel time: data fusion approach

(Paper IV)

Data fusion framework for travel time calculation:

Distribution of geotagged tweets represents the dynamic attractiveness of locations in cities.

RQ2 Spatiotemporal patterns of travel time

(Paper IV)

Travel time ratio (R) by hour of day (Sydney)

Travel time ratio (R) over 24 hours

  • Travelling by public transit (PT) takes around twice as long as by car.

  • PT can compete with car use during peak rush hours in Stockholm and Amsterdam.
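The hourly comparison above can be sketched as follows, assuming R is defined as PT travel time divided by car travel time for the same origin-destination pair (so R = 2 means PT takes twice as long). The function name and the sample values are made up for illustration and are not results from Paper IV.

```python
from collections import defaultdict
from statistics import mean


def travel_time_ratio_by_hour(od_times):
    """Average R = t_PT / t_car per hour of day.

    `od_times` holds (hour, pt_minutes, car_minutes) for sampled
    origin-destination pairs.
    """
    ratios = defaultdict(list)
    for hour, pt_min, car_min in od_times:
        ratios[hour].append(pt_min / car_min)
    return {hour: mean(r) for hour, r in sorted(ratios.items())}


# Made-up travel times (minutes) for illustration only.
samples = [(8, 40.0, 30.0), (8, 36.0, 20.0), (14, 44.0, 20.0), (14, 30.0, 15.0)]
r_by_hour = travel_time_ratio_by_hour(samples)
```

In this toy input, car travel slows down at 08:00 while PT does not, so R drops in the peak hour.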

RQ2 Modal competition: ride-sourcing vs. public transit

(Paper V)

Does ride-sourcing complement, or compete with, public transit?

  • What share of ride-sourcing trips could be replaced by public transit, assuming travellers are willing to walk up to 800 m to and from transit stations during daytime?

  • Ride-sourcing trips: transit-competing vs. non-transit-competing.

    • What trip attributes and built environment are linked to the competition?

    • What are the implications for policymaking?
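The walking criterion in the first question can be sketched as a distance check at both ends of a trip. This is a simplified illustration, not the method of Paper V: it uses straight-line distance to the nearest stop, ignores the daytime condition and whether a transit connection actually exists, and all names and coordinates are hypothetical. Only the 800 m threshold comes from the slide.

```python
from math import asin, cos, radians, sin, sqrt

WALK_LIMIT_M = 800  # walking threshold stated in the talk


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6_371_000  # mean Earth radius in metres
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(a))


def within_walk(point, stops, limit=WALK_LIMIT_M):
    """True if any stop lies within `limit` metres of `point`."""
    return any(haversine_m(*point, *stop) <= limit for stop in stops)


def is_walk_accessible(pickup, dropoff, stops):
    """True if both trip ends are within walking distance of a transit stop."""
    return within_walk(pickup, stops) and within_walk(dropoff, stops)


# Made-up coordinates for illustration only.
stops = [(59.3300, 18.0600), (59.3500, 18.1000)]
near_trip = is_walk_accessible((59.3310, 18.0610), (59.3505, 18.1005), stops)
far_trip = is_walk_accessible((59.3310, 18.0610), (59.4000, 18.3000), stops)
```

A fuller treatment would route each trip through the transit network and compare the resulting journey with the ride-sourcing trip.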

RQ2 Modal competition: ride-sourcing vs. public transit

(Paper V)

  • The transit-competing trips account for 48.2% of the overall 4.3 million ride-sourcing trips.

Hot spots of ride-sourcing trips

RQ2 Modal competition: data fusion approach and model

(Paper V)

  • Big mode-specific trip data are often collected from a large area and population, but at the cost of rich detail.

Variable group | Variables
Raw data | trip ID; pick-up/drop-off locations; pick-up/drop-off times; cost
Enriched variables: trip attributes | 🚌 public transit information, e.g., travel time; ☀️ weather condition; 🕸 demand-based communities (by trip-based network)
Enriched variables: built environment | 🚏 transit-stop density; 🏨 functional regions (by points of interest)

  • A glass-box model enhanced by machine learning techniques: additive impact of variables and variable interactions.
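The slide does not name the model. One common glass-box form with additive variable effects and pairwise interactions is a generalised additive model with interaction terms. As an assumption about the general shape only, with y = 1 denoting a transit-competing trip and the f terms denoting learned shape functions:

```latex
\operatorname{logit} P(y = 1 \mid \mathbf{x})
  = \beta_0 + \sum_{j} f_j(x_j) + \sum_{j < k} f_{jk}(x_j, x_k)
```

Each f_j can be plotted on its own, which is what makes the contribution of a single variable, or of a pair such as land use and transit boardings, directly readable.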

RQ2 Modal competition: impact of land-use

(Paper V)

Land-use clusters of the study area

  • Low density/diversity of land use -> a lower probability of competition.

RQ2 Modal competition: impact of land-use x transit boardings

(Paper V)

Land-use clusters of the study area

  • Multiple transfers + middle density/diversity of land use -> a higher probability of competition.

RQ2 Modal competition: selective recommendations

(Paper V)

  • 🚌 Expand PT networks guided by the transit-competing ride-sourcing trips featuring

    • short travel time by ride-sourcing (< 15 min)

    • large travel time ratio between the two modes;

  • 💰 Incentivise the ride-sourcing trips that fill the gaps in the PT services which

    • take a long time

    • require lengthy walking

    • require multiple transfers connecting to suburban areas.

A summary of answers to RQ2

Characterising transport modal disparities between public transit and car/ride-sourcing

  • 📦 Importance of data fusion approaches, especially given more and more open but incomplete data.

  • Geotagged tweets are a good source of information on the time-varying attractiveness of urban locations.

  • 🚌 🚗 Public transit is virtually always slower than car and ride-sourcing.

  • 📍 For making public transit more competitive, spatiotemporal details add nuanced insights to identify gaps and opportunities.

Knowledge contributions

  • Validate and identify the limitations of geotagged tweets: behaviour bias, biased population, and sparsity issue.

  • Reveal the potentials of geotagged tweets at both individual and population level.

  • Reveal the spatiotemporal disparities between car/ride-sourcing and public transit in travel time and modal competition.

Methodological contributions

  • Propose a density-based approach and an individual-based mobility model for travel demand estimation, addressing the sparsity issue and behaviour bias of geotagged tweets.

  • Create two reproducible data fusion frameworks integrating multiple data sources from transport systems for characterising modal disparities.

Outlook

  • Extending the use of social media data for mobility modelling.

  • 🌐 Generating global synthetic mobility data for improving travel demand projections.

  • 🚙 & 🚊 🚲 🚕 🚌 🚋 Combining multi-modal trip data to explore occupancy, shareability, and electrification of new mobility services for reducing transport carbon emissions.

  • 🕸 Introducing the perspective of networks.

Thanks for listening!