Transportation Module

The transportation module provides functions for processing GTFS (General Transit Feed Specification) data and creating transportation network graphs.

GTFS transportation network utilities.

The functions in this module load GTFS feeds into DuckDB, derive stop-to-stop origin/destination records, and aggregate those records into transport summary graphs for downstream network analysis.

Functions:

Name Description
load_gtfs

Load a GTFS ZIP archive into an in-memory DuckDB database.

get_od_pairs

Generate stop-to-stop OD pairs from GTFS trip stop sequences.

travel_summary_graph

Aggregate GTFS service into a weighted stop-to-stop summary graph.

load_gtfs

load_gtfs(path)

Load a GTFS ZIP archive into an in-memory DuckDB database.

The function reads *.txt GTFS files as tables, registers a time parsing UDF, and materializes stops.geometry as points from stop_lon/stop_lat when stop coordinates are available.

Parameters:

Name Type Description Default
path str | Path

Path to a GTFS .zip file.

required

Returns:

Type Description
DuckDBPyConnection

In-memory DuckDB connection containing imported GTFS tables.
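GTFS stop times are strings such as `08:15:00`, and the GTFS specification allows hour values of 24 or more for trips that continue past midnight of the service day, so they cannot be treated as ordinary clock times. The sketch below shows one way a time-parsing helper of the kind `load_gtfs` registers might convert such strings to seconds. The function name `gtfs_time_to_seconds` and its handling of malformed input are assumptions for illustration, not the module's actual UDF.

```python
def gtfs_time_to_seconds(value):
    """Convert a GTFS 'H:MM:SS' or 'HH:MM:SS' string to seconds.

    The result counts seconds from the start of the service day, so
    hour values of 24 or more (after-midnight service) are accepted.
    Returns None for missing or malformed values.
    """
    if value is None:
        return None
    parts = value.strip().split(":")
    if len(parts) != 3:
        return None
    try:
        hours, minutes, seconds = (int(p) for p in parts)
    except ValueError:
        return None
    # Hours are unbounded above; minutes and seconds must be clock-like.
    if hours < 0 or not (0 <= minutes < 60 and 0 <= seconds < 60):
        return None
    return hours * 3600 + minutes * 60 + seconds


morning = gtfs_time_to_seconds("08:15:00")
after_midnight = gtfs_time_to_seconds("25:30:00")
```

Keeping times as integer seconds makes per-leg travel times a simple subtraction, even when a trip crosses midnight.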

get_od_pairs

get_od_pairs(
    con, start_date=None, end_date=None, include_geometry=True, directed=False
)

Generate stop-to-stop OD pairs from GTFS trip stop sequences.

For each trip_id, consecutive stops are paired into directed legs with departure/arrival timestamps and per-leg travel time in seconds. Service activity is derived from calendar and optionally adjusted using calendar_dates.
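The pairing step described above can be sketched in plain Python. This is a minimal illustration, not the module's DuckDB implementation: the row layout, the sample data, and the output keys other than `orig_stop_id` and `dest_stop_id` are assumptions, and service-calendar filtering is left out.

```python
from collections import defaultdict

# Hypothetical stop_times rows:
# (trip_id, stop_sequence, stop_id, arrival_sec, departure_sec)
stop_times = [
    ("T1", 1, "A", 28800, 28800),
    ("T1", 2, "B", 29100, 29160),
    ("T1", 3, "C", 29520, 29520),
    ("T2", 1, "C", 30000, 30000),
    ("T2", 2, "B", 30300, 30300),
]


def pair_consecutive_stops(rows):
    """Return one directed leg per pair of consecutive stops in each trip."""
    by_trip = defaultdict(list)
    for row in rows:
        by_trip[row[0]].append(row)

    legs = []
    for trip_id, trip_rows in by_trip.items():
        trip_rows.sort(key=lambda r: r[1])  # order by stop_sequence
        for current, following in zip(trip_rows, trip_rows[1:]):
            departure = current[4]   # departure from the origin stop
            arrival = following[3]   # arrival at the next stop
            legs.append({
                "trip_id": trip_id,
                "orig_stop_id": current[2],
                "dest_stop_id": following[2],
                "departure_sec": departure,
                "arrival_sec": arrival,
                "travel_time_sec": arrival - departure,
            })
    return legs


legs = pair_consecutive_stops(stop_times)
```

A trip with n stops yields n - 1 legs, so the three-stop trip `T1` contributes two legs and the two-stop trip `T2` contributes one.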

Parameters:

Name Type Description Default
con DuckDBPyConnection

DuckDB connection with GTFS tables loaded.

required
start_date str | None

Inclusive start date (YYYYMMDD). Optional when the feed includes a calendar table.

None
end_date str | None

Inclusive end date (YYYYMMDD). Optional when the feed includes a calendar table.

None
include_geometry bool

If True and stop geometries are available, include a line geometry for each OD pair.

True
directed bool

If True, preserve the original trip direction for each OD pair. If False (default), canonicalize each pair so that orig_stop_id < dest_stop_id and aggregate both directions together.

False

Returns:

Type Description
DataFrame | GeoDataFrame

One row per directed (or undirected) stop-to-stop leg.
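To illustrate what `directed=False` means for the returned rows, the sketch below canonicalizes each stop pair so that the smaller stop ID comes first and then groups legs from both directions under the same key. The helper names and sample legs are hypothetical, and the grouping shown here (collecting travel times in a list) is only one possible way to aggregate.

```python
from collections import defaultdict

# Hypothetical directed legs: (orig_stop_id, dest_stop_id, travel_time_sec)
directed_legs = [
    ("A", "B", 300),
    ("B", "A", 320),
    ("B", "C", 360),
    ("C", "B", 300),
    ("A", "B", 310),
]


def canonicalize(orig, dest):
    """Order a stop pair so that orig_stop_id <= dest_stop_id."""
    return (orig, dest) if orig < dest else (dest, orig)


def aggregate_undirected(legs):
    """Group legs by canonical pair, merging both travel directions."""
    grouped = defaultdict(list)
    for orig, dest, travel_time in legs:
        grouped[canonicalize(orig, dest)].append(travel_time)
    return dict(grouped)


undirected = aggregate_undirected(directed_legs)
```

Stop IDs are compared as strings here, so the ordering is lexicographic (`"10"` sorts before `"9"`). That is harmless for canonicalization, because the only requirement is that both directions map to the same key.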

travel_summary_graph

travel_summary_graph(
    con,
    start_time=None,
    end_time=None,
    calendar_start=None,
    calendar_end=None,
    as_nx=False,
    directed=False,
    use_frequencies=True,
)

Aggregate GTFS service into a weighted stop-to-stop summary graph.

The function builds a stop-level graph from the GTFS feed, where each edge represents the aggregated service between two consecutive stops.

Edge attributes:

travel_time_sec

Service-weighted average travel time in seconds across all trips serving this stop pair.

frequency

Total number of scheduled leg traversals in the resolved calendar window. When use_frequencies=True and frequencies.txt is present, headway-based services are expanded into effective trip counts. For example, a trip running every 10 minutes for 1 hour contributes 6 traversals per active service day.

Edge geometry:

Edge geometry is represented as a straight stop-to-stop line.
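The two edge attributes can be worked through with a small example. The sketch below expands a headway-based service into a trip count and then computes a traversal-weighted average travel time. It assumes departures run from the start of the frequency window up to, but not including, its end, and that each leg's weight is its per-day traversal count multiplied by the number of active service days. Both assumptions, and the helper names, are illustrative and not taken from the module.

```python
def headway_trip_count(start_sec, end_sec, headway_secs):
    """Number of departures implied by one frequencies.txt row.

    Assumes departures at start, start + headway, ... strictly before end.
    """
    if headway_secs <= 0 or end_sec <= start_sec:
        return 0
    # Integer ceiling division: ceil((end - start) / headway)
    return -(-(end_sec - start_sec) // headway_secs)


def service_weighted_travel_time(legs):
    """Average travel time weighted by the number of traversals.

    `legs` is an iterable of (travel_time_sec, traversals) tuples.
    """
    legs = list(legs)
    total_traversals = sum(count for _, count in legs)
    if total_traversals == 0:
        return None
    weighted = sum(travel_time * count for travel_time, count in legs)
    return weighted / total_traversals


active_days = 5  # hypothetical number of active service days in the window

# One timetabled trip per day taking 300 s on this stop pair.
scheduled = (300, 1 * active_days)

# One headway-based trip: every 10 minutes from 07:00 to 08:00, taking 360 s.
per_day = headway_trip_count(7 * 3600, 8 * 3600, 600)
headway_based = (360, per_day * active_days)

frequency = scheduled[1] + headway_based[1]
travel_time_sec = service_weighted_travel_time([scheduled, headway_based])
```

Here the headway-based trip contributes 6 traversals per day, matching the example above, so the edge carries 35 traversals over the five days and its average travel time is pulled toward the more frequent 360-second service.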

Parameters:

Name Type Description Default
con DuckDBPyConnection

DuckDB connection with GTFS tables loaded.

required
start_time str | None

Optional lower bound (inclusive) for departure time, HH:MM:SS.

None
end_time str | None

Optional upper bound (inclusive) for next-stop arrival time, HH:MM:SS.

None
calendar_start str | None

Optional calendar window start, YYYYMMDD.

None
calendar_end str | None

Optional calendar window end, YYYYMMDD.

None
as_nx bool

If True, return a NetworkX graph instead of GeoDataFrames.

False
directed bool

If True, return a directed graph (nx.DiGraph) preserving trip direction. If False (default), edges in opposite directions are merged into undirected edges (nx.Graph).

False
use_frequencies bool

If True and frequencies.txt exists, expand headway-based service into effective trip counts for the frequency attribute.

True

Returns:

Type Description
tuple[GeoDataFrame, GeoDataFrame] | DiGraph | Graph

(nodes_gdf, edges_gdf) when as_nx is False; otherwise an nx.DiGraph (directed=True) or an nx.Graph (directed=False) built from those GeoDataFrames.