Proximity#
Proximity-Based Graph Generation Module.
This module provides comprehensive functionality for generating graph networks based on spatial proximity relationships between geographic features. It implements several classical proximity models commonly used in spatial network analysis and geographic information systems. The module is particularly useful for constructing heterogeneous graphs from multiple domains of geospatial relations, enabling complex spatial analysis across different feature types and scales.
- knn_graph(gdf, k=5, distance_metric='euclidean', network_gdf=None, *, target_gdf=None, as_nx=False)[source]#
Generate a k-nearest neighbour graph from a GeoDataFrame of points.
This function constructs a graph where each node is connected to its k nearest neighbors based on the specified distance metric. The resulting graph captures local spatial relationships and is commonly used in spatial analysis, clustering, and network topology studies.
- Parameters:
gdf (geopandas.GeoDataFrame) – GeoDataFrame containing the points (nodes) for the graph. The index of this GeoDataFrame will be used as node IDs.
k (int, default 5) – The number of nearest neighbors to connect to each node. Must be positive and less than the total number of nodes.
distance_metric (str, default "euclidean") – The distance metric to use for calculating nearest neighbors. Options are: - “euclidean”: Straight-line distance - “manhattan”: City-block distance (L1 norm) - “network”: Shortest path distance along a network
network_gdf (geopandas.GeoDataFrame, optional) – A GeoDataFrame representing a network (e.g., roads, paths) to use for “network” distance calculations. Required if distance_metric is “network”.
target_gdf (geopandas.GeoDataFrame, optional) – If provided, creates a directed graph where edges connect nodes from gdf to their k nearest neighbors in target_gdf. If None, creates an undirected graph within gdf itself.
as_nx (bool, default False) – If True, returns a NetworkX graph object. Otherwise, returns a tuple of GeoDataFrames (nodes, edges).
- Returns:
If as_nx is False, returns a tuple of GeoDataFrames: - nodes_gdf: GeoDataFrame of nodes with spatial and attribute information - edges_gdf: GeoDataFrame of edges with ‘weight’ and ‘geometry’ attributes If as_nx is True, returns a NetworkX graph object with spatial attributes.
- Return type:
tuple[geopandas.GeoDataFrame, geopandas.GeoDataFrame] or networkx.Graph
- Raises:
ValueError – If distance_metric is “network” but network_gdf is not provided. If k is greater than or equal to the number of available nodes.
See also
delaunay_graph
Generate a Delaunay triangulation graph.
fixed_radius_graph
Generate a fixed-radius graph.
waxman_graph
Generate a probabilistic Waxman graph.
Notes
Node IDs are preserved from the input GeoDataFrame’s index
Edge weights represent the distance between connected nodes
Edge geometries are LineStrings connecting node centroids
For Manhattan distance, edge geometries follow L-shaped paths
The graph is undirected unless target_gdf is specified
References
Eppstein, D., Paterson, M.S. & Yao, F.F. On Nearest-Neighbor Graphs. Discrete Comput Geom 17, 263-282 (1997). [1](https://doi.org/10.1007/PL00009293)
Examples
>>> import geopandas as gpd >>> import numpy as np >>> from shapely.geometry import Point >>> >>> # Create a sample GeoDataFrame with 6 points >>> np.random.seed(42) >>> coords = np.random.rand(6, 2) * 10 >>> data = { ... 'id': [f'node_{i}' for i in range(6)], ... 'type': ['residential', 'commercial', 'industrial', 'park', 'school', 'hospital'], ... 'geometry': [Point(x, y) for x, y in coords] ... } >>> gdf = gpd.GeoDataFrame(data, crs="EPSG:4326").set_index('id') >>> print("Input GeoDataFrame:") >>> print(gdf.head(3)) type geometry id node_0 residential POINT (3.745 9.507) node_1 commercial POINT (7.319 5.987) node_2 industrial POINT (1.560 0.581)
>>> # Generate a 3-nearest neighbor graph >>> nodes_gdf, edges_gdf = knn_graph(gdf, k=3, distance_metric="euclidean") >>> print(f"\\nNodes GDF shape: {nodes_gdf.shape}") >>> print(f"Edges GDF shape: {edges_gdf.shape}") Nodes GDF shape: (6, 2) Edges GDF shape: (18, 2)
>>> print("\\nSample edges with weights:") >>> print(edges_gdf[['weight']].head(3)) weight 0 4.186842 1 6.190525 2 8.944272
>>> # Generate with Manhattan distance >>> nodes_manhattan, edges_manhattan = knn_graph( ... gdf, k=2, distance_metric="manhattan" ... ) >>> print(f"\\nManhattan edges count: {len(edges_manhattan)}") Manhattan edges count: 12
>>> # Generate as NetworkX graph >>> G = knn_graph(gdf, k=3, as_nx=True) >>> print(f"\\nNetworkX graph: {G.number_of_nodes()} nodes, {G.number_of_edges()} edges") >>> print(f"Average degree: {2 * G.number_of_edges() / G.number_of_nodes():.2f}") NetworkX graph: 6 nodes, 9 edges Average degree: 3.00
>>> # Directed graph example with target_gdf >>> target_data = { ... 'id': ['target_1', 'target_2'], ... 'service': ['hospital', 'school'], ... 'geometry': [Point(5, 5), Point(8, 8)] ... } >>> target_gdf = gpd.GeoDataFrame(target_data, crs="EPSG:4326").set_index('id') >>> nodes_dir, edges_dir = knn_graph(gdf, k=1, target_gdf=target_gdf) >>> print(f"\\nDirected graph edges: {len(edges_dir)} (each source node → 1 target)") Directed graph edges: 6 (each source node → 1 target)
- delaunay_graph(gdf, distance_metric='euclidean', network_gdf=None, *, as_nx=False)[source]#
Generate a Delaunay triangulation graph from a GeoDataFrame of points.
This function constructs a graph based on the Delaunay triangulation of the input points. Each edge in the graph corresponds to an edge in the Delaunay triangulation.
- Parameters:
gdf (geopandas.GeoDataFrame) – GeoDataFrame containing the points (nodes) for the graph. The index of this GeoDataFrame will be used as node IDs.
distance_metric (str, default "euclidean") – The distance metric to use for calculating edge weights. Options are “euclidean”, “manhattan”, or “network”.
network_gdf (geopandas.GeoDataFrame, optional) – A GeoDataFrame representing a network (e.g., roads) to use for “network” distance calculations. Required if distance_metric is “network”.
as_nx (bool, default False) – If True, returns a NetworkX graph object. Otherwise, returns a tuple of GeoDataFrames (nodes, edges).
- Returns:
If as_nx is False, returns a tuple of GeoDataFrames: - nodes_gdf: GeoDataFrame of nodes with spatial and attribute information - edges_gdf: GeoDataFrame of edges with ‘weight’ and ‘geometry’ attributes If as_nx is True, returns a NetworkX graph object with spatial attributes.
- Return type:
tuple[geopandas.GeoDataFrame, geopandas.GeoDataFrame] or networkx.Graph
- Raises:
ValueError – If distance_metric is “network” but network_gdf is not provided.
See also
knn_graph
Generate a k-nearest neighbour graph.
fixed_radius_graph
Generate a fixed-radius graph.
waxman_graph
Generate a probabilistic Waxman graph.
Notes
Node IDs are preserved from the input GeoDataFrame’s index.
Edge weights represent the distance between connected nodes.
Edge geometries are LineStrings connecting the centroids of the nodes.
If the input gdf has fewer than 3 points, an empty graph will be returned as Delaunay triangulation requires at least 3 non-collinear points.
References
Lee, D. T., & Schachter, B. J. (1980). Two algorithms for constructing a Delaunay triangulation. International Journal of Computer & Information Sciences, 9(3), 219-242. [1](https://doi.org/10.1007/BF00977785)
Examples
>>> import geopandas as gpd >>> from shapely.geometry import Point >>> # Create a sample GeoDataFrame >>> data = {'id': [1, 2, 3, 4, 5], ... 'geometry': [Point(0, 0), Point(1, 1), Point(0, 1), Point(1, 0), Point(2, 2)]} >>> gdf = gpd.GeoDataFrame(data, crs="EPSG:4326").set_index('id') >>> >>> # Generate a Delaunay graph >>> nodes_gdf, edges_gdf = delaunay_graph(gdf) >>> print(nodes_gdf) >>> print(edges_gdf) >>> >>> # Generate a Delaunay graph as NetworkX object >>> G_delaunay = delaunay_graph(gdf, as_nx=True) >>> print(G_delaunay.nodes(data=True)) >>> print(G_delaunay.edges(data=True))
- gabriel_graph(gdf, distance_metric='euclidean', network_gdf=None, *, as_nx=False)[source]#
Generate a Gabriel graph from a GeoDataFrame of points.
In a Gabriel graph two nodes u and v are connected iff the closed disc that has \(uv\) as its diameter contains no other node of the set.
- Parameters:
gdf (geopandas.GeoDataFrame) – Input point layer. The GeoDataFrame index is preserved as the node id.
distance_metric ({'euclidean', 'manhattan', 'network'}, default 'euclidean') – Metric used for edge weights / geometries (see the other generators).
network_gdf (geopandas.GeoDataFrame, optional) – Required when distance_metric=’network’.
as_nx (bool, default False) – If True return a NetworkX graph, otherwise return two GeoDataFrames (nodes, edges) via nx_to_gdf.
- Returns:
If as_nx is False, returns a tuple of GeoDataFrames: - nodes_gdf: GeoDataFrame of nodes with spatial and attribute information - edges_gdf: GeoDataFrame of edges with ‘weight’ and ‘geometry’ attributes If as_nx is True, returns a NetworkX graph object with spatial attributes.
- Return type:
tuple[geopandas.GeoDataFrame, geopandas.GeoDataFrame] or networkx.Graph
Notes
The Gabriel graph is a sub-graph of the Delaunay triangulation; therefore the implementation first builds the Delaunay edges then filters them according to the disc-emptiness predicate, achieving an overall
\[\mathcal{O}(n \log n + m k)\]complexity ( m = Delaunay edges, k = average neighbours tested per edge).
When the input layer has exactly two points, the unique edge is returned.
If the layer has fewer than two points, an empty graph is produced.
References
Gabriel, K. R., & Sokal, R. R. (1969). A new statistical approach to geographic variation analysis. Systematic zoology, 18(3), 259-278. [1](https://doi.org/10.2307/2412323)
Examples
>>> nodes, edges = gabriel_graph(points_gdf) >>> G = gabriel_graph(points_gdf, as_nx=True)
- relative_neighborhood_graph(gdf, distance_metric='euclidean', network_gdf=None, *, as_nx=False)[source]#
Generate a Relative-Neighbourhood Graph (RNG) from a GeoDataFrame.
In an RNG two nodes u and v are connected iff there is no third node *w* such that both \(d(u,w) < d(u,v)\) and \(d(v,w) < d(u,v)\). Equivalently, the intersection of the two open discs having radius :math::d(u,v) and centres u and v (the lune) is empty.
- Parameters:
gdf (geopandas.GeoDataFrame) – Input point layer whose index provides the node ids.
distance_metric ({'euclidean', 'manhattan', 'network'}, default 'euclidean') – Metric used to attach edge weights / geometries.
network_gdf (geopandas.GeoDataFrame, optional) – Required when distance_metric=’network’.
as_nx (bool, default False) – If True return a NetworkX graph, otherwise return two GeoDataFrames (nodes, edges) via nx_to_gdf.
- Returns:
If as_nx is False, returns a tuple of GeoDataFrames: - nodes_gdf: GeoDataFrame of nodes with spatial and attribute information - edges_gdf: GeoDataFrame of edges with ‘weight’ and ‘geometry’ attributes If as_nx is True, returns a NetworkX graph object with spatial attributes.
- Return type:
tuple[geopandas.GeoDataFrame, geopandas.GeoDataFrame] or networkx.Graph
Notes
The RNG is a sub-graph of the Delaunay triangulation; therefore the implementation first collects Delaunay edges ( \(\mathcal{O}(n\log n)\) ) and then filters them according to the lune-emptiness predicate.
When the input layer has exactly two points the unique edge is returned.
If the layer has fewer than two points, an empty graph is produced.
References
Toussaint, G. T. (1980). The relative neighbourhood graph of a finite planar set. Pattern recognition, 12(4), 261-268. [1](https://doi.org/10.1016/0031-3203(80)90066-7)
Examples
>>> nodes, edges = relative_neighborhood_graph(points_gdf) >>> G = relative_neighborhood_graph(points_gdf, as_nx=True)
- euclidean_minimum_spanning_tree(gdf, distance_metric='euclidean', network_gdf=None, *, as_nx=False)[source]#
Generate a (generalised) Euclidean Minimum Spanning Tree from a GeoDataFrame of points.
The classical Euclidean Minimum Spanning Tree (EMST) is the minimum- total-length tree that connects a set of points when edge weights are the straight-line ( \(L_2\) ) distances. For consistency with the other generators this implementation also supports manhattan and network metrics - it simply computes the minimum-weight spanning tree under the chosen metric. When the metric is euclidean the edge search is restricted to the Delaunay triangulation ( EMST ⊆ Delaunay ), guaranteeing an \(\mathcal{O}(n \log n)\) overall complexity. With other metrics, or degenerate cases where the triangulation cannot be built, the algorithm gracefully falls back to the complete graph.
- Parameters:
gdf (geopandas.GeoDataFrame) – Input point layer. The index is preserved as the node identifier.
distance_metric ({'euclidean', 'manhattan', 'network'}, default 'euclidean') – Metric used for the edge weights / geometries.
network_gdf (geopandas.GeoDataFrame, optional) – Required when distance_metric=’network’. Must contain the network arcs with valid pos attributes on its nodes.
as_nx (bool, default False) – If True return a NetworkX graph, otherwise return two GeoDataFrames (nodes, edges) via
nx_to_gdf
.
- Returns:
If as_nx is False, returns a tuple of GeoDataFrames: - nodes_gdf: GeoDataFrame of nodes with spatial and attribute information - edges_gdf: GeoDataFrame of edges with ‘weight’ and ‘geometry’ attributes If as_nx is True, returns a NetworkX graph object with spatial attributes.
- Return type:
tuple[geopandas.GeoDataFrame, geopandas.GeoDataFrame] or networkx.Graph
See also
delaunay_graph
Generate a Delaunay triangulation graph.
gabriel_graph
Generate a Gabriel graph.
relative_neighborhood_graph
Generate a Relative Neighborhood Graph.
Notes
The resulting graph always contains n - 1 edges (or 0 / 1 when the input has < 2 points).
For planar Euclidean inputs the computation is \(\mathcal{O}(n \log n)\) thanks to the Delaunay pruning.
All the usual spatial attributes (weight, geometry, CRS checks, etc.) are attached through the shared private helpers.
References
March, W. B., Ram, P., & Gray, A. G. (2010, July). Fast euclidean minimum spanning tree: algorithm, analysis, and applications. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 603-612). [1](https://doi.org/10.1145/1835804.1835882)
Examples
>>> nodes, edges = euclidean_minimum_spanning_tree(points_gdf) >>> G = euclidean_minimum_spanning_tree(points_gdf, as_nx=True)
- fixed_radius_graph(gdf, radius, distance_metric='euclidean', network_gdf=None, *, target_gdf=None, as_nx=False)[source]#
Generate a fixed-radius graph from a GeoDataFrame of points.
This function constructs a graph where nodes are connected if the distance between them is within a specified radius. This model is particularly useful for modeling communication networks, ecological connectivity, and spatial influence zones where interaction strength has a clear distance threshold.
- Parameters:
gdf (geopandas.GeoDataFrame) – GeoDataFrame containing the source points (nodes) for the graph. The index of this GeoDataFrame will be used as node IDs.
radius (float) – The maximum distance for connecting nodes. Nodes within this radius will have an edge between them. Must be positive.
distance_metric (str, default "euclidean") – The distance metric to use for determining connections. Options are: - “euclidean”: Straight-line distance - “manhattan”: City-block distance (L1 norm) - “network”: Shortest path distance along a network
network_gdf (geopandas.GeoDataFrame, optional) – A GeoDataFrame representing a network (e.g., roads) to use for “network” distance calculations. Required if distance_metric is “network”.
target_gdf (geopandas.GeoDataFrame, optional) – If provided, creates a directed graph where edges connect nodes from gdf to nodes in target_gdf within the specified radius. If None, creates an undirected graph from gdf itself.
as_nx (bool, default False) – If True, returns a NetworkX graph object. Otherwise, returns a tuple of GeoDataFrames (nodes, edges).
- Returns:
If as_nx is False, returns a tuple of GeoDataFrames: - nodes_gdf: GeoDataFrame of nodes with spatial and attribute information - edges_gdf: GeoDataFrame of edges with ‘weight’ and ‘geometry’ attributes If as_nx is True, returns a NetworkX graph object with spatial attributes.
- Return type:
tuple[geopandas.GeoDataFrame, geopandas.GeoDataFrame] or networkx.Graph
- Raises:
ValueError – If distance_metric is “network” but network_gdf is not provided. If radius is not positive.
See also
knn_graph
Generate a k-nearest neighbour graph.
delaunay_graph
Generate a Delaunay triangulation graph.
waxman_graph
Generate a probabilistic Waxman graph.
Notes
Node IDs are preserved from the input GeoDataFrame’s index
Edge weights represent the actual distance between connected nodes
Edge geometries are LineStrings connecting node centroids
The graph stores the radius parameter in G.graph[“radius”]
For Manhattan distance, edges follow L-shaped geometric paths
References
Bentley, J. L., Stanat, D. F., & Williams Jr, E. H. (1977). The complexity of finding fixed-radius near neighbors. Information processing letters, 6(6), 209-212. [1](https://doi.org/10.1016/0020-0190(77)90070-9)
Examples
>>> import geopandas as gpd >>> import numpy as np >>> from shapely.geometry import Point >>> >>> # Create a sample GeoDataFrame representing city facilities >>> facilities = { ... 'name': ['Library_A', 'Park_B', 'School_C', 'Hospital_D', 'Mall_E'], ... 'type': ['library', 'park', 'school', 'hospital', 'commercial'], ... 'geometry': [ ... Point(0, 0), Point(1.5, 1), Point(3, 0.5), ... Point(1, 3), Point(4, 4) ... ] ... } >>> gdf = gpd.GeoDataFrame(facilities, crs="EPSG:4326").set_index('name') >>> print("Input facilities:") >>> print(gdf) type geometry name Library_A library POINT (0.000 0.000) Park_B park POINT (1.500 1.000) School_C school POINT (3.000 0.500) Hospital_D hospital POINT (1.000 3.000) Mall_E commercial POINT (4.000 4.000)
>>> # Generate fix radius graph with radius=2.0 >>> nodes_gdf, edges_gdf = fixed_radius_graph(gdf, radius=2.0) >>> print(f"\\nConnections within 2.0 units:") >>> print(f"Nodes: {len(nodes_gdf)}, Edges: {len(edges_gdf)}") Connections within 2.0 units: Nodes: 5, Edges: 4
>>> print("\\nEdge connections and distances:") >>> for idx, row in edges_gdf.iterrows(): ... print(f"{row.name}: weight = {row['weight']:.3f}") 0: weight = 1.803 1: weight = 1.581 2: weight = 2.000 3: weight = 2.236
>>> # Compare with smaller radius >>> nodes_small, edges_small = fixed_radius_graph(gdf, radius=1.0) >>> print(f"\\nWith radius=1.0: {len(edges_small)} edges") With radius=1.0: 0 edges
>>> # Compare with larger radius >>> nodes_large, edges_large = fixed_radius_graph(gdf, radius=3.0) >>> print(f"With radius=3.0: {len(edges_large)} edges") With radius=3.0: 7 edges
>>> # Manhattan distance example >>> nodes_manh, edges_manh = fixed_radius_graph( ... gdf, radius=3.0, distance_metric="manhattan" ... ) >>> print(f"\\nManhattan metric (radius=3.0): {len(edges_manh)} edges") Manhattan metric (radius=3.0): 6 edges
>>> # NetworkX graph with radius information >>> G = fixed_radius_graph(gdf, radius=2.5, as_nx=True) >>> print(f"\\nNetworkX graph properties:") >>> print(f"Radius parameter: {G.graph['radius']}") >>> print(f"Graph density: {nx.density(G):.3f}") >>> print(f"Connected components: {nx.number_connected_components(G)}") NetworkX graph properties: Radius parameter: 2.5 Graph density: 0.600 Connected components: 1
>>> # Directed graph to specific targets >>> targets = gpd.GeoDataFrame({ ... 'service': ['Emergency', 'Transit'], ... 'geometry': [Point(2, 2), Point(3.5, 1.5)] ... }, crs="EPSG:4326", index=['Emergency_Hub', 'Transit_Stop']) >>> >>> nodes_dir, edges_dir = fixed_radius_graph( ... gdf, radius=2.5, target_gdf=targets ... ) >>> print(f"\\nDirected connections to targets: {len(edges_dir)} edges") Directed connections to targets: 8 edges
- waxman_graph(gdf, beta, r0, seed=None, distance_metric='euclidean', network_gdf=None, *, as_nx=False)[source]#
Generate a probabilistic Waxman graph from a GeoDataFrame of points.
This function constructs a random graph where the probability of an edge existing between two nodes decreases exponentially with their distance. The model is based on the Waxman random graph model, commonly used to simulate realistic network topologies in telecommunications, transportation, and social networks where connection probability diminishes with distance.
The connection probability follows the formula:
\[P(u,v) = \beta \times \exp \left(-\frac{\text{dist}(u,v)}{r_0}\right)\]- Parameters:
gdf (geopandas.GeoDataFrame) – GeoDataFrame containing the points (nodes) for the graph. The index of this GeoDataFrame will be used as node IDs.
beta (float) – Parameter controlling the overall probability of edge creation. Higher values (closer to 1.0) increase the likelihood of connections. Must be between 0 and 1.
r0 (float) – Parameter controlling the decay rate of probability with distance. Higher values result in longer-range connections being more likely. Must be positive.
seed (int, optional) – Seed for the random number generator to ensure reproducibility of results. If None, results will vary between runs.
distance_metric (str, default "euclidean") – The distance metric to use for calculating distances between nodes. Options are: - “euclidean”: Straight-line distance - “manhattan”: City-block distance (L1 norm) - “network”: Shortest path distance along a network
network_gdf (geopandas.GeoDataFrame, optional) – A GeoDataFrame representing a network (e.g., roads) to use for “network” distance calculations. Required if distance_metric is “network”.
as_nx (bool, default False) – If True, returns a NetworkX graph object. Otherwise, returns a tuple of GeoDataFrames (nodes, edges).
- Returns:
If as_nx is False, returns a tuple of GeoDataFrames: - nodes_gdf: GeoDataFrame of nodes with spatial and attribute information - edges_gdf: GeoDataFrame of edges with ‘weight’ and ‘geometry’ attributes If as_nx is True, returns a NetworkX graph object with spatial attributes.
- Return type:
tuple[geopandas.GeoDataFrame, geopandas.GeoDataFrame] or networkx.Graph
- Raises:
ValueError – If distance_metric is “network” but network_gdf is not provided. If beta is not between 0 and 1, or if r0 is not positive.
See also
knn_graph
Generate a k-nearest neighbour graph.
delaunay_graph
Generate a Delaunay triangulation graph.
fixed_radius_graph
Generate a fixed-radius graph.
Notes
Node IDs are preserved from the input GeoDataFrame’s index
Edge weights represent the actual distance between connected nodes
Edge geometries are LineStrings connecting node centroids
The graph stores parameters in G.graph[“beta”] and G.graph[“r0”]
Results are stochastic; use seed parameter for reproducible outputs
The graph is undirected with symmetric edge probabilities
References
Waxman, B. M. (2002). Routing of multipoint connections. IEEE journal on selected areas in communications, 6(9), 1617-1622. [1](https://doi.org/10.1109/49.12889)
Examples
>>> import geopandas as gpd >>> import numpy as np >>> from shapely.geometry import Point >>> >>> # Create a sample GeoDataFrame representing communication towers >>> np.random.seed(123) >>> tower_coords = np.random.uniform(0, 10, (8, 2)) >>> towers = { ... 'tower_id': [f'T{i:02d}' for i in range(8)], ... 'power': np.random.choice(['high', 'medium', 'low'], 8), ... 'geometry': [Point(x, y) for x, y in tower_coords] ... } >>> gdf = gpd.GeoDataFrame(towers, crs="EPSG:4326").set_index('tower_id') >>> print("Communication towers:") >>> print(gdf.head(4)) power geometry tower_id T00 low POINT (6.964 2.862) T01 high POINT (2.269 5.513) T02 medium POINT (5.479 4.237) T03 medium POINT (8.444 7.579)
>>> # Generate Waxman graph with moderate connectivity >>> nodes_gdf, edges_gdf = waxman_graph( ... gdf, beta=0.5, r0=3.0, seed=42 ... ) >>> print(f"\\nWaxman graph (β=0.5, r₀=3.0):") >>> print(f"Nodes: {len(nodes_gdf)}, Edges: {len(edges_gdf)}") >>> print(f"Graph density: {2 * len(edges_gdf) / (len(nodes_gdf) * (len(nodes_gdf) - 1)):.3f}") Waxman graph (β=0.5, r₀=3.0): Nodes: 8, Edges: 12 Graph density: 0.429
>>> print("\\nSample edge weights (distances):") >>> print(edges_gdf[['weight']].head(4)) weight 0 2.876543 1 4.123789 2 1.987654 3 5.432109
>>> # Compare different parameter settings >>> # High connectivity (higher beta, higher r0) >>> _, edges_high = waxman_graph(gdf, beta=0.8, r0=5.0, seed=42) >>> print(f"\\nHigh connectivity (β=0.8, r₀=5.0): {len(edges_high)} edges") High connectivity (β=0.8, r₀=5.0): 19 edges
>>> # Low connectivity (lower beta, lower r0) >>> _, edges_low = waxman_graph(gdf, beta=0.2, r0=1.5, seed=42) >>> print(f"Low connectivity (β=0.2, r₀=1.5): {len(edges_low)} edges") Low connectivity (β=0.2, r₀=1.5): 3 edges
>>> # NetworkX graph with parameter storage >>> G = waxman_graph(gdf, beta=0.6, r0=4.0, seed=42, as_nx=True) >>> print(f"\\nNetworkX graph parameters:") >>> print(f"Beta: {G.graph['beta']}, r0: {G.graph['r0']}") >>> print(f"Average clustering coefficient: {nx.average_clustering(G):.3f}") >>> print(f"Number of connected components: {nx.number_connected_components(G)}") NetworkX graph parameters: Beta: 0.6, r0: 4.0 Average clustering coefficient: 0.267 Number of connected components: 1
>>> # Demonstrate reproducibility with seed >>> G1 = waxman_graph(gdf, beta=0.4, r0=2.0, seed=99, as_nx=True) >>> G2 = waxman_graph(gdf, beta=0.4, r0=2.0, seed=99, as_nx=True) >>> print(f"\\nReproducibility test:") >>> print(f"Graph 1 edges: {G1.number_of_edges()}") >>> print(f"Graph 2 edges: {G2.number_of_edges()}") >>> print(f"Identical: {G1.number_of_edges() == G2.number_of_edges()}") Reproducibility test: Graph 1 edges: 8 Graph 2 edges: 8 Identical: True
>>> # Manhattan distance metric example >>> nodes_manh, edges_manh = waxman_graph( ... gdf, beta=0.5, r0=3.0, distance_metric="manhattan", seed=42 ... ) >>> print(f"\\nManhattan distance: {len(edges_manh)} edges") Manhattan distance: 10 edges
- bridge_nodes(nodes_dict, proximity_method='knn', *, multigraph=False, as_nx=False, **kwargs)[source]#
Build directed proximity edges between every ordered pair of node layers.
This function creates a multi-layer spatial network by generating directed proximity edges from nodes in one GeoDataFrame layer to nodes in another. It systematically processes all ordered pairs of layers and applies either k-nearest neighbors (KNN) or fixed-radius method to establish inter-layer connections. This function is specifically designed for constructing heterogeneous graphs by generating new edge types (“is_nearby”) between different types of nodes, enabling the modeling of complex relationships across multiple domains. It is useful for modeling complex urban systems, ecological networks, or multi-modal transportation systems where different types of entities interact through spatial proximity.
- Parameters:
nodes_dict (dict[str, geopandas.GeoDataFrame]) – A dictionary where keys are layer names (strings) and values are GeoDataFrames representing the nodes of each layer. Each GeoDataFrame should contain point geometries with consistent CRS across all layers.
proximity_method (str, default "knn") – The method to use for generating proximity edges between layers. Options are: - “knn”: k-nearest neighbors method - “fixed_radius”: fixed-radius method
multigraph (bool, default False) – If True, the resulting NetworkX graph will be a MultiGraph, allowing multiple edges between the same pair of nodes from different proximity relationships.
as_nx (bool, default False) – If True, returns a NetworkX graph object containing all nodes and inter-layer edges. Otherwise, returns dictionaries of GeoDataFrames.
**kwargs (Any) –
Additional keyword arguments passed to the underlying proximity method:
For proximity_method=”knn”:
- kint, default 1
Number of nearest neighbors to connect to in target layer
- distance_metricstr, default “euclidean”
Distance metric (“euclidean”, “manhattan”, “network”)
- network_gdfgeopandas.GeoDataFrame, optional
Network for “network” distance calculations
For proximity_method=”fixed_radius”:
- radiusfloat, required
Maximum connection distance between layers
- distance_metricstr, default “euclidean”
Distance metric (“euclidean”, “manhattan”, “network”)
- network_gdfgeopandas.GeoDataFrame, optional
Network for “network” distance calculations
- Returns:
If as_nx is False, returns a tuple containing:
nodes_dict: The original input nodes_dict (unchanged)
edges_dict: Dictionary where keys are edge type tuples of the form (source_layer_name, “is_nearby”, target_layer_name) and values are GeoDataFrames of the generated directed edges. Each unique tuple represents a distinct edge type in the heterogeneous graph, enabling differentiation between relationships across different node type combinations.
If as_nx is True, returns a NetworkX graph object containing all nodes from all layers and the generated directed inter-layer edges, forming a heterogeneous graph structure where different node types are connected through proximity-based relationships.
- Return type:
tuple[dict[str, geopandas.GeoDataFrame], dict[tuple[str, str, str], geopandas.GeoDataFrame]] | networkx.Graph
- Raises:
ValueError – If nodes_dict contains fewer than two layers. If proximity_method is not “knn” or “fixed_radius”. If proximity_method is “fixed_radius” but radius is not provided in kwargs.
See also
knn_graph
Generate a k-nearest neighbour graph.
fixed_radius_graph
Generate a fixed-radius graph.
Notes
All generated edges are directed from source layer to target layer
The relation type for all generated edges is fixed as “is_nearby”, creating a new edge type that bridges different node types in heterogeneous graphs
Each ordered pair of node layers generates a distinct edge type, enabling the construction of rich heterogeneous graph structures with multiple relationship types between different domain entities
Edge weights and geometries are calculated based on the chosen distance_metric
Each ordered pair of layers generates a separate edge GeoDataFrame
Self-connections (layer to itself) are not created
The resulting structure is ideal for heterogeneous graph neural networks, multi-layer network analysis, and cross-domain spatial relationship modeling
Examples
>>> import geopandas as gpd >>> import numpy as np >>> from shapely.geometry import Point >>> >>> # Create multi-layer urban infrastructure dataset >>> # Layer 1: Schools >>> schools_data = { ... 'name': ['Elementary_A', 'High_B', 'Middle_C'], ... 'capacity': [300, 800, 500], ... 'geometry': [Point(1, 1), Point(4, 3), Point(2, 4)] ... } >>> schools = gpd.GeoDataFrame(schools_data, crs="EPSG:4326").set_index('name') >>> >>> # Layer 2: Hospitals >>> hospitals_data = { ... 'name': ['General_Hospital', 'Clinic_East'], ... 'beds': [200, 50], ... 'geometry': [Point(3, 2), Point(5, 5)] ... } >>> hospitals = gpd.GeoDataFrame(hospitals_data, crs="EPSG:4326").set_index('name') >>> >>> # Layer 3: Parks >>> parks_data = { ... 'name': ['Central_Park', 'River_Park', 'Neighborhood_Green'], ... 'area_ha': [15.5, 8.2, 3.1], ... 'geometry': [Point(2, 2), Point(1, 3), Point(4, 4)] ... } >>> parks = gpd.GeoDataFrame(parks_data, crs="EPSG:4326").set_index('name') >>> >>> nodes_dict = { ... 'schools': schools, ... 'hospitals': hospitals, ... 'parks': parks ... } >>> >>> print("Input layers:") >>> for layer_name, gdf in nodes_dict.items(): ... print(f"{layer_name}: {len(gdf)} nodes") Input layers: schools: 3 nodes hospitals: 2 nodes parks: 3 nodes
>>> # Bridge nodes using KNN method (1 nearest neighbor) >>> nodes_out, edges_out = bridge_nodes( ... nodes_dict, proximity_method="knn", k=1 ... ) >>> >>> print(f"\\nGenerated edge types: {len(edges_out)}") >>> for edge_key in edges_out.keys(): ... print(f" {edge_key[0]} → {edge_key[2]}: {len(edges_out[edge_key])} edges") Generated edge types: 6 schools → hospitals: 3 edges schools → parks: 3 edges hospitals → schools: 2 edges hospitals → parks: 2 edges parks → schools: 3 edges parks → hospitals: 3 edges
>>> # Examine specific edge relationships >>> school_to_hospital = edges_out[('schools', 'is_nearby', 'hospitals')] >>> print("\\nSchools to nearest hospitals:") >>> print(school_to_hospital[['weight']]) weight 0 2.236068 1 1.414214 2 2.828427
>>> # Bridge nodes using fixed radius >>> nodes_fr, edges_fr = bridge_nodes( ... nodes_dict, proximity_method="fixed_radius", radius=2.5 ... ) >>> >>> total_fr_edges = sum(len(gdf) for gdf in edges_fr.values()) >>> print(f"\\nFixed radius method (radius=2.5): {total_fr_edges} total edges") Fixed radius method (radius=2.5): 8 total edges
>>> # Compare edge counts by method >>> print("\\nEdge count comparison:") >>> for edge_key in edges_out.keys(): ... knn_count = len(edges_out[edge_key]) ... fr_count = len(edges_fr[edge_key]) if edge_key in edges_fr else 0 ... print(f" {edge_key[0]} → {edge_key[2]}: KNN={knn_count}, Fixed radious={fr_count}") Edge count comparison: schools → hospitals: KNN=3, Fixed radious=2 schools → parks: KNN=3, Fixed radious=3 hospitals → schools: KNN=2, Fixed radious=1 hospitals → parks: KNN=2, Fixed radious=1 parks → schools: KNN=3, Fixed radious=1