This tutorial aims to show a facility location application. To achieve this goal the tuorial will solve mulitple real world facility location problems using a dataset describing an area of census tract 205, San Francisco. The general scenarios can be stated as: store sites should supply the demand in this census tract considering the distance between the two sets of sites: demand points and candidate supply sites. Four fundamental facility location models are utilized to highlight varying outcomes dependent on objectives: LSCP (Location Set Covering Problem), MCLP (Maximal Coverage Location Problem), P-Median and P-Center. For further information on these models, it’s recommended to see the notebooks that explain more deeply about each one: LSCP , MCLP , P-Median , P-Center .
Also, this tutorial demonstrates the use of different solvers that PULP supports.
%config InlineBackend.figure_format = "retina" %load_ext watermark %watermark
Last updated: 2023-12-10T13:35:40.591294-05:00 Python implementation: CPython Python version : 3.12.0 IPython version : 8.18.0 Compiler : Clang 15.0.7 OS : Darwin Release : 23.1.0 Machine : x86_64 Processor : i386 CPU cores : 8 Architecture: 64bit
import geopandas import matplotlib.pyplot as plt from matplotlib.patches import Patch import matplotlib.lines as mlines import matplotlib_scalebar from matplotlib_scalebar.scalebar import ScaleBar import numpy import pandas import pulp import shapely from shapely.geometry import Point import spopt import time import warnings %watermark -w %watermark -iv
Watermark: 2.4.3 pulp : 2.7.0 matplotlib : 3.8.2 geopandas : 0.14.1 numpy : 1.26.2 pandas : 2.1.3 shapely : 2.0.2 spopt : 0.5.1.dev53+g5cadae7 matplotlib_scalebar: 0.8.1
We use 4 data files as input: - network_distance is the distance between facility candidate sites calculated by ArcGIS Network Analyst Extension - demand_points represents the demand points with some important features for the facility location problem like population - facility_points represents the stores that are candidate facility sites - tract is the polygon of census tract 205.
All datasets are online on this repository.
DIRPATH = "../spopt/tests/data/"
network_distance = pandas.read_csv( DIRPATH + "SF_network_distance_candidateStore_16_censusTract_205_new.csv" ) network_distance
distance | name | DestinationName | demand | |
---|---|---|---|---|
0 | 671.573346 | Store_1 | 60750479.01 | 6540 |
1 | 1333.708063 | Store_1 | 60750479.02 | 3539 |
2 | 1656.188884 | Store_1 | 60750352.02 | 4436 |
3 | 1783.006047 | Store_1 | 60750602.00 | 231 |
4 | 1790.950612 | Store_1 | 60750478.00 | 7787 |
. | . | . | . | . |
3275 | 19643.307257 | Store_19 | 60816023.00 | 3204 |
3276 | 20245.369594 | Store_19 | 60816029.00 | 4135 |
3277 | 20290.986235 | Store_19 | 60816026.00 | 7887 |
3278 | 20875.680521 | Store_19 | 60816025.00 | 5146 |
3279 | 20958.530406 | Store_19 | 60816024.00 | 6511 |
3280 rows × 4 columns
demand_points = pandas.read_csv( DIRPATH + "SF_demand_205_centroid_uniform_weight.csv", index_col=0 ) demand_points = demand_points.reset_index(drop=True) demand_points
OBJECTID | ID | NAME | STATE_NAME | AREA | POP2000 | HOUSEHOLDS | HSE_UNITS | BUS_COUNT | long | lat | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 6081602900 | 60816029.00 | California | 0.48627 | 4135 | 1679 | 1715 | 112 | -122.488653 | 37.650807 |
1 | 2 | 6081602800 | 60816028.00 | California | 0.47478 | 4831 | 1484 | 1506 | 59 | -122.483550 | 37.659998 |
2 | 3 | 6081601700 | 60816017.00 | California | 0.46393 | 4155 | 1294 | 1313 | 55 | -122.456484 | 37.663272 |
3 | 4 | 6081601900 | 60816019.00 | California | 0.81907 | 9041 | 3273 | 3330 | 118 | -122.434247 | 37.662385 |
4 | 5 | 6081602500 | 60816025.00 | California | 0.46603 | 5146 | 1459 | 1467 | 44 | -122.451187 | 37.640219 |
. | . | . | . | . | . | . | . | . | . | . | . |
200 | 204 | 6075011100 | 60750111.00 | California | 0.09466 | 5559 | 2930 | 3037 | 362 | -122.418479 | 37.791082 |
201 | 205 | 6075012200 | 60750122.00 | California | 0.07211 | 7035 | 3862 | 4074 | 272 | -122.417237 | 37.785728 |
202 | 206 | 6075017601 | 60750176.01 | California | 0.24306 | 5756 | 2437 | 2556 | 943 | -122.410115 | 37.779459 |
203 | 207 | 6075017800 | 60750178.00 | California | 0.27882 | 5829 | 3115 | 3231 | 807 | -122.405411 | 37.778934 |
204 | 208 | 6075012400 | 60750124.00 | California | 0.17496 | 8188 | 4328 | 4609 | 549 | -122.416455 | 37.782289 |
205 rows × 11 columns
facility_points = pandas.read_csv(DIRPATH + "SF_store_site_16_longlat.csv", index_col=0) facility_points = facility_points.reset_index(drop=True) facility_points
OBJECTID | NAME | long | lat | |
---|---|---|---|---|
0 | 1 | Store_1 | -122.510018 | 37.772364 |
1 | 2 | Store_2 | -122.488873 | 37.753764 |
2 | 3 | Store_3 | -122.464927 | 37.774727 |
3 | 4 | Store_4 | -122.473945 | 37.743164 |
4 | 5 | Store_5 | -122.449291 | 37.731545 |
5 | 6 | Store_6 | -122.491745 | 37.649309 |
6 | 7 | Store_7 | -122.483182 | 37.701109 |
7 | 8 | Store_11 | -122.433782 | 37.655364 |
8 | 9 | Store_12 | -122.438982 | 37.719236 |
9 | 10 | Store_13 | -122.440218 | 37.745382 |
10 | 11 | Store_14 | -122.421636 | 37.742964 |
11 | 12 | Store_15 | -122.430982 | 37.782964 |
12 | 13 | Store_16 | -122.426873 | 37.769291 |
13 | 14 | Store_17 | -122.432345 | 37.805218 |
14 | 15 | Store_18 | -122.412818 | 37.805745 |
15 | 16 | Store_19 | -122.398909 | 37.797073 |
study_area = geopandas.read_file(DIRPATH + "ServiceAreas_4.shp").dissolve() study_area
geometry | |
---|---|
0 | POLYGON ((-122.45299 37.63898, -122.45415 37.6. |
base = study_area.plot() base.axis("off");
To start modeling the problem assuming the arguments expected by spopt.locate , we should pass a numpy 2D array as a cost matrix. So, first we pivot the network_distance dataframe.
Note that the columns and rows are sorted in ascending order.
ntw_dist_piv = network_distance.pivot_table( values="distance", index="DestinationName", columns="name" ) ntw_dist_piv
name | Store_1 | Store_11 | Store_12 | Store_13 | Store_14 | Store_15 | Store_16 | Store_17 | Store_18 | Store_19 | Store_2 | Store_3 | Store_4 | Store_5 | Store_6 | Store_7 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
DestinationName | ||||||||||||||||
60750101.0 | 11495.190454 | 20022.666503 | 10654.593733 | 8232.543149 | 7561.399789 | 4139.772198 | 4805.805279 | 2055.530234 | 225.609240 | 1757.623456 | 11522.519829 | 7529.985950 | 10847.234951 | 10604.729605 | 20970.277793 | 15242.989416 |
60750102.0 | 10436.169910 | 19392.094770 | 10024.022001 | 7601.971416 | 6930.828057 | 3093.851654 | 4175.233547 | 1257.809690 | 1041.911304 | 2333.244000 | 10509.099285 | 6470.965406 | 10216.663219 | 9974.157873 | 20339.706061 | 14612.417684 |
60750103.0 | 10746.296811 | 19404.672860 | 10036.600090 | 7614.549505 | 6943.406146 | 3381.778555 | 4187.811636 | 2046.436590 | 744.584403 | 1685.517099 | 10800.926186 | 6778.892307 | 10229.241308 | 9986.735962 | 20352.284150 | 14624.995773 |
60750104.0 | 11420.492134 | 19808.368182 | 10440.295413 | 8018.244828 | 7347.101469 | 4044.473877 | 4591.506959 | 2463.736278 | 795.715285 | 1282.217412 | 11308.221508 | 7447.187630 | 10632.936630 | 10390.431285 | 20755.979472 | 15028.691095 |
60750105.0 | 11379.443952 | 19583.920000 | 10215.847231 | 7793.796646 | 7122.653287 | 4103.725695 | 4367.058776 | 3320.283731 | 1731.462738 | 249.669959 | 11083.773326 | 7379.539448 | 10408.488448 | 10165.983103 | 20531.531290 | 14804.242913 |
. | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
60816025.0 | 17324.066610 | 2722.031291 | 10884.063331 | 14178.007937 | 13891.857275 | 18418.384867 | 16726.951785 | 20834.395022 | 21441.247824 | 20875.680521 | 14662.484617 | 16569.371114 | 12483.322114 | 11926.727459 | 4968.842581 | 8648.054204 |
60816026.0 | 15981.172325 | 3647.137006 | 10299.369046 | 13593.313651 | 13307.162990 | 17833.690581 | 16142.257500 | 20249.700736 | 20856.553539 | 20290.986235 | 14050.290332 | 15963.776829 | 11871.527828 | 11342.033174 | 3625.948296 | 7919.659919 |
60816027.0 | 14835.342712 | 4581.333336 | 9637.139433 | 12931.084039 | 12644.933377 | 17171.460969 | 15480.027887 | 19587.471124 | 20194.323926 | 19628.756623 | 13341.313338 | 15301.547216 | 11209.298215 | 10679.803561 | 2290.818683 | 7242.830306 |
60816028.0 | 13339.491691 | 6392.856372 | 8577.488412 | 11871.433018 | 11585.282356 | 16111.809948 | 14420.376867 | 18527.820103 | 19134.672905 | 18569.105602 | 11845.462317 | 14241.196195 | 10155.147195 | 9620.152540 | 1846.795647 | 5746.979285 |
60816029.0 | 15257.855684 | 6394.920365 | 10253.752405 | 13547.697010 | 13261.546349 | 17788.073940 | 16096.640859 | 20204.084095 | 20810.936898 | 20245.369594 | 13763.826309 | 15918.160188 | 11825.911187 | 11296.416533 | 508.931655 | 7665.343278 |
205 rows × 16 columns
Here the pivot table is transformed to a numpy 2D array such as spopt.locate expected. The matrix has a shape of 205x16.
cost_matrix = ntw_dist_piv.to_numpy() cost_matrix.shape
(205, 16)
cost_matrix[:3, :3]
array([[11495.19045438, 20022.66650296, 10654.59373325], [10436.16991032, 19392.09477041, 10024.0220007 ], [10746.29681106, 19404.67285964, 10036.60008993]])
Now, as the rows and columns of our cost matrix are sorted, we have to sort our facility points and demand points geodataframes, too.
n_dem_pnts = demand_points.shape[0] n_fac_pnts = facility_points.shape[0]
process = lambda df: as_gdf(df).sort_values(by=["NAME"]).reset_index(drop=True) as_gdf = lambda df: geopandas.GeoDataFrame(df, geometry=pnts(df)) pnts = lambda df: geopandas.points_from_xy(df.long, df.lat)
facility_points = process(facility_points) demand_points = process(demand_points)
Reproject the input spatial data.
for _df in [facility_points, demand_points, study_area]: _df.set_crs("EPSG:4326", inplace=True) _df.to_crs("EPSG:7131", inplace=True)
Here the rest of parameters are set.
ai is the demand weight and in this case we model as population in year 2000 of this tract.
ai = demand_points["POP2000"].to_numpy() # maximum service radius (in meters) SERVICE_RADIUS = 5000 # number of candidate facilities in optimal solution P_FACILITIES = 4
Below, the method is used to plot the results of the four models that we will prepare to solve the problem.
dv_colors_arr = [ "darkcyan", "mediumseagreen", "saddlebrown", "darkslategray", "lightskyblue", "thistle", "lavender", "darkgoldenrod", "peachpuff", "coral", "mediumvioletred", "blueviolet", "fuchsia", "cyan", "limegreen", "mediumorchid", ] dv_colors = ": dv_colors_arr[i] for i in range(len(dv_colors_arr))> dv_colors
def plot_results(model, p, facs, clis=None, ax=None): """Visualize optimal solution sets and context.""" if not ax: multi_plot = False fig, ax = plt.subplots(figsize=(6, 9)) markersize, markersize_factor = 4, 4 else: ax.axis("off") multi_plot = True markersize, markersize_factor = 2, 2 ax.set_title(model.name, fontsize=15) # extract facility-client relationships for plotting (except for p-dispersion) plot_clis = isinstance(clis, geopandas.GeoDataFrame) if plot_clis: cli_points = <> fac_sites = <> for i, dv in enumerate(model.fac_vars): if dv.varValue: dv, predef = facs.loc[i, ["dv", "predefined_loc"]] fac_sites[dv] = [i, predef] if plot_clis: geom = clis.iloc[model.fac2cli[i]]["geometry"] cli_points[dv] = geom # study area and legend entries initialization study_area.plot(ax=ax, alpha=0.5, fc="tan", ec="k", zorder=1) _patch = Patch(alpha=0.5, fc="tan", ec="k", label="Dissolved Service Areas") legend_elements = [_patch] if plot_clis: # any clients that not asscociated with a facility if model.name.startswith("mclp"): c = "k" if model.n_cli_uncov: idx = [i for i, v in enumerate(model.cli2fac) if len(v) == 0] pnt_kws = dict(ax=ax, fc=c, ec=c, marker="s", markersize=7, zorder=2) clis.iloc[idx].plot(**pnt_kws) _label = f"Demand sites not covered ($n$=)" _mkws = dict(marker="s", markerfacecolor=c, markeredgecolor=c, linewidth=0) legend_elements.append(mlines.Line2D([], [], ms=3, label=_label, **_mkws)) # all candidate facilities facs.plot(ax=ax, fc="brown", marker="*", markersize=80, zorder=8) _label = f"Facility sites ($n$=)" _mkws = dict(marker="*", markerfacecolor="brown", markeredgecolor="brown") legend_elements.append(mlines.Line2D([], [], ms=7, lw=0, label=_label, **_mkws)) # facility-(client) symbology and legend entries zorder = 4 for fname, (fac, predef) in fac_sites.items(): cset = dv_colors[fname] if plot_clis: # clients geoms = cli_points[fname] gdf = geopandas.GeoDataFrame(geoms) gdf.plot(ax=ax, zorder=zorder, ec="k", fc=cset, markersize=100 * markersize) _label = f"Demand sites covered by " _mkws = dict(markerfacecolor=cset, markeredgecolor="k", ms=markersize + 7) legend_elements.append( mlines.Line2D([], [], marker="o", lw=0, label=_label, **_mkws) ) # facilities ec = "k" lw = 2 predef_label = "predefined" if model.name.endswith(predef_label) and predef: ec = "r" lw = 3 fname += f" ()" facs.iloc[[fac]].plot( ax=ax, marker="*", markersize=1000, zorder=9, fc=cset, ec=ec, lw=lw ) _mkws = dict(markerfacecolor=cset, markeredgecolor=ec, markeredgewidth=lw) legend_elements.append( mlines.Line2D([], [], marker="*", ms=20, lw=0, label=fname, **_mkws) ) # increment zorder up and markersize down for stacked client symbology zorder += 1 if plot_clis: markersize -= markersize_factor / p if not multi_plot: # legend kws = dict(loc="upper left", bbox_to_anchor=(1.05, 0.7)) plt.legend(handles=legend_elements, **kws)
from spopt.locate import LSCP lscp = LSCP.from_cost_matrix(cost_matrix, SERVICE_RADIUS) lscp = lscp.solve(pulp.GLPK(msg=False)) lscp_objval = lscp.problem.objective.value() lscp_objval