cltproblem package¶
cltproblem.algorithm module¶
Algorithms for the cltproblem, mostly containing Suurballe’s algorithm.
-
cltproblem.algorithm.
clc
(G, s, W1, W2, epsilon=0.0001, verbose=False, fp='log.json', sec=<function sec_connected_components>)¶ Solves CLC exactly using Gurobi solver.
- Parameters
G (Graph) – Input graph. Can be directed or undirected.
s (node) – Origin.
W1 (int) – Lower weight threshold.
W2 (int) – Upper weight threshold.
epsilon (float, optional) – Small value for rounding.
verbose (bool, optional) – Whether to print and record the results to a json file. May slow down the solver on larger instances.
fp (bool, optional) – Filepath to the results json file.
sec (function, optional) – Function that takes the variables x and start vertex s as input and returns a dict called cuts. Each item in cuts is a dict with a ‘left’ key and a ‘right’ key. The constraint added to the model is left <= right.
- Returns
List of vertices in the optimal tour.
- Return type
list
-
cltproblem.algorithm.
delta
(G, S, vars={})¶ Given a set of nodes S in a graph G, return the set of edges with exactly one endpoint in S and the other endpoint not in S.
- Parameters
G (nx.Graph) – Input graph.
S (set) – Set of nodes.
vars (set or dict, optional) – Set containing keys of variables in the IP solver.
- Returns
Set of edges with exactly one endpoint in S and one endpoint not in S.
- Return type
set
-
cltproblem.algorithm.
edge_disjoint_path_cost
(T, t)¶ Return the cost of the least-cost edge-disjoint pair of paths from s to t.
- Parameters
T (SuurballeTree) – Initialised Suurballe’s Tree
t (node) – Target node
- Returns
Total cost of both edge-disjoint paths. If no edge-disjoint paths exist, then return sys.maxsize
- Return type
int or float
-
cltproblem.algorithm.
extract_suurballe_edge_disjoint_paths
(T, s, t)¶ Algorithm consists of initialisation and two backward traversals for each edge-disjoint path.
- Parameters
T (SuurballeTree) – Input tree with d, p, and q initialised using Suurballe’s algorithm. Ensure that all vertices are unmarked to begin.
s (node) – Start vertex.
t (node) – Target vertex.
- Returns
Tuple of length 2. Each tuple is a list with a path from s to t.
- Return type
2-tuple of lists
-
cltproblem.algorithm.
init_clc
(G, s, W1, W2)¶ Initialise a Gurobi MIP to solve the CLC problem.
The inputs G, s, W1 and W2 are all stored as attributes of the model. They can be accessed by prepending an underscore, e.g. model._s.
The following objective function and constraints are added to the solver:
\[\begin{split}\begin{align} \label{obj} & \text{Minimise} & & \displaystyle\sum_{e_{ij} \in E(G)}~{x_{ij} \cdot c(e_{ij})} & \\ \label{weight} & \text{Subject to} & & W_1 \leq \displaystyle\sum_{e_{ij} \in E(G)}~{x_{ij} \cdot w(e_{ij})} \leq W_2 & \\ \label{origin} & & & x(\delta(s^\star)) = 2 & \\ \label{degree} & & & \frac{1}{2} x(\delta(i)) \in \{0,1\} & \forall i \in V(G) \\ \label{domain} & & & x_{ij} \in \{0,1\} & \forall e_{ij} \in E(G) \end{align}\end{split}\]To enforce connectivity of the solution, you will need to add subtour elimination constraints.
- Parameters
G (Graph) – Input graph. Can be directed or undirected.
s (node) – Origin.
W1 (int) – Lower weight threshold.
W2 (int) – Upper weight threshold.
- Returns
Gurobi Model object initialised with constraints.
- Return type
Model
-
cltproblem.algorithm.
integral
(x, epsilon=0.0001)¶ Check that every variable in x is integral.
-
cltproblem.algorithm.
is_ancestor
(T, ancestor, descendant)¶ Test if ancestor is the ancestor of descendant.
- Parameters
T (SuurballeTree) – Tree where each vertex has an attribute ‘pre’ and ‘post’ which denote the preorder number and postorder number of the vertex. Use the preorder and postorder methods to compute these numberings.
ancestor (node) – The node we are testing is an ancestor.
descendant (node) – The node we are testing is a descendant.
- Returns
True if ancestor is the ancestor of descendant, and false otherwise.
- Return type
boolean
-
cltproblem.algorithm.
postorder
(T, s)¶ Computes a postorder traversal of the tree. Tree should be an instance of a directed graph. Assigns nodes with the attribute “post”. Returns a list of nodes in postorder.
- Parameters
T (DiGraph) – Tree.
s (node) – Root vertex.
- Returns
Vertices in postorder.
- Return type
list
-
cltproblem.algorithm.
preorder
(T, s, limit=5000)¶ Computes a preorder traversal of the tree. Tree should be an instance of a directed graph. Assign nodes with the attribute “pre”. Returns a list of nodes in preorder.
- Parameters
T (DiGraph) – Tree.
s (node) – Root vertex.
limit (int, optional) – Recursion limit. Default is 5000. Bigger is required for larger datasets.
- Returns
Vertices in preorder.
- Return type
list
-
cltproblem.algorithm.
sec_callback
(model, where)¶
-
cltproblem.algorithm.
sec_connected_components
(vars, s)¶
-
cltproblem.algorithm.
sec_min_weight_cut
(vars, s)¶
-
cltproblem.algorithm.
suurballe
(G, s, weight='weight')¶ Suurballe’s algorithm for multiple targets. For every vertex in G, finds the least-cost edge-disjoint paths from s to v.
This algorithm is the single-source multiple-target version [ST84]. The original Suurballe algorithm [Suu74] was for single-source single-target.
We have tried to be consistant with the notation presented in the 1984 paper [ST84] to make it easier to follow the code.
- Parameters
G (DiGraph) – Directed input graph. Must be asymmetric, i.e. if (u,v) is in G then (v,u) is not in G. If your graph is symmetric then use the preprocessing.create_dummies method.
s (node) – Source vertex.
weight (string, optional) – Name of the edge attribute. Defaults to ‘weight’.
- Returns
A vertex for which a disjoint path exists has its label set to true. Each reachable vertex has p, q, and d assigned. The edge-disjoint paths can be reconstructed from p and q.
- Return type
-
cltproblem.algorithm.
whack_a_mole
(m)¶ Enforce that every subset S that does not contain the origin has at least two edges coming out of it.
\[\begin{align} & \frac{\sum_{i \in S}~y_i}{|S|} \leq x(\delta(S)) & \forall S \subset V(G), 3 \leq |S| \leq n-3, s^\star \notin S \end{align}\]
cltproblem.datastructure module¶
-
class
cltproblem.datastructure.
SuurballeTree
(s)¶ Bases:
networkx.classes.digraph.DiGraph
Store the tree and all other datastructures for Suurballe’s algorithm for finding least-cost edge-disjoint paths from s to every vertex in the graph.
-
s
¶ Root vertex.
- Type
node
-
p
¶ Keys are nodes. The edge (p[w],w) is a non-tree edge. Thus p[w] is not the parent of the node in the shortest path tree. Parent of root is None.
- Type
dict
-
d
¶ Keys are nodes. Weight from s to vertex v.
- Type
dict
-
c
¶ Keys are 2-tuple (u,v) of edges. Values are transformed costs of edges.
- Type
dict
-
children
¶ Keys are nodes. Value is a list of children nodes.
- Type
dict
-
i
¶ Keys are nodes. Value is list of 2-tuples representing edges incident to the node.
- Type
dict
-
l
¶ Keys are nodes. Values are bool. Values indicate if the node has been labelled.
- Type
dict
-
hq
¶ Heap queue data structure.
- Type
list
-
mark
¶ Keys are nodes. Values are bool. Values indicate if the node has been marked.
- Type
dict
-
w
¶ Keys are nodes. Values are the shortest path from s to the node.
- Type
dict
-
q
¶ Keys are nodes. Values are nodes. Let v be a key. Then q[v] is the node that caused (p[v],v) to be processed.
- Type
dict
-
get_node_data
(u)¶ Print out the attributes of a vertex in the tree.
- Parameters
u (node) –
-
cltproblem.experiments module¶
The codebase for running experiments to show how well the heuristics performs against the relaxations.
-
class
cltproblem.experiments.
ConnectivityResult
(G, s, W1, W2, tour, time=-1, id=-1, dataset='', weight='weight', cost='cost')¶
-
class
cltproblem.experiments.
ContinuousResult
(G, s, W1, W2, tour, xhead, time=-1, id=-1, dataset='', weight='weight', cost='cost')¶ Bases:
cltproblem.experiments.Result
Deals with the extra xhead parameter.
-
to_dict
()¶ Return a dictionary with keys/values for each datapoint.
-
-
class
cltproblem.experiments.
Experiment
(thresholds, keys, filepath='', origins=[], dataset=[], pre={}, overwrite=True, id=0, ran_nodes=10, backup=False, optimise={})¶ Bases:
object
Stores the results and parameters of an experiment.
-
print
()¶ For every result, print a summary of the result.
-
run
(verbose=False)¶ Run the experiment. If the filepath attribute is set, then results are written to that json file.
- Parameters
verbose (bool) – If true, print out the result each time.
-
write
()¶ Write all the results and settings to a json file.
- Parameters
filepath (str) – Write the results to a json file with this path.
-
-
cltproblem.experiments.
KEYS
= {'AH': <function adaptive>, 'CR': <function continuous>, 'DjV': <function dejavu>, 'SH': <function suurballe_heuristic>, 'XR': <function connectivity>}¶ Maps a key to a heuristic or relaxation method.
-
cltproblem.experiments.
NICE
= {'AH': 'Adaptive', 'CR': 'Continuous relaxation', 'DjV': 'Deja Vu', 'SH': 'Suurballe', 'XR': 'Connectivity relaxation'}¶ Maps a key to a nice name for a heuristic or relaxation.
-
class
cltproblem.experiments.
Result
(G, s, W1, W2, key, tour, time=-1, id=-1, dataset='', weight='weight', cost='cost')¶ Bases:
object
-
print
()¶ Print out the result.
-
to_dict
()¶ Return a dictionary with keys/values for each datapoint.
-
-
cltproblem.experiments.
read
(filepath)¶ Read a json file and return an
Experiment
object.
cltproblem.heuristic module¶
Heuristics for the CLT and CLC problems.
-
cltproblem.heuristic.
adaptive
(G, s, W1, W2, weight='weight', cost='cost', k=1, optimise='default')¶ Creates a cycle C which can be repeated. If s is in C, then C is repeated at most k times. If s is not in C, then a path from s to C is computed, and C is repeated at most k-1 times.
- Parameters
G (Graph) – Asymmetric, directed input graph.
s (node) – Start vertex.
W1 (int) – Lower weight threshold.
W2 (int) – Upper weight threshold.
weight (str, optional) – Name of weight edge attribute.
cost (str, optional) – Name of cost edge attribute.
k (int, optional) – Max number of vertex repetitions.
optimise (str, optional) – Method for reducing the number of vertices visited. Options are default, sphere, euclid, manhattan.
- Returns
Ordered list of vertices in the tour.
- Return type
list
-
cltproblem.heuristic.
dejavu
(G, s, W1, W2, weight='weight', cost='cost', k=9223372036854775807, optimise='default')¶ Designed for the CLT problem when k=infinity. Finds a path to a low cost edge which it repeats with a large multiplicity. Returns to s along the path.
- Parameters
G (Graph) – Input graph.
s (node) – Start vertex.
W1 (int) – Lower weight threshold.
W2 (int) – Upper weight threshold.
weight (str, optional) – Name of weight edge attribute.
cost (str, optional) – Name of cost edge attribute.
k (int, optional) – Max number of vertex repetitions.
optimise (str, optional) – Redundant for this heuristic.
- Returns
Ordered list of vertices in the tour.
- Return type
list
-
cltproblem.heuristic.
suurballe_heuristic
(G, s, W1, W2, weight='weight', cost='cost', k=1, optimise='default')¶ Find two vertex-disjoint paths starting at the origin and ending at some vertex t such that the total cost is minimised and sum of the weights of the paths is weight feasible. Time complexity is \(\mathcal{O}(n \log n)\).
- Parameters
G (Graph) – Asymmetric, directed, simple input graph.
s (node) – Start vertex.
W1 (int) – Lower weight threshold.
W2 (int) – Upper weight threshold.
weight (str, optional) – Name of weight edge attribute.
cost (str, optional) – Name of cost edge attribute.
k (int, optional) – Max number of vertex repetitions. Default is 1 (CLC problem).
optimise (str, optional) – Redundant for this heuristic.
- Returns
list – Ordered list of vertices in first pair of vertex-disjoint paths.
list – Order list of vertices in second pair of vertex-disjoint paths.
cltproblem.preprocess module¶
Pre-processing algorithms to reduce the size of the input graph or converting between different types of graph.
-
class
cltproblem.preprocess.
Conversion
¶ Bases:
object
Converting between different types of networkx Graphs.
Different datasets mean we need to convert between different types of graphs because algorithms often assume a certain type of graph as input. This class stores different types of graphs for the same dataset and provides methods for converting between them. Specifically, we give methods for converting between MultiGraphs to simple Graphs, from MultiDiGraphs to DiGraphs, and from DiGraphs to asymmetric DiGraphs. See the networkx and osmnx packages for converting between directed and undirected graphs.
-
multi
¶ A networkx multigraph.
- Type
MultiGraph
-
simple
¶ A networkx simple graph.
- Type
Graph
-
di
¶ A networkx DiGraph.
- Type
DiGraph
-
multidi
¶ A networkx MultiDiGraph.
- Type
MultiDiGraph
-
asymm
¶ A network DiGraph that is asymmetric.
- Type
DiGraph
-
di_fakes
¶ A dictionary mapping a fake node f in the simple graph to a 3-tuple (u,v,k) in the multi graph. There are two edges (u,f) and (f,v) in the simple graph connecting f. (u,w,k) is a multi-edge between u and v with key k.
- Type
dict
-
simple_fakes
¶ Fake nodes in the undirected simple graph to edges in the undirected multi graph
- Type
dict
-
asymm_mid
¶ Max node id of the asymmetric directed graph.
- Type
int
-
multidi_mid
¶ Max node id of multi directed graph.
- Type
int
-
di_mid
¶ Max node id of directed simple graph.
- Type
int
-
multi_mid
¶ Max node id of the multi undirected graph.
- Type
int
Examples
Lets say you create the following graph:
>>> G = nx.MultiDiGraph() >>> nx.add_path(G, [0,1,2,3,4]) >>> nx.add_edge(2,3) >>> nx.add_edge(3,4) >>> C = Conversion() >>> H = C.multidi2di(G)
Calling multidi2di will create a simple graph. You can access the original MultiDiGraph G by
>>> C.multidi
And you can access the created simple directed graph by
>>> C.di
The multi_to_simple function will create some ``fake nodes’‘. The attribute fakes stores the created fake nodes as keys and stores the original (u,v,k) multi-edges as a value.
>>> print(C.di_fakes) {5:(2,3,1),6:(3,4,1)}
-
get_di_fake_nodes
()¶ Get nodes that were created in the conversion processes.
- Returns
Set of nodes.
- Return type
set
-
get_di_path_from_asymm
(P)¶ Given a path in the asymmetric graph, return the path in the directed graph.
- Parameters
P (list) – List of nodes in the path from the asymmetric graph.
- Returns
List of nodes in the path in the di graph.
- Return type
list
-
get_multi_path_from_asymm
(P)¶ Given a list of nodes in the asymmetric directed graph, return a list of nodes from the undirected multi graph.
-
get_multi_path_from_simple
(P)¶ Given a path in the simple undirected graph, return the path from the multi undirected graph.
Case = [f,0,1,2,3,4,f] Case = [f,0,1,2,3,4] Case = [0,1,2,3,4,f]
- Parameters
P (list) – List of nodes in the path from the simple graph.
- Returns
List of nodes in the path in the multi graph.
- Return type
list
-
get_multidi_path_from_di
(P)¶ Given a list of vertices in a path from the simple (di) graph, return the list of edges in the multi-graph in the path.
- Parameters
P (list) – List of nodes in the path.
- Returns
List of edges in the path. Each item in the list is a three tuple.
- Return type
list
-
multi2asymm
(G=None)¶ Given undirected graph, return a directed, asymmetric graph.
-
multi2simple
(G=None)¶ Given an undirected multigraph, return an undirected simple graph.
- Parameters
G (nx.MultiGraph, optional) – Undirected graph with multi edges.
- Returns
Undirected, simple graph with no multi edges.
- Return type
nx.Graph
Notes
Currently, we remove self loops rather than processing them.
-
multidi2di
(G=None)¶ Given a directed graph with multiple arcs between two vertices, return a simple graph.
If there exist two edges e1,e2 between two vertices u,v, then create a third vertex w. e1 remains unchanged, and e2 is split in half such that there exists an edge u,w and w,v. The weight, cost and gamma of e1 and (u,w) remain unchanged. The weight, cost and gamma of (w,v) are all set to zero.
G is assigned to the multidi attribute. The returned DiGraph is assigned to the di attribute. The fakes attribute is initialised.
- Parameters
G (nx.MultiDiGraph, optional) – Directed input graph that may have multiple edges between two vertices. If not specified then we use the multidi attribute as G.
- Returns
Simple directed graph.
- Return type
nx.DiGraph
-
multidi2multi
(G=None)¶ Given a multidi graph, return a multi undirected graph.
-
simple2asymm
(G=None, weight='weight', cost='cost', cost_of_split=0, weight_of_split=0)¶ Given a directed, (possibly) symmetric, simple graph, return a directed asymmetric simple graph.
This tranformation is described by [ST84]. Each vertex from the directed graph is split in two. E.g. node u is split into u1 and u2. A directed edge is create between u1 and u2. The cost and weight of this edge is defined by the parameters cost_of_split and weight_of_split.
This method will also assign the asymm_mid and di_mid attributes.
- Parameters
G (nx.Graph, nx.DiGraph, optional) – Input graph that may be directed. If a DiGraph is specified, then G is assigned to the C.di attribute. If no graph is specified, the C.di graph is used.
weight (str, optional) – Name of weight edge attribute of the graph.
cost (str, optional) – Name of cost edge attribute of the graph.
weight_of_split (int, optional) – Weight of arc between u1 and u2.
cost_of_split (int, optional) – Cost of arc between u1 and u2.
- Returns
Asymmetric directed graph.
- Return type
DiGraph
-
-
exception
cltproblem.preprocess.
ConversionException
(message)¶ Bases:
Exception
Raised when there is a problem converting between graphs.
-
cltproblem.preprocess.
leaves
(G, s, limit=1000)¶ Given graph G, find all vertices with degree 1 and remove them from the graph. If removing a vertex creates another vertex with degree 1, also remove this vertex.
If the graph is disconnected, or become disconnected as a result of removing nodes, then not all leaves are guaranteed to be removed from the graph. In this case, it is suggested to use the remove_disconnected_components function.
- Parameters
G (Graph) – Input graph.
s (node) – Origin
limit (int, optional) – Recursion limit.
- Returns
Graph where all vertices have degree 2 or more.
- Return type
Graph
-
cltproblem.preprocess.
post_clc
(C, edges, s)¶ Return a tour after post-processing the CLC result.
-
cltproblem.preprocess.
post_suurballe
(C, P1, P2, s)¶ Given two vertex disjoint paths in the asymm graph, return the tour from the multidi graph.
-
cltproblem.preprocess.
pre_clc
(G, s, W1, W2)¶
-
cltproblem.preprocess.
pre_suurballe
(G, s, W2)¶ Executes all pre-processing methods for Suurballe’s heuristic.
Removes vertices that are unreachable from s within weight W2/2.
Removes all remaining leaves.
Converts the multidi graph into a simple di graph.
Converts the di graph into an asymmetric di graph.
- Parameters
G (nx.MultiDiGraph) – Input graph.
s (node) – Start vertex.
W2 (int) – Upper weight threshold.
- Returns
A Conversion class with the multidi, di and asymm attributes set. The pre-processed graph you should input to Suurballe’s heuristic is the asymm graph.
- Return type
Notes
The start vertex will need to be changed using the
util.create_split_target()
function when passed to theheuristic.suurballes_heuristic()
function. This is because the graph has gone through the “splitting” procedure (see explanation insimple2asymm()
method in theConversion
class.)
-
cltproblem.preprocess.
suurballe_prune
(G, s, threshold, weight='weight')¶ If the total weight of shortest pair of vertex-disjoint paths from s to a vertex v in G is greater than the threshold, then remove v from the graph.
- Parameters
G (nx.DiGraph) – Directed input graph.
s (node) – Vertex in G.
threshold (int or float) – Cutoff threshold.
weight (string, optional) – Name of edge attribute.
- Returns
New graph containing only vertices that are reachable from s within given weight threshold.
- Return type
nx.DiGraph
-
cltproblem.preprocess.
unreachable
(G, s, threshold, weight='weight')¶ Remove vertices from G which are unreachable within a given weight threshold from a vertex s.
- Parameters
G (Graph) – Input graph. Can be directed or undirected.
s (node) – Vertex in G.
threshold (int or float) – Cutoff threshold.
weight (string, optional) – Name of edge attribute.
- Returns
New graph containing only vertices that are reachable from s within given weight threshold.
- Return type
Graph
cltproblem.relaxation module¶
Relaxations for the Constrained Least-cost Tour problem. These relaxations are a lower bound on the optimal solution to the CLT prolem. Thus we can compare how close our heuristics are to the lower bounds to measure the performance of the heuristics.
-
cltproblem.relaxation.
connectivity
(G, s, W1, W2, weight='weight', cost='cost', optimise='default', count=0, auth={}, time_limit=9223372036854775807)¶ Find a group of disjoint cycles, one of which contains s, such that the total weight of the solution is at least W1 and at most W2. The total cost of the cycles should be minimised.
This algorithm is a relaxation for the constrained least-cost cycle problem.
- Parameters
G (Graph) – Input graph.
s (string) – Start vertex.
W1 (int) – Lower weight threshold.
W2 (int) – Upper weight threshold.
weight (str, optional) – Name of weight attribute on edge. Default is ‘weight’.
cost (str, optional) – Name of cost attribute on edge. Default is ‘cost’.
count (int, optional) – Number of times the solution has been tried so far. Used if DoCloud drops out.
auth (dict, optional) –
The authentication values if docloud is used. If not specified then model is solved optimally, i.e. using your locally installed CPLEX Optimisation Studio.
Must have “url” and “client_id” fields set if solving on cloud.
optimise (str, optional) – Redundant for this relaxation.
- Returns
eids (list) – List of edge ids in the solution.
lower_bound (float) – The lower bound of the solution. If the solution is optimal, lower_bound is the value of the optimal solution.
-
cltproblem.relaxation.
continuous
(G, s, W1, W2, weight='weight', cost='cost', optimise='default')¶ The continuous relaxtion for the constraint least-cost closed walk problem. The algorithm returns a list of nodes The algorithm can be summerised in 3 steps:
Find the shortest (least polluted) path to every node reachable within weight D/2.
For every node found, find xhead and compute the cost of the path.
Compare all paths in 2. and return the path with least cost.
- Parameters
G (Graph) – Input graph.
s (node) – Start and end vertex.
W1 (int) – Lower weight threshold.
W2 (int) – Upper weight threshold.
weight (str, optional) – Name of weight attribute on edge. Default is ‘weight’.
cost (str, optional) – Name of cost attribute on edge. Default is ‘cost’.
optimise (str, optional) – Redundant for this relaxation.
- Returns
List of nodes in walk. The final edge is the head.
- Return type
list
-
cltproblem.relaxation.
initialise_model
(G, s, W1, W2, weight='weight', cost='cost', lower_bound=0, time_limit=9223372036854775807)¶ Initialises a CPLEX model for connectivity relaxation.
- Parameters
G (Graph) – Input graph.
s (node) – Starting node.
W1 (int) – Lower weight threshold.
W2 (int) – Upper weight threshold.
weight (str, optional) – Name of weight attribute on edge. Default is ‘weight’.
cost (str, optional) – Name of cost attribute on edge. Default is ‘cost’.
lower_bound (float, optional) – If you have previously derived a lower bound for the solution (e.g. using the continuous relaxation) then you can add this lower bound here.
time_limit (float, optional) – Max time in seconds.
- Returns
The initialised model.
- Return type
CpoModel
cltproblem.settings module¶
Settings for API keys, filepaths and experiments.
-
cltproblem.settings.
get_settings
(filepath='settings.json', default=True)¶ Load the settings from the json file.
- Parameters
filepath (str, optional) – Filepath to the settings json file. Must set default to False if you use this option.
default (bool) – Use the default location for the settings file.
- Returns
Dictionary containing the settings.
- Return type
dict
cltproblem.util module¶
Provides some generally useful functions for extracting walks, manipulating graphs and loading the datasets.
The splitting procedure¶
Some of the methods in this module are used to
create an asymmetric directed graph from an
undirected graph. For example, given a vertex u,
the create_split_source()
and
create_split_target()
will create two vertex IDs which could then
be added to a directed graph.
-
cltproblem.util.
argmin
(dictionary)¶ Given a dictionary, return the key of the item with least value.
- Parameters
dictionary (dict) –
- Returns
Key of item with least value.
- Return type
key
-
cltproblem.util.
compute_entropy
(G, u, M, sight=1, max_entropy=True)¶ The entropy of a vertex \(v\) is defined by:
\[\mathcal{E}(v) = \sum_{c=1}^{C}~{- P_c \log P_c}\]where \(C\) is the number of classes and \(P_c\) is the probability of class \(c\) appearing in a grid centered on vertex \(v\). The size of the grid is \(2 \cdot \texttt{sight} + 1\).
- Parameters
G (Graph) – Input graph.
u (node) – The vertex to compute entropy for.
M (np.matrix) – Numpy matrix containing the classes of all vertices.
sight (int, optional) – The distance the agent can see.
max_entropy (bool, optional) – True if the agent need to seek areas of high entropy. False otherwise.
- Returns
Entropy.
- Return type
float
-
cltproblem.util.
compute_xhead
(G, W, x, weight='weight')¶ Finds how many times the last edge in the walk is repeated. This method is intended for the continuous relaxation.
- Parameters
G (nx.Graph) – Input graph.
W (list) – List of vertices in the walk.
x (int) – Target weight.
- Returns
Number of times the head is repeated.
- Return type
float
-
cltproblem.util.
create_split_source
(u, max_id=None)¶ Given a vertex, create the split-source vertex ID. If the vertex ID is a string, then “i-” is prepended. If the vertex ID is an int, then max_id + u is returned.
- Parameters
u (node) – A vertex in the directed graph.
max_id (int, optional) – The id of the node with the largest id in the original undirected graph.
- Returns
If u is an int, then a new integer vertex ID is returned. Else if u is a string, then a new string vertex ID is returned.
- Return type
int OR str
-
cltproblem.util.
create_split_target
(u, max_id=None)¶ Given a vertex, create the split-target vertex ID. If the vertex ID is a string, then “ii-” is prepended. If the vertex ID is an int, then 2*max_id + u is returned.
- Parameters
u (node) – A vertex in the directed graph.
max_id (int, optional) – The id of the node with the largest id in the original undirected graph.
- Returns
If u is an int, then a new integer vertex ID is returned. Else if u is a string, then a new string vertex ID is returned.
- Return type
int, str
-
cltproblem.util.
create_walk_dictionary
(G, s, W1, W2, T, time, algorithm, id=None, dataset=None, xhead=None)¶ Store result of an algorithm which computes a solution to the CLT problem in a dictionary.
- Parameters
G (Graph) – Input graph.
s (node) – Origin.
W1 (int) – Lower weight threshold.
W2 (int) – Upper weight threshold.
T (list) – List of vertices in the tour.
time (float) – Time taken to compute result.
algorithm (str) – Name of the algorithm that computed the tour.
id (int, optional) – ID of experiment
dataset (str, optional) – Name of the dataset, e.g. pollution.
xhead (float, optional) – Number of times the head of the walk is repeated. Necessary if the algorithm is the continuous relaxation.
- Returns
Returns a dictionary with key-values summerising the tour. The value for the key ‘walk’ is a list.
- Return type
dict
-
cltproblem.util.
default_check
(G, u, v, T, min_uv_distance=0)¶ Always returns true.
- Parameters
G (Graph) – Input graph.
u (node) –
v (node) –
T (int) – u and v must at most this threshold distance apart.
min_uv_distance (int) – u and v must be at least this distance apart.
- Returns
Always true.
- Return type
bool
-
cltproblem.util.
disjoint_tours
(G, edges)¶ Given a graph and a list of edges, return a list of lists of vertices where each list is a disjoint tour.
- Parameters
G (Graph) – Input graph.
edges (list) – List of 2-tuples. Each 2-tuple is an edge (u,v).
- Returns
Each list is a disjoint tour of vertices.
- Return type
list of lists
-
cltproblem.util.
distance_on_unit_sphere
(lat1, long1, lat2, long2)¶ Given two lat/long points on a globe, compute the distance in meters between the two points.
See John Cook’s blog.
- Parameters
lat1 (float) –
long1 (float) –
lat2 (float) –
long2 (float) –
- Returns
Distance between two points in meters.
- Return type
float
-
cltproblem.util.
edge_multiplicity
(W, xhead=None)¶ Given a walk of vertices, count the number of times each edge (u,v) is repeated.
- Parameters
W (list) – List of vertices in the walk.
xhead (float, optional) – If the walk is a solution to the continuous relaxation, then xhead is number of times the last edge in the walk is repeated. Every other edge is repeated exactly twice.
- Returns
Dictionary containing the number of times each edge is repeated. The keys are a 2-tuple of nodes: (u,v). The values (multiplicity of edge) are integers or a float if xhead is set.
- Return type
dict
- Raises
TypeError – If W is not a list or if xhead is not a float, int or Decimal.
ValueError – If xhead is negative.
-
cltproblem.util.
euclidean_check
(G, u, v, T, min_uv_distance=0)¶ Return true if the Euclidean distance between u and v + shortest path distance between u and origin + shortest path distance between v and origin is less than T.
- Parameters
G (Graph) – Input graph.
u (node) –
v (node) –
T (int) – u and v must at most this threshold distance apart.
min_uv_distance (int) – u and v must be at least this distance apart.
- Returns
True if nodes meet threshold constraints.
- Return type
bool
-
cltproblem.util.
extract_dummy_path
(p, max_id=None)¶ Given a path of split-sources and split-targets, return the vertices from the pre-split graph that are in the path.
- Parameters
p (list) – List of split-source and split-target nodes.
max_id (int, optional) – The id of the node with the largest id in the original undirected graph.
- Returns
List of vertices from the original graph.
- Return type
list
-
cltproblem.util.
extract_split_node_id
(u, max_id=None)¶ Given a vertex u which is either a split-source or split-target, return the original ID of the vertex before it was split.
- Parameters
u (node) – A vertex in the directed graph.
max_id (int, optional) – The id of the node with the largest id in the original undirected graph.
- Returns
If u is an int, then a new integer vertex ID is returned. Else if u is a string, then a new string vertex ID is returned.
- Return type
int, str
-
cltproblem.util.
feasible
(total_weight, W1, W2)¶ - Parameters
total_weight (int) – Weight of solution.
W1 (int) – Lower weight threshold.
W2 (int) – Upper weight threshold.
- Returns
True if W1 <= total_weight <= W2. False otherwise.
- Return type
bool
-
cltproblem.util.
find_max_id
(G)¶ If node ids are integers then return the node with the largest id.
- Parameters
G (Graph) – Input graph with integer node ids.
- Returns
Largest id.
- Return type
int
-
cltproblem.util.
get_benchmark_id
(x, y, width)¶ Get the id of a vertex given x-coord and y-coord and the width (same as height) of the benchmark map.
- Parameters
x (int) – x co-ordinate of grid square.
y (int) – y co-ordinate of grid square.
width (int) – Width of the pathfinding benchmark graph.
- Returns
Node id.
- Return type
int
-
cltproblem.util.
get_boundaries
(gdf)¶ Given a geodataframe, return the bounding box.
- Parameters
gdf (GeoDataFrame) – Must have a geometry column.
- Returns
List of 4 floats: [xmin, ymin, xmax, ymax]. If using lat/lon, then x is lon and y is lat.
- Return type
list
-
cltproblem.util.
get_class
(k)¶ Given a string, return the class it belongs to (0-4).
- Parameters
k (str) – Type of ground from: . S T W @
- Returns
0 : . (normal ground)
1 : S (shallow water)
2 : T (trees)
3 : W (water)
4 : @ (out-of-bounds)
- Return type
int
See also
-
cltproblem.util.
get_input_graph
(vars)¶ Given the variables in the MIP, create a networkx graph.
- Parameters
vars (list) – List of 2-tuples representing edges in the graph.
- Returns
Undirected graph.
- Return type
nx.Graph
-
cltproblem.util.
get_pollution_gdf
(fp='', datadir='datasets/', filename='pollution.shp')¶ Get a GeoDataFrame containing pollution predictions across London.
- Parameters
fp (str, optional) – Full filepath to the shape file.
datadir (str, optional) – Name of the data directory.
fp – Filename of the shapefile.
- Returns
Polygons each with a mean pollution and variance.
- Return type
gpd.GeoDataFrame
-
cltproblem.util.
get_solution_edges
(x, epsilon=0.0001, varname='x')¶ Given some Gurobi Vars indexed by a two-tuple (u,v) return all edges (u,v) which have a value greater than epsilon.
- Parameters
x (dict) – Keys are (u,v) edges. Values are Gurobi Vars.
epsilon (float, optional) – Small value to correct for rounding errors. Default is 0.0001.
varname (str, optional) – Name of the Gurobi variable. Default is ‘x’.
- Returns
Edges who’s value is greater than epsilon. List of 2-tuples (u,v).
- Return type
list
-
cltproblem.util.
get_support_graph
(x)¶ Create a networkx graph over the variables x.
- Parameters
x (list) – List of 2-tuples representing edges that have been selected in the solution to the MIP.
- Returns
Undirected graph over edges in the solution.
- Return type
nx.Graph
-
cltproblem.util.
is_split_pair
(u, v, max_id=None)¶ Check if two distinct vertices in the DiGraph were originally the same vertex in the undirected input graph.
- Parameters
u (node) – Node should be a string or int.
v (node) – Node should be a string or int.
max_id (int, optional) – The id of the node with the largest id in the original undirected graph.
- Returns
True if the vertices were the same vertex in the original graph. False otherwise.
- Return type
bool
-
cltproblem.util.
is_split_source
(u, max_id=None)¶ Check if u is a split-source vertex
- Parameters
u (node) – A vertex in the directed graph.
max_id (int, optional) – The id of the node with the largest id in the original undirected graph.
- Returns
True if vertex is split-source, false otherwise.
- Return type
bool
See also
-
cltproblem.util.
is_split_target
(u, max_id=None)¶ Check if u is a split-target vertex.
- Parameters
u (node) – A vertex in the directed graph.
max_id (int, optional) – The id of the node with the largest id in the original undirected graph.
- Returns
True if vertex is split-target, false otherwise.
- Return type
bool
See also
-
cltproblem.util.
load_dataset
(ds, fp=None)¶ Load the dataset and return a graph.
- Parameters
ds (str) – Key for dataset. Options are ‘pollution’, ‘crucible’.
- Returns
Graph for dataset. The graph will have a cost function and a weight function on the edges. If crucible, then nx.Graph is returned. If pollution, then nx.MultiDiGraph is returned.
- Return type
Graph
-
cltproblem.util.
log_model
(m, fp='log.json', write=False, epsilon=0.0001)¶ Log current details of a model.
- Parameters
m (Model) – A Gurobi model that has already been optimized.
- Returns
Dictionary containing log.
- Return type
dict
-
cltproblem.util.
manhattan_check
(G, u, v, T, min_uv_distance=0)¶ Return true if the manhattan distance between u and v + shortest path distance between u and origin + shortest path distance between v and origin is less than T.
- Parameters
G (Graph) – Input graph.
u (node) –
v (node) –
T (int) – u and v must at most this threshold distance apart.
min_uv_distance (int) – u and v must be at least this distance apart.
- Returns
True if nodes meet threshold constraints.
- Return type
bool
-
cltproblem.util.
order_path
(edges, s)¶ Given an unordered list of edges (u,v) and a start vertex s, return an ordered list of vertices in the path starting with s.
- Parameters
edges (list) – Each item in the list is a 2-tuple of vertices (u,v).
s (node) – Start vertex of path.
- Returns
Ordered list of vertices.
- Return type
list
-
cltproblem.util.
pathfinding_benchmark_graph
(fp='', datadir='datasets/', filename='thecrucible.map')¶ Load a graph from the 2D Pathfinding Benchmarks dataset by Nathan Sturtevant [Stu12].
There are five types of ground in the maps: Normal ground (.), shallow water (S), trees (T), water (W) and out-of-bounds(@).
A vertex is created for every square in the grid. An edge exists between two adjacent (left,right,up,down) squares of normal ground.
The weight of every edge is one. The cost of an edge is defined by the mean entropy of the vertices it is connected to:
\[c(e_{u,v}) = 1 - \frac{1}{2}(\mathcal{E}(u) + \mathcal{E}(v))\]- Parameters
fp (str, optional) – Full filepath to the shape file.
datadir (str, optional) – Name of the data directory.
fp – Name of the map file.
- Returns
An undirected graph.
- Return type
Graph
See also
-
cltproblem.util.
pollution_gdf
(fp='', datadir='datasets/', filename='pollution.shp')¶ Return a geodataframe with pollution values.
- Parameters
fp (str, optional) – Full filepath to the shape file.
datadir (str, optional) – Name of the data directory.
fp – Filename of the shapefile.
- Returns
Contains a geometry column and a value column. The value column contains the pollution values.
- Return type
GeoDataFrame
-
cltproblem.util.
pollution_graph
(fp='', datadir='datasets/', filename='pollution.shp')¶ Load the pollution dataset graph.
Gets the bounding box of the air quality predictions. Queries osmnx to get all vertices and edges within the bounding box. See
get_boundaries()
method for more details about the bounding box.- Parameters
fp (str, optional) – Full filepath to the shape file.
datadir (str, optional) – Name of the data directory.
fp – Filename of the shapefile.
- Returns
A multi di graph returned from osmnx.
- Return type
nx.MultiDiGraph
-
cltproblem.util.
random_nodes
(G, k)¶ Choose k random nodes from G
- Parameters
G (Graph) – Input graph.
k (int) – Number of random nodes to choose.
- Returns
Set of k randomly chosen nodes.
- Return type
Set
-
cltproblem.util.
reorder
(T, s)¶ Given a tour T and a start vertex s, reorder the tour such that the start vertex is at the start and end of the tour.
- Parameters
T (list) – List of nodes.
s (node) – Start and end vertex of the tour.
- Returns
New tour. First and last vertex in the list is the start vertex.
- Return type
list
-
cltproblem.util.
repetitive_index
(G, T, xhead=None)¶ Compute the repetitive index (\(\rho\)) of a tour T in G. If \(\rho \to 1\), then the tour consists of a single edge repeated many times, else if \(\rho \to -1\) then the tour is a simple cycle. \(\rho\) is defined by:
\[\rho = \frac{\underset{e \in E(\mathcal{T})}{\max} \left\{ \phi(\mathcal{T}, e)\right\} - |E(\mathcal{T})|}{\displaystyle\sum_{e \in E(\mathcal{T})}~{\phi(\mathcal{T}, e)}}\]where \(E(\mathcal{T}) \subset E(G)\) is the set of edges in tour \(\mathcal{T}\) and \(\phi(\mathcal{T}, e)\) is the number of times edge \(e\) is repeated in tour \(\mathcal{T}\).
- Parameters
G (Graph) – Input graph.
T (list) – List of nodes denoting a tour.
xhead (float, optional) –
- Returns
Repetitive index between -1 and 1.
- Return type
float
-
cltproblem.util.
sphere_check
(G, u, v, T, min_uv_distance=0)¶ The great sphere distance between u and v + shortest path distance between u and origin + shortest path distance between v and origin is less than T.
- Parameters
G (Graph) – Input graph.
u (node) –
v (node) –
T (int) – u and v must at most this threshold distance apart.
min_uv_distance (int) – u and v must be at least this distance apart.
- Returns
True if nodes meet threshold constraints.
- Return type
bool
-
cltproblem.util.
sum_disconnected_walks
(G, walks, weight='weight')¶ In the connectivity relaxation, there exist disconnected components. To find the weight/cost of the components, we must compute each component seperately.
- Parameters
G (Graph) – Input graph.
walks (list of lists) – Each list is a connected walk in the graph.
- Returns
Sum of weights of each disconnected walk.
- Return type
int, float
See also
-
cltproblem.util.
total
(G, W, weight='weight', xhead=None)¶ Find the total weight of a walk W in the graph G.
- Parameters
G (Graph) – Input graph.
W (list) – List of vertices in the walk.
weight (str, optional) – Name of edge attribute to sum weight for. Default is ‘weight’.
xhead (float, optional) – Number of times last edge in the walk is repeated. Only used for continuous relaxation. Default is None.
- Returns
Total weight of walk.
- Return type
int or float
-
cltproblem.util.
update_cost
(G, pdf, edf=None, prune=True, keyname='key')¶ Update the cost of edges in G from a GeoDataFrame gdf.
- Parameters
G (Graph) – Input graph. Must have a geometry attribute on the edges.
pdf (GeoDataFrame) – Must contain geometry column and value column.
edf (GeoDataFrame, optional) – If an edge GeoDataFrame of G is already available, pass it to this parameter. Otherwise a new GeoDataFrame will be created from G.
prune (bool) – Whether to remove edges and nodes that do not intersect the area defined by pdf.
- Returns
Graph with updated cost attribute. If prune is set to true, then a new graph is created and returned. If prune is false, then the cost function of the original graph is changed and the original graph is returned.
- Return type
Graph
cltproblem.visualise module¶
-
cltproblem.visualise.
COLORS
= {'A': 'red', 'C': 'magenta', 'N': 'orange', 'R': 'blue', 'S': 'purple', 'X': 'black'}¶ Colour used for each heuristic.
-
cltproblem.visualise.
DOTS
= {'A': '^', 'C': '^', 'N': 'x', 'R': 'x', 'S': 'o', 'X': 'o'}¶ Type of dot used for each heuristic.
-
cltproblem.visualise.
LINES
= {'A': '-.', 'C': '--', 'N': '--', 'R': ':', 'S': '-.', 'X': '--'}¶ Type of line used for each heuristic.
-
cltproblem.visualise.
get_tight_box
(X, Y, W2, big_box)¶ Get a tight box around an x,y co-ordinate. The size of the box is determined by the upper weight threshold W2.
- Parameters
X (float) – x co-ordinate
Y (float) – y co-ordinate
W2 (int) – Upper weight threshold in meters.
big_box (list) – The boundaries of the box containing all edges and vertices. See
util.get_boundaries()
to get this box given a GeoDataFrame.
- Returns
Bounding box for the tight box. List of 4 floats: [xmin, ymin, xmax, ymax]. If using lat/lon, then x is lon and y is lat.
- Return type
list
-
cltproblem.visualise.
num_constrs
(constrs)¶ Plot the number of constraints in the model.
-
cltproblem.visualise.
objective
(obj)¶ Plot the value of the objective function over iterations.
-
cltproblem.visualise.
tour
(G, edges, pgdf, egdf=None, ax=None, ngdf=None, s=None, xlim=[], ylim=[], edge_color='white', tour_color='white', edge_alpha=0.3, node_tour_color='red', vmin=0, vmax=80, source='source', target='target')¶ Show a tour T in graph G for the pollution prediction contained in the GeoDataFrame gdf.
- Parameters
G (Graph) – Input graph. Must have a geometry attribute.
edges (list) – A list of 3-tuples (u,v,k) representing edges in the tour.
gdf (GeoDataFrame) – Contains a geometry column and value column.
s (node, optional) – The start vertex.
s_geom (Point, optional) – Geometry of the start vertex. Must be a point.
xlim (list) – List of ints or floats. Length two.
ylim (list) – List of ints or floats. Length two.
-
cltproblem.visualise.
viz_disjoint_tours
(C, pdf, tour_list, dir='', save=False, filename='sec')¶