cltproblem package

cltproblem.algorithm module

Algorithms for the cltproblem, mostly containing Suurballe’s algorithm.

cltproblem.algorithm.clc(G, s, W1, W2, epsilon=0.0001, verbose=False, fp='log.json', sec=<function sec_connected_components>)

Solves CLC exactly using Gurobi solver.

Parameters
  • G (Graph) – Input graph. Can be directed or undirected.

  • s (node) – Origin.

  • W1 (int) – Lower weight threshold.

  • W2 (int) – Upper weight threshold.

  • epsilon (float, optional) – Small value for rounding.

  • verbose (bool, optional) – Whether to print and record the results to a json file. May slow down the solver on larger instances.

  • fp (bool, optional) – Filepath to the results json file.

  • sec (function, optional) – Function that takes the variables x and start vertex s as input and returns a dict called cuts. Each item in cuts is a dict with a ‘left’ key and a ‘right’ key. The constraint added to the model is left <= right.

Returns

List of vertices in the optimal tour.

Return type

list

cltproblem.algorithm.delta(G, S, vars={})

Given a set of nodes S in a graph G, return the set of edges with exactly one endpoint in S and the other endpoint not in S.

Parameters
  • G (nx.Graph) – Input graph.

  • S (set) – Set of nodes.

  • vars (set or dict, optional) – Set containing keys of variables in the IP solver.

Returns

Set of edges with exactly one endpoint in S and one endpoint not in S.

Return type

set

cltproblem.algorithm.edge_disjoint_path_cost(T, t)

Return the cost of the least-cost edge-disjoint pair of paths from s to t.

Parameters
  • T (SuurballeTree) – Initialised Suurballe’s Tree

  • t (node) – Target node

Returns

Total cost of both edge-disjoint paths. If no edge-disjoint paths exist, then return sys.maxsize

Return type

int or float

cltproblem.algorithm.extract_suurballe_edge_disjoint_paths(T, s, t)

Algorithm consists of initialisation and two backward traversals for each edge-disjoint path.

Parameters
  • T (SuurballeTree) – Input tree with d, p, and q initialised using Suurballe’s algorithm. Ensure that all vertices are unmarked to begin.

  • s (node) – Start vertex.

  • t (node) – Target vertex.

Returns

Tuple of length 2. Each tuple is a list with a path from s to t.

Return type

2-tuple of lists

cltproblem.algorithm.init_clc(G, s, W1, W2)

Initialise a Gurobi MIP to solve the CLC problem.

The inputs G, s, W1 and W2 are all stored as attributes of the model. They can be accessed by prepending an underscore, e.g. model._s.

The following objective function and constraints are added to the solver:

\[\begin{split}\begin{align} \label{obj} & \text{Minimise} & & \displaystyle\sum_{e_{ij} \in E(G)}~{x_{ij} \cdot c(e_{ij})} & \\ \label{weight} & \text{Subject to} & & W_1 \leq \displaystyle\sum_{e_{ij} \in E(G)}~{x_{ij} \cdot w(e_{ij})} \leq W_2 & \\ \label{origin} & & & x(\delta(s^\star)) = 2 & \\ \label{degree} & & & \frac{1}{2} x(\delta(i)) \in \{0,1\} & \forall i \in V(G) \\ \label{domain} & & & x_{ij} \in \{0,1\} & \forall e_{ij} \in E(G) \end{align}\end{split}\]

To enforce connectivity of the solution, you will need to add subtour elimination constraints.

Parameters
  • G (Graph) – Input graph. Can be directed or undirected.

  • s (node) – Origin.

  • W1 (int) – Lower weight threshold.

  • W2 (int) – Upper weight threshold.

Returns

Gurobi Model object initialised with constraints.

Return type

Model

cltproblem.algorithm.integral(x, epsilon=0.0001)

Check that every variable in x is integral.

cltproblem.algorithm.is_ancestor(T, ancestor, descendant)

Test if ancestor is the ancestor of descendant.

Parameters
  • T (SuurballeTree) – Tree where each vertex has an attribute ‘pre’ and ‘post’ which denote the preorder number and postorder number of the vertex. Use the preorder and postorder methods to compute these numberings.

  • ancestor (node) – The node we are testing is an ancestor.

  • descendant (node) – The node we are testing is a descendant.

Returns

True if ancestor is the ancestor of descendant, and false otherwise.

Return type

boolean

cltproblem.algorithm.postorder(T, s)

Computes a postorder traversal of the tree. Tree should be an instance of a directed graph. Assigns nodes with the attribute “post”. Returns a list of nodes in postorder.

Parameters
  • T (DiGraph) – Tree.

  • s (node) – Root vertex.

Returns

Vertices in postorder.

Return type

list

cltproblem.algorithm.preorder(T, s, limit=5000)

Computes a preorder traversal of the tree. Tree should be an instance of a directed graph. Assign nodes with the attribute “pre”. Returns a list of nodes in preorder.

Parameters
  • T (DiGraph) – Tree.

  • s (node) – Root vertex.

  • limit (int, optional) – Recursion limit. Default is 5000. Bigger is required for larger datasets.

Returns

Vertices in preorder.

Return type

list

cltproblem.algorithm.sec_callback(model, where)
cltproblem.algorithm.sec_connected_components(vars, s)
cltproblem.algorithm.sec_min_weight_cut(vars, s)
cltproblem.algorithm.suurballe(G, s, weight='weight')

Suurballe’s algorithm for multiple targets. For every vertex in G, finds the least-cost edge-disjoint paths from s to v.

This algorithm is the single-source multiple-target version [ST84]. The original Suurballe algorithm [Suu74] was for single-source single-target.

We have tried to be consistant with the notation presented in the 1984 paper [ST84] to make it easier to follow the code.

Parameters
  • G (DiGraph) – Directed input graph. Must be asymmetric, i.e. if (u,v) is in G then (v,u) is not in G. If your graph is symmetric then use the preprocessing.create_dummies method.

  • s (node) – Source vertex.

  • weight (string, optional) – Name of the edge attribute. Defaults to ‘weight’.

Returns

A vertex for which a disjoint path exists has its label set to true. Each reachable vertex has p, q, and d assigned. The edge-disjoint paths can be reconstructed from p and q.

Return type

SuurballeTree

cltproblem.algorithm.whack_a_mole(m)

Enforce that every subset S that does not contain the origin has at least two edges coming out of it.

\[\begin{align} & \frac{\sum_{i \in S}~y_i}{|S|} \leq x(\delta(S)) & \forall S \subset V(G), 3 \leq |S| \leq n-3, s^\star \notin S \end{align}\]

cltproblem.datastructure module

class cltproblem.datastructure.SuurballeTree(s)

Bases: networkx.classes.digraph.DiGraph

Store the tree and all other datastructures for Suurballe’s algorithm for finding least-cost edge-disjoint paths from s to every vertex in the graph.

s

Root vertex.

Type

node

p

Keys are nodes. The edge (p[w],w) is a non-tree edge. Thus p[w] is not the parent of the node in the shortest path tree. Parent of root is None.

Type

dict

d

Keys are nodes. Weight from s to vertex v.

Type

dict

c

Keys are 2-tuple (u,v) of edges. Values are transformed costs of edges.

Type

dict

children

Keys are nodes. Value is a list of children nodes.

Type

dict

i

Keys are nodes. Value is list of 2-tuples representing edges incident to the node.

Type

dict

l

Keys are nodes. Values are bool. Values indicate if the node has been labelled.

Type

dict

hq

Heap queue data structure.

Type

list

mark

Keys are nodes. Values are bool. Values indicate if the node has been marked.

Type

dict

w

Keys are nodes. Values are the shortest path from s to the node.

Type

dict

q

Keys are nodes. Values are nodes. Let v be a key. Then q[v] is the node that caused (p[v],v) to be processed.

Type

dict

get_node_data(u)

Print out the attributes of a vertex in the tree.

Parameters

u (node) –

cltproblem.experiments module

The codebase for running experiments to show how well the heuristics performs against the relaxations.

class cltproblem.experiments.ConnectivityResult(G, s, W1, W2, tour, time=-1, id=-1, dataset='', weight='weight', cost='cost')

Bases: cltproblem.experiments.Result

class cltproblem.experiments.ContinuousResult(G, s, W1, W2, tour, xhead, time=-1, id=-1, dataset='', weight='weight', cost='cost')

Bases: cltproblem.experiments.Result

Deals with the extra xhead parameter.

to_dict()

Return a dictionary with keys/values for each datapoint.

class cltproblem.experiments.Experiment(thresholds, keys, filepath='', origins=[], dataset=[], pre={}, overwrite=True, id=0, ran_nodes=10, backup=False, optimise={})

Bases: object

Stores the results and parameters of an experiment.

print()

For every result, print a summary of the result.

run(verbose=False)

Run the experiment. If the filepath attribute is set, then results are written to that json file.

Parameters

verbose (bool) – If true, print out the result each time.

write()

Write all the results and settings to a json file.

Parameters

filepath (str) – Write the results to a json file with this path.

write_result(result)

Appends the last added result to the json file.

Parameters

result (Result) –

cltproblem.experiments.KEYS = {'AH': <function adaptive>, 'CR': <function continuous>, 'DjV': <function dejavu>, 'SH': <function suurballe_heuristic>, 'XR': <function connectivity>}

Maps a key to a heuristic or relaxation method.

cltproblem.experiments.NICE = {'AH': 'Adaptive', 'CR': 'Continuous relaxation', 'DjV': 'Deja Vu', 'SH': 'Suurballe', 'XR': 'Connectivity relaxation'}

Maps a key to a nice name for a heuristic or relaxation.

class cltproblem.experiments.Result(G, s, W1, W2, key, tour, time=-1, id=-1, dataset='', weight='weight', cost='cost')

Bases: object

print()

Print out the result.

to_dict()

Return a dictionary with keys/values for each datapoint.

cltproblem.experiments.read(filepath)

Read a json file and return an Experiment object.

cltproblem.heuristic module

Heuristics for the CLT and CLC problems.

cltproblem.heuristic.adaptive(G, s, W1, W2, weight='weight', cost='cost', k=1, optimise='default')

Creates a cycle C which can be repeated. If s is in C, then C is repeated at most k times. If s is not in C, then a path from s to C is computed, and C is repeated at most k-1 times.

Parameters
  • G (Graph) – Asymmetric, directed input graph.

  • s (node) – Start vertex.

  • W1 (int) – Lower weight threshold.

  • W2 (int) – Upper weight threshold.

  • weight (str, optional) – Name of weight edge attribute.

  • cost (str, optional) – Name of cost edge attribute.

  • k (int, optional) – Max number of vertex repetitions.

  • optimise (str, optional) – Method for reducing the number of vertices visited. Options are default, sphere, euclid, manhattan.

Returns

Ordered list of vertices in the tour.

Return type

list

cltproblem.heuristic.dejavu(G, s, W1, W2, weight='weight', cost='cost', k=9223372036854775807, optimise='default')

Designed for the CLT problem when k=infinity. Finds a path to a low cost edge which it repeats with a large multiplicity. Returns to s along the path.

Parameters
  • G (Graph) – Input graph.

  • s (node) – Start vertex.

  • W1 (int) – Lower weight threshold.

  • W2 (int) – Upper weight threshold.

  • weight (str, optional) – Name of weight edge attribute.

  • cost (str, optional) – Name of cost edge attribute.

  • k (int, optional) – Max number of vertex repetitions.

  • optimise (str, optional) – Redundant for this heuristic.

Returns

Ordered list of vertices in the tour.

Return type

list

cltproblem.heuristic.suurballe_heuristic(G, s, W1, W2, weight='weight', cost='cost', k=1, optimise='default')

Find two vertex-disjoint paths starting at the origin and ending at some vertex t such that the total cost is minimised and sum of the weights of the paths is weight feasible. Time complexity is \(\mathcal{O}(n \log n)\).

Parameters
  • G (Graph) – Asymmetric, directed, simple input graph.

  • s (node) – Start vertex.

  • W1 (int) – Lower weight threshold.

  • W2 (int) – Upper weight threshold.

  • weight (str, optional) – Name of weight edge attribute.

  • cost (str, optional) – Name of cost edge attribute.

  • k (int, optional) – Max number of vertex repetitions. Default is 1 (CLC problem).

  • optimise (str, optional) – Redundant for this heuristic.

Returns

  • list – Ordered list of vertices in first pair of vertex-disjoint paths.

  • list – Order list of vertices in second pair of vertex-disjoint paths.

cltproblem.preprocess module

Pre-processing algorithms to reduce the size of the input graph or converting between different types of graph.

class cltproblem.preprocess.Conversion

Bases: object

Converting between different types of networkx Graphs.

Different datasets mean we need to convert between different types of graphs because algorithms often assume a certain type of graph as input. This class stores different types of graphs for the same dataset and provides methods for converting between them. Specifically, we give methods for converting between MultiGraphs to simple Graphs, from MultiDiGraphs to DiGraphs, and from DiGraphs to asymmetric DiGraphs. See the networkx and osmnx packages for converting between directed and undirected graphs.

multi

A networkx multigraph.

Type

MultiGraph

simple

A networkx simple graph.

Type

Graph

di

A networkx DiGraph.

Type

DiGraph

multidi

A networkx MultiDiGraph.

Type

MultiDiGraph

asymm

A network DiGraph that is asymmetric.

Type

DiGraph

di_fakes

A dictionary mapping a fake node f in the simple graph to a 3-tuple (u,v,k) in the multi graph. There are two edges (u,f) and (f,v) in the simple graph connecting f. (u,w,k) is a multi-edge between u and v with key k.

Type

dict

simple_fakes

Fake nodes in the undirected simple graph to edges in the undirected multi graph

Type

dict

asymm_mid

Max node id of the asymmetric directed graph.

Type

int

multidi_mid

Max node id of multi directed graph.

Type

int

di_mid

Max node id of directed simple graph.

Type

int

multi_mid

Max node id of the multi undirected graph.

Type

int

Examples

Lets say you create the following graph:

>>> G = nx.MultiDiGraph()
>>> nx.add_path(G, [0,1,2,3,4])
>>> nx.add_edge(2,3)
>>> nx.add_edge(3,4)
>>> C = Conversion()
>>> H = C.multidi2di(G)

Calling multidi2di will create a simple graph. You can access the original MultiDiGraph G by

>>> C.multidi

And you can access the created simple directed graph by

>>> C.di

The multi_to_simple function will create some ``fake nodes’‘. The attribute fakes stores the created fake nodes as keys and stores the original (u,v,k) multi-edges as a value.

>>> print(C.di_fakes)
{5:(2,3,1),6:(3,4,1)}
get_di_fake_nodes()

Get nodes that were created in the conversion processes.

Returns

Set of nodes.

Return type

set

get_di_path_from_asymm(P)

Given a path in the asymmetric graph, return the path in the directed graph.

Parameters

P (list) – List of nodes in the path from the asymmetric graph.

Returns

List of nodes in the path in the di graph.

Return type

list

get_multi_path_from_asymm(P)

Given a list of nodes in the asymmetric directed graph, return a list of nodes from the undirected multi graph.

get_multi_path_from_simple(P)

Given a path in the simple undirected graph, return the path from the multi undirected graph.

Case = [f,0,1,2,3,4,f] Case = [f,0,1,2,3,4] Case = [0,1,2,3,4,f]

Parameters

P (list) – List of nodes in the path from the simple graph.

Returns

List of nodes in the path in the multi graph.

Return type

list

get_multidi_path_from_di(P)

Given a list of vertices in a path from the simple (di) graph, return the list of edges in the multi-graph in the path.

Parameters

P (list) – List of nodes in the path.

Returns

List of edges in the path. Each item in the list is a three tuple.

Return type

list

multi2asymm(G=None)

Given undirected graph, return a directed, asymmetric graph.

multi2simple(G=None)

Given an undirected multigraph, return an undirected simple graph.

Parameters

G (nx.MultiGraph, optional) – Undirected graph with multi edges.

Returns

Undirected, simple graph with no multi edges.

Return type

nx.Graph

Notes

Currently, we remove self loops rather than processing them.

multidi2di(G=None)

Given a directed graph with multiple arcs between two vertices, return a simple graph.

If there exist two edges e1,e2 between two vertices u,v, then create a third vertex w. e1 remains unchanged, and e2 is split in half such that there exists an edge u,w and w,v. The weight, cost and gamma of e1 and (u,w) remain unchanged. The weight, cost and gamma of (w,v) are all set to zero.

G is assigned to the multidi attribute. The returned DiGraph is assigned to the di attribute. The fakes attribute is initialised.

Parameters

G (nx.MultiDiGraph, optional) – Directed input graph that may have multiple edges between two vertices. If not specified then we use the multidi attribute as G.

Returns

Simple directed graph.

Return type

nx.DiGraph

multidi2multi(G=None)

Given a multidi graph, return a multi undirected graph.

simple2asymm(G=None, weight='weight', cost='cost', cost_of_split=0, weight_of_split=0)

Given a directed, (possibly) symmetric, simple graph, return a directed asymmetric simple graph.

This tranformation is described by [ST84]. Each vertex from the directed graph is split in two. E.g. node u is split into u1 and u2. A directed edge is create between u1 and u2. The cost and weight of this edge is defined by the parameters cost_of_split and weight_of_split.

This method will also assign the asymm_mid and di_mid attributes.

Parameters
  • G (nx.Graph, nx.DiGraph, optional) – Input graph that may be directed. If a DiGraph is specified, then G is assigned to the C.di attribute. If no graph is specified, the C.di graph is used.

  • weight (str, optional) – Name of weight edge attribute of the graph.

  • cost (str, optional) – Name of cost edge attribute of the graph.

  • weight_of_split (int, optional) – Weight of arc between u1 and u2.

  • cost_of_split (int, optional) – Cost of arc between u1 and u2.

Returns

Asymmetric directed graph.

Return type

DiGraph

exception cltproblem.preprocess.ConversionException(message)

Bases: Exception

Raised when there is a problem converting between graphs.

cltproblem.preprocess.leaves(G, s, limit=1000)

Given graph G, find all vertices with degree 1 and remove them from the graph. If removing a vertex creates another vertex with degree 1, also remove this vertex.

If the graph is disconnected, or become disconnected as a result of removing nodes, then not all leaves are guaranteed to be removed from the graph. In this case, it is suggested to use the remove_disconnected_components function.

Parameters
  • G (Graph) – Input graph.

  • s (node) – Origin

  • limit (int, optional) – Recursion limit.

Returns

Graph where all vertices have degree 2 or more.

Return type

Graph

cltproblem.preprocess.post_clc(C, edges, s)

Return a tour after post-processing the CLC result.

cltproblem.preprocess.post_suurballe(C, P1, P2, s)

Given two vertex disjoint paths in the asymm graph, return the tour from the multidi graph.

cltproblem.preprocess.pre_clc(G, s, W1, W2)
cltproblem.preprocess.pre_suurballe(G, s, W2)

Executes all pre-processing methods for Suurballe’s heuristic.

  • Removes vertices that are unreachable from s within weight W2/2.

  • Removes all remaining leaves.

  • Converts the multidi graph into a simple di graph.

  • Converts the di graph into an asymmetric di graph.

Parameters
  • G (nx.MultiDiGraph) – Input graph.

  • s (node) – Start vertex.

  • W2 (int) – Upper weight threshold.

Returns

A Conversion class with the multidi, di and asymm attributes set. The pre-processed graph you should input to Suurballe’s heuristic is the asymm graph.

Return type

Conversion

Notes

The start vertex will need to be changed using the util.create_split_target() function when passed to the heuristic.suurballes_heuristic() function. This is because the graph has gone through the “splitting” procedure (see explanation in simple2asymm() method in the Conversion class.)

cltproblem.preprocess.suurballe_prune(G, s, threshold, weight='weight')

If the total weight of shortest pair of vertex-disjoint paths from s to a vertex v in G is greater than the threshold, then remove v from the graph.

Parameters
  • G (nx.DiGraph) – Directed input graph.

  • s (node) – Vertex in G.

  • threshold (int or float) – Cutoff threshold.

  • weight (string, optional) – Name of edge attribute.

Returns

New graph containing only vertices that are reachable from s within given weight threshold.

Return type

nx.DiGraph

cltproblem.preprocess.unreachable(G, s, threshold, weight='weight')

Remove vertices from G which are unreachable within a given weight threshold from a vertex s.

Parameters
  • G (Graph) – Input graph. Can be directed or undirected.

  • s (node) – Vertex in G.

  • threshold (int or float) – Cutoff threshold.

  • weight (string, optional) – Name of edge attribute.

Returns

New graph containing only vertices that are reachable from s within given weight threshold.

Return type

Graph

cltproblem.relaxation module

Relaxations for the Constrained Least-cost Tour problem. These relaxations are a lower bound on the optimal solution to the CLT prolem. Thus we can compare how close our heuristics are to the lower bounds to measure the performance of the heuristics.

cltproblem.relaxation.connectivity(G, s, W1, W2, weight='weight', cost='cost', optimise='default', count=0, auth={}, time_limit=9223372036854775807)

Find a group of disjoint cycles, one of which contains s, such that the total weight of the solution is at least W1 and at most W2. The total cost of the cycles should be minimised.

This algorithm is a relaxation for the constrained least-cost cycle problem.

Parameters
  • G (Graph) – Input graph.

  • s (string) – Start vertex.

  • W1 (int) – Lower weight threshold.

  • W2 (int) – Upper weight threshold.

  • weight (str, optional) – Name of weight attribute on edge. Default is ‘weight’.

  • cost (str, optional) – Name of cost attribute on edge. Default is ‘cost’.

  • count (int, optional) – Number of times the solution has been tried so far. Used if DoCloud drops out.

  • auth (dict, optional) –

    The authentication values if docloud is used. If not specified then model is solved optimally, i.e. using your locally installed CPLEX Optimisation Studio.

    Must have “url” and “client_id” fields set if solving on cloud.

  • optimise (str, optional) – Redundant for this relaxation.

Returns

  • eids (list) – List of edge ids in the solution.

  • lower_bound (float) – The lower bound of the solution. If the solution is optimal, lower_bound is the value of the optimal solution.

cltproblem.relaxation.continuous(G, s, W1, W2, weight='weight', cost='cost', optimise='default')

The continuous relaxtion for the constraint least-cost closed walk problem. The algorithm returns a list of nodes The algorithm can be summerised in 3 steps:

  1. Find the shortest (least polluted) path to every node reachable within weight D/2.

  2. For every node found, find xhead and compute the cost of the path.

  3. Compare all paths in 2. and return the path with least cost.

Parameters
  • G (Graph) – Input graph.

  • s (node) – Start and end vertex.

  • W1 (int) – Lower weight threshold.

  • W2 (int) – Upper weight threshold.

  • weight (str, optional) – Name of weight attribute on edge. Default is ‘weight’.

  • cost (str, optional) – Name of cost attribute on edge. Default is ‘cost’.

  • optimise (str, optional) – Redundant for this relaxation.

Returns

List of nodes in walk. The final edge is the head.

Return type

list

cltproblem.relaxation.initialise_model(G, s, W1, W2, weight='weight', cost='cost', lower_bound=0, time_limit=9223372036854775807)

Initialises a CPLEX model for connectivity relaxation.

Parameters
  • G (Graph) – Input graph.

  • s (node) – Starting node.

  • W1 (int) – Lower weight threshold.

  • W2 (int) – Upper weight threshold.

  • weight (str, optional) – Name of weight attribute on edge. Default is ‘weight’.

  • cost (str, optional) – Name of cost attribute on edge. Default is ‘cost’.

  • lower_bound (float, optional) – If you have previously derived a lower bound for the solution (e.g. using the continuous relaxation) then you can add this lower bound here.

  • time_limit (float, optional) – Max time in seconds.

Returns

The initialised model.

Return type

CpoModel

cltproblem.settings module

Settings for API keys, filepaths and experiments.

cltproblem.settings.get_settings(filepath='settings.json', default=True)

Load the settings from the json file.

Parameters
  • filepath (str, optional) – Filepath to the settings json file. Must set default to False if you use this option.

  • default (bool) – Use the default location for the settings file.

Returns

Dictionary containing the settings.

Return type

dict

cltproblem.util module

Provides some generally useful functions for extracting walks, manipulating graphs and loading the datasets.

The splitting procedure

Some of the methods in this module are used to create an asymmetric directed graph from an undirected graph. For example, given a vertex u, the create_split_source() and create_split_target() will create two vertex IDs which could then be added to a directed graph.

cltproblem.util.argmin(dictionary)

Given a dictionary, return the key of the item with least value.

Parameters

dictionary (dict) –

Returns

Key of item with least value.

Return type

key

cltproblem.util.compute_entropy(G, u, M, sight=1, max_entropy=True)

The entropy of a vertex \(v\) is defined by:

\[\mathcal{E}(v) = \sum_{c=1}^{C}~{- P_c \log P_c}\]

where \(C\) is the number of classes and \(P_c\) is the probability of class \(c\) appearing in a grid centered on vertex \(v\). The size of the grid is \(2 \cdot \texttt{sight} + 1\).

Parameters
  • G (Graph) – Input graph.

  • u (node) – The vertex to compute entropy for.

  • M (np.matrix) – Numpy matrix containing the classes of all vertices.

  • sight (int, optional) – The distance the agent can see.

  • max_entropy (bool, optional) – True if the agent need to seek areas of high entropy. False otherwise.

Returns

Entropy.

Return type

float

cltproblem.util.compute_xhead(G, W, x, weight='weight')

Finds how many times the last edge in the walk is repeated. This method is intended for the continuous relaxation.

Parameters
  • G (nx.Graph) – Input graph.

  • W (list) – List of vertices in the walk.

  • x (int) – Target weight.

Returns

Number of times the head is repeated.

Return type

float

cltproblem.util.create_split_source(u, max_id=None)

Given a vertex, create the split-source vertex ID. If the vertex ID is a string, then “i-” is prepended. If the vertex ID is an int, then max_id + u is returned.

Parameters
  • u (node) – A vertex in the directed graph.

  • max_id (int, optional) – The id of the node with the largest id in the original undirected graph.

Returns

If u is an int, then a new integer vertex ID is returned. Else if u is a string, then a new string vertex ID is returned.

Return type

int OR str

cltproblem.util.create_split_target(u, max_id=None)

Given a vertex, create the split-target vertex ID. If the vertex ID is a string, then “ii-” is prepended. If the vertex ID is an int, then 2*max_id + u is returned.

Parameters
  • u (node) – A vertex in the directed graph.

  • max_id (int, optional) – The id of the node with the largest id in the original undirected graph.

Returns

If u is an int, then a new integer vertex ID is returned. Else if u is a string, then a new string vertex ID is returned.

Return type

int, str

cltproblem.util.create_walk_dictionary(G, s, W1, W2, T, time, algorithm, id=None, dataset=None, xhead=None)

Store result of an algorithm which computes a solution to the CLT problem in a dictionary.

Parameters
  • G (Graph) – Input graph.

  • s (node) – Origin.

  • W1 (int) – Lower weight threshold.

  • W2 (int) – Upper weight threshold.

  • T (list) – List of vertices in the tour.

  • time (float) – Time taken to compute result.

  • algorithm (str) – Name of the algorithm that computed the tour.

  • id (int, optional) – ID of experiment

  • dataset (str, optional) – Name of the dataset, e.g. pollution.

  • xhead (float, optional) – Number of times the head of the walk is repeated. Necessary if the algorithm is the continuous relaxation.

Returns

Returns a dictionary with key-values summerising the tour. The value for the key ‘walk’ is a list.

Return type

dict

cltproblem.util.default_check(G, u, v, T, min_uv_distance=0)

Always returns true.

Parameters
  • G (Graph) – Input graph.

  • u (node) –

  • v (node) –

  • T (int) – u and v must at most this threshold distance apart.

  • min_uv_distance (int) – u and v must be at least this distance apart.

Returns

Always true.

Return type

bool

cltproblem.util.disjoint_tours(G, edges)

Given a graph and a list of edges, return a list of lists of vertices where each list is a disjoint tour.

Parameters
  • G (Graph) – Input graph.

  • edges (list) – List of 2-tuples. Each 2-tuple is an edge (u,v).

Returns

Each list is a disjoint tour of vertices.

Return type

list of lists

cltproblem.util.distance_on_unit_sphere(lat1, long1, lat2, long2)

Given two lat/long points on a globe, compute the distance in meters between the two points.

See John Cook’s blog.

Parameters
  • lat1 (float) –

  • long1 (float) –

  • lat2 (float) –

  • long2 (float) –

Returns

Distance between two points in meters.

Return type

float

cltproblem.util.edge_multiplicity(W, xhead=None)

Given a walk of vertices, count the number of times each edge (u,v) is repeated.

Parameters
  • W (list) – List of vertices in the walk.

  • xhead (float, optional) – If the walk is a solution to the continuous relaxation, then xhead is number of times the last edge in the walk is repeated. Every other edge is repeated exactly twice.

Returns

Dictionary containing the number of times each edge is repeated. The keys are a 2-tuple of nodes: (u,v). The values (multiplicity of edge) are integers or a float if xhead is set.

Return type

dict

Raises
  • TypeError – If W is not a list or if xhead is not a float, int or Decimal.

  • ValueError – If xhead is negative.

cltproblem.util.euclidean_check(G, u, v, T, min_uv_distance=0)

Return true if the Euclidean distance between u and v + shortest path distance between u and origin + shortest path distance between v and origin is less than T.

Parameters
  • G (Graph) – Input graph.

  • u (node) –

  • v (node) –

  • T (int) – u and v must at most this threshold distance apart.

  • min_uv_distance (int) – u and v must be at least this distance apart.

Returns

True if nodes meet threshold constraints.

Return type

bool

cltproblem.util.extract_dummy_path(p, max_id=None)

Given a path of split-sources and split-targets, return the vertices from the pre-split graph that are in the path.

Parameters
  • p (list) – List of split-source and split-target nodes.

  • max_id (int, optional) – The id of the node with the largest id in the original undirected graph.

Returns

List of vertices from the original graph.

Return type

list

cltproblem.util.extract_split_node_id(u, max_id=None)

Given a vertex u which is either a split-source or split-target, return the original ID of the vertex before it was split.

Parameters
  • u (node) – A vertex in the directed graph.

  • max_id (int, optional) – The id of the node with the largest id in the original undirected graph.

Returns

If u is an int, then a new integer vertex ID is returned. Else if u is a string, then a new string vertex ID is returned.

Return type

int, str

cltproblem.util.feasible(total_weight, W1, W2)
Parameters
  • total_weight (int) – Weight of solution.

  • W1 (int) – Lower weight threshold.

  • W2 (int) – Upper weight threshold.

Returns

True if W1 <= total_weight <= W2. False otherwise.

Return type

bool

cltproblem.util.find_max_id(G)

If node ids are integers then return the node with the largest id.

Parameters

G (Graph) – Input graph with integer node ids.

Returns

Largest id.

Return type

int

cltproblem.util.get_benchmark_id(x, y, width)

Get the id of a vertex given x-coord and y-coord and the width (same as height) of the benchmark map.

Parameters
  • x (int) – x co-ordinate of grid square.

  • y (int) – y co-ordinate of grid square.

  • width (int) – Width of the pathfinding benchmark graph.

Returns

Node id.

Return type

int

cltproblem.util.get_boundaries(gdf)

Given a geodataframe, return the bounding box.

Parameters

gdf (GeoDataFrame) – Must have a geometry column.

Returns

List of 4 floats: [xmin, ymin, xmax, ymax]. If using lat/lon, then x is lon and y is lat.

Return type

list

cltproblem.util.get_class(k)

Given a string, return the class it belongs to (0-4).

Parameters

k (str) – Type of ground from: . S T W @

Returns

0 : . (normal ground)

1 : S (shallow water)

2 : T (trees)

3 : W (water)

4 : @ (out-of-bounds)

Return type

int

cltproblem.util.get_input_graph(vars)

Given the variables in the MIP, create a networkx graph.

Parameters

vars (list) – List of 2-tuples representing edges in the graph.

Returns

Undirected graph.

Return type

nx.Graph

cltproblem.util.get_pollution_gdf(fp='', datadir='datasets/', filename='pollution.shp')

Get a GeoDataFrame containing pollution predictions across London.

Parameters
  • fp (str, optional) – Full filepath to the shape file.

  • datadir (str, optional) – Name of the data directory.

  • fp – Filename of the shapefile.

Returns

Polygons each with a mean pollution and variance.

Return type

gpd.GeoDataFrame

cltproblem.util.get_solution_edges(x, epsilon=0.0001, varname='x')

Given some Gurobi Vars indexed by a two-tuple (u,v) return all edges (u,v) which have a value greater than epsilon.

Parameters
  • x (dict) – Keys are (u,v) edges. Values are Gurobi Vars.

  • epsilon (float, optional) – Small value to correct for rounding errors. Default is 0.0001.

  • varname (str, optional) – Name of the Gurobi variable. Default is ‘x’.

Returns

Edges who’s value is greater than epsilon. List of 2-tuples (u,v).

Return type

list

cltproblem.util.get_support_graph(x)

Create a networkx graph over the variables x.

Parameters

x (list) – List of 2-tuples representing edges that have been selected in the solution to the MIP.

Returns

Undirected graph over edges in the solution.

Return type

nx.Graph

cltproblem.util.is_split_pair(u, v, max_id=None)

Check if two distinct vertices in the DiGraph were originally the same vertex in the undirected input graph.

Parameters
  • u (node) – Node should be a string or int.

  • v (node) – Node should be a string or int.

  • max_id (int, optional) – The id of the node with the largest id in the original undirected graph.

Returns

True if the vertices were the same vertex in the original graph. False otherwise.

Return type

bool

cltproblem.util.is_split_source(u, max_id=None)

Check if u is a split-source vertex

Parameters
  • u (node) – A vertex in the directed graph.

  • max_id (int, optional) – The id of the node with the largest id in the original undirected graph.

Returns

True if vertex is split-source, false otherwise.

Return type

bool

cltproblem.util.is_split_target(u, max_id=None)

Check if u is a split-target vertex.

Parameters
  • u (node) – A vertex in the directed graph.

  • max_id (int, optional) – The id of the node with the largest id in the original undirected graph.

Returns

True if vertex is split-target, false otherwise.

Return type

bool

cltproblem.util.load_dataset(ds, fp=None)

Load the dataset and return a graph.

Parameters

ds (str) – Key for dataset. Options are ‘pollution’, ‘crucible’.

Returns

Graph for dataset. The graph will have a cost function and a weight function on the edges. If crucible, then nx.Graph is returned. If pollution, then nx.MultiDiGraph is returned.

Return type

Graph

cltproblem.util.log_model(m, fp='log.json', write=False, epsilon=0.0001)

Log current details of a model.

Parameters

m (Model) – A Gurobi model that has already been optimized.

Returns

Dictionary containing log.

Return type

dict

cltproblem.util.manhattan_check(G, u, v, T, min_uv_distance=0)

Return true if the manhattan distance between u and v + shortest path distance between u and origin + shortest path distance between v and origin is less than T.

Parameters
  • G (Graph) – Input graph.

  • u (node) –

  • v (node) –

  • T (int) – u and v must at most this threshold distance apart.

  • min_uv_distance (int) – u and v must be at least this distance apart.

Returns

True if nodes meet threshold constraints.

Return type

bool

cltproblem.util.order_path(edges, s)

Given an unordered list of edges (u,v) and a start vertex s, return an ordered list of vertices in the path starting with s.

Parameters
  • edges (list) – Each item in the list is a 2-tuple of vertices (u,v).

  • s (node) – Start vertex of path.

Returns

Ordered list of vertices.

Return type

list

cltproblem.util.pathfinding_benchmark_graph(fp='', datadir='datasets/', filename='thecrucible.map')

Load a graph from the 2D Pathfinding Benchmarks dataset by Nathan Sturtevant [Stu12].

There are five types of ground in the maps: Normal ground (.), shallow water (S), trees (T), water (W) and out-of-bounds(@).

A vertex is created for every square in the grid. An edge exists between two adjacent (left,right,up,down) squares of normal ground.

The weight of every edge is one. The cost of an edge is defined by the mean entropy of the vertices it is connected to:

\[c(e_{u,v}) = 1 - \frac{1}{2}(\mathcal{E}(u) + \mathcal{E}(v))\]
Parameters
  • fp (str, optional) – Full filepath to the shape file.

  • datadir (str, optional) – Name of the data directory.

  • fp – Name of the map file.

Returns

An undirected graph.

Return type

Graph

cltproblem.util.pollution_gdf(fp='', datadir='datasets/', filename='pollution.shp')

Return a geodataframe with pollution values.

Parameters
  • fp (str, optional) – Full filepath to the shape file.

  • datadir (str, optional) – Name of the data directory.

  • fp – Filename of the shapefile.

Returns

Contains a geometry column and a value column. The value column contains the pollution values.

Return type

GeoDataFrame

cltproblem.util.pollution_graph(fp='', datadir='datasets/', filename='pollution.shp')

Load the pollution dataset graph.

Gets the bounding box of the air quality predictions. Queries osmnx to get all vertices and edges within the bounding box. See get_boundaries() method for more details about the bounding box.

Parameters
  • fp (str, optional) – Full filepath to the shape file.

  • datadir (str, optional) – Name of the data directory.

  • fp – Filename of the shapefile.

Returns

A multi di graph returned from osmnx.

Return type

nx.MultiDiGraph

cltproblem.util.random_nodes(G, k)

Choose k random nodes from G

Parameters
  • G (Graph) – Input graph.

  • k (int) – Number of random nodes to choose.

Returns

Set of k randomly chosen nodes.

Return type

Set

cltproblem.util.reorder(T, s)

Given a tour T and a start vertex s, reorder the tour such that the start vertex is at the start and end of the tour.

Parameters
  • T (list) – List of nodes.

  • s (node) – Start and end vertex of the tour.

Returns

New tour. First and last vertex in the list is the start vertex.

Return type

list

cltproblem.util.repetitive_index(G, T, xhead=None)

Compute the repetitive index (\(\rho\)) of a tour T in G. If \(\rho \to 1\), then the tour consists of a single edge repeated many times, else if \(\rho \to -1\) then the tour is a simple cycle. \(\rho\) is defined by:

\[\rho = \frac{\underset{e \in E(\mathcal{T})}{\max} \left\{ \phi(\mathcal{T}, e)\right\} - |E(\mathcal{T})|}{\displaystyle\sum_{e \in E(\mathcal{T})}~{\phi(\mathcal{T}, e)}}\]

where \(E(\mathcal{T}) \subset E(G)\) is the set of edges in tour \(\mathcal{T}\) and \(\phi(\mathcal{T}, e)\) is the number of times edge \(e\) is repeated in tour \(\mathcal{T}\).

Parameters
  • G (Graph) – Input graph.

  • T (list) – List of nodes denoting a tour.

  • xhead (float, optional) –

Returns

Repetitive index between -1 and 1.

Return type

float

cltproblem.util.sphere_check(G, u, v, T, min_uv_distance=0)

The great sphere distance between u and v + shortest path distance between u and origin + shortest path distance between v and origin is less than T.

Parameters
  • G (Graph) – Input graph.

  • u (node) –

  • v (node) –

  • T (int) – u and v must at most this threshold distance apart.

  • min_uv_distance (int) – u and v must be at least this distance apart.

Returns

True if nodes meet threshold constraints.

Return type

bool

cltproblem.util.sum_disconnected_walks(G, walks, weight='weight')

In the connectivity relaxation, there exist disconnected components. To find the weight/cost of the components, we must compute each component seperately.

Parameters
  • G (Graph) – Input graph.

  • walks (list of lists) – Each list is a connected walk in the graph.

Returns

Sum of weights of each disconnected walk.

Return type

int, float

See also

total()

cltproblem.util.total(G, W, weight='weight', xhead=None)

Find the total weight of a walk W in the graph G.

Parameters
  • G (Graph) – Input graph.

  • W (list) – List of vertices in the walk.

  • weight (str, optional) – Name of edge attribute to sum weight for. Default is ‘weight’.

  • xhead (float, optional) – Number of times last edge in the walk is repeated. Only used for continuous relaxation. Default is None.

Returns

Total weight of walk.

Return type

int or float

cltproblem.util.update_cost(G, pdf, edf=None, prune=True, keyname='key')

Update the cost of edges in G from a GeoDataFrame gdf.

Parameters
  • G (Graph) – Input graph. Must have a geometry attribute on the edges.

  • pdf (GeoDataFrame) – Must contain geometry column and value column.

  • edf (GeoDataFrame, optional) – If an edge GeoDataFrame of G is already available, pass it to this parameter. Otherwise a new GeoDataFrame will be created from G.

  • prune (bool) – Whether to remove edges and nodes that do not intersect the area defined by pdf.

Returns

Graph with updated cost attribute. If prune is set to true, then a new graph is created and returned. If prune is false, then the cost function of the original graph is changed and the original graph is returned.

Return type

Graph

cltproblem.visualise module

cltproblem.visualise.COLORS = {'A': 'red', 'C': 'magenta', 'N': 'orange', 'R': 'blue', 'S': 'purple', 'X': 'black'}

Colour used for each heuristic.

cltproblem.visualise.DOTS = {'A': '^', 'C': '^', 'N': 'x', 'R': 'x', 'S': 'o', 'X': 'o'}

Type of dot used for each heuristic.

cltproblem.visualise.LINES = {'A': '-.', 'C': '--', 'N': '--', 'R': ':', 'S': '-.', 'X': '--'}

Type of line used for each heuristic.

cltproblem.visualise.get_tight_box(X, Y, W2, big_box)

Get a tight box around an x,y co-ordinate. The size of the box is determined by the upper weight threshold W2.

Parameters
  • X (float) – x co-ordinate

  • Y (float) – y co-ordinate

  • W2 (int) – Upper weight threshold in meters.

  • big_box (list) – The boundaries of the box containing all edges and vertices. See util.get_boundaries() to get this box given a GeoDataFrame.

Returns

Bounding box for the tight box. List of 4 floats: [xmin, ymin, xmax, ymax]. If using lat/lon, then x is lon and y is lat.

Return type

list

cltproblem.visualise.num_constrs(constrs)

Plot the number of constraints in the model.

cltproblem.visualise.objective(obj)

Plot the value of the objective function over iterations.

cltproblem.visualise.tour(G, edges, pgdf, egdf=None, ax=None, ngdf=None, s=None, xlim=[], ylim=[], edge_color='white', tour_color='white', edge_alpha=0.3, node_tour_color='red', vmin=0, vmax=80, source='source', target='target')

Show a tour T in graph G for the pollution prediction contained in the GeoDataFrame gdf.

Parameters
  • G (Graph) – Input graph. Must have a geometry attribute.

  • edges (list) – A list of 3-tuples (u,v,k) representing edges in the tour.

  • gdf (GeoDataFrame) – Contains a geometry column and value column.

  • s (node, optional) – The start vertex.

  • s_geom (Point, optional) – Geometry of the start vertex. Must be a point.

  • xlim (list) – List of ints or floats. Length two.

  • ylim (list) – List of ints or floats. Length two.

cltproblem.visualise.viz_disjoint_tours(C, pdf, tour_list, dir='', save=False, filename='sec')