AI-Tools: Graph search¶
Finding a way between towns¶
To actually solve the navigation problem, two things are needed:
Knowledge and
Algorithm
For human beings, a map is the model of the World, which include the knowledge needed for navigation. The ment for navigation shows the location of the cities and the roads connecting them. The map is printed in scale, and therefore the distances between the cities can be measured from the map.
A human navigator have learned how to use this data model for finding the best routes, estimate the distances, and selecting optimal roads. We are so used to using maps (at least we were before starting to use computers for navigation) that we hardly think that we are carrying out a navigation algorithm in our heads when selecting the optimal roads. To be able to implement an artificial agent for navigation, it is usefull to think how a humang agent works, and then try to implement it with a computer. How would you explain how you navigate?
Solution using Networkx graph library¶
The previous implementations have shown approaches for implementing graph searching. But there are also many libraries with ready made implementations of various graph search algorithms. Networkx is one of the most often used graph libraries in Python.
I have collected some additional information about the towns between Vaasa and Vimpeli, for example the coordinates of the towns, which can be used for calculating the direct line distances between them to define a heuristic function for guiding the graph search for most promising directions.
Name Northing Easting
0 Akaa 61.167145977 23.865888764
1 Alajärvi 63.001273254 23.817522812
2 Alavieska 64.166646246 24.307213479
3 Alavus 62.586161997 23.616600589
4 Asikkala 61.172512361 25.549142484
5 Askola 60.530210718 25.598880612
...
The following function calculates the coordinates of the towns based on their longitudes and latitudes.
# This code calculates the coordinates of the towns. It is only needed for nice plotting
import pandas as pd
from math import sin, pi
Coordinates=pd.read_csv('data/coordinates.csv')
def getCoordinates(Location, Theta=63.092589):
Ps=Coordinates[Coordinates.Name==Location]
if len(Ps)==0:
print("Now location %s found!" % Location)
return None
P=Ps.iloc[0]
# Earth radius / km
R=6371.0 # Average
Rp=6356.8 # Polar
Re=6378.1 # Equatorial
#
# Earth radius in current Northing
#Theta=P.Northing/180*pi
#Theta=63.092589
r=sin(pi/2-Theta/180*pi)*Re
#
y = (P.Northing/180*pi)*Rp
x = (P.Easting/180*pi)*r
return (x,y)
print(getCoordinates('Vaasa'))
print(getCoordinates('Seinäjoki'))
(1088.9504471170537, 6999.93912935637)
(1150.7335047599795, 6965.997532375281)
Encoding the information¶
The biggest work when using the Networkx all other similar libraries, is to encode the information for the library. In the code below, the libraries are first imported into the workspace, and empty graph is created.
Then the nodes are added from the list of towns using for-loop. The coordinates of the towns are added in each node (the pos attribute) to be able to plot the graph in actual scale.
The roads between the towns are added as edges. The lengths of the roads are added as weigths of edges.
The networx library includes convenient funtions for plotting the graph. The weights of the edges can be plotted in the edges and the pos-attribute is now used to place the towns in a coordinate system.
# This code generates the graph
import networkx as nx
import matplotlib.pyplot as plt
# Create an empty directed graph
M=nx.DiGraph()
# List of towns added as nodes
Towns=['Vaasa', 'Laihia', 'Vöyri', 'Ilmajoki', 'Ylistaro', 'Seinäjoki',
'Vöyri', 'Lapua', 'Jepua', 'Alajärvi', 'Kyyjärvi', 'Kuortane',
'Kauhava', 'Lappajärvi', 'Evijärvi', 'Vimpeli']
# Add the nodes and store the coordinates of the towns as well, for nice plotting
for town in Towns:
M.add_node(town, pos=getCoordinates(town))
# Add the edges, and their weights = distances
M.add_edge('Vaasa', 'Laihia', weight=26)
M.add_edge('Vaasa', 'Vöyri', weight=32)
M.add_edge('Laihia', 'Ilmajoki', weight=45)
M.add_edge('Laihia', 'Ylistaro', weight=29)
M.add_edge('Vöyri', 'Jepua', weight=42)
M.add_edge('Vöyri', 'Kauhava', weight=56)
M.add_edge('Ilmajoki', 'Seinäjoki', weight=18)
M.add_edge('Ylistaro', 'Seinäjoki', weight=27)
M.add_edge('Ylistaro', 'Lapua', weight=28)
M.add_edge('Seinäjoki', 'Lapua', weight=26)
M.add_edge('Vöyri', 'Kauhava', weight=45)
M.add_edge('Vöyri', 'Jepua', weight=42)
M.add_edge('Lapua', 'Kauhava', weight=18)
M.add_edge('Lapua', 'Alajärvi', weight=46)
M.add_edge('Jepua', 'Kauhava', weight=56)
M.add_edge('Alajärvi', 'Kyyjärvi', weight=36)
M.add_edge('Alajärvi', 'Vimpeli', weight=22)
M.add_edge('Alajärvi', 'Kuortane', weight=40)
M.add_edge('Kauhava', 'Lappajärvi', weight=38)
M.add_edge('Kauhava', 'Evijärvi', weight=44)
M.add_edge('Lappajärvi', 'Vimpeli', weight=23)
M.add_edge('Evijärvi', 'Vimpeli', weight=32)
# Plot the graph nicely in scale
fig=plt.figure(figsize=(8,8))
pos=nx.get_node_attributes(M,'pos')
nx.draw_networkx(M, pos=pos, node_color='orange')
labels = nx.get_edge_attributes(M,'weight')
nx.draw_networkx_edge_labels(M,pos,edge_labels=labels)
print("Graph is ready")
Graph is ready
Asking questions from the graph¶
list(M.neighbors('Kauhava'))
['Lappajärvi', 'Evijärvi']
#nx.depth_first_search.dfs_successors(M,'Vaasa')
for node in nx.breadth_first_search.bfs_successors(M, 'Vaasa'):
print(node)
('Vaasa', ['Laihia', 'Vöyri'])
('Laihia', ['Ilmajoki', 'Ylistaro'])
('Vöyri', ['Jepua', 'Kauhava'])
('Ilmajoki', ['Seinäjoki'])
('Ylistaro', ['Lapua'])
('Kauhava', ['Lappajärvi', 'Evijärvi'])
('Lapua', ['Alajärvi'])
('Lappajärvi', ['Vimpeli'])
('Alajärvi', ['Kyyjärvi', 'Kuortane'])
The breadth first ordering of the towns can be used to find a path from Vaasa to Vimpeli, by starting from Vimpeli to the parent Lappajärvi, and so on the reverse order up to Vaasa.
Vaasa -> Vöyri -> Kauhava -> Lappajärvi -> Vimpeli
The path is unique in this ordering but not necessary optimal, since the weights are not taken into account.
A* algorithm¶
More optimal path minimizing the weights can be found using A* algorithm. If no heuristic function is given, the algorith is using Dijkstra’s algorithm. It finds an optimal path, but it may be slow for very large graphs, because Dijkstra’s algorith is uninformed search algorithm.
print("Shortest path :", nx.astar_path(M, 'Vaasa', 'Vimpeli'))
print("Shortest path :", nx.astar_path(M, 'Vaasa', 'Vimpeli'))
print("Shortest distance :", nx.astar_path_length(M, 'Vaasa', 'Vimpeli'), "km")
Shortest path : ['Vaasa', 'Vöyri', 'Kauhava', 'Lappajärvi', 'Vimpeli']
Shortest path : ['Vaasa', 'Vöyri', 'Kauhava', 'Lappajärvi', 'Vimpeli']
Shortest distance : 138 km
The A* algorith is informed search, which is made for efficient using so called heuristic function. In case of navigation, the natural heuristic function is to calculate the line sight distance remaining to the target, and use that to judge the fitness of different options for the next step. This distance can be easily calculated, since the coordinates of the towns is known.
The heuristic function needed by A* method takes two nodes as a parameter and returns the line of sight distance between them. For example, when the algorithm needs to choose should it go either Laihia or Vöyri from Vaasa, the decision is made by adding the distance to the next time and the line of sight distance from that town to Vimpeli, and the decision with smallest costs is selected.
from math import sqrt
from numpy import random
def distanceByName(A,B):
a=getCoordinates(A)
if a==None:
return None
b=getCoordinates(B)
if b==None:
return None
#print(a,b)
return sqrt( (a[0]-b[0])**2 + (a[1]-b[1])**2 )
def heuristic(u, v):
return distanceByName(u,v)
# Test the method in the first step
print("Estimated costs through Vöyri is ", 32 + heuristic('Vöyri', 'Vimpeli'))
print("Estimated costs through Laihia is ", 26 + heuristic('Laihia', 'Vimpeli'))
Estimated costs through Vöyri is 111.06838998333184
Estimated costs through Laihia is 119.41142572684615
Path through Vöyri seems shorter, so that is selected as the first step.
random.seed(0)
%timeit path = nx.astar_path(M, 'Vaasa', 'Vimpeli', heuristic=heuristic, weight='weight')
print("Shortest path :", nx.astar_path(M, 'Vaasa', 'Vimpeli', heuristic=heuristic, weight='weight'), )
random.seed(0)
print("Shortest distance :", nx.astar_path_length(M, 'Vaasa', 'Vimpeli', heuristic=heuristic, weight='weight'), "km")
8.59 ms ± 243 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Shortest path : ['Vaasa', 'Vöyri', 'Kauhava', 'Lappajärvi', 'Vimpeli']
Shortest distance : 138 km
The path found is probably optimal, since the A* algorith search multiple paths from source to destination to find the true costs. The algorith uses the heuristic function to make the search faster but due to search of multiple paths, the quality of the heuristic function is not crucial. The algorithm also keeps track of the true costs already calculated, and therefore it does not need to calculate the true costs of a certain path twice.
More complex example¶
See below an example of the search in one magnitude more complex graph, and the nice visualization capabilities of the networkx library.
N=400
G = nx.random_geometric_graph(N, 0.125, seed=2)
pos = nx.get_node_attributes(G, "pos")
nodelist = G.nodes(data=True)
A=1
B=20
colors=['orange']*N
colors[A]='blue'
colors[B]='green'
plt.figure(figsize=(8, 8))
nx.draw_networkx_edges(G, pos, alpha=0.4)
nx.draw_networkx_nodes(
G,
pos,
nodelist=G.nodes,
node_size=80,
node_color=colors,
)
%time path=nx.astar_path(G,A,B)
path_edges = list(zip(path,path[1:]))
nx.draw_networkx_edges(G,pos,edgelist=path_edges,edge_color='r',width=5, alpha=0.5)
nx.draw_networkx_nodes(G,pos,nodelist=path[1:-1],node_size=80, node_color='r', alpha=1.0)
plt.axis("off")
CPU times: user 4.06 ms, sys: 0 ns, total: 4.06 ms
Wall time: 4.11 ms
(-0.09958297307894066,
1.1035920983535217,
-0.10150557132936275,
1.1032264276041368)
%timeit path=nx.astar_path(G,A,B)
print("Shortest path :", path )
print("Shortest distance :", nx.astar_path_length(G,A,B), "steps")
3.19 ms ± 219 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Shortest path : [1, 88, 292, 38, 283, 307, 24, 363, 324, 174, 20]
Shortest distance : 10 steps
Summary¶
In this module the methods for searching data and paths from graph and using graph library has been covered. The learning outcomes of the module:
How the data is encoded as data structures in programs
Some implementation strategies for breadth first, depth first and A* searches
Implementation strategy of a heuristic function
Overall view of the services provided by the Networkx library and the possibilities it provides
Some understanding of Python programing
How to use graph search as part of and intelligent agent, also in cases where the full graph is not known a-priori.