AI-Tools: Graph search#
The search problem
Graph search, uninformed search strategies
Breadth first
Depth First
Data structures
Stack
Fifo
Informed search and heuristic function
A* algorithm
Finding a way between towns#
To actually solve the navigation problem, two things are needed:
Knowledge and
Algorithm
What knowledge is encoded in the map:
?
?
?
Puzzle solving, infinite graph#
The possible moves in the 8-puzzle game forms a graph
Obviously, if the tree is expanded from all possible moves, the goal state will be included in the tree.
The goal can be find using graph search algorithms, such as A*.
Even though this 8-puzzle is small but the number of possible moves will grow exponentially
The solution graph is infinite
Most of the branches are, however, bad moves
It is a good strategy to use a heuristic function to eliminate bad moves before taking them.
To explore an infinite graph, one must expand it on-the-go
In this example an implementation included in the course book is used:
Python code for the book Artificial Intelligence: A Modern Approach. You can use this in conjunction with a course on AI, or for study on your own. We’re looking for solid contributors to help.
Solution using Networkx graph library#
The previous implementations have shown approaches for implementing graph searching. But there are also many libraries with ready made implementations of various graph search algorithms. Networkx is one of the most often used graph libraries in Python.
I have collected some additional information about the towns between Vaasa and Vimpeli, for example the coordinates of the towns, which can be used for calculating the direct line distances between them to define a heuristic function for guiding the graph search for most promising directions.
Name Northing Easting
0 Akaa 61.167145977 23.865888764
1 Alajärvi 63.001273254 23.817522812
2 Alavieska 64.166646246 24.307213479
3 Alavus 62.586161997 23.616600589
4 Asikkala 61.172512361 25.549142484
5 Askola 60.530210718 25.598880612
...
The following function calculates the coordinates of the towns based on their longitudes and latitudes.
# This code calculates the coordinates of the towns. It is only needed for nice plotting
import pandas as pd
from math import sin, pi
Coordinates=pd.read_csv('data/coordinates.csv')
def getCoordinates(Location, Theta=63.092589):
Ps=Coordinates[Coordinates.Name==Location]
if len(Ps)==0:
print("Now location %s found!" % Location)
return None
P=Ps.iloc[0]
# Earth radius / km
R=6371.0 # Average
Rp=6356.8 # Polar
Re=6378.1 # Equatorial
#
# Earth radius in current Northing
#Theta=P.Northing/180*pi
#Theta=63.092589
r=sin(pi/2-Theta/180*pi)*Re
#
y = (P.Northing/180*pi)*Rp
x = (P.Easting/180*pi)*r
return (x,y)
print(getCoordinates('Vaasa'))
print(getCoordinates('Seinäjoki'))
(1088.9504471170537, 6999.93912935637)
(1150.7335047599795, 6965.997532375281)
Encoding the information#
The biggest work when using the Networkx all other similar libraries, is to encode the information for the library. In the code below, the libraries are first imported into the workspace, and empty graph is created.
Then the nodes are added from the list of towns using for-loop. The coordinates of the towns are added in each node (the pos attribute) to be able to plot the graph in actual scale.
The roads between the towns are added as edges. The lengths of the roads are added as weigths of edges.
The networx library includes convenient funtions for plotting the graph. The weights of the edges can be plotted in the edges and the pos-attribute is now used to place the towns in a coordinate system.
# This code generates the graph
import networkx as nx
import matplotlib.pyplot as plt
# Create an empty directed graph
M=nx.DiGraph()
# List of towns added as nodes
Towns=['Vaasa', 'Laihia', 'Vöyri', 'Ilmajoki', 'Ylistaro', 'Seinäjoki',
'Vöyri', 'Lapua', 'Jepua', 'Alajärvi', 'Kyyjärvi', 'Kuortane',
'Kauhava', 'Lappajärvi', 'Evijärvi', 'Vimpeli']
# Add the nodes and store the coordinates of the towns as well, for nice plotting
for town in Towns:
M.add_node(town, pos=getCoordinates(town))
# Add the edges, and their weights = distances
M.add_edge('Vaasa', 'Laihia', weight=26)
M.add_edge('Vaasa', 'Vöyri', weight=32)
M.add_edge('Laihia', 'Ilmajoki', weight=45)
M.add_edge('Laihia', 'Ylistaro', weight=29)
M.add_edge('Vöyri', 'Jepua', weight=42)
M.add_edge('Vöyri', 'Kauhava', weight=56)
M.add_edge('Ilmajoki', 'Seinäjoki', weight=18)
M.add_edge('Ylistaro', 'Seinäjoki', weight=27)
M.add_edge('Ylistaro', 'Lapua', weight=28)
M.add_edge('Seinäjoki', 'Lapua', weight=26)
M.add_edge('Vöyri', 'Kauhava', weight=45)
M.add_edge('Vöyri', 'Jepua', weight=42)
M.add_edge('Lapua', 'Kauhava', weight=18)
M.add_edge('Lapua', 'Alajärvi', weight=46)
M.add_edge('Jepua', 'Kauhava', weight=56)
M.add_edge('Alajärvi', 'Kyyjärvi', weight=36)
M.add_edge('Alajärvi', 'Vimpeli', weight=22)
M.add_edge('Alajärvi', 'Kuortane', weight=40)
M.add_edge('Kauhava', 'Lappajärvi', weight=38)
M.add_edge('Kauhava', 'Evijärvi', weight=44)
M.add_edge('Lappajärvi', 'Vimpeli', weight=23)
M.add_edge('Evijärvi', 'Vimpeli', weight=32)
# Plot the graph nicely in scale
fig=plt.figure(figsize=(8,8))
pos=nx.get_node_attributes(M,'pos')
nx.draw_networkx (M, pos=pos, node_color='orange')
labels = nx.get_edge_attributes(M,'weight')
nx.draw_networkx_edge_labels(M,pos,edge_labels=labels)
print("Graph is ready")
Graph is ready
Asking questions from the graph#
list(M.neighbors('Kauhava'))
['Lappajärvi', 'Evijärvi']
#nx.depth_first_search.dfs_successors(M,'Vaasa')
for node in nx.breadth_first_search.bfs_successors(M, 'Vaasa'):
print(node)
('Vaasa', ['Laihia', 'Vöyri'])
('Laihia', ['Ilmajoki', 'Ylistaro'])
('Vöyri', ['Jepua', 'Kauhava'])
('Ilmajoki', ['Seinäjoki'])
('Ylistaro', ['Lapua'])
('Kauhava', ['Lappajärvi', 'Evijärvi'])
('Lapua', ['Alajärvi'])
('Lappajärvi', ['Vimpeli'])
('Alajärvi', ['Kyyjärvi', 'Kuortane'])
The breadth first ordering of the towns can be used to find a path from Vaasa to Vimpeli, by starting from Vimpeli to the parent Lappajärvi, and so on the reverse order up to Vaasa.
Vaasa -> Vöyri -> Kauhava -> Lappajärvi -> Vimpeli
The path is unique in this ordering but not necessary optimal, since the weights are not taken into account.
A* algorithm#
More optimal path minimizing the weights can be found using A* algorithm. If no heuristic function is given, the algorith is using Dijkstra’s algorithm. It finds an optimal path, but it may be slow for very large graphs, because Dijkstra’s algorith is uninformed search algorithm.
print("Shortest path :", nx.astar_path(M, 'Vaasa', 'Vimpeli'))
print("Shortest path :", nx.astar_path(M, 'Vaasa', 'Vimpeli'))
print("Shortest distance :", nx.astar_path_length(M, 'Vaasa', 'Vimpeli'), "km")
Shortest path : ['Vaasa', 'Vöyri', 'Kauhava', 'Lappajärvi', 'Vimpeli']
Shortest path : ['Vaasa', 'Vöyri', 'Kauhava', 'Lappajärvi', 'Vimpeli']
Shortest distance : 138 km
The A* algorith is informed search, which is made for efficient using so called heuristic function. In case of navigation, the natural heuristic function is to calculate the line sight distance remaining to the target, and use that to judge the fitness of different options for the next step. This distance can be easily calculated, since the coordinates of the towns is known.
The heuristic function needed by A* method takes two nodes as a parameter and returns the line of sight distance between them. For example, when the algorithm needs to choose should it go either Laihia or Vöyri from Vaasa, the decision is made by adding the distance to the next time and the line of sight distance from that town to Vimpeli, and the decision with smallest costs is selected.
from math import sqrt
from numpy import random
def distanceByName(A,B):
a=getCoordinates(A)
if a==None:
return None
b=getCoordinates(B)
if b==None:
return None
#print(a,b)
return sqrt( (a[0]-b[0])**2 + (a[1]-b[1])**2 )
def heuristic(u, v):
return distanceByName(u,v)
# Test the method in the first step
print("Estimated costs through Vöyri is ", 32 + heuristic('Vöyri', 'Vimpeli'))
print("Estimated costs through Laihia is ", 26 + heuristic('Laihia', 'Vimpeli'))
Estimated costs through Vöyri is 111.06838998333184
Estimated costs through Laihia is 119.41142572684615
Path through Vöyri seems shorter, so that is selected as the first step.
random.seed(0)
%timeit path = nx.astar_path(M, 'Vaasa', 'Vimpeli', heuristic=heuristic, weight='weight')
print("Shortest path :", nx.astar_path(M, 'Vaasa', 'Vimpeli', heuristic=heuristic, weight='weight'), )
random.seed(0)
print("Shortest distance :", nx.astar_path_length(M, 'Vaasa', 'Vimpeli', heuristic=heuristic, weight='weight'), "km")
3.38 ms ± 29.8 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Shortest path : ['Vaasa', 'Vöyri', 'Kauhava', 'Lappajärvi', 'Vimpeli']
Shortest distance : 138 km
The path found is probably optimal, since the A* algorith search multiple paths from source to destination to find the true costs. The algorith uses the heuristic function to make the search faster but due to search of multiple paths, the quality of the heuristic function is not crucial. The algorithm also keeps track of the true costs already calculated, and therefore it does not need to calculate the true costs of a certain path twice.
More complex example using Open Streetmap#
We can download real world map data from OpenStreetmap as a graph by means of osmnx and networkx libraries. The data can be also visualised in scale using geopandas.
import networkx as nx
import osmnx as ox
%matplotlib ipympl
#bbox=(north, south, east, west)
bbox=(63.4, 62.7, 24.2, 21.05)
G_box=ox.graph_from_bbox(bbox=bbox, network_type="drive")
#G_Pohjanmaa = ox.graph_from_place("Pohjanmaa, Finland", network_type="drive")
/tmp/ipykernel_139180/4031697211.py:3: FutureWarning: The expected order of coordinates in `bbox` will change in the v2.0.0 release to `(left, bottom, right, top)`.
G_box=ox.graph_from_bbox(bbox=bbox, network_type="drive")
The roads#
The roads in the bounding box covering the source and destination nodes.
fig, ax = ox.plot_graph(G_box, edge_color='orange', edge_linewidth=1, node_size=2, figsize=(9,4.5))
Route finding#
The distance and routing modules can calculate road length and speed information from the geospatial data stored in Open Streetmap.
G = ox.distance.add_edge_lengths(G_box)
G = ox.routing.add_edge_speeds(G)
G = ox.routing.add_edge_travel_times(G)
university=(21.594, 63.1027)
vimpeli=(23.8285,63.16428)
# Find the nearest node by longitude and latitude
orig, dist_o = ox.distance.nearest_nodes(G, X=university[0], Y=university[1], return_dist=True)
dest, dist_d= ox.distance.nearest_nodes(G, X=vimpeli[0], Y=vimpeli[1], return_dist=True)
print(f"Source node={orig}, destination node is {dest}")
print(f"Distance from source point to source node is {dist_o:.3} m and between destination point and node is {dist_d:.3} m")
Source node=567879088, destination node is 8555338968
Distance from source point to source node is 74.7 m and between destination point and node is 12.4 m
# Find the shortest route
##%timeit
## Route finding takes about 100 ms in standard laptop (Intel, i5-1245U)
route = ox.shortest_path(G, orig, dest, weight="length")
# Convert to Geopandas Data Frame
gdf=ox.routing.route_to_gdf(G, route)
gdf.explore()
gdf
osmid | lanes | ref | name | highway | maxspeed | oneway | reversed | length | geometry | speed_kph | travel_time | bridge | junction | tunnel | |||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
u | v | key | |||||||||||||||
567879088 | 2501538277 | 0 | [1205303812, 242697759] | NaN | NaN | Yliopistonranta | residential | 30 | False | True | 273.703 | LINESTRING (21.59339 63.10331, 21.59345 63.103... | 30.0 | 32.8 | NaN | NaN | NaN |
2501538277 | 2501538262 | 0 | [242697753, 242697754, 242697755, 242697751] | [2, 3] | NaN | Wolffintie | secondary | 50 | True | False | 384.787 | LINESTRING (21.59875 63.10376, 21.59906 63.103... | 50.0 | 27.7 | yes | NaN | NaN |
2501538262 | 26843884 | 0 | 741050670 | 2 | NaN | Kirkkopuistikko | secondary | 40 | True | False | 58.079 | LINESTRING (21.60380 63.10116, 21.60398 63.101... | 40.0 | 5.2 | NaN | NaN | NaN |
26843884 | 26843883 | 0 | 741050670 | 2 | NaN | Kirkkopuistikko | secondary | 40 | True | False | 123.919 | LINESTRING (21.60455 63.10076, 21.60467 63.100... | 40.0 | 11.2 | NaN | NaN | NaN |
26843883 | 26035348 | 0 | 20867334 | NaN | NaN | Museokatu | secondary | 40 | False | False | 15.946 | LINESTRING (21.60554 63.09974, 21.60583 63.09980) | 40.0 | 1.4 | NaN | NaN | NaN |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
4901691129 | 813611651 | 0 | 130069820 | 2 | 68 | Vimpelintie | primary | 60 | False | True | 10.057 | LINESTRING (23.83291 63.15953, 23.83284 63.15962) | 60.0 | 0.6 | NaN | NaN | NaN |
813611651 | 1433712216 | 0 | [130069820, 130069821, 25535639] | 2 | 68 | Vimpelintie | primary | 60 | False | True | 333.515 | LINESTRING (23.83284 63.15962, 23.83265 63.159... | 60.0 | 20.0 | yes | NaN | NaN |
1433712216 | 4901715602 | 0 | 25535639 | 2 | 68 | Vimpelintie | primary | 60 | False | True | 25.970 | LINESTRING (23.83034 63.16240, 23.83014 63.16261) | 60.0 | 1.6 | NaN | NaN | NaN |
4901715602 | 834229654 | 0 | 25535639 | 2 | 68 | Vimpelintie | primary | 60 | False | True | 11.090 | LINESTRING (23.83014 63.16261, 23.83006 63.16270) | 60.0 | 0.7 | NaN | NaN | NaN |
834229654 | 8555338968 | 0 | 25535639 | 2 | 68 | Vimpelintie | primary | 60 | False | True | 202.066 | LINESTRING (23.83006 63.16270, 23.82955 63.163... | 60.0 | 12.1 | NaN | NaN | NaN |
257 rows × 15 columns
# Some statistics # Gogle maps 138 km
print(f"The map has {len(G_box.nodes)} nodes (crossings)")
print(f"The map has {len(G_box.edges)} edges (roads)")
distance=gdf.iloc[:,8].sum()
print(f"Total length betwen the source and destination is {distance/1000:4.4} km")
The map has 21472 nodes (crossings)
The map has 49167 edges (roads)
Total length betwen the source and destination is 135.5 km
Summary#
In this module the methods for searching data and paths from graph and using graph library has been covered. The learning outcomes of the module:
How the data is encoded as data structures in programs
Some implementation strategies for breadth first, depth first and A* searches
Implementation strategy of a heuristic function
Overall view of the services provided by the Networkx library and the possibilities it provides
Some understanding of Python programing
How to use graph search as part of and intelligent agent, also in cases where the full graph is not known a-priori.