Graph Theory
Books
- Introduction to Algorithms — Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein (2022). Comprehensive reference covering algorithm design and analysis, including graph traversal, shortest paths, and spanning trees.
- Introduction to Graph Theory — Richard J. Trudeau. A visual and conceptual introduction to the fundamentals of graph theory and mathematical reasoning.
Prerequisites
- Basic Set Theory: Sets, subsets, unions, intersections, Cartesian products, and relations.
- Data Structures: Lists, stacks, queues, trees, heaps, and adjacency-based graph representations.
- Algorithmic Complexity: Big-O notation, asymptotic analysis, and time–space trade-offs. Understanding how runtime grows with input size is essential for comparing graph algorithms.
Asymptotic Notation Refresher
Suppose we double the number of intersections in a city map. Will our shortest-path algorithm become twice as slow? Four times? Or barely slower at all? Questions like these arise constantly in graph-based planning, and answering them requires a scale-independent way of describing algorithmic growth.
Asymptotic notation provides this perspective. By focusing on how running time behaves as the graph size grows, rather than on machine-specific measurements, we obtain a clear and comparable description of algorithmic efficiency.
When we analyze graph algorithms, we usually care about how the running time grows with the input size $n$ as $n \to \infty$. We compare functions like $f(n)$ and $g(n)$ using the following notations:
Big-O, Big-Omega, Big-Theta
Definition. Big-O — asymptotic upper bound
A function $ f(n) $ is in $ O(g(n)) $ if there exist constants $ c > 0 $ and $ n_0 $ such that
$$ 0 \le f(n) \le c\, g(n) \quad \text{for all } n \ge n_0 . $$
Intuition: $f$ does not grow faster than $g$, up to constant factors.
While Big-O describes an upper bound, it does not tell us whether the function ever grows faster than the comparison. To capture that aspect, we use Big-Omega.
Definition. Big-Omega — asymptotic lower bound
A function $ f(n) $ is in $ \Omega(g(n)) $ if there exist constants $ c > 0 $ and $ n_0 $ such that
$$ 0 \le c\, g(n) \le f(n) \quad \text{for all } n \ge n_0 . $$
Intuition: $f$ does not grow slower than $g$.
Neither an upper nor a lower bound alone gives the full picture. When both coincide, we obtain a tight characterization, expressed using Big-Theta.
Definition. Big-Theta — tight bound
A function $ f(n) $ is in $ \Theta(g(n)) $ if
$$ f(n) = O(g(n)) \quad \text{and} \quad f(n) = \Omega(g(n)). $$
Intuition: $f$ and $g$ grow at the same rate (up to constant factors).
The three notations form a hierarchy of expressiveness. Together, they let us distinguish between algorithms that merely avoid worst-case blowup and those that are genuinely efficient across all input sizes.
Quick Examples
- $ n^2 = O(n^3) $.
- $ n^3 = \Omega(n^2) $.
- $ 3n^2 + 5n + 7 = \Theta(n^2) $ (same growth rate as $ n^2 $).
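For instance, the last example can be verified directly from the definitions above; one possible choice of constants is
$$ 3n^2 \;\le\; 3n^2 + 5n + 7 \;\le\; 15n^2 \quad \text{for all } n \ge 1, $$
so $3n^2 + 5n + 7 = O(n^2)$ with $c = 15$, and $3n^2 + 5n + 7 = \Omega(n^2)$ with $c = 3$ (both with $n_0 = 1$), hence the function is $\Theta(n^2)$.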
We will use these notations throughout to describe the time and space complexity of graph algorithms such as BFS, DFS, Dijkstra, and A*.
With these tools in hand, we can now analyze the complexity of the graph algorithms introduced throughout this chapter.
General Motivation
Before introducing formal definitions, it is useful to understand why graphs arise so naturally in robotics and algorithmic reasoning. Many real-world systems, from transportation to autonomous navigation, can be expressed as collections of states and possible transitions. Graph theory provides the mathematical language for studying these relationships and describing how information or motion can flow through a system.
When a robot navigates an environment, its possible configurations and feasible transitions form exactly such a graph: each node represents a state, and each edge represents a valid move. Motion planning then becomes the problem of finding a path in this graph from an initial configuration to a goal while satisfying feasibility or optimality constraints.
Example 1: GPS Navigation
The most intuitive real-world example of a graph is the road network used by your car’s GPS or a mapping app. Intersections are nodes; road segments are edges. To find the “best” route, we assign each edge a weight (typically travel time), computed from distance, speed limits, and live traffic. A shortest-path algorithm then finds the path with minimum total weight from start to destination.
Example 2: Warehouse Robotics
Autonomous robots navigating a warehouse are another classic example. We can model the warehouse floor as a grid, where each free square is a node. Edges implicitly connect adjacent squares, representing a possible move. While a simple path-finding algorithm on this graph can find the route with the fewest moves, we can add weights to edges to represent “slower” zones or areas to avoid. This model also scales to complex multi-agent problems. To coordinate dozens of robots, the graph can be expanded to include time, allowing a search algorithm to find collision-free paths for the entire fleet.
These examples highlight how diverse systems can be described using the same graph abstraction. To work with such systems rigorously, we next introduce precise definitions for graphs, edges, weights, and paths, forming the mathematical foundation used throughout the rest of this material.
Course Content
Basic Definitions
To formalize the intuitions built from real-world examples, we now introduce the core definitions of graph theory. These definitions establish the vocabulary used for describing connectivity, relationships, and structure within a system, and they serve as the basis for later algorithmic analysis.
Graphs
A graph is one of the simplest and most powerful mathematical abstractions used to represent relationships between entities.
Formally, a graph is an ordered pair
$$ G = (V, E) $$
where $V$ is a finite, nonempty set of vertices (or nodes) and $E$ is a set of edges that connect pairs of vertices.
Each edge can be represented as an ordered or unordered pair $(u, v)$, depending on whether the graph is directed or undirected.
In an undirected graph, $(u, v) \in E$ implies $(v, u) \in E$, meaning that the connection between the two vertices is symmetric. In contrast, directed graphs encode asymmetric relationships, such as one-way communication links, dependencies, or flows of information.
The number of edges incident to a vertex $v$ (for a simple graph, the number of its neighbors) is called its degree.
In directed graphs, we often distinguish between the in-degree and out-degree, referring to the number of incoming and outgoing edges respectively.
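As a quick illustration (using a made-up edge list, not the one from the exercise below), in-degrees and out-degrees can be counted in a couple of lines of Python:

```python
from collections import Counter

# Directed edges given as (tail, head) pairs; the data here is purely illustrative.
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]

out_degree = Counter(u for u, v in edges)  # edges leaving each vertex
in_degree = Counter(v for u, v in edges)   # edges entering each vertex

print(out_degree["C"], in_degree["C"])     # 2 1
```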
Exercise 1 — In-degree and Out-degree
Consider the directed graph shown below:
The graph consists of vertices
$V = \{A, B, C, D, E\}$ and directed edges
$E = \{(A,B), (A,C), (B,C), (C,A), (C,D), (D,E)\}$.
- List the incoming and outgoing edges for each vertex.
- Compute the in-degree and out-degree of vertex C.
- Verify that the sum of all in-degrees equals the sum of all out-degrees.
Solution
- Incoming and outgoing edges:
  - $A$: out = $\{(A,B),(A,C)\}$, in = $\{(C,A)\}$
  - $B$: out = $\{(B,C)\}$, in = $\{(A,B)\}$
  - $C$: out = $\{(C,A),(C,D)\}$, in = $\{(A,C),(B,C)\}$
  - $D$: out = $\{(D,E)\}$, in = $\{(C,D)\}$
  - $E$: out = $\emptyset$, in = $\{(D,E)\}$
- For vertex C: in-degree = 2 (from A, B), out-degree = 2 (to A, D).
- $\sum_v \text{in-degree}(v) = 6 = \sum_v \text{out-degree}(v)$, as expected for any directed graph.
Weighted Graphs
Many problems require not only knowing whether two vertices are connected, but also how costly that connection is.
A weighted graph extends the definition of $G = (V, E)$ by associating a real-valued weight function
$$ w : E \rightarrow \mathbb{R}^+ $$
that assigns a positive cost or length to each edge.
The weight may represent distance, time, energy, or any quantitative measure of effort required to traverse that edge.
The inclusion of weights allows us to formalize the notion of optimality.
A path connecting two vertices may exist, but the one with minimal total weight is often of greater interest than any arbitrary path.
Paths and Connectivity
A path in a graph is a finite sequence of vertices
$$ P = (v_0, v_1, \ldots, v_k) $$
such that each consecutive pair $(v_i, v_{i+1})$ is an edge in $E$.
The length of a path can refer either to the number of edges it contains (for unweighted graphs) or to the sum of edge weights (for weighted graphs).
A graph is said to be connected if, for every pair of vertices $u$ and $v$, there exists at least one path from $u$ to $v$.
Otherwise, the graph is disconnected, meaning that it consists of multiple connected components that are isolated from one another.
Connectivity plays a central role in determining whether traversal or communication between parts of the graph is possible.
In summary, this chapter introduced the essential components of a graph (vertices, edges, weights, and paths) and showed how these elements encode the structure of a system. With these definitions established, we can now examine how graphs are represented inside a computer, an important consideration for the efficiency of search and planning algorithms.
Graph Representations
Once a graph is defined mathematically, the next question is how to store it efficiently in a computer. Different applications require different operations (fast lookups, memory-efficient storage, or quick neighbor traversal), and the chosen data structure strongly influences algorithmic performance. This chapter compares the three standard representations used in practice.
Before we can run algorithms like BFS or Dijkstra, we must represent the abstract concept of a graph $G = (V, E)$ in a computer’s memory. The choice of representation is a fundamental design decision that has a major impact on both runtime and memory usage.
A representation is a data structure that allows us to store vertices and edges and perform key operations such as:
- Checking if an edge $(u, v)$ exists
- Finding all neighbors of a vertex $v$
- Adding or removing vertices and edges
- Storing and retrieving edge weights
We will explore the three most common methods:
- Adjacency Matrix
- Adjacency List
- Edge List
Adjacency Matrix
An adjacency matrix represents a graph with $V$ vertices as a $V \times V$ matrix (a 2D array) of booleans or weights. For any two vertices $i$ and $j$, the entry $A[i, j]$ stores information about the edge $(i, j)$:
$$ A[i, j] = \begin{cases} w(i, j), & \text{if edge } (i, j) \in E \\
\infty\ \text{(or 0 for unweighted)}, & \text{otherwise} \end{cases} $$
For an undirected graph, the matrix is symmetric, meaning $A[i, j] = A[j, i]$.
Example. Consider this simple weighted, undirected graph:
Its adjacency matrix representation (using $\infty$ for non-edges) would be:
$$ A = \begin{bmatrix} & \mathbf{A} & \mathbf{B} & \mathbf{C} & \mathbf{D} & \mathbf{E} \\
\mathbf{A} & 0 & 2.0 & 1.5 & \infty & \infty \\
\mathbf{B} & 2.0 & 0 & \infty & 3.0 & \infty \\
\mathbf{C} & 1.5 & \infty & 0 & 2.5 & 1.0 \\
\mathbf{D} & \infty & 3.0 & 2.5 & 0 & 4.0 \\
\mathbf{E} & \infty & \infty & 1.0 & 4.0 & 0 \end{bmatrix} $$
Analysis
Pros
- Fast edge lookup: Checking if an edge $(i, j)$ exists or finding its weight is $O(1)$.
- Simple updates: Adding or removing an edge is $O(1)$.
Cons
- High space complexity: Requires $O(V^2)$ space regardless of how many edges exist.
- Inefficient for sparse graphs: with $V = 10{,}000$ vertices, each connected to $k = 15$ neighbors, there are only $E \approx V \cdot k / 2 = 75{,}000$ edges, yet the matrix stores $V^2 = 10{,}000^2 = 10^8$ entries; more than 99.8% of them are $\infty$.
- Slow neighbor iteration: To find all neighbors of vertex $i$, you must scan its entire row ($O(V)$).
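As an illustration, the matrix above can be built in a few lines of Python. This is a minimal sketch: the vertex ordering, the use of `math.inf` for non-edges, and a nested-list matrix are implementation choices, not requirements.

```python
import math

vertices = ["A", "B", "C", "D", "E"]
index = {v: i for i, v in enumerate(vertices)}

# Weighted undirected edges from the example graph above.
edges = [("A", "B", 2.0), ("A", "C", 1.5), ("B", "D", 3.0),
         ("C", "D", 2.5), ("C", "E", 1.0), ("D", "E", 4.0)]

n = len(vertices)
A = [[math.inf] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 0.0                      # zero cost from a vertex to itself
for u, v, w in edges:
    A[index[u]][index[v]] = w          # symmetric entries because the
    A[index[v]][index[u]] = w          # graph is undirected

# O(1) edge lookup:
print(A[index["A"]][index["B"]])       # 2.0
```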
Adjacency List
An adjacency list is the most common representation for sparse graphs. It consists of an array (or map) of $V$ lists. The list at index $i$ stores all neighbors of vertex $i$. For a weighted graph, each entry stores both the neighbor ID and the edge weight.
Example. Consider the following graph:
Adjacency list representation:
$$ \begin{aligned} A &\rightarrow [(B,\, 1.0),\; (C,\, 2.5)] \\
B &\rightarrow [(A,\, 1.0),\; (C,\, 1.8)] \\
C &\rightarrow [(A,\, 2.5),\; (B,\, 1.8),\; (D,\, 3.2)] \\
D &\rightarrow [(C,\, 3.2)] \end{aligned} $$
Analysis
Pros
- Space-efficient: $O(V + E)$ total space.
- Fast neighbor iteration: $O(\text{degree}(i))$, ideal for BFS/Dijkstra.
Cons
- Slower edge lookup: To check if $(i, j)$ exists, you must scan vertex $i$’s list ($O(\text{degree}(i))$).
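A corresponding sketch for the adjacency list shown above, using a plain dictionary of lists (one common choice among many):

```python
# Each vertex maps to a list of (neighbor, weight) pairs.
adj = {
    "A": [("B", 1.0), ("C", 2.5)],
    "B": [("A", 1.0), ("C", 1.8)],
    "C": [("A", 2.5), ("B", 1.8), ("D", 3.2)],
    "D": [("C", 3.2)],
}

# Fast neighbor iteration: O(degree(v))
for neighbor, weight in adj["C"]:
    print(neighbor, weight)

# Edge lookup requires scanning the list: O(degree(u))
has_edge = any(nb == "D" for nb, _ in adj["C"])
print(has_edge)  # True
```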
Edge List
An edge list is the simplest of all representations.
It is a single list (or array) containing all edges in the graph.
For a weighted graph, each entry is a tuple $(u, v, w)$.
Example. Consider the following graph:
Edge list representation:
$$ E = [ (A,\, B,\, 2.2),\; (A,\, D,\, 3.1),\; (B,\, C,\, 1.7),\; (C,\, D,\, 2.8),\; (C,\, E,\, 1.5),\; (D,\, E,\, 2.3) ] $$
Analysis
Pros
- Very simple: Minimal implementation complexity.
- Space-efficient: $O(E)$.
- Ideal for edge-based algorithms such as Kruskal’s Minimum Spanning Tree.
Cons
- Slow neighbor iteration: Finding all neighbors of $i$ requires scanning all $E$ edges ($O(E)$).
- Slow node-centric operations: Adding nodes or checking degrees is inefficient.
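A minimal sketch of the edge list above; the weight-sorting step is the kind of edge-centric operation that Kruskal-style algorithms start from:

```python
# Each entry is a (u, v, weight) tuple.
edge_list = [("A", "B", 2.2), ("A", "D", 3.1), ("B", "C", 1.7),
             ("C", "D", 2.8), ("C", "E", 1.5), ("D", "E", 2.3)]

# Edge-centric algorithms such as Kruskal's begin by sorting edges by weight.
for u, v, w in sorted(edge_list, key=lambda e: e[2]):
    print(u, v, w)

# Finding the neighbors of a vertex requires a full scan: O(E)
neighbors_of_C = [v for u, v, w in edge_list if u == "C"] + \
                 [u for u, v, w in edge_list if v == "C"]
print(neighbors_of_C)  # ['D', 'E', 'B']
```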
Summary and Comparison
The best representation depends entirely on the graph’s density and the operations you need.
| Operation | Adjacency Matrix | Adjacency List | Edge List |
|---|---|---|---|
| Space | $O(V^2)$ | $O(V + E)$ | $O(E)$ |
| Find neighbors of $v$ | $O(V)$ | $O(\text{degree}(v))$ | $O(E)$ |
| Check if edge $(u,v)$ exists | $O(1)$ | $O(\text{degree}(u))$ | $O(E)$ |
| Add edge | $O(1)$ | $O(1)$ | $O(1)$ |
| Remove edge $(u,v)$ | $O(1)$ | $O(\text{degree}(u))$ | $O(E)$ |
Choosing the right representation is a crucial design decision, as it determines which graph algorithms can run efficiently. With a representation selected, we can now study how to explore, traverse, and search graphs using systematic algorithms.
Summary Takeaway
- Use Adjacency Matrix for dense graphs or frequent edge lookups.
- Use Adjacency List for sparse graphs and search algorithms (BFS, Dijkstra, A*).
- Use Edge List for edge-sorted or edge-centric algorithms (Kruskal, clustering, etc.).
Exercise: Graph Representation in Real-World Scenarios
For each scenario, consider the nature of the graph (directed/undirected, weighted/unweighted), its density ($E$ vs. $V^2$), and the most frequent operation required by the system.
1. The Global Flight Network ✈️
Imagine building a system for a flight search engine (like Google Flights or Skyscanner).
- Vertices (V): Every major international airport (around 10,000 to 20,000 worldwide).
- Edges (E): A direct commercial flight route between two airports.
- Weights: Flight time and/or ticket cost.
- System Goal: The main operation is finding the shortest path (fewest layovers or cheapest route) between two airports.
Guiding Questions:
- Is this graph likely sparse or dense? Hint: Does every airport have a direct flight to every other major airport?
- What is the complexity of V? Is it large, making $O(V^2)$ space prohibitive?
- The core algorithm (like Dijkstra’s) relies heavily on finding all neighbors of a current airport. Which representation is fastest for this operation?
- Conclusion: Which representation is the best fit?
Hints
Airports form a sparse network because very few airports have direct flights to most others.
With ~10k–20k airports, V² becomes extremely large.
Shortest-path algorithms frequently require scanning all outgoing edges from a node.
Solution
The graph is sparse: each airport connects only to a small subset of all airports.
Storing O(V²) is too large for V≈20,000, so an adjacency matrix is impractical.
Since shortest-path algorithms repeatedly need to find all neighbors of a node, adjacency lists provide fast neighbor iteration.
Conclusion: Use an adjacency list representation.
2. A Social Media “Friend” Graph 🫂
Imagine a system (like Facebook) that stores the connections between its users in a major city.
- Vertices (V): Users in the city (e.g., 500,000 users).
- Edges (E): A “friend” connection between two users (undirected).
- Weights: Unweighted (or maybe a metric of interaction frequency).
- System Goal: The main operation is quickly checking if two specific users are direct friends to filter content or display a profile.
Guiding Questions:
- With V = 500,000, $V^2$ is an enormous number ($2.5 \times 10^{11}$). Is $O(V^2)$ space feasible for your memory budget?
- While the graph is sparse overall (most users don’t know most other users), the primary operation is checking for a specific edge $(u, v)$.
- Which representation provides an $O(1)$ time complexity for the most frequent operation (edge lookup)?
- Conclusion: If space is limited to $O(V + E)$, how do you achieve the fastest lookup? Hint: Consider using a hash map/set structure within an adjacency list to improve lookup time from $O(\text{degree}(u))$ to $O(1)$ on average.
Hints
V² is far too large to store for 500k users.
Pure adjacency lists make checking if (u,v) exists O(degree(u)).
Hash sets inside adjacency lists allow average O(1) edge existence checks.
Solution
The graph is sparse, and O(V²) memory is completely infeasible.
The main operation is checking whether two users are directly connected.
An adjacency list with each user’s neighbors stored in a hash set gives O(1) expected lookup time, while keeping O(V+E) memory usage.
Conclusion: Use adjacency lists augmented with hash sets.
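A minimal sketch of that conclusion, with made-up user IDs purely for illustration:

```python
# Adjacency list in which each vertex maps to a *set* of neighbors.
friends = {
    1: {2, 3},
    2: {1},
    3: {1, 4},
    4: {3},
}

def are_friends(u, v):
    """Average O(1) membership test thanks to the hash set."""
    return v in friends.get(u, set())

print(are_friends(1, 3))  # True
print(are_friends(2, 4))  # False
```

Total memory stays $O(V + E)$, while the most frequent operation (edge lookup) becomes constant time on average.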
3. A Chip Design Interconnect 📐
Imagine modeling the metal wire connections (nets) on a microchip layout for routing algorithms.
- Vertices (V): Metal contact points (pins) on the chip (e.g., V = 1000).
- Edges (E): A physical wire connection between two pins.
- Weights: Wire resistance/capacitance.
- System Goal: Analyzing signal integrity often requires repeated matrix multiplications (to find all paths of length k) or quickly assessing all possible connections.
Guiding Questions:
- With V = 1000, what is the size of $V^2$? (1,000,000). Is this space feasible?
- In a typical chip design, the wires are laid out densely, often meaning a high percentage of pins are connected to a high percentage of other pins, or the analysis requires considering every pair.
- The ability to perform $O(1)$ edge lookups is often critical.
- Conclusion: Which representation is best when V is relatively small (making $O(V^2)$ space acceptable) and the need for $O(1)$ edge lookup is paramount, or when using matrix-based algorithms (like Floyd–Warshall or matrix exponentiation)?
Hints
For V=1000, a V² matrix has 1 million entries, which is easily storable.
Dense operations and algorithms like Floyd–Warshall work best with adjacency matrices.
Matrices give O(1) edge lookups.
Solution
With V=1000, storing O(V²) is perfectly feasible.
Chip connections are often dense, or the algorithms used require dense matrix operations.
Constant-time edge lookup is important, and matrix-based algorithms operate naturally on adjacency matrices.
Conclusion: Use an adjacency matrix representation.
Traversal and Search
Graph traversal algorithms reveal the structure of a graph by visiting vertices according to specific rules. These methods form the backbone of motion planning, routing, and connectivity analysis. In this chapter, we examine several classical strategies, starting from uninformed exploration and progressing toward informed, cost-aware search.
Once a graph is defined, we can explore it systematically using search algorithms, enumerating vertices, discovering components, or finding optimal paths between nodes. Although many variants exist, two of the most fundamental search paradigms are breadth-first and depth-first exploration.
Breadth-First Search (BFS)
We begin with the simplest form of systematic exploration, which visits nodes in increasing order of distance. Breadth-First Search explores a graph in layers, visiting all vertices at distance one from the starting node before proceeding to vertices at distance two, and so on. This systematic expansion guarantees that the first time a vertex is reached, it is reached via the shortest possible path in terms of edge count.
The algorithm maintains a queue of vertices to be explored. When a vertex is dequeued, all of its unvisited neighbors are enqueued for later exploration. The process continues until the queue is empty or a target vertex is reached.
Algorithm 1: Breadth-First Search
procedure BFS(G, s)
for each vertex v in G do
visited[v] ← false
parent[v] ← NIL
queue ← empty queue
visited[s] ← true
ENQUEUE(queue, s)
while queue is not empty do
u ← DEQUEUE(queue)
for each neighbor v of u do
if not visited[v] then
visited[v] ← true
parent[v] ← u
ENQUEUE(queue, v)
return parent
Note. The array parent records the predecessor of each vertex along the discovered path, enabling reconstruction of the shortest path by backtracking from the goal to the start.
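The pseudocode translates almost line for line into Python. The sketch below assumes the graph is stored as a dictionary mapping each vertex to a list of its neighbors (the adjacency-list convention from the previous chapter); the names `bfs` and `reconstruct_path` are illustrative.

```python
from collections import deque

def bfs(graph, start):
    """Return the BFS parent map; graph maps each vertex to its neighbors."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in parent:          # not visited yet
                parent[v] = u
                queue.append(v)
    return parent

def reconstruct_path(parent, goal):
    """Backtrack from the goal to the start using the parent map."""
    path = []
    while goal is not None:
        path.append(goal)
        goal = parent[goal]
    return list(reversed(path))

graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B"]}
print(reconstruct_path(bfs(graph, "A"), "D"))  # ['A', 'B', 'D']
```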
Breadth-First Search for Robot Motion Planning. YouTube video, Nov 12, 2017. Available at: https://www.youtube.com/watch?v=x6cGmE0XpY8&list=PLYZT24lofrjXcuu1iBNWu-NprW2wZD3zu&index=43
Depth-First Search (DFS)
While BFS explores in expanding layers, Depth-First Search follows a single branch of the graph as far as possible before backtracking.
It uses a stack (either explicit or via recursion) to remember the path being followed.
This makes DFS more memory-efficient than BFS, though it does not guarantee finding the shortest path.
DFS is particularly useful for tasks such as detecting cycles, testing graph connectivity, and performing topological sorting in directed acyclic graphs.
Algorithm 2: Depth-First Search
procedure DFS(G, s)
for each vertex v in G do
visited[v] ← false
STACK ← empty stack
PUSH(STACK, s)
while STACK is not empty do
u ← POP(STACK)
if not visited[u] then
visited[u] ← true
for each neighbor v of u in reverse order do
if not visited[v] then
PUSH(STACK, v)
By changing the order in which neighbors are pushed onto the stack, DFS can yield different traversal sequences while maintaining the same overall structure.
Shortest Paths in Weighted Graphs
In graphs where edges carry positive weights, we are often interested in the shortest path between two vertices, that is, the path with the minimal total cost.
For a path $P = (v_0, v_1, \ldots, v_k)$, its cost is given by
$$ C(P) = \sum_{i=0}^{k-1} w(v_i, v_{i+1}) $$
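For example, the cost of a concrete path can be evaluated directly from this formula. The sketch below assumes edge weights are stored in a dictionary keyed by ordered vertex pairs, which is just one possible convention:

```python
def path_cost(path, w):
    """Sum of edge weights along path; w maps (u, v) pairs to weights."""
    return sum(w[(path[i], path[i + 1])] for i in range(len(path) - 1))

w = {("A", "B"): 2.0, ("B", "C"): 1.8, ("C", "D"): 3.2}
print(path_cost(["A", "B", "C", "D"], w))  # 7.0
```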
The two most classical algorithms for finding shortest paths are Dijkstra’s algorithm and A* search.
Coding Exercise 1: DFS Traversal and Tree Construction
In this exercise, you will implement an iterative Depth-First Search (DFS) on a fixed undirected graph:
def dfs_order(G, start)
Your function should explore the graph starting from the node start and return:
- `order`: a list of nodes in the order they are first discovered,
- `parent`: a dictionary mapping each node to its parent in the DFS tree (the start node should have `parent[start] = None`).
After you implement the function, click Run ▶ below to animate your DFS traversal.
The visualization will show:
- red edges for the growing DFS tree,
- light-gray edges for non-tree edges,
- a green start node,
- the current node highlighted during traversal,
- and the discovery order displayed above each visited node.
💡 Hints
- Use a Python list as a stack to implement DFS iteratively: push nodes onto the stack, and pop from the end.
- When you first pop a node from the stack, mark it as visited, record its parent, and append it to `order`.
- To ensure deterministic behavior, push neighbors in `sorted(..., reverse=True)` order so that the smallest neighbor is explored first.
- Only push neighbors onto the stack if they have not yet been visited.
def dfs_order(G, start):
    order = []
    parent = {start: None}
    visited = set()
    stack = [(start, None)]
    while stack:
        node, par = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        parent[node] = par
        order.append(node)
        # deterministic neighbor order
        for nb in sorted(G.neighbors(node), reverse=True):
            if nb not in visited:
                stack.append((nb, node))
    return order, parent
Dijkstra’s Algorithm (1959) [1]
BFS and DFS operate under the assumption that all edges have equal cost or that costs are irrelevant. In many real systems, such as road maps, robot motion, and energy usage, transitions have different weights. Dijkstra’s algorithm extends the ideas of BFS to handle weighted graphs optimally.
Dijkstra’s algorithm generalizes BFS to weighted graphs by always expanding the vertex with the lowest cumulative cost from the start.
It maintains a priority queue of vertices, ordered by their current best-known distance.
Quick Fact — Priority Queue.
Dijkstra’s algorithm relies on a priority queue to always expand the vertex with the smallest tentative distance.
Using a binary heap keeps insertions and extractions logarithmic in the number of vertices, making the overall complexity
$\mathcal{O}((V + E)\log V)$.
Once a vertex has been expanded, its shortest distance is guaranteed and never updated again.
Algorithm 3: Dijkstra’s Shortest Path
procedure DIJKSTRA(G, s)
for each vertex v in G do
dist[v] ← ∞
parent[v] ← NIL
dist[s] ← 0
Q ← priority queue ordered by dist
INSERT(Q, s)
while Q is not empty do
u ← EXTRACT-MIN(Q)
for each neighbor v of u do
alt ← dist[u] + w(u, v)
if alt < dist[v] then
dist[v] ← alt
parent[v] ← u
INSERT-OR-DECREASE(Q, v, alt)
return dist, parent
The algorithm runs in $\mathcal{O}((V + E)\log V)$ time when implemented with a binary heap. It guarantees optimality when all edge weights are positive.
The efficiency of Dijkstra’s algorithm depends primarily on how the priority queue is implemented. Each vertex is inserted once into the queue and may be updated (via a ‘decrease-key’ operation) whenever a shorter path is found. Using a binary heap, both ‘insertion’ and ‘extraction’ of the minimum element take $O(\log V)$ time. Since every edge can cause at most one key update, there are up to $O(E)$ decrease-key operations and $O(V)$ extractions overall. This yields a total running time of $O((V + E)\log V)$, which is commonly written as $O(E\log V)$ for connected graphs, where $E \ge V - 1$.
For dense graphs, where $E$ grows quadratically with $V$, the complexity approaches $O(V^2 \log V)$. In practice, the algorithm performs efficiently on most real-world graphs because they are typically sparse.
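As an illustration, here is a compact Python version of Algorithm 3 built on the standard `heapq` module. Since `heapq` provides no decrease-key operation, this sketch uses the common “lazy deletion” variant: improved distances are pushed as new entries and stale entries are skipped when popped.

```python
import heapq
import math

def dijkstra(graph, source):
    """graph: dict mapping u -> list of (v, weight) pairs, with positive weights.
    Every vertex must appear as a key, even if it has no outgoing edges."""
    dist = {v: math.inf for v in graph}
    parent = {v: None for v in graph}
    dist[source] = 0.0
    pq = [(0.0, source)]                      # priority queue of (distance, vertex)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:                       # stale entry: a shorter path was already found
            continue
        for v, w in graph[u]:
            alt = d + w
            if alt < dist[v]:                 # found a shorter path to v
                dist[v] = alt
                parent[v] = u
                heapq.heappush(pq, (alt, v))  # lazy "decrease-key"
    return dist, parent
```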
A-Star (A*) Search (1968) [2]
While Dijkstra’s algorithm finds optimal paths in weighted graphs, it does so without any knowledge of the goal’s location. In many applications, we can estimate how promising a particular direction might be. A* search incorporates this information through heuristics, guiding the algorithm toward the goal more efficiently.
A* extends Dijkstra’s algorithm by adding a heuristic function $h(v)$ that estimates the remaining cost from a vertex $v$ to the goal. This heuristic guides the search toward promising directions, potentially reducing the number of expanded nodes.
Each vertex maintains two quantities:
$g(v)$ — the cost from the start to $v$, and
$f(v) = g(v) + h(v)$ — the estimated total cost through $v$.
The algorithm always expands the vertex with the lowest $f(v)$. If the heuristic is admissible (never overestimates the true cost), the resulting path is guaranteed to be optimal.
Algorithm 4: A-Star Search
procedure ASTAR(G, s, goal, h)
for each vertex v in G do
g[v] ← ∞
f[v] ← ∞
parent[v] ← NIL
g[s] ← 0
f[s] ← h(s)
OPEN ← priority queue ordered by f
INSERT(OPEN, s)
while OPEN is not empty do
u ← EXTRACT-MIN(OPEN)
if u = goal then
return RECONSTRUCT-PATH(parent, u)
for each neighbor v of u do
tentative ← g[u] + w(u, v)
if tentative < g[v] then
g[v] ← tentative
f[v] ← g[v] + h(v)
parent[v] ← u
INSERT-OR-DECREASE(OPEN, v, f[v])
When $h(v) = 0$ for all vertices, A* reduces to Dijkstra’s algorithm. When $h$ is perfectly accurate, the algorithm expands only the nodes along the optimal path.
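A corresponding Python sketch of Algorithm 4, again using `heapq` with lazy deletion rather than an explicit INSERT-OR-DECREASE. The heuristic is passed in as an ordinary function, the graph format matches the Dijkstra sketch above, and the function returns the path together with its cost.

```python
import heapq
import math

def astar(graph, start, goal, h):
    """graph: dict u -> list of (v, weight); h(v) must never overestimate the true cost."""
    g = {v: math.inf for v in graph}
    parent = {v: None for v in graph}
    g[start] = 0.0
    open_heap = [(h(start), start)]               # ordered by f = g + h
    while open_heap:
        f, u = heapq.heappop(open_heap)
        if u == goal:                             # with an admissible h, the first
            path = []                             # expansion of the goal is optimal
            while u is not None:
                path.append(u)
                u = parent[u]
            return list(reversed(path)), g[goal]
        if f > g[u] + h(u):                       # stale entry: g[u] improved since push
            continue
        for v, w in graph[u]:
            tentative = g[u] + w
            if tentative < g[v]:
                g[v] = tentative
                parent[v] = u
                heapq.heappush(open_heap, (tentative + h(v), v))
    return None, math.inf                         # goal unreachable
```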
The hidden beauty of the A* algorithm. YouTube video by Polylog, Jan 20, 2023. Available at: https://www.youtube.com/watch?v=A60q6dcoCjw
Together, BFS, DFS, Dijkstra, and A* form the foundation of graph search theory, illustrating how the structure of a graph and the information available about costs or heuristics influence the efficiency and guarantees of traversal algorithms.
Exercise 2: Tracing Dijkstra’s Algorithm
Consider the following weighted graph, with start node A.
Trace Dijkstra’s algorithm. Fill out a table showing the dist and parent for each node, and list the order in which nodes are EXTRACT-MINed from the priority queue.
Solution

Priority Queue (Q): `[ (A, 0) ]`
- Extract: A (dist = 0)
  - Visit B: `dist[B] = 2`, `parent[B] = A` → Add (B, 2)
  - Visit C: `dist[C] = 5`, `parent[C] = A` → Add (C, 5)
  - Q = `[ (B, 2), (C, 5) ]`
- Extract: B (dist = 2)
  - Visit C: `2 + 2 = 4 < 5` → Update `dist[C] = 4`, `parent[C] = B`
  - Visit D: `2 + 6 = 8` → Add (D, 8)
  - Q = `[ (C, 4), (D, 8) ]`
- Extract: C (dist = 4)
  - Visit D: `4 + 1 = 5 < 8` → Update `dist[D] = 5`, `parent[D] = C`
  - Q = `[ (D, 5) ]`
- Extract: D (dist = 5)
  - Visit E: `5 + 1 = 6 < 10` → Update `dist[E] = 6`, `parent[E] = D`
  - Q = `[ (E, 6) ]`
- Extract: E (dist = 6)
  - Q = `[ ]` (empty)
Final State:
| Node | dist | parent |
|---|---|---|
| A | 0 | NIL |
| B | 2 | A |
| C | 4 | B |
| D | 5 | C |
| E | 6 | D |
Extraction Order: A, B, C, D, E
Exercise 3: A* vs. Dijkstra
Now, use the same graph from the previous exercise. We want to find a path from A to E. We are given the following admissible heuristic values $h(v)$:
| Node | $h(v)$ (Est. cost to E) |
|---|---|
| A | 5 |
| B | 4 |
| C | 2 |
| D | 1 |
| E | 0 |
Trace the A* algorithm. What is the $f(v) = g(v) + h(v)$ value for each node when it is extracted? Notice which nodes Dijkstra expanded that A* did not need to (or vice-versa).
Solution
- `g[v]` is cost-from-start (same as `dist` in Dijkstra).
- `f[v]` is the priority `g[v] + h(v)`.
Priority Queue (OPEN): `[ (A, f=5) ]` (g=0, h=5)
- Extract: A (g=0, f=5)
  - Visit B: `g[B] = 2`. `f[B] = g[B] + h(B) = 2 + 4 = 6`. Add (B, 6) to Q.
  - Visit C: `g[C] = 5`. `f[C] = g[C] + h(C) = 5 + 2 = 7`. Add (C, 7) to Q.
  - Q: `[ (B, 6), (C, 7) ]`
- Extract: B (g=2, f=6)
  - Visit C: `g = 2 + 2 = 4`. This is `< 5`. Update `g[C] = 4`, `f[C] = 4 + 2 = 6`. DECREASE-KEY(C, 6).
  - Visit D: `g = 2 + 6 = 8`. `g[D] = 8`, `f[D] = 8 + 1 = 9`. Add (D, 9) to Q.
  - Q: `[ (C, 6), (D, 9) ]`
- Extract: C (g=4, f=6)
  - Visit D: `g = 4 + 1 = 5`. This is `< 8`. Update `g[D] = 5`, `f[D] = 5 + 1 = 6`. DECREASE-KEY(D, 6).
  - Q: `[ (D, 6) ]`
- Extract: D (g=5, f=6)
  - Visit E: `g = 5 + 1 = 6`. This is `< 10`. Update `g[E] = 6`, `f[E] = 6 + 0 = 6`. DECREASE-KEY(E, 6).
  - Q: `[ (E, 6) ]`
- Extract: E (g=6, f=6)
  - Goal Reached! Return path E $\to$ D $\to$ C $\to$ B $\to$ A.
Observation: The final path and costs are identical. The order of exploration was A, B, C, D, E. In this specific case, the heuristic was good enough to guide the search along the optimal path, but it didn’t save any expansions because the “cheapest-first” path (A-B-C-D-E) also happened to be the one Dijkstra would have explored.
Traversal Cheat Sheet
The following table summarizes the core properties of the traversal algorithms introduced so far.
| Algorithm | Works on weighted? | Finds shortest path? | Typical data structure | Time complexity (unweighted) |
|---|---|---|---|---|
| BFS | No (unweighted only) | Yes, fewest edges | Queue | $\mathcal{O}(V + E)$ |
| DFS | Yes (ignores weights) | No (arbitrary path) | Stack (or recursion) | $\mathcal{O}(V + E)$ |
| Dijkstra | Yes (positive weights) | Yes, minimum total cost | Priority queue | $\mathcal{O}((V + E)\log V)$ |
| A* | Yes (positive weights) | Yes (with admissible heuristic) | Priority queue | $\mathcal{O}((V + E)\log V)$ in practice, often fewer expansions |
Rule of thumb.
- Use BFS if all edges have equal cost and you only care about fewest steps.
- Use Dijkstra if edges have different positive costs and no heuristic is available.
- Use A* if you have a good heuristic and want to guide the search.
- Use DFS when you care about exploring structure (components, cycles) rather than finding shortest paths.
Together, BFS, DFS, Dijkstra, and A* illustrate a progression from uninformed to informed search. As we refine our knowledge about the environment, such as edge costs and heuristic estimates, the algorithms become increasingly efficient while still guaranteeing correctness under appropriate conditions.
Credits:
This course page was created by Hanka Goralija, EPFL under the supervision of Prof. Aude Billard, and funded by IEEE RAS and EPFL.
References
- Dijkstra, E. W. (1959). A note on two problems in connexion with graphs. Numerische Mathematik, 1(1), 269–271.
- Hart, P. E., Nilsson, N. J., & Raphael, B. (1968). A Formal Basis for the Heuristic Determination of Minimum Cost Paths. IEEE Transactions on Systems Science and Cybernetics, 4(2), 100–107.
- Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2022). Introduction to Algorithms (4th ed.). MIT Press.