Overview
In this unit, we will introduce graph theory—the final novel—and perhaps an unfamiliar mathematical topic. Graph theory is super useful in modelling ideas. We use graph theory in many ways in computer science, either for re-stating problems in a way that computers can understand, or to analyse and count quantities that we care about.
The goal of this unit is to get you familiarised with basic graph terminology and basic theorems about graphs. So in this unit, there will be far less proving, and a lot more of just building mathematical vocabulary. This is mostly because graphs are quite application focused in computer science, and mostly interesting in their own right from a mathematical perspective. Thus, actually talking about anything interesting should be done from the perspective of the topics using them, like algorithms, or natural language processing, or AI.
Contents:
- Motivation for this unit
- Basic definitions in graph theory
- Useful theorems in graph theory
- Bonus: The pigeonhole principle
Part 0: Why Graph Theory?
Modelling Problems
Have you ever wondered how Google Maps knows how to route you from point A to point B?

How does it take a system like the MRT map, and figure out how to get you from, say, Clementi station, to Kent Ridge station?
Or how about in video games: how do non-playing characters know how to move towards you? Here’s an example video showing pathfinding in action in a video game.
How about your network packets? How does your computer know how to send their requests to the various servers that it does? Sure, you could say “I type in ‘google.com’ and my computer figures it out for me”. But someone had to design that solution, and it comes by first modelling a real-life problem as a graph.
Graph theory is a great way to model problems, models that computers can understand, and models that we can build solutions on.
Analysing Data Structures
The other reason, as you will see when you come around to algorithms and data structures, is that graph theory helps us understand basic data structures too! So the combinatorial aspect of data structures is useful as well.
Part 1: Basic Definitions
Disclaimer
As we move away from formal logic and into the realm of graph theory, we shall be less concerned with the pedantry and precision of using formal logic in our descriptions, and focus more on the higher-level concepts and the properties of these objects.
The implication is that occasionally, the descriptions we use may not be 100% rigorous, but hopefully they are sufficient to bring the point across. Nonetheless, if you have any questions regarding the content, please feel free to drop your question on the Canvas forum :)
Graph
Definition: Graphs and subgraphs
Let be a set of vertices/nodes, and let be a set of edges connecting pairs of vertices (i.e., .
We define a graph to be a pair of sets and vertices.
We also define to be a subgraph of if and .
For example, we might have a set of four vertices. These vertices might represent locations (like MRT stations) or network nodes (like routers).
An edge is a pair, like , which indicates a connection from to . So here’s an example edge set . Pictorially, it should look like this:
In fact, it’s very common to look at graphs pictorially to get a better sense of what is going on.
Common Types of Graphs
Graphs come in many shapes and forms, and we would like to refer to special cases of them. Let us go through some examples. But in summary, here are four types of graphs that we can have:
- Undirected graph: Edges have no direction
- Directed graph (a.k.a. digraph): Edges have a direction (represented as arrows)
- Simple graph: A graph without loops or multiple edges between the same vertices
- Multigraph: Allows multiple edges between two vertices
Undirected graph
Undirected graphs are when the edge connections are two-way. More formally, for any pair of vertices , if both and are in , then we can call an undirected graph.
For example, with , we can have edges . Here, when we draw undirected graphs, we omit the arrowheads.
Directed graph
Directed graphs are when the edge connections are one-way.
For example, with , we can have edges . Here, when we draw directed graphs, we include the arrowheads. Now, take note that we can still have two-way connections like between and , but to do so, we need to have two edges: one from to , and one from to .
Simple graph
Simple graphs are where there are: (i) no self-loops (i.e., a vertex pointing back to itself), and (ii) no duplicate edges.
For example, with , we can have edges . Notice we cannot have the to edge, because that is a self-loop.
Multigraph
Multigraphs are graphs that can have duplicate edges, or self-loops.
So for example, with , we can have edges . Notice here we have duplicate edges from to , and we have a self-loop to .
More Terminology in Graph Theory
Degree
Definition: Degree of a vertex
The degree of a vertex is the number of times appears as an endpoint of some edge in .
For example:
For undirected graphs, we do not double-count each edge. Hence, in the example above, we say that has a degree of (not , even though the edge connecting and is technically represented as two pairs, and ).
Also, has degree , has degree , and has degree . It might be a little unintuitive why node has degree . But that is because appears as an endpoint twice! Even if it’s on the same edge.
In a directed graph on the other hand, there is an in-degree and an out-degree.
Node has out-degree , has out-degree , has out-degree , and has out-degree . On the other hand, every node in this example, has in-degree .
Path
Definition: Paths
A path is a sequence of distinct vertices (i.e., a vertex cannot appear twice) where each consecutive pair of vertices is connected by an edge.
For example, a path is something like a sequence .
Connectivity
Definition: Connectivity
We say that two vertices are connected if there exists a path between them.
We say that a graph is connected if all pair of vertices are connected.
Based on the previous example, we know that and are connected. We can similarly argue that is connected to or , and so on.
Connected components
Definition: Connected components
In an undirected graph , a connected component of is a maximal set of vertices such that all pairs of vertices in that set are connected.
Here, we say that a set of vertices is maximal if for any , there exists two vertices in the set that are not connected.
In an undirected graph, a connected component is a maximal set of vertices such that all pairs of vertices is connected by a path.
So again, using the previous example, we know that is a connected component. If we used any smaller set of vertices like , then it is not maximal—we could have added even more nodes and it would still be connected. On the other hand, something like is not a connected component because there exists pairs of vertices that are not connected.
Cycle
A cycle is a path where the starting vertex is also the ending vertex and no edges or vertices are repeated (other than the starting and ending vertex).
So again, using the previous example, we know that is a cycle; only the first and last vertex in our path has repeated.
Special Types of Graphs
Let’s shift our attention to these three special kinds of graphs: trees, bipartite graphs and complete graphs.
Tree
Definition: Trees
A graph is a tree if it is connected and has no cycles (acyclic).
Alternatively, a tree is a graph where:
- All vertices are connected to each other
So here’s an example:

If you want, count the nodes and the edges—you’ll realise that we have nodes, and exactly edges. Furthermore, all the nodes can reach each other.
Typically (but not always), we will also specify what the root of the tree is. In the drawn example above, is our root node. Now that we have established a root node, we can start talking about the other important names have for the special case of trees.
When heading “away” from the root, we say that we “traverse from parents to children”. For example, we can head “away” from by moving from node to node , so we call the parent of . Similarly, this means that is the child of . In our diagram, has two children: and . You might want to think about how every node only has at most parent, and never or more.
Also, when we reach a node that has no children, that node is called a leaf. So in this tree, we have leaves: , , and . Any node that has children is called an internal node. So here, are internal nodes.
Lastly, if every node in a given tree has at most children, we say the tree is -ary. For example, a tree where each node has at most children is called a -ary or binary tree. Hence, the tree in our example is a -ary tree.
Bipartite graph
A graph is bipartite if we can take its vertices and divide them into two disjoint sets and , such that every edge connects a vertex across the two sets and , and never within the sets themselves. So let’s do an example:
The graph on the left is bipartite. Why? Put nodes and in set , and nodes , and in set . Then the only edges are crossing between set and . To make it obvious, we can shift the nodes around a little bit so that the nodes in are on the left, and nodes in are on the right.
Also if you’re curious, could we have done and instead? The answer is yes!
What about this graph instead? Is it bipartite?
You might notice no matter how hard you try to form the sets and , there’s always an edge that is either within set or within set . Why can’t we do it? The idea is that this an odd length cycle—if a graph every contains an odd length cycle, it is not bipartite. If the graph does not contain any odd length cycles (i.e. we can never make one), then it is bipartite.
Complete graph
Lastly, let’s talk about complete graphs.
Definition: Complete graphs
An undirected graph is complete if every pair of nodes has an edge between them.
The complete graph with nodes is sometimes denoted .
For five nodes, this is what the complete graph looks like.
On the other hand, this is what the complete graph looks like.
Complement graphs
On the topic of complete graphs, we can now talk about the complement of a graph.
Definition: Complement of a graph
Given a graph , the complement of (denoted ) is the graph that contains the same set of nodes , but if is an edge in , then it is not an edge in . Likewise, if was not an edge in , then it is an edge in .
Here’s an example:
here has nodes, and notice that since is an edge in , and are not connected in . Similarly, since and are not connected in , they are connected in .
One thing you might want to take note is that if we take the union of the edge set from and the edge set from , we get back the complete graph.
Part 2: Theorems and Proofs in Graph Theory
Okay, let’s do a few theorems about graphs. I have chosen the three most useful ones that I know.
Theorem 1: Handshake Lemma
Theorem 1: Handshake Lemma
The sum of the degrees of all vertices of an undirected graph is twice the number of edges. In other words:
Let’s see an example of this:
Notice that we have edges, and . Wow! But why does it work?
Intuition
Each edge contributes exactly to the total sum—once for adding to the degree of node , then once again adding to the degree of node . Thus, this process increments our sum twice, so each edge contributes a total of to the sum.
Since each edge contributes , this means the sum is .
Theorem 2: Number of Edges in a Complete Graph
Theorem 2: Number of edges in
The number of edges in a complete graph with vertices is:
So again you might have noticed, for example, a graph with nodes has edges.
There are two ways to prove this fact for a general graph:
Intuition 1 (Using combinatorics)
If you think about it, every pair of vertices that we can form from the set makes an edge. This means that the number of edges is the same as the number of possible pairings. This happens to be:
Short and sweet!
Intuition 2 (Using the handshake lemma)
There is another way that actually uses what we’ve learned here today. Think about it this way, in a complete graph with vertices, every node has exactly other nodes that it is connected to. Because of that, every node has degree exactly . So, via the handshake lemma:
Dividing both sides by , we get:
Theorem 3: Number of Nodes in a -ary Tree
Lastly, this fact is super useful (combinatorially) in computer science:
Theorem 3: Number of nodes in a -ary tree
Given a graph that is a -ary rooted tree of height , then:
- The minimum number of nodes in the tree is .
- The maximum number of nodes in the tree is .
Here’s an example with a -ary (or ternary) tree of height . At the minimum, we can have nodes. And at the maximum, we can have nodes. But what about in general? Here’s the idea:
Intuition
We need nodes at least to go from a root to a leaf node that is hops away. So the minimum is .
To hit the maximum, every internal node will have all children, and every leaf node again, must have children. So the total nodes we have are:
(This comes from the formula for the sum of the geometric progression with a common ratio of and initial value .)
But what about in general for a -ary tree? Well just swap out the for a :
Part 2.1(?): Pigeonhole Principle
Lastly, let’s talk about the pigeonhole principle. This is a little bit of an oddball topic but it’s very common to also cover it during combinatorics and graphs. It would be remiss if we did not at least mention it. The pigeonhole principle feels like it’s stating the obvious, but it can be used to great effect.
Simply put:
Pigeonhole principle
If we have pigeons, and pigeonholes, and each pigeon wants to nest in a pigeonhole, then there exists at least one pigeonhole that has at least pigeons that are nesting in it.
This sounds a little obvious, for example with pigeons and holes, there is definitely a hole with at least pigeons. But here’s an example statement we could possible argue with it.
There are at least people with the same number of strands of hair on their head in Singapore.
Why is this true? Think about it this way: there are around million people in Singapore, and the average human has around strands of hair on their head. Let’s assume (reasonably) that it’s probably impossible for a human to have a million strands of hair.
So we make million possible pigeonholes (for each possible hair count), and we let each person be a pigeon. Then, we assign a person to the pigeonhole if they have strands of hair on their head. Since there are million people but only million pigeonholes, there must be at least people who have the same number of strands of hair on their head!