Overview
CPSC 221 is the core data structures course: the difference between a solution that works and one that scales. Understanding why certain structures are fast, and when they degrade, is the skill that separates strong engineers from weak ones.
Complexity Analysis
Big-O Notation
Big-O describes how runtime grows as input size grows. We care about the dominant term:
| Complexity | Name | Example |
|---|---|---|
| O(1) | Constant | Array index access |
| O(log n) | Logarithmic | Binary search |
| O(n) | Linear | Linear scan |
| O(n log n) | Linearithmic | Merge sort |
| O(n²) | Quadratic | Bubble sort |
| O(2ⁿ) | Exponential | Naive recursion |
Trees
Binary Search Tree (BST)
Property: For every node, all left descendants < node < all right descendants.
```cpp
#include <iostream>
using namespace std;

struct Node {
    int val;
    Node* left;
    Node* right;
};

// In-order traversal (left, node, right) gives sorted output on a BST
void inorder(Node* root) {
    if (!root) return;
    inorder(root->left);
    cout << root->val << " ";
    inorder(root->right);
}
```
- Search / Insert / Delete: O(log n) average, O(n) worst (unbalanced)
AVL Trees
Self-balancing BST. Maintains the invariant: |height(left) - height(right)| ≤ 1. Rotations (single/double) restore balance after insertions. All operations guaranteed O(log n).
Hash Tables
Store key-value pairs with O(1) average access.
Hash function maps keys to indices. Collisions are handled by:
- Chaining — each bucket holds a linked list
- Open addressing — probe for the next empty slot (linear, quadratic, double hashing)
Load factor α = n / m (items / buckets). Keep α < 0.7 for good performance.
```cpp
// Simple hash for integers. Note: in C++, % can return a negative value
// for negative keys, so fold the result back into [0, tableSize).
int hash(int key, int tableSize) {
    return ((key % tableSize) + tableSize) % tableSize;
}
```
Graphs
A graph G = (V, E) with vertices V and edges E.
- Directed vs. undirected
- Weighted vs. unweighted
- Representations: adjacency matrix O(V²) space, adjacency list O(V+E) space
BFS (Breadth-First Search)
Explores level by level. Uses a queue. Finds shortest path in unweighted graphs.
```cpp
#include <iostream>
#include <queue>
#include <vector>
using namespace std;

void bfs(int start, vector<vector<int>>& adj) {
    vector<bool> visited(adj.size(), false);
    queue<int> q;
    q.push(start);
    visited[start] = true;
    while (!q.empty()) {
        int v = q.front(); q.pop();
        cout << v << " ";
        for (int u : adj[v]) {
            if (!visited[u]) {
                visited[u] = true;
                q.push(u);
            }
        }
    }
}
```
DFS (Depth-First Search)
Explores as far as possible before backtracking. Uses a stack (or recursion). Good for cycle detection, topological sort.
Sorting Algorithms
| Algorithm | Best | Average | Worst | Stable? |
|---|---|---|---|---|
| Bubble sort | O(n) | O(n²) | O(n²) | Yes |
| Merge sort | O(n log n) | O(n log n) | O(n log n) | Yes |
| Quick sort | O(n log n) | O(n log n) | O(n²) | No |
| Heap sort | O(n log n) | O(n log n) | O(n log n) | No |
Merge sort divides the array, sorts each half, then merges — classic divide-and-conquer.
Quick sort picks a pivot, partitions elements around it, recurses. Fast in practice (good cache behavior) but O(n²) worst case with bad pivot choice.
Key Takeaways
- Choose the right structure first. Hash table for lookups, BST for ordered data, heap for priority queues.
- Amortized cost matters. Dynamic arrays are O(1) amortized even though resizing is O(n).
- Balance is everything in trees. An unbalanced BST is just a slow linked list.
- Graph problems are everywhere. Social networks, routing, dependency resolution, scheduling — all graph problems.