Kruskal’s Minimum Spanning Tree

What It Solves & When

Build a minimum spanning tree by adding cheapest safe edges globally — when does this edge-centric view beat growing a tree?

Core Idea: Sort Edges, Add If Safe

Process edges in non-decreasing weight. Why is the cheapest edge that doesn’t form a cycle always safe to add?

Why Sorted Order Makes Greedy Work

Tie this to the cut property: when you pick the next-cheapest non-cycling edge, which cut is it the minimum crosser of?

Union-Find as the Engine

How does a disjoint-set structure answer “are these two endpoints already connected?” in near-constant time?

Find with Path Compression

What does find return, and how does compression flatten the tree on the way up to the root?

Union by Rank / Size

Why attach the smaller tree under the larger? What worst-case tree height does this prevent?

The Cut & Cycle Properties

Which property justifies including a light crossing edge, and which justifies rejecting an edge inside a component?

Detecting Cycles Without Searching

Why does “both endpoints share a root” mean adding the edge would create a cycle?

Stopping Early

How many edges does a spanning tree on V vertices have, and how does that let you halt before exhausting the sorted list?

Variants

Pre-sorted or radix-sorted edges; filter-Kruskal; building a spanning forest on disconnected graphs — what changes?

vs Prim’s

Kruskal’s loves sparse graphs and pre-sorted edges; Prim’s loves dense ones. What drives the choice?

Time Complexity

Two phases: sorting the edges, then a near-linear sweep with union-find. Which dominates?

State the sort cost in terms of E, and explain why the union-find work over all edges is effectively linear (the inverse-Ackermann factor). Why is \\( \log E \\) interchangeable with \\( \log V \\) here?

Space Complexity

What do the edge array and the union-find parent[]/rank[] arrays cost in terms of V and E?

Union-Find Footprint

Why is the disjoint-set overhead only linear in V, regardless of edge count?

Implementation Walkthrough

Plan the code before writing it.

Edge List & Union-Find Setup

How do you build the sortable edge list, and initialize parent[] and rank[] so each vertex starts in its own set?

Sorting & The Main Sweep

After sorting by weight, sketch the loop: for each edge, decide include-or-skip. What is the accept condition?

The Key Step: Union on Accept

When endpoints have different roots, what two things happen — record the edge, and merge the sets how?

Building the Result

How do you accumulate MST edges and total weight, and detect that the graph was disconnected?

Real Uses

Clustering (cut the k-1 heaviest MST edges), circuit/road design, image segmentation — where does the global edge view help?

Pitfalls

Skipping path compression/union-by-rank and getting slow finds; not handling disconnected graphs; unstable tie handling — what bites you?

Implementation

// implement from scratch here

Summary

Recap the whole topic in a few lines — the core idea, the key complexities and trade-offs, and when to reach for it. The cheat-sheet you’d skim before an exam or interview.

My Notes

Fill this in as you learn

Data Structures and Algorithms