Asymptotic Analysis
Big-O, Big-Θ, Big-Ω, recurrence relations, master theorem.
Big-O, Θ, Ω notation
Definitions, common pitfalls.
Why asymptotic? Algorithm performance depends on input size. We care about behavior as n → ∞ — not constants, lower-order terms, or small n.
THREE MAIN NOTATIONS:
Big O (Upper bound):
f(n) = O(g(n)) ↔ ∃ c, n₀ > 0 such that f(n) ≤ c·g(n) for all n ≥ n₀.
- Worst case behavior.
- "f grows AT MOST as fast as g."
Big Omega (Lower bound):
f(n) = Ω(g(n)) ↔ ∃ c, n₀ > 0 such that f(n) ≥ c·g(n) for all n ≥ n₀.
- Best case bound (in this sense).
- "f grows AT LEAST as fast as g."
Big Theta (Tight bound):
f(n) = Θ(g(n)) ↔ f = O(g) AND f = Ω(g).
- Exact growth rate.
- "f grows AT THE SAME RATE as g."
SMALL O AND SMALL OMEGA:
- Small o: f(n) = o(g(n)) means f grows STRICTLY slower than g (ratio → 0).
- Small omega: f(n) = ω(g(n)) means f grows STRICTLY faster (ratio → ∞).
So O = "at most" (could be tight); o = "strictly less."
COMMON GROWTH RATES (slow → fast):
| Class | Name | Example |
|---|---|---|
| O(1) | constant | Hash lookup |
| O(log log n) | log-log | (rare) |
| O(log n) | logarithmic | Binary search |
| O(√n) | sqrt | Naive primality testing up to √n |
| O(n) | linear | Single pass through array |
| O(n log n) | linearithmic | Merge sort, heap sort |
| O(n²) | quadratic | Bubble, insertion, selection sort |
| O(n³) | cubic | Floyd-Warshall, naive matrix mult |
| O(n^k) | polynomial | for fixed k |
| O(2ⁿ) | exponential | TSP brute force, subset enumeration |
| O(n!) | factorial | TSP via permutations, brute force |
ALGEBRA OF O-NOTATION:
- Sum rule: O(f) + O(g) = O(max(f, g)). E.g., O(n²) + O(n) = O(n²).
- Product rule: O(f) × O(g) = O(f·g). E.g., O(n) × O(log n) = O(n log n).
- Constants: O(c · f) = O(f) for any constant c > 0.
- Lower order absorbed: O(n² + n + 100) = O(n²).
- Log base irrelevant: O(log₂ n) = O(log₁₀ n) (= O(log n)).
- Exponents differ: O(2ⁿ) ≠ O(3ⁿ).
COMMON RECURRENCES & SOLUTIONS:
| Recurrence | Algorithm | Solution |
|---|---|---|
| T(n) = T(n−1) + 1 | Linear recursion | O(n) |
| T(n) = T(n−1) + n | Insertion sort | O(n²) |
| T(n) = T(n/2) + 1 | Binary search | O(log n) |
| T(n) = T(n/2) + n | Quickselect avg | O(n) |
| T(n) = 2T(n/2) + 1 | (rare) | O(n) |
| T(n) = 2T(n/2) + n | Merge sort | O(n log n) |
| T(n) = 2T(n/2) + n log n | — | O(n log² n) |
| T(n) = 2T(n/2) + n² | — | O(n²) |
| T(n) = 4T(n/2) + n | — | O(n²) |
| T(n) = 7T(n/2) + n² | Strassen | O(n^log₂ 7) ≈ O(n^2.807) |
(Use Master Theorem when applicable.)
TIME vs SPACE COMPLEXITY:
- Time: number of basic operations.
- Space: auxiliary memory used.
- Quick sort: O(n log n) time avg, O(log n) space (recursion stack).
- Merge sort: O(n log n) time, O(n) space.
WORST / AVERAGE / BEST CASE:
| Algo | Best | Avg | Worst |
|---|---|---|---|
| Linear search | 1 | n/2 | n |
| Binary search | 1 | log n | log n |
| Bubble sort | n | n² | n² |
| Insertion sort | n | n² | n² |
| Quick sort | n log n | n log n | n² |
| Merge sort | n log n | n log n | n log n |
| Hashing (good hash) | 1 | 1 | n |
Average case assumes some distribution of inputs (often uniform).
AMORTIZED ANALYSIS:
Some operations are expensive but happen rarely. Average over a sequence of operations.
Example: dynamic array (e.g., Python list). Append is O(1) amortized, even though occasional resize is O(n). The "blame" is spread over many cheap appends.
Methods: aggregate, accounting, potential.
COMPARISONS:
For large n, what's faster?
- Constant vs anything > constant → constant wins.
- log n vs n → log n wins.
- n vs n log n → linear wins.
- n^k vs 2ⁿ → polynomial wins (eventually).
- 2ⁿ vs n! → exponential wins (eventually); n! beats anything polynomial.
Real-world impact:
- For n = 10⁶: n² = 10¹² (way too slow); n log n = 2×10⁷ (fast).
- For n = 30: 2ⁿ ≈ 10⁹ (slow but doable); for n = 50: 2ⁿ = 10¹⁵ (too slow).
EXAM HOOKS:
- f(n) = O(g(n)) means f ≤ c·g eventually.
- Constant factors and lower-order terms ignored in asymptotic analysis.
- log n = O(n) but n ≠ O(log n).
- 2ⁿ ≠ O(2^(n/2)) (since 2^(n/2) is much slower).
- n! grows faster than 2ⁿ (by Stirling's).
- "Tight bound" = Θ. "Upper bound" = O.
Asymptotic Notations and Definitions
Asymptotic notations describe growth of f(n) as n approaches infinity.
- Big-O (upper bound): f(n)=O(g(n)) if there exist c>0, n0>0 such that 0 <= f(n) <= c.g(n) for all n>=n0.
- Big-Omega (lower bound): f(n)=Omega(g(n)) if 0 <= c.g(n) <= f(n).
- Theta (tight bound): f(n)=Theta(g(n)) iff f(n)=O(g(n)) AND f(n)=Omega(g(n)).
- little-o (strict upper): f(n)=o(g(n)) if for ALL c>0 there is n0 with f(n) < c.g(n); equivalently lim f/g = 0.
- little-omega (strict lower): lim f/g = infinity.
Memory aid: O/Omega are like <= and >=; o/omega are like < and >; Theta is =. The constant c must work for ALL n>=n0, but you only need ONE c for O/Omega and EVERY c for o/omega.
The fastest way to compare two functions f and g is the limit L = lim(n->inf) f(n)/g(n).
- L = 0 => f = o(g), so also f = O(g) but NOT Theta(g).
- L = infinity => f = omega(g), so f = Omega(g) but NOT Theta(g).
- L = constant c (0 < c < inf) => f = Theta(g) (tight bound both ways).
When the limit oscillates (e.g., f(n) = n^(1+sin n)), the limit test fails and you must reason directly from the definition. Use L'Hopital's rule or take logs for tricky ratios like (log n)^k vs n^e or n! vs 2^n. Remember Stirling: n! ~ sqrt(2.pi.n).(n/e)^n, which gives log(n!) = Theta(n log n).
Key facts examiners exploit:
- f(n) = O(g(n)) does NOT imply g(n) = O(f(n)). O is not symmetric.
- Theta IS an equivalence relation (reflexive, symmetric, transitive); O and Omega are only reflexive and transitive (partial orders).
- 2^(2n) is NOT O(2^n): ratio = 2^n -> infinity. Constants in exponents matter!
- log(n!) = Theta(n log n), not Theta(n).
- n^0.001 grows FASTER than (log n)^1000 eventually — any positive power of n beats any power of log n.
- max(f,g) = Theta(f+g) for nonnegative functions.
Memory aid for ordering: 'constants < log n < n^c (c<1) < n < n log n < n^2 < n^c (c>1) < 2^n < 3^n < n! < n^n'.
Master theorem
Three cases for T(n) = aT(n/b) + f(n).
For a recurrence of the form T(n) = a·T(n/b) + f(n) where a ≥ 1, b > 1:
Compare f(n) with n^(log_b a) (call this critical polynomial). Three cases:
Case 1. If f(n) = O(n^(log_b a − ε)) for some ε > 0 → T(n) = Θ(n^(log_b a))
The recursion dominates; work is concentrated at leaves.
Case 2. If f(n) = Θ(n^(log_b a) · log^k n), k ≥ 0 → T(n) = Θ(n^(log_b a) · log^(k+1) n)
Balanced; recursion and combine work are comparable.
Case 3. If f(n) = Ω(n^(log_b a + ε)) AND a·f(n/b) ≤ c·f(n) for some c < 1 (regularity) → T(n) = Θ(f(n))
The combine step dominates.
Worked examples:
Merge sort: T(n) = 2T(n/2) + n. a=2, b=2 → n^(log₂ 2) = n. f(n) = n. Case 2 with k=0 → T(n) = Θ(n log n).
Binary search: T(n) = T(n/2) + 1. a=1, b=2 → n^(log₂ 1) = n^0 = 1. f(n) = 1 = Θ(1). Case 2 with k=0 → T(n) = Θ(log n).
Karatsuba: T(n) = 3T(n/2) + n. a=3, b=2 → n^(log₂ 3) ≈ n^1.585. f(n) = n. Case 1 → T(n) = Θ(n^1.585).
When master theorem doesn't apply:
- a is not a constant (e.g. T(n) = n·T(n/2))
- f(n) is not polynomially bounded vs n^(log_b a)
- Regularity condition fails in Case 3
For these, use substitution, recursion tree, or Akra-Bazzi.
Why asymptotic? Algorithm performance depends on input size. We care about behavior as n → ∞ — not constants, lower-order terms, or small n.
THREE MAIN NOTATIONS:
Big O (Upper bound):
f(n) = O(g(n)) ↔ ∃ c, n₀ > 0 such that f(n) ≤ c·g(n) for all n ≥ n₀.
- Worst case behavior.
- "f grows AT MOST as fast as g."
Big Omega (Lower bound):
f(n) = Ω(g(n)) ↔ ∃ c, n₀ > 0 such that f(n) ≥ c·g(n) for all n ≥ n₀.
- Best case bound (in this sense).
- "f grows AT LEAST as fast as g."
Big Theta (Tight bound):
f(n) = Θ(g(n)) ↔ f = O(g) AND f = Ω(g).
- Exact growth rate.
- "f grows AT THE SAME RATE as g."
SMALL O AND SMALL OMEGA:
- Small o: f(n) = o(g(n)) means f grows STRICTLY slower than g (ratio → 0).
- Small omega: f(n) = ω(g(n)) means f grows STRICTLY faster (ratio → ∞).
So O = "at most" (could be tight); o = "strictly less."
COMMON GROWTH RATES (slow → fast):
| Class | Name | Example |
|---|---|---|
| O(1) | constant | Hash lookup |
| O(log log n) | log-log | (rare) |
| O(log n) | logarithmic | Binary search |
| O(√n) | sqrt | Naive primality testing up to √n |
| O(n) | linear | Single pass through array |
| O(n log n) | linearithmic | Merge sort, heap sort |
| O(n²) | quadratic | Bubble, insertion, selection sort |
| O(n³) | cubic | Floyd-Warshall, naive matrix mult |
| O(n^k) | polynomial | for fixed k |
| O(2ⁿ) | exponential | TSP brute force, subset enumeration |
| O(n!) | factorial | TSP via permutations, brute force |
ALGEBRA OF O-NOTATION:
- Sum rule: O(f) + O(g) = O(max(f, g)). E.g., O(n²) + O(n) = O(n²).
- Product rule: O(f) × O(g) = O(f·g). E.g., O(n) × O(log n) = O(n log n).
- Constants: O(c · f) = O(f) for any constant c > 0.
- Lower order absorbed: O(n² + n + 100) = O(n²).
- Log base irrelevant: O(log₂ n) = O(log₁₀ n) (= O(log n)).
- Exponents differ: O(2ⁿ) ≠ O(3ⁿ).
COMMON RECURRENCES & SOLUTIONS:
| Recurrence | Algorithm | Solution |
|---|---|---|
| T(n) = T(n−1) + 1 | Linear recursion | O(n) |
| T(n) = T(n−1) + n | Insertion sort | O(n²) |
| T(n) = T(n/2) + 1 | Binary search | O(log n) |
| T(n) = T(n/2) + n | Quickselect avg | O(n) |
| T(n) = 2T(n/2) + 1 | (rare) | O(n) |
| T(n) = 2T(n/2) + n | Merge sort | O(n log n) |
| T(n) = 2T(n/2) + n log n | — | O(n log² n) |
| T(n) = 2T(n/2) + n² | — | O(n²) |
| T(n) = 4T(n/2) + n | — | O(n²) |
| T(n) = 7T(n/2) + n² | Strassen | O(n^log₂ 7) ≈ O(n^2.807) |
(Use Master Theorem when applicable.)
TIME vs SPACE COMPLEXITY:
- Time: number of basic operations.
- Space: auxiliary memory used.
- Quick sort: O(n log n) time avg, O(log n) space (recursion stack).
- Merge sort: O(n log n) time, O(n) space.
WORST / AVERAGE / BEST CASE:
| Algo | Best | Avg | Worst |
|---|---|---|---|
| Linear search | 1 | n/2 | n |
| Binary search | 1 | log n | log n |
| Bubble sort | n | n² | n² |
| Insertion sort | n | n² | n² |
| Quick sort | n log n | n log n | n² |
| Merge sort | n log n | n log n | n log n |
| Hashing (good hash) | 1 | 1 | n |
Average case assumes some distribution of inputs (often uniform).
AMORTIZED ANALYSIS:
Some operations are expensive but happen rarely. Average over a sequence of operations.
Example: dynamic array (e.g., Python list). Append is O(1) amortized, even though occasional resize is O(n). The "blame" is spread over many cheap appends.
Methods: aggregate, accounting, potential.
COMPARISONS:
For large n, what's faster?
- Constant vs anything > constant → constant wins.
- log n vs n → log n wins.
- n vs n log n → linear wins.
- n^k vs 2ⁿ → polynomial wins (eventually).
- 2ⁿ vs n! → exponential wins (eventually); n! beats anything polynomial.
Real-world impact:
- For n = 10⁶: n² = 10¹² (way too slow); n log n = 2×10⁷ (fast).
- For n = 30: 2ⁿ ≈ 10⁹ (slow but doable); for n = 50: 2ⁿ = 10¹⁵ (too slow).
EXAM HOOKS:
- f(n) = O(g(n)) means f ≤ c·g eventually.
- Constant factors and lower-order terms ignored in asymptotic analysis.
- log n = O(n) but n ≠ O(log n).
- 2ⁿ ≠ O(2^(n/2)) (since 2^(n/2) is much slower).
- n! grows faster than 2ⁿ (by Stirling's).
- "Tight bound" = Θ. "Upper bound" = O.
Recurrence Relations and Master Theorem
For T(n) = a.T(n/b) + f(n) with a>=1, b>1, compare f(n) with n^(log_b a) (the 'watershed').
- Case 1: if f(n) = O(n^(log_b a - e)) for some e>0, then T(n) = Theta(n^(log_b a)). (Leaves dominate.)
- Case 2: if f(n) = Theta(n^(log_b a)), then T(n) = Theta(n^(log_b a) . log n).
- Case 3: if f(n) = Omega(n^(log_b a + e)) AND regularity a.f(n/b) <= c.f(n) for c<1, then T(n) = Theta(f(n)). (Root dominates.)
Shortcut: compute c_crit = log_b a. Compare exponent of f. If f is polynomially smaller -> Case 1; equal (within log factors) -> Case 2; polynomially larger -> Case 3.
Lock these in for instant recall:
- T(n) = 2T(n/2) + n => Theta(n log n) [Merge sort, Case 2]
- T(n) = 2T(n/2) + 1 => Theta(n) [Case 1, log_b a = 1]
- T(n) = T(n/2) + 1 => Theta(log n) [Binary search]
- T(n) = T(n/2) + n => Theta(n) [Case 3]
- T(n) = 2T(n/2) + n log n => Theta(n log^2 n) [Master fails; use extended/tree]
- T(n) = 7T(n/2) + n^2 => Theta(n^(log2 7)) ~ Theta(n^2.81) [Strassen]
- T(n) = T(n-1) + 1 => Theta(n); T(n) = T(n-1) + n => Theta(n^2)
- T(n) = 2T(n-1) + 1 => Theta(2^n).
Master Theorem does NOT apply to subtract-and-conquer (n-1 form) or non-polynomial f gaps.
When Master Theorem fails (e.g., gap is not polynomial, or a/b are not constants), use:
- Recursion tree: sum work per level. Levels = log_b n. Work at level i = a^i . f(n/b^i). Sum the geometric/arithmetic series. If total dominated by root -> Theta(f(n)); by leaves -> Theta(n^(log_b a)); if equal per level -> multiply by number of levels (log factor).
- Substitution: guess the bound, prove by induction. Useful for T(n)=2T(n/2)+n which guesses cn log n.
Key trap: T(n) = 2T(n/2) + n/log n falls in the Master Theorem GAP between Case 1 and 2 (f is smaller than n by only a log factor, not polynomially). Recursion tree gives Theta(n log log n).
Algorithm Complexity Analysis from Code
Count operations by summing iteration counts.
- Independent nested loops multiply: two loops each n times -> Theta(n^2).
- Dependent loops (inner depends on outer): for i=1..n, for j=1..i -> sum i = n(n+1)/2 = Theta(n^2).
- Geometric/halving loops: while(i<n) i*=2 -> Theta(log n). i goes 1,2,4,... up to n.
- Loop with i*=2 inside an outer n-loop -> Theta(n log n).
Shortcut for j growing multiplicatively: number of iterations = log_base(limit). For 'for j=1; j<=n; j=j*k' the count is log_k n.
Watch for break/return that change worst vs best case. Always state which case (best/average/worst) is being analyzed.
Memorize these closed forms used in loop analysis:
- 1 + 2 + ... + n = n(n+1)/2 = Theta(n^2)
- 1^2 + 2^2 + ... + n^2 = n(n+1)(2n+1)/6 = Theta(n^3)
- 1 + 2 + 4 + ... + 2^k = 2^(k+1) - 1 = Theta(2^k)
- Harmonic: 1 + 1/2 + 1/3 + ... + 1/n = Theta(log n) (H_n ~ ln n + 0.577)
- sum_{i=1}^{n} n/i = n.H_n = Theta(n log n)
- Geometric sum with ratio r>1: dominated by last term Theta(r^n).
The harmonic sum is a frequent trap: a loop doing n/i work per outer iteration totals Theta(n log n), not Theta(n^2).
Analyze:
for (i=1; i<=n; i++)
for (j=1; j<=n; j=j*2)
for (k=1; k<=j; k++) op;
Outer i: n iterations. Middle j: takes values 1,2,4,...,n => log n values. Inner k runs j times. For fixed i, inner total = 1+2+4+...+n = 2n - 1 = Theta(n). So per outer iteration work = Theta(n), times n outer iterations = Theta(n^2).
Key insight: the inner two loops together do Theta(n) work (geometric sum dominated by largest term n), NOT Theta(n log n). The middle loop's log n values mislead; you must sum the actual inner work, which is geometric.
Best, Average, Worst Case and Growth Comparison
Distinguish best (Omega-ish minimum), average (expected), and worst (maximum) inputs:
- Linear search: best O(1), avg O(n), worst O(n).
- Binary search: best O(1), avg/worst O(log n).
- Quicksort: best/avg O(n log n), WORST O(n^2) (already sorted with poor pivot).
- Merge sort: O(n log n) in ALL cases (input-independent).
- Insertion sort: best O(n) (sorted), avg/worst O(n^2).
- Bubble sort: best O(n) (with flag, sorted), worst O(n^2).
- Heap sort: O(n log n) all cases. Build-heap is O(n).
- BST search: avg O(log n), worst O(n) (skewed).
Memory aid: Merge & Heap sort are 'guaranteed' n log n; Quicksort is fast on average but quadratic worst case.
Big-O is NOT the same as worst case. Big-O/Omega/Theta are bounds on a FUNCTION; best/average/worst describe which INPUT you analyze. You can state Big-O of the best case (e.g., insertion sort best case is Theta(n), which is also O(n) and O(n^2)).
Thus 'worst case is O(n^2)' and 'best case is Omega(n)' are both valid. A bound and a case are orthogonal concepts.
Common GATE trap: 'The worst-case running time of merge sort is O(n^2)' is TECHNICALLY TRUE (since O is an upper bound, and n log n <= c.n^2), even though Theta is n log n. Read whether the question asks for tight (Theta) or just an upper bound (O).
Comparison-based sort lower bound: Omega(n log n) (decision-tree argument: n! leaves need log(n!)=Theta(n log n) height).
Non-comparison sorts beat this: Counting sort O(n+k), Radix sort O(d(n+k)), Bucket sort avg O(n).
Space: Merge sort O(n) auxiliary; Quicksort O(log n) stack (avg); Heap sort O(1) in-place.
Selection sort: O(n^2) all cases, O(n) swaps.
Key numericals: minimum comparisons to find both min and max in n elements = ceil(3n/2) - 2. Minimum comparisons to find the 2nd smallest = n + ceil(log2 n) - 2. Merging two sorted lists of sizes m and n needs m+n-1 comparisons worst case.