Arrays and Strings
Operations, complexity, common patterns.
Array Address Calculation
For a 1-D array A declared with lower bound L, the address of A[i] is:
Address(A[i]) = Base + (i - L) * w
where Base is the address of A[L] and w is the size of each element in bytes. In C, L = 0, so Address(A[i]) = Base + iw. Memory aid: 'count the gap from the lower bound, scale by element size.' Always confirm whether the question gives you the address of the FIRST element or of element[0]; if A starts at index L, you must subtract L. The total bytes occupied by an array of n elements is nw. These formulas underpin all higher-dimensional addressing, so memorize the (index - lowerbound)*width pattern.
For a 2-D array A[L1..U1][L2..U2] with element size w, let rows R = U1-L1+1 and columns C = U2-L2+1.
Row-major (C, C++): Address(A[i][j]) = Base + ((i-L1)*C + (j-L2))*w
Column-major (Fortran): Address(A[i][j]) = Base + ((j-L2)*R + (i-L1))*w
Memory aid: row-major walks across a row first, so multiply the row offset by the number of COLUMNS; column-major walks down a column first, so multiply the column offset by the number of ROWS. Mistaking R for C is the single most common GATE error. Always compute counts (R, C) before plugging in, and respect non-zero lower bounds by subtracting them.
Array A[1..10][1..15], each element 4 bytes, Base = 1000, row-major. Find Address(A[4][6]).
Here L1=1, L2=1, columns C = 15-1+1 = 15, w = 4.
Address = 1000 + ((4-1)*15 + (6-1))*4
= 1000 + (45 + 5)4
= 1000 + 504 = 1000 + 200 = 1200.
Shortcut check: A[4][6] is in row 4. Rows 1-3 are fully before it = 3*15 = 45 elements, plus 5 elements before column 6 in row 4 = 50 elements total, times 4 bytes = 200 offset. Adding Base gives 1200. Always count 'full rows before + elements before in current row.'
Array Operations and Complexity
Access by index: O(1) (random access via address arithmetic). Search (unsorted): O(n) linear scan; (sorted): O(log n) with binary search. Insertion/deletion at the END of a dynamic array: O(1) amortized; at an arbitrary position: O(n) because elements must shift. Inserting at the BEGINNING is worst case—every element shifts, O(n). Memory aid: 'arrays are cheap to read, expensive to reshape.' Updating an element in place is O(1). The fixed-size limitation forces O(n) resizing in static arrays. Contrast with linked lists, which give O(1) insert/delete at a known node but O(n) access. GATE frequently tests this access-vs-modify tradeoff.
A dynamic array (e.g., C++ vector, Java ArrayList) doubles its capacity when full. Copying n elements costs O(n), but this happens rarely. Summing the cost of n appends, total copy work is n + n/2 + n/4 + ... < 2n, so the amortized cost per append is O(1). Memory aid: 'doubling makes the rare expensive copy pay for many cheap appends.' If instead you grew by a CONSTANT amount each time, total work would be O(n^2) and amortized O(n)—much worse. This geometric-growth amortization is a favorite GATE/interview topic; remember the < 2n bound from the geometric series.
To reverse an array in place, swap A[i] with A[n-1-i] for i from 0 to n/2 - 1. This uses O(1) extra space and O(n) time (n/2 swaps). Example for n=5 (indices 0..4): swap (0,4), swap (1,3); index 2 is the middle and stays. Memory aid: 'two pointers walking inward, stop when they meet or cross.' A common bug is looping i to n instead of n/2, which reverses twice and yields the original array. Reversal is the building block for rotation: rotate-left by k = reverse first k, reverse rest, reverse whole.
String Algorithms and Pattern Matching
Naive string matching slides the pattern (length m) over the text (length n) one position at a time, re-comparing from scratch on each mismatch: worst case O(nm). KMP (Knuth-Morris-Pratt) precomputes a failure/LPS (Longest Proper Prefix which is also Suffix) array in O(m), then scans the text once without backtracking, giving O(n + m) total. Memory aid: 'KMP never re-reads a text character.' The LPS array tells how far to shift the pattern on a mismatch by reusing already-matched prefix information. KMP space is O(m) for the LPS array. Rabin-Karp uses hashing for average O(n+m) but worst-case O(nm) due to hash collisions.
LPS[i] = length of the longest proper prefix of pattern[0..i] that is also a suffix of pattern[0..i]. 'Proper' means it cannot be the whole substring. Example, pattern = 'ABABACA':
A -> 0
AB -> 0
ABA -> 1 (prefix 'A' = suffix 'A')
ABAB -> 2 ('AB')
ABABA -> 3 ('ABA')
ABABAC -> 0 (no match)
ABABACA -> 1 ('A')
So LPS = [0,0,1,2,3,0,1]. Memory aid: 'match the longest border.' On a text mismatch after matching j characters, jump the pattern index to LPS[j-1] instead of 0. Building LPS is O(m) using two pointers.
Concatenating two strings of lengths a and b: O(a+b). Comparing two strings: O(min length) worst case. Reversing a string: O(n). Checking palindrome: O(n) two-pointer. Naive substring search: O(n*m). KMP / Z-algorithm: O(n+m). Computing all character frequencies: O(n) with a fixed-size count array (O(1) if alphabet size is constant, e.g., 256 ASCII). Sorting characters of a string: O(n log n) comparison sort, or O(n + k) counting sort for fixed alphabet k. Memory aid: 'linear scans for most one-pass tasks, n log n only when sorting by comparison.' In C, string length via strlen is O(n) because it scans to the null terminator.
Sparse Matrices and Special Arrays
A sparse matrix has most entries zero. Storing it fully wastes space, so we use a triplet (3-tuple) representation: each non-zero element is stored as (row, column, value). The table typically has a header row recording (total rows, total columns, number of non-zeros). For a matrix with t non-zero elements, storage is (t+1) rows x 3 columns. Memory aid: 'one row per non-zero, plus one header.' This saves space only when t is small relative to mn; the break-even point is roughly when 3(t+1) < m*n. Other formats include CSR (Compressed Sparse Row) and CSC, which compress further by storing row pointers.
A lower-triangular n x n matrix has only i >= j entries non-zero, totaling n(n+1)/2 elements; storing these in a 1-D array saves ~half the space. Row-major index of A[i][j] (1-based, i>=j) in the packed array = i(i-1)/2 + (j-1). Memory aid: 'rows above contribute 1+2+...+(i-1) = i(i-1)/2 elements.' An upper-triangular matrix is symmetric to this. A symmetric matrix also needs only n(n+1)/2 stored values since A[i][j]=A[j][i]. A tridiagonal (banded) matrix has at most 3n-2 non-zero elements (main diagonal n, plus two adjacent diagonals of n-1 each).
Store a 1-based lower-triangular matrix row by row in a 1-D array. Find the array index (0-based) of A[4][2].
Elements before row 4 = rows 1,2,3 have 1+2+3 = 6 elements = 34/2 using i(i-1)/2 with i=4 -> 43/2 = 6.
Within row 4, A[4][2] is the 2nd element, offset (j-1) = 1.
Index = 6 + 1 = 7 (0-based), i.e., the 8th stored element.
Formula check: i(i-1)/2 + (j-1) = 4*3/2 + (2-1) = 6 + 1 = 7. Memory aid: 'triangular numbers count the rows above, then add the column offset within the current row.'