[Alg] Divide and Conquer

Design
Step 1: Divide into (>=2)의 smaller problems (instances)
Step 2: Solve each instances
How to solve the step 2?
(1) Recursively : divide and conquer
: concise, natural, clear, 기계적 해법
(2) Iteratively, Sequentially (if the subprogram is small)
: faster, save memory space,(system stack)
--> recursion depth, stack depth : log(n) + 1
Step 3 : Combine the subsolutions

1. Divide and Conquer Search (binary search)

if x == s[mid], return mid
step 1: Divide S into 2 subarrays
if x < s[mid], return left array
otherwise, return right array
step 2: Conquer(Solve) the chosen array
step 3: Combine(Obtain) the solution(s) (Optional, not in binary search)

// Example
// sorted array S : [10, 12, 13, 14, 18, 20, 25, 27, 30, 35, 40]
// problem : find x, locate x in S (searching) (x=18)
// 10, 12, 13, 14, 18, 20, 25, 27, 30, 35, 40 (s[mid]=20)
// 10, 12, 13, 14, 18 (s[mid]=13)
// 14, 18(s[mid]=14)
// 18 --> yes, find the index

//Detailed version
D&C_Search(s, x, low, high):
    if low > high, then return nil
    Else
        mid = (low + high)/2
        if x == s[mid], then return mid
        else if x < s[mid], then return D&C_Search(…, low, mid-1)
        else                    return D&C_Search(…, mid+1, high)
        
        
        
d&c_a(original problem size)
...
divide (orig -> left, right)
solve d&c_a(left information)
solve d&c_a(right information)
combine(left + right) // optional
...

--> Top-Down approaches, Naturla design

Time Complexity : Mathematical analysis is required (수학적 사고)

1. Best-case : O(1)
2. Worst-case : 1/2 size decreasing -> O(logn)

TC = # of basic operations
T(n) = Divide + Conquer + Combine
= ? + ? + 0
= 1 + T(n/2) + 0 (searching using divide and conquer)
= 1 + 1 + T(n/4)
T(n) = T(n/4) + 2
= T(n/8) + 3 = ...
= T(n/2^k) + k (suppose n = 2^k)
= T(n/n) + log(n)
= T(1) + log(n)
= 1 + log(n)
<= O(log(n))

2. Divide and Conquer Sort (merge sort, 2-way merge sort)

Design
D&C_Sort (original array)
Step 1: divide into (>=2)의 subarrays
Step 2: Solve(Conquer) each subarray - sort left, right
Step 3: Combine the subsolutions - merge left, right
* divide and conquer : natural, top-down, systematic + termination conditions

// Example : insight from examples

s = [27, 10, 12, 20, 25, 13, 15, 22], n = 8
step 1: Divide -> [27, 10, 12, 20] [25, 13, 15, 22]
                  Left             Right
step 2 : Solve/Sort -> Left(10 12 20 27), Right(13 15 22 25)
         --> do Not care about the details
step 3: Combine/Merge -> 10 12 13 15 20 22 25 27
        ex) left(10, 27), right(12, 20)
            10, 12, 20, 17 : n-1번 비교! (3번, 마지막 원소는 비교x)
            --> n/2 + n/2 - 1



d&c_sort(S[1...n] : array)
--------------------------
if (n>1) // depending on your coding style
copy s[1] ... s[mid] to left array  // divide
copy s[mid+1] ... s[n] to right array  // divide
d&c_sort(left array) // conquer
d&c_sort(right array)  // conquer
combine/merge(left array, right array) // combine
--------------------------



--------------------------
if (n>1) // depending on your coding style
d&c_sort(S[1]...S[mid]) // conquer
d&c_sort(S[mid+1]...S[n])  // conquer
combine/merge(left array, right array) // combine
--------------------------
* copy하는 과정을 생략하고, (S[1]...S[mid])와 (S[mid+1]...S[n])를
  그대로 파라미터로 전달하는 방법을 택할 시에 divide 과정을 생략할 수 있으므로
  divide의 basic operations의 개수는 0이 될 수 있다

Time Complexity : T(n), O(n), big-O, worst-case, upper bound
T(n) = divide complexity + conquer complexity + combine
= ? + T(n/2) + T(n/2) + combine
= ? + 2*T(n/2) + n/2 + n/2 - 1
= 0(basic operations) + 2T(n/2) + (n-1)
= 2T(n/2) + n - 1 (recurrence equation)
<= 2T(n/2) + n
= 2(2T(n/4) + n/2) + n = 4T(n/4) + 2n
= 8T(n/8) + 3n = ...
= 2^k * T(n/2^k) + k*n (trick : n = 2^k)
= n * T(n/n) + log(n) * n
= n * T(1) + n * log(n) = n*0 + nlog(n) -> O(nlog(n))
1개를 원소로 갖는 배열을 정렬 시키는데 드는 시간은 0

Quiz : k-way merge sort에서 optimal한 k는??
2-way merge sort가 3-way merge sort보다 우월한가??, 10-way merge sort ??
- recursion depth: logₖ(n) (k가 크면 깊이는 얕아짐)
- comparing cost: 병합 시 k개의 리스트에서 최솟값을 찾는 데 O(log k) (heap 이용)
- O(n logₖ(n) × log(k)) = O(n log n) (big-O는 동일)
전체 원소 수 n개를 한 번씩은 비교해야 함: n, 비교하는데 heap 자료구조 사용: log(k)
recursion depth: log_k(n)
따라서 이론적으로는 곱하면 똑같지만,
k가 너무 크면 병합 시, 힙이 복잡해지고 k가 너무 작으면 recursion depth가 깊어져 함수 호출 오버헤드

[2.5강]
MIT 6.042J <Mathematics for Computer Science>
Lecture 1: Proof
Lecture 2: induction
Lecture 3: strong induction
Lecture 14: devide and conquer recurrence equations

[2.6강]
Thingking Habits in CS
C : claim
E : evidence, 증거, 증명, 입증 자료 -> 수학적 표현
R : Reasoning, 결론, 추론적 나의 주장, support

3. Quick Sort

Q : Divide and Conquer에서 Combining Step 없이 Sorting 가능한가? → D&C (without Step 3/Combining)
Ex. 15, 22, 13, 27, 12, 10, 20, 25
→ (left) (right) : divide (step1) 중요!
(left) < (right) : (left) < (pivot: 기준 - random) < (right)
→ (left part : sort) (right part : sort) : conquer (step2)
→ END

pivot(15) => S[n]
(left) 15 (right)
→ divide / partitioning algorithm
(13 12 10) 15 (22 27 20 25)
→ conquer (not detailed)
(10 12 13) 15 (20 22 25 27)

D#C_Sort(low, high: array index) → Quick Sort
{
	if(high > low) // stopping condition, termination condition
    {
    	Divide(low, high, pivotIndex)
        D&C_Sort(low, pivotIndex-1)  // left
        D&C_Sort(pivotIndex+1, high)  // right
    }
}

Time Complexity

(1) Best case TC

T(n)
= Divide + Conquer + Combine
= (n-1) + T(n/2) + T(n/2) + 0
= (n-1) + 2T(n/2)
<= 2T(n/2) + n
= 2^k * T(n/2^k) + kn (2^k=n)
= nT(1) + logn * n = n*0 + n(logn) = O(nlogn)

(2) Worst case TC

T(n)
= Divide + Conquer + Combine
= (n-1) + T(left) + T(right) + 0
= (n-1) + T(n-1)
= T(n-1) + n-1
= T(n-2) + (n-2) + (n-1)
= T(n-3) + (n-3) + (n-2) + (n-1)
= n(n-1)/2 <= O(n^2)

[3.3강]

Recurrences(회귀)

<Recurrence(s) equations>
1. Describes a sequence of numbers
- Early terms(numbers) are specified explicitly
- Later terms are calculated(expressed) a function of their predecessors
- Ex. T(1) = 1, T(n) = T(n-1) + 1
2. Reduces a big problem into smaller problems until easy cases (terms) are reached (progressively)
3. Useful and powerful to analyze the performance of recursive algorithms → Time complexity analysis

(Devide and Conquer != Recurrence)
Two big classes of recurrences in CS

1. Linear recurrence
이전 항들의 linear combination으로 현재 항을 정의하는 recurrence(점화식)

: later terms = linear combination of early terms
: linear algebra
<Fibonacci recurrence>
f(0)=1, f(1)=1
f(n) = f(n-1) + f(n-2) : linear recurrence f(n) is a linear combination

Time Complexity
T(n) = T(n-1) + T(n-2) → O(2^n) : not tight bound
O(1.6180^n / sqrt(5)) : tight upper bound → exponential time complexity

2. D&C recurrence
문제를 하위 문제로 divide하고, 각각을 conquer(solve)하고 합쳐서 풀 때 생기는 recurrece

<Merge sort recurrence>
T(1) = 0
T(n) = 2T(n/2) + n -1 : not a linear combination of preceding terms
→ NOT linear recurrence → D&C recurrence
→ T(n) is a function of T(n/2)

Time Complexity
T(n) = 2T(n/2) + n -1 → O(nlogn)

Observations -> Guidelines:
1. Generating smaller problems is more important to algorithmic speed(efficiency)
than reducing the extra steps per recursive calls.
2. More generally, linear recurrences (have big subproblems) typically have Exponential solution(extremely inefficient),
while D&C recurrences (have small subproblems) usually have solutions bounded by a polynomial TC.
3. If the subproblems are only a fraction(분수) of the original size,
then the solution is typically bounded by a polynomial. → D&C design is good.
4. Even worse, Fibonacci recurrences have subproblems "overlapped."

5. How to deal with Fibonacci?
We know that the recurrence can be implemented recursively as well as iteratively.
→ f0, f1, f2, f3, f4, f5.
→ New design is introduced!! (4주차 Dynamic Programming)

[3.5강]
MIT-6.042-Lec15 : Linear Recurrences
1. General Math. Definitions and general solutions.
2. Two general solving techniques
   1) guess-and-verify
   2) plug-and-chug.
   ==> applicable to every recurrence.
3. Homogeneous (inhomogeneous) linear recurrence
   f(n) = a1*f(n-1) + a2*f(n-2) + ... + ad*f(n-d) (+ g(n))
4. Ex: Fibonacci recurrence -> solution.
Ex : Towers of Hanoi : f(1) = 1, f(n) = 2f(n-1) + 1

[3.6강] Advanced Analysis

Merge Sort와 Quick Sort 모두 평균적인 시간 복잡도가 O(n log n)이지만,
실제로는 퀵소트가 대부분의 경우에서 더 빠르게 동작

1. 상수 계수 차이
Merge Sort는 추가적인 메모리를 사용하면서 배열을 복사하는 작업이 필요
Quick Sort는 보통 제자리 정렬되므로, 메모리 복사에 따른 오버헤드가 적다

2. 캐시 효율성

Quick Sort는 평균적으로 빠르고 메모리 사용도 적기 때문 더 자주 사용
다만, 최악의 경우 시간 복잡도가 O(n²)까지 갈 수 있다는 점에서, Merge Sort가 안정적인 선택이 될 수 있다

티스토리툴바