Computer Science/Data Structures

Binary heap

metamong 2023. 2. 6.

๐Ÿ‘จ‍๐Ÿซ ์šฐ์„ ์ˆœ์œ„ ํ๋Š” max-heap ๋˜๋Š” min-heap์„ ์‚ฌ์šฉํ•ด ์šฐ์„ ์ˆœ์œ„์— ๋งž๋Š” ๋ฐ์ดํ„ฐ๋ฅผ ๋ฝ‘์„ ์ˆ˜ ์žˆ๋Š” ์ž๋ฃŒ๊ตฌ์กฐ์ž„์„ ๊ณต๋ถ€ํ–ˆ๋‹ค. (์•„๋ž˜ ํฌ์ŠคํŒ… ์ฐธ์กฐ)

 

 

priority queue - (binary) heap tree

1. concepts> โ˜๏ธ ์šฐ์„ ์ˆœ์œ„ ํ = '์šฐ์„ ์ˆœ์œ„๊ฐ€ ๊ฐ€์žฅ ๋†’์€ ๋ฐ์ดํ„ฐ๋ฅผ ๊ฐ€์žฅ ๋จผ์ € ์‚ญ์ œํ•˜๋Š” ์ž๋ฃŒ๊ตฌ์กฐ' โ˜๏ธ ๋ฐ์ดํ„ฐ๋ฅผ ์šฐ์„ ์ˆœ์œ„์— ๋”ฐ๋ผ ์ฒ˜๋ฆฌํ•˜๊ณ  ์‹ถ์„ ๋•Œ ์‚ฌ์šฉ ex) ๋ฌผ๊ฑด ๋ฐ์ดํ„ฐ๋ฅผ ์ž๋ฃŒ๊ตฌ์กฐ์— ๋„ฃ์—ˆ๋‹ค๊ฐ€ ๊ฐ€

sh-avid-learner.tistory.com

 

โ˜๏ธ priority queue๋Š” ์ผ์ข…์˜ Abstract Data Structure(ADT)๋กœ, ์ถ”์ƒํ™”๋œ ๊ฐœ๋…์„ heap์ด๋ผ๋Š” ์ž๋ฃŒ๊ตฌ์กฐ๋กœ implementํ•  ์ˆ˜ ์žˆ๋‹ค.

(์šฐ์„ ์ˆœ์œ„๊ฐ€ ์ตœ๋Œ€ ๋จผ์ €, ๋˜๋Š” ์ตœ์†Œ ๋จผ์ €์ผ ๊ฒฝ์šฐ heap์€ ํŠธ๋ฆฌ ๊ตฌ์กฐ๋ฅผ ๋งŒ๋“ค์–ด ์ด์˜ ์šฐ์„ ์ˆœ์œ„์— ๋งž๊ฒŒ ๊บผ๋‚ด์ค€๋‹ค)

 

๐Ÿ‘จ‍๐Ÿซ ๋”ฐ๋ผ์„œ, ์ค‘์š” ์ž๋ฃŒ๊ตฌ์กฐ heap์— ๋Œ€ํ•œ ์ž์„ธํ•œ ๋ชจ๋“  ์ •๋ณด๋ฅผ ์ •ํ™•ํžˆ ์ •๋ฆฌํ•ด๋ด„


๐Ÿ‘จ‍๐Ÿซ ์ผ๋‹จ heap data structure๋Š” ์—ฌ๋Ÿฌ ํ˜•ํƒœ๋กœ ํ‘œํ˜„๋  ์ˆ˜ ์žˆ๋‹ค. ์ด ์ค‘ Binary heap์— ๋Œ€ํ•ด ์ž์„ธํžˆ ์•Œ์•„๋ณด๋ ค ํ•œ๋‹ค.

1. binary heap intro

๐Ÿ‘จ‍๐Ÿซ ๋ณดํ†ต ์œ„์—์„œ ์–ธ๊ธ‰ํ•œ priority queue๋ฅผ ๊ตฌํ˜„ํ•  ๋•Œ ๋งŽ์ด ์“ฐ์ž„

 

๐Ÿ‘จ‍๐Ÿซ binary tree ํ˜•ํƒœ์ด๋ฉฐ, ๊ทธ ์ค‘ complete binary tree ํ˜•ํƒœ๋ฅผ ๋ณด์ธ๋‹ค.

 

๐Ÿ‘จ‍๐Ÿซ binary tree๋Š” ๋‘ ๊ฐœ์˜ ์กฐ๊ฑด ํ•˜์— ์ •์˜๋œ๋‹ค. → shape property, heap property

* shape property - complete binary tree

๐Ÿ‘จ‍๐Ÿซ Complete Binary Tree(CBT; ์™„์ „ ์ด์ง„ ํŠธ๋ฆฌ)๋ž€, root ๋…ธ๋“œ๋ถ€ํ„ฐ ์‹œ์ž‘ํ•˜์—ฌ ์™ผ์ชฝ ์ž์‹ ๋…ธ๋“œ, ์˜ค๋ฅธ์ชฝ ์ž์‹ ๋…ธ๋“œ ์ˆœ์„œ๋Œ€๋กœ ๋ฐ์ดํ„ฐ๊ฐ€ ์ฐจ๋ก€๋Œ€๋กœ ์‚ฝ์ž…๋˜๋Š” tree๋ฅผ ๋œปํ•œ๋‹ค.

 

'all levels of the tree, except possibly the last one (deepest) are fully filled, and, if the last level of the tree is not complete, the nodes of that level are filled from left to right.'

 

๐Ÿ‘จ‍๐Ÿซ ์œ„ ๊ทธ๋ฆผ์ฒ˜๋Ÿผ ์™ผ์ชฝ๋ถ€ํ„ฐ ์ฐจ๋ก€๋Œ€๋กœ node์— ๋ฐ์ดํ„ฐ๊ฐ€ ์ฑ„์›Œ์ง€๊ณ , ๋งจ ์•„๋ž˜์ธต์„ ์ œ์™ธํ•œ ๋ชจ๋“  ์ธต์˜ node๊ฐ€ ์ฑ„์›Œ์ง„๋‹ค. ๋‹จ, ๋งˆ์ง€๋ง‰ ์ธต์€ ์™ผ์ชฝ๋ถ€ํ„ฐ node๊ฐ€ ์ฑ„์›Œ์ง„๋‹ค.

 

๐Ÿ‘จ‍๐Ÿซ Binary Tree ์ข…๋ฅ˜ ํฌ์ŠคํŒ… ์ฐธ์กฐ

 

Types of Binary Tree๐ŸŒฒ

๐ŸŒฒpriority queue ์ž๋ฃŒ๊ตฌ์กฐ ํฌ์ŠคํŒ…์—์„œ heap์€ ์ผ์ข…์˜ CBT(Complete Binary Tree) ํ˜•ํƒœ๋กœ ๊ตฌํ˜„๋œ๋‹ค๊ณ  ์–ธ๊ธ‰ํ•œ ์ ์ด ์žˆ๋‹ค. priority queue - heap 1. concepts> โ˜๏ธ ์šฐ์„ ์ˆœ์œ„ ํ = '์šฐ์„ ์ˆœ์œ„๊ฐ€ ๊ฐ€์žฅ ๋†’์€ ๋ฐ์ดํ„ฐ๋ฅผ ๊ฐ€์žฅ

sh-avid-learner.tistory.com

 

๐Ÿ‘จ‍๐Ÿซ ๋”ฐ๋ผ์„œ, ์•„๋ž˜ tree๋“ค ์ค‘ CBT๋Š” 1๋ฒˆ, 2๋ฒˆ, 5๋ฒˆ, 6๋ฒˆ์ด๋‹ค

 

๐Ÿ‘จ‍๐Ÿซ ๊ทธ๋ ‡๋‹ค๋ฉด, heap property์— ์˜ํ•ด ๋‚˜๋‰˜๋Š”, heap์˜ ๋‘ ์ข…๋ฅ˜ max-heap๊ณผ min-heap์— ๋Œ€ํ•ด ์•Œ์•„๋ณด์ž

* heap property - max-heap & min-heap

'Heap property: the key stored in each node is either greater than or equal to (≥) or less than or equal to (≤) the keys in the node's children, according to some total order.'

 

๐Ÿ‘จ‍๐Ÿซ max-heap: parent key๊ฐ€ child keys๋ณด๋‹ค ํฌ๊ฑฐ๋‚˜ ๊ฐ™์€ heap

 

๐Ÿ‘จ‍๐Ÿซ min-heap: parent key๊ฐ€ child keys๋ณด๋‹ค ์ž‘๊ฑฐ๋‚˜ ๊ฐ™์€ heap

 

→ ๋‹ค์‹œ ์ •๋ฆฌํ•˜๋ฉด,

โ˜๏ธ ์ตœ์†Œ heap(min-heap): root node๊ฐ€ ๊ฐ€์žฅ ์ž‘์€ ๊ฐ’์„ ๊ฐ€์ง€๋ฉฐ, ๊ฐ’์ด ์ž‘์€ ๋ฐ์ดํ„ฐ๊ฐ€ ์šฐ์„ ์ ์œผ๋กœ ์ œ๊ฑฐ๋œ๋‹ค.
(์ด๋Ÿด ๊ฒฝ์šฐ ๋งŒ๋“ค์–ด์ง„ ์ตœ์†Œ heap์— n๊ฐœ์˜ ๋ฐ์ดํ„ฐ๋งŒ ๋„ฃ์–ด๋„ ์ž๋™์ ์œผ๋กœ ์˜ค๋ฆ„์ฐจ์ˆœ ์ •๋ ฌ์ด ์™„์„ฑ๋œ๋‹ค)

โ˜๏ธ ์ตœ๋Œ€ heap(max-heap): root node๊ฐ€ ๊ฐ€์žฅ ํฐ ๊ฐ’์„ ๊ฐ€์ง€๋ฉฐ, ๊ฐ’์ด ํฐ ๋ฐ์ดํ„ฐ๊ฐ€ ์šฐ์„ ์ ์œผ๋กœ ์ œ๊ฑฐ๋œ๋‹ค.

 

โ€ป ๋”ฐ๋ผ์„œ, max-heap๊ณผ min-heap์€ ์šฐ์„ ์ˆœ์œ„ ํ๋ฅผ implementํ•  ๋•Œ, element๋ฅผ insertingํ•˜๊ฑฐ๋‚˜, ๊ฐ€์žฅ ์ž‘๊ฑฐ๋‚˜ ๊ฐ€์žฅ ํฐ element๋ฅผ ๊ฐ๊ฐ min-heap๊ณผ max-heap์—์„œ ๋นผ๋Š” ์ด ๋‘ ๊ฐ€์ง€ operation์—์„œ ํšจ์œจ์ ์ด๊ฒŒ ๋งŽ์ด ์“ฐ์ธ๋‹ค.

 

โ€ป ๋˜ํ•œ, heapsort sorting algorithm(in-place algorithm)์œผ๋กœ heap์ด ์“ฐ์ธ๋‹ค. ์„œ๋กœ node๋ผ๋ฆฌ ์ƒ๋Œ€์ ์œผ๋กœ ์œ„์น˜๋ฅผ ๋ฐ”๊พธ๋Š” ์•Œ๊ณ ๋ฆฌ์ฆ˜์ด๋ฏ€๋กœ in-place ์•Œ๊ณ ๋ฆฌ์ฆ˜์ด๋‹ค.

 

๐Ÿ‘จ‍๐Ÿซ ์˜ˆ์‹œ (์™ผ์ชฝ์€ max-heap, ์˜ค๋ฅธ์ชฝ์€ min-heap)

2. binary heap implementations

๐Ÿ‘จ‍๐Ÿซ heap ๊ตฌํ˜„์€linked list์™€ array ๋‘ ๊ตฌ์กฐ๋กœ ๊ตฌํ˜„์ด ๊ฐ€๋Šฅํ•œ๋ฐ, linked list๋Š” ๋งŽ์€ ๋ฉ”๋ชจ๋ฆฌ ์ฐจ์ง€,์ฃผ๋กœ array๋กœ ๊ตฌํ˜„๋œ๋‹ค. array์˜ index๋ฅผ ์ด์šฉํ•ด heap binary tree์— ์Œ“๋Š” ๊ณผ์ •์„ ์•Œ์•„๋ณด์ž

* child & parent's node (based on the node's index)

โ€ป ์ƒ์œ„ node์˜ index๊ฐ€ i๋ผ๋ฉด

→ root node์˜ index๊ฐ€ 0์ด๋ฉด, ํ•˜์œ„ left node์˜ index๋Š” 2i+1, ํ•˜์œ„ right node์˜ index๋Š” 2i+2

→ root node์˜ index๊ฐ€ 1์ด๋ฉด, ํ•˜์œ„ left node์˜ index๋Š” 2i, ํ•˜์œ„ right node์˜ index๋Š” 2i+1

 

→ ์œ„ array์—๋Š” 10, 30, 40, 90, 60 ์ˆœ์„œ๋Œ€๋กœ ๋“ค์–ด๊ฐ€ ์žˆ๋‹ค. root node๋ถ€ํ„ฐ ์ฐจ๋ก€๋กœ ์™ผ์ชฝ๋ถ€ํ„ฐ array์— ๋“ค์–ด๊ฐ€๋Š”๋ฐ, ๋งŒ์•ฝ ์ž์‹ node์˜ index๊ฐ€ x๋ผ๋ฉด

→ root node์˜ index๊ฐ€ 0์ผ ๊ฒฝ์šฐ parent node์˜ index๋Š” $\lfloor \cfrac{(x-1)}{2} \rfloor$

→ root node์˜ index๊ฐ€ 1์ผ ๊ฒฝ์šฐ parent node์˜ index๋Š” $\lfloor \cfrac{x}{2} \rfloor$)

 

๐Ÿ‘จ‍๐Ÿซ ์ฃผ์–ด์ง„ array์˜ ์›์†Œ๋“ค์„ array ๊ณต๊ฐ„์„ ๋‹ค์‹œ ์จ์„œ ํ•ด๋‹น ๊ณต๊ฐ„์—์„œ binary heap ํ˜•ํƒœ๋กœ ์ €์žฅํ•˜๋Š” heapsort ์•Œ๊ณ ๋ฆฌ์ฆ˜์œผ๋กœ ์“ฐ์ธ๋‹ค.

 

๐Ÿ‘จ‍๐Ÿซ ์œ„ ์™„์„ฑ๋œ binary heap๊ณผ array์™€์˜ index ์ˆœ์„œ, array element ์ˆœ์„œ๋ฅผ ์„œ๋กœ ์ž˜ ์‚ดํŽด๋ณด๋ฉฐ ๊ธฐ์–ตํ•˜์ž!

 

๐Ÿ‘จ‍๐Ÿซ heap์—์„œ ์ง„ํ–‰๋˜๋Š” ์—ฌ๋Ÿฌ ๊ณผ์ •์€ ์ผ๋‹จ max-heap์„ ๊ธฐ์ค€์œผ๋กœ ํ•œ๋‹ค.


* creating a heap

๐Ÿ‘จ‍๐Ÿซ ๊ทธ๋ ‡๋‹ค๋ฉด, heapsort์˜ core ์•Œ๊ณ ๋ฆฌ์ฆ˜์ธ, ์ฃผ์–ด์ง„ array๋ฅผ binary heap ํ˜•ํƒœ๋กœ ์ €์žฅํ•˜๋Š” ๊ณผ์ •์„ ํ•œ ๋‹จ๊ณ„์”ฉ ์ž์„ธํžˆ ์‚ดํŽด๋ณด์ž

 

* ์˜ˆ์‹œ) array [33, 35, 42, 10, 7, 8, 14, 19, 48]์ด ์ฃผ์–ด์กŒ๋‹ค๋ฉด, ๋งจ ์•ž์˜ ์›์†Œ๋ถ€ํ„ฐ ์‹œ์ž‘. child node๋กœ ๋“ค์–ด๊ฐˆ ๋•Œ, parent node์™€ ํฌ๊ธฐ๋ฅผ ๋น„๊ตํ•˜๊ณ , max-heap์ผ ๊ฒฝ์šฐ, parent node๋ณด๋‹ค ๋“ค์–ด๊ฐ€๋Š” ์›์†Œ๊ฐ€ ๋” ํฌ๋‹ค๋ฉด ์„œ๋กœ swap. ๋‹ค์‹œ ์ƒ์œ„ parent node์™€ ๋น„๊ต (min-heap์€ parent node๋ณด๋‹ค ์›์†Œ๊ฐ€ ๋” ์ž‘๋‹ค๋ฉด parent node์™€ swap)

- ๊ทธ ๊ฒฐ๊ณผ array [48, 42, 35, 33, 7, 8, 14, 10, 19]์ธ binary max-heap ์™„์„ฑ

 

- ์™„์„ฑ๋œ binary max-heap -

 

๐Ÿ‘จ‍๐Ÿซ ์œ„ ์˜ˆ์‹œ A [33, 35, 42, 10, 7, 8, 14, 19, 48]๊ฐ€ ์ฃผ์–ด์กŒ์„ ๋•Œ, ์ฃผ์–ด์ง„ array๋ฅผ binary heap์— ๋งž๊ฒŒ ๋ฐ”๊พธ๋Š” heapify๋ฅผ python์œผ๋กœ ๊ตฌํ˜„!

→ ์ถœ๋ ฅํ•˜๋ฉด A๊ฐ€ [48, 35, 42, 33, 7, 8, 14, 19, 10]๋กœ ๋ฐ”๋€๋‹ค.

 

def heapify(arr, n, parent_idx):
    largest = parent_idx
    left = parent_idx*2 + 1
    right = parent_idx*2 + 2

    if left < n and arr[left] > arr[largest]:
        largest = left
    
    if right < n and arr[right] > arr[largest]:
        largest = right
    
    if largest != parent_idx: #if changed
        arr[largest], arr[parent_idx] = arr[parent_idx], arr[largest] #swap
        heapify(arr, n, largest) #if swapped, continue with the lower part

arr = [33, 35, 42, 10, 7, 8, 14, 19, 48]

for i in range((len(arr)-1)//2,-1,-1): #heapify
    heapify(arr, len(arr), i)
    print(arr)

print(arr) #[48, 35, 42, 33, 7, 8, 14, 19, 10]

 

heapfiy() ์‹œ๊ฐ„ ๋ณต์žก๋„ O(N) (๊ฐ node๋ณ„ heapify()๋ฅผ ์—ฌ๋Ÿฌ ๋ฒˆ ์‹œํ–‰)

 

โ–ถ time complexity โ—€

 

→ ๋จผ์ €, ์ด n๊ฐœ์˜ node๊ฐ€ ์žˆ๋‹ค๊ณ  ํ•˜๊ณ , ๋ฐ”๋‹ฅ๋ถ€ํ„ฐ h=0, ๊ทธ ๋‹ค์Œ ์ธต h=1๊ณผ ๊ฐ™์€ ๋ฐฉ์‹์œผ๋กœ ์ง„ํ–‰๋œ๋‹ค๊ณ  ๊ฐ€์ •ํ•˜๋ฉด, h๋ผ๋Š” ์ธต์— ์กด์žฌํ•˜๋Š” ์ตœ๋Œ€ node ๊ฐœ์ˆ˜๋Š” $\cfrac{n}{2^{h+1}}$

 

→ last non-leaf node๋ถ€ํ„ฐ root-node๊นŒ์ง€ ๊ฐ๊ฐ max_heapify()๊ฐ€ ์ง„ํ–‰๋œ๋‹ค. ์ด ๋•Œ, max_heapify()์—์„œ swap์„ ์œ„ํ•œ comparison ํšŸ์ˆ˜๋Š” ๊ณง ์‹œ์ž‘ node์˜ height๋ฅผ ๋œปํ•œ๋‹ค. ๋”ฐ๋ผ์„œ ์šฐ๋ฆฌ๋Š” ์‹œ๊ฐ„๋ณต์žก๋„๋ฅผ h๋ผ๋Š” ์ธต์—์„œ $O(h) * \lceil\cfrac{n}{2^{h+1}}\rceil$๋กœ ๋‚˜ํƒ€๋‚ผ ์ˆ˜ ์žˆ๋‹ค.

(O(h)๋Š” ์‹œ์ž‘ node์˜ height - ์ฆ‰ ์ผ์ • ์ƒ์ˆ˜ $c*h$๋กœ ํ‘œํ˜„ ๊ฐ€๋Šฅ)

 

 ๊ฐ ๋ชจ๋“  ์ธต์˜ ์‹œ๊ฐ„๋ณต์žก๋„๋ฅผ ๊ตฌํ•˜๋ฉด, n๊ฐœ์˜ node์—์„œ ์ธต์˜ ๊ฐœ์ˆ˜๋Š” logn์ด๋ฏ€๋กœ $\sum_{0}^{log n}O(h) * \lceil\cfrac{n}{2^{h+1}}\rceil$

= $\cfrac{n*c}{2} \sum_{0}^{logn} \lceil\cfrac{n}{2^h}\rceil$

 

  logn์„ ∞๋กœ ๋ฐ”๊พธ๋ฉด, harmonic progression์— ์˜ํ•ด ์‹œ๊ทธ๋งˆ๊ฐ’์€ 2๊ฐ€ ๋œ๋‹ค.

= $\cfrac{n*c}{2} * 2$ = $c*n$

 

โ–ถ T(n) = O(n) โ—€

 

โ€ป ์•„๋ž˜ ์ฐธ์กฐ โ€ป


* deleting from heap → heapsort

๐Ÿ‘จ‍๐Ÿซ ์ด๋ฏธ ์™„์„ฑ๋œ binary heap์—์„œ ํ•œ ์›์†Œ๋ฅผ deleteํ•œ ๋‹ค๋Š” ๊ฒƒ์€, ๊ฐ€์žฅ ์ƒ์œ„์˜ root node๋ฅผ ๋นผ ์˜จ๋‹ค๋Š” ๋œป์ด๊ณ , max-heap์ผ ๊ฒฝ์šฐ ์›์†Œ์˜ ์ตœ๋Œ“๊ฐ’, min-heap์ผ ๊ฒฝ์šฐ ์›์†Œ์˜ ์ตœ์†Ÿ๊ฐ’์„ ๊ฐ€์ ธ์˜ค๊ฒŒ ๋œ๋‹ค.

 

๐Ÿ‘จ‍๐Ÿซ root- node๋ฅผ ๋นผ์˜ค๊ณ  ๋‚จ์€ ์ž๋ฆฌ์—๋Š”, bottom-most right-most element๋ฅผ ๋นผ์˜จ๋‹ค. (array๊ตฌ์กฐ์ƒ index๊ฐ€ ๊ฐ€์žฅ ํฐ, ๋งจ ์˜ค๋ฅธ์ชฝ์˜ ์›์†Œ). ์ด๋ฅผ root node์— ์˜ฌ๋ ค๋†“๋Š”๋‹ค.

 

*์˜ˆ์‹œ) ์œ„ ์˜ˆ์‹œ๋ฅผ ๊ทธ๋Œ€๋กœ ์‚ฌ์šฉํ•˜๋ฉด, 42๋ฅผ ๋นผ๊ณ , 10 ์›์†Œ๋ฅผ ๋งจ ์œ„์— ์˜ฌ๋ ค ๋†“๋Š”๋‹ค. ์ด ๋•Œ, ํ•ญ์ƒ max-heap ๊ตฌ์กฐ๋ฅผ ์œ ์ง€ํ•ด์•ผ ํ•˜๋ฏ€๋กœ, ์ƒˆ๋กœ ๋ฐ”๋€ root-node์™€ child-node๋ฅผ ๋น„๊ตํ•œ๋‹ค. swap์ด ์ง„ํ–‰๋˜๋ฉด, ์ดํ›„ ์•„๋ž˜ ๋ถ€๋ถ„๊นŒ์ง€ ๋Œ€์†Œ ๊ด€๊ณ„๊ฐ€ max-heap ๊ตฌ์กฐ์— ๋งž์ง€ ์•Š์œผ๋ฉด ๊ณ„์†ํ•ด์„œ swap ์ง„ํ–‰

 

- (์ขŒ) root node ๋นผ์˜ค๊ณ  ๋งˆ์ง€๋ง‰ ์›์†Œ๋ฅผ root node๋กœ ์˜ฌ๋ฆผ / (์šฐ) ์ตœ์ข…์ ์œผ๋กœ swap ๊ณผ์ •์ด ์ง„ํ–‰๋œ ํ›„ ๋‹ค์‹œ ์—…๋ฐ์ดํŠธ ๋œ max-heap -

- update๋œ max-heap์„ array ํ˜•ํƒœ๋กœ ํ‘œํ˜„ํ•˜๋ฉด [35, 33, 14, 19, 7, 8, 10]์ด ๋œ๋‹ค.

- ์ด์™€ ๊ฐ™์ด ๊ณ„์†ํ•ด์„œ root-node deletion์„ ์ง„ํ–‰ํ•˜๋ฉด ์ œ์ผ ํฐ ์ˆ˜๋ถ€ํ„ฐ ์ฐจ๋ก€๋Œ€๋กœ delete๋œ๋‹ค. ๋”ฐ๋ผ์„œ, delete๋œ ์›์†Œ๋ฅผ ์ฐจ๋ก€๋Œ€๋กœ ๋ชจ์œผ๋ฉด, sort๋œ ๊ฒฐ๊ณผ๊ฐ€ ๋‚˜์˜จ๋‹ค. ์ด sorting์„ heapsort๋ผ๊ณ  ๋ถ€๋ฅธ๋‹ค.

(์—ฌ๊ธฐ์„œ parent node์™€ ๋‘ child node ์ค‘ ์ œ์ผ ํฐ child์™€ swap์ด ์ง„ํ–‰๋œ๋‹ค.

left์™€ swap์ดํ›„ right์™€ ํ•œ ๋ฒˆ ๋” swap์ด ์ง„ํ–‰๋˜๋Š” ๊ฒŒ ์•„๋‹˜)

 

- ์•„๋ž˜ ์ฝ”๋“œ heapsort ์ฐธ์กฐ -

def max_heap(arr, n, parent_idx):
    largest = parent_idx
    left = parent_idx*2 + 1
    right = parent_idx*2 + 2

    if left < n and arr[left] > arr[largest]:
        largest = left
    
    if right < n and arr[right] > arr[largest]:
        largest = right
    
    if largest != parent_idx: #if changed
        arr[largest], arr[parent_idx] = arr[parent_idx], arr[largest] #swap
        max_heap(arr, n, largest) #if swapped, continue with the lower part

def heapsort(arr):
    
    n = len(arr)

    for i in range((n-1)//2,-1,-1): #first heapify
        max_heap(arr, n, i)

    for i in range(n-1,0,-1): #heapsort (until last single element remained)
        arr[i], arr[0] = arr[0], arr[i] #swap with the last and the first(max) element
        max_heap(arr,i,0) #after swapped, heapify (start with the first element)

arr = [33, 35, 42, 10, 7, 8, 14, 19, 48]
heapsort(arr)
print(arr) #[7, 8, 10, 14, 19, 33, 35, 42, 48]

 

โ‘  max_heap(): arr์™€ ์ „์ฒด arr ๊ธธ์ด, ํ˜„์žฌ parent_idx ์„ธ ๊ฐœ์˜ ์ธ์ž

→ ์ผ๋‹จ ํ˜„์žฌ largest index๋Š” parent_idx์ด๊ณ , ์ž์‹ node์˜ index๋Š” parent_index*2 +1๊ณผ parent_index*2+2

(IndexError ๊ฑฑ์ •ํ•  ํ•„์š” ์—†๋‹ค. max_heap์„ ๋Œ๋ฆด ๋•Œ non-leaf node ๊นŒ์ง€๋งŒ ๋Œ๋ฆฌ๊ธฐ ๋•Œ๋ฌธ)

 ์ž์‹ left node๊ฐ€ parent node๋ณด๋‹ค ํฌ๋‹ค๋ฉด, ๊ทธ๋ฆฌ๊ณ  left node์˜ index๊ฐ€ n๋ณด๋‹ค ์ž‘๋‹ค๋ฉด, swap ํ•„์š”ํ•˜๋ฏ€๋กœ index swap

 ์ดํ›„ ์ž์‹ right node์™€๋„ ๋™์ผํ•˜๊ฒŒ ์ง„ํ–‰

 ๋งŒ์•ฝ index swap์ด ์ง„ํ–‰๋˜์—ˆ๋‹ค๋ฉด(largest != parent_idx)

- arr[largest]์™€ arr[parent_idx]์™€์˜ swap ์ง„ํ–‰

- ๊ทธ๋ฆฌ๊ณ , ์ดํ›„ ์•„๋ž˜ ํ•˜์œ„ ๋…ธ๋“œ์™€๋„ ์ด๋ฅผ ๋ฐ˜๋ณตํ•ด์•ผ ํ•˜๋ฏ€๋กœ max_heap(arr, n, largest) ์ง„ํ–‰

(์ด ๋•Œ, parent_idx๋Š” largest์ด๋ฏ€๋กœ, largest๋ฅผ ์ธ์ž๋กœ ๋„ฃ์Œ)

 heapify() ์‹œ๊ฐ„ ๋ณต์žก๋„ O(logN)

 

โ‘ก heapsort()

 last non-leaf node๋ถ€ํ„ฐ(์ฆ‰, n๊ฐœ์˜ node์—์„œ index๊ฐ€ $\lfloor \cfrac{n}{2} - 1 \rfloor$์ธ node๋ถ€ํ„ฐ) root node๊นŒ์ง€ ์ฐจ๋ก€๋Œ€๋กœ heapify() ์ ๊ฒ€

(last non-leaf node๋ž€, ์ž์‹ ๋…ธ๋“œ๊ฐ€ 1๊ฐœ ์ด์ƒ ์žˆ๋Š” node ์ค‘ index๊ฐ€ ์ œ์ผ ํฐ, ์ฆ‰ array์—์„œ ๋งจ ๋์— ์žˆ๋Š” node๋ฅผ ๋œปํ•œ๋‹ค. ์ฆ‰ ์ž์‹์ด ์žˆ๋Š” ๋ ๋…ธ๋“œ๋ถ€ํ„ฐ root node๊นŒ์ง€ max-heap ๊ตฌํ˜„์„ ์œ„ํ•ด swap ์—ฌ๋ถ€๋ฅผ ์ž์‹๊ณผ ๊ณ„์† ์‹ค์‹œํ•œ๋‹ค๋Š” ๋œป) - for loop ๋Œ๋ฆผ

 ํ•œ ๋ฒˆ์˜ heapify() ์ง„ํ–‰ ํ›„, n๋ฒˆ์˜ heapify() ์ง„ํ–‰ (for loop ๋Œ๋ฆผ) - ์ฒซ๋ฒˆ์งธ ์›์†Œ์™€ ๋งจ ๋งˆ์ง€๋ง‰ ์›์†Œ swap & max_heap() ๋Œ๋ ค heapify() ์ง„ํ–‰

(ํ•œ ๊ฐœ ๋‚จ์„ ๋•Œ ๊นŒ์ง€ ์ง„ํ–‰ํ•˜๋ฏ€๋กœ, index n-1๋ถ€ํ„ฐ 1๊นŒ์ง€)

 

โ–ถ heapsort T(n) = O(nlogn) โ—€

 

→ ๊ฐ๊ฐ max node๋ฅผ ๋นผ์˜ค๊ณ  ๋งค ๋ฒˆ heapify()๋ฅผ ์ง„ํ–‰ํ•ด์ฃผ๋ฏ€๋กœ, heapify()์˜ ์‹œ๊ฐ„ ๋ณต์žก๋„๋Š” O(logn)์ธ ์ƒํƒœ์—์„œ, max node๋ฅผ ๋นผ์˜ค๋Š” ํšŸ์ˆ˜ n์ด ์ตœ์ข…์ ์œผ๋กœ ๊ณฑํ•ด์ ธ O(nlogn)์˜ ์‹œ๊ฐ„๋ณต์žก๋„๊ฐ€ ๋œ๋‹ค.


* inserting key into heap

๐Ÿ‘จ‍๐Ÿซ ์ด๋ฏธ ์™„์„ฑ๋œ heap์—์„œ ๋งจ ๋งˆ์ง€๋ง‰์— ์ƒˆ๋กœ์šด ์›์†Œ๋ฅผ insertionํ•  ์ˆ˜ ์žˆ๋‹ค. ์ด ๋•Œ, max-heap์˜ ์„ฑ์งˆ์„ ์œ ์ง€ํ•˜๊ธฐ ์œ„ํ•ด ๋ถ€๋ชจ ๋…ธ๋“œ ์›์†Œ์™€ ๊ณ„์† ๋น„๊ตํ•˜๋ฉฐ ์ƒ์œ„๋กœ ์˜ฌ๋ผ๊ฐ„๋‹ค. (ํ•˜๋‹จ ๊ทธ๋ฆผ ์ฐธ์กฐ)

๐Ÿ‘จ‍๐Ÿซ ์ด์ฒ˜๋Ÿผ, ์ž„์˜์˜ binary heap ์›์†Œ ์ค‘ ํ•œ ๊ฐœ์˜ ๊ฐ’์„ ๋ฐ”๊พผ๋‹ค๋ฉด, ์ƒ์œ„ ๋…ธ๋“œ์™€ ๊ณ„์† ๋น„๊ตํ•˜๋ฉด์„œ max-heap ์ƒ์„ฑ์„ ์œ„ํ•ด swap ๊ณผ์ •์ด ์ง„ํ–‰๋  ์ˆ˜ ์žˆ๋‹ค.

โ€ป binary heap ํŠน์„ฑ์ƒ ์ž์‹ node์˜ index๊ฐ€ i๋ผ๋ฉด, ๋ถ€๋ชจ node์˜ index๋Š” i/2 โ€ป

 

def insert_key(arr,key):
    arr.append(key)
    size=len(arr)
    i=size

    while(i>=1 and arr[i/2] < arr[i]):
        arr[i],arr[i/2] = arr[i/2],arr[i] #swap
        i = (i/2)
    print(arr)

 

โ–ถ T(n) = O(h) = O(logn) โ—€

(node์˜ ๊ฐ’์„ ์ฆ๊ฐ€์‹œ์ผœ heap์„ updateํ•˜๋Š” ์—ฐ์‚ฐ์˜ ์‹œ๊ฐ„ ๋ณต์žก๋„๋Š” ์ž์‹ node๊ฐ€ ์ตœ๋Œ€ ๋ถ€๋ชจ node๊นŒ์ง€ ์›€์ง์ด๋Š”, ์ฆ‰ heap์˜ height๊ฐ€ ๋˜๋ฏ€๋กœ, ์ด๋Š” ๊ณง logn)


* extract max from heap

๐Ÿ‘จ‍๐Ÿซ ์ง€๊ธˆ max-heap์„ ๋‹ค๋ฃจ๊ณ  ์žˆ์œผ๋ฏ€๋กœ, heap์˜ ์›์†Œ ์ค‘ ์ตœ๋Œ“๊ฐ’์€ ๊ฐ€์žฅ ์ƒ์œ„ ๋…ธ๋“œ์— ์žˆ๋Š” ๊ฐ’์ด๋‹ค. ๋”ฐ๋ผ์„œ ๋งจ ์œ„์˜ ๋…ธ๋“œ๋ฅผ ๋นผ๊ณ , ๊ทธ ์ž๋ฆฌ์— ๊ฐ€์žฅ ๋์— ์žˆ๋Š” ์›์†Œ๋ฅผ ๊ฐ–๋‹ค ์ค€๋‹ค. (ํ•˜๋‹จ ๊ทธ๋ฆผ ์ฐธ์กฐ)

 

def delete_max(arr):
    size=len(arr)
    if size < 1:
        print('Error')
    
    max_root = arr[0]
    arr[0] = arr[size]
    size=size-1

    A=max_heapify(arr[:size],1)
    return max_root

 

๐Ÿ‘จ‍๐Ÿซ max๊ฐ’์„ ๋ฐ˜ํ™˜ํ•˜๊ณ , ๋งˆ์ง€๋ง‰ ์›์†Œ๋ฅผ root node๋กœ ์˜ฌ๋ฆฐ ๋’ค, max_heapify()๋ฅผ ์‚ฌ์šฉํ•ด max-heap์œผ๋กœ ๋‹ค์‹œ ๋งŒ๋“ ๋‹ค.

 

โ–ถ T(n) = O(logn) โ—€

(max_root๋ฅผ ์ตœ์ƒ์œ„ node์—์„œ ๊บผ๋‚ด๋Š” ๊ฑด O(1), max_heapify()๋Š” O(logn)์œผ๋กœ ์ตœ์ข… O(logn))


* increase key in heap

๐Ÿ‘จ‍๐Ÿซ ์˜ˆ์‹œ) ๋งจ ๋งˆ์ง€๋ง‰ rightmost node ๊ฐ’์„ 60์œผ๋กœ ๋ฐ”๊พผ๋‹ค๋ฉด max-heap ์„ฑ์งˆ์— ์˜ํ•ด ์ƒ์œ„ ๋…ธ๋“œ 42์™€ ๋น„๊ตํ•ด์„œ swap, ์ดํ›„ ์ƒ์œ„ ๋…ธ๋“œ 48๊ณผ ๋น„๊ตํ•ด์„œ swap ์ง„ํ–‰ (์•„๋ž˜ ๊ทธ๋ฆผ ์ฐธ์กฐ)

๐Ÿ‘จ‍๐Ÿซ ์ด์ฒ˜๋Ÿผ, ์ž„์˜์˜ binary heap ์›์†Œ ์ค‘ ํ•œ ๊ฐœ์˜ ๊ฐ’์„ ๋ฐ”๊พผ๋‹ค๋ฉด, ์ƒ์œ„ ๋…ธ๋“œ์™€ ๊ณ„์† ๋น„๊ตํ•˜๋ฉด์„œ max-heap ์ƒ์„ฑ์„ ์œ„ํ•ด swap ๊ณผ์ •์ด ์ง„ํ–‰๋  ์ˆ˜ ์žˆ๋‹ค.

โ€ป binary heap ํŠน์„ฑ์ƒ ์ž์‹ node์˜ index๊ฐ€ i๋ผ๋ฉด, ๋ถ€๋ชจ node์˜ index๋Š” i/2 โ€ป

 

def increase_key(arr,i,key):
    if key < arr[i]:
        print('Error')

    arr[i] = key
    while(i>=1 and arr[i/2] < arr[i]):
        arr[i],arr[i/2] = arr[i/2],arr[i] #swap
        i = (i/2)
    print(arr)

 

โ–ถ T(n) = O(h) = logn / S(n) = 1โ—€

→ node์˜ ๊ฐ’์„ ์ฆ๊ฐ€์‹œ์ผœ heap์„ updateํ•˜๋Š” ์—ฐ์‚ฐ์˜ ์‹œ๊ฐ„ ๋ณต์žก๋„๋Š” ์ž์‹ node๊ฐ€ ์ตœ๋Œ€ ๋ถ€๋ชจ node๊นŒ์ง€ ์›€์ง์ด๋Š”, ์ฆ‰ heap์˜ height๊ฐ€ ๋˜๋ฏ€๋กœ, ์ด๋Š” ๊ณง logn / ๊ณต๊ฐ„๋ณต์žก๋„๋Š” ํ˜„์žฌ ๊ณต๊ฐ„์—์„œ recursion์ด ์ผ์–ด๋‚˜์ง€ ์•Š์œผ๋ฏ€๋กœ ๊ทธ๋Œ€๋กœ 1


3. binary heap time complexity

๐Ÿ‘จ‍๐Ÿซ ์œ„ ์—ฌ๋Ÿฌ implementation ์—ฐ์‚ฐ time complexity๋ฅผ ์ •๋ฆฌํ•ด๋ณด๋ฉด ์•„๋ž˜์™€ ๊ฐ™๋‹ค.

operation ์‹œ๊ฐ„ ๋ณต์žก๋„
find max(root node ์ ‘๊ทผ) O(1)
heapify() (creating a heap) O(n)
heapsort() O(nlogn)
insertion O(logn)
deletion / extracting max O(logn)
increasing value in a heap O(logn)

4. binary heap code using heapq in python

๐Ÿ‘จ‍๐Ÿซ python์—์„œ heapq๋ผ๋Š” library๋ฅผ ์ง€์›ํ•ด์ฃผ์–ด heap ์—ฌ๋Ÿฌ implementation์„ ์‰ฝ๊ฒŒ ๊ตฌํ˜„ํ•  ์ˆ˜ ์žˆ๋‹ค.

 

๐Ÿ‘จ‍๐Ÿซ heapq docu

 

heapq — Heap queue algorithm

Source code: Lib/heapq.py This module provides an implementation of the heap queue algorithm, also known as the priority queue algorithm. Heaps are binary trees for which every parent node has a va...

docs.python.org

 

๐Ÿ‘จ‍๐Ÿซ python์˜ heapq๋Š” default๋กœ min-heap์„ ๊ตฌํ˜„ํ•œ๋‹ค.

Q. heapsort using min-heap

import sys
import heapq
input=sys.stdin.readline
def heapsort(iterable):
    h=[]
    res=[]

    #๋ชจ๋“  ์›์†Œ๋ฅผ ์ฐจ๋ก€๋Œ€๋กœ heap์— ์‚ฝ์ž…
    for value in iterable:
        heapq.heappush(h,value)
    
    #heap์— ์‚ฝ์ž…๋œ ๋ชจ๋“  ์›์†Œ๋ฅผ ์ฐจ๋ก€๋Œ€๋กœ ๊บผ๋‚ด์–ด ๋‹ด๊ธฐ
    for i in range(len(h)):
        res.append(heapq.heappop(h))
    return res

n=int(input())
arr=[]

for i in range(n):
    arr.append(int(input()))

res=heapsort(arr)

for i in range(n):
    print(res[i])

 

1) heapq์˜ heappush์— ์˜ํ•ด min-heap ์™„์„ฑ

2) min-heap์ด๋ฏ€๋กœ heapq์˜ heappop์— ์˜ํ•ด ๊ฐ€์žฅ ๋งˆ์ง€๋ง‰ node๊ฐ€ root node์— ์˜ค๋ฉด์„œ root node์ธ ์ตœ์†Ÿ๊ฐ’์ด ์ถœ๋ ฅ

3) ์œ„ 2)์˜ ๊ณผ์ •์„ ๊ณ„์† ๋ฐ˜๋ณตํ•˜๋ฉด์„œ ์ถœ๋ ฅ๋˜๋Š” ๊ฐ’์„ ๋‚˜์—ดํ•˜๋ฉด ์˜ค๋ฆ„์ฐจ์ˆœ ์ •๋ ฌ!

 

→ ์ˆ˜๋ฅผ ์ฐจ๋ก€๋Œ€๋กœ ์ง‘์–ด๋„ฃ์œผ๋ฉด, ์˜ค๋ฆ„์ฐจ์ˆœ ์ •๋ ฌ๋Œ€๋กœ ์ถœ๋ ฅ๋˜์–ด ๋‚˜์˜จ๋‹ค.

 

โ€ป python์˜ ๊ฒฝ์šฐ heapq๋Š” min-heap ์ตœ์†Œ ํž™์„ ๊ตฌํ˜„ํ•ด์ฃผ๋Š” library๋‹ค. ๋”ฐ๋ผ์„œ ์˜ค๋ฆ„์ฐจ์ˆœ์œผ๋กœ ์ •๋ ฌ๋œ๋‹ค โ€ป

Q. heapsort using max-heap

โ€ป max-heap์€ python์˜ ๊ฒฝ์šฐ ์œ„ min-heap ๊ตฌํ˜„ ์ฝ”๋“œ์— heappush ์‚ฌ์šฉ์‹œ value์— ์Œ์ˆ˜๋ฅผ ๋ถ™์—ฌ heap์— ์ง‘์–ด๋„ฃ๊ณ , ์ •๋ ฌํ•œ ๋’ค, heappop ์‚ฌ์šฉํ•  ๋•Œ ๋‹ค์‹œ ์Œ์ˆ˜๋ฅผ ๋ถ™์ด๋ฉด ๋œ๋‹ค.

import sys
import heapq
input=sys.stdin.readline
def heapsort(iterable):
    h=[]
    res=[]

    #๋ชจ๋“  ์›์†Œ๋ฅผ ์ฐจ๋ก€๋Œ€๋กœ heap์— ์‚ฝ์ž…
    for value in iterable:
        heapq.heappush(h,-value)
    
    #heap์— ์‚ฝ์ž…๋œ ๋ชจ๋“  ์›์†Œ๋ฅผ ์ฐจ๋ก€๋Œ€๋กœ ๊บผ๋‚ด์–ด ๋‹ด๊ธฐ
    for i in range(len(h)):
        res.append(-heapq.heappop(h))
    return res

n=int(input())
arr=[]

for i in range(n):
    arr.append(int(input()))

res=heapsort(arr)

for i in range(n):
    print(res[i])

Q. building a heap - heapify()

 

 min-heap(์™„์ „ ์ด์ง„ ํŠธ๋ฆฌ)์„ ์‹ค์ œ๋กœ ๋งŒ๋“œ๋Š” heapify()๋Š” O(n)์˜ ์‹œ๊ฐ„ ๋ณต์žก๋„. 

import heapq

nums = [9,7,5,3,1]
heapq.heapify(nums)
print(nums) #[1,3,5,9,7]

Q. deletion using heappop() - find the nth smallest

 

โ‘  ์œ„ ๋งŒ๋“ค์–ด์ง„ heap [1,3,5,9,7]์—์„œ

 

โ‘ก ์˜ˆ๋ฅผ ๋“ค์–ด ์„ธ ๋ฒˆ์งธ ์ž‘์€ ์›์†Œ๋ฅผ ์ฐพ๊ณ  ์‹ถ๋‹ค๋ฉด ์ด ๋‘ ๋ฒˆ์˜ heappop() ์ˆ˜ํ–‰ํ•œ ๋’ค O(1)์ด ๊ฑธ๋ฆฌ๋Š” nums[0] ์—ฐ์‚ฐ ์ˆ˜ํ–‰

heapq.heappop(nums)
print(nums) #[3,7,5,9]

heapq.heappop(nums)
print(nums) #[5,7,9]

print(nums[0]) #5 
#heap์˜ node ์ถœ๋ ฅ

 

→ ์„ธ ๋ฒˆ์งธ ์ž‘์€ ์›์†Œ 5 ์ถœ๋ ฅ!


* heap youtube https://youtu.be/TK48r1Dlo4k

* binary heap wiki https://en.wikipedia.org/wiki/Binary_heap

* ์ธ๋„ค์ผ ์ถœ์ฒ˜ https://www.digitalocean.com/community/tutorials/max-heap-java

 

'Computer Science > Data Structures' ์นดํ…Œ๊ณ ๋ฆฌ์˜ ๋‹ค๋ฅธ ๊ธ€

๐ŸŒณ Binary Tree ๐ŸŒณ  (1) 2023.12.19
set() & dict() - ์ง‘ํ•ฉ๊ณผ ๋งต  (0) 2023.08.03
Types of Binary Tree๐ŸŒฒ  (0) 2023.01.10
priority queue(binary heap)  (0) 2023.01.09
hash table / hashing  (0) 2022.11.06

๋Œ“๊ธ€