xfs๋Š” ๋””๋ ‰ํ„ฐ๋ฆฌ, Extent ๊ด€๋ฆฌ, free space ๊ด€๋ฆฌ ๋“ฑ ์—ฌ๋Ÿฌ ํŒŒ์ผ์‹œ์Šคํ…œ ๋ฉ”ํƒ€๋ฐ์ดํ„ฐ๋ฅผ B+Tree๋กœ ๊ด€๋ฆฌํ•œ๋‹ค.

29 minute read

xfs๋Š” ๋””๋ ‰ํ„ฐ๋ฆฌ, Extent ๊ด€๋ฆฌ, free space ๊ด€๋ฆฌ ๋“ฑ ์—ฌ๋Ÿฌ ํŒŒ์ผ์‹œ์Šคํ…œ ๋ฉ”ํƒ€๋ฐ์ดํ„ฐ๋ฅผ B+Tree๋กœ ๊ด€๋ฆฌํ•œ๋‹ค.

Linux ํŒŒ์ผ์‹œ์Šคํ…œ ์ชฝ์„ ๊ณต๋ถ€ํ•˜๋ฉด์„œ ๋ดค๋˜ ๋ฌธ๊ตฌ ์ž…๋‹ˆ๋‹ค. ์—ฌ๊ธฐ์—์„œ B+Tree๋กœ ๊ด€๋ฆฌํ•œ๋‹ค๊ณ  ํ•˜๋Š” ๊ฑธ ๋ดค๋Š”๋ฐ, ์–ด๋А๋‚  ๋‚ด๊ฐ€ B+Tree๋ฅผ ํ•œ๋ฒˆ๋„ ์ง์ ‘ ๊ตฌํ˜„ํ•ด๋ณธ ์ ์ด ์—†๋‹ค๋Š” ๊ฑธ ๊นจ๋‹ซ๊ฒŒ ๋˜์—ˆ์–ด์š”.

๋˜, ์ฆ๊ฒจ๋ณด๋Š” ์œ ํŠœ๋ฒ„ ์ฝ”๋”ฉ์• ํ”Œ์˜ โ€œindex๊ฐ€ ๋ญ”์ง€ ์„ค๋ช…ํ•ด๋ณด์„ธ์š” (๊ฐœ๋ฐœ๋ฉด์ ‘์‹œ๊ฐ„)โ€ ์˜์ƒ์—์„œ DB Index๋ฅผ ์„ค๋ช…ํ•˜๋ฉด์„œ B-Tree์™€ B+Tree๋ฅผ ์„ค๋ช…์ด ์žˆ์—ˆ๋Š”๋ฐ์š”. B+Tree๊ฐ€ ์ข‹์€ ์ž๋ฃŒ๊ตฌ์กฐ์ธ ๊ฑด ์•Œ๊ฒ ๋Š”๋ฐ, ์ด๊ฒŒ ๊ตฌ์ฒด์ ์œผ๋กœ ์–ด๋–ป๊ฒŒ ๋™์ž‘ํ•˜๋Š”์ง€ ๋‚˜ ์Šค์Šค๋กœ ์„ค๋ช…ํ•  ์ˆ˜ ์žˆ๋Š”์ง€ ํ‰๊ฐ€ํ•˜๊ณ  ์‹ถ์—ˆ์–ด์š”.

๊ทธ๋ž˜์„œ ์ด๋ฒˆ ์ฃผ๋ง์— ๊ฐ ์žก๊ณ  ์ด๊ฑธ python์œผ๋กœ ๊ตฌํ˜„ํ•ด๋ดค์Šต๋‹ˆ๋‹ค! ์ด ํฌ์ŠคํŠธ๋Š” ๊ตฌํ˜„ ๊ณผ์ •์—์„œ์˜ ํ–ˆ๋˜ ์ƒ๊ฐ๋“ค๊ณผ ๊ตฌํ˜„ ๊ณผ์ •์„ ์•„์นด์ด๋น™ ํ•˜๋Š” ๋ชฉ์ ์˜ ํฌ์ŠคํŠธ์˜ˆ์š”.

์ œ๊ฐ€ ์Šค์Šค๋กœ ์ดํ•ดํ•˜๊ณ  ์„ค๋ช…ํ•˜๊ธฐ ์œ„ํ•ด์„œ ์ €์˜ ์ƒ๊ฐ์˜ ํ๋ฆ„๋Œ€๋กœ ์ž‘์„ฑํ–ˆ๊ณ , ์ฝ”๋“œ๋„ ์žฌ๊ตฌ์„ฑํ•˜๋ฉด์„œ ์ผ๋ถ€ ์˜ค๋ฅ˜๊ฐ€ ์žˆ์„ ์ˆ˜๋„ ์žˆ์Šต๋‹ˆ๋‹ค.


์ฒ˜์Œ ์‹œ์ž‘์€ BST(Binary Search Tree)๋กœ ์‹œ์ž‘ํ–ˆ๊ณ , ๊ฑฐ๊ธฐ์„œ๋ถ€ํ„ฐ B-Tree, B+Tree ์ˆœ์„œ๋Œ€๋กœ ๊ตฌํ˜„ ํ–ˆ๋‹ค.

Binary Search Tree

์ผ๋‹จ BST์˜ ๋…ธ๋“œ๋ถ€ํ„ฐ ์ •์˜ํ•œ๋‹ค.

from __future__ import annotations

class Node:
  # ํ˜„์žฌ ๋…ธ๋“œ๊ฐ€ ์ €์žฅํ•  ๊ฐ’.
  def __init__(self, key):
    self.key = key
    self.left: Node | None = None
    self.right: Node | None= None

key๋ณด๋‹ค ํฌ๋ฉด, right์— ๋ฐฐ์น˜ํ•˜๊ณ , key๋ณด๋‹ค ์ž‘์œผ๋ฉด left์— ๋ฐฐ์น˜ํ•œ๋‹ค.

class BST():
  def __init__(self):
    self.root = None

  def insert(self, key: int):
    pass

  def search(self, key: int):
    pass

  def remove(self, key: int):
    pass

๊ทธ๋ฆฌ๊ณ  ์ด๋ ‡๊ฒŒ 3๊ฐœ ํ•จ์ˆ˜๋ฅผ ๊ตฌํ˜„ํ•œ๋‹ค.

Insert

class BST():
  ...
  def insert(self, key: int):
    if self.root is None:
      print(f"Insert {key} as root")
      self.root = Node(key)
      return

    current_node = self.root

    while True:
      if current_node.key == key:
        print(f"{key} already exists in BST")
        return
      elif current_node.key > key:
        if current_node.left is None:
          print(f"Insert {key} as left child of {current_node.key}")
          current_node.left = Node(key)
          return
        current_node = current_node.left
      else:
        if current_node.right is None:
          print(f"Insert {key} as right child of {current_node.key}")
          current_node.right = Node(key)
          return
        current_node = current_node.right

current_node์˜ key์™€ ํƒ์ƒ‰ key๋ฅผ ๋น„๊ตํ•˜๋ฉฐ, ์ขŒ/์šฐ ๋…ธ๋“œ๋กœ ์ด๋™ํ•œ๋‹ค.

class BST():
  ...
  def search(self, key: int):
    current_node = self.root

    while current_node:
      if current_node is None:
        print(f"{key} not found in BST")
        return False
      elif current_node.key == key:
        print(f"{key} found in BST")
        return True
      elif current_node.key > key:
        current_node = current_node.left
      else:
        current_node = current_node.right

    print(f"{key} not found in BST")
    return False

๋งˆ์ฐฌ๊ฐ€์ง€๋กœ current_node์˜ key์™€ ํƒ์ƒ‰ key๋ฅผ ๋น„๊ตํ•˜๋ฉฐ, ์ขŒ/์šฐ ๋…ธ๋“œ๋กœ ์ด๋™ํ•œ๋‹ค.

Remove

์‚ฝ์ž…/ํƒ์ƒ‰ ์—ฐ์‚ฐ๋ณด๋‹ค ์•ฝ๊ฐ„ ๋ณต์žกํ•˜๋‹ค.

class BST():
  ...
  def remove(self, key: int):
    if self.root is None:
      print("BST is empty")
      return

    parent = None
    current = self.root

    # ์–ด๋–ค ๋…ธ๋“œ๋ฅผ ์‚ญ์ œํ•ด์•ผ ํ•˜๋Š”์ง€ ํƒ์ƒ‰
    while True:
      if key == current.key:
        break
      elif key < current.key:
        parent = current
        current = current.left
      else:
        parent = current
        current = current.right

      if current is None:
        print(f"{key} not found in BST")
        return

๋™์ผํ•˜๊ฒŒ ์‚ญ์ œํ•  ๋…ธ๋“œ๋ฅผ Binary Search ๋ฐฉ์‹์œผ๋กœ ์ฐพ๋Š”๋‹ค.

์ด์ œ ๋…ธ๋“œ๋ฅผ ์‚ญ์ œํ•ด์•ผ ํ•˜๋Š”๋ฐ, ์ž์‹ ๋…ธ๋“œ๊ฐ€ ๋ช‡๊ฐœ ์žˆ๋Š”์ง€์— ๋”ฐ๋ผ ๊ณผ์ •์ด ๋‹ฌ๋ผ์ง„๋‹ค.

  • ์‚ญ์ œํ•  ๋…ธ๋“œ๊ฐ€ ์ž์‹ ๋…ธ๋“œ ์—†์Œ
    • ๋…ธ๋“œ๋ฅผ ์‚ญ์ œํ•˜๊ณ , ๋ถ€๋ชจ ๋…ธ๋“œ์—์„œ๋„ ์—ฐ๊ฒฐ ํ•ด์ œ
  • ์‚ญ์ œํ•  ๋…ธ๋“œ๊ฐ€ ์ž์‹ ๋…ธ๋“œ 1๊ฐœ๋ฅผ ๊ฐ€์ง
    • ๋…ธ๋“œ๋ฅผ ์‚ญ์ œํ•˜๊ณ , ๋ถ€๋ชจ ๋…ธ๋“œ์— ํ•ด๋‹น ๋…ธ๋“œ๋ฅผ ์—ฐ๊ฒฐ
  • ์‚ญ์ œํ•  ๋…ธ๋“œ๊ฐ€ ์ž์‹ ๋…ธ๋“œ 2๊ฐœ๋ฅผ ๊ฐ€์ง
    • ๋…ธ๋“œ๋ฅผ ์‚ญ์ œํ•˜๊ณ 
    • ์ผ€์ด์Šค๊ฐ€ ๋‚˜๋‰œ๋‹ค.
      • ์ž์‹ ๋…ธ๋“œ ์ค‘ ์˜ค๋ฅธ์ชฝ ์„œ๋ธŒํŠธ๋ฆฌ์—์„œ
        • ๊ฐ€์žฅ ์ž‘์€ ๊ฐ’์„ ์ฐพ์•„์„œ ์ƒˆ๋กœ์šด ๋ฃจํŠธ ๋…ธ๋“œ๋กœ ์‚ผ์Œ
        • ๊ทธ๋ฆฌ๊ณ  ์ƒˆ๋กœ์šด ๋ฃจํŠธ ๋…ธ๋“œ๋ฅผ ๋ถ€๋ชจ ๋…ธ๋“œ์— ์—ฐ๊ฒฐ
      • ์ž์‹ ๋…ธ๋“œ ์ค‘ ์™ผ์ชฝ ์„œ๋ธŒํŠธ๋ฆฌ์—์„œ
        • ๊ฐ€์žฅ ํฐ ๊ฐ’์„ ์ฐพ์•„์„œ ์ƒˆ๋กœ์šด ๋ฃจํŠธ ๋…ธ๋“œ๋กœ ์‚ผ์Œ
        • ๊ทธ๋ฆฌ๊ณ  ์ƒˆ๋กœ์šด ๋ฃจํŠธ ๋…ธ๋“œ๋ฅผ ๋ถ€๋ชจ ๋…ธ๋“œ์— ์—ฐ๊ฒฐ
class BST():
  ...
  def remove(self, key: int):
    ...
    # current์˜ ์ž์‹์ด 0๊ฐœ์ธ ๊ฒฝ์šฐ
    if current.left is None and current.right is None:
      print(f"Remove {key} from BST with 0 children")
      if parent is None:
        self.root = None
      elif parent.left == current:
        parent.left = None
      else:
        parent.right = None

    # current์˜ ์ž์‹์ด 1๊ฐœ์ธ ๊ฒฝ์šฐ
    if current.left is None and current.right is not None:
      print(f"Remove {key} from BST with 1 child")
      if parent is None:
        self.root = current.right
      elif parent.left == current:
        parent.left = current.right
      else:
        parent.right = current.right

    if current.left is not None and current.right is None:
      print(f"Remove {key} from BST with 1 child")
      if parent is None:
        self.root = current.left
      elif parent.left == current:
        parent.left = current.left
      else:
        parent.right = current.left

    # current์˜ ์ž์‹์ด 2๊ฐœ์ธ ๊ฒฝ์šฐ
    # ์˜ค๋ฅธ์ชฝ ์„œ๋ธŒํŠธ๋ฆฌ์—์„œ ๊ฐ€์žฅ ์ž‘์€ ๊ฐ’์„ ์ฐพ์•„์„œ ๊ต์ฒด
    if current.left is not None and current.right is not None:
      print(f"Remove {key} from BST with 2 children")
      right = current.right
      right_parent = current

      # `right.left`์„ ํƒ์ƒ‰ํ•˜๋ฉด์„œ, ๊ฐ€์žฅ ์ž‘์€ ๊ฐ’์˜ ๋…ธ๋“œ๋ฅผ ํƒ์ƒ‰
      while right.left:
        right_parent = right
        right = right.left

      if right_parent == current:
        # while ๋ฃจํ”„๋ฅผ ํ•œ๋ฒˆ๋„ ์•ˆ ๋ˆ ์ผ€์ด์Šค
        right_parent.right = right.right
      else:
        # while ๋ฃจํ”„๋ฅผ ํ•œ๋ฒˆ์ด๋ผ๋„ ํƒ„ ์ผ€์ด์Šค
        # ๊ฐ€์žฅ ์ž‘์€ ๊ฐ’์— ๋Œ€ํ•œ right ๋…ธ๋“œ๊ฐ€ ์กด์žฌํ•  ์ˆ˜ ์žˆ์Œ.
        # ์ด๊ฑธ ํ•ด๋‹น ๋…ธ๋“œ์˜ left์— ์—ฐ๊ฒฐํ•ด์ค˜์•ผ ํ•จ.
        right_parent.left = right.right

      current.key = right.key
      print(f"Replace {key} with {right.key}")

B-Tree

B-Tree๋Š” BST์™€ ๋‹ค๋ฅธ ์ž๋ฃŒ๊ตฌ์กฐ์ž…๋‹ˆ๋‹ค.

  • BST๋Š” ๋…ธ๋“œ๋‹น ๊ฐ€์งˆ ์ˆ˜ ์žˆ๋Š” key๊ฐ€ 1๊ฐœ ์ž…๋‹ˆ๋‹ค.
    • ๊ทธ๋ฆฌ๊ณ  ์ž์‹ ๋…ธ๋“œ๋„ ์ตœ๋Œ€ 2๊ฐœ๊ฐ€ ๊ฐ€๋Šฅํ•ฉ๋‹ˆ๋‹ค.
  • B-Tree๋Š” ๋…ธ๋“œ๋‹น key๊ฐ€ ์ตœ๋Œ€ 2t - 1๊ฐœ ๊ฐ€๋Šฅํ•ฉ๋‹ˆ๋‹ค.
    • ๋…ธ๋“œ ๋‚ด์—์„œ๋Š” key๋ฅผ ์ •๋ ฌ๋œ ์ƒํƒœ๋กœ ์ €์žฅํ•ฉ๋‹ˆ๋‹ค.
    • ๊ทธ๋ฆฌ๊ณ  ์ž์‹ ๋…ธ๋“œ๋Š” key ๊ฐฏ์ˆ˜ k์ผ ๋•Œ, k+1์ด ๊ฐ€๋Šฅํ•ฉ๋‹ˆ๋‹ค.

์˜ˆ๋ฅผ ๋“ค์–ด, B-Tree๋Š” ์ด๋ ‡๊ฒŒ

    [10 | 20 | 30]
   /    |    |    \
  c0    c1   c2    c3

B-Tree์—์„œ ๋…ธ๋“œ๊ฐ€ ์–ผ๋งˆ๋‚˜ ์ฑ„์›Œ์งˆ ์ˆ˜ ์žˆ๋Š”์ง€๋ฅผ ๊ฒฐ์ •ํ•˜๋Š” ํŒŒ๋ผ๋ฏธํ„ฐ t๋ฅผ ์ตœ์†Œ ์ฐจ์ˆ˜(minimum degree)๋ผ๊ณ  ํ•ฉ๋‹ˆ๋‹ค.

t๊ฐ€ ์ปค์งˆ์ˆ˜๋ก ํ•œ ๋…ธ๋“œ์— ๋” ๋งŽ์€ ํ‚ค์™€ ์ž์‹์„ ๋‹ด์„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๊ทธ๋ž˜์„œ t=2์ผ ๋•Œ์™€ t=100์ผ ๋•Œ๋ฅผ ๋น„๊ตํ•˜๋ฉด, t=100์ธ B-Tree๋Š” ๋” ์–•์€ ๋†’์ด๋ฅผ ๊ฐ€์ง‘๋‹ˆ๋‹ค.

์—ฌ๊ธฐ์—์„œ๋Š” t=2๋กœ ๊ณ ์ •ํ•˜๊ณ  ์ฝ”๋“œ๋ฅผ ์ž‘์„ฑ ํ–ˆ์Šต๋‹ˆ๋‹ค.

from __future__ import annotations

class BTreeNode:
  def __init__(self, leaf: bool = True):
    # leaf ์—ฌ๋ถ€๋Š” len(children) == 0์œผ๋กœ๋„ ํŒ๋‹จํ•  ์ˆ˜ ์žˆ์Œ.
    self.leaf = leaf
    self.keys: list[int] = []
    self.children: list[BTreeNode] = []

class BTree:
  def __init__(self):
    # ๊ฐ ๋…ธ๋“œ๋Š” ์ตœ๋Œ€ 3(=2t-1)๊ฐœ์˜ ํ‚ค๋ฅผ ๊ฐ€์งˆ ์ˆ˜ ์žˆ์Œ.
    self.t = 2
    self.root = BTreeNode(leaf=True)

  def insert(self, key: int):
    pass

  def search(self, key: int):
    pass

๊ธฐ๋ณธ์ ์ธ ์—ฐ์‚ฐ์ธ insert()์™€ search()๋งŒ ๊ตฌํ˜„ ํ–ˆ์Šต๋‹ˆ๋‹ค.

Insert

์šฐ์„  B-Tree๊ฐ€ ์ตœ์ดˆ๋กœ ๋งŒ๋“ค์–ด์ง€๋Š”, root ๋…ธ๋“œ๊ฐ€ ์ƒ์„ฑ๋˜๋Š” ๋‹จ๊ณ„๋ถ€ํ„ฐ ์‹œ์ž‘ํ•ฉ๋‹ˆ๋‹ค.

class BTree:
  ...
  def insert(self, key: int):
    root = self.root

    if root.leaf and len(node.keys) < 3:
      if key in node.keys:
        print(f"Key {key} already exists in the tree.")
        return
      node.keys.append(key)
      node.keys.sort()
      return
    else:
      pass

๋ฃจํŠธ ๋…ธ๋“œ๋งŒ ์กด์žฌํ•˜๊ณ , ๋ฃจํŠธ ๋…ธ๋“œ์— ์ €์žฅ๋œ ํ‚ค๊ฐ€ 3๋ณด๋‹ค ์ž‘๋‹ค๋ฉด, ๋ฃจํŠธ ๋…ธ๋“œ์— ๋ฐ์ดํ„ฐ๋ฅผ ๋„ฃ์œผ๋ฉด ๋ฉ๋‹ˆ๋‹ค.

๊ฒฝ๊ณ„๋Š” ๋ฃจํŠธ ๋…ธ๋“œ์˜ ํ‚ค ๊ฐฏ์ˆ˜๊ฐ€ 3์ธ ์ˆœ๊ฐ„๋ถ€ํ„ฐ ์ž…๋‹ˆ๋‹ค. B-Tree๋Š” ๋…ธ๋“œ์˜ ํ‚ค๊ฐ€ 2t-1์„ ๋„˜์ง€ ์•Š๋„๋ก ํ•ด์•ผ ํ•œ๋‹ค๋Š” ๊ฑธ ๋ฐ˜๋“œ์‹œ ์ง€์ผœ์•ผ ํ•ฉ๋‹ˆ๋‹ค. ๋…ธ๋“œ๊ฐ€ ๋„ˆ๋ฌด ๋งŽ์€ ํ‚ค๋ฅผ ๊ฐ€์ง€๊ฒŒ ๋œ๋‹ค๋ฉด, _split() ์—ฐ์‚ฐ์œผ๋กœ ๋…ธ๋“œ๋ฅผ ๋ถ„ํ• ํ•ฉ๋‹ˆ๋‹ค.

     [ 10, 20, 30 ]
---
์ƒˆ๋กœ์šด root ์ƒ์„ฑ
old root๋ฅผ new_root์˜ 0๋ฒˆ์งธ child๋กœ ์—ฐ๊ฒฐ
        [     ]        <- new_root
           |
           |
     [ 10, 20, 30 ]    <- old root
---
๊ฐ€์šด๋ฐ ํ‚ค 20์„ parent๋กœ ์˜ฌ๋ฆฝ๋‹ˆ๋‹ค.
          [ 20 ]
         /      \
    [ 10 ]      [ 30 ]
class BTree:
  ...
  def insert(self, key: int):
    root = self.root

    if root.leaf and len(root.keys) < 3:
      if key in root.keys:
        print(f"Key {key} already exists in the tree.")
        return
      root.keys.append(key)
      root.keys.sort()
      return
    elif root.leaf and len(root.keys) == 3:
      # ์‹ ๊ทœ ๋ฃจํŠธ๋ฅผ ๋ฐœ๊ธ‰
      new_root = BTreeNode(leaf=False)

      # ์ง€๊ธˆ์˜ ๋ฃจํŠธ๋ฅผ ์‹ ๊ทœ ๋ฃจํŠธ์— ๋ถ™์ž„
      new_root.children.append(root)

      self.root = new_root
      self._split_child(new_root, 0)

      ...
    else:
      pass

๋‹น์—ฐํ•˜๊ฒŒ๋„ ์ง€๊ธˆ ๊ฝ‰์ฐฌ ๋ฃจํŠธ๋ฅผ ๋Œ€์‹ ํ•  ์ƒˆ๋กœ์šด ๋ฃจํŠธ ๋…ธ๋“œ new_root๋ฅผ ๋งˆ๋ จ ํ•ฉ๋‹ˆ๋‹ค.

์ด์ œ ๋…ธ๋“œ ๋ถ„ํ•  ๋กœ์ง์„ ๊ตฌ์„ฑํ•ฉ๋‹ˆ๋‹ค.

class BTree:
  ...
  def _split_child(self, parent: BTreeNode, idx: int):
    # parent์˜ idx ๋ฒˆ์งธ ์ž์‹ ๋…ธ๋“œ๋ฅผ ๋ถ„ํ• ํ•ด parent์˜ ์ž์‹ ๋…ธ๋“œ๋กœ ์ถ”๊ฐ€ํ•˜๋Š” ํ•จ์ˆ˜
    child = parent.children[idx]

    mid_key = child.keys[1]

    new_child = BTreeNode(leaf=child.leaf)

    # key ๋ถ„ํ• 
    new_child.keys = child.keys[2:]
    child.keys = child.keys[:1]

    # ๋ถ€๋ชจ ๋…ธ๋“œ์˜ idx ๋ฒˆ์งธ ์ž์‹ ๋…ธ๋“œ์— mid key๋ฅผ ์‚ฝ์ž…
    parent.keys.insert(idx, mid_key)

    # ๋ถ€๋ชจ์˜ ์ž์‹ ๋ชฉ๋ก์— ์ƒˆ child๋ฅผ ์‚ฝ์ž…
    parent.children.insert(idx + 1, new_child)

์ด์ œ insert() ํ•จ์ˆ˜๋ฅผ ์™„์„ฑ ํ•ฉ์‹œ๋‹ค.

class BTree:
  ...
  def insert(self, key: int):
    root = self.root

    if root.leaf and len(root.keys) < 3:
      if key in root.keys:
        print(f"Key {key} already exists in the tree.")
        return
      root.keys.append(key)
      root.keys.sort()
      return
    elif root.leaf and len(root.keys) == 3:
      # ์‹ ๊ทœ ๋ฃจํŠธ๋ฅผ ๋ฐœ๊ธ‰
      new_root = BTreeNode(leaf=False)

      # ์ง€๊ธˆ์˜ ๋ฃจํŠธ๋ฅผ ์‹ ๊ทœ ๋ฃจํŠธ์— ๋ถ™์ž„
      new_root.children.append(root)

      self.root = new_root
      self._split_child(new_root, 0)

      # ์ด์ œ ์ƒˆ๋กœ์šด key๋ฅผ insert
      ## ์–ด๋А child๋กœ ๋‚ด๋ ค๊ฐˆ์ง€ ์ฐพ๊ธฐ
      idx = -1
      for i, n_key in enumerate(new_root.keys):
        if key == n_key:
          print(f"Key {key} already exists in the tree.")
          return

        if key < n_key:
          idx = i
          break

      if idx == -1:
        idx = len(new_root.keys)

      ## ์ ์ ˆํ•œ child์— insert
      node = new_root.children[idx]
      node.keys.append(key)
      node.keys.sort()
      return
    else:
      pass

์ง€๊ธˆ๊นŒ์ง€๋Š” ๋ฃจํŠธ ๋…ธ๋“œ๊ฐ€ ์ตœ์ดˆ๋กœ ์ฑ„์›Œ์งˆ ๋•Œ, ๊ทธ๋ฆฌ๊ณ  ๋ฃจํŠธ ๋…ธ๋“œ๊ฐ€ ์ตœ์ดˆ๋กœ ๊ฝ‰ ์ฐจ์„œ ๋ถ„ํ• ์ด ์ผ์–ด๋‚˜๋Š” ์ƒํ™ฉ์ด์—ˆ์Šต๋‹ˆ๋‹ค. ์ด์ œ ์ฝ”๋“œ๋ฅผ ์ผ๋ฐ˜ํ™” ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.


๊ฐ€์žฅ ๋จผ์ €, ๋…ธ๋“œ์— key๋ฅผ ์ถ”๊ฐ€ํ•˜๋Š” ๋ถ€๋ถ„์„ ํ•จ์ˆ˜ํ™” ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ด๋ฅผ _insert_non_full()๋กœ ๊ณตํ†ตํ™” ํ•ฉ๋‹ˆ๋‹ค.

class BTree:
  ...
  def insert(self, key: int):
    root = self.root

    if root.leaf and len(root.keys) < 3:
      self._insert_non_full(root, key)  # ์—ฌ๊ธฐ
    elif root.leaf and len(root.keys) == 3:
      pass

  def _insert_non_full(self, node: BTreeNode, key: int):
    if len(node.keys) >= 3:
      raise ValueError("Node is full.")

    if node.leaf:
      if key in node.keys:
        print(f"Key {key} already exists in the tree.")
        return
      node.keys.append(key)
      node.keys.sort()
      return
    pass

๊ฐ€์žฅ ๋จผ์ €, ๋ฃจํŠธ ๋…ธ๋“œ๊ฐ€ ์ตœ์ดˆ๋กœ ์ฑ„์›Œ์งˆ ๋•Œ์˜ ์ฝ”๋“œ๋ฅผ ์˜ฎ๊ฒผ์Šต๋‹ˆ๋‹ค. ์ด์ œ, ๋ฃจํŠธ ๋…ธ๋“œ ๋ถ„ํ•  ํ›„์˜ ๋™์ž‘๋„ ์˜ฎ๊ฒจ์ค๋‹ˆ๋‹ค.

class BTree:
  ...
  def insert(self, key: int):
    root = self.root

    if root.leaf and len(root.keys) < 3:
      self._insert_non_full(root, key)  # ์—ฌ๊ธฐ
    elif root.leaf and len(root.keys) == 3:
      # ์‹ ๊ทœ ๋ฃจํŠธ๋ฅผ ๋ฐœ๊ธ‰
      new_root = BTreeNode(leaf=False)

      # ์ง€๊ธˆ์˜ ๋ฃจํŠธ๋ฅผ ์‹ ๊ทœ ๋ฃจํŠธ์— ๋ถ™์ž„
      new_root.children.append(root)

      self.root = new_root
      self._split_child(new_root, 0)

      # ์ด์ œ ์ƒˆ๋กœ์šด key๋ฅผ insert
      self._insert_non_full(new_root, key)  # ์—ฌ๊ธฐ

  def _insert_non_full(self, node: BTreeNode, key: int):
    if len(node.keys) >= 3:
      raise ValueError("Node is full.")

    if node.leaf:
      if key in node.keys:
        print(f"Key {key} already exists in the tree.")
        return
      node.keys.append(key)
      node.keys.sort()
      return

    ## ์–ด๋А child๋กœ ๋‚ด๋ ค๊ฐˆ์ง€ ์ฐพ๊ธฐ
    idx = -1
    for i, n_key in enumerate(node.keys):
      if key == n_key:
        print(f"Key {key} already exists in the tree.")
        return

      if key < n_key:
        idx = i
        break

    if idx == -1:
      idx = len(node.keys)

    ## ์ ์ ˆํ•œ child์— insert
    node = node.children[idx]
    node.keys.append(key)
    node.keys.sort()

์ด๋ ‡๊ฒŒ ํ–ˆ์œผ๋ฉด, ์ด์ œ insert() ์ฝ”๋“œ๋ฅผ ๋ฃจํŠธ๋ฅผ ๋„˜์–ด, non-root ๋…ธ๋“œ์—๋„ insert ํ•˜๋Š” ์ฝ”๋“œ๋กœ ๋ฐ”๊ฟ€ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

class BTree:
  ...
  def insert(self, key: int):
    root = self.root

    if len(root.keys) < 3:
      self._insert_non_full(root, key)
    elif len(root.keys) == 3:
      # ์‹ ๊ทœ ๋ฃจํŠธ๋ฅผ ๋ฐœ๊ธ‰
      new_root = BTreeNode(leaf=False)

      # ์ง€๊ธˆ์˜ ๋ฃจํŠธ๋ฅผ ์‹ ๊ทœ ๋ฃจํŠธ์— ๋ถ™์ž„
      new_root.children.append(root)

      self.root = new_root
      self._split_child(new_root, 0)

      # ์ด์ œ ์ƒˆ๋กœ์šด key๋ฅผ insert
      self._insert_non_full(new_root, key)

๋‹ค๋งŒ, _insert_non_full()์—์„œ ์ด ๋ถ€๋ถ„์„ ์ถ”๊ฐ€ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.

class BTree:
  ...
  def _insert_non_full(self, node: BTreeNode, key: int):
    if len(node.keys) >= 3:
      raise ValueError("Node is full.")

    if node.leaf:
      if key in node.keys:
        print(f"Key {key} already exists in the tree.")
        return
      node.keys.append(key)
      node.keys.sort()
      return

    ## ์–ด๋А child๋กœ ๋‚ด๋ ค๊ฐˆ์ง€ ์ฐพ๊ธฐ
    idx = -1
    for i, n_key in enumerate(node.keys):
      if key == n_key:
        print(f"Key {key} already exists in the tree.")
        return

      if key < n_key:
        idx = i
        break

    if idx == -1:
      idx = len(node.keys)

    # ์—ฌ๊ธฐ
    if len(node.children[idx].keys) == 3:
      # ๋‚ด๋ ค๊ฐˆ child๊ฐ€ ์ด๋ฏธ ๊ฝ‰ ์ฐจ ์žˆ๋‹ค๋ฉด, ๋จผ์ € ๋ถ„ํ• ํ•ด์•ผ ํ•จ.
      self._split_child(node, idx)

      # ๋ถ„ํ•  ํ›„์— ๋‹ค์‹œ ์–ด๋А child๋กœ ๋‚ด๋ ค๊ฐˆ์ง€ ์ฐพ์•„์•ผ ํ•จ.
      idx = ...

    # ์žฌ๊ท€ ํ˜ธ์ถœ๋กœ ๋ณ€๊ฒฝ
    self._insert_non_full(node.children[idx], key)

์–ด๋А child๋กœ ๋‚ด๋ ค๊ฐˆ์ง€ idx๋ฅผ ์ฐพ๋Š” ์ฝ”๋“œ๋„ ์ž์ฃผ ์‚ฌ์šฉํ•˜๊ธฐ ๋•Œ๋ฌธ์— ํ•จ์ˆ˜ํ™” ํ•ด์„œ ํŽธํ•˜๊ฒŒ ์”์‹œ๋‹ค.

class BTree:
  ...
  def _find_child_index(self, node: BTreeNode, key: int) -> tuple[int, bool]:
    """
    internal node์—์„œ ์–ด๋А child๋กœ ๋‚ด๋ ค๊ฐˆ์ง€ ๊ฒฐ์ •.
    """

    for i, n_key in enumerate(node.keys):
      if key == n_key:
        return i, True

      if key < n_key:
        return i, False

    return len(node.keys), False

  def _insert_non_full(self, node: BTreeNode, key: int):
    if len(node.keys) >= 3:
      raise ValueError("Node is full.")

    if node.leaf:
      if key in node.keys:
        print(f"Key {key} already exists in the tree.")
        return
      node.keys.append(key)
      node.keys.sort()
      return

    # leaf๊ฐ€ ์•„๋‹ˆ๋ผ๋ฉด, ์–ด๋А child๋กœ ๋‚ด๋ ค๊ฐˆ์ง€ ์ฐพ์•„์•ผ ํ•จ.
    idx, key_exists = self._find_child_index(node, key)  # ์—ฌ๊ธฐ

    if key_exists:
      print(f"Key {key} already exists in the tree.")
      return

    if len(node.children[idx].keys) == 3:
      # ๋‚ด๋ ค๊ฐˆ child๊ฐ€ ์ด๋ฏธ ๊ฝ‰ ์ฐจ ์žˆ๋‹ค๋ฉด, ๋จผ์ € ๋ถ„ํ• ํ•ด์•ผ ํ•จ.
      self._split_child(node, idx)

      # ๋ถ„ํ•  ํ›„์— ๋‹ค์‹œ ์–ด๋А child๋กœ ๋‚ด๋ ค๊ฐˆ์ง€ ์ฐพ์•„์•ผ ํ•จ.
      idx, key_exists = self._find_child_index(node, key)  # ์—ฌ๊ธฐ

      if key_exists:
        print(f"Key {key} already exists in the tree.")
        return

    # ์žฌ๊ท€ ํ˜ธ์ถœ๋กœ ๋ณ€๊ฒฝ
    self._insert_non_full(node.children[idx], key)

์•„์ง ๋ฉ€์—ˆ์Šต๋‹ˆ๋‹คโ€ฆ ใ…‹ใ…‹ ๋…ธ๋“œ๋ฅผ ๋ถ„ํ• ํ•˜๋Š” _split_child()๋„ internal node๋ฅผ ๋ถ„ํ• ํ•˜๋Š” ๊ฒฝ์šฐ, ์ž์‹ ๋…ธ๋“œ๋ฅผ ๋ถ„ํ• ํ•ด์ค˜์•ผ ํ•ด์š”.

class BTree:
  ...
    def _split_child(self, parent: BTreeNode, idx: int):
    # parant์˜ idx ๋ฒˆ์งธ ์ž์‹ ๋…ธ๋“œ๋ฅผ ๋ถ„ํ• ํ•ด parent์˜ ์ž์‹ ๋…ธ๋“œ๋กœ ์ถ”๊ฐ€ํ•˜๋Š” ํ•จ์ˆ˜
    child = parent.children[idx]

    mid_key = child.keys[1]

    new_child = BTreeNode(leaf=child.leaf)

    # key ๋ถ„ํ• 
    new_child.keys = child.keys[2:]
    child.keys = child.keys[:1]

    # ์—ฌ๊ธฐ๊ฐ€ ์ถ”๊ฐ€๋จ.
    # child๊ฐ€ ๋‚ด๋ถ€ ๋…ธ๋“œ๋ผ๋ฉด, children๋„ ๋ถ„ํ• 
    if not child.leaf:
      new_child.children = child.children[2:]
      child.children = child.children[:2]

    # ๋ถ€๋ชจ ๋…ธ๋“œ์˜ idx ๋ฒˆ์งธ ์ž์‹ ๋…ธ๋“œ์— mid key๋ฅผ ์‚ฝ์ž…
    parent.keys.insert(idx, mid_key)

    # ๋ถ€๋ชจ์˜ ์ž์‹ ๋ชฉ๋ก์— ์ƒˆ child๋ฅผ ์‚ฝ์ž…
    parent.children.insert(idx + 1, new_child)

4์— ๋„๋‹ฌํ•˜๋ฉด ๋…ธ๋“œ๋ฅผ ๋ถ„ํ• ํ•˜๋Š”๊ฒŒ ์•„๋‹ˆ๋ผ, ์™œ 3๊นŒ์ง€ ์ฐจ๋ฉด ๋…ธ๋“œ๋ฅผ ๋ถ„ํ• ํ• ๊นŒ?

์™œ keys๊ฐ€ 3์ด ๋˜๋ฉด ์„ ์ œ์ ์œผ๋กœ split ํ•˜๋Š”๊ฐ€?

์ฝ”๋“œ๋ฅผ ๊ตฌํ˜„ํ•˜๋ฉด์„œ ๋“ค์—ˆ๋˜ ์ƒ๊ฐ์ด์˜ˆ์š”.

    [10 | 20 | 30]
     /  |  |  \
    5  15   25  35

์ด ์ƒํ™ฉ์—์„œ 40์„ ์‚ฝ์ž…ํ•˜๋ฉด ๊ฐ€๋Šฅํ•œ ์„ ํƒ์ง€๋Š” 1๋ฒˆ/2๋ฒˆ์ด์˜ˆ์š”.

1๋ฒˆ: ๋…ธ๋“œ๋ฅผ ๋ถ„ํ• ํ•˜๊ณ  insert ํ•œ๋‹ค.

         [20]
      /        \
    [10]       [30]
   /   \        /   \
   [5]  [15] [25] [35|40]

2๋ฒˆ: ์‹ค์ œ ์‚ฝ์ž…์ด ๋˜๋Š” ๋Œ€์ƒ ๋…ธ๋“œ๊ฐ€ ํ•œ๊ณ„(=4)์— ๋„๋‹ฌํ•˜๋ฉด split node

    [10 | 20 | 30]
     /  |  |  \
    5  15   25   [35|40]

1๋ฒˆ๊ณผ 2๋ฒˆ ๋‘˜๋‹ค B-Tree ๊ตฌ์กฐ๋ฅผ ์ถฉ์กฑํ•ฉ๋‹ˆ๋‹ค.

1๋ฒˆ ๋ฐฉ์‹์„ top-down์ด๋ผ๊ณ  ํ•ฉ๋‹ˆ๋‹ค. ์‚ฝ์ž… ์—ฐ์‚ฐ์„ ํ•˜๊ธฐ ์œ„ํ•ด ํŠธ๋ฆฌ๋ฅผ ์ˆœํšŒํ•  ๋•Œ๋งˆ๋‹ค, if node.key is full: split node๋ฅผ ์ฒดํฌํ•˜๊ณ , ๋…ธ๋“œ ๋ถ„ํ• ์„ ์ˆ˜ํ–‰ํ•ฉ๋‹ˆ๋‹ค.

์ฆ‰, full(=3) ์ƒํƒœ์˜ ๋…ธ๋“œ ํƒ์ƒ‰ ๊ณผ์ •์—์„œ ๋ชจ๋‘ ์‚ฌ์ „ ๋ถ„ํ• (pre-split) ๋ฉ๋‹ˆ๋‹ค.

๋ฐ˜๋ฉด์—, 2๋ฒˆ์€ ๋‚ด๊ฐ€ ์‹ค์ œ๋กœ ์‚ฝ์ž…ํ•˜๋Š” ๋…ธ๋“œ๊ฐ€ ํ•œ๊ณ„์— ๋…ธ๋‹ฌํ•˜๋ฉด split node๋ฅผ ์ˆ˜ํ–‰ํ•ฉ๋‹ˆ๋‹ค. ์ด ๊ฒฝ์šฐ, split ํ•˜๋ฉด์„œ child -> parent๋กœ mid_key๋ฅผ ์˜ฌ๋ ค์ฃผ๊ฒŒ ๋˜๋Š”๋ฐ, ๋งŒ์•ฝ parent ๋…ธ๋“œ๊ฐ€ ๊ฝ‰์ฐฌ ์ƒํƒœ์˜€๋‹ค๋ฉด, parent๋„ ๋…ธ๋“œ ๋ถ„ํ• ์„ ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.

        [40|80|120]
      /    |    |    \
 [10|20|30] [50|60|70] [90|100|110] [130]

์ด ์ƒํƒœ์—์„œ 5๋ฅผ insert ํ•˜๊ฒŒ ๋˜๋ฉด,

        [40|80|120]
      /    |    |    \
 [5|10|20|30] [50|60|70] [90|100|110] [130]
 (overflow)
---
        [20 | 40 | 80 | 120]
         (overflow)
      /     |     |     |     \
   [5|10]   [30] [50|60|70] [90|100|110] [130]
  (20์ด ๋ถ€๋ชจ๋กœ ์˜ฌ๋ผ๊ฐ.)
---
          [80]
        /         \
      [20|40]         [120]
     /   |   \          /  \
   [5|10]  [30] [50|60|70] [90|100|110] [130]

2๋ฒˆ ๋ฐฉ์‹์€ split ํ›„, parent๋กœ ์˜ฌ๋ผ๊ฐ„ mid_key๋กœ ์ธํ•ด parent overflow๊ฐ€ ๋ฐœ์ƒํ•˜๊ณ , split ๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๊ทธ๋ฆฌ๊ณ  ๋˜, mid_key๊ฐ€ ์˜ฌ๋ผ๊ฐ€๋ฉด์„œ overflow -> split์„ ์œ ๋ฐœํ•ฉ๋‹ˆ๋‹ค. ์ฆ‰, split ์ด๋ผ๋Š” ํ–‰์œ„ ์ž์ฒด๊ฐ€ ์—ฐ์‡„์ ์œผ๋กœ bottom-up์œผ๋กœ ์ „ํŒŒ ๋ฉ๋‹ˆ๋‹ค.

ํ•˜์ง€๋งŒ, 1๋ฒˆ์€ split์ด ์—ฐ์‡„์ ์œผ๋กœ ๋ฐœ์ƒํ•˜์ง€ ์•Š๊ณ  ์ „ํŒŒ๋˜์ง€๋„ ์•Š์Šต๋‹ˆ๋‹ค. ์ƒ์œ„ ๋…ธ๋“œ๋ฅผ ์ˆœํšŒํ•  ๋•Œ, ์ด๋ฏธ full ์ƒํƒœ์ธ ๋…ธ๋“œ๋ฅผ ์ •๋ฆฌํ•˜๊ณ  ์™”๊ธฐ ๋•Œ๋ฌธ์ž…๋‹ˆ๋‹ค. ๊ทธ๋ž˜์„œ ์ค‘๊ฐ„/๋ฆฌํ”„ ๋…ธ๋“œ์—์„œ split์ด ๋ฐœ์ƒํ•˜๋”๋ผ๋„, ๊ทธ๊ฒŒ ์—ฐ์‡„์ ์œผ๋กœ ์ „ํŒŒ๋˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.

Search

ํƒ์ƒ‰์€ ๊ทธ๋Œ€๋กœ ์‰ฝ์Šต๋‹ˆ๋‹ค ใ…Žใ…Ž

class BTree:
  ...
  def search(self, key: int):
    return self._search_node(self.root, key)

  def _search_node(self, node: BTreeNode, key: int):
    # ์–ด๋–ค child๋ฅผ ํƒ์ƒ‰ํ• ์ง€ ๊ฒฐ์ •
    idx, key_exists = self._find_child_index(node, key)

    # ๋งŒ์•ฝ key๊ฐ€ node์— ์žˆ๋‹ค๋ฉด, True๋ฅผ ๋ฐ˜ํ™˜
    if key_exists:
      return True

    if node.leaf:
      return False

    return self._search_node(node.children[idx], key)

B+Tree

์ž ์ด์ œ, ์ด ๊ธ€์„ ์ž‘์„ฑํ•˜๊ฒŒ ๋งŒ๋“  ์ฃผ์ธ๊ณต B+Tree ์ž…๋‹ˆ๋‹คโ€ฆ!

B-Tree์™€ B+Tree๋Š” ํŠธ๋ฆฌ์˜ ํ˜•์ƒ์€ ๋™์ผํ•ฉ๋‹ˆ๋‹ค. ๋‘˜๋‹ค t ๊ณ„์ˆ˜๋ฅผ ๊ฐ€์ง€๊ณ  ์žˆ๊ณ , ๋…ธ๋“œ์˜ ํ‚ค ๊ฐฏ์ˆ˜๋Š” 2t-1๊นŒ์ง€ ๊ฐ€๋Šฅํ•ฉ๋‹ˆ๋‹ค. ๋…ธ๋“œ๊ฐ€ ์ฐจ๋ฉด ๋…ธ๋“œ๋ฅผ ๋ถ„ํ• ํ•ด ์ œ์•ฝ ์กฐ๊ฑด์„ ๋งŒ์กฑํ•˜๊ฒŒ ํ•ฉ๋‹ˆ๋‹ค.

์ผ๋‹จ ์Šค์ผˆ๋ ˆํ†ค ๋ถ€ํ„ฐ ๋งˆ๋ จ ํ•ฉ์‹œ๋‹ค.

class BPlusTreeNode:
  def __init__(self, leaf: bool = True):
    self.leaf = leaf
    self.keys: list[int] = []

    # internal node์—์„œ๋งŒ ์‚ฌ์šฉ
    self.children: list[BPlusTreeNode] = []

    # leaf node์—์„œ๋งŒ ์‚ฌ์šฉ
    self.next: BPlusTreeNode | None = None

class BPlusTree:
  def __init__(self):
    self.t = 2
    self.root = BPlusTreeNode(leaf=True)

  def insert(self, key: int):
    pass

  def search(self, key: int):
    pass

B+Tree๋Š” ๋ฆฌํ”„ ๋…ธ๋“œ๋ฅผ Linked List๋กœ ์—ฐ๊ฒฐํ•ฉ๋‹ˆ๋‹ค. ๊ทธ๋ž˜์„œ self.next ๋ณ€์ˆ˜๊ฐ€ ์ถ”๊ฐ€๋˜์—ˆ์Šต๋‹ˆ๋‹ค.

Insert

class BPlusTree:
  ...
  def insert(self, key: int):
    root = self.root
    if len(root.keys) == 3:
      # root๊ฐ€ ๊ฝ‰ ์ฐจ ์žˆ๋‹ค๋ฉด, ์ƒˆ๋กœ์šด root๋ฅผ ๋งŒ๋“ค๊ณ  split
      new_root = BPlusTreeNode(leaf=False)

      # ์ง€๊ธˆ์˜ ๋ฃจํŠธ๋ฅผ ์‹ ๊ทœ ๋ฃจํŠธ์— ๋ถ™์ž„
      new_root.children.append(root)

      self.root = new_root
      self._split_child(new_root, 0)

      # ์ด์ œ ์ƒˆ๋กœ์šด key๋ฅผ insert
      self._insert_non_full(root, key)
    else:
      self._insert_non_full(root, key)

  def _insert_non_full(self, node: BPlusTreeNode, key: int):
    # B-Tree์™€ ๋™์ผ

B+Tree๋Š” B-Tree ํŠธ๋ฆฌ์™€ ํ˜•์ƒ์ด ๋™์ผํ•˜๊ธฐ ๋•Œ๋ฌธ์—, ๋…ธ๋“œ๋ฅผ ์ถ”๊ฐ€ํ•˜๋Š” ๊ณผ์ •์ด ํฐ ํ‹€์—์„œ ๋™์ผํ•ฉ๋‹ˆ๋‹ค.

ํ•˜์ง€๋งŒ, _find_child_index()์™€ _split_child() ๋‹จ๊ณ„์—์„œ ์ฐจ์ด๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค.


class BPlusTree:
  ...
  def _find_child_index(self, node: BPlusTreeNode, key: int) -> int:
    # B+tree์—์„œ๋Š” key == n_key์ธ ๊ฒฝ์šฐ, ์˜ค๋ฅธ์ชฝ child๋กœ ๋‚ด๋ ค๊ฐ.
    # B+tree์—์„œ internal node์˜ key๋Š” ๊ธธ ์•ˆ๋‚ด์šฉ key์ด๊ธฐ ๋•Œ๋ฌธ์ž„.

    # internal node ์ „์šฉ ํ•จ์ˆ˜
    if node.leaf:
      raise ValueError("Node is a leaf, cannot find child index.")

    for i, n_key in enumerate(node.keys):
      if key < n_key:
        return i

    return len(node.keys)

B-Tree์—์„  key == n_key๋กœ ์ผ์น˜ํ•˜๋Š” ๊ฒฝ์šฐ, ํƒ์ƒ‰์„ ์ค‘๋‹จํ•˜๊ณ  ๋…ธ๋“œ์˜ ์œ„์น˜๋ฅผ ๋ฐ˜ํ™˜ํ•˜๊ฑฐ๋‚˜ ๋…ธ๋“œ ์‚ฝ์ž…์„ ์ค‘๋‹จ ํ–ˆ์Šต๋‹ˆ๋‹ค.

ํ•˜์ง€๋งŒ, B+Tree์—์„  internal node์—์„œ key == n_key๋กœ ์ผ์น˜ํ•œ๋‹ค๋ฉด, ๊ทธ๊ฑธ ๊ธฐ์ค€์œผ๋กœ ๋‹ค์Œ ์ˆœํšŒ ๋…ธ๋“œ๋ฅผ ๊ฒฐ์ • ํ•ฉ๋‹ˆ๋‹ค.

      [20    |   40]
    /        |       \
 [5|10] -> [20|30] -> [40|50]

๊ทธ ์ด์œ ๋Š” B+Tree์—์„  ์˜ค์ง ๋ฆฌํ”„ ๋…ธ๋“œ๋ฅผ ๊ธฐ์ค€์œผ๋กœ Point Query, Range Query์˜ ๊ฒฐ๊ณผ๋ฅผ ๋ฐ˜ํ™˜ํ•˜๋Š” ๊ทœ์น™์ด ์žˆ๊ธฐ ๋•Œ๋ฌธ์ž…๋‹ˆ๋‹ค. ๊ทธ๋ž˜์•ผ WHERE id BETWEEN 10 AND 20๊ณผ ๊ฐ™์€ Range Query๊ฐ€ ๋ฆฌํ”„ ๋…ธ๋“œ 10์—์„œ ์‹œ์ž‘ํ•ด Linked List๋กœ 20๊นŒ์ง€ ์ˆœํšŒํ•  ์ˆ˜ ์žˆ๊ธฐ ๋•Œ๋ฌธ์ž…๋‹ˆ๋‹ค.

์ค‘๊ฐ„์— internal node์—์„œ ํƒ์ƒ‰์„ ๋ฉˆ์ถ˜๋‹ค๋ฉด, ์ด๋Ÿฐ Range Query๋ฅผ ์œ„ํ•ด ๋ณ„๋„ ๊ตฌํ˜„์„ ๋งˆ๋ จํ•ด์•ผ ํ•  ๊ฒƒ ์ž…๋‹ˆ๋‹ค.


class BPlusTree:
  ...
  def _split_child(self, parent: BPlusTreeNode, idx: int):
    child = parent.children[idx]

    if child.leaf:
      self._split_child_leaf(parent, idx)
    else:
      self._split_child_internal(parent, idx)

B+Tree๋Š” ๋…ธ๋“œ ๋ถ„ํ•  ๋•Œ, leaf์ธ์ง€ internal node์ธ์ง€ ๊ตฌ๋ถ„ํ•ด์„œ ์ˆ˜ํ–‰ํ•ฉ๋‹ˆ๋‹ค.

  def _split_child_leaf(self, parent: BPlusTreeNode, idx: int):
    # B-tree์—์„œ๋Š” leaf node๊ฐ€ ๊ฝ‰ ์ฐจ๋ฉด, ๊ฐ€์šด๋ฐ key๋ฅผ parent๋กœ ์˜ฌ๋ฆฌ๊ณ , ์˜ค๋ฅธ์ชฝ ์ ˆ๋ฐ˜์„ ์ƒˆ๋กœ์šด leaf node๋กœ ๋งŒ๋“ฆ.
    # B+Tree์—์„œ๋Š” leaf node๊ฐ€ ๊ฝ‰ ์ฐจ๋ฉด, ์˜ค๋ฅธ์ชฝ ์ ˆ๋ฐ˜์„ ์ƒˆ๋กœ์šด leaf node๋กœ ๋งŒ๋“ค๊ณ , parent์—๋Š” ์˜ค๋ฅธ์ชฝ ์ ˆ๋ฐ˜์˜ ์ฒซ ๋ฒˆ์งธ key๋ฅผ ์˜ฌ๋ฆผ.

    child = parent.children[idx]

    new_child = BPlusTreeNode(leaf=True)

    # [10|20|30] -> [10], [20|30]
    new_child.keys = child.keys[1:]  # ์—ฌ๊ธฐ๊ฐ€ ๋‹ค๋ฆ„.
    child.keys = child.keys[:1]

    # [B+Tree] leaf linked list ์—ฐ๊ฒฐ
    new_child.next = child.next
    child.next = new_child

    # parent์— new_child ์ถ”๊ฐ€
    parent.children.insert(idx + 1, new_child)

    # ๋ถ€๋ชจ์˜ ์ž์‹ ๋ชฉ๋ก์— ์ƒˆ child๋ฅผ ์‚ฝ์ž…
    parent.keys.insert(idx, new_child.keys[0])

B+Tree์—์„  ๋ฆฌํ”„ ๋…ธ๋“œ์˜ Linked List ๊ตฌ์กฐ๊ฐ€ ๋ถ„ํ•  ํ›„์—๋„ ์ด์–ด์ ธ์•ผ ํ•ฉ๋‹ˆ๋‹ค. ๊ทธ๋ž˜์„œ paraent์˜ key ๋ชฉ๋ก์— mid_key๋ฅผ ์ถ”๊ฐ€๋งŒ ํ• ๋ฟ, mid_key ์ž์ฒด๋Š” ์—ฌ์ „ํžˆ ๋ฆฌํ”„ ๋…ธ๋“œ์˜ key ๋ชฉ๋ก์— ์กด์žฌํ•ฉ๋‹ˆ๋‹ค.

  def _split_child_internal(self, parent: BPlusTreeNode, idx: int):
    # ๋‚ด๋ถ€ ๋…ธ๋“œ๋Š” ๋ถ€๋ชจ๋กœ key๋ฅผ ์ด๋™ ์‹œ์ผœ์•ผ ํ•จ(move-up).
    child = parent.children[idx]

    new_child = BPlusTreeNode(leaf=False)

    # [10|20|30] -> [10], 20์€ ๋ถ€๋ชจ๋กœ, [30]
    promote_key = child.keys[1] # move-up ๋Œ€์ƒ
    new_child.keys = child.keys[2:] # move-up์œผ๋กœ ์ธํ•ด ์‚ญ์ œ
    child.keys = child.keys[:1]

    # child ๋ถ„ํ• 
    new_child.children = child.children[2:]
    child.children = child.children[:2]

    # parent์— new_child ์ถ”๊ฐ€
    parent.children.insert(idx + 1, new_child)

    # ๋ถ€๋ชจ์˜ ์ž์‹ ๋ชฉ๋ก์— ์ƒˆ child๋ฅผ ์‚ฝ์ž…
    parent.keys.insert(idx, promote_key)

ํ•˜์ง€๋งŒ, ๋‚ด๋ถ€ ๋…ธ๋“œ๋Š” B-Tree์™€ ๋™์ผํ•˜๊ฒŒ ๋…ธ๋“œ ๋ถ„ํ• ์ด ์ด๋ค„์ง‘๋‹ˆ๋‹ค. parent์— mid_key๋ฅผ ์ด๋™์‹œํ‚ต๋‹ˆ๋‹ค(move-up).

Search

ํƒ์ƒ‰์€ ์ข€๋” ์‰ฝ๊ฒŒ ๊ตฌํ˜„ ํ•ฉ๋‹ˆ๋‹ค.

class BPlusTree:
  ..
  def search(self, key: int):
    # internal node์— ์žˆ๋Š” key๋Š” ๊ธธ ์•ˆ๋‚ด์šฉ key์ž„.
    # ์ค‘๊ฐ„์— key์™€ ์ผ์น˜ํ•˜๋Š” ๊ฐ’์ด internal node์— ์žˆ๋”๋ผ๋„, ํ•ญ์ƒ leaf node ๋๊นŒ์ง€ ๋‚ด๋ ค๊ฐ€์•ผ ํ•จ.

    leaf = self._find_leaf(self.root, key)
    return key in leaf.keys

  def _find_leaf(self, node: BPlusTreeNode, key: int):
    """
    key๊ฐ€ ๋“ค์–ด์žˆ์„ ๊ฐ€๋Šฅ์„ฑ์ด ์žˆ๋Š” leaf node๊นŒ์ง€ ๋‚ด๋ ค๊ฐ€๋Š” ํ•จ์ˆ˜
    B+Tree๋Š” ๊ฒ€์ƒ‰์ด ํ•ญ์ƒ leaf์—์„œ ๋๋‚จ.
    """

    if node.leaf:
      return node
    else:
      idx = self._find_child_index(node, key)
      return self._find_leaf(node.children[idx], key)

์ค‘๊ฐ„ ๋…ธ๋“œ์— ์žˆ๋Š” keys๋“ค์„ ์ฐธ๊ณ ํ•ด ๋ฆฌํ”„ ๋…ธ๋“œ๊นŒ์ง€ ๋„๋‹ฌํ•ฉ๋‹ˆ๋‹ค. ๋ฆฌํ”„ ๋…ธ๋“œ์— ๋„๋‹ฌ ํ–ˆ์„ ๋–„, ์ฐพ๊ณ ์ž ํ•˜๋Š” key๊ฐ€ ๋ฆฌํ”„ ๋…ธ๋“œ์— ์กด์žฌํ•˜๋Š”์ง€๋ฅผ ํ™•์ธ ํ•ฉ๋‹ˆ๋‹ค.

๋งบ์Œ๋ง

์ง์ ‘ ๊ตฌํ˜„ํ•ด๋ณด๋‹ˆ ์ง€๊ธˆ๊นŒ์ง€ ์• ๋งคํ•˜๊ณ  ์ดํ•ดํ•˜๊ณ  ๋ฐ›์•„๋“ค์˜€๋˜ ๊ฒƒ๋“ค์ด ๊ตฌ์ฒดํ™” ๋˜์—ˆ๋‹ค. BST์™€ B-Tree์˜ ์ฐจ์ด์ , ๊ทธ๋ฆฌ๊ณ  B-Tree์™€ B+Tree์˜ ์ฐจ์ด์ . ์ด๋Ÿฐ ๊ฒƒ๋“ค์€ ์ด์ „์—๋Š” ๋‹ค ๋น„์Šทํ•œ ๊ฑฐ๋ผ๊ณ  ์ƒ๊ฐํ•˜๊ณ  ๋„˜์–ด๊ฐ”์—ˆ๋‹ค.

ํ•˜์ง€๋งŒ, ์ด์ œ ์‹œ์ž‘์ด๋‹ค!! ๋ณธ๋ž˜ xfs, ext4 ํŒŒ์ผ ์‹œ์Šคํ…œ์„ ์ดํ•ดํ•˜๋Š” ๊ณผ์ •์—์„œ ์ด๊ฑธ ๋งŒ๋‚ฌ๊ธฐ ๋•Œ๋ฌธ์—, ๋“ค์ธ ์‹œ๊ฐ„์ด ์˜จ์ „ํžˆ ๊ฐ€์น˜๋ฅผ ๋‚ด๋ ค๋ฉด xfs, ext4๋ฅผ ์ดํ•ดํ•˜๋Š”๋ฐ ์ด๊ฑธ ์จ๋จน์–ด์•ผ ํ•œ๋‹ค.

์ž๋ฃŒ๊ตฌ์กฐ ์ž์ฒด๋ฅผ ๊ตฌํ˜„ํ•˜๋Š” ๋Šฅ๋ ฅ์€ ์ง€๊ธˆ ์‹œ๋Œ€์— ์ค‘์š”ํ•˜์ง€ ์•Š๋‹ค. ์ค‘์š”ํ•œ ๊ฑด ๋„๊ตฌ๋ฅผ ์–ธ์ œ/์™œ/์–ด๋–ป๊ฒŒ ์จ์•ผ ํ•˜๋Š”์ง€๋ฅผ ์•„๋Š” ๊ฒƒ์ด๋‹ค.

Categories:

Updated: