Quicksort, Average Time Complexity

(クイックソート、平均計算量)

Data Structures and Algorithms

7th lecture, November 1, 2018

http://www.sw.it.aoyama.ac.jp/2018/DA/lecture7.html

Martin J. Dürst

AGU

© 2009-18 Martin J. Dürst 青山学院大学

Today's Schedule

 

About the Earthquake Early Warning drill

Around 10:00

 

Leftovers of Last Lecture

Summary of Last Lecture

 

Today's Goals

Using quicksort as an example, understand

 

History of Quicksort

 

Reviewing Divide and Conquer

 

Basic Workings of Quicksort

Ruby pseudocode/implementation: conceptual_quick_sort in 7qsort.rb
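
The file 7qsort.rb is not reproduced in these slides, so the following is only a minimal sketch of what a conceptual quicksort might look like. The method name conceptual_quick_sort_sketch and the choice of the rightmost element as pivot are assumptions made for illustration. It favors clarity over efficiency: new arrays are allocated at every level of recursion.

```ruby
# Conceptual quicksort: easy to read, but not in-place.
# The method name is hypothetical; it is not taken from 7qsort.rb.
def conceptual_quick_sort_sketch(array)
  return array if array.length <= 1        # 0 or 1 elements: already sorted

  pivot = array[-1]                        # rightmost element as pivot (an assumption)
  rest  = array[0...-1]                    # everything except the pivot

  # Split: elements smaller than the pivot go left, all others go right.
  smaller, larger = rest.partition { |x| x < pivot }

  # Recurse on both parts; the "merge" is plain concatenation.
  conceptual_quick_sort_sketch(smaller) + [pivot] + conceptual_quick_sort_sketch(larger)
end

p conceptual_quick_sort_sketch([3, 7, 1, 9, 4, 1, 8])   # prints [1, 1, 3, 4, 7, 8, 9]
```

All the real work happens in the split (Array#partition); nothing is left to do when the sorted parts are put back together.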

 

Comparison of Mergesort and Quicksort

Both algorithms use the same split-recurse-merge pattern, but there are important differences:

                 Mergesort     Quicksort
split            equal size    size unpredictable
work done        on merge      on split
no work needed   on split      on merge

 

Quicksort Implementation Core

  1. Use e.g. the rightmost element as the pivot
  2. Starting from the right, find an element smaller than the pivot
  3. Starting from the left, find an element larger than the pivot
  4. Exchange the elements found in steps 2. and 3.
  5. Repeat steps 2.-4. until no further exchanges are needed
  6. Exchange the pivot with the element in the middle
  7. Recurse on both sides

Ruby pseudocode/implementation: simple_quick_sort in 7qsort.rb
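
The seven steps above can be sketched as an in-place method. This is an illustration under assumptions, not the simple_quick_sort from 7qsort.rb: the method name, the default arguments, and the choice to let both scans stop on elements equal to the pivot are all made up for this sketch.

```ruby
# In-place quicksort following steps 1-7 (hypothetical name, not from 7qsort.rb).
def simple_quick_sort_sketch(array, left = 0, right = array.length - 1)
  return array if left >= right                  # fewer than two elements

  pivot = array[right]                           # step 1: rightmost element is the pivot
  i = left                                       # left-to-right scan position
  j = right - 1                                  # right-to-left scan position (pivot excluded)

  loop do
    j -= 1 while j > left && array[j] > pivot    # step 2: from the right, stop at an element <= pivot
    i += 1 while array[i] < pivot                # step 3: from the left, stop at an element >= pivot
    break if i >= j                              # step 5: scans met, no further exchanges needed
    array[i], array[j] = array[j], array[i]      # step 4: exchange the two elements found
    i += 1
    j -= 1
  end

  array[i], array[right] = array[right], array[i]  # step 6: move the pivot between the two parts
  simple_quick_sort_sketch(array, left, i - 1)     # step 7: recurse on the left part
  simple_quick_sort_sketch(array, i + 1, right)    #         and on the right part
  array
end

p simple_quick_sort_sketch([3, 7, 1, 9, 4, 1, 8])   # prints [1, 1, 3, 4, 7, 8, 9]
```

The left scan needs no bounds check because the pivot itself, sitting at the right end, stops it.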

 

Worst Case Complexity

 

Best Case Complexity

For most algorithms (but there are exceptions):

 

Average Complexity

 

Calculating QA

[1] $Q_A(n) = n + 1 + \frac{1}{n} \sum_{1 \le k \le n} \left( Q_A(k-1) + Q_A(n-k) \right)$

[2] $Q_A(0) + \dots + Q_A(n-2) + Q_A(n-1) = Q_A(n-1) + Q_A(n-2) + \dots + Q_A(0)$

[3] $Q_A(n) = n + 1 + \frac{2}{n} \sum_{1 \le k \le n} Q_A(k-1)$ [use [2] in [1]]

[4] $n\,Q_A(n) = n(n+1) + 2 \sum_{1 \le k \le n} Q_A(k-1)$ [multiply [3] by n]

[5] $(n-1)\,Q_A(n-1) = (n-1)\,n + 2 \sum_{1 \le k \le n-1} Q_A(k-1)$ [[4], with n replaced by n-1]

 

Calculating QA (continued)

[6] $n\,Q_A(n) - (n-1)\,Q_A(n-1) = n(n+1) - (n-1)\,n + 2\,Q_A(n-1)$ [[4]-[5]]

[7] $n\,Q_A(n) = (n+1)\,Q_A(n-1) + 2n$ [simplifying [6]]

[8] $\frac{Q_A(n)}{n+1} = \frac{Q_A(n-1)}{n} + \frac{2}{n+1}$ [dividing [7] by n (n+1)]

$\frac{Q_A(n)}{n+1}$
$= \frac{Q_A(n-1)}{n} + \frac{2}{n+1}$ [repeatedly expand right side of [8] by using [8]]
$= \frac{Q_A(n-2)}{n-1} + \frac{2}{n} + \frac{2}{n+1}$
$= \frac{Q_A(n-3)}{n-2} + \frac{2}{n-1} + \frac{2}{n} + \frac{2}{n+1} = \dots$
$= \frac{Q_A(2)}{3} + \sum_{3 \le k \le n} \frac{2}{k+1}$

$\frac{Q_A(n)}{n+1} \approx 2 \sum_{1 \le k \le n} \frac{1}{k} \approx 2 \int_1^n \frac{1}{x}\,dx = 2 \ln n$ [approximating sum by integral]

 

Result of Calculating QA

$Q_A(n) \approx 2n \ln n \approx 1.39\, n \log_2 n$

O(n log n)

⇒ On average, quicksort needs about 1.39 times as many comparisons as an optimal decision tree
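
As a rough numerical check of this result, recurrence [7] can be iterated directly and compared with 2n ln n. The sketch below assumes the base case Q_A(0) = 0, which these slides do not state explicitly; the method name average_comparisons is hypothetical.

```ruby
# Iterate recurrence [7]: n * Q(n) = (n + 1) * Q(n - 1) + 2n.
# Assumption: base case Q(0) = 0 (not stated in the slides).
def average_comparisons(n)
  q = 0.0
  (1..n).each { |m| q = ((m + 1) * q + 2 * m) / m }
  q
end

[10, 1_000, 100_000].each do |n|
  exact  = average_comparisons(n)       # value from the recurrence
  approx = 2 * n * Math.log(n)          # closed-form approximation 2n ln n
  printf("n=%-7d Q_A(n)=%.1f  2n ln n=%.1f  ratio=%.3f\n", n, exact, approx, exact / approx)
end

# The factor 1.39 comes from converting ln to log2: 2 ln n = (2 ln 2) log2 n.
puts 2 * Math.log(2)                    # about 1.386
```

The ratio is close to 1 but not exactly 1, because the approximation drops lower-order terms that grow only linearly in n.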

 

Distribution around Average

 

Complexity of Sorting

Question: What is the complexity of sorting (as a problem)?

 

Pivot Selection

 

Implementation Improvements

Ruby pseudocode/implementation (excluding split in three): quick_sort in 7qsort.rb

 

Comparing Sorting Algorithms using Animation

Watch animation: sort.svg

 

Stable Sorting

 

Sorting in C and Ruby

 

C's qsort Function

void qsort(
    void *base,        // start of array
    size_t nel,        // number of elements in array
    size_t width,      // element size
    int (*compar)(     // comparison function
        const void *,
        const void *)
  );

 

Ruby's Array#sort

(Klass#method denotes the instance method named "method" of class Klass)

array.sort uses <=> for comparison

array.sort { |a, b| a.length <=> b.length }
This example sorts (e.g. strings) by length

The code block (between { and }) is used as a comparison function
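
A small example of Array#sort with and without a block. The word list is made up, and the words have distinct lengths so that the result does not depend on how ties are ordered.

```ruby
words = %w[banana fig pear]   # sample data (made up for this example)

# Without a block, sort compares elements with <=> (alphabetical for strings).
p words.sort                                    # prints ["banana", "fig", "pear"]

# With a block, the block acts as the comparison function.
p words.sort { |a, b| a.length <=> b.length }   # prints ["fig", "pear", "banana"]

# Swapping a and b reverses the order.
p words.sort { |a, b| b <=> a }                 # prints ["pear", "fig", "banana"]
```

Array#sort returns a new array; the original words array is unchanged.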

 

Ruby's <=> Operator

(also called spaceship operator, similar to strcmp in C)

Relationship between a and b    Return value of a <=> b
a < b                           -1 (or other integer smaller than 0)
a = b                           0
a > b                           +1 (or other integer greater than 0)
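
A few concrete values of <=> for built-in classes, as a quick illustration of the table above. The last line shows what happens when the two operands cannot be compared.

```ruby
p(1 <=> 2)                  # prints -1 (1 is smaller than 2)
p(2 <=> 2)                  # prints 0  (equal)
p(3 <=> 2)                  # prints 1  (3 is greater than 2)
p("apple" <=> "banana")     # prints -1 (strings compare alphabetically)
p([1, 2] <=> [1, 3])        # prints -1 (arrays compare element by element)
p(1 <=> "a")                # prints nil (an integer and a string are not comparable)
```

Because arrays compare element by element, an array of keys can serve as a multi-level sort criterion.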

 

Ruby's Array#sort_by

array.sort_by { |str| str.length } or array.sort_by(&:length)

(sorting strings by length)

array.sort_by { |stu| [stu.year, stu.prefecture] }

(sorting students by year and prefecture)

sort_by calculates the value of the sort criterion once for each array element, in advance, instead of recalculating it for every comparison
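
A runnable sketch of the student example. The Student struct, its fields, and the sample records are hypothetical; they only serve to show that sorting by an array of keys orders by year first and by prefecture second.

```ruby
# Hypothetical record type and data, made up for this example.
Student = Struct.new(:name, :year, :prefecture)

students = [
  Student.new("Sato",   2, "Tokyo"),
  Student.new("Suzuki", 1, "Osaka"),
  Student.new("Tanaka", 2, "Kanagawa"),
  Student.new("Ito",    1, "Chiba"),
]

# The block is called once per student to compute the key [year, prefecture].
by_key = students.sort_by { |stu| [stu.year, stu.prefecture] }
p by_key.map(&:name)          # prints ["Ito", "Suzuki", "Tanaka", "Sato"]

# The same order with sort, where the keys are rebuilt for every comparison.
by_cmp = students.sort { |a, b| [a.year, a.prefecture] <=> [b.year, b.prefecture] }
p by_cmp.map(&:name)          # prints ["Ito", "Suzuki", "Tanaka", "Sato"]
```

When the key is expensive to compute, the sort_by form does less work, because the key is computed only once per element.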

 

Summary

 

Preparation for Next Time

 

Glossary

quicksort
クイックソート
partition
分割
partitioning element (pivot)
分割要素
worst case complexity (running time)
最悪時の計算量
best case complexity (running time)
最善時の計算量
average complexity (running time)
平均計算量
standard deviation
標準偏差
randomized algorithm
ランダム化アルゴリズム
median
中央値
decision tree
決定木
tail recursion
末尾再帰
in one go
一括
stable sorting
安定な整列法
criterion (plural criteria)
基準
block
ブロック