Abstract Datatypes and Data Structures:
Stacks, Queues, ...
(抽象データ型とデータ構造、スタック、キューなど)
Data Structures and Algorithms
4th lecture, October 13, 2022
https://www.sw.it.aoyama.ac.jp/2022/DA/lecture4.html
Martin J. Dürst
© 2009-22 Martin J. Dürst, Aoyama Gakuin University
Today's Schedule
- Summary and homework of last lecture
- Polynomial vs. exponential time
- Finding the (asymptotic) time complexity of an algorithm
- Recurrence relations
- Abstract Data Types
Summary of Last Lecture: Big-O Notation
The asymptotic growth (order of growth) of a function and the time (and space) complexity of an algorithm can be expressed with the Big-O/Ω/Θ notation:
- O(g(n)): Set of functions with lower or same order of growth as g(n)
- Ω(g(n)): Set of functions with larger or same order of growth as g(n)
- Θ(g(n)): Set of functions with same order of growth as g(n)

f(n) ∈ O(g(n)) ⇔ ∃c>0: ∃n₀≥0: ∀n≥n₀: f(n) ≤ c·g(n)
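As a concrete check of this definition (an illustrative example of my own, not from the lecture): for f(n) = 5n² + 30n + 200 and g(n) = n², the witnesses c = 6 and n₀ = 36 work, which a few lines of Ruby can confirm:

# Check f(n) <= c*g(n) for all n >= n0 (Big-O definition, sample values)
f = ->(n) { 5*n*n + 30*n + 200 }   # example function
g = ->(n) { n*n }                  # candidate order of growth
c, n0 = 6, 36                      # witnesses (illustrative choice)
puts (n0..10_000).all? { |n| f.(n) <= c * g.(n) }   # => true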
Summary of Last Lecture: Finding Order of Growth
The order of growth of a function can be found by:
- Looking for appropriate c and n₀
- Calculating lim(n→∞) f(n)/g(n)
- Simplification (e.g. O(5n²+30n+200) → O(n²); O(17 log₅ n) → O(log n))

When using Big-O notation, always try to simplify g() as much as possible.
Last Lecture's Homework
(no need to submit)
Review this lecture's material and the additional handout every day!
On the Web, find algorithms with time complexity O(1), O(log n), O(n), O(n log n), O(n²), O(n³), O(2ⁿ), O(n!), and so on.
Frequent Orders
- O(1): Simple formulæ (e.g. interest calculation), initialization
- O(log n) (logarithmic order/time): binary search, other "divide and conquer" algorithms
- O(n) (linear order/time): proportional to the size of the data, checking all data items once (or a finite number of times)
- O(n log n): Many sorting algorithms, other "divide and conquer" algorithms
- O(n²) (quadratic order/time), O(n³) (cubic order/time): Considering all/most combinations of 2 or 3 data items
- O(2ⁿ): Considering all/most subsets of the data items
- O(n!): Considering all/most permutations of the data items

Example of another order: O(n^2.373), the fastest known algorithm for matrix multiplication (for data of size n²)
Polynomial versus Exponential Growth
Example:
1.1ⁿ ≶ n²⁰
log(1.1)·n ≶ log(n)·20
n/log₁₀(n) ≶ 20/log₁₀(1.1) ≊ 483.2
n₀ ≊ 1541

Conclusion: For any a, b > 1, aⁿ will always eventually grow faster than nᵇ (O(nᵇ) ⊊ O(aⁿ))
(nᵇ is polynomial, aⁿ is exponential)
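A quick brute-force check (my own snippet) confirms the crossover point:

# Find the first n where 1.1**n overtakes n**20
# (floating point is accurate enough at this scale)
n = 2
n += 1 while 1.1**n <= n**20
puts n   # => 1541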
The Importance of Polynomial Time
- What can be called a "realistic" time complexity depends on the
problem
- In general:
- Polynomial time is realistic
- Exponential time is unrealistic
[We will discuss this in more detail in lecture 14]
Finding the (Asymptotic) Time Complexity of an Algorithm
- Find/define the variables that determine the problem/input size (e.g.
n)
- Find the basic operations (steps) in the algorithm that are most
frequently executed
- Express the total number of basic operations (steps) using summation or a
recurrence relation
- Determine the time complexity expressed with big-O
notation
Simplifications permitted by big-O notation can be applied early.
Example: Because constant factors are irrelevant in big-O notation, they can be eliminated when counting steps.
Finding Time Complexity: A Very Simple Example
Time complexity of linear search:
- Variable that determines the size of the input: size of the dictionary, n
- Most frequently executed basic operations: the loop is executed once per data item and runs in constant time
- Total number of steps: n × O(1)
- Time complexity: O(n)
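A minimal sketch of this count in code (illustrative; the names are my own):

# Linear search with an explicit step counter
def linear_search(array, key)
  steps = 0
  array.each_with_index do |item, i|
    steps += 1                        # one basic operation per data item
    return [i, steps] if item == key
  end
  [nil, steps]                        # worst case: n comparisons, i.e. O(n)
end

p linear_search([3, 1, 4, 1, 5, 9], 7)   # => [nil, 6]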
How to Define Input Size Variables
- In many cases, the input size is the number of data items (examples:
search, sort)
- For matrices, ..., often the number of rows or columns is used (matrices
of size n × n or n × m)
- In some cases, the size of individual data items has to be considered
Examples: Size of integers with unlimited precision (see also Fibonacci
number example in lecture 12); length of
variable-length strings, ...
- Sometimes, there are two or more kinds of data, with different size
Example: String matching (text size n and pattern size
m, lecture 11)
How to Identify the Most Frequent Basic Operations
- Usually inside a loop (especially inside multiple loops)
- If there are several independent loops, check all of them
- If the number of operations depends on the values in the input, check the
worst case
Example: For linear search, if the value is not found
- When methods/functions are called, consider the content of the function
Caution: Some methods/functions may hide complexity (e.g. Ruby sort, C strlen, qsort, ...)
Counting Basic Operations using Summation
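The worked example for this slide is not reproduced here; a typical illustration (my own, following the usual pattern) is counting the steps of a double loop over all pairs: the body runs (n−1) + (n−2) + ... + 1 = n(n−1)/2 times, so the time complexity is O(n²). A short Ruby check:

# Count the basic operations of a double loop over all pairs i < j:
# (n-1) + (n-2) + ... + 1 = n(n-1)/2 steps, i.e. O(n^2)
n = 100
count = 0
(0...n).each do |i|
  ((i + 1)...n).each { count += 1 }   # the "basic operation"
end
puts count == n * (n - 1) / 2   # => true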
Counting Basic Operations using Recurrence Relations
- Example program (recursive version of binary search):
def binsearch(array, low, high, key)      # search for key in sorted array[low..high]
  middle = (high + low) / 2               # integer division: middle of the range
  if low >= high                          # only one candidate position left
    array[low] == key ? low : nil         # return its index if found, else nil
  elsif key > array[middle]               # key can only be in the upper half
    binsearch(array, middle + 1, high, key)
  else                                    # key can only be in the lower half (incl. middle)
    binsearch(array, low, middle, key)
  end
end
- Expressing the number of operations as a recurrence B(size):
  B(1) = 1
  B(n) = B(⌈n/2⌉) + 1
Ceiling/Floor Functions
- ⌈⌉: ceiling function, ⌊⌋: floor function
- ⌊a⌋ is the floor function of a, the greatest integer smaller than or equal to a,
  i.e. ⌊a⌋ ∈ ℤ ∧ 0 ≦ a−⌊a⌋ < 1
  Examples: ⌊3.76⌋ = 3, ⌊18.35⌋ = 18
- ⌈a⌉ is the ceiling function of a, the smallest integer greater than or equal to a,
  i.e. ⌈a⌉ ∈ ℤ ∧ 0 ≦ ⌈a⌉−a < 1
  Examples: ⌈3.76⌉ = 4, ⌈18.35⌉ = 19
Recurrence Relations
- Example:
  B(1) = 1
  B(n) = B(⌈n/2⌉) + 1
- A recurrence (relation) is a recursive definition of a mathematical function
- There are several ways to solve recurrences
- One way to solve a recurrence is to discover a pattern by repeated substitution:
  B(n) = B(⌈n/2⌉) + 1 = B(⌈⌈n/2⌉/2⌉) + 1 + 1 = B(⌈n/2²⌉) + 2 = B(⌈n/2³⌉) + 3 = ... = B(⌈n/2ᵏ⌉) + k
- Using B(1) = 1:
  ⌈n/2ᵏ⌉ = 1 ⇒ 1 ≥ n/2ᵏ (> 1/2) ⇒ 2ᵏ ≥ n (> 2ᵏ⁻¹) ⇒ k ≥ log₂ n (> k−1) ⇒ k = ⌈log₂ n⌉
- B(n) = 1 + ⌈log₂ n⌉ ∈ O(log n)
- The asymptotic time complexity of binary search is O(log n)
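A short Ruby check (my own) that the closed form matches the recurrence:

# Recurrence B(1) = 1, B(n) = B(ceil(n/2)) + 1, checked against 1 + ceil(log2 n)
def b(n)
  n == 1 ? 1 : b((n / 2.0).ceil) + 1
end

puts (1..1000).all? { |n| b(n) == 1 + Math.log2(n).ceil }   # => true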
Comparing the Execution Time of Algorithms
(from previous lectures)
Possible questions:
- How many seconds faster is binary search when compared to linear search? (depends on hardware and implementation; no general answer)
- How many times faster is binary search when compared to linear search? (same problem)
- What is the order [of growth of the execution time] of linear search and binary search?
  Linear search is O(n), binary search is O(log n).

Conclusion: Expressing time complexity with O() makes it possible to evaluate the essence of an algorithm, ignoring hardware and implementation differences.
Abstract Data Type (ADT)
- Combination of data with functions operating on data
- The data can only be accessed/changed using the functions
(encapsulation)
- Goals
- Data integrity (examples: birthday and age; bank account)
- Modularization of big software projects
- Related to type theory
- Often implemented by objects in object-oriented programming languages
- Type → class
- Function → member function/method
Typical Examples of Abstract Data Types
- Stack
- Queue
- Linear list
- Dictionary
Caution: A dictionary ADT is not the same as a book
dictionary
- Priority queue
Stack
- General example:
- Stack of trays in cafeteria
- Principle:
- Last-In-First-Out (LIFO)
- Example from IT:
- Function stack (local variables, return address, ...)
- Main methods:
  - new, add/push, delete/pop
- Other methods:
  - empty? (check whether the stack is empty or not)
  - top (return the topmost element without removing it from the stack)
Axioms for Stacks
It is possible to define a stack using the following four axioms:
- Stack.new.empty? ↔ true
- s.push(e).empty? ↔ false
- s.push(e).top ↔ e
- s.push(e).pop ↔ s (here, pop returns the new stack, not the top
element)
(s is any arbitrary stack, e is any arbitrary data item)
Axioms can define a contract between implementation and users
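A minimal array-based sketch of the stack ADT in Ruby (my own illustration; the lecture's reference implementation in 4ADTs.rb may differ), with the four axioms as executable checks:

# Minimal array-based stack; interface modeled on the axioms above
class Stack
  def initialize
    @items = []
  end
  def empty?
    @items.empty?
  end
  def push(e)        # returns the stack itself, so calls can be chained
    @items.push(e)
    self
  end
  def top            # topmost element, without removing it
    @items.last
  end
  def pop            # returns the new stack, not the top element (as in axiom 4)
    @items.pop
    self
  end
end

# The four axioms as executable checks:
s = Stack.new
p Stack.new.empty?          # => true   (axiom 1)
p s.push(42).empty?         # => false  (axiom 2)
p s.push(7).top             # => 7      (axiom 3)
p s.push(9).pop.top         # => 7      (axiom 4: pop undoes the push)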
Queue
- General example:
- Queue of people in cafeteria waiting for food
- Principle:
- First-In-First-Out (FIFO)
- Example from IT:
- Queue of processes waiting for execution
- Main methods:
- add/enqueue, remove/delete/dequeue
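A minimal array-based sketch of the queue ADT (my own; named SimpleQueue to avoid clashing with Ruby's built-in Thread::Queue):

# Minimal array-based queue: enqueue at the back, dequeue at the front
class SimpleQueue
  def initialize
    @items = []
  end
  def empty?
    @items.empty?
  end
  def enqueue(e)     # O(1) amortized at the back of the array
    @items.push(e)
    self
  end
  def dequeue        # O(n): all remaining elements are shifted forward
    @items.shift
  end
end

q = SimpleQueue.new
q.enqueue(1).enqueue(2).enqueue(3)
p q.dequeue   # => 1   (First-In-First-Out)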
Comparing ADTs
Implementation: 4ADTs.rb; some complexities can be improved by using additional variables

ADT     | stack: Array | stack: LinearList | queue: Array | queue: LinearList
create  | O(n)         | O(1)              | O(n)         | O(1)
add     | O(1)         | O(1) or O(n)      | O(n) or O(1) | O(n) or O(1)
delete  | O(1)         | O(n) or O(1)      | O(1) or O(n) | O(1) or O(n)
empty?  | O(1)         | O(1)              | O(1)         | O(1)
length  | O(1)         | O(n)              | O(1)         | O(n)
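The O(n) entries for the array-based queue come from shifting all elements on add or delete; a ring buffer (see glossary) is the classic fix using additional variables: a head index and an element count make both operations O(1). A sketch assuming a fixed capacity (my own illustration, not the 4ADTs.rb code):

# Fixed-capacity ring buffer queue: two extra variables (head index and
# element count) make both enqueue and dequeue O(1)
class RingQueue
  def initialize(capacity)
    @buf = Array.new(capacity)
    @head = 0       # index of the oldest element
    @count = 0      # number of stored elements
  end
  def empty?
    @count.zero?
  end
  def enqueue(e)
    raise "queue full" if @count == @buf.size
    @buf[(@head + @count) % @buf.size] = e   # wrap around the array end
    @count += 1
    self
  end
  def dequeue
    raise "queue empty" if empty?
    e = @buf[@head]
    @head = (@head + 1) % @buf.size
    @count -= 1
    e
  end
end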
Summary
- The order (of growth)/(asymptotic) time complexity of an algorithm can be
calculated from the number of the most frequent basic operations
- Calculation can use a summation or a recurrence (relation)
- The big-O notation compactly expresses the inherent efficiency
of an algorithm
- An abstract data type (ADT) combines data and the operations on
this data
- Stack and queue are typical examples of ADTs
Homework
(no need to submit)
- Order the following orders of growth, and explain the reason for your order:
  O(n²), O(n!), O(n log log n), O(n log n), O(20n)
- Write a simple program that uses the classes in 4ADTs.rb.
Use this program to compare the implementations.
Hint: Use the second part of 2search.rb
as an example.
- Implement the priority queue ADT (use Ruby or any other programming language).
  A priority queue keeps a priority value (e.g. an integer) for each data item.
  In the simplest case, the data consists of a priority value only.
  The items with the highest priority leave the queue first.
  Your implementation can use an array, a linked list, or any other data structure.
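As a starting point only (not a full solution): a naive array-based version in Ruby, where insert is O(1) and delete_max is O(n); better data structures can improve this.

# Naive array-based priority queue
# (in the simplest case, each item is just its priority value)
class PriorityQueue
  def initialize
    @items = []
  end
  def empty?
    @items.empty?
  end
  def insert(priority)   # O(1): append at the back
    @items.push(priority)
    self
  end
  def delete_max         # O(n): scan for the highest priority, which leaves first
    return nil if empty?
    @items.delete_at(@items.index(@items.max))
  end
end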
Glossary
- polynomial growth
- 多項式増加
- exponential growth
- 指数的増加
- integers with unlimited precision
- 非固定長整数
- summation
- 総和
- recurrence (relation)
- 漸化式
- ceiling function
- 天井関数
- floor function
- 床関数
- substitution
- 置換
- abstract data type
- 抽象データ型
- encapsulation
- カプセル化
- data integrity
- データの完全性
- modularization
- モジュール化
- type theory
- 型理論
- object-oriented
- オブジェクト指向 (形容詞)
- type
- 型
- class
- クラス
- member function
- メンバ関数
- method
- メソッド
- stack
- スタック
- cafeteria
- 食堂
- axiom
- 公理
- queue
- 待ち行列、キュー
- ring buffer
- リングバッファ
- priority queue
- 順位キュー、優先順位キュー、優先順位付き待ち行列