Data Structures and Data Management

Course Notes
Parent item
Computer Science
CS 240
Data Structures
search: θ(height)\theta(height)
insert: θ(height)\theta(height)
delete: θ(height)\theta(height)
Worst case cost for all operations is θ(height)=θ(log n)\theta(height) = \theta(log \space n)
Other Dictionary Implementations
Dictionary ADT
Unordered array or linked list: θ(1)\theta(1) insert, θ(n)\theta(n) search and delete
Ordered array: θ(logn)\theta(log n) search, θ(n)\theta(n) insert and delete
Binary Search Trees: θ(height)\theta(height) search, insert, delete
Balanced BST (AVL): θ(log n)\theta(log \space n) search, insert, and delete
Skip Lists
Expected Space: O(n)
Expected height: O(log n)
search: O(log n)
insert: O(log n)
delete: O(log n)
Dictionaries for special keys
Binary Search in a sorted array
insert: Ө(n)
delete: Ө(n)
search: Ө(log n)
Interpolation Search
T(avg)(n)θ(log log n)T^{(avg)}(n) \in \theta(log \space log \space n) if keys are uniformly distributed
Worst case: Ө(n)
Assumption: prefix-free / Based on bitwise comparisons
insert: Ө(|x|)
delete: Ө(|x|)
where |x| represents the length of binary string x, i.e., the number of bits in x
No leaf labels Trie
Halves the amount of space
Allow Proper Prefixes Trie
more space-efficient
Pruned Trie
Saves space if there are only few bitstrings that are long
Could even store infintie bitstrings (e.g. real numbers)
Must store the full keys
Compressed Trie (Patricia Tries)
delete: O(|x|)
insert: O(|x|)
Multiway Tries
O( |x| * (time to find the appropriate child) )
Array of size Σ|\Sigma| + 1 for each node
O(1) time to find child, O(Σ|\Sigma|n) space
List of children for each node
O(Σ|\Sigma|) time, O(# children) space
Dictionary (AVL?) of children for each node
O(log(# children)) time, O(# children) space
Direct Addressing
M: hash table size
search: Ө(1)
insert: Ө(1)
delete: Ө(1)
Space: Ө(M)
Separate Chaining Hashing
insert: O(1)
search: O(1 + size of bucket)
delete: O(1 + size of bucket)
average bucket size: nM=:α\frac{n}{M} =: \alpha called load factor
With Uniform Hashing Assumption that each hash function value is equally likely, average-case cost of search and delete is O(1+α)O(1 + \alpha). Otherwise, Ө(n) on average
Bucket could be implemented by any dictionary realization
Simplest approach is to use unsorted linked lists for buckets tho
Open Addressing
Avoid the links needed for chaining by permitting only one item per slot
All operations have O(1) average-case run-time if the hash-function is uniform and α\alpha is kept sufficiently small. But worst-case run-time is usually Ө(n).
Linear Probing
h(k, i) = (h(k) + i) mod 11
Independent hash functions
Use multiplicative method for second hash function to make them independent
Pattern Matching
Preprocessing: -
Search Time: O(nm)
Extra space: -
Preprocessing: O(m)
Search Time: O(n)
Extra space: O(m)
Preprocessing: O(m + Σ|\Sigma|)
Search Time: O(n)
Extra Space: O(m + Σ|\Sigma|)
Suffix Tree Preprocessing: O(n2n^2)
Search Time: O(m)
Extra space: O(n)
frequency 낮은 순으로 합치는거
Run-Length (RLE)
반복개수로 표현
ASCII로 인코딩 후 Trie로 표현
새로운 substring은 128부터 시작
permute 후 sort 후 마지막 캐릭터로 문장 만들기