什么是符号表?

符号表这种数据结构可以理解为一张抽象的表格,我们将信息(值)存储在其中,每个值都有自己对应且唯一的键关联起来,我们可以通过键来找到其对应的信息。在现实世界中比较通俗的实现就是字典,我们通过指定的一些字母(键)找到其对应的一些单词或者字(值)。

基础实现:无序链表实现

我们首先来看一个比较简单的实现,使用链表可以很好的帮助我们优雅的实现这种数据结构。

# API设计

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
package net.sunzhenyu.miscellaneous.dsa.table;

/**
* The Table Data Structure API
*
* @param <K> Generic Key
* @param <V> Generic Value
* @author <a href="mailto:sunzhenyucn@gmail.com">SunZhenyu</a>
* @version 0.0.1
* @date 2018/06/26
* @since 1.8
*/
public interface Table<K, V> extends Iterable<K> {

/**
* Put value in table with specified key, If the specified key has existed, specified key's value
* will be update.
*
* @param key Specified key
* @param value Specified value
*/
void put(K key, V value);

/**
* Get specified key's value, If specified key's value does not exist, will return {@code null}
*
* @param key Specified key
* @return Specified key's value(if it exists)
*/
V get(K key);

/**
* Delete value with specified key
*
* @param key Specified key
*/
void delete(K key);

/**
* If specified key has existed in table, return {@code true}, else return {@code false}.
*
* @param key Specified key
* @return if it exists, return {@code true}
*/
boolean contains(K key);

/**
* If current table is empty, will return {@code true}, else return {@code false}
*
* @return If current table empty, return {@code true}
*/
boolean isEmpty();

/**
* Return current table's size
*
* @return Return current table's size
*/
int size();

/**
* Return all keys
*
* @return Return all keys
*/
Iterable<K> keys();

/**
* NullPointerException Check
*
* @param objects Specified params
* @throws NullPointerException If specified params has null
*/
//@formatter:off
default void ensureNullPointer(Object... objects) {
if (objects.length == 1)
if (objects[0] == null)
throw new NullPointerException("The param cannot be null.");
else
for (Object object : objects)
if (object == null)
throw new NullPointerException("The param cannot be null.");
}

/**
* The Table Data Structure's Node API
*
* @param <K> Generic Key
* @param <V> Generic Value
* @author <a href="mailto:sunzhenyucn@gmail.com">SunZhenyu</a>
* @version 0.0.1
* @date 2018/06/26
* @since 1.8
*/
interface Entry<K, V> {

/**
* Get key
*
* @return Key
*/
K getKey();

/**
* Get value
*
* @return Value
*/
V getValue();
}

}

关于符号表的实现我们需要严格遵守以下规则:

  • 每个键只对应一个值,符号表中不允许存在重复的键。
  • 如果待插入的键值对的键已经存在,则更新原有的值为新指定的值。
  • 键不允许存在空值。
  • 键的等价性,最好使用不可变的数据类型作为键,否则符号表的一致性是无法保证的。

# 代码实现

代码实现也比较简单,在此处我们将一般链表实现中的\(Node\)看作为\(Entry\),这个\(Entry\)代表了一对键值对,对外提供了访问键和访问值的方法。我们需要实现这个方法,并且需要实现在\(Entry\)中包含下一个节点\(Entry\)的引用,以此构成链表。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
package net.sunzhenyu.miscellaneous.dsa.table;

import java.util.Iterator;
import java.util.NoSuchElementException;
import lombok.AllArgsConstructor;
import lombok.Data;
import net.sunzhenyu.miscellaneous.dsa.queue.ArrayQueue;
import net.sunzhenyu.miscellaneous.dsa.queue.IQueue;

/**
* Sample Table implement
*
* @param <K> Generic Key
* @param <V> Generic Value
* @author <a href="mailto:sunzhenyucn@gmail.com">SunZhenyu</a>
* @version 0.0.1
* @date 2018/06/26
* @since 1.8
*/
public class SimpleTable<K, V> implements Table<K, V> {

private Node<K, V> head;
private int size;

@Data
@AllArgsConstructor
static class Node<K, V> implements Table.Entry<K, V> {

private K key;
private V value;
private Node<K, V> next;

/**
* Get key
*
* @return Key
*/
@Override
public K getKey() {
return key;
}

/**
* Get value
*
* @return Value
*/
@Override
public V getValue() {
return value;
}
}

private class TableIterator implements Iterator<Table.Entry<K, V>> {

private Node<K, V> next;

TableIterator(Node<K, V> next) {
this.next = next;
}

/**
* Returns {@code true} if the iteration has more elements. (In other words, returns {@code
* true} if {@link #next} would return an element rather than throwing an exception.)
*
* @return {@code true} if the iteration has more elements
*/
@Override
public boolean hasNext() {
return next != null;
}

/**
* Returns the next element in the iteration.
*
* @return the next element in the iteration
* @throws NoSuchElementException if the iteration has no more elements
*/
@Override
public Entry<K, V> next() {
Node<K, V> oldNext = next;
next = next.next;
return oldNext;
}
}

/**
* Put value in table with specified key, If the specified key has existed, specified key's value
* will be update.
*
* @param key Specified key
* @param value Specified value
*/
@Override
public void put(K key, V value) {
ensureNullPointer(key);
Node swap = head;
while (swap != null) {
if (swap.key.equals(key)){
swap.value = value;
return;
}
swap = swap.next;
}
head = new Node<>(key, value, head);
size++;
}

/**
* Get specified key's value, If specified key's value does not exist, will return {@code null}
*
* @param key Specified key
* @return Specified key's value(if it exists)
*/
@Override
@SuppressWarnings("unchecked")
public V get(K key) {
ensureNullPointer(key, head);
Object result = null;
Node swap = head;
while (swap != null) {
if (swap.key.equals(key)) {
result = swap.value;
break;
}
swap = swap.next;
}
return (V) result;
}

//@formatter:off
/**
* Delete value with specified key
*
* @param key Specified key
*/
@Override
public void delete(K key) {
if (!contains(key))
throw new NoSuchElementException("Cannot found specified key's element.");
if (key.equals(head.getKey())) {
size--;
head = head.next;
return;
}
Node swap = head;
Node previous = null;
while (swap != null) {
if (key.equals(swap.getKey())) {
if (previous != null)
previous.next = swap.next;
size--;
break;
}
previous = swap;
swap = swap.next;
}
}

/**
* If specified key has existed in table, return {@code true}, else return {@code false}.
*
* @param key Specified key
* @return if it exists, return {@code true}
*/
@Override
public boolean contains(K key) {
return get(key) != null;
}

/**
* If current table is empty, will return {@code true}, else return {@code false}
*
* @return If current table empty, return {@code true}
*/
@Override
public boolean isEmpty() {
return size != 0;
}

/**
* Return current table's size
*
* @return Return current table's size
*/
@Override
public int size() {
return size;
}

/**
* Return all keys
*
* @return Return all keys
*/
@Override
public Iterable<K> keys() {
IQueue<K> keys = new ArrayQueue<>();
for (Node<K, V> item = head; item != null; item = item.next) {
keys.enqueue(item.key);
}
return keys;
}

/**
* Returns an iterator over elements of type {@code K}.
*
* @return an Iterator.
*/
@Override
@SuppressWarnings("all")
public Iterator iterator() {
return new TableIterator(head);
}
}

基础实现:有序数组实现

顾名思义,我们需要两个有序数组来存储键值对,存储键的数组和存储值的数组之间的索引是一一对应的;同样的,数组实现就免不了要解决动态扩容与收缩容量的问题。
在这里我们需要一个方法快速地在存储键的有序数组中找到我们需要找的键,这里我使用了二分搜索算法(\(Binary\ Search\ Algorithm\))来帮助我实现这一需求。

# 代码实现

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
package net.sunzhenyu.miscellaneous.dsa.table;

import java.util.Iterator;
import net.sunzhenyu.miscellaneous.dsa.queue.Queue;

/**
* Orderly simple table impl
*
* @param <K> Generic Key extends Comparable interface
* @param <V> Generic Value
* @author <a href="mailto:sunzhenyucn@gmail.com">SunZhenyu</a>
* @version 0.0.1
* @date 2018/07/09
* @since 1.8
*/
public class OrderlySimpleTable<K extends Comparable<K>, V> implements Table<K, V> {

private K[] keys;
private V[] values;
private int size;
private static final Integer DEFAULT_CAPACITY = 15;

public OrderlySimpleTable() {
this(DEFAULT_CAPACITY);
}

@SuppressWarnings("unchecked")
public OrderlySimpleTable(int initialCapacity) {
if (initialCapacity > 0) {
keys = (K[]) new Comparable[initialCapacity];
values = (V[]) new Object[initialCapacity];
} else {
keys = (K[]) new Object[DEFAULT_CAPACITY];
values = (V[]) new Object[DEFAULT_CAPACITY];
}
size = 0;
}

/**
* Ensure current table not empty
*
* @throws NullPointerException If empty
*/
private void emptyCheck() {
if (isEmpty()) {
throw new NullPointerException("Table is empty.");
}
}

/**
* To ensure current table size
*
* @param newCapacity If it necessary to expansion, will use specified new capacity
*/
//@formatter:off
private void ensureCapacity(int newCapacity) {
if (newCapacity <= 0)
throw new IllegalArgumentException("New capacity cannot smaller and equal than zero.");
if (size >= keys.length || size >= values.length)
expandAndShrink(newCapacity);
}

/**
* If it necessary to expansion or shrink
*
* @param newCapacity Specified new capacity
*/
@SuppressWarnings("unchecked")
private void expandAndShrink(int newCapacity) {
K[] newKeys = (K[]) new Object[newCapacity];
V[] newValues = (V[]) new Object[newCapacity];
System.arraycopy(keys, 0, newKeys, 0, size);
System.arraycopy(values, 0, newValues, 0, size);
keys = newKeys;
values = newValues;
}

/**
* Put value in table with specified key, If the specified key has existed, specified key's value
* will be update.
*
* @param key Specified key
* @param value Specified value
*/
@Override
public void put(K key, V value) {
ensureNullPointer(key);
ensureCapacity(keys.length << 1);
// Try to find the specified key and it's value, if exist, update value and return.
int index = rank(key);
if (index < size && keys[index].compareTo(key) == 0) {
values[index] = value;
return;
}
// If does not exist,insert new key/value pair to current table.
for (int i = size; i > index; i--) {
keys[i] = keys[i - 1];
values[i] = values[i - 1];
}
keys[index] = key;
values[index] = value;
size++;
}

/**
* Get specified key's value, If specified key's value does not exist, will return {@code null}
*
* @param key Specified key
* @return Specified key's value(if it exists)
*/
//@formatter:off
@Override
public V get(K key) {
ensureNullPointer(key);
if (isEmpty()) return null;
int index = rank(key);
if (index < size && keys[index].compareTo(key) == 0)
return values[index];
return null;
}

/**
* Delete value with specified key
*
* @param key Specified key
*/
//@formatter:off
@Override
public void delete(K key) {
emptyCheck();
ensureNullPointer(key);
int index = rank(key);
if (index == size || keys[index].compareTo(key) != 0)
return;
for (int i = index; i < size - 1; i++) {
keys[i] = keys[i + 1];
values[i] = values[i + 1];
}
size--;
keys[size] = null;
values[size] = null;
// Shrink detection
if (size > 0 && size == keys.length >> 2)
expandAndShrink(keys.length >> 1);
}

/**
* Delete minimum key and it's value
*/
public void deleteMin() {
delete(min());
}

/**
* Delete max key and it's value
*/
public void deleteMax() {
delete(max());
}

/**
* If specified key has existed in table, return {@code true}, else return {@code false}.
*
* @param key Specified key
* @return if it exists, return {@code true}
*/
@Override
public boolean contains(K key) {
ensureNullPointer(key);
if (!isEmpty()) {
int rank = rank(key);
return keys[rank].compareTo(key) == 0;
}
return false;
}

/**
* If current table is empty, will return {@code true}, else return {@code false}
*
* @return If current table empty, return {@code true}
*/
@Override
public boolean isEmpty() {
return size == 0;
}

/**
* Return current table's size
*
* @return Return current table's size
*/
@Override
public int size() {
return size;
}

/**
* Return minimum key
*
* @return Minimum key
*/
public K min() {
emptyCheck();
return keys[0];
}

/**
* Return max key
*
* @return Max key
*/
public K max() {
emptyCheck();
return keys[size - 1];
}

/**
* Less than or equal to specified key
*
* @param key Specified key
* @return Eligible max key
*/
public K floor(K key) {
emptyCheck();
ensureNullPointer(key);
int i = rank(key);
if (i < size && keys[i].compareTo(key) == 0)
return keys[i];
if (i == 0)
return null;
else
return keys[i - 1];
}

/**
* Greater than or equal to specified key
*
* @param key Specified key
* @return Eligible minimum key
*/
public K ceiling(K key) {
emptyCheck();
ensureNullPointer(key);
return keys[rank(key)];
}

/**
* Less than specified key's count(Using binary search algorithm)
*
* @param key Specified key
* @return Eligible key's count
*/
//@formatter:off
public int rank(K key) {
ensureNullPointer(key);
int lo = 0, hi = size - 1;
while (lo <= hi) {
int mid = lo + (hi - lo) / 2;
int cmp = key.compareTo(keys[mid]);
if (cmp < 0) hi = mid - 1;
else if (cmp > 0) lo = mid + 1;
else return mid;
}
return lo;
}

/**
* Ranked of specified value
*
* @param rank Specified rank
* @return Eligible key
*/
K select(int rank) {
emptyCheck();
return keys[rank];
}

/**
* Return all keys
*
* @return Return all keys
*/
@Override
public Iterable<K> keys() {
return keys(min(), max());
}

/**
* Return all keys within the specified range(Orderly)
*
* @param lo Start index
* @param hi End index
* @return Eligible keys
*/
//@formatter:off
public Iterable<K> keys(K lo, K hi) {
ensureNullPointer(lo, hi);
Queue<K> keys = new Queue<>();
if (lo.compareTo(hi) > 0)
return keys;
for (int i = rank(lo); i < rank(hi); i++)
keys.enqueue(this.keys[i]);
if (contains(hi))
keys.enqueue(this.keys[rank(hi)]);
return keys;
}

/**
* Returns an iterator over elements of type {@code T}.
*
* @return an Iterator.
* @throws UnsupportedOperationException OrderlySimpleTable doesn't have iterator
*/
@Override
public Iterator<K> iterator() {
throw new UnsupportedOperationException("OrderlySimpleTable doesn't have iterator");
}
}

二叉查找树(\(Binary\ Search\ Tree\))实现

在基础实现中,有无序链表有序数组两种实现,分别使用了顺序查找二分搜索算法两种查找键的方式,实际的情况下,二分搜索算法都比顺序查找要快得多,二分搜索算法也是众多实际应用中的最佳选择。
我们可以总结下基础实现的符号表的成本:

算法(数据结构) 最坏情况(\(N\)次插入后) 平均情况(\(N\)次随机插入后) 是否高效地支持有序相关的操作
顺序查找(无序链表) 查找\(O(N)\) 插入\(O(N)\) 查找\(O(N \div 2)\) 插入\(O(N)\)
二分搜索算法(有序数组) 查找\(O(log\ N)\) 插入\(O(2N)\) 查找\(O(log\ N)\) 插入\(O(N)\)

现在我们是否能够同时满足查找和插入操作都是对数级别的算法和数据结构。当然是可以的,要支持高效地插入操作,我们需要一种链式结构,但不可以是单链的形式,单链是无法使用二分搜索算法的,因为单链的形式只允许沿链表遍历,我们这时需要的是二叉查找树(\(Binary\ Search\ Tree\))这种复杂的数据结构。
TODO:代码待更新