Bin-arrays are a compact way of storing a large collection of lists while
allowing modifications in O(1) time. They share many of the
characteristics of doubly-linked lists and are applicable to many of the
situations where doubly-linked lists would be used. Bin-arrays are most
useful, however, when dealing with large amounts of data, and when using a
minimal amount of memory is more important than having extremely fast
lookups, insertions, or deletions.
Structure and Algorithms
A precise statement of the problem being solved is, how do you compactly
store a large collection of homogeneous lists of varying lengths while
allowing deletions of arbitrary elements and insertions at arbitrary
positions in constant time?
As a basis for evaluating the effectiveness of our proposed solution,
consider the simpler problem of storing a known set of values in an
immutable structure. Figure 1 above shows the most
compact way of storing the pair of lists [1, 2, 3] and [4, 5]. The general
scheme is to place all elements of each list in a contiguous block of slots
in a single array. If this is done carefully, we can even do without
explicitly recording the index of the last element of each list, and
thereby save space. Specifically, we can deduce the index of the end of a
given list from the index of the start of the following list.
The most straightforward solution to the stated problem is shown in Figure 2 above. The general scheme is to allocate a
node for each element in each list and to link together the elements using
forward and backward pointers. This approach is easy to implement, easy to
understand, and reasonably compact. However, the memory usage is not as
low as it could be for the following reasons. First, allocating a large
number of small objects can cause an undesirable amount of overhead in the
general purpose memory allocation routines of a given language (for
example, malloc() in C). Some implementations of memory allocators are
better able to deal with this situation than others, but it is probably not
unreasonable to assume that there will be undesirable overhead. (Note that
in C++, a programmer can write a class-specific operator new() to
dramatically reduce, though perhaps not entirely eliminate, the overhead.)
Second, using a separate object for each element entails a nontrivial
amount of fixed overhead per object in Java. This is so because every
class in Java must derive from Object, which means that even an
empty object takes up around 16 bytes.
Figure 3 above shows an approach which avoids both
these problems by allocating only three arrays rather than lots of
individual objects. The general scheme is to store each element in an
array and to use a pair of array indices to link together the elements.
Memory usage, however, may still not be as low as could be desired. If the
user data consists simply of 4-byte integers and each link data is a 4-byte
integer, then there is a constant 200% overhead in this representation.
When storing a million elements, this amounts to 8MBs of overhead. This
may seem like a tolerable amount of overhead given the large amounts of
memory common in even desktop systems these days, but it still seems
worthwhile to try to reduce the memory usage even further. It is quite
likely that we may want to deal with not just a million elements, but say
ten million elements, or even fifty million elements. These numbers imply
overheads of 80MBs and 400MBs, which are sizable amounts even today.
The bin-arrays data structure, shown in Figure 4
above, is a modification of the array-based linked-list approach. It
reduces memory usage by reducing the overhead of the link data. In this
scheme, the elements of a given list are stored in one of four arrays. The
array chosen depends on the length of the list. A list of length 1 (for
example, [6]) is stored in the 1-bins array; one of length 2 (for example,
[4, 5]) is stored in the 2-bins array; one of length 3 (for example, [1, 2,
3]) is stored in the 3-bins array; and finally, one of length 4 or longer
(for example, [7, 8, 9, 10, 11, 12, 13]) is stored in the threaded 4-bins
array. Inserting to, or deleting from, a list may cause that list to move
to a different array. For example, appending to the list [6] will cause it
to move successively to the 2-bins array, then the 3-bins array, and
finally to the 4-bins array. Similarly, deleting from the list [1, 2, 3]
will cause it to move successively to the 2-bins array and then to the
1-bins array. Two bits are reserved in the index stored in the index array
for indicating which of these four arrays actually contains the elements of
a given list.
The 1-bins, 2-bins, and 3-bins arrays store user data contiguously in
fixed-size blocks. Because each stored list fits entirely within a block,
we can do without storing any link data. Hence, the memory usage in this
scheme for a collection of lists containing fewer than 4 elements is near
optimal.
The 4-bins array stores the lists that have 4 or more elements. The
approach is similar to the array-based linked-list representation except
that there are now four data elements per a pair of forward and backward
links. There are also a few other slight differences. The "previous" link
of the first bin of a given list stores the index of the last bin of the
same list. This makes it possible to find the last bin in constant time,
in order to do appends, and to determine when the end of the chain of bins
has been reached during traversals. Another difference is that the "next"
link of the last bin gives the size of the list. In combination with the
"previous" link of the first bin, this makes it possible to determine the
length of any list in constant time. Finally, because each bin has room
for four elements, and because not all of the slots may be occupied at a
given moment, two bits in the "previous" link of each bin is reserved for
recording the current occupancy (assuming that all data items are packed at
the beginning of the bin).
There are two distinct algorithms for modifying the lists in the 4-bins
array. One algorithm assumes that the lists are unordered, and the other
that they are ordered. All lists in a given 4-bins array must be of the
same type, either ordered or unordered.
The algorithm for unordered lists is the simpler of the two and always
ensures that the 4-bins array is used in the most optimal way. The
procedures for inserting and deleting an element is as follows: (i) to
insert an element, simply append it to the end of the list, and (ii) to
delete an element, overwrite the slot of the element with the last element
in the list. See Figure 5 below.
The algorithm for ordered lists is only a little more involved. We
introduce some notation to ease the discussion. Number the bins that make
up a list sequentially from zero, and let u(i) be the number of data
items actually occupying bin i; that is, u_i is the usage of
bin i.
We assume the minimum amount of memory needed to store a dataset is
determined directly by the number of bytes in the dataset itself. So we
consider that a dataset of one million 4-byte integers requires a minimum
of 4 million bytes, even if the values are sufficiently limited in their
range to make it possible to store the same data without loss in 1 million
bytes. That is, we are excluding the quite different definition of minimum
memory usage that would arise in a discussion about compression. The
purpose of the following analysis is to examine the degree of overhead
beyond this straightforward calculation of the minimum amount.
As a basis for comparison, we note that the minimal representation uses
exactly the minimum amount of memory. Hence, the ratio waste/data
is zero for this scheme. To determine this ratio for the object-based
linked-list representation, suppose that each data element takes up 4
bytes, that each link pointer takes up 4 bytes, and that each object takes
up a further 16 bytes as a consequence of descending from Object
in Java. Then waste/data = (2*4 + 16) / 4 = 6. To determine this
ratio for the array-based linked-list representation, suppose the same
sizes as given above for data and links. Then waste/data = 2*4 / 4 =
2. The determination of this ratio for the bin-arrays scheme is more
complicated; however, we can see immediately that if all lists have fewer
than 4 elements, then waste/data = 0.
Therefore, without a loss of generality, we can restrict our analysis to
lists with 4 or more elements. Moreover, since the insertion and deletion
algorithms work on each list independently of all others, we can further
restrict our analysis to the memory usage of a single list. So suppose
that we have a list of n >= 4 elements. There are two analyses to
perform, one for the case of unordered lists and another for ordered lists.
We first introduce some notation. Let k be the number of slots
available for user data each bin. Let l be the number of slots
taken up by links (or other administrative data) in each bin. Let n =
kq + r, 0 <= r < k be the number of elements in a list. Finally, let
m be the number of bins taken up by a list. In our case, k
is equal to 4, and l is equal to 2. In the following analysis,
assume that each data element and each link takes up 4 bytes.
It is easy to see that when a list is unordered all the bins except the
last will be fully occupied. Hence, if a list has n = kq + r, 0 <= r <
k, elements, then the overhead equals the sum of l slots of link
data per bin and k-r data slots wasted in the last bin; that is,
{ lq + l + k - r l + (l + k - r) / q
{ -------------- = -------------------, if r > 0 (1)
waste { kq + r k + r/q
----- = {
data { lq l
{ -- = -, if r = 0 (2)
{ kq k
As n goes to infinity, q goes to infinity, so that Eq. (1)
approaches l/k, which is Eq. (2). In fact, Eq. (2) is the lower
bound of Eq. (1) as Figure 6 below shows for the
case when k = 4 and l = 2. The overhead is lowest
whenever n is a multiple of k. The overhead is highest
when n = k + 1. In Figure 6, we can see that the overhead peaks
at 1.4 when n = 5, remains less than or equal to 1.0 for all
other values of n, and sinks to a minimum of 0.5 whenever n
is a multiple of 4.
The analysis for ordered lists is a little more complicated. The
restriction that the elements be kept in order means that the bins will not
in general stay as well-packed as unordered lists in the face of arbitrary
insertions and deletions. However, the insertion and deletion procedures
ensure that the worst-case memory usage remains bound. They do this by
maintaining the following two invariants on the structure of the chain of
bins that make up a list:
if m = 1, then u(0) = k;
for each 0 <= i < m - 1, u(i) + u(i+1) > k.
The first says that if a list takes up only one bin, then that bin must be
fully occupied. The second says that the usage of any two adjacent bins
must add up to more than k. The second therefore implies that
u(i) > 0 for all i. It is immediately obvious that these
invariants hold when a list is first put into the 4-bins array, which
happens exactly when the list has k elements. We can easily verify
that these invariants hold as arbitrary elements are inserted and deleted.
First, consider the insertion algorithm when
inserting an element into bin i. By hypothesis, the invariants hold
true, and so u(i-1) + u(i) > k and u(i) + u(i+1) > k. It is
easy to see that these relations will continue to hold if any of the
u(j) values is increased by one, which is what would happen if an
element was inserted into bin i-1, i, or i+1. So
suppose that none of the three adjacent bins has room. Then u(i-1) =
u(i) = u(i+1) = k, and bin i will be split into two bins,
i' and i'', with the elements distributed in some manner
between the two. The result will be that u(i') > 0, u(i'') >
0, and u(i') + u(i'') = u(i) + 1 = k + 1. This implies that
u(i-1) + u(i') = k + u(i') > k and u(i'') + u(i+1) = u(i'') + k >
k. Hence the invariants will be maintained.
Next, consider the deletion algorithm when
deleting an element from bin i. First, consider the case where bin
i has only one element. By hypothesis, the invariants hold true,
and so u(i-1) + u(i) > k and u(i) + u(i+1) > k. That is,
u(i-1) > k - 1 and u(i+1) > k - 1. This means that u(i-1)
= u(i+1) = k, and hence u(i-1) + u(i+1) > k. Therefore the
invariants will hold after bin i is unlinked from the chain. Next,
consider the case where bin i has more than one element. Let
i' be bin i after the element has been deleted. If u(i-1)
+ u(i') > k and u(i') + u(i+1) > k, then we are done. So
suppose that u(i-1) + u(i') <= k. Then the deletion algorithm will
combine bins i-1 and i' into bin i'' (see Figure 7 below). In the resulting chain, bins
i-2 and i'' are adjacent, and bins i'' and i+1
are adjacent. We need to show that the second invariant holds in each
pair. First, note that u(i'') > u(i-1) and u(i'') > u(i'),
since u(i'') = u(i-1) + u(i'). Now consider the first pair. By
hypothesis, u(i-2) + u(i-1) > k. Therefore, u(i-2) + u(i'') >
u(i-2) + u(i-1) > k. Now consider the second pair. Similar reasoning
shows that u(i'') + u(i+1) > u(i') + u(i+1) > k. Therefore the
invariants hold in each pair. The other possibility when bin i has
more than one element--namely, u(i') + u(i+1) <= k--is similar.
Hence, in all cases, the invariants are maintained after deletion.
The invariants ensure that the worst-case memory usage is bounded for any
given list length. Intuitively, the invariants impose a lower limit on how
thinly the elements may be spread out in a list by a sequence of insertions
and deletions. The worst-case memory usage is determined only by the
number of bins used, and in particular is independent of the arrangement of
the elements within any bin. It is easy to see that the arrangement of the
elements does not matter, since a list of n elements taking up
m bins always has a waste of (k+l)m - n slots, regardless of
the arrangement. Hence, using the maximum number of bins possible for a
given list length leads to the most waste. Conversely, if a list is stored
in such a way that it creates the most waste, then it will necessarily span
the maximum number of bins possible for its length. For suppose not.
Then a list of n elements is stored with the maximum waste in
x bins, and yet it is possible to store the same list in y >
x bins. The waste for y bins is (k+l)y - n >= (k+l)(x+1) - n
= (k+l)x - n + (k+1) which is greater than the waste for x
bins--a contradiction.
Therefore, the worst-case memory usage for a list of a given length is
determined precisely by the maximum number of bins that the list can
span under the invariants. Suppose a list has length
n = (k+1)q + r, 0 <= r < k+1. Then the maximum number of bins
z that the list can span is given by:
{ 2q + 1, if r > 0 (3)
z = {
{ 2q, if r = 0 (4)
This is a consequence of the second invariant. The reasoning is as
follows. For a list spanning m bins, the invariant requires that
u(i) + u(i+1) > k, 0 <= i < m-1. In particular, it requires that
the sum of the usage of each successive pair of bins be greater than
k; that is, u(0) + u(1) >= k+1, u(2) + u(3) >= k+1,
and so on (see Figure 8 below). In light of this,
consider the case where r = 0, that is, where n = (k+1)q.
Suppose on the contrary that the list spans z' > z = 2q bins. Then
the list spans the bins 0, 1, ... , 2q-1, 2q, ... , z'-1. In
particular, it must be the case that u(2q) >= 1. Now, as we saw,
the second invariant requires that u(0) + u(1) >= k+1, u(2) +
u(3) >= k+1, ... , u(2q-2) + u(2q-1) >= k+1. Hence, u(0) +
u(1) + ... + u(2q-1) + u(2q) >= (k+1)q + u(2q) > (k+1)q. This implies
that the list has more than (k+1)q elements, which is a
contradiction. Therefore, if n = (k+1)q, then the list can span no
more than z bins. Next, consider the case where r > 0. For
this case also, suppose the contrary. Then the list spans z' > z = 2q +
1 bins. Reasoning as in the previous case establishes that u(0) +
u(1) + ... + u(2q-1) >= (k+1)q. This means that the remainder of the
elements, which number at most 0 < r < k+1, must be in the bins
2q, ... , z'-1. Since z' > z, bins 2q and 2q+1
at least are a part of the list. They are also adjacent bins, and so the
second invariant requires that u(2q) + u(2q+1) >= k+1. However,
there are only r elements left with which to fill them, and r <
k+1. Therefore z' cannot be greater than z, and this
finally shows that z is an upper bound in all cases.
We have shown that z is an upper bound on the maximum number of bins
that a list can span. It is in fact exactly the maximum value. There is a
distribution of elements allowed by the invariants which causes a list to
span exactly z bins. The distribution is as follows. Suppose
a list has length n = (k+1)q + r, 0 <= r < k+1. Then distribute
the leading (k+1)q elements in q adjacent pairs of bins such
that the first bin of the pair has a usage of 1, and the second has usage
k. If there are any remaining elements, that is, if r > 0,
then put them all in a final bin (note that r <= k). See
Figure 9 below.
Knowing now the maximum number of bins z that a list may span, and
that the waste for a list of length n spanning z bins is
(k + l)z - n, we can compute the ratio of waste to data as follows:
waste (k + l)z - n (k + l)z
----- = ------------- = -------- - 1 (5)
data n n
{ (k + l)(2q + 1) 2(k + l) + (k + l)/q
{ --------------- - 1 = -------------------- - 1, if r > 0 (6)
{ (k + 1)q + r k + 1 + r/q
= {
{ (k + l)2q 2(k + l)
{ --------- - 1 = -------- - 1, if r = 0 (7)
{ (k + 1)q k + 1
Note how the ratio depends only on k and l in Eq. (7), and
how Eq. (6) approaches Eq. (7) as q goes to infinity, which happens
when n goes to infinity. Figure 10 below
shows the worst-case memory usage plotted alongside the best-case memory
usage. The best-case memory usage of ordered lists is the same as the
memory usage of unordered lists of the same length. For lists of length 4,
5 or 9, the figures for the best and worst cases coincide. The worst-case
memory usage peaks at length 6 to 2.0 and stays less than 2.0 for all other
lengths. We can also see the worst-case memory usage converging to the
value of Eq. (7), namely 1.4.
Bin-arrays can store large amounts of data more compactly than in common
implementations of doubly-linked lists, while still allowing modifications
in constant time. The best straightforward implementation of doubly-linked
lists--the array-based linked-list implementation--has a fixed
waste-to-data ratio of 2. Bin-arrays, even in the worst case, do no worse.
The amount of overhead in real use depends too much on the actual dataset
and the actual sequence of insertions and deletions to be estimated.
However, we can still make a few observations. If all the lists have 3 or
fewer elements, then memory usage will be just a negligible amount above
the minimum. If the lists are unordered, then memory usage will always be
better than the array-based linked-list implementation. In this case, the
waste-to-data ratio reaches a maximum of 1.4 for a list of 5 elements and
remains at or below 1.0 for all other lengths. If the lists are ordered,
and they are only appended to, then memory usage will be the same as for
the case of unordered lists. If the lists are ordered, and elements are
inserted into arbitrary positions, and elements at arbitrary positions are
deleted, then worst-case memory usage can equal that of array-based linked
lists in the case of lists of length 6. For lists of all other lengths,
even the worst-case memory usage will be better than that of array-based
linked lists. Hence, in situations where reducing memory usage is
critical, bin-arrays may be a promising approach to implementing a
collection of doubly-linked lists.
Young Hyun
Last modified: Mon Dec 11 11:05:32 PST 2000
"They are called 'sampans,'" Doctor Bronson explained, "and are made entirety of wood. Of late years the Japanese sometimes use copper or iron nails for fastenings; but formerly you found them without a particle of metal about them." "And I am afraid I betrayed the fact," Bruce admitted. "I might have thought of some other way of accounting for my presence here. Still, that rather piratical-looking young man seemed to think you had done right. What's this about some man picked up in the garden?" CHAPTER LII. THE CAGE IS OPENED. Third. Combination machines can only be employed with success when one attendant performs all the operations, and when the change from one to another requires but little adjustment and re-arrangement. When their train approached ours they looked out of the windows, or opened the doors, and waved and greeted and shouted at the top of their voices. me in the cab, he told me that for three days they gave you up. of mind is gone for ever--but anyway, I never cared much for just Two legs waved over the last cockpit place. She waited, too, made silent by sudden realization of how futile anything that she might say would be. "I am glad to see you again," she faltered; "it is four years since Black River and the cloud-burst." She was angry at her own stupidity and want of resource, and her tone was more casual than she meant it to be. The result of this division shook the last resistance of Walpole. When the motion which had been rejected on the 18th of December—for copies of the correspondence with the King of Prussia—was again put, he made no opposition, and it[79] passed without a division. He made, however, one more attempt to carry his measures. In the disputed election of Chippenham he stood his ground against the petition, and was defeated by a majority of one. It was now clear to himself that he must give way. His relatives and friends assured him that to defer longer was only to court more decided discomfiture. On the 31st of January, he, therefore, prepared to depart for his seat at Houghton, and the next morning he demanded of the king, in a private audience, leave to retire. George, on this occasion, evinced a degree of feeling that did him honour. When the old Minister who had served him through so long a course of years knelt to kiss hands, the king embraced him, shed tears, and begged that he would often come to see him. On the 9th of February Sir Robert was created Earl of Orford, and on the 11th he made a formal resignation of all his places. "I'll chance it," said Alf desperately, reaching for the cup of coffee. "I'm sure it'll be better for me to eat something." "They d?an't care, nuther—it's only me." "Wipe the blood off his face." There was silence, in which a coal fell. She still stood with her arms outstretched; he knew that she was calling him—as no woman had ever called him—with all that of herself which was in his heart, part of his own being. HoME免费一1一级做爰片在线观看
ENTER NUMBET 0018www.lizhanfu888.com.cn www.seekwhy.com.cn dzledu.com.cn b026.com.cn www.vuxk.com.cn toyeso.com.cn www.rszy.net.cn www.vygp.com.cn www.lnxld.com.cn www.ag8.org.cn