Rationale for Ada 2005
8.4 Sets
Sets, like maps, come in two forms: hashed and ordered.
Sets are of course just collections of values and there is no question
of a key (we can perhaps think of the value as being its own key). Thus
in the case of an ordered set the values are stored in order whereas
in the case of a map, it is the keys that are stored in order. As well
as the usual operations of inserting elements into a set and searching
and so on, there are also many operations on sets as a whole that do
not apply to the other containers – these are the familiar set
operations such as union and intersection.
Here is the specification of the ordered sets package
giving just those facilities that are common to both kinds of sets.
generic
type Element_Type is private;
with function "<" (Left, Right: Element_Type) return Boolean is <>;
with function "=" (Left, Right: Element_Type) return Boolean is <>;
package Ada.Containers.Ordered_Sets is
pragma Preelaborate(Ordered_Sets);
function Equivalent_Elements(Left, Right: Element_Type) return Boolean;
type Set is tagged private;
pragma Preelaborable_Initialization(Set);
type Cursor is private;
pragma Preelaborable_Initialization(Cursor);
Empty_Set: constant Set;
No_Element: constant Cursor;
The only differences from the maps package (apart
from the identifiers) are that there is no key type and both "<"
and "=" apply to the element type
(whereas in the case of maps, the operation "<"
applies to the key type). Thus the ordering relationship "<"
defined on elements defines equivalence between the elements whereas
"=" defines equality.
It is possible for two elements to be equivalent
but not equal. For example if they were strings then we might decide
that the ordering (and thus equivalence) ignored the case of letters
but that equality should take the case into account. (They could also
be equal but not equivalent but that is perhaps less likely.)
And as in the case of the maps package, the equality
operation on elements is only used by the function "="
for comparing two sets.
Again we have the usual rules as explained for maps.
Thus if x < y is true then y
< x must be false; x < y and
y < z must imply x
< z; and x = y and y
= x must be the same.
For the convenience of the user the function Equivalent_Elements
is declared explicitly. It is equivalent to
function Equivalent_Elements(Left, Right: Element_Type) return Boolean is
begin
return not (Left < Right) and not (Right < Left);
end Equivalent_Elements;
This function Equivalent_Elements
corresponds to the formal generic parameter of the same name in the hashed
sets package discussed below. This should make it easier to convert between
the two forms of packages.
function "=" (Left, Right: Set) return Boolean;
function Equivalent_Sets(Left, Right: Set) return Boolean;
function To_Set(New_Item: Element_Type) return Set;
function Length(Container: Set) return Count_Type;
function Is_Empty(Container: Set) return Boolean;
procedure Clear(Container: in out Set);
Note the addition of Equivalent_Sets
and To_Set. Two sets are equivalent if they
have the same number of elements and the pairs of elements are equivalent.
This contrasts with the function "="
where the pairs of elements have to be equal rather than equivalent.
Remember that elements might be equivalent but not equal (as in the example
of a string mentioned above). The function To_Set
takes a single element and creates a set. It is particularly convenient
when used in conjunction with operations such as Union
described below. The other subprograms are as in the other containers.
function Element(Position: Cursor) return Element_Type;
procedure Replace_Element(
Container: in out Set;
Position: in Cursor;
New_Item: in Element_Type);
procedure Query_Element(
Position: in Cursor;
Process: not null access procedure (Element: in Element_Type));
Again these are much as expected except that there
is no procedure Update_Element. This is because
the elements are arranged in terms of their own value (either by order
or through the hash function) and if we just change an element in
situ then it might become out of place (this problem does not arise
with the other containers). This also means that Replace_Element
has to ensure that the value New_Item is not
equivalent to an element in a different position; if it is then Program_Error
is raised. We will return to the problem of the missing Update_Element
later.
procedure Move(Target, Source: in out Set);
This is just as for the other containers.
procedure Insert(
Container: in out Set;
New_Item: in Element_Type;
Position: out Cursor;
Inserted: out Boolean);
procedure Insert(
Container: in out Set;
New_Item: in Element_Type);
These insert a new element into the set unless an
equivalent element already exists. If it does exist then the first one
returns with Inserted set to False
and Position indicating the element whereas
the second raises Constraint_Error (the element
value is not changed). If an equivalent element is not in the set then
it is added and Position is set accordingly.
procedure Include(Container: in out Set; New_Item: in Element_Type);
This is somewhat like the last Insert
except that if an equivalent element is already in the set then it is
replaced (rather than raising Constraint_Error).
procedure Replace(Container: in out Set; New_Item: in Element_Type);
In this case, Constraint_Error
is raised if an equivalent element does not already exist.
procedure Exclude(Container: in out Set; Item: in Element_Type);
If an element equivalent to Item
is already in the set, then it is deleted.
procedure Delete(Container: in out Set; Item: in Element_Type);
procedure Delete(Container: in out Set; Position: in out Cursor);
These delete an element. In the first case if there
is no such equivalent element then Constraint_Error
is raised. In the second case if the cursor is No_Element
then again Constraint_Error is also raised
– there is also a check to ensure that the cursor otherwise does
designate an element in the correct set (remember that cursors designate
both an entity and the container); if this check fails then Program_Error
is raised.
And now some new stuff, the usual set operations.
procedure Union(Target: in out Set; Source: in Set);
function Union(Left, Right: Set) return Set;
function "or" (Left, Right: Set) return Set renames Union;
procedure Intersection(Target: in out Set; Source: in Set);
function Intersection(Left, Right: Set) return Set;
function "and" (Left, Right: Set) return Set renames Intersection;
procedure Difference(Target: in out Set; Source: in Set);
function Difference(Left, Right: Set) return Set;
function "–" (Left, Right: Set) return Set renames Difference;
procedure Symmetric_Difference(Target: in out Set; Source: in Set);
function Symmetric_Difference (Left, Right: Set) return Set;
function "xor" (Left, Right: Set) return Set renames Symmetric_Difference;
These all do exactly what one would expect using
the equivalence relation on the elements.
function Overlap(Left, Right: Set) return Boolean;
function Is_Subset(Subset: Set; Of_Set: Set) return Boolean;
These are self-evident as well.
function First(Container: Set) return Cursor;
function Last(Container: Set) return Cursor;
function Next(Position: Cursor) return Cursor;
procedure Next(Position: in out Cursor);
function Find(Container: Set; Item: Element_Type) return Cursor;
function Contains(Container: Set; Item: Element_Type) return Boolean;
These should be self-evident and are very similar
to the corresponding operations on maps. Again unlike the operations
on vectors and lists, Find logically searches
the whole set and not just starting at some point (there is also no Reverse_Find).
Moreover, Find uses the equivalence relation
based on the "<" parameter.
function Has_Element(Position: Cursor) return Boolean;
procedure Iterate(
Container: in Set;
Process: not null access procedure (Position: in Cursor));
These are also as for other containers.
The sets packages conclude with an internal generic
package called Generic_Keys. This package
enables some set operations to be performed in terms of keys where the
key is a function of the element. Note carefully that in the case of
a map, the element is defined in terms of the key whereas here the situation
is reversed. An equivalence relationship is defined for these keys as
well; this is defined by a generic parameter "<"
for ordered sets and Equivalent_Keys for hashed
sets.
In the case of ordered
sets the formal parameters are
generic
type Key_Type(<>) is private;
with function Key(Element: Element_Type) return Key_Type;
with function "<" (Left, Right: Key_Type) return Boolean is <>;
package Generic_Keys is
The following are then common to the package Generic_Keys
for both hashed and ordered sets.
function Key(Position: Cursor) return Key_Type;
function Element(Container: Set; Key: Key_Type) return Element_Type;
procedure Replace(
Container: in out Set;
Key: in Key_Type; New_Item: in Element_Type);
procedure Exclude(
Container: in out Set; Key: in Key_Type);
procedure Delete(Container: in out Set; Key: in Key_Type);
function Find(Container: Set; Key: Key_Type) return Cursor;
function Contains(Container: Set; Key: Key_Type) return Boolean;
procedure Update_Element_Preserving_Key(
Container: in out Set; Position: in Cursor;
Process: not null access procedure (Element: in out Element_Type));
and then finally
end Generic_Keys;
private
... -- not specified by the language
end Ada.Containers.Ordered_Sets;
It is expected that most user of sets will use them
in a straightforward manner and that the operations specific to sets
such as Union and Intersection
will be dominant.
However, sets can be
used as sort of economy class maps by using the inner package Generic_Keys.
Although this is certainly not for the novice we will illustrate how
this might be done by reconsidering the stock problem using sets rather
than maps. We declare
type Part_Type is
record
Part_Number: Integer;
Year: Integer;
Shelf: Character range 'A' .. 'T';
Stock: Integer;
end record;
Here we have put all the information in the one type.
We then declare "<"
much as before
function "<" (Left, Right: Part_Type) return Boolean is
begin
return Left.Part_Number < Right.Part_Number;
end "<";
and then instantiate
the package thus
package Store_Sets is new Ordered_Sets(Element_Type => Part_Type);
The_Store: Store_Sets.Set;
We have used the default generic parameter mechanism
for "<" this time by way of illustration.
In this case we add
items to the store by calling
The_Store.Insert((34618, 1998, 'F', 25));
The_Store.Insert((27134, 2004, 'C', 45));
...
The procedure for checking
the stock could now become
procedure Request(
Part: in Integer: OK: out Boolean;
Year: out Integer; Shelf: out Character) is
C: Cursor;
E: Part_Type;
begin
C := The_Store.Find((Part, others => <>));
if C = No_Element then
OK := False; return; -- no such item
end if;
E := Element(C);
Year := E.Year;
Shelf := E.Shelf;
if E.Stock = 0 then
OK := False; return; -- out of stock
end if;
Replace_Element(C, (E.Part_Number, Year; Shelf, E.Stock–1));
OK := True;
end Request;
This works but is somewhat unsatisfactory. For one
thing we have had to make up dummy components in the call of Find
(using <>) and moreover we have had
to replace the whole of the element although we only wanted to update
the Stock component. Moreover, we cannot use
Update_Element because it is not defined for
sets at all. Remember that this is because it might make things out of
order; that wouldn't be a problem in this case because we don't want
to change the part number and our ordering is just by the part number.
A better approach is
to use the part number as a key. We define
type Part_Key is new Integer;
function Part_No(P: Part_Type) return Part_Key is
begin
return Part_Key(P.Part_Number);
end Part_No;
and then
package Party is new Generic_Keys(Key_Type => Part_Key, Key => Part_No);
use Party;
Note that we do not have to define "<"
on the type Part_Key at all because it already
exists since Part_Key is an integer type.
And the instantiation uses it by default.
And now we can rewrite
the Request procedure as follows
procedure Request(
Part: in Part_Key; OK: out Boolean;
Year: out Integer; Shelf: out Character) is
C: Cursor;
E: Part_Type;
begin
C := Find(The_Store, Part);
if C = No_Element then
OK := False; return; -- no such item
end if;
E := Element(C);
Year := E.Year; Shelf := E.Shelf;
if E.Stock = 0 then
OK := False; return; -- out of stock
end if;
-- we are now going to update the stock level
declare
procedure Do_It(E: in out Part_Type) is
begin
E.Stock := E.Stock – 1;
end Do_It;
begin
Update_Element_Preserving_Key(The_Store, C, Do_It'Access);
end;
OK := True;
end Request;
This seems hard work but has a number of advantages.
The first is that the call of Find is more
natural and only involves the part number (the key) – note that
this is a call of the function Find in the
instantiation of Generic_Keys and takes just
the part number. And the other is that the update only involves the component
being changed. We mentioned earlier that there was no Update_Element
for sets because of the danger of creating a value that was in the wrong
place. In the case of the richly named Update_Element_Preserving_Key
it also checks to ensure that the element is indeed still in the correct
place (by checking that the key is still the same); if it isn't it removes
the element and raises Program_Error.
But the user is warned to take care when using the
package Generic_Keys. It is absolutely vital
that the relational operation and the function (Part_No)
used to instantiate Generic_Keys are compatible
with the ordering used to instantiate the parent package Containers.Ordered_Sets
itself. If this is not the case then the sky might fall in.
Incidentally, the procedure
for checking the stock which previously used the maps package now becomes
procedure Check_Stock(Low: in Integer) is
procedure Check_It(C: in Cursor) is
begin
if Element(C).Stock < Low then
-- print a message perhaps
Put("Low stock of part ");
Put_Line(Element(C).Part_Number); -- changed
end if;
end Check_It;
begin
The_Store.Iterate(Check_It'Access);
end Check_Stock;
The only change is
that the call of Key in
Put_Line(Key(C).Part_Number);
when using the maps
package has been replaced by Element. A minor
point is that we could avoid calling Element
twice by declaring a constant E in Check_It
thus
E: constant Part_Type := Element(C);
and then writing E.Stock <
Low and calling Put_Line with E.Part_Number.
A more important point is that if we have instantiated
the Generic_Keys inner package as illustrated
above then we can leave Check_It unchanged
to call Key. But it is important to realise
that we are then calling the function Key
internal to the instantiation of Generic_Keys
(flippantly called Party) and not that from
the instantiation of the parent ordered sets package (Store_Sets)
because that has no such function. This illustrates the close affinity
between the sets and maps packages.
And finally there is a hashed sets package which
has strong similarities to both the ordered sets package and the hashed
maps package. We can introduce this much as for hashed maps by giving
the differences between the two sets packages, the extra facilities in
each and the impact on the part number example.
The specification of the hashed sets package starts
generic
type Element_Type is private;
with function Hash(Element: Element_Type) return Hash_Type;
with function Equivalent_Elements(Left, Right: Element_Type)
return Boolean;
with function "=" (Left, Right: Element_Type) return
Boolean is <>;
package Ada.Containers.Hashed_Sets is
pragma Preelaborate(Hashed_Sets);
The differences from the ordered sets package are
that there is an extra generic parameter Hash
and the ordering parameter "<"
has been replaced by the function Equivalent_Elements.
So if we have
function Equivalent_Parts(Left, Right: Part_Type) return Boolean is
begin
return Left.Part_Number = Right.Part_Number;
end Equivalent_Parts;
function Part_Hash(P: Part_Type) return Hash_Type is
M31: constant := 2**31–1; -- a nice Mersenne prime
begin
return Hash_Type(P.Part_Number) * M31;
end Part_Hash;
(which are very similar
to the hashed map example – the only changes are to the parameter
type name) then we can instantiate the hashed sets package as follows
package Store_Sets is new Hashed_Sets(
Element_Type => Part_Type,
Hash => Part_Hash,
Equivalent_Elements => Equivalent_Parts);
The_Store: Store_Sets.Set;
and then the rest of our example will be exactly
as before. It is thus easy to convert from an ordered set to a hashed
set and vice versa provided of course that we only use the facilities
common to both.
It should also be mentioned
that the inner package Generic_Keys for hashed
sets has the following formal parameters
generic
type Key_Type(<>) is private;
with function Key(Element: Element_Type) return Key_Type
with function Hash(Key: Key_Type) return Hash_Type;
with function Equivalent_Keys(Left, Right: Key_Type) return Boolean;
package Generic_Keys is
The differences from that for ordered sets are the
addition of the function Hash and the replacement
of the comparison operator "<"
by Equivalent_Keys.
(Incidentally the package Generic_Keys
for ordered sets also exports a function Equivalent_Keys
for uniformity with the hashed sets package.)
Although our example
itself is unchanged we do have to change the instantiation of Generic_Keys
thus
type Part_Key is new Integer;
function Part_No(P: Part_Type) return Part_Key is
begin
return Part_Key(P.Part_Number);
end Part_No;
function Part_Hash(P: Part_Key) return Hash_Type is
M31: constant := 2**31–1; -- a nice Mersenne prime
begin
return Hash_Type(P) * M31;
end Part_Hash;
function Equivalent_Parts(Left: Right: Part_Key) return Boolean is
begin
return Left = Right;
end Equivalent_Parts;
and then
package Party is new Generic_Key(
Key_Type => Part_Key,
Key => Part_No;
Hash => Part_Hash
Equivalent_Keys => Equivalent_Parts);
use Party;
The hash function is similar to that used with hashed
maps. The type Part_Key and function Part_No
are the same as for ordered sets. We don't really need to declare the
function Equivalent_Parts since we could use
"=" as the actual parameter for
Equivalent_Keys.
We will finish this discussion of sets by briefly
considering the additional facilities in the two sets packages (and their
inner generic keys packages) just as we did for the two maps packages
(the discussion is almost identical).
The ordered sets package
has the following additional subprograms
procedure Delete_First(Container: in out Set);
procedure Delete_Last(Container: in out Set);
function First_Element(Container: Set) return Element_Type;
function Last_Element(Container: Set) return Element_Type;
function Previous(Position: Cursor) return Cursor;
procedure Previous(Position: in out Cursor);
function Floor(Container: Set; Item: Element_Type) return Cursor;
function Ceiling(Container: Set; Item: Element_Type) return Cursor;
function "<" (Left, Right: Cursor) return Boolean;
function ">" (Left, Right: Cursor) return Boolean;
function "<" (Left: Cursor; Right: Element_Type) return Boolean;
function ">" (Left: Cursor; Right: Element_Type) return Boolean;
function "<" (Left: Element_Type; Right: Cursor) return Boolean;
function ">" (Left: Element_Type; Right: Cursor) return Boolean;
procedure Reverse_Iterate(
Container: in Set;
Process: not null access procedure (Position: in Cursor));
These are again largely self-evident. The functions
Floor and Ceiling
are similar to those for ordered maps – Floor
searches for the last element which is not greater than Item
and Ceiling searches for the first element
which is not less than Item – they return
No_Element if there is not one.
The functions "<"
and ">" are very important for
ordered sets. The first is equivalent to
function "<" (Left, Right: Cursor) return Boolean is
begin
return Element(Left) < Element(Right);
end "<";
There is a general philosophy that the container
packages should work efficiently even if the elements themselves are
very large – perhaps even other containers. We should therefore
avoid copying elements. (Passing them as parameters is of course no problem
since they will be passed by reference if they are large structures.)
So in this case the built-in comparison is valuable because it can avoid
the copying which would occur if we wrote the function ourselves with
the explicit internal calls of the function Element.
On the other hand, there is a general expectation
that keys will be small and so there is no corresponding problem with
copying keys. Thus such built-in functions are less important for maps
than sets but they are provided for maps for uniformity.
The following are additional
in the package Generic_Keys for ordered sets
function Equivalent_Keys(Left, Right: Key_Type) return Boolean;
This corresponds to the formal generic parameter
of the same name in the package Generic_Keys
for hashed sets as mentioned earlier.
function Floor(Container: Set; Key: Key_Type) return Cursor;
function Ceiling(Container: Set; Key: Key_Type) return Cursor;
These are much as the corresponding functions in
the parent package except that they use the formal parameter "<"
of Generic_Keys for the search.
Hashed sets, like hashed
maps also have the facility to specify a capacity as for the vectors
package. Thus we have
procedure Reserve_Capacity(Container: in out Set; Capacity: in Count_Type);
function Capacity(Container: Set) return Count_Type;
The behaviour is much as for vectors and hashed maps.
We don't have to set the capacity ourselves since it will be automatically
extended as necessary but it might significantly improve performance
to do so. Note again that Length(S) cannot
exceed Capacity(S) but might be much less.
The other additional
subprograms for hashed sets are
function Equivalent_Elements(Left, Right: Cursor) return Boolean;
function Equivalent_Elements(Left: Cursor; Right: Element_Type) return Boolean;
function Equivalent_Elements(Left: Element_Type; Right: Cursor) return Boolean;
Again, these are very
important for sets. The first is equivalent to
function Equivalent_Elements(Left, Right: Cursor) return Boolean is
begin
return Equivalent_Elements(Element(Left), Element(Right));
end Equivalent_Elements;
and once more we see that the built-in functions
can avoid the copying of the type Element
that would occur if we wrote the functions ourselves.
© 2005, 2006 John Barnes Informatics.
Sponsored in part by: