Previous section: Operations on Vectors

Dylan reference manual -- The Iteration Protocol

The Iteration Protocol

All collections implement an iteration protocol that allows iteration to be specified abstractly. Many higher level operations on collections can be defined in terms of only the iteration protocol.

The iteration protocol centers on the notion of a "state" object for an iteration. Each collection class chooses its own most appropriate representation for an iteration state, and only the functions of the iteration protocol are affected by this choice.

Use of the iteration protocol is based on the assumption that the collection over which iteration occurs remains static for the duration of the iteration. That is, arbitrary changes to a mutable collection while an iteration is in progress may cause the iteration to produce incorrect results.

In general, two or more iterations over the same collection are not guaranteed to produce the same values in the same order, even if the collection is unaltered (but see the next section on Iteration Stability).

The iteration protocol consists of the following two functions:

forward-iteration-protocol collection	[Generic Function]
=> initial-state :: <object>

limit :: <object>

next-state :: <function>

finished-state? :: <function>

current-key :: <function>

current-element :: <function>

current-element-setter :: <function>

copy-state :: <function>

The eight values returned by this function are used to implement iteration over the collection argument. Only the collection argument this function was called with may be used as the collection argument to functions returned by this function. Only the initial-state object and state objects returned by the next-state and copy-state functions may be used as the state argument to functions returned by this function. Only the limit object may be used as the limit argument to the finished-state? function.

initial-state

The initial iteration state object.

limit

A value that may be used by the finished-state? function to determine whether the iteration has been completed.

next-state collection state => new-state

This function "steps" the iteration by producing a new state from the associated collection and state. The next-state function may or may not modify the state argument; it is an error to use a state value after it has been passed to the associated next-state function. The copy-state function provides a mechanism for saving a particular state in an iteration for later resumption.

finished-state? collection state limit => boolean

This function returns #t if the iteration of the collection has been completed, i.e., there are no other elements of the collection to consider. It returns #f otherwise. It is an error to use a finished state in a call to the associated next-state, current-element, current-key or current-element-setter functions.

current-key collection state => key

This function returns the unique key associated with state in the collection. If the current-key function were called once with each state value produced during an iteration over a collection, the resulting sequence of values would contain every key from the collection exactly once.

current-element collection state => element

This function returns the element of collection "currently indicated" by state.

current-element-setter value collection state => value

This function sets the element of collection indicated by state to value and returns value. If the collection is not mutable, i.e. is not a generalized instance of <mutable-collection>, this function instead signals an error.

copy-state collection state => new-state

This function returns a state which represents the same point in the iteration over collection as is represented by state.

backward-iteration-protocol collection	[Generic Function]
=> final-state :: <object>

limit :: <object>

previous-state :: <function>

finished-state? :: <function>

current-key :: <function>

current-element :: <function>

current-element-setter :: <function>

copy-state :: <function>

Some collection classes that are stable under iteration additionally support the ability to iterate in the reverse of the natural order, by providing a method on the generic function backward-iteration-protocol. The eight values returned by this function are analogous to the corresponding values returned by forward-iteration-protocol, except that the final-state object indicates the last element in the collection and the previous-state returns a state for the preceding element.

An example of the use of the iteration protocol is the following definition of a single-argument version of the do function:

define method do1 (f :: <function>, c :: <collection>)
  let (init, limit, next, end?, key, elt) = 
                     forward-iteration-protocol(c);
  for (state = init then next(c, state),
	    until end?(c, state, limit))
    f(elt(c, state));
  end for;
end method do1;

The Table Protocol

Tables, also known as hashtables, map arbitrary keys to elements. Table keys may be any object, including complex objects such as strings. Each table has an associated equivalence predicate which is used to compare keys. The table maps keys that are equivalent under the predicate to the same table element.

An equivalence predicate is a boolean function of two arguments that returns true if and only if the arguments are "the same" according to some specified criteria. For a function to be used as an equivalence predicate, it must be reflexive, commutative, and transitive. That is, for a function F and any arguments X, Y, and Z in the domain of F, it must be true that (1) F(X,X) is true, (2) F(X,Y) is true if and only if F(Y,X) is true, and (3) if both F(X,Y) and F(Y,Z) are true then F(X,Z) is true.

An equivalence class (for an equivalence predicate) is a set of objects, or potential objects, that are all the same under the specified equivalence predicate and different from all objects not in the class. (This "class" is not meant to refer to Dylan classes as in define class.)

An object is said to be visibly modified with respect to an equivalence predicate if the modification changes the equivalence class of the object. The modifications that are visible to an equivalence predicate are determined by the definition of the predicate.

If an object X is a key in a table T and is visibly modified with regard to the test function of T, then the consequences are unspecified if any of the following objects are used as a key in any subsequent operations on T:

Each table also has an associated hash function, which is a function that relates table keys and table elements by computing hash codes. A hash code is a conceptual object consisting of a hash id and its associated hash state. (It is not an actual Dylan object.) A hash id is an integer encoding of an object. A hash state is an object of implementation-dependent type which is associated with a particular hash id and can be used by the implementation to determine whether the hash id has been invalidated. All hash functions have one argument, a key, and return two values, a hash id and a hash state, which together represent the hash code.

Each hash function is associated with a specific equivalence predicate, and must obey the following constraints:

In addition, a hash function should have the property that the hash ids computed by it are well distributed over the range of possible values. That is, it should be unlikely that two randomly chosen equivalence classes have the same valid hash id.
table-protocol table	[Generic Function]
  =>	test-function  hash-function
The class <table> provides an implementation of the Iteration Protocol, using the function table-protocol. Every concrete subclass of <table> must provide or inherit a method for table-protocol. table-protocol returns the following functions:

test-function key1 key2 => boolean

The test-function is used to compare keys. It returns true if the keys are members of the same equivalence class according to the table's equivalence predicate.

hash-function key => id state

The hash-function computes the hash code of the key, using the hash function associated with the table's equivalence predicate. The hash code is returned as two values, id (an integer) and state (a hash state).

Users can choose either to define a new subclass of <table> for every test function they use, or to define a generic subclass of <table> which holds its test function and hash function in slots, according to their particular needs.

table-protocol object-table	[G. F. Method]
  =>	test-function  hash-function
The method for <object-table> returns == as the test-function and object-hash (described below) as the hash-function. It could be written as
define method table-protocol (table :: <object-table>)
      => test-function :: <function>, hash-function :: <function>;
  values(\==, object-hash);
end method table-protocol;
merge-hash-codes id1 state1 id2 state2  #key ordered 	[Function]
=> merged-id merged-state

This function computes a new hash code by merging the argument hash codes in some implementation dependent way. id1, id2, and merged-id are all integers. state1, state2, and merged-state are all hash states. ordered is a boolean and determines whether the algorithm used to perform the merge is permitted to be order dependent. If #f, which is the default, then the merged result must be independent of the order in which the argument pairs are provided. If true, then the order of the argument pairs matters because the algorithm used need not be either commutative or associative in this case. Providing a true value for ordered is recommended when doing so will not cause the hash function to violate the second constraint on hash functions, because it may result in a better distribution of hash ids.

$permanent-hash-state	[Constant]
An implementation-dependent hash state that indicates that the associated hash id is always valid, and does not depend on any mutable property of the object that can be changed without a visible modification to the object.
object-hash object  => id state	[Function]
This is the hash function for the equivalence predicate ==. It is made available as a tool for writing hash functions in which the object identity of some component of a key is to be used in computing the hash code. It returns a hash id (an integer) and associated hash state for the object, computed in some implementation dependent manner. The values returned by object-hash when called repeatedly on the same object might not be the same for each call. If the id value changes then the state value will also change.

Next section: Iteration Stability and Natural Order