             DATA AREAS RELEVANT TO THE ABSTRACT MACHINE
             ===========================================

The data areas that are used by the abstract machine fall into the
following categories:

  1. Registers

  2. Module space for each chain of loaded modules which is formed by a
     top-level one and those recursively imported.

  3. Global Stack or Heap

  4. Local Stack

  5. Trail

  6. PDL stack


We discuss the items in each category in detail below.

                             I. REGISTERS
                             ============
The registers in the machine are presented below, together with any
relevant explanations:


  1. Data registers:
     A predetermined number of temporary variables or argument
     registers, referred to as Ai and Xi.

  2. Those relevant to term decomposition:
     NUMARGS, NUMABS, RIGFLAG, CONSFLAG, HEAD, ARGVEC registers.
     These registers are set during the head normalization process,
     denoting the number of arguments, the binder length,
     whether the term is rigid, whether the body is list cons,
     a reference to the head and a reference to the argument vector
     of a head normal form.

  3. Those relevant to occurs-check:
     OCFLAG: whether occurs-check is needed;
     ADJ:    the universe count of the variable being bound via a
             get_structure (get_list) instruction;
     VAR_BEING_BOUND:
             a reference to the top-level structure on the heap to which the
             variable being bound via a get_structure (get_list) instruction;
     TVAR_BEING_BOUND:
             a reference to the top-level structure on the heap to which the
             type variable being bound via a  get_type_structure
             (get_type_arrow) instruction.

  4. Mode flags:  WRITE and TWRITE
     These are used to indicate whether the write mode and type write mode
     are on.

  5. Structure arguments: S and TS registers.
     These are registers used in the course of unifying term or type structures.
     The S register is also used in the context of writing a structure or a
     list cons on heap via a get_structure (get_list) instruction.
     These registers should always refer to the heap.

  6. Universe count: UC register.
     This stores the Universe Count, needed for setting tag values for
     constants and variables in the context of tagged unification.

  7. Global disagreement list: LL register.
     This determines the beginning of the current disagreement set,
     represented as a doubly linked list.
     LL stands for Live List.

  8. BND:
     This is a flag denoting whether binding occurs during compiled
     unification or the interpretive higher-order pattern unification
     process. The global disagreement list is examined only
     when it is not empty and this flag is on.

  9. The WAM registers:
        P:    program pointer
        CP:   program continuation pointer
     heap
        H:    current heap top
        HB:   the value of H at the time of the latest choice point
     stack
        E:    current environment record
        B:    current choice point
        B0:   cut register
     trail
        TR:   current trail top

 10. PDL registers: PDL, PDLBOT, TYPES_PDLBOT

 11. Those relevant to implication and import:
     I:   This register records the current most recent implication/import
          point and thus determines the current program context.
     CI:  This stores a pointer to the impl/import point record that
          the clause currently being used comes from.
     CE:  This is relevant only if the current clause comes from an implication
          point record (as opposed to an import point) and, in this case,
          stores a pointer to the environment in whose context the impl point
          record was created. CE is intended to stand for Closure Environment
          or Clause Environment.

 12. TOS register. This indicates the top of stack. It is useful
     to have an aggregate of the B and I register values.
     With respect to the E register, environment trimming may require a
     calculation at the point of need.


                          II. MODULE SPACE
                          ================

A. Organization of module space for an imported chain of modules
----------------------------------------------------------------

The space for an imported chain of modules has the following structure:

                -----------------------------------------------
                | run-time kind symbol table                  |
                -----------------------------------------------
                | type skeleton table                         |
                -----------------------------------------------
                | run-time constant symbol table              |
                -----------------------------------------------
                | type skeleton area                          |
                -----------------------------------------------
                | string area                                 |
                -----------------------------------------------
code areas    { | space for the top-level module              |
including:    { -----------------------------------------------
add code table{ | space for the 1st imported module           |
(import/impl) { -----------------------------------------------
      +       {                   ...
search tables { -----------------------------------------------
      +       { | space for the nth imported module           |
index tables  { -----------------------------------------------
      +
instructions



The details of the different categories are described below.

B. Run-time symbol tables
-------------------------

  1. Kind table

   The components of each entry in this table are the following.

   a. a pointer to the name of the symbol

   b. the arity of the type constructor


  2. Type skeleton table

   A reference to the type skeleton area where the encoding of the relevant
   type skeleton locates.
   An alternative is to directly encode the atomic head of the type skeleton
   in the entries of this table. By doing this, one level of indirection can be
   avoided when accessing a type skeleton. However, this issue seems unimportant
   to the new implementation since type skeletons are never used in core
   computation. Therefore the decision here really depends which way of
   organization is easier to the loader.


  3. Constant table

   The components of each entry in this table are the following.

   a. a pointer to the name of the symbol

   b. the number of type variables whose instantiations should be carried along
      with the constant occurrences (the size of type environment)

   c. an index to the type skeleton table
      (Similar as in 2, the atomic head of the type skeleton can be used
       in this field as an alternative.)

   d. the neededness information of type arguments for a predicate usage of a
      predicate constant (for the transformation between interpretive and
      complied modes of unification)

   e. universe index

   f. precedence

   g. fixity


  4. Type skeleton area

   Contiguous space containing the encoding of type skeletons present
   in a module (chain), to which the entries of the type skeleton table
   should refer.
   The size of this area is decided at compile/link time.

  5. String area

   Contiguous space containing strings present in a module (chain),
   including names of constants and type constructors and those in arguments
   of clauses.
   The size of this area is decided at compile/link time.

C. Code Areas for Each module
-----------------------------

  1. Add code tables

   (1) Import table

   Such a table is used when an imported module is to be added into the
   program context through a push_import_point instruction, or when a top-level
   module in an imported chain is to be added into the initial program context.
   (Note that the corresponding part in bytecode for creating this table may
    need to be properly adjusted to reflect the code accumulated from other
    modules in the linking phase.)

   We assume the following items of information are contained in contiguous
   locations with each occupying the indicated amount of space:

     a. Field containing a count of the number of segments of code
        (1 word)                                                --- NCSEG

     b. Count of number of local constants
        (1 word)                                                --- NLC

     c. A count of the number of predicates whose previous definitions
        might be extended by the code in the module.
        (1 word)                                                --- LTS

     d. Pointer to a function to be invoked for finding code relative to
        the module. (hash/seq search function)
        (1 word)                                                --- FC

     e. Number of entries in the table for the code defined in the module.
        (1 word)                                                --- PSTS

     f. References to run-time constant tables for those representing
        the ``names'' of predicates of the kind described in c.
        (sequence of words of length determined by 1)           --- LT

     g. References to run-time constant tables for local constants.
        These are needed for initializing the universe index of local
        variables.
        (A sequence of words of length determined by 2)         --- LCT
        (If the count in 2 is zero, this section is empty.)

     h. A table of suitable form yielding the address of code. The entries
        will depend on the form of the search function. (hash/seq)
                                                                --- PST
The proposed arrangement of an import table:

                  --------------------------------------
                  |  number of code segments (NCSEG)   |
                  --------------------------------------
                  |  number of local consts m (NLC)    |
                  --------------------------------------
                  |  link table size n (LTS)           |
                  |------------------------------------|
                  |  find code function pointer (FC)   |
                  |------------------------------------|
                  |  predicate search table size (PSTS)|
                - |------------------------------------|
 link table    {  |  constant symbol table index 1     |
 (LT)          {  |------------------------------------|
               {  |  ...                               |
               {  |------------------------------------|
               {  |  constant symbol table index n     |
                - |------------------------------------|
 local const   {  |  local const symbol tab index 1    |
 table (LCT)   {  |------------------------------------|
               {  |  ...                               |
               {  |------------------------------------|
               {  |  local const symbol tab index m    |
                - |------------------------------------|
 pred search   {  |  ...                               |
 table (PST)   {  |                                    |
                - --------------------------------------


   (2) Implication goal tables

   Such a table is used when an implication goal is to be solved through
   a push_impl_point instruction.

   We assume the following items of information are contained in contiguous
   locations with each occupying the indicated amount of space:

      a. link table size:
         A count of the number of predicates whose previous definitions
         might be extended in the antecedent of the implication goal.
         (1 word)                                                --- LTS

      b. Pointer to a function to be invoked for finding code relative to
         the antecedent of the implication. (hash/seq search function)
         (1 word)                                                --- FC

      c. predicate search table size:
         Number of entries in the table for the code defined in the antecedent
         of implication.
         (1 word)                                                --- PSTS

      d. link table:
         References to the (run-time) constant symbol table entries
         corresponding to the predicate constants of the kind described in 1.
         (sequence of words of length determined by 1)
                                                                 --- LT

      e. predicate search table:
         A table of suitable form yielding the address of code. The entries
         will depend on the form of the search function (hash/seq).
                                                                 --- PST

The proposed arrangement of an implication table:

                  --------------------------------------
                  |  link table size n (LTS)           |
                  |------------------------------------|
                  |  find code function pointer (FC)   |
                  |------------------------------------|
                  |  predicate search table size (PSTS)|
                - |------------------------------------|
 link table    {  |  constant symbol table index 1     |
 (LT)          {  |------------------------------------|
               {  |  ...                               |
               {  |------------------------------------|
               {  |  constant symbol table index n     |
                - |------------------------------------|
 pred search   {  |  ...                               |
 table (PST)   {  |                                    |
                - --------------------------------------

 2. Search tables (hash/seq)

   (1). A sequential search table should be allocated in contiguous locations.
        Each entry takes two words with the first as a reference to the
        (run-time) constant table entry for a predicate constant,
        and the second is the absolute address of the code location.

   (2). A hash table should be allocated in contiguous locations with the
        following structure.

            ------------------------------------
            | number of entries (n) (1 word)   |
            ------------------------------------
            | entry 1               (1 word)   |
            ------------------------------------
            |  ...                             |
            ------------------------------------
            | entry n               (1 word)   |
            ------------------------------------
            | hash chains for relevant entries |
            |  ...                             |
            ------------------------------------

    Each entry in this table could be either NULL or a pointer referring to
    a hash chain in the following space.
    Each hash chain is represented as a linked list with 3-word nodes:
    the first as a reference to a constant table entry for the predicate;
    the second as an absolute address of the code location;
    the third as a pointer referring to the next element in the list.

 3. Indexing tables

  (1). constant indexing table (hash)
       The structure of this table is the same as that described in 2.(2)

  (2). bound variable indexing table (branch table)
       The structure of this table is:

              -------------------------------
              | table size (n)   (1 word )  |
              -------------------------------
              | entry 1          (2 words)  |
              -------------------------------
              |   ...                       |
              -------------------------------
              | entry n          (2 words)  |
              -------------------------------

    where the first word of each entry is a de Bruijn index, and
    the second is the absolute address of the code location.



 4. Instruction area

                          III. HEAP
                          =========

The heap in our abstract machine is a space that is managed in much
the same way as in the WAM: the portion of it that is used grows as
procedures are invoked and space is reclaimed only upon *failure* of
the procedure.

The difference between the WAM and our machine is that the heap is
used in our case to a larger collection of item kinds. In addition to
storing terms, the heap is also used to represent types and also
disagreement sets. The atomic objects of each type occupy different
amounts of space. However, all of them will occupy integral multiples
of words. The particular requirements for term, type and disagreement pairs
representations become clear from the format of data cells spelled out
in dataformats.txt.

The creation of terms in the heap takes place in part through the
instructions produced by compiling the setting up of argument
registers. This is similar to the WAM. However, unlike in the WAM, the
interpretive (higher-order pattern) unification may require the creation
of new terms as substitutions. Also, new structures may have to be written
when a term is normalized.

                           IV. LOCAL STACK
                           ===============


The local stack holds three kinds of records:

  1. Import/implication point records

  2. Choice point records

  3. Environment records

Each of these are discussed in turn below.

A. IMPLICATION POINT/IMPORT POINT RECORDS
-----------------------------------------

Implication point record
------------------------

Information should be recorded in an implication record:

 a. Next clause table
    Each entry in this table corresponds to a predicate whose definition
    might be extended by the clauses in the antecedent of the implication
    goal and has the form of

       < code address of the "next clause" for the ith predicate,
         impl record address in which the "next clause" is found  >

    In the situation that there is no "next clause" for a particular predicate,
    the first field in such an entry is the address of the built-in fail
    predicate.

 b. A pointer to the environment for the clause that led to the
    creation of the impl point (called Clause Environment) for the purpose of
    initializing global variables.

 c. A tag indicating whether the record is implication, import with local
    variables or import without local variables.

 d. Start address of the hash/seq search table for predicates defined in the
    antecedent of the implication goal, which is determined in the compilation
    phase and is set up in the loading phase.

 e. A pointer to the hash/seq search function to be used to find code
    from the table in c.

 f. A pointer to the previous implication point record (I)

 g. The number of predicates defined in the antecedent of the implication goal.

The proposed arrangement of an implication point record:

                    ________________________________    <-    stack bottom
                    |                              |
                    |                              |
                    |                              |
                    |                              |
                 -  |------------------------------| <-    start of record
                {   | imp pt for nth next cl       |
                {   |------------------------------|
   the next     {   | code ptr for nth nxt cl      |
   clause       {   |------------------------------|
   table part   {   |           .                  |
   of the       {   |           .                  |
   implication  {   |           .                  |
   point record {   |------------------------------|
   (NCL tab)    {   | imp pt for 1st next cl       |
                {   |------------------------------|
                {   | code ptr for 1st nxt cl      |
                 -  |------------------------------|
                    | clause environment (CE)      | <-   I register points here
                    |------------------------------|
                    | tag field                    |
                    |------------------------------|
                    | start add. for pred tab (PST)|
                    |------------------------------|
                    | code finding fun ptr. (FC)   |
                    |------------------------------|
                    | previous impl pt rec (PIP)   |
                    |------------------------------|
                    | number of preds defined(PSTS)|
                    |------------------------------| <-   end of impl pt rec
                    |           |                  |
                    |           |                  |
                    |           |                  |
                    |           V                  |
                    |                              |


Import point Record
--------------------

An import point record has most of its structure in common with that of an
implication record.
The differences are in the following two aspects.
 1). At the beginning of the record, a table recording backchained counters
     and the most recent choice point value for the "import segments" in
     this module has to be recorded. Correspondingly, the number of the
     predicates defined in the module (for the size of this table) has to
     be remembered.
 2). There is no need to store the current environment value, and so
     the "CE field" in an implication point record is replaced by the
     number of predicates defined in the module.
 3). The tag field contains relevant tags indicating this record is an import
     point with local constants or that without local constants.

A pictorial description of an import point record is as following.

                    ________________________________ <- stack bottom
                    |                              |
                    |                              |
                    |                              |
                    |                              |
                 -  |------------------------------| <- start of record
                {   | MRCP for nth code segmt      |
                {   |------------------------------|
   the          {   | count for nth code seg       |
   backchained  {   |------------------------------|
   count and    {   |           .                  |
   most recent  {   |           .                  |
   choice point {   |           .                  |
   record       {   |------------------------------|
   for each seg {   | MRCP for 1st code segmt      |
   of clauses in{   |------------------------------|
   a module     {   | count for 1st code seg       |
   (BC vector)   -  |------------------------------|
                {   | imp pt for nth next cl       |
                {   |------------------------------|
   the next     {   | code ptr for nth nxt cl      |
   clause       {   |------------------------------|
   table part   {   |           .                  |
   of the       {   |           .                  |
   implication  {   |           .                  |
   point record {   |------------------------------|
   (NCL tab)    {   | imp pt for 1st next cl       |
                {   |------------------------------|
                {   | code ptr for 1st nxt cl      |
                 -  |------------------------------|
                    | no of preds (NPred)          | <-   I register points here
                    |------------------------------|
                    | tag field                    |
                    |------------------------------|
                    | start add. for pred tab (PST)|
                    |------------------------------|
                    | code finding fun ptr.   (FC) |
                    |------------------------------|
                    | previous impl pt rec    (PIP)|
                    |------------------------------|
                    | number of preds defined(PSTS)|
                    |------------------------------| <-    end of import pt rec
                    |           |                  |
                    |           |                  |
                    |           |                  |
                    |           V                  |
                    |                              |


B. CHOICE POINT RECORDS
-----------------------

A choice point will contain the following items.

  a. A sequence of argument registers                  (Am ... A1)

  b. Universe count                                    (UC)

  c. Continuation environment pointer                  (E)

  d. Continuation code pointer                         (CP)

  e. Code context (given by impl/import point record)  (I)

  f. Impl point rec for current set of clauses         (CI)

  g. Previous choice point                             (B)

  h. Cut choice point                                  (B0)

  i. Live list                                         (LL)

  j. Trail pointer                                     (TR)

  k. Next clause address                               (NCL)

  l. Heap pointer                                      (H)


The CE register is not stored in a choice point. The reason is that this
value may or may not be relevant depending on whether or not the current
clause is obtained from an implication point or an import point.
Backtracking action will determine this and then restore CE register if
needed.


C. ENVIRONMENT RECORDS
----------------------

The structure of these is for the most part like that of environment
records in the WAM. The only real difference is that there two
additional fields in the record to hold the values of the UC and CI
registers at the time the environment record was created. These values
are referred to as the UCE and CIE fields of the record.


The components of the environment record are, then, the following:

   a. closure implication point                    (CIE field: CI)

   b. continuation environment                     (CE  field: E)

   c. universe count of environment                (UCE field: UC)

   d. continuation code pointer                    (CP  field: CP)

   e. permanent variables                          (Y1 ... Yn)


The environment pointer E should point to the UCE field of an environment
record, so that the indexes of permanent variables, which have two words in
size, can be used as offsets in reference.
There are three sorts of data items that could appear in the argument
cells of an environment record: terms, types and the value of the B0 register.


                              V. TRAIL
                              ========

Upon backtracking, different kinds of actions may have to be undone:

  (i) bindings made for type variables,

 (ii) bindings made for term variables,

(iii) (destructive) changes made to terms during normalization and,

 (iv) trailing of backchained counts and most recent choice points for
      segments of imported code.

Different kinds of information is needed for undoing each of these
effects. The particular categories of trail items are the following:

(a) type variable binding. The information needs to be kept is an address
    that must be used in an "unbinding" operation similar to that in the WAM:
    i.e. set the contents of the address to be a reference to itself.

(b) term value binding. The information needs to be kept consists of
    first the address of the term and second, the term in that
    address that is overwritten in (ii) and (iii). Note that in the
    situation of (iii), the trailed item has a size of 3 or 4 words.
    Upon "unbinding", the trailed cell(s) will be copied back to the locations
    started from the recorded address.

    [An alternative of trailing terms in the situation of (iii) is to
     only trail the overwritten part of the term which can be controlled
     to always have an atomic size. The advantage of doing so is obvious:
     less data movement needs to be performed upon trailing and unwinding.
     However, the "partial" information present in the trail breaks data
     integrity, and poses difficulties to adopting garbage collection.]

(c) code resurrection. The information needs to be kept in this case is
    an address followed by a count and a MRCP. Upon "unbinding", the
    could and MRCP are reset to the locations started from the stored address.

We use one trail to store all the kinds of information; this
approach reduces the amount of information that needs to be stored in
choice/branch point records. Further, for the seek of portability, C structure
is used to record the trail items. Specifically, the first byte of the
leading word of a trail item is used as a tag to differentiate between
different categories. Note the size of a trail item varies on their categories.

                              VI. PUSH-DOWN LIST
                              ==================
Two sorts of items appear on the push-down list: references to terms and
to types respectively. Since the unification of types is always immediately
solved after being pushed onto the PDL, there is no need to distinguish
type references with term references at the PDL item level.
