SOLVER ARCHITECTURE
===================

The core of Yices is a DPLL-based sat solver (implemented in
smt_core.h and sm_core.c) with hooks and features to interface with a
theory solver object. The theory solver must implement several
functions that the core uses during the search.  Conversely, the core
provides several functions that can be used by the theory solver
during the search to perform theory propagation or create lemmas and
atoms on the fly.


SMT CORE
--------

The core is a data structure of type smt_core_t, defined in
smt_core.h. It implements functions for creating Boolean variables,
creating clauses, and mapping variables to atoms. It also implements
DPLL search functions for solving an SMT problem.

The types of Boolean variables and literals are bvar_t and
literal_t, respectively. These are just aliases for int32_t, so both
Boolean variables and literals are represented by signed 32bit integers.
For a Boolean variable x, there are two literals pos_lit(x) = 2*x,
and neg_lit(x) = 2*x+1


The following functions are defined in smt_core.c for creating variables
and clauses, and for attaching atoms to Boolean variables. All are explained
in smt_core.h

   bvar_t create_boolean_variable(smt_solver_t *s)
   void add_boolean_variables(smt_solver_t *s, uint32_t n)
   void attach_atom_to_bvar(smt_solver_t *s, bvar_t x, void *atom)

As far as the smt-core is concerned, atoms are abstract, opaque
objects that are manipulated as (void *) pointers. If x is a Boolean
variable attached to an atom atm, then the theory solver is notified
whenever x is is set to true or false during the search.


For creating clauses, the following functions can be used. They can be
called before or during the search. This enables a theory solver to
dynamically create lemmas (as clauses) and add them to the solver.

   void add_clause(smt_solver_t *s, uint32_t n, literal_t *a)
   void add_empty_clause(smt_solver_t *s)
   void add_unit_clause(smt_solver_t *s, literal_t l)
   void add_binary_clause(smt_solver_t *s, literal_t l1, literal_t l2)
   void add_ternary_clause(smt_solver_t *s, literal_t l1, literal_t l2, literal_t l3)


When a theory solver detects a conflict, it must call the following function,
that will trigger backtracking and conflict resolution.

   void record_theory_conflict(smt_core_t *s, literal_t *a)

a must be an array of literals l_1, ..., l_n that are all false in the current
Boolean assignment and such that (or l_1 ... l_n) is a theorem (in the solver's theory).
The array must be terminated by null_literal (i.e., -1).


For theory propagation, a solver can call the following function (also defined in
smt_core.c and smt_core.h)

   void propagate_literal(smt_core_t *s, literal_t l, void *expl)

This asserts that l is implied by the theory. The parameter expl must
be a pointer to an "explanation object", which can be any thing
the solver needs to build an explanation for l. The core treats expl as
an abstract (void*) pointer.

If l is involved in a conflict, the smt_core may request that the
explanation be expanded.  At this point, the solver must return a set
of literals l_1,...,l_n such that

   (l_1 and l_2 ... and l_n) implies l

holds in the theory.



EGRAPH
------

The egraph is a theory solver. It can be used as a simple solver
attached to the core for problems that involve uninterpreted functions
only. It can also be used in combination with other "satellite"
solvers. Currently, the egraph can support at most three satellite
solvers: an arithmetic solver, a bitvector solver, and a solver for the
array/function theory. The combination egraph+satellite solvers is seen
by the core as a single solver (for the combined theories).


The data structures used by the egraph are defined in egraph_types.h.
The egraph maintains a congruence closed equivalence relation between
terms. Each egraph term is represented by a 32bit integer (of type
eterm_t).

When uninterpreted functions are mixed with other theories some terms
have a dual internal representation. They are present as eterm_t
objects in the egraph, and as theory variables in a satellite solver.
For example, in formula '(f a b) >= 0' the uninterpreted term '(f a b)'
is present in the egraph as an eterm and in the arithmetic solver
as a theory variable x. Everything is as is we had rewritten the
formula to "x >= 0 AND x = (f a b)". The equality "x>=0" is a pure
arithmetic atom and the equality "x = (f a b)" relate two objects 'x'
and '(f a b)' in distinct theories.  In the egraph, '(f a b)' is
represented by an eterm t. The egraph records that t is equal to x
by attaching x as the theory variable of t. Conversely, the arithmetic
solver must keep track of the fact that t is the egraph term for x.






SOLVER INTERFACES
-----------------

A theory solver must implement a number of interfaces that allows it
to be used by the smt_core or the egraph (or both). Each interface is
a set of related functions. Internally, we use function pointers to
represent interfaces: an interface object is just a record of function
pointers.

Currently, there are three solver interfaces.

1) control interface:

   void start_internalization(void *solver)
   void start_search(void *solver)
   bool propagate(void *solver)
   fcheck_code_t final_check(void *solver)
   void increase_decision_level(void *solver)
   void backtrack(void *solver, uint32_t backlevel)
   void push(void *solver)
   void pop(void *solver)
   void reset(void *solver)


2) smt interface:

   bool assert_atom(void *solver, void *atom, literal_t l)
   void expand_explanation(void *solver, literal_t l, void *expl, ivector_t *v)
   literal_t select_polarity(void *solver, void *atom, literal_t l)
   void delete_atom(void *solver, void *atom)
   void end_atom_deletion(void *solver)

3) egraph interface:

   void assert_equality(void *solver, thvar_t x1, thvar_t x2)
   void assert_disequality(void *solver, thvar_t x1, thvar_t x2, composite_t *hint)
   void assert_distinct(void *solver, uint32_t n, thvar_t *a, composite_t *hint)
   bool check_diseq(void *solver, thvar_t x1, thvar_t x2)
   void expand_th_explanation(void *solver, thvar_t x1, thvar_t x2, void *expl, th_explanation_t *result)


To be used with the smt_core, a solver must implement the control and smt interfaces.
To be used as a satellite theory to the egraph, a solver must implement the control
and egraph interfaces.



INTERNALIZATION INTERFACES
--------------------------

In addition, there are internalization interfaces for mapping terms
and atoms to internal objects in the solver. The internalization
interfaces are distinct for each solver type. These interfaces are
used by the context (defined in context.h/context.c) to convert
the external representation of terms and formulas into clauses, atoms,
literals, egraph terms, and theory variables.

1) Arithmetic internalization interface:

   thvar_t create_var(void *solver, bool is_int)
   thvar_t create_const(void *solver, rational_t *q)
   thvar_t create_poly(void *solver, polynomial_t *p, thvar_t *map)
   thvar_t create_pprod(void *solver, pprod_t *r, thvar_t *map)

   literal_t create_eq_atom(void *solver, thvar_t x)
   literal_t create_ge_atom(void *solver, thvar_t x)
   literal_t create_poly_eq_atom(void *solver, polynomial_t *p, thvar_t *map)
   literal_t create_poly_ge_atom(void *solver, polynomial_t *p, thvar_t *map)
   literal_t create_vareq_atom(void *solver, thvar_t x, thvar_t y)

   void assert_eq_axiom(void *solver, thvar_t x, bool tt)
   void assert_ge_axiom(void *solver, thvar_t x, bool tt)
   void assert_poly_eq_axiom(void *solver, polynomial_t *p, thvar_t *map, bool tt)
   void assert_poly_ge_axiom(void *solver, polynomial_t *p, thvar_t *map, bool tt)
   void assert_vareq_axiom(void *solver, thvar_t x, thvar_t y, bool tt)
   void assert_cond_vareq_axiom(void *solver, literal_t c, thvar_t x, thvar_t y)

   void attach_eterm(void *solver, thvar_t v, eterm_t t)
   eterm_t eterm_of_var(void *solver, thvar_t v)


2) Bitvector internalization interface:

   thvar_t create_var(void *solver, uint32_t n)
   thvar_t create_const(void *solver, bvconst_term_t *const)
   thvar_t create_const64(void *solver, bvconst64_term_t *const)
   thvar_t create_poly(void *solver, bvpoly_t *p, thvar_t *map)
   thvar_t create_poly64(void *solver, bvpoly64_t *p, thvar_t *map)
   thvar_t create_pprod(void *solver, pprod_t *r, thvar_t *map)
   thvar_t create_bvarray(void *solver, literal_t *a, uint32_t n)
   thvar_t create_bvite(void *solver, literal_t c, thvar_t x, thvar_t y)
   thvar_t create_bvdiv(void *solver, thvar_t x, thvar_t y)
   thvar_t create_bvrem(void *solver, thvar_t x, thvar_t y)
   thvar_t create_bvsdiv(void *solver, thvar_t x, thvar_t y)
   thvar_t create_bvsrem(void *solver, thvar_t x, thvar_t y)
   thvar_t create_bvsmod(void *solver, thvar_t x, thvar_t y)
   thvar_t create_bvshl(void *solver, thvar_t x, thvar_t y)
   thvar_t create_bvlshr(void *solver, thvar_t x, thvar_t y)
   thvar_t create_bvashr(void *solver, thvar_t x, thvar_t y)
   literal_t select_bit(void *solver, thvar_t x, uint32_t i)

   literal_t create_eq_atom(void *solver, thvar_t x, thvar_t y)
   literal_t create_ge_atom(void *solver, thvar_t x, thvar_t y)
   literal_t create_sge_atom(void *solver, thvar_t x, thvar_t y)

   void assert_eq_axiom(void *solver, thvar_t x, thvar_t y, bool tt)
   void assert_ge_axiom(void *solver, thvar_t x, thvar_t y, bool tt)
   void assert_sge_axiom(void *solver, thvar_t x, thvar_t y, bool tt)
   void set_bit(void *solver, thvar_t x, uint32_t i, bool tt)

   void attach_eterm(void *solver, thvar_t v, eterm_t t)
   eterm_t eterm_of_var(void *solver, thvar_t v)

   void build_model(void *solver)
   void free_model(void *solver)
   bool value_in_model(void *solver, thvar_t x, bvconstant_t *v)


3) Function-theory internalization interface

   thvar_t create_var(void *solver, type_t tau)
   void attach_eterm(void *solver, thvar_t v, eterm_t t)
   eterm_t eterm_of_var(void *solver, thvar_t v)

   void build_model(void *solver)
   void free_model(void *solver)
   map_t *value_in_model(void *solver, thvar_t x)



MORE DETAILS FOR ARITHMETIC SOLVERS
-----------------------------------

1) INITIALIZATION/DELETION

A typical solver will be a data structure stored into a C struct (or
whatever is necessary). For example, let's say the full solver is
defined by something like:

   typedef struct mysolver_s {
      ...
   } mysolver_t;


Then the solver must implement the following functions for
initialization and deletion.  There is some flexibility in the
arguments to init.


  1) init_solver(mysolver_t *solver, smt_core_t *core, ...)

     this should initialize the solver structure. core is a
     pointer to the SMT core object that will use this solver.


  2) delete_solver(mysolver_t *solver);

     must free all memory used by the solver structure.


  3) init_jmpbuf(mysolver_t *solver, jmp_buf *buffer);

     this should make a copy of buffer inside solver
     (see Exception mechanism below).





2) INTERNALIZATION INTERFACE

All the following functions must be provided by an arithmetic solver (well some
of them are optional). They are used by the context to convert formulas into
terms, atoms, and clauses. The solver should maintain a table of variables
and a table of atoms internally. Each variable is identified by an integer index
(this is what the context expects) or type thvar_t. This is just an alias
for signed 32bit integer.


Term constructors
-----------------
A term in the arithmetic solver is identified by an integer index (arithmetic variable).

 1) thvar_t create_var(mysolver_t *solver, bool is_int)
   - this must return the index of a new arithmetic variable (no eterm attached)
   - if is_int is true, that variable must have integer type, otherwise, it must
     be a real.

 2) thvar_t create_const(mysolver_t *solver, rational_t *q)
    - this must create a theory variable equal to q and return it (no eterm attached)

 3) thvar_t create_poly(mysolver_t *solver, polynomial_t *p, thvar_t *map)
    - this must return a theory variable equal to p with variable renamed as
      defined by map
    - p is of the form a_0 t_0 + a_1 t1 ... + a_n t_n,
       where t_0 is either the special marker const_idx (= 0) or an arithmetic term
         and t_1 ... t_n are arithmetic terms
    - map is an array of n+1 theory variables:
      map[i] = the theory variable x_i mapped to t_i (with the convention that const_idx
               is always mapped to null_thvar)
    - the solver must return a theory variable y equal to a_0 x_0 + ... + a_n x_n

 4) thvar_t create_pprod(mysolver_t *solver, pprod_t *r, thvar_t *map)
    - must return a theory variable equal to r with variables defined by map
    - r if of the form t_0^d_0 x ... x t_n^d_n where t_0 ... t_n are arithmetic
      terms
    - map is an array of n+1 variables: map[i] = variable x_i mapped to t_i
    - the solver must return an arithmetic variable y equal to (x_0^d_0 x ... x x_n^d_n)


Atom constructors
-----------------

 5) literal_t create_eq_atom(mysolver_t *solver, thvar_t x)
    - must create the atom (x == 0) and return the corresponding literal
    - x is an existing theory variable in solver

 6) literal_t create_ge_atom(mysolver_t *solver, thvar_t x)
    - must create the atom (x >= 0) and return the corresponding literal
    - x is an existing theory variable in solver

 7) literal_t create_poly_eq_atom(mysolver_t *solver, polynomial_t *p, thvar_t *map)
    - must create the atom (p == 0) and return the corresponding literal
    - p and map are as in create_poly

 8) literal_t create_poly_ge_atom(mysolver_t *solver, polynomial_t *p, thvar_t *map)
    - must create the atom (p >= 0) and return the corresponding literal
    - p and map are as in create_poly

 9) literal_t create_vareq_atom(mysolver_t *solver, thvar_t x, thvar_t y)
    - create the atom x == y where x and y are two existing variables in solver


Assertion of top-level axioms
-----------------------------

 10) void assert_eq_axiom(mysolver_t *solver, thvar_t x, bool tt)
     - if tt assert (x == 0) otherwise assert (x != 0)

 11) void assert_ge_axiom(mysolver_t *solver, thvar_t x, bool tt)
     - if tt assert (x >= 0) otherwise assert (x < 0)

 12) void assert_poly_eq_axiom(mysolver_t *solver, polynomial_t *p, thvar_t *map, bool tt)
     - if tt assert (p == 0) otherwise assert (p != 0)
     - p and map are as in create_poly

 13) void assert_poly_ge_axiom(mysolver_t *solver, polynomial_t *p, thvar_t *map, bool tt)
     - if tt assert (p >= 0) otherwise assert (p < 0)
     - p and map are as in create_poly

 14) void assert_vareq_axiom(mysolver_t *solver, thvar_t x, thvar_t y, bool tt)
     - if tt assert x == y, otherwise assert x != y

 15) void assert_cond_vareq_axiom(mysolver_t *solver, literal_t c, thvar_t x, thvar_t y)
     - assert (c implies x == y) as an axiom
     - this is used to convert if-then-else equalities:
        (x == (ite c y1 y2)) is flattened to (c implies x = y1) and (not c implies x = y2)

Egraph connection
-----------------

 16) void attach_eterm(mysolver_t *solver, thvar_t v, eterm_t t)
     - attach egraph term t to theory variable v
    - this function may be omitted for standalone solvers (no egraph is used in that case)

 17) eterm_t eterm_of_var(mysolver_t *solver, thvar_t v)
    - must return the eterm t attached to v (if any) or null_eterm if v has no term attached
    - this function may be omitted for standalone solvers (no egraph)

 NOTE: these functions are also used by the egraph. They are required only if
 the context includes both the egraph and the arithmetic solver.


Model construction
------------------

The following functions are used when the solver reaches SAT (or UNKNOWN).
First, build_model is called. The solver must construct an assignment M from variables to
rationals at that point. Then, the context can query for the value of a variable x in M.
If the solver cannot assign a rational value to x, it can signal this when value_in_model
is called. M must not be changed until the context calls free_model.

 18) void build_model(mysolver_t *solver)
    - build a model M: maps variable to rationals.
     (or do nothing if the solver does not support model construction).

 19) bool value_in_model(mysolver_t *solver, thvar_t x, rational_t *v)
     - must return true and copy the value of x in M into v if that value is available.
     - return false otherwise (e.g., if model construction is not supported by the
       solver or x's value is not available).

 20) void free_model(mysolver_t *solver)
    - notify solver that M is no longer needed.


Exception mechanism
-------------------
When a theory solver is created and initialized it's given a pointer b to a jmp_buf
internal to the context. If the solver fails in some way during internalization, it can call
longjmp(*b, error_code) to interrupt the internalization and return control to the
context. For arithmetic solvers, the following error codes should be used:

   FORMULA_NOT_IDL         (the solver supports only integer difference logic)
   FORMULA_NOT_RDL         (the solver supports only real difference logic)
   FORMULA_NOT_LINEAR      (the solver supports only linear arithmetic)
   TOO_MANY_ARITH_VARS     (solver limit is reached)
   TOO_MANY_ARITH_ATOMS    (solver limit is reached)
   ARITHSOLVER_EXCEPTION   (any other failure)


These are constants defined in context.h


All these functions use the Yices internal representation of rationals, polynomials, and
power products:

- a rational is a pair of integers (num/den) with a special encoding
  trick for GMP rationals. Rationals that can be represented as
  num/den where both num and den fit into 32bits are stored as is,
  with den positive.  Otherwise, there's an indirection: den is set to
  0, and num is an index in a global table of GMP rationals (mpq_t).

- a polynomial is an array of monomials: each monomial is a pair (rational, variable)
  where the variable is a 32bit integer. The special variable 0 is used to denote
  the constant part of a polynomial. The monomials are stored in increasing variable
  order, and all coefficients are non-zero.

- a power product is an array of pairs (variable, exponent). Each variable
  is an integer as above and the array is sorted in increasing variable order.


Internally, Yices represent terms by 32bit integers, which are indices
in a global term table. The table stores a descriptor for every term.
The data structure polynomial_t and pprod_t used above are two example
descriptors. The 'variables' that occur in a polynomial or in a power
product are integer indices that refer to other terms in the global table.


For example, the polynomial (x^2 * y + 2) will be stored as two terms:

- some index q will refer to (x^2 * y)

  the descriptor for q will be a pointer to a pprod_t object that stores [x^2 * y^1]

- some index p will refer to (x^2 * y + 2)

  the descriptor for p will be a pointer to a polynomial_t object that stores [2 + q]





3) SOLVER OPERATIONS

The following functions must be provided. They are called by the smt_core during
the search (they must be present even if they do nothing and just return).

  void start_search(mysolver_t *solver):
  - this is called at the beginning of the search (after internalization)
    the solver can do whatever it requires at this point to prepare
    for the search. The simplest option is to do nothing.


  void increase_decision_level(mysolver_t *solver)
  - this is called every time the SMT core does a case split in the search
    tree. The solver must keep track of the current decision level so
    that is can backtrack correctly.


  void backtrack(mysolver_t *solver, uint32_t back_level)
  - this is called when the SMT core backtracks to a previous decision
    level (i.e., to back_level). The theory solver must backtrack to
    the same point.


  bool assert_atom(mysolver_t *solver, void *atom, literal_t l)
  - this is called whenever the SMT core sets a Boolean variable v to true or
    false, and this variable is attached to an atom defined in solver.
  - atom is a pointer to the corresponding atom object (which solver is
    free to interpret as it wants)
  - the literal l is either pos_lit(v) or neg_lit(v)
  - if l is pos_lit(v) then the atom is true
    if l is neg_lit(v) then the atom if false
  - the function must return 'false' if it detects a conflict
    then if must also produce a conflict explanation in the core
    (using record_theory_conflict)
  - otherwise, the function returns true.


  bool propagate(mysolver_t *solver)
  - this function is called repeatedly during the search (to support
    incremental processing of the constraints).
  - the solver can do nothing and return true (then all the work
    is done in 'final_check')
  - or it can check whether the atoms asserted so far are consistent
    if they are the function should return true.
    if they are not, the function should return false and construct
    a theory conflict using 'record_theory_conflict'
  - this function does not have to be complete (i.e., it's OK
    to return true if the atoms are inconsistent)

  fcheck_code_t final_check(mysolver_t *solver)
  - this function is called when the SMT core reaches a full
    branch in the search tree (i.e., all Boolean variables in
    the SMT core have a value).
  - the solver must check whether the set of all atoms asserted
    is consistent, and it may create lemmas or new atoms
    to force the search to continue

    This function must return one of three codes:

      FCHECK_CONTINUE:  means that more search is required
                        or that a conflict is detected

      FCHECK_SAT: means that the conjunction of all atoms
                  is satisfiable

      FCHECK_UNKNOWN: means that the solver doesn't know
        (i.e., the solver is not complete, and it's not
         found a conflict).


    If the function returns FCHECK_SAT or FCHECK_UNKNOWN, the
    search stops here.

    If the function returns FCHECK_CONTINUE, then the SMT core
    will continue the search. For this to work, the solver
    must have done one of the following things:
    - record a conflict in the SMT core (via 'record_theory_conflict')
      then the SMT core will backtrack and try another branch
    - or add new clauses or atoms in the core


There are other functions but they are optional.


The general idea is then as follows:

1) The theory solver must keep track of the atoms asserted true
   or false so far.

   - it's notified that an atom becomes true/false by
     function assert_atom

   - it must remove all atoms that are no longer assigned
     when backtrack is called


2) Then the two main functions are 'propagate' and 'final_check'

   For a non-incremental solver, the simplest strategy is to
   do nothing in 'propagate' and only check whether the whole
   set of atoms is satisfiable when 'final_check' is called.
   If they are not, the solver must produce an explanation
   for the conflict, that is, it must find a subset of atoms
   that are inconsistent, convert that to a clause and
   send the clause to the SMT core. The smaller this subset
   the better.

   An incremental solver can do more work in 'propagate' to
   prune the search more (i.e., detect conflict early).
   There's a tradeoff between how much work to do in 'propagate'
   and in 'final_check'. Normally, the 'propagate' checks should
   be cheap (possibly incomplete) and the 'final_check' function
   can do more exhaustive work to detect conflicts.

