[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Axiom-developer] [Guessing formulas for sequences]

From: anonyme
Subject: [Axiom-developer] [Guessing formulas for sequences]
Date: Sat, 19 Mar 2005 14:59:58 -0600


-The package defined below allows Axiom to guess a formula for a sequence whose 
first few terms are given.
  We present a software package that guesses formulas for sequences of rational
  numbers, rational functions and also more complicated formulas, given the
  first few terms. Thereby we extend and complement Christian Krattenthaler's
  program 'Rate', Doron Zeilberger's program 'GuessRat' and the relevant parts
  of Bruno Salvy and Paul Zimmermann's 'GFUN'.


    For some a brain-teaser, for others one step in proving their next theorem:
given the first few terms of a sequence of, say, integers, what is the next
term, what is the general formula? Of course, no unique solution exists,
however, by Occam's razor, we will prefer a "simple" formula over a more
"complicated" one.

Some sequences are very easy to "guess", like
Of course, at times we might want to guess a formula for a sequence of
polynomials, too:
1, 1, 1 + q, 1 + q + q^2, 1 + q + q^2 + q^3 + q^4, 1 + q + q^2 + q^3 + 2q^4 +
q^5 + q^6,\\
1 + q + q^2 + q^3 + 2q^4 + 2q^5 + 2q^6 + q^7 + q^8 + q^9, (1 + q^4 + q^6)(1 +
q + q^2 + q^3 + q^4 + q^5 + q^6),\\
(1 + q^4)(1 + q + q^2 + q^3 + q^4 + q^5 + 2q^6 + 2q^7 + 2q^8 + 2q^9 + q^{10} +
q^{11} + q^{12}),\dots

Fortunately, with the right tool, it is a matter of a moment to figure out
formulas for all of these sequences. In this article we describe a computer
program that encompasses well known techniques and adds new ideas that we hope
to be very effective.

We would also like to mention *The online encyclopedia of integer sequences* of
Neil Sloane. There, you can enter a sequence of integers and chances are good
that the website will respond with one or more likely matches.  However, the
approach taken is quite different from ours: the encyclopedia keeps a list of
currently 103,667 sequences, entered manually, and it compares the given
sequence with each one of those. Besides that, it tries some simple
transformations on the given sequence to find a match.  Furthermore it tries
some simple programs we will describe below to find a formula, although with a
time limit, i.e., it gives up when too much time has elapsed.

Thus, the two approaches complement each other: For example, there are
sequences where no simple formula is likely to exist, and which can thus be
found only in the encyclopedia. On the other hand, there might be sequences
that have not yet found their way into the encyclopedia, but can be guessed in
a few minutes by your computer.

  Guessing Algorithms

    A formula for Sequence (1), is almost trivial to guess: it seems
obvious that it is $n^2$. More generally, if we believe that the sequence in
question is generated by a polynomial, we can simply apply interpolation.
However, how can we "know" that a polynomial formula is appropriate? The answer
is quite simple: we use all but the last term of the sequence to derive the
formula. After this, the last term is compared with the value predicted by the
polynomial. If they coincide, we can be confident that the guessed formula is

The second sequence does not seem to be generated by a polynomial. However,
paralleling interpolation, Pad\'e approximation may be used to find out whether
the terms of the sequence satisfy a linear reccurrence with constant
coefficients, as is the case here.

We would like to point out that the main problem we have to solve is about
efficiency. For example, maybe we would like to test whether the formula is
rational with an Abelian term, i.e., has the form
  n\mapsto (a+bn)^n \frac{a_0+a_1 n+a_2 n^2+\dots+a_k n^k}
                         {1+b_1 n+b_2 n^2+\dots+b_l n^l}
for some $k$ and $l$. Of course, we could set up an appropriate system of
$k+l+4$ polynomial equations in $k+l+3$ unknowns. However, it will usually take
a very long time to solve this system. Thus, we need to find *efficient*
algorithms that test for large classes of formulas.

It is obvious that the first method described is easily extended to cover
rational functions via rational interpolation. Similarly, the second method can
be extended to cover differentially finite or even differentially algebraic
functions. One of the contributions of this article is to show that there is
also a feasible way to guess sequences generated by Formula~\ref{eq5}.


    Still, there are a lot of formulas which cannot be found using 
interpolation or
Pade approximation. However, it occurs frequently that although a formula is
not rational, applying one of the two following operators makes it rational:

  - $\Delta_n$ the differencing operator, transforming $f(n)$ into
  $f(n)-f(n-1)$, and
  - $Q_n$ the operator that transforms $f(n)$ into $f(n)/f(n-1)$.

Thus, we simply apply these operators recursively to the sequence, until we are
able to guess a formula using interpolation, Pad\'e approximation or one of its

  Using the package

    In this section we describe how to guess a formula for your favorite 
with the 'Axiom' package 'Guess'.

The package provides currently four primitive guessing functions, namely:

  - 'guessRat' for guessing rational functions,
  - 'guessExpRat' for guessing rational functions with an Abelian-type
    term as in (5),
  - 'guessPade' for guessing sequences defined by a recurrence with
    constant coefficients, and
  - 'guessPRec' for guessing sequences defined by a recurrence with
    polynomial coefficients.

Among these, 'guessRat' and 'guessPade' are very fast, 'guessPRec'
is reasonably fast and 'guessExpRat' is rather slow. 

    Guessing formulas for sequences of rational numbers

      For example, if we suspect that a sequence of integers or rationals like
Sequence (2) satisfies a simple recurrence, we type::

  guessPade(n, [1, 1, 2, 3, 5], n+->n)\$GuessInteger

to obtain an answer like
{\left[ {function={coefficient 
{-{1 \over {{ x \sp 2}+ x -1}}, \: x, \: n} 
\: {order=1} 

Thus, we used the operation 'guessPade' from the package 'GuessInteger' to
guess the $n^{th}$ term of the sequence 1,1,2,3,5. The meaning of 'n+->n'
will be made clear later.

The result is a list of formulas which seem to fit, along with an integer that
states from which term on the formula is correct. The latter is necessary,
because rational interpolation features sometimes *unattainable points*,
as the following example shows::

  guessRat(n, [3, 4, 7/2, 18/5, 11/3, 26/7], n+->n)\$GuessInteger

{\left[ {function={{4n+2} \over {n+1}}}, \: {order=3} 
Here, $order=3$ indicates that the first two terms of the sequence *might* not
coincide with the value of the returned function. A similar situation occurs,
if the function generating the sequence has a pole at $n_0\in\mathbb N$, where
$0 < n_0 < m$ and $m$ is the number of given values. We would like to stress
that this is rather a feature than a "bug": most terms will be correct,
just as in the example above, where the value at $n=1$ is indeed $3$.

Apart from 'guessRat', 'guessExpRat', 'guessPade' and 'guessPRec', the package
provides a top level operation 'guess', which applies the operators $\Delta_n$
and $Q_n$ recursively to the given sequence and then tries the specified
guessing algorithms. For example, given the sequence
the appropriate function is::

  guess(n, [0, 1, 3, 9, 33], n+->n, [guessRat, guessPade], 
        [guessSum, guessProduct, guessOne])\$GuessInteger

and the output will be - apart from some information on the progress -
{\left[ {function={\sum \sb{\displaystyle {{s \sb {5}}=1}} \sp{\displaystyle 
{n -1}} {\prod \sb{\displaystyle {{p \sb {4}}=1}} \sp{\displaystyle {{s \sb 
{5}} -1}} {({p \sb {4}}+1)}}}}, \: {order=1} 

Thus, 'guess' takes five parameters. The first three parallel the parameters in
the other guessing functions. The fourth parameter is a list of guessing
functions to be tried, in the example 'guessRat' and 'guessPade'. The fifth
parameter specifies which operators should be applied.  Additionally the option
'guessOne' can be specified, which tells the program to stop immediately if one
fitting function has been found.
[77 more lines...]

forwarded from

reply via email to

[Prev in Thread] Current Thread [Next in Thread]