[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Gcl-devel] Fwd: secret of rat3a/ctimes with gcl ?
From: |
Camm Maguire |
Subject: |
Re: [Gcl-devel] Fwd: secret of rat3a/ctimes with gcl ? |
Date: |
Wed, 20 Jun 2012 18:08:06 -0400 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux) |
Greetings, and thanks!
Here are the definitions I'm examining now:
(progn
(push '((fixnum fixnum) fixnum #.(compiler::flags compiler::rfa)
"(fixnum)(((long long)(#0))%((long long)(#1)))") (get 'i%
'compiler::inline-always))
(push '((fixnum fixnum fixnum) fixnum #.(compiler::flags compiler::rfa)
"(fixnum)(((long long)(#0))*((long long)(#1))%((long long)(#2)))")
(get '*% 'compiler::inline-always))
(push '((fixnum fixnum fixnum) fixnum #.(compiler::flags compiler::rfa)
"(fixnum)(((long long)(#0))+((long long)(#1))%((long long)(#2)))")
(get '+% 'compiler::inline-always))
(push '((fixnum fixnum fixnum) fixnum #.(compiler::flags compiler::rfa)
"(fixnum)(((long long)(#0))-((long long)(#1))%((long long)(#2)))")
(get '-% 'compiler::inline-always))
(setf (get 'i% 'compiler::return-type) t)
(setf (get '*% 'compiler::return-type) t)
(setf (get '+% 'compiler::return-type) t)
(setf (get '-% 'compiler::return-type) t)
(defmacro mcf (f a &optional b &aux (z (assoc f '((identity 64 i%)(* 32 *%)(+
63 +%)(- 63 -%))))(tp (cadr z))(ff (caddr z)))
`(if ,(if b `(typep modulus ',(if (< (integer-length most-positive-fixnum)
tp) `fixnum `(signed-byte ,tp)))
`(and (typep modulus 'fixnum) (typep ,a 'fixnum)))
(let ((x ,a),@(when b `((y ,b)))(z modulus))
(declare (fixnum x ,@(when b `(y)) z))
(let ((k (,ff x ,@(when b `(y)) z))(q (ash z -1)))
(declare (fixnum k q))
(if (> k q) (the fixnum (- k z)) (if (< k (- q)) (the fixnum (+ k
z)) k))))
(let ((k ,@(if b `((,f ,a ,b)) `(,a))))
(if modulus (let ((q (ash modulus -1))(k (mod k modulus))) (if (> k q)
(- k modulus) (if (< k (- q)) (+ k modulus) k))) k))))
(si::define-compiler-macro ctimes (a b) `(mcf * ,a ,b))
(si::define-compiler-macro cplus (a b) `(mcf + ,a ,b))
(si::define-compiler-macro cdifference (a b) `(mcf - ,a ,b))
(si::define-compiler-macro cmod (a) `(mcf identity ,a))
)
The unsigned mod version of mcf would be
(defmacro mcf (f a &optional b &aux (z (assoc f '((identity 64 i%)(* 32 *%)(+
63 +%)(- 63 -%))))(tp (cadr z))(ff (caddr z)))
`(if ,(if b `(typep modulus ',(if (< (integer-length most-positive-fixnum)
tp) `fixnum `(signed-byte ,tp)))
`(and (typep modulus 'fixnum) (typep ,a 'fixnum)))
(let ((x ,a),@(when b `((y ,b)))(z modulus))
(declare (fixnum x ,@(when b `(y)) z))
(,ff x ,@(when b `(y)) z))
(let ((k ,@(if b `((,f ,a ,b)) `(,a))))
(if modulus (mod k modulus) k))))
These speed up your routines nicely. They don't appear to noticeably
impact the maxima testsuite runtimes.
I suppose the 'balanced mod' implementation is to control the growth of
the integers? What is unclear to me is that the #-kcl version in
rat3a.lisp only checks the > mod/2 branch, whereas this as well as the <
-mod/2 appear necessary on my system.
Anyway, let me know if this helps.
Take care,
Volker van Nek <address@hidden> writes:
> Just
> modulus : 7 $
>
> My polynomial '(13 1 12 2 11 3 10 4 0 5) for testing had coefficients smaller
> than 7.
>
> Greetings
> Volker
>
> 2012/6/14 Camm Maguire <address@hidden>
>
> Greetings, and thanks! What are you using for the modulus setting in
> your timings?
>
> Volker van Nek <address@hidden> writes:
>
> > Hi Camm,
> >
> > I attach a file with my versions psq1 and psq2 and with ptimes1 from
> rat3a.lisp and some functions from rat3a.lisp and optimize.lisp which
> are
> > envolved in running ptimes1 and ctimes.
> >
> > Greetings
> > Volker
> >
> > 2012/6/14 Camm Maguire <address@hidden>
> >
> > Greetings! Could you please post the source to psq1, psq1, and
> ptimes1? Thanks!
> >
> > Volker van Nek <address@hidden> writes:
> >
> > > Hi Camm. Greetings!
> > >
> > > Here at work I have a (damned slow) Computer which runs Maxima on
> sbcl and I also performed a timing test with psq1, psq2 and rat3a/
> ptimes1.
> > >
> > > (%i4) :lisp (time (dotimes (i 100000) (psq1 '(13 1 12 2 11 3 10 4
> 0 5))))
> > >
> > > Evaluation took:
> > > 0.460 seconds of real time
> > > 0.428026 seconds of total run time (0.424026 user, 0.004000
> system)
> > > [ Run times consist of 0.016 seconds GC time, and 0.413 seconds
> non-GC time. ]
> > > 93.04% CPU
> > > 1,103,390,864 processor cycles
> > > 35,201,008 bytes consed
> > >
> > > NIL
> > > (%i4) :lisp (time (dotimes (i 100000) (psq2 '(13 1 12 2 11 3 10 4
> 0 5))))
> > >
> > > Evaluation took:
> > > 0.301 seconds of real time
> > > 0.264016 seconds of total run time (0.264016 user, 0.000000
> system)
> > > [ Run times consist of 0.012 seconds GC time, and 0.253 seconds
> non-GC time. ]
> > > 87.71% CPU
> > > 723,411,304 processor cycles
> > > 35,198,904 bytes consed
> > >
> > > NIL
> > > (%i4) :lisp (time (dotimes (i 100000) (ptimes1 '(13 1 12 2 11 3
> 10 4 0 5) '(13 1 12 2 11 3 10 4 0 5))))
> > >
> > > Evaluation took:
> > > 1.050 seconds of real time
> > > 1.024065 seconds of total run time (1.020064 user, 0.004001
> system)
> > > [ Run times consist of 0.004 seconds GC time, and 1.021 seconds
> non-GC time. ]
> > > 97.52% CPU
> > > 2,520,188,384 processor cycles
> > > 21,600,864 bytes consed
> > >
> > > NIL
> > >
> > > Here the number of processor cycles corresponds to the timing
> results I expected.
> > >
> > > psq2 uses just (mod (* a b) modulus) where
> > >
> > > psq1 uses
> > >
> > > (cond
> > > ((not (null modulus))
> > > (let ((.n. (mod (* a b) modulus)))
> > > (cond
> > > ((<= (* 2 .n.) modulus) .n.)
> > > (t (- .n. modulus)) )))
> > >
> > > at the same place, which is a lot more!
> > >
> > > rat3a/ptimes1 even uses rat3a/ptimes instead of rat3a/ctimes
> which puts a lot of conditionals on top and it uses an algorithm which
> consists
> > of
> > > nearly twice the number of multiplications and additions.
> > >
> > > How can it happen, that doing twice the work is faster in gcl
> than just doing it once? sbcl shows the timings I expected. What is
> different
> > with
> > > gcl?
> > >
> > > Volker
> > >
> > > 2012/6/13 Volker van Nek <address@hidden>
> > >
> > > Hi Camm. Greetings!
> > >
> > > Thank you for taking your time for this.
> > >
> > > I don't expect overflowing the fixnum bounds, the numbers are
> very small. Using e.g. a modulus of 7 the multiplication is of type
> 3*4.
> > >
> > > (setq si::*notify-gbc* t)
> > > I do not understand what the answers want to tell me. I hope
> you know.
> > >
> > > I did some experiments with (declare (integer a b c)), but
> this did not seem to have influence.
> > >
> > > I attached the code in two variants and a session introducing
> the function and some timing results on my computer.
> > >
> > > Thanks in advance.
> > >
> > > Volker van Nek
> > >
> > > 2012/6/13 Camm Maguire <address@hidden>
> > >
> > > Greetings!
> > >
> > > I'd like to actually see your implementation to make
> sure. But my guess
> > > is that the unsigned arithmetic is overflowing the fixnum
> bounds,
> > > i.e. producing bignums. You might get an idea by (setq
> si::*notify-gbc*
> > > t). Is anything known about modulus lisp-type wise?
> > >
> > > Volker van Nek <address@hidden> writes:
> > >
> > > > Hi Camm,
> > > >
> > > > there was no answer on the Maxima list to my question.
> Perhaps you might have an idea.
> > > >
> > > > Thanks in advance
> > > > Volker van Nek
> > > >
> > > > ---------- Forwarded message ----------
> > > > From: Volker van Nek <address@hidden>
> > > > Date: 2012/6/9
> > > > Subject: secret of rat3a/ctimes with gcl ?
> > > > To: address@hidden
> > > >
> > > > I have coded a squaring function for polynomials in
> finite fields. As expected it is roughly 1.8 times faster than
> multiplying
> > poly*poly
> > > with rat3a
> > > > /ptimes, wich is the underlying lisp function for
> Maxima level fasttimes.
> > > >
> > > > To multiply the coefficients the function ptimes uses
> ctimes, which multiplies and reduces the result by a balanced modulus.
> From
> > rat3a:
> > > >
> > > > (defmacro mcmod (n) ;;; gives answers from -modulus/2
> ...0 1 2 +modulus/2
> > > > `(let ((.n. (mod ,n modulus)))
> > > > (cond ((<= (* 2 .n.) modulus) .n.)
> > > > (t (- .n. modulus)))))
> > > >
> > > > (defun ctimes (a b)
> > > > (cond ((null modulus) (* a b))
> > > > (t (mcmod (* a b)))))
> > > >
> > > > In my squaring function I want to replace each (ctimes
> a b) by simply (mod (* a b) modulus). I need positive results.
> > > >
> > > > As expected with clisp and sbcl this replacement is
> faster than ctimes, but with gcl this replacement (which does less!)
> slows down
> > the
> > > whole
> > > > function by a factor 3.
> > > >
> > > > Is there a hidden secret about ctimes with gcl? Any
> hints are welcome.
> > > >
> > > > Volker van Nek
> > > >
> > >
> > > --
> > > Camm Maguire
> address@hidden
> > >
> ==========================================================================
> > > "The earth is but one country, and mankind its citizens."
> -- Baha'u'llah
> > >
> >
> > --
> > Camm Maguire address@hidden
> >
> ==========================================================================
> > "The earth is but one country, and mankind its citizens." --
> Baha'u'llah
> >
> >
>
> --
> Camm Maguire address@hidden
> ==========================================================================
> "The earth is but one country, and mankind its citizens." -- Baha'u'llah
>
--
Camm Maguire address@hidden
==========================================================================
"The earth is but one country, and mankind its citizens." -- Baha'u'llah