Re: [lmi] Calculation summary speed


From: Greg Chicares
Subject: Re: [lmi] Calculation summary speed
Date: Tue, 31 Oct 2006 01:21:16 +0000
User-agent: Thunderbird 1.5.0.4 (Windows/20060516)

On 2006-10-31 0:15 UTC, Evgeniy Tarassov wrote:
> 
[speedups include...]
> some modifications applied to xsl templates used to generate html
> and TSV calculation summary output
> 
> 1) i have slightly changed the code to read for every ledger value the
> 'calculation_summary' flag from 'format.xml'. The optimisation in C++
> consisted of skipping the most time-costly place -- the method that
> formats double vectors into a string vector. The method generates the
> output for a column only if:
> - either the value is explicitly marked by calculation_summary in
> 'format.xml'
> - or the column is in the supplemental_report value set

I think it would be better to use only columns specified in a new
'calculation_summary_columns' entity in 'configurable_settings.xml'.
See:
  http://lists.gnu.org/archive/html/lmi/2006-10/msg00064.html
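
A minimal sketch of that idea, assuming the configured column names
have already been read into a set: only the named columns are
formatted, and the costly double-to-string conversion is skipped for
everything else. The names 'format_selected_columns', 'column_map',
and 'formatted_map' are hypothetical, not lmi's actual interfaces,
and the two-decimal format is only a placeholder.

```cpp
#include <cstdio>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration; not lmi's actual interfaces.
using column_map    = std::map<std::string, std::vector<double>>;
using formatted_map = std::map<std::string, std::vector<std::string>>;

// Format only the columns whose names appear in 'wanted'. All other
// columns are skipped, so the costly conversion never runs for them.
formatted_map format_selected_columns
    (column_map            const& columns
    ,std::set<std::string> const& wanted
    )
{
    formatted_map result;
    for(auto const& entry : columns)
        {
        if(0 == wanted.count(entry.first))
            {
            continue; // Not requested: skip formatting entirely.
            }
        std::vector<std::string> strings;
        strings.reserve(entry.second.size());
        for(double v : entry.second)
            {
            char buffer[64];
            // Placeholder format: two decimals (an assumption).
            std::snprintf(buffer, sizeof buffer, "%.2f", v);
            strings.emplace_back(buffer);
            }
        result.emplace(entry.first, std::move(strings));
        }
    return result;
}
```

With this shape, changing the selection of columns changes only the
'wanted' set; the formatting code itself stays unconditional.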

> All these values are inserted into the output xml if we generate a
> full version of xml (for pdf generation).
> The single double values and string vector values are still included in
> the xml.

Are there so few string-vector values that conditionally excluding
them wouldn't make it noticeably faster?

> 2) XSL template optimisation

Okay--filtering out unneeded nodes first, before looking up the
few nodes actually needed to generate the final table, was the
key to the optimization. Thanks for the detailed explanation.

> Some other comments:
> From the beginning we were discussing the inclusion of
> supplemental_report into ledger xml data vis-a-vis passing
> supplemental_report columns as parameters to the libxslt
> transformation engine.

Almost--the original idea was to use a 'configurable_settings.xml'
entity
   <calculation_summary_columns>
     some_column_name
     some_other_column_name
     ...
   </calculation_summary_columns>
as the set of parameters.
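
As a rough sketch of how such an entity's text could be turned into a
list of column names, assuming the names are separated by whitespace
as in the example above. The function name 'parse_column_names' is
hypothetical, and extracting the element's text from the xml file is
left out here.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Split the text content of the entity into column names. Names are
// assumed to be separated by arbitrary whitespace (spaces, newlines).
std::vector<std::string> parse_column_names(std::string const& text)
{
    std::istringstream iss(text);
    return std::vector<std::string>
        (std::istream_iterator<std::string>(iss)
        ,std::istream_iterator<std::string>()
        );
}
```

The resulting names could then be handed to the transformation as its
set of parameters, or used to decide which columns to format at all.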

> It turns out that in general passing
> information as parameter is discouraged in the xslt community and is
> generally used to debug/tweak xsl templates (for example pass an
> optional debug parameter to make template print more debug information
> in case some information is missing or in some sort of unusual
> situation).
> If i understand it correctly Lmi as it is now does not have such a
> global switch configurable at runtime (via
> configurable_settings.xml?). Do you think that it could be interesting
> to add such a feature (a new 'debug' flag in
> 'configurable_settings.xml'), which will make xslt print warning in
> red in case any column is missing or something unexpected is going on
> with ledger xml code?

That could be a valuable idea for the future. I have done so
little work with xslt myself that I can't say anything very
insightful.

> Since we are putting supplemental_report columns into ledger xml, we

I think we should move away from that idea, for the calculation
summary, as discussed above.

> have to regenerate xml data every time the user changes
> supplemental_report columns,

That's okay.

> no (simple) caching could be done in
> ledger_text_formats.?pp.

That's okay. Now, I guess that you were using lazy evaluation
to implement caching; as you point out, that's not really
beneficial any more. I suppose it would have helped if an end
user reran an illustration after changing only the selection
of calculation-summary columns; that would actually be rare in
practice, and need not be optimized. Typically, they'll change
other input fields between runs, so that the calculations will
produce a different result--and the xml will need to be
regenerated anyway.

But it's good to understand that. When I saw lazy evaluation,
I originally guessed that it somehow simplified the code to
write in a 'functional' idiom--e.g., that you could write
unconditional statements for generating the data, and then
only the necessary data would actually get generated. So I
figured you were thinking in a functional language while
writing C++, which is perfectly okay. Here:
  'input_harmonization.cpp' [Input::DoHarmonize()]
is a lot of code that *wants* to be declarative, and I've
toyed with the idea of using FC++
  http://www-static.cc.gatech.edu/~yannis/fc++/boostpaper/fcpp.html
for that but never had the time.
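
For reference, the lazy-evaluation-as-cache idea discussed above can
be sketched in plain C++ roughly as follows. This is only an
illustration under stated assumptions: 'lazy_value' is a hypothetical
helper, not code from lmi or FC++, and it requires the value type to
be default constructible.

```cpp
#include <functional>
#include <utility>

// Hypothetical helper: computes its value on first access, caches it.
// T must be default constructible for this simple version.
template<typename T>
class lazy_value
{
  public:
    explicit lazy_value(std::function<T()> compute)
        :compute_(std::move(compute))
        {}

    // Compute on first call; later calls return the cached result.
    T const& get()
        {
        if(!cached_)
            {
            value_  = compute_();
            cached_ = true;
            }
        return value_;
        }

    // Discard the cached result, e.g. after the inputs change.
    void invalidate() {cached_ = false;}

  private:
    std::function<T()> compute_;
    T    value_  {};
    bool cached_ {false};
};
```

Code written against get() can be unconditional: data that is never
requested is never generated. The cache pays off only if get() is
called repeatedly between calls to invalidate().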

> I have removed the cache, making the 'Prepare'
> phase useless. Which means that in the latest result Greg has posted:
> 
>> >            total  calculate prepare format
>> > modified:    430        150      30    250 "coarse sketch"
>> > new':        549        154     299     96 xsl optimization
>> > the winner:  278        152      30     96
>>
>> new'':         322        157      20    145 20061030T1836Z cvs

Again, thanks for the explanation.

> the third column ('prepare') does not measure the xml preparation time
> -- it is now included in 'format' phase, therefore the slight
> degradation in 'format' phase speed.
> 
> Greg--do you want me to remove the 'prepare' phase timer from
> illustration_view.cpp?

Well, it still measures something: SetLedger(). I'd say it
does no harm to leave it there, and maybe the information
will be useful somehow, someday; if not, we can always
remove it later.
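
A per-phase timer of this kind can be sketched with the standard
library as below. This is not the timer actually used in
'illustration_view.cpp'; 'time_phase_ms' is a hypothetical name, and
the phase is passed in as an arbitrary callable.

```cpp
#include <chrono>
#include <functional>

// Run 'phase' once and return its elapsed wall-clock time in
// milliseconds, measured with a monotonic clock.
long long time_phase_ms(std::function<void()> const& phase)
{
    auto const start = std::chrono::steady_clock::now();
    phase();
    auto const stop  = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>
        (stop - start
        ).count();
}
```

Wrapping each of the 'calculate', 'prepare', and 'format' steps in
such a call would give one number per phase, like the columns in the
table above.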



