[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]


From: Ben Pfaff
Subject: Re: ADD FILES and UPDATE
Date: Sun, 30 Nov 2008 21:00:56 -0800
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.2 (gnu/linux)

John Darrington <address@hidden> writes:

> I also wasn't aware that compare_values only checked the first 8
> characters.  I have a feeling this is a bug.

Yes, in many cases I think so too.

Here is my analysis of each use in the source tree:

        * src/data/category.c (cat_value_find): The code assumes
          that each candidate is exactly 1 union value in size,
          so I don't think that this is a bug here.

        * src/language/stats/binomial.c (do_binomial): I think
          that this is a bug, since I don't see any guarantee
          that the width is short.

        * src/language/stats/examine.q (show_summary): I'm not
          sure from a first look, but I'm inclined to say that
          this is a bug.  However, the SPSS documentation for
          EXAMINE says that only the short string part of long
          string variables is analyzed, and maybe we have that
          restriction too, though we do not document it.

        * src/language/stats/freq.c (compare_freq): The main user
          of this function is FREQUENCIES, and it intentionally
          only analyzes the short-string part of long string
          variables.  chisquare.c also uses it, and I don't know
          whether it has the same intention.

        * src/language/stats/oneway.q (run_oneway): ONEWAY only
          supports numeric variables, so this could actually use
          a different comparison function that only compared
          doubles.  At any rate I don't think it is buggy.

        * src/language/stats/t-test.q (which_group): T-TEST only
          analyzes the short-string part of long string
          variables, so this should be correct.

        * src/math/coefficient.c (pspp_coeff_var_to_coeff): I
          don't know.  Further investigation required.

        * src/math/covariance-matrix.c (various functions):

        * src/math/group.c (compare_group): Ditto.

        * src/ui/gui/find-dialog.c (value_compare): Seems likely
          to be wrong.

I could use advice from you or Jason on the code that you
respectively maintain, especially in the "I don't know" cases

> The only problem with the 3way function is that the caller must
> gaurantee that both values have the same with.
> I think that we need a more general compare_values_with_widths_3way
> function and all existing callers of compare_values_short should use it.

That sounds like a good idea.  I know that in some places
comparison of strings of differing length is done by extending
the shorter string with spaces to the length of the longer
string, then comparing the result; see, e.g., compare_string() in
src/language/expressions/helpers.c, which is used for string
inequality comparison operators (and elsewhere) in expressions.
Are you aware of unequal-length strings being compared any other
way in PSPP?

Have you had a chance to look at any of the other commits yet?


"Premature optimization is the root of all evil."
--D. E. Knuth, "Structured Programming with go to Statements"

reply via email to

[Prev in Thread] Current Thread [Next in Thread]