Date: Thu, 27 Apr 2000 22:22:31 -0600
From: Scott Ribe
To: 4D MailList <4d-nug@lists.4dnetizens.com>
Subject: Real Numbers & Computers [unbounded length] [corrected]
Geez, people, get a grip and quit sniping at each other! I'll provide
some hard FACTS here, but first let me say this: I know what I'm talking
about. (If you don't know what a denormalized representation is, then
don't even THINK about trying to disagree with me because you don't have
the requisite knowledge to do so.)
[I'm up to 3 errors so far in my drafts, if you allow me to count as a
"draft" the version that never made it to the list, possibly because I
used the "bs" word without abbreviating.]
The facts:
- As mentioned many times, real numbers have a limited precision.
- Also as mentioned, many values cannot be exactly represented. Not ever
mentioned though, is what values. I get the feeling that many here think
that we're talking about really small values, or values in the cracks,
like trying to enumerate 1,000,000 numbers between 0.000000000001 and
0.000000000002. But that's not the case. The number 0.1 cannot be
represented exactly by a BINARY number, just as 1/3 cannot be represented
exactly by a decimal number. Try it! You'll find that 1/10 is 1/16 + 1/32
+ 1/256 + 1/512 + 1/4096 + 1/8192 ad infinitum, or in binary representation
0.000110011001100... forever. The same is true for 0.2, 0.4, 0.6, 0.8, and
of course all numbers ending with those fractions. All numbers that are
those divided by a power of 10 (0.01, 0.001... and 0.02, 0.002... and ...)
suffer the same problem, and you can't reason your way around it by
applying "10s" thinking to "2s" numbers! Try this: 0.5 can be exactly
represented in binary, but 0.05 cannot! So you can see, as soon as you
start representing decimal fractions as binary fractions there's a whole
lot of tiny errors introduced.
- A really, really, really, ugly consequence of these representation
limitations is that if you, for instance, round 0.1000000001 to 1 decimal
place, you may SEE 0.1 whenever you display it, thanks to 4D's display
routines, but you ain't got 0.1! You've got the closest value that fits in
53 binary digits (52 actual bits and an implied 1 for any bit-twiddling
geeks reading this), which for an IEEE double is off from 0.1 by about
5.55e-18 if you don't have a calculator handy. That's what is left over
when the infinite string of repeating digits is cut off and rounded at the
last bit you've got! You'll never ever get 0.1 in a binary representation,
no matter what rounding you apply!
- Due to display options, and possible weirdness in the debugger, what
you see is often not the full representation. This makes it very
confusing to figure out at what point inaccuracies creep in.
- 4D does not protect us from these "facts of life"; some systems do.
There are several methods: decimal mantissas and exponents, essentially
the decimal types offered by some other databases that have been
mentioned here; BCD or binary coded decimal; larger underlying
representations with automatic rounding everywhere. All of these provide
convenience but at the cost of a layer of software over the CPU
instructions, which adds a performance cost.
- I have never been able to document an error in 4D's math. I have
suspected one on very rare occasions. I have much more often found
rounding problems in my own code.
- There have been reports here from time to time of funny behavior with
0. I suspect that in these cases the developers thought they had a 0 but
in fact had a very small number that was the result of some sequence of
operations that in "pure" math would have yielded 0. That said, here are
some facts about 0s and computer math: 0 = 0, any number + 0 = exactly
the same number, any number - 0 = exactly the same number, the string "0"
converted to a number = 0. If 4D actually violates any of those rules
that is an error in 4D. But I've not seen it do that and am not convinced
that it ever has. There actually ARE 2 0s, positive and negative 0, but
they behave according to the rules above and are only provided for an
obscure reason. (-1 divided by 10 gazillion or so will truncate to -0
while 1 divided by 10 gazillion or so will truncate to +0, so that if you
reach "0" by dividing down to a number that is too small to be
represented in a real, the result is a 0 that carries information about
whether it underflowed from the negative or positive direction.)
- I did at one time suspect that going back and forth between Mac 68k and
other platforms would "pollute" 0s somehow. Possibly by not clearing out
the extra bits used in the 68k representation. But I couldn't nail it
down. The problem was rare and my 68k clients were retired and now I'll
never know (or care).
- I don't have an actual explanation for Dan's original problem. Although
2.4 cannot be exactly represented, if you subtract 2.4 from itself you
should get exactly 0; the repeating digits cancel each other out and
there's nothing left. So Dan's post is a clue that there might be a bug
in 4D's Round function. (Although 2.4 is not exactly representable, the
function should return the closest possible binary number to what would
be the actual decimal number result.) I'd like to see this reproduced and
tested more thoroughly.
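
The representation facts in the list above can be checked with a short sketch. It is written in Python rather than 4D, on the assumption that both store reals as IEEE 754 doubles; the helper name is_exact is made up for illustration. It shows which decimal fractions are stored exactly, that a value minus itself is exactly 0 even when the value is inexact, and that negative zero compares equal to zero.

```python
import math
from decimal import Decimal
from fractions import Fraction


def is_exact(text):
    """True if the decimal literal in `text` is stored exactly as a binary double."""
    # Fraction(float) gives the exact stored value; Fraction(str) gives the
    # exact decimal value that was typed.
    return Fraction(float(text)) == Fraction(text)


# Which decimal fractions survive the trip into binary?
for text in ("0.5", "0.25", "0.1", "0.05", "2.4"):
    label = "exact" if is_exact(text) else "inexact"
    # Decimal(float) prints every digit of the value actually stored.
    print(text, label, Decimal(float(text)))

# A value minus itself is exactly 0, representable or not.
x = 2.4
difference = x - x

# There are two zeros; they compare equal and both act as 0 in arithmetic.
neg_zero = -0.0
sign_of_neg_zero = math.copysign(1.0, neg_zero)
```

Printing the full stored value this way is also a way around the display problem mentioned above: it shows what is actually in the number, not what the display routine chooses to show.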
Now I'm going to respond to some individual statements made here. Since
this thread has degraded so far and I've had a rough day with real,
actual, 4D bugs, I'm going to be as blunt as I like! Please don't anyone
feel like you're being picked on; this IS a difficult subject and one
that causes MUCH confusion.
On Wed, Apr 26, 2000, David Hudson wrote:
>There's been some very complex postings on this topic and many have
>re-iterated that rounding will problems but I still don't understand
>how the value of bBalance could ever be .9999999998 (or thereabouts) -
>but on occasion it was.
Do you understand now? Or, IOW: was my explanation clear enough?
On Wed, Apr 26, 2000, Steve Hussey wrote:
>You'll have the same problems whatever language/tool you use unless it
>supports more bytes per number.
More bytes per number will not do anything to fix the problem. The little
missing bits of your numbers will be much smaller, but you'll still have
to perform the same rounding in the same places.
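
A small sketch of that point, in Python as a stand-in for 4D and assuming IEEE 754 singles and doubles: going from 4 bytes to 8 bytes shrinks the error in storing 0.1, but it does not make it zero. The function name stored_error is hypothetical.

```python
import struct
from fractions import Fraction


def stored_error(value_text, fmt):
    """Absolute error after storing a decimal literal in the given struct format."""
    # Pack and unpack to force the value through the chosen byte width.
    stored = struct.unpack(fmt, struct.pack(fmt, float(value_text)))[0]
    return abs(Fraction(stored) - Fraction(value_text))


err32 = stored_error("0.1", "<f")  # 4-byte single precision
err64 = stored_error("0.1", "<d")  # 8-byte double precision

print(float(err32), float(err64))
```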
On Wed, Apr 26, 2000, CMichels76@aol.com wrote:
>Tell that to a scientist who is analyzing his data and wants zero to be
>zero!
>Of course, he also wants nul to be nul, but we won't even go there, will we?
You clearly don't know how scientists and engineers deal with numbers and
rounding. Find one and ask for an explanation.
On Wed, Apr 26, 2000, Dan Babcock wrote:
>I DO understand floating point arithmetic by computers, despite the
>condescending remarks by Douglas Blew. I have done professional
>development on a very large number of computer systems, each knew how
>to add 1.2 and 3.4. Why can't 4D? There is absolutely no ambiguity
>or numerical estimation in adding two finite numbers. I do not
>recall hitting this in 4D prior to version 6.
Well, I have to question whether you really understand floating point
math on computers. You're demonstrating otherwise. There IS an estimation
when representing 1.2 and 3.4 in binary. As mentioned above, there ARE
ways to hide or eliminate these estimations, and 4D does not do this for us.
>What I am saying here is that I should not have to convert a number
>to a string and back to a number to get correct mathematical results
>when adding such simple numbers. I do believe there is a bug in the
>trunc and/or round function because I have seen it not perform the
>function. Yes, I realize 4D can't fix this until I create a
>reproducible case for them. Maybe someday...
Round cannot return a number with any more accuracy than what the
underlying representation accommodates, so in general you need to round
results, not just operands. However, the case you've presented is a
special case that DOES seem to indicate a problem with Round and should be
investigated. (Funny thing is though, that this possible bug in no way
affects how calculations should be done in 4D. If ACI fixed it
tomorrow, we'd still need to round in all the same places!)
On Wed, Apr 26, 2000, Dan Babcock wrote:
>During my testing, I rounded two numbers each to two decimal places.
>I then subtracted them. The result should have been zero. Instead
>it was a mathematical approximation of zero (-.0000000000009 or what
>ever it was).
As mentioned before, but worth repeating, rounding the operands is not
sufficient; you must round the result, IN GENERAL. In your case there
may be a bug. But what if the 2 numbers "rounded" to 2.5 and 2.4? One
exactly representable and one not?
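
Here is that 2.5 and 2.4 case as a minimal sketch. It uses Python's round as a stand-in for 4D's Round, which is an assumption; the input values are made up. Both operands are rounded first, yet the difference is still not the same number as 0.1 until the result is rounded too.

```python
# Round both operands to 1 decimal place first.
a = round(2.50001, 1)  # 2.5, exactly representable in binary
b = round(2.39999, 1)  # 2.4, NOT exactly representable

# The subtraction carries along the tiny error hidden in b.
raw = a - b

# Rounding the result gives the closest binary number to 0.1.
fixed = round(raw, 1)

print(raw == 0.1, fixed == 0.1)
```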
David Hudson, on 4/26/00 4:03 AM, said:
>So why is it that the results of the same sum on the same machine
>are affected merely by its being carried out in a different record?
>Baffles me.
Because the values in the records are actually different by a small
amount that you did not see. If you could prove otherwise, you'd have a
genuine bug.
On Wed, Apr 26, 2000, Cannon and Nicole Smith wrote:
>I'm jumping in here because it just hit me--do I need to worry about
>this. If the user always only types in numbers to the second decimal
>place (as in dollars and cents) do I need to worry about rounding or
>truncating?
>My assumption is no, but after all I've read here, I'm not sure
>anymore.
Your doubts are putting you on the right track. Yes, you need to be rounding.
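
A sketch of why, again in Python under the assumption that it behaves like any IEEE 754 double arithmetic: every amount is typed to 2 decimal places, but the running total still drifts, so it has to be rounded. Keeping the amounts as whole cents is shown as one alternative; that is a common technique, not something 4D does for you.

```python
# Ten entries of 0.10, each typed to exactly 2 decimal places.
total = 0.0
for _ in range(10):
    total += 0.10

# The total is not 1.0 until it is rounded back to cents.
rounded_total = round(total, 2)

# Alternative: keep whole cents as integers, which are always exact.
cents = sum([10] * 10)

print(total, rounded_total, cents)
```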
On Wed, Apr 26, 2000, Kurt Fujio wrote:
>That's nonsense and just plain wrong. I'd like to know why you don't
>believe zero can be represented as a digital number.
He never said that 0 cannot be represented as a digital number. That's
not the point at all. The point is that when you subtract two other
numbers which CANNOT be represented exactly in binary, you don't get 0;
you get a small number that is a result of the roundoff from the
inaccuracy in the representation of the original numbers.
>A "digital", binary zero still equals zero, no matter how its
>represented as a digital number electronically.
Yes, of course. And this is completely irrelevant to the point at hand.
>In 4D, when a zero result is expected, but a non-zero result is
>achieved, it is due to errors not related to the representation or
>storage of individual numeric values, but rather to operations
>performed on those digital numbers.
This is just plain incorrect. It may be true in the case Dan is
presenting, but it is most definitely NOT true in general.
>The 4D math errors that user/developers are very legitimately unhappy
>about are due to 4D's sometimes inaccurate calculations using numbers
>within the specified ranges of the language definition. Not, as you
>claim, because user/developers are using numbers outside of 4D's
>specified numeric value ranges for 4D fields and variables.
He didn't say that either. Being "in range" and being "exactly
representable" are two completely different things. And all the errors
I've seen discussed so far are precisely attributable to the conversions
between binary and decimal representation.
>Following your logic and zealotry for 4D, you would be arguing that
>"it's disconcerting that 4D's record indexes may sometimes be
>corrupted, and give you the wrong search and sort results, but you
>can be assured that the underlying data is intact, if you write your
>own routine to search and sort that does not use 4D indexes".
This is just an empty burst of hot air. I should probably have let it go
without comment ;-)
>The 4D arithmetic topic and discussion is more than ten years old,
>and it's sad that while 4D has not improved in this regard, the
>overzealous defense and acceptance of 4D's imperfections by some
>individual 4D users continues to grow unbounded.
So, are you going to claim that ***I'm*** a 4D apologist or zealot? That
would just be too funny for words!
>In the good old days, 4D developers respected and looked for better
>solutions and results.
>
>Now, sadly, we have make-believe 4D gurus pretending to have some
>kind of knowledge or wisdom that novice 4D users lack, when, in
>reality, many new 4D users have fewer limiting preconceptions and
>bring greater insight and far more valuable perspective to the
>realities regarding both 4D's relative strengths and true weaknesses
>or limitations.
Well, it would be nice if 4D would provide a solution to this problem,
preferably one that lets us choose between slower exact decimal
representations and faster "plain" binary math. But in the meantime we
have to learn about and live with the limitations of binary
floating-point math. Blaming 4D doesn't cut it. Some higher-level
languages provide that which you demand; lower-level languages like C and
its derivatives do not. 4D is somewhere in the middle in many ways.
Assuming that 4D will hide all the details from you in the way that Excel
does is just not a reasonable assumption. As a feature request it makes
some sense. But your vitriol and name-calling do not make any sense at all.
Scott Ribe
scott_ribe@killerbytes.com
http://www.scott.net/~sribe
(303) 665-7007 voice
**********************************************************************
4th Dimension Networked Users Group (4D NUG)
FAQ: http://www.4dnetizens.com/4d-nug/4d_mailing_list_faq.html
Admin: Karen R Sabog <4d-admin@4dnetizens.com>
Resources: http://www.4dnetizens.com/4d-nug/resources.html
**********************************************************************