What Does the Pentium Do, and When Does it Do It?
_____________________________________

A PC Week Labs White Paper
16 December 1994

By Peter Coffee
Advanced Technologies Analyst

Independent tests by PC Week Labs show that neither Intel nor IBM has been a consistent source of relevant information concerning the real-world effects of the Pentium bug. Intel's and IBM's statements have been true, but critical details have been left out: IBM's statements make the problem seem much larger than it is, while Intel diminishes the problem by ignoring obvious facts about ways real users work.

Based on the PC Week Labs Float Divide Verification Test Suite (see Appendix), we predict that the typical spreadsheet user will incur an error not every 24 days, as predicted by IBM, nor every 27,000 years, as asserted by Intel, but on the order of once every two months to 10 years. Such errors will always be far below the level of accuracy assumed in most business decision making, and will rarely appear even in dollars-and-cents calculations working with amounts in the millions or billions.

We believe, based on IBM documents, that IBM's more pessimistic prediction is based on the frequency of calculations vulnerable to the error, rather than being based on actual frequency of errors. We observed errors, using numbers with one digit each to the right and the left of the decimal point, roughly 200 times less often than warned of by IBM--which is nonetheless 200,000 times as often as predicted by Intel.

We agree with the IBM study's assumption that numbers close to 1 are used far more often by typical PC users than extremely large or small values. This is not a trivial matter. Suppose, for example, that the Pentium could not divide correctly by 2 or 10. Even though these are only two of the vast number of values that a Pentium can represent, such a flaw would clearly make the chip unacceptable to business users.
Once we agree that some numbers are used more often than others, the only question is the degree of concentration.

Background and Description of Problem

For decades, those who know something about computers have derided the popular phrase "computer error" as blaming the wrong part of the system. "Computers don't make errors," they say. "Programmers make errors."

When the rare exception comes along, both users and software developers react with not just surprise, but fury. Getting software right is hard enough when the hardware is functioning perfectly. However, it's impossible when correct instructions return incorrect results.

It's not surprising, therefore, that Intel Corp. has found itself a target of anger from both the intensive and the casual user communities--the former because they knew they had a problem, the latter because they weren't sure that they didn't.

The controversy began when an independent study of supercomputer-type applications found that Intel's Pentium microprocessor made reproducible floating-point division errors with certain ranges of operands. At first, these errors appeared to be confined to the ninth significant figure or beyond, making them irrelevant to the vast majority of users.

Additional exploration, however, found errors in the fifth significant figure, which could affect digits well to the left of the decimal point when using numbers of middle-millions magnitude. The discontinuous nature of these errors, moreover, made it possible that a monotonic (either increasing or decreasing) function might suddenly seem to develop dips or wiggles that could ruin the results of search or optimization algorithms.

Implications for Users

The results described in our opening paragraphs have different implications for different classes of user. Users who are vulnerable to small but cumulative errors should definitely avoid Pentium-based machines for applications such as engineering or scientific simulation.
In financial calculations, the Pentium's errors can be far in excess of its specified accuracy. Business users who have relied without concern on number-crunching tools such as spreadsheets may consider the resulting risk unacceptable.

We note, however, that it is unrealistic to use financial calculations as examples of the Pentium flaw (with worst-case errors, for example, in the tens of dollars in a problem involving hundreds of thousands of dollars). This is because floating-point math has long been recognized as inappropriate for financial calculations, which are better done using fixed-point representations of known accuracy. In this regard, the Pentium flaw is a matter of degree, not of kind.

Alternatives to the Pentium are an accurate x86 chip such as the 486DX4 or non-Intel equivalent; a non-x86 chip such as the SPARC; or a Pentium with its numeric hardware disabled by software (reducing performance, and assuming that one's applications work correctly without floating-point hardware available).

But for network file servers, and for users of word processing, database, and multimedia entertainment software, we believe that the Pentium flaw is of no concern except to the degree that it may affect the resale value of one's machine. This effect will depend entirely on the attitude of system vendors and of Intel in assuring buyers that a flawed Pentium will be replaced on demand at any time.

Can Intel Afford to Come Clean?

Many agree that interest in the Pentium problem might have never reached its current level if Intel had offered a replacement or rebate. The question is, could Intel afford this option?

Based on the most recent version of the manufacturing cost model developed by Microprocessor Report, the estimated direct cost of a Pentium chip is somewhere around $180 (and on its way down toward $100 with improved fabrication processes). In the worst case, there might be roughly 6 million flawed chips out in the user community.
Most of these chips are at work in network servers or in other applications that literally never use the floating-point hardware. So Intel might find as many as 1 million or 2 million users who would rather have a new chip, while 3 million or 4 million others would happily accept a check. The other 1 million or 2 million, we conjecture, would never bother to do the paperwork but would be satisfied with the knowledge that the resale value of their PCs had been somewhat preserved.

Overall, this would cost Intel about $550 million, compared to net profits of $659 million in the third quarter of 1994 alone. And remember, this is meant to be something close to a worst-case analysis. This would reduce Intel's earnings for a single year by roughly 20 percent, letting the company go forward with an outstanding reputation for standing behind its products.

What about the alternative, apparently now in effect, of denying the importance of the bug and grudgingly replacing chips for users who can show that they depend on floating-point accuracy? Here we might be talking about 1 million replacements costing $180 million, plus overhead costs of, say, $10 million, saving the company roughly $360 million in the short run. But how much of a dent in buyer confidence can Intel afford before this strategy nets out as more costly than cheerful replacement on demand?

Intel says that roughly one-fourth of its revenues come from the Pentium chip, which means (based on published figures for revenues and chip prices) that it is selling between 1.5 million and 2 million chips per quarter. This is consistent with estimates from other sources. With competitors such as Advanced Micro Devices Inc. offering both high-performance 486 chips now and higher-performance Pentium-class chips early next year, suppose that marketplace anger reduced Intel's unit sales of Pentium chips alone by 30 percent during the coming year?
That would multiply out as a baseline volume of 1.75 million chips per quarter, times four quarters, times a 30 percent reduction, times the margin per chip (about $238, according to figures published by Microprocessor Report). This totals almost exactly $500 million, right in the ballpark of our estimate for the cost of making people happy right now--but without nearly the same positive public relations effects.

Why There's a Problem

Why is it hard to do grade-school math on an expensive computer? Because humans usually work with powers of 10 (1, 10, 100, and so on) while computers work with powers of two (1, 2, 4, 8, and so on).

This difference isn't a problem when counting whole numbers of things: The number 27 in base 10 (two 10s plus seven 1s) is a longer but numerically identical 11011 in base 2 (one 16 plus one 8 plus zero 4s plus one 2 plus one 1).

On a computer, fixed-point math takes advantage of this good behavior by working, for example, with thousandths of a dollar as the fixed-size unit. Fixed-point math has no errors as long as it is working with numbers that are in the intended range.

But scientists and engineers work with numbers that range from hugely big to itty-bitty small, and for them a fixed-point representation would be far too bulky. The number of miles that light travels in a year, for example, can be written in base 10 with only 13 digits, but in base 2 it requires 43 1s and 0s--and that assumes that the mile is the smallest unit of distance that will be used. Changing this assumption would change the representation of every number used in the problem.

Engineers and scientists, therefore, write their numbers as some small number (at least one but less than 10) multiplied by a power of 10. For example, 5,280 becomes 5.28 x 10^3. But fractions are tricky.
For example, a common decimal value such as 0.1 cannot be written with any finite number of digits to the right of a base-2 "binary point." One-tenth can be approximated by a binary fraction, 0.000110011 (one 16th plus one 32nd plus one 256th plus one 512th, or 0.099609375), but this still leaves an error of about 0.4 percent. Additional digits can make the error as small as desired, but can never make it disappear completely.

Floating-point numbers, therefore, are a flexible but imperfect tool, designed to handle enormous ranges of values. Intel's statements about the risk of errors have been based on uniform random sampling over that entire enormous range, ignoring the fact that most non-technical users generally use numbers in a modest range such as 0.01 to 100.0--with conventions such as "dollars in thousands" as everyday signposts of that preference.

Technical Testing--What and How

PC Week Labs devised the Float Divide Verification Test Suite to investigate floating-point divisions in this range, focusing for reasons of time on the range from 0.1 through 9.9 at increments of 0.1. We chose groups of four values (call them a.b, c.d, e.f, and g.h) from this range, in every possible combination, and looked at every possible division of the form (a.b/c.d) / (e.f/g.h).

To look at additional digits quickly increases the time required. For example, it takes 10 times as long to work with two digits to the right of the decimal point in just one of the four positions in this formula. We did also look at cases of the form (a.b/c.d) / (e.fx/g.h) and at cases of the form (a.b/c.d) / (e.f/g.hy), finding additional errors in the process. We estimate that it would take 46 days to study (a.bp/c.dq) / (e.fx/g.hy) on a 60MHz Pentium machine, but we do not believe that the results would materially affect our conclusions.

We also examined (a.b [+-*] c.d) / (e.f [+-*] g.h), and found no cases worse than our base case using only divisions.
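
The growth in the size of the test space can be sketched with a small counting routine. This is only an illustration, not part of the PC Week Labs source: the helper names are hypothetical, and it assumes that an operand with two digits to the right of the decimal point runs from 0.01 through 9.99 at increments of 0.01 (999 values, against 99 values for the one-digit range).

```c
#include <stdio.h>

/* Values from 0.1 through 9.9 at increments of 0.1 */
#define ONE_DIGIT_VALUES 99ULL
/* ASSUMPTION: values from 0.01 through 9.99 at increments of 0.01 */
#define TWO_DIGIT_VALUES 999ULL

/* Count the (a/b)/(c/d) cases when `wide` of the four operands carry
   two fractional digits and the others carry one.
   Hypothetical helper, not taken from FDV. */
unsigned long long fdv_case_count(int wide)
{
    unsigned long long count = 1ULL;
    int pos;

    for (pos = 0; pos < 4; pos++) {
        count *= (pos < wide) ? TWO_DIGIT_VALUES : ONE_DIGIT_VALUES;
    }
    return count;
}

/* Print the size of the test space for 0 through 4 wide operands */
void fdv_print_case_counts(void)
{
    int wide;

    for (wide = 0; wide <= 4; wide++) {
        printf("%d two-digit operand(s): %llu cases\n",
               wide, fdv_case_count(wide));
    }
}
```

Calling fdv_print_case_counts() from a main() lists the five totals. Under the stated assumption, each operand widened to two fractional digits multiplies the number of cases by 999/99, or roughly 10, which is why the full four-operand study is so much more expensive than the base case.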
We tested our program on a wide range of processor architectures, under many different operating systems, using several different compilers for a C version of an algorithm that was initially developed in Ada. Tested compilers and platforms included GNU C++ under SunOS 4.1.3 (on a Sparcstation) and under Linux; Watcom C386 9.0, Microsoft Visual C++ 1.50, 1.51, and 2.0, and Alsys Ada 5.1.3 under DOS 6.2 and Windows NT 3.5 Workstation; Hewlett-Packard's C compiler under HP/UX (on a PA-RISC-based HP 710 workstation); and Watcom C/C++ 10.0 under OS/2 2.1. The DOS and OS/2 versions using C and the DOS version using Ada were tested on both Pentium and 486DX machines.

We found no errors larger than roughly 10^-14 on any configuration not using a Pentium processor. Regardless of language or operating system, however, Pentium machines yielded 2,184 significant errors in 96,059,601 interesting cases (defined as cases not involving 0/x or x/0). The largest errors observed were in the 10th decimal place, representing roughly 10,000 times the inaccuracy that a user would reasonably expect from such calculations.

The observed error rate of more than two errors per 100,000 possible divisions is 204,622 times as many errors as Intel implies with its statement that a random division operation has only a 1 in 9 billion chance of producing an error.

Intel's use of statistical language may also lead to an unconscious assumption on the part of a user that this error is like other errors in life--that if a mistake occurs, one can simply try again. On a Pentium, however, doing it again doesn't mean another chance to get it right: If a given problem returned an incorrect answer the first time, that problem will always return that same wrong answer on that Pentium or any other with the flaw of the current design.

Accessing PC Week Labs' Tools

The PC Week Labs Float Divide Verification Test Suite can easily be modified to look at other variations on the (a/b)/(c/d) problem format.
For example, problems of the form (a-b) / (c-d) yielded 624 errors (or fewer than 1 per 100,000), which is still roughly 60,000 times the frequency of error predicted by Intel.

Portable C source code is attached as an Appendix to this PC Week Labs White Paper. Electronic versions of the source code, its resulting DOS executable, and this white paper can be obtained from CompuServe by typing GO PCWEEK (the software is fdv.zip in Library 2, Labs/Netweek) or on the Internet via FTP at www.ziff.com, in the file /pub/pcweek/fdv/fdv.zip. The software can also be accessed on the World-Wide Web at http://www.ziff.com/~pcweek.

Questions or comments may be directed to Peter Coffee of PC Week Labs at MCI Mail address 357-1756, CompuServe address 72631,113, or via the Internet through gateways to these services. Verbal inquiries may be directed to Peter Coffee at (310) 371-8096 in Torrance, Calif., or to Eamonn Sullivan at (617) 393-3841 or David Berlind at (617) 393-3928, both of PC Week Labs, in Medford, Mass.
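
The comparisons with Intel's stated error rate quoted in this paper can be checked with a few lines of C. This is a sketch under stated assumptions, with hypothetical helper names that do not appear in the FDV source. In particular, the paper does not give the number of trials for the (a-b) / (c-d) variant, so the sketch assumes the same 96,059,601 cases as the base test.

```c
#include <stdio.h>

/* Intel's stated odds: one error per 9 billion random divisions */
#define INTEL_DIVISIONS_PER_ERROR 9.0e9

/* Ratio of an observed error rate (errors per trial) to Intel's stated
   rate. Hypothetical helper, not taken from FDV. */
double fdv_ratio_to_intel(unsigned long errors, unsigned long trials)
{
    double observed_rate = (double) errors / (double) trials;

    return observed_rate * INTEL_DIVISIONS_PER_ERROR;
}

/* Print both comparisons quoted in the text */
void fdv_print_ratios(void)
{
    /* Base case: 2,184 errors in 96,059,601 trials */
    printf("(a/b)/(c/d): %.2f times Intel's rate\n",
           fdv_ratio_to_intel(2184UL, 96059601UL));

    /* ASSUMPTION: the (a-b)/(c-d) variant used the same trial count */
    printf("(a-b)/(c-d): %.2f times Intel's rate\n",
           fdv_ratio_to_intel(624UL, 96059601UL));
}
```

The first ratio comes out just under 204,623, which matches the truncated figure of 204,622 given above. Under the assumed trial count, the second comes out a little above 58,000, consistent with "roughly 60,000."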
Appendix

Source Code for PC Week Labs Float Divide Verification Test Suite

/****************************************************************************
-- PC Week Labs Float Divide Verification Test Suite (c)PC Week, 1994
-- Program FDV.EXE
--
-- This program attempts to verify IBM's reported Pentium error rates
-- in floating-point divisions using ratios of small decimal values
--
-- Exhaustively tests all n.m cases not requiring a division by zero
-- and not (trivially) using a zero in a numerator
--
-- Designed by Peter Coffee, PC Week Labs
-- Translated to C from the original Ada by Eamonn Sullivan, PC Week Labs
--
-- May be freely distributed but may not be sold
-- Attributions may not be removed from source code or output
------------------------------------------------------------------------------
Compatibility Notes

When using the Visual C++ 1.51 compiler under Windows 3.x, you must add the
/Op ("improve float consistency" in the Project settings) option when
compiling the "release" version. VC++ appears to silently reduce precision
when "Full Optimization" is turned on. The Watcom C/C++ compiler was also
tested with all optimizations disabled due to similar observed effects with
its default optimizations. PC Week Labs has attempted to determine the best
possible combination of optimizations that does not suppress some known
errors.

We have tested this code with the following compilers and operating systems:
-- GNU C++ under SunOS 4.1.3 (on a Sparcstation) and under Linux
-- Watcom C386 9.0 and Microsoft Visual C++ 1.50 and 1.51 under DOS 6.2
-- Hewlett-Packard's C compiler under HP/UX (on a PA720 processor)
-- Watcom C/C++ 10.0 under OS/2 2.1
------------------------------------------------------------------------------
*/

#include <stdio.h>   /* printf */
#include <math.h>    /* pow, fabs */

/* variables to test well-known ratio revealing Pentium bug */
double TestNum, TestDen, TestVal;

/* variables to test financial problem in IBM Pentium Study */
double YearsPerMonth, TaxRate, Tax, TakeHome, JobYears, PerAnnum;

/* variables to generate and test small decimal quotients */
double Num1, Num2, Den1, Quo1, Den2, Quo2, Dif;
unsigned long Trials = 0;
unsigned long Errors = 0;

/* loop counters for quotient components */
int i, j, k, l;

/* declaration for error threshold (added in C, not needed in Ada) */
double LSBError;

/****************************************************************************/

int main(void)
{
    printf("PC Week Labs Float Divide Verification Suite (c)PC Week, 1994\n\n");

    /*
    -----------------------------------------------------------------------------
    -- Brief verification that we're on a buggy chip:
    -- Set up for well-known calculation that shows an error in the
    -- fifteenth bit of the binary result
    -----------------------------------------------------------------------------
    */
    TestNum = (double) 4195835.0;
    TestDen = (double) 3145727.0;
    printf("On accurate machines, the following is 0: %g\n\n",
           ((TestNum/TestDen)*TestDen - TestNum));

    /*
    -----------------------------------------------------------------------------
    -- Set up to verify IBM example:
    -- The next six values verify the IBM report's example
    -- of non-trivial errors in a realistic financial calculation
    -----------------------------------------------------------------------------
    */
    YearsPerMonth = 0.083333333;
    TaxRate = 0.1466667;
    Tax = 96000.0 * TaxRate;
    TakeHome = 96000.0 - Tax;
    JobYears = 22.5 * YearsPerMonth;
    PerAnnum = TakeHome / JobYears;
    printf("On accurate machines, the following is 43,690.6651: %.4f\n\n",
           PerAnnum);

    /* define tolerable least-significant-bit error */
    LSBError = pow(2.0, -46.0);
    printf("Suppressing errors of magnitude not greater than %8.5e\n\n", LSBError);
    printf("Testing quotients of form (a/b)/(c/d): a,b,c,d in [0.1,9.9] by 0.1\n\n");
    printf("Value of \"a\" is printed when incremented as progress check\n\n");
    printf("Output format is \"error: a, b, c, d\"\n\n");

    for (i = 1; i < 100; i++) {
        Num1 = (double) (i / 10.0);
        printf("%.1f\n", Num1);    /* this tracks error-free regions */
        for (j = 1; j < 100; j++) {
            Den1 = (double) (j / 10.0);
            Quo1 = (double) Num1 / Den1;
            for (k = 1; k < 100; k++) {
                Num2 = (double) (k / 10.0);
                for (l = 1; l < 100; l++) {
                    Den2 = (double) (l / 10.0);
                    Quo2 = (double) Num2 / Den2;
                    Trials++;
                    /* residual of dividing by Quo2 and multiplying back */
                    Dif = ((Quo1 / Quo2) * Quo2) - Quo1;
                    if (fabs(Dif) > LSBError) {
                        Errors++;
                        printf("Error is %8.5e: %3.1f, %3.1f, %3.1f, %3.1f\n",
                               Dif, Num1, Den1, Num2, Den2);
                    }
                }
            }
        }
    }

    printf("%lu errors in %lu trials.\n", Errors, Trials);
    return 0;
}

________________________________________________________________________________