AMD Processors
Topic Title: FYL2X bug
Topic Summary: FYL2X produces the wrong result when computing ln(6)
Created On: 06/29/2010 06:13 PM
 06/29/2010 06:13 PM
smcpeak
Lurker

Posts: 3
Joined: 06/29/2010

I've discovered what appears to be a bug in the Opteron implementation
of the FYL2X instruction.

Consider the attached C code. It computes the natural logarithm of 6,
printing it as decimal and the raw 80-bit hex bytes (little endian).

When run on an Opteron Processor 270, it produces the following output:

decimal: 1.791759469228055000897606441335
hex: 0x92110051d15f58e5ff3f

When run on an Intel Xeon, it instead produces:

decimal: 1.791759469228055000789186224086
hex: 0x91110051d15f58e5ff3f

The hex values are little endian. The difference is in the least
significant two bits (first byte); 0x92 vs 0x91.

Comparing the decimal values to the output of 'bc':

$ echo 'scale=30; l(6)' | bc -l

we see:

AMD Opteron: 1.791759469228055000897606441335
bc: 1.791759469228055000812477358380
Intel Xeon: 1.791759469228055000789186224086

The Intel answer is closest to the exact answer, so is the correct
answer in the default (and current) "round to nearest" mode.

Tracing the assembly code for this program, it boils down to a single
FYL2X instruction. On both processors, the top two entries of the FP
register stack just before it executes are (the hex here is big
endian):

R7: Valid 0x3ffeb17217f7d1cf79ac +0.6931471805599453094
R6: Valid 0x4001c000000000000000 +6

R7 is ln(2) and R6 is 6. FYL2X computes ST(1) * log2(ST(0)), here
ln(2) * log2(6) = ln(6), and replaces both operands with the result,
which in the case of the Opteron is wrong (value shown above).

This bug is annoying because it means the calculations I'm doing are
hardware-dependent. The larger system in which these calculations
appear flags non-determinism in an attempt to catch software bugs, but
it's flagging this difference in behavior too, which is introducing
noise.

Anyone else seeing this behavior? Does anyone know of a more official
AMD channel to report this? (I doubt my hardware OEM is going to care.)

Code:
#include <math.h>            // logl
#include <stdio.h>           // printf

int main()
{
  long double d = logl((long double)6);
  unsigned char *p = (unsigned char *)(&d);
  int i;

  printf("decimal: %.30Lf\n", d);

  printf("hex: 0x");
  for (i=0; i < 10; i++) {
    printf("%02x", (int)p[i]);
  }
  printf("\n");
  
  return 0;
}
 07/07/2010 11:02 PM
MU_Engineer
Dr. Mu

Posts: 1837
Joined: 08/26/2006

I compiled and ran this on my hardware, since I run Linux on everything and I have a lot of old hardware.

1. AMD Athlon 64 X2 4200+ (similar age as your Opteron 270) Gentoo amd64
decimal: 1.791759469228055000789186224086
hex: 0x91110051d15f58e5ff3f

2. Intel Xeon 3.20 GHz, 533 MHz FSB, 2 MB L3 (Gallatin) Gentoo i686
decimal: 1.791759469228055000789186224086
hex: 0x91110051d15f58e5ff3f

3. Intel Core 2 Duo U7500, Gentoo amd64
decimal: 1.791759469228055000789186224086
hex: 0x91110051d15f58e5ff3f

4. AMD Athlon XP 3200+, Debian 5.0 i386
decimal: 1.791759469228055000789186224086
hex: 0x91110051d15f58e5ff3f

5. Texas Instruments TI-89 (Motorola MC68EC000), OS 2.09
decimal ln(6): 1.79175946923

They all get the exact same answer, despite being different chip manufacturers, generations, and OSes. I wonder if you are having some software issue. What OS, compiler, and C library are you using on the Opteron and the Xeon?

 07/22/2010 04:42 PM
smcpeak
Lurker

Posts: 3
Joined: 06/29/2010

It's not a software problem. The exact same binary, statically linked, produces the differing answers shown above. I tried with binaries compiled on various machines, and all give the same results.

Nevertheless, more on the Opteron's software:
OS: Fedora Core release 4 (Stentz)
Compiler: gcc 4.0.0
C library: glibc 2.3.5

And details I omitted on the Opteron hardware:
$ cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 270
stepping : 2
cpu MHz : 1993.798
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy
bogomips : 3915.77
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 1
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 270
stepping : 2
cpu MHz : 1993.798
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy
bogomips : 3981.31
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp
 07/22/2010 04:43 PM
smcpeak
Lurker

Posts: 3
Joined: 06/29/2010

And the Xeon's software:

OS: Red Hat Enterprise Linux Client release 5.2 (Tikanga)
compiler: gcc 4.1.2
C library: glibc 2.5
 11/29/2010 11:39 AM
dolsh
Lurker

Posts: 2
Joined: 11/27/2010
