AMD Processors
Topic Title: AMD64 Question for FPU type projects
Created On: 01/14/2004 05:50 PM
 01/14/2004 05:50 PM
b2riesel
Junior Member

Posts: 15
Joined: 01/12/2004

First off, I'm Lee Stephens, the project originator for www.rieselsieve.com. As the name suggests, we started off mainly as a sieving project for the Riesel Problem. It was quickly noted that AMD chips mopped the floor with P4 chips when it came to pure integer math.

Second, I'm not here to recruit...just to ask a question of those of you who have AMD64 systems. I'm currently seeking a 20/100 ratio for our member resources between sieving...which, as mentioned, is very strong on AMD chips...and LLR testing...which, due to almost pure FPU math, is currently dominated by the P4s. LLR stands for Lucas-Lehmer-Riesel and is used to test numbers for primality in the same general way as PRP and Woltman's Prime95. I've always been an AMD guy...well, after the K6-II, which I wasn't a big fan of, but still...for the last several years it has been AMD all the way for me. I currently have 18 systems...some webservers and whatnot...but all AMD.
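For anyone who hasn't met the LLR test before, here is a minimal Python sketch of the textbook Lucas-Lehmer-Riesel test for N = k*2^n-1. It is not the LLR.exe code: the function names are made up for illustration, it only covers the simple case where k is odd, not divisible by 3, and smaller than 2^n (with n at least 3), and it relies on Python's built-in big integers, whereas the real program uses FFT-based multiplication for the squarings.

```python
def lucas_v(k, p, modulus):
    """Return V_k(p, 1) mod modulus, where V_0 = 2, V_1 = p,
    and V_i = p * V_(i-1) - V_(i-2)."""
    prev, cur = 2 % modulus, p % modulus
    if k == 0:
        return prev
    for _ in range(k - 1):
        prev, cur = cur, (p * cur - prev) % modulus
    return cur


def llr_is_prime(k, n):
    """Lucas-Lehmer-Riesel test for N = k*2^n - 1.

    Illustrative only: handles k odd, k not divisible by 3,
    k < 2^n and n >= 3. Other cases need a different starting value.
    """
    if k < 1 or k % 2 == 0 or k % 3 == 0:
        raise ValueError("this sketch needs k odd and not divisible by 3")
    if n < 3 or k >= 2 ** n:
        raise ValueError("this sketch needs n >= 3 and k < 2^n")

    N = k * 2 ** n - 1
    if N % 3 == 0:
        return False  # 3 divides N, so N is composite

    # Starting value u_0 = V_k(4, 1) mod N.
    u = lucas_v(k, 4, N)

    # n - 2 modular squarings: this loop is where nearly all of the
    # time goes when n is in the hundreds of thousands.
    for _ in range(n - 2):
        u = (u * u - 2) % N

    return u == 0


if __name__ == "__main__":
    for k, n in [(5, 4), (7, 5), (5, 6)]:
        verdict = "prime" if llr_is_prime(k, n) else "not prime"
        print(f"{k}*2^{n}-1 is {verdict}")
```

Each pass through the loop squares a number about as long as N itself, so an exponent near 765,000 means roughly 765,000 squarings of numbers that are hundreds of thousands of bits long. That is the part the real client hands to floating-point FFT routines.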

Now on my project I've been highly tempted to buy a P4 system due to its much stronger performance on the FPU-heavy LLR.exe program we use to find primes. My question is: how does the AMD64 chip compare to, say, a 3.06GHz P4 on these tests? If it's even close, I'll buy an AMD64 as my next chip and keep on being an AMD-only guy...I'd hate to have to buy a few machines just because of one little specialty that Intel has so far dominated in our project.

If any of you could take the time to run a few tests on your AMD64 to give me a benchmark, I'd be most appreciative.

You can visit our project at http://www.rieselsieve.com. Contact me by email from that site or visit us at irc.freenode.net #rieselsieve; we would be very interested in the results. Many of us are at an upgrade cycle and would like to see a comparison test before we make our next purchasing decision.

Thanks everyone for your time to read this,

Lee Stephens
B2
www.rieselsieve.com
 01/14/2004 06:10 PM
jes
Senior Member

Posts: 1134
Joined: 10/22/2003

Can you tell us which program you want us to download, and also what command line to use for benchmarking (since I assume you'll want us all to be using the same thing)?

-------------------------
The opinions expressed above do not represent those of Advanced Micro Devices or any of their affiliates.
http://www.shellprompt.net
Unix & Oracle Web Hosting Provider powered by AMD Opterons
 01/14/2004 06:16 PM
Ardrid
Heavy Wizardry

Posts: 12398
Joined: 10/08/2003

I'm surprised the Athlon isn't whooping up on the P4 as the Athlon is an FPU beast, hence its performance in ScienceMark.

-------------------------
Intel Core i7 860
ASUS P7P55D-E Pro
Corsair HX650W
Corsair XMS DDR3-1333 (4GB @ 8-8-8-24)
Sapphire Radeon HD 6870
Western Digital VelociRaptor 300GB
Western Digital Caviar Black 1TB
 01/14/2004 06:51 PM
jes
Senior Member

Posts: 1134
Joined: 10/22/2003

I can't seem to find any data files for the LLR executable, it's late and my head is fuzzy...

 01/14/2004 07:32 PM
b2riesel
Junior Member

Posts: 15
Joined: 01/12/2004

I'm sorry...I guess if you've never been to the site it's a bit confusing since we started out as a forum only site and are now going to a PHP based site.

LLR can be downloaded from the Download Section (http://www.rieselsieve.com/dload.php). Just choose the OS you have...if you have FreeBSD without Linux compat, I'm currently recompiling a client with the newer gcc.

For a range I would prefer this one: http://www.b2project.com/llr/765252.txt since I already have the residuals back and know what kind of machine it ran on and the times.

LLR.exe should be pretty easy to understand...just point it toward the file that you want to process...the first line in the text file you download for testing tells it what type of test to run. If it doesn't start the first time you enter the file name...sometimes it takes two tries. A new version that doesn't do this will be out soon. LLR was not written by me or my staff. It was written by Jean Penne several years ago and he's in the process of making it even faster...which is amazing in itself.

Thanks for any help you guys can give on benchmarking.

Lee Stephens
B2
www.rieselsieve.com


Oh...and for returning the residuals...just paste them in the forums under Residual Submission, or just in any message anywhere is fine.
 01/14/2004 07:46 PM
jes
Senior Member

Posts: 1134
Joined: 10/22/2003

Ahhh great...it was the data file I was missing out on....ok I've kicked that off running on my Dual 244's (on one processor...the other one is processing a molecular structure).

I'll post the results when they pop out...

 01/14/2004 07:55 PM
b2riesel
Junior Member

Posts: 15
Joined: 01/12/2004

One note....LLR is very dependent on FSB and runs faster the more memory you can feed it. I have no idea how the processing of the molecular structure will affect the speed...I'm not sure how whichever client you are using uses memory. If it was RC5 or ECC2 I know the memory footprint would be tiny and not skew the results at all...but I'm interested in dualie speeds too.


Lee Stephens
B2
www.rieselsieve.com

 01/14/2004 08:18 PM
jes
Senior Member

Posts: 1134
Joined: 10/22/2003

Well, the machine has 4GB of memory, 2GB to each CPU (running an SMP, NUMA-aware kernel). The footprint of the other program is extremely small (it's a custom application that has been fine-tuned to stay within the cache at all times; it's more computationally expensive than memory intensive).



 01/14/2004 08:26 PM
b2riesel
Junior Member

Posts: 15
Joined: 01/12/2004

QUOTE (jes @ Jan 14 2004, 05:18 PM) Well, the machine has 4GB of memory, 2GB to each CPU (running an SMP, NUMA-aware kernel). The footprint of the other program is extremely small (it's a custom application that has been fine-tuned to stay within the cache at all times; it's more computationally expensive than memory intensive).
sweet
 01/14/2004 09:01 PM
Brian128
Senior Member

Posts: 2644
Joined: 11/06/2003

I didn't run a complete test, but here are the only results I got (since I'd rather let my machine crunch f@h):
[Wed Jan 14 20:55:43 2004]
362609*2^765088-1 is not prime. Res64: 02F9C12035F27B62 Time : 3726.829 sec.

 01/14/2004 09:18 PM
b2riesel
Junior Member

Posts: 15
Joined: 01/12/2004

QUOTE (Brian128 @ Jan 14 2004, 06:01 PM) I didn't run a complete test, but here are the only results I got (since I'd rather let my machine crunch f@h):
[Wed Jan 14 20:55:43 2004]
362609*2^765088-1 is not prime. Res64: 02F9C12035F27B62 Time : 3726.829 sec.
Actually, that was a complete test...one number is fine with me...it gives me a good measure of what the chip can do. Running 62 minutes, it did very nicely: several minutes ahead of a P4 3.06 that ran that range the first time. I don't know the differences in load, but his average was very close for each test over a range of 20.

This is a very pleasant surprise indeed.

Seems after the next round of price cuts I'll be buying a 64.


Lee Stephens
B2
www.rieselsieve.com

 01/15/2004 09:41 AM
jes
Senior Member

Posts: 1134
Joined: 10/22/2003

Here are my results on dual 244s:

QUOTE
14:38:35 up 29 days, 14:21,  16 users,  load average: 2.00, 1.99, 1.96

71009*2^765172-1 is not prime.  Res64: 7CE81C5A0CF3BFD8  Time : 4590.035 sec.

Hmmm, so if the calculation should take around the same time regardless of the values involved, then my time sucks compared to Brian's, although in my defense that machine is pretty heavily loaded.
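To put the two posted times side by side, here is a small Python sketch that parses result lines in the format shown in this thread and converts the times. The function name and the regular expression are just illustrations based on the two lines quoted here; other LLR versions may format their output differently.

```python
import re

# Pattern for "not prime" result lines as they appear in this thread.
RESULT_RE = re.compile(
    r"(?P<k>\d+)\*2\^(?P<n>\d+)-1 is not prime\.\s+"
    r"Res64:\s*(?P<res64>[0-9A-F]{16})\s+"
    r"Time\s*:\s*(?P<seconds>[\d.]+) sec\."
)


def parse_llr_result(line):
    """Extract k, n, the 64-bit residue and the run time from one line."""
    match = RESULT_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized result line: {line!r}")
    return {
        "k": int(match["k"]),
        "n": int(match["n"]),
        "res64": match["res64"],
        "seconds": float(match["seconds"]),
    }


# The two results posted above, copied as-is.
brian = parse_llr_result(
    "362609*2^765088-1 is not prime. Res64: 02F9C12035F27B62 Time : 3726.829 sec."
)
jes = parse_llr_result(
    "71009*2^765172-1 is not prime.  Res64: 7CE81C5A0CF3BFD8  Time : 4590.035 sec."
)

if __name__ == "__main__":
    for name, result in (("Brian128", brian), ("jes", jes)):
        minutes = result["seconds"] / 60
        print(f"{name}: n={result['n']}, {minutes:.1f} min")
    # How much longer the loaded dual-244 box took than Brian128's machine.
    print(f"ratio: {jes['seconds'] / brian['seconds']:.2f}x")
```

The two exponents are within about a hundred of each other, so the work per test should be similar. The gap of roughly 23% may therefore reflect the load average of 2.00 on the dual-244 machine as much as the hardware.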

I also noticed that's a 32bit binary, without SSE2 support. I would imagine you'll see quite a good performance boost once that's made 64bit clean. I assume that's on your to-do list?

 01/15/2004 10:52 AM
b2riesel
Junior Member

Posts: 15
Joined: 01/12/2004

QUOTE
I also noticed that's a 32bit binary, without SSE2 support. I would imagine you'll see quite a good performance boost once that's made 64bit clean. I assume that's on your to-do list?


Yes, it is on my list of things to do. If you would like to build a version real quick just for your testing....here is the source code: http://www.mersenne.org/gimps/llrsource.zip
 01/15/2004 12:45 PM
jes
Senior Member

Posts: 1134
Joined: 10/22/2003

Hmmm that doesn't appear to be the full source, since it's linking in object files that were compiled to be 32bit (mixing 32/64 bit object files is not a good idea).

 01/15/2004 02:44 PM
b2riesel
Junior Member

Posts: 15
Joined: 01/12/2004

QUOTE (jes @ Jan 15 2004, 09:45 AM) Hmmm that doesn't appear to be the full source, since it's linking in object files that were compiled to be 32bit (mixing 32/64 bit object files is not a good idea).
Well, the FFTs are in ASM...and I simply can't do that. If you notice, in the source there are many...many unneeded files that have to do with things not present in LLR.exe. PrimeNet is not part of LLR...and neither are many of the other things listed as 'source'. Like I said, I'm not the programmer for this program. It belongs to Jean Penne, who adapted it from George Woltman's GIMPS code. It is basically the Prime95 code with calls to a text file instead of PrimeNet, with the only two files needed for the different math included as llr.c and riesel.c.

The rest of the stuff is not needed in the source. A new version is being worked on that quite possibly is 2x faster and would be a boon to those of us interested in finding primes of the form k*2^n-1. Our project is finite, and any speed-ups that help us complete it in the shortest amount of time are most welcome. That's why I asked about the speed of the AMD64.

We have 7.5 million candidates to test. We can remove them either through sieving...which the AMD XP chips are most excellent at...or through LLR testing. Sieving will never find a prime, but it eliminates thousands of those 7.5 million candidates per day. LLR tests each k/n pair...plugged into the k*2^n-1 formula...separately for primality, and that is time consuming. At the moment I allocate resources so that AMDs, Celerons, and others sieve while P4s, with their higher FSB speeds, do the large FPU math required by LLR. However, since the AMD64 seems to be better at both areas of our project...it is the CPU of choice, since if I want to sieve a while...I can...and if I want to LLR test for a while...I can...and both will be at optimal speeds.
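To make the sieving half concrete, here is a rough Python sketch of the idea: a candidate k*2^n-1 can be thrown out as soon as some small prime p divides it, and that check only needs integer arithmetic modulo p. This is a naive per-candidate trial division written for illustration, with made-up function names; it is not how the project's sieve client works internally, and it assumes every candidate is far larger than the prime limit.

```python
def small_primes(limit):
    """Return all primes <= limit using a plain Sieve of Eratosthenes."""
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if flags[i]:
            # Clear every multiple of i starting at i*i.
            flags[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(flags) if flag]


def find_small_factor(k, n, primes):
    """Return the first prime p in primes dividing k*2^n - 1, else None.

    pow(2, n, p) keeps every intermediate value below p, so the
    candidate itself is never built as a big integer.
    """
    for p in primes:
        if (k * pow(2, n, p) - 1) % p == 0:
            return p
    return None


def sieve_candidates(pairs, limit):
    """Split (k, n) pairs into survivors and those with a factor <= limit."""
    primes = small_primes(limit)
    survivors, removed = [], {}
    for k, n in pairs:
        p = find_small_factor(k, n, primes)
        if p is None:
            survivors.append((k, n))  # still needs an LLR test
        else:
            removed[(k, n)] = p       # composite, no LLR test needed
    return survivors, removed


if __name__ == "__main__":
    survivors, removed = sieve_candidates([(5, 4), (5, 6), (7, 5), (5, 3)], 30)
    print("survivors:", survivors)
    print("removed:", removed)
```

A survivor is not known to be prime; it simply has no factor below the limit and still has to go through LLR. That is why sieving alone never finds a prime, and why the work is almost entirely integer math.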

The newer version of LLR coming out uses a very complex modular reduction that should take away much of the excessive FPU/FSB dependence and even up the AMD XP chips with the P4 chips...so it would also seem to give the AMD64 an even greater advantage.

I can't wait for two things: 1. building a new AMD64 box, and 2. the new client arriving. A project that I was told would take 50 years, and that I said would take 5, may actually be accomplished in 5.


Lee Stephens
B2
www.rieselsieve.com
 01/15/2004 04:03 PM
Logan[TeamX]
Senior Member

Posts: 3185
Joined: 12/07/2003

Mind if I ask a really silly question that I just don't see the answer to: why? Why crunch all this data? What is the purpose in life for all of this work? If you draw the majority of your code and purpose from the GIMPS project, what makes it different?

Thanks

Logan
 01/15/2004 05:20 PM
jes
Senior Member

Posts: 1134
Joined: 10/22/2003

I'm sure that b2riesel will give a far better explanation than I could ever give, but prime numbers can play a huge role in computing, particularly in the area of cryptography....

I seem to remember reading somewhere once that we still don't have the maths to predict where the next prime number will come in the number sequence. In fact, if we could calculate that easily, it could spell disaster for cryptography.

 01/16/2004 07:51 AM
Logan[TeamX]
Senior Member

Posts: 3185
Joined: 12/07/2003

Well, that gives me some groundwork to think about anyways.

Thanks!
 01/16/2004 07:41 PM
jes
Senior Member

Posts: 1134
Joined: 10/22/2003

QUOTE (b2riesel)
Well, the FFTs are in ASM...and I simply can't do that. If you notice, in the source there are many...many unneeded files that have to do with things not present in LLR.exe.


I'm not sure if your programmer is already aware of this, but there is a set of VERY highly AMD64-optimized FFT routines available from AMD. They're part of the AMD Core Math Library, ACML (http://www.developwithamd.com/.../index.cfm?action=home).

 01/16/2004 09:44 PM
b2riesel
Junior Member

Posts: 15
Joined: 01/12/2004

QUOTE (jes @ Jan 16 2004, 04:41 PM) QUOTE (b2riesel)
Well, the FFTs are in ASM...and I simply can't do that. If you notice, in the source there are many...many unneeded files that have to do with things not present in LLR.exe.


I'm not sure if your programmer is already aware of this, but there is a set of VERY highly AMD64-optimized FFT routines available from AMD. They're part of the AMD Core Math Library, ACML (http://www.developwithamd.com/.../index.cfm?action=home).
Didn't know about the AMD64 FFTs...I've passed the word on. I'd be highly interested in the speed differences if this FFT can be used within the framework of LLR.


Now, the LLR test that you guys did for me earlier settled it: I'll be buying an AMD64 next. If one of you could possibly run the sieve on an AMD64 for at least 5 reportings of time...say 20 mins...and tell me the speed obtained, I'd be most happy...My thinking is it should be in linear relation to, say, a 2400+, percentage-wise...but I've had several questions from the guys on the project...everyone is just curious as can be.

Shortcut to the client download for whichever OS you want: http://n137.ryd.student.liu.se/proth_sieve.php

For a range, enter starting range 7000 and ending range 7001. That shouldn't take long to do the entire thing. I can do 7000 to almost 7010 in 24 hours on my 2400+. I'm not interested in running the entire range...just 5 reportings of the time in kp/sec. It should show in the client window, or you can view it in the RieselStatus.dat file. No need to send in the factors it finds, since that range is already done.

Like I said...just curious about the speeds.

Lee Stephens
B2
www.rieselsieve.com