Originally posted by: brane227
1. I don't understand what all the fuss is about. BD is not THAT much better than Ph2. It is a bump on the evolutionary road of the x86 platform, nothing more.
Bulldozer is "only" the first truly new x86 microarchitecture made since the ill-fated NetBurst architecture of the Pentium 4. AMD's current products are very clearly descendants of 1999's K7 core. K8 obviously made certain parts of the core wider and added more SIMD functionality, but the structure of the FPU, AGUs, and pipelines remained largely unmodified. Ditto with Intel's Sandy Bridge- it hails back to 1995's P6 Pentium Pro. It's modified a little bit more compared to P6 than Stars is to K7, but the microarchitecture design is very similar. I'd call the first truly new microarchitecture in 11 years a pretty big deal.
I don't know exactly what you are expecting from Bulldozer, or for that matter, from any new CPU. Silicon microprocessors are a fairly mature product- they've been around for decades. Mature products rarely see anything but incremental improvements. Look at the car you drive- it's not that different from units from the 1930s. Sure, your car may have fuel injection and power steering and brakes, but in the end, it's very similar.
2. Piledriver is just a next-gen chip with improved cores and one complete extra core (or, as AMD's PR department likes to call it, a "module").
...which is exactly as expected, given that Bulldozer is a new microarchitecture. AMD has been tweaking the K7 microarchitecture since 1999 and Intel is still tweaking 1995's P6, so expecting a brand-new microarchitecture every year is absolutely SILLY. Ditto with new die shrinks every year- we are nearing the limits of CMOS technology and the cost of each shrink is increasing exponentially. We will be seeing shrinks LESS often rather than more often.
3. The main bottleneck was not the CPU per se, but the abilities of the OS and utility SW.
You are running Linux. Linux runs on nearly every supercomputer worthy of the name. An 8-core single-socket CPU is NOT putting much of a strain on the OS's ability to handle threads. Shoot, you are a Gentoo user- you should know that CONFIG_NR_CPUS on amd64 goes WAY over 8 (it ranges from 1 to 4096). There are also excellent parallel-programming APIs available for that OS (MPI, for example), else you wouldn't see Linux used so much in HPC and enterprise servers. Just because some crappy Windows game only uses a few threads does not mean that nothing is able to use more than 4-6 cores.
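To make that concrete, here is a minimal sketch in Python (standard library only; the function names are my own, not from any particular project) showing that the OS's process APIs scale out to however many CPUs the kernel exposes- nothing caps out at 4-6 cores:

```python
import os
from multiprocessing import Pool

def square(n):
    # Trivially parallel unit of work.
    return n * n

def parallel_squares(values):
    # Pool() defaults to os.cpu_count() worker processes; the kernel
    # scheduler spreads them over every core it knows about, whether
    # that's 8 or the 4096 that CONFIG_NR_CPUS allows on amd64.
    with Pool() as pool:
        return pool.map(square, list(values))

if __name__ == "__main__":
    print(os.cpu_count())            # however many logical CPUs the kernel reports
    print(parallel_squares(range(5)))
```

The point is that the application never has to know or care how many cores exist; the scheduler does the spreading.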
All that means is that BD 1/2 will be a nice upgrade for old or dying machines. But to lose a perfectly good Ph2 system for BD is IMHO stupid.
Many Phenom II systems fall into one of two camps- AM2/AM2+ units that originally held 65 nm Rev G A64s or Rev B Phenoms and were upgraded to Phenom IIs, or AM3/AM3+ units that likely can take Bulldozers. So either you upgraded your system already or, in many cases, it can be upgraded further. What is there to complain about, especially considering that AMD's competition rarely allows much of any CPU upgrade any more?
Originally posted by: brane227
Originally posted by: SlayerX
You're a strange cookie, brane (no offence), but you're probably right. Clock for clock, the BD is probably not going to be that much better than the PIIs, but I bet most of the people who are going to buy the BD are going to OC it, and from what I've read on it so far, that's where it's going to shine. Something like how my old PII X4 955 totally outclassed my really old Px4 9850.
1. This is not so much a technical forum as a fanboy forum, where probably no one could explain the CPU booting process in detail, but we have "blue led over klonkers" in abundance.
No, actually we covered CPU booting when somebody wanted to know if they could boot two dissimilar Socket G34 Opterons in the same board. If you want a fanboy forum, Xtreme Systems and Anandtech's forums come to mind way before here does. (They're Intel fanboy forums, not AMD fanboy forums.) Shoot, many people here openly advertise that they run Intel systems for crying out loud.
Since human eyes and brain are evolutionaly preconditioned to take notice of changes and edges, it is only natural that rational behaviour in such environment would screamingly stand out like a pig in Teheran.
The only thing I took notice of was your atrocious spelling. By the way, if you are so smart, how does a *computer* take notice of changes and edges (in an image)? And furthermore, can you actually do that math by hand? I know at least one person here who can...
2. Moore's law is no more, at least for classical CPUs (and especially for that stinking pile of cr** that the industry calls x86, in the new 64-bit as well as the old 32-bit flavour).
Moore's Law- really, "Moore's Observation"- is still pretty accurate. The transistor budgets allowed by fairly regular lithography process shrinks, coupled with ever-larger wafers, have given us the massive caches and multiple cores we've been enjoying; transistor counts have roughly doubled every 18 months.
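Back-of-the-envelope, that cadence is just exponential growth- here is a sketch with entirely hypothetical numbers:

```python
def transistor_budget(start_count, years, doubling_period=1.5):
    # Doubling every ~18 months (1.5 years) is plain exponential
    # growth: count * 2 ** (years / doubling_period).
    return start_count * 2 ** (years / doubling_period)

# A hypothetical 100-million-transistor die, six years later:
print(transistor_budget(100e6, 6))   # 1.6 billion transistors
```

Four doublings in six years is a 16x transistor budget, which is exactly the kind of headroom that goes into extra cores and cache.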
Also, x86 isn't x86 any more, and hasn't been for a long time. Ever since the K5 and P6, "x86" CPUs have had instruction decoders to translate x86 instructions into core-native operations. Meanwhile, the classical load-and-store single-instruction-per-cycle RISC microarchitectures have adopted a lot of the "nasty" of x86, namely CISC-style instructions (what else do you think SIMD is?) and out-of-order instruction scheduling. Why did they do this? x86 had SO much more performance it wasn't funny. Even with a lot of the "nasty" stuff from x86, non-x86 is still considerably behind x86. How well does a dual-core ARM Cortex compare clock-for-clock with a dual-core Phenom II? Or shoot, even the crappiest x86 chip on the market, the Intel Atom? It's not even close, and the crappy Atom is even an in-order chip! The only RISC out there with decent performance is IBM's POWER7, and anybody can get decent performance out of a chip if they can throw a hundred megabytes of cache on it and accept 200W+ TDPs that MANDATE liquid cooling. Oh, and sell the CPUs for five digits apiece.
Get over it.
3. Let's assume for a moment that, had you got it, you would know what to do with extra CPU power.
I certainly would, considering I am already running a 16-core machine. It would make my video encoding and compiling go considerably faster.
1. Throw away x86!!! It is a total disaster. ARM/MIPS and just about anything else is much better, but classical computing has its limits. Changing solely the CPU core won't change that. Yes, you can use 8192 Opterons or whatever else on special boards with special interconnects etc. But the cost of such a machine would buy you literally an ocean of the fluorescent dye you use for watercooling now. And a small galaxy of the blue LEDs that you happen to pepper everything around you with. Not to mention that the MONTHLY electricity cost would kill you instantly.
Okay, and what would you use in place of a massive Opteron (or Xeon) cluster? Pay 10 times as much for IBM POWER7s that suck down even MORE energy and actually require liquid cooling? And no, enterprise computing does not use garbage like fluorescent dye, they often don't even use water as the coolant. A bunch of GPUs that are fairly limited in the types of operations that they can do, owing to the intrinsic design of the GPU, the small amount of onboard memory, and the slow external data bus? A cluster of CPUs like ARM or MIPS CPUs that are even slower core-for-core than x86 and thus are bottlenecked even more by the CPU interconnects?
None of those are even close in performance to what you can get with x86 hardware. The Tilera unit is a limited-function coprocessor, similar in scope to a GPU. Good for some things, decidedly NOT for others. The Picochip is a wireless access point system, not a general-purpose processor. I looked at the GreenArrays thing and it looks like an FPGA. The datasheet link gave me an HTTP 503 error (unable to service the request due to machine downtime)- not a good sign from a company selling purportedly super-powerful hardware. If it is a glorified FPGA, it's also considerably more limited in usefulness than a typical x86 CPU.
3. Write your own app to run with little or, preferably, no OS involvement.
So don't use any of the OS's thread handling or APIs and instead write all of your own from scratch and reinvent the wheel. Yeah, that's really a step in the right direction...
4. DON'T OVERCLOCK ANYTHING!!!
It's quite the opposite- usually you can get a FASTER RESULT WITH __UNDERCLOCKING__.
I agree with not overclocking things, but for reliability purposes. You won't get a faster result in any calculation by underclocking, as the CPU simply takes longer to execute the exact same operations.
You read that right. Here it's not about how fast ONE core can run. Limit here is your electricity bill, so it's about MPG - how much computing will each kWh get you.
You are confusing three completely different issues. Cars operate on a completely different process than computers and have considerably different efficiency curves. A CPU's dynamic power draw increases with the square of the voltage, much as aerodynamic drag does with velocity, but there is also a minimum voltage you can run a CPU at, below which the transistors act like amplifiers rather than switches. Power scales linearly with clock speed, so as long as you don't touch the voltage, a CPU running at a higher speed uses the same energy per operation as one running at a slower speed.
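The textbook first-order CMOS model makes the point concrete (a sketch only- real chips add static leakage on top, and the symbols are the generic textbook ones, not anything vendor-specific):

```python
def dynamic_power(c_eff, voltage, freq):
    # Classic CMOS switching power: P = C * V^2 * f
    return c_eff * voltage ** 2 * freq

def energy_per_cycle(c_eff, voltage):
    # Energy per clock is P / f = C * V^2: the frequency drops out,
    # so underclocking alone does not reduce energy per operation.
    # Undervolting does, and quadratically at that.
    return c_eff * voltage ** 2
```

Halving the clock halves the power but doubles the runtime, for the same joules per result; dropping the voltage is what actually buys efficiency.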
And that sweet spot is usually below the rated speed. Furthermore, if you do your board design and cooling right, you might be able to achieve reliable operation at lower voltages at the same speed, further reducing the consumption per core.
You will also run into the point where silicon transistors no longer act as switches but as amplifiers. This is about 600 mV and modern chips don't operate at voltages all that much higher, so there is not a tremendous amount of efficiency to be gained from radically underclocking.
The calculation is simple- the more economical each core is, the more of them you can use within a given power and energy envelope, and hence the faster your machine is.
You are completely forgetting that the increased number of cores you need to match the same performance ALSO costs money, not just the electricity.
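Putting entirely made-up illustrative numbers on that trade-off (every figure here is hypothetical):

```python
def cores_in_envelope(power_budget_w, watts_per_core):
    # How many cores fit inside a fixed power budget.
    return int(power_budget_w // watts_per_core)

def total_cost(n_cores, price_per_core, watts_per_core, hours, price_per_kwh):
    # Hardware cost plus electricity: the extra "economical" cores
    # are not free either.
    energy_kwh = n_cores * watts_per_core * hours / 1000
    return n_cores * price_per_core + energy_kwh * price_per_kwh

# Hypothetical 1000 W budget: 12.5 W/core underclocked vs 25 W/core stock.
print(cores_in_envelope(1000, 12.5))  # 80 underclocked cores
print(cores_in_envelope(1000, 25))    # 40 full-speed cores
```

The underclocked box fits twice the cores in the same envelope, but it also has to buy twice the cores, boards, and RAM- which is exactly the cost the per-kWh argument leaves out.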
If you are truly serious about computing speed, and if the value of Pi really interests you that much to a million decimals every 15 minutes (and you're in such a hurry that EVERY MILLISECOND COUNTS), then this is your route.
1. Cut your misery and admit that you are a blueled poser and an overgrown child, and be done with it.
2. Stop spending money on stupidities under false pretences and stop stimulating others to compete with you "on a quest for speed".
You are a computer-literate speed freak just about as much as I am a Knight. And since I am an overweight ***** with a crappy job in a second-world microcountry, you can calculate the left part of that equation for yourself.
If you enjoy the sight of polished metal and blue LEDs etc., great.
I have nothing against that.
But then call it by its real name- you have a fetish.
I don't see why we all can't admit that the last generation of x86 are fine machines by many accounts, certainly capable performers for the jobs that 99.99% of users will load on them, and that the bottleneck is elsewhere- usually in the SW part of the bundle.
PS: But the main one is almost always the brain quality of the user.
I think somebody forgot his Haldol this morning...