I'm doing some tests on a new Dell R715 with two Opteron 6262 HE processors. In Linux I can stress one core (stress -c 1) and Turbo CORE kicks in:
# cpufreq-aperf
000 0992000 00 sec 007 ms 00 sec 992 ms 00
001 0896000 00 sec 000 ms 00 sec 999 ms 00
002 0928000 00 sec 000 ms 00 sec 999 ms 00
003 0976000 00 sec 000 ms 00 sec 999 ms 00
004 0848000 00 sec 000 ms 00 sec 999 ms 00
005 0880000 00 sec 000 ms 00 sec 999 ms 00
006 0944000 00 sec 000 ms 00 sec 999 ms 00
007 0880000 00 sec 000 ms 00 sec 999 ms 00
008 0704000 00 sec 000 ms 00 sec 999 ms 00
009 0720000 00 sec 000 ms 00 sec 999 ms 00
010 2448000 00 sec 000 ms 00 sec 999 ms 00
011 2784000 01 sec 007 ms 11529215046 sec 061 ms 100
012 1152000 00 sec 009 ms 00 sec 990 ms 00
013 0848000 00 sec 000 ms 00 sec 999 ms 00
014 0640000 00 sec 000 ms 00 sec 999 ms 00
015 0736000 00 sec 000 ms 00 sec 999 ms 00
016 0848000 00 sec 000 ms 00 sec 999 ms 00
017 0944000 00 sec 000 ms 00 sec 999 ms 00
018 0896000 00 sec 000 ms 00 sec 999 ms 00
019 0912000 00 sec 000 ms 00 sec 999 ms 00
020 0912000 00 sec 000 ms 00 sec 999 ms 00
021 0928000 00 sec 000 ms 00 sec 999 ms 00
022 0912000 00 sec 000 ms 00 sec 999 ms 00
023 0912000 00 sec 000 ms 00 sec 999 ms 00
024 0976000 00 sec 000 ms 00 sec 999 ms 00
025 0976000 00 sec 000 ms 00 sec 999 ms 00
026 0896000 00 sec 000 ms 00 sec 999 ms 00
027 0928000 00 sec 000 ms 00 sec 999 ms 00
028 0544000 00 sec 000 ms 00 sec 999 ms 00
029 1072000 00 sec 000 ms 00 sec 999 ms 00
030 0976000 00 sec 002 ms 00 sec 997 ms 00
031 0784000 00 sec 000 ms 00 sec 999 ms 00
However, if I stress all cores, the frequency doesn't exceed the default 1.6 GHz:
# cpufreq-aperf
CPU Average freq(KHz) Time in C0 Time in Cx C0 percentage
000 1584000 01 sec 002 ms 11529215046 sec 066 ms 100
001 1584000 01 sec 002 ms 11529215046 sec 066 ms 100
002 1600000 01 sec 002 ms 11529215046 sec 066 ms 100
003 1584000 01 sec 002 ms 11529215046 sec 066 ms 100
004 1600000 01 sec 002 ms 11529215046 sec 066 ms 100
005 1600000 01 sec 002 ms 11529215046 sec 066 ms 100
006 1600000 01 sec 002 ms 11529215046 sec 066 ms 100
007 1600000 01 sec 002 ms 11529215046 sec 066 ms 100
008 1584000 01 sec 001 ms 11529215046 sec 066 ms 100
009 1600000 01 sec 001 ms 11529215046 sec 066 ms 100
010 1600000 01 sec 001 ms 11529215046 sec 066 ms 100
011 1600000 01 sec 001 ms 11529215046 sec 066 ms 100
012 1600000 01 sec 001 ms 11529215046 sec 066 ms 100
013 1600000 01 sec 001 ms 11529215046 sec 066 ms 100
014 1600000 01 sec 001 ms 11529215046 sec 066 ms 100
015 1600000 01 sec 001 ms 11529215046 sec 066 ms 100
016 1600000 01 sec 001 ms 11529215046 sec 066 ms 100
017 1600000 01 sec 001 ms 11529215046 sec 066 ms 100
018 1600000 01 sec 001 ms 11529215046 sec 066 ms 100
019 1584000 01 sec 001 ms 11529215046 sec 066 ms 100
020 1600000 01 sec 001 ms 11529215046 sec 066 ms 100
021 1600000 01 sec 001 ms 11529215046 sec 066 ms 100
022 1600000 01 sec 001 ms 11529215046 sec 066 ms 100
023 1600000 01 sec 001 ms 11529215046 sec 066 ms 100
024 1600000 01 sec 001 ms 11529215046 sec 066 ms 100
025 1584000 01 sec 001 ms 11529215046 sec 066 ms 100
026 1584000 01 sec 001 ms 11529215046 sec 067 ms 100
027 1600000 01 sec 001 ms 11529215046 sec 067 ms 100
028 1600000 01 sec 001 ms 11529215046 sec 067 ms 100
029 1600000 01 sec 001 ms 11529215046 sec 067 ms 100
030 1600000 01 sec 001 ms 11529215046 sec 067 ms 100
031 1600000 01 sec 001 ms 11529215046 sec 067 ms 100
The 6262 HE should be able to push all cores to 2.1 GHz simultaneously, but it doesn't. Also, if I spawn 16 CPU workers on my 32 cores, I don't get the expected result:
# cpufreq-aperf
CPU Average freq(KHz) Time in C0 Time in Cx C0 percentage
000 1888000 00 sec 999 ms 00 sec 000 ms 99
001 1952000 00 sec 003 ms 00 sec 996 ms 00
002 1888000 00 sec 999 ms 00 sec 000 ms 99
003 1792000 00 sec 000 ms 00 sec 999 ms 00
004 1840000 00 sec 001 ms 00 sec 998 ms 00
005 1888000 01 sec 000 ms 11529215046 sec 068 ms 100
006 1888000 01 sec 000 ms 11529215046 sec 067 ms 100
007 1808000 00 sec 000 ms 00 sec 999 ms 00
008 1808000 00 sec 000 ms 00 sec 999 ms 00
009 1888000 01 sec 000 ms 11529215046 sec 067 ms 100
010 1840000 00 sec 000 ms 00 sec 999 ms 00
011 1888000 01 sec 000 ms 11529215046 sec 067 ms 100
012 1888000 01 sec 001 ms 11529215046 sec 067 ms 100
013 1808000 00 sec 000 ms 00 sec 999 ms 00
014 1888000 01 sec 001 ms 11529215046 sec 067 ms 100
015 1824000 00 sec 000 ms 00 sec 999 ms 00
016 1808000 00 sec 000 ms 00 sec 999 ms 00
017 1888000 01 sec 001 ms 11529215046 sec 067 ms 100
018 1888000 01 sec 001 ms 11529215046 sec 066 ms 100
019 1824000 00 sec 000 ms 00 sec 999 ms 00
020 1888000 01 sec 001 ms 11529215046 sec 066 ms 100
021 1824000 00 sec 000 ms 00 sec 999 ms 00
022 1888000 00 sec 975 ms 00 sec 024 ms 97
023 1888000 00 sec 038 ms 00 sec 961 ms 03
024 1888000 01 sec 002 ms 11529215046 sec 066 ms 100
025 1840000 00 sec 001 ms 00 sec 998 ms 00
026 1824000 00 sec 000 ms 00 sec 999 ms 00
027 1888000 01 sec 002 ms 11529215046 sec 065 ms 100
028 1888000 01 sec 002 ms 11529215046 sec 065 ms 100
029 1792000 00 sec 000 ms 00 sec 999 ms 00
030 1888000 01 sec 002 ms 11529215046 sec 065 ms 100
031 1808000 00 sec 000 ms 00 sec 999 ms 00
I tried to enable HPC in the BIOS, but this doesn't make any difference. Does anybody see what's wrong?
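Comparing 32 rows by eye across three runs is error-prone, so here is a minimal Python sketch that summarizes a cpufreq-aperf dump. It assumes the row layout seen in the output above (CPU number, average kHz, time in C0, time in Cx, C0 percentage). The names `parse_aperf` and `busy_summary` and the 50% "busy" threshold are my own illustrative choices, not part of cpufreq-aperf.

```python
import re
from statistics import mean

# One row: CPU id, average kHz, "NN sec NNN ms" twice, then C0 percentage.
# Layout is assumed from the output pasted in this thread.
ROW = re.compile(
    r"\b(?P<cpu>\d{3})\s+(?P<khz>\d+)\s+"
    r"\d+ sec \d+ ms\s+\d+ sec \d+ ms\s+(?P<c0>\d+)\b"
)

def parse_aperf(text):
    """Return a list of (cpu, average_khz, c0_percent) tuples."""
    return [(int(m["cpu"]), int(m["khz"]), int(m["c0"]))
            for m in ROW.finditer(text)]

def busy_summary(rows, busy_pct=50):
    """Count cores at or above busy_pct in C0 and average their clock in MHz."""
    busy = [khz for _, khz, c0 in rows if c0 >= busy_pct]
    if not busy:
        return 0, None
    return len(busy), mean(busy) / 1000

if __name__ == "__main__":
    import sys
    count, mhz = busy_summary(parse_aperf(sys.stdin.read()))
    print(f"busy cores: {count}, mean clock of busy cores: {mhz} MHz")
```

If the script were saved under a hypothetical name such as aperf_summary.py, the output could be piped in with `cpufreq-aperf | python3 aperf_summary.py`. Filtering on C0 percentage matters because idle cores still report an average frequency that says little about what the loaded cores are doing.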
Thanks for your comment, that explains it. But how can I test whether all cores can run at 2.1 GHz? In other words, how can I put the CPU to work without exceeding the TDP?
Of course this is just a test and it doesn't resemble day to day operations, but I'd just like to see if all cores can be clocked to 2.1 GHz.
black_zion Heavy Wizardry
Posts: 9755
Joined: 04/17/2008
As an Opteron has so many cores, a typical well-threaded server app that uses near 100% of all available cores should prevent TurboCore from working, as the chip will already be running at its TDP.
Thanks again black_zion. But how does the P1 'all core boost' state make any sense if the TDP is too low to use it? Everywhere I read that all cores can be boosted to 2.1 GHz simultaneously, but in real life this can never happen?
For example see http://blogs.amd.com/work/2011/11/16/a-big-boost/
black_zion Heavy Wizardry
It depends on how hard the cores are used. Some programs will hammer the CPU harder than others, the same way a power virus will push a CPU or GPU harder than a real-world program. It all depends on whether there is any thermal headroom left.
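The headroom argument can be pictured with a toy model. This is only a sketch under loud assumptions: per-core power is taken as proportional to activity times clock (voltage changes between P-states are ignored), and the wattage figures passed in are made-up illustrative values, not measurements or AMD specifications. Only the 1.6 GHz and 2.1 GHz clocks come from this thread. The real Turbo CORE logic is in the processor and is not reproduced here.

```python
# Toy model only: NOT how the hardware actually decides.
# Clocks are the base and all-core boost values quoted in this thread.
P_STATES_MHZ = [2100, 1600]  # ordered highest first

def estimated_power(activities, mhz, watts_per_core_ghz):
    """Sum per-core power, assumed proportional to activity * frequency.

    activities: one value per core, 0.0 (idle) to 1.0 (worst-case load).
    watts_per_core_ghz: hypothetical watts one fully loaded core draws per GHz.
    """
    return sum(a * watts_per_core_ghz * mhz / 1000 for a in activities)

def pick_clock(activities, tdp_watts, watts_per_core_ghz):
    """Return the highest clock whose estimated power fits the budget."""
    for mhz in P_STATES_MHZ:
        if estimated_power(activities, mhz, watts_per_core_ghz) <= tdp_watts:
            return mhz
    return P_STATES_MHZ[-1]  # nothing fits: stay at the base clock

if __name__ == "__main__":
    # Hypothetical numbers: 16 cores, 80 W budget, 3 W per core per GHz.
    print(pick_clock([1.0] * 16, 80, 3.0))  # heavy load on every core
    print(pick_clock([0.7] * 16, 80, 3.0))  # lighter load on every core
```

With these invented numbers, a load that drives every core at full activity leaves no room for the higher clock, while a load that keeps all cores busy but draws less power per core does. That is the same distinction as between one kind of stress worker and another.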
Thanks. I found a way to get all cores to 2.1 GHz: "stress -m 40". I am wondering what will happen in real life (production), but now we know for sure that Turbo Core works ;-)
Mime Forum Moderator
Posts: 492
Joined: 08/28/2012
I'd say it's unlikely, and when it does happen it's probably not all that useful. If all the cores of your CPU are active, but without enough work being done to jack up power dissipation, then increasing clock speed and doing it faster won't get you much in return. That this can happen at all is probably just a side effect of having per-core granularity in power gating. The earlier versions of turbo core didn't have that, so it was more of an "all or nothing" kind of choice, and there were limits to how aggressive the decisions made could be when switching states. Trying to boil all that down into some kind of easily digestible sound bite isn't always doable, and often leads to confusion like this.
Widgets like turbo core are best for lightly threaded workloads where there's potential for the chip to work hard on something, and yet still have some TDP space left over. It just happens to work out as a kind of patch to help with the relatively weak single threaded performance of Bulldozer as well, although I expect it'll stick around even after that gets sorted out.
-------------------------
Do not meddle in the affairs of archers, for they are subtle and quick to anger. Post Count: +8510 Troll Hunter
The opinions expressed above do not represent those of Advanced Micro Devices or any of their affiliates.
black_zion Heavy Wizardry
The best analogy I can think of is to treat TurboCore like HyperThreading. Under a weakly threaded load, which is much more common in the home and office market than in the server market, you'll see a performance boost, but under a heavily threaded load that uses every core it's just not going to have an effect. The difference is that on desktops you can use AMD Overdrive to fiddle around with the TurboCore settings and raise the limits if you have aftermarket cooling. There really is no substitute for full-time higher clocks, but even then, over the next 5-10 years the CPU is going to take on much less importance as GPUs, using OpenCL or other GPU programming languages, take on the heavier workloads, as a single upper mid-range graphics card can outpace a 4U server in terms of GFLOPS and IOPS performance.
But even with higher clocks, even at 4 GHz compared to my stock 3.4 GHz, when converting a 2-hour VOB to WMV, converting a few thousand pictures from RAW to JPEG, or even playing games (granted, this is because I play at a resolution that is more GPU-limited than CPU-limited), the performance differences are negligible.
We will use this server as a KVM host. With most KVM guests (virtual machines) bound to one or two CPU cores, Turbo Core can be very useful. Most guests are not very busy, so the TDP leaves room to boost the few busy ones quite a lot.
Originally posted by: Mime
If all the cores of your CPU are active, but without enough work being done to jack up power dissipation, then increasing clock speed and doing it faster won't get you much in return.
That's a very clever comment. The possibility to boost only a few cores while others are idle is much more useful (for weakly threaded applications/setups).