Topic Title: 7970 GHz Edition GPU core oscillating?
Topic Summary:
Created On: 01/27/2013 07:29 PM
 01/27/2013 07:29 PM
BogusJoe
Peon

Posts: 2
Joined: 07/04/2009

Got two of the FX-797G-TDFC and they are defective when the cards are run at 100% GPU usage. Stock settings: GPU @ 1050MHz, memory @ 1500MHz, and power control setting @ 0%. It does not matter whether CrossFireX is enabled or disabled. I have also tried each card on its own, with just a single card in the system at a time. Here is what happens. The card(s) run great until GPU usage hits 100%. At that moment the GPU core clock starts oscillating between 537MHz and 1000MHz. Here is a list of programs I have used, and it happens every time GPU usage hits 100%:

1. Heaven 3.0
2. 3DMark 11
3. 3DMark Vantage
4. Windows Experience Index
5. Crysis 1
6. Crysis 2
7. Battlefield 3
8. PCMark 7
9. FurMark 1.10.4

FurMark (benchmark preset: 1080) is the quickest program to run to produce the same results. I first noticed this when running GPUShark v0.6.7 to monitor the card(s). I also used GPU-Z v0.6.7 and logged to file to verify the results. Then I loaded MSI Afterburner to monitor and log the results, and this is where the numbers from 537MHz to 1000MHz showed up. The core clock is very erratic once the GPU hits 100%, while VDDC stays at a constant 1.213V.
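For anyone who wants to check their own log for the same pattern, here is a minimal Python sketch that compares core clock readings taken at full load with those taken below it. The column names `core_clock_mhz` and `gpu_load_pct` and the sample rows are made up for illustration; a real GPU-Z or Afterburner log will have its own headers and would need to be mapped onto them first.

```python
import csv
import io
import statistics


def summarize_core_clock(csv_text, load_threshold=99.0):
    """Summarise core clock readings, split by whether GPU load reaches the threshold."""
    at_load, below_load = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        clock = float(row["core_clock_mhz"])   # hypothetical column name
        load = float(row["gpu_load_pct"])      # hypothetical column name
        (at_load if load >= load_threshold else below_load).append(clock)

    def stats(values):
        if not values:
            return None
        return {
            "samples": len(values),
            "min": min(values),
            "max": max(values),
            "spread": max(values) - min(values),
            "stdev": statistics.pstdev(values),
        }

    return {"at_load": stats(at_load), "below_load": stats(below_load)}


# Made-up sample rows, loosely shaped like the behaviour described in this post.
sample = """core_clock_mhz,gpu_load_pct
1050,85
1050,97
537,100
1000,100
620,100
980,100
"""

summary = summarize_core_clock(sample)
print(summary["at_load"])
print(summary["below_load"])
```

A large spread at full load next to a spread of zero below it is the signature described above: the clock is steady until the GPU is fully loaded and only then starts jumping around.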

At first I thought it might be a driver issue. I started with Catalyst 13.1, then moved down to Cat 12.10, then down to 12.8. Every time I uninstalled a driver, I ran the AMD Cleanup util 1.2.1.0 and Driver Sweeper, and then ran CCleaner to remove any leftover registry entries.

System information:
Windows 7 Ultimate 64-bit with latest patches and updates
Motherboard: Asus Rampage IV Extreme, BIOS v3301
Processor: Intel Core i7-3930K
RAM: 32 GB DDR3-2133 G.Skill
Power Supply: SilverStone Olympus 1500
VGA Card 1: in slot 1 or BusID 1, FX-797G-TDFC
VGA Card 2: in slot 3 or BusID 3, FX-797G-TDFC

Has anyone else run into this situation? Once a GPU hits 100% usage, does it clock itself down?

 01/28/2013 09:48 PM
Thanny
Alpha Geek

Posts: 1345
Joined: 07/13/2009

Since the 6900 series, AMD cards (at least the high-end ones) have had what they market as PowerTune.  Basically, instead of capping the clock speed at a much lower overall value to ensure that a given thermal envelope is not exceeded, the card monitors actual power usage (it's not a direct measurement, but a sophisticated calculation based on a number of usage factors), and reduces clock speed only as necessary to stay within the assigned thermal limits.

At 100% usage with a program like FurMark is precisely when you'd expect to see that happen.  It does not indicate a defect in the card.  Quite the opposite.

If you have sufficient cooling, you can increase the power limit, and the card won't clock down until a more taxing load is encountered (not all 100% loads are equal), if at all.

 

 02/03/2013 01:22 PM
BogusJoe
Peon

Posts: 2
Joined: 07/04/2009

Thanks for the reply. Will it do this even if temps are 70C or below? If the temp was like 85 to 90C I would understand. I have even run the card at 900MHz and it did the same thing! Here is what the real problem is, and it really confuses me. If I run this card at 2560x1600 with AA on, GPU usage maxes out at 99% with temps at 70C or below. If I run the card at 1920x1080 with no AA, it hits 100% GPU usage and starts oscillating. The same thing happens at 1280x720. When the core is set to 1050MHz and hits 100% GPU usage, bam... it oscillates wildly from 537MHz to 800MHz or so. I have also tried 1100MHz and 1130MHz with the same result, and as mentioned above, it even does this at 900MHz. Temps on the card are 70C or below.

Now add a second card to the system and set it up as CrossFire. How does it affect the graphics system when you have cards oscillating in core speed? I thought you always needed both GPUs to be at the same speed.

 02/03/2013 06:09 PM
Thanny
Alpha Geek

Posts: 1345
Joined: 07/13/2009

Let's respond backwards.

Crossfire used to use the lowest common denominator for clock speed, many years ago.  It no longer does that.  Each card in a CF setup can run at different clock speeds.

Temperature is not directly the issue.  TDP is the issue - that stands for Thermal Design Power.  Something with a TDP of 200W requires a cooling solution capable of removing that much heat, at a temperature below the maximum allowed (let it get hot enough, and simple radiation will give off 200W), to prevent it from overheating. 

Maybe a bit more detail on the relationship between clock speed and TDP will help. 

Before the 6900 series, ATI/AMD cards were given a maximum clock speed that was known to never exceed the TDP under any load.  Any load other than Furmark, anyway - they were looking at real-world loads, not artificial spaceheater benchmarks.  That maximum clock would be well below what the cooling solution could handle under many load conditions.  So you were basically stuck with a card operating well below its potential a good portion of the time. 

Starting with the 6900 series, AMD ditched that concept entirely.  Instead of picking a safe maximum clock speed, the card was given a safe TDP.  If the load on the card exceeds that TDP, the clock speed is reduced to lower heat output.  The result is that the GPU works at pretty close to its maximum potential in a given thermal design envelope.
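To make the idea concrete, here is a toy Python model of a power-capped clock governor. It is only a sketch under stated assumptions, not how PowerTune is actually implemented: it assumes estimated power scales linearly with clock speed at a fixed voltage, uses a coarse 50MHz step, and every wattage in it is a made-up number chosen for illustration.

```python
def simulate_governor(power_at_max_w, power_limit_w, max_clock_mhz=1050,
                      min_clock_mhz=500, step_mhz=50, ticks=12):
    """Step a toy power-capped governor and return the clock at each tick.

    Assumption: estimated power scales linearly with clock at fixed voltage.
    """
    clock = max_clock_mhz
    history = []
    for _ in range(ticks):
        history.append(clock)
        estimated_power = power_at_max_w * clock / max_clock_mhz
        if estimated_power > power_limit_w:
            # Over the cap: back the clock off by one step.
            clock = max(min_clock_mhz, clock - step_mhz)
        else:
            # Under the cap: climb back towards the maximum clock.
            clock = min(max_clock_mhz, clock + step_mhz)
    return history


# Hypothetical loads: one that would draw 300 W at full clock, one that would draw 240 W.
heavy = simulate_governor(power_at_max_w=300, power_limit_w=250)
light = simulate_governor(power_at_max_w=240, power_limit_w=250)
print("heavy load:", heavy)
print("light load:", light)
```

In this model the heavy load never settles. The clock steps down until the estimate falls under the cap, steps back up, goes over again, and so on, which looks like an oscillating core clock in a monitoring tool. The light load stays pinned at the maximum clock the whole time.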

And rather than make that TDP a rigid quantity, they added the ability to tune it upwards or downwards to accommodate greater or lesser cooling ability.  If you adjust the power limit upwards, the card will be allowed to generate more heat before reducing its clock speed.
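As a rough illustration of what the percentage adjustment does, the sketch below estimates the highest clock a card could sustain for a given power limit offset. It assumes power scales linearly with clock at a fixed voltage, which is a simplification, and the 250 W base limit and 300 W load are hypothetical figures, not measurements from any card in this thread.

```python
def sustained_clock_mhz(power_at_max_w, base_limit_w, offset_pct, max_clock_mhz=1050):
    """Estimate the highest clock that keeps estimated power within the adjusted limit.

    Assumption: power scales linearly with clock at fixed voltage.
    """
    limit_w = base_limit_w * (1 + offset_pct / 100)
    return min(max_clock_mhz, max_clock_mhz * limit_w / power_at_max_w)


# Hypothetical load that would draw 300 W at full clock, against a 250 W base limit.
for offset in (-20, 0, 5, 20):
    clock = sustained_clock_mhz(power_at_max_w=300, base_limit_w=250, offset_pct=offset)
    print(f"{offset:+d}% -> about {round(clock)} MHz")
```

With these made-up numbers, -20% lands near 700MHz, +0% near 875MHz, and +20% is enough to hold the full 1050MHz. Real cards will not follow this line exactly, but the direction is the same: a higher limit means less downclocking and more heat for the cooler to remove.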

As I said, it's about heat output, not actual temperature.  At high temperature, the card will simply shut down to protect itself.  Adjusting clock speed based on thermal output is a way to avoid thermal shutdown.  Keep in mind that with insufficient cooling, you can go from 30C to 100C in a fraction of a second, which is too short a time to simply modify the clock speed to prevent it.

Armed with that understanding, it should be clear now that if your card drops at times to between 500MHz and 800MHz, it does so regardless of the initial clock speed, and regardless of current temperature.  It's a function of thermal output.  If you're confident that your cooling is up to it, increase the power limit.  Start with +5%, and watch your temperatures under full load.  They should go up, and the clock speed reductions should become smaller or less frequent.

As for 2560x1600 with AA versus lower resolutions without it, that comes down to my earlier point that not all full-load situations are the same.  The resources of the card may be fully occupied in different ways, with some generating more heat than others.  The task of rendering a series of anti-aliased 2560x1600 frames can certainly generate less heat than rendering a series of aliased smaller frames more frequently.  And it's the amount of heat generation that matters.

And finally, a point of comparison may help.  I have a pair of water-cooled 7970s at 1100/1450 each.  The power limit is set at +20% for both cards.

Here's what I get when running the 1920x1080 15-minute benchmark in Furmark, stopped after almost five minutes to get the whole graph in.  For the first 15 seconds or so, both cards are at 99%, and remain at 1100MHz.  After that, the second card (a Sapphire 7970 OC) starts to downclock.  I saw the number get down into the 800s.  The load on the first card (a Gigabyte 7970 OC) goes down correspondingly, since it has less work to do to keep up with the lower-clocked secondary card, which remains at 99% the whole time.

And here's what happens when I do two short runs at 2560x1600, first without AA, then with 4xAA.  Without AA, you can see the same pattern - starts out normally, then the clock speed starts fluctuating.  With AA, the load on the cards is different, and the clock speed doesn't need to go down at all.

If I turn the power limit down to -20%, the first GPU stays at 500MHz for the Furmark test, with the expected performance hit.  Set to +0%, the primary GPU clocks down almost immediately to the 800s, with a ~16% decrease in performance.

So a higher power limit means higher performance, with the expectation that your cooling solution will be able to handle the extra heat output.

Curiously, the Sapphire card seems to behave exactly the same at +0% as at +20%, so I wonder if there isn't a BIOS bug in that card.  But it behaves as expected at -20%, and Furmark isn't representative of any actual game.  I never see a reduction in clock speed in normal games.

 
