So, I ran a 5-minute GPU: 3D test: DX11, 1920x1080 fullscreen, shader complexity 7, FPS limit 400, with error checking enabled, and it turned up nothing.
To be honest I don’t overclock and I have somewhat limited exposure to the hardware side of things. However, AFAIK there is a hard upper voltage limit defined by the VBIOS or possibly even physically limited on the board. Unless you’re willing to go into warranty-violating territory, I suspect that +88 mV is the best you’re going to get.
Unfortunately I don’t have recommendations for publicly available stress tests, but OCCT seems pretty reasonable. You really want something that can stress certain components individually but that’s a much more complex test and probably overkill.
BTW Habeed, feel free to start a VR thread if you’re interested in continuing the discussion.
As for memclock vs core overclocks: ideally you want to increase both at the same ratio. Stock clocks are chosen to achieve a good balance. Now, different parts of the scene are limited by different things, so you can still see gains from increasing one or the other. But ideally you increase both.
Different settings can also alter the optimal ratio. MSAA for instance typically increases the stress on the memory. If you regularly play with high AA settings, you’ll want to pay attention to that. If you instead play at high resolution but low AA, core clock is more important (very roughly speaking, of course).
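If it helps, here’s a rough sketch of what “same ratio” looks like in practice. The stock clocks below are made-up placeholders, not any particular card’s specs:

```python
# Illustrative only: keep core and memory offsets in the stock ratio.
# The stock clocks are made-up placeholders, not a real card's specs.
STOCK_CORE_MHZ = 1200
STOCK_MEM_MHZ = 3000   # actual (not "effective") memory clock for this example

def scaled_mem_offset(core_offset_mhz):
    """Memory offset that preserves the stock core:memory ratio."""
    return core_offset_mhz * (STOCK_MEM_MHZ / STOCK_CORE_MHZ)

for core_offset in (50, 100, 150, 200):
    print(f"core +{core_offset} MHz -> memory +{scaled_mem_offset(core_offset):.0f} MHz")
```

In other words, step both offsets up together and re-test at each step, rather than maxing out one and then the other.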
I realized you were right: my core/VRAM ratio was about 14/35, so I slowly increased both in that ratio. I got to +280MHz on the core clock and +700MHz on the memory clock before deciding that was enough. Some Kombustor benchmarks triggered a TDR, but OCCT reported 0 errors during a 10-minute test.
The furry donut hardly moved, though, and the framerate was around 150fps. Temps got to 61C.
You were right about some tasks needing core and memory to different degrees. As it turns out, my earlier overclocking instability came from the fact that I wasn’t overclocking the GPU memory. In one Kombustor benchmark, the core was at 1647MHz and the memory at 4204MHz with the temp at 48C. In OCCT, it was common for the core to sit at 1110MHz and the memory at 4130MHz, with higher temps in the upper 50s or low 60s.
Is the VRAM typically as well cooled as the core? I read that a backplate helps cool the GPU’s memory. The VRAM’s temperature is monitored too, right?
It’s not as well-cooled, though that’s mainly a function of it not using nearly as much power. It frequently shares a heatsink with the GPU itself, and you can see that the heatpipes go directly over the GPU. Sometimes the VRAM is cooled only by the wash from the fan. A backplate can help; my eVGA 970 actually had a backplate stuffed in the box that I had to install myself; I suspect they realized after manufacturing that their existing cooling system wasn’t quite up to scratch.
Super-serious overclockers get water blocks that cover both the GPU and VRAM, and can keep both at near-ambient temps. But that’s pretty much overkill.
Unfortunately, the monolithic coolers that you see these days make it harder to cool the VRAM specifically. On older cards, you could install little heatsinks on each VRAM chip, but that doesn’t work now. Either replace the whole unit or live with it. A fan blowing on the backside of the PCB may help a bit.
Why don’t you have an engineering sample from whichever of the 3 GPU manufacturers you work for? Although, if you work for Intel, your company’s products are still too slow to game with, so you would logically use NVidia…
I do have engineering samples of this nature. However, they’re extremely obnoxious because they just use crappy CPU fans that haven’t been characterized, and therefore for safety they run at 100% speed all the time. Give me a production cooler any day.
That said, I have used water cooling in the past. It is quiet, which I like. But the system itself is somewhat annoying. I run mostly low-speed air cooling these days.
Reading this, I feel like there have to be a half dozen other ways to solve this.
- You could use massive off-the-shelf heat-pipe heatsinks, far larger than anything you’d stick on a retail card, for cooling development boards. The kind that eat 3 or 4 PCIe slots and have several oversized fans. Come to think of it, you could just use one meant for hot processors…
- Watercooling, yeah. I would assume the future GPUs you’re testing are roughly the same size as past ones, so you just need mounting holes in the board that match a commonly available water block. You could either have an employee slap together 50 identical, simple kit-build water loops, or you could probably just specify a hole spacing that lets you use an off-the-shelf cooler meant for processors, something like a Corsair H100, instead.
- You could stick the development machines in another room. Not helpful at home, but maybe this is what you do at work?
No matter the choice, the basic principle you’re exploiting is that there’s no sense skimping on a $250 block of copper or a water-cooling kit when an engineering sample of a GPU probably costs more than $10k to manufacture.
- How should I interpret pixel fillrate and texel fillrate?
For pixel fillrate, should I just divide the fillrate by (desired framerate x height x width) and see if I get something higher than 1? It has to do with the screen’s pixels, how many of them can be updated, and how often, right? There must be something else involved.
For texel fillrate, I’m less sure. It’s textures, but how do I get an idea of the quantity of texture elements? Is it the sum total of (all texture maps x their resolutions) that can be fetched per second?
- I’m using ASUS GPU Tweak after having tried Precision X16, Firestorm and Afterburner. It’s good, except that it seems to revert to the auto fan setting instead of the user-defined fan setting. I make sure to apply and save, yet it reverts to “Auto”. It consistently applies my preferences for all settings except the fan.
My maximum core overclock is supposed to be 1500MHz, yet when rendering fractal videos it goes up to 1588MHz. The VRAM clock stays at 6010MHz even though I’ve overclocked it to 8000MHz. Do I understand correctly that the GPU figures it doesn’t need as much memory speed, so it shifts the voltage to the core to make it run faster than my overclock setting? If so, that’s pretty smart, and it would seem to make many user-defined overclocking settings of little use, since the user could just decide how much to overvolt, set the fan curve, and then let the GPU figure out what it needs most at runtime.
This is not an easy question to answer.
In the old days, it was easier. The pixel fill you needed was pretty much just the screen pixels/sec times 3. The factor of 3 comes from the average depth complexity. Depth complexity is the number of times a given pixel is written per frame. It is 1 at the very minimum, of course, but a typical scene has a fair amount of overlapping geometry. Most pixels will be written a few times; for some it will be many times.
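To put rough numbers on that old rule of thumb, here’s a quick back-of-the-envelope sketch. The peak rate is just an assumed example figure, and the factor of 3 is the average depth complexity mentioned above:

```python
# Back-of-the-envelope pixel fill estimate, old-school style.
# All figures are illustrative examples, not measurements.

def required_pixel_fill(width, height, fps, depth_complexity=3.0):
    """Pixels/sec needed: pixels per frame * frames/sec * average depth complexity."""
    return width * height * fps * depth_complexity

needed = required_pixel_fill(1920, 1080, 60)   # ~0.37 Gpixels/s
gpu_peak = 55e9                                # assumed peak rate of ~55 Gpixels/s
print(f"needed  : {needed / 1e9:.2f} Gpixels/s")
print(f"peak    : {gpu_peak / 1e9:.2f} Gpixels/s")
print(f"headroom: {gpu_peak / needed:.0f}x (ignores shaders, offscreen passes, AA, etc.)")
```

The huge apparent headroom is exactly why this stopped being a useful way to size things up, as the rest of this post explains.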
Today, things are more complicated. There are all kinds of offscreen surfaces that use up fillrate: shadow maps, G-buffers, environment maps, light maps, and so on. Furthermore, some of these require high precision, blending, anti-aliasing, or other advanced features that slow down the “ROP” (Raster OPeration) unit.
Even more importantly, these days you spend large parts of the scene in regimes that are not fill limited; they’re shader limited. The shader mainly does a lot of math. You could easily have a program that does thousands of math ops and then a single pixel write at the end. A high fill rate doesn’t help you here.
Texel fill is similarly complicated. Like pixel fill, a high texel rate only helps in some regimes. And it goes at different speeds depending on the texture format and precision.
So to sum up, there’s no easy way to start with a pixel/texel rate and compute how fast an app will run. Different apps just behave too differently. Even comparing between GPUs is dubious, because of slightly differing definitions of what it means to have a given rate. Two GPUs might have the same peak rate even if one of them is half-speed when writing to floating-point surfaces, for instance. Comparisons within the same GPU family are about the best you can do.
I agree with increasing both by the same ratio. I’m not sure what to take as the base value for the core clock.
Nvidia says the base clock for the 970 is 1050MHz and the boost clock 1178MHz*. Wikipedia says that the 1178MHz figure is the average boost clock and that the max boost clock is around 1250MHz**.
The most common way to count the memory clock on Maxwell puts it at 7010MHz.
So, what core clock value should the memory clock scale to?
If it’s the base clock, then the memory clock should be 6.68 times the core clock. If it’s the average boost, it should be 5.95. If it’s the max boost, it should be 5.6 times.
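For what it’s worth, the arithmetic behind those three ratios, using the figures above:

```python
# Sanity check of the memory:core ratios quoted above (GTX 970 figures).
MEM_CLOCK_MHZ = 7010  # effective memory clock, the usual way Maxwell is counted

for label, core_mhz in (("base clock", 1050),
                        ("average boost", 1178),
                        ("max boost", 1250)):
    print(f"{label:14}: {MEM_CLOCK_MHZ / core_mhz:.2f}x")
```

(That prints roughly 6.68x, 5.95x and 5.61x, matching the numbers above after rounding.)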
Also, I note that factory-overclocked GPUs always overclock the core but only sometimes OC the VRAM. Even when the memory is overclocked, its OC is much milder than the core’s. Any guesses as to why?
Do I correctly guess that factory-overclocked GPUs tend to be binned/have higher ASIC quality?
*http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-970/specifications
**List of Nvidia graphics processing units - Wikipedia