First source load takes far longer than subsequent loads

Using RV 7.6.1 (also confirmed with RV-2021) on Windows 10 b19041.472 with an RTX 2080 (Driver 27.21.14.5206), I am finding that the first image load takes far longer than on other OSes. It takes multiple seconds (in my profile, almost 12 seconds) from drag-and-drop to first pixel. The image size doesn’t seem to matter; I’ve tried UHD and 1080p.

You can see that almost the entire time period is spent in internalRender.

@bernie/@alexaz, is this potentially the image buffer creation? Is there any way to debug this further or gather metrics for you, and is this reproducible on your side?

Thanks,
-Kessler

Also, importantly for this community, is anyone else seeing this? If you can run with -debug profile to generate a profile and then view it with rvprof, that would give some metrics on how long the first load takes for you, without needing a stopwatch…

Hey Mike,

Is this for EXRs specifically or another format?

– Alexa

Great question, it seems to be format independent, but my profiled test was a sequence of jpegs. I’ll validate with profiling on a variety of formats.

Hey Kessler,
On my end I get 1.7 seconds for the first render after the first drag and drop of a file sequence (approx. 400 files). I tested both EXR and JPG file sequences, and the first render time is constant.
Running RV 2021 on Windows 10 with an NVIDIA Quadro P5000.
12 seconds is a long time!
I wonder which variable is triggering this in RV: is it the RTX 2080?
Have you tried -resetPrefs, just in case some RV setting is causing this?
Does it matter where the files are located?
Can you try dragging and dropping the files from a different directory with fewer files in it?

Bernie

Hey @bernie, sorry for the late reply. Yeah, I’ve tried various formats (h264 mov, jpg, png, exr) and it seems to happen regardless.

If you guys don’t have an RTX 2080 in the hardware cache (or can’t get access to one, with COVID and all), I was wondering whether anyone in the community at large has a Win10 + RTX 2080 machine and could help me come up with a separate reproduction.

I’ve tried short and long sequences alike; nothing seems to change the time dramatically.

Hi, sorry to poke this one, but I’m seeing others complain about this. Were you able to reproduce this on Autodesk’s side?

Thanks,
-Kessler

I’ve made case number 18850487
If anyone else has this problem, I’d love to see if we can crowdsource the common cause to help Autodesk narrow in on a fix.

So far, all verified reproductions have occurred on Threadripper or Threadripper Pro machines, though there are some unverified reproductions on Xeon machines. At least one Intel laptop and one Xeon desktop have not reproduced.

Interestingly, a nearly identically specced Threadripper running ProxMox with a passthrough GPU does not reproduce the issue.

This happens when the CPU provides more than 64 cores (hyperthreading included).
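For anyone who wants to report which side of that threshold their machine is on, here is a minimal Python sketch. The `classify` helper and the `CORE_THRESHOLD` name are made up for illustration; the value 64 is only the number observed in this thread, not a documented RV limit. `os.cpu_count()` counts logical CPUs, but I believe older Python builds on Windows could under-report on machines with more than 64 logical processors, so it is worth cross-checking against Task Manager.

```python
import os

# Threshold observed in this thread; not an official RV limit.
CORE_THRESHOLD = 64


def classify(logical_cpus, threshold=CORE_THRESHOLD):
    """Return a short label saying which side of the threshold a machine is on."""
    if logical_cpus is None:
        # os.cpu_count() returns None when the count cannot be determined.
        return "unknown"
    return "above" if logical_cpus > threshold else "at-or-below"


if __name__ == "__main__":
    count = os.cpu_count()  # logical CPUs, hyperthreading included
    print(f"logical CPUs: {count} ({classify(count)} relative to {CORE_THRESHOLD})")
```

Posting that one line of output along with your first-load time would make the reports easier to compare.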

Test methodology:

Binary search of the vCPU count in ProxMox with a passthrough GPU.
Validate with BIOS CPU core limiting. On bare metal, with 64 or fewer cores, the first-load time drops sharply. Reducing the core count further reduces the first-load time further.
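The binary search step can be sketched roughly as follows, in case someone wants to repeat it on their own hardware. This is only an illustration: `find_threshold` and the `reproduces` callback are hypothetical names, and in practice `reproduces(n)` stands for "set the VM to n vCPUs, boot, and check whether the first load is slow", which is a manual step. The sketch also assumes the behavior is monotonic, meaning that once the slowdown appears at some core count it appears at every larger count.

```python
from typing import Callable


def find_threshold(lo: int, hi: int, reproduces: Callable[[int], bool]) -> int:
    """Return the smallest vCPU count in [lo, hi] for which reproduces() is True.

    Assumes monotonic behavior: if the slowdown reproduces at n vCPUs,
    it also reproduces at every count above n.
    """
    if not reproduces(hi):
        raise ValueError("slowdown does not reproduce at the upper bound")
    while lo < hi:
        mid = (lo + hi) // 2
        if reproduces(mid):
            hi = mid        # slow at mid, so the threshold is mid or lower
        else:
            lo = mid + 1    # fast at mid, so the threshold is above mid
    return lo


if __name__ == "__main__":
    # Stand-in predicate matching the result reported in this thread;
    # a real run would replace it with a manual check per VM configuration.
    print(find_threshold(1, 128, lambda vcpus: vcpus > 64))
```

With a 1 to 128 vCPU range this needs at most about seven or eight VM reconfigurations, which is why it is practical to do by hand.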

@autodesk, do you happen to have any machines with more than 64 cores (hyperthreading included)? It appears that the PBO buffer code might be thread-locking, or doing something else that depends on the maximum number of cores a machine has.