Render queue kicks jobs

Started by Eric Summers, January 09, 2019, 02:59:50 PM

Eric Summers

I've been having issues where I will process a queue of studios and at some point, it will finish one image and then clear the rest of the queue without processing it. This is a perplexing one to me, as I can't get it to happen consistently. Has anyone else experienced this?

Win 10, KS 8.1.61, plenty of space available on the destination drive. Also doesn't seem to be scene specific as far as I can tell.

DMerz III

I haven't experienced the queue being cleared without processing. But I have experienced hung jobs in the queue.

Our best practice has been not to disturb the open KeyShot file until the entire set of jobs has finished 'sending' to the network and all the jobs say 'ready'.
We've run into errors when we send a batch, and while they're still in 'transit' we close the file, or interact with it in some manner which seems to disturb the proper upload.

Not sure if that's the case here, but keep an eye on it?

KeyShot

I have not heard of this issue either. Is it using network rendering or rendering in KeyShot directly?

Eric Summers

I'm rendering directly in KeyShot. My typical workflow is this:

I set up my scene on my local desktop and get everything ready to render. I save the scene and close KeyShot. Then I run a remote desktop connection to our rendering machine, open KS and add the studios I want to render to the queue. I process the queue and sometimes I will leave the remote desktop up while it renders. Other times I disconnect if it is going to be a long batch of renders. I have had it clear the queue both ways. 

Could it have something to do with the fact that I'm logging onto another machine? Could borrowing a floating license do anything? I've never gotten any license errors so I'm just grasping at straws.  :-\

mattjgerard

One thing I would try is to see if you can "borrow" a 14 day network render license as a trial from Luxion for your remote render computer, then submit from your local workstation that way and see if the problem persists. It sounds like something is clearing your queue somewhere, and like you said, it's only under a very specific circumstance that might be hard or near impossible to pinpoint. Using the NR option might help eliminate some of the sketchiness of your current workflow.

Eric Summers

I want to run a few more tests when I have some time to be sure, but I may have a solution.

I had a batch of studios that I had to render overnight, so I borrowed a floating license for the rendering machine. The queue ran with no issues. However, my local desktop just got upgraded and now both machines are running Windows 10. I wouldn't think running 7 locally and 10 on the remote desktop would change anything, but I don't know. So, when I have time I'm going to try processing a full queue without borrowing a floating license to eliminate one variable and see what happens.

Niko Planke

Hey Eric,

I assume you are using the "add Studios" button in the Queue window on the remote computer to add your predefined studios in the file to the queue?
May I ask how you share the file between the two computers?

If the floating license has anything to do with it, you should be able to see messages in the floating server's logs about refused connection attempts or disconnects. If you are using the latest server software, the log file is located in the License folder next to the license file.
If it is indeed related to the computer losing the connection to the floating server while rendering, borrowing a license on the remote rendering computer should prevent this issue.
If that is the case, please let us know so that we can find a way to tackle it.

If this is caused by the queue file being corrupt to begin with, borrowing a license will not help. If the issue keeps showing up with a borrowed license, we would ideally need the ".ext" files that are created in your "scenes folder" for the queued jobs.
Since these get removed when a queued job finishes, you would need to copy them before starting the queue rendering. This is easier if the issue can be reproduced reliably.
You can also check that these exist for each queued job before starting the rendering.

A few things to look out for:
It is important to note that the .ext files can take up a lot of space, so if the drive used for the Resource folder (including the scenes folder) has insufficient space, you may experience issues here even if the image output location has sufficient space.
It is also important to note that the default Resource location should not be shared across multiple computers, since that can introduce issues as well.
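
As a rough illustration of the disk-space point above, here is a minimal Python sketch that reports whether the drives holding a set of folders have at least some minimum of free space. The function names, the example paths, and the threshold are placeholders and not anything taken from KeyShot; how much room the .ext files actually need depends on the scenes.

```python
import shutil


def free_gib(folder):
    """Return the free space, in GiB, on the drive that holds `folder`."""
    return shutil.disk_usage(folder).free / 2**30


def check_free_space(folders, min_free_gib):
    """Map each label to True if its folder's drive has enough free space.

    `folders` is a dict such as {"scenes": "...", "output": "..."}.
    `min_free_gib` is an assumed threshold, not a documented KeyShot value.
    """
    return {label: free_gib(path) >= min_free_gib
            for label, path in folders.items()}


# Hypothetical usage (paths are placeholders):
# check_free_space({"scenes": r"D:\KeyShot\Scenes",
#                   "output": r"D:\KeyShot\Renderings"}, min_free_gib=20)
```

Checking the scenes folder and the output folder separately matters here because they can sit on different drives.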

I can not see how upgrading the Local Desktop to Windows 10 could introduce or solve this issue. It should not make a difference (Famous last words.)



Eric Summers

Hey Niko, thanks for the response. I'm going to break out my answers, hopefully it keeps everything from getting jumbled together.


Quote from: Niko Planke on January 17, 2019, 12:22:20 AM
I assume you are using the "add Studios" button in the Queue window on the remote computer to add your predefined studios in the file to the queue?
May I ask how you share the file between the two computers?

Yes, I am using the "add Studios" button to add them to the queue.
My KeyShot scenes are in a folder on a network drive that I have access to on both machines. When I switch machines I close the file on one machine and then open it on the other, I never have the scene open on both machines.

Quote from: Niko Planke on January 17, 2019, 12:22:20 AM
If the floating license has anything to do with it, you should be able to see messages in the floating server's logs about refused connection attempts or disconnects. If you are using the latest server software, the log file is located in the License folder next to the license file.
If it is indeed related to the computer losing the connection to the floating server while rendering, borrowing a license on the remote rendering computer should prevent this issue.
If that is the case, please let us know so that we can find a way to tackle it.

I had some time to run some tests today and it seemed that borrowing a license prevented the issue. I will need to talk to IT tomorrow to find out where the log file is so I can look at the messages.

Quote from: Niko Planke on January 17, 2019, 12:22:20 AM
If this is caused by the queue file being corrupt to begin with, borrowing a license will not help. If the issue keeps showing up with a borrowed license, we would ideally need the ".ext" files that are created in your "scenes folder" for the queued jobs.
Since these get removed when a queued job finishes, you would need to copy them before starting the queue rendering. This is easier if the issue can be reproduced reliably.
You can also check that these exist for each queued job before starting the rendering.

I will keep this in mind. In my testing today, I checked to make sure that the .ext files were created before starting the queue. Each job also had a .ext.geom file. There is also one XML file named "q"; I don't know if that has any bearing on anything.

Quote from: Niko Planke on January 17, 2019, 12:22:20 AM
A few things to look out for:
It is important to note that the .ext files can take up a lot of space, so if the drive used for the Resource folder (including the scenes folder) has insufficient space, you may experience issues here even if the image output location has sufficient space.
It is also important to note that the default Resource location should not be shared across multiple computers, since that can introduce issues as well.

I can not see how upgrading the Local Desktop to Windows 10 could introduce or solve this issue. It should not make a difference (Famous last words.)

The (network) drive that my Scenes folder is on has ~150 GB free and the output drive (a different network drive) is about the same.

My folder locations on both machines for the Scenes and Renderings point to the same network folder (separate folder for each). Would this cause a problem?

The upgrade of my local machine did not solve it; I was able to reproduce the problem today. So I think you're right, it didn't make a difference.


I will report back with what the log files say tomorrow, hopefully. Thanks again!

Niko Planke

Hey Eric,

Thanks for the update.
That helps me with understanding the circumstances in which the issue occurs.

Quote:
Each job also had a .ext.geom file. There is also one XML file named "q", I don't know if that has any bearing on anything.

Yes, exactly these three files are relevant for the queue rendering. If this happens again, it would help our investigation to have copies of those files from before the queue rendering was started, to potentially rule out corrupted files as the cause.
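
For anyone who wants to keep copies of those files before starting the queue, a minimal Python sketch of one way to do it follows. It assumes the three kinds of files named in this thread (*.ext, *.ext.geom and q.xml) sit directly in the scenes folder; the function name and the backup location are hypothetical, and this is not an official KeyShot tool.

```python
import shutil
from datetime import datetime
from pathlib import Path

# File patterns named in this thread; assumed to live directly in the scenes folder.
QUEUE_PATTERNS = ("*.ext", "*.ext.geom", "q.xml")


def backup_queue_files(scenes_dir, backup_root):
    """Copy the queue-related files into a new timestamped folder.

    Returns the destination folder and the list of copied file names.
    The originals are left in place so the queue can still be processed.
    """
    scenes = Path(scenes_dir)
    dest = Path(backup_root) / datetime.now().strftime("queue_%Y%m%d_%H%M%S")
    dest.mkdir(parents=True, exist_ok=True)

    copied = []
    for pattern in QUEUE_PATTERNS:
        for src in sorted(scenes.glob(pattern)):
            if src.is_file():
                shutil.copy2(src, dest / src.name)  # copy2 keeps timestamps
                copied.append(src.name)
    return dest, copied


# Hypothetical usage, run after adding jobs but before processing the queue:
# dest, copied = backup_queue_files(r"N:\Renders\Scenes", r"C:\QueueBackups")
```

If the returned list is empty or is missing the file for a queued job, that is worth noting before the rendering starts.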

Quote:
My folder locations on both machines for the Scenes and Renderings point to the same network folder (separate folder for each). Would this cause a problem?

As long as each computer has a unique scenes folder for itself (implying unique .ext and q.xml files), there should not be an issue. Otherwise there is a risk of both computers reading and modifying the same q.xml file, which could introduce issues like the one we are seeing right now.

In rare cases it can also happen that slow network performance introduces too large a delay, or timeouts, when reading the files from a network drive, which can cause unforeseen issues.
If, for instance, you notice that the issue only occurs during work hours or at times of high network activity, you can try setting the scenes folder to a local drive on the rendering machine.


Eric Summers

Quote from: Niko Planke on January 18, 2019, 12:09:13 AM
As long as each Computer has a unique scenes folder for itself (implying unique .ext and q.xml files) there should not be an issue. Otherwise there is a risk for both computers reading and modifying the same q.xml file which could introduce issues like the one we are seeing right now.

Hmmm, I think this could be my issue! Just to confirm, I'd like to explain my setup slightly differently:

My scene files are on a network drive (let's call the drive N) so I can access them from any machine.
On my local desktop install of KeyShot, in the settings the Scenes folder is like this: N:\Renders\Scenes
On the rendering machine install of KeyShot, the Scenes folder is: N:\Renders\Scenes

So they both point to the same folder. What if I pointed the rendering machine to a dummy folder like this? N:\Renders\Scenes\Dummy Would this solve the issue? On the rendering machine I would open scenes from the N:\Renders\Scenes folder, but in the settings the Scenes folder for that install would be pointing to the N:\Renders\Scenes\Dummy folder.

I hope that makes some sense?  :o

Niko Planke

Hey Eric,

That makes perfect sense, and is likely the cause of the issue.

Your suggestion of using the Dummy folder should solve the issue.
As long as the root folder set in the settings is different, you should be fine.
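
To make the "different root folder" rule concrete, here is a small Python sketch that compares the Scenes folder setting of two installs. It only compares the path strings as Windows would (ignoring case and trailing separators), and it assumes the same drive letter maps to the same network share on both machines; the function name is made up for this example.

```python
from pathlib import PureWindowsPath


def same_scenes_folder(path_a, path_b):
    """Return True if two Scenes folder settings name the same Windows path.

    PureWindowsPath compares case-insensitively and ignores trailing
    separators. It does not resolve mapped drives or UNC aliases, so two
    different spellings of the same share would not be detected.
    """
    return PureWindowsPath(path_a) == PureWindowsPath(path_b)


# The setup described in this thread: both installs share one folder.
print(same_scenes_folder(r"N:\Renders\Scenes", r"N:\Renders\Scenes"))
# The proposed fix: the rendering machine points at its own subfolder.
print(same_scenes_folder(r"N:\Renders\Scenes", r"N:\Renders\Scenes\Dummy"))
```

The first call reports a shared folder and the second reports distinct folders, which matches the before and after of the fix discussed here.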

Ideally you would have the entire default library locally on the computer and then have shared folders on the network drive. You can configure shared folders via the preferences.

With shared libraries you always risk that one instance of KeyShot accesses a file while another instance is changing it, and that can cause a plethora of unexpected issues, including swapped or missing textures on renderings, licensing issues and crashes.

Eric Summers

Thank you for your help Niko! I really appreciate it!  :)

I have KeyShot installed on the local drive of each machine. I configured the shared folders thru the preferences.

Now I know what to watch for if I have issues pop up again.