NR master stalls and does not process queue

Started by Gordon, January 22, 2019, 01:20:35 PM

Gordon

Posting this in case anyone else runs into a similar problem with Network Rendering stalling/hanging and refusing to pick up and start the next job in the queue. In v7 we saw this every month or so, but in v8 it is happening almost daily (we send roughly 10-20 jobs to the master each day). If I stop and restart the service it processes normally again, but what I have found is that the stall seems to be related to either the logging or to running the Network Monitor on the same server as the master. I changed my master's logging (found in NR configuration) to log only critical events and deleted the old (large) log files. I also no longer run the Network Monitor on the master server and use a different machine to monitor the server's progress. So far it's been stable for a week since these two changes. Will update if it hangs again.
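
For anyone who wants to check for oversized log files before deleting them, here is a minimal Python sketch. It is not a KeyShot tool, and it assumes the logs are ordinary files ending in .log. The function name large_files, the 100 MB threshold, and the pattern are all hypothetical choices; you have to supply the folder where your own master writes its logs.

```python
from pathlib import Path


def large_files(folder, min_mb=100.0, pattern="*.log"):
    """Return [(path, size_mb)] for matching files of at least min_mb, largest first.

    Assumption: logs are plain files matching `pattern` somewhere under `folder`.
    """
    hits = []
    for p in Path(folder).rglob(pattern):
        if p.is_file():
            size_mb = p.stat().st_size / 1024**2
            if size_mb >= min_mb:
                hits.append((p, size_mb))
    # Largest files first so the worst offenders are at the top.
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Calling large_files with your log folder and looping over the result to print each path and size gives a quick list of what is worth clearing out. It only reports; it does not delete anything.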


Niko Planke

Hey,

I can see that you are using KeyShot Network Rendering 8.1.
We have recently released KeyShot 8.2.

It includes multiple stability and performance improvements for KeyShot Network Rendering.
I suggest updating to the latest version and checking whether the problem still occurs.

DMerz III

Hey Gordon, we ran into similar issues with 8.1 hanging up as you said. We had the most success with making sure the jobs we sent showed status 'ready' on the monitor before we touched anything on our workstations, especially the active KeyShot application (this includes saving or trying to do more work). It was a pain in the rear, but it seemed to help with the hung jobs. So if you're sending something, make sure all jobs are completed on the master before moving on. Maybe that will help?

All of that being said, once we moved to 8.2 Network Rendering and KeyShot 8.2 we've had way more stability. It's only been a couple of weeks for us, but I don't recall any issues since we updated.

So I second what Niko says.

mattjgerard

8.2 has been solid for me as well. We tend to go in spurts, so we won't use it much for a couple of days but then barrage it with dozens of images, and I've had earlier versions just stop on me under certain circumstances. David, my hangs in previous versions might be related to what you describe, as I'd fire images off as soon as I was ready, then move to the next one and fire it off as fast as I could. They would all be set up and ready to go, but have different cameras etc. This was before I discovered studios; now I add everything to the queue using model sets and studios, and let 'er go.

But, yes, 8.2 is def a more stable release.

ikach

I'm having quite similar issues.
My Network Monitor has suddenly stopped rendering any of the queue.
It just sends/archives and that's it.

It says I need to update my MASTER.
I've already uninstalled and then downloaded the latest KeyShot and Network Monitor,
still having the same issues.

Was working great till about two days back

Irfan




Nils Piirma

Usually when this happens, it's one of three things:

The database is full and can't fully run, so the master service needs to be re-initialized.

The queue is too long and needs to be cleared.

The master server needs to be rebooted to clear the cache and re-initialize the jobs.

mattjgerard

Quote from: Nils Piirma on July 08, 2019, 01:37:22 AM
Usually when this happens, it's one of three things:

The database is full and can't fully run, so the master service needs to be re-initialized.

The queue is too long and needs to be cleared.

The master server needs to be rebooted to clear the cache and re-initialize the jobs.

I finally figured out that my render node only has a 256 GB SSD in it, and with 2 artists sending jobs the drive would fill up after a couple of days. Deleting the jobs from the queue fixes it. NR doesn't handle full drives very well, and doesn't have any way of letting you know that's the problem.
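
Since NR won't warn about a full drive, a small outside check can do it. Below is a rough Python sketch using only the standard library; it is not part of KeyShot. The function name free_space_report and the 20 GB threshold are made-up values for illustration, and you would point it at whichever drive holds your NR job data.

```python
import shutil


def free_space_report(path, min_free_gb=20.0):
    """Return (free_gb, is_low) for the drive that contains `path`.

    Assumption: min_free_gb is an arbitrary threshold; pick one that fits your jobs.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    free_gb = usage.free / 1024**3
    return free_gb, free_gb < min_free_gb


if __name__ == "__main__":
    # "." is a placeholder; replace it with the drive or folder NR stores jobs on.
    free_gb, is_low = free_space_report(".")
    status = "LOW, clear old jobs from the queue" if is_low else "ok"
    print(f"{free_gb:.1f} GB free ({status})")
```

Run on a schedule on the render node (Task Scheduler or cron), this at least tells you when the drive is getting close to full before the queue stalls.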