Monday, January 09, 2006

Waiting on dbms_job

I've been using dbms_job more and more lately. This weekend I had an interesting problem.

Friday's job queue was running just fine. On Friday evening, I put about 200 jobs into the queue to run immediately and expected them to run for a couple of hours. I had job_queue_processes set to 4, so I knew that at most four jobs would be running at the same time.
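For reference, queueing a job to run immediately looks roughly like the sketch below. The procedure name my_batch_proc and its argument are hypothetical stand-ins, not the jobs I actually submitted. Note that the job does not become visible to the job queue until the submitting session commits.

```sql
-- Minimal sketch: submit one job to run as soon as a job queue process is free.
-- my_batch_proc is a hypothetical procedure used only for illustration.
DECLARE
  l_job BINARY_INTEGER;
BEGIN
  DBMS_JOB.SUBMIT(
    job       => l_job,               -- OUT: job number assigned by Oracle
    what      => 'my_batch_proc(1);', -- PL/SQL to execute (hypothetical)
    next_date => SYSDATE              -- eligible to run immediately
  );
  COMMIT;                             -- the queue only sees committed jobs
END;
/
```

With no interval supplied, the job runs once and is then removed from the queue.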

About 75 jobs into the run, no jobs were running. Nothing was in dba_jobs_running. I could see that jobs were still in the queue. I wasn't really sure what was going on, but I couldn't bounce the instance.
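The two checks described above can be sketched as the following queries. They assume a session with access to the DBA views; the column list is just the subset I find useful.

```sql
-- Jobs currently executing (one row per running job).
-- During the stall this returned no rows.
SELECT sid, job, this_date, failures
  FROM dba_jobs_running;

-- Jobs still waiting in the queue, earliest due first.
-- this_date is NULL for a job that is not executing right now.
SELECT job, what, next_date, broken, failures
  FROM dba_jobs
 WHERE this_date IS NULL
 ORDER BY next_date;
```

If the second query shows jobs with next_date in the past and broken = 'N' while the first returns nothing, the jobs are eligible to run but nothing is picking them up.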

I finally decided to set job_queue_processes to 0 with ALTER SYSTEM and waited until I no longer saw the job queue coordinator process (CJQn). I then set job_queue_processes to 2 rather than back to 4, as other work was running on the system by this time. CJQn then restarted and continued processing the jobs in the job queue.
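A sketch of that sequence follows. SCOPE = MEMORY is my assumption here, so that the change is not written to an spfile; it requires 9i or later. Checking v$bgprocess is one way to watch for the coordinator from inside the database, and looking for the process at the operating system level works as well.

```sql
-- Step 1: stop the job queue. SCOPE = MEMORY (an assumption) keeps the
-- change out of the spfile so the old value returns on the next restart.
ALTER SYSTEM SET job_queue_processes = 0 SCOPE = MEMORY;

-- Step 2: re-run this until it returns no rows, meaning the job queue
-- coordinator background process has gone away.
SELECT name, description
  FROM v$bgprocess
 WHERE name LIKE 'CJQ%'
   AND paddr <> '00';

-- Step 3: restart the queue with two job queue processes.
ALTER SYSTEM SET job_queue_processes = 2 SCOPE = MEMORY;
```

Any job that is still running when the parameter goes to 0 is allowed to finish; only new job starts are held back.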

1 comment:

Anonymous said...

Oh, I had the same problem on a Sun Solaris box. I don't know what you are running on. I did the same as you to fix it. I found that the j000 process is generally the one that gets corrupted or dies. However, Solaris has another interesting one: if your system has been up for more than 497 days, Oracle might decide to run no more jobs (an Oracle 9.2 problem). We also saw the same problem on one machine that had an uptime of only 120 days. The only way to fix it was to reboot the server. Your turn to go figure.