History | Log In     View a printable version of the current page.  
Issue Details (XML | Word | Printable)

Key: QUARTZ-668
Type: Bug Bug
Status: Resolved Resolved
Resolution: Fixed
Priority: Critical Critical
Assignee: James House
Reporter: Dan Washusen
Votes: 0
Watchers: 1
Operations

If you were logged in you would be able to see more operations.
Quartz Scheduler

Threading issue around SimpleThreadPool.runInThread...

Created: 26/May/08 02:39 AM   Updated: 19/Aug/08 02:29 PM
Component/s: Thread Pools
Affects Version/s: 1.6.1
Fix Version/s: 1.6.1

File Attachments: 1. Text File quartz-668_thread_dump.txt (26 kb)

Environment:
java version "1.5.0_13"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_13-b05-237)
Java HotSpot(TM) Client VM (build 1.5.0_13-119, mixed mode, sharing)
Apple OS X 10.5 on a Macbook Pro
Quartz 1.6.1-RC1


 Description  « Hide
We are seeing thread dead locks when running integration tests on our quartz jobs. From the thread dump below it would appear that all the worker threads are waiting QuartzSchedulerThread and the QuartzSchedulerThread is waiting for a free worker thread.

I can work around the issue my ensuring that there's always one more worker thread than jobs to execute. This has obvious down sides when we have many jobs to execute..

{quote}
"QuartzScheduler_manager.clusteredScheduler-dansmac.local1211758611804_ClusterManager" prio=7 tid=0x01054c00 nid=0x875000 waiting on condition [0xb1194000..0xb1194d90]
at java.lang.Thread.sleep(Native Method)
at org.quartz.impl.jdbcjobstore.JobStoreSupport$ClusterManager.run(JobStoreSupport.java:3755)

"clusteredScheduler_QuartzSchedulerThread" prio=5 tid=0x0104ebd0 nid=0x874200 in Object.wait() [0xb1113000..0xb1113d90]
at java.lang.Object.wait(Native Method)
- waiting on <0x0d5785c0> (a java.lang.Object)
at org.quartz.simpl.SimpleThreadPool.runInThread(SimpleThreadPool.java:386)
- locked <0x0d5785c0> (a java.lang.Object)
at org.springframework.scheduling.quartz.SimpleThreadPoolTaskExecutor.execute(SimpleThreadPoolTaskExecutor.java:66)
at org.springframework.scheduling.quartz.LocalTaskExecutorThreadPool.runInThread(LocalTaskExecutorThreadPool.java:66)
at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:406)
- locked <0x0d5ecb50> (a java.lang.Object)

"SimpleThreadPoolWorker-5" prio=5 tid=0x0104e080 nid=0x873400 waiting for monitor entry [0xb1092000..0xb1092d90]
at org.quartz.core.QuartzSchedulerThread.signalSchedulingChange(QuartzSchedulerThread.java:202)
- waiting to lock <0x0d5ecb50> (a java.lang.Object)
at org.quartz.core.SchedulerSignalerImpl.signalSchedulingChange(SchedulerSignalerImpl.java:90)
at org.quartz.core.QuartzScheduler.notifySchedulerThread(QuartzScheduler.java:1736)
at org.quartz.core.JobRunShell.run(JobRunShell.java:287)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:531)

"SimpleThreadPoolWorker-4" prio=5 tid=0x0104db00 nid=0x9d3c00 waiting for monitor entry [0xb1011000..0xb1011d90]
at org.quartz.core.QuartzSchedulerThread.signalSchedulingChange(QuartzSchedulerThread.java:202)
- waiting to lock <0x0d5ecb50> (a java.lang.Object)
at org.quartz.core.SchedulerSignalerImpl.signalSchedulingChange(SchedulerSignalerImpl.java:90)
at org.quartz.core.QuartzScheduler.notifySchedulerThread(QuartzScheduler.java:1736)
at org.quartz.core.JobRunShell.run(JobRunShell.java:287)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:531)

"SimpleThreadPoolWorker-3" prio=5 tid=0x0104d6c0 nid=0x9d2e00 waiting for monitor entry [0xb0f90000..0xb0f90d90]
at org.quartz.core.QuartzSchedulerThread.signalSchedulingChange(QuartzSchedulerThread.java:202)
- waiting to lock <0x0d5ecb50> (a java.lang.Object)
at org.quartz.core.SchedulerSignalerImpl.signalSchedulingChange(SchedulerSignalerImpl.java:90)
at org.quartz.core.QuartzScheduler.notifySchedulerThread(QuartzScheduler.java:1736)
at org.quartz.core.JobRunShell.run(JobRunShell.java:287)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:531)

"SimpleThreadPoolWorker-2" prio=5 tid=0x0104d290 nid=0x9b6200 waiting for monitor entry [0xb0f0f000..0xb0f0fd90]
at org.quartz.core.QuartzSchedulerThread.signalSchedulingChange(QuartzSchedulerThread.java:202)
- waiting to lock <0x0d5ecb50> (a java.lang.Object)
at org.quartz.core.SchedulerSignalerImpl.signalSchedulingChange(SchedulerSignalerImpl.java:90)
at org.quartz.core.QuartzScheduler.notifySchedulerThread(QuartzScheduler.java:1736)
at org.quartz.core.JobRunShell.run(JobRunShell.java:287)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:531)

"SimpleThreadPoolWorker-1" prio=5 tid=0x0104cfd0 nid=0x9df400 waiting for monitor entry [0xb0e8e000..0xb0e8ed90]
at org.quartz.core.QuartzSchedulerThread.signalSchedulingChange(QuartzSchedulerThread.java:202)
- waiting to lock <0x0d5ecb50> (a java.lang.Object)
at org.quartz.core.SchedulerSignalerImpl.signalSchedulingChange(SchedulerSignalerImpl.java:90)
at org.quartz.core.QuartzScheduler.notifySchedulerThread(QuartzScheduler.java:1736)
at org.quartz.core.JobRunShell.run(JobRunShell.java:287)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:531)

"main" prio=5 tid=0x01001510 nid=0xb0801000 waiting for monitor entry [0xb07ff000..0xb0800188]
at org.quartz.core.QuartzSchedulerThread.signalSchedulingChange(QuartzSchedulerThread.java:202)
- waiting to lock <0x0d5ecb50> (a java.lang.Object)
at org.quartz.core.SchedulerSignalerImpl.signalSchedulingChange(SchedulerSignalerImpl.java:90)
at org.quartz.core.QuartzScheduler.notifySchedulerThread(QuartzScheduler.java:1736)
at org.quartz.core.QuartzScheduler.pauseAll(QuartzScheduler.java:1198)
at org.quartz.impl.StdScheduler.pauseAll(StdScheduler.java:482)
       .......
{quote}

Cheers,
Dan

 All   Comments   Change History      Sort Order:
Dan Washusen - [26/May/08 02:45 AM ]
Additional information:
Spring 2.5.4

Rough steps to reproduce the issue:
1. Set up a thread pool with 5 worker threads
2. Configure 6 jobs that run for around a minute each
3. Run the following code:
 for (String jobGroup : getScheduler().getJobGroupNames()) {
     for (String jobName : getScheduler().getJobNames(jobGroup)) {
        getScheduler().triggerJob(jobName, jobGroup);
     }
 }

James House - [03/Aug/08 04:48 PM ]

Dan,

I still cannot reproduce this. I've tried many, many times over the weeks since you reported it. I've also not had any other reports of this.

The interesting thing is that the thread dump you've posted here shows a scenario that is impossible (at least so far as designed and my brain can verify in the implementation).

The scenario shown in these stacks of the thread dump is:

* QuartzSchedulerThread is holding signalLock and waiting for a free thread in the pool in SimpleThreadPool.runInThread().
* All threads in the pool are waiting for signalLock

This indeed would be a dead lock.

The thing that makes this impossible (even though you obviously got it some how) is that runInThread() is called by QuartzSchedulerThread inside of this block:

int availTreadCount = qsRsrcs.getThreadPool().blockForAvailableThreads();
if(availTreadCount > 0) {
   ...
   ....qsRsrcs.getThreadPool().runInThread(shell)...
   ...
}

The contract of blockForAvailableThreads() is that it only returns values larger than zero (never zero or less) - and that the returned value is the number of threads available in the pool.

So... QuartzSchedulerThread should not be able to get to the call to runInThread() when the threads are exhausted, and it should therefore never block inside that method (because the wait() shown in QuartzSchedulerThread's stack above is only called if the pool is exhausted), and thus the deadlock shouldn't occur.

So how did this happen?

I'd be interested in seeing the rest of your thread dump -- for example I see that you have a call to pauseAll() in progress as well.

Does your app somehow gain hold of and use the SimpleThreadPool itself (such that the avail count can change between QuartzSchedulerThread's calls to blockForAvailableThreads() and runInThread())?

Help!?!?



Yuri Kadetov - [05/Aug/08 10:29 AM ]
I'm not sure that my problem (described here: http://forums.opensymphony.com/thread.jspa?threadID=402480&tstart=0) is the similar issue to this one.
Anyway, I have 5 worker threads and only 2 jobs in my JobStore. And still after some period of time worker threads become blocked one by one. After that MisfireHandler becomes blocked too, trying to hadle misfired trigger.

<... thread dump ...>

Thread Group: jboss : max priority:10, demon:false

  Thread: MyQuartzScheduler_Worker-1 : priority:5, demon:false, threadId:47, threadState:BLOCKED, lockName:java.lang.Object@1802282
  
      org.quartz.core.QuartzSchedulerThread.signalSchedulingChange(QuartzSchedulerThread.java:202)
      org.quartz.core.SchedulerSignalerImpl.signalSchedulingChange(SchedulerSignalerImpl.java:90)
      org.quartz.core.QuartzScheduler.notifySchedulerThread(QuartzScheduler.java:1736)
      org.quartz.core.JobRunShell.run(JobRunShell.java:287)
      org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:531)
   
  Thread: MyQuartzScheduler_Worker-2 : priority:5, demon:false, threadId:48, threadState:BLOCKED, lockName:java.lang.Object@1802282
  
      org.quartz.core.QuartzSchedulerThread.signalSchedulingChange(QuartzSchedulerThread.java:202)
      org.quartz.core.SchedulerSignalerImpl.signalSchedulingChange(SchedulerSignalerImpl.java:90)
      org.quartz.core.QuartzScheduler.notifySchedulerThread(QuartzScheduler.java:1736)
      org.quartz.core.JobRunShell.run(JobRunShell.java:287)
      org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:531)
  
  Thread: MyQuartzScheduler_Worker-3 : priority:5, demon:false, threadId:49, threadState:BLOCKED, lockName:java.lang.Object@1802282
  
      org.quartz.core.QuartzSchedulerThread.signalSchedulingChange(QuartzSchedulerThread.java:202)
      org.quartz.core.SchedulerSignalerImpl.signalSchedulingChange(SchedulerSignalerImpl.java:90)
      org.quartz.core.QuartzScheduler.notifySchedulerThread(QuartzScheduler.java:1736)
      org.quartz.core.JobRunShell.run(JobRunShell.java:287)
      org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:531)

  Thread: MyQuartzScheduler_Worker-4 : priority:5, demon:false, threadId:50, threadState:BLOCKED, lockName:java.lang.Object@1802282

      org.quartz.core.QuartzSchedulerThread.signalSchedulingChange(QuartzSchedulerThread.java:202)
      org.quartz.core.SchedulerSignalerImpl.signalSchedulingChange(SchedulerSignalerImpl.java:90)
      org.quartz.core.QuartzScheduler.notifySchedulerThread(QuartzScheduler.java:1736)
      org.quartz.core.JobRunShell.run(JobRunShell.java:287)
      org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:531)

  Thread: MyQuartzScheduler_Worker-5 : priority:5, demon:false, threadId:51, threadState:BLOCKED, lockName:java.lang.Object@1802282

      org.quartz.core.QuartzSchedulerThread.signalSchedulingChange(QuartzSchedulerThread.java:202)
      org.quartz.core.SchedulerSignalerImpl.signalSchedulingChange(SchedulerSignalerImpl.java:90)
      org.quartz.core.QuartzScheduler.notifySchedulerThread(QuartzScheduler.java:1736)
      org.quartz.core.JobRunShell.run(JobRunShell.java:287)
      org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:531)

  Thread: QuartzScheduler_MyQuartzScheduler-NON_CLUSTERED_MisfireHandler : priority:5, demon:false, threadId:53, threadState:BLOCKED, lockName:java.lang.Object@1802282

      org.quartz.core.QuartzSchedulerThread.signalSchedulingChange(QuartzSchedulerThread.java:202)
      org.quartz.core.SchedulerSignalerImpl.signalSchedulingChange(SchedulerSignalerImpl.java:90)
      org.quartz.impl.jdbcjobstore.JobStoreSupport.signalSchedulingChange(JobStoreSupport.java:3065)
      org.quartz.impl.jdbcjobstore.JobStoreSupport$MisfireHandler.run(JobStoreSupport.java:3823)

Thread Group: QuartzScheduler:MyQuartzScheduler : max priority:10, demon:false

    Thread: MyQuartzScheduler_QuartzSchedulerThread : priority:5, demon:false, threadId:52, threadState:TIMED_WAITING, lockName:java.lang.Object@960f0e

        java.lang.Object.wait(Native Method)
        org.quartz.simpl.SimpleThreadPool.blockForAvailableThreads(SimpleThreadPool.java:416)
        org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:251)

<... thread dump ...>

Full thread dump is too large to post it here. May I send it by email? Or I can attach it to this issue.

James House - [05/Aug/08 11:22 AM ]

yes, please attach the full thread dump to the issue. What you've posted is interesting, but I would really like to see the full dump.

James House - [05/Aug/08 12:05 PM ]
I believe I've reasoned out how this is happening, even though I can't reproduce it.

I'll work toward a fix.

Thank you both for the info - it is the same issue, just manifested in two slightly different ways.

James House - [05/Aug/08 07:06 PM ]
Actually, strike my last comment.... I still can't figure out how this is possible. :(

(and still can't reproduce)

The thread dumps do not show a deadlock, just a wait, and as I described a few comments earlier, that wait should never occur, because a thread should have been guaranteed to be available. :(

Can you provide more platform info about what hardware and OS you are running on, Yuri?

Yuri Kadetov - [06/Aug/08 04:00 AM ]
Issue reproduction thread dump

Yuri Kadetov - [06/Aug/08 04:11 AM ]
Here is server info:

AvailableProcessors 2
OSArch x86
OSVersion 5.2
JavaVendor Sun Microsystems Inc.
JavaVMName Java HotSpot(TM) Client VM
FreeMemory 100870608
ActiveThreadGroupCount 7
TotalMemory 133365760
JavaVMVersion 1.5.0_10-b03
ActiveThreadCount 47
JavaVMVendor Sun Microsystems Inc.
OSName Windows 2003
JavaVersion 1.5.0_10
MaxMemory 532742144

The issue have been reproduced on JBoss 4.2.2GA and JBoss 4.2.0GA

I'm using
JobInitializationPluggin
2 jobs triggered periodically by cron triggers
JobStoreCMT on MSSQL

James House - [19/Aug/08 02:29 PM ]
Issue was caused by some of the threading re-work in 1.6.1 RC1, have now re-worked it some more. ;-)