
OSCache
Created: 23/Nov/05 04:29 AM
Updated: 21/Jan/07 01:55 PM
Component/s: Listeners
Affects Version/s: 2.1.1
Fix Version/s: 2.3
File Attachments: 1. stack.txt (63 kb)
Environment: Solaris 9, Sun JDK 1.5.0_05 (but should apply to any platform)
Description
We ran our application (a Tomcat webapp) as root by mistake. When we restarted Tomcat as the normal restricted user, it was totally unresponsive and completely occupied one of the two CPUs.
Another restart didn't help either; the behaviour was the same.
After examining the stack trace I concluded that OSCache was hanging in AbstractConcurrentReadCache.persistStoreGroup() because of missing write permissions: the cache files had been created during the previous run with root as the owner.
After changing the owner of the cache files and restarting Tomcat, it worked well.
I think OSCache should fail fast in such a scenario instead of blocking forever, i.e. the while (groupFile.exists() && !groupFile.delete()) {} loop in AbstractDiskPersistenceListener.store() should be changed.
As a suggestion, I've changed it to the following, but haven't tested it yet:
    int count = 0;
    while (file.exists() && !file.delete() && count < 3) {
        count++;
        try {
            Thread.sleep(100);
        } catch (InterruptedException ignore) {}
    }
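The suggested loop stops after three attempts but then continues silently, so the caller never learns that the stale file is still there. A minimal sketch of a variant that reports the failure instead is given here. The class BoundedDelete, the method deleteOrFail, and its parameters are hypothetical names used only for illustration; they are not part of OSCache, and this is not necessarily how the issue was fixed.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical helper for illustration only; not an OSCache class. */
final class BoundedDelete {
    private BoundedDelete() {}

    /**
     * Tries to delete the file at most maxAttempts times, pausing between
     * attempts, and throws if the file still exists afterwards.
     */
    static void deleteOrFail(File file, int maxAttempts, long pauseMillis)
            throws IOException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // Done if the file is already gone or this attempt removed it.
            if (!file.exists() || file.delete()) {
                return;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException e) {
                    // Keep the interrupt visible to the caller and stop retrying.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        if (file.exists()) {
            // Fail fast instead of spinning or silently carrying on.
            throw new IOException("Could not delete " + file.getAbsolutePath()
                    + " (check ownership and write permissions)");
        }
    }
}
```

A caller would replace the unbounded loop with something like BoundedDelete.deleteOrFail(groupFile, 3, 100L) and let the IOException propagate or wrap it in whatever exception type the persistence layer uses.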
|
IMO this part is the problem:
"Thread t@60: (state = IN_NATIVE)
- java.io.UnixFileSystem.getBooleanAttributes0(java.io.File) @bci=0 (Compiled frame; information may be imprecise)
- java.io.UnixFileSystem.getBooleanAttributes(java.io.File) @bci=2, line=228 (Compiled frame)
- com.opensymphony.oscache.base.algorithm.AbstractConcurrentReadCache.persistStoreGroup(java.lang.String, java.util.Set) @bci=80, line=1136 (Interpreted frame)
- com.opensymphony.oscache.base.algorithm.AbstractConcurrentReadCache.addGroupMappings(java.lang.String, java.util.Set, boolean, boolean) @bci=142, line=1532 (Interpreted frame)"
All other calls to AbstractConcurrentReadCache.put are blocked because thread 60 is still inside the synchronized(this) block in AbstractConcurrentReadCache.put.
Note: I think the stack trace reported by jstack for thread 60 is incomplete, since the thread must actually be in AbstractDiskPersistenceListener.storeGroup; there is no file operation in AbstractConcurrentReadCache.persistStoreGroup.
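The blocking effect described above can be reproduced in isolation. The following is only a sketch under stated assumptions: MonitorBlockDemo and observeWaiterState are hypothetical names, a plain Object stands in for the cache instance, and a latch stands in for the never-ending delete loop. It shows that a second thread trying to enter the same synchronized block is reported as BLOCKED for as long as the first thread stays inside it.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical demo for illustration only; not OSCache code. */
final class MonitorBlockDemo {
    private MonitorBlockDemo() {}

    /** Returns the state observed for a thread waiting on a held monitor. */
    static Thread.State observeWaiterState() throws InterruptedException {
        final Object monitor = new Object();          // stands in for the cache
        final CountDownLatch holderInside = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        // Plays the role of thread 60: enters the monitor and stays there.
        Thread holder = new Thread(() -> {
            synchronized (monitor) {
                holderInside.countDown();
                try {
                    release.await();                  // stands in for the stuck loop
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });

        // Plays the role of any other caller of put().
        Thread waiter = new Thread(() -> {
            synchronized (monitor) {
                // Reached only after the holder leaves the monitor.
            }
        });

        holder.start();
        holderInside.await();
        waiter.start();

        // Poll until the waiter is parked on the monitor (bounded to 5 seconds).
        Thread.State observed = waiter.getState();
        long deadline = System.nanoTime() + 5_000_000_000L;
        while (observed != Thread.State.BLOCKED && System.nanoTime() < deadline) {
            Thread.sleep(10);
            observed = waiter.getState();
        }

        release.countDown();                          // let both threads finish
        holder.join();
        waiter.join();
        return observed;
    }
}
```

In the reported hang there is no equivalent of the release step, which is why every other request thread stays blocked until the file ownership is corrected.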