|
|
|
[
Permlink
| « Hide
]
Guillaume Berche - [28/Apr/05 02:26 AM ]
Output of the contributed TestConcurrency.testConcurrentStaleGets() method
The incremental changes (.patch file) and the full copy of modified source (.tar.gz file) for Cache and TestConcurrency files.
I confirm that the following brutal fix of "not removing the EntryUpdateState from the Cache.updateStates" indeed fixes the symptom.
As previously discussed, a more precise fix could only remove the EntryUpdateState from the Cache.updateStates once Cache.removeEntry() is called (and by safety notify all threads waiting on the removed EntryUpdateState instance). However it has the drawback of using slightly more memory because the size of the Cache.updateStates Map would be roughly equals to the Cache.cacheMap public void cancelUpdate(String key) { EntryUpdateState state; if (key != null) { synchronized (updateStates) { //state = (EntryUpdateState) updateStates.remove(key); state = (EntryUpdateState) updateStates.get(key); if (state != null) { synchronized (state) { state.cancelUpdate(); state.notify(); } } } } } This should be backported to 2.1.1 but let's see if this all works
Andres,
Thanks for your prmpt work on this fix! Could you please describe the fix that was brought into 2-1-1 to help me understand the alternative that you selected? I tried looking at the CVS repository log and could only find a fix on the Cache class: https://oscache.dev.java.net/source/browse/oscache/src/core/java/com/opensymphony/oscache/base/Cache.java?r1=1.13&r2=1.13.2.1&only_with_tag=v2_1_1_release Were there other modifications that I missed? I wonder whether how the following points are handled in this fix: 1 - how are the EntryStateUpdate entries removed from the updateStates map? I understand they are now only removed in the completeUpdate() method called from putInCache(). Would'nt it necessary to call completeUpdate() method when a CacheEntry is removed from the cacheMap as to avoid dangling EntryStateUpdate instances that could represent a memory leak? I could find the following cases for this CacheEntry instance removal (there may be others?): a) explicit removeEntry() method method call b) implicit removal from the cache when it is full. 2- is the following block of code in Cache.getFromCache() still necessary? I understand that it can be removed since the EntryStateUpdate is not removed from the updateStates map anymore between cancelled updates. Consequently, this old workaround would only decrease concurrency. // We put the updateState object back into the updateStates map so // any remaining threads waiting on this cache entry will be notified // once this thread has done its thing (either updated the cache or // cancelled the update). Without this code they'll get left hanging... synchronized (updateStates) { updateStates.put(key, updateState); } 3- was the modification on completeUpdate() related to this bug? I.e. the call to the EntryStateUpdate.startUpdate() if ! EntryStateUpdate.isUpdating() Thanks in advance for your help, Guillaume. I took a look and it seems you are correct. This fix could lead to a memory leak. That's what happens when I code late at night. I'm thinking of possible solutions but unfortunately the implicit removes have no ability to signal the Cache to remove the update state.
Andres,
Should this bug reopened to account for the memory leak introduced by the fix? Should rather a new bug opened? Would it be possible to move the EntryStateUpdate directly within the CacheEntry class and hence have it stored within Cache.cacheMap field. This way it can be removed the Cache.updateStates, and the implicit remove would automatically properly handle be clean up of the EntryStateUpdate associated with a key? Could a contributed patch help on this issue? Thanks in advance, Guillaume. We should open a new issue for the leak. Your suggestion is one of the problems I have been contemplating. By putting the state in the cache map, we would need to synchronize on the get() method to make sure the state is reflected accurately, thereby significantly reducing concurrency.
I am always open to any suggestions or patches. Andres,
Thanks for your response. I then open a new issue for the leak if you're OK. I don't understand why we would need to synchronize on the AbstractConcurrentReadCache.get() if the EntryStateUpdate reference is maintained as a final value within the CacheEntry? The same synchronization optimization currently done in AbstractConcurrentReadCache.get() could be left unchanged: from a key instance, AbstractConcurrentReadCache.get() retreives a CacheEntry instance. The EntryStateUpdate instance is then accessed from the CacheEntry instance and the accessor does not require anymore synchronization if the reference to the EntryStateUpdate is final. Rather, what worries me if the following comment on the current code for the "private Map updateStates = new HashMap();" declaration: * We need to isolate these here since the actual CacheEntry * objects may not normally be held in memory at all (eg, if no * memory cache is configured). Is this comment true? What would happen when "no memory cache is configured"? What would the Cache.getCacheEntry() method return in this case? Would that invalidate the solution to bug Thanks for your help, Guillaume. you might be right about the sync requirements but I need to look at this solution more closely. If you want to provide a patch, I will have a look.
But the comment is correct. This is a barrier to using the cachemap to store the update state. The issues were having come up very often and I am leaning towards rewriting the internals to be more clear and cleaner. This would be a major rework but given the types of problems that crop up because of the complex implementation I think it is worth it. What I'm suggesting is : - to replace Cache with an interface that exposes the api. - Create a base abstract implementation that gives basic functionality. It would use a single map to store the entry, its groups, and its state. This map would be a backport of the java.util.concurrent.ConcurrentHashMap. All entries would be in the map, although the entry's content may only exist on disk. - Port the exisiting algorithm classes to extend the class above instead of AbstractConcurrentReadCache. Andres,
> But the comment is correct. This is a barrier to using the cachemap to store the update state. Thanks for the confirmation. Looking in more details to the AbstractConcurrentReadCache.sput() method, I now better understand what would be the problem with storing the EntryStateUpdate within the CacheEntry instance (which is stored as a value within the AbstractConcurrentReadCache class: - if no memory cache is used, then the CacheEntry instance is read from disk each time it is requested. Since a new ObjectInputStream instance is created for each call to AbstractDiskPersistence.retreive(File file), then a new CacheEntry instance would be returned for each call. As a result, two consecutive calls to Cache.get() would return a different CacheEntry instance and therefore a different EntryStateUpdate instance, making it unsuiteable to be used as a shared monitor among current threads. Consequently, the current design for the disk-only persistence mode indeed does not support using the CacheEntry to store and share the EntryStateUpdate. > What I'm suggesting is : > - to replace Cache with an interface that exposes the api. > - Create a base abstract implementation that gives basic functionality. It would use a single map to store the entry, > its groups, and its state. This map would be a backport of the java.util.concurrent.ConcurrentHashMap. All entries would > be in the map, although the entry's content may only exist on disk. > - Port the exisiting algorithm classes to extend the class above instead of AbstractConcurrentReadCache. As a newcomer to the oscache source code, this seems a good direction to me :-) In the modifications you describe, I imagine the Map could store a "composite" object as values. This composite object would include the group, the EntryStateUpdate and a "description" of where the content is stored (direct reference for memory cache or "persistent reference" for persistent cache). However, wouldn't the backport of java.util.concurrent.ConcurrentHashMap be difficult because of its dependencies on JDK 1.5 synchronization constructs? > you might be right about the sync requirements but I need to look at this solution more closely. If you want to provide a patch, I will have a look. My best bet was to move the EntryStateUpdate into the CacheEntry, but we discussed the issues with that related to the "persistence-only cache mode". Do you know if the feature of "using a persistence-only cache" is widely used by oscache users? Could it be considered to make a special case for this feature (e.g. either temporary keep the updateState map only in this configuration, or temporary disable this feature until the major rework you describe is complete)? Thanks, Guillaume. Correct, the composite object would include the group(s), state, and either the content or a pointer to where it is stored.
The backport is done. I have been trying out the classes. It is endorsed by Doug Lea (the original creator) and does not depend on the memory model changes of 1.5. I don't think providing a memory only fix at this point is a good idea. It would be more constructive to put my time towards a more comprehensive fix. I do consider this issue a major one but I feel if we don't address the core problems then more will crop up. The question really is when could I get this all done. This weekend is out but I might be able to get to it next weekend. I'd more than willing to accept help with this. You may feel new to the code but it won't matter much in regards to what we want to change. Let me know if you are interested. I may create a branch in cvs to work this stuff out. Thanks for the precisions. If I understand well this refactoring would consist of the following steps:
0- change the concrete Cache class into an interface with same API and probably same name (as to preserve compatibility). Separate from concrete implementation (e.g. CacheImpl) 1- replace AbstractConcurrentReadCache (based on a modified version of the old Doug Lea class from concurrent lib) with new java.util.concurrent.ConcurrentHashMap, and port the current algorithms implemented in AbstractConcurrentReadCache subclasses (FIFO, LRU, Unlimited caches) as ConcurrentHashMap subclasses 2- change the values stored within the map from directly the content to a composite object that would include group(s), state, and either the content or a pointer to where it is stored. Remove the updateStates field While I think I understand steps #0 and #2 and the value they provide, I don't quite understand the details of step #1 and its benefits. Would the ConcurrentHashMap source code be modified to integrate oscache feature in the same way than AbstractConcurrentReadCache? What design can be imagined to make this simpler and clearer? Thanks for the offer to help out. I'll try to see if I am able to free some time any time soon, and will let you know soon. By the way, as discussed, I have opened bug Cheers, Guillaume. 0- right
1- the main difference would be instead of modifying the ConcurrentReadHashMap to use it as a private property of the new abstract cache impl. The map would be the backing store for the cache impl and hold the 3 things we mentioned. 2- right > While I think I understand steps #0 and #2 and the value they provide, I don't quite understand the details of step #1 and >its benefits. Would the ConcurrentHashMap source code be modified to integrate oscache feature in the same way than >AbstractConcurrentReadCache? No. The main purpose of this refactoring would be to consolidate the cache access code and detangle the cache logic from the concurrent hash map. >What design can be imagined to make this simpler and clearer? still working on it but this is what I have so far. Andres,
As we discussed, I have built an internal patch version (based on version 2-0-1, which I called internally 2-0-3) applying the following design: --------------------------- Since the EntryStateUpdate instances are used to coordinate threads during the cache miss/cache stale/putInCache/cancelUpdate conditions, the EntryStateUpdate instances should not be removed from the map until all threads being coordinated are done dealing with the key associated with a given EntryStateUpdate instance. As a result, an explicit usage count is maintained into the EntryStateUpdate, it represent the number of usages on a given EntryStateUpdate instance among the following: - looking up a stale entry - waiting for an update of an entry - [about to be] performing an update Note that a single Thread may potentially account for more than one usage count for a given EntryStateUpdate instance (this case was not optimized and usage is not bound to threads, clients are still responsible for making balanced number of calls to cancelUpdate() or putInCache()). When the usage count reaches zero, then the EntryStateUpdate instance is removed from the updateMap. -------------------------- I attach the diff ("cvs-diff-2-0-1_to_2-0-3.txt") along with a tgz containing the modified classes ("modified_sources_in_2-0-3.tgz"). It includes the modifications for bugs I hope this can help and could be integrated in a future release. Regards, Guillaume. Note: I was unable to add attachements to this bug (maybe because it is CLOSED), I have therefore added them into the
THe fix for this caused a small memory leak
Hello Andres,
You write that there is now a small memory leak. What does small means in this context and how dows this effect the running application? We are using oscache in a web-application, which should, in the best case, never restart or even worser crash. With a memory leak in the system you can't say when the memory is exceeded. Regards Thorsten | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||