Hani Suleiman - [19/Jun/03 01:21 PM ]
Does the content appear correctly without the cache tag? If yes, then can you submit a sample jsp which illustrates this error? Thanks!
Sorry, I meant: does the content appear correctly without the cache filter in place?
I have experienced the same problem. If I access the content via the filter, then I get question marks '?' instead of special characters. If I access the content directly, then it works ok.
I think I have a fix for this problem.
The class that needs to be modified is com.opensymphony.oscache.web.filter.CacheHttpServletResponseWrapper. The problem is that this class creates a Writer without specifying a character encoding in the constructor, so the default encoding for the platform is used. This can be a problem if the default encoding does not support characters with accents (and this is the case on Solaris if the env. variable LC_CTYPE="C"). So the fix is to specify a valid encoding in the constructor. One possibility is to use the encoding of the HttpResponse...

    public PrintWriter getWriter() throws IOException {
        if (cachedWriter == null) {
            /*
             * The following line has been commented out. It does not work, because it uses the
             * default encoding for the platform, which can cause problems with special characters.
             * On Solaris, for instance, if the env. variable LC_CTYPE is set to "C", then the
             * encoding for the platform is US-ASCII-C (ISO646-US). In that case, special characters
             * (e.g. characters with accents in French) are replaced by question marks ('?').
             */
            //cachedWriter = new PrintWriter(getOutputStream());

            String platformEncoding = System.getProperty("file.encoding");
            String responseEncoding = getResponse().getCharacterEncoding();

            /*
             * These lines have been added. Instead of using the platform encoding when creating
             * the Writer, we explicitly specify (in the constructor) the same encoding as the one
             * used for the HTTP response.
             */
            log.info("The character encoding for the platform is " + platformEncoding);
            log.info("The character encoding for the HTTP response is " + responseEncoding);
            log.info("Creating a writer with HTTP response encoding: " + responseEncoding);
            cachedWriter = new PrintWriter(new OutputStreamWriter(getOutputStream(), responseEncoding));
        }
        return cachedWriter;
    }

This method solved only half of the problem.
I'm using UTF-8 for my JSP content. With the modifications above, the first retrieval of the JSP works fine, but the second time, when the cached content is used, the characters are not displayed correctly. They are not '?'s but garbage characters, which suggests that the cached content is not being sent with the character set it should use. Whenever cached JSP content is served, the HTTP Content-Type header captured by the browser is Content-Type: text/html;charset=ISO-8859-1, even though UTF-8 is declared in the JSP pageEncoding and in the HTML <meta> tags. OSCache made two major mistakes. First, it didn't bother to preserve any of the headers from the original response in the cached response. The other is that it cheerily ignored the encoding my servlet requested, instead using an OutputStream into a byte array.
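Both symptoms reported so far (question marks, then garbage characters) can be reproduced outside a servlet container. The following is only a minimal sketch using the JDK; the class EncodingDemo and its methods are made up for illustration and are not part of OSCache.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not OSCache code.
class EncodingDemo {

    // Writes text through a PrintWriter bound to an explicit charset and
    // returns the bytes that reach the underlying stream.
    static byte[] write(String text, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        PrintWriter writer = new PrintWriter(new OutputStreamWriter(out, charset));
        writer.print(text);
        writer.flush();
        return out.toByteArray();
    }

    // Decodes bytes the way a client would, trusting the declared charset.
    static String read(byte[] bytes, Charset declared) {
        return new String(bytes, declared);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "cafe" with an acute accent on the e

        // Writer created with an ASCII-only charset: the accent cannot be
        // encoded and is replaced by '?'.
        System.out.println(read(write(text, StandardCharsets.US_ASCII),
                StandardCharsets.US_ASCII));

        // Bytes written as UTF-8 but announced as ISO-8859-1: each byte of
        // the two-byte sequence is decoded as a separate character.
        System.out.println(read(write(text, StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1));

        // Same charset on both sides: the text survives unchanged.
        System.out.println(read(write(text, StandardCharsets.UTF_8),
                StandardCharsets.UTF_8));
    }
}
```

The first case yields the string "caf?", matching the question marks seen with an ASCII platform default. The second yields "cafÃ©", the kind of garbage seen when UTF-8 bytes are served with charset=ISO-8859-1.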
We too had the same problem (with 2.0.2) and already resolved it by subclassing the wrapper.
Essentially, the wrapper intercepts only the "setContentType" call, but servlets can set the response content type in other ways:
- addHeader
- setHeader
By intercepting these as well we solved the problem, because the content type is recorded and then re-sent to the client. Without that, the client misinterprets the page content because the Content-Type header is missing.
If you wish, I can provide my (working) patch to fix this (against 2.0.2). The patch has been tested extensively and successfully against ISO-8859-1 and ISO-8859-11; minor testing has also been done for UTF-8.
I did it as a sub-class of the original wrapper in order to keep it separated from the main OS-Cache jar. My own filter obviously used this wrapper instead of the original one.
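The idea of recording the content type no matter which method sets it can be sketched roughly as follows. This is not the submitted patch: ContentTypeRecorder is a hypothetical stand-alone class that only mirrors the three relevant method names, so that it runs without the servlet API.

```java
// Hypothetical stand-in for a response wrapper; a real subclass would extend
// the OSCache wrapper and also delegate each call to super.
class ContentTypeRecorder {

    // Last content type seen, or null if none has been set yet.
    private String contentType;

    void setContentType(String value) {
        contentType = value;
    }

    // A servlet may set the content type as a plain header instead of
    // calling setContentType, so header calls are inspected as well.
    void setHeader(String name, String value) {
        recordIfContentType(name, value);
    }

    void addHeader(String name, String value) {
        recordIfContentType(name, value);
    }

    String getContentType() {
        return contentType;
    }

    // HTTP header names are case-insensitive.
    private void recordIfContentType(String name, String value) {
        if ("Content-Type".equalsIgnoreCase(name)) {
            setContentType(value);
        }
    }
}
```

With this in place, a call such as setHeader("content-type", "text/html;charset=UTF-8") leaves the same recorded value as setContentType would, so the value can be stored alongside the cached bytes and re-sent later.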
Further comments on the mini-patch I submitted.
As you can read in the comments, the OSCache filter holds the cached response in a byte[], which might seem incorrect to some of us. The patch I submitted essentially guarantees that the cached response is sent with the same content type as the response to the original request, so a byte[] will work. However... what if the first client accepted only ISO-8859-1 and the new client accepts only CP-1251? The current implementation will respond with ISO-8859-1. Most of the time this will _not_ be a problem, since modern browsers support a wide variety of encodings; however, exotic devices/clients (e.g. Chinese browsers, PDAs...) might be affected. Alas, the solution is not simple, because ResponseContent.writeTo() does not know the request's acceptable content types, so it cannot translate the response encoding on the fly, which could otherwise be done simply by converting (byte[], orig-ContentType) => String => (byte[], dest-ContentType).
1.) I added Simone's changes to my current code: I put "result.setContentType(value);" in addHeader and setHeader. But should "super.setContentType(value)" be invoked as well?
2.) We need to factor in the encodings of the first client and the new client. But we also have to avoid changing content that has a content type such as "image/gif". Does anybody know which responses we have to correct (e.g. "text/html", "text/plain")? Every response with a content type of "text/*"? Simone's changes are a first attempt at resolving this problem. Once gzip compression is introduced and yet another content type has to be handled, we may get the chance to react to different encodings in ResponseContent.
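The (byte[], orig) => String => (byte[], dest) conversion discussed above, limited to "text/*" responses so that binary content such as "image/gif" is left alone, might look roughly like this. ResponseTranscoder and its methods are hypothetical; whether "text/*" is the right test is exactly the open question raised above, and the sketch simply assumes it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, not part of OSCache.
class ResponseTranscoder {

    // Assumption: only "text/*" content types carry re-encodable characters.
    static boolean isTextual(String contentType) {
        return contentType != null
                && contentType.trim().toLowerCase(Locale.ROOT).startsWith("text/");
    }

    // Re-encodes cached bytes from the charset they were stored in to the
    // charset wanted by the new client. Non-text content is returned as-is.
    static byte[] transcode(byte[] cached, String contentType, Charset from, Charset to) {
        if (!isTextual(contentType) || from.equals(to)) {
            return cached;
        }
        return new String(cached, from).getBytes(to);
    }

    public static void main(String[] args) {
        byte[] latin1 = { (byte) 0xE9 }; // e with acute accent in ISO-8859-1
        byte[] utf8 = transcode(latin1, "text/html",
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(utf8.length); // two bytes in UTF-8
    }
}
```

Characters that the target charset cannot represent would still come out as replacement characters, so a conversion like this does not remove the need to know what the new client actually accepts.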
1) I called super.setContentType(...) because I needed to notify the original wrapper of the content type (it was recorded only there). I don't know how the underlying J2EE container handles content types, but I don't expect the call to be needed once you override the setHeader and addHeader methods directly within OSCache's wrapper.
2) I don't fully understand the question. However, the problem is that what we have in the cache is valid only for the recorded content type. If the browser requests a different content type, the application might have to be invoked again (e.g. different HTML for different content types). I don't know of a public API to query the container for content-type compatibility :-(
charset needs to be set from the content-type
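Deriving the charset from a recorded Content-Type value could be sketched as below. ContentTypeCharset is a hypothetical helper, and the parsing is deliberately simple (it splits on ';' and looks for a charset parameter); it is not a complete media-type parser.

```java
// Hypothetical helper, not part of OSCache.
class ContentTypeCharset {

    // Returns the charset parameter of a Content-Type value such as
    // "text/html;charset=UTF-8", or the fallback if none is present.
    static String charsetOf(String contentType, String fallback) {
        if (contentType == null) {
            return fallback;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.regionMatches(true, 0, "charset=", 0, 8)) {
                String value = param.substring(8).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? fallback : value;
            }
        }
        return fallback;
    }
}
```

For example, charsetOf("text/html;charset=UTF-8", "ISO-8859-1") gives "UTF-8", while a bare "text/html" falls back to "ISO-8859-1".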
I don't use the cache filter and we don't as of yet have unit tests for this area, so it would be great if someone could try this fix out.
We're currently in production with 2.1 plus the patch I submitted, using ISO-8859-15 (Italian accents + the € symbol), and everything is fine. If someone can provide a stable snapshot, I will try our portal solution with it.