I have no special talent. I'm only passionately curious - Albert Einstein
Character Encodings: Get rid of the question marks

Have you ever been to a web site and seen question marks where you would expect to see a single or double quote, or some other character?  Have you had to deal with the problem yourself?  It was very frustrating for me to resolve.  After a lot of research and configuration changes, I finally have it pinned down.
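
To see where the question marks come from, here is a minimal sketch (the class name and sample text are made up for illustration).  It encodes a string containing curly quotes with a charset that cannot represent them, then decodes it again.  Java substitutes a question mark for every character the charset cannot map, which is the same symptom you see on the page.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class QuestionMarkDemo {
    // Encodes the text to bytes with the given charset, then decodes it back
    // with the same charset. Characters the charset cannot represent are
    // replaced by '?' during the encoding step.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // Hypothetical sample: "Hello" wrapped in curly double quotes.
        String text = "\u201CHello\u201D";

        // ISO-8859-1 has no curly quotes, so both are lost.
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1)); // ?Hello?

        // UTF-8 can represent them, so the text survives unchanged.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text)); // true
    }
}
```

Once a character has been replaced this way the original is gone, which is why every layer that touches the text has to agree on the encoding.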

To resolve this problem, you will need to set your character encoding to UTF-8 (or UTF-16 if you need to support that encoding).  You will also need to set the character encoding in your database to UTF-8 (read your database-specific documentation on how to do this).  For MySQL, I read this documentation.  Finally, you need to set the charset attributes on the database connection.  I am using Hibernate, and I found that you can set the encoding using the following properties (in Spring):

<prop key="hibernate.connection.useUnicode">true</prop>
<prop key="hibernate.connection.characterEncoding">UTF-8</prop>
<prop key="hibernate.connection.charSet">UTF-8</prop>
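
If you configure Hibernate in code rather than through Spring XML, the same three settings can be collected in a java.util.Properties object.  This is only a sketch: the class and method names are made up for illustration, and how you hand the properties to Hibernate (for example through Configuration.addProperties) depends on your setup.

```java
import java.util.Properties;

class ConnectionEncodingProps {
    // Builds the same three connection properties shown in the Spring snippet.
    static Properties utf8ConnectionProperties() {
        Properties props = new Properties();
        props.setProperty("hibernate.connection.useUnicode", "true");
        props.setProperty("hibernate.connection.characterEncoding", "UTF-8");
        props.setProperty("hibernate.connection.charSet", "UTF-8");
        return props;
    }

    public static void main(String[] args) {
        // Print each key/value pair so the settings can be checked at a glance.
        utf8ConnectionProperties().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```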

If you're using Tomcat, you will also want to set the default character encoding on your connector. Set the URIEncoding attribute on the <Connector> element in server.xml to something specific (e.g. URIEncoding="UTF-8"). Read this Tomcat Wiki page for more information.
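
To illustrate why the connector setting matters, the sketch below decodes the same percent-encoded value twice, once as UTF-8 and once as ISO-8859-1 (which older Tomcat releases fall back to when URIEncoding is not set).  The class name and sample value are made up, and this only mimics the decoding step with java.net.URLDecoder; it is not Tomcat's own code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class UriDecodingDemo {
    // Decodes a percent-encoded value using the named charset.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample: "cafe" with an accented e, percent-encoded as UTF-8 bytes.
        String encoded = "caf%C3%A9";

        // Decoded as UTF-8, the two bytes become one character (U+00E9).
        System.out.println(decode(encoded, "UTF-8").equals("caf\u00E9")); // true

        // Decoded as ISO-8859-1, each byte becomes its own (wrong) character.
        System.out.println(decode(encoded, "ISO-8859-1").equals("caf\u00C3\u00A9")); // true
    }
}
```

The wrong charset here produces garbled characters rather than question marks, but the cause is the same: the bytes were written in one encoding and read in another.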

 

Update November 30th, 2009:

Finally, if you have the opportunity and you're using Spring, add the Spring CharacterEncodingFilter so that all incoming request values are decoded as UTF-8 by default, as described here.



About

David Malone is a Java developer residing in the Twin Cities area.  He has been developing enterprise applications since 2004.  This is his personal blog, as well as his design and development workspace.