I have no special talent. I'm only passionately curious - Albert Einstein
November 04, 2009
Character Encodings: Get rid of the question marks
Posted by Dave Malone in Hibernate, Spring, Configuration, Java
Have you ever visited a web site and seen question marks where you expected a single or double quote, or some other character? Have you had to fix the problem yourself? I found it very frustrating to resolve, but after a lot of research and configuration changes, I finally have it pinned down.
To resolve this problem, you will need to set your application's character encoding to UTF-8 (or UTF-16 if you need to support that encoding). You will also need to set the character encoding in the database to UTF-8 (read your database-specific documentation on how to do this). For MySQL, I read this documentation. Finally, you need to set the character encoding on the database connection. I am using Hibernate, and I found that you can set the encoding with the following properties (in Spring):
<prop key="hibernate.connection.useUnicode">true</prop>
<prop key="hibernate.connection.characterEncoding">UTF-8</prop>
<prop key="hibernate.connection.charSet">UTF-8</prop>
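To see why a mismatched encoding produces question marks in the first place, here is a minimal plain-Java sketch. It does not touch Hibernate or a database; the class name EncodingDemo and the method roundTrip are made up for illustration. It pushes a string containing curly quotes through a charset that cannot represent them (ISO-8859-1) and then through UTF-8:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Hibernate or Spring.
class EncodingDemo {
    // Encode the text to bytes in the given charset and decode it back,
    // mimicking a write/read cycle through a layer that uses that charset.
    static String roundTrip(String text, Charset charset) {
        byte[] stored = text.getBytes(charset);
        return new String(stored, charset);
    }

    public static void main(String[] args) {
        // "hello" wrapped in left and right curly double quotes
        String quoted = "\u201Chello\u201D";

        // ISO-8859-1 has no curly quotes, so each one is replaced by '?'
        System.out.println(roundTrip(quoted, StandardCharsets.ISO_8859_1)); // ?hello?

        // UTF-8 can represent every character, so nothing is lost
        System.out.println(roundTrip(quoted, StandardCharsets.UTF_8).equals(quoted)); // true
    }
}
```

The replacement happens at the moment of encoding, so once a question mark has been stored there is no way to recover the original character. That is why every layer has to agree on UTF-8 before the data is written.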
If you're using Tomcat, you will also want to set the character encoding on your connector explicitly rather than relying on the default. Set the URIEncoding attribute on the <Connector> element in server.xml (e.g. URIEncoding="UTF-8"). Read this Tomcat Wiki page for more information.
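As a rough illustration of what is at stake with URIEncoding, the sketch below uses the JDK's java.net.URLDecoder (not Tomcat itself) to decode the same percent-encoded value with two different charsets. The class name UriDecodingDemo and the sample value are assumptions for the example; it assumes the browser sent the value encoded as UTF-8:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; it only mimics the decoding step a connector performs.
class UriDecodingDemo {
    // Decode a percent-encoded query value using the named charset.
    static String decode(String encoded, String charsetName)
            throws UnsupportedEncodingException {
        return URLDecoder.decode(encoded, charsetName);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // "caf" followed by e-acute, percent-encoded as the two UTF-8 bytes C3 A9
        String encoded = "caf%C3%A9";

        // Decoding with the charset the browser used restores the original text
        System.out.println(decode(encoded, "UTF-8"));

        // Decoding with ISO-8859-1 turns each byte into its own character,
        // giving "caf" followed by two unrelated characters (U+00C3 and U+00A9)
        System.out.println(decode(encoded, "ISO-8859-1"));
    }
}
```

If the connector decodes with a different charset than the one the browser used, the request parameter is already garbled before it reaches your application code, and no later setting can repair it.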
Update November 30th, 2009:
Finally, if you have the opportunity and you're using Spring, add the Spring CharacterEncodingFilter to change your default encoding for all incoming request values to UTF-8, as described here.