[cvs] expresso commit by mtraum: added UTF-8 section contributed by Gregor

JCorporate Ltd jcorp at jcorporate.com
Tue Mar 8 03:25:54 UTC 2005


Log Message:
-----------
added UTF-8 section contributed by Gregor Polančič

Modified Files:
--------------
    expresso/expresso-web/expresso/doc/edg:
        internationalization.xml

Revision Data
-------------
Index: internationalization.xml
===================================================================
RCS file: /home/javacorp/.cvs/expresso/expresso/expresso-web/expresso/doc/edg/internationalization.xml,v
retrieving revision 1.11
retrieving revision 1.12
diff -Lexpresso-web/expresso/doc/edg/internationalization.xml -Lexpresso-web/expresso/doc/edg/internationalization.xml -u -r1.11 -r1.12
--- expresso-web/expresso/doc/edg/internationalization.xml
+++ expresso-web/expresso/doc/edg/internationalization.xml
@@ -1,7 +1,6 @@
 <chapter id='internationalization' xreflabel='Internationalization'>
 	<title>Internationalization</title>
 	<para>
-
 		<note>
 			<para>
 If you find this EDG documentation helpful please consider <link linkend='donate_internationalization'>DONATING</link>!
@@ -10,15 +9,13 @@
 		</note>
 	</para>
 	<para>
-
 		<informaltable colsep='0' frame='none' pgwide='1' rowsep='0'>
 			<tgroup cols='2' colsep='0' rowsep='0'>
 				<colspec align='left' colsep='0' colwidth='50%' rowsep='0' />
 				<colspec align='right' colsep='0' colwidth='50%' rowsep='0' />
 				<tbody>
 					<row>
-						<entry>
-						</entry>
+						<entry />
 						<entry>
 <emphasis role='bold'>Maintainer:</emphasis><ulink url='mailto:dlloyd at jgroup.net?Subject=EDG'><emphasis
 role='maintainer'>David Lloyd</emphasis></ulink>
@@ -279,6 +276,133 @@
 		</para>
 	</sect1>
 	<sect1>
+		<title>Expresso and UTF-8 (Unicode)</title>
+		<para>
+When developing web applications that must serve multiple languages simultaneously,
+use the UTF-8 character encoding. Several parts of your environment must be
+configured to take this into account. Most of the following applies not only
+to Expresso applications, but to all servlet applications.
+		</para>
+		<sect2>
+			<title>Environment</title>
+			<para>
+Every component of the environment must support UTF-8. First check each
+component's manual to confirm that it supports UTF-8, then test UTF-8 handling
+in the environment itself.
+			</para>
+			<sect3>
+				<title>Database</title>
+				<para>
+UTF-8 should be configured on the database. In the case of MySQL, the latest
+version (4.1.8 at the time of writing) supports UTF-8 by default.
+				</para>
+			</sect3>
+			<sect3>
+				<title>JDBC Driver</title>
+				<para>
+
+The JDBC driver must also support UTF-8. Depending on the database used,
+connection string parameters may need to be set in the jdbc element of
+<filename>expresso-config.xml</filename>. For example, with MySQL, the latest
+driver (3.1.6 at the time of writing) should be used and
+'?useUnicode=true&amp;amp;characterEncoding=utf8' should be appended to
+the connection URL.
+				</para>
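+				<para>
+As a rough illustration of the paragraph above, a jdbc element for MySQL might
+look like the following sketch. The attribute names, the database name
+(<literal>expresso_db</literal>), the login and the password are assumptions made for
+illustration only; check them against your own
+<filename>expresso-config.xml</filename>.
+					<programlisting><![CDATA[<!-- Hypothetical values: adjust host, database name, login and password. -->
+<jdbc driver="com.mysql.jdbc.Driver"
+      url="jdbc:mysql://localhost/expresso_db?useUnicode=true&amp;characterEncoding=utf8"
+      login="expresso"
+      password="secret" />]]></programlisting>
+Because the URL sits inside an XML attribute, the ampersand between the two
+parameters is written as &amp;amp;.
+				</para>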
+			</sect3>
+			<sect3>
+				<title>Servlet Container</title>
+				<para>
+You should use a server that supports UTF-8. The latest Jakarta Tomcat
+(version 5 and above) supports UTF-8 by default. Other servlet containers
+may need parameters added to the JVM command line.
+				</para>
+			</sect3>
+		</sect2>
+		<sect2>
+			<title>View</title>
+			<para>
+The view will need to be able to correctly send and render characters (in
+UTF-8).
+			</para>
+			<sect3>
+				<title>JSP</title>
+				<para>
+
+All JSP files should declare UTF-8 in the following way (in JSP):
+					<programlisting><![CDATA[<%@ page contentType="text/html;charset=UTF-8"%>]]></programlisting>
+This directive should appear at the top of the JSP file so that all data
+on the page uses UTF-8. Note that it is not enough to set this directive
+in an included file: every JSP file must contain it.
+				</para>
+				<para>
+
+To check that the directive is correctly evaluated, view the rendered (X)HTML
+source, which should contain the following declaration (in HTML):
+					<programlisting><![CDATA[<meta http-equiv="Content-Type" content="text/html; charset=utf-8">]]></programlisting>
+Additionally, UTF-8 should be selected as the encoding in the client's web
+browser; this is generally the default.
+				</para>
+			</sect3>
+		</sect2>
+		<sect2>
+			<title>Logic</title>
+			<para>
+Java handles characters as Unicode internally, so the logic itself should not
+cause any problems. The data only needs to be received correctly (from the view) by the logic.
+			</para>
+			<sect3>
+				<title>Handling requests</title>
+				<para>
+By default, request parameters are decoded as ISO-8859-1 (Latin-1), so the
+characters must be &ldquo;cast&rdquo; to UTF-8. If your servlet container supports
+the Servlet 2.3 specification, a filter can be used to accomplish this.
+Many such filters can be found through a search engine; one example can be found
+<ulink url='http://java.sun.com/blueprints/code/jps131/src/com/sun/j2ee/blueprints/encodingfilter/web/EncodingFilter.java.html'>here</ulink>
+(note that you should change the target encoding, though).
+				</para>
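+				<para>
+A minimal sketch of such a filter is shown below, assuming only the standard
+Servlet 2.3 API. The class name <classname>Utf8EncodingFilter</classname> is hypothetical
+and is not part of Expresso; the filter would also need to be declared and
+mapped with filter and filter-mapping elements in <filename>web.xml</filename>.
+					<programlisting><![CDATA[import java.io.IOException;
+import javax.servlet.Filter;
+import javax.servlet.FilterChain;
+import javax.servlet.FilterConfig;
+import javax.servlet.ServletException;
+import javax.servlet.ServletRequest;
+import javax.servlet.ServletResponse;
+
+// Hypothetical example class; not part of Expresso.
+public class Utf8EncodingFilter implements Filter {
+
+	public void init(FilterConfig config) throws ServletException {
+		// Nothing to configure in this sketch.
+	}
+
+	public void doFilter(ServletRequest request, ServletResponse response,
+			FilterChain chain) throws IOException, ServletException {
+		// Only set the encoding when the client did not declare one.
+		// This must happen before any request parameter is read.
+		if (request.getCharacterEncoding() == null) {
+			request.setCharacterEncoding("UTF-8");
+		}
+		chain.doFilter(request, response);
+	}
+
+	public void destroy() {
+		// Nothing to release in this sketch.
+	}
+}]]></programlisting>
+The filter should be mapped so that it runs ahead of any component that reads
+request parameters, since the encoding cannot be changed after the parameters
+have been parsed.
+				</para>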
+			</sect3>
+			<sect3>
+				<title>Expresso Database Objects</title>
+				<para>
+
+Setting UTF-8 in DBObjects as well is not required, but it is recommended. This
+can be done in the following way:
+					<programlisting><![CDATA[public void setupFields() throws DBException {
+	setTargetTable("table_1");
+	setDescription("table_1_description");
+	setCharset("UTF-8");
+	...
+}]]></programlisting>
+				</para>
+			</sect3>
+		</sect2>
+		<sect2>
+			<title>Messages Bundle Files</title>
+			<para>
+
+MessagesBundle.properties files must be encoded in Latin-1. If you want
+to include characters outside Latin-1, you can use character entity codes (&amp;#xxx;).
+Alternatively, you can keep a separate copy of the MessagesBundle.properties
+files in another encoding and convert it with
+<command>native2ascii</command>, a command line program bundled with your JVM.
+For more information about it, consult your JVM's documentation.
+			</para>
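+			<para>
+A typical <command>native2ascii</command> invocation might look like the following
+sketch. The input file name <filename>MessagesBundle_sl.utf8</filename> is hypothetical
+and stands for a copy of the bundle saved in UTF-8:
+				<programlisting><![CDATA[native2ascii -encoding UTF-8 MessagesBundle_sl.utf8 MessagesBundle_sl.properties]]></programlisting>
+The output file contains only ASCII characters, with all other characters
+written as \uXXXX escapes.
+			</para>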
+		</sect2>
+		<sect2>
+			<title>Text Files</title>
+			<para>
+All text files should have their file encoding set to UTF-8. Use an appropriate
+text editor (for example, Eclipse) to do that.
+			</para>
+		</sect2>
+	</sect1>
+	<sect1>
 		<title>Conclusion</title>
 		<sect2>
 			<title>Contributors</title>
@@ -297,10 +421,12 @@
 Mike Traum <link linkend='jgroup'>(JGroup Expert)</link>
 						</para>
 					</listitem>
+					<listitem>
+						<para>Gregor Polan&ccaron;i&ccaron;</para>
+					</listitem>
 				</itemizedlist>
 			</para>
 			<para>
-
 				<note>
 					<para id='donate_internationalization'>
 Was this EDG documentation helpful? Do you wish to express your appreciation

