Oracle8i JDBC Developer's Guide and Reference Release 8.1.5 A64685-01
Oracle's JDBC drivers support NLS (National Language Support). NLS lets you retrieve data from, or insert data into, a database in any character set that Oracle supports. If the client and the server use different character sets, the driver performs the conversions between the database character set and the client character set.
For more information on NLS, NLS environment variables, and the character sets that Oracle supports, see the Oracle8i National Language Support Guide. See the Oracle8i Reference for more information on the database character set and how it is created.
Here are a few examples of commonly used Java methods for JDBC that rely heavily on NLS character set conversion:
- The java.sql.ResultSet methods getString() and getUnicodeStream() return values from the database as Java strings and as a stream of Unicode characters, respectively.
- The oracle.sql.CLOB method getCharacterStream() returns the contents of a CLOB as a Unicode stream.
- The oracle.sql.CHAR methods getString(), toString(), and getStringWithReplacement() convert the following data to strings:
  - getString(): converts the sequence of characters represented by the CHAR object to a string and returns a Java String object.
  - toString(): identical to getString(), but if the character set is not recognized, toString() returns a hexadecimal representation of the CHAR data.
  - getStringWithReplacement(): identical to getString(), except that characters that have no Unicode representation in the character set of this CHAR object are replaced by a default replacement character.
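As an illustration of the stream-returning methods above, the following minimal sketch drains a character stream into a Java String. The class name CharStreamSketch and the method readAll are hypothetical names chosen for this example. A StringReader stands in for the Reader that a CLOB's getCharacterStream() would return, so the sketch runs without a database.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class CharStreamSketch {
    // Hypothetical helper: reads every character from the stream into a String.
    public static String readAll(Reader in) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            // Rethrow unchecked so callers need no throws clause.
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the Reader from getCharacterStream(); with a real
        // driver, the characters arriving here are already Unicode.
        Reader in = new StringReader("caf\u00e9 \u65e5\u672c\u8a9e");
        System.out.println(readAll(in).length());
    }
}
```

With a live connection, the StringReader would be replaced by the Reader obtained from the CLOB; the loop itself does not change.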
The techniques that Oracle's drivers use to perform character set conversion for Java applications depend on the character set the database uses. The simplest case is where the database uses the US7ASCII or WE8ISO8859P1 character set. In this case, the driver converts the data directly from the database character set to UCS-2, which is used in Java applications.
If you are working with databases that employ a character set other than US7ASCII or WE8ISO8859P1 (for example, Japanese or Korean), then the driver converts the data first to UTF-8, then to UCS-2. For example, the driver always converts CHAR and VARCHAR2 data in a non-US7ASCII, non-WE8ISO8859P1 character set. It does not convert RAW data.
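The two conversion paths can be sketched with the standard java.nio.charset API. This is an illustration only, not the driver's code: the class and method names are hypothetical, Java's ISO_8859_1 charset is assumed to correspond to WE8ISO8859P1, and the UTF-8 bytes in the second case stand in for data that an earlier step has already converted from the database character set.

```java
import java.nio.charset.StandardCharsets;

public class ConversionPathSketch {
    // Direct path: single-byte database data decoded straight to a Java String.
    // Assumption: ISO_8859_1 is the Java counterpart of WE8ISO8859P1.
    public static String decodeLatin1(byte[] dbBytes) {
        return new String(dbBytes, StandardCharsets.ISO_8859_1);
    }

    // Second hop of the two-step path: data that has already been converted
    // to UTF-8 is decoded into a Java String.
    public static String decodeUtf8(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "caf" followed by e-acute (0xE9) in ISO 8859-1.
        byte[] latin1 = {(byte) 0x63, (byte) 0x61, (byte) 0x66, (byte) 0xE9};
        System.out.println(decodeLatin1(latin1));

        // U+65E5 encoded as three UTF-8 bytes.
        byte[] utf8 = {(byte) 0xE6, (byte) 0x97, (byte) 0xA5};
        System.out.println(decodeUtf8(utf8));
    }
}
```

Java strings are internally UTF-16, which coincides with UCS-2 for the characters used here.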
In the case of a JDBC OCI driver installation, note that there is a client-side character set as well as a database character set. The client character set is determined at client-installation time by the value of the NLS_LANG environment variable. The database character set is determined at database creation. The character set used by the client can be different from the character set used by the database on the server. So, when performing character set conversion, the JDBC OCI driver has to take three factors into consideration:
- the database character set
- the client character set
- the UCS-2 character set used by Java applications

The JDBC OCI driver transfers the data from the server to the client in the character set of the database. Depending on the value of the NLS_LANG environment variable, the driver handles character set conversions in one of two ways.
- If NLS_LANG is not specified, or if it is set to the US7ASCII or WE8ISO8859P1 character set, then the JDBC OCI driver uses Java to convert the character set from US7ASCII or WE8ISO8859P1 directly to UCS-2.
- If NLS_LANG is set to a character set other than US7ASCII or WE8ISO8859P1, then the driver changes the value of the NLS_LANG parameter on the client to UTF-8. This happens automatically and does not require any user intervention. OCI uses the value of NLS_LANG to convert the data from the database character set to UTF-8; the JDBC driver then converts the UTF-8 data to UCS-2.
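The two cases above amount to a decision on the character set portion of NLS_LANG. The following sketch models that decision only; it is not the driver's implementation. The class and method names are hypothetical. The sketch assumes the usual LANGUAGE_TERRITORY.CHARSET form of NLS_LANG and, as a further assumption, treats a value without a dot as a bare character set name.

```java
import java.util.Locale;

public class NlsLangSketch {
    // Returns true when, per the rules above, conversion would go directly
    // to UCS-2 in Java; false when it would go through UTF-8 first.
    public static boolean convertsDirectly(String nlsLang) {
        if (nlsLang == null || nlsLang.trim().isEmpty()) {
            return true; // NLS_LANG is not specified
        }
        // The character set follows the last dot in LANGUAGE_TERRITORY.CHARSET.
        // Assumption: with no dot present, the whole value is the charset name.
        int dot = nlsLang.lastIndexOf('.');
        String charset = nlsLang.substring(dot + 1).trim().toUpperCase(Locale.ROOT);
        return charset.equals("US7ASCII") || charset.equals("WE8ISO8859P1");
    }

    public static void main(String[] args) {
        String[] samples = {
            null,
            "AMERICAN_AMERICA.WE8ISO8859P1",
            "JAPANESE_JAPAN.JA16SJIS"
        };
        for (String s : samples) {
            System.out.println(s + " -> "
                + (convertsDirectly(s) ? "direct to UCS-2" : "via UTF-8"));
        }
    }
}
```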
Notes:
- The driver sets the value of NLS_LANG to UTF-8 to minimize the number of conversions it performs in Java. It performs the conversion from the database character set to UTF-8 in C.
- The change to UTF-8 is for the JDBC application process only.
- For more information on the NLS_LANG parameter, see the Oracle8i National Language Support Guide.
If your applications or applets use the JDBC Thin driver, then there will not be an Oracle client installation. Because of this, the OCI client conversion routines in C will not be available. In this case, the client conversion routines differ from those of the JDBC OCI driver.
If the database character set is US7ASCII or WE8ISO8859P1, then the data is transferred to the client without any conversion. The driver then converts the character set to UCS-2 in Java.
If the database character set is something other than US7ASCII or WE8ISO8859P1, then the server first translates the data to UTF-8 before transferring it to the client. On the client, the JDBC Thin driver converts the data to UCS-2 in Java.
If your JDBC code running in the server accesses the database, then the JDBC Server driver performs a character set conversion based on the database character set. The target character set of all Java programs is UCS-2.
The JDBC Server driver supports the ASCII (US7ASCII) and ISO-Latin-1 (WE8ISO8859P1) character sets only.
There is a limit on the maximum sizes of the CHAR and VARCHAR2 datatypes when they are used in bind calls. This limitation is necessary to avoid data corruption. The problem occurs only with binds (not with defines), and it affects CHAR and VARCHAR2 datatypes only if you are connected to a multi-byte character set database.
The maximum bind lengths are limited in the following way: CHAR and VARCHAR2 values undergo character set conversions that can increase the length of the data in bytes. The ratio between data sizes before and after a conversion is called the NLS Ratio. After conversion, the bind values should not be greater than 4 Kbytes (in Oracle8), or 2 Kbytes (in Oracle7).
Driver | Server Version | Datatype | Old Max Bind Length (bytes) | New Restricted Max Bind Length (bytes) |
---|---|---|---|---|
Thin and OCI | V8 | CHAR | 2000 | |
Thin and OCI | V8 | VARCHAR2 | 4000 | |
For example, when connecting to an Oracle8 server, you cannot bind more than:
OR
Table 5-2 contains examples of the NLS Ratio and maximum bind values for some common server character sets.
Server Character Set | NLS Ratio | Maximum Bind Value on Oracle8 Server (in bytes) |
---|---|---|
WE8DEC | 1 | 4000 |
US7ASCII | 1 | 4000 |
ISO 8859-1 through 10 | 1 | 4000 |
JA16SJIS | 2 | 2000 |
JA16EUC | 3 | 1333 |
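The values in Table 5-2 are consistent with dividing the 4 Kbyte (4000-byte) Oracle8 limit by the NLS Ratio and discarding the remainder. The following sketch reproduces that arithmetic. The formula is inferred from the table rather than quoted from the driver, and the class, constant, and method names are hypothetical.

```java
public class BindLimitSketch {
    // Post-conversion limit on an Oracle8 server, taken from the text above.
    static final int ORACLE8_MAX_BYTES = 4000;

    // Assumed formula: the limit divided by the NLS Ratio, rounded down.
    public static int maxBindBytes(int nlsRatio) {
        if (nlsRatio < 1) {
            throw new IllegalArgumentException("NLS Ratio must be at least 1");
        }
        return ORACLE8_MAX_BYTES / nlsRatio;
    }

    public static void main(String[] args) {
        // Ratios taken from Table 5-2.
        String[] names = {"US7ASCII", "JA16SJIS", "JA16EUC"};
        int[] ratios = {1, 2, 3};
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + ": " + maxBindBytes(ratios[i]) + " bytes");
        }
    }
}
```

An application could compare the byte length of a value against this figure before binding it, instead of relying on the server to reject or truncate the value.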