The “N” Variant-Datatypes

So, of what use are the NVARCHAR2 and NCHAR (for completeness)? They are used in systems where the need to manage and store multiple character sets arises. This typically happens in a database where the predominant character set is a single-byte fixed-width one (such as WE8MSWIN1252), but the need arises to maintain and store some multibyte data.

There are many systems that have legacy data but need to support multibyte data for some new applications; likewise, there are systems that want the efficiency of a single-byte character set for most operations (string operations on a string that uses fixed-width characters are more efficient than on a string where each character may use a different number of bytes) but need the flexibility of multibyte data at some point.

The NVARCHAR2 and NCHAR datatypes support this need. They are generally the same as their VARCHAR2 and CHAR counterparts, with the following exceptions:
•\ Their text is stored and managed in the database’s national characterset, not the default character set.
•\ Their lengths are always provided in characters, whereas aCHAR/VARCHAR2 may specify either bytes or characters.

You can view your database’s national character set as follows:
SQL> select * from nls_database_parameters where parameter = ‘NLS_ CHARACTERSET’;
PARAMETER VALUE

This makes the NCHAR and NVARCHAR types suitable for storing only multibyte data.

Binary Strings: RAW Types

Oracle supports the storage of binary data as well as text. Binary data is not subject to the character set conversions we discussed earlier with regard to the CHAR and VARCHAR2 types. Therefore, binary datatypes are not suitable for storing user-supplied text, but are suitable for storing encrypted information—encrypted data is not “text,” but a binary representation of the original text, word processing documents containing binary markup information, and so on. Any string of bytes that should not be considered by the database to be “text” (or any other base datatype such as a number, date, and so on) and that should not have character set conversion applied to it should be stored in a binary datatype.

Oracle supports three datatypes for storing binary data:
•\ The RAW type, which we focus on in this section, is suitable for storing RAW data up to 2000 bytes in size. Starting with 12c, you can configure a RAW to store up to 32,767 bytes of information.
•\ The BLOB type, which supports binary data of much larger sizes. We’ll defer coverage of this until the “LOB Types” section later in the chapter.
•\ The LONG RAW type, which is supported for backward compatibility and should not be considered for new applications.

For example, the following code creates a table capable of storing 16 bytes of binary information per row:
SQL> create table t ( raw_data raw(16) ); Table created.

The RAW type is much like the VARCHAR2 type in terms of storage on disk. The RAW type is a variable-length binary string, meaning that the table T just created, for example, may store anywhere from 0 to 16 bytes of binary data. It is not padded out like the CHAR type.
When dealing with RAW data, you will likely find it being implicitly converted to a VARCHAR2 type—that is, many tools, such as SQL*Plus, will not display the RAW data directly but will convert it to a hexadecimal format for display. In the following example, we create some binary data in our table using SYS_GUID(), a built-in function that returns a 16-byte RAW string that is globally unique (GUID stands for globally unique identifier):
SQL> insert into t values ( sys_guid() ); 1 row created.
SQL> select * from t;RAW_DATA

You can immediately note two things here. First, the RAW data looks like a character string. That is just how SQLPlus retrieved and printed it; that is not how it is stored on disk. SQLPlus cannot print arbitrary binary data on your screen, as that could have serious side effects on the display. Remember that binary data may include control characters such as a carriage return or linefeed—or maybe a Ctrl+G character that would cause your terminal to beep.

Second, the RAW data looks much larger than 16 bytes—in fact, in this example, you can see 32 characters. This is due to the fact that every binary byte takes two hexadecimal characters to display (if the leading character is a zero, the zero is not displayed). The stored RAW data is really 16 bytes in length, and you can see this using the Oracle SQL DUMP function. Here, I am dumping the value of the binary string and using the optional parameter to specify the base that should be used when displaying the value of each byte. I am using base 16, so we can compare the results of dump with the previous string:
SQL> select dump(raw_data,16) from t;DUMP(RAW_DATA,16)

So, DUMP shows us this binary string is in fact 16 bytes long (Len=16) and displays the binary data byte by byte. As we can see, this dump display matches up with the implicit conversion performed when SQL*Plus fetched the RAW data into a string. This implicit conversion goes the other direction as well:
SQL> insert into t values ( ‘abcdef’ ); 1 row created.

That did not insert the string abcdef, but rather a 3-byte RAW with the bytes AB, CD, EF, or in decimal with the bytes 171, 205, 239. If you attempt to use a string that does not consist of valid hex characters, you will receive an error message:
SQL> insert into t values ( ‘abcdefgh’ ); insert into t values ( ‘abcdefgh’ )

The RAW type may be indexed and used in predicates—it is as functional as any other datatype. However, you must take care to avoid unwanted implicit conversions, and you must be aware that they will occur.

I prefer and recommend using explicit conversions in all cases, which can be performed using the following built-in functions:
•\ HEXTORAW: To convert strings of hexadecimal characters to the RAW type
•\ RAWTOHEX: To convert RAW strings to hexadecimal strings

The RAWTOHEX function is invoked implicitly by SQL*Plus when it fetches a RAW type into a string, and the HEXTORAW function is invoked implicitly when inserting the string. It is a good practice to avoid implicit conversions and to always be explicit when coding. So the previous examples could have been written as follows:
SQL> select rawtohex(raw_data) from t;RAWTOHEX(RAW_DATA)
EEF18AA30B563AF0E043B7D04F0A4A30
SQL> insert into t values ( hextoraw(‘abcdef’) ); 1 row created.

Leave a Reply

Your email address will not be published. Required fields are marked *.

*
*

BACK TO TOP