These days we can with recent Oracle versions more -and easy- online reorganize our databases. Before it was very important to think about transaction slots (ITL), freelists, extents at the moment of segment creation. With recent Oracle versions we see Oracle is handling it all -or most of it- for us.
There is however still one very important "thing" to think about BEFORE the database creation : the database characterset. Living in Western Europe I have had very few issues with charactersets. Most of the time I had to deal with singlebyte charactersets like WE8ISO8859P15, WE8ISO8859P1 however these days I get involved in projects where we have to handle UTF8 multibyte characters.
Single-byte character sets are character sets with names of the form xxx7yyyyyy and xxx8yyyyyy.
Each character code of a single-byte character set occupies exactly one byte. Multibyte character sets are all other character sets (like UTF8). Some character codes of a multibyte character set occupy more than one byte.
Let' s check the database characterset
NLS_CHARACTERSET
AL32UTF8
NLS_NCHAR_CHARACTERSET
AL16UTF16
Considerations about multibyte charactersets
In the above example we have to deal with AL32UTF8, which is a unicode variable width multibyte characterset. AL32UTF8 follows the unicode standards, check them out here http://www.unicode.org
More in detail
Version unicode 3.0 in RDBMS version 9.0
Version unicode 3.1 in RDBMS version 9.2
Version unicode 3.2 in RDBMS version 10.1
Version unicode 4.01 in RDBMS version 10.2
It' s time for some sql in a database of which the characterset is multibyte
Table created.
SQL> create table t_2 (col1 varchar2(1 CHAR));
Table created.
SQL> insert into t_1 values ('é');
insert into t_1 values ('é')
ERROR at line 1:
ORA-12899: value too large for column "MY_UTF"."T_1"."COL1" (actual: 3,maximum: 1)
SQL> insert into t_2 values ('é');
1 row created.
SQL> commit;
Commit complete.
Why didn' t we succeed to insert the character é in table t_1 ? The answer is that the character é is a multibyte character in al32utf8. Here I show you it is 3 bytes.
Typ=1 Len=3: 239,191,189
Let' s query user_tab_columns in order to retrieve the difference between t_1.col1 and t_2.col1. T_1.col1 has been created with a fixed byte length whereas t_2.col1 has been created with a fixed char length.
T_2 COL1 4 1 C
Oracle shouldn' t be Oracle if there wouldn' t be a kind of parameter in order to bypass some issues about this. Here we speak about nls_length_semantics which is a database parameter, which we can alter at the instance level, which we can alter at the session level.
NLS_LENGTH_SEMANTICS
BYTE
NLS_LENGTH_SEMANTICS
BYTE
NLS_LENGTH_SEMANTICS
BYTE
Session altered.
NLS_LENGTH_SEMANTICS
CHAR
Let' s check out the difference
Table created.
SQL> insert into t_1bis values ('é');
1 row created.
SQL> commit;
Commit complete.
SQL> select table_name,column_name,data_length,char_length,char_used from user_tab_columns where table_name in ('T_1','T_2','T_1BIS');
T_1 COL1 1 1 B
T_1BIS COL1 4 1 C
T_2 COL1 4 1 C
This one I use in order to have the fixed char length for the current and the next instances (assuming I work with an spfile)
Check out Metalink Note' s
Doc ID:Note:260893.1
Subject: Unicode character sets in the Oracle database
Doc ID:Note:144808.1
Subject: Examples and limits of BYTE and CHAR semantics usage
Doc ID: Note:119164.1
Subject: Changing Database Character Set - Valid Superset Definitions