In this example I show how I migrated from a single byte WE8MSWIN1252 to a unicode multibyte characterset AL32UTF8.
It is beyond the scope of this article to describe all different scenario' s, there are so many database charactersets, the national characterset comes into play.
A full backup of your database prior to the characterset change is always mandatory.
Since release i don' t know oracle has introduced the characterset scan utility. Easy to use assuming the CSMIG schema is installed.csscan \"SYS/SECRET as sysdba\" FULL=Y
Character Set Scanner v2.1 : Release 10.2.0.3.0 - Production on Fri Apr 22 04:57:10 2011
Copyright (c) 1982, 2007, Oracle. All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
CSS-00107: Character set migration utility schema not installed
How to install the CSMIG schema ?
C:\Documents and Settings\E132707a>sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Fri Apr 22
Copyright (c) 1982, 2007, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4
With the Partitioning, OLAP, Data Mining and Real Appli
SQL> start ?/rdbms/admin/csminst.sql;
Once the CSMIG installed, we run csscan again
csscan \"SYS/SECRET as sysdba\" FULL=Y
Character Set Scanner v2.1 : Release 10.2.0.3.0 - Production on Fri Apr 22 04:59:23 2011
Copyright (c) 1982, 2007, Oracle. All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Current database character set is WE8MSWIN1252.
Enter new database character set name: > AL32UTF8 >>> user input required
Enter array fetch buffer size: 1024000 > >>> user input required
Enter number of scan processes to utilize(1..32): 1 > 4 >>> user input required
...........................
. process 1 scanning WMSYS.WM$RIC_TRIGGERS_TABLE[AAAMlVAAJAAAB1kAAA]
. process 4 scanning SYS.WRH$_TABLESPACE_STAT[AAAJBZAAJAAABV8AAA]
. process 3 scanning SYSTEM.LOGMNR_ATTRCOL$[AAAIdcAAJAAAAjaAAA]
. process 2 scanning SYS.WRH$_LATCH_MISSES_SUMMARY[AAAJAyAAJAAABElAAA]
. process 1 scanning SYS.WRH$_DB_CACHE_ADVICE[AAAJA2AAJAAABFRAAA]
. process 4 scanning SYS.WRH$_DLM_MISC[AAAJBOAAJAAABQ9AAA]
. process 3 scanning SYS.WRH$_WAITSTAT[AAAJAhAAJAAABASAAA]
. process 4 scanning SYSTEM.LOGMNRC_GSII[AAAIdNAAJAAAAg1AAA]
. process 2 scanning SYSTEM.LOGSTDBY$APPLY_PROGRESS[AAAJFSAAJAAABz2AAA]
Creating Database Scan Summary Report...
Creating Individual Exception Report...
Scanner terminated successfully.
There is an output file scan.txt which can be read with a text editor
You may want to read Oracle' s Metalink "Csscan output explained" [ID 444701.1] for interpertation of output.Database Scan Summary Report
According the output I should be able to convert the database characterset using the csalter package. We need to be in restricted mode to use it.
Time Started : 2011-04-22 05:00:17
Time Completed: 2011-04-22 05:01:46
Process ID Time Started Time Completed
---------- -------------------- --------------------
1 2011-04-22 05:00:52 2011-04-22 05:01:45
2 2011-04-22 05:00:52 2011-04-22 05:01:45
3 2011-04-22 05:00:53 2011-04-22 05:01:45
4 2011-04-22 05:00:54 2011-04-22 05:01:45
---------- -------------------- --------------------
[Database Size]
Tablespace Used Free Total Expansion
------------------------- --------------- --------------- --------------- ---------------
SYSTEM 543.36M 7.73M 551.10M 8.52M
RBS 456.63M 355.07M 811.70M .00K
TEMP .00K .00K .00K .00K
COCONUT_DATA 1,241.00M 759.00M 2,000.00M 35.00K
COCONUT_INDEX 275.04M 1,724.96M 2,000.00M .00K
DRSYS 5.24M 14.76M 20.00M .00K
TOOLS 160.48M 351.53M 512.00M .00K
USERS 102.00K 108.93M 109.03M .00K
XDB 49.04M 990.00K 50.00M .00K
SYSAUX 51.86M 448.14M 500.00M 12.00K
------------------------- --------------- --------------- --------------- ---------------
Total 2,782.76M 3,771.09M 6,553.85M 8.56M
The size of the largest CLOB is 1625114 bytes
[Database Scan Parameters]
Parameter Value
------------------------------ ------------------------------------------------
CSSCAN Version v2.1
Instance Name PLATINUM
Database Version 10.2.0.4.0
Scan type Full database
Scan CHAR data? YES
Database character set WE8MSWIN1252
FROMCHAR WE8MSWIN1252
TOCHAR AL32UTF8
Scan NCHAR data? NO
Array fetch buffer size 1024000
Number of processes 4
Capture convertible data? NO
------------------------------ ------------------------------------------------
[Scan Summary]
All character type data in the data dictionary are convertible to the new character set
All character type application data are convertible to the new character set
[Data Dictionary Conversion Summary]
Datatype Changeless Convertible Truncation Lossy
--------------------- ---------------- ---------------- ---------------- ----------------
VARCHAR2 1,650,220 0 0 0
CHAR 1,111 0 0 0
LONG 140,383 0 0 0
CLOB 19,084 486 0 0
VARRAY 16,956 0 0 0
--------------------- ---------------- ---------------- ---------------- ----------------
Total 1,827,754 486 0 0
Total in percentage 99.973% 0.027% 0.000% 0.000%
The data dictionary can be safely migrated using the CSALTER script
[Application Data Conversion Summary]
Datatype Changeless Convertible Truncation Lossy
--------------------- ---------------- ---------------- ---------------- ----------------
VARCHAR2 11,174,558 1 0 0
CHAR 0 0 0 0
LONG 6 0 0 0
CLOB 0 172 0 0
VARRAY 0 0 0 0
--------------------- ---------------- ---------------- ---------------- ----------------
Total 11,174,564 173 0 0
Total in percentage 99.998% 0.002% 0.000% 0.000%
[Distribution of Convertible, Truncated and Lossy Data by Table]
USER.TABLE Convertible Truncation Lossy
-------------------------------------------------- ---------------- ---------------- ----------------
PLATINUM.RCFORMULA 172 0 0
PLATINUM.SFMOD 1 0 0
MDSYS.SDO_COORD_OP_PARAM_VALS 200 0 0
MDSYS.SDO_GEOR_XMLSCHEMA_TABLE 1 0 0
MDSYS.SDO_STYLES_TABLE 78 0 0
MDSYS.SDO_XML_SCHEMAS 3 0 0
SYS.METASTYLESHEET 80 0 0
SYS.RULE$ 1 0 0
SYS.SCHEDULER$_EVENT_LOG 6 0 0
SYS.WRI$_DBU_FEATURE_METADATA 98 0 0
SYS.WRI$_DBU_HWM_METADATA 19 0 0
-------------------------------------------------- ---------------- ---------------- ----------------
[Distribution of Convertible, Truncated and Lossy Data by Column]
USER.TABLE|COLUMN Convertible Truncation Lossy
-------------------------------------------------- ---------------- ---------------- ----------------
PLATINUM.RCFORMULA|RCFORMULA_SCRIPT 86 0 0
PLATINUM.RCFORMULA|RCFORMULA_TEXT 86 0 0
PLATINUM.SFMOD|SFMOD_DESC 1 0 0
MDSYS.SDO_COORD_OP_PARAM_VALS|PARAM_VALUE_FILE 200 0 0
MDSYS.SDO_GEOR_XMLSCHEMA_TABLE|XMLSCHEMA 1 0 0
MDSYS.SDO_STYLES_TABLE|DEFINITION 78 0 0
MDSYS.SDO_XML_SCHEMAS|XMLSCHEMA 3 0 0
SYS.METASTYLESHEET|STYLESHEET 80 0 0
SYS.RULE$|CONDITION 1 0 0
SYS.SCHEDULER$_EVENT_LOG|ADDITIONAL_INFO 6 0 0
SYS.WRI$_DBU_FEATURE_METADATA|INST_CHK_LOGIC 12 0 0
SYS.WRI$_DBU_FEATURE_METADATA|USG_DET_LOGIC 86 0 0
SYS.WRI$_DBU_HWM_METADATA|LOGIC 19 0 0
-------------------------------------------------- ---------------- ---------------- ----------------
[Indexes to be Rebuilt]
USER.INDEX on USER.TABLE(COLUMN)
-----------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------
( it is rather obvious that changing the database characterset requires a maintenance window and a single user context )
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup restrict;
ORACLE instance started.
Total System Global Area 524288000 bytes
Fixed Size 1297676 bytes
Variable Size 243270388 bytes
Database Buffers 276824064 bytes
Redo Buffers 2895872 bytes
Database mounted.
Database opened.
SQL> start ?/rdbms/admin/csalter.plb
0 rows created.
Function created.
Function created.
Procedure created.
This script will update the content of the Oracle Data Dictionary.
Please ensure you have a full backup before initiating this procedure.
Would you like to proceed (Y/N)?Y
old 6: if (UPPER('&conf') <> 'Y') then
new 6: if (UPPER('Y') <> 'Y') then
Checking data validility...
Unrecognized convertible date found in scanner result
PL/SQL procedure successfully completed.
Checking or Converting phrase did not finish successfully
No database (national) character set will be altered
CSALTER finished unsuccessfully.
PL/SQL procedure successfully completed.
0 rows deleted.
Function dropped.
Function dropped.
Procedure dropped.
SQL> alter system disable restricted session;
Despite the fact that according the CSSCAN output I can use csalter. the error "Unrecognized convertible date found in scanner result"
When I ran the csscan i had some issues at the level of the XDB schema. So I have to use expdp/impdp. note there are bugs when you use datapump for characterset conversion, which are solved with 10.2.0.4 ( and I use 10.2.0.4 )
Which directories have been configured so far ?
SQL> select * from dba_directories;
SYS DATA_PUMP_DIR
D:\oracle\product\10.2.0\db_1\rdbms\log\
SYS EXPORT
E:\oracle\export\PLATINUM
SYS ORACLE_OCM_CONFIG_DIR
D:\oracle\product\10.2.0\db_1\ccr\state
SYS ADMIN_DIR
D:\oracle\product\10.2.0\db_1/md/admin
SYS WORK_DIR
D:\oracle\product\10.2.0\db_1/work
SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
I export the source database with Data Pump, note that
1. If you use datapump you do not need to set the environment variable NLS_LANG
2. if you use traditional export / import you NEED to set the NLS_LANG enviornment variable to the characterset of the source database.
See also MOS : NLS considerations in Import/Export - Frequently Asked Questions [ID 227332.1]
-- REM SET NLS_LANG=AMERICAN_AMERICA.WE8MSWIN1252
expdp dba_user directory=EXPORT dumpfile=expdp_PLATINUM_FULL.dmp logfile=expdp_PLATINUM_FULL.log full=Y
I create a new database with AL32UTF8. The creation of a new database is beyond the scope of this document and I import the previously taken dump.
note that
1. If you use datapump you do not need to set the environment variable NLS_LANG
2. if you use traditional export / import you NEED to set the NLS_LANG enviornment variable to the characterset of the source database.
-- REM SET NLS_LANG=AMERICAN_AMERICA.WE8MSWIN1252
impdp dba_user directory=EXPORT dumpfile=expdp_PLATINUM_FULL.dmp logfile=impdp_PLATINUM_schemas.log parfile=impdp_schemas.par
It can happen that the CSSCAN output lists you some records in the "truncation" column. this means that as a result of the characterset change not all the data can be stored in the varchar fields
Why is that ? Some characters - example latin characters like é,è,ê - require 1 byte in a single characterset database whereas they require more bytes in a multibyte characterset database.
See Multibyte characterset and length semantics
The solution for this is
1. import initially with CONTENT=METADATA_ONLY
2. disable triggers, foreign keys and virtual private database policies
3. change the table definition or change the NLS_LENGTH_SEMANTICS parameter ( see Multibyte characterset and length semantics )
4. import with CONTENT=DATA_ONLY
5. enable triggers, foreign keys and virtual private database policies
It does not take a hero to change / migrate the characterset.
Just concentrate on the NLS_LANG environment variable if you would still use traditional export and import utilities and take a full backup of your database PRIOR to the characterset change.