Check this out

Thursday, November 22, 2007

External Table definitions

The access parameters described in this chapter provide the interface to the external
table access driver. You specify access parameters when you create the external
table. This chapter describes the syntax for the access parameters for the default
access driver.

You may find it helpful to use the EXTERNAL_TABLE=GENERATE_ONLY parameter
in SQL*Loader to get the proper access parameters for a given SQL*Loader control
file. When you specify GENERATE_ONLY, all the SQL statements needed to do the
load using external tables, as described in the control file, are placed in the
SQL*Loader log file. These SQL statements can be edited and customized.

The record_format_info clause is optional. If the clause is
not specified, the default value is RECORDS DELIMITED BY NEWLINE

The following is an example of using FIXED records.

It assumes there is a 1-byte newline character at the end of each record in the datafile. It is followed by a sample of the datafile that can be used to load it.

CREATE TABLE emp_load (first_name CHAR(15), last_name CHAR(20), year_of_birth CHAR(4))
ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir
ACCESS PARAMETERS (RECORDS FIXED 20 FIELDS (first_name CHAR(7),
last_name CHAR(8),
year_of_birth CHAR(4)))
LOCATION ('foo.dat'));


Alvin Tolliver1976
KennethBaer 1963
Mary Dube 1973


VARIABLE size
The VARIABLE clause is used to indicate that the records have a variable length and
that each record is preceded by a character string containing a number with the
count of bytes for the record. The length of the character string containing the count
field is the size argument that follows the VARIABLE parameter. Note that size
indicates a count of bytes, not characters. The count at the beginning of the record
must include any record termination characters, but it does not include the size of
the count field itself. The number of bytes in the record termination characters can
vary depending on how the file is created and on what platform it is created.
The following is an example of using VARIABLE records. It assumes there is a
1-byte newline character at the end of each record in the datafile. It is followed by a
sample of the datafile that can be used to load it.

CREATE TABLE emp_load (first_name CHAR(15), last_name CHAR(20), year_of_birth CHAR(4))
ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir
ACCESS PARAMETERS (RECORDS VARIABLE 2 FIELDS TERMINATED BY ','
(first_name CHAR(7),
last_name CHAR(8),
year_of_birth CHAR(4)))
LOCATION ('foo.dat'));

21Alvin,Tolliver,1976,
19Kenneth,Baer,1963,
16Mary,Dube,1973,


The DELIMITED BY clause is used to indicate the characters that identify the end of
a record.
If DELIMITED BY NEWLINE is specified, then the actual value used is
platform-specific. On UNIX platforms, NEWLINE is assumed to be "\n". On
Windows NT, NEWLINE is assumed to be "\r\n".
If DELIMITED BY string is specified, string can either be text or a series of
hexadecimal digits.

The LOAD WHEN condition_spec clause is used to identify the records that
should be passed to the database.


LOAD WHEN (empid != BLANKS)
LOAD WHEN ((dept_id = "SPORTING GOODS" OR dept_id = "SHOES") AND total_sales != 0)

The condition_spec is an expression that evaluates to either true or false. It
specifies one or more conditions that are joined by Boolean operators.

The following are some examples of using condition_spec:
empid = BLANKS OR last_name = BLANKS
(dept_id = SPORTING GOODS OR dept_id = SHOES) AND total_sales != 0



Clause used to specify the name of an output file (BADFILE, DISCARDFILE,
or LOGFILE
). The directory object name is the name of a directory object where the
user accessing the external table has privileges to write. If the directory object name
is omitted, then the value specified for the DEFAULT DIRECTORY clause in the
CREATE TABLE AS EXTERNAL statement is used.

The symbol substitutions supported for UNIX and Windows NT are
as follows (other platforms may have different symbols):
  • %p is replaced by the process ID of the current process. For example, if the
    process ID of the access driver is 12345, then exttab_%p.log becomes
    exttab_12345.log.
  • %a is replaced by the agent number of the current process. The agent number is
    the unique number assigned to each parallel process accessing the external
    table. This number is padded to the left with zeros to fill three characters. For
    example, if the third parallel agent is creating a file and bad_data_%a.bad
    was specified as the filename, then the agent would create a file named bad_
    data_003.bad.
  • %% is replaced by %. If there is a need to have a percent sign in the filename, then
    this symbol substitution is used.

The field_definitions clause names the fields in the datafile and specifies how
to find them in records.If the field_definitions clause is omitted, then:

  • The fields are assumed to be delimited by ','
  • The fields are assumed to be character type
  • The maximum length of the field is assumed to be 255
  • The order of the fields in the datafile is the order in which the fields were
    defined in the external table
  • No blanks are trimmed from the field


The following is an example of an external table created without any access
parameters. It is followed by a sample of the datafile that can be used to load it.


CREATE TABLE emp_load (first_name CHAR(15), last_name CHAR(20), year_of_birth CHAR(4))
ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir LOCATION ('foo.dat'));

Alvin,Tolliver,1976
Kenneth,Baer,1963


In general, specifying single characters for the strings is faster than multiple
characters. Also, searching data in fixed-width character sets is usually faster than
searching data in varying-width character sets.


The following are some examples of using delim_spec:
TERMINATED BY ""
ENCLOSED BY "\" TERMINATED BY ","
ENCLOSED BY "START MESSAGE" AND "END MESSAGE"
Example: External Table with Terminating Delimiters


The following is an example of an external table that uses terminating delimiters. It
is followed by a sample of the datafile that can be used to load it.


CREATE TABLE emp_load (first_name CHAR(15), last_name CHAR(20), year_of_birth CHAR(4))
ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir
ACCESS PARAMETERS (FIELDS TERMINATED BY WHITESPACE)
LOCATION ('foo.dat'));


Alvin Tolliver 1976
Kenneth Baer 1963
Mary Dube 1973


Example: External Table with Enclosure and Terminator Delimiters
The following is an example of an external table that uses both enclosure and
terminator delimiters. Remember that all whitespace between a terminating string
and the first enclosure string is ignored, as is all whitespace between a second
enclosing delimiter and the terminator. The example is followed by a sample of the
datafile that can be used to load it.


CREATE TABLE emp_load (first_name CHAR(15), last_name CHAR(20), year_of_birth CHAR(4))
ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir
ACCESS PARAMETERS (FIELDS TERMINATED BY "," ENCLOSED BY "(" AND ")")
LOCATION ('foo.dat'));
(Alvin) , (Tolliver),(1976)
(Kenneth), (Baer) ,(1963)
(Mary),(Dube) , (1973)

Example: External Table with Optional Enclosure Delimiters
The following is an example of an external table that uses optional enclosure
delimiters. Note that LRTRIM is used to trim leading and trailing blanks from fields.
The example is followed by a sample of the datafile that can be used to load it.
CREATE TABLE emp_load (first_name CHAR(15), last_name CHAR(20), year_of_birth CHAR(4))
ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir
ACCESS PARAMETERS (FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '(' and ')'
LRTRIM)
LOCATION ('foo.dat'));
Alvin , Tolliver , 1976
(Kenneth), (Baer), (1963)
( Mary ), Dube , (1973)

The trim_spec clause is used to specify that spaces should be trimmed from the
beginning of a text field, the end of a text field, or both. Spaces include blanks and
other nonprinting characters such as tabs, line feeds, and carriage returns.
The default is LDRTRIM. Specifying NOTRIM yields the fastest performance.
The trim_spec clause can be specified before the field list to set the default
trimming for all fields. If trim_spec is omitted before the field list, then LDRTRIM
is the default trim setting. The default trimming can be overridden for an individual
field as part of the datatype_spec.
If trimming is specified for a field that is all spaces, then the field will be set to
NULL.
In the following example, all data is fixed-length; however, the character data will
not be loaded with leading spaces. The example is followed by a sample of the
datafile that can be used to load it.
CREATE TABLE emp_load (first_name CHAR(15), last_name CHAR(20),
year_of_birth CHAR(4))
ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir
ACCESS PARAMETERS (FIELDS LTRIM)
LOCATION ('foo.dat'));
Alvin, Tolliver,1976
Kenneth, Baer, 1963
Mary, Dube, 1973


MISSING FIELD VALUES ARE NULL


MISSING FIELD VALUES ARE NULL indicates that if there is not enough data in
a record for all fields, then those fields with missing data values are set to NULL. If
MISSING FIELD VALUES ARE NULL is not specified, and there is not enough
data in the record for all fields, then the row is rejected.
In the following example, the second record is stored with a NULL set for the year_
of_birth column, even though the data for the year of birth is missing from the
datafile. If the MISSING FIELD VALUES ARE NULL clause was omitted from the
access parameters, then the second row would be rejected because it did not have a
value for the year_of_birth column. The example is followed by a sample of the
datafile that can be used to load it.
CREATE TABLE emp_load (first_name CHAR(15), last_name CHAR(20), year_of_birth INT)
ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir
ACCESS PARAMETERS (FIELDS TERMINATED BY ","
MISSING FIELD VALUES ARE NULL)

No comments: