nzload
is bulk copy command available in Netezza. This is a command that provides an easy method
for using external tables and getting data into the Netezza appliance.
Read:
Read:
- nzload Command Options
- Netezza Data Loading Guide
- FPGA-Accelerated Streaming Technology (FAST) Framework in Netezza
- Netezza Architecture
- Slowly Changing Dimensions (SCD)
- Netezza Skew and how to avoid it
- Importance of right Distribution key in Netezza
- Design and Implement Slowly Changing Dimension Type 2 (SCD Type2) in SQL
How the nzload Command Works
The
nzload command is a SQL CLI client tool that allows you to load data from the
local or a remote client, on all the supported client platforms (Linux/windows).
The nzload command processes
command-line load options to send queries to the host to create an external
table definition, run the insert/select query to load data, and when the load
completes, then drop the external table.
The
nzload command connects to a Netezza database with user name and password
associated with server. In order to use any of the netezza tool, you should
first install nzodbc or nzjdbc drivers on the client or remote machines.
Required Privileges
To run the nzload command,
you must have the CREATE EXTERNAL TABLE privilege and access privileges to that
table or database (LIST, INSERT, SELECT).
Parallel loading and transactions
While the nzload job is
running on the host or client machine, it sends records to be loaded to the
SPUs along with the current transaction ID. Soon after receiving those new
records, SPU immediately allocates resources and writes the records to the
database.
You can run multiple nzload
jobs in parallel with all nzload job inserting records to the same tables.
While loading, you can execute concurrent queries, inserts, updates, and
deletes against all committed records in the target tables.
The nzload command
performs all insertions into the target table within a single transaction. The nzload
command commits the transaction at the end of the job, provided it does not
detect any fatal errors. If the nzload command cannot commit the transaction,
these storage and resources remain allocated. To free up this disk space, use
the groom command on the specific table or on database.
Order of command execution
The nzload command is a
command-line program that accepts input values from multiple sources.
The precedence order is
the following:
·
Command line
·
Control file
·
Environmental variables (only used for user,
password, database, and host)
·
Built-in defaults
Option names are case
insensitive.
Command syntax
Following
command can be used to load flat file to Netezza data warehouse
nzload -host <host
IP/name> -db <database> -u <username> -pw <password> -t
table –df ‘/home/dw/data/flat_file.csv’ -bf
‘/home/dw/data/flat_file.bad’ -lf ‘/home/dw/data/flat_file.log’
[optional args]
default delimiter in
nzload is pipe(|). The above syntax is basic one and can be used along with
various nzload command line options.
Command line options
Option
|
Description
|
-Host
|
Database host address or DNS name
|
-Port
|
Database port to connect to
|
-d
|
Netezza database name
|
-u
|
Database user name
|
-pw
|
Database password
|
-t
|
Database table to insert data
|
df
|
data file path to be loaded to Netezza table
|
Use named-pipe
|
Check this option if you like to use a named-pipe instead of a
data file
|
Named-pipe name
|
Specify a name for the named-pipe to use for loading
|
-delim
|
The delimiter to use while loading data
|
-cf
|
use control file and provide path here
|
-RequireQuotes
|
use this option if column values are embedded into quotes. YES
|
-QuotedValue
|
specify the type of quote. single, double
|
-lf
|
Name of the log file to generate.
|
-bf
|
Name of the bad file to generatein case of bad records
|
-ouputDir
|
Directory path to where the log and the bad file is generated.
If the parameter is not specified the files are generated under the current
directory where the job is currently running.
|
-logFileSize
|
Maximum size for the log file. The value is in MB. The default
value is 2000 or 2GB.
|
-compress
|
Specify this option if the data file is compressed. Valid values
are "TRUE" or "FALSE". Default value if
"FALSE".
|
-skipRows <n>
|
Number of rows to skip from the beginning of the data file.
|
-maxRows <n>
|
Maximum number of rows to load from the data file.
|
-maxErrors
|
Maximum number of error records to allow before terminating the
load process. The default value if "1".
|
-ignoreZero
|
Binary zero bytes in the input data will generate errors. Set
this option no "NO" to generate error or to "YES" to
ignore zero bytes. The default value is "NO".
|
-requireQuotes
|
This option requires all the values to be wrapped in quotes.
default value is "FALSE".
|
-nullValue <token>
|
Specify the token to indicate a null value in the data file. The
default value is "NULL". To improve slightly performance you can
set this value to an empty field by specifying the value for the option as to
single quotes: "\'\'".
|
-fillRecord
|
Treat missing trailing input fields as null.
|
-ctrlChar
|
Accept control chars in char/varchar fields (must escape NUL, CR
and LF).
|
-ctInString
|
Accept un-escaped CR in char/varchar fields (LF becomes only end
of row).
|
-truncString
|
Truncate any string value that exceeds its declared char/varchar
storage.
|
-dateStyle
|
Specify the date format in which the input data is written in.
Valid values are: "YMD", "Y2MD", "DMY",
"DMY2", "MDY", "MDY2", "MONDY",
"MONDY2". The default value is "YMD".
|
-dateDelim
|
Delimiter character between date parts. The default value is
"-" .
|
-y2Base
|
First year expressible using two digit year (Y2) dateStyle.
|
-timeStyle
|
Specify the time format in which the input data is written in.
Valid values are: "24HOUR" and "12HOUR".
|
-timeDelim
|
Delimiter character between time parts. The default value is
":".
|
-timeRoundNanos
|
Allow but round non-zero digits with smaller than microsecond
resolution.
|
-boolStyle
|
Specify the format in which Boolean data is written in the data.
The valid values are: "1_0", "T_F", "Y_N",
"TRUE_FALSE", "YES".
|
-allowRelay
|
Allow load to continue after one or more SPU reset or failed
over. The default is not allowed.
|
-allowRelay <n>
|
Specify number of allowable continuation of a load. Default
value is "1".
|
nzload exit codes
The nzload command exits with the
following codes:
·
0 – Successful, all input records
were inserted.
·
1 – Failed, no records were inserted
due to an error or errors found during the load.
·
2 – Successful, but errors found
during the input did not exceed the error threshold (-maxErrors), good records
were inserted.
No comments:
Post a Comment