Sunday, 26 June 2016

nzload Command and Usage in Netezza

How the nzload Command Works

The nzload command is a SQL CLI client tool that allows you to load data from the local or a remote client, on all the supported client platforms (Linux/windows).  The nzload command processes command-line load options to send queries to the host to create an external table definition, run the insert/select query to load data, and when the load completes, then drop the external table.

The nzload command connects to a Netezza database with user name and password associated with server. In order to use any of the netezza tool, you should first install nzodbc or nzjdbc drivers on the client or remote machines.

Required Privileges

To run the nzload command, you must have the CREATE EXTERNAL TABLE privilege and access privileges to that table or database (LIST, INSERT, SELECT).

Parallel loading and transactions


While the nzload job is running on the host or client machine, it sends records to be loaded to the SPUs along with the current transaction ID. Soon after receiving those new records, SPU immediately allocates resources and writes the records to the database.

You can run multiple nzload jobs in parallel with all nzload job inserting records to the same tables. While loading, you can execute concurrent queries, inserts, updates, and deletes against all committed records in the target tables.

The nzload command performs all insertions into the target table within a single transaction. The nzload command commits the transaction at the end of the job, provided it does not detect any fatal errors. If the nzload command cannot commit the transaction, these storage and resources remain allocated. To free up this disk space, use the groom command on the specific table or on database.

Order of command execution


The nzload command is a command-line program that accepts input values from multiple sources.
The precedence order is the following:

·         Command line
·         Control file
·         Environmental variables (only used for user, password, database, and host)
·         Built-in defaults

Option names are case insensitive.

Command syntax


Following command can be used to load flat file to Netezza data warehouse

nzload -host <host IP/name> -db <database> -u <username> -pw <password> -t table –df ‘/home/dw/data/flat_file.csv’ -bf  ‘/home/dw/data/flat_file.bad’ -lf ‘/home/dw/data/flat_file.log’ [optional args]

default delimiter in nzload is pipe(|). The above syntax is basic one and can be used along with various nzload command line options.

Command line options


Option
Description
-Host
Database host address or DNS name
-Port
Database port to connect to
-d
Netezza database name
-u
Database user name
-pw
Database password
-t
Database table to insert data
df
data file path to be loaded to Netezza table
Use named-pipe
Check this option if you like to use a named-pipe instead of a data file
Named-pipe name
Specify a name for the named-pipe to use for loading
-delim
The delimiter to use while loading data
-cf
use control file and provide path here
-RequireQuotes
use this option if column values are embedded into quotes. YES
-QuotedValue
specify the type of quote. single, double
-lf
Name of the log file to generate.
-bf
Name of the bad file to generatein case of bad records
-ouputDir
Directory path to where the log and the bad file is generated. If the parameter is not specified the files are generated under the current directory where the job is currently running.
-logFileSize
Maximum size for the log file. The value is in MB. The default value is 2000 or 2GB.
-compress
Specify this option if the data file is compressed. Valid values are "TRUE" or "FALSE". Default value if "FALSE".
-skipRows <n>
Number of rows to skip from the beginning of the data file.
-maxRows <n>
Maximum number of rows to load from the data file.
-maxErrors
Maximum number of error records to allow before terminating the load process. The default value if "1".
-ignoreZero
Binary zero bytes in the input data will generate errors. Set this option no "NO" to generate error or to "YES" to ignore zero bytes. The default value is "NO".
-requireQuotes
This option requires all the values to be wrapped in quotes. default value is "FALSE".


-nullValue <token>
Specify the token to indicate a null value in the data file. The default value is "NULL". To improve slightly performance you can set this value to an empty field by specifying the value for the option as to single quotes: "\'\'".
-fillRecord
Treat missing trailing input fields as null.

-ctrlChar
Accept control chars in char/varchar fields (must escape NUL, CR and LF).
-ctInString
Accept un-escaped CR in char/varchar fields (LF becomes only end of row).
-truncString
Truncate any string value that exceeds its declared char/varchar storage.
-dateStyle
Specify the date format in which the input data is written in. Valid values are: "YMD", "Y2MD", "DMY", "DMY2", "MDY", "MDY2", "MONDY", "MONDY2". The default value is "YMD".


-dateDelim
Delimiter character between date parts. The default value is "-" .

-y2Base
First year expressible using two digit year (Y2) dateStyle.
-timeStyle
Specify the time format in which the input data is written in. Valid values are: "24HOUR" and "12HOUR".


-timeDelim
Delimiter character between time parts. The default value is ":".

-timeRoundNanos
Allow but round non-zero digits with smaller than microsecond resolution.
-boolStyle
Specify the format in which Boolean data is written in the data. The valid values are: "1_0", "T_F", "Y_N", "TRUE_FALSE", "YES".
-allowRelay
Allow load to continue after one or more SPU reset or failed over. The default is not allowed.
-allowRelay <n>
Specify number of allowable continuation of a load. Default value is "1".

nzload exit codes


The nzload command exits with the following codes:

·         0 – Successful, all input records were inserted.
·         1 – Failed, no records were inserted due to an error or errors found during the load.

No comments:

Post a Comment