1.
How many columns you can specify in distribute on clause?
You can specify upto
four columns in distribution on clause. Be careful when using multiple
distribution keys, that may degrade the query performance.
Read More On Netezza here:
Read More On Netezza here:
- Netezza Data Loading Guide
- FPGA-Accelerated Streaming Technology (FAST) Framework in Netezza
- Netezza Architecture
- Slowly Changing Dimensions (SCD)
- Netezza Skew and how to avoid it
- Importance of right Distribution key in Netezza
- Design and Implement Slowly Changing Dimension Type 2 (SCD Type2) in SQL
- Netezza Zone Maps
2.
Is it possible to update distribution key column?
You cannot update column
used in distribution on clause. You have to perform the re-distribute table on
other column to update distribution column. Re-distribution can be done by using
CREATE TABLE AS SELECT * FROM TABLE distribute on new_column;
3.
If you miss to specify column in distribute on clause, what
distribution Netezza uses?
If you miss to specify
the distribution column while creating tables. Netezza distributes data on the
first column using Hash partition.
4.
What is collocated join in Netezza?
When you join two tables
which are distributed on the same key and when used those key columns in the join
condition, then each SPU in Netezza works 100% independent of the other, as the
required data is available within itself. This type of joins is called
collocated joins.
5.
When Netezza redistributes and broadcasts a table?
Whenever it is not
possible to do a collocated joins, Netezza will either redistributes or
broadcasts the table. When the table is a small one, then netezza broadcasts
the table. Otherwise redistributes the table. Broadcast is the process of
sending copy of small table to each SPU in the Netezza and redistribute is the
process of re-distributing the table on the column which is used in Netezza
joins.
6.
What are the different ways to load data from file to table?
There are couple of ways
to load data from file to table
·
Create external table
and load Netezza table using that external table
·
Use nzload command to
load data to table
7.
What are the different ways to unload data from table?
There are couple of ways
to unload data from table
·
Create external table
·
Use nzsql command with –o
option
8.
What are the constraints on a table are enforced?
The only constraint Netezza
supports is Not null and default. There are no primary key and foreign key
constraints in Netezza. The default constraint is enforced when you add new
column to the table with default value to inserted.
9.
How the NOT NULL specification on a column improves the Netezza
performance?
Netezza tracks the NULL
values at rowheader level. Having NULL values results in storing references to
NULL values in header. If all columns are NOT NULL, then there is no record
header. hence specifying Not Null on each column in table results in better
performance
10.Can you insert duplicate
rows in Netezza tables?
Yes, we can insert duplicate
rows. Remember!! There are no primary or foreign key on Netezza
11.How to remove logically
deleted records?
Whenever you delete a row in a
table, it is not physically deleted. It is logically deleted by flagging the
deletexid field in the table with current transaction ID. Nzgroom command
utility is used to remove the logically deleted records from Netezza.
12.How Netezza updates records? Explain with an example
In
Netezza, records contains two slots, one for createxid and other is deletexid. Netezza
does not update records in database tables, it marks current records with
delete flag. i.e. deletexid is updated with current transaction ID. Deletexid
allows us to mark record with current transaction for deletion, up to 31
transactions are allowed in Netezza for all tables. Only one update is allowed
on table at any given point of time, concurrent updates are not allowed on same
table. In Netezza, if the deletexid holds value ‘0’ then it is a new record. All
records with non-zero value in deletexid are deleted records. FPGA takes only records with deletexid contains
value zero before delivering data to client software’s.
For
example:
[ROW id][Create xid][Delete xid]
à Netezza record slots
[Rec1][Trx1][0] à First time a record is loaded, record Rec1
[Rec1][Trx1][0] à First time a record is loaded, record Rec1
Let’s say we have updated above
record
[Rec1][Trx1][Trx22] à Record Rec1 is updated. Current transaction id Trx22 is updated
[Rec22][Trx22][0] à New update record Rec22. Similar to a new record this has zero for Deletexid
[Rec1][Trx1][Trx22] à Record Rec1 is updated. Current transaction id Trx22 is updated
[Rec22][Trx22][0] à New update record Rec22. Similar to a new record this has zero for Deletexid
13.What are data types most suited for zonemaps?
Zone maps are typically
useful for integers, date and time, variations of this data type. Zone maps are
useful for ordered that that are usually built into data that is loaded; for
example, phone call logs
14.What is a snippet?
Snippet is smallest unit of work carried out in SPU. Snippet is
a small block of database operation, typically three to four operations (based
on type of Netezza box), that are carried out on all SPUs where data is
location. If a query results in these snippets: Snippet A, Snippet B, Snippet
C, …. Snippet Z; they are carried out in a sequential manner.
15.What integer data type
is preferred in Netezza?
Below are some of reasons to this
·
Better and efficient
joins.
·
Zonemaps are based on
integer data types
·
Netezza compression
works best with integer data types
16.Is there any way to stop
NZSQL command, if one of the SQL commands fail?
In this case, ON_ERROR_STOP=true on nzsql command so that other commands do not get executed. For instance, we like to create a new table (CTAS) and later drop old table. If a new table creation fails, we certainly do not want to drop old table. In this case, this option is very useful.
No comments:
Post a Comment