Wednesday, 15 June 2016

Netezza Interview Questions and Answers - Part-2


1.    How nzload works in Netezza?
A query usually goes through plan generation, optimization and transaction management. But Netezza bypass all these steps, loads are done in terms of sets and is based on the underlying table structures i.e. if the two tables are distributed on different columns then loading time is different for these two tables. Data format is verified and distribution of records calculated very quickly, fills out the set structure and writes to storage structure.  
While the nzload job is running, it sends binary records to the SPUs along with the current transaction ID. When a SPU receives new binary  records, it immediately allocates resources and writes the records to the database or the table on the disks that the SPU owns.




2.    What are cluster base tables (CBT’s) in Netezza? And what are benefits of CBTs?
A clustered base table (CBT) is a user table that contains data which is organized using one to four organizing keys. An organizing key is a column of the table that you specify for clustering the table records.
Netezza uses the organizing keys to group records within the table and save them in the same or nearby extents. Netezza also creates zone maps for the organizing columns to accelerate the performance of queries on that table that restrict using the organizing keys.

Benefits of CBTs
·         CBTs support “multi-dimension” lookups where you can organize records by one, two, Three or four lookup keys. if your queries commonly restrict on transaction type and store ID, you can organize records using both of those keys to improve query performance.
·         CBTs improve query performance by adding more zone maps for a table because the Organizing key columns are also zone mapped (if the organizing column data type supports zone maps). 
·         CBTs increase the supported data types for zone-mapped columns, thus allowing you to Improve performance for queries that restrict along multiple dimensions.
·         CBTs allow you to incrementally organize data within your user tables in situations where data cannot easily be accumulated in staging areas for pre-ordering before insertions/loads. CBTs can help you to eliminate or reduce pre-sorting of new table records prior to a load/insert operation. 
·         CBTs save disk space. Unlike indexes, materialized views and other auxiliary data structures, CBTs do not replicate the base table data and do not allocate additional data structures.
   
3.    What is FPGA and how it is useful for query performance in Netezza?
Field Programmable Gate Array (FPGA) is Netezza proprietary hardware installed on each SPU. The FPGA on each SPU filters out unwanted data, when it reads data from the disk. This process of data elimination removes IO bottlenecks and frees up downstream components such as the CPU, memory and network from processing extra data and hence the query performance increases.

4.    What are zonemaps? How are they created and updated?
Zonemaps is internal mapping structures to the extents that take advantage of the internal ordering of data to eliminate extents that do not need to be scanned. An extent is the smallest unit of disk allocation on a SPU. Zonemaps avoid scanning of unreferenced rows. Zonemaps are created for every column in the table and contain the minimum and maximum values for every extent. Zonemaps are also created for the columns which are used in organizing on clause.

Zonemaps are created and refreshed for every SPU when you Generate statistics, Nzload operation, Insert, update operations, nzgroom operation.

5.    What is the use of materialized views?
A materialized view reduces the width (number of columns) of data being scanned in the base table by creating a thin version (fewer columns) of the base table that contains a small subset of frequently queried columns.

6.    What are the best ways of creating and using materialized views?
There are some best way to make best use of materialized views
·         Create materialized views with few columns which are frequently queried.
·         Specify order by clause on the most restrictive columns i.e. columns used in where clause.
·         Periodically or manually refresh the materialized views.

7.    What are the limitations of materialized views?
There are limitations associated with the use of materialized views
·         You cannot insert, update, delete or truncate a materialized view. Any changes on the base tables will reflect into materialized views.
·         You can specify only one base table in the from clause.
·         Base table can't be a external table, system table or a temporary table.
·         You cannot use a where clause in the materialized view.
·         Expressions are not allowed as columns.

8.    What are the partitioning methods in Netezza?
There are two partitioning methods available in Netezza
·         Random partitioning: Netezza used round robin method and distributes data randomly

3 comments:

  1. Hello Mate,

    Love it absolutely! So crystalline. No mumbo jumbo. No non-sense. Straight and simple. You guys need a standing ovation for your good work.

    I have 2 servers and 2 tables with few similar columns between eachother.I have to build a tabular model with both as the source.The column called PDCAID is common between the two tables and unique to each table separately but may have some same IDs is both tables.Basically,the PDCAIDs are auto increment identity columns in each table.One table logs data for internal quality issues and other for external quality issues.
    If I were to union the 2 tables ,how do i differentiate PDCAID between internal and external.
    Also,pls suggest a best data model design in this case where:
    • internal: columns depicting defects,open and closure dates of each PDCA,root cause,problems etc.
    • external: columns depicting supplier issues,open/closure dates of each PDCA,root cause,counter measures etc.
    • KPI: aggregates data for a region,site,per month,per year
    the grain level expected by the user is Per PDCA Per day.


    Super likes !!! for this amazing post
    . I think everyone should bookmark this.

    Kind Regards,

    ReplyDelete
  2. I like your articles so much. Please keep writing . Do US people need a visa to Turkey from USA? Yes, of course, Because the US is not included in the list of those countries which do not require a visa to visit in turkey.

    ReplyDelete