Before going to zone maps first
let’s understand extent. An extent is
smallest unit of storage on the disk. In fact it is smallest disk allocation on
storage to store the data. Netezza would maintain the maximum and minimum value
of the column that is stored in an extent (each extent is of 3MB in size)
Read in more details: Netezza Zone Maps
Read in more details: Netezza Zone Maps
Zone map in Netezza is an internal mapping
structure to the extent that takes advantage of internal ordering of the data. Zone maps are automatically generated internal
table structures that the Netezza system uses to improve the response
time of SQL queries when queries data consists of large grouped or nearly
ordered date, timestamp, byteint, smallint, integer, and bigint data types. Netezza
does not create zone maps for numeric types that are larger than 8 bytes.
Zone maps transparently reduce
disk scan operations by eliminating the unreferenced data. A zone map contains
MIN and MAX of the data in each extent. Whenever
SQL query is issued with where clause, the Netezza system first scans the created
zone maps to find the addresses of minimum and maximum extents that contains
the column values. Then Netezza scans the column values in sequential order, results
are displayed once the matches found.
Netezza creates zone maps in
below scenarios
·
Generate statistics command
·
Data is loaded using nzload command
· Data is inserted or updated to tables -- As new
rows are inserted into the table or updated, entries in the zone map are updated
for the new ranges.
·
When groom operation performed to remove deleted
records
Netezza does not create zone maps
for very small table which is < 10 MB (default size). Zone Maps will give excellent
results if table is ordered most restricting (i.e. column used in WHERE clause).
This would reduce the I/O operations since all the rows would be grouped in the
same or nearby extents and rest of extent need not be scanned.
To verify all the columns on
which Netezza can create zone maps, use nz_zonemap
from the sql toolkit. If you don’t have that installed, get it done ASAP as it
has lot of useful syntax and script, those will make your life much easier.
Query performance and data slice issues
If the table size on data slice
exceeds 192 GB, you could get error. Follow below mentioned steps to handle
this situation.
· Perform GROOM on the table to remove deleted records.
This will free up some space on the data slice.
· Redistribute the data on different column if
existing column is causing high skew.
Create smaller table out of big table and create a view with a UNION ALL operation.
Create smaller table out of big table and create a view with a UNION ALL operation.
· Use truncate table statement over delete.
Even query planner wont shows if
zone maps are used. Zone map debug tools are very helpful in getting
information on zone maps. Use the set enable_zmap_debug=1 in nzsql script to
find information on zone maps usage in SQL. This command will enable zone maps
debug for only current session. It will be disable automatically
once the sql execution is completed.
During poor query performance, always
remember to verify if zone maps are disabled zone maps or un-ordered data or
deleted records are still available in database (forgot to run groom command?).
Generate statistics command will disable the zone maps. If by chance, the
generate statistics command is interrupted, that will lead to disabled zone
maps. Netezza will never scan the disable zone maps. One should re-generate the
statistics on the column to re-enable zone maps.
No comments:
Post a Comment