At my work I have inherited a database where the
partitioning was not done correctly.
This database holds millions and millions of records for each year. Hence the need for partitioning.
So I started working to correct/improve the same in a dev environment. The current table (Table1) that is partitioned
is based on filegroup.
After a lot of experiments, I sought feedback from Partitioning expert Martin Cairney
He suggested me to correct this situation in the below steps.
Step 1. Create a new partition function (yearly_pf) and partition
scheme (yearly_ps) that reflects the correct partitions.
The code I used for this is as below.
CREATE PARTITION FUNCTION [yearly_pf](datetime) AS RANGE RIGHT FOR VALUES (N'2007-12-31T23:59:59.997', N'2008-12-31T23:59:59.997', N'2009-12-31T23:59:59.997', N'2010-12-31T23:59:59.997', N'2011-12-31T23:59:59.997', N'2012-12-31T23:59:59.997', N'2013-12-31T23:59:59.997', N'2014-12-31T23:59:59.997', N'2015-12-31T23:59:59.997', N'2016-12-31T23:59:59.997', N'2017-12-31T23:59:59.997', N'2018-12-31T23:59:59.997') GO /****** Object: Create Partition scheme ******/ CREATE PARTITION SCHEME [yearly_ps] AS PARTITION [yearly_pf] TO ([FG_2007], [FG_2008], [FG_2009], [FG_2010], [FG_2011], [FG_2012], [FG_2013], [FG_2014], [FG_2015], [FG_2016], [FG_2017], [FG_2018], [PRIMARY]) GO
Step 2. Create another table (Table 3) with the same schema
as the existing table Table1. But base
this on the partition scheme rather than on filegroup.
For this I scripted the create table script from Table 1 as
this will all the constraints of the existing table. But I changed the On filegroup statement to
On yearly_ps
Step 3. Create a
staging table (Table 2) to insert data from Table 1 on a year by year basis.
For this I used the create table script from Table 1
Step 4. Insert the
data from the staging table with a where clause that includes yearly data into
Table 3.
Step 5. After all the
data is inserted into Table 3, rename Table 1 an rename Table 3 to Table
1. When you are satisfied with
everything, then drop Table 1.
I did this process from 2007 to 2012.
After that I started wondering do we really need the staging
table.
Since this is a dev environment, I thought I will try
without the staging table.
So I repeated the steps 1 to 2 above after dropping Table 3. Then I started to Insert data straight into
Table 3 on a yearly basis. I checked
every time whether the yearly data is going to the correct partition or not.
The script I used for this is as follows.
SELECT OBJECT_SCHEMA_NAME(p.object_id) AS SchemaName , OBJECT_NAME(p.object_id) AS ObjectName , i.name AS IndexName , p.index_id AS IndexID , ds.name AS PartitionScheme , p.partition_number AS PartitionNumber , COALESCE(fg.name, fg2.name) AS FileGroupName , prv_left.value AS LowerBoundaryValue , prv_right.value AS UpperBoundaryValue , CASE ISNULL(CAST(pf.boundary_value_on_right as SMALLINT),-1) WHEN 1 THEN 'RIGHT' WHEN -1 THEN 'N/A' ELSE 'LEFT' END AS PartitionFunctionRange , p.rows AS Rows FROM sys.partitions AS p INNER JOIN sys.indexes AS i ON i.object_id = p.object_id AND i.index_id = p.index_id INNER JOIN sys.data_spaces AS ds ON ds.data_space_id = i.data_space_id LEFT JOIN sys.partition_schemes AS ps ON ps.data_space_id = ds.data_space_id LEFT JOIN sys.partition_functions AS pf ON pf.function_id = ps.function_id LEFT JOIN sys.destination_data_spaces AS dds2 ON dds2.partition_scheme_id = ps.data_space_id AND dds2.destination_id = p.partition_number LEFT JOIN sys.data_spaces dsp ON ds.data_space_id = dsp.data_space_id LEFT JOIN sys.filegroups AS fg ON fg.data_space_id = dds2.data_space_id LEFT JOIN sys.filegroups AS fg2 ON fg2.data_space_id = dsp.data_space_id LEFT JOIN sys.partition_range_values AS prv_left ON ps.function_id = prv_left.function_id AND prv_left.boundary_id = p.partition_number - 1 LEFT JOIN sys.partition_range_values AS prv_right ON ps.function_id = prv_right.function_id AND prv_right.boundary_id = p.partition_number where OBJECT_NAME(p.object_id) = 'Table3' ) ORDER BY ObjectName, IndexId, PartitionNUmber ;
The process was smooth.
I would like to implement this in Production. I would like some feedback of the viewers of
this blog as to whether I am doing any blunders here.