ToyboxCreations SELECT * FROM [My Life]

20Aug/100

Partitioning: MERGE, SPLIT, and SWITCH

OK, you know how to set up partitioning for a table and an index, and now you think you've solved your deadlocking, and your I/O issues forever, right?

Hardly.

Once you've been in SQL long enough you learn that no answer is forever.  Inevitably you'll have to revisit your old queries and see if their performance has degraded.  Partitioning is no different.  There are several issues you may need to solve with partitioning in the long run.  The first scenario is really easy to understand.

Let's say you have the partitions set up like we did back in my Partitioned Tables and Indexes article.  Your items table has grown and grown over the past year, and you notice that your data is starting to get heavy in one particular filegroup.

SPLIT

DECLARE @totalCount DECIMAL(9,2)
SELECT @totalCount = COUNT(*) FROM items

SELECT
 CASE WHEN manufacturerID <= 10 THEN 'fg1' else
   CASE WHEN manufacturerID between 11 and 20 THEN 'fg2' else
     CASE WHEN manufacturerID between 21 and 30 THEN 'fg3' else
       CASE WHEN manufacturerID between 31 and 40 THEN 'fg4' else
         CASE WHEN manufacturerID > 50 THEN 'fg5' END
       END
     END
   END
 END as filegroup
 , CONVERT(DECIMAL(9,2), count(*))/ @totalCount
FROM items
GROUP BY
  CASE WHEN manufacturerID <= 10 THEN 'fg1' else
    CASE WHEN manufacturerID between 11 and 20 THEN 'fg2' else
      CASE WHEN manufacturerID between 21 and 30 THEN 'fg3' else
        CASE WHEN manufacturerID between 31 and 40 THEN 'fg4' else
          CASE WHEN manufacturerID > 50 THEN 'fg5' END
        END
      END
    END
  END

When you run this, you notice even with 5 partitions you have much more than half of your records are in the last group.  This tells you you need to dig into that last group and figure out where you need to add a partition to your table.  You could then run another query to look at the distribution in that last group by groups of 10

SELECT
 (manufacturerID/10) + 1 AS filegroup
 , CONVERT(DECIMAL(9,2), count(*))/ @totalCount
FROM items
GROUP BY
 (manufacturerID/10) + 1

After looking into this you notice that over time you've added more manufaturers in this range, and you can effectively split the load for this file group if you set a boundary at 75.  That would put half of fg5 into a new partition, and the load would be split evenly again.  But how could you add a partition?

Well, in order to add a partition, you're going to have to assign it's data to a filegroup.  I'm going to assume all the filegroups you have defined for your current database are already in use.  We're going to need a new filegroup.

ALTER DATABASE KOTOR
ADD FILEGROUP fg6
   (NAME = fg6_data, FILENAME = 'd:\mssql\data\kotor_fg6.ndf', SIZE = 10MB)

Before you can use that filegroup in your partition, you've got to alter the PARTITION SCHEME.  You need to identify it as the next filegroup to use when you alter your PARTITION FUNCTION.  To do that, you need to ALTER the PARTITION SCHEME and identify the next filegroup to use.

ALTER PARTITION SCHEME myPartitionScheme
NEXT USED fg6

Now that you have identified the new filegroup as the NEXT USED filegroup, you can define the new boundary to your PARTITION FUNCTION.  The SPLIT operator allows you to define a new boundary point to your PARTITION FUNCTION.

ALTER PARTITION FUNCTION myPartitionFunction(int)
SPLIT RANGE (75)

This now adds 75 as a new boundary to the PARTITION FUNCTION we originally created in this article.

MERGE

IF you wanted to remove a boundary instead.  Let's say you have a boundary in use that is completely empty, and you want to stop using that partition, to free it up for some reason, then you want to use the MERGE operator. The syntax is just like SPLIT, except the boundary point you pass will be removed from your PARTITION FUNCTION.

SWITCH

Now before we go, I'd like to really blow you away.  How do you archive data?  SELECT it from one table into an ARCHIVE table, then delete it from the original table?  What if your database is highly available and you can't get a lock on the table to do that?

Well, if you partitioned your table, you've got a very powerful little tool available to you.  You can actually move a partition to a new table then you've effectively detached that part of the table from the original table.

You read that right.  You move a partition from one table to another, and the rest of your data remains untouched.  If you deal with sales data, you've probably already partitioned your table on a DATE or DATETIME field, right?  Once you move beyond a certain period and you're ready to archive that data, you can use this trick.

Script out a create statement for your table, give the table a new name, tableNameArchive.  Then, SWITCH the partition to that new table.

ALTER TABLE tableName
SWITCH PARTITION x TO tableNameArchive

Where x is the number of the partition containing the data you want to archive (this would usually be the first or oldest partition).  Then, MERGE that partition's boundary in the first table, since you no longer need it.

ALTER PARTITION FUNCTION partFunction()
MERGE RANGE (value)

Check out the data in your tableNameArchive table... it's your archived data!  Pretty slick, right?  This is a pretty complex use of partitions.  I think I should put together a demo of it in use, so you can work through it.  What do you think?  Do you want a practice problem?  If so, let me know!

If you have questions on this, or any other SQL topic, send them in.  I'm here to help!

18Aug/100

Partition Schemes

In my last article I started covering partitioning.  I'm going to pick that up and continue with how to create your database to use multiple filegroups.  Without multiple filegroups, you aren't going to see how partitioning can really improve your database's performance.  Let's set up the database that's going to house our table of items from Knights of the Old Republic.

CREATE DATABASE KOTOR
ON PRIMARY
    (NAME = primary_data, FILENAME = 'd:\mssql\data\kotor_primary_data.mdf', SIZE = 10MB),
FILEFROUP fg1
    (NAME = fg1_data, FILENAME = 'd:\mssql\data\kotor_fg1.ndf', SIZE = 10MB),
FILEFROUP fg2
    (NAME = fg2_data, FILENAME = 'd:\mssql\data\kotor_fg2.ndf', SIZE = 10MB),
FILEFROUP fg3
    (NAME = fg3_data, FILENAME = 'd:\mssql\data\kotor_fg3.ndf', SIZE = 10MB),
FILEFROUP fg4
    (NAME = fg4_data, FILENAME = 'd:\mssql\data\kotor_fg4.ndf', SIZE = 10MB),
FILEFROUP fg5
    (NAME = fg5_data, FILENAME = 'd:\mssql\data\kotor_fg5.ndf', SIZE = 10MB)
LOG ON
    (NAME = db_log, FILENAME = 'd:\mssql\logs\kotor_log.ndf', SIZE = 10MB, FILEGROWTH = 10MB);

OK, you should already recognize the parts around CREATE DATABASE, ON PRIMARY, and LOG ON. What you may not recognize is the FILEGROUP clauses. These allow you to set up additional files that will house data from the database. Just like the PRIMARY and LOG ON clauses, you have to identify a NAME, FILENAME, and SIZE. FILEGROWTH is optional, if omitted, the server will default to the server default growth, which is probably 10%. I prefer growing by a set size, since percent growths can really come back to haunt you when your database grows to a more serious size (100+ GB).

Now that we have our filegroups defined, let's move on to..

Partition Schemes

A partition scheme defines the colection of filegroups you want to use for a given partition function. Let's look at the general syntax.

CREATE PARTITION SCHEME myPartitionScheme
AS PARTITION myPartitionFunction
TO (fg1, fg2, fg3, fg4, fg5)

You have to provide the name for the PARTITION SCHEME, and you have to reference a PARTITION FUNCTION that already exists.
In my command I referenced the function we created in the last post.

CREATE PARTITION FUNCTION myPartitionFunction(int)
AS RANGE LEFT|RIGHT
FOR VALUES (10, 20, 30, 40, 50)

Finally you have to enter a filegroup for each partition defined in your partition function. The filegroups you reference must already exist. IF you want to create the filegroups in the same script as the partition scheme, you'll have to separate the filegroup definition from the CREATE PARTITION SCHEME command with a GO. You can reuse filegroups if you wanted two partitions to share the same filegroup.

CREATE PARTITION SCHEME myPartitionScheme
AS PARTITION myPartitionFunction
TO (fg1, fg1, fg2, fg3, fg4)

Or you can put all the partitions in the same filegroup... I question the benefits of doing this, since most of the time you're going to want to split these partitions into separate files (and maybe even split the disks these files are written to) in order to boost read/write efficiency. But SQL does support the ability to put all the partitions into the same file group.

CREATE PARTITION SCHEME myPartitionScheme
AS PARTITION myPartitionFunction
ALL TO (fg1)

Needless to say, if you use ALL, then you can only identify ONE filegroup.

Now you know how to create filegroups, and define a PARTITION SCHEME. These are only building blocks. Next time, we're going to cover how to use this PARTITION SCHEME with a table and indexes! Then you're really going to see the benefits of partitioning. Any questions so far? Just let me know, and I'll do my best to help you out.

16Aug/100

Missed Deadlines

Looks like I missed my personal deadline to take my 70-433 last month.  I let myself get wrapped up in the final phases of development on a conversion I've built for Wachovia/Wells Fargo.  The good news is the extra effort has paid off in testing.  We've worked through the first round of testing with only one bug detected so far.  And that one turns out to be a problem with an input file.  Something we were able to turn around and correct in minutes due to the flexibility built into the design.

The Current plan has us completing testing by mid-September.  That opens me up to do more training on this site!  I've had requests to explain SSIS, so I'll begin there.  I'm not abandoning the 70-433 material, it's just on hold for now.  I'm still studying for the exam.  I've just had to review my timeline a bit.

My next article will start with using the Import Export wizard to move data from a table in one database to a table in another database.  Simple stuff, but it's a way to break into SSIS, and see how some of the components work.  From there, we'll start digging deeper into the components and how to do more and more complex tasks.

If you have tasks that you'd like to learn how to accomplish in SSIS, let me know.  I'll use your requests as examples.

If you have questions on other SQL topics, send those in too.  I'm here to help!