ToyboxCreations SELECT * FROM [My Life]

19Aug/100

Partitioned Tables and Indexes

Ok, at this point you should know how to set up a PARTITION FUNCTION and a PARTITION SCHEME.  Now, we're going to move on and apply this SCHEME (and in turn the FUNCTION) to a table and an Index.  I'm going to include a little recap, just in case you've missed anything so far.  All of this code should be run in the KOTOR database we defined in the last article.

First, we define the PARTITION FUNCTION:

CREATE PARTITION FUNCTION myPartitionFunction(int)
AS RANGE LEFT|RIGHT
FOR VALUES (10, 20, 30, 40, 50)
GO

Then, we define the PARTITION SCHEME (assuming the filegroups are already defined for the database):

CREATE PARTITION SCHEME myPartitionScheme
AS PARTITION myPartitionFunction
TO (fg1, fg2, fg3, fg4, fg5)

Now, we create a table using the PARTITION SCHEME.

CREATE Table items (
     ItemID INT IDENTITY(1,1)
   , ManufacturerID INT -- we're not going to define the FK in this example
   , [Name] VARCHAR(255)
   -- other columns can be defined, but I'm not covering those in this example
)
ON myPartitionScheme(ManufacturerID)
GO

The change here is we define our table on a PARTITION SCHEME, rather than a filegroup.  Since our scheme already covers several filegroups, it's an indirect reference to filegroup.  When you specify the scheme, that references a PARTITION FUNCTION, so you'll have to identify the key for which the function will be applied.  Remember datatypes are important.  If your function is defined for a datatype that your column cannot be implicitly converted, you're going to get an error.

Hey, a quick side note: once you've defined a partition for your table, you could restore parts of your table without affecting the other parts.  Think about it... you could restore a single .ndf file, and you'd be restoring a part of the table without affecting the other parts.  You can use this same technique to back up parts of the database, without locking other parts.  This can become very useful in high volume/high availability databases.

Partitioning an INDEX

You can apply a similar DDL statement to create a partitioned index.

CREATE NONCLUSTERED INDEX ix_items__name
   ON items(name) ON myPartScheme(ManufacturerID)
GO

Did you notice I'm partitioning the index on a column not in my index?  If you remember, indexes can contain extra columns using the INCLUDE clause.  When you create an index on a partitioned table, the server will automatically include the partitioning key in the index created for that table.  That way you can partition the index the same way you partition the table.  Pretty slick, right?

Now that you have all the basics for creating partitions on tables and indexes, take a look at some of your queries.  Do you see any that have a high level of blocking?  Do you see some that respond to lookups a bit slower than you'd like?  Try some partitioning in your development environment, see if you can come up with a solution to your problems through partitioning.  If not, give me a shout... I'll give it a once over and see what I can do to help.

Now that you have the basics, I'm going to have to move into some administrative tasks with you.  You're going to need to learn how to manage and alter your partitioning.  Since we all know that today's solution probably isn't the forever solution, right?

Stay tuned!

Filed under: Microsoft SQL No Comments
18Aug/100

Partition Schemes

In my last article I started covering partitioning.  I'm going to pick that up and continue with how to create your database to use multiple filegroups.  Without multiple filegroups, you aren't going to see how partitioning can really improve your database's performance.  Let's set up the database that's going to house our table of items from Knights of the Old Republic.

CREATE DATABASE KOTOR
ON PRIMARY
    (NAME = primary_data, FILENAME = 'd:\mssql\data\kotor_primary_data.mdf', SIZE = 10MB),
FILEFROUP fg1
    (NAME = fg1_data, FILENAME = 'd:\mssql\data\kotor_fg1.ndf', SIZE = 10MB),
FILEFROUP fg2
    (NAME = fg2_data, FILENAME = 'd:\mssql\data\kotor_fg2.ndf', SIZE = 10MB),
FILEFROUP fg3
    (NAME = fg3_data, FILENAME = 'd:\mssql\data\kotor_fg3.ndf', SIZE = 10MB),
FILEFROUP fg4
    (NAME = fg4_data, FILENAME = 'd:\mssql\data\kotor_fg4.ndf', SIZE = 10MB),
FILEFROUP fg5
    (NAME = fg5_data, FILENAME = 'd:\mssql\data\kotor_fg5.ndf', SIZE = 10MB)
LOG ON
    (NAME = db_log, FILENAME = 'd:\mssql\logs\kotor_log.ndf', SIZE = 10MB, FILEGROWTH = 10MB);

OK, you should already recognize the parts around CREATE DATABASE, ON PRIMARY, and LOG ON. What you may not recognize is the FILEGROUP clauses. These allow you to set up additional files that will house data from the database. Just like the PRIMARY and LOG ON clauses, you have to identify a NAME, FILENAME, and SIZE. FILEGROWTH is optional, if omitted, the server will default to the server default growth, which is probably 10%. I prefer growing by a set size, since percent growths can really come back to haunt you when your database grows to a more serious size (100+ GB).

Now that we have our filegroups defined, let's move on to..

Partition Schemes

A partition scheme defines the colection of filegroups you want to use for a given partition function. Let's look at the general syntax.

CREATE PARTITION SCHEME myPartitionScheme
AS PARTITION myPartitionFunction
TO (fg1, fg2, fg3, fg4, fg5)

You have to provide the name for the PARTITION SCHEME, and you have to reference a PARTITION FUNCTION that already exists.
In my command I referenced the function we created in the last post.

CREATE PARTITION FUNCTION myPartitionFunction(int)
AS RANGE LEFT|RIGHT
FOR VALUES (10, 20, 30, 40, 50)

Finally you have to enter a filegroup for each partition defined in your partition function. The filegroups you reference must already exist. IF you want to create the filegroups in the same script as the partition scheme, you'll have to separate the filegroup definition from the CREATE PARTITION SCHEME command with a GO. You can reuse filegroups if you wanted two partitions to share the same filegroup.

CREATE PARTITION SCHEME myPartitionScheme
AS PARTITION myPartitionFunction
TO (fg1, fg1, fg2, fg3, fg4)

Or you can put all the partitions in the same filegroup... I question the benefits of doing this, since most of the time you're going to want to split these partitions into separate files (and maybe even split the disks these files are written to) in order to boost read/write efficiency. But SQL does support the ability to put all the partitions into the same file group.

CREATE PARTITION SCHEME myPartitionScheme
AS PARTITION myPartitionFunction
ALL TO (fg1)

Needless to say, if you use ALL, then you can only identify ONE filegroup.

Now you know how to create filegroups, and define a PARTITION SCHEME. These are only building blocks. Next time, we're going to cover how to use this PARTITION SCHEME with a table and indexes! Then you're really going to see the benefits of partitioning. Any questions so far? Just let me know, and I'll do my best to help you out.

17Aug/100

Partitioning

Partitioning was a feature added with Microsoft SQL 2005 to allow users to split up large tables across multiple storage locations.  Partitioning can be applied to tables, indexes and indexed views.  By partitioning the data across multiple locations you can speed up query times, reduce contention between queries, and improve overall performance under certain conditions.

Partition Functions

In order to partition your object, you must first define how you want to split the data.  A partition function will allow you to define the boundary points for the different partitions of data.  Consider the following table:

ItemID ManufacturerID Name ...
1 10 Arkanian Heavy Pistol ...
42 12 Casus Fett's Heavy Blaster ...
13 31 Genoharadran Blaster ...

Let's say your table held all the items from Knights of the Old Republic, and you were having speed issues with look-ups.  Your solution: partition the table by ManufacturerID, that way each manufacturer would be in a separate file group, and you should speed look-ups dramatically (for now this is an assumption, we can look at the actual performance differences in a later article).

An example of a partition function we could use for this table would be:

CREATE PARTITION FUNCTION myPartitionFunction(int)
AS RANGE LEFT|RIGHT
FOR VALUES (10, 20, 30, 40, 50)

Every partition function you will define requires a name, a data type, and at least one boundary point.  I'd like to point out one thing: you can't define a partition function on a text, ntext, image, varbinary(max), timestamp, xml, varchar(max), or user-defined datatypes.  Oh, and if you use a computed column,  that column will need to be persisted in order to partition on it.  Since the computed column has to be persisted, that means it has to be deterministic too (no variable values allowed).

Think about it, if you're not storing the values, how can you partition on them?

Notice the Range can be LEFT or RIGHT, Basically this identifies which partition holds the partitioning value, IE which partition would hold 10, 20,  etc.  If you want the lesser partition to hold the value, then you choose RANGE LEFT, otherwise you choose RANGE RIGHT.  The difference will come into play when you try tweaking the performance of your partitioned object, and how many records are stored in each partition.

In our example, by choosing RANGE LEFT, each of the rows you see in the table will appear in a separate partition.  If we had chosen RANGE RIGHT Arkanian Heavy Pistol and Casus Fett's Heavy Blaster would both appear in Partition 2.

There are three points I'd like to make about Partition Functions now:

  • Notice the partition function doesn't have a clause that identifies a specific database object.  That means you can reuse this partition function across multiple database objects.  That can be really useful if you have multiple tables that could benefit from partitioning (if you have one that could benefit, chances are there are more!)
  • You can only have 1000 partitions for a single object.  What are the odds you'd need more than that?  If you do, check into SQL 2008 Parallel Data Warehouse... you may want to read up on that.
  • As a result of having only 1000 partitions, you can only have 999 boundary points defined.

Next time we visit Partitions, I'm going to make sure you are familiar with creating a database with multiple filegroups, and then move on to Creating a Partition Scheme.

Do you have any questions so far?  If so, send them in... I'm here to help!

Filed under: Microsoft SQL No Comments