|Performance Tuning Tips (cont.)
3. Reduce Data Volume (beware the char datatype)!
One of the largest single factors in improving this system’s performance was simply changing char
columns to varchar! Why copy ten gigabytes when you can copy only three? There have been some books
and articles that suggest there is a performance advantage to be gained from using fixed length datatypes
such as char instead of varchar, and perhaps this is true in some unusual situations, but believe me,
when dealing with large volumes of data, any time saved by the server not having to look up or maintain an
internal length variable is overwhelmed by the time spent reading and writing all of those useless padding
spaces. And those useless padding spaces are still there when the value is NULL (!!) due to a change in
the way SQL Server implements the char datatype, a fact which older SQL Server programmers may not
realize (I was one). Of course, when importing from char columns into varchar columns, don’t forget to rtrim
( theContents), or the padding will be retained.
Along the same lines, one should keep in mind that a smalldatetime column takes half the space of char
(8), that nchar and nvarchar columns double the size of data stored in them, and that numeric data types of
the appropriate size are much more space-efficient than character representations. Another method of
reducing data volume is to eliminate unneeded columns altogether. If a column is not used anywhere, why
keep it around if you don’t have to? Since every byte in a permanent table translates to perhaps four bytes
of disk space (think of tempdb, transaction logs, backups, load and extract files, etc.), these savings really
add up for tables with millions of rows. By the time I was finished with the performance improvements, the
size of the databases involved had been cut in half, and this was a major contributor to the time savings as
4. Eliminate the middleman!
If you can load data directly from files, using bcp or the BULK INSERT sql statement, you may find that it is
faster than using openquery, or the data pump in DTS. In the case of the biggest table I had to import from
Oracle, I changed the import method from the openquery method described earlier, to use a DTS data
pump to create an intermediate data file, which was then loaded using the BULK INSERT statement.
Instead of the forty minute savings I was hoping for, this method saved only about 20 minutes or so, but
would have saved more if the file could have been created at the other end. Still, a worthwhile technique,
but it paled in comparison to the real timesaver in the copy phase:
|<Previous Home Next>