can a Microsoft SQL database or table be compacted?

Lobsang · April 22, 2009, 5:07pm

I recently changed a single field from char(30) to char(60) in an MS SQL database and the table doubled in size.

If I add up the field sizes for all the fields except the field in question this comes to 275. So why would changing one field from 30 to 60 double the size of the table?
(the data itself hasn’t changed. I’ve just made it possible for future imports to fit larger strings in that field)

And is there a way of compressing the table so that ‘whitespace’ isn’t stored? (the field’s data can range from strings of length 1 to strings of length 60)

If I run the len() function on the contents of this field I get the length of the string without whitespace. But if I copy/paste the field I get all the whitespace copied as well, so it’s as if it’s stored with the data.

ghostman · April 22, 2009, 5:18pm

SQL Server will actually allocate the space for each page of data whether or not you use it. That way if your page of data is 100 bytes (just for arguments sake). and you are looking for data that is near the end of the index, and there are 50 rows, it can jump over big chunks of the disk without having to actually read that data.

crazyjoe · April 22, 2009, 5:24pm

Char is a fixed length field. you told it to allocate 60 bytes to that column no matter teh size of what’s in it. What you want is varchar instead.

Lobsang · April 22, 2009, 5:56pm

I have a small test table of the same format. I just changed the char(60) to varchar(60). Made no change to the size.

I copied the contents of one of the fields and there are still spaces on the end so I ran a replace(field,’ ‘,’’) on it (to forcibly remove the extra spaces)

the function worked (now when I copy the contents the spaces are gone)

but the table is still the same size.

Lobsang · April 22, 2009, 7:41pm

So can anyone answer the other question in my thread? i.e. why would adding 30 extra chars to a char field in a database of mostly char fields that adds up to 305 chars per record double the size of the table?

Or to try and put it less awkwardly - Why does 305+30 double the size of the table??

And if i do two things (change to varchar(60) and then run replace(field,’ ‘,’’) on that field on all records) will this reduce the size of the database (assume the average length of the data in the field is about 15)

crazyjoe · April 22, 2009, 8:11pm

Varchar will automatically remove trailing spaces if you have the right setting set.

Table size is not automatically updated using the GUI or even in the tables, it is periodically udated when statistics on it change enough to warrant.

Do you have indexes on the table which include the column in question? Each time that column is indexed, it takes up additional space based on how large the indexed column is.

Also, unless you have the autoshrink option enabled (strongly discouraged) the DB will not automatically release space it has claimed. You have to tell it to release the space by shrinking it. It will simply hold the excess space as “free space” for further data insertions to grow into.

Tell me what you are doing to check the size of the table, I might be able to clear up some misconceptions.

LilShieste · April 22, 2009, 9:05pm

Is the column used in any indices for the table?

Also, when you talk about the table doubling in size, can you let us know how you are determining this? (That way, we can be sure that we’re talking about the same thing.)

ETA: I swear, crazyjoe, I didn’t see your post. :smack:

Lobsang · April 22, 2009, 9:08pm

I have Management Studio 2005 (not express. The full version) installed on a PC on a network (not the sql server) I check by right-clicking on the table name and click ‘properties’.

I also have RDP access to the server. I’ve done the same on Management Studio on the server. Both show the increased size.

But from what you tell me it has simply made room for itself to make the change, and then stayed at the larger size.

I’ve checked disk space on the server and there’s about 50gb free. But MS reports that there is 496.60MB ‘Space available’. For such a big database (12 gb) that’s a very small amount. Is it a mistake?

(eta: The DB is 12gb. the table in question is nearly 10gb)
eta: I forgot to answer the question. The field is not indexed. There is one indexed field in the table and this field isn’t it.

[sub](To the thread in general: you can have a moan about companies not employing proper DB admins. But that ain’t my fault. I will happily moan right along with you if it wasn’t for the fact that being the acting DB admin is a fine oportunity for me to gain experience and skills)[/sub]

Lobsang · April 22, 2009, 9:14pm

Here’s how I check -

right click the table in Management Studio 2005, click properties. Look at ‘data space’. make a note.
Make changes.
Repeat step 1.

ETA: FYI the index has grown from about 600mb to about 725mb
ETA: here’s the sql query of the change (with names replaced for blah blah)

alter table [table]

alter column [field] char(60)

Lobsang · April 22, 2009, 9:18pm

I just discovered the option to shrink the database. Even though you say having auto-shrink is bad. would it be safe to run this [manual] shrink?

LilShieste · April 22, 2009, 9:52pm

Ok, thanks for the info.

You mentioned that the index space has grown by about 125MB. I’m not sure how this would happen unless the column was actually being used in an index. I know you said that the column isn’t used in any indices, but is the column part of the primary key? Is it used as a foreign key for anything?

I usually only shrink a database when I have it offline (i.e., it’s not being used by anyone else). That way I’m sure it won’t adversely affect any of my users in any way (e.g., performance).

Otherwise, running the shrink command should be relatively painless. It literally modifies your underlying data files, only removing unused space.

LilShieste · April 22, 2009, 10:01pm

Re-reading my post, I have one more question:

Are other users (besides you) actively operating on this database right now? Because that could explain the increase in the index space, even if the column truly isn’t being used in any indices (e.g., new rows).

Lobsang · April 22, 2009, 10:04pm

The only other users on the database are people accessing the data via an intranet. And the occasional transaction from a website. There were probably no other users at the time I made the change.
I think I’m going to avoid doing any further messing with this database and just live with the increased size. I don’t want to risk doing any harm to it. It’s an important database.

LilShieste · April 22, 2009, 10:25pm

If the database is carrying out other transactions (besides the ones you’re doing), then I don’t think you’re going to be able to get much useful information from us, regarding the increase in size from changing the datalength of a field. (It would be too much of a hassle, IMO, to account for the additional moving parts.)

The size information presented in the Properties window is retrieved using the database statistics, and those numbers aren’t usually 100% accurate (i.e., they’re not grabbed in “real-time”; they’re maintained in their own table).

As for shrinking, I would agree with you. It’s really only worth doing if you’re particularly pressed for space, or experiencing some really dire performance problems.

DMC · April 23, 2009, 12:10am

I believe that all of your other questions have been answered, but feel free to speak up if that’s not the case. Regarding this particular one, typically, the database is set to grow automatically, either as a percentage or a fixed number of bytes, when it needs room. You can also set it to not auto-grow (although that’s pretty rare in the various environments that I’ve consulted), and in that case, you definitely want to grow it manually before it hits the limit.

By the way, when you spoke of moaning about your company not employing a DBA, was that in response to someone in this thread? I haven’t seen anyone be anything other than helpful as, especially in the SQL Server world, most of us are very aware that many companies just sort of wing the administrative tasks with whoever happens to be willing and available.

Lobsang · April 23, 2009, 12:43am

I apologise if it seemed like that was a dig at anyone. And it was not in response to anyone in this thread. It was in anticipation of anyone coming into this thread to have a moan about companies that don’t employ dbadmins.

And that was because In one of my earlier threads about db administration someone DID come in and basically moaned that people come here with no dbadmin qualifications from companies that don’t employ dbadmins and expect people to do it for them. And he did it in a way that managed to belittle me for not having the skills or training to admin a db.

I didn’t mean anything negative by it. to anyone other than that person (if they choose to come into this thread)

You’ve answered my question about the space available bit.

But Nobody has really given a reasonable explanation as to why such a comparatively small change can make such a huge size difference. I worked it out - there are aprox 17 million records. And I’ve increased the size of one field by 30 bytes so that’s 17,000,000 * 30 = 510,000,000 bytes or 510 Megabytes. About 900 times smaller than the actual size increase that has happened.

Cerowyn · April 23, 2009, 1:38am

The data file is allocated in increments according to performance metrics, not space considerations, as should be the case for a database for almost all applications. That means that changing one field size may or may not require an increase in data file size. For instance (pulling numbers completely out of the air), say a table with fields totaling 3000 bytes is allocated in 4096 byte blocks, allowing the engine to calculate the position of a row without having to look up its index. If the table is expanded a few times, adding less than 1096 bytes, the data file doesn’t need to grow. But, as soon as you do a MODIFY that crosses that threshold, the data file needs to be rebuilt with a larger allocation size.

It’s actually a little more complicated than that, because the blocksize is not necessarily related to the size of a single row; each block might contain several rows. In that case, you’ll cross thresholds more often as you increase the size of the columns.

DMC · April 23, 2009, 2:07am

Cerowyn gave you a better answer than the “It depends” I probably would have given, which is actually often the most apt answer when it comes to questions like those. There are all sorts of things going on behind the scenes, and things like pages/extents, fill factors, statistics update thresholds (mentioned upthread), and all sorts of other things that can also have an affect on reported object sizes. To be honest, I know of very few DBAs who worry too much about table sizes, other than perhaps rowcounts. SQL Server does a pretty decent job of micromanaging the lower level objects like tables, for the most part, so most DBAs are far more concerned with things like database sizes, where unexpected growth can be caused by nastier things, such as transaction logs gone wild.

We were all new at this once, so keep learning and asking questions along the way, ignoring anyone who thinks you should have been born with the answer. Some day soon you’ll get so into it that you’ll be cussing things like “parameter sniffing” problems, tabs passing an isnumeric() check, etc. At the same time, you’ll learn cool hacks, such as using built in XML functionality as an easy way to create comma delimited strings, using parsename to separate IP address segments (not its intended purpose at all), zero-date based math, and all sorts of other things to impress the nerds around you.

crazyjoe · April 23, 2009, 2:30pm

That was me! It is frustrating that companies hire people like you, pay them half of what I make, and then expect you to just google for the answers. Helpful people answer their questions and suddenly my value is diminished…see? I’m a nice guy and I answer questions on the internet, too (thus shooting myself in the foot), but everyone has their bad days.

But anyhow, did you do a rowcount before and after you increased the field size?

Lobsang · April 23, 2009, 4:51pm

They didn’t employ me to do this job. It’s not my main job. It’s just a job I’ve ended up doing because my company can’t afford to employ an expert for every job. I also look after the website (also not my main job)

I did a rowcount before and after. This table is only updated once per day. The rowcount doesn’t change throughout the day.

the rowcount before and after was 17,687,988

Topic		Replies	Views
SQL experts. Will this reduce the size of a database. Factual Questions	11	1162	April 30, 2009
Is my database unnecessarily fat? Factual Questions	8	1153	March 2, 2011
Growing a data field size in an existing, populated table (MS SQL server) Factual Questions	5	875	April 5, 2008
Boring SQL question for those in the know. Factual Questions	7	765	June 20, 2003
Q about size of post files About This Message Board	3	693	September 30, 2002

can a Microsoft SQL database or table be compacted?

Related topics