I recently changed a single field from char(30) to char(60) in an MS SQL database and the table doubled in size.
If I add up the field sizes for all the fields except the field in question this comes to 275. So why would changing one field from 30 to 60 double the size of the table?
(the data itself hasn’t changed. I’ve just made it possible for future imports to fit larger strings in that field)
And is there a way of compressing the table so that ‘whitespace’ isn’t stored? (the field’s data can range from strings of length 1 to strings of length 60)
If I run the len() function on the contents of this field I get the length of the string without whitespace. But if I copy/paste the field I get all the whitespace copied as well, so it’s as if it’s stored with the data.
SQL Server will actually allocate the space for each page of data whether or not you use it. That way if your page of data is 100 bytes (just for arguments sake). and you are looking for data that is near the end of the index, and there are 50 rows, it can jump over big chunks of the disk without having to actually read that data.
So can anyone answer the other question in my thread? i.e. why would adding 30 extra chars to a char field in a database of mostly char fields that adds up to 305 chars per record double the size of the table?
Or to try and put it less awkwardly - Why does 305+30 double the size of the table??
And if i do two things (change to varchar(60) and then run replace(field,’ ‘,’’) on that field on all records) will this reduce the size of the database (assume the average length of the data in the field is about 15)
Varchar will automatically remove trailing spaces if you have the right setting set.
Table size is not automatically updated using the GUI or even in the tables, it is periodically udated when statistics on it change enough to warrant.
Do you have indexes on the table which include the column in question? Each time that column is indexed, it takes up additional space based on how large the indexed column is.
Also, unless you have the autoshrink option enabled (strongly discouraged) the DB will not automatically release space it has claimed. You have to tell it to release the space by shrinking it. It will simply hold the excess space as “free space” for further data insertions to grow into.
Tell me what you are doing to check the size of the table, I might be able to clear up some misconceptions.
I have Management Studio 2005 (not express. The full version) installed on a PC on a network (not the sql server) I check by right-clicking on the table name and click ‘properties’.
I also have RDP access to the server. I’ve done the same on Management Studio on the server. Both show the increased size.
But from what you tell me it has simply made room for itself to make the change, and then stayed at the larger size.
I’ve checked disk space on the server and there’s about 50gb free. But MS reports that there is 496.60MB ‘Space available’. For such a big database (12 gb) that’s a very small amount. Is it a mistake?
(eta: The DB is 12gb. the table in question is nearly 10gb)
eta: I forgot to answer the question. The field is not indexed. There is one indexed field in the table and this field isn’t it.
[sub](To the thread in general: you can have a moan about companies not employing proper DB admins. But that ain’t my fault. I will happily moan right along with you if it wasn’t for the fact that being the acting DB admin is a fine oportunity for me to gain experience and skills)[/sub]
You mentioned that the index space has grown by about 125MB. I’m not sure how this would happen unless the column was actually being used in an index. I know you said that the column isn’t used in any indices, but is the column part of the primary key? Is it used as a foreign key for anything?
I usually only shrink a database when I have it offline (i.e., it’s not being used by anyone else). That way I’m sure it won’t adversely affect any of my users in any way (e.g., performance).
Otherwise, running the shrink command should be relatively painless. It literally modifies your underlying data files, only removing unused space.
Are other users (besides you) actively operating on this database right now? Because that could explain the increase in the index space, even if the column truly isn’t being used in any indices (e.g., new rows).
The only other users on the database are people accessing the data via an intranet. And the occasional transaction from a website. There were probably no other users at the time I made the change.
I think I’m going to avoid doing any further messing with this database and just live with the increased size. I don’t want to risk doing any harm to it. It’s an important database.
If the database is carrying out other transactions (besides the ones you’re doing), then I don’t think you’re going to be able to get much useful information from us, regarding the increase in size from changing the datalength of a field. (It would be too much of a hassle, IMO, to account for the additional moving parts.)
The size information presented in the Properties window is retrieved using the database statistics, and those numbers aren’t usually 100% accurate (i.e., they’re not grabbed in “real-time”; they’re maintained in their own table).
As for shrinking, I would agree with you. It’s really only worth doing if you’re particularly pressed for space, or experiencing some really dire performance problems.
I believe that all of your other questions have been answered, but feel free to speak up if that’s not the case. Regarding this particular one, typically, the database is set to grow automatically, either as a percentage or a fixed number of bytes, when it needs room. You can also set it to not auto-grow (although that’s pretty rare in the various environments that I’ve consulted), and in that case, you definitely want to grow it manually before it hits the limit.
By the way, when you spoke of moaning about your company not employing a DBA, was that in response to someone in this thread? I haven’t seen anyone be anything other than helpful as, especially in the SQL Server world, most of us are very aware that many companies just sort of wing the administrative tasks with whoever happens to be willing and available.
I apologise if it seemed like that was a dig at anyone. And it was not in response to anyone in this thread. It was in anticipation of anyone coming into this thread to have a moan about companies that don’t employ dbadmins.
And that was because In one of my earlier threads about db administration someone DID come in and basically moaned that people come here with no dbadmin qualifications from companies that don’t employ dbadmins and expect people to do it for them. And he did it in a way that managed to belittle me for not having the skills or training to admin a db.
I didn’t mean anything negative by it. to anyone other than that person (if they choose to come into this thread)
You’ve answered my question about the space available bit.
But Nobody has really given a reasonable explanation as to why such a comparatively small change can make such a huge size difference. I worked it out - there are aprox 17 million records. And I’ve increased the size of one field by 30 bytes so that’s 17,000,000 * 30 = 510,000,000 bytes or 510 Megabytes. About 900 times smaller than the actual size increase that has happened.
The data file is allocated in increments according to performance metrics, not space considerations, as should be the case for a database for almost all applications. That means that changing one field size may or may not require an increase in data file size. For instance (pulling numbers completely out of the air), say a table with fields totaling 3000 bytes is allocated in 4096 byte blocks, allowing the engine to calculate the position of a row without having to look up its index. If the table is expanded a few times, adding less than 1096 bytes, the data file doesn’t need to grow. But, as soon as you do a MODIFY that crosses that threshold, the data file needs to be rebuilt with a larger allocation size.
It’s actually a little more complicated than that, because the blocksize is not necessarily related to the size of a single row; each block might contain several rows. In that case, you’ll cross thresholds more often as you increase the size of the columns.
Cerowyn gave you a better answer than the “It depends” I probably would have given, which is actually often the most apt answer when it comes to questions like those. There are all sorts of things going on behind the scenes, and things like pages/extents, fill factors, statistics update thresholds (mentioned upthread), and all sorts of other things that can also have an affect on reported object sizes. To be honest, I know of very few DBAs who worry too much about table sizes, other than perhaps rowcounts. SQL Server does a pretty decent job of micromanaging the lower level objects like tables, for the most part, so most DBAs are far more concerned with things like database sizes, where unexpected growth can be caused by nastier things, such as transaction logs gone wild.
We were all new at this once, so keep learning and asking questions along the way, ignoring anyone who thinks you should have been born with the answer. Some day soon you’ll get so into it that you’ll be cussing things like “parameter sniffing” problems, tabs passing an isnumeric() check, etc. At the same time, you’ll learn cool hacks, such as using built in XML functionality as an easy way to create comma delimited strings, using parsename to separate IP address segments (not its intended purpose at all), zero-date based math, and all sorts of other things to impress the nerds around you.
That was me! It is frustrating that companies hire people like you, pay them half of what I make, and then expect you to just google for the answers. Helpful people answer their questions and suddenly my value is diminished…see? I’m a nice guy and I answer questions on the internet, too (thus shooting myself in the foot), but everyone has their bad days.
But anyhow, did you do a rowcount before and after you increased the field size?
They didn’t employ me to do this job. It’s not my main job. It’s just a job I’ve ended up doing because my company can’t afford to employ an expert for every job. I also look after the website (also not my main job)
I did a rowcount before and after. This table is only updated once per day. The rowcount doesn’t change throughout the day.