I have a table that currently holds about 5 million records. We add an average of 5,000 new records per day, all of them from overnight batch jobs. I guess it's not that big, but there are two text columns that hold a couple KB each, so the total size isn't exactly small either. The data is created from medical billing data we receive overnight. We get two reports- patient demographic information and a physician's dictation relating to that patient. This data is always one to one, and the purpose of this table is to store the data as we originally receive, which is why both reports are in the same table. After we extract the details from the report, (which are by this point always reduced to text documents) we need to keep not only the data but the original documents, hence the two text columns. We considered moving the large columns to their own table, which would just have an ID field and the column, but the powers that be really wanted all this in the same table. Nothing new goes into the table during day- it's all SELECT statements.
I need to add a column to this table. It's just a small char(7) column, NULLS allowed, of course. We bill for several clients, and reports from different clients become available at different times, so there's really no down time overnight. Altering the table during the day is out of the question. So how can I add a column while the table is active?
My best idea so far is to use SELECT *, NULL AS NewColumn INTO NewTable to create a copy of the table (using a cast to get the correct datatype) during the day, when no new data is going in, and replacing the old table with the new by simply changing the names right after everyone goes home. But this could still cause slowdowns while it builds the copy, and leave the problem of re-creating indexes (there are several). There ought to be some graceful way to tell it to add the column to the existing table and play nice with ongoing traffic.
i need to add a datetime column to an exisitng table that has like 1.2 million records and its being accessed frequently but i cant afford to stop the db at all
whenever i do : alter table mytable add Updated_date datetime
it just takes too long and i have to stop executing the query after a couple of mins I am running sql express 2005 sp2. db size is over 3 gb but still under the 4 gb limit
can u plz advice on how to add this column. its urgent!!
I need to update a large table, about 55 million rows, without filling the transaction log, in the shortest time as possible. The goal is to alter the table and change the data type for Text column from VARCHAR(7900) to NVARCHAR(MAX).
Since I cannot do it with an ALTER TABLE statement (it would fill up the transaction log) I'm thinking to:
- rename column Text in Text_OLD - add Text column of type NVARCHAR(MAX) - copy values in batches from Text_OLD to Text
The table is defined like:
create table DATATEXT( rID INTEGER NOT NULL, sID INTEGER NOT NULL, pID INTEGER NOT NULL, cID INTEGER NOT NULL, err TINYINT NOT NULL,
[Code] ....
I've thought about a stored procedure doing this but how to copy values in batch from Text_OLD to Text.
The code I would start with (doing just this part) is the following, but maybe there are more efficient ways to do it, or at least there's a better way to select @startSeq in the WHILE loop (avoiding to select a bunch of 100000 sequences and later selecting the max).
declare @startSeq timestamp declare @lastSeq timestamp select @lastSeq = MAX(sequence) from [DATATEXT] where [Text] is null select @startSeq = MIN(Sequence) FROM [DATATEXT] where [Text]is null BEGIN TRANSACTION T1 WHILE @startSeq < @lastSeq
IF NOT EXISTS (SELECT TOP 1 1 FROM dbo.syscolumns WHERE id = OBJECT_ID(N'dbo.Employee) and name = 'DoNotCall') BEGIN ALTER TABLE [dbo].[Employee] ADD [DoNotCall] bit not null Constraint DoNot_Call_Default DEFAULT 0 IF ( @@ERROR <> 0 ) GOTO QuitWithRollback END
It just takes a LOT of time in SQL Server Management studio. I have to cancel the query and cancelling takes a whole lot time. I am using SQL Server 2008.
I am studying indexes and keys. I have a table that has a fixed width of data to be loaded in the first column which is parsed in a view based on data types within the fixed width specifications.
Example column A: (name phone house cost of house,zipcodecountystatecountry) -a view will later split this large varchar string based column b: is the source filename of the data load (varchar 256) ....
a. would there be a benefit of adding a clustered or nonclustered index (if so which/point in direction on why)
b. is there benefit of making one of these two columns a primary key (millions of records) or for adding a 3rd new column as a pk?
c. view: this parses the data in column a so it ends up looking more like "name phone house cost of house zipcode county state country" each having their own column.
-any pros/cons of adding indexes (if so which) to the view instead of the tables or both for once the data is parsed?
I have a Data Flow Task that extracts some data using a DataReader Source and loads it to a Raw File Destination. I am getting the following error message:
[DataReader Source [2357]] Error: The value was too large to fit in the output column "LASTCOL" (2558).
I thought that using a Raw File Destination would avoid this type of problem. How can I resolve this issue?
I'm trying to store a binary data file in my database. I've tried data types image, varchar(max) and text. I don't get error message on loading the data but as soon as the text file exceeds 32,000 bits a query returns an empty data set.
Is this a SSMS display problem and the data is really there? Or is this another one of Microsoft's memory bugs?
I have run the select query which returned one row. There is one column in it which has got large amount of data. I want to copy the complete content of that column(varchar(max)), but I am unable to do it. It's not the xml data. I don't want to do any conversions.
I am wondering if tempdb stores all results tempararily whenever I query a large fact table with over 4 million records which joins another dimension table? Since each time when I run the query, the tempdb grows to nearly 1GB which nearly runs out all the space on my local system drive, as a result the performance totally down. Is there any way to fix this problem? Thanks a lot in advance and I am looking forward to hearing from you shortly for your kind advices.
I am getting the following error on my SSIS package. It runs a large amount of script components, and processes hundred of thousands of rows.
The exact error is: The value is too large to fit in the column data area of the buffer.
I redirect the error rows to another table. When I run just those records individually they import without error, but when run with the group of 270,000 other records it fails with that error. Can anyone point me to the cause of this issue, how to resolve, etc.
I have a variable nvarchar(1000) that I ma reading into the buffer of a data flow task in the script component script task. It gives me this error: "Script component exception.........The value is too large to fit in the column data area of the buffer."
I looked at the BufferColumn members and tried to set the maxlength to 1500. But it does not help.
I'm trying to transfer data from DB2 Database to SQL Server 2005.
Well, i used the OLE DB Source, the Data Conversion Component and the OLE DB Destination component.
I have five Data flows with this configuration above. But I am receiving an error message from one of them.
Please check below the error message:
"[Source Table TARTRATE [1]] Error: The value was too large to fit in the output column "ADJ_RATE_PCT" (60). "
"[Source Table TARTRATE [1]] Error: The "component "Source Table TARTRATE" (1)" failed because error code 0xC02090F8 occurred, and the error row disposition on "output column "ADJ_RATE_PCT" (60)" specifies failure on error. An error occurred on the specified object of the specified component."
Hello there,I have and small excel file, which when I try to import into SQlServer will give an error "Data for source column 4 is too large forthe specified buffer size"I have four columns in the excel file, one of the column contains alarge chunk of data so I created a table in SQL Server and changed thetype of the field to text so I could accomodate this field but stillno luck.Any suggestions as to how to go about this.Thanks in advance,Srikanth pai
We have a table to 100M rows and up until now we were fine with an non clustered index a varchar(4000) because we never went above 900 bytes (yes it is a bad design).We have the need to support international character sets now so the column was updated to nvarchar(4000) and now we have data past the 900 byte limit.
The data is long, seems useless but is needed by the business and they need to be able to search "where bigcolumn like 'test%'". With an index, even with a huge amount of data, it was 'fast'. Now of course without an index it is unusable. The wildcard is always at the end of the search. I made a full text index on the column and basic queries such as: select * from ourtable where contains(bigcolumn, 'AReallyLongStringofTextHere') works fine unless there is a space in the data. We loose thousands of returned rows because of spaces in the data.
I have tried select * from ourtable where contains(bigcolumn, '"AReallyLongStringofTextHere that includes spaces"') but not all of the data is returned. I get 112 rows with the contains statement. The table scanning statement of "select * from ourtable where bigcolumn like 'AReallyLongStringofTextHere that includes spaces%' returns 1939 rows.I understand that a full text index is breaking the long string up since it contains spaces. Is there a way to retain the entire string as 1 index entry or is there a way to fix my query to return all of the rows?
I have a problem to import xls file to sql table, using MS SQL 2000 server. Actual main problem associated with it is xls file contain one colum having large amount of text which length is approximate 1500 characters. I am trying to resolve it through like save xls to csv or text file then import but it also can not copy whole text of that column, like any column in xls having 995 characters then text or csv file contain 560 characater. So, it is also wrong.
I’m attempting to use DTS to import data from a Memo field in MS Access (Jet 4.0 OLE DB Provider) into a SQL Server nvarchar(4000) field. Unfortunately, I’m getting the following error message:
Error at Source for Row number 30. Errors encountered so far in this task: 1. Data for source column 2 (‘Html’) is too large for the specified buffer size.
I also get this error message when attempting to import the same data from Excel.
Per the MS Knowledgebase article located at http://support.microsoft.com/?kbid=281517, I changed the registry property indicated to 0. This modification did not help.
Per suggestions in other SQL Server forums, I moved the offending row from row number 30 to row number 1. This change only resulted in the same error message, but with the row number indicated as “Row number 1�. (Incidentally, the data in this field is greater than 255 characters in every row, so the cause described in the Knowledgebase article doesn’t seem to be my problem).
You might also like to know that the data in the Access table was exported into this table from a SQL Server nvarchar(4000) field.
Does anybody know what might trigger this error message other than the data being less than 255 characters in the first eight rows (as described in the KB article)?
I’ve hit a brick wall, so I’d appreciate any insight.Thanks in advance!
In my quest to get the Script Component as Source to work, I've come upon an error that says "The value is too large to fit in the column data area of the buffer.". Of course, I went through the futile attempt to get debugging to work. After struggling and more searching, I found that I need to run Dts.Events.FireProgress to debug in a Script Component. However, despite the fact that the script says:
I get a new error saying: Error 30451: Name 'Dts' is not declared. Its like I am using the wrong namespace, but all documentation indicates that Microsoft.SqlServer.Dts.Pipeline.Wrapper is the correct namespace. I understand that I can use System.Windows.Form.MessageBox.Show, but iterating through 100 items makes this too cumbersome. Any idea what I may be missing now?
I am developing an application that has a table with lots of records(network traffic) but the data is summarize every so often to create summary records (old records are deleted). The problem is that I have a PK based on an autoincrement ID (int) that will run out of numbers. However, this ID is not referenced anywhere, (not a foreign key from another table, not use for deletion and there is no update in this table whatsoever).
So my possibilites are: 1.- reseed the id when it is about to run out. 2.- make the id bigint 3.- remove the id and change the PK to 2 other fields 4.- remove the id and without PK
I am leaning toward option 4, because I do not see the need for a PK, but I understand that it is quite out of the normal.. So I would like to hear from other people ( I do not have much experience with DB).
I also like option 3. I already have a index on one of the other fields (time).
If I use BCP to export a very large table will that table be blockedfor writes during the export process? I don't want to prevent usersfrom accessing that table during the bcp process?Thank You, TFD.
Hi every one, I am facing problem in printing the reports from browser and also when i export it to pdf,the problem i am facing is blank pages are coming when report column getting the large amount of text around 2500 characters into column value. can any one help me in this issue?. if the report is getting acceptable amout of data it is printing in proper way i.e no balnk pages at all.i maintained all properties like margins+body size < page size.
Hi, I have this page that upload's PFD's to a table. In principle this works fine. Until I try to upload large files (3 to 4 MB)I need to even upload larger files than that. (Don't really know as of yet what users are going to come up with) I get TimeOut problems. Now some people say it is not possible to exceed a limit of about 4 MB. But that there is a workaround by changing something to the web.config file.Can somebody give me info about that, (I am quite a novice really)I tried to change it like this, but to no avail: <system.web><httpRuntime maxRequestLength="102400"enable = "True"requestLengthDiskThreshold="102400" useFullyQualifiedRedirectUrl="True"executionTimeout="102400"/></system.web> Thanks for any help!
On a nightly basis, this table gets rebuilt in a temporary database. Once the table has been built and scrubbed, i need to move it into our webservers db.
I'd like to do this with minimal interuption to the website.
Possible techniques:
1) I could set up a DTS package to copy the table object overwriting the destination table
2) I could export to a flat file and then bulk import into the live table (after truncating it)
3) I could run a process to update smaller chunks of data at a time running delete queries and insert queries.
Anybody have a thought on the best way to do this so that the web users would be virtually unaware that anything was happening ?
I am absolutely innocent as far as T-SQL is concerned. I need to detect all duplicates (key consists of 5 fields) in the table and delete the duplicates. I tried different approaches like joins etc but nope. Any help is appreciated Thanks
OK, I imported 680 million records into an unindexed table. That went well.
Then, I went into Enterprise Manager and added a two column non-unique clustered index to that table to speed access.
It's been running for ~36 hours and I have no idea when it will complete. I have deadlines that I'm going to miss and am very nervous; what can I do?
SQL Server 2000 Enterprise Edition (8.00.818 - sp3 + hotfixes) Dual 3Ghz Xeon (two physical CPUs each have HyperThreading enabled) Windows 2000 SP4 4GB RAM (although I just noticed the 3GB OS switch wasn't on) SCSI boot drive tempdb, data, and transaction log are on a FibreChannel RAID SAN
Hi folks! I'm looking for advice on partitioning a large table. In the DDL below I've changed names to protect the guilty.
My table has this schema:
CREATE TABLE [dbo].[BigTable] ( [TimeKey] [int] NOT NULL, [SegmentID] [int] NOT NULL, [MyVal] [tinyint] NOT NULL ) ON [BigTablePS1] (TimeKey) -- see below for partition scheme
-- will evaluate whether this one is needed, my thinking is yes -- based on the expected select queries. create index NCI_SegmentID on BigTable(SegmentID asc)
The TimeKey column is sort of like a unix time. It's the number of minutes since 2001/01/01, but always floored to a 5 minute boundary. so only multiples of 5 are allowed.
Now, this table will be rather big. There are about 20k possible SegmentIDs. For every TimeKey from 2008/01/01 to 2009/01/01 (12 months), I'll have on the order of 20000 rows, one for each SegmentID.
For the 12 month period, there are 365*24*60/5=105120 possible TimeKey values. So the total rowcount is over 2 billion. (20k * 105120)
Select queries are expected to be something like this:
-- fetch just one particular row... select MyVal from BigTable where TimeKey=5555 and SegmentID=234234
--fetch for a certain set of SegmentID and a particular time... select b.SegmentID ,b.MyVal from BigTable b join OtherTable t on t.SegmentID=b.SegmentID where b.TimeKey=5555 and t.SomeColumn='SomeValue'
Besides selects, also I need to be able to efficiently issue update statements against the table with new values in the MyVal column based on a range of TimeKey values (a contiguous span of a few days) and sets of about 1000 SegmentID. updates would always look like this:
update t set t.MyVal=p.MyVal from BigTable t join #myTempTable p on t.TimeKey=p.TimeKey and t.SegmentId=p.SegmentId
where #myTempTable would have order of 1000*24*60 rows in it, all with contiguous TimeKey values, and about 1000 different SegmentID values. #myTempTable also has a clustered pk on (timekey asc, SegmentId asc).
After the table is loaded, it would never get any inserts or deletes. only selects and updates.
Given the size, and the nature of the select and update queries, this table seems like a good candidate for partitioning. I'm thinking it makes sense to partition on TimeKey.
So my question is, is it stupid to create a separate partition for each day in the year long span of TimeKeys this table covers? That would mean 365 partitions in the partition function and partition scheme. Something like this:
CREATE PARTITION FUNCTION [BigTableRangePF1] (int) AS RANGE LEFT FOR VALUES ( 3680640 + 0*1440, -- 3680640 is the number of minutes between 2001/01/01 and 2008/01/01 3680640 + 1*1440, 3680640 + 2*1440, 3680640 + 3*1440, ...snip... 3680640 + 363*1440, 3680640 + 364*1440, 3680640 + 365*1440 ); GO
CREATE PARTITION SCHEME [BigTablePS1] AS PARTITION [BigTableRangePF1] TO ( [PRIMARY],[PRIMARY],[PRIMARY], ...snip... [PRIMARY],[PRIMARY],[PRIMARY] ); GO
does anyone have any experience with partitioned tables with so many partitions? Is a few hundred partitions too many? From my understanding of partitions, seems like having so many will be ok. Is it somehow worse than having hundreds of tables in a database?
Even with one partition for each day, I'll still have 24*60*20000/5 ~ 5m rows in each one.
Greetings All, I was wondering what would happen if I were to do a"select * from table" on a table that has about 5 million rows. Wouldmy read block other writers to the same table? Would it block otherreaders? I know SQL uses optimistic lockign by default but I am notsure what this means to other users trying to access the same table?Any advise would be greatly appreciated.TFD
Quick question:Does SQL do table/schema changes "in place"?I've got a large table (140+ million rows of very widedata) that we want to change the schema on -- basicallyto remove a number of the unused data elements that wedon't use.Anyway, does anyone know if SQL will do an in-placechange, or if it will copy the table to a new table, therebyincreasing my space allocation needs? I'd effectively,temporarily, need space for two tables while the changeis happening if it copies the table first. This is not good asI do not have enough available space at the moment.If you've got pointers to specific MS docs regardingthis issue, please let me have 'em.Thanks in advance.
I have query that takes 12 minutes to execute. The query uses around 9 tables but I have narrowed down the problem to one table that has over 65 million rows. The problem table has only 3 fields
The query uses the primary key of this table to perform the join. FieldTwo and FieldThree are only used as output parameters.
I noticed if I remove FieldTwo and FieldThree from the output (but still leave the table in the query), the query executes in 1 second. However if I include FieldTwo and FieldThree in the output, the query takes over 12 minutes to execute.
I cannot index FieldTwo and FieldThree because of the field size and I cannot reduce the size of the fields because of the data that needs to be stored in it? How can I index or do something similar to speed up the table look up.
I have been asked to look at some performance isssues with an application that utilises a 800GB table. This table is huge and contains 4 int columns and 1 decimal column. The table has a clustered index that covers 4 of the int columns and is heavily fragmented and it has not been maintained for a long time. The system has limited free space to even attempt rebuilding the index. Does anyone have any experience of running a the Alter Index Reorganize command on such a large table? Any information on what storage would be required to attempt this, how long would this take?
I have to migrate an Oracle Db to SQL Server 2005, including a 450 gb table with images. The estimate is that it will take about 24 hours to move this data. I€™m using SSIS with just one OLEDB input and sending to one OLEDB output. The SSIS process will be running on the destination SQL Server with 2gb of memory and at least half that memory in use by other apps (including SQL Server). I tried the SQL Server Destination but received errors trying to use it.
Are there any suggestions on settings or the best way to do this?