DB Design :: Adding Index Taking More Than 24 Hours
Oct 7, 2015
Here is the DDL of a table to which I am trying to add an identity column and create a clustered index. I am using SQL 2012 SE.
CREATE TABLE [dbo].[IPF](
[IPFId] [uniqueidentifier] NOT NULL,
[IPId] [uniqueidentifier] NOT NULL,
[FId] [uniqueidentifier] NULL,
[FName] [nvarchar](50) NULL,
[FItemId] [uniqueidentifier] NULL,
[code]....
The table currently has 220 million rows. I am trying to add a new identity column and create a clustered index on it. I ran the script and it's been more than 26 hours and it's still running. I ensured there is no blocking.
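For reference, the change being attempted boils down to the pattern below; the new column and index names are placeholders rather than the exact script, and on 220 million rows both statements are size-of-data operations that rewrite the whole table:

ALTER TABLE dbo.IPF
    ADD RowId BIGINT IDENTITY(1,1) NOT NULL;      -- placeholder column name; every existing row must be assigned a value

CREATE CLUSTERED INDEX CIX_IPF_RowId
    ON dbo.IPF (RowId)
    WITH (SORT_IN_TEMPDB = ON, ONLINE = OFF);     -- ONLINE = ON needs Enterprise Edition, so SE rebuilds offline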
I previously posted about a problem where I added a non-NULL DEFAULT 0 bit column to a table with 80 million records. It was taking a LONG time and we needed that database up fast. It ended up taking a total of 17 hours.
Now my coworker added the same non-NULL DEFAULT 0 bit column to another table on another important server, but this table has more like 400 million rows. It's been running for 100+ hours and is still going. We were hoping it would scale linearly (5*80 million records would hopefully take 5*17 hours), but that isn't happening. I have no idea how much longer it will take. I really need this to be done. I'm tempted to cancel, but that will incur a potentially massive rollback, right? Any guesstimate on how large that would be?
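One hedged way to gauge how much work the running ALTER has already logged (and therefore roughly how much a rollback would have to undo) is to look at the open transaction's log usage from another session; the database name below is a placeholder:

SELECT st.session_id,
       dt.database_transaction_log_record_count,
       dt.database_transaction_log_bytes_used,
       dt.database_transaction_log_bytes_reserved
FROM sys.dm_tran_database_transactions AS dt
JOIN sys.dm_tran_session_transactions AS st
     ON st.transaction_id = dt.transaction_id
WHERE dt.database_id = DB_ID('YourDatabase');     -- placeholder name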
I am running a DB with 250Gb of documents, and the fulltext index just keeps growing and growing. The files in the MssearchCatalogDir folder are currently taking up 106Gb; it was only 74Gb this morning. The full text catalog size property says it's only 53.6Gb and is remaining steady, while the files in the MssearchCatalogDir folder seem to be ballooning out of control. I ran a reorg on the fulltext catalog and it did not free any file space (it actually increased it).
There is one *.ci file that is doing most of the growing; it's about three times as big as the second biggest one and is expanding before my eyes.
Should I have turned population off when I did the reorg?
Management Studio has an Optimize catalog option on the fulltext catalog properties dialog; is this different from a fulltext catalog reorg? Should I run this?
Should I run a shrink file on the filegroup containing the fulltext catalog (the filegroup itself is very small, all the space is in the MssearchCatalogDir folder)?
I have 140Gb left on this drive. Is it just going to keep on expanding until I'm out of room? I just don't know what I should do.
A bit more info...
I was rebuilding the index from scratch (needed to move it to another filegroup).
When I first started the rebuild, msftesql.exe and msftefd.exe were both taking a lot of CPU and the overall CPU usage was high. Now, after 12 hours, only msftesql.exe is running and is taking 5-10% CPU. Could it be that the rebuild is not complete yet? Although the Item Count property of the full text catalog does seem to indicate that all documents have been processed, the process keeps running and disk space keeps going down.
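For what it's worth, the catalog can be inspected and reorganized from T-SQL as well; the catalog name below is a placeholder, and as far as I can tell the SSMS "Optimize catalog" option issues the same REORGANIZE (a master merge of the index fragments):

SELECT FULLTEXTCATALOGPROPERTY('MyCatalog', 'IndexSize')      AS IndexSizeMB,
       FULLTEXTCATALOGPROPERTY('MyCatalog', 'ItemCount')      AS ItemCount,
       FULLTEXTCATALOGPROPERTY('MyCatalog', 'PopulateStatus') AS PopulateStatus;  -- 0 = idle

ALTER FULLTEXT CATALOG MyCatalog REORGANIZE;   -- merges the smaller *.ci fragments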
I am trying to add the hours between each time block stored in a database.
In this database a user enters the begin time and the end time. For example, the course MATH0001 would start at 8am and end at 10am, so the user would enter 0810 in the start field and 1000 in the end field. The course MATH0001 doesn't run the entire semester; it may only run from 8th Jan to 15th March, and the course is scheduled in a room called GR4. Because a course can be scheduled modularly, one room could have several courses scheduled in this manner.
The problem: I need to find out how many hours GR4 is used but it contains the following courses
A day only has 13 hours. Therefore the total hours spent in GR4 should be 12 hours. This is calculated by adding the hours between 8am and 5pm = 9 hours and between 5pm and 8pm = 3 hours. I would not include 9am to 1pm because it is a subset of the 8am to 5pm slot.
Now, how do I accomplish this? Below is the code that I have thus far:
I forgot to mention that this code was to just test my 'final code' results and it outputs the table shown above. Anyway for testing purposes I have limited the search to the room GR4 and the day Tuesdays.
Code:
select DISTINCT ssrmeet_room_code, ssrmeet_start_date, ssrmeet_end_date, ssrmeet_crn,
    ssrmeet_begin_time, ssrmeet_end_time,
    (((CAST(M.SSRMEET_END_TIME AS INT)) - (CAST(M.SSRMEET_BEGIN_TIME AS INT))) + 10) / 100 As HoursPerClass,
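One set-based way to avoid double counting the overlapping blocks is to merge overlapping intervals per room first and only then sum the durations. The sketch below is a rough T-SQL (2012+) illustration, not the final query; RoomUsage(RoomCode, BeginMin, EndMin) is a hypothetical stand-in where the 0810/1000 style times have already been converted to minutes past midnight:

WITH Ordered AS (
    -- For each block, find the latest end time of any earlier-starting block in the same room
    SELECT RoomCode, BeginMin, EndMin,
           MAX(EndMin) OVER (PARTITION BY RoomCode
                             ORDER BY BeginMin
                             ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS PrevMaxEnd
    FROM RoomUsage
),
Flagged AS (
    -- A block starts a new merged interval only if it begins after everything before it has ended
    SELECT RoomCode, BeginMin, EndMin,
           CASE WHEN BeginMin > ISNULL(PrevMaxEnd, -1) THEN 1 ELSE 0 END AS IsNewBlock
    FROM Ordered
),
Grouped AS (
    SELECT RoomCode, BeginMin, EndMin,
           SUM(IsNewBlock) OVER (PARTITION BY RoomCode
                                 ORDER BY BeginMin
                                 ROWS UNBOUNDED PRECEDING) AS BlockNo
    FROM Flagged
)
SELECT RoomCode,
       SUM(BlockEnd - BlockStart) / 60.0 AS HoursUsed
FROM (
    SELECT RoomCode, BlockNo,
           MIN(BeginMin) AS BlockStart, MAX(EndMin) AS BlockEnd
    FROM Grouped
    GROUP BY RoomCode, BlockNo
) AS Merged
GROUP BY RoomCode;

For the GR4 example, the 8am-5pm, 9am-1pm and 5pm-8pm blocks merge into a single 8am-8pm interval, giving the expected 12 hours.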
I have a rather complex query (to me at least) that I need to create, but I am unsure of where to start. The query requires me to copy existing data into a new row (which will then create a new ID) as well as update all existing records with the newly created ID. More specifically, I need to separate the data associated with LocationID 219 from its parent, CompanyID 992.
Ideally I want to copy the data associated with LocationId 219 and then make a new CompanyId with the copied data (which will also create a new LocationID). Since this new record is no longer going to be associated with CompanyID 992 I will want to remove/delete/drop it from that record.
Finally, and perhaps most difficult of all, I need to update all tables that reference the old IDs together (992 / 219) to reflect the newly created Company ID and Location ID.
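A rough sketch of that sequence, assuming identity keys and made-up table/column names (Company, Location, OrderHeader), might look like this; the real schema will differ:

DECLARE @NewCompanyId INT, @NewLocationId INT;

BEGIN TRANSACTION;

-- 1) Copy the company row and capture the new identity value
INSERT INTO dbo.Company (CompanyName, Address)        -- list the real columns, not the identity
SELECT CompanyName, Address
FROM dbo.Company
WHERE CompanyId = 992;
SET @NewCompanyId = SCOPE_IDENTITY();

-- 2) Copy the location under the new company
INSERT INTO dbo.Location (CompanyId, LocationName)
SELECT @NewCompanyId, LocationName
FROM dbo.Location
WHERE LocationId = 219;
SET @NewLocationId = SCOPE_IDENTITY();

-- 3) Re-point every referencing table at the new ids
UPDATE dbo.OrderHeader
SET CompanyId = @NewCompanyId, LocationId = @NewLocationId
WHERE CompanyId = 992 AND LocationId = 219;

-- 4) Once nothing references it, drop the old location from company 992
-- DELETE FROM dbo.Location WHERE LocationId = 219;

COMMIT;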
Hi all, I have a very large table with many columns: datetime, nvarchar, and integer fields. A program executes many types of queries with a WHERE clause. The date field is always in the WHERE clause, but other fields are present too: sometimes an integer field, sometimes an nvarchar field. Now I must create indexes for these queries. How should I choose the index columns? Should I create only an index on the datetime field, or one for every combination of columns in every type of query? In the second case I would have to create very many indexes, and that is very expensive for update/insert/delete operations on the table! Any ideas? thnx
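Since the date column appears in every WHERE clause, one common starting point (a sketch, not a rule) is a single nonclustered index that leads with the date column and carries the other frequently filtered columns, rather than one index per column combination; all names below are placeholders:

CREATE NONCLUSTERED INDEX IX_BigTable_EventDate
    ON dbo.BigTable (EventDate)                    -- the column every query filters on
    INCLUDE (IntFilterCol, NvarcharFilterCol);     -- lets the other predicates be evaluated without key lookups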
I want to create an index that will make the cost of the query as low as possible and also minimize the space used by the index. What type of index/parameters can I associate when I create an index? I already have a clustered index.
I want to have a table with 1 PK and 1 clustered index on Column2 and Id; this is my code:
CREATE TABLE [dbo].[Test] (
    [Id] INT PRIMARY KEY NOT NULL IDENTITY(1,1),
    [Column2] INT NOT NULL,
    [Column3] INT NOT NULL,
    [Column4] INT NOT NULL,
);
CREATE CLUSTERED INDEX IX_Test ON dbo.Test (Column2, Id);
Error: "Cannot create more than one clustered index on table..."I know, that PRIMARY KEY automatically create Clustered index on Id.Then I manually delete existing Clustered index, and run this code part again:CREATE CLUSTERED INDEX IX_Test ON dbo.Test (Column2, Id);Everything is fine, except I lost my PK on Id...How can I leave PK on Id and create my custom Clustering Index?
Nice easy one (hopefully) from a newbie on SQL 2000.
I have a table HolidayTakenBooked which is populated from a stored procedure via the following statement:
TRUNCATE TABLE HolidayTakenBooked
INSERT INTO HolidayTakenBooked
SELECT * FROM #TMP_HolidayTakenBooked ORDER BY ABR_Clock_No
I am finding that for certain values in the HolidayTakenBooked table, decimals are not being transferred correctly, i.e. 0.5 in the TMP table appears as 1 in the HolidayTakenBooked table.
I'm pretty sure that this is down to the data definition of the table; see the sample field below: [HOL_DaysTaken1] [decimal](18, 0) NULL ,
So the simple question here is: how do I define decimal places when I define a new table? When designing a new table in Enterprise Manager I select decimal, and the server does not allow me to change the value of 9 that it defaults to.
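In T-SQL (rather than the Enterprise Manager designer), the decimal places are the second argument of decimal(precision, scale); (18, 2) below is just an example, and changing an existing column may rewrite a large table:

-- New table: 18 total digits, 2 of them after the decimal point
CREATE TABLE dbo.HolidayTakenBooked_Example (
    ABR_Clock_No   INT            NOT NULL,
    HOL_DaysTaken1 DECIMAL(18, 2) NULL
);

-- Or change the existing column in place
ALTER TABLE dbo.HolidayTakenBooked
    ALTER COLUMN HOL_DaysTaken1 DECIMAL(18, 2) NULL;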
I have a problem setting the default value of a column. I am trying to set it to
(CONVERT([float],getdate()+(2),0))
However, SQL Server automatically sets it to
(CONVERT([float],getdate()+(2),(0)))
While it functionally does not change anything, we have a tool which compares the database schema against a pre-existing schema and shows this as an error. I have tried setting the value directly and through scripts, but it does not work either way.
I have an existing publication and I want to modify an index on my main server. I do not want the index to be pushed out to the other server, but I'm wondering if the index change will break replication. Any thoughts?
Hi all, I recently noticed, when trying to optimise a major query of a chess website I am the webmaster of, that adding an order by for "gamenumber", which is a clustered index field, as in for example "order by timeleft desc, gamenumber desc", actually speeded up the queries and reduced SQL Server 2000 timeouts. I have an ASP error log and I am fairly sure that a dramatic reduction in SQL Server timeouts is simply attributable to adding an extra, seemingly redundant order by field - the clustered index. Is this phenomenon at all possible, or is it my imagination?! Other special attributes of the query include the use of "TOP" to obtain a maximum specified number of rows. Perhaps it is just the unique characteristics of the query, but I would have thought that fewer order by fields would imply faster performance. Has anyone else noticed that a seemingly redundant order by column, on for example the clustered index column, can actually help speed up queries?! Best wishes, Tryfon Gavriel, Webmaster, www.chessworld.net
I have a table with a clustered index on columns (A,B,C,D,E,F). Now my query filters on B, D, F, e.g.: where B='Test1' and D='test2' and F='test3'. The execution plan asks me to create a nonclustered index on columns (B,D,F). Does it make any sense to create a nonclustered index when a clustered index is already available?
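It can make sense: the clustered key (A,B,C,D,E,F) only supports seeks when the leading column A is filtered, so a query on B, D and F alone still scans. A sketch of what the plan is suggesting (table and index names are placeholders):

CREATE NONCLUSTERED INDEX IX_YourTable_B_D_F
    ON dbo.YourTable (B, D, F);   -- the clustered key columns ride along automatically as the row locator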
CREATE TABLE [dbo].[A](
    [AutoID] [int] IDENTITY(1,1) NOT NULL,
    [ProID] [int] NOT NULL,
    [LID] [varchar](12) NOT NULL,
    [EventID] [varchar](12) NOT NULL,
    [HEventID] [varchar](12) NULL,
) ON [PRIMARY]
How should I create the appropriate index for this table? A few options that I think are OK:
Opt 1: create a primary key on AutoID with a clustered index; create a nonclustered index on ProID and EventID.
Opt 2: create a primary key on AutoID with a nonclustered index; create a clustered index on ProID and EventID.
Opt 3: create a primary key on ProID and EventID with a clustered index.
I have read through the article on primary keys and clustered and nonclustered indexing. However, when I want to apply the indexing, I feel a bit lost. Among the 3 options, what is the difference?
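For comparison, option 2 written out (a nonclustered primary key on the surrogate AutoID plus a clustered index on the columns that are actually searched) would look roughly like this; whether it beats option 1 or 3 depends on how the table is queried:

ALTER TABLE dbo.A
    ADD CONSTRAINT PK_A PRIMARY KEY NONCLUSTERED (AutoID);   -- constraint name is a placeholder

CREATE CLUSTERED INDEX CIX_A_ProID_EventID
    ON dbo.A (ProID, EventID);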
I have come up with a scenario where I have three tables: Product, Services and Subscription. I have to create one table, say Bundle, where I can have some of the product ids, service ids and subscription ids, i.e. a bundle may contain some products, services and subscriptions. How can I design these relations?
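One hedged way to model this is a Bundle header plus one junction table per item type; the sketch below invents the key column names (ProductId, ServiceId, SubscriptionId), so adjust them to the real tables:

CREATE TABLE dbo.Bundle (
    BundleId   INT IDENTITY(1,1) PRIMARY KEY,
    BundleName NVARCHAR(100) NOT NULL
);

-- One junction table per item type; a bundle can then contain any mix of items.
CREATE TABLE dbo.BundleProduct (
    BundleId  INT NOT NULL REFERENCES dbo.Bundle(BundleId),
    ProductId INT NOT NULL REFERENCES dbo.Product(ProductId),          -- assumed key name
    PRIMARY KEY (BundleId, ProductId)
);

CREATE TABLE dbo.BundleService (
    BundleId  INT NOT NULL REFERENCES dbo.Bundle(BundleId),
    ServiceId INT NOT NULL REFERENCES dbo.Services(ServiceId),         -- assumed key name
    PRIMARY KEY (BundleId, ServiceId)
);

CREATE TABLE dbo.BundleSubscription (
    BundleId       INT NOT NULL REFERENCES dbo.Bundle(BundleId),
    SubscriptionId INT NOT NULL REFERENCES dbo.Subscription(SubscriptionId),  -- assumed key name
    PRIMARY KEY (BundleId, SubscriptionId)
);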
I have data coming from a telephony system that keeps track of when an employee makes a phone call to conduct a survey and which project number is being billed for the time the employee spends on that phone call, in a MS SQL Server 2000 database (which I don't own). The data is being returned to me in a view (see DDL for w_HR_Call_Log below). I link to this view in MS Access through ODBC to create a linked table. I have my own view in Access that converts the integer numbers for start and end date to Date/Time and inserts some other information I need.

This data is eventually going to be compared with data from some electronic timesheets for purposes of comparing entered hours vs hours actually spent on the telephone, and the people that will be viewing the data need the total time on the telephone as well as that total broken down by day/evening and weekend. Getting weekend durations is easy enough (see SQL for qryTelephonyData below), but I was wondering if anyone knew of efficient set-based methods for doing a day/evening breakdown of some duration given a start date and end date (with the day/evening boundary being 17:59:59)? My impression is that to do this correctly (i.e., handle employees working in different time zones, adjusting for DST, and figuring out what the boundary is for switching from evening back to day) will require procedural code (probably in Visual Basic or VBA). However, if there are set-based algorithms that can accomplish it in SQL, I'd like to explore those as well. Can anyone give any pointers? Thanks.

DDL for view in MS SQL 2000 database:

CREATE VIEW dbo.w_HR_Call_Log
AS
SELECT TOP 100 PERCENT dbo.TRCUsers.WinsID, dbo.users.username AS Initials,
    dbo.billing.startdate, dbo.billing.startdate + dbo.billing.duration AS EndDate,
    dbo.billing.duration, dbo.projects.name AS PrjName,
    dbo.w_GetCallTrackProject6ID(dbo.projects.description) AS ProjID6,
    dbo.w_GetCallTrackProject10ID(dbo.projects.description) AS ProjID10,
    dbo.billing.interactionid
FROM dbo.projects INNER JOIN
    dbo.projectsphone INNER JOIN
    dbo.users INNER JOIN
    dbo.TRCUsers ON dbo.users.userid = dbo.TRCUsers.UserID INNER JOIN
    dbo.billing ON dbo.users.userid = dbo.billing.userid ON dbo.projectsphone.projectid = dbo.billing.projectid ON
    dbo.projects.projectid = dbo.projectsphone.projectid
WHERE (dbo.billing.userid <> 0)
ORDER BY dbo.billing.startdate

I don't have access to the tables, but the fields in the view come through as the following data types:

WinsID - varchar(10)
Initials - varchar(30)
startdate - long integer (seconds since 1970-01-01 00:00:00)
enddate - long integer (seconds since 1970-01-01 00:00:00)
duration - long integer (enddate - startdate)
ProjID10 - varchar(15)
interactionid - varchar(255) (the identifier for this phone call)

MS Access SQL statement for qryTelephonyData (based on the view, w_HR_Call_Log):

SELECT dbo_w_HR_Call_Log.WinsID, dbo_w_HR_Call_Log.ProjID10,
    FORMAT(CDATE(DATEADD('s',startdate-(5*60*60),'01-01-1970 00:00:00')),"yyyy-mm-dd") AS HoursDate,
    CDATE(DATEADD('s',startdate-(5*60*60),'01-01-1970 00:00:00')) AS StartDT,
    CDATE(DATEADD('s',enddate-(5*60*60),'01-01-1970 00:00:00')) AS EndDT,
    DatePart('w',[StartDT]) AS StartDTDayOfWeek, Duration,
    IIf(StartDTDayOfWeek=1 Or StartDTDayOfWeek=7,Duration,0) AS WeekendSeconds,
FROM dbo_w_HR_Call_Log
WHERE WinsID<>'0'
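For what it's worth, a purely set-based sketch of the day/evening split for calls that start and end on the same calendar day might look like the following. It is modern T-SQL rather than SQL 2000/Access syntax, uses 18:00 as the boundary, ignores time zones and DST, and Calls(interactionid, StartDT, EndDT) is a stand-in for the converted view:

SELECT interactionid,
       DATEDIFF(SECOND, StartDT, EndDT) AS TotalSeconds,
       CASE
           WHEN EndDT   <= EveningStart THEN 0                                 -- entirely before 18:00
           WHEN StartDT >= EveningStart THEN DATEDIFF(SECOND, StartDT, EndDT)  -- entirely after 18:00
           ELSE DATEDIFF(SECOND, EveningStart, EndDT)                          -- straddles the boundary
       END AS EveningSeconds
FROM (
    SELECT interactionid, StartDT, EndDT,
           DATEADD(HOUR, 18, CAST(CAST(StartDT AS DATE) AS DATETIME)) AS EveningStart
    FROM dbo.Calls
) AS c;

Calls that span midnight would need the same overlap logic applied per day, which is where a procedural approach may end up simpler.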
Can you have constraints such as [CreateBy] [nvarchar](30) NOT NULL DEFAULT (suser_sname()) on a table that has a columnstore index in SQL Server 2012, 2014, or 2016?
I am reading "SQL Server Query Performance Tuning Distilled",on page 104 it talks about one of the index design recommendationswhich is to choose the column that has very high selectivity of valuesinstead of a column that has very few selectivity of values.My question is if I have currently indexes on my tables that have1, 2, 3, 4, ... values only on thousands of rows, are these nonclusteredindexes pretty much useless indexes that I should get rid of?And I know that pretty much the number of selectivity values willalways remain very low.Thank you
We have a vendor created database with 9000+ tables, one of which has about 6 billion rows. The vendor redesigned the database recently and ever since we've had terrible performance.
What the vendor did was increase any and all varchar columns (tens of thousands of columns) to 256.
Before the upgrade we had no problems creating an index on the 6 billion row table; it would take 2 hours.
Now, after the upgrade, we let the index creation command run for 5 days and killed it because it was consuming terabytes of log space.
The previous design had a combined column width of 1049; it is now over 4000. The primary key itself is 1283 characters (the SQL limit is 900).
There is no additional data, just wider columns. Why are we unable to create the index?
What is happening inside SQL Server? Does SQL make "room" in memory for the index for the entire width of the potential max row length?
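One small diagnostic that may help the conversation with the vendor is comparing the declared width of the widened columns with what the data actually needs; note the MAX scan is itself expensive on a table this size, and the names below are placeholders:

SELECT MAX(DATALENGTH(SomeVarcharCol))              AS MaxBytesActuallyUsed,
       COL_LENGTH('dbo.BigTable', 'SomeVarcharCol') AS DeclaredBytes
FROM dbo.BigTable;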
We have implemented a very small reporting database which has a main table that started off small and has now grown to around half a million rows. Initially, there were no indexes on the table apart from a clustered index, but as the data has grown, performance has dropped and so we have added a number of indexes. This has resolved the performance issues.
Before creating the indexes, SQL Server had auto-created a number of statistics objects (_WA_Sys_000... etc). After creating the indexes, new statistics objects were created for the new indexes. In some cases, there are duplicate statistics (auto and index) for the same columns. Should I go through and drop the duplicate auto statistics? Will having duplicates cause issues at all?
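A sketch for finding the auto-created statistics and the columns they cover, so they can be compared against the index statistics before dropping anything (the table name is a placeholder):

SELECT s.name AS StatName, c.name AS ColumnName
FROM sys.stats AS s
JOIN sys.stats_columns AS sc
     ON sc.object_id = s.object_id AND sc.stats_id = s.stats_id
JOIN sys.columns AS c
     ON c.object_id = sc.object_id AND c.column_id = sc.column_id
WHERE s.object_id = OBJECT_ID('dbo.MainReportTable')   -- placeholder table name
  AND s.auto_created = 1;

-- Dropping one is then just, e.g.:
-- DROP STATISTICS dbo.MainReportTable._WA_Sys_00000003_0519C6AF;   -- hypothetical name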
I'm working on re-indexing a table using ALTER INDEX REBUILD commands from Microsoft. The indexes will be tested against a fragmentation threshold. My plan is that once the reindex is executed, a transaction log backup will occur to control the size of the log file. The query should also impose a time limitation, i.e. stop reindexing after a specified amount of time has elapsed.
My questions: 1. How can I integrate a query which checks whether the transaction log is getting full and runs a log backup if it is over 70% full? 2. How do I impose the time limitation?
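Two hedged building blocks for that, with placeholder database and backup path names (sys.dm_db_log_space_usage reports on the current database and needs SQL 2012+):

-- 1) Back up the log when it is more than 70% full
IF (SELECT used_log_space_in_percent FROM sys.dm_db_log_space_usage) > 70
    BACKUP LOG MyDatabase TO DISK = N'X:\Backups\MyDatabase_log.trn';

-- 2) Impose a time limit: set a deadline up front and bail out between rebuilds
DECLARE @Deadline DATETIME = DATEADD(HOUR, 2, GETDATE());
-- ...inside the reindex loop, after each ALTER INDEX ... REBUILD:
IF GETDATE() > @Deadline
    RETURN;   -- or BREAK if the loop is a WHILE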
I desire to have a clustered index on a column other than the Primary Key. I have a few junction tables that I may want to alter, create table, or ...
I have practiced with an example table that is not really a junction table. It is just a table I decided to use for practice. When I execute the script, it seems to do everything I expect. For instance, there are not any constraints but there are indexes. The PK is the correct column.
CREATE TABLE [dbo].[tblNotificationMgr]( [NotificationMgrKey] [int] IDENTITY(1,1) NOT NULL, [ContactKey] [int] NOT NULL, [EventTypeEnum] [tinyint] NOT NULL,
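For completeness, one way to script that change on an existing table is to re-create the primary key as nonclustered and then add the clustered index on the chosen column; the constraint and index names below are guesses:

ALTER TABLE dbo.tblNotificationMgr DROP CONSTRAINT PK_tblNotificationMgr;   -- guessed constraint name

ALTER TABLE dbo.tblNotificationMgr
    ADD CONSTRAINT PK_tblNotificationMgr PRIMARY KEY NONCLUSTERED (NotificationMgrKey);

CREATE CLUSTERED INDEX CIX_tblNotificationMgr_ContactKey
    ON dbo.tblNotificationMgr (ContactKey);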
Please explain the differences between the logical & physical operations whose graphical icons we can see in the execution plan tab in Management Studio.
Is there any way or option to get all the columns of a dataset added to a table when we add a table in the data region? It takes a lot of time to add them one by one, and there is also a chance of adding one column more than once.
I am using Full Text Index to index emails stored in a BLOB column in a table. The index process parses stored emails, and, if there are one or more files attached to the email, those documents get indexed too. As a result, when I'm querying the full text index for a word or phrase, I get a reference to the email containing the word or phrase of interest whether the word was used in the email body OR in any document attached to the email.
How do I distinguish in a Full Text query whether the result came from an embedded document rather than from the "main" document? Or, if that's not possible, how do I disable indexing of embedded documents?
My goal is either to give the user the option to search emails only (email bodies) OR emails AND the documents attached to them, or at least to clearly indicate in the returned result the real source where the word or phrase was found.