I have a requirement to implement CDC for 50+ tables so that the warehouse/reporting side takes incremental data changes rather than exporting whole tables. The largest table has more than half a billion records.
The warehouse uses a daily copy of the OLTP db (daily DB refresh). How can I accomplish this? Is there a downside to implementing CDC just for the sake of taking incremental changes on the tables?
Is there any performance impact if we enable CDC on the OLTP db?
Can we make use of the CDC tables on the environment where we do the daily db refresh, so that the queries don't hit the OLTP database?
What is the best way to implement CDC to take incremental changes for reporting?
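For context, enabling CDC is a two-step call, once per database and once per table. A minimal sketch (the table name is a placeholder):

EXEC sys.sp_cdc_enable_db;

EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'YourLargeTable',  -- placeholder table name
    @role_name     = NULL;               -- NULL = no gating role

CDC reads committed changes from the transaction log asynchronously via a capture job, so the overhead lands on log-reader activity and on storage for the change tables rather than directly on the OLTP write path.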
I'm trying to find a way to import data from this XML data feed to get daily exchange rates. I've tried:
select * from openrowset(bulk 'http://www.bankofcanada.ca/stats/assets/xml/fx-noon.xml', single_blob) as x
Which is a feeble attempt at a start; however, I am getting this error message:
Cannot bulk load because the file "http://www.bankofcanada.ca/stats/assets/xml/fx-noon.xml" could not be opened. Operating system error code 123(The filename, directory name, or volume label syntax is incorrect.).
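That error is expected: OPENROWSET(BULK ...) accepts only file-system paths (local or UNC), not HTTP URLs, so the feed has to be downloaded to disk first (e.g., by a scheduled script or an SSIS task). A minimal sketch once the file is local; the path and the element/attribute names in the XPath are assumptions about the feed's structure:

DECLARE @xml XML;

-- Load the downloaded file as a blob and cast it to XML.
SELECT @xml = CAST(x.BulkColumn AS XML)
FROM OPENROWSET(BULK 'C:\feeds\fx-noon.xml', SINGLE_BLOB) AS x;

-- Shred it into rows; '/feed/rate', @currency and @rate are assumed names.
SELECT n.value('(@currency)[1]', 'char(3)')       AS CurrencyCode,
       n.value('(@rate)[1]',     'decimal(18,6)') AS Rate
FROM @xml.nodes('/feed/rate') AS t(n);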
When using Change Data Capture on SQL Server 2012, I have read that you cannot truncate a table. Is this also true if one wanted to delete data from the table? I'm getting a little confused about which DDL statements can be run against a table with CDC enabled. Does CDC have to be disabled before performing certain DDL statements against a table?
I would like to safeguard against the truncation and dropping of certain tables within the dbo schema, and I'm wondering if I could do this in one fell swoop with CDC enabled on those tables. The other option would be to use a DDL trigger to prevent certain DDL statements from being performed.
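If you go the trigger route, a minimal sketch of a database-level DDL trigger that blocks drops and alters of dbo tables (the trigger name and message are placeholders). Note that TRUNCATE TABLE is not a DDL event, so a DDL trigger cannot catch it; with CDC enabled, truncate is blocked anyway, while DELETE is ordinary DML and remains allowed:

CREATE TRIGGER trg_Protect_dbo_Tables
ON DATABASE
FOR DROP_TABLE, ALTER_TABLE
AS
BEGIN
    -- Only guard objects in the dbo schema.
    IF EVENTDATA().value('(/EVENT_INSTANCE/SchemaName)[1]', 'sysname') = N'dbo'
    BEGIN
        RAISERROR('Dropping or altering dbo tables is blocked by policy.', 16, 1);
        ROLLBACK;
    END
END;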
I am tasked with identifying the source database name, id, and server name for each staging table that I create. I need to add this as a derived column on all staging tables created by merging the same tables on different servers together.
When doing a Merge Join, there is no way to identify the source of the data, so I would like to see if data came from one database more than the other servers, or if there are duplicates across servers.
The thing that bugs me about the SSIS Data Flow task is that there is no easy way to run an Execute SQL Task after I select my ADO.NET Source to get this information, because my connection string is dynamic and there is no way of knowing which data source is being picked up at runtime.
For example, I have a Products table on Server 1 and Server 2:
Server 2 has more Products, and I would like to join the two together to create a staging table.
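One workaround worth considering: rather than a Derived Column, let each source query stamp its own rows, so the origin travels through the Merge Join. A minimal sketch (table and column names are illustrative):

SELECT @@SERVERNAME AS SourceServerName,
       DB_NAME()    AS SourceDatabaseName,
       DB_ID()      AS SourceDatabaseID,
       p.ProductID,
       p.ProductName
FROM dbo.Products AS p;

Because these functions are evaluated on whichever server the dynamic connection string points at, the values are correct at runtime without any extra task.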
Hi all, I am now working on the design phase of my project. We are looking to implement Change Data Capture (CDC), but I need some help if you have implemented this before using the SSIS 2005 components. I am trying to use the following:
Source -> Derived Column -> Lookup -> Conditional Split (to split new records and updated records) -> Destination, respectively. To make it clear: my source holds old records plus newly added or updated records; the Derived Column is there to derive new columns called Insert_Date and Update_Date; the Lookup uses the Fact_Table (the old records) as a reference; and based on this lookup I will split the records on a time basis using the Conditional Split. My questions:
1. Am I using the right components?
2. What considerations should I keep in mind to make this work (some logic for the Conditional Split)?
3. Is there any script which helps with this strategy?
4. If you have a better idea, please share it; I need your help badly.
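For the Conditional Split logic, here is the same decision expressed as set-based T-SQL, just to make the rules concrete (table and column names are illustrative): unmatched rows are inserts, and matched rows whose tracked columns differ are updates.

SELECT s.*,
       CASE WHEN f.BusinessKey IS NULL        THEN 'INSERT'
            WHEN s.SomeColumn <> f.SomeColumn THEN 'UPDATE'
            ELSE 'UNCHANGED'
       END AS ChangeType
FROM dbo.Staging_Source AS s
LEFT JOIN dbo.Fact_Table AS f
       ON f.BusinessKey = s.BusinessKey;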
I have a SQL Server instance hosted on a Windows 2003 server, with a monitor on the server letting me know the average download speeds. When I do a large SELECT over a remote connection, the download rate looks like it maxes out at around 10 KB/sec.
Is this a typical download rate for a server hosting SQL Server?
Or can it record before and after column changes based on the LSN only?
An extract from a file-based legacy accounting system is performed every night. The system does not have a primary key because transactions are managed through program code (the more things change...). The extract is copied to text on Unix and FTP'd to Windows, where the file is loaded into SQL Server by kill & fill. Because of the expense of modifying the source system, there is enormous inertia/resistance to injecting a primary key at the source, so kill & fill it stays.
In reading about Change Data Capture, it seemed to me that column-level inserts, updates, and deletes are stored in tables that remember the before and after content of each tracked column. In my reading I have seen many references to the LSN being used to decide when and what to record as changed, but I have not seen any reference to the necessity of a primary key for Change Data Capture to work. This is in contrast to replication, where the requirement for the existence of a primary key is made plain.
Is it possible to use Change Data Capture against a table without a primary key? And how could it be used to change the extract from kill & fill to incremental?
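For reference, CDC itself can be enabled on a keyless table; a minimal sketch (the table name is a placeholder). The catch is that without a primary key or a unique index supplied in @index_name, only the all-changes query functions are available, so net-changes support must be switched off:

EXEC sys.sp_cdc_enable_table
    @source_schema        = N'dbo',
    @source_name          = N'LegacyExtract',  -- placeholder table name
    @role_name            = NULL,
    @supports_net_changes = 0;  -- must be 0 when there is no unique key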
I have attached some test data for you that has two temp tables, #worker and #worker_rate.
The issue is to find all workers who are sharing the SAME SET of rate_codes.
I'm able to get the output as "workers sharing the same rate_codes", but unfortunately I could not get the list of workers sharing the same SET of rate_codes. Also, a precise definition of SAME SET has not been given.
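One common approach, sketched under the assumption that #worker has a worker_id column, #worker_rate has (worker_id, rate_code), and a worker's codes are not duplicated: build a canonical ordered string of each worker's codes, then group workers on that string, so workers sharing a string share the exact SET of codes.

WITH codesets AS (
    SELECT w.worker_id,
           STUFF((SELECT ',' + r.rate_code
                  FROM #worker_rate AS r
                  WHERE r.worker_id = w.worker_id
                  ORDER BY r.rate_code
                  FOR XML PATH('')), 1, 1, '') AS code_set
    FROM #worker AS w
)
SELECT code_set, COUNT(*) AS workers_sharing
FROM codesets
GROUP BY code_set
HAVING COUNT(*) > 1;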
Production and development servers are on different domains, and the domains do not trust each other. How do I export the data in table t1 from database db1 in production and load it into table t1 in database db1 in development?
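Since the domains don't trust each other, cross-server Windows authentication is out; one minimal sketch, run from the development server, is a linked server with a SQL login (server name and credentials are placeholders), then a straight copy. A bcp out/in pair over SQL authentication works just as well if linked servers aren't allowed.

EXEC sp_addlinkedserver   @server = N'PRODSRV', @srvproduct = N'SQL Server';
EXEC sp_addlinkedsrvlogin @rmtsrvname  = N'PRODSRV',
                          @useself     = 'false',
                          @rmtuser     = N'reader_login',   -- placeholder
                          @rmtpassword = N'********';

INSERT INTO db1.dbo.t1
SELECT * FROM PRODSRV.db1.dbo.t1;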
I built an SSIS package (writing out to a flat file) on a 32-bit machine and it works fine. However, when I deploy it to the production server (64-bit), the SSIS package writes out garbage data. After some research I found that the problem is with 32-bit versus 64-bit execution. What is my next step? Am I out of luck, so that I will now have to redesign the SSIS package for 64-bit?
I have to install MDS on a production server without testing on a test server (there is no test/dev server). On the production server, SSRS reports are rendered each day and cannot be interrupted.
What is the risk of installing MDS on a production server? (The SSRS, SSIS, and engine services must not go down; well, they can for some hours.) This is SQL 2012 Enterprise.
What do I have to do first, and which steps should I take, to install as safely as possible for the currently running BI environment?
We have a production SQL 7 server, QA, and development. From time to time, I want to move just the data from the production server to the other two servers without modifying objects that may have been changed, such as stored procedures and rights. Is there a way, using the SQL tools provided, to move just the data? What also happens is that the rights to the objects change, which means my developers no longer have access to the tables for selects in QA, since the changes were overwritten by production, where they do not have those rights.
What is the better table design for a data collection application?
1. Vertical model (pk, attributeName, attributeValue)
2. Custom columns (pk, custom1, custom2, custom3 ... custom50)
Since the data elements collected may change year over year, which model better handles this column dynamism? (A sketch of both designs follows.)
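A sketch of the two designs, with illustrative names, so the trade-off is concrete: the vertical (EAV) model absorbs new attributes with no schema change, while the wide model needs an ALTER TABLE (or a spare custom column) every time an element is added.

-- 1. Vertical (EAV) model: new attributes need no schema change.
CREATE TABLE dbo.CollectedDataEAV (
    pk             INT           NOT NULL,
    attributeName  sysname       NOT NULL,
    attributeValue NVARCHAR(400) NULL,
    CONSTRAINT PK_CollectedDataEAV PRIMARY KEY (pk, attributeName)
);

-- 2. Wide model: one nullable column per possible attribute.
CREATE TABLE dbo.CollectedDataWide (
    pk      INT NOT NULL PRIMARY KEY,
    custom1 NVARCHAR(400) NULL,
    custom2 NVARCHAR(400) NULL
    -- ... up to custom50
);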
I need to feed a head office SQL Server with the data from regional servers. The servers are spread through all continents. Data input is done locally on the head office server as well, plus we need to ship data from the other servers. So to clarify: the head office server is not a standby one. Mirroring is out of the picture, I think. Initially, I thought to ship a log every 15 minutes and restore it on the head office server, but is this going to create an issue for the local data processing?
I am trying to configure the data collector on my server. I configured the data collector on server A with the setup on server B, but the "Query statistics collection set" does not show me any data.
I right-click and select the "collect and upload now" item and get a success result, but in the report I can't see any data...
Also, on the log page of data collection I see many errors with messages like this:
"Failed to create kernel event for collection set: {2DC02BD6-E230-4C05-8516-4E8C0EF21F95}. Inner Error ------------------> Cannot create a file when that file already exists."
I tried some solutions like disabling and enabling again, reconfiguring, removing and configuring again... but none of them worked.
Is it possible that Data Collection can cause a massive increase in MB/sec against tempdb? I cannot find the connection with tempdb, and I did set the cache file, but on the same disk.
Or could it be something different? Over the last two weeks, what I checked was read/write MB/s to tempdb increasing progressively.
At one point it was about 20 MB/sec; after a reset it was down to 1 MB/sec again.
What I also checked: the external company which installed SQL Server created only one file for tempdb. Next weekend, or during a break when it is possible, I would like to make it 8 files.
Now I saw that the tempdb MDF was still growing, but its space usage was only 8-10%.
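For the planned change, a minimal sketch of adding one of the extra files (the path and sizes are assumptions; new tempdb files take effect immediately, no restart needed):

ALTER DATABASE tempdb
ADD FILE (NAME       = tempdev2,
          FILENAME   = 'T:\TempDB\tempdev2.ndf',  -- placeholder path
          SIZE       = 4GB,
          FILEGROWTH = 512MB);
-- Repeat for tempdev3 through tempdev8, keeping all files equally sized.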
I am trying to build a query, to be used in an automated report, to calculate failure rates of systems based on cases opened through support. Here is where I am stuck: some systems may have multiple cases opened within the span of another case, yet we would consider this one failure:
System A   Case1   2013-07-11 13:17:09.000   2013-07-15 12:05:03.000
System A   Case2   2013-07-12 16:27:50.000   2013-07-12 16:29:12.000
System A   Case3   2013-07-12 17:30:32.000   2013-07-12 17:40:11.000
System A   Case4   2013-07-12 19:00:24.000   2013-07-12 19:04:14.000
System A   Case5   2013-10-01 18:02:23.000   2013-10-01 18:11:26.000
Let's say System A generated those 5 cases; however, Cases 2, 3, and 4 all happened within the same period as Case 1, so those cases should count as one failure, and my end result should be:
System A   Case1   2013-07-11 13:17:09.000   2013-07-15 12:05:03.000
System A   Case5   2013-10-01 18:02:23.000   2013-10-01 18:11:26.000
That system should then show me 2 failures. I was thinking of using a temp table, but I'm not sure if that is possible, as I am stumped on how to compare the dates to validate whether a case falls within the range of an older case, and whether or not to include it in the new temp table.
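One way to sketch it, assuming a table like dbo.Cases(SystemName, CaseName, OpenedAt, ClosedAt): a case is absorbed when it opens inside the window of an earlier case for the same system, and the surviving rows are the distinct failures. (This handles cases nested inside an earlier case's window, which matches the sample data; chains of partially overlapping cases would need an iterative or islands-style approach.)

SELECT c.SystemName, c.CaseName, c.OpenedAt, c.ClosedAt
FROM dbo.Cases AS c
WHERE NOT EXISTS (
    SELECT 1
    FROM dbo.Cases AS earlier
    WHERE earlier.SystemName = c.SystemName
      AND earlier.OpenedAt  <  c.OpenedAt   -- an older case...
      AND earlier.ClosedAt  >= c.OpenedAt   -- ...still open when this one starts
);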
This is Sanjeev. I have an SSIS package, and using my C# program I want to add one Execute Package Task to this package's Sequence Container.
It creates the new package without any problem, but when I open the package and try to move the newly created Execute Package Task, it gives the following error:
"The element cannot be found in a collection. This error happens when you try to retrieve an element from a collection on a container during the execution of the package."
I have a product where we feed a SQL 7 DB data collected from manufacturing. Presently, the data transport program is in charge of getting prepared data from machines and inserting it into the DB. This design assumes SQL 7 is always ready and able, which is not true, due to customer queries, backups, etc. consuming resources. There is a low-level buffer in the system at the manufacturing level in case transport dies, but transport is ignorant of SQL distress, so it keeps hammering the DB's front door. I'm looking for help in putting SQL Server in charge of allowing data in when resources are adequate. It seems I need a function that can determine server stress QUICKLY, to forestall the transport program, and a buffer for records at the transport layer. Has anyone known of or built a system where SQL Server CHECKS for waiting records, or OKs an external program to send until told to stop? What indicates (reliably) low server resources? Has anyone ever used MSMQ?
"Black Holes are proof SOMEBODY, SOMEWHERE really did have a particularly bad Y2K problem!"
I enabled Data Collection on one of the servers and planned to make it the centralized Management Data Warehouse. I configured data collection on it and can view reports. Next, I went to another server and configured "Set up data collection" to use my first instance as the centralized database. But the issue is I can only see reports for the first server. Am I missing something here?
I did exactly as explained in this video: [URL]...
I have an application which collects different types of history data from a SQL database. One type of this data is history log information, where the number of records can be really big.
What I would like to implement is a kind of paging mechanism for collecting this data. For example, the first call will return the first 100 rows, then the next call would get rows 101 to 200, etc....
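If the server is SQL Server 2012 or later, OFFSET/FETCH gives exactly this; a minimal sketch with illustrative table and column names (on 2005/2008 the same effect needs ROW_NUMBER()):

DECLARE @PageNumber INT = 1,   -- 1-based page index
        @PageSize   INT = 100;

SELECT LogID, LoggedAt, Message
FROM dbo.HistoryLog
ORDER BY LogID                  -- a stable order is required for paging
OFFSET (@PageNumber - 1) * @PageSize ROWS
FETCH NEXT @PageSize ROWS ONLY;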
I have a two-node SQL 2012 AlwaysOn HADR cluster (v11.0.3412) with 4 availability groups configured. The AG groups are set to synchronous mode, and the secondary is not readable (we do not want the synchronous replica readable, so we do not risk any reads causing contention; this way we maintain fast performance).
On the secondary we are getting a persistent failure with the Data Collector job called Collection_Set_3_Upload. The failure occurs within the second job step. That job step is executing the following command:
dcexec -u -s 3 -i "$(ESCAPE_DQUOTE(MACH))\$(ESCAPE_DQUOTE(INST))"
The error message is as follows:
Log Job History (collection_set_3_upload)
Step ID: 2
Server: CLUSTERNODE2
Job Name: collection_set_3_upload
Step Name: collection_set_3_upload_upload
Duration: 00:00:07
[Code] ....
I know I can prevent this error message by enabling readable secondaries, but we do not want that.
I have tried stopping the data collection jobs and purging the cache directory, but to no avail. It will succeed the first time, then persistently fail again with the same message every time after that.
In addition, if I set the one failing AG group to readable secondary, the job succeeds. So 3 of the 4 AGs work fine; only this one is having an issue.
Last weekend many of our servers had a failed job, collection_set_3_upload. The error that occurred is: "Violation of PRIMARY KEY constraint 'PK_active_sessions_and_requests'. Cannot insert duplicate key in object 'snapshots.active_sessions_and_requests'. The duplicate key value is (2824333, 2015-10-25 02:54:49.7630000 +02:00, 1)." Last weekend we happened to go from summer time to winter time, i.e., the clock passed 02:00-03:00 twice during that night.
In other words, there is a bug in the Data Collector component that collects data for the Management Data Warehouse: it uses local time instead of UTC. I've created a Connect item to report it to Microsoft. URL... So how do you get your process running again? The job will no longer run, because every 5 minutes it will keep trying to upload the conflicting data for the second 02:00-03:00 period. I've only found one solution: get rid of all data collected but not yet uploaded.
You do this by stopping the collection set (in SSMS go to Object Explorer -> <the server you want to fix> -> Management -> Data Collection -> System Data Collection Sets; right-click "Query Statistics" and select "Stop Data Collection Set"). Then you delete the cached results from the SQL Server machine's hard disk. These cached results are in files located in a Temp folder on the SQL machine itself, inside the AppData folder for the service account SQL Server Agent is running under. Usually it will be something like: C:\Users\<sql agent service account>\AppData\Local\Temp.
Inside this folder, delete all files that have 'QueryActivity' in their name. You'll lose all data collected since the start of winter time, but at least your data collection process will work again. After this you can start the collection set again by right-clicking it and selecting "Start Data Collection Set". Every 5 minutes the data will be summarised and uploaded into your management data warehouse.
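For what it's worth, the stop/start can also be scripted with the msdb collector procedures instead of the SSMS UI; a minimal sketch (collection set 3 is Query Statistics on a default install, but verify the id in msdb.dbo.syscollector_collection_sets):

USE msdb;
EXEC dbo.sp_syscollector_stop_collection_set  @collection_set_id = 3;
-- Delete the cached 'QueryActivity' files from the agent account's
-- Temp folder at this point, then:
EXEC dbo.sp_syscollector_start_collection_set @collection_set_id = 3;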
I am using SQL Server 2012, and part of the data captured by CDC is not making sense to me.
I have a table called Schema.Table1, and I enabled CDC on it by running sys.sp_cdc_enable_table. I see that a table called cdc.Schema_Table1_CT got created, which now gets an entry whenever I insert, update, or delete a record in the original table.
Up to this point everything works fine.
My original table has a NOT NULL INT column called AuditTrackerUserID with a default value of 1996. My application does not provide a value for this column, but because the column itself has a default value, records get inserted without error.
But when I execute the following query, I see multiple records with an __$operation of 3 and 1:
SELECT * from cdc.Schema_Table1_CT where AuditTrackerUserID IS NULL
My expectation is that I should never see any record returned by this query, because AuditTrackerUserID is a NOT NULL column, but I do.
Our company once encountered a database-disk-full issue, so we could not insert any more data. Is there a way I can monitor the data file size, or is there a monitoring tool that sends out alerts?
For example, it could show me what percentage of the data file has been used, e.g. 90%, so I know it's critical and I need to add more space to the database or disk drive.
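The percentage is easy to compute yourself; a minimal sketch of a per-data-file usage query (both size and FILEPROPERTY(..., 'SpaceUsed') are counted in 8 KB pages):

SELECT name                                       AS FileName,
       size * 8 / 1024                            AS SizeMB,
       FILEPROPERTY(name, 'SpaceUsed') * 8 / 1024 AS UsedMB,
       CAST(100.0 * FILEPROPERTY(name, 'SpaceUsed') / size
            AS DECIMAL(5,1))                      AS PercentUsed
FROM sys.database_files
WHERE type_desc = 'ROWS';   -- data files only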
I am monitoring the data file growth of the databases in a table. Every week I review how much space is left in each database. I am thinking of writing a query so that when the current free space left is less than 20% of the file size, it sends out an alert to me, so that I can manually resize the file. Is it good practice to resize the data file manually? If so, I believe this needs to be done when the server is least busy, since it can slow down the database. Also, do I need to re-index the tables once the data file is resized?
I am working on an application in which, at one point, I have to insert and delete a huge amount of data in my database. During some of these transactions the database raises an error like "the log file for this database is full... see the sys.databases view".
Please help me. If you need the exception, I will send it later.
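The error text points at the place to check; a minimal sketch (the database name is a placeholder): log_reuse_wait_desc tells you why the log cannot be reused, e.g. an open transaction or a pending log backup under the FULL recovery model.

SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
WHERE name = N'YourDatabase';   -- placeholder name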
Hi all, first post... first request for help :P. I hope to help someone else (based on my knowledge) in the near future :)
I'm a beginner DBA and I'm looking for help regarding a strange occurrence with SQL Server 2005 :)
I migrated a medium-sized (3.5 GB) database from SQL Server 2000 to SQL Server 2005. I made a backup in SQL Server 2000 (it generated a 3.5 GB BAK file) and restored it into SQL Server 2005 on another server. (I didn't check the data file size.)
I configured (with the wizard) a maintenance plan on the new SQL Server 2005 with these steps, in this order (tell me if the order of the steps is wrong :)):
1) check db 2) rebuild indexes 3) reorganize indexes 4) update statistics (all tables and all views)
I scheduled the maintenance plan for 3 a.m. and went home.
The next day I found that the data file (MDF) on SQL Server 2005 was 22 GB large!! I ran a shrink DB, but there was no free space to reclaim. It seems there really are 22 GB of data. (The data inside is the same as on the SQL 2000 server: same records, same tables... identical; no data changed in the meanwhile.)
How is this possible? What am I doing wrong? I don't understand what caused the growth of the file: the maintenance plan or the restore? (Unfortunately I didn't check the size after the restore; I checked it only the next day.) Maybe the statistics? (There wasn't an update statistics job in the old maintenance plan on SQL Server 2000.)
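As a first diagnostic, a minimal sketch: running sp_spaceused in the migrated database separates real data from unallocated space inside the file, which tells you whether the 22 GB is actual data or just empty padding left behind by the rebuild/reorganize steps.

-- Run inside the migrated database; @updateusage corrects stale counters.
EXEC sp_spaceused @updateusage = N'true';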
I have an SSIS package with a Data Flow task. This task transfers data from SQL Server 2000 to a table in SQL Server 2005.
I deployed and tested this package on the test server, then put the package in a job and executed it. It works fine.
On the production server, if I execute the package through DTEXECUI it works fine. But when I try executing it through a job, the job fails and gives me the following error:
Description: The external metadata column collection is out of synchronization with the data source columns. The "external metadata column "T_FieldName" (82)" needs to be removed from the external metadata column collection....
What I don't understand is why no errors are displayed when I execute the package through DTEXECUI.
Again, looking for the best way to do this with SSIS.
I have a source table and I'd like to load it to a database daily, capturing what changed.
This is not a dimensional table but a fact table.
So, what I'd need to do for each record is to see if the record already exists (using the business key); if it does, compare some of the data fields, and if there are changes, register them somehow; if there are no changes, ignore the record.
Right now, I only see two ways to do it with SSIS:
- Use the Slowly Changing Dimension transformation
- Use a Lookup and customize the SQL, adding something like: WHERE key = ? AND (field1 <> ? OR field2 <> ?...)
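A third option, if the target is SQL Server 2008 or later: land the extract in a staging table and let a T-SQL MERGE register inserts and changed rows in one statement. A minimal sketch with illustrative table and column names:

MERGE dbo.FactTarget AS t
USING dbo.StagingExtract AS s
      ON t.BusinessKey = s.BusinessKey
WHEN NOT MATCHED BY TARGET THEN
    INSERT (BusinessKey, Field1, Field2)
    VALUES (s.BusinessKey, s.Field1, s.Field2)
WHEN MATCHED AND (t.Field1 <> s.Field1 OR t.Field2 <> s.Field2) THEN
    UPDATE SET t.Field1 = s.Field1,
               t.Field2 = s.Field2;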