Scenario: The de-duplication logic picks one record from the source, checks it against all the destination records, and inserts it if it is not a duplicate; otherwise it raises an error. There are on average 8 to 10 lookup checks for each logic path. The key fields used for the de-duplication check are FirstName, LastName, DOB, Gender, and SSN.
Issues:
· Because rows are picked and processed one at a time, performance is constrained.
· Because the 8-10 comparisons are done using lookups, performance degrades downstream. (The Lookup is used without caching; if caching is used, the package fails after some time, apparently because it runs out of memory. Can we handle this memory problem?)
Please give some suggestions to improve the performance. Current throughput is around 2,500 records per hour, but there are 8 lakh (800,000) records in total to process.
I am looking for guidance on this issue. If someone can guide me on how to do it better it would help me a lot.
How can I achieve the following? I have a Membership No. field which comes from a bookings table, so the same membership number can appear many times. What I want is a list of membership numbers with no duplicates. Sorry to be so dumb, but we all have to start somewhere.
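For the membership list question above, a minimal sketch of the usual approach is SELECT DISTINCT on the membership column. The table name bookings, the column name membership_no, and the sample values are assumptions, since the post does not give them, and Python's built-in sqlite3 is used only as a self-contained stand-in for SQL Server. The SELECT DISTINCT statement itself is standard SQL.

```python
import sqlite3

# Hypothetical table and column names; the original post does not state them.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookings (booking_id INTEGER PRIMARY KEY, membership_no TEXT)"
)
conn.executemany(
    "INSERT INTO bookings (membership_no) VALUES (?)",
    [("M001",), ("M002",), ("M001",), ("M003",), ("M002",)],
)

# SELECT DISTINCT returns each membership number once, however many bookings it has.
rows = conn.execute(
    "SELECT DISTINCT membership_no FROM bookings ORDER BY membership_no"
).fetchall()
membership_nos = [row[0] for row in rows]
print(membership_nos)
```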
I have a table, Translation, which has these columns: ID, TypeID, Status, ComID, RecordID, Translator. In this table we assign a certain company with a certain TypeID to a certain translator, so the next time the translator logs in he goes to the company he was assigned to. On the company page there is another company page in Arabic that is translated by the translators, and on the Arabic page there is a button that sends the company for translation. But duplicates have been created because users sent the same document to the translation table many times, so translators translate the same company profile more than once, which is not good.
We have fixed the Send to Translation button so that it no longer creates duplicates, but there are still duplicate records in the database, more than 3,000 of them. How can I remove these records from the DB without causing any other errors?
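One common way to clean up rows like these is to keep the lowest ID in each group of duplicates and delete the rest. The sketch below assumes that two rows count as duplicates when TypeID, ComID, and RecordID match; the post does not say which columns define a duplicate, so adjust the GROUP BY accordingly. The sample data is made up, and sqlite3 stands in for SQL Server. Running the inner SELECT on its own first, and taking a backup, is a sensible precaution before deleting anything.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Translation (
           ID INTEGER PRIMARY KEY, TypeID INTEGER, Status TEXT,
           ComID INTEGER, RecordID INTEGER, Translator TEXT)"""
)
# Made-up sample rows: the first three describe the same company/document.
conn.executemany(
    "INSERT INTO Translation (TypeID, Status, ComID, RecordID, Translator) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (1, "new", 10, 100, "amal"),
        (1, "new", 10, 100, "amal"),
        (1, "new", 10, 100, "omar"),
        (2, "new", 11, 101, "omar"),
    ],
)

# Assumption: a duplicate is any row sharing TypeID, ComID and RecordID.
# Keep the row with the smallest ID in each group and delete the others.
conn.execute(
    """DELETE FROM Translation
       WHERE ID NOT IN (
           SELECT MIN(ID) FROM Translation
           GROUP BY TypeID, ComID, RecordID)"""
)
conn.commit()

remaining = conn.execute(
    "SELECT ID, TypeID, ComID, RecordID FROM Translation ORDER BY ID"
).fetchall()
print(remaining)
```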
I created a job to run a Reporting Services report, and it was named 354EEF12-404F-46BD-B54F-708B5027837F in the job scheduler. I then renamed the job to Rpt ETL log.
However, I was surprised to see two emails arrive for this report. It seems to have created another job with the long name.
Is there any way to stop this? I would really like to name the scheduled report jobs without them being duplicated.
Hello, I noticed a spelling mistake in the data in a column of several tables. I used the following statement to correct the spelling:
UPDATE [dbo].[Prod_Cat] SET [ProdName]=N'merseyside' WHERE ProdName = 'mmserseyside'
The above code correctly updated the spelling error, but it also inserted a new row with the corrected data, so I found myself with two identical rows containing the corrected information. I had to delete the extra row manually, because if I had used a DELETE statement I would have lost both rows. What do I need to do to prevent this from happening next time? I find that I need to update the names of some products, but I don't want to duplicate them. Thanks, Lynn
Hi, I have a set of Excel files that I need to export to SQL 2005 using SSIS. The issue is that I have no idea about the data, i.e. it may have duplicates in the primary key column. If I export it as it is to SQL Server, it will cause problems. Is there any way I can filter out the rows which have duplicates in the primary key column? Umer
I want to copy data from 4 different databases into 1 database, but if the destination database already has the same primary key, the copying stops and the remaining rows that are not yet in the destination are not copied.
I don't have much knowledge of T-SQL constructs like IF...ELSE. My database is SQL Server 2000, but I'm using SQL 2005 Express Management for the query.
What i'm doing is like this:
USE osa  -- destination database
GO
DELETE FROM tblFaculty  -- I delete the existing data first to avoid duplication
INSERT INTO tblFaculty (FacultyID, LastName, FirstName, MiddleName, Rank, DeptCode) (SELECT FacultyID, LastName, FirstName, MiddleName, Rank, DeptCode FROM cislucena.dbo.tMasFaculty)
INSERT INTO tblFaculty (FacultyID, LastName, FirstName, MiddleName, Rank, DeptCode) (SELECT FacultyID, LastName, FirstName, MiddleName, Rank, DeptCode FROM amapn.dbo.tMasFaculty)
INSERT INTO tblFaculty (FacultyID, LastName, FirstName, MiddleName, Rank, DeptCode) (SELECT FacultyID, LastName, FirstName, MiddleName, Rank, DeptCode FROM abe.dbo.tMasFaculty)
INSERT INTO tblFaculty (FacultyID, LastName, FirstName, MiddleName, Rank, DeptCode) (SELECT FacultyID, LastName, FirstName, MiddleName, Rank, DeptCode FROM aclc.dbo.tMasFaculty)
My problem is that if a FacultyID (primary key) I'm copying is already in the destination, osa, the copying stops regardless of whether there is more to copy. The 4 source databases may contain rows that the other databases also have, which is why the copying is terminated. All I want to do is check each FacultyID first to see whether it is already in the destination before copying it, to avoid a primary key error or duplication, so that the copying does not terminate.
How is this possible sir? Anyone care to help? Thanks in advance! More Power!
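One way to get this behaviour without IF...ELSE is to add a WHERE NOT EXISTS filter to each INSERT ... SELECT, so that only FacultyIDs missing from the destination are copied. The sketch below is a hedged illustration: it uses two small stand-in source tables (src_cislucena and src_amapn, hypothetical names) inside one in-memory SQLite database instead of the four real databases, and only two of the columns. In SQL Server the sources would be referenced by their three-part names, as in the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblFaculty (FacultyID INTEGER PRIMARY KEY, LastName TEXT)")

# Stand-ins for cislucena.dbo.tMasFaculty and amapn.dbo.tMasFaculty (hypothetical data).
sources = ("src_cislucena", "src_amapn")
for name in sources:
    conn.execute(f"CREATE TABLE {name} (FacultyID INTEGER, LastName TEXT)")
conn.executemany("INSERT INTO src_cislucena VALUES (?, ?)", [(1, "Cruz"), (2, "Reyes")])
conn.executemany("INSERT INTO src_amapn VALUES (?, ?)", [(2, "Reyes"), (3, "Santos")])

# Copy only the rows whose FacultyID is not already in the destination.
copy_new_rows = """
    INSERT INTO tblFaculty (FacultyID, LastName)
    SELECT s.FacultyID, s.LastName
    FROM {source} AS s
    WHERE NOT EXISTS (
        SELECT 1 FROM tblFaculty AS d WHERE d.FacultyID = s.FacultyID)
"""
for name in sources:
    conn.execute(copy_new_rows.format(source=name))
conn.commit()

faculty_ids = [row[0] for row in
               conn.execute("SELECT FacultyID FROM tblFaculty ORDER BY FacultyID")]
print(faculty_ids)
```

Note that this filter only compares against rows already in the destination. If a single source table contains the same FacultyID twice, that statement would still fail, so such a source would need de-duplicating first.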
I have a table with many fields, but there is a single field in which I do not want duplicates. If I index this specific field to prevent duplicates, the entire record does not append. (The field in question is not keyed.)
Hello, everyone. I am having problems catching a data duplication issue. I hope I can get an answer in this forum. If not, I would appreciate it if someone could direct me to the right forum. I am working on a VS2005 app which is connected to a SQL 2005 db. Specifically, I am working on a registration form. Users go to the registration page, enter their data (name, address, email, etc.) and submit to register on the site. The INSERT query works like a charm. The only problem is that I am trying to reject registrations that use an email address which is already registered. I put a constraint on the email field in the table, and now if I try to register using an email address that already exists in the database I get a violation error (only visible on the local machine) on the email field, which is expected. How can I catch the fact that the email address is already in the database, stop the execution of the code, and show a message asking the user to use a different address? Thank you for all your help.
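The general pattern for the registration question above is to let the unique constraint do the checking and catch the resulting error in application code. The sketch below shows the idea in Python with sqlite3, where the error surfaces as sqlite3.IntegrityError; the users table, the register function, and the message text are all hypothetical. In ADO.NET the equivalent would be catching SqlException and checking its Number property, where 2627 and 2601 indicate a unique constraint or unique index violation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the UNIQUE constraint on email rejects repeated addresses.
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT UNIQUE)"
)


def register(conn, name, email):
    """Try to insert a user; return (ok, message) instead of raising."""
    try:
        with conn:  # commits on success, rolls back if the insert fails
            conn.execute(
                "INSERT INTO users (name, email) VALUES (?, ?)", (name, email)
            )
    except sqlite3.IntegrityError:
        # The constraint fired: report it to the user rather than crashing.
        return False, "That e-mail address is already registered. Please use a different one."
    return True, "Registration complete."


first = register(conn, "Ann", "ann@example.com")
second = register(conn, "Bob", "ann@example.com")
user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(first, second, user_count)
```

Catching the constraint error is generally safer than running a SELECT first to see whether the address exists, because two registrations arriving at the same moment could both pass the SELECT check.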
I am using Remote Data Service to query a SQL Server database using MDAC. The OS on the server is Windows 2003 and MDAC version 2.8 is installed. I create a table X with an identity column, then insert records into that table using an INSERT INTO X SELECT * FROM Y statement. The statement executes, but when I check table X I find duplicate records with different identity numbers.
Even when I truncate and retry, the same thing occurs.
We have a large table which is very old and which few people look after. Recently a performance problem came up in a report that needs to query this table. Eventually we found that the table is missing its primary key and contains duplicate data, which makes "alter table add primary key" fail.
Besides, given the size of this table, something like "insert into new_table_with_pk select distinct * from old_table" takes an unacceptably long time to execute.
Do you have any recommendations for fixing this? As the application runs on Oracle, Sybase, and SQL Server, is there a cross-database approach that will work?
When I create a bar chart using SSRS 2012, the vertical axis values repeat for small data values. It only happens when the data values are below 5; when the data is above 5, the chart represents the data fine.
I tried specifying custom intervals, but this option eliminated the bar for value 1 altogether; instead it only showed the value 1 as text on the chart.
I tried changing the interval type to Number, and the data type is Integer; these are counts which I am showing in the chart.
Hello Everyone,
I have a very complex performance issue with our production database. Here's the scenario. We have a production web server and a development web server. Both are running SQL Server 2000. I encountered various performance issues on the production server with a particular query. It would take approximately 22 seconds to return 100 rows, that's about 0.22 seconds per row. Note: I ran the query in single user mode. So I tested the query on the development server by taking a backup (.dmp) of the database and moving it onto the dev server. I ran the same query and found that it ran in less than a second.
I took a look at the query execution plans and found that they were exactly the same in both cases. Then I took a look at the various indexes, and again I found no differences in the table indices.
If both databases are identical, I'm assuming that the issue is related to some external hardware issue such as disk space, memory, etc. Or could it be OS or software related, such as service packs, SQL Server configuration, etc.?
Here's what I've done to rule out some obvious hardware issues on the prod server:
1. Moved all extraneous files to a secondary hard drive to free up space on the primary hard drive. There is 55 GB of free space on the disk.
2. Applied SQL Server SP4
3. Defragmented the primary hard drive
4. Applied all Windows Server 2003 updates
Here are the prod server's system specs:
2x Intel Xeon 2.67 GHz
Total Physical Memory 2 GB, Available Physical Memory 815 MB
Windows Server 2003 SE w/ SP1
Here are the dev server's system specs:
2x Intel Xeon 2.80 GHz
2 GB DDR2-SDRAM
Windows Server 2003 SE w/ SP1
I'm not sure what else to do; the query performance differs by an order of magnitude and I can't explain it. To me it is a hardware or operating system related issue. Any ideas would help me greatly!
Thanks,
Brian T
We have the same application installed in a few different environments with similar servers and similar hardware. The only differences are the versions of SQL and the collations. Is SQL 2005 a lot faster than SQL 2000? Could the collation type have a big effect on performance? ScAndal
Hi, I want to insert thousands of records into a SQL Server 2005 database with some manipulation, so I put the insert inside a for loop. Inside the loop I open the connection and close it after use. The sample code is below:
for (int i = 0; i < 1000; i++)
{
    sqlCmd.CommandText = "ProcName";
    sqlCmd.Connection = sqlCon;
    sqlCmd.Connection.Open();
    sqlCmd.ExecuteNonQuery();
    sqlCmd.Connection.Close();
}
My question is: how is the performance of this code? Will it take time to open and close the connection in every iteration? Or should I open the connection before the loop and close it after the loop? Will that increase performance? Please clarify these questions for me. Thanks in advance.
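As a rough illustration of the second option in the question above (open once, loop, close once), here is a minimal sketch in Python with sqlite3 standing in for the SqlConnection. The records table, the insert_records helper, and the doubling "manipulation" are hypothetical. ADO.NET pools connections by default, so Open and Close inside the loop are cheaper than they look, but each iteration still pays for a pool round trip, so opening once is normally the better habit.

```python
import sqlite3


def insert_records(conn, values):
    """Insert every value over one already-open connection, in one transaction."""
    with conn:  # a single commit at the end instead of one per row
        for v in values:
            # Hypothetical per-row manipulation: store the doubled value.
            conn.execute("INSERT INTO records (value) VALUES (?)", (v * 2,))


# Open the connection once, before the loop.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, value INTEGER)")
try:
    insert_records(conn, range(1000))
    inserted = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
finally:
    # Close it once, after all the work is done, even if an insert failed.
    conn.close()
print(inserted)
```

Wrapping the whole loop in one transaction is a separate design choice from connection reuse; it usually helps bulk inserts as well, at the cost of all rows being rolled back together if one fails.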
This line, 'select * from [viewUserLatestFee]', executes instantly (in Query Analyzer). This line, 'select * from [viewUserLatestFee] where orgID = 1', takes up to 30 seconds for 1000 rows (still in Query Analyzer).
Can anyone please help? I seem to have run out of ideas.
I have a feeling people might be curious about the view so here it is:
We used a stored proc to pull totals from a database. Everything was fine until the table grew and started to time out. So we created a temp table to populate with a range of data and then pull the totals from there. Everything was fine until the table grew and started to time out. Any suggestion?
I have newly joined as a SQL DBA. I want to check the physical disk performance. We have RAID 5 with 5+1 disks. I calculated the number of I/Os per disk, but how do we know what the actual limit of I/Os per disk is?
What's my best bet for getting better performance out of one of my database servers? Currently we have 1 set of RAID 5 disks partitioned into 2 drives. This houses everything (system, database, and logs). The server has 2 slots left for drives, so I was thinking of adding 2 mirrored drives and moving the logs off the main database space. (Make sense?) This is a vendor-supplied application, so working with new indexes etc. isn't something I should do without the vendor's involvement. Will what I describe above help?
We have SQL Server running on a dual processor Pentium 500 MHz server. Our database is hit by about 300 users. 200 of those users are doing constant searches through a client table of about 250,000 records, which in turn is linked to a history table containing over 5,000,000 records. This is only the tip of the iceberg; we have many triggers, procedures, updates, etc. going on in the background. The database has over 500 tables.
Keep in mind, these searches that are taking place can involve all kinds of fields: phone number, company name, fax number, first name, last name, status, wildcard searches, etc. So as you can imagine, the database is being hit with all kinds of funky requests to find records. I will be the first to admit that our developers (vendor) are not the best code writers, and we have a tough time getting them to optimize something they do not even understand themselves.
As I speak, our processor utilization is maxing out between 95 to 100 percent. I've done a lot of performance tuning and all of the problems lie in the searching. We've built, tested, rebuilt, re-tested each and every index. I even used the Profiler to filter what I could. It has improved, but our database is growing at a rate of 10 megs a day (already close to 3 gigs, not that huge). I think I've optimized my indexes as best as I can considering all the fields and possibilities available to users to search for records.
For a database that requires all of these different search criteria, what would be a more optimal server? We are looking to purchase something ASAP. I could really use help from someone in a similar situation. It seems odd, in my mind, that a company of 300 people would need to rely on a quad server (four processor capability).
Hi, I have a 700 to 900 MB production database, 2 GB of RAM, and a 30 GB hard disk. My production machine is running very slowly. I have checked everything (memory, pages/sec, cache hit ratio, DBCC DBREINDEX) but performance is still not up to the mark. If I stop and restart SQL Server, the machine works fine for a few days, but after that the same thing happens again and it runs very slowly. What could be the reason? If anyone has a solution, please suggest it. Thanks, Nil
Hi friends, my company has an auction web site. It is written in Java and all SQL statements are generated dynamically; no stored procedures are used. With 30 users the site is OK, but with around 300 users the site becomes very slow (almost dead), and the developers say that the database is the bottleneck. Please help me with this problem: how can I check for and overcome it?
We have recently upgraded to SQL 7.0 on an NT 4.0/SP6 box which has 4 PIII 700 processors, 1 GB RAM, and 70 GB HDD on RAID 1 and RAID 5. We feel that the application performance in SS7 is not as good as expected. (The application was running smoothly in 6.5 and performance was good.)
Is there any option that needs to be set to improve performance? Currently SS7 is using all 4 processors, dynamically allocated memory, etc. Any thoughts are greatly appreciated.
I'm running MS SQL Server on a 1.4 GHz AMD Athlon processor with 750 MB of RAM and ample disk space. I have a table with 14 columns: 2 datetime, 8 int, and the rest varchar of various sizes less than 13.
I run a Java process on another machine that connects to the database and inserts records. It takes about 6 minutes to insert 100,000 records.
I run the XP performance monitor and only about 25% of the SQL Server machine's CPU is being used. I run top on the Linux box running Java and I see about the same results. Neither machine is kept busy processing. Why don't I get better performance? Could my local area network be that slow? How many inserts per minute is good performance?
Does anyone know the performance differences between returning data from SQL Server as XML vs. as a record set? We are about to dive into the For XML world full force, but we wanted to make sure that we are not heading for a performance nightmare.
Thanks for any insight on this. I'll try to look for white papers and do some testing in the meantime.
Declare Cursor for table A
WHILE @@FETCH_STATUS = 0
    Get values from other function based on some business logic.
    INSERT into another table B (or) UPDATE to another table B
END
I have to insert/update values in table B one row at a time, so it takes a long time. Is there any way to collect the values in temporary storage and then insert/update or move them to table B in one go?
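When the business logic can be written as a SQL expression, a cursor loop like the one above can often be replaced by two set-based statements: one UPDATE for rows that already exist in B and one INSERT ... SELECT for the rest. The sketch below assumes hypothetical tables table_a and table_b, a shared id key, and a made-up rule (total = qty * price); none of these come from the post. sqlite3 stands in for SQL Server, where the UPDATE would more commonly be written with UPDATE ... FROM and a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (id INTEGER PRIMARY KEY, qty INTEGER, price REAL)")
conn.execute("CREATE TABLE table_b (id INTEGER PRIMARY KEY, total REAL)")
conn.executemany("INSERT INTO table_a VALUES (?, ?, ?)",
                 [(1, 2, 5.0), (2, 3, 1.5), (3, 1, 10.0)])
conn.execute("INSERT INTO table_b VALUES (1, 0.0)")  # id 1 already exists in B

with conn:
    # Step 1: update every row of B that has a match in A, in one statement.
    conn.execute(
        """UPDATE table_b
           SET total = (SELECT a.qty * a.price FROM table_a AS a
                        WHERE a.id = table_b.id)
           WHERE id IN (SELECT id FROM table_a)"""
    )
    # Step 2: insert the rows of A that B does not have yet, in one statement.
    conn.execute(
        """INSERT INTO table_b (id, total)
           SELECT a.id, a.qty * a.price FROM table_a AS a
           WHERE a.id NOT IN (SELECT id FROM table_b)"""
    )

rows = conn.execute("SELECT id, total FROM table_b ORDER BY id").fetchall()
print(rows)
```

If the logic really cannot be expressed in SQL, the values can still be computed into a temporary table first and then moved to B with the same two statements.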
1. Where do we see the buffer cache hit ratio? Can we set the buffer cache hit ratio manually?
2. When we examine a query execution plan for a performance issue, which parameters do we check in order to take action?