I have a table...say tb1 of 20 columns which has 2.7 million rows. There is no PK and the only way of identifying a unique row can be done with combination of column1+column2+column3.
Can anyone help me how to idetify the duplicate rows and also delete the duplicate rows. And to commit after every 5000 rows.
ITS VERY URGENT....Thanks in advance.
Rajesh writes "Hi I have a table. I need to see the newly inserted row on the day. i.e. newly inserted data alone. I dont have any date field in the table."
Hi Everyone,I have a table in which their is record which is exactly same.I want to delete all the duplicate keeping ony 1 record in a table.ExampleTable AEmpid currentmonth PreviousmonthSupplimentarydays basic158 2001-11-25 00:00:00.000 2001-10-01 00:00:00.000 2.004701.00158 2001-11-25 00:00:00.000 2001-10-01 00:00:00.000 2.004701.00158 2001-11-25 00:00:00.000 2001-10-01 00:00:00.000 2.004701.00I want to delete 2 rows of above table.How can I achieve that.Any suggestion how can i do that.Thank you in advanceRichard
i am trying to delete rows where a particular column (hours) has the same value for the same member (primary key) but where the effective dates are different. i want to delete the duplicate(s) rows which have the most recent effective date(s).
One of my customer is asking to delete 95% of the rows in an table which has around 645708 rows in a table.Now my concerns are that , what criteria we need to follow while deleting huge records in table, what are the steps we need to be taken care of?And after deletion, what are the maintenance tasks we need to perform to be up to mark without any issues.Lastly will deletion of 95 % of rows improve performance of a table?
I am using a cursor to navigate on data...of a table.... inside the while @@fetch_status = 0 command I want to delete some rows from the table(temporary table) in order to not be processed... The problem is that I want this deletion to affect the rows the cursor has.
Dear Gurus,I have table with following entriesTable name = CustomerName Weight------------ -----------Sanjeev 85Sanjeev 75Rajeev 80Rajeev 45Sandy 35Sandy 30Harry 15Harry 45I need a output as followName Weight------------ -----------Sanjeev 85Rajeev 80Sandy 30Harry 45ORName Weight------------ -----------Sanjeev 75Rajeev 45Sandy 35Harry 15i.e. only distinct Name should display with only one value of Weight.I tried with 'group by' on Name column but it shows me all rows.Could anyone help me for above.Thanking in Advance.RegardsSanjeevJoin Bytes!
Hi! I have stored prosedure with only one parameter - date to retrive data by datetime column (that includes date and time (h:min:sec) portions). This query doesn't work when I try to select using LIKE operator by date portion
select * from <table> where <datecolumn> LIKE '20011206%'
BOL says: It is recommended that LIKE be used when you search for datetime values, because datetime entries can contain a variety of dateparts. For example, if you insert the value 19981231 9:20 into a column named arrival_time, the clause WHERE arrival_time = 9:20 cannot find an exact match for the 9:20 string because SQL Server converts it to Jan 1, 1900 9:20AM. A match is found, however, by the clause WHERE arrival_time LIKE '%9:20%'.
Help!!! please!!!... I'm pulling my hair off... Having an external hosting for sql server is terrible for .net. Too many restrictions and they are not willing to help at all... I cannot connect directly, I have to do everything using strings. Honestly and it's a shame to admit it I never learned how to connect using strings, only using the wizard (upsizing DBs from Access to SQL Server)finally, after a month of trying I was able to connect to the database using a string like this:Dim CS As String Dim db_name, db_username, db_userpassword As String Dim db_server As String db_server = "blah,blah,secureserver.net" /yes, with them >:( db_name = "DB##" db_username = "***" db_userpassword = "***" Me.SqlConnection1.ConnectionString = "SERVER=" & db_server & ";DATABASE=" & db_name & ";UID=" & db_username & ";PWD=" & db_userpassword I am probably not using it right but it's working to capture info using stored procedures but I've been trying to get my datagrids to connect and they are impossible... Any ideas? what would be the coding part of the datagrid/dataset/etc... connection? That probably means that I can't use the property builder or the dataset generator, right? Any ideas ? Thanks (using 1.1 Framework)
I used the following select statement to get duplicate records on Case_number column
select cases.distinct case_link, cases.case_number from cases group by case_link having case_number > 1
I got the error message that
"'cases.warrant_number' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. AND cases.case_number' is invalid in the HAVING clause because it is not contained in either an aggregate function or the GROUP BY clause.
Any idea on a better statement to use. THANKS FOR YOUR HELP!
Hi, I have a table and this is what i did to get the desired result
Select A.col1,count(A.col1) from Tab1 group by col1 having count(A.Col1) > 1
i tried this - but it didnot worked - it returned col1 as blanks - Select A.col1,B.Col2,count(A.col1) from Tab1 A, Tab2 B where A.col1 = B.col1 group by A.col1 , b.col2 having count(A.Col1) > 1
As I was looking for all the rows that are apperaing more than once.
Now - The problem -
I have to join this table to another table Tab2 to get the other details. My Tab2 is a table from where I have to pull the Customer DEtails like name,address etc. How should I write this query? Any thinuhts? TIA
Hi. I'm a SQL Server newbie, very experienced with Access, developing an ASP.NET database editor web app. I query the database with a statement more or less in the following form:
SELECT organisation.OrgID, organisation.Name, organisation.whatever FROM services INNER JOIN servicegrouping ON services.serviceID=servicegrouping.serviceID INNER JOIN organisations ON servicegrouping.OrgID = organisations.OrgID WHERE services.service=x OR services.service=y
In other words, I have a database of organisations. The services offered by the organisations are in a separate table, and I only want to return organisations that offer services X or Y.
Okay, now if I did this in Access, this query would return just one record for each organisation that meets the condition, unless I was to include a field from the services table in the SELECT clause, in which case of course I would get one record for each organisation and unique service offered.
But in MS SQL, the query returns duplicate rows if there is more than service offered by the organisation that meets the WHERE condition (=x or =y). Why is this and what do I need to do to my SQL statement to ensure I only get unique rows?
I've a query which gets a set of data from multiple tables -
select * FROM A inner JOIN q ON (RIGHT(q.name,CHARINDEX('-',REVERSE(q.name))-1)= a.id) inner JOIN t ON (t.id = q.id) inner JOIN s ON (q.name = s.name ) inner join l on (s.name = l.name and t.name = l.name)
WHERE A.id = 764 and s.name = '764'
I get repeated # of rows for each id. I've some 136 rows for each q.id ( there are 6 q.ids and hence I get 816 rows instead of 136) These 136 rows are actually divided among thse q.ids as
Hello, I have a question, what does a statement look like that finds the duplicate rows and combines them, I have a table named PRODUCTS in it 3 columbs Cost, Stock, Part_number. I need to find all Part_numbers that dublicate, Combine the rows into 1 & combine (sum, add) their stock together is the new row & take an avarerage of their cost and use it as cost in the new row where they combine. Please help me, I am stalled. Looked all over the internet & could not find anything, I really need this for a project I can not finish. I have the following SQL statement: SELECT part_number, COUNT(part_number) AS NumOccurrences FROM Products GROUP BY Part_number HAVING COUNT(part_number) > 1
I have a csv file that I need to import daily into a SQL Server 2005 table. Much of the table contents could just be overwritten with the new csv file, however there are a set of Rows within the table that need to be appended to , rather than overwritten. There is no Primary Key in the csv file that can be used. I'm not sure this is the best approach, but what I have been trying to do, is append the entire csv file to the existing table, and then go back and delete the duplicates. When I run the Delete, it does delete the majority of the records, but leaves a couple hundred behind. The number left behind varies with each run, can't seem to identify a pattern here. Running the Delete a second time does clean up the rows left behind in the first execution of the Delete, and gives the result I want. Any thoughts as to why this needs to be run twice? Or is a better approach available? Here is my code - SELECT [Pkg ID], [Elm (s)], [Type Name (s)], [End Exec Date], [End Exec Time], dupcount=count(*) INTO temppkgactions FROM pkgactions GROUP BY [Pkg ID], [Elm (s)], [Type Name (s)], [End Exec Date], [End Exec Time]HAVING count(*) > 1
DELETE TOP (SELECT COUNT(*) -1 FROM dbo.temppkgactions WHERE dupcount > 1 ) FROM dbo.pkgactions DROP TABLE temppkgactions
I have a large table that consists of the columns zip, state, city, county. The primary key "zip" has duplicates but the rows are unique. How do I filter out only the duplicate zips. Randy Garland
Hi, I am encountering a problem. There are lots of duplicate rows in the cobol flat files (due to improper data entry and missing columns values )from where I am transforming data to sql 7. 0 tables using DTS. After transformation , can I some how mark the duplicate rows ? it is not for the purpose of eliminating them, but to enter the missing values and make all the rows complete and unique. I have the transformed table as a temporary table. Can I add a column like 'status' etc.. and have the column values marked '1' for the repeating rows etc.... Can anyone suggest 'any' possible way of implementing it ? Thanx Nisha
I have problem in deleting duplicate rows. I have a identity column in my table, if I try to use correlatted sub query with Delete command it gives error.
The other problem I have is I have a date column in my table and update that column with current date and time. If use a query to fetch a records on a particular day , it does not return any rows
select * from rates where ch_date >='02/11/99' and ch_date<='02/11/99'
If I use convert also there is some other problems. Is there any way to force date checkings to be done excluding time.
CAN ANYBODY REPLY FOLLOWING QUESTIONS. I WANT TO DELETE DUPLICATE ROWS IN MY TABLE WITHOUT USING TRANSACTION TABLE. AND ONE MORE QUESTION HOW TO GET YESTERDAY DATE BY USING ISQL WINDOW.
This is an imaginary problem while discussing ROWID in ORACLE.
Consider a table without primary key, unique key, uniuqe index. A row has inserted into the table many times. I want to delete all but one dulicated rows. With any 'where' clause all rows(duplicated) will be deleted. In ORACLE i can achieve this using ROWID as follows:
Delete from Table_name where < all column values > and ROWID <> ( Select max(rowid) from Table_name where < all column values > )
How can this be achieved in MS SQL Server 6.5 ?
According to Dr. Codd's Golden rules for RDBMS one is that One should be able to reach each data value in the database by using table name, row idenfication value and column name.
Does MS SQL Server 6.5 satisfy this requirement ?
Also How many of Dr. Codd's 13 Golden Rules for RDBMS does MS SQL Server 6.5 Satisfy? Which doesn't ?
Hi everyone, I'm migrating some information for a client at the moment. They had everything in Excel files and I'm getting them into SQL Server. There are some differences in the way I am storing data and the way they were storing data.
For each client they stored, they had something like Rel1 Rel2 Rel3 100 101 102
Now, what I have is a seperate row for each of Rel1, Rel2 and Rel3 so I would have 3 seperate rows with identical information except for Rel1. So I would have: Rel1 100 101 102
So one way I thought of doing it was inserting a new row specifying that the value for Rel2 should be stored in Rel1 and for the next row that the value for Rel3 should be stored in Rel1.
Now, I am able to do this but SQL Server inserts an extra row will the NULL value in Rel1. Does anyone know why this would be happening? I think what it is doing is finding a NULL value in Rel3 after creating the two extra rows and is inserting that NULL. So I think I need to check for NULLs and not allow it to create a new row if, say, Rel3 is NULL.
Any pointers are gladly welcome. (I know it's complicated )
Hello I am fairly new to SQL and having spent much time over the manual I decided to ask for help. So here's my deal.
I've got a query with 5 tables that I join together
Code:
SELECT * FROM Map INNER JOIN ThreatCategory INNER JOIN Threat ON ThreatCategory.threatCategoryID = Threat.threatCategoryID INNER JOIN Threat_Map ON Threat.threatID = Threat_Map.threatID ON Map.mapID = Threat_Map.mapID LEFT JOIN person on map.contentPersonID = person.personID WHERE (((DATEDIFF(dd, Map.dataAcquisitionDate, GETDATE()) > map.goodForDays) and (map.expired = '1')) or (map.expired = '3'))
The problem is the table Threat_Map is a many to many mapping between the Map table and the Threat table. Eg) A map can have more than one threat and a threat can have more than one map. I know this is not the best way to have a database set up but its out of my hands as to changing the database. What I need help with is this.
My application checks as to whether a certain field in the Map table is expired or out of date (as in the query). If so it gets some required information from the other tables using those joins. However, I don't want to get information for the same Map.mapID that's expired twice. I don't really care which ThreatID I get from the Threat_Map table I just need to get one of them to meet the objects standards. However, so far this seemingly simple task has eluded me. I'd like to do this in SQL. Is there perhaps a way to do this. If not I guess I'll just take care of it in the application.
Please give the DML to SELECT the rows avoiding the duplicate rows. Since there is a text column in the table, I couldn't use aggregate function, group by (OR) DISTINCT for processing.
Table :
create table test(col1 int, col2 text) go insert into test values(1, 'abc') go insert into test values(2, 'abc') go insert into test values(2, 'abc') go insert into test values(4, 'dbc') go