I have a table with about half a million records, each representing a patient in my county.
Each record has a field (RRank) which basically sorts the patients as to how "unwell" they are according to a previously-applied algorithm. The most unwell patient has an RRank of 1, the next-most unwell has RRank=2 etc.
I have just deleted several hundred records (which relate to patients now deceased) from the table, thereby leaving gaps in the RRank sequence. I want to renumber the remaining recs to get rid of the gaps.
I can see what I want to accomplish by using ROW_NUMBER, thus:
SELECT ROW_NUMBER() Over (ORDER BY RRank) as RecNumber, RRank FROM RPL ORDER BY RRank
I see the numbers in the RecNumber column falling behind the RRank as I scan down the results
My question is: How to convert this into an UPDATE statement? I had hoped that I could do something like:
UPDATE RISC_PatientList_TEMP SET RRank = ROW_NUMBER() Over (ORDER BY RRank);
but the system informs that window functions will only work on SELECT (which UPDATE isn't) or ORDER BY (which I can't legally add).
I have a table on an offline database having let say 2000 rows and same table on an online database with let say 3000 rows. Now I want to copy just those records (1000) from online database that are not in the offline database table. Also, both tables have same name, same schema etc.
I have a task to count distinct records in a big table with roughly 30M records, performance is an important factor. Query is to be written to calculate weekly stats, weekly record number could be as high as 1M.
The actual result is like: ID Policy 350235744Credit Cards 350235744PCI 350235744PCI Audit
So the final number for this particular Policy is 3
I can write the query like:
select count(distinct Incident_id) policy_name from Reporting_DailyDlpDetail Where (year(INSERT_DETECT_TS)=2015) and (month(INSERT_DETECT_TS) =6) and (day(INSERT_DETECT_TS) between 2 and 9) This returns 526254 and costs 11 seconds to complete
or a query like: Select distinct Incident_id, policy_name from Reporting_DailyDlpDetail Where (year(INSERT_DETECT_TS)=2015) and (month(INSERT_DETECT_TS) =6) and (day(INSERT_DETECT_TS) between 2 and 9) This returns 749687 and costs roughly 1 minute to complete.
Result is different from the two queries, I believe the later gives correct number. How can I count the distinct based on a combo?Considering the size of data, what is the best and most efficient way to run the stats calculation against over 30 different scenarios (different policies and alert types) and not timeout?
I need to query to return a result for each unique machine with the latest date. The example result below would be returned because they have the latest date.
MachineA 5/7/2011 MachineB 5/5/2010
Select Distinct would almost do it, but I need each unique machine that has the latest date.
Now I want the records having flag2=1 only.. I.e ID=3 has flag2=1 where as ID = 1 and 2 has flag1 and flag3 =1 along with flag2=1. I don't want ID=1 and 2.
I can't make ID unique or primary. I tried with case when statements but it I am somehow missing the basic logic.
I have 2 tables in a 1: n relation. How can i get a select statement that the field in the n-relation with outputs, separated by a semicolon; Example: One person have many Job Titles
Table1 (tblPerson) Table2 (tblTitles) 1, "John", "Miller", "Employee; Admin; Consultant" 2, "Joan", "Stevens", "Employee, Software Engineer, Consultant" and so on .... 1 in select statement:
any useful SQL Queries that might be used to identify lists of potential duplicate records in a table?
For example I have Client Database that includes a table dbo.Clients. This table contains various columns which could be used to identify possible duplicate records, such as Surname | Forenames | DateOfBirth | NINumber | PostalCode etc. . The data contained in these columns is not always exactly the same due to differences caused by user data entry; so some records may have missing data from some of the columns and there could be spelling differences too. Like the following examples:
1 | Smith | John Raymond | NULL | NI990946B | SW12 8TQ 2 | Smith | John | 06/03/1967 | NULL | SW12 8TQ 3 | Smith | Jon Raymond | 06/03/1967 | NI 99 09 46 B | SW12 8TQ
The problem is that whilst it is easy for a human being to review these 3 entries and conclude that they are most likely the same Client entered in to the database 3 times; I cannot find a reliable way of identifying them using a SQL Query.
I've considered using some sort of concatenation to a new column, minus white space and then using a "WHERE column_name LIKE pattern" query, but so far I can't get anything to work well enough. Fuzzy Logic maybe?
the results would produce a grid something like this for the example above:
ID | Surname | Forenames | DuplicateID | DupSurname | DupForenames 1 | Smith | John Raymond | 2 | Smith | John 1 | Smith | John Raymond | 3 | Smith | Jon Raymond 9 | Brown | Peter David | 343 | Brown | Pete D next batch of duplicates etc etc . . . .
I have a table that I need to do some computations on all the data but first I need to remove the duplicate records and insert the results into a destination table. Here's the example below. My table has 3.1 million rows. I have tried using the DISTINCT and the GROUP BY but both ways to select the data takes about half a minute to run. I'm wondering if there is a way to increase performance. Users are ok with this time since the process runs overnight but improving it won't hurt. I do have a clustered index on these fields but that doesn't seem to improve any.
I have around 3 tables having around 20 to 30gb of data. My table A related to table B by a FK and same way table B related to table C by FK. I would like to delete all rows satisfying certain condition from table A and all corresponding related records from table B and C. I have created a query to delete the grandchild first, followed by child table and finally parent. I have used inner join in my delete query. As you all know, inner join delete operations, are going to be extremely resource Intensive especially on bigger tables.
What is the best approach to delete all these rows? There are many constraints, triggers on these tables. Also, there might be some FK relations to other tables as well.
i need distinct max between a1&a2 which i can get no problem but i cant get that unique datetime that correspond to a1&a2 in 1 query because this is will a subquery in a big query. Creating another temp table etc is not an option for me. for every specific a1 there is many entries of a2 + timedate etc.
create table abc_test ( id int ,runs int ,date1 datetime
[code]....
I can either get distinct ID + latest date or ID + largest #ofRuns, both will do but also need the third column.
I have data in a table Item_TB that I need to extract in a way that pulls out the distinct pax name and all the ticket numbers associated with the passenger per booking reference.
The data is:
Branch Folder ID Pax TktNo BookingRef HQ 123 1 Jim 4444 ABCDE HQ 123 2 Bob 5555 ABCDE HQ 123 3 Jim 6666 ABCDE HQ 123 4 Bob 7777 ABCDE HQ 124 1 Jenny 8888 FGHIJ HQ 124 2 Jenny 9999 FGHIJ HQ 124 3 Jenny 3333 FGHIJ
I somehow need to get a function to pull the data out for each booking ref like so
--BookingRef ABCDE Jim 4444/ 6666 Bob 5555 7777 --BookingRef FGHIJ Jenny 8888/ 9999/ 3333
I know I can get a simple function to return the all data, but I do not know how to only include the pax name once.
How do I get distinct TitleID and Titles? Right now I'm still getting duplicates on them both. Here's my stored procedure. select Distinct (Titles.Titleid), Titles.Title as TITLE, classifications.[description]as TOPIC,Titles.descriptions,media.[description] as MEDIA from Titlesjoin resources on resources.Titleid = Titles.Titleidjoin media on media.mediaid = resources.mediaidjoin titleclassification on titleclassification.titleid = titles.titleidjoin classifications on classifications.classificationid = titleclassification.classificationid WHERE Title LIKE 'p' + '%' GROUP BY Titles.titleid, titles.title,classifications.[description],Titles.[descriptions],media.[description] Thanks!
hi, i want to ask a simple question, i have a sql server relational tables named "tbl_Contact" (w/c has a field of ID & ContactName) and "tbl_reviews" (w/c has ID,ContactCode & Reviews). the tables relationship is one-to-many, now my question is, how can i display a single record per 'ContactCode' from 'tbl_review'?? thanks..
Hi, I'm just wondering if someone can help me with some SQL syntax stuff.I want to take this sql statement: "SELECT TOP 50 tblProfile.chName, tblProfile.intCount FROM tblProfile, tblLinks WHERE (tblLinks.MemberID = tblProfile.MemberID) ORDER BY tblLinks.dtDateAdded DESC;" and select only unique "chName's" records
Hello, I have the following tables: declare @B table (Bid int identity, description varchar(50)) declare @P table (Pid int identity, Bid int, description varchar(50)) declare @T table (Tid int identity, description varchar(50)) declare @TinP table (TinPid int identity, Tid int, Pid int) insert into @B (description) select 'B1' insert into @B (description) select 'B2' insert into @P (description, Bid) select 'P1', 1 insert into @P (description, Bid) select 'P2', 1 insert into @P (description, Bid) select 'P3', 2 insert into @T (description) select 'T1' insert into @T (description) select 'T2' insert into @T (description) select 'T3' insert into @TinP (Tid, Pid) select 1, 2 insert into @TinP (Tid, Pid) select 2, 2 insert into @TinP (Tid, Pid) select 3, 3 select * from @B select * from @P select * from @T select * from @TinP I need to get all records in T (Tid and description) which are related to a given BId So for @Bi = 1 I would get: Tid Description 1 T1 2 T2 So I need the distinct values. How to solve this? Thanks, Miguel
TableName: Order_Archive Fields: orderid load_date filename order_date dollar
I load a file each week into a table, each file has unique orderid, load_date, filename, order_date and dollar. However the same orderid, order_date and dollar could appear in another file with different load_date and file_name.
I have a table contains information related to sales:
SO number Order Date Customer SellingPerson 1001 2012/07/02 ABC Andy 1002 2012/07/02 XYZ Alan 1003 2012/07/02 EFG Almelia 1004 2012/07/02 ABC John 1005 2012/07/02 XYZ Oliver 1006 2012/07/02 HIJ Dorthy 1007 2012/07/02 KLM Andy 1008 2012/07/02 NOP Rowan 1009 2012/07/02 QRS David 1010 2012/07/02 ABC Joey
Now, i want to write a query using CTE that gives me first five distinct customer in result set:
SO number Order Date Customer SellingPerson 1001 2012/07/02 ABC Andy 1002 2012/07/02 XYZ Alan 1003 2012/07/02 EFG Almelia 1006 2012/07/02 HIJ Dorthy 1007 2012/07/02 KLM Andy
I wrote this query :
With t(so_number,order date,customer, SellingPerson) as (select top 5 so_number,order date,customer, SellingPerson from t) select distinct billingcontactperson from t order by so_id
And getting this error:
Msg 252, Level 16, State 1, Line 1 Recursive common table expression 't' does not contain a top-level UNION ALL operator.