Transact SQL :: Optimize Count Distinct For Multiple Groups Of Records?
May 21, 2015
I have a CTE returning a recordset which contains a column SRC. SRC is a number which I use later to get counts and sums for the records in a distinct list.
declare@startdate date = '2014-04-01'
declare@enddate date = '2014-05-01'
; with SM as
(
SELECT --ROW_NUMBER() OVER (PARTITION BY u.SRC ORDER BY u.SRC) As Row,
u.SRC,
[Code] ....
-- If Referral start date is between our requested dates
ref.Referral_Start_Date between @startdate and @enddate
OR
-- Include referrals which started before our requested date, but are still active during our date range.
(ref.Referral_Start_Date < @startdate and (ref.Referral_End_Date > @startdate OR ref.Referral_End_Date IS NULL ))
)
INNER JOIN c_sdt s on s.Service_Delivery_Type_Id = u.Service_Delivery_Type_Id
AND s.Service_Delivery_Unit_Id = 200
)
SELECT
count(distinct (case SRC when 91 then client_number else 0 end)) As Eligable_91,
I have a task to count distinct records in a big table with roughly 30M records, performance is an important factor. Query is to be written to calculate weekly stats, weekly record number could be as high as 1M.
The actual result is like: ID Policy 350235744Credit Cards 350235744PCI 350235744PCI Audit
So the final number for this particular Policy is 3
I can write the query like:
select count(distinct Incident_id) policy_name from Reporting_DailyDlpDetail Where (year(INSERT_DETECT_TS)=2015) and (month(INSERT_DETECT_TS) =6) and (day(INSERT_DETECT_TS) between 2 and 9) This returns 526254 and costs 11 seconds to complete
or a query like: Select distinct Incident_id, policy_name from Reporting_DailyDlpDetail Where (year(INSERT_DETECT_TS)=2015) and (month(INSERT_DETECT_TS) =6) and (day(INSERT_DETECT_TS) between 2 and 9) This returns 749687 and costs roughly 1 minute to complete.
Result is different from the two queries, I believe the later gives correct number. How can I count the distinct based on a combo?Considering the size of data, what is the best and most efficient way to run the stats calculation against over 30 different scenarios (different policies and alert types) and not timeout?
I have a table called Employee which have 6 columns. This table are having 1000 records. Now I want to know the distinct value count of all these 6 columns as well as normal count. like this:
ColumnName DistinctCount NormalCount Id 1000 1000 Name 1000 1000 Phone 600 600 PINCode 200 1000 City 400 1000 Gender 2 1000
I want to build query to return how many rows are in this query:select distinct c1, c2 from t1But SQL won't accept this syntax:select count (distinct c1, c2) from t1Does someone know how to count multiple distinct columns? Thanks.--Disclaimer: This post is solely an individual opinion and does not speak onbehalf of any organization.
Above has 6 files entries for client id 22784 and LOAN_SANCTION_DATE 2014-02-03 out of which 3 are rejected ..
Now , i want to write a query to select those distinct client_id , LOAN_SANCTION_DATE from Client_Master where all files has been rejected ..
means by grouping client ID and LOAN_SANCTION_DATE all the files are rejected ..
I have wrote as below .. got the result but not satisfy with the query
SELECT DISTINCT CLIENT_ID,LOAN_SANCTION_DATE,COUNT(FILE_ID) AS No_Of_Files ,COUNT(DISTINCT CASE WHEN IS_REJECT=1 THEN FILE_ID END )AS No_Of_Rejected FROM dbo.FILE_MASTER GROUP BY CLIENT_ID ,LOAN_SANCTION_DATE HAVING COUNT(FILE_ID)=COUNT(DISTINCT CASE WHEN IS_REJECT=1 THEN FILE_ID END )
I have customers named Alex (Cid=1), Bob (Cid=2), and Carrie (Cid=3) in a table customer.
Cid First_Name 1 Alex 2 Bob 3 Carrie
I have products name Gin (Pid=1), Scotch (Pid=2) and Vodka (Pid=3) in a table products.
Pid Product_Name 1 Gin 2 Scotch 3 Vodka
And I have a table that holds purchase called Customer_Purchases that contain the following records:
Cid Pid 1 1 1 2 2 1 2 3 3 2
I would like to make a marketing list for all customers that purchased Gin or Scotch but exclude customers that purchased Vodka. The result I am looking for would return only 2 records: Cid’s 1 (Alex) and 3 (Carrie) but not 2 (because Bob bought Vodka).
I know how to make a SELECT DISTINCT statement but as soon as I include Pid=2 This clearly doesn’t work :
SELECT DISTINCT Pid, Cid FROM Customer_Purchases WHERE (Cid = 1) OR (Cid = 3) OR (Cid <> 2)
I'm exporting the following query to a datagrid, however in the result set, some values are duplicated (for various reasons... mostly old software and poor categorization)...On the records with identical values, I want to look at the account number and the DateOfService fields and search for joint distinct values and only display that...Current Example: ACCT NUM | DATE OF SERVICE |________________________________ 43490 | 10/01/2006 08:15:23 | 35999 | 10/10/2005 12:00:00 | 35999 | 10/24/2005 12:45:30 | 35999 | 10/10/2005 12:00:00 | 35999 | 10/10/2005 12:00:00 | 23489 | 10/15/2006 15:13:23 |Desired Result: ACCT NUM | DATE OF SERVICE |________________________________ 43490 | 10/01/2006 08:15:23 | 35999 | 10/10/2005 12:00:00 | 35999 | 10/24/2005 12:45:30 | 23489 | 10/15/2006 15:13:23 |Here is the query I'm working with... just can't figure out how to join or limit the results to ONLY unique matches in Acct Number AND DateOfService. "SELECT tblCH.ProcedureKey AS CPT, tblPC.Description, DATEDIFF(d, tblPat.BirthDate, " & _ " { fn NOW() }) / 365 AS Age, tblPat.LastName, tblPat.FirstName, tblPat.BirthDate," & _ " CAST(tblCH.AccountKey AS varchar) + '.' + CAST(tblCH.DependentKey AS varchar) AS Account, tblCH.DateOfService " & _ " FROM dbo.Procedure_Code___Servcode_dat tblPC INNER JOIN " & _ " dbo.Charge_History___Prohist_dat tblCH ON tblPC.ProcedureKey = tblCH.ProcedureKey RIGHT OUTER JOIN " & _ " dbo.Patient_Info___Patfile_dat tblPat ON tblCH.AccountKey = (tblPat.AccountKey AND tblCH.DependentKey) = tblPat.DependentKey "Any suggestions from y'all SQL gurus? I have to have this report ready for production by tomorrow morning and this is the last fix I need to make =Thank you =)
one company can have multiple shareholders and directors records.
i create a search query where users might search by company name, secretary name , shareholder name or directors name. My select query is like below:
Code:
SELECT dsf.dsf_id, dsf.company_name, dsf.incorporation_date, dsf.secretary_name, s.shareholders_name, d.directors_name FROM tbl_dsf dsf LEFT OUTER JOIN tbl_directors d on dsf.dsf_id = d.dsf_id LEFT OUTER JOIN tbl_shareholders s on dsf.dsf_id = s.dsf_id [WHERE CONDITION]
The result for above query would be like:
Code:
abc | 1/2/1999 | william | marry | donna abc | 1/2/1999 | william | jenna | donna abc | 1/2/1999 | william | jolly | donna abc | 1/2/1999 | william | marry | dolly abc | 1/2/1999 | william | jenna | dolly abc | 1/2/1999 | william | jolly | dolly
Is it possible to achive result as below:
Code:
abc | 1/2/1999 | william | marry,jenna,jolly | donna,dolly
My table is test and I have an ID and DateTest columns
I would like to count the weeks with more then one record.
So far I got this and return the weeks with 1 record per week. How can I count the weeks with more then one record
select sum(c) from ( select c = count( id) over (partition by id, datepart(week, DateTest)) from test where id = '1' and DateTest >= '7-7-2015' ) a where c = 1
I am trying to get count of records by month wise when they select year .It was showing the out put correctly but its showing months arer in numbers,but I want to display Jan,Feb ...
SELECT DISTINCT Standard, COUNT(Standard) AS Total,month(ReportDate) Month FROM CPTable where year(ReportDate) = '2015' GROUP BY Standard, Standard , month(ReportDate)
updating the # of Payer from below query to match with the # of rows for each payer record. See the Current and desired results below. The query is currently counting the # of rows for all payers together and updating 3 as # of payers. I need it to count # of rows for each payer like shown inDesired result below. It should be showing 1 for first payer and 2 for 2nd & 3rd based on # of times each payer is repeated..
SELECT b.FILING_IND, b.PYR_CD, b. PAYER_ID, b. PAYER_NAME,a.CLAIM_ICN, (Select Count(*) From MMITCGTD.MMIT_CLAIM a, MMITCGTD.MMIT_TPL b , MMITCGTD.MMIT_ATTACHMENT_LINK c where a.CLAIM_ICN_NU = c.CLAIM_ICN and b.TPL_TS = c.TPL_TS and a.CLAIM_TYPE_CD = 'X'
I'm working on a data analysis involving a table with a large number of records (close to 2 million). I'm using only three of the columns in the table and basically am grouping results based on different criteria. The three columns are PWSID, Installation and AccountType. I have to Provide the PWSID column with a count of the total number of installations per PWSID, also a count of AccountTypes per PWSID. I have the following query, but the numbers aren't adding up and I'm not sure why. I'm falling short in the total count by around 60k records.
I've been able to get this select query to work, but I'm not sure how to modify it to turn it into a DELETE query:
USE QSCTestENG select p.[testid], COUNT(c.[testid]) FROM [dbo].[tblTestHeader] p left outer join [dbo].[tblTestMeasurements] c ON p.[testid]=c.[testid] where p.[model] = 'XPPowerCLC125US12' group by p.[testid] having COUNT(c.[testid]) <>48;
I have a new SQL 2005 (SP2) Reporting Services server to which I've just upgraded and deployed some SSRS 2000 reports.
I have a subreport that contains a matrix with two groups. The report data seems to be inexplicably repeating the data for the first row in the group for all rows in the group. Example:
ID1 ID2 DisplayData
1 1 A
1 2 B
1 3 C
2 1 A
2 2 B
2 3 C
Parent group is on ID1, child group is on ID2, report would show:
1 1 A
2 A
3 A
2 1 A
2 A
3 A
Is this a matrix bug in 2005 SP2, or do I need to do something differently? I can no longer pull a comparison version from an SSRS 2000 server to verify, but I believe it was working as expected before...
1 2015 ba1 137 HL EL Eco 2 2015 ba1 138 EL SL HS 3 2015 ba1 139 SL EL His
From this table i use to admit a student and select their choice of group simultaneously all the subjects associated with GROUP is save on another table.
Here is the TABLE 2 Structure and sample data:
table 2 (NAME - tblstudetail)
id studentID session course sub1 sub2 sub3
1 15120001 2015 ba1 EL SL HS 2 15120002 2015 ba1 HL EL Eco 3 15120003 2015 ba1 SL EL His 4 15120004 2015 ba1 HL EL Eco
AND so no..........................
Now i just want to COUNT the Number of Groups Filled in tblStudateil.
Is there a way to insert multiple records into a database table when you're just given "count" of the number of rows you want? I want to do this in ONE insert statment, so I don't want a solution that loops round doing 100 inserts - that would be too inefficient.
For example, suppose I want to create 100 card records starting it card number '1234000012340000'. Something like this ...
declare @card_start dec(16) set @card_start = '1234000012340000' declare @card_count int set @card_count = 100
If I just use a simple select statement, I find that I have 8286 records within a specified date range.
If I use the select statement to pull records that were created from 5pm and later and then add it to another select statement with records created before 5pm, I get a different count: 7521 + 756 = 8277
Is there something I am doing incorrectly in the following sql?
DECLARE @startdate date = '03-06-2015' DECLARE @enddate date = '10-31-2015' DECLARE @afterTime time = '17:00' SELECT General_Count = (SELECT COUNT(*) as General FROM Unidata.CrumsTicket ct
The query repeats the Header row value for all children associated with the header.I need the output of the query in XML format such that..For every Header element in the XML, all its children should come under that header element//I am using -
SELECT Cols FROM Table Names FOR XML PATH ('Header'), root('root') , ELEMENTS XSINIL
This still repeats the header for each detail (in the XML) , but I need all children for a header under it.I basically want my output in this format -
I have a SQL Query issue you can find in SQL Fiddle
SQL FIDDLE for Demo
My query was like this
For Insert Insert into Employee values('aa', 'T', 'qqq') Insert into Employee values('aa' , 'F' , 'qqq') Insert into Employee values('bb', 'F' , 'eee') Insert into Employee values('cc' , 'T' , 'rrr') Insert into Employee values('cc' , 'pp' , 'aaa') Insert into Employee values('cc' , 'Zz' , 'bab') Insert into Employee values('cc' , 'ZZ' , 'bac') For select select col1,MAX(col2) as Col2,Max(Col3) as Col3 from Employee group by Col1
I have a problem where I have 2 compare 2 records from the same table. This part looks easy but the problem is for a User there can be multiple records and I have 2 compare each record with its previous instance based on the timestamp. Not only I have to compare I have to perform some analysis. Below is the Table script and sample output.
Givens: All SQL Server 2008 or 2012 tools at your disposal.
Production database contains the following tables (simplified for example: constraints ignored, etc.) associated with a racing video game’s server.
-- A player of our game
-- Table greater than 10 million rows
CREATE TABLE [dbo].[User] ( [UserId] [bigint] NOT NULL ,[country] [int] NULL -- User’s home country ,[name] [nvarchar](15) NULL -- User’s displayable name (‘John’, ‘Bill’) ,[subscriptionTier] [int] NULL ) -- 0 == free, 1 == paid, for instance
Assume that rows get written into the event tables at a rate of 1,000 a minute,are never updated once written and currently are only read on a replica/reporting server.
Question Background: Write up a single query that would return the following: List of users and whose “TotalMoneyEarned” value ever grew (between logon events) at a rate of more than 1,000 per minute (we’d consider these suspicious and flag them for later investigation).
For instance, if the sample data were:
-- example of [Events.UserLogon] data -- not the query output we want
Event 1 is okay because there’s nothing to compare it against
Event 2 is okay because the TotalMoneyEarned only grew 500 in a minute
Event 3 should be flagged, as the value grew 1500 in a minute
Event 4 is okay, as it grew 7,000 in 8 minutes (< 1000 per minute)
Query Output (your query should return data in a format like this):
User Flagged Logon Time Rate Since Last Logon (money/minute) John 2010-10-16 00:21:56 1500 Dave 2010-10-16 00:30:50 3200 Bill 2010-10-16 00:35:23 1000
It is likely that you will need to create sample data for both the User and [Events.Logon] tables. We are looking for a single query that returns data like what is represented in Query Output.
INSERT INTO #LatLong SELECT DISTINCT Latitude, Longitude FROM RGCcache
When I run it I get the following error: "Violation of PRIMARY KEY constraint 'PK__#LatLong__________7CE3D9D4'. Cannot insert duplicate key in object 'dbo.#LatLong'."
Im not sure how this is failing as when I try creating another table with 2 decimal columns and repeated values, select distinct only returns distinct pairs of values.
The failure may be related to the fact that RGCcache has about 10 million rows, but I can't see why.
I have a view that takes 14 mins to return 3000 rows. At some points in the execution plan, there are over 4 million rows. The query for the View has a Distinct, a couple columns are populated by functions, there is a cross join, there is a join on a table with no keys or index, joins on non-key varchar columns and so forth. It looks a bit like this:
SELECT DISTINCT TOP 100 PERCENT N.Id, N.CountyName, N.OperationName, N_LO.Name, N_LO.LastName,N_LO.Address1, N_TO.Name, N_TO.LastName, dbo.fn_Sections(N.Shape, s.Points) AS Sections
[code]...
The DISTINCT is in there to get rid of the extra Bob rows, because Bob has many sizes. I need to keep the multiple sizes on Bob's row, but I don't want multiple rows of Bob. Is there a more efficient way to do that other than using DISTINCT?
Second, word on the street is that functions are best avoided if you are going for speed. I don't see a way to use "FOR XML PATH" outside the function, because I can't get it to work on just one column. So I am looking for a way to produce the multiple values on one row.
Hello, in a report in Reporting Services 2005 I want to create a top count 100 for groups. I have a report sheet with a table containing some groups. Under the groupings there are other groupings and so on. Now I want that only the top 100, ordered by a value on group level of the 1st group are contained in the report.
I am in the process of creating a Report, and in this, i need ONLY the row groups (Parents and Child).I have a Parent group field called "Dept", and its corresponding field is MacID.I cannot create a child group or Column group (because that's not what i want).I am then inserting rows below MacID, and then i toggle the other rows to MacID and MacID to Dept.