How To Find The Median?
Jan 28, 2006I've noticed that SQL Server (and other DBMSs I've looked at) doesn't seem to have a built-in function for finding the median of a range of numbers.
Gack!
I've noticed that SQL Server (and other DBMSs I've looked at) doesn't seem to have a built-in function for finding the median of a range of numbers.
Gack!
I recently had to use my own little median technique again on a report here at work, and had posted it before, but wasn't sure if anyone had seen it. I have read Celko's and others techniques for generating a median and haven't seen one more efficent.
Does anyone have a better way they can think of? I think this bad boy is pretty short & efficient.
First, if you want to return the middle number or the higher one next to the middle if there is an even number:
SELECT x.Value AS median
FROM Vals x
CROSS JOIN Vals y
GROUP BY x.Value
HAVING SUM(SIGN(x.Value-y.Value)) IN (1,0)
Change the " IN (1,0)" to "IN (-1,0)" to get the lower value if there is an even # of values.
Basically, we are saying compare each number to all possible numbers, and add up values of 1,0 or -1 depending if the first number is less, equal or higher than the second. The number that returns 0 is right in the middle ... If there is no middle, a -1 or 1 is returned. There will never be a 0 and (-1 or 1) at the same time returned.
To get the financial median (avg of the 2 values middle values if there is an even number), you need to encapsulate the results of the above into a subquery, allow for not just (-1,0) but all three (-1,0,1) and then take the AVG of the values returned.
That is,
SELECT Avg(Median) as Median FROM
(
SELECT x.Value AS median
FROM Vals x
CROSS JOIN Vals y
GROUP BY x.Value
HAVING SUM(SIGN(x.Value-y.Value)) IN (1,0,-1)
) A
If there is an even number of values, the lower and higher middle ones are averaged. If there is an odd number, only the middle value is returned and averaged (which of course has no effect).
Most other techniques used several COUNT(*) subqueries which this one avoids.
Critique and enjoy!
- Jeff
I have a table that contains the following:
customer (ID of customers)
product (Product description numeric value)
UOM (unit of measure like each or pak)
avgprice (Avg proce that this product and uom was sold)
I need to find the median value for a product, uom. Then I need create a table that shows product,uom,avgprice,median grouped by product and uom
I have been at this for two days with no luck
Thanks in advance for any help
Hi All,
I have a table that of server names and their execution times that run in to hundreds of thousands of records. What i need is some SQL that gives me the median execution times for each of these different servers. At the moment i have some SQL that only gives me the median for all the records in the table not the median execution time for every different server name. For example my tables looks something like this;
ServerName | ExecTime
-----------------------
server1 | 0.07
server2 | 0.17
server1 | 0.27
server1 | 0.37
server2 | 0.47
server1 | 0.57
server1 | 0.67
server2 | 0.77
My SQL below gives me
ServerName | ExecTime
-----------------------
server1 | 0.37
Where as i want
ServerName | ExecTime
-----------------------
server1 | 0.37
server2 | 0.47
Here is my SQL, hope someone can modify it and thanks in advance.
Code:
SELECT DISTINCT instance, exec_time AS median
FROM (SELECT instance, exec_time
FROM (SELECT TOP 1 exec_time = exec_time * 1.0, instance
FROM (SELECT TOP 50 PERCENT exec_time, instance
FROM llserverlogs
ORDER BY exec_time) sub_a
ORDER BY 1 DESC) sub_1
UNION ALL
SELECT instance, exec_time
FROM (SELECT TOP 1 exec_time = exec_time * 1.0, instance
FROM (SELECT TOP 50 PERCENT exec_time, instance
FROM llserverlogs
ORDER BY exec_time DESC) sub_b
ORDER BY 1) sub_2)
I have a long term need for a median function, I was wondering if anyone has or knows of some code or user defined functions somewhere that would do this. Ideally you could use it just liket the rest of the aggregate functions like AVG, etc.
View 1 Replies View RelatedHi,
I have looked all over the web to try to find some very basic / simple explanations of how to get a median value from a group of records in a table but with no luck
the problem i am having is that all the information i find is always centered around getting a median using every single row in the table. except i have groups of data in the table and want to work out a median for each group. the group is identified by 4 different columns (the 5th column is what i want to get the median on but for each group not the entire table) and i want to produce a resulting table that has 1 row for each group and therefore contains the median value for the group instead of the individual numbers that it currently has. e.g. the current table is like this
column1 column2 column3 column 4 column5(median of this)
value 1 value 2 value 3 value 4 1.2
value 1 value 2 value 3 value 4 1.0
value 1 value 2 value 3 value 4 1.5
value 2 value 3 value 4 value 5 0.2
value 2 value 3 value 4 value 5 0.4
etc...
and i need a query to get the results to show like this
column1 column2 column3 column 4 column5
value 1 value 2 value 3 value 4 1.0
value 2 value 3 value 4 value 5 0.3
etc...
This is driving me crazy and i will be very helpful if anyone can help
the statement i need to add it to is:
select pat_demid, pat_lastname, meas_gendate, meas_id, test_gendate, avg(srtot)as meansrtot, avg(sreff)as meansreff, avg(BFRaw)as BF_Rawmean, avg(BFTGV)as BF_TGVmean, count(srtot)as countsrtot, count(sreff)as countsreff
from bodyparametersf
where (srtot is not null) OR (sreff is not null)
group by pat_demid, pat_lastname, meas_gendate, meas_id, test_gendate
thanks very much
I need to calculate Median on each calculated result from the query below. There is one Median function available in SQL2K but it is not working. Can anyone help me in this regard.
------------------------------------------------------------------------
SELECT INS.Code As [code],INS.FinYr as [YEAR], ' ' As [FIRE BUSINESS], (INSRev.FI_NetPremLessIns / INSRev.FI_GrPremium) * 100 As [Rention Ratio],
(INSRev.FI_NetClaimPaid/INSRev.FI_AdjNetPremium)*100 As [Claim Ratio], ((INSRev.FI_AgencyCommPaid+INSRev.FI_ReInsCommPaid+INSRev.FI_MgmtExpenses+INSRev.FI_OthExpenses)/INSRev.FI_AdjNetPremium)*100 As [Expense Ratio],
((INSRev.FI_NetClaimPaid/INSRev.FI_AdjNetPremium)*100)+(((INSRev.FI_AgencyCommPaid+INSRev.FI_ReInsCommPaid+INSRev.FI_MgmtExpe nses+INSRev.FI_OthExpenses)/INSRev.FI_AdjNetPremium)*100) As [Combine Ratio],
(INSRev.FI_ClosingBal/INSRev.FI_NetClaimPaid) As [Unexpired Risk Reserve to Net Claim(x)],(INSRev.FI_MgmtExpenses/INSRev.FI_AdjNetPremium)*100 As [Management Expenses to Adj. Net Premium],
(INSRev.FI_AgencyCommPaid/INSRev.FI_AdjNetPremium)*100 As [Agency Commissioned to Adj. Net Premium]
FROM
(InsuranceGen As INS LEFT JOIN INSURANCEGen As INS1 ON (INS1.FinYr=INS.FinYr-1 AND INS.Code=INS1.Code))
LEFT JOIN INSRevGen as INSRev ON (INS.Code=INSRev.Code AND INS.FinYr=INSRev.FinYr) WHERE INS.Code IN ('ABC1','ABC2','ABC3') AND INS.FinYr=2005 ORDER BY INS.Code, INS.FinYr
----------------------------------------------------------------------
Thanks
Im trying to find a funtion for median's if there is one. can anyone help?
View 3 Replies View RelatedUsing SQL Server 2005, I have the following query to calculate the median sales of each quarter over the past 5 years:
WITH CompMedian AS
(
SELECT SoldDate, SoldPrice, ROW_NUMBER() OVER(PARTITION BY Convert(Varchar(5),Year(SoldDate)) + Convert(Varchar(5), RIGHT(CAST(100+DATEPART(QQ,SoldDate) AS CHAR(3)), 2)) ORDER BY SoldPrice) AS RowNum, COUNT(*) OVER(PARTITION BY Convert(Varchar(5),Year(SoldDate)) + Convert(Varchar(5), RIGHT(CAST(100+DATEPART(QQ,SoldDate) AS CHAR(3)), 2))) AS Cnt FROM tbl_Orders
WHERE Status = 'Sold'
AND SoldDate >= DATEADD(Year, -5, Convert(DateTime, Convert(Varchar(5),Month(GetDate())) + Convert(Varchar(5), '/1/') + Convert(Varchar(5), YEAR(GetDate()))))
AND SoldDate < DATEADD(Year, 0, Convert(DateTime, Convert(Varchar(5),Month(GetDate())) + Convert(Varchar(5), '/1/') + Convert(Varchar(5), YEAR(GetDate()))))
)
SELECT Convert(Varchar(5),Year(SoldDate)) + Convert(Varchar(5), RIGHT(CAST(100+DATEPART(QQ,SoldDate) AS CHAR(3)), 2)) AS CompDate, AVG(SoldPrice) AS CompMedian
FROM CompMedian
WHERE RowNum IN((Cnt + 1) / 2, (Cnt + 2) / 2)
GROUP BY Convert(Varchar(5),Year(SoldDate)) + Convert(Varchar(5), RIGHT(CAST(100+DATEPART(QQ,SoldDate) AS CHAR(3)), 2))
ORDER BY CompDate;
Now my client would like me to change the query so that each quarter would represent the median for the past 12 months ending with that quarter. I've been looking at this for hours and I'm at a loss. Anyone have any thoughts?
Thanks in advance,
Russ
I am converting a report created using Crystal Reports 10 to Reporting Services. The report contains a list of items with dollar values. The original report displays both the Average and Median value. I can easily ( using avg(Field1.Value!) ) determine the average but cannot find a function to determine the median.
How can I add the Median to the report?
Thanks in advance,
Greg
I have a table (cars) with 3 fields:VIN, Class, sell_price101, sports, 10000102, sports, 11000103, luxury, 9000104, sports, 11000105, sports, 11000106, luxury, 5000107, sports, 11000108, sports, 11000109, luxury, 9000i need to write a query that WITHOUT USING A FUNCTION will return themedian selling price for each class of car. result should look like:Class, Med_Priceluxury, 9000sports, 11000thanks to all u SQLers
View 4 Replies View RelatedI read the follow query about calculating median posted by Daivd Portaon 10/8/03.CREATE TABLE SomeValues (keyx CHAR(1) PRIMARY KEY, valuex INTEGER NOTNULL)INSERT INTO SomeValues VALUES ('A',1)INSERT INTO SomeValues VALUES ('B',2)INSERT INTO SomeValues VALUES ('C',3)INSERT INTO SomeValues VALUES ('D',4)INSERT INTO SomeValues VALUES ('E',5)SELECT S1.valuex AS medianFROM SomeValues AS S1, SomeValues AS S2GROUP BY S1.valuexHAVING SUM(CASE WHEN S2.valuex <= S1.valuexTHEN 1 ELSE 0 END)[color=blue]>= ((COUNT(*) + 1) / 2)[/color]AND SUM(CASE WHEN S2.valuex >= S1.valuexTHEN 1 ELSE 0 END)[color=blue]>= (COUNT(*)/2 + 1)[/color]I have difficulty to understand the having clause. If S1 and S2 arethe same table, what it means by S2.valuex >= S1.valuex? Could somegive me a help?Also, if I have a table structured as:classID field1 field2 field3c1 1 2 3c1 4 5 6c1 7 8 9c2 9 8 7c2 6 5 4c2 3 2 1Is there a way to create a user-defined function that can get themedian for each field as easy as the average function. Such asselect distinct classID,median(field1),median(field2),median(field3)from [tablename]group by classIDThanks in advance
View 2 Replies View Related
I used the Median() function in a calculated measure in a cube. All seemed well until one of my users, a statistician, pointed out that the displayed values were incorrect. I investigated this and finally built the simplest cube with three records and the displayed median does not make sense, i.e. equal what a median value should be.
The 3 records have loan amounts as a measure with these values: $102,500, $168,400, and $172,181 and loan number keys of 1, 2, and 3. That's it.
The median should be $168,400 since the number of items is odd and that is the middle value.
But the cube calculated measure displays $170,290.50. It appears to be taking the average of the middle row and the next value.
When I increase the number of records I get the same odd behavior but it not only alternates (of course) with odd and even (since they use different formulats for odd and even) but the odd wrong results alternate within themselves. The apparent calculations for 5 or 7 records are different compared to 3, 9, 11, 13, 15, 17 records. The first set seems to be calculating odd number of items median by averaging the middle value + the value BEFORE the middle value while the second set of odd rows (3, 9, etc) seem to be calculating median by averaging the middle value + the value AFTER the middel value. The even numbers result in the larger of the two middle values being selected instead of the average (financial median) or the lower number (as one poster claimed for statistical median).
My calculated measure MDX is very simple:
MEDIAN( [Test Fact].[Loan Number].Members, [Measures].[Loan Amount])
Ideas anyone? Am I missing something? Is this a bug?
Thanks!
- Grant
Hello!
I have been trying to find a T-SQL function that would calculates a Median statistical value for me. I am runnnig on SS 2005? Any examples of using this function would be greatly appreciated.
Thanks for any responses!
I have a fact table fct_line_details having two columns mtid, productivity
mtid productivity
1 400
1 200
1 600
2 700
3 900
I want to calculate the median for each mtid in SSAS . (median for mtid 1=400 )
I'm using custom code in an expression to calculate the median of a column. It works fine up to a point. Like if there are 35 rows in the result set (or up to some number) but when it gets bigger results, like 42 rows or more it doesn't work, the median is -1. This is the custom code I'm using (found online).
I don't see anything limiting the count, but it comes back with a -1 median so I think that means the count is not > 0.
I have a column in the report called arrival_to_complete which is like: 53 min
I create a column with expression: =MAX(Code.AddValue(Val(Fields!Arrival_to_complete.Value))) to fill the array using just the number part of the column. Then in the report I have 'Median =' <expression>, where the expression is: =Code.GetMedian() I run the report with 2 parameters a begin date and an end date. I don't see where any of this should be limiting the median calculation so I don't get why it works sometimes and not other times.
Dim values As New System.Collections.ArrayList
Function AddValue(newValue As Decimal)
If values is Nothing Then
values = New System.Collections.ArrayList
End If
values.Add(newValue)
[code]....
I am facing some problem in calculating Median
I am trying to calculate the median value using one of the measures and a dimension value.
Time is a measure in my cube and OpId is one of the dimensions.The result is as follows:
opid time median
1 55
2 23
3 23
Total 23
The Time here for Op Id 1 is the aggregation for all the rows whose OpId is 1.I want the median of the values whose OpId is 1 which is not showing at the moment.
What I am getting here is the median for all of the OpId but what I really want is the median for each of the individual Opid's as well.
I am using a calculated field Median with the following expression.
MEDIAN
( [Dim Operation].[Dim Operation].currentmember.children ,[Measures].[Elapsed Time])
Thanks
I have a table of sample data
Samples(sample_no, sample_date..)
I have no idea how to do the following in sql server or if its even possible:
1. Calculate the difference in days between all samples.
2. Select the median result
Any trick to get this done would be really helpful
thanks,
DB
I have a table of sample data
Samples(sample_no, sample_date..)
I have no idea how to do the following in sql server or if its even possible:
1. Calculate the difference in days between all samples.
2. Select the median result
Any trick to get this done would be really helpful
thanks,
DB
I need to calculate a median on a column in a table. The code I have is:
Code:
Select gender,
CASE
when gender = 'F' then 'Female'
when gender = 'M' then 'Male'
else 'Unknown'
end as test,
datediff(day, [admit_date], getdate()) as 'datediffcal',
from [tbl_record]
How do I calculate the median on the datediffcal column?
It doesn't matter if the resultset only shows the median result. So if the output shows:
median
15
that's fine. Minimally, I need the median value.
Folks,
I have a an sql table having 10,000 Rows.
The table has 5 columns. They are S1,S2,S3,S4 and M
I need the best way to calculate the MEDIAN for each row and store it in M. i.e. M=median(S1,S2,S3,S4)
I've come across articles where they've calculated medians on table rows, not table columns. But my requirement involves table columns. I guess transposing the columns into rows and then calculating median should be possible, but if I do that for 10000 rows using a cursor, then it would take a loooooong time.
Please suggest the best way to counter this
Thanx
Kiran
I have a database with 1million+ records in and i'm trying to collect the median values of column(2) for all distinct values in column (1)
Example DB:
Column 1 Column 2
978555 500
978555 502
978555 480
978555 490
978324 1111
978324 1102
978311 122
978311 120
978994 804
978320 359
and I need it to display on SELECT as
column 1 column 2
978555 495
978324 1106
978311 121
978994 804
978320 359
Is this possible on 2008 R2?
I need to get the median for the 10 columns in my table. For the sake of example, I've reduced it to 2 columns.
The code below works perfectly if i compute for only 1 column and certainly doesn't for multiple columns.
Is there a way to better handle the median computation in one pass, if multiple columns are involved?
DECLARE @tbl as table (id_n int, col1 int, col2 int)
insert into @tbl
values (1, 1, 2), (1,3, 4), (1, 5, 7), (2, 4, 7), (2, 7, 7), (2, 3, 5), (2,5, 5), (3, 1, 2), (3, 3, 5), (3, NULL,11)
select *
from @tbl
order by id_n, col1
[Code] .....
I wrote a procedure to calculate median:
============================
ALTER proc [dbo].[sp_CalculateMedianTimeInDepartmentMinutes]
@StartDate date
,@EndDate date
as
--== Check if count is even or odd
declare @modulo int
select @modulo = (Select COUNT(*)%2 from ED_data where AdmitDateTime between @StartDate and @EndDate )
--=== Get Median
[Code] ....
My fellow developer is using this code to calcuate a madians in many columns (see below). The problem is that it takes about 2 minutes to execute this code. Is there a way to reduce the time of execution?
I attach also a sample of the view
==============
ALTER PROCEDURE [dbo].[sp_ED_Measures]
@StartDate date,
@EndDate date,
@Hospital varchar(5)
AS
BEGIN
SET NOCOUNT ON;
[Code] ......
calculating a rolling median over a period of 3 years.
I already calculate median and I've tried to calculate rolling median over a period of 3 years as below.
MEDIAN([Date].[Year].CurrentMember.Lag(3):[Date].[Year].CurrentMember,[Measures].[median])
What this does is, it calculates the median of the medians over the period of 3 years. But, what I'm looking for is the overall median of the underlying measure over a period of 3 years.
What I have now:
Year1 - 41,52,73; Median1 - 52
Year2 - 6,9,12; Median2- 9
Year3 - 24,68,89; Median3 - 68
Overall Median of 9,52,68 - 52
What I need:
Year1 - 41,52,73; Median1 - 52
Year2 - 6,9,12; Median2- 9
Year3 - 24,68,89; Median3 - 68
Overall Median of 41,52,73,6,9,12,24,68,89 is 41
I tried all the INFORMATION_SCHEMA on SQL 2000 andI see that the system tables hold pretty much everything I aminterested in: Objects names (columns, functions, stored procedures, ...)stored procedure statements in syscomments table.My questions are:If you script your whole database everything you end up havingin the text sql scripts, are those also located in the system tables?That means i could simply read those system tables to get any informationI would normally look in the sql script files?Can i quickly generate a SQL statement of all the indexes on my database?I read many places that Microsoftsays not to modify anything in those tables and not query them since theirstructure might change in future SQL versions.Is it safe to use and rely the system tables?I basically want to do at least fetching of information i want from thesystem tables rather than the SQL script files.I also want to know if it's pretty safe for me to make changes in thesetables.Can i rename an object name for example an Index name, a Stored Procedurename?Can i add a new column in the syscolumns table for a user table?Thank you
View 4 Replies View RelatedI have 3 database say
db1
db2
db3
I need to setup a script to read all the table names in the database above and then query the database to find the list of Stored Procedure using each table.(SQL Server)
Hello!
Please help me to find web page where guys who passed MS exams share their experience. I remember it should be like BenyenVine or very close...
Thank you,
Elena
Everytime I search for this text, it seems to pick up stuff beginning with 5, not dash 5. Does something need to be escaped and how would this be done?
View 8 Replies View RelatedSometimes the simplest things are the most difficult... I'm creating a
UDF as below, then executing it but all I get is that the object does not exist. I must be missing something very basic here...
CREATE function dbo.GetColumnLength(@vcTableName varchar(50), @vcColumnName varchar(50)) returns smallint
as
begin
declare @intLength as smallint
select @intLength=sysC.prec from syscolumns sysC, sysobjects sysO
where sysC.Id = sysO.Id AND sysO.xtype ='U' And
sysO.Name = @vcTableName AND
sysC.Name = @vcColumnName
return @intLength
End
GO
select top 2 * from player, dbo.GetColumnLength('playerdetails','email')
Some pre-written scripts that contain things such as automobile make or other generic common data like that?
I'm trying to avoid populating my db by hand. Any insight would be appreciated.
cheers
Can someone help on this:I am just learning, and I'm connecting to the the northwindcs.mdf tables /open file is northwindcs.adp.This is the sample installed using msde, which is supposed to be mini sqlserver to learn.Please don't refer me elsewhere, here is what I'm trying to learn:If I want to hit a command button and do the following:1. Find a customerid2. if found, edit the record, if not found, add a new record.How would the below code need to look for this, I'm not even sure theconnection string is correct.I'm getting following error:run-time error 3219operation not allowed in this context.I get the y messagebox, but rst!ContactTitle = "The Owner" doesn't work.When I hit the debug, rst.close is highlighted.Also, how do you handle a no find situation here, I noticed a nomatchdoesn't work.I am real good at programming, but new to the server thing.And finally, is there a way to hit this command button, and do all from astored procedure instead of code? But in background, no user inteventiononce button is hit. Which is better, this code approach or a possiblestored procedure.Please help, if I get this down, I think I'll have the rest wipped. Theconnect string is one big thing confusing me along with handling record oncefound / not found. I'm used of DAO. If some one is willing to help, I canemail detailed real code from a database I'm really working on. I need tolearn this first to convert code.HERE IS SAMPLE CODEPrivate Sub Command16_Click()Dim cnn As New ADODB.ConnectionDim rst As New ADODB.RecordsetDim mark As VariantDim count As Integercount = 0cnn.Open "DSN=NorthwindCS; Provider=SQLOLEDB;Data Source=OEMCOMPUTER;InitialCatalog=NorthwindCS; uid=sa; pwd=;"rst.Open "SELECT * FROM Customers", cnn, _adOpenDynamic, adLockOptimistic, adCmdText'rst.Open "SELECT CustomerID FROM Customers", cnn, _' adOpenDynamic, adLockReadOnly, adCmdText' The default parameters are sufficient to search forward' through a Recordset.rst.Find "CustomerID = 'CHOPS'"If rst!CustomerID = "CHOPS" ThenMsgBox "y"rst!ContactTitle = "The Owner"ElseMsgBox "n"End If' Skip the current record to avoid finding the same row repeatedly.' The bookmark is redundant because Find searches from the current' position.'Do While rst.EOF <> True 'Continue if last find succeeded.' Debug.Print "Title ID: "; rst!CustomerIDcount = count + 1 'Count the last title found.'mark = rst.Bookmark 'Note current position.'rst.Find "CustomerID = 'CHOPS'", 1, adSearchForward, mark'Exit Do'Looprst.Closecnn.CloseDebug.Print "The number of business titles is " & countEnd Sub
View 8 Replies View RelatedTo ensure I don't leave orphans floating around in tables when records get deleted (values from one record might link to values in another) how do I find and possibly replace values in tables?For example, if I have a unit of measure table and want to delete the value "inches", how do I look in other tables to find this value and give the user the option to cancel or clear it out. If I don't it will cause controls bound to that value like the dropdownlist to throw an error.
View 1 Replies View Related