T-SQL (SS2K8) :: Historical Data Where Number Of Records Exists Between Two Dates With Different Years
Jul 10, 2015
Ok, I'm looking to get counts on historical data where the number of records exists between two dates with different years. The trick is that the dates fall in different years. Ex: Give me the number of records that are dated between Oct 1, 2013 and July 1, 2014.
A previous post of mine was similar, where I needed to get records after a specific date. The solution provided for that one was the following. This let me get any records that occurred after May 1 in each given fiscal year.
SELECT
MAX(CASE WHEN DateFY = 2010 THEN Yr_Count ELSE 0 END) AS [FY10],
MAX(CASE WHEN DateFY = 2010 THEN May_Count ELSE 0 END) AS [May+10],
MAX(CASE WHEN DateFY = 2011 THEN Yr_Count ELSE 0 END) AS [FY11],
MAX(CASE WHEN DateFY = 2011 THEN May_Count ELSE 0 END) AS [May+11],
MAX(CASE WHEN DateFY = 2012 THEN Yr_Count ELSE 0 END) AS [FY12],
[Code] ....
I basically need to have CASE WHEN OccuranceDate is between Oct 1 (of the beginning year) and July 1 (of the ending year); MONTH(OccuranceDate) on its own can't see the year boundary.
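A hedged sketch of one way to handle the cross-year window (not the original poster's schema - it assumes a table dbo.Records with an OccuranceDate column, and that a fiscal year is labeled by the calendar year in which it ends, so FY2014 runs Oct 1, 2013 through Sep 30, 2014): derive the window boundaries from the fiscal year rather than testing MONTH() alone.

WITH r AS
(
    SELECT OccuranceDate,
           YEAR(DATEADD(MONTH, 3, OccuranceDate)) AS DateFY   -- Oct-Dec roll forward into the next FY
    FROM   dbo.Records
)
SELECT DateFY,
       COUNT(*) AS Yr_Count,
       -- records dated from Oct 1 of the prior calendar year up to (but not including) Jul 1 of DateFY
       SUM(CASE WHEN OccuranceDate >= DATEADD(MONTH, (DateFY - 1901) * 12 + 9, 0)
                 AND OccuranceDate <  DATEADD(MONTH, (DateFY - 1900) * 12 + 6, 0)
                THEN 1 ELSE 0 END) AS Oct1ToJul1_Count
FROM   r
GROUP  BY DateFY;

The DATEADD(MONTH, ..., 0) arithmetic builds Oct 1 and Jul 1 without hard-coding years, so the same MAX(CASE ...) pivot from the earlier solution can sit on top of this.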
I'm looking to get counts on historical data where the number of records exists on or after May 1 in any given year. I've got the total number of records for each year worked out, but now I'm looking for the number of records that exist after a specific date. Here's what I have so far.
SELECT p.FY10,p.FY11,p.FY12,p.FY13,p.FY14,p.FY15 FROM ( SELECT COUNT(recordID) AS S, CASE DateFY
I have an address table, and a log table that only records changes to it, so we wrote an AFTER UPDATE trigger for it. In our case the trigger only needs to record historical changes into the log table, so it only needs to be an AFTER UPDATE trigger. The trigger worked fine until the day we found the same addresses appearing in the log table for the same student, so below is what I modified the trigger to. I tested it, and it seems to work OK. I would also like to know whether I need to use an IF NOT EXISTS statement, or can just use WHERE NOT EXISTS like I did in the following code:
ALTER TRIGGER [dbo].[trg_stuPropertyAddressChangeLog] ON [dbo].[stuPropertyAddress] FOR UPDATE AS DECLARE @rc AS INT ;
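For what it's worth, a minimal sketch of the change-only logging pattern the question describes, with assumed log-table and column names (StudentID, Address1, City) rather than the production schema: a separate IF NOT EXISTS test isn't needed if the WHERE NOT EXISTS is part of the INSERT ... SELECT itself.

-- Sketch only: column names are assumptions, not the real stuPropertyAddress definition
ALTER TRIGGER [dbo].[trg_stuPropertyAddressChangeLog]
ON [dbo].[stuPropertyAddress]
FOR UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO dbo.stuPropertyAddressLog (StudentID, Address1, City, ChangedOn)
    SELECT d.StudentID, d.Address1, d.City, GETDATE()
    FROM   deleted d
           JOIN inserted i ON i.StudentID = d.StudentID
    WHERE  (ISNULL(d.Address1, '') <> ISNULL(i.Address1, '')
            OR ISNULL(d.City, '') <> ISNULL(i.City, ''))          -- only rows whose address really changed
      AND  NOT EXISTS (SELECT 1                                    -- and that are not already in the log
                       FROM   dbo.stuPropertyAddressLog l
                       WHERE  l.StudentID = d.StudentID
                         AND  ISNULL(l.Address1, '') = ISNULL(d.Address1, '')
                         AND  ISNULL(l.City, '')     = ISNULL(d.City, ''));
END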
ID varchar (contains alphanumeric values, not unique)
Territory (combined with ID, unique)
Total_Used int (can be NULL)
Date_ date (date of the import of the data)

ID      Territory   Total_Used   Date_
ACASC   CAL071287                2014-06-01
ACASC   CAL071287                2014-08-01
ACASC   CAL071288                2014-09-01
[Code] .....
Now the problem: per month I need the most recent value, so I'm expecting
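If the goal is the most recent Total_Used per ID/Territory in each calendar month, a ROW_NUMBER() sketch along these lines might be a starting point (the table name dbo.Usage is an assumption taken from the column list above):

WITH ranked AS
(
    SELECT ID, Territory, Total_Used, Date_,
           ROW_NUMBER() OVER (PARTITION BY ID, Territory, YEAR(Date_), MONTH(Date_)
                              ORDER BY Date_ DESC) AS rn   -- 1 = latest import in that month
    FROM   dbo.Usage
)
SELECT ID, Territory, Total_Used, Date_
FROM   ranked
WHERE  rn = 1;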
I have a challenge and I'm not sure the best route to go. Consider the following dataset.
I have a table of sales. The table has fields for customer number and date of sale. There are 1 - n records for a customer. What I want is a record per customer that has the customer number and the average number of months between purchases. For example, Customer 12345 has made 5 purchases.
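One hedged way to sketch this without joining each sale to the next: for n purchases there are n - 1 gaps, so the average gap is the span from the first to the last sale divided by n - 1. Table and column names (dbo.Sales, CustomerNumber, SaleDate) are assumptions.

SELECT CustomerNumber,
       CASE WHEN COUNT(*) > 1
            THEN DATEDIFF(MONTH, MIN(SaleDate), MAX(SaleDate)) * 1.0 / (COUNT(*) - 1)
            ELSE NULL                                   -- a single purchase has no gap
       END AS AvgMonthsBetweenPurchases
FROM   dbo.Sales
GROUP  BY CustomerNumber;

Note this measures gaps in whole calendar months; use DATEDIFF(DAY, ...) and divide by an average month length if a finer-grained figure is wanted.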
I am using SQL Server 2008 as a back end for a Microsoft Access front end. I have created a report that is essentially a Bill Of Lading. The detail section lists all the purchase orders that are being shipped on a single load. The problem with the Access report is that I always need a set number of records (8) so that the layout is consistent. So, if the query returns 5 records, I need an additional 3 blank records returned with the recordset. If there are 2 records, I need an additional 6, and so on. For simplicity's sake, the query is:
SELECT tblBOL.PONumber FROM tblBOL WHERE tblBOL.BOLNumber=@BOLNumber;
Now, I can get the results I want by using a union query for the "extra" records.
For instance, if there are 6 records returned for BOLNumber '12345', I can get the expected results by this query:
SELECT tblBOL.PONumber FROM tblBOL WHERE tblBOL.BOLNumber='12345' UNION ALL SELECT '12345',Null UNION ALL SELECT '12345',Null;
Another solution would be to create a temporary table with the "extra" records and then have only one Union statement. Not sure which is better, but I'm not really sure how to programmatically do either of these. I'm guessing I need to do it in a stored procedure. How do I programmatically create these extra records? One other note.... If there are more than 8 records, I need to return 8 of these "blank" records and none of the real records (hard to explain the reason behind this, but it has to do with the report being only a summary when there are more than 8 records while the actual records will go on a different supplemental report).
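A hedged sketch of one stored-procedure approach that avoids a temp table: build the eight "blank" rows from a row constructor, suppress the real rows when there are more than eight, and take only as many blanks as needed otherwise. It assumes the two-column shape (BOLNumber, PONumber) used in the UNION example above.

-- Inside a procedure with @BOLNumber as the parameter
DECLARE @cnt int;
SELECT @cnt = COUNT(*) FROM tblBOL WHERE BOLNumber = @BOLNumber;

SELECT BOLNumber, PONumber
FROM   tblBOL
WHERE  BOLNumber = @BOLNumber
  AND  @cnt <= 8                                            -- drop the real rows when there are more than 8
UNION ALL
SELECT @BOLNumber, NULL
FROM   (VALUES (1),(2),(3),(4),(5),(6),(7),(8)) AS n(i)     -- up to 8 blank rows
WHERE  i <= CASE WHEN @cnt > 8 THEN 8 ELSE 8 - @cnt END;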
I've set up a SQL7 database with MSAccess97 as a front end. When I try to enter a person with a birthdate prior to 1900, I get an ODBC call error, "Datetime field overflow". How do I enter dates prior to the year 1900? Thanks.
I've been looking around for some kind of known issue or something, but can't find anything. Here's what I'm experiencing:
I have a table with about 50,000 rows. I open several connections and execute a command with ExecuteResultSet on each, with CommandType.TableDirect, CommandText set to the name of the table, and IndexName set to various indexes. In the end, I have several SqlCeResultSet instances which are then maintained for the life of the AppDomain.
In a loop, I call SqlCeResultSet.Read() on one of the instances, and if it returns false, I call SqlCeResultSet.ReadFirst() - essentially creating a circular pass through the result set.
In a Visual Studio debug session, this approach goes swimmingly for a short time, and then after a successful Read(), I'm pegged with an InvalidOperationException (text: "No data exists for the row/column") for a column which was successfully read on the previous Read(). If, in the immediate window, I call SqlCeResultSet.Read() again on the result set instance, the Get___ methods work as they had been in the previous reads.
It seems like the internal state of the ResultSet is getting corrupted somehow, but it is opaque to me. Any insights on why this suddenly throws this exception?
What I am trying to do: Obtain attendance percentages for schools for the last five days. The outcome would look like this:
DISTRICTGROUPING, SCHOOLNAME, 5 DAYS AGO PCTG, 4 DAYS AGO PCTG, 3 DAYS AGO PCTG, 2 DAYS AGO PCTG, 1 DAY AGO PCTG
I am using nested subqueries for each day, each computing (total enrollment - total absent) / total enrollment, along these lines:
,( ((SELECT COUNT(*) --GET TOTAL ENROLLMENT COUNT FOR SPECIFIED DATE
[Code]....
The query works, but with the following issues:
1. Avoid the "division by zero" error. This can occur if a school is closed for a day or if a smaller school has no absences for a day (a sketch addressing this and point 2 follows this list).
2. Avoid weekend dates. I need the query to display only weekdays.
3. Currently I am using "PERCENTAGE 5" as a column header, whereas I need the actual date as the header.
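On points 1 and 2, a hedged sketch (table and column names here are placeholders, not the real attendance schema): NULLIF turns a zero denominator into NULL so the division returns NULL instead of erroring, and a weekday test keeps Saturdays and Sundays out. Point 3 generally needs dynamic SQL, building the column aliases from the actual dates before executing the statement.

SELECT SchoolName,
       (Enrolled - Absent) * 100.0 / NULLIF(Enrolled, 0) AS AttendancePct  -- NULL rather than divide-by-zero
FROM   dbo.DailyAttendance
WHERE  AttendanceDate >= DATEADD(DAY, -7, GETDATE())
  AND  DATENAME(WEEKDAY, AttendanceDate) NOT IN ('Saturday', 'Sunday');    -- weekdays only (language-dependent;
                                                                           -- a DATEPART/@@DATEFIRST test avoids that)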
ID      Count of Consecutive Years
------- ---------------------------
1       2
4       2
9       0
I know this is a gaps and islands type problem but nothing I have been able to find is working once I attempt modification so that it can fit my dataset. Please note that I am going to use the data return to populate another table that is currently being populated using a cursor that utilizes an insert statement based on different codes.
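In case a concrete starting point helps, a hedged gaps-and-islands sketch assuming a table dbo.YearData(ID int, Yr int) with one row per ID per year: the Yr - DENSE_RANK() value is constant within each unbroken run of years, so grouping on it yields the run lengths.

WITH grp AS
(
    SELECT ID, Yr,
           Yr - DENSE_RANK() OVER (PARTITION BY ID ORDER BY Yr) AS island  -- constant within a consecutive run
    FROM   dbo.YearData
),
runs AS
(
    SELECT ID, island, COUNT(*) AS RunLen
    FROM   grp
    GROUP  BY ID, island
)
SELECT ID,
       MAX(RunLen) AS ConsecutiveYears            -- longest unbroken run per ID
FROM   runs
GROUP  BY ID;

If an ID with no consecutive years should show 0 rather than 1 (as the expected output above suggests), wrap the final expression in CASE WHEN MAX(RunLen) = 1 THEN 0 ELSE MAX(RunLen) END. The result set can then feed the INSERT that currently sits inside the cursor.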
Today I have got a scenario where I need to calculate the sum of the day differences, minus (-) the days where the same date appears in both assgn_dtm and complet_dtm.
/* Here goes the table schema and sample data */
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[temp_tbl]') AND type in (N'U'))
DROP TABLE [dbo].[temp_tbl]
GO
CREATE TABLE [dbo].[temp_tbl](
    [tbl_id] [bigint] NULL,
    [cs_id] [int] NOT NULL,
    [USERID] [int] NOT NULL,
Below is the scenario I currently have in my query. I need to write this query without any hard-coded values, so that it will work for any number of years without modifications.
Startdate = CASE WHEN Trandate between '06-04-2013' and '05-04-2014' then '06-04-2013'
                 WHEN Trandate between '06-04-2012' and '05-04-2013' then '06-04-2012'
                 WHEN Trandate between '06-04-2011' and '05-04-2012' then '06-04-2011'
                 WHEN Trandate between '06-04-2010' and '05-04-2011' then '06-04-2010'
                 WHEN Trandate between '06-04-2009' and '05-04-2010' then '06-04-2009'
                 WHEN Trandate between '06-04-2008' and '05-04-2009' then '06-04-2008'
            END
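Assuming the fiscal year always runs April 6 to April 5 (which is what the ranges above imply, read as dd-mm-yyyy), one sketch that removes the hard-coded years is to build April 6 of the appropriate year directly from Trandate; dbo.Transactions is a placeholder table name.

SELECT Trandate,
       CASE WHEN Trandate >= DATEADD(DAY, 5, DATEADD(MONTH, (YEAR(Trandate) - 1900) * 12 + 3, 0))
            THEN DATEADD(DAY, 5, DATEADD(MONTH, (YEAR(Trandate) - 1900) * 12 + 3, 0))  -- Apr 6 of the same year
            ELSE DATEADD(DAY, 5, DATEADD(MONTH, (YEAR(Trandate) - 1901) * 12 + 3, 0))  -- Apr 6 of the prior year
       END AS Startdate
FROM   dbo.Transactions;

The DATEADD(MONTH, ..., 0) form works on SQL Server 2008, which has no DATEFROMPARTS.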
I have a table that has hotel guests and their start stay date and end stay date. I would like to insert into a new table the original information plus all days in between.
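A hedged recursive-CTE sketch, with assumed names dbo.GuestStays(GuestID, StartDate, EndDate) and a target table dbo.GuestStayDays:

WITH days AS
(
    SELECT GuestID, StartDate AS StayDate, EndDate
    FROM   dbo.GuestStays
    UNION ALL
    SELECT GuestID, DATEADD(DAY, 1, StayDate), EndDate      -- walk forward one day at a time
    FROM   days
    WHERE  StayDate < EndDate
)
INSERT INTO dbo.GuestStayDays (GuestID, StayDate)
SELECT GuestID, StayDate
FROM   days
OPTION (MAXRECURSION 0);        -- allow stays longer than the default 100-level recursion limit

For a large table, joining to a calendar or numbers table instead of recursing will usually scale better.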
I need to list customers in a table that represents sales over the years.
I have tables:
Customers -> id | name | ...
Orders -> id | idCustomer | date | ...
Products -> id | idOrder | unitprice | quantity | ...
I am using this SQL but it only gets one year:
SELECT customers.name, SUM(unitprice*qt) AS total
FROM Products
INNER JOIN Orders ON Orders.id = Products.idOrder
INNER JOIN Customers ON Customers.id = Orders.idCustomer
WHERE year(date) = 2014
GROUP BY customers.name
ORDER BY 2 DESC
I need something like this:
customer   | total sales 2014 | total sales 2015 | total sales (2014 + 2015)
-----------|------------------|------------------|--------------------------
customer A | 1000$            | 2000$            | 3000$
customer B | 100$             | 100$             | 200$
Is it possible to retrieve these values in a single SQL query for multiple years and grand total?
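It should be - conditional aggregation gets the per-year columns and the grand total in one pass over the same joins. A sketch against the tables above, hard-coded to 2014/2015 (an open-ended list of years would need dynamic SQL to build the column list):

SELECT c.name AS customer,
       SUM(CASE WHEN YEAR(o.[date]) = 2014 THEN p.unitprice * p.quantity ELSE 0 END) AS [total sales 2014],
       SUM(CASE WHEN YEAR(o.[date]) = 2015 THEN p.unitprice * p.quantity ELSE 0 END) AS [total sales 2015],
       SUM(p.unitprice * p.quantity)                                                 AS [total sales (2014 + 2015)]
FROM   Products p
       INNER JOIN Orders    o ON o.id = p.idOrder
       INNER JOIN Customers c ON c.id = o.idCustomer
WHERE  YEAR(o.[date]) IN (2014, 2015)
GROUP  BY c.name
ORDER  BY 4 DESC;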
How to identify different fields with in a group of records?
Example: create table #test (ID int, Text varchar(10)) insert into #test select 1, 'ab' union all select 1, 'ab'
[Code] ...
I want to show an additional field: 'Matched' for ID 1, since both of its records have the same Text value, and 'Unmatched' for ID 2, since the Text values are different for the same ID.
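A windowed-aggregate sketch over the #test example: if the smallest and largest Text within an ID are equal, every row of that ID is 'Matched', otherwise 'Unmatched'.

SELECT ID, [Text],
       CASE WHEN MIN([Text]) OVER (PARTITION BY ID) = MAX([Text]) OVER (PARTITION BY ID)
            THEN 'Matched'
            ELSE 'Unmatched'
       END AS MatchStatus
FROM   #test;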
I need some advise on how to create a historical database.
What is the best way of doing this? For example, should I create a new row whenever a column in a row changes, and timestamp every record? What happens when I have child tables linked to a Header table?
I have been looking on the NET for methods of creating a historical database, but I can't find any.
A general data design question: We have data which changes every week. We had considered separating historical records and current records into two different tables with the same columns, but thought it might be simpler to have them all together in one table and just add a WeekID int column to indicate which week it represents (and perhaps an isCurrent bit column to make querying easier). We have a number of tables like this, holding weekly data, and we'll have to query for historical data often, but only back through the last year -- we have historical data going back to 1998 or so which we'll rarely if ever look at.
Is the all-in-one-table approach better, or the separation of current and historical data? Will there be a performance hit to organizing data this way? I don't think the extra columns will make querying too much more awkward, but is there anything I'm overlooking in this? Thanks.
From what I've read this is called 'slowly changing dimensions'. Basically, the system I'm working on needs to store the history of certain data so that at any time a user can look up an old project and view it exactly as it was, even though the associated parts might have had certain changes over time. From what I can tell, Type 2 (current and historical records are stored in the same table) seems to be the most popular. Type 4 (current records in one table and historical records in a separate history table) seems like it would also work, but I've been unable to find any articles comparing the two. Does anybody have any info on the advantages/disadvantages of one vs. the other?
We have a database that adds over 100,000 records per day. This goes back to 2002, and I only need historical data for 6 months. Presently we can only delete 1,000 rows at a time. Is there a faster way of deleting? We seem to continually run out of disk space. Urgent!!!!!!!!!!!
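The usual answer is to delete in a loop of larger batches so each batch commits and the log space can be reused; a sketch with assumed table and column names (dbo.BigTable, CreatedDate):

DECLARE @rows int;
SET @rows = 1;
WHILE @rows > 0
BEGIN
    DELETE TOP (50000)
    FROM   dbo.BigTable
    WHERE  CreatedDate < DATEADD(MONTH, -6, GETDATE());   -- keep only the last 6 months

    SET @rows = @@ROWCOUNT;
END

If almost everything qualifies for deletion, copying the rows to keep into a new table and truncating the old one is usually faster still, and partitioning by month makes future purges close to free.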
I have a SQL2005 db for tracking the prices of products at multiple retailers. The basic structure is, 'products' table lists individual products, 'retailer_products' table lists current prices of the products at multiple retailers, and 'price_history' table records when the price of a product changes at any retailer. The prices are checked from each retailer daily, but a row is added to the 'price_history' only when the price at the retailer changes.
I have the following query to retrieve the price history of a given product at multiple retailers:
SELECT price_history.datetimeofchange, retailer.name, price_history.price FROM product, retailer, retailer_product, price_history WHERE product.id = 'b486ed47-4de4-417d-b77b-89819bc728cd' AND retailer_product.retailerid = retailer.id AND retailer_product.associatedproductid = product.id AND price_history.retailer_productid = retailer_product.id
This gives the following results:
2008-03-08   Example Retailer   22.3
2008-03-28   Example Retailer   11.8
2008-03-30   Example Retailer   22.1
2008-04-01   Example Retailer   11.43
2008-04-03   Example Retailer   11.4
The question(s) I have are how can I:
1 - Get the price of a product at a given retailer at a given date/time. For example, get the price of the product at Retailer 2 on 03/28/2008. The table only contains data for Retailer 1 for this date; the behaviour I want is that when there is no data available for the given date, the query finds the last date on which there was data from that retailer and uses the price from that point - i.e. for this example the query should return 22.3 as the price, given that was the last recorded price change from that retailer (03/08/2008).
2 - Get the average price of a product across retailers at a given date/time. In this case we would need to perform (1) across all retailers, then average the results.
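For (1), a hedged sketch of the "last change on or before the date" lookup against the schema above; (2) is then just an AVG over the same expression wrapped in a derived table. @asof and @productid are placeholder parameters.

DECLARE @asof datetime, @productid uniqueidentifier;
SET @asof = '20080328';
SET @productid = 'b486ed47-4de4-417d-b77b-89819bc728cd';

SELECT r.name,
       (SELECT TOP (1) ph.price
        FROM   price_history ph
        WHERE  ph.retailer_productid = rp.id
          AND  ph.datetimeofchange <= @asof
        ORDER  BY ph.datetimeofchange DESC) AS price_as_of   -- most recent change on or before @asof
FROM   retailer_product rp
       INNER JOIN retailer r ON r.id = rp.retailerid
WHERE  rp.associatedproductid = @productid;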
I've got a customer who wants reproducible/historical reporting. The problem is that the underlying data changes.
I tried to explain that this can't be done (can it?), but he doesn't understand.
To illustrate the situation - Let's say a teacher wants to track spelling test scores for her students. The below are scores for students A, B, and C (for January, February, March)
A: {70,80,85} B: {70,65, 80} C: {100,90,100}
So, I can generate a historical report that charts the class average and student trend - that's pretty easy.
Now, in April, we find that the school board has mandated that the British spelling of words is OK, so now the cumulative scores (for January, February, March, April) have changed.
He wants a report showing the January average as (70+70+100)/3 = 80, when really it is (90+80+100)/3 = 90.
Now imagine that there are actually thousands of data points changing like this... Now also imagine that we add and remove students on a regular basis...
He and his office manager get frustrated when I explain that the reports are not simple - in their mind it is. They have determined the solution is to get a report writer and buy Crystal Reports... I've tried to explain that the problem is that the report specification is unclear (basically, they don't understand what they want). The situation is OK for now; I'm just trying to plan for when they figure out that buying Crystal Reports won't change their situation (except they'll be out several thousand dollars)...
I have a database which contains time series data (historical stock prices) which I have to search for patterns on a day to day basis. But searching this historical data for patterns is very time consuming not only in writing the complex t-sql scripts but also executing them.
Table structure for one-minute data: [Date], [Time], [Open], [High], [Low], [Close], [Adjusted_Close], [MA], [DI]...
Tick data: [Date], [Time], [Trade]
The most time-consuming queries are the ones with lots of inner joins. So, for example, if I have to compare the first few minutes of data then I have to do inner joins like:
WITH IntervalData AS
(
    SELECT [Date],
           SUM(CASE WHEN 1430 = [Time] THEN [PriceRange] END) AS '1430',
           SUM(CASE WHEN 1431 = [Time] THEN [PriceRange] END) AS '1431',
           SUM(CASE WHEN 1432 = [Time] THEN [PriceRange] END) AS '1432'
    FROM   [INDU_1]
    GROUP  BY [Date]
)
SELECT [Date], [1430], [1431], [1432], [1431] - [1430] AS 'Range'
FROM   IntervalData
WHERE  ([1430] > 0 AND [1431] < 0 AND [1432] < 0)
   OR  ([1430] < 0 AND [1431] > 0 AND [1432] > 0)
------------------------------------------------------------------------
SELECT ind1.[Time], ind1.PriceRange, ind2.[Time], ind2.PriceRange
FROM   INDU_1 ind1
       INNER JOIN INDU_1 ind2
               ON ind1.[Time] = ind2.[Time] - 1
              AND ind1.[Date] = ind2.[Date]
WHERE  (ind1.[Time] = 2058)
  AND  ((ind1.PriceRange > 0 AND ind2.PriceRange > 0)
     OR (ind2.PriceRange < 0 AND ind1.PriceRange < 0))
ORDER  BY ind1.[Date] DESC;
Is there any way I can use SQL 2005 data mining models to make this searching faster?
I have 2 identical tables: one contains current settings, the other contains all historical settings. I could create a union view to display the current values from table A and all historical values from table B, but that would also require a variable to hold the tblid for both select statements.
Q. Can this be done with one joined or conditional select statement?
DECLARE @tblid int = 501
SELECT 1,2,3,4,'CurrentSetting' FROM TableA ta WHERE tblid = @tblid
UNION
SELECT 1,2,3,4,'PreviousSetting' FROM Tableb tb WHERE tblid = @tblid
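One hedged option is to push the UNION into a view that tags each row with its source; the variable then appears only once in the query that reads it. Column names col1-col4 below are placeholders for the real setting columns.

CREATE VIEW dbo.vwAllSettings
AS
SELECT tblid, col1, col2, col3, col4, 'CurrentSetting'  AS SettingType FROM dbo.TableA
UNION ALL
SELECT tblid, col1, col2, col3, col4, 'PreviousSetting' AS SettingType FROM dbo.TableB;
GO

DECLARE @tblid int;
SET @tblid = 501;
SELECT * FROM dbo.vwAllSettings WHERE tblid = @tblid;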
I am working on a project, which involves displaying trends of certain aggregate values over time. For example, suppose we want to display how the number of active and inactive users changed over time.
One issue is how to store historical data. First of all, should I create a separate database for each historical snapshot or should I use one database for all snapshots? Second, our database size is a couple of gigabytes and replicating the entire database on a daily basis is not feasible. An alternative solution is to back up aggregate values, but how do I back up results of aggregate queries, where the user can specify a date range in the WHERE-clause? Another solution is to create fact tables from our relational schema and back those up.
Another issue is how to query historical data. Using multiple databases to store historical snapshots makes it harder to query.
As you can see there are several design alternatives and I would like to know how this sort of problem is generally solved in the industry. Does SQL Server provide any support for solving this problem?
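One pattern that seems common for this kind of requirement (a sketch with invented names, not a recommendation from the product documentation): append the aggregate values of interest to a small daily snapshot table on a schedule, rather than snapshotting whole databases; date-range trend queries then run against the snapshot table alone.

CREATE TABLE dbo.UserCountSnapshot
(
    SnapshotDate  date NOT NULL PRIMARY KEY,
    ActiveUsers   int  NOT NULL,
    InactiveUsers int  NOT NULL
);

-- Run once per day, e.g. from a SQL Agent job
INSERT INTO dbo.UserCountSnapshot (SnapshotDate, ActiveUsers, InactiveUsers)
SELECT CAST(GETDATE() AS date),
       SUM(CASE WHEN IsActive = 1 THEN 1 ELSE 0 END),
       SUM(CASE WHEN IsActive = 0 THEN 1 ELSE 0 END)
FROM   dbo.Users;

A trend report for an arbitrary WHERE-clause date range is then a plain SELECT against SnapshotDate, at the cost of deciding up front which aggregates to snapshot.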
My application is to capture employee locations. Whenever an employee arrives at a location (whether it is arriving for work, or at one of the company's other sites) they scan the barcode on their employee badge. This writes a record to the tblTSCollected table (DDL and dummy data below).
The application needs to be able to display to staff in a control room the CURRENT location of each employee. From the data I've provided, this would be:

EMPLOYEE ID   LOCATION CODE
963           VB002
964           VB003
966           VB003
968           VB004
977           VB001
982           VB001

Note that, for example, Employee 963 had formerly been at VB001 but was more recently logged in at VB002, so therefore the application is not concerned with the earlier record.
What would also be particularly useful would be the NUMBER of staff at each location - viz.

LOCATION CODE   NUM STAFF
VB001           2
VB002           1
VB003           2
VB004           1

Can anyone help? Many thanks in advance. Edward

NOTES ON DDL: THE BARCODE IS CAPTURED BECAUSE THE COMPANY MAY RE-USE BARCODE NUMBERS (WHICH IS DERIVED FROM THE EMPLOYEE PIN), SO THEREFORE THE BARCODE CANNOT BE RELIED UPON TO BE UNIQUE. THE COLUMN fldRuleAppliedID IS NULL BECAUSE THAT PARTICULAR ROW HAS NOT BEEN PROCESSED. THERE ARE BUSINESS RULES CONCERNING EMPLOYEE HOURS WHICH OPERATE ON THIS DATA. ONCE A ROW HAS BEEN PROCESSED FOR UPLOADING TO THE PAYROLL APPLICATION, THE fldRuleAppliedID COLUMN WILL CONTAIN A VALUE. IN THE PRODUCTION SYSTEM, THEREFORE, ANY SQL AS REQUESTED ABOVE WILL CONTAIN IN ITS WHERE CLAUSE (fldRuleAppliedID IS NULL)

if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[tblTSCollected]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [dbo].[tblTSCollected]
GO
CREATE TABLE [dbo].[tblTSCollected] (
    [fldCollectedID] [int] IDENTITY (1, 1) NOT NULL ,
    [fldEmployeeID] [int] NULL ,
    [fldLocationCode] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    [fldTimeStamp] [datetime] NULL ,
    [fldRuleAppliedID] [int] NULL ,
    [fldBarCode] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]
GO
INSERT INTO dbo.tblTSCollected (fldEmployeeID, fldLocationCode, fldTimeStamp, fldBarCode) VALUES (963, 'VB001', '2005-10-18 11:59:27.383', 45480)
INSERT INTO dbo.tblTSCollected (fldEmployeeID, fldLocationCode, fldTimeStamp, fldBarCode) VALUES (963, 'VB002', '2005-10-18 12:06:17.833', 45480)
INSERT INTO dbo.tblTSCollected (fldEmployeeID, fldLocationCode, fldTimeStamp, fldBarCode) VALUES (964, 'VB001', '2005-10-18 12:56:20.690', 45481)
INSERT INTO dbo.tblTSCollected (fldEmployeeID, fldLocationCode, fldTimeStamp, fldBarCode) VALUES (964, 'VB002', '2005-10-18 15:30:35.117', 45481)
INSERT INTO dbo.tblTSCollected (fldEmployeeID, fldLocationCode, fldTimeStamp, fldBarCode) VALUES (964, 'VB003', '2005-10-18 16:05:05.880', 45481)
INSERT INTO dbo.tblTSCollected (fldEmployeeID, fldLocationCode, fldTimeStamp, fldBarCode) VALUES (966, 'VB001', '2005-10-18 11:52:28.307', 97678)
INSERT INTO dbo.tblTSCollected (fldEmployeeID, fldLocationCode, fldTimeStamp, fldBarCode) VALUES (966, 'VB002', '2005-10-18 13:59:34.807', 97678)
INSERT INTO dbo.tblTSCollected (fldEmployeeID, fldLocationCode, fldTimeStamp, fldBarCode) VALUES (966, 'VB001', '2005-10-18 14:04:55.820', 97678)
INSERT INTO dbo.tblTSCollected (fldEmployeeID, fldLocationCode, fldTimeStamp, fldBarCode) VALUES (966, 'VB003', '2005-10-18 16:10:01.943', 97678)
INSERT INTO dbo.tblTSCollected (fldEmployeeID, fldLocationCode, fldTimeStamp, fldBarCode) VALUES (968, 'VB001', '2005-10-18 11:59:34.307', 98374)
INSERT INTO dbo.tblTSCollected (fldEmployeeID, fldLocationCode, fldTimeStamp, fldBarCode) VALUES (968, 'VB002', '2005-10-18 12:04:56.037', 98374)
INSERT INTO dbo.tblTSCollected (fldEmployeeID, fldLocationCode, fldTimeStamp, fldBarCode) VALUES (968, 'VB004', '2005-10-18 12:10:02.723', 98374)
INSERT INTO dbo.tblTSCollected (fldEmployeeID, fldLocationCode, fldTimeStamp, fldBarCode) VALUES (977, 'VB001', '2005-10-18 12:05:06.630', 96879)
INSERT INTO dbo.tblTSCollected (fldEmployeeID, fldLocationCode, fldTimeStamp, fldBarCode) VALUES (982, 'VB001', '2005-10-18 12:06:13.787', 96697)
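Against that DDL, a sketch for both result sets: the current location is the scan with no later scan for the same employee, and the head-count is a GROUP BY over those same rows. In production the (fldRuleAppliedID IS NULL) filter mentioned in the notes would be added to both WHERE clauses.

-- Current location of each employee
SELECT t.fldEmployeeID   AS [EMPLOYEE ID],
       t.fldLocationCode AS [LOCATION CODE]
FROM   dbo.tblTSCollected t
WHERE  NOT EXISTS (SELECT 1
                   FROM   dbo.tblTSCollected later
                   WHERE  later.fldEmployeeID = t.fldEmployeeID
                     AND  later.fldTimeStamp  > t.fldTimeStamp)
ORDER  BY t.fldEmployeeID;

-- Number of staff at each location, based on those latest scans
SELECT t.fldLocationCode AS [LOCATION CODE],
       COUNT(*)          AS [NUM STAFF]
FROM   dbo.tblTSCollected t
WHERE  NOT EXISTS (SELECT 1
                   FROM   dbo.tblTSCollected later
                   WHERE  later.fldEmployeeID = t.fldEmployeeID
                     AND  later.fldTimeStamp  > t.fldTimeStamp)
GROUP  BY t.fldLocationCode
ORDER  BY t.fldLocationCode;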
I have a table with about half a million records, each representing a patient in my county.
Each record has a field (RRank) which basically sorts the patients as to how "unwell" they are according to a previously-applied algorithm. The most unwell patient has an RRank of 1, the next-most unwell has RRank=2 etc.
I have just deleted several hundred records (which relate to patients now deceased) from the table, thereby leaving gaps in the RRank sequence. I want to renumber the remaining recs to get rid of the gaps.
I can see what I want to accomplish by using ROW_NUMBER, thus:
SELECT ROW_NUMBER() Over (ORDER BY RRank) as RecNumber, RRank FROM RPL ORDER BY RRank
I can see the numbers in the RecNumber column falling behind the RRank as I scan down the results.
My question is: How to convert this into an UPDATE statement? I had hoped that I could do something like:
UPDATE RISC_PatientList_TEMP SET RRank = ROW_NUMBER() Over (ORDER BY RRank);
but the system informs me that window functions will only work in a SELECT (which UPDATE isn't) or ORDER BY (which I can't legally add).
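One workaround: the window function is allowed inside a CTE (or derived table), and the CTE can then be the target of the UPDATE. A sketch against the temp table named above:

WITH renumbered AS
(
    SELECT RRank,
           ROW_NUMBER() OVER (ORDER BY RRank) AS RecNumber   -- gap-free 1..N in RRank order
    FROM   RISC_PatientList_TEMP
)
UPDATE renumbered
SET    RRank = RecNumber;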
I want to load historical data from an old system into a new one. The thing is, the old system stored dates as datetime and the new one uses datetimeoffset.
All data was collected in the same time zone... but with Daylight Saving Time (DST) in effect for part of each year.
The offset is either +04:00 or +05:00, based on the calendar date. To add to the complexity, the rules for DST changed a couple of years ago.
To determine the offset, I'd need to know what was or would have been the server Timezone for each historical date.
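Without AT TIME ZONE (that arrived in SQL Server 2016), one hedged sketch is a small lookup table of the DST periods for the zone in question, then TODATETIMEOFFSET() to attach the right offset; the period rows and the LegacyTable/LegacyDate names below are placeholders, not the real DST rules.

CREATE TABLE dbo.TimeZonePeriods
(
    PeriodStart datetime   NOT NULL,   -- inclusive, local time
    PeriodEnd   datetime   NOT NULL,   -- exclusive, local time
    UtcOffset   varchar(6) NOT NULL    -- '+04:00' or '+05:00'
);

SELECT l.LegacyDate,
       TODATETIMEOFFSET(l.LegacyDate, p.UtcOffset) AS NewValue   -- datetimeoffset with the offset in force then
FROM   dbo.LegacyTable l
       INNER JOIN dbo.TimeZonePeriods p
               ON l.LegacyDate >= p.PeriodStart
              AND l.LegacyDate <  p.PeriodEnd;

Populating TimeZonePeriods is the real work, since the pre- and post-rule-change DST switch dates have to be entered for each historical year.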
Recently, I partitioned one of my largest tables across multiple monthly filegroups. The current month is attached to my "Active" table; the older records are kept in the "Historical" table. I need an efficient way to pull records when I have a date range that can be spread across both tables.
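One common approach (a sketch; the table and column names are guesses) is to filter each side with the same range and UNION ALL the results - or wrap the UNION ALL in a view so callers never care which table holds a given month. A CHECK constraint on the date column of each table lets the optimizer skip the side the range cannot touch.

DECLARE @StartDate datetime, @EndDate datetime;
SET @StartDate = '20150601';   -- placeholder range spanning both tables
SET @EndDate   = '20150801';

SELECT RecordID, RecordDate
FROM   dbo.ActiveTable
WHERE  RecordDate >= @StartDate AND RecordDate < @EndDate
UNION ALL
SELECT RecordID, RecordDate
FROM   dbo.HistoricalTable
WHERE  RecordDate >= @StartDate AND RecordDate < @EndDate;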