SQL Server 2008 :: Find Records Comparing Two Lists
Jul 31, 2015
Below is the code for two data sets and I can't seem to get my head around the issue. I need to find the number of 'ER' visits and 'IN' visits, separately, in dbo.VisitData for the 'Active' patients in dbo.PatientStatus. So, consider patient 69. He is Active on 5/5/2014 but becomes Inactive on 9/15/2014. I only want to count the number of visits ER or IN that are between those dates. In addition if patient 69 becomes active again after 9/15/2014, I need to capture that data as well. Patients can change there status multiple times.
I'm trying to avoid a large amount of manual data manipulation.
Here's the background: Legacy system that has (well let's call apples apples) pretty much no method of enforcing data integrity, which has caused a fairly decent amount of garbage data to be inserted in some tables. Pulling one of the [Individuals] table from within this Legacy system and inserting it into a production system, into the Table schema currently in place to track [Individuals] in this Production system.
Problem: Inserting the information is easy, how to deduplicate the records that exist within the staging table that the legacy [Individuals] table has been dumped into in production, prior to insertion. (Wanting to do this programmatically with SQL or SSIS preferably, so that I can alter it later to allow for updating existing/inserting new)
Staging Table Schema:
; CREATE TABLE [dbo].[stage_Individuals]( [SysID] [int] NULL, --Unique, though it's not an index intended to identify the [Individuals] [JJISID] [nvarchar](10) NULL, [NameLast] [nvarchar](30) NULL, [NameFirst] [nvarchar](30) NULL, [NameMiddle] [nvarchar](30) NULL,
[code]....
Scenario: There are records that duplicate the JJISID, though this value is supposed to be unique for every individual. The SYSID is just a Clustered Index (I'm assuming) within the Legacy system and will be most likely dropped when inserted into the Production [Inviduals] table. There are records that are missing their JJISID, though this isn't supposed to happen either, but have valid information within SSN/DOB/Name/etc that can be merged into the correct record that has a JJISID assigned. There is really no data conformity, some records have NULLS for everything except JJISID, or some records will have all the [Individuals] information excluding the JJISID.
Currently I am running the following SQL just to get a list of the records that have a duplicate JJISID (I have other's that partition by Name/DOB/etc and will adapt whatever I come up with to be used for those as well):
; select j.* from (select ROW_NUMBER() OVER (PARTITION BY JJISID ORDER BY JJISID) as RowNum, stage_Individuals.*, COUNT(*) OVER (partition by jjisid) as cnt from stage_Individuals) as j where cnt > 1 and j.JJISID is not nullNow, with SQL Server 2012 or later I could use LAG and LEAD w/ the RowNum value to do my data manipulation...but that won't work because we are on SQL Server 2008 in this environment.
[URL]
With, the following as a potential solution:
GSquared (3/16/2010)Here's a query that seems to do what you need. Try it, let me know if it works.
Performance on it will be a problem, but I can't fine tune that. You'll need to look at various method for getting this kind of data from the table and work out which variation will be best for your data. Without access to the actual table, I can't do that.
; WITH CTE AS (SELECT master_id, MIN(ID) AS first_id, MAX(Account_Expiry) AS latest_expiry FROM #People GROUP BY master_id) SELECT P1.master_id,
[code].....
Unfortunately, I don't think that will accomplish what I'm looking for - I have some records that are duplicated 6 times, and I'm wanting to keep the values within these that aren't NULL.
Basically what I'm looking for, is to update any column with a NULL value to the corresponding Duplicate [Individuals] record value for that column.
**EDIT - Example, Record 1 has a JJISID with NULL NameFirst & NameLast BUT Record 2 has the same JJISID and values for NameFirst & NameLast. I'm wanting to propogate the NameFirst & NameLast from Record2 into Record1
I have two tables of book information. One that has descriptions of thebook in it, and the isbn, and the other that has the book title,inventory data, prices, the isbn.Because of some techncal constraints I won't get into now, I can'tcombine them both into one table. No problem. Things are going fine aslong as there is a description in the one table to corrispond to theisbn and other data in the other table.However, about half of the products are not yet entered into thedescrition table. I'd like to run a sql query that pulls up all theisbns that don't exist in the other. In other words, I'd like to get aquery that tells me exactly which isbns do not yet have descrition datain them. I know there is some sql that says to search from one filewhere the number does not exist in the other, but it slips my mind. Cansomeone help me on this please?Thank you!Bill*** Sent via Developersdex http://www.developersdex.com ***Don't just participate in USENET...get rewarded for it!
I'm looking for a way of taking a query which returns a set of date time fields (probable maximum of 20 rows) and looping through each value to see if it exists in a separate table.
E.g.
Query 1
Select ID, Person, ProposedEvent, DayField, TimeField from MyOptions where person = 'me'
Table
Select Person, ExistingEvent, DayField, TimeField from MyTimetable where person ='me'
Loop through Query 1 and if it finds ANY matching Dayfield AND Timefield in Query/Table 2, return the ProposedEvent (just as a message, the loop could stop there), if no match a message saying all is fine can proceed to process form blah blah.
I'm essentially wanting somebody to select a bunch of events in a form, query 1 then finds all the days and times those events happen and check that none of them exist in the MyTimetable table.
I have a table Tbl1 which has 7 columns.This table will be my base table.By using our current application version ,i'll be creating record for Client1. Col1 will have value that application will generate(id).Then i'll be creating Tbl2 with same columns.Then i'll be creating same record for Client1 again ,using our new application version .Col1 will have different (id)value.I would like to compare the rest of the columns if there is any discrepancy caused by new version(columns Col2 -Col7).If there are same ,don't show me anything.
I have a table of raw data with supplier names, and i need to join it to our supplier database and pull the supplier numbers.
The issue is that the raw data does not match our database entries for these suppliers; sometimes there are extra periods, commas, or abbreviations (i.e. FedEx, FederalExpress, FedEx, inc.) etc. I'm trying to create a query that will search for entries that are similar.
I tried setting a variable to be equal to the raw data field, and then using a LIKE '%@Variable%' to try and return anything that would contain it, but it didnt return any rows.
I have 2 SQL server installs for 2008 R2 configured as multi instances. I have a product called Esri ArcMap 10.3 that can be used to generate a database. When I run the wizard against one installation, the wizard successfully creates the database. When I then run the same against the other installation it fails with the following error [Microsoft][SQL Server Native Client 10.0]Invalid cursor state
I've attempted to look at the configuration of each using select * from master.sys.configurations
From this I found several differences
Successful Mulit instance Optimize for Ad hoc Workloads – False Max Degree of Parallelism - 0
UnSuccessful Multi instance Optimize for Ad hoc Workloads – True Max Degree of Parallelism - 4
I attempted to co-ordinate the differences running the wizard for each iteration but it always failed with the same error above. The error always seems to occur when a particular store procedure is run. There are quite a number of scripts run prior to this and are technically under the covers and only discovered via tracing, in this case using SQL Profiler. I don't have access to individual scripts that I can run incrementally to replicate the issue. I have to rely only on the Esri Wizard.
Reviewing the error against several forums suggests that this is an ODBC error but the trace I ran using SQL Profiler finds that the driver used is Native.
My question then is "What are the conditions that would cause this error above (Invalid Cursor)?" "Is there other configuration settings that are not captured via the SQL identified above?" "Could this be caused by mapped drives for data, Logs and Temp?"
I have a table containing records of criminal convictions. There are over 1M records and the only change is additions to the table on a monthly basis. The two columns I need to deal with are convicted.NAME and convicted.DOB
I have a second table that has 2 columns. One is the name of the defendant and the other is the birth date. This would be monitor.NAME and monitor.DOB
There are no primary keys or any other way to join the tables for this search I want to do.
I would like to be able to put a name in the "monitor" table and run a query to see if there is a match in the convicted table.
The problem I am having is middle initials or names. If I want to monitor.name = 'SMITH JOHN' it will return the results fine. The problem I am having is if the conviction is in the database as 'SMITH JOHN T', or 'SMITH JOHN THOMAS'.
How can I use the monitor table with a 'LASTNAME FIRSTNAME' and return results if the convicted table has a middle initial. I tried with a JOIN:
select distinct convicted.* from convicted join monitor on monitor.name like convicted.defendant and monitor.birthdate = convicted.dob
how best to approach a problem involving two tables across two different servers.
Table 1: Contains IP Address along with assessment findings. Lets say the fields are IPADDRESSSTR, FINDING
Table 2: Contains Subnet information stored in integer format. The fields are SITE_ID, LOW, and HIGH
What I'd like to do is load the IP range information into memory and then return the findings from table 1 where the IPADDRESSSTR is between the LOW and HIGH integer value.
1) Is there a way to load all of the ranges from table 2 into an array and then compare all the IP addresses (IPADDRESSSTR) from table 1?
2) How do I convert IPADDRESSSTR (a string) to an integer to perform the comparison.
declare @table table ( ParentID INT, ChildID INT, Value float ) INSERT INTO @table SELECT 1,1,1.2
[code]....
This case ParentID - Child 1 ,1 & 2,2 and 3,3 records are called as parent where as null , 1 is child whoose parent is 1 similarly null,2 records are child whoose parent is 2 , .....
Now my requirement is to display parent records with value ascending and display next child records to the corresponding parent and parent records are sorted ascending
I want find all dependent objects related to a table. I am using -
SELECT DISTINCT so.name INTO #tmp FROM dbo.sysobjects so JOIN dbo.SysComments sc ON sc.id = so.id WHERE sc.text LIKE '%tablename%'But, I want all those SP/functions/views that use the output of this query and so on...
Eg. Table -> used in usp1 -> usp1 used in usp2 ...and so on
Script to find a value in all databases ? for example I want to find a value contain word “ SQL” in all databases so I will get the information in which database and which table it exists.
My script is only for finding a value in one database.
I have a query that finds all SPID's connected to a particular database:
select d.name, p.* from sys.databases d join sys.sysprocesses p on d.database_id = p.dbid where d.name = 'my_db'
But now we have a new rule that we should not use outdated compatibility views, and one of them is sys.sysprocesses. I checked sys.dm_exec_connections/session/requests but failed to replace my existing code. The first two don't have dbid, the last one, requests, has it, but it selects only currently executing statements.
I have express edition [advance] of sqlserver 2008 r2 , is it possible to trace every event with out using profiler as u know it does not ship with it.
Basically i want to see how locks are taken and released in each isolation level when query is executed. I am using
Recently I needed to find all processes connected to a particular database, let's call it Test_db. I have a simple query to find all connections to my database:
select * from sys.databases d join sys.sysprocesses p on d.database_id = p.dbid where d.name = 'test_db'
But there was a process that was connected to another database like USE another_db_name; but was actually selecting from tables in test_db. Is it possible to catch such connections?
"The log in this backup set terminates at LSN 9566000024284900001, which is too early to apply to the database. A more recent log backup that includes LSN 986000002731000001 can be restored."
If I am missing to restore previous log backup file, how to find which LSN mapped to which log backup file (.trn file name)? Another word looking at LSN can we tell which log backup file is missing?
In my asp.net project there are about 100 drop down list.I created a table to store data for drop down list in which including [DropdownID],[Order Sequence] and [Description] three columns. The sample like below. Data was input manually by a user. How to code to find out duplicate [OrderSequence]?
I want to fetch below 3 records from the above scenario i.e. latest record of each amount (Latest is determined using "descr" column i.e. 4 is greater then 3 -
I have basic knowledge of T-SQL and I am using Cursors to get the first value, the last value and the peak value and some other values from other tables. I found some examples on google but the code I am using is mixed up. I am using multiple Cursors. I need to join three tables to get the result set into the Cursor. The first example uses 2 tables.
The above code seems totally inefficient but it gives the correct result. Now I want to pull some more value and join a third table (TABLE z) in the above CURSORS and not sure how to make it working using CURSORS.I would like to use the following in the CURSORS above.
SELECT x.publishdate, y.firstname, y.lastname, y.age, z.initialValue AS FirstValue, z.HighestValue AS Highest, z.LastValue AS Last FROM TABLE x LEFT OUTER JOIN TABLE y ON x.id = y.id INNER JOIN TABLE z ON x.id = z.id
Every sunday, new data will be loaded from temptable to main table. I have to make sure that, duplicates does not get loaded from temptable to maintable.
For example, if last sunday a record gets loaded from temp to main. If this sunday also the same record is present then it means that is a duplicate.
The duplicate is decided on below scenario
select 'CodeChanges: ', count(*) from CodeChanges a, CodeChanges_Temp b where a.AccountNumber = b.AccountNumber and a.HexaNumber = b.HexaNumber and a.HexaEffDate = b.HexaEffDate and a.HexaId = b.HexaId and
[Code] ...
Yesterday (Sunday) , data from temp got loaded onto maintable but with duplicates.
There is a log which just displays number of duplicates.
Yesterday the log displayed 8 duplicates found. I need to find out the 8 duplicates which got loaded yesterday and delete it off from main table.
There is a column in both tables which is 'creation date and time'. Every Sunday when the load happens this column will have that day's date .
Now i need to find out what are all the duplicates which got loaded on this sunday.
The total rows in temp table is : 363 No of duplicates present is : 8
I used below query to find out the duplicates but it is returning all the 363 rows from the maintable instead of the 8 duplicates.
Select 'CodeChanges: ', * from CodeChanges a where exists ( Select 1 from CodeChanges_Temp b where a.HexaNumber = b.HexaNumber and a.HexaEffDate = b.HexaEffDate and
[Code] ...
Need finding the duplicate records which has creation date time as '2015-11-01 00:00:00.000' and all the above columns mentioned in the query matches.
When I give job Id in filter of this query this will give job status of "Success" but actually my job is currently in executing stage. I want to get all jobs that are currently in executing status.
Use msdb go select distinct j.Name as "Job Name", --j.job_id, case j.enabled when 1 then 'Enable' when 0 then 'Disable'
I have an invoice table with customer, date, sales location, and invoice total. I want to total the invoices by location by customer and see what location sells the most to each customer. My table looks like this.
SELECT P.Publication ,P.Publication_type ,S.Subscriber_ID ,S.Update_Mode FROM MSPublications P INNER JOIN MSSubscriptions S ON P.Publication_ID = S.Publication_ID
give me publication_type=0. So it is transactional replication but how do we know that is pull or push?