SQL Server 2008 :: Comparing / Merging Records In Single Table?
Jun 2, 2015
I'm trying to avoid a large amount of manual data manipulation.
Here's the background: a legacy system that has (let's call apples apples) pretty much no method of enforcing data integrity, which has caused a fair amount of garbage data to be inserted into some tables. I'm pulling the [Individuals] table from this legacy system and inserting it into a production system, into the table schema currently in place to track [Individuals] there.
Problem: inserting the information is easy. What I can't work out is how to deduplicate the records in the staging table (where the legacy [Individuals] table has been dumped in production) prior to insertion. I'd prefer to do this programmatically with SQL or SSIS, so that I can alter it later to allow for updating existing/inserting new records.
Staging Table Schema:
CREATE TABLE [dbo].[stage_Individuals](
[SysID] [int] NULL, --Unique, though it's not an index intended to identify the [Individuals]
[JJISID] [nvarchar](10) NULL,
[NameLast] [nvarchar](30) NULL,
[NameFirst] [nvarchar](30) NULL,
[NameMiddle] [nvarchar](30) NULL,
[code]....
Scenario: There are records that duplicate the JJISID, though this value is supposed to be unique for every individual. The SysID is just a clustered index (I'm assuming) within the legacy system and will most likely be dropped when inserted into the production [Individuals] table. There are records that are missing their JJISID, though this isn't supposed to happen either, but have valid information within SSN/DOB/Name/etc. that can be merged into the correct record that has a JJISID assigned. There is really no data conformity: some records have NULLs for everything except JJISID, and some records have all the [Individuals] information excluding the JJISID.
Currently I am running the following SQL just to get a list of the records that have a duplicate JJISID (I have others that partition by Name/DOB/etc. and will adapt whatever I come up with to be used for those as well):
select j.*
from (select ROW_NUMBER() OVER (PARTITION BY JJISID ORDER BY JJISID) as RowNum, stage_Individuals.*, COUNT(*) OVER (partition by jjisid) as cnt from stage_Individuals) as j
where cnt > 1 and j.JJISID is not null
Now, with SQL Server 2012 or later I could use LAG and LEAD w/ the RowNum value to do my data manipulation...but that won't work because we are on SQL Server 2008 in this environment.
[URL]
With, the following as a potential solution:
GSquared (3/16/2010): Here's a query that seems to do what you need. Try it, let me know if it works.
Performance on it will be a problem, but I can't fine tune that. You'll need to look at various methods for getting this kind of data from the table and work out which variation will be best for your data. Without access to the actual table, I can't do that.
WITH CTE
AS (SELECT master_id,
MIN(ID) AS first_id,
MAX(Account_Expiry) AS latest_expiry
FROM #People
GROUP BY master_id)
SELECT P1.master_id,
[code].....
Unfortunately, I don't think that will accomplish what I'm looking for - I have some records that are duplicated 6 times, and I'm wanting to keep the values within these that aren't NULL.
Basically what I'm looking for, is to update any column with a NULL value to the corresponding Duplicate [Individuals] record value for that column.
**EDIT - Example: Record 1 has a JJISID with NULL NameFirst & NameLast, but Record 2 has the same JJISID and values for NameFirst & NameLast. I want to propagate the NameFirst & NameLast from Record 2 into Record 1.
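One possible approach to the NULL-filling step that works on SQL Server 2008 is sketched below. It is only an illustration under assumptions: the table variable `@stage` is a hypothetical stand-in for `stage_Individuals` with just the columns shown in the post, and the sample rows are made up. It relies on `MAX()` ignoring NULLs, so when two duplicates hold different non-NULL values for the same column, the higher-sorting one wins, which may or may not be acceptable for your data.

```sql
-- Hypothetical stand-in for dbo.stage_Individuals (only the columns shown in the post).
DECLARE @stage TABLE (
    SysID      int,
    JJISID     nvarchar(10),
    NameLast   nvarchar(30),
    NameFirst  nvarchar(30),
    NameMiddle nvarchar(30)
);

-- Made-up sample rows: three partial records for J100, one complete record for J200.
INSERT INTO @stage (SysID, JJISID, NameLast, NameFirst, NameMiddle)
VALUES (1, N'J100', NULL,     NULL,   NULL),
       (2, N'J100', N'Smith', N'Ann', NULL),
       (3, N'J100', NULL,     NULL,   N'Q'),
       (4, N'J200', N'Jones', N'Bob', NULL);

-- Step 1: fill each NULL column from any duplicate that has a value.
-- MAX() skips NULLs, so each JJISID group yields one non-NULL value per column if one exists.
UPDATE s
SET    NameLast   = COALESCE(s.NameLast,   m.NameLast),
       NameFirst  = COALESCE(s.NameFirst,  m.NameFirst),
       NameMiddle = COALESCE(s.NameMiddle, m.NameMiddle)
FROM   @stage AS s
JOIN  (SELECT JJISID,
              MAX(NameLast)   AS NameLast,
              MAX(NameFirst)  AS NameFirst,
              MAX(NameMiddle) AS NameMiddle
       FROM   @stage
       WHERE  JJISID IS NOT NULL
       GROUP  BY JJISID) AS m
  ON   m.JJISID = s.JJISID;

-- Step 2: keep one row per JJISID (lowest SysID here, an arbitrary choice).
WITH ranked AS (
    SELECT SysID,
           ROW_NUMBER() OVER (PARTITION BY JJISID ORDER BY SysID) AS rn
    FROM   @stage
    WHERE  JJISID IS NOT NULL
)
DELETE FROM ranked
WHERE  rn > 1;

SELECT SysID, JJISID, NameLast, NameFirst, NameMiddle
FROM   @stage
ORDER  BY SysID;
```

The same two steps could be repeated with a different grouping key (for example Name/DOB) to attach records that are missing their JJISID, though that matching rule would need to be decided first.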
Jul 31, 2015
Below is the code for two data sets, and I can't seem to get my head around the issue. I need to find the number of 'ER' visits and 'IN' visits, separately, in dbo.VisitData for the 'Active' patients in dbo.PatientStatus. Consider patient 69. He is Active on 5/5/2014 but becomes Inactive on 9/15/2014. I only want to count the ER or IN visits that fall between those dates. In addition, if patient 69 becomes Active again after 9/15/2014, I need to capture that data as well. Patients can change their status multiple times.
create table dbo.PatientStatus
(
patient_id varchar(10),
status_type varchar(10),
status_date datetime
[Code] ....
Jun 8, 2015
I have a table Tbl1 which has 7 columns. This table will be my base table. Using our current application version, I'll create a record for Client1. Col1 will have a value that the application generates (id). Then I'll create Tbl2 with the same columns and create the same record for Client1 again, using our new application version. Col1 will have a different (id) value. I would like to compare the rest of the columns (Col2 - Col7) to see if there is any discrepancy caused by the new version. If they are the same, don't show me anything.
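A comparison like this can be sketched with `EXCEPT`, which returns nothing when the compared columns are identical and treats NULLs as equal to each other. The column types and sample values below are assumptions for illustration, and the table variables `@Tbl1` and `@Tbl2` stand in for the real tables.

```sql
-- Hypothetical tables: Col1 is the generated id, Col2..Col7 are the columns to compare.
DECLARE @Tbl1 TABLE (Col1 int, Col2 varchar(20), Col3 varchar(20), Col4 varchar(20),
                     Col5 varchar(20), Col6 varchar(20), Col7 varchar(20));
DECLARE @Tbl2 TABLE (Col1 int, Col2 varchar(20), Col3 varchar(20), Col4 varchar(20),
                     Col5 varchar(20), Col6 varchar(20), Col7 varchar(20));

INSERT INTO @Tbl1 VALUES (101, 'Client1', 'a', 'b', 'c', 'd', 'e');
INSERT INTO @Tbl2 VALUES (202, 'Client1', 'a', 'b', 'X', 'd', 'e');  -- Col5 differs

-- Rows present on one side only, ignoring Col1. An empty result means no discrepancy.
SELECT 'Tbl1 only' AS FoundIn, d.*
FROM  (SELECT Col2, Col3, Col4, Col5, Col6, Col7 FROM @Tbl1
       EXCEPT
       SELECT Col2, Col3, Col4, Col5, Col6, Col7 FROM @Tbl2) AS d
UNION ALL
SELECT 'Tbl2 only' AS FoundIn, d.*
FROM  (SELECT Col2, Col3, Col4, Col5, Col6, Col7 FROM @Tbl2
       EXCEPT
       SELECT Col2, Col3, Col4, Col5, Col6, Col7 FROM @Tbl1) AS d;
```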
Jan 26, 2015
I have multiple databases on the server, and all my databases have the tables stdVersions and stdChangeLog. The stdVersions table has a field called DatabaseVersion which stores the version of the database. The stdChangeLog table has a field called ChangedOn which stores the date of any change made in the database.
I need to write a query/stored procedure/function that will return all the database names, version and the date changed on. The results should look something like this:
DatabaseName DatabaseVersion DateChangedOn
OK5_AAGLASS 5.10.1.2 2015/01/12
OK5_SHOPRITE 5.9.1.6 2015/01/10
OK5_SALDANHA 5.10.1.2 2014/12/23
The results should be ordered by DateChangedOn.
Jul 20, 2005
I'm trying to come up with an elegant, simple way to compare two consecutive values from the same table. For instance:
SELECT TOP 2 datavalues FROM myTable ORDER BY timestamp DESC
That gives me the two latest values. I want to test the rate of change of these values. If the top row is a 50% increase over the row below it, I'll execute some special logic. What are my options? The only ways I can think of doing this are pretty ugly. Any help is very much appreciated. Thanks!
B.
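One option, assuming SQL Server 2005 or later (where `ROW_NUMBER()` is available; the multi-row `VALUES` in the sample data needs 2008), is to number the rows from newest to oldest and put rows 1 and 2 side by side. The table variable and sample values are hypothetical, and "50% increase" is read here as "at least 1.5 times the previous value".

```sql
-- Hypothetical stand-in for myTable.
DECLARE @myTable TABLE ([timestamp] datetime, datavalues decimal(18, 4));

INSERT INTO @myTable ([timestamp], datavalues)
VALUES ('2005-07-20T10:00:00', 100),
       ('2005-07-20T10:05:00', 120),
       ('2005-07-20T10:10:00', 180);

WITH latest AS (
    SELECT datavalues,
           ROW_NUMBER() OVER (ORDER BY [timestamp] DESC) AS rn
    FROM   @myTable
)
SELECT cur.datavalues  AS CurrentValue,
       prev.datavalues AS PreviousValue,
       -- 1 when the newest value is at least 50% above the one before it
       CASE WHEN prev.datavalues > 0
             AND cur.datavalues >= prev.datavalues * 1.5
            THEN 1 ELSE 0 END AS IsJump
FROM   latest AS cur
JOIN   latest AS prev
  ON   cur.rn = 1
 AND   prev.rn = 2;
```

The `IsJump` flag could then drive an `IF` block that runs the special logic.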
Jan 7, 2008
Hi,
I have two databases, let's say DB1 and DB2.
The schemas for both databases are the same.
In both database schemas there are tables which have identity columns as the primary key.
Now I want to merge these two databases into a single database, say DB3.
It is also possible that some master records are common to both databases, so they should not be repeated in DB3.
Is there any way I can do this quickly?
Thanks in advance
Rohit
Sep 16, 2015
I have a scenario where I have to Update a table with date when there are new records in another table
For example:
I load ODS table with the data from a file in SSIS. the file has CustomerID and other columns.
Now, when there is new record for any customerID in Ods, then Update the dbo table with the most recent record for every CustomerID(i.e. update the date column in dbo for that customerID). Also Include an Identifier that relates back to the ODS table. How do I do this?
Mar 17, 2015
I have a table where i am inserting into temp table, I mean selecting the records from existing table. From this how can i get latest records.
create table studentmarks
(
id int,
name varchar(20),
marks int
)
Insert into dbo.studentmarks values(1,'sha',20);
[Code] ....
How to write a sql query to get the below output
studentname totalmarks
sha 90
hu 120
May 12, 2007
Hello, I am pretty new with SQL Server 2005.
I have installed SQL Server Express Edition. I have migrated a set of tables from Oracle 10g (using Microsoft's Migration Tool Kit). When I try the following simple update command, the session hangs and never finishes:
/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
select pos_key from pos_station where staff_key = 1105
POS_KEY
=======
NULL
update pos_station set pos_key = 1 where staff_key = 1105
/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
The table has a few constraints and a couple of indices in place.
Then I create another table (but with no constraints or indices), just copy the data from the problematic one, and the update WORKS (in msecs):
update pos_station_new set pos_key = 1 where staff_key = 1105
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
Is there any way to tell if the table (any table in SQL Server) is corrupted or not ?
How can I tell if a session is waiting for something and what is that something ?
Thank you very much for your help.
Tom
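For the two questions at the end, a starting point (a sketch, not a diagnosis) is the `sys.dm_exec_requests` dynamic management view, available since SQL Server 2005, which shows what each running request is waiting on and which session is blocking it. Run it from a second connection while the update hangs. An update that hangs on one table but works on a fresh copy is often blocked by an open transaction in another session, though that is only a guess here. `DBCC CHECKTABLE` reports physical or logical corruption in a single table.

```sql
-- What is each running request waiting for, and who is blocking it?
-- Requires VIEW SERVER STATE permission.
SELECT r.session_id,
       r.status,
       r.command,
       r.blocking_session_id,   -- 0 means the request is not blocked by another session
       r.wait_type,
       r.wait_time,             -- milliseconds
       r.wait_resource
FROM   sys.dm_exec_requests AS r
WHERE  r.blocking_session_id <> 0
   OR  r.wait_type IS NOT NULL;

-- Integrity check for the one table from the post; returns messages only if problems are found.
DBCC CHECKTABLE ('pos_station') WITH NO_INFOMSGS;
```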
Apr 14, 2015
I have a query that needs to look for 5 specific records in a table, so basically I need to hardcode the values. Below is my query, which didn't work.
select BF_ORGN_CD, BF_BDOB_CD, BF_TM_PERD_CD,data
from BF_DATA
WHERE (BF_ORGN_CD,BF_BDOB_CD,BF_TM_PERD_CD) in ***** i guess this is the wrong query****
('A1', 'B1', 'C1')
('A2', 'B2', 'C2')
('A3', 'B3', 'C3')
('A4', 'B4', 'C4')
('A5', 'B5', 'C5')
but if i use the query below it will generate more records than these 5 records
select BF_ORGN_CD, BF_BDOB_CD, BF_TM_PERD_CD,data
from BF_DATA
WHERE (BF_ORGN_CD) in ('A1', 'A2', 'A3', 'A4', 'A5')
and (BF_BDOB_CD) in ('B1', 'B2', 'B3', 'B4', 'B5')
and (BF_TM_PERD_CD) in ('C1', 'C2', 'C3', 'C4', 'C5')
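T-SQL does not support the row-value form `(a, b, c) IN (...)`, and three separate `IN` lists match every combination rather than the five specific triples. One way to express the intent on SQL Server 2008 or later is to join against a table value constructor. The sketch below uses a hypothetical table variable `@BF_DATA` with made-up rows and column types.

```sql
-- Hypothetical stand-in for BF_DATA.
DECLARE @BF_DATA TABLE (BF_ORGN_CD varchar(10), BF_BDOB_CD varchar(10),
                        BF_TM_PERD_CD varchar(10), data int);

INSERT INTO @BF_DATA (BF_ORGN_CD, BF_BDOB_CD, BF_TM_PERD_CD, data)
VALUES ('A1', 'B1', 'C1', 10),
       ('A1', 'B2', 'C3', 20),   -- a cross combination that three IN lists would wrongly return
       ('A2', 'B2', 'C2', 30);

-- Only the exact triples listed in the VALUES constructor are matched.
SELECT d.BF_ORGN_CD, d.BF_BDOB_CD, d.BF_TM_PERD_CD, d.data
FROM   @BF_DATA AS d
JOIN  (VALUES ('A1', 'B1', 'C1'),
              ('A2', 'B2', 'C2'),
              ('A3', 'B3', 'C3'),
              ('A4', 'B4', 'C4'),
              ('A5', 'B5', 'C5')) AS k (BF_ORGN_CD, BF_BDOB_CD, BF_TM_PERD_CD)
  ON   k.BF_ORGN_CD    = d.BF_ORGN_CD
 AND   k.BF_BDOB_CD    = d.BF_BDOB_CD
 AND   k.BF_TM_PERD_CD = d.BF_TM_PERD_CD;
```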
Apr 22, 2015
I am using SQL Server Agent jobs that run each morning to update the records in a table to match what they should be for that day. I built and tested them using a test table called "testtable1", and it worked fine. But when I switched over to our production table, it fails, saying the table has to be declared. What would be the difference? The production table has an "@" in front of the name; is that causing issues?
USE [Live_build]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
BEGIN
DELETE
FROM @ZIPLIST
INSERT INTO @ZIPLIST
SELECT * FROM tblZip3DSWed;
END
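The "@" is very likely the cause: an unquoted identifier starting with `@` is always parsed as a local variable, so `@ZIPLIST` is read as a table variable and SQL Server raises "Must declare the table variable". If the production table really is named `@ZIPLIST`, delimiting the name with brackets makes it an ordinary table reference. The sketch below runs in tempdb with a hypothetical single-column layout (`Zip3`), since the real columns are not shown in the post.

```sql
USE tempdb;  -- demo only, so nothing is created in Live_build

-- Hypothetical layouts; the real tables' columns are not shown in the post.
CREATE TABLE [dbo].[@ZIPLIST]     (Zip3 char(3) NOT NULL);
CREATE TABLE [dbo].[tblZip3DSWed] (Zip3 char(3) NOT NULL);
INSERT INTO [dbo].[tblZip3DSWed] (Zip3) VALUES ('100'), ('200');

-- Unbracketed, @ZIPLIST would be treated as an undeclared table variable.
-- Bracketed, it is a normal table whose name happens to start with @.
DELETE FROM [dbo].[@ZIPLIST];

INSERT INTO [dbo].[@ZIPLIST] (Zip3)
SELECT Zip3
FROM   [dbo].[tblZip3DSWed];

SELECT Zip3 FROM [dbo].[@ZIPLIST];

-- Clean up the demo objects.
DROP TABLE [dbo].[@ZIPLIST];
DROP TABLE [dbo].[tblZip3DSWed];
```

Listing the columns explicitly instead of `SELECT *` also protects the job if either table's layout changes.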
Jun 4, 2015
Here's the scenario. I have a table (let's call it MyTable) that consists of four fields: Id, Source, FirstField, and SecondField, where Source only takes one of two values: Source1 and Source2.
The records in this table look as follows:
I need to return, using 3 different T-SQL queries:
1) Products that exist only in Source2 (in red above)
2) Products that exist only in Source1 (in green above)
3) Products that exist both in Source1 and Source2 (in black above)
For 1) so far I've been doing something along the lines of
SELECT * FROM MyTable WHERE Source='Source1' AND FirstField NOT IN (SELECT DISTINCT FirstField FROM MyTable WHERE Source='Source2')
Not being a T-SQL expert myself, I'm wondering if this is the right or more efficient way to go. I have read about INTERSECT and EXCEPT, but I am a little unclear if they could be applied in this case out of the box.
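`EXCEPT` and `INTERSECT` do apply here fairly directly, as long as a "product" can be identified by a comparable set of columns. The sketch below assumes that `FirstField` alone identifies a product and uses a hypothetical table variable with made-up rows. Note that the `NOT IN` query as quoted filters on Source1, so it answers case 2 rather than case 1, and that `NOT IN` returns no rows at all if the subquery yields a NULL, which `EXCEPT` and `NOT EXISTS` avoid.

```sql
-- Hypothetical stand-in for MyTable.
DECLARE @MyTable TABLE (Id int, Source varchar(10), FirstField varchar(20), SecondField varchar(20));

INSERT INTO @MyTable (Id, Source, FirstField, SecondField)
VALUES (1, 'Source1', 'A', 'x'),
       (2, 'Source2', 'A', 'x'),
       (3, 'Source1', 'B', 'y'),
       (4, 'Source2', 'C', 'z');

-- 1) Products that exist only in Source2
SELECT FirstField FROM @MyTable WHERE Source = 'Source2'
EXCEPT
SELECT FirstField FROM @MyTable WHERE Source = 'Source1';

-- 2) Products that exist only in Source1
SELECT FirstField FROM @MyTable WHERE Source = 'Source1'
EXCEPT
SELECT FirstField FROM @MyTable WHERE Source = 'Source2';

-- 3) Products that exist in both sources
SELECT FirstField FROM @MyTable WHERE Source = 'Source1'
INTERSECT
SELECT FirstField FROM @MyTable WHERE Source = 'Source2';
```

Both operators return distinct values of the listed columns only. If the full rows are needed, a `NOT EXISTS` (or `EXISTS`) subquery correlated on `FirstField` gives the same filtering while keeping every column.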
Feb 4, 2015
I have these two tables Log and CategoryLog, I need to archive records older than 13 months in these two tables to two separate tables and then delete the archived records from Log and CategoryLog tables. The problem is that only 'Log' table has a date column, the other table CategoryLog does not have any date column. But the two tables are connected by a column(LogID). How to archive the data and then delete the archive data from both tables.
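Since CategoryLog has no date of its own, its age can be taken from the parent Log row via LogID. A sketch of one way to do this uses `DELETE ... OUTPUT ... INTO`, which moves the rows to the archive in the same statement that deletes them. All column names other than LogID, and the archive table names, are assumptions, and temp tables are used so the example is self-contained.

```sql
-- Hypothetical layouts; only LogID and the existence of a date column come from the post.
CREATE TABLE #Log                (LogID int PRIMARY KEY, LogDate datetime, Message varchar(100));
CREATE TABLE #CategoryLog        (CategoryLogID int PRIMARY KEY, LogID int, Category varchar(50));
CREATE TABLE #LogArchive         (LogID int, LogDate datetime, Message varchar(100));
CREATE TABLE #CategoryLogArchive (CategoryLogID int, LogID int, Category varchar(50));

INSERT INTO #Log VALUES (1, DATEADD(month, -20, GETDATE()), 'old entry'),
                        (2, DATEADD(month, -1,  GETDATE()), 'recent entry');
INSERT INTO #CategoryLog VALUES (10, 1, 'Error'), (11, 2, 'Info');

DECLARE @cutoff datetime;
SET @cutoff = DATEADD(month, -13, GETDATE());

BEGIN TRANSACTION;

-- Child rows first: their age is decided by the parent Log row.
DELETE c
OUTPUT deleted.CategoryLogID, deleted.LogID, deleted.Category
INTO   #CategoryLogArchive (CategoryLogID, LogID, Category)
FROM   #CategoryLog AS c
JOIN   #Log AS l
  ON   l.LogID = c.LogID
WHERE  l.LogDate < @cutoff;

-- Then the parent rows.
DELETE FROM #Log
OUTPUT deleted.LogID, deleted.LogDate, deleted.Message
INTO   #LogArchive (LogID, LogDate, Message)
WHERE  LogDate < @cutoff;

COMMIT TRANSACTION;

SELECT * FROM #LogArchive;
SELECT * FROM #CategoryLogArchive;

DROP TABLE #Log, #CategoryLog, #LogArchive, #CategoryLogArchive;
```

On large tables it may be worth deleting in batches (for example with `DELETE TOP (n)`) to keep the transaction log and lock footprint manageable.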
Jun 10, 2015
I have a problem where I have to compare 2 records from the same table. This part looks easy, but the problem is that a User can have multiple records, and I have to compare each record with its previous instance based on the timestamp. Not only do I have to compare them, I also have to perform some analysis. Below are the table script and sample output.
Givens: All SQL Server 2008 or 2012 tools at your disposal.
Production database contains the following tables (simplified for example: constraints ignored, etc.) associated with a racing video game’s server.
-- A player of our game
-- Table greater than 10 million rows
CREATE TABLE [dbo].[User]
(
[UserId] [bigint] NOT NULL
,[country] [int] NULL -- User’s home country
,[name] [nvarchar](15) NULL -- User’s displayable name (‘John’, ‘Bill’)
,[subscriptionTier] [int] NULL -- 0 == free, 1 == paid, for instance
)
Assume that rows get written into the event tables at a rate of 1,000 a minute, are never updated once written, and currently are only read on a replica/reporting server.
Question Background: Write up a single query that would return the following: a list of users whose “TotalMoneyEarned” value ever grew (between logon events) at a rate of more than 1,000 per minute (we’d consider these suspicious and flag them for later investigation).
For instance, if the sample data were:
-- example of [Events.UserLogon] data -- not the query output we want
EventId UserId TotalMoneyEarned LogonDate
----------- -------------------- ---------------- -----------------------
1 1 1000 2010-10-16 00:19:56.460
2 1 1500 2010-10-16 00:20:56.460
3 1 3000 2010-10-16 00:21:56.460
4 1 10000 2010-10-16 00:29:56.460
Event 1 is okay because there’s nothing to compare it against
Event 2 is okay because the TotalMoneyEarned only grew 500 in a minute
Event 3 should be flagged, as the value grew 1500 in a minute
Event 4 is okay, as it grew 7,000 in 8 minutes (< 1000 per minute)
Query Output (your query should return data in a format like this):
User Flagged Logon Time Rate Since Last Logon (money/minute)
John 2010-10-16 00:21:56 1500
Dave 2010-10-16 00:30:50 3200
Bill 2010-10-16 00:35:23 1000
It is likely that you will need to create sample data for both the User and [Events.Logon] tables. We are looking for a single query that returns data like what is represented in Query Output.
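A sketch of one SQL Server 2008-compatible approach: number each user's logon events in time order with `ROW_NUMBER()` and self-join each event to the one before it (on 2012 and later, `LAG` would replace the self-join). The table variables below are hypothetical stand-ins for `[dbo].[User]` and `[Events.UserLogon]`, loaded with the sample rows from the post; the user name 'John' for UserId 1 is an assumption.

```sql
-- Hypothetical stand-ins for the real tables.
DECLARE @User      TABLE (UserId bigint NOT NULL, name nvarchar(15));
DECLARE @UserLogon TABLE (EventId int, UserId bigint, TotalMoneyEarned bigint, LogonDate datetime);

INSERT INTO @User (UserId, name) VALUES (1, N'John');

INSERT INTO @UserLogon (EventId, UserId, TotalMoneyEarned, LogonDate)
VALUES (1, 1, 1000,  '2010-10-16T00:19:56.460'),
       (2, 1, 1500,  '2010-10-16T00:20:56.460'),
       (3, 1, 3000,  '2010-10-16T00:21:56.460'),
       (4, 1, 10000, '2010-10-16T00:29:56.460');

WITH ordered AS (
    SELECT UserId, TotalMoneyEarned, LogonDate,
           ROW_NUMBER() OVER (PARTITION BY UserId ORDER BY LogonDate, EventId) AS rn
    FROM   @UserLogon
)
SELECT u.name        AS [User],
       cur.LogonDate AS [Flagged Logon Time],
       -- money per minute, computed from seconds; NULLIF guards against identical timestamps
       (cur.TotalMoneyEarned - prev.TotalMoneyEarned) * 60.0
           / NULLIF(DATEDIFF(second, prev.LogonDate, cur.LogonDate), 0)
                     AS [Rate Since Last Logon (money/minute)]
FROM   ordered AS cur
JOIN   ordered AS prev
  ON   prev.UserId = cur.UserId
 AND   prev.rn     = cur.rn - 1          -- the immediately preceding logon
JOIN   @User AS u
  ON   u.UserId = cur.UserId
-- growth * 60 > 1000 * elapsed seconds  is the same as  growth per minute > 1000
WHERE  (cur.TotalMoneyEarned - prev.TotalMoneyEarned) * 60.0
           > 1000.0 * DATEDIFF(second, prev.LogonDate, cur.LogonDate);
```

With the four sample rows, only event 3 satisfies the filter (1,500 gained in 60 seconds). On a table of this size, an index on (UserId, LogonDate) would likely matter for the numbering step.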
Jan 24, 2015
I have a database full of different types of leads some for company A some for company B and so on, each doing a different service. However the leads from B can be used for A and leads from A can be used for B, so I want to merge the data.
Example:
Phone Number Name Home Owner Credit Insurance
727-555-1234 Dave Thomas Yes B
727-555-1234 Dave Thomas Gieco
I would like the end result to be one record:
Phone Number Name Home Owner Credit Insurance
727-555-1234 Dave Thomas Yes B Gieco
Since these were imported into SQL they all have a unique ID, here are the current labels
ID,phone_number,first_name,last_name,address1,address2,address3,city,state,postal_code,HOME_OWNR,HH_INCOME,CREDIT_RATING,AGE,MATCH,source_id,title,comments,dnc_flag,provider,vehicle,coverage,alt_phone,email,marital status,dob
Jul 17, 2015
I have a table of raw data with supplier names, and i need to join it to our supplier database and pull the supplier numbers.
The issue is that the raw data does not match our database entries for these suppliers; sometimes there are extra periods, commas, or abbreviations (i.e. FedEx, FederalExpress, FedEx, inc.) etc. I'm trying to create a query that will search for entries that are similar.
I tried setting a variable to be equal to the raw data field, and then using LIKE '%@Variable%' to try to return anything that would contain it, but it didn't return any rows.
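The reason `LIKE '%@Variable%'` returns nothing is that the variable name sits inside the quotes, so SQL Server searches for the literal text `@Variable`. The variable has to be concatenated into the pattern instead. A minimal sketch with a hypothetical supplier table, which also strips periods and commas before comparing:

```sql
-- Hypothetical stand-in for the supplier database table.
DECLARE @Suppliers TABLE (SupplierNo int, SupplierName varchar(100));

INSERT INTO @Suppliers (SupplierNo, SupplierName)
VALUES (1, 'FedEx, Inc.'),
       (2, 'Federal Express'),
       (3, 'UPS');

DECLARE @Variable varchar(100);
SET @Variable = 'FedEx';

-- Concatenate the wildcards around the variable's value.
SELECT SupplierNo, SupplierName
FROM   @Suppliers
WHERE  REPLACE(REPLACE(SupplierName, '.', ''), ',', '')
       LIKE '%' + REPLACE(REPLACE(@Variable, '.', ''), ',', '') + '%';
```

This finds 'FedEx, Inc.' but not 'Federal Express'; abbreviations and alternate spellings would still need a mapping table or a fuzzy-matching step.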
Apr 26, 2015
I have 2 SQL server installs for 2008 R2 configured as multi instances. I have a product called Esri ArcMap 10.3 that can be used to generate a database. When I run the wizard against one installation, the wizard successfully creates the database. When I then run the same against the other installation it fails with the following error [Microsoft][SQL Server Native Client 10.0]Invalid cursor state
I've attempted to look at the configuration of each using
select *
from master.sys.configurations
From this I found several differences
Successful Multi instance
Optimize for Ad hoc Workloads – False
Max Degree of Parallelism - 0
UnSuccessful Multi instance
Optimize for Ad hoc Workloads – True
Max Degree of Parallelism - 4
I attempted to reconcile the differences one at a time, running the wizard after each change, but it always failed with the same error above. The error always seems to occur when a particular stored procedure is run. Quite a number of scripts run prior to this; they run under the covers and were only discovered via tracing, in this case using SQL Profiler. I don't have access to individual scripts that I can run incrementally to replicate the issue. I have to rely only on the Esri wizard.
Reviewing the error against several forums suggests that this is an ODBC error, but the trace I ran using SQL Profiler shows that the driver used is Native.
My questions then are: "What are the conditions that would cause the error above (Invalid cursor state)?" "Are there other configuration settings that are not captured via the SQL identified above?" "Could this be caused by mapped drives for data, logs and temp?"
Aug 9, 2015
I have a table containing records of criminal convictions. There are over 1M records and the only change is additions to the table on a monthly basis. The two columns I need to deal with are convicted.NAME and convicted.DOB
I have a second table that has 2 columns. One is the name of the defendant and the other is the birth date. This would be monitor.NAME and monitor.DOB
There are no primary keys or any other way to join the tables for this search I want to do.
I would like to be able to put a name in the "monitor" table and run a query to see if there is a match in the convicted table.
The problem I am having is middle initials or names. If I want to monitor.name = 'SMITH JOHN' it will return the results fine. The problem I am having is if the conviction is in the database as 'SMITH JOHN T', or 'SMITH JOHN THOMAS'.
How can I use the monitor table with a 'LASTNAME FIRSTNAME' and return results if the convicted table has a middle initial. I tried with a JOIN:
select distinct convicted.*
from convicted
join monitor
on monitor.name like convicted.defendant
and monitor.birthdate = convicted.dob
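In the attempt above, the `LIKE` has no wildcard and the shorter value is on the left, so it behaves like an equality test. A sketch of one fix: put the longer (convicted) name on the left and append a space plus `%` to the monitor name, so 'SMITH JOHN' matches 'SMITH JOHN T' but not 'SMITH JOHNSON'. The column names NAME and DOB follow the description in the post (the query uses defendant/birthdate, so adjust to whichever is real), and the table variables and rows are hypothetical.

```sql
-- Hypothetical stand-ins for the two tables.
DECLARE @convicted TABLE (NAME varchar(100), DOB date);
DECLARE @monitor   TABLE (NAME varchar(100), DOB date);

INSERT INTO @convicted (NAME, DOB)
VALUES ('SMITH JOHN',        '1970-01-01'),
       ('SMITH JOHN T',      '1970-01-01'),
       ('SMITH JOHN THOMAS', '1970-01-01'),
       ('SMITH JOHNSON',     '1970-01-01');

INSERT INTO @monitor (NAME, DOB) VALUES ('SMITH JOHN', '1970-01-01');

SELECT DISTINCT c.NAME, c.DOB
FROM   @convicted AS c
JOIN   @monitor   AS m
  ON   c.DOB = m.DOB
 AND  (c.NAME = m.NAME                 -- exact 'LASTNAME FIRSTNAME'
       OR c.NAME LIKE m.NAME + ' %');  -- same, followed by a middle initial or name
```

If a monitored name can itself contain `%`, `_` or `[`, those characters would act as wildcards and would need escaping.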
Nov 19, 2015
There are 3 tables Property , PropertyExternalReference , PropertyAssesmentValuation which are common for 60 business rule
SELECT
PE.PropertyExternalReferenceValue [BAReferenceNumber]
, PA.DescriptionCode [PSDCode]
, PV.ValuationEffectiveDate [EffectiveDate]
, PV.PropertyListAlterationDate [ListAlterationDate]
[code]....
Can we push the data for the above query in a physical table and create index to make the query fast rather than using the same set tables multiple times
Mar 5, 2012
I'm using a shipping program called Endicia Professional that allows for database manipulation to make my processing easier. I've managed to fix the database here and there, but I have had an issue combining orders from a single customer when they buy more than one item. Ideally I would like it to combine rows when a customer purchases items going to the same address. To avoid combining rows where only the address line is the same (i.e. two people living in the same apartment complex), I thought we could use qualifiers, as the purchase will have a name and order ID that should be unique enough.
Order-id name address sku
1234 John 46 easy ln. A27
1234 John 46 easy ln. B32
Results:
Order-id name address sku
1234 John 46 easy ln. A27,b32
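If the underlying database is SQL Server 2005 or later (an assumption; the post does not say what engine the shipping program uses), a common way to combine the SKUs is a correlated `FOR XML PATH('')` subquery wrapped in `STUFF` to drop the leading comma. Table and column names below are hypothetical.

```sql
-- Hypothetical stand-in for the orders table.
DECLARE @orders TABLE (OrderId int, Name varchar(50), Address varchar(100), Sku varchar(20));

INSERT INTO @orders (OrderId, Name, Address, Sku)
VALUES (1234, 'John', '46 easy ln.', 'A27'),
       (1234, 'John', '46 easy ln.', 'B32');

SELECT o.OrderId,
       o.Name,
       o.Address,
       -- Build ',A27,B32' for the group, then remove the first comma.
       STUFF((SELECT ',' + i.Sku
              FROM   @orders AS i
              WHERE  i.OrderId = o.OrderId
                AND  i.Name    = o.Name
                AND  i.Address = o.Address
              ORDER  BY i.Sku
              FOR XML PATH(''), TYPE).value('.', 'varchar(max)'), 1, 1, '') AS Skus
FROM   @orders AS o
GROUP  BY o.OrderId, o.Name, o.Address;
```

Grouping on order ID, name and address together is what keeps two different customers at the same address from being merged.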
Mar 6, 2015
how best to approach a problem involving two tables across two different servers.
Table 1: Contains IP addresses along with assessment findings. Let's say the fields are IPADDRESSSTR, FINDING
Table 2: Contains Subnet information stored in integer format. The fields are SITE_ID, LOW, and HIGH
What I'd like to do is load the IP range information into memory and then return the findings from table 1 where the IPADDRESSSTR is between the LOW and HIGH integer value.
1) Is there a way to load all of the ranges from table 2 into an array and then compare all the IP addresses (IPADDRESSSTR) from table 1?
2) How do I convert IPADDRESSSTR (a string) to an integer to perform the comparison.
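For question 2, one frequently used trick for dotted IPv4 strings is `PARSENAME`, which splits a four-part dotted name; each octet is then weighted by a power of 256. For question 1, no array is needed: a join with `BETWEEN` compares every address against every range. The sketch below assumes that LOW and HIGH store the address as an unsigned 32-bit value with the first octet most significant, that every IPADDRESSSTR is a well-formed IPv4 address, and uses hypothetical sample rows. Across two servers, one of the tables would have to be reached through a linked server or copied locally first.

```sql
-- Hypothetical stand-ins for the two tables.
DECLARE @findings TABLE (IPADDRESSSTR varchar(15), FINDING varchar(100));
DECLARE @subnets  TABLE (SITE_ID int, [LOW] bigint, [HIGH] bigint);

INSERT INTO @findings (IPADDRESSSTR, FINDING)
VALUES ('10.0.0.5',     'Open port'),
       ('192.168.1.20', 'Weak cipher');

INSERT INTO @subnets (SITE_ID, [LOW], [HIGH])
VALUES (1, 167772160, 167772415);   -- 10.0.0.0 through 10.0.0.255

WITH f AS (
    SELECT IPADDRESSSTR,
           FINDING,
           -- PARSENAME counts parts from the right, so part 4 is the first octet.
           CAST(PARSENAME(IPADDRESSSTR, 4) AS bigint) * 16777216
         + CAST(PARSENAME(IPADDRESSSTR, 3) AS bigint) * 65536
         + CAST(PARSENAME(IPADDRESSSTR, 2) AS bigint) * 256
         + CAST(PARSENAME(IPADDRESSSTR, 1) AS bigint) AS IpInt
    FROM   @findings
)
SELECT s.SITE_ID, f.IPADDRESSSTR, f.FINDING
FROM   f
JOIN   @subnets AS s
  ON   f.IpInt BETWEEN s.[LOW] AND s.[HIGH];
```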
Aug 18, 2015
I have 2 columns (ID, Msg_text) in a table where i need to combine every 3 rows into single row. What would be the best option i have? I know by using 'STUFF' and 'XML PATH' i can convert all the rows into a single row but here i'm looking for every 3 rows into a single row.
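The same `STUFF` / `FOR XML PATH` technique can be applied per group of three if each row is first assigned a group number, for example `(ROW_NUMBER() - 1) / 3` using integer division. The sketch below assumes rows are ordered by ID and joined with a space; the table variable and data are hypothetical.

```sql
-- Hypothetical stand-in for the two-column table.
DECLARE @msgs TABLE (ID int, Msg_text varchar(100));

INSERT INTO @msgs (ID, Msg_text)
VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e'), (6, 'f'), (7, 'g');

WITH numbered AS (
    SELECT ID,
           Msg_text,
           -- Rows 1-3 get group 0, rows 4-6 get group 1, and so on.
           (ROW_NUMBER() OVER (ORDER BY ID) - 1) / 3 AS grp
    FROM   @msgs
)
SELECT g.grp,
       STUFF((SELECT ' ' + n.Msg_text
              FROM   numbered AS n
              WHERE  n.grp = g.grp
              ORDER  BY n.ID
              FOR XML PATH(''), TYPE).value('.', 'varchar(max)'), 1, 1, '') AS Combined
FROM  (SELECT DISTINCT grp FROM numbered) AS g
ORDER  BY g.grp;
```

A final group with fewer than three rows simply produces a shorter combined string.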
Sep 7, 2015
declare @table table (
ParentID INT,
ChildID INT,
Value float
)
INSERT INTO @table
SELECT 1,1,1.2
[code]....
In this case, the (ParentID, ChildID) records 1,1 and 2,2 and 3,3 are the parents, whereas a (null, 1) record is a child whose parent is 1; similarly, (null, 2) records are children whose parent is 2, and so on.
Now my requirement is to display the parent records sorted by value ascending, with each parent followed by its corresponding child records.
--Final output should be
ParentID ChildID VALUE
3 3 1.12
null 3 56.7
null 3 43.6
1 1 1.2
null 1 4.8
null 1 4.6
2 2 1.8
null 1 1.4
Jun 3, 2008
Hi,
We are trying to consolidate sales order data from different sales locations where in which we have to merge multiple [more than 25] MSDE 2000 databases into a Single DB. What is the best way to do this?
At the end of the day i should have one DB which contains sales order data of all the sales locations.
Thanks,
SakthiVenkatesh.
Sep 3, 2014
I'm trying to update a checkbox from "False" to "True" within a single table for multiple records. I can update a single record using the script below. However, I'm having trouble applying additional Id's to the string.
(Works) - Update Name_Demo set KEY_CONTACT = 'true' where ID = 225249
(doesn't work) - Update Name_Demo set KEY_CONTACT = 'true' where ID = '225249, 210014, 216543'
It says query executes successfully but returned no rows.
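The second statement compares ID against one single string, '225249, 210014, 216543', which no row matches. Each value needs to be its own item in an `IN` list. A small sketch with a hypothetical table variable; ID is assumed to be a character column here (the original statement ran without a conversion error, which suggests that), so drop the quotes if it is numeric.

```sql
-- Hypothetical stand-in for Name_Demo.
DECLARE @Name_Demo TABLE (ID varchar(10), KEY_CONTACT bit);

INSERT INTO @Name_Demo (ID, KEY_CONTACT)
VALUES ('225249', 0), ('210014', 0), ('216543', 0), ('999999', 0);

-- One list item per ID, each quoted separately.
UPDATE @Name_Demo
SET    KEY_CONTACT = 'true'
WHERE  ID IN ('225249', '210014', '216543');

SELECT ID, KEY_CONTACT FROM @Name_Demo ORDER BY ID;
```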
Mar 20, 2014
I need help writing the query for the following: I need to collapse continuous date ranges. If the termdate of a record is one day less than the effdate of the next record (for the same ID), I need to collapse the records. See the example below. How should I write the query that gives me the desired output, i.e. min(effdate) and max(termdate) wherever termdate is one day less than the effdate of the next record?
ID effdate termdate
556868 1999-01-01 1999-06-30
556868 1999-07-01 1999-10-31
556869 2002-10-01 2004-01-31
556872 1999-02-01 2000-08-31
556872 2000-11-01 2004-01-31
556872 2004-02-01 2004-02-29
output should be ......
ID effdate termdate
556868 1999-01-01 1999-10-31
556869 2002-10-01 2004-01-31
556872 1999-02-01 2000-08-31
556872 2000-11-01 2004-02-29
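This is a "gaps and islands" problem. One sketch that avoids `LAG`/`LEAD` (so it also runs on SQL Server 2008): find the rows that start a run (no row for the same ID ends the day before), find the rows that end a run (no row starts the day after), then pair each start with the nearest end on or after it. It assumes ranges for the same ID never overlap; the table variable holds the sample rows from the post.

```sql
DECLARE @coverage TABLE (ID int, effdate date, termdate date);

INSERT INTO @coverage (ID, effdate, termdate)
VALUES (556868, '1999-01-01', '1999-06-30'),
       (556868, '1999-07-01', '1999-10-31'),
       (556869, '2002-10-01', '2004-01-31'),
       (556872, '1999-02-01', '2000-08-31'),
       (556872, '2000-11-01', '2004-01-31'),
       (556872, '2004-02-01', '2004-02-29');

WITH starts AS (
    -- Rows with no predecessor ending exactly one day earlier
    SELECT s.ID, s.effdate
    FROM   @coverage AS s
    WHERE  NOT EXISTS (SELECT 1
                       FROM   @coverage AS p
                       WHERE  p.ID = s.ID
                         AND  DATEADD(day, 1, p.termdate) = s.effdate)
),
ends AS (
    -- Rows with no successor starting exactly one day later
    SELECT e.ID, e.termdate
    FROM   @coverage AS e
    WHERE  NOT EXISTS (SELECT 1
                       FROM   @coverage AS n
                       WHERE  n.ID = e.ID
                         AND  n.effdate = DATEADD(day, 1, e.termdate))
)
SELECT s.ID,
       s.effdate,
       MIN(e.termdate) AS termdate   -- nearest run end on or after this start
FROM   starts AS s
JOIN   ends   AS e
  ON   e.ID = s.ID
 AND   e.termdate >= s.effdate
GROUP  BY s.ID, s.effdate
ORDER  BY s.ID, s.effdate;
```

For the sample rows this yields the four collapsed ranges listed in the desired output.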
Feb 15, 2012
I have a table JOBCODE which contains a list of codes.
I want to insert these values into table VIEWS as a list separated by spaces.
E.G.
Table Jobcodes looks like this
code
1
2
3
4
5
6
And I want table Views to look like this:
field1
1 2 3 4 5 6
How do I go about this?
Aug 7, 2015
I'm trying to pull all records from one table and just a single record from another. I have this join, (see below). It works ok, but the problem is if a blog record doesn't have a corresponding image record it doesn't return. The end result should be the blog record and a single corresponding image record. But always a blog record.
SELECT
[Blogs].[ID],
[Blogs].[BlogTitle],
[Blogs].[BlogType],
[Blogs].[BlogText],
[code]...
Aug 6, 2007
Hi Gurus,
I have a table having sales records and there are more than one record per one customer. The sales table has a reference number like below.
CustomerID
Sales_Ref
2
H_1123
2
H_2344
2
H_4322
I need to do a query and generate the following query.
CustomerID Ref
2 H_1123,H_2344,H_4322
Could someone help me on this.
Thanks.
Cheers,
Vijay
Oct 15, 2005
Hello there. I am Completely new to SQL and this forum, and this problem that I have may appear to be very basic to you guys but still... I was wondering if I could get some help with a database I am trying to make in MS Access.
I have used the Access TransferText function to import data from a text file into a table with an ID attached to each line, eg.
ID Text
1 Hello world
2 This is an example
3 Of my database
I want to merge the data, or copy it into a field in a new table to get:
ID Text
1 Hello World
This is an example
Of my database
2 [more imported text from a different table]
and i have been advised that SQL is the best way to do this. Is it possible to have line breaks in a field within microsoft access, or would it have to be structured as
ID Text
1 Hello World This is an Example Of My Database
2 ...
And how would i make the SQL to do this?
Thanks,
Thom
Aug 16, 2015
I want to recursively select all records within a hierarchy, using the main parentid and a textvalue on level 1 OR level 2 of the subcategories.
My data:
CREATE TABLE [dbo].[articlegroups](
[id] [int] NOT NULL,
[parentid] [int] NOT NULL,
[catlevel] [tinyint] NOT NULL,
[slug_en] [nvarchar](50) NOT NULL,
CONSTRAINT [PK_globos_articlegroups] PRIMARY KEY CLUSTERED
[Code] ...
When selecting rows I always have the main parentId (so catlevel 0) and the slug_en value.
In my example case I have id 129 and slug_en='cradles'.
I want my query to then return:
id parentid catlevel
129 0 0
130 129 1
136 130 2
If I have id 129 and slug_en='pillows'.
I want my query to then return:
id parentid catlevel
129 0 0
139 129 1
How can I do this? I'm new to SQL Server. I was reading here [URL] .... on recursive SQL, but how to implement this as I just have one table and I also have 2 selection criteria (main category id and a text value on either level 1 or 2).
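A single table is enough for a recursive CTE, because the recursion joins the table to the CTE itself. One reading of the requirement, sketched below, is: find the row under the given root whose slug matches at level 1 or 2, then return that row together with its ancestors and its descendants. That interpretation, and the sample slugs 'baby' and 'wooden-cradles', are assumptions; the ids and parent ids follow the expected results in the post.

```sql
DECLARE @articlegroups TABLE (id int NOT NULL PRIMARY KEY, parentid int NOT NULL,
                              catlevel tinyint NOT NULL, slug_en nvarchar(50) NOT NULL);

INSERT INTO @articlegroups (id, parentid, catlevel, slug_en)
VALUES (129, 0,   0, N'baby'),            -- hypothetical slug
       (130, 129, 1, N'cradles'),
       (136, 130, 2, N'wooden-cradles'),  -- hypothetical slug
       (139, 129, 1, N'pillows');

DECLARE @rootId int, @slug nvarchar(50);
SET @rootId = 129;
SET @slug   = N'cradles';

WITH subtree AS (            -- the root and everything below it
    SELECT id, parentid, catlevel, slug_en
    FROM   @articlegroups
    WHERE  id = @rootId
    UNION ALL
    SELECT c.id, c.parentid, c.catlevel, c.slug_en
    FROM   @articlegroups AS c
    JOIN   subtree AS p ON c.parentid = p.id
),
matched AS (                 -- the row(s) matching the slug on level 1 or 2
    SELECT id, parentid, catlevel
    FROM   subtree
    WHERE  slug_en = @slug AND catlevel IN (1, 2)
),
up AS (                      -- matched rows plus their ancestors
    SELECT id, parentid, catlevel FROM matched
    UNION ALL
    SELECT a.id, a.parentid, a.catlevel
    FROM   @articlegroups AS a
    JOIN   up AS u ON a.id = u.parentid
),
down AS (                    -- matched rows plus their descendants
    SELECT id, parentid, catlevel FROM matched
    UNION ALL
    SELECT c.id, c.parentid, c.catlevel
    FROM   @articlegroups AS c
    JOIN   down AS d ON c.parentid = d.id
)
SELECT id, parentid, catlevel FROM up
UNION
SELECT id, parentid, catlevel FROM down
ORDER  BY catlevel, id;
```

With `@slug = N'cradles'` the sample data gives rows 129, 130 and 136; with `N'pillows'` it gives 129 and 139.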
Sep 8, 2013
I have a table like
Number Desc
1 Bank
2 Shop
3 Store
2 Home
1 Mall
2 House
I want to have a result as
Number Desc All
1 Bank Mall
2 Shop Home House
3 Store
using a proper select syntax
May 10, 2001
Hi,
I need to select from a table transact where one of the column values has to be equal to (1 and 2 and 3).
e.g: column in (1,2,3) would give me what "OR" would do,