Eliminating Redundant Data
Jul 20, 2005
Edit: this came out longer than I thought. Any comments about anything
here are greatly appreciated. Thank you for reading.
My system stores millions of records, each with fields like firstname,
lastname, email address, city, state, zip, along with any number of user
defined fields. The application allows users to define message templates
with variables. They can then select a template, and for each variable
in the template, type in a value or select a field.
The system allows you to query for messages you've sent by specifying
criteria for the variables (not the fields).
This requirement has made it difficult to normalize my data model at all
for speed. What I have is this:
[fieldindex]
id int PK
name nvarchar
type datatype
[recordindex]
id int PK
....
[recordvalues]
recordid int PK
fieldid int PK
value nvarchar
Whenever messages are sent, I store which fields were mapped to which
variables for that deployment. So the query with a variable criterion
looks like this:
select coalesce(vm.value, rv.value)
from sentmessages sm
inner join variablemapping vm on vm.deploymentid=sm.deploymentid
left outer join recordvalues rv on
rv.recordid=sm.recordid and rv.fieldid=vm.fieldid
where coalesce(vm.value, rv.value) ....
This model works pretty well for searching messages with variable
criteria and looking up variable values for a particular message. The
big problem I have is that the recordvalues table is HUGE: 1 million
records with 50 fields each = 50 million recordvalues rows. The value
column, two int columns, plus the two indexes I have on the table make it into a
beast. Importing data takes forever. Querying the records (with a field
criterion) also takes longer than it should.
Makes sense; the performance was largely IO bound.
I decided to try and cut into that IO. Looking at a recordvalues table
with over 100 million rows in it, there were only about 3 million unique
values. So I split the recordvalues table into two tables:
[recordvalues]
recordid int PK
fieldid int PK
valueid int
[valueindex]
id int PK
value nvarchar (unique)
Now valueindex holds 3 million unique values and recordvalues
references them by id. To my surprise, this shaved only 500 MB off a 4 GB
database!
Importing didn't get any faster either. Although it's no longer IO bound,
it appears the CPU, as the new bottleneck, outweighs the old IO bottleneck.
This is probably because I haven't optimized the queries for the new
tables (I was hoping it wouldn't be so hard w/o the IO problem).
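One possible way to cut the CPU cost of the import, sketched under assumptions the post does not state: load the raw rows into a staging table, then resolve values to ids with two set-based statements instead of row by row. The `#staging` table, its column length, and `valueindex.id` being an IDENTITY column are all assumptions made for illustration.

```sql
-- Hypothetical staging table for the raw import (name and sizes are assumptions).
CREATE TABLE #staging (
    recordid int NOT NULL,
    fieldid  int NOT NULL,
    value    nvarchar(400) COLLATE DATABASE_DEFAULT NOT NULL
);

-- ... bulk load the CSV rows into #staging here ...

-- 1. Add only the values that valueindex does not already hold.
--    Assumes valueindex.id is an IDENTITY column.
INSERT INTO valueindex (value)
SELECT DISTINCT s.value
FROM #staging AS s
WHERE NOT EXISTS (SELECT 1 FROM valueindex AS v WHERE v.value = s.value);

-- 2. Resolve every staged value to its id in a single join.
INSERT INTO recordvalues (recordid, fieldid, valueid)
SELECT s.recordid, s.fieldid, v.id
FROM #staging AS s
INNER JOIN valueindex AS v ON v.value = s.value;

DROP TABLE #staging;
```

Whether this actually beats the current import depends on the indexes and on how the load is batched, so it is worth timing against a copy of the data.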
Is there a better way to accomplish what I'm trying to do (eliminate
the redundant data)? Does SQL have built-in constructs to do stuff like
this? It seems like maybe I'm trying to duplicate functionality at a
high level that may already exist at a lower level.
IO is becoming a serious bottleneck.
The million-record, 50-field CSV file is only 500 MB. I would've thought
that after eliminating all the redundant first name, city, last name,
etc. it would be less data and not 8x more!
-
Gordon
Apr 27, 2008
Hello everybody,
I have the following problem:
I need info from 2 tables. From table 2 I just need 1 column. When I ask for this column, the output I get is data repeating itself many times.
DISTINCT should give me unique data, but it doesn't....
The code:
SELECT DISTINCT FSenddate, FSupplyIDName, FSupplyerNumber,FBillNo,FSourceBillNo,FItemName,FItemModel,
FAuxQty,FAuxTaxPrice,FHeadSelfP0237
FROM vwICBill_26
WHERE FSenddate BETWEEN DATEADD(dd,-14,GETDATE()) AND GETDATE()
This code works with table 1 alone (vwICBill_26),
but not once table 2 (vwICBill_1) is added:
SELECT DISTINCT vwICBill_26.FSenddate,vwICBill_26.FSupplyIDName,
vwICBill_26.FSupplyerNumber,vwICBill_26.FBillNo,
vwICBill_26.FSourceBillNo,vwICBill_26.FItemName,
vwICBill_26.FItemModel,vwICBill_26.FAuxQty,
vwICBill_26.FAuxTaxPrice,vwICBill_26.FHeadSelfP0237,
vwICBill_1.FDate,vwICBill_1.FContractBillNo
FROM vwICBill_26,vwICBill_1
WHERE vwICBill_26.FSenddate BETWEEN DATEADD(dd,-14,GETDATE()) AND GETDATE()
AND vwICBill_1.FContractBillNo=vwICBill_26.FSourceBillNo
The last line is the problem.
I want it to show me the data that is not equal.
As soon as I implement the not equal, it shows me the massive repeating data.
I mean, even without the last line I get this output.
Altogether, I want a clean database output without repeating data.
Any ideas how it may work without DISTINCT?
I think this is a typical amateur problem, but I would appreciate help!
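If "not equal" means bills in vwICBill_26 that have no matching contract in vwICBill_1 (an assumption about the intent), a `<>` join condition pairs every bill with every non-matching contract, which would explain the repeated rows. A sketch using `NOT EXISTS` instead:

```sql
-- Bills from the last 14 days with no contract whose number matches FSourceBillNo.
SELECT b.FSenddate, b.FSupplyIDName, b.FSupplyerNumber, b.FBillNo,
       b.FSourceBillNo, b.FItemName, b.FItemModel,
       b.FAuxQty, b.FAuxTaxPrice, b.FHeadSelfP0237
FROM vwICBill_26 AS b
WHERE b.FSenddate BETWEEN DATEADD(dd, -14, GETDATE()) AND GETDATE()
  AND NOT EXISTS (SELECT 1
                  FROM vwICBill_1 AS c
                  WHERE c.FContractBillNo = b.FSourceBillNo);
```

Because these bills have no matching row in vwICBill_1, the FDate and FContractBillNo columns cannot be returned for them.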
May 13, 2008
I am querying several tables and piping the output to an Excel spreadsheet.
Several (not all) columns contain repeating data that I'd prefer not to include on the output. I only want the first row in the set to have that data. Is there a way in the query to do this under SQL 2005?
As an example, my query results are as follows (sorry if it does not show correctly):
OWNERBARN ROUTE DESCVEHDIST CASE
BARBAR TRACKING #70328VEH 32832869.941393
BARBAR TRACKING #70328VEH 32832869.941393
BARBAR TRACKING #70328VEH 32832869.941393
DAXDAX TRACKING #9398VEH 39839834.942471
DAXDAX TRACKING #9398VEH 39839834.942471
DAXDAX TRACKING #9398VEH 39839834.942471
TAXTAX TRACKING #2407 40754.391002
TAXTAX TRACKING #2407 40754.391002
TAXTAX TRACKING #2407 40754.391002
I only want the output to be:
OWNERBARN ROUTE DESCVEHDIST CASE
BARBAR TRACKING #70328VEH 32832869.941393
DAXDAX TRACKING #9398VEH 39839834.942471
TAXTAX TRACKING #2407 40754.391002
Thanks,
Walt
Feb 2, 2005
Hi, good day everyone.
I have a question:
I want to configure a redundant SQL Server. Let's say server A is down; then server B can take over the workload of server A, and this is transparent to users, which means they won't notice that server A is down.
Besides the failover clustering method, is there any other solution?
It needs to run on Microsoft SQL 2000 Standard Edition and Microsoft Windows 2000 Standard Edition.
Thanks,
Aug 26, 2004
The following script lists redundant indexes in a given database.
I run it in Query Analyzer, one statement at a time.
PLEASE: review the output before dropping any index.
USE ....
GO
-- step 1
-- view listing table, index, key position and column
create view listaidxcols as
select SO.name  as tabname,
       SI.name  as idxname,
       IK.keyno as keyno,
       SC.name  as colname
from sysindexkeys IK,
     syscolumns   SC,
     sysindexes   SI,
     sysobjects   SO
where -- Link syscolumns
      IK.id=SC.id
  and IK.colid=SC.colid
      -- Link sysindexes
  and IK.id=SI.id
  and IK.indid=SI.indid
      -- Link sysObjects (tables)
  and IK.id=SO.id
  and SO.xtype='U'
      -- no internal indexes
  and SI.name not like '_WA_Sys_%'
  and SI.name not like 'hind_%'
GO
-- step 2: view to get # of columns per index
-- (CREATE VIEW must be the first statement in its batch, hence the GO lines)
create view cantcolsidx
as select tabname,
          idxname,
          count(*) as numllaves
from listaidxcols
group by tabname,idxname
GO
-- step 3
-- the redundant index list: index A is reported when all of its key
-- columns match the leading key columns of a wider index B
select A.tabname as tabla, A.idxname as Aidx, B.idxname as Bidx
from cantcolsidx A, cantcolsidx B
where A.tabname = B.tabname
  and A.numllaves < B.numllaves
  and A.idxname <> B.idxname
  and A.numllaves in (
        select count(*)
        from listaidxcols C, listaidxcols D
        where C.tabname=A.tabname
          and C.idxname=A.idxname
          and D.tabname=B.tabname
          and D.idxname=B.idxname
          and C.idxname<>D.idxname
          and C.colname=D.colname
          and C.keyno =D.keyno
      )
GO
-- clean up
drop view listaidxcols;
drop view cantcolsidx;
Mar 15, 2007
Is there any tutorial on how to set up SQL Server in a dual redundant environment?
thanks
Dec 4, 2007
Hi All,
I recently upgraded my SQL Server 2000 to 2005. I have around 150 stored procedures which are used to produce reports.
They all worked perfectly on 2000, and I was wondering if there are any redundant functions or changes in syntax in 2005 that I should look out for.
Can anyone assist?
Many thanks in advance,
Kurt
Aug 18, 2015
I'm looking for clarification around how SQL 2014 would get licensed if a server only has 1 of 2 CPU sockets in use (second socket being empty). I know the new license model is core based, not socket based. So does this mean that if I buy a "4 core pack" to cover my first CPU (quad core CPU), I am compliant with the license model? Or does Microsoft want me to license an empty socket with a core pack too? It's hard to find a rack mount server that only has 1 CPU socket, and the ones I do find don't have enough RAM slots or redundant power supplies.
Feb 12, 2008
All,
I have a situation where I need to identify redundant rows within a table. Here is the schema of the table:
create table Temp.Response (
TempKey int identity(1,1) not null primary key clustered,
ResponseId char(27) not null,
StudentUin char(9) not null,
TemplateId char(27) not null,
MidEndFlag char(3) not null
)
Here is a sample dataset that represents the production data:
TempKey | ResponseId | StudentUin | TemplateId | MidEndFlag
1 2008-02-12-08-10-43-3434648 317003316 2008-01-31-10-12-27-4882454 Mid
2 2008-02-12-08-11-40-5279829 317003316 2008-01-31-10-12-27-4882454 Mid
3 2008-02-11-21-29-12-1254611 516007344 2008-01-31-10-32-26-2359751 Mid
4 2008-02-11-21-30-34-7326988 516007344 2008-01-31-10-32-26-2359751 Mid
5 2008-02-11-21-31-24-2804312 516007344 2008-01-31-10-32-26-2359751 Mid
6 2008-02-11-21-31-47-1742947 516007344 2008-01-31-10-32-26-2359751 Mid
7 2008-02-11-18-52-25-3689636 614001463 2008-01-31-10-32-26-2359751 Mid
8 2008-02-11-18-54-11-7500029 614001463 2008-01-31-10-32-26-2359751 Mid
9 2008-02-11-22-13-59-9139208 614001606 2008-01-31-10-32-26-2359751 Mid
10 2008-02-11-22-14-50-5822454 614001606 2008-01-31-10-32-26-2359751 Mid
11 2008-02-11-22-15-47-6257351 614001606 2008-01-31-10-32-26-2359751 Mid
12 2008-02-11-23-23-31-4431851 614001756 2008-01-31-10-32-26-2359751 Mid
13 2008-02-11-23-24-06-4806990 614001756 2008-01-31-10-32-26-2359751 Mid
I need to identify the ResponseId values for rows that contain redundant StudentUin/TemplateId/MidEndFlag values, so that I can delete those rows. ResponseId, while not the primary key, is a unique value in this dataset. I thought I might use a cursor to parse this, but the real dataset is exceedingly large, and would like a set-based solution.
Best,
B.
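A set-based sketch for the question above, assuming SQL Server 2005 or later (for `ROW_NUMBER`) and assuming the row to keep in each group is the one with the lowest ResponseId:

```sql
-- Number the rows inside each StudentUin/TemplateId/MidEndFlag group;
-- every row numbered 2 or higher is a redundant one.
WITH Ranked AS (
    SELECT ResponseId,
           ROW_NUMBER() OVER (PARTITION BY StudentUin, TemplateId, MidEndFlag
                              ORDER BY ResponseId) AS rn
    FROM Temp.Response
)
SELECT ResponseId
FROM Ranked
WHERE rn > 1;
```

On the thirteen sample rows this would flag the ResponseId values of TempKey 2, 4, 5, 6, 8, 10, 11 and 13. The same CTE can feed a delete by replacing the final SELECT with `DELETE FROM Ranked WHERE rn > 1`, ideally tried first inside a transaction.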
Feb 11, 2005
I have a pretty simple question but the answer seems to be evading me:
Here's the DDL for the tables in question:
CREATE TABLE [dbo].[Office] (
[OfficeID] [int] IDENTITY (1, 1) NOT NULL ,
[ParentOfficeID] [int] NOT NULL ,
[WebSiteID] [int] NOT NULL ,
[IsDisplayOnWeb] [bit] NOT NULL ,
[IsDisplayOnAdmin] [bit] NOT NULL ,
[OfficeStatus] [char] (1) NOT NULL ,
[DisplayORD] [smallint] NOT NULL ,
[OfficeTYPE] [varchar] (10) NOT NULL ,
[OfficeNM] [varchar] (50) NOT NULL ,
[OfficeDisplayNM] [varchar] (50) NOT NULL ,
[OfficeADDR1] [varchar] (50) NOT NULL ,
[OfficeADDR2] [varchar] (50) NOT NULL ,
[OfficeCityNM] [varchar] (50) NOT NULL ,
[OfficeStateCD] [char] (2) NOT NULL ,
[OfficePostalCD] [varchar] (15) NOT NULL ,
[OfficeIMG] [varchar] (100) NOT NULL ,
[OfficeIMGPath] [varchar] (100) NOT NULL ,
[RegionID] [int] NOT NULL ,
[OfficeTourURL] [varchar] (255) NULL ,
[GeoAreaID] [int] NOT NULL ,
[CreateDT] [datetime] NOT NULL ,
[UpdateDT] [datetime] NOT NULL ,
[CreateByID] [varchar] (50) NOT NULL ,
[UpdateByID] [varchar] (50) NOT NULL ,
[OfficeBrandedURL] [varchar] (255) NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[OfficeManagement] (
[OfficeID] [int] NOT NULL ,
[PersonnelID] [int] NOT NULL ,
[JobTitleID] [int] NOT NULL ,
[CreateDT] [datetime] NOT NULL ,
[CreateByID] [varchar] (50) NOT NULL ,
[SeqNBR] [int] NOT NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[OfficeMls] (
[OfficeID] [int] NOT NULL ,
[SourceID] [int] NOT NULL ,
[OfficeMlsNBR] [varchar] (20) NOT NULL ,
[CreateDT] [datetime] NOT NULL ,
[UpdateDT] [datetime] NOT NULL ,
[CreateByID] [varchar] (50) NOT NULL ,
[UpdateByID] [varchar] (50) NOT NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[Personnel] (
[PersonnelID] [int] IDENTITY (1, 1) NOT NULL ,
[PersonnelDisplayName] [varchar] (100) NOT NULL ,
[FirstNM] [varchar] (50) NOT NULL ,
[PreferredFirstNM] [varchar] (50) NOT NULL ,
[MiddleNM] [varchar] (50) NOT NULL ,
[LastNM] [varchar] (50) NOT NULL ,
[PersonalTaxID] [varchar] (9) NOT NULL ,
[HireDT] [datetime] NOT NULL ,
[TermDT] [datetime] NOT NULL ,
[HomePhoneNBR] [varchar] (15) NULL ,
[HomeADDR1] [varchar] (50) NOT NULL ,
[HomeADDR2] [varchar] (50) NOT NULL ,
[HomeCityNM] [varchar] (50) NOT NULL ,
[HomeStateCD] [char] (2) NOT NULL ,
[HomePostalCD] [varchar] (15) NOT NULL ,
[PersonnelLangCSV] [varchar] (500) NOT NULL ,
[PersonnelSlogan] [varchar] (500) NOT NULL ,
[BGColor] [varchar] (50) NOT NULL ,
[IsEAgent] [bit] NOT NULL ,
[IsArchAgent] [bit] NOT NULL ,
[IsOptOut] [bit] NOT NULL ,
[IsDispOnlyPrefFirstNM] [bit] NOT NULL ,
[IsHideMyListingLink] [bit] NOT NULL ,
[IsPreviewsSpecialist] [bit] NOT NULL ,
[AudioFileNM] [varchar] (100) NULL ,
[iProviderID] [int] NOT NULL ,
[DRENumber] [varchar] (10) NOT NULL ,
[AgentBrandedURL] [varchar] (255) NOT NULL ,
[CreateDT] [datetime] NOT NULL ,
[UpdateDT] [datetime] NOT NULL ,
[CreateByID] [varchar] (50) NOT NULL ,
[UpdateByID] [varchar] (50) NOT NULL ,
[IsDisplayAwards] [bit] NOT NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[PersonnelMLS] (
[PersonnelID] [int] NOT NULL ,
[SourceID] [int] NOT NULL ,
[AgentMlsNBR] [varchar] (20) NOT NULL ,
[CreateDT] [datetime] NOT NULL ,
[UpdateDT] [datetime] NOT NULL ,
[CreateByID] [varchar] (50) NOT NULL ,
[UpdateByID] [varchar] (50) NOT NULL
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Office] ADD
CONSTRAINT [FK_Office_OfficeProfile] FOREIGN KEY
(
[OfficeID]
) REFERENCES [dbo].[OfficeProfile] (
[OfficeID]
) NOT FOR REPLICATION
GO
alter table [dbo].[Office] nocheck constraint [FK_Office_OfficeProfile]
GO
ALTER TABLE [dbo].[OfficeManagement] ADD
CONSTRAINT [FK_OfficeManagement_LookupJobTitle] FOREIGN KEY
(
[JobTitleID]
) REFERENCES [dbo].[LookupJobTitle] (
[JobTitleID]
),
CONSTRAINT [FK_OfficeManagement_Office] FOREIGN KEY
(
[OfficeID]
) REFERENCES [dbo].[Office] (
[OfficeID]
) NOT FOR REPLICATION ,
CONSTRAINT [FK_OfficeManagement_Personnel] FOREIGN KEY
(
[PersonnelID]
) REFERENCES [dbo].[Personnel] (
[PersonnelID]
) ON DELETE CASCADE
GO
alter table [dbo].[OfficeManagement] nocheck constraint [FK_OfficeManagement_Office]
GO
ALTER TABLE [dbo].[OfficeMls] ADD
CONSTRAINT [FK_OfficeMls_Office] FOREIGN KEY
(
[OfficeID]
) REFERENCES [dbo].[Office] (
[OfficeID]
) NOT FOR REPLICATION
GO
alter table [dbo].[OfficeMls] nocheck constraint [FK_OfficeMls_Office]
GO
ALTER TABLE [dbo].[PersonnelMLS] ADD
CONSTRAINT [FK_PersonnelMLS_Personnel] FOREIGN KEY
(
[PersonnelID]
) REFERENCES [dbo].[Personnel] (
[PersonnelID]
) NOT FOR REPLICATION
GO
alter table [dbo].[PersonnelMLS] nocheck constraint [FK_PersonnelMLS_Personnel]
GO
Here's the query I'm having trouble with:
SELECT distinct Personnel.PersonnelID,
Personnel.FirstNM,
Personnel.LastNM,
Office.OfficeNM,
Office.OfficeID,
OfficeMls.SourceID AS OfficeBoard,
PersonnelMLS.SourceID AS AgentBoard
FROM Personnel INNER JOIN
OfficeManagement ON
Personnel.PersonnelID = OfficeManagement.PersonnelID
INNER JOIN
Office ON OfficeManagement.OfficeID = Office.OfficeID
INNER JOIN
OfficeMls ON Office.OfficeID = OfficeMls.OfficeID
INNER JOIN
PersonnelMLS ON Personnel.PersonnelID = PersonnelMLS.PersonnelID
where officemls.sourceid <> personnelmls.sourceid
and office.officenm not like ('%admin%')
group by PersonnelMLS.SourceID,
Personnel.PersonnelID,
Personnel.FirstNM,
Personnel.LastNM,
Office.OfficeNM,
Office.OfficeID,
OfficeMls.SourceID
order by office.officenm
What I'm trying to retrieve are those agents who have source id's that are not in the Office's domain of valid source id's. Here's a small portion of the results:
PersonnelID FirstNM LastNM OfficeNM OfficeID OfficeBoard AgentBoard
----------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- ----------- ----------- -----------
18205 Margaret Peggy Quattro Aventura North 650 906 908
18205 Margaret Peggy Quattro Aventura North 650 918 908
15503 Susan Jordan Blackburn Point 889 920 909
15503 Susan Jordan Blackburn Point 889 921 909
15503 Susan Jordan Blackburn Point 889 921 920
15279 Sandra Humphrey Boca Beach North 890 917 906
15279 Sandra Humphrey Boca Beach North 890 906 917
15279 Sandra Humphrey Boca Beaches 626 917 906
15279 Sandra Humphrey Boca Beaches 626 906 917
13532 Michael Demcho Boca Downtown 735 906 917
14133 Maria Ford Boca Downtown 735 906 917
19126 Michael Silverman Boca Glades Road 736 917 906
18920 Beth Schwartz Boca Glades Road 736 906 917
If you take a look at Sandra Humphrey, you'll see she's out of office 626. Office 626 is associated with source id's 907 and 916. Sandra Humphrey is also associated with those two source id's, but she shows up in the results.
I know this was AWFULLY long winded, but I just wanted to make sure I made myself as clear as possible.
Any help would be greatly appreciated.
Thanks in advance!
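One reading of the problem: the `<>` comparison is true whenever an agent's source id differs from any single one of the office's source ids, so an agent whose ids all belong to the office still appears. A sketch that instead asks "is this agent source id absent from the office's set?" with `NOT EXISTS` (the DISTINCT is an assumption, there to guard against several OfficeManagement rows per agent and office):

```sql
SELECT DISTINCT
       p.PersonnelID, p.FirstNM, p.LastNM,
       o.OfficeNM, o.OfficeID,
       pm.SourceID AS AgentBoard
FROM Personnel AS p
INNER JOIN OfficeManagement AS om ON om.PersonnelID = p.PersonnelID
INNER JOIN Office           AS o  ON o.OfficeID     = om.OfficeID
INNER JOIN PersonnelMLS     AS pm ON pm.PersonnelID = p.PersonnelID
WHERE o.OfficeNM NOT LIKE '%admin%'
  -- keep only agent source ids that the office does not have
  AND NOT EXISTS (SELECT 1
                  FROM OfficeMls AS oml
                  WHERE oml.OfficeID = o.OfficeID
                    AND oml.SourceID = pm.SourceID)
ORDER BY o.OfficeNM;
```

The OfficeBoard column is dropped here because a row is returned precisely when no matching office source id exists.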
Aug 9, 2000
Hi,
I have a table with four columns: id, lastname,
firstname, acctname. I have duplicate values in the three columns other
than the id column, like:
ID FirstName Lastname Acctname
1 john hopkins jh
2 john hopkins Jh
3 david webb dw
4 david webb dw
5 david webb dw
6 Dan Kennedy DK
I want to eliminate the duplicate rows; the id that is kept can be any one of them.
Can anyone suggest a query by which I can do this?
Thanks in advance
Mohan
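A minimal sketch for this kind of cleanup, keeping the lowest id in each group. The table name `accounts` is hypothetical (the post does not give one), and the comparison treats "jh" and "Jh" as equal only under a case-insensitive collation:

```sql
-- Delete every row whose id is not the smallest id of its
-- (firstname, lastname, acctname) group.
DELETE FROM accounts
WHERE id NOT IN (SELECT MIN(id)
                 FROM accounts
                 GROUP BY firstname, lastname, acctname);
```

Running the inner SELECT on its own first shows which ids would survive.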
Jun 26, 2000
How do I prevent others from viewing one of the 2 databases on our production server? Is there any security setting that stops all users, including sa and developers, from accessing one of the 2 databases on our server?
The other of the 2 databases can remain accessible.
Please advise
Newbie
Apr 11, 2008
Hi All,
I need to eliminate duplicates in my SQL query. I tried to use DISTINCT and that doesn't seem to work; can anybody please help?
The duplicates are in the #ddtempC table, and I am writing a query to get a country name from this hash table, which has duplicates.
The hash table contains (THEATER_CODE, COUNTRY_CODE, COUNTRY_NAME).
I am trying to write a condition on THEATER_CODE and COUNTRY_CODE to get COUNTRY_NAME,
but THEATER_CODE and COUNTRY_CODE have duplicates. Whenever I do a subquery I get the error below.
Msg 512, Level 16, State 1, Line 1
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
SELECT USER_FIRSTNAME, USER_LASTNAME,
user_countryCode,
USER_COUNTRY = (SELECT DISTINCT RTRIM(LTRIM(COUNTRY_NAME)) FROM #ddtempC WHERE RTRIM(LTRIM(COUNTRY_CODE)) = USER_COUNTRYCODE AND RTRIM(LTRIM(THEATER_CODE)) = USER_THEATERCODE)
FROM [user]
WHERE USER_USERNAME IS NOT NULL AND User_CreationDate BETWEEN '1/2/2007' AND '4/11/2008'
ORDER BY User_TheaterCode;
Thanks in Advance.
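DISTINCT still returns several rows when the trimmed country names for one code pair differ in any character, which is enough to trigger Msg 512. A sketch that forces exactly one value per user by aggregating; choosing `MIN` is an arbitrary assumption about which name should win:

```sql
SELECT u.USER_FIRSTNAME, u.USER_LASTNAME,
       u.USER_COUNTRYCODE,
       -- An aggregate without GROUP BY always yields a single row.
       USER_COUNTRY = (SELECT MIN(RTRIM(LTRIM(c.COUNTRY_NAME)))
                       FROM #ddtempC AS c
                       WHERE RTRIM(LTRIM(c.COUNTRY_CODE)) = u.USER_COUNTRYCODE
                         AND RTRIM(LTRIM(c.THEATER_CODE)) = u.USER_THEATERCODE)
FROM [user] AS u
WHERE u.USER_USERNAME IS NOT NULL
  AND u.User_CreationDate BETWEEN '1/2/2007' AND '4/11/2008'
ORDER BY u.User_TheaterCode;
```

If the names for one code pair genuinely disagree, cleaning #ddtempC itself is probably the better long-term fix.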
May 13, 2008
Hey There.
I'm in the process of doing a major data clean up and I'm just wondering how I would go about eliminating some redundant data.
The Table Layout
Contracts
CNTRID CONTRACTNUM STARTDATE CUSTOMNUM
=======================================================
0 1234567 091885 A
1 1234567 091885 A
2 1111111 111111 B
3 1234567 081205 A
Equipment
EQUIPID DEVICENAME CNTRID CUSTOMNUM
=======================================================
0 DEVICE1 0 A
1 DEVICE2 2 B
2 DEVICE3 1 A
3 DEVICE4 3 A
You will notice that each customer may have multiple devices. Each device may be tied to a contract, and each contract may have one or more devices tied to it.
In the example above, you will notice in the contracts table the contracts with the IDs 0 and 1.
Fig 1.
CNTRID CONTRACTNUM STARTDATE CUSTOMNUM
=======================================================
0 1234567 091885 A
1 1234567 091885 A
These contracts have the exact same information.
Furthermore, if you look down the table you will notice the contract with the ID 3.
Fig 2.
CNTRID CONTRACTNUM STARTDATE CUSTOMNUM
=======================================================
3 1234567 081205 A
This contract shares the same contract and customer number, but has a different start date.
Now let's take a look at the devices in the equipment table that refer to these records.
EQUIPID DEVICENAME CNTRID CUSTOMNUM
=======================================================
0 DEVICE1 0 A
2 DEVICE3 1 A
3 DEVICE4 3 A
You will notice that DEVICE1 and DEVICE3 refer to the contract records that contain identical data. (As shown in 'Fig 1')
My question is as follows:
How do I eliminate any duplicate records from the contracts table, and update the records in the equipment table with the id of the remaining contract?
Results Should be as follows:
Contracts
CNTRID CONTRACTNUM STARTDATE CUSTOMNUM
=======================================================
0 1234567 091885 A
2 1111111 111111 B
3 1234567 081205 A
Equipment
EQUIPID DEVICENAME CNTRID CUSTOMNUM
=======================================================
0 DEVICE1 0 A
1 DEVICE2 2 B
2 DEVICE3 0 A
3 DEVICE4 3 A
Any help you may provide would be greatly appreciated!
Thanks
--mike
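A two-step sketch for this cleanup, assuming the surviving contract in each duplicate group is the one with the lowest CNTRID and that CONTRACTNUM, STARTDATE and CUSTOMNUM contain no NULLs (NULLs would not match in the join):

```sql
-- 1. Repoint equipment rows at the surviving contract of their group.
UPDATE e
SET CNTRID = k.KeepID
FROM Equipment AS e
INNER JOIN Contracts AS c ON c.CNTRID = e.CNTRID
INNER JOIN (SELECT CONTRACTNUM, STARTDATE, CUSTOMNUM, MIN(CNTRID) AS KeepID
            FROM Contracts
            GROUP BY CONTRACTNUM, STARTDATE, CUSTOMNUM) AS k
        ON k.CONTRACTNUM = c.CONTRACTNUM
       AND k.STARTDATE   = c.STARTDATE
       AND k.CUSTOMNUM   = c.CUSTOMNUM
WHERE e.CNTRID <> k.KeepID;

-- 2. Remove the duplicates that nothing references any more.
DELETE FROM Contracts
WHERE CNTRID NOT IN (SELECT MIN(CNTRID)
                     FROM Contracts
                     GROUP BY CONTRACTNUM, STARTDATE, CUSTOMNUM);
```

On the sample data, step 1 moves DEVICE3 from contract 1 to contract 0 and step 2 deletes contract 1, which matches the expected result tables. Running both steps inside one transaction keeps the two tables consistent if anything fails.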
Jul 29, 2013
I have a SQL statement with two left outer joins which connects 3 tables: Vendors, Tracking & Activity. For whatever reason, even though each is a one-to-many relationship, I am able to join 2 tables (from Vendors to Tracking) without an issue. When I then join Activity, I get a Cartesian product. I suspected that 'DISTINCT'.
SELECT DISTINCT CASE
WHEN `vendor`.`companyname` IS NULL then 'No Company Assigned'
ELSE `vendor`.`companyname`
END AS companyNameSQL, `tracking`.`pkgTracking`, CASE
[code]....
Sep 29, 2014
Need to eliminate certain records from my query. The below is a simple query to illustrate my problem
My Query
Select RequestNo,Event_type from Event_log where Event_type in (10,20)
Data
RequestNo Event_type
123456 10
123457 10
123457 20
123458 10
123459 10
123459 20
The above query returns all requests that meet at least one criterion. How do I edit my query so that I get only the requests that meet both criteria, and the result set looks like below?
Data
RequestNo Event_type
123457 10
123457 20
123459 10
123459 20
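One way to express "has both event types" is to count the distinct types per request and keep the requests where that count is 2. A sketch against the Event_log table from the post:

```sql
SELECT e.RequestNo, e.Event_type
FROM Event_log AS e
WHERE e.Event_type IN (10, 20)
  AND e.RequestNo IN (SELECT RequestNo
                      FROM Event_log
                      WHERE Event_type IN (10, 20)
                      GROUP BY RequestNo
                      -- both 10 and 20 must be present
                      HAVING COUNT(DISTINCT Event_type) = 2)
ORDER BY e.RequestNo, e.Event_type;
```

With the sample data this keeps requests 123457 and 123459 and drops 123456 and 123458, which have only event type 10.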
Apr 10, 2014
WITH cte_OrderProjectType AS
(
select Orderid, min(TypeID) , min(CTType) , MIN(Area)
from tableA A inner join
tableB B ON A.PID = B.PID left join
tableC C ON C.TypeID = B.TypeID LEFT JOIN
tableD D ON D.AreaID = B.ID
group by A.orderid
)
This query uses min to eliminate duplicates. It takes 1.30 seconds to complete.
Is there any way I can improve the query performance?
Oct 21, 2014
Being one step removed from innumerate, I was wondering whether there was a more elegant way to avoid divide by zero error instead of trudging through a bunch of isnulls.
My intuition tells me that since multiplication looks like repeated addition, that maybe division is repeated subtraction?
If that's true is there a way to finesse divide by zero errors by somehow reframing the statement as multiplication instead of division?
The sql statement that is eating my kishkas is
cast(1.0*(
(ISNULL(a.DNT,0)+ISNULL(a.rex,0)+ISNULL(a.med,0))-(ISNULL(b.dnt,0)+ISNULL(b.rex,0)+ISNULL(b.med,0))/
ISNULL(a.DNT,0)+ISNULL(a.rex,0)+ISNULL(a.med,0)) as decimal(10,4)) TotalLossRatio
Is there a way to eliminate the error by restating the division? My assertion underlying this statement is that the a alias represents premiums paid, so between medical, pharmacy and dental, there MUST BE at least one premium paid, otherwise you wouldn't be here. The b alias is losses, so likewise, between medical, pharmacy and dental, there MUST BE at least one loss (actually, it just occurred to me that maybe there are no losses, but that would be inconceivable, but I'll check again), so that's when it struck me that maybe there's a different way to ask the question that obviates the need to do it by division.
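A common alternative to restating the division is to wrap the divisor in `NULLIF(divisor, 0)`, so a zero divisor yields NULL instead of an error. The sketch below uses two made-up rows, with `premium` standing in for the sum of the three `a` columns and `losses` for the sum of the three `b` columns (both names are assumptions); it needs SQL Server 2008 or later for the VALUES constructor.

```sql
SELECT t.premium,
       t.losses,
       CAST(1.0 * (t.premium - t.losses)
                / NULLIF(t.premium, 0) AS decimal(10,4)) AS TotalLossRatio
FROM (VALUES (100.0, 25.0),    -- ratio 0.7500
             (0.0,   0.0))     -- zero premium: ratio is NULL, no error
     AS t(premium, losses);
```

Note that in the posted expression the divisor is not parenthesized, so only the `b` sum is divided, and only by `ISNULL(a.DNT,0)`; wrapping both the numerator and the three-term divisor in their own parentheses is needed for the ratio to mean what the text describes.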
Feb 11, 2007
I am new to SQL Server and I am having difficulties writing a SQL script to perform the following:
1) Merge data from two tables A and B
2) Eliminate duplicates present in table B (conditions to satisfy for a duplicate: a similar address is found in both tables AND class type in table A = 1)
3) Merge data related to the duplicates (eliminated records) into a new table.
Not sure if we can eliminate records first before merging the two tables. The tables are as follows:
Table A
Fields: ID, NAME, Address, city, zip, Class type
Value:123, John, 123 Main, NY, 71690,1
Value:124, Tom, 100 State, LA, 91070,0
Table B
Field: ID, NAME, Address, city, zip, Class Type
Value:200, Tim, 123 Main, NY, 71690,0 (duplicate; satisfied both conditions and left out in final table)
Value:124, Jack, 100 State, LA, 91070,0 (same condition but second condition is not met)
Value:320,Bob, 344 coast hwy, slc, 807760,0
Final Table:
Field: ID, NAME, Address, city, zip, Class Type
Value:123, John, 123 Main, NY, 71690,1 (should also show t
Value:124, Tom, 100 State, LA, 91070,0
Value:124, Jack, 100 State, LA, 91070,0
Value:320,Bob, 344 coast hwy, slc, 807760,0
Table d:(relate to table A:showing all products that are related to table A)
table_A.ID, Products
123, Paper 1
123, paper 2
Table e:(relate to table B: showing all products that are related to table B)
table_B.ID, Products
200, Paper 3
Final Table:
ID, Product
123, Paper 1
123, Paper 2
123, Paper 3 (changing table b id to table a)
Would appreciate any help writing script to perform such transformation. Thanks
Jul 13, 2007
Hello,
I'm trying to eliminate the duplicate 'URL' rows in the query:
SELECT
ni.[Id],
ni.[Abstract],
ni.[MostPopular],
ni.[URL]
FROM dbo.[NewsCategory] nc WITH (READUNCOMMITTED)
INNER JOIN dbo.[NewsItem] ni WITH (READUNCOMMITTED)
ON nc.[Id] = ni.NewsCategoryId
WHERE
--nc.[ProviderId] = @ProviderId
--AND
ni.[URL] in (
select DISTINCT URL
from dbo.NewsItem
where mostpopular = 1
-- OR mostemailed = 1
)
ORDER BY ni.[DateStamp] DESC
If you look at this line in the query :
select DISTINCT URL
from dbo.NewsItem
where mostpopular = 1
If I run this query alone it returns 8 unique rows. I expected that the SELECT ... IN statement would help return a distinct set, but it doesn't. The entire query returns about 20 rows, with duplicates.
The reason I can't do a DISTINCT on the first set of columns is that the column ni.[Abstract] is TEXT, and it says that data type is NOT COMPARABLE.
Thanks so much.
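The IN filter only restricts which URLs qualify; every NewsItem row sharing a qualifying URL still comes back. A sketch that keeps one row per URL without comparing the TEXT column, assuming SQL Server 2005 or later and assuming the newest row per URL is the one wanted (the READUNCOMMITTED hints are left out for brevity):

```sql
SELECT x.[Id], x.[Abstract], x.[MostPopular], x.[URL]
FROM (
    SELECT ni.[Id], ni.[Abstract], ni.[MostPopular], ni.[URL], ni.[DateStamp],
           -- rank the rows of each URL, newest first
           ROW_NUMBER() OVER (PARTITION BY ni.[URL]
                              ORDER BY ni.[DateStamp] DESC, ni.[Id] DESC) AS rn
    FROM dbo.[NewsCategory] AS nc
    INNER JOIN dbo.[NewsItem] AS ni ON nc.[Id] = ni.NewsCategoryId
    WHERE ni.[URL] IN (SELECT URL FROM dbo.NewsItem WHERE mostpopular = 1)
) AS x
WHERE x.rn = 1
ORDER BY x.[DateStamp] DESC;
```

Only the URL, DateStamp and Id columns are compared, so the TEXT restriction on Abstract does not come into play.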
Jul 20, 2007
Hi i have a table value which contains
value
-----
a
a
a
b
b
b
c
c
c
Now i need to have the results as
a 1
b 1
c 1
I tried using DISTINCT, but OLE DB returns an invalid syntax error; it doesn't support the DISTINCT keyword. I actually read this table from a file through OLE DB, not from a database. Any ideas? Thanks in advance.
Jul 20, 2005
Am I going about this the right way? I want to find pairs of entities
in a table that have some relationship (such as a field being the
same), so I
select t1.id, t2.id from sametable t1 join sametable t2 on
t1.id<>t2.id
where t1.fieldx=t2.fieldx ...
The trouble is, this returns each pair twice, e.g.
B C
C B
M N
N M
Is there a way to do this kind of thing and only get each pair once?
Kerry
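The usual trick for getting each pair once is to replace `<>` with `<`, so that of the two orderings only the one with the smaller id on the left survives. A sketch using the table and column names from the post:

```sql
select t1.id as id1, t2.id as id2
from sametable t1
join sametable t2
  on t1.id < t2.id          -- each unordered pair appears exactly once
where t1.fieldx = t2.fieldx;
```

This relies on id values being comparable and unique per row; with B, C, M and N as ids it would return B C and M N only.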
Jul 20, 2005
Suppose I have users that can belong to organizations. Organizations are arranged in a tree. Each organization has only one parent organization, but a user may be a member of multiple organizations.
The problem that I'm facing is that both organizations and individual users may have relationships with other entities which are semantically the same. For instance, an individual user can purchase things and so can an organization. An individual user can have business partners and so can an organization. So it seems that I would need to have a duplicate set of link tables that link a user to a purchase and then a parallel link table linking an organization to a purchase. If I have N entities with which both users and organizations may have relationships then I need 2*N link tables. There is nothing wrong with that per se, but it is just not elegant to have two different tables for a relationship which is the same in nature, e.g. purchaser->purchaseditem.
One other approach I was thinking of is to create an intermediate entity (say it's called "holder") that will be used to hold references to all the relationships that both an organization and an individual may have. There will be 2 link tables linking organizations to "holder" and users to "holder". Holder will in turn reference the purchases, partners and so on. In this case the number of link tables will be N+2 as opposed to 2*N, but it will have a performance cost of an extra join.
Is there a better way of modelling this notion of 2 different entities that can possess similar relationships with N other entities?
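The "holder" idea described above could be sketched as the following DDL. Every table and column name here is hypothetical, and only one of the N relationship tables (purchases) is shown:

```sql
CREATE TABLE holder (
    holder_id int NOT NULL PRIMARY KEY
);

CREATE TABLE organization (
    org_id        int NOT NULL PRIMARY KEY,
    parent_org_id int NULL REFERENCES organization (org_id)  -- tree: one parent each
);

CREATE TABLE app_user (
    user_id int NOT NULL PRIMARY KEY
);

-- The two link tables: each organization or user maps to exactly one holder.
CREATE TABLE organization_holder (
    org_id    int NOT NULL PRIMARY KEY REFERENCES organization (org_id),
    holder_id int NOT NULL UNIQUE REFERENCES holder (holder_id)
);

CREATE TABLE user_holder (
    user_id   int NOT NULL PRIMARY KEY REFERENCES app_user (user_id),
    holder_id int NOT NULL UNIQUE REFERENCES holder (holder_id)
);

-- One table per related entity (N in total); purchases as an example.
CREATE TABLE purchase (
    purchase_id int NOT NULL PRIMARY KEY,
    holder_id   int NOT NULL REFERENCES holder (holder_id)
);

-- The extra join mentioned in the post: purchases of one user.
SELECT p.purchase_id
FROM user_holder AS uh
INNER JOIN purchase AS p ON p.holder_id = uh.holder_id
WHERE uh.user_id = 42;
```

A variant of the same idea stores holder_id directly as a column on organization and app_user, which removes the two link tables at the cost of touching the base tables.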
Jun 19, 2015
I have an UPDATE statement that joins two table by SendId. One table, I'll call it T1, has a clustered index on SendId ASC. The other table I will call T2 also has a clustered index on SendID ASC. All the columns from T2 are used to update T1. The execution plan shows a Clustered index scan on T2 and a Clustered Index Seek on T1 going into a Nested Loops inner join. Immediately following is a Distinct Sort that is done on SendId ASC. Why the Distinct SORT if the tables are already ordered by SendID?
View 8 Replies
View Related
Mar 10, 2014
I'm using SQL 2012 express.. and just recently learned how to code.
I wrote a query and keep receiving this error...
Error converting data type varchar to float.
here's the query code
SELECT SUM(cast(lc as float))
FROM [dbo].[LaborCosts]
WHERE ppty = 'ga'
AND PL = 'allctd ktchn expns'
AND ACCT like 'payroll%'
I am trying to sum up the values in column LC, and realized I have unnecessary quotation marks. How can I eliminate the quotation marks from the column, and only query the numerical values?
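A sketch that strips the quotation marks inline and skips anything that still cannot be converted, assuming the stray characters are double quotes (for single quotes the REPLACE pattern would be `''''`). `TRY_CAST` is available from SQL Server 2012 and returns NULL instead of raising the conversion error, and SUM ignores NULLs:

```sql
SELECT SUM(TRY_CAST(REPLACE(lc, '"', '') AS float)) AS TotalLaborCost
FROM [dbo].[LaborCosts]
WHERE ppty = 'ga'
  AND PL = 'allctd ktchn expns'
  AND ACCT LIKE 'payroll%';
```

Selecting the rows where `TRY_CAST(REPLACE(lc, '"', '') AS float) IS NULL AND lc IS NOT NULL` is a quick way to see which values would be silently left out of the total.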
Jul 27, 2015
We are trying to do some utilization calculations that need to factor in a given number of holiday hours per month.
I have a date dimension table (dimdate). Has a row for every day of every year (2006-2015)
I have a work entry fact table (timedetail). Has a row for every work entry. Each row has a worked date, and this column has a relationship to dimdate.
Our holidays fluctuate, and we offer floating holidays that our staff get to pick. So we cannot hard code which individual dates in dimdate as holidays. So what we have done is added a column to our dimdate table called HolidayHoursPerMonth.
This column lists the number of holiday hours available in the month that the individual date falls within, thus there are a lot of duplicates. Below is a brief example of dimdate. In the example below, there are 0 holiday hours for the month of June, and there are 8 holiday hours for the month of July.
DateKey MonthNumber HolidayHoursPerMonth
6/29/2015 6 0
6/30/2015 6 0
7/1/2015 7 8
7/2/2015 7 8
I have a pivot table created from the fact table. I then have various date slicers from the dimension table (i.e. year, month). If I simply drag this column into the pivot table and summarize by MAX, it works when you are sliced on a single month, but breaks if anything but a single month is sliced on.
I am trying to create a measure that calculates the amount of holiday hours based on what's sliced, but only using a single value for each month. For example, July should just be 8, not 8 x the number of days in the month.
Listed below is how many hours there are per month. So if you were to slice on an entire year, the measure should equal 64. If you sliced on Jan, Feb and March, the measure should equal 12. If you were to slice nothing, thus including all 10 years in our dimdate table, the measure should equal 640 (10 years x 64 hours per year).
MonthNumberOfYear  HolidayHoursPerMonth
1                  8
2                  4
3                  0
4                  0
5                  8
6                  0
7                  8
8                  0
9                  8
10                 4
11                 16
12                 8
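The logic being asked for (take one value per month, then add the months up) can be sketched in SQL as a two-step aggregation. This is only an illustration of the calculation, not a pivot-table measure: the column names come from the dimdate example in the post, and deriving the year with YEAR(DateKey) assumes DateKey is a date or datetime column.

```sql
-- Step 1 (inner query): collapse dimdate to one row per year and month,
-- keeping a single HolidayHoursPerMonth value for that month.
-- Step 2 (outer query): add up the per-month values.
SELECT SUM(m.HolidayHours) AS TotalHolidayHours
FROM (
    SELECT YEAR(DateKey)             AS CalendarYear,
           MonthNumber,
           MAX(HolidayHoursPerMonth) AS HolidayHours
    FROM dimdate
    -- A WHERE clause here would play the role of the slicers.
    GROUP BY YEAR(DateKey), MonthNumber
) AS m;
```

For one full year of the table in the post this adds 8 + 4 + 0 + 0 + 8 + 0 + 8 + 0 + 8 + 4 + 16 + 8 = 64. A measure would presumably follow the same shape: iterate over the distinct year and month combinations in the current filter context and sum the per-month maximum.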
View 3 Replies
View Related
Nov 24, 2006
Hi, all here,
Thank you very much for your kind attention.
I am wondering whether it is possible to use SSIS to split a data set into a training set and a test set and feed them directly to my data mining models, without saving them somewhere first, since that would take up too much space. I really need guidance on that.
Thank you very much in advance for any help.
With best regards,
Yours sincerely,
View 5 Replies
View Related
Dec 14, 2005
After testing the application I wrote on my local PC, I deployed it to the web server to test it there, and I get this error:
System.Data.SqlClient.SqlException: The conversion of a char data type to a
datetime data type resulted in an out-of-range datetime value.
Note: every page that has this error contains either a repeater or a datagrid that loads data when the page loads.
At first I thought the problem was with the date, but then I saw that some other pages with a datagrid (one that has a date field) work just fine.
Has anyone had this problem before? Hopefully you guys can help.
Thanks,
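One common cause of this error, though not necessarily the one here, is that the SQL login used on the web server has a different language or date format setting than the one on the development machine, so a date string such as 25/12/2005 is read with 25 as the month. A minimal T-SQL sketch of the effect follows; the literal dates are made up for illustration.

```sql
-- With month-day-year order, '25/12/2005' is read as month 25 and fails
-- with the out-of-range datetime error quoted in the post.
SET DATEFORMAT mdy;
SELECT CAST('25/12/2005' AS datetime);

-- With day-month-year order the same string converts without error.
SET DATEFORMAT dmy;
SELECT CAST('25/12/2005' AS datetime);

-- The unseparated yyyymmdd form is read the same way under any DATEFORMAT.
SELECT CAST('20051225' AS datetime);
```

If this is the cause, passing dates to the query as typed datetime parameters instead of concatenated strings avoids the conversion altogether.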
View 4 Replies
View Related
Dec 4, 2007
I have used both data readers and data adapters (with datasets) in the projects that I have worked on. I am trying to get some clarification on when I should be using which one. I think I am doing this correctly, but I want to be sure I am developing good habits.
As the name suggests, a data reader seems to be for reading data only. I have read that the data adapter and dataset are for a disconnected architecture, or at least that they can be used for that type of setup. I have been using the data adapter and datasets when writing to a database and the data reader when reading from a database.
Is this how these should be used? Is the data reader the best choice for reading data? Am I doing this the optimal way from a performance standpoint?
Thanks in advance.
View 1 Replies
View Related
Nov 2, 2015
We have already integrated data from different clients into MDS with the MS Excel plug-in. Now we want to push updated or newly added records back to the source database. Is this possible using MDS? Is there a background sync process that automatically syncs data between the subscribers and MDS?
View 4 Replies
View Related
Oct 18, 2006
When I enter more than 4000 characters in any ntext field in my SQL Server 2005 database (both directly in the database and through the application), I get an error saying that the data could not be updated because string or binary data would be truncated. Has anyone ever seen this? I cannot figure out what is causing it; ntext should be able to hold a lot more data than this...
View 7 Replies
View Related
Aug 12, 2015
I have a requirement to implement CDC for 50+ tables so that the warehouse/reporting load picks up incremental data changes rather than exporting the whole table data. The largest table has more than half a billion records.
The warehouse uses a daily copy of the OLTP DB (daily DB refresh). How can I accomplish this? Is there a downside to implementing CDC just for the sake of capturing incremental changes on the tables?
Is there any performance impact if we enable CDC on the OLTP DB?
Can we make use of the CDC tables in the environment where we do the daily DB refresh, so that the queries don't hit the OLTP database?
What is the best way to implement CDC to capture incremental changes for reporting?
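For reference, the basic mechanics of enabling CDC and reading changes look roughly like the sketch below. It uses the built-in CDC procedures and functions of SQL Server 2008 and later; the table dbo.Orders (and therefore the capture instance name dbo_Orders) is a hypothetical example, not one of the tables from the post.

```sql
-- Enable CDC for the current database (requires sysadmin).
EXEC sys.sp_cdc_enable_db;

-- Enable CDC for one table; dbo.Orders is a hypothetical table name.
-- With no capture instance name given, the default is dbo_Orders.
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'Orders',
    @role_name     = NULL;

-- Read every change currently available for that capture instance.
DECLARE @from_lsn binary(10) = sys.fn_cdc_get_min_lsn('dbo_Orders');
DECLARE @to_lsn   binary(10) = sys.fn_cdc_get_max_lsn();

SELECT *
FROM cdc.fn_cdc_get_all_changes_dbo_Orders(@from_lsn, @to_lsn, N'all');
```

An incremental load would normally remember the last LSN it processed and start the next run just after it (sys.fn_cdc_increment_lsn exists for that purpose) rather than reading from the minimum LSN each time. The capture process reads the transaction log asynchronously through SQL Server Agent jobs, so the main costs to weigh are the extra log reading and the storage and writes for the change tables.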
View 0 Replies
View Related
Jul 20, 2005
Hi,
This is driving me nuts. I have a table that stores notes regarding an operation in an IMAGE data type field in MS SQL Server 2000.
I can read and write with no problem from Access using the StrConv function, and I can update the field correctly in T-SQL using:
DECLARE @ptrval varbinary(16)
SELECT @ptrval = TEXTPTR(BITS_data)
FROM mytable_BINARY WHERE ID = 'RB215'
WRITETEXT OPERATION_BINARY.BITS @ptrval 'My notes for this operation'
However, I just cannot seem to convert the information back to text in T-SQL once it is stored. My selects keep returning binary data.
How can I do this? Thanks for your help.
SD
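A commonly used workaround, sketched here with the table, column, and key names taken from the WRITETEXT statement in the post (the post uses slightly different names in its SELECT, so adjust as needed): an image value cannot be cast to varchar directly, but it can be cast to varbinary, which in turn can be cast to varchar.

```sql
-- image -> varbinary -> varchar; a direct image -> varchar cast is not allowed.
SELECT CAST(CAST(BITS AS varbinary(8000)) AS varchar(8000)) AS Notes
FROM OPERATION_BINARY
WHERE ID = 'RB215';
```

On SQL Server 2000 both varbinary and varchar are capped at 8000 bytes, so longer notes come back truncated; SUBSTRING accepts an image column and can be used to read later chunks. If the data was written from Access as Unicode, casting to nvarchar(4000) instead of varchar(8000) may be what is needed; that depends on how StrConv was called, which the post does not show.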
View 1 Replies
View Related