Background:
Working on a CRM type custom application. The application is for an
event management company. The company will provide the application for
other organizations to manage their own events. The events include
conferences, corp meetings, sales meetings, etc...
An event planner will define what information is needed for an attendee
to register for an event. We will be providing a standard list of
attributes for the event planner to select from. This list includes
personal information (name, address, phone numbers), air travel
information (preferred carriers, departure airports, etc...), hotel
information, etc. We've included all of the information available to
us from the business's previous experience. As far as the database
goes, all of the standard information given to us will be normalized.
The problem is each event may have unique information that needs to be
collected that is not part of the standard list of attributes. For
example, if McBurgers is planning an event, the event planner may want
to collect an attendee's McBurger employee code.
Depending on the uniqueness of the event, there may be up to 200 unique
attributes defined for it. This number comes from researching events
planned in the last 5 years. The number of attendees for an event ranges
from 100 to 10,000. The company expects about 3000 events per year.
Database Design
I've done a fair amount of research and found a couple of options to
meet our requirements, more specifically the need for event planners to
define custom attributes for an event.
1-)DynamicColumns:
Add an event-specific custom attributes table. The table would look
something like this:
Event_McBurger05
AttendeeID | McBurgerEmployeeCode | HiredDate | SomeOtherAttribute
-
... | AxEt356 | 01/01/2004 | Other val 2
2-)EAV:
Add an EAV (entity, attribute, value) table. The table would look
something like this:
3-)Strongly Typed EAV:
Same idea as the EAV table, but there would be one Event_[DataType]Attribute
table for each of the data types allowed, so each value is stored in a column
of its native type.
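For concreteness, here is a minimal sketch (table and column names are my own assumptions, not a fixed design) of what options 2 and 3 might look like:

-- Option 2: one generic EAV table; every value is stored as a string.
CREATE TABLE Event_AttendeeAttribute (
    AttendeeID     INT          NOT NULL,
    AttributeName  VARCHAR(100) NOT NULL,
    AttributeValue VARCHAR(400) NULL,
    CONSTRAINT PK_Event_AttendeeAttribute
        PRIMARY KEY (AttendeeID, AttributeName)
)

-- Option 3: one table per allowed data type, so values keep their native type.
CREATE TABLE Event_DateTimeAttribute (
    AttendeeID     INT          NOT NULL,
    AttributeName  VARCHAR(100) NOT NULL,
    AttributeValue DATETIME     NULL,
    CONSTRAINT PK_Event_DateTimeAttribute
        PRIMARY KEY (AttendeeID, AttributeName)
)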
Pros/Cons
1-)DynamicColumns
Pros:
-Data integrity can be enforced
-Simpler queries for reporting
-Clearer data model for understanding data stored
Cons:
-Row size limitation of 8k must be managed (probably need to add
another table if we run out of room)
-Stored procedures for CRUD operations would need to be created
dynamically, OR we would need to use dynamic SQL in the database or the
application
-Adding/Removing columns on the fly can be very error prone
2-)EAV
Pros
-Static CRUD stored procs
Cons
-No data integrity
-Complex queries for reporting
-Worse performance than option 1.
-Table can get BIG...fast.
3-)Strongly Typed EAV
Pros
-Static CRUD stored procs
-Better data type integrity than EAV
Cons
-Complex queries for reporting
-Worse performance than option 1
-Table can get BIG...fast.
If you are still reading this...thank you!
The Questions:
-Are there options other than the three described above? Or are these
pretty much it, with slight variants?
-Does anyone see any missing Pros/Cons for any of the options that
should be considered?
-Is there a "preferred" method for what I am trying to do?
I suspect this will come down to the lesser of three evils. I'm just
trying to figure out which of the three it is.
We have prototyped the three options and are leaning towards options 1
and 3.
I'm building a custom transform component. I want to mark some input columns as keys for deduplicating. In a similar way to the provided Sort component, I want to check those columns and allow pass-through (or not) for the others, so next to each input column name I need two checkboxes (1: use for dedupe; 2: include in output if 1 is not checked). If a column is checked for use in the dedupe, I want some other attributes to be shown indicating how it will be used. How do I display the checkboxes to let users select which columns to include for deduplication, and then how do I add further attributes underneath (copying the Sort component's look) for selection?
Thanks in advance for guidance and pointers on this.
I'd like to create a table that will store different order items. Several order items make up one single order. Order items can have 0 or more children (the nesting will never be deeper than one level). Order items can have up to 150 attributes/values. The way I think this should be done is using an XML column instead of an EAV-type model. My table structure currently looks like this:
* child_order_item_id (PK)
* parent_order_item_id (FK to child_order_item_id)
* order_id (FK to Order table)
* product_id (FK to Product table)
* price
* attribute_XML
How should my attribute_XML look, and how do I validate the XML?
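Not knowing your attribute list, here is a minimal sketch (all names are placeholders) of one way it could look: a simple name/value element list, validated by binding the column to an XML schema collection so SQL Server rejects malformed documents on insert and update.

CREATE XML SCHEMA COLLECTION OrderItemAttributesSchema AS N'
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="attributes">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="attribute" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:string">
                <xs:attribute name="name" type="xs:string" use="required"/>
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>'
GO

CREATE TABLE OrderItem (
    child_order_item_id  INT IDENTITY(1,1) PRIMARY KEY,
    parent_order_item_id INT NULL REFERENCES OrderItem (child_order_item_id),
    order_id             INT NOT NULL,   -- FK to Order table (constraint omitted here)
    product_id           INT NOT NULL,   -- FK to Product table (constraint omitted here)
    price                DECIMAL(10,2) NOT NULL,
    attribute_XML        XML (OrderItemAttributesSchema)  -- typed XML: validated automatically
)
GO

-- Example document: one element per attribute, name/value pairs.
INSERT INTO OrderItem (parent_order_item_id, order_id, product_id, price, attribute_XML)
VALUES (NULL, 1, 1, 9.99,
        N'<attributes>
            <attribute name="Color">Red</attribute>
            <attribute name="Size">XL</attribute>
          </attributes>')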
I am working in SQL Server Master Data Services  Version 11.0.5058.0 (SP 2).
I have been asked to group all the financial attributes together. When I move one of the attributes up using the arrows, it works well, jumping over one attribute at a time. Then I reach a section of attributes where it leapfrogs over 24 attributes.
It appears these 24 attributes are in a subgroup, but there are no attribute groups, and I removed the subscription view from the entity. If I move one of the 24 attributes in the group, it moves outside of the 24 attributes.
This is under leaf member attributes. There are no collection or consolidated groups.
I'm using a DW built from the Northwind database to build a cube to do some analytical tasks. I have already created the cube and now I am "cleaning" the dimensions. I'm having some difficulty understanding the logic of this part. The thing is, when I created the Data Source View, I only imported the foreign keys that connect the dimensions to the fact table. Do I have to drag the attributes of a dimension from the Data Source View to the Attributes tab?
Imagine this:
I have the following dimension:
Dim_Customer: Customer_ID, Name_Customer, Job_Function, Date_of_Birth, Contact, Address, City, Country
When I create the cube, only Customer_ID appears in the Attributes tab. Is that normal?
One more question:
I don't want to create a hierarchy like:
Customer ID -> Name_Customer
Customer ID -> Date_of_Birth
Customer ID -> Address
Customer ID -> City
Customer ID -> Country
My idea is to create the following hierarchy:
Name_Customer -> Date_of_Birth -> Address -> City -> Country
But the first hierarchy that I showed is the one that always appears. Do you know what is happening?
I have a specification table that has some attributes defined:
SpecId - ID of the specification
Attribute - Attribute of the spec (like Color, HP, etc.)
Value - The value of the attribute
Then I have a car table that actually has information about the cars. The intention is to take each specification and match the cars that satisfy it. If the car has more attributes than the spec, we ignore the extra attributes for the match. But if the car has fewer attributes, we don't even consider the car a match (even if the attributes that are present match). To summarize, the car's attributes should be >= the spec's attributes.
The code I have below is bad because I am joining the same tables twice. In addition, it fails the condition "the car's attributes should be >= the spec's attributes".
INSERT INTO @Specification VALUES ('S1', 'Type', 'Sedan')
INSERT INTO @Specification VALUES ('S1', 'Transmission', 'Auto')
INSERT INTO @Specification VALUES ('S1', 'HP', '220')

INSERT INTO @Specification VALUES ('S2', 'Type', 'SUV')
INSERT INTO @Specification VALUES ('S2', 'Transmission', 'Manual')
INSERT INTO @Specification VALUES ('S2', 'HP', '300')

INSERT INTO @Car VALUES ('Accord', 'Type', 'Sedan')
INSERT INTO @Car VALUES ('Accord', 'Transmission', 'Auto')
INSERT INTO @Car VALUES ('Accord', 'HP', '220')
INSERT INTO @Car VALUES ('Accord', 'Color', 'Black')

INSERT INTO @Car VALUES ('Escape', 'Type', 'SUV')
INSERT INTO @Car VALUES ('Escape', 'Transmission', 'Manual')
INSERT INTO @Car VALUES ('Escape', 'HP', '300')

INSERT INTO @Car VALUES ('Explorer', 'Type', 'SUV')
INSERT INTO @Car VALUES ('Explorer', 'Transmission', 'Manual')

SELECT DISTINCT Spec.SpecId, Car.CarName
FROM @Specification Spec
INNER JOIN @Car Car
    ON Spec.Attribute = Car.Attribute
   AND Spec.Value = Car.Value
WHERE Spec.SpecId NOT IN (
    SELECT Spec.SpecId
    FROM @Specification Spec
    LEFT OUTER JOIN @Car Car
        ON Spec.Attribute = Car.Attribute
       AND Spec.Value = Car.Value
    WHERE Car.CarName IS NULL)
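For comparison, here is one common way to express this kind of relational-division check with a single join: count, per spec and car, how many of the spec's (Attribute, Value) pairs the car satisfies, and keep the car only when that count covers the whole spec. This is only a sketch and assumes neither table has duplicate (Attribute, Value) rows for the same SpecId or CarName.

SELECT s.SpecId, c.CarName
FROM @Specification AS s
INNER JOIN @Car AS c
    ON  c.Attribute = s.Attribute
    AND c.Value     = s.Value
GROUP BY s.SpecId, c.CarName
HAVING COUNT(*) = (SELECT COUNT(*)
                   FROM @Specification s2
                   WHERE s2.SpecId = s.SpecId)   -- every spec row matched; extra car rows are ignored

Cars with fewer attributes than the spec (like Explorer against S2) never reach the spec's full count, so they drop out.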
I'm developing a Custom Report Item and would like to fill my properties dialog in design mode with real data. I have the name of the data source and the command text for the corresponding report dataset, read from the report definition.
Is it possible to create an ADO.NET dataset so that I can display and use real data in my properties dialog box? The first problem I have is getting the connection string from the data source string. I found no docs for that; in particular, the Microsoft.ReportingServices.DataExtensions.ReportDataSource class is not documented. Am I on the right track with that?
I am creating a custom transformation component, and a custom user interface for that component.
In my custom UI, I want to show the custom properties, and allow users to edit these properties similar to how the advanced editor shows the properties.
I know that in my UI I need to create a "Property Grid". In the properties of this grid I can select the object I want to display data for; however, the only objects that appear are the objects that I have already created within this UI, and not the actual component object with the custom properties.
How do I go about getting the properties for my transformation component listed in this property grid?
What I want to accomplish is that at design time the designer can enter a value for some custom property on my custom task, and that this value is accessible at execution time.
I am writing a custom task that has some custom properties. I would like to parameterize these properties, i.e. read them from a variable, so I can change these variables from a config file at runtime.
I read the documentation and it says that if we set the ExpressionType to CPET_NOTIFY, it should work, but it does not seem to. Not sure if I am missing anything. Can someone please help me?
In the editor of my custom task, under the custom properties section, I expected a button with three dots to click and open a pop-up so we can specify the expression, or at least have it evaluate the variables if we give @[User::VariableName].
I am using db keywords as attributes in my table, such as from, to and date. These are enclosed in [] in SQL Server Enterprise Manager. I'm just asking if doing this is a bad idea? These are the most applicable names for these attributes, but I don't want to run into problems further down the line.
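It mostly works; the practical cost is that every query touching those columns has to bracket them from now on. A quick illustration (the table name here is made up):

SELECT [from], [to], [date]
FROM dbo.Booking                 -- hypothetical table
WHERE [date] >= '20240101'
ORDER BY [from]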
I have a report where I want to look for a certain attribute and, as long as a record contains that attribute, bring all of its other attributes with it. Better with an example: in this report I am specifically looking for the attribute "Alcohol". If I find this attribute, I want to include all of the others that share that record's primary key, which could include "Drugs", "Arson", "Vandalism", etc. The problem is that when I try to use a parameter or filter I get the "Alcohol" attribute but not "Drugs", "Arson", "Vandalism", etc. Conversely, with no filter/parameter set I get everything, even records that do not include "Alcohol".
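If the attributes live as rows in the dataset keyed by the record's primary key, one way (all table and column names below are assumptions, not your schema) is to filter on the key rather than on the attribute column itself: keep every row whose key also has an 'Alcohol' row.

SELECT r.RecordId, r.Attribute
FROM dbo.ReportData AS r
WHERE r.RecordId IN (SELECT r2.RecordId
                     FROM dbo.ReportData AS r2
                     WHERE r2.Attribute = 'Alcohol')   -- keeps Drugs/Arson/etc. for matching records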
Bit of a design question, as I'm interested to know if anyone's done anything like this. This is my main table (ish):
Thing(ThingId, Ref)
I then need to be able to give this "Thing" any number of attributes:
Thing1 - Type: Red, Location: London
Thing2 - Type: Blue, Height: 400, Width: 300
Thing3 - Height: 500, Location: Norwich
But I have no idea how to model this in the database. It needs to be done in such a way that I can add a Thing and all its attributes in one database hit, basically (is there a stored procedure you could pass an array into? one possibility is sketched below). My initial thoughts were to have:
Thing(ThingId, Ref)
Attribute(AttributeId, ThingId*, AttributeTypeId*, Value)
AttributeType(AttributeTypeId, Description)
Is that completely mad? It seems like quite a lot of data accesses to enter a Thing. It could be Thing(ThingId, Ref, Type, Location, Height, Width), but then when "Thing - Color: White" comes along the model is stuffed. Any ideas? (Hope that makes sense.)
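On the "one database hit" part: if you are on SQL Server 2008 or later, a table-valued parameter is the closest thing to passing an array into a stored procedure (on 2005, the usual workarounds are an XML or delimited-string parameter). A rough sketch using your Thing/Attribute/AttributeType shape, assuming ThingId is an IDENTITY column:

CREATE TYPE dbo.AttributeList AS TABLE (
    AttributeTypeId INT           NOT NULL,
    Value           NVARCHAR(200) NOT NULL
)
GO
CREATE PROCEDURE dbo.AddThing
    @Ref        NVARCHAR(50),
    @Attributes dbo.AttributeList READONLY
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @ThingId INT;

    INSERT INTO dbo.Thing (Ref) VALUES (@Ref);
    SET @ThingId = SCOPE_IDENTITY();

    -- all attributes arrive in the TVP, so the whole Thing is saved in one call
    INSERT INTO dbo.Attribute (ThingId, AttributeTypeId, Value)
    SELECT @ThingId, AttributeTypeId, Value
    FROM @Attributes;
END
GO
-- Usage:
DECLARE @attrs dbo.AttributeList;
INSERT INTO @attrs VALUES (1, N'Red'), (2, N'London');
EXEC dbo.AddThing @Ref = N'Thing1', @Attributes = @attrs;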
In Query Analyzer, what is the command to tell me the attributes of the entities in a table? In Oracle I can use the DESCRIBE command. I know there is a way to do it in Query Analyzer but I can't remember how. I can also look visually by expanding the table's node, but doing it from the command line in Query Analyzer is sometimes quicker.
Example: I want to find out about a table named "Employee". What command would I type that would tell me all of the columns/attributes in that table, and the data types they are? Bill
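Two common equivalents of Oracle's DESCRIBE:

EXEC sp_help 'Employee'                -- columns, data types, keys, constraints, indexes

SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, IS_NULLABLE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'Employee'          -- just the column list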
When I copy tables from one database to another (using the DTS Wizard) I lose my settings... primary keys + default values! Any help would be appreciated. Thanks
I have a few tables that have a disabled attribute using a BIT datatype. A lot of my queries on the front end look like:
SELECT * FROM TableA WHERE disabled <> 1
There's usually some other constraints on the query (get TOP 10 and greater than a certain date for example). Right now my tables are very small (only a couple thousand rows). I don't anticipate these tables having more than 100,000 rows.
Right now let's say there's only a CLUSTERED INDEX on the date field, and regular INDEXES on the identity field and perhaps some other UNIQUE name in the table.
Unless I am doing ranged queries on the CLUSTERED INDEXED field, I'm going to be performing table scans almost every time, right?
This sort of goes along with another question:
Say you run the following (SQL Server):
CREATE TABLE TestA (
    [id] INT IDENTITY (1, 1) PRIMARY KEY,
    disabled BIT DEFAULT 0
)
GO
INSERT INTO TestA (disabled) VALUES ('0')
INSERT INTO TestA (disabled) VALUES ('0')
INSERT INTO TestA (disabled) VALUES ('1')
INSERT INTO TestA (disabled) VALUES ('0')
INSERT INTO TestA (disabled) VALUES ('0')
INSERT INTO TestA (disabled) VALUES ('0')
INSERT INTO TestA (disabled) VALUES ('0')
INSERT INTO TestA (disabled) VALUES ('1')
INSERT INTO TestA (disabled) VALUES ('0')
INSERT INTO TestA (disabled) VALUES ('0')
INSERT INTO TestA (disabled) VALUES ('1')
INSERT INTO TestA (disabled) VALUES ('1')
INSERT INTO TestA (disabled) VALUES ('0')
Since [id] is a PK there will be a CLUSTERED INDEX placed on it. My question is; what does the optimizer do when you perform the following query?
SELECT TOP 3 * FROM TestA WHERE disabled <> '1'
My assumption is that since there's a CLUSTERED INDEX it will simply iterate through every tuple and check whether disabled is not '1'. If my assumption is correct, then these kinds of boolean fields aren't a big deal when TOP queries are performed against a CLUSTERED INDEX.
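One quick way to confirm that assumption is to look at the plan and I/O for the query:

SET STATISTICS IO ON
SELECT TOP 3 * FROM TestA WHERE disabled <> '1'
-- Expect a Clustered Index Scan feeding a TOP: the scan walks rows in [id] order
-- and stops as soon as 3 qualifying rows are found, so on small tables it is cheap,
-- but the predicate on disabled never becomes a seek.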
So I guess what I am getting at is: are bit attributes a sign of bad design? As tables get larger, will performance degrade significantly? Would a better design be to have a separate table of disabled items (which may result in large NOT IN subqueries)?
Any information on this would be greatly appreciated.
I have a question about storing the history of particular objects in a database. For example, if I had a table of "People" with fields "PersonId", "Name", "PhoneNumber", "Height", "Weight", "Profession", the data in every field stored for each person can change over time, except for the "PersonId", of course, which is why it is included.
I would like to be able to view a person's attributes at any point in time and therefore need to maintain a history. The current approach in place is to archive images of the whole table at certain points in time, which is unacceptable as it misses some changes, is not very accessible, and also stores data which does not change.
My solution would be to create separate tables for each changing attribute, with a corresponding date for each change. For example, for phone numbers, have a table "PeoplePhoneNumbers" with fields "PersonId", "PhoneNumber" and "ChangeDate". A few shortcomings I can see in this approach are that, firstly, there will be many tables, one for each changing attribute, and there can be far more attributes than those mentioned. Secondly, joins will have to be created between every attribute table to get back the original single-table form, although I don't see this as a very important issue.
I am wondering: is there a more elegant way to structure objects of this changing nature, or is having separate tables for each changing attribute the best solution? I'm sure this is a very common issue. Thanks very much for the help.
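For comparison, a common alternative (just a sketch, with assumed column types) is to version whole rows in a single history table: every change closes the current row and inserts a new one, and an "as of" query picks the row whose validity range covers the date you care about.

CREATE TABLE PeopleHistory (
    PersonId    INT           NOT NULL,
    Name        NVARCHAR(100) NULL,
    PhoneNumber NVARCHAR(30)  NULL,
    Height      DECIMAL(5,2)  NULL,
    Weight      DECIMAL(5,2)  NULL,
    Profession  NVARCHAR(100) NULL,
    ValidFrom   DATETIME      NOT NULL,
    ValidTo     DATETIME      NULL,        -- NULL = current version
    CONSTRAINT PK_PeopleHistory PRIMARY KEY (PersonId, ValidFrom)
)

-- "As of" query: person 42's attributes as they were on 1 Jan 2007.
SELECT *
FROM PeopleHistory
WHERE PersonId = 42
  AND ValidFrom <= '20070101'
  AND (ValidTo IS NULL OR ValidTo > '20070101')

The trade-off against per-attribute tables is that a whole row is written even when only one column changed; in exchange, reconstructing the full picture at a point in time needs no joins.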
We have an entity such as documentSearchKey that contains attributes about a particular document. This document can have 1-N search keys or attributes. The classic Employee table is a good example of a horizontal listing of attributes (fname, lname, SS#, address, etc.) because the employee entity has a "fixed" number of attributes, so we can add columns across.
For the documentSearchKey entity, attributes can be considered search keys or WHERE-clause values. The documentSearchKey entity has a variable number of attributes (docType A has 5 keys, docType B has 15 keys, etc.). For this example, each docType lives inside its own table, so there is no problem with mixing a variable number of attributes inside the same table; i.e., we will assume this table has 20 keys vertically or 20 columns horizontally, as defined below.
The problem is whether to add 20 columns across, or to add 3 columns and create a non-normalized DB so additional keys can be added at will.
The proposed table would contain 3 columns (docID, KeyID, KeyValue). Of course, 10 keys for 1 million documents creates 10 million rows, versus the traditional table where 1 million documents stay 1 million rows (keys are columns) and some columns contain blanks or nulls.
Which design is better in terms of searching and performance? Books and links are welcome as well. This is a specific question about a production issue.
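To make the trade-off concrete, here is how a search on two keys looks in each shape (table and key names are made up):

-- Horizontal (one column per key): one row per document, simple predicates,
-- one index per searchable column.
SELECT docID
FROM DocTypeA
WHERE Key03 = 'X' AND Key07 = 'Y'

-- Vertical (docID, KeyID, KeyValue): one self-join per key in the search,
-- but new keys need no schema change.
SELECT a.docID
FROM DocSearchKey AS a
JOIN DocSearchKey AS b ON b.docID = a.docID
WHERE a.KeyID = 3 AND a.KeyValue = 'X'
  AND b.KeyID = 7 AND b.KeyValue = 'Y'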
I have a question about the Microsoft Clustering algorithm. When we train the clustering model, we get clusters based on the training data. So what is the relationship among all the attributes within each cluster? When we summarize the characteristics of each cluster - for example, based on the criterion attribute A=X we get the darkest cluster for that criterion, and along with this characteristic (A=X) we also get other characteristics - what is the exact relationship among all these characteristics? It seems they don't have any relationship to each other at all. (A=X does not mean that B=Y is most likely given A=X; it only means that within this cluster A=X is most likely and B=Y is most likely, and A=X has the largest population within this cluster.) I therefore can't see why these characteristics are really interesting.
Looking forward to any guidance and advice on this.
The query below (from Adventure Works) displays the sales amount for three products and a custom member, "aggregation", which is the aggregate of these three products, cross-joined with the "Color" attribute.
Code Snippet
WITH MEMBER [Product].[Product Categories].[Subcategory].&[31].[aggregation] AS
    'AGGREGATE({ [Product].[Product Categories].[Product].&[214],
                 [Product].[Product Categories].[Product].&[215],
                 [Product].[Product Categories].[Product].&[220] })'
SELECT { [Date].[Calendar].[All Periods] } ON COLUMNS ,
Can someone please explain to me why I'm getting this result:
All Periods
Sport-100 Helmet, Red Red 39328.1586
Sport-100 Helmet, Black Black 12098.0788
Sport-100 Helmet, Blue Blue 13331.5816
aggregation Black 64757.819
aggregation Blue 64757.819
aggregation Red 64757.819 (note that 64757.819 is the total of the three products)
instead of something like this:
All Periods
Sport-100 Helmet, Red Red 39328.1586
Sport-100 Helmet, Black Black 12098.0788
Sport-100 Helmet, Blue Blue 13331.5816
aggregation Black 12098.0788
aggregation Blue 13331.5816
aggregation Red 39328.1586
Also, does anyone know of a possible way of getting the second type of result?
Please note that if I create a custom member that aggregates members of any other level of the Product Categories hierarchy, the problem doesn't exist (see code and results below).
Code Snippet
WITH MEMBER [Product].[Product Categories].[Category].&[4].[Aggregation] AS
    'AGGREGATE({ [Product].[Product Categories].[Subcategory].&[31],
                 [Product].[Product Categories].[Subcategory].&[32] })'
SELECT
    { [Date].[Calendar].DEFAULTMEMBER } ON COLUMNS,
    NON EMPTY
    { { { [Product].[Product Categories].[Subcategory].&[31],
          [Product].[Product Categories].[Subcategory].&[32],
          [Product].[Product Categories].[Category].&[4].[Aggregation] }
        * { [Product].[Color].[All Products].CHILDREN } } } ON ROWS
FROM [Adventure Works]
WHERE ( [Measures].[Reseller Sales Amount] )
Hi, I have a problem. I am writing a stored procedure which contains an "EXECUTE" statement that executes a query and retrieves the attributes I want. Calling the procedure works fine and I am able to get the result,
but I am not able to build the report, because the dataset does not list the attributes.
My procedure is like this:
--------------------------------------------
USE [HOST_BPM_COVLTCP]
GO

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

ALTER PROC [dbo].[PB_Report_GetProjectAttributes]
    @intProjectId INT
AS
BEGIN

SELECT @SRC_ATTRIBUTE_COLUMNS = (CASE WHEN @SRC_ATTRIBUTE_COLUMNS IS NULL
                                      THEN '' ELSE ',' + @SRC_ATTRIBUTE_COLUMNS END),
       @STR_ATTRIBUTES = ISNULL(@STR_ATTRIBUTES, ''''' DUMMY_COL'),
       @SRC_ATRIBUTE_NAMES = ISNULL(@SRC_ATRIBUTE_NAMES, ' '''' WHERE 1 <> 1')

EXEC (
    ' SELECT DP.IDX PROJECT_ID,
             dbo.FindAndReplace(DP.CODE)  [Project Code],
             dbo.FindAndReplace(DP.NAME)  [Project Name],
             dbo.FindAndReplace(DP.LABEL) [Project] '
    + @SRC_ATTRIBUTE_COLUMNS +
    ', ISNULL(DP.CREATED_BY,'''')  AS CREATED_BY,
       ISNULL(DP.MODIFIED_BY,'''') AS MODIFIED_BY,
       DP.CREATED_DATE,
       DP.MODIFIED_DATE
      FROM DIM_PROJECT DP,
           ( SELECT ' + @STR_ATTRIBUTES + ', PROJECT_ID
               FROM PB_PROJECT_ATTRIBUTE_VALUE
              WHERE PROJECT_ID = ' + @intProjectId + '
              GROUP BY PROJECT_ID ) SRC
     WHERE DP.IDX = ' + @intProjectId + '
       AND DP.IDX *= SRC.PROJECT_ID
     ORDER BY DP.LABEL '
)
----------
Actually, this procedure should return the following attributes:
Hi, I am a novice data miner, working primarily in the BI field. I want to learn more about data mining, so I am doing some experimenting.
I have a question regarding input attributes. I am particularly wondering about the Neural Network algorithm, but also about data mining in general. What I am thinking about is whether, and if so to what extent, I should create derived attributes for the algorithms. I'll try to clarify with an example:
Let's say I am analysing sales performance for departments in a large company. Some of those departments have a high staff turnover, which might affect sales negatively (although I don't know that...). The high staff turnover could be detected, by the algorithm and by humans, by looking at each sale and which salesperson handled it. If there are a lot more distinct salespersons in departments of the same size and during the same time period, this is a sign of high staff turnover.
Now, is this information enough for the algorithm? Or should I add a column to the case dataset where I discretize the staff turnover as "High, Medium, Low"? Does this help the algorithm, or can it hurt performance?
I hope you get the idea of my question; otherwise, just ask!
Is there a way to either create a view or a query table that converts the data like that for me? I need the new view/query to update when the original table updates, so a new table isn't an option.
In SQL 2005, is it possible to read the date modified of a file located on the server's hard drive? Is there a procedure/function that would allow you to do so?
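As far as I know there is no documented built-in function for this in SQL 2005. One workaround (sketch only; it requires xp_cmdshell to be enabled, which has security implications, and the path below is just a placeholder) is to capture the output of dir, which includes the modified date:

EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'xp_cmdshell', 1
RECONFIGURE

CREATE TABLE #DirOutput (Line NVARCHAR(4000))

INSERT INTO #DirOutput
EXEC xp_cmdshell 'dir "C:\SomeFolder\SomeFile.txt"'

-- The line naming the file starts with its modified date and time.
SELECT Line FROM #DirOutput WHERE Line LIKE '%SomeFile.txt%'

A SQLCLR procedure wrapping System.IO.File.GetLastWriteTime is the cleaner alternative if CLR integration is enabled.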
I have a fact table that has terminations. Fields include EmployeeName, TermDate, TermReason, and HireDate, et al.
I need to make EmployeeName available to drillthrough, and since it's a varchar field I can't make it a measure, so it has to be a dimension attribute. My question is, should I leave the fact table as it is and use SSAS to create a dimension that contains only EmployeeName and the link to TerminationID? Or should I redesign the OLAP tables so that EmployeeName is in a separate table?
[crossposted] Hi, I wonder if anyone might lend me a brain. I have a stock database to build that covers over 1000 products, which might be said to exist in around 50 product families. Obviously, just to be awkward, all the types of stock will have different attributes. So one product might be a tube with inside/outside diameter and length, and another a T-shaped cable joint. All I can come up with is a separate table for each stock type family, and storing the table name and product code in the main stock table, so:
Tables: ProdA, ProdB, ProdC, Stock
Stock attributes: ProdId, ProdTable, Amount, Date, etc.
ProdA attributes: ProdId, AttributeX, AttributeY, AttributeZ, etc.
Then use code to parse the table name and product ID to select the correct query to get the product details. BUT this seems awfully inelegant and potentially wrong, so I'm loath to continue down this route. Can anyone tell me the "right" way to do this? I feel sure it must be a classic db design exercise, but unfortunately one they didn't teach us at university... or maybe I was asleep. Thanks!
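One classic answer to this is the supertype/subtype pattern: a common Product table holds everything shared, each family gets its own detail table sharing the same key, and Stock only ever references ProductId (so no table names are stored in the data). A rough sketch with made-up names and attributes:

CREATE TABLE Product (
    ProductId   INT PRIMARY KEY,
    FamilyCode  VARCHAR(10)   NOT NULL,   -- which family/detail table applies
    Description NVARCHAR(200) NOT NULL
)

CREATE TABLE ProductTube (                -- one detail table per family
    ProductId       INT PRIMARY KEY REFERENCES Product (ProductId),
    InsideDiameter  DECIMAL(8,2) NOT NULL,
    OutsideDiameter DECIMAL(8,2) NOT NULL,
    Length          DECIMAL(8,2) NOT NULL
)

CREATE TABLE Stock (
    ProductId INT      NOT NULL REFERENCES Product (ProductId),
    Amount    INT      NOT NULL,
    StockDate DATETIME NOT NULL
)

Queries that need family-specific attributes join Product to the relevant detail table; queries about stock levels never need to know which family a product belongs to. The other common answer is an attribute table of name/value rows per product, which avoids per-family tables at the cost of losing typed columns.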
I've upgraded MS-Access 2002 to an MS-Access Project (adp), so now I have to deal with more sophisticated queries (may I call them so?) like stored procedures. I have a form with a combobox for selections and a textbox to enter a certain value. Let us say I call the combobox @select and the textbox @find. The combobox always shows the first of the items to select. Now I want to return a message if nothing is found, or if nothing has been entered at all. The stored procedure reads, for instance:
ALTER PROCEDURE OPC
    @select nvarchar(20),
    @find nvarchar(100)
AS
SET NOCOUNT ON
IF @find IS NULL
BEGIN
    PRINT 'You didn''t enter any value'
    RETURN
END
IF @select = 'Author'
BEGIN
    SELECT *
    FROM Books
    WHERE Author LIKE '%' + @find + '%'
    ORDER BY Author, Title, Publisher, Year
END
[And so on]
RETURN
This works correctly when I enter something, but when I don't enter any value, a message box pops up saying that the stored procedure has been executed but no records were found. I want to see an error message like the one above. It's as simple as
ALTER PROCEDURE Hello
AS
BEGIN
    PRINT 'Hello'
END
and nothing is seen. Does anybody know? Thank you.
Wim
Processing an Association Rules model on SQL 2005 Standard edition produced the following error:
"Error (Data mining): The 'WO_3' mining model has 6690 attributes. This number of attributes exceeds the attribute limit of 5000 allowed by the current version of the algorithm associated with the mining model."