3-4-5 Rule
Feb 2, 2008
Hi,
I came across something called the 3-4-5 rule while going through a data mining book, but I couldn't work out where the rule comes from or how it really works.
Can anyone explain this rule?
Thank you
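For readers with the same question: as it is usually presented in data mining textbooks, the 3-4-5 rule is a heuristic for cutting a numeric range into "natural" equal-width intervals, based on how many distinct values the range covers at its most significant digit. A minimal T-SQL sketch of the top-level step follows; the sample range and all variable names are assumptions made for illustration.

```sql
-- Sketch of the top-level step of the 3-4-5 rule.
-- The sample range and variable names are assumptions for illustration.
DECLARE @low int, @high int, @msd int, @distinct int, @parts int;
SET @low  = -1000000;   -- low end, already rounded down at the most significant digit
SET @high =  2000000;   -- high end, already rounded up at the most significant digit
SET @msd  =  1000000;   -- unit of the most significant digit

-- number of distinct values the range covers at the most significant digit
SET @distinct = (@high - @low) / @msd;   -- 3 for this sample

SET @parts = CASE
    WHEN @distinct IN (3, 6, 7, 9) THEN 3   -- 7 is usually split 2-3-2 rather than equally
    WHEN @distinct IN (2, 4, 8)    THEN 4
    WHEN @distinct IN (1, 5, 10)   THEN 5
END;

-- list the resulting equal-width intervals
SELECT @low + (n.i * (@high - @low) / @parts)       AS interval_start,
       @low + ((n.i + 1) * (@high - @low) / @parts) AS interval_end
FROM (SELECT 0 AS i UNION ALL SELECT 1 UNION ALL SELECT 2
      UNION ALL SELECT 3 UNION ALL SELECT 4) AS n
WHERE n.i < @parts;
```

The same step can then be applied again inside each interval to build a deeper hierarchy. If the distinct count matches none of the three cases, @parts stays NULL and the query returns no rows.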
Hi,
How do I extract rules and their contents from a database?
I can list the rules through the sys.objects view, but where can I get their definitions?
Regards
Marcelo Gamba
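One possible way to read the rule text, sketched under the assumption of SQL Server 2005 or later: stand-alone rules appear in sys.objects with type 'R', and their CREATE RULE text is exposed through sys.sql_modules. The rule name passed to sp_helptext is hypothetical.

```sql
-- Lists each stand-alone rule together with its CREATE RULE text.
SELECT o.name       AS rule_name,
       m.definition AS rule_text
FROM sys.objects AS o
JOIN sys.sql_modules AS m
  ON m.object_id = o.object_id
WHERE o.type = 'R';   -- 'R' = rule (old-style, stand-alone)

-- For a single rule, sp_helptext prints the same text.
-- 'dbo.MyRule' is a hypothetical name; substitute a real rule.
EXEC sp_helptext 'dbo.MyRule';
```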
How can I set up the databases in SQL Server so that when I change the data in one table, the changes cascade down to the tables in my other databases? One database would hold a primary key table. If I had 15 other databases, I could then somehow link them so that data changed in the primary key table of the first database would cascade down to the others.
Thanks
I wonder whether nested BEGIN/END statements like these are valid or invalid...
Code:
BEGIN
    BEGIN
        --Query #1 (blah)...
    END
    WHILE EXISTS (SELECT TOP 1 * FROM #tmpTblPurchaseRaw)
    BEGIN
        BEGIN
            --Query #2 (blah)...
        END
        BEGIN
            --Query #3 (blah)...
        END
    END
END
I have no idea if nested BEGIN/END is allowed or not...
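Nested BEGIN...END blocks are allowed in T-SQL. The sketch below is a runnable version of the same shape; the column name, the PRINT placeholders and the DELETE step are assumptions added so that the loop actually terminates (DELETE TOP needs SQL Server 2005 or later).

```sql
-- Nested BEGIN...END blocks; the temp table contents are made up for the demo.
CREATE TABLE #tmpTblPurchaseRaw (id int);
INSERT INTO #tmpTblPurchaseRaw (id) VALUES (1);
INSERT INTO #tmpTblPurchaseRaw (id) VALUES (2);

BEGIN
    BEGIN
        PRINT 'Query #1';
    END
    WHILE EXISTS (SELECT * FROM #tmpTblPurchaseRaw)
    BEGIN
        BEGIN
            PRINT 'Query #2';
        END
        BEGIN
            -- remove one row per pass; without this the WHILE never ends
            DELETE TOP (1) FROM #tmpTblPurchaseRaw;
        END
    END
END

DROP TABLE #tmpTblPurchaseRaw;
```

Note that a block holding nothing but a comment does not parse, so each BEGIN...END needs at least one statement.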
I am getting some data from another database and only want to copy sensible data...
How do I write a validation rule in SQL (SQL Server 2000) for a fax number, so that it only contains digits, i.e. the characters 0-9?
Thanks in advance
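A common way to express "digits only" is a CHECK constraint with a negated character class in LIKE. This is a minimal sketch; the table, column and constraint names are assumptions.

```sql
-- Table, column and constraint names are assumptions for illustration.
CREATE TABLE Contact (
    ContactID int IDENTITY(1,1) PRIMARY KEY,
    Fax       varchar(20) NULL
        CONSTRAINT CK_Contact_Fax_Digits
        CHECK (Fax NOT LIKE '%[^0-9]%')   -- rejects any character outside 0-9
);

INSERT INTO Contact (Fax) VALUES ('02079460000');   -- accepted
INSERT INTO Contact (Fax) VALUES ('0207-946');      -- fails the CHECK constraint
```

NULL and the empty string both pass this check, so add further conditions if those should be rejected. The same predicate also works in a WHERE clause to filter rows while copying them from the other database.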
I have a query that maps all products to some customer levels. In this case levels 0,5,7 and 8
DELETE FROM ProductCustomerLevel
WHERE CustomerLevelID IN (0, 5, 7, 8)
INSERT ProductCustomerLevel
(
ProductID,
CustomerLevelID
[Code] ....
Basically this maps all products in a database to these customer levels so that they get a discount.
I now need to create a new customer level, for example number 9. It will only have 1 or 2 products applied to it.
How can I change the SQL above so that it does not map the products already in customer level 9 to levels 0, 5, 7 and 8?
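One possible shape for this, as a sketch only: the elided part of the original INSERT is unknown, so the Product table and the cross join against the four levels are assumptions. The relevant addition is the NOT EXISTS filter.

```sql
-- Sketch only: Product and the derived level list are assumptions,
-- because the original SELECT is not shown in full.
DELETE FROM ProductCustomerLevel
WHERE CustomerLevelID IN (0, 5, 7, 8);

INSERT ProductCustomerLevel (ProductID, CustomerLevelID)
SELECT p.ProductID, l.CustomerLevelID
FROM Product AS p
CROSS JOIN (SELECT 0 AS CustomerLevelID UNION ALL SELECT 5
            UNION ALL SELECT 7 UNION ALL SELECT 8) AS l
WHERE NOT EXISTS (SELECT 1
                  FROM ProductCustomerLevel AS x
                  WHERE x.ProductID = p.ProductID
                    AND x.CustomerLevelID = 9);   -- skip products already in level 9
```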
Hi, I have a database which stores data about bus links. I want to give passengers information about the price of their journey. The price depends on three factors: the starting bus stop, the ending bus stop and the type of ticket (full, partial for students and old people, ...).
So I created a table with three foreign key constraints (two for bus stops and one for the ticket type).
When a bus stop or a ticket type is deleted, I want all data connected with it to be deleted automatically. I wanted to use cascading deletes.
But I receive the following error: Introducing FOREIGN KEY constraint 'FK_TicketPrices_BusStops1' on table 'TicketPrices' may cause cycles or multiple cascade paths. Specify ON DELETE NO ACTION or ON UPDATE NO ACTION, or modify other FOREIGN KEY constraints.
How can I achieve my task? Why should it cause cycles or multiple cascade paths?
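The likely cause is that TicketPrices references BusStops twice, so two cascading foreign keys would give two delete paths from the same parent table into the same child table, which SQL Server refuses. One possible workaround, sketched here, is to declare both bus stop foreign keys with ON DELETE NO ACTION and do the cascade by hand in an INSTEAD OF DELETE trigger. The column names BusStopID, StartStopID and EndStopID are assumptions.

```sql
-- Column names (BusStopID, StartStopID, EndStopID) are assumptions.
-- Both foreign keys to BusStops are assumed to be ON DELETE NO ACTION.
CREATE TRIGGER trg_BusStops_Delete ON BusStops
INSTEAD OF DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- remove dependent prices first so the foreign keys are not violated
    DELETE tp
    FROM TicketPrices AS tp
    JOIN deleted AS d
      ON tp.StartStopID = d.BusStopID
      OR tp.EndStopID   = d.BusStopID;

    -- then perform the delete that fired the trigger
    DELETE bs
    FROM BusStops AS bs
    JOIN deleted AS d
      ON bs.BusStopID = d.BusStopID;
END
```

The foreign key to the ticket type table is a single path and can keep ON DELETE CASCADE.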
Hi,
I have a table with the following columns:
ID INTEGER,
Name VARCHAR(32),
Surname VARCHAR(32),
GroupID INTEGER,
SubGroupOneID INTEGER,
SubGroupTwoID INTEGER
How can I create a rule/default/check which updates the SubGroupOneID & SubGroupTwoID columns when GroupID is, for example, equal to 15 on MSSQL2000?
It is impossible to make changes on the client, so I need to check the inserted/updated value of the GroupID column and automatically update the SubGroupOneID & SubGroupTwoID columns.
Sincerely,
Rustam Bogubaev
I have run into a .. somewhat of a "duh" question. I'm running association rule to run a basket analysis, and I'm trying to get probability of each prediction. I know this is wrong, but how do I go about running PredictProbability on each ProductPurchase prediction?
When I run the below DMX query, I get this error message...
Error (Data mining): the dot expression is not allowed in the context at line 5, column 25. Use sub-SELECT instead.
Thanks in advance...
-Young K
SELECT
t.[AgeGroupName]
, t.[ChildrenStatusName]
, (Predict([Basket Analysis AR].[Training Product], 3)) as [ProductPurchases]
, (PredictProbability([Basket Analysis AR].[Training Product].[ProductName])) as [ProductPurchases]
From
[Basket Analysis AR]
PREDICTION JOIN
OPENQUERY([DM Reports DM],
'SELECT
[AgeGroupName]
, [ChildrenStatusName]
FROM
[dbo].[DM.BasketAnalysis.Contact]
WHERE isTrainingData = 0
') AS t
ON
[Basket Analysis AR].[Age Group Name] = t.[AgeGroupName]
AND [Basket Analysis AR].[Children Status Name] = t.[ChildrenStatusName]
I haven't been able to find a DMX query which will spit out the cases which support a particular association rule. I was hoping it would work sort of like drillthrough but show only the cases supporting a particular rule. Am I missing something?
What I ended up doing was extracting the itemsets of the rule from the model's content then running a SQL query to retrieve the cases that contain both the left-hand and right-hand itemset of the rule. I'm hoping there's a better way.
---------------------------------------
Short explanation:
this function generates a daily shift pattern 1,1,2,2,3,3,4,5,...
(shift 1 = morning, shift 2 = evening, shift 3 = night, ...)
and it works OK.
-------------------------------------------------------------------------------------------------
How can I do this?
I want to take this function and add a new rule.
The function should still generate the daily shift pattern 1,1,2,2,3,3,4,5,...
The new rule applies only if the employee gets shift 2 or 3 on a Thursday
(the weekend runs from Thursday until Sunday morning).
In that case the order for this employee must be 2,2,2 or 3,3,3.
To explain: the employee must start the weekend and finish it on the same shift,
but only if a series of 2s or 3s (2 = evening, 3 = night) starts on Thursday.
After that the pattern continues:
if the employee starts shift 2 (evening) on Thursday, the sequence is 2,2,2 then 3,3,4,5,1,1,2,2,3,3,4,5,...
if the employee starts shift 3 (night) on Thursday, the sequence is 3,3,3 then 4,5,1,1,2,2,3,3,4,5,...
So if the employee starts a series of 2s or 3s on Thursday, that series must run through the whole weekend, from Thursday until Sunday morning.
Can someone help me with how to do this?
Code Block
-- need a list of employee ids with a basedate set to when they start with shift_code=1, unit=1
-- this is a minimal table to show the format
-- extra columns could be added with other info (e.g. name)
create table empbase (
empid int,
basedate datetime
)
-- fill with test data
insert empbase (empid,basedate) values (12345,'2007/1/1')
insert empbase (empid,basedate) values (88877,'2007/1/5')
insert empbase (empid,basedate) values (98765,'2007/1/20')
insert empbase (empid,basedate) values (99994,'2007/6/5')
go
-------------------------------
create function shifts (
@mth tinyint,
@yr smallint
)
returns
@table_var
table (
empid int,
date datetime,
shift_code int,
unit int)
as
-- generate daily shift pattern 1,1,2,2,3,3,4,5,... changing units 1,2,3,4,... every 30 days.
begin
declare @d1 datetime
declare @d31 datetime
set @d1=convert(datetime,convert(char(8),@yr*10000+@mth*100+1))
set @d31=dateadd(dd,-1,dateadd(mm,1,@d1))
;with n01 (i) as (select 0 as 'i' union all select 1)
,seq (n) as (
select
d1.i+(2*d2.i)+(4*d3.i)+(8*d4.i)+(16*d5.i) as 'n'
from
n01 as d1
cross join
n01 as d2
cross join
n01 as d3
cross join
n01 as d4
cross join
n01 as d5)
,dates (dt) as (
select
dateadd(dd,n,@d1) as 'dt'
from
seq
where
dateadd(dd,n,@d1) <= @d31)
,modval (mod,val) as (
select 0,1 union all
select 1,1 union all
select 2,2 union all
select 3,2 union all
select 4,3 union all
select 5,3 union all
select 6,4 union all
select 7,5)
insert @table_var
select
b.empid,
d.dt,
(select val from modval where mod=(datediff(dd,b.basedate,d.dt) % 8)),
((convert(int,(datediff(dd,b.basedate,d.dt) / 30)) % 4) + 1)
from
empbase b, dates d
where
b.basedate <= d.dt
return
end
go
-- test for various months
select * from shifts(1,2007) order by empid,date
select * from shifts(2,2007) order by empid,date
select * from shifts(3,2007) order by empid,date
select * from shifts(4,2007) order by empid,date
select * from shifts(5,2007) order by empid,date
select * from shifts(12,2007) order by empid,date
Can anyone tell me how the Business Intelligence Studio calculates the importance of a rule? I can't find the formula. I know some formulas, but the result in SQL Server is completely different.
Thanks!
Hi,
I need to create the database design for a pretty complex project. We have data coming from a feed and being stored in a table. We need to provide a UI for users to create custom "rules". Each "rule" has to be fully customizable. Here are some examples of possible rules, tailored to the Northwind DB :
1. ( Avg of all Orders with OrderValue > 10 ) / (Median Price of all Products)
2. 50% of the value by which the Product Price exceeds a threshold value of 10
These are just sample rules, there might be many more similar to this.
I am basically looking for pointers on the kind of architecture that would make this sort of customization possible. I am currently thinking of getting the data into a DataSet and storing the custom rules that the user creates as DataSet expressions in the DB. When the user chooses to apply a certain rule, the DataSet expression gets evaluated and returns a value accordingly.
Any help is appreciated.
Hi
In Access I can make a validation rule.
For example, if I have a "to date" column,
I can make a rule saying that the data in this field
must be > a given date.
Can I do this in SQL as well, and how?
regards
alvin
I have a table in a database that keeps getting duplicate records added to it.
Is there a way to set a rule so that if someone tries to add a duplicate record for that field, it will stop the record from going in?
I know creating an index would be the proper way to do this but:
1. The application does not belong to us.
2. Duplicates already exist in the table for the database.
Basically I am trying to do the most I can without making a lot of changes to the database.
Any help would be appreciated.
Thanks
I am currently importing tick data for a stock. Let's say my table structure is like this:
CREATE TABLE tick
(
tickId bigint identity(1,1) primary key
, tickTime datetime
, price money
)
If the stream of data I get resembles:
'4/17/12 2:00:00.000', 10.00
'4/17/12 2:00:02.000', 10.02
'4/17/12 2:00:01.000', 10.01
'4/17/12 2:00:03.000', 10.03
I want my table to look like this:
1, '4/17/12 2:00:00.000', 10.00
2, '4/17/12 2:00:02.000', 10.02
3, '4/17/12 2:00:03.000', 10.03
Essentially ignoring the out of place '4/17/12 2:00:01.000' record. What is the least expensive way to accomplish this?
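One inexpensive option, sketched under the assumption that ticks arrive one at a time, is to make the INSERT conditional on there being no stored tick at or after the incoming time. @tickTime and @price are assumed names for the incoming values, and the index name is hypothetical.

```sql
-- Run once: an index on tickTime keeps the existence check cheap.
CREATE INDEX IX_tick_tickTime ON tick (tickTime);

-- Per incoming tick (variable names are assumptions).
DECLARE @tickTime datetime, @price money;
SET @tickTime = '2012-04-17T02:00:01';
SET @price    = 10.01;

-- Insert only when nothing stored is at or after the incoming time.
INSERT INTO tick (tickTime, price)
SELECT @tickTime, @price
WHERE NOT EXISTS (SELECT 1 FROM tick WHERE tickTime >= @tickTime);
```

As written, a tick with the same timestamp as a stored one is also skipped; change >= to > if equal timestamps should be kept.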
Hi,
I basically have two tables with the same structure. One is an archive of the other (backup). I want to essentially insert the data into the other.
I use:
INSERT INTO table ( column, column .... )
SELECT * FROM table2
Now, table2 has a rule on various columns:
@CHARACTER IN ('Y','N')
but the column allows nulls; the design view says so anyway.
When I run this query I get:
A column insert or update conflicts with a rule imposed by a previous CREATE RULE statement. The statement was terminated. The conflict occurred in database 'database', table 'table', column 'column'.
The statement has been terminated.
Obviously, I've changed the names of everything.
The only data in those columns which could possibly conflict with the rule is the NULL value. Any ideas why this doesn't work?
Thanks.
Hi there,
I have a small problem with defaults and rules.
After creating a rule or default, we bind it to a table.
I have bound a rule to several tables. How can I see the list of objects that depend on this rule or default?
I know the sp_depends stored procedure shows dependent objects, but I could not get it to work for this. The help says sp_depends works for all objects in the database, such as tables, views and so on. Defaults and rules are objects too, yet I could not get results for them.
Please let me know as early as possible if you have an answer.
Thank you very much. Please don't suggest the SQL-DMO Listboundedcolumns function.....
Thanks
Ramki
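On SQL Server 2005 and later, the binding is recorded on the column itself, so the catalog views can be queried directly instead of going through sp_depends. A minimal sketch follows; 'dbo.MyRule' and 'dbo.MyDefault' are hypothetical object names.

```sql
-- Columns bound to a given rule ('dbo.MyRule' is a hypothetical name).
SELECT OBJECT_NAME(c.object_id) AS table_name,
       c.name                   AS column_name
FROM sys.columns AS c
WHERE c.rule_object_id = OBJECT_ID('dbo.MyRule');

-- Columns bound to a given default ('dbo.MyDefault' is a hypothetical name).
SELECT OBJECT_NAME(c.object_id) AS table_name,
       c.name                   AS column_name
FROM sys.columns AS c
WHERE c.default_object_id = OBJECT_ID('dbo.MyDefault');
```

Rules and defaults can also be bound to user-defined data types, which these two queries do not list. On SQL Server 2000 the equivalent information should be in syscolumns (the domain and cdefault columns), though that is worth checking against the documentation for that version.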
I have a query with 2 subqueries, and no error message is reported, but, my problem is that the 2 subqueries do not follow the GROUP BY rule and show the total instead of by vendor...
Code:
SELECT Table1.agents AS Vendor
, Count(Table1.carS) AS Car_Sold
, Sum(Table1.carP) AS Car_Price
, Count(Table1.busS) AS MortBus_Sold
, Sum(Table1.busP) AS busPRice
[Code] ....
It's often said that when inserting into or updating a 'large' table, disabling the non-clustered indexes is needed for performance.
Now I know the obvious way to find out if this is best or not is by testing the different options. I was wondering if there was a rule of thumb for this?
Say you have a table with half a billion rows and 4 non-clustered indexes and are only updating half a million rows; then disabling the indexes every night and re-enabling them can sometimes take far more time than the actual update. I haven't found any articles advising to disable them when a table is over X rows and you are updating Y% of them...
CREATE TABLE EDI_data_proc_log(
ID int IDENTITY(1,1),
comment VARCHAR(3000),
time_recorded DATETIME DEFAULT GETDATE(),
run_by varchar(100),
duration int );
When a record is inserted I would like the duration column to be computed. This should happen only after the first record has been inserted into the table. You might say a trigger would be the best. OK then, show me the syntax.
Or I am thinking can we write a user defined function that will compute the value for the duration column.
--By default, I would like to update the duration column as follows:
--It should record the number of seconds between the last insertion ( You can get that time from the time_recorded column from the previous record and the current time can be obtained from the getdate() function )
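A trigger along these lines is one way to do it. This is a sketch, assuming SQL Server 2005 or later (for CROSS APPLY) and assuming that "previous record" means the row with the next lower ID; the trigger name is hypothetical.

```sql
-- Fills duration with the seconds elapsed since the previous row's time_recorded.
-- The trigger name is hypothetical; "previous" is taken to mean the next lower ID.
CREATE TRIGGER trg_EDI_data_proc_log_duration
ON EDI_data_proc_log
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE l
    SET duration = DATEDIFF(second, p.time_recorded, l.time_recorded)
    FROM EDI_data_proc_log AS l
    JOIN inserted AS i
      ON i.ID = l.ID
    CROSS APPLY (SELECT TOP (1) x.time_recorded
                 FROM EDI_data_proc_log AS x
                 WHERE x.ID < l.ID
                 ORDER BY x.ID DESC) AS p;   -- no earlier row: nothing to update
END
```

For the very first row there is no earlier record, so CROSS APPLY returns nothing and duration stays NULL. The sketch uses the new row's time_recorded rather than calling GETDATE() again, which is the same value when the column default is used.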
This is a general question about data modeling. I'm more curious than anything else.
There is much talk about over-training a data model, and I'm sure there is under-training as well. As a rule of thumb, depending on the algorithm, what is a good ratio of attributes to data points?
-Young K
I read somewhere that market basket analysis finds rules with substitutes as likely as rules with complements due to a consumer behavior called "horizontal variety seeking". This is when customers buy more than one product in the same category even though they are substitutes. For example, when people go to the grocery store and buy soda, they buy Coke and Sprite at the same time even though they are substitutes for each other. I was wondering if anyone has experience with this anomaly and how they solved it. I found a time series model called the vector autoregressive model, which is used to find the elasticity of prices over a time period. Does anyone have experience working with the VAR model? I am having trouble figuring out what some of the variables in the model are.
Below is the paper
http://www.feb.ugent.be/fac/research/WP/Papers/wp_04_262.pdf#search='VAR%20model%20market%20basket%20analysis'
Hi,
I am a Microsoft BI Developer and currently working on Pharmaceutical BI project. In this project, Client wants to integrate his Blaze Advisor rule engine to SSIS so that he can change the rules in Blaze advisor any time and see the effect of it on the source data. Hence, my question is:
How can I integrate "Blaze Advisor" with "SQL Server Integration Services" (the Microsoft SQL Server ETL tool) so that the transformation task uses my business rules (written in Blaze Advisor) and processes all my source data with the same business logic?
My attempts to solve this problem:
I have written the rule in Blaze Advisor and imported its rules into a .NET solution, which includes *.Server, *.Client and some other files. I have used the DLLs from this solution in my SSIS script task, but it does not work: it demands the *.Server and *.Client files there.
- Can you suggest a way to integrate SSIS with Blaze Advisor?
- How can I use Blaze Advisor's .NET output files as DLLs in my custom transformation?
I'd be really grateful if you could suggest an approach for this particular business problem.
Thank You!
Regards,
Sandeep
SQL BPA says the following:
"One or more objects are referencing tables/views without specifying a schema! Performance and predictability of the application may be improved by specifying schema names."
"When SQL Server looks up a table/view without a schema qualification, it first searches the default schema and then the 'dbo' schema. The default schema corresponds to the current user for ad-hoc batches, and corresponds to the schema of a stored procedure when inside one. In either case, SQL Server incurs an additional runtime cost to verify schema binding of unqualified objects. Applications are more maintainable and may observe a slight performance improvement if object references are schema qualified."
How important is it to specify the schema (dbo. in my case) in stored procedures? Will it really improve performance if I go and fix each object that is missing "dbo."?
The problem is I have thousands and thousands of them with no schemas. Before I invest a lot of time fixing them, I am trying to determine if it's really worth it or not.
Thank you
What is the best practice in setting a minimum support threshold for market basket analysis? Is there a formula? Does it depend on ROI you predict?
Hello,
I use ODBC driver to perform SQLServer commands from C/C++ application.
An "INSERT INTO <table> (<column>) VALUES (NULL)" command has a random behavior in a SQL2000 Server running on WindowsXP.
The <column> in this command has 2 definitions about the NULL value :
- the NULL is accepted in the table definition, with "<column> ut_oui_non NULL".
- the NULL is rejected in the type definition, with EXEC sp_addtype ut_oui_non, 'char(1)', 'NOT NULL' and a rule to check values with '0' or '1'
1/ The column definition in any explorer shows the NULL from the table definition
2/ The "INSERT INTO" is completed in SQL Query tool, used on Windows2000 and WindowsXP computers, connected to the same SQL2000 server.
3/ The "INSERT INTO" is completed in the application, running on Windows2000 with an ODBC driver to the same SQL2000 server.
4/ The "INSERT INTO" is rejected in the application, running on WindowsXP with an ODBC driver to the same SQL2000 server. The error 513 means that INSERT VALUES conflicts with a previous rule. So only the type definition seems to be used.
But :
5/ This is a random error, and some INSERT with the same values in this column are completed.
6/ This random error seems to be discarded by using the "Use NULLs, paddings and warnings ANSI" checkbox in the ODBC driver user source configuration.
This checkbox is only used for enforcing the ANSI syntax in SQL commands, and has no known effect on type checking.
Do you know about any conflict of column NULL value between a type definition and a table definition ?
Hi,
is there a way to import a decision tree-model from pmml where a node contains two or more states of an attribute as the split-rule?
Example:
...
<Node recordCount="600">
<CompoundPredicate booleanOperator="or">
<SimplePredicate field="color" operator="equal" value="red" />
<SimplePredicate field="color" operator="equal" value="green" />
</CompoundPredicate>
<ScoreDistribution value="true" recordCount="200"/>
<ScoreDistribution value="false" recordCount="400"/>
</Node>
...
This node should contain all cases whose color is red or green (the Microsoft Decision Trees algorithm would build a model with two steps, like red / not red and then green / not green). According to the DMG, this is valid PMML 2.1, but when I try to import it, the server complains about an unexpected value in the SimplePredicate tag.
How can I import such a node in SQL Server 2005?
Thank you in advance for any help
Chris
Hello all.
First of all, I've been a reader of swynk.com for quite some time now, and I'd like to say 'thank you' to everyone who contributes.
Today, I'm the town moron.. haha I'm having issues with column level constraints. I have a varchar(50) where I want to keep *,=,#,/, .. etc, OUT OF the value input. I don't want to strip them. I simply want SQL to throw an error if the insert contains those (and other) characters. The only characters that I want in the column are A-Z and 0-9. However, it's not a set number of characters per insert. It always varies... There has to be an easier way to do this than creating a constraint for every possibility... Any help would be greatly appreciated.
tia,
Jeremy
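A single CHECK constraint can cover every length by rejecting any value that contains a character outside the allowed set. This is a sketch; the table, column and constraint names are assumptions.

```sql
-- Table, column and constraint names are assumptions for illustration.
CREATE TABLE Item (
    ItemID int IDENTITY(1,1) PRIMARY KEY,
    Code   varchar(50) NOT NULL
);

-- One constraint for all lengths: fail if any character is outside A-Z / 0-9.
ALTER TABLE Item
ADD CONSTRAINT CK_Item_Code_Alnum
CHECK (Code NOT LIKE '%[^A-Z0-9]%');

INSERT INTO Item (Code) VALUES ('ABC123');    -- accepted
INSERT INTO Item (Code) VALUES ('ABC#123');   -- raises a constraint violation
```

How the A-Z range treats lowercase or accented letters depends on the column's collation, so it is worth testing against the collation actually in use.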
I've been fixing some issues lately where weekly maintenance has been causing logs to grow and filling disks.
Is there any rule of thumb for allocating log space for doing reorgs and rebuilds in a worst case scenario? I'm thinking 3x the largest database size?
I've been watching them run on databases in the range of 50GB where the logs are growing well over that for rebuilds or even reorgs. Once you have a few databases like this on a server, you can suddenly eat through a lot of disk space just for holding logs during maintenance.
I deleted some records from an entity, and I'd like to keep the Codes contiguous and incremental, meaning no breaks between the code numbers. I created a business rule and applied it, but the codes remain the same.
I used the "Default to a generated Value" action, then selected the Code attribute and saved.
Then, back in the entity, I applied business rules. But nothing seemed to happen, as there was no change in the codes.
There are no validation errors either.
Dear all,
I have a table containing call records, and made a mining model from that table only. The model has 3 columns : calling_number, called_number, and target_operator, using Association Rule algorithm. The key is calling_number, input was operator, and predicted column called_number.
The result shows no rule, but there are results with item-set size of 1 (column) and 2 (column). On the top record of the result, SQL Server says there are 1891 support for called_number = 1891 and operator = 'INDOSAT'.
I queried the table with this query
SELECT DISTINCT calling_number
FROM call_records
WHERE called_number = '07786000815'
AND target_operator = 'INDOSAT';
It returns 2162 records instead of 1891. If I remove the DISTINCT qualifier, SQL Server returns 2159 records. Why do these counts differ from the mining result?
Thank you,
Bernaridho
Hi,
In Analysis Services there is an option to enforce a role either on the client side or the server side.
Can someone kindly advise what the recommended approach is, and what the real difference between the two options is?
Thanks.