Linear Regression For Column Values

Jul 24, 2006

This is a real challenge. I hope someone is smart enough to know how
to do this.

I have a table

TABLE1
[Column 1- 2001]
[Column 2- 2002]
[Column 3- 2003]
[Column 4 - 2004]
[Column 5 - 2005]
[Column 6 - 2006]
[Column 7 - Slope]

[2001][2002][2003][2004][2005][2006] [Slope]
[1] [2] [3] [4] [5] [6] [1]
[1.2] [.9] [4] [5] [5.4] [6.2] [?]

Slope is defined as "M" in the equation y=mx+b

I need a way a finding the linear equation that best fits the points so
I can have SQL calculate the slope.

Are there any smart people around that would know how to do this?

thanks

View 3 Replies

How Does Linear Regression Handle Missing Values For Prediction And For Training?

Sep 18, 2006

Q1. Model Prediction -- Suppose we already have a trained Microsoft Linear Regression Mining Model, say, target y regressed on two variables:

x1 and x2, where y, x1, x2 are of datatype Float. We try to perform Model Prediction with an Input Table in which some records consist of NULL x2 values. How are the resulting predicted y values calculated?

My guess:

The resulting linear regression formula is in the form:

y = constant + coeff1 * (x1 - avg_x1) + coeff2 * (x2 - avg_x2)

where avg_x1 is the average of x1 in the training set, and avg_x2 is the average of x2 in the training set (Correct?).

I guess that for some variable being NULL in the Input Table, Microsoft Linear Regression just treat it as the average of that variable in the training set.

So for x2 being NULL, the whole term coeff2 * (x2 - avg_x2) just disappear, as it is zero if we substitute x2 with its average value.

Is this correct?

Q2. Model Training -- Using the above example that y regressed on x1 and x2, if we have a train set that, say, consist of 100 records in which

y: no NULL value

x1: no NULL value

x2: 70 records out of 100 records are NULL

Can someone help explain the mathematical procedure or algorithm that produce coeff1 and coeff2?

In particular, how is the information in the "partial records" used in the regression to contribute to coeff1 and the constant, etc ?

View 1 Replies View Related

How Does Linear Regression Handle Missing Values For Prediction And For Training?

Sep 18, 2006

Q1. Model Prediction -- Suppose we already have a trained Microsoft Linear Regression Mining Model, say, target y regressed on two variables:

x1 and x2, where y, x1, x2 are of datatype Float. We try to perform Model Prediction with an Input Table in which some records consist of NULL x2 values. How are the resulting predicted y values calculated?

My guess:

The resulting linear regression formula is in the form:

y = constant + coeff1 * (x1 - avg_x1) + coeff2 * (x2 - avg_x2)

where avg_x1 is the average of x1 in the training set, and avg_x2 is the average of x2 in the training set (Correct?).

I guess that for some variable being NULL in the Input Table, Microsoft Linear Regression just treat it as the average of that variable in the training set.

So for x2 being NULL, the whole term coeff2 * (x2 - avg_x2) just disappear, as it is zero if we substitute x2 with its average value.

Is this correct?

Q2. Model Training -- Using the above example that y regressed on x1 and x2, if we have a train set that, say, consist of 100 records in which

y: no NULL value

x1: no NULL value

x2: 70 records out of 100 records are NULL

Can soemone help explain the mathematical procedure or algorithm that produce coeff1 and coeff2?

In particular, how is the information in the "partial records" used in the regression to contribute to coeff1 and the constant, etc ?

View 3 Replies View Related

How Linear Regression Choose His Regressor ?

Apr 22, 2007

I would like to understand the algorithm that the linear regression method uses to choose the regressors in the model from a list of possible regressors.

I think that it is different from the common methods used in statistics like stepwise, forward or backward.

Laura Lerner

View 8 Replies View Related

Linear Regression With Nested Explanation Variable

Jan 22, 2007

We are trying to create a model of linear regression with nested table. We used the create mining model sintax as follow :

create mining model rate_plan3002_nested2

( CUST_cycle LONG KEY,

VOICE_CHARGES double CONTINUOUS predict,

DUR_PARTNER_GRP_1 double regressor CONTINUOUS ,

nested_taarif_time_3002 table

( CUST_cycle long CONTINUOUS,

TARIFF_TIME text key,

TARIFF_VOICE_DUR_ALL double regressor CONTINUOUS

)

) using microsoft_linear_regression

INSERT INTO MINING STRUCTURE [rate_plan3002_nested2_Structure]

(CUST_cycle ,

VOICE_CHARGES ,

DUR_PARTNER_GRP_1 ,

[nested_taarif_time_3002](SKIP,TARIFF_TIME ,TARIFF_VOICE_DUR_ALL)

)

SHAPE {

OPENQUERY([Cell],

'SELECT CUST_cycle ,

VOICE_CHARGES ,

DUR_PARTNER_GRP_1

FROM dbo.panel_anality_3002

order by CUST_cycle ')}

APPEND

({OPENQUERY([Cell],

'select CUST_cycle,

TARIFF_TIME,

CYCLE_DATE

from dbo.nested_taarif_time_3002

order by CUST_cycle,TARIFF_TIME')

}

relate CUST_cycle to CUST_cycle

) as nested_taarif_time_3002

The results we got are a model with intercept only. if we don't use the nested variable (the red line) we get a rigth model . (we had more variable ....)

Is there a way to do this regression correctly?

Thanks,

Dror

View 7 Replies View Related

Linear Regression And Tolking The Coefficient For Each Variabel?

Sep 2, 2007

When using linear regression in the SQL Server 2005 Business IntelIigence Studio I interpet the information below as follow: X has a standard deviation of +- 37.046. Is it possible to obtain the standard deviation of each coefficient in the regression expression?

View 1 Replies View Related

Specifying Bound For Coefficients In Linear Regression Model

Jan 18, 2008

Hi,

I am trying to create a model using microsoft Linear Regression algorithm. But I want to constrain the coefficient of the parameters to non-negative value. There is concept of bound in SAS where we can specify the range of the coefficient. Does any of the SSAS mining algorithms support restricting the coefficient value?

Thanks,
DMN

View 3 Replies View Related

Mining Content Viewer For Linear Regression: Node Distribution Output

Dec 19, 2006

With the number of threads it is difficult to know if this has been posted. If I use the Mining Content Viewer for Linear Regression, under Node Distribution, there are values given for Attribute Name, Attribute Value, Support, Probability, Variance, and Value Type. The output is similar to what Joris supplied in his thread about Predict Probability in Decision Trees. My questions:

1. How should these fields be interpreted?

2. With Linear Regression, is it possible to get the coefficient values and tests of significance (t-tests?), if they are not part of the output I have pointed to?

Thanks for your help with this?

Sam

View 1 Replies View Related

LogRegHelper - A Scorecard For Logistic Regression Models Does Not Match Logistic Regression Favors Score

Jun 24, 2007

Hello,

This question is regarding the LogRegHelper - "A scorecard for Logistic Regression models" example in sqlserverdatamining Tips and Tricks page. I launched TestLogReg (Analysis Services Database associated with the project) and ran Logistic Regression over that. While the LogReg shows the highest score for IQ (107 - 121), a score of 558, the Logistic Regression shows that Parent Encouragement has the highest score for the case College Plans = 'Plans to Attend'. Can someone verify this and clarify?

I have a few other questions with LR

- In SQL Server 2005 LR Mining Model Viewer "favors" chart, what algorithm is used for generating Scores?

- Can I use this score as a feature selector? Higher score => stronger predictor (input)

- Is the coefficient weight algorithm used in LogReg wrong ?

Thanks

MA

View 1 Replies View Related

Reporting Services :: Count Values In A Column Based Upon Distinct Values In Another Column In SharePoint List

Sep 7, 2015

We have SharePoint list which has, say, two columns. Column A and Column B.

Column A can have three values - red, blue & green.

Column B can have four values - pen, marker, pencil & highlighter.

A typical view of list can be:

Column A - Column B
red - pen
red - pencil
red - highlighter
blue - marker
blue - pencil
green - pen
green - highlighter
red - pen
blue - pencil
blue - highlighter
blue - pencil

We are looking to create a report from SharePoint List using SSRS which has following view:

red blue green
pen 2 0 1
marker 0 1 0
pencil 1 3 0
highlighter 1 1 1

We tried Sum but not able to display in single row.

View 2 Replies View Related

Integration Services :: SSIS Package - Replacing Null Values In One Column With Values From Another Column

Sep 3, 2015

I have an SSIS package that imports data from an Excel file, replaces any value in Excel that reads "NULL" to "", then writes the data to a couple of databases.

What I have discovered today, is I have two columns of dates, an admit date and discharge date column, and what I need to do is anywhere I have a null value in the discharge date column, I have to replace it with the value in the admit date column.

I have searched around online and tried a few things using the Replace funtion in Derived columns but no dice so far.

View 3 Replies View Related

Add Symbol To Column Values And Convert Column Values To Western Number System

Feb 12, 2014

I want to add $ symbol to column values and convert the column values to western number system

Column values
Dollar
4255
25454
467834

Expected Output:
$ 4,255
$ 25,454
$ 467,834

My Query:
select ID, MAX(Date) Date, SUM(Cost) Dollars, MAX(Funded) Funding from Application

COST is the int datatype and needs to be changed.

View 2 Replies View Related

Linear Interpolation

Jun 7, 2007

my data is like this:

header | data | key
-------------------
500 | 3.2 | 10
500 | 3.4 | 20
500 | 3.6 | 25
500 | 3.7 | 40
501 | 4.1 | 10
501 | 4.2 | 15
501 | 4.4 | 30
501 | 4.6 | 35

and what I want to do is find the median of "data", but keyed off of "key", so if my desired median is 30, I want to take the two records (data, key) nearest to key = 30, and get the average of "data".
...and do this within each "header" value.

actually, to be precise, I want the linear interpolation, so for header = 500, I want to get the (data, key) pairs of (3.6, 25) and (3.7, 40) and return the interpolated "data" value of 3.6333 (as done here (http://en.wikipedia.org/wiki/Linear_interpolation))

so for the above example the query would produce:

header | interp
-----------------
500 | 3.633
501 | 4.4

possible, or am I crazy?

View 4 Replies View Related

Specify A Non-linear Order By

Jul 23, 2005

How can I order the results of my query in non-linear fasion. I have afield with these values: Reg S, 144A, US and want to order my resultsby US, 144A, Reg S.I would prefer not to create another field in the table if possible.

View 4 Replies View Related

Integration Services :: SSIS Reads Nvarchar Values As Null When Excel Column Includes Decimal And String Values

Dec 9, 2013

I have SQL Server 2012 SSIS. I have Excel source and OLE DB Destination.I have problem with importing CustomerSales column.CustomerSales values like 1000.00,2000.10,3000.30,NotAvailable.So I have decimal values and nvarchar mixed in on Excel column. This is requirement for solution.However SSIS reads only numeric values correctly and nvarchar values are set as Null. Why?

CREATE TABLE [dbo].[Import_CustomerSales](
[CustomerId] [nvarchar](50) NULL,
[CustomeName] [nvarchar](50) NULL,
[CustomerSales] [nvarchar](50) NULL
) ON [PRIMARY]

View 5 Replies View Related

Analysis :: Bitmask Column Values As Dimension Values

Jun 18, 2015

Bitmask fields! I am capturing row changes manually via a high frequency ETL task. It works effectively however i am capturing the movement of multiple fields. A simple example, for Order lines, i have a price, a discount and a date. I am capturing a 001, 010, 100 respectively for each change.

I would like my users to be able to select from a dimension which has the 3 members in it and they can select one, multiples, or all values (i.e. only want to see rows that have had the date and price changed).

Obviously if i only had 3 columns i would use bit's and be done with it, i have many different values (currently around 24 and growing).

View 2 Replies View Related

How Can I Generate Incremented Linear Number T-SQL

May 26, 2008

Hi all

i wants to generate linear sequence number like 1,2,3,.............1000000

,are there any function like NEWID() ( this return unique guide, i want to get integer)

i want to used this generated number inside the SQL query

thanks

IndikaD (Virtusa Cop)

View 12 Replies View Related

Transact SQL :: Calculate DateTime Column By Merging Values From Date Column And Only Time Part Of DateTime Column?

Aug 3, 2015

How can I calculate a DateTime column by merging values from a Date column and only the time part of a DateTime column?

View 5 Replies View Related

Transact SQL :: Convert Certain Row Values To Next Column Values

Jul 9, 2015

I have a table with 2 columns and my source data looks like this..

PolicyNum   InsCode

1ABC12          1001
1ABC12          1002
1ABC12         1003
1ABC12         1004
1ABC12          1005

[Code] ....

My output should look like this..I need T-sql to get below output.

PolicyNum   InsCode1   InsCode2

1ABC12           1001       1005
1ABC12             1002       1006
1ABC12           1003       1004
1ABC20            1001       1005

[Code] ...

Basically it's converting certain row values to new column. Every PloicyNum will have 1001 to 1006 Fixed InsCode values as a group.

Rule-1: InsCode value 1001 should always mapped to 1005
          InsCode value 1002 should always mapped to 1006
           InsCode value 1003 should always mapped to 1004

Rule-2: For a policyNum, If any Inscode value is missed from the group values 1001 to 1006, still need to mapped with corresponding values as shown in Rule-1

In the above sample data..

for PolicyNum - 1ABC20 , group values 1003,1006 are missing
for PolicyNum - 1ABC25 , group values 1002,1003,1004,1005,1006 are missing

Create Table sampleDate (PolicyNum varchar(10) not null, InsCode Varchar(4) not null)
Insert into Sample Date(PolicyNum, InsCode) Values ('1ABC12','1001')

Insert into Sample Date(PolicyNum, InsCode) Values ('1ABC12','1002')
Insert into Sample Date(PolicyNum, InsCode) Values ('1ABC12','1003')

[Code] ....

View 14 Replies View Related

Choosing Between Two Column Values To Return As Single Column Value

Sep 14, 2007

I'm working on a social network where I store my friend GUIDs in a table with the following structure:user1_guid user2_guidI am trying to write a query to return a single list of all a users' friends in a single column. Depending on who initiates the friendship, a users' guid value can be in either of the two columns. Here is the crazy sql I have come up with to give what I want, but I'm sure there's a better way... Any ideas?SELECT DISTINCT UserIdFROM espace_ProfilePropertyWHERE (UserId IN
(SELECT CAST(REPLACE(CAST(user1_guid AS VarChar(36)) + CAST(user2_guid AS VarChar(36)), @userGuid, '') AS uniqueidentifier) AS UserId FROM espace_UserConnection WHERE (user1_guid = @userGuid) OR
(user2_guid = @userGuid))) AND (UserId IN
(SELECT UserId FROM espace_ProfileProperty))

View 1 Replies View Related

Counting Multiple Values From The Same Column And Grouping By A Another Column

Sep 16, 2004

This is a report I'm trying to build in SQL Reporting Services. I can do it in a hacky way adding two data sets and showing two tables, but I'm sure there is a better way.

TheTable
Order# Customer Status

STATUS has valid values of PROCESSED and INPROGRESS

The query I'm trying to build is Count of Processed and INProgress orders for a given Customer.

I can get them one at a time with something like this in two different datasets and showing two tables, but how do I achieve the same in one query?

Select Customer, Count (*) As Status1
FROM TheTable
Where (Status = N'Shipped')
Group By Customer

View 2 Replies View Related

Altering Column Values Using Derived Column Component

Dec 21, 2007

Can anyone show how to alter the value in a column using DerivedColumn component when creating an SSIS package programatically.

View 4 Replies View Related

Table Column Names = Dataset Column Values?!

Apr 28, 2008

I need to create the following table in reporting services

PRODUCT April March Feb

2008 2007 2008 2007 2008 2007
chair 8 9 7 4 4 4
table 3 4 5 6 4 6

My problem is the month names are a column in the dataset, but I dont know how to get it to fill as column headers???

Thanks in advance!!!

View 1 Replies View Related

Power Regression

May 30, 2006

I need to write some SQL to do a power regression for a trendline. I have 2 columns of data which represent my X, Y data and all I'm after is the a and the b for the function y=ax^b. Has anyone ran into this before?? I know SSAS has a linear regression function but my data really only fits the power model.

View 4 Replies View Related

Column Defaults As Parameters And/or Column Values

Jan 7, 2008

Good afternoon,

I am trying to figure out a way to use a columns default value when using a stored procedure to insert a new row into a table. I know you are thinking "that is what the default value is for", but bare with me on this.

Take the following table and subsequent stored procedure. In the table below, I have four columns, one of which is NOT NULL and has a default value set for that column.

CREATE TABLE [dbo].[TestTable](
[FirstName] [nvarchar](50) NULL,
[LastName] [nvarchar](50) NULL,
[SSN] [nvarchar](15) NULL,
[IsGeek] [bit] NOT NULL CONSTRAINT [DF_TestTable_IsGeek] DEFAULT ((1))
) ON [PRIMARY]

I then created the following stored procedure:

CREATE PROCEDURE TestTable_Insert
@FirstName nvarchar(50),
@LastName nvarchar(50),
@SSN nvarchar(15),
@geek bit = NULL
AS
BEGIN
INSERT INTO TestTable (FirstName, LastName, SSN, IsGeek)
VALUEs (@FirstName, @LastName, @SSN, @geek)
END
GO

and executed it as follows (without passing the @geek parameter value)

EXEC TestTable_Insert 'scott', 'klein', '555-55-5555'

The error I got back (and somewhat expected) is the following:

Cannot insert the value NULL into column 'IsGeek', table 'ScottTest.dbo.TestTable'; column does not allow nulls. INSERT fails.

What I would like to happen is for the table to use the columns default value and not the NULL value if I don't pass a parameter for @geek. OR, it would be really cool to be able to do something like this:

INSERT INTO TestTable (FirstName, LastName, SSN, IsGeek)
VALUEs (@FirstName, @LastName, @SSN, ISNULL(@geek, DEFAULT))

Does this make sense?

View 9 Replies View Related

How To Create A New Column And Insert Values Into The New Column

Mar 3, 2008

Can anyone assist me with a script that adds a new column to a table then inserts new values into the new column based on the Table below. i have included an explanation of what the script should do.

Column from
Parts Table Column from
MiniParts New Column in
(Table 1 ) (Table 2 ) MiniParts (Table2)

PartsNum

MiniPartsCL

NewMiniPartsCL

1

K

DK

1

K

K

1

Q

Q

0

L

L

0

L

LC

0

D

G

0

S

S

I have 2 tables in a database. Table 1 is Parts and Table 2 is MiniParts. I need a script that adds a new column in the MiniParts table. and then populate the new column (NewMinipartsCL) based on Values that exist in the PartsNum column in the Parts Table, and MiniPartsCL column in the MiniParts columns.

The new column is NewMiniPartsCL. The table above shows the values that the new column (NewMiniPartsCL) should contain.

For Example
Anytime you have "1" in the PartsNum column of the Parts Table and the MiniPartsCL column of the MiniParts Table has a "K" , the NewMiniPartsCL column in the MiniParts Table should be populated with "DK" ( as shown in the table above).

Anytime you have "1" in the PartsNum column of the Parts Table and the MiniPartsCL column of the MiniParts Table has a "K" , the NewMiniPartsCL column in the MiniParts Table should be populated with "K" ( as shown in the table above). etc..

View 3 Replies View Related

Add Values To A Column With Derived Column Expression?

Feb 25, 2008

Hi, how are you?
I'm having a problem and I don't know if it can be solved with a derived column expression. This is the problem:

We are looking data in a a sql database.

We are writting the SQL result in a flat file.

We need to transform data in one of the columns.

For example: we can have 3 digits as value in a column but that column must be 10 digit length. So we have to complete all the missing digits with a zero. So, that column will have the original 3 digits and 7 zeros. How we can do that tranformation? We must do it from de the flat file or it can be a previous step?
Thanks for any help you can give me.
Regards,

Beli

View 10 Replies View Related

Logistic Regression Question

Feb 14, 2008

Hi All,

We're currently preparing for a project for a bank client of ours where we would be using SQL Server 2008's data mining capabilities.

Does anyone know if logistic regression supports the following types:

Binomial (standard)

Multinomial (standard)

Conditional

Ordered

Rank-ordered

Nested

Stereotype
Regards,
Joseph

View 1 Replies View Related

R*R - Excel Like Regression With SSAS?

Dec 13, 2007

Hi!
I try to make linear regression in multiple dimensions
with SSAS (y = a + a1*x1+ ... a2*xn)

I got the equation, but I also want to see R squared and R adjusted in same manner as in Excel.
How to achieve that?
Greetings

View 2 Replies View Related

DMX Query For Regression Coefficients

Oct 3, 2006

How do I write a DMX query to return the coefficients of the independent variables in my regression equation?

Thanks,

Carrie

View 10 Replies View Related

Regression Line In Chart

Feb 8, 2008

I would know if is possible to add the regression line to a scatter chart !!!

View 5 Replies View Related

Trendlines/Regression With RS Charts?

Apr 15, 2008

[using: Reporting Services 2005, SQL Server 2005, Analysis Services 2005]

Has anyone ever implemented dynamic trendlines with RS charts?

I have a requirement to create a web-based chart based on an existing Excel chart that the client is already using. This chart uses a trendline to forecast performance for 3 months out. I know in Excel it's as easy as right-click->add trendline.

Is there a similarly simple way to do this in Reporting Services?
Also, the data source for this is OLAP, so if any of you are MDX gurus, is there some regression function to plot all the parallel axis points?

thanks for any insight.
-michael

View 1 Replies View Related

Questions About Regression Tree Model

Oct 21, 2007

I have two questions about the regression tree of Microsoft Decision Trees algorithm.

1. The mining legend window has a column named Histogram showing a bar for each coefficient. What does this bar mean?
2. Since each node of a regression tree corresponds to a linear regression, how can I find the regression coefficient of each node? I mean the coefficient that tells how good the regression is.

Any tip will be greatly appreciated.

View 1 Replies View Related