Is there some Transformation or other method to remove outlying records based on an attribute during a Data Flow task?
I have a list of Organizations complete with a list of Products they have bought. I am going to do some data mining / profiling off of this data but first I need to get rid of the top 25% and bottom 25% quantity records by Product. I've looked at Percentage / Row Sampling but they are too simple.
I have two tables that have a common column (ID). Now, what i am trying to do is find what is not in one table that is in the other.
For instance:
Table A ID NAME 1 Tom 2 George 3 Richard
Table B ID NAME 1 Tom 3 Richard 4 Kevin
With this information, I am trying to write a query that would tell me that Kevin is the only record that doesn't exist in both tables ... like an outlier.
I have tried using something like the following:
SELECT distinct id, name FROM Table 1 INNER JOIN Table 2 ON Table 1.id <> Table 2.id
EDITION PRODUCT INSERTDATE ---------- ------------ ---------------------- CNE TN-Town News 12/19/2007 12:00:00 AM TN TN-Town News 12/19/2007 12:00:00 AM
What i have to do is if there are multiple records for one product in any day, then i need to remove all those records. In this case i am getting two records for the PRODUCT 'TN-Town News' and for INSERTDATE = 12/19/2007 . So i need to remove these two records from the table.
We imported approximately 2.9 million records from our mainframe server into our SQL Server but have run into a problem. The data in a few of the fields contains both leading and trailing spaces. An example of the data would be like this, using periods to represent spaces:
What we have:
..1A02938.....
What we need:
1A02938 (no spaces)
Is there some sort of algorithm I can run on the data to remove those spaces? The problem is coming up when trying to perform a SELECT query. We try something like:
SELECT * FROM PCPIPT0 WHERE PANO20 = "1A02938" but we get zero results because of the spaces in the database. The datatype of the filed is char(20) because we need some flexibility on the size of the data stored.
I currently have one table that lists all projects and tasks within the organisation. One of the table fields is the task status, open or closed. I would like to be able to have a process by which the tasks that are completed are removed from the table and placed into another (archive) table. The same records then being removed from the original table. which then only contains the incomplete tasks. This process could be run at given times during the day or at the point when the status of a task is changed from open to closed, either way each time the process is run it would need to append the rows removed into the archive table. Anyone any ideas on the best way to do this?.
We configured IDERA SQL Safe for backups and restores.we setup an email for notifications. One day we performed manual backup operation for 150 databases from SQL Safe tool. Unnoticed it backed up to C:Backup folder.
We got alerts with the report of backups on C: drive. Then we moved backup files to respective folders.But, I could not clear the records from the report, its been 25 days, still we are receiving the alerts as below.How can I clear this report or to configure or setup anything to avoid this, in future as well.
Below is the sample records from the report. I need this report to be cleared.
Subject: SQL Safe Validation Report
The following files are recorded in the SQLSafe repository, but no longer exist in their locations:
C:BackupLUXOR_DB_GroupEPF_Diff_201402050045 (1 of 1).safe C:BackupLUXOR_DB_InsightEPF_Diff_201402050045 (1 of 1).safe C:BackupLUXOR_DB_IMEPF_Diff_201402050045 (1 of 1).safe C:BackupLUXOR_DB_SiteEPF_Diff_201402050045 (1 of 1).safe C:BackupLUXOR_DB_SiEPF_Diff_201402050045 (1 of 1).safe C:BackupLUXOR_DB_ICEPF_Diff_201402050045 (1 of 1).safe C:BackupLUXOR_DB_vdbEPF_Diff_201402050045 (1 of 1).safe C:BackupLUXOR_DB_M_SEPF_Diff_201402050045 (1 of 1).safe
I have a table that "Geography" that has the following columns: city, state, zip
There are tons of duplicate cities in this table. I ran this query and it shows me the number of occurrences of each city. I want to delete all the duplicates except for 1. I don't want to do this manually as there are a lot of records.
What would the SQL look like to delete the duplicate records but keep at least one?
We all were new at one point.... any help is appreciated.
Objective:
Combining two 49,000 row tables and remove records where there is only 1 column difference. (keeping the specified column value removing the one with a blank.)
Reason:
I have 2 people going through a list, coding a specific column with a single letter value. They both have different progress on each sheet. Hence I am trying to UNION them and have a result of their combined efforts without duplicates.
My progress/where I'm stuck:
Here is my first query/union:
SELECT * FROM [Eds table] UNION SELECT * FROM [Vickis table];
As shown above, I have unioned these 2 tables and my results removed th obvious whole record duplicates, but since 1 column is different on these, a union without criteria considers them unique.....
an example of duplicates that I must remove are as follows:
I have a SQL 2012 database that has 10 tables. One of the tables is populated by manual import from CSV file. Each time a user calls custom ASP.NET code., records get inserted into a table called forecast_data with incremental increase in FileID. So first import has FileID of 1, second import has FileID of 2 etc.
What I am trying to do is only keep the data that has the highest FileID (MAX(FileID). I would like to write a store procedure that removes all older data once a new import is written into the table.
I have a requirement where i want to delete the records based on the Date column. I have table which contain the columns like machinename ,lasthardwarescandate
I want to delete the records based on the max(Lasthardwarescandate) i.e. latest one, column where the machine name is duplicate menace it repeats. So how would i remove the duplicate machine names based on the Lasthardwarescandate column(There are multiple entries for the Lasthardwarescandate so i want to fetch the latest date column).
Note: Duplication should be removed based on “Last Hardware Scan” date.
Only latest date should be considered from multiple records for the same system. "
I wanted to remove duplicate records from SSRS report. I set the "Hide Duplicates" to True. It is now working, But i am getting the space between the two records, which i want to get rid of. How to get rid of extra spaces between two records ( Please find the details below).
I tried to remove AdventureWorksDB in the "Add or Remove Programs" of Contol Panel and I got the following errors: (1) AdventureWorksDB Error 1326: Error getting file security: CProgram FilesMicrosoft SQL ServerMSSQL1MSSQLGetLastError: 5. |OK| and (2) Add or Remove Programs Fatal Error during installation (after I clicked the |OK| button). Please help and tell me how I can solve this problem.
I have uninstalled the CTP version of the SQL Server express so that I can install the released version but CTP version is still listed in the add/remove program list but without the change/remove button. I have been to different sites to find information on cleaning this up and I have ran all the uninstall tool I can find but the problem still prevails. I cannot install the released version without completely getting rid of the CTP version. Please help anyone.
I am having a hard time removing my SQL instance inside the Add/Remove program. After i select the SQL Instance name and then I tried to remove it but it won't allow me to delete it. There isn't any error message or whatsoever. Actually, when i try to log it in my SQL Management studio, that certain sql instance name is not existing according to the message box. Is there any way to remove the Sql Instance in my system?
writing the query for the following, I need to collapse the continuity. If the termdate for an ID is one day less than the effdate of the next id (for the same ID) i need to collapse the records. See below example .....how should i write the query which will give me the desired output. i.e., get min(effdate) and max(termdate) if termdate is one day less than the effdate of next record.
I have a situation where deleting old records is blocking updating latest records on highly transactional table and getting timeout errors from application.
In details, I have one table called Tran_table1 in OLTP database. This Tran_table1 is highly transactional table, it will receive data for insert/update continuously
While archiving 2 years old records from Tran_table1 into Tran_table1_archive in batches(using DELETE OUTPUT INTO clause), if there is any UPDATEs on Tran_table1,these updates are getting blocked and result is timeout errors in application.
Is there any SQL Server hints to avoid blocking ..
declare @table table ( ParentID INT, ChildID INT, Value float ) INSERT INTO @table SELECT 1,1,1.2
[code]....
This case ParentID - Child 1 ,1 & 2,2 and 3,3 records are called as parent where as null , 1 is child whoose parent is 1 similarly null,2 records are child whoose parent is 2 , .....
Now my requirement is to display parent records with value ascending and display next child records to the corresponding parent and parent records are sorted ascending
I have been trying to solve the locking problem from past couple of days. Please help mee!!
Scenario: -------------- I have a SSIS package in which 2 data flow tasks. 1st data flow task deletes records from a 5 tables and the 2nd data flow task should insert records into 1 of the five tables after the success of 1st data flow task. This scenario runs in Transacation.
The above scenrio in the 2nd data flow task hangs in runtime. It does not complete. with sp_who2 command i could see that there is an intent share lock(LK_M_IS) on the table and the status is SUSPENDED.
I dont know how to come out of this locking. Please help.
I have a table with about half a million records, each representing a patient in my county.
Each record has a field (RRank) which basically sorts the patients as to how "unwell" they are according to a previously-applied algorithm. The most unwell patient has an RRank of 1, the next-most unwell has RRank=2 etc.
I have just deleted several hundred records (which relate to patients now deceased) from the table, thereby leaving gaps in the RRank sequence. I want to renumber the remaining recs to get rid of the gaps.
I can see what I want to accomplish by using ROW_NUMBER, thus:
SELECT ROW_NUMBER() Over (ORDER BY RRank) as RecNumber, RRank FROM RPL ORDER BY RRank
I see the numbers in the RecNumber column falling behind the RRank as I scan down the results
My question is: How to convert this into an UPDATE statement? I had hoped that I could do something like:
UPDATE RISC_PatientList_TEMP SET RRank = ROW_NUMBER() Over (ORDER BY RRank);
but the system informs that window functions will only work on SELECT (which UPDATE isn't) or ORDER BY (which I can't legally add).
I need a little help here..I want to transfer ONLY new records AND update any modified recordsfrom Oracle into SQL Server using DTS. How should I go about it?a) how do I use global variable to get max date.Where and what DTS task should I use to complete the job? Data DrivenQuery? Transform data task? How ? can u give me samples. Perhaps youcan email me the Demo Package as well.b) so far, what I did was,- I have datemodified field in my Oracle table so that I can comparewith datelastrun of my DTS package to get new records- records in Oracle having datemodified >Max(datelastrun), and transferto SQL Server table.Now, I am stuck as to where should I proceed - how can I transfer theserecords?Hope u can give me some lights. Thank you in advance.
I have a query similar to the following. The intent of this query is to retrieve the top 6 records meeting the specified criteria (LOGTYPENAME = 'Process Status Start' OR LOGTYPENAME = 'Process Status End' ) based on most recent dates. Please keep in mind that I expect to return up to 6 records for each unique LogProcessName. This could be thousands of different LogProcessNames with up to 6 records for each.
1) The table I am executing against currently is very large in size and thus takes a long time to execute against. It would seem there must be a more efficient query to get the results I am looking for? 2) CTE doesn't work on SQL 2000. I need a query that does. 3) I cannot modify the database itself in the process.
;WITH cte AS ( SELECT [LogProcessName], [LogBody], [LogDate], [LogGUID], row_number() OVER(PARTITION BY [LogProcessName] ORDER BY [LogDate] DESC) AS RN FROM [LOGTABLE] WHERE [LogTypeGUID] IN ( SELECT LogTypeGUID FROM LOGTYPE WHERE LogTypeName = 'Process Status Start' OR LogTypeName = 'Process Status End' ) ) SELECT * FROM cte WHERE RN = 1 OR RN = 2 OR RN = 3 OR RN = 4 OR RN = 5 OR RN = 6 ORDER BY [LogProcessName] DESC, [LogDate] DESC
Does anybody else have any idea that would yield the results that I am looking for and take into account items 1-3 above?
I tried to port 10000 records using DTS. After porting of 9900 records I got an error and comes out without any result. But I want to keep the records which has been ported till the error occured. Plz help me.
Hi, I have had this problem for a while and have not been able solve it.
What im looking at doing is looping thru my patient table and trying to organise the patients in to there admission sequence, so when patient "A" comes in and is treated at my hospital and is discharged and admitted to another Hospital within one day then patient "A" will get a code of 1 being there first admission.
then if patient "A" is admitted again but there admission date is greater than one day they get a code of 2 being for there second admission but will need to loop thru table looking for other admissions and discharges.
The table name is Adm_disc_Match_tbl
Basically what i have 4 fields. Index_key = which is the patient common link (text) ur_episode = this wil change for each Hospital (text) Admission_datetime = patient admission date and time (datetime) Discharge_datetime = patient discharge date and time (datetime)
I have to search the records after the records populated.
I mean to say, i have displayed records in report, if i enter some strings in the textbox and clicked find, then it will highlight the particular records, instead of highlighting the values, is it possible to display only those particular records.
For example, say i have 50 records in a page,i entered some strings in the textbox and clicked find, then it will highlight the particular 5 records one by one which match the criteria i have entered in the texbox, instead of that i have to display only those 5 records.
Hi all In my table one of the column consists the values like this Krishna" Rama" - - - - - I want to remove ". I started with instr to findout the occurence. Its giving the message that 'InStr' is not a recognized built-in function name.. How to solve it. Waiting for valuable replies Thank u Baba
Please help to remove (-) The EM_TERMINATION_DATE column have records with no data. I am using the following below script to format the dates. The problem is : the format with (-) between yyyy-mm-dd retrive NULL records with (-) see the output. How to fix it? Here is my ouput: EM_TERMINATION_DATE 2006-03-15 - - - - - - - -
Here is my script:
select
cast(left(EM_TERMINATION_DATE, 4) as varchar(4))
+ '-' +
Cast(right(EM_TERMINATION_DATE, 4) as varchar(2))
+ '-' +
Cast(left(Right(EM_TERMINATION_DATE,2), 2) as varchar(2))as TERM_DT
Ok, I'm really new at this, but I am looking for a way to automatically insert new records into tables. I have one primary table with a primary key id that is automatically generated on insert and 3 other tables that have foreign keys pointing to the primary key. Is there a way to automatically create new records in the foreign tables that will have the new id? Would this be a job for a trigger, stored procedure? I admit I haven't studied up on those yet--I am learning things as I need them. Thanks.
Hi, I don't need sql express, never use it and I would like to remove it. Is it possible without loosing important functionalities in VWD? Is that true that cannot compile the applications (ctrl+f5) anymore?
I had sqlcachedependancy installed. But now I cannot uninstall it? 1. I unregistered: C:WINDOWSMicrosoft.NETFrameworkv2.0.50727>aspnet_regsql.exe -E -S localhost-d DNN1 -dd 2. I remove the Dependencry.Start() from my application_start ' System.Data.SqlClient.SqlDependency.Start(ConfigurationManager.ConnectionStrings("SiteSqlServer").ConnectionString) 3. I removed the cache section from my web.config 4. I removed the outputcache statements from my aspx pages. <!--<%@ OutputCache Duration="1" VaryByParam="*" SqlDependency="CommandNotification" %>--> However, I still get the next error:
File
Error When using SqlDependency without providing an options value, SqlDependency.Start() must be called prior to execution of a command added to the SqlDependency instance. What to do? Any help or advice is appreciated. Jelle