SQL 2012 :: String Splitting Using Like And Data Cleansing
Mar 28, 2014
I have a large poorly designed table (inherited) With a Name field that contains comma delimited text containing address information. I need to do several things with it but unfortunately there doesn't appear to be any true consistency in it. When it displays in its own text box it works by placing each section on a new Line and looks ok.But I need to pull it apart and place things like unit number, Building Name in its own column etc. In the data it could be in either the 2nd,3rd, 4th, dependent on what came 1st. the data looks some thing like the following
unitNumber/StreetNumber Space StreetName (Building Name), Subub,City,Country
Some addresses won't have unit number or Suburb or country so when splitting you could have Suburbs and Citys in multiple columns even if you try and stagger the split process.Has any body go a good tool or reference site for dealing for this sort of problem. I have a table that I have made up that has some of the street names that could be used for comparing against existing records but it is by no means fool proof due to spelling inconsistencies . I also have another list of Common building names that could be used to compare, remove and place in the new building column.
View 1 Replies
ADVERTISEMENT
Dec 14, 2006
Hi to everyone,My problem is, that I'm not so quite sure, which way should I go.The user is inputing by second part application a long string (let'ssay 128 characters), which are separated by semiclon.Example:A20;BU;AC40;MA50;E;E;IC;GREENNow: each from this position, is already defined in any other table, asa separate record. These are the keys lets say. It means, a have someproperities for A20, BU, aso.Because this long inputed string, is a property of device (whih alsohas a lot of different properities) I could do two different ways ofstoring data:1. By writing, in SP, just encapsulate each of the position separatedby semicolon, and write into a different table with index of device,and the position in long stirng nearly in this way:Major device data tableID AnyData1 AnyData2 ... AnyData3123 MZD12 XX77 .... any comment text124 MZD13 XY55 ... any other commentString data Tablefk_deviceId position value123 1 A20123 2 BU123 3 AC40.....123 8 GREENThe device table, contains also a pointer (position), which mightchange, to "hglight" specified position.Then, I can very easly find all necessary data. The problem is, I needto move the device record data (from other table) very often into otherhistory table (by each update). That will mean, that I also need tomove all these records from 1 -8 for example to a separate historytable, holding the index for a history device dataset. This is a littleinconvinience in this, and in my opinion, it will use to much storagedata, and by programming, I need always to shift this properities intohistory table, whith indexes to a history table of other properities.2. Table will be build nearly in this way:Major device data tableID AnyData1 AnyData2 ... AnyData3 stringProperty pointer123 MZD12 XX77 .... any comment text A20;BU;AC40;MA50;E;E;IC;GREEN 3124 MZD13 XY55 ... any other comment A20;BU;AC40;MA50;E;E;IC;GREEN 2By writng into device table, there will be just a additional field forthis string, and I will have a function, which according to specifiedpointer, will get me the string part on the fly, while I need it.This will not require the other table, and will reduce the amout ofdata, not a lot ... but always.This solution, has a inconvinance, that it will be not so fast doing asearch over the part of this strings, while there will be no real indexon this.If I woould like to search all devices, by which the curent pointervalue is equal GREEN, then I need to use function for getting thevalue, and this one will be not indexed, means, by a lot amount ofdata, might be slow.I would like to know Your opinion about booth solutions.Also, if you might point me the other problems with any of thissolution, I might not have noticed.With Best RegardsMatik
View 7 Replies
View Related
Dec 15, 2014
Here's the resultset:
Main
CC-09-00032-D
CC-09-00113-A
PR-10-01004-2
Expected result:
P C
PR-10-01004-2 CC-09-00032-D
CC-09-00113-A
What I need is split the data into two columns if data in column Main starts with 'PR-' then output result to column P and if it starts with 'CC-' then to column C (the output needs to be in one table).
View 3 Replies
View Related
Jun 22, 2015
Currently I have a column with multiple postcodes in one value which are split with the “/” character along with the corresponding location data. What I need to do is split these postcode values into separate rows while keeping their corresponding location data.
For example:
PostCodeLatitudeLongitude
66000/6610042.6965952.899370
20251/2027042.1964719.404951
Would become
PostCodeLatitudeLongitude
6600042.6965952.899370
6610042.6965952.899370
2025142.1964719.404951
2027042.1964719.404951
how this can be done?
View 2 Replies
View Related
Sep 11, 2006
Hello All,
I'm a non-programmer and an SQL newbie. I'm trying to create a printer usage report using LogParser and SQL database. I managed to export data from the print server's event log into a table in an SQL2005 database.
There are 3 main columns in the table (PrintJob) - Server (the print server name), TimeWritten (timestamp of each print job), String (eventlog message containing all the info I need). My problem is I need to split the String column which is a varchar(255) delimited by | (pipe). Example:
2|Microsoft Word - ราย?ารรับ.doc|Sukanlaya|HMb1_SD_LJ2420|IP_192.10.1.53|82720|1
The first value is the job number, which I don't need. The second value is the printed document name. The third value is the owner of the printed document. The fourth value is the printer name. The fifth value is the printer port, which I don't need. The sixth value is the size in bytes of the printed document, which I don't need. The seventh value is the number of page(s) printed.
How I can copy data in this table (PrintJob) into another table (PrinterUsage) and split the String column into 4 columns (Document, Owner, Printer, Pages) along with the Server and TimeWritten columns in the destination table?
In Excel, I would use combination of FIND(text_to_be_found, within_text, start_num) and MID(text, start_num, num_char). But CHARINDEX() in T-SQL only starts from the beginning of the string, right? I've been looking at some of the user-defind-function's and I can't find anything like Excel's FIND().
Or if anyone can think of a better "native" way to do this in T-SQL, I've be very grateful for the help or suggestion.
Thanks a bunch in advance,
Chutikorn
View 2 Replies
View Related
Jul 30, 2002
Does any one knows any software used for Data Cleansing? Or is there any tools with SQL Server 2000 for Data Cleansing? I look forward to hearing from anyone.
Regards,
View 3 Replies
View Related
May 11, 2007
I have sql server 2005 developer edition and vs2005 team system.
I want's to use data cleansing feature.i.e.
Integration Services Advanced Transforms Includes data mining, text mining, and data cleansing
Can please some body points me to any good webcast about using this feature
View 1 Replies
View Related
Apr 18, 2007
I am just starting out with SSIS and trying to get the feel of data cleansing using it. But on my very first project for data cleansing I've got into this weird error.
My data flow is very simple, it has a OLE DB source, Fuzzy Lookup and OLE DB destination. I've built three tables for this purpose, one is source, one is reference (it will be used to match for the real entries in fuzzy lookup) and the last is the destination table.
In all the three tables I've a field of City which I'd like to Fuzzy lookup in the reference table and if it crosses certain confidence level, I'd like to insert to the destination table. City in all the tables has the same datatype, defined in the same way, it is varchar(50).
But when in the fuzzy lookup I try to map the Source tables City field to reference tables City field, it gives me this error:
The following columns cannot be mapped:
[City, CityRef]
One or more columns do not have supported data types, or their data types do not match.
Although as I have mentioned before, both have same data types and are defined in the same manner (i.e. I've just selected the datatypes for those columns and all the other settings are left to default). I just cannot understand why this is happening, plz help me with this. FYI I've also tried to give the City Column different datatypes in all the tables like varchar(max), text, Only to be greeted with the same error message.
Waiting for your reply!!
Regards,
Sajid.
View 6 Replies
View Related
Mar 29, 2008
Hi all
I have two data sources - an excel spreadsheet and one access database of Customer details, (quite small in size) and I have to use SQL Server 2005 to Integrate the data and cleanse the data so that I could (in theory) import it into a CRM system (though I won't actually be importing it) as this is for coursework only.
Is there a simple steps to do this is SQL server and guides that I can follow that are relevent? As there are lots of tutorials and I haven't a clue where to start.
many thanks
little_tortilla
View 4 Replies
View Related
Feb 7, 2008
I have a character field that should contain either a number or space in one of my transitions but it is coming over with junk in it occasionally. What is the best way to ensure that this data is only numeric or a space? Whenever I hit a < or ] one of those weird symbols I want to replace it with a space in my target field. It is SQL Server to SQL Server.
View 4 Replies
View Related
Oct 28, 2004
aaaaa,bbbb,ccccc,dddd,ffff,gggg,llll,kkkk,nnnnn
How can split the above string to the following individual strings
aaaaa
bbbb
ccccc
dddd
ffff
gggg
llll
kkkk
nnnnn
Thanks
View 2 Replies
View Related
Nov 30, 2005
Hello,I have been placed in charge of migrating an old access based databaseover to sql server 7.0. So far, I have imported all the tables intosql server, but now I have come across the issue of needing to split astring variable. For instance, in the old database, the variable forname was such that it included both first and last names, whereas inthe new database there are seperate entities for first and last name.I know that there is a way to write a script that will separate out thetwo strings by using the "space" in between the name, but I'munfamiliar how to do this. Any suggestions? Thanks!Rick
View 1 Replies
View Related
Jan 9, 2008
If I have a string of
'WLL EXT FACE REND'
how would I go about splitting the string into 4 strings of
'WLL'
'EXT'
'FACE'
'REND'
I know that for the first bit I can use a combination of PATINDEX and LEFT, but not sure how I can pull out the rest of them.
Thanks
View 6 Replies
View Related
May 28, 2004
I use to use the Instr() method, but that is not allowed in SQL..
Any suggestions?
View 3 Replies
View Related
Aug 14, 2003
I'm trying to split a hyphen-delimited string into three columns in a view. I've been using substring and len to split up the string, but it is getting very complicated (and isn't working in all cases). I've used a SPLIT function in vbscript - does t-sql have anything similar? I've attached a spreadsheet that shows what I am looking for. Maybe someone can guide me in the right direction?
Thanks.
View 3 Replies
View Related
Aug 22, 2007
I need to creat distinct terms of the example parsing the term on the '|' character. I will be using mysql.
example: 1885-1974.|Johnson family|Frontier and pioneer life - Alberta - Black Hill district|Cadogan region (Alta.) - Biography|Black Hill district (Alta.) - Biography
View 1 Replies
View Related
Dec 6, 2007
Hi
I have a string EX: "How are you doing" Now. I wanted this string to be splitted in to respective words and return me those words in sql server. .
Thanks!
View 3 Replies
View Related
Mar 2, 2012
I have a column in a table which looks like below.
Column
-------
AA123
D123
AXC1
QF23
I need to split this value into two part, Alphabets and numeric. How to do this using SQL query.My column value will not have mixed characters like A1D3,G32S,12F.It will always follow the ablve pattern mentioned above.
View 3 Replies
View Related
Sep 25, 2014
I have a table name tbl_testme with columns (id,mac,keys,outputmk)
mac column have 12 character and keys have 16 character
mac keys
6545da7n9hg8 hsi457s5sd77jk87
What i want is i need to split the column into 4 characters of both column E.G.
(6545 da7n 9hg8) and (hsi4 57s5 sd77 jk87)
after splitting i need to take last 4 character of key(jk87) and last 4 character of mac(9hg8)and join them and insert that into ouputmk column.
E.G.
(jk87-9hg8-sd77-da7n-57s5-6545-hsi4)
I need this to insert in outputmk column ....
View 2 Replies
View Related
Oct 1, 2007
I have a string like say: '3:4:5:4,2:4:1,4:1:2:5:2'. Now I need to split the substrings delimited by commas. So my final output shall be
3:4:5:4
2:4:1
4:1:2:5:2
I did write a piece of code to achieve the same, but I feel its not so efficient. Can anyone suggest me a better way, if any? My code is as follows:
Declare @person as varchar(255), @cnt smallint,@loc smallint,@prevloc smallint, @str varchar(255)
Select @prevloc=0,@loc=1,@cnt=1,@person = '3:4:5:4,2:4:1,4:1:2:5:2'
While @loc != 0
begin
set @prevloc=(case when @loc = 1 then 0 else @loc end) +1
set @loc = charindex(',',@person,@loc+1)
Set @str = substring(@person,@prevloc,(Case when @loc = 0 then len(@person) - @prevloc + 1 else @loc - @prevloc end))
print 'String = ' + @Str
set @cnt=@cnt+1
end
RESULT
------
String = 3:4:5:4
String = 2:4:1
String = 4:1:2:5:2
Note: My actual purpose is to also sub split it again with ':' delimiter too. So looking for an efficient code.
View 4 Replies
View Related
Dec 19, 2002
I'm creating a web-based NT RAS report site and am looking for the most efficient way to import the data from NT Event log into SQL2k. I'm using the 'dumpel' utility from rsc kit and all is fine except the 10th column - the message detail:
"The user DOMAINuserid connected on port Mdm15 on 08/23/2002 at 07:25am and disconnected on 08/23/2002 at 07:27am. The user was active for 2 minutes 23 seconds. 78809 bytes were sent and 50675 bytes were received. The port speed was 49300."
I need to parse this one long text string into 6 distinct columns: userID, port, duration, bytes_xmt, bytes_rcv and portspeed. After a quick review of the rowsets, the strings seem to hold a consistent output ... no real variances I can see.
I've dablled with views but am facing a small performance issue that could get bigger: The sql server not only has to run the text file import package, but also the view to format the text dump into a workable dataset, then my report code bangs over 30 queries against the final dataset. It already takes our SQL2k server over 3 minutes to parse about 20,000 rows and the server's a beast (dual 1.8 p4 cpu, 3gb ram, raid, etc).
What I think would work best is to abandon the view (performance will only get worse as the row count increases) and instead INSERT the rows into one table.
Any ideas anyone? any good scripts out there that can help me to parse the long text string quicker that using substring and replace functions?
TIA:rolleyes: :rolleyes:
View 2 Replies
View Related
May 20, 2008
So we have a field called forenames, and it needs to be split into fields forename_1, forename_2, forename_3, forename_4 (don't ask).
Ok, I've come up with this so far, which works, but is pretty nacky in my opinion. Has any one got a better way of achieving this?
SELECT forenames
, Replace(forenames, ' ', '.')
, Reverse(ParseName(Replace(Reverse(forenames), ' ', '.'), 1)) As [f1]
, Reverse(ParseName(Replace(Reverse(forenames), ' ', '.'), 2)) As [f2]
, Reverse(ParseName(Replace(Reverse(forenames), ' ', '.'), 3)) As [f3]
, Reverse(ParseName(Replace(Reverse(forenames), ' ', '.'), 4)) As [f4]
FROM ( SELECT 'John' As [forenames]
UNION SELECT 'John Paul'
UNION SELECT 'John Paul George'
UNION SELECT 'John Paul George Ringo'
) As [x]
Results
forenames (no column name) f1 f2 f3 f4
---------------------- ---------------------- ---- ---- ------ -----
John John John NULLNULL NULL
John Paul John.Paul John PaulNULL NULL
John Paul George John.Paul.George John PaulGeorgeNULL
John Paul George Ringo John.Paul.George.Ringo John PaulGeorgeRingo
View 7 Replies
View Related
Jul 20, 2005
I have a column named "LIST" in a table with strings like the following:151231-1002-02-1001151231-1001-02-1001151231-1002-02-1002151231-1003-02-1001etc....What I'd like to do is include an ORDER BY statement that splits thestring, so that the order would be by the second set of four numbers(i.e. between the first and second - marks), followed by the third setof two numbers, and then by the last set of four numbers.How would I do something like this?--Sugapablo - Join Bytes!http://www.sugapablo.com | ICQ: 902845
View 1 Replies
View Related
Jul 9, 2007
I need a help in SQL Server 2000.
I am having a string variable in the format like -- (1,23,445,5,12)
I need to take single value at a time (like 1 for 1st, 23 for 2nd and so on) from the variable and update the database accordingly. This is like a FOR loop.
Can anyone help me out in splitting the variable using the comma separator...
View 4 Replies
View Related
Jan 14, 2008
Hi All!!!
I was tasked to come up with a search function and the content of the database given to me is in Chinese Characters. This would be my first time dealing with Chinese characters in the database and I need help with the following problem:
The company wants to conduct the search in such a way that, instead of having the system read the entire sentence/phrase which the user keyed in as a SINGLE string, they want the Chinese Characters to be accessed individually, so that as long as any information in the database contains any one of the characters which the user have entered, they will be retrieved and returned.
So how do I go about doing this? Does it have anything to do with Unicode? By the way, everything abt the search tool is working fine, I am just left with this dilemma of having the system recognise the entire sentence as ONE STRING, instead of conducting a search word by word or character by character.
Anyway, the following is the SQL statement of my SQL Data Source which is bound to a Gridview displaying the returned results after a search is done...1 SELECT Name, Trans, Address1, Address1T, Address2, Address2T, City, CityT, CRPLID
2 FROM CRPL
3 WHERE (Trans LIKE '%' + @Trans + '%') OR
4 (Name LIKE '%' + @Name + '%') OR
5 (Address1 LIKE '%' + @Address1 + '%') OR
6 (Address1T LIKE '%' + @Address1T + '%') OR
7 (Address2 LIKE '%' + @Address2 + '%') OR
8 (Address2T LIKE '%' + @Address2T + '%') OR
9 (City LIKE '%' + @City + '%') OR
10 (CityT LIKE '%' + @CityT + '%')
Thanks for all your help in advance!!!
View 6 Replies
View Related
Oct 22, 2007
Hi,
I have a function which takes a string and a delimter. It then splits the string by the delimter and returns a table of resultant strings:
CREATE FUNCTION [dbo].[vs_SplitTags] (@sep char(1), @s varchar(512))
RETURNS table
AS
RETURN (
WITH Pieces(pn, start, stop) AS
(
SELECT 1, 1, CHARINDEX(@sep, @s)
UNION ALL
SELECT pn + 1, stop + 1, CHARINDEX(@sep, @s, stop + 1)
FROM Pieces
WHERE stop > 0
)
SELECT pn,
SUBSTRING(@s, start, CASE WHEN stop > 0 THEN stop-start ELSE 512 END) AS s
FROM Pieces
)
This works very well, other than instances of the delimter are, themselves, considered to be results. For example:
SELECT * FROM vs_SplitTags(' ', 'foo bar') AS result
returns:
pn s
1 foo
2 bar
which is exactly the result I would want.
However,
SELECT * FROM vs_SplitTags(' ', ' foo bar ') AS result -- There are spaces before 'foo' and after 'bar'
returns
pn s
1
2 foo
3 bar
4
And
SELECT * FROM vs_SplitTags(' ', 'foo bar') AS result -- There are two spaces between 'foo' and 'bar'
returns
pn s
1 foo
2
3
4 bar
I want the function to ignore whitespace altogether, be it a single space or multiple spaces. Other than to delimit the boundries between words, of course.
In other words, all three examples above should produce the same result:
pn s
1 foo
2 bar
How can I do this? Any thoughts much appreciated...
View 5 Replies
View Related
Aug 27, 2014
I'm quite new to SQL. I'm able to extract the info that I need, but only into a result of one row, like:
Order header | Order details
ID | Customer name | Customer address | Product number | Product name | Quantity | Price | Product number | Product name | Quantity | Price
2 Andy Andy's way 2 24 Glue 3 35 39 Oyster 2 9
I would like the query to return it in this way:
2AndyAndy's way 2
24Glue335
39Oyster290
Is it even possible?
View 7 Replies
View Related
May 21, 2015
So we have new servers that are going to be installed with SQL 2012 and I'm debating the wisdom of splitting tempdb with multiple files.
I know it's a myth that performance automatically improves if you split it into a number of files based on processors, but I'm debating the wisdom of putting a file on each of my data / log file drives.
For instance, I have a server with a C: drive (OS), D: drive (Data for system DBs and install of programs - 458 GB), an F: drive for user DB data files (767 GB), and a J: drive for log files (255 GB).
Obviously no files are going on C:. I'm debating on whether or not we should even leave system DBs on the D: drive given in our current 2k8 servers, we end up with Memory.dmp files over flowing the D: drives as well as .cabs and other install / update files that tend to collect on that drive over the years.
But if we leave the system DBs on D:, I'm wondering if adding a second tempdb file to F: and a third to J: will improve query performance or not.
View 9 Replies
View Related
Jul 10, 2014
Looking for returning multiple entries from a time span. I have a date, start-time, end-time and duration. I need the start-times separated in a list. It's fine if temp tables are needed - I have that clearance.
Entry might look like:
(datetime) date: 7/10/2014
(int) start-time 820
(int) end-time 1000
(int) duration 20
The result would needed:
(datetime) 7/10/2014 08:20:00
(datetime) 7/10/2014 08:40:00
(datetime) 7/10/2014 09:00:00
(datetime) 7/10/2014 09:20:00
(datetime) 7/10/2014 09:40:00
View 8 Replies
View Related
Jan 13, 2015
I have a very interesting problem in T-SQL coding for which I can't figure out the solution. Actually there is a Line_1_Address column in our data warehouse address table which is being populated from various sources. Some sources have already concatenated house number + street address fields in the Line_1_Address column whereas one source has separated columns for both data fields.
Now I'm trying to extract data from this data warehouse table and I need to split the house number from street address and load it into separate columns in my destination table. In case there is no data for house number then I should load it as NULL.
The issue is that data in this Line_1_Address column is very inconsistent so I don't know which functions to use. Here is some sample data for your consideration:
Line_1_Address
101 E Commerce ST
120 E Commerce ST
2 Po Box
301 W. Bel Air Ave
West Main Street, PO Box 1388
[Code] .....
View 6 Replies
View Related
Mar 3, 2015
I have a description field in a table which also stores unit of measure in the same column but with some space between them, I need to split these into two different columns.
For Eg:
Description
APPEARANCE UNIT
BDV KV
DENSITY KG/L
View 9 Replies
View Related
May 7, 2015
I have system id information in table system_ids and productids and systemidinsformation has lot of data but I am looking two strings in tire data to pull into two separate columns. details below
Database versions :ms sql 2008/2012
tablename:system_id's
column:system id information
sample data from system_id_information column
########################################
<obj xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:vim25" versionId="5.5" xsi:type="ArrayOfHostSystemIdentificationInfo"><HostSystemIdentificationInfo xsi:type="HostSystemIdentificationInfo"><identifierValue> unknown</identifierValue><identifierType><label>Asset Tag</label><summary>Asset tag of the system</summary><key>AssetTag</key></identifierType>
[Code] .....
I am looking output of two columns, which are bolded
product_id snumber
654081-B21 MXQ43905SW
for serial number this is common
before string :HostSystemIdentificationInfo"><identifierValue>
and after string </identifierValue><identifierType><label>Service tag
and snumber is always between the before and after string and number of characters of snumber varies and entire data for a row also varies
View 9 Replies
View Related
Dec 11, 2013
How do I write a query using the split function for the following requirement.I have a table in the following way
Identity Name Col1 Col2 Col3
1 Test1 1,2,3 200,300,400 3,4,6
2 Test2 3,4,5 300,455,600 2,3,8
I want an output in the following format
Identity Name Col1 Col2 Col3
1 Test1 1 200 3
1 Test1 2 300 4
1 Test1 3 400 6
2 Test2 3 300 2
2 Test2 4 455 3
2 Test2 5 600 8
If you see the data, first element in col1 is matched to first element in col2 and 3 after splitting the string.
View 2 Replies
View Related