SQL Server 2008 :: Text String Parsing To Apply Operators To Datasets?
Aug 7, 2015
I have a problem at the moment, where the client wants to be able to type in a custom algebraic formula with add/minus operators, and then to have this interpreted, so that the related datasets are then added and returned as a single dataset.
An example would be having a formula stored of [a] + [b] - [c]
and if I were to write the SQL to apply that formula, I might write something like this (let's assume 1:1 relationships on the IDs):
select a.a + b.b - c.c as [result]
from z
inner join tblA a on z.id = a.id
inner join tblB b on z.id = b.id
inner join tblC c on z.id = c.id
The formula can change though, maybe things like:
[a] + [b] + [c] + [d]
[a] + [b]
The developer before me wrote something SQL-based that parsed the string and marked each term of the formula as either positive or negative (e.g. A is positive, B is positive, C is negative; now sum the datasets to get the result), then built one large table of values and summed them. This does (kind of) work. I'm just contemplating potential alternatives, as it is quite a slow process and feels convoluted once I get into the details. If I were to do something like this in SQL, I'd normally want each part of the expression to be a column and then just apply the operators, but because the formula can change, the SQL would need to be dynamic for this approach.
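One possible alternative, sketched here in Python so the parsing logic is easy to test: split the stored formula into signed terms, validate each name, and generate the join query as text, which could then be run as dynamic SQL. The `tbl<NAME>` table naming, the one-column-per-term convention and the base table `z` are taken from the example above; the function names `parse_formula` and `formula_to_sql` are hypothetical, and only `+` and `-` are handled.

```python
import re

# A term is an optional sign followed by a bracketed name, e.g. "+ [b]".
# Restricting names to identifier characters keeps arbitrary text out of the SQL.
TERM = re.compile(r"\s*([+-]?)\s*\[([A-Za-z_][A-Za-z0-9_]*)\]")


def parse_formula(formula):
    """Return (sign, name) pairs, e.g. [('+', 'a'), ('+', 'b'), ('-', 'c')]."""
    text = formula.strip()
    terms, pos = [], 0
    while pos < len(text):
        match = TERM.match(text, pos)
        # Every term after the first must carry an explicit operator.
        if not match or (terms and not match.group(1)):
            raise ValueError(f"cannot parse formula at position {pos}: {formula!r}")
        terms.append((match.group(1) or "+", match.group(2)))
        pos = match.end()
    if not terms:
        raise ValueError("empty formula")
    return terms


def formula_to_sql(formula):
    """Build the SELECT text for a formula (assumed tbl<NAME> naming, base table z)."""
    terms = parse_formula(formula)
    names = [name for _, name in terms]
    if len(set(names)) != len(names):
        # Repeated terms would need distinct aliases; rejected to keep the sketch small.
        raise ValueError("repeated terms are not supported in this sketch")
    parts = []
    for i, (sign, name) in enumerate(terms):
        column = f"[{name}].[{name}]"
        if i == 0:
            parts.append(column if sign == "+" else f"-{column}")
        else:
            parts.append(f"{sign} {column}")
    lines = [f"select {' '.join(parts)} as [result]", "from z"]
    for name in names:
        lines.append(f"inner join [tbl{name.upper()}] [{name}] on z.id = [{name}].id")
    return "\n".join(lines)


print(formula_to_sql("[a] + [b] - [c]"))
```

Because the generated query has one column per term, the arithmetic is done in a single pass instead of building and summing one large table. Whether that is actually faster would need measuring against the real data.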
I'm trying to parse out a line of data that is separated by the text "atc1.", "atc2." etc.
For example,
[atc1.123/atc2.456/atc3.789/atc4.xyz/]
If I only want the data after atc2., then I could search the string for "atc2." and collect all the characters afterwards. But how can I make sure to trim off all the data after "atc3." to make sure I'm only collecting "456" from the example above?
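A minimal sketch of the idea in Python: find the key, then search for the next "/" starting from the end of the key, and take what lies between. The function name `value_after` is hypothetical. In T-SQL the same two-step search would typically use CHARINDEX with a start position together with SUBSTRING.

```python
def value_after(text, key, terminator="/"):
    """Return the characters between `key` and the next `terminator`, or None."""
    start = text.find(key)
    if start == -1:
        return None  # key not present
    start += len(key)
    end = text.find(terminator, start)  # search only after the key
    if end == -1:
        end = len(text)  # no terminator: take the rest of the string
    return text[start:end]


print(value_after("[atc1.123/atc2.456/atc3.789/atc4.xyz/]", "atc2."))
```

Stopping at the "/" rather than at "atc3." means the same call works for the last segment too, where no following "atc" marker exists.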
I have the value below in a column of data type TEXT:
QU 221025U2V/AN G-DT DL A 5 1A- 11,5,SF,230,30162,LZ,2,118,0,0,10170,25,06
This text value has some special characters in it, and I could not paste the exact value because this text box does not allow it. For reference, I've attached a screenshot (Capture.png) of the value.
I want to fetch last two values from this text i.e. 25 and 06. (It can be anything like 56R,06T but will be the last two values separated by comma)...
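As a sketch of the logic (in Python, with the hypothetical function name `last_two_fields`): splitting from the right at most twice isolates the last two comma-separated values without touching the rest of the string. Only surrounding whitespace is trimmed here; any other special characters in the real data would need their own handling.

```python
def last_two_fields(value, delimiter=","):
    """Return the last two delimiter-separated values of `value`."""
    parts = value.rsplit(delimiter, 2)  # split from the right, at most twice
    if len(parts) < 2:
        raise ValueError("fewer than two delimited values")
    return parts[-2].strip(), parts[-1].strip()


sample = "QU 221025U2V/AN G-DT DL A 5 1A- 11,5,SF,230,30162,LZ,2,118,0,0,10170,25,06"
print(last_two_fields(sample))
```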
We have a legacy database that has hundreds of stored procedures.
The previous programmer used strings like servername.databasename.dbo.tablename in the stored procedures. We have now migrated the database to a new server, so the old server name either needs to be replaced with the new server name or removed.
I don't know why he used the server name as part of the fully qualified name, since we don't use linked servers, so I think it is better to remove the server name from all the stored procedures.
I know I can generate a script, replace the text, and then use ALTER PROCEDURE to recreate all the stored procedures. But since there are hundreds of them, is there a programmatic way to replace them?
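A rough sketch of the text replacement step, in Python: a regular expression removes the server prefix only when it starts a four-part name, so a column or alias that happens to share the name is left alone. The function name `strip_server_prefix` and the sample server names are hypothetical. The procedure text itself would come from wherever the definitions are scripted out (in SQL Server the `definition` column of `sys.sql_modules` holds it), and each result should be reviewed before being applied.

```python
import re


def strip_server_prefix(sql_text, server_name):
    """Remove `server_name.` from four-part names (server.database.schema.object)."""
    # Not preceded by an identifier character, a dot or a closing bracket,
    # so we never match in the middle of a longer name.
    prefix = r"(?<![\w.\]])\[?"
    # Followed by "database.schema." (schema may be empty, as in db..table).
    suffix = r"\]?\.(?=\[?\w+\]?\.\[?\w*\]?\.)"
    pattern = re.compile(prefix + re.escape(server_name) + suffix, re.IGNORECASE)
    return pattern.sub("", sql_text)


print(strip_server_prefix(
    "SELECT * FROM servername.databasename.dbo.tablename", "servername"))
```

This sketch assumes simple identifiers (letters, digits, underscores); bracketed names containing spaces or dots would need a more careful pattern.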
I want to take this XML and put it into a table with CustomerId and MatchingSetId. With this SQL, each MatchingSetId gets assigned to each CustomerId instead of retaining the relationships in the XML.
Select... ,DISCHARGEHOUR.value('(./Discharge_x0020_Time/time/Hour)[1]', 'varchar(10)') AS [hour] ,DISCHARGEMINUTES.value('(./Discharge_x0020_Time/time/Hour:minute)[1]', 'varchar(10)') AS [Minutes] ,DISCHARGEAMPM.value('(./Discharge_x0020_Time/time/Hour/minute/AM_x002F_PM)[1]', 'varchar(10)') AS [ampm]
But minutes and AM/PM come up as NULL. I assume I am setting up something wrong with the level for minutes and AM/PM. Also, can I disregard the ":" in the minutes?
I have a CSV file with roughly 6 million rows. The file is unstructured; that is, some rows have 5 fields, others have 15, and there are as many as 50 fields in one row.
I am using bulk insert to read the entire file into a table in database, with each row being a database record. With that, I have one column that contains a row of comma delimited fields. All fields are character string and I want to find a quick way of parsing each row and placing each comma-delimited value in a column. For example:
Column CSVString contains a CSV row. I don't know how many fields (number of commas + 1) are in the row, but if the row contains 10 fields, I need to populate columns C1-C10. If the row has 15 fields, I populate columns C1-C15.
How can I do this in a very efficient way? I tried CTE but performance was not very good.
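To illustrate the target shape only (this is a Python sketch, not a T-SQL solution, and the function name `rows_to_columns` is hypothetical): each row is split once and padded out to the widest expected row, so every record ends up with the same C1..Cn columns and the unused ones are empty.

```python
import csv


def rows_to_columns(lines, max_columns=50):
    """Yield one dict per CSV row with keys C1..C<max_columns>; missing fields are None."""
    names = [f"C{i}" for i in range(1, max_columns + 1)]
    for fields in csv.reader(lines):
        if len(fields) > max_columns:
            raise ValueError(f"row has {len(fields)} fields, expected at most {max_columns}")
        padded = fields + [None] * (max_columns - len(fields))
        yield dict(zip(names, padded))


for record in rows_to_columns(["a,b,c", "1,2"], max_columns=4):
    print(record)
```

Doing the split before the load, so that the database receives rows that already have a fixed column count, is one way to avoid per-row string parsing inside SQL Server.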
I'm unable to reproduce the error. When they upgrade their OS and SQL Server Express to a more recent version, the error disappears.
The error is: Incorrect syntax near '.'
the query in question resembles this:
Select column1, column2 from Table1 T cross apply function(t.column4,t.column5) F where column3 = 'XXXX'
I made sure that the compatibility level is greater than 90. This error is happening on SQL 2005 SP2 as well as SQL 2008 with SP2 (but not all clients are suffering from the same problem).
Could it be the .NET Framework? The machines had .NET Framework 3.52, though.
Could the OS be an issue? The OSes seem to be old: Windows Server 2008 SP2.
I've tried to reproduce the error by setting up virtual machines with same OS and SQL but, again, can't reproduce.
I'm seeing where previous developers have used a single stored procedure for multiple reports, where each report required different columns to be returned. They are structured like this:
CREATE PROCEDURE dbo.GetSomeData (@rptType INT, @customerID INT) AS BEGIN IF @rptType = 1 BEGIN SELECT LastName, FirstName, MiddleInitial
[Code] ....
As you can see, the output depends on the given report type. I've personally never done this, but that's more because it's the way I learned as opposed to any hard facts to support it.
DECLARE @Teams AS TABLE(Team VARCHAR(3)) INSERT INTO @Teams SELECT 'IND' UNION SELECT 'SA' UNION SELECT 'AUS' select Team from @Teams where Team > 'AUS'
[code]....
Correlation between comparison operators in the WHERE clause and the respective output.
I have a scenario where I need to use a comma-delimited string as input and search the tables for each and every string in that comma-delimited list.
How do I remove the same repeated string from a column, per row, in a table? I looked at the REPLACE and STUFF string functions, but neither takes a column name as a parameter.
Can anybody translate this to Transact-SQL, specifically the example of create function elemIdx? I didn't understand how he used recursion, maybe because the language is odd to me.
Hi everybody, I was hoping to get some advice on something I can't quite get my head around. I have a SQL db which contains a table with ratings using the AJAX rating control. When someone rates an object, I need to select the current rating and then use those numbers to: calculate the new average, add the new score to the total score, and increment the number of votes by one.
I thought this could best be achieved using the SELECT statement and then parsing the SELECT string (is the string comma separated?). Using each array, I'd need to convert this into integers, do the calculation, and re-upload the data to the ratings table (using the UPDATE statement).
Is this the best way of proceeding? I initially tried to write the code using three SQL statements, but that would mean too many requests to the server, right? Below is the code I have written already.
int myrating;
myrating = Rating1.CurrentRating;
string getscore = "SELECT " + "RatingScore " + "FROM Rating " + "WHERE ItemID = '" + _ItemID + "'";
string getcount = "SELECT " + "RatingCount " + "FROM Rating " + "WHERE ItemID = '" + _ItemID + "'";
string getaverage = "SELECT " + "RatingAverage " + "FROM Rating " + "WHERE ItemID = '" + _ItemID + "'";
int _ratingscore;
int _newscore;
_ratingscore = int.Parse(getscore);
_newscore = _ratingscore + myrating; // add new rating score to old score
int _ratingcount;
int _newcount;
_ratingcount = int.Parse(getcount);
_newcount = _ratingcount + 1; // increase count by 1
int _ratingaverage;
int _newaverage;
_ratingaverage = int.Parse(getaverage);
_newaverage = _newscore / _newcount; // calculate new average rating
Otherwise, would I be best off doing the following?
string[] dbRatings = SQLstring.Split(',');
Any help would be appreciated. Many thanks in advance. Phil
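One thing worth noting about the code above: `int.Parse` is being applied to the text of the query, not to a value returned by running it, so nothing is ever read from the database. A possible way to avoid the three round trips entirely is a single UPDATE that computes all three columns from the existing row. The sketch below uses Python with SQLite as a stand-in for SQL Server; the table and column names come from the post, while the column types and the function name `add_rating` are assumptions.

```python
import sqlite3


def add_rating(conn, item_id, new_rating):
    """Apply one new vote with a single parameterized UPDATE."""
    # The right-hand sides all see the row's values from before the update,
    # so the average is computed from the new score and the new count.
    conn.execute(
        """
        UPDATE Rating
        SET RatingScore   = RatingScore + ?,
            RatingCount   = RatingCount + 1,
            RatingAverage = (RatingScore + ?) * 1.0 / (RatingCount + 1)
        WHERE ItemID = ?
        """,
        (new_rating, new_rating, item_id),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Rating (ItemID TEXT PRIMARY KEY, RatingScore INTEGER, "
    "RatingCount INTEGER, RatingAverage REAL)"
)
conn.execute("INSERT INTO Rating VALUES ('item-1', 8, 2, 4.0)")
add_rating(conn, "item-1", 5)
print(conn.execute("SELECT * FROM Rating").fetchone())
```

Passing the item ID and rating as parameters, rather than concatenating them into the SQL string, also avoids quoting problems and SQL injection.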
It was simple to parse simple variables using REPLACE, e.g. REPLACE(@str, '@customer_name', @customer_name). It worked like a mail merge. The converted string was then sent on through a web service. Now my requirement is to add conditional values in the body field, e.g.:
body =
Document ID: @document_id
Customer Name: @customer_name
Item name: @item_name
Quantity: @qty
IF isnull(@rate, 0) > 0 Rate: @rate
IF isnull(@rate, 0) > 0 Amount: @amount
How can I parse strings like this? I'm open to changing the format of the values in the body field.
Parsing any delimited string (the above example uses ',' as the delimiter). This query can be useful in many business scenarios where the input data is a long string containing delimited values.
I'm running into a couple of performance issues with regards to the parsing of a text string. We have a function that will take a comma delimited character string, parse out the individual values, and then populate a temp table with those values. The two issues are 1.) the parsing process is VERY slow and 2.) there's a max to how large the string can be - at some point it could easily be 8000 characters or more in length.
Here are the function and the stored procedure where it occurs:
CREATE FUNCTION [dbo].[Split](@String varchar(MAX), @Delimiter char(1))
RETURNS @Results TABLE (Item nvarchar(4000))
AS
BEGIN
    DECLARE @INDEX INT
    DECLARE @SLICE nvarchar(4000)
    -- HAVE TO SET TO 1 SO IT DOESNT EQUAL ZERO
    -- FIRST TIME IN LOOP
    SELECT @INDEX = 1
    WHILE @INDEX !=0
    BEGIN
        -- GET THE INDEX OF THE FIRST OCCURENCE OF THE SPLIT CHARACTER
        SELECT @INDEX = CHARINDEX(@Delimiter,@STRING)
        -- NOW PUSH EVERYTHING TO THE LEFT OF IT INTO THE SLICE VARIABLE
        IF @INDEX !=0
            SELECT @SLICE = LEFT(@STRING,@INDEX - 1)
        ELSE
            SELECT @SLICE = @STRING
        -- PUT THE ITEM INTO THE RESULTS SET
        INSERT INTO @Results(Item) VALUES(@SLICE)
        -- CHOP THE ITEM REMOVED OFF THE MAIN STRING
        SELECT @STRING = RIGHT(@STRING,LEN(@STRING) - @INDEX)
        -- BREAK OUT IF WE ARE DONE
        IF LEN(@STRING) = 0 BREAK
    END
    RETURN
END
--------------------
...and the stored procedure:
CREATE PROCEDURE [dbo].[RPTPatientAnalysis]
(
    @stateList CHAR(2),
    @employerIdList VARCHAR(4000),
    @payerIdList VARCHAR(4000)
)
AS
SELECT
    p.PAT_ID,
    p.PAT_FirstName,
    ISNULL(p.PAT_MiddleName,'') AS PAT_MiddleName,
    p.PAT_LastName,
    p.PAT_Gender,
    CONVERT(VARCHAR(10),p.PAT_DOB,101) AS DOB,
    p.PAT_AddressStreet1,
    ISNULL(p.PAT_AddressStreet2,'') AS PAT_AddressStreet2,
    p.PAT_AddressCity,
    p.PAT_AddressStateProvince,
    p.PAT_AddressPostalCode,
    ISNULL(p.PAT_EmailAddress,'') AS PAT_EmailAddress,
    p.PAT_PhoneNumber,
    ISNULL(e.EMPLOYER_Name,'<Unknown>') AS EMPLOYER_Name,
    ISNULL(p.PAT_OtherEmployerName,'') AS PAT_OtherEmployerName,
    ISNULL(p.PAT_Comment,'') AS PAT_Comment,
    ISNULL(p.PAT_PrimCareProv_PRIMCP_ID,'') AS PAT_PrimCareProv_PRIMCP_ID,
    ISNULL(p.PAT_PrimCareProvAllowNotification,0) AS PAT_PrimCareProvAllowNotification,
    ISNULL(p.PAT_PrimCareProvFullName,'') AS PAT_PrimCareProvFullName,
    ISNULL(p.PAT_DoNotMail,0) AS PAT_DoNotMail,
    ISNULL(p.PAT_UnderAgePermission,0) AS PAT_UnderAgePermission,
    p.PAT_LastEandMCodingDateTime,
    p.PAT_Desceased,
    p.PAT_PCP_ID,
    p.PAT_LastUpdatedDateTime,
    ISNULL(p.PAT_PCPRecordType,0) AS PAT_PCPRecordType,
    ISNULL(p.PAT_EnableEmailMarketing,0) AS PAT_EnableEmailMarketing,
    ISNULL(p.PAT_EnablePortal,0) AS PAT_EnablePortal,
    ISNULL(p.PAT_PortalID,0) AS PAT_PortalID,
    ISNULL(e2.EMPLOYER_Name,'') AS EMPLOYER_Name,
    ISNULL(p.PAT_OtherEmployerName,'') AS PAT_OtherEmployerName,
    pcp.PRIMCP_ID,
    ISNULL(pcp.PRIMCP_ADDR_ID,'') AS PRIMCP_ADDR_ID,
    ISNULL(pcp.PRIMCP_ClinicName,'') AS PRIMCP_ClinicName,
    ISNULL(pcp.PRIMCP_PhysicianFullname,'') AS PRIMCP_PhysicianFullname,
    pcp.PRIMCP_DateDeactivated,
    ISNULL(pcp.PRIMCP_Phone_MedicalRecordFax,'') AS PRIMCP_Phone_MedicalRecordFax,
    ISNULL(pcp.PRIMCP_Phone_Voice,'') AS PRIMCP_Phone_Voice,
    ISNULL(pcp.PRIMCP_MedicalRecords_Street1,'') AS PRIMCP_MedicalRecords_Street1,
    ISNULL(pcp.PRIMCP_MedicalRecords_Street2,'') AS PRIMCP_MedicalRecords_Street2,
    ISNULL(pcp.PRIMCP_MedicalRecords_City,'') AS PRIMCP_MedicalRecords_City,
    ISNULL(pcp.PRIMCP_MedicalRecords_State,'') AS PRIMCP_MedicalRecords_State,
    ISNULL(pcp.PRIMCP_MedicalRecords_Zip,'') AS PRIMCP_MedicalRecords_Zip,
    ISNULL(pcp.PRIMCP_Street1,'') AS PRIMCP_Street1,
    ISNULL(pcp.PRIMCP_Street2,'') AS PRIMCP_Street2,
    ISNULL(pcp.PRIMCP_City,'') AS PRIMCP_City,
    ISNULL(pcp.PRIMCP_State,'') AS PRIMCP_State,
    ISNULL(pcp.PRIMCP_Zip,'') AS PRIMCP_Zip,
    ISNULL(pcp.PRIMCP_DoNotFax,0) AS PRIMCP_DoNotFax,
    pati.PATINS_InsuranceTypeID,
    ISNULL(pati.PATINS_Account,'') AS PATINS_Account,
    ISNULL(pati.PATINS_Group,'') AS PATINS_Group,
    ISNULL(pati.PATINS_CopayType,'') AS PATINS_CopayType,
    ISNULL(pati.PATINS_CopayAmount,0) AS PATINS_CopayAmount,
    ISNULL(pati.PATINS_CollectFullAmount,0) AS PATINS_CollectFullAmount,
    ISNULL(pati.PATINS_EmployerPays,0) AS PATINS_EmployerPays,
    ISNULL(pati.PATINS_ZeroScreenCopay,0) AS PATINS_ZeroScreenCopay,
    ISNULL(pati.PATINS_ZeroVaccineCopay,0) AS PATINS_ZeroVaccineCopay,
    ISNULL(pati.PATINS_NonPar,0) AS PATINS_NonPar,
    ISNULL(pati.PATINS_MedicarePlan,0) AS PATINS_MedicarePlan,
    ISNULL(ipcl.INSPCAT_Description,'') AS INSPCAT_Description,
    ISNULL(ip.INSP_Name,'') AS INSP_Name,
    ISNULL(ip.INSP_ChargeFullPrice,0) AS INSP_ChargeFullPrice,
    ISNULL(ip.INSP_CopayApplies,0) AS INSP_CopayApplies,
    CONVERT(VARCHAR(10),ip.INSP_DeactivatedDate,101) AS INSP_DeactivatedDate,
    ISNULL(ip.INSP_EligibilityActive,0) AS INSP_EligibilityActive,
    CONVERT(VARCHAR(10),ip.INSP_PromoStartDate,101) AS INSP_PromoStartDate,
    CONVERT(VARCHAR(10),ip.INSP_PromoEndDate,101) AS INSP_PromoEndDate
FROM dbo.patient AS p
LEFT JOIN dbo.Employer AS e ON p.PAT_EMPLOYER_ID = e.EMPLOYER_ID
LEFT JOIN dbo.Employer AS e2 ON p.PAT_SecondaryEMPLOYER_ID = e2.EMPLOYER_ID
LEFT JOIN dbo.PrimaryCareProvider AS pcp ON p.PAT_PCP_ID = pcp.PRIMCP_ID
LEFT JOIN dbo.PatientInsurance AS pati ON p.PAT_ID = pati.PATINS_PAT_PERS_ID AND PATINS_InsuranceTypeID = 1
LEFT JOIN dbo.InsurancePayer AS ip ON pati.PATINS_INSP_ID = ip.INSP_ID
LEFT JOIN dbo.InsurancePayerCategoryLookup AS ipcl ON ip.INSP_INSPCAT_ID = ipcl.INSPCAT_ID
WHERE p.PAT_AddressStateProvince IN (SELECT Item FROM dbo.SplitVarcharMax(@stateList,','))
  AND PAT_EMPLOYER_ID IN (SELECT Item FROM dbo.SplitVarcharMax(@employerIdList,','))
  AND pati.PATINS_INSP_ID IN (SELECT Item FROM dbo.SplitVarcharMax(@payerIdList,','))
Is there a faster / more efficient way to accomplish the above?
I am trying to parse a text column using a cursor. Basically, here is the statement I am trying to convert to a cursor:
SELECT DATA_ROW,
    SUBSTRING(FAILURE_MESSAGE,35,5) AS INVALID_1,
    SUBSTRING(FAILURE_MESSAGE,70,5) AS INVALID_2
FROM TBL_ERRORS
WHERE LEFT(FAILURE_MESSAGE,200) LIKE '%ORA%'
My table has 2 fields as 'count' and 'codes'. The 'codes' field has 'count' # of code values in each record. Size of each code is 4. For example, if my record is 2,'abcdefgh' then there are 2 codes and the values are 'abcd' and 'efgh'.
Currently I am using a 'script component' to parse the field into multiple values. Since I have to read 1 million records and, on average, each record has 10 codes, it is taking hours to load.
Can it be done without 'script component' using some other transformations?
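The split itself is just fixed-width slicing, sketched below in Python to make the logic concrete (the function name `split_codes` is hypothetical, and this is not an SSIS transformation). The same arithmetic, with code number n starting at character (n - 1) * 4 + 1, is what a set-based approach would use, for example a join to a numbers table combined with SUBSTRING.

```python
def split_codes(count, codes, width=4):
    """Split `codes` into `count` fixed-width values of `width` characters each."""
    if len(codes) != count * width:
        raise ValueError(
            f"expected {count * width} characters for {count} codes, got {len(codes)}"
        )
    return [codes[i:i + width] for i in range(0, count * width, width)]


print(split_codes(2, "abcdefgh"))
```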
I can't figure this one out. I don't have enough knowledge of the string functions I guess.
I need to pull a value out of a variable I setup in a for each loop. The value is the filename/path of each source file being processed. Let's say the variable that has the source file path is called VAR1.
One sort of off-topic thing I've noticed is that when I watch the variable in debug mode and look at the value of VAR1, it has double backslashes. Here's an example of the value of VAR1:
"\\L3KRZR6.na.xerox.net\C$\Documents and Settings\ca051731\Desktop\Project4\DPT_20070926.ver"
How come the back slashes have been doubled? And do I need to account for that when I start parsing the string value?
Anyway, I need to grab part of the filename from VAR1 and I need the value populated at the start of the for each loop container - ideally when I capture VAR1 in the for each container. I'll be using the string in drop table, create table and create index statements before the actual Data Flow task within the overall package
In the above example I need to grab the characters before the underscore and after the last \. So I'd need the string "DPT" captured in this example.
The actual string could be 1 to 3 characters long, even though this example has it as 3 long.
Underscores could exist anywhere in the actual UNC path once this package is moved to our actual system environments so I can't key off of the underscore.
Because I can't count on the string being a fixed length, I can't just use a positional string function and grab specific text starting and ending at specific points.
Is there a way to use the various string functions in the expression builder to grab the text between the right-most underscore and the right-most backslash, or something like that? Ideally I'd like to set up a new expression-based, package-scoped variable called VAR2 and build it using string functions applied to VAR1.
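The logic can be sketched in two steps (shown in Python; the function name `file_prefix` is hypothetical, and this is not SSIS expression syntax): first keep only what follows the last backslash, then take everything before the first underscore in that file name. Working on the file name alone means underscores elsewhere in the UNC path do not matter. The doubled backslashes in the watch window are most likely just how the debugger displays an escaped string, in which case the actual value contains single backslashes.

```python
def file_prefix(path):
    """Return the part of the file name before its first underscore."""
    file_name = path.rsplit("\\", 1)[-1]  # text after the last backslash
    prefix, separator, _ = file_name.partition("_")
    if not separator:
        raise ValueError(f"no underscore in file name: {file_name!r}")
    return prefix


var1 = (r"\\L3KRZR6.na.xerox.net\C$\Documents and Settings"
        r"\ca051731\Desktop\Project4\DPT_20070926.ver")
print(file_prefix(var1))
```

This assumes the wanted prefix itself never contains an underscore, which fits the 1 to 3 character values described.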
The suggestion to do this is buried deep in one of my posts, however I still do not have a clear idea of how to do this.
I have a flat file which has several "bad rows" in it. Because file error redirection is buggy, I need a manual approach to get rid of these incomplete rows in my data file.
Phil, you suggested I read the file as one long string, then parse out the bad rows (using a script?).... however I have no idea as to how to actually do this.
I was wondering if it's possible to clarify the steps involved in doing this, or perhaps point me to an example I can look at, as I cannot seem to get around this problem on my own.
Hello all, I have a question regarding importing text file data into SQL Server. I'm hoping someone can point me in the right direction, as my searches haven't turned up anything specific enough.
I'm trying to parse a large (24MB) text file. It's a fixed-width file with multiple columns. I need to parse this file, check if a record already exists, and then import the data into the database. But I don't need to insert every column; there are only a few columns from the file I need to insert. This parsing also needs to occur at regular intervals (daily).
I looked at BULK INSERT, but I can't find an example that uses only some of the columns. Every example uses all columns, and the file is delimited, not fixed-width. Is there anything within SQL Server that can accomplish this? I haven't turned up anything that will solve my problem.
The only other solution I can think of is an application that parses the file for me and inserts the data into the database. But can I schedule that application to run every night at midnight (for example) through SQL Server? I'm not too familiar with SQL Server, so I appreciate any help offered.
Thanks, Jay
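If the external-application route is taken, the parsing part is small. Here is a minimal Python sketch that keeps only the needed columns from each fixed-width line; the layout (column names and character offsets) is entirely hypothetical and would come from the real file specification, as is the function name `parse_fixed_width`.

```python
# Hypothetical layout: name -> (start, end) offsets, 0-based, end exclusive.
# Columns of the file that are not listed here are simply skipped.
LAYOUT = {
    "id": (0, 5),
    "amount": (25, 35),
}


def parse_fixed_width(line, layout=LAYOUT):
    """Extract only the wanted columns from one fixed-width line."""
    return {name: line[start:end].strip() for name, (start, end) in layout.items()}


line = "00042" + "John Smith".ljust(20) + "123.45".rjust(10)
print(parse_fixed_width(line))
```

The existence check and the insert would then run per parsed record (or against a staging table) using whatever database driver the application uses.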
Hey guys, I know this may sound impossible, but let's say I have a number of fields, one of which is a long blob or long text.
Is there a way to have MySQL search the blobs for keywords and then extract them to other fields? Basically, what I am asking is: is it possible to parse a long text blob for keywords and then grab the data before or after those keywords?
See sample data below. I'm trying to count the number of occurrences of strings stored in table @word without a while loop.
DECLARE @t TABLE (Id INT IDENTITY(1,1), String VARCHAR(MAX))
INSERT INTO @t SELECT 'There are a lot of Multidimensional Expressions (MDX) resources available' AS String UNION ALL SELECT 'but most teaching aids out there are geared towards professionals with cube development experience' UNION ALL
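As a sketch of loop-free counting (in Python; the word list is hypothetical since the contents of @word are not shown, and the function name `count_occurrences` is made up): each word is counted as a substring across all strings. Note that substring counting also matches inside longer words, so "are" is found in "geared". In T-SQL a common loop-free equivalent compares the string length before and after removing the word with REPLACE and divides the difference by the word length.

```python
def count_occurrences(strings, words):
    """Count non-overlapping, case-insensitive substring matches of each word."""
    lowered = [s.lower() for s in strings]
    return {w: sum(s.count(w.lower()) for s in lowered) for w in words}


strings = [
    "There are a lot of Multidimensional Expressions (MDX) resources available",
    "but most teaching aids out there are geared towards professionals "
    "with cube development experience",
]
print(count_occurrences(strings, ["are", "there"]))
```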
I have a website where people upload tab-delimited text files of their product inventories, which the site parses and inserts into a database table. Here's the catch: instead of insisting that each user use a standardized format, each user can upload the file in whatever column order they want; they just have to let the site know through a GUI which column is in which position. They may also upload extra columns, which are ignored if not mapped. Right now I am doing all of this in code and it runs slowly. I was thinking of offloading this to a stored procedure, SSIS, or a bulk upload, but with the varying format of the uploaded text files, I am not sure how I could do that. Any suggestions? Thanks!
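One option is to normalize each upload into a single standard column order before any bulk load, so that the database side only ever sees one format. The Python sketch below illustrates that remapping step; the target field names, the shape of the per-user mapping and the function name `remap_rows` are all assumptions.

```python
import csv


def remap_rows(lines, column_map):
    """Reorder tab-delimited rows into named fields.

    `column_map` maps a target field name to the zero-based position of that
    column in this user's upload; unmapped columns are ignored.
    """
    for fields in csv.reader(lines, delimiter="\t"):
        yield {
            target: (fields[index] if index < len(fields) else None)
            for target, index in column_map.items()
        }


# Hypothetical mapping captured from the GUI for one user.
user_map = {"sku": 3, "name": 0, "price": 1}
upload = ["Widget\t9.99\tnot mapped\tSKU-1"]
for row in remap_rows(upload, user_map):
    print(row)
```

The normalized rows could then be written out in the standard order and loaded with a single bulk operation, rather than inserted one at a time from code.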