Validating Inputs And Finding Patterns With PatIndex
Sep 5, 2007
---Checks that input only contains numbers
if PatIndex('%[^0-9]%','11') > 0
Begin
Print 'Not all numbers'
End
Else
Begin
Print 'All numbers'
End
---Checks that input only contains letters
if PatIndex('%[^a-z]%','aaaaa') > 0
Begin
Print 'Not all letters'
End
Else
Begin
Print 'All letters'
End
--Checking for mixed input
If PatIndex('%[^0-9][0-9]%','abc') > 0
Begin
Print 'Alpha numeric data'
End
Else
Begin
Print 'Either all numbers or all letters'
End
--Checks that value must start with a letter and a number
If PatIndex('[^0-9][0-9]%','A1anamwar11') > 0
Begin
Print 'Starts with a letter and a number'
End
Else
Begin
Print 'Does not start with a letter and a number'
End
--Checks that value must End with a letter and a number
If PatIndex('%[^0-9][0-9]','A1anamwar11a1') > 0
Begin
Print 'Ends with a letter and a number'
End
Else
Begin
Print 'Does not End with a letter and a number'
End
--Checks that value must Start with a letter and Ends with a number
If PatIndex('[^0-9]%[0-9]','namwar1') > 0
Begin
Print 'Starts with a letter and ends with a number'
End
Else
Begin
Print 'Does not start with a letter and ends with a number'
End
This question is rather open ended. Basically, I'm wondering the different approaches people use when validating input into a stored procedure. The rest of my post just describes a rather simple approach I'm using, and some ideas I have, but I'm eager to know what others are doing. So, if you'd like to comment on this, feel free to do so without reading the following.
I often find myself calling a stored procedure, and needing to validate some of the inputs before proceeding with the rest of the tasks. Such as, making sure a matching record isn't already in the database before adding a new record. Ideally, this sort of validation will report back to the user interface immediately, so that the user can get real-time response to their form input, instead of the all-too-common approach of redirecting to an error page if something goes wrong in the database transaction.
It's also convenient, at times (though perhaps not efficient in all cases) to let your database perform business rules validation of inputs (let the user interface make sure data types are valid, and such), so that you duplicate less code in the event that the stored procedure is called in more than one context.
To address these issues, I've taken to making two stored procedures instead of one. Let the stored procedure be called AddRecord. I'd create that, as well as the procedure named AddRecord_Validate, which would take identical inputs as AddRecord, whenever possible. Nearly the first thing done in AddRecord would be to execute AddRecord_Validate. The user interface would also call AddRecord_Validate, and return any validation errors to the user interface.
This seems rather convenient to me, in many cases, but often it doesn't work as well as I prefer. It's not uncommon for me to end up performing the same queries in the _Validate SP as I do in the parent SP. For example, let's say I'm passing a parameter that I use to look up a record I plan to edit. In my validation, I query the database to verify that the record is found. But in the parent SP, I run the same query again in order to get the stored ID of the record I need to edit. This ends up duplicating queries, making the pair of procedures less efficient on the whole, as well as introducing the likelihood of bugs when code changes in one of the SPs, but not the other.
So I've been brainstorming other approaches. The first idea is to execute a single stored procedure, but when I want to run it in "validation" mode, I simply rollback the transaction. This would let me know if the stored procedure would have run with the specified input, but won't ultimately change anything. Unfortunately, I see some badness in this, such as the entire SP perhaps taking a lot longer to execute that simple validation would have, and side-effects such as incrementing Sequences, or locking the database, and basically just doing a whole lot more work during validation than needs to be done.
The next idea was to require that the user specify the "run mode" via a parameter, either 'validate' or 'run'. Then, all of the validation would be performed at the top of the stored procedure, and a simple IF statement would exit the SP before actually making any changes if it's run in 'validate' mode. Otherwise, it will continue, and do the real work. The only real downside I see to this is forcing the developer to deal with an extra parameter. And, perhaps, it's not really a "relational" approach.
Design patterns are indispensable to improve the quality and productivity of system under development. Even though there are books available addressing the needs of object oriented programming, but useful information for SQL design patterns are hard to find.
I would appreciate if you could share any of the resources that you may have come across.
I am looking for something similar to patterns for table design, but I have been unable to find anything on that, or good rule of thumb performance guides for table design.
Most of the situations we face look like perfect candidates for patterns, and some good rules of thumb -- problems like scalable OLAP requirements PLUS real time reporting.
I am looking for several differnet approaches and a good summary of the trade-offs for each one.
I am trying to design some functionality that emulates the recurrence functionality of MS Outlook calendar. In the calendar, users can choose a recurrence pattern (i.e. daily, weekly, monthly or yearly). Depending on the radio button that is selected, different choices appear. I am unsure as to how to model this in the database.
Here's what I have so far:
A ScheduleDetails table that will store information applicable to all schedules, regardless of recurrence pattern. One of the columns in this table will store the recurrence pattern type (0 for daily, 1 for weekly, etc.). The valid values for this column will be stored in a look-up table.
Now, if daily is selected, the user can select a radio button called 'Every <#> days' OR a radio button called 'Every weekday'. My plan is to have a new ScheduleDailyRecurrences table that has a Days field. If 'Every <#> days' is selected, the value of Days will be the value entered by the user. Otherwise, if the user has selected 'Every weekday', the value will be -1. Is this a good way to do it? I feel as though I may be attaching too much meaning to the particular value.
If monthly is selected, it gets even more complicated. The user can either select 'Day <#> of every <#> month(s)' OR 'The <count> <day> of every <#> month(s)'. I'm not sure how to model this. In my ScheduleMonthlyRecurrences table, should I have a field called 'Pattern'? Depending on the value of this field, the details would be then be stored in yet another table? Or should I just have columns in ScheduleMonthlyRecurrences for each user-entered value? This would mean that some of the fields would not be applicable depending on the radio button that the user has selected.
It's a little difficult to explain, but if you go to the Recurrence button in Outlook calendar, you'll see what I mean.
I have a database which contains time series data (historical stock prices) which I have to search for patterns on a day to day basis. But searching this historical data for patterns is very time consuming not only in writing the complex t-sql scripts but also executing them.
Table structure for one min data: [Date] [Time] [Open] [High], [Low], [Close], [Adjusted_Close], [MA], [DI]..... Tick Data: [Date] [Time] [Trade] Most time consuming queries are with lots of inner joins. So for example if I have to compare first few mins data then I have to do inner join like: With IntervalData AS ( SELECT [Date], Sum(CASE WHEN 1430 = [Time] THEN [PriceRange] END) AS '1430', Sum(CASE WHEN 1431 = [Time] THEN [PriceRange] END) AS '1431', Sum(CASE WHEN 1432 = [Time] THEN [PriceRange] END) AS '1432' FROM [INDU_1] GROUP BY [Date] ) SELECT [Date] ,[1430], [1431], [1432], [1431] - [1430] As 'Range' from IntervalData WHERE ([1430] > 0 AND [1431] < 0 AND [1432] < 0) OR ([1430] < 0 AND [1431] > 0 AND [1430] > 0) ------------------------------------------------------------------------ select ind1.[Time], ind1.PriceRange,ind2.[Time], ind2.PriceRange from INDU_1 ind1 INNER JOIN INDU_1 ind2 ON ind1.[Time] = ind2.[Time] - 1 AND ind1.[Date] = ind2.[Date] where (ind1.[Time] = 2058) AND ((ind1.PriceRange > 0 AND ind2.PriceRange >0) OR (ind2.PriceRange < 0 AND ind1.PriceRange < 0)) ORDER BY ind1.[Date] DESC; Is there anyway I can use Sql 2005 Data mining models to make this searching faster?
I was wondering if someone would point out my flaw. I am attempting to get information separated by spaces, placed into separate columns. I tried without success several times yesterday to get what I desired and was not able to get it done. I am fairly new at developing SQL code and am low on the SQL coding totem pole of experience but would like your help.
My problem arises when I, in the second column, attempt to get only the information BETWEEN the first space: PATINDEX('% %', c1)+1 and the second space. I am coming up short and would like someone to point out my error or assist me in getting the query pointed in the right direction.
I have a report with a chart that utilizes both colors and background patterns to distinguish various pieces of data.
The background patterns work great (they are very clear and provide excellent contrast) when viewed in the report viewer or printed straight from the report viewer.
However, when I export to PDF, the background patterns become compressed and small, almost to the point of being too small to distinguish that there is a pattern at all. The result ends up being what appears to be just a different shade of the background color. This is particularly a problem when you then print the PDF on paper, since hatching patterns are similar to how printers create different shades in the first place.
I am somewhat suspicious that this might have to do with resolution settings for PDF exports. If I could lower the resolution, perhaps the patterns would not become so tight and compressed. Below are examples of what I'm describing:
My application has a log table with a timestamp field which represents the time when the record was inserted.
We would like query the the table and group the results into date units based on the value of the timestamp.
The grouping specification is held in another table but only specified for a single week.
For example
WeekDay Start End
1 06:00:00 14:00:00
2 08:10:00 17:00:00
What this specification means is that when the log table is queried records with a timestamp that fall on a Monday between 06:00:00 and 14:00:00 should be grouped together, on Tuesday the group is records between 08:10:00 and 17:00:00 and so on...
The only way I can think of doing this is to generate a temporary table when the log table is queried that has a unique record for each time period for the span of the query and then attempt to join this table to the log table using the timestamp and then group based on the temporary table unique ID.
I'm not sure how to generate such a temporary table from the specification table so any help would be appreciated.
hi! i have one datatable with some file's details like id, name, createdate. i wrote one storedprocedure to search in that table by filename using wildcard character. its also working perfectly. But the problem is i want to arrange the results in ascending order depended by matching level. If I search for 'file123' it should give results in one order by matching level, means - lets think the results are 'fil23', file123', 'file', 'file1'. the results should be in order like 'file123', 'fil23', 'file1', 'file'. the results of the sp which i used, are order by datecreated. Please help me to do this.
the storeprocedure i used - "CREATE PROCEDURE [dbo].[SP_DYN_SearchinFiles_DynByWildcardsADMIN] @searchparam nvarchar(50) AS BEGIN select Id, OriginalName, ContentType, FullName, Department, DateCreated, Client, username from Files_Dyn where PATINDEX('%' + @SearchParam + '%', OriginalName) > 0
Hello,I am trying to update a column in a table with an 8 digit string.The 8 digit number is obtained from another column which is a commentsfield.e.g.Comments Field :1) RD-V0216970 RQST NON SMKING KING Unknown # : 223389212) Received verbal authorization Authorization # 0403453) international plaza, singapore # 96722540The code that I am using isUPDATE U SET U.NUM =CASEWHEN U.BOOKED_COMMENTS_TXT LIKE('%[0123456789][0123456789][0123456789][0123456789][0123456789][0123456789]%')THENSUBSTRING(U.BOOKED_COMMENTS_TXT,PATINDEX('%[0123456789][0123456789][0123456789][0123456789][0123456789][0123456789]%',U.BOOKED_COMMENTS_TXT), 8)ELSE NULLENDFROM UNKNOWN1_RESERVATIONS UHere's what my result set looks like1)V02169702)0403453)96722540But this is how I want my result set to look like1)223389212)null3)96722540What I need is a way to restrict the search criteria to exactly 8numeric digits. Any suggestions will be helpful.Thanks in advance
Hi allUsing SQL Server 2000 SP3 on WIn2K Pro with SP2.When I do thisSELECT PATINDEX('%[%', 'ABC[DEF')GOI get a result of 0, when I would expect a result of 4.This works as I expect:SELECT PATINDEX('%]%', 'ABC]DEF')GOI must be missing something really obvious here Any help would beappreciated.
Hi all, My database is not case-sensitive, but I want output like... SELECT patindex('%[A-Z]%','gaurang Ahmedabad') The output should be first occurrence of uppercase A to Z, so output should be 9 it should not be 1. Above query is giving output as 1 bcoz the 1st character in the expression is 'g' and it is in A to Z, but this is not capital 'G'. The 1st capital letter in the expression is 'A' (9th character in the expression). Is there anyway to achieve this using PATINDEX? or Is there any other way to achieve this? Thanks, Gaurang Majithiya
Alright, I am trying to write a query that will select a column.. then search through the column and return me the fields with no underscores in them.
i.e we have fields that are like ABCD and ABCD_1. I want to basically return all of them up to the underscore.
so i am trying to use a substring combined with patindex. Here is my statement for patindex. patindex('%_%', short_name)
however, i cannot get patindex to ignore the fact that _ is a wildcard and treat it as character. Ive tried ('%[_]%', short_name) and ive tried ('%!_%', short_name) escape '!' and i just keep getting errors... If someone could help me out, that would be most appreciated. thank you.
I have a number of columns I'm selecting from a table, and I'd like to use PATINDEX to search for a string within a field. Upon finding this string it would return another string...:
SELECT CASE PARTNERID WHEN PATINDEX('%INX%',PARTNERID)>0 THEN 'AXXENT' ELSE PARTNERID END AS PARTNERID
However, I'm getting this error:Incorrect syntax near '>'.
I have a table which stores a big chunk of html text, which I have to search for and replace some strings. How can I write a query that makes use of regular expressions to do the following:
Below is a block of example text ====================================================================== " src="http://aaaa/bbbb/cccc.nsf/blind%2010x10.gif?OpenImageResource" width=10> <img height=10 alt="" src="http://aaaa/bbbb/cccc.nsf/blind%2010x10.gif?OpenImageResource" width=10><img height=11 alt="" http://aaaa/bbbb/cccc.nsf/blablablabla?OpenDocument src="http://aaaa/bbbb/cccc.ns http://aaaa/bbbb/cccc.nsf/d8d4f0dfa09ba71142256cc50040ee51/ee9b4493dac727be422571c30027df61? OpenDocument" target=_self>
Now I want to search and replace the string that is highlighted in red with something like "FOUND" The below string is what I want to search for - I will used PATINDEX on the string below to find the start index of the string in the body, and then I will add on 114 to the when doing a replace:
what would the regular expression be and how does one type it in the query? I tried something like the below but it does not work, please advise ..... [a-z][0-9]{32} %/% [a-z][0-9]{32}
I have a value in a column '01-08-087-0101W5'. I would like to break this value down into 5 different columns. The column will be broken down at the dashes , so the 5 columns will have values like
01 08 087 101 5
I would also like to trim any leading 0's in the above values. i have been trying the following
SUBSTRING(column1,patindex('%[^0]%',column1) ,2) AS FROM_TWP
I am creating a view to pull data for a UPS integration I am doing. I currently have this view where I pull my data from. All is well accept for my PATINDEX,
Currently I have this PATINDEX So when a user puts this into delivery instructions "#999999" UPS_FINAL returns 999999 which is good for me to use. But The PATINDEX will also grab all kinds of neat stuff out of the delivery instructions field when I really only want whatever is after the # sign in the field.
I am new to sql and dont quite understand how this search is working but I think I am in need of a better way to search the field
Code:
WHEN SUBSTRING(P21PLAY.dbo.p21_view_oe_hdr.delivery_instructions, PATINDEX('%[^a-z ]%', P21PLAY.dbo.p21_view_oe_hdr.delivery_instructions) + 1, 6) IS NULL
THEN dbo.Address_Table.ups_code
ELSE SUBSTRING(P21PLAY.dbo.p21_view_oe_hdr.delivery_instructions, PATINDEX('%[^a-z ]%', P21PLAY.dbo.p21_view_oe_hdr.delivery_instructions) + 1, 6) END AS UPS_FINAL
------------------------------------------------------------------------------------- What I am looking to accomplish is a SUBSTRING that will search delivery_instructions
Code:
SELECT
TOP (100) PERCENT P21PLAY.dbo.p21_view_oe_pick_ticket.pick_ticket_no, P21PLAY.dbo.p21_view_oe_hdr.order_no, P21PLAY.dbo.p21_view_oe_hdr.customer_id, P21PLAY.dbo.p21_view_oe_hdr.ship2_name, P21PLAY.dbo.p21_view_oe_hdr.ship2_add1, P21PLAY.dbo.p21_view_oe_hdr.ship2_add2, P21PLAY.dbo.p21_view_oe_hdr.ship2_city, P21PLAY.dbo.p21_view_oe_hdr.ship2_state, P21PLAY.dbo.p21_view_oe_hdr.ship2_zip, P21PLAY.dbo.p21_view_oe_hdr.po_no, P21PLAY.dbo.p21_view_oe_pick_ticket.carrier_id AS Carrier, P21PLAY.dbo.p21_view_oe_pick_ticket.carrier_id AS Supplier,
P21PLAY.dbo.p21_view_oe_hdr.delivery_instructions) + 1, 6) AS UPS_Shipper,
------------------Checks view for email address or assigns alternate------------------ (CASE WHEN charindex('@', p21_view_contacts.email_address) > 0 THEN p21_view_contacts.email_address ELSE
'email@domain.com' END) AS alternate_address,
'Y' AS QVN, 'email@domain.com' AS failureaddress,
P21PLAY.dbo.p21_view_contacts.email_address,
------------When carrier_id is not one of these # then Null; else ------------------------------
CASE WHEN P21PLAY.dbo.p21_view_oe_pick_ticket.carrier_id NOT IN (105188, 105191, 105194, 105197, 105200,
105203, 105206, 105209, 105212) THEN NULL
----------------- Looks for special Character in delivery_instructions; if NULL then ups_code; ELSE return value from delivery_instructions as UPS_Final--------------------
WHEN SUBSTRING(P21PLAY.dbo.p21_view_oe_hdr.delivery_instructions, PATINDEX('%[^a-z ]%', P21PLAY.dbo.p21_view_oe_hdr.delivery_instructions) + 1, 6) IS NULL THEN dbo.Address_Table.ups_code ELSE SUBSTRING(P21PLAY.dbo.p21_view_oe_hdr.delivery_instructions, PATINDEX('%[^a-z ]%', P21PLAY.dbo.p21_view_oe_hdr.delivery_instructions) + 1, 6) END AS UPS_FINAL
FROM dbo.Address_Table INNER JOIN P21PLAY.dbo.p21_view_oe_pick_ticket INNER JOIN P21PLAY.dbo.p21_view_oe_hdr ON P21PLAY.dbo.p21_view_oe_pick_ticket.order_no =
P21PLAY.dbo.p21_view_oe_hdr.order_no ON dbo.Address_Table.id = P21PLAY.dbo.p21_view_oe_hdr.customer_id LEFT OUTER JOIN P21PLAY.dbo.p21_view_contacts ON P21PLAY.dbo.p21_view_oe_hdr.contact_id = P21PLAY.dbo.p21_view_contacts.id
WHERE (P21PLAY.dbo.p21_view_oe_hdr.completed <> 'Y') AND (P21PLAY.dbo.p21_view_oe_hdr.delete_flag <> 'Y') AND (P21PLAY.dbo.p21_view_oe_hdr.will_call <> 'Y') ORDER BY P21PLAY.dbo.p21_view_oe_pick_ticket.pick_ticket_no
Hello, I understand that it is not possible to use PATINDEX in a Derived Column transform. I'm trying to eliminate leading zeros from a column where the string is always 14 characters long, but the first non zero character could occur any number of characters from the left of the string. The following achieves what I need:
substring([Account No w/0s], patindex('%[^0]%', [Account No w/0s]), 14) AS ExtractedAcctNo
Does anyone happen to know how I might express this differently in a Derived Column transform? Should I consider a calling a function through an OLE DB Command transform on each row instead? Maybe I should have a SQL Task that runs an UPDATE statement against the column.
I would appreciate any thoughts/ideas on the following use case for the distributed service broker application we plan to migrate from our existing proprietary tcp based message protocol using database tables for reliability.
There are two ssb services running in separate sql server instances, each on a different server machine. For simplicity, let us assume the ssb endpoint names are SSBA, SSBB. SSBB is the Initiator of the Dialog while SSBA is the Target. Now the requirement is that if the underlying network communication between the two ssb endpoints(SSBA and SSBB) is broken or if the critical service SSBB is down, then processing of any incoming message into SSBA's queue from a third service broker service (say SSBEXPR) running within a SqlExpress instance should be delayed until SSBB is alive and network communication between SSBA and SSBB is established. In our existing implementation (wherein SSBA, SSBB and SSBEXPR are windows services) we use a combination of TCP socket disconnects and Heartbeat messages between SSBA and SSBB to determine the health of network connection and that of the SSBB service.
Now my understanding of how the underlying network connection for a ssb dialog works is that if there is no activity on a dialog for a certain amount of time then the underlying network connection is closed. Is there a way to specify the amount of time to say infinite value or something and thus change this behavior? My other question is how can one query the underlying network connection (i.e. a row from sys.dm_broker_connections) associated with a particular conversation? If none of this is possible, then any other patterns/ideas/approach is welcome.
Using the Microsoft Patterns and Practices "Object Builder" (Dependency Injection/Builder library), I wrote an SQL CLR Stored Procedure (using VS 2005 Professional).
All compiles and deploys ok (to SQL Server 2005 Express).
However, at run-time, I get the following error upon a "BuilderContext" object's instantiation: {"Attempted to perform an operation that was forbidden by the CLR host."}
Thoughts on how to get ObjectBuilder working in the SQLCLR?
Thanks!
Andy
(posted below is the sample code...with the run-time exception occuring on the ...BuilderContext cxt = new ... call)
Microsoft.Practices.ObjectBuilder.BuilderStrategyChain chain = new Microsoft.Practices.ObjectBuilder.BuilderStrategyChain();
Microsoft.Practices.ObjectBuilder.Locator locator1 = new Microsoft.Practices.ObjectBuilder.Locator(null);
// Get error when new'ing BuilderContext: "Attempted to perform an operation that was forbidden by the CLR host." Microsoft.Practices.ObjectBuilder.BuilderContext cxt = new Microsoft.Practices.ObjectBuilder.BuilderContext(chain, locator1, null);
My database is not case-sensitive, but I want output like...
SELECT patindex('%[A-Z]%','gaurang Ahmedabad')
The output should be first occurrence of uppercase A to Z, so output should be 9 it should not be 1.
Above query is giving output as 1 bcoz the 1st character in the expression is 'g' and it is in A to Z, but this is not capital 'G'. The 1st capital letter in the expression is 'A' (9th character in the expression).
Is there anyway to achieve this using PATINDEX? or Is there any other way to achieve this?
I am sorry, This might sound very silly, but i am new here. I have multiple excel files from which i need to draw data into one database table. Any thoughts?Any help will be appreciated
Thanks in advanace for taking the time to read this post. I am using MSSQL 2005 and have created a function that allows me to use regular expressions in my SQL queries. My question is I have a pattern buried in a field of misc data that I need to pull out just that pattern and discard the rest of the data. Here is the Regular Expression I am using select field1 from table1 where dbo.RegExMatch (field1,'[a-zA-Z]{4}[0-9]{6}[a-zA-Z]{2,4}')=1 This returns all values in the field that match the expression. What I want to do now is remove all data from the field on the left and right of the expression that does not match the expression. How would I accomplish this without reading through the 200k+ records and writing rules for every exception I run across? so I could have Gar b/a ge 'THE GOOD DATA' m/or1 ba4d da....ta. All I want to do is return 'THE GOOD DATA'
I would like solving the following issue using the Patindex function i cannot retrieve or extract the single numeric value as an example in the the values below i would like retrieve the Value 2, but in my result set the value 22 also appears or it is completely omitted.