GETALLWORDS Inserts The Words From A String Into T
Aug 25, 2005
-- GETALLWORDS() User-Defined Function Inserts the words from a string into the table.
-- GETALLWORDS(@cString[, @cDelimiters])
-- Parameters
-- @cString nvarchar(4000) - Specifies the string whose words will be inserted into the table @GETALLWORDS.
-- @cDelimiters nvarchar(256) - Optional. Specifies one or more optional characters used to separate words in @cString.
-- The default delimiters are space, tab, carriage return, and line feed. Note that GETALLWORDS( ) uses each of the characters in @cDelimiters as individual delimiters, not the entire string as a single delimiter.
-- Return Value table
-- Remarks GETALLWORDS() by default assumes that words are delimited by spaces, tabs, carriage returns, or line feeds. If you specify other characters as delimiters, this function ignores the default delimiters and uses only the specified characters.
-- Example
-- declare @cString nvarchar(4000)
-- set @cString = 'The default delimiters are space, tab, carriage return, and line feed. If you specify another character as delimiter, this function ignores spaces and tabs and uses only the specified character.'
-- select * from dbo.GETALLWORDS(@cString, default)
-- select * from dbo.GETALLWORDS(@cString, ' ,.')
-- See Also GETWORDNUM() , GETWORDCOUNT() User-Defined Functions
CREATE function GETALLWORDS (@cString nvarchar(4000), @cDelimiters nvarchar(256))
returns @GETALLWORDS table (WORDNUM smallint, WORD nvarchar(4000), STARTOFWORD smallint, LENGTHOFWORD smallint)
begin
    -- if no break string is specified, the function uses spaces, tabs, carriage returns and line feeds to delimit words.
    set @cDelimiters = isnull(@cDelimiters, space(1)+char(9)+char(13)+char(10))
    declare @k smallint, @wordcount smallint, @nEndString smallint, @BegOfWord smallint, @flag bit
    -- @nEndString is one past the last character; datalength() counts bytes, so halve it for unicode
    select @k = 1, @wordcount = 0, @nEndString = 1 + datalength(@cString) /(case SQL_VARIANT_PROPERTY(@cString,'BaseType') when 'nvarchar' then 2 else 1 end)
    while charindex(substring(@cString, @k, 1), @cDelimiters) > 0 and @nEndString > @k -- skip opening break characters, if any
        set @k = @k + 1
    if @k < @nEndString
    begin
        -- count the word we are in now, then count transitions from 'not in word' to 'in word'
        select @wordcount = 1, @BegOfWord = @k, @flag = 1
        -- if the current character is a break char, but the next one is not, we have entered a new word
        while @k < @nEndString
        begin
            if @k +1 < @nEndString and charindex(substring(@cString, @k, 1), @cDelimiters) > 0
            begin
                if @flag = 1 and charindex(substring(@cString, @k-1, 1), @cDelimiters) = 0
                begin
                    select @flag = 0
                    insert into @GETALLWORDS (WORDNUM, WORD, STARTOFWORD, LENGTHOFWORD) values( @wordcount, substring(@cString, @BegOfWord, @k-@BegOfWord), @BegOfWord, @k-@BegOfWord ) -- previous word
                end
                if charindex(substring(@cString, @k+1, 1), @cDelimiters) = 0
                    select @wordcount = @wordcount + 1, @k = @k + 1, @BegOfWord = @k, @flag = 1 -- Skip over the first character in the word. We know it cannot be a break character.
            end
            set @k = @k + 1
        end
        if charindex(substring(@cString, @k-1, 1), @cDelimiters) > 0
            set @k = @k - 1
        if @flag = 1
            insert into @GETALLWORDS (WORDNUM, WORD, STARTOFWORD, LENGTHOFWORD) values( @wordcount, substring(@cString, @BegOfWord, @k-@BegOfWord), @BegOfWord, @k-@BegOfWord ) -- last word
    end
    return
end
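For readers who want to check the expected output outside SQL Server, here is a minimal Python sketch of the behavior documented above. It is not a port of the T-SQL loop. The name get_all_words and the tuple layout are assumptions made for illustration, and the default delimiters follow the documentation (space, tab, carriage return, line feed). Positions are 1-based, as in T-SQL.

```python
DEFAULT_DELIMITERS = " \t\r\n"  # assumed default: space, tab, carriage return, line feed


def get_all_words(text, delimiters=None):
    """Return a list of (wordnum, word, start_of_word, length_of_word) tuples.

    Each character in `delimiters` is an individual delimiter.
    start_of_word is 1-based to mirror T-SQL string positions.
    """
    if delimiters is None:
        delimiters = DEFAULT_DELIMITERS
    rows = []
    start = None  # 0-based index where the current word began, None between words
    for i, ch in enumerate(text):
        if ch in delimiters:
            if start is not None:
                # a delimiter closes the word we were in
                rows.append((len(rows) + 1, text[start:i], start + 1, i - start))
                start = None
        elif start is None:
            start = i  # transition from 'not in word' to 'in word'
    if start is not None:
        # the string ended while still inside a word
        rows.append((len(rows) + 1, text[start:], start + 1, len(text) - start))
    return rows


for row in get_all_words("If you specify, another character.", " ,."):
    print(row)
```

Runs of consecutive delimiters produce no empty words, which matches the "count transitions" approach described in the function's comments.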
I am pleased to offer, free of charge, the following Transact-SQL string functions:
AT(): Returns the beginning numeric position of the nth occurrence of a character expression within another character expression, counting from the leftmost character.
RAT(): Returns the numeric position of the last (rightmost) occurrence of a character string within another character string.
OCCURS(): Returns the number of times a character expression occurs within another character expression (including overlaps).
OCCURS2(): Returns the number of times a character expression occurs within another character expression (excluding overlaps).
PADL(): Returns a string from an expression, padded with spaces or characters to a specified length on the left side.
PADR(): Returns a string from an expression, padded with spaces or characters to a specified length on the right side.
PADC(): Returns a string from an expression, padded with spaces or characters to a specified length on both sides.
CHRTRAN(): Replaces each character in a character expression that matches a character in a second character expression with the corresponding character in a third character expression.
STRTRAN(): Searches a character expression for occurrences of a second character expression, and then replaces each occurrence with a third character expression. Unlike the built-in function REPLACE, STRTRAN has three additional parameters.
STRFILTER(): Removes all characters from a string except those specified.
GETWORDCOUNT(): Counts the words in a string.
GETWORDNUM(): Returns a specified word from a string.
GETALLWORDS(): Inserts the words from a string into the table.
PROPER(): Returns from a character expression a string capitalized as appropriate for proper names.
RCHARINDEX(): Similar to the Transact-SQL function Charindex, with a Right search.
ARABTOROMAN(): Returns the character Roman numeral equivalent of a specified numeric expression (from 1 to 3999).
ROMANTOARAB(): Returns the number equivalent of a specified character Roman numeral expression (from I to MMMCMXCIX).
AT, PADL, PADR, CHRTRAN, PROPER: Similar to the Oracle PL/SQL functions INSTR, LPAD, RPAD, TRANSLATE, INITCAP.
More than 5000 people have already downloaded my functions. I hope you will find them useful as well.
For more information about the Transact-SQL string UDFs, please visit:
http://www.universalthread.com/wconnect/wc.dll?LevelExtreme~2,54,33,27115
Please download the file:
http://www.universalthread.com/wconnect/wc.dll?LevelExtreme~2,2,27115
Hi, I'd be interested in people's thoughts about the following. A user on my site will be searching for a venue name, and that could officially include a sponsor which the user might not search for. Now I am using the AutoCompleteDropdown from the AJAX Control Toolkit, so the user will start typing in a few characters and the results will be returned. I can generate the results from SQL by doing a simple LIKE '%' + @searchTerm + '%'; however, this fills me with great fear of table scans. At the moment, we'd be querying against a table of 5K records, but our application is very new. I'm thinking one option is to split the words into another table - a one-to-many relationship to hold each word of the venue. The benefit of this would be that you could do a: LIKE @term + '%' but then I have the cost of the join. (And the added complexity, which is not a major issue.) Any thoughts/tips? Thanks!
I am required to send an XML file of our clients to head office in Belgium for comparison against a database of known undesirables. The data is in a legacy system with a custom database so I have created an SSIS package that extracts the tables I need into SQL Server and have developed a program that reads from a text source and creates the XML then Secure FTPs it to Hong Kong who will handle it from there.
My problem lies in actually extracting enough data to avoid too many false positives. The scanning will check name, identity (passport number, etc.), town/city and country. We don't hold an identity number and the town/city and country are buried in free format fields. A quick analysis of the 419,000 records shows that the spelling is terribly unreliable, too. In most cases country has not been entered because the clients are local and even when they are overseas, sometimes only the city has been entered. That is often misspelt, too e.g. Kuala Lumpar or Melboure.
The addresses are held in 3 equal length fields called Address_1, Address_2 and Address_3. There's no guarantee that I will find the town/city or country in any particular one of these fields. In some cases, the street number and name are in Address_3 because the first two hold a company name and a C/O line.
So I'm not going to fret over the ones where the address information is nonsense or missing but I would like to try and extract valid country names and town/city names, where present and this is where I get stuck. I'm from a COBOL programming background and although I'm loving getting used to the power of SQL, I'm still a bit stumped when I come across a problem like this probably because I keep thinking of the solution in procedural terms.
I have a feeling that the solution will be to create two separate reference tables, one of towns/cities and the other of countries. I would then somehow search the 3 fields looking for those keywords and if found, entering them in the appropriate part of the output text file to represent town/city and/or country. I did also think about destringing to find the separate words but that doesn't help where the name consists of two words such as NEW ZEALAND.
I would love to hear from anyone who has dealt with a similar problem and has a neat solution to this using SQL.
We have a VB.Net 2005 application that uses SQL CE 3.1 as its embedded database.
Frequently in the application, we must store strings with apostrophes, quotes, and all kinds of other stuff. It's totally unwieldy to try and manually escape every nonstandard character in every string... this is why we need to know how to handle this issue for all possible input.
What is the best method we can use to store any string, no matter what characters occur in it? The reason we must now improve our string handling is that we are now being required to store MD5 hashes of files for security and duplicate file avoidance, and these hashes usually break our import functions.
We normally enclose strings in single quotes ('). But, with the hashes as mentioned above, none of our current code works. Again: how can we be certain that the exact string we pass in will be stored in its current form, no matter what the characters?
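A common way to store any string unchanged is to stop building the SQL text by concatenation and pass the values as command parameters instead, so no manual escaping is needed. The following is only a sketch of that idea in Python, using the standard sqlite3 module as a stand-in for SQL CE. The table name files, its columns, and the helper store_file are hypothetical; in a VB.Net application the equivalent would be parameters on the ADO.NET command object.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for the embedded database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT, md5 TEXT)")  # hypothetical table


def store_file(conn, name, md5):
    """Insert a row using placeholders; the driver handles quoting of the values."""
    conn.execute("INSERT INTO files (name, md5) VALUES (?, ?)", (name, md5))


# A string full of characters that break naive single-quote concatenation.
tricky = "O'Brien's \"report\"; -- 100%.txt"
store_file(conn, tricky, "d41d8cd98f00b204e9800998ecf8427e")

stored = conn.execute("SELECT name FROM files").fetchone()[0]
print(stored == tricky)
```

Because the value never becomes part of the SQL text, apostrophes, quotes, and comment markers are stored exactly as passed in.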
Hello, I need to know how to use the LIKE operator to find results that contain 2 or more words.
================TABLE EXAMPLE======================
I have a table called ITEMS:
ITEMNAME Good Bike Good Mountain Bike Klein Bike Mountain ===================================================
If I use SELECT ITEMNAME FROM ITEMS WHERE ITEMNAME LIKE '%Good Bike%' I only get: Good Bike
What code do I need to write so that the query "Good Bike" returns both "Good Bike" and "Good Mountain Bike"?
Hi, I'm using full-text indexing on a table and I'm trying to implement a search where users can search for words and use wildcards themselves. However, I'm working on a method so that they can enter a wildcard in the middle of a word to get records where they are unsure of the spelling etc. For instance, a search of 'Ste*en' should return results like 'Steven' and 'Stephen' etc. So if they are searching for the word 'establishment' they can search for 'estab*ment' and it should return all the records using this query: SELECT * FROM myTable WHERE CONTAINS(myField,'"estab*ment"') If I do a wildcard at the end, e.g.: SELECT * FROM myTable WHERE CONTAINS(myField,'"estab*"') I get the results I am looking for. But the middle wildcard does not seem to work as expected even though it is the syntax used on MSDN and other SQL info sites. Is there something I am not doing properly?
Hi, I am working on sql server200. I'm using "LIKE" to search the records. There are also FREETEXTTABLE and CONTAINSTABLE. I'd just like to know the difference between them. Could anyone provide me a good link regarding this? Thanks
Hi there, I've created a couple of search pages which query SQL Server. Whenever characters like "?@~:&£^" etc., or words like "for", "the" and so forth, are entered, the page shows the nasty error page: Execution of a full-text operation failed. A clause of the query contained only ignored words. Exception Details: System.Data.SqlClient.SqlException: Execution of a full-text operation failed. A clause of the query contained only ignored words. In short: is there a way I can stop it doing this? It looks rather horrible. I've looked at form validation but can't find anything that seems to fit. I would imagine there is a simple solution, but I haven't been able to find it so far. Many thanks, Stuart
How can you search for the occurrence of a whole word in a string, without returning any results that have the word as a substring?
For instance, if I search for the term 'scene' in a column, it should only return rows that have the word 'scene' and not those with the word 'scenery'. I've tried the following SQL, but it relies on having text on either side of the word as well. If the word 'scene' is at the beginning or end of the cell then it is not returned.
SELECT Name, Description FROM tblWine WHERE Name LIKE '%[^a-zA-Z]scene[^a-zA-Z]%' OR Description LIKE '%[^a-zA-Z]scene[^a-zA-Z]%'
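One commonly suggested workaround is to pad the searched value with a non-letter (for example a space) on both sides before applying the same pattern, so that a word at the very start or end still has a "neighbor" to match against. In T-SQL that would look like ' ' + Name + ' ' LIKE '%[^a-zA-Z]scene[^a-zA-Z]%'. The Python sketch below only illustrates why the padding works, using a regular expression equivalent to the LIKE pattern. The function name contains_whole_word is made up for this example.

```python
import re


def contains_whole_word(text, word):
    """Mimic ' ' + text + ' ' LIKE '%[^a-zA-Z]word[^a-zA-Z]%'."""
    padded = " " + text + " "  # the padding supplies a non-letter at both ends
    pattern = "[^a-zA-Z]" + re.escape(word) + "[^a-zA-Z]"
    return re.search(pattern, padded, re.IGNORECASE) is not None


for sample in ["scene", "A quiet scene.", "Beautiful scenery", "obscene"]:
    print(sample, "->", contains_whole_word(sample, "scene"))
```

Note that wrapping the column in an expression usually prevents an index from being used, so on large tables this is a correctness fix rather than a performance one.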
Can I define field names with more than one word in Access and SQL Server, like field: "Bus station" instead of "BusStation" or "Bus_Station"? I have had problems because of this in VB6. Can I have problems in VB 2005 or C# 2005 and SQL Server?
Hi, I'm trying to read a varchar(50) field written in Japanese using this statement:
is = rset.getBinaryStream(num);
At that statement the JDBC driver shows the following error:
java.sql.SQLException: [Microsoft][SQLServer 2000 Driver for JDBC]Unsupported data conversion.
Does anybody know why?
Thank you,
Emilio Perez
Is it possible with SQL Server 2005 to include ignored words in a full-text search? For example, searching for "in force as of"? This gives the same results as searching for "force" only. I've tried to empty the ignored words list (noiseENG.txt), but this does not seem to have any effect.
Also we want to be able to search for strings such as "205/1305-2". Searching with punctuation characters in a query seems to be a problem.
What are the possibilities in SQL Server 2005 with regard to these problems?
Can someone point me in the right direction to write a select query to return the first 10 whole words from a table?
For example, table "testtable" contains a field named "description" with value "here is some test data in order to select the first full ten words from."
The SELECT statement would return the value "here is some test data in order to select the".
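To make the expected result concrete, here is a small Python sketch of the "first N whole words" logic, treating any run of whitespace as a separator. It is not a SQL answer. The function name first_words is an assumption for illustration, and the same idea in T-SQL would amount to locating the position of the tenth separator and taking everything to its left.

```python
def first_words(text, count=10):
    """Return the first `count` whitespace-separated words, joined by single spaces."""
    return " ".join(text.split()[:count])


description = "here is some test data in order to select the first full ten words from."
print(first_words(description))
```

If the text has fewer than ten words, the whole text is returned unchanged apart from collapsed whitespace.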
I need to build a search function for my site. So there is a single text box for the users to type in their search string. I have been asked to break the user's search string into separate words. So if the user enters: "This is my search query", I need to break it into: "This" "is" "my" "search" "query" and then search for all these words.
- Can I break a string into separate words using SQL?
- How do I remove funny / dangerous characters from the search string?
- If I have to break the search string using a programming language, I would have to run the search query for each word. If I run the search query for each word, how do I combine the search results for the user? For instance, if I search for "my" and find some results, then search for "search" and find some results, how do I display a SINGLE search result to the user?
Thanks
I am trying to build a query which only matches whole words and so far I've got this.
Code:
SELECT * FROM tblSearchWords WHERE CorrectSpelling LIKE '%[^a-zA-Z0-9]blah[^a-zA-Z0-9]%'
This will return rows which contain the string 'blah' without any alphabetic or numeric characters beside it. However, it doesn't return the rows where 'blah' is at either the start or the end of the string, as it expects a character other than a-zA-Z0-9 on each side.
Is there any way to accept string when there is nothing on either side as well?
Hello all. I've been pulling my hair out for the last few weeks trying to come up with a statement that will do what I want. I'm hoping someone can lend some help.
Basically I have a table of articles with titles. I want to go thru the titles and find out what words show up the most. For example, if I had these two article titles in two records:
Microsoft develops new software for NASA
NASA blames software problem on Microsoft
I would get the following results - the word and the number of times it appears:
Microsoft 2
NASA 2
Software 2
The statement should ignore those words that only appear once. It would be nice to skip static words like the, and, a, etc. (or words of 3 characters or less).
I have fulltext query enabled on the table which works great for searching, but not for what I want it to do.
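The counting logic being asked for can be sketched as follows, in Python rather than SQL, just to pin down the rules: split each title into words, skip short words and stop words, count case-insensitively, and keep only words seen at least twice. The stop-word list and the names frequent_words and STOP_WORDS are assumptions made for this example. In SQL the same result would typically come from splitting titles into a one-row-per-word table and using GROUP BY with HAVING COUNT(*) > 1.

```python
import re
from collections import Counter

STOP_WORDS = {"the", "and", "a", "for", "on"}  # hypothetical stop-word list


def frequent_words(titles, min_count=2, min_length=4):
    """Return (word, count) pairs for words appearing at least `min_count` times."""
    counts = Counter()
    display = {}  # first spelling seen for each lower-cased word
    for title in titles:
        for word in re.findall(r"[A-Za-z0-9']+", title):
            key = word.lower()
            if len(key) < min_length or key in STOP_WORDS:
                continue  # skip words of 3 characters or less and stop words
            counts[key] += 1
            display.setdefault(key, word)
    return [(display[key], n) for key, n in counts.most_common() if n >= min_count]


titles = [
    "Microsoft develops new software for NASA",
    "NASA blames software problem on Microsoft",
]
for word, n in frequent_words(titles):
    print(word, n)
```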
I have a client who wants a validation system for users who register for a system. It would email them a registration code, which they want to be two or three random words strung together.
So, I'm looking for a big table of words - random, glossary terms, etc. Does anyone have anything like this? (A flat file that I can import would be fine.)
Hi, could you help me by suggesting how to convert a number given in Rs or in $ to words, as we normally spell it? There is no limit; it can be in the millions or billions. Thanks in advance.
For some reason, I had to write a function to count the number of words in a particular column in a table (please find the attachment). I would like to know whether there is any other mechanism with which we can count the number of words in a particular column.
For example, if the column data is 'This Is A Test', the function will return 4. Please suggest any other efficient strategies to accomplish this.
This works when @searchString is used in containstable (provided searchString has value)...
set @searchStringNoneOfWords = 'not(Airplane)' SET @searchString = @searchString + ' AND ' + @searchStringNoneOfWords
This does NOT work when @searchString is used in containstable...
set @searchStringNoneOfWords = 'not(Airplane)' SET @searchString = @searchStringNoneOfWords
I understand it is because the syntax is AND NOT, but what if I have a list of words that I do not want included? How do I start out with a NOT using containstable? It is kind of like Google's advanced search except that if you enter a word in the "without words" section with the other fields blank it would return everything under the sun except for things found with those words.
Hi, I want to create a function in SQL Server 2000 which can display a given currency amount in words, but in Indian format, e.g. '12345678.50' should display as One crore twenty three lacs forty five thousand six hundred and seventy eight rs fifty paise only. Can anybody help please?
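The grouping rule behind the Indian format is: the last three digits form the hundreds group, then two digits each for thousands and lakhs, and everything above that counts crores. So 12345678 splits into 1 crore, 23 lakh, 45 thousand and 678. Below is a rough Python sketch of that decomposition, not a T-SQL function. The names rupees_in_words, indian_words, two_digits and three_digits are made up for this example, the spelling "lakh" is used instead of "lacs", and negative amounts are not handled.

```python
ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
        "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
        "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def two_digits(n):
    """Words for 0..99 (empty string for 0)."""
    if n < 20:
        return ONES[n]
    return (TENS[n // 10] + " " + ONES[n % 10]).strip()


def three_digits(n):
    """Words for 0..999 (empty string for 0)."""
    hundreds, rest = divmod(n, 100)
    parts = []
    if hundreds:
        parts.append(ONES[hundreds] + " hundred")
    if rest:
        if hundreds:
            parts.append("and")
        parts.append(two_digits(rest))
    return " ".join(parts)


def indian_words(n):
    """Words for a non-negative integer using crore / lakh / thousand grouping."""
    if n == 0:
        return "zero"
    parts = []
    crore, n = divmod(n, 10_000_000)
    lakh, n = divmod(n, 100_000)
    thousand, n = divmod(n, 1_000)
    if crore:
        parts.append(indian_words(crore) + " crore")  # crore count itself may be large
    if lakh:
        parts.append(two_digits(lakh) + " lakh")
    if thousand:
        parts.append(two_digits(thousand) + " thousand")
    if n:
        parts.append(three_digits(n))
    return " ".join(parts)


def rupees_in_words(amount):
    """Convert a decimal string such as '12345678.50' to words."""
    rupees_text, _, paise_text = amount.partition(".")
    paise = int((paise_text + "00")[:2])  # '.5' means 50 paise
    words = indian_words(int(rupees_text)) + " rupees"
    if paise:
        words += " " + two_digits(paise) + " paise"
    return (words + " only").capitalize()


print(rupees_in_words("12345678.50"))
```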
Hi all, I have a table of text and associated data. I want to break apart the text into individual words, yet retain the data in other columns. For example:
Sentence:          Chapter:
--------------------------
I like cats.       1
Joe likes dogs.    2
Should become:
Word:              Chapter:
--------------------------
I                  1
like               1
cats               1
Joe                2
likes              2
dogs.              2
Are there built-in SQL parsing functions? If not, what text handling features would be most useful for building them?
Thanks!
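The transformation described here is a "one row per word" expansion that carries the other columns along. A tiny Python sketch of the intended result, with the made-up name explode_words, is shown below. In SQL Server the same shape could be produced by applying a table-valued word splitter to each row. Note that splitting on whitespace alone keeps trailing punctuation attached to a word.

```python
def explode_words(rows):
    """Turn (sentence, chapter) rows into (word, chapter) rows, one per word."""
    return [(word, chapter) for sentence, chapter in rows for word in sentence.split()]


rows = [("I like cats.", 1), ("Joe likes dogs.", 2)]
for word, chapter in explode_words(rows):
    print(word, chapter)
```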
I guess there are no built-in functions to do this, but I have a function that replaces anything that is not A-Z with a space and returns @data. What I additionally need the function to do is scrunch up @data (remove all blanks between each word so that 'I ran very fast' would be 'Iranveryfast').
What I need help in doing is the "Scrunch" part. Is there a way I could move the @Data to something like @DataHold and inspect each character, if it is not a blank, move that character back to @Data? This was pretty easy for me to do in C# with a while loop, but I do not know how to get it done in SQL Server 2005.
I have a project in which I have about 20,000 records in sql database table.
What I would like to do is generate a query that lists all the unique words in a particular field across the entire table so as to generate a glossary of words.
if we had a table that looked like
ID Description
001 This is the first record
002 This is the second record
003 This is not the first record
and the query was run on the description field, then the result I would like to see is
This is the first second not
I hope this makes sense. Any help is appreciated.
Using VWD I have created a search feature using the LIKE clause. The filter expression on my SQLDataSource allows the user to search the Description field of a database and yield a result that contains the exact word or phrase entered into a textbox. Assuming that the user enters more than one word, my understanding is that the search result is limited to database rows that contain the EXACT phrase (such as found in an advanced Google search using the “with the exact phrase” option). The current filter expression is: Description LIKE '%{0}%' For example, if “John Smith” is typed into the search textbox, the results will include a row with: 1. “John Smith is my neighbor” but NOT a row with 2. “John is my neighbor. His last name is Smith”. How does one modify the filter expression so that the search result is like the Google “with all the words” search option, where the search results are limited to records in which all the words typed into the textbox are present but not necessarily in the EXACT continuous order? In the example above, BOTH Descriptions would be returned in the search results when “John Smith” is typed into the search textbox. Thanks for any help you can provide in helping me refine my search options.
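One way to get "with all the words" behavior is to split the input on whitespace and build one LIKE condition per word, joined with AND, passing each word as a parameter. The sketch below shows the idea in Python with an in-memory SQLite database standing in for SQL Server. The table name Items and the function search_all_words are hypothetical, and how to wire this into a SQLDataSource filter expression is not covered.

```python
import sqlite3


def search_all_words(conn, phrase):
    """Return descriptions containing every word of `phrase`, in any order."""
    words = phrase.split()
    if not words:
        return []
    # One "Description LIKE ?" per word, combined with AND.
    where = " AND ".join("Description LIKE ?" for _ in words)
    params = ["%" + word + "%" for word in words]
    sql = "SELECT Description FROM Items WHERE " + where
    return [row[0] for row in conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Items (Description TEXT)")  # hypothetical table
conn.executemany("INSERT INTO Items VALUES (?)", [
    ("John Smith is my neighbor",),
    ("John is my neighbor. His last name is Smith",),
    ("Jane Smith lives next door",),
])

matches = search_all_words(conn, "John Smith")
for description in matches:
    print(description)
```

Only the number of placeholders is built dynamically; the user's words are never concatenated into the SQL text. Any % or _ typed by the user would still act as a LIKE wildcard unless escaped.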
Hi, I want to search for multiple words that are present in the database. For example, if I enter "property in south delhi" and only the word "south" is in the database, no result comes back. I use the LIKE operator like this: select * from MASTERSEARCH where companyname like '" + txt_Company.Text + "%' or dealsin like '%" + txt_keywords + "%'"; If I enter only the word "south", then a result comes back. But I want the search criteria to match any word that is present in the database. I am using this with my web site. http://www.b2bindialinks.com