Full Text Search With Language Other Than English (e.g. Chinese, Japanese)
Feb 10, 2007
I have set up full-text search over multiple columns to search Chinese text,
but the search results aren't what I expected.
I have set up the catalog with a Chinese word breaker, and the columns in the tables are all nvarchar.
When I do something like
select * from dbo.Table_1 where contains(*, '"<chinese character>"',language 1082)
the search results are really inconsistent, especially with single characters. I have also checked that these characters are not in the noise word file.
The results are better when the input is longer than a single character, but sometimes it still returns no results at all.
So I tried using LIKE instead of CONTAINS for the search with the same inputs, and 100% of the time it returned the correct result.
Does anyone have experience with this? I guess it is a language-specific issue. Do you know of anywhere that could offer me some help?
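One way to narrow this down is to check which word breaker the query actually uses and how it splits the text. The sketch below assumes SQL Server 2008 or later (where sys.dm_fts_parser is available) and assumes the data is Simplified Chinese; in the Windows LCID table 2052 is Simplified Chinese and 1028 is Traditional Chinese, so the 1082 in the query above is worth double-checking. The sample phrases are made up for illustration.

```sql
-- Languages available to full-text search on this instance, with their LCIDs.
SELECT lcid, name
FROM sys.fulltext_languages
ORDER BY name;

-- How the Simplified Chinese word breaker (LCID 2052) splits a sample phrase.
-- Third argument 0 = system stoplist, fourth argument 0 = accent-insensitive.
SELECT occurrence, special_term, display_term
FROM sys.dm_fts_parser(N'"中文全文检索"', 2052, 0, 0);

-- The original query with the LCID swapped (assumption: Simplified Chinese data).
-- The N prefix keeps the literal as Unicode instead of the database code page.
SELECT *
FROM dbo.Table_1
WHERE CONTAINS(*, N'"检索"', LANGUAGE 2052);
```

If the parser output shows a single character being merged into a longer token, that would be consistent with single-character searches missing rows that LIKE finds.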
We need to support multiple languages (Latin, Chinese, Japanese, Korean) in one report when exporting to PDF format in Reporting Services. We have used Arial Unicode as our font, but when we export the report, the Korean items cannot be displayed. Any idea on that? Thanks a lot.
I'm using FTS on two columns (VARCHAR) of my database. The data is exported from a MySQL database to SQL Server. When populating the database the default language was set to English (or maybe neutral?), although 90% of the records are in Dutch. Now, when I change the FTS language specification to Dutch, problems occur when querying the database. When I enter typical Dutch noise words in my query like "van" or "van der", the query does not return any results. When I set it back to Neutral the queries do return results, although a drawback is that I can't query for, for example, plural forms of words. Could this be because the correct language was not set when populating the database? If so, is there a way to get the full-text index working correctly with Dutch? Thanks in advance for your replies! Would be great to get this working in Dutch!
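A possible way to confirm that the Dutch stoplist is what filters out "van" and "der", sketched under the assumption of SQL Server 2008 or later (earlier versions use noise word files instead of stoplists). The table name dbo.MyTable, the column name and the search word "huis" are hypothetical.

```sql
-- Stopwords that the system stoplist defines for Dutch (LCID 1043).
SELECT stopword
FROM sys.fulltext_system_stopwords
WHERE language_id = 1043
ORDER BY stopword;

-- Hypothetical table: switch the stoplist off so that "van" and "der" get indexed.
-- By default this change starts a repopulation of the index.
ALTER FULLTEXT INDEX ON dbo.MyTable SET STOPLIST = OFF;

-- With the column language set to Dutch, inflectional forms can be searched.
SELECT *
FROM dbo.MyTable
WHERE CONTAINS(MyColumn, 'FORMSOF(INFLECTIONAL, "huis")', LANGUAGE 1043);
```

Whether switching the stoplist off is acceptable depends on how much the index grows; a custom stoplist without the name particles would be a middle ground.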
I'm using full-text search with the neutral language on the indexed columns. I have no stoplists, so everything counts. We changed to the neutral language because our customers don't need language-specific criteria. They just want raw text search. Everything was working great, but then they found a term that they can't search for and I can't figure out what's wrong. It comes down to this:
SELECT * FROM sys.dm_fts_parser('"1.2.3.4"', 1033, NULL, 0);
SELECT * FROM sys.dm_fts_parser('"1.2.3.4"', 0, NULL, 0);
With the neutral language, the parser returns "3.4", which differs from the English parsing. With language 1033, the parser breaks all the numbers into words and the result shows correctly. I really don't want to change the language from neutral because it's working for every other query.
Hi, my client requires a multilingual website including Japanese and Chinese. When I try to add text in Japanese and Chinese to the MSSQL database, it says the data is not consistent with the data type or length. Do you know how I can get round this? Any help or direction would be greatly appreciated. Mike
Hi all, I am quite experienced with SQL Server, but not that much with full-text indexing. After some successful attempts with English fields, I've decided to try it with Japanese characters. I don't know why, but it seems to behave strangely. As in this screenshot (http://img65.imageshack.us/img65/980/jap3xt.gif), the CONTAINS function does not seem to return only fields with an exact word match of the given "word" (query), but also strange results which do not even correspond to the query. Can anybody help me with that one? Thanks! :) ibiza
The following queries on a full-text index are returning different results.
select CustomerNameLocal from dbo.Customers where contains (CustomerNameLocal,'A.C.E') -- returns 1388 records
select CustomerNameLocal from dbo.Customers where contains (CustomerNameLocal,'ACE') -- returns 1388 records
select CustomerNameLocal from dbo.Customers where contains (CustomerNameLocal,'ace') -- returns 1388 records
select CustomerNameLocal from dbo.Customers where contains (CustomerNameLocal,'a.c.e') -- returns 22 records
Can someone let me know why the last query is returning only 22 records? Since searching on ACE and ace returns the same number of records, I guess there shouldn't be any problem with case sensitivity.
SQL 2000, latest SP. We currently need to store data from a UTF-8 application in multiple languages in a single database. Our findings thus far support the fact that single-byte and double-byte characters can be held in the same DB without issue. However, when holding two sets of DIFFERING double-byte characters (i.e. Chinese and Japanese) there are issues. Since Japanese has a superset of both Kanji and Katakana characters, it's our theory that the Japanese collations will hold Chinese as well (Mandarin).
1) Has anybody tried to store multiple languages in the same db? What collation was used?
2) Is it possible to change collation by table?
3) Which Japanese collation should be used for best multibyte, UTF-8 character sets? Currently we're testing with Japanese_CI_AS (encoding MS932).
Any and all responses appreciated.
I need a small confirmation regarding storing Chinese and Japanese characters in SQL Server. Can we store Chinese and Japanese characters in the same database with a Chinese collation, or do we need to store them separately with their respective collations? I tried storing both sets of characters in a db with a Chinese collation and it works, but I am not sure it is the right way to do so. Please confirm, as we are at the research stage of building a website in Chinese and Japanese. Thanks in advance.
My application supports multiple languages/locales in a single database. Some of our new customers want to support Chinese, Japanese, Korean, Italian, Spanish, and German in addition to English. Supporting the Latin-based languages is not a problem. But I am having trouble finding a collation sequence that allows me to store the other double-byte languages in the same database correctly.
I have found that changing the data types from text, char, varchar to ntext, nchar, nvarchar and adding an N in front of the various strings being inserted seems to work:
insert into CONTENTDATA (recordid, xml) values (newid(), N'<CHANNEL1><FILE1/><TEXT1><![CDATA[和红魔拉拉队的动感精神 ]]></TEXT1><TEXT3><![CDATA[和红魔拉拉队的动感精神]]></TEXT3></CHANNEL1>');
But this is not going to be a practical solution for us. Is there a collation sequence that would allow us to store multiple locales like we do in Oracle (AL32UTF8)?
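For reference, a minimal sketch of the nvarchar approach described above, showing Chinese, Japanese and Korean text side by side in one table. The table and column names are hypothetical. With nvarchar the stored characters do not depend on the collation; the collation only governs sorting and comparison, and it can be set per column. Newer releases (SQL Server 2019 onward) also offer UTF-8 collations for varchar, which may be closer to the Oracle AL32UTF8 setup.

```sql
-- Hypothetical table: each nvarchar column carries its own collation.
CREATE TABLE dbo.ProductNames (
    Id     int IDENTITY(1,1) PRIMARY KEY,
    NameZh nvarchar(100) COLLATE Chinese_PRC_CI_AS,
    NameJa nvarchar(100) COLLATE Japanese_CI_AS,
    NameKo nvarchar(100) COLLATE Korean_Wansung_CI_AS
);

-- The N prefix marks each literal as Unicode so no code page conversion happens.
INSERT INTO dbo.ProductNames (NameZh, NameJa, NameKo)
VALUES (N'全文检索', N'全文検索', N'전문 검색');

-- Sorting follows the collation of the column named in ORDER BY.
SELECT NameZh, NameJa, NameKo
FROM dbo.ProductNames
ORDER BY NameJa;
```

When the application uses parameterized queries with Unicode parameter types, the N prefix is not needed in application code, which may make the change less invasive than rewriting every literal.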
Hi - I'm short of SQL experience and hacking my way through creating a simple search feature for a personal project. I would be very grateful if anyone could help me out with writing a stored procedure. Problem: I have two tables with three columns indexed for full-text search. So far I have been able to successfully execute the following query returning matching row ids:
CREATE PROCEDURE dbo.Search_Articles @searchText varchar(150) AS
SELECT ArticleID FROM articles WHERE CONTAINS(Description, @searchText) OR CONTAINS(Title, @searchText)
UNION
SELECT ArticleID FROM article_pages WHERE CONTAINS(Text, @searchText);
RETURN
This returns the ArticleID for any articles or article_pages records where there is a text match. I ultimately need the stored procedure to return all columns from the articles table for matches and not just the ArticleID. Seems like maybe I should try using some kind of JOIN on the result of the UNION above and the articles table? But I have so far been unable to figure out how to do this, as I can't seem to declare a name for the result table of the UNION above. Perhaps there is another, more elegant solution? Thanks! Peter
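One way to do what the post asks, as a sketch: wrap the UNION in parentheses and give it an alias (a derived table), then join it back to articles. The table and column names are taken from the post; the alias names a and m are arbitrary.

```sql
CREATE PROCEDURE dbo.Search_Articles
    @searchText varchar(150)
AS
BEGIN
    SET NOCOUNT ON;

    -- "m" names the result of the UNION so it can be joined like a table.
    -- UNION (not UNION ALL) removes duplicate ids, so each article appears once.
    SELECT a.*
    FROM articles AS a
    INNER JOIN (
        SELECT ArticleID
        FROM articles
        WHERE CONTAINS(Description, @searchText)
           OR CONTAINS(Title, @searchText)
        UNION
        SELECT ArticleID
        FROM article_pages
        WHERE CONTAINS([Text], @searchText)
    ) AS m
        ON m.ArticleID = a.ArticleID;
END;
```

It could then be called with, for example, EXEC dbo.Search_Articles @searchText = '"sql server"'. A WHERE a.ArticleID IN (...) subquery would work equally well; the join form simply shows how the UNION result gets a name.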
Our clients want to be able to do a full-text search with a single letter. (Is the name Newton, Nathan, Nick? Is the ID N1, N2...?) A single-character full-text search on a table works 25 out of 26 times. The letter that doesn't work is 'n'. The WHERE clause CONTAINS(full_text_field, ' "n*" ') returns all rows, even rows that have no 'n' in them anywhere. Adding a second letter after the "n" works as expected.
Here is an example
create table TestFullTextSearch (
    Id int not null,
    AllText nvarchar(400)
)
create unique index test_tfts on TestFullTextSearch(Id);
create fulltext catalog ftcat_tfts;
I have a scenario where standard full-text search identifies keywords but Semantic Search does not recognize them as keywords. I'm hoping to understand why Semantic Search might not recognize them. The context is medical terminology, and the specific keywords I noticed missing right off the bat were medications.
For instance, if I put the following string into a FT indexed table
'J9355 - Trastuzumab (Herceptin)' AND 'J9355 - Trastuzumab emtansine'
The Semantic Search recognized 'Herceptin' and 'Emtansine' but not 'Trastuzumab'
Nor in
'J8999 - Everolimus (Afinitor)'
It did not recognize 'Afinitor' as a keyword.
In all cases the base full-text index did find those keywords, and they were identifiable using the DMV sys.dm_fts_index_keywords_by_document. It does show the index as having completed.
Why might certain words not be picked up while others are? Could it be a language/dictionary issue? I am using English and accent-insensitive settings.
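The comparison described above can be made side by side with two queries. This is only a sketch: the table dbo.Medications and the column Description are hypothetical names, and it does not by itself explain why a term is missing from the semantic results.

```sql
-- Terms that the full-text index stored for each document.
SELECT document_id, display_term, occurrence_count
FROM sys.dm_fts_index_keywords_by_document(DB_ID(), OBJECT_ID(N'dbo.Medications'))
WHERE display_term IN (N'trastuzumab', N'afinitor');

-- Key phrases that statistical semantic search extracted, with their scores.
SELECT document_key, keyphrase, score
FROM SEMANTICKEYPHRASETABLE(dbo.Medications, Description)
ORDER BY document_key, score DESC;
```

A term that appears in the first result but not in the second was indexed by full-text search yet not selected as a key phrase by the semantic extraction.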
Would you use SQL Server's full-text search feature as your site index? For some reason I can't make Index Server my site search catalog, and I wonder if full-text search is the solution. I think I will have to create a new table called something like "site text", and I will need to write every text twice: once to its own table (let's say the "articles" table) and once to the site text table. Otherwise there are problems finding the right URL for the text, searching different tables with different column names, and so on. So I thought of creating a site search table with the columns id, text, url and writing everything to this table. But somehow it looks like the wrong way, since every forum post, every article, album picture or joke will be inserted twice into SQL Server. What do you think?
I have installed the Adobe iFilter 11 64 bit and set the path to the bin folder. I still cannot find any text from the PDF files. I suspect I am missing something trivial, because I don't find much when I Bing for this, so it must not be a common problem. Here is the code.
--Adobe iFilter 11 64 bit is installed
--The Path variable is set to the bin folder for the Adobe iFilter.
--SQL Developer version 64 bit on both Windows 7 and Windows 8.
USE master;
GO
DROP DATABASE FileTableStudy;
GO
CREATE DATABASE FileTableStudy ON PRIMARY
I have SQL Server 2005 SP2 and have enabled it for full-text search. A substring search where I enter *word* doesn't return any rows. I have a table testtable where description contains the word Extinguisher.
If I run a query with *xting* it doesn't return any rows: select * from testtable where contains(description,'"*xting*"') ;
But it works if I do: select * from testtable where contains(description,'"Exting*"') ;
The full-text search documentation says it supports substring search. Is this an issue with SQL Server 2005? Please help.
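For context, the CONTAINS predicate supports prefix terms ("word*") but not a leading wildcard, so a match in the middle of a word has to go through LIKE. A short sketch using the table from the post:

```sql
-- Prefix term: matches words that begin with "Exting" and uses the full-text index.
SELECT *
FROM testtable
WHERE CONTAINS(description, '"Exting*"');

-- Substring anywhere in the text: LIKE with a leading % cannot use the
-- full-text index and typically scans the column.
SELECT *
FROM testtable
WHERE description LIKE '%xting%';
```

On large tables the LIKE form can be slow, so it is usually kept for cases where a true substring match is required.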
I am using SQL Server 2014 Express edition. I have a table with a varchar(max) column. I have created a full-text index that uses the "system" stoplist. The column has this structure: xxx.yyy.zzz.... where xxx, yyy, zzz... are numbers, like 123.345.123123.366456... I run queries like this:
select * from Mytable where contains(MyColumn, '123.345.')
I guessed that CONTAINS would return all the rows where the column contains 123.345, but it does not return all the expected rows, only one row. I have tried replacing "." with "-" but the result is the same. I have also tried '123.345.*'. In this case I got more results, but not all the expected rows. If I use this query:
select * from MyTable where MyCOlumn like '123.345.%';
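Two things that may be worth checking here, sketched with the names from the post. First, a prefix term is only treated as one when it is wrapped in double quotes inside the search string. Second, sys.dm_fts_parser shows how the word breaker splits the dotted value; LCID 1033 (English) is an assumption about the column's language.

```sql
-- Prefix term in double quotes; without them the asterisk is not a wildcard.
SELECT *
FROM MyTable
WHERE CONTAINS(MyColumn, '"123.345*"');

-- Show the tokens produced for a dotted value (0 = system stoplist).
SELECT occurrence, display_term
FROM sys.dm_fts_parser('"123.345.123123.366456"', 1033, 0, 0);
```

If the parser output groups the digits differently from what the query assumes, that would account for CONTAINS and LIKE disagreeing.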
Hi, I'm trying to do a full-text search on my site to add a weighting score to my results. I have the following database structure:
Documents: DocumentID (int, PK), Title (varchar), Content (text), CategoryID (int, FK)
Categories: CategoryID (int, PK), CategoryName (varchar)
I need to create a full-text index which searches the Title, Content and CategoryName fields. I figured that since I needed to search the CategoryName field, I would create an indexed view. I tried to execute the following query:
CREATE VIEW vw_Documents WITH SCHEMABINDING AS
SELECT dbo.Documents.DocumentID, dbo.Documents.Title, dbo.Documents.[Content], dbo.Documents.CategoryID, dbo.Categories.CategoryName
FROM dbo.Categories INNER JOIN dbo.Documents ON dbo.Categories.CategoryID = dbo.Documents.CategoryID
GO
CREATE UNIQUE CLUSTERED INDEX vw_DocumentsIndex ON vw_Documents(DocumentID)
But this gave me the error: Cannot create index on view 'dbname.dbo.vw_Documents'. It contains text, ntext, image or xml columns. I tried converting the Content to varchar(max) within my view but it still didn't like it. I'd appreciate it if someone could tell me how this can be done, as surely what I'm trying to do is not groundbreaking.
Hello everyone! I want to perform full-text search with SQL Server 2000. My documents (.doc, .xls, .txt, .pdf) are stored in a SQL Server field which is binary (the type of the column is image). I would like to know how to extract pieces of text from the documents. Example: I have an ASPX page with code-behind in C# that searches a table in SQL Server that is full-text indexed. I search for the word "peace", then SQL Server takes care of the search and returns the rows that match. But I'd also like to extract the 50 characters before and after the place where SQL Server found the word "peace", to show on the result page. Does anyone have any idea how to work around this? Best regards. Yannick
I have a table called country that will store all the country related details in it. Below is the screenshot of my country table.
I want to localize this table into Japanese. I searched and found that a new table needs to be created for storing the data in the localized language.
If that's the case, do we need to manually translate the text in the country table for each and every country?
Is there an automated process for that, so that the text does not have to be translated manually for each and every row?
I ask because I have a few more tables in which the text is not static. They get loaded on a daily basis, so I will not be able to translate them every day.
I have a table in SQL Server 2000 which has over 94,000 records. I have to delete from the table every record that is in a language other than English, to clean the table by removing all the data in other languages. My main table has 12 fields.
Hi, I have a database and want to store data in Spanish and English. To accomplish this: 1. Do I need to create separate tables for the two languages, like items_en and items_sp? 2. If I opt for the UTF16 charset, what single collation setting can I use? Thanks and regards, Jackal Hunt
We can see the Chinese text without any problem. However, when I open the above report in Report Builder, the Chinese words are broken, as shown below. This symptom appeared after upgrading from Windows 7 to Windows 10; when I used Windows 7, the report displayed correctly in Report Builder.
Hi everyone, I am new to Reporting Services and have a question:
My database table stores English words, but I want to display them in Chinese/Arabic on my report. Can Reporting Services do that? Any suggestions, tips or ideas?
Hi, I have a full-text index on my product table. When I search for Record, it returns all values for Record and Records. Now if I search with a spelling mistake, say Recod, it doesn't return anything. How can I get full-text search to return results even if there is a spelling mistake? Thanks. My query: SELECT * From Product WHERE FREETEXT (description, @SearchString)
Hi all. I want to search for, for example, "test string" in the database: the table has a column (name), and I want to find all rows where the column (name) is "test" or "string" or "test string". I don't want to use the full-text search of SQL Server 2005. Can anyone help me? Thanks in advance.
Hello! With SQL Server Management Studio Express I have created a catalog and an index. Here is the code:
create fulltext catalog myfirstcatalog
create unique index myfirstindex on northwind.dbo.customers(companyname)
create fulltext index on northwind.dbo.customers(companyname) key index myfirstindex ON myfirstcatalog WITH CHANGE_TRACKING AUTO
With SQL Server Management Studio Express and the following command, the full-text search works fine:
select companyname from northwind.dbo.customers where contains(companyname, ' "blauer" ')
I have a big problem: when I try to use this database (NORTHWIND.MDF) in my .aspx file with VWD 2008, I get an error: Cannot use full-text search in user instance.
Can you tell me what I can do to make use of full-text search inside my aspx pages? Thank you!