SQL Server 2008 :: Identifying ASCII Characters In NVARCHAR Columns
May 25, 2010
I have an issue where I am storing various international characters in nvarchar columns, but need to branch the data at one point of processing so that ASCII characters are run through an additional cleansing process and all non-ASCII characters are set aside.
Is there a way to identify which nvarchar values are within the ASCII range and can be converted to varchar without corruption? Also, the strings may contain a mix of English and international character sets, so the entire string must be checked and not just the first character.
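One way to approach the check described above is a negated character range evaluated under a binary collation. The sketch below is an illustration only: the table variable, its columns, and the choice of the printable range (space through tilde) are assumptions, not part of the original question.

```sql
-- Hypothetical sample data; the names are illustrative only.
DECLARE @Samples TABLE (Id int, Val nvarchar(100));
INSERT INTO @Samples (Id, Val)
VALUES (1, N'plain ascii 123'),
       (2, N'mixed Джон text'),
       (3, N'naïve');

-- [^ -~] matches any single character outside space (32) through tilde (126).
-- The binary collation makes the range compare by code point
-- instead of by the dictionary order of the column's collation.
SELECT Id,
       Val,
       CASE WHEN Val LIKE N'%[^ -~]%' COLLATE Latin1_General_BIN2
            THEN 0 ELSE 1
       END AS IsPrintableAscii
FROM @Samples;
```

Because the pattern is tested against the whole string, a single non-ASCII character anywhere in the value is enough to flag the row. Tabs and line breaks fall outside the printable range used here, so the range would need widening if they should count as ASCII.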
Part 1: When a ~ (tilde) is followed by a value, that value goes into a new row, duplicating the other columns (such as Facility in the attached screenshot), with a new column holding the sequence.
Part 2: A ^ (caret) always starts a new column, whether or not a value is present.
CREATE TABLE [dbo].[Equipment](
    [EQU] [VARCHAR](50) NOT NULL,
    [Notes] [TEXT] NULL,
    [Facility] [VARCHAR](50) NULL)

INSERT INTO [dbo].[Equipment] ([EQU], [Notes], [Facility])
SELECT '1001', 'BET I^BOBBETT,DAN^1.0^REGULAR^22.09^22.090~BET II^^^REGULAR^23.56^0~', 'USA'
UNION
SELECT '998', 'BET I^JONES, ALANA^0.50^REGULAR^22.09^11.0450~BET II^^^REGULAR^23.56^0~', 'Canada'
UNION
SELECT '55', 'BET I^SLADE,ADAM F.^1.5^REGULAR^27.65^41.475~', 'USA'

SELECT * FROM dbo.Equipment
I created the table in Excel and attached a screenshot to show what is required. I use Text to Columns in Excel to achieve this; I am not sure whether there is anything similar in SQL.
Our database defines the long_value column as nvarchar(max). I want to find out which rows actually contain non-ASCII characters in that column, but this clause also returns rows with only ASCII characters: where long_value like (N'%[' + nchar(128) + N'-' + nchar(65535) + N']%')
I have a SQL Server 7.0 table with a "Decsciption" column of length 4000. The values in this column contain the "end of line" ASCII character, whose ASCII value is 10. I tried to remove this character with the REPLACE function, but could not.
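A minimal sketch of stripping line breaks with REPLACE follows. It assumes the column is varchar(4000) and that the line ending may actually be a carriage return plus line feed pair, which is only a guess about this data; the variable name is hypothetical.

```sql
-- Hypothetical value standing in for one row of the column.
DECLARE @Description varchar(4000);
SET @Description = 'line one' + CHAR(13) + CHAR(10) + 'line two';

-- Remove both CHAR(13) (carriage return) and CHAR(10) (line feed),
-- since a Windows line ending usually contains both.
SELECT REPLACE(REPLACE(@Description, CHAR(13), ''), CHAR(10), '') AS Cleaned;
```

The same nested expression could be used in the SET clause of an UPDATE. REPLACE does not accept text or ntext columns, so this would not apply if the column were one of those types.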
Hi, I have a problem with BULK INSERT. I created the following table:
create table Test (id char(4), name nvarchar(16), last char(1))
I am trying to bulk insert data from ASCII (not unicode) file with only two rows: 0011First name 0018Second name
Since it is a fixed length file, I am using the following format file:
8.0
3
1 SQLCHAR 0 4 "" 1 ID HEBREW_CI_AS
2 SQLCHAR 0 16 "" 2 NAME HEBREW_CI_AS
3 SQLCHAR 0 0 " " 3 Last HEBREW_CI_AS
With bcp utility everything works just fine!
bcp Demo.dbo.test in c:\test -T -f c:\test.fmt
But when I use BULK INSERT in the following form:
BULK INSERT Test FROM 'c:\Test' WITH ( FORMATFILE='c:\Test.fmt', CODEPAGE='OEM' );
I get this error: Server: Msg 4863, Level 16, State 1, Line 1 Bulk insert data conversion error (truncation) for row 1, column 2 (name).
Now, one interesting thing: if I change the name field from nvarchar to varchar, it is working with BULK INSERT as well. Can anybody explain what is going on here?
I have a problem with a lot of my SPs. All compile correctly but produce erroneous data because IF statements are being ignored due to
characters (see below).
Example SP... -- Opened within last 12 Months
IF @NEWACCIND = 1 BEGIN
EXECUTE usp_DFDX03_D0150_A4 @COSTALL OUTPUT END
-- Accounts in Arrears in Current Quarter
Should look like ...
-- Opened within last 12 Months IF @NEWACCIND = 1 BEGIN
EXECUTE usp_DFDX03_D0150_A4 @COSTALL OUTPUT END -- Accounts in Arrears in Current Quarter
I need to find all SPs with double
instance and manually replace. There are hundreds of SPs in total. I have tried
SELECT COLID, ID FROM SYSCOMMENTS WHERE CHARINDEX (CONVERT (VARCHAR(3), CHAR(13)+CHAR(13)), TEXT) > 0
but this also returns SPs containing 2 consecutive blank lines (of which there are a lot, due to the formatting of the T-SQL). Really I need to distinguish between and new line which both appear to be CHAR(13)
declare @localtab INT
SET @localtab = (SELECT Convert(INT, ('select count(*) from ' + @specificDB + '.' + 'INFORMATION_SCHEMA.Tables WHERE TABLE_TYPE = ''BASE TABLE'' AND Table_name = ' + @tablename)))
Print @localtab
Print @localtab
----
Msg 245, Level 16, State 1, Line 8 Conversion failed when converting the nvarchar value 'select count(*) from AdventureWorksDW2012.INFORMATION_SCHEMA.Tables WHERE TABLE_TYPE = 'BASE TABLE' AND Table_name = DimAccount' to data type int.
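The error message shows that the concatenated query text itself, not its result, is being converted to int. One common way to get the count into a variable is to execute the string with sp_executesql and an OUTPUT parameter. The sketch below makes several assumptions: it targets the current database via DB_NAME() so it runs anywhere, it uses a placeholder table name, and the parameter names @tbl and @cnt are invented for illustration.

```sql
-- Assumed inputs; replace with the real database and table names.
DECLARE @specificDB sysname;
DECLARE @tablename sysname;
DECLARE @localtab int;
DECLARE @sql nvarchar(max);

SET @specificDB = DB_NAME();
SET @tablename = N'DimAccount';

-- Build the statement text; the table name is passed as a parameter,
-- and the database name is wrapped with QUOTENAME.
SET @sql = N'SELECT @cnt = COUNT(*) FROM ' + QUOTENAME(@specificDB)
         + N'.INFORMATION_SCHEMA.TABLES'
         + N' WHERE TABLE_TYPE = ''BASE TABLE'' AND TABLE_NAME = @tbl';

-- Execute the text and capture the count through an OUTPUT parameter.
EXEC sp_executesql @sql,
     N'@tbl sysname, @cnt int OUTPUT',
     @tbl = @tablename,
     @cnt = @localtab OUTPUT;

PRINT @localtab;
```

Passing the table name as a parameter also supplies the quoting that the original concatenation left out around the table name.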
I know that if I have an nvarchar column I can use an equality like = N'supersqlstring' so it doesn't implicit cast as a varchar, like if I were to do ='supersqlstring'. And then I'll be a big SQL hero and all my stored procedures will run before a millisecond can whisper.
But if I'm comparing an nvarchar column to a varchar column, is it better to cast the varchar 'up' to an nvarchar or cast the nvarchar 'down' to a varchar?
For instance:
cast(a.varchar as nvarchar(100)) = an.nvarchar
or
cast(an.nvarchar as varchar(100)) = a.varchar
Leaving aside mismatches (at least I don't think SQL considers the varchar n to be equal to the nvarchar ń), what's the best way to handle this?
Pretend for a moment that each column contains a mixed letter and number ID with no accented or wiggly-squiggly Unicode characters; it's just designs clashing.
Is there a performance hitch doing it one way or another? Should I use COLLATE? Should one of the columns be altered?
create table Test(ID int, Name nvarchar(500));
insert into Test values (1, N'abc testing');
insert into Test values (2, N'abc include persian آنا اسمیت');
insert into Test values (3, N'mnp testing');
insert into Test values (4, N'abc include Russian Джон Тед');
I want to get the records that contain only English characters, i.e. ID = 1 and 3 only.
I tried select * from Test where Name like '%[a-zA-Z0-9]%' but this will return all 4 records. How can I accomplish this?
I get the following error : "Msg 8115, Level 16, State 8, Line 1.. Arithmetic overflow error converting nvarchar to data type numeric. The statement has been terminated."
The column is set to nvarchar, and I am just trying to make the prices go up 10%.
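A plausible cause is that multiplying the nvarchar value directly by a small numeric literal forces the string into a numeric type too narrow for the stored prices. A hedged sketch of one workaround is to cast the string to an explicit decimal first. The table variable, column names, and decimal(18, 2) are assumptions chosen for illustration.

```sql
-- Hypothetical table holding prices as text.
DECLARE @Products TABLE (Id int, Price nvarchar(20));
INSERT INTO @Products (Id, Price) VALUES (1, N'150.00');
INSERT INTO @Products (Id, Price) VALUES (2, N'9.99');

-- Convert to a wide decimal before the arithmetic, round back to two
-- decimal places, then store the result as text again.
UPDATE @Products
SET Price = CAST(
                CAST(CAST(Price AS decimal(18, 2)) * 1.10 AS decimal(18, 2))
            AS nvarchar(20));

SELECT Id, Price FROM @Products;
```

If any row holds text that is not a valid number, the inner CAST will still fail, so such rows would need to be filtered or cleaned first. Storing prices in a decimal column would avoid the conversions altogether.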
I am having an issue fetching Chinese characters from an XML data type. It returns question marks (?).
Below is the sample script.
DECLARE @XMLVAR XML SET @XMLVAR = '<?xml version="1.0"?> <POLICY_SEARCH xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <NAME>QA*保险1</NAME><NUMBER /></POLICY_SEARCH>'
SELECT I.xmlParam.query('./NAME').value('.','NVARCHAR(25)') NAME, I.xmlParam.query('./NUMBER').value('.','NVARCHAR(25)') NUMBER FROM @XMLVAR.nodes('POLICY_SEARCH') AS I(xmlParam)
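A likely explanation is that the XML literal is a non-Unicode string, so the Chinese characters are lost before the value ever reaches the XML variable. The sketch below assumes that is the cause and prefixes the literal with N. The namespace declarations and the XML declaration are omitted to keep the example short.

```sql
DECLARE @XMLVAR xml;

-- The N prefix makes the literal nvarchar, so the Chinese characters
-- survive the assignment to the XML variable.
SET @XMLVAR = N'<POLICY_SEARCH><NAME>QA*保险1</NAME><NUMBER /></POLICY_SEARCH>';

SELECT I.xmlParam.value('(NAME)[1]', 'nvarchar(25)')   AS [NAME],
       I.xmlParam.value('(NUMBER)[1]', 'nvarchar(25)') AS [NUMBER]
FROM @XMLVAR.nodes('/POLICY_SEARCH') AS I(xmlParam);
```

If the XML arrives through a parameter rather than a literal, the same reasoning suggests declaring that parameter as xml or nvarchar rather than varchar.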
MS SQL 2000. Does anyone know how to find all rows where an nvarchar column contains a specific Unicode character? Is it possible without creating a user-defined function?
Here's the issue. I have a table Expression (ExpID, ExpText) with values like 'x < 100' and 'y ≤ 200', where the second example contains Unicode character 8804 [that is, nchar(8804)]. Because it's Unicode, I don't seem to be able to search for it with LIKE or PATINDEX. These fail:
SELECT * FROM Expression WHERE ExpText LIKE '%≤%' -- no records
SELECT * FROM Expression WHERE PATINDEX('%≤%', ExpText) > 0 -- no records
However, SELECT PATINDEX('%≤%', 'y ≤ 200') will return 3. Any suggestions? Thanks in advance.
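One thing worth trying is building the search pattern as a Unicode string, either with an N-prefixed literal or with NCHAR(8804), so the character is not converted to the database code page first. The sketch below uses a table variable as a stand-in for the Expression table; that stand-in is an assumption for illustration.

```sql
-- Hypothetical stand-in for the Expression table.
DECLARE @Expression TABLE (ExpID int, ExpText nvarchar(100));
INSERT INTO @Expression (ExpID, ExpText) VALUES (1, N'x < 100');
INSERT INTO @Expression (ExpID, ExpText) VALUES (2, N'y ' + NCHAR(8804) + N' 200');

-- The pattern is assembled entirely from nvarchar pieces,
-- so the character with code 8804 is preserved in the comparison.
SELECT ExpID, ExpText
FROM @Expression
WHERE ExpText LIKE N'%' + NCHAR(8804) + N'%';
```

No user-defined function is involved; the only change from the failing queries is the Unicode pattern.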
I need to find all uses of special characters in a database. I used the following code to do this:
USE dbName GO IF OBJECT_ID('tempdb.dbo.#Results') IS NOT NULL DROP TABLE #Results GO
[code]...
This will check all tables in the database, but if you want to check specific tables you can uncomment the line in the where clause and specify tables to be checked. The query will return any text fields that have any characters other than letters, numbers or spaces.
This code works fine for me because all the tables in my database have single column primary keys. However I know how much Jeff Moden hates cursors or RBAR queries, so my question is could this have been done by any method other than using a cursor?
I would like to know the reason for the above statement. When I use nvarchar(4000) to insert the Japanese text, it gives the same error. Why can't we use the maximum size? If a row can be larger than 8060 bytes, which setting allows it?
I frequently have the problem where I have a list of items to delete in a temp table, such as

ProjectId     Description
------------- ----------------
1             test1
2             test4
3             test3
4             test2

And I want to delete all those items from another table. What is the best way to do that? If I use two IN clauses it will do it where it matches anything in both, not the exact combination of the two. I can't do joins in a delete clause like an update, so how is this typically handled?
The only way I can see so far to get around it is to concatenate the columns like CAST(ProjectId as varchar) + '-' + Description and do an IN clause on that, which is pretty nasty. Any better way?
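T-SQL does accept a join in a DELETE through its extended FROM clause, which matches on the exact combination of both columns. The sketch below uses table variables with invented names (@Target, @ToDelete) in place of the real tables.

```sql
-- Hypothetical target table and list of rows to delete.
DECLARE @Target   TABLE (ProjectId int, Description varchar(50));
DECLARE @ToDelete TABLE (ProjectId int, Description varchar(50));

INSERT INTO @Target (ProjectId, Description) VALUES (1, 'test1');
INSERT INTO @Target (ProjectId, Description) VALUES (1, 'test4');
INSERT INTO @Target (ProjectId, Description) VALUES (2, 'test4');
INSERT INTO @Target (ProjectId, Description) VALUES (3, 'test3');

INSERT INTO @ToDelete (ProjectId, Description) VALUES (1, 'test1');
INSERT INTO @ToDelete (ProjectId, Description) VALUES (2, 'test4');

-- Delete only the rows whose (ProjectId, Description) pair
-- appears in the delete list.
DELETE t
FROM @Target AS t
INNER JOIN @ToDelete AS d
        ON d.ProjectId = t.ProjectId
       AND d.Description = t.Description;

SELECT ProjectId, Description FROM @Target;
```

The row (1, 'test4') is left alone even though both of its values appear somewhere in the delete list, which is the behavior two separate IN clauses cannot give.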
I have a hungarian character which looks like a lower case o with two single quotes on top of it --> ő
I have this character stored in two tables. The column that holds it is varchar in one table and nvarchar in the other. When I view the field in Enterprise Manager, the character appears as it should in both tables. But when I use a JSP page deployed on WebLogic to look at this character, the one stored in the varchar column displays perfectly, while the one in the nvarchar column appears on the JSP page as a Q instead.
Any input on how to correct this issue will be much appreciated. Changes to the character set on the HTML / JSP pages have no effect on the result.
I'm presented with a problem where I have a database table which must be migrated via a "custom tool", moving the data into a new table which has special character requirements that didn't exist in the source database. My data resides in an SQL Server 2008R2 instance.
I envision a one-time query which will loop through selected records and replace the offending characters with --, however I'm having trouble understanding how this works.
There are roughly 2500 records which meet the criteria of "contains bad characters", frequently containing multiple separate bad chars, and the table contains roughly 100000 rows.
Special Characters are defined as #%&*:<>?/{}|~ and ..
While the field is called "Filename" it isn't always so, it is a parent/child table where foldernames are also stored.
The examples I'm finding are all oriented around SELECT statements, to change the output of what I see returned, however I'd rather just fix the entire column using an UPDATE. Initial testing using REPLACE fails because I don't always have a single character as the bad thing in a string.
In a better solution, I found an example using a User Defined Function to modify the output of a select, but I cannot use that UDF in an UPDATE.
My alternative is to learn enough C# to modify the "migration tool" to do this in-transit, but I know even less about C# than I do of SQL.
I gather I want to use @@ROWCOUNT to loop through the rows but I really can't put it all together in a cohesive way.
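One set-based alternative to looping through rows is to loop through the short list of bad characters and run one UPDATE per character. The sketch below is built on assumptions: a table variable stands in for the real table, the replacement is a single hyphen, and the double dot is treated as a separate two-character case.

```sql
-- Hypothetical stand-in for the real table.
DECLARE @Files TABLE (Id int, Filename nvarchar(260));
INSERT INTO @Files (Id, Filename) VALUES (1, N'report#1:final?.doc');
INSERT INTO @Files (Id, Filename) VALUES (2, N'clean_name.txt');
INSERT INTO @Files (Id, Filename) VALUES (3, N'a..b|c.txt');

DECLARE @BadChars nvarchar(20);
DECLARE @Replacement nvarchar(2);
DECLARE @i int;

SET @BadChars = N'#%&*:<>?/{}|~';
SET @Replacement = N'-';   -- assumed replacement text
SET @i = 1;

-- One UPDATE per bad character; each touches only rows containing it.
WHILE @i <= LEN(@BadChars)
BEGIN
    UPDATE @Files
    SET Filename = REPLACE(Filename, SUBSTRING(@BadChars, @i, 1), @Replacement)
    WHERE CHARINDEX(SUBSTRING(@BadChars, @i, 1), Filename) > 0;

    SET @i = @i + 1;
END;

-- The double dot is handled as its own case.
UPDATE @Files
SET Filename = REPLACE(Filename, N'..', @Replacement)
WHERE CHARINDEX(N'..', Filename) > 0;

SELECT Id, Filename FROM @Files;
```

This runs a fixed number of statements regardless of how many rows are affected, and a string with several different bad characters is cleaned over successive passes.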
I have a problem passing a big string (over 8000 characters) to an nvarchar(max) variable in a stored procedure in SQL 2005. I know that SQL 2005 defines a new type, nvarchar(max), which can store a 2 GB string. I have written a stored procedure, Hellocw_ImportBookmark, but when I pass a big string to @Insertcontent, the stored procedure can't be launched. Why?
create procedure Hellocw_ImportBookmark
    @userId varchar(80),
    @FolderId varchar(80),
    @Insertcontent nvarchar(max)
as
declare @contentsql nvarchar(max);
set @contentsql=N'update cw_bookmark set Bookmark.modify(''declare namespace x="http://www.hellocw.com/onlinebookmark"; insert '+ @Insertcontent+' as last into (//x:Folder[@Id="'+@FolderId+'"])[1]'') where userId='''+@userID+'''';
exec sp_executesql @contentsql;
I've had some issues in developing the SQL Server portion of my site. The issue is editing, deleting, and inserting data from a form (at least from what I can understand; I'm a beginner). Below is the error. Any help I can get is greatly appreciated! Josh
Server Error in '/WebSite4' Application.
Incorrect syntax near 'nvarchar'. Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code. Exception Details: System.Data.SqlClient.SqlException: Incorrect syntax near 'nvarchar'. Source Error:
An unhandled exception was generated during the execution of the current web request. Information regarding the origin and location of the exception can be identified using the exception stack trace below. Stack Trace:
Recently I came across a requirement where I need to design a table.
There are some columns in table like below with DECIMAL Datatype:
BldgLength
BldgHeight
BldgWeight
Based on my knowledge, the part of each value before the decimal point will not be more than 4 digits.
Now as per MSDN,
Precision => 1 - 9 Storage bytes => 5
so I can create the column as:
BldgLength DECIMAL(6,2) DEFAULT 0
OR
BldgLength DECIMAL(9,2) DEFAULT 0
While reading some articles, I learned that when we perform an operation like SUM or AVG on the above column, the result might be larger than the current data type can hold.
So some folks suggested that I should keep some extra digits to allow for these aggregate functions and avoid an arithmetic overflow error.
So my question is: what should the data type of the above column be?
The query I'm running so far is wrong, but here it is...
SELECT t.FromUserID, t.ToUserID, t.msg, u.UserName AS UserFrom, u.GroupID AS FromGroup, u2.UserName AS UserTo, u2.GroupID AS ToGroup FROM tmp_Messages t LEFT JOIN (SELECT UserID, GroupID, UserName FROM tmp_users WHERE GroupID = 3) u
[Code] .....
I'm missing the details of one of the users. I know what the problem is; I just can't figure out how to get this working without using temp tables, which I can't do in the production version.
How do I use the LIKE operator in a SELECT statement to retrieve multiple column names in a SQL Server database? For example, I have a table called employees, and I want to get all column names like emp_, acc_, etc. using '%'. Also, what is the query below used for?
SELECT column_name as 'Column Name', data_type as 'Data Type', character_maximum_length as 'Max Length' FROM information_schema.columns WHERE table_name = 'tblUsers'
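The query above lists the name, data type, and maximum character length of every column in tblUsers. The same view can be filtered with LIKE to find columns by prefix, as in the sketch below. It assumes the table is named employees and that the prefixes are emp_ and acc_, as in the question.

```sql
-- List columns of the employees table whose names start with emp_ or acc_.
-- An underscore is itself a single-character wildcard in LIKE, so it is
-- wrapped in brackets to match a literal underscore.
SELECT COLUMN_NAME,
       DATA_TYPE,
       CHARACTER_MAXIMUM_LENGTH
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'employees'
  AND (COLUMN_NAME LIKE 'emp[_]%' OR COLUMN_NAME LIKE 'acc[_]%');
```

Without the brackets, 'emp_%' would also match names such as employee_id, because the underscore would match any single character.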
I'm connecting to a SQL Server 2005 database using the latest (beta) sql server driver (Microsoft SQL Server 2005 JDBC Driver 1.1 CTP June 2006) from within Java (Rational Application Developer).
The table in SQL Server database has collation Latin1_General_CI_AS and one of the columns is a NVARCHAR with collation Indic_General_90_CI_AS. This should be a Unicode only collation. However when storing for instance the following String:
__ÙÚÜÛùúüû_ÅÆØåæøßÇçÑñ__ЎўЄє?ґ_пр?туф_ЂЉЊЋ ... it is saved with ? for all unicode characters as follows (when looking in the database): __ÙÚÜÛùúüû_ÅÆØåæøßÇçÑñ__??????_??????_????
The above is not correct, since all unicode characters should still be visible. When inserting the same string directly into the sql server database (without using Java) the result is ok.
Also when trying to retrieve the results again it complains about the following error within Java:
Codepage 0 is not supported by the Java environment.
Hopefully somebody has an answer for this problem. When I alter the collation of the NVARCHAR column to be Latin1_General_CI_AS as well, the data can be stored and retrieved however then of course the unicode specific characters are lost and results into ? So in that case the output is as described above (ie __ÙÚÜÛùúüû_ÅÆØåæøßÇçÑñ__??????_??????_????)
We would like to be able to persist and retrieve unicode characters in a SQL Server database using the correct JDBC Driver. We achieved this result already with an Oracle UTF8 database. But we need to be compliant with a SQL Server database as well. Please help.
Hi, I have a problem. I have a table with a text type column and a nvarchar(2000) type column on my MS SQL 2000 Server. I know that the longest text in the text field is 1000 chars. I want to copy the content of the text field into the nvarchar field. I tried convert and cast, but after the update there are only 255 chars in the nvarchar field.
Best regards
Marc
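It may be worth checking whether the data is really truncated or only displayed truncated, since some query tools cap the number of characters shown per column. The sketch below copies a text value with an explicit CAST and then measures the stored length. The temp table and its column names are hypothetical.

```sql
-- Hypothetical table with a text column and an nvarchar(2000) column.
CREATE TABLE #Docs (Id int, Body text, BodyN nvarchar(2000) NULL);
INSERT INTO #Docs (Id, Body) VALUES (1, REPLICATE('x', 1000));

-- Copy the text value with an explicit target length.
UPDATE #Docs
SET BodyN = CAST(Body AS nvarchar(2000));

-- Measure what was actually stored rather than relying on the grid display.
SELECT Id, LEN(BodyN) AS StoredChars
FROM #Docs;

DROP TABLE #Docs;
```

If the measured length matches the source, the 255-character limit is coming from the client tool's display setting rather than from the conversion. A CAST or CONVERT to nvarchar with no length specified would truncate the value, so the explicit length matters.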