How To Put a Few Hundred Million Strings In Proper Case
Sep 16, 2015
First off, I know this is a presentation issue. Second, no, I can't force a change on my source systems.
Some of the systems that send my BI application data send it in all upper case, like "JOHN DOE". We have this horrible SQL function that goes through and makes sure that the first letter in each word is upper case and the rest of the letters are lower case, so the result is "John Doe".
As you can imagine this is dreadfully slow when executed a couple of hundred million times, but what are my options?
I have not used Data Quality Services yet, but the chart in BOL says a DQS SSIS cleansing task can do 1 million records in 2 hours on a given set of hardware. That is still pretty horrible.
I suppose I could cobble together a Script task in SSIS, but I am pretty sure clumsy dotNet is not going to be much faster.
CREATE FUNCTION [dbo].[udf_ProperCase](@UnCased varchar(max))
RETURNS varchar(max)
as
begin
declare @Reset bit;
declare @Ret varchar(max);
[Code] .....
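One way out (a hedged sketch, not from the thread): pay the casing cost once at load time, store the result, and let queries read the stored value instead of calling the function a few hundred million times per query. The table and column names here are hypothetical:
ALTER TABLE dbo.Customer ADD DisplayName varchar(200) NULL;
-- run once, and again whenever new all-caps rows arrive
UPDATE dbo.Customer
SET DisplayName = dbo.udf_ProperCase(FullName)
WHERE DisplayName IS NULL;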
View 14 Replies
Oct 22, 2000
How would I convert an expression like one of these to all upper case first letters with the remaining letters lower case? VB has a function for that, but SQL doesn't seem to. I thought about having a loop go through each character to check for spaces. I've written a couple of similar pieces of code in VB a while ago, but is there a better way? Thanks :)
Just a couple of typical examples of how the data should appear ~
payment, credit card ==> Payment, Credit Card
butcher & singer ==> Butcher & Singer
View 1 Replies
View Related
Mar 29, 2008
I've used this udf for a while with great success, but only on fields with more than one word....
http://weblogs.sqlteam.com/jeffs/archive/2007/03/09/60131.aspx
I'd like to know how I can adapt this function so it will convert a Scottish/Irish surname (McDonald or O'Shea) when there is only the surname in the column.
This is what I'd been using for multiple words (Ronald McDonald), but it won't work on just Mcdonald. I'm sure it's just a simple tweak, but it all looks Punjabi to me!
Thanks in advance!!
CREATE FUNCTION [dbo].[f_ProperCase](@Text as varchar(512)) RETURNS varchar(512) as
BEGIN
    DECLARE @Reset bit      -- 1 = upper-case the next letter appended
    DECLARE @Ret varchar(512)
    DECLARE @i int
    DECLARE @c char(1)
    SELECT @Reset = 1, @i = 1, @Ret = ''
    WHILE @i <= LEN(@Text)
        SELECT @c = SUBSTRING(@Text, @i, 1),
               @Ret = @Ret + CASE WHEN @Reset = 1 THEN UPPER(@c) ELSE LOWER(@c) END,
               @Reset = CASE WHEN
                   -- look back five characters for the prefixes d'/o'/l', Di and Mc
                   -- after a word boundary, and force the next letter to upper case
                   CASE WHEN SUBSTRING(@Text, @i-4, 5) like '_[a-z] [DOL]''' THEN 1
                        WHEN SUBSTRING(@Text, @i-4, 5) like '_[a-z] [D][I]' THEN 1
                        WHEN SUBSTRING(@Text, @i-4, 5) like '_[a-z] [M][C]' THEN 1
                        ELSE 0
                   END = 1
                   THEN 1
                   -- otherwise reset on any non-letter except the apostrophe
                   ELSE CASE WHEN @c like '[a-zA-Z]' or @c in ('''') THEN 0
                             ELSE 1
                        END
               END,
               @i = @i + 1
    RETURN @Ret
    -- Test: SELECT dbo.f_ProperCase('it''s crazy! i couldn''t believe kate mcdonald, leo dicaprio, (terrence) trent d''arby (circa the 80''s), and jada pinkett-smith all showed up to [cHris o''donnell''s] party...donning l''oreal lIpstick! They''re heading to o''neil''s pub later on t''nite. the_underscore_test. the-hyphen-test.' )
END
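A possible tweak for the bare-surname case (an untested sketch, not from the thread): the look-back patterns all require a preceding "_[a-z] ", so a prefix at position 1 can never match. Adding two branches at the top of the inner CASE may be enough:
WHEN @i = 2 AND LEFT(@Text, 2) LIKE '[M][C]' THEN 1  -- 'Mcdonald' -> 'McDonald'
WHEN @i = 2 AND LEFT(@Text, 2) LIKE '[DOL]''' THEN 1 -- 'o''shea'  -> 'O''Shea'
The branch fires while the second character is being processed, so @Reset is already 1 when the third character ('d' or 's') is appended.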
View 4 Replies
View Related
Jun 12, 2015
I have a requirement to delete 1 million records from a table that has 10 million, and it's being queried on a 24/7 basis (we don't have a downtime window). How can I achieve that?
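A common pattern for this (a hedged sketch; the table and filter column are hypothetical) is to delete in small batches, so each batch holds locks only briefly and log backups can truncate the log in between:
DECLARE @rows int;
SET @rows = 1;
WHILE @rows > 0
BEGIN
    -- small batches avoid lock escalation and keep readers unblocked
    DELETE TOP (5000) FROM dbo.BigTable
    WHERE CreatedDate < '20140101';
    SET @rows = @@ROWCOUNT;
END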
View 13 Replies
View Related
Feb 21, 2008
I'm trying to build myself a report that receives values as worded text, e.g.
zero zero one zero zero
It would be nice if I could write some clever bit of code that would convert this to
One Hundred
Has anyone come up against this before, or could you point me in the direction of a tutorial? I would appreciate it.
thank you in advance
Jonny
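For what it's worth, a hedged sketch of the parsing half: map the digit words back to digits and read the result as a number (formatting 100 back into the words "One Hundred" would still need a lookup function). The VALUES constructor needs SQL Server 2008 or later, and the row-by-row variable assignment is a common trick whose order is not formally guaranteed, though these replacements are independent so order does not matter here:
DECLARE @s varchar(100);
SET @s = 'zero zero one zero zero';
SELECT @s = REPLACE(@s, w.word, w.digit)
FROM (VALUES ('zero','0'), ('one','1'), ('two','2'), ('three','3'), ('four','4'),
             ('five','5'), ('six','6'), ('seven','7'), ('eight','8'), ('nine','9')
     ) AS w(word, digit);
SELECT CAST(REPLACE(@s, ' ', '') AS int) AS NumericValue; -- 100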
View 6 Replies
View Related
Feb 19, 2007
I have a whole bunch of bit fields in an SQL database, which makes it a little messy to report on.
I thought a nice idea would be to assign a text string/null value to each bit field and concatenate all of them into a result.
The basic logic goes something like this:
select case when new_accountant = 1 then 'acct/' end +
case when new_advisor = 1 then 'adv/' end +
case when new_attorney = 1 then 'atty/' end as String
from new_database
The output would be
Null, acct/, adv/, atty/, acct/adv/, acct/atty/... acct/adv/atty/
So far, nothing I have tried has worked.
Any ideas?
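The likely culprit (a hedged guess): a CASE without an ELSE returns NULL, and NULL + 'x' is NULL, so a single unset flag blanks the whole string. Supplying ELSE '', plus NULLIF to get the NULL back when no flags are set, may be all that's missing:
SELECT NULLIF(
           CASE WHEN new_accountant = 1 THEN 'acct/' ELSE '' END +
           CASE WHEN new_advisor = 1 THEN 'adv/' ELSE '' END +
           CASE WHEN new_attorney = 1 THEN 'atty/' ELSE '' END,
           '') AS String
FROM new_database;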
View 2 Replies
View Related
Apr 17, 2003
Hi all,
I have a table with approx 75 million rows of names and addresses in it that I am trying to update... so far the update has been running 5 hours with no end in sight. A little background: this is running on a quad Xeon 500 with 3 GB RAM and one 145 GB drive (boooo). Without improving the hardware, can I improve the performance? I have indexed all the fields I read on in the WHERE clause, and I only update the table once or twice a month, but I do daily selects by zip or county (all indexed). I even have a composite key on phone and zip...
I have heard of horizontal partitioning, but I always thought that was reserved for archiving old transactional data that rarely gets read...
When I performed a trace there are plenty of reads but no writes... is this normal during an update like this?
I have been running this proc for the past 7 HOURS!!!... any help is appreciated, since all I have is time at this point....
THANKS!!!!
--Set rowcount to 100000 to limit number of updates
--performed in each batch to 100K rows.
Set rowcount 100000
--Declare variable for row count
Declare @rc int
Set @rc=100000
While @rc=100000
Begin
Begin Transaction
--Use tablockx and holdlock to obtain and hold
--an immediate exclusive table lock. This usually
--speeds the update because only one lock is needed.
Update [2000] With (tablockx, holdlock)
set [source] = '2000'
--A where clause is essential here (assumed fix): without one the same
--rows are updated on every pass and the loop never terminates.
where [source] is null or [source] <> '2000'
--Get number of rows updated
--Process will continue until fewer than 100000 rows are updated
Select @rc=@@rowcount
--Commit the transaction
Commit
End
View 5 Replies
View Related
Apr 9, 2008
I'm new to using a DB and have a few questions about what I'm trying to do. I have some historical options data and want to place it into a SQL Express database. (I understand I might need to use a non-Express version once the DB gets too big.) A month's worth of data is over 5.5 million rows, so six years' worth is ~400 million rows. Is it possible to put this into a SQL DB and be able to search it very fast? I have a month's worth in a DB now and it is pretty slow. Should I use a new table for each month and then have 6 years * 12 months = 72 tables to increase the search speed? I search by date and stock_symbol, and the data looks like this:
Date, Stock_Symbol, Option_Symbol, Strike, BidPrice, AskPrice, Volume, OpenInterest, (and a few others)
The select statement is simple: SELECT * FROM Options WHERE Date = @Date and StockSymbol = @Symbol
Thanks
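Since both predicates are equality tests, a composite index matching them (a hedged sketch; the index and column names are assumptions) usually beats splitting the data into 72 per-month tables:
-- a clustered index on the two search columns turns the query into a range seek
CREATE CLUSTERED INDEX IX_Options_Date_Symbol
    ON dbo.Options ([Date], StockSymbol);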
View 4 Replies
View Related
Jan 27, 2006
I am currently working on a simple page to insert 1.6 million UK postcode records into an SQL Server table. The table has three columns: the postcode, the longitude coordinate and the latitude coordinate. The data is sourced from a pipe (|) delimited txt file and inserted into the database using a FOR loop. The problem I have is that the page will hang after inserting only 10,000 records; the page displays either an invalid ViewState error or a page-cannot-be-found error.
Now, I assume the ViewState error stems from the fact that there is a form on the page which simply contains a button to execute the script and a few labels to show the progress. But without the form and associated ViewState the insert still fails to complete... any ideas? Would I be better running this on a thread, or should I just do it in stages and be patient? I have now modified the page to read the database on load and pick up from where it crashes.
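An alternative worth considering (a hedged sketch; the path and table name are hypothetical): hand the whole file to the database engine instead of looping from the page, which sidesteps the page lifetime entirely:
BULK INSERT dbo.Postcodes
FROM 'C:\data\postcodes.txt'
WITH (FIELDTERMINATOR = '|', ROWTERMINATOR = '\n', TABLOCK);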
View 2 Replies
View Related
Aug 30, 2006
Meg writes "Hi,
I have a table that has 4+ million records. I need to update those records. I am facing some performance issue. Can someone please advice?
update stage
set batch_status = 1
where update_status = 0
Update [transaction]
Set aId = s.aId,
b = s.b
from stage s
Where s.aId = [transaction].aId
and s.batch_status = 1
Update stage
Set update_status = 1,
batch_status = 2
where
batch_status = 1
When I run the above query with "set rowcount 1000", it runs in one minute. When I run the query with "set rowcount 10000", it runs in 1 hour 56 minutes. Can someone help me optimize it?
Thanks.
Meg"
View 4 Replies
View Related
Jul 20, 2005
Hey folks... So I have a table that looks like this:
CREATE TABLE [tblStation] (
[CAMPAIGN] [varchar] (8),
[LISTNUM] [varchar] (10),
[PHONE] [varchar] (10),
[EVENTTIME] [datetime],
[STATION] [int],
[OPERATOR] [varchar] (16),
[EVENTCODE] [varchar],
[CALLSPAN] [decimal](18, 0),
[FDISP] [int],
[RECORDNUM] [varchar],
[STC] [varchar],
[PROMOC] [varchar],
[EXP_CAMP] [varchar],
[PROMO3] [varchar],
[MAXATT] [char],
[LISTNAME] [varchar],
[SITENAME] [char],
[Row_id] [int] IDENTITY
)
It's taking nine seconds to run the following command:
SELECT count([fdisp])
FROM [TrunkFiles_new].[dbo].[tblStation] WITH (NOLOCK)
WHERE fdisp IS NULL
Anyone familiar with a table of this size having performance like this? The [fdisp] column has a non clustered index on it.
Thanks in advance...
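A hedged sketch (requires SQL Server 2008 or later, which postdates this post): a filtered index stores only the NULL rows, so counting them touches almost nothing:
CREATE INDEX IX_tblStation_fdisp_null
    ON tblStation (fdisp)
    WHERE fdisp IS NULL;
-- COUNT(*) counts the NULL rows; note that COUNT([fdisp]) skips NULLs by definition
SELECT COUNT(*) FROM tblStation WHERE fdisp IS NULL;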
View 1 Replies
View Related
Jun 9, 2008
Hi all - I have posted inquiries on this rather vexing issue before, so I apologize in advance for revisiting this. I am trying to create the code to add the parameters for two CheckBoxLists together. One CheckBoxList allows users to choose a group of Customers by Area Code, the other "CBL" allows users to select Customers by a type of Category that these Customers are grouped into. When a user selects Customers via one or the other CBL, I have no problems. If, however, the user wants to get all the Customers from one or more Area Codes who ALSO may or may not be members of one or more Categories, I have had trouble trying to create the proper SQL. What I have so far:
Protected Sub btn_CustomerSearchCombined_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles btn_CustomerSearchCombined.Click
    Dim CSC_SqlString As String = "SELECT Customers.CustomerID, Customers.CustomerName, Customers.CategoryID, Customers.EstHours, Customers.Locality, Category.Category FROM Customers INNER JOIN Category ON Customers.CategoryID = Category.CategoryID WHERE "
    Dim ACItem As ListItem
    Dim CATItem As ListItem
    For Each ACItem In cbl_CustomersearchAREA.Items
        If ACItem.Selected Then
            CSC_SqlString &= "Customers.AreaCodeID = '" & ACItem.Value & "' OR "
        End If
    Next
    CSC_SqlString &= "' AND " ' <-- this is the heart of my problem, I believe
    For Each CATItem In cbl_CustomersearchCAT.Items
        If CATItem.Selected Then
            CSC_SqlString &= "Customers.CategoryID = '" & CATItem.Value & "' OR "
        End If
    Next
    CSC_SqlString = Left(CSC_SqlString, Len(CSC_SqlString) - 4)
    CSC_SqlString &= "ORDER By Categories.Category"
    sql_CustomersearchGrid.SelectCommand = CSC_SqlString
End Sub
Any help on this is much appreciated, many thanks --
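For reference, a hedged sketch of the SQL the loops need to end up producing: each CheckBoxList becomes one parenthesized group (IN avoids the trailing-OR problem) and the groups are joined with AND. The literal values are placeholders:
SELECT Customers.CustomerID, Customers.CustomerName, Customers.CategoryID,
       Customers.EstHours, Customers.Locality, Category.Category
FROM Customers
INNER JOIN Category ON Customers.CategoryID = Category.CategoryID
WHERE Customers.AreaCodeID IN ('201', '212') -- selected area codes
  AND Customers.CategoryID IN ('3', '7')     -- selected categories
ORDER BY Category.Category;
Building the statement with parameters rather than concatenated checkbox values would also close the SQL injection hole in the original code.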
View 5 Replies
View Related
May 23, 2005
T-SQL offers UPPER and LOWER functions for formatting text strings. Is there a PROPER function? Thanks.
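There is no built-in PROPER function in T-SQL. For a single word, a minimal sketch is:
DECLARE @s varchar(50);
SET @s = 'fRED';
SELECT UPPER(LEFT(@s, 1)) + LOWER(SUBSTRING(@s, 2, 8000)) AS ProperCased; -- 'Fred'
Multi-word strings need a loop or a split, as in the functions elsewhere on this page.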
View 1 Replies
View Related
Sep 18, 2000
What are the proper procedures to move a SQL 6.5 database to another 6.5 server (new box)?
I need to move everything, including stored procedures.
Any tips would be helpful.
Thanks,
Jason
View 1 Replies
View Related
Apr 3, 2006
I have a T-SQL query where I need to do a PATINDEX on a variable and check whether a record exists, to meet the WHERE clause for the IF statement below. What am I doing wrong?
declare @l_orderid int
set @l_orderid = 18
declare @l_SIGShort varchar(20)
set @l_SIGShort = '~KOP~'
-- KOP Orders
print patindex ( '%~KOP~%', @l_SIGShort)
if patindex ( '%~KOP~%', @l_SIGShort) <> 0 and (if exists (select * from orderoptions
where orderid = @l_orderid and ordertype = 'KOP'))
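The likely problem (hedged): EXISTS is already a predicate, so it cannot be wrapped in its own IF inside the condition. Dropping the inner IF, and giving the outer IF a statement to run, should compile:
if patindex('%~KOP~%', @l_SIGShort) <> 0
   and exists (select 1 from orderoptions
               where orderid = @l_orderid and ordertype = 'KOP')
    print 'KOP order';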
View 2 Replies
View Related
Nov 16, 2001
How well can SQL Server support 300 million records?
Is anybody working on a big database like this? Can anyone give me some input on this? It's going to be a 60GB database.
View 1 Replies
View Related
Mar 21, 2000
In our database, we have a very large table that gets updated every morning. The start of the day involves copying 4 million rows from the fact table from the previous date to today's date in the same table, followed by some other processing. It takes 1 1/2 to 2 hrs to do this. There is a DTS package created to copy these rows into a temp table and then into this fact table.
This table has more than 200 million rows.
Any ideas on how to accomplish this without doing the copy twice and without running into locking problems?
Thanks for any suggestions.
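A hedged sketch of doing the copy once, set-based, straight into the fact table (the table, columns and date handling are assumptions; TABLOCK can reduce logging under the right recovery model):
-- copy yesterday's rows forward as today's rows in one statement
INSERT INTO dbo.FactSales WITH (TABLOCK) (FactDate, Col1, Col2)
SELECT CONVERT(char(8), GETDATE(), 112), Col1, Col2
FROM dbo.FactSales
WHERE FactDate = CONVERT(char(8), DATEADD(day, -1, GETDATE()), 112);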
View 5 Replies
View Related
Mar 26, 2004
I have a directory database with approx. 80 million records. I am feeding the database with BULK INSERT. Indexing one of the fields took about 8 hrs. After indexing, when I run queries on the indexed field the response time is under 1 sec. However, if I run SELECT queries with LIKE on non-indexed fields it takes more than 2 mins. So I decided to index 4 other fields in the database, and it looks like the indexing process is going to run for 2 days.
I am a novice in SQL database design and I am not sure if this is the best way to index the table; I am just using CREATE INDEX. Any suggestions / advice welcome.
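Two things that often help here (hedged, not from the thread): build indexes after the bulk load finishes rather than maintaining them during it, and let the index sort spill to tempdb if tempdb sits on faster disks. The index name and column are hypothetical:
CREATE INDEX IX_Directory_LastName
    ON dbo.Directory (LastName)
    WITH SORT_IN_TEMPDB
Note that a LIKE with a leading wildcard ('%foo%') cannot seek on a plain index no matter what you build.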
View 5 Replies
View Related
Jul 11, 2013
We have a table with 16 million records, and this table is replicated.
We want to add a new column to this table. How should we go about it?
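A hedged starting point: adding the column as NULLable with no default keeps the change a quick metadata-only operation, and on SQL Server 2005 and later the ALTER can flow to subscribers when the publication's "replicate schema changes" option is on (details vary by replication type and version):
ALTER TABLE dbo.MyReplicatedTable -- table name is hypothetical
    ADD NewColumn int NULL; -- nullable, no default: metadata-only change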
View 1 Replies
View Related
Jul 16, 2013
I am deleting 8 million rows from my database and I am wondering how to control the transaction log. I have also heard something about row locks and table locks.
View 4 Replies
View Related
Feb 27, 2015
I have the following table:
table name: emp_master
Columns: empid, efname, emname, elamane, efathername, emothername, deptno, edob, edoj, createdby, updateby, lastupdatedatetime, lastactionperformed
empid is the primary key.
This table contains 20 million records, and I want to run the following query against it to get all employee data where the employee record is more than 10 years old:
select empid, efname, emname, elamane, efathername, emothername, deptno, edob, edoj, createdby, updateby, lastupdatedatetime, lastactionperformed
from emp_master
where year(edoj) + 10 > year(getdate())
This returns approx 10 million rows and takes 18 mins. What approaches should I take to tune this query and reduce the execution time?
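One standard approach (hedged): make the predicate sargable by moving the arithmetic off the column so an index on edoj can be used. The rewrite below is logically equivalent to the original predicate (DATEFROMPARTS needs SQL Server 2012 or later); returning 10 million wide rows will still dominate the runtime, so it is also worth asking whether every column is needed:
SELECT empid, efname, emname, elamane, efathername, emothername,
       deptno, edob, edoj, createdby, updateby,
       lastupdatedatetime, lastactionperformed
FROM emp_master
-- year(edoj) + 10 > year(getdate())  <=>  year(edoj) >= year(getdate()) - 9
WHERE edoj >= DATEFROMPARTS(YEAR(GETDATE()) - 9, 1, 1);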
View 6 Replies
View Related
Mar 19, 2008
Hello,
What is the fastest way to update 20 million records in our database?
I have tried to do a simple update statement like this:
update trail_log with (tablockx, holdlock)
set trail_log.entry_by = users.user_identity
from users
where trail_log.entry_by = users.user_id
but it takes 10-plus hours to run, since it cannot commit the transactions until the very end. So I was thinking that I need to commit in batches, like after 50K rows, but that is slow as well.
Set rowcount 50000
Declare @rc int
Set @rc=50000
While @rc=50000
Begin
    Begin Transaction
    update trail_log With (tablockx, holdlock)
    set trail_log.entry_by = users.user_identity
    from users
    where trail_log.entry_by = users.user_id
    and trail_log.entry_by not like '%[0-9]%'
    Select @rc=@@rowcount
    --Commit the transaction
    Commit
End
go
I have let the above statement run for 1.5 hours and it only updated 450,000 rows. Any ideas...
Maybe I'm doing it wrong. Please help!!
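A hedged alternative (assumes trail_log has an integer clustered key, here called trail_id, which is hypothetical): walk the table in key ranges, so each batch seeks straight to its slice instead of rescanning rows that earlier batches already handled:
DECLARE @from int, @step int, @max int;
SELECT @from = 0, @step = 50000, @max = MAX(trail_id) FROM trail_log;
WHILE @from <= @max
BEGIN
    UPDATE t
    SET entry_by = u.user_identity
    FROM trail_log t
    JOIN users u ON u.user_id = t.entry_by
    WHERE t.trail_id > @from AND t.trail_id <= @from + @step
      AND t.entry_by NOT LIKE '%[0-9]%';
    SET @from = @from + @step;
END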
View 1 Replies
View Related
Mar 24, 2008
Hi,
I have to load 1 million rows from a database (or flat file) to a database (or flat file).
Which task is the best solution for this?
Appreciate any assistance in this regard.
Thanks,
Das
View 5 Replies
View Related
Jul 20, 2005
Hello,
We maintain a 175 million record database table for our customer. This is an extract of some data collected for them by a third party vendor, who sends us regular updates to that data (monthly). The original data for the table came in the form of a single, large text file, which we imported.
This table contains name and address information on potential customers. It is a maintenance nightmare for us, as prior to this the largest table we maintained was about 10 million records, with less complicated updates required.
Here is the problem:
* In order to do the searching we need to do on the table it has 8 of its 20 columns indexed.
* It takes hours and hours to do anything to the table.
* I'd like to cut down as much as possible the time required to update the file.
We receive monthly one file containing 10 million records that are new, and can just be appended to the table (no problem, simple import into SQL Server).
We also receive monthly one file containing 10 million records that are updates of information in the table. This is the tricky one. The only way to uniquely pair up a record in the update file with a record in the full database table is by a combination of individual_id, zip, and zip_plus4.
There can be multiple records in the database for any given individual, because that individual could have a history that includes multiple addresses.
How would you recommend handling this update? So far I have mostly tried a number of execution plans involving deleting out the records in the table that match those in the text file, so I can then import the text file, but the best of those plans takes well over 6 hours to run.
My latest thought: Would it help in any way to partition the table into a number of smaller tables, with a view used to reference them?
We have no performance issues querying the table, but I need some thoughts on how to better maintain it.
One more thing, we do have 2 copies of the table on the server at all times so that one can be actively used in production while we run updates on the other one, so I can certainly try out some suggestions over the next week.
Regards,
Warren Wright
Dallas
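One hedged idea for the update file: bulk-load it into a staging table, index the three key columns, and update the big table in place through a join on the composite key instead of delete-and-reimport. Staging and column names beyond the key are hypothetical:
CREATE INDEX IX_Staging_Key
    ON dbo.ProspectsStaging (individual_id, zip, zip_plus4);
UPDATE t
SET t.address1 = s.address1, -- repeat for each updatable column
    t.address2 = s.address2
FROM dbo.Prospects AS t
JOIN dbo.ProspectsStaging AS s
  ON s.individual_id = t.individual_id
 AND s.zip = t.zip
 AND s.zip_plus4 = t.zip_plus4;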
View 7 Replies
View Related
Oct 12, 2007
Hi all,
I have a SQL script that updates records in a table with 40 million records.
There is some functionality in the script that could be factored out into functions for code reuse/elegance.
Functions would cause execution overhead.
What else could I use besides functions that would give me the code reuse without compromising on execution overhead? Is there anything like includes in T-SQL that would allow me to do so?
TIA..
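One hedged option: an inline table-valued function. Unlike a scalar UDF, it is expanded into the calling query's plan, so it gives the reuse without per-row call overhead. A toy example with hypothetical names:
CREATE FUNCTION dbo.itvf_CleanPhone (@raw varchar(50))
RETURNS TABLE
AS RETURN
(
    -- the reusable logic lives here and is inlined by the optimizer
    SELECT REPLACE(REPLACE(REPLACE(@raw, '(', ''), ')', ''), '-', '') AS CleanPhone
);
It is then used set-based via CROSS APPLY, e.g. SELECT f.CleanPhone FROM big_table b CROSS APPLY dbo.itvf_CleanPhone(b.phone) f.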
View 4 Replies
View Related
Jan 14, 1999
Greetings, I seem to be getting a problem installing SQL 7.0 over the SQL 7.0 beta. It tells me that there are ODBC components that need to be upgraded and are read-only... I can find no way of changing this.
Alternatively, if I remove the beta and then install the proper version, will all the old DBs created under the beta still be recognised?
Kris Klasen
Act. Manager, Data Warehouse Project
Information Management Branch
Department of Education
E-mail: Kris.Klasen@Central.Tased.Edu.Au
http://www.tased.edu.au
Tel: 03 6233 6994
Fax: 03 6233 6969
Mobile: 0419 549237
73 Murray Street
2nd Floor
Hobart 7000
Tasmania
Australia
View 1 Replies
View Related
Mar 18, 1999
Greets!
I have been told that simply stopping the SQL Server service and backing up the data directory is all I have to do to back up my data. Is this accurate?
Thanks,
Jimmy Ipock
View 2 Replies
View Related
Jul 12, 2004
Hello: a nice simple question (I hope). Is there an MS SQL equivalent to PROPER(string) which would return "Fred Bloggs" from "FRED BLOGGS" and equally from "FrEd bLoggs"? I can't find such....
Gerry
View 3 Replies
View Related
Feb 14, 2013
I have a very simple query that gets a field to use as constraints in another query.
Code:
SELECT A.RIN
FROM apcType T
INNER JOIN apcAttribute a on A.T = T.RIN
WHERE T.Name like '%Sales'
The results of the first query are used in the following query, at the spot marked with <<<<========
Code:
SELECT AP.Arg2, AP.Arg3, M.Parcel,
M.Serial, M.Name, M.Acres, M.District,
V.YearBuilt, V.Code, V.Size,
(SELECT SUM(V1.Acres)
FROM TRValue V1
WHERE V1.Year = V.Year and V1.Parcel = V.Parcel and
SUBSTRING(V1.Code, 1, 1) = 'L'
[code]....
My question is: how can I fold the first query into the second?
View 1 Replies
View Related
May 2, 2015
I am no stranger to databases - I worked a lot with MySQL but never really cared about proper DB design as long as it worked. Now I am playing with SQL in an ASP.NET project and want to get things done the right way. Let's say I have a Movies database. My movies can have multiple genres, so I set my tables up like this:
[Movies]
MovieID
MovieName
MovieRelease
[code]....
Is this the proper way of doing things? The problem with this is that when I want to enter a record manually I have to know the ID of the movie and the IDs of the genres of the movie. And what about naming conventions? By default the identifier is always Id; from my MySQL experience I liked naming it like the table, and the same goes for other columns. This is my T-SQL code for the above tables in VS 2013.
CREATE TABLE [dbo].[Movies] (
[MovieID] INT IDENTITY (1, 1) NOT NULL,
[MovieName] VARCHAR (50) NOT NULL,
[MovieRelease] NUMERIC (18) NOT NULL,
CONSTRAINT [PK_Movies] PRIMARY KEY CLUSTERED ([MovieID] ASC)
[code]....
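For multiple genres per movie, the usual shape is a junction table plus foreign keys - a hedged sketch in the same style as the post (the Genres table and its columns are assumptions):
CREATE TABLE [dbo].[Genres] (
    [GenreID]   INT IDENTITY (1, 1) NOT NULL,
    [GenreName] VARCHAR (50) NOT NULL,
    CONSTRAINT [PK_Genres] PRIMARY KEY CLUSTERED ([GenreID] ASC)
);
CREATE TABLE [dbo].[MovieGenres] (
    [MovieID] INT NOT NULL,
    [GenreID] INT NOT NULL,
    CONSTRAINT [PK_MovieGenres] PRIMARY KEY CLUSTERED ([MovieID] ASC, [GenreID] ASC),
    CONSTRAINT [FK_MovieGenres_Movies] FOREIGN KEY ([MovieID]) REFERENCES [dbo].[Movies] ([MovieID]),
    CONSTRAINT [FK_MovieGenres_Genres] FOREIGN KEY ([GenreID]) REFERENCES [dbo].[Genres] ([GenreID])
);
Manual inserts can then look the IDs up by name (INSERT ... SELECT joined on MovieName and GenreName) instead of memorizing them.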
View 2 Replies
View Related
Apr 16, 2007
I read some questions where questioners ask: "Sometimes clients give us data where dates are expressed as float or integer values.
How do I find the maximum date?"
Ex
March 02, 2006 can be expressed as
02032006.0
020306
2032006
20306
020306.0000
2032006
Assuming the values are expressed in dmy format
One possible way is to convert that value into a proper date, so that all types of date-related calculations can be done:
Create function proper_date (@date_val varchar(25))
returns datetime
as
Begin
Select @date_val=
case when @date_val like '%.0%' then substring(@date_val,1,charindex('.',@date_val)-1)
else @date_val
end
return
cast(
case
when @date_val like '%[a-zA-Z-/]%' then case when ISDATE(@date_val)=1 then @date_val else NULL end
when len(@date_val)=8 then right(@date_val,4)+'-'+substring(@date_val,3,2)+'-'+left(@date_val,2)
when len(@date_val)=7 then right(@date_val,4)+'-'+substring(@date_val,2,2)+'-0'+left(@date_val,1)
when len(@date_val)=6 then
case when right(@date_val,2)<50 then '20'
else '19'
end
+right(@date_val,2)+'-'+substring(@date_val,3,2)+'-'+left(@date_val,2)
when len(@date_val)=5 then
case when right(@date_val,2)<50 then '20'
else '19'
end
+right(@date_val,2)+'-'+substring(@date_val,2,2)+'-0'+left(@date_val,1)
else
case when ISDATE(@date_val)=1 then @date_val else NULL end
end
as datetime
)
End
This function will convert them into proper date
select
dbo.proper_date('02032006.0') as proper_date,
dbo.proper_date('020306.000') as proper_date,
dbo.proper_date('02032006') as proper_date,
dbo.proper_date('020306') as proper_date,
dbo.proper_date('20306') as proper_date,
dbo.proper_date('020306') as proper_date
Apart from converting integer or float values to date, it will also convert date strings to date
Select
dbo.proper_date('March 2, 2006') as proper_date,
dbo.proper_date('2 Mar, 2006') as proper_date,
dbo.proper_date('2006 Mar 2') as proper_date,
dbo.proper_date('2-Mar-2006') as proper_date,
dbo.proper_date('3/02/2006') as proper_date,
dbo.proper_date('02-03-2006') as proper_date,
dbo.proper_date('2006/03/02') as proper_date,
dbo.proper_date('March 2006') as proper_date,
dbo.proper_date('2 Mar 2006') as proper_date
Madhivanan
Failing to plan is Planning to fail
View 8 Replies
View Related
Mar 31, 2008
What is the proper way to return the identity of a newly inserted row from a stored procedure: using a return value or a SELECT statement? (I guess an output parameter should also be considered...) As in
RETURN SCOPE_IDENTITY()
or
SELECT SCOPE_IDENTITY()
What are the pros/cons of using one approach over the other?
- Jason
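For completeness, a hedged sketch of the OUTPUT-parameter variant (the procedure and table are hypothetical); many teams reserve the return value for status codes and use an output parameter or the SELECT for data:
CREATE PROCEDURE dbo.InsertCustomer
    @Name varchar(50),
    @NewId int OUTPUT
AS
BEGIN
    INSERT INTO dbo.Customers ([Name]) VALUES (@Name);
    SET @NewId = SCOPE_IDENTITY(); -- not affected by inserts done in triggers, unlike @@IDENTITY
END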
View 4 Replies
View Related
Aug 26, 2015
I have two XML queries that take long: the 1st query takes about 5 minutes (returns 700 rows) and the 2nd query takes about 10 minutes (returns 4 rows). The total number of rows in the table is about 2 million. There are three secondary XML indexes - Property, Value and Path - in addition to the clustered index on CardId and the primary XML index. Here is the table definition:
CREATE TABLE [dbo].[Cards]
(
[CardId] [int] NOT NULL,
[Card] [xml] NOT NULL,
CONSTRAINT [PK_dbo_Cards_CardId] PRIMARY KEY CLUSTERED
([CardId] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
[code]...
Looking at the execution plan, the query uses the primary XML index even if I add any of the secondary XML indexes. My question is: why doesn't the optimizer use the Property secondary index instead of the primary XML index? Microsoft recommends creating a Property index for the value() method of the xml datatype to provide a performance benefit. What would be another alternative to make the query run faster?
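One hedged alternative to fighting the XML indexes: promote the frequently searched value into an ordinary column, kept in sync at write time (by the inserting code or a trigger), and index it, so the hot query becomes a plain relational seek. The XPath and column name are assumptions:
ALTER TABLE dbo.Cards ADD CardName varchar(100) NULL;
-- backfill once; new or changed cards must maintain the column too
UPDATE dbo.Cards
SET CardName = Card.value('(/Card/Name/text())[1]', 'varchar(100)');
CREATE INDEX IX_Cards_CardName ON dbo.Cards (CardName);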
View 12 Replies
View Related