I have been experimenting with SQL Server 2005 partitions. I loaded a terabyte of information into 2 tables. The first holds the document information and the second holds the actual binary document (in this case pdf). Most of the documents are about 1 megabyte in size, but the largest is 212 megabytes.
SQL Server has no problem storing the blobs. The problem occurs when I attempt to get the data.
I ran some quick tests to see how fast I could pull the documents out. The largest took about 24 seconds. The 1 meg documents are sub-second.
Here is how the 212 meg doc breaks down:
Time to load datatable: 18.79 seconds
Time to load byte array: 3.84 seconds
Time to Write and open document: 0.01 seconds
If I access the file from a file server, the time is 0.04 seconds to begin showing the document.
As you can see, the longest time period is related to retrieving the data from SQL, and it is much slower than launching it from disk across the network. (Note: the SQL Server and the file server used for the test are next to each other.)
My question is, how can I speed up the access from SQL Server? I believe the keys are "partition aligned". Any suggestions would be appreciated.
I will add the table definitions and partition information as a reply since only 5000 chars are allowed in the post.
Excel 2007 Documents Not Displayed in Windows XP Professional sp2 Start Menu's Recent Documents List:
Dear Microsoft Support: I can't figure out how to get recently used Excel 2007 (new file formats) documents to show up in the Windows XP Professional (sp2) Start Menu's Recent Documents List. I checked the Internet, the knowledgebase, many parts of the MS web site, etc. for an answer but can't find one. Are Excel 2007 documents supposed to show up in the XP Start Menu's Recent Document List? Is this a bug or do I have to do some sort of configuration to make it do so? If it's a bug, when will a fix be available?
Are these new Excel 2007 files filtered out of the Recent Documents List the same way EXE files are?
The Excel 2007 file types are listed in the Registry.
This question is also posted on Experts-Exchange...No solutions yet.
I have a database approximately 30 GB in size which needs to be moved from one SQL Server to another. Does anyone know the most efficient way of doing this, other than backing up to tape?
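A minimal sketch of the backup-to-disk and restore route, assuming a file share that both servers (and their SQL Server service accounts) can reach. The database name, share path, logical file names, and target folders below are all hypothetical; RESTORE FILELISTONLY shows the real logical names to use in the MOVE clauses.

```sql
-- On the source server: back up to a share (hypothetical path).
BACKUP DATABASE [MyDb]
TO DISK = N'\\fileshare\sqlbackup\MyDb.bak'
WITH INIT, STATS = 10;

-- On the target server: list the logical file names stored in the backup.
RESTORE FILELISTONLY
FROM DISK = N'\\fileshare\sqlbackup\MyDb.bak';

-- Restore, relocating the files to the target server's drives.
-- 'MyDb_Data' and 'MyDb_Log' are placeholders for the logical names
-- reported by RESTORE FILELISTONLY.
RESTORE DATABASE [MyDb]
FROM DISK = N'\\fileshare\sqlbackup\MyDb.bak'
WITH MOVE N'MyDb_Data' TO N'D:\SQLData\MyDb.mdf',
     MOVE N'MyDb_Log'  TO N'E:\SQLLogs\MyDb_log.ldf',
     STATS = 10;
```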
I have a 50 GB OLTP production database that currently takes about 50 minutes to back up (normal SQL flat file backup to disk).
This database will grow to about a terabyte by next year.
My major concern is how I will be able to back up this DB in 2 hours or less when it is that big.
I have been checking out my options, in terms of SAN snapshots/clones. Also multiple backup devices and using differential/filegroup/full backup strategy.
What I want to know is: if anyone out there is backing up VLDBs, what strategies, methods, or tools are you using, including 3rd party tools for faster, smaller backups?
Any pointers/best practices for VLDB backups would be greatly appreciated.
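For scale: 50 GB in about 50 minutes is roughly 1 GB per minute, while 1 TB (1,024 GB) in 120 minutes needs roughly 8.5 GB per minute. One of the options mentioned above, multiple backup devices combined with differentials, could look like the sketch below. The database name, drive letters, and file names are hypothetical, and the actual speed-up depends on whether the target drives sit on separate spindles or LUNs.

```sql
-- Full backup striped across four files on (ideally) separate physical drives.
-- All names and paths are placeholders.
BACKUP DATABASE [SalesDb]
TO DISK = N'E:\Backup\SalesDb_full_1.bak',
   DISK = N'F:\Backup\SalesDb_full_2.bak',
   DISK = N'G:\Backup\SalesDb_full_3.bak',
   DISK = N'H:\Backup\SalesDb_full_4.bak'
WITH INIT, STATS = 5;

-- Differential backup between full backups: only extents changed since
-- the last full backup are written, so it is usually much smaller.
BACKUP DATABASE [SalesDb]
TO DISK = N'E:\Backup\SalesDb_diff.bak'
WITH DIFFERENTIAL, INIT, STATS = 5;
```

All stripe files are needed together to restore the full backup, so losing one of them makes the set unusable.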
I heard that SQL Server 7.0 has problems when the database reaches 50 - 100GB, in areas such as backup, transaction logging, and database admin, and that by 100GB parallel queries are also affected.
Is this true? Where can I get information on this?
Does anyone have experience/advice with large databases (5-10 Gig)? If so, I was wondering about performance/other benefits of spanning a large database across multiple devices (different disks). Would anyone vote for or against doing this?
I have a production database that is in the low gigabyte size and growing steadily. No issue there. I wish to completely refresh the development database daily on a second server. What is going to be the fastest, easiest way to do this without hindering performance on the production system?
Thanks,
Craig
I need to manage the negative performance implications of defragmenting a 1TB+ DB. I want to perform an Index Reorganization if fragmentation is no higher than 30%, and an Index Rebuild if the fragmentation exceeds 30%.
Firstly, can anyone recommend a script which uses the sys.dm_db_index_physical_stats function to ascertain the fragmentation level? Secondly, is there a technique I can employ to prevent the ONLINE operation from completely killing performance on a 24/7 production system?
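A possible starting point, as a sketch only. It applies the 30% threshold from the question; the 5% floor and the 1,000-page minimum are assumptions added here to skip indexes that are probably not worth touching, and the index and table names in the ALTER INDEX examples are hypothetical.

```sql
-- Run in the database to be checked. LIMITED mode reads the fewest pages,
-- which matters on a 1 TB+ database.
SELECT
    QUOTENAME(s.name) + N'.' + QUOTENAME(o.name) AS table_name,
    QUOTENAME(i.name)                            AS index_name,
    ps.partition_number,
    ps.page_count,
    ps.avg_fragmentation_in_percent,
    CASE
        WHEN ps.avg_fragmentation_in_percent > 30 THEN N'REBUILD'
        ELSE N'REORGANIZE'
    END AS suggested_action
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i ON i.object_id = ps.object_id AND i.index_id = ps.index_id
JOIN sys.objects AS o ON o.object_id = ps.object_id
JOIN sys.schemas AS s ON s.schema_id = o.schema_id
WHERE ps.index_id > 0                        -- skip heaps
  AND ps.avg_fragmentation_in_percent >= 5   -- assumed floor, not from the question
  AND ps.page_count >= 1000                  -- assumed floor, not from the question
ORDER BY ps.avg_fragmentation_in_percent DESC;

-- Fragmentation of 30% or less: REORGANIZE is always an online operation.
ALTER INDEX [IX_Example] ON [dbo].[ExampleTable] REORGANIZE;

-- Fragmentation above 30%: online rebuild, with MAXDOP capped to limit
-- how much CPU the rebuild can take from the production workload.
ALTER INDEX [IX_Example] ON [dbo].[ExampleTable]
REBUILD WITH (ONLINE = ON, MAXDOP = 2, SORT_IN_TEMPDB = ON);
```

The MAXDOP value of 2 is only an illustration; the right cap depends on the server. REORGANIZE can also be stopped partway without losing the work already done, which makes it easier to fit into quiet periods.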
I've got a few VLDBs that we want to make smaller. Since the tables are running on legacy stuff, all of it is basically made with ints and chars and it's horribly inefficient.
The problem that I came across is that when I made a new table with the best data types and copied the data from the old table, the new table was exactly the same size (excluding the index size). It was estimated that a total of ~20 GB would be saved with this change. As it turned out, 0 bytes of data were saved with the data type changes.
Why are the two tables the same size, even though one has much more efficient data types?
If you want more information about the table I'm using:
391 columns. 50,147,035 rows. 65,295.625 MB in size.
I have been using AlwaysOn AG for a long time now and currently have about 10 TB of data across 120 databases and 3 availability groups, covering every application that is on SQL 2012, with great success. Each AG is running on patch level 11.0.5058.0 with 2 synchronous replicas (on different SANs) in the primary data center and 1 async replica in DR. Migration has been a non-issue because none of the databases was too large to fit into my maintenance window, which is 12-4 AM on Saturday morning.
My issue is that my last application to migrate to 2012 includes a 4 TB TDE-encrypted database, which is about 10x larger than any of the previous ones I have migrated. The database takes 4 hours to back up after extensive tuning (I hate TDE!!).
The restore to the primary replica is instant because of seeding incremental but the issue comes from having to backup the database before adding to the availability group. 4 hours is my exact outage window and I can't get any more. My plan to migrate application is to -
First Outage Window
1) Restore database from 2008 to 2012 primary replica
2) Change application A record (or CNAME, not sure which) to primary replica
3) Run database on single node until next outage window
Week Later
1) Add database to availability group
2) Change A record/CNAME to listener
What I don't like about this is that I am going an entire week with 1 node instead of 3, which is worrisome. I would love to hear how to accomplish this, or any comments from people who have worked with VLDBs in availability groups about what they liked or hated about it. I am trying to go all in on this software and have loved it so far, but I am getting worried when it comes to the VLDB migration.
The column I'm adding needs to be part of the clustered PK (it will be the last of three columns) so I need to recreate all the indexes.
My DB is set for FULL recovery mode with ALLOW_SNAPSHOT_ISOLATION ON. I've tried two methods so far.
Method 1:
BEGIN TRANSACTION
CREATE TABLE dbo.Tmp_copyoftablewithnewfield ( ) ON PRIMARY
IF EXISTS(SELECT * FROM dbo.originaltable)
    EXEC('INSERT INTO dbo.Tmp_copyoftablewithnewfield (<original fields>) SELECT <original fields> FROM dbo.originaltable WITH (HOLDLOCK TABLOCKX)')
GO
DROP TABLE dbo.originaltable
GO
EXECUTE sp_rename N'dbo.Tmp_copyoftablewithnewfield', N'originaltable', 'OBJECT'
GO
<recreate PK constraint>
<rebuild indexes>
COMMIT
Pros: Lets me add the new field in the spot I'd like it (not a big deal)
Cons: Tons of wasted space and time. It took about 15 hours.
Method 2:
SET XACT_ABORT ON
GO
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
GO
BEGIN TRANSACTION
<drop PK constraint>
<drop indexes>
ALTER TABLE [dbo].[originaltable] ADD [newfield] [tinyint] NOT NULL CONSTRAINT [DF_originaltable_newfield] DEFAULT ((1))
Pros: No copy of the entire table taking up 200GB more space in the db data file
Cons: My tempdb grew to accommodate the row versioning info for every row in the 200GB table. It took over 30 hours.
A lot of time and disk space is wasted with both.
Since the db is going to be unavailable to users I have some flexibility here. I was considering turning ALLOW_SNAPSHOT_ISOLATION OFF and then trying method 2 again which should stop the versioning in tempdb and then turning it back on.
I was also curious if setting the database recovery mode to SIMPLE would cut down on db log usage and then I could set it back to FULL when done.
Do these really need to be in a transaction? If there's some hardware failure or something unexpected I can just restore from backup and do the conversion again. If the presence of the transaction itself is causing more disk usage for logging or any other slowdown, I think I'd rather do without.
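The settings changes being considered could be wrapped around the conversion roughly as sketched below. The database name and backup path are hypothetical. Two things seem worth keeping in mind: a single open transaction keeps its log records active until it commits whatever the recovery model, so SIMPLE recovery only helps the log if the work is split into separate transactions; and after switching back to FULL, the log backup chain only restarts once a full or differential backup has been taken.

```sql
-- Before the conversion (database offline to users).
-- [MyDb] is a placeholder name.
ALTER DATABASE [MyDb] SET ALLOW_SNAPSHOT_ISOLATION OFF;
ALTER DATABASE [MyDb] SET RECOVERY SIMPLE;

-- ... run the conversion steps here ...

-- After the conversion: restore the original settings.
ALTER DATABASE [MyDb] SET RECOVERY FULL;
ALTER DATABASE [MyDb] SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Restart the log backup chain with a full backup (hypothetical path).
BACKUP DATABASE [MyDb]
TO DISK = N'E:\Backup\MyDb_after_conversion.bak'
WITH INIT, STATS = 10;
```

If READ_COMMITTED_SNAPSHOT is also ON for the database, row versioning in tempdb would continue even with ALLOW_SNAPSHOT_ISOLATION OFF, so that setting is worth checking too.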
Given the amount of time this conversion takes, I wanted to get some feedback other than "just try it" before doing any new tests.
Hi, I'm a Jr DBA and have been given an assignment by my lead to find information on the following. We are to migrate an existing db of size 4TB to a DELL PowerEdge 2950 [Mem: up to 32GB]
OS: Windows Server 2003 Std Edition x64 SP2
DB: SQL Server Enterprise Edition x64
I am to find out how to design the db to provide optimum performance and failover, and to allow for the growth of the db.
1) What would be the recommended RAID settings?
2) Placement of the tempdb?
3) Should we do clustering, and why?
4) What would data partitioning do to help?
5) Any other aspects to be considered for sizing the db?
6) Placement of data files and log file on separate physical disks?
7) Indexing?
I have read many sites. I would appreciate it if someone could write suggestions and opinions based on their current db design spec or previous experience, selecting the best db design points. Thank You.
Hi, I would like to delete data from a 750 million row table in chunks of 10000, without blocking the users. As ours is a 24/7 shop I do not want to block the users for a long time. An answer to this would be highly appreciated. Thanks, Samna
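A minimal sketch of a batched delete, assuming SQL Server 2005 or later (DELETE TOP is not available before that). The table name, the filter column, and the one-second pause are hypothetical. Each DELETE runs as its own transaction, so locks are released between batches.

```sql
DECLARE @batch_size int, @rows int;
SET @batch_size = 10000;   -- chunk size from the question
SET @rows = 1;

WHILE @rows > 0
BEGIN
    -- dbo.BigTable and CreatedDate are placeholders for the real
    -- table and the real condition identifying rows to remove.
    DELETE TOP (@batch_size)
    FROM dbo.BigTable
    WHERE CreatedDate < '20080101';

    SET @rows = @@ROWCOUNT;   -- 0 when nothing is left to delete

    -- Short pause so other sessions get a turn between batches.
    WAITFOR DELAY '00:00:01';
END;
```

SQL Server attempts lock escalation to a table lock at roughly 5,000 locks per statement, so if users are still being blocked, a smaller batch size may help. An index that supports the WHERE condition keeps each batch from scanning the whole table.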
Hello,
I need to create a database to hold document information.
1. Basically, I need the following information for each document: Title, Description, LastUpdated, Category, Type, Url. Should I create tables for Category and Type, and link them to my documents table? What type of relationship should I use?
2. I also need to know how many downloads each document had. Should I add a column in my documents table, which I would then increase one by one? Or should I create a new table which would register each download?
3. I need to let users rate each document from 1 to 5. How should I implement this?
Thank You Very Much,
Miguel
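One way the three requirements could be laid out, as a sketch rather than a definitive design. All table and column names beyond those listed in the question are hypothetical, as are the column lengths. Category and Type become lookup tables with one-to-many relationships to the documents table; downloads and ratings each get their own table.

```sql
CREATE TABLE dbo.Category (
    CategoryId int IDENTITY(1,1) PRIMARY KEY,
    Name       nvarchar(100) NOT NULL UNIQUE
);

CREATE TABLE dbo.DocumentType (
    DocumentTypeId int IDENTITY(1,1) PRIMARY KEY,
    Name           nvarchar(100) NOT NULL UNIQUE
);

CREATE TABLE dbo.Document (
    DocumentId     int IDENTITY(1,1) PRIMARY KEY,
    Title          nvarchar(200)  NOT NULL,
    Description    nvarchar(1000) NULL,
    LastUpdated    datetime       NOT NULL DEFAULT (GETDATE()),
    CategoryId     int NOT NULL REFERENCES dbo.Category (CategoryId),
    DocumentTypeId int NOT NULL REFERENCES dbo.DocumentType (DocumentTypeId),
    Url            nvarchar(400)  NOT NULL
);

-- One row per download: keeps the history and avoids update contention
-- on a single counter column.
CREATE TABLE dbo.DocumentDownload (
    DownloadId   int IDENTITY(1,1) PRIMARY KEY,
    DocumentId   int NOT NULL REFERENCES dbo.Document (DocumentId),
    DownloadedAt datetime NOT NULL DEFAULT (GETDATE())
);

-- One rating per user per document, restricted to 1 through 5.
-- UserId assumes some users table exists elsewhere.
CREATE TABLE dbo.DocumentRating (
    DocumentId int     NOT NULL REFERENCES dbo.Document (DocumentId),
    UserId     int     NOT NULL,
    Rating     tinyint NOT NULL CHECK (Rating BETWEEN 1 AND 5),
    PRIMARY KEY (DocumentId, UserId)
);

-- Download count and average rating per document.
SELECT d.DocumentId,
       d.Title,
       (SELECT COUNT(*)
        FROM dbo.DocumentDownload AS dl
        WHERE dl.DocumentId = d.DocumentId) AS Downloads,
       (SELECT AVG(CAST(r.Rating AS decimal(3,2)))
        FROM dbo.DocumentRating AS r
        WHERE r.DocumentId = d.DocumentId) AS AvgRating
FROM dbo.Document AS d;
```

A plain counter column on the documents table would also work if only the total is ever needed; the separate table is the choice to make when download dates or per-user history matter.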
I'd like to create an XML document from within SQL 7.0. Is this doable? I know it's built into SQL 2K. But how is it done (or can it be done) in SQL 7.0?
Yesterday I installed MS SQL 2000 for the first time and have no idea what I'm doing.
I have been sent a database and asked to convert this to MS Access, for most of the data that is ok and I have already managed to do this. My problem is that the database contains MS Word documents stored in some of the tables (field type - image). I need to extract these from the database and get them back to individual Word files, ideally with a file name that relates them to the primary key of the table from which they came.
I have less than 24 hours' experience with SQL Server and would be very grateful if anyone can explain how I can do this.
This may be a stupid question but I'll throw it out here, is it possible to use sql 2005 to split up pdf files into individual files by a field on the form or an index?
We have a document library that we display in Report Manager. When we open a PDF document there is a print icon, but when we open an Excel document or a Word document, there is no ability to print. The user must save the document locally and then reopen it to print from Word or Excel. Is there a setting somewhere that can be set to enable printing on the Excel and Word docs?
Hi, I have a requirement to add attachments, so I am saving the documents in SQL Server. Now I am facing a problem with opening the documents. Can anybody suggest how to open documents which are stored in SQL Server?
I need help with storing documents in SQL Server. Is it possible to store Word documents directly in SQL Server? If yes, which data type supports this kind of storage? How do I read the data, store it, and render it (using both ADO and plain T-SQL)?
I was wondering if I can save documents, e.g. PDF, Word, Excel, or any other format, in SQL 2005. If yes, what datatype should I use and what would be the best way to go about it?
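In SQL Server 2005 the usual type for this is varbinary(max), which stores the raw bytes of any file format. A minimal sketch follows; the table name, column names, and file path are hypothetical, and the path is read by the SQL Server service on the server, not on the client machine.

```sql
CREATE TABLE dbo.StoredDocument (
    DocumentId    int IDENTITY(1,1) PRIMARY KEY,
    FileName      nvarchar(260)  NOT NULL,
    FileExtension varchar(10)    NOT NULL,
    Content       varbinary(max) NOT NULL   -- raw file bytes
);

-- Load one file from disk as a single binary value.
-- OPENROWSET(BULK ..., SINGLE_BLOB) exposes it as a column named BulkColumn.
INSERT INTO dbo.StoredDocument (FileName, FileExtension, Content)
SELECT N'report.pdf', '.pdf', src.BulkColumn
FROM OPENROWSET(BULK N'C:\Docs\report.pdf', SINGLE_BLOB) AS src;

-- Check what was stored.
SELECT DocumentId, FileName, DATALENGTH(Content) AS size_bytes
FROM dbo.StoredDocument;
```

From an application, the same column would normally be written and read as a byte array parameter rather than through OPENROWSET.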
Hi all, I need to migrate some documents (GBs in size) from FTP Server1, with metadata in SQLdbA, to FTP Server2, with metadata in SQLdbB. How can we achieve this?
I am new to this concept and was told that we can use the FTP task, but I can't work out how to proceed. Please help me.
Is it possible to have a word document as a datatype? I am attempting to create a searchable SQL 7 database of approximately 5000 resumes, adding anywhere from 10-100 every day (we are a recruiting/consulting firm).
I know index server is an easier way to do this, but my managers are against it for unknown reasons.
I am developing a resume storage system, and don't know the best way to store the resumes that come in to our company as both MS Word and text files. Should I store the files in the original format they come in, and reference the file name in the database that points to a directory where they are stored, or should I store the text of the resumes directly in the database? There are 2 requirements that I must meet.
1: I need the documents to keep their formatting.
2: I also need the capability to run a full-text search to pull key words out of the documents.
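Both requirements can be met by keeping the original file bytes in the database and letting full-text search index them through a type column. The sketch below assumes SQL Server 2005 or later (the post does not say which version is in use); the table, column, catalog, and constraint names are hypothetical, and the search terms are only an illustration.

```sql
CREATE TABLE dbo.Resume (
    ResumeId      int IDENTITY(1,1) NOT NULL
                  CONSTRAINT PK_Resume PRIMARY KEY,
    FileExtension varchar(10)    NOT NULL,  -- e.g. '.doc' or '.txt'
    Content       varbinary(max) NOT NULL   -- original file, formatting intact
);

CREATE FULLTEXT CATALOG ResumeCatalog;

-- TYPE COLUMN tells full-text search which filter to use for each row,
-- based on the extension stored in FileExtension.
CREATE FULLTEXT INDEX ON dbo.Resume (Content TYPE COLUMN FileExtension)
    KEY INDEX PK_Resume
    ON ResumeCatalog
    WITH CHANGE_TRACKING AUTO;

-- Find resumes that mention both phrases.
SELECT ResumeId
FROM dbo.Resume
WHERE CONTAINS(Content, N'"SQL Server" AND "project manager"');
```

Full-text indexing is asynchronous, so newly inserted resumes become searchable a short time after they are added.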
Brief overview. Got 2 tables, client table and document table. Both tables have client name as the primary key. Client table shows client info, address, phone, dob. Document table shows client name, document, document type. I need to write a query that will count how many documents are in the table for each name.
This is my attempt at it; please let me know what's wrong. Thanks.
SELECT count [client table].client name as cli_name, count ([document table].name as doc_qty)
FROM [client table] INNER JOIN [document table] ON [client table].id = [document table].ID
GROUP BY [client table].name
ORDER BY [client table].name
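A few things stand out in the attempt: the first `count` has no parentheses and does not look intended, `client name` needs square brackets because of the space, the `doc_qty` alias sits inside the COUNT parentheses, and the join uses `id` although both tables are described as keyed by client name. A corrected sketch, assuming the column is called `[client name]` in both tables:

```sql
SELECT c.[client name]        AS cli_name,
       COUNT(d.[client name]) AS doc_qty   -- counts matching document rows only
FROM [client table] AS c
LEFT JOIN [document table] AS d
       ON d.[client name] = c.[client name]
GROUP BY c.[client name]
ORDER BY c.[client name];
```

The LEFT JOIN keeps clients that have no documents, with a count of 0; an INNER JOIN would leave them out. Note that if client name really is the primary key of the document table, each client can only have one document row, so the key of that table may need another look.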
I am still finding my feet with SQL Server. I want to allow users to upload different versions of documents. However, I want them to be able to access the previous versions if they need to. Below is some info (there are other fields, but they are not necessary here) from a table 'file_resources':
The problem I have is that I want to display the latest version details by default but have a link to previous versions of the document. So, when I run the following sql:
SELECT file_id, version FROM file_resources ORDER BY file_id DESC, version DESC
it returns:
file_id  version
406      4
405      3
404      2
403      1
402      1
But what I need to get is the 2 unique documents (The latest file_id where original_file_id is duplicated): 406 402
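One way to get only the latest row per document, as a sketch. It assumes `original_file_id` is a column on `file_resources` that holds the same value for every version of a document (the sample values for that column are not shown in the post), and that the highest `version` is the latest.

```sql
SELECT fr.file_id, fr.version
FROM file_resources AS fr
JOIN (
    -- Highest version number for each document.
    SELECT original_file_id, MAX(version) AS max_version
    FROM file_resources
    GROUP BY original_file_id
) AS latest
  ON latest.original_file_id = fr.original_file_id
 AND latest.max_version      = fr.version
ORDER BY fr.file_id DESC;
```

If rows 403 to 406 share one `original_file_id` and 402 has its own, this would return 406 and 402. The previous versions for the link can then be fetched with a simple filter on `original_file_id`.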
I need to build a query that returns only documents where the field "u_DIM4" has more than one distinct value for the same document. Here is my script, just as an example:
Select docnome [documentname], adoc [docnr], count(*) [countAlldifferentbyDoc], u_dim4
from fn
where u_dim4 <> '' and data between '2015-01-01' and '2015-07-31' AND adoc = '02634'
Group by docnome, adoc, u_dim4
ORDER BY 2 asc
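A possible variation on the script above: grouping by `u_dim4` puts each value in its own group, so the count of different values per document is never visible. Dropping `u_dim4` from the GROUP BY and filtering with HAVING on a distinct count is one way around that. The sketch reuses the table and column names from the script; the single-document filter on `adoc` is left out so that all documents are checked.

```sql
SELECT docnome                AS [documentname],
       adoc                   AS [docnr],
       COUNT(DISTINCT u_dim4) AS [distinct_dim4_values]
FROM fn
WHERE u_dim4 <> ''
  AND data BETWEEN '2015-01-01' AND '2015-07-31'
GROUP BY docnome, adoc
HAVING COUNT(DISTINCT u_dim4) > 1   -- keep only documents with 2+ different values
ORDER BY adoc ASC;
```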
I am creating a document management system using ASP. I have been researching the different ways of handling the documents, such as using the file system and storing the path in the db, and actually storing the document in the db. I like the idea of storing it in the database much better because I can allow users to manage documents themselves (I already have the code in place to do it if I decide), having a central system with the ability to add my own document properties by adding fields to the table, security, and backups. I have found that most think it is better to store the path due to performance issues and the rate the db can grow. I have looked at our current system in Access and we have a total of 4400 documents (of which probably 25% are in the database but don't actually exist anymore in the file system, one hangup about the file system) since 1988. This comes to about 300 documents added each year. The other thing is the issue with the size of the db. I don't see a whole lot of difference with this issue because it is going to take up space in your file system too, although the file system may be more efficient at storing them. I would say that 95% of our docs are under 1 MB in size and done in MS Word.
The last thing is using full-text search capabilities in SQL Server. I need to be able to search the contents of the field.
Are there other issues around storing documents in the db to consider besides the above?
I was wondering if someone was able to comment on something that I've encountered using SSRS.
I have set some items with the ToggleItem property and viewing the report, the [+] and [-] buttons show. However when the report is exported (to pdf, mhtml or even tif), the icons are no longer there.