Design Question For Large Table
Sep 20, 2007
Hi,
What's the most efficient way to store the following information:
* Table contains 1 million listings
* Each listing can be geo-targeted to any of the 200+ countries
* Searches return listings based on geo-location
Storage options:
Option #1 (normalized)
* ListingsTable (PK listingID int) [1 million rows]
* ListingGeoLocations (listingID, geoLocationID) [could be up to 200 million rows]
Option #2 (denormalized)
* ListingsTable (PK listingID int, binary(32) with bit-mask consisting of 200 bits one for each location)
Does anyone have experience with similar structures? Which option is more efficient?
Thanks,
Av
Sep 25, 2007
Hi,
What's the most efficient way to store the following information:
* Table contains 1 million listings
* Each listing can be geo-targeted to any of the 200+ countries
* Searches return listings based on location
Storage options:
Option #1 (normalized)
* Listings (PK listingID int) [1 million rows]
* ListingLocations (listingID, locationID) [could be up to 200 million rows]
Option #2 (denormalized)
* Listings (PK listingID int, binary(32) with bit-mask consisting of 200 bits one for each location)
Usage: Usually the query will simply look up listings based on some keywords and get back 50-200 listings. The application (C#) will then filter those listings by location.
Does anyone have experience with similar structures? Which option is more efficient?
I know that using the intersection table in Option #1 is the "proper" relational-DB way of doing things. However, I do not like the idea of storing the listingID so many times (once for each locationID).
Thanks,
Av
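A minimal T-SQL sketch of the two options side by side may make the comparison easier. The column names (geoMask, locationID), the zero-based numbering of locations, and the #matches temp table standing in for the 50-200 keyword hits are all assumptions for illustration; none of them come from the post.

```sql
-- Hypothetical schema: both options shown on one Listings table for comparison.
CREATE TABLE dbo.Listings (
    listingID int        NOT NULL PRIMARY KEY,
    geoMask   binary(32) NULL   -- Option #2: 256 bits, one per location (assumed layout)
);

-- Option #1 (normalized): one row per (listing, location) pair.
CREATE TABLE dbo.ListingLocations (
    listingID  int      NOT NULL REFERENCES dbo.Listings (listingID),
    locationID smallint NOT NULL,
    CONSTRAINT PK_ListingLocations PRIMARY KEY CLUSTERED (listingID, locationID)
);

-- Stand-in for the 50-200 listings returned by the keyword search.
CREATE TABLE #matches (listingID int NOT NULL PRIMARY KEY);

DECLARE @locationID smallint;
SET @locationID = 42;

-- Option #1: filter the keyword hits through the intersection table.
SELECT m.listingID
FROM #matches AS m
JOIN dbo.ListingLocations AS ll
  ON ll.listingID = m.listingID
 AND ll.locationID = @locationID;

-- Option #2: test bit (@locationID % 8) of byte (@locationID / 8 + 1) in the mask.
SELECT l.listingID
FROM #matches AS m
JOIN dbo.Listings AS l
  ON l.listingID = m.listingID
WHERE CAST(SUBSTRING(l.geoMask, @locationID / 8 + 1, 1) AS tinyint)
      & POWER(2, @locationID % 8) <> 0;
```

With only 50-200 candidate rows per search, either filter touches very little data. The bitmask cannot be indexed by location, though, so Option #2 only fits if searches never start from the location.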
Jan 4, 2006
I have a general SQL design-type question.
I want to log errors to a table. If the error is with a URL, I want to store the URL. These URLs can be very large, hundreds of characters, but I only need to store it if it causes the error, which should be very infrequent. Which is the better design:
1. Create a large varchar field in the log table to hold the URL, or null if the error wasn't with the URL.
2. Create a foreign key field in the log table to a second URL table, which has a unique ID and a large varchar, and only create a record in this table if the error is with the URL.
One concern I have with design 2 is that there could be many other fields that are infrequent. Do I create a separate table for every one?
Richard
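Design 1 can be sketched as below. The table and column names and the varchar sizes are hypothetical. The sketch relies on varchar being a variable-length type, so a NULL URL should cost little or no data space in the row.

```sql
-- Hypothetical log table for design 1: the URL lives in the log row itself.
CREATE TABLE dbo.ErrorLog (
    errorLogID   int IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    loggedAt     datetime      NOT NULL DEFAULT (GETDATE()),
    errorMessage varchar(500)  NOT NULL,
    url          varchar(2000) NULL   -- filled only when the URL caused the error
);

-- An ordinary error: no URL stored.
INSERT INTO dbo.ErrorLog (errorMessage, url)
VALUES ('Timeout calling a downstream service', NULL);

-- A URL-related error: the offending URL is kept.
INSERT INTO dbo.ErrorLog (errorMessage, url)
VALUES ('Malformed URL', 'http://example.com/some/very/long/path?x=1');

-- Only the URL-related errors.
SELECT errorLogID, loggedAt, url
FROM dbo.ErrorLog
WHERE url IS NOT NULL;
```

Under that assumption, the same approach covers the other infrequent fields without creating a separate table for each one.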
Jul 6, 2006
Hello all,
I have recently been tasked with rewriting a database that holds large volumes of data, whilst ensuring that queries can be run in optimal time. Having never really delved into this sort of thing before, I hoped you guys might be able to offer some advice and guidance.
The design I have inherited is based around 2 main tables:
[captured_traps]
[id] [int] IDENTITY (1, 1) NOT NULL
[snmp_version] [int] NULL
[community_name] [varchar] (255)
[packet_type] [varchar] (50)
[oid] [varchar] (500)
[source_ip] [varchar] (15)
[generic] [int] NULL
[specific] [int] NULL
[time_stamp] [varchar] (15)
[trap_entered] [datetime] NULL
[status] [int] NULL
[captured_varbinds]
[id] [int] IDENTITY (1, 1) NOT NULL
[captured_trap_id] [int] NOT NULL
[varbind_oid] [varchar] (500)
[varbind_text] [varchar] (500)
The relationship between the two tables is on the "captured_traps (id)" to "captured_varbinds (captured_trap_id)". Currently the "captured_traps" table contains around 350 million rows, the "captured_varbinds" table contains around 900 million rows.
Now as you can probably gather this model runs like a....well it sort of hobbles more than runs hence the need to redesign.
My current thoughts on this are:
- Normalising all varchars - there are a lot of duplicate values in most of the varchar fields.
- Full Text Indexing
However beyond that I am not sure which route to go down. After googling for most of today I have come across a number of "solutions" however I do not want to go steaming down the track of one of these to discover that it is fatally flawed somewhere.
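The "normalising all varchars" idea could look roughly like this for one column. It is only a sketch: the lookup table dbo.oids and the oid_id column are hypothetical names, and on a 350-million-row table the backfill would realistically need to run in batches rather than as the single UPDATE shown here.

```sql
-- Hypothetical lookup table holding each distinct OID string once.
CREATE TABLE dbo.oids (
    oid_id int IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    oid    varchar(500)       NOT NULL UNIQUE
);

-- Populate it from the existing data.
INSERT INTO dbo.oids (oid)
SELECT DISTINCT oid
FROM dbo.captured_traps
WHERE oid IS NOT NULL;

-- Add a narrow surrogate column to the big table.
ALTER TABLE dbo.captured_traps ADD oid_id int NULL;
GO

-- Backfill the surrogate key (batch this on a table of the size described).
UPDATE t
SET t.oid_id = o.oid_id
FROM dbo.captured_traps AS t
JOIN dbo.oids AS o
  ON o.oid = t.oid;
```

Once every row has been backfilled and verified, the wide varchar column can be dropped, leaving a 4-byte integer in its place.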
Aug 18, 2015
I would like to create a table called product. My objective is to show the list of packages available for each product in a data grid view column when a product is selected. Each product may have different package types (e.g. Nos, CTN, OTR). Some products may have two package types, some three, and so on. The quantity in each package may also differ (e.g. a CTN may contain 12 Nos for one product and 8 Nos for another). The price of each package is also different and needs to be shown. How should I design the table?
Product name | Packages | Price | Remarks (for your reference)
Nestle milk | CTN, OTR, Nos | 50, 20, 5 | CTN = 10 Nos, OTR = 4 Nos
Rainbow milk | CTN, Nos | 40, 6 | CTN = 8 Nos
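One possible shape for this, as a sketch only: a product table plus one row per product and package type. All table and column names are hypothetical, and treating "Nos" as a package of 1 unit is an assumption made so that every package type fits the same table.

```sql
-- Hypothetical design: one row per product, one row per (product, package type).
CREATE TABLE dbo.Product (
    productID   int          NOT NULL PRIMARY KEY,
    productName varchar(100) NOT NULL
);

CREATE TABLE dbo.ProductPackage (
    productID    int            NOT NULL REFERENCES dbo.Product (productID),
    packageType  varchar(10)    NOT NULL,  -- 'CTN', 'OTR', 'Nos'
    unitsPerPack int            NOT NULL,  -- how many Nos the package contains
    price        decimal(10, 2) NOT NULL,
    CONSTRAINT PK_ProductPackage PRIMARY KEY (productID, packageType)
);

-- Sample data taken from the table in the question.
INSERT INTO dbo.Product (productID, productName)
VALUES (1, 'Nestle milk'), (2, 'Rainbow milk');

INSERT INTO dbo.ProductPackage (productID, packageType, unitsPerPack, price)
VALUES (1, 'CTN', 10, 50), (1, 'OTR', 4, 20), (1, 'Nos', 1, 5),
       (2, 'CTN', 8, 40),  (2, 'Nos', 1, 6);

-- Packages to list in the grid column for the selected product.
SELECT packageType, unitsPerPack, price
FROM dbo.ProductPackage
WHERE productID = 1;
```

The last query returns one row per package, so the grid column can be bound to it directly for whichever product is selected.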
Jun 25, 2007
Hi, all experts here,
I am wondering whether tempdb stores all results temporarily whenever I query a large fact table (over 4 million records) joined to a dimension table. Each time I run the query, tempdb grows to nearly 1 GB, which uses up almost all the space on my local system drive, and performance collapses as a result. Is there any way to fix this problem? Thanks a lot in advance; I look forward to hearing your advice.
With best regards,
Yours sincerely,
Aug 27, 2007
Hi everyone,
I use SQL Server 2005. What is the best practice for dealing with a large table (more than a million rows)? Table partitioning, views, or something else?
Can you please give some suggestions? It would be very helpful if you could post some references or examples.
Thank you!
May 6, 2015
We need to insert into and update a fact table from a staging table. Currently we use a stored procedure that updates the fact table one region at a time: a WHILE loop iterates over the regions to be updated and runs a MERGE statement (insert and update) for a single region on each pass. The job is scheduled to run every 5 minutes, but the insert and update from staging to fact take too long.
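For comparison with the per-region loop, here is a sketch of a single set-based MERGE that handles every region in one statement. The tables and columns (FactSales, StagingSales, regionID, saleID, amount) are hypothetical, since the post does not show the real schema.

```sql
-- Hypothetical fact and staging tables keyed by region and a business key.
CREATE TABLE dbo.FactSales (
    regionID int            NOT NULL,
    saleID   int            NOT NULL,
    amount   decimal(18, 2) NOT NULL,
    CONSTRAINT PK_FactSales PRIMARY KEY (regionID, saleID)
);

CREATE TABLE dbo.StagingSales (
    regionID int            NOT NULL,
    saleID   int            NOT NULL,
    amount   decimal(18, 2) NOT NULL,
    CONSTRAINT PK_StagingSales PRIMARY KEY (regionID, saleID)
);

-- One pass over staging for all regions, instead of one MERGE per region.
MERGE dbo.FactSales AS tgt
USING dbo.StagingSales AS src
   ON tgt.regionID = src.regionID
  AND tgt.saleID   = src.saleID
WHEN MATCHED AND tgt.amount <> src.amount THEN
    UPDATE SET amount = src.amount          -- touch only rows that changed
WHEN NOT MATCHED BY TARGET THEN
    INSERT (regionID, saleID, amount)
    VALUES (src.regionID, src.saleID, src.amount);
```

Whether this beats the loop depends on indexing and data volume per run, so it is worth timing both against a copy of the real tables.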
Oct 7, 2015
I have a table partitioning requirement. We have 10 years of data in a table with more than 30 billion rows on a SQL Server 2005 server, which we are upgrading to 2014. We have to keep 7 years of data. The table has no keys and no date column. Because of the huge amount of data and the many users, processing is slow. We are thinking of partitioning the 7 years of data by quarter, but as I said there is no date column on the table; we have to use a reference table to get the date. Is there a way to do the partitioning without adding a date column to the table? Also, will partitioning make queries faster?
I have thought of three ways to do it:
1. Leave it as it is.
2. 7 years of partitions on one server.
3. 3 years of partitions on server1 and 4 years of partitions on server2 (is a snapshot better for the 4 years?)
Nov 16, 2007
I am developing an application that has a table with lots of records (network traffic), but the data is summarized every so often to create summary records (old records are deleted). The problem is that I have a PK based on an autoincrement ID (int) that will run out of numbers. However, this ID is not referenced anywhere (it is not a foreign key from another table, it is not used for deletion, and there are no updates to this table whatsoever).
So my possibilities are:
1.- reseed the id when it is about to run out.
2.- make the id bigint
3.- remove the id and change the PK to 2 other fields
4.- remove the id and have no PK
I am leaning toward option 4, because I do not see the need for a PK, but I understand that it is quite out of the ordinary. So I would like to hear from other people (I do not have much experience with DBs).
I also like option 3. I already have an index on one of the other fields (time).
Any input will be appreciated.
Claudio Robles
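Options 1 and 2 can be sketched in T-SQL as follows. The table name dbo.TrafficLog, the column name id, and the constraint name PK_TrafficLog are hypothetical. The bigint change rewrites every row, so it is not a quick operation on a big table.

```sql
-- How close is the identity to the int ceiling (2,147,483,647)?
SELECT IDENT_CURRENT('dbo.TrafficLog')              AS current_identity,
       2147483647 - IDENT_CURRENT('dbo.TrafficLog') AS values_remaining;

-- Option 1: restart the numbering. Only safe if the old low id values
-- have already been deleted, otherwise the PK will reject duplicates.
DBCC CHECKIDENT ('dbo.TrafficLog', RESEED, 0);

-- Option 2: widen the column. The PK has to be dropped first because a
-- column that is part of a key cannot have its type altered.
ALTER TABLE dbo.TrafficLog DROP CONSTRAINT PK_TrafficLog;
ALTER TABLE dbo.TrafficLog ALTER COLUMN id bigint NOT NULL;
ALTER TABLE dbo.TrafficLog ADD CONSTRAINT PK_TrafficLog PRIMARY KEY CLUSTERED (id);
```

The first query gives a quick way to estimate how urgent the problem is before choosing among the four options.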
Jul 23, 2005
If I use BCP to export a very large table, will that table be blocked for writes during the export process? I don't want to prevent users from accessing that table during the bcp process.
Thank you, TFD.
May 26, 2008
Hi,
I have this page that uploads PDFs to a table. In principle this works fine.
Until I try to upload large files (3 to 4 MB), that is; I need to upload even larger files than that (I don't really know yet what users are going to come up with). I get timeout problems. Some people say it is not possible to exceed a limit of about 4 MB, but that there is a workaround involving a change to the web.config file. Can somebody give me info about that? (I am quite a novice, really.) I tried to change it like this, but to no avail:
<system.web>
  <httpRuntime maxRequestLength="102400" enable="True" requestLengthDiskThreshold="102400" useFullyQualifiedRedirectUrl="True" executionTimeout="102400"/>
</system.web>
Thanks for any help!
Mar 3, 2004
I have a table of approx 1/2 million rows.
On a nightly basis, this table gets rebuilt in a temporary database. Once the table has been built and scrubbed, I need to move it into our web server's db.
I'd like to do this with minimal interruption to the website.
Possible techniques:
1) I could set up a DTS package to copy the table object overwriting the destination table
2) I could export to a flat file and then bulk import into the live table (after truncating it)
3) I could run a process to update smaller chunks of data at a time running delete queries and insert queries.
Anybody have a thought on the best way to do this so that the web users would be virtually unaware that anything was happening ?
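A fourth technique, not in the list above, is to load the rebuilt data into a side table in the web server's database and then swap names. The sketch below uses hypothetical table names (Catalog, Catalog_new, Catalog_old) and assumes the new table has already been loaded and indexed.

```sql
-- Swap the freshly loaded table in for the live one.
-- Both renames happen inside one transaction, so web queries see either
-- the old table or the new one, never a missing table.
BEGIN TRANSACTION;
    EXEC sp_rename 'dbo.Catalog',     'Catalog_old';
    EXEC sp_rename 'dbo.Catalog_new', 'Catalog';
COMMIT TRANSACTION;

-- Once nothing is reading the old copy any more, remove it.
DROP TABLE dbo.Catalog_old;
```

The slow part (copying half a million rows) happens while users are still reading the old table; only the rename itself needs a brief exclusive lock.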
Mar 1, 2002
Hi,
I am absolutely innocent as far as T-SQL is concerned. I need to detect all duplicates (key consists of 5 fields) in the table and delete the duplicates.
I tried different approaches like joins etc but nope.
Any help is appreciated
Thanks
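A sketch of both steps, using a hypothetical table dbo.Orders_dup whose five key columns are named k1 to k5. The detection query works on any SQL Server version. The delete uses ROW_NUMBER(), which assumes SQL Server 2005 or later and would need a different approach on older versions.

```sql
-- Hypothetical table with a 5-column logical key and no unique constraint.
CREATE TABLE dbo.Orders_dup (
    k1 int, k2 int, k3 int, k4 int, k5 int,
    payload varchar(50)
);

-- Step 1: list every key that occurs more than once.
SELECT k1, k2, k3, k4, k5, COUNT(*) AS copies
FROM dbo.Orders_dup
GROUP BY k1, k2, k3, k4, k5
HAVING COUNT(*) > 1;

-- Step 2 (SQL Server 2005+): number the rows within each key
-- and delete everything except the first row of each group.
WITH ranked AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY k1, k2, k3, k4, k5
                              ORDER BY k1) AS rn
    FROM dbo.Orders_dup
)
DELETE FROM ranked
WHERE rn > 1;
```

Which copy survives is arbitrary here; if one copy should be preferred, the ORDER BY inside ROW_NUMBER() is the place to say so.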
Aug 6, 2004
OK, I imported 680 million records into an unindexed table. That went well.
Then, I went into Enterprise Manager and added a two column non-unique clustered index to that table to speed access.
It's been running for ~36 hours and I have no idea when it will complete. I have deadlines that I'm going to miss and am very nervous; what can I do?
SQL Server 2000 Enterprise Edition (8.00.818 - sp3 + hotfixes)
Dual 3Ghz Xeon (two physical CPUs each have HyperThreading enabled)
Windows 2000 SP4
4GB RAM (although I just noticed the 3GB OS switch wasn't on)
SCSI boot drive
tempdb, data, and transaction log are on a FibreChannel RAID SAN
Help! Thanks in advance!
Nov 14, 2007
Hi folks! I'm looking for advice on partitioning a large table. In the DDL below I've changed names to protect the guilty.
My table has this schema:
CREATE TABLE [dbo].[BigTable]
(
[TimeKey] [int] NOT NULL,
[SegmentID] [int] NOT NULL,
[MyVal] [tinyint] NOT NULL
) ON [BigTablePS1] (TimeKey) -- see below for partition scheme
alter table [dbo].[BigTable] add constraint [PK_BigTable]
primary key (timekey asc, SegmentID asc)
-- will evaluate whether this one is needed, my thinking is yes
-- based on the expected select queries.
create index NCI_SegmentID on BigTable(SegmentID asc)
The TimeKey column is sort of like a unix time. It's the number of minutes since 2001/01/01, but always floored to a 5 minute boundary. so only multiples of 5 are allowed.
Now, this table will be rather big. There are about 20k possible SegmentIDs. For every TimeKey from 2008/01/01 to 2009/01/01 (12 months), I'll have on the order of 20000 rows, one for each SegmentID.
For the 12 month period, there are 365*24*60/5=105120 possible TimeKey values. So the total rowcount is over 2 billion. (20k * 105120)
Select queries are expected to be something like this:
-- fetch just one particular row...
select MyVal from BigTable
where TimeKey=5555 and SegmentID=234234
--fetch for a certain set of SegmentID and a particular time...
select
b.SegmentID
,b.MyVal
from BigTable b
join OtherTable t on t.SegmentID=b.SegmentID
where b.TimeKey=5555
and t.SomeColumn='SomeValue'
Besides selects, also I need to be able to efficiently issue update statements against the table with new values in the MyVal column based on a range of TimeKey values (a contiguous span of a few days) and sets of about 1000 SegmentID. updates would always look like this:
update t
set t.MyVal=p.MyVal
from BigTable t
join #myTempTable p on t.TimeKey=p.TimeKey
and t.SegmentId=p.SegmentId
where #myTempTable would have on the order of 1000*24*60 rows in it, all with contiguous TimeKey values, and about 1000 different SegmentID values. #myTempTable also has a clustered pk on (timekey asc, SegmentId asc).
After the table is loaded, it would never get any inserts or deletes. only selects and updates.
Given the size, and the nature of the select and update queries, this table seems like a good candidate for partitioning. I'm thinking it makes sense to partition on TimeKey.
So my question is, is it stupid to create a separate partition for each day in the year long span of TimeKeys this table covers? That would mean 365 partitions in the partition function and partition scheme. Something like this:
CREATE PARTITION FUNCTION [BigTableRangePF1] (int)
AS RANGE LEFT FOR VALUES
(
3680640 + 0*1440, -- 3680640 is the number of minutes between 2001/01/01 and 2008/01/01
3680640 + 1*1440,
3680640 + 2*1440,
3680640 + 3*1440,
...snip...
3680640 + 363*1440,
3680640 + 364*1440,
3680640 + 365*1440
);
GO
CREATE PARTITION SCHEME [BigTablePS1]
AS PARTITION [BigTableRangePF1]
TO
(
[PRIMARY],[PRIMARY],[PRIMARY],
...snip...
[PRIMARY],[PRIMARY],[PRIMARY]
);
GO
Does anyone have any experience with partitioned tables with so many partitions? Is a few hundred partitions too many? From my understanding of partitions, it seems like having so many will be OK. Is it somehow worse than having hundreds of tables in a database?
Even with one partition for each day, I'll still have 24*60*20000/5 ~ 5m rows in each one.
5m seems like a manageable number. 2b does not.
elsasoft.org
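One detail of the partition function above is worth checking: with RANGE LEFT, each boundary value belongs to the partition on its left, so the midnight row of a day lands with the previous day's rows. The sketch below uses two small demo functions (hypothetical names, three boundaries only) and $PARTITION to show the difference from RANGE RIGHT. If this is SQL Server 2005, the documented limit is 1,000 partitions per table, so a few hundred daily partitions is within bounds.

```sql
-- The base value used in the post: minutes from 2001/01/01 to 2008/01/01.
SELECT DATEDIFF(minute, '20010101', '20080101') AS base_minutes;  -- 3680640

-- Same three daily boundaries, once as RANGE LEFT and once as RANGE RIGHT.
CREATE PARTITION FUNCTION DemoLeftPF (int)
AS RANGE LEFT FOR VALUES (3680640, 3682080, 3683520);

CREATE PARTITION FUNCTION DemoRightPF (int)
AS RANGE RIGHT FOR VALUES (3680640, 3682080, 3683520);

-- Which partition does each TimeKey fall into?
SELECT v.TimeKey,
       $PARTITION.DemoLeftPF(v.TimeKey)  AS left_partition,
       $PARTITION.DemoRightPF(v.TimeKey) AS right_partition
FROM (SELECT 3680640 AS TimeKey      -- 2008/01/01 00:00
      UNION ALL SELECT 3680645       -- 2008/01/01 00:05
      UNION ALL SELECT 3682075       -- 2008/01/01 23:55
      UNION ALL SELECT 3682080) AS v -- 2008/01/02 00:00
ORDER BY v.TimeKey;
-- RANGE LEFT:  1, 2, 2, 2  (midnight of Jan 2 shares a partition with Jan 1)
-- RANGE RIGHT: 2, 2, 2, 3  (each partition holds 00:00 through 23:55 of one day)
```

If a partition per calendar day is the goal, RANGE RIGHT with the same boundary values gives the cleaner split.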
Jul 23, 2005
Greetings All, I was wondering what would happen if I were to do a "select * from table" on a table that has about 5 million rows. Would my read block other writers to the same table? Would it block other readers? I know SQL uses optimistic locking by default but I am not sure what this means to other users trying to access the same table. Any advice would be greatly appreciated.
TFD
Jul 20, 2005
Quick question: Does SQL do table/schema changes "in place"?
I've got a large table (140+ million rows of very wide data) that we want to change the schema on, basically to remove a number of data elements that we don't use.
Anyway, does anyone know if SQL will do an in-place change, or if it will copy the table to a new table, thereby increasing my space allocation needs? I'd effectively, temporarily, need space for two tables while the change is happening if it copies the table first. This is not good as I do not have enough available space at the moment.
If you've got pointers to specific MS docs regarding this issue, please let me have 'em.
Thanks in advance.
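As a sketch under stated assumptions: dropping columns with ALTER TABLE is, as far as I know, a metadata change that does not copy the table, but the freed space is not returned immediately. DBCC CLEANTABLE reclaims space left by dropped variable-length columns; fixed-length columns need an index rebuild, which does require extra space. The database, table, and column names below are hypothetical.

```sql
-- Remove the unused columns; the rows themselves are not rewritten.
ALTER TABLE dbo.WideTable DROP COLUMN unusedCol1, unusedCol2;

-- Reclaim the space of dropped variable-length columns,
-- working through the table 100,000 rows per transaction.
DBCC CLEANTABLE ('MyDb', 'dbo.WideTable', 100000);
```

Trying this first on a restored copy, even a small one, is the safest way to confirm the space behaviour for the actual column types involved.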
Apr 29, 2008
I have a query that takes 12 minutes to execute. The query uses around 9 tables, but I have narrowed down the problem to one table that has over 65 million rows. The problem table has only 3 fields:
FieldOne (PrimaryKey)
FieldTwo Varchar(3000)
FieldThree Varchar(3000)
The query uses the primary key of this table to perform the join. FieldTwo and FieldThree are only used as output columns.
I noticed that if I remove FieldTwo and FieldThree from the output (but still leave the table in the query), the query executes in 1 second. However, if I include FieldTwo and FieldThree in the output, the query takes over 12 minutes to execute.
I cannot index FieldTwo and FieldThree because of the field size, and I cannot reduce the size of the fields because of the data that needs to be stored in them. How can I index the table, or do something similar, to speed up the lookup?
Jul 8, 2007
I have been asked to look at some performance issues with an application that uses an 800 GB table. The table contains 4 int columns and 1 decimal column. It has a clustered index that covers the 4 int columns; the index is heavily fragmented and has not been maintained for a long time. The system has too little free space to even attempt rebuilding the index. Does anyone have experience running the ALTER INDEX ... REORGANIZE command on such a large table? Any information on what storage would be required to attempt this, and how long it would take?
Sep 21, 2006
I have to migrate an Oracle DB to SQL Server 2005, including a 450 GB table with images. The estimate is that it will take about 24 hours to move this data. I'm using SSIS with just one OLEDB input and sending to one OLEDB output. The SSIS process will be running on the destination SQL Server with 2 GB of memory and at least half that memory in use by other apps (including SQL Server). I tried the SQL Server Destination but received errors trying to use it.
Are there any suggestions on settings or the best way to do this?
May 10, 2008
I have 4 tables with the following numbers of records:
1) 6755
2) 2021
3) 2021
4) 355
They all have the same columns. However, they need to be separate, or at least appear so when I query them. I'll be accessing this database via the web. I was first afraid that a large database would cause a major slowdown when accessing the db, so I broke it up into 4 tables. If I combined all 4 tables into one large table and just had a column that differentiated the 4, how significant would the change in speed be when accessing the table? It's not a big deal to keep them separate; it's just that when I have to add or remove a column from one table I have to do the same on all the tables. Furthermore, I'm using a module from DEVEXPRESS (don't know if anyone has heard of it), and when you use a gridview it loads the entire table even though you're paging (which I think is wasteful), so for that reason I was afraid it would slow down my access to the db. Any thoughts?
Nov 22, 1999
We have a table that we BCP into, the data is then processed and inserted into its appropriate table.
Then the table or its data needs to be removed. This seems to be a very slow operation to remove
the table or table data. I have tried drop table, and truncate table and it takes nearly as long as
the bcp operation. The table has 12 million rows. I didn't think either operation wrote to the
transaction log except for page extent management. Why are the drop and truncate so slow? Suggestions?
Oct 31, 1999
Hello:
The purchased application (MS SQL 6.5, SP 4) that I am working on has one large table with 13 million rows. It is the largest table by far, considering the next largest table has only 1.75 million rows.
The vendor has made a change to this largest table, recommending changing a data type from char to varchar. To make this change easier to do, I want to "archive" older data not necessary for the current year or current processing to another table.
What is the best way to do this archiving?
Any information you can provide will be greatly appreciated. Thanks.
David Spaisman
Jan 7, 2008
I have a table that currently holds about 5 million records. We add an average of 5,000 new records per day, all of them from overnight batch jobs. I guess it's not that big, but there are two text columns that hold a couple of KB each, so the total size isn't exactly small either. The data is created from medical billing data we receive overnight. We get two reports: patient demographic information and a physician's dictation relating to that patient. This data is always one to one, and the purpose of this table is to store the data as we originally receive it, which is why both reports are in the same table. After we extract the details from the reports (which are by this point always reduced to text documents), we need to keep not only the data but the original documents, hence the two text columns. We considered moving the large columns to their own table, which would just have an ID field and the column, but the powers that be really wanted all this in the same table. Nothing new goes into the table during the day; it's all SELECT statements.
I need to add a column to this table. It's just a small char(7) column, NULLS allowed, of course. We bill for several clients, and reports from different clients become available at different times, so there's really no down time overnight. Altering the table during the day is out of the question. So how can I add a column while the table is active?
My best idea so far is to use SELECT *, NULL AS NewColumn INTO NewTable to create a copy of the table (using a cast to get the correct datatype) during the day, when no new data is going in, and replacing the old table with the new by simply changing the names right after everyone goes home. But this could still cause slowdowns while it builds the copy, and leave the problem of re-creating indexes (there are several). There ought to be some graceful way to tell it to add the column to the existing table and play nice with ongoing traffic.
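For reference, the direct approach is a one-line statement. As far as I know, adding a nullable column with no default is a metadata-only change in SQL Server, so it does not rewrite the 5 million rows; it does need a brief schema-modification lock, so it will wait behind any long-running query. The table and column names below are hypothetical.

```sql
-- Add the new nullable column in place; existing rows simply read as NULL.
ALTER TABLE dbo.Reports ADD ClientCode char(7) NULL;

-- Confirm the column exists with the expected type and nullability.
SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, IS_NULLABLE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'dbo'
  AND TABLE_NAME   = 'Reports'
  AND COLUMN_NAME  = 'ClientCode';
```

Running it on a restored copy first shows how long the lock is actually held before trying it against live traffic.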
Aug 4, 2005
I have a few hundred users, maybe a dozen or two active at any given time, accessing the same database via ASP. The database has many tables, one being a very large orders table with a few million records, against which I have created a view. I use a view only because I need to allow the user to filter quite extensively against the results. The users typically only need to view records for the last 30 days, and results for each user might be five thousand records or less.
My question is this: would I be better off writing each user's result set to a temp table for that user's session, letting the user's filtering and sorting go against that temp table, and increasing my hardware to accommodate that (possibly to the point of creating a database cluster)? Or would I be better off leaving it as is, where every user uses the same view?
FYI: each user may need visibility to only a handful of fields, but overall the view must maintain many fields.
Any thoughts on this would be greatly appreciated. Thanks in advance.
Dave
Dec 5, 2006
Using SQL Server 2000, SP1 with 4 GB max memory allocated to the instance. The problem is that one large table is hogging the cache and dragging down overall query performance. I realise it's in cache because it's getting queried regularly. However, I need to know what options exist to get around this problem and free up some cache for other tables and indexes. Of course, there is the option of archiving off some of the data in the table to reduce its size, and we will look at doing this, although it will not be as easy as it sounds.
I can imagine that there must be many databases that have at least one large table that is getting hit regularly and is left in cache more or less permanently. Therefore, I can't believe I have an unusual problem.
Thanks in advance,
Zarty
May 21, 2013
What is the best way to select summary data from a big table of detail records and insert it into another table?
The source table contains approx 100 million detail records covering a couple of months. I have a select statement that summarises the latest month's data; it averages about 15 million records for the month, which I want to insert into another table.
In the past I just used a standard INSERT INTO statement, but I suspect that is not the best way of doing it.
If a view is created with just the last month's summary data and I select from the view, will the performance be better or will it just add more overhead?
Would an SSIS package work better to insert the summary data?
Jun 17, 2008
Asalam o alykum!!!
I just want to know: I have a table with a large number of records, roughly 7 lakh (700,000). When I turn auto increment on for it, I get a timeout error. Is there any easy way to turn auto increment on?
Thank you
Jul 23, 2005
SQL Server 7/2000: We have reasonably large tables (3,000,000 rows) that we need to add some indexes for. In a test, it took over 12 hours to CREATE a new INDEX against this table. One of us suggested that we create a temp table with the new index and copy the data from the old table into the new one, then rename it. I understand this took 15 minutes. Why the heck would it be faster to move the data and build multiple indexes incrementally vs adding an index??
Jul 26, 2007
I'm working with a table with about 60 million records. This monster is growing every minute of the day as well, by 200,000 - 300,000 records/day. It's 11 columns wide, and has one index on a datetime column. My task is to create some custom reports based on three of these columns, including the datetime one.
The problem is response time. Any query executed on this table takes forever--anywhere between 30 seconds and 4 minutes. Queries such as this one below, as simple as it is, can take a minute or more:
select
count(dt_date) as Searches
from
SearchRecords
where
datediff(day,getdate(),dt_date)=0
As the table gets larger and larger, the response time is going to get worse and worse. Long story short, what are my options to get the speed of queries down to just a few seconds with a table this big? So far the best I can come up with is to index any other appropriate columns (of which there is one for sure, maybe two).
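One thing that stands out in the query above is that wrapping dt_date in datediff() prevents the existing index on that column from being used for a seek. A sketch of an equivalent range predicate follows; it assumes dt_date is a datetime column and reuses the table and column names from the post.

```sql
-- Midnight at the start of today.
DECLARE @today datetime;
SET @today = DATEADD(day, DATEDIFF(day, 0, GETDATE()), 0);

-- Same count as datediff(day, getdate(), dt_date) = 0, but the column is
-- compared directly, so the index on dt_date can be used for a range seek.
SELECT COUNT(*) AS Searches
FROM SearchRecords
WHERE dt_date >= @today
  AND dt_date <  DATEADD(day, 1, @today);
```

The half-open range (from midnight today up to, but not including, midnight tomorrow) avoids missing rows stamped in the last fraction of a second of the day.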
Jun 1, 2015
I have a table with a couple hundred billion records (SQL Server 2005). When I do a select count(*) from tblx, it takes this side of forever. Is it possible to count partitions and then add them up to make it faster?
How could I improve the performance of count(*) on this huge table? Note: if the partition idea sounds viable, what would that look like?
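The "count partitions and add them up" idea can be sketched with the sys.partitions catalog view, which is available in SQL Server 2005 and keeps a row count per partition. The counts are documented as approximate, so this fits monitoring better than exact reconciliation. The schema name dbo is an assumption.

```sql
-- Row count per partition of the heap (index_id 0) or clustered index (index_id 1).
SELECT p.partition_number, p.rows
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID(N'dbo.tblx')
  AND p.index_id IN (0, 1)
ORDER BY p.partition_number;

-- Total for the table, without scanning it.
SELECT SUM(p.rows) AS approx_row_count
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID(N'dbo.tblx')
  AND p.index_id IN (0, 1);
```

This works whether or not the table is partitioned; an unpartitioned table simply has a single row with partition_number 1.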
Aug 15, 2007
We have a table that is 800 GB. We are planning to rebuild the clustered index on this table onto a different filegroup. The new filegroup and the files associated with it will sit on a SAN with a 1.5 TB allocation. Does anyone have any suggestions regarding how many files to associate with the filegroup for optimal performance? Apparently we could have 3 LUNs (500 GB each), so would 1 file on each LUN provide additional performance as opposed to one file on 1 LUN?