Interesting Large Table Design Recommendation

Sep 25, 2007

Hi,

What's the most efficient way to store the following information:

* Table contains 1 million listings
* Each listing can be geo-targeted to any of the 200+ countries
* Searches return listings based on location

Storage options:

Option #1 (normalized)
* Listings (PK listingID int) [1 million rows]
* ListingLocations (listingID, locationID) [could be up to 200 million rows]

Option #2 (denormalized)
* Listings (PK listingID int, binary(32) bit-mask consisting of 200 bits, one for each location)

Usage: Usually the query will simply look up listings based on some keywords. It will get back 50-200 listings. Then the application (C#) will filter the listings based on location.

Did anyone have experience with similar structures? Which option is more efficient?

I know that using the intersection table in Option #1 is the "proper" relational-DB way of doing things. However, I do not like the idea of storing the listingID so many times (once for each locationID).
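For reference, here is a minimal sketch of the two options as DDL. The table and column names are taken from the description above; the bit-position arithmetic in the final comment is an assumption about how the 200 location bits would be packed.

-- Option #1: normalized intersection table
CREATE TABLE Listings (
    listingID int NOT NULL PRIMARY KEY
    -- ...other listing columns...
);

CREATE TABLE ListingLocations (
    listingID  int NOT NULL REFERENCES Listings (listingID),
    locationID smallint NOT NULL,
    CONSTRAINT PK_ListingLocations PRIMARY KEY (listingID, locationID)
);

-- Option #2: denormalized bit-mask, one bit per location packed into binary(32)
CREATE TABLE ListingsDenormalized (
    listingID    int NOT NULL PRIMARY KEY,
    locationMask binary(32) NOT NULL   -- bit N set => listing is targeted at location N
);

-- With Option #2, testing a single location bit has to happen in the C# filter
-- (or with an expression along these lines):
--   SUBSTRING(locationMask, (@locationID / 8) + 1, 1) & POWER(2, @locationID % 8) > 0

The intersection table makes location filtering indexable in SQL; the bit-mask keeps the row count down but pushes the filtering into application code, which matches the usage pattern described above.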

Thanks,
Av

View 1 Replies



"Interesting" Table Design... Need Opinions

Mar 6, 2008

Need some opinions on how to query a set of tables designed like this:

Main table: Subscriber

The table has 80 columns; each column has a table which is written to via an update trigger every time the column changes. The Subscriber table is a Type 1 dimension, but the "Change" tables are Type 2 dimensions. The "change" tables have 3 columns: SubscriberID, ChangeDate, NewValue.

The requirement is to push a Type 2'd version of the Subscriber table into a data warehouse for historical reporting, meaning the users need to be able to see what a Subscriber looked like on an exact date.

Has anyone had any experience with this type of structure? If so, how did you join together the Subscriber table to all the change tables to extrapolate a "view" of the subscriber on each given day? You can assume that a record can only change once in a day, but not every column will change on the record.

I was thinking of something like this:


SELECT Subscriber.ID,
       (SELECT TOP 1 b.NewValue
        FROM Subscriber_FirstNameChanges b
        WHERE b.SubscriberID = Subscriber.ID
          AND b.ChangeDate <= @GivenDate
        ORDER BY b.ChangeDate DESC) AS FirstName
FROM Subscriber

This would pull the most recent version of the subscriber's FirstName based on a given date (I'd have to add in the 79 other change columns with almost identical subqueries). This would also take into consideration that there may not be change records for particular columns on particular dates, because not all columns change at the same time/day, and it would still pull the most recent "version" of the subscriber's FirstName as of @GivenDate.
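As a hedged sketch of what the full query shape might look like for two of the 80 columns: Subscriber_FirstNameChanges is the only change table named in the post, so Subscriber_LastNameChanges below is an assumed name used purely for illustration.

DECLARE @GivenDate datetime;
SET @GivenDate = '20080301';   -- the "as of" date the report asks for

SELECT
    s.ID,
    (SELECT TOP 1 fn.NewValue
     FROM Subscriber_FirstNameChanges fn
     WHERE fn.SubscriberID = s.ID
       AND fn.ChangeDate <= @GivenDate
     ORDER BY fn.ChangeDate DESC) AS FirstName,
    (SELECT TOP 1 ln.NewValue
     FROM Subscriber_LastNameChanges ln   -- assumed name for one of the other 79 change tables
     WHERE ln.SubscriberID = s.ID
       AND ln.ChangeDate <= @GivenDate
     ORDER BY ln.ChangeDate DESC) AS LastName
    -- ...repeat the same pattern for the remaining change tables...
FROM Subscriber s;

On SQL Server 2005 and later, each correlated subquery could equally be written as an OUTER APPLY (SELECT TOP 1 ...), which tends to be easier to read when there are 80 of them.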

View 6 Replies View Related

Design Question For Large Table

Sep 20, 2007

Hi,

What's the most efficient way to store the following information:

* Table contains 1 million listings
* Each listing can be geo-targeted to any of the 200+ countries
* Searches return listings based on geo-location

Storage options:

Option #1 (normalized)
* ListingsTable (PK listingID int) [1 million rows]
* ListingGeoLocations (listingID, geoLocationID) [could be up to 200 million rows]

Option #2 (denormalized)
* ListingsTable (PK listingID int, binary(32) bit-mask consisting of 200 bits, one for each location)

Did anyone have experience with similar structures? Which option is more efficient?

Thanks,
Av

View 8 Replies View Related

Index Design Recommendation - Examine Column Uniqueness

Nov 30, 2005

I am reading "SQL Server Query Performance Tuning Distilled"; on page 104 it talks about one of the index design recommendations, which is to choose a column that has very high selectivity of values instead of a column that has very low selectivity. My question is: if I currently have indexes on my tables whose columns contain only 1, 2, 3, 4, ... distinct values across thousands of rows, are these nonclustered indexes pretty much useless indexes that I should get rid of? And I know that the number of distinct values will pretty much always remain very low. Thank you
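A quick way to check this before dropping anything is to measure the selectivity directly; a sketch with placeholder names (StatusCode and dbo.MyTable are not from the post):

-- Selectivity = distinct values / total rows; values close to 0 mean the
-- optimizer is unlikely to choose a nonclustered index on that column alone.
SELECT
    COUNT(DISTINCT StatusCode) AS DistinctValues,
    COUNT(*)                   AS TotalRows,
    CAST(COUNT(DISTINCT StatusCode) AS float) / COUNT(*) AS Selectivity
FROM dbo.MyTable;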

View 1 Replies View Related

Design Of Tables With Large Optional Fields?

Jan 4, 2006

I have a general SQL design-type question.

I want to log errors to a table. If the error is with a URL, I want to store the URL. These URLs can be very large, hundreds of characters, but I only need to store it if it causes the error, which should be very infrequent. Which is the better design:

1. Create a large varchar field in the log table to hold the URL, or null if the error wasn't with the URL.
2. Create a foreign key field in the log table to a second URL table, which has a unique ID and a large varchar, and only create a record in this table if the error is with the URL.

One concern I have with design 2 is that there could be many other fields that are infrequently populated. Do I create a separate table for every one?
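For comparison, a minimal sketch of design 2; all names here are illustrative, not from the post:

CREATE TABLE ErrorUrls (
    UrlID int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Url   varchar(2048) NOT NULL              -- only written when an error involves a URL
);

CREATE TABLE ErrorLog (
    ErrorID   int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    ErrorTime datetime NOT NULL DEFAULT GETDATE(),
    Message   varchar(1000) NOT NULL,
    UrlID     int NULL REFERENCES ErrorUrls (UrlID)   -- NULL for the common, URL-less errors
);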

Richard

View 3 Replies View Related

Large Volumes Of Varchar Data - Design Advice

Jul 6, 2006

Hello all,

I have recently been tasked with rewriting a database that holds large volumes of data, whilst ensuring that queries can be run in optimal time. Having never really delved into this sort of thing before, I hoped you guys might be able to offer some advice and guidance.

The design I have inherited is based around 2 main tables:


[captured_traps]
[id] [int] IDENTITY (1, 1) NOT NULL
[snmp_version] [int] NULL
[community_name] [varchar] (255)
[packet_type] [varchar] (50)
[oid] [varchar] (500)
[source_ip] [varchar] (15)
[generic] [int] NULL
[specific] [int] NULL
[time_stamp] [varchar] (15)
[trap_entered] [datetime] NULL
[status] [int] NULL


[captured_varbinds]
[id] [int] IDENTITY (1, 1) NOT NULL
[captured_trap_id] [int] NOT NULL
[varbind_oid] [varchar] (500)
[varbind_text] [varchar] (500)


The relationship between the two tables is on the "captured_traps (id)" to "captured_varbinds (captured_trap_id)". Currently the "captured_traps" table contains around 350 million rows, the "captured_varbinds" table contains around 900 million rows.

Now as you can probably gather, this model runs like a... well, it sort of hobbles more than runs, hence the need to redesign.

My current thoughts on this are:

- Normalising all varchars - there are a lot of duplicate values in most of the varchar fields.
- Full Text Indexing

However beyond that I am not sure which route to go down. After googling for most of today I have come across a number of "solutions" however I do not want to go steaming down the track of one of these to discover that it is fatally flawed somewhere.
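As one concrete example of the normalising idea, the heavily duplicated varchars could be pulled into lookup tables; the table name and the choice of oid as the first candidate are assumptions:

-- Store each distinct oid string once and keep a small surrogate key
-- in captured_traps instead of a varchar(500).
CREATE TABLE oid_lookup (
    oid_id int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    oid    varchar(500) NOT NULL,
    CONSTRAINT UQ_oid_lookup UNIQUE (oid)
);

-- captured_traps would then carry:
--   [oid_id] [int] NOT NULL REFERENCES oid_lookup (oid_id)
-- and the same pattern could be applied to community_name, packet_type,
-- source_ip and varbind_oid, which are said to contain many duplicates.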

View 6 Replies View Related

Interesting Scenario - Multiple Files Updating One Table

Sep 6, 2007

Hi,

I have a scenario where I have to read 2 files that update the same table (a temp staging table)... this comes from the source system's limitation on the number of columns that it can export. What we have done as a workaround is split the data into 2 files, where the 2nd file contains the first file's primary key so we know which record to update...

Here is my problem...

The table that needs to be updated contains 9 columns. File one contains 5 of them and file2 contains 4 of them.

File 1 inserts 100 rows and leaves the other 4 columns as nulls and ready for file 2 to do an update into them.
File 2 inserts 10 rows but fails on 90 rows due to incorrect data.
Thus only 10 rows are successfully updated and ready to be processed, but 90 are incorrect. I want to still do processing on the existing 10 but can't afford to try to do processing on the broken ones...

The easy solution would be to remove the incorrect rows from the temp table whenever an error occurs in the 2nd file's package, by running a SQL query on the table using the primary keys that exist in both files. But when the error occurs on the Flat File source, I can't get the primary key.

What would be the best suggestion? Should I rather fail the whole package if 1 row bombs out? I can't put any logic in the following package that does the master file update/insert from the temp table because of the nature of the data.
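One hedged sketch of the cleanup idea described above - removing staging rows that file 2 never updated - assuming the four file-2 columns are still NULL for the failed rows (the table and column names are placeholders):

-- Drop the rows file 1 inserted but file 2 never completed,
-- so downstream processing only sees complete records.
DELETE FROM dbo.StagingTable
WHERE File2ColumnA IS NULL
  AND File2ColumnB IS NULL
  AND File2ColumnC IS NULL
  AND File2ColumnD IS NULL;   -- the four columns only file 2 populates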

Regards
Mike

View 4 Replies View Related

DB Design :: Table Design For Packages

Aug 18, 2015

I would like to create a table called Product. My objective is to get the list of packages available for each product in a data grid view column when selecting each product. Each product may have different package types (e.g. Nos, CTN, OTR etc.). Some products may have two packages and some three packages, etc. The quantity in each package may also differ (e.g. for some products a CTN may contain 12 Nos, in other cases 8 Nos, etc.). Prices for each package will also be different, and that also needs to be shown. How should I design the table?

Product name : Nestle milk                | Rainbow milk
Packages     : CTN, OTR, Nos              | CTN, Nos
Price        : 50, 20, 5                  | 40, 6
Remarks (for reference) : CTN = 10 Nos, OTR = 4 Nos | CTN = 8 Nos
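One possible three-table shape for this; all names are illustrative, since the post does not prescribe a design:

CREATE TABLE Product (
    ProductID   int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    ProductName nvarchar(100) NOT NULL       -- e.g. 'Nestle milk', 'Rainbow milk'
);

CREATE TABLE PackageType (
    PackageTypeID int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    PackageCode   varchar(10) NOT NULL       -- 'Nos', 'CTN', 'OTR', ...
);

-- One row per product/package combination, so each product can have any
-- number of package types, each with its own quantity and price.
CREATE TABLE ProductPackage (
    ProductID       int NOT NULL REFERENCES Product (ProductID),
    PackageTypeID   int NOT NULL REFERENCES PackageType (PackageTypeID),
    UnitsPerPackage int NOT NULL,             -- e.g. CTN = 10 Nos for one product, 8 for another
    Price           decimal(10,2) NOT NULL,
    CONSTRAINT PK_ProductPackage PRIMARY KEY (ProductID, PackageTypeID)
);

The data grid view would then join these three tables and show whatever packages each product actually has.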

View 3 Replies View Related

Does It Store All The Results To Tempdb Database When I Query Against A Large Table Which Joins Another Table?

Jun 25, 2007

Hi, all experts here,

I am wondering if tempdb stores all results temporarily whenever I query a large fact table with over 4 million records which joins another dimension table? Each time I run the query, tempdb grows to nearly 1GB, which nearly uses up all the space on my local system drive, and as a result performance drops right off. Is there any way to fix this problem? Thanks a lot in advance and I am looking forward to hearing from you shortly for your kind advice.

With best regards,

Yours sincerely,
View 11 Replies View Related

Large Table-Table Partition, View Or Other Method?

Aug 27, 2007

Hi everyone,

I use SQL 2005. What is the best practice for dealing with a large table (more than a million rows)? Table partitioning, views, or another method?

Can you please give some suggestions? It will be very helpful if you can post some references or examples.

Thank you!

View 12 Replies View Related

Recommendation

Jun 24, 2004

I was wondering if anyone can recommend a book that specifically focuses on SQL statements such as queries, stored procedures, triggers, transactions, etc.

View 11 Replies View Related

Need A Recommendation

Apr 8, 2008



Hello,

I will have to create a table that consists of only two fields: one, the EmployeeID, and two, the SupervisorID.
My question is what I should define as my primary key. Should it be an additional field, or could it be the EmployeeID field?

The EmployeeID is a unique field. The end user for this application will rarely be updating some of these records, and may be adding or deleting new records sporadically.
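A minimal sketch using the EmployeeID itself as the primary key, since it is already unique (the table name and the self-referencing foreign key are assumptions):

CREATE TABLE dbo.EmployeeSupervisor (
    EmployeeID   int NOT NULL PRIMARY KEY,   -- already unique, so no extra surrogate key is needed
    SupervisorID int NULL
        REFERENCES dbo.EmployeeSupervisor (EmployeeID)   -- optional; NULL allows the top of the hierarchy
);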

Thanks for suggestions.

View 6 Replies View Related

SQL HA Recommendation

Dec 21, 2006

Hi:

Here's the scenario I'm working with:

A SQL 2005 server with around 1K databases, capacity at about 1TB. We would like to be able to have a warm standby with transactions replicated to it. In the event of a failure on the principal, we would want the warm standby to come online automatically and begin serving DB requests.

I've looked at the SQL 2005 database mirroring option; however, this has a restriction of around 10 databases per SQL Server instance which, unfortunately, I exceed. One method I've been looking at is transactional replication in the classic publisher/subscriber model; however, how would I handle automated fail-over to the subscriber if the publisher were to fail?

Does anyone in the community have any thoughts or recommendations?

Thanks.

-matt

View 1 Replies View Related

DB Design :: Insert / Update FACT Table From Staging Table

May 6, 2015

We need to insert/update a fact table from a staging table. Currently we are using an SP which updates the fact table for each region. This process is scheduled; every 5 minutes a job runs and updates the fact table. But the time for the insert and update from staging to fact is too long. Currently we are using a MERGE statement for the insert and update. In the SP we loop over however many regions we need to update, updating a single region at a time using a WHILE loop.
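A hedged sketch of collapsing the per-region loop into a single set-based MERGE; every table and column name below is a placeholder, since the post does not show the actual schema:

MERGE dbo.FactSales AS tgt                 -- placeholder fact table
USING dbo.StagingSales AS src              -- placeholder staging table
    ON  tgt.RegionID   = src.RegionID
    AND tgt.DateKey    = src.DateKey
    AND tgt.ProductKey = src.ProductKey
WHEN MATCHED THEN
    UPDATE SET tgt.Amount   = src.Amount,
               tgt.Quantity = src.Quantity
WHEN NOT MATCHED BY TARGET THEN
    INSERT (RegionID, DateKey, ProductKey, Amount, Quantity)
    VALUES (src.RegionID, src.DateKey, src.ProductKey, src.Amount, src.Quantity);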

View 7 Replies View Related

DB Design :: Table Partitioning Using Reference Table Data Column

Oct 7, 2015

I have a requirement for table partitioning. We have 10 years of data in a table which is 30 billion plus rows on a 2005 server that we are upgrading to 2014. We have to keep 7 years of data. There are no keys on the table and no date column. Since it is a huge amount of data with many users, it slows down the process speed. We are thinking of partitioning the 7 years on a quarterly basis, but as I said there is no date column on the table; we have to use a reference table to get the date. Is there a way I can do the partitioning without adding a date column to the table? Also, will partitioning make queries faster?

I have thought of three ways to do it.
1. leave as it is.
2. 7 years partition on one server
3. 3 years partition on server1 and 4 years partition on server2 (for 4 years is snapshot better?)

View 3 Replies View Related

PK On A Large Table

Nov 16, 2007

I am developing an application that has a table with lots of records (network traffic), but the data is summarized every so often to create summary records (old records are deleted). The problem is that I have a PK based on an autoincrement ID (int) that will run out of numbers. However, this ID is not referenced anywhere (not a foreign key from another table, not used for deletion, and there are no updates in this table whatsoever).

So my possibilities are:
1.- reseed the ID when it is about to run out.
2.- make the ID bigint
3.- remove the ID and change the PK to 2 other fields
4.- remove the ID and go without a PK

I am leaning toward option 4, because I do not see the need for a PK, but I understand that it is quite out of the norm... So I would like to hear from other people (I do not have much experience with DBs).

I also like option 3. I already have an index on one of the other fields (time).
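A sketch of option 3: drop the surrogate identity and make the two natural columns the clustered primary key. The table, constraint and column names are placeholders; CaptureTime stands in for the existing time field and SourceIP for the assumed second field.

ALTER TABLE dbo.TrafficSummary DROP CONSTRAINT PK_TrafficSummary;  -- the old PK on the identity column
ALTER TABLE dbo.TrafficSummary DROP COLUMN ID;
ALTER TABLE dbo.TrafficSummary
    ADD CONSTRAINT PK_TrafficSummary
    PRIMARY KEY CLUSTERED (CaptureTime, SourceIP);                  -- the two natural fields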

Any input will be appreciated.

Claudio Robles

View 7 Replies View Related

BCP Large Table.

Jul 23, 2005

If I use BCP to export a very large table, will that table be blocked for writes during the export process? I don't want to prevent users from accessing that table during the BCP process. Thank You, TFD.

View 1 Replies View Related

Forum Recommendation

Sep 8, 2004

Can anyone recommend other "REALLY GOOD" MS SQL Server forums?

Thanks

View 1 Replies View Related

General Recommendation

Sep 26, 2004

Hi, folks. I've a production SQL machine with more than 20 users making transactions 24 hours a day, 6 days a week. I've only Sunday for maintenance. The server has a fixed 2 GB RAM allocation for SQL. Is it good to restart SQL (or the machine) to clear the buffer cache, or is it better to keep the cache? :rolleyes:


Howdy!

View 3 Replies View Related

Filegroup Recommendation

Jan 15, 2004

Are there any general recommendations concerning filegroups? My personal point of view is to place large tables in their own filegroups and group smaller, more static, tables in a single filegroup. Is it also good practice to group small and large indexes in two separate filegroups, or should each large index have its own filegroup? Are there any useful links out there concerning filegroups and configuration?

View 1 Replies View Related

Software Recommendation

Oct 7, 2007

I need a recommendation for a good database creation & management tool, with a simple user interface (not Access :-)).

thx

View 9 Replies View Related

SQL Hardware Recommendation

Jul 20, 2005

I am in need of some advice. I need to build a SQL machine that will be adequate for my company. Budget is a very big factor but I need the machine to be reliable and as redundant as possible. This box will be 'vanilla' since I will be building it myself. I looked at some larger companies' websites and the prices are way out of control.

Here's what my configuration is so far (keeping price in mind):

Case: rack-mount 4U
Motherboard: Intel 865GLCL (800MHz FSB)
Processor: Intel Pentium 4 2.4GHz
Memory: 1GB DDRAM
Hard Drive(s): 3 x 36GB SATA [10,000 RPM] in RAID 5 configuration
CD-ROM: standard
Floppy: standard
RAID Controller: Promise SATA
NIC: 3Com

My machine does not come under a very heavy load but it is used often. I'm interested in hearing others' comments about their SQL servers so I know how to gauge building my machine.

View 4 Replies View Related

Book Recommendation

Jul 20, 2005

Can someone please recommend a good book on SQL Server 2000 for a programmer (PowerBuilder), not a DBA.

View 1 Replies View Related

Book Recommendation

Jul 27, 2006

I have begun to try to break out of using Access db's (97!) and have been trying out SQL Server Express 2005 along with the SQL Management Studio Express. I am a little confused with it as I am trying to use the interface inside of VB.NET 2005 as well as the management studio and sometimes I can connect from one without the other.

Anyway this points to the fact that I have a lot to learn and I was looking for a recommendation for a book that could be a tutorial for using VB.NET 2005 with SQL Server Express. I really need something that starts from square one but hopefully builds fast. Right now it appears I need to understand connection strings (when do I put ".sqlexpress" and when do I use the server name followed by the instance for example?).

I have tried some of the books online for example and ran into a dead end with the simple tutorial (http://msdn2.microsoft.com/en-us/library/ms165732.aspx) when the headers didn't sort, I couldn't select any other pages and the edit button didn't work. I don't have a clue what happened as I followed the instructions.

Anyway if someone could recommend something that teaches using SQL Server Express while building an application with VB2005 that would be perfect.

Dana

View 3 Replies View Related

Performance Recommendation

May 31, 2007

Please give me some advice. In my application I calculate a list of identifiers (Guids) that are primary keys in my table and I have to retrieve those rows from the database. So my first approach is like




SELECT id, c2 FROM t1 WHERE id IN (@id1, @id2, @id3, ...)

where @idn are the calculated identifiers passed as parameters. This approach does not scale well since there is a limit on the number of parameters that can be used. So one possibility might be to use several SELECT statements, each with the maximum number of parameters. I can't believe that this is a good solution. A temporary table may be a better solution - I don't know. Are there any better ways to retrieve these rows performantly - any recommendations?
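One common pattern is to push the calculated GUIDs into a temporary table (or, on SQL Server 2008 and later, a table-valued parameter) and join to it; a sketch using the t1/id/c2 names from the snippet above:

-- The application bulk-inserts the calculated GUIDs into #ids
-- (for example with SqlBulkCopy from C#), then runs a single join.
CREATE TABLE #ids (id uniqueidentifier NOT NULL PRIMARY KEY);

-- ...populate #ids from the application...

SELECT t1.id, t1.c2
FROM t1
JOIN #ids i ON i.id = t1.id;

DROP TABLE #ids;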



Thanks a lot

Hans-Peter



View 8 Replies View Related

Storing Large PDF's In Table

May 26, 2008

Hi,
I have this page that uploads PDFs to a table. In principle this works fine.
Until I try to upload large files (3 to 4 MB): then I get timeout problems. I need to upload even larger files than that (don't really know as of yet what users are going to come up with). Now some people say it is not possible to exceed a limit of about 4 MB, but that there is a workaround by changing something in the web.config file. Can somebody give me info about that (I am quite a novice really)? I tried to change it like this, but to no avail:
<system.web>
  <httpRuntime maxRequestLength="102400" enable="True" requestLengthDiskThreshold="102400" useFullyQualifiedRedirectUrl="True" executionTimeout="102400" />
</system.web>
Thanks for any help!

View 2 Replies View Related

Move Large Table From DB To DB

Mar 3, 2004

I have a table of approx 1/2 million rows.

On a nightly basis, this table gets rebuilt in a temporary database. Once the table has been built and scrubbed, I need to move it into our web server's DB.

I'd like to do this with minimal interruption to the website.

Possible techniques:

1) I could set up a DTS package to copy the table object overwriting the destination table

2) I could export to a flat file and then bulk import into the live table (after truncating it)

3) I could run a process to update smaller chunks of data at a time running delete queries and insert queries.

Anybody have a thought on the best way to do this so that the web users would be virtually unaware that anything was happening?
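A fourth variation on technique 1, sketched under the assumption that the scrubbed copy can be loaded into a staging table in the live database first (the table names are assumptions): swap the tables by renaming, so readers only ever see a complete table.

-- After the nightly load lands in dbo.Listings_Staging on the web server's database:
BEGIN TRANSACTION;
EXEC sp_rename 'dbo.Listings',         'Listings_Old';
EXEC sp_rename 'dbo.Listings_Staging', 'Listings';
COMMIT TRANSACTION;

DROP TABLE dbo.Listings_Old;   -- or keep it around for a quick rollback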

View 4 Replies View Related

Duplicate In Large Table

Mar 1, 2002

Hi,

I am absolutely innocent as far as T-SQL is concerned. I need to detect all duplicates (key consists of 5 fields) in the table and delete the duplicates.
I tried different approaches like joins etc but nope.
Any help is appreciated
Thanks
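The post predates it, but on SQL Server 2005 and later the usual pattern is a ROW_NUMBER delete; a sketch where dbo.MyTable and the key columns k1..k5 are placeholders:

-- Number the copies within each 5-column key and delete everything after the first.
WITH numbered AS
(
    SELECT ROW_NUMBER() OVER
           (PARTITION BY k1, k2, k3, k4, k5 ORDER BY k1) AS rn
    FROM dbo.MyTable
)
DELETE FROM numbered
WHERE rn > 1;

On SQL Server 2000, one workaround is to select one copy of each duplicated key into a temporary table, delete those keys from the base table, and insert the kept copies back.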

View 2 Replies View Related

Help! Indexing LARGE Table

Aug 6, 2004

OK, I imported 680 million records into an unindexed table. That went well.

Then, I went into Enterprise Manager and added a two column non-unique clustered index to that table to speed access.

It's been running for ~36 hours and I have no idea when it will complete. I have deadlines that I'm going to miss and am very nervous; what can I do?

SQL Server 2000 Enterprise Edition (8.00.818 - sp3 + hotfixes)
Dual 3Ghz Xeon (two physical CPUs each have HyperThreading enabled)
Windows 2000 SP4
4GB RAM (although I just noticed the 3GB OS switch wasn't on)
SCSI boot drive
tempdb, data, and transaction log are on a FibreChannel RAID SAN

Help! Thanks in advance!

View 8 Replies View Related

Partitioning A Large Table - How Much Is Too Much?

Nov 14, 2007

Hi folks! I'm looking for advice on partitioning a large table. In the DDL below I've changed names to protect the guilty.

My table has this schema:


CREATE TABLE [dbo].[BigTable]
(
[TimeKey] [int] NOT NULL,
[SegmentID] [int] NOT NULL,
[MyVal] [tinyint] NOT NULL
) ON [BigTablePS1] (TimeKey) -- see below for partition scheme

alter table [dbo].[BigTable] add constraint [PK_BigTable]
primary key (timekey asc, SegmentID asc)

-- will evaluate whether this one is needed, my thinking is yes
-- based on the expected select queries.
create index NCI_SegmentID on BigTable(SegmentID asc)


The TimeKey column is sort of like a unix time. It's the number of minutes since 2001/01/01, but always floored to a 5 minute boundary. so only multiples of 5 are allowed.

Now, this table will be rather big. There are about 20k possible SegmentIDs. For every TimeKey from 2008/01/01 to 2009/01/01 (12 months), I'll have on the order of 20000 rows, one for each SegmentID.

For the 12 month period, there are 365*24*60/5=105120 possible TimeKey values. So the total rowcount is over 2 billion. (20k * 105120)

Select queries are expected to be something like this:


-- fetch just one particular row...
select MyVal from BigTable
where TimeKey=5555 and SegmentID=234234

--fetch for a certain set of SegmentID and a particular time...
select
b.SegmentID
,b.MyVal
from BigTable b
join OtherTable t on t.SegmentID=b.SegmentID
where b.TimeKey=5555
and t.SomeColumn='SomeValue'


Besides selects, also I need to be able to efficiently issue update statements against the table with new values in the MyVal column based on a range of TimeKey values (a contiguous span of a few days) and sets of about 1000 SegmentID. updates would always look like this:


update t
set t.MyVal=p.MyVal
from BigTable t
join #myTempTable p on t.TimeKey=p.TimeKey
and t.SegmentId=p.SegmentId


where #myTempTable would have order of 1000*24*60 rows in it, all with contiguous TimeKey values, and about 1000 different SegmentID values. #myTempTable also has a clustered pk on (timekey asc, SegmentId asc).

After the table is loaded, it would never get any inserts or deletes. only selects and updates.

Given the size, and the nature of the select and update queries, this table seems like a good candidate for partitioning. I'm thinking it makes sense to partition on TimeKey.

So my question is, is it stupid to create a separate partition for each day in the year long span of TimeKeys this table covers? That would mean 365 partitions in the partition function and partition scheme. Something like this:


CREATE PARTITION FUNCTION [BigTableRangePF1] (int)
AS RANGE LEFT FOR VALUES
(
3680640 + 0*1440, -- 3680640 is the number of minutes between 2001/01/01 and 2008/01/01
3680640 + 1*1440,
3680640 + 2*1440,
3680640 + 3*1440,
...snip...
3680640 + 363*1440,
3680640 + 364*1440,
3680640 + 365*1440
);
GO

CREATE PARTITION SCHEME [BigTablePS1]
AS PARTITION [BigTableRangePF1]
TO
(
[PRIMARY],[PRIMARY],[PRIMARY],
...snip...
[PRIMARY],[PRIMARY],[PRIMARY]
);
GO


does anyone have any experience with partitioned tables with so many partitions? Is a few hundred partitions too many? From my understanding of partitions, seems like having so many will be ok. Is it somehow worse than having hundreds of tables in a database?

Even with one partition for each day, I'll still have 24*60*20000/5 ~ 5m rows in each one.

5m seems like a manageable number. 2b does not.



elsasoft.org

View 2 Replies View Related

Select On Large Table.

Jul 23, 2005

Greetings All, I was wondering what would happen if I were to do a "select * from table" on a table that has about 5 million rows. Would my read block other writers to the same table? Would it block other readers? I know SQL uses optimistic locking by default but I am not sure what this means for other users trying to access the same table. Any advice would be greatly appreciated. TFD

View 3 Replies View Related

Large Table Schema Changes

Jul 20, 2005

Quick question: does SQL do table/schema changes "in place"?

I've got a large table (140+ million rows of very wide data) that we want to change the schema on -- basically to remove a number of the unused data elements that we don't use.

Anyway, does anyone know if SQL will do an in-place change, or if it will copy the table to a new table, thereby increasing my space allocation needs? I'd effectively, temporarily, need space for two tables while the change is happening if it copies the table first. This is not good as I do not have enough available space at the moment.

If you've got pointers to specific MS docs regarding this issue, please let me have 'em. Thanks in advance.

View 2 Replies View Related

Joining Large Table

Apr 29, 2008



I have query that takes 12 minutes to execute. The query uses around 9 tables but I have narrowed down the problem to one table that has over 65 million rows. The problem table has only 3 fields

FieldOne (PrimaryKey)
FieldTwo Varchar(3000)
FieldThree Varchar(3000)

The query uses the primary key of this table to perform the join. FieldTwo and FieldThree are only used as output parameters.

I noticed if I remove FieldTwo and FieldThree from the output (but still leave the table in the query), the query executes in 1 second. However if I include FieldTwo and FieldThree in the output, the query takes over 12 minutes to execute.

I cannot index FieldTwo and FieldThree because of the field size, and I cannot reduce the size of the fields because of the data that needs to be stored in them. How can I index, or do something similar, to speed up the table lookup?

View 1 Replies View Related






