Choosing DBMS And Architecture For Ecommerce Website
Jul 20, 2005
Hi to all
I have to choose a DBMS and a database architecture for an eBay-like
website that is about to be launched.
The company wants to use a web hosting service and not host the
database on dedicated servers at the office.
The database will contain web-only information and lots of back-end
information that does not really need to be stored on the web host.
I'm wondering how to design that part. Should I store all information
on the web host only? Mirror that DB every evening to a local DB
server so we can use the data without eating up lots of bandwidth?
Split the database into 2 parts? How would I sync them and ensure
integrity then? Having a local DB will also mean the company has to
pay for a DBMS licence...
What DBMS should I pick, considering that the database will have to
hold at least 1 million products for sale (eBay-like) and all the
information that goes with them? I thought any DBMS weaker than SQL
Server, Sybase or Oracle would not be enough. What do you think?
Thanks a lot, hope I have made myself clear enough
P.S. I would really like to get lots of different points of view. I
think I'll use Sybase after all, so I wonder if that's a good choice,
and I still want to know your thoughts about the 1 or 2 DB design
(separate Web & Billing information, for example, or leave all the info
in the hosted database; what techniques to use to keep integrity
and to have the latest information in-house)... Thanks a lot
First, is there a forum here for advanced database design that I'm missing?
I have an ecommerce project I'm working on for which I have to develop a database that will house many products under many categories with unlimited subcategories/levels. For example:
baseball
baseball > weightlifting
baseball > weightlifting > upper body
baseball > weightlifting > lower body
...
The levels of subcategories must be completely scalable, as the administrator wants to be able to add and subtract at will. Further, there is the idea that some subcategories will transfer across other categories (ex: weightlifting would be a subcategory of multiple sports).
In previous database designs I've simply had a category table and maybe a subcategory table; obviously that won't work here, as the number of category levels is unknown.
Does anybody have any suggestions, examples, recommended books, forums, etc. to point me in the right direction?
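One common way to model this is a plain category table plus a separate parent/child link table, so that a category such as weightlifting can hang under several sports, with a recursive query to walk any number of levels. The sketch below is only an illustration under assumptions: it uses Python's sqlite3 as a runnable stand-in, the table and column names (category, category_link) are made up, and "football" is a hypothetical second sport. SQL Server 2005 supports recursive common table expressions (written WITH, without the RECURSIVE keyword); SQL Server 2000 does not, so there the walk would have to be done in a loop or in application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- A link table instead of a parent_id column, so one category
-- can sit under several parents (hypothetical schema).
CREATE TABLE category_link (
    parent_id INTEGER NOT NULL REFERENCES category(id),
    child_id  INTEGER NOT NULL REFERENCES category(id),
    PRIMARY KEY (parent_id, child_id)
);
INSERT INTO category VALUES
    (1, 'baseball'), (2, 'football'), (3, 'weightlifting'),
    (4, 'upper body'), (5, 'lower body');
-- weightlifting is a child of both sports
INSERT INTO category_link VALUES (1, 3), (2, 3), (3, 4), (3, 5);
""")


def descendants(root_name):
    """Return (name, depth) for a category and everything below it.

    No cycle protection: the admin tool must prevent a category
    from becoming its own ancestor.
    """
    sql = """
    WITH RECURSIVE tree(id, name, depth) AS (
        SELECT id, name, 0 FROM category WHERE name = ?
        UNION ALL
        SELECT c.id, c.name, tree.depth + 1
        FROM tree
        JOIN category_link AS l ON l.parent_id = tree.id
        JOIN category AS c ON c.id = l.child_id
    )
    SELECT name, depth FROM tree ORDER BY depth, name
    """
    return conn.execute(sql, (root_name,)).fetchall()


baseball_tree = descendants("baseball")
football_tree = descendants("football")
```

Adding or removing a level is then just an insert or delete in category_link; no table structure changes.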
Hi folks, I have a table located in DB2 and I need to have a mirror image of this table in a SQL 2000 database to avoid some server downtime problems. Right now I have a solution using ADO.NET with a Windows service.
This Windows service runs every morning and pulls all the records from this table in DB2 into a dataset. Then I loop through the dataset and insert every record into the SQL 2000 table. This method is working fine (it takes approximately 2 minutes to insert 5000 records). I am just wondering whether there is any way to achieve bulk insertion in this case. Considering the future growth of the table, I don't think the existing solution is either elegant or efficient.
Please let me know if I can achieve the same using XML, BULK INSERT or any other mechanism in ADO.NET, and please remember that we are talking about data migration between different DBMSs (DB2 to SQL 2000).
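The usual fix for row-by-row loads is to stream the source rows in batches and write each batch in one call inside a single transaction. The sketch below shows only that pattern, not the ADO.NET code: it uses Python's sqlite3 with two in-memory databases standing in for DB2 and SQL Server 2000, and the table name items and its columns are made up. In ADO.NET 2.0 the SqlBulkCopy class may be worth checking for the same job, and SQL Server 2000's DTS packages are another option.

```python
import sqlite3


def copy_table(src, dst, batch_size=1000):
    """Copy every row of `items` from src to dst in batches; return rows copied."""
    cur = src.execute("SELECT id, name FROM items ORDER BY id")
    copied = 0
    with dst:  # one transaction for the whole load: commit on success, roll back on error
        dst.execute("DELETE FROM items")  # full refresh, like the daily mirror in the post
        while True:
            rows = cur.fetchmany(batch_size)
            if not rows:
                break
            dst.executemany("INSERT INTO items (id, name) VALUES (?, ?)", rows)
            copied += len(rows)
    return copied


# Two in-memory databases stand in for the DB2 source and the SQL 2000 target.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for db in (source, target):
    db.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
source.executemany(
    "INSERT INTO items VALUES (?, ?)", [(i, f"row {i}") for i in range(5000)]
)
source.commit()

rows_copied = copy_table(source, target)
```

Most of the gain usually comes from avoiding one round trip and one commit per row, so the batch size is a tuning knob rather than a magic number.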
I want to ask if there is any option in SQL Server 2000 that can be used to import automatically every day. I have a FoxPro database which is updated every day, and I want my main table in SQL Server to be updated by importing everything again from FoxPro every night. Is there any option for that?
Many companies use DBMSs other than Microsoft SQL Server, for example Oracle, PostgreSQL, MySQL and Ingres. I'm looking for a paper or website which sets out the advantages (strengths) and disadvantages (weaknesses) of those DBMSs. Why? Our customers would like a comparison between different systems, and we must show them what Microsoft SQL Server can do.
Can you help me with this? In terms of performance, speed, security, maintenance, etc.
Hi all, I want to pull data from a MySQL DBMS into SQL Server 2005. I have written the following code, but it takes more than an hour and a half, which is not viable. Is there anything I should consider to reduce the time it takes? For your information, I am going to use SSIS packages; there is no transformation, it is a direct dump. Here is the code I am using:
SELECT * FROM OPENQUERY (Server_1, '
    SELECT t3.Column11 as Column1, Column12 as Column2, Column13 as Column3,
           Column14 as Column4, Column15 as Column5, Column16 as Column6,
           Column17 as Column7, Column18/1000 as Column8
    FROM table1 t1
    INNER JOIN table2 t2 ON t1.ColumnId = t2.columnID
    INNER JOIN Table3 t3 ON t2.columnId = t3.columnID
    WHERE t1.Column4 > Sometime
')
tblJobDates serves two purposes: to give us the most recently entered due date for a job, and to serve as a "repository" to track changes to the due date.
Report C: The report I want to generate does NOT provide historical information... it only serves to show the CURRENT due date for each job in the tblJobs table:
--------------------------------------------
COLUMNS: LocationName, Due Date (alias of DateData)
OUTPUT (csv):
Jonesport ME, 6/8/2002
Garden City NY, 6/13/2002
Note that for Jonesport, an initial due date of 6/17/2002 was entered (based on the CRD). Then someone changed it so that the job was due EARLIER.
Note that for Garden City, an initial due date of 6/12/2002 was entered (based again on the CRD). Then someone changed it so that the job was due LATER.
The "most recently entered due date" is what should be reflected in my report -- just as it is in ("C") above.
Other Notes:
-- There are other columns of information from both tables that I would like to return, but above is the most basic form of my request. Most notably, we would need to return the JobPK in report (C).
-- A job should only appear ONCE in report (C), with its "current" due date, regardless of the other due dates that may have been entered for that job.
-- If a job has no due date, it should not appear on the report.
-- Although not shown here, each row in (B) DOES have a unique identifier (DatePK) as well... if that helps in your solution.
-- Note that the job that is "due first" appears at the top of report (C). This allows a person looking at the report to quickly determine which job "gets priority" -- the one on top!
Okay gurus -- how should the query look that would generate the desired output in Report C?
THANKS IN ADVANCE if you even can point me in the right direction!!
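A "latest row per job" report like (C) is usually written with a correlated subquery that keeps only the newest tblJobDates row for each job. The sketch below is one possible shape, under assumptions the post does not confirm: that tblJobDates carries a JobPK column pointing at tblJobs, and that a higher DatePK means a more recently entered date. It runs on Python's sqlite3 with made-up rows matching the Jonesport and Garden City example; the SELECT itself uses nothing beyond joins and a subquery, so it should carry over to SQL Server with little change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblJobs (JobPK INTEGER PRIMARY KEY, LocationName TEXT);
CREATE TABLE tblJobDates (
    DatePK   INTEGER PRIMARY KEY,                  -- assumed to grow with entry order
    JobPK    INTEGER REFERENCES tblJobs(JobPK),    -- assumed link column
    DateData TEXT                                  -- ISO dates so they sort correctly
);
INSERT INTO tblJobs VALUES (1, 'Jonesport ME'), (2, 'Garden City NY'), (3, 'No Date Yet');
INSERT INTO tblJobDates VALUES
    (1, 1, '2002-06-17'),   -- Jonesport, initial due date
    (2, 2, '2002-06-12'),   -- Garden City, initial due date
    (3, 1, '2002-06-08'),   -- Jonesport, changed to EARLIER
    (4, 2, '2002-06-13');   -- Garden City, changed to LATER
""")

REPORT_C = """
SELECT j.JobPK, j.LocationName, d.DateData AS DueDate
FROM tblJobs AS j
JOIN tblJobDates AS d ON d.JobPK = j.JobPK
WHERE d.DatePK = (SELECT MAX(d2.DatePK)
                  FROM tblJobDates AS d2
                  WHERE d2.JobPK = j.JobPK)
ORDER BY d.DateData
"""

report_c = conn.execute(REPORT_C).fetchall()
```

The inner join drops jobs with no due date, and the ORDER BY puts the job that is due first on top. If DatePK does not follow entry order, an "entered on" timestamp column would be needed in its place.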
I need to decide between Standard and Enterprise Edition. (Cost is a criterion, but it's secondary to performance <!--and I am not paying for it myself-->)
The server spec under consideration: Dual Xeon, 1GB RAM, 36GB RAID 1 (Dell PowerEdge 1850).
Application: Windows 2003 Std Server, ASP.NET, MS SQL Server 2000 based data-driven web application.
Approximately 25 simultaneous clients. Peak activity would probably be 50 transactions/activities per second (2 per second per client). I expect the database size to grow up to 4GB in 1 year.
The application would use only basic OLAP features (if at all)... so feature-set wise I believe that Standard Edition is good enough.
What I am concerned about is when MS documentation says that Standard Edition is for "organization that do not require the advanced scalability, availability, performance, or analysis features of the SQL Server 2000 Enterprise Edition".
Is there a difference in performance between Std and Ent editions? In terms of the number of transactions per second that can be serviced?
What other criteria should I be aware of before deciding to go one way or the other?
Any ideas?
There must be a way to do this simply. We're running SQL Server 2000. I'm looking for some generic SQL statement that I can apply.
If I have a table with a person column and a location column and multiple records for the same person / location combination, how do I select each person with the location they most frequently visited? Say George visits Mexico 5 times, the Bahamas twice and Costa Rica once. I would have 8 records in my table for George. The data looks something like this:
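The sample rows did not survive in the post, so the sketch below makes up data that matches the George example (plus a hypothetical second person, Anna) and shows one generic way to get the most-visited location per person: count per person/location, then keep the rows whose count equals that person's maximum. It runs on Python's sqlite3 as a stand-in; the table name visits and its columns are assumptions. The query uses only GROUP BY and derived tables, which SQL Server 2000 also supports.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (person TEXT, location TEXT)")
rows = (
    [("George", "Mexico")] * 5
    + [("George", "Bahamas")] * 2
    + [("George", "Costa Rica")]
    + [("Anna", "Bahamas")] * 3      # hypothetical second person
    + [("Anna", "Mexico")]
)
conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)

TOP_LOCATION = """
SELECT c.person, c.location, c.visit_count
FROM (SELECT person, location, COUNT(*) AS visit_count
      FROM visits
      GROUP BY person, location) AS c
WHERE c.visit_count = (
    SELECT MAX(c2.visit_count)
    FROM (SELECT person, COUNT(*) AS visit_count
          FROM visits
          GROUP BY person, location) AS c2
    WHERE c2.person = c.person)
ORDER BY c.person
"""

top_locations = conn.execute(TOP_LOCATION).fetchall()
```

If a person has two locations tied for first place, this returns both rows; picking a single winner would need an extra tie-break rule.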
Please help me out: I have some records in a SqlDataSource and want to show them column-wise. For now I do it with a DataList because it's easy, but other options are open. Every item/record should have a radio button (in a group, so that you can only choose one of them all). People advised me to do this with an HTML radio button inside the template. After the user has selected an item and clicked the Next button, I need to know which item the user has chosen. Furthermore, when the user steps back, the same radio button should already be selected. Please help, this has been bothering me for a while. Best regards from The Netherlands, Gert
My company has a website that connects to a sql server (on a different box). I am trying to convince them to get sql server 2005. However, I do not know if SQL Server 2005 Workgroup edition is okay for our needs. Can someone please tell me if it is. Basically, our setup is the following:
The SQL Server will only have one/two clients - the web server
I have to store some data on a remote server (MS SQL Server 2000). The scenario is like this:
1. The web application runs on a local machine. The user (who enters the input) accesses it through the LAN.
2. The input should be stored on the remote server if the remote connection is OK; otherwise it should be saved in the local server's database (MS SQL 2000).
3. In the application's web.config there is a connection string pointing to the remote server and another (alternate) one pointing to the local server's database.
In a scenario like this I first test the remote connection. If it is not OK, I initialize the local server's connection like this:
private MyConnection()
{
    try
    {
        connectionSql = new SqlConnection(ConfigurationManager.ConnectionStrings["ConnForRemote"].ToString());
        connectionSql.Open();
    }
    catch (Exception ex)
    {
        connectionSql = new SqlConnection(ConfigurationManager.ConnectionStrings["ConnForLocal"].ToString());
    }
    finally
    {
        connectionSql.Close();
    }
    connectionSql2 = new SqlConnection(ConfigurationManager.ConnectionStrings["Temp"].ToString());
}
My problem is that when the remote connection is lost, it takes almost 1 minute to store in the local database. How can I make it more time-efficient? Thanks....
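One likely contributor to the delay is the time spent waiting for the failed remote connection attempt to give up: SqlConnection waits 15 seconds by default, and the "Connect Timeout" connection string keyword can shorten that. Another option is a quick reachability check with a short timeout before choosing which connection string to open. The sketch below shows that idea in Python purely as an illustration, not as the ASP.NET code; the function names are made up, 1433 is SQL Server's default TCP port, and the returned names mirror the web.config keys in the post.

```python
import socket


def is_reachable(host, port, timeout_seconds=2.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


def choose_connection_name(remote_host, remote_port=1433, timeout_seconds=2.0):
    """Pick which connection string to use, trying the remote server first."""
    if is_reachable(remote_host, remote_port, timeout_seconds):
        return "ConnForRemote"
    return "ConnForLocal"
```

A caller would run choose_connection_name once per save (or cache the answer for a short while) and only then open the database connection, so a dead remote server costs at most the probe timeout.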
Hello, I have a table with some data in it. What I want to do is create a query that returns one randomly chosen record from the table. Can this be done?
If this is not possible in SQL Server, I have thought of an alternative:
I want to return all rows of the table with SELECT *, but I want the select to return in the first column an auto-increment number for each row, without needing to add an auto-increment field to the table, e.g.
Table
------
Banana
Tomato
Apple
...
...
Orange
Result from select
------------------
1 Banana
2 Tomato
3 Apple
. ....
. ....
23 Orange
Can this be done? This way I can at least 1) move to the end of the results (from ASP), 2) read the ID of the last row, 3) create a random integer from 1 to the last ID, and 4) finally select the row that matches that integer.
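Both routes are workable. In SQL Server, a random row can be picked directly with SELECT TOP 1 ... ORDER BY NEWID(). The sketch below shows the same two ideas using Python's sqlite3 as a stand-in: a random ordering done by the database (SQLite's equivalent is ORDER BY RANDOM() LIMIT 1), and the numbering approach from the post done on the client side. The table name fruit and the function names are made up for the example.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruit (name TEXT)")
conn.executemany(
    "INSERT INTO fruit VALUES (?)",
    [("Banana",), ("Tomato",), ("Apple",), ("Orange",)],
)


def random_row_in_sql():
    """Let the database shuffle the rows and return the first one."""
    return conn.execute(
        "SELECT name FROM fruit ORDER BY RANDOM() LIMIT 1"
    ).fetchone()[0]


def random_row_by_number():
    """Number the rows client-side, then pick a random number (the post's plan)."""
    names = [row[0] for row in conn.execute("SELECT name FROM fruit")]
    numbered = list(enumerate(names, start=1))   # [(1, 'Banana'), (2, 'Tomato'), ...]
    pick = random.randint(1, len(numbered))      # inclusive on both ends
    return numbered[pick - 1]
```

Random ordering sorts the whole table, so on very large tables the numbering route, or a random pick over a key range, may be cheaper.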
I am purchasing a new/first server and could use some help with the details.
I am purchasing the server with the intent of managing a large database that will be quite extensive and requires a good amount of processing power. I have decided to go with Windows Server 2003 and SQL Server 2000 as the database. Within the next year I hope to have this database feeding directly into a website that I may be hosting as well, plus 2-3 offsite employees logging into the system remotely.
My biggest question is whether to choose the RAID 1 configuration or RAID 5. I want the hard drives to mirror each other. I was thinking of going with three hard drives, but I'm not really sure I would even need that setup. With that, here is my current system:
Dell PowerEdge 1800
3.0 GHz Xeon
2 GB memory
SATA 1 RAID
CERC 6-Channel SATA RAID controller
160 GB HD x 2
Onboard NIC network adapter
I'm going price-savvy on this one, so no UPS, redundant power supplies, or tape backup, although I am open to any suggestions.
Definitely appreciate any help with this, as I have been hard pressed to find quality reseller help. They just want to throw the biggest and baddest thing at me.
I would like to know the experts' views on the questions listed below.
1. Is there any significant performance gain from choosing the native SQL Server driver rather than OLEDB, for example? I know there are a lot of specific features in the native SQL driver, but I am thinking in terms of performance.
2. Why not develop for a generic database rather than a specific database?
3. Does more generic mean less work when migrating to a different database?
Appreciate your valuable thoughts and any recommendations.
Will this index be considered by the query optimiser when locking records? If I created another index on only the QuestionId field, would it boost performance? And how does the optimiser actually choose the right index during an update?
This is driving me bananas. Can't find any info on this anywhere....
SQL 2000 seems to replace a double space with a single space. When I set a varchar field to "  " (2 spaces), it only stores " " (1 space). Why on earth would Microsoft do this? If I save 2 spaces - I WANT TO SEE 2 SPACES!!!!
Can anyone help? Is this a database setting? Is this due to using varchar?
Any help appreciated.
Colin Hale
I need to choose a database based on the following criteria (using a .NET app):
1) a light but fully functional database, preferably with support for stored procedures and constraints; fewer than 8000 transactions a day
2) portable, or the database can be exported/imported very easily
3) reliable and stable
4) least maintenance
I have two databases in mind, Access and MSDE. Does anyone have hands-on experience with these two? Or any better suggestions?
Hello, I am really dripping wet behind the ears on this and would really appreciate some help. I am setting up my first SQL table and am lost at trying to choose data types for my fields. Basically, all I am doing is setting up a contact form. It is going to ask for phone number, name, address, city, state, zip, etc. I will also have two fields which if I were using an Access db, would be "memo" with say, 500 characters. So in researching SQL data types, I came across the following:
char Fixed-length non-Unicode character data with a maximum length of 8,000 characters.
varchar variable-length non-Unicode data with a maximum of 8,000 characters.
text Variable-length non-Unicode data with a maximum length of 2^31 - 1 (2,147,483,647) characters.
nchar Fixed-length Unicode data with a maximum length of 4,000 characters.
Can someone shed some light on what I need for simple fields like street, name, city, and more importantly, description? I will also have a "premium" field which should be a "yes" or "no". I am thinking a data type of bit, which is set to 1 or 0? Thanks for any help, I appreciate it so much. Tom
Hello group:
I've done a lot of reading on this subject and have found that many people have many different opinions on it. My question centers mainly around using a lookup table to enable users to select from a pre-defined list of values.
I have developed a practice myself of avoiding AutoNumber type data fields for primary keys where the primary key will be related to a child table. Nevertheless, what do most users do with lookup tables? My thoughts are to create a small key value for each value in the lookup table. For example:
I might have a Carriers table which shows a list of carriers that I might ship an order by. One of the entries may be 'Air Freight - Overnight', or 'Air Freight - 2nd Day Air'. I've seen a few examples where the primary key field for entries like these would be an autonumber, or at least a numeric value. What I like to do is create my own key: for 'Air Freight - Overnight' I might use 'AFO', and for 'Air Freight - 2nd Day Air' I might use 'AF2'. Any thoughts on this? Mine are that even though the users may never see this value, I, as the developer, will see it, and I tend to prefer a key value based on real data that means something, rather than an auto-incremented number. In the well-known Northwind.mdb database, I noticed their Categories table used a number field value, like 1, 2, 3, etc., but their Customers table used values like 'ALFKI' for its keys.
What are some other thoughts out there? I'm working with Access currently, but this project is about to move to SQL Server.
James
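To make the trade-off concrete, here is a small sketch of the short-code approach described above, run on Python's sqlite3 as a stand-in for Access or SQL Server. The Orders table and the column names are assumptions, and 'AIR1' is a made-up replacement code. It shows the benefit (child rows are readable without a join) and the main cost (changing a meaningful key has to ripple through every child row, here via ON UPDATE CASCADE).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE Carriers (
    CarrierCode TEXT PRIMARY KEY,          -- developer-chosen key such as 'AFO'
    Description TEXT NOT NULL UNIQUE
);
CREATE TABLE Orders (
    OrderID     INTEGER PRIMARY KEY,
    CarrierCode TEXT NOT NULL
        REFERENCES Carriers(CarrierCode) ON UPDATE CASCADE
);
INSERT INTO Carriers VALUES
    ('AFO', 'Air Freight - Overnight'),
    ('AF2', 'Air Freight - 2nd Day Air');
INSERT INTO Orders VALUES (1, 'AFO'), (2, 'AF2'), (3, 'AFO');
""")

# Benefit: the child table is readable on its own, no join needed.
orders_before = conn.execute(
    "SELECT OrderID, CarrierCode FROM Orders ORDER BY OrderID"
).fetchall()

# Cost: renaming a meaningful key rewrites every child row that uses it.
conn.execute("UPDATE Carriers SET CarrierCode = 'AIR1' WHERE CarrierCode = 'AFO'")
orders_after = conn.execute(
    "SELECT OrderID, CarrierCode FROM Orders ORDER BY OrderID"
).fetchall()
```

With an autonumber key the rename would touch only the Carriers row, which is the usual argument on the other side.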
I've got a couple of questions related to partitioning tables.
- What criteria does the Database Engine follow when two NDF files are assigned to one filegroup and that filegroup is part of a partition? What's more, could I force SQL Server to use one of them by default?
I mean, my first partition covers 20020101 through 20030101. When I add data for, say, March or June, could I decide that those months go to NDF1 rather than NDF2?
Let's assume database A is production and B is a copy. SQL Server 2005 SP2, SQL CE 3.5.
Database A has a variety of transactions against it 24x7.
Database B (the copy) is for reporting and serves as a source of merge replication for SQL CE instances.
Merge replication and reporting are used 24x7 as well.
I have the following requirements:
Maintain an up-to-date copy of the production database (it need not be up to the minute; it could be an hourly or even daily update).
Database B is read-only. The merge replication is NOT bi-directional.
Here is the caveat (which I think rules out some solutions to this problem): The production application accomplishes much of its functionality with in-memory copies of records. I have no control over the production application. When it works against the database, it does a sort of 'withdrawal-deposit' scenario (to the best of my knowledge it's not using SQL Server transactions). So, for every record it works with, a copy is read out of the database, changes are made in memory, the database record is deleted, and then the record is re-inserted.
With this kind of behavior in db A, I'm not sure what it would do to log-shipping or transactional replication. I do know that I want to minimize the changes required at the SQL CE instances to keep the sync operation to a minimal cost.
I am choosing a primary key for a database which I am designing.
I have a few tables which contain 5-15 fields, of which 3-9 columns combine to make a row unique.
All are unrelated tables. Three parent tables connect with 20 child tables that are not related to one another.
I believe it would not be wise to use 3 to 9 fields as the primary key. But if I use an auto-increment column as the key, will it be of any use, given that it might never be used to fetch rows? Why would I still have to go with that?
Or is it OK to create a primary key of up to 5 attributes?
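A middle road that is often suggested is to use a narrow auto-increment column as the primary key (so child tables carry one small foreign key column) and still declare the natural columns unique, so the rule that makes a row unique stays enforced. The sketch below shows that combination on Python's sqlite3 as a stand-in; the shipment table and its three natural-key columns are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shipment (
    shipment_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key for child tables
    region      TEXT NOT NULL,
    depot       TEXT NOT NULL,
    ship_date   TEXT NOT NULL,
    quantity    INTEGER,
    UNIQUE (region, depot, ship_date)               -- the natural key is still enforced
);
""")

INSERT_SQL = (
    "INSERT INTO shipment (region, depot, ship_date, quantity) VALUES (?, ?, ?, ?)"
)


def try_insert(row):
    """Insert one row in its own transaction; return False if it breaks uniqueness."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(INSERT_SQL, row)
        return True
    except sqlite3.IntegrityError:
        return False


first_accepted = try_insert(("north", "D1", "2007-01-05", 10))
duplicate_accepted = try_insert(("north", "D1", "2007-01-05", 99))  # same natural key
other_accepted = try_insert(("north", "D2", "2007-01-05", 5))
row_count = conn.execute("SELECT COUNT(*) FROM shipment").fetchone()[0]
```

The surrogate earns its keep mainly in joins and foreign keys; if no other table ever references these rows, a composite primary key on the natural columns does the same job with one index fewer.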
I am a newbie to datamining, but have nearly a decade of solid database experience with the last 6 years in SQL Server 2000. We are moving our accounting system to SQL Server 2005 and I have been asked to explore the possibilities of mining an inventory table. I'd like to get some opinions prior to spending too much time potentially barking up the wrong tree!
We have an inventory table with approximately 10 million serialized records. Each row contains the serial number of the individual unit and its manufacturer/model designation. We have no control over the assigning of the serial numbers as they come from multiple manufacturers and some of the manufacturers correlate serial numbers to model and some don't.
My thought was to use a cluster model to try to predict the model of a new serial number as it is entered into the database. Is this idea feasible? Is the choice of mining model appropriate? If pointed in the right direction, I'm sure that I can run with this.
Hi, I have a query: SELECT Dur1.rootId FROM DurableEventTab Dur1 WHERE (Dur1.dev_ReferenceClusterRoot = 'iyrwd.52') AND (Dur1.dev_Action = 'Order:Ordered') AND (Dur1.dev_Active = 1) AND (Dur1.dev_PurgeState = 0) AND (Dur1.dev_PartitionNumber = 0)
This table has a primary key, aribapk11, and indexes on dev_ReferenceClusterRoot, dev_Action and dev_PurgeState.
Now when I run this query, the execution plan is actually doing a Clustered Index Scan on the PK (aribaPK11). What I was expecting was an index seek on the index defined on dev_ReferenceClusterRoot. Please note that the index seek is the behaviour in SQL Server 2000.
Any idea what is going wrong ?
Clustered Index Scan(OBJECT:([typhoon1902].[dbo].[DurableEventTab].[AribaPK7] AS [Dur1]), WHERE:([typhoon1902].[dbo].[DurableEventTab].[dev_Active] as [Dur1].[dev_Active]=(1.) AND [typhoon1902].[dbo].[DurableEventTab].[dev_PurgeState] as [Dur1].[dev_PurgeState]=(0) AND [typhoon1902].[dbo].[DurableEventTab].[dev_PartitionNumber] as [Dur1].[dev_PartitionNumber]=(0) AND [typhoon1902].[dbo].[DurableEventTab].[dev_ReferenceClusterRoot] as [Dur1].[dev_ReferenceClusterRoot]='iyrwd.52' AND [typhoon1902].[dbo].[DurableEventTab].[dev_Action] as [Dur1].[dev_Action]=N'Order:Ordered')) 0 0 Clustered Index Scan Clustered Index Scan OBJECT:([typhoon1902].[dbo].[DurableEventTab].[AribaPK7] AS [Dur1]), WHERE:([typhoon1902].[dbo].[DurableEventTab].[dev_Active] as [Dur1].[dev_Active]=(1.) AND [typhoon1902].[dbo].[DurableEventTab].[dev_PurgeState] as [Dur1].[dev_PurgeState]=(0) AND [typhoon1902].[dbo].[DurableEventTab].[dev_PartitionNumber] as [Dur1].[dev_PartitionNumber]=(0) AND [typhoon1902].[dbo].[DurableEventTab].[dev_ReferenceClusterRoot] as [Dur1].[dev_ReferenceClusterRoot]='iyrwd.52' AND [typhoon1902].[dbo].[DurableEventTab].[dev_Action] as [Dur1].[dev_Action]=N'Order:Ordered') [Dur1].[rootId] 1 0.00386574 0.0002263 71 0.00409204 [Dur1].[rootId] PLAN_ROW 0 1
Is there a recommended file format (csv, xml, txt) when choosing a file destination for SSIS? Does the file format affect loading performance? Let's say I have chosen a .csv file as my destination (it has 15 million rows and 50 columns, 2 of them bigint and the rest binary(32)) and later on I need to reload it into a table using SSIS. Is csv faster than, e.g., xml when reloading? Does it have any performance impact at all?
I have created a few packages and I want to execute them in sequence, so I created a wrapper/parent package and added all the other packages as child packages using the Execute Package Task. These are file system based packages. I am executing the wrapper/parent package from a web page, which executes all the child packages. All is well and works fine when I choose "Supported" as the TransactionOption in my wrapper/parent package, but when I choose "Required" I get the following error:
Error Occurred: The package is failed due to following: The SSIS Runtime has failed to enlist the OLE DB connection in a distributed transaction with error 0x8004D024 "The transaction manager has disabled its support for remote/network transactions.".
What I am doing is connecting to 3 databases on the same server and doing some data manipulation. MSDTC is running on the target SQL Server, and the DTC service on my local machine is also started and running. What else could be the problem?