SQL Server 2012 :: Select Large Data From Multiple Tables
May 10, 2014
In a Library Management database we have these tables:
1) Document (DocNo, Doc_type, permalink, inDate)
2) Title (id, DocNo, Main_Title, Other_Title)
3) Author (id, Author_Name, Author_Family, Type -- like: main author, translator, ...)
4) Publisher (id, DocNo, Name, Publisedate, address)
5) Subject (id, DocNo, Subject)
6) Description (id, DocNo, ISBN, description) -- one document may have several ISBNs, etc.
The Document table has 500,000 records.
I want to search for a word across these tables. For example, I want to search for 'Computer'; this word may appear in the subject, the title, the description, etc. How can I do this with the best performance?
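For a word search like this, LIKE '%Computer%' forces full scans, so full-text indexing is usually the better-performing route. A minimal sketch, assuming the schema above and that single-column unique key indexes (PK_Title, PK_Subject, PK_Description) already exist; the catalog name is made up:

CREATE FULLTEXT CATALOG LibraryCatalog;

CREATE FULLTEXT INDEX ON Title (Main_Title, Other_Title)
    KEY INDEX PK_Title ON LibraryCatalog;
CREATE FULLTEXT INDEX ON Subject (Subject)
    KEY INDEX PK_Subject ON LibraryCatalog;
CREATE FULLTEXT INDEX ON Description (description)
    KEY INDEX PK_Description ON LibraryCatalog;

-- Find every document that mentions 'Computer' in any of the indexed columns.
SELECT DocNo FROM Title       WHERE CONTAINS((Main_Title, Other_Title), 'Computer')
UNION
SELECT DocNo FROM Subject     WHERE CONTAINS(Subject, 'Computer')
UNION
SELECT DocNo FROM Description WHERE CONTAINS(description, 'Computer');

At 500,000 documents, CONTAINS resolves against the full-text index instead of scanning the tables, which is where the performance win comes from.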
Hello. Currently we have a database, and it is our desire for it to be able to store millions of records. The data in the table can be divided up by client, and it stores nothing but about 7 integers:

| table |
| id | clientId | int1 | int2 | int3 | ... |

Right now, our benchmarks indicate a drastic increase in performance if we divide the data into different tables, for example table_clientA, table_clientB, table_clientC, despite the fact that the tables contain the exact same columns. This, however, does not seem very clean or elegant to me, and rather illogical, since a database exists as a single file on the hard drive.

| table_clientA | | id | clientId | int1 | int2 | int3 | ... |
| table_clientB | | id | clientId | int1 | int2 | int3 | ... |
| table_clientC | | id | clientId | int1 | int2 | int3 | ... |

Is there any way to duplicate this increase in database performance gained by splitting the table, perhaps by using a certain type of index?

Thanks,
Jeff Brubaker, Software Developer
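One way to get the same physical locality without per-client tables is a clustered index that leads on clientId (or, on editions that support it, table partitioning on clientId). A minimal sketch with made-up names:

-- One table for everyone; clustering on clientId stores each client's rows
-- contiguously, which is what the per-client tables achieved physically.
CREATE TABLE dbo.ClientData
(
    id       int IDENTITY(1,1) NOT NULL,
    clientId int NOT NULL,
    int1     int NOT NULL,
    int2     int NOT NULL,
    int3     int NOT NULL,
    CONSTRAINT PK_ClientData PRIMARY KEY NONCLUSTERED (id)
);

CREATE CLUSTERED INDEX CIX_ClientData_Client
    ON dbo.ClientData (clientId, id);

-- A query for one client now reads a single contiguous range of the index:
SELECT int1, int2, int3 FROM dbo.ClientData WHERE clientId = 42;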
I need to consume a live data feed from a golf tournament. And by consume, I really mean insert (merge) it into our own SQL Server database at regular intervals as the tournament progresses. This site didn't let me upload an XML file, but you can see a sample of the data feed here: URL....
I need to insert this data into 2 tables, Player_Holes and Player_Shots. But while doing the insert, I need to look up several things, such as matching our player ID to theirs via an external_id against the players table, translating the shot types, and some other logic about the process overall.
The columns in my Player_Holes table are: id, player_id, hole_id, round, shots (a total number of strokes), and date_created/date_modified. The Player_Shots table is similar: id, player_id, hole_id, round, shot_number, shot_type_id, club, distance, date_created/date_modified.
The only way I know how to do it is inefficient. I would parse the XML in ColdFusion (please, no comments on ColdFusion; that's what we use for webdev), then loop over it and do inserts for each player, each hole of each round, and the shots would probably be separate for each hole.
It would be so much better and more efficient if I could do it in SQL directly. I've done some research and SQL Server Data Tools looks promising. I've never used it, so I would have to learn it, but I'm also not sure it would work in this application, where we want to run it as a scheduled task every few minutes.
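For what it's worth, T-SQL can shred the XML itself with the xml type's nodes() method, which turns the ColdFusion loop into one set-based insert and makes the player-ID lookup an ordinary join. The feed is not shown above, so every element and attribute name below is a guess, and the shot_types lookup table is likewise assumed; this is only a sketch of the pattern:

DECLARE @feed xml = N'<feed>...</feed>';  -- the downloaded feed (structure assumed)

-- One row per shot; hypothetical element/attribute names throughout.
INSERT INTO Player_Shots (player_id, hole_id, round, shot_number, shot_type_id, date_created)
SELECT p.player_id,
       s.x.value('@hole',   'int'),
       s.x.value('@round',  'int'),
       s.x.value('@number', 'int'),
       st.shot_type_id,
       GETDATE()
FROM @feed.nodes('/feed/player') AS pl(x)
CROSS APPLY pl.x.nodes('shot') AS s(x)
INNER JOIN players AS p
        ON p.external_id = pl.x.value('@id', 'varchar(20)')
INNER JOIN shot_types AS st
        ON st.external_code = s.x.value('@type', 'varchar(20)');

Wrapped in a stored procedure, something like this can run as a SQL Agent job every few minutes, which covers the scheduling requirement without SSDT.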
I have a source table in the staging database stg.fact and it needs to be merged into the warehouse table whs.Fact.
stg.fact is not a delta feed; it is basically an intra-day refresh.
Both tables have a last-updated date, so it is easy to see which rows have changed.
It is new (insert) or changed (update) data that I am interested in; there are no deletions.
As there could be millions of rows to insert or update, this needs to be efficient.
I expect whs.Fact to grow to more than 150 million rows.
When I have done this before, I started with the T-SQL MERGE statement, and that was not performant once I got to this size.
My original plan was to do this in SSIS with a Lookup task that marks the inserts and updates and deals with them separately. However, when I set up the Lookup transformation, the reference data set needed a package variable in the SQL command, and that does not seem possible with the Lookup in 2012! I am currently looking at the Merge Join transformation and at any clever basic T-SQL that could work; this needs to be fast, and that is where I think T-SQL may be the better route.
Both tables will have more than 100,000,000 rows. Both tables have the last-updated date. The tables are in different databases but on the same SQL instance. Each table holds 5 integer columns, one varchar, and one datetime.
Last time I used MERGE it was on a wider table with lots of columns, so I do not know if this would be an option.
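For what it's worth, at this scale many people fall back from MERGE to two plain set-based statements driven by the last-updated date, optionally chunked into key ranges to keep the log manageable. A minimal sketch; the database names, the key column Id, and the other column names are all placeholders:

-- Update rows that already exist in the warehouse and are newer in staging.
UPDATE w
SET    w.IntCol1 = s.IntCol1,            -- ...and the remaining columns
       w.LastUpdated = s.LastUpdated
FROM   Warehouse.whs.Fact AS w
INNER JOIN Staging.stg.fact AS s ON s.Id = w.Id
WHERE  s.LastUpdated > w.LastUpdated;

-- Insert rows the warehouse has never seen.
INSERT INTO Warehouse.whs.Fact (Id, IntCol1, LastUpdated)
SELECT s.Id, s.IntCol1, s.LastUpdated
FROM   Staging.stg.fact AS s
WHERE  NOT EXISTS (SELECT 1 FROM Warehouse.whs.Fact AS w WHERE w.Id = s.Id);

With a supporting index on the key plus the last-updated date, both statements stay index-driven, and they avoid the per-row overhead that tends to hurt MERGE and SSIS lookups at the 100-million-row mark.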
Hi, I have a number of related tables: RGData is related to RGCrossReference, RCPPositionData is related to RCPCrossReference, and RGCrossReference is also related to RCPCrossReference. The data is returned correctly from these tables. However, I also want to return data from another table, RComments. How do I do this? RComments is related to either RGData or RCPPositionData only. Thanks.
Code Snippet
SELECT cm.CommentImage AS ViewComment,
       gd.PositionID AS GPositionID,
       cd.UniquePositionID AS CPPositionID
FROM RGData gd
INNER JOIN RGCrossReference g ON g.GPositionID = gd.PositionID
INNER JOIN RCPCrossReference c ON c.GMatchID = g.GMatchID
INNER JOIN RCPPositionData cd ON cd.UniquePositionID = c.CPPositionID
LEFT OUTER JOIN RComments cm
       ON ((cm.CPPositionID = cd.UniquePositionID) OR (cm.GPositionID = gd.PositionID))
      AND cm.CommentsDate = (SELECT MAX(CommentsDate) AS Expr1
                             FROM RComments
                             WHERE (GPositionID = g.GPositionID))
WHERE (cd.Quantity != gd.Quantity OR cd.Currency != gd.Currency)
  AND g.ForcedMatch = 'no';
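One way to make the "latest comment from either side" requirement explicit is OUTER APPLY, which picks the newest RComments row per result row and sidesteps the OR-plus-MAX interaction in the join above. A sketch reusing the same joins:

SELECT gd.PositionID AS GPositionID,
       cd.UniquePositionID AS CPPositionID,
       cm.CommentImage AS ViewComment
FROM RGData gd
INNER JOIN RGCrossReference g ON g.GPositionID = gd.PositionID
INNER JOIN RCPCrossReference c ON c.GMatchID = g.GMatchID
INNER JOIN RCPPositionData cd ON cd.UniquePositionID = c.CPPositionID
OUTER APPLY (SELECT TOP (1) r.CommentImage
             FROM RComments r
             WHERE r.CPPositionID = cd.UniquePositionID
                OR r.GPositionID = gd.PositionID
             ORDER BY r.CommentsDate DESC) AS cm
WHERE (cd.Quantity != gd.Quantity OR cd.Currency != gd.Currency)
  AND g.ForcedMatch = 'no';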
I am fetching a large amount of data from Teradata into SQL Server using a linked server. I am hitting the error below:
A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.)
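The usual workaround for transport-level timeouts on big linked-server pulls is to fetch in key ranges, so that no single result set runs long enough to trip the timeout. A sketch; the linked server, table, and column names are all made up, and the four-part name would need adjusting to the actual Teradata setup:

DECLARE @from bigint = 0, @step bigint = 500000, @max bigint;

SELECT @max = maxid
FROM OPENQUERY(TERADATA_LINK, 'SELECT MAX(id) AS maxid FROM remote_db.big_table');

WHILE @from < @max
BEGIN
    INSERT INTO dbo.LocalCopy (id, col1, col2)
    SELECT id, col1, col2
    FROM TERADATA_LINK..remote_db.big_table
    WHERE id > @from AND id <= @from + @step;

    SET @from = @from + @step;
END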
I need to get this data into a SQL table in the following form so I can use it to further manipulate the data and update several other tables. I am thinking that UNPIVOT or CROSS APPLY might be the way to go, but I am not sure how to code it.
OrderID  ControlName
1        Row1Column1  (means Pant in Red was selected by the user; relation to the Child2 table)
1        Row3Column1  (means Gown in Blue was selected by the user; relation to the Child2 table)
1        Row4Column3  (means T Shirt in White was selected by the user; relation to the Child2 table)
2        Row1Column2  (means Tie in Green was selected by the user; relation to the Child2 table)
2        Row3Column1  (means Bow in Red was selected by the user; relation to the Child2 table)
Child2 Table
PackageID  Product  Color1  Color2  Color3
1          Pant     Red     Green   Blue
1          Shirt    Blue    Pink    Purple
1          Gown     Blue    Black   Yellow
1          T Shirt  Red     Green   White
2          Tie      Red     Green   White
2          Socks    Red     Green   White
2          Bow      Red     Green   White
We want to have a result like:

OrderID  PackageID  CustomerName  Pant  Gown  T Shirt  Tie    Bow
1        1          ABC           Red   Blue  White    x      x
2        2          Bcd           x     x     x        Green  Red
I have tried:

;WITH mycte AS
(
    SELECT ms.OrderID, ms.PackageID, ms.CustomerName,
           REPLACE(STUFF([ControlName], CHARINDEX('Column', ControlName), LEN(ControlName), ''), 'Row', '') AS rowNum,
           REPLACE(STUFF([ControlName], 1, CHARINDEX('Column', ControlName) - 1, ''), 'Column', '') AS columnNum
    FROM child1 c
    INNER JOIN MasterTable ms ON c.Orderid = ms.orderid
)
[code]....
It works if we have a product in one color only; if we have, say, the pant in both red and blue, it shows just the first record.
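The rest of the query is cut off above ([code]....), but building on the same CTE, one way to handle multiple colors per product is to number the Child2 rows per package and resolve the color column with a CASE before pivoting. A sketch; the ORDER BY inside ROW_NUMBER is a placeholder, since reproducing Row1/Row2/... reliably needs a real ordering column in Child2:

;WITH mycte AS
(
    SELECT ms.OrderID, ms.PackageID, ms.CustomerName,
           REPLACE(STUFF(ControlName, CHARINDEX('Column', ControlName), LEN(ControlName), ''), 'Row', '') AS rowNum,
           REPLACE(STUFF(ControlName, 1, CHARINDEX('Column', ControlName) - 1, ''), 'Column', '') AS columnNum
    FROM child1 c
    INNER JOIN MasterTable ms ON c.Orderid = ms.orderid
),
products AS
(
    SELECT PackageID, Product, Color1, Color2, Color3,
           ROW_NUMBER() OVER (PARTITION BY PackageID ORDER BY (SELECT 0)) AS rowNum
    FROM Child2
)
SELECT m.OrderID, m.PackageID, m.CustomerName, p.Product,
       CASE m.columnNum WHEN '1' THEN p.Color1
                        WHEN '2' THEN p.Color2
                        WHEN '3' THEN p.Color3 END AS SelectedColor
FROM mycte m
INNER JOIN products p
        ON p.PackageID = m.PackageID
       AND p.rowNum = TRY_CAST(m.rowNum AS int);

Pivoting that result to one row per order (the Pant/Gown/T Shirt/Tie/Bow layout) is then a separate step, e.g. with MAX(CASE WHEN Product = ... END) per product column.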
I have a dataset where I want to select the records that match my input values, but I only want to try matching a field in my dataset against the input value if the dataset value is not NULL.
I always submit all 4 input values:
@Tyreid, @CarId, @RegionId, @CarAgeGroup
So for the first record in the dataset, I get a successful match if my input values match RegionId and CarAgeGroup.
I can't figure out how to write the SQL for this SELECT.
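The standard pattern for this is to treat a NULL column value as a wildcard: each predicate passes either when the stored value is NULL or when it equals the parameter. A minimal sketch against a hypothetical dbo.Rules table with matching column names:

SELECT *
FROM dbo.Rules AS r
WHERE (r.TyreId      IS NULL OR r.TyreId      = @Tyreid)
  AND (r.CarId       IS NULL OR r.CarId       = @CarId)
  AND (r.RegionId    IS NULL OR r.RegionId    = @RegionId)
  AND (r.CarAgeGroup IS NULL OR r.CarAgeGroup = @CarAgeGroup);

For the first record described above, TyreId and CarId would be NULL, so only RegionId and CarAgeGroup actually have to match.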
I have a database, Library, which has a lot of tables; we need 3 of them for the query:
Table Librarians: field ID, field Surname. Table StudentCard: field ID, a foreign key to table Librarians, and other fields which we don't use. Table TeacherCard: field ID, a foreign key to table Librarians, and other fields which we don't use.
Query: select the surname of the librarian who issued the greatest number of books.
I know how to solve it when I take data from only one table, e.g. TeacherCard:
SELECT TOP 1 WITH TIES Librarians.LastName, MAX(Librarians.CountOfBooks) AS Books
FROM (SELECT L.LastName, COUNT(*) AS CountOfBooks
      FROM Libs L, T_Cards T
      WHERE T.Id_Lib IN (SELECT L.Id)
      GROUP BY L.LastName) AS Librarians
GROUP BY Librarians.LastName
ORDER BY MAX(Librarians.CountOfBooks) DESC
GO
I don't know how to use data from TeacherCard and from StudentCard at the same time.
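One way is to stack both card tables with UNION ALL before grouping, so every issued book counts no matter which card table it came from. A sketch, assuming the foreign-key column to Librarians is called Id_Lib in both card tables:

SELECT TOP (1) WITH TIES L.Surname, COUNT(*) AS CountOfBooks
FROM Librarians AS L
INNER JOIN (SELECT Id_Lib FROM TeacherCard
            UNION ALL
            SELECT Id_Lib FROM StudentCard) AS Cards
        ON Cards.Id_Lib = L.ID
GROUP BY L.Surname
ORDER BY COUNT(*) DESC;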
I have a database which has 25 tables; all the tables have a productid column. I need to find the total number of records for productid = 20003 across all the tables in the database.
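A sketch using dynamic SQL built from the catalog views, so you do not have to type 25 UNIONs by hand; it counts only in tables that actually have a productid column:

DECLARE @sql nvarchar(max) = N'';

SELECT @sql = @sql + N' UNION ALL SELECT ''' + s.name + N'.' + t.name
            + N''' AS TableName, COUNT(*) AS TotalRecords FROM '
            + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name)
            + N' WHERE productid = 20003'
FROM sys.tables AS t
INNER JOIN sys.schemas AS s ON s.schema_id = t.schema_id
WHERE EXISTS (SELECT 1 FROM sys.columns AS c
              WHERE c.object_id = t.object_id AND c.name = 'productid');

SET @sql = STUFF(@sql, 1, 11, N'');   -- strip the leading ' UNION ALL '
EXEC sys.sp_executesql @sql;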
Here is the desired counted output: I would like to pull distinct Date, MachineNumber, and TestName and then count how many times that combination occurs in the raw data. I do need to cast the date, because right now it is in a datetime format and I only need the date part.
I am pulling three columns with the same names from 8 different tables. I need to display the date, machine, and test name, and count how many times a test was run on a machine on that date. I have a feeling this could be handled by SSAS, but I haven't built an analysis cube yet because I am unfamiliar with how they work. I was wondering if this is possible in a simple query. I tried to set something up in a #Temp table; the problem is that the query takes forever to run, because I am dealing with 1.7 million rows, and doing an INSERT INTO #temp SELECT columnA, columnB, columnC from 8 different tables takes a while.
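A sketch of doing it in one pass without the temp table; the table and column names are placeholders, and CAST(... AS date) strips the time portion:

SELECT CAST(t.TestDate AS date) AS TestDate,
       t.MachineNumber,
       t.TestName,
       COUNT(*) AS TimesRun
FROM (SELECT TestDate, MachineNumber, TestName FROM dbo.Results1
      UNION ALL
      SELECT TestDate, MachineNumber, TestName FROM dbo.Results2
      -- ...repeat for the remaining six tables
     ) AS t
GROUP BY CAST(t.TestDate AS date), t.MachineNumber, t.TestName;

Grouping directly over the UNION ALL avoids materializing 1.7 million rows into a temp table before the aggregation even starts.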
I am trying to run an update statement against a vendor's database that houses HR information. If I run a regular select statement against the database with the following query, it returns without error:
SELECT "QUDDAT_DATA"."QUDDAT-INT", "NAME"."INTERNET-ADDRESS", "QUDDAT_DATA"."QUDFLD-FIELD-ID", "QUDDAT_DATA"."QUDTBL-TABLE-ID" FROM "SKYWARD"."PUB"."NAME" "NAME", "SKYWARD"."PUB"."QUDDAT-DATA" "QUDDAT_DATA" WHERE ("NAME"."NAME-ID"="QUDDAT_DATA"."QUDDAT-SRC-ID") AND "QUDDAT_DATA"."QUDTBL-TABLE-ID"=0 AND "QUDDAT_DATA"."QUDFLD-FIELD-ID"=16 AND "QUDDAT_DATA"."QUDDAT-INT"=11237When I try to convert it into an
[Code] ....
I am assuming I am receiving this error because it does not know where to find QUDDAT-INT. How can I fix that?
The "QUDDAT-INT" column houses the employee number, so in the SELECT query above I am testing against a specific employee number.
How can we replace multiple values in a single SELECT statement? I have to build the output based on values stored in a table. Please see below for the sample input and expected output.
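The sample input and output did not survive the copy, but the usual single-statement approach is nesting one REPLACE per value to substitute. A minimal sketch with made-up values:

SELECT REPLACE(REPLACE(REPLACE(SomeColumn, 'A', 'Apple'),
                               'B', 'Banana'),
                               'C', 'Cherry') AS Replaced
FROM dbo.SomeTable;

When the substitution pairs live in a table, the nested expression can be generated with dynamic SQL from that table instead of being hand-written.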
Other than right-clicking on each individual table in SSMS and generating a CREATE script, is there a simple way to generate CREATE TABLE scripts for tables within a given database?
Background: I have a bunch of tables in one database, and I would like to add tables to a second database that have the same names and basic structures as some of the tables from the first database.
I do not need to transfer any data from the tables; this is a separate project that will use a similar data structure. I just want to generate the CREATE TABLE scripts for the 30 or so tables within the first database, and then I'll tweak the scripts as appropriate and run them against the new database.
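In SSMS, Tasks > Generate Scripts on the database node scripts all (or a chosen subset of) tables in one go, which covers the 30-table case. If a pure T-SQL route is preferred, the catalog views can emit rough skeletons; the sketch below deliberately ignores keys, defaults, identity, and decimal precision, so it is a starting point to tweak, not a faithful scripter:

SELECT 'CREATE TABLE ' + QUOTENAME(s.name) + '.' + QUOTENAME(t.name) + ' ('
     + STUFF((SELECT ', ' + QUOTENAME(c.name) + ' ' + ty.name
                   + CASE WHEN ty.name IN ('varchar', 'nvarchar', 'char', 'nchar')
                          THEN '(' + CASE WHEN c.max_length = -1 THEN 'max'
                                          ELSE CAST(c.max_length
                                               / CASE WHEN ty.name IN ('nvarchar', 'nchar') THEN 2 ELSE 1 END
                                               AS varchar(10)) END + ')'
                          ELSE '' END
                   + CASE WHEN c.is_nullable = 1 THEN ' NULL' ELSE ' NOT NULL' END
              FROM sys.columns AS c
              INNER JOIN sys.types AS ty ON ty.user_type_id = c.user_type_id
              WHERE c.object_id = t.object_id
              ORDER BY c.column_id
              FOR XML PATH('')), 1, 2, '')
     + ');' AS CreateScript
FROM sys.tables AS t
INNER JOIN sys.schemas AS s ON s.schema_id = t.schema_id;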
I have created a trigger that is set off every time a new item is added to TableA. The trigger then inserts 4 rows into TableB, which contains two columns (item, task type).
Each row will have the same item but a different task type, i.e.:
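The example itself is missing above, but a set-based version of such a trigger cross-joins the inserted pseudo-table with the four task types, so it also behaves correctly when several items are added in one statement. A sketch with assumed table, column, and task-type names:

CREATE TRIGGER trg_TableA_Insert
ON TableA
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- One TableB row per new item per task type; the type names are placeholders.
    INSERT INTO TableB (item, task_type)
    SELECT i.item, t.task_type
    FROM inserted AS i
    CROSS JOIN (VALUES ('Review'), ('Approve'), ('Ship'), ('Archive')) AS t(task_type);
END;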
I have a problem where my users complain that a SELECT statement takes too long, at 90 seconds, to read 120 records out of a database. The SELECT reads from 9 tables, three of which contain 1,000,000 records; the others contain between 100 and 250,000 records.

I have checked that each column in the joins is indexed; they are (though some of them are clustered indexes, not nonclustered). I have run the SQL Profiler trace from the run of the query through the Database Engine Tuning Advisor. That just suggested two statistics items, which I added (no benefit), and two indexes for tables that are not involved in the query at all (I didn't add these). I also ran the query through the query window in SSMS with "Include Actual Execution Plan" enabled. This showed that all the execution time was being taken up by searches of the clustered indexes.

I have tried running the SELECT with just three tables involved, and it completes fast. I added a fourth and it took 7 seconds; however, there was no WHERE clause for the fourth table, so I got a Cartesian product, which might have explained the problem.

So my question is: is it normal for this type of read query to take 90 seconds to complete? Is there anything I could do to speed it up? Any other thoughts? Thanks
1. Take a subset of data from about 100 tables that have multiple references to other tables in this group of 100, from a first DB.
2. Insert the above data into a second DB, a database that already has data in the 100 tables, while maintaining the correct references.
As a general approach, the best way I can think of doing this is as follows:
1. Create mapping tables for every ID that is referenced in a different table (OldID, NewID).
2. Insert the old data into the new table and output the OldID and NewID into the mapping table (see the sketch after this list).
3. Use that mapping data to make sure all tables that use those IDs have the new IDs in DB2.
This approach is extremely labor-intensive, both in the initial implementation and in the fairly substantial amount of work it would take to maintain going forward.
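Step 2 can at least be collapsed into one statement per table: MERGE with an always-false match condition lets the OUTPUT clause see both the source's old ID and the newly generated identity, which a plain INSERT ... SELECT cannot do. A sketch against a hypothetical Customer table (all names are placeholders):

DECLARE @CustomerMap TABLE (OldID int, NewID int);

MERGE INTO DB2.dbo.Customer AS tgt
USING DB1.dbo.Customer AS src
   ON 1 = 0                                -- never matches, so every row inserts
WHEN NOT MATCHED THEN
    INSERT (Name, Address)
    VALUES (src.Name, src.Address)
OUTPUT src.CustomerID, inserted.CustomerID
  INTO @CustomerMap (OldID, NewID);

-- The map then rewrites the foreign keys in dependent tables, e.g.:
-- UPDATE o SET o.CustomerID = m.NewID
-- FROM DB2.dbo.Orders AS o
-- INNER JOIN @CustomerMap AS m ON m.OldID = o.CustomerID;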
I have a large table in SQL Server 2005 (it has about 6 columns and 10 million records).
I need to work in a linear way on all the records (I know it sounds dumb, but I need to work on all of them).
Now, obviously, when trying to work on this table, SQL Server gets stuck with a timeout or something like that...
I've noticed that a simple query like "select top 100 * from ExportTable" still works.
Is there any way to have SQL send me the data as it accesses it, so that I can process it at the same time? I basically work using a DataSet, so raising the timeout won't be helpful, since Windows probably won't allow me to load this amount of data into memory.
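A common way to do exactly this is keyset paging: read the table in fixed-size chunks ordered by the key, remembering the last key seen, so each round trip is small enough for a DataSet. A sketch assuming an indexed id column:

DECLARE @lastId int = 0;

-- Run this repeatedly from the application; each execution is a cheap index seek.
SELECT TOP (10000) *
FROM dbo.ExportTable
WHERE id > @lastId
ORDER BY id;

-- After processing a batch, set @lastId to the largest id returned and
-- repeat until the query comes back empty.

Alternatively, reading with a forward-only SqlDataReader streams rows as they arrive instead of buffering the entire result the way a DataSet does, which is closer to "send me the data as you access it".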
I have a query that currently looks like this (@Month and @Year are supplied as parameters):
SELECT
    -- select the sum for each year/month combination using a correlated subquery
    -- (each result from the main query causes another data retrieval operation to be run)
    (SELECT SUM(SalesofProductA)
     FROM #ABC
     WHERE [Year] = T.[Year] AND [Month] = T.[Month]) AS [Sum_SalesofProductA]
[Code] ...
Right now, for a particular value of @Month and @Year, I see an output like this:
SalesofProductA, SalesofProductB, SalesofProductC
What I would like to see is:
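The desired layout is cut off above, but whatever it is, the correlated subqueries can usually be collapsed into a single pass over #ABC, which removes the per-row data retrieval the comment warns about. A sketch using the visible column names:

SELECT SUM(SalesofProductA) AS Sum_SalesofProductA,
       SUM(SalesofProductB) AS Sum_SalesofProductB,
       SUM(SalesofProductC) AS Sum_SalesofProductC
FROM #ABC
WHERE [Year] = @Year AND [Month] = @Month;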
I need to make a query that counts installed developer software for all our developers (from the SCCM database), for licensing purposes. The trick here is that a license should only be counted once per developer, and it should be the highest version. But in the database the developers can have different versions of the software installed (upgrades) on the same computer, and they often use several computers with different software versions.
So for example, a source table with two developers:
--------------------------------------------------------------------
| dev1 | comp1 | Microsoft Visual Studio Ultimate 2013     |
| dev1 | comp1 | Microsoft Visual Studio Professional 2010 |
| dev1 | comp2 | Microsoft Visual Studio Premium 2010      |
| dev2 | comp3 | Microsoft Visual Studio Professional 2010 |
| dev2 | comp4 | Microsoft Visual Studio Premium 2012      |
--------------------------------------------------------------------
I want the result to be:

-----------------------------------------------------
| dev1 | Microsoft Visual Studio Ultimate 2013 |
| dev2 | Microsoft Visual Studio Premium 2012  |
-----------------------------------------------------
I have created a query using cursors that gives me the correct result, but it is way too slow to be acceptable (over 20 min...). I also toyed with the idea of creating some sort of CLR proc or function in C# that does the logic, but an SCCM consultant from MS said that if I create any kind of custom objects on the SCCM SQL Server instance, we lose all support from them. So I'm basically stuck with good old-fashioned T-SQL queries.
My idea now is to use a CTE and combine it with a temp table holding the software editions and a rank. I feel that I'm on the right track, but I just can't nail it properly.
This is how far I have come so far:
IF OBJECT_ID('tempdb..#swRank') IS NULL
    CREATE TABLE #swRank(rankID int NOT NULL UNIQUE, vsVersion nvarchar(255));

INSERT INTO #swRank(rankID, vsVersion)
VALUES (1, 'Microsoft Visual Studio Ultimate 2013'),
       (2, 'Microsoft Visual Studio Ultimate 2012'),
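Continuing that idea, ROW_NUMBER can pick the best-ranked edition per developer in one pass, with no cursor. A sketch, assuming the source rows are in a table shaped like #installs(devName, compName, vsVersion):

;WITH ranked AS
(
    SELECT i.devName, i.vsVersion,
           ROW_NUMBER() OVER (PARTITION BY i.devName ORDER BY r.rankID) AS rn
    FROM #installs AS i
    INNER JOIN #swRank AS r ON r.vsVersion = i.vsVersion
)
SELECT devName, vsVersion
FROM ranked
WHERE rn = 1;

Because the join discards versions that are not in #swRank, every Visual Studio edition in play has to be listed there; anything missing simply will not be counted.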
I have found a bunch of duplicate records in our housing database that ideally I need to delete. There are two tables that I need to remove data from: ih_cml_log_entry and ih_cml_log_notes. There is no unique identifier between the tables for a log entry, so I have had to join on person_ref, log_seq, and the date/time of entry. How do I go about deleting the data? I've used the script below to identify what I need to delete:
SELECT *
FROM (
    SELECT cml.person_ref,
           cml.open_date + open_time AS 'datetime',
           cml.open_user,
           cml.log_type,
           ROW_NUMBER() OVER (PARTITION BY cml.person_ref, cml.open_date + cml.open_time,
                                           cml.open_user, cml.log_type
                              ORDER BY (SELECT 0)) AS RowNo,
           n.note
    FROM ih_cml_log_entry cml
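Since ROW_NUMBER is already numbering the duplicates, the usual trick is to put that query in a CTE and delete from the CTE directly; rows with RowNo > 1 are removed from the underlying table. A sketch for the log-entry side (the join to ih_cml_log_notes is cut off above, and the notes table would need the same treatment keyed on the same columns):

;WITH dupes AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY person_ref, open_date + open_time,
                                           open_user, log_type
                              ORDER BY (SELECT 0)) AS RowNo
    FROM ih_cml_log_entry
)
DELETE FROM dupes
WHERE RowNo > 1;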
I have 6 tables which are very large in row count, and records older than 8 days need to be deleted.
A little info: every day, 300 million records are inserted into the tables below. We should maintain only 8 days' worth of data in these tables. How can I implement a purge script that can delete records in all the tables at the same time and with optimized parallelism?
Master table, which has [ID] and [Timestamp]. Table name: Sample (2,578,106 rows).
Child tables: the foreign key [ID] is common to all of them. There is no timestamp column in the child tables, so their records need to be deleted based on MIN(ID) from Sample.
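A sketch of the usual batched-purge pattern: compute the ID cutoff from Sample's timestamp once, then loop small deletes so locks and log growth stay manageable. It assumes [ID] increases with time, and the child table names are placeholders:

DECLARE @cutoffId bigint, @rows int = 1;

-- Everything below this ID is more than 8 days old.
SELECT @cutoffId = MIN([ID])
FROM dbo.Sample
WHERE [Timestamp] >= DATEADD(DAY, -8, SYSDATETIME());

WHILE @rows > 0
BEGIN
    DELETE TOP (50000) FROM dbo.Child1 WHERE [ID] < @cutoffId;
    SET @rows = @@ROWCOUNT;
END

-- Repeat the loop for each remaining child table (these loops can run as
-- separate Agent jobs for parallelism), then purge dbo.Sample itself last.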
If F.parent_id (101) = T.team_id (101), and T.team_id (101) = T.parent_folder_id (101), then the output should come out as 'Mobile/c' (this is for F.parent_id = 101).
If F.parent_id = T.team_id but T.team_id != T.parent_folder_id, then that parent_folder_id has to be searched against the id column until it finds a match, picking up the Team_name from each corresponding id along the way.
Ex: F.parent_id = 202 matches T.team_id (202), but this row's T.parent_folder_id (200) does not match its T.team_id (202), so T.parent_folder_id (200) has to be searched against T.id (200); if that row's T.id (200) now matches its own T.parent_folder_id (200), then the query has to return the names from the start of the hierarchy downward.
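This walk-up-the-tree logic is what a recursive CTE does. The schema above is not fully spelled out, so treat the following strictly as a pattern sketch: it starts from the team that F.parent_id points at and keeps prepending parent names by following parent_folder_id to T.id until a row is its own parent:

;WITH path AS
(
    -- Anchor: the team row that F.parent_id points at.
    SELECT t.id, t.parent_folder_id,
           CAST(t.Team_name AS nvarchar(4000)) AS full_path
    FROM T AS t
    INNER JOIN F AS f ON f.parent_id = t.team_id

    UNION ALL

    -- Walk up: prepend the parent's name, stopping once a row is its own
    -- parent (the root of the hierarchy).
    SELECT p.id, p.parent_folder_id,
           CAST(p.Team_name + '/' + c.full_path AS nvarchar(4000))
    FROM T AS p
    INNER JOIN path AS c ON p.id = c.parent_folder_id
    WHERE c.id <> c.parent_folder_id
)
SELECT full_path
FROM path
WHERE id = parent_folder_id;   -- keep only the fully walked paths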
I'm pulling data from XML into tables, but I'm unsure how to link the data after it's imported. This example has names and tasks; I can pull the data into two tables, but I can't find any way to link each task to the appropriate person. My person and task tables populate without issue, but there's nothing I can find to link the rows together. So in this example, Test 1 would go with the first two tasks and Test 2 would go with the second two work items.
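The XML itself did not survive the copy, but the usual fix is to shred parent and child in one pass: nodes() over the person element, then a second nodes() applied to that person's fragment, so every task row carries its parent's name. The element names below are assumptions:

DECLARE @x xml = N'
<People>
  <Person><Name>Test 1</Name><Tasks><Task>a</Task><Task>b</Task></Tasks></Person>
  <Person><Name>Test 2</Name><Tasks><Task>c</Task><Task>d</Task></Tasks></Person>
</People>';

SELECT p.n.value('(Name)[1]', 'nvarchar(100)') AS PersonName,
       t.n.value('.',         'nvarchar(100)') AS Task
FROM @x.nodes('/People/Person') AS p(n)
CROSS APPLY p.n.nodes('Tasks/Task') AS t(n);

Inserting from this result keeps each task attached to the right person, so the two tables can carry a proper foreign key instead of being loaded independently.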
I have 3 tables: Customer, Sales Cost Charge, and Sales Price. I have joined the Customer table to the Sales Price table with a left outer join into a new table.
I now need to join the data in the new table to Sales Cost Charge. However, please note that there is data in the Sales Price table that is not in the Sales Cost Charge table, and data in the Sales Cost Charge table that is not in the Sales Price table, but I need to get all the data. E.g., if our application shows 15 records, the Sales Price table may have 7 records and the Sales Cost Charge table 8, which makes 15 records.
I am struggling to match the records. I have also tried a left outer join to the Sales Cost Charge table; however, I then only get the 7 records which are in the Sales Price table. See the code below.
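When rows can exist on either side, a FULL OUTER JOIN keeps both the 7 and the 8; ISNULL then picks the key from whichever side is present. A sketch with guessed table, key, and column names:

SELECT ISNULL(sp.CustomerID, scc.CustomerID) AS CustomerID,
       sp.Price,
       scc.CostCharge
FROM SalesPrice AS sp
FULL OUTER JOIN SalesCostCharge AS scc
             ON scc.CustomerID = sp.CustomerID
            AND scc.ItemID = sp.ItemID;   -- the join keys are assumptions

The left outer join only preserved the Sales Price side, which is why it returned just the 7 records.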