SQL Server 2012 :: Interpreting JSON Data For Reporting Purpose
May 16, 2014
We have a gaming application which generates transactional data in MongoDB, which eventually sends the data to SQL Server in JSON format. This data needs to be used by a reporting tool, but visualizing it in the form of a table is proving to be difficult. One example of a column we receive is:
I'm designing a new database which will be the back-end to a heavily-used web-based application (all these terms are relative - I guess the use won't be that heavy in the grand scheme of things, I'm only talking 100 users or so at the very most). Data from the old application database will be migrated to this one, and the old database is around 7GB in size after 5 years of use.
I have two different ways of linking some tables in mind, one which is slightly more complex than the other but which potentially has benefits over the simpler method. However, I'm concerned that I might be 'over-cooking' the design, and that performance would suffer as a result, so I've tried creating the two different versions of the database (the part of it I'm concerned with, anyway), one for each of the solutions I've got in mind, migrated the data into the relevant tables and carried out some queries on the data to collect some statistics.
The problem is that, whilst I can see that the more complex method is more expensive, as expected, I don't really understand whether the difference is significant. Since I don't know what the numbers in the Client Statistics window actually mean (there are no units! I'm guessing times are in milliseconds?), or how much real-world impact the difference will have, I'm finding it hard to interpret my statistics and come to a decision.
Querying the entirety of my tables to return ~20,000 records listing one column from each of the main tables I'm playing with, the simpler method had a Total Execution Time of 199, and the more complex a Total Execution Time of 272. Is that the statistic I should be most concerned with? Is that a difference I should be concerned about? Is the difference likely to be magnified when the database is much larger and in use, such that a difference of 73 milliseconds in this test scenario could end up being as much as a whole second in production, for example?
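In case it helps with the units question, a minimal sketch like the one below reports the same kind of timings with explicit units (milliseconds) plus per-table I/O; the table and column names are placeholders for whichever version of the schema is being tested.

SET STATISTICS TIME ON;   -- prints CPU time and elapsed time in milliseconds
SET STATISTICS IO ON;     -- prints logical/physical reads per table

SELECT SomeColumn         -- hypothetical column from one of the candidate designs
FROM dbo.MainTable;       -- hypothetical table name

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;

Comparing logical reads between the two designs is often a steadier signal than elapsed time, since it is not skewed by caching or whatever else the server happens to be doing during the test run.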
We run Standard 2008. In my SSRS log I see this for one of our most critical reports...
library!ReportServer_0-64!2244!07/07/2015-08:24:53:: Call to GetPermissionsAction(/somedirectory/somedirectory1).... which I assume is an indication of a report starting to render by first checking permissions.
Around the time my user says he still saw the revolving arrow and he stopped the report because he felt it was running too long, I see...
webserver!ReportServer_0-64!1dbc!07/07/2015-08:54:44:: i INFO: Processed report. Report='/somedirectory/somedirectory1/importantreport', Stream=''
How can it be true that he stopped it and yet SSRS reports that it processed the report?
About 4 minutes later I see this entry in the log...
webserver!ReportServer_0-64!15e4!07/07/2015-08:58:34:: i INFO: Processed report. Report='/somedirectory/somedirectory1/importantreport', Stream=''
Which processed report message is right? Could there be multiple entries because of subreports? I see a number of errors and exceptions around these same times but do not know how to tie either to a specific report. Is there a way?
Below is the many-to-one currency conversion MDX query generated by the BI wizard. I am having a hard time understanding the purpose of [Reporting Currency].[Local]. Can anyone explain why we need [Reporting Currency].[Local], how it works in the conversion, and where the relationships are established between the [Reporting Currency].[Local] member and other measures/dimensions? Thanks.
// This is the Many to One section
// All currency conversion formulas are calculated for the pivot currency and at leaf of the time dimension
I have a field called URL in my table. I want to get the SEARCH TERM from a given URL and create a report based on that information. I'm having difficulties because the URLs have different formats depending on the search engine the users used to browse. Some of the search engines are "google", ".excite.com", "search.msn.", "search.netscape", "search.lycos", "altavista", "search.yahoo" and many more.
Examples of the URLs from Google:
http://www.google.com/search?q=S26+Collet+Chuck&hl=en&client=firefox-a&rls=org.mozilla:en-USfficial&start=30&sa=N -- The search term is S26 Collet Chuck
http://www.google.com/search?sourceid=navclient&ie=UTF-8&rls=GGLG,GGLG:2006-02,GGLG:en&q=kt21+kia -- The search term is kt21 kia
http://www.google.com/search?hl=en&q=Slagger+burning+Tables -- The search term is Slagger burning Tables
Does anybody have a SQL query, or has anyone used a CLR function, to get the SEARCH TERM from the different search engine URLs?
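For what it's worth, a rough T-SQL sketch of the idea for the engines that use a q= parameter (Google, MSN and some others); the table and column names (dbo.WebLog, URL) are placeholders, and a real version would need per-engine handling (e.g. p= for some Yahoo/Lycos URLs) plus URL-decoding of %xx escapes.

SELECT URL,
       REPLACE(
           SUBSTRING(URL,
                     PATINDEX('%[?&]q=%', URL) + 3,                     -- value starts after '?q=' or '&q='
                     CASE WHEN CHARINDEX('&', URL, PATINDEX('%[?&]q=%', URL) + 3) = 0
                          THEN LEN(URL)                                 -- q= is the last parameter
                          ELSE CHARINDEX('&', URL, PATINDEX('%[?&]q=%', URL) + 3)
                               - (PATINDEX('%[?&]q=%', URL) + 3)        -- stop at the next &
                     END),
           '+', ' ') AS SearchTerm                                      -- '+' encodes a space
FROM dbo.WebLog
WHERE PATINDEX('%[?&]q=%', URL) > 0;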
I have a business need to create a report by querying data from an MS SQL 2008 database and displaying the result to the users on a web page. The report initially has 6 columns of data, and 2 of the 6 contain JSON data, so the users have requested that those 2 JSON columns be parsed into 15 additional columns (the first JSON column has 8 key/value pairs and the second JSON column has 7 key/value pairs). Here is what I have done so far:
I found a table-valued function (fnSplitJson2) from this link [URL]. Using this function I can parse a column of JSON data into a table. So when I use the function against the first column (with JSON data) in my query (with CROSS APPLY), I get the right data back, but I get 8 rows for each row in my table. The reason for this side effect is that the function returns a table of 8 rows (8 key/value pairs) for each JSON string it parses.
1. First question: How do I modify my current query (see below) so that for each row in my table I get back one row with 19 columns?
SELECT A.ITEM1,A.ITEM2,A.ITEM3,A.ITEM4, B.* FROM PRODUCT A CROSS APPLY fnSplitJson2(A.ITEM5,NULL) B
I updated my query (see below) to call the function twice within the CROSS APPLY clause, and I got this error: "The multi-part identifier "A.ITEM6" could not be bound."
2. My second question: How do I get around this error?
SELECT A.ITEM1,A.ITEM2,A.ITEM3,A.ITEM4, B.*, C.* FROM PRODUCT A CROSS APPLY fnSplitJson2(A.ITEM5,NULL) B, fnSplitJson2(A.ITEM6,NULL) C
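If it is of any use, the usual fix for the "could not be bound" error is to chain a second CROSS APPLY rather than adding the function after a comma; the comma turns it into an old-style join, where the alias A is not in scope for the function's argument. A sketch keeping the original names:

SELECT A.ITEM1, A.ITEM2, A.ITEM3, A.ITEM4, B.*, C.*
FROM PRODUCT A
CROSS APPLY fnSplitJson2(A.ITEM5, NULL) B
CROSS APPLY fnSplitJson2(A.ITEM6, NULL) C;

For the first question, collapsing the 8 returned rows back into 8 columns on the same output row typically takes a PIVOT or MAX(CASE ...) over the key/value pairs on top of this.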
I am using Microsoft SQL Server 2008 R2 version. Windows 7 desktop.
I am in the position where I have to transfer data from an old database schema to a new database schema. During the transfer process a lot of logic has to be performed so that the old data gets inserted into the new tables efficiently, all foreign keys are preserved, and the newly created keys (as the new schema is normalised a lot more) are correct.
Is it best if I perform all this logic in a Stored Procedure or in C# code (where the queries will also be run)?
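Either can work; if the heavy lifting ends up in T-SQL, one set-based pattern for keeping the old-to-new key mapping is sketched below. The table and column names (old_db/new_db, Customer, Name, Email) are hypothetical, and MERGE is used instead of a plain INSERT...SELECT because its OUTPUT clause can reference the source's old key alongside the newly generated one.

DECLARE @KeyMap TABLE (OldCustomerID int, NewCustomerID int);

MERGE new_db.dbo.Customer AS tgt
USING old_db.dbo.Customer AS src
    ON 1 = 0                                   -- never matches, so every source row is inserted
WHEN NOT MATCHED THEN
    INSERT (Name, Email)
    VALUES (src.Name, src.Email)
OUTPUT src.CustomerID, inserted.CustomerID     -- old key next to the new IDENTITY value
INTO @KeyMap (OldCustomerID, NewCustomerID);

-- @KeyMap then drives the foreign-key columns when the child tables are loaded.

Doing it in C# is equally valid; the main argument for keeping it in stored procedures is that set-based statements like this avoid row-by-row round trips between the application and the server.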
We have an SCCM 2012 primary site and a remote SQL 2012 server. Due to hardening and a password reset we are facing a reporting issue.
When we open the SSRS report on the SQL server and try to edit the report (Report Builder), we get the following error, because of which we are unable to configure the Reporting Services point on the SCCM 2012 server. We created a new Reporting Services database and still get the error below.
Does anybody know what the "NT AUTHORITY\SYSTEM" login created during a SQL Server 2005 installation is used for?
Does this login pose a security risk, and can it be removed safely? It seems to me as if it is similar to the "BUILTIN\Administrators" login, which we remove from our production servers.
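For what it's worth, local services running as LocalSystem on the database server (the SQL Writer/VSS service, for example) connect through that login, so removing it can break them. A quick way to see what it is and which server roles it holds before deciding:

SELECT sp.name, sp.type_desc, sp.is_disabled, r.name AS server_role
FROM sys.server_principals AS sp
LEFT JOIN sys.server_role_members AS rm ON rm.member_principal_id = sp.principal_id
LEFT JOIN sys.server_principals AS r ON r.principal_id = rm.role_principal_id
WHERE sp.name = 'NT AUTHORITY\SYSTEM';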
Basically, we have two databases: one with a fixed name, e.g. DB1 in all environments, and another named with a prefix based on the environment, e.g. DB2_DEV or DB2_TEST, that is generated by a managed application.
Then we have queries, views and SPs residing in the first database (DB1) that access resources in the second database, DB2_***. Both databases reside in the same SQL instance.
Currently, we have hardcoded the database name in the query but would prefer not to manually rename or write other scripts to update the query with the correct database name when deploying to other environments.
For example:
SELECT * FROM DB2_DEV.dbo.vAccounts
When deploying to TEST, we need to update the query to:
SELECT * FROM DB2_TEST.dbo.vAccounts
So I looked at synonyms, but it seems they can only create aliases for objects such as tables or views, not for a database.
Hence, my last thought is to create a self (loopback) linked server, so I can write my query like the one below and set up the linked server accordingly.
SELECT * FROM SRCDB.dbo.vAccounts
Are there any considerations to think of, e.g. performance, security, etc.?
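One more option to weigh against the loopback linked server: since a synonym cannot alias a whole database, create one synonym per referenced object inside DB1, so the environment-specific name lives only in the deployment script. A sketch using the names above:

-- Run once per environment when deploying DB1 (point at DB2_TEST in TEST, and so on)
CREATE SYNONYM dbo.vAccounts FOR DB2_DEV.dbo.vAccounts;
GO
-- Queries, views and SPs in DB1 then stay identical in every environment:
SELECT * FROM dbo.vAccounts;

A synonym resolves to an ordinary cross-database reference, so there is no linked-server overhead or extra security mapping to manage, whereas the loopback linked server adds a second connection and its own login mapping.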
I have a report with three subreports, and I want to hide the data of the subreports when exporting the report to Excel. I have used this expression (=IIF(Globals!RenderFormat.IsInteractive, False, True)) but it didn't work.
I have a scatter chart in SSRS (SQL Server 2012, Visual Studio 2010) that is producing the following:
There are five data points on there; however, the result set I am using has 10 rows (a 'Completed Date' datetime and a 'Lateness' integer whose values can be positive or negative).
This is the Dataset and the results it produces:
SELECT DISTINCT TOP 10
    a.ACTIVITY_NAME As [Activity Name],
    ad.COMPLETED_DATE As [Completed Date],
    ad.DAYS_LATE As [Lateness]
FROM ACTIVITIES a
JOIN ACTIVITY_DATA ad ON ad.ACTIVITY_ID = a.ACTIVITY_ID
[code]....
How can I tell SSRS to show every data point in my chart?
We are showing hover data in the report. When we hover the mouse pointer over a cell, it shows the data. But when we export the report to Excel, the hover functionality does not work in the exported report. We are looking for a solution to ensure hovering works in the exported Excel report.
Does anyone know of an SSIS source pipeline component which reads JSON, a data interchange format?
JSON looks pretty tempting for heavy data interchange (somewhat human-readable, name/value pairs + arrays, nesting, lighter weight than most XML serializers), and if it's gaining momentum, I should think a source component would follow on (most likely third party).
Is there a good, common way to migrate data from relational tables into hierarchical json?
I am asking this because, basically, I prefer not to write a lot of procedural code.
I am currently working with an MS SQL Server db, and I'll soon need to do the same with a MySQL db. I use Java and JDBC for access. It would be great if either of these two dbs could somehow assemble the data server-side.
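In case the SQL Server side is (or can be) 2016 or later, FOR JSON can assemble the hierarchy server-side before it ever reaches the Java code; the Customers/Orders tables below are hypothetical.

SELECT c.CustomerID,
       c.Name,
       (SELECT o.OrderID, o.OrderDate, o.Total
        FROM dbo.Orders AS o
        WHERE o.CustomerID = c.CustomerID
        FOR JSON PATH) AS Orders               -- nested array of the customer's orders
FROM dbo.Customers AS c
FOR JSON PATH, ROOT('customers');

Recent MySQL versions have a rough equivalent with JSON_OBJECT/JSON_ARRAYAGG; on older versions of either server, building the JSON in the Java layer from a flat join is the usual fallback.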
I have a job whose first step is to run a DTS package via a DTSRUN Operating System Command. I get the following message.
DTSRun: Loading...
DTSRun: Executing...
Error: -2147220499 (800403ED); Provider Error: 0 (0)
Error string: No Steps have been defined for the transformation Package.
Error source: Microsoft Data Transformation Services (DTS) Package
Help file: sqldts.hlp
Help context: 700.
Process Exit Code 1. The step failed.
Prior to 2/29/2000, it had run dozens of times successfully, the last time on 2/23/2000.
If I log on to the SQL server and, using IE, go to http://sql1/reportServer, I see the reportServer webpage just fine. However, if I go to another server in the same domain and do exactly the same thing (same user account, same URL), I get a Windows box asking for a username and password. It doesn't matter what username I use; I still get prompted for a username and password.
I've looked in sys.dm_exec_connections and it says NTLM, so I am guessing that SPNs are not the cause.
We have another SQL instance on the same SQL server and you can reach the reporting webpage just fine no matter where you are.
The only thing I have noticed that is different is that the working instance has a login for DPM$ (being a DPM instance), but the default instance does not have a SQL1$ entry. I don't know if that is important.
I am unable to connect to the reporting server; the services are running fine. I checked the logs and found:
ERROR: Error initializing configuration from the database: Microsoft.ReportingServices.Library.ReportServerDatabaseUnavailableException: The report server cannot open a connection to the report server database. A connection to the database is required for all requests and processing. ---> System.Data.SqlClient.SqlException: A connection was successfully established with the server, but then an error occurred during the login process. (provider: Shared Memory Provider, error: 0 - No process is on the other end of the pipe.)
I ran the DBCC SHOW_STATISTICS command for all of my indexes; I was told that high density numbers are bad, low numbers good. I have some questions about my results, though; I'm not sure how to interpret them.
Of 48 indexes, 14 have a density of 0. Does this mean that the indexes are not selective enough? Does it mean they're garbage and I should toss them?
6 have a density of NULL. They are all primary keys. I suppose this just means that they're never used because these tables are rarely queried. Would this assumption be correct?
13 have a density of 1. I have no idea what this means.
The others have densities ranging from 0.01210491 to 0.5841165. I was told that the lower this number is, the more selective and thus more useful an index is. I think 0.5841165 is too high a number. Would this be correct?
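For reference, density is calculated as 1 / (number of distinct values), so a density of 1 means the leading column has only one distinct value, and very low densities mean highly selective columns. The density vector (rather than the header row) is normally the part worth reading; a sketch, with the object names as placeholders:

DBCC SHOW_STATISTICS ('dbo.MyTable', 'IX_MyTable_MyColumn') WITH DENSITY_VECTOR;
-- Each "All density" value is 1 / (distinct values for that column prefix),
-- so lower really does mean more selective.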
I'm going to write an advanced whitepaper on interpreting the results of CHECKDB in SQL Server 2005 (mostly applicable to SQL Server 2000 as well), should be available before end of the year. Couple of questions for you:
1) would this be interesting/useful to you? 2) anything in particular you'd like to see covered?
Thanks
Paul Randal Dev Lead, Microsoft SQL Server Storage Engine (Legalese: This posting is provided "AS IS" with no warranties, and confers no rights.)
I used SET STATISTICS TIME ON to get execution stats for a query. I found that the CPU Time was sometimes greater than the elapsed time. How is this possible? The query does not use any parallelism since I used the query option MAXDOP 1. Is the elapsed time wait time? Is the total execution time the sum of the CPU time and elapsed time?
I used a decision-tree mining model to describe and predict fraud. The table contains 1039 records with 775 distinct values of A-number (the calling party). I used 9 columns in the model. SQL Server reports that only 3 columns are significant in predicting the fraud:
- BPN_is_too_short (called party-number is too short)
- Duration_is_zero
- Invalid_area_code
The key column is A-number, and the predicted column is Is_Fraud, whose only values are 0 and 1. There is no record with NULL (a missing value) in the Is_Fraud column.
Mining Legend shows in the first split:
- 625 cases of fraud
- 150 cases of non-fraud
- 0 cases of missing
In addition to that, Mining Legend shows:
- 79.69% fraud
- 19.64% non-fraud
- 0.67% missing
Now when I compare those values, they don't match:
(A) 625/775 is 80.645%, not 79.69%
(B) 150/775 is 19.355%, not 19.64%
(C) 0 cases of NULL (missing value) should imply 0% missing, not 0.67% missing
Furthermore, in one node (with the split on duration_is_zero), there are 541 cases of fraud and 0 cases of non-fraud. This implies the node is a leaf node. However, Mining Legend shows:
(D) 514 cases of fraud, 99.35%
(E) 0 cases of non-fraud, 0.33%
(F) 0 cases of missing, 0.33%
My questions: (1) Why don't the values match in cases A through C? (2) Why don't the values match even in cases D through F, when there is no subtree at all?
I've searched for an explanation by reading about the mathematical reasoning, entropy, and the Gini index, but it does not explain the discrepancies in those values and percentages in the Mining Legend.