Term LOOKUP
Aug 14, 2007
hi brother
what is the use of the term lookup please give me the
example
regards
koti
hi brother
what is the use of the term lookup please give me the
example
regards
koti
I am designing a ssis package,This is intends to mine text data(Data extracted from websites).
Term lookup/Term extraction has been used as tools for mining.
I have lookup terms defined with me for reference table,but the main problem lie in extracting the nearby text/number/charcters to these lookup terms during mining.
For example :
I found noun "Email" 200 (frequency score) times in my text,Now I want to extract nearby email address(this is also true for PhoneNumber,Address attributes also).so how can I achieve this with SSIS.
If u have some idea/suggestion to carry out this challenge with or without Term Extraction/Term Lookup,plz do write here.
Hi,
SQL Server Data Mining comes with "Term extraction" and "Term Lookup" for phrase detection. Rather than using the GUI tool, how to utilize these two features in coding? Please assist! Thanks!
Mary
Hello,
Although this SP intends to sorround a search text in double quotes, it seems that when called from Management Studio it throws a Syntax Error even before entering the SP.
createproc fts(@t nvarchar(1000)=null) as begin
select @t = '"' + @t + '"'
select @t
select * from dbo.products where CONTAINS(name, @t)
end
GO
exec fts @t = 'my product name'
GO
Msg 7630, Level 15, State 3, Procedure fts, Line 4
Syntax error near 'product' in the full-text search condition 'my product name'.
-------------------------
If I pass the string in double quotes I get a different error:
exec fts @t = '"my product name"'
Go
Msg 7630, Level 15, State 3, Procedure fts, Line 4
Syntax error near 'my' in the full-text search condition '""my product name""'.
Now, if I remove the quotes again and make the original call:
exec fts @t = 'my product name'
GO
-- Then it works fine.
Doing a :
dbcc freeproccache
GO
Shows the same behavior as outlined initially.
There are many references in the BOL for the term "ad hoc":
ad hoc query
ad hoc updates
ad hoc monitoring
ad hoc testing
ad hoc view
ad hoc name
...
What are these terms mean? I'm especially interested in the terms ad hoc query and ad hoc updates.
what use reason of 'weighted-term' ?explain it.
SELECT ID, firstname, lastnameFROM [contain-1]WHERE CONTAINS(firstname, 'ISABOUT(mohsen weight(.8),yaser weight(1.0))')
table [contain-1] information:
ID FIRSTNAME
1 mohsen
2 mohsen
3 yaser
4 mehdi
thanks,mohsen
I've just started using the SSIS and i would like to know if it's possible to change or update the dictionary of the term extraction tool. That's important to me because i may have to look for words that don't exist in the defaut english dictionary of the tool.
Thanks.
We did some "at scale" fuzzy lookup tests today and were rather disappointed with the performance. I'm wanting to know your experience so I can set my performance expectations appropriately.
We were doing a fuzzy lookup against a lookup table with 25 million rows. Each row has 11 columns used in the fuzzy lookup, each between 10-100 chars. We set CopyReferenceTable=0 and MatchIndexOptions=GenerateAndPersistNewIndex and WarmCaches=true. It took about 60 minutes to build that index table, during which, dtexec got up to 4.5GB memory usage. (Is there a way to tell what % of the index table got cached in memory? Memory kept rising as each "Finished building X% of fuzzy index" progress event scrolled by all the way up to 100% progress when it peaked at 4.5GB.) The MaxMemoryUsage setting we left blank so it would use as much as possible on this 64-bit box with 16GB of memory (but only about 4GB was available for SSIS).
After it got done building the index table, it started flowing data through the pipeline. We saw the first buffer of ~9,000 rows get passed from the source to the fuzzy lookup transform. Six hours later it had not finished doing the fuzzy lookup on that first buffer!!! Running profiler showed us it was firing off lots of singelton SQL queries doing lookups as expected. So it was making progress, just very, very slowly.
We had set MinSimilarity=0.45 and Exhaustive=False. Those seemed to be reasonable settings for smaller datasets.
Does that performance seem inline with expectations? Any thoughts to improve performance?
I'm working with an existing package that uses the fuzzy lookup transform. The package is currently working; however, I need to add some columns to the lookup columns from the reference table that is being used.
It seems that I am hitting a memory threshold of some sort, as when I add 3 or 4 columns, the package works, but when I add 5 columns, the fuzzy lookup transform fails pre-execute:
Pre-Execute
Taking a snapshot of the reference table
Taking a snapshot of the reference table
Building Fuzzy Match Index
component "Fuzzy Lookup Existing Member" (8351) failed the pre-execute phase and returned error code 0x8007007A.
These errors occur regardless of what columns I am attempting to add to the lookup list.
I have tried setting the MaxMemoryUsage custom property of the transform to 0, and to explicit values that should be much more than enough to hold the fuzzy match index (the reference table is only about 3000 rows, and the entire table is stored in less than 2MB of disk space.
Any ideas on what else could be causing this?
Hi,
Can you search a column of a database table to find all the rows that have a wor in it?
example: I have a row that contains 'adventure st north', there are other columns in that table that are suppose to be the same but read 'adventure street N.' or 'adventure st. N.' or 'North Adventure st.'
could I search for rows that contain 'adventure' in the column searched (lets call it columnA). I tried:
select * from tbl_test where columnA LIKE 'adventure' and got no results.
what is the way to do this?
Thank-you,
Eric
I'm not really sure how to explain this, so please bear with me.
I have a SQL statement, such as:
SELECT TOP (10) FROM chartTracks
This works with SQL Server Express 2005, but when I moved my site over to work with a MSSQL Server 2000, the statement had to be changed in order for it to work:
SELECT TOP 10 FROM chartTracks
I was just wondering if there was a technical term for this, and if possible, the locations of anymore sourecs of information regarding the above?
I'm just writing a report and would like to include this, if possible. Thanks in advance!
Just learning full-text searching in SQL Server 2005 and have questions about the proximity term "near".
1. How near is near? Measured in characters, words, or whatever?
2. How do you know? Is this documented? Can't find it anywhere.
3. Can it be adjusted? at design time? at runtime?
I have used a program called Sonar which has powerful proximity options that allow the user to specify proximity in terms of words at runtime. Would like to be able to do that but can't find much on "near" in the documentation other than it seems to relative, provides for left and right nearness, and allows for chaining of multiple search terms.
Hello,
when I extract nouns from a text with the Term Extraction Transformation, the destination indeed keeps the correct nouns but they are in a random order.
Is it possible to keep the order of these nouns and maybe already keep double occurrences?
Kind regards, _Rodney_
Is there a way to perform term extraction on each row in a table, and have the term extraction denote the frequency of the term on a per row basis rather than a per table basis?
Otherwise, is there a way I can take extracted terms and apply a sql function that returns the occurance of that term in each row?
Example: Row 1 = 2 hits, row 3 = 5 hits, etc.
Say I want to lookup a value in another dataset, but there is a grouping that requires you to know what the values for each level is in order to get to the correct detail record. Can you still use the lookup function with more than one field to compare against? So for example
Department
\___SalesPerson
\___Measure
I want to be able to add a new row at the Measure level, but lookup each field from another dataset. In order to do that I will need the Department AND SalesPerson values to do the lookup, but I dont think the Lookup function will let us do that will.
Hi guys,
I have a field called URL in my table. I want to get the SEARCH TERM from a given URL and create a report based on that information. I'm getting difficulties, because the URL have different format depending up on the search engine
that the users use to browse. Some of the search engines are "google",".excite.com", "search.msn.","search.netscape", "search.lycos", "altavista", "search.yahoo" and many more.
Examples of the URLs from google :
http://www.google.com/search?q=S26+Collet+Chuck&hl=en&client=firefox-a&rls=org.mozilla:en-USfficial&start=30&sa=N -- The search term is S26 Collet Chuck
http://www.google.com/search?sourceid=navclient&ie=UTF-8&rls=GGLG,GGLG:2006-02,GGLG:en&q=kt21+kia -- The search term is kt21 kia
http://www.google.com/search?hl=en&q=Slagger+burning+Tables -- The search term is Slagger burning Tables
Does anybody have a sql query or used a CLR functions to get the SEARCH TERM from different search engine (URL).
Thanks in advance.
THIS IS THE ONE.TXT FILE
Customer called to complain that the ice maker on her fridge has stopped working model XXYY-3
Door to refrigerator is coming off model XX-1
Ice maker is making a funny noise XXYY-3
Handle on fridge falling off model XXZ-1
Freezer is not getting cold enough XX-1
Ice maker grinding sound fredge XXYY-3
Customer asking how to get the ice maker to work model XXYY-3
Customer complaining about dent in side panel model XXZ-1
Dent in model XXZ-1
Customer wants to exchange because of dent in door model XXZ-1
Handle is wiggling model XXZ-1
Customer happy with us. Best fridge yet!
i created the table term_result(term_id varchar2(50)); ( termid ==xxyy-3 like)
now i want to find the no of times repeat the xxyy-3 posted queries
ERROR :
DT_NTXT OR DT_WSTR TYPES ONLY ALLOWS HERE ERROR I AM GETTING SO
Flat File Source-----------> Data Conversion --------->Term Lookup -------->Oledb data source
one.txt DT_stR I choosen error occur here
SO WHAT IS THE DATA TYPE I HAVE TO GIVE FOR THAT MATCHING LOOKUP
REGARDS
KOTI
Hi All,
Actually this is in regard to SCD Type 2 Dimension, Scenario is like that I am moving Fact table from some old source and I have dimensionA description value in fact which I want to replace with appropriate id from Dimension Table and that Dimension table is SCD Type 2 based on StartDate and EndDate and Fact Table doesn't contains direct date value rather there is timeId in Fact so to update the value in Fact table I have to Join Time Dimension table and other Dimension Table to replace fact Description with proper Id.
Lets assume DimensionA Structure
id
Description
StartDate
EndDate
Fact Table
id
measure1
measure2
TimeId
Description
Time Dimension
TimeId
Date
Day
Hour ...
I am doing a lookup that requires mapping 2 columns in the column mapping section. When I do this, I get the error "Row yielded no match during lookup" . The SQL that I captured in SQL profiler does find the record when I run it in Management Studio. I have already tried trimming everything to no avail.
Why is this happening?
I tried enabling memory restrictions but then I my package hangs and I get a SQLDUMPER_ERRORLOG.log file with the following logged:
07/24/07 13:35:48, ERROR , SQLDUMPER_UNKNOWN_APP.EXE, AdjustTokenPrivileges () failed (00000514)
07/24/07 13:35:48, ACTION, SQLDUMPER_UNKNOWN_APP.EXE, Input parameters: 4 supplied
07/24/07 13:35:48, ACTION, SQLDUMPER_UNKNOWN_APP.EXE, ProcessID = 5952
07/24/07 13:35:48, ACTION, SQLDUMPER_UNKNOWN_APP.EXE, ThreadId = 0
07/24/07 13:35:48, ACTION, SQLDUMPER_UNKNOWN_APP.EXE, Flags = 0x0
07/24/07 13:35:48, ACTION, SQLDUMPER_UNKNOWN_APP.EXE, MiniDumpFlags = 0x0
07/24/07 13:35:48, ACTION, SQLDUMPER_UNKNOWN_APP.EXE, SqlInfoPtr = 0x0100C5D0
07/24/07 13:35:48, ACTION, SQLDUMPER_UNKNOWN_APP.EXE, DumpDir = <NULL>
07/24/07 13:35:48, ACTION, SQLDUMPER_UNKNOWN_APP.EXE, ExceptionRecordPtr = 0x00000000
07/24/07 13:35:48, ACTION, SQLDUMPER_UNKNOWN_APP.EXE, ContextPtr = 0x00000000
07/24/07 13:35:48, ACTION, SQLDUMPER_UNKNOWN_APP.EXE, ExtraFile = <NULL>
07/24/07 13:35:48, ACTION, SQLDUMPER_UNKNOWN_APP.EXE, InstanceName = <NULL>
07/24/07 13:35:48, ACTION, SQLDUMPER_UNKNOWN_APP.EXE, ServiceName = <NULL>
07/24/07 13:35:48, ACTION, SQLDUMPER_UNKNOWN_APP.EXE, Callback type 11 not used
07/24/07 13:35:48, ACTION, SQLDUMPER_UNKNOWN_APP.EXE, Callback type 15 not used
07/24/07 13:35:49, ACTION, SQLDUMPER_UNKNOWN_APP.EXE, Callback type 7 not used
07/24/07 13:35:49, ACTION, SQLDUMPER_UNKNOWN_APP.EXE, MiniDump completed: C:Program FilesMicrosoft SQL Server90SharedErrorDumpsSQLDmpr0033.mdmp
07/24/07 13:35:49, ACTION, DtsDebugHost.exe, Watson Invoke: No
Why am I getting this error with "Enable Memory Restriction"?
Hi all,
I got a problem, i'd like to update my DB in production in terme of
design and data from the DB test. In other words, I added many tables
to my design.
Therefore, i'd have to update the real DB on the server. Is there a way to do so with MSSQL? without doing it manually.
Thanks,
for example : " server={0};database={1};trusted_connection=true;application name={3}"
The article SqlConnection.ConnectionString Property refer this term,but doesn't specify use.
I want to know whether it will affect sql server and how to find its effect
thanks
I'd like to get some ideas for the following:
I am writing a quick mini-application that searches for records in a database, which is easy enough. However, if the search term comes up empty, I need to return 10 records before the positon the search term would be in if it existed, and 10 records after. (Obviously the results are ordered on the search term column)
So for example, if I am searching on "Microsoft", and it doesn't exist in my table, I need to return the 10 records that come before Microsoft alphabetically, and then the 10 that come after it.
I have a SP that does this, but it is pretty messy and I'd like to see if anyone else had some ideas that might be better.
Thanks!
hi, I have a question regarding calling sql table columns dynamically? workflow would go as:1. user enters search term into a textbox2. user checks a checkbox to search by column in sqldb (eg.. firstname or surname) pseudo sql would go like......SELECT +%column1(checkbox1.value)%+ OR +%column2(checkbox2.value)%+ OR +%column3(checkbox3.value)%+WHERE column1 = +%TextBox.Text%+ OR column2 = +%TextBox.Text%+ 3. display results in gridview my sql needs to improve greatly so any code insight(good book or link) would be terrific . thanks
View 10 Replies View RelatedHello,
I want to search a column with all the words deliminate by underscore. E.g. User_id, Community_name, author_id and etc.
It seems like freetext only deal with string with blank deliminator. How should I do the rull text search on column like this? Here is the code.declare @var varchar(2000)
set @var = 'id'
select [name], definition,version_code
from dbo.base
where freetext([name],@var)
thx
Why terms are not being corrected on a partial match? I thought this was the point of term based relation rules.
I am testing in an email address (nvarchar) field?
E.g. trying to correct @@ to @ and .con to .com (just to test)
And everything passes.
If I test with an entire value, the field is corrected.
E.g. 605688878@@qq.com corrects to 605688878@qq.com if I enter those exact values as term based relations.
Hi, I test the following sql statement, finding that an error ocurs:
Msg 7630, Level 15, State 2, Line 3
Syntax error near '"' in the full-text search condition '"dsg SDRGDG " OR "sdfsdfsdfsdafdsafdsfds'.
DECLARE @searchTerm NVARCHAR(40)
SET @searchTerm = '"dsg SDRGDG " OR "sdfsdfsdfsdafdsafdsfdsafdsafdsafsafdfdsafdf"';
SELECT [JobTitle], [JobDes], [OpenDate], j.[URLRef], c.[CompanyName], c.[URLRef], c.[URLSource]
FROM JobWanted AS j INNER JOIN
Company AS c ON c.CompanyID = j.CompanyID
WHERE CONTAINS((JobTitle, JobDes), @searchTerm)
It seems too lengthy string will cause an error for full-text engine. I find the sdfsdfsdfsdafdsafdsfdsafdsafdsafsafdfdsafdf is truncated as shown in error message.
How to avoid this issue? Could I configre this limination?
Thanks in advance.
Ricky.
Hi all,
I don't understand what's happening here.
I have a Conditional Split with 3 outputs. On the first output I have a lookup, when I execute the package I have 56 rows going through the Conditional Split, all rows are then going to the 2nd and 3rd output but the lookup on the first output generates an error "Row yielded no match during lookup".
I don't understand why the lookup is generating an error while there is no row going through it.
Any idea ?
Sébastien.
Hi,
I am currently designing a SSIS package to integrate data into a data warehouse fact table. This fact table has about 70 columns among which 17 are foreign keys for dimension tables.
To insert data in that table, I have to make several transformations and lookups. Given the fact that the lookups I have to make are a little complicated, I have about 70 tasks in my Data Flow.
I know it's a lot, but I can't find a way to make it simpler. It seems I really need all these tasks.
Now, the problem is that every new action I try to make on the package takes a lot of time. At design time, everything is very slow. My processor is eavily loaded each time I change a single setting in one of the tasks, and executing the package in debug mode takes for ages. If I take a look at the size of my package file on disk, it's more than 3MB.
Hence my question : Are there any limitations in terms of number of columns or number of tasks that can be processed within a Data Flow ?
If not, then do you have any idea why it's so slow ?
Thanks in advance for any answer.
I want to lookup values from a database into another database both of which are in the same sql server 2000. One databases is called GamingCommissiondb the other is called LicensingActions I need some of the tables to communicate with each other, to look values from one another. Example I need the Termination table to look up values from the Revocations table. Would using LinkedServers suffice??
View 5 Replies View RelatedHi,
Can anybody provide me a Lookup UDF? I need to supply columnname,Tablename and condition dynamically and I need the scalar value in return.
Any help will be greatly appreciated...
Hi,
I am transferring the xml data from an xml file into sql server table using ssis.
To avoid any duplicate import via this ssis package, I would like to first check if data exists in the sql server table for what is about to be imported. If so then delete the existing data and then import.
Question:
How do I get the field value say ID field from the file and then take this id and delete these from the table in sql server first.
Is this to do with lookup or is there an easier way to do this please?
thanks
Hi
I am trying to use lookup to see if a item esists in my table ( 3 key fields ). If the lookup fails I want to insert the records. If it succeeds I have put a recordcount to catch the items that are not required. I don't think that I understand the settings for failed rows. I have tried setting the Configure Error Output to redirect, but this does not seem to work. I have the below errors.
[SQL Server Destination [151]] Error: Unable to prepare the SSIS bulk insert for data insertion.
[DTS.Pipeline] Error: component "SQL Server Destination" (151) failed the pre-execute phase and returned error code 0xC0202071.
Can someone please advise me how to set up this component to work for my application
Thanks
ADG
Hi,
I am using SCD and Lookup for Incremantal ETL load. But it is very interesting that
my 3000 recs is becoming 2500 when I put SCD and lookup. if I replace dummy multicast then
it is 3000.
thanks