Quoted
I don't think the solutions are going to show up anymore. Plan B is to compare answers. I'll just post what I came up with.
Exercise 1
1. Query optimization
A document collection with 125,000 documents contains film reviews. Given is the
following query:
(NOT horror) AND (film OR animation) AND (action OR comedy)
Specify the most efficient order of execution for this query that can be determined from the
following table:
Term        DF
horror      75,000
film        62,000
animation    3,000
action      41,000
comedy      40,000
Is the order you proposed always optimal?
2. Inverted index
Given is the following document collection:
D1: Ice Age 4 was released in 34 territories.
D2: Ice Age 4 (original Ice Age: Continental Drift) is a 2012 American computer-animated comedy film.
Create an inverted index for this document collection. Tokenization rules: word-wise,
case-folding, ignore punctuation. Stop list: was, in, is, a. Include TF and DF values at a
suitable position in the index.
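For comparing answers, here is a minimal Python sketch of how such an index could be built. It is not the official solution. The names (DOCS, STOP_WORDS, tokenize, build_index) are made up for illustration, and it assumes that "ignore punctuation" means the hyphen in "computer-animated" acts as a separator, so the word becomes two terms. The exercise does not say how to handle that case.

```python
import re
from collections import Counter, defaultdict

# The two documents and the stop list from the exercise.
DOCS = {
    "D1": "Ice Age 4 was released in 34 territories.",
    "D2": "Ice Age 4 (original Ice Age: Continental Drift) is a 2012 "
          "American computer-animated comedy film.",
}
STOP_WORDS = {"was", "in", "is", "a"}


def tokenize(text):
    """Case-fold, split into runs of letters/digits, drop stop words.

    Assumption: every non-alphanumeric character (including the hyphen)
    is treated as a separator.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_index(docs):
    """Return {term: {"df": int, "postings": [(doc_id, tf), ...]}}."""
    postings = defaultdict(dict)  # term -> {doc_id: tf}
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            postings[term][doc_id] = tf
    # DF is stored with the term, TF with each posting.
    return {
        term: {"df": len(p), "postings": sorted(p.items())}
        for term, p in sorted(postings.items())
    }


index = build_index(DOCS)
for term, entry in index.items():
    print(term, entry["df"], entry["postings"])
```

Storing DF in the dictionary entry and TF next to each document ID in the postings list is one common layout. Other placements would also satisfy "a suitable position".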
Which search results can be obtained from this index for the following queries?
Since horror is negated, the DF to use for NOT horror is 125,000 - 75,000 = 50,000. That doesn't change the execution order you gave, though.
Quoted
(NOT horror) AND (film OR animation) AND (action OR comedy)
75k, 62k + 3k, 41k + 40k => Do the operation on the smallest sets first, (NOT horror) AND (film OR animation), and only then AND (action OR comedy).
I ignored the fact that horror is negated; I couldn't find anything about how to handle that.
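The estimate used above can be sketched in a few lines of Python. It assumes the usual heuristic: the size of an OR is bounded by the sum of the DFs, the size of a NOT is the collection size minus the DF, and conjuncts are processed in increasing order of estimated size. The names (N, DF, estimate_not, estimate_or) are only illustrative.

```python
# Collection size and document frequencies from the exercise table.
N = 125_000
DF = {
    "horror": 75_000,
    "film": 62_000,
    "animation": 3_000,
    "action": 41_000,
    "comedy": 40_000,
}


def estimate_not(term):
    """Exact size of NOT term: every document that does not contain it."""
    return N - DF[term]


def estimate_or(*terms):
    """Upper bound for an OR: sum of the DFs, capped at the collection size."""
    return min(N, sum(DF[t] for t in terms))


conjuncts = {
    "(NOT horror)": estimate_not("horror"),
    "(film OR animation)": estimate_or("film", "animation"),
    "(action OR comedy)": estimate_or("action", "comedy"),
}

# Process the conjuncts with the smallest estimated result first.
order = sorted(conjuncts, key=conjuncts.get)
for name in order:
    print(name, conjuncts[name])
```

This gives 50,000, then 65,000, then 81,000, so the order stays (NOT horror), (film OR animation), (action OR comedy). Regarding "always optimal": the sums are only upper bounds. The real size of (action OR comedy) lies somewhere between 41,000 and 81,000 depending on how much the two terms overlap, while (film OR animation) is at least 62,000, so the true sizes could be ordered differently from the estimates.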
This post has been edited 1 times, last edit by "JoX" (Mar 8th 2015, 5:47pm)