Creating Inverted Index for Information Retrieval - Desklib

   

Added on  2023-06-11

11 Pages | 1069 Words | 461 Views
COVER PAGE
Contents
COVER PAGE
DETAILS
Question 1
a. Remove all stop words and punctuation
b. Applying Porter Stemming algorithm
c. Creating dictionary
Merge all documents into one dictionary file
Sort the merged file in alphabetical order
Merge terms appearing more than once
Create posting file
d. Testing
Boolean and vector queries
Question 2 IR evaluation
a. Target and designed queries
Search engines
Target
Target 1: obtain the course information for S779
Search queries
b. Target, results and designed search queries
A. Google Search engine
B. Ask.com
C. Average comparison
Bibliography
Question 1
Creating inverted index.
a. Remove all stop words and punctuation
DOC 1
Information retrieval activity obtaining information resources relevant information collection
information resources Searches based full-text content-based indexing
DOC 2
Information retrieval finding material unstructured nature satisfies information large collections
DOC 3
Information systems study complementary networks hardware software people organizations
collect filter process create distribute data
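
The filtering step can be sketched in Python as follows. This is a minimal illustration, not the procedure the assignment actually used: the stop-word list `STOP_WORDS` and the sample sentence are assumptions made for the example, and punctuation is replaced by spaces so that a hyphenated word splits into two tokens.

```python
import string

# Hypothetical stop-word list; the assignment does not say which list it used.
STOP_WORDS = {"a", "an", "and", "are", "for", "from", "in", "is",
              "of", "on", "that", "the", "to", "with"}

def remove_stop_words(text: str) -> str:
    """Strip punctuation, then drop every token found in STOP_WORDS."""
    # Map each punctuation character to a space so "full-text" becomes "full text".
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    tokens = text.translate(table).split()
    return " ".join(t for t in tokens if t.lower() not in STOP_WORDS)

# Illustrative sentence only; it is not the original text of any document above.
sample = ("Information retrieval is the finding of material, "
          "of an unstructured nature, in large collections.")
print(remove_stop_words(sample))
```

A larger stop-word list would remove more tokens, so the output depends directly on that choice.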
b. Applying Porter Stemming algorithm
The next step is to apply the Porter stemming algorithm. The resulting documents are:
DOC 1
Information retrieve active obtain inform resource relevant information collect information
resource Search base full text content base index
DOC 2
Information retrieve find material unstructure nature satisfy information large collect
DOC 3
Information system study complement network hardware software people organ collect filter
process create distribute data
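
To give a feel for how suffix stripping works, the sketch below implements only step 1a of the Porter algorithm, the plural-handling rules. It is deliberately partial: the later steps, which strip further suffixes, are not shown, so its output will not match a complete stemmer. The function name `step_1a` is chosen here for illustration.

```python
def step_1a(word: str) -> str:
    """Apply Porter step 1a (plural endings) to a lowercase word."""
    if word.endswith("sses"):
        return word[:-2]   # caresses -> caress
    if word.endswith("ies"):
        return word[:-2]   # ponies -> poni
    if word.endswith("ss"):
        return word        # process -> process (unchanged)
    if word.endswith("s"):
        return word[:-1]   # resources -> resource
    return word

for token in ["resources", "collections", "networks", "process"]:
    print(token, "->", step_1a(token))
```

The rules are tried in order and only the first match fires, which is why "process" is left alone instead of losing its final "s".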
c. Creating dictionary
Merge all documents into one dictionary file
Term Doc ID
Information 1
Retrieve 1
Active 1
Obtain 1
Inform 1
Resource 1
Relevant 1
Information 1
Resource 1
Search 1
Base 1
Full 1
Text 1
Content 1
Base 1
Index 1
Information 2
Retrieve 2
Find 2
Material 2
Unstructure 2
Nature 2
Satisfy 2
Information 2
Large 2
Collect 2
Information 3
System 3
Study 3
Complement 3
Network 3
Hardware 3
Software 3
People 3
Organ 3
Collect 3
Filter 3
Process 3
Create 3
Distribute 3
Data 3
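
The merge step can be sketched in Python as a flat list of (term, document ID) pairs. The sketch assumes the stemmed documents from step (b), lowercased for uniform comparison; the names `docs` and `merge_documents` are illustrative. Because the pairs are generated directly from the stemmed text, the row count may differ from a table built by hand.

```python
# Stemmed documents as listed in step (b), lowercased (an assumption made here
# so that "Search" and "search" compare equal).
docs = {
    1: "information retrieve active obtain inform resource relevant information "
       "collect information resource search base full text content base index",
    2: "information retrieve find material unstructure nature satisfy "
       "information large collect",
    3: "information system study complement network hardware software people "
       "organ collect filter process create distribute data",
}

def merge_documents(docs):
    """Return one (term, doc_id) pair per token, in document order."""
    return [(term, doc_id)
            for doc_id, text in sorted(docs.items())
            for term in text.split()]

pairs = merge_documents(docs)
print(len(pairs))   # total number of term occurrences
print(pairs[:3])    # first few rows of the merged file
```

Duplicates are kept on purpose at this stage; they are collapsed only after sorting.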
Sort the merged file in alphabetical order
Term Doc ID
Active 1
Base 1
Base 1
Collect 2
Complement 3
Content 1
Find 2
Full 1
Hardware 3
Index 1
Information 1
Information 1
Information 1
Information 2
Information 2
Information 3
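
A minimal Python sketch of the sorting step, followed by one possible way to merge repeated terms into posting lists, is given below. It uses a small hand-picked subset of pairs to stay short, and the names `sorted_pairs` and `build_postings` are assumptions for illustration rather than part of the assignment.

```python
from collections import defaultdict

# A few (term, doc_id) pairs, lowercased; the subset is chosen only to keep
# the example short.
pairs = [
    ("information", 1), ("retrieve", 1), ("base", 1), ("base", 1),
    ("information", 2), ("collect", 2), ("information", 3), ("collect", 3),
]

# Tuples compare element by element, so this sorts by term, then by doc ID.
sorted_pairs = sorted(pairs)

def build_postings(sorted_pairs):
    """Collapse duplicate terms into term -> sorted list of distinct doc IDs."""
    seen = defaultdict(set)
    for term, doc_id in sorted_pairs:
        seen[term].add(doc_id)
    return {term: sorted(ids) for term, ids in seen.items()}

postings = build_postings(sorted_pairs)
for term, doc_ids in postings.items():
    # The length of each posting list is the term's document frequency.
    print(term, len(doc_ids), doc_ids)
```

Sorting by term and then by document ID keeps all occurrences of a term adjacent, which is what makes the later merge a single pass over the list.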