Question Answer | Information retrieval is the activity of obtaining information

Added on  2022/05/25

NAME:
STUDENT ID:
COURSE:
TUTOR:
Contents
Question 1
Boolean model and vector model
Question 2 IR evaluation
Bibliography
Question 1
Document 1
Information retrieval is the activity of obtaining information resources relevant to an
information need from a collection of information resources. Searches can be based on full-text
or other content-based indexing.
Document 2
Information retrieval is finding material of an unstructured nature that satisfies an information
need from within large collections.
Document 3
Information systems is the study of complementary networks of hardware and software that
people and organizations use to collect, filter, process, create, and distribute data.
Based on the documents above:
Step 1: Remove stop words
Doc 1
Information retrieval activity obtaining information resources relevant information collection
information resources Searches based full-text content-based indexing
Doc 2
Information retrieval finding material unstructured nature satisfies information within large
collections
Doc 3
Information systems study complementary networks hardware software people organizations
collect filter process create distribute data
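Step 1 can be sketched in a few lines of Python. The stop list below is an assumption: it is not a standard list, only the smallest set of words that reproduces the output shown above for these three documents. The function name `remove_stop_words` is likewise illustrative.

```python
import re

# Hypothetical stop list: just enough words to reproduce Step 1 for the
# three documents in this exercise. It is an assumption, not a standard list.
STOP_WORDS = {
    "a", "an", "and", "be", "can", "from", "is", "need", "of",
    "on", "or", "other", "that", "the", "to", "use",
}


def remove_stop_words(text):
    """Split text into words (hyphenated words stay whole) and drop stop words."""
    tokens = re.findall(r"[A-Za-z]+(?:-[A-Za-z]+)*", text)
    return [token for token in tokens if token.lower() not in STOP_WORDS]


doc2 = ("Information retrieval is finding material of an unstructured nature "
        "that satisfies an information need from within large collections")
print(" ".join(remove_stop_words(doc2)))
```

Applied to Document 2, the sketch keeps the same eleven words listed under Doc 2 above.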
Step 2: Apply Porter Stemming algorithm
Doc 1
inform retriev activ obtain inform resourc relev inform collect inform resourc search base full
text content base index
Doc 2
inform retriev find materi unstructur natur satisfi inform within larg collect
Doc 3
inform system studi complementari network hardwar softwar peopl organ collect filter
process creat distribut data
Step 3: Merge the documents
Merged list
Term            Document
activ           1
base            1
base            1
collect         1
collect         2
collect         3
complementari   3
content         1
creat           3
data            3
distribut       3
filter          3
find            2
full            1
hardwar         3
index           1
inform          1
inform          1
inform          1
inform          1
inform          2
inform          2
inform          3
larg            2
materi          2
natur           2
network         3
obtain          1
organ           3
peopl           3
process         3
relev           1
resourc         1
resourc         1
retriev         1
retriev         2
satisfi         2
search          1
softwar         3
studi           3
system          3
text            1
unstructur      2
within          2
Merged list with within-document frequencies
Term            Frequency  Document
activ           1          1
base            2          1
collect         1          1
collect         1          2
collect         1          3
complementari   1          3
content         1          1
creat           1          3
data            1          3
distribut       1          3
filter          1          3
find            1          2
full            1          1
hardwar         1          3
index           1          1
inform          4          1
inform          2          2
inform          1          3
larg            1          2
materi          1          2
natur           1          2
network         1          3
obtain          1          1
organ           1          3
peopl           1          3
process         1          3
relev           1          1
resourc         2          1
retriev         1          1
retriev         1          2
satisfi         1          2
search          1          1
softwar         1          3
studi           1          3
system          1          3
text            1          1
unstructur      1          2
within          1          2
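The within-document frequencies can be recomputed directly from the stemmed text of Step 2. The following is a minimal sketch, assuming the lower-cased Porter stems (inform, retriev, and so on) as input; the names `stemmed_docs` and `merged_frequencies` are illustrative.

```python
from collections import Counter

# Stemmed documents from Step 2 (Porter stems, lower-cased).
stemmed_docs = {
    1: ("inform retriev activ obtain inform resourc relev inform collect "
        "inform resourc search base full text content base index"),
    2: "inform retriev find materi unstructur natur satisfi inform within larg collect",
    3: ("inform system studi complementari network hardwar softwar peopl organ "
        "collect filter process creat distribut data"),
}


def merged_frequencies(docs):
    """Return (term, frequency, doc_id) rows sorted by term, then document."""
    rows = []
    for doc_id, text in docs.items():
        for term, freq in Counter(text.split()).items():
            rows.append((term, freq, doc_id))
    return sorted(rows, key=lambda row: (row[0], row[2]))


rows = merged_frequencies(stemmed_docs)
for term, freq, doc_id in rows:
    print(f"{term:<16}{freq:<11}{doc_id}")
```

Counting mechanically like this is a quick check on a hand-built list, since terms that recur within one document (inform, resourc, base) are easy to miscount.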
Step 4: Create Posting file
Term            Frequency  Posting
activ           1          1
base            2          1
collect         3          1, 2, 3
complementari   1          3
content         1          1
creat           1          3
data            1          3
distribut       1          3
filter          1          3
find            1          2
full            1          1
hardwar         1          3
index           1          1
inform          7          1, 2, 3
larg            1          2
materi          1          2
natur           1          2
network         1          3
obtain          1          1
organ           1          3
peopl           1          3
process         1          3
relev           1          1
resourc         2          1
retriev         2          1, 2
satisfi         1          2
search          1          1
softwar         1          3
studi           1          3
system          1          3
text            1          1
unstructur      1          2
within          1          2
Testing the posting file
The posting file can be tested by searching for the keywords system, information and index in a search
engine such as Google and checking whether the documents it retrieves match the documents used to
create the posting file.
Boolean model and vector model
a. Boolean Model queries
1) Network AND People AND NOT search
This query returns Doc3 only: it is the only document that contains both network and peopl, and it does
not contain search.
2) Process AND (retrieve OR retreive)
This query returns no documents: process occurs only in Doc3, while retriev occurs only in Doc1 and
Doc2.
3) Information AND Data
This query returns Doc3 only: inform occurs in all three documents, but data occurs only in Doc3.
b. Vector Model
Query Q = (Information, system, index)
Each document vector holds the within-document frequencies of the stems inform, system and index, and
σ is the cosine similarity between the document vector and the query vector.
Doc 1
D1 = <4, 0, 1>
Q = <1, 1, 1>
\[
\sigma(D_1, Q) = \frac{4 \times 1 + 0 \times 1 + 1 \times 1}{\sqrt{4^2 + 0^2 + 1^2}\,\sqrt{1^2 + 1^2 + 1^2}} = \frac{5}{\sqrt{17}\,\sqrt{3}} \approx 0.700
\]
Doc 2
D2 = <2, 0, 0>
Q = <1, 1, 1>
\[
\sigma(D_2, Q) = \frac{2 \times 1 + 0 \times 1 + 0 \times 1}{\sqrt{2^2 + 0^2 + 0^2}\,\sqrt{1^2 + 1^2 + 1^2}} = \frac{2}{\sqrt{4}\,\sqrt{3}} \approx 0.577
\]
Doc 3
D3 = <1, 1, 0>
Q = <1, 1, 1>
\[
\sigma(D_3, Q) = \frac{1 \times 1 + 1 \times 1 + 0 \times 1}{\sqrt{1^2 + 1^2 + 0^2}\,\sqrt{1^2 + 1^2 + 1^2}} = \frac{2}{\sqrt{2}\,\sqrt{3}} \approx 0.816
\]
Ranked by similarity, the documents are retrieved in the order Doc 3, Doc 1, Doc 2.
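The cosine similarity calculation can be checked with a short script. This is a sketch under stated assumptions: the document vectors are the counts of the stems inform, system and index in the stemmed documents of Step 2, and the function name `cosine_similarity` is illustrative.

```python
import math


def cosine_similarity(doc_vec, query_vec):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(d * q for d, q in zip(doc_vec, query_vec))
    norm_d = math.sqrt(sum(d * d for d in doc_vec))
    norm_q = math.sqrt(sum(q * q for q in query_vec))
    if norm_d == 0 or norm_q == 0:
        return 0.0  # an all-zero vector shares no terms with anything
    return dot / (norm_d * norm_q)


# Vector components are ordered as (inform, system, index).
query = (1, 1, 1)
# Counts taken from the stemmed documents of Step 2.
doc_vectors = {"Doc 1": (4, 0, 1), "Doc 2": (2, 0, 0), "Doc 3": (1, 1, 0)}

scores = {name: cosine_similarity(vec, query) for name, vec in doc_vectors.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
for name in ranking:
    print(f"{name}: {scores[name]:.3f}")
```

A useful sanity check is that cosine similarity between non-negative vectors always lies between 0 and 1, so any hand-computed value above 1 signals an arithmetic slip.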
Boolean Model and Vector Model comparison
In information retrieval, both the Boolean model and the vector model determine which documents are
retrieved for a given query. The Boolean model returns the set of documents that match the query but
gives no order among them, whereas the vector model scores each document against the query and so also
gives the order in which the documents are retrieved, from the first document to the last.
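To make the contrast concrete, the sketch below builds a posting file from the stemmed documents and evaluates the three Boolean queries from part (a) as set operations. The result of each query is an unordered set of document IDs, with no score attached. The helper names `build_postings` and `docs_with` are illustrative, and the stem of the misspelled term "retreive" is assumed to be "retreiv"; either way it is not in the index.

```python
# Stemmed documents from Step 2 (Porter stems, lower-cased).
stemmed_docs = {
    1: ("inform retriev activ obtain inform resourc relev inform collect "
        "inform resourc search base full text content base index"),
    2: "inform retriev find materi unstructur natur satisfi inform within larg collect",
    3: ("inform system studi complementari network hardwar softwar peopl organ "
        "collect filter process creat distribut data"),
}


def build_postings(docs):
    """Map each term to the set of document IDs that contain it."""
    postings = {}
    for doc_id, text in docs.items():
        for term in text.split():
            postings.setdefault(term, set()).add(doc_id)
    return postings


postings = build_postings(stemmed_docs)
all_docs = set(stemmed_docs)


def docs_with(term):
    """Posting set for a stemmed term; empty if the term is not indexed."""
    return postings.get(term, set())


# AND is set intersection, OR is union, NOT is the complement within all_docs.
q1 = docs_with("network") & docs_with("peopl") & (all_docs - docs_with("search"))
q2 = docs_with("process") & (docs_with("retriev") | docs_with("retreiv"))
q3 = docs_with("inform") & docs_with("data")
print(sorted(q1), sorted(q2), sorted(q3))
```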
Question 2 IR evaluation
Search engines
Bing
Yahoo
Target
Target 4: Obtain Oracle SQL Tutorial
Queries
Query 1= Oracle SQL Tutorial
Query 2= Oracle SQL notes
Bing
      Query 1                       Query 2
Rank  Rel.  Precision  Recall       Rel.  Precision  Recall
1     R     1          0.0714       R     1          0.071
2     R     1          0.143        R     1          0.143
3           1          0.214              0.667      0.143
4     R     1          0.286              0.75       0.214
5     R     1          0.357        R     0.8        0.289
6     R     0.833      0.429              0.667      0.289
7           0.714      0.429              0.571      0.289
8           0.625      0.429        R     0.625      0.357
9     R     0.667      0.5          R     0.667      0.429
10          0.6        0.5                0.6        0.429
11          0.636      0.571              0.636      0.5
12    R     0.667      0.643        R     0.667      0.571
13          0.692      0.714        R     0.615      0.571
14          0.643      0.714              0.571      0.571
15          0.6        0.714        R     0.6        0.643
16          0.5625     0.714              0.5625     0.643
17          0.529      0.714              0.529      0.643
18          0.556      0.786              0.556      0.714
19    R     0.526      0.786              0.526      0.714
20    R     0.55       0.857        R     0.55       0.786
Recall  Interpolated precision  Interpolated precision  Average
        (Query 1)               (Query 2)               precision
0       1                       1                       1
0.1     1                       1                       1
0.2     1                       0.8                     0.9
0.3     1                       0.625                   0.8125
0.4     0.833                   0.667                   0.75
0.5     0.667                   0.667                   0.667
0.6     0.667                   0.615                   0.641
0.7     0.526                   0.6                     0.563
0.8     0.55                    0.55                    0.55
0.9     0                       0                       0
1       0                       0                       0
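Precision and recall at each rank, and the interpolated precision at the eleven standard recall levels, can be sketched as follows. The relevance judgments here are hypothetical and are not the Bing results. The interpolation uses the common rule of taking the highest precision at any recall at or above each level, which may differ from the rule used to produce the table above.

```python
def precision_recall_at_ranks(relevant_flags, total_relevant):
    """Precision and recall after each rank of a result list.

    relevant_flags[i] is True when the result at rank i + 1 is relevant.
    """
    points = []
    hits = 0
    for rank, is_relevant in enumerate(relevant_flags, start=1):
        if is_relevant:
            hits += 1
        points.append((hits / rank, hits / total_relevant))
    return points


def interpolate_11_point(points):
    """Interpolated precision at recall 0.0, 0.1, ..., 1.0.

    For each level, take the highest precision observed at any recall
    greater than or equal to that level, or 0 if the level is never reached.
    """
    levels = [i / 10 for i in range(11)]
    return [max((p for p, r in points if r >= level), default=0.0)
            for level in levels]


# Hypothetical judgments: five results, four relevant documents in total.
flags = [True, True, False, True, False]
points = precision_recall_at_ranks(flags, total_relevant=4)
interpolated = interpolate_11_point(points)
print(points)
print(interpolated)
```

Under this rule the interpolated curve never rises as recall increases, which makes curves from different queries straightforward to average level by level.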
Figure 1: Bing precision against recall plotted graph
Yahoo search engine
      Query 1                       Query 2
Rank  Rel.  Precision  Recall       Rel.  Precision  Recall
1     R     1          0.0714       R     1          0.071
2     R     1          0.143        R     1          0.143
3           1          0.214              0.667      0.143
4     R     1          0.286              0.5        0.143
5     R     1          0.357        R     0.6        0.214
6     R     0.833      0.429              0.5        0.214
7           0.714      0.429              0.429      0.214
8           0.625      0.429              0.375      0.214
9     R     0.667      0.5                0.444      0.286
10          0.6        0.5                0.4        0.286
11          0.636      0.571        R     0.455      0.358
12    R     0.667      0.643              0.417      0.358
13          0.692      0.714        R     0.462      0.429
14          0.643      0.714              0.4        0.429
15          0.6        0.714        R     0.4375     0.5
16          0.5625     0.714        R     0.412      0.5
17          0.529      0.714              0.389      0.5
18          0.556      0.786              0.368      0.5
19    R     0.526      0.786              0.4        0.571
20    R     0.55       0.857
Recall  Interpolated precision  Interpolated precision  Average
        (Query 1)               (Query 2)               precision
0       1                       1                       1
0.1     1                       1                       1
0.2     1                       0.6                     0.8
0.3     1                       0.455                   0.7275
0.4     0.833                   0.462                   0.6475
0.5     0.667                   0.4375                  0.55225
0.6     0.667                   0.412                   0.5395
0.7     0.526                   0                       0.263
0.8     0.55                    0                       0.275
0.9     0                       0                       0
1       0                       0                       0
Figure 2: Yahoo Precision against recall plotted graph
Average Comparison
Recall  Bing average precision  Yahoo average precision
0       1                       1
0.1     1                       1
0.2     0.9                     0.8
0.3     0.8125                  0.7275
0.4     0.75                    0.6475
0.5     0.667                   0.55225
0.6     0.641                   0.5395
0.7     0.563                   0.263
0.8     0.55                    0.275
0.9     0                       0
1       0                       0
Figure 3: Bing vs Yahoo average precision vs recall plotted graph
The average precision against recall graph can be used to decide which of Bing and Yahoo is the better
search engine for this target. Bing performs better than Yahoo: its average interpolated precision is
equal to or higher than Yahoo's at every recall level, for example 0.9 against 0.8 at recall 0.2 and
0.563 against 0.263 at recall 0.7. Recall is the number of relevant documents retrieved divided by the
total number of relevant documents, and precision is the number of relevant documents retrieved divided
by the total number of documents retrieved. The two engines give identical figures for Query 1, so the
difference comes from Query 2, where Bing reaches higher precision and recall than Yahoo. Bing is
therefore the better choice for obtaining an Oracle SQL tutorial.
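The averages behind this comparison can be reproduced from the interpolated precision values in the Bing and Yahoo tables. The sketch below copies those values as they appear in the tables; the function name `average_per_level` is illustrative.

```python
recall_levels = [i / 10 for i in range(11)]

# Interpolated precision per query, copied from the Bing and Yahoo tables.
bing_q1 = [1, 1, 1, 1, 0.833, 0.667, 0.667, 0.526, 0.55, 0, 0]
bing_q2 = [1, 1, 0.8, 0.625, 0.667, 0.667, 0.615, 0.6, 0.55, 0, 0]
yahoo_q1 = [1, 1, 1, 1, 0.833, 0.667, 0.667, 0.526, 0.55, 0, 0]
yahoo_q2 = [1, 1, 0.6, 0.455, 0.462, 0.4375, 0.412, 0, 0, 0, 0]


def average_per_level(*queries):
    """Mean interpolated precision across queries at each recall level."""
    return [sum(values) / len(values) for values in zip(*queries)]


bing_avg = average_per_level(bing_q1, bing_q2)
yahoo_avg = average_per_level(yahoo_q1, yahoo_q2)

for level, bing, yahoo in zip(recall_levels, bing_avg, yahoo_avg):
    print(f"{level:.1f}  Bing {bing:.4f}  Yahoo {yahoo:.4f}")

# True when Bing is at least as good as Yahoo at every recall level.
bing_never_worse = all(b >= y for b, y in zip(bing_avg, yahoo_avg))
print(bing_never_worse)
```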
Bibliography
Koehrsen, W. (2018). Beyond Accuracy: Precision and Recall – Towards Data Science.
[online] Towards Data Science. Available at:
https://towardsdatascience.com/beyond-accuracy-precision-and-recall-3da06bea9f6c [Accessed 26 Jan. 2019].
Mikulski, B. (2018). Precision vs. recall - explanation – Bartosz Mikulski. [online] Bartosz
Mikulski. Available at:
https://mikulskibartosz.name/precision-vs-recall-explanation-aada1ec393ec [Accessed 24 Jan. 2019].