IR Evaluation: Search Engines

   

Added on  2023-03-21


Question 1
1a) Results of removing the stop words
Document 1
Data science interdisciplinary field scientific methods, processes, algorithms systems extract
knowledge insights data various forms, structured unstructured.
Document 2
Data mining process discovering patterns large data sets involving methods intersection
machine learning, statistics, database systems
Document 3
Information systems study complementary networks hardware software people organizations
collect, filter, process, create, distribute data
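The stop-word step above can be sketched as a simple token filter. The source does not say which stop list was used, so STOP_WORDS below is a small hypothetical set chosen only for illustration, and remove_stop_words is an illustrative name rather than anything from the assignment.

```python
import re

# Hypothetical stop list: the source does not specify one.
STOP_WORDS = {"a", "an", "and", "at", "both", "from", "in", "is",
              "of", "or", "that", "the", "to", "uses"}

def remove_stop_words(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in stop_words]

print(remove_stop_words("Data mining is the process of discovering patterns"))
```

A real run would substitute the stop list prescribed by the course or the search engine in use.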
Rule format: (condition) S1 -> S2
Step 1a
SSES -> SS
Processes -> process
IES -> I
SS -> SS
Process -> process
S ->
Algorithms -> algorithm
Systems -> system
Insights -> insight
Forms -> form
Patterns -> pattern
Sets -> set
Methods -> method
Statistics -> statistic
Networks -> network
Organizations -> organization
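The four Step 1a rules can be sketched as an ordered suffix check, where the first matching rule wins. This is a minimal illustration that assumes lowercase input; step_1a is an illustrative name.

```python
def step_1a(word):
    """Apply the Step 1a plural rules; the first matching suffix decides."""
    if word.endswith("sses"):
        return word[:-2]          # SSES -> SS
    if word.endswith("ies"):
        return word[:-2]          # IES -> I
    if word.endswith("ss"):
        return word               # SS -> SS (unchanged)
    if word.endswith("s"):
        return word[:-1]          # S -> (removed)
    return word

for w in ("processes", "process", "algorithms", "statistics"):
    print(w, "->", step_1a(w))
```

Checking SS before S is what keeps "process" from losing its final letter.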
Step 1b
(*v*) ED ->
Structured -> structur
Unstructured -> unstructur
(*v*) ING ->
Discovering -> discover
Involving -> involv
Learning -> learn
(m=1 and *o) -> E
Mining -> min -> mine
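The conditions used in Step 1b (*v* for "stem contains a vowel", m for the number of vowel-consonant sequences, *o for "stem ends consonant-vowel-consonant, with the last letter not w, x or y") can be sketched as below. The helper names cv_pattern, measure, fix_up and step_1b are illustrative, and the code assumes lowercase input.

```python
def cv_pattern(stem):
    """Map each letter to 'c' or 'v'; y is a vowel only after a consonant."""
    pattern = ""
    for i, ch in enumerate(stem):
        if ch in "aeiou" or (ch == "y" and i > 0 and pattern[-1] == "c"):
            pattern += "v"
        else:
            pattern += "c"
    return pattern

def measure(stem):
    """Porter's m: the number of vowel-to-consonant transitions."""
    return cv_pattern(stem).count("vc")

def fix_up(stem):
    """Clean-up applied after ED or ING has been removed."""
    pattern = cv_pattern(stem)
    if stem.endswith(("at", "bl", "iz")):
        return stem + "e"
    if (len(stem) >= 2 and stem[-1] == stem[-2]
            and pattern[-1] == "c" and stem[-1] not in "lsz"):
        return stem[:-1]                       # undouble the final consonant
    if measure(stem) == 1 and pattern.endswith("cvc") and stem[-1] not in "wxy":
        return stem + "e"                      # (m=1 and *o) -> E
    return stem

def step_1b(word):
    if word.endswith("eed"):
        return word[:-1] if measure(word[:-3]) > 0 else word
    for suffix in ("ed", "ing"):
        if word.endswith(suffix):
            stem = word[:-len(suffix)]
            return fix_up(stem) if "v" in cv_pattern(stem) else word
    return word

for w in ("structured", "discovering", "involving", "learning", "mining"):
    print(w, "->", step_1b(w))
```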
Step 1c
(*v*) Y -> I
Interdisciplinary -> interdisciplinari
Complementary -> complementari
Study -> studi
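Step 1c is a single rule: a final Y becomes I when the remaining stem contains a vowel. A minimal sketch, assuming lowercase input and using illustrative helper names:

```python
def cv_pattern(stem):
    """Map each letter to 'c' or 'v'; y is a vowel only after a consonant."""
    pattern = ""
    for i, ch in enumerate(stem):
        if ch in "aeiou" or (ch == "y" and i > 0 and pattern[-1] == "c"):
            pattern += "v"
        else:
            pattern += "c"
    return pattern

def step_1c(word):
    """(*v*) Y -> I: only rewrite when the stem before y has a vowel."""
    if word.endswith("y") and "v" in cv_pattern(word[:-1]):
        return word[:-1] + "i"
    return word

for w in ("interdisciplinary", "complementary", "study", "sky"):
    print(w, "->", step_1c(w))
```

The word "sky" is included to show the condition failing: its stem "sk" has no vowel, so it is left unchanged.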
Step 2
(m>0) ATION -> ATE
Information -> informate
(m>0) IZATION -> IZE
Organization -> organize
(m>0) IVITI -> IVE
Activiti -> active
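The Step 2 rules listed above can be sketched as a suffix table guarded by m > 0. Only the three rules shown in this answer are included, not the full Step 2 rule set of the Porter algorithm, and the names STEP_2_RULES and step_2 are illustrative. IZATION is checked before ATION so that the longer suffix wins.

```python
def measure(stem):
    """Porter's m: count vowel-to-consonant transitions (y is a vowel after a consonant)."""
    pattern = ""
    for i, ch in enumerate(stem):
        if ch in "aeiou" or (ch == "y" and i > 0 and pattern[-1] == "c"):
            pattern += "v"
        else:
            pattern += "c"
    return pattern.count("vc")

# Longest suffix first; subset of Step 2 taken from the rules listed above.
STEP_2_RULES = (("ization", "ize"), ("ation", "ate"), ("iviti", "ive"))

def step_2(word):
    for suffix, replacement in STEP_2_RULES:
        if word.endswith(suffix):
            stem = word[:-len(suffix)]
            return stem + replacement if measure(stem) > 0 else word
    return word

for w in ("information", "organization", "activiti", "nation"):
    print(w, "->", step_2(w))
```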
Step 3
Step 4
(m>1) AL ->
(m>1) ATE ->
Informate -> inform
(m>1 and (*S or *T)) ION ->
Intersection -> intersect
(m>1) IZE ->
Organize -> organ
(m>1) ANT ->
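Step 4 strips a suffix outright when the remaining stem has m > 1, with the extra requirement for ION that the stem end in S or T. The sketch below covers only the five suffixes listed above (the full algorithm has more), assumes lowercase input, and uses illustrative names.

```python
def measure(stem):
    """Porter's m: count vowel-to-consonant transitions (y is a vowel after a consonant)."""
    pattern = ""
    for i, ch in enumerate(stem):
        if ch in "aeiou" or (ch == "y" and i > 0 and pattern[-1] == "c"):
            pattern += "v"
        else:
            pattern += "c"
    return pattern.count("vc")

# Subset of Step 4 suffixes, taken from the rules listed above.
STEP_4_SUFFIXES = ("al", "ate", "ion", "ize", "ant")

def step_4(word):
    for suffix in STEP_4_SUFFIXES:
        if word.endswith(suffix):
            stem = word[:-len(suffix)]
            if measure(stem) <= 1:
                return word                      # condition m > 1 fails
            if suffix == "ion" and not stem.endswith(("s", "t")):
                return word                      # ION needs *S or *T
            return stem
    return word

for w in ("informate", "intersection", "organize", "create"):
    print(w, "->", step_4(w))
```

"create" stays unchanged here because its stem "cre" has m = 0; its final E is handled later in Step 5a.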
Step 5a
(m>1) E ->
Knowledge -> knowledg
Machine -> machin
Hardware -> hardwar
Software -> softwar
Distribute -> distribut
(m=1 and not *o) E ->
Create -> creat
People -> peopl
Large -> larg
Searche -> search
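The two Step 5a rules differ only in the measure of the stem left after dropping the final E. A minimal sketch, assuming lowercase input and using illustrative helper names:

```python
def cv_pattern(stem):
    """Map each letter to 'c' or 'v'; y is a vowel only after a consonant."""
    pattern = ""
    for i, ch in enumerate(stem):
        if ch in "aeiou" or (ch == "y" and i > 0 and pattern[-1] == "c"):
            pattern += "v"
        else:
            pattern += "c"
    return pattern

def measure(stem):
    return cv_pattern(stem).count("vc")

def ends_cvc(stem):
    """The *o condition: consonant-vowel-consonant ending, last letter not w, x or y."""
    return cv_pattern(stem).endswith("cvc") and stem[-1] not in "wxy"

def step_5a(word):
    if not word.endswith("e"):
        return word
    stem = word[:-1]
    m = measure(stem)
    if m > 1 or (m == 1 and not ends_cvc(stem)):
        return stem
    return word

for w in ("knowledge", "large", "create", "people", "mine"):
    print(w, "->", step_5a(w), "(m of stem =", measure(w[:-1]), ")")
```

"mine" keeps its E because the stem "min" has m = 1 and does satisfy *o.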
Stemmed documents
Document 1
Data scienc interdisciplinari field scientif method process algorithm system extract knowledg
insight data variou form structur unstructur
Document 2
Data mine process discov pattern larg data set involv method intersect machin learn statist
databas system
Document 3
Inform system studi complementari network hardwar softwar peopl organ collect filter
process creat distribut data
1b) Merged inverted list including within-document frequencies
Merged sorted list with within-document frequency
Term Document Frequency
algorithm 1 1
collect 3 1
complementari 3 1
creat 3 1
data 1 2
data 2 2
data 3 1
databas 2 1
discov 2 1
distribut 3 1
extract 1 1
field 1 1
filter 3 1
form 1 1
hardwar 3 1
inform 3 1
insight 1 1
interdisciplinari 1 1
intersect 2 1
involv 2 1
knowledg 1 1
larg 2 1
learn 2 1
machin 2 1
method 1 1
method 2 1
mine 2 1
network 3 1
organ 3 1
pattern 2 1
peopl 3 1
process 1 1
process 2 1
process 3 1
scienc 1 1
scientif 1 1
set 2 1
softwar 3 1
statist 2 1
structur 1 1
studi 3 1
system 1 1
system 2 1
system 3 1
unstructur 1 1
variou 1 1
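A merged list of this shape can be produced by counting terms per document and sorting the (term, document, frequency) triples. The sketch below is illustrative only: it uses the stemmed text of Documents 1 and 2 from above, lowercased, and build_postings is a hypothetical name.

```python
from collections import Counter

# Stemmed text of Documents 1 and 2, lowercased so "Data" and "data" merge.
STEMMED_DOCS = {
    1: ("data scienc interdisciplinari field scientif method process algorithm "
        "system extract knowledg insight data variou form structur unstructur"),
    2: ("data mine process discov pattern larg data set involv method intersect "
        "machin learn statist databas system"),
}

def build_postings(docs):
    """Return sorted (term, doc_id, within-document frequency) triples."""
    postings = []
    for doc_id, text in docs.items():
        counts = Counter(text.split())
        postings.extend((term, doc_id, tf) for term, tf in counts.items())
    return sorted(postings)   # sorts by term, then by document id

for term, doc_id, tf in build_postings(STEMMED_DOCS):
    print(term, doc_id, tf)
```

Sorting tuples orders entries by term first and document number second, which matches the layout of the list above.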