
Data modeling in R: Conditional Probability, Entropy, Correlation and Covariance Coefficient

   

Added on  2023-06-14

Data_Modelling
Student:
March 31, 2018
Title: Data modeling in R
Assignment type: Analytical computation and
simulation, in R
Student Name: XXX
Tutor’s Name: XXX
Due date: XX-XX

Question 1
Conditional probability
A die is tossed 7 times. Let A be the event that each value appears at least once, and B the event
that the outcome consists of alternating numbers. Hence:
Analytical
The conditional probability of A given B is written
Pr(A|B)
such that
Pr(A and B) = Pr(A|B) × Pr(B)
where Pr(A and B) is the joint probability of A and B.
Therefore:
$$\Pr(A \mid B) = \frac{\Pr(A \cap B)}{\Pr(B)}$$
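To make the definition concrete, here is a minimal sketch on a much smaller sample space than the one in this question: a single roll of a fair die. The events used here (an even face, and a face greater than 3) and all variable names are illustrative assumptions, not part of the assignment.

```r
# Sample space: one roll of a fair six-sided die, all faces equally likely
faces <- 1:6

# Illustrative events (assumptions for this sketch only)
is_even   <- faces %% 2 == 0   # event "face is even"
is_gt3    <- faces > 3         # event "face is greater than 3"

# Joint and marginal probabilities as proportions of the sample space
p_joint   <- mean(is_even & is_gt3)
p_given   <- mean(is_gt3)

# Conditional probability: Pr(even | greater than 3) = joint / marginal
p_even_given_gt3 <- p_joint / p_given
p_even_given_gt3
```

Because every outcome is equally likely, each probability is simply the proportion of outcomes that satisfy the condition.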
Probability of a number on the die occurring at least once, i.e. event A
k = 7 (k = number of tosses)
p = 0.167 (p = probability of the event occurring)
q = 0.833 (q = probability of the event not occurring)
Hence, given the binomial probability
$$b(x; k, p) = \frac{k!}{x!\,(k-x)!}\, p^{x}\,(1-p)^{k-x}$$
with x = 6:
$$b(6; 7, 0.167) = \frac{7!}{6!\,1!} \times 0.167^{6} \times 0.833^{1} = 0.000126$$
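The binomial term above can be checked numerically. The sketch below evaluates the formula by hand with `choose()` and compares it with R's built-in `dbinom()`. It uses p = 1/6 rather than the rounded 0.167, so the result differs slightly from 0.000126 in the last digits; the variable names are illustrative.

```r
k <- 7      # number of tosses
x <- 6      # number of successes used in the working above
p <- 1 / 6  # probability of a given face on one toss

# Binomial probability written out term by term
by_hand  <- choose(k, x) * p^x * (1 - p)^(k - x)

# The same quantity from the built-in binomial density
built_in <- dbinom(x, size = k, prob = p)

by_hand
built_in
```

Both values should agree, which confirms that the hand formula and `dbinom()` describe the same quantity.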
Probability of a number occurring but not in adjacent tosses, i.e. without replacement, event B
Given that the probability a number occurs is 0.000126, the succeeding number differs from it when
the sequence runs either odd followed by even or even followed by odd. Hence:
Probability of even:
$$\frac{7!}{6!\,1!} \times 0.167^{6} \times 0.833^{1} = 0.000126$$
And
Probability of odd:
$$\frac{7!}{6!\,1!} \times 0.167^{6} \times 0.833^{1} = 0.000126$$
The probability that either the odd pattern (A) or the even pattern (B) occurs is P(A ∪ B),
which, treating the two patterns as mutually exclusive, is P(A) + P(B).
But P(A) = 0.000126 and P(B) = 0.000126
Therefore
0.000126 + 0.000126=0.000252
Alternatively
0.000126×2 =0.000252
Hence:
pr(b)=0.000252
Therefore calculating conditional probability:
$$\Pr(A \mid B) = \frac{\Pr(A \cap B)}{\Pr(B)}$$
But, treating A and B as independent,
Pr(A and B) = Pr(A) × Pr(B) = 0.000126 × 0.000252 ≈ 0.0000000318
Hence:
$$\Pr(A \mid B) = \frac{\Pr(A \cap B)}{\Pr(B)} = \frac{0.0000000318}{0.000252} = 0.000126$$
Pr(A|B) = 0.000126
Simulation
#calculating probability of an event occurring when rolling a die once, i.e. p
dicex<-6
n<-1
dice.prob<-function(dices) prod(1/dices)
dice.prob(c(n,dicex))
## [1] 0.1666667
#calculating probability of an event not occurring when rolling a die, i.e. q
dicex<-6
n<-1
dice.prob<-function(dices) 1-prod(1/dices)
dice.prob(c(n,dicex))
## [1] 0.8333333
#calculating probability of a number occurring at least once in 7 dice tosses, i.e. event A
eventA<-dbinom(6, size=7, prob=0.1666667)
eventA
## [1] 0.0001250287
#calculating probability of event B
evendice<-(dbinom(6, size=7, prob=0.1666667))
odddice<-(dbinom(6, size=7, prob=0.1666667))
EventB<-evendice+odddice
EventB
## [1] 0.0002500574
#calculating conditional probability for event A given event B p (A/B)
conditprob<-(EventB*eventA)/EventB
conditprob
## [1] 0.0001250287
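As a cross-check that does not rely on the binomial shortcut, the sketch below enumerates every possible sequence of 7 tosses and counts outcomes directly. It assumes one particular reading of the events, which is an interpretation and not necessarily the one intended in the working above: A means all six faces appear at least once, and B means no two consecutive tosses show the same face. All variable names are illustrative.

```r
# Enumerate every ordered outcome of 7 tosses of a fair six-sided die.
# Assumption: all 6^7 = 279936 sequences are equally likely.
rolls <- as.matrix(expand.grid(rep(list(1:6), 7)))

# Assumed reading of event A: every face 1..6 appears at least once.
face_seen <- lapply(1:6, function(f) rowSums(rolls == f) > 0)
in_A <- Reduce(`&`, face_seen)

# Assumed reading of event B: no two consecutive tosses show the same face.
# Compare tosses 2..7 with tosses 1..6 and require zero matches.
in_B <- rowSums(rolls[, -1] == rolls[, -7]) == 0

# Probabilities as proportions of the equally likely outcomes
p_B  <- mean(in_B)
p_AB <- mean(in_A & in_B)

# Conditional probability of A given B under these assumed definitions
p_A_given_B <- p_AB / p_B
p_A_given_B
```

Under a different reading of A or B, only the two lines defining `in_A` and `in_B` need to change; the conditional probability step stays the same.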
Question 2
Entropy
Imputing missing data (handling NAs)
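Before imputing, it can help to see how many values are actually missing. The sketch below counts NAs per column and the number of complete rows using base R only. The data frame `toy` and its columns are hypothetical stand-ins for the assignment's data file, which is not shown here.

```r
# Hypothetical toy data standing in for the assignment's CSV file
toy <- data.frame(
  x = c(1.2, NA, 3.4, 4.1, NA),
  y = c(10, 12, NA, 15, 18)
)

# Number of missing values in each column
na_per_column <- colSums(is.na(toy))
na_per_column

# Number of rows with no missing values at all
complete_rows <- sum(complete.cases(toy))
complete_rows
```

Columns with many NAs, or very few complete rows, are the cases where the choice of imputation method matters most.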
#load the mice package used for imputation
library(mice)
#raw data
assignment<-read.table("c:/data.csv",
header=TRUE,
sep=",")
assignment
#imputed data
newdata<-mice(assignment, m=10, maxit=40, method='pmm', seed=1000)
fulldata<-complete(newdata, 1)
fulldata
#summary of imputed and raw data
summary(fulldata)
#histogram of Y variables