MAT9004 Assignment 3: Neural Network Optimization and Analysis

Added on 2023/03/31

AI Summary

This assignment solution provides a detailed analysis of neural network performance, focusing on optimization techniques. It covers finding the gradient (∇f(x, y)) and Hessian matrix (H(x, y)) for a given performance function f(x, y). The solution also includes locating and classifying stationary points, determining values for maximizing and minimizing network performance, and identifying the direction of the function's most rapid decrease. The assignment also solves probability questions related to traffic flow in a CBD section. Additionally, it addresses combinatorial problems related to mixed dozens of wine bottles with specific constraints. The assignment also includes determining valid PDF and calculating maximum value of z. This solution provides a comprehensive approach to solving problems related to neural network optimization and statistical analysis.

Student Name:
Course:
Due Date:
Problem 1
Part a
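Find ∇f(x, y).
Soln.
For a function of the two variables x and y, the gradient is
\[ \nabla f(x, y) = \mathbf{i}\,\frac{\partial f}{\partial x} + \mathbf{j}\,\frac{\partial f}{\partial y} \]
We solve for the partial derivatives with respect to x and y individually.
\[ \frac{\partial f}{\partial x} = \frac{\partial}{\partial x}\left(x^2 + y^2\right)^{3/2} - 6\,\frac{\partial}{\partial x}\left(x^2 + y^2\right) + \frac{\partial}{\partial x}(9y) \]
Applying the chain rule to the first term,
\[ \frac{\partial g(u)}{\partial x} = \frac{dg}{du}\,\frac{\partial u}{\partial x} \]
Let $u = x^2 + y^2$ and $g(u) = u^{3/2}$. Then
\[ \frac{d}{du}\left(u^{3/2}\right)\frac{\partial u}{\partial x} = \frac{3}{2}\,u^{1/2}\,(2x) = 3x\,u^{1/2} \]
But $u = x^2 + y^2$, and the remaining two terms contribute $-6(2x) = -12x$ and $0$, so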

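\[ \frac{\partial f}{\partial x} = 3x\left(x^2 + y^2\right)^{1/2} - 12x \]
Now solving for $\partial f/\partial y$:
\[ \frac{\partial f}{\partial y} = \frac{\partial}{\partial y}\left(x^2 + y^2\right)^{3/2} - 6\,\frac{\partial}{\partial y}\left(x^2 + y^2\right) + \frac{\partial}{\partial y}(9y) \]
Here we again apply the chain rule, with $u = x^2 + y^2$, to obtain our solution:
\[ \frac{\partial f}{\partial y} = \frac{3}{2}\,u^{1/2}\,(2y) - 6(2y) + 9 \]
But $u = x^2 + y^2$, so
\[ \frac{\partial f}{\partial y} = 3y\left(x^2 + y^2\right)^{1/2} - 12y + 9 \]
Now substituting our calculated solutions into the gradient formula, we have
\[ \nabla f = \mathbf{i}\left(3x\left(x^2 + y^2\right)^{1/2} - 12x\right) + \mathbf{j}\left(3y\left(x^2 + y^2\right)^{1/2} - 12y + 9\right) \]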
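As a quick sanity check on the gradient, the sketch below compares the closed-form partial derivatives with central finite differences. The function names (`f`, `grad_f`, `numeric_grad`), the step size and the test point are illustrative choices, not part of the assignment.

```python
import math

def f(x, y):
    """Performance function f(x, y) = (x^2 + y^2)^(3/2) - 6(x^2 + y^2) + 9y."""
    r2 = x * x + y * y
    return r2 ** 1.5 - 6 * r2 + 9 * y

def grad_f(x, y):
    """Closed-form gradient from Part a."""
    r = math.sqrt(x * x + y * y)
    return (3 * x * r - 12 * x, 3 * y * r - 12 * y + 9)

def numeric_grad(x, y, h=1e-6):
    """Central-difference approximation of the gradient (h is an assumed step size)."""
    gx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    gy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return (gx, gy)

exact = grad_f(0.75, 1.0)
approx = numeric_grad(0.75, 1.0)
print(exact, approx)
```

If the two tuples agree to several decimal places, the symbolic derivatives are at least consistent with the function at that point.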
Part b
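We will use the Hessian formula, which is stated as
\[ H = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x\,\partial y} \\[2ex] \dfrac{\partial^2 f}{\partial y\,\partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix} \]
For $\partial^2 f/\partial x^2$, the first derivative is
\[ \frac{\partial f}{\partial x} = 3x\left(x^2 + y^2\right)^{1/2} - 12x \]
Differentiating again with respect to x (product rule) yields
\[ \frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(3x\left(x^2 + y^2\right)^{1/2} - 12x\right) = 3\left(x^2 + y^2\right)^{1/2} + \frac{3x}{2}\cdot 2x\left(x^2 + y^2\right)^{-1/2} - 12 \]
Simplifying the above, we have
\[ \frac{\partial^2 f}{\partial x^2} = 3\left(x^2 + y^2\right)^{1/2} + 3x^2\left(x^2 + y^2\right)^{-1/2} - 12 \]
The mixed partial derivatives are equal:
\[ \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y} = 3xy\left(x^2 + y^2\right)^{-1/2} \]
Similarly,
\[ \frac{\partial^2 f}{\partial y^2} = \frac{\partial}{\partial y}\left(3y\left(x^2 + y^2\right)^{1/2} - 12y + 9\right) = 3\left(x^2 + y^2\right)^{1/2} + \frac{3y}{2}\cdot 2y\left(x^2 + y^2\right)^{-1/2} - 12 \]
Simplifying the above, we have
\[ \frac{\partial^2 f}{\partial y^2} = 3\left(x^2 + y^2\right)^{1/2} + 3y^2\left(x^2 + y^2\right)^{-1/2} - 12 \]
Now substituting the above calculated solutions into the formula gives
\[ H(x, y) = \begin{bmatrix} 3\left(x^2 + y^2\right)^{1/2} + 3x^2\left(x^2 + y^2\right)^{-1/2} - 12 & 3xy\left(x^2 + y^2\right)^{-1/2} \\[2ex] 3xy\left(x^2 + y^2\right)^{-1/2} & 3\left(x^2 + y^2\right)^{1/2} + 3y^2\left(x^2 + y^2\right)^{-1/2} - 12 \end{bmatrix} \]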
Part C
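To locate the stationary points, we equate the partial derivatives with respect to x and y to zero,
that is, $\partial f/\partial x = 0$ and $\partial f/\partial y = 0$, with $y \le 4$.
Now
\[ 3x\left(x^2 + y^2\right)^{1/2} - 12x = 0 \]
so $x = 0$ or $\left(x^2 + y^2\right)^{1/2} = 4$.
Substituting $\left(x^2 + y^2\right)^{1/2} = 4$ into $\partial f/\partial y = 0$ gives
\[ 3y\left(x^2 + y^2\right)^{1/2} - 12y + 9 = 0 \;\Longrightarrow\; 12y - 12y + 9 = 0 \]
which is impossible, so $x = 0$ and $\left(x^2 + y^2\right)^{1/2} \ne 4$.
With $x = 0$ and taking $y > 0$, so that $\left(x^2 + y^2\right)^{1/2} = y$:
\[ 3y(y) - 12y + 9 = 0 \]
Dividing by the common factor 3, we obtain
\[ y^2 - 4y + 3 = 0 \]
Solving this yields
\[ (y - 1)(y - 3) = 0 \]
(For $y < 0$ the square root equals $-y$ and the equation becomes $y^2 + 4y - 3 = 0$; the solution here follows the $y > 0$ branch.)
Now the stationary points will be (0, 1) and (0, 3).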
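The two points found above can be checked by substituting them back into the gradient. The sketch below is only a verification aid; `grad_f` is an illustrative name for the Part a result.

```python
import math

def grad_f(x, y):
    """Gradient of f from Part a: (3x*r - 12x, 3y*r - 12y + 9), r = sqrt(x^2 + y^2)."""
    r = math.sqrt(x * x + y * y)
    return (3 * x * r - 12 * x, 3 * y * r - 12 * y + 9)

# Both components should vanish at a stationary point.
for point in [(0, 1), (0, 3)]:
    print(point, grad_f(*point))
```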
Part d
At what values of x, y is the network performance maximized?
We apply the second-derivative test at the stationary point (0, 1).
\[ \frac{\partial^2 f}{\partial x^2} = 3\left(x^2 + y^2\right)^{1/2} + 3x^2\left(x^2 + y^2\right)^{-1/2} - 12 \]
But x = 0, y = 1.
Substituting these values:
\[ \frac{\partial^2 f}{\partial x^2}(0, 1) = 3(1)^{1/2} + 3(0)(1)^{-1/2} - 12 = 3 - 12 = -9 < 0 \]
\[ \frac{\partial^2 f}{\partial y^2} = 3\left(x^2 + y^2\right)^{1/2} + 3y^2\left(x^2 + y^2\right)^{-1/2} - 12 \]
But x = 0, y = 1:
\[ \frac{\partial^2 f}{\partial y^2}(0, 1) = 3(1)^{1/2} + 3(1)(1)^{-1/2} - 12 \]

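\[ = 6 - 12 = -6 < 0 \]
Both second derivatives are negative and the mixed derivative $3xy\left(x^2 + y^2\right)^{-1/2}$ vanishes at x = 0, so (0, 1) is a local maximum. Therefore, the network performance is maximized at (x, y) = (0, 1).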
Part e
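At what values of x, y is the network performance minimised?
Solution.
We apply the second-derivative test at the remaining stationary point (0, 3).
\[ \frac{\partial^2 f}{\partial x^2} = 3\left(x^2 + y^2\right)^{1/2} + 3x^2\left(x^2 + y^2\right)^{-1/2} - 12 \]
But x = 0, y = 3:
\[ \frac{\partial^2 f}{\partial x^2}(0, 3) = 3(9)^{1/2} + 0 - 12 = -3 < 0 \]
\[ \frac{\partial^2 f}{\partial y^2} = 3\left(x^2 + y^2\right)^{1/2} + 3y^2\left(x^2 + y^2\right)^{-1/2} - 12 \]
\[ \frac{\partial^2 f}{\partial y^2}(0, 3) = 3(9)^{1/2} + 3(9)(9)^{-1/2} - 12 = 9 + 9 - 12 = 6 > 0 \]
Now
\[ \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x} = 3xy\left(x^2 + y^2\right)^{-1/2} = 3(0)(3)(0 + 9)^{-1/2} = 0 \]
Now
\[ \frac{\partial^2 f}{\partial x^2}\cdot\frac{\partial^2 f}{\partial y^2} - \left(\frac{\partial^2 f}{\partial x\,\partial y}\right)^2 = (-3)(6) - 0^2 = -18 < 0 \]
From the above findings, we can conclude that (0, 3) is a saddle point: it gives neither the minimum nor the maximum network performance.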
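The second-derivative test used in Parts d and e can be sketched in a few lines. This is a minimal illustration that assumes the Hessian entries from Part b; the names `hessian` and `classify` are hypothetical helpers, not part of the assignment.

```python
import math

def hessian(x, y):
    """Second partials (f_xx, f_xy, f_yy) from Part b; not defined at (0, 0)."""
    r = math.sqrt(x * x + y * y)
    f_xx = 3 * r + 3 * x * x / r - 12
    f_xy = 3 * x * y / r
    f_yy = 3 * r + 3 * y * y / r - 12
    return f_xx, f_xy, f_yy

def classify(x, y):
    """Second-derivative test: sign of D = f_xx * f_yy - f_xy^2, then sign of f_xx."""
    f_xx, f_xy, f_yy = hessian(x, y)
    d = f_xx * f_yy - f_xy ** 2
    if d < 0:
        return "saddle point"
    if d > 0:
        return "local maximum" if f_xx < 0 else "local minimum"
    return "inconclusive"

print(classify(0, 1))  # local maximum
print(classify(0, 3))  # saddle point
```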
Part f
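In which direction is the function f decreasing most rapidly when (x, y) = (3/4, 1)?
Solution
\[ f(x, y) = \left(x^2 + y^2\right)^{3/2} - 6\left(x^2 + y^2\right) + 9y \]
The gradient of f is
\[ \nabla f(x, y) = \left(3x\left(x^2 + y^2\right)^{1/2} - 12x\right)\mathbf{i} + \left(3y\left(x^2 + y^2\right)^{1/2} - 12y + 9\right)\mathbf{j} \]
The direction in which f decreases most rapidly is found using the formula
\[ \mathbf{u} = -\nabla f\left(\tfrac{3}{4}, 1\right) \]
At this point $\left(x^2 + y^2\right)^{1/2} = \left(\tfrac{9}{16} + 1\right)^{1/2} = \tfrac{5}{4}$, so
\[ \nabla f\left(\tfrac{3}{4}, 1\right) = \left(\tfrac{9}{4}\cdot\tfrac{5}{4} - 9\right)\mathbf{i} + \left(3\cdot\tfrac{5}{4} - 12 + 9\right)\mathbf{j} = -\tfrac{99}{16}\,\mathbf{i} + \tfrac{3}{4}\,\mathbf{j} \]
Therefore
\[ \mathbf{u} = \tfrac{99}{16}\,\mathbf{i} - \tfrac{3}{4}\,\mathbf{j} \]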
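The arithmetic at (3/4, 1) can be reproduced exactly with rational numbers. The sketch below assumes the gradient from Part a and uses the standard fact that the direction of most rapid decrease is the negative gradient; the variable names are illustrative.

```python
from fractions import Fraction
import math

x, y = Fraction(3, 4), Fraction(1)
r = Fraction(5, 4)  # sqrt(x^2 + y^2) = sqrt(25/16), exact at this point
assert r * r == x * x + y * y

grad = (3 * x * r - 12 * x, 3 * y * r - 12 * y + 9)
descent = (-grad[0], -grad[1])  # direction of most rapid decrease

# Normalise to a unit vector (floating point from here on).
norm = math.sqrt(float(descent[0] ** 2 + descent[1] ** 2))
unit = (float(descent[0]) / norm, float(descent[1]) / norm)

print(grad)     # (Fraction(-99, 16), Fraction(3, 4))
print(descent)  # (Fraction(99, 16), Fraction(-3, 4))
print(unit)
```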
Problem 2
Event   Location   Pr
A       X1≥3       0.273
B       Y1≥2       0.182
C       X2=3       0.273
D       X1≥3       0.273
∑Pr = 1.000
Part a
Pr(X1,Y1)= (x,y)

Pr(X1) = 0.273
Pr(Y1) = 0.182
Part b
Pr ( A )=0.273
Pr ( B ) =0.182
Part c
Treating A and B as independent,
Pr(A ∩ B) = Pr(A) · Pr(B)
= 0.273 × 0.182
= 0.0497
Pr(A ∪ B) = Pr(A) + Pr(B) − Pr(A ∩ B)
But Pr(A) = 0.273, Pr(B) = 0.182, Pr(A ∩ B) = 0.0497
= (0.273 + 0.182) − 0.0497
= 0.455 − 0.0497
= 0.4053
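The two calculations in Part c are easy to reproduce. The sketch below assumes, as the solution does, that A and B are independent so that the product rule applies; the variable names are illustrative.

```python
p_a = 0.273
p_b = 0.182

# Product rule: valid only under the independence assumption.
p_a_and_b = p_a * p_b

# Inclusion-exclusion for the union.
p_a_or_b = p_a + p_b - p_a_and_b

print(round(p_a_and_b, 4), round(p_a_or_b, 4))  # 0.0497 0.4053
```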
Part e
The events A and B are independent.
Part f
Pr(C | A ∩ B)
This is the probability of event C happening given that the event A ∩ B has occurred.
This will be just Pr(C) = 0.273
Part g
Pr(D | C).
The probability of D occurring given that event C has occurred
is simply the probability of D:

Pr (D)=0.273
Part h
Pr(D ∩ C) = Pr(D) · Pr(C)
= 0.273 × 0.273 = 0.0745
The events D and C are not independent since the
multiplication of the individual probabilities changes the
conditional probability.
Problem 3
Part a
How many mixed dozens are possible with these constraints?
Soln
From the question we can see that each dozen contains 2 red and 2 white varieties.
Now for these 4 bottles we will have 3 × 3 × 2 × 2 ways,
while for each of the other 8 bottles we have 5 choices.
Then the total number of choices will be
\[ 5^8 \times 3^2 \times 2^2 = 14{,}062{,}500 \text{ choices} \]
Part b
For 5 bottles we will have only one choice for each bottle,
and for each of the other 7 bottles we will have 5 choices.
Now our total number of choices will be
\[ 5^7 = 78{,}125 \text{ choices} \]
Part c
For one Merlot bottle, we will have only one choice.
Now since this is red, for another bottle to be red we
will have only two choices, and for the 2 bottles to be white
we again have two choices each; for each of the other 8 bottles we
have 4 choices.
Hence, the total number of choices will be
\[ 1 \times 2 \times 2 \times 2 \times 4^8 = 2^3 \times 4^8 = 2^{19} = 524{,}288 \text{ choices} \]
When we have no Merlot bottle,
for the 2 red bottles we have two choices each, and for the 2
white bottles we have two choices each.
Then the total number of choices will be
\[ 2^2 \times 2^2 \times 4^8 = 2^{20} \]
So the total for both cases will be
\[ 2^{19} + 2^{20} = 2^{19}(1 + 2) = 3 \times 2^{19} = 1{,}572{,}864 \text{ choices} \]
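Each count in Problem 3 is a product of per-bottle choices, so the totals can be evaluated directly. The sketch below only evaluates the products stated in the solution; it does not independently verify the counting argument, and the variable names are illustrative.

```python
# Part a: 3 * 3 * 2 * 2 ways for the constrained bottles, 5 choices for each of 8 others.
part_a = 3 * 3 * 2 * 2 * 5 ** 8

# Part b: 5 fixed bottles (one choice each), 5 choices for each of the other 7.
part_b = 5 ** 7

# Part c: case with one Merlot bottle plus case with no Merlot bottle.
with_merlot = 1 * 2 * 2 * 2 * 4 ** 8
without_merlot = 2 ** 2 * 2 ** 2 * 4 ** 8
part_c = with_merlot + without_merlot

print(part_a, part_b, with_merlot, without_merlot, part_c)
```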
Problem 4
Part a
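Given that
\[ f(w) = \frac{3z}{8x}\,w^2 + \frac{z\left(x^2 + y^2\right)}{2x}\,w, \qquad 0 < w < 2 \]
Then for a valid PDF
\[ \int_0^2 f(w)\,dw = 1 \]
\[ \int_0^2 \left(\frac{3z}{8x}\,w^2 + \frac{z\left(x^2 + y^2\right)}{2x}\,w\right)dw = 1 \]
\[ \left[\frac{z}{8x}\,w^3 + \frac{z\left(x^2 + y^2\right)}{2x}\cdot\frac{w^2}{2}\right]_0^2 = 1 \]
\[ \frac{z}{8x}(2)^3 + \frac{z\left(x^2 + y^2\right)}{2x}\cdot\frac{(2)^2}{2} = 1 \]
\[ \frac{z}{x} + \frac{z\left(x^2 + y^2\right)}{x} = 1 \]
Simplifying the above,
\[ \frac{z}{x}\left(1 + x^2 + y^2\right) = 1 \]
\[ z\left(1 + x^2 + y^2\right) = x \]
Making z the subject,
\[ z = \frac{x}{1 + x^2 + y^2} \]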
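The normalisation condition can be checked numerically, independently of the antiderivative. The sketch below uses a simple midpoint rule; the helper names (`pdf`, `integrate`, `normalising_z`) and the sample (x, y) values are illustrative assumptions.

```python
def pdf(w, x, y, z):
    """f(w) = 3z/(8x) * w^2 + z(x^2 + y^2)/(2x) * w on 0 < w < 2."""
    return 3 * z / (8 * x) * w ** 2 + z * (x ** 2 + y ** 2) / (2 * x) * w

def integrate(g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

def normalising_z(x, y):
    """Value of z derived in Part a: z = x / (1 + x^2 + y^2)."""
    return x / (1 + x ** 2 + y ** 2)

totals = []
for x, y in [(1.0, 0.0), (2.0, 3.0), (0.5, 1.5)]:
    z = normalising_z(x, y)
    total = integrate(lambda w: pdf(w, x, y, z), 0.0, 2.0)
    totals.append(total)
    print(x, y, round(total, 6))
```

Each printed total should be 1 to within the accuracy of the midpoint rule.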
Part b
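For the maximum value of z, set both partial derivatives to zero:
\[ \frac{\partial z}{\partial x} = \frac{\partial z}{\partial y} = 0 \]
\[ \frac{\partial z}{\partial x} = \frac{1}{1 + x^2 + y^2} - \frac{x(2x)}{\left(1 + x^2 + y^2\right)^2} = 0 \]
\[ \frac{1}{1 + x^2 + y^2} = \frac{2x^2}{\left(1 + x^2 + y^2\right)^2} \]
Multiplying both sides by $\left(1 + x^2 + y^2\right)^2$, the above becomes
\[ 2x^2 = 1 + x^2 + y^2 \]
\[ 2x^2 - x^2 - 1 = y^2 \qquad \text{(Eqn a)} \]
Next we need
\[ \frac{\partial z}{\partial y} = 0 \]
Now differentiating z with respect to y we obtain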

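\[ \frac{\partial z}{\partial y} = \frac{-x}{\left(1 + x^2 + y^2\right)^2}\cdot 2y = 0 \]
so x = 0 or y = 0. Taking x = 0 in Eqn a gives $y^2 = -1$, which has no real solution, so y = 0.
With y = 0, Eqn a gives $x^2 = 1$, so $x = \pm 1$.
For x = 1, y = 0:
\[ z = \frac{x}{1 + x^2 + y^2} = \frac{1}{2} \]
while x = −1, y = 0 gives $z = -\tfrac{1}{2}$, the minimum.
Hence the maximum occurs at x = 1, y = 0, with $z = \tfrac{1}{2}$.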
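A coarse grid search offers an independent check on where z = x / (1 + x² + y²) is largest. The grid range [−3, 3] and spacing 0.01 are arbitrary illustrative choices; this is a numerical sketch, not a proof.

```python
def z(x, y):
    """z as a function of x and y, from Part a."""
    return x / (1 + x * x + y * y)

# Grid of points from -3.00 to 3.00 in steps of 0.01.
grid = [i / 100 for i in range(-300, 301)]

# Tuples compare by their first element first, so max() picks the largest z.
best = max((z(x, y), x, y) for x in grid for y in grid)
print(best)  # (0.5, 1.0, 0.0)
```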
Part c
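Now that x = 1, y = 0 and $z = \tfrac{1}{2}$,
\[ f(w) = \frac{3}{16}\,w^2 + \frac{w}{4}, \qquad 0 < w < 2 \]
\[ \int_0^2 f(w)\,dw = \int_0^2 \left(\frac{3}{16}\,w^2 + \frac{w}{4}\right)dw = \left[\frac{1}{16}\,w^3 + \frac{w^2}{8}\right]_0^2 = \frac{(2)^3}{16} + \frac{(2)^2}{8} = \frac{1}{2} + \frac{1}{2} = 1 \]
Since $\int_0^2 f(w)\,dw = 1$ and $f(w) \ge 0$ on the interval,
therefore $f(w) = \frac{3}{16}\,w^2 + \frac{w}{4}$ is a PDF.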
Part d
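The probability that a scarlet butterfly lives at least 1 week is $P(W \ge 1) = 1 - P(W \le 1)$.
First,
\[ P(W \le 1) = \int_0^1 f(w)\,dw = \int_0^1 \left(\frac{3}{16}\,w^2 + \frac{w}{4}\right)dw = \frac{1}{16} + \frac{1}{8} = \frac{3}{16} \]
Therefore $P(W \le 1) = \tfrac{3}{16}$, and
\[ P(W \ge 1) = 1 - \frac{3}{16} = \frac{13}{16} \]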
Part e
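Expected life
\[ E[W] = \int_0^2 w\,f(w)\,dw = \int_0^2 w\left(\frac{3}{16}\,w^2 + \frac{w}{4}\right)dw = \int_0^2 \left(\frac{3}{16}\,w^3 + \frac{w^2}{4}\right)dw \]
\[ = \left[\frac{3}{64}\,w^4 + \frac{w^3}{12}\right]_0^2 = \frac{3}{64}(2)^4 + \frac{(2)^3}{12} = \frac{3}{4} + \frac{2}{3} = \frac{17}{12} \text{ weeks} \]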
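The integrals in Parts c to e are all integrals of polynomials, so they can be reproduced exactly with rational arithmetic. The sketch below stores f(w) = (3/16)w² + (1/4)w as a power-to-coefficient mapping; `integrate_poly` is an illustrative helper built on the power rule.

```python
from fractions import Fraction

# f(w) = (3/16) w^2 + (1/4) w on 0 < w < 2, stored as {power: coefficient}.
pdf = {2: Fraction(3, 16), 1: Fraction(1, 4)}

def integrate_poly(poly, a, b):
    """Exact integral of sum(c * w**k) from a to b using the power rule."""
    return sum(
        c * (Fraction(b) ** (k + 1) - Fraction(a) ** (k + 1)) / (k + 1)
        for k, c in poly.items()
    )

total = integrate_poly(pdf, 0, 2)            # normalisation check
p_le_1 = integrate_poly(pdf, 0, 1)           # P(W <= 1)
w_pdf = {k + 1: c for k, c in pdf.items()}   # coefficients of w * f(w)
mean = integrate_poly(w_pdf, 0, 2)           # expected life in weeks

print(total, p_le_1, 1 - p_le_1, mean)  # 1 3/16 13/16 17/12
```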
Problem 5
Part a
We have 1 upper case letter, 2 digits and 5 lower case letters