Assessment of FluffyGroCo's Briefing Note on Data Science Concepts

   

Added on  2022-11-10

Data science
1
<University>
Principles of data science for business
<Author>
10 November 2022
<Professor’s name>
<Program of Study>

2.2.1 Report Section 1: Assessment of FluffyGroCo’s briefing note
The FluffyGroCo Briefing Note does not apply data science concepts to the challenges it
anticipates. In particular, it offers no statistical support for the assumption that the return
of the Truffula trees has brought clean air back to Thneedville. This limits how far the note
can support effective decision-making.
Furthermore, the note does not show the risk status of stunting and infestation for the
different plantations, although this would have helped in making predictions. It is covered
instead in the analysis section.
The note does not use data science algorithms such as linear regression, so it is difficult to
validate whether the conclusions proposed by the company are accurate and reliable.
Although the note discusses methods in its conclusions, these are limited to predicting
Truffula tree infestation. The dataset contains other types of fields, such as Uptagoo, yet the
concept note shows no results for them. This limits how accurately results can be depicted
across a large dataset. Implemented correctly within a larger data science system, these basic
techniques would also give accurate results (Abadi et al., 2016).
The concept note does not identify how the company plans to integrate the databases, algorithm
settings, and Crackety Crickling infestation logic, or how it will develop effective policies
to address the high risk of infestation within the fields or plantations. This is likely to
lead to the project’s failure.
Linear regression is one of the best-known algorithms used for analysis in data science
(McCullagh, 2019), but the concept note does not use it. If FluffyGroCo is interested in the
trend of the risk of stunting and infestation across its plantations, linear regression would
be an ideal way to examine how the high risk of stunting and infestation changes over a period
for each plantation.
A linear regression starts from a proposed model that predicts the risk of stunting and
infestation among the different plantations. Big data offers not only tools for building such
a model but also opportunities to explain the reasons behind the observed differences
(Mead, 2017).
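For reference, the proposed model can be written in the same notation as the regression output in the analysis section (Y = b0 + b1x). The estimates below are the standard least-squares formulas, not something taken from the briefing note. Treating x_i as a plantation or time index and y_i as the observed count of high-risk cases is an assumption about how the variables are coded.

```latex
\begin{aligned}
\hat{y}_i &= b_0 + b_1 x_i, \\
b_1 &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \\
b_0 &= \bar{y} - b_1 \bar{x}, \\
R^2 &= 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}.
\end{aligned}
```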
In the concept note, the company states its belief that Crackety Crickling larvae growth is
stunted in such a way that, once the larvae mature into adults about 14 days later, they are
unable to produce fully developed skin due to infestation. A detailed analysis may require
several variables and a more complex dataset, not only Crackety Crickling larvae growth and
stunting. Although the note indicates evidence of association, the company has not validated
its results with a goodness-of-fit test (Engel, Bryan, Noonan, and Whitehurst, 2018).
2.2.2 Report Section 2: Overview of investigation
FluffyGroCo’s dataset is quantitative in nature, so it can readily be prepared, explored,
analyzed, presented, and validated based on its variables. This type of dataset can be
analyzed using both descriptive and inferential statistics (Mertler and Reinhart, 2016).
Descriptive statistics describe and summarize the data in the form of frequencies,
percentages, and means. Inferential statistics, on the other hand, are used to make inferences
and draw conclusions about the dataset (Makar and Rubin, 2009). Statistical measures and
tests, including variances, standard deviations, chi-square tests, and linear regression, are
also used to test the hypothesized statements. All tests of significance are computed at
α = 0.05.
Setting alpha at 0.05, with a corresponding confidence level of 95%, is appropriate given that
this is a social science study. It also gives a conventional threshold for judging whether
results are statistically significant (Goldstein, 2011). A hypothesis test on the proportion of
the risk of stunting and infestation from the plantations was considered in the analysis, so
the analysis reports the proportions of the risk of stunting and infestation shown by the
sampled dataset. Following the company’s approach, and keeping the observations and tests from
the company’s plantations, the risk of stunting was predicted for all the fields using a
deterministic rules-based approach:
IF (rainy AND temperature >= 15) OR (NOT rainy AND temperature >= 22):
.......> High risk of stunting and infestation
ELSE:
.......> Low risk of stunting and infestation
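The rule above translates directly into a small function. This is a minimal sketch: the function name `stunting_risk` and the sample observations are illustrative assumptions, and the temperature is taken in whatever unit the dataset uses.

```python
def stunting_risk(rainy: bool, temperature: float) -> str:
    """Classify one field observation as "High" or "Low" risk.

    Mirrors the rule above: high risk when it is rainy and the temperature
    is at least 15, or when it is not rainy and the temperature is at least 22.
    """
    if (rainy and temperature >= 15) or (not rainy and temperature >= 22):
        return "High"
    return "Low"


if __name__ == "__main__":
    # Hypothetical observations: (rainy, temperature)
    for rainy, temperature in [(True, 15), (True, 14), (False, 21), (False, 22)]:
        print(rainy, temperature, stunting_risk(rainy, temperature))
```

Because the rule is deterministic, the same inputs always give the same label, which makes it easy to check against the spreadsheet results.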
In addition, the Excel function COUNTIFS was used to generate frequencies of the variable
Type. This also helped in generating the proportions of the fields at each level of risk of
stunting and infestation.
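The same frequency count can be sketched outside Excel. The rows below are hypothetical stand-ins for the dataset, and pairing the Type variable with a "High"/"Low" risk label is an assumption about how the sheet is laid out.

```python
from collections import Counter

# Hypothetical rows: (Type, risk label). The real dataset is not reproduced here.
rows = [
    ("Uptagoo", "High"),
    ("Uptagoo", "Low"),
    ("Nextafoo", "High"),
    ("Rondadoo", "High"),
    ("Rondadoo", "Low"),
    ("Uptagoo", "High"),
]


def count_by_type_and_risk(records):
    """Count rows for each (Type, risk) pair, like COUNTIFS with two criteria."""
    return Counter(records)


def proportions(counts):
    """Convert counts to proportions of the total number of rows."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()}


counts = count_by_type_and_risk(rows)
shares = proportions(counts)
for (field_type, risk), n in sorted(counts.items()):
    print(f"{field_type:<9} {risk:<4} {n:>3} {shares[(field_type, risk)]:.2%}")
```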
2.2.3 Report Section 3: Analysis and results
The calculated mean temperature is 20.226, with a standard deviation of 6.966. The maximum
and minimum temperatures are 40 and -1 respectively. The median temperature is 20, while the
first and third quartiles stand at 15 and 25 respectively.
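These summary statistics can be reproduced with Python's standard library. The readings below are hypothetical, so the output will not match the report's figures. Two further assumptions: the sample standard deviation is used, and the default quartile method of `statistics.quantiles` may differ slightly from the one Excel applies.

```python
import statistics

# Hypothetical temperature readings; the report's dataset is not reproduced here.
temperatures = [-1, 15, 18, 20, 22, 25, 40]


def describe(values):
    """Return the summary statistics reported in the text for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "q1": q1,
        "median": statistics.median(values),
        "q3": q3,
        "max": max(values),
    }


summary = describe(temperatures)
for name, value in summary.items():
    print(f"{name:<6} {value:.3f}")
```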
From the graph below, the Uptagoo field has the most high-risk cases of stunting and
infestation, 1019 (24.61%), compared with the Nextafoo and Rondadoo fields, which account for
978 (23.62%) and 974 (23.53%) high-risk cases respectively. Overall, most cases across the
fields are at high risk of stunting and infestation: 2971 (71.76%), compared with 1169
(28.24%) at low risk.
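Since the report works at α = 0.05, the overall high-risk share can be given a 95% confidence interval. This is a sketch using the normal-approximation (Wald) interval, and it assumes the 4140 cases can be treated as independent observations, which the report does not state.

```python
from math import sqrt
from statistics import NormalDist

HIGH_RISK = 2971  # high-risk cases reported in the text
TOTAL = 4140      # all cases: 2971 high risk + 1169 low risk
ALPHA = 0.05      # significance level used throughout the report


def proportion_interval(successes, n, alpha):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin


p_hat, lower, upper = proportion_interval(HIGH_RISK, TOTAL, ALPHA)
print(f"high-risk share {p_hat:.4f}, 95% CI ({lower:.4f}, {upper:.4f})")
```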
In addition, the Rondadoo field has the fewest low-risk cases of stunting and infestation, 382
(9.23%), followed by the Nextafoo field at 387 (9.35%), while the Uptagoo field has 400
(9.66%) low-risk cases. These results suggest that the risk of stunting and infestation varies
little between the Nextafoo and Rondadoo fields.
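Whether risk really differs between fields can be checked with a chi-square test of independence on the counts above. The sketch reads the second set of counts (400, 387, 382) as low-risk cases, since they sum to the 1169 low-risk total. The critical value 5.991 is the standard chi-square table value for 2 degrees of freedom at α = 0.05.

```python
# Counts reported in the text: field -> (high-risk cases, low-risk cases)
observed = {
    "Uptagoo": (1019, 400),
    "Nextafoo": (978, 387),
    "Rondadoo": (974, 382),
}

# Chi-square critical value for df = (3 - 1) * (2 - 1) = 2 at alpha = 0.05
CRITICAL_VALUE_DF2 = 5.991


def chi_square_independence(table):
    """Pearson chi-square statistic for a field-by-risk contingency table."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(rows, row_totals):
        for count, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            statistic += (count - expected) ** 2 / expected
    return statistic


chi2 = chi_square_independence(observed)
print(f"chi-square = {chi2:.4f}")
if chi2 > CRITICAL_VALUE_DF2:
    print("risk level appears to depend on the field")
else:
    print("no evidence that risk level differs between fields")
```

With these counts the statistic falls far below the critical value, which is consistent with the observation that the fields differ very little in their mix of high and low risk.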

Population size, n: 4140
Correlation Results:
Correlation coeff, r: 0.981618629
Regression Results:
Y= b0 + b1x:
Y Intercept, b0: 949.33
Slope, b1: 20.5
Total Variation: 100%
Explained Variation: 82.31%
Unexplained Variation: 17.69%
Coeff of Det, R^2: 0.6775
The histogram and the correlation coefficient provide strong evidence of an almost perfect
positive correlation between fields and the risk of stunting and infestation. The trendline
slopes upward, and the correlation coefficient is +0.981618629. As FluffyGroCo continues to
work with the Uptagoo field, it is therefore likely to see a high risk of stunting and
infestation. FluffyGroCo should therefore develop treatment strategies targeting the Uptagoo
field to prevent increased insect infestation.
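The regression output listed above does not say how the variables were coded. One reading that reproduces the reported intercept (949.33) and slope (20.5) is a regression of the three high-risk counts on a field index of 1, 2, 3 in the order Nextafoo, Rondadoo, Uptagoo. This coding is an assumption inferred from the numbers, not something the report states, and a fit through three points should be read with caution.

```python
from math import sqrt

# Assumed coding: fields numbered 1..3 in this order; y = reported high-risk counts.
fields = ["Nextafoo", "Rondadoo", "Uptagoo"]
x = [1, 2, 3]
y = [978, 974, 1019]


def simple_linear_regression(xs, ys):
    """Least-squares intercept, slope, correlation r and R^2 for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    s_xx = sum((a - x_bar) ** 2 for a in xs)
    s_yy = sum((b - y_bar) ** 2 for b in ys)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    r = s_xy / sqrt(s_xx * s_yy)
    return intercept, slope, r, r ** 2


b0, b1, r, r_squared = simple_linear_regression(x, y)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, r = {r:.4f}, R^2 = {r_squared:.4f}")
```

Under this coding, r comes out near 0.823 and R^2 near 0.6775. That matches the listed coefficient of determination but not the listed correlation coefficient, so the latter may be worth rechecking against the original spreadsheet.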
The results further show the trend of the low risk of stunting and infestation over time. This
is presented in the graph below.
