Real Time Analytics

Added on  2023/01/12

AI Summary
This document provides an overview of real time analytics, including data modeling, data mining, and research in the field. It explores the concept of real time analytics and its applications in various industries. The document also discusses the process of data modeling and provisioning, as well as the different techniques used in data mining. It concludes with a section on research and recommendations for CEOs.

Real time analytics

Contents
I. Introduction
II. Data Modelling and Provisioning
III. Data Mining
IV. Research
V. Recommendations for CEO
VI. Cover Letter
VII. Conclusion
VIII. References
I. Introduction
This report was prepared for NCLT Corporation, Balmoral. The real-time analytics was carried
out with the SAP HANA web-based tool. The dataset describes unintentional poisoning that
leads to death. The United States was heavily affected by illegal drug activity from 2000 to
2016, the years in which deaths increased. According to WHO data, an estimated 193,460 people
died worldwide from unintentional poisoning in 2012, causing the loss of over 10.7 million
years of healthy life. Separately, deliberate ingestion of pesticides is estimated to cause
370,000 deaths each year. The number of these deaths can be reduced by limiting the
availability of, and access to, highly toxic pesticides.
The SAP HANA Web-based development toolset is used for the data analytics. It serves as a
statistical tool for analysis and for finding connections in the data. Data preparation and
data modeling are both done with the SAP tool. Data preparation consists of combining,
structuring and organizing the data, most often as a single database table or statistical
dataset. It defines the nature and attributes of the fields in the dataset table (Fabiane, 2012).
The main attributes of the dataset are its fields; a field describes the type of data stored in
the table. The fields cover the country, the year, and the values for females, males and both
sexes. The report covers the research, the observations made, the data mining and the findings.
II. Data Modelling and Provisioning
The data modeling process generally consists of data cleansing and data organizing. The first
step is to prepare the dataset for analysis (M., M., K. & S., 2018): invalid rows and columns
are removed. SAP HANA studio allows users to carry out this activity. The step-by-step
procedure for data modeling and provisioning is described in the section below.
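The cleansing step can be illustrated outside SAP HANA with a minimal Python sketch. The column names and sample rows are hypothetical stand-ins, not the actual SDGPOISON.csv layout, and the rule for an "invalid" row (any rate field that is empty, non-numeric or negative) is an assumption made for illustration.

```python
import csv
import io

# Hypothetical sample; the real SDGPOISON.csv layout may differ.
SAMPLE = """Country,2016_both,2016_male,2016_female
A,1.2,1.5,0.9
B,,2.0,1.1
C,0.4,n/a,0.3
D,3.1,3.9,2.2
"""

# Assumed names of the numeric columns that must be present and valid.
RATE_COLUMNS = ["2016_both", "2016_male", "2016_female"]


def is_valid(row):
    """A row is valid when every rate column parses as a non-negative number."""
    for name in RATE_COLUMNS:
        try:
            if float(row[name]) < 0:
                return False
        except (TypeError, ValueError):
            return False
    return True


def clean_rows(text):
    """Return only the valid rows of the CSV text as dictionaries."""
    reader = csv.DictReader(io.StringIO(text))
    return [row for row in reader if is_valid(row)]


cleaned = clean_rows(SAMPLE)
print([row["Country"] for row in cleaned])  # prints ['A', 'D']
```

Rows B and C are dropped because one of their rate fields is empty or not a number; the remaining rows are what would be passed on to modeling.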
Package creation in SAP HANA studio
A package (referred to here as a collection) is an information container. It holds various
details about the model, such as the analysis type, the collection view and the attribute view.
SAP HANA is an online data analytics tool ("Methadone causes half of unintentional
drug poisoning deaths in young children", 2016).
The first step is to open the SAP HANA web application and create the SDGPOISON
package. To do so, select the HANA system. This must be done before opening the content
folder.
The create package dialog box now opens. Enter “SDGPOISON” as the package name, then
fill in the description field as in the attached screenshot. The language is also selected at
this stage (Pattanayak, 2017).
Import dataset
Select the import option and choose the file to import. Import the dataset SDGPOISON.csv
into the SDGPOISON package created earlier, then click the import button. The details of the
sdgpoison.csv dataset are then shown in the SAP HANA web-based development workbench.
The SDGPOISON dataset details are shown in the above picture.
III. Data Mining
The steps followed in the data mining process are explained in this section. Cluster analysis
is the method best suited to our needs. Cluster analysis is the process of grouping the values
in a dataset: records that are related or similar are grouped into a cluster. Here it is
conducted mainly to study the death rate across regions (Rofiqo, Windarto & Hartama, 2018).
Data analysis is further classified into two types, listed below.
Classification
Prediction
Cluster data mining techniques
First, import the SDGPOISON.csv file into the SAP HANA web-based development workbench.
Analyze the dataset using the cluster techniques in data mining.
Select the target variables for analyzing the dataset.
Algorithm
A dataset can be partitioned into as many clusters as the chosen cluster model allows. This
section covers the most influential models and briefly describes the advantages and
disadvantages of each. The choice of algorithm always depends on the nature of the dataset and
the purpose of the analysis.
Centroid-based
In this grouping method, each cluster is represented by a vector of values (its centroid).
Objects whose values differ least from that vector become part of the cluster. The main problem
with this kind of algorithm is that the number of clusters must be determined in advance. These
methods are close to classification and are commonly used in optimization problems.
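A centroid-based method can be sketched with a small k-means loop (Lloyd's algorithm). This is an illustrative example only, not the algorithm SAP HANA runs internally; the one-dimensional rates and the starting centroids are hypothetical, and the number of clusters is fixed in advance by the length of the starting list.

```python
def k_means(values, centroids, max_iter=100):
    """Lloyd's algorithm on 1-D data with caller-supplied starting centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each value joins the cluster with the nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        updated = [
            sum(members) / len(members) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return centroids, clusters


# Hypothetical death rates, grouped into three clusters.
rates = [0.5, 1.0, 1.5, 5.0, 5.5, 6.0, 9.0, 9.5]
centroids, clusters = k_means(rates, [0.0, 5.0, 10.0])
print(centroids)  # prints [1.0, 5.5, 9.25]
```

The result depends on the starting centroids, which is one reason the cluster count and initialization have to be chosen before the run.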
Distribution-based
Objects that follow the same distribution, taken from pre-defined statistical models, are
linked together. The process is well understood, and a complex model can mirror the way real
data is generated, namely as random values drawn from a distribution. This approach can give
excellent results and also captures correlations and dependencies.
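The idea of grouping objects by the distribution they most likely came from can be sketched as follows. The two Gaussian components (weight, mean, standard deviation) are assumed values chosen for illustration, not parameters fitted to the SDGPOISON data; a full distribution-based method would also estimate them, for example with expectation-maximization.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def assign(values, components):
    """Label each value with the index of the most likely component.

    components is a list of (weight, mean, sd) tuples.
    """
    labels = []
    for x in values:
        scores = [w * gaussian_pdf(x, m, s) for (w, m, s) in components]
        labels.append(scores.index(max(scores)))
    return labels


# Hypothetical components: a large low-rate group and a smaller high-rate group.
components = [(0.7, 1.0, 0.8), (0.3, 6.0, 1.5)]
values = [0.2, 1.4, 2.1, 5.0, 7.3]
print(assign(values, components))  # prints [0, 0, 0, 1, 1]
```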
Connectivity-based
In this algorithm every object is linked to its neighboring objects, depending on the degree
of relationship, that is, the distance between two objects. Clusters are built on the
assumption that nearby objects are more related than distant ones, up to a maximum distance
limit. This model gives a hierarchical representation of the relationships between the
members. The distance function used varies with the focus of the analysis.
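A connectivity-based method can be sketched as single-linkage agglomerative clustering with a distance limit. The points and the limit below are hypothetical, and single linkage (the distance between two clusters is the distance between their closest members) is just one possible choice of distance function.

```python
def single_linkage(points, max_distance):
    """Merge the two closest clusters until the closest pair exceeds max_distance."""
    clusters = [[p] for p in points]  # start with every point in its own cluster
    while len(clusters) > 1:
        best = None  # (distance, index_a, index_b) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_distance:
            break  # remaining clusters are too far apart to link
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical 1-D values with a linking limit of 1.0.
points = [1.0, 1.5, 2.0, 6.0, 6.5, 12.0]
print(single_linkage(points, 1.0))  # prints [[1.0, 1.5, 2.0], [6.0, 6.5], [12.0]]
```

Recording the order of the merges would give the hierarchical representation mentioned above.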
Applications of Cluster Analysis
Cluster analysis is a valuable data analysis technique with many applications in science. It
produces results for well-defined data types and can be applied to any large set of data.
Image processing is one of the main applications: clustering finds patterns in image data. It
is used in research related to biology, to differentiate objects and to identify patterns, and
in the classification of examinations in the medical field.
Personal data about place, preferences, shopping, operations and many other activities are
analyzed with these methods to extract key information and trends. Examples include web
analytics, marketing strategy, market research and similar work.
Other applications include recommender systems, mathematical and statistical analysis,
robotics and climatology, so the technique is useful across a broad spectrum of fields.
IV. Research
Unintentional Injuries
This chapter describes unintentional injuries, based on a selected number of their causes. An
injury is defined as damage to a person caused by an acute transfer of energy or by the sudden
absence of heat or oxygen. Unintentional injuries are the subset of injuries for which there is
no evidence of predetermined intent. The World Health Organization (WHO) has examined
unintentional injuries. The discussion here covers their causes, their burden and the risk
factors for poisoning.
Burden of Unintentional injuries
Worldwide, unintentional injuries accounted for more than 3.5 million deaths in 2005, or about
6 percent of all deaths and 66 percent of injury deaths. They also accounted for more than 113
million DALYs in 2005. Males accounted for almost two-thirds of these deaths in low- and
middle-income countries (LMICs) in 2005. Unintentional injuries particularly affect young
people aged 15 to 29.
Causes of Unintentional Injuries in LMICs
Like most diseases, injuries are caused by multiple factors. The model of host, vector and
environmental factors has been adapted to describe their causes, with each factor related to
the time of injury (before and after). The resulting matrix is known as the Haddon Matrix,
after Haddon, who initially developed it to address road traffic injuries (RTIs) and to capture
the multiple factors involved in an injury.
In the past, most of the evidence identifying risk factors came from high-income countries
(HICs), which have a larger number of injury researchers and research institutions. Injury
research is scarce in LMICs, and the identification of risk factors in these countries is
growing slowly.
Risk Factors for Poisonings
The literature on LMICs contains brief information about intentional poisoning, important
information about occupation-related poisoning, particularly pesticide poisoning, and
information on lead poisoning. Those types of poisoning are not covered here. This section is
about the risk factors for the other types of poisoning in LMICs, with a focus on poisoning in
young children.
The literature on childhood poisoning shows that, in hospitals, child poisoning victims
outnumber adult victims. This is the opposite of the data presented here, which show that
middle-aged people make up the largest share of poisoning deaths and DALYs in LMICs. These
figures point to a pressing need for work on work-related poisoning.
Young males are at greater risk of poisoning than females. Paraffin is the agent most commonly
involved in childhood poisoning. Other household agents include pesticides, chemicals, plants
and animals. Sociodemographic risk factors reported in LMICs relate to adult supervision,
isolated residence and young parents. Some studies also report a previous poisoning as a risk
factor. Other factors are storing many containers in the home, using nonstandard containers
and storing poison at ground level.
Risk Factors for Burn-related Injuries
Numerous country-specific surveys conducted by medical teams report that injuries from hot
water (scalds) are about as common as injuries from fire burns. In countries such as India and
China, fire injuries are more common than scald injuries.
Overall, women are at higher risk of fire burn injuries than men. This finding comes from
population-based data; surveys of medical centers report that males are at higher risk than
women, except in India. The surveys also show that young children suffer burn-related injuries
at a higher rate than other age groups. In rural areas, the risk factors are consistent with
burn-related injuries occurring at home (Tang et al., 2017).
Investigators have studied the risk and protective factors for burn-related injuries in Asia,
Africa and South America, with a focus on children. Factors that increase the risk are the lack
of a water supply, storing flammable substances in the home, keeping kitchen tools or equipment
within children's reach, and living in slums or overcrowded areas. Social and personal risk
factors include being a child who is not first-born, a pregnant mother, a mother who has lost
her job, a sibling who died of a burn or suffered one, illiterate parents, parents who are not
alert to burn risks, and low family status. The protective factors reported are presence in the
living room, maternal education, and, according to studies of injuries among other males,
living in good conditions and a good environment.
The above picture shows the summary of the modeling parameters. Four categories are displayed:
data to be modeled, clustering strategy, target variable and weight variable. Three target
variables are chosen: the year 2016 for both sexes, male and female (Schmertmann, Williamson &
Black, 2012).
The model overview shows that the initial number of variables is 18 and the selected number of
variables is 14. The number of records is 2,999, the minimum cluster count is 10 and the
maximum cluster count is 20.
The suspicious variables are described by variable, target, KI and KR. The variables are the
years 2005 to 2015 for both sexes, male and female. The targets are the year 2016 for both
sexes, male and female. The 2015 female variable with the 2016 male target has the highest KI
value, 0.9949. The lowest value listed for the 2015 female variable with the 2016 male target
is 0.9107 (Sheik Yousuf & Devi, 2017). The 2015 male variable with the 2016 male target has the
highest KR value, 0.990. The 2005 both-sexes variable with the 2016 female target has the
lowest KR value.
The target variables are the year 2016 for both sexes, female and male. Each has a minimum,
maximum, mean and standard deviation (SD). For 2016 both sexes, the minimum is 0, the maximum
is 9.8, the mean is 2.048 and the SD is 2.313. For 2016 male, the minimum is 0, the maximum is
13, the mean is 2.13 and the SD is 2.398. For 2016 female, the minimum is 0, the maximum is
9.8, the mean is 1.915 and the SD is 2.358.
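The same four summary figures can be computed for any column with Python's standard library, as in the sketch below. The input values are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) rather than the population form (`statistics.pstdev`) is an assumption, since the report does not state which one the tool reports.

```python
import statistics


def summarize(values):
    """Return the minimum, maximum, mean and sample standard deviation."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD (n - 1 in the denominator)
    }


# Hypothetical rates for one target variable.
rates = [0.0, 1.0, 2.0, 3.0, 9.0]
summary = summarize(rates)
print(summary)
```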
For the year 2016 both sexes, the predictive power (KI) is 0.7478 and the prediction
confidence (KR) is 0.9686. For the year 2016 male, KI is 0.7075 and KR is 0.9668. For the year
2016 female, KI is 0.7063 and KR is 0.9825 (Shimosaka, Ishiduka, Izui, Yamada & Nishiwaki,
2018).
The above picture shows the cluster counts for 2016 female, male and both sexes. For both
sexes, the initial and final number of clusters is 14. For female, the initial and final number
of clusters is 13. For male, the initial and final number of clusters is 12.
Cluster count for target variables
The above picture shows the model, winner, KI, KR, the initial number of clusters, the final
number of clusters, overlap, and unassigned records in percent. For 2016 both sexes, the model
with 14 initial clusters is the winner, so its winner row is true and the others are false.
For the year 2016 male, the model with 12 initial clusters is the winner, so its winner row is
true and the other rows are false. For 2016 female, the model with 13 initial clusters is the
winner, so its winner row is true and the other rows are false.
Visualization
The given picture is the graphical representation of the analysis of the sdgpoison.csv
dataset. The x-axis represents the year 2016 for both sexes and the y-axis represents
frequency. The 14 clusters are distinguished by color.
The above picture shows the cluster information for the 2016 male variable in graphical form.
Cluster 7 has the highest frequency, 0.200.
The above picture shows the cluster frequencies for 2015 female. There are 14 clusters. Cluster
10 has the highest frequency, more than 0.200. Cluster 14 has the lowest frequency, 0.025.
Performance
The three pictures above show the performance for the year 2016 male, female and both sexes.
The x-axis runs from 0% to 100%. The y-axis represents the standardized profit, from 0.0 to
1.0. The green curve represents the wizard and the violet curve represents the validation.
Clusters frequencies
The three pictures above show the cluster names, frequencies and target means for 2016 male,
female and both sexes.
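How a cluster's frequency and target mean relate to the overall mean can be sketched in a few lines. The (cluster, value) pairs are hypothetical and do not come from the SDGPOISON results; frequency is taken here to mean the share of all records that fall in the cluster.

```python
from collections import defaultdict


def cluster_profile(records):
    """Compute frequency and target mean per cluster.

    records is an iterable of (cluster_id, target_value) pairs.
    """
    groups = defaultdict(list)
    for cluster_id, value in records:
        groups[cluster_id].append(value)
    total = sum(len(values) for values in groups.values())
    return {
        cluster_id: {
            "frequency": len(values) / total,        # share of all records
            "target_mean": sum(values) / len(values),
        }
        for cluster_id, values in sorted(groups.items())
    }


# Hypothetical cluster assignments with a target value per record.
records = [(1, 3.0), (1, 4.0), (2, 0.5), (2, 1.5), (2, 1.0),
           (3, 8.0), (3, 9.0), (3, 10.0)]
profile = cluster_profile(records)
overall_mean = sum(value for _, value in records) / len(records)
print(profile[1], overall_mean)
```

A cluster whose target mean lies well above the overall mean, as cluster 3 does in this toy data, groups the records with the highest target values.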
Cluster 1 for the year 2016 female is shown in the above picture. It shows the categories of
deaths and the fraction of the whole population for the variable year 2016 female. The overall
mean is 1.91463 and the cluster mean is 3.3315.
The above picture shows the categories and the fraction of the whole population for the
variable 2016 male in cluster 1. The categories start at 0 and end at 9.8. The overall mean
value is 2.09043 and the cluster mean value is 6.55913. The fraction runs from 0 to 0.5.
For both sexes, the categories run from 0 to 3.1 and the fraction from 0.0 to 0.25. The
overall mean value is 2.04206 and the cluster mean value is 0.762637.
The above picture shows the target variables with their min, max, mean and SD. The 2016 male
variable has the highest SD value, 2.398.
Compare all Cluster KLs
In the given picture for 2016 both sexes, the columns are the variable and 1 KL to 14 KL.
Various variables are listed in the picture. The year 2015 female variable has the highest 1-KL
value. This target variable has 14 clusters, so it has 14 KL columns.
For the target variable 2016 male, 12 clusters were formed.
The two pictures above show that the year 2016 female formed 13 clusters, so its variables
have 1 KL to 13 KL.
Cluster SQL Expressions
The three screenshots above show the SQL expressions for the year 2016 female, male and both
sexes. The expressions differ from cluster to cluster.
Research Questions
What are the main causes of unintentional injuries?
How can unintentional injuries be prevented?
How many people die from unintentional injuries?
V. Recommendations for CEO
The recommendations for NCLT Corporation are explained in this section. All
recommendations are based on the research.
The need for poison control centers is high. It is recommended to set up poison control
centers throughout the country. At present, the poison control centers operate with a small
number of staff, yet they have saved many lives.
The production and sale of poisonous products should be regulated (Sikka, Färber, Goel &
Lehner, 2013).
Proper rules and regulations should be put in place.
Companies must focus on the production of toxin-free products.
VI. Cover Letter
NCLT Corporation,
39 Olive Gr,
Balmoral,
QLD 4171.
Mr. Norris K M
NCLT Corporation CEO,
Sir,
In this report, we present a cluster analysis of the “Unintentional poisoning: burden of
disease” data from the World Health Organization repository. The analysis was performed using
the SAP HANA web-based development workbench. All graphical views are shown in the report.
Yours Sincerely,
ZZZZZZ.
Date,
18/5/2019.
VII. Conclusion
A business analytics project offers a company a range of analytical solutions. The dataset
analyzed here contains 10,000 rows and was taken from the World Health Organization. The data
was processed with SAP HANA software, which provides various information about the performance
of the data mining, that is, the process of finding patterns in a larger dataset. Clustering is
the technique used for the data analysis. Dashboards and business intelligence reporting are
further data mining solutions that were identified. For the chosen company, a classified
reference is used to build an operational manager. The sample reports were captured with the
snipping tool so that the research is properly documented.
The assignment should be explained with appropriate detail and reasoning about the business
data. The search for data is framed by the case. Building on the previous task, the current
task analyzes the child mortality dataset. The CEO can be consulted on the logical next steps.
The observations on the business analytics data were made with the given website, and the
information found and the observations are reported to the CEO.
The leading data patterns and predictive models are developed from the operational goals. The
cover letter gives the CEO further information. In the end, the analyses are made and solutions
are given for the company's business analytics process. The SAP software provides security and
productivity for skilled business analytics. The work on this assignment is expected to give
very good results.
VIII. References
Fabiane. (2012). A Review on Clustering and Outlier Analysis Techniques in
Datamining. American Journal Of Applied Sciences, 9(2), 254-258. doi:
10.3844/ajassp.2012.254.258
M., D., M., G., K., S., & S., A. (2018). SAP HANA-Database: Inter Organisation Cooperations
with SAP Systems Perspectives on Data Management for Business Applications. Bonfring
International Journal Of Networking Technologies And Applications, 5(2), 21-25. doi:
10.9756/bijnta.8379
Methadone causes half of unintentional drug poisoning deaths in young children. (2016). The
Pharmaceutical Journal. doi: 10.1211/pj.2016.20201176
Pattanayak, A. (2017). Data Virtualization with SAP HANA Smart Data Access. Journal Of
Computer And Communications, 05(08), 62-68. doi: 10.4236/jcc.2017.58005
Rofiqo, N., Windarto, A., & Hartama, D. (2018). PENERAPAN CLUSTERING PADA
PENDUDUK YANG MEMPUNYAI KELUHAN KESEHATAN DENGAN
DATAMINING K-MEANS. KOMIK (Konferensi Nasional Teknologi Informasi Dan
Komputer), 2(1). doi: 10.30865/komik.v2i1.929
Schmertmann, M., Williamson, A., & Black, D. (2012). Unintentional poisoning in young
children: does developmental stage predict the type of substance accessed and
ingested?. Child: Care, Health And Development, 40(1), 50-59. doi: 10.1111/j.1365-
2214.2012.01424.x
Sheik Yousuf, T., & Devi, M. (2017). FREQUENT PATTERN SUB-SPACE CLUSTERING
OPTIMIZATION (FPSSCO) ALGORITHM FOR DATAMINING FROM LARGE DATA
BASE. International Journal Of Business Intelligence And Data Mining, 12(3/4), 1. doi:
10.1504/ijbidm.2017.10004686
Shimosaka, M., Ishiduka, D., Izui, K., Yamada, T., & Nishiwaki, S. (2018). A clustering and
datamining technique for analysis of non-dominated solutions in manufacturing system
optimization problem. The Proceedings Of Manufacturing Systems Division
Conference, 2018(0), 106. doi: 10.1299/jsmemsd.2018.106
Sikka, V., Färber, F., Goel, A., & Lehner, W. (2013). SAP HANA. Proceedings Of The VLDB
Endowment, 6(11), 1184-1185. doi: 10.14778/2536222.2536251
Tang, Y., Zhang, L., Pan, J., Zhang, Q., He, T., & Wu, Z. et al. (2017). Unintentional Poisoning
in China, 1990 to 2015: The Global Burden of Disease Study 2015. American Journal Of
Public Health, 107(8), 1311-1315. doi: 10.2105/ajph.2017.303841