Real-time Analytics: Data Mining Techniques and Research

Added on  2023/03/17

AI Summary
This report provides an in-depth analysis of real-time analytics, data mining techniques, and research. It explores the use of SAP Predictive Analysis and its applications in various industries. The report also discusses the challenges and benefits of real-time analytics and provides recommendations for CEOs. With a focus on the classification of data analytics, this report offers valuable insights for businesses and researchers alike.

Cover letter
KIT Corporation,
39 Woolnough Road
BROWN HILL CREEK
South Australia.
Mr. Hamish A Murray,
CEO,
KIT Corporation.
Sir,
This report presents a classification-based data analytics study. We carried out the
analysis on the “Malaria Reported deaths” dataset obtained from the WHO (World Health
Organization) repository. The analysis was performed in SAP Predictive Analysis, and all
findings are shown graphically within the report.
Yours Sincerely,
XYZ.
Date,
17/5/2019.

REALTIME
ANALYTICS
Contents
1. Introduction
2. Data Mining Techniques
3. Research
4. Recommendations for CEO
5. Conclusion
6. References
1. Introduction
Real-time analytics makes enterprise data and resources usable at the moment they are needed.
It allows a business to act without delay and to prevent problems before they happen. Some
financial institutions use it for credit decisions and for fraud detection at points of sale. The
biggest challenge of real-time analytics is handling big data. Real-time big data analytics is
already used in financial trading, where data from financial databases, social media, and
satellite weather stations informs buying and selling decisions. The major process used in
real-time analytics is classification, which describes the historical background, the elements
used in the process, and the characteristics of real-time systems. The dataset used in this
analysis is the Reported deaths data for malaria (Bialski, 2012). The death data indicates the
risk posed by malaria and shows the suffering, the death rate, and the control measures in each
locality. These figures are reported by the WHO (World Health Organization), which also explains
the major effects caused by various insects and parasites; malaria is mainly caused by
Plasmodium parasites. It also explains how the disease is transmitted to people. SAP Predictive
Analysis is a software product from SAP designed to analyze the dataset in a table and to make
forecasts from the outputs; its attributes are explained in the following sections.
Traditionally, predictive models were built with scripts and manually applied algorithms.
Predictive analytics tools now allow business users and data analysts to create new predictive
models themselves. Machine learning based on SAP Predictive Analysis is described briefly.
Machine learning focuses on developing computer programs that access data and learn from it.
Research Questions
How many malaria deaths are reported?
Which countries have the highest rate of malaria?
How many people died of malaria in 2017 and 2016?
How many cases of malaria were reported for all WHO reporting regions?
2. Data Mining Techniques
The data mining techniques are explained here. Regression/classification builds a mapping
between a group of descriptive attributes and a target attribute, that is, between the inputs of
a model and its output. These regression algorithms belong to the family of predictive models.
The Automated Analytics engine uses a proprietary algorithm. Predictive models are built from a
training dataset that reflects the business question. The returned models contain a polynomial
expression that takes numeric inputs. The contribution of each attribute, that is, the relative
weight of each input in the trained model, can then be analyzed. Model quality is described by
two indicators:
1. Predictive Power: commonly known as KI. It represents the quality of the model created by
Automated Analytics. The Predictive Power corresponds to the proportion of information in the
target variable that is explained by the explanatory variables.
2. Prediction Confidence: commonly known as KR. It indicates the robustness of the model
created by Automated Analytics (Bruce Hanson, 2018), that is, the capacity of the model to
reach the same level of performance when it is applied to a new dataset that has the same
characteristics as the training dataset.
PREDICTION AND ACTION IN BUSINESS MOMENT USING AUTOMATION
Radical changes are taking place across every aspect of business, forcing enterprises to
rethink customer value and their operating models. Employees, customers, and partners are now
digitally connected. This connectivity offers an exceptional opportunity to create and capture
value, but it also poses a risk to organizations that choose not to respond.
Rising expectations among employees, customers, and partners put pressure on business
leaders to innovate, create new forms of value, and capture it. Delivering this new value
requires businesses to rework their operating models. Predictive analysis drives these digital
transformations. To meet these demands, companies must deliver on customer experience and make
fast, data-driven decisions. Most successful companies use predictive analytics in their
applications, processes, and line-of-business solutions.
ACCURATE RESULT
SAP Predictive Analytics is used to make the whole predictive modeling process efficient and
its results accurate. Its data preparation features allow many derived variables to be created
quickly without writing code. SAP Predictive Analytics provides automated, self-service
modeling. The predictive models reach a high level of accuracy, and the automation steadily
improves workflow productivity (Decreusefond, Moyal & Limnios, 2012).
MACHINE LEARNING
SAP Predictive Analytics provides a way to govern, maintain, store, and evaluate
probability-based models. Thousands of segmented models can be created without a dedicated
modeling environment. Automated model management is possible because the environment is
user-friendly, browser-based, and supports single sign-on. With it, models can be created in
real time, managed, and scheduled, and applied to a variety of scenarios.
The performance of every single model is measured and maintained using SAP Predictive
Analytics. This takes a single click, and the end-to-end predictive lifecycle is maintained in
line with enterprise governance. The flexibility and loose coupling of the predictive factory
architecture make it easier to use. It is easy to integrate with current IT landscapes,
including business intelligence, business rule systems, the Big Data ecosystem, data sources,
applications, and business processes. It also supports integration with SAP Hybris solutions
and other SAP software.
After opening the SAP Predictive Analytics software, the first step is to select a data
source. Here, we select "Use a File or a Database Table", set the data type to Text Files, and
browse to the folder that holds the file. After that, click the Next button to move to the next
step.
The above picture shows the malaria dataset details. It has multiple variables: Country and
the years 2000 to 2017. Country is stored as a string and the other variables are stored as
integers. The Country values are nominal and the values of the other variables are continuous.
Then click the Next button.
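As an illustration only, the variable typing described above could be reproduced outside SAP Predictive Analytics with a short Python sketch. The three rows and their values are invented placeholders, not figures from the WHO dataset, and the helper name load_malaria_rows is hypothetical; only the column layout (Country plus one count per year) follows the report.

```python
import csv
import io

# Placeholder sample in the same layout as the malaria file (Country plus
# one column per year). The values are invented for illustration only.
SAMPLE_CSV = """Country,2015,2016,2017
Country A,120,100,90
Country B,40,55,60
Country C,0,0,0
"""

def load_malaria_rows(text):
    """Parse CSV text; keep Country as a string and cast year columns to int."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        row = {"Country": record["Country"]}   # nominal (string) variable
        for key, value in record.items():
            if key != "Country":
                row[key] = int(value)           # continuous (integer) variable
        rows.append(row)
    return rows

rows = load_malaria_rows(SAMPLE_CSV)

# Total reported deaths per target year across all countries in the sample.
totals = {year: sum(r[year] for r in rows) for year in ("2016", "2017")}

# Country with the highest reported deaths in 2017.
highest_2017 = max(rows, key=lambda r: r["2017"])["Country"]

print(totals)
print(highest_2017)
```

The same two aggregates (yearly totals and the country with the highest count) correspond to the kind of question listed under Research Questions, applied here to placeholder data.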
The variable selection screen then appears. Here, we select the target variables, which are
the years 2017 and 2016. The selected variables are displayed on the right side of the screen
(Target Variables). Then click the Next button to continue (Oguz, Cinsdikici & Gonul, 2017).
The above picture shows the Summary of Modeling Parameters. The model name is year
2017_ year 2016_malaria, and the target variables are the years 2017 and 2016. Then click the
Generate button. The Previous button returns to the previous page.
3. Research
The research conducted is explained in this section. It is aimed at people who want to
evaluate or use the Modeler for building Regression/Classification models. Creating a new model
this way does not require expertise in statistics or databases.
FILES AND DOCUMENTATION
The application comes with sample data, and these files are stored in the folder D. The
dataset table is named Reported deaths Data by country. The application model is applied
manually to this data file. The dataset file is named MALARIA.csv and can be accessed under this
name.
APPLICATIONS OF REGRESSION/CLASSIFICATION
The scenario concerns the deaths reported in most countries due to malaria. Modeling it
requires solid skills in data statistics. The World Health Organization compiles statistics on
the people who died of malaria. This project aims to show whether the number of deaths
increased or decreased year by year. It is a large database that takes significant time to
analyze in depth. It is open data with 10,000 rows in the dataset table (Peng & Zeng, 2017).
Examples of classification tasks in this data analysis are:
Analyzing the reported malaria deaths in order to know how risky or safe a situation is.
A data analyst analyzing a person based on the details provided.
Both examples categorize malaria cases. These details are reported in the data analysis.
OBJECTIVE
The objective relates to the statistics compiled by the WHO, which report on the deaths
caused by malaria. The major objective is to create awareness among the people and the
government.
MEANS
It is a statistical analysis based on records. It also considers the various measures taken
to control the deaths caused by malaria in the years 2000-2017. Depending on the performance,
the validation and estimation values increase in the years 2016-2017. According to the
statistics, the maximum number of deaths occurred in 2012 and the minimum in 2006. From 2001 to
2006 there is only a minor difference in the death level, but from 2007 there was a large
increase in deaths.
How classification works can be understood through the malaria death rates across
countries. Data classification is a two-step process:
Developing a model or a classifier
Using the classifier for classification
DEVELOPING A MODEL OR A CLASSIFIER
Building a classifier is the learning phase, or learning step.
Classifiers are built using classification algorithms.
The training set is made up of database tuples and their associated class labels, and it is
used to build the classifier.
Each tuple in the training set belongs to a category or class. Tuples are also referred to
as samples, objects, or data points.
USING THE CLASSIFIER FOR CLASSIFICATION
In this step the classifier is used for classification. Test data is used to estimate the
accuracy of the classification rules. The rules can be applied to new data tuples if the
accuracy is considered acceptable.
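The two steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm used by SAP Predictive Analytics: the classifier is a simple one-variable threshold rule, the labels "low" and "high" are hypothetical, and all tuples are invented placeholder values.

```python
# Each tuple is (reported_deaths, class_label). Values are invented placeholders.
training_set = [(5, "low"), (12, "low"), (30, "low"),
                (400, "high"), (650, "high"), (900, "high")]
test_set = [(8, "low"), (20, "low"), (500, "high"), (700, "high")]

def train_threshold_classifier(tuples):
    """Learning step: place a threshold midway between the two class means."""
    low = [x for x, label in tuples if label == "low"]
    high = [x for x, label in tuples if label == "high"]
    return (sum(low) / len(low) + sum(high) / len(high)) / 2

def classify(threshold, value):
    """Classification step: apply the learned rule to one value."""
    return "high" if value >= threshold else "low"

def accuracy(threshold, tuples):
    """Share of tuples whose predicted label matches the true label."""
    correct = sum(1 for x, label in tuples if classify(threshold, x) == label)
    return correct / len(tuples)

# Step 1: build the classifier from the training tuples.
threshold = train_threshold_classifier(training_set)

# Step 2: estimate accuracy on held-out test tuples before using the rule
# on new data.
test_accuracy = accuracy(threshold, test_set)
print(threshold, test_accuracy)
```

Keeping the test tuples separate from the training tuples is what makes the accuracy estimate meaningful for new data.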
COMPARISON OF CLASSIFICATION AND PREDICTION METHODS
Accuracy - The accuracy of a classifier is its ability to predict the class label correctly.
The accuracy of a predictor is how well it guesses the value of the predicted attribute.
Speed - The computational cost of creating and using the classifier or model.
Robustness - The ability to make correct predictions from noisy data.
Scalability - The ability to build the model efficiently when a large amount of data is
given.
SAP Predictive Analytics is used to deliver results at the point of decision in applications
and systems. Predictive scoring can be deployed to a wide variety of target systems, and it is
most useful in the database, where business processes and line-of-business solutions run. It
supports in-memory scoring in the SAP predictive analysis platform, relational database
management systems, and other data sources. Models can be stored and embedded directly into SAP
HANA and SAP S/4HANA software without moving data. CCL (Continuous Computation Language) code
can also be generated for a model and deployed for the relevant use cases.
SAP models put predictive insights in the hands of the organization. With the SAP Analytics
Cloud solution, customers can use the platform to identify key variables with a mouse click.
SAP Predictive Analytics software consists of the following components:
Data manager
Guided model authoring
Predictive factory
Scorer
Link analysis and recommendation
Software development kit and API
SAP Predictive Analytics provides the following benefits:
It delivers predictive insights faster and reduces errors through automated techniques,
supported by the quality measures KI and KR.
It scales machine learning to thousands of predictive models that are managed across the
enterprise.
It supports the predictive value chain with guided workflows and automated techniques, and
it integrates with SAP software including SAP HANA, SAP S/4HANA, SAP BusinessObjects, and
the SAP Business Warehouse application.
It improves operating margins across the enterprise by using thousands of models.
It increases productivity, because predictive datasets do not have to be created at
multiple points.
Training the model produces an overview of the malaria dataset, shown in the above picture.
The dataset name is malaria.csv, the initial number of variables is 20, the number of selected
variables is 17, and the number of records is 105.
The suspicious variables table lists the variable, target, KI, and KR values. Two variables
are displayed in the variable column, the years 2015 and 2019. Two target variables are shown
in the picture, the years 2017 and 2016. The years 2009 and 2016 have the lowest KI value, and
the year 2017 has the lowest KR value (Peng, Zeng & Natale, 2019).
The two target variables chosen are the years 2017 and 2016. For the year 2017, the minimum
value is 0, the maximum value is 4,414, the mean is 548.986, and the standard deviation is
1,113.7.
For the year 2016, the minimum value is 0, the maximum value is 5,853, the mean is 522.93,
and the standard deviation is 1,160.1.
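For readers who want to check such summary figures outside the tool, the sketch below computes the same four statistics with Python's standard library. The input list is an invented placeholder, not the WHO data, and the helper name summarize is hypothetical. It uses the sample standard deviation (statistics.stdev); whether SAP Predictive Analytics reports the sample or the population form is not stated in this report, so that choice is an assumption.

```python
import statistics

# Placeholder values standing in for one year's column of reported deaths.
deaths_2017 = [0, 10, 20, 30, 40]

def summarize(values):
    """Return min, max, mean and sample standard deviation of a column."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),   # sample SD (n - 1 in the denominator)
    }

summary = summarize(deaths_2017)
print(summary)
```

Swapping statistics.stdev for statistics.pstdev gives the population form if that turns out to match the tool's output.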
The picture shows the predictive power (KI) and prediction confidence (KR) for the years
2016 and 2017. For 2017, the KI value is 0.9643 and the KR value is 0.9590. For 2016, the KI
value is 0.9637 and the KR value is 0.9523.
The display tab shows five categories: model overview, model graphs, contribution by
variables, category significance, and statistical reports.
The above picture shows the predicted versus actual graph for the year 2017. The x-axis is
the predicted value and the y-axis is the actual value. The green color represents the wizard.
The blue color represents the validation and the yellow color represents the estimation.
The above picture shows the contributions by variables. The variables are the years 2000 to
2017. The year 2006 has the lowest contribution, with a value of less than 0.025. The year 2012
has the highest contribution, with a value of 0.300.
The above picture shows the model graph for the year 2016. It is similar to the model graph
for the year 2017, but it has different predicted and actual values.
The picture shows the contributions by variables for the year 2016. The variables are the
years 2010 and 2000. The year 2013 has the highest contribution, with a value of 0.225. The year
2006 has the lowest contribution, with a value of less than 0.025.
The above picture shows the statistical reports for all variables. The report lists the
variable, dataset, min, max, mean, and standard deviation. Two dataset types are displayed in
the statistical report: validation and estimation.
The picture shows the dataset sizes. The estimation dataset has 71 records with a total
weight of 71. The validation dataset has 34 records with a total weight of 34.
The variable correlations report for the year 2017 has four columns: index, first variable,
second variable, and coefficient. Correlation is the statistical relationship between two
variables (Sandborn, 2013). The highest coefficient value is 0.993, for the relationship
between the years 2003 and 2005. Its index value is 38.
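A coefficient of this kind can be illustrated with the Pearson correlation, sketched below in plain Python. This assumes the reported coefficient is a Pearson coefficient, which the report does not state; the two lists are invented placeholders, not the WHO figures for 2003 and 2005, and the function name pearson is hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Placeholder per-country counts for two years.
deaths_year_a = [10, 20, 30, 40]
deaths_year_b = [12, 19, 33, 41]

r = pearson(deaths_year_a, deaths_year_b)
print(round(r, 3))
```

A value close to 1 means the two yearly columns rise and fall together across countries, which is how a coefficient such as 0.993 would be read.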
The above picture shows the KI and KR graph for the year 2017. The dataset type is
estimation. The blue color represents the KI value and the gray color represents the KR value.
The above picture shows the KI and KR statistical report for the year 2016. Different colors
are used to distinguish KI and KR.
4. Recommendations for CEO
The recommendations for KIT are explained in this section. All of them are based on the
findings of the research and are intended to help the CEO achieve his aim of a malaria-free
world. The research shows that malaria affects infants in large numbers. Malaria mainly affects
rural areas, where awareness is low, so awareness programs in rural areas would help reduce its
impact. The results of the research clearly show that malaria mortality rates have fallen since
2010, mainly because of the prevention techniques followed and the readiness of government
agencies. Most deaths caused by malaria occur in rainy seasons, so government agencies and NGOs
must be prepared during rainy seasons.
5. Conclusion
This business analytics project identified an analytics solution for the company. From the
given entries we selected the WHO (World Health Organization) open data repository, which
provided a dataset of 10,000 rows. The data was processed with the SAP (Systems, Applications,
and Products) predictive analysis software, and the technique used in the data analysis was
classification, which helps to find patterns in large datasets. The business intelligence
reports for dashboards and data mining were identified. The research was documented with sample
reports captured using the Snipping Tool. The analysis is based on the mortality dataset.
Logical recommendations, together with the main findings and observations from the business
data, are presented to the CEO (Yamato, Fukumoto & Kumazaki, 2016). Data patterns were examined
and predictive models were developed in line with the operational goals. A cover letter gives
the CEO further explanation. Finally, the assignment analyzed and focused on solutions for the
company's business analytics process. The SAP software proved secure and well suited to
business analytics, and the assigned tasks were completed as expected.
6. References
Bruce Hanson, A. (2018). Some Results about Big and Little Lip. Real Analysis
Exchange, 43(1), 43. doi: 10.14321/realanalexch.43.1.0043
Decreusefond, L., Moyal, P., & Limnios, N. (2012). Stochastic modeling and analysis of telecom
networks. Hoboken, NJ: J. Wiley & Sons.
Oguz, K., Cinsdikici, M., & Gonul, A. (2017). Robust activation detection methods for real-time
and offline fMRI analysis. Computer Methods and Programs in Biomedicine, 144, 1-11.
doi: 10.1016/j.cmpb.2017.03.015
Peng, C., & Zeng, H. (2017). Response time analysis of digraph real-time tasks scheduled with
static priority: generalization, approximation, and improvement. Real-Time Systems, 54(1),
91-131. doi: 10.1007/s11241-017-9290-7
Peng, C., Zeng, H., & Natale, M. (2019). A comparison of schedulability analysis methods using
state and digraph models for the schedulability analysis of synchronous FSMs. Real-Time
Systems. doi: 10.1007/s11241-019-09331-1
Sandborn, P. (2013). Cost analysis of electronic systems. Singapore: World Scientific.
Yamato, Y., Fukumoto, Y., & Kumazaki, H. (2016). Proposal of Real Time Predictive
Maintenance Platform with 3D Printer for Business Vehicles. International Journal of
Information and Electronics Engineering, 6(5), 289-293. doi: 10.18178/ijiee.2016.6.5.640