Business Intelligence Report: Data Analysis and Loan Delinquency

Added on  2023/06/03

AI Summary
This report delves into various aspects of business intelligence, starting with the importance of electronic information recording systems, particularly in healthcare. It examines the evolution of electronically stored health records and the crucial role of data security. The report then transitions to a practical application, analyzing loan delinquency using RapidMiner, exploring data attributes, and interpreting correlation matrices. The analysis includes insights into customer demographics, loan defaulting habits, and the utilization of unsecured lines. The document concludes by highlighting the significance of data management in organizational decision-making and compliance with regulatory requirements, emphasizing the need for robust data protection measures.
Running head: BUSINESS INTELLIGENCE
Business Intelligence
Name of the Student
Name of the University
Course ID
Table of Contents
Task 1
Introduction
Importance of electronic information recording system
Detecting the Evolution in electronically stored health record
Task 2:
Task 2.1:
Task 2.2:
Task 2.3:
Task 2.4
Task 3
Task 3.1
Task 3.2
Task 3.3
Task 3.4
References
Task 1
Introduction
Privacy in health research has become an issue that attracts considerable attention. Designing
relevant health care services requires research into health-related issues, and the data
involved are a vital asset of any organization. Within that data, confidential information
needs special attention, and the organization must maintain its confidentiality. Protecting
confidential information by means of software and hardware is known as information security.
It implies a combined internal and external system of operation in which collected information
and data are kept protected. Recorded data and information play several important roles in the
functioning of an organization, and information security serves several functions in turn. The
primary responsibility is to maintain the privacy of the collected data. A secure information
system also protects the capacity of the organization to perform its assigned functions.
Another important aspect is the security of the technology the organization uses. Organizations
give special attention to protecting information because unauthorized access to confidential
information adversely affects people directly or indirectly connected to the organization.
All health-related data in Australia are documented under the supervision of the Australian
Digital Health Agency. With the increasing prevalence of various health issues, the volume of
gathered information is growing rapidly. The agency gathers and maintains all of this
health-related information. Growing concern about various diseases encourages more people to
use health care services. The number of people seeking health care services far exceeds the
number of available service providers
(Digitalhealth.gov.au. 2018). Maintaining detailed information about each individual recipient
has therefore become extremely important. Information is collected on personal details and
health status. Technological advances in the equipment and machinery used by health care
service units increase the information available about individual recipients. Data related to
prenatal testing is the most easily accessible. Analysis of health-related risk factors is an
important factor in determining the continuation of required services. Maintaining proper data
and information also provides protection against unjustified allegations or claims. In addition
to direct health care, various other aspects fall under primary health care services. It is the
responsibility of service providers to document observations and instructions. Third-party
intervention is often observed in the system, where relevant information is used to pay for the
services used.
Importance of electronic information recording system
Health care organizations today pay great attention to securing information. Given the large
volume of health care data, security systems are designed to maintain the confidentiality of
personal information. Without proper security, it would be very easy to access and misuse this
information. Information is now recorded and shared electronically instead of through the
earlier paper-based method of documentation (Dinev et al. 2016). The paper-based method of
photocopying important documents is laborious and requires more time than storing data
electronically.
The relevant data is gathered from various sources and is combined and connected with other
profiles. The electronic process therefore makes it easier to explore the database within the
built network and to extract data from remote locations. Nevertheless, the system increases the
chance of third-party access to the data,
which is being stored. The overall system therefore lacks adequate security: it makes access
easy without protecting the organization's data. An individual can access the data without
leaving any trace of the incident. On the other hand, the system allows service providers to
understand trends in the data, which indicate the health conditions of the population. Service
providers report easy access to, and understanding of, the information presented in the
database (Van Cauteren et al. 2016). With advances in technology, the information is used to
support electronic health care records for individuals. Mobile technology is also used to
capture the required data for individuals. The major significance of the electronic health
record (EHR) is to support the activities of the industry. Thus, the continuous evolution and
improvement of the system directly increases the quality of the health care service delivered.
Various measures keep information on the status of stored data electronically and reduce
processing errors. With these measures, the risk of malpractice in meeting reimbursement claims
can be avoided. However, the current legal framework used in the health care recording process
has drawbacks. Moreover, the obligations of health care providers rest on both electronic and
paper-based methods. Thus, confidentiality conditions may vary with the information holder.
Detecting the Evolution in electronically stored health record
The rapid growth in the health record burden has expanded the use of electronic recording
systems, which directly support service providers. Large variation across the nation has also
driven strong support for the electronic health record. The electronic health record plays a
significant role in major hospitals, as it allows
the authorities to understand the history of the patients. The national center indicated that
75% of service providers are able to enhance the quality of patient care by using electronic
data. Consequently, the electronic recording system allows individuals to access patient
information and make adequate decisions during critical hours (Kim et al. 2017). The system
also provides alerts about new medications and physicians that patients are considering for
their health issues. Digital health technology has thus undergone serious changes in recent
years to support hospitals with patient information. The structure of the health record system
has also played a role in distributing information among different hospitals.
The personal information and other personal data of Australian citizens are stored in the
Australian Data Agency. Because the data comprise citizens' personal and crucial documents,
they must be stored and governed under regulatory acts such as the Privacy Act 1988. The health
record system manages all personal information in a more classified way. The personal data
collected and stored are useful and remain available for consideration whenever they are needed
for communication and management purposes. The “My Health Record” system has been widely used
by the company and operators for reclassification and arrangement of the data. Crucial
information gathered about the health care products and services rendered is stored in a
structured form and with privacy. Protection of the personal data gathered is a privacy matter
for the company and should be overseen by regulatory bodies, which impose rules and guidelines.
The regulatory body can
take several steps, such as imposing penalties and fines and bringing regulatory and criminal
proceedings against those involved in breaching the secured and private data of the
organization (Zingg et al. 2015).
Data flows into the organization in several ways, such as telephone conversations, emails and
general letters, all of which may contain private data. Organizations also collect various
kinds of employment-related data, which should likewise be managed and stored effectively. The
company manages data collection and data processing well in terms of managing relations with
its employees (Watanabe 2015). Management should assess the important situations and scenarios
in which the collected data may be used for effective decision making. Such situations arise
when the organization reviews its data management and uses the data in various processes of the
company, such as contracts, workforce management, meeting the obligations and rules of
regulatory bodies, and associating goods with the market information available to management.
The Human Services Department sets several requirements for providing health data records and
information, which the regulatory body enforces to ensure better data management and data
processing. Registrations are also taken from those interested in the digital health care
system, and the security of these registrations is an important factor. Several steps need to
be taken to enable and protect the data of organizations (Booth et al. 2018).
Several steps should be taken, and access to the data should be given to individuals only after
careful analysis of their identity. Parental responsibility should
also be taken into account, and any authorized representative should be over the age of 18.
Task 2:
Task 2.1:
To explore loan delinquency at ACME bank, the first step is to import the data into RapidMiner.
Exploration of the data shows that the first variable in the data file is an identity variable.
The identity variable was removed using the “Select Attributes” operator. The correlation
matrix operator was then added and the main process was completed. The procedure is illustrated
in Figure 2.1. Executing the process produced the exploratory data analysis of the dataset
together with the correlation matrix.
Figure 2.1: Data Analysis and Correlation Matrix Procedure
The table below provides the exploratory data analysis relating to loan delinquency.
Table 2.1: Results relating to the Analysis of Exploratory Data for loan delinq.csv
The data analysis reviews the information that has been collected on loan delinquency. The
dataset covers 150,000 (1.5 lakh) customers of the bank. All attributes are complete except
monthly earnings and number of dependents. Monthly earnings are missing for 29,731 customers,
and the number of dependents is missing for 3,924 customers.
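The missing-value counts above can be turned into shares of the dataset with a few lines of arithmetic. The sketch below uses only the figures quoted in this section (1.5 lakh, that is 150,000, customers and the two missing-value counts); the attribute labels and the function name are illustrative and are not taken from RapidMiner.

```python
# Share of records with a missing value, using the counts quoted in the report.
TOTAL_CUSTOMERS = 150_000

missing_counts = {
    "monthly earnings": 29_731,      # label is illustrative
    "number of dependents": 3_924,   # label is illustrative
}

def missing_share(missing: int, total: int) -> float:
    """Return the fraction of records where the attribute is missing."""
    return missing / total

missing_shares = {name: missing_share(count, TOTAL_CUSTOMERS)
                  for name, count in missing_counts.items()}

for name, share in missing_shares.items():
    print(f"{name}: {share:.2%} missing")
```

Roughly one record in five lacks monthly earnings, which matters when choosing how to handle missing values before modelling.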
Loan delinquency is measured with the dichotomous variable SeriousDlqin2yrs, which records
whether a customer has been 90 days or more past due on a loan. About 6.7% of the customers
show this loan default habit, while the remaining 93.3% do not show loan delinquency.
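Because the two classes of SeriousDlqin2yrs are split 93.3% to 6.7%, accuracy alone can look high even for a trivial model. The sketch below is an illustration and not part of the original RapidMiner process; it computes the accuracy of a hypothetical classifier that always predicts the larger class.

```python
# Accuracy of a hypothetical "always predict the larger class" baseline.
class_shares = {"larger_class": 0.933, "smaller_class": 0.067}

def majority_baseline_accuracy(shares: dict) -> float:
    """A constant classifier is correct exactly as often as the largest class occurs."""
    return max(shares.values())

baseline_accuracy = majority_baseline_accuracy(class_shares)
print(f"Baseline accuracy: {baseline_accuracy:.1%}")
```

Any model built on this variable is therefore better judged against this baseline, and with precision and recall, than by accuracy alone.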
“RevolvingUtilizationOfUnsecuredLines” is treated as a continuous variable. Its minimum value
is 0 and its maximum value is 50708, with an average of 6.048. The majority of the customers
make use of unsecured lines.
The variable “Age” is treated as a continuous variable. The minimum and maximum customer ages
are 0 and 109 respectively, and the average age is 52.9295. The histogram indicates that
customer age is approximately normally distributed.
The variables “NumberOfTime30-59DaysPastDueNotWorse”, “NumberOfTimes90DaysLate” and
“NumberOfTime60-89DaysPastDueNotWorse” each have a minimum of 0 and a maximum of 98. The debt
ratio ranges from 0 to 329664, with an average of 353.005.
“NumberOfOpenCreditLinesAndLoans” ranges from 0 to 58, with an average of 8.453.
For “NumberOfDependents”, the minimum and maximum values are 0 and 20 respectively. The average
number of dependents is 0.757; most customers have at most one dependent, and fewer customers
have a higher number.
To assess how the variables relate to loan delinquency, a correlation matrix was calculated, as
shown in the figures. The correlation matrix indicates that
“NumberOfTime30-59DaysPastDueNotWorse”, “NumberOfTimes90DaysLate”,
“NumberOfTime60-89DaysPastDueNotWorse”, “NumberOfDependents” and “age” have the greatest
correlation with loan delinquency. These five factors are therefore selected for analysing
loan delinquency.
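The selection step can be sketched outside RapidMiner as ranking attributes by the absolute value of their Pearson correlation with the target. The data below is a tiny made-up sample used only to make the sketch runnable; it is not drawn from the loan dataset, and the helper names are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def rank_by_correlation(features, target, top_k):
    """Return the top_k attribute names ordered by |correlation| with the target."""
    scores = {name: abs(pearson(values, target)) for name, values in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Made-up sample: six customers, target 1 = delinquent.
target = [0, 0, 0, 1, 1, 1]
features = {
    "NumberOfTimes90DaysLate": [0, 0, 1, 1, 2, 3],
    "age": [60, 55, 48, 40, 35, 30],
    "DebtRatio": [0.3, 0.9, 0.2, 0.8, 0.4, 0.5],
}
selected = rank_by_correlation(features, target, top_k=2)
print(selected)
```

Ranking by absolute value keeps attributes with a strong negative correlation, which a ranking on the raw coefficient would push to the bottom.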
Task 2.2:
Figure 2.2 (a) Decision Tree Procedure
Figure 2.2(b): Decision Tree
Figure 2.2(c): Decision Tree Process
To build the decision tree in RapidMiner, the variables with the highest correlation are
selected. The five variables with the greatest correlation are therefore used to determine loan
delinquency. The “Set Role” operator is used to set loan delinquency as the target variable.
The “Decision Tree” operator is used to build
the decision tree. The least square criterion is used to build the tree, with a maximum depth
of 5. Pruning is applied with a minimal gain of 0.01.
The decision tree first splits on “NumberOfTimes90DaysLate” at 0.500. Records with
“NumberOfTimes90DaysLate” above 0.500 are split again at 1.500. For records with
“NumberOfTimes90DaysLate” greater than 1.500, “NumberOfTime30-59DaysPastDueNotWorse” is split
at 0.500, and the age attribute is used to classify the records with
“NumberOfTime30-59DaysPastDueNotWorse” greater than 0.500.
Records with “NumberOfTimes90DaysLate” below 0.500 are split on
“NumberOfTime30-59DaysPastDueNotWorse” at 0.500, and “NumberOfTime60-89DaysPastDueNotWorse” is
then used to split that branch further.
The decision tree uses all five variables to predict loan delinquency. The first-order splits
use a threshold of 0.500, while the second-order split uses a threshold of 1.500.
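Because the late-payment attributes are whole-number counts, a split at 0.500 separates customers with no such events from those with at least one, and a split at 1.500 separates one event from two or more. The sketch below illustrates only this reading of the thresholds; the function name and bucket labels are hypothetical, and it does not reproduce the leaf predictions of the tree in Figure 2.2(b).

```python
def late_count_bucket(times_90_days_late: int) -> str:
    """Map an integer count to the branch implied by splits at 0.5 and 1.5."""
    if times_90_days_late <= 0.5:
        return "never late"          # count == 0
    if times_90_days_late <= 1.5:
        return "late once"           # count == 1
    return "late twice or more"      # count >= 2

buckets = [late_count_bucket(n) for n in range(4)]
print(buckets)
```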
Task 2.3:
Figure 2.3 (a): Logistic Regression Model
Figure 2.3(b): Logistic Regression Output
The figure shows the logistic regression procedure for determining loan delinquency. The
process begins by importing the data. The numerical variable is converted into a binomial
variable, and the variables for predicting loan delinquency are chosen based on the loan
delinquency correlation matrix. The “Set Role” operator is then used to define the association
between the dependent loan delinquency variable and the independent variables.
The association between the dependent and independent variables is stated below:
Loan delinquency = 5.961 * Age + 1.230 * NumberOfTime30-59DaysPastDueNotWorse
+ 1.938 * NumberOfTimes90DaysLate + 1.256 * NumberOfTime60-89DaysPastDueNotWorse
+ 0.224 * NumberOfDependents - 9.462
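In a logistic regression, a linear expression of this kind is normally passed through the logistic (sigmoid) function to obtain a probability. The sketch below applies the coefficients quoted above under that assumption. How RapidMiner scaled or encoded the inputs is not stated in the report, so the input values are placeholders and the function name is hypothetical.

```python
import math

# Coefficients and intercept as quoted in the report's equation.
COEFFICIENTS = {
    "age": 5.961,
    "NumberOfTime30-59DaysPastDueNotWorse": 1.230,
    "NumberOfTimes90DaysLate": 1.938,
    "NumberOfTime60-89DaysPastDueNotWorse": 1.256,
    "NumberOfDependents": 0.224,
}
INTERCEPT = -9.462

def delinquency_probability(inputs: dict) -> float:
    """Sigmoid of the linear score; assumes inputs use the scale the model was fitted on."""
    score = INTERCEPT + sum(COEFFICIENTS[name] * inputs.get(name, 0.0)
                            for name in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-score))

# Placeholder input: every attribute at zero, so the score equals the intercept.
p_zero = delinquency_probability({})
# One 90-days-late event raises the score by 1.938.
p_one_late = delinquency_probability({"NumberOfTimes90DaysLate": 1})
print(p_zero, p_one_late)
```

Since every quoted coefficient is positive, raising any input can only raise the predicted probability in this sketch.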
The equation indicates that all the independent variables have a positive effect on loan
delinquency. Age has the largest coefficient, so the equation implies the greatest rise in loan
delinquency with a rise in age. The number of dependents has the least effect on loan
delinquency.
Meanwhile, age is not statistically significant for loan delinquency at the 0.05 level of
significance. All the variables other than age have a significant effect on loan delinquency.
Task 2.4
Table 2.2: Results of Model Performance Evaluation (Decision Tree, Logistic Regression)
Measures          Logistic Regression    Decision Tree
Model Accuracy    93.51%                 93.51%
True Positive     113.3                  154.3
False Positive    84.100                 125.0
Precision         57.44%                 55.29%
Recall            11.31%                 15.04%
Lift              859.85                 828.00
Sensitivity       11.31%                 15.40%
F Measure         18.88%                 24.08%
Table 2.2 provides a comparative analysis of the decision tree and logistic regression models.
Cross validation was used to compare the models. Both models show the same accuracy of 93.51%.
The precision of the logistic regression model is higher at 57.44%, compared with 55.29% for
the decision tree model. The recall of the logistic model is 11.31%, compared with 15.04% for
the decision tree model. The sensitivity of the decision tree model is better at 15.40%,
against 11.31% for the logistic model. Under cross validation the accuracy of the two models is
identical, while the decision tree model is better on recall, sensitivity and F measure.
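The F measure in Table 2.2 can be cross-checked from precision and recall using the standard harmonic-mean formula F = 2PR / (P + R). The sketch below applies it to the reported percentages; small differences from the table are to be expected if RapidMiner averaged the measure across cross-validation folds, which is an assumption here.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    return 2 * precision * recall / (precision + recall)

# Percentages from Table 2.2, written as fractions.
f_logistic = f_measure(0.5744, 0.1131)
f_tree_recall = f_measure(0.5529, 0.1504)       # using the reported recall
f_tree_sensitivity = f_measure(0.5529, 0.1540)  # using the reported sensitivity

print(f"{f_logistic:.2%} {f_tree_recall:.2%} {f_tree_sensitivity:.2%}")
```

Recall and sensitivity are two names for the same quantity, so the two decision tree figures (15.04% and 15.40%) would normally coincide; the value computed from 15.40% lies closer to the reported F measure of 24.08%.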
Task 3
Task 3.1
Figure 3.1 : Daily rainfall at NorfolkIsland
The Tableau view shows the daily rainfall at the NorfolkIsland weather station in June 2012.
The bar chart for this weather station plots the days of June on the horizontal axis and
rainfall on the vertical axis. The chart shows that the maximum rainfall occurred on 30th June.
Rainfall was also very high from 12th to 15th June, after which there was almost no rainfall
until 29th June. There was no rainfall at the start of the month. Changing the location in the
Tableau file shows the rainfall at other locations.
Task 3.2
Figure 3.2 : Yearly rainfall at NorfolkIsland
For the yearly analysis, we again study the rainfall at NorfolkIsland. The analysis shows that
the rainfall follows a normal distribution from 2009 to 2018. The highest total rainfall
occurred in 2011 and the lowest in 2017. The chart shows that rainfall decreased from 2011 to
2013, was approximately equal from 2013 to 2015, and rose in 2016. However, the rainfall fell
drastically in 2017.
Task 3.3
Figure 3.3: Monthly rainfall at NorfolkIsland
For the monthly evaluation of rainfall, we again selected NorfolkIsland, with 2012 as the year
of analysis. The month variable (from date) is placed in columns and rainfall in rows. The
rainfall measure is aggregated as a sum to present the total rainfall. The colour of the bars
is changed to yellow. The year variable (from date) is placed in the filter, which allows 2012
to be selected. The total monthly rainfall is shown as a bar chart, where the height of each
bar represents the amount of rainfall. The chart shows that the highest rainfall occurred in
January. The rainfall in 2012 followed a cyclic pattern: it declined from January to June and
rose from June to November. However, the rainfall fell again in December.
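The Tableau steps above (filter by year, group by month, aggregate with SUM) correspond to a simple group-and-sum. The sketch below shows the same logic in Python on a few made-up daily records; the values, the second station name and the record layout are hypothetical and are not taken from the rainfall dataset.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily records: (location, date, rainfall in mm).
records = [
    ("NorfolkIsland", date(2012, 1, 3), 12.0),
    ("NorfolkIsland", date(2012, 1, 20), 8.5),
    ("NorfolkIsland", date(2012, 6, 30), 30.0),
    ("NorfolkIsland", date(2011, 6, 30), 99.0),   # excluded by the year filter
    ("OtherStation", date(2012, 1, 3), 5.0),      # excluded by the location filter
]

def monthly_totals(rows, location, year):
    """Sum rainfall per month for one location and one year."""
    totals = defaultdict(float)
    for loc, day, rain_mm in rows:
        if loc == location and day.year == year:
            totals[day.month] += rain_mm
    return dict(sorted(totals.items()))

totals_2012 = monthly_totals(records, "NorfolkIsland", 2012)
print(totals_2012)
```

Months with no records are simply absent from the result, whereas a bar chart would show them as empty positions on the axis.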
Task 3.4
Figure 3.4: Geomap of Rainfall
The geomap in Tableau is created by placing longitude in the columns and latitude in the rows.
Tableau automatically creates a geomap from the given latitudes and longitudes. To identify the
locations, the variable is placed on colour, and the locations are highlighted with a gradient
green colour. The year 2010 is selected for assessing the rainfall. The lowest total rainfall
in 2010 was 206 mm, while the highest was 2660 mm. Changing the year in the Tableau file shows
the rainfall for a different year.
Figure 3.5: Tableau Dashboard
References
Booth, A., Moylan, A., Hodgson, J., Wright, K., Langworthy, K., Shimizu, N. and Maconochie,
I., 2018. Resuscitation registers: how many active registers are there and how many collect data
on paediatric cardiac arrests?. Resuscitation.
Digitalhealth.gov.au., 2018. Privacy - Australian Digital Health Agency. [online] Available at:
https://www.digitalhealth.gov.au/policies/privacy [Accessed 6 Oct. 2018].
Dinev, T., Albano, V., Xu, H., D’Atri, A. and Hart, P., 2016. Individuals’ attitudes towards
electronic health records: A privacy calculus perspective. In Advances in healthcare informatics
and analytics (pp. 19-50). Springer, Cham.
John, A., Dennis, M., Kosnes, L., Gunnell, D., Scourfield, J., Ford, D.V. and Lloyd, K., 2014.
Suicide Information Database-Cymru: a protocol for a population-based, routinely collected data
linkage study to explore risks and patterns of healthcare contact prior to suicide to identify
opportunities for intervention. BMJ open, 4(11), p.e006780.
Kim, Y.H., Han, K., Son, J.W., Lee, S.S., Oh, S.W., Kwon, H.S., Shin, S.A., Kim, Y.Y., Lee,
W.Y. and Yoo, S.J., 2017. Data analytic process of a nationwide population-based study on
obesity using the national health information database presented by the national health insurance
service 2006-2015. Journal of Obesity & Metabolic Syndrome, 26(1), pp.23-27.
Legislation.gov.au., 2018. Healthcare Identifiers Act 2010. [online] Available at:
https://www.legislation.gov.au/Details/C2017C00239 [Accessed 6 Oct. 2018].
Legislation.gov.au., 2018. My Health Records Act 2012. [online] Available at:
https://www.legislation.gov.au/Details/C2017C00313 [Accessed 6 Oct. 2018].
Myhealthrecord.gov.au., 2018. My Health Record. [online] Available at:
http://www.myhealthrecord.gov.au/ [Accessed 6 Oct. 2018].
Myhealthrecord.gov.au., 2018. My Health Record. Privacy Policy. [online] Available at:
https://www.myhealthrecord.gov.au/about/privacy-policy [Accessed 6 Oct. 2018].
Van Cauteren, D., Millon, L., De Valk, H. and Grenouillet, F., 2016. Retrospective study of
human cystic echinococcosis over the past decade in France, using a nationwide hospital medical
information database. Parasitology research, 115(11), pp.4261-4265.
Watanabe, K., Ricoh Co Ltd, 2015. Data management for hospital form auto filling system. U.S.
Patent Application 14/194,365.
Zingg, W., Holmes, A., Dettenkofer, M., Goetting, T., Secci, F., Clack, L., Allegranzi, B.,
Magiorakos, A.P. and Pittet, D., 2015. Hospital organisation, management, and structure for
prevention of health-care-associated infection: a systematic review and expert consensus. The
Lancet Infectious Diseases, 15(2), pp.212-224.