IT Infrastructure Management PG: Data Mining Report and Analysis

Verified

Added on 2020/04/01

AI Summary

This report analyzes data mining as a crucial component of database management, emphasizing its role in knowledge management and decision-making. It reviews data mining processes, including data definition, exploration, preparation, modeling, evaluation, and deployment. The report highlights the functionalities of data mining, specifically classification, clustering, and association analysis, with examples of algorithms such as decision tree induction, KNN, and partitioning algorithms. Applications of data mining in business, healthcare, and general industries are also explored. The report provides a SWOT analysis of a Real Estate organization, including recommendations for short, mid, and long-term improvements in their ICT infrastructure. This report is a valuable resource for understanding the practical application of data mining principles.

Running head: IT INFRASTRUCTURE MANAGEMENT PG
Assessment 3
[Student Name Here]
[Institution’s Name Here]
[Professor’s Name Here]
[Date Here]

Paraphrase This Document

Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser

IT INFRASTRUCTURE MANAGEMENT PG 2
Task A
SWOT Template
Organisation: Real Estate Date: 29/09/2017
Description of current/new ICT service:
Real Estate, offers realtor services to customers through its online system can access various real estate properties
available around in the world. Its operations facilitate prospective property buyers, sellers and investors in their
activities. In the current system, the user (customers) can select from a wide range of services including buying, renting,
investing and selling. Furthermore, for the first time customers and sellers, the organization will act as a link to service
agent based on their location and budget requirements. Finally, Real Estate system is based on an online infrastructure
that keeps users up to date with real estate matter regardless of their locations (RealEstate, 2017).
STRENGTHS
 Extensive and efficient customer services
based on personalized portals.
 Mobility and flexibility of services.
 The company’s system is real time.
 There is a wide range of services.
 Real time user support.
WEAKNESSES
 Minimal operational standards.
 An international system is used thus has conflicting
legal frameworks.
 Minimal to zero authentication functions
 Limited staff training.
 Minimal employee engagement functions.
OPPORTUNITIES
 Operates in a digital market which is an
extensive environment.
 Virtualization of services.
 Cloud computing initiative to increase service
delivery.
 Market advancements and evolution towards
the digital platform.
 Social media advertisement
 Extensive resource collaboration
THREATS
 Cyber intrusions.
 Stiff competition by other companies.
 Market fluctuations.
 Data security and privacy.
 Variations in the international legal framework.
Summary and Recommendations:
Both the opportunities and threats are external variables that the company cannot change. It should focus on its strengths
and weaknesses to increase its customer base. For instance, it should engage all digital platforms to sensitize the users of
its services. Furthermore, it should incorporate cloud resources across all existing ICT infrastructures in order to lower
the cost of operation. Finally, it must engage its employees to enhance their service delivery outcomes (Pristina, 2011).
Short Term (Now)
 First, increase its system's availability by employing social media in its marketing campaigns.
 Furthermore, engage the customers on a one on one basis through the personalized portals.
 Deliver personalized services based on the user’s preferences.
 Employ promotion tactics such as bonuses and purchase incentives e.g. offer coupons for first-time subscribers

IT INFRASTRUCTURE MANAGEMENT PG 3
that provide free real estate advice (SEE, 2014).
Mid Term (next 12 months)
 The first step is to create a strategic plan to maximize on the company’s strengths.
 Secondly, diversify the services by engaging in new business venture related to the existing services.
 Employ ICT resources that maximize efficiency at minimal cost e.g. cloud resources and ERP (enterprise
resource planning) for in-house activities.
 Regularly update the user’s system to improve the user’s interaction with the online facilities.
 Finally, engage with other service providers with similar objectives i.e. limit the vendor lock-in.
Long Term (next 3 to 5 years)
 Establish a strategy team and through it develop a dynamic governance plan for the ICT services.
 Implement long-term awareness programs for the employees to engage their understanding and to motivate
their career choices.
 Outline regular financial assessments (audits) in an attempt to reduce the overall turnover rates of the staff.
 Finally, invest in cloud resources (infrastructure) to host resources for other prospective real estate companies.

IT INFRASTRUCTURE MANAGEMENT PG 4
Part B:
Abstract
Data mining is the most significant component of database management since it enables the
summarization of data which yield conclusive results. It, therefore, forms a critical element of
knowledge management, a field that analyses data to aid decision-making procedures.
Furthermore, with the prevalence of information technology, data mining has gained
unprecedented application as organizations around the world seek to optimize their operations
outcomes. Moreover, this application has facilitated a lot of research into the area in an attempt
to improve the existing data mining infrastructures. Similarly, this paper reviews data mining as
a management technology where the different elements that support its functionalities are given.
In essence, the report highlights the procedures, categories, algorithms and applications of the
technology in today’s digital systems.
Keywords: Knowledge discovery (KD), Knowledge management, algorithm

Paraphrase This Document

Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser

IT INFRASTRUCTURE MANAGEMENT PG 5
Table of Contents
Introduction.............................................................................................................6
Typical data mining process................................................................................6
i. Data definition (Identification of the problem).............................................7
ii. Data exploration (Metadata)........................................................................7
iii. Data preparation..........................................................................................8
iv. Modeling (Algorithms)...............................................................................8
v. Evaluation....................................................................................................8
vi. Deployment.................................................................................................8
Data mining functionalities....................................................................................8
Classification.......................................................................................................9
Clustering............................................................................................................9
Association analysis..........................................................................................10
Application of data mining...................................................................................11
1. Business i.e. e-commerce...........................................................................11
2. Healthcare industry.......................................................................................11
3. General industries..........................................................................................12
Conclusion..............................................................................................................12

IT INFRASTRUCTURE MANAGEMENT PG 6
Introduction
A lot of information is generated today owing to the existence and expansion of the digital media
and information technology in general. In addition to this, the seamless integration of
communication networks has also increased the availability of information which has improved
the users’ access and collaboration. Now, although these outcomes have enhanced the
functionalities of technology, they have also led to massive data structures that have made the
analysis processes a challenge. In all, the databases that exist today are overwhelmed with data
an outcome that makes it difficult to distinguish between quality and non-quality information
(Silwattananusarn & Tuamsuk, 2012).
While there is no single accepted definition of data mining, its operational concepts can provide
an important account of its functionalities. In all, data mining is an essential element of
knowledge discovery (KD) which is a general data concept that processes and produces key
patterns of information based on the needs of the users. KD is an affiliated process of database
systems which facilitates the identification of data according to the parameters of the users.
Therefore, while they may seem similar knowledge discovery and data mining are two different
database concepts. In fact, KD can be highlighted as the overall process of identifying data
useful to a particular application. On the other hand, data mining is a subsidiary item of KD that
provides useful patterns from large volumes of data while focusing on specific database
algorithms (Han & Kamber, 2000).
Typical data mining process
Regardless of the application, data mining usually holds the same objective of developing an
effective and predictive model from large sources of data. These models improve the explanation
and generalization of data by identifying crucial defining elements (CRISP-DM, 2017).

IT INFRASTRUCTURE MANAGEMENT PG 7
Therefore, in its operations, data mining will take a new set of data from a knowledge base e.g. a
database or a data warehouse and define crucial patterns that will represent it as a whole
structure. In all, the following elements describe its functionalities:
 Data definition
 Data exploration
 Data preparation
 System modelling
 Evaluation
 Deployment
i. Data definition (Identification of the problem)
In the first step, the users must determine their objectives even before they handle the data in
itself. Therefore, in this stage, the business, users or organizations will analyse the problem they
face with regards to the available data. For instance, in an enterprise, the management may seek
to increase their customer base by understanding the users’ requirements and preferences. By
defining this objective, the data mining experts will have the isolation elements for the KD
process (IBM, 2017).
ii. Data exploration (Metadata)
Metadata, a segment of information that defines and characterizes another set of information.
Now, this stage explores the metadata where it’s understood to give the necessary background
information on the available data. Furthermore, it is at this stage that the data is collected, tagged
and analyzed. Moreover, the data is also explored to define its problems e.g. quality, availability
or even biased behaviour (Jackson, 2002).

Paraphrase This Document

Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser

IT INFRASTRUCTURE MANAGEMENT PG 8
iii. Data preparation
A ‘cleansing’ process is executed where the data is transformed into a certain model as data
mining algorithms will only accept it in certain formats. In addition to this cleansing, the process
also derives new data attributes to aid the analysis process e.g. a data average is given.
iv. Modeling (Algorithms)
This stage is the cornerstone of data mining process because it is at this stage that the various
functions and algorithms are used to develop the final data model. Now, this stage regularly
consults the data preparation stage to ensure the objectives are fully met. Furthermore, it is also
usually coupled with the evaluation stage to optimize the results of the model (IBM, 2017).
v. Evaluation
An assessment step that cross-examines the final model with the initial objectives. Now, if the
objectives are not satisfied the process is reverted to the modelling phase. Therefore, the
following questions are asked:
 Has the model achieved the business objectives?
 Have all the issues been addressed?
vi. Deployment
The final stage, where the final results are exported into database structures for presentation, this
can include spreadsheets, graphs and pictorials among other visual displays. Now, remember,
this process can be continuously repeated to optimize results, an outcome that facilitates the use
of iterative procedures to perfect the results.
Data mining functionalities
So, now that the general process of data mining is given the users must also evaluate the methods
to use in achieving this procedure. What does this mean? In essence, the data mining process

IT INFRASTRUCTURE MANAGEMENT PG 9
may follow a number of methods or techniques to achieve its results and the variation in these
methods highlight the functionalities of the process as they define the general operations. In this
section, the paper analyses the three most significant data mining functionalities; classification,
clustering and association analysis (Chen, Deng, & Wan, 2015).
Classification
Similar to the true meaning of the name, the classification function assigns data objects or items
to different classes for the purpose of developing predictive categories of objects having
unknown class indicators. Therefore, every data object in a given database is assigned to a given
category so as to identify its target class. An example of this function is a banking setup where
loan applicants are classified as either low or high credit risk individuals (Shodhganga, 2012).
Several algorithms implement this function in relation to given class objects, they are:
a. Decision tree induction – all the elements of the data are classified using a tree-like
structure that has various nodes of operation. These nodes will be given as rectangles and ovals
for the internal elements and the leaf nodes respectively. In all, the tree sequentially flows into
different levels expressing the data’s attributes.
b. KNN (K-nearest neighbour) – the conventional nearest neighbour algorithm (NNA) is used
where the classification aims to find the next nearest point in a given set of data.
c. Support vector machine – this algorithm uses the statistical learning principles to analyze
and identify patterns in a set of data. Moreover, a binary data classifier is used to map the
elements of the data in a multilevel dimensional space (Vozinika & Viana, 2004).
Clustering
In this function, the data mining process does not consider the data classes or objects available,
instead, the data is split into different groups of items having the similar patterns. Therefore, the

IT INFRASTRUCTURE MANAGEMENT PG 10
different groups will hold different patterns but with the internal elements having common
operational patterns. Although the method is confusing at first instance, it does makes a lot of
sense when analyzed with an example. Consider the example of search engines which categories
data based on their features (patterns) (STEFANOWSKI, 2009). Data is presented to the user as
either videos, reviews or audio among many other groups. Algorithms:
a. Partitioning – this algorithm clusters data using repetitive procedures that relocate data
points of subset information. Furthermore, the algorithm will also locate areas with a heavy
population of data sets in order to cluster them based on their defining attributes.
b. Hierarchical algorithm – a sequential flow of events is exhibited where data items are
combined into different subgroups. These subgroups then merge to form an even larger group, a
functionality that continuously grows into different levels (Ayr¨am¨o & K¨arkk¨ainen, 2016).
Association analysis
Also known as association rule mining, this function analyses data to identify the rules of
operation which are then used to highlight the attributes of the functional data. Furthermore, to
improve its functionalities, the algorithm will focus on the most recurring attributes to form the
best patterns that will yield the qualitative results for decision-making (Kumar, 2014). Its
algorithms are:
a. Pattern growth – data is sequentially analyzed to develop the attributes of the data items that
facilitate their collaboration. In itself, this algorithm is very complex but will work efficiently
with large volumes of data.

Paraphrase This Document

Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser

IT INFRASTRUCTURE MANAGEMENT PG 11
b. Parallel algorithm – this algorithm will follow a certain logical flow of events which will
facilitate it in identifying certain patterns that occur repeatedly together. These patterns will then
yield the final data elements that categories the set of data (Tudor, 2008).
Application of data mining
Information technology and its affiliated systems are not limited to any industries or
functionality, this outcome increases their overall application. Similarly, data mining is widely
used in many industries where information serves as the main element for decision making. In
essence, the resources given by the technology are the defining factors of management as
organizations seek to expand and optimise their operations (Silwattananusarn & Tuamsuk,
2012). Nevertheless, let’s highlight some of the key applications of the technology.
1. Business i.e. e-commerce
Online stores have shown dominance in the past few years owing to their availability and
accessibility. These stores are further supplemented by the financial services that have shifted to
the digital medium including functionalities such as mobile banking. Now, E-commerce depends
on the flow of information to disseminate services and resources to consumers through the
internet. This information will include marketing ventures, financial transactions and user
preferences among many other items. Data mining will facilitate these online businesses in their
operations by enabling them to identify specific patterns that aid their success. For instance, an
online retail store will identify the best-selling items after analyzing the data available on their
servers. Furthermore, they are also able to understand their customer preferences after
scrutinizing the data found on their social media platforms (Matillion, 2017).

IT INFRASTRUCTURE MANAGEMENT PG 12
2. Healthcare industry
A key public sector that uses a lot of information because of the number of users. Now,
information systems support the modern healthcare industry where patients and staff records are
electronically managed. Furthermore, the same resources (data) are accessed by a variety of
users from different locations in order to organize their health procedures. Therefore, the
healthcare systems are regularly overloaded with information as many users update their records.
Moreover, a lot of heterogeneous data is made available to different organizations having items
such as payments, names, prescriptions and practitioner notes among many others. Now, data
mining facilitates the classification, analysis and clustering of this information, functions that
improve the quality of the quantitative records. These functionalities also improve the outcomes
of healthcare practices including reducing the resource wastage (Silwattananusarn & Tuamsuk,
2012).
3. General industries
Finally, consider the application of data mining in all the other industries in general where
regardless of the functionalities the aims are usually same i.e. to improve the outcomes of service
delivery while maintaining minimal operational costs. Starting with the telecommunication
industry where ICT functionalities and services enhance communication. These functionalities
generate a lot of information in an attempt to meet the users’ demands (Zentut, 2017). Again,
data mining facilitates these operations by sieving and analyzing data to yield conclusive results
that aid decisions. Similar functions are seen in other service industries such as banking, retail
and insurance. Moreover, the application of data mining also extends to government operations
where citizen’s records are analyzed for governance purposes.