ICT706 Data Analytics: Applying Data Mining Techniques to Business
Added on 2023/06/11
AI Summary
In this report the author takes the role of a data scientist addressing business problems in an e-commerce company. It details the research methodology, analytical findings, and recommendations to improve profitability, which has decreased due to increased competition and changing customer expectations. The analysis includes profit analysis, descriptive statistics for customer numbers, t-tests, ANOVA, regression, and correlation analysis. The report also includes a section on implementation and concludes with insights on developing predictive models for monthly sales using methods such as linear regression and decision trees. Python code snippets are included for data import, t-tests, ANOVA, and linear regression.

2018
748944 –DATA ANALYTICS

Executive summary
In this scenario we are working as data scientists in a large e-commerce
company. Its main products are the latest gadgets, books, toys, household
items, stationery, home appliances and clothes.
The company's profitability has decreased because of increased
competition and changing customer expectations and mindsets.
The offers made by other companies are also very attractive. Sathiyavathi.R (2015).
The company has a well-qualified board of directors, and they believe in
the power of data analytics to solve problems and to support important
business decisions and tasks. Belarc.com. (2018)
Because of our data analytics skills and data science capability, the company's
representatives have given us a special task: to analyse the company's
sales data for each product segment and each geographic region.
Lakshadipathi t and Kumar raj t (2016)

TABLE OF CONTENTS
1. LIST OF ABBREVIATIONS AND ASSUMPTIONS MADE
2. INTRODUCTION
3. RESEARCH METHODOLOGY
4. RECOMMENDATIONS TO THE COMPANY
5. ANALYTICAL FINDINGS
5.1. PROFIT ANALYSIS
5.2. DESCRIPTIVE STATISTICS FOR NUMBER OF CUSTOMERS
5.3. TWO SAMPLE T-TESTS
5.4. REGRESSION ANALYSIS
5.5. CORRELATION ANALYSIS
5.6. ONE WAY ANOVA
6. IMPLEMENTATIONS
7. CONCLUSION
8. LIST OF REFERENCES
9. APPENDIX

1. List of Abbreviations and assumptions made
1. CSV - comma-separated values
2. COD - cash on delivery
3. DCP - debit card payment
4. CCP - credit card payment
5. SC - sub-category
6. No. of - number of
7. CN - customer name
8. CT - customer type

2. Introduction
As data scientists in an e-commerce company, we need to know
everything about the company's main product lines, such as the
latest gadgets, books, toys, household items and clothes. The
company is a market leader with a range of product segments. As data
analysts we need to cover the research methodology, the analytical
findings, the recommendations to the company and the implementation.
3. Research Methodology
There are differences between qualitative information and
quantitative information. In qualitative studies that use interviews,
tests and similar methods, the data analysis involves
identifying common patterns and other themes within the
responses and critically examining them with the research
aims and objectives in mind. Knight, S. and
Littleton, K. (2015).
After the research we need to prepare the data sets for the selected records,
store them in an Excel sheet and analyse the sheet
whenever we want to use the records.
Hussain, A. and Roy, A. (2016).
We need to research the following so that we can conclude which
products are not selling at all: the name of the product, its price
and its shipping type. We also need to look at monthly sales and
the number of customers who bought the product, and check whether
those customers are new customers or old and regular customers.
Zhang, H., Wang, H., Li, J. and GAO, H. (2018).
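The checks listed above can be sketched in Python. This is only an illustration under stated assumptions: the column names (product, month, sales, customer_type) and the sample rows are hypothetical and are not taken from the company's data, and each row is treated as one purchase record.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample records; in practice these would come from the CSV or Excel export.
SAMPLE_CSV = """product,month,sales,customer_type
Toys,2018-01,1200,new
Toys,2018-02,900,regular
Books,2018-01,0,regular
Books,2018-02,0,new
Clothes,2018-01,1500,regular
"""

def summarise(csv_text):
    """Return total sales per product and number of purchase records per customer type."""
    sales_by_product = defaultdict(float)
    purchases_by_type = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        sales_by_product[row["product"]] += float(row["sales"])
        purchases_by_type[row["customer_type"]] += 1
    return dict(sales_by_product), dict(purchases_by_type)

sales_by_product, purchases_by_type = summarise(SAMPLE_CSV)

# Products whose total sales are zero are candidates for "not selling at all".
not_selling = [name for name, total in sales_by_product.items() if total == 0]

print(sales_by_product)
print(purchases_by_type)
print(not_selling)
```

With real data the same summary could be read from a file instead of the in-memory string used here.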
4. Recommendations to the company:

The immense data stores that organizations collect are
continuing to grow in volume and variety.
The insights gained from Big Data enable
organizations to be driven by business intelligence and
understanding, which are themselves driven by complex
metrics and analysis. Properly harnessed
data can give insights into an organization's
marketing, product and service roadmaps, and reputation
management.
Lakshadipathi t and Kumar raj t (2016)
Data science is here to stay as an important part of the
Big Data toolset. Data continues to flow into collection
systems from mobile phones, social networks,
online trackers, e-commerce sites, customer surveys, and any
other source that can be tapped for input with potential
value for an organization. Many algorithms are
available for data science.
Some algorithms were created to address business problems. Others
were created to augment algorithms already used for other purposes,
or to make them behave slightly differently, in order to tune them to a
business scenario. These algorithms can be used, for example, to
remind customers of an event, or to target likely credit card
applicants. Although one algorithm may be clearly better suited to a
particular purpose than another, it is sometimes very useful to
try more than one. Doing so allows comparisons and often
turns up unexpected results that can tell you more than you
expected about your product or your customers. Elder, J. (2015).
5. Analytical Findings

Data analytics is a technique used to provide business
insight and to support the development of the company's products and processes.
In analytics, the data is examined to understand its content and
structure and to analyse the demand for a product.
Samoilova, Keusch and Wolbring (2017).
It is mainly used in business to provide data to
customers and to retain that data. Analytics is also used to obtain
data and to verify it for security issues.
Ore, E. and Slalom, J. (2015).
Although there are many algorithms, three techniques,
classification, regression and similarity matching, are the basic
principles on which many of the algorithms used in
data science depend.
In data science, various algorithms based on statistical models are
available for data scientists to build analytical platforms. Which
algorithm is chosen depends on the goals that have been established
beforehand, just as a statistician picks the appropriate statistical model
for the problem to be solved.
Organizations increasingly rely on the analysis of their
data to predict consumer response and recommend products to their customers.
However, to analyse such huge amounts of data, the
solution clearly must be computation driven.

5.1 Profit analysis:
Profit analysis is an accounting technique used as a basic
tool for guidance and for short-run decisions. It extends the
use of the information that the analysis provides. An important part of the
analysis is the break-even point, where total income and annual expenditure are
equal. At this point the company or organization makes neither a profit
nor a loss.
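The point at which income and expenditure are equal can be illustrated with a short calculation. The sketch below assumes a simple cost model (a fixed cost plus a constant variable cost per unit), which the report does not state; the function names and all figures are hypothetical.

```python
def profit(units, price, variable_cost, fixed_cost):
    """Profit for a period: total income minus total expenditure."""
    income = units * price
    expenditure = fixed_cost + units * variable_cost
    return income - expenditure

def break_even_units(price, variable_cost, fixed_cost):
    """Units at which income equals expenditure, so profit is zero."""
    margin = price - variable_cost
    if margin <= 0:
        raise ValueError("price must exceed the variable cost per unit")
    return fixed_cost / margin

# Hypothetical figures, for illustration only.
units_needed = break_even_units(price=50.0, variable_cost=30.0, fixed_cost=10000.0)
print(units_needed)                                # 500.0
print(profit(units_needed, 50.0, 30.0, 10000.0))   # 0.0
```

Selling more than the break-even quantity gives a profit under this model, and selling fewer gives a loss.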
5.2 Descriptive statistics for number of customers
Sports are a natural avenue for learning about data
analysis, since they are so data oriented. Player performance is
measured, and then dissected and discussed. Taking
statistics and research methods courses is a start. It is also
important to be a consumer of statistical analysis. Find a few topics
or issues that interest you and read whatever analytical work you can
find.
In any case, there is probably no limit to the range of topics on
which you can find solid and engaging analytical work.
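For the customer numbers named in this section's heading, the usual descriptive statistics can be computed with Python's standard statistics module. The monthly counts below are hypothetical and stand in for the company's data, which is not reproduced here.

```python
import statistics

# Hypothetical monthly customer counts, for illustration only.
customers = [120, 135, 150, 110, 160, 145]

summary = {
    "count": len(customers),
    "mean": statistics.mean(customers),
    "median": statistics.median(customers),
    "stdev": statistics.stdev(customers),  # sample standard deviation
    "min": min(customers),
    "max": max(customers),
}

for name, value in summary.items():
    print(name, value)
```

The sample standard deviation is used because the months are treated as a sample rather than the whole population; that choice is an assumption.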
5.3 Two-sample t-test:
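A two-sample t-test compares the means of two independent samples. The report does not state which variant was used, so the sketch below assumes Welch's unequal-variance form and computes only the t statistic and the approximate degrees of freedom; the region names and figures are hypothetical.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    se_a = statistics.variance(sample_a) / n_a  # sample variance over sample size
    se_b = statistics.variance(sample_b) / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical monthly sales for two regions.
region_a = [10, 12, 14, 16, 18]
region_b = [8, 10, 12, 14, 16]

t_stat, dof = welch_t(region_a, region_b)
print(t_stat, dof)
```

To obtain a p-value, the statistic is compared with the t distribution for those degrees of freedom; a library routine such as scipy.stats.ttest_ind can do this if SciPy is available.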

5.4 Regression analysis
A model of the relationship is hypothesized, and its parameters are
estimated to give a regression equation. Regression analysis involves
identifying the relationship between a dependent variable and one or
more independent variables.
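For a single independent variable, the regression equation can be estimated by ordinary least squares. The following is a minimal sketch; the variable names and data (advertising spend and monthly sales) are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: advertising spend (x) and monthly sales (y).
spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]

slope, intercept = fit_line(spend, sales)
print(slope, intercept)            # 2.0 1.0

# Use the fitted equation to predict sales for a spend of 6.
predicted = intercept + slope * 6
print(predicted)                   # 13.0
```

With more than one independent variable the same idea applies, but the estimates are usually obtained with a statistics library rather than by hand.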
5.5 Correlation analysis:
Correlation analysis examines the relationships among variables. The correlation
coefficient is a measure of the linear association between two variables.
It lies between -1 and +1.
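The coefficient described above is commonly computed as Pearson's r, which is an assumption here because the report does not name the coefficient. The sketch below shows the two extreme values using hypothetical data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: a perfectly increasing and a perfectly decreasing relationship.
price = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]
falling = [10, 8, 6, 4, 2]

r_pos = pearson_r(price, rising)
r_neg = pearson_r(price, falling)
print(r_pos, r_neg)                # 1.0 -1.0
```

Values near zero indicate little linear association, although the variables may still be related in a non-linear way.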
5.6 One way ANOVA:
One-way ANOVA (analysis of variance) is used to determine
whether there are any significant differences between the means of two
or more independent groups.
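The test statistic for a one-way ANOVA is the ratio of the variation between group means to the variation within groups. The sketch below computes that F statistic by hand; the three groups are hypothetical, for example sales in three regions.

```python
def one_way_anova_f(groups):
    """Return the F statistic and the between/within degrees of freedom."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical observations for three independent groups.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]

f_stat, df_between, df_within = one_way_anova_f(groups)
print(f_stat, df_between, df_within)   # 3.0 2 6
```

A p-value is obtained by comparing the statistic with the F distribution for those degrees of freedom; scipy.stats.f_oneway performs the whole test if SciPy is available.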

6. Implementation:
Screenshots:

7. Conclusion
At the end of the study we conclude that developing a
predictive model to predict monthly sales for a specific
region is possible using methods such as linear regression, naive
Bayes and decision trees.
Singh, A. (2018).
The study also invites a comparative analysis of all the methods in the
research developed by the data analyst, justifying the approach and
the findings of the research.
Cormack, A. (2016).
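As an illustration of the decision tree approach named above, the sketch below fits the simplest possible regression tree, a single split (a decision stump), to monthly sales. It is not the report's model; the function names and the data are hypothetical.

```python
def fit_stump(x, y):
    """Fit a one-split regression tree that minimises the squared error."""
    best = None
    values = sorted(set(x))
    # Candidate thresholds are the midpoints between neighbouring x values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((v - left_mean) ** 2 for v in left)
               + sum((v - right_mean) ** 2 for v in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean

def predict_stump(model, xi):
    """Predict with the mean of the side of the split that xi falls on."""
    threshold, left_mean, right_mean = model
    return left_mean if xi <= threshold else right_mean

# Hypothetical monthly sales with a step change after month 3.
months = [1, 2, 3, 4, 5, 6]
sales = [10, 10, 10, 20, 20, 20]

model = fit_stump(months, sales)
print(model)                       # (3.5, 10.0, 20.0)
print(predict_stump(model, 5))     # 20.0
```

A full decision tree repeats this split on each side; comparing its errors with those of a linear regression on the same data is one way to carry out the comparative analysis mentioned in the conclusion.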