Explanations and Recommendations for the Wine Cellar Database System
VerifiedAdded on 2019/09/22
|22
|6749
|126
Report
AI Summary
This report, prepared for The Wine Cellar, a company selling wine and accessories, examines database technologies to address their evolving needs. The report begins by exploring the limitations of relational databases in handling the company's expansion and the need for company-wide reporting. It then delves into the concepts of data warehouses (OLAP) and their role in supporting decision-making, contrasting them with Online Transaction Processing (OLTP) systems. The report also discusses the design and implementation considerations of a data warehouse. Furthermore, the report explores distributed databases, assessing their relevance to The Wine Cellar's organizational structure, and recommends a suitable type of distributed database with justifications. Finally, the report examines how relational databases support object-oriented development and their relevance to multimedia requirements for product promotion via videos. The report provides a detailed analysis of various database technologies and their applications, offering practical recommendations for The Wine Cellar's IT infrastructure.

<University>
EXPLANATIONS AND RECOMMENDATIONS FOR
THE WINE CELLAR DATABASE
by
Student Number:
Module Name:
Submission Deadline:
15th March 2017
<Lecturer’s Name and Course Number>
<Your Name> <Your Student Number> 2017 1 of 22
EXPLANATIONS AND RECOMMENDATIONS FOR
THE WINE CELLAR DATABASE
by
Student Number:
Module Name:
Submission Deadline:
15th March 2017
<Lecturer’s Name and Course Number>
<Your Name> <Your Student Number> 2017 1 of 22
Paraphrase This Document
Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser

Table of Contents
Explanations and Recommendations for The Wine Cellar Database.............................5
Task 1.............................................................................................................................6
Limitations of Relational Databases for The Wine Cellar Scenario..........................6
Data Warehouse and Their Support in Decision-Making..........................................8
Issues in Designing and Implementing a Data Warehouse........................................9
Data Structure Model for The Wine Cellar Scenario...............................................10
Task 2...........................................................................................................................12
Distributed Databases and Potential Issues Before Final Design Decisions............12
Relevance of Distributed Database for The Wine Cellar Organization and
Costs/Benefits...........................................................................................................14
Recommendation of Type of Distributed Database and Justifications....................15
Task 3...........................................................................................................................15
How Relational Databases Support Object Oriented Development?.......................15
Relational Databases Relevance for Multimedia Requirements in The Wine Cellar
Scenario....................................................................................................................16
Conclusion....................................................................................................................17
References....................................................................................................................19
<Your Name> <Your Student Number> 2017 2 of 22
Explanations and Recommendations for The Wine Cellar Database.............................5
Task 1.............................................................................................................................6
Limitations of Relational Databases for The Wine Cellar Scenario..........................6
Data Warehouse and Their Support in Decision-Making..........................................8
Issues in Designing and Implementing a Data Warehouse........................................9
Data Structure Model for The Wine Cellar Scenario...............................................10
Task 2...........................................................................................................................12
Distributed Databases and Potential Issues Before Final Design Decisions............12
Relevance of Distributed Database for The Wine Cellar Organization and
Costs/Benefits...........................................................................................................14
Recommendation of Type of Distributed Database and Justifications....................15
Task 3...........................................................................................................................15
How Relational Databases Support Object Oriented Development?.......................15
Relational Databases Relevance for Multimedia Requirements in The Wine Cellar
Scenario....................................................................................................................16
Conclusion....................................................................................................................17
References....................................................................................................................19
<Your Name> <Your Student Number> 2017 2 of 22

Data is the oil that runs the machinery of the present-day society. Data is everywhere,
and new is being generated every moment from many sources. Being able to store and
retrieve data in an efficient way is an essential requirement for the success of any
organisation. Also, different databases are optimised for different roles. Some may be
optimised for read-heavy operations, while others may be optimised for write-heavy
operations. Optimising one parameter entails a drawback in other parameters. Also,
the underlying structure of a database system has a significant bearing on the
applications. This paper deals with the new requirements of a (fictitious) company in
England, which is engaged in sourcing wine and accessories from all over the world
and selling them. The company has experienced growth in the recent years and has
opened branches throughout the country. However, the database handling is still
keeping in mind a single location. The consequence has been the delay and manual
effort required in compiling company-wide reports. In addition to the reporting
requirement, the company also wants to promote selected products via short videos on
a website. This paper details the various database technologies and provides
recommendations for the new scenarios of reporting and multimedia promotions
which have cropped up for The Wine Cellar.
<Your Name> <Your Student Number> 2017 3 of 22
and new is being generated every moment from many sources. Being able to store and
retrieve data in an efficient way is an essential requirement for the success of any
organisation. Also, different databases are optimised for different roles. Some may be
optimised for read-heavy operations, while others may be optimised for write-heavy
operations. Optimising one parameter entails a drawback in other parameters. Also,
the underlying structure of a database system has a significant bearing on the
applications. This paper deals with the new requirements of a (fictitious) company in
England, which is engaged in sourcing wine and accessories from all over the world
and selling them. The company has experienced growth in the recent years and has
opened branches throughout the country. However, the database handling is still
keeping in mind a single location. The consequence has been the delay and manual
effort required in compiling company-wide reports. In addition to the reporting
requirement, the company also wants to promote selected products via short videos on
a website. This paper details the various database technologies and provides
recommendations for the new scenarios of reporting and multimedia promotions
which have cropped up for The Wine Cellar.
<Your Name> <Your Student Number> 2017 3 of 22
⊘ This is a preview!⊘
Do you want full access?
Subscribe today to unlock all pages.

Trusted by 1+ million students worldwide

<Your Name> <Your Student Number> 2017 4 of 22
Paraphrase This Document
Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser

Explanations and Recommendations for The Wine Cellar Database
Data is the new oil (Toonders 2014), and this implies a likewise importance to the
means for managing data. Data needs to be stored, edited, deleted, copied and
retrieved to be of relevance to the success of a business or an organisation. All of
these operations to the data need to be done in a performant way. If data is not
handled efficiently, then instead of becoming a competitive advantage and blooming
to its full potential, data becomes additional work without the benefits (Jones 2016).
The means of managing the data (including the data itself, paradigms, schemas, and
the software used) come under the term of databases (Gupta 2011, p.5). Databases are
thus a solution to our world's requirement for storing significant amounts of data and
retrieving it quickly and accurately (Ward and Dafoulas 2006, p.2). Databases are of
many types and have been evolving since their inception in the 1970s (Ward and
Dafoulas 2006, p.2), and many variants have been developed, each with its
advantages and shortcomings. Some of the types of databases are flat-file, relational,
distributed, NoSQL, and others (Data Warehouses 2012). It is not the case that one
type of a database management system (DBMS) is intrinsically better than another,
but each is suited to some particular scenarios. Since efficiency is an essential
requirement for the database, the scenario at hand needs to be considered before
selecting a particular database.
This paper is about the database requirements of a (fictitious) company that operates
in England. This company, The Wine Cellar is engaged in selling a selection of fine
wine, spirits and accessories. The company arranges wines and accessories to sell
from sellers all over the world. The company has experienced growth in the last few
years with the resulting expansion. The company now operates from many branches
within England. Each branch maintains its private database and submits a summary
file each week to the head office at the end of every week. This submission from silo-
like databases at each outlet does fulfil the requirements for the data at the head
office, but the turnaround time is very high. Currently, the individual files are
analysed individually, and the results are amalgamated. Also, this process is labour-
intensive, both at the branch and the head-office level. Given the growth and
<Your Name> <Your Student Number> 2017 5 of 22
Data is the new oil (Toonders 2014), and this implies a likewise importance to the
means for managing data. Data needs to be stored, edited, deleted, copied and
retrieved to be of relevance to the success of a business or an organisation. All of
these operations to the data need to be done in a performant way. If data is not
handled efficiently, then instead of becoming a competitive advantage and blooming
to its full potential, data becomes additional work without the benefits (Jones 2016).
The means of managing the data (including the data itself, paradigms, schemas, and
the software used) come under the term of databases (Gupta 2011, p.5). Databases are
thus a solution to our world's requirement for storing significant amounts of data and
retrieving it quickly and accurately (Ward and Dafoulas 2006, p.2). Databases are of
many types and have been evolving since their inception in the 1970s (Ward and
Dafoulas 2006, p.2), and many variants have been developed, each with its
advantages and shortcomings. Some of the types of databases are flat-file, relational,
distributed, NoSQL, and others (Data Warehouses 2012). It is not the case that one
type of a database management system (DBMS) is intrinsically better than another,
but each is suited to some particular scenarios. Since efficiency is an essential
requirement for the database, the scenario at hand needs to be considered before
selecting a particular database.
This paper is about the database requirements of a (fictitious) company that operates
in England. This company, The Wine Cellar is engaged in selling a selection of fine
wine, spirits and accessories. The company arranges wines and accessories to sell
from sellers all over the world. The company has experienced growth in the last few
years with the resulting expansion. The company now operates from many branches
within England. Each branch maintains its private database and submits a summary
file each week to the head office at the end of every week. This submission from silo-
like databases at each outlet does fulfil the requirements for the data at the head
office, but the turnaround time is very high. Currently, the individual files are
analysed individually, and the results are amalgamated. Also, this process is labour-
intensive, both at the branch and the head-office level. Given the growth and
<Your Name> <Your Student Number> 2017 5 of 22

expansion of The Wine Cellar and the future potential, timely and accurate data is
critical. The management will be better equipped to make company-wide strategic
decisions if the company-wide information is easily available at short notice. This
agility in data querying is something that the current solution cannot provide. Also,
the company would like to promote selected products by showcasing short videos via
a website. These two requirements will be attended to in the paper, in addition to
providing a theoretical knowhow of the database technologies relevant to the
discussion.
The Information Technology (IT) team of The Wine Cellar is skilled in relational
database design, and this skill-set served the company well until recently. However,
keeping in mind the recent growth and the future possibilities, the IT team will have
to expand their skill set to include some new technologies, or some new hires with the
relevant skills will have to be made. The company is more interested in training and
increasing the skills of the current team. This paper will look into these new
technologies and will suggest ways in which the existing experience of the team can
be utilised while developing new ones.
Task 1
Limitations of Relational Databases for The Wine Cellar Scenario
In the relational database paradigm, data is represented in the form of tables, and
relations among the tables help enforce data consistency (Rouse 2006). In the process
of developing relational database schema for a scenario, the entities to be represented
in the database are identified first. Then, the parameters of these entities are
recognised and finally the relations among them, so that there is no duplication of data
and any information about any entity is linked and can be traced. Such a paradigm
allows for naturally enforcing constraints. This system was proposed in the 70s by
E.F. Codd (Codd 1982, pp.109-117). Relational Database is implemented via tables,
columns, rows, primary keys, and foreign keys. In other words, tables represent
entities (e.g. products, customers), columns represent parameters about the entities
(e.g. product name, manufacturing date, volume). The rows represent one particular
<Your Name> <Your Student Number> 2017 6 of 22
critical. The management will be better equipped to make company-wide strategic
decisions if the company-wide information is easily available at short notice. This
agility in data querying is something that the current solution cannot provide. Also,
the company would like to promote selected products by showcasing short videos via
a website. These two requirements will be attended to in the paper, in addition to
providing a theoretical knowhow of the database technologies relevant to the
discussion.
The Information Technology (IT) team of The Wine Cellar is skilled in relational
database design, and this skill-set served the company well until recently. However,
keeping in mind the recent growth and the future possibilities, the IT team will have
to expand their skill set to include some new technologies, or some new hires with the
relevant skills will have to be made. The company is more interested in training and
increasing the skills of the current team. This paper will look into these new
technologies and will suggest ways in which the existing experience of the team can
be utilised while developing new ones.
Task 1
Limitations of Relational Databases for The Wine Cellar Scenario
In the relational database paradigm, data is represented in the form of tables, and
relations among the tables help enforce data consistency (Rouse 2006). In the process
of developing relational database schema for a scenario, the entities to be represented
in the database are identified first. Then, the parameters of these entities are
recognised and finally the relations among them, so that there is no duplication of data
and any information about any entity is linked and can be traced. Such a paradigm
allows for naturally enforcing constraints. This system was proposed in the 70s by
E.F. Codd (Codd 1982, pp.109-117). Relational Database is implemented via tables,
columns, rows, primary keys, and foreign keys. In other words, tables represent
entities (e.g. products, customers), columns represent parameters about the entities
(e.g. product name, manufacturing date, volume). The rows represent one particular
<Your Name> <Your Student Number> 2017 6 of 22
⊘ This is a preview!⊘
Do you want full access?
Subscribe today to unlock all pages.

Trusted by 1+ million students worldwide

instance of the entity (e.g. all the details of a single person), a primary key uniquely
identifies each row in a table (e.g. social security number), and foreign keys map one
row to another to implement relations.
The key point to note from the preceding discussion is the foundation of the paradigm
is relations, and indeed the relational database management system (RDBMS) are
based on mathematical theories (Duffy 2011). Thus these systems are good at
enforcing data consistency, both at the individual column level and in between tables.
However, such systems are traditionally centralised, and this has both advantages and
disadvantages. Till a few years, the centralised system was serving The Wine Cellar
well, but the geographical distribution of the sources of data to different outlets call
for another solution. By centralisation, we mean that the entire database system - the
software to manage the data, the computer hardware, and the data are located on a
single machine in a single location. This type of a concentrated architecture has some
advantages and disadvantages. As we will see in a moment, the centralised systems
serve the outlets of The Wine Cellar well but fail when it comes to company-wide
manoeuvring.
The current scenario of the company is that the outlets manage their data in a
centralised relational database system e.g. inventory, sales, tax, customer details and
other data. This centralisation serves them well as a centralised system has the
advantages of being faster, cheaper, and being easier to manage. However, this makes
interaction, specifically programmatic interaction with the head office a difficult
thing. The senior staff at the head office would be interested in knowing the up-to-
minute performance of their various branches and make strategic and tactical
decisions accordingly. However, as confirmed in the briefing for this paper, the
outlets submit a file each week which has to be manually analysed and results
compiled for the perusal of the senior staff. This process is cumbersome, slow and
prone to errors. In a nutshell, the centralisation of traditional relational database
systems and the consequent inability to allow access to multiple (physically separated)
parties is a limitation in the context of The Wine Cellar's growth. This situation is a
<Your Name> <Your Student Number> 2017 7 of 22
identifies each row in a table (e.g. social security number), and foreign keys map one
row to another to implement relations.
The key point to note from the preceding discussion is the foundation of the paradigm
is relations, and indeed the relational database management system (RDBMS) are
based on mathematical theories (Duffy 2011). Thus these systems are good at
enforcing data consistency, both at the individual column level and in between tables.
However, such systems are traditionally centralised, and this has both advantages and
disadvantages. Till a few years, the centralised system was serving The Wine Cellar
well, but the geographical distribution of the sources of data to different outlets call
for another solution. By centralisation, we mean that the entire database system - the
software to manage the data, the computer hardware, and the data are located on a
single machine in a single location. This type of a concentrated architecture has some
advantages and disadvantages. As we will see in a moment, the centralised systems
serve the outlets of The Wine Cellar well but fail when it comes to company-wide
manoeuvring.
The current scenario of the company is that the outlets manage their data in a
centralised relational database system e.g. inventory, sales, tax, customer details and
other data. This centralisation serves them well as a centralised system has the
advantages of being faster, cheaper, and being easier to manage. However, this makes
interaction, specifically programmatic interaction with the head office a difficult
thing. The senior staff at the head office would be interested in knowing the up-to-
minute performance of their various branches and make strategic and tactical
decisions accordingly. However, as confirmed in the briefing for this paper, the
outlets submit a file each week which has to be manually analysed and results
compiled for the perusal of the senior staff. This process is cumbersome, slow and
prone to errors. In a nutshell, the centralisation of traditional relational database
systems and the consequent inability to allow access to multiple (physically separated)
parties is a limitation in the context of The Wine Cellar's growth. This situation is a
<Your Name> <Your Student Number> 2017 7 of 22
Paraphrase This Document
Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser

typical one in that the databases end up becoming islands of information with even
one department's system not being able to talk to another department (Martin n.d.).
Data Warehouse and Their Support in Decision-Making
We will now explore two types of database systems - OLTP (Online Transaction
Processing) and OLAP (Online Analytical Processing). Imagine a lake in which water
flows into from different streams. After a while of the streams pouring their water into
the lake, the lake begins to fill and will soon overflow. However, before that happens,
the deeper water is taken out and placed in a separate well which has been designed to
preserve this water. Now, in this comparison, the streams are the routine transactions
that are taking place regularly e.g. thousands of bank transactions take place in a
banking system every minute. The lake is the transactional database that handles the
operations and is a reflection of the current state of the organisation (OLTP database).
The movement of old data into a system which is specifically made for maintaining
archives is the OLAP database. OLTP is also referred to as operational data stores.
OLAP is also referred to as a data warehouse. A transactional database by its very
nature does not make itself amenable to analytics (Cardon n.d.) and thus is born the
need for OLAP systems. It must be emphasised that OLTP is not suitable for business
intelligence or analytics (Why Business Intelligence on OLTP is evil n.d.).
Technically, the OLTP has different goals that OLAP. OLTP is designed to support
the running of the organisation and performance is the main criteria. Its main focus is
on updates, which are short but multiple and are triggered by end-users. The queries
are standardised and few in variety. Here, the data consistency is paramount, and
usually high normalisation of data is maintained to enforce consistency and integrity
of data. OLTP is the storehouse of the original data. In contrast, OLAP is an archive
of historical data i.e. transactions that have taken place and are not likely to edited
further. Its focus is on retrieval (OLAP vs OLTP: what makes the difference n.d.).
Also, this consolidated data of OLAP comes from multiple places, and this
multiplicity of the sources often dictates that a separate step be executed (a staging
area) where the data is treated, and the format is modified before being saved in the
<Your Name> <Your Student Number> 2017 8 of 22
one department's system not being able to talk to another department (Martin n.d.).
Data Warehouse and Their Support in Decision-Making
We will now explore two types of database systems - OLTP (Online Transaction
Processing) and OLAP (Online Analytical Processing). Imagine a lake in which water
flows into from different streams. After a while of the streams pouring their water into
the lake, the lake begins to fill and will soon overflow. However, before that happens,
the deeper water is taken out and placed in a separate well which has been designed to
preserve this water. Now, in this comparison, the streams are the routine transactions
that are taking place regularly e.g. thousands of bank transactions take place in a
banking system every minute. The lake is the transactional database that handles the
operations and is a reflection of the current state of the organisation (OLTP database).
The movement of old data into a system which is specifically made for maintaining
archives is the OLAP database. OLTP is also referred to as operational data stores.
OLAP is also referred to as a data warehouse. A transactional database by its very
nature does not make itself amenable to analytics (Cardon n.d.) and thus is born the
need for OLAP systems. It must be emphasised that OLTP is not suitable for business
intelligence or analytics (Why Business Intelligence on OLTP is evil n.d.).
Technically, the OLTP has different goals that OLAP. OLTP is designed to support
the running of the organisation and performance is the main criteria. Its main focus is
on updates, which are short but multiple and are triggered by end-users. The queries
are standardised and few in variety. Here, the data consistency is paramount, and
usually high normalisation of data is maintained to enforce consistency and integrity
of data. OLTP is the storehouse of the original data. In contrast, OLAP is an archive
of historical data i.e. transactions that have taken place and are not likely to edited
further. Its focus is on retrieval (OLAP vs OLTP: what makes the difference n.d.).
Also, this consolidated data of OLAP comes from multiple places, and this
multiplicity of the sources often dictates that a separate step be executed (a staging
area) where the data is treated, and the format is modified before being saved in the
<Your Name> <Your Student Number> 2017 8 of 22

OLAP database. This operation is often referred to as ETL (Extract, Transform,
Load). Since OLAP deals with historical data (which is not expected to change), the
priority of the system and the scope of its service is poles apart from the transactional
one. The priority of an OLAP system is supporting complex and non-typical queries
in an ad-hoc manner, some of which are often long running. Many common and long-
running queries are executed in advance to precompute the reports that may be later
requested. Its scope is to help with planning, problem-solving and supporting
decision-making. Another aspect of OLAP is that data normalisation is dropped in
favour of data redundancy and simpler associations so that complex queries can be
executed faster.
Now, an OLAP or a data warehouse is characterised by being subject-oriented to a
particular metric (e.g. sales), integrated (using multiple sources of data), non-volatile
(data once entered into warehouse is not likely to be changed or deleted), and time
variant (being able to analyze how a metric has changed over time e.g. months or
years) (Lane and Schupmann 2002). This availability of an ocean of data which is
disjoint from the transactional data, coupled with specialised tools, allows managers
to look into the past and ask questions to mine data, extract trends, see how metrics
are moving up or down, and then decide what is in the best interests of their
organisation. Such an arrangement also eases the database managers' responsibilities.
Issues in Designing and Implementing a Data Warehouse
As detailed in the previous section, a data warehouse is an ideal system to help
management look into the past and come up with answers to questions like the change
in a metric over time, hidden trends and other data mining questions. Still, the process
of establishing and maintaining a data warehouse is non-trivial. In this section, we
will talk about the challenges encountered in the implementation of a data warehouse.
One issue is of the data quality. In a data warehouse, the source of data are multiple
and not directly the end-users. Also, to help meet a data warehouse's goal of
uncovering significant trends, we cannot skip on data sources. Now, whenever
different sources are merged, errors happen in data due to the different format, the
<Your Name> <Your Student Number> 2017 9 of 22
Load). Since OLAP deals with historical data (which is not expected to change), the
priority of the system and the scope of its service is poles apart from the transactional
one. The priority of an OLAP system is supporting complex and non-typical queries
in an ad-hoc manner, some of which are often long running. Many common and long-
running queries are executed in advance to precompute the reports that may be later
requested. Its scope is to help with planning, problem-solving and supporting
decision-making. Another aspect of OLAP is that data normalisation is dropped in
favour of data redundancy and simpler associations so that complex queries can be
executed faster.
Now, an OLAP or a data warehouse is characterised by being subject-oriented to a
particular metric (e.g. sales), integrated (using multiple sources of data), non-volatile
(data once entered into warehouse is not likely to be changed or deleted), and time
variant (being able to analyze how a metric has changed over time e.g. months or
years) (Lane and Schupmann 2002). This availability of an ocean of data which is
disjoint from the transactional data, coupled with specialised tools, allows managers
to look into the past and ask questions to mine data, extract trends, see how metrics
are moving up or down, and then decide what is in the best interests of their
organisation. Such an arrangement also eases the database managers' responsibilities.
Issues in Designing and Implementing a Data Warehouse
As detailed in the previous section, a data warehouse is an ideal system to help
management look into the past and come up with answers to questions like the change
in a metric over time, hidden trends and other data mining questions. Still, the process
of establishing and maintaining a data warehouse is non-trivial. In this section, we
will talk about the challenges encountered in the implementation of a data warehouse.
One issue is of the data quality. In a data warehouse, the source of data are multiple
and not directly the end-users. Also, to help meet a data warehouse's goal of
uncovering significant trends, we cannot skip on data sources. Now, whenever
different sources are merged, errors happen in data due to the different format, the
<Your Name> <Your Student Number> 2017 9 of 22
⊘ This is a preview!⊘
Do you want full access?
Subscribe today to unlock all pages.

Trusted by 1+ million students worldwide

level of quality, logic conflicts, missing data (Wentzlaff 2014). The reasons for the
low quality of data at the sources include incorrect information entered in the primary
source, incomplete knowledge of interdependencies among the data sources, inability
to cope with the ageing data, varying timelines of the data sources, and others (Pandey
2014, pp.18-24). Some practitioners suggest that not all low-quality data needs to be
cleaned. Sometimes, acknowledging the bounds of clean data is prudent (McKnight
2007). Another concern is the model to be chosen for the data warehouse. As pointed
out earlier, since the goals of the transactional database and a warehouse are poles
apart, therefore the underlying structure cannot be same. Thus the data models for
transactional are not suited for warehousing environments (Becker 2002, pp.22-76). A
multidimensional data model (MDDM) storing data as facts and dimensions as
opposed to rows and columns of a relational database are more suited to handle the
scope of a data warehouse - ad-hoc and complex queries joining a great many
numbers of entities. The schemas common are star schema and snowflake schema
(Nepomnjashiy 2001). Then, there are some implementation issues which need to be
sorted out, like the amount of data that will be shipped around the network and the
network's capability to cope with it, the amount of disk space required, and its speed.
Also, expenses need to be tailored to the goals in a decision like solid-state drives
(SSDs) or traditional hard disk drives (Whitehorn 2011).
Data Structure Model for The Wine Cellar Scenario
In this section, we will walk through the process of creating a data structure model for
the company. Initially, we will identify the entities involved. Then, we will determine
their parameters. After that, relations among them will be finalised. The company is a
business engaged in selling products. It buys products from sellers and sells them
using its outlets. We will be concentrating on these aspects of the firm. A real
business has many other nuances like the costs and management of shipping the goods
to the outlets, salaries, taxes, etc. We will focus on the problem of report generation
for the management. The company deals in products, which are of different types, and
are sold in the outlets of the company throughout the country. The emphasised words
give us an idea of the entities identified. All of these will form their table. Also, each
<Your Name> <Your Student Number> 2017 10 of 22
low quality of data at the sources include incorrect information entered in the primary
source, incomplete knowledge of interdependencies among the data sources, inability
to cope with the ageing data, varying timelines of the data sources, and others (Pandey
2014, pp.18-24). Some practitioners suggest that not all low-quality data needs to be
cleaned. Sometimes, acknowledging the bounds of clean data is prudent (McKnight
2007). Another concern is the model to be chosen for the data warehouse. As pointed
out earlier, since the goals of the transactional database and a warehouse are poles
apart, therefore the underlying structure cannot be same. Thus the data models for
transactional are not suited for warehousing environments (Becker 2002, pp.22-76). A
multidimensional data model (MDDM) storing data as facts and dimensions as
opposed to rows and columns of a relational database are more suited to handle the
scope of a data warehouse - ad-hoc and complex queries joining a great many
numbers of entities. The schemas common are star schema and snowflake schema
(Nepomnjashiy 2001). Then, there are some implementation issues which need to be
sorted out, like the amount of data that will be shipped around the network and the
network's capability to cope with it, the amount of disk space required, and its speed.
Also, expenses need to be tailored to the goals in a decision like solid-state drives
(SSDs) or traditional hard disk drives (Whitehorn 2011).
Data Structure Model for The Wine Cellar Scenario
In this section, we will walk through the process of creating a data structure model for
the company. Initially, we will identify the entities involved. Then, we will determine
their parameters. After that, relations among them will be finalised. The company is a
business engaged in selling products. It buys products from sellers and sells them
using its outlets. We will be concentrating on these aspects of the firm. A real
business has many other nuances like the costs and management of shipping the goods
to the outlets, salaries, taxes, etc. We will focus on the problem of report generation
for the management. The company deals in products, which are of different types, and
are sold in the outlets of the company throughout the country. The emphasised words
give us an idea of the entities identified. All of these will form their table. Also, each
<Your Name> <Your Student Number> 2017 10 of 22
Paraphrase This Document
Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser

table will have its primary key. A table consisting of the types of products sold will be
created e.g. red wine, white wine, beer, etc. It will have an ID for the kind of product.
An outlets table will be housing the details of all the outlets of the company. A
products table will be using this ID to identify the product that has been sold. There
will be a sales table which will be inserted to whenever there is a new sale is made in
any outlet. If the data is synchronised and is common among all the outlets (which is
an essential characteristic of any database system and will be explored later in the
paper), then the head office should have no difficulty in querying the database
howsoever it wishes and getting the results it wants within moments. This newly
proposed system would be dramatically better than the current situation where files
are manually being sent to the head office once a week, and then those files are
analysed, and a report is amalgamated.
The above figure (Fig. 1) shows the database structure for the narrow scope of the
problem being handled in this paper. PK stands for the primary key. It has not been
depicted in the figure above, but the databases are related to each other. "Product type
ID" of "Product Types" table is related to Products. "Sales" table is related to both the
"Products ID" and the "Outlet ID". It may be noticed that in the structure proposed,
there is no duplication of data. Additionally, any detail about any entity can be
queried.
<Your Name> <Your Student Number> 2017 11 of 22
Fig 1.Database model for a the sales management
created e.g. red wine, white wine, beer, etc. It will have an ID for the kind of product.
An outlets table will be housing the details of all the outlets of the company. A
products table will be using this ID to identify the product that has been sold. There
will be a sales table which will be inserted to whenever there is a new sale is made in
any outlet. If the data is synchronised and is common among all the outlets (which is
an essential characteristic of any database system and will be explored later in the
paper), then the head office should have no difficulty in querying the database
howsoever it wishes and getting the results it wants within moments. This newly
proposed system would be dramatically better than the current situation where files
are manually being sent to the head office once a week, and then those files are
analysed, and a report is amalgamated.
The above figure (Fig. 1) shows the database structure for the narrow scope of the
problem being handled in this paper. PK stands for the primary key. It has not been
depicted in the figure above, but the databases are related to each other. "Product type
ID" of "Product Types" table is related to Products. "Sales" table is related to both the
"Products ID" and the "Outlet ID". It may be noticed that in the structure proposed,
there is no duplication of data. Additionally, any detail about any entity can be
queried.
<Your Name> <Your Student Number> 2017 11 of 22
Fig 1.Database model for a the sales management

Task 2
Distributed Databases and Potential Issues Before Final Design Decisions
A distributed database is in contrast to a centralised database. A database which is
implemented across multiple computers, which may be spread apart geographically is
a distributed database system. Such systems are connected by a network (e.g.
Internet). Depending upon the underlying hardware and software, there are two types
of distributed systems - homogeneous and heterogeneous. Before we move further, it
is relevant to bear in mind that for the user, this distribution is transparent. For a
distributed database, this location transparency exists for the database administrators
also. As an example, consider email service by Google, Gmail. A Gmail user does not
know where in the world and on which servers his emails are being stored and
processed. We note that the data itself, as well as the processing required to serve a
query, may be spread across the multiple systems (which may even be in different
continents or across oceans) which are part of the distributed database. The sites
which make up a distributed database do not share any physical components and run
independently of each other. Some authors believe that the distribution of the
processing load among multiple computers may improve an end user's performance.
A homogenous distributed database system is one in which all participating sites have
the same software, are aware of each other and share the workload of processing user
requests. All of them agree that they will not change the schema or software without a
consensus. In contrast, in a heterogeneous distributed database system, the
participating sites have the freedom to use different software. They are also free to use
different schemas for the databases. As expected, such a difference gives rise to
technical challenges. Also, the sites may not be aware of each other and may
cooperate in a limited way only. Now, this distribution of data among the underlying
worker machines introduces the problems of data consistency and accuracy in the face
of physically separate computers which are possibly running different software and
schemas.
<Your Name> <Your Student Number> 2017 12 of 22
Distributed Databases and Potential Issues Before Final Design Decisions
A distributed database is in contrast to a centralised database. A database which is
implemented across multiple computers, which may be spread apart geographically is
a distributed database system. Such systems are connected by a network (e.g.
Internet). Depending upon the underlying hardware and software, there are two types
of distributed systems - homogeneous and heterogeneous. Before we move further, it
is relevant to bear in mind that for the user, this distribution is transparent. For a
distributed database, this location transparency exists for the database administrators
also. As an example, consider email service by Google, Gmail. A Gmail user does not
know where in the world and on which servers his emails are being stored and
processed. We note that the data itself, as well as the processing required to serve a
query, may be spread across the multiple systems (which may even be in different
continents or across oceans) which are part of the distributed database. The sites
which make up a distributed database do not share any physical components and run
independently of each other. Some authors believe that the distribution of the
processing load among multiple computers may improve an end user's performance.
A homogenous distributed database system is one in which all participating sites have
the same software, are aware of each other and share the workload of processing user
requests. All of them agree that they will not change the schema or software without a
consensus. In contrast, in a heterogeneous distributed database system, the
participating sites have the freedom to use different software. They are also free to use
different schemas for the databases. As expected, such a difference gives rise to
technical challenges. Also, the sites may not be aware of each other and may
cooperate in a limited way only. Now, this distribution of data among the underlying
worker machines introduces the problems of data consistency and accuracy in the face
of physically separate computers which are possibly running different software and
schemas.
<Your Name> <Your Student Number> 2017 12 of 22
⊘ This is a preview!⊘
Do you want full access?
Subscribe today to unlock all pages.

Trusted by 1+ million students worldwide
1 out of 22
Related Documents
Your All-in-One AI-Powered Toolkit for Academic Success.
+13062052269
info@desklib.com
Available 24*7 on WhatsApp / Email
Unlock your academic potential
Copyright © 2020–2025 A2Z Services. All Rights Reserved. Developed and managed by ZUCOL.




