Advanced Database Systems: Object Relational Database Report Analysis


Added on  2023/01/10

Report
AI Summary
This report examines object-relational database implementation for the Portdown Psychiatric Hospital's research department, focusing on the 'Portdown Twin Register.' The report begins by introducing the concept of object-relational databases and their relevance to data management challenges. It then delves into a comparative analysis of various database technologies, including noSQL databases, Hadoop, and data warehouses, alongside relational databases. For each technology, the report provides an overview of its main features, components, strengths, and weaknesses. The report assesses how these technologies can be used within the PPH’s research department. The report concludes with a summary of the findings and recommendations for database implementation in the context of the hospital's research needs, specifically addressing the management of twin data. The report highlights the benefits of object-relational databases, such as reuse and sharing, increased productivity, and experience, and how they can improve the accessibility of data.
Document Page
Object Relational Database
Implementation
Contents
INTRODUCTION
MAIN BODY
(a) Overview of the main features and components of the database/technology and its
strengths and weaknesses over using a relational database
(b) How and where the database/technology might be used within the PPH’s research
department
CONCLUSION
REFERENCES
INTRODUCTION
Portdown Psychiatric Hospital’s (PPH) Research department maintains a dataset of twins known
as the Portdown Twin Register. The research manager faces various issues in maintaining and
updating this dataset, so the department is exploring different database technologies that
could resolve them.
An Object Relational Database is a database management system that maintains relational
and referenced data together so that the data remain accessible (Bousnina et al., 2016).
The main aim of this report is to research the technologies and database management systems
that can support the management of this dataset. For this, the case study of Portdown
Psychiatric Hospital’s Research department is considered.
This report assesses four database technologies: NoSQL databases, Hadoop, Data Warehouses
and Object-Relational databases. For each, it gives an overview together with its features,
components, strengths and weaknesses. It then considers how each technology could be used
within PPH’s research department.
MAIN BODY
(a) Overview of the main features and components of the database/technology and its strengths
and weaknesses over using a relational database
A relational database stores data in a way that gives access to data points that are
related to each other. In simpler words, a relational database is a collection of information
whose data points have predetermined relationships (Agrawal et al., 2016). A relational
database can be complex to maintain and update, because a change to one value may affect
other values related to it. For example, if an investigator has to change the address of a
twin pair, they may have to update every related record for that pair, as the information in
a relational database is interrelated. Considering this, there are several databases and
technologies that can be used in place of a relational database. Each of them is analysed
below:
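As a minimal sketch of what predetermined relationships mean in practice, the following
Python example uses the standard sqlite3 module. The table and column names (twin_pair, twin,
address) are hypothetical and are not taken from the Portdown Twin Register. The point is only
that the structure and the relationship are declared before any data is added, and that the
database then enforces them.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# The structure and the relationship are fixed before any data exists.
conn.executescript("""
    CREATE TABLE twin_pair (
        pair_id INTEGER PRIMARY KEY,
        address TEXT NOT NULL
    );
    CREATE TABLE twin (
        twin_id INTEGER PRIMARY KEY,
        pair_id INTEGER NOT NULL REFERENCES twin_pair(pair_id),
        name    TEXT NOT NULL
    );
""")

conn.execute("INSERT INTO twin_pair VALUES (1, '12 Example Road')")
conn.executemany(
    "INSERT INTO twin VALUES (?, ?, ?)",
    [(1, 1, "Twin A"), (2, 1, "Twin B")],
)

# A join follows the declared relationship between the two tables.
rows = conn.execute("""
    SELECT twin.name, twin_pair.address
    FROM twin JOIN twin_pair ON twin.pair_id = twin_pair.pair_id
    ORDER BY twin.twin_id
""").fetchall()


def insert_orphan():
    """Try to add a twin that points at a pair which does not exist."""
    try:
        conn.execute("INSERT INTO twin VALUES (3, 99, 'Twin C')")
        return True
    except sqlite3.IntegrityError:
        return False


orphan_accepted = insert_orphan()
```

In this sketch the orphan row is rejected, which illustrates both the strength of a relational
design (consistency) and the rigidity that the following technologies try to relax.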
NoSQL databases
Overview - NoSQL ("not only SQL") is a type of database that, unlike a relational database,
is non-tabular and supports a variety of data models. These databases are known for their
easy accessibility.
Components and features - The components of a NoSQL database can include storage node
agents, replication nodes, admin services and shards. These components store the information
and correspond to the particular service running on a host. The features of this type of
database include multiple data models, easy scalability, flexibility and distribution. As
NoSQL data does not have to be recorded in tables, it can follow multiple models, which
supports easy scalability. These databases are considered flexible because they can process
not only structured data but also semi-structured and unstructured data, which is why a NoSQL
database is often considered a better fit for such data than a relational database
(Moniruzzaman and Hossain, 2013).
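The flexibility described above can be illustrated with a small sketch. It does not use a
real NoSQL product: a plain Python dictionary stands in for a document store, and the field
names (twin_id, migration_history, notes) are assumptions made for the example. It shows
records with different shapes being stored side by side without a table definition.

```python
import json

# Three records with different shapes: structured, semi-structured and free text.
records = [
    {"twin_id": 1, "name": "Twin A", "sex": "F"},
    {"twin_id": 2, "name": "Twin B", "sex": "F",
     "migration_history": ["UK", "Canada"]},
    {"twin_id": 3, "name": "Twin C", "notes": "free-text hospital note"},
]

# Hypothetical stand-in for a document store: key -> serialised JSON document.
store = {}
for record in records:
    store[record["twin_id"]] = json.dumps(record)


def find(field):
    """Return the ids of all documents that contain the given field."""
    return sorted(
        twin_id for twin_id, doc in store.items() if field in json.loads(doc)
    )


with_migration = find("migration_history")
```

No schema change was needed to add migration_history or notes to individual documents, which
is the property the paragraph above refers to.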
Strengths - NoSQL databases have various strengths that can justify using them over
relational databases. The first and most influential is scalability. There are two kinds of
scalability, vertical and horizontal. Horizontal scaling increases capacity by adding more
machines. It is difficult to do with relational databases because they focus on maintaining
ACID transactions, so a distributed system needs distributed locking and similar
coordination, which adds complexity. Vertical scaling instead adds resources such as memory
or processors to an existing machine, and is limited by how far a single machine can be
upgraded. NoSQL databases support both types of scaling, which gives them an advantage over
relational databases. Other strengths of NoSQL databases include auto-sharding, integrated
caching and replication. With these, the information in the database stays available: even if
the latest data fails to replicate, a NoSQL database can automatically provide a previous
consistent state of the dataset
(Advantages of NoSQL Databases. 2020).
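Horizontal scaling through sharding can be sketched as follows. This is a simplified
illustration rather than the method of any particular NoSQL product: keys are assigned to
shards with a hash and a modulo, and the key format "pair-N" is hypothetical. Production
systems often use more elaborate schemes, such as consistent hashing, because a plain modulo
moves many keys whenever the number of shards changes.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard number using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def distribute(keys, shard_count):
    """Group keys by the shard they are assigned to."""
    shards = {i: [] for i in range(shard_count)}
    for key in keys:
        shards[shard_for(key, shard_count)].append(key)
    return shards


keys = [f"pair-{n}" for n in range(1000)]
three_machines = distribute(keys, 3)
four_machines = distribute(keys, 4)  # "adding a machine" spreads the same keys wider
```

Each machine holds only part of the data, so capacity grows by adding machines rather than by
upgrading one.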
Weaknesses - Like every other database or technology, NoSQL databases have a few
weaknesses that restrict their use. NoSQL databases have a narrow focus when it comes to
functionality: their main aim is to store information effectively, so they offer fewer
functional features. On this point relational databases are more effective, as they are the
better choice for transaction management. Another weakness is that many NoSQL databases
relax reliability guarantees and do not fully support atomicity, consistency, isolation and
durability (ACID). NoSQL databases are also newer than relational databases, so they are
considered less proven in terms of reliability (Nyati, Pawar and Ingle, 2013).
Hadoop
Overview - Hadoop, also known as Apache Hadoop, is an open-source framework for the
distributed storage and processing of big data. Hadoop is designed to scale from a single
server to many machines, each of which offers local storage and computation
(Overview of the Hadoop. 2019).
Components and features - Hadoop has two core components: the Hadoop Distributed File
System (HDFS) and the MapReduce framework. HDFS stores data as blocks distributed across the
cluster. Its sub-components are the NameNode, which is the master of HDFS, and the DataNodes,
which are its workers. The second component, MapReduce, is a programming model for processing
big data with distributed algorithms. The features of Hadoop include data integrity,
robustness, accessibility and cluster rebalancing. Together these features help Hadoop store
and process big data (Yang, Lin and Liu, 2013).
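The MapReduce programming model can be sketched in a few lines of plain Python. This runs on
one machine and does not use Hadoop itself; it only shows the three conceptual phases (map,
shuffle, reduce). The records and the diagnosis values are invented for illustration.

```python
from collections import defaultdict

# Hypothetical input records; in Hadoop these would be split across DataNodes.
records = [
    {"twin_id": 1, "diagnosis": "depression"},
    {"twin_id": 2, "diagnosis": "depression"},
    {"twin_id": 3, "diagnosis": "anxiety"},
    {"twin_id": 4, "diagnosis": "depression"},
    {"twin_id": 5, "diagnosis": "anxiety"},
]


def map_phase(record):
    """Emit one (key, value) pair per record."""
    yield (record["diagnosis"], 1)


def shuffle(pairs):
    """Group all values that share a key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine the values for one key into a single result."""
    return key, sum(values)


mapped = [pair for record in records for pair in map_phase(record)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
```

Because each map call looks at one record and each reduce call looks at one key, the work can
be spread over many machines, which is what makes the model suitable for big data.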
Strengths - The main strength of Hadoop is its ability to store and process large volumes
of data, which is why it is preferred over relational database systems for such workloads.
Hadoop is also scalable: it allows users to store large data sets and distribute them across
hundreds of servers. Another strength is that Hadoop is a cost-effective storage solution,
unlike relational database management systems, which can be cost-prohibitive at this scale.
Hadoop is also resilient to failure. Unlike a relational database, when data is sent to an
individual node, Hadoop replicates it to other nodes in the cluster, so that a copy is
available if any failure occurs (Overview of the Hadoop. 2019).
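The replication idea can be sketched as below. The default replication factor of 3 matches
the HDFS default, but the round-robin placement is an assumption made for brevity: real HDFS
placement also takes racks and node load into account. The node names are hypothetical.

```python
def place_replicas(block_id, nodes, replication_factor=3):
    """Choose which nodes hold copies of a block (simple round-robin placement)."""
    if replication_factor > len(nodes):
        raise ValueError("not enough nodes for the requested replication factor")
    start = block_id % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replication_factor)]


def readable_after_failure(placement, failed_node):
    """A block stays readable while at least one replica is on a healthy node."""
    return any(node != failed_node for node in placement)


nodes = ["node-a", "node-b", "node-c", "node-d"]
placement = place_replicas(5, nodes)
still_readable = readable_after_failure(placement, "node-b")
```

With three copies on three different nodes, the loss of any single node leaves the block
available, which is the fault tolerance described above.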
Weaknesses - Apart from the above strengths, there are certain weaknesses due to which
Hadoop is not always preferred over a relational database. The most significant is that
Hadoop is designed for analysing big data, which makes it poorly suited to small data sets.
Another weakness is security, with concerns raised by missing encryption at the network
level (Khan, Liu and Li, 2014).
The twin data held by PPH is considerably large and can be stored and analysed effectively
using Hadoop. Hadoop would also make it easier to update the twin information in an orderly
way.
Data Warehouses
Overview - A data warehouse is a system that aggregates and stores information from
various sources. It provides a central place where all the information can be stored, and
it is business oriented in focus (Overview of data warehouses. 2020). A data warehouse is
usually preferred over a single relational database because it collects data from many
databases and its storage capacity is far larger.
Components and features - A data warehouse has five components: the database, ETL tools,
metadata, query tools and data marts. The database holds the information that needs to be
stored. The ETL tools (extract, transform and load) clean and transform the data so that it
can be analysed further. Metadata relates to the architecture of the data and is used to
maintain and manage the data warehouse. Query tools let users interact with the data
warehouse and analyse the information so that it can support strategic decision making.
Besides these components, the features of a data warehouse include integration and time
variance, as a data warehouse integrates data from various sources such as mainframes and
flat files. Other characteristics of this technology are that it is subject oriented and
non-volatile (Hajmoosaei, Kashfi and Kailasam, 2011).
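The extract, transform and load steps can be sketched with the Python standard library. The
CSV content, the column names and the cleaning rules (trim whitespace, lower-case the
diagnosis, drop empty and duplicate rows) are assumptions chosen for illustration, and an
in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system, with typical quality problems.
RAW = """twin_id,diagnosis,diagnosed_on
1, Depression ,2019-03-01
2,anxiety,2020-07-15
2,anxiety,2020-07-15
3,,2021-01-09
"""


def extract(text):
    """Read the raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Clean values, drop rows with no diagnosis and remove duplicates."""
    seen = set()
    clean = []
    for row in rows:
        diagnosis = row["diagnosis"].strip().lower()
        if not diagnosis:
            continue
        record = (int(row["twin_id"]), diagnosis, row["diagnosed_on"].strip())
        if record in seen:
            continue
        seen.add(record)
        clean.append(record)
    return clean


def load(conn, records):
    """Write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS diagnosis "
        "(twin_id INTEGER, diagnosis TEXT, diagnosed_on TEXT)"
    )
    conn.executemany("INSERT INTO diagnosis VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW)))
loaded = conn.execute(
    "SELECT twin_id, diagnosis FROM diagnosis ORDER BY twin_id"
).fetchall()
```

Only the cleaned rows reach the warehouse, so later queries do not have to repeat the
cleaning work.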
Strengths – A data warehouse has various strengths due to which it is used over
relational databases. Data warehouses allow data to be retrieved in less time than from
a relational database. A data warehouse can also help identify and correct errors in the
data, owing to its compatibility with artificial intelligence tools. Lastly, unlike a
relational database, a data warehouse is easy to integrate with other systems, which is why
many business organisations integrate it with their CRM.
Weaknesses – The main weakness of a data warehouse, due to which it is not always
preferred over a relational database, is its high maintenance cost. A data warehouse has to
be updated consistently, which is expensive, and many organisations are not able to afford
it (Triplet and Butler, 2014).
Object-Relational databases
Overview – An Object-Relational database management system (ORDBMS) is developed by
combining the relational database and the object-oriented database. It supports the schemas
of Structured Query Language (SQL) and is an extension of the relational model.
Components and features – The components of an Object-Relational database combine those
of a relational database management system and an object-oriented database management system
(Components of Object-Relational databases. 2020). These components include tables, forms,
macros, modules and queries. As in a relational database, information is stored in tables,
and as in an object-oriented system, the system includes macros and modules. The features of
Object-Relational databases include inheritance and method encapsulation. An ORDBMS supports
new sub-types and sub-classes that build on existing object-oriented and relational
components, which is why the technology is said to have the characteristic of inheritance.
Method encapsulation is the capability of an ORDBMS that allows users to perform a method
within the defined object (Harrington, 2016).
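Inheritance and method encapsulation can be illustrated with ordinary Python classes. This is
an analogy only: the classes stand in for user-defined types in an ORDBMS, and the names
Person, Twin, pair_id and zygosity are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Person:
    """Base type holding attributes that any person record needs."""
    name: str
    date_of_birth: date

    def age_on(self, day):
        """Method encapsulated with the data it operates on."""
        had_birthday = (day.month, day.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return day.year - self.date_of_birth.year - (0 if had_birthday else 1)


@dataclass
class Twin(Person):
    """Sub-type: inherits name, date_of_birth and age_on, and adds twin fields."""
    pair_id: int
    zygosity: str


twin = Twin("Twin A", date(1990, 6, 15), pair_id=1, zygosity="identical")
age_before_birthday = twin.age_on(date(2020, 6, 14))
age_on_birthday = twin.age_on(date(2020, 6, 15))
```

The Twin type reuses everything defined on Person without repeating it, which is the kind of
reuse the report attributes to an ORDBMS.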
Strengths – Among the various benefits that persuade users to choose an Object-Relational
database management system instead of a relational database, reuse and sharing is especially
influential. With an ORDBMS, standard functionality can be implemented once, centrally, and
then reused as often as the user requires. Another strength of this database technology is
increased productivity, as it provides an interactive experience to the developer as well as
the user. A further strength is the use of experience: an ORDBMS preserves the significant
body of knowledge and experience gained while developing relational applications. By
preserving this experience, organisations have a benchmark and an evolutionary way to develop
their databases (Components of Object-Relational databases. 2020).
Weaknesses – The main weakness of an ORDBMS, due to which relational databases are
sometimes used instead, is the loss of simplicity: because an ORDBMS is an extended form of a
relational database, it is considerably more complex (Armbrust et al., 2015).
From the above analysis, each of the four database management systems and technologies has
strengths due to which it can be used over a relational database management system.
(b) How and where the database/technology might be used within the PPH’s research department
Portdown Psychiatric Hospital’s Research department is known for its internationally
renowned database of twins. This database is currently difficult to maintain and manage. At
present, PPH uses a relational database management system to store data on more than 50,000
pairs of twins. Instead of selecting a single new technology or database, it is suggested
that PPH use different technologies and database management systems for different subsets of
the data, so that all the issues are addressed.
The rationale for using each database management system and technology is analysed below,
along with what each should be used for and where, in the context of Portdown Psychiatric
Hospital’s Research department:
NoSQL databases
The basic information of each twin will be recorded and stored using NoSQL. This database
will hold the general contact information of the twins along with their age, sex and place of
birth, the nationality of their mother and father along with their date of birth, and the
history of migration movements. The data will be entered by manually transcribing all the
physical records into electronic form. This database will not involve any tabular data. The
major reason why PPH should prefer NoSQL is that the technology is very flexible: a NoSQL
database such as MongoDB allows additional ‘fields’ to be included at any time, as it does
not need a schema in advance, unlike a relational database, where the user has to design the
structure of the database before any data can be added.
Having analysed where and how the information will be recorded, it is important to identify
the rationale for using a NoSQL database. As analysed above, NoSQL is an effective management
system for recording any kind of data, whether it is structured, semi-structured or
unstructured (Han et al., 2011). Another reason to record this subset of data in NoSQL is
that the data only needs to be recorded and does not need any further functionality, which
makes NoSQL ideal for it. The data can be easily retrieved when required and will be divided
into sections for same-sex identical twins, same-sex non-identical twins, and differing-sex
twins and triplets.
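The division into sections can be sketched as follows. The document layout (a "sexes" list
and an "identical" flag) and the section labels are assumptions made for this example, and
for simplicity any group whose members differ in sex is placed in a single section.

```python
from collections import defaultdict

# Hypothetical twin documents; note that fields may differ between documents.
documents = [
    {"pair_id": 1, "sexes": ["F", "F"], "identical": True},
    {"pair_id": 2, "sexes": ["M", "M"], "identical": False,
     "migration_history": ["UK"]},
    {"pair_id": 3, "sexes": ["F", "M"], "identical": False},
    {"pair_id": 4, "sexes": ["M", "M"], "identical": True},
]


def section_for(document):
    """Decide which section of the register a document belongs to."""
    if len(set(document["sexes"])) > 1:
        return "differing sex"
    if document["identical"]:
        return "same sex identical"
    return "same sex non-identical"


sections = defaultdict(list)
for document in documents:
    sections[section_for(document)].append(document["pair_id"])
```

The extra migration_history field on one document does not affect the grouping, which
reflects the schema flexibility discussed above.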
Hadoop
Hadoop will be used to meet several database requirements and address several issues of
Portdown Psychiatric Hospital’s Research department. It will be used to store and process
data. For PPH, the data stored will be the illness and date of diagnosis for each twin,
along with additional information including hospital notes and hospitalisation history. The
aim of recording this data is to make the health history of each twin easily accessible and
the background health history easy to study. Hadoop will also be used to record the research
project history of the twins, in order to identify the research tests in which each twin has
been involved. This data must be able to show a complete testing record for each patient, so
that it is possible to retrieve what the twins were tested for, which tests they took and
what the results were. It must also be possible to produce a report on these tests.
Besides both of these data subsets, Hadoop will also be used for analysing the budget of
each project across the entire data set. For this requirement of Portdown Psychiatric
Hospital’s Research department, all the expenses of information acquisition will be monitored
and controlled by setting a budget constraint for all the activities (Polato et al., 2014).
There is a rationale behind using Hadoop for each of the requirements above. Hadoop is
software capable of recording and processing big data. The data subsets of hospital history
and research test history are large data sets that can be easily recorded using Hadoop. Both
can also be processed using the Hadoop components, the Hadoop Distributed File System and the
MapReduce framework. By using Hadoop, researchers will be able to process the information to
identify the medical backgrounds of all the twins and the results of the research tests.
Hadoop also has the strength of distributed algorithms, which can be used to analyse spending
so that budgets for each project can be allotted.
Data Warehouses
Data warehouses have both benefits and weaknesses, so this technology will be used only
for specific parts of the data of PPH’s research department. According to PPH’s requirements,
the entire data must be processed in such a way that it can be used to analyse results across
various projects. For this requirement, data warehouse technology should be used, as a data
warehouse can connect all the databases so that information can be gathered from any data
point or centre. A data warehouse will also be used for another requirement, which is to get
easy access to information on the publications relating to the projects. For this, PPH can
combine its database with a database of publications in the data warehouse, which will make
it possible to monitor the publication activity of individual researchers
(Özcan et al., 2011).
The rationale behind selecting data warehouse technology for the above two requirements
lies in its components, which not only record the data but also clean and transform it so
that it can be retrieved using query tools. The main weakness of this system is that data
warehouses are expensive, but with the additional funds allocated to PPH, the research
department can afford to use one. The organisation is also advised to use a star schema,
which will allow it to structure its data and to update it quickly.
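A star schema places one central fact table among several dimension tables. The sketch below
builds a very small one with the standard sqlite3 module. All table names, column names and
values (dim_twin, dim_project, fact_test_result, the study titles and scores) are hypothetical
and serve only to show how a cross-project question could be answered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables describe the "who" and the "what".
    CREATE TABLE dim_twin (twin_key INTEGER PRIMARY KEY, zygosity TEXT);
    CREATE TABLE dim_project (project_key INTEGER PRIMARY KEY, title TEXT);

    -- The fact table holds the measurements and points at the dimensions.
    CREATE TABLE fact_test_result (
        twin_key    INTEGER REFERENCES dim_twin(twin_key),
        project_key INTEGER REFERENCES dim_project(project_key),
        score       REAL
    );

    INSERT INTO dim_twin VALUES (1, 'identical'), (2, 'identical'), (3, 'non-identical');
    INSERT INTO dim_project VALUES (10, 'Sleep study'), (11, 'Memory study');
    INSERT INTO fact_test_result VALUES
        (1, 10, 4.0), (2, 10, 6.0), (3, 10, 8.0), (1, 11, 7.0);
""")

# Average score per project and zygosity: a typical cross-project analysis.
rows = conn.execute("""
    SELECT p.title, t.zygosity, AVG(f.score)
    FROM fact_test_result AS f
    JOIN dim_twin AS t ON t.twin_key = f.twin_key
    JOIN dim_project AS p ON p.project_key = f.project_key
    GROUP BY p.title, t.zygosity
    ORDER BY p.title, t.zygosity
""").fetchall()
```

New results only add rows to the fact table, while descriptive details live once in the
dimension tables, which keeps updates small.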
Object-Relational databases
All the issues and requirements of PPH for the twin database can be met by using the
above three database management systems, each of which will store and process a specific
subset of the data. The Object-Relational database, by contrast, will be used to store and
record the entire database in one place (Boyd and Crawford, 2011).
The rationale behind selecting an Object-Relational database for storing the entire
database of PPH is its inheritance from the relational model. The database that PPH
currently holds is relational, so it will be relatively easy to record the entire database
in an Object-Relational database: it will not only record and store the data in tabular form
but also help to process and retrieve the data at any time, and it will make information easy
to update. An Object-Relational database will be implemented for the twin database of PPH,
while the other three database management systems and technologies will be used to record
and process subsets of the data and to address specific analysis requirements.
CONCLUSION
From the above report, it has been concluded that database management is a complex task
when the data requires regular updating and retrieval. PPH, the organisation of the case
study, is recommended to implement an Object-Relational database for its twin database and to
use NoSQL, Hadoop and data warehouses for subsets of its data and for specific processing and
analytical requirements. PPH is also recommended to invest its funds in procuring a data
warehouse and in acquiring the open-source Hadoop software, as NoSQL and ORDBMS do not
require additional funds.
REFERENCES
Books and Journals
Agrawal, A. et al., 2016. Flexible storage of XML collections within an object-relational
database. U.S. Patent 9,367,642.
Armbrust, M. et al., 2015, May. Spark sql: Relational data processing in spark.
In Proceedings of the 2015 ACM SIGMOD international conference on management of
data (pp. 1383-1394).
Bousnina, F.E. et al., 2016, April. Object-relational implementation of evidential databases.
In 2016 International Conference on Digital Economy (ICDEc) (pp. 80-87). IEEE.
Boyd, D. and Crawford, K., 2011, September. Six provocations for big data. In A decade in
internet time: Symposium on the dynamics of the internet and society.
Hajmoosaei, A., Kashfi, M. and Kailasam, P., 2011, October. Comparison plan for data
warehouse system architectures. In The 3rd International Conference on Data Mining
and Intelligent Information Technology Applications (pp. 290-293). IEEE.
Han, J. et al., 2011, October. Survey on NoSQL database. In 2011 6th international
conference on pervasive computing and applications (pp. 363-366). IEEE.
Harrington, J.L., 2016. Relational database design and implementation. Morgan Kaufmann.
Khan, M., Liu, Y. and Li, M., 2014, August. Data locality in Hadoop cluster systems. In 2014
11th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD) (pp.
720-724). IEEE.
Moniruzzaman, A.B.M. and Hossain, S.A., 2013. Nosql database: New era of databases for big
data analytics-classification, characteristics and comparison. arXiv preprint
arXiv:1307.0191.
Nyati, S.S., Pawar, S. and Ingle, R., 2013, August. Performance evaluation of unstructured
NoSQL data over distributed framework. In 2013 International Conference on Advances
in Computing, Communications and Informatics (ICACCI) (pp. 1623-1627). IEEE.
Özcan, F. et al., 2011, June. Emerging trends in the enterprise data analytics: connecting
Hadoop and DB2 warehouse. In Proceedings of the 2011 ACM SIGMOD International
Conference on Management of data (pp. 1161-1164).
Polato, I. et al., 2014. A comprehensive view of Hadoop research—A systematic literature
review. Journal of Network and Computer Applications. 46. pp.1-25.
Triplet, T. and Butler, G., 2014. A review of genomic data warehousing systems. Briefings in
Bioinformatics. 15(4). pp.471-483.
Yang, C., Lin, W. and Liu, M., 2013, September. A novel triple encryption scheme for
hadoop-based cloud data security. In 2013 Fourth International Conference on Emerging
Intelligent Data and Web Technologies (pp. 437-442). IEEE.
Online
Advantages of NoSQL Databases. 2020. [Online]. Available through:
<https://www.mongodb.com/nosql-explained/advantages>
Overview of the Hadoop. 2019. [Online]. Available through:
<https://dzone.com/articles/big-data-hadoop-tutorial-for-beginners-3>
Overview of data warehouses. 2020. [Online]. Available through:
<https://www.ibm.com/support/knowledgecenter/en/SSGU8G_11.50.0/com.ibm.whse.doc/ids_ddi_344.htm>
Components of Object-Relational databases. 2020. [Online]. Available through:
<https://www.techopedia.com/definition/8714/object-relational-database-ord>