
SQL vs NoSQL: Choosing the Right Database for Big Data Management

Answer all the questions in the assignment and ensure good presentation and thoroughness in approach. Proper referencing and use of APA style is required.

10 Pages, 2698 Words

Added on  2022-12-23

About This Document

This report discusses the differences between SQL and NoSQL databases and their implications for big data management. It highlights the benefits of NoSQL databases, such as scalability and flexibility, and explains why MongoDB is a suitable choice for handling diverse and rapidly growing data. The report also explores the use of machine learning techniques in big data analysis and the role of text mining in extracting valuable insights from textual databases. Overall, it provides valuable insights for organizations looking to choose the right database system for their big data needs.

Report
Component (a)
Background
Safeway Fresh and Whole Foods is a supermarket chain that provides its customers with all types
of groceries and other household needs in one place. The company has a vast database that stores
records of sales, stock details, online and offline transactions, purchases and sales of
products, and so on. The company also maintains software to sell its products online. As more
and more people transition to online purchasing, the company collects information from a
variety of social media platforms to understand the buying trends of its customers (Hobbs, 2017).
SQL vs NoSQL
Safeway is upgrading its database and is considering either an SQL or a NoSQL database
management system for the purpose. Before choosing one of them for the storage and management
of its big data, Safeway should first consider the following high-level differences between
SQL and NoSQL:
1. SQL databases are relational databases (RDBMS), whereas NoSQL databases are non-relational
and typically distributed.
2. In SQL databases, data is stored in tables made up of rows and columns. NoSQL databases, on
the other hand, store data as documents, key-value pairs, wide-column stores or graphs, and
follow no standard schema definition.
3. SQL databases have a predefined schema, whereas NoSQL databases have a dynamic schema
suited to unstructured data.
4. SQL databases are typically scaled vertically, whereas NoSQL databases are scaled
horizontally. Scaling an SQL database usually means adding hardware capacity, such as more
RAM, SSD storage or CPU power, to a single server, whereas scaling a NoSQL database means
adding more database servers to the resource pool to spread the load.
5. As the name suggests, Structured Query Language is used in SQL databases to define and
manipulate data. In NoSQL databases, queries focus on collections of documents; the query
language is sometimes referred to as Unstructured Query Language (UnQL), and its syntax
differs from one database to another.
6. Oracle, MySQL, PostgreSQL, SQLite and MS-SQL are some examples of SQL databases.
MongoDB, Redis, Cassandra, CouchDB, HBase and RavenDB are some examples of
NoSQL databases.
7. For processing complex queries in a query-intensive environment, SQL databases are better
suited and more powerful than NoSQL databases, which lack a standard interface for
performing such queries.
8. For big data, NoSQL databases are often preferred, as they are well suited to storing large,
hierarchical data sets thanks to their key-value storage model, which resembles JSON.
9. For complex, high-volume transactional applications, SQL databases are better suited, as
they are more stable and reliable in terms of data integrity and atomicity. NoSQL databases
can also be used for this purpose, but they are generally not as stable as SQL databases
under complex, high-load transactional workloads.
10. SQL database vendors provide excellent support to business clients, particularly for
large deployments. NoSQL databases still depend largely on community support, and
large-scale setup and deployment is done by experts, of whom only a few are available in
the market.
11. SQL databases are known for their ACID properties: Atomicity, Consistency, Isolation and
Durability. NoSQL databases, on the other hand, follow Brewer's CAP theorem, which stands
for Consistency, Availability and Partition tolerance (Raje & Jagdale, 2017).
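The contrast between a predefined schema and a dynamic schema (points 2 and 3 above) can be
sketched in a few lines of Python. This is only an illustration under stated assumptions: the
`products` table, its columns and the sample records are hypothetical, SQLite stands in for an
SQL database, and a plain list of JSON-style dictionaries stands in for a document store such
as MongoDB rather than using a real NoSQL driver.

```python
import json
import sqlite3

# Relational side: the table schema is declared before any data is stored.
# (Hypothetical table and columns, chosen for illustration only.)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL, price REAL NOT NULL)"
)
conn.execute(
    "INSERT INTO products (id, name, price) VALUES (?, ?, ?)", (1, "Milk 1L", 1.20)
)


def insert_with_unknown_column(connection):
    """Try to store a field the schema does not declare; return the error text."""
    try:
        connection.execute(
            "INSERT INTO products (id, name, price, organic) VALUES (?, ?, ?, ?)",
            (2, "Apples 1kg", 3.50, True),
        )
    except sqlite3.OperationalError as exc:
        return str(exc)
    return None


# The insert is rejected because "organic" is not part of the declared schema.
schema_error = insert_with_unknown_column(conn)

# Document side: each record is a self-describing JSON-style document, so two
# records in the same collection may carry different (and nested) fields.
collection = [
    {"_id": 1, "name": "Milk 1L", "price": 1.20},
    {
        "_id": 2,
        "name": "Apples 1kg",
        "price": 3.50,
        "organic": True,
        "reviews": [{"source": "social", "rating": 5}],
    },
]
serialized = [json.dumps(doc) for doc in collection]
organic_names = [doc["name"] for doc in collection if doc.get("organic")]
```

In the relational case the new field would first require a schema change (for example an
`ALTER TABLE` statement), whereas the document collection accepts the extra `organic` and
`reviews` fields without any change to the other records.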
From the above differences, it can be seen that NoSQL can be an optimal choice for the storage
and management of big data. A big data application deals with a wide variety of data collected
from sources such as mobile phones and social media, ranging from users' personal information
to sensor data. Flexibility and scalability therefore play an important role in handling this
data. Scaling an SQL system vertically requires spending a large amount of money on a single
hardware node. NoSQL provides horizontal scalability for big data applications, which can be
implemented simply by adding server nodes to the system so that the load is shared between
nodes. NoSQL databases are also more flexible than SQL databases, as they are not restricted
to a fixed schema as relational databases are (Pore & Pawar, 2015).
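How adding a server node shares the load can be illustrated with a minimal sketch of
hash-based sharding. The node names and order keys below are hypothetical, and simple modulo
hashing is an assumption made for brevity; production NoSQL systems generally use more
elaborate schemes (for example consistent hashing or range-based sharding) so that fewer keys
move when a node is added.

```python
import hashlib
from collections import Counter


def node_for(key, nodes):
    """Pick the node responsible for a key by hashing the key (modulo sharding)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


def load_per_node(keys, nodes):
    """Count how many keys each node would store."""
    return Counter(node_for(key, nodes) for key in keys)


# Hypothetical record keys, e.g. one per customer order.
keys = [f"order-{i}" for i in range(10_000)]

two_nodes = load_per_node(keys, ["node-a", "node-b"])
three_nodes = load_per_node(keys, ["node-a", "node-b", "node-c"])

print("2 nodes:", dict(two_nodes))
print("3 nodes:", dict(three_nodes))
```

With two nodes each server holds roughly half of the keys; after a third node is added, each
holds roughly a third, without any change to the hardware of the existing servers.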
Implications and outcomes of choosing the correct database management system
For Safeway Inc., it is important that the system is friendly for brand managers as well as
for other employees in the company. The system should not only be easy to use and understand
but should also provide the organization with maximum benefit. As the selected database system
will be handled by a range of professionals, including the IT team, marketing professionals
and database developers, they should know enough about the system and how it functions to
avoid issues. The chosen system should be scalable and flexible enough to be integrated with
the company's current software and to grow with the organization's rapidly increasing
demands, while remaining cost-effective and sustainable. Overall, the most important
criterion in selecting the right database system is that every team should be able to use it
without much effort to make the most valuable decisions from the data it contains.
Machine learning and Big data
In today’s era of technological advancement, a massive amount of data is generated every
second, and it needs to be stored and processed as quickly as it is generated. While this
data has significant potential, new ways of learning and thinking are required to interpret
it, extract meaningful results and address the challenges that come with it (Qiu, Wu, Ding,
Xu & Feng, 2016). Over the past decade, data-intensive fields such as astronomy, medicine and
biology have widely adopted machine learning techniques to mine hidden information within
their data. Machine learning is a research field that focuses on the performance, properties
and theory of learning algorithms and systems. It is an interdisciplinary field that builds on
various disciplines of engineering, science and mathematics, such as optimization theory,
artificial intelligence, cognitive science, statistics, optimal control and information
theory. The main function of machine learning is to use past experience to make good
predictions automatically with a good classifier (Muhammad & Yan, 2015). A machine learning
model works in two simple phases: training and testing. In the training phase, the machine
takes samples of training data as input, and the learner (the learning algorithm) learns the
features of the data to build the model. In the testing phase, an execution engine uses the
learned model to make predictions for the test or production data. The output of the learned
model is tagged data, which provides the final prediction (Qiu et al., 2016). Supervised
learning is the most common technique for classification problems. In this technique, the
inputs and the desired outputs are fed to the algorithm, which learns the mapping function
from input to output. Supervised learning has two main categories: classification and
regression. In classification, the output is a predicted target class, while in regression
the output is a predicted continuous value (Vennapusa & Bhyrapuneni, 2019). However,
supervised learning faces challenges when dealing with big data. Classification problems
involving huge-scale data are now ubiquitous, and traditional classification algorithms cannot
process big data properly. Many of the learning methods available today use shallow-structured
architectures, which do not fit the increasingly complex structure of the input data (Xie et
al., 2018).
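The training and testing phases of supervised classification can be sketched as follows. This
is a minimal illustration, not the method of any of the cited authors: the nearest-centroid
classifier is chosen here only because it fits in a few lines, and the customer features
(weekly visits, average basket value) and segment labels are invented for the example.

```python
from collections import defaultdict


def train(samples):
    """Training phase: learn one centroid (mean feature vector) per class label."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(model, features):
    """Testing phase: return the label whose centroid is closest to the input."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(model, key=lambda label: squared_distance(model[label]))


# Hypothetical labelled data: (weekly visits, average basket value) -> segment.
training_data = [
    ((1, 20.0), "occasional"), ((2, 25.0), "occasional"), ((1, 15.0), "occasional"),
    ((5, 80.0), "regular"), ((6, 95.0), "regular"), ((4, 70.0), "regular"),
]
test_data = [((2, 18.0), "occasional"), ((5, 90.0), "regular")]

model = train(training_data)
predictions = [predict(model, features) for features, _ in test_data]
accuracy = sum(
    predicted == label for predicted, (_, label) in zip(predictions, test_data)
) / len(test_data)

print(predictions, accuracy)
```

Here `train` plays the role of the learner that builds the model from labelled samples, and
`predict` produces the tagged output for unseen data. A regression model would follow the same
two phases but return a continuous value instead of a class label.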
Approaches in text mining
Text mining is the process of extracting interesting and meaningful patterns from textual
databases. It is a multi-disciplinary field that includes information retrieval, machine
learning, data mining and computational semantics. Text-based techniques are increasingly
used in industry, web applications, the internet, academia and other fields (Talib, Hanif,
Ayesha & Fatima, 2016). To develop a marketing strategy, organizations require analysis of,
for example, their target audience and its geographic distribution. This information helps
them gain insights and discover future trends to target with their marketing strategies. Text
mining can help in areas such as customer relationship management (CRM), email cleaning,
social network analysis, feature extraction, visualization, and predictive and trend
analysis. As this data is stored in text form in databases, text extraction techniques can be
applied to the text files to analyze it. All the data that is generated and analyzed can then
be compiled and reviewed further to ensure more tailored marketing efforts.
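One of the simplest text extraction techniques, counting the most frequent terms in a set of
customer comments, can be sketched as follows. The comments, the stop-word list and the
function names are hypothetical and chosen only for illustration; real text mining pipelines
would typically add steps such as stemming, phrase detection or sentiment scoring.

```python
import re
from collections import Counter

# A small, hand-picked stop-word list (an assumption for this example only).
STOP_WORDS = {
    "the", "a", "an", "and", "is", "was", "to", "of", "in", "for", "but", "very", "my", "i",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_terms(documents, n=3):
    """Count non-stop-word terms across all documents; return the n most common."""
    counts = Counter(
        token
        for document in documents
        for token in tokenize(document)
        if token not in STOP_WORDS
    )
    return counts.most_common(n)


# Hypothetical customer comments collected from social media.
comments = [
    "The delivery was late but the produce was fresh.",
    "Fresh bread and fast delivery!",
    "Delivery is slow in my area.",
]

terms = top_terms(comments, n=2)
print(terms)
```

For these three comments the most frequent terms are "delivery" (three occurrences) and
"fresh" (two), which hints at the kind of trend a marketing team could follow up on.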