MongoDB Database Analysis Project for Big Data - MIT-BDS-02

Added on 2022/12/23

AI Summary

This project explores the use of MongoDB, a NoSQL database, for analyzing the 'GameData.xlsx' dataset. It describes the structure of MongoDB (databases, collections, and documents) and demonstrates how to import and query the dataset. It covers creating a database named 'dbase_games', inserting records, and running queries to gain insight into the dataset, such as finding free games, calculating average prices for different age groups, and identifying games not supported on Apple devices. The reflection highlights the flexibility of MongoDB, the ease of querying, and the differences between SQL and NoSQL databases. The assignment helped in understanding the limitations of SQL, the features of NoSQL, and how a schema can be defined and changed at runtime in NoSQL.

Big Data Database:
MongoDB

About MongoDB
• MongoDB is a widely used, open-source, document-oriented NoSQL database for big data.
• In MongoDB, information is stored as documents, and a group of documents is called a collection (roughly equivalent to a table in SQL).
• Schema-less: MongoDB stores information as documents, and the schema of one document can differ from that of another.
• No complex joins.
• Ease of scale-out: MongoDB can be scaled out as requirements grow.
• The structure of a single object is clear.
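
The schema-less point can be illustrated with a minimal sketch. The two documents, their field names, and the collection name schema_demo are hypothetical and are not taken from the project. The insert runs only inside mongosh, where a global db object exists; under plain Node.js that branch is skipped.

```javascript
// Two hypothetical documents destined for the same collection.
// Field names and values are illustrative, not taken from the dataset.
const docA = { Title: "Example Game A", price: 0 };
const docB = {
  Title: "Example Game B",
  price: 1.99,
  Genres: ["Puzzle", "Strategy"], // extra field that docA does not have
};

// Runs only inside mongosh, where a global `db` exists.
if (typeof db !== "undefined") {
  // No schema is declared beforehand; both shapes are accepted.
  db.schema_demo.insertMany([docA, docB]);
}

// Compare the key sets of the two documents.
const sameShape =
  JSON.stringify(Object.keys(docA).sort()) ===
  JSON.stringify(Object.keys(docB).sort());
console.log("same shape:", sameShape); // prints "same shape: false"
```

In a SQL table, both rows would need the same columns; here the second document simply carries an additional array field.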

Dataset from Internet
• The dataset selected from the internet is “GameData.xlsx”.
• It contains details about games, such as their developers, release dates, released versions, supported devices, price, and genre.
• The dataset has 33 columns in total.
• The dataset maps poorly onto a rigid SQL schema, so a NoSQL database is the better choice for it.
• It was chosen because designing its complex structure is challenging.

Dataset Columns

Processing in the Database
• MongoDB uses a JSON-like document structure.
• The database is created with the name “dbase_games”.
• Inside it, a collection named games is created, with an index on Gameid.
• Genres is stored as an array so that a game can have any number of genres (in the dataset there are separate columns for genres).
• Similarly, Language, SupportedDevices, Developers and LatestVersion are stored as nested documents.
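
The steps above can be sketched as follows. This is only an illustration under assumptions: the flat column names (Genre1, DeveloperName, LatestVersionDate and so on) and the inner field names of the nested documents are hypothetical, since the actual spreadsheet layout is not reproduced here. Only dbase_games, games, Gameid, Genres, Developers and LatestVersion come from the description. The database calls run only inside mongosh.

```javascript
// Hypothetical flat spreadsheet row; the column names are assumptions.
const row = {
  Gameid: 1,
  Title: "Example Game",
  Genre1: "Puzzle",
  Genre2: "Strategy",
  Genre3: "",
  DeveloperName: "Example Studio",
  LatestVersion: "1.2",
  LatestVersionDate: "2020-01-15",
};

// Collapse the separate genre columns into one array (dropping blanks)
// and group related columns into nested documents.
function toDocument(r) {
  return {
    Gameid: r.Gameid,
    Title: r.Title,
    Genres: [r.Genre1, r.Genre2, r.Genre3].filter((g) => g),
    Developers: { Name: r.DeveloperName },
    LatestVersion: { Number: r.LatestVersion, ReleasedOn: r.LatestVersionDate },
  };
}

const doc = toDocument(row);
console.log(JSON.stringify(doc, null, 2));

// Runs only inside mongosh, where a global `db` exists.
if (typeof db !== "undefined") {
  const gamesDb = db.getSiblingDB("dbase_games"); // created on first write
  gamesDb.games.createIndex({ Gameid: 1 });       // ascending index on Gameid
  gamesDb.games.insertOne(doc);
}
```

Because Genres is an array, a game with one genre and a game with five genres fit the same field without empty columns.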

Structure to Insert in database

Records inserted

Insight on the dataset
• The first step in gaining insight into the dataset is to check its records and select a few columns, here listing the title of every game in the database. The command used is “find”, with a projection that sets the Title column to 1 (display) and Id to 0 (hide).
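
A minimal sketch of such a projection is shown here. It assumes that the hidden Id is MongoDB's automatic _id field; the sample records are made up. The find call runs only inside mongosh, and the last lines mimic the result shape in plain JavaScript.

```javascript
// Projection: 1 includes a field, 0 excludes it.
// Assumption: the hidden "Id" is MongoDB's automatic _id field.
const titleFilter = {}; // an empty filter matches every document
const titleProjection = { Title: 1, _id: 0 };

// Runs only inside mongosh, where a global `db` exists.
if (typeof db !== "undefined") {
  db.getSiblingDB("dbase_games")
    .games.find(titleFilter, titleProjection)
    .forEach(printjson);
}

// Plain-JavaScript illustration of the same projection on made-up records.
const sampleGames = [
  { _id: 1, Title: "Example Game A", price: 0 },
  { _id: 2, Title: "Example Game B", price: 1.99 },
];
const titlesOnly = sampleGames.map((g) => ({ Title: g.Title }));
console.log(titlesOnly);
```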

Insight Continues.
• To find all games that are free of charge, the query fetches every record whose price equals 0. There are 8 free games. The filter is applied on the “price” column using the “$eq” operator.
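
A filter of this kind might look like the sketch below. The field name price follows the column name quoted above, although its exact casing in the collection is an assumption, and the sample catalog is made up. The database calls run only inside mongosh.

```javascript
// Filter: documents whose price field equals 0.
const freeFilter = { price: { $eq: 0 } };

// Runs only inside mongosh, where a global `db` exists.
if (typeof db !== "undefined") {
  const games = db.getSiblingDB("dbase_games").games;
  games.find(freeFilter).forEach(printjson);
  console.log("free games:", games.countDocuments(freeFilter));
}

// Plain-JavaScript illustration of the same condition on made-up records.
const catalog = [
  { Title: "Example Game A", price: 0 },
  { Title: "Example Game B", price: 1.99 },
  { Title: "Example Game C", price: 0 },
];
const freeTitles = catalog.filter((g) => g.price === 0).map((g) => g.Title);
console.log(freeTitles);
```

For a plain equality test, the shorter filter { price: 0 } is equivalent to the $eq form.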

Insight Continues.
• This query finds the average price of games with a 4+ advisory and of games with a 12+ advisory. Games for kids cost around $2 on average, and games for teenagers cost more than those for kids.
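
One way to compute such averages is an aggregation pipeline, sketched here under assumptions: the advisory rating is taken to live in a field named Advisory with values such as "4+" and "12+", and price is assumed to be numeric. Neither the field name nor the sample records come from the dataset. The aggregate call runs only inside mongosh, and the loop mirrors the two pipeline stages in plain JavaScript.

```javascript
// Assumption: the rating is stored in a field named "Advisory".
const avgPipeline = [
  { $match: { Advisory: { $in: ["4+", "12+"] } } },               // keep two ratings
  { $group: { _id: "$Advisory", avgPrice: { $avg: "$price" } } }, // average per rating
];

// Runs only inside mongosh, where a global `db` exists.
if (typeof db !== "undefined") {
  db.getSiblingDB("dbase_games").games.aggregate(avgPipeline).forEach(printjson);
}

// Plain-JavaScript illustration of the same two stages on made-up records.
const rated = [
  { Advisory: "4+", price: 0 },
  { Advisory: "4+", price: 2 },
  { Advisory: "12+", price: 4 },
  { Advisory: "17+", price: 9 },
];
const totals = {};
for (const g of rated) {
  if (g.Advisory !== "4+" && g.Advisory !== "12+") continue; // $match step
  if (!totals[g.Advisory]) totals[g.Advisory] = { sum: 0, n: 0 };
  totals[g.Advisory].sum += g.price; // $group step: accumulate per rating
  totals[g.Advisory].n += 1;
}
const averages = {};
for (const key of Object.keys(totals)) {
  averages[key] = totals[key].sum / totals[key].n;
}
console.log(averages);
```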