This article discusses the basics of Big Data, NoSQL databases, and MapReduce. It explains why traditional relational databases are not effective for storing Big Data and introduces NoSQL databases as an alternative. It also describes the concept of MapReduce and how it can be used in Big Data processing.
ITECH 2201 Cloud Computing
School of Science, Information Technology & Engineering
Workbook for Week 6 (Big Data)

Please note: All efforts were taken to ensure the given web links are accessible. However, if they are broken, please use any appropriate video/article and refer to it in your answer.

Part A (4 Marks)

Exercise 1: Data Science (1 mark)
Read the article at http://datascience.berkeley.edu/about/what-is-data-science/ and answer the following:

What is Data Science?
Data science is the study of what information represents, where it comes from and how it can be turned into a valuable resource for creating IT strategies and businesses. Mining huge amounts of structured and unstructured data can identify patterns that help businesses achieve cost efficiencies, competitive advantage and new market opportunities.

According to IBM estimation, what is the percent of the data in the world today that has been created in the past two years?
According to IBM's estimation, 90 percent of the data in the world today has been created in the past two years.

What is the value of petabyte storage?
A petabyte is 10^15 bytes, that is, 1,000 terabytes or 1,000,000 gigabytes.
For each course, both foundation and advanced, that you find at http://datascience.berkeley.edu/academics/curriculum/, briefly state (in 2 to 3 lines) what they offer, based on the given course description as well as the video. The purpose of this question is to understand the different streams available in Data Science.
In the foundation courses, students with a strong object-oriented programming (OOP) background complete 12 units of coursework, while students less experienced in OOP complete 15 units. The foundation coursework includes Python for Data Science, Research Design and Applications for Data and Analysis, Statistics for Data Science, Fundamentals of Data Engineering, and Applied Machine Learning.
The advanced courses include Experiments and Causality; Behind the Data: Human Values; Scaling Up Really Big Data; Statistical Methods for Discrete Response, Time Series and Panel Data; Machine Learning at Scale; Natural Language Processing with Deep Learning; and Data Visualization.

Exercise 2: Characteristics of Big Data (2 marks)
Read the following research paper from the IEEE Xplore Digital Library:
Ali-ud-din Khan, M.; Uddin, M.F.; Gupta, N., "Seven V's of Big Data: understanding Big Data to extract value," American Society for Engineering Education (ASEE Zone 1), 2014 Zone 1 Conference, pp. 1-5, 3-5 April 2014
and answer the following questions.

Summarise the motivation of the author (in one paragraph)
The author's motivation is the simple fact that big data is now everywhere in our lives and is the solution to many problems present in today's industries. Big data provides the raw material for building the technology of the future. It is used in every aspect of our lives, from small and large businesses to film making, law enforcement and entertainment, and by large corporations such as Amazon, Facebook and Google. To use big data to its full advantage, the web needs to be chosen as the data pool, since most people nowadays access the web and mobile apps most of the time. The largest generator of data is probably Google, which has changed the market scenario by introducing big-data technologies such as MapReduce, Hadoop and Google BigTable. The author stresses that big data will help revolutionise other fields such as biological research, politics, sustainability, environmental research, finance and education.

What are the 7 v's mentioned in the paper? Briefly describe each V in one paragraph.
The 7 V's mentioned in the paper are volume, velocity, variety, veracity, validity, volatility and value. Volume refers to the fact that big data is created from many sources, such as natural disasters, weather forecasting, crime reports, space images, medical and research studies, networking, and social, video, audio and text content; this volume of data can be extracted from social media, GPS trails, government documents, telemetry, web pages and so on. The second aspect is velocity: big data needs to be transferred and processed at an adequate speed, the receiving system must have infrastructure capable of handling the data, and the feedback loop from input to decision should be fast. The third aspect is variety: big data comes from different sources and can take the form of names, images, text, video and audio, with users uploading data from different browsers to different clouds. The fourth V is veracity, which concerns the trustworthiness and accuracy of the data. Next come validity, meaning whether the data is correct and applicable for its intended use, volatility, meaning how long the data remains valid and should be stored, and value, meaning the usefulness that can ultimately be extracted from the data.

Explore the author's future work by using the reference [4] in the research paper. Summarise your understanding how Big Data can improve the healthcare sector.
Big data can improve healthcare by:
Personalised treatments and medicines
Better treatment
Preventing malicious behaviour
Exercise 3: Big Data Platform (1 mark)
In order to build a big data platform, one has to acquire, organize and analyse the big data. Go through the following links and answer the questions that follow the links:
− http://www.infochimps.com/infochimps-cloud/how-it-works/
− http://www.youtube.com/watch?v=TfuhuA_uaho
− http://www.youtube.com/watch?v=IC6jVRO2Hq4
− http://www.youtube.com/watch?v=2yf_jrBhz5w
Please note: You are encouraged to watch all the videos in the series from Oracle.

How to acquire big data for enterprises and how it can be used?
Big data is being used to improve operational efficiency, and the ability to make sound decisions based on the very latest up-to-the-minute information is rapidly becoming the norm. The main objective of big data is to enable organisations to make more informed business decisions by allowing data scientists, predictive modellers and other analytics professionals to analyse large volumes of transaction data, as well as other forms of data that conventional business intelligence programs may leave untapped.

How to organize and handle the big data?
To organize and handle big data efficiently, the first step is to break the information down into datasets and reduce the amount of data to be managed. Next, use the power of virtualization technology (Baldini et al. 2016). Organisations should virtualize this unique dataset so that multiple applications can reuse the same data footprint, and so that the smaller footprint can be stored on any vendor-independent storage device. Virtualization is the tool organisations can use to tackle the big data management challenge: by reducing the data, virtualizing its reuse and storage, and centralising the management of the dataset, big data is ultimately turned into small data and managed like virtual data.
What are the analyses that can be done using big data?
Several kinds of analysis can be conducted with the help of big data. Financial institutions use big data analytics to analyse risks such as anti-money-laundering, fraud mitigation and know-your-customer initiatives. Media and entertainment companies need to deliver real-time content to meet the growing demands of customers across many formats and a variety of devices, such as billboards, TV, YouTube and more; their main initiative is to use big data to deliver real-time content across different media, with Wimbledon sentiment analysis, Amazon Prime and Spotify as live examples. The healthcare industry is in greatest need of big data analytics: it holds a huge amount of data, from blood test results to prescriptions and medical records, but due to a lack of proper analysis the health sector has historically failed to use this data to control costs and obtain medical benefits. Humedica, Obamacare and Cerner are examples of such initiatives.

Part B (4 Marks)
Part B answers should be based on well-cited articles/videos; name the references used in your answer. For more information read the guidelines as given in Assignment 1.

Exercise 4: Big Data Products (1 mark)
Google is a master at creating data products. Below are a few examples from Google. Describe the below products and explain how the large-scale data is used effectively in these products.

a. Google's PageRank
PageRank is what Google uses to decide the importance of a web page. It is one of many factors used to determine which pages appear in search results. PageRank tries to measure a web page's significance, and it does not stop at the popularity of the link: it also estimates the importance of the page that contains the link. Pages with higher PageRank have more weight when "voting" with their links than pages with lower PageRank. It also accounts for the number of links on the page casting the "vote": pages with more outgoing links contribute less value per link.
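The voting idea can be illustrated with a short power-iteration sketch. This is a minimal illustration of the algorithm, not Google's actual implementation; the tiny link graph, the damping factor of 0.85 and the convergence tolerance are assumptions made up for the example.

# Minimal PageRank power-iteration sketch (illustrative, not Google's code).
# The link graph, damping factor and tolerance are assumed for the example.

def pagerank(links, damping=0.85, tol=1e-6):
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    while True:
        new_rank = {}
        for p in pages:
            # A page's rank is a share of the rank of every page linking to it,
            # divided by how many outgoing links the voting page has.
            incoming = sum(rank[q] / len(links[q]) for q in pages if p in links[q])
            new_rank[p] = (1 - damping) / len(pages) + damping * incoming
        if max(abs(new_rank[p] - rank[p]) for p in pages) < tol:
            return new_rank
        rank = new_rank

graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
print(pagerank(graph))

Note how a page with two outgoing links (A above) passes only half of its rank through each link, which is exactly the "pages with more links have less value per vote" behaviour described above.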
b. Google's Spell Checker
Google's spell check is a very old feature that Google has been continually improving. Google uses both its web index and its query-processing algorithms to decide whether the word you type needs refinement. Sometimes Google does not give the original query any chance at all: it searches directly for the "correct" spelling. Such queries may have the lowest QR.

c. Google's Flu Trends
Google Flu Trends is now no longer publishing current estimates. The service, operated by Google, provided estimates of influenza activity for more than 25 countries. By aggregating Google Search queries, it attempted to make accurate estimates about flu activity.

d. Google's Trends
Google Trends is a trend-searching application that shows how frequently a given search term is entered into Google's search engine relative to the site's total search volume over a given period of time. It can be used for comparative keyword research and to discover event-triggered spikes in keyword volumes. It provides keyword-related data including a search volume index and geographical information about search engine users.

Like Google, Facebook and LinkedIn also use large-scale data effectively. How?
LinkedIn, Facebook and Google use large-scale data by analysing user behaviour and interactions. The large-scale data is then analysed to find patterns.

Exercise 5: Big Data Tools (2 marks)
Briefly explain why a traditional relational database (RDBMS) is not effectively used to store big data?
An RDBMS is not usually used for storing big data for the following reasons. First, the size of the data has expanded hugely, into the range of petabytes, and an RDBMS finds it challenging to handle such huge data volumes; to address this, more CPUs or more memory must be added to the database management system to scale up vertically. Second, most of the data arrives in a semi-structured or unstructured format from social media, audio, video, email and messages, and this is outside the domain of an RDBMS because relational databases cannot categorise unstructured data; they are designed and structured to suit structured data such as weblog, sensor and financial data. Third, big data is generated at high velocity, and an RDBMS is lacking in velocity because it is designed for steady data retention rather than rapid growth. Even if an RDBMS is used to manage and store big data, it becomes very expensive. Thus, the inability of relational databases to handle big data led to the rise of new technologies.

What is NoSQL Database?
A NoSQL database provides a mechanism for the storage and retrieval of data that is modelled in means other than the tabular relations used in relational databases. It is an approach to database design that can accommodate a wide variety of data models, including graph, columnar, document and key-value formats.

Name and briefly describe at least 5 NoSQL Databases
Five kinds of NoSQL database are as follows:
Wide-column stores (Cassandra, HBase): data is stored in columns instead of rows as in a traditional SQL system. Any number of columns (and therefore many different types of data) can be grouped or aggregated as required for queries or data views.
Document databases (MongoDB, CouchDB): data is stored as JSON-like structures or documents, where the data can be anything from numbers to strings to free text. There is no inherent need to specify what fields, if any, a document will contain.
Multi-model databases (Cosmos DB, OrientDB): these support several of the other data models within a single engine.
Graph databases (Neo4j): data is represented as a network or graph of entities and their relationships, with every node in the graph a free-form chunk of data.
Key-value stores (Riak, Redis): values, from simple numbers or strings to complex JSON documents, are accessed in the database by means of keys.
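As a concrete illustration of the document model described in the list above, here is a minimal sketch using MongoDB's Python driver, pymongo. The local connection URL, the database name and the collection name are assumptions made up for the example; it presumes a MongoDB instance is running locally.

# Minimal document-store sketch using pymongo (assumed local MongoDB instance;
# database and collection names are made up for the example).
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
db = client["workbook_demo"]

# Documents in the same collection need not share a schema.
db.people.insert_one({"name": "Alice", "age": 30, "skills": ["python", "sql"]})
db.people.insert_one({"name": "Bob", "city": "Ballarat"})  # different fields

# Query by field value; returns the matching JSON-like document.
print(db.people.find_one({"name": "Alice"}))

The two inserted documents deliberately have different fields, showing the schema flexibility that distinguishes document stores from relational tables.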
What is MapReduce and how it works?
MapReduce is a programming paradigm that allows for massive scalability across hundreds or thousands of servers in a Hadoop cluster. It works through shuffling, which is the process by which intermediate data from the mappers is transferred to zero, one or more reducers. Each reducer receives one or more keys and their associated values, depending on the number of reducers (for a balanced load). The values associated with each key are then locally sorted.

Briefly describe some notable MapReduce products (at least 5)
Some products and applications of MapReduce are as follows:
Distributed grep is used to search for a given pattern in a large number of files. For example, a web administrator can use distributed grep to search web server logs in order to find the most requested pages that match a given pattern.
With the technological advances in location-based services, there has been a huge surge in the amount of geospatial data. Geospatial queries (nearest-neighbour queries and reverse nearest-neighbour queries) consume a lot of computational resources, and it is observed that their processing is inherently parallelizable.
Digital Elevation Models (DEMs) are digital or 3D representations of the landscape, where every (X, Y) position is represented by a single elevation value. DEMs are also referred to as Digital Terrain Models (DTM) or Digital Surface Models (DSM). A DEM can be represented as a raster (a matrix of squares) or as a triangular irregular network (TIN), and can be generated from remotely sensed (satellite) or directly surveyed elevation data.
Count of URL access frequency: the map function processes logs of web page requests and outputs <URL, 1>. The reduce function adds all values for the same URL and produces a <URL, total count> pair.
Inverted index: the map function parses each document and emits a sequence of <word, document ID> pairs; the reduce function collects all pairs for a given word and emits a <word, list(document ID)> pair.
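The URL-access-frequency example above can be sketched in a few lines to show the map, shuffle and reduce stages. This is a single-process illustration of the paradigm, not Hadoop code; the sample log lines are made up for the example.

# Single-process sketch of the MapReduce flow for URL access counting.
# Illustrative only: real MapReduce distributes map and reduce tasks across a cluster.
from collections import defaultdict

log_lines = ["/home", "/about", "/home", "/home", "/contact"]  # assumed sample log

# Map: emit a <URL, 1> pair for every request.
mapped = [(url, 1) for url in log_lines]

# Shuffle: group intermediate values by key (done by the framework in Hadoop).
groups = defaultdict(list)
for url, count in mapped:
    groups[url].append(count)

# Reduce: sum the values for each URL to get <URL, total count>.
totals = {url: sum(counts) for url, counts in groups.items()}
print(totals)  # {'/home': 3, '/about': 1, '/contact': 1}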
Amazon's S3 service lets you store large chunks of data on an online service. List some 5 features for Amazon's S3 service.
The features are:
Unmatched durability
Comprehensive security
In-place query
Flexible management
Vendor support
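A minimal sketch of storing and retrieving an object with S3's Python SDK, boto3. The bucket name and object key are placeholders made up for the example, and the calls presume AWS credentials are already configured in the environment.

# Minimal S3 usage sketch with boto3 (assumes configured AWS credentials;
# bucket name and object key are made up for the example).
import boto3

s3 = boto3.client("s3")

# Store a chunk of data under a key in a bucket.
s3.put_object(Bucket="my-example-bucket", Key="week6/notes.txt",
              Body=b"Big data workbook notes")

# Retrieve it again.
obj = s3.get_object(Bucket="my-example-bucket", Key="week6/notes.txt")
print(obj["Body"].read())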
Getting concise, valuable information from a sea of data can be challenging. We need statistical analysis tools to deal with Big Data. Name and describe some (at least 3) statistical analysis tools.
Three statistical tools are described as follows:
MS Excel brings a wide variety of tools for visualisation and statistical analysis of your data. Data import from text files is as simple as generating summary statistics and customisable charts and figures.
Advantages:
• It offers a great deal of control and flexibility.
• It is widely available and relatively inexpensive for students and private entities.
• It does not require learning new techniques for manipulating data and drawing charts.
MATLAB is a general analysis framework, which requires programming skills to a much greater degree than Excel or SPSS.
Advantages:
• MATLAB offers specialised toolboxes for the analysis of data coming from eye tracking, EEG, ECG, EMG and so on, and for facial expression analysis.
• In MATLAB, analysis and processing steps and results can be completely customised.
• It offers academic licences at a reduced cost.
SPSS is statistical analysis software, covering both statistical and non-statistical test functionality. SPSS plots are commonly found in academic papers and business research reports.
Advantages:
• SPSS has efficient data management and offers a great deal of control over data organisation.
• It offers an extensive range of techniques, charts and graphs.
• SPSS keeps the results separate from the data itself, producing well-organised reports and worksheets containing the results.
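Alongside these GUI packages, the same kind of summary statistics can also be scripted. A minimal sketch using Python's standard statistics module, with sample values made up for the example:

# Minimal summary-statistics sketch with Python's standard library
# (sample values are made up for the example).
import statistics

samples = [4.1, 5.0, 4.7, 5.3, 4.9, 5.1]

print("mean:  ", statistics.mean(samples))
print("median:", statistics.median(samples))
print("stdev: ", statistics.stdev(samples))  # sample standard deviation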
Exercise 6: Big Data Application (1 mark)
Name 3 industries that should use Big Data
Industries such as web search, social media and microblogging, represented by Google, Facebook and Twitter, should use big data.

From your lecture and also based on the below given video link:
https://www.youtube.com/watch?v=_sXkTSiAe-A
Write a paragraph about memory virtualization.
Memory virtualization allows networked servers to share a pool of memory to overcome physical memory limitations, a common bottleneck in software performance. The memory pool may be accessed at the application level or the operating-system level. At the application level, the pool is accessed through an API or as a networked file system to create a high-speed shared memory cache. At the operating-system level, a page cache can use the pool as a large memory resource that is significantly faster than local or networked storage. Memory virtualization implementations are distinguished from shared memory systems.

Watch the below mentioned YouTube link:
https://www.youtube.com/watch?v=wTcxRObq738
Based on the video answer the following questions:

What is RAID 0?
Disk striping, or RAID 0, is a technique that splits up a file and spreads the data across all the disk drives in a RAID group. The advantage of RAID 0 is improved performance: since striping spreads data across more physical drives, multiple disks can access the contents of a file, allowing writes and reads to be completed more quickly. A disadvantage of RAID 0 is that it has no parity: should a drive fail, there is no redundancy and all data would be lost.

Describe Striping, Mirroring and Parity.
Disk striping is a technique in which multiple smaller disks act as a single large disk. The process divides large data into data blocks and spreads them across multiple storage devices. Disk striping provides the advantage of extremely large databases or a large single-table tablespace using only a single logical device.
Data mirroring is the ongoing process of copying data from one location to a local or remote storage medium. In short, a mirror is an exact copy of a dataset. Most commonly, it is used when multiple exact copies of data are required in multiple locations.
A parity drive is a hard drive used in a RAID array to provide fault tolerance. For example, RAID 3 uses it to create a system that is both fault tolerant and, thanks to data striping, fast. The XOR of all of the data drives in the RAID array is written to the parity drive.
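The parity idea can be demonstrated in a few lines: XOR the blocks of the data drives to form the parity block, then rebuild any single lost block from the survivors. The byte values below are made up for the example.

# Minimal RAID-parity sketch: the parity block is the XOR of the data blocks,
# so any single lost block can be rebuilt from the others (values are made up).
from functools import reduce

drive_a = bytes([0x10, 0x22, 0x35])
drive_b = bytes([0x0F, 0x11, 0x80])
drive_c = bytes([0xA5, 0x5A, 0x01])

def xor_blocks(*blocks):
    return bytes(reduce(lambda x, y: x ^ y, byte_tuple) for byte_tuple in zip(*blocks))

parity = xor_blocks(drive_a, drive_b, drive_c)

# Simulate losing drive_b and reconstructing it from the survivors plus parity.
rebuilt_b = xor_blocks(drive_a, drive_c, parity)
assert rebuilt_b == drive_b
print("rebuilt:", rebuilt_b.hex())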
Exercise 2: Storage Design (2 marks)
Summarize storage repository design based on the following video link:
https://www.youtube.com/watch?v=eVQH7C3nulY
In the mentioned video, a storage repository on a LUN is connected to a clustered server pool, as a result of the OCFS2 file system it employs. Thus, a server pool must exist with clustering enabled, and at least one server must be present in the clustered environment. Local server storage with a repository also belongs in this category, since local disks are always discovered as LUNs.

Below YouTube link describes the Intelligent Storage System:
https://www.youtube.com/watch?v=raTIRsMi7zk
Based on the watched video answer the following questions:

What is ISS?
Storage arrays that include RAID-rich arrays providing highly optimised I/O processing capabilities are generally referred to as Intelligent Storage Arrays or Intelligent Storage Systems. These storage systems have the capability to meet the requirements of today's I/O-intensive, cutting-edge applications, which demand high levels of performance, availability, security and scalability. Accordingly, to meet the requirements of these applications, many vendors of intelligent storage systems now support SSDs, deduplication, compression and encryption.

What are the 4 main components of the ISS?
The front end provides the interface between the host and the storage. It consists of two parts: front-end ports and front-end controllers. The front-end ports enable hosts to connect to the intelligent storage system, and each front-end port has processing logic that executes the appropriate transport protocol, such as SCSI, iSCSI or Fibre Channel, for storage connections.
Cache is a critical component that improves I/O performance in an intelligent storage system. It is semiconductor memory where data is placed temporarily to reduce the time required to service I/O requests from the host. Cache improves storage system performance by isolating hosts from the mechanical delays associated with physical disks, which are the slowest components of an intelligent storage system.
The back end provides an interface between the cache and the physical disks. It consists of two parts: back-end ports and back-end controllers. The back end controls data transfers between the cache and the physical disks: from the cache, data is sent to the back end and then routed to the destination disk.
The fourth component is the physical disks themselves.

How cache works in ISS?
ISS systems control the allocation, management and use of storage resources for faster data processing. These storage systems run with large amounts of cached memory and sophisticated algorithms to meet the I/O requests of even the most critical applications. Cache used in storage systems makes it quicker to retrieve data after the first lookup. Each write to cache memory is stored in two different locations on two memory cards and consists of the tag RAM and the data store: the tag RAM tracks the location of the data on the physical disks, while the data store holds the data that is being written or read. The copy serves as a backup if the cache fails in a single location. These techniques speed up I/O processing and greatly reduce the number of mechanical disk operations.
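To make the caching idea concrete, here is a minimal sketch of a read cache sitting in front of a slow "disk", using a small least-recently-used (LRU) eviction policy. The fake disk contents and the cache capacity are assumptions made up for the example; real ISS cache algorithms are considerably more sophisticated.

# Minimal read-cache sketch: an LRU cache in front of a slow "disk" lookup.
# The fake disk contents and the cache capacity are made up for the example.
from collections import OrderedDict

disk = {"block1": b"alpha", "block2": b"beta", "block3": b"gamma"}  # pretend slow storage

class LRUCache:
    def __init__(self, capacity=2):
        self.capacity = capacity
        self.entries = OrderedDict()

    def read(self, key):
        if key in self.entries:               # cache hit: no disk access needed
            self.entries.move_to_end(key)
            return self.entries[key]
        value = disk[key]                     # cache miss: go to the slow disk
        self.entries[key] = value
        if len(self.entries) > self.capacity: # evict the least recently used block
            self.entries.popitem(last=False)
        return value

cache = LRUCache()
cache.read("block1"); cache.read("block2")
cache.read("block1")   # hit: served from cache, no mechanical disk operation
cache.read("block3")   # miss: evicts block2, the least recently used
print(list(cache.entries))  # ['block1', 'block3']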
Storage Area Network (SAN) and Network Attached Storage (NAS) are widely used concepts in the data storage arena. The following YouTube video links give detailed descriptions of these concepts:
− http://www.youtube.com/watch?v=csdJFazj3h0
− http://www.youtube.com/watch?v=vdf6CvGQZrk
− https://www.youtube.com/watch?v=KxdfGcynfJ0
− https://www.youtube.com/watch?v=4RsLUTJ_Qtk
Based on the watched videos answer the following questions:

Describe NAS and SAN briefly using diagrams?
NAS is a dedicated server used for file storage and sharing. It is, in effect, a hard drive attached to a network, used for storage and accessed through an assigned network address. It works like a server for file sharing but does not permit other services (such as email or authentication). It allows the addition of more storage space to available networks even when the system is shut down during maintenance.
SAN is a high-speed network that provides block-level network access to storage. SANs are typically composed of switches, hosts, storage elements and storage devices that are interconnected using a variety of technologies, topologies and protocols. SANs may also span multiple sites.
What are the advantages of SAN over NAS?
A SAN is a dedicated network of storage devices (which can include tape drives, hard drives and more) all working together to provide excellent block-level storage, whereas a NAS is a single device/server/computing appliance sharing its own storage over the network.

What are two common NAS file sharing protocols? How they are different from each other?
Two common NAS file sharing protocols are NFS and AFP. The Network File System (NFS) is a client/server application that lets a computer user view, and optionally store and update, files on a remote computer as though they were on the user's own computer; the protocol is one of several distributed file system standards for NAS. AFP is the native file and printer sharing protocol for Macs, and it supports many unique Mac attributes that are not supported by other protocols, so for the best performance and 100% compatibility, AFP should be used.

Part B (3 Marks)

Exercise 3: Storage Design (1 Mark)
Design Storage Solution for New Application
Scenario
An organization is deploying a new business application in their environment. The new application requires 1 TB of storage space for business and application data. During peak workload, the application is expected to generate 4900 IOPS (I/Os per second) with a typical I/O data block size of 4 KB. The vendor's available disk drive option is a 15,000 rpm drive with 100 GB capacity. Other specifications of the drives are: average seek time = 5 milliseconds and data transfer rate = 40 MB/sec. You are required to calculate the number of disk drives that can meet both the capacity and performance requirements of the application.
Hint: In order to calculate the IOPS from average seek time, data transfer rate, disk rpm and data block size, refer to slide 28 in the week 6 lecture slides. Once you have the IOPS, refer to slide 29 in week 6 to calculate the required number of disks.
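A worked sketch of the calculation, assuming the standard disk service-time formula (average seek time + rotational latency + block transfer time) that the lecture slides appear to refer to; decimal units (1 TB = 1000 GB) are assumed.

# Worked sketch of the disk-sizing calculation, assuming the usual formula:
# service time = average seek time + rotational latency + block transfer time.
import math

seek_ms = 5.0                          # average seek time (given)
rpm = 15000
latency_ms = 0.5 * 60000 / rpm         # half a rotation = 2.0 ms
transfer_ms = 4 / (40 * 1000) * 1000   # 4 KB at 40 MB/s = 0.1 ms

service_ms = seek_ms + latency_ms + transfer_ms   # 7.1 ms per I/O
iops_per_disk = 1000 / service_ms                 # ~140 IOPS at full utilisation

disks_for_performance = math.ceil(4900 / iops_per_disk)  # 35 drives
disks_for_capacity = math.ceil(1000 / 100)               # 1 TB over 100 GB drives = 10

print(disks_for_performance, disks_for_capacity)
print("required:", max(disks_for_performance, disks_for_capacity))

Under these assumptions, each drive delivers roughly 140 IOPS, so performance needs about 35 drives while capacity needs only 10; the application therefore requires 35 drives, the larger of the two.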
Exercise 4: Storage Evolution (2 Marks)
Watch the following videos for Fibre Channel over Ethernet and answer the questions that follow:
− http://www.youtube.com/watch?v=hSFyf-rmjA8
− http://www.youtube.com/watch?v=iCfJCzfNLrw

What is FCoE and why we need FCoE?
Fibre Channel over Ethernet, or FCoE, is a protocol that enables Fibre Channel communication to run directly over Ethernet. It makes it possible to move Fibre Channel traffic across existing high-speed Ethernet infrastructure and converges storage and IP protocols onto a single cable, transport and interface. Fibre Channel supports high-speed data connections between computing devices that interconnect servers with shared storage devices, and between storage controllers and drives. FCoE shares Fibre Channel and Ethernet traffic on the same physical cable, or gives companies the option to separate Fibre Channel and Ethernet traffic on the same hardware.

In your opinion how FCoE is cost effective than traditional connection – give brief explanation.
FCoE switches can deliver adapter-to-switch-to-adapter latency of under 10 microseconds and port-to-port latency of roughly 3 microseconds, independent of packet size. They include ports at the rear for consistency with data-centre servers, permitting shorter and simpler cable runs within racks and reducing the cost of cabling and copper.

You have read and answered about SAN in Part A. Based on your understanding and with some research effort, answer the following questions:
What is a Virtual SAN?
Virtual SAN is software-defined storage from VMware that enables companies to pool their storage capabilities and to instantly and automatically provision virtual machine storage via simple policies that are driven by the virtual machine.

What is IP SAN protocols and FibreChannel over IP (FCIP)?
An IP SAN is a dedicated storage area network that enables multiple servers to access pools of shared block storage devices using storage protocols that rely on the Internet Engineering Task Force's standard Internet Protocol suite.
FCIP is a protocol used to connect Fibre Channel switches over an IP network, enabling the interconnection of remote sites. From the fabric's point of view, an FCIP link is an inter-switch link (ISL) that transports FC control and data frames between switches.

Watch the below video about Introduction to Object-based and Unified Storage:
https://www.youtube.com/watch?v=kl9X6mzEWO4
Choose the correct answer from the following questions:

What is an advantage of a flat address space over a hierarchical address space?
a. Highly scalable with minimal impact on performance
b. Provides access to data, based on retention policies
c. Provides access to block, file, and object with same interface
d. Consumes less bandwidth on network while accessing data

What is a role of metadata service in an OSD node?
a. Responsible for storing data in the form of objects
b. Stores unique IDs generated for objects
c. Stores both objects and objects IDs
d. Controls functioning of storage devices
What is used to generate an object ID in a CAS system?
a. File metadata
b. Source and destination address
c. Binary representation of data
d. File system type and ownership

What accurately describes block I/O access in a unified storage?
a. I/O traverses NAS head and storage controller to disk
b. I/O traverses OSD node and storage controller to disk
c. I/O traverses storage controller to disk
d. I/O is directly sent to the disk

What accurately describes unified storage?
a. Provides block, file, and object-based access within one platform
b. Provides block and file storage access using objects
c. Supports block and file access using flat address space
d. Specialized storage device purposely built for archiving

What is Greenhouse effect?
The greenhouse effect raises the temperature of the Earth by trapping heat in our atmosphere. This keeps the temperature of the Earth higher than it would be if direct heat from the Sun were the only source of warming.

We are legally, ethically, and socially required to green our IT products, applications, services, and practices – is this statement true? Why?
Yes, it is true, because it is our duty to create a future for coming generations in which they can live happily and sustainably.

What is Green IT and what are the benefits of greening IT?
Green IT is the practice of environmentally sustainable computing. It aims to minimise the negative environmental impact of IT operations by designing, manufacturing, operating and disposing of computers and computer-related products in an environmentally friendly manner.

Exercise 2: Environmental Sustainability (0.5 Marks)
Read the article in the below link and answer the questions that follow:
http://www.computer.org/csdl/mags/it/2010/02/mit2010020004.html

According to the article how do you build a greener environment?
A greener environment can be built through power management; data centre design, layout and location; the use of biodegradable materials; regulatory compliance; green metrics; carbon footprint assessment tools and methodology; and environment-related risk mitigation.

Summarize the article in 150 words
The article discusses the significant developments improving the energy efficiency of computers: virtualization, data centre design and operation, and power-aware software. Nonetheless, there are several Green IT areas that demand further research and development: technology adoption, environmental impact assessment, standards and regulation, and harnessing IT for environmental sustainability. To build a greener environment, we must adapt or end some of our old and familiar ways of doing things. To comprehensively and effectively address IT's environmental impact, we must adopt a holistic approach and green the entire IT life cycle.

Exercise 3: Environmentally Sound Practices (1 Mark)
The questions in this exercise can be answered by doing internet search. Briefly explain the following terms – a paragraph for each term:

Power usage effectiveness (PUE) and its reciprocal
PUE, or Power Usage Effectiveness, is a measure used to determine the energy efficiency of a data centre. It is calculated as the ratio of the total energy entering the facility (the IT equipment plus cooling and other overheads) to the "useful" energy consumed by the IT equipment itself.

Data center efficiency (DCE)
DCE is a more practical way to assess energy consumption, analysing the effective use of energy by the existing IT hardware relative to the performance of that equipment.

Data center infrastructure efficiency (DCiE)
DCiE is a metric used to determine the energy efficiency of a data centre. It was introduced by the Green Grid, an industry group focused on data centre energy efficiency. It is calculated by dividing IT equipment power by total facility power, and is thus the reciprocal of PUE.
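The two metrics are simple reciprocal ratios, as the short sketch below shows; the facility and IT power figures are made up for the example.

# PUE and DCiE are reciprocal ratios; the power figures below are made up.
total_facility_kw = 1500.0   # IT equipment plus cooling, lighting, losses
it_equipment_kw = 1000.0

pue = total_facility_kw / it_equipment_kw    # 1.5: lower is better, 1.0 is ideal
dcie = it_equipment_kw / total_facility_kw   # ~0.67, i.e. 67% of power reaches IT

print(f"PUE = {pue:.2f}, DCiE = {dcie:.0%}")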
List 5 universities who offer a Green Computing course. You should name the university, the course name and a brief description about the course.
Universities providing courses on green computing include the following:
University of Hertfordshire – foundations and strategies, green computing evaluation
University of Cambridge – advanced IT, covering the benefits of green computing
UMass Amherst College of Engineering – computing architecture
University of Victoria – Green IT
Karlstad University – foundations and strategies

Exercise 4: Major Cloud APIs (1 Mark)
The following companies are the major cloud service providers: Amazon, GoGrid, Google, and Microsoft.
List and briefly describe (3 lines for each company) the Cloud APIs provided by the above major vendors.
Google Cloud Platform, offered by Google, is a suite of cloud computing services that runs on the same infrastructure that Google uses internally for its end-user products, such as Google Search and YouTube. Amazon Web Services (AWS) is a secure cloud services platform, offering compute power, database storage, content delivery and other functionality to help organisations scale and grow (Lu et al. 2013). The AWS Cloud provides a broad set of infrastructure services, such as computing power, storage options,
networking and databases, delivered as a utility. Microsoft Azure is a continually expanding set of cloud services that helps your organisation address its business challenges; it offers the freedom to build, manage and deploy applications on a massive, global network using your favourite tools and frameworks. GoGrid is a cloud infrastructure service hosting Linux and Windows virtual machines managed by a multi-server API.

Part B (3 Marks)

Exercise 1: Greening IT Standards and Regulations (0.5 Marks)
To design green computers and other IT hardware, the following standards and regulations are mainly used: EPEAT (www.epeat.net), the Energy Star 4.0 standard, and the Restriction of Hazardous Substances Directive (https://www.gov.uk/guidance/rohs-compliance-and-guidance). Use the links provided, with some internet search, to summarize each standard or regulation in 150 words.
The Electronic Product Environmental Assessment Tool (EPEAT) is an easy-to-use resource for purchasers, manufacturers, resellers and others wanting to find or promote electronic products with positive environmental attributes. It was developed with the EPA and is managed by the Green Electronics Council (GEC). GEC maintains EPEAT's website and product registry and also reports the environmental benefits resulting from the purchase of EPEAT-registered products.
Energy Star is a government-backed symbol for energy efficiency, providing simple and impartial information that consumers and businesses rely on to make well-informed decisions. Thousands of industrial, commercial, utility, state and local organisations, including 40 percent of the Fortune 500, rely on their partnership with the U.S. Environmental Protection Agency to deliver cost-saving energy efficiency solutions.
The RoHS Directive is a set of criteria devised by the European Union to regulate the use of toxic materials in electrical and electronic devices, systems and toys. The Directive has been effective since 1 July 2006.

Exercise 2: Green cloud computing (0.5 Marks)
Xiong, N.; Han, W.; Vandenberg, A., "Green cloud computing schemes based on networks: a survey," Communications, IET, vol. 6, no. 18, pp. 3294-3300, Dec. 18 2012
Most of the power consumption in data centres comes from computation processing, disk storage, network and cooling systems. Nowadays, new technologies and methods have been proposed to reduce energy cost in data centres. From the above paper, summarize (in 300 words) the recent work done in these fields.
Virtualization technology allows several virtual machines to be created on one physical server, reducing the amount of hardware in use and improving the utilisation of resources. Organisations can also outsource their computation needs to the cloud, eliminating the need to maintain their own computing infrastructure. Data centre power consumption and cooling are two of the biggest energy issues that confront IT organisations today; cooling systems consume nearly half of the electrical energy of data centres. Using hot-water cooling, chillers are no longer required year-round, which means data-centre energy consumption can be reduced by up to 50%. More attractively still, direct utilisation of the collected thermal energy becomes feasible, either through synergies with district heating or in specific industrial applications.

Exercise 3: Cloud API Functionalities (2 Marks)
List the functionalities that can be achieved by using the APIs mentioned in the following link:
https://code.google.com/p/sainsburys-nectar-api/
The list of functionalities is as follows:
Migration from one application to the next
Central database
Data security in case of system failure

What API is used in the following link and how it is used?
https://pypi.python.org/pypi/python-novaclient
The OpenStack Compute API is used in the given link. Through this API, the service provides massively scalable, on-demand, self-service access to compute resources. Depending on the deployment, those compute resources may be virtual machines, physical machines or containers.
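A minimal sketch of using python-novaclient to talk to the Compute API. The credentials, project name and endpoint URL are placeholders made up for the example, and the exact authentication arguments vary between OpenStack releases (newer releases prefer a keystoneauth session).

# Minimal python-novaclient sketch (credentials and endpoint are placeholders;
# authentication details vary between OpenStack releases).
from novaclient import client

nova = client.Client("2",                       # Compute API version
                     username="demo",
                     password="secret",
                     project_name="demo-project",
                     auth_url="http://controller:5000/v3")

# List available flavors and the running servers in the project.
for flavor in nova.flavors.list():
    print("flavor:", flavor.name)
for server in nova.servers.list():
    print("server:", server.name, server.status)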
OpenStack is an open-source collaborative software project which meets many cloud needs. The links below give extensive information about OpenStack:
https://support.rc.nectar.org.au/docs/openstack
http://docs.openstack.org/api/quick-start/content/
Write a report (1 page) about the OpenStack features and functionalities.
OpenStack is a set of software tools for building and managing cloud computing platforms for public and private clouds. Backed by some of the biggest companies in software hosting, as well as thousands of individual community members, many consider it to be the future of cloud computing. It is managed by the OpenStack Foundation, a non-profit that oversees both development and community building around the project.
OpenStack lets users deploy virtual machines and other instances that handle different tasks for managing a cloud environment on the fly. It makes horizontal scaling easy, which means that tasks that benefit from running concurrently can easily serve more or fewer users on the fly by just spinning up more instances. For example, an application that needs to communicate with a remote server may be able to divide the work of communicating with each user across many different instances, all communicating with one another but scaling quickly and easily as the application gains users.
Most importantly, OpenStack is open-source software, which means that anyone who chooses to can access the source code, make any changes or modifications they need, and freely share these changes back out to the community at large. It also means that OpenStack has the benefit of thousands of developers all over the world working in tandem to develop the strongest, most robust and most secure product that they can.
OpenStack is made up of many different moving parts. Because of its open nature, anyone can add additional components to OpenStack to help it meet their needs. OpenStack Image provides discovery, registration and delivery services for disk and server images. Stored images can be used as templates, and the service can also be used to store and catalogue an unlimited number of backups. The Image Service can store disk and server images in a variety of back-ends, including Swift. OpenStack Object Storage is a scalable, redundant storage system: objects and files are written to multiple disk drives spread throughout servers in the data centre, with the OpenStack software responsible for ensuring data replication.
References
Baldini, I., Castro, P., Cheng, P., Fink, S., Ishakian, V., Mitchell, N., ... & Suter, P. (2016, May). Cloud-native, event-based programming for mobile applications. In Proceedings of the International Conference on Mobile Software Engineering and Systems (pp. 287-288). ACM.
Lu, Q., Zhu, L., Bass, L., Xu, X., Li, Z., & Wada, H. (2013, June). Cloud API issues: an empirical study and impact. In Proceedings of the 9th International ACM SIGSOFT Conference on Quality of Software Architectures (pp. 23-32). ACM.