CT074-3-M-RELM: Reliability Management Plan and Manual

Added on  2023/06/10

This report, prepared for a Reliability Management assignment, addresses the case of XYZ Resources engaging ABC Tech for cloud service management. The assignment details the reliability requirements, including setting up primary and standby cloud servers, full backup systems, a disaster recovery unit, and a security control system with firewalls. It outlines the organization for reliability, emphasizing ABC Tech's responsibilities and the need for warranties and security protocols. The report describes the reliability activities, such as design analysis and testing procedures, and their timing within the project milestones. It also covers reliability management strategies for suppliers, including metrics, KPIs, and a supplier risk scorecard. The report details the standards, specifications, and internal procedures, including the reliability manual, cross-referencing with other plans like test, security, and maintainability. The manual emphasizes the importance of high-quality standards, backup systems, and security measures to prevent data loss and ensure operational continuity. The report also covers the firm's procedures to mitigate losses caused by the failure of operations caused by damages or attacks from harmful persons. The reliability procedures will be used in the development of the system to prevent failure and damage to the company’s property.
RELIABILITY MANAGEMENT 1
RELIABILITY MANAGEMENT
Name of student
Name of institution
Name of instructor
Date
1. A brief statement of reliability requirements
The project will deliver two cloud servers: a primary server and a standby
operations server. The system should have a full backup system to support
operations during a failure. Additionally, the development will produce a disaster
recovery unit and a security control system. The disaster recovery unit is necessary to
protect the system from damage and from failure to operate during human or natural
mishaps (Amstutz, 2014). The system will have tools and policies in place to provide
guidance in the event of a disaster, such as standard machines responsible for recovery.
Moreover, the system will include a security system, including firewalls, to prevent
access by unauthorized persons such as hackers (Amstutz, 2014).
During the project lifecycle, the developer will collect data to predict the
reliability of the software. The data analysis will aim at reducing the probability of
failure after the completion of each stage of the lifecycle. Moreover, the system will
undergo various test and maintenance procedures to minimize the probability of failure in
the final output. The probability of failure will be analyzed using failure mode and
effects analysis (FMEA). The FMEA approach identifies the features of the system that
could lead to failure (Barki, 2015). The system will undergo testing
procedures including reliability testing and maintainability testing. Reliability
testing will involve failure data collection, trend analysis and reporting.
Maintainability testing, on the other hand, includes procedures to
determine whether the system is maintainable. The tests cover the contractor’s requirements,
such as the existence of a maintenance system and a service level agreement for
conducting the activity.
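FMEA results are commonly prioritised with a risk priority number (RPN), the product of severity, occurrence and detection ratings. The sketch below shows that ranking step only. The failure modes and their 1-10 ratings are hypothetical placeholders, not figures from the project.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # 1 (negligible) to 10 (catastrophic)
    occurrence: int  # 1 (rare) to 10 (frequent)
    detection: int   # 1 (almost always detected) to 10 (undetectable)

    @property
    def rpn(self) -> int:
        # Conventional risk priority number: severity x occurrence x detection
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Return failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical failure modes for the cloud system (illustrative ratings only)
modes = [
    FailureMode("Primary server outage", 9, 3, 2),
    FailureMode("Backup job silently fails", 8, 4, 7),
    FailureMode("Firewall misconfiguration", 7, 3, 5),
]

for mode in rank_by_rpn(modes):
    print(f"{mode.name}: RPN = {mode.rpn}")
```

The highest-ranked modes would be the first candidates for design changes or additional tests.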
2. The organisation for reliability
Company XYZ will engage the services of ABC Tech to develop a system that fits
the requirements specified in the project document. ABC Tech must fully adhere to the
requirement for a cloud server with both primary and standby operation services. The
development should follow the set lifecycle stages, including maintainability tests, quality
tests and security measures for the system. ABC Tech should also produce evidence of
previously developed systems that have met customers’ expectations (Bench, 2013).
The testimonials are meant to prove that ABC Tech is able to develop a system with the
same specifications as that required by XYZ. Moreover, ABC Tech should assure the
customer that the system will have the right security protocols, which include the
installation of firewalls to prevent unauthorised access and infection by viruses. ABC Tech
should also assure the customer that all the materials used in the process are covered by
warranties, to enable refunds and free repairs in the event of damage or
failure to meet expectations (Clemons, 2010).
3. The reliability activities that will be performed
The development of the system will include various reliability activities to identify
whether the system performs as intended. Additionally, the developers intend to ensure
that the system is free from failure. The reliability activities give assurance to the
customer that the system will perform as intended (Clemons, 2012).
Design analysis
The company will conduct design analysis as part of the reliability activities, using
powerful software that simulates the physical operation of the system. The
analysis identifies whether the system will break or operate normally even under
extreme pressure (Earl, 2014). Analysing the design in software avoids the added costs
of building a physical prototype to model the system. The design
analysis will follow the finite element analysis method, which means subdividing the
whole system into individual elements whose behavior is easily identified and then
reconstructing the behavior of the original system from those elements. Moreover, the
system will undergo stress analysis to identify its ability to operate under various
conditions (Edwards, 2014).
Test procedures
The system will undergo various test procedures to identify problems in the system.
System testing increases the level of confidence in the system by reducing the
probability of failure in operation (Ein-Dor, 2014). The test procedures will occur at
each level of the development lifecycle to prevent failure. They will
include stress screening procedures to identify whether the system can function
under extreme conditions.
Reliability requirements
The reliability requirements will follow the requirements set by XYZ, which include the
development of a primary system with a backup plan. Additionally, the system will have
various procedures to ensure quality and full functioning. The requirements should drive
the incorporation of various features to ensure the success of the system (Gerardi, 2016).
Reports
ABC Tech will deliver regular reports on the milestones achieved during the project
development. The development team will collect data after each milestone and report to
the customer company, XYZ (Gregor, 2014).
4. The timing of all major activities, in relation to the project development milestones.
The project will have various stages that must be completed to ensure successful
operation of the system. The completion of each stage will be marked as a milestone,
and a timeline is necessary to ensure that the stages do not cause a delay in the project
execution (Hevner, 2018). One of the important activities is finding the right
suppliers of the necessary materials. The search for suppliers and the awarding of the
contract is allocated 14 days, covering the drafting of a proposal, the analysis
of tenders and the awarding of the contract. Another activity is the design analysis to ensure
that the system will give the required results, which can take 30 days. This stage includes
creating a prototype using powerful software, which identifies the points that could cause
problems. The stage requires careful analysis to avoid overlooking issues that could
cause failure of the system. The actual development of the system is a
crucial stage, which could take up to 60 days. It involves building
the actual cloud system for XYZ, including a primary and a standby operations
system. During this stage, the supporting systems are also developed: a full backup
system, a disaster recovery system and a security control system. System testing
is allocated 14 days, which is enough time to
ensure that the system is free from errors and can perform all required operations (King,
2018).
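The stage durations above can be laid out as a simple timeline. The sketch below assumes the four stages run strictly one after another, which the plan does not state explicitly, and uses a hypothetical start date purely for illustration.

```python
from datetime import date, timedelta

# Durations (in days) stated in the plan; running the stages strictly
# in sequence is an assumption of this sketch.
STAGES = [
    ("Supplier search and contract award", 14),
    ("Design analysis", 30),
    ("System development", 60),
    ("System testing", 14),
]


def build_schedule(start, stages):
    """Return (stage, start_date, end_date) tuples for sequential stages."""
    schedule = []
    current = start
    for name, days in stages:
        end = current + timedelta(days=days)
        schedule.append((name, current, end))
        current = end
    return schedule


# Hypothetical project start date, used only for illustration
schedule = build_schedule(date(2024, 1, 1), STAGES)
for name, begin, end in schedule:
    print(f"{name}: {begin} to {end}")

total_days = sum(days for _, days in STAGES)
print(f"Total duration: {total_days} days")
```

Under the sequential assumption the four stages add up to 118 days. Any overlap between stages would shorten that figure.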
5. Reliability management strategies for suppliers.
I. Developing metrics and KPIs to monitor performance
The metrics and key performance indicators provide a threshold for the supplier to meet when
developing the system. Performance monitoring occurs at all levels and stages of
operations to ensure that the outcome meets the set standards (Lederer, 2014). The
metrics cover the development of a backup system, the installation of security
controls and a disaster recovery system. Therefore, the suppliers have to meet the standards
by offering materials and services that match the set quality levels.
II. Implementing a supplier risk scorecard
The supplier scorecard standardizes the operations of the whole organisation. Therefore, the
suppliers of materials and services necessary for the development of the system follow the
required standards to avoid failure and termination of contracts. The scorecard identifies the
risks that the suppliers should avoid. The group of suppliers works according to the set
standards, leading to uniform outputs with fewer errors and a lower probability of failure
(Barki, 2015).
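One way a supplier risk scorecard might be computed is as a weighted average of ratings per criterion. The sketch below is a minimal illustration. The criteria, weights, risk thresholds and supplier ratings are all assumed values, since the plan does not specify them.

```python
# Hypothetical criteria and weights (summing to 1.0); the plan does not
# specify them, so they are placeholders for illustration.
WEIGHTS = {
    "quality": 0.4,
    "on_time_delivery": 0.3,
    "security_compliance": 0.2,
    "warranty_terms": 0.1,
}


def supplier_score(ratings, weights=WEIGHTS):
    """Weighted average of 0-100 ratings; a higher score means lower risk."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return sum(ratings[criterion] * w for criterion, w in weights.items())


def risk_band(score, low_risk_min=80.0, medium_risk_min=60.0):
    """Map a score to a risk band; both thresholds are assumed values."""
    if score >= low_risk_min:
        return "low risk"
    if score >= medium_risk_min:
        return "medium risk"
    return "high risk"


# Hypothetical ratings for one supplier
supplier_a = {
    "quality": 90,
    "on_time_delivery": 80,
    "security_compliance": 70,
    "warranty_terms": 100,
}
score = supplier_score(supplier_a)
print(f"Supplier A: {score:.1f} ({risk_band(score)})")
```

Keeping the weights in one table makes it easy for the customer and supplier to agree on them before the contract is awarded.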
6. The standards, specifications and internal procedures (e.g. the reliability manual) which
will be implemented, as well as cross-references to other plans such as for test, security,
maintainability and quality assurance.
Standards, specifications, and internal procedures
The system should achieve high-quality standards and meet all the expectations without fail.
The system should avoid errors and have solutions to problems affecting operations.
Therefore, the system should have a backup system installed to maintain operations during
failures such as power outages (McMullen, 2009). Additionally, the system should have a
disaster recovery system and a security system to avoid loss of data or failure to operate
in the event of accidents.
The specifications for the system include proper testing and design analysis to produce
an effective system. The expected system is therefore one that is free
from errors and is able to switch to an operating alternative in the event of a failure in
the main system (Myers, 2017).
The internal procedures include having a reliability manual, which specifies the ability of the
system to meet the operation standards required. Additionally, the internal operations include
the appointment of employees to manage and evaluate whether the system development
follows the requirements. Moreover, an internal team of experts could train the other
employees on how to use the system (Nunamaker, 2010).
Furthermore, various testing procedures could take place, such as accelerated testing.
This approach induces failure in the system in a laboratory, with the system
expected to fail in a similar manner as in the field (Pfenning, 2014). Moreover, the
system will have security measures, such as firewalls and antiviruses, to prevent harmful
attacks that could damage operations.
The maintenance test happens after project completion to identify the costs of repairs and the
time consumed when repairing a faulty system. The maintenance stage happens at the same
time as the quality assurance tests, which aim at checking whether the system meets the
customer’s expectations.
Reliability manual
1. A brief statement of reliability requirements.
XYZ Company requires high-quality operating systems that meet all operational needs as
requested. The company’s leadership has developed a reliability policy that all employees at
the firm should follow. Therefore, any company contracted to work with the firm should
strictly adhere to reliability principles. The development of the cloud servers should follow
the set standards to avoid failure and errors in operations. The development process should
include steps that reduce the possibility of failure in the
final output. Therefore, the system should undergo a design analysis to establish full
performance and highlight the points of possible failure (Senn, 2010). The points showing a
probability of failure require quick rectification to reduce the chances of failure in the final
product. Additionally, the company requires a full backup system for the cloud servers. The
backup system should provide services in the event of failure of the original system.
Additionally, the organisation policy emphasizes that the cloud servers should be developed
following the relevant international standards that govern information systems. The standards
include IEEE 1633-2008, the IEEE recommended practice on software reliability, which
guides the reliability work to follow when developing an information system. Moreover, the
system should follow the international security standards that protect it from unauthorized
access, using measures such as firewalls and antiviruses (Stanfel, 2014).
2. The organisation for reliability.
The firm has set procedures to mitigate losses arising from operational failures caused by
damage or attacks by malicious persons. Similarly, the reliability procedures will be used in
the development of the system to prevent failure and damage to the company’s property.
Therefore, the cloud-based system will adhere to the international standards set for
information systems, such as installing a security system to mitigate the danger posed by
viruses and hackers (Stern, 2015). The lack of a security system could lead to
loss of important information to unauthorised persons. Additionally, the system should
contain a backup system to restore operations in the event of failure. The backup system
ensures that the company continues accessing information during the period that the system is
under maintenance. The system should also undergo various testing procedures to establish
its ability to perform under different conditions. Testing procedures such as stress tests
expose the system’s ability to perform under extreme pressure.
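The benefit of pairing the primary system with a backup can be illustrated with the standard formula for redundant units, where service continues as long as at least one unit is up. The sketch below assumes independent failures and instantaneous, perfect failover, which real systems only approximate. The 99% availability figures are hypothetical.

```python
def parallel_availability(*availabilities):
    """Availability of redundant units where any one working unit suffices.

    Assumes independent failures and instantaneous, perfect failover.
    """
    unavailability = 1.0
    for a in availabilities:
        if not 0.0 <= a <= 1.0:
            raise ValueError("availability must be between 0 and 1")
        unavailability *= 1.0 - a
    return 1.0 - unavailability


HOURS_PER_YEAR = 8760

# Hypothetical figures: each server is available 99% of the time
primary, standby = 0.99, 0.99
combined = parallel_availability(primary, standby)
print(f"Single server downtime: {(1 - primary) * HOURS_PER_YEAR:.1f} h/year")
print(f"Primary + standby downtime: {(1 - combined) * HOURS_PER_YEAR:.2f} h/year")
```

Under these assumptions, two 99% servers together reach roughly 99.99% availability. Shared causes of failure, such as a common power supply, would reduce that gain.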
3. Reliability procedures in design
The system design will undergo various procedures to ensure that the output is in accordance
with the set blueprint. The system design specifies that the output should be of high quality
and able to serve the firm appropriately.
The design analysis will involve two procedures: prototyping and finite element analysis. The two
steps will be undertaken to ensure that all the system functionalities and failures are
established. The first design analysis procedure will be finite element analysis, which
simulates the physical behaviour of a product using powerful software (Stern, 2010). The
element analysis exposes the failures in the system for the developers to correct.
Additionally, the analysis shows how the system will work in reality after full development.
The other design analysis to apply is prototyping, which refers to the creation of an
example of the system that developers use to check points of improvement and eliminate bugs.
The prototype is used for some time before the full and final product is introduced. The
prototype assists in the identification of failures, which are eliminated to avoid carrying
them over to the final system. Therefore, the two design procedures, prototyping and finite
element analysis, help ensure that the cloud-based operating system is free from bugs and
errors (Swanson, 2014).
Components derating policy
The policy is meant to prolong the operating period of a system by setting sufficient margins
for operation. The policy follows the rule of thumb that every 10 °C rise in operating
temperature roughly halves the lifespan of a component (Swanson, 2010). Therefore, the system
should be protected from overheating, power surges and overvoltage situations. Observing the
policy results in a system with a prolonged life that remains reliable throughout its lifetime.
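A widely used rule of thumb in electronics holds that component life roughly halves for every 10 °C rise in operating temperature. Assuming that is the relationship the derating policy intends, the effect can be sketched as below. The rated life and temperatures are hypothetical, and the rule itself is only an approximation.

```python
def derated_life(base_life_hours, base_temp_c, operating_temp_c, halving_step_c=10.0):
    """Estimate life using the 'halves per 10 degC rise' rule of thumb."""
    delta = operating_temp_c - base_temp_c
    return base_life_hours * 0.5 ** (delta / halving_step_c)


# Hypothetical component: rated for 50,000 hours at 40 degC
for temp in (30, 40, 50, 60):
    print(f"{temp} degC: {derated_life(50_000, 40, temp):,.0f} hours")
```

Running a component below its rated temperature extends the estimated life by the same factor, which is the margin the derating policy aims to secure.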
Components
The system requires both software and hardware components to ensure successful
development and to fully support operations. The
components include a heterogeneous business support layer that links the software and
hardware. Additionally, the business support layer connects the new system to both traditional and
modern infrastructure available at the centre to allow the flow of data (Van, 2011).
Moreover, application programming interfaces are necessary to create a link between the
existing operations, administration and maintenance systems. The activity includes both
current virtualisation tools such as VMware and Citrix and the larger data management
companies such as IBM and HP.
The other hardware includes computers with the ability to run the programs
necessary for developing the system. The computers should have enough storage for
installing the software and running it smoothly (Vazsonvi, 2013).
Material and process selection approval and review
The material and process selection will be done by a team of experts from within and outside
the company (Zwass, 2014). The selection will seek approval of the
best tools and procedures to develop the system. The team will select materials based on
their quality and aesthetic nature. Additionally, market ratings will assist in
reaching a conclusion about the best materials to use. The materials with the highest rating
will be considered first.
Likewise, a group of experts will select the best process for development. The selection will
be based on quality and cost-benefit analysis. The selected process should promise to achieve
a high-quality output with the employment of experienced experts. Additionally, the process
should meet the expectations at an affordable cost.
Design review
The design review procedure will identify whether each stage has achieved the required
standards. The review will be done after every stage to identify all problems at that stage
(Bench, 2013). The problems are rectified before the next stage begins, which reduces the
risk of failure in the complete system.
Critical item listings
The critical items will be noted in the project document to ensure that the developers focus on
each item without failure. Additionally, the development will have a supporting critical item
list to create emphasis on the requirements. The project team should focus on
integrating all the items mentioned in the critical items list (Van, 2011).
4. Reliability test procedures
The test procedures will include data collection and analysis. The data is collected
from the system operations after completion. The user team collects and records the data
based on the various operations of the system. The data assists in identifying the bottleneck
areas through analytical models that expose the areas of failure within a system (Clemons,
2012).
Additional reliability tests will include stress tests to identify the responses of the system
under different conditions. The system is run under both normal and abnormal conditions to
observe its behavior. First, the system is operated under normal conditions, such as a
moderate input of data, to identify its behavior and responses. The behavior checked includes
the speed of operations and the accuracy of output. Then the system is put under
extreme conditions to identify its ability to withstand pressure. The extreme conditions
include the input of large volumes of data to test the ability to process quickly and give accurate
output (King, 2018).
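The normal-versus-extreme comparison described above can be sketched as a small test harness that records speed and accuracy at different input volumes. The function `process_records` is a hypothetical stand-in for the real system, and the two load sizes are assumed values chosen only to illustrate the idea.

```python
import time


def process_records(records):
    """Stand-in for the system under test (hypothetical): doubles each value."""
    return [r * 2 for r in records]


def run_load_test(n_records):
    """Feed n_records inputs and report elapsed time and output accuracy."""
    records = list(range(n_records))
    expected = [r * 2 for r in records]

    start = time.perf_counter()
    output = process_records(records)
    elapsed = time.perf_counter() - start

    # Count outputs that match the expected result, position by position
    correct = sum(1 for got, want in zip(output, expected) if got == want)
    accuracy = correct / n_records if n_records else 1.0
    return {"records": n_records, "seconds": elapsed, "accuracy": accuracy}


# Load sizes are assumptions: "normal" versus "extreme" input volumes
for label, size in (("normal", 1_000), ("extreme", 200_000)):
    result = run_load_test(size)
    print(f"{label}: {result['records']} records, "
          f"{result['seconds']:.4f} s, accuracy {result['accuracy']:.0%}")
```

In a real stress test the stand-in would be replaced by calls to the deployed system, and the timings compared against the thresholds agreed in the service level agreement.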
5. Reliability data collection, analysis and action system, including data from the test,
warranty, etc.
The data will be collected by the users and a team of experts to identify the
areas of failure. The data will be collected over a period to allow the identification of patterns
through analysis. The analysis will identify trends that expose the areas
causing trouble and the causes of the shortcomings. Additionally, the experts will give repair
directions based on the analysis of the system. The repairs will exploit the warranties
awarded by the suppliers such as free replacement of the faulty system. The warranties
should last for a long period such as two to three years.
The data from the test also includes figures showing the output of the system under various
conditions. The experts will check the periods and conditions causing the low output from the
system. Therefore, the team plans the approach to improve the operation of the system and
prevent failure (Edwards, 2014).
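Failure data of this kind is often summarised as mean time between failures (MTBF), mean time to repair (MTTR) and availability. The sketch below shows one conventional way to derive them from a failure log. The observation period and repair times are hypothetical.

```python
def reliability_summary(observation_hours, repair_hours):
    """Compute MTBF, MTTR and availability from a failure log.

    observation_hours: total period over which the system was monitored
    repair_hours: downtime in hours for each recorded failure
    """
    failures = len(repair_hours)
    if failures == 0:
        raise ValueError("no failures recorded; MTBF is undefined")
    downtime = sum(repair_hours)
    uptime = observation_hours - downtime
    mtbf = uptime / failures
    mttr = downtime / failures
    return {"mtbf": mtbf, "mttr": mttr, "availability": mtbf / (mtbf + mttr)}


# Hypothetical log: 1,000 hours of monitoring with four failures
summary = reliability_summary(1_000, [2.0, 4.0, 1.0, 3.0])
print(f"MTBF: {summary['mtbf']:.1f} h")
print(f"MTTR: {summary['mttr']:.1f} h")
print(f"Availability: {summary['availability']:.3%}")
```

Tracking these figures over successive periods gives the trend information the experts need when planning repairs or warranty claims.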
REQUEST FOR PROPOSAL
The purpose of the request for proposal is to provide, on behalf of company XYZ, the
information required for the development of a cloud-based operating system. The development
should follow the set guidelines and be achieved within the set budget. However, any increase
in cost beyond the budgeted amount will be discussed with the relevant finance
representatives. Company XYZ is seeking to achieve a high-quality cloud-based
operating system with both a primary and a standby operations plan. The system should have
the necessary security standards, backup and disaster recovery measures to avoid failure.
Practice information
The company engages in social media services such as marketing and blogging. The
company is located in a busy town with many customers requiring services such
as web hosting, the creation of websites and the running of social media accounts on
Instagram, Twitter and Facebook. The business targets small and medium-sized
businesses. The business intends to boost operations with a cloud-based system that will
replace the traditional system currently existing at the company. The cloud-based system will
increase efficiency in the storage of information and improve the speed of output and the
accuracy of data. Therefore, ABC Tech is expected to develop a system that will