Data Mining for Plant Biotechnology: Analysis and Future Scope

Added on  2023/06/15

Report
AI Summary
This report provides an overview of data mining and its applications in plant biotechnology, focusing on the use of the Osmotic Stress Microarray Information Database (OSMID). It discusses the basic concepts of data mining, its connection to data warehousing, and the economic relevance of plant biotechnology, particularly in the context of genetically modified (GM) foods and plant-made-pharmaceuticals (PMP). The report analyzes data from OSMID to investigate economic and environmental factors, highlighting the importance of data quality and appropriate tools in data warehousing. It also touches upon the different types of data warehouses and concludes with future directions for research in this field, emphasizing the potential for discovering valuable insights through data mining techniques in plant biotechnology.
Running head: DATA MINING IN PLANT BIOTECH
DATA MINING IN PLANT BIOTECH
Name of the Student:
Name of the University:
Author Note:
Introduction
This paper first introduces the basic concepts of data mining and its connection with data
warehousing. Data warehousing and data mining are emerging information-system technologies,
and new courses and curricula are being built around them.
A brief description of plant biotech economics follows. The data used in this paper come
from the Osmotic Stress Microarray Information Database (OSMID). These data are relevant to
applications in genetically modified (GM) foods and plant-made pharmaceuticals (PMP). Data
from the OSMID data warehouse are used to investigate economic and environmental factors.
The paper closes with future directions for the study and a conclusion.
What is Data Mining?
Data mining is sometimes called “data knowledge discovery”. It is the process of discovering
information automatically: data are analyzed from different perspectives and summarized into
useful information. Data mining is an emerging technology; however, the underlying concept is
not a new one. Organizations have long used computers and other technical devices to manage
large amounts of data and to analyze market conditions.
According to Bigus (1996), data mining offers an efficient way to find important, non-obvious
information in large amounts of data. It also supports the automated discovery of
relationships and new facts about the data. The data mining process is the mechanism by which
core knowledge is gained. Han and Kamber (2001) and the Acxiom Working Paper by Segall (2003)
describe how database integration, data mining and data cleaning relate to data warehousing,
and how the choice of an appropriate mining technique and of task-relevant data forms a
framework for evaluating new knowledge. Data mining is a form of knowledge discovery that
involves clustering, sequencing and forecasting, whose results can be expressed as
classification rules and patterns. Models used for data mining include fuzzy logic, neural
networks, statistical analysis, decision trees and data visualization.
Data mining is an emerging technology that helps users find information without asking a
specific question. It draws on statistics and artificial intelligence. Its objective can be
put as "tell me something interesting, even though I don't know what questions to ask, and
also tell me what may happen." The data used for mining should go beyond traditional data
sets. For example, patterns in a large database or data warehouse can be mined.
Model and structure building for data warehouses, and their relation to data mining, are
covered in earlier literature: the Acxiom Working Papers by Segall (2003) and Fish and Segall
(2002), and the papers by Segall (2004) and Fish and Segall (2004), respectively.
The application of data mining algorithms to medical databases has also been discussed by
Segall (1984, 1988, 2002). Data mining functions are modeled using linear and non-linear
regression and curve fitting. Segall (1995, 1996, 2001, 2003, 2004) applies data mining models
to the databases of particular applications in order to define learning rules for neural
networks.
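The difference between a linear and a non-linear fit can be illustrated with a minimal sketch. It is not the procedure used in the cited Segall papers; the function names and the sample values are hypothetical. The non-linear case is assumed to be an exponential curve and is fitted by a log transform, which is a linearization rather than a full non-linear least-squares fit.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def fit_exponential(xs, ys):
    """Fit y = c * exp(k*x) by a linear fit on log(y); requires every y > 0."""
    log_c, k = fit_line(xs, [math.log(y) for y in ys])
    return math.exp(log_c), k

# Hypothetical sample values, chosen so the true parameters are known.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
linear_ys = [1.0 + 2.0 * x for x in xs]          # y = 1 + 2x
exp_ys = [2.0 * math.exp(0.5 * x) for x in xs]   # y = 2 * exp(0.5x)

a, b = fit_line(xs, linear_ys)
c, k = fit_exponential(xs, exp_ys)
print(f"linear fit:      y = {a:.3f} + {b:.3f} * x")
print(f"exponential fit: y = {c:.3f} * exp({k:.3f} * x)")
```

On noisy data the log-transform fit weights errors differently from a direct non-linear fit, so it is best read as a starting point.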
Readers of this paper are assumed to be interested in finding other biotech applications of
data mining. Users should nevertheless be aware that incomplete data and inaccurate estimates
will not give the desired results.
Interestingness measures
Purpose: To filter out irrelevant patterns so that the desired knowledge can be gained.
Thousands of irrelevant patterns can be produced during the data mining process.
Objective measures: Depend on the structure and statistics of the pattern (e.g.
frequency count).
Subjective measures: Depend on the user’s perception of the data. A pattern may be
interesting if it supports or contradicts the user’s hypothesis, depending on the situation.
Interestingness measures can be applied before or after the patterns are discovered. Their
efficiency can be improved in future work.
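An objective measure such as frequency count can be sketched as follows. The records, item names and the `min_support` threshold are hypothetical and are not taken from OSMID; the sketch only shows how a frequency-based filter discards rare patterns.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(transactions, min_support):
    """Return {pair: support} for item pairs whose support >= min_support.

    Support is the fraction of transactions that contain both items.
    """
    counts = Counter()
    for items in transactions:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

# Hypothetical records: a stress condition plus the genes that responded.
records = [
    {"salt", "geneA", "geneB"},
    {"salt", "geneA"},
    {"drought", "geneB"},
    {"salt", "geneA", "geneC"},
]

interesting = frequent_pairs(records, min_support=0.5)
print(interesting)
```

Here only the pair that occurs in at least half of the records survives the filter. A subjective measure would instead compare each surviving pattern with the user's own hypothesis.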
What is Data Warehousing?
Data warehousing is the process of storing large amounts of corporate data. It opens new
opportunities for an organization's decision-making systems. Good decisions cannot be made if
the appropriate data are not available. A well-designed corporate data warehouse gives the
decision-making process access to the right data and supports the implementation of new
computing and data mining techniques.
A multi-tiered data warehouse is shown in the figure, which is taken from the Acxiom Working
Paper by Segall (2002). The figure presents the connections between data mining, the data
warehouse, OLAP, data marts and data sources. A data mart, as described in Segall (2002), is a
mechanism that merges the data required by a group of particular applications.
Users of data mining should be aware of the importance of using appropriate tools and data
structures in data warehousing. SAS (2013) discusses a data warehouse solution for
pharmaceutical enterprises.
Different types of Data Warehouse:
Enterprise Warehouse: Covers all subjects of interest to the organization.
Data Mart: Contains a subset of the corporate data that is of interest to a specific group
of users.
Virtual Warehouse: Presents a set of views over the operational databases, defined by
particular demands. Only some of these views may be materialized.
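The difference between a data mart and a virtual warehouse can be sketched with Python's built-in `sqlite3` module. The table, column and view names and all values are hypothetical. The data mart is modeled as a stored copy of a subset, and the virtual warehouse as a view that is computed on demand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical enterprise warehouse: one table covering every department.
conn.execute("CREATE TABLE warehouse (dept TEXT, crop TEXT, yield_t REAL)")
conn.executemany(
    "INSERT INTO warehouse VALUES (?, ?, ?)",
    [("research", "corn", 9.1), ("research", "rice", 4.2), ("sales", "corn", 7.5)],
)

# Data mart: a stored subset of the data for one group of users.
conn.execute(
    "CREATE TABLE research_mart AS "
    "SELECT crop, yield_t FROM warehouse WHERE dept = 'research'"
)

# Virtual warehouse: a view, evaluated against the source table when queried.
conn.execute(
    "CREATE VIEW corn_view AS "
    "SELECT dept, yield_t FROM warehouse WHERE crop = 'corn'"
)

# A later insert is visible through the view but not in the stored mart.
conn.execute("INSERT INTO warehouse VALUES ('field', 'corn', 6.0)")

mart_count = conn.execute("SELECT COUNT(*) FROM research_mart").fetchone()[0]
view_count = conn.execute("SELECT COUNT(*) FROM corn_view").fetchone()[0]
print("rows in data mart:", mart_count)
print("rows seen through view:", view_count)
```

The stored mart trades freshness for query speed, while the view stays current at the cost of recomputation. This is why only some views are usually worth materializing.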
Plant database in data mining:
The plant data discussed in this study are data that can be used to analyze plant biotech.
The database mined in this paper is the Osmotic Stress Microarray Information Database
(OSMID). It contains a hundred microarray experiments carried out by the University of Arizona
for a project named “The Functional Genomics of Plant Stress”, supported by the National
Science Foundation. The selection of corn is based on three factors put forward by the plant
biotech industry (Monsanto, 2002):
Corn is one of the most researched elements of the US food system. Its agronomic and
genetic properties are well documented.
Corn is regarded as a safe medium for gene expression.
Corn has been seen to accumulate monoclonal antibodies to a higher degree than other
plants.