Regression Analysis and Decision Making
Added on 2019-09-16



PART I: DATA WAREHOUSE (One point each)

(1) For analysis purposes, you want the data stored in the data warehouse in a ____ form.
(2) Departmental data marts, from an enterprise perspective, have one major deficiency. What is that deficiency?
(3) A multidimensional structure to store data is called a ____.
(4) A Star Schema has a Fact Table and, at most, six Dimension Tables. (True or False)
(5) When designing an OLAP database, ____ are structured into “hierarchies.”

PART I: MATCH TERMS (Two points per question)

Terms:
____ Discrete qualitative data
____ Temperature readings
____ Dependent variable
____ The independent variable is significant
____ Unsupervised learning
____ Predictive data mining
____ Text mining
____ Extraction of patterns
____ Outlier
____ Jaccard Coefficient

Descriptions:
(1) learning from discovery and observation
(2) induction
(3) used to identify similarity and diversity within a sample
(4) objects that are considered dissimilar from the remainder of the data
(5) continuous quantitative data
(6) gender, political parties or religions
(7) process of numericizing text for analysis, prediction and pattern identification
(8) if the value of “p” is less than 0.05 (p < 0.05)
(9) the goal is prediction
(10) variable to be predicted

PART II: TRUE FALSE (Two points each)

(1) Supervised learning is learning from observation and discovery.
(2) Statistics is the branch of mathematics concerning the collection and description of data.
(3) Linear regression is similar to the task of finding a line that maximizes total distance to a set of data.
(4) Squaring the difference between the predicted value and the actual value is the most common way to calculate the error.
(5) The regression Y = b0 + b1*P + b2*P^2 + b3*Q + b4*Q^2 + ... is a multiple regression equation.
(6) A simple regression equation has one independent variable and two dependent variables.
(7) The correlation coefficient for the relationship below between the average price and consumption per customer is 0.79. This suggests a strong positive correlation between average price and consumption per customer.

[Scatter plot: consumption per customer (y-axis, 0 to 160) versus average price (x-axis, 0 to 120)]

(8) The above scatter plot shows a linear relationship between the average price and consumption per employee.
(9) Supervised learning techniques are not used for prediction.
(10) Clustering is mostly used for consolidating data into high level views and general grouping of records with unlike behavior.

PART III: FILL IN THE BLANK (Two points each)

(1) The regressed data points in a simple regression equation of x and y yield an intercept of 0 and a slope of 1. The R squared coefficient for this equation is ____.
(2) The major limitation of all regression techniques is that you cannot be sure of the underlying ____ relationship(s).
(3) Two classification algorithms are ____ and ____.
(4) Logit models produce probabilities that range from ____ to ____.
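The regression ideas that Parts II and III touch on (fitting a line, squared error, R squared, and logit probabilities) can be illustrated with a minimal sketch. Everything in it is an assumption for illustration only: the function names and the five data points are made up and do not come from the exam, and the code uses only the Python standard library.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = b0 + b1*x (one independent variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept
    return b0, b1


def sum_squared_error(xs, ys, b0, b1):
    """Sum of squared differences between actual and predicted values."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


def r_squared(xs, ys, b0, b1):
    """Share of the variation in y that the fitted line accounts for."""
    mean_y = sum(ys) / len(ys)
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - sum_squared_error(xs, ys, b0, b1) / total


def logistic(z):
    """Logistic function used by logit models to turn a score into a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Made-up illustration data (NOT taken from the exam).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(xs, ys)
print("intercept:", b0, "slope:", b1)
print("sum of squared errors:", sum_squared_error(xs, ys, b0, b1))
print("R squared:", r_squared(xs, ys, b0, b1))
print("logistic(-10), logistic(0), logistic(10):",
      logistic(-10), logistic(0), logistic(10))
```

In this sketch the fitted line is the one with the smallest sum of squared errors, so shifting the intercept or slope away from the fitted values can only increase that sum. The R squared value depends on how tightly the points scatter around the line, not on the slope or intercept alone.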
PART IV: MULTIPLE CHOICE (Two points each) CIRCLE OR PUT A () AROUND YOUR ANSWER

(1) Outliers are
    a. Illegal
    b. Normal
    c. Infrequent observations
    d. All of the above
(2) General Applications of Clustering
    a. Pattern Recognition
    b. Image Processing
    c. Spatial Data Analysis
    d. All of the above
(3) Decision Trees produce rules that are
    a. Bottom up
    b. Inclusive
    c. Exclusive
    d. All of the above
(4) If.... And.... Then
    a. Is not good English
    b. Is a rule format
    c. Is redundant
    d. None of the above
(5) The specific term “parent node” would be found
    a. In clustering
    b. In neural networks
    c. In rule induction
    d. None of the above

PART V: PROBLEM SET

PROBLEM (1): (10 points)

(1) This is a simple regression problem. You are a data analyst with the U.S. Department of Commerce. You have gathered information from the department’s data warehouse on the world price of pulp and the exports of pulp to the rest of the world. You have been asked to see what would be the effect of a $1 increase in the price of pulp on shipments from the United States. Assume the United States produces 100% of the world pulp.

a. Table 1 below represents your data series (in a real world scenario, your series might consist of hundreds of data points). The variable X is the world price in current dollars per ton. So for example, the first variable is $792.32 per metric ton. The other variable Y is pulp shipments in millions of metric tons.
[Table 1: Pulp Shipments (millions of metric tons)]

b. Table 2 is a scatter plot of the relationship between pulp prices and shipments.

TABLE 2: Scatter Plot of Pulp Shipments
[Scatter plot: pulp shipments (y-axis, 0 to 40) versus World Pulp Price in dollars per ton (x-axis, 500 to 900)]

c. Table 3 is the regression equation model predicting the relationship between price and pulp shipments.
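As a rough sketch of how the question in Problem (1) could be approached, the code below fits a simple regression of shipments on price and reads the slope as the estimated change in shipments for a $1 change in price. The price and shipment values are hypothetical placeholders, not the Table 1 data, and the function name `fit_simple_regression` is likewise an assumption made for this example.

```python
def fit_simple_regression(prices, shipments):
    """Least squares fit of shipments = intercept + slope * price."""
    n = len(prices)
    mean_p = sum(prices) / n
    mean_s = sum(shipments) / n
    spp = sum((p - mean_p) ** 2 for p in prices)
    sps = sum((p - mean_p) * (s - mean_s) for p, s in zip(prices, shipments))
    slope = sps / spp
    intercept = mean_s - slope * mean_p
    return intercept, slope


# Hypothetical placeholder series (NOT the exam's Table 1 data).
prices = [600.0, 650.0, 700.0, 750.0, 800.0]     # dollars per ton
shipments = [26.0, 24.0, 22.0, 20.0, 18.0]       # millions of metric tons

intercept, slope = fit_simple_regression(prices, shipments)

# The slope is the estimated change in shipments (millions of metric tons)
# associated with a $1 increase in the world price.
print("intercept:", intercept)
print("estimated change in shipments per $1 price increase:", slope)
```

With real data, the same slope would be read from the regression output, and its p-value would indicate whether the price variable is statistically significant.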