# Project #13572 - Stats

Assume the question asks you to create a vector of 8 numbers with the entries [1,2,3,4,5,5,4,3], using the colon operator rather than typing the entries directly. The answer is ‘x <- c(1:5,5:3)’. Enter this in R, then display the contents of x. Once you have done this, copy and paste the code from the R command window as follows:

> x <- c(1:5,5:3)

> x

[1] 1 2 3 4 5 5 4 3

You should report as in the bolded lines above. If you do not follow instructions, you will lose points.

We may not have covered some of the questions in class. If that is the case, use the help function to learn how to do them.

a.      Create a vector with 10 numbers (3, 12, 6, -5, 0, 8, 15, 1, -10, 7) and assign it to x.

> x <- c(3,12,6,-5,0,8,15,1,-10,7)

> x

[1]   3  12   6  -5   0   8  15   1 -10   7

b.      What is the data type of x? How can you find out?

Numeric. You can find out with class(x); typeof(x) reports the underlying storage type.

c.       Subtract 5 from the 2nd, 4th, 6th, etc. element in x.
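One possible approach to part c is sketched below. It works on a copy named y (an illustrative name, not part of the assignment) so that x stays as entered in part a, and it builds the even positions with seq rather than typing them out.

```r
# Original vector from part a
x <- c(3, 12, 6, -5, 0, 8, 15, 1, -10, 7)

# Work on a copy so x is left unchanged (y is an illustrative name)
y <- x

# Even positions 2, 4, ..., up to the length of the vector
even_idx <- seq(2, length(y), by = 2)

# Subtract 5 from those positions only
y[even_idx] <- y[even_idx] - 5
y
```

Assigning the result back to x instead of y would change the sum and mean reported in part d.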

d.      Compute the sum and the average for x (there are functions for that).

> sum(x)

[1] 37

> mean(x)

[1] 3.7

e.      Reverse the order of the elements in x (use a function that reverses a vector).

> rev(x)

[1]   7 -10   1  15   8   0  -5   6  12   3

> x[x<0]

[1]  -5 -10

g.      Remove all entries with negative numbers from x based on their index (use concatenate function as well as the index numbers).

h.      How long is x now (use a function).
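Parts g and h can be sketched together, assuming x still holds the values from part a. The negative entries sit at positions 4 and 9, so a negative index built with c() drops them; neg_idx and n are illustrative names.

```r
# Vector from part a
x <- c(3, 12, 6, -5, 0, 8, 15, 1, -10, 7)

# Confirm where the negative entries are (positions 4 and 9)
neg_idx <- which(x < 0)
neg_idx

# Part g: drop those positions using c() and negative indexing
x <- x[-c(4, 9)]
x

# Part h: number of elements left
n <- length(x)
n
```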

i.        Remove x from the environment/workspace (session) and list the variables in your workspace.
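A minimal sketch for part i uses rm() and ls(). The starting value of x here is only a placeholder so that the snippet runs on its own.

```r
# Placeholder value so the example is self-contained
x <- c(3, 12, 6, 0, 8, 15, 1, 7)

# Remove x from the current environment
rm(x)

# List the variables that remain in the workspace
ls()

# Check that x is gone from the current environment
exists("x", inherits = FALSE)
```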

j.        Create a vector of strings containing "CSE 8001", "CSE 8002", ..., "CSE 8100" using paste.
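For part j, one way is to rely on paste recycling the prefix across a numeric sequence; courses is an illustrative variable name.

```r
# paste() recycles "CSE" across 8001:8100; the default separator is a space
courses <- paste("CSE", 8001:8100)

# Inspect both ends and the total count
head(courses, 2)
tail(courses, 1)
length(courses)
```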

1.      How could prediction models contribute to the targeting of treatment and to achieving better care at reduced cost (increased cost-effectiveness) in medical care?

2.

a.       What are the problems of dichotomization when studying the effect of one specific predictor, such as age?

b.      Why should continuous predictors not be categorized, and when should they be? Give a healthcare example.

3.      Why are extreme values (outliers) a problem for data mining?  When would truncation be reasonable?

4.      Calculate sensitivity, specificity, accuracy, precision, and recall for the following confusion matrix. Show your work; do not just report numbers.

| Model Classification | Actual: Positive | Actual: Negative |
|---|---|---|
| Positive | 700 | 35 |
| Negative | 80 | 156 |
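The five measures can be sketched in R as follows. This assumes the rows of the confusion matrix are the model's classification and the columns are the actual classes, so that 35 counts false positives and 80 counts false negatives; if the layout is the other way round, those two cells swap.

```r
# Cell counts, assuming rows = model classification, columns = actual class
TP <- 700  # model positive, actually positive
FP <- 35   # model positive, actually negative
FN <- 80   # model negative, actually positive
TN <- 156  # model negative, actually negative

sensitivity <- TP / (TP + FN)                       # 700 / 780
specificity <- TN / (TN + FP)                       # 156 / 191
accuracy    <- (TP + TN) / (TP + FP + FN + TN)      # 856 / 971
precision   <- TP / (TP + FP)                       # 700 / 735
recall      <- sensitivity                          # recall is another name for sensitivity

round(c(sensitivity = sensitivity, specificity = specificity,
        accuracy = accuracy, precision = precision, recall = recall), 4)
```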

5.      Apply the Apriori algorithm to the following transactions to find the sets of associated items. Use a support of 3 and show your work step by step using tables.

| Transaction | Items |
|---|---|
| 10000 | 1, 2, 3, 5 |
| 20000 | 1, 3, 5, 4 |
| 30000 | 3, 4, 6 |
| 40000 | 1, 5, 6 |
| 50000 | 1, 3, 4, 5 |
| 60000 | 2, 3, 4, 5 |
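The level-wise procedure can be checked with a small base R sketch. It assumes "a support of 3" means an absolute count of at least 3 transactions, and the helper names (support, key, all_frequent) are illustrative rather than taken from any package. At each level it builds candidates from the items that are still frequent, prunes any candidate with an infrequent subset, and then counts support.

```r
# Transactions from the table, keyed by transaction ID
transactions <- list(
  "10000" = c(1, 2, 3, 5),
  "20000" = c(1, 3, 5, 4),
  "30000" = c(3, 4, 6),
  "40000" = c(1, 5, 6),
  "50000" = c(1, 3, 4, 5),
  "60000" = c(2, 3, 4, 5)
)
min_support <- 3  # assumed to be an absolute transaction count

# Number of transactions that contain every item of `itemset`
support <- function(itemset) {
  sum(vapply(transactions, function(tr) all(itemset %in% tr), logical(1)))
}

# Order-independent label for an itemset, e.g. "1,3,5"
key <- function(itemset) paste(sort(itemset), collapse = ",")

# Level 1: frequent single items
items <- sort(unique(unlist(transactions)))
frequent <- Filter(function(s) support(s) >= min_support, as.list(items))
all_frequent <- list()

k <- 1
while (length(frequent) > 0) {
  all_frequent[[k]] <- frequent
  k <- k + 1
  pool <- sort(unique(unlist(frequent)))
  if (length(pool) < k) break
  # Candidates of size k built from items that are still frequent
  candidates <- combn(pool, k, simplify = FALSE)
  prev_keys <- vapply(frequent, key, character(1))
  # Prune: keep a candidate only if all of its (k-1)-subsets are frequent
  candidates <- Filter(function(cand) {
    subs <- combn(cand, k - 1, simplify = FALSE)
    all(vapply(subs, key, character(1)) %in% prev_keys)
  }, candidates)
  # Count support and keep candidates that meet the threshold
  frequent <- Filter(function(s) support(s) >= min_support, candidates)
}

# Print every frequent itemset with its support count
for (level in all_frequent) {
  for (s in level) cat("{", key(s), "} support =", support(s), "\n")
}
```

The printed itemsets are meant as a check on the hand-built tables, not as a replacement for showing the steps.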

Subject: Mathematics
Due By (Pacific Time): 09/30/2013 12:00 pm