Homework 7

Mapping using a dataset of hard data values

Date given: 11/4

Date Due for Part 1&2: 11/11 noon

Date Due for Part 3&4: 11/18 noon

 

 

You are asked to do parts 1 and 2 of this homework without help from classmates or students who have already taken the class. Doing otherwise on this homework will violate your honor code pledge.

 

Part 1

Let the SRF X(s) represent the prevalence of COVID-19 among students living at location s. Let X=X(s) be a random variable representing the prevalence in a UNC dorm at location s, and let X’=X(s’) be the prevalence at another dorm at location s’ that is a distance d away from s, where d (km) is equal to your Month Of Birth (for example, if you were born in January then d=1 km, while if you were born in December then d=12 km). Assume that the bivariate PDF of X and X’ is given by the equation below.

 

 

Recently published studies indicate that the spread of COVID-19 substantially increases once more than one third of the students living in a dorm test positive for COVID-19. Your directive is therefore to ask all students to move out of the UNC dorm if there is more than a 70% chance that the prevalence at the UNC dorm exceeds 1/3.

 

a.     State your Month Of Birth (MOB), write the PDF using d=MOB, and show that it is a valid PDF.

b.     You know that the students at the dorm located at s’ are being tested and you will receive their results soon. However, you do not have that information yet. While you are waiting, what do you expect the prevalence to be at the UNC dorm? What is the probability that the prevalence at the UNC dorm exceeds 1/3 (approximately 33.3%)? Based on that probability, should you already ask students to move out?

c.     You learn that the prevalence at the other dorm is 0.9 (i.e. 90% of the students tested positive for COVID-19 at the other dorm). Given that information (i.e. given that X’=0.9), what do you expect the prevalence to be at the UNC dorm? What is the probability that the UNC dorm prevalence X exceeds 1/3 given that X’=0.9? Should you ask students to move out from the UNC dorm given that X’=0.9?

d.    Make a plot of the marginal PDF of X and the conditional PDF of X given X’=0.9. Explain what these plots show, and use these plots to explain what you found in questions b and c above.

e.     Is the SRF X(s) homogeneous?

f.      Is the correlation between X and X’ increasing with d, or decreasing with d? Provide a proof supporting your answer.
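
The kind of numerical check that items a through c call for can be sketched as follows. This is only an illustration: the joint PDF used here, f(x,x’) = x + x’ on the unit square, is a hypothetical stand-in and is not the PDF of this assignment, and the helper names (integrate, joint_pdf, marginal_pdf, conditional_pdf_given) are made up for the example. Handwritten derivations are still what is expected in your submission.

```python
def integrate(g, a, b, n=400):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def joint_pdf(x, xp):
    """HYPOTHETICAL joint PDF on [0, 1] x [0, 1] (not the assignment's PDF)."""
    return x + xp


def marginal_pdf(x):
    """Marginal PDF of X: integrate the joint PDF over x' in [0, 1]."""
    return integrate(lambda xp: joint_pdf(x, xp), 0.0, 1.0)


def conditional_pdf_given(xp):
    """Return the conditional PDF of X given X' = xp as a function of x."""
    norm = integrate(lambda u: joint_pdf(u, xp), 0.0, 1.0)  # marginal of X' at xp
    return lambda x: joint_pdf(x, xp) / norm


# Validity check: the joint PDF should integrate to 1 over its support.
total = integrate(marginal_pdf, 0.0, 1.0)

# Expected prevalence and exceedance probability before X' is known.
mean_marginal = integrate(lambda x: x * marginal_pdf(x), 0.0, 1.0)
p_marginal = integrate(marginal_pdf, 1.0 / 3.0, 1.0)

# Exceedance probability once X' = 0.9 is known.
cond = conditional_pdf_given(0.9)
p_conditional = integrate(cond, 1.0 / 3.0, 1.0)

print(total, mean_marginal, p_marginal, p_conditional)
```

Comparing p_marginal and p_conditional against the 70% threshold mirrors the decision asked for in items b and c; with the actual PDF the numbers will differ.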

 

 

Part 2

 

Consider the space/time random field (S/TRF) X(p) representing log-PM2.5 (log-ppm) at space/time location p=(s,t), where s=(s1,s2) is the 2D spatial location, and t is time.  Assume that X(p) has a constant mean of 0 log-ppm (over space and time), and that it has the following covariance

 

$c_X(r,t) = c_{01}\exp(-3r/a_{r1})\exp(-3t^2/a_{t1}^2) + c_{02}\exp(-3r/a_{r2})\exp(-3t/a_{t2})$,

 

where r is the spatial lag, t is the temporal lag, and the covariance parameters are c01=1.5 (log-ppm)^2, ar1=1 km, at1=3 days, c02=0.5 (log-ppm)^2, ar2=30 km, and at2=700 days.
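
As a way to check hand calculations, the covariance model above can be evaluated numerically. The sketch below assumes that the first temporal term is Gaussian (squared lag over squared range), that the spatial lag is the Euclidean distance in km, and that the temporal lag is the absolute time difference in days. The function names cov_x and st_lag are illustrative and do not come from BMEGUI or any other library.

```python
import math

# Covariance parameters as stated in the text.
C01, AR1, AT1 = 1.5, 1.0, 3.0      # (log-ppm)^2, km, day
C02, AR2, AT2 = 0.5, 30.0, 700.0   # (log-ppm)^2, km, day


def cov_x(r, t):
    """Covariance of X for spatial lag r (km) and temporal lag t (day)."""
    t = abs(t)
    # Short-range structure: exponential in space, Gaussian in time (assumed).
    short = C01 * math.exp(-3.0 * r / AR1) * math.exp(-3.0 * t**2 / AT1**2)
    # Long-range structure: exponential in space and in time.
    long_range = C02 * math.exp(-3.0 * r / AR2) * math.exp(-3.0 * t / AT2)
    return short + long_range


def st_lag(p, q):
    """Spatial and temporal lags between points p=(s1, s2, t) and q=(s1, s2, t)."""
    r = math.hypot(p[0] - q[0], p[1] - q[1])  # Euclidean distance (assumed)
    t = abs(p[2] - q[2])
    return r, t


# Usage with arbitrary illustrative points (not the sampling locations).
r, t = st_lag((0.0, 0.0, 0.0), (3.0, 4.0, 2.0))
print(r, t, cov_x(r, t))
```

Each entry of a covariance vector or matrix is then cov_x applied to the lags of the corresponding pair of points.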

 

Samples of PM2.5 were collected at the following space/time locations:

At p1=(0 km,2 km,1 day), sample 1 was collected and stored in a lab freezer

At p2=(1 km,4 km,2 day), sample 2 was collected and stored in a lab freezer

 

Your task is to use simple kriging to estimate log-PM2.5 at pk=(1 km ,0 km,3 day). The samples are in a lab freezer and they will be analyzed to get measurements of log-PM2.5 at p1 and p2.

a.     Do you think that the S/TRF X(p) can be assumed to be homogeneous/stationary, and space/time separable?  Justify your answer.

b.     Derive the mean value mk and the column mean vector mh. Likewise, derive the covariance value ckk, the covariance vector Ckh, and the covariance matrix Chh. What is the variance of log-PM2.5 at p1, p2 and pk?

c.     You have analyzed neither sample 1 nor sample 2. What is the expected value of Xk=X(pk) and its 95% confidence interval if you do not know the values of X1=X(p1) and X2=X(p2)?

d.     What is the correlation between Xk and X1, and the correlation between Xk and X2? Based on this, if you have funds to analyze only one sample, which one should you analyze?

e.     You are instructed to analyze sample 1, and your measurement indicates that X1=3.3 log-ppm. Calculate the simple kriging mean and variance of Xk given that X1=3.3 log-ppm. Show all the handwritten steps of your calculations. What is the expected value of Xk and its 95% confidence interval given that X1=3.3 log-ppm?

f.      You need to quarantine due to COVID-19. Your lab mate analyzes sample 2, and her measurement indicates that X2=1.8 log-ppm. She does not know your measurement value for X1. Using simple kriging, what is the expected value of Xk and its 95% confidence interval given that X2=1.8 log-ppm? Again, show all the handwritten steps of your calculations.

g.     Your quarantine is over, you get back to the lab, and you now know that X1=3.3 log-ppm and X2=1.8 log-ppm. Using simple kriging, what is the expected value of Xk and its 95% confidence interval given that X1=3.3 log-ppm and X2=1.8 log-ppm? Again, show all the handwritten steps of your calculations.
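
For checking handwritten work, the simple kriging equations can be sketched in a few lines. The sketch below assumes the standard simple kriging forms (mean = mk + Ckh Chh^-1 (xh - mh), variance = ckk - Ckh Chh^-1 Chk) and a Gaussian assumption for the 95% interval (mean ± 1.96 standard deviations). The function names and the toy numbers are illustrative only and are not the values of this assignment.

```python
import math


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[piv] = M[piv], M[col]
        for row in range(col + 1, n):
            f = M[row][col] / M[col][col]
            for j in range(col, n + 1):
                M[row][j] -= f * M[col][j]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w


def simple_kriging(m_k, m_h, c_kk, c_kh, C_hh, x_h):
    """Return the simple kriging (mean, variance) of X_k given hard data x_h."""
    w = solve(C_hh, c_kh)  # kriging weights: C_hh w = C_hk
    mean = m_k + sum(wi * (xi - mi) for wi, xi, mi in zip(w, x_h, m_h))
    var = c_kk - sum(wi * ci for wi, ci in zip(w, c_kh))
    return mean, var


def interval_95(mean, var):
    """95% interval under a Gaussian assumption."""
    half = 1.96 * math.sqrt(var)
    return mean - half, mean + half


# Toy example with made-up covariances and data (NOT the homework values).
mean, var = simple_kriging(
    m_k=0.0, m_h=[0.0, 0.0], c_kk=1.0,
    c_kh=[0.5, 0.25], C_hh=[[1.0, 0.5], [0.5, 1.0]], x_h=[1.0, 2.0],
)
lo, hi = interval_95(mean, var)
print(mean, var, lo, hi)
```

With no data (item c), the estimate reduces to the prior mean mk and the prior variance ckk; with one data point, Chh is a single number and the solve step is a simple division.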

 

 

Part 3

Use what you have learned from the BMEGUI tutorials and previous homework to refine the space/time analysis of your project dataset. Prepare and submit a short preliminary draft (fewer than 8 pages of text and figures) of your final class project report. This report should be well written. It should have an introduction providing background about the environmental contaminant and the research question you plan to address; a materials and methods section describing the data you obtained (i.e. exploratory data analysis) and the method and tools you will use to analyze that data (i.e. the BME framework and the BMEGUI tool); and some preliminary results (covariance and maps). Use a “future work” section to describe what you plan to do to complete or further improve the analysis and address your research question.

 

Part 4

Prepare a 3-slide, 3-minute preliminary PowerPoint version of your final presentation, which you will present to the class in the last week of the semester. Generally, this presentation should have one slide on the introduction and data, one slide on the mean trend and covariance model, and one slide on mapping results.

 

 

Create a file containing a scan of your handwritten work for parts 1 and 2 (named yourfirstname_yourlastname_hwk7_part1and2.pdf) and email it to the TA by the deadline for parts 1 and 2. Create a well-written report for parts 3 and 4 in a file named yourfirstname_yourlastname_hwk7_part3and4.docx and email it to the TA by the deadline for parts 3 and 4.