1. (4 pts).  The lectures on "data" began with the suggestion that there were some key questions to ask of any study.  Which of the following were in that list?  MTF

(A) How many data exist?

(B)  What's the evidence?

(C)  Who obtained the evidence?

(D)  How was the evidence obtained?

Data: Error

2. (5 pts) Which of the following demonstrations or videos were used to illustrate errors in data or "ideal" data, and which options also correctly identify the type of error illustrated or the purpose of the demo?  An option that correctly refers to only part of the demo should be considered correct.

A)    coin flip to illustrate bias

B)    coin flip to illustrate how to reduce sampling error

C)    "choose a random odd number" to illustrate RPA error

D)    a video clip from the movie Spinal Tap to illustrate (bad) standards:  "it's one louder"

E)    a video clip from a Monty Python video to illustrate RPA error:  "Penguin intelligence"

 

(3-5). For each of the following descriptions, indicate the types of error present (the italicized phrase identifies the error). Mark a type of error only if it is definitely present. Do not assume any more than what is explicitly described. One answer only for each question, but an option may be used more than once.  Note that you have the option of "none."

 

type of error:

A) RPA

B) sampling

C) Human & technical

D) Bias

E) No error indicated

 

 

3. (4 pts) A high school football coach decides to test whether his players might be on steroids.  He tests his 50 players (first and second string) and finds that, while nearly all of his players have testosterone levels within the normal range, one player (Jerry) has a high level: only 2% of high school football players (males) his age should have a level that high. Based on this evidence, the coach disciplines Jerry.  In thinking about whether the disciplinary action is justified, you should answer what type of error could account for the unusually high testosterone level in Jerry? (one only)

(A)       (B)      (C)       (D)      (E)
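
The arithmetic behind this scenario can be sketched as follows. This is a minimal sketch, assuming the 50 players are independent draws from the reference population, so that each one has a 2% chance of showing a level this high by chance alone; the function name is hypothetical and the independence assumption is not stated in the question.

```python
def prob_at_least_one(n_tested: int, p_extreme: float) -> float:
    """Chance that at least one of n_tested independent individuals
    falls in the extreme tail that has probability p_extreme."""
    return 1.0 - (1.0 - p_extreme) ** n_tested


# Values taken from the question: 50 players, 2% tail.
expected_extreme = 50 * 0.02          # average number of players in the tail
p_any = prob_at_least_one(50, 0.02)   # chance of seeing one or more

print(f"expected players above the cutoff: {expected_extreme:.1f}")
print(f"chance of at least one: {p_any:.3f}")
```

Under these assumptions, a single high reading among 50 tested players is roughly what chance alone would be expected to produce.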

 

4. (4 pts) Two UT students, Alice and Lilia, are conducting a poll to assess UT-wide student attitudes about a topic.  Alice is a computer whiz and puts a web page up to let other UT students express their opinions.  Lilia visits dorm cafeterias and has students fill out forms while dining. After the first day, the poll results from hundreds of students for each method are noticeably different, so Alice and Lilia accuse each other of being sloppy with their data.  But the same approximate outcome happens the second day as well.  What type of error is suggested by the consistent difference in poll response between the web page interface and the forms filled out by dining students? (one only)

(A)       (B)      (C)       (D)      (E)

 

 

5. (4 pts) A racing car's performance on a track is compared with and without a fuel additive.  In multiple trials with and without the additive, the car's top speed is consistently 3 miles per hour faster with the additive, but fuel use is also higher (per mile) with the additive.  Thus, it is determined that the additive is advantageous only for short races, because more time is taken refueling the car than is gained by the small increase in top speed.  What type of error in data is indicated by the difference in speed of the car with and without the fuel additive? (one only)

(A)       (B)      (C)       (D)      (E)

 

6. (5 pts) A customer complains that prices charged at the Far West HEB grocery store are often different from the advertised prices, and further suggests that the difference often goes in favor of the store (the charged price is usually higher than the advertised price).  The store does an audit of all 3450 items it sells and finds that there are several items marked differently than charged.  A distribution of the differences is shown below:

the center bar represents items for which the charged price was the same as the advertised price;

bars to the left of center are items in which charged price was more than advertised,

bars to the right are those in which the charged price was less than advertised. 

 

 

Which options fairly describe the data and the customer complaints?  MTF

 

A)   The suggestion that the store usually charges more than advertised is a claim that the price differences are biased.

B)    The distribution shown in the figure does not support much of a bias (if any) in the price difference.

C)    The study should have chosen the 3450 items randomly.

 

Ideal Data

7. (7pts) Which of the following points about ideal data are true?  MTF

A)   Randomization (if done properly) eliminates nearly all sources of bias.  For example, if subjects are randomly assigned to treatment and control groups in a drug trial, the main concerns with bias are nearly eliminated.

B)    Although replication will not eliminate bias, it usually reduces it.

C)    If followed to the letter, a written procedure minimizes errors in the data.  That is, most error creeps in because the data are not gathered strictly according to the protocol.

D)   Even when the protocol is followed, there are usually many subtle ways in which the written protocol does not describe how the data were collected.  This is one way in which the written procedure is a false model.

E)    Sampling error (of something like a rate or probability) is never completely eliminated by replication.  Thus, there are examples pertinent to human endeavors in which a trial involving 3,000 individuals missed effects that later became important.

F)    RPA error is reduced with standards. 

G)    The temperature of ice water would be a standard for a thermometer.

 

(8-9). Do-it-yourself protocol. You are conducting an external review/test of a genotyping lab. Your job is to send two tubes to the lab, with labels. There are several options for the content of and label on a tube. You must decide which contents to send and how to label the tubes so that the features of ideal data requested in the question are present from the lab's perspective. If a tube has a person's name on it, the lab can assume that the tube contents belong to the name of the person on the label. If a tube is labeled with a number, the contents are unknown to the lab but known to you. Your options for tube contents and tube labels are:

option  tube label         contents in the tube are from  blood type  gender  marker status

(A)     Laura Baker        Laura Baker                    B           Female  +
(B)     Darin Rokyta       Darin Rokyta                   AB          Male    negative
(C)     Rachael Springman  Rachael Springman              O           Female  +
(D)     #132               Darin Rokyta                   AB          Male    negative
(E)     #218               Patsy Cline                    A           Female  +
(F)     #10                Pam Hines                      O           Female  negative
(G)     Jerry Allison      Jerry Allison                  B           Male    negative
(H)     #101               Brent Iverson                  AB          Male    negative
(I)     No combination of tubes can satisfy the protocol

 

 

In the following questions, choose two letters among options (A)-(H) to describe the two tubes that will be sent to the lab. The tube labels are the only information the lab receives about the samples, and the lab does not have prior information about the individuals.  If it is possible to satisfy the protocol, the question will require exactly two letters and only two letters -- one for each tube. Thus, the answer for a question might be (A) & (B), or it might be (D) & (F). If more than one pair of options are possible correct answers, fill in only one correct pair of options. Thus, if (A) & (B) is one acceptable answer, and (C) & (D) is another acceptable answer, fill in either (A)&(B) or (C)&(D), but not both.   If a factor (such as identity, blood type, gender, etc.) is not specified in the protocol, then that factor will be ignored in grading the answer. 

Alternatively, if a protocol cannot be satisfied with two from (A)-(H), fill in (I).

 

8. (3 pts) Choose two tubes to guarantee replication of individual, marker, and blood type but gender is not replicated; the replication of individual should be blind to the lab; that is, the lab should not be able to tell from the information on the tubes that the two samples have the same blood type.

two tubes or I:              (A)       (B)       (C)       (D)       (E)       (F)       (G)       (H)            (I)

 

9. (3 pts) Make the tubes replicated for marker, gender and blood type but not replicated for individual.

two tubes or I:              (A)       (B)       (C)       (D)       (E)       (F)       (G)       (H)            (I)

 

 

(10-12). For each of the following statements, mark the appropriate letters that describe the data design features present. Mark a data feature only if it is explicitly present at some level in the problem description. All questions are MTF.

 

(A) explicit protocol

(B) replication

(C) standards

(D) random

(E) blind

(F) none

 

10. (4 pts). You decide to test whether sober people can routinely pass the SFST, and whether age affects performance.  You recruit 200 people of different ages and inform them only that they will be given the SFST, they must be sober at the time (verified with a breathalyzer test that is calibrated against a blank), and that you are interested in whether men are better than women at passing the test; they are not told about your interest in the effect of age.  They are asked to show up in alphabetical order on the same day.  The test is administered by officers in uniform who are certified to administer the test and who follow formal test procedures; the actual trials are videotaped and verified by others who are also certified.  MTF

 

(A)       (B)       (C)       (D)       (E)       (F)

 


 

11. (4 pts) Some high school students decide to test the power of prayer on plant growth.  They plant 50 bean seeds individually in pots, and when 40 of the seeds have germinated, the pots are divided into two groups of 20 each.  Both groups are subjected to daily prayers for good growth.  At the end of one month, the height of each plant is measured, and the averages between the two groups are compared.  MTF

(A)       (B)       (C)       (D)       (E)       (F)

 

12. (4pts) You are hired as a consultant for a company selling home pregnancy tests to help them market a product that will be easy to use and give accurate results.  You advise them to put a picture of a woman on the front of the box and directions for use on the back.  Furthermore you suggest that they provide supplies for just a single test, so that if a woman wants to test herself again, she has to purchase a second kit.  Finally you suggest that they include a sample solution in the kit that will provide a definite positive result that can be used if the woman tests negative.  Which aspects of the ideal data template would be satisfied by a single kit if your recommendations are followed?  MTF

 

(A)       (B)       (C)       (D)       (E)       (F)

 

13. (5 pts)  The following pair of graphs was shown in relation to the coin flip demo in class.  Which points were illustrated by the difference between the left and right graphs?  The horizontal axis is the proportion heads, and both horizontal axes span 0 to 1.  MTF

 

 

(A)   There is greater bias in the left graph, because the left shows that more people failed to get the right proportion of heads.

(B)   Classes from different years have generated different distributions of the proportion of heads

(C)   Replication reduces sampling error

(D)   The right graph has the least RPA error.
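
The relationship between the number of flips and the spread in the proportion of heads can be explored with a small simulation. This is a minimal sketch, not the class demo itself: the flip counts (10 and 100), the number of simulated people (1000), the fair coin, and the function names are all assumptions chosen for illustration.

```python
import random
import statistics


def proportion_heads(n_flips: int, rng: random.Random) -> float:
    """Flip a fair coin n_flips times and return the proportion of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


def spread_of_proportions(n_flips: int, n_people: int = 1000, seed: int = 0) -> float:
    """Standard deviation of the proportion of heads across n_people,
    each of whom flips the coin n_flips times."""
    rng = random.Random(seed)
    props = [proportion_heads(n_flips, rng) for _ in range(n_people)]
    return statistics.pstdev(props)


few_flips = spread_of_proportions(10)    # each person flips 10 times
many_flips = spread_of_proportions(100)  # each person flips 100 times

print(f"spread with 10 flips:  {few_flips:.3f}")
print(f"spread with 100 flips: {many_flips:.3f}")
```

For a fair coin the theoretical spread is sqrt(0.25 / n), about 0.158 for 10 flips and 0.05 for 100 flips, so the simulated values should land near those figures.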

 

 

 

 

 

 

Drug Testing, DWI testing

           

14. (5 pts). What constitutes a standard in a drug test for evaluating lab error rates?  (MTF)

 

A)    A sample with a known level of drug present.

B)    A sample known to be drug-free.

C)    A written procedure describing the level of performance to be upheld by the lab

D)    Any measure taken by the lab to detect or reduce human and technical error

E)    A proficiency test given to the lab that does the analysis, regardless of whether the test is blind.

 

15.  (5 pts)  The reading of a "blood alcohol content" (BAC) with a breathalyzer can be a bad test of actual alcohol levels in blood for which of the following reasons?  For an option to be true, the reason must both be a true statement and be a reason that an erroneous (seriously false) reading could be obtained. This question is not concerned with either RPA error or with the fact that all models are false.  Rather, we want to know why a breathalyzer could give a reading that is not close to the actual BAC.   MTF

 

A)   Substances in the breath other than alcohol can affect the reading

B)    Alcohol in the mouth can affect the reading

C)    A breathalyzer cannot be tested with a standard.

D)   A breathalyzer does not actually use blood to measure alcohol content.  This reason, by itself, means that an accurate reading of BAC cannot be obtained.

 

16. (6 pts) MTF:  The standardized field sobriety test (SFST)

A)   is partly a test of coordination

B)    is partly a test of involuntary eye movement

C)    is partly a test of ability to follow directions

D)   involves a formal protocol in scoring and administration (instructions)

E)    is scored against you if you start before being told to do so.

F)    is a false model of whether an individual is impaired by alcohol because it applies the same standards to everyone, regardless of how they would perform sober.

 

17. (4pts) The breathalyzer score sheet of the person shown in the video revealed:

A)   blind testing

B)    replication

C)    standards

D)   randomization

 

 

DNA Typing plus Criminal Justice System

 

18. (4pts) Letters were read in class (and are in the Book) from the Chicago Police Dept to the FBI requesting DNA typing of samples. Which aspects of ideal data were specifically described in those letters (included in those requests)?  MTF

 

A)   Replication of the same sample

B)    Standards

C)    Randomization

D)   Blind

 

 

 

19. (4 pts)  With forensic evidence, a court is often told that material from the crime scene matches the suspect.  The random match probability (RMP) gives the chance that the sample from a crime could have come from a randomly chosen person not involved in the crime.  Suppose the formal RMP calculation is 1/billion but the lab error rate of giving false matches is 1%.  How do those two numbers affect the significance of a match?  ("significance" is the overall chance that the sample did NOT come from the suspect.)  MTF

 

(a)   The significance is not affected, and remains 1/billion.

(b)   The significance is approximately 1/billion x probability (0.99) that no error was made, hence the significance is slightly less than but close to 1/billion

(c)   The significance is approximately 1/billion PLUS 1%, and in this case is close to the lab error rate.

(d)   You cannot calculate the "significance" if there might have been human & technical error.
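
One way to combine the two numbers in this question can be sketched as follows. This is a minimal sketch, assuming that a coincidental match and a lab error are independent events and that either one alone is enough to produce a reported match with the wrong person; the function name is hypothetical, and the independence assumption is not stated in the question.

```python
def chance_match_is_wrong(rmp: float, lab_error_rate: float) -> float:
    """Chance that a reported match arises from a coincidental match,
    a lab error, or both, assuming the two are independent."""
    return 1.0 - (1.0 - rmp) * (1.0 - lab_error_rate)


rmp = 1e-9        # random match probability from the question (1/billion)
lab_error = 0.01  # lab false-match rate from the question (1%)

combined = chance_match_is_wrong(rmp, lab_error)
simple_sum = rmp + lab_error  # approximation that ignores the tiny overlap term

print(f"combined chance: {combined:.10f}")
print(f"simple sum:      {simple_sum:.10f}")
```

When one source of error is millions of times larger than the other, the combined value is dominated by the larger one.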

 

 

20. (4pts) Which protocol features are not needed for drug testing (e.g., for the presence of cocaine) but are needed for DNA typing and determining the significance of a match?  MTF

A)   replication

B)    a knowledge of lab error rates

C)    standards in the form of a sample of known properties

D)   methods to calculate a RMP

E)    blind processing of samples

F)    a reference database from the human population

 

 

21. (4pts) Which forensic methods have been shown (in proficiency tests) to have error rates sometimes exceeding 10% or to otherwise be unreliable?  MTF

 

A) polygraph (lie detector)

B) voice matching

C) hair matching (not DNA based)

D) bite mark matching

E) eyewitness identification

F) handwriting identification

G) fingerprint matching

 

 

22. (4pts)  It was noted in class that over 100 people convicted of serious crimes have been released because subsequent evidence showed that they could not have committed the crimes for which they were convicted.  Which points about those wrongful convictions are true? Or which statements about processes that led to wrongful convictions are true?  The italicized statement is true.  MTF

A)   The most common factors associated with wrongful convictions were bad or faulty data.

B)    Approximately 2/3 of the first 100 convicts who were subjected to DNA testing (after conviction) were shown to be innocent.  This fraction (2/3) is likely a biased fraction of all wrongful convictions, because the first 100 convicts tested was not a random sample.

C)    Since DNA typing has been routinely implemented before trial, it has been found that the prosecution's prime suspect is nearly always compatible with the DNA evidence: only 5% of prime suspects are cleared by the DNA before trial.

D)   Fallibility of eyewitness identification has come to light only in the last 1-2 decades because the first experimental tests were conducted that recently.

 

 

 

 

 

(23-24).  The next 2 questions address types of problems with forensic data used in court.  Use the following set of options:

A)   bad protocols

B)    bad standards (including inadequate databases, lack of proficiency testing)

C)    lack of blind

D)   inadequate replication

E)    failure to randomize

 

 

23. (4pts).  (MTF)  Which options are fixed by using coded samples? 

(A)  (B)  (C)  (D)  (E)

 

24. (4pts).  (MTF) Which options were given as the main problems with much of forensic evidence?

(A)  (B)  (C)  (D)  (E)

 

 

 

 

 

25. (4 pts.) Exam Key Code: Fill in (AB) on question 25 to indicate your exam code.  Also, fill in the correct bubbles for your name and pad number on the scantron form.