1. (4 pts) The lectures on "data" began with the suggestion that
there were some key questions to ask of any study. Which of the following were in that list? MTF
(A) How many data exist?
(B) What's the evidence?
(C) Who obtained the evidence?
(D) How was the evidence obtained?
Data: Error
2. (5 pts) Which of the following
demonstrations or videos were used to illustrate errors in data or "ideal"
data, and which options also correctly identify the type of error illustrated
or the purpose of the demo? An option
that correctly refers to only part of the demo should be considered
correct.
A) coin flip to illustrate bias
B) coin flip to illustrate how to reduce sampling error
C) "choose a random odd number" to illustrate RPA error
D) a video clip from the movie Spinal Tap to illustrate (bad) standards: "it's one louder"
E) a video clip from a Monty Python video to illustrate RPA error: "Penguin intelligence"
(3-5). For each of the following descriptions, indicate
the types of error present (the italicized phrase identifies the error). Mark a type of error only if it is
definitely present. Do not assume any more than what is explicitly described. One
answer only for each question, but an option may be used more than once. Note
that you have the option of "none."
Type of error:
A) RPA
B) sampling
C) Human & technical
D) Bias
E) No error indicated
3. (4 pts) A high school football coach
decides to test whether his players might be on steroids. He tests his 50 players (first and
second string) and finds that, while nearly all of his players have
testosterone levels within the normal range, one player (Jerry) has a high
level, one that only 2% of high school football players (males) his age should
have. Based on this evidence, the coach disciplines Jerry. In thinking about whether the
disciplinary action is justified, what type of error could
account for the unusually high testosterone level in Jerry? (one only)
(A) (B) (C) (D) (E)
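As a rough check on how surprising a single high reading is, the chance that at least one of 50 tested players exceeds a level that only 2% of players should have can be worked out directly. The sketch below is illustrative only: it assumes the 50 players are independent draws from the reference population, and the function name `prob_at_least_one` is hypothetical rather than part of the exam.

```python
def prob_at_least_one(n_tested: int, tail_fraction: float) -> float:
    """Chance that at least one of n_tested independent individuals
    falls in a tail holding tail_fraction of the population."""
    # Probability that nobody is in the tail is (1 - tail_fraction) ** n_tested.
    return 1.0 - (1.0 - tail_fraction) ** n_tested


# 50 players tested; the threshold is exceeded by 2% of the reference population.
p = prob_at_least_one(50, 0.02)
print(f"P(at least one high reading among 50 players) = {p:.3f}")
```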
4. (4 pts) Two UT students, Alice and
Lilia, are conducting a poll to assess UT-wide student attitudes about a
topic. Alice is a computer whiz
and puts a web page up to let other UT students express their opinions. Lilia visits dorm cafeterias and has
students fill out forms while dining. After the first day, the poll results
from hundreds of students for each method are noticeably different, so Alice
and Lilia accuse each other of being sloppy with their data. But the same approximate outcome
happens the second day as well. What type of error is suggested by the consistent
difference in poll response between the web page interface and the forms filled
out by dining students? (one only)
(A) (B) (C) (D) (E)
5. (4 pts) A racing car's performance on a track
is compared with and without a fuel additive. In multiple trials with and without the additive, the car's
top speed is consistently 3 miles per hour faster with the additive, but fuel
use is also higher (per mile) with the additive. Thus, it is determined that the additive is advantageous
only for short races, because more time is taken refueling the car than is
gained by the small increase in top speed. What type of error in data is indicated by the difference
in speed of the car with and without the fuel additive? (one only)
(A) (B) (C) (D) (E)
6. (5 pts) A customer complains that
prices charged at the Far West HEB grocery store are often different than the
advertised prices, and further suggests that the difference often goes in favor
of the store (the charged price is usually higher than the advertised price). The store does an audit of all 3450
items it sells and finds that there are several items marked differently than
charged. A distribution of the differences is shown below;
the center bar represents items for which the charged price was the same as the advertised price;
bars to the left of center are items in which the charged price was more than advertised,
bars to the right are those in which the charged price was less than advertised.
[Figure: distribution of differences between charged and advertised prices]
Which options fairly describe the data and the customer complaints? MTF
A) The suggestion that the store usually charges more
than advertised is a claim that the price differences are biased.
B) The distribution shown in the figure does not support
much of a bias (if any) in the price difference.
C) The study should have chosen the 3450 items randomly.
7. (7 pts) Which of the following points
about ideal data are true? MTF
A) Randomization (if done properly) eliminates nearly all sources of bias. For example, if subjects are randomly assigned to treatment
and control groups in a drug trial, the main concerns with bias are nearly
eliminated.
B) Although replication will not eliminate bias, it usually reduces it.
C) If followed to the letter, a written procedure minimizes errors in the data. That is, most error creeps in because
the data are not gathered strictly according to the protocol.
D) Even when the protocol is followed, there are usually many subtle ways in which the written protocol
does not describe how the data were collected. This is one way in which the written procedure is a false
model.
E) Sampling error (of something like a rate or probability) is never completely eliminated by
replication. Thus, there are
examples pertinent to human endeavors in which a trial involving 3,000 individuals
missed effects that later became important.
F) RPA error is reduced with standards.
G) The temperature of ice water would be a standard for a thermometer.
(8-9). Do-it-yourself protocol. You are conducting an
external review/test of a genotyping lab. Your job is to send two tubes to
the lab, with labels. There are several options for the content of and label on
a tube. You must decide which contents to send and how to label the tubes so
that the features of ideal data requested in the question are present from the
lab's perspective. If a tube has a person's name on it, the lab can assume that
the tube contents belong to the name of the person on the label. If a tube is
labeled with a number, the contents are unknown to the lab but known to you.
Your options for tube contents and tube labels are:
option | tube label | Contents in the tube are from | Blood type | Gender | Marker status
(A) | Laura Baker | Laura Baker | B | Female | +
(B) | Darin Rokyta | Darin Rokyta | AB | Male | negative
(C) | Rachael Springman | Rachael Springman | O | Female | +
(D) | #132 | Darin Rokyta | AB | Male | negative
(E) | #218 | Patsy Cline | A | Female | +
(F) | #10 | Pam Hines | O | Female | negative
(G) | Jerry Allison | Jerry Allison | B | Male | negative
(H) | #101 | Brent Iverson | AB | Male | negative
(I) | No combination of tubes can satisfy the protocol
In the following
questions, choose two letters among options (A)-(H) to describe the two tubes
that will be sent to the lab. The tube labels are the only information the
lab receives about the samples, and the lab does not have prior information
about the individuals. If it is possible to satisfy the
protocol, the question will require exactly two letters and only two letters -- one for each tube. Thus, the answer for a question
might be (A) & (B), or it might be (D) & (F). If more than one pair of
options are possible correct answers, fill in only one correct pair of options.
Thus, if (A) & (B) is one acceptable answer, and (C) & (D) is another
acceptable answer, fill in either (A)&(B) or (C)&(D), but not both. If a factor (such as identity, blood type, gender,
etc.) is not specified in the protocol, then that factor will be ignored in
grading the answer.
Alternatively, if a protocol cannot be satisfied with
two from (A)-(H), fill in (I).
8. (3 pts)
Choose two tubes to guarantee replication of individual, marker, and blood
type but gender is not replicated; the replication of individual
should be blind to the lab, that is, the lab should not be able to tell
from the information on the tubes that the two samples have the same blood type.
two tubes or I: (A) (B) (C) (D) (E) (F) (G) (H) (I)
9. (3 pts)
Make the tubes replicated for marker, gender and blood type but not
replicated for individual.
two tubes or I: (A) (B) (C) (D) (E) (F) (G) (H) (I)
(10-12). For each of the
following statements, mark the appropriate letters that describe the data
design features present. Mark a data feature only if it is explicitly present
at some level in the problem description. All questions are MTF.
(A) explicit protocol | (C) standards | (E) blind
(B) replication | (D) random | (F) none
10. (4 pts)
You decide to test whether sober people can routinely pass the SFST, and
whether age affects performance.
You recruit 200 people of different ages and inform them only that they
will be given the SFST, that they must be sober at the time (verified with a
breathalyzer test that is calibrated against a blank), and that you are
interested in whether men are better than women at passing the test; they are
not told about your interest in the effect of age. They are asked to show up in alphabetical order on the same
day. The test is administered by
officers in uniform who are certified to administer the test and who follow
formal test procedures; the actual trials are videotaped and verified by
others who are also certified. MTF
(A) (B) (C) (D) (E) (F)
11. (4 pts) Some high school students decide to test the power of
prayer on plant growth. They plant
50 bean seeds individually in pots, and when 40 of the seeds have germinated,
the pots are divided into two groups of 20 each. Both groups are subjected to daily prayers for good growth. At the end of one month, the height of
each plant is measured, and the averages of the two groups are
compared. MTF
(A) (B) (C) (D) (E) (F)
12. (4 pts) You are hired as a consultant for a company selling home pregnancy tests to help them market a product that will be easy to use and give accurate results. You advise them to put a picture of a woman on the front of the box and directions for use on the back. Furthermore, you suggest that they provide supplies for just a single test, so that if a woman wants to test herself again, she has to purchase a second kit. Finally, you suggest that they include a sample solution in the kit that will provide a definite positive result that can be used if the woman tests negative. Which aspects of the ideal data template would be satisfied by a single kit if your recommendations are followed? MTF
(A) (B) (C) (D) (E) (F)
13. (5 pts) The following pair of graphs was shown in relation to
the coin flip demo in class. Which
points were illustrated by the difference between the left and right
graphs? The horizontal axis is the
proportion heads, and both horizontal axes span 0 to 1. MTF
[Figure: left and right graphs of the distribution of the proportion of heads]
(A) There is greater bias in the left graph, because the left shows that more people failed to get the right proportion of heads.
(B) Classes from different years have generated different distributions of the proportion of heads.
(C) Replication reduces sampling error
(D) The right graph has the least RPA error.
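The effect of replication on the spread of the proportion of heads can be explored with a small simulation. This is only a sketch under stated assumptions: a fair coin, 500 simulated students, and flip counts of 10 and 100 chosen for illustration (the exam does not give the numbers used in class). The function names are hypothetical.

```python
import random
import statistics


def proportion_heads(n_flips: int, rng: random.Random) -> float:
    """Proportion of heads in n_flips tosses of a fair coin."""
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


def spread_of_proportions(n_flips: int, n_students: int, seed: int = 0) -> float:
    """Standard deviation of the proportion of heads across simulated students."""
    rng = random.Random(seed)
    proportions = [proportion_heads(n_flips, rng) for _ in range(n_students)]
    return statistics.pstdev(proportions)


# Assumed values: each simulated student flips 10 times versus 100 times.
few_flips = spread_of_proportions(n_flips=10, n_students=500)
many_flips = spread_of_proportions(n_flips=100, n_students=500)
print(f"spread with 10 flips:  {few_flips:.3f}")
print(f"spread with 100 flips: {many_flips:.3f}")
```

With more flips per student, the proportions cluster more tightly around 0.5, which is the pattern a narrower histogram would show.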
Drug Testing, DWI testing
14. (5 pts) What constitutes a standard in a drug test for
evaluating lab error rates? (MTF)
A) A sample with a known level of drug present.
B) A sample known to be drug-free.
C) A written procedure describing the level of
performance to be upheld by the lab
D) Any measure taken by the lab to detect or reduce human
and technical error
E) A proficiency test given to the lab that does the
analysis, regardless of whether the test is blind.
15. (5 pts) The reading of a "blood alcohol content" (BAC) with a
breathalyzer can be a bad test of actual alcohol levels in blood for which of
the following reasons? For an
option to be true, the reason must both be a true statement and be a reason
that an erroneous (seriously false) reading could be obtained. This question is
not concerned with either RPA error or with the fact that all models are false. Rather, we want to know why a
breathalyzer could give a reading that is not close to the actual BAC. MTF
A) Substances in the breath other than alcohol can affect
the reading
B) Alcohol in the mouth can affect the reading
C) A breathalyzer cannot be tested with a standard.
D) A breathalyzer does not actually use blood to measure
alcohol content. This reason, by
itself, means that an accurate reading of BAC cannot be obtained.
A) is partly a test of coordination
B) is partly a test of involuntary eye movement
C) is partly a test of ability to follow directions
D) involves a formal protocol in scoring and
administration (instructions)
E) is scored against you if you start before being told
to do so.
F) is a false model of whether an individual is impaired
by alcohol because it applies the same standards to everyone, regardless of how
they would perform sober.
17. (4pts) The breathalyzer score sheet of the person shown in the video revealed:
A) blind testing
B) replication
C) standards
D) randomization
18. (4pts) Letters were read in class (and are in the Book) from
the Chicago Police Dept to the FBI requesting DNA typing of samples. Which
aspects of ideal data were specifically described in those letters (included in
those requests)? MTF
A) Replication of the same sample
B) Standards
C) Randomization
D) Blind
19. (4 pts) With
forensic evidence, a court is often told that material from the crime scene
matches the suspect. The random
match probability (RMP) gives the chance that the sample from a crime could
have come from a randomly chosen person not involved in the crime. Suppose the formal RMP calculation is
1/billion but the lab error rate of giving false matches is 1%. How do those two numbers affect the significance
of a match? ("significance" is the
overall chance that the sample did NOT come from the suspect.) MTF
(a) The significance is not affected, and remains 1/billion.
(b) The significance is approximately 1/billion x probability (0.99) that no error was made, hence the
significance is slightly less than but close to 1/billion
(c) The significance is approximately 1/billion PLUS 1%, and in this case is close to the lab error
rate.
(d) You cannot calculate the "significance" if there might have been human & technical error.
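The arithmetic of combining the two numbers can be sketched as follows. This is an illustration under an explicit assumption: a coincidental match and a false match from lab error are treated as independent ways a reported match can be wrong. The function name `chance_match_is_wrong` is hypothetical.

```python
def chance_match_is_wrong(rmp: float, lab_error_rate: float) -> float:
    """Chance that a reported match is wrong, from either a coincidental
    match (rmp) or a false match caused by lab error.

    Assumption: the two sources of a wrong match are independent.
    """
    # A match is right only if neither source of error occurred.
    return 1.0 - (1.0 - rmp) * (1.0 - lab_error_rate)


rmp = 1e-9             # formal random match probability (1/billion)
lab_error_rate = 0.01  # lab rate of giving false matches (1%)

combined = chance_match_is_wrong(rmp, lab_error_rate)
print(f"RMP alone:      {rmp:.2e}")
print(f"lab error rate: {lab_error_rate:.2e}")
print(f"combined:       {combined:.10f}")
```

Because the lab error rate is so much larger than the RMP, it dominates the combined figure.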
20. (4pts) Which protocol features are not needed for drug
testing (e.g., for the presence of cocaine) but are needed for DNA
typing and determining the significance of a match? MTF
A) replication
B) a knowledge of lab error rates
C) standards in the form of a sample of known properties
D) methods to calculate a RMP
E) blind processing of samples
F) a reference database from the human population
21. (4pts) Which forensic methods have been shown (in proficiency
tests) to have error rates sometimes exceeding 10% or to otherwise be
unreliable? MTF
A) polygraph (lie detector)
B) voice matching
C) hair matching (not DNA based)
D) bite mark matching
E) eyewitness identification
F) handwriting identification
G) fingerprint matching
22. (4pts) It was noted in class
that over 100 people convicted of serious crimes have been released because
subsequent evidence showed that they could not have committed the crimes for
which they were convicted. Which points
about those wrongful convictions are true? Or which statements about processes
that led to wrongful convictions are true? The italicized statement is true. MTF
A) The most common factors
associated with wrongful convictions were bad or faulty data.
B) Approximately 2/3 of
the first 100 convicts who were subjected to DNA testing (after conviction)
were shown to be innocent. This fraction
(2/3) is likely a biased fraction of all wrongful convictions, because the
first 100 convicts tested were not a random sample.
C) Since DNA typing has
been routinely implemented before trial, it has been found that the
prosecution's prime suspect is nearly always compatible with the DNA evidence:
only 5% of prime suspects are cleared by the DNA before trial.
D) Fallibility of
eyewitness identification has come to light only in the last 1-2 decades
because the first experimental tests were conducted that recently.
(23-24). The next 2 questions address types of problems with forensic data used in
court. Use the following set of options:
A) bad protocols
B) bad standards (including inadequate databases, lack of proficiency testing)
C) lack of blind
D) inadequate replication
E) failure to randomize
23. (4 pts) (MTF) Which options are
fixed by using coded samples?
(A) (B) (C) (D) (E)
24. (4 pts) (MTF)
Which options were given as the main problems with much of forensic
evidence?
(A) (B) (C) (D) (E)
25. (4 pts) Exam Key Code: Fill in
(AB) on question 25 to indicate your exam code. Also, fill in the correct bubbles for your name and pad
number on the scantron form.