EVIDENCE FOR CONSCIOUSNESS-RELATED ANOMALIES IN RANDOM PHYSICAL SYSTEMS
Document Type:
Collection:
Document Number (FOIA) /ESDN (CREST):
CIA-RDP96-00789R002200520001-0
Release Decision:
RIFPUB
Original Classification:
K
Document Page Count:
16
Document Creation Date:
November 4, 2016
Document Release Date:
May 18, 2000
Sequence Number:
1
Case Number:
Content Type:
REPORT
File:
Attachment | Size |
---|---|
CIA-RDP96-00789R002200520001-0.pdf | 842.64 KB |
Body:
Vol. 19, No. 12, December 1989
Printed in Belgium
Evidence for Consciousness-Related Anomalies in
Random Physical Systems
Dean I. Radin¹ and Roger D. Nelson²
Received May 6, 1988; revised June 12, 1989
Speculations about the role of consciousness in physical systems are frequently
observed in the literature concerned with the interpretation of quantum mechanics.
While only three experimental investigations can be found on this topic in physics
journals, more than 800 relevant experiments have been reported in the literature
of parapsychology. A well-defined body of empirical evidence from this domain
was reviewed using meta-analytic techniques to assess methodological quality and
overall effect size. Results showed effects conforming to chance expectation in
control conditions and unequivocal non-chance effects in experimental conditions.
This quantitative literature review agrees with the findings of two earlier reviews,
suggesting the existence of some form of consciousness-related anomaly in random
physical systems.
The nature of the relationship between human consciousness and the
physical world has intrigued philosophers for millennia. In this century,
speculations about mind-body interactions persist, often contributed by
physicists in discussions of the measurement problem in quantum mechanics.
Virtually all of the founders of quantum theory-Planck, de Broglie,
Heisenberg, Schrödinger, Einstein-considered this subject in depth,(1) and
contemporary physicists continue this tradition.(2-7)
1 Department of Psychology, Princeton University, Princeton, New Jersey 08544. Present
address: Contel Technology Center, 15000 Conference Center Drive, P.O. Box 10814,
Chantilly, Virginia 22021-3808.
2 Department of Mechanical and Aerospace Engineering, Princeton University, Princeton,
New Jersey 08544.
The following expression of the problem can be found in a recent
interpretation of quantum theory:

If conscious choice can decide what particular observation I measure, and therefore
into what states my consciousness splits, might not conscious choice also
be able to influence the outcome of the measurement? One possible place where
mind may influence matter is in quantum effects. Experiments on whether it is
possible to affect the decay rates of nuclei by thinking suitable thoughts would
presumably be easy to perform, and might be worth doing.(8)
Given the distinguished history of speculations about the role of
consciousness in quantum mechanics, one might expect that the physics
literature would contain a sizable body of empirical data on this topic. A
search, however, reveals only three studies.
The first is in an article by Hall, Kim, McElroy, and Shimony, who
reported an experiment "based upon taking seriously the proposal that the
reduction of the wave packet is due to a mind-body interaction, in which
both of the interacting systems are changed."(9) This experiment examined
whether one person could detect if another person had previously observed
a quantum mechanical event (gamma emission from sodium-22 atoms).
The idea was based on the supposition that if person A's observation
actually changes the physical state of a system, then when person B observes
the same system later, B's experience may be different according to
whether A has or has not looked at the system. Hall et al.'s results, based
on a total of 554 trials, did not support the hypothesis; the observed
number of "hits" obtained in their experiment was precisely the number
expected by chance (277), while the variance of their measurements was
significantly smaller than expected (p < 0.05).(9)

The second study is referred to by Hall et al., who end their article by
pointing out that a similar, unpublished experiment using cobalt-57 as the
source was successful (40 hits out of 67 trials).(10)
The third study is a more systematic investigation reported by
Jahn and Dunne,(11) who summarize results of over 25 million binary
trials collected during seven years of experimentation with random-event
generators. These experiments, involving long-term data collection with
33 unselected individuals, provide persuasive, replicable evidence of an
anomalous correlation between conscious intention and the output of
random number generators.
Thus, of three pertinent experiments referenced in mainstream physics
journals, one describes results statistically too close to chance expectation
and two describe positive effects.(9-11) Given the theoretical implications of
such an effect, it is remarkable that no further experiments of this type can
be found in the physics literature; but this is not to say that no such
experiments have been performed. In fact, dozens of researchers have
reported conceptually identical experiments in the puzzling and uncertain
domain of parapsychology. Perhaps because of the insular nature of
scientific disciplines, the vast majority of these experiments are unknown
to most scientists. A few critics who have considered this literature have
dismissed the experiments as being flawed, nonreplicable, or open
to fraud,(12-16) but their assertions are countered by at least two
detailed reviews which provide strong statistical support for the existence
of anomalous consciousness-related effects with random number
generators.(17,18) In this paper, we describe the results of a comprehensive,
quantitative meta-analysis which focused on the questions of methodological
quality and replicability in these experiments.
The experiments involved some form of microelectronic random
number generator (RNG), a human observer, and a set of instructions for
the observer to attempt to "influence" the RNG to generate particular
numbers, or changes in a distribution, solely by intention. RNGs are
usually based upon a source of truly random events such as electronic
noise, radioactive decay, or randomly seeded pseudorandom sequences.
Feedback about the distribution of random events is often provided in the
form of a digital display, but audio feedback, computer graphics, and a
variety of other mechanisms have also been used. Some of the RNGs
described in the literature are technically sophisticated, the best devices
employing electromagnetic shielding, environmental failsafe mechanisms
triggered by deviant voltages, currents, or temperature, automatic
computer-based data recording on magnetic media, redundant hard copy
output, periodic randomness calibrations, and so on.(19,20)
RNGs are typically designed to produce a sequence of random bits at
the press of a button. After generating a sequence of, say, 100 random bits
(0's or 1's), the number of 1's in the sequence may be provided as feedback.
In an experimental protocol using a binary RNG, a run might consist of
an observer being asked to cause the RNG to produce, in three successive
button presses, a high number (sum of 1's greater than chance expectation
of 50), a low number (less than 50), and a control condition with no direc-
tional intention. An experiment might consist of a group of individuals
each contributing a hundred such runs, or one individual contributing
several thousand runs. Results are usually analyzed by comparing high
aim and low aim means against a control mean or theoretical chance
expectation.
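To make the run structure concrete, the following is a minimal Python sketch (not code from any reviewed laboratory) that simulates one hypothetical run: three button presses of 100 binary samples each under high, low, and control intentions, with each count of 1's converted to a Z score against binomial chance expectation. The parameters (100 bits per press, p = 0.5) are illustrative assumptions.

```python
import random
from math import sqrt

def button_press(n_bits=100, p=0.5):
    """Simulate one button press: n_bits truly random bits; return the count of 1's."""
    return sum(1 for _ in range(n_bits) if random.random() < p)

def z_score(count, n_bits=100, p=0.5):
    """Standard normal deviate of the observed count of 1's against chance expectation."""
    mean = n_bits * p
    sd = sqrt(n_bits * p * (1 - p))
    return (count - mean) / sd

# One run: three successive presses under "high", "low", and "control" intentions.
run = {intention: button_press() for intention in ("high", "low", "control")}
for intention, count in run.items():
    print(f"{intention:8s} count of 1's = {count:3d}, Z = {z_score(count):+.2f}")
```

An experiment would aggregate many such runs and compare the high-aim and low-aim means against the control mean or theoretical expectation, as described above.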
The quantitative literature review, also called meta-analysis, has
become a valuable tool in the behavioral and social sciences.(21)
Meta-analysis is analogous to well-established procedures used in the
physical sciences to determine parameters and constants. The technique
assesses replication of an effect within a body of studies by examining the
distribution of effect sizes.(27) In the present context, the null hypothesis
(no mental influence on the RNG output) specifies an expected mean effect
size of zero. A homogeneous distribution of effect sizes with nonzero mean
indicates replication of an effect, and the size of the deviation of the mean
from its expected value estimates the magnitude of the effect.
Meta-analyses assume that effects being compared are similar across
different experiments, that is, that all studies seek to estimate the same
population parameters. Thus the scope of a quantitative review must be strictly
delimited to ensure appropriate commonality across the different studies
that are combined.(21,25) This can present a nontrivial problem in meta-analytic
reviews because replication studies typically investigate a number
of variables in addition to those studied in the original experiments. In the
present case, because different subjects, experimental protocols, and RNGs
were employed within the reviewed literature, some heterogeneity
attributable to these factors was expected in the obtained distribution of
effect sizes. However, the circumscription for the review required that every
study in the database have the same primary goal or hypothesis, and hence
estimate the same underlying effect.
Experiments selected for review examined the following hypothesis:
The statistical output of an electronic RNG is correlated with observer
intention in accordance with prespecified instructions, as indicated by
the directional shift of distribution parameters (usually the mean) from
expected values.

Because this "directional shift" is most often reported as a standard
normal deviate (i.e., Z score) in the reviewed experiments, we determined
effect size as a Z score normalized by the square root of the sample size
(N), $e = Z/\sqrt{N}$, where N was the total number of individual random events
(with probability of a hit at p = 0.5, p = 0.25, etc.). This effect size measure
is equivalent to a Pearson product-moment correlation.(27)
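As a small illustration of this definition (a restatement of e = Z/√N, not the study's own software), the conversion from a reported Z score and trial count to an effect size might look like this:

```python
from math import sqrt

def effect_size(z: float, n_events: int) -> float:
    """Per-study effect size: the Z score normalized by the square root of
    the total number of individual random events, e = Z / sqrt(N)."""
    return z / sqrt(n_events)

# Hypothetical example: a study reporting Z = 2.0 over 100,000 binary events
# yields a tiny per-bit effect size, on the scale of a Pearson correlation.
print(effect_size(2.0, 100_000))  # ~0.0063
```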
3.1. Unit of Analysis
To avoid redundant inclusion of data in ameta-analysis, ~ "units of
analysis" are often specified. We employed the following method: If
an author distinguished among several experiments reported in a single
article with titles such as "pilot test" or "confirmatory test," or provided
independent statistical summaries, each of these studies was coded and
quality-assessed separately. If an experiment consisted of two or more
conditions comparing different intentions or types of RNG devices, the
data were split into separate units of analysis to allow the results to be
coded unambiguously. In general, within a given reviewed report, the
largest possible aggregation of nonoverlapping data collected under a
single intentional aim was defined as the unit of analysis (hereafter called
an experiment or study).
For each experiment, a Z score was assigned corresponding to
whether the observed result matched the direction of intention. Thus, a
negative Z obtained under intention to "aim low" was recorded as a
positive score. When sufficient data were provided in a report, Z was
calculated from those data and compared with the reported results; the
new calculation was used if there was a discrepancy. If only probability
levels were reported, these were transformed into the corresponding Z
score. For experiments reported only as "nonsignificant," a conservative
value of Z = 0 was assigned; if the outcome was reported only as "statistically
significant," Z = 1.645 was assigned; and if sample size was not reported
or could not be calculated from the information provided, a special
code of N = 1 was assigned.
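A compact sketch of these coding rules; the record fields ("outcome", "direction_matches_intention", "n_events") are names invented here for illustration, not part of the original coding scheme:

```python
def coded_z(report: dict) -> tuple[float, int]:
    """Return (Z, N) for one study following the coding rules described above."""
    outcome = report.get("outcome")
    if outcome == "nonsignificant":      # conservative assignment
        z = 0.0
    elif outcome == "significant":       # reported only as "statistically significant"
        z = 1.645
    else:
        z = abs(report["z"])
        # A deviation in the direction of intention is scored positive, e.g. a
        # negative Z obtained under "aim low" is recorded as a positive score.
        if not report.get("direction_matches_intention", True):
            z = -z
    n = report.get("n_events") or 1      # special code N = 1 when sample size is unknown
    return z, n

print(coded_z({"z": -2.1, "direction_matches_intention": True, "n_events": 5000}))
```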
3.2. Assessing Quality
Because the hypothesized anomalous effect is not easily accommodated
within the prevailing scientific world-view, it is particularly
important to assess the trustworthiness of each reviewed experiment.
Unfortunately, estimating experimental quality tends to be a subjective
task confounded by prior expectations and beliefs.(25,27) Estimates of inter-judge
reliability in assessing the quality of research reports, for example,
rarely exceed correlations of 0.5.(28) We addressed this problem by
assigning to each experiment a single quality weight derived from a set of
sixteen binary (present/absent) criteria. The first author coded and
double-checked the coding for all studies; the second author independently
coded the first 100 studies. Inter-judge reliability for quality criteria was
r = 0.802 with 98 degrees of freedom.
These criteria were developed from published criticisms about
random-number generator experiments(14,15,29-33) and from expert opinion
on important methodological considerations when performing studies
involving human behavior.(20,34,35) Collectively, these criteria form a
measure of credibility by which to judge the reported data. The criteria
assess the integrity of the experiment in four categories-procedures,
statistics, the data, and the RNG device-and they cover virtually all
methodological criticisms raised to date. They are (1) control tests noted,
(2) local controls conducted, (3) global controls conducted, (4) controls
established through the experimental protocol, (5) randomness calibrations
conducted, (6) failsafe equipment employed, (7) data automatically recorded,
(8) redundant data recording employed, (9) data double checked,
(10) data permanently archived, (11) targets alternated on successive trials,
(12) data selection prevented by protocol or equipment, (13) fixed run
lengths specified, (14) formal experiment declared, (15) tamper-resistant
RNG employed, and (16) use of unselected subjects.
Each criterion was coded as being present or absent in the report of
an experiment, specifically excluding consideration of previously published
descriptions of RNG devices or control tests. This strategy was employed
to reflect lower confidence in such experiments since, for example, randomness
tests conducted once on an RNG do not guarantee acceptable performance
in the same RNG in all future experiments. As a result, assessed
quality was conservative, that is, lower than the "true" quality for some
experiments, especially those reported only as abstracts or conference
proceedings. Using unit weights (which have been shown to be robust in
such applications(36)) on each of the sixteen descriptors, the quality rating
for an individual experiment was simply the sum of the descriptors. Thus,
while a quality score near zero indicated a low quality or poorly reported
experiment, a score near sixteen reflected a highly credible experiment.
3.3. Assessing Effect Size
Assume that each of K experiments produces an effect size estimate $e_i$ of
a parameter E, based on $N_i$ samples, and that each $e_i$ has a known standard
error $s_i$. The weighted mean effect size is calculated as $\bar{e} = \sum_i w_i e_i / \sum_i w_i$,
where $w_i = 1/s_i^2 = N_i$ and i ranges from 1 to K. The standard error of $\bar{e}$ is
$s_{\bar{e}} = (\sum_i w_i)^{-1/2}$. A test for homogeneity of the K estimates $e_i$ is given by
$H_K = \sum_i w_i (e_i - \bar{e})^2$, where $H_K$ has a chi-square distribution with K - 1
degrees of freedom.(37) The same procedure can be followed to test for
homogeneity of effect size across M independent investigators. In this case,
$\bar{e}_j$ and $s_{\bar{e}_j}$ are calculated per investigator, and the test for homogeneity is
performed as $H_M = \sum_j w_j (\bar{e}_j - \bar{e}_M)^2$, where $\bar{e}_j$ and $w_j = 1/s_{\bar{e}_j}^2$ are the mean
weighted effect size and its weight per investigator, respectively,
$\bar{e}_M = \sum_j w_j \bar{e}_j / \sum_j w_j$, and j ranges from 1 to M. $H_M$ has M - 1 degrees of freedom.

For a quality-weighted analysis, we may determine $\bar{e}_Q = \sum_i Q_i w_i e_i / \sum_i Q_i w_i$,
where $Q_i$ is the quality assessed for experiment i. The
standard error associated with $\bar{e}_Q$ is $s_{\bar{e}_Q} = (\sum_i Q_i^2 w_i)^{1/2} / \sum_i Q_i w_i$; the
test for homogeneity is similar to that described above.
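A minimal numerical sketch of these estimators, assuming each study i supplies an effect size e_i and event count N_i (so w_i = N_i) and, optionally, a quality score Q_i; the three-study example at the end is hypothetical:

```python
from math import sqrt

def weighted_mean_effect(e, w):
    """Weighted mean effect size and its standard error:
    e_bar = sum(w*e)/sum(w), se = sum(w)**-0.5, with w_i = 1/s_i**2 = N_i."""
    sw = sum(w)
    e_bar = sum(wi * ei for wi, ei in zip(w, e)) / sw
    return e_bar, 1.0 / sqrt(sw)

def homogeneity(e, w, e_bar):
    """H = sum w_i (e_i - e_bar)^2, approximately chi-square with len(e) - 1 df."""
    return sum(wi * (ei - e_bar) ** 2 for wi, ei in zip(w, e))

def quality_weighted_mean(e, w, q):
    """Quality-weighted mean effect size and its standard error, as defined above."""
    sqw = sum(qi * wi for qi, wi in zip(q, w))
    e_q = sum(qi * wi * ei for qi, wi, ei in zip(q, w, e)) / sqw
    se_q = sqrt(sum(qi * qi * wi for qi, wi in zip(q, w))) / sqw
    return e_q, se_q

# Hypothetical three-study example.
e, w, q = [0.01, -0.002, 0.004], [10_000, 40_000, 25_000], [12, 9, 15]
e_bar, se = weighted_mean_effect(e, w)
print(e_bar, se, homogeneity(e, w, e_bar), quality_weighted_mean(e, w, q))
```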
Finally, following the practice of reviewers in the physical sciences,(23,24) we deleted potential
"outlier" studies to obtain a homogeneous distribution of effect sizes and to
reduce the possibility that the calculated mean effect size may have been
spuriously enlarged by extreme values. The procedure used was as follows:
If the homogeneity statistic for all studies was significant (at the p < 0.05
level), the study that would produce the largest reduction in this statistic
was deleted; this was repeated until the homogeneity statistic had become
nonsignificant.
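The trimming procedure just described can be sketched as follows. This is an illustrative implementation under the same assumptions as above (w_i = N_i), not the original analysis code; SciPy's chi2.sf supplies the p-value of the homogeneity statistic.

```python
from scipy.stats import chi2

def trim_to_homogeneity(e, w, alpha=0.05):
    """Delete studies one at a time until the homogeneity statistic H is
    nonsignificant at `alpha`. Returns the surviving (e, w) lists."""
    e, w = list(e), list(w)

    def H(es, ws):
        e_bar = sum(wi * ei for wi, ei in zip(ws, es)) / sum(ws)
        return sum(wi * (ei - e_bar) ** 2 for wi, ei in zip(ws, es))

    while len(e) > 2 and chi2.sf(H(e, w), df=len(e) - 1) < alpha:
        # Remove the single study whose deletion yields the largest drop in H.
        drop = min(range(len(e)),
                   key=lambda i: H(e[:i] + e[i + 1:], w[:i] + w[i + 1:]))
        del e[drop], w[drop]
    return e, w
```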
On-line bibliographic databases for psychology and physics journals
were searched, as was a specialized database covering parapsychological
articles, technical reports, conference proceedings and manuscripts.
Altogether 152 references were found from 1959 to 1987. These reports
described 832 studies conducted by 68 different investigators (597
experimental studies and 235 control studies). Fifty-four experimental and
33 control studies reported only as nonsignificant were assigned Z = 0. Six
experiments and two control studies coded as (N = 1, Z > 0) were
eliminated from further meta-analysis because effect size could not be
accurately estimated (this required the elimination of one investigator who
reported a single study). Figures 1 and 2 show the distributions of Z scores
reported for control and experimental studies, respectively.
Fig. 1. Distribution of Z scores reported in 235 control studies. Thirty-three of these studies
were reported only as "nonsignificant" and were assigned Z scores of zero. To replace the
spurious spike at Z = 0, those 33 studies were recast as normally distributed Z scores,
bounded by ±1.64, averaging Z = 0.
Fig. 2. Distribution of Z scores reported in 597 experimental studies. Fifty-four of these
studies were reported as "nonsignificant" and were assigned Z scores of zero. As in Fig. 1,
those 54 studies were recast as normally distributed Z scores, bounded by ±1.64, averaging
Z = 0.
Fig. 3. Mean effect size point estimates ±1 standard error
for (a) control studies and (b) individual experiments;
(c) mean effect size per investigator; (d) homogeneous mean
effect size for experiments; (e) homogeneous mean effect size
per investigator; (f) mean effect size for quality-weighted
experiments; and (g) mean effect size for homogeneous
quality-weighted experiments.
These results, expressed as overall mean effect sizes, show that control
studies conform well to chance expectation (Fig. 3a), and that experimental
effects, whether calculated for studies or investigators, deviate significantly
from chance expectation (Figs. 3b, 3c). To obtain a homogeneous distribution
of effect sizes, it was necessary to delete 17% of individual outlier
studies (Fig. 3d) and 13% of mean effect sizes across investigators (Fig. 3e).
This may be compared with exemplary physical and social science reviews,
where it is sometimes necessary to discard as many as 45% of the studies
to achieve a homogeneous effect size distribution.(19) Of individual studies
deleted, 77% deviated from the overall mean in the positive direction, and
of investigator means deleted, all were positive (i.e., supportive of the
experimental hypothesis).
4.1. Effect of Quality
Some critics have postulated that as experimental quality increases in
these studies, effect size would decrease, ultimately regressing to the "true"
value of zero, i.e., chance results. The relationship observed between
assessed quality and effect size in this database does not support that
prediction, suggesting that the present database is not compromised by poor
experimental methodology. Another assessment of the effect of quality was obtained by
comparing unweighted and quality-weighted effect sizes per experiment
(Fig. 3b vs. 3f). These are nearly identical, and the same is true after
deleting outliers to obtain a homogeneous quality-weighted distribution
(Fig. 3d vs. 3g), confirming that differences in methodological quality are
not significant predictors of effect size.
It might be argued that the quality assessment procedure employed
here was nonoptimal because some quality criteria are more important
than others, so that if appropriate weights were assigned, the
quality-weighted effect size might turn out to be quite different. This was
tested by Monte Carlo simulation, using sets of 16 weights, one per
criterion, randomly selected over the range 0 to 6. A quality-weighted effect
size was calculated for the 597 experiments as before, now using the
random weights instead of unit weights, and this process was repeated one
thousand times, yielding a distribution of possible quality ratings. The
average effect size from the simulation was 3.18 × 10⁻⁴ ± 0.15 × 10⁻⁴,
indicating that in this particular database coded by these sixteen criteria,
the probable range of the quality-weighted mean effect size clearly excludes
chance expectation of zero.
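A sketch of that Monte Carlo check, with synthetic placeholder data standing in for the 597 coded experiments; the criterion matrix, effect sizes, and sample sizes below are illustrative assumptions, not the actual database:

```python
import random

def quality_weighted_effect(e, n, criteria, weights):
    """Quality-weighted mean effect size, where each study's quality Q_i is the
    weighted sum of its sixteen binary criteria and w_i = N_i."""
    q = [sum(wt * c for wt, c in zip(weights, crit)) for crit in criteria]
    num = sum(qi * ni * ei for qi, ni, ei in zip(q, n, e))
    den = sum(qi * ni for qi, ni in zip(q, n))
    return num / den

# Placeholder database: 597 studies with random binary criteria (illustrative only).
random.seed(1)
studies = 597
criteria = [[random.randint(0, 1) for _ in range(16)] for _ in range(studies)]
e = [random.gauss(3e-4, 1e-3) for _ in range(studies)]
n = [random.randint(1_000, 100_000) for _ in range(studies)]

# 1000 sets of 16 random criterion weights drawn from the range 0 to 6.
results = []
for _ in range(1000):
    weights = [random.uniform(0, 6) for _ in range(16)]
    results.append(quality_weighted_effect(e, n, criteria, weights))
print(min(results), sum(results) / len(results), max(results))
```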
Although accounting for differences in assessed quality does not nullify
the effect, it is well known in the behavioral and social sciences that
nonsignificant studies are published less often than significant studies (this is
called the "filedrawer" problem(21,41-43)). If the number of nonsignificant
studies in the filedrawer is large, this reporting bias may seriously inflate
the effect size estimated in a meta-analysis. We explored several procedures
to estimate the magnitude of this problem and to assess the possibility
that the filedrawer problem can sufficiently explain the observed results.
The filedrawer hypothesis implicitly maintains that all or nearly all
significant positive results are reported. If positive studies are not balanced
by reports of studies having chance and negative outcomes, the empirical
Z score distribution should show more than the expected proportion of
scores in the positive tail beyond Z = 1.645. While no argument can be
made that all negative effects are reported, it is interesting to note that the
database contains 37 Z scores in the negative tail, where only 30 would be
expected by chance. On the other hand, there are 152 scores in the positive
tail, about five times as many as expected. The question is whether this
excess represents a genuine deviation from the null hypothesis or a defect
in reporting or editorial practices.
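The chance expectation quoted here follows from the standard normal tail area beyond Z = 1.645 (about 5%) applied to 597 studies; a quick check using SciPy's normal survival function:

```python
from scipy.stats import norm

n_studies = 597
p_tail = norm.sf(1.645)          # one-tailed area beyond Z = 1.645, about 0.05
expected = n_studies * p_tail    # roughly 30 studies expected per tail by chance

print(f"expected per tail: {expected:.1f}")   # ~29.9
print(f"observed negative tail: 37, observed positive tail: 152 "
      f"(about {152 / expected:.1f}x expectation)")
```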
This question may be addressed by modeling based on the assumption
that all significant positive results are reported. A four-parameter fit minimizing
the chi-square goodness-of-fit statistic was applied to all observed
data with Z ≥ 1.645, using the exponential distribution of Eq. (1)
to simulate the effect of skew or kurtosis in producing the disproportionately
long positive tail. This exponential is a probability distribution
with the same mean and variance as the normal distribution, but with
kurtosis = 3.0.

To begin, the null hypothesis of a (0, 1) normal distribution with no
kurtosis was considered. To account for the excess in the positive tail,
N = 585,000 filedrawer studies were required, and the chi-square statistic
remained far too large to indicate a reasonable fit (see Table I). This large
N, in comparison with the 597 studies actually reported, together with the
poor goodness-of-fit statistic, suggests that the assumption of a (0, 1)
normal distribution is inappropriate.
Table I. Four-Parameter Fit (E:N, N, Mean, sd) Minimizing Chi-Square (10 df)
Goodness-of-Fit Statistic to the Positive Tail of the Observed Z Score Distribution,
for Several Exponential:Normal Ratios(a)

E:N | N | Mean | sd | Chi-square | p
---|---|---|---|---|---
Normal distribution (null hypothesis) | | | | |
0 | 585,000 | 0 | 1 | 57,867.84 | 0
1 | 5,300 | 0 | 1 | 220.97 | 0
2 | 4,800 | 0 | 1 | 167.84 | 0
3 | 4,600 | 0 | 1 | 148.45 | 0
10 | 4,400 | 0 | 1 | 119.69 | 0
Empirical distribution | | | | |
0 | 700 | 0.145 | 2.10 | 23.94 | 0.008
1 | 747 | 0.345 | 1.90 | 16.32 | 0.091
2 | 757 | 0.445 | 1.80 | 14.21 | 0.164
3 | 777 | 0.445 | 1.80 | 11.08 | 0.226
10 | 807 | 0.445 | 1.80 | 11.08 | 0.351

(a) The null hypothesis is tested by clamping the mean at 0 and the standard deviation at 1,
allowing N and E:N to vary. The empirical database is addressed by allowing all four
parameters to vary.
Adding simulated kurtosis to a (0, 1) normal distribution by mixing
exponential [Eq. (1)] and normal distributions in a 1:1 ratio reduced N by
two orders of magnitude, and ratios of 2:1, 3:1, and 10:1 exponential to
normal (E:N) yielded further small improvements. However, the chi-square
statistic still indicated a poor fit to the empirical data. Applying
the same mixture of exponential and normal distributions, but starting
from the observed values of N = 597, mean Z score = 0.645, and standard
deviation = 1.601, with the constraint that the mean could only decrease
from 0.645, resulted in much better fits to the data. Table I shows the
results.
This procedure shows that the null hypothesis is unviable, even after
allowing a huge filedrawer. The chi-square fit vastly improves with the
addition of kurtosis, but only becomes a reasonably good fit when mean
and standard deviation are allowed to approximate the empirical values.
The filedrawer estimate from this model depends on a number of assumptions
(e.g., the true distribution is generally normal, but has a disproportionately
large positive tail). It suggests a total number of experimental
studies on the order of 800, of which three-fourths have been formally
reported.
A somewhat simpler modeling procedure was applied to the data
assuming that all studies with significant Z scores in either the positive or
negative tail are reported. The model is based on the normal distribution
with a standard deviation =1, and estimates the mean and N required to
account for the 152 Z scores in the positive tail and 37 Z scores in the
negative tail. This mean-shift model, which ignores the shape of the
observed distribution, results in an N = 1,580 and a mean Z score = 0.34.
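A sketch of that mean-shift calculation: under a normal distribution with unit standard deviation and unknown mean, solve for the mean and total N that reproduce 152 scores beyond Z = +1.645 and 37 beyond Z = -1.645. The use of SciPy's brentq root-finder here is an implementation choice for illustration, not a description of the original computation.

```python
from scipy.stats import norm
from scipy.optimize import brentq

pos_tail, neg_tail, crit = 152, 37, 1.645

# The ratio of the two tail probabilities depends only on the shifted mean mu.
def tail_ratio_error(mu):
    p_pos = norm.sf(crit - mu)     # P(Z > +1.645) under N(mu, 1)
    p_neg = norm.cdf(-crit - mu)   # P(Z < -1.645) under N(mu, 1)
    return p_pos / p_neg - pos_tail / neg_tail

mu = brentq(tail_ratio_error, 0.0, 1.0)   # mean shift reproducing the observed tail ratio
n_total = pos_tail / norm.sf(crit - mu)   # total studies implied by the positive tail
print(f"mean shift = {mu:.2f}, implied total N = {n_total:.0f}")  # ~0.34 and ~1,580
```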
These modeling efforts suggest that the number of unreported or
unretrieved RNG studies falls in the range of 200 to 1,000. A remaining
question is, how many filedrawer studies with an average null result would
be required to reduce the effect to nonsignificance (i.e., p > 0.05)?

28. "Evaluating psychological research reports: Dimensions, reliability,
and correlates of quality judgments," Am. Psychol. 33, 920-934 (1978).
29. C. Akers, "Methodological criticisms of parapsychology," in Advances in Parapsychological
Research, Vol. 4, S. Krippner, ed. (McFarland, Jefferson, North Carolina, 1984); "Can
meta-analysis resolve the ESP controversy?" in A Skeptic's Handbook of Parapsychology,
P. Kurtz, ed. (Prometheus Books, Buffalo, New York, 1985).
30. J. E. Alcock, "Parapsychology: Science of the anomalous or search for the soul," Behav.
Brain Sci. 10, 553-565 (1987).
31. P. Diaconis, "Statistical problems in ESP research," Science 201, 131-136 (1978).
32. C. E. M. Hansel, ESP and Parapsychology: A Critical Reevaluation (Prometheus Books,
Buffalo, New York, 1980).
33. R. Hyman, "The ganzfeld psi experiment: A critical appraisal," J. Parapsychol. 49, 3-50
(1985).
34. T. X. Barber, Pitfalls in Human Research: Ten Pivotal Points (Pergamon Press, Elmsford,
New York, 1976).
35. J. B. Rhine, "Comments: 'A new case of experimenter unreliability,'" J. Parapsychol. 38,
215-255 (1974).
36. R. M. Dawes, "The robust beauty of improper linear models in decision making," Am.
Psychol. 34, 571-582 (1979).
37. L. V. Hedges, "How hard is hard science, how soft is soft science?" Am. Psychol. 42,
443-455 (1987).
38. C. E. M. Hansel, ESP: A Scientific Evaluation (Charles Scribner's Sons, New York, 1966),
p. 234.
39. R. Rosenthal and D. B. Rubin, "Interpersonal expectancy effects: The first 345 studies,"
Behav. Brain Sci. 3, 377-415 (1978).
40. G. V. Glass, B. McGaw, and M. L. Smith, Meta-Analysis in Social Research (Sage Publications,
Beverly Hills, California, 1981).
41. Q. McNemar, "At random: Sense and nonsense," Am. Psychol. 15, 295-300 (1960).
42. S. Iyengar and J. B. Greenhouse, "Selection models and the file-drawer problem,"
Technical Report 394, Department of Statistics, Carnegie-Mellon University (July 1987).
43. L. V. Hedges, "Estimation of effect size under nonrandom sampling: The effects of
censoring studies yielding statistically insignificant mean differences," J. Educ. Stat. 9,
61-86 (1984).
44. H. M. Collins, Changing Order: Replication and Induction in Scientific Practice (Sage
Publications, Beverly Hills, California, 1985).
45. S. Epstein, "The stability of behavior, II: Implications for psychological research," Am.
Psychol. 35, 790-806 (1980).
46. D. Druckman and J. A. Swets, eds., Enhancing Human Performance: Issues, Theories, and
Techniques (National Academy Press, Washington, D.C., 1988), p. 207.
47. A. Neher, The Psychology of Transcendence (Prentice-Hall, Englewood Cliffs, New Jersey,
1980), p. 147.
Printed by Catherine Press, Ltd., Tempelhof 41, B-8000 Brugge, Belgium