PROJECT STAR GATE RESEARCH AND PEER REVIEW PLAN
Document Type:
Collection:
Document Number (FOIA) /ESDN (CREST):
CIA-RDP96-00789R002700010001-1
Release Decision:
RIPPUB
Original Classification:
S
Document Page Count:
106
Document Creation Date:
November 4, 2016
Document Release Date:
February 13, 2003
Sequence Number:
1
Case Number:
Publication Date:
June 1, 1994
Content Type:
RS
File:
Attachment | Size |
---|---|
CIA-RDP96-00789R002700010001-1.pdf | 9.11 MB |
Body:
Approved For Release 2003/04/18 : CIA-RDP96-00789R002700010001-1
SECRET
PRG-TR-1068-SL
DEFENSE
INTELLIGENCE
AGENCY
PROJECT STAR GATE
RESEARCH AND PEER REVIEW PLAN (U)
JUNE 1994
NOFORN
SECRET
LIMDIS
STAR GATE
SECRET
PROJECT STAR GATE
RESEARCH AND PEER REVIEW PLAN (U)
This document was prepared by the
Technology Assessment and Support Activity
Office for Ground Forces
Directorate for Military Assessments
National Military Intelligence Production Center
Defense Intelligence Agency
Date of Publication
June 1994
REPRODUCTION REQUIRES
APPROVAL OF ORIGINATOR
OR HIGHER DOD AUTHORITY
LIMITED DISSEMINATION
FURTHER DISSEMINATION ONLY AS DIRECTED BY DIA/PAG
OR HIGHER DOD AUTHORITY
CLASSIFIED BY MULTIPLE SOURCES
DECLASSIFY ON OADR
SECRET
NOT RELEASABLE TO FOREIGN NATIONALS
STAR GATE
UNCLASSIFIED
OUTLINE
EXECUTIVE SUMMARY
I. INTRODUCTION
II. PLAN OBJECTIVES
III. SIGNIFICANCE OF EFFORT
IV. PLAN OVERVIEW
V. BASIC RESEARCH PLAN FOR ANOMALOUS COGNITION
VI. BASIC RESEARCH PLAN FOR ANOMALOUS PERTURBATION
VII. APPLIED RESEARCH PLAN FOR ANOMALOUS COGNITION
SG1 B
IX. POTENTIAL RESEARCH RETURN
X. PROJECT OVERSIGHT
XI. DEVELOPMENT OF EVALUATION CRITERIA
XII. BUDGET AND RESOURCE REQUIREMENTS (FYs 95-99)
APPENDICES
A. CONGRESSIONALLY-DIRECTED ACTION, DEFENSE AUTHORIZATION CONFERENCE
B. TERMINOLOGY AND DEFINITIONS
C. POTENTIAL RESEARCH SUPPORT FACILITIES
D. RESOURCE LITERATURE
E. CURRENT CONTRACTOR SCIENTIFIC OVERSIGHT COMMITTEE MEMBERSHIP
F. CURRENT CONTRACTOR INSTITUTIONAL REVIEW BOARD
G. ACADEMIC STUDIES REGARDING THE SCIENTIFIC VALIDITY OF AMP
H. AN ASSESSMENT OF THE ENHANCED HUMAN PERFORMANCE PROGRAM
I. IN-HOUSE STAFFING REQUIREMENTS
(U) EXECUTIVE SUMMARY:
(S/NF/SG/LIMDIS) In compliance with the congressional
conferees' request (Appendix A), DIA proposes to develop a multi-
year research and development program, subject to rigorous
scientific and technical oversight, to demonstrate the scientific
validity of the STAR GATE program, and that results of military
and intelligence value can be obtained in a cost-effective manner
using anomalous mental phenomena (AMP).
(S/NF/SG/LIMDIS) This proposed program, if successfully
implemented, will:
- Identify the underlying mechanisms of AMP.
- Establish the limits of operational usefulness of
- Determine the degree to which foreign activities in
AMP represent a threat to national security.
- Lead to the development of countermeasures to
neutralize this threat.
- Use research findings to improve operational
activities.
- Develop data fusion criteria to integrate AMP results
with other intelligence sources.
(S/NF/SG/LIMDIS) Due to the diversity of the STAR GATE
mission/objectives, both external resources and in-house
expertise are required. Since this Activity possesses no in-
house R&D capability, external R&D support is essential to meet
the Congressional concerns addressed in this program plan. A
balance will be maintained between external and in-house
activities, and every effort will be made to integrate and link
these activities where appropriate. The external aspect permits
a wide range of expertise covering many disciplines to be focused
on this area; this also has the benefit of ensuring peer group
review and of facilitating a variety of scientific interactions.
In-house personnel with a wide range of expertise in this
phenomenology will need to be retained to make this proposed
plan work.
(S/NF/SG/LIMDIS) In order to fulfill Congressional
Direction, the DIA proposes to convene a Scientific Evaluation
Panel (SEP) composed of representatives from each of the Service
Scientific Advisory Boards. The purpose of the SEP is to review
and validate the methodology outlined in the plan in order to
address the cost-effectiveness and performance criteria for the
STAR GATE program's research and development objectives and to
propose recommendations as to which objectives should be pursued
and the program scope required to achieve those objectives. If
the SEP determines that objectives in the plan are viable and
executable, the General Defense Intelligence Program (GDIP)
Manager will compete this initiative with others for limited
available resources remaining in the program.
(U) The proposed ongoing R&D effort will be reviewed every
two years by the SEP to determine whether the STAR GATE program
can show results that are cost-effective and satisfy reasonable
performance criteria.
(C) An annual report will document the current
operational, technical and administrative status of the program.
I. (U) INTRODUCTION:
(S/NF/SG/LIMDIS) This program plan was developed in
response to a Defense Authorization Conference, Congressionally
Directed Action (CDA) to prepare a long-term systematic and
comprehensive research and peer review plan in order to
investigate anomalous mental phenomena (AMP), and to apply
program research results to potential operational activities.
This plan also describes key in-house activities along with an
appropriately integrated basic and applied external research
support effort.
(S/NF/SG/LIMDIS) Specifically, this program plan
represents DIA's view on how best to proceed with both in-house
activities and external research support for the period of FY95
through FY99. Research findings, both domestic and foreign, and
results from operational activities may lead to updates of this
plan in order to reflect improved phenomena understanding and to
pursue follow-on research and/or application directions.
(S/NF/SG/LIMDIS) An underlying and fundamental premise
governing the implementation of this program plan is that a well-
integrated interdisciplinary approach is considered to be the
most appropriate strategy for conducting research in this diverse
field. Consequently, this plan includes a wide variety of
research topics which are based on recent findings from leading-
edge pursuits in other disciplines that are suspected of being
germane for STAR GATE. Other topics are derived from a review of
worldwide research, consultations with leading area experts, and
insights gained from previous research and application
activities associated with the STAR GATE program.
(S/NF/SG/LIMDIS) This program plan also includes
recommended proposed FY funding which will allow for the STAR
GATE program to show results that are cost effective and will at
the same time satisfy reasonable program performance criteria.
The implementation of this program plan will preclude a
recurrence of the yearly cycle of project start-up, limited
progress, and anticipated project shut-down which previously
inhibited program activity.
(S/NF/SG/LIMDIS) In sum, the implementation of this
research and peer review plan will allow DIA to successfully
accomplish identified R&D activities which, in turn, will enhance
the capability of STAR GATE personnel to engage in operational
activities and to assess the work done by potential adversaries,
thereby reducing the risk of a technological surprise.
(U) Terminology and definitions are discussed at
Appendix B.
II. (U) PLAN OBJECTIVES:
(S/NF/SG/LIMDIS) The objective of this follow-on research
and peer review plan is to make further progress in phenomena
understanding and/or validation, in applications understanding,
and in operational feasibility evaluation. This continued work
will have a direct bearing on DIA's ability both to assess the
significance of foreign research and to perform a systematic
review of potential applications regarding these phenomena.
(S/NF/SG/LIMDIS) Accomplishment of the various activities
identified in this plan will further enhance threat assessment of
foreign achievements in this area, and will help achieve the
potential for U.S. military/intelligence applications on select
tasks as a supplement to HUMINT operations.
(U) It is anticipated that this plan will assist decision
makers in their review and consideration of future directions for
this field, and that this plan can begin formal implementation
starting in FY95.
(S/NF/SG/LIMDIS) In compliance with the Congressional
conferees' request, DIA recommends that a period of six to nine
months be set aside at the beginning of this new program for the
purpose of identifying the most promising and cost-effective
experiments to be conducted under the program to meet the overall
research objectives outlined below. It is further suggested that
a series of small working groups consisting of scientific experts
from a variety of pertinent disciplines meet during this time
period to accomplish this end. Their suggestions will be
presented to the STAR GATE Scientific Oversight Committee for
final approval.
III. (U) SIGNIFICANCE OF EFFORT:
(S/NF/SG/LIMDIS) STAR GATE is a dynamic approach for
pursuing the largely unexplored area of human consciousness and
subconsciousness interaction. Its scope is comprehensive; a wide
range of phenomenological issues are examined that include
psychological, physiological/neurophysiological, physics and
other leading-edge scientific areas. Although broad in scope,
STAR GATE is well grounded due to its solid independent
scientific review base. STAR GATE is based on a dynamic style in
all its endeavors, especially in its pursuit of on-going foreign
activities in this area.
(S/NF/SG/LIMDIS) One of the tasks previously levied on DIA
by the FY91 Defense Authorization Act was to develop a long-range
comprehensive plan for investigating parapsychological phenomena.
This task was one of several objectives included in a new program
for this phenomenological area that identified DIA as executive
agent. Moreover, the FY91 Defense Authorization Act authorized
a funding level of $2 million for DIA in order to
initiate this new program. As a result, a balanced and
integrated plan to include operations, foreign assessment, and
research and development was implemented. In addition, a new
DIA limited dissemination (LIMDIS) program, codeword STAR GATE,
was established in order to accomplish the objectives that were
set forth in this plan.
(S/NF/SG/LIMDIS) The external research support conducted
under monies appropriated to date comes to a close in the
March/April 1994 time-frame. This matters because when
research activities utilizing human subjects are interrupted, it
has generally been necessary to begin again rather than resume
activities from the point of termination. Consequently, it is
important for the STAR GATE program to remain stable.
Research involving human use differs considerably from that
involving physical systems. For example, data from human
subjects cannot be collected or analyzed as rapidly, in that
additional empirical data is often required to reach analytical
conclusions. This type of data analysis utilizing human subjects
can only be achieved with an in-place, uninterrupted, multi-year
research and development program. Therefore, should it be
decided to go forward with this program, it should be done in a
timely fashion.
(S/NF) The funding allocation for external research
received by STAR GATE in FY91 and continued through FY93
permitted several important research areas to be initiated and
continued. It is anticipated that results of this research will
assist in clarifying some of the possible future research
directions; consequently, not all long-range research
possibilities can be identified in this plan. However, almost all
of the major investigation areas can be addressed, and many of
the specifics can be identified with reasonable confidence.
Figure 1 presents an overview of overall research objectives for
both Anomalous Cognition (AC) and Anomalous Perturbation (AP)
which will be considered for inclusion in this program.
(S/NF) Previous basic research activities from FY91
through FY93 focused on the following: (1) validating findings
from previous magnetoencephalograph (MEG) research and initiating
new work with a variety of conditions and individuals; (2)
performing a variety of anomalous cognition (AC) experiments to
determine potential correlations (e.g., target type,
environmental factors); (3) developing various theoretical
constructs that might be testable and that could help explain the
phenomena; (4) examining effects of altered states on data
quality; (5) initiating review of and research into the
energetics area; and (6) examining various application
possibilities (e.g., communication, search).
(U) Results from previous basic and applied research
activity have been factored into this research and development
plan and provide the basis upon which further R&D efforts will be
built.
IV. (U) PLAN OVERVIEW:
A. (U) BASIC RESEARCH OBJECTIVES
(S/NF/SG/LIMDIS) The objective of basic research is to
understand the fundamental, underlying mechanisms for AMP. To
achieve this objective in an efficient way, basic research of the
detection mechanism should begin in a conservative direction.
That is, assume that a putative "sensorial" system exists for AMP
and that it most likely will behave similarly to those common
elements which are known through the five senses. This
conservative approach generalizes to understanding the source of
AMP and its propagation mechanisms.
Figure 1 (U) Research Overview
[Chart of Anomalous Mental Phenomena research areas. Legible headings: Anomalous Cognition (1.0 Detector, 2.0 Transmission, 3.0 Source) and Anomalous Perturbation (including 2.0 Macro), with topics under each grouped as Basic, Applied, and Mixed.]
B. (U) APPLIED RESEARCH OBJECTIVES
(S/NF/SG/LIMDIS) The objective of applied research is
to improve AMP functioning to its maximum possible limit. To
realize this objective, it is critical to define AMP output
measures that are consistent with a laboratory setting
and/or an operational environment. The approach should also
reflect scientific conservatism. In investigating any single
variable (e.g., different training methodologies) all other
variables should remain as constant as possible (e.g., use the
same individuals and known good target systems).
C. (U) FOREIGN ASSESSMENT SUPPORT OBJECTIVES
(S/NF) From a research perspective, the objective of
foreign assessment is to determine the degree to which claims
from foreign laboratories can be confirmed in a U.S.-based
setting. In science, replication is critical for understanding.
V. (U) BASIC RESEARCH PLAN FOR ANOMALOUS COGNITION:
A. (U) BASIC APPROACH
(S/NF) The link of basic and applied research with
other applications investigations or with research activities is
shown in Figure 2. The top of the chart shows that for any
research or application task, certain conditions must be met
(e.g., a reliable calibrated individual is required; proper
scientific procedures need to be developed, etc.). Once these
basic foundations are laid, then basic/applied research can be
initiated with a reasonable expectation of success and with
assurance that results will not be ambiguous or fail scientific
scrutiny.
(S/NF) This chart also illustrates the difference
between basic and applied research; applied research relates to
various methods for collecting, recording, improving and
analyzing data output, while basic research is aimed at phenomena
understanding. In this chart, the "detector" is the human
brain/mind, the "source" is the target or an aspect of the
target, and "transmission" refers to notions of how information
and/or energy are actually transmitted between source and
detector.
(U) Figure 3 illustrates the interdisciplinary scope
that will be brought to bear on this research problem. Leading-
edge researchers in their various fields can provide clues, if
not make direct contributions, that will assist in phenomena and
applications understanding. Appendix C lists candidate research
support facilities that could be involved in this long-range
Figure 2 (U) Research Objectives
Conditions for a specific task or investigation:
- Reliable/calibrated receiver
- Appropriate target
- Optimum protocol for data collection
- Optimum data assessment
- Integration of results
Basic research:
- Source
- Transmission
- Detector
- Integration
Applied research:
- Receiver selection
- Receiver training
- Target selection
- Protocols
- Analysis
- Integration
- Countermeasures
Figure 3 (U) Integration of Scientific Disciplines
[Diagram of disciplines converging on Anomalous Mental Phenomena: general relativity; quantum [illegible]; thermodynamics; statistics/signal analysis; neural networks; psychoneuroimmunology; cognitive neuroscience; artificial intelligence; [illegible]; anthropology; psychophysiology.]
effort. Appendix D outlines pertinent research literature
applicable to this field. Final selection will be based on how
well the activities of these institutions will fit into specific
time-lines and priorities to be established in FY95. Figure 4
lists milestones for the anomalous cognition basic research to be
conducted under this plan.
B. (U) RESEARCH DETAILS
1. (U) Source.
(S/NF/SG/LIMDIS) Source research will address
those topics that show promise for understanding the
characteristics of the target or target area that may play a role
in anomalous cognition (AC) occurrence and data quality. Aspects
of the target that can be defined by conventional information
theory (involving entropy/information content) will be explored
in depth. A wide variety of targets with a wide range of
information content, dynamics, or other parameters will be
examined to explore this possible link. If not successful, other
approaches to investigate the targets' innate nature and its
possible link to phenomenon occurrence will be initiated.
Definitive data in this area would also have implications for
defining those targets which have the highest probability of
successful data acquisition in an operational setting, thus
establishing operational tasking parameters.
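(U) The plan does not specify how a target's entropy or information content would be computed. As an illustrative sketch only, assuming a target can be represented as a sequence of discrete values (for example, 8-bit pixel intensities; this representation is an assumption, not drawn from this document), the Shannon entropy of its value distribution could be calculated as follows. The function name and the sample targets are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Return the Shannon entropy, in bits per symbol, of a sequence of
    discrete values (e.g., pixel intensities of a candidate target)."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = sum over symbols of p * log2(1 / p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical targets, each flattened to 16 intensity values.
uniform_target = [0] * 16          # a single repeated value
checker_target = [0, 255] * 8      # two equally likely values
varied_target = list(range(16))    # sixteen distinct values

for name, target in [("uniform", uniform_target),
                     ("checker", checker_target),
                     ("varied", varied_target)]:
    print(name, shannon_entropy(target))
```

A measure of this kind would allow candidate targets to be ranked by information content before any comparison with AC data quality; whether this particular measure is the relevant one is an open question in the plan itself.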
2. (U) Transmission.
(S/NF) The pursuit of possible transmission
mechanisms for AC phenomena is essentially the most significant
basic research task and also the most difficult to formulate. In
this effort, a theoretical basis will be developed from
extensions of current theory in light of recent advanced physics
formulations. Some of these formulations permit unusual
"information flows" that may, in fact, have relevance for this
phenomenon. Testable models/constructs will be developed and
evaluated. A variety of other possible explanations involving
extensions of gravitation theory, quantum physics or other areas
will be constructed and tested where possible. Some of these
tests may require close cooperation of leading-edge researchers
using equipment in their facility.
(C/NF) Effort in this area will also focus on
integrating diverse aspects of the source, transmission, and
detector categories. For example, it will examine how
"targeting" occurs. Insight will be drawn from in-depth reviews
of various unusual physical effects identified by physical
sciences researchers. These include distant particle coupling
(Bell's theorem), ideas from quantum gravity, possible
electrostatic/gravity interactions, unusual quantum physics,
FIGURE 4 (U) BASIC RESEARCH MILESTONES - ANOMALOUS COGNITION
[Milestone chart covering the time frame 1995 through 1999. Activities listed:]
SOURCE RESEARCH (TARGET)
- Information/Entropy Analysis
- Various Target Attributes (Size, Form, Content)
TRANSMISSION RESEARCH (MECHANISM)
- Four-Dimensional Calculations (Relativity Extensions)
- Unconventional Waves (Laboratory) - (Long-Range Tests)
- Variables (Distance, Shielding, Energy)
DETECTOR RESEARCH (BRAIN)
- Neuroscience (EEG, Memory, Etc.)
- Environmental Factors
- Other Physiology (Electrical, Infrared)
- Implications from Medical/Animal Research
INTEGRATION
- Physical Sciences (Physics, Statistics, Parallel Processing, Etc.)
- Psychological Sciences (Psychology, Anthropology, Cognitive, Mental,
Subliminal Perception, Etc.)
- Medical (Genetics, Etc.)
observational theories, vacuum "energy" potential, and a variety
of other concepts.
(S/NF) Perhaps the most promising exploratory
model of all is one based on little-understood aspects of the
fundamental equations for electromagnetic wave propagation
(Maxwell's equations). These equations indicate that forms of
"wave propagation" could also exist that do not have the
conventional electric or magnetic field components (i.e., vector
and scalar waves). These waves would not be blocked by matter
and therefore could be leading candidates for AC propagation or
for certain aspects of AC phenomenon. Research papers
[redacted, SG1B] indicate that these
waves are considered a leading candidate for AC transmissions by
their researchers. Pilot study investigations in this area were
conducted by PAG-TA in FY92 with promising preliminary results.
Future research could couple with other DIA exploratory R&D
efforts in this area currently being explored.
(S/NF/SG/LIMDIS) Research on this topic will be
closely integrated with research involving the anomalous
perturbation (AP) aspect, since findings in the AP area would have
direct implications for phenomena transmission mechanisms in
general. Findings from the target (or target source) research
area would also provide insight into possible transmission
mechanisms. For example, different forms of the same target
(e.g., target size, 2D vs 3D, holographic representations) may
show patterns in the AC data that might provide clues regarding
phenomena mechanisms.
3. (U) Detector.
(U) The most important and promising route to
understanding the nature of the AC detection system in humans is
through modern advances in neuroscience. Earlier
neurophysiological results obtained from magnetoencephalograph
(MEG) measurements begun in FY92 will be validated and expanded.
This earlier work indicated that MEG correlations between visual
evoked response areas of the brain may exist, and that remote
stimuli might also be detectable in MEG data. Some of the
specific investigations will examine a variety of near and far-
field situations, other sensory modes and different types of
individuals in order to search for potential variables. It might
be possible, with advanced MEG instrumentation, to actually
locate the exact brain areas involved in AC phenomena occurrence.
Future research in this area could couple with research currently
being explored at the National Laboratory.
(U) Other physical/psychophysical aspects of the
central nervous system (CNS) will also be explored to look for
possible correlates. This would include galvanic skin responses
(GSR) or other parameters.
(U) Related to this overall area are several
investigations that relate to possible environmental interactions
with the brain that could affect AC data. This would include
possible geomagnetic or electromagnetic influences.
(S/NF) A spin-off from findings in this basic
research area could be unique communication applications.
MEG correlates might exist between remotely located people. If
so, transmission of remote messages (via a type of code) might
be possible. Since AC phenomenon is not
degraded by distance or shielding, the potential of transmitting
basic "messages" to individuals in submarines would exist.
Preliminary exploration of this application by PAG-TA has yielded
promising results.
(S/NF) Another potential spin-off benefit from
detector research in this program is that new insights into brain
memory or parallel processing might be achieved. This could lead
to new directions in advanced computer developments involving
neural networks. For example, recent [redacted, SG1B] indicates that
"wave-like" brain activity occurs in addition to usual neuronal
processes. This wave-like phenomenon may have some link to the
"phase shift" observed in MEG data from the previous MEG project.
Further MEG work involving remote stimuli may help clarify such
issues.
4. (U) Integration.
(U) The basic research activities will liberally
avail themselves of the existing research communities that
specialize in neuroscience, physics and statistics and the broader
psychological/social sciences. Direct support with a variety of
university departments, national and international, will be
explored. PAG-TA contacts with such national laboratories as Los
Alamos, Lawrence Livermore, Oak Ridge, and have indicated an
interest on their part in supporting the research efforts.
Frequent conferences and data exchanges are anticipated. These
data exchanges will ensure that a proper interdisciplinary
approach is maintained, and that findings from other disciplines
will be incorporated in this program where appropriate. This
peer group dialogue will greatly benefit research sponsored
through this plan; new ideas will be generated, and clues
regarding phenomena operation may be easier to identify.
(U) Some specific interdisciplinary examples that
will benefit this program are as follows:
- In 1990 the American Anthropological
Association (AAA) formed a new division, the Society for the
SECRET
NOT RELEASABLE TO FOREIGN NATIONALS
STAR GATE
LIMDIS
Approved For Release 2003/04/18 : CIA-RDP96-00789R002700010001-1
Approved For Release 2003/04/18 : CIA-RDP96-00789R002700010001-1
SECRET
Anthropology of Consciousness (SAC). This division has
established a technical journal to support interdisciplinary,
cross-cultural, experimental, and theoretical approaches to the
study of consciousness. This group may be able to contribute to
this program by providing cross-cultural examples. These members
might also assist in the assessment of foreign data in this area.
- The psychophysiology of vision has already
contributed to the earlier program. This plan calls for a
collaborative effort with researchers in an attempt to understand
how the central nervous system processes subliminal stimuli. This
should assist in understanding how MEG correlates occur.
- The relationship between mind and body is
currently discussed in the research literature as well as in the
popular press. Researchers at the California Institute for
Transpersonal Psychology (CITP) have been active in investigating
the role of mental attitudes and body chemistry. While there may
not be a direct link with AC, an exchange of techniques and
experimental designs would be helpful.
- The Journal of Cognitive Neuroscience
contains at least one article of interest in each issue. This
discipline is where most of the cognitive work with
neuromagnetism is conducted. There is the possibility of joint
investigations with researchers performing MEG investigations at
the National Institutes of Health (NIH).
- Stanford University has been conducting
research on internal mental imagery. The manipulation and
control of this imagery is extremely important in understanding
the source of internal noise during an AC session. A
collaborative effort with Stanford should lead to methods for
noise reduction.
- Neural networks are particularly good at
recognizing subtle patterns in complex data, and are being
applied in the subjective arena of decision making in business.
In order to improve AC analysis, the program will conduct a
collaborative effort with scientists who are active in neural
network research and with selected individuals who have had
success with interpreting highly subjective data.
- Statistics is the heart of AC research in
that most of the results are usually quoted in statistical terms.
Hypothesis testing has traditionally been the primary focus, but
there are other possible approaches that should be explored.
Statistics researchers at Harvard have already expressed interest
in contributing to the research effort.
- A major portion of the effort will be a
search for an AC evoked response in the brain. Sophisticated
processing is required in that magnetic signals from the brain
cannot be easily characterized by standard statistical
practices. Several research facilities can contribute.
- Classical statistical thermodynamics may be
the heart of understanding the nature of an AC source of
information. A physical property called entropy may be related
to what is sensed by AC. The program intends to collaborate with
a variety of university physics departments to calculate the
appropriate parameters.
(S/NF) The specific experiments to be conducted in
these research domains will be defined during the first six to
nine months of the program utilizing the recommendations of the
working groups mentioned above subject to approval by the
Scientific Oversight Committee.
VI. (U) BASIC RESEARCH PLAN FOR ANOMALOUS PERTURBATION:
(S/NF) Figure 5 illustrates the basic approach for
investigating "energetics", or anomalous perturbation (AP)
phenomena. Intelligence reporting indicates that this aspect of
AMP [illegible] should receive
attention in this research plan to prevent technological
surprise. Thus, beginning in FY95, acceptance criteria will be
established with which to judge the historical literature for
potential AP effects. Using those criteria, a detailed review of
the literature will begin in mid FY95 and, considering the size of
that data base, will continue through FY95. Knowledge gained from
this review may provide insights for the development of new AP
target systems or provide data so that particular experiments can
be replicated. Given the complexity of most AP experiments,
considerable time is needed to plan and conduct them properly.
If the results warrant, then application development may begin as
early as FY96; however, the primary task of basic research on AP
is to attempt to validate its existence. Findings from foreign
research will be examined and factored into this activity as
appropriate.
(S/NF) The keys to investigating this area will be in
appropriate personnel selection and, very likely, in proper
selection of the AP test device. Thus, the initial phase of this
effort will involve identification and solicitation of
individuals known or claimed to have such talents. For example,
certain expert martial arts or yoga practitioners might do well
in such experiments due to their strong mental conditioning and
ability for intense mental focus. After locating such
individuals, various instruments, such as microcomputer devices,
sensitive electronic/sensor devices, or other unique or sensitive
equipment would be used as targets in AP experiments.
Figure 5 (U) Basic Research Milestones - Anomalous Perturbation
(To Include Biological Systems)
[Timeline chart, time frame 1995 through 1999; chart graphics not
recoverable. Activities shown: Develop Evaluation Criteria;
Perform Analysis (Historical Data Base); Examine Target Systems
(Various Technical Targets, Laboratory Setting); Conduct
Validation Experiments (Advanced Sensors, Complex Components);
Pursue Applications (Far-Field Effects, countermeasures);
Personnel Selection (Solicit Known Talent, Screening/Training
(Develop)).]
(S/NF) Some of the unique sensor candidates include
devices that are highly sensitive to very weak gravitational
effects (such as Mössbauer devices or atomic clocks). Perhaps
the most promising device is one that involves detection of an
unusual non-electromagnetic wave (a vector/scalar wave). If
experiments with such sensors are successful, then significant
understanding of AP or AC phenomena would result. Experiments
with such a device are a distinct near-term possibility;
consequently, this will be given high priority in the early part
of this long-range program.
(S/NF) Should these pilot experiments prove successful,
then near and distant experiments would be developed for a wide
variety of devices to evaluate application aspects. Potential
applications could include, for example, remote switching (in a
communication role) or possibly as a countermeasure to minimize
effectiveness of threat systems such as sensitive computer
components or sensors. Similarly, if these results are
successful, they would provide insight regarding potential
threats to U.S. systems or security.
(S/NF) The specific experiments to be conducted in these
research domains will be defined during the first six to nine
months of the program utilizing the recommendations of the
working groups mentioned above subject to approval by the
Scientific Oversight Committee.
VII. (U) APPLIED RESEARCH PLAN FOR ANOMALOUS COGNITION:
(U) Figure 6 illustrates the overall plan for the applied
research portion for several main functional categories.
a. (U) SELECTION
(C) The most promising potential for selecting
individuals is to identify ancillary activity that correlates
with AC ability. If such a procedure can be identified, then
receiver selection can be incorporated as part of other screening
tests (e.g., fighter pilot candidacy), and thus large populations
can be used. Among the items that will be examined are
physiology (e.g., responses of the brain to external stimuli) and
hypnotic susceptibility (i.e., an individual's predisposition for
being hypnotized). The results of this effort will be examined
continuously; however, a decision to end the investigation will
occur in mid FY96. Should the results at that time warrant, then
refining of the techniques will continue to the end of FY 1998.
The reason the initial research spans several years is that to
validate even one psychological finding requires long-term
testing of candidate individuals. Current statistical methods
require many AC sessions, and experience has shown that only a
few sessions can be conducted per week for any single individual.
Figure 6 (U) Applied Research Milestones - Anomalous Cognition
[Timeline chart, time frame 1995 through 1999; chart graphics not
recoverable. Activities shown:
Personnel Selection Research: State Parameters (Hypnosis,
Physiology, Etc.); Psychology (Self Report, Behavioral Measures,
Etc.); Solicit Known Talent (Empirical Mass Screening).
Personnel Training Research: State Parameters (Altered States,
Subliminal Threshold Measures, Etc.); Empirical Evaluation.
Application Evaluation Research: Practical Application Tests
(Increasing Project Difficulty); Target Characteristics (Entropy,
Size, Etc.); Other Aspects (Target Function, Dynamics, Degree of
Importance, Etc.).
Protocol Development: Operational Conditions (Targets, Feedback,
Etc.); Search/Location Projects; New Applications/Procedures.
Analysis Method Development: Response Definition (Written, Drawn,
Physiological Measures, Etc.); Artificial Intelligence (Fuzzy
Sets, Etc.); Neural Network Analogies; Combination of Methods.
Data Integration/Assimilation Development: Intelligence Data
Fusion Methods; Training/Seminars; Advanced Training; Various
Customers.]
(C) The previous program was able to estimate
that approximately one percent of the general population
possessed a high-quality, natural AC ability. Because the
empirical method (i.e., asking large groups to attempt AC) is
labor intensive and very inefficient, it is included in the
research plan only as an alternate approach.
b. (U) TRAINING
(S/NF) Training has been a major part of the
previous program; however, results of training approaches have
been difficult to evaluate and have not been examined
systematically. Systematic review of this issue was begun in
FY92. One of the methods that will be examined involves lowering
an individual's visual subliminal threshold (i.e., the level
below which an individual is not consciously aware of visual
material). This could enhance the individual's sensitivity to AC
data. Other forms of altered states, such as dreaming and
hypnosis, will also be evaluated to see if such states can
enhance AC data quality.
(U) Results on these issues should be available
at the close of FY95. If no progress has been observed and if
there have been no positive results from the basic research, the
task ends. However, should any of the variables examined appear
promising, then the task will be continued.
(S/NF) It is anticipated that all laboratory
successes must be validated by simulating operational tasks.
These experiments involve identifying the specialty to be tested,
the acceptance criteria, and conducting sessions in which the
complete target systems are known. This three-year activity runs
concurrently with the other tasks but with a one-year offset to
allow for planning.
c. (U) TARGET/APPLICATION SELECTION
(C) Based on earlier research, the most promising
approach to target selection appears to be a single physical
characteristic called entropy (i.e., a measure of inherent target
information). Beginning in FY95, two and one half years have
been allocated for the detailed study of this aspect of target
properties. Initially, little experimentation is required;
rather, a retrospective examination of previous target systems
should indicate if this approach is valid. Included in this
examination are detailed calculations of the information content
of natural target scenes.
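As an illustration only, the kind of information-content calculation
described above might resemble the following sketch. The plan does
not state how target entropy is to be computed; the use of Shannon
entropy over the gray levels of a digitized scene, the function name
shannon_entropy, and the two sample scenes are all assumptions made
for this example.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Return the Shannon entropy, in bits, of the empirical distribution of values."""
    total = len(values)
    if total == 0:
        raise ValueError("need at least one value")
    counts = Counter(values)
    # H = sum over levels of p * log2(1 / p), where p is the observed frequency.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical 8-level gray-scale scenes, flattened to lists of pixel levels.
uniform_scene = [3] * 16                      # one gray level everywhere
varied_scene = [0, 1, 2, 3, 4, 5, 6, 7] * 2   # every level equally often

print("uniform scene:", shannon_entropy(uniform_scene), "bits per pixel")
print("varied scene: ", shannon_entropy(varied_scene), "bits per pixel")
```

Under this assumed measure, a featureless scene scores lowest and a
scene that uses every gray level equally often scores highest. A
retrospective study could rank earlier targets by such a number
before any new experimentation.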
(S/NF) Beginning in mid FY96, other potential
intrinsic target properties will be examined. For example, a
target may be more readily sensed by AC if the collection of
elements at the site (e.g., landmarks, buildings, roads)
constitutes a conceptually coherent unit as opposed to a collage
of unrelated items. Quantitative definitions of targets will also
be developed that include non-physical target parameters such as
function, meaning, or relationships. These aspects are highly
important in most operational projects and need to be quantified.
(S/NF) Part of this effort will involve
investigations that serve two purposes: (1) add insight into
the phenomenon; and (2) help evaluate the feasibility of certain
potential applications. For example, long distance experiments
could be conducted to or from deep caves or submarines in deep
water to test communication potential and transmission theories.
Experiments could also be conducted to targets on board space
platforms to test distance and gravitational effects.
Experiments to or from magnetically shielded rooms or certain
earth locations (e.g., the magnetic pole) might indicate if
magnetic fields influence the phenomenon. Experiments to
opposite sides of the earth might also indicate if a mass or
gravity effect can be noted.
(S/NF/SG/LIMDIS) This area of investigation will
be integrated with a variety of applications in coordination with
findings/investigations pursued by the in-house effort. Figure 9
identifies the main application or operational areas, along with
the types of data desired. This activity will be integrated, where
possible, into in-house pursuits that will explore these areas in
a systematic fashion. Initial emphasis will be in
counternarcotics and counterterrorism areas.
(S/NF/SG/LIMDIS) Specific types of applications
that will be explored in-depth include the search problem.
Search tasks are expected to remain high-priority operational
tasks (e.g., hostage location, lost equipment or system
location). Search tasks are complicated by timing issues,
especially if the missing target is being moved frequently.
Related to this will be examination of predictive capability in
order to evaluate feasibility of detecting hostile plans and
intentions in advance. Pilot studies of other areas (e.g., code
breaking, medical diagnostics, low intensity conflict support)
will also be initiated.
(S/NF/SG/LIMDIS) Another application area that
will be examined is "communications". Previous research
indicates that with proper protocols, basic or coded messages can
be sent and received via AC procedures. Redundant coding methods
can readily enhance probability of success, and new statistical
methods can also improve success rates. Communication
applications may have significant value for search problems by
providing additional information on location of kidnapped or
hostage victims. Such techniques might also help in determining
hostage or POW state-of-health or other significant issues.
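The effect of redundant coding on success probability can be
illustrated with a simple repetition scheme. The plan does not
describe the coding method actually used; the sketch below assumes a
single binary message element sent an odd number of times and decoded
by majority vote, with each attempt independently correct with
probability p. The function name majority_success and the value
p = 0.6 are hypothetical, and the independence of attempts is itself
an assumption.

```python
from math import comb


def majority_success(p, n):
    """Probability that more than half of n independent attempts are correct."""
    if n % 2 == 0:
        raise ValueError("use an odd n so that a majority always exists")
    need = n // 2 + 1
    # Sum the binomial probabilities of getting at least `need` correct attempts.
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))


# Hypothetical per-attempt success rate slightly above the 0.5 chance level.
for repeats in (1, 3, 9, 27):
    print(repeats, "repetitions:", round(majority_success(0.6, repeats), 3))
```

Under these assumptions, repetition helps only when the per-attempt
rate exceeds chance; at exactly 0.5 the majority vote stays at 0.5
regardless of the number of repetitions.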
d. (U) PROTOCOLS
(U) Given the laboratory success of AC
experimentation, the protocol task can build upon a substantial
literature. Determining optimal, specialty-dependent protocols
requires only extending current concepts. Several years are
required due to the statistical nature of the analysis that is
required to determine the effects of environment, receiver,
target and feedback conditions. Several high-interest
application areas (such as search/location) will be examined in
detail. A variety of session procedures will be evaluated to
determine those that are beneficial to improving data quality.
(S/NF) Protocol effectiveness may be measured by
quality, quantity, and/or usefulness of the AC information
elicited by its use. The requirements for protocols that are
designed for laboratory settings are considerably more
restrictive than those required for operational settings. For
example, providing limited information to a receiver while an
operational session is in progress (i.e., intermediate feedback)
might facilitate the acquisition of the desired data. This kind
of feedback is strictly prohibited, however, in most protocols
designed for laboratory experiments. Protocols may also vary
depending on the nature of the data required. For example, for some
search projects, only general data may be adequate. Such
cases would not require development of highly specific details,
and protocols for the sessions would not be as complex.
(U) A detailed protocol will need to consider a
variety of potential session variables such as the individual's
physical environment, mental state and attitude, and how the
target or task is designated (e.g., coordinates, abstract terms).
Other data include specifics of the session (monitor present or
not), type of feedback, type of response data (e.g., predictive),
and mode and method of response (e.g., drawings, verbal).
(S/NF) Currently, the only known way to
resolve the above issues is to conduct a large number of trials
for a given individual with as many of the potential variables as
possible held constant. Standard statistical methods can then be
used to identify trends, patterns, and operational constraints.
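One way such a comparison might be carried out is sketched below. The
plan names no specific statistical test; an exact permutation test on
session scores from two protocol conditions is assumed here purely
for illustration. The function name exact_permutation_p_value and the
sample scores are hypothetical.

```python
from itertools import combinations


def exact_permutation_p_value(group_a, group_b):
    """One-sided exact p-value for mean(group_a) exceeding mean(group_b)."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    relabelings = 0
    at_least_as_large = 0
    # Enumerate every way of relabeling n_a of the pooled scores as "condition A".
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = sum_a / n_a - (total - sum_a) / n_b
        relabelings += 1
        if diff >= observed - 1e-12:  # small tolerance for floating-point ties
            at_least_as_large += 1
    return at_least_as_large / relabelings


# Hypothetical session scores for one individual under two feedback conditions.
with_feedback = [5, 6, 7]
without_feedback = [1, 2, 3]
print(exact_permutation_p_value(with_feedback, without_feedback))
```

Exhaustive enumeration is practical only for small numbers of
sessions, which fits the constraint that only a few sessions per week
can be conducted for any individual.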
e. (U) DATA ANALYSIS
(U) This area requires extensive review of
leading analysis tools, such as those required for describing
imprecise concepts or data (i.e., artificial intelligence
techniques, fuzzy sets). This work will be combined with
findings from neural network analysis and research, or possibly
combinations of other emerging advanced analysis methods.
(S/NF) Various approaches are anticipated to
directly benefit operational evaluations. One promising
technique involves procedures based on an adaptive (frequent data
base update) approach. This will permit an individual's
progression, and possibly time dependent data variables in an
individual's track record, to be identified.
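A minimal sketch of one possible adaptive update follows. The plan
does not specify the procedure; exponential smoothing of per-session
quality scores is assumed here only as an example of a frequently
updated track record. The function names, the 0-to-1 score scale, and
the weight value are hypothetical.

```python
def update_track_record(previous_estimate, session_score, weight=0.2):
    """Blend a new session score into a running estimate (exponential smoothing)."""
    if not 0.0 < weight <= 1.0:
        raise ValueError("weight must be in (0, 1]")
    return (1.0 - weight) * previous_estimate + weight * session_score


def running_estimates(scores, initial=0.0, weight=0.2):
    """Return the estimate after each session, in session order."""
    estimates = []
    estimate = initial
    for score in scores:
        estimate = update_track_record(estimate, score, weight)
        estimates.append(estimate)
    return estimates


# Hypothetical per-session quality scores on a 0-to-1 scale.
scores = [0.2, 0.2, 0.8, 0.8, 0.8]
for session, estimate in enumerate(running_estimates(scores, initial=0.2, weight=0.5), 1):
    print(session, round(estimate, 3))
```

A larger weight makes the estimate follow recent sessions more
closely, so a change in an individual's performance over time shows
up sooner, at the cost of more sensitivity to a single unusual
session.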
(S/NF) In addition to the search for new analysis
methods, the current methods will also be reexamined. Laboratory
requirements differ from those for operational activities in that
the target can be controlled and well defined. For operational
activities, uncertainties in tasking may arise, especially if
operational requirements are changing or if some of the initial
"known" data are incorrect. Such uncertainties complicate later
analyses.
(S/NF) Analysis methods will also be developed
that can make predictions on data quality for any given task.
This will require development of an extensive track record for
each individual based on both controlled and operational
projects.
(S/NF) These analysis methods will also address
certain practical issues. For example, a detailed, high-quality
example of AC data may have little value to an intelligence
analyst if that information was known from other sources.
Likewise, a poor example of AC data might provide a single
element as a tip-off for other assets, or provide the missing
piece in a complex analysis, and thus be quite valuable. The
intelligence utility of AC data may in some cases be only weakly
connected to the AC quality. Therefore a data fusion analysis
procedure is needed for AC-derived operational data. Methods
that permit appropriate data analysis from an accuracy and
utility viewpoint will be developed.
f. (U) INTEGRATION
(U) This activity would be an on-going review/
integration effort in order to identify patterns or clues useful
for understanding practical aspects of this phenomenological
area.
(S/NF) Identifying approaches and procedures that
permit assimilation of AC data from operational support projects
into all-source intelligence analysis procedures will also be
part of this support activity. Depending on results of applied
research findings and operational pursuits, a basic seminar/
training program for other applications-oriented elements might
be established. Such a training/seminar program would focus on
basic techniques and would augment possible operational training
activity that might become part of the in-house effort. This
would require several years to develop and establish.
(S/NF) The specific experiments to be conducted
in these research domains will be defined during the first six to
nine months of the program utilizing the recommendations of the
working groups mentioned above subject to approval by the
Scientific Oversight Committee.
IX. (U) POTENTIAL RESEARCH RETURN:
(S/NF/SG/LIMDIS) The research pursuits identified in the
overall research and peer review plan have the potential for
achieving highly significant results using AMP to address
problems of national security by pushing the phenomena to their
natural limits. This overall result can be achieved by
accomplishing the aforementioned program plan goals.
X. (U) PROGRAM OVERSIGHT
A. (U) PROJECT OVERSIGHT METHODOLOGY:
1. (U) PROGRAM MANAGEMENT/OVERSIGHT
(S/NF) DIA, as executive agent, proposes to
implement a management structure that fosters a proactive,
responsive, and creative environment for this activity. Both
the external research and in-house activities will be centered in
the Technology Assessment and Support Activity under the
supervision of the Chief, Office for Ground Forces (DIA/PAG).
2. (U) SCIENTIFIC OVERSIGHT
(S/NF) Scientific oversight will be provided by the
3. (U) CONTRACTOR OVERSIGHT
a. (U) A contractor-sponsored Scientific
Oversight Committee (SOC), consisting of scientists from the
following disciplines: physics, astronomy, statistics,
neuroscience, and psychology, will be tasked with the following:
-- (U) Reviewing and approving all
experimental protocols prior to the collection of experimental
data.
-- (U) Reviewing all experimental final
reports as if they were submissions to technical scientific
journals.
-- (U) Proposing directions for further
-- (U) Exercising unannounced drop-in
privileges to view experiments in progress.
b. (U) A contractor-sponsored Human Use
Review Board will also be formed and charged with the
responsibility of assuring compliance with all U.S. and DoD
regulations with regard to the use of humans in experimentation
and assuring their safety. Members should represent the health,
legal, and spiritual professions IAW government guidelines.
XI. (U) DEVELOPMENT OF EVALUATION CRITERIA:
A. (U) SCIENTIFIC VALIDITY
(S/NF) A thorough review of DoD's activities in AMP
was conducted in 1987 to evaluate the use of AMP for intelligence
gathering purposes. The overall findings of this evaluation were
that "...the Project Review Group has determined to its
satisfaction that the work of the Enhanced Human Performance
Group is scientifically sound...and is providing valuable insight
into the nature of an anomaly which have a significant impact on
the DoD." This research and development program will both draw
from and add to this extensive data base to further demonstrate
the scientific validity and practicality of AMP.
B. (U) PERFORMANCE
(S/NF) The ability of the STAR GATE program to produce
results that have an intelligence value can only be measured by
customer feedback evaluations. STAR GATE has developed feedback
mechanisms and procedures for customers that should result in a
method of quantifying this subjective feedback data so that
operational value added and cost-effectiveness can be measured.
XII. (U) BUDGET AND RESOURCE REQUIREMENTS (FYs 95-99):
(S/NF/SG/LIMDIS) Due to the diversity of the STAR GATE
mission/objectives, both external resources and in-house
expertise are required. Since this Activity possesses no in-house
R&D capability, external R&D support is absolutely required
to meet Congressional concerns, which are addressed in
this program plan. A balance will be maintained between external
and in-house activities, and every effort will be made to
integrate and link these activities where appropriate. The
external aspect permits a wide range of expertise covering many
disciplines to be focused on this area; this also has the benefit
of ensuring peer group review and of facilitating a variety of
scientific interactions. In-house personnel with a wide range of
expertise in this phenomenology will need to be retained to make
this proposed plan work.
(S/NF/SG/LIMDIS) In order to fulfill Congressional
Direction, the DIA proposes to convene a Scientific Evaluation
Panel (SEP) composed of representatives from each of the Service
Scientific Advisory Boards. The purpose of the SEP is to review
and validate the methodology outlined in the plan in order to
address the cost-effectiveness and performance criteria for the
STAR GATE program's research and development objectives and to
propose recommendations as to which objectives should be pursued
and the program scope required to achieve those objectives. If
the SEP determines that objectives in the plan are viable and
executable, the General Defense Intelligence Program (GDIP)
Manager will compete this initiative with others for limited
available resources remaining in the program.
(U) The proposed ongoing R&D effort will be reviewed every
two years by the SEP to determine whether the STAR GATE program
can show results that are cost-effective and satisfy reasonable
performance criteria.
(C) An annual report will document the current operational,
technical and administrative status of the program.
APPENDIX A
CONGRESSIONALLY-DIRECTED ACTION
DEFENSE AUTHORIZATION CONFERENCE
(S/NF) REQUEST: "The conferees are concerned that insufficient
funds have been spent on research and development to establish
the scientific basis for the STAR GATE program. The conferees
direct the Director of DIA to prepare a program plan and to
submit an appropriate budget request for a research effort, over
several years, to determine whether the STAR GATE program can
show results that are cost-effective and satisfy reasonable
performance criteria. This plan, and any research under this
program, should be subject to peer review by neutral scientific
experts. The Director of DIA is directed to prepare this
research and peer review plan within existing program funds."
APPENDIX B
TERMINOLOGY AND DEFINITIONS
(U) PHENOMENA TERMINOLOGY:
(U) This phenomenological area has had a variety of
descriptive terms over the years, such as paranormal,
parapsychological, or psychical research. Foreign researchers
use other terms: "psychoenergetics" in the USSR; "extraordinary
human function" in the People's Republic of China (PRC). In
general, this field is concerned with a largely unexplored area
of human consciousness/subconsciousness interactions associated
with unusual or underdeveloped human capabilities.
(U) Recently, researchers have shown a preference for terms
that are neutral and that emphasize the anomalous or enigmatic
nature of these phenomena. The term anomalous mental phenomena
(AMP) is generally preferred.
(U) This area has two aspects: information access and
energetics influence. Information access refers to a mental
ability to describe remote areas or to access concealed data that
are otherwise shielded from all known sensory channels. A recent
term for this ability is anomalous cognition (AC). This term
places emphasis on potential understanding that might be
available from advances in sensory/brain functioning research or
other related research. Older terms for this aspect have
included extra-sensory perception (ESP), remote viewing (RV), and
in some cases, precognition.
(U) The energetics aspect refers to the ability to
influence, via mental volition, physical or biological systems by
an as yet unknown physical mechanism. An example of physical
system influence would include affecting the output of sensors or
electronic devices; biological systems influence would include
affecting physiological parameters of an individual. A recent
descriptive term for this ability is anomalous perturbation (AP).
Older terms for this phenomenon included psychokinesis (PK) or
telekinesis.
(U) GENERAL DEFINITIONS:
(S/NF) For this program, basic research is defined to mean
any investigation or experiment for determining fundamental
processes or for uncovering underlying parameters that are
involved in this phenomenon. Basic research is primarily
oriented toward understanding the physical, physiological, and
psychological mechanisms of anomalous mental phenomena (AMP).
(S/NF) Applied research refers to any investigation
directed toward developing particular applications or for
improving data quality and reliability. For anomalous cognition
(AC) phenomena, research is primarily directed toward improving
the output quality of AC data. This would include ways to
develop/improve utility of AC data for a variety of potential
applications. For example, examination of spatial and temporal
relationships of AC data could assist in developing a reliable
search capability useful for locating missing people or
equipment.
APPENDIX C
POTENTIAL RESEARCH SUPPORT FACILITIES
Science Applications International Corp.            Los Altos, CA
Mind Science Foundation                             San Antonio, TX
Princeton Engineering Anomalies Laboratory          Princeton Univ, NJ
American Society for Psychical Research             New York, NY
St. John's University                               Long Island, NY
Foundation for Research into the Nature of Man      Durham, NC
ARE/Atlantic University                             Virginia Beach, VA
University of Virginia                              Charlottesville, VA
Psychophysical Research Laboratories                Edinburgh, Scotland
Edinburgh University                                Edinburgh, Scotland
OTHER RELATED DISCIPLINES
Psychology
  Stanford University                               Stanford, CA
  Cornell University                                Ithaca, NY
Anthropology
  University of California                          Berkeley, CA
  University of Arizona                             Tucson, AZ
Psychophysiology
  SRI International                                 Menlo Park, CA
  Langley Porter Neuropsychiatric Institute         San Francisco, CA
  Menninger Foundation                              Topeka, KS
Psychoimmunology
  California Institute for Transpersonal Psychology Menlo Park, CA
Cognitive Neuroscience
  Los Alamos National Laboratory                    Los Alamos, NM
  Sandia National Laboratory                        Albuquerque, NM
  University of California                          San Diego, CA
Cognitive Psychology
  Psychology Department, Princeton Univ             Princeton, NJ
  Psychology Department, City College of New York   New York, NY
Artificial Intelligence
  Massachusetts Institute of Technology             Cambridge, MA
  Stanford University                               Stanford, CA
Neural Networks
  Massachusetts Institute of Technology             Cambridge, MA
  Science Applications International Corp           Los Altos, CA
Statistics/Signal Analysis
  University of California                          Davis, CA
  Harvard University                                Cambridge, MA
Thermodynamics
  Rochester University                              Rochester, NY
  Physics Department, Stanford University           Stanford, CA
Quantum Measurement
  International Business Machines,
  Research Laboratories                             College Park, MD
General Relativity
  California Institute of Technology                Pasadena, CA
  University of Texas at Austin                     Austin, TX
Electromagnetic/Basic Research
  Electronetics Corp                                Buffalo, NY
  Battelle Corp                                     Columbus, OH
  Institute for Advanced Study                      Austin, TX
APPENDIX D
RESOURCE LITERATURE
1. A.R.E. Journal
2. Abnormal Hypnotic Phenomena
3. American Anthropologist
4. American Ethnologist
5. American Journal of Clinical Hypnosis
6. American Journal of Physiology
7. American Journal of Sociology
8. American Psychologist
9. American Society for Psychical Research
10. Annals of Eugenics
11. Annals of Mathematical Statistics
12. Annales de Sciences Psychiques
13. Archivio di Psicologia, Neurologia e Psichiatria
14. Association for the Anthropological Study of Consciousness Newsletter
15. Behavioral and Brain Science
16. Behavioral Science
17. Bell System Technical Journal
18. Biological Psychiatry
19. Biological Review
20. British Journal for the Philosophy of Science
21. British Journal of Psychology
22. Bulletin of the American Physical Research
23. Bulletin of the Boston Society for Psychic Research
24. Bulletin of the Los Angeles Neurological Societies
25. Contributions to Asian Studies
26. Electroencephalography and Clinical Neurophysiology
27. Endeavour
28. Ethnology
29. Exceptional Human Experience
30. Experientia
31. Experimental Medicine and Surgery
32. Fate
33. Fields within Fields
34. Foundations of Physics
35. Hibbert Journal
36. Human Biology
37. International Journal of Clinical and Experimental Hypnosis
38. International Journal of Comparative Sociology
39. International Journal of Neuropsychiatry
40. International Journal of Parapsychology
41. International Journal of Psychoanalysis
42. Journal of Abnormal and Social Psychology
43. Journal of Altered States of Consciousness
44. Journal of Applied Physics
45. Journal of Applied Psychology
46. Journal of Asian and African Studies
47. Journal of Biophysical and Biochemical Cytology
48. Journal of Cell Biology
49. Journal of Communication
50. Journal of Comparative and Physiological Psychology
51. Journal of Consulting Psychology
52. Journal of Existential Psychiatry
53. Journal of Experimental Biology
54. Journal of Experimental Psychology
55. Journal of General Psychology
56. Journal of Genetic Psychology
57. Journal of Mind and Behavior
58. Journal of Nervous and Mental Diseases
59. Journal of Personality
60. Journal of Personality and Social Psychology
61. Journal of Research in PSI Phenomena
62. Journal of Scientific Exploration
63. Journal of the American Academy of Psychoanalysis
64. Journal of the London Mathematical Society
65. Journal of the Royal Anthropological Institute of Great
Britain and Ireland
66. Metapsichica
67. Mind-Brain Bulletin
68. Motivation and Emotion
69. Nature
70. Naturwissenschaftliche Rundschau
71. New Horizons
72. New Scientist
73. New Sense Bulletin
74. Newsletter of the Parapsychology Foundation
75. Parapsychology Bulletin
76. Parapsychology Abstracts International
77. Parapsychology Review
78. Perceptual and Motor Skills
79. Philosophy of Science
80. Physiology and Behavior
81. Proceedings of the Society for Psychical Research
82. Psychedelic Review
83. Psychic
84. Psychic Science
85. Psychoanalytic Quarterly
86. Psychoanalytic Review
87. Psychological Bulletin
88. Psychometrika
89. Psychophysiology
90. Physics Today
91. Renti Teyigongneng (EFHB Research) [PRC]
92. Revue Metapsychique
93. Revue Philosophique
94. Revue Philosophique de la France et de L'Etranger
95. Revue Philosophique Applique
96. Science
97. Skeptical Inquirer
98. Social Studies of Science
99. Subtle Energies
100. The Humanistic Psychology Institute
101. The Journal of Parapsychology
102. The Journal of the American Society for Psychical Research
103. Theta
104. Tijdschrift voor Parapsychologie
105. Tomorrow
106. Voprosy Filosofii (Questions of Philosophy) [RUSSIA]
107. Western Canadian Journal of Anthropology
108. Zeitschrift für die gesamte Neurologie und Psychiatrie
109. Zeitschrift für Parapsychologie und Grenzgebiete der
Psychologie
110. Zeitschrift für Tierpsychologie
111. Zeitschrift für vergleichende Physiologie
112. Zetetic Scholar
113. Zhongguo Shehui Kexue (China Social Sciences) [PRC]
114. Ziran Zazhi (Nature) [PRC]
APPENDIX E
CURRENT CONTRACTOR SCIENTIFIC OVERSIGHT COMMITTEE MEMBERSHIP
Steven A. Hillyard
- Professor of Neurosciences, Department of Neurosciences,
University of California, San Diego.
- Author or coauthor of 118 technical neuroscience
publications.
- Eighty-two invited presentations at technical conferences.
- Ph.D., Yale University, 1968 (Psychology).
S. James Press
- Professor of Statistics, Department of Statistics, University
of California, Riverside.
- Author or coauthor of 132 statistics publications.
- Author of 12 books and/or monographs.
- Ph.D., Stanford University, 1964 (Statistics).
Garrison Rapmund
- Responsible for facilitating transfer of Strategic
Defense Initiative technologies to health care industries.
- Major General, USA retired in 1986 as Assistant Surgeon
General (R&D) and Commander, Army Medical R & D Command.
- M.D., Columbia University, 1953 (Pediatrics).
Melvin Schwartz
- Associate Director for High Energy and Nuclear Physics,
Brookhaven National Laboratory.
- Author or coauthor of 40 technical publications in high energy
physics, author of "Principles of Electrodynamics."
- Nobel Prize, Physics (1988).
- Ph.D., Columbia University, 1958 (Physics).
Yervant Terzian
- Professor of Physical Sciences, Chairman of the Department of
Astronomy, Cornell University.
- Author/coauthor of numerous technical publications and books.
- Ph.D., Indiana University, 1965 (Astronomy).
Philip G. Zimbardo
- Professor of Psychology, Department of Psychology, Stanford
University.
- Author/coauthor of numerous experimental psychology
publications.
- Ph.D., Yale University, 1959 (Psychology).
APPENDIX F
CURRENT CONTRACTOR INSTITUTIONAL REVIEW BOARD MEMBERSHIP
Byron Wm. Brown, Jr., Ph.D.
- Biostatistics, Stanford University
Gary R. Fujimoto, M. D.
- Occupational Medicine, Palo Alto Medical Foundation
John Hanley, M. D.
- Neuropsychiatry, University of California, Los Angeles
Robert B. Livingston, M. D.
- Neuroscience, University of California, San Diego
Robin P. Michelson, M. D.
- Otolaryngology, University of California, San Francisco
Ronald Y. Nakasone, Ph.D.
- Buddhist Studies, Institute of Buddhist Studies, Berkeley, CA
Garrison Rapmund, M. D. (Chair)
- Air Force Science Advisory Board
Louis J. West, M. D.
- Neuropsychiatry, University of California, Los Angeles
APPENDIX G
ACADEMIC STUDIES REGARDING THE SCIENTIFIC VALIDITY OF AMP
Psychological Bulletin (January, 1994)
Version 4.7
October 1, 1993
Does Psi Exist?
Replicable Evidence for an
Anomalous Process of Information Transfer
Daryl J. Bem and Charles Honorton
Most academic psychologists do not yet accept the existence of psi, anomalous processes of
information or energy transfer (such as telepathy or other forms of extrasensory perception) that
are currently unexplained in terms of known physical or biological mechanisms. We believe
that the replication rates and effect sizes achieved by one particular experimental method, the
ganzfeld procedure, are now sufficient to warrant bringing this body of data to the attention of
the wider psychological community. Competing meta-analyses of the ganzfeld database are
reviewed, one by R. Hyman (1985), a skeptical critic of psi research, and the other by C. Honorton
(1985), a parapsychologist and major contributor to the ganzfeld database. Next, the results of
11 new ganzfeld studies that comply with guidelines jointly authored by R. Hyman and C.
Honorton (1986) are summarized. Finally, issues of replication and theoretical explanation are
discussed.
The term psi denotes anomalous processes of informa-
tion or energy transfer, processes such as telepathy or
other forms of extrasensory perception that are currently
unexplained in terms of known physical or biological
mechanisms. The term is purely descriptive: It neither
implies that such anomalous phenomena are paranormal
nor connotes anything about their underlying mecha-
nisms.
Does psi exist? Most academic psychologists don't think
so. A survey of more than 1,100 college professors in the
United States found that 55% of natural scientists, 66% of
social scientists (excluding psychologists), and 77% of aca-
demics in the arts, humanities, and education believed
that ESP is either an established fact or a likely possibil-
ity. The comparable figure for psychologists was only 34%.
Moreover, an equal number of psychologists declared ESP
to be an impossibility, a view expressed by only 2% of all
other respondents (Wagner & Monnet, 1979).
Daryl J. Bem, Department of Psychology, Cornell University.
Charles Honorton, Department of Psychology, University of
Edinburgh, Edinburgh, Scotland.
Sadly, Charles Honorton died of a heart attack on November 4,
1992, days before this article was accepted for publication. He
was 46. Parapsychology has lost one of its most valued contributors.
I have lost a valued friend.
This collaboration had its origins in a 1983 visit I made to
Honorton's Psychophysical Research Laboratories (PRL) in
Princeton, New Jersey, as one of several outside consultants
brought in to examine the design and implementation of the ex-
perimental protocols.
Preparation of this article was supported, in part, by grants to
Charles Honorton from the American Society for Psychical Re-
search and the Parapsychology Foundation, both of New York
City. The work at PRL summarized in the second half of this ar-
ticle was supported by the James S. McDonnell Foundation of St.
Louis, Missouri, and by the John E. Fetzer Foundation of Kala-
mazoo, Michigan.
Helpful comments on drafts of this article were received from
Deborah Delanoy, Edwin May, Donald McCarthy, Robert Morris,
John Palmer, Robert Rosenthal, Lee Ross, Jessica Utts, Philip
Zimbardo, and two anonymous reviewers.
Correspondence concerning this article should be addressed to
Daryl J. Bem, Department of Psychology, Uris Hall, Cornell
University, Ithaca, New York 14853. (Electronic mail may be
sent to d.bem@cornell.edu.)
Psychologists are probably more skeptical about psi for
several reasons. First, we believe that extraordinary
claims require extraordinary proof. And although our col-
leagues from other disciplines would probably agree with
this dictum, we are more likely to be familiar with the
methodological and statistical requirements for sustaining
such claims, as well as with previous claims that failed ei-
ther to meet those requirements or to survive the test of
successful replication. Even for ordinary claims, our con-
ventional statistical criteria are conservative. The sacred
p = .05 threshold is a constant reminder that it is far more
sinful to assert that an effect exists when it does not (the
Type I error) than to assert that an effect does not exist
when it does (the Type II error).
Second, most of us distinguish sharply between phenomena
whose explanations are merely obscure or controversial
(e.g., hypnosis) and phenomena such as psi that
would appear to fall outside our current explanatory
framework altogether. (Some would characterize this as
the difference between the unexplained and the inexplicable.)
In contrast, many laypersons treat all exotic psychological
phenomena as epistemologically equivalent; many
even consider déjà vu to be a psychic phenomenon. The
blurring of this critical distinction is aided and abetted by
the mass media, "new age" books and mind-power courses,
and "psychic" entertainers who present both genuine hypnosis
and fake "mind reading" in the course of a single
performance. Accordingly, most laypersons would not
have to revise their conceptual model of reality as radically
as we would to assimilate the existence of psi. For
us, psi is simply more extraordinary.
Finally, research in cognitive and social psychology has
sensitized us to the errors and biases that plague intuitive
attempts to draw valid inferences from the data of everyday
experience (Gilovich, 1991; Nisbett & Ross, 1980;
Tversky & Kahneman, 1971). This leads us to give virtually
no probative weight to anecdotal or journalistic reports
of psi, the main source cited by our academic colleagues
as evidence for their beliefs about psi (Wagner &
Monnet, 1979).
Ironically, however, psychologists are probably not more
familiar than others with recent experimental research on
psi. Like most psychological research, parapsychological
research is reported primarily in specialized journals; un-
like most psychological research, however, contemporary
parapsychological research is not usually reviewed or
summarized in psychology's textbooks, handbooks, or
mainstream journals. For example, only 1 of 64 introduc-
tory psychology textbooks recently surveyed even men-
tions the experimental procedure reviewed in this article,
a procedure that has been in widespread use since the
early 1970s (Roig, Icochea, & Cuzzucoli, 1991). Other secondary
sources for nonspecialists are frequently inaccurate
in their descriptions of parapsychological research.
(For discussions of this problem, see Child, 1985; and
Palmer, Honorton, & Utts, 1989.)
This situation may be changing. Discussions of modern
psi research have recently appeared in a widely used in-
troductory textbook (Atkinson, Atkinson, Smith, & Bem,
1990, 1993), two mainstream psychology journals (Child,
1985; Rao & Palmer, 1987), and a scholarly but accessible
book for nonspecialists (Broughton, 1991). The purpose of
the present article is to supplement these broader treatments
with a more detailed, meta-analytic presentation of
evidence issuing from a single experimental method: the
ganzfeld procedure. We believe that the replication rates
and effect sizes achieved with this procedure are now suf-
ficient to warrant bringing this body of data to the atten-
tion of the wider psychological community.
The Ganzfeld Procedure
By the 1960s, a number of parapsychologists had be-
come dissatisfied with the familiar ESP testing methods
pioneered by J. B. Rhine at Duke University in the 1930s.
In particular, they believed that the repetitive forced-
choice procedure in which a subject repeatedly attempts to
select the correct "target" symbol from a set of fixed alternatives
failed to capture the circumstances that characterize
reported instances of psi in everyday life.
Historically, psi has often been associated with medita-
tion, hypnosis, dreaming, and other naturally occurring or
deliberately induced altered states of consciousness. For
example, the view that psi phenomena can occur during
meditation is expressed in most classical texts on medita-
tive techniques; the belief that hypnosis is a psi-conducive
state dates all the way back to the days of early mesmerism
(Dingwall, 1968); and cross-cultural surveys indicate
that most reported "real-life" psi experiences are mediated
through dreams (Green, 1960; Prasad & Stevenson,
1968; L. E. Rhine, 1962; Sannwald, 1959).
There are now reports of experimental evidence consistent
with these anecdotal observations. For example, several
laboratory investigators have reported that meditation
facilitates psi performance (Honorton, 1977). A meta-analysis
of 25 experiments on hypnosis and psi conducted
between 1945 and 1981 in 10 different laboratories suggests
that hypnotic induction may also facilitate psi performance
(Schechter, 1984). And dream-mediated psi was
reported in a series of experiments conducted at Maimonides
Medical Center in New York and published between
1966 and 1972 (Child, 1985; Ullman, Krippner, &
Vaughan, 1973).
In the Maimonides dream studies, two subjects, a
"receiver" and a "sender," spent the night in a sleep laboratory.
The receiver's brain waves and eye movements
were monitored as he or she slept in an isolated room.
When the receiver entered a period of REM sleep, the experimenter
pressed a buzzer that signaled the sender,
under the supervision of a second experimenter, to begin
a sending period. The sender would then concentrate on a
randomly chosen picture (the "target") with the goal of influencing
the content of the receiver's dream.
Toward the end of the REM period, the receiver was
awakened and asked to describe any dream just experi-
enced. This procedure was repeated throughout the night
with the same target. A transcription of the receiver's
dream reports was given to outside judges who blindly
rated the similarity of the night's dreams to several pictures,
including the target. In some studies, similarity ratings
were also obtained from the receivers themselves.
Across several variations of the procedure, dreams were
judged to be significantly more similar to the target pic-
tures than to the control pictures in the judging sets
(failures to replicate the Maimonides results were also re-
viewed by Child, 1985).
These several lines of evidence suggested a working
model of psi in which psi-mediated information is conceptualized
as a weak signal that is normally masked by internal
somatic and external sensory "noise." By reducing
ordinary sensory input, these diverse psi-conducive states
are presumed to raise the signal-to-noise ratio, thereby
enhancing a person's ability to detect the psi-mediated in-
formation (Honorton, 1969, 1977). To test the hypothesis
that a reduction of sensory input itself facilitates psi per-
formance, investigators turned to the ganzfeld procedure
(Braud, Wood, & Braud, 1975; Honorton & Harper, 1974;
Parker, 1975), a procedure originally introduced into ex-
perimental psychology during the 1930s to test proposi-
tions derived from Gestalt theory (Avant, 1965; Metzger,
1930).
Like the dream studies, the psi ganzfeld procedure has
most often been used to test for telepathic communication
between a sender and a receiver. The receiver is placed in
a reclining chair in an acoustically isolated room.
Translucent ping-pong ball halves are taped over the eyes
and headphones are placed over the ears; a red floodlight
directed toward the eyes produces an undifferentiated visual
field and white noise played through the headphones
produces an analogous auditory field. It is this homogeneous
perceptual environment that is called the Ganzfeld
("total field"). To reduce internal somatic "noise," the receiver
typically also undergoes a series of progressive relaxation
exercises at the beginning of the ganzfeld period.
The sender is sequestered in a separate acoustically isolated
room, and a visual stimulus (art print, photograph,
or brief videotaped sequence) is randomly selected from a
large pool of such stimuli to serve as the target for the
session. While the sender concentrates on the target, the
receiver provides a continuous verbal report of his or her
ongoing imagery and mentation, usually for about 30
minutes. At the completion of the ganzfeld period, the re-
ceiver is presented with several stimuli (usually four) and,
without knowing which stimulus was the target, is asked
to rate the degree to which each matches the imagery and
mentation experienced during the ganzfeld period. If the
receiver assigns the highest rating to the target stimulus,
it is scored as a "hit." Thus, if the experiment uses judging
sets containing four stimuli (the target and three decoys
or control stimuli), the hit rate expected by chance is .25.
The ratings can also be analyzed in other ways; for exam-
ple, they can be converted to ranks or standardized scores
within each set and analyzed parametrically across ses-
sions. And, as with the dream studies, the similarity rat-
ings can also be made by outside judges using transcripts
of the receiver's mentation report.
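The scoring rules just described can be sketched in a few lines of Python. This is only an illustration: the function name score_session, the 0-100 rating scale, the four made-up sessions, and the choice to count a tie for first place as a miss are all assumptions, not details taken from the studies.

```python
from statistics import mean, pstdev

def score_session(ratings, target_index):
    """Score one session from the receiver's similarity ratings (hypothetical helper)."""
    target = ratings[target_index]
    # Direct hit: the target holds the single highest rating.
    # Treating a tie for first place as a miss is an assumption of this sketch.
    hit = all(target > r for i, r in enumerate(ratings) if i != target_index)
    # Rank of the target within its judging set (1 = rated most similar).
    rank = 1 + sum(r > target for r in ratings)
    # Standardized score of the target rating within its own judging set.
    spread = pstdev(ratings)
    z = (target - mean(ratings)) / spread if spread > 0 else 0.0
    return {"hit": hit, "rank": rank, "z": z}

# Invented ratings for four sessions: (ratings of the four stimuli, index of the target).
sessions = [
    ([80, 35, 10, 20], 0),
    ([15, 60, 40, 5], 2),
    ([30, 30, 90, 10], 3),
    ([25, 70, 20, 65], 1),
]
results = [score_session(r, t) for r, t in sessions]
hit_rate = sum(s["hit"] for s in results) / len(results)
chance_rate = 1 / len(sessions[0][0])  # one target among four stimuli
print(hit_rate, chance_rate)
```

The hit criterion uses only whether the target came first, while the rank and standardized score retain more of the rating information for parametric analysis across sessions.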
ANOMALOUS INFORMATION TRANSFER
Meta-Analyses of the Ganzfeld Database
In 1985 and 1986, the Journal of Parapsychology de-
voted two entire issues to a critical examination of the
ganzfeld database. The 1985 issue comprised two contri-
butions: (a) a meta-analysis and critique by Ray Hyman
(1985), a cognitive psychologist and skeptical critic of
parapsychological research, and (b) a competing meta-analysis
and rejoinder by Charles Honorton (1985), a
parapsychologist and major contributor to the ganzfeld
database. The 1986 issue contained four commentaries on
the Hyman-Honorton exchange, a joint communique by
Hyman and Honorton, and six additional commentaries
on the joint communique itself. We summarize the major
issues and conclusions here.
Replication Rates
Rates by study. Hyman's meta-analysis covered 42 psi
ganzfeld studies reported in 34 separate reports written
or published from 1974 through 1981. One of the first
problems he discovered in the database was multiple
analysis. As noted earlier, it is possible to calculate sev-
eral indexes of psi performance in a ganzfeld experiment
and, furthermore, to subject those indexes to several kinds
of statistical treatment. Many investigators reported mul-
tiple indexes or applied multiple statistical tests without
adjusting the criterion significance level for the number of
tests conducted. Worse, some may have "shopped" among
the alternatives until finding one that yielded a signifi-
cantly successful outcome. Honorton agreed that this was
a problem.
Accordingly, Honorton applied a uniform test on a
common index across all studies from which the pertinent
datum could be extracted, regardless of how the investiga-
tors had analyzed the data in the original reports. He se-
lected the proportion of hits as the common index because
it could be calculated for the largest subset of studies: 28
of the 42 studies. The hit rate is also a conservative index
because it discards most of the rating information; a second-place
ranking, a "near miss," receives no more
credit than a last-place ranking. Honorton then calculated
the exact binomial probability and its associated z score
for each study.
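As a rough sketch of that per-study calculation, the exact binomial tail can be summed directly and converted to the normal deviate with the same one-tailed area. The helper names and the example study (15 hits in 30 trials) are hypothetical, and the sketch assumes the four-choice chance rate of .25 and a p value strictly between 0 and 1.

```python
from math import comb
from statistics import NormalDist

def binomial_tail(hits, trials, chance):
    """Exact probability of `hits` or more successes in `trials` at rate `chance`."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(hits, trials + 1)
    )

def study_z(hits, trials, chance=0.25):
    """One-tailed exact p for a study and the z score with the same tail area."""
    p = binomial_tail(hits, trials, chance)
    # inv_cdf needs 0 < 1 - p < 1, so a study with zero hits is not handled here.
    z = NormalDist().inv_cdf(1 - p)
    return p, z

# Hypothetical study: 15 direct hits in 30 sessions.
p, z = study_z(hits=15, trials=30)
print(p, z)
```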
Of the 28 studies, 23 (82%) had positive z scores (p =
4.6 x 10^-4, exact binomial test with p = q = .5). Twelve of
the studies (43%) had z scores that were independently
significant at the 5% level (p = 3.5 x 10^-9, binomial test
with 28 studies, p = .05, and q = .95), and 7 of the studies
(25%) were independently significant at the 1% level (p =
9.8 x 10^-9). The composite Stouffer z score across the 28
studies was 6.60 (p = 2.1 x 10^-11).[1] A more conservative
estimate of significance can be obtained by including 10
additional studies that also used the relevant judging procedure
but did not report hit rates. If these studies are assigned
a mean z score of zero, the Stouffer z across all 38
studies becomes 5.67 (p = 7.3 x 10^-9).
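The figures in this paragraph can be approximately re-derived with a short Python sketch. The individual study z scores are not listed here, so the sketch works backward from the reported composite of 6.60; the equal-valued placeholder list is purely an assumption used to exercise the formula, and the function names are illustrative.

```python
from math import comb, erfc, sqrt

def binomial_tail(successes, n, rate):
    """Exact probability of `successes` or more out of `n` at the given rate."""
    return sum(
        comb(n, k) * rate**k * (1 - rate) ** (n - k)
        for k in range(successes, n + 1)
    )

def stouffer_z(z_scores):
    """Sum of the z scores divided by the square root of their count."""
    return sum(z_scores) / sqrt(len(z_scores))

def one_tailed_p(z):
    """Upper-tail area of the standard normal distribution."""
    return 0.5 * erfc(z / sqrt(2))

# Replication counts quoted in the text.
p_positive = binomial_tail(23, 28, 0.5)   # 23 of 28 studies with positive z
p_sig_05 = binomial_tail(12, 28, 0.05)    # 12 of 28 significant at the 5% level
p_sig_01 = binomial_tail(7, 28, 0.01)     # 7 of 28 significant at the 1% level

# Placeholder z scores (an assumption): 28 equal values that reproduce the
# reported composite of 6.60, plus 10 additional studies assigned z = 0.
sum_of_z = 6.60 * sqrt(28)
placeholder = [sum_of_z / 28] * 28
composite_28 = stouffer_z(placeholder)
composite_38 = stouffer_z(placeholder + [0.0] * 10)
print(p_positive, p_sig_05, p_sig_01)
print(composite_28, one_tailed_p(composite_28))
print(composite_38, one_tailed_p(composite_38))
```

Adding null studies leaves the sum of z scores unchanged and only enlarges the divisor, which is why the composite shrinks but stays well above conventional criteria.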
Thus, whether one considers only the studies for which
the relevant information is available or includes a null estimate
for the additional studies for which the information
is not available, the aggregate results cannot reasonably
be attributed to chance. And, by design, the cumulative
outcome reported here cannot be attributed to the inflation
of significance levels through multiple analysis.
[1] Stouffer's z is computed by dividing the sum of the z scores for
the individual studies by the square root of the number of studies
(Rosenthal, 1978).
Rates by laboratory. One objection to estimates such as
those just described is that studies from a common labora-
tory are not independent of one another (Parker, 1978).
Thus, it is possible for one or two investigators to be disproportionately
responsible for a high replication rate
whereas other, independent investigators are unable to
obtain the effect.
The ganzfeld database is vulnerable to this possibility.
The 28 studies providing hit rate information were con-
ducted by investigators in 10 different laboratories. One
laboratory contributed 9 of the studies, Honorton's own
laboratory contributed 5, 2 other laboratories contributed
3 each, 2 contributed 2 each, and the remaining 4 laboratories
each contributed 1. Thus, half of the studies were
conducted by only 2 laboratories, 1 of them Honorton's
own.
Accordingly, Honorton calculated a separate Stouffer z
score for each laboratory. Significantly positive outcomes
were reported by 6 of the 10 laboratories, and the combined
z score across laboratories was 6.16 (p = 3.6 x
10^-10). Even if all of the studies conducted by the 2 most
prolific laboratories are discarded from the analysis, the
Stouffer z across the 8 other laboratories remains significant
(z = 3.67, p = 1.2 x 10^-4). Four of these studies are
significant at the 1% level (p = 9.2 x 10^-6, binomial test
with 14 studies, p = .01, and q = .99), and each was con-
tributed by a different laboratory. Thus, even though the
total number of laboratories in this database is small,
most of them have reported significant studies, and the
significance of the overall effect does not depend on just
one or two of them.
Selective Reporting
In recent years, behavioral scientists have become in-
creasingly aware of the "file-drawer" problem: the likeli-
hood that successful studies are more likely to be pub-
lished than unsuccessful studies, which are more likely to
be consigned to the file drawers of their disappointed in-
vestigators (Bozarth & Roberts, 1972; Sterling, 1959).
Parapsychologists were among the first to become sensi-
tive to the problem, and, in 1975, the Parapsychological
Association Council adopted a policy opposing the selec-
tive reporting of positive outcomes. As a consequence,
negative findings have been routinely reported at the as-
sociation's meetings and in its affiliated publications for
almost two decades. As has already been shown, more
than half of the ganzfeld studies included in the meta-
analysis yielded outcomes whose significance falls short of
the conventional .05 level.
A variant of the selective reporting problem arises from
what Hyman (1985) has termed the "retrospective study."
An investigator conducts a small set of exploratory trials.
If they yield null results, they remain exploratory and
never become part of the official record; if they yield posi-
tive results, they are defined as a study after the fact and
are submitted for publication. In support of this possibil-
ity, Hyman noted that there are more significant studies
in the database with fewer than 20 trials than one would
expect under the assumption that, all other things being
equal, statistical power should increase with the square
root of the sample size. Although Honorton questioned the
assumption that "all other things" are in fact equal across
the studies and disagreed with Hyman's particular statis-
tical analysis, he agreed that there is an apparent cluster-
ing of significant studies with fewer than 20 trials. (Of the
complete ganzfeld database of 42 studies, 8 involved fewer
than 20 trials, and 6 of those studies reported statistically
significant results.)
Because it is impossible, by definition, to know how
many unknown studies, exploratory or otherwise, are
languishing in file drawers, the major tool for estimating
the seriousness of selective reporting problems has be-
come some variant of Rosenthal's file drawer statistic, an
estimate of how many unreported studies with z scores of
zero would be required to exactly cancel out the signifi-
cance of the known database (Rosenthal, 1979). For the 28
direct-hit ganzfeld studies alone, this estimate is 423 fugitive
studies, a ratio of unreported-to-reported studies of
approximately 15:1. When it is recalled that a single
ganzfeld session takes over an hour to conduct, it is not
surprising that, despite his concern with the retrospective
study problem, Hyman concurred with Honorton and
other participants in the published debate that selective
reporting problems cannot plausibly account for the over-
all statistical significance of the psi ganzfeld database
(Hyman & Honorton, 1986).[2]
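A minimal sketch of the file drawer estimate follows, using the form in which Rosenthal's statistic is commonly stated: the number of additional z = 0 studies needed to pull the combined result down to the one-tailed .05 criterion (z = 1.645). The function name and the decision to recover the sum of z scores from the composite of 6.60 reported earlier are assumptions of this sketch.

```python
from math import ceil, sqrt

Z_CRIT = 1.645  # one-tailed .05 criterion for the combined z

def fail_safe_n(composite_z, k, z_crit=Z_CRIT):
    """Number of unreported z = 0 studies that would reduce the composite to z_crit."""
    sum_z = composite_z * sqrt(k)  # Stouffer z times sqrt(k) recovers the sum of z scores
    # Solve sum_z / sqrt(k + x) = z_crit for x.
    return (sum_z / z_crit) ** 2 - k

# The 28 direct-hit studies with the reported composite z of 6.60.
n_hidden = fail_safe_n(6.60, 28)
print(ceil(n_hidden))
```

Rounded up to whole studies, this gives the 423 figure quoted above.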
Methodological Flaws
If the most frequent criticism of parapsychology is that
it has not produced a replicable psi effect, the second most
frequent criticism is that many, if not most, psi experiments
have inadequate controls and procedural safeguards.
A frequent charge is that positive results emerge
primarily from initial, poorly controlled studies and then
vanish as better controls and safeguards are introduced.
Fortunately, meta-analysis provides a vehicle for empir-
ically evaluating the extent to which methodological flaws
may have contributed to artifactual positive outcomes
across a set of studies. First, ratings are assigned to each
study that index the degree to which particular method-
ological flaws are or are not present; these ratings are
then correlated with the studies' outcomes. Large positive
correlations constitute evidence that the observed effect
may be artifactual.
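The flaw analysis described here amounts to correlating a flaw rating with each study's outcome. The sketch below uses a plain Pearson correlation on invented values: the flaw codes (1 = flaw present, 0 = absent) and the z scores are placeholders chosen for illustration, not data from the ganzfeld database.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Invented example: one flaw rating and one outcome (z score) per study.
flaw_present = [1, 0, 0, 1, 0, 1, 0, 0]
study_outcome = [1.2, 0.4, 2.1, -0.3, 1.8, 0.9, -0.5, 1.5]

r = pearson_r(flaw_present, study_outcome)
print(round(r, 2))
```

A large positive r would suggest that flawed studies produce the better outcomes; in this made-up data the correlation is slightly negative, so the flaw would not explain the effect.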
In psi research, the most fatal flaws are those that
might permit a subject to obtain the target information in
normal sensory fashion, either inadvertently or through
deliberate cheating. This is called the problem of sensory
leakage. Another potentially serious flaw is inadequate
randomization of target selection.
Sensory leakage. Because the ganzfeld is itself a perceptual
isolation procedure, it goes a long way toward eliminating
potential sensory leakage during the ganzfeld portion
of the session. There are, however, potential channels
of sensory leakage after the ganzfeld period. For example,
if the experimenter who interacts with the receiver knows
the identity of the target, he or she could bias the receiver's
similarity ratings in favor of correct identification.
Only one study in the database contained this flaw, a
study in which subjects actually performed slightly below
chance expectation. Second, if the stimulus set given to
the receiver for judging contains the actual physical target
handled by the sender during the sending period, there
might be cues (e.g., fingerprints, smudges, or temperature
differences) that could differentiate the target from the
decoys. Moreover, the process of transferring the stimulus
materials to the receiver's room itself opens up other potential
channels of sensory leakage. Although contemporary
ganzfeld studies have eliminated both of these possibilities
by using duplicate stimulus sets, some of the earlier
studies did not.
[2] A 1980 survey of parapsychologists uncovered only 19 completed
but unreported ganzfeld studies. Seven of these had
achieved significantly positive results, a proportion (.37) very
similar to the proportion of independently significant studies in
the meta-analysis (.43) (Blackmore, 1980).
Independent analyses by Hyman and Honorton agreed
that there was no correlation between inadequacies of security
against sensory leakage and study outcome. Honorton
further reported that if studies that failed to use duplicate
stimulus sets were discarded from the analysis,
the remaining studies are still highly significant (Stouffer
z = 4.36, p = 6.8 x 10^-6).
Randomization. In many psi experiments, the issue of
target randomization is critical because systematic pat-
terns in inadequately randomized target sequences might
be detected by subjects during a session or might match
subjects' preexisting response biases. In a ganzfeld study,
however, randomization is a much less critical issue be-
cause only one target is selected during the session and
most subjects serve in only one session. The primary concern
is simply that all the stimuli within each judging set be
sampled uniformly over the course of the study. Similar
considerations govern the second randomization, which takes
place after the ganzfeld period and determines the sequence
in which the target and decoys are
presented to the receiver (or external judge) for judging.
Nevertheless, Hyman and Honorton disagreed over the
findings here. Hyman claimed there was a correlation between
flaws of randomization and study outcome; Honorton claimed
there was not. The sources of this disagreement were in
conflicting definitions of flaw categories, in
the coding and assignment of flaw ratings to individual
studies, and in the subsequent statistical treatment of
those ratings.
Unfortunately, there have been no ratings of flaws by
independent raters who were unaware of the studies' outcomes
(Morris, 1991). Nevertheless, none of the contributors to the
subsequent debate concurred with Hyman's conclusion, whereas
four nonparapsychologists (two statisticians and two
psychologists) explicitly concurred
with Honorton's conclusion (Harris & Rosenthal, 1988b;
Saunders, 1985; Utts, 1991a). For example, Harris and
Rosenthal (one of the pioneers in the use of meta-analysis
in psychology) used Hyman's own flaw ratings and failed
to find any significant relationships between flaws and
study outcomes in each of two separate analyses: "Our
analysis of the effects of flaws on study outcome lends no
support to the hypothesis that Ganzfeld research results
are a significant function of the set of flaw variables"
(1988b, p. 3; for a more recent exchange regarding Hyman's
analysis, see Hyman, 1991; Utts, 1991a, 1991b).
Effect Size
Some critics of parapsychology have argued that even if
current laboratory-produced psi effects turn out to be
replicable and nonartifactual, they are too small to be of
theoretical interest or practical importance. We do not be-
lieve this to be the case for the psi ganzfeld effect.
In psi ganzfeld studies, the hit rate itself provides a
straightforward descriptive measure of effect size, but this
measure cannot be compared directly across studies be-
cause they do not all use a four-stimulus judging set and,
hence, do not all have a chance baseline of .25. The next
most obvious candidate, the difference in each study be-
tween the hit rate observed and the hit rate expected un-
der the null hypothesis, is also intuitively descriptive but
is not appropriate for statistical analysis because not all
differences between proportions that are equal are equally
detectable (e.g., the power to detect the difference between
.55 and .25 is different from the power to detect the differ-
ence between .50 and .20).
To provide a scale of equal detectability, Cohen (1988)
devised the effect size index h, which involves an arcsine
transformation on the proportions before calculation of
their difference. Cohen's h is quite general and can assess
the difference between any two proportions drawn from
independent samples or between a single proportion and
any specified hypothetical value. For the 28 studies exam-
ined in the meta-analyses, h was .28, with a 95% confi-
dence interval from .11 to .45.
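The point about unequal detectability can be checked numerically. The following is a minimal Python sketch, assuming the standard definition of Cohen's h as the difference between arcsine-transformed proportions; the function name cohen_h is illustrative and does not come from the cited sources.

```python
import math


def cohen_h(p1: float, p2: float) -> float:
    """Cohen's h: difference between two arcsine-transformed proportions."""
    phi1 = 2.0 * math.asin(math.sqrt(p1))
    phi2 = 2.0 * math.asin(math.sqrt(p2))
    return phi1 - phi2


# Both pairs differ by .30 in raw proportion, yet their h values differ,
# which is why raw differences are not equally detectable.
h_a = cohen_h(0.55, 0.25)
h_b = cohen_h(0.50, 0.20)
print(f"h(.55 vs .25) = {h_a:.3f}")
print(f"h(.50 vs .20) = {h_b:.3f}")
```

The second pair yields the larger h, so an equal raw difference of .30 is somewhat easier to detect when it lies nearer the end of the scale.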
But because values of h do not provide an intuitively
descriptive scale, Rosenthal and Rubin (1989; Rosenthal,
1991) have recently suggested a new index, π, which applies
specifically to one-sample, multiple-choice data of the kind
obtained in ganzfeld experiments. In particular, it expresses
all hit rates as the proportion of hits that would have been
obtained if there had been only two equally likely
alternatives, essentially a coin flip. Thus, π ranges from 0
to 1, with .5 expected under the null hypothesis. The formula is
    \pi = \frac{P(k - 1)}{P(k - 2) + 1}
where P is the raw proportion of hits and k is the number
of alternative choices available. Because it has such a
straightforward intuitive interpretation, we use it (or its
conversion back to an equivalent four-alternative hit rate)
throughout this article whenever it is applicable.
For the 28 studies examined in the meta-analyses, the
mean value of π was .62, with a 95% confidence interval
from .55 to .69. This corresponds to a four-alternative hit
rate of 35%, with a 95% confidence interval from 28% to
43%.
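The conversion between a raw hit rate and the two-alternative index can be sketched in a few lines of Python. This is an illustrative sketch only: the function names pi_index and equivalent_hit_rate are assumptions, and the inverse is obtained by solving the formula above for P.

```python
def pi_index(hit_rate: float, k: int) -> float:
    """Rescale a hit rate from a k-alternative task to a two-alternative scale."""
    return hit_rate * (k - 1) / (hit_rate * (k - 2) + 1)


def equivalent_hit_rate(pi: float, k: int = 4) -> float:
    """Inverse of pi_index: P = pi / ((k - 1) - pi * (k - 2))."""
    return pi / ((k - 1) - pi * (k - 2))


# Chance performance on a four-alternative task maps to .5.
print(pi_index(0.25, 4))
# The mean index of .62 maps back to a four-alternative hit rate near 35%.
print(round(equivalent_hit_rate(0.62), 3))
```

Under this rescaling, studies with different numbers of alternatives share a common chance baseline of .5, which is what allows them to be averaged.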
Cohen (1988, 1992) has also categorized effect sizes into
small, medium, and large, with medium denoting an effect
size that should be apparent to the naked eye of a careful
observer. For a statistic such as π, which indexes the
deviation of a proportion from .5, Cohen considers .65 to be a
medium effect size: A statistically unaided observer
should be able to detect the bias of a coin that comes up
heads on 65% of the trials. Thus, at .62, the psi ganzfeld
effect size falls just short of Cohen's naked-eye criterion.
From the phenomenology of the ganzfeld experimenter,
the corresponding hit rate of 35% implies that he or she
will see a subject obtain a hit approximately every third
session rather than every fourth.
It is also instructive to compare the psi ganzfeld effect
with the results of a recent medical study that sought to
determine whether aspirin can prevent heart attacks
(Steering Committee of the Physicians' Health Study Re-
search Group, 1988). The study was discontinued after 6
years because it was already clear that the aspirin treatment
was effective (p < .00001) and it was considered unethical
to keep the control group on placebo medication.
The study was widely publicized as a major medical
breakthrough. But despite its undisputed reality and
practical importance, the size of the aspirin effect is quite
small: Taking aspirin reduces the probability of suffering
a heart attack by only .008. The corresponding effect size
(h) is .068, about one third to one fourth the size of the psi
ganzfeld effect (Atkinson et al., 1993, p. 236; Utts, 1991b).
In sum, we believe that the psi ganzfeld effect is large
enough to be of both theoretical interest and potential
practical importance.
Experimental Correlates of the Psi Ganzfeld Effect
We showed earlier that the technique of correlating
variables with effect sizes across studies can help to as-
sess whether methodological flaws might have produced
artifactual positive outcomes. The same technique can be
used more affirmatively to explore whether an effect
varies systematically with conceptually relevant varia-
tions in experimental procedure. The discovery of such
correlates can help to establish an effect as genuine, sug-
gest ways of increasing replication rates and effect sizes,
and enhance the chances of moving beyond the simple
demonstration of an effect to its explanation. This strat-
egy is only heuristic, however. Any correlates discovered
must be considered quite tentative, both because they
emerge from post hoc exploration and because they neces-
sarily involve comparisons across heterogeneous studies
that differ simultaneously on many interrelated variables,
known and unknown. Two such correlates emerged from
the meta-analyses of the psi ganzfeld effect.
Single- versus multiple-image targets. Although most of
the 28 studies in the meta-analysis used single pictures as
targets, 9 (conducted by three different investigators)
used View Master stereoscopic slide reels that presented
multiple images focused on a central theme. Studies using
the View Master reels produced significantly higher hit
rates than did studies using the single-image targets (50%
vs. 34%), t(26) = 2.22, p = .035, two-tailed.
Sender-receiver pairing. In 17 of the 28 studies, participants
were free to bring in friends to serve as senders. In
8 studies, only laboratory-assigned senders were used.
(Three studies used no sender.) Unfortunately, there is no
record of how many participants in the former studies ac-
tually brought in friends. Nevertheless, those 17 studies
(conducted by six different investigators) had significantly
higher hit rates than did the studies that used only labo-
ratory-assigned senders (44% vs. 26%), t(23) = 2.39, p =
.025, two-tailed.
The Joint Communique
After their published exchange in 1985, Hyman and
Honorton agreed to contribute a joint communique to the
subsequent discussion that was published in 1986. First
they set forth their areas of agreement and disagreement:
We agree that there is an overall significant effect in this
data base that cannot reasonably be explained by selective
reporting or multiple analysis. We continue to differ over
the degree to which the effect constitutes evidence for psi,
but we agree that the final verdict awaits the outcome of
future experiments conducted by a broader range of investigators
and according to more stringent standards. (Hyman & Honorton,
1986, p. 351)
They then spelled out in detail the "more stringent standards"
they believed should govern future experiments. These standards
included strict security precautions against sensory leakage,
testing and documentation of randomization methods for selecting
targets and sequencing the judging set, statistical correction
for multiple analyses, advance specification of the status of
the experiment (e.g., pilot study or confirmatory experiment),
and full documentation in the published report of the
experimental procedures and the status of statistical tests
(e.g., planned or post hoc).
The National Research Council Report
In 1988, the National Research Council (NRC) of the National
Academy of Sciences released a widely publicized report
commissioned by the U.S. Army that assessed several
controversial technologies for enhancing human performance,
including accelerated learning, neurolinguistic programming,
mental practice, biofeedback, and parapsychology (Druckman &
Swets, 1988; summarized in Swets & Bjork, 1990). The report's
conclusion concerning parapsychology was quite negative: "The
Committee finds no scientific justification from research
conducted over a period of 130 years for the existence of
parapsychological phenomena" (Druckman & Swets, 1988, p. 22).
An extended refutation strongly protesting the committee's
treatment of parapsychology has been published elsewhere
(Palmer et al., 1989). The pertinent point here is simply that
the NRC's evaluation of the ganzfeld studies does not reflect
an additional, independent examination of the ganzfeld database
but is based on the same meta-analysis conducted by Hyman that
we have discussed in this article.
Hyman chaired the NRC's Subcommittee on Parapsychology, and,
although he had concurred with Honorton 2 years earlier in
their joint communique that "there is an overall significant
effect in this data base that cannot reasonably be explained by
selective reporting or multiple analysis" (p. 351) and that
"significant outcomes have been produced by a number of
different investigators" (p. 352), neither of these points is
acknowledged in the committee's report.
The NRC also solicited a background report from Harris and
Rosenthal (1988a), which provided the committee with a
comparative methodological analysis of the five controversial
areas just listed. Harris and Rosenthal noted that, of these
areas, "only the Ganzfeld ESP studies [the only psi studies
they evaluated] regularly meet the basic requirements of sound
experimental design" (p. 63), and they concluded that
it would be implausible to entertain the null given the combined
p from these 28 studies. Given the various problems or flaws
pointed out by Hyman and Honorton ... we might estimate the
obtained accuracy rate to be about 1/3 ... when the accuracy
rate expected under the null is 1/4. (p. 51)3
The Autoganzfeld Studies
In 1983, Honorton and his colleagues initiated a new series of
ganzfeld studies designed to avoid the methodological problems
he and others had identified in earlier studies (Honorton, 1979;
Kennedy, 1979). These studies complied with all of the detailed
guidelines that he and Hyman were to publish later in their
joint communique. The program continued until September 1989,
when a loss of funding forced the laboratory to close. The major
innovations of the new studies were the computer control of the
experimental protocol (hence the name autoganzfeld) and the
introduction of videotaped film clips as target stimuli.
Method
The basic design of the autoganzfeld studies was the same as
that described earlier4: A receiver and sender were sequestered
in separate, acoustically isolated chambers. After a 14-minute
period of progressive relaxation, the receiver underwent
ganzfeld stimulation while describing his or her thoughts and
images aloud for 30 minutes. Meanwhile, the sender concentrated
on a randomly selected target. At the end of the ganzfeld
period, the receiver was shown four stimuli and, without knowing
which of the four had been the target, rated each stimulus for
its similarity to his or her mentation during the ganzfeld.
The targets consisted of 80 still pictures (static targets) and
80 short video segments complete with soundtracks (dynamic
targets), all recorded on videocassette. The static targets
included art prints, photographs, and magazine advertisements;
the dynamic targets included excerpts of approximately 1-min
duration from motion pictures, TV shows, and cartoons. The 160
targets were arranged in judging sets of four static or four
dynamic targets each, constructed to minimize similarities among
targets within a set.
Target selection and presentation. The VCR containing the taped
targets was interfaced to the controlling computer, which
selected the target and controlled its repeated presentation to
the sender during the ganzfeld period, thus eliminating the need
for a second experimenter to accompany the sender. After the
ganzfeld period, the computer randomly sequenced the four-clip
judging set and presented it to the receiver on a TV monitor for
judging. The receiver used a computer game paddle to make his or
her ratings on a 40-point scale that appeared on the TV monitor
after each clip was shown. The receiver was permitted to see
each clip and to change the ratings repeatedly until he or she
was satisfied. The computer then wrote these and other data from
the session into a file on a floppy disk. At that point, the
sender moved to the receiver's chamber and revealed the identity
of the target to both the receiver and the experimenter. Note
that the experimenter did not even know the identity of the
four-clip set until it was displayed to the receiver for judging.
3 In a troubling development, the chair of the NRC Committee
phoned Rosenthal and asked him to delete the parapsychology
section of the paper (R. Rosenthal, personal communication,
September 15, 1992). Although Rosenthal refused to do so, that
section of the Harris-Rosenthal paper is nowhere cited in the
NRC report.
4 Because Honorton and his colleagues have complied with the
Hyman-Honorton specification that experimental reports be
sufficiently complete to permit others to reconstruct the
investigators' procedures, readers who wish to know more detail
than we provide here are likely to find whatever they need in
the archival publication of these studies in the Journal of
Parapsychology (Honorton et al., 1990).
Randomization. The random selection of the target and
sequencing of the judging set were controlled by a noise-based
random number generator interfaced to the computer. Extensive
testing confirmed that the generator was
providing a uniform distribution of values throughout the
full target range (1-160). Tests on the actual frequencies
observed during the experiments confirmed that targets
were, on average, selected uniformly from among the 4
clips within each target set and that the 4 judging se-
quences used were uniformly distributed across sessions.
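A uniformity check of the kind described here can be illustrated with a chi-square goodness-of-fit statistic. The sketch below is an assumption about how such a test might look, not the laboratory's actual procedure: the counts are hypothetical, and the critical value 7.815 is the standard chi-square cutoff for 3 degrees of freedom at the .05 level.

```python
def chi_square_uniform(counts):
    """Chi-square goodness-of-fit statistic against a uniform distribution."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


CRITICAL_DF3_05 = 7.815  # chi-square critical value, df = 3, alpha = .05

# Hypothetical tallies of how often each of the 4 positions within a
# judging set was chosen as the target across 329 sessions.
position_counts = [85, 80, 79, 85]

statistic = chi_square_uniform(position_counts)
print(f"chi-square = {statistic:.3f}")
print("consistent with uniform selection:", statistic < CRITICAL_DF3_05)
```

A statistic well below the critical value, as with these made-up counts, gives no reason to doubt that targets were selected uniformly.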
Additional control features. The receiver's and sender's
rooms were sound isolated, electrically shielded chambers
with single-door access that could be continuously moni-
tored by the experimenter. There was two-way intercom
communication between the experimenter and the re-
ceiver but only one-way communication into the sender's
room; thus, neither the experimenter nor the receiver
could monitor events inside the sender's room. The
archival record for each session includes an audiotape
containing the receiver's mentation during the ganzfeld
period and all verbal exchanges between the experimenter
and the receiver throughout the experiment.
The automated ganzfeld protocol has been examined by
several dozen parapsychologists and behavioral re-
searchers from other fields, including well-known critics
of parapsychology. Many have participated as subjects or
observers. All have expressed satisfaction with the han-
dling of security issues and controls.
Parapsychologists have often been urged to employ ma-
gicians as consultants to ensure that the experimental
protocols are not vulnerable either to inadvertent sensory
leakage or to deliberate cheating. Two "mentalists," magicians
who specialize in the simulation of psi, have examined the
autoganzfeld system and protocol. Ford Kross, a professional
mentalist and officer of the mentalists' professional
organization, the Psychic Entertainers Association, provided
the following written statement: "In my professional capacity
as a mentalist, I have reviewed Psychophysical Research
Laboratories' automated ganzfeld system and found it to
provide excellent security against deception by subjects"
(personal communication, May, 1989).
Daryl J. Bem has also performed as a mentalist for
many years and is a member of the Psychic Entertainers
Association. As mentioned in the author note, this article
had its origins in a 1983 visit he made to Honorton's labo-
ratory, where he was asked to critically examine the re-
search protocol from the perspective of a mentalist, a re-
search psychologist, and a subject. Needless to say, this
article would not exist if he did not concur with Ford
Kross's assessment of the security procedures.
Experimental Studies5
Altogether, 100 men and 140 women participated as re-
ceivers in 354 sessions during the research program. The
participants ranged in age from 17 to 74 years (M = 37.3,
SD = 11.8), with a mean formal education of 15.6 years
(SD = 2.0). Eight separate experimenters, including Hon-
orton, conducted the studies.
5 A recent review of the original computer files uncovered a
duplicate record in the autoganzfeld database. This has now been
eliminated, reducing by one the number of subjects and sessions.
As a result, some of the numbers presented in this article differ
slightly from those in Honorton et al. (1990).
The experimental program included three pilot and
eight formal studies. Five of the formal studies used
novice (first-time) participants who served as the receiver
in one session each. The remaining three formal studies
used experienced participants.
Pilot studies. Sample sizes were not preset in the three
pilot studies. Study 1 comprised 22 sessions and was con-
ducted during the initial development and testing of the
autoganzfeld system. Study 2 comprised 9 sessions testing
a procedure in which the experimenter, rather than the
receiver, served as the judge at the end of the session.
Study 3 comprised 35 sessions and served as practice for
participants who had completed the allotted number of
sessions in the ongoing formal studies but who wanted
additional ganzfeld experience. This study also included
several demonstration sessions when TV film crews were
present.
Novice Studies. Studies 101-104 were each designed to
test 50 participants who had had no prior ganzfeld experi-
ence; each participant served as the receiver in a single
ganzfeld session. Study 104 included 16 of 20 students re-
cruited from the Juilliard School in New York City to test
an artistically gifted sample. Study 105 was initiated to
accommodate the overflow of participants who had been
recruited for Study 104, including the four remaining Juil-
liard students. The sample size for this study was set to
25, but only 6 sessions had been completed when the labo-
ratory closed. For purposes of exposition, we divided the
56 sessions from Studies 104 and 105 into two parts:
Study 104/105(a) comprises the 36 non-Juilliard participants
and Study 104/105(b) comprises the 20 Juilliard
students.
Study 201. This study was designed to retest the most
promising participants from the previous studies. The
number of trials was set to 20, but only 7 sessions with 3
participants had been completed when the laboratory closed.
Study 301. This study was designed to compare static
and dynamic targets. The sample size was set to 50 ses-
sions. Twenty-five experienced participants each served
as the receiver in 2 sessions. Unknown to the participants,
the computer control program was modified to ensure that
they would each have 1 session with a static target and 1
session with a dynamic target.
Study 302. This study was designed to examine a dynamic
target set that had yielded a particularly high hit
rate in the previous studies. The study involved experi-
enced participants who had had no prior experience with
this particular target set and who were unaware that only
one target set was being sampled. Each served as the re-
ceiver in a single session. The design called for the study
to continue until 15 sessions were completed with each of
the targets, but only 25 sessions had been completed
when the laboratory closed.
The 11 studies just described comprise all sessions con-
ducted during the 6.6 years of the program. There is no
"file drawer" of unreported sessions.
Results
Overall hit rate. As in the earlier meta-analysis, re-
ceivers' ratings were analyzed by tallying the proportion
of hits achieved and calculating the exact binomial proba-
bility for the observed number of hits compared with the
chance expectation of .25. As noted earlier, 240 participants
contributed 354 sessions. For reasons discussed later, Study 302
is analyzed separately, reducing the number of sessions in the
primary analysis to 329.
Table 1
Outcome by Study
              Study/subject        N          N        N       %      Effect
Study         description       subjects    trials    hits    hits   size (π)      z
1             Pilot                19         22        8      36      .62        0.99
2             Pilot            [illegible]     9        3      33      .60        0.25
3             Pilot                24         35       10      29      .55        0.32
101           Novice               50         50       12      24      .47       -0.30
102           Novice               50         50       18      36      .63        1.60
103           Novice               50         50       15      30      .56        0.67
104/105(a)    Novice               36         36       12      33      .60        0.97
104/105(b)    Juilliard sample     20         20       10      50      .75        2.20
201           Experienced           3          7        3      43      .69        0.69
301           Experienced          25         50       15      30      .56        0.67
302           Experienced          25         25       16      54a     .78a       3.04
Overall (Studies 1-301)        [illegible]   329      106      32      .59        2.89
Note. All z scores are based on the exact binomial probability, with p = .25 and q = .75.
a Adjusted for response bias; the observed hit rate was 64%.
As Table 1 shows, there were 106 hits in the 329 sessions,
a hit rate of 32% (z = 2.89, p = .002, one-tailed),
with a 95% confidence interval from 30% to 35%. This
corresponds to an effect size (π) of .59, with a 95% confidence
interval from .53 to .64.
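The exact binomial analysis described above can be reproduced approximately with the Python standard library. This is a hedged sketch rather than the authors' own computation: the function name binomial_tail is illustrative, and converting the tail probability to z through the inverse normal distribution is an assumption about how the reported z was obtained.

```python
from math import comb
from statistics import NormalDist


def binomial_tail(hits: int, trials: int, p_chance: float) -> float:
    """Exact one-tailed probability of at least `hits` successes by chance."""
    q = 1.0 - p_chance
    return sum(
        comb(trials, k) * p_chance**k * q ** (trials - k)
        for k in range(hits, trials + 1)
    )


# 106 hits in 329 sessions against a chance hit rate of .25.
p_value = binomial_tail(106, 329, 0.25)
z = NormalDist().inv_cdf(1.0 - p_value)
print(f"p = {p_value:.4f}, z = {z:.2f}")
```

Because the tail is summed exactly, the calculation does not rely on a normal approximation to the binomial; the normal distribution enters only when the probability is expressed as a z score.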
Table 1 also shows that when Studies 104 and 105 are
combined and re-divided into Studies 104/105(a) and
104/105(b), 9 of the 10 studies yield positive effect sizes,
with a mean effect size (π) of .61, t(9) = 4.44, p = .0008,
one-tailed. This effect size is equivalent to a four-alternative
hit rate of 34%. Alternatively, if Studies 104 and 105
are retained as separate studies, 9 of the 10 studies again
yield positive effect sizes, with a mean effect size (π) of
.62, t(9) = 3.73, p = .002, one-tailed. This effect size is
equivalent to a four-alternative hit rate of 35% and is
identical to that found across the 28 studies of the earlier
meta-analysis.6
Considered together, sessions with novice participants
(Studies 101-105) yielded a statistically significant hit
rate of 32.5% (p = .009), which is not significantly different
from the 31.6% hit rate achieved by experienced participants
in Studies 201 and 301. And finally, each of the eight
experimenters also achieved a positive effect size, with a
mean π of .60, t(7) = 3.44, p = .005, one-tailed.
6 As noted above, the laboratory was forced to close before three
of the formal studies could be completed. If we assume that the
remaining trials in Studies 105 and 201 would have yielded only
chance results, this would reduce the overall z for the first 10
autoganzfeld studies from 2.89 to 2.76 (p = .003). Thus, inclusion
of the two incomplete studies does not pose an optional stopping
problem. The third incomplete study, Study 302, is discussed
below.
The Juilliard sample. There are several reports in the
literature of a relationship between creativity or artistic
ability and psi performance (Schmeidler, 1988). To explore
this possibility in the ganzfeld setting, 10 male and 10
female undergraduates were recruited from the Juilliard
School. Of these, 8 were music students, 10 were drama
students, and 2 were dance students. Each served as the
receiver in a single session in Study 104 or 105. As shown
in Table 1, these students achieved a hit rate of 50% (p =
.014), one of the five highest hit rates ever reported for a
single sample in a ganzfeld study. The musicians were
particularly successful: 6 of the 8 (75%) successfully
identified their targets (p = .004; further details about this
sample and their ganzfeld performance were reported in
Schlitz & Honorton, 1992).
Study size and effect size. There is a significant negative
correlation across the 10 studies listed in Table 1 between
the number of sessions included in a study and the study's
effect size (π), r = -.64, t(8) = 2.36, p < .05, two-tailed. This
is reminiscent of Hyman's discovery that the smaller studies
in the original ganzfeld database were disproportionately
likely to report statistically significant results. He
interpreted this finding as evidence for a bias against the
reporting of small studies that fail to achieve significant
results. A similar interpretation cannot be applied to the
autoganzfeld studies, however, because there are no
unreported sessions.
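The significance test for a correlation of this size rests on the standard conversion of r to a t statistic with n - 2 degrees of freedom. The short Python sketch below applies that conversion to r = -.64 across 10 studies; the function name t_from_r is illustrative.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t statistic for a Pearson correlation r based on n paired values."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


t_value = t_from_r(-0.64, 10)
print(f"t(8) = {t_value:.2f}")  # the magnitude is about 2.36
```

The sign simply follows the sign of r; the magnitude is what is compared with the critical t for 8 degrees of freedom.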
One reviewer of this article suggested that the negative
correlation might reflect a decline effect in which earlier
sessions of a study are more successful than later sessions.
If there were such an effect, then studies with fewer
sessions would show larger effect sizes because they
would end before a decline could set in. To check this
possibility, we computed point-biserial correlations between
hits (1) or misses (0) and the session number within each
of the 10 studies. All of the correlations hovered around
zero; six were positive, four were negative, and the overall
mean was .01.
Table 2
Study 302: Expected Hit Rate and Proportion of Sessions in Which Each Video
Clip Was Ranked First When It Was a Target and When It Was a Decoy
              Relative      Relative frequency   Expected   Ranked first   Ranked first                Fisher's
              frequency     of first-place       hit rate   when           when                        exact
Video clip    as target     ranking              (%)        target         decoy          Difference   p
Tidal Wave    .28 (7/25)    .24 (6/25)            6.72      .57 (4/7)      .11 (2/18)     .46          .032
Snakes        .12 (3/25)    .12 (3/25)            1.44      .67 (2/3)      .05 (1/22)     .62          .029
Sex Scene     .16 (4/25)    .08 (2/25)            1.28      .25 (1/4)      .05 (1/21)     .20          .300
Bugs Bunny    .44 (11/25)   .56 (14/25)          24.64      .82 (9/11)     .36 (5/14)     .46          .027
Overall                                          34.08      .58            .14            .44
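A point-biserial correlation is simply a Pearson correlation in which one variable is coded 0 or 1. The following Python sketch shows the decline-effect check on a single made-up study; the hit sequence is hypothetical and the function name point_biserial is illustrative, since the session-level data are not reproduced in this article.

```python
import math


def point_biserial(outcomes, session_numbers):
    """Pearson r between a 0/1 outcome and a numeric variable."""
    n = len(outcomes)
    mean_y = sum(outcomes) / n
    mean_x = sum(session_numbers) / n
    cov = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(session_numbers, outcomes)
    )
    var_x = sum((x - mean_x) ** 2 for x in session_numbers)
    var_y = sum((y - mean_y) ** 2 for y in outcomes)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical study: hits (1) and misses (0) listed in session order.
hits = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
sessions = list(range(1, len(hits) + 1))

r = point_biserial(hits, sessions)
print(f"r = {r:.2f}")  # a clearly negative r would suggest a decline
```

A decline effect would show up as consistently negative correlations across studies, which is not what the authors report.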
An inspection of Table 1 reveals that the negative
correlation derives primarily from the two studies with the
largest effect sizes: the 20 sessions with the Juilliard
students and the 7 sessions of Study 201, the study
specifically designed to retest the most promising participants
from the previous studies. Accordingly, it seems likely
that the larger effect sizes of these two studies (and
hence the significant negative correlation between the
number of sessions and the effect size) reflect genuine
performance differences between these two small, highly
selected samples and other autoganzfeld participants.
Study 302. All of the studies except Study 302 randomly
sampled from a pool of 160 static and dynamic targets.
Study 302 sampled from a single, dynamic target set that
had yielded a particularly high hit rate in the previous
studies. The four film clips in this set consisted of a scene
of a tidal wave from the movie Clash of the Titans, a high-speed
sex scene from A Clockwork Orange, a scene of crawling snakes
from a TV documentary, and a scene from a Bugs Bunny cartoon.
The experimental design called for this study to con-
tinue until each of the clips had served as the target 15
times. Unfortunately, the premature termination of this
study at 25 sessions left an imbalance in the frequency
with which each clip had served as the target. This means
that the high hit rate observed (64%) could well be in-
flated by response biases.
As an illustration, water imagery is frequently reported
by receivers in ganzfeld sessions whereas sexual imagery
is rarely reported. (Some participants are probably reluctant
both to report sexual imagery and to give the highest
rating to the sex-related clip.) If a video clip containing
popular imagery (such as water) happens to appear as a
target more frequently than a clip containing unpopular
imagery (such as sex), a high hit rate might simply reflect
the coincidence of those frequencies of occurrence with
participants' response biases. And, as the second column
of Table 2 reveals, the tidal wave clip did in fact appear
more frequently as the target than did the sex clip. More
generally, the second and third columns of Table 2 show
that the frequency with which each film clip was ranked
first closely matches the frequency with which each
appeared as the target.
One can adjust for this problem by using the observed
frequencies in these two columns to compute the hit rate
expected if there were no psi effect. In particular, one can
multiply each proportion in the second column by the
corresponding proportion in the third column (yielding the
joint probability that the clip was the target and that it
was ranked first) and then sum across the four clips. As
shown in the fourth column of Table 2, this computation
yields an overall expected hit rate of 34.08%. When the
observed hit rate of 64% is compared with this baseline,
the effect size (h) is .61. As shown in Table 1, this is
equivalent to a four-alternative hit rate of 54%, or a π
value of .78, and is statistically significant (z = 3.04, p =
.0012).
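The adjustment just described can be followed step by step in code. The Python sketch below uses the counts given in Table 2 and the 16 hits in 25 sessions reported for Study 302; the variable and function names are illustrative, and the conversion back to a four-alternative hit rate assumes that the same h is applied to a .25 baseline.

```python
import math

SESSIONS = 25
# (times the clip was the target, times it was ranked first), from Table 2
clips = {
    "Tidal Wave": (7, 6),
    "Snakes": (3, 3),
    "Sex Scene": (4, 2),
    "Bugs Bunny": (11, 14),
}


def arcsine(p: float) -> float:
    """Arcsine transformation used in Cohen's h."""
    return 2.0 * math.asin(math.sqrt(p))


# Hit rate expected from response bias alone: sum of joint probabilities.
expected = sum((t / SESSIONS) * (f / SESSIONS) for t, f in clips.values())

observed = 16 / SESSIONS
h = arcsine(observed) - arcsine(expected)

# Four-alternative hit rate that yields the same h against a .25 baseline.
equivalent = math.sin((arcsine(0.25) + h) / 2.0) ** 2
pi = equivalent * 3 / (equivalent * 2 + 1)

print(f"expected = {expected:.4f}, h = {h:.2f}")
print(f"equivalent hit rate = {equivalent:.2f}, pi = {pi:.2f}")
```

Working through the arithmetic this way gives an expected hit rate of 34.08%, an h of about .61, an equivalent four-alternative hit rate of about 54%, and a two-alternative index of about .78.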
The psi effect can be seen even more clearly in the
remaining columns of Table 2, which control for the
differential popularity of the imagery in the clips by displaying
how frequently each was ranked first when it was the tar-
get compared with how frequently it was ranked first
when it was one of the control clips (decoys). As can be
seen, each of the four clips was selected as the target rel-
atively more frequently when it was the target than when
it was a decoy, a difference that is significant for three of
the four clips. On average, a clip was identified as the
target 58% of the time when it was the target and only 14%
of the time when it was a decoy.
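The Fisher's exact probabilities in Table 2 can be approximated from the counts alone. The sketch below is an assumption-laden illustration: it treats each test as one-tailed (the table does not say which tail was used), and the function name fisher_one_tailed is illustrative. It sums hypergeometric probabilities for outcomes at least as favorable as the one observed.

```python
from math import comb


def fisher_one_tailed(first_as_target, target_sessions,
                      first_as_decoy, decoy_sessions):
    """One-tailed Fisher's exact p for a 2 x 2 table of first-place rankings."""
    total = target_sessions + decoy_sessions
    first_total = first_as_target + first_as_decoy
    denominator = comb(total, target_sessions)
    upper = min(first_total, target_sessions)
    numerator = sum(
        comb(first_total, x) * comb(total - first_total, target_sessions - x)
        for x in range(first_as_target, upper + 1)
    )
    return numerator / denominator


# Counts from Table 2: ranked first when target, then when decoy.
table = {
    "Tidal Wave": (4, 7, 2, 18),
    "Snakes": (2, 3, 1, 22),
    "Sex Scene": (1, 4, 1, 21),
    "Bugs Bunny": (9, 11, 5, 14),
}
p_values = {clip: fisher_one_tailed(*counts) for clip, counts in table.items()}
for clip, p in p_values.items():
    print(f"{clip}: p = {p:.3f}")
```

Under the one-tailed assumption the four probabilities come out near .032, .029, .300, and .027, in line with the tabled values.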
Dynamic versus static targets. The success of Study 302
raises the question of whether dynamic targets are, in
general, more effective than static targets. This possibility
was also suggested by the earlier meta-analysis, which
revealed that studies using multiple-image targets (View
Master stereoscopic slide reels) obtained significantly
higher hit rates than did studies using single-image tar-
gets. By adding motion and sound, the video clips might
be thought of as high-tech versions of the View Master
reels.
The 10 autoganzfeld studies that randomly sampled
from both dynamic and static target pools yielded 164 ses-
sions with dynamic targets and 165 sessions with static
targets. As predicted, sessions using dynamic targets
yielded significantly more hits than did sessions using
static targets (37% vs. 27%; Fisher's exact p < .04).
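A comparison of this kind can be sketched with a one-tailed Fisher's exact test built from the hypergeometric distribution. The sketch below is only illustrative: the function name is hypothetical, the text reports percentages rather than raw hit counts, and the counts used (61 of 164 and 45 of 165) are assumptions chosen solely because they round to 37% and 27%. A one-tailed test is also an assumption, suggested by the phrase "as predicted."

```python
from fractions import Fraction
from math import comb


def fisher_one_tailed(hits_a, n_a, hits_b, n_b):
    """P(group A has at least hits_a hits) with all table margins fixed."""
    total_hits = hits_a + hits_b
    n = n_a + n_b
    denominator = comb(n, n_a)
    upper = min(n_a, total_hits)
    # Sum hypergeometric probabilities for tables at least as extreme.
    tail = sum(
        comb(total_hits, x) * comb(n - total_hits, n_a - x)
        for x in range(hits_a, upper + 1)
    )
    return float(Fraction(tail, denominator))


# Hypothetical counts: NOT from the source, only consistent with 37% vs. 27%.
p = fisher_one_tailed(61, 164, 45, 165)
print(round(p, 3))
```

Exact integer arithmetic (math.comb with Fraction) is used so that the very large binomial coefficients do not lose precision before the final division.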
Sender-receiver pairing. The earlier meta-analysis re-
vealed that studies in which participants were free to
bring in friends to serve as senders produced significantly
higher hit rates than studies that used only laboratory-as-
signed senders. As noted, however, there is no record of
how many of the participants in the former studies actu-
ally did bring in friends. Whatever the case, sender-re-
ceiver pairing was not a significant correlate of psi per-
formance in the autoganzfeld studies: The 197 sessions in
which the sender and receiver were friends did not yield a
significantly higher proportion of hits than did the 132
sessions in which they were not (35% vs. 29%; Fisher's ex-
act p
Correlations between receiver characteristics and psi
performance. Most of the autoganzfeld participants were
strong believers in psi: On a 7-point scale, ranging from
strong disbelief in psi (1) to strong belief in psi (7), the
mean was 6.2 (SD = 1.03); only 2 participants rated their
belief in psi below the midpoint of the scale. In addition,
88% of the participants reported personal experiences
suggestive of psi, and 80% had some training in meditation
or other techniques involving internal focus of attention.
All of these appear to be important variables. The corre-
lation between belief in psi and psi performance is one of
the most consistent findings in the parapsychological
literature (Palmer, 1978). And within the autoganzfeld studies,
successful performance of novice (first-time) participants
was significantly predicted by reported personal psi
experiences, involvement with meditation or other mental
disciplines, and high scores on the Feeling and Perception
factors of the Myers-Briggs Type Inventory (Honorton,
1992; Honorton & Schechter, 1987; Myers & McCaulley,
1985). This recipe for success has now been independently
replicated in another laboratory (Broughton, Kanthamani,
& Khilji, 1990).
The personality trait of extraversion is also associated
with better psi performance. A meta-analysis of 60 inde-
pendent studies with nearly 3,000 subjects revealed a
small but reliable positive correlation between extraversion
and psi performance, especially in studies that used
free-response methods of the kind used in the ganzfeld
experiments (Honorton, Ferrari, & Bem, 1992). Across 14
free-response studies conducted by four independent
investigators, the correlation for 612 subjects was .20 (z =
4.82, p = 1.5 x 10^-6). This correlation was replicated in
the autoganzfeld studies, in which extraversion scores
were available for 218 of the 240 subjects, r = .18, t(216) =
2.67, p = .004, one-tailed.
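The link between a correlation and its t statistic can be checked with the standard conversion t = r * sqrt(df) / sqrt(1 - r^2), where df = N - 2. The snippet below is a minimal sketch with a hypothetical function name; it uses the rounded r = .18 from the text, so its result is expected to differ slightly from the reported t(216) = 2.67, which was presumably computed from the unrounded correlation.

```python
from math import sqrt


def t_from_r(r, n):
    """Convert a Pearson r from n subjects into (t, degrees of freedom)."""
    df = n - 2
    return r * sqrt(df) / sqrt(1 - r * r), df


t, df = t_from_r(0.18, 218)
print(df, round(t, 2))
```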
Finally, there is the strong psi performance of the Juil-
liard students, discussed earlier, which is consistent with
other studies in the parapsychological literature suggest-
ing a relationship between successful psi performance and
creativity or artistic ability.
Discussion
Earlier in this article we quoted from the abstract of the
Hyman-Honorton communique: "We agree that the final
verdict awaits the outcome of future experiments con-
ducted by a broader range of investigators and according
to more stringent standards" (p. 351). We believe that the
"stringent standards" requirement has been met by the
autoganzfeld studies. The results are statistically signifi-
cant and consistent with those in the earlier database.
The mean effect size is quite respectable in comparison
with other controversial research areas of human perfor-
mance (Harris & Rosenthal, 1988a). And there are reli-
able relationships between successful psi performance and
conceptually relevant experimental and subject variables,
relationships that also replicate previous findings. Hyman
(1991) has also commented on the autoganzfeld studies:
"Honorton's experiments have produced intriguing results.
If...independent laboratories can produce similar
results with the same relationships and with the same
attention to rigorous methodology, then parapsychology may
indeed have finally captured its elusive quarry" (p. 392).
Issues of Replication
As Hyman's comment implies, the autoganzfeld studies
by themselves cannot satisfy the requirement that replications
be conducted by a "broader range of investigators."
Accordingly, we hope the findings reported here will be
sufficiently provocative to prompt others to try replicating
the psi ganzfeld effect.
We believe that it is essential, however, that future
studies comply with the methodological, statistical, and
reporting standards set forth in the joint communique and
achieved by the autoganzfeld studies. It is not necessary
for studies to be as automated or as heavily instrumented
as the autoganzfeld studies in order to satisfy the
methodological guidelines, but they are still likely to be
labor intensive and potentially expensive.[7]
[7] As the closing of the autoganzfeld laboratory exemplifies, it is
also difficult to obtain funding for psi research. The traditional,
peer-refereed sources of funding familiar to psychologists have
almost never funded proposals for psi research. The widespread
skepticism of psychologists toward psi is almost certainly a
contributing factor.
Statistical Power and Replication
Would-be replicators also need to be reminded of the
power requirements for replicating small effects. Although
many academic psychologists do not believe in psi, many
apparently do believe in miracles when it comes to
replication. Tversky and Kahneman (1971) posed the following
problem to their colleagues at meetings of the Mathematical
Psychology Group and the American Psychological
Association:
Suppose you have run an experiment on 20 subjects and
have obtained a significant result which confirms your
theory (z = 2.23, p < .05, two-tailed). You now have cause to run
an additional group of 10 subjects. What do you think the
probability is that the results will be significant, by a
one-tailed test, separately for this group? (p. 105)
The median estimate was .85, with 9 out of 10 respon-
dents providing an estimate greater than .60. The correct
answer is approximately .48.
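One way to arrive at a figure near .48 is to assume that the true effect equals the effect observed in the first study and then ask how often a smaller follow-up would cross the one-tailed .05 threshold. The sketch below makes that assumption explicit; the function name is hypothetical and the normal approximation is a simplification, so it is offered as a plausible reconstruction rather than as the calculation Tversky and Kahneman actually performed.

```python
from math import sqrt
from statistics import NormalDist


def replication_power(z_original, n_original, n_replication, alpha=0.05):
    """Chance that a replication is significant (one-tailed), assuming the
    true effect equals the effect observed in the original study."""
    std_normal = NormalDist()
    # The expected z shrinks with the square root of the sample-size ratio.
    expected_z = z_original * sqrt(n_replication / n_original)
    critical_z = std_normal.inv_cdf(1 - alpha)
    return 1 - std_normal.cdf(critical_z - expected_z)


prob = replication_power(2.23, 20, 10)
print(round(prob, 2))
```

Under these assumptions the result falls just below one half, in line with the approximate answer quoted above and far from the median intuition of .85.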
As Rosenthal (1990) has warned: "Given the levels of
statistical power at which we normally operate, we have
no right to expect the proportion of significant results that
we typically do expect, even if in nature there is a very
real and very important effect" (p. 16). In this regard, it is
again instructive to consider the medical study that found
a highly significant effect of aspirin on the incidence of
heart attacks. The study monitored more than 22,000
subjects. Had the investigators monitored 3,000 subjects,
they would have had less than an even chance of finding a
conventionally significant effect. Such is life with small ef-
fect sizes.
Given its larger effect size, the prospects for successfully
replicating the psi ganzfeld effect are not quite so
daunting, but they are probably still grimmer than
intuition would suggest. If the true hit rate is in fact about
34% when 25% is expected by chance, then an experiment
with 30 trials (the mean for the 28 studies in the original
meta-analysis) has only about 1 chance in 6 of finding an
effect significant at the .05 level with a one-tailed test. A
50-trial experiment boosts that chance to about 1 in 3.
One must escalate to 100 trials in order to come close to
the break-even point, at which one has a 50-50 chance of
finding a statistically significant effect (Utts, 1986).
(Recall that only 2 of the 11 autoganzfeld studies yielded
results that were individually significant at the conven-
tional .05 level.) Those who require that a psi effect be
statistically significant every time before they will seri-
ously entertain the possibility that an effect really exists
know not what they ask.
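The power figures in this passage can be explored with an exact binomial calculation. The sketch below is an assumption-laden illustration, not the method of the cited source: the function names are hypothetical, and it uses an exact one-tailed binomial test, whereas the quoted figures may rest on a different approximation. The resulting values should therefore be read as being in the same general range as those in the text, not as a reproduction of them.

```python
from math import comb


def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_hits(n, p_chance=0.25, alpha=0.05):
    """Smallest number of hits whose one-tailed p value is <= alpha."""
    for k in range(n + 2):
        if binom_sf(k, n, p_chance) <= alpha:
            return k


def power(n, p_true=0.34, p_chance=0.25, alpha=0.05):
    """Probability of a significant result when the true hit rate is p_true."""
    return binom_sf(critical_hits(n, p_chance, alpha), n, p_true)


for trials in (30, 50, 100):
    print(trials, critical_hits(trials), round(power(trials), 2))
```

Because the number of hits is discrete, the actual significance level of an exact test is usually somewhat below .05, which is one reason exact power can differ from approximate figures.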
Significance Versus Effect Size
The preceding discussion is unduly pessimistic, how-
ever, because it perpetuates the tradition of worshipping
the significance level. Regular readers of this journal are
likely to be familiar with recent arguments imploring be-
havioral scientists to overcome their slavish dependence
on the significance level as the ultimate measure of virtue
and instead to focus more of their attention on effect sizes:
"Surely, God loves the .06 nearly as much as the .05"
(Rosnow & Rosenthal, 1989, p. 1277). Accordingly, we
suggest that achieving a respectable effect size with a
methodologically tight ganzfeld study would be a perfectly
welcome contribution to the replication effort, no matter
how untenurable the p level renders the investigator.
Career consequences aside, this suggestion may seem
quite counterintuitive. Again, Tversky and Kahneman
(1971) have provided an elegant demonstration. They
asked several of their colleagues to consider an investiga-
tor who runs 15 subjects and obtains a significant t value
of 2.46. Another investigator attempts to duplicate the
procedure with the same number of subjects and obtains a
result in the same direction but with a nonsignificant
value of t. Tversky and Kahneman then asked their col-
leagues to indicate the highest level of t in the replication
study they would describe as a failure to replicate. The
majority of their colleagues regarded t = 1.70 as a failure
to replicate. But if the data from two such studies (t = 2.46
and t = 1.70) were pooled, the t for the combined data
would be about 3.00 (assuming equal variances):
Thus, we are faced with a paradoxical state of affairs, in
which the same data that would increase our confidence in
the finding when viewed as part of the original study, shake
our confidence when viewed as an independent study.
(Tversky & Kahneman, 1971, p. 108)
Such is the iron grip of the arbitrary .05. Pooling the
data, of course, is what meta-analysis is all about. Ac-
cordingly, we suggest that two or more laboratories could
collaborate in a ganzfeld replication effort by conducting
independent studies and then pooling them in meta-ana-
lytic fashion, what one might call real-time meta-analy-
sis. (Each investigator could then claim the pooled p
level for his or her own curriculum vitae.)
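The pooled t of about 3.00 for the two 15-subject studies can be approximated with a short calculation. The sketch below assumes one-sample t tests, equal sample sizes, and a common standard deviation that is unchanged by pooling; the function name is hypothetical, and the result is a rough check on the quoted figure rather than an exact re-analysis.

```python
from math import sqrt


def pooled_t(t_values, n_per_study):
    """t for the combined sample, assuming equal n and a common SD."""
    k = len(t_values)
    # Each t_i = mean_i * sqrt(n) / sd, so mean_i / sd = t_i / sqrt(n).
    pooled_standardized_mean = sum(t / sqrt(n_per_study) for t in t_values) / k
    # The combined sample has k * n observations.
    return pooled_standardized_mean * sqrt(k * n_per_study)


combined = pooled_t([2.46, 1.70], 15)
print(round(combined, 2))
```

With equal sample sizes this reduces to the sum of the t values divided by the square root of the number of studies, which is why a "failed" replication in the same direction still strengthens the combined result.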
Maximizing Effect Size
Rather than buying or borrowing larger sample sizes,
those who seek to replicate the psi ganzfeld effect might
find it more intellectually satisfying to attempt to maxi-
mize the effect size by attending to the variables associ-
ated with successful outcomes. Thus researchers who wish
to enhance the chances of successful replication should
use dynamic rather than static targets. Similarly we ad-
vise using participants with the characteristics we have
reported to be correlated with successful psi performance.
Random college sophomores enrolled in introductory psy-
chology do not constitute the optimal subject pool.
Finally, we urge ganzfeld researchers to read carefully
the detailed description of the warm social ambiance that
Honorton et al. (1990) sought to create in the autoganzfeld
laboratory. We believe that the social climate created in
psi experiments is a critical determinant of their success
or failure.
The Problem of "Other" Variables
This caveat about the social climate of the ganzfeld ex-
periment prompted one reviewer of this article to worry
that this provided "an escape clause" that weakens the
falsifiability of the psi hypothesis: "Until Bem and Hon-
orton can provide operational criteria for creating a
warm social ambiance, the failure of an experiment with
otherwise adequate power can always be dismissed as
due to a lack of warmth."
Alas, it is true; we devoutly wish it were otherwise.
But the operation of unknown variables in moderating
the success of replications is a fact of life in all of the sci-
ences. Consider, for example, an earlier article in this
journal by Spence (1964). He reviewed studies testing
the straightforward derivation from Hullian learning
theory that high-anxiety subjects should condition more
strongly than low-anxiety subjects. This hypothesis was
confirmed 94% of the time in Spence's own laboratory at
the University of Iowa but only 63% of the time in labo-
ratories at other universities. In fact, Kimble and his as-
sociates at Duke University and the University of North
Carolina obtained results in the opposite direction in two
of three experiments.
In searching for a post hoc explanation, Spence (1964)
noted that "a deliberate attempt was made in the Iowa
studies to provide conditions in the laboratory that might
elicit some degree of emotionality. Thus, the experi-
menter was instructed to be impersonal and quite formal
... and did not try to put [subjects] at ease or allay any
expressed fears" (pp. 135-136). Moreover, he pointed out,
his subjects sat in a dental chair whereas Kimble's
subjects sat in a secretarial chair. Spence even considered
"the possibility that cultural backgrounds of southern
and northern students may lead to a difference in the
manner in which they respond to the different items in
the [Manifest Anxiety] scale" (p. 136). If this was the
state of affairs in an area of research as well established
as classical conditioning, then the suggestion that the so-
cial climate of the psi laboratory might affect the out-
come of ganzfeld experiments in ways not yet completely
understood should not be dismissed as a devious attempt
to provide an escape clause in case of replication failure.
The best the original researchers can do is to communicate
as complete a knowledge of the experimental conditions
as possible in an attempt to anticipate some of the
relevant moderating variables. Ideally, this might include
direct training by the original researchers or videotapes of
actual sessions. Lacking these, however, the detailed de-
scription of the autoganzfeld procedures provided by Hon-
orton et al. (1990) comes as close as current knowledge
permits in providing for other researchers the
"operational criteria for creating a warm social ambiance."
Theoretical Considerations
Up to this point, we have confined our discussion to
strictly empirical matters. We are sympathetic to the view
that one should establish the existence of a phenomenon,
anomalous or not, before attempting to explain it. So sup-
pose for the moment that we have a genuine anomaly of
information transfer here. How can it be understood or
explained?
The Psychology of Psi
In attempting to understand psi, parapsychologists
have typically begun with the working assumption that,
whatever its underlying mechanisms, it should behave
like other, more familiar psychological phenomena. In
particular, they typically assume that target information
behaves like an external sensory stimulus that is encoded,
processed, and experienced in familiar information-processing
ways. Similarly, individual psi performances
should covary with experimental and subject variables in
psychologically sensible ways. These assumptions are em-
bodied in the model of psi that motivated the ganzfeld
studies in the first place.
The ganzfeld procedure. As noted in the introduction,
the ganzfeld procedure was designed to test a model in
which psi-mediated information is conceptualized as a
weak signal that is normally masked by internal somatic
and external sensory "noise." Accordingly, any technique
that raises the signal-to-noise ratio should enhance a
person's ability to detect psi-mediated information. This
noise-reduction model of psi organizes a large and diverse
body of experimental results, particularly those demon-
strating the psi-conducive properties of altered states of
consciousness such as meditation, hypnosis, dreaming,
and, of course, the ganzfeld itself (Rao & Palmer, 1987).
Alternative theories propose that the ganzfeld (and al-
tered states) may be psi-conducive because it lowers resis-
tance to accepting alien imagery, diminishes rational or
contextual constraints on the encoding or reporting of in-
formation, stimulates more divergent thinking, or even
just serves as a placebolike ritual that participants per-
ceive as being psi conducive (Stanford, 1987). At this
point, there are no data that would permit one to choose
among these alternatives, and the noise-reduction model
remains the most widely accepted.
The target. There are also a number of plausible
hypotheses that attempt to account for the superiority of
dynamic targets over static targets. Dynamic targets contain
more information, involve more sensory modalities, evoke
more of the receiver's internal schemata, are more lifelike,
have a narrative structure, are more emotionally
evocative, and are "richer" in other, unspecified ways. Several
psi researchers have attempted to go beyond the simple
dynamic-static dichotomy to more refined or theory-based
definitions of a good target. Although these efforts have
involved examining both psychological and physical
properties of targets, there is as yet not much progress to
report (Delanoy, 1990).
The receiver. Some of the subject characteristics asso-
ciated with good psi performance also appear to have psy-
chologically straightforward explanations. For example,
garden-variety motivational explanations seem sufficient
to account for the relatively consistent finding that those
who believe in psi perform significantly better than those
who do not. (Less straightforward, however, would be an
explanation for the frequent finding that nonbelievers
actually perform significantly worse than chance
[Broughton, 1991, p. 109].)
The superior psi performance of creative or artistically
gifted individuals, like the Juilliard students, may
reflect individual differences that parallel some of the
hypothesized effects of the ganzfeld mentioned earlier.
Artistically gifted individuals may be more receptive to alien
imagery, be better able to transcend rational or contextual
constraints on the encoding or reporting of information, or
be more divergent in their thinking. It has also been sug-
gested that both artistic and psi abilities might be rooted
in superior right-brain functioning.
The observed relationship between extraversion and
psi performance has been of theoretical interest for many
years. Eysenck (1966) reasoned that extraverts should
perform well in psi tasks because they are easily bored
and respond favorably to novel stimuli. In a setting such
as the ganzfeld, extraverts may become "stimulus
starved" and thus be highly sensitive to any stimulation,
including weak incoming psi information. In contrast, in-
troverts would be more inclined to entertain themselves
with their own thoughts and thus continue to mask psi in-
formation despite the diminished sensory input. Eysenck
also speculated that psi might be a primitive form of per-
ception antedating cortical developments in the course of
evolution, and, hence, cortical arousal might suppress psi
functioning. Because extraverts have a lower level of cor-
tical arousal than introverts, they should perform better
in psi tasks (the evolutionary biology of psi has also been
discussed by Broughton, 1991, pp. 347-352).
But there are more mundane possibilities. Extraverts
might perform better than introverts simply because they
are more relaxed and comfortable in the social setting of
the typical psi experiment (e.g., the "warm social
ambiance" of the autoganzfeld studies). This interpretation is
strengthened by the observation that introverts
outperformed extraverts in a study in which subjects had no
contact with an experimenter but worked alone at home with
materials they received in the mail (Schmidt & Schlitz,
1989). To help decide among these interpretations,
ganzfeld experimenters have begun to use the extraversion
scale of the NEO Personality Inventory (Costa & McCrae,
1992), which assesses six different facets of the
extraversion-introversion factor.
The sender. In contrast to this information about the re-
ceiver in psi experiments, virtually nothing is known
about the characteristics of a good sender or about the ef-
fects of the sender's relationship with the receiver. As has
been shown, the initial suggestion from the meta-analysis
of the original ganzfeld database that psi performance
might be enhanced when the sender and receiver are
friends was not replicated at a statistically significant
level in the autoganzfeld studies.
A number of parapsychologists have entertained the
more radical hypothesis that the sender may not even be a
necessary element in the psi process. In the terminology of
parapsychology, the sender-receiver procedure tests for
the existence of telepathy, anomalous communication
between two individuals; however, if the receiver is somehow
picking up the information from the target itself, it would
be termed clairvoyance, and the presence of the sender
would be irrelevant (except for possible psychological rea-
sons such as expectation effects).
At the time of his death, Honorton was planning a se-
ries of autoganzfeld studies that would systematically
compare sender and no-sender conditions while keeping
both the receiver and the experimenter blind to the condi-
tion of the ongoing session. In preparation, he conducted a
meta-analytic review of ganzfeld studies that used no
sender. He found 12 studies with a median of 33.5 ses-
sions, conducted by seven investigators. The overall effect
size (π) was .56, which corresponds to a four-alternative
hit rate of 29%. But this effect size does not reach statisti-
cal significance (Stouffer z = 1.31, p = .095). So far, then,
there is no firm evidence for psi in the ganzfeld in the
absence of a sender. (There are, however, nonganzfeld
studies in the literature that do report significant evidence for
clairvoyance, including a classic card-guessing experiment
conducted by J. B. Rhine and Pratt [1954].)
The Physics of Psi
The psychological level of theorizing discussed earlier
does not, of course, address the conundrum that makes psi
phenomena anomalous in the first place: their presumed
incompatibility with our current conceptual model of
physical reality. Parapsychologists differ widely from one
another in their taste for theorizing at this level, but sev-
eral whose training lies in physics or engineering have
proposed physical (or biophysical) theories of psi phenom-
ena (an extensive review of theoretical parapsychology
was provided by Stokes, 1987). Only some of these theo-
ries would force a radical revision in our conception of
physical reality.
Those who follow contemporary debates in modern
physics, however, will be aware that several phenomena
predicted by quantum theory and confirmed by experi-
ment are themselves incompatible with our current con-
ceptual model of physical reality. Of these, it is the 1982
empirical confirmation of Bell's theorem that has created
the most excitement and controversy among philosophers
and the few physicists who are willing to speculate on
such matters (Cushing & McMullin, 1989; Herbert, 1987).
In brief, Bell's theorem states that any model of reality
that is compatible with quantum mechanics must be non-
local: It must allow for the possibility that the results of
observations at two arbitrarily distant locations can be
correlated in ways that are incompatible with any physi-
cally permissible causal mechanism.
Several possible models of reality that incorporate non-
locality have been proposed by both philosophers and
physicists. Some of these models clearly rule out psi-like
information transfer, others permit it, and some actually
require it. Thus, at a grander level of theorizing, some
parapsychologists believe that one of the more radical
models of reality compatible with both quantum mechan-
ics and psi will eventually come to be accepted. If and
when that occurs, psi phenomena would cease to be
anomalous.
But we have learned that all such talk provokes most of
our colleagues in psychology and in physics to roll their
eyes and gnash their teeth. So let's just leave it at that.
Skepticism Revisited
More generally, we have learned that our colleagues'
tolerance for any kind of theorizing about psi is strongly
determined by the degree to which they have been con-
vinced by the data that psi has been demonstrated. We
have further learned that their diverse reactions to the
data themselves are strongly determined by their a priori
beliefs about and attitudes toward a number of quite gen-
eral issues, some scientific, some not. In fact, several
statisticians believe that the traditional hypothesis test-
ing methods used in the behavioral sciences should be
abandoned in favor of Bayesian analyses, which take into
account a person's a priori beliefs about the phenomenon
under investigation (e.g., Bayarri & Berger, 1991;
Dawson, 1991).
In the final analysis, however, we suspect that both
one's Bayesian a prioris and one's reactions to the data
are ultimately determined by whether one was more
severely punished in childhood for Type I or Type II er-
rors.
References
Atkinson, R. L., Atkinson, R. C., Smith, E. E., & Bem, D. J.
(1990). Introduction to psychology (10th ed.). San Diego,
CA: Harcourt Brace Jovanovich.
Atkinson, R. L., Atkinson, R. C., Smith, E. E., & Bem, D. J.
(1993). Introduction to psychology (11th ed.). San Diego,
CA: Harcourt Brace Jovanovich.
Avant, L. L. (1965). Vision in the ganzfeld. Psychological
Bulletin, 64, 246-258.
Bayarri, M. J., & Berger, J. (1991). Comment. Statistical
Science, 6, 379-382.
Blackmore, S. (1980). The extent of selective reporting of
ESP ganzfeld studies. European Journal of
Parapsychology, 3, 213-219.
Bozarth, J. D., & Roberts, R. R. (1972). Signifying
significant significance. American Psychologist, 27, 774-775.
Braud, W. G., Wood, R., & Braud, L. W. (1975).
Free-response GESP performance during an experimental
hypnagogic state induced by visual and acoustic
ganzfeld techniques: A replication and extension.
Journal of the American Society for Psychical Research, 69,
105-113.
Broughton, R. S. (1991). Parapsychology: The
controversial science. New York: Ballantine Books.
Broughton, R. S., Kanthamani, H., & Khilji, A. (1990). As-
sessing the PRL success model on an independent
ganzfeld data base. In L. Henkel & J. Palmer (Eds.), Re-
search in parapsychology 1989 (pp. 32-35). Metuchen,
NJ: Scarecrow Press.
Child, I. L. (1985). Psychology and anomalous
observations: The question of ESP in dreams. American
Psychologist, 40, 1219-1230.
Cohen, J. (1988). Statistical power analysis for the
behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
Cohen, J. (1992). Statistical power analysis. Current
Directions in Psychological Science, 1, 98-101.
Costa, P. T., Jr., & McCrae, R. R. (1992). Revised NEO
Personality Inventory (NEO-PI-R) and NEO Five-Factor
Inventory (NEO-FFI) manual. Odessa, FL: Psychological
Assessment Resources.
Cushing, J. T., & McMullin, E. (Eds.). (1989). Philosophical
consequences of quantum theory: Reflections on Bell's
theorem. Notre Dame, IN: University of Notre Dame
Press.
Dawson, R. (1991). Comment. Statistical Science, 6,
382-385.
Delanoy, D. L. (1990). Approaches to the target: A time for
reevaluation. In L. A. Henkel & J. Palmer (Eds.),
Research in parapsychology 1989 (pp. 89-92). Metuchen,
NJ: Scarecrow Press.
Dingwall, E. J. (Ed.). (1968). Abnormal hypnotic
phenomena (4 vols.). London: Churchill.
Druckman, D., & Swets, J. A. (Eds.). (1988). Enhancing
human performance: Issues, theories, and techniques.
Washington, DC: National Academy Press.
Eysenck, H. J. (1966). Personality and extra-sensory
perception. Journal of the Society for Psychical Research,
44, 55-71.
Gilovich, T. (1991). How we know what isn't so: The
fallibility of human reason in everyday life. New York: Free
Press.
Green, C. E. (1960). Analysis of spontaneous cases.
Proceedings of the Society for Psychical Research, 53,
97-161.
Harris, M. J., & Rosenthal, R. (1988a). Human
performance research: An overview. Washington, DC: National
Academy Press.
Harris, M. J., & Rosenthal, R. (1988b). Postscript to
"Human performance research: An overview."
Washington, DC: National Academy Press.
Herbert, N. (1987). Quantum reality: Beyond the new
physics. Garden City, NY: Anchor Books.
Honorton, C. (1969). Relationship between EEG alpha
activity and ESP card-guessing performance. Journal of
the American Society for Psychical Research, 63,
365-374.
Honorton, C. (1977). Psi and internal attention states. In
B. B. Wolman (Ed.), Handbook of parapsychology (pp.
435-472). New York: Van Nostrand Reinhold.
Honorton, C. (1979). Methodological issues in
free-response experiments. Journal of the American Society for
Psychical Research, 73, 381-394.
Honorton, C. (1985). Meta-analysis of psi ganzfeld
research: A response to Hyman. Journal of
Parapsychology, 49, 51-91.
Honorton, C. (1992). The ganzfeld novice: Four predictors
of initial ESP performance. Proceedings of the
Parapsychological Association 35th Annual Convention, Las
Vegas, NV, 51-58.
Honorton, C., Berger, R. E., Varvoglis, M. P., Quant, M.,
Derr, P., Schechter, E. I., & Ferrari, D. C. (1990). Psi
communication in the ganzfeld: Experiments with an
automated testing system and a comparison with a
meta-analysis of earlier studies. Journal of
Parapsychology, 54, 99-139.
Honorton, C., Ferrari, D. C., & Bem, D. J. (1992).
Extraversion and ESP performance: Meta-analysis and a
new confirmation. In L. A. Henkel & G. R. Schmeidler
(Eds.), Research in parapsychology 1990 (pp. 35-38).
Metuchen, NJ: Scarecrow Press.
Honorton, C., & Harper, S. (1974). Psi-mediated imagery
and ideation in an experimental procedure for
regulating perceptual input. Journal of the American Society
for Psychical Research, 68, 156-168.
Honorton, C., & Schechter, E. I. (1987). Ganzfeld target
retrieval with an automated testing system: A model for
initial ganzfeld success. In D. B. Weiner & R. D. Nelson
(Eds.), Research in parapsychology 1986 (pp. 36-39).
Metuchen, NJ: Scarecrow Press.
Hyman, R. (1985). The ganzfeld psi experiment: A critical
appraisal. Journal of Parapsychology, 49, 3-49.
Hyman, R. (1991). Comment. Statistical Science, 6,
389-392.
Hyman, R., & Honorton, C. (1986). A joint communique:
The psi ganzfeld controversy. Journal of
Parapsychology, 50, 351-364.
Kennedy, J. E. (1979). Methodological problems in
free-response ESP experiments. Journal of the American
Society for Psychical Research, 73, 1-15.
Metzger, W. (1930). Optische Untersuchungen am
Ganzfeld: II. Zur Phänomenologie des homogenen
Ganzfelds [Optical investigation of the Ganzfeld: II.
Toward the phenomenology of the homogeneous
Ganzfeld]. Psychologische Forschung, 13, 6-29.
Morris, R. L. (1991). Comment. Statistical Science, 6,
393-395.
Myers, I. B., & McCaulley, M. H. (1985). Manual: A guide
to the development and use of the Myers-Briggs Type
Indicator. Palo Alto, CA: Consulting Psychologists
Press.
Nisbett, R. E., & Ross, L. (1980). Human inference:
Strategies and shortcomings of social judgment.
Englewood Cliffs, NJ: Prentice-Hall.
Palmer, J. (1978). Extrasensory perception: Research
findings. In S. Krippner (Ed.), Advances in
parapsychological research (Vol. 2, pp. 59-243). New York: Plenum.
Palmer, J. A., Honorton, C., & Utts, J. (1989). Reply to the
National Research Council study on parapsychology.
Journal of the American Society for Psychical Research,
83, 31-49.
Parker, A. (1975). Some findings relevant to the change in
state hypothesis. In J. D. Morris, W. G. Roll, & R. L.
Morris (Eds.), Research in parapsychology, 1974 (pp.
40-42). Metuchen, NJ: Scarecrow Press.
Parker, A. (1978). A holistic methodology in psi research.
Parapsychology Review, 9, 1-6.
Prasad, J., & Stevenson, I. (1968). A survey of
spontaneous psychical experiences in school children of Uttar
Pradesh, India. International Journal of
Parapsychology, 10, 241-261.
Rao, K. R., & Palmer, J. (1987). The anomaly called psi:
Recent research and criticism. Behavioral and Brain
Sciences, 10, 539-551.
Rhine, J. B., & Pratt, J. G. (1954). A review of the
Pearce-Pratt distance series of ESP tests. Journal of
Parapsychology, 18, 165-177.
Rhine, L. E. (1962). Psychological processes in ESP
experiences. I. Waking experiences. Journal of
Parapsychology, 26, 88-111.
Roig, M., Icochea, H., & Cuzzucoli, A. (1991). Coverage of
parapsychology in introductory psychology textbooks.
Teaching of Psychology, 18, 157-160.
Rosenthal, R. (1978). Combining results of independent
studies. Psychological Bulletin, 85, 185-193.
Rosenthal, R. (1979). The "file drawer problem" and
tolerance for null results. Psychological Bulletin, 86,
638-641.
Rosenthal, R. (1990). Replication in behavioral research.
Journal of Social Behavior and Personality, 5, 1-30.
Rosenthal, R. (1991). Meta-analytic procedures for social
research (Rev. ed.). Newbury Park, CA: Sage.
Rosenthal, R., & Rubin, D. B. (1989). Effect size
estimation for one-sample multiple-choice-type data: Design,
analysis, and meta-analysis. Psychological Bulletin,
106, 332-337.
Rosnow, R. L., & Rosenthal, R. (1989). Statistical
procedures and the justification of knowledge in
psychological science. American Psychologist, 44, 1276-1284.
Sannwald, G. (1959). Statistische Untersuchungen an
Spontanphänomenen [Statistical investigation of
spontaneous phenomena]. Zeitschrift für Parapsychologie und
Grenzgebiete der Psychologie, 3, 59-71.
Saunders, D. R. (1985). On Hyman's factor analyses.
Journal of Parapsychology, 49, 86-88.
Schechter, E. I. (1984). Hypnotic induction vs. control
conditions: Illustrating an approach to the evaluation of
replicability in parapsychology. Journal of the American
Society for Psychical Research, 78, 1-27.
Schlitz, M. J., & Honorton, C. (1992). Ganzfeld psi per-
formance within an artistically gifted population. Jour-
nal of the American Society for Psychical Research, 86,
83-98.
Schmeidler, G. R. (1988). Parapsychology and psychology:
Matches and mismatches. Jefferson, NC: McFarland.
Schmidt, H., & Schlitz, M. J. (1989). A large scale pilot PK
experiment with prerecorded random events. In L. A.
Henkel & R. E. Berger (Eds.), Research in Parapsychol-
ogy 1988 (pp. 6-10). Metuchen, NJ: Scarecrow Press.
Spence, K. W. (1964). Anxiety (drive) level and perfor-
mance in eyelid conditioning. Psychological Bulletin, 61,
129-139.
Stanford, R. G. (1987). Ganzfeld and hypnotic-induction
procedures in ESP research: Toward understanding
their success. In S. Krippner (Ed.), Advances in para-
psychological research (Vol. 5, pp. 39-76). Jefferson, NC:
McFarland.
Steering Committee of the Physicians' Health Study Re-
search Group. (1988). Preliminary report: Findings from
the aspirin component of the ongoing Physicians'
Health Study. New England Journal of Medicine, 318,
262-264.
Sterling, T. C. (1959). Publication decisions and their
possible effects on inferences drawn from tests of
significance, or vice versa. Journal of the American
Statistical Association, 54, 30-34.
Stokes, D. M. (1987). Theoretical parapsychology. In S.
Krippner (Ed.), Advances in parapsychological research
(Vol. 5, pp. 77-189). Jefferson, NC: McFarland.
Swets, J. A., & Bjork, R. A. (1990). Enhancing human
performance: An evaluation of "new age" techniques
considered by the U. S. Army. Psychological Science, 1,
85-96.
Tversky, A., & Kahneman, D. (1971). Belief in the law of
small numbers. Psychological Bulletin, 76, 105-110.
Ullman, M., Krippner, S., & Vaughan, A. (1973). Dream
telepathy. New York: Macmillan.
Utts, J. (1986). The ganzfeld debate: A statistician's per-
spective. Journal of Parapsychology, 50, 393-402.
Utts, J. (1991a). Rejoinder. Statistical Science, 6, 396-403.
Utts, J. (1991b). Replication and meta-analysis in
parapsychology. Statistical Science, 6, 363-378.
Wagner, M. W., & Monnet, M. (1979). Attitudes of college
professors toward extra-sensory perception. Zetetic
Scholar, 5, 7-17.
Received September 28, 1992
Revision received March 10, 1993
Accepted March 14, 1993
Statistical Science
1991, Vol. 6, No. 4, 363-403
Replication and Meta-Analysis in
Parapsychology
Jessica Utts
Abstract. Parapsychology, the laboratory study of psychic phenomena,
has had its history interwoven with that of statistics. Many of the
controversies in parapsychology have focused on statistical issues, and
statistical models have played an integral role in the experimental
work. Recently, parapsychologists have been using meta-analysis as a
tool for synthesizing large bodies of work. This paper presents an
overview of the use of statistics in parapsychology and offers a summary
of the meta-analyses that have been conducted. It begins with some
anecdotal information about the involvement of statistics and
statisticians with the early history of parapsychology. Next, it is argued that
most nonstatisticians do not appreciate the connection between power
and "successful" replication of experimental effects. Returning to
parapsychology, a particular experimental regime is examined by
summarizing an extended debate over the interpretation of the results. A new set
of experiments designed to resolve the debate is then reviewed. Finally,
meta-analyses from several areas of parapsychology are summarized. It
is concluded that the overall evidence indicates that there is an
anomalous effect in need of an explanation.
Key words and phrases: Effect size, psychic research, statistical
controversies, randomness, vote-counting.
1. INTRODUCTION
In a June 1990 Gallup Poll, 49% of the 1236
respondents claimed to believe in extrasensory per-
ception (ESP), and one in four claimed to have had
a personal experience involving telepathy (Gallup
and Newport, 1991). Other surveys have shown
even higher percentages; the University of
Chicago's National Opinion Research Center re-
cently surveyed 1473 adults, of which 67% claimed
that they had experienced ESP (Greeley, 1987).
Public opinion is a poor arbiter of science, however,
and experience is a poor substitute for the
scientific method. For more than a century, small
numbers of scientists have been conducting laboratory
experiments to study phenomena such as
telepathy, clairvoyance and precognition, collec-
tively known as "psi" abilities. This paper will
examine some of that work, as well as some of the
statistical controversies it has generated.
Jessica Utts is Associate Professor, Division of
Statistics, University of California at Davis, 469
Kerr Hall, Davis, California 95616.
Parapsychology, as this field is called, has been a
source of controversy throughout its history. Strong
beliefs tend to be resistant to change even in the
face of data, and many people, scientists included,
seem to have made up their minds on the question
without examining any empirical data at all. A
critic of parapsychology recently acknowledged that
"The level of the debate during the past 130 years
has been an embarrassment for anyone who would
like to believe that scholars and scientists adhere
to standards of rationality and fair play" (Hyman,
1985a, page 89). While much of the controversy has
focused on poor experimental design and potential
fraud, there have been attacks and defenses of the
statistical methods as well, sometimes calling into
question the very foundations of probability and
statistical inference.
Most of the criticisms have been leveled by psy-
chologists. For example, a 1988 report of the U.S.
National Academy of Sciences concluded that "The
committee finds no scientific justification from
research conducted over a period of 130 years for
the existence of parapsychological phenomena"
(Druckman and Swets, 1988, page 22). The chapter
on parapsychology was written by a subcommittee
chaired by a psychologist who had published a
similar conclusion prior to his appointment to the
committee (Hyman, 1985a, page 7). There were no
parapsychologists involved with the writing of the
report. Resulting accusations of bias (Palmer, Honorton
and Utts, 1989) led U.S. Senator Claiborne
Pell to request that the Congressional Office of
Technology Assessment (OTA) conduct an investigation
with a more balanced group. A one-day
workshop was held on September 30, 1988, bringing
together parapsychologists, critics and experts
in some related fields (including the author of this
paper). The report concluded that parapsychology
needs "a fairer hearing across a broader spectrum
of the scientific community, so that emotionality
does not impede objective assessment of experimental
results" (Office of Technology Assessment,
1989).
It is in the spirit of the OTA report that this
article is written. After Section 2, which offers an
anecdotal account of the role of statisticians and
statistics in parapsychology, the discussion turns to
the more general question of replication of experimental
results. Section 3 illustrates how replication
has been (mis)interpreted by scientists in many
fields. Returning to parapsychology in Section 4, a
particular experimental regime called the "ganzfeld"
is described, and an extended debate about
the interpretation of the experimental results is
discussed. Section 5 examines a meta-analysis of
recent ganzfeld experiments designed to resolve the
debate. Finally, Section 6 contains a brief account
of meta-analyses that have been conducted in other
areas of parapsychology, and conclusions are given
in Section 7.
2. STATISTICS AND PARAPSYCHOLOGY
Parapsychology had its beginnings in the investigation
of purported mediums and other anecdotal
claims in the late 19th century. The Society for
Psychical Research was founded in Britain in 1882,
and its American counterpart was founded in
Boston in 1884. While these organizations and their
members were primarily involved with investigating
anecdotal material, a few of the early researchers
were already conducting "forced-choice"
experiments such as card-guessing. (Forced-choice
experiments are like multiple choice tests; on each
trial the subject must guess from a small, known
set of possibilities.) Notable among these was
Nobel Laureate Charles Richet, who is generally
credited with being the first to recognize that probability
theory could be applied to card-guessing
experiments (Rhine, 1977, page 26; Richet, 1884).
F. Y. Edgeworth, partly in response to what he
considered to be incorrect analyses of these experiments,
offered one of the earliest treatises on the
statistical evaluation of forced-choice experiments
in two articles published in the Proceedings of the
Society for Psychical Research (Edgeworth, 1885,
1886). Unfortunately, as noted by Mauskopf and
McVaugh (1979) in their historical account of the
period, Edgeworth's papers were "perhaps too difficult
for their immediate audience" (page 105).
Edgeworth began his analysis by using Bayes'
theorem to derive the formula for the posterior
probability that chance was operating, given the
data. He then continued with an argument
"savouring more of Bernoulli than Bayes" in which
"it is consonant, I submit, to experience, to put 1/2
both for α and β," that is, for both the prior probability
that chance alone was operating, and the
prior probability that "there should have been some
additional agency." He then reasoned (using a
Taylor series expansion of the posterior probability
formula) that if there were a large probability
of observing the data given that some
additional agency was at work, and a small objective
probability of the data under chance, then the
latter (binomial) probability "may be taken as a
rough measure of the sought a posteriori probability
in favour of mere chance" (page 195). Edgeworth
concluded his article by applying his method
to some data published previously in the same
journal. He found the probability against chance to
be 0.99996, which he said "may fairly be regarded
as physical certainty" (page 199). He concluded:
Such is the evidence which the calculus of
probabilities affords as to the existence of an
agency other than mere chance. The calculus is
silent as to the nature of that agency-whether
it is more likely to be vulgar illusion or
extraordinary law. That is a question to be
decided, not by formulae and figures, but by
general philosophy and common sense [page
199].
Both the statistical arguments and the experimental
controls in these early experiments were
somewhat loose. For example, Edgeworth treated
as binomial an experiment in which one person
chose a string of eight letters and another attempted
to guess the string. Since it has long been
understood that people are poor random number (or
letter) generators, there is no statistical basis for
analyzing such an experiment. Nonetheless, Edgeworth
and his contemporaries set the stage for the
use of controlled experiments with statistical evaluation
in laboratory parapsychology. An interesting
historical account of Edgeworth's involvement and
the role telepathy experiments played in the early
history of randomization and experimental design
is provided by Hacking (1988).
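Edgeworth's approximation, as summarized earlier in this section, can be written out in a few lines. The following is a sketch only: α and β denote the two prior probabilities (chance alone, and some additional agency), while the symbols p and q for the probability of the data under chance and under the additional agency are notation assumed here for illustration, not Edgeworth's own. The `\overset` command assumes the amsmath package.

```latex
\[
P(\text{chance}\mid\text{data})
  = \frac{\alpha\,p}{\alpha\,p+\beta\,q}
  \;\overset{\alpha=\beta=1/2}{=}\; \frac{p}{p+q},
\qquad
\frac{p}{p+q}
  = \frac{p}{q}\Bigl(1-\frac{p}{q}+\cdots\Bigr)
  \approx p
  \quad\text{when } q\approx 1 \text{ and } p\ll 1 .
\]
```

Under these assumptions the posterior probability of chance is roughly the binomial tail probability itself, so a reported probability against chance of 0.99996 corresponds to a tail probability of about 0.00004.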
One of the first American researchers to
use statistical methods in parapsychology was
John Edgar Coover, who was the Thomas Welton
Stanford Psychical Research Fellow in the Psychol-
ogy Department at Stanford University from 1912
to 1937 (Dommeyer, 1975). In 1917, Coover pub-
lished a large volume summarizing his work
(Coover, 1917). Coover believed that his results
were consistent with chance, but others have ar-
gued that Coover's definition of significance was
too strict (Dommeyer, 1975). For example, in one
evaluation of his telepathy experiments, Coover
found a two-tailed p-value of 0.0062. He concluded,
"Since this value, then, lies within the field of
chance deviation, although the probability of its
occurrence by chance is fairly low, it cannot be
accepted as a decisive indication of some cause
beyond chance which operated in favor of success in
guessing" (Coover, 1917, page 82). On the next
page, he made it explicit that he would require a
p-value of 0.0000221 to declare that something
other than chance was operating.
It was during the summer of 1930, with the
card-guessing experiments of J. B. Rhine at Duke
University, that parapsychology began to take hold
as a laboratory science. Rhine's laboratory still
exists under the name of the Foundation for Re-
search on the Nature of Man, housed at the edge of
the Duke University campus.
It wasn't long after Rhine published his first
book, Extrasensory Perception in 1934, that the
attacks on his methodology began. Since his claims
were wholly based on statistical analyses of his
experiments, the statistical methods were closely
scrutinized by critics anxious to find a conventional
explanation for Rhine's positive results.
The most persistent critic was a psychologist
from McGill University named Chester Kellogg
(Mauskopf and McVaugh, 1979). Kellogg's main
argument was that Rhine was using the binomial
distribution (and normal approximation) on a se-
ries of trials that were not independent. The experi-
ments in question consisted of having a subject
guess the order of a deck of 25 cards, with five each
of five symbols, so technically Kellogg was correct.
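Kellogg's point about non-independence can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reconstruction of any analysis from the period: the symbol labels, the two guessing strategies and the number of simulated runs are all chosen here for illustration. It compares the variance of the number of hits against a closed deck (five each of five symbols) with the binomial variance that would apply if the 25 trials were independent.

```python
import random
from statistics import mean, pvariance

SYMBOLS = ["circle", "cross", "waves", "square", "star"]  # illustrative labels


def closed_deck():
    """A 25-card deck with exactly five cards of each of five symbols."""
    return [s for s in SYMBOLS for _ in range(5)]


def hits(guesses, deck):
    """Number of positions where the guess matches the card."""
    return sum(g == c for g, c in zip(guesses, deck))


def simulate(guesses, n_runs=20000, seed=1):
    """Mean and variance of hits when a fixed guess list meets shuffled decks."""
    rng = random.Random(seed)
    deck = closed_deck()
    scores = []
    for _ in range(n_runs):
        rng.shuffle(deck)
        scores.append(hits(guesses, deck))
    return mean(scores), pvariance(scores)


balanced = closed_deck()      # guesser also calls each symbol five times
constant = ["star"] * 25      # guesser calls the same symbol on every trial

mean_bal, var_bal = simulate(balanced)
mean_con, var_con = simulate(constant)
binomial_var = 25 * 0.2 * 0.8  # variance if the 25 trials were independent

print("balanced guesses:", mean_bal, var_bal)
print("constant guesses:", mean_con, var_con)
print("binomial variance:", binomial_var)
```

The expected number of hits is 5 under either strategy, but the variance depends on how the guesses are distributed, which is why the binomial model is only an approximation for a closed deck.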
By 1937, several mathematicians and statis-
ticians had come to Rhine's aid. Mauskopf and
McVaugh (1979) speculated that since statistics was
itself a young discipline, "a number of statisticians
were equally outraged by Kellogg, whose argu-
ments they saw as discrediting their profession"
(page 258). The major technical work, which ac-
knowledged that Kellogg's criticisms were accurate
but did little to change the significance of the
results, was conducted by Charles Stuart and
Joseph A. Greenwood and published in the first
volume of the Journal of Parapsychology (Stuart
and Greenwood, 1937). Stuart, who had been an
undergraduate in mathematics at Duke, was one of
Rhine's early subjects and continued to work with
him as a researcher until Stuart's death in 1947.
Greenwood was a Duke mathematician, who appar-
ently converted to a statistician at the urging of
Rhine.
Another prominent figure who was distressed
with Kellogg's attack was E. V. Huntington, a
mathematician at Harvard. After corresponding
with Rhine, Huntington decided that, rather than
further confuse the public with a technical reply to
Kellogg's arguments, a simple statement should be
made to the effect that the mathematical issues in
Rhine's work had been resolved. Huntington must
have successfully convinced his former student,
Burton Camp of Wesleyan, that this was a wise
approach. Camp was the 1937 President of IMS.
When the annual meetings were held in December
of 1937 (jointly with AMS and AAAS), Camp
released a statement to the press that read:
Dr. Rhine's investigations have two aspects:
experimental and statistical. On the exper-
imental side mathematicians, of course,
have nothing to say. On the statistical side,
however, recent mathematical work has
established the fact that, assuming that the
experiments have been properly performed,
the statistical analysis is essentially valid. If
the Rhine investigation is to be fairly attacked,
it must be on other than mathematical grounds
(Camp, 1937).
One statistician who did emerge as a critic was
William Feller. In a talk at the Duke Mathemati-
cal Seminar on April 24, 1940, Feller raised three
criticisms to Rhine's work (Feller, 1940). They had
been raised before by others (and continue to be
raised even today). The first was that inadequate
shuffling of the cards resulted in additional infor-
mation from one series to the next. The second was
what is now known as the "file-drawer effect,"
namely, that if one combines the results of pub-
lished studies only, there is sure to be a bias in
favor of successful studies. The third was that the
results were enhanced by the use of optional stop-
ping, that is, by not specifying the number of trials
in advance. All three of these criticisms were ad-
dressed in a rejoinder by Greenwood and Stuart
(1940), but Feller was never convinced. Even in its
third edition published in 1968, his book An Intro-
duction to Probability Theory and Its Applications
still contains his conclusion about Greenwood and
Stuart: "Both their arithmetic and their experi-
ments have a distinct tinge of the supernatural"
(Feller, 1968, page 407). In his discussion of Feller's
[remainder of sentence illegible in source] "I believe
Feller was confused ... he seemed to have decided
the opposition was wrong and that was that."
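The third of Feller's criticisms, optional stopping, lends itself to a short simulation. The sketch below assumes a pure chance guesser with independent trials at a hit rate of 0.2, up to 40 runs of 25 trials, and a one-sided normal-approximation test at the nominal .05 level after every run; none of these settings comes from Rhine's or Feller's work. It compares how often the test crosses the threshold at any look with how often it does so at the single, prespecified final look.

```python
import random
from math import sqrt


def z_score(hits, trials, p0=0.2):
    """Normal-approximation z for an observed hit count under chance rate p0."""
    return (hits - trials * p0) / sqrt(trials * p0 * (1 - p0))


def run_experiment(rng, max_runs=40, run_len=25, p0=0.2, z_crit=1.645):
    """Simulate one chance-only experiment.

    Returns (crossed_at_any_look, significant_at_final_look).
    """
    hits = 0
    crossed = False
    for run in range(1, max_runs + 1):
        hits += sum(rng.random() < p0 for _ in range(run_len))
        if z_score(hits, run * run_len, p0) > z_crit:
            crossed = True  # an experimenter free to stop here could report success
    final_sig = z_score(hits, max_runs * run_len, p0) > z_crit
    return crossed, final_sig


def false_positive_rates(n_experiments=2000, seed=7):
    """Proportion of chance-only experiments declared significant each way."""
    rng = random.Random(seed)
    results = [run_experiment(rng) for _ in range(n_experiments)]
    optional = sum(c for c, _ in results) / n_experiments
    fixed = sum(f for _, f in results) / n_experiments
    return optional, fixed


optional_rate, fixed_rate = false_positive_rates()
print("stop at any significant look:", optional_rate)
print("fixed number of trials:      ", fixed_rate)
```

Specifying the number of trials in advance keeps the error rate near its nominal level, which is the safeguard the criticism calls for.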
Several statisticians have contributed to the
literature in parapsychology to greater or lesser
degrees. T. N. E. Greville developed applicable
statistical methods for many of the experiments in
parapsychology and was Statistical Editor of the
Journal of Parapsychology (with J. A. Greenwood)
from its start in 1937 through Volume 31 in 1967;
Fisher (1924, 1929) addressed some specific prob-
lems in card-guessing experiments; Wilks (1965a, b)
described various statistical methods for parapsy-
chology; Lindley (1957) presented a Bayesian anal-
ysis of some parapsychology data; and Diaconis
(1978) pointed out some problems with certain ex-
periments and presented a method for analyzing
experiments when feedback is given.
Occasionally, attacks on parapsychology have
taken the form of attacks on statistical inference in
general, at least as it is applied to real data.
Spencer-Brown (1957) attempted to show that true
randomness is impossible, at least in finite se-
quences, and that this could be the explanation for
the results in parapsychology. That argument reemerged
in a recent debate on the role of randomness
in parapsychology, initiated by psychologist J.
Barnard Gilmore (Gilmore, 1989, 1990; Utts, 1989;
Palmer, 1989, 1990). Gilmore stated that "The agnostic
statistician, advising on research in psi,
should take account of the possible inappropriate-
ness of classical inferential statistics" (1989, page
338). In his second paper, Gilmore reviewed several
non-psi studies showing purportedly random sys-
tems that do not behave as they should under
randomness (e.g., Iversen, Longcor, Mosteller,
Gilbert and Youtz, 1971; Spencer-Brown, 1957).
Gilmore concluded that "Anomalous data ...
should not be found nearly so often if classical
statistics offers a valid model of reality" (1990,
page 54), thus rejecting the use of classical statisti-
cal inference for real-world applications in general.
3. REPLICATION
Implicit and explicit in the literature on parapsy-
chology is the assumption that, in order to truly
establish itself, the field needs to find a repeat-
able experiment. For example, Diaconis (1978)
started the summary of his article in Science with
the words "In search of repeatable ESP experiments,
modern investigators ... " (page 131). On
October 28-29, 1983, the 32nd International Conference
of the Parapsychology Foundation was held
in San Antonio, Texas, to address "The Repeatability
Problem in Parapsychology." The Conference
Proceedings (Shapin and Coly, 1985) reflect the
diverse views among parapsychologists on the nature
of the problem. Honorton (1985a) and Rao
(1985), for example, both argued that strict replication
is uncommon in most branches of science and
that parapsychology should not be singled out as
unique in this regard. Other authors expressed
disappointment in the lack of a single repeatable
experiment in parapsychology, with titles such
as "Unrepeatability: Parapsychology's Only Finding"
(Blackmore, 1985), and "Research Strategies
for Dealing with Unstable Phenomena" (Beloff,
1985).
It has never been clear, however, just exactly
what would constitute acceptable evidence of a repeatable
experiment. In the early days of investigation,
the major critics "insisted that it would be
sufficient for Rhine and Soal to convince them of
ESP if a parapsychologist could perform successfully
a single 'fraud-proof' experiment" (Hyman,
1985a, page 71). However, as soon as well-designed
experiments showing statistical significance
emerged, the critics realized that a single experiment
could be statistically significant just by
chance. British psychologist C. E. M. Hansel quantified
the new expectation, that the experiment
should be repeated a few times, as follows:
If a result is significant at the .01 level and
this result is not due to chance but to informa-
tion reaching the subject, it may be expected
that by making two further sets of trials the
antichance odds of one hundred to one will be
increased to around a million to one, thus enabling
the effects of ESP - or whatever is responsible
for the original result - to manifest
itself to such an extent that there will be little
doubt that the result is not due to chance
[Hansel, 1980, page 298].
In other words, three consecutive experiments at
p ≤ 0.01 would convince Hansel that something
other than chance was at work.
This argument implies that if a particular experi-
ment produces a statistically significant result, but
subsequent replications fail to attain significance,
then the original result was probably due to chance,
or at least remains unconvincing. The problem with
this line of reasoning is that there is no consid-
eration given to sample size or power. Only an
experiment with extremely high power should
be expected to be "successful" three times in
succession.
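The dependence of "three successes in succession" on power can be made concrete with a short calculation. The sketch below is illustrative only: the chance hit rate of 0.25, the true hit rate of 0.35 and the sample sizes are hypothetical values chosen here, not figures from any experiment discussed in this paper. It computes the exact power of a one-sided binomial test at the .01 level and the probability that three independent experiments all reach significance.

```python
from math import comb


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))


def critical_value(n, p0, alpha):
    """Smallest k such that P(X >= k | p0) <= alpha."""
    for k in range(n + 2):
        if upper_tail(k, n, p0) <= alpha:
            return k


def power(n, p0, p1, alpha):
    """Probability of reaching significance when the true hit rate is p1."""
    return upper_tail(critical_value(n, p0, alpha), n, p1)


# Hypothetical setting: chance rate 0.25, true rate 0.35, tests at the .01 level.
for n in (50, 100, 400):
    pw = power(n, 0.25, 0.35, 0.01)
    print(f"n={n:4d}  power={pw:.3f}  P(three in a row)={pw**3:.3f}")
```

Because the three experiments are independent, the chance that all three succeed is the cube of the power of one, so even a test with moderate power will usually fail Hansel's criterion when the effect is real.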
It is perhaps a failure of the way statistics is
taught that many scientists do not understand the
importance of power in defining successful replication.
To illustrate this point, psychologists Tversky
and Kahneman (1982) distributed a questionnaire
to their colleagues at a professional meeting, with
the question:
An investigator has reported a result that you
consider implausible. He ran 15 subjects, and
reported a significant value, t = 2.46. Another
investigator has attempted to duplicate his pro-
cedure, and he obtained a nonsignificant value
of t with the same number of subjects. The
direction was the same in both sets of data.
You are reviewing the literature. What is the
highest value of t in the second set of data that
you would describe as a failure to replicate?
[1982, page 28].
In reporting their results, Tversky and Kahneman
stated:
The majority of our respondents regarded t =
1.70 as a failure to replicate. If the data of two
such studies (t = 2.46 and t = 1.70) are pooled,
the value of t for the combined data is about
3.00 (assuming equal variances). Thus, we are
faced with a paradoxical state of affairs, in
which the same data that would increase our
confidence in the finding when viewed as part
of the original study, shake our confidence
when viewed as an independent study [1982,
page 28].
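The pooled value quoted above can be checked with a rough calculation. The sketch assumes, for illustration, that each study is a one-sample design with the same number of subjects and the same standard deviation, so that each t equals the study mean divided by its standard error; the function name is hypothetical. Under those assumptions the common standard deviation and sample size cancel, and the pooled t is the sum of the individual t values divided by the square root of the number of studies.

```python
from math import sqrt


def pooled_t(t_values):
    """Approximate t for pooled data from equal-sized studies.

    Assumes every study has the same n and the same standard deviation s,
    so each t_i = mean_i / (s / sqrt(n)). The pooled mean is the average of
    the study means and the pooled standard error is s / sqrt(k * n), which
    gives sum(t_i) / sqrt(k) once s and n cancel.
    """
    k = len(t_values)
    return sum(t_values) / sqrt(k)


combined = pooled_t([2.46, 1.70])
print(round(combined, 2))
```

This simplification ignores the small change in the variance estimate when the data are pooled, so it gives a value close to, rather than exactly, the "about 3.00" in the quotation.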
At a recent presentation to the History and Phi-
losophy of Science Seminar at the University of
California at Davis, I asked the following question.
Two scientists, Professors A and B, each have a
theory they would like to demonstrate. Each plans
to run a fixed number of Bernoulli trials and then
test H0: p = 0.25 versus Ha: p > 0.25. Professor A
has access to large numbers of students each
semester to use as subjects. In his first experiment,
he runs 100 subjects, and there are 33 successes
(p = 0.04, one-tailed). Knowing the importance of
replication, Professor A runs an additional 100 sub-
jects as a second experiment. He finds 36 successes
(p = 0.009, one-tailed).
Professor B only teaches small classes. Each
quarter, she runs an experiment on her students to
test her theory. She carries out ten studies this
way, with the results in Table 1.
I asked the audience by a show of hands to
indicate whether or not they felt the scientists had
successfully demonstrated their theories. Professor
A's theory received overwhelming support, with
approximately 20 votes, while Professor B's theory
received only one vote.
If you aggregate the results of the experiments
for each professor, you will notice that each con-
ducted 200 trials, and Professor B actually demon-
strated a higher level of success than Professor A,
with 71 as opposed to 69 successful trials. The
one-tailed p-values for the combined trials are
0.0017 for Professor A and 0.0006 for Professor B.
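The aggregation described above can be reproduced with an exact binomial tail calculation. The sketch below takes the counts for Professor A from the text and those for Professor B from Table 1; the function and variable names are chosen here for illustration, and the exact tail probabilities it prints need not agree with every rounded entry in the printed table.

```python
from math import comb


def one_tailed_p(successes, n, p0=0.25):
    """Exact P(X >= successes) for X ~ Binomial(n, p0)."""
    return sum(
        comb(n, x) * p0**x * (1 - p0) ** (n - x) for x in range(successes, n + 1)
    )


# (number of trials, number of successes) for each experiment
prof_a = [(100, 33), (100, 36)]
prof_b = [(10, 4), (15, 6), (17, 6), (25, 8), (30, 10),
          (40, 13), (18, 7), (10, 5), (15, 5), (20, 7)]


def summarize(studies):
    """Total trials, total successes and the one-tailed p-value for the total."""
    n_total = sum(n for n, _ in studies)
    s_total = sum(s for _, s in studies)
    return n_total, s_total, one_tailed_p(s_total, n_total)


a_total = summarize(prof_a)
b_total = summarize(prof_b)
b_individual = [one_tailed_p(s, n) for n, s in prof_b]

print("Professor A combined:", a_total)
print("Professor B combined:", b_total)
print("Professor B, study by study:", [round(p, 2) for p in b_individual])
```

None of Professor B's individual studies reaches the .05 level, yet the combined data give at least as much evidence against the null hypothesis as Professor A's two "successful" experiments.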
To address the question of replication more ex-
plicitly, I also posed the following scenario. In
December of 1987, it was decided to prematurely
terminate a study on the effects of aspirin in reduc-
ing heart attacks because the data were so convinc-
ing (see, e.g., Greenhouse and Greenhouse, 1988;
Rosenthal, 1990a). The physician-subjects had been
randomly assigned to take aspirin or a placebo.
There were 104 heart attacks among the 11,037
subjects in the aspirin group, and 189 heart attacks
among the 11,034 subjects in the placebo group
(chi-square = 25.01, p < 0.00001).
After showing the results of that study, I pre-
sented the audience with two hypothetical experi-
ments conducted to try to replicate the original
result, with outcomes in Table 2.
I asked the audience to indicate which one they
thought was a more successful replication. The au-
dience chose the second one, as would most journal
editors, because of the "significant p-value." In
fact, the first replication has almost exactly the
same proportion of heart attacks in the two groups
as the original study and is thus a very close repli-
cation of that result. The second replication has
very different proportions, and in fact the relative
risk from the second study is not even contained in
a 95% confidence interval for relative risk from the
original study. The magnitude of the effect has
been much more closely matched by the "nonsignificant"
replication.
TABLE 1
Attempted replications for Professor B
  n    Number of successes    One-tailed p-value
 10             4                    0.22
 15             6                    0.15
 17             6                    0.23
 25             8                    0.17
 30            10                    0.20
 40            13                    0.18
 18             7                    0.14
 10             5                    0.08
 15             5                    0.31
 20             7                    0.21
TABLE 2
Hypothetical replications of the aspirin/heart attack study
              Replication #1          Replication #2
              Heart attack            Heart attack
              Yes      No             Yes      No
Aspirin        11     1156             20     2314
Placebo        19     1090             48     2170
Chi-square    2.596, p = 0.11         13.206, p = 0.0003
Fortunately, psychologists are beginning to notice
that replication is not as straightforward as
they were originally led to believe. A special issue
of the Journal of Social Behavior and Personality
was entirely devoted to the question of replication
(Neuliep, 1990). In one of the articles, Rosenthal
cautioned his colleagues: "Given the levels of statistical
power at which we normally operate, we
have no right to expect the proportion of significant
results that we typically do expect, even if in nature
there is a very real and very important effect"
(Rosenthal, 1990b, page 16).
Jacob Cohen, in his insightful article titled
"Things I Have Learned (So Far)," identified another
misconception common among social scientists:
"Despite widespread misconceptions to the
contrary, the rejection of a given null hypothesis
gives us no basis for estimating the probability that
a replication of the research will again result in
rejecting that null hypothesis" (Cohen, 1990, page
1307).
Cohen and Rosenthal both advocate the use of
effect sizes as opposed to significance levels when
defining the strength of an experimental effect. In
general, effect sizes measure the amount by which
the data deviate from the null hypothesis in terms
of standardized units. For instance, the effect size
for a two-sample t-test is usually defined to be the
difference in the two means, divided by the standard
deviation for the control group. This measure
can be compared across studies without the dependence
on sample size inherent in significance levels.
(Of course there will still be variability in the
sample effect sizes, decreasing as a function of sample
size.) Comparison of effect sizes across studies is
one of the major components of meta-analysis.
Similar arguments have recently been made in
the medical literature. For example, Gardner and
Altman (1986) stated that the use of p-values "to
define two alternative outcomes-significant and
not significant-is not helpful and encourages lazy
thinking" (page 746). They advocated the use of
confidence intervals instead.
As discussed in the next section, the arguments
used to conclude that parapsychology has failed to
demonstrate a replicable effect hinge on these misconceptions
of replication and failure to examine
power. A more appropriate analysis would compare
the effect sizes for similar experiments across experimenters
and across time to see if there have
been consistent effects of the same magnitude.
Rosenthal also advocates this view of replication:
The traditional view of replication focuses on,
significance level as the relevant summary
statistic of a study and evaluates the success of
a replication ' in a dichotomous fashion The
newer, more useful view of replication'focuses
on effect size as the more important summary'.,statistic of a study and evaluates the success of .
a replication not in a dichotomous but in a
continuous fashion (Rosenthal, 1990b, page 28).
The dichotomous view of replication has been
used throughout the history of parapsychology, by
both parapsychologists and critics (Utts, 1988). For
example, the National Academy of Sciences report
critically evaluated "significant" experiments, but
entirely ignored "nonsignificant" experiments.
In the next three sections, we will examine some
of the results in parapsychology using the broader,
more appropriate definition of replication. In doing
so, we will show that the results are far more
interesting than the critics would have us believe.
4. THE GANZFELD DEBATE IN
PARAPSYCHOLOGY
An extensive debate took place in the mid-1980s
between a parapsychologist and critic, questioning
whether or not a particular body of parapsychologi-
cal data had demonstrated psi abilities. The experi-
ments in question were all conducted using the
ganzfeld setting (described below). Several authors
were invited to write commentaries on the debate.
As a result, this data base has been more thoroughly
analyzed by both critics and proponents
than any other and provides a good source for
studying replication in parapsychology.
The debate concluded with a detailed series of
recommendations for further experiments, and left
open the question of whether or not psi abilities
had been demonstrated. A new series of experiments
that followed the recommendations was
conducted over the next few years. The results of
the new experiments will be presented in Section 5.
4.1 Free-Response Experiments
Recent experiments in parapsychology tend to
use more complex target material than the cards
and dice used in the early investigations, partially
to alleviate boredom on the part of the subjects and
partially because they are thought to "more nearly
resemble the conditions of spontaneous psi occurrences"
(Burdick and Kelly, 1977, page 109). These
experiments fall under the general heading of
"free-response" experiments, because the subject is
asked to give a verbal or written description of the
target, rather than being forced to make a choice
from a small discrete set of possibilities. Various
types of target material have been used, including
pictures, short segments of movies on video tapes,
actual locations and small objects.
Despite the more complex target material, the
statistical methods used to analyze these experi-
ments are similar to those for forced-choice experi-
ments. A typical experiment proceeds as follows.
Before conducting any trials, a large pool of poten-
tial targets is assembled, usually in packets of four.
Similarity of targets within a packet is kept to a
minimum, for reasons made clear below. At the
start of an experimental session, after the subject is
sequestered in an isolated room, a target is selected
at random from the pool. A sender is placed in
another room with the target. The subject is asked
to provide a verbal or written description of what
he or she thinks is in the target, knowing only that
it is a photograph, an object, etc.
After the subject's description has been recorded
and secured against the potential for later alter-
ation, a judge (who may or may not be the subject)
is given a copy of the subject's description and the
four possible targets that were in the packet with
the correct target. A properly conducted experi-
ment either uses video tapes or has two identical
sets of target material and uses the duplicate set
for this part of the process, to ensure that clues
such as fingerprints don't give away the answer.
Based on the subject's description, and of course on
a blind basis, the judge is asked to either rank the
four choices from most to least likely to have been
the target, or to select the one from the four that
seems to best match the subject's description. If
ranks are used, the statistical analysis proceeds by
summing the ranks over a series of trials and
comparing the sum to what would be expected by
chance. If the selection method is used, a "direct
hit" occurs if the correct target is chosen, and the
number of direct hits over a series of trials is
compared to the number expected in a binomial
experiment with p = 0.25.
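The two scoring methods just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of any particular study: the function names and the ten example ranks are hypothetical, and the rank-sum comparison uses a normal approximation. Under chance each rank from 1 to 4 is equally likely, so a single trial's rank has mean 2.5 and variance 1.25.

```python
from math import comb, sqrt
from statistics import NormalDist

def direct_hit_p_value(hits, trials, p_chance=0.25):
    """Exact one-tailed binomial probability of at least `hits` direct hits."""
    return sum(comb(trials, k) * p_chance**k * (1 - p_chance)**(trials - k)
               for k in range(hits, trials + 1))

def sum_of_ranks_z(ranks, n_choices=4):
    """Normal-approximation z score for a sum of ranks (rank 1 = best match)."""
    n = len(ranks)
    mean_rank = (n_choices + 1) / 2        # 2.5 when there are four choices
    var_rank = (n_choices**2 - 1) / 12     # 1.25 when there are four choices
    # Sign is flipped so that a smaller-than-chance rank sum gives a positive z.
    return -(sum(ranks) - n * mean_rank) / sqrt(n * var_rank)

# Hypothetical ranks assigned to the correct target in 10 sessions.
example_ranks = [1, 2, 1, 4, 1, 3, 2, 1, 2, 1]
hits = example_ranks.count(1)
p_hits = direct_hit_p_value(hits, len(example_ranks))
z_ranks = sum_of_ranks_z(example_ranks)
p_ranks = 1 - NormalDist().cdf(z_ranks)
print(hits, round(p_hits, 4), round(z_ranks, 2), round(p_ranks, 4))
```

The direct-hit count discards the information in ranks 2 through 4, which is why the two measures can give different impressions of the same sessions.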
Note that the subjects' responses cannot be con-
sidered to be "random" in any sense, so probability
assessments are based on the random selection of
the target and decoys. In a correctly designed ex-
periment, the probability of a direct hit by chance
is 0.25 on each trial, regardless of the response, and
the trials are independent. These and other issues
related to analyzing free-response experiments are
discussed by Utts (1991).
4.2 The Psi Ganzfeld Experiments
The ganzfeld procedure is a particular kind of
free-response experiment utilizing a perceptual
isolation technique originally developed by Gestalt
psychologists for other purposes. Evidence from
spontaneous case studies and experimental work
had led parapsychologists to a model proposing that
psychic functioning may be masked by sensory in-
put and by inattention to internal states (Honorton,
1977). The ganzfeld procedure was specifically de-
signed to test whether or not reduction of external
"noise" would enhance psi performance.
In these experiments, the subject is placed in a
comfortable reclining chair in an acoustically
shielded room. To create a mild form of sensory
deprivation, the subject wears headphones through
which white noise is played, and stares into a
constant field of red light. This is achieved by
taping halved translucent ping-pong balls over the
eyes and then illuminating the room with red light.
In the psi ganzfeld experiments, the subject speaks
into a microphone and attempts to describe the
target material being observed by the sender in a
distant room.
At the 1982 Annual Meeting of the Parapsycho-
logical Association, a debate took place over the
degree to which the results of the psi ganzfeld
experiments constituted evidence of psi abilities.
Psychologist and critic Ray Hyman and parapsy-
chologist Charles Honorton each analyzed the re-
sults of all known psi ganzfeld experiments to date,
and they reached strikingly different conclusions
(Honorton, 1985b; Hyman, 1985b). The debate con-
tinued with the publication of their arguments in
separate articles in the March 1985 issue of the
Journal of Parapsychology. Finally, in the Decem-
ber 1986 issue of the Journal of Parapsychology,
Hyman and Honorton (1986) wrote a joint article
in which they highlighted their agreements and
disagreements and outlined detailed criteria for
future experiments. That same issue contained
commentaries on the debate by 10 other authors.
The data base analyzed by Hyman and Honorton
(1986) consisted of results taken from 34 reports
written by a total of 47 authors. Honorton counted
42 separate experiments described in the reports, of
which 28 reported enough information to determine
the number of direct hits achieved. Twenty-three of
the studies (55%) were classified by Honorton as
having achieved statistical significance at 0.05.
4.3 The Vote-Counting Debate
Vote-counting is the term commonly used for the
technique of drawing inferences about an experi-
mental effect by counting the number of significant
versus nonsignificant studies of the effect. Hedges
and Olkin (1985) give a detailed analysis of the
inadequacy of this method, showing that it is more
and more likely to make the wrong decision as the
number of studies increases. While Hyman
acknowledged that "vote-counting raises many problems"
(Hyman, 1985b, page 8), he nonetheless spent
half of his critique of the ganzfeld studies showing
why Honorton's count of 55% was wrong.
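The weakness of vote-counting noted above can be illustrated numerically. The sketch below assumes a real effect exists, that every study has the same low power (0.18 is an illustrative value, not taken from Hedges and Olkin), and that the vote-counting rule declares an effect only when more than half of the studies are significant. The rule and the function name are assumptions made for the illustration.

```python
from math import comb

def prob_majority_significant(n_studies, power):
    """Chance that more than half of n_studies reach significance,
    when each study independently does so with probability `power`."""
    need = n_studies // 2 + 1
    return sum(comb(n_studies, k) * power**k * (1 - power)**(n_studies - k)
               for k in range(need, n_studies + 1))

# With a real effect but low power, adding studies makes the
# vote-count less, not more, likely to detect the effect.
for n in (5, 20, 80):
    print(n, prob_majority_significant(n, 0.18))
```

Under these assumptions the probability of a correct decision shrinks toward zero as studies accumulate, which is the opposite of what one expects from more data.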
Hyman's first complaint was that several of the
studies contained multiple conditions, each of which
should be considered as a separate study. Using
this definition he counted 80 studies (thus further
reducing the sample sizes of the individual studies),
of which 25 (31%) were "successful." Honorton's
response to this was to invite readers to examine
the studies and decide for themselves if the varying
conditions constituted separate experiments.
Hyman next postulated that there was selection
bias, so that significant studies were more likely to
be reported. He raised some important issues about
how pilot studies may be terminated and not re-
ported if they don't show significant results, or may
at least be subject to optional stopping, allowing
the experimenter to determine the number of tri-
als. He also presented a chi-square analysis that
"suggests a tendency to report studies with a small
sample only if they have significant results"
(Hyman, 1985b, page 14), but I have questioned his
analysis elsewhere (Utts, 1986, page 397).
Honorton refuted Hyman's argument with four
rejoinders (Honorton, 1985b, page 66). In addition
to reinterpreting Hyman's chi-square analysis,
Honorton pointed out that the Parapsychological
Association has an official policy encouraging the
publication of nonsignificant results in its journals
and proceedings, that a large number of reported
ganzfeld studies did not achieve statistical signifi-
cance and that there would have to be 15 studies in
the "file-drawer" for every one reported to cancel
out the observed significant results.
The remainder of Hyman's vote-counting analy-
sis consisted of showing that the effective error rate
for each study was actually much higher than the
nominal 5%. For example, each study could have
been analyzed using the direct hit measure, the
sum of ranks measure or one of two other measures
used for free-response analyses. Hyman carried out
a simulation study that showed the true error rate
would be 0.22 if "significance" was defined by re-
quiring at least one of these four measures to
achieve the 0.05 level. He suggested several other
ways in which multiple testing could occur and
concluded that the effective error rate in each ex-
periment was not the nominal 0.05, but rather was
probably close to the 31% he had determined to be
the actual success rate in his vote-count.
Honorton acknowledged that there was a multiple
testing problem, but he had a two-fold response.
First, he applied a Bonferroni correction and found
that the number of significant studies (using his
definition of a study) only dropped from 55% to
45%. Next, he proposed that a uniform index of
success be applied to all studies. He used the number
of direct hits, since it was by far the most
commonly reported measure and was the measure
used in the first published psi ganzfeld study. He
then conducted a detailed analysis of the 28 studies
reporting direct hits and found that 43% were
significant at 0.05 on that measure alone. Further, he
showed that significant effects were reported by six
of the 10 independent investigators and thus were
not due to just one or two investigators or laboratories.
He also noted that success rates were very
similar for reports published in refereed journals
and those published in unrefereed monographs and
abstracts.
While Hyman's arguments identified issues such
as selective reporting and optional stopping that
should be considered in any meta-analysis, the
dependence of significance levels on sample size makes
the vote-counting technique almost useless for
assessing the magnitude of the effect. Consider, for
example, the 24 studies where the direct hit measure
was reported and the chance probability of a
direct hit was 0.25, the most common type of study
in the data base. (There were four direct hit studies
with other chance probabilities and 14 that did not
report direct hits.) Of the 24 studies, 13 (54%) were
"nonsignificant" at α = 0.05, one-tailed. But if the
367 trials in these "failed replications" are combined,
there are 106 direct hits, z = 1.66, and p =
0.0485, one-tailed. This is reminiscent of the
dilemma of Professor B in Section 3.
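The pooled figure for the 13 "nonsignificant" studies can be checked with a normal approximation to the binomial. The sketch below assumes a continuity correction of one-half, which gives a z of about 1.66; without the correction the z score is about 1.72. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def pooled_direct_hit_z(hits, trials, p_chance=0.25, continuity=True):
    """z score for a pooled direct-hit count under the chance hit rate."""
    expected = trials * p_chance
    sd = sqrt(trials * p_chance * (1 - p_chance))
    correction = 0.5 if continuity else 0.0   # assumed continuity correction
    return (hits - correction - expected) / sd

z = pooled_direct_hit_z(106, 367)
p_one_tailed = 1 - NormalDist().cdf(z)
print(round(z, 2), round(p_one_tailed, 4))
```

The one-tailed probability from this approximation falls just under 0.05, so thirteen individually "failed" studies jointly cross the conventional threshold.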
Power is typically very low for these studies. The
median sample size for the studies reporting direct
hits was 28. If there is a real effect and it increases
the success probability from the chance 0.25 to
an actual 0.33 (a value whose rationale will be
made clear below), the power for a study with 28
trials is only 0.181 (Utts, 1986). It should be no
surprise that there is a "repeatability" problem in
parapsychology.
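The power figure quoted above can be reproduced with an exact binomial calculation. The sketch assumes a one-tailed test at the 0.05 level whose critical value is the smallest hit count with a chance tail probability of at most 0.05; the helper names are hypothetical.

```python
from math import comb

def binom_tail(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

def exact_power(n, p_null, p_alt, alpha=0.05):
    """Return the critical hit count and the power against p_alt."""
    # Smallest hit count whose tail probability under chance is <= alpha.
    critical = next(k for k in range(n + 2) if binom_tail(k, n, p_null) <= alpha)
    return critical, binom_tail(critical, n, p_alt)

critical_hits, power = exact_power(28, 0.25, 0.33)
print(critical_hits, round(power, 3))
```

With 28 trials a study needs at least 12 direct hits to reach significance, and a true hit rate of 0.33 produces that many less than one time in five.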
4.4 Flaw Analysis and Future Recommendations
The second half of Hyman's paper consisted of a
"Meta-Analysis of Flaws and Successful Outcomes"
(1985b, page 30), designed to explore whether or
not various measures of success were related to
specific flaws in the experiments. While many crit-
ics have argued that the results in parapsychology
can be explained by experimental flaws, Hyman's
analysis was the first to attempt to quantify the
relationship between flaws and significant results.
Hyman identified 12 potential flaws in the
ganzfeld experiments, such as inadequate randomization,
multiple tests used without adjusting the
significance level (thus inflating the significance
level from the nominal 5%) and failure to use a
duplicate set of targets for the judging process (thus
allowing possible clues such as fingerprints). Using
cluster and factor analyses, the 12 binary flaw
variables were combined into three new variables,
which Hyman named General Security, Statistics
and Controls.
Several analyses were then conducted. The one
reported with the most detail is a factor analysis
utilizing 17 variables for each of 36 studies. Four
factors emerged from the analysis. From these,
Hyman concluded that security had increased over
the years, that the significance level tended to be
inflated the most for the most complex studies and
that both effect size and level of significance were
correlated with the existence of flaws.
Following his factor analysis, Hyman picked the
three flaws that seemed to be most highly corre-
lated with success, which were inadequate atten-
tion to both randomization and documentation and
the potential for ordinary communication between
the sender and receiver. A regression equation was
then computed using each of the three flaws as
dummy variables, and the effect size for the experi-
ment as the dependent variable. From this equa-
tion, Hyman concluded that a study without these
three flaws would be predicted to have a hit rate of
27%. He concluded that this is "well within the
statistical neighborhood of the 25% chance rate"
(1985b, page 37), and thus "the ganzfeld psi data
base, despite initial impressions, is inadequate ei-
ther to support the contention of a repeatable study
or to demonstrate the reality of psi" (page 38).
Honorton discounted both Hyman's flaw classifi-
cation and his analysis. He did not deny that flaws
existed, but he objected that Hyman's analysis was
faulty and impossible to interpret. Honorton asked
psychometrician David Saunders to write an Ap-
pendix to his article, evaluating Hyman's analysis.
Saunders first criticized Hyman's use of a factor
analysis with 17 variables (many of which were
dichotomous) and only 36 cases and concluded that
"the entire analysis is meaningless" (Saunders,
1985, page 87). He then noted that Hyman's choice
of the three flaws to include in his regression analysis
constituted a clear case of multiple analysis,
since there were 84 possible sets of three that could
have been selected (out of nine potential flaws), and
Hyman chose the set most highly correlated with
effect size. Again, Saunders concluded that "any
interpretation drawn from [the regression analysis]
must be regarded as meaningless" (1985, page 88).
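Saunders' count of 84 is the number of ways to choose three flaws from nine. A short enumeration confirms it; the flaw labels below are placeholders, not Hyman's categories.

```python
from itertools import combinations
from math import comb

# Nine placeholder labels standing in for the candidate flaw variables.
flaws = [f"flaw_{i}" for i in range(1, 10)]

# Every distinct set of three flaws that could have entered the regression.
triples = list(combinations(flaws, 3))
print(len(triples), comb(9, 3))
```

Choosing the one set out of 84 that correlates best with effect size is a selection step in its own right, which is the multiple-analysis problem Saunders describes.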
Hyman's results were also contradicted by Harris
and Rosenthal (1988b) in an analysis requested by
Hyman in his capacity as Chair of the National
Academy of Sciences' Subcommittee on Parapsy-
chology. Using Hyman's flaw classifications and a
multivariate analysis, Harris and Rosenthal con-
cluded that "Our analysis of the effects of flaws on
study outcome lends no support to the hypothesis
that ganzfeld research results are a significant
function of the set of flaw variables" (1988b,
page 3).
Hyman and Honorton were in the process of
preparing papers for a second round of debate when
they were invited to lunch together at the 1986
Meeting of the Parapsychological Association. They
discovered that they were in general agreement on
several major issues, and they decided to coauthor
a "Joint Communique" (Hyman and Honorton,
1986). It is clear from their paper that they both
thought it was more important to set the stage for
future experimentation than to continue the techni-
cal arguments over the current data base. In the
abstract to their paper, they wrote:
We agree that there is an overall significant
effect in this data base that cannot reasonably
be explained by selective reporting or multiple
analysis. We continue to differ over the degree
to which the effect constitutes evidence for psi,
but we agree that the final verdict awaits the
outcome of future experiments conducted by a
broader range of investigators and according to
more stringent standards (page 351).
The paper then outlined what these standards
should be. They included controls against any kind
of sensory leakage, thorough testing and documen-
tation of randomization methods used, better re-
porting of judging and feedback protocols, control
for multiple analyses and advance specification of
number of trials and type of experiment. Indeed,
any area of research could benefit from such a
careful list of procedural recommendations.
4.5 Rosenthal's Meta-Analysis
The same issue of the Journal of Parapsychology
in which the Joint Communique appeared also car-
ried commentaries on the debate by 10 separate
authors. In his commentary, psychologist Robert
Rosenthal, one of the pioneers of meta-analysis in
psychology, summarized the aspects of Hyman's
and Honorton's work that would typically be in-
cluded in a meta-analysis (Rosenthal, 1986). It is
worth reviewing Rosenthal's results so that they
can be used as a basis of comparison for the more
recent psi ganzfeld studies reported in Section 5.
Rosenthal, like Hyman and Honorton, focused
only on the 28 studies for which direct hits were
known. He chose to use an effect size measure
called Cohen's h, which is the difference between
the arcsin transformed proportions of direct hits
that were observed and expected:
$$h = 2\left(\arcsin\sqrt{\hat{p}} - \arcsin\sqrt{p}\right).$$
One advantage of this measure over the difference
in raw proportions is that it can be used to compare
experiments with different chance hit rates.
If the observed and expected numbers of hits
were identical, the effect size would be zero. Of the
28 studies, 23 (82%) had effect sizes greater than
zero, with a median effect size of 0.32 and a mean
of 0.28. These correspond to direct hit rates of 0.40
and 0.38 respectively, when 0.25 is expected by
chance. A 95% confidence interval for the true
effect size is from 0.11 to 0.45, corresponding to
direct hit rates of from 0.30 to 0.46 when chance is
0.25.
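The correspondence between effect sizes and hit rates quoted above follows from the definition of Cohen's h. A minimal sketch, with hypothetical function names, converts in both directions for a chance rate of 0.25:

```python
from math import asin, sin, sqrt

def cohens_h(p_observed, p_expected):
    """Cohen's h: difference of arcsine-transformed proportions."""
    return 2 * (asin(sqrt(p_observed)) - asin(sqrt(p_expected)))

def hit_rate_from_h(h, p_expected=0.25):
    """Invert Cohen's h to recover the hit rate it corresponds to."""
    return sin(asin(sqrt(p_expected)) + h / 2) ** 2

# Median, mean and confidence limits reported for the 28 studies.
for h in (0.32, 0.28, 0.11, 0.45):
    print(h, round(hit_rate_from_h(h), 2))
```

The same inverse function shows that an effect size of 0.18 corresponds to a hit rate of about one in three when chance is one in four.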
A common technique in meta-analysis is to calculate
a "combined z," found by summing the individual
z scores and dividing by the square root of
the number of studies. The result should have a
standard normal distribution if each z score has a
standard normal distribution. For the ganzfeld
studies, Rosenthal reported a combined z of 6.60
with a p-value of 3.37 x 10^-11. He also reiterated
Honorton's file-drawer assessment by calculating
that there would have to be 423 studies unreported
to negate the significant effect in the 28 direct hit
studies.
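The combined z and the file-drawer count can be sketched as follows. The file-drawer formula used here is the commonly stated "fail-safe N", which assumes the unreported studies average z = 0 and asks how many are needed to pull the combined z down to the one-tailed 0.05 critical value. Whether the commentary used exactly this computation is an assumption, although it does give 423 from the reported combined z of 6.60. The individual z scores are not listed in the text, so their sum is backed out from the reported value.

```python
from math import sqrt
from statistics import NormalDist

def combined_z(z_scores):
    """Sum of z scores divided by the square root of the number of studies."""
    return sum(z_scores) / sqrt(len(z_scores))

def fail_safe_n(sum_of_z, n_studies, alpha=0.05):
    """Number of additional null (z = 0) studies needed so that the
    combined z falls to the one-tailed critical value for alpha."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return (sum_of_z / z_crit) ** 2 - n_studies

k = 28
reported_combined_z = 6.60
sum_of_z = reported_combined_z * sqrt(k)   # recover the sum from the reported value
print(round(fail_safe_n(sum_of_z, k)))
```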
Finally, Rosenthal acknowledged that, because of
the flaws in the data base and the potential for at
least a small file-drawer effect, the true average
effect size was probably closer to 0.18 than 0.28. He
concluded, "Thus, when the accuracy rate expected
under the null is 1/4, we might estimate the obtained
accuracy rate to be about 1/3" (1986, page
333). This is the value used for the earlier power
calculation.
It is worth mentioning that Rosenthal was commissioned
by the National Academy of Sciences to
prepare a background paper to accompany its 1988
report on parapsychology. That paper (Harris and
Rosenthal, 1988a) contained much of the same
analysis as his commentary summarized above.
Ironically, the discussion of the ganzfeld work in
the National Academy Report focused on Hyman's
1985 analysis, but never mentioned the work it had
commissioned Rosenthal to perform, which contradicted
the final conclusion in the report.
5. A META-ANALYSIS OF RECENT GANZFELD
EXPERIMENTS
After the initial exchange with Hyman at
the 1982 Parapsychological Association Meeting,
Honorton and his colleagues developed an automated
ganzfeld experiment that was designed to
eliminate the methodological flaws identified by
Hyman. The execution and reporting of the experiments
followed the detailed guidelines agreed upon
by Hyman and Honorton.
Using this "autoganzfeld" experiment, 11 experimental
series were conducted by eight experimenters
between February 1983 and September
1989, when the equipment had to be dismantled
due to lack of funding. In this section, the results
of these experiments are summarized and compared
to the earlier ganzfeld studies. Much of the
information is derived from Honorton et al. (1990).
5.1 The Automated Ganzfeld Procedure
Like earlier ganzfeld studies, the "autoganzfeld"
experiments require four participants. The first is
the Receiver (R), who attempts to identify the target
material being observed by the Sender (S). The
Experimenter (E) prepares R for the task, elicits
the response from R and supervises R's judging of
the response against the four potential targets.
(Judging is double blind; E does not know which is
the correct target.) The fourth participant is the lab
assistant (LA) whose only task is to instruct the
computer to randomly select the target. No one
involved in the experiment knows the identity of
the target.
Both R and S are sequestered in sound-isolated,
electrically shielded rooms. R is prepared as in
earlier ganzfeld studies, with white noise and a
field of red light. In a nonadjacent room, S watches
the target material on a television and can hear R's
target description ("mentation") as it is being
given. The mentation is also tape recorded.
The judging process takes place immediately after
the 30-minute sending period. On a TV monitor
in the isolated room, R views the four choices from
the target pack that contains the actual target. R is
asked to rate each one according to how closely it
matches the ganzfeld mentation. The ratings are
converted to ranks and, if the correct target is
ranked first, a direct hit is scored. The entire process
is automatically recorded by the computer. The
computer then displays the correct choice to R as
feedback.
There were 160 preselected targets, used with
replacement, in 10 of the 11 series. They were
arranged in packets of four, and the decoys for a
given target were always the remaining three in
the same set. Thus, even if a particular target in a
set were consistently favored by Rs, the probability
of a direct hit under the null hypothesis would
remain at 1/4. Popular targets should be no more
likely to be selected by the computer's random
number generator than any of the others in the set.
The selection of the target by the computer is the
only source of randomness in these experiments.
This is an important point, and one that is often
misunderstood. (See Utts, 1991, for elucidation.)
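A small simulation makes the point concrete. It assumes an extreme receiver who always ranks the same picture in a pack first, and a computer that chooses the target uniformly from the four pictures; the function name, seed and trial count are arbitrary choices for the illustration.

```python
import random

def simulate_biased_receiver(n_trials, favorite=0, seed=12345):
    """Hit rate for a receiver who always picks `favorite`,
    when the target is chosen uniformly at random from four pictures."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        target = rng.randrange(4)   # the only random step in the trial
        choice = favorite           # the response is not random at all
        hits += (choice == target)
    return hits / n_trials

rate = simulate_biased_receiver(100_000)
print(round(rate, 3))
```

The simulated hit rate stays close to 1/4 however strong the receiver's preference is, because the probability statement rests on the random target selection and not on the response.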
Eighty of the targets were "dynamic," consisting
of scenes from movies, documentaries and cartoons;
80 were "static," consisting of photographs, art
prints and advertisements. The four targets within
each set were all of the same type. Earlier studies
indicated that dynamic targets were more likely to
produce successful results, and one of the goals of
the new experiments was to test that theory.
The randomization procedure used to select the
target and the order of presentation for judging was
thoroughly tested before and during the experiments.
A detailed description is given by Honorton
et al. (1990, pages 118-120).
Three of the 11 series were pilot series, five were
formal series with novice receivers, and three were
formal series with experienced receivers. The last
series with experienced receivers was the only one
that did not use the 160 targets. Instead, it used
only one set of four dynamic targets in which one
target had previously received several first place
ranks and one had never received a first place
rank. The receivers, none of whom had had prior
exposure to that target pack, were not aware that
only one target pack was being used. They each
contributed one session only to the series. This will
be called the "special series" in what follows.
Except for two of the pilot series, numbers of
trials were planned in advance for each series.
Unfortunately, three of the formal series were not
yet completed when the funding ran out, including
the special series, and one pilot study with advance
planning was terminated early when the experi-
menter relocated. There were no unreported trials
during the 6-year period under review, so there was
no "file drawer."
Overall, there were 183 Rs who contributed only
one trial and 58 who contributed more than one, for
a total of 241 participants and 355 trials. Only 23
Rs had previously participated in ganzfeld experiments,
and 194 Rs (81%) had never participated in
any parapsychological research.
5.2 Results
While acknowledging that no probabilistic con-
clusions can be drawn from qualitative data, Hon-
orton et al. (1990) included several examples of
session excerpts that Rs identified as providing the
basis for their target rating. To give a flavor for the
dream-like quality of the mentation and the amount
of information that can be lost by only assigning a
rank, the first example is reproduced here. The
target was a painting by Salvador Dali called
"Christ Crucified." The correct target received a
first place rank. The part of the mentation R used
to make this assessment read:
... I think of guides, like spirit guides, leading
me and I come into a court with a king. It's
quiet .... It's like heaven. The king is something
like Jesus. Woman. Now I'm just sort of
summersaulting through heaven ....
Brooding .... Aztecs, the Sun God .... High
priest .... Fear .... Graves. Woman.
Prayer .... Funeral .... Dark.
Death .... Souls .... Ten Commandments.
Moses .... (Honorton et al., 1990).
Over all 11 series, there were 122 direct hits in
the 355 trials, for a hit rate of 34.4% (exact binomial
p-value = 0.00005) when 25% were expected
by chance. Cohen's h is 0.20, and a 95% confidence
interval for the overall hit rate is from 0.30 to 0.39.
This calculation assumes, of course, that the probability
of a direct hit is constant and independent
across trials, an assumption that may be questionable
except under the null hypothesis of no psi
abilities.
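The overall figures can be checked from the counts alone. The sketch below computes the exact one-tailed binomial probability and a 95% interval for the hit rate. The interval method is not stated in the text, so the Wilson score interval used here is an assumption; it happens to round to the reported limits. Variable names are hypothetical.

```python
from math import comb, sqrt

hits, trials, p_chance = 122, 355, 0.25

# Exact probability of 122 or more direct hits if the true rate is 0.25.
p_value = sum(comb(trials, k) * p_chance**k * (1 - p_chance)**(trials - k)
              for k in range(hits, trials + 1))

# Wilson score interval for the observed hit rate (assumed method).
z = 1.96
rate = hits / trials
center = (rate + z**2 / (2 * trials)) / (1 + z**2 / trials)
half_width = (z * sqrt(rate * (1 - rate) / trials + z**2 / (4 * trials**2))
              / (1 + z**2 / trials))
lower, upper = center - half_width, center + half_width
print(round(rate, 3), p_value, round(lower, 2), round(upper, 2))
```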
Honorton et al. (1990) also calculated effect sizes
for each of the 11 series and each of the eight
experimenters. All but one of the series (the first
novice series) had positive effect sizes, as did all of
the experimenters.
The special series with experienced Rs had an
exceptionally high effect size with h = 0.81,
corresponding to 16 direct hits out of 25 trials (64%), but
the remaining series and the experimenters had
relatively homogeneous effect sizes given the
amount of variability expected by chance. If the
special series is removed, the overall hit rate is
32.1%, h = 0.16. Thus, the positive effects are not
due to just one series or one experimenter.
Of the 218 trials contributed by novices, 71 were
direct hits (32.5%, h = 0.17), compared with 51
hits in the 137 trials by those with prior ganzfeld
experience (37%, h = 0.26). The hit rates and effect
sizes were 31% (h = 0.14) for the combined pilot
series, 32.5% (h = 0.17) for the combined formal
novice series, and 41.5% (h = 0.35) for the combined
experienced series. The last figure drops to
31.6% if the outlier series is removed. Finally,
without the outlier series the hit rate for the com-
bined series where all of the planned trials were
completed was 31.2% (h = 0.14), while it was 35%
(h = 0.22) for the combined series that were termi-
nated early. Thus, optional stopping cannot
account for the positive effect.
There were two interesting comparisons that had
been suggested by earlier work and were preplanned
in these experiments. The first was to
compare results for trials with dynamic targets
with those for static targets. In the 190 dynamic
target sessions there were 77 direct hits (40%, h =
0.32) and for the static targets there were 45 hits
in 165 trials (27%, h = 0.05), thus indicating
that dynamic targets produced far more successful
results.
The second comparison of interest was whether
or not the sender was a friend of the receiver. This
was a choice the receiver could make. If he or she
did not bring a friend, a lab member acted as
sender. There were 211 trials with friends as
senders (some of whom were also lab staff), resulting
in 76 direct hits (36%, h = 0.24). Four trials
used no sender. The remaining 140 trials used
nonfriend lab staff as senders and resulted in 46
direct hits (33%, h = 0.18). Thus, trials with friends
as senders were slightly more successful than those
without.
Consonant with the definition of replication based
on consistent effect sizes, it is informative to compare
the autoganzfeld experiments with the direct
hit studies in the previous data base. The overall
success rates are extremely similar. The overall
direct hit rate was 34.4% for the autoganzfeld studies
and was 38% for the comparable direct hit
studies in the earlier meta-analysis. Rosenthal's
(1986) adjustment for flaws had placed a more
conservative estimate at 33%, very close to the
observed 34.4% in the new studies.
One limitation of this work is that the autoganzfeld
studies, while conducted by eight experimenters,
all used the same equipment in the same
laboratory. Unfortunately, the level of funding
available in parapsychology and the cost in
time and equipment to conduct proper experiments
make it difficult to amass large amounts of data
across laboratories. Another autoganzfeld laboratory
is currently being constructed at the University
of Edinburgh in Scotland, so interlaboratory
comparisons may be possible in the near future.
Based on the effect size observed to date, large
samples are needed to achieve reasonable power. If
there is a constant effect across all trials, resulting
in 33% direct hits when 25% are expected by chance,
to achieve a one-tailed significance level of 0.05
with 95% probability would require 345 sessions.
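The sample-size figure can be reproduced approximately with a normal approximation to the binomial. The text does not say which method was used, so the formula below is an assumption, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def sessions_needed(p_chance: float, p_true: float,
                    alpha: float = 0.05, power: float = 0.95) -> int:
    """Sessions needed for a one-tailed test of a single proportion
    (normal approximation; an assumed method, not stated in the text)."""
    std = NormalDist()
    z_alpha = std.inv_cdf(1.0 - alpha)
    z_power = std.inv_cdf(power)
    spread = (z_alpha * math.sqrt(p_chance * (1.0 - p_chance))
              + z_power * math.sqrt(p_true * (1.0 - p_true)))
    return math.ceil((spread / (p_true - p_chance)) ** 2)


n_sessions = sessions_needed(0.25, 0.33)
print(n_sessions)
```

Under these assumptions the result matches the 345 sessions quoted above; other approximations (for example, one based on the arcsine transformation) give slightly different counts.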
We end this section by returning to the aspirin
and heart attack example in Section 3 and expand-
ing a comparison noted by Atkinson, Atkinson,
Smith and Bem (1990, page 237). Computing the
equivalent of Cohen's h for comparing obser-
ved heart attack rates in the aspirin and placebo
groups results in h = 0.068. Thus, the effect size
observed in the ganzfeld data base is triple the
much publicized effect of aspirin on heart attacks.
6. OTHER META-ANALYSES IN
PARAPSYCHOLOGY
Four additional meta-analyses have been con-
ducted in various areas of parapsychology since the
original ganzfeld meta-analyses were reported.
Three of the four analyses focused on evidence of
psi abilities, while the fourth examined the rela-
tionship between extroversion and psychic func-
tioning. In this section, each of the four analyses
will be briefly summarized.
There are only a handful of English-language
journals and proceedings in parapsychology, so
retrieval of the relevant studies in each of the
four cases was simple to accomplish by searching
those sources in detail and by searching other
bibliographic data bases for keywords.
Each analysis included an overall summary, an
analysis of the quality of the studies versus the size
of the effect and a "file-drawer" analysis to deter-
mine the possible number of unreported studies.
Three of the four also contained comparisons across
6.1 Forced-Choice Precognition Experiments
Honorton and Ferrari (1989) analyzed forced-
choice experiments conducted from 1935 to 1987, in
which the target material was randomly selected
after the subject had attempted to predict what it
would be. The time delay in selecting the target
ranged from under a second to one year. Target
material included items as diverse as ESP cards
and automated random number generators. Two
investigators, S. G. Soal and Walter J. Levy, were
not included because some of their work has been
suspected to be fraudulent.
Overall Results. There were 309 studies re-
ported by 62 senior authors, including more than
50,000 subjects and nearly two million individual
trials. Honorton and Ferrari used $z/\sqrt{n}$ as the
measure of effect size (ES) for each study, where n
was the number of Bernoulli trials in the study.
They reported a mean ES of 0.020, and a mean
z-score of 0.65 over all studies. They also reported a
combined z of 11.41, p = $6.3 \times 10^{-25}$. Some 30%
(92) of the studies were statistically significant at
$\alpha$ = 0.05. The mean ES per investigator was 0.033,
and the significant results were not due to just a
few investigators.
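The per-study effect size and a combined z can be sketched as follows. The summary does not say how the combined z was formed; the unweighted Stouffer combination used here (sum of z scores divided by the square root of the number of studies) is an assumption, and the function names are hypothetical.

```python
import math


def effect_size(z: float, n_trials: int) -> float:
    """Effect size z / sqrt(n) for a study with n Bernoulli trials."""
    return z / math.sqrt(n_trials)


def stouffer_z(z_scores) -> float:
    """Unweighted Stouffer combination of independent z scores."""
    z_list = list(z_scores)
    return sum(z_list) / math.sqrt(len(z_list))


# Under the Stouffer assumption, the combined z equals the mean z
# times sqrt(k); the figures below are the reported summary values.
k_studies = 309
mean_z = 0.65
combined_from_mean = mean_z * math.sqrt(k_studies)
print(f"{combined_from_mean:.2f}")
```

The value obtained this way is close to the reported combined z of 11.41; the small gap is what one would expect from rounding the mean z to two decimals.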
Quality. Eight dichotomous quality measures
were assigned to each study, resulting in possible
scores from zero for the lowest quality, to eight for
the highest. They included features such as ade-
quate randomization, preplanned analysis and au-
tomated recording of the results. The correlation
between study quality and effect size was 0.081,
indicating a slight tendency for higher quality
studies to be more successful, contrary to claims by
critics that the opposite would be true. There was
a clear relationship between quality and year of
publication, presumably because over the years
experimenters in parapsychology have responded
to suggestions from critics for improving their
methodology.
File Drawer. Following Rosenthal (1984), the
authors calculated the "fail-safe N" indicating the
number of unreported studies that would have to be
sitting in file drawers in order to negate the signifi-
cant effect. They found N = 14,268, or a ratio of 46
unreported studies for each one reported. They also
followed a suggestion by Dawes, Landman and
Williams (1984) and computed the mean z for all
studies with z > 1.65. If such studies were a ran-
dom sample from the upper 5% tail of a N(0,1)
distribution, the mean z would be 2.06. In this case
it was 3.61. They concluded that selective reporting
could not explain these results.
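The benchmark of 2.06 is the mean of a standard normal variable conditional on falling in its upper 5% tail, and the ratio of 46 is the fail-safe N divided by the number of reported studies. A minimal sketch, with a hypothetical function name:

```python
from statistics import NormalDist


def mean_z_in_upper_tail(tail_prob: float) -> float:
    """Mean of a N(0,1) variable given that it lies in its upper tail."""
    std = NormalDist()
    cutoff = std.inv_cdf(1.0 - tail_prob)
    # E[Z | Z > c] = pdf(c) / P(Z > c) for a standard normal Z.
    return std.pdf(cutoff) / tail_prob


expected_mean = mean_z_in_upper_tail(0.05)
unreported_per_reported = 14268 / 309  # fail-safe N over reported studies
print(f"{expected_mean:.2f}, {unreported_per_reported:.0f}")
```

The observed mean of 3.61 is well above this benchmark, which is the basis for the authors' conclusion about selective reporting.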
Comparisons. Four variables were identified
that appeared to have a systematic relationship to
study outcome. The first was that the 25 studies
using subjects selected on the basis of good past
performance were more successful than the 223
using unselected subjects, with mean effect sizes of
0.051 and 0.008, respectively. Second, the 97 stud-
ies testing subjects individually were more success-
ful than the 105 studies that used group testing;
mean effect sizes were 0.021 and 0.004, respec-
tively. Timing of feedback was the third moderat-
ing variable, but information was only available for
104 studies. The 15 studies that never told the
subjects what the targets were had a mean effect
size of -0.001. Feedback after each trial produced
the best results, the mean ES for the 47 studies
was 0.035. Feedback after each set of trials re-
sulted in mean ES of 0.023 (21 studies), while
delayed feedback (also 21 studies) yielded a mean
ES of only 0.009. There is a clear ordering; as the
gap between time of feedback and time of the
actual guesses decreased, effect sizes increased.
The fourth variable was the time interval be-
tween the subject's guess and the actual target
selection, available for 144 studies. The best results
were for the 31 studies that generated targets less
than a second after the guess (mean ES = 0.045),
while the worst were for the seven studies that
delayed target selection by at least a month (mean
ES = 0.001). The mean effect sizes showed a clear
trend, decreasing in order as the time interval
increased from minutes to hours to days to weeks to
months.
6.2 Attempts to Influence Random Physical
Systems
Radin and Nelson (1989) examined studies de-
signed to test the hypothesis that "The statistical
output of an electronic RNG [random number gen-
erator] is correlated with observer intention in ac-
cordance with prespecified instructions" (page
1502). These experiments typically involve RNGs
based on radioactive decay, electronic noise or pseu-
dorandom number sequences seeded with true ran-
dom sources. Usually the subject is instructed to
try to influence the results of a string of binary
trials by mental intention alone. A typical protocol
would ask a subject to press a button (thus starting
the collection of a fixed-length sequence of bits),
and then try to influence the random source to
produce more zeroes or more ones. A run might
consist of three successive button presses, one each
in which the desired result was more zeroes or
more ones, and one as a control with no conscious
intention. A z score would then be computed for
each button press.
The 832 studies in the analysis were conducted
from 1959 to 1987 and included 235 "control" stud-
ies, in which the output of the RNGs was recorded
but there was no conscious intention involved.
These were usually conducted before and during
the experimental series, as tests of the RNGs.
Results. The effect size measure used was again
$z/\sqrt{n}$, where z was positive if more bits of the
specified type were achieved. The mean effect size
for control studies was not significantly different
from zero ($-1.0 \times 10^{-5}$). The mean effect size
for the experimental studies was also very small,
$3.2 \times 10^{-4}$, but it was significantly higher than the
mean ES for the control studies (z = 4.1).
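For a single button press in the protocol described above, the z score and effect size can be sketched as below. The sketch assumes each bit is a fair, independent binary trial under the null hypothesis; the function names and the example counts are hypothetical.

```python
import math


def rng_z_score(n_target_bits: int, n_bits: int) -> float:
    """z score for the count of bits of the intended type,
    assuming fair independent bits under the null hypothesis."""
    expected = n_bits / 2.0
    std_dev = math.sqrt(n_bits) / 2.0
    return (n_target_bits - expected) / std_dev


def rng_effect_size(z: float, n_bits: int) -> float:
    """Effect size z / sqrt(n), as used in the meta-analysis."""
    return z / math.sqrt(n_bits)


# Hypothetical press: 60 bits of the intended type out of 100.
z_press = rng_z_score(60, 100)
es_press = rng_effect_size(z_press, 100)
print(z_press, es_press)
```

Real RNG studies collect far longer sequences per press, which is why the effect sizes reported above are so small even when the combined z is large.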
Quality. Sixteen quality measures were defined
and assigned to each study, under the four general
categories of procedures, statistics, data and the
RNG device. A score of 16 reflected the highest
quality. The authors regressed mean effect size on
mean quality for each investigator and found a
slope of $2.5 \times 10^{-5}$ with standard error of $3.2 \times
10^{-5}$, indicating little relationship between quality
and outcome. They also calculated a weighted mean
effect size, using quality scores as weights, and
found that it was very similar to the unweighted
mean ES. They concluded that "differences
in methodological quality are not significant
predictors of effect size" (page 1507).
File Drawer. Radin and Nelson used several
methods for estimating the number of unreported
studies (pages 1508-1510). Their estimates ranged
from 200 to 1000 based on models assuming
that all significant studies were reported. They
calculated the fail-safe N to be 54,000.
6.3 Attempts to Influence Dice
Radin and Ferrari (1991) examined 148 studies,
published from 1935 to 1987, designed to test
whether or not consciousness can influence the
results of tossing dice. They also found 31 "con-
trol" studies in which no conscious intention was
involved.
Results. The effect size measure used was
$z/\sqrt{n}$, where z was based on the number of throws
in which the die landed with the desired face (or
faces) up, in n throws. The weighted mean ES for
the experimental studies was 0.0122 with a stan-
dard error of 0.00062; for the control studies the
mean and standard error were 0.00093 and 0.00255,
respectively. Weights for each study were de-
termined by quality, giving more weight to high
quality studies. Combined z scores for the exper-
imental and control studies were reported by Radin
and Ferrari to be 18.2 and 0.18, respectively.
Quality. Eleven dichotomous quality measures
were assigned, ranging from automated recording
to whether or not control studies were interspersed
with the experimental studies. The final quality
score for each study combined these with informa-
tion on method of tossing the dice, and with source
of subject (defined below). A regression of quality
score versus effect size resulted in a slope of -0.002,
with a standard error of 0.0011. However, when
effect sizes were weighted by sample size, there was
a significant relationship between quality and ef-
fect size, leading Radin and Ferrari to conclude
that higher-quality studies produced lower weighted
effect sizes.
File Drawer. Radin and Ferrari calculated
Rosenthal's fail-safe N for this analysis to be
17,974. Using the assumption that all significant
studies were reported, they estimated the number
of unreported studies to be 1152. As a final assess-
ment, they compared studies published before and
after 1975, when the Journal of Parapsychology
adopted an official policy of publishing nonsigni-
ficant results. They concluded, based on that an-
alysis, that more nonsignificant studies were
published after 1975, and thus "We must consi-
der the overall (1935-1987) data base as suspect
with respect to the filedrawer problem."
Comparisons. Radin and Ferrari noted that
there was bias in both the experimental and control
studies across die face. Six was the face most likely
to come up, consistent with the observation that it
has the least mass. Therefore, they examined re-
sults for the subset of 69 studies in which targets
were evenly balanced among the six faces. They
still found a significant effect, with mean and stan-
dard error for effect size of $8.6 \times 10^{-3}$ and $1.1 \times
10^{-3}$, respectively. The combined z was 7.617 for
these studies.
They also compared effect sizes across types of
subjects used in the studies, categorizing them as
unselected, experimenter and other subjects, exper-
imenter as sole subject, and specially selected sub-
jects. Like Honorton and Ferrari (1989), they found
the highest mean ES for studies with selected
subjects; it was approximately 0.02, more than twice
that for unselected subjects.
6.4 Extroversion and ESP Performance
Honorton, Ferrari and Bem (1991) conducted a
meta-analysis to examine the relationship between
scores on tests of extroversion and scores on
psi-related tasks. They found 60 studies by 17
investigators, conducted from 1945 to 1983.
Results. The effect size measure used for this
analysis was the correlation between each subject's
extroversion score and ESP score. A variety of
measures had been used for both scores across stud-
ies, so various correlation coefficients were used.
Nonetheless, a stem and leaf diagram of the corre-
lation showed an approximate bell shape with
mean and standard deviation of 0.19 and 0.26,
respectively, and with an additional outlier at r =
0.91. Honorton et al. reported that when weighted
by degrees of freedom, the weighted mean r was
0.14, with a 95% confidence interval covering 0.10
to 0.19.
Forced-Choice versus Free-Response Re-
sults. Because forced-choice and free-response tests
differ qualitatively, Honorton et al. chose to exam-
ine their relationship to extroversion separately.
They found that for free-response studies there was
a significant correlation between extroversion and
ESP scores, with mean r = 0.20 and z = 4.46. Fur-
ther, this effect was homogeneous across both
investigators and extroversion scales.
For forced-choice studies, there was a significant
correlation between ESP and extroversion, but only
for those studies that reported the ESP results
to the subjects before measuring extroversion.
Honorton et al. speculated that the relationship
was an artifact, in which extroversion scores
were temporarily inflated as a result of positive
feedback on ESP performance.
Confirmation with New Data. Following the
extroversion/ESP meta-analysis, Honorton et al.
attempted to confirm the relationship using
the autoganzfeld data base. Extroversion scores,
based on the Myers-Briggs Type Indicator, were
available for 221 of the 241 subjects who had
participated in autoganzfeld studies.
The correlation between extroversion scores and
ganzfeld rating scores was r = 0.18, with a 95%
confidence interval from 0.05 to 0.30. This is con-
sistent with the mean correlation of r = 0.20 for
free-response experiments, determined from the
meta-analysis. These correlations indicate that ex-
troverted subjects can produce higher scores in
free-response ESP tests.
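The confidence interval for r = 0.18 can be approximated with the Fisher z transformation. The text does not state how the interval was obtained, so this method is an assumption, as is the use of the 221 subjects with extroversion scores as the sample size; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def correlation_ci(r: float, n: int, level: float = 0.95):
    """Approximate confidence interval for a correlation via the
    Fisher z transformation (an assumed method, not stated in the text)."""
    z_r = math.atanh(r)
    std_err = 1.0 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.tanh(z_r - crit * std_err), math.tanh(z_r + crit * std_err)


ci_low, ci_high = correlation_ci(0.18, 221)
print(f"{ci_low:.2f} to {ci_high:.2f}")
```

Under these assumptions the interval agrees with the reported 0.05 to 0.30 to two decimals, and it contains the meta-analytic mean of r = 0.20.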
7. CONCLUSIONS
Parapsychologists often make a distinction be-
tween "proof-oriented research" and "process-
oriented research." The former is typically con-
ducted to test the hypothesis that psi abilities exist,
while the latter is designed to answer questions
about how psychic functioning works. Proof-
oriented research has dominated the literature
in parapsychology. Unfortunately, many of the
studies used small samples and would thus be
nonsignificant even if a moderate-sized effect
exists.
The recent focus on meta-analysis in parapsy-
chology has revealed that there are small but
consistently nonzero effects across studies, experi-
menters and laboratories. The sizes of the effects in
forced-choice studies appear to be comparable to
those reported in some medical studies that had
been heralded as breakthroughs. (See Section 5;
also Honorton and Ferrari, 1989, page 301.) Free-
response studies show effect sizes of far greater
magnitude.
A promising direction for future process-oriented
research is to examine the causes of individual
differences in psychic functioning. The ESP/ex-
troversion meta-analysis is a step in that direction.
In keeping with the idea of individual differ-
ences, Bayes and empirical Bayes methods would
appear to make more sense than the classical infer-
ence methods commonly used, since they would
allow individual abilities and beliefs to be modeled.
Jeffreys (1990) reported a Bayesian analysis of some
of the RNG experiments and showed that conclu-
sions were closely tied to prior beliefs even though
hundreds of thousands of trials were available.
It may be that the nonzero effects observed in the
meta-analyses can be explained by something other
than ESP, such as shortcomings in our understand-
ing of randomness and independence. Nonetheless,
there is an anomaly that needs an explanation. As
I have argued elsewhere (Utts, 1987), research in
parapsychology should receive more support from
the scientific community. If ESP does not exist,
there is little to be lost by erring in the direction of
further research, which may in fact uncover other
anomalies. If ESP does exist, there is
much to be gained by discovering how to enhance
and apply these abilities to important world
problems.
ACKNOWLEDGMENTS
I would like to thank Deborah Delany, Charles
Honorton, Wesley Johnson, Scott Plous and an
anonymous reviewer for their helpful comments on
an earlier draft of this paper, and Robert Rosenthal
and Charles Honorton for discussions that helped
clarify details.
REFERENCES
ATKINSON, R. L., ATKINSON, R. C., SMITH, E. E. and BEM, D. J.
(1990). Introduction to Psychology, 10th ed. Harcourt Brace
Jovanovich, San Diego.
BELOFF, J. (1985). Research strategies for dealing with unstable
phenomena. In The Repeatability Problem in Parapsychol-
ogy (B. Shapin and L. Coly, eds.) 1-21. Parapsychology
Foundation, New York.
BLACKMORE, S. J. (1985). Unrepeatability: Parapsychology's only
finding. In The Repeatability Problem in Parapsychology
(B. Shapin and L. Coly, eds.) 183-206. Parapsychology
Foundation, New York.
BURDICK, D. S. and KELLY, E. F. (1977). Statistical methods in
parapsychological research. In Handbook of Parapsychology
(B. B. Wolman, ed.) 81-130. Van Nostrand Reinhold, New
York.
CAMP, B. H. (1937). (Statement in Notes Section.) Journal of
Parapsychology 1 305.
COHEN, J. (1990). Things I have learned (so far). American
Psychologist 45 1304-1312.
COOVER, J. E. (1917). Experiments in Psychical Research at
Leland Stanford Junior University. Stanford Univ.
DAWES, R. M., LANDMAN, J. and WILLIAMS, J. (1984). Reply to
Kurosawa. American Psychologist 39 74-75.
DIACONIS, P. (1978). Statistical problems in ESP research. Sci-
ence 201 131-136.
DOMMEYER, F. C. (1975). Psychical research at Stanford Univer-
sity. Journal of Parapsychology 39 173-205.
DRUCKMAN, D. and SWETS, J. A., eds. (1988). Enhancing Human
Performance: Issues, Theories, and Techniques. National
Academy Press, Washington, D.C.
EDGEWORTH, F. Y. (1885). The calculus of probabilities applied
to psychical research. In Proceedings of the Society for
Psychical Research 3 190-199.
EDGEWORTH, F. Y. (1886). The calculus of probabilities applied
to psychical research. II. In Proceedings of the Society for
Psychical Research 4 189-208.
FELLER, W. K. (1940). Statistical aspects of ESP. Journal of
Parapsychology 4 271-297.
FELLER, W. K. (1968). An Introduction to Probability Theory
and Its Applications 1, 3rd ed. Wiley, New York.
FISHER, R. A. (1924). A method of scoring coincidences in tests
with playing cards. In Proceedings of the Society for Psychi-
cal Research 34 181-185.
FISHER, R. A. (1929). The statistical method in psychical re-
search. In Proceedings of the Society for Psychical Research
39 189-192.
GALLUP, G. H., JR., and NEWPORT, F. (1991). Belief in paranor-
mal phenomena among adult Americans. Skeptical Inquirer
15 137-146.
GARDNER, M. J. and ALTMAN, D. G. (1986). Confidence intervals
rather than p-values: Estimation rather than hypothesis
testing. British Medical Journal 292 746-750.
GILMORE, J. B. (1989). Randomness and the search for psi.
Journal of Parapsychology 53 309-340.
GILMORE, J. B. (1990). Anomalous significance in pararandom
and psi-free domains. Journal of Parapsychology 54 53-58.
GREELEY, A. (1987). Mysticism goes mainstream. American
Health 7 47-49.
GREENHOUSE, J. B. and GREENHOUSE, S. W. (1988). An aspirin a
day ... ? Chance 1 24-31.
GREENWOOD, J. A. and STUART, C. E. (1940). A review of Dr.
Feller's critique. Journal of Parapsychology 4 299-319.
HACKING, I. (1988). Telepathy: Origins of randomization in ex-
perimental design. Isis 79 427-451.
HANSEL, C. E. M. (1980). ESP and Parapsychology: A Critical
Re-evaluation. Prometheus Books, Buffalo, N.Y.
HARRIS, M. J. and ROSENTHAL, R. (1988a). Interpersonal Ex-
pectancy Effects and Human Performance Research. Na-
tional Academy Press, Washington, D.C.
HARRIS, M. J. and ROSENTHAL, R. (1988b). Postscript to Interper-
sonal Expectancy Effects and Human Performance Research.
National Academy Press, Washington, D.C.
HEDGES, L. V. and OLKIN, I. (1985). Statistical Methods for
Meta-Analysis. Academic, Orlando, Fla.
HONORTON, C. (1977). Psi and internal attention states. In
Handbook of Parapsychology (B. B. Wolman, ed.) 435-472.
Van Nostrand Reinhold, New York.
HONORTON, C. (1985a). How to evaluate and improve the repli-
cability of parapsychological effects. In The Repeatability
Problem in Parapsychology (B. Shapin and L. Coly, eds.)
238-255. Parapsychology Foundation, New York.
HONORTON, C. (1985b). Meta-analysis of psi ganzfeld research: A
response to Hyman. Journal of Parapsychology 49 51-91.
HONORTON, C., BERGER, R. E., VARVOGLIS, M. P., QUANT, M.,
DERR, P., SCHECHTER, E. I. and FERRARI, D. C. (1990).
Psi communication in the ganzfeld: Experiments with an
automated testing system and a comparison with a meta-
analysis of earlier studies. Journal of Parapsychology 54
99-139.
HONORTON, C. and FERRARI, D. C. (1989). "Future telling": A
meta-analysis of forced-choice precognition experiments,
1935-1987. Journal of Parapsychology 53 281-308.
HONORTON, C., FERRARI, D. C. and BEM, D. J. (1991). Extraver-
sion and ESP performance: A meta-analysis and a new
confirmation. Research in Parapsychology 1990. The Scare-
crow Press, Metuchen, N.J. To appear.
HYMAN, R. (1985a). A critical overview of parapsychology. In A
Skeptic's Handbook of Parapsychology (P. Kurtz, ed.) 1-96.
Prometheus Books, Buffalo, N.Y.
HYMAN, R. (1985b). The ganzfeld psi experiment: A critical
appraisal. Journal of Parapsychology 49 3-49.
HYMAN, R. and HONORTON, C. (1986). Joint communique: The
psi ganzfeld controversy. Journal of Parapsychology 50
351-364.
IVERSEN, G. R., LONGCOR, W. H., MOSTELLER, F., GILBERT, J. P.
and YOUTZ, C. (1971). Bias and runs in dice throwing and
recording: A few million throws. Psychometrika 36 1-19.
JEFFREYS, W. H. (1990). Bayesian analysis of random event
generator data. Journal of Scientific Exploration 4 153-169.
LINDLEY, D. V. (1957). A statistical paradox. Biometrika 44
187-192.
MAUSKOPF, S. H. and MCVAUGH, M. (1979). The Elusive Science:
Origins of Experimental Psychical Research. Johns Hopkins
Univ. Press.
MCVAUGH, M. R. and MAUSKOPF, S. H. (1976). J. B. Rhine's
Extrasensory Perception and its background in psychical
research. Isis 67 161-189.
NEULIEP, J. W., ed. (1990). Handbook of replication research in
the behavioral and social sciences. Journal of Social Behav-
ior and Personality 5 (4) 1-510.
OFFICE OF TECHNOLOGY ASSESSMENT (1989). Report of a work-
shop on experimental parapsychology. Journal of the Amer-
ican Society for Psychical Research 83 317-339.
PALMER, J. (1989). A reply to Gilmore. Journal of Parapsychol-
ogy 53 341-344.
PALMER, J. (1990). Reply to Gilmore: Round two. Journal of
Parapsychology 54 59-61.
PALMER, J. A., HONORTON, C. and UTTS, J. (1989). Reply to the
National Research Council study on parapsychology. Jour-
nal of the American Society for Psychical Research 83 31-49.
RADIN, D. I. and FERRARI, D. C. (1991). Effects of consciousness
on the fall of dice: A meta-analysis. Journal of Scientific
Exploration 5 61-83.
RADIN, D. I. and NELSON, R. D. (1989). Evidence for conscious-
ness-related anomalies in random physical systems. Foun-
dations of Physics 19 1499-1514.
RAO, K. R. (1985). Replication in conventional and controversial
sciences. In The Repeatability Problem in Parapsychology
(B. Shapin and L. Coly, eds.) 22-41. Parapsychology Foun-
dation, New York.
RHINE, J. B. (1934). Extrasensory Perception. Boston Society for
Psychical Research, Boston. (Reprinted by Branden Press,
1964.)
RHINE, J. B. (1977). History of experimental studies. In Hand-
book of Parapsychology (B. B. Wolman, ed.) 25-47. Van
Nostrand Reinhold, New York.
RICHET, C. (1884). La suggestion mentale et le calcul des proba-
bilités. Revue Philosophique 18 608-674.
ROSENTHAL, R. (1984). Meta-Analytic Procedures for Social Re-
search. Sage, Beverly Hills.
ROSENTHAL, R. (1986). Meta-analytic procedures and the nature
of replication: The ganzfeld debate. Journal of Parapsychol-
ogy 50 315-336.
ROSENTHAL, R. (1990a). How are we doing in soft psychology?
American Psychologist 45 775-777.
ROSENTHAL, R. (1990b). Replication in behavioral research.
Journal of Social Behavior and Personality 5 1-30.
SAUNDERS, D. R. (1985). On Hyman's factor analysis. Journal of
Parapsychology 49 86-88.
SHAPIN, B. and COLY, L., eds. (1985). The Repeatability Problem
in Parapsychology. Parapsychology Foundation, New York.
SPENCER-BROWN, G. (1957). Probability and Scientific Inference.
Longmans Green, London and New York.
STUART, C. E. and GREENWOOD, J. A. (1937). A review of criti-
cisms of the mathematical evaluation of ESP data. Journal
of Parapsychology 1 295-304.
TVERSKY, A. and KAHNEMAN, D. (1982). Belief in the law of
small numbers. In Judgment Under Uncertainty: Heuristics
and Biases (D. Kahneman, P. Slovic and A. Tversky, eds.)
23-31. Cambridge Univ. Press.
UTTS, J. (1986). The ganzfeld debate: A statistician's perspec-
tive. Journal of Parapsychology 50 395-402.
UTTS, J. (1987). Psi, statistics, and society. Behavioral and
Brain Sciences 10 615-616.
UTTS, J. (1988). Successful replication versus statistical signifi-
cance. Journal of Parapsychology 52 305-320.
UTTS, J. (1989). Randomness and randomization tests: A reply to
Gilmore. Journal of Parapsychology 53 345-351.
UTTS, J. (1991). Analyzing free-response data: A progress report.
In Psi Research Methodology: A Reexamination (L. Coly,
ed.). Parapsychology Foundation, New York. To appear.
WILKS, S. S. (1965a). Statistical aspects of experiments in
telepathy. N.Y. Statistician 16 (6) 1-3.
WILKS, S. S. (1965b). Statistical aspects of experiments in
telepathy. N.Y. Statistician 16 (7) 4-6.
Comment
M. J. Bayarri and James Berger
1. INTRODUCTION
There are many fascinating issues discussed in
this paper. Several concern parapsychology itself
and the interpretation of statistical methodology
therein. We are not experts in parapsychology, and
so have only one comment concerning such mat-
ters: In Section 3 we briefly discuss the need to
switch from P-values to Bayes factors in discussing
evidence concerning parapsychology.
A more general issue raised in the paper is that
of replication. It is quite illuminating to consider
the issue of replication from a Bayesian perspec-
tive, and this is done in Section 2 of our discussion.
2. REPLICATION
Many insightful observations concerning replica-
tion are given in the article, and these spurred us
to determine if they could be quantified within
Bayesian reasoning. Quantification requires clear
delineation of the possible purposes of replication,
and at least two are obvious. The first is simple
reduction of random error, achieved by obtaining
more observations from the replication. The second
purpose is to search for possible bias in the original
experiment. We use "bias" in a loose sense here, to
refer to any of the huge number of ways in which
the effects being measured by the experiment can
differ from the actual effects of interest. Thus a
clinical trial without a placebo can suffer a placebo
"bias"; a survey can suffer a "bias" due to the
sampling frame being unrepresentative of the
actual population; and possible sources of bias
in parapsychological experiments have been
extensively discussed.
Replication to Reduce Random Error
If the sole goal of replication of an experiment is
to reduce random error, matters are very straight-
forward. Reviewing the Bayesian way of studying
this issue is, however, useful and will be done
through the following simple example.
M. J. Bayarri is Titular Professor, Department of
Statistics and Operations Research, University of
Valencia, Avenida Dr. Moliner 50, 46100 Burjassot,
Valencia, Spain. James Berger is the Richard M.
Brumfield Distinguished Professor of Statistics,
EXAMPLE 1. Consider the example from Tversky
and Kahneman (1982), in which an experiment
results in a standardized test statistic of $z_1 = 2.46$.
(We will assume normality to keep computations
trivial.) The question is: What is the highest value
of $z_2$ in a second set of data that would be consid-
ered a failure to replicate? Two possible precise
versions of this question are: Question 1: What is
the probability of observing $z_2$ for which the null
hypothesis would be rejected in the replicated ex-
periment? Question 2: What value of $z_2$ would
leave one's overall opinion about the null hypothe-
sis unchanged?
Consider the simple case where $Z_1 \sim N(z_1 \mid \theta, 1)$
and (independently) $Z_2 \sim N(z_2 \mid \theta, 1)$, where $\theta$ is
the mean and 1 is the standard deviation of the
normal distribution. Note that we are considering
the case in which no experimental bias is suspected
and so the means for each experiment are assumed
to be the same.
Suppose that it is desired to test $H_0: \theta \le 0$ versus
$H_1: \theta > 0$, and suppose that initial prior opinion
about $\theta$ can be described by the noninformative
prior $\pi(\theta) = 1$. We consider the one-sided testing
problem with a constant prior in this section, be-
cause it is known that then the posterior probabil-
ity of $H_0$, to be denoted by $P(H_0 \mid \text{data})$, equals the
P-value, allowing us to avoid complications arising
from differences between Bayesian and classical
answers.
After observing $z_1 = 2.46$, the posterior distribu-
tion of $\theta$ is
$$\pi(\theta \mid z_1) = N(\theta \mid 2.46, 1).$$
Question 1 then has the answer (using predictive
Bayesian reasoning)
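$$P(\text{rejecting at level } \alpha \mid z_1) = \int_{c_\alpha}^{\infty} \int_{-\infty}^{\infty} N(z_2 \mid \theta, 1)\, \pi(\theta \mid z_1)\, d\theta\, dz_2 = 1 - \Phi\left(\frac{c_\alpha - 2.46}{\sqrt{2}}\right),$$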
where $\Phi$ is the standard normal cdf and $c_\alpha$ is the
(one-sided) critical value corresponding to the level,
$\alpha$, of the test. For instance, if $\alpha = 0.05$, then this
probability equals 0.7178, demonstrating that there
is a quite substantial probability that the second
experiment will fail to reject. If $\alpha$ is chosen to be
the observed significance level from the first exper-
iment, so that $c_\alpha = z_1$, the probability that the
second experiment will reject is just 1/2. This is
nothing but a statement of the well-known martin-
gale property of Bayesianism, that what you "ex-
pect" to see in the future is just what you know
today. In a sense, therefore, question 1 is exposed
as being uninteresting.
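The answer to Question 1 can be checked numerically. The sketch below assumes, as in the example, a flat prior and unit-variance normal observations, so that the predictive distribution of $z_2$ given $z_1$ is normal with mean $z_1$ and standard deviation $\sqrt{2}$. The function name is hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def prob_replication_rejects(z1: float, alpha: float) -> float:
    """Predictive probability that a replication rejects H0 at
    one-sided level alpha, given a first statistic z1 (flat prior)."""
    c_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)
    return 1.0 - STD_NORMAL.cdf((c_alpha - z1) / math.sqrt(2.0))


p_reject_05 = prob_replication_rejects(2.46, 0.05)

# Level equal to the observed significance level of the first experiment.
alpha_observed = 1.0 - STD_NORMAL.cdf(2.46)
p_reject_observed = prob_replication_rejects(2.46, alpha_observed)
print(f"{p_reject_05:.4f}, {p_reject_observed:.2f}")
```

The second value illustrates the martingale remark: when the rejection threshold is set at the first result itself, the replication is as likely to fall below it as above it.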
Question 2 more properly focuses on the fact that
the stated goal of replication here is simply to
reduce uncertainty in stated conclusions. The answer
to the question follows immediately from noting
that the posterior from the combined data
$(z_1, z_2)$ is
$$\pi(\theta \mid z_1, z_2) = N\left(\theta \mid (z_1 + z_2)/2,\ 1/\sqrt{2}\right),$$
so that
$$P(H_0 \mid \text{data}) = \Phi\left(-(z_1 + z_2)/\sqrt{2}\right).$$
Setting this equal to $P(H_0 \mid z_1)$ and solving for $z_2$
yields $z_2 = (\sqrt{2} - 1) z_1 = 1.02$. Any value of $z_2$
greater than this will increase the total evidence
against $H_0$, while any value smaller than 1.02 will
decrease the evidence.
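The break-even value of $z_2$ can be verified directly. This is a small sketch under the same assumptions as above (unit-variance normal observations, flat prior on $\theta$, $H_0: \theta \le 0$); the helper name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cdf


def posterior_prob_h0(*z_values: float) -> float:
    """P(theta <= 0 | data) for unit-variance normal observations and a flat prior."""
    n = len(z_values)
    # The posterior of theta is N(mean of z, sd = 1/sqrt(n)); standardizing the
    # value 0 gives -sum(z) / sqrt(n).
    return PHI(-sum(z_values) / sqrt(n))


z1 = 2.46
z2_break_even = (sqrt(2.0) - 1.0) * z1

print(round(z2_break_even, 2))
print(posterior_prob_h0(z1), posterior_prob_h0(z1, z2_break_even))
```

A second observation above the break-even value lowers the posterior probability of $H_0$, and one below it raises that probability.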
Replication to Detect Bias
The aspirin example dramatically raises the issue
of bias detection as a motive for replication.
Professor Utts observes that replication 1 gives
results that are fully compatible with those of the
original study, which could be interpreted as suggesting
that there is no bias in the original study,
while replication 2 would raise serious concerns of
bias. We became very interested in the implicit
suggestion that replication 2 would thus lead to
less overall evidence against the null hypothesis
than would replication 1, even though in isolation
replication 2 was much more "significant" than
was replication 1. In attempting to see if this is so,
we considered the Bayesian approach to study of
bias within the framework of the aspirin example.
EXAMPLE 2. For simplicity in the aspirin example,
we reduce consideration to
$\theta$ = true difference in heart attack rates between
aspirin and placebo populations multiplied by
1000;
$Y$ = difference in observed heart attack rates between
aspirin and placebo groups in original
study multiplied by 1000;
$X_i$ = difference in observed heart attack rates between
aspirin and placebo groups in Replication
$i$ multiplied by 1000.
We assume that the replication studies are extremely
well designed and implemented, so that
one is very confident that the $X_i$ have mean $\theta$.
Using normal approximations for convenience, the
data can be summarized as
$$X_1 \sim N(x_1 \mid \theta, 4.82), \quad X_2 \sim N(x_2 \mid \theta, 3.63),$$
with actual observations $x_1 = 7.704$ and $x_2 = 13.07$.
Consider now the bias issue. We assume that the
original experiment is somewhat suspect in this
regard, and we will model bias by defining the
mean of $Y$ to be
$$\eta = \theta + \beta,$$
where $\beta$ is the unknown bias. Then the data in the
original experiment can be summarized by
$$Y \sim N(y \mid \eta, 1.54),$$
with the actual observation being $y = 7.707$.
Bayesian analysis requires specification of a prior
distribution, $\pi(\beta)$, for the suspected amount of bias.
Of particular interest then are the posterior distribution
of $\beta$, assuming replication $i$ has been
performed, given by
$$\pi(\beta \mid y, x_i) \propto \pi(\beta) \exp\left\{-\frac{(y - x_i - \beta)^2}{2(\sigma_i^2 + 1.54^2)}\right\},$$
where $\sigma_i^2$ is the variance ($4.82^2$ or $3.63^2$) from replication
$i$; and the posterior probability of $H_0$, given
by
$$P(H_0 \mid y, x_i) = \int_{-\infty}^{\infty} \Phi\left(-\frac{\frac{\sigma_i}{1.54}(y - \beta) + \frac{1.54}{\sigma_i} x_i}{\sqrt{\sigma_i^2 + 1.54^2}}\right) \pi(\beta \mid y, x_i)\, d\beta.$$
Recall that our goal here was to see if Bayesian
analysis can reproduce the intuition that the original
experiment could be trusted if replication 1 had
been done, while it could not be trusted (in spite of
its much larger sample size) had replication 2 been
performed. Establishing this requires finding a
prior distribution $\pi(\beta)$ for which $\pi(\beta \mid y, x_1)$ has
little effect on $P(H_0 \mid y, x_1)$, but $\pi(\beta \mid y, x_2)$ has a
large effect on $P(H_0 \mid y, x_2)$. To achieve the first
objective, $\pi(\beta)$ must be tightly concentrated near
zero. To achieve the second, $\pi(\beta)$ must be such that
large $|y - x_2|$, which suggests presence of a large
bias, can result in a substantial shift of posterior
mass for $\beta$ away from zero.
A sensible candidate for the prior density $\pi(\beta)$
is the Cauchy(0, $V$) density
$$\pi(\beta) = \frac{1}{\pi V\left[1 + (\beta/V)^2\right]}.$$
Flat-tailed densities, such as this, are well known
to have the property that when discordant data is
observed (e.g., when $|y - x_2|$ is large), substantial
mass shifts away from the prior center towards
the likelihood center. It is easy to see that a normal
prior for $\beta$ cannot have the desired behavior.
Our first surprise in consideration of these priors
was how small $V$ needed to be chosen in order for
$P(H_0 \mid y, x_1)$ to be unaffected by the bias. For
instance, even with $V = 1.54/100$ (recall that 1.54
was the standard deviation of $Y$ from the original
experiment), computation yields $P(H_0 \mid y, x_1) =
4.3 \times 10^{-5}$, compared with the P-value (and posterior
probability from the original experiment assuming
no bias) of $2.8 \times 10^{-7}$. There is a clear
lesson here; even very small suspicions of bias can
drastically alter a small P-value. Note that replication
1 is very consistent with the presence of no
bias, and so the posterior distribution for the bias
remains tightly concentrated near zero; for instance,
the mean of the posterior for $\beta$ is then
$7.2 \times 10^{-6}$, and the standard deviation is 0.25.
When we turned attention to replication 2, we
found that it did not seriously change the prior
perceptions of bias. Examination quickly revealed
the reason; even the maximum likelihood estimate
of the bias is no more than 1.4 standard deviations
from zero, which is not enough to change strong
prior beliefs. We, therefore, considered a third
experiment, defined in Table 1. Transforming to
approximate normality, as before, yields
$$X_3 \sim N(x_3 \mid \theta, 3.48),$$
with $x_3 = 22.72$ being the actual observation. The
maximum likelihood estimate of bias is now 3.95
standard deviations from zero, so there is potential
for a substantial change in opinion about the bias.
Sure enough, computation when $V = 1.54/100$
yields that $E[\beta \mid y, x_3] = -4.9$ with (posterior)
standard deviation equal to 6.62, which is a dramatic
shift from prior opinion (that $\beta$ is Cauchy(0,
1.54/100)). The effect of this is to essentially ignore
the original experiment in overall assessments of
evidence. For instance, $P(H_0 \mid y, x_3) = 3.81 \times 10^{-11}$,
which is very close to $P(H_0 \mid x_3) = 3.29 \times 10^{-11}$.
Note that, if $\beta$ were set equal to zero, the
overall posterior probability of $H_0$ (and P-value)
would be $2.62 \times 10^{-13}$.

TABLE 1
Frequency of heart attacks in replication 3

| | Heart attack | No heart attack |
|---|---|---|
| Aspirin | 5 | 2309 |
| Placebo | 54 | 2116 |

Thus Bayesian reasoning can reproduce the intu-
ition that replication which indicates bias can cast
considerable doubt on the original experiment,
while replication which provides no evidence of
bias leaves evidence from the original experiment
intact. Such behavior seems only obtainable, how-
ever, with flat-tailed priors for bias (such as the
Cauchy) that are very concentrated (in comparison
with the experimental standard deviation) near
zero.
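The bias computations in Example 2 can be approximated with a short numerical integration. The sketch below is not the authors' code: it assumes a flat prior on $\theta$, so that $y - x_i$ given $\beta$ is normal with mean $\beta$ and variance $\sigma_i^2 + 1.54^2$, and that $\theta$ given $\beta$ is normal with the usual precision-weighted mean. The function names, the tangent substitution and the grid size are illustrative choices.

```python
import math

SD_Y = 1.54  # standard deviation of Y in the original study


def phi(a: float) -> float:
    """Standard normal cdf, written with erfc so tiny lower tails stay accurate."""
    return 0.5 * math.erfc(-a / math.sqrt(2.0))


def bias_analysis(y, x, sd_x, scale=SD_Y / 100, n=400_000):
    """Posterior mean and sd of the bias beta, and P(H0 | y, x).

    The prior on beta is Cauchy(0, scale). Substituting beta = scale * tan(u)
    turns that prior into a uniform weight on u in (-pi/2, pi/2), so a plain
    midpoint rule in u is enough.
    """
    s2 = SD_Y**2 + sd_x**2  # variance of y - x given beta (flat prior on theta)
    root = math.sqrt(s2)
    h = math.pi / n
    w_sum = m1 = m2 = p0 = 0.0
    for k in range(n):
        beta = scale * math.tan(-math.pi / 2 + (k + 0.5) * h)
        w = math.exp(-((y - x - beta) ** 2) / (2.0 * s2))  # likelihood of beta
        # P(theta <= 0 | beta, y, x): standardized precision-weighted mean
        arg = -((sd_x / SD_Y) * (y - beta) + (SD_Y / sd_x) * x) / root
        w_sum += w
        m1 += w * beta
        m2 += w * beta * beta
        p0 += w * phi(arg)
    mean = m1 / w_sum
    sd = math.sqrt(m2 / w_sum - mean**2)
    return mean, sd, p0 / w_sum


y = 7.707
mean1, sd1, p1 = bias_analysis(y, 7.704, 4.82)  # replication 1
mean3, sd3, p3 = bias_analysis(y, 22.72, 3.48)  # replication 3
print(f"replication 1: sd = {sd1:.2f}, P(H0) = {p1:.2e}")
print(f"replication 3: E[beta] = {mean3:.2f}, sd = {sd3:.2f}, P(H0) = {p3:.2e}")
```

Under these assumptions the output should land close to the figures quoted in the text: a posterior for the bias that stays near zero after replication 1, and a large shift after replication 3.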
3. P-VALUES OR BAYES FACTORS?
Parapsychology experiments usually consider
testing of Ho: No parapsychological effect exists.
Such null hypotheses are often realistically repre-
sented as point nulls (see Berger and Delampady,
1987, for the reason that care must be taken in
such representation), in which case it is known that
there is a large difference between P values and
posterior probabilities (see Berger and Delampady,
1987, for review). The article by Jefferys (1990)
dramatically illustrates this, showing that a very
small P-value can actually correspond to evidence
for Ho when considered from a Bayesian perspec-
tive. (This is very related to the famous "Jeffreys"
paradox.) The argument in favor of the Bayesian
approach here is very strong, since it can be shown
that the conflict holds for virtually any sensible
prior distribution; a Bayesian answer can be wrong
if the prior information turns out to be inaccurate,
but a Bayesian answer that holds for all sensible
priors is unassailable.
Since P-values simply cannot be viewed as mean-
ingful in these situations, we found it of interest to
reconsider the example in Section 5 from a Bayes
factor perspective. We considered only analysis of
the overall totals, that is, x = 122 successes out of
$n = 355$ trials. Assuming a simple Bernoulli trial
model with success probability $\theta$, the goal is to test
$H_0: \theta = 1/4$ versus $H_1: \theta \neq 1/4$.
To determine the Bayes factor here, one must
specify $g(\theta)$, the conditional prior density on $H_1$.
Consider choosing $g$ to be uniform and symmetric,
that is,
$$g_r(\theta) = \begin{cases} \dfrac{1}{2r}, & \text{for } \tfrac{1}{4} - r \le \theta \le \tfrac{1}{4} + r, \\ 0, & \text{otherwise.} \end{cases}$$
Crudely, $r$ could be considered to be the maximum
change in success probability that one would expect
given that ESP exists. Also, these distributions are
the "extreme points" over the class of symmetric
unimodal conditional densities, so answers that hold
over this class are also representative of answers
over a much larger class. Note that here $r \le 0.25$
(because $0 \le \theta \le 1$); for the given data the $\theta > 0.5$
are essentially irrelevant, but if it were deemed
important to take them into account one could use
the more sophisticated binomial analysis in Berger
and Delampady (1987).
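For $g_r$, the Bayes factor of $H_1$ to $H_0$, which is to
be interpreted as the relative odds for the hypotheses
provided by the data, is given by
$$B(r) = \frac{\dfrac{1}{2r} \displaystyle\int_{1/4 - r}^{1/4 + r} \theta^{122} (1 - \theta)^{355 - 122}\, d\theta}{(1/4)^{122} (1 - 1/4)^{355 - 122}}
\approx \frac{1}{2r}\,(63.13) \left[\Phi\left(\frac{r - .0937}{.0252}\right) - \Phi\left(-\frac{r + .0937}{.0252}\right)\right].$$
This is graphed in Figure 1.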
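The normal-approximation form of $B(r)$ is easy to tabulate. The sketch below simply evaluates that expression with the constants printed in the text (63.13, .0937 and .0252); it does not recompute them from the binomial likelihood, and the function name is illustrative.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cdf


def bayes_factor(r: float) -> float:
    """Approximate Bayes factor of H1 to H0 for the uniform prior on [1/4 - r, 1/4 + r]."""
    if not 0.0 < r <= 0.25:
        raise ValueError("r must lie in (0, 0.25]")
    # Approximate likelihood mass falling inside the prior's support
    mass = PHI((r - 0.0937) / 0.0252) - PHI(-(r + 0.0937) / 0.0252)
    return 63.13 / (2.0 * r) * mass


for r in (0.05, 0.10, 0.15, 0.20, 0.25):
    print(f"r = {r:.2f}  B(r) = {bayes_factor(r):6.1f}")
```

For moderate values of $r$ the factor should fall roughly in the range of 100 to 200, which is the order of magnitude discussed in the text.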
The P-value for this problem was 0.00005, indicating
overwhelming evidence against $H_0$ from a
classical perspective. In contrast to the situation
studied by Jefferys (1990), the Bayes factor here
does not completely reverse the conclusion, showing
that there are very reasonable values of $r$ for
which the evidence against $H_0$ is moderately
strong, for example 100/1 or 200/1. Of course, this
evidence is probably not of sufficient strength to
overcome strong prior opinions against $H_0$ (one
obtains final posterior odds by multiplying prior
odds by the Bayes factor). To properly assess
strength of evidence, we feel that such Bayes factor
computations should become standard in parapsychology.
As mentioned by Professor Utts, Bayesian methods
have additional potential in situations such as
this, by allowing unrealistic models of iid trials to
be replaced by hierarchical models reflecting differing
abilities among subjects.
FIG. 1. The Bayes factor of $H_1$ to $H_0$ as a function of $r$, the
maximum change in success probability that is expected given
that ESP exists, for the ganzfeld experiment.
ACKNOWLEDGMENTS
M. J. Bayarri's research was supported in part
by the Spanish Ministry of Education and Science
under DGICYT Grant BE91-038, while visiting
Purdue University. James Berger's research was
supported by NSF Grant DMS-89-23071.
Comment
Ree Dawson is Senior Statistician, New England
Biomedical Research Foundation, and Statistical
Consultant, RFE/RL Research Institute. Her mailing
address is 177 Morrison Avenue, Somerville,
Massachusetts 02144.
This paper offers readers interested in statistical
science multiple views of the controversial history
of parapsychology and how statistics has contributed
to its development. It first provides an
account of how both design and inferential aspects
of statistics have been pivotal issues in evaluating
the outcomes of experiments that study psi abilities.
It then emphasizes how the idea of science as
replication has been key in this field in which
results have not been conclusive or consistent and
thus meta-analysis has been at the heart of the
literature in parapsychology. The author not only
reviews past debate on how to interpret repeated
psi studies, but also provides very detailed information
on the Honorton-Hyman argument, a nice
illustration of the challenges of resolving such debate.
This debate is also a good example of how
statistical criticism can be part of the scientific
process and lead to better experiments and, in general,
better science.
The remainder of the paper addresses technical
issues of meta-analysis, drawing upon recent research
in parapsychology for an in-depth application.
Through a series of examples, the author
presents a convincing argument that power issues
cannot be overlooked in successive replications and
that comparison of effect sizes provides a richer
alternative to the dichotomous measure inherent in
the use of p-values. This is particularly relevant
when the potential effect size is small and resources
are limited, as seems to be the case for psi
studies.
The concluding section briefly mentions Bayesian
techniques. As noted by the author, Bayes (or empirical
Bayes) methodology seems to make sense for
research in parapsychology. This discussion examines
possible Bayesian approaches to meta-analysis
in this field.
BAYES MODELS FOR PARAPSYCHOLOGY
The notion of repeatability maps well into the
Bayesian set-up in which experiments, viewed as a
random sample from some superpopulation of experiments,
are assumed to be exchangeable. When
subjects can also be viewed as an approximately
random sample from some population, it is appropriate
to pool them across experiments. Otherwise,
analyses that partially pool information according
to experimental heterogeneity need to be considered.
Empirical and hierarchical Bayes methods
offer a flexible modeling framework for such analyses,
relying on empirical or subjective sources to
determine the degree of pooling. These richer methods
can be particularly useful to meta-analysis of
experiments in parapsychology conducted under
potentially diverse conditions.
For the recent ganzfeld series, assuming them
to be independent binomially distributed as discussed
in Section 5, the data can be summed
(pooled) across series to estimate a common hit
rate. Honorton et al. (1990) assessed the homogeneity
of effects across the 11 series using a chi-square
test that compares individual effect sizes to
the weighted mean effect. The chi-square statistic
$\chi^2_{10} = 16.25$, not statistically significant ($p =
0.093$), largely reflects the contribution of the last
"special" series (contributes 9.2 units to the $\chi^2_{10}$
value), and to a lesser extent the novice series with
a negative effect (contributes 2.5 units). The outlier
series can be dropped from the analysis to provide a
more conservative estimate of the presence of psi
effects for this data (this result is reported in Section 5).
For the remaining 10 series, the chi-square
value $\chi^2_9 = 7.01$ strongly favors homogeneity, although
more than one-third of its value is due to
the novice series (number 4 in Table 1). This pattern
points to the potential usefulness of a richer
model to accommodate series that may be distinct
from the others. For the earlier ganzfeld data analyzed
by Honorton (1985b), the appeal of a Bayes or
other model that recognizes the heterogeneity
across studies is clear cut: $\chi^2_{23} = 56.6$, $p = 0.0001$,
where only those studies with common chance hit
rate have been included (see Table 2).
Historic reliance on voting-count approaches to
determine the presence of psi effects makes it natural
to consider Bayes models that focus on the
ensemble of experimental effects from parapsychological
studies, rather than individual estimates.
Recent work in parapsychology that compares effect
sizes across studies, rather than estimating
separate study effects, reinforces the need to examine
this type of model. Louis (1984) develops Bayes
and empirical Bayes methods for problems that
consider the ensemble of parameter values to be
the primary goal, for example, multiple comparisons.
For the simple compound normal model,
$Y_i \sim N(\theta_i, 1)$, $\theta_i \sim N(\mu, \tau^2)$, the standard Bayes
estimates (posterior means)
$$\theta_i^* = \mu + D(Y_i - \mu) \quad \text{and} \quad D = \frac{\tau^2}{1 + \tau^2},$$
where the $\theta_i$ represent experimental effects of interest,
are modified approximately to
$$\mu + \sqrt{D}\,(Y_i - \mu)$$
when an ensemble loss function is assumed. The
new estimates adjust the shrinkage factor $D$ so
that their sample mean and variance match the
posterior expectation and variance of the $\theta$'s. Similar
results are obtained when the model is generalized
to the case of unequal variances, $Y_i \sim N(\theta_i, \sigma_i^2)$.

TABLE 1
Recent ganzfeld series

| Series | Series type | N | Hit rate | $Y_i$ | $\sigma_i$ |
|---|---|---|---|---|---|
| 1 | Pilot | 22 | 0.36 | -0.58 | 0.44 |
| 2 | Pilot | 9 | 0.33 | -0.71 | 0.71 |
| 3 | Pilot | 36 | 0.28 | -0.94 | 0.37 |
| 4 | Novice | 50 | 0.24 | -1.15 | 0.33 |
| 5 | Novice | 50 | 0.36 | -0.58 | 0.30 |
| 6 | Novice | 50 | 0.30 | -0.85 | 0.31 |
| 7 | Novice | 50 | 0.36 | -0.58 | 0.30 |
| 8 | Novice | 6 | 0.67 | 0.71 | 0.87 |
| 9 | Experienced | 7 | 0.43 | -0.28 | 0.76 |
| 10 | Experienced | 50 | 0.30 | -0.85 | 0.31 |
| 11 | Experienced | 25 | 0.64 | 0.58 | 0.42 |
| | Overall | 355 | 0.34 | | |

TABLE 2
Earlier ganzfeld studies

| Study | N | Hit rate | $Y_i$ | $\sigma_i$ |
|---|---|---|---|---|
| 1 | 32 | 0.44 | -0.24 | 0.36 |
| 2 | 7 | 0.86 | 1.82 | 1.09 |
| 3 | 30 | 0.43 | -0.28 | 0.37 |
| 4 | 30 | 0.23 | -1.21 | 0.43 |
| 5 | 20 | 0.10 | -2.20 | 0.75 |
| 6 | 10 | 0.90 | 2.20 | 1.05 |
| 7 | 10 | 0.40 | -0.41 | 0.65 |
| 8 | 28 | 0.29 | -0.90 | 0.42 |
| 9 | 10 | 0.40 | -0.41 | 0.65 |
| 10 | 20 | 0.35 | -0.62 | 0.47 |
| 11 | 26 | 0.31 | -0.80 | 0.42 |
| 12 | 20 | 0.45 | -0.20 | 0.45 |
| 13 | 20 | 0.45 | -0.20 | 0.45 |
| 14 | 30 | 0.53 | 0.12 | 0.37 |
| 15 | 36 | 0.33 | -0.71 | 0.35 |
| 16 | 32 | 0.28 | -0.94 | 0.39 |
| 17 | 40 | 0.28 | -0.94 | 0.35 |
| 18 | 26 | 0.46 | -0.16 | 0.39 |
| 19 | 20 | 0.60 | 0.41 | 0.46 |
| 20 | 100 | 0.41 | -0.36 | 0.20 |
| 21 | 40 | 0.33 | -0.71 | 0.34 |
| 22 | 27 | 0.41 | -0.36 | 0.39 |
| 23 | 60 | 0.45 | -0.20 | 0.26 |
| 24 | 48 | 0.21 | -1.33 | 0.35 |
| Total | 722 | | | |

For the above model, the fraction of the modified
estimates above (or below) a cut point $C$ is a
consistent estimate of the fraction of $\theta_i > C$
(or $\theta_i < C$). Thus, the use of ensemble, rather than
component-wise, loss can help detect when
individual effects are above
a specified threshold by chance. For the meta-analysis
of ganzfeld experiments, the observed binomial
proportions transformed on the logit (or
$\arcsin\sqrt{\cdot}$) scale can be modeled in this framework.
Letting $d_i$ and $m_i$ denote the number of direct hits
and misses respectively for the $i$th experiment, and
$p_i$ as the corresponding population proportion of
direct hits, the $Y_i$ are the observed logits
$$Y_i = \log(d_i / m_i)$$
and $\sigma_i^2$, estimated by maximum likelihood as
$1/d_i + 1/m_i$, is the variance of $Y_i$ conditional on
$\theta_i = \operatorname{logit}(p_i)$. The threshold $\operatorname{logit}(0.25) = -1.10$ can
be used to identify the number of experiments for
which the proportion of direct hits exceeds that
expected by chance.
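The logit transformation and its variance estimate are simple to compute. The sketch below uses hypothetical counts of 12 hits and 38 misses, which is what a hit rate of 0.24 in 50 trials would imply; the counts themselves are inferred from rounded figures and are an assumption.

```python
from math import log, sqrt


def logit_and_sd(hits: int, misses: int) -> tuple[float, float]:
    """Observed logit Y = log(d / m) and its estimated sd, sqrt(1/d + 1/m)."""
    return log(hits / misses), sqrt(1.0 / hits + 1.0 / misses)


CHANCE_LOGIT = log(0.25 / 0.75)  # logit of the chance hit rate, about -1.10

# Assumed counts: 12 direct hits and 38 misses (50 trials, hit rate 0.24)
y, sd = logit_and_sd(12, 38)
below_chance = y < CHANCE_LOGIT

print(round(y, 2), round(sd, 2), round(CHANCE_LOGIT, 2), below_chance)
```

An observed logit below the chance threshold, as in this case, flags a series whose raw hit rate is lower than the chance rate of 1/4.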
Table 1 shows $Y_i$ and $\sigma_i$ for the 11 ganzfeld
series. All but one of the series are well above the
threshold; $Y_4$ marginally falls below $-1.10$. Any
shrinkage toward a common hit rate will lead to an
estimate of $\theta_4$, whether the standard Bayes estimate
$\theta_4^*$ or its ensemble-loss counterpart, above the
threshold. The use of
ensemble loss (with its consistency property) provides
more convincing support that all $\theta_i > -1.10$,
although posterior estimates of uncertainty are
needed to fully calibrate this. For the earlier
ganzfeld data in Table 2, ensemble loss can similarly
be used to determine the number of studies
with $\theta_i < -1.10$ and specifically whether the negative
effects of studies 4 and 24 ($Y_4 = -1.21$
and $Y_{24} = -1.33$) occurred as a result of chance
fluctuation.
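The difference between component-wise and ensemble shrinkage in the simple compound normal model can be illustrated in a few lines. This is a sketch only: it applies the unit-variance formulas to the Table 1 logits purely for illustration, and the values of $\mu$ and $\tau^2$ are arbitrary assumptions rather than fitted quantities.

```python
from math import sqrt
from statistics import mean, pvariance


def shrink(ys, mu, tau2, ensemble=False):
    """Shrink each Y_i toward mu under Y_i ~ N(theta_i, 1), theta_i ~ N(mu, tau2)."""
    d = tau2 / (1.0 + tau2)              # shrinkage factor D
    factor = sqrt(d) if ensemble else d  # ensemble loss uses sqrt(D) in place of D
    return [mu + factor * (y - mu) for y in ys]


# Logits Y_i as listed in Table 1; mu and tau2 are illustrative assumptions.
ys = [-0.58, -0.71, -0.94, -1.15, -0.58, -0.85, -0.58, 0.71, -0.28, -0.85, 0.58]
mu, tau2 = mean(ys), 1.0

bayes = shrink(ys, mu, tau2)
ens = shrink(ys, mu, tau2, ensemble=True)

# The spread of the posterior means is smaller than that of the ensemble
# estimates by exactly the factor D.
ratio = pvariance(bayes) / pvariance(ens)
print(round(ratio, 3))
```

Because the ensemble estimates are pulled less strongly toward $\mu$, counts of estimates beyond a cut point such as $-1.10$ are less distorted by over-shrinkage than counts based on the posterior means.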
Features of the ganzfeld data in Section 5, such
as the outlier series, suggest that further elaboration
of the basic Bayesian set-up may be necessary
for some meta-analyses in parapsychology. Hierarchical
models provide a natural framework to specify
these elaborations and explore how results
change with the prior specification. This type of
sensitivity analysis can expose whether conclusions
are closely tied to prior beliefs, as observed by
Jefferys for RNG data (see Section 7). Quantifying
the influence of model components deemed to be
more subjective or less certain is important to broad
acceptance of results as evidence of psi performance
(or lack thereof).
Consider the initial model commonly used for
Bayesian analysis of discrete data:
$$Y_i \mid p_i, n_i \sim B(p_i, n_i),$$
$$\theta_i \sim N(\mu, \tau^2), \quad \theta_i = \operatorname{logit}(p_i),$$
with noninformative priors assumed for $\mu$ and $\tau^2$
(e.g., $\log \tau$ locally uniform). The distinctiveness of
the last "special" series and, in general, the different
types of series (pilot versus formal, novice versus
experienced) raises the question of whether the
experimental effects follow a normal distribution.
Weighted normal plots (Ryan and Dempster, 1984)
can be used to graphically diagnose the adequacy of
second-stage normality (see Dempster, Selwyn and
Weeks, 1983, for examples with binary response
and normal superpopulation).
Alternatively, if nonnormality is suspected, the
model can be revised to include some sort of heavy-tailed
prior to accommodate possibly outlying series
or studies. West (1985) incorporates additional
scale parameters, one for each component of the
model (experiment), that flexibly adapt to atypical
$\theta_i$ and discount their influence on posterior
estimates, thus avoiding under- or over-shrinkage
due to such $\theta_i$. For example, the second stage
can specify the prior as a scale mixture of normals:
$$\theta_i \sim N(\mu, \tau^2 \gamma_i^{-1}),$$
$$k\gamma_i \sim \chi^2_k,$$
$$\nu\tau^{-2} \sim \chi^2_\nu.$$
This approach for the prior is similar to others for
maximum likelihood estimation that modify the
sampling error distribution to yield estimates that
are "robust" against outlying observations.
Like its maximum likelihood counterparts, in addition
to the robust effect estimates $\theta_i^*$, the Bayes
model provides (posterior) scale estimates of the $\gamma_i$. These
can be interpreted as the weight given to the data
for each $\theta_i$ in the analysis and are useful to diagnosing
which model components (series or studies)
are unusual and how they influence the shrinkage.
When more complex groupings among the $\theta_i$ are
suspected, for example, bimodal distribution of
studies from different sites or experimenters, other
mixture specifications can be used to further relax
the shrinkage toward a common value.
For the 11 ganzfeld series, the last "outlier"
series, quite distinct from the others (hit rate =
0.64), is moderately precise ($N = 25$). Omitting it
from the analysis causes the overall hit rate to drop
from 0.344 to 0.321. The scale mixture model is a
compromise between these two values (on the logit
scale), discounting the influence of series 11 on the
estimated posterior common hit rate used for
shrinkage. The scale factor $\gamma_{11}$, an indication of
how separate $\theta_{11}$ is from the other parameters, also
causes $\theta_{11}$ to be shrunk less toward the common hit
rate than other, more homogeneous $\theta_i$, giving more
weight to individual information for that series (see
West, 1985). The heterogeneity of the earlier
ganzfeld data is more pronounced, and studies are
taken from a variety of sources over time. For these
data, the $\gamma_i$ can be used to explore atypical studies
(e.g., study 6, with hit rate = 0.90, contributes more
than 25% to the $\chi^2_{23}$ value for homogeneity) and
groupings among effects, as well as protect the
analysis from misspecification of second-stage
normality.
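The effect of the last series on the pooled hit rate can be reproduced from the tabulated series sizes and hit rates. In this sketch the hit counts are not taken from the source; they are recovered by rounding trials times the reported rate, which is an assumption.

```python
# (trials, reported hit rate) for the 11 recent ganzfeld series
series = [(22, 0.36), (9, 0.33), (36, 0.28), (50, 0.24), (50, 0.36), (50, 0.30),
          (50, 0.36), (6, 0.67), (7, 0.43), (50, 0.30), (25, 0.64)]

# Assumption: each hit count is trials * rate rounded to the nearest integer.
hits = [round(n * rate) for n, rate in series]
trials = [n for n, _ in series]

pooled_all = sum(hits) / sum(trials)                     # all 11 series
pooled_without_last = sum(hits[:-1]) / sum(trials[:-1])  # series 11 omitted

print(sum(hits), round(pooled_all, 3), round(pooled_without_last, 3))
```

The two pooled rates bracket the common hit rate toward which a scale mixture model would shrink, since that model discounts the last series without discarding it.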
Variation among ganzfeld series or studies and
the degree to which pooling or shrinking is appropriate
can be investigated further by considering a
range of priors for $\tau^2$. If the marginal likelihood of
$\tau^2$ dominates the prior specification, then results
should not vary as the prior for $\tau^2$ is varied. Otherwise,
it is important to identify the degree to which
subjective information about interexperimental
variability influences the conclusions. This sensitivity
analysis is a Bayesian enrichment of
the simpler test of homogeneity directed toward
determining whether or not complete pooling is
appropriate.
To assess how well heterogeneity among historical
control groups is determined by the data,
Dempster, Selwyn and Weeks (1983) propose three
priors for $\tau^2$ in the logistic-normal model. The prior
distributions range from strongly favoring individ-
ual estimates, p(r2)dr