JPRS ID: 10323 TRANSLATION SPEECH, EMOTIONS AND PERSONALITY ED. BY V.I. GALUNOV
Document Type:
Collection:
Document Number (FOIA) /ESDN (CREST):
CIA-RDP82-00850R000500030032-8
Release Decision:
RIF
Original Classification:
U
Document Page Count:
212
Document Creation Date:
November 1, 2016
Sequence Number:
32
Case Number:
Content Type:
REPORTS
File:
Attachment | Size |
---|---|
CIA-RDP82-00850R000500030032-8.pdf | 12.66 MB |
Body:
APPROVED FOR RELEASE: 2007/02/09: CIA-RDP82-00850R000500030032-8
FOR OFFICIAL USE ONLY
JPRS L/10323
12 February 1982
Translation
SPEECH, EMOTIONS AND PERSONALITY
Ed. by
V.I. Galunov
FBIS FOREIGN BROADCAST INFORMATION SERVICE
NOTE
JPRS publications contain information primarily from foreign
newspapers, periodicals and books, but also from news agency
transmissions and broadcasts. Materials from foreign-language
sources are translated; those from English-language sources
are transcribed or reprinted, with the original phrasing and
other characteristics retained.
Headlines, editorial reports, and material enclosed in brackets
are supplied by JPRS. Processing indicators such as [Text]
or [Excerpt] in the first line of each item, or following the
last line of a brief, indicate how the original information was
processed. Where no processing indicator is given, the
information was summarized or extracted.
Unfamiliar names rendered phonetically or transliterated are
enclosed in parentheses. Words or names preceded by a
question mark and enclosed in parentheses were not clear in the
original but have been supplied as appropriate in context.
Other unattributed parenthetical notes within the body of an
item originate with the source. Times within items are as
given by source.
The contents of this publication in no way represent the
policies, views or attitudes of the U.S. Government.
COPYRIGHT LAWS AND REGULATIONS GOVERNING OWNERSHIP OF
MATERIALS REPRODUCED HEREIN REQUIRE THAT DISSEMINATION
OF THIS PUBLICATION BE RESTRICTED FOR OFFICIAL USE ONLY.
SPEECH, EMOTIONS AND PERSONALITY
Leningrad RECH', EMOTSII I LICHNOST' in Russian 1978 (signed to press
19 Jul 78) pp 1-198
[Proceedings and reports of an all-union symposium, 27-28 February
1978, edited by V.I. Galunov, Order of Lenin USSR Academy of Sciences,
Scientific Council on Integrated Problems of Human and Animal
Physiology, and the Combined Scientific Council on the Integrated Problem
"Physical and Technical Acoustics," 500 copies]
CONTENTS
Annotation 1
Speech, Emotions and Personality: Problems and Prospects
(V. I. Galunov) 2
The Problem of Classifying Emotional States in Light of the Information
Theory of Emotions
(P. V. Simonov) 11
Linguistic Invariability and Individual Variability
(L. V. Bondarko, V. G. Shchukin) 16
Extralinguistic Signals and the Properties and States of the Individual
(V. Kh. Manerov) 22
Using Symmetrical Biologically Active Points to Monitor Changes in
Human Psychophysiological State
(A. S. Abduakhadov, V. I. Galunov) 32
Analysis of Voice as a Source of Information on Properties of the
Speaker
(V. I. Alekseyev, et al.) 36
The Semantic Space of Ideas Associated With Emotionally Colored
Speech
(Ye. F. Bazhin, G. A. Krylova) 42
A Package of Tests To Study Perception of Emotional Speech
(A. V. Beskadarov, et al.) 47
Analysis of the Variability of the Melodic Contours of Speech
(A. V. Beskadarov, V. I. Galunov) 51
Significance of Prosodic and Spectral Parameters of Spoken Signals
Expressing Different Emotional States
(L. P. Blokhina, T. G. Gomina) 54
Information Content of the Timbre Characteristics of Speech
(A. P. Varfolomeyev) 60
The Significance of Personal Meanings To Realization of the Physical
Characteristics of a Spoken Statement (According to Clinical
Observations)
(Ye. N. Vinarskaya, et al.) 65
Speech Recognition System Recognizes Speakers by Voice
(T. K. Vintsyuk, et al.) 68
Emotionality of the Personality as Related to Psychophysiological and
Speech Characteristics
(N. V. Vitt, L. V. L. B. Yermolayeva-Tomina) 74
Variability of Speech Tempos
(L. T. Vygonnaya) 77
Mutual Correlation Between Personal and Speech Characteristics in an
Emotionally Tense Situation
(S. S. Galagudze, G. V. Nikolayeva) 80
Using Speech Characteristics To Monitor Emotional State in Children
(V. I. Galunov, et al.) 86
Effect of Individual and Emotion-Dependent Changes in Parameters of
the Articulatory Tract on Characteristics of the Spoken Signal
(V. I. Galunov, et al.) 89
Formant Frequency as an Index of Voice Individuality
(V. B. Gitlin) 96
Effect of Different Emotional States on Change in the Spectrum of
English Vowels
(T. G. Gomina) 100
Acoustic Organization of Speech as One of the Means of Its Emotional
Coloration
(A. P. Zhuravlev) 103
Using Speech Characteristics To Evaluate Individual Personality
Features
(I. S. Zamaletdinov, R. B. Bogdashevskiy) 108
Acoustic Characteristics of the Phonetic Word in Different Types of
Emotionally Organized Texts
(L. V. Zlatoustova, M. V. Khitina) 113
Dynamics of Emotional Tension in a Situation Involving an Anticipated
Outcome of Variable Probability
(S. L. Zysin) 116
Perception of Spoken Information on a Noise Background by Listeners
in a State of Sensory Monotony
(M. N. Il'ina, I. M. Lushchikhina) 119
Analysis of the Fundamental Tone of Impersonated Speech
(N. P. Kazantseva, et al.) 122
Possibility of Studying Emotionally Colored Speech by the Segmentation
Method
(N. G. Kamyshnaya) 124
Evaluation of Speech by Listeners Experiencing Different States
(Yu. A. Katygin) 127
Some Factors Defining the Accuracy of a Listener's Evaluation of
Emotional States
(T. V. Korneva) 131
Using the Semantically Contrasting Pairs Method To Evaluate
Professional Qualities of an Actor's Voice
(A. N. Kunitsyn, V. I. Tarasov) 135
Determination of Emotional States on the Basis of Semantic and
Temporal Characteristics of Speech
(M. V. Lasko, Zh. I. Rezvitskaya) 138
Recognition of Operator State by Masking Characteristics of the Spoken
Signal
(V. G. Lebedev) 141
A Device Measuring Emotional Arousal-Inhibition
(V. Kh. Manerov, et al.) 144
Information Content of the Emotional Characteristics of Speech
(V. L. Marishchuk) 147
Characteristics of Acoustic Resources for Expressing Emotions in Vocal
Speech, and Some General Aspects of the Problem of the 'Language of
Emotions'
(V. P. Morozov) 150
Remote Control of Operator State in Connection With the Objectives of
Expert Certification
(V. I. Myasnikov, et al.) 155
Variability of the Averaged Spectrum of Vowel Sounds in Quiet, Normal
and Loud Speech
(A. V. Nikonov) 158
Possibilities for Evaluating Intensity of a Speaker's Emotional Tension
on the Basis of Changes in Characteristics of His Speech
(E. L. Nosenko) 162
Some Flowcharts for Analysis of the State of an Individual on the
Basis of Characteristics of His Speech
(E. L. Nosenko, et al.) 165
Some Characteristics of Emotional Whispered Speech
(E. A. Nushikyan, et al.) 169
Characteristics of Human Speech Behavior in Stressful Conditions
(V. A. Popov, et al.) 173
The Role of Verbal Presentation of Material in One Method of Diagnosing
the Emotional Characteristics of an Individual
(I. A. Popova) 177
Components of the Temporal Emotive Characteristic
(R. K. Potapova) 181
A Method of Subjective Analysis of Emotionally Colored Speech
(L. V. Stat'yeva) 184
A Method for Describing Individual Properties of Voices Employing
Analysis of Spoken Signal Spectrums and Bands
(V. D. Serdyukov) 187
Analysis of the Variability of Vowel Formant Composition Associated
with Change in Palate Shape
(A. I. Tarasov, et al.) 192
Some Results of Research on Intonational Characteristics of Principal
Stressed Vowel Sounds of Emotional Speech
(V. L. Taubkin) 194
Analysis of Adaptive Mechanisms of the Articulatory Organ Control
System
(A. A. Fedorov, et al.) 199
An Algorithm for Recognizing Emotional States of Speakers on the Basis
of Stressed Vowel Sounds
(M. V. Frolov, et al.) 202
ANNOTATION
This collection contains the abridged texts of reports and communications presented
at the all-union symposium "Speech, Emotions and Personality" held in Leningrad in
February 1978. The symposium was convened on the initiative of the speech sections
of the USSR Academy of Sciences Scientific Council for Complex Problems of Human
and Animal Physiology and the USSR Academy of Sciences Scientific Council for the
Complex Problem "Physical and Technical Acoustics." It was prompted by growing
interest in problems associated with analyzing the variability
of spoken communication arising under the influence of the individual features of
the speaker and changes in his emotional state.
Reports dealing with the following directions were discussed at the symposium:
the dependence of speech characteristics on the personality properties of the speaker;
the dependence of vocal manifestations of emotion on the personality characteristics
of the speaker;
simulation of emotional and individual variability of speech.
SPEECH, EMOTIONS AND PERSONALITY: PROBLEMS AND PROSPECTS
V. I. Galunov
Researchers in linguistics have traditionally been interested only in the first
element of the classical three-element formula: "What is being said? By whom is it
being said? In what state is it being said?" However, interest in the other two
aspects of the spoken signal has noticeably increased in recent years. There are
two reasons for this. First, a number of applied problems have arisen that require
determining the personality and the state of the speaker on the basis
of his spoken signals. Second, most experts in spoken communication have recognized
the inseparability of the three indicated aspects of the spoken signal, and the need
to analyze the signal in all the complexity of this indivisible trinity, even
when confronted by a classical problem which might appear to be simple--automatic
recognition of speech. What is meant by indivisible is that, as a rule, both semantic
information and information about the speaker's individuality and his state are
encoded by the same parameters of the spoken signal.
This paper examines the complex of problems facing researchers attempting to establish
a relationship between the characteristics of the spoken signal and the speaker's
personality and state.
The Emotive and Indicative Function of the Speech System
The first problem, of course, is to formulate the task itself accurately, to be clear
about what we wish to find. Let us attempt to do so by starting with the sufficiently
general scheme of communication represented in Shannon's well known scheme (see
Figure 1). In the case of spoken communication, the information source is said to
be some central cerebral mechanisms shaping the content and structure of a statement,
the encoding unit is the articulatory system that transforms a statement into acoustic
form through the movements of speech forming organs, the decoding unit is the organ
of hearing, which translates the acoustic signal into a neural code "comprehensible"
to the brain, and the information receiver is once again represented by central
mechanisms responsible for comprehension of the meaning of speech, mechanisms which
extract behaviorally useful information from this code. From the standpoint of
ensuring maximum resistance to interference for the spoken communication system and
simplicity of the speech decoding system, it would be desirable to make the encoding
system constant--that is, to make the articulatory tracts of all people identical, and
to ensure constancy of these characteristics with respect to time in everyone.
Obviously, this is not so in fact. The parameters of the speech forming system vary
within broad limits both from one person to another and in a given person depending
on a number of causes, particularly his psychophysiological and emotional state.
The presence of this variability alone is enough to bring about individual and
emotional peculiarities in speech. It is clear in this case that within the framework
of the communication model examined here, in which the main goal of the communication
system is to transmit semantic information, individual and emotional features
manifest themselves only as additional variations in semantic parameters.
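The point of this paragraph can be illustrated with a toy sketch. It assumes, purely for illustration, that a speech parameter is the sum of a semantic target, a stable speaker-dependent offset and a state-dependent offset. The additive form, the `Encoder` class and all numeric values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Encoder:
    """Toy 'encoding unit' whose output depends on the speaker and on his state."""
    speaker_offset: float        # hypothetical stable individual shift
    state_offset: float = 0.0    # hypothetical shift caused by the current state

    def encode(self, semantic_targets):
        # A single parameter track carries all three kinds of information at once.
        return [t + self.speaker_offset + self.state_offset for t in semantic_targets]


message = [1.0, 3.0, 2.0]  # hypothetical semantic parameter values

calm = Encoder(speaker_offset=0.5)
aroused = Encoder(speaker_offset=0.5, state_offset=0.25)

calm_signal = calm.encode(message)
aroused_signal = aroused.encode(message)

# For the receiver, the deviation from the semantic targets is the only trace of
# the speaker and of his state, and the two contributions arrive summed together.
deviation = [s - t for s, t in zip(aroused_signal, message)]
```

In this sketch the receiver cannot tell from `deviation` alone how much is due to the speaker and how much to his state, which is one way to read the "indivisibility" noted earlier.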
Figure 1. [Block diagram of Shannon's communication scheme; diagram not reproduced]
Key:
1. Information source
2. Encoding unit
3. Communication channel
4. Decoding unit
5. Information receiver
Figure 2. [Block diagram of Shannon's communication scheme with the environment added; diagram not reproduced]
Key:
1. Environment
2. Information source
3. Encoding unit
4. Communication channel
5. Decoding unit
6. Information receiver
Let us examine a model of greater complexity. Following Buhler (1), let us add the
environment to Shannon's communication scheme (see Figure 2). This addition is not
as primitive as might appear at first glance. Besides introducing the "environment,"
we implicitly presuppose the isolation, from this environment, of some object connected
with the information source, the range of action of which is not reducible exclusively
to the information contained in communications but includes a broader complex of
behavior. The concept of information receiver is broadened in similar fashion as
well. Such broadening permits us to examine communication not only in its technical
and organizational but also its functional plane--that is, to analyze why
communication might occur, given sufficiently general hypotheses concerning the interaction of
the communicating partners between themselves and with the environment. It would be
impossible here to provide a detailed analysis of this system and to examine all
functions of a communication system which may be discerned within the framework of
the latter (concerning this topic, see (2-3)). We will examine only three functions
of interest to us from the standpoint of the topic at hand. The first, informative
function of a communication system is to transmit information about the environment.
This of course is the main function, and in Shannon's scheme, in which everything
except the ideal (that is, not varying and not having any functions other than
communication) information source and receiver is defined as the environment, this
is also the sole function.
The second function is emotive--the function of transmitting information about the
internal state of the source. The third is the indicative function, indicating the
individuality and the group or social status of the participants of communication.
Clearly within the framework of the topic at hand, we are mainly interested in the
emotive and indicative functions. But we cannot simply ignore the informative
function, since all processes associated with realization of the emotive and indicative
functions proceed on the background of processes supporting this main function.
It should be noted that the two functions of interest to us are supported in three
ways. First, through verbal expressions ("I do not feel well," "I am happy," "My
name is Ivanov," "I am your chief" etc.). Second, by nonverbal sounds (laughter,
weeping, groaning etc.). Third, by variation of speech parameters (changes in
loudness, in the characteristics of the principal tone, in rhythm and pitch, in the
structure of the statements etc.). We should probably exclude the first way from our
examination right away, since it is fully identifiable as the means for realizing
the informative function, and it may be analyzed successfully within the framework of
the classical methods of linguistics, automatic speech recognition and so on. The
second of these ways is characterized by a rather narrow functional range, and it
would probably elicit only limited interest. Thus our main attention should be
concentrated on the third way of supporting the emotive and indicative functions; it will
be the main topic of discussion below.
There is one more rather fine distinction between two levels in the support of emotive
and indicative functions. These two levels can be seen more clearly in the emotive
function. The first level is represented by changes in speech parameters that the
speaker is aware of and which yield to his control (practically all such
parameters can intentionally transmit indications of excitement, calmness, dissatisfaction
and so on). Certain situations may require a certain style of speech in order to
realize the emotive and indicative functions (an "entreating" or "commanding" voice).
The second level is represented by realization of these functions through
uncontrollable changes in speech characteristics. Clearly these two levels overlap to a
significant extent, but they also possess their separate elements (4). As a rule
the level of controllable manifestations is beyond the interest of researchers
dealing with applied problems. However, it should be considered that in extreme
conditions a normally controllable system supporting one of the functions may also
go into action automatically.
Thus examination of the communication schemes adopted here leads to a rather narrow
problem: analysis of uncontrollable changes in the structure of spoken
communications supporting the emotive and indicative functions of the communication system.
Returning to the initial scheme, we should make two more remarks of a methodological
nature. First, differences in both the coding system (anatomical and physiological
features or changes in the speech forming system) and in the information source itself
(which would basically lead to change in the structure of a statement and not in its
acoustic characteristics) can serve as a source of individual and emotional
variability of speech. This allows us to define two pathways for analyzing variability,
which may be called biophysical and psycholinguistic. Second, two approaches
to analyzing the relationship of speech to emotional and personality characteristics
are possible. The first boils down to initially analyzing the variability of speech
and subsequently searching for individual and emotional characteristics eliciting
this variability. The second boils down to initially isolating the personality or
emotional characteristics of interest to the researcher and then searching for their
correlates in speech.
The Alphabet of Emotional States
One of the fundamental difficulties the researcher encounters in analysis of the
emotional variability of speech is the absence of a satisfactory theory of emotions.
The researcher in linguistics faces the problem of determining what it is he wishes
to find reflected in a spoken signal--a problem outside his competency. A purely
pragmatic approach is possible in principle: A list of states and the means of their
formation can be given from without, and then the linguist solves the purely applied
problem. Clearly the research would then lose generality of application.
Nevertheless we could try to draw up a more general list of states using some
particular conception. One such general and promising conception is P. V. Simonov's
theory (see the present collection). We will examine two other approaches that also
permit us to draw up sensible lists of states.
The first is associated with the well known classification of emotions suggested by
Wundt (5), who characterized states in relation to three sets of properties: positive-
negative, arousal-inhibition, anger-fear. This old conception has recently enjoyed
support among authors using the so-called method of the semantic differential or,
in other words, the method of semantically opposite pairs (6). It has been found
that upon its perception, every stimulus is evaluated on the basis of a limited
number of affective characteristics. According to Osgood there are three such
characteristics (general evaluation, activeness, strength), ones which are in full
agreement with Wundt's system. We distinguish four independent characteristics:
1) general evaluation ("good-bad"), 2) activeness ("active-passive"), 3) degree of
domination ("suppressive-subordinate," "strong-weak"), 4) degree of predictability
("commonplace-odd," "stable-changeable"). The capability of the sensory system for
determining the values of these characteristics of all stimuli may be called its
evaluation function. In principle we can hypothesize that emotions are a certain
subjective behavioral equivalent of a general evaluation of a life situation in
the given moment, or simply of an evaluation of internal state. In this case we can
now draw up the final list of states. Thus in our four-dimensional model the extreme
states on the individual axes would be as follows, given that the states assume
neutral values on the other axes:
1. Pleasure-repulsion,
2. Arousal-inhibition,
3. Anger-fear,
4. Interest (attention)-indifference.
This list may be lengthened in due course by considering different combinations of
the values of the evaluation function in relation to all four dimensions.
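How the list could be lengthened by combining values on the four axes can be sketched as follows. The sketch assumes that each axis takes only three levels (negative pole, neutral, positive pole); this quantization, the short axis names and the `label` helper are illustrative assumptions, not part of the author's model. The pairing of poles follows the order of the list above.

```python
from itertools import product

# Axis name -> (negative pole, positive pole), following the list above.
AXES = {
    "general evaluation": ("repulsion", "pleasure"),
    "activeness": ("inhibition", "arousal"),
    "domination": ("fear", "anger"),
    "predictability": ("indifference", "interest"),
}

LEVELS = (-1, 0, 1)  # assumed coarse quantization of the evaluation function


def label(point):
    """Name a point by its non-neutral coordinates; all-neutral is 'neutral'."""
    parts = []
    for (neg, pos), value in zip(AXES.values(), point):
        if value:
            parts.append(pos if value > 0 else neg)
    return " + ".join(parts) or "neutral"


# Every combination of levels over the four axes.
all_points = list(product(LEVELS, repeat=len(AXES)))

# Extreme states on one axis with neutral values on the other three.
extremes = [p for p in all_points if sum(v != 0 for v in p) == 1]
extreme_labels = sorted(label(p) for p in extremes)
```

Under this three-level assumption the four axes give 81 combinations, of which 8 are the single-axis extremes named in the list; a combined state such as `label((0, 1, -1, 0))` reads "arousal + fear".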
The second approach to compiling the alphabet of states is for practical purposes
a generalization of the purely pragmatic position in which the list of emotions
subject to analysis is determined by the applied problem facing the researcher.
Is there some way to represent all such problems and correspondingly list all states
that are of interest from a practical point of view? It may be hypothesized that
all emotional states having significance in practical life should be contained in a
dictionary. On this basis we examined a dictionary of the Russian language, and we
wrote down all terms which define emotional state to one degree or another. There
were more than 500 such terms in all. Obviously most of them were synonymous.
An analysis of a thesaurus produced 22 sets of terms defining different states.
Each of these sets could be represented by a most typical and most widely used word:
1. indifference, 2. calmness, 3. concentration, 4. tension, 5. tiredness, 6. anxiety,
7. doubt, 8. embarrassment, 9. excitement, 10. inspiration, 11. frenzy, 12. joy
(pleasure), 13. grief, 14. despair, 15. anger, 16. fright, 17. shock, 18. depression,
19. aggressiveness, 20. satisfaction, 21. revulsion, 22. melancholy.
As with the list of states obtained by the first method, this list is somewhat
extensive from a practical point of view, but on the other hand it probably covers
all states of interest to the researcher. To conclude, here is an abbreviated list
of emotions which from the author's standpoint represent all states of interest in
the applied aspect: 1. joy, 2. grief, 3. excitement, 4. depression, 5. rage, 6. fear,
7. apathy, 8. the norm.
Classification of Personality Characteristics
Despite the great amount of confusion about the problem of classifying personality
characteristics, the researcher in linguistics is in a somewhat better position
here than with emotions (perhaps precisely owing to this excessive confusion). First
of all it is clear that a speaker's identification by voice depends on a number of
anatomical and physiological features of the speech forming system of the given
individual, features which are typical of him alone and which make it possible to
distinguish him from all others. Because in this case the matter boils down to
studying just the spoken signal and the means of its generation, the researcher in
linguistics remains within his element. On the other hand it is obvious that the
spoken signal also reflects a number of group characteristics, though once again
associated basically with the anatomical and physiological features of individual
FOR OFFICIAL USE ONLY
APPROVED FOR RELEASE: 2007/02/09: CIA-RDP82-00850R000500030032-8
APPROVED FOR RELEASE: 2007/02/09: CIA-RDP82-00850R000500030032-8
� FOR OFFIC7AL USE ONLY
groups (based on sex, age and so on). The obvious list of such group characteristics
is not very long, though it would be difficult to imagine a definite procedure by
which it could be compiled.
There is a rather vast literature on the psychology of the personality. Irrespective
of which classification of personality traits we might decide to adopt,
it would be rather difficult to expect these traits to be reflected in the spoken
signal, except for those associated with emotionality, and even more likely with the
emotional reactivity of the speaker (such as, for example, general activeness,
impulsiveness, emotional activeness and so on, as defined by Guilford (7)).
Concluding this section, we should note yet another reason why researchers in
linguistics show little interest in a classification of individual personality traits.
The fact is that the principal applied problems in analyzing the individual
characteristics of speech involve identifying the speaker as such--
that is, determining which concrete individual uttered a particular passage of
speech (this is true of criminology, systems limiting access to documents and
facilities on the basis of speech patterns, military intelligence and so on (8)).
It is only recently that problems requiring quick testing of group psychological
traits of the personality have come into being.
The Model of Emotional States, and Their Control
One of the principal stages of research on the emotional variability of spoken
signals is selection of a sufficiently convenient experimental model simulating
manifestation of emotional reactions in natural conditions. This choice becomes
necessary when it is practically impossible to record speech accompanying naturally
experienced emotions. An emotional state model should: be based on a broad range
of subjects so as to ensure sufficient statistical significance; ensure acquisition
of the required speech reactions; provide a possibility for obtaining a broad
spectrum of emotional states; permit current evaluation or control of the state of the
informant.
These requirements are satisfied most simply in the model provided by acting. Of
course, we do need to mention some negative traits of this model right off. First of
all, when reflected by an actor, all states, even the most emotional ones, proceed on
the positive background of an actor's inspiration. Owing to this the model would
produce the most authentic results when positive emotions are involved. Moreover
rather than experiencing an emotion, an actor more than likely simulates some traits
of its manifestation in ordinary life, or he may even use a certain stereotype to
symbolize an emotional state. This makes it difficult to distinguish controllable
from uncontrollable variations in the spoken signal (specifically, they are all
controllable for the actor). Whether or not purely specific traits, inherent only to
theatrical speech and performing an emotive function, are present in an actor's
simulation of emotion is not very clear. No such traits have been revealed as yet. The
sensory system of the audience may be influenced by music or color as supplementary
means for deepening the emotional state simulated by the actor. The model of
emotions induced in a hypnotic state is very close to the acting model. Today this is
one of the most promising models. Application of stimuli to specific body points
used in acupuncture may be combined with hypnotic suggestion in order to reinforce
the latter, or this may even be done independently.
Consideration of persistent shifts in emotional states observed in the psychiatric
clinic is a separate direction. Unfortunately this model possesses two significant
shortcomings. First of all it is rather difficult to obtain a broad spectrum of
emotional states from a single patient within a suitable time frame. Second,
side-effects of drugs upon the speech behavior of subjects, ones which are difficult
to control, are practically always present.
Concluding this review of models of emotional states, a little should be said about
so-called "natural" states. Rather often, researchers studying applied problems
use states arising in situations that are either identical or sufficiently close
to practically important ones (for example firing an ejection seat, experiencing
accelerations in a centrifuge, parachute jumping and so on). When analyzing and
using the results of such studies, we must always remember the intense influence
that factors dependent on purely physical accelerations, and a number of other
incidental factors, have on speech production. This influence makes it practically impossible
to use the obtained data beyond the limits of the narrow situation under study, though
in relation to the latter the resulting data are highly significant and most valuable
for practical purposes.
Control of the speaker's state is a se~~arate problem. Were we to look at the models
presented above, we would usually find that the state of the speaker is determined
by a purely subjective method by the director, the hypnotist or the psychiatrist.
There is interest in obtaining additional physiological or biochemical data describing
the speaker's state. But unfortunately the dat~a that have been obtained pertain to
only one of the di.mensions--"inhibition-arousal." Nonmodal activation of the ascend-
ing reticular system manifests itself ~is desynchronization of the EEG. Arousal of
the sympathetic nervous system manifes:s itself in the GSR, in growth of arterial
- pressure, in dilation of the pupils, a~id in respiratory activity, pulse, muscle
tone and skin temperature. Correspondingly, this same scale correlates with release
of epinephrine and norepinephrine, and it may be tied in with the appropriate bio-
chemical analyses. Unfortunately the other dimensions of emotions are not reflected
by any physiological indicators that have been discovered thus far. Analysis of
electrophysiological indicators at special points of the body may offer certain
promise in this area. In principle, however, we can assert today that the spoken
signal is one of the most informative indicators of emotional state.
Two Approaches to Analyzing the Relationship of Speech Parameters to Individual
and Emotional Characteristics*
Going on to the problems of speech, we will return to the two approaches, mentioned
at the beginning of this paper, to seeking relationships between speech parameters
and the individual and emotional characteristics of the speaker. We will examine,
as an example, the relationship between speech and emotional states.
*The problems to which we now turn, ones which are associated specifically with
speech, are given a rather cursory examination, and only in the methodological
aspect, since the concrete results of speech research are the object of analysis of
most non-review reports presented at the symposium.
The first approach begins with analysis of states and changes in the body elicited
by changes in emotional state, and ends with the spoken signal. As was noted in
our discussion of physiological control of states, it is rather difficult to describe
change in biophysical characteristics in the presence of changes in state. This
makes analysis of possible changes in a speech signal much more difficult. However,
we can list a number of rather obvious factors that may influence speech formation.
First of all if emotional states deviate from the norm, we might expect destabiliza-
tion of the generation of spoken messages. At the psycholinguistic level (as defined
above) this should lead to a certain regression (primarily a simplification) of the
structure of statements, to use of simple, habitual syntactic structures that are
easily made automatic. An extreme manifestation of such destabilization might be
the emergence of speech failures, of mistakes in grammatical and syntactic formulation
of statements. At the biophysical level, destabilization of the mechanism responsible
for controlling speech formations manifests itself as growth in the scatter of the
values of the typical parameters of individual sounds. A shift in the mean values
of these same parameters may in principle be associated with presence of a particular
emotion in the speaker, but an increase in scatter may probably be expected with any
deviation of state from the norm. We can also examine subtler processes influencing
the characteristics of the spoken signal and allowing us to treat, as suspect,
parameters such as formant width, the frequency of the principal tone and the
characteristics of the melody curve of the principal tone, the relationship between
the high frequency and low frequency parts of the integral spectrum and so on (see
(9)).
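The distinction drawn above between a shift in the mean value of a parameter and growth in its scatter can be illustrated with a minimal sketch. The function name, the choice of the sample standard deviation as the measure of scatter, and the fundamental-frequency values are all assumptions made for illustration; none of them comes from the studies discussed here.

```python
from statistics import mean, stdev

def compare_to_baseline(baseline, sample):
    """Compare a speech parameter in some state against a neutral baseline.

    Returns (mean_shift, scatter_ratio):
      mean_shift    - difference of the means (sample minus baseline)
      scatter_ratio - ratio of the sample standard deviations
    """
    mean_shift = mean(sample) - mean(baseline)
    scatter_ratio = stdev(sample) / stdev(baseline)
    return mean_shift, scatter_ratio

# Hypothetical principal-tone frequencies in Hz, invented for illustration.
neutral_f0 = [118.0, 120.0, 122.0, 119.0, 121.0]
aroused_f0 = [125.0, 150.0, 135.0, 160.0, 130.0]

shift, ratio = compare_to_baseline(neutral_f0, aroused_f0)
print(f"mean shift: {shift:.1f} Hz, scatter ratio: {ratio:.2f}")
```

On the reasoning of the text, a scatter ratio above 1 would be expected for any deviation of state from the norm, while the sign and size of the mean shift might depend on the particular emotion.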
The second approach to seeking the relationship between characteristics of the
spoken signal and the speaker's state begins with analysis of the variability ex-
hibited by the parameters of the spoken signal, and ends with revelation of the emo-
tional factors responsible for this variability. That is, in the first stage we
determine the limits for the variability of individual speech parameters, and only
after this do we reveal whether or not this variability is associated with fluctuations
in state. This approach is more exotic, and it is hardly ever encountered
in research having a narrow practical orientation, though in principle examples of
its successful use can be cited (10).
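The two stages of the second approach can be sketched as follows. This is only an illustration under stated assumptions: the correlation ratio (between-state sum of squares divided by total sum of squares) is one possible way to check whether variability is associated with state and is not claimed to be the method used in (10); the function names and the data are hypothetical.

```python
from statistics import mean

def variability_limits(values):
    """Stage 1: limits of variability of a parameter, ignoring speaker state."""
    return min(values), max(values)

def correlation_ratio(values, states):
    """Stage 2: share of the total variability associated with state labels.

    Returns a number between 0 (no association) and 1 (variability fully
    accounted for by differences between states).
    """
    grand = mean(values)
    total_ss = sum((v - grand) ** 2 for v in values)
    if total_ss == 0:
        return 0.0  # the parameter does not vary at all
    groups = {}
    for value, state in zip(values, states):
        groups.setdefault(state, []).append(value)
    between_ss = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    return between_ss / total_ss

# Hypothetical parameter values and state labels, invented for illustration.
values = [100.0, 102.0, 101.0, 140.0, 142.0, 141.0]
states = ["calm", "calm", "calm", "agitated", "agitated", "agitated"]

low, high = variability_limits(values)
eta_squared = correlation_ratio(values, states)
print(f"limits: {low}-{high}, share associated with state: {eta_squared:.3f}")
```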
All that has been said about analysis of the emotional characteristics of speech may
also be repeated in relation to analysis of the relationship between speech and
individual and personal characteristics.
In conclusion we should point out one more direction of research in which active
work has been done in recent years: analysis of the mechanisms responsible for
perception of emotional and individual characteristics of speech. This research
direction is attracting attention for two reasons. First of all the perception
mechanisms and the corresponding distinguishing characteristics that are revealed
may be used to develop automatic recognition systems. Second, in a number of
practical cases it would be permissible to use auditory experts to determine the
emotional state or personality of the speaker. However, before we can sanction the
use of such experts we would have to learn the possibilities of the human auditory
system.
BIBLIOGRAPHY
1. Buhler, H., "Sprachtheorie," Jena, 1934.
2. Jakobson, R., "Linguistics and Poetics," in Sebeok, Th. (Editor), "Style in
Language," N.-J., 1963, pp 350-377.
3. Sebeok, Th. A., "The Informational Model of Language," in Garvin, P. (Editor),
"Natural Language and the Computer," N.-J., 1963, pp 47-63.
4. Galunov, V. I., and Tarasov, V. I., "Natural Manifestations of Emotional States
and Investigation of the Characteristics of the Spoken Signal," in "Rech' i
emotsii" [Speech and Emotions], Leningrad, 1975, pp 55-61.
5. Vundt, V., "Osnovy fiziologicheskoy psikhologii" [Principles of Physiological
Psychology], St. Petersburg, 1874-1881.
6. Osgood, Ch., Suci, G., and Tannenbaum, P., "The Measurement of Meaning," Urbana,
1957.
7. Guilford, J. P., "Factors and Factors of Personality," PSYCHOL. BULL., Vol 82,
No 5, 1975, pp 802-814.
8. Beek, B., Neuberg, E. P., and Hodge, D. C., "An Assessment of the Technology of
Automatic Speech Recognition for Military Applications," IEEE TRANS., ASSP-25,
No 4, 1977, pp 310-321.
9. Galunov, V. I., Koval', S. L., and Tampel', I. B., "Effect of Individual and
Emotion-Dependent Changes in Parameters of the Articulatory Tract on Characteristics
of the Spoken Signal," in the present collection.
10. Beskadarov, A. V., and Galunov, V. I., "Analysis of the Variability of the
Melodic Contours of Speech," in the present collection.
THE PROBLEM OF CLASSIFYING EMOTIONAL STATES IN LIGHT OF
THE INFORMATION THEORY OF EMOTIONS
P. V. Simonov
1. Definition of emotions as a phenomenon of higher nervous (mental) activity:
According to the information theory of emotions (Simonov, 1964) emotion is an active
state of a system of specialized brain structures stimulating the subject to change
his behavior in the direction of minimizing (weakening, interrupting, preventing)
or maximizing (intensifying, prolonging, repeating) this state. The quality, degree
and sign of an emotion are determined by the need for its satisfaction and the pre-
dicted probability (possibility) of its satisfaction on the basis of inborn and pre-
viously acquired experience. Consideration is given in this case to how well per-
fected the organism's habits are, the energy resources of the organism, and the time
necessary and sufficient to perform adaptive actions.
The information theory is valid in relation to the genesis of all emotional states,
including the emotional tone of sensations. For example, an evaluation of food as
being pleasant arises only when a hunger stimulus (a need) is integrated with
afferentation from the mouth cavity signaling impending satisfaction of this need. In a
sated subject, the same afferentation can elicit the negative emotion of revulsion,
and an avoidance reaction.
The probability of goal attainment (need satisfaction) may be predicted at both the
conscious and the unconscious level, for example through action of the mechanisms of
intuition. In the latter case probabilistic prediction of goal attainment concludes
with an emotional "presentiment" of the closeness of a solution or of the hopelessness
of searching in the given direction.
2. The way emotions differ from other phenomena of higher nervous activity must be
analyzed in connection with the fact that the terms "emotion," "motivation," "drive,"
"instinct" and so on are often used as synonyms. Many authors prefer to refer to
"emotional behavior," "motivational-emotional arousal," "the emotional-volitional
sphere" and so on. We now adhere to the following working definitions:
Need--selective dependence of living organisms on environmental factors significant
to self-preservation and self-development; a source of activeness of living systems,
the inducement and goal of their behavior in the surrounding world. Three basic
groups of needs can be distinguished in man:
material-biological: for food, clothing and shelter, for self-preservation of the
individual and the species, for perpetuation of the line and so on;
social needs in the strict sense (inasmuch as all human needs are socially mediated)--
that is, the need to belong to a social group, to occupy a certain place within it,
to enjoy the respect of its members, to correspond with ethical standards accepted
by the given community and so on;
needs of the cognitive-creative type, so-called ideal or spiritual needs, satisfaction
of which leads to positive emotions, produced by the process itself of learning about
and transforming the realities surrounding man.
An important objective indicator of the quality of needs is the "goal postponement"
parameter as defined by P. M. Yershov (A. S. Makarenko's "personal perspectives").
Satisfaction of material-biological needs (hunger for example) cannot be postponed for
a time of any major length. Satisfaction of social needs is limited to the human life
span. Ideal goals may be attained in the remote future. The individual set of needs
and the hierarchy they assume make up the "core" of the given individual's personality,
its most significant characteristic. It is precisely the sphere of needs and the emotions
arising on their basis that make up the "zone of overlap" in which research on brain
activity in the natural sciences makes its most intimate contact with the complex of
the humanities.
Motivation--a physiological mechanism activating engrams, stored in the memory, of
important objects that are capable of satisfying a need of the organism, as well as
engrams of those actions which are capable of leading to satisfaction. For
a motive to be transformed into outwardly realized behavior, real signals heralding
the appearance of target objects must be present. Engrams themselves may not serve
as the triggering stimuli of behavior; otherwise the subject would be living in a
world of hallucinations and illusions. The mechanisms of motivation promote selectivity
of contact with the environment, as dictated by the needs important at the
given moment.
Behavior--a form of a living organism's function which changes the probability of
contact with the object of need satisfaction.
Will--activity motivated by the need to surmount an obstacle, by a need that is relatively
independent and supplementary to the motive which first initiated a particular
behavior. The inborn "freedom reflex" described by I. P. Pavlov is obviously the
phylogenetic precursor of will. Reactions to "internal interference" (competing
motives for example) and participation of the consciousness, which perceives freedom
as a conscious necessity, are typical of the activity of volitional mechanisms in man.
Deviation from this recognized need is perceived by the subject as "nonfreedom," and
it activates the mechanisms of volitional effort. Will can serve as an indicator of
a need which has held a dominant position in the structure of the given personality
for a long period of time and which determines the choice of actions in a conflict
situation, if any one of the subdominant needs generates an emotion that is stronger
than this dominant need.
Consciousness [soznaniye]--knowledge [znaniye] which an individual can share with
another individual (compare with the words so-chuvstviye [equivalent to "sym-pathy"],
so-perezhivaniye [equivalent to "sharing of experience"], so-deystviye [equivalent
to "co-operation"] and so-trudnichestvo [equivalent to "col-laboration"]). Unconscious
forms of higher nervous (mental) activity would best be subdivided into the subconscious,
which supports the protective tendencies of defense of the individual and the species
in the broad sense, and "superconsciousness" (K. S. Stanislavskiy), which services the
trends of development, creativity and progress. The mechanisms of superconsciousness
surmount the known conservatism of the conscious--its rationalistic nature and its
rigid dependence upon former individual and group experience. The superconsciousness
promotes the emergence of hypotheses contradicting formerly known hypotheses, while the
consciousness retains the most important function of selecting only those hypotheses which
correspond to the objective realities and which are confirmed by practice. On the
other hand the mechanisms of the superconsciousness provide the individual the possibility
for acting "unreasonably" but in a way that would be necessary on the scale of
the development of civilization as a whole. As an example several fine people may
rush to the aid of a person they do not know and perish while rescuing this person,
who turns out to be a desperate scoundrel.
3. If we are to classify emotional states, we would necessarily have to introduce
some single valid principle applicable to all emotions. From the positions of the
information theory of emotions, emotional states should be classified in a system of
three coordinates: the magnitude of the need, growth or decrease of the probability
of its satisfaction in comparison with a former prediction, and the nature of the
action in the course of which the given state arises. The classification of emotions
we suggest is summarized in the table below.
Contact interaction is defined as interaction with the target object which may
undergo weakening, interruption, intensification or prolongation but which is already
proceeding and cannot be averted. The remote actions of taking possession,
surmounting and defending are associated with the three fundamental emotions--joy
(corresponding to grief), anger and fear. Thus the sphere of human emotions rests
on a foundation of four basic states: pleasure-revulsion, joy-grief, confidence-fear,
cheerfulness-anger. The specific features of the need impart qualitative
uniqueness to the emotion.
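The three-coordinate scheme can be expressed as a small lookup, offered here only as a sketch of one reading of the text. Several points are assumptions rather than statements of the author: the pairing of contact interaction with pleasure-revulsion (inferred by elimination), the assignment of the positive member of each pair to growth of the probability of satisfaction, the rule that no emotion arises when the need is absent or the prediction is unchanged, and all names used in the code.

```python
# Nature of the action -> (state when probability grows, state when it falls).
# The pairing follows the four basic states named in the text; which member
# corresponds to growth of probability is an assumption of this sketch.
BASIC_STATES = {
    "contact": ("pleasure", "revulsion"),
    "taking possession": ("joy", "grief"),
    "defending": ("confidence", "fear"),
    "surmounting": ("cheerfulness", "anger"),
}

def classify_emotion(action, probability_change, need_magnitude):
    """Return the basic state for the three coordinates, or None.

    action             - one of the keys of BASIC_STATES
    probability_change - new predicted probability minus the former one
    need_magnitude     - strength of the need (0 means no need)
    """
    if need_magnitude <= 0 or probability_change == 0:
        return None  # assumption: no need or no change in prediction, no emotion
    positive, negative = BASIC_STATES[action]
    return positive if probability_change > 0 else negative

print(classify_emotion("defending", -0.4, need_magnitude=1.0))
print(classify_emotion("taking possession", 0.3, need_magnitude=0.5))
```

The magnitude of the need is used here only as a gate; the text does not give a rule for the strength of the resulting emotion, so none is computed.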
When two or more needs are important simultaneously, each may generate its own emotion
depending on the situation and on the probability of satisfaction. In such a
case we arrive at mixed forms of emotions, observed so frequently in real life.
The assertion by some critics of the information theory of emotions that this theory
is supposedly incapable of explaining the genesis of states such as "My sadness is
elevating, my sorrow is sweet" rests on a misunderstanding. These critics forget
that coexistence of several needs often generates an entire range of emotions, each
of which is subordinated to the rule formulated by this theory.
The individual tendency to predominantly react with one of the fundamental emotions
lies at the basis of the classification of temperaments (Simonov, 1968). The weak
(melancholic) type of nervous system has a special relationship to the fear reaction,
the strong, unrestrainable type (choleric) has a special relationship to rage, the
sanguine type is related specifically to positive emotions, and the phlegmatic type
is not generally prone to tumultuous reactions, though like the former he tends
potentially toward positive emotions. In distinction from melancholic sadness,
choleric grief always borders on anger, while choleric joy borders on cheerful
aggressiveness, or ardor.
[Table: classification of emotions by the magnitude of the need, the change in the probability of its satisfaction, and the nature of the action]
Although the fundamental emotions listed above may arise in the course of satis-
faction of a need belonging to any of the three basic groups, fear is most typical
of the biological needs associated with self-preservation, anger is most typical
of social motivations, and the need for cognition and creativity definitely tends
toward positive emotions.
Concluding this brief outline, it is my hope that the suggested classification of
needs and emotions would be useful to the study of the emotional and personality
characteristics of speech.
LINGUISTIC INVARIABILITY AND INDIVIDUAL VARIABILITY
L. V. Bondarko, V. G. Shchukin
The social nature of language presupposes mandatory presence, invariability and
stability of the units forming a language system. These properties are precisely
what make communication among people of the same linguistic collective possible,
communication not only at the moment of a concrete act of spoken communication but
also through the retention and transmission of all information in this language.
However, we know that the utterance of language units in speech often fails to satis-
fy any of these requirements: Research on living speech, conducted especially in-
tensively in recent years, has shown that the more the units of analysis are broken
down, the more obvious it becomes that their properties correspond little to the
initial concepts based on the study of that which linguists refer to as a language
system. Changes in a language system in time may be explained mainly by failure to
comply with the principles of mandatory inclusion, invariability and stability.
Naturally the main element within which these principles are violated is the spoken
activity of each individual speaking in the given language.
In this connection we are forced to raise the question as to how realistic is the
specific object of linguistics--"the abstract, homogeneous speech collective, all
the members of which speak in the same way and learn the language instantaneously"
((1), p 100). From a linguistic point of view it would be important to determine
precisely what characteristics of individual speech activity cause change in the
language system and what conditions promote such change.
In this connection the idiolect (that is, the individual's system of speech resources)
is to us, on one hand, a realization of the language system inherent to the given
individual and, on the other hand, a cause of change in this system. Of course,
the variability inherent to an idiolect by no means always leads to such changes. It
all depends both on the frequency with which a certain deviation is encountered and
on its causes. From the standpoint of the frequency with which it is encountered,
we can refer to four different types of variability:
1. Intra-idiolectic variability.
This type of variability is characterized by presence, in the idiolect, of variants
of linguistic units used occasionally, irregularly, in addition to other variants of
these same units.
2. Inter-idiolectic variability.
This type of variability is characterized by presence, in the idiolect, of regularly
used, stable variants of linguistic units distinguishing this given idiolect from
other idiolects.
3. Group variability.
Group variability comes into existence when the variants of linguistic units common
to the idiolect of representatives of the same group do not agree with the variants
of the same units in the idiolects of representatives of another group. As a rule
this type of variability reflects differences between groups, each of which is
homogeneous in one regard or another. The homogeneity of such groups may be based on
the territorial factor (identical dialect and similar regional characteristics), the
social factor (social variants), age-related and occupational factors (age-related
characteristics and occupational jargon) and so on.
4. Mass variability.
Mass variability comes into existence when a variant enjoys broad dissemination in
the speech of different strata of the population irrespective of age, occupation,
territorial affiliation and so on.
The frequency itself with which variants are encountered depends on a large number
of factors. Being extralinguistic by nature, these factors generate different forms
of variability. Let us examine some of these factors from the standpoint of their
mutual relationships with the language system and with the types of variability.
The division of the causes of linguistic changes into internal and external, which
we encounter in special studies devoted to this issue ((2), pp 197-313), is not always
justified in research on these causes in application to the idiolect: Realization of
potential changes under the influence of external factors is essentially possible
only in the event that internal factors do not oppose such realization.
The most general and universal factor is the desire for economy of effort in speaking.
This factor should be interpreted broadly, to include not only the tendency to
economize on the energy of pronunciation (3) but also the desire to express identical
or close meaning by a single form, the desire to limit the complexity of spoken
communications and so on ((2), pp 241-250). This factor operates in the speech of every
speaker, and the extent to which this economy manifests itself is regulated in many
ways by linguistic factors: Thus, the high variability of reduced vowels in modern
Russian is a result of "smoothing" of the characteristics of these vowels in unstressed
syllables, of their convergence with the characteristics of "neighboring" consonants
("stopping short" in the articulatory movements necessary to achieve the "needed"
rank, highness and labialization). Such smoothing is a product of the properties of the
Russian language system, in which only two degrees of highness (open-close) are
functional in relation to unstressed syllables, and the rank of unlabialized vowels
is defined by the hardness-softness of the preceding consonant.
The next group of factors includes those associated either with arisal of new con-
cepts or with the borrowing of words from another language to represent previously
known concepts. They characterize not so much the speech behavior of the speakers
(as is the case of the first factor) as the situation in which the speaker may find
himself. The next group is made up of factors associated directly with the personality
of the speaker--his sex, age, social status, education and occupation.
The effect of these factors on the characteristics of the idiolect has been discussed
in sufficient detail in the linguistic literature. It might appear that these
should be called specifically external factors, but even here the limitations imposed
by internal factors on their influence are obvious, as is the influence of these
external factors on internal ones.
The territorial affiliation of the individual, which is responsible for the emergence of
dialectal deviations in his idiolect, may be characterized as an external factor,
especially if we broaden this concept such that territorial affiliation could also
be taken to mean possible bilingualism, and not just presence of dialectal features.
The phenomena of speech pathology, which doubtlessly influence some characteristics
of the idiolect and which may become widespread, also have special significance.
This factor will be discussed in greater detail a little later.
The list of factors influencing the forms of speech variability is far from exhausted
by those discussed here, inasmuch as our objective is not to systematize the factors
but to systematize speech variability phenomena in relation to language.
Let us now examine the way in which the types of variability are brought about by
these factors.
Intra-idiolectic variability, which we defined as irregular usage of different
variants of linguistic units, is caused by almost all of the factors listed above,
and in this case all we can say is that the influence of these factors varies and
that the factors themselves interact with one another; thus the influence of territorial
differences may depend on the sex of the speaker: Women retain dialectal
traits in speech more consistently than do men, who consequently exhibit greater
intra-idiolectic variability; higher education reduces intra-idiolectic variability
associated with territorial affiliation, but it may increase the weight of factors
associated with the emergence of new concepts or words, and so on.
Inter-idiolectic variability is caused by the presence, in the given idiolect, of
stable variants of units inherent only to the given idiolect. The emergence of these
stable differences may be associated with any of the factors.
Group variability, which characterizes not the individual idiolect but a certain
group, is caused primarily by factors such as sex and age, social status and occupation,
territorial affiliation and education.
Mass variability, which for practical purposes characterizes the majority of the
bearers of the given language (that is, it is represented in almost all idiolects),
may be described as the highest stage of speech: We would rightfully expect at this
stage that a phenomenon inherent only to speech would achieve the status of a linguistic
phenomenon, and thus restructuring of the language system would occur.
[Table: participation of different factors in the formation of intra-idiolectic, inter-idiolectic, group and mass variability, marked by plus signs]
As was said earlier, the influence of the idiolect on the language system may be
represented as a combination consisting of different types of variability, characterizing
the frequency with which a particular phenomenon is encountered, and the factors
responsible for this variability. Such a classification may take the form of a
table allowing us to account for participation of different factors in formation of
different types of idiolectic characteristics (see table above).
As we can see, participation of different factors is almost always possible. This
table should be interpreted in two directions: first, by establishing correlations
between different factors, and second, by differentiating the factors in relation to
different levels of language. This can be clarified with examples.
1. Speech pathology, which participates in formation of inter-idiolectic and group
differences, may also stimulate changes in mass speaking styles. Thus pronunciation
of single-focus fricative consonants with a lisp, typical of (shchiptsovyye) [forceps?]
children and persisting in the pronunciation of adults, may lead to corresponding
changes in mass speaking style, inasmuch as it is supported by both the tendency
for economy of pronunciation effort and the absence of the corresponding functional
contrast in the Russian language. However, it would be difficult to believe that speech
pathology may have a broader influence at the phonetic level, or that it may be
in any way tangible at the level of grammar, syntax and word use.
2. Reduction of unstressed vowels in inflections leads to indistinguishability of
the grammatical forms of a word. This is a realization of the tendency toward
economy of effort, and it is permitted by the Russian language system, in which
grammatical information on a word may be deduced from the structure of the sentence
in which it is used (4). In a number of cases however, this phenomenon, which
characterizes the segmental composition of a word, may be the cause of stress
changes penetrating through the idiolect into the language system: The impossibility
of distinguishing between the forms " s~xopl3 " and " s~cOpst" promotes the emergence of
the plural form "~cop~" (5) .
Systematic description of the idiolect as a language realization and as a source of
change in language presupposes analysis of the influence of different factors both
from the point of view of the frequency of their occurrence (the type of variability)
and from the point of view of their single or multiple influence upon different
levels of the language system.
BIBLIOGRAPHY
1. Labov, U., "The Study of Language in Its Social Context," in "Novoye v
lingvistike" [Advances in Linguistics], 7th Edition, Moscow, 1975.
2. "Obshcheye yazykoznaniye" [General Linguistics], Chapter III. Language as a
Historically Developing Phenomenon, Nauka, Moscow, 1970.
3. Martine, A., "Printsip ekonomii v foneticheskikh izmeneniyakh. Problemy
diakhronicheskoy fonologii" [The Principle of Economy in Phonetic Changes.
Problems of Diachronic Phonology], Moscow, 1962.
4. Bondarko, L. V., and Verbitskaya, L. A., "Phonetic Characteristics of Unstressed
Inflections in Modern Russian Language," V. L., No 1, 1973.
5. Gorbachevich, K. S., "Phonetic Prerequisites of Some Stress Changes in Modern
Russian Language," V. L., No 6, 1975.
EXTRALINGUISTIC SIGNALS AND THE PROPERTIES AND STATES OF THE INDIVIDUAL
V. Kh. Manerov
1. Introduction
It is a universally recognized premise that spoken communication is a process in
which resources at the linguistic and extralinguistic levels interact in complex
fashion. But despite the fact that the latter have attracted the attention of
scholars since antiquity, attention has always been focused on symbolic or verbal
communication. It was not until our century, and especially in its second half,
that systematic research was started on the extralinguistic resources of communica-
tion. Today, if we consider only the acoustic extralinguistic signals reflecting
the states and properties of the individual, we can name several hundred papers
devoted to this problem. However, this abundance of publications has not produced a
qualitative change in our understanding of the essence of the phenomena of extra-
linguistic communication. This is associated with the lack of research generalizing
the numerous but dissociated facts obtained, moreover, by representatives of different
sciences. Exceptions in this respect are the works of V. I. Galunov and the monograph
by Nosenko (1) .
This paper provides a definition of the problem based on an analysis of both the
results of my own work and published data collected by psychological and
psycholinguistic studies. I was interested in changes contributed by the emotions,
expressed intentions and individual characteristics of the speaker to the acoustic and
phonetic structure of a spoken statement--changes which do not themselves lead to
change in the objective and logical content of the statement but produce information
of another sort: information about the speaking individual (the communicator).
On the other hand I was also interested in how perception of these acoustic phenomena
is affected by the characteristics--the properties and states--of the perceiving
subject, the recipient.
An external analysis scheme developed within the framework of the systems approach
to analysis of complex systems was employed. Such a systems approach is close to
optimum in application to extralinguistic communication, since the overall picture
of this complex, multicomponent phenomenon must be based on a large quantity of
poorly structured, fragmentary data.
2. The Analysis Scheme
An example of successful application of the systems approach to social phenomena
can be found in Kagan's work (2). The author suggests the following basic components
of systems analysis: 1. structural component analysis, 2. functional analysis,
3. historical analysis.
The subsequent discussion will follow this scheme, and the content of the analysis
will be interpreted within the framework of these three dimensions.
1. Structural Component Analysis
The task of structural component analysis is to reveal the components of the system
and their mutual associations. While linguistic communication occupies the central
and most important place in human communication in general, extralinguistic acoustic
communication, which is facultative in relation to the former, occupies an intermediate
position, being in a sense a shell separating the core--linguistic communication--from
other forms of communication, including visual, tactile and so on, forming
in their sum total the system of human communication. Because extralinguistic and
linguistic communication are relatively independent of one another, this shell can
be isolated as a special system.
Let us imagine the simplest communication situation involving two communicants
alternating their roles as communicator and recipient, and an acoustic signal trans-
mitting the content of the process of communication in a case where the linguistic
content is fixed or eliminated.
In principle, component analysis of the system in relation to all three components
of this situation is possible. At the level of the communicator the analysis must
include all potential forms of information embodied within the signal, and thus it
must concern itself with the classification of emotional states, expressed intentions
and individual manifestations. However, because little research has been conducted
on these fundamental problems, for the moment this approach would be extremely un-
wieldy and difficult to carry out. A more attractive approach would be to reveal
only those components of the system which, being encoded in the acoustic envelope of
the signal, are perceived and interpreted by recipients with sufficient effectiveness.
The entire inventory of extralinguistic acoustic signals was classified on
the basis of the following criteria: controllable--uncontrollable, non-speech--
quasi-speech, regular--situational. Uncontrollable and partially controllable
signals include acoustic phenomena that can be classified as emotional non-speech
sounds (groaning, laughter, weeping, sighing and so on). Partially controllable
signals also include the emotional tone of speech. Controllable signals are acoustic
resources for expressing the attitudes, intentions and desires of the speaker, and
descriptive acoustic resources. Non-speech phenomena are those acoustic signals
which do not interfere with speech sounds. These include the emotional sounds men-
tioned above, and the noises produced by the body's physiological functions. Quasi-
speech phenomena pertain to the emotional coloration imparted to the acoustic struc-
ture of speech. The emotional and expressive acoustic phenomena listed above can
be classified as situational phenomena.
Regular extralinguistic acoustic phenomena include the formal indicators of voice
quality and the characteristics of articulation and intonation typical of the given
speaker.
In what way are the extralinguistic signals listed above perceived and identified
by the listener? In the case of emotionally colored speech, experiments performed
with the purpose of identifying such signals can provide a direct answer to this
question. Research conducted by many authors has shown that the principal emotions
(fear, anger, joy, grief) and the state of astonishment can be encoded by acoustic
resources quite adequately. The recipient is also able to determine the intensity
of each of the principal emotions--that is, for example, he is able to make a
differentiation between vitality, joy and ecstasy. Thus at this level the system is
discretely continuous: Qualitatively different emotions fall into a discrete
series, and within the limits of each of the members of the series, continuous
change in intensity is possible.
There is a subsequent stage of analysis possible, involving a search for components
at the second level--that is, inspection of the internal structure of acoustic
expressions of the principal emotions. Highly convenient tools of analysis in this
case are the methods of multidimensional psychological scaling, particularly the method
of the semantic differential. When the data acquired as a result of such inspection
are subjected to factor analysis, we find that the principal emotions form the
foundation of a geometric model with inhibition-arousal serving as the principal
dimension (4).
In terms of their internal structure, these factors are components at the second
level of human perception. Thus fear can assume characteristics such as trembling,
gasping, constrained and plaintive; anger may be dull, sharp or stark, and so on.
It should be noted that second-level components fall within the first-level sub-
system together with weight factors, and the correlations they exhibit are statisti-
cal in nature.
Perception of 25 sounds uttered by actors simulating 25 different states was studied
in experiments with non-speech emotional sounds. It turned out that listeners were
capable of correctly identifying all of the principal emotions by ear (astonishment
and emotionally colored reactions such as pain and suffering). The method of the
semantic differential resulted in factors similar to those obtained for the emotional
coloration of speech. Thus at the level of the recipient, the components of the
system of emotional non-speech sounds and the system of emotional colorations of
speech are highly similar.
The question as to the structural components of the system of expressive acoustic
phenomena has not been studied sufficiently. It is illuminated to some extent in
Tseplitis' book (5), in which the author did of course use different terminology.
However, if we assume that this system is composed of expressions of relationships
such as tenderness, contempt, embarrassment and so on, volitional phenomena such as
orders, requests and complaints, and descriptive phenomena, our experiments
concerned with identification of such expressions simulated by an actor provide an
approximate answer to the structure question. We found that the group of listeners
was incapable of correctly and unambiguously determining most of the emotional
expressions contained in standard phrases out of context. Thus tenderness is
identified as pleasure and joy, and an order is perceived as perturbation and anger.
That is, these signals are identified in terms of emotional expressions.
As far as the components of the system of individual colorations of voice are con-
cerned, experimental research conducted by Voier (6) and Galunov by the method of
the semantic differential, applied to the reading of standard phrases, showed that
this system also has a discrete-continuous structure with four or five of the
following discrete qualities (1,6): 1) articulatory activeness-passiveness of
the speaker, 2) voice volume and dimensions, 3) timbre, 4) general tone evaluation.
The fifth quality (dimension), which did not appear for all authors, is usually
associated with the pitch of the speaker's voice.
Continuous change of the corresponding quality is possible within the limits of
each of these dimensions, and if we assume that the human ear can distinguish seven
gradations in each of the four or five dimensions, an extremely rough estimate of
the number of distinguishable shades of vocal tone would be from several thousand
to 15,000 variants.
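The rough estimate above can be checked with a few lines of arithmetic. The sketch below assumes, as the text does, that every perceptual dimension is resolved into seven independently distinguishable gradations; the function name is ours and purely illustrative.

```python
def distinguishable_voice_variants(dimensions: int, gradations: int = 7) -> int:
    """Count the combinations of gradations across all dimensions.

    Idealizing assumption: each dimension has the same number of
    gradations and the dimensions vary independently of one another.
    """
    return gradations ** dimensions


for dims in (4, 5):
    print(dims, "dimensions:", distinguishable_voice_variants(dims))
```

Four dimensions give 2,401 combinations and five give 16,807, which agrees in order of magnitude with the range quoted in the text.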
However, a person perceiving certain features of a voice is capable of using them to
reconstruct some of the details of the speaker's appearance. Addington's work answers
the question as to what the components of this appearance are. Using different
variants of voice tone during reading, the author obtained the following data by
the method of the semantic differential (7):
Perception of Male Voices                 Perception of Female Voices
1. Appearance (fat, old--lean, young)     1. Introversion--communicativeness
                                             Sullenness--happiness
2. Passive--active                        2. Passiveness--activeness
   Depressed--happy                          Obedience--aggressiveness
3. Social and physical status             3. Nonemotional--emotional
4. Peace-loving--cruel                    4. Polish--vulgarity
                                             Prosperity--poverty
5. General evaluation (bad--good)
The results of our research are contained in (8). Each of the component-factors
breaks down into second-level components, which may have to do with both sound and
personality. Thus the negative evaluation factor is applied to nasal sound, which
is associated in women with "stupidity, homeliness." In men, tension in the voice
is associated with "old age, nervousness, callousness," and so on. It is
interesting to note that a speaking male is perceived to a greater extent in relation
to his physical attributes, while social characteristics are more important in
the perception of a woman's personality on the basis of her voice.
2. Functional Analysis
Functional analysis presupposes examination of the function of extralinguistic
acoustic resources of communication. In the first phase we analyze the external
function of the system in comparison with other communication resources. R. Yakobson
lists six principal functions of verbal communication: intellectual, expressive,
conative, factual and so on, depending on which element of a text is said to be most
significant. Many of these functions can obviously be performed by extralinguistic
resources; this is especially true of the expressive (emotive in our definition) and
conative functions.
In living spontaneous speech, the functional load is constantly alternated between
linguistic and extralinguistic signs; moreover the informativeness itself of the
latter is situational in nature as well. Thus for example, verbal resources usually
play the dominant role in neutral situations, while in a situation of extreme
emotional arousal of the speaker, when he is capable only of uttering unintelligible
sounds, their role is reduced or eliminated.
Shibutani (9) mentions another class of situations in which extralinguistic signals
assume priority. These are situations that standardize verbal resources (for example
at a first meeting). In these cases extralinguistic signals induce real but
concealable emotions and attitudes in the communicants, and make a significant contribution
to the "first impression" phenomenon.
A comparison of the effectiveness of acoustic and nonacoustic extralinguistic signals
(see (10)) would reveal that signals produced by facial expressions have an advantage
over intonational signals in terms of transmitting the attitudes of the communicants.
It cannot be doubted that facial expressions and acoustic resources are inferior to
verbal resources in terms of the accuracy with which the particular emotion is repre-
sented, but they do have their advantages as well. Experiments have demonstrated
their greater resistance to interference in comparison with spoken signals, the
directness of their expression, associated with the limited possibilities for volun-
tary control of signals at this level, and their capability for eliciting emotions,
their "contagiousness." This provides the grounds for some researchers to believe
that real attitudes may be expressed in communication only by nonverbal means.
Extralinguistic signals (a smile, an affectionate intonation) are resources for
averting or softening aggressive behavior.
Another important feature of extralinguistic emotional signals is that they are
a motor expression of internal states, and they create a dynamic picture of the
function of a certain center of activity and ensure a more-sensitive reaction.
In psychology, an external expression of emotions, including an acoustic expression,
is interpreted as an instinctive reaction. Many authors feel that the capability
for perceiving expressions is a phylogenetically confirmed capability, for the
realization of which mastery of linguistic communication is not mandatory.
There are several theories on perception of expressive movements: the inference
theory, the role theory and the empathy theory (see (11)). The last theory, which
belongs to T. Lipps, appears to be most plausible to us in application to this form
of perception. The theory is based on three basic premises:
1. Perception of an emotional expression elicits emotional reactions within the
recipient himself.
2. These reactions arise owing to realization of the need for motor imitation of
another's expressive movements.
3. The emotional reaction of the recipient is ascribed by the latter to the subject
being perceived.
These premises of Lipps' theory make plausible the suggestion that the actual
emotional state and emotionality of the recipient have an influence on the way
extralinguistic signals are perceived. Several studies taking this direction have been
published in the literature, and the overall but unconfirmed conclusion they lead to
is that the recipient has a tendency to project his own state upon the communicator
(12). We arrived at a similar result in our research, in which students experiencing
anxiety just before taking a test had to evaluate the state of speakers on the basis
of the emotional coloration of their speech (13). There are data, however,
indicating that it is possible to project one's state having a "minus" sign. In this
case recipients are the least sensitive to states similar to the ones they themselves
are experiencing at the time.
At the level of the human mind, the content of which is mediated by verbal experience,
perception of nonspoken signals is dependent upon this experience and upon the person-
ality of the communicant as a whole.
However, there are also significant differences at the level of the nonverbal animal
mind. Marler (15) notes that some animals are able to generate expressive movements
well, but they perceive them poorly. Others on the other hand are good recipients
of state signals and poor transmitters of such signals. It is interesting that these
authors revealed similar groups in an experiment with people. Various researchers
have mentioned that signals of state are perceived more successfully by sensitive,
subservient subjects (14). There are data indicating that individual experience has
an influence on reception of emotional signals. Ramishvili (16) writes that blind
subjects are able to deduce the state of a speaker from the coloration of his speech
more successfully than sighted subjects.
In our experiment we wanted to follow the process of forming an impression of the
speaker, and arrive at a description of this process. By interviewing subjects who
had to describe the appearance and internal make-up of the speaker on the basis of
the way the latter read a standard text, we were able to distinguish two extreme
strategies. Having listened to the speaker's voice, some subjects fashion his visual
appearance, and on the basis of this appearance they attempt to fashion his personal
characteristics. Other subjects take an analytical route: They single out different
characteristics of the voice and associate them one at a time with personality
characteristics. Insufficient information about the speaker forces the recipient
to make extensive extrapolations. Thus, hearing emotional tones in the voice, he
may interpret them as a manifestation of a permanent quality of the communicator--
his emotionality, while in reality they may be a situational phenomenon. A listener
attempting to evaluate the personality of a speaker often employs metaphoric elements,
transferring properties of the voice to properties of the possessor of this voice.
Thus a person with a voice having a pleasing timbre is also assessed positively in
terms of his appearance, while active articulation creates the impression of a person
who is energetic in general.
Inasmuch as voice characteristics afford a possibility for considerable arbitrariness
in the interpretation of the speaker's properties, such an interpretation often carries
more information about the listener than about the speaker. In this case the voice
may perform the role of a unique Rorschach "inkblot."
Without a doubt the voice also carries information of greater objectivity: The
pitch of the voice, for example, particular features of articulation and timbre, and
the hollowness-clarity of the voice provide indications of the speaker's age.
General characteristics such as the psychodynamic features of the individual that
make up the basis of his temperament (activeness and emotionality) are represented
rather fully in the voice. As we saw earlier, these characteristics are partially
represented by those criteria-factors which listeners use to evaluate the personality
of the speaker. On the whole, however, individual voice characteristics cannot serve--
we agree with A. A. Bodalev in this regard--as dependable indicators of the personality
as a whole (11).
3. Historical Analysis
The next direction of analysis presupposes examination of the historical roots of
the phenomenon--its origin and development. Without a doubt this topic requires
special and lengthy analysis. We will cite only the raw data used in such analysis.
C. Darwin (17) is the author of the first theory of expressive movements which, in
particular, compares the external expression of the principal emotions in man and
higher animals and proves the existence of a phylogenetic relationship between them.
Without a doubt this general premise is valid in relation to the resources of acoustic
expression of emotions. Confirmations of Darwin's theory can also be encountered in
modern studies. Man's experimentally proven capability for identifying the state of
animals through auditory perception of their acoustic signals is such a confirmation,
though of course indirect (18). References to the capability of animals, especially
domestic ones, of sensing the state of an individual are often encountered in the
literature (19). All of this attests to presence of similar factors in the external,
and particularly the acoustic expression of the principal emotions of man and higher
animals. However, if this is valid in relation to non-speech sounds, how do matters
stand with the emotional coloration of speech? Experiments involving identification
of the emotional state of a speaker on the basis of stressed syllables or even vowels
extracted from a passage of emotionally colored speech show that these elements trans-
mit a significant amount of information on the state of the speaker. At the same
time, these elements are also the result of interference between phonetic and
psychophysiological processes. A number of researchers have noted, in relation to both
vocal and conversational emotionally colored speech, presence of the "audible smile"
phenomenon, or tearful, plaintive intonation, sighs of annoyance or moans of pain
directly within the structure of the spoken statement (20,21). Moreover the factor
structure of listeners' evaluations based on emotionally colored phrases of standard
content is also similar according to our data (22). Therefore there are grounds for
assuming that the emotional coloration of speech is a derivative phenomenon of
emotional non-speech sounds. The question as to the presence of non-speech emotional
sounds in speech is extremely interesting. On one hand they are present in speech in
their first-existing form and in the form of emotional coloration of speech. On the
other hand, if we assume the position of many scientists who believe affective sounds
to be among the raw material for creation of language, they are present in language
in transformed, removed form. It would be interesting to find their traces in the
phonemic content and in the nonemotional intonation of language, and in particular
to try to examine the phenomenon of phonetic symbolism from these positions.
As far as expressive acoustic resources are concerned, their inventory is diverse,
and it includes phenomena existing at different planes, ones which may have different
historical roots. The one thing that is sure is that they all have a later origin
in comparison with emotional resources. There is almost no mention of the evolution
of expressive acoustic resources in the literature. This question is touched upon
only in Bubrikh's work (23). From this author's point of view we can distinguish
three stages in the development of language: The first is typified by visual-objec-
tive thinking and by speech having a signaling function. In the second stage thinking
assumes a visual-descriptive nature, and representational and expressive resources
come into being.
Discussing the problem as a whole, we can assert that man typically uses more-ancient
acoustic resources when emotionally aroused--that is, his speech typically undergoes
acoustic primitivization, using E. L. Nosenko's term as he applied it to the semantics
and syntax of speech in stressful conditions.
3. Conclusion
Two points require consideration in the conclusion. First of all the initial scheme
of analysis, which contains three basic dimensions, should be supplemented by a
fourth dimension associated with semeiotic analysis of extralinguistic signals,
inasmuch as the system under discussion here is a communication system. But this
approach has not been fully worked out yet in application to such signals, and it
still requires extensive work prior to its implementation.
The second point is associated with the problem of determining the type of emotional
state on the basis of the acoustic characteristics of speech. The characteristics
which researchers have at their disposal still do not permit determination of the
type of state; nor do they even permit differentiation between positive and negative
emotions. Despite the fact that some authors have reported the possibility of making
such a diagnosis, the characteristics they suggest are contradictory (24,25).
We suggest another approach to solving this problem based on the considerations
discussed above concerning the nature of the emotional coloration of speech. In the
first stage of this approach we would need to subject the acoustic correlates of
an external expression of emotions (emotional sound) to meticulous analysis, and in
the next stage we should seek these characteristics in a spoken signal having an
emotional coloration.
BIBLIOGRAPHY
1. Nosenko, E. L., "Osobennosti rechi v sostoyanii emotsional'noy napryazhennosti"
[Characteristics of Speech in a State of Emotional Tension], Dnepropetrovsk, 1975.
2. Kagan, M. S., "Chelovecheskaya deyatel'nost' (Opyt sistemnogo analiza)" [Human
Activity (An Experiment in Systems Analysis)], Moscow, 1974.
3. Galunov, V. I., "Speech, Emotions and Personality: Problems and Prospects,"
in this collection.
4. Manerov, V. Kh., "Investigation of the Spoken Signal to Determine the Individual's
Emotional State," Candidate Dissertation Abstract, Leningrad, 1975.
5. Tseplitis, L. K., "Analiz rechevoy intonatsii" [Analysis of Speech Intonation],
Riga, 1974.
6. Voier, W., "Perceptual Bases of Speaker Identity," J. ACOUST. ASS. AMER., Vol 36,
No 6, 1964.
7. Addington, D., "Voice and Perception of Personality," N. Y., 1968.
8. Alekseyev, V. I., Manerov, V. Kh., and Ustinovich, Ye. A., "Analysis of Voice
as a Source of Information of Properties of the Speaker," see this collection.
9. Shibutani, T., "Sotsial'naya psikhologiya" [Social Psychology], Moscow, 1969.
10. Mortensson, C., and Sereno, K. K., "Advances in Communication Research," N.Y.,
Harper, 1973.
11. Bodalev, A. A., "Vospriyatiye cheloveka chelovekom" [Perception of Man by
Man], Izd-vo LGU, Leningrad, 1965.
12. Kvasovets, S. V., "Opyt izucheniya emotsional'nykh sostoyaniy. Problemy
neyropsikhologii" [Experience in Studying Emotional States. Problems of Neuro-
psychology], Moscow, Nauka, 1977.
13. Il'in, Ye. P., Manerov, V. Kh., Katygin, Yu. A., and Shatalova, T. N., "Effect
of Pretesting Arousal on Evaluation of Speech Tone" (in press).
14. Korneva, T. V., "Some Factors Defining the Accuracy of a Listener's Evaluation
of Emotional States," see this collection.
15. Marler, in Krames, L., et al. (Editors), "Nonverbal Communication," N.Y.-London,
1974, X, p 202 (Vol 1, "Advances in the Study of Communication and Affect").
16. Ramishvili, D., "K prirode nekotorykh vidov vyrazitel'nykh dvizheniy" [The
Nature of Some Forms of Expressive Movement], Metsnireba, Tbilisi, 1976.
17. Darvin, Ch., "Vyrazheniye dushevnykh volneniy" [Expression of Spiritual Agitation],
St. Petersburg, 1896.
18. Gershuni, G. V., Bogdanov, B. V., Vakarchuk, O. Yu., Mal'tsev, V. P., and
Chernigovskaya, T. V., "Human Identification of Different Types of Acoustic
Signals Emitted by Monkeys," FIZIOLOGIYA CHELOVEKA, Vol 2, No 3, 1976.
19. Lorents, K., "Kol'tso tsarya Solomona" [King Solomon's Ring], Moscow, 1977.
20. Morozov, V. P., "Biofizicheskiye osnovy vokal'noy rechi" [Biophysical Principles
of Vocal Speech], Leningrad, 1977.
21. Kotlyar, G. M., "Analysis of Acoustic Resources for Expressing Emotional States
in Vocal Speech," Candidate Dissertation Abstract, Leningrad, 1977.
22. Manerov, V. Kh., "Analysis of Emotional Non-Speech Sounds," (in press).
23. Bubrikh, D. V., "Origin of Thinking and Speech," in Anisimov, A. F., "Istoricheskiye
osobennosti pervobytnogo myshleniya" [Historical Characteristics of Primitive
Thinking], Nauka, Leningrad, 1971.
24. Blokhina, L. P., and Gomina, T. G., "Significance of Prosodic and Spectral Para-
meters of Spoken Signals Expressing Different Emotional States," see the pro-
ceedings of this symposium.
25. Taubkin, V. L., "Identification of the Emotional State of a Human Operator Using
Spoken Signal Parameters," Candidate Dissertation Abstract, Moscow, 1977.
USING SYMMETRICAL BIOLOGICALLY ACTIVE POINTS TO
MONITOR CHANGES IN HUMAN PSYCHOPHYSIOLOGICAL STATE
A. S. Abduakhadov, V. I. Galunov
The goal of our study was to develop a method for objectively monitoring the psycho-
physiological state (PPS) of an individual. An analysis of the available data would
show that traditional objective physiological indicators such as pulse, respiration,
EEG, GSR and so on are not sufficiently informative when it comes to determining a
number of psychophysiological states (PPS's). Thus there is interest in studying the
biologically active skin points (BASP's) used in acupuncture therapy to elicit change
in psychophysiological state (1). Ours is an attempt to study the characteristics
of bioelectric reactions (BER's) of biologically active skin points (BASP's) in the
presence of different positive and negative emotional states.
Research Methods
The research was conducted on 10 healthy subjects 21-31 years old in a comfortably
appointed soundproof room. To simulate emotional states of different signs, the
subjects were asked to act out or, for practical purposes, autosuggest negative and
positive emotional experiences. Thus realistic emotional states were produced without
their active motor component.
BER's were recorded with a 16-channel "Al'var" electroencephalograph (GDR) using
nonpolarizing (platinum) electrodes with a diameter of 2 mm. The inter-electrode distance
was 3 mm. Symmetrical general-action BASP's (designated Gi-4 by international
convention) and symmetrical inactive points on the palms of both hands were the object
of research. A cardiogram was recorded in parallel.
The timing of the experiment was as follows: First we recorded the initial BER of
symmetrical BASP's and inactive points, the cardiogram and the pulse (15 minutes).
After a short pause we recorded the same physiological parameters while the subject
imagined a negative emotional situation accompanied by the appropriate state (fear,
melancholy) (10 minutes), while he remained immersed in this state (10 minutes), and
during his emergence from this state (20 minutes). Following this, the entire course
of the experiment was repeated with the subject imagining
a positive emotional situation accompanied by joy and pleasure.
The PPS indicators noted above were then compared with the particular state and with
the properties exhibited by symmetrical BASP's. The coefficients of asymmetry ($K_{as}$)
of the parameters of symmetrical BASP's and of symmetrical inactive points in the
presence of different emotional states were calculated using the formula:

$$K_{as} = \frac{A - B}{A + B},$$

where A--parameters of points on the left hand, B--on the right hand.
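The coefficient of asymmetry can be illustrated with a minimal sketch. The function name and the sample amplitudes are hypothetical and chosen only for illustration; they are not measurements from the study.

```python
def asymmetry_coefficient(left: float, right: float) -> float:
    """K_as = (A - B) / (A + B), where A is the left-hand value and B the right-hand value."""
    total = left + right
    if total == 0:
        # The ratio is undefined when both readings cancel out or are zero.
        raise ValueError("A + B is zero; the coefficient is undefined")
    return (left - right) / total


# Hypothetical BER amplitudes in microvolts (invented for illustration)
k_as = asymmetry_coefficient(left=400.0, right=600.0)
print(round(k_as, 3))  # negative value: the right-hand reading is larger
```

Under this definition the coefficient lies between -1 and +1 for non-negative readings, is zero for perfectly symmetrical values, and is negative when the right-hand value dominates.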
Research Results
The BER recorded from symmetrical BASP's in response to voluntarily induced emotional
states of different signs may be described as a slow oscillatory process having a wide
frequency range (from 0.2 to 2 Hz) and an amplitude range from 50 to 900 µV.
BER's recorded from symmetrical BASP's (Gi-4) in response to the different emotional
states studied are described below.
Complete emotional rest was adopted as the initial state. In it, BER's recorded from
symmetrical BASP's were unstable in amplitude, asynchronous, irregularly arising
oscillations consisting of primary biphasal components lasting about 1 sec and
secondary, negative monophasal late components lasting 3-4 sec. In all cases,
potentials of significantly higher amplitude were recorded from the right Gi-4 point.
When the subject was in a state of emotional relaxation (probably similar to a state
somewhere between drowsiness and sleep) the potential oscillations exhibited a
tendency toward synchronization. Oscillations recorded from the right Gi-4 point were
biphasal in shape, with the amplitude of the primary components being 400-800 µV and
their duration being 1 sec. The primary oscillation was followed by a secondary
monophasal negative oscillation with an amplitude of about 200 µV and a duration of 3-4
sec. Biphasal oscillations of irregular shape lasting 1 sec and having a lower
amplitude (300-500 µV) were recorded from the left Gi-4 point. The secondary component,
which had a duration of 2-3 sec, had a rounded peak.
The nature of the BER potential oscillations changed dramatically in amplitude and
frequency when the subject imagined a state of fear. The oscillation frequency of
potentials recorded from symmetrical BASP's was greater than normal. Against the background
of a general increase in frequency, statistically significant asymmetry of this
indicator was observed, expressing itself as a higher oscillation frequency at the left
Gi-4 point in comparison with the right. Later on, as the subject emerged from his
negative emotional state, the frequency of BER's recorded from symmetrical points
gradually fell. It should be noted that in addition to this, we recorded low-amplitude
monophasal negative oscillations with a duration of 1 sec and secondary
long-lasting positive oscillations (2 sec) from the right Gi-4 point.
At the left Gi-4 point, the increase in the frequency characteristics of the BER was
also accompanied by change in the amplitude of the potentials. They acquired an
approximately sinusoidal biphasal shape. Secondary slow oscillations disappeared.
Alternation of oscillations of average size with shorter, asymmetrical oscillations
was observed.
The changes described above in BER's recorded from symmetrical BASP's assume a
different form when we proceed to a positive emotional state (joy). An increase
in the frequency characteristics of BER's recorded from symmetrical BASP's can be
noted; however, this increase is expressed to a lesser degree than with a negative
state. The asymmetry of BER frequency characteristics was predominantly right-sided.
Two-component oscillations, with an initial fast negative component and a
late biphasal positive-negative slow component (4-5 sec), were noted at the right Gi-4
point. At the left Gi-4 point the oscillations were biphasal, positive-negative and
slow; their shape was irregular, and their duration was 3-4 sec.
Thus these data reveal differences in the frequency and amplitude characteristics
of BER's recorded from symmetrical Gi-4 BASP's in response to emotional states of
different signs.
The changes indicated above in BER's recorded from symmetrical BASP's correlate with
the dynamics of electric resistance (total resistance, to include its active component
and its capacitive component) recorded in similar experimental conditions. They also
agree in part with the dynamics of pulse changes occurring in the corresponding states.
Pulse grew faster with positive and negative states, and decreased in the emotionally
relaxed state.
The high effectiveness of using BASP's having general restorative action to detect
emotional states of different signs can be demonstrated by comparing BER's recorded
from biologically active and inactive points on the skin. Our experiments showed that
the size and shape of the electric potential and the asymmetry exhibited by the
dynamics of amplitude and frequency characteristics recorded from symmetrical BASP's are
significantly more precise indicators of psychophysiological state than are similar
characteristics recorded from inactive points, for which the electroencephalographically
recorded BER's either do not change at all in response to similar situations,
or they fluctuate insignificantly about the zero point.
Considering our present knowledge of the mechanisms responsible for the work of BASP's,
it may be hypothesized that amplification of the amplitude and frequency
characteristics of BER's recorded from symmetrical BASP's of general tonic action in response
to emotional states of different signs reflects nonspecific activation of the central
mechanisms of the states examined here. Asymmetry of BER indicators recorded from
symmetrical BASP's probably reflects activation of cortical mechanisms of the studied
emotional states, and it may be associated with functional lateralization of positive
emotions in the left hemisphere and of negative emotions in the right hemisphere (2).
Thus the research permits the following conclusions:
1. Biologically active points that cause change in psychophysiological state when
affected in a certain way may also serve as a source of information on state.
2. BER's and the resistance to electric current recorded at BASP's of general tonic
action are promising indicators of qualitatively different changes in emotional
state.
3. The amplitude and frequency characteristics of BER's recorded from symmetrical
BASP's differ in relation to emotional states of opposite sign.
4. Asymmetry in the dynamics of the amplitude and frequency characteristics of BER's
recorded from symmetrical BASP's is the principal indicator of emotional states
differing in sign.
BIBLIOGRAPHY
1. Chzhu Lyan', "Rukovodstvo po sovremennoy chzhen'-tszyuterapii" [Handbook of
Modern Acupuncture Therapy], Moscow, Medgiz, 1959.
2. Gazzaniga, M. S., "The Bisected Brain," N.Y. 1970.
ANALYSIS OF VOICE AS A SOURCE OF INFORMATION ON PROPERTIES OF THE SPEAKER
V. I. Alekseyev, V. Kh. Manerov, Ye. A. Ustinovich
Being a means of communication, speech is a system of signs organized in a particular
way. The principal sign of this system is the word. Thus speech is traditionally
studied as a means of verbal communication. But recently researchers have shown
increasingly greater interest in parameters of speech having to do with nonverbal
communication. Acoustic phenomena (prosodic characteristics, sonorousness, pronun-
ciation) which accompany speech and which may bear information supplementing the
meaning of a statement are now becoming an object of study.
A researcher studying perception of these acoustic phenomena is able to distinguish
a group of speech elements that are constant characteristics of speech formation
which a listener can use to form an idea as to the age, education, appearance and
the psychophysiological and characterological features of the speaker. In other
words this group of speech parameters generates an impression about the speaker--
that is, it has an impressive function. Voice has special significance among these
parameters. Our study was devoted to the influence of vocal characteristics on
the listener's resulting impression of the speaker's properties.
Psycholinguistic studies of voice characteristics (4,5,6) distinguish the following
descriptive qualities of speech sounds: pitch, loudness, speed, rhythm, timbre,
melody, sonorousness, intensity. The impression an audience develops about the
speaker influences the content of his message--that is, together with other factors
it predetermines the effectiveness of mass media. It was for this reason that
scientific investigation of the role played by these speech characteristics in
formation of an impression about the individual features of a speaker began with the
development of radio broadcasting. Thus the first experiments were conducted in this
area by the British Broadcasting Company in 1931 (3).
A researcher addressing this problem must answer the following questions:
1. What impressions does the speech of a speaker induce in his listeners?
2. How is the speaker's voice described, and what characteristics used in its
evaluation are correlated with opinions about the speaker?
3. How do opinions about the speaker, arrived at as a result of listening to his
speech, correlate with data obtained by other means (for example with personality
inventories)?
4. What physically recorded parameters of a spoken signal predetermine the opinion
of a speaker?
Thus four sets of data must be compared: the physical description of the spoken
signal, its subjective description (the voice), opinions about the speaker's
properties and the psychophysiological indicators of the speaker's individual features.
Our research dealt with problems associated with the first two groups. Voice
evaluations and opinions about speakers based on such evaluations can be obtained by the
well known method of semantically opposite pairs suggested by C. Osgood (1), which
is broadly employed today in research on auditory perception.
Psycholinguistic studies that have made use of this method (2) show that the "field
of meanings" is determined by two systems of evaluations. The first system,
referred to by Osgood as affective, contains three factors: evaluation, strength and
activeness. The second system is formed out of denotative characteristics, and it
reflects the physical properties of the object. Inasmuch as the objective of our
work was to isolate those vocal characteristics which predetermine opinions about the
properties of the speaker, denotative evaluations are of the greatest importance to
us. In addition to characteristics used in the affective evaluation (good-bad,
strong-weak and so on), the list of characteristics intended for voice evaluation included
terms describing the voice qualities listed above--three or four terms for each
quality, for example high-low, loud-soft, fast-slow and so on. In all, the list
consisted of 36 pairs of adjectives.
The second list we used in the listening sessions consisted of 54 characteristics
describing different properties of the speaker. It included some indicators of
appearance (tall-short, thin-fat) and age, and terms standing for emotional,
volitional, intellectual and characterological features (emotional-unemotional,
anxious-serene, smart-stupid, willful-unwillful). In this case each of these features was
described by approximately the same number of terms.
Subjects referred to these two lists as they listened to tape recordings of speech
excerpts by five male speakers. The excerpts, each about 2 minutes long, were
fragments of undirected speech. During the recording sessions, the speakers were asked
to speak freely about some interesting event in their life.
Subjects listened to the recordings in groups. In all, 60 persons listened to each
voice. The instructions did not limit the listening time, and they required the
subjects to pay no attention, to the extent possible, to the content of the excerpts.
Ten evaluation intercorrelation matrices were obtained on treating the results:
five matrices (36x36) of voice evaluations and five matrices (54x54) of evaluations
of the speaker's individual properties. They were subjected to factor analysis by the
principal factor method. The calculations were performed with an M-222 computer. In
addition we obtained combined matrices of voice evaluations and evaluations of individual
characteristics of the speaker. These matrices were compiled on the basis of
evaluations of five voices listened to by 20 audiences. Thus we subjected (54x100) and
(36x100) matrices to factor analysis. As a result we obtained the invariant factor
structures of evaluations shown in tables 1 and 2.
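The step from rating scales to factor loadings can be sketched as follows. This is only an illustration under stated assumptions: it builds a correlation matrix from invented ratings and extracts a single factor by power iteration on the unmodified correlation matrix (unities on the diagonal), which is a simplification and not necessarily the authors' procedure as run on the M-222. All function names and data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long rating lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(scales):
    """Intercorrelation matrix of rating scales (one list of ratings per scale)."""
    return [[pearson(si, sj) for sj in scales] for si in scales]


def first_factor(corr, iterations=100):
    """Dominant eigenpair by power iteration; loadings = sqrt(eigenvalue) * eigenvector."""
    n = len(corr)
    vec = [1.0 / math.sqrt(n)] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        prod = [sum(corr[i][j] * vec[j] for j in range(n)) for i in range(n)]
        eigenvalue = math.sqrt(sum(p * p for p in prod))
        vec = [p / eigenvalue for p in prod]
    return eigenvalue, [math.sqrt(eigenvalue) * v for v in vec]


# Hypothetical ratings by six listeners on three bipolar scales (invented data)
ratings = [
    [1, 2, 3, 4, 5, 6],  # scale 1
    [2, 1, 4, 3, 6, 5],  # scale 2, mostly agrees with scale 1
    [6, 5, 4, 3, 2, 1],  # scale 3, exact mirror image of scale 1
]
eigenvalue, loadings = first_factor(correlation_matrix(ratings))
print(round(eigenvalue, 2), [round(w, 2) for w in loadings])
```

Scales that load on the same factor with opposite signs, like the first and third scales here, describe the same underlying dimension from opposite poles, which is how pairs with positive and negative weights can share one factor in the tables.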
Table 1. Invariant Factor Structure Obtained Through Factor Analysis of Composite
Matrices of the Evaluations of Individual Features of the Speaker
(Characteristics With Maximum Weights Are Shown)

Factor  Characteristics                  Weight
I       Decisive-vacillating             +0.73
        Stern-gentle                     +0.71
        Fearful-fearless                 -0.71
        Compliant-competitive            -0.71
        Willful-unwillful                +0.70
        Delicate-hard                    -0.70
        Confident-unconfident            +0.68
        Imperious-quiet                  +0.68
II      Slow-fast                        +0.82
        Gay-depressed                    -0.80
        Hurried-slow                     -0.80
        Passive-active                   +0.78
        Woeful-joyful                    +0.76
        Energetic-inert                  +0.75
        Introverted-communicative        +0.75
III     Conscientious-irresponsible      +0.67
        Serious-flippant                 +0.66
        Rough-delicate                   -0.64
        Intellectual-primitive           -0.64
        Smart-stupid                     +0.61
IV      Lean-stout                       +0.60
        Fat-thin                         -0.55
        Short-tall                       +0.50
        Age                              +0.50
        Stalwart-dwarfish                 0.47 [sign illegible]
        Old-young                        -0.43
V       Anxious-serene                   +0.47
        Nervous-calm                     +0.44
        Relaxed-taut                     -0.44
The first factor that was isolated by factor analysis of the combined matrix of
evaluations of the individual properties of the speaker included the characteristics
decisive-vacillating, delicate-hard, purposeful-spontaneous, willful-unwillful.
These characteristics describe volitional qualities of the individual, and the
factor may be defined as firmness or strength.
The second factor includes the characteristics passive-active, woeful-joyful,
slow-fast, inert-energetic, quiet-talkative. This can quite definitely be interpreted
as an activeness factor.
The third factor of this invariant structure brought together the characteristics
conscientious-irresponsible, serious-flippant, smart-stupid. This can probably be
interpreted as the intelligence factor.
The fourth factor consisted of characteristics describing appearance (weight and
height) and age. It is interesting that the characteristics fat-thin and short-tall
are positively correlated, as is the case with all individual matrices. The fifth
factor, which includes the characteristics anxious-serene, nervous-calm, can probably
be interpreted as a tension factor which is associated with anxiety and impulsiveness.
Individual factor matrices of the evaluations of speaker properties are combinations
of these factors. In some cases we revealed special factors in addition to these.
As an example the final factor matrix of evaluations of speaker No 1 contains factors
uniting the characteristics rough-delicate, dreamer-doer, confident-unconfident.
Table 2. Invariant Factor Structure of Voice Evaluations

Factor  Characteristics                     Weight
I       Fluent-stumbling                     0.83
        Uniform-nonuniform                   0.76
        Rhythmical-nonrhythmical             0.5?
        Tense-relaxed                       -0.47
        Free-constrained                     0.40
II      Squeaky-deep                         0.80
        High-low                             0.74
        Thin-thick                           0.74
        Bright-dark                          0.63
        Fast-slow                            0.62
        Sharp-dull                           0.61
III     Rich-dry                             0.67
        Bad-good                            -0.66
        Warm-cold                            0.69
        Hollow-full                         -0.59
        Unpleasant-pleasant                 -0.57
IV      Dull-clear                           0.76
        Monotonous-modulated                 0.72
        Muffled-deafening                    0.69
        Unpleasant-pleasant                  0.53
        Bright-dull                         -0.51
        Hoarse-clear                         0.42
V       Loud-soft                           -0.60
        Mobile-inert                        -0.58
        Fast-slow                           -0.56
        Hurried-unhurried                   -0.48
        Harsh-mild                          -0.40
VI      Distinct-inarticulate                0.75
        Comprehensible-incomprehensible      0.81
VII     Nasal-non-nasal                      0.51
The invariant factor structure of voice evaluations, shown in Table 2, contains
seven factors. The first factor combines characteristics describing the rhythm of
speech and its intensity. This means that the "intensity" characteristic is
determined by the rhythmical pattern of speech. The second factor may be interpreted as
the voice pitch factor. The third factor includes characteristics describing timbre
(rich-dry, warm-cold), and valuational characteristics. The fourth factor may most
likely be interpreted as a sound fullness factor. The fifth factor basically
involves speech rate evaluations. The sixth and seventh factors contain the smallest
number of characteristics, with the sixth most probably characterizing the recording
quality. It can be noted that evaluations having to do with loudness did not compose
a single factor, instead falling within the fifth and sixth factors.
The next step in the analysis was to obtain correlations between voice evaluations
and evaluations of the individual properties of the speaker. Characteristics having
the maximum factor weights were selected out of the factor matrices of evaluations
of the first speaker. In all we selected 52 characteristics, which we subjected to
correlation analysis (20 reflect voice qualities while 22 describe individual charac-
teristics of the speaker). The most significant coefficients were obtained for
correlations between the following characteristics:
Fluent-stumbling, woeful-joyful          0.40
Fast-slow, energetic-inert               0.43
Full-hollow, willful-unwillful           0.54
Deep-squeaky, willful-unwillful          0.53
Dull-clear, fearful-fearless             0.43
Hoarse-not hoarse, willful-unwillful     0.54
The following conclusions can be made:
1. The invariant structure of voice evaluations consists of five basic factors that
may be interpreted as rhythm, rate, timbre, pitch and fullness.
2. The invariant structure of evaluations of individual properties of the speaker
includes five factors: Activeness, will (firmness), intensity, intelligence and
appearance.
3. Fast speech creates the impression that the individual is active and energetic.
A low, dull, full voice is associated with a person who is purposeful, willful and
decisive. Rhythmical speech is evaluated as a sign of an elevated mood.
BIBLIOGRAPHY
1. Osgood, C. E., Suci, G. J., and Tannenbaum, P. H., "The Measurement of Meaning,"
Urbana, 1957.
2. Tzeng, O., and May, W., "More Than E.P.I. Semantic Differential Scales,"
INTERNATIONAL JOURNAL OF PSYCHOLOGY, Vol 10, No 2, 1975.
3. Pear, T. H., "Voice and Personality," New York, 1931.
4. Voiers, W. D., "Perceptual Bases of Speaker Identity," JASA, Vol 36, No 6,
1964.
5. Addington, D. W., "Voice and Perception of Personality," New York, 1968.
6. Farmann, R., "Die Deutung des Sprechausdrucks," Bonn, 1960.
THE SEMANTIC SPACE OF IDEAS ASSOCIATED WITH EMOTIONALLY COLORED SPEECH
Ye. F. Bazhin, G. A. Krylova
Speech occupies an important place in the communicative function belonging to
expression. Speech is one of the sources of information indicating presence of a
certain emotional state and an emotional relationship. It is commonly accepted (1)
that this information is contained in three levels of speech: acoustic-phonetic,
lexical-grammatic and semantic. Researchers are especially interested in the first
level--that is, one having to do with almost direct transmission of so-called affec-
tive language contained in emotional intonation--in changes in the physical
characteristics of the voice that are perceived and decoded by the listener. The
task of searching for and discovering concrete prosodic characteristics bearing in-
formation on emotional state is in general a solution to part of the theoretical
problem of man's perception by man. In the applied sense, on the other hand, the
objective of this task is to develop an automatic system for recognizing emotions
on the basis of the spoken signal.
This problem was studied in our previous work from the standpoints of both finding
the objective correlates of emotional states in the melody of voice (2,3) and
determining the possibilities of expert (listener) evaluation--that is, identification--of
the emotional color of speech (4). To a certain extent these two approaches are
opposites of one another: Thus on one hand we employed instrumental analysis based
on analyzing purely physical characteristics, while on the other hand we studied
the capability for communication associated with psychological features of a given
individual as a personality. This paper describes an attempt to find something
that would bring these two approaches closer together, something that could serve as
a link between them. One such linking tool, we believe, is language, and namely its
semantic wealth, which is used, among other purposes, for transmission of information
on emotional state.
The immediate objective of our work was to reveal the semantic space of Russian
language containing terms describing the emotional characteristics of voice and
speech. The fullest and most reliable source of such terms is creative literature,
in which the ideas of the author, his philosophy and his outlook on the world are
expressed by means of various descriptive resources, to include character descriptions
or expressions. A character description sometimes provides a direct indication as to
whether or not a certain emotional state is inherent to the hero, and it may indicate
the attitude of the narrator (the author) toward that state. Manufacturing an
emotion-inducing situation by means of his creative imagination, the author serves as
a unique sort of transformer, translating his ideas about the way the speech of his
heroes sounds into specific terms, the adequacy, clarity and completeness of which
define the artistic expressiveness of the work and promote transmission of the
desired information, making it understandable to a large number of people (readers).
In other words the objective of the author is to transmit acoustic characteristics
at the verbal level, to place them in semantic space in such a way that they could
be decoded--understood and felt. In this sense each writer acts as an experimental
psychologist on one hand, reconstructing the behavior of people in different
situations, and on the other hand he is a linguist-philologist, having an abundant lexicon
at his disposal.
Our method required the study of the creative works of 16 Russian and Soviet writers
(I. Bunin, N. Garin-Mikhaylovskiy, M. Gor'kiy, F. Dostoyevskiy, A. Kuprin,
N. Leskov, D. Mamin-Sibiryak, K. Paustovskiy, S. Sergeyev-Tsenskiy, A. Tolstoy,
L. Tolstoy, I. Turgenev, K. Fedin, A. Chekhov, M. Sholokhov and I. Erenburg). Three
excerpts from the works of each writer were analyzed. Each excerpt contained a
standard number of characters--20,000; in all, we studied 48 such excerpts with a
total volume of 960,000 characters. Expressions used to describe voice and speech
were extracted from the text. For the convenience of analysis these expressions
were given in adjectival form, for example "loud," "languid," and so on.
Table 1 contains the text analysis data. We can see from the table that the number
of times different writers make references to the color of voice and speech varies
broadly. A certain trend is evident, however: For about half of the authors
analyzed this number did not exceed 250-270, while for the rest it was significantly
higher. Consequently the frequency with which references are made to voice and
speech in standard excerpts from creative works of like genre may vary.
A similar situation was also revealed on analysis of the terminological structure
of these references--that is, the writer's lexicon (see Table 1). Here the differ-
ence was rather large--69 terms for N. Leskov and 237 for A. Kuprin.
Table 1

                                                          Ratio of No. of
                                 No. of       No. of      References to
     Writer                      References   Terms       No. of Terms
  1. I. Bunin                    270          147         1.84
  2. N. Garin-Mikhaylovskiy      239          108         2.21
  3. M. Gor'kiy                  605          17?         3.53
  4. F. Dostoyevskiy             236          128         1.84
  5. A. Kuprin                   719          237         3.03
  6. N. Leskov                   168           69         2.43
  7. D. Mamin-Sibiryak           379          134         2.82
  8. K. Paustovskiy              410          148         2.78
  9. N. Sergeyev-Tsenskiy        220          121         1.81
 10. A. Tolstoy                  558          142         3.92
 11. L. Tolstoy                  159           85         1.87
 12. I. Turgenev                 267          118         2.26
 13. K. Fedin                    536          202         2.65
 14. M. Sholokhov                527          185         2.85
 15. A. Chekhov                  306          109         2.8
 16. I. Erenburg                 371          118         3.14
     Mean                        373          139         2.61
There is another interesting indicator in Table 1: the ratio between the total
number of references to voice and speech encountered in a text, and terminological
diversity--that is, the total number of terms used by a writer. This indicator
illustrates the average frequency with which some term is used throughout the
entire analyzed text--that is, 60,000 characters. As we can see from the table,
the size of this indicator increases as the number of references in the text to
voice and speech increases, while on the other hand it decreases as the lexicon
becomes relatively less rich--that is, as relatively fewer terms are used to define
the characteristics of voice and speech.
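The indicator is simple division, and a short sketch makes the reading of Table 1 concrete. The counts below are copied from Table 1 for four writers; the function and variable names are hypothetical.

```python
# (number of references, number of different terms) as listed in Table 1
table_1 = {
    "I. Bunin": (270, 147),
    "A. Kuprin": (719, 237),
    "N. Leskov": (168, 69),
    "L. Tolstoy": (159, 85),
}


def mean_uses_per_term(references: int, terms: int) -> float:
    """Average number of times each distinct term is used in the analyzed text."""
    return references / terms


for writer, (references, terms) in table_1.items():
    print(f"{writer}: {mean_uses_per_term(references, terms):.2f}")
```

A larger value means that the writer reuses a comparatively small stock of terms, while a value near 1 would mean that almost every reference uses a new term.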
Figure 1
X axis--number of references in the text to different characteristics
of voice and speech. Y axis--number of different terms used to
describe voice and speech. 1--I. Bunin, 2--N. Garin-Mikhaylovskiy,
3--M. Gor'kiy, 4--F. Dostoyevskiy, 5--A. Kuprin, 6--N. Leskov,
7--D. Mamin-Sibiryak, 8--K. Paustovskiy, 9--N. Sergeyev-Tsenskiy,
10--A. Tolstoy, 11--L. Tolstoy, 12--I. Turgenev, 13--K. Fedin,
14--M. Sholokhov, 15--A. Chekhov, 16--I. Erenburg.
The laws revealed by analysis of the texts may be visualized in Figure 1, in which
the total number of references in the text to voice and speech are plotted on the
X axis and the number of terms (definitions) describing voice and speech is plotted
on the Y axis. We can see from the figure that between these indicators there exists
a dependence that would best be described as linear--that is, as the number of
references to voice and speech increases, the number of terms employed--that is,
the terminological diversity--grows as well.
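The linear dependence can be checked against the Table 1 counts with an ordinary least-squares fit. This is a sketch, not the authors' calculation: the fitting method is an assumption, the function name is hypothetical, and M. Gor'kiy is left out because his term count is only partly legible in the source.

```python
import math

# (references, terms) per writer from Table 1, excluding M. Gor'kiy
points = [
    (270, 147), (239, 108), (236, 128), (719, 237), (168, 69),
    (379, 134), (410, 148), (220, 121), (558, 142), (159, 85),
    (267, 118), (536, 202), (527, 185), (306, 109), (371, 118),
]


def linear_fit(pairs):
    """Ordinary least squares y = slope * x + intercept, plus Pearson r."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / math.sqrt(sxx * syy)


slope, intercept, r = linear_fit(points)
print(f"terms = {slope:.3f} * references + {intercept:.1f}, r = {r:.2f}")
```

A positive slope with a correlation coefficient close to 1 is what a roughly linear relation between the number of references and the number of terms would look like in these data.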
The total number of different terms referring to voice and speech encountered in
texts written by the 16 analyzed authors was 611. Analysis of the semantic content
of these terms, used to describe voice and speech (we used a 16-volume modern
Russian language dictionary published by the USSR Academy of Sciences), showed that
it (this content) may be divided into three basic categories.
The first included terms describing the acoustic-phonetic characteristics of voice
and speech directly: for example "whining, melodic, shrill," and so on. We revealed
150 such definitions, making up 25.5 percent of the total number of terms.
The second category contained terms metaphorical in nature, for example "urbane, thick,
reedy, oily" and so on. These totaled 45--that is, 7 percent.
And finally, the third and largest category contained terms used by the authors
to provide relatively direct information as to the presence of a concrete emotion:
"hopeless, anxious, elated, melancholy, angry" and so on. We discovered 416 such
terms--that is, 67.5 percent.
The next stage of analysis involves more-detailed classification within each of the
categories of terms isolated above. Thus for example, six different factors were
revealed in relation to acoustic-phonetic terms directly characterizing voice and
speech (terms in the first category): intensity (60 terms, or 40 percent of
the total number of 150 terms), pitch (24, 16 percent), speed (23, 15.4 percent),
rhythm (19, 12.6 percent), timbre (10, 6.7 percent) and distinctness (14, 9.3 percent).
The terminological lexicon resulting from this sampling procedure was also subjected
to analysis from the standpoint of the frequency with which individual terms were
used. By studying weight factors we were able to arrive at a "mandatory" set of
terms--that is, one common to all writers; obviously, this would also be a unique
sort of semantic summary of Russian-language terms used to describe voice and speech.
On the other hand this terminological lexicon (containing 611 terms) permitted us
to reveal clusters of synonyms, within which we determined, by analysis of weighted
evaluations, the central terms and the distances by which they were separated from
their synonyms. This in turn allowed us to represent these laws in the form of
three-dimensional concept models for key terms such as, for example, "loud," "joy-
ful" and so on.
In general the obtained data provide an impression of the semantic space of Russian
language, within which key concepts describing voice and speech are located as
individual points surrounded by synonymous terms, ones which obviously emphasize
certain shades of meaning. This material will be used in the future to develop a
specialized semantic differential that could be used to compile a terminological
lexicon for voice, this time expressing concrete emotions--melancholy, joy, anger
and so on.
The research method we used, which is essentially one of the variants of content
analysis, appeared sufficiently adequate and promising to us. And in fact, we are
dealing with writers whose works possess remarkable realistic strength of descrip-
tion and influence upon the reader's imagination. It is not difficult for us to
imagine the visual picture of the heroes of I. Bunin, L. Tolstoy or F. Dostoyevskiy--
their appearance, voice and manner of speaking. Therefore we can assume that the
terms used by these authors are a unique source of information on ideas about
emotionally colored voice and speech typical of man in general.
BIBLIOGRAPHY
1. Zhinkin, N. I., "Mekhanizmy rechi" [Speech Mechanisms], Moscow, 1958.
2. Bazhin, Ye. F., Galunov, V. I., Gorskiy, G. D., Manerov, V. Kh., and
Khvilivitskiy, T. Ya., "Analysis of Prosodic Characteristics of the Speech
of a Speaker Experiencing Different Emotional States," in "Analiz i sintez kak
vzaimo-obuslovlennyye metody eksperimental'nykh foneticheskikh issledovaniy"
[Analysis and Synthesis as Mutually Dependent Methods of Experimental Phonetic
Research], Minsk, 1972.
3. Bazhin, Ye. F., Galunov, V. I., Gorskiy, G. D., and Manerov, V. Kh., "Objective
Diagnosis of Emotional State in the Psychiatric Clinic on the Basis of Speech,"
in "Rech' i emotsii" [Speech and Emotions], Leningrad, 1975.
4. Bazhin, Ye. F., Vuks, A. Ya., and Koriyeva, T. V., "Possibilities for Recognizing
Emotions on the Basis of an Isolated Spoken Signal," in "Psikhologicheskiye
problemy psikhogigiyeny, psikhoprofilaktiki i meditsinskoy deontologii"
[Psychological Problems of Mental Hygiene, Preventive Psychology and Medical
Deontology], Leningrad, 1976.
A PACKAGE OF TESTS TO STUDY PERCEPTION OF EMOTIONAL SPEECH
A. V. Beskadarov, L. I. Vasserman, I. M. Tonkonogiy
Study of the emotional coloration of speech has recently been attracting the
interest of specialists in different fields. On one hand it is an object of interest
of researchers working on systems intended to monitor operator states. On the other
hand the emotional coloration of speech is being subjected to detailed study in
psychiatry, neurology, medical psychology and so on. Researchers attempting to
determine the state of an individual on the basis of his speech are encountering
certain difficulties in doing so. They are associated primarily with the absence
of informative characteristics that would allow reliable differentiation between
emotional states.
The role played in communication by speech formed out of the words of a concrete
language is universally recognized. The characteristics of such languages are
being studied in numerous linguistic, psychological and sociological research pro-
grams. But the significance of the language of emotions and of other paralinguistic
forms of spoken communication continues to be significantly outside the field of
view of the researchers, though these nonverbal forms of communication play a signi-
ficant role in the individual's activity, in his recognition of a situation, in
decision making and in evaluating the results of action and behavior. From our
point of view emotions are among the simplest and most meaningful languages used
by the human brain as it receives, stores and transmits information. This language
is apparently limited to about 20 key concepts which the individual can use to
arrive at a rough assessment of most situations and results of action. This
significantly limits the number of classes of such situations that need to be identified,
and it significantly facilitates and abridges information processing by the human
brain, especially when frequently encountered, repeating events are involved. A
significant advantage of this language is its genetic substrate--the possession of
the same language of emotions by all people. It is a unique Esperanto permitting
communication of one person with another, of a mother and a child in the first days
of its life, of people of different nationalities. It also facilitates communication
between people using conventional verbal language by imposing a general value
judgment on events being described by verbal messages.
However, the information which we possess from research on human emotions is highly
limited, even in regard to identification of emotions on the basis of facial
expressions and voice characteristics. This article describes a method aimed at
studying these indicators in patients suffering local brain lesions, since the
study of patients with focal pathology may provide significant help in gaining an
understanding of the cerebral mechanisms of emotional language. In order that other
forms of nonverbal speech could be studied as well, the method was supplemented not
only by tests aimed at studying perception of the emotional characteristics of
speech but also tests oriented on perception of intonational and individual features
of speech. Because the main objective of our work was not to analyze the capabilities
for identifying emotional states but to study the cerebral mechanisms of these
capabilities, the tests were written on the basis of not only tape recordings of
the speech of mental patients in different emotional states but also recordings of
the speech of actors simulating emotions and the speech of normal subjects to whom
different emotional states were suggested under hypnosis. Our lexicon of emotional
states included only five basic ones, encountered most frequently in the clinic and
characterized by concepts describing positive and negative emotions on a more-general
plane, without isolating the different variants of the states, analysis of the
perception of which is an independent problem that was not within the objective of our
work. These five types of emotional states included: 1. Normal. 2. Alarm.
3. Joy. 4. Melancholy. 5. Anger. The speech tests used in this method for
studying perception of emotional, intonational and individual characteristics of
speech are described below.
Test 1: Identification and paired comparison of emotional states (based on material
gathered in the clinic).
There are two parts to this test: 1) identification of emotional states, 2) paired
comparison of emotional states.
a) Identification of emotional states: One sentence, "It was an early spring," was
chosen as the starting material. It was tape-recorded while read by patients ex-
periencing different emotional states. The procedure begins with a training series
intended to teach the patients how to recognize and name emotional states on the
basis of voice characteristics (the recording of each state is repeated three times).
The principal research series contained samples of emotional states presented in
random order. The subject was first given a training series and then the principal
series. On listening to the principal series, he had to recognize and name the
appropriate emotional states.
b) Paired comparison of emotional states: Following preliminary training, in this
experiment the subject was asked to successively compare two stimuli. In his response
he had to declare whether the paired emotional states were different or identical.
The experimental sentence was the same as in the first part: "It was an early spring."
The training series consisted of the following seven pairs: 1) normal-normal,
2) alarm-alarm, 3) joy-joy, 4) melancholy-melancholy, 5) anger-anger, 6) normal-
joy, 7) anger-melancholy. Of course, it would not have been suitable in this
experiment to use training pairs consisting of all five states.
The principal series was based on 30 pairs of comparisons recorded in random order.
It was unsuitable to include a large number of comparisons because this could tire
the patients. On being presented each pair of stimuli, the subject had to say
whether both stimuli were uttered in the same or in different emotional states.
Test 2: Identification and paired comparison of emotional states (actor simulation
and suggestion under hypnosis).
This test made use of spoken material obtained: 1) by recording the words spoken
by an actor (actor simulation) and 2) by recording the words of a speaker under
hypnosis.
a) Actor simulation: An actor read the sentence "This is so simple that I have to
say it" in one of the states he was asked to simulate: 1) normal, 2) anger,
3) melancholy, 4) joy, 5) alarm. The design of the experiment was the same as
with test 1 (paired comparison). The training series consisted of four pairs of
stimuli: 1) normal-normal, 2) melancholy-melancholy, 3) normal-melancholy, 4) joy-
alarm. The principal series consisted of thirty pairs.
b) Suggestion under hypnosis: There were two parts to this test: a) paired
comparison of emotional states, b) identification of emotional states. One of the states
indicated above was suggested to the speaker, who then uttered the control sentence
"This is so simple that I have to say it." The training and principal material used
for paired comparison was arranged in the same order as with actor simulation.
Test 3: Identification and paired comparison of different intonational structures.
The experiment required analysis of seven different intonational structures based on
the same sentence: "Mommy bathed Man'ya." Here is a sample of the training text
offered to the subject for emotion identification:
No 1. Mommy bathed Man'ya (neutral advisory intonation).
No 2. Mommy bathed Man'ya (logical stress on the first word).
No 3. Mommy bathed Man'ya (logical stress on the second word).
No 4. Mommy bathed Man'ya (logical stress on the third word).
No 5. Mommy bathed Man'ya? (questioning intonation).
No 6. Mommy bathed Man'ya! (exclamatory intonation).
No 7. Mommy bathed Man'ya. (incomplete intonation).
These structures were arranged in random order in the principal text.
For paired comparison, the subject was offered a training sample of five pairs of
stimuli, and he was asked to respond whether the stimuli were the same or different
(the pause between pairs was 5 seconds). There were 30 pairs of stimuli in the
principal text, with the number of pairs of identical stimuli being equal to the
number of pairs of different stimuli.
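The composition of such a principal series can be sketched as follows. This is only an illustration under stated assumptions: the function names, the fixed random seed and the way pairs are drawn are hypothetical and are not described in the source; only the figures (seven structures, 30 pairs, equal numbers of identical and different pairs) come from the text.

```python
import random

# The seven intonational structures of Test 3, numbered as in the training text.
STRUCTURES = list(range(1, 8))


def build_series(n_pairs=30, seed=0):
    """Return n_pairs stimulus pairs, half identical and half different, shuffled."""
    rng = random.Random(seed)  # fixed seed is an assumption, used for repeatability
    half = n_pairs // 2
    same = [(s, s) for s in rng.choices(STRUCTURES, k=half)]
    # rng.sample draws two distinct structures, so these pairs always differ.
    different = [tuple(rng.sample(STRUCTURES, 2)) for _ in range(n_pairs - half)]
    series = same + different
    rng.shuffle(series)
    return series


def score(series, answers):
    """Fraction of 'same'/'different' answers that match the true relation of each pair."""
    truth = ["same" if a == b else "different" for a, b in series]
    return sum(t == r for t, r in zip(truth, answers)) / len(series)


series = build_series()
```

A subject's responses, written as a list of "same" and "different" strings, can then be passed to `score` together with the series.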
Test 4: Identification and paired comparison of individual characteristics of
pronunciation.
The objective of this test is to reveal the particular way subjects perceive
different voices in a sample and their capability for distinguishing different
voices presented in pairs.
In this test 15 male speakers uttered the control phrase "Everything was blanketed
by dark clouds." The subjects are not acquainted with the voices of these speakers
prior to the experiment. The training text consists of five pairs of stimuli (two
pairs of identical voices and three pairs of different voices). In the principal
series the subject is required to successively listen to 30 repetitions of this
sentence, and venture a conclusion as to the similarity or dissimilarity of two
similar repetitions--that is, the similarity or dissimilarity of each successive
stimulus in relation to the previous one. The subject used the symbol to denote
similar stimuli and to denote different stimuli.
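The successive comparison in the principal series of Test 4 can be illustrated with a short sketch. The speaker numbering and the example order below are hypothetical; the sketch only shows how the expected judgment for each stimulus follows from the one before it.

```python
def expected_answers(speakers):
    """For each stimulus after the first, say whether the voice matches the previous one."""
    return ["same" if cur == prev else "different"
            for prev, cur in zip(speakers, speakers[1:])]


# Hypothetical order of speakers (identified by number) in a short series.
order = [3, 3, 7, 1, 1, 1, 12]
answers = expected_answers(order)
print(answers)
# ['same', 'different', 'different', 'same', 'same', 'different']
```

Under this scheme a series of 30 repetitions yields 29 judgments, since the first stimulus has no predecessor.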
This method was applied in a preliminary experiment to 11 patients with local
lesions of different divisions of the cerebral cortex. The impression is that
recognition of emotional, intonational and individual characteristics of speech
in the tests is associated in a number of cases with focal pathology of the temporal
divisions of the cerebral cortex, and predominantly in the right hemisphere of
right-handed patients. Incidentally this research has only just begun, and more
observations will have to be accumulated before the results could be summarized and
analyzed.
ANALYSIS OF THE VARIABILITY OF THE MELODIC CONTOURS OF SPEECH
A. V. Beskadarov, V. I. Galunov
A large number of applied problems have recently been found that can be solved
successfully by the methods of factor analysis. In our work we had the possibility
to examine one of these methods, namely a computer program method of isolating main
factors. We used this method to detect individually variable parameters at the
prosodic level (the melodic outline of speech), associated with analysis of different
emotional states reproduced in the speech of an actor (the so-called "actor model").
Basically, factor analysis reveals the concealed laws of numerous measurements
through analysis of correlation (or covariance) matrices. It is based on the
assumption that observed variables may be expressed through concealed independent
parameters or factors. In other words analysis of variations in some phenomenon can
reveal the concealed laws of these variations.
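As a minimal sketch of the idea, the first principal factor of a correlation matrix can be found by power iteration. The source does not describe the program actually used, so the algorithm choice, the function name and the small example matrix are all assumptions made for illustration.

```python
import math


def first_factor(corr, iterations=200):
    """Leading eigenvalue and factor loadings of a symmetric correlation matrix."""
    n = len(corr)
    v = [1.0 / math.sqrt(n)] * n          # starting vector of unit length
    eigenvalue = 0.0
    for _ in range(iterations):
        # Multiply the matrix by the current vector, then renormalize.
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = math.sqrt(sum(x * x for x in w))
        v = [x / eigenvalue for x in w]
    # Loadings are the eigenvector scaled by the square root of the eigenvalue.
    loadings = [x * math.sqrt(eigenvalue) for x in v]
    return eigenvalue, loadings


# Hypothetical correlations among three observed variables.
corr = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
eigenvalue, loadings = first_factor(corr)
```

In this made-up example the first two variables load heavily on the factor and the third only weakly, which is the kind of grouping that factor treatment is meant to expose.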
I. Materials and Methods
In our experiments we used a tape recording of a single test sentence ("Это так
просто, что хочется сказать" [Eto tak prosto, chto khochetsya skazat';
This is so simple that I have to say it]) spoken by an actor. The speaker was
instructed to utter this sentence in one of 44 prescribed states (normal, arousal,
languor, joy and so on). Correspondingly, 44 melodic contours were obtained by
processing the oscillograms. Then we selected out the frequencies of the fundamental
tone of vowel sounds within the test sentence (11 frequencies) (see Table 1).
The material in Table 1 was subjected to factor analysis in two variants: 1) factor
treatment of the characteristics of emotional states (a 44x44 correlation table)
and 2) factor treatment of the frequencies of the fundamental tones of individual
vowel sounds (an 11x11 correlation table).
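The two variants differ only in which way the data matrix is read, as the following sketch suggests. The readings below are hypothetical stand-ins (four states by five vowels) rather than the values of Table 1, and the helper names are illustrative; with the full material the same calls would give a 44x44 and an 11x11 table.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def correlation_table(rows):
    """Square table of correlations between every pair of rows."""
    return [[pearson(a, b) for b in rows] for a in rows]


# Hypothetical fundamental-tone readings: 4 states (rows) x 5 vowels (columns).
contours = [
    [215, 220, 275, 205, 235],
    [250, 300, 340, 290, 330],
    [205, 210, 245, 165, 170],
    [200, 200, 180, 180, 176],
]

by_state = correlation_table(contours)               # variant 1: states x states
by_vowel = correlation_table(list(zip(*contours)))   # variant 2: vowels x vowels
```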
The first variant of treatment revealed four spectrums connected with the real data
on emotional states. All of the states conditionally fell into four classes
corresponding to meaningful factors, which were labeled: 1) "relaxation" factor (14
emotional states), 2) "alarm" factor (18 states), 3) "aggressiveness" factor (6 states),
4) "uncertainty" factor (2 states). There were one or two states in each class which,
generally speaking, were not suited to the label characterizing that class (thus
the states of joy, tension and inspiration were such an exception in class 1). It
may be concluded that factor treatment of the test sentence in relation to different
Table 1. Frequencies of the Principal Tone in the Phrase "This is so Simple That
I Have to Say It" for Speaker M. (In 44 Emotional States)
[? = character illegible in the source copy]

No     1    2    3    4    5    6    7    8    9   10   11   State
       э    о    а    о    о    о    о    э    а    а    а
 1   215  220  220  275  205  200  250  215  205  195  235   Normal
 2   210  210  210  265  190  190  240  215  195  200  225   Concentration
 3   215  205  205  230  200  195  235  220  195  196  240   Relaxation
 4   250  270  280  330  230  235  230  270  250  240  286   Confidence
 5   250  300  305  340  290  295  335  310  285  295  330   Arousal
 6   220  220  230  260  195  200  260  235  210  210  245   Languor
 7   235  230  240  300  220  225  270  252  230  240  260   Tension
 8   275  285  290  315  295  280  325  305  385  2?6  318   Perturbation
 9   260  350  330  375  305  295  350  310  325  320  345   Frenzy
10   250  270  260  300  206  225  305  260  250  222  297   Inspiration
11   330  330  330  440  230  265  390  330  296  270  307   Joy
12   230  2?5  290  272  272  295  ?60  320  312  307  295   Delight
13   210  235  240  290  250  215  270  246  220  220  266   Timidity
14   220  230  240  295  215  215  27?  260  225  230  260   Embarrassment
15   1?0  195  210  255  180  210  260  236  225  216  236   Uncertainty
16   ??7  210  207  210  215  2?2  190  216  202  195  216   Doubt
17   245  220  222  282  190  197  237  220  226  212  236   Disenchantment
18   ??0  220  230  280  200  217  280  250  235  232  280   Insult
19   2?0  22?  325  267  195  230  290  230  242  227  240   Displeasure
20   1?0  232  245  297  195  220  302  250  245  252  277   Revulsion
21   27?  310  360  360  285  290  336  290  296  305  320   Aggressiveness
22   2??  ?12  330  395  308  317  368  310  320  306  330   Anger
23   ?11  300  332  282  212  216  276  312  332  327  332   Indignation
24   ??1  207  240  350  275  310  347  237  217  210  267   Surprise
25   250  245  272  300  265  263  280  270  247  265  272   Confusion
26   2??  302  340  410  310  355  370  320  290  296  350   Amazement
27   20?  212  240  266  220  200  250  230  210  212  237   Shock
28   210  230  245  2?0  222  207  320  245  220  220  246   Relief
29   220  245  260  300  220  215  267  230  217  220  235   Pleasure
30   175  1?0  260  215  177  190  247  220  190  190  222   Satisfaction
31   110  222  245  285  240  230  285  240  235  250  270   Anxiety
32   200  217  225  245  2?0  215  257  240  232  ?30  245   Fear
33   275  3??  300  350  2?2  280  325  295  295  290  326   Terror
34   225  225  220  265  190  190  240  227  205  190  210   Woe
35   205  210  210  24?  165  185  250  200  180  170  170   Melancholy
36   200  200  200  180  180  170  217  190  160  170  176   Depression
37   200  225  305  205  200  200  280  250  210  210  220   Suffering
38   250  305  370  310  2?0  280  345  310  305  300  310   Despair
39   250  270  226  385  320  300  362  315  320  315  320   Irritation
40   300  350  382  340  290  320  345  330  335  312  340   Resentment
41   287  307  305  390  270  295  375  310  325  330  260   Hatred
42   27?  250  265  240  210  200  240  230  210  200  226   Tenderness (variation 1)
43   220  235  235  250  1?5  210  255  230  225  220  240   Tenderness (variation 2)
44   21?  220  236  2?0  175  205  290  250  225  205  172   Love
characteristics (states) produces an approximate breakdown of the mass of data into
several classes containing similar states.
The second variant of treatment involved analysis of the succession of vowel phonemes
in the test sentence. We attempted to reveal certain correlations between the re-
vealed factors and the character of the melodic contours of the phrases analyzed,
and to establish certain types of controlling influences which would form the melodic
contour of the sentence in each individual pronunciation. Factor analysis in relation
to sound revealed two factors, the first of which was found to be correlated with
stressed vowels in the phrase, the second being correlated with unstressed vowels. In
other words the computer program of factor analysis did establish certain laws in
the formation of the melodic contour of a statement.
Conclusions: Factor analysis in relation to emotional states revealed four classes
of states; their comparison with spectrographic data showed that these classes are
typified by different degrees of unevenness and range of the fundamental tone (high
for factors I and III, low for factor II) and differences in the location of the
first melodic rise--namely at the vowel "a" in "так просто..." for factor III, and
at the first vowel "o" in "так просто..." for factor II.
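The contour descriptors named in these conclusions can be approximated numerically. The sketch below is an assumption-laden illustration: the source does not define how unevenness or the position of the first rise was measured, so the mean absolute step and the first upward step are stand-in definitions, and both contours are hypothetical.

```python
def describe(contour):
    """Range, unevenness and vowel number of the first rise in a melodic contour."""
    steps = [b - a for a, b in zip(contour, contour[1:])]
    value_range = max(contour) - min(contour)
    # Assumed measure of unevenness: mean absolute change between adjacent vowels.
    unevenness = sum(abs(s) for s in steps) / len(steps)
    # 1-based number of the first vowel that is higher than its predecessor.
    first_rise = next((i + 2 for i, s in enumerate(steps) if s > 0), None)
    return value_range, unevenness, first_rise


# Hypothetical 11-vowel contours of the fundamental tone.
flat = [200, 200, 200, 195, 190, 190, 195, 190, 185, 185, 180]
lively = [250, 300, 305, 340, 290, 295, 335, 310, 285, 295, 330]

print(describe(flat))    # (20, 3.0, 7)
print(describe(lively))  # (90, 28.0, 2)
```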
II. Investigation of Factor Loads and Analysis of Semantic Content in a Study of
Speaker Individuality
Materials and methods: The initial sentence "Мама мыла Маню" [Mama myla Manyu;
Mama bathed Man'ya] was pronounced by 26 speakers (24 stated it once, one stated it
29 times, and another speaker stated it 35 times). The speakers uttering this
sentence were not asked to read it in the same way. Therefore they read the sentence
differently each time, while remaining within the framework of a given type of communi-
cation. Comparison of the factor loads and the real melodic contours revealed four
basic patterns for the melodic contour of the control sentence.
Type I was characterized by a relatively even melodic contour, one with no sharp
rises and falls throughout the entire sentence. Type II was characterized by
pronounced first and second melodic rises separated by a not very large fall. The type III
melodic contour differs from the type II contour in that the fall between the first
and second melodic rises is very significant, while in comparison with the first
melodic rise, the second melodic rise appears more "massive" than is the case with
the type II melodic contour. The type IV melodic contour is characterized by a
clearly pronounced descending melodic pattern.
Factor analysis in relation to segments of the test sentence revealed the same laws for
formation of the melodic structure and the "responsibility" carried for this process
by the individual factors indicated in part I of this communication.
SIGNIFICANCE OF PROSODIC AND SPECTRAL PARAMETERS OF SPOKEN
SIGNALS EXPRESSING DIFFERENT EMOTIONAL STATES
L. P. Blokhina, T. G. Gomina
Isolating from an emotionally colored spoken signal the acoustic parameters bearing
information on a given emotional state, and determining the significance of
each of these parameters, is one of the central problems of research on emotionally
colored speech.
Up until now, this problem has mainly been attacked from the standpoint of acoustic
analysis of prosodic characteristics (1-5). Attempts at spectral analysis of
emotionally colored spoken signals have been far fewer (6-9).
Our research objective was to determine the significance of individual acoustic
parameters (prosodic and spectral) in spoken signals expressing different emotional
states by applying the analysis-synthesis-analysis method.
We used English-language material for our intonographic and spectrographic analysis
of emotionally colored speech. We selected emotional states falling within the
classes joy, anger, fear and melancholy in comparison with neutrally colored speech,
which we referred to as normal. This report discusses the results of the synthesis
stage and of subsequent listener analysis.
A two-syllable nonsense signal simulating primary and secondary syllables in
experimental sentences uttered by English speakers was used as the material for synthesis.
The experimental material was synthesized at the LEF [not further identified] of the
First Moscow State Pedagogical Institute of Foreign Languages imeni M. Torez. The
synthesizer was part of a complex of apparatus intended for primary analysis and
synthesis of speech, fashioned out of an "Ural-14" computer complex. The number of
temporal intervals in the programmer was 45. The Tинт parameter is described by a
five-digit code (from 5 to 160 msec with a spacing of 5 msec). F0 is described by
an eight-digit code (from 78 msec to 50 msec with a spacing of 78 msec). Parameters
F1, F2, F3 are described by a seven-digit code with variations in the following
ranges: 200-1,470 Hz, 500-3,040 Hz, 1,000-6,080 Hz. Three further parameters [symbols
illegible in the source copy] are described by a four-digit code in the following
ranges: 600-2,400 Hz, 1,000-1,050 Hz, 2,000-9,950 Hz. The amplitude parameters of all
types of formants are described by a two-digit code. The synthesizer can work with
four amplitude values: A0, Amin, Amed, Amax.
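The relation between code length, range and spacing can be checked with a little arithmetic. The sketch assumes that "digit" here means a binary digit, so that an n-digit code has 2^n values; the source does not say this outright, and the function names are illustrative.

```python
def code_levels(bits):
    """Number of distinct values an n-bit binary code can represent."""
    return 2 ** bits


def step_size(low, high, bits):
    """Spacing between adjacent values when the code spans low..high inclusive."""
    return (high - low) / (code_levels(bits) - 1)


print(code_levels(2))            # 4 amplitude values for a two-digit code
print(step_size(200, 1470, 7))   # 10.0 Hz for the 200-1,470 Hz range
print(step_size(500, 3040, 7))   # 20.0 Hz for the 500-3,040 Hz range
print(step_size(600, 2400, 4))   # 120.0 Hz for the 600-2,400 Hz range
```

Under this assumption the quoted ranges divide into whole-number steps, and a two-digit code gives exactly the four amplitude values mentioned.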
The program fed into the synthesizer was written on the basis of conclusions made
as a result of instrumental analysis of the experimental material contained in
statements made by 11 speakers. The synthesis program was written with a consider-
ation for only those emotional state characteristics which were most common--that is,
those which were noted among all or most speakers. (In view of the limited techni-
cal possibilities of the synthesizer we were unable to include in the program a
number of acoustic characteristics isolated in the course of spectral analysis).
In comparison with normal, the following characteristics were found to be the most
common: For the state of joy--expansion of range and a rising-falling contour for
the ChOT [frequency of the fundamental tone]; vocal saturation of phrases;
shifting of spectral energy into the high frequency range; expansion of the spectrum
of stressed and unstressed vowels, and higher intensity of stressed vowels at high
frequencies; for the state of melancholy--narrowing of range, and a falling contour
for the ChOT; consonant saturation of phrases; shifting of spectral energy toward
lower frequencies, narrowing of the spectrum of stressed vowels; for the state of
anger--expansion of range and a falling contour for the ChOT; consonant saturation
of phrases, shifting of spectral energy into the high frequency range, expansion of
the spectrum of stressed vowels, and growth in their intensity; for the state of fear
[sic; the word "anger" was probably intended]--expansion of range and a rising
contour for the ChOT, consonant saturation, shifting of spectral energy into the high
frequency range, expansion of the spectrum of stressed vowels, and growth in their
intensity; for the state of fear--expansion of range and an ascending contour for the
ChOT, consonant saturation, shifting of the energy spectrum into the high frequency
range, insignificant expansion of the spectrum of stressed vowels.
The program was written for the neutral state and for each of the emotional states
named above. The program written for the neutral state was subjected to change in
stages. The first stage entailed variation of the frequency characteristics of the
signal with all of the rest of the parameters remaining unchanged. In the second
stage the changes in the ChOT were supplemented first by modification of the length
of consonant sounds and then modification of the length of vowel sounds. In the
third stage the high frequency components of the vowel spectrum were introduced
(during simulation of intense emotions), after which their amplitude was increased
by one order of magnitude. Each separate modification of the initial program
corresponding to a normal state was recorded on ferromagnetic film. Thus we
obtained: a) signals in which only one of the prosodic or spectral characteristics
was varied; b) signals with variations in prosodic characteristics (ChOT and length);
c) signals with modifications of spectral characteristics; d) signals with
simultaneous variations in prosodic and spectral characteristics. The model of a
particular emotional state was a summational signal including both prosodic and spectral
modifications typical of the given emotional state. Synthesized signals were
analyzed by means of a spectrum analyzer with the purpose of testing the adequacy of
spoken signals to program-simulated signals. Analysis of the obtained spectrograms
indicated that the simulated and synthesized signals were identical, making it possible
to go on to the next stage of the research--listener analysis.
Two series of listener analysis were conducted, with seven experienced listeners
participating. The same listeners participated in both analysis series. The
second series was conducted with the purpose of checking the reliability of data
obtained in the first series. Comparison of the results of the two series demonstrated
their sufficiently good match. The instructions required the listeners to perform
two tasks: 1. classify a nonsense stimulus in relation to the following character-
istics: a) neutrality-emotionality, b) positive emotion/negative emotion, c) intense
emotion/weak emotion. At first this test was performed as the subjects listened to
complete models of emotional states, and subsequently while listening to signals in
which certain acoustical parameters were modified. Each emotionally colored non-
sense stimulus was paired with an emotionally neutral nonsense stimulus, the pause
between them being 2 seconds. 2. Determine emotional state on the basis of the
nonsense stimulus provided. In this case the subjects listened only to the complete
models of the emotional states and the normal recording.
The results of listener analysis permit the following conclusion, tentative for the
moment.
In the first half, all listeners without exception were able to distinguish the
normal stimulus from stimuli simulating emotional state. Judging from the responses
of the listeners they were able to distinguish strong from weak emotional states
and positive from negative emotional states rather easily (see Table 1).
As is obvious from this table, the listeners are able to classify joy and melancholy
with sufficient adequacy. The most difficulty occurred in identification of signals
simulating anger and fear. In general the listeners confidently related these
signals to strong and negative emotions, but separation of the latter from one
another was a complex task. Thus the signal simulating anger was identified as
anger by 57 percent of the listeners and as fear by 43 percent. The signal
simulating fear was classified as fear by 31 percent of the listeners and as anger by
69 percent. These difficulties in identifying these models compelled us to conduct
one more series of listener analysis, in which signals simulating fear and anger
were presented to the listeners in pairs.
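The anger and fear figures quoted above can be laid out as a small confusion table. The percentages are those given in the text; the table layout and the helper functions are illustrative assumptions.

```python
# Percentages of listener responses reported in the text (outer key: signal presented).
confusion = {
    "anger": {"anger": 57, "fear": 43},
    "fear": {"anger": 69, "fear": 31},
}


def correct_rate(table):
    """Mean percentage of correct identifications across the presented signals."""
    return sum(table[s][s] for s in table) / len(table)


def dominant_response(table, signal):
    """Label most often given by the listeners to the presented signal."""
    return max(table[signal], key=table[signal].get)


print(correct_rate(confusion))               # 44.0
print(dominant_response(confusion, "fear"))  # anger
```

On these figures the fear signal was labeled anger more often than fear, and the mean correct rate of 44 percent lies below the 50 percent that guessing between two classes would give.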
The listeners were told that two successive signals they were to hear belonged to
different classes--one to anger and the other to fear. The task of the listeners
was to classify the presented signals appropriately. In this stage of analysis
80 percent of the listeners correctly evaluated the synthesized signals presented
to them.
One interesting fact that should be noted is that signals with modified spectral
characteristics (expansion of spectrum and growth in the intensity of vowel formants)
which did not undergo corresponding change in their prosodic parameters were
classified by the listeners as manifestations of weak negative emotions (caution, mild
anger, reproach), though according to the results of spectrographic analysis these
spectral modifications were typical of emotional states such as anger, fear and,
to the greatest degree, joy. In all probability the spectral modifications typical
of the state of joy cannot by themselves (without participation of prosodic
characteristics) transmit this emotional state. Moreover we should also consider
the still inadequately high technical possibilities of the synthesizer we used,
making it impossible to simulate finer modifications of the spectrum.
An analysis of the listener responses revealed parameters promoting adequate
identification of the corresponding emotional states. The principal parameter in
identification of joy is the ChOT (its contour and range). When only the frequency
contour was modified (narrowing of the range and depression of the frequency level),
[Table 1, printed sideways in the source, is illegible in this copy.]
the listeners were able to classify this zone only as a negative emotion (anxious
remorse, caution, fear, reproach, displeasure). Additional increase in the length
of consonants and vowels permitted the listeners to assign
the given signal to the melancholy class with sufficient reliability.
Table 2. tc/tv Ratio Within Syllables (in Relative Units)

State    First Syllable    Second Syllable
Anger    2.88              4
Fear     3.56              5.11
Although listener analysis of the emotional states of "anger" and "fear" was conducted
in special conditions, we can still isolate the dominant parameters participating
in identification of these emotional states (though this conclusion requires
further testing, more so than do the others). These parameters include the
frequency contour typical of the given emotional states: rising in the presence
of fear and falling in the presence of anger, together with an increase in the length
of the first consonant and a decrease in the length of the second consonant coupled
with a simultaneous decrease in the length of vowels. Redistribution of the length
of a consonant and a vowel within a syllable may be significant in this case (see
Table 2).
BIBLIOGRAPHY
1. Vitt, N. V., "Expression of Emotional States in Speech Intonation," Candidate
Dissertation, Moscow, 1965.
2. Uldall, E., "Attitudinal Meanings Conveyed by Intonational Contours," LANGUAGE
AND SPEECH, Vol 13, 1958.
3. Lieberman, Ph., and Michaels, S., "Some Aspects of Fundamental Frequency and
Envelope Amplitude as Related to the Emotional Content of Speech," JASA, Vol 34,
No 7, 1962.
4. Bonner, M. R., "Changes in the Speech Pattern Under Emotional Tension," THE
AMERICAN JOURNAL OF PSYCHOLOGY, Vol 52, No 2, 1943.
5. Fairbanks, G., and Pronovost, W., "Vocal Pitch During Simulated Emotion,"
SCIENCE, Vol 88, 1938.
6. Williams, C., and Stevens, K., "Emotions and Speech: Some Acoustical Correlates,"
JASA, Vol 52, No 4, 1972.
7. Nikonov, A. V., and Popov, V. A., "Structural Characteristics of the Speech of
a Human Operator in Stressful Conditions," in "Rech' i emotsii" [Speech and
Emotions], Leningrad, 1975.
8. Frolov, M. F., and Taubkin, V. L., "Influence of a Speaker's Emotional State
on Some Parameters of the Speech Signal," in "Rech' i emotsii," Leningrad,
1975.
9. Tishchenko, A. G., "Dynamics of Formants in the Spectrum of Audible Speech as
an Objective Indicator Distinguishing Positive From Negative Emotions,"
KOSMICHESKAYA BIOLOGIYA I MEDITSINA, No 5, 1968.
INFORMATION CONTENT OF THE TIMBRE CHARACTERISTICS OF SPEECH
A. P. Varfolomeyev
Despite the fact that the overwhelming majority of works describing the system of
prosodic characteristics of speech make mention of its timbre, the question of
timbre as an element of speech (and all the more so of language) possessing a certain
information content remains open for practical purposes. As a rule we see the term
"timbre of voice" used, and not "timbre of speech." And when discussion of the
timbre of speech is found to be unavoidable, authors are compelled to limit themselves
to general statements as to the possibility of expressing the emotional
properties of speech through timbre, forming a subtext and so on. The fact that
a listener's consciousness associates the timbre of the speaker with his emotional
and mental state has been mentioned more than once (1,2).
We did not come across any attempts at classifying the timbre of speech in relation
to the particular content it may express. The universally known classification of
voices (alto, soprano, tenor and so on) is hardly applicable to the timbre of
speech, since it does not correlate with any semantic elements of the speech act.
We obviously cannot base a classification of the timbre of speech on how content
descriptions of timbre group together and contrast with one another. Instead,
such a classification should be arrived at by seeking the laws governing change in
the timbre of speech in connection with the change in its information content. It is
not our objective here to arrive at such a classification, because to record changes
in the timbre of speech, we must first of all find a means of describing the information
content of timbre--that is, a means of accounting for the semantics of a
speaker's timbre within a certain excerpt of speech.
By describing a speaker's timbre in relation to its information content, we would
be able to reflect in a certain way the individual characteristics of speech. From
our point of view the timbre characteristics of speech are the most individualized,
and apparently this is precisely why it is so difficult to describe the content of
these characteristics. Those aspects of the content of the timbre of speech that
yield to description and which may be correlated with expressions of the speaker's
emotional and mental state will find their use in general analysis of the emotional
characteristics of speech.
We attempted to study the information content of the timbre of speech by the known
method of the semantic differential (3), based on recording the verbal evaluations
made by subjects of a certain aspect of a spoken stimulus. The use of the human
auditory system as a sufficiently sophisticated analyzer of individual and emotional
characteristics of speech has already proven its worth (4).
Methods of the Experiment
Subjects were asked to evaluate the timbre of the voice of a speaker (tape-recorded)
uttering a certain phrase, using scales such as "good-bad," "large-small," "happy-sad"
and so on. About 80 such scales were used in all. The scales were divided into
five ranks, each having a score from 1 to 5. As an example the scores for the "good-
bad" scale reflected perception of the following ranks of timbre: 1--"very good,"
2--"good," 3--"neither," 4--"bad," 5--"very bad." The lexical and grammatical
content of the phrases serving as stimuli was the same for all analyzed timbres;
consequently the main element that underwent variation was the timbre of the
speaker's voice.
College students of different majors and specialties and senior high school students
served as the subjects. Each stimulus was evaluated by an average of 50 subjects.
Printed questionnaires bearing a list of the scales of characteristics and an
explanation of the scoring system were used.
Statistical treatment of the data entailed finding the average scores for each
stimulus in relation to all scales, and correlation and factor analysis of these
averages.
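The averaging and correlation steps named above can be sketched as follows. This is only an illustration in Python: the function names, the dictionary layout and the toy ratings are assumptions introduced here, not the authors' procedure, and the factor analysis step is not reproduced.

```python
from math import sqrt
from statistics import mean


def average_scores(ratings):
    """ratings[stimulus][scale] is a list of 1..5 scores given by individual subjects.

    Returns the average score of every stimulus on every scale.
    """
    return {
        stimulus: {scale: mean(scores) for scale, scores in scales.items()}
        for stimulus, scales in ratings.items()
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def scale_correlation(averages, scale_a, scale_b):
    """Correlation between two scales, computed across the stimulus averages."""
    stimuli = sorted(averages)
    return pearson(
        [averages[s][scale_a] for s in stimuli],
        [averages[s][scale_b] for s in stimuli],
    )


# Hypothetical toy data: three stimuli, two scales, three subjects each.
ratings = {
    "stimulus_1": {"good-bad": [1, 2, 2], "pure-hoarse": [1, 1, 2]},
    "stimulus_2": {"good-bad": [4, 5, 4], "pure-hoarse": [5, 4, 5]},
    "stimulus_3": {"good-bad": [3, 3, 3], "pure-hoarse": [3, 2, 3]},
}
averages = average_scores(ratings)
r = scale_correlation(averages, "good-bad", "pure-hoarse")
```

Scales whose averages correlate strongly across stimuli are candidates for membership in the same factor group; extracting the factors themselves would require a separate factor analysis routine.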
Two series of experiments were conducted (with 11 and 17 stimuli).
Results and Discussion
Obviously average scores falling between 1 and 2 and between 4 and 5 would be signi-
ficant. Averages such as these would persuasively indicate that the timbre character-
istics of speech can be differentiated by the perceiver.
Thus for example, the information content of one certain timbre is expressed by the
characteristics "sad," "old," "inaccurate," "rough" and so on. The information
content of another is expressed by the characteristics "clear," "accurate," "strong,"
"good," "calm" and so on. One can easily be persuaded of the qualitative consistency
of each of these descriptions.
Correlation-factor analysis of the average scores revealed the principal factors
defining the way the content of the timbre characteristics of speech is perceived.
One of them is represented by the characteristics "calm-irritable," "happy-sad,"
"unconstrained-tense," "smooth-rough," "good-evil" and so on--that is, by characteristics
reflecting perception of the speaker's tone.
The second factor is represented by the characteristics "clear-distorted," "good-bad,"
"beautiful-not beautiful," "pure-hoarse," "usual-unusual." This factor combines
perception in relation to subjective assessment and clearness criteria.
Figure 1. Intersection Plane of Factors I and II
Figure 2. Intersection Plane of Base Scales (axes: Calm-Irritated, Pure-Hoarse)
Figure 3. Intersection Plane of Factors II and IV
Key:
1. Normal      3. Paresis
2. Fibrosis    4. Laryngitis
Figure 1 shows the position of the timbre stimuli in the intersection plane of these
two factors (determined for the first series, consisting of 11 stimuli). That the
factors were interpreted correctly can be seen from a comparison of this space with
the disposition of these stimuli in the intersection plane of the so-called base
scales (the scales which most fully express the semantics of particular factor
groups). The stimuli fall within this space in correspondence with their average
scores on scales defined as being base scales.
For one factor group we adopted "dominant" as the base scale, and "hoarse" as the
base scale for the other. The similarity in the distribution pattern of the stimuli
in both spaces indicates, in our opinion, both that the factors are interpreted
correctly and that the base scales are selected correctly (see figures 1 and 2).
Among the samples of speech studied, some belonged to speakers with a normal speech
system (stimuli 5, 6, 8 and 9) while others belonged to speakers with certain speech
disorders: fibrosis (1, 3, 4), paresis (2, 7), and laryngitis (10, 11). The use of normal
and pathological speech is motivated by the fact that differences in the timbre
characteristics of speech are highly significant in relation to such a sample, and
manifestation or nonmanifestation of these differences has fundamental significance
to determining the possibility of using it for diagnostic purposes.
The position of timbre stimuli in the intersection plane of the factors and in the
intersection plane of the base scales indicates a clear tendency toward grouping
in relation to the criteria for normal and pathological, and toward differentiation
of stimuli in relation to pathology, depending on its nature. Thus the second factor
also represents what is normal.
Our work permits the conclusion that it is possible to analyze the timbre characteristics
of speech by the method of the semantic differential. That the characteristics
of the "tone" factor managed to reveal themselves attests to the promise of
studying the emotional and expressive content of speech represented by timbre.
A description of the information carried by the characteristics of the timbre of
speech, when compared with an acoustic description of the same stimuli, may lead to
creation of automatic systems that could determine the individual's emotional and
mental state on the basis of the timbre of his speech, and to creation of systems
able to diagnose speech pathology.
BIBLIOGRAPHY
1. Torsuyev, G. P., "Fonetika angliyskogo yazyka" [The Phonetics of the English
Language], Moscow, 1950.
2. Vaarask, P. K., "Tonicheskiye sredstva rechi" [The Tonic Resources of Speech],
Tallin, 1964.
3. Osgood, Ch., Suci, G., and Tannenbaum, P., "The Measurement of Meaning,"
Urbana, 1957.
4. Galunov, V. I., Manerov, V. Kh., and Tarasov, V. I., "Auditory Analysis of
Speech Recorded in the Presence of Emotional States Simulated by Different
Methods," in "Rech' i emotsii" [Speech and Emotions], Leningrad, 1975.
THE SIGNIFICANCE OF PERSONAL MEANINGS TO REALIZATION OF THE
PHYSICAL CHARACTERISTICS OF A SPOKEN STATEMENT (ACCORDING TO CLINICAL OBSERVATIONS)
Ye. N. Vinarskaya, A. S. Nikiforov, S. A. Soldatova
Research Objective
To be realized, the physical characteristics of a speech signal require externally
determined energy outlays by the body, which change depending on the complexity of
both the structure of the signal itself and the operational structure of the act of
communication. The question is, would consideration of just the objective complexity
of these structures be enough to arrive at a conclusion as to the level
of energy to be expended?
Research Methods
To answer this question we used the method of clinical observation of a group of
patients (150) suffering focal lesions of the mesencephalic-diencephalic division
of the brain, manifested as symptoms of insufficiency of the ascending activating
effects of nonspecific structures (1).
Research Results
Dynamic observation of our patients showed that as pathology proceeds, they must
gradually experience a greater personal interest if they are to enter into spoken
communication. Their speech becomes slower, quiet, monotonous, emotionally unexpressive,
and lexically and grammatically simple; speech becomes increasingly more
tiring to the patients, and they exhibit an increasingly greater need for supplementary
motivation or volitional effort. In a more-pronounced phase of illness
patients cease to use speech at all in situations not having direct personal meaning
to them. However, if a patient is made emotionally interested in the topic of
discussion and in the results to which such a discussion might lead, he arrives at
the necessary motivation for speech, and he begins to speak. The more personally
meaningful the topic and purpose of discussion become to the patient, the less constrained,
louder and more expressive the patient's speech becomes and the richer is
his use of segmental and supersegmental phonetic resources and of lexical and
grammatical resources. Thus one of our subjects, patient Z., 42 years old, who was
unable to make any sort of speech contact with medical personnel and even his wife,
unexpectedly revealed the ability to advise his favorite daughter on discipline in
public prior to an examination: In the course of an hour he essentially answered all
of her questions with a quiet, monotonous voice. Just prior to his death, another
subject who was even more gravely ill, patient K., 57 years old, suddenly entered into
spoken contact with his wife when she broached the subject of writing his will.
As a rule speech of patients improved in all respects when the discussion turned to
their health, to the prospects of a certain method of treatment and to the prognosis
of illness. As an example patient B., 43 years old, who exhibited a pronounced lack
of motivation to speak, reacted to questions with monosyllabic, quiet, unintelligible
and intonationally unexpressive replies, but when it was remarked that the color of
her face was good that day, she showed signs of interest, and clearly uttered in a
sonorous voice with an intonation of prideful joy: "I never use any sort of creams,
just cold water and soap."
It may be concluded that focal lesion of the mesencephalic-diencephalic division of
the brain, coupled with selective disturbance of ascending activating effects of
its nonspecific structures upon cortical and subcortical neurons specific to speech
behavior, makes these neurons functionally deficient. While in the normal individual
the excitability of these neurons is raised to the necessary level automatically,
in our patients such regulation loses its automatic nature and therefore becomes
accessible to analytical study. The method of clinical observation is enough to
permit the conclusion that imparting meaningfulness to the topic of conversation in
relation to a given personality has fundamental significance to making speech behavior
possible irrespective of the complexity of its structure. In some cases
personal interest assumes exceptional forms in relation to regulation of the energetic
support of the individual's speech behavior. The following observation is an example.
Valeriy I., 24 years old, learned that he was going to die soon. This fate was
postponed by his wife Marina, who took the risk of subjecting her husband to a new
method of treatment. From that moment on, and until his death 4 years later, Valeriy
devoted his entire life, in all of its manifestations, to the service of Marina,
to her happiness and welfare. Everything in which Marina was interested acquired
personal meaning to Valeriy; all else had no meaning to him, and was unconsciously
ignored by him. As Valeriy's physical strength drained away, the restrictions he
imposed upon his own personal activity became increasingly more severe. Several
times in the course of these years Valeriy was on the brink of death, but despite
the severity of his physical state, he unexpectedly recovered once again, because
"he could not bear to anger Marina," because "he had no right to abandon her in such
a state" (she was expecting), because "funerals are difficult in winter, and Marina's
labor had to be easy."
In the middle of the fourth year the patient's state deteriorated dramatically once
again, excessive mental and physical exhaustion developed, and various symptoms of
organic affliction of the central and peripheral nervous system appeared, including
in the mesencephalic-diencephalic division. It is noted in the disease history that
the patient responded to questions with a quiet, unexpressive and monotonous voice,
often pronouncing only the first syllable of words making up very short sentences.
Soon the patient once again developed pneumonia, severe, painful contractures of
the arms and legs arose, and convulsive spasms appeared in individual muscle groups.
His speech became almost inaudible and incomprehensible, and he almost stopped
talking. However, on the day following the birth of Marina's child, Valeriy was
unrecognizable. All who entered the ward were greeted by a smile, and using a
loud, emotionally expressive voice and clearly articulated words, he informed them
FOR OFFICIAL USE ONLY
APPROVED FOR RELEASE: 2007/02/09: CIA-RDP82-00850R000500030032-8
APPROVED FOR RELEASE: 2007/02/09: CIA-RDP82-00850R000500030032-8
FOR OFFI('IAL USE ON~.Y
of the weight and length of his daughter and the contents of his telephone conversation
with his wife. Despite cachexia of the greatest severity and worsening organic
affliction of the brain, the patient was able to read and to work out the schedule
for his daughter's care, and he was able to maintain an interest in talking with
visitors about Marina's future and, in this connection, the image of women in Russian
literature. He was unable to maintain any sort of conversation on topics not having
a relationship to his health, his wife and daughter: His voice would die down, the
rhythm, intonational structure and articulation of sounds would become irregular,
phrases would become simple, and Valeriy would fall altogether silent.
Conclusions
When taken all together, the clinical facts suggest that energetic support to the individual's
speech behavior, and the physical characteristics of the spoken signal as
well, are dependent primarily on the personal meaning of this signal (2,3) within the
structure of the activity going on. This premise should obviously be accounted for
when creating technical systems interacting with man and automatically identifying
his emotional state on the basis of the spoken signal.
BIBLIOGRAPHY
1. Vinarskaya, Ye. N., Nikiforov, A. S., and Soldatova, S. A., ZH. NEVROPAT. I
PSIKH. IM. KORSAKOVA, No 9, 1977, p 1347.
2. Leont'yev, A. N., "Deyatel'nost', soznaniye, lichnost'" [Activity, Consciousness,
Personality], Politizdat, 1975.
3. Vilyunas, V. K., "Psikhologiya emotsional'nykh yavleniy" [The Psychology of
Emotional Phenomena], MGU, 1976.
SPEECH RECOGNITION SYSTEM RECOGNIZES SPEAKERS BY VOICE
T. K. Vintsyuk, A. I. Kulyas, A. G. Shinkazh
A learning phonemic speech recognition system operating at the Ukrainian SSR Academy
of Sciences Institute of Cybernetics is capable of recognizing, with high reliability,
the words and coalescent sentences of only that speaker to which the system had been
tuned in its learning mode (1,2). The reliability of recognizing the spoken signals
of another speaker dropped to 70 percent, given a lexicon of 300 words. These facts
permit the assertion that both the spoken signal description employed and the recognition
and learning method account for voice individuality. If this is so, then the
description and the recognition method may be applied, without any special changes,
for recognition of the individuality of the speaker in relation to some key phrase
or password.
A mathematical model of a spoken signal, the recognition method, the method by which
the system learns to recognize the speaker by voice and the results of experiments
are briefly described below.
Spoken Signal Description (3)
A spoken signal is a sequence of vectors (elements) read out uniformly in time:
$$X_l = (x_1, x_2, \ldots, x_i, \ldots, x_l), \qquad (1)$$
where $l$ is the length of the realization. Elements $x_i$ are 48-dimensional vectors
with binary components 0 or 1. Component $x_{iv}$, the component of element $x_i$ having number $v$, is
equal to 1 if at the $i$-th moment in time the energy in the $v$-th spectral band is
above a certain threshold $\theta_v$, and simultaneously greater than the energy in the
neighboring $(v-1)$-th band. The $\theta_v$ thresholds are chosen such that the
vectors $x_i$ in pauses would be zero in 90 percent of the cases.
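The binarization rule just described can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, four bands are used instead of 48 for brevity, the first band (which has no lower neighbor) is compared against zero energy, and the choice of thresholds from pause statistics is not shown.

```python
def binarize_frame(band_energies, thresholds):
    """Turn one spectral frame into a binary vector.

    Component v is 1 when the energy in band v exceeds its threshold and
    also exceeds the energy in the neighboring band v-1; otherwise it is 0.
    """
    bits = []
    for v, (energy, theta) in enumerate(zip(band_energies, thresholds)):
        # Assumption: the lowest band is compared against zero energy.
        previous = band_energies[v - 1] if v > 0 else 0.0
        bits.append(1 if energy > theta and energy > previous else 0)
    return bits


# Hypothetical frame with four bands and a common threshold of 0.3.
frame = [0.2, 0.9, 0.5, 0.7]
x_i = binarize_frame(frame, [0.3] * 4)
```

Because only comparisons between energies enter the rule, scaling the whole frame and the thresholds by the same factor leaves the binary vector unchanged.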
Thus elements $x_i$ represent the sign of the derivative of the current speech spectrum
with respect to frequency. Obviously elements $x_i$ define the position
of the maximums and minimums of spectral energy on the frequency axis and,
equally so, the quality of the formant maximums. The sequence $X_l$ contains information
on change in the position of the maximums and minimums of the current spectrum
with respect to time.
Sequence $X_l$ does not depend explicitly on the intensity of pronunciation.
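One way to read spectral maximums and minimums off such a binary vector is sketched below. It assumes, as an interpretation not spelled out in the source, that a 1-to-0 transition between neighboring bands marks a local maximum and a 0-to-1 transition marks a local minimum; bands that are zero only because they fall below the threshold are not distinguished. The function name is hypothetical.

```python
def spectral_extrema(bits):
    """Return (maxima, minima) as lists of band indices.

    A band v is reported as a maximum when the sign vector goes from
    rising (1) at v to not rising (0) at v+1, and as a minimum when it
    goes from not rising (0) at v to rising (1) at v+1.
    """
    maxima = [v for v in range(len(bits) - 1) if bits[v] == 1 and bits[v + 1] == 0]
    minima = [v for v in range(len(bits) - 1) if bits[v] == 0 and bits[v + 1] == 1]
    return maxima, minima


maxima, minima = spectral_extrema([0, 1, 1, 0, 0, 1, 0])
```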
Mathematical Model of a Spoken Signal Accounting for Voice Individuality
Now let us describe a mathematical model of word signals applicable to phoneme recognition.
This must be done before the method of speech recognition can be explained.
Observed realizations $X_l$ are interpreted as the results of random distortions in
standard signals of the same length $l$. The set of standard signals for one speaker
is determined as follows.
Let there be a finite ordered set $E$ of standard elements $e(j)$, where $j$ is the number
(name) of the element. Let us assume that standard elements $e(j)$ have the same
physical meaning as elements $x_i$--vectors with binary components. Elements $e(j)$
are elementary segments of speech with a duration of 15 msec, and they represent
phonemes, or usually parts of phonemes.
Each word or word combination with number $k$ is defined by phonetic-acoustic transcription
$R_k$--a sequence of element names from set $E$:
$$R_k = (j_{k1}, j_{k2}, \ldots, j_{ks}, \ldots, j_{kq_k}),$$
where $q_k$ is the number of symbols in the transcription of the $k$-th word.
Transcription $R_k$ is defined as an operator which, being applicable to $E$, generates
the initial standard signal for the $k$-th word:
$$R_kE = (e(j_{k1}), e(j_{k2}), \ldots, e(j_{ks}), \ldots, e(j_{kq_k})). \qquad (2)$$
Next we introduce transformation $v$ of the initial standard spoken signal. Operator
$v$, when applied to $R_kE$, generates a standard signal with length $l$ with the following
structure:
$$vR_kE = (\underbrace{e(j_{k1}), \ldots, e(j_{k1})}_{v_1 \text{ times}}, \underbrace{e(j_{k2}), \ldots, e(j_{k2})}_{v_2 \text{ times}}, \ldots, \underbrace{e(j_{kq_k}), \ldots, e(j_{kq_k})}_{v_{q_k} \text{ times}}) = (e_1, e_2, \ldots, e_i, \ldots, e_l), \qquad (3)$$
where $v_s$ are the components of operator $v$--that is, $v = (v_1, v_2, \ldots, v_s, \ldots, v_{q_k})$.
The following limitations are imposed on whole numbers $v_s$:
$$m(k, s) \le v_s \le M(k, s), \quad s = 1, 2, \ldots, q_k, \qquad (4)$$
$$\sum_{s=1}^{q_k} v_s = l. \qquad (5)$$
Relationships (4)-(5) determine the set $V(k,l)$ of operators $v$ generating, from
$R_kE$, all possible standard signals $vR_kE$ with length $l$.
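The action of operator $v$ in relationship (3) and the admissibility conditions (4)-(5) can be illustrated with a short Python sketch. The function names and the toy transcription are assumptions made for illustration; elements are represented by their names (numbers) rather than by binary vectors.

```python
def expand(transcription, durations):
    """Apply operator v: repeat the s-th element name v_s times, as in (3)."""
    if len(transcription) != len(durations):
        raise ValueError("one duration is needed per transcription symbol")
    signal = []
    for name, v_s in zip(transcription, durations):
        signal.extend([name] * v_s)
    return signal


def is_admissible(durations, m, M, length):
    """Check conditions (4) and (5): m_s <= v_s <= M_s and sum(v_s) == length."""
    within_bounds = all(lo <= v <= hi for v, lo, hi in zip(durations, m, M))
    return within_bounds and sum(durations) == length


# Hypothetical word: pause (name 1), elements 7 and 3, pause (name 1).
transcription = [1, 7, 3, 1]
v = [0, 2, 3, 1]
standard_signal = expand(transcription, v)
```

With lower bounds of zero on the first and last symbols, the leading pause in this toy example is dropped entirely, which corresponds to a zero-length pause at the start of the word.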
Assume that the first and last elements in the word transcriptions are represented
by an elementary standard pause signal--that is, $j_{k1} = j_{kq_k} = 1$. In addition, let
$m(k,1) = m(k,q_k) = 0$ and $M(k,1) = M(k,q_k) = \infty$. For all other $s$ it is assumed that $m(k,s) > 0$.
Obviously the standard word signals $vR_kE$, $v \in V(k,l)$, are coarticulated signals
distinguished by a nonlinear pronunciation rate and different (including zero) length
of pauses at the start and end of a word. The rate is adjusted by the choice of
numbers $v_s$, and coarticulation is accounted for by the presence of such $s$ that
$m(k,s) = M(k,s) = 1$.
These constructs account for the fact that speech is generated out of elementary
"bricks" common to all words, and they account for the main factor of variability
in speech--nonlinear change in the rate of pronunciation.
Variables
$$E, \quad \{R_k\}_{k=1}^{N} \ (N \text{ is the number of words in the lexicon}), \quad \{(m(k,s), M(k,s))\}_{k=1,\,s=1}^{N,\,q_k}$$
define the parameters of the grammar generating the standard spoken signals. These
parameters are individual to each speaker. They are evaluated on the basis of a
learning sample consisting of realizations of a word uttered by the same speaker (3).
Speaker Recognition Method
If we designate by $p(X_l \mid vR_k(d)E(d))$ the probability of signal $X_l$ on the condition that
the standard signal is
$$vR_k(d)E(d), \quad v \in V(k, l, d),$$
where $d$ is the number of the speaker, then using the method of maximum likelihood we can
write the criterion for detection of a speaker on the basis of pronunciation of
one word of a lexicon consisting of $N$ words as:
$$d(X_l) = \arg\max_{d} \max_{k} \max_{v \in V(k,l,d)} p(X_l \mid vR_k(d)E(d)). \qquad (6)$$
Let us assume that observed elements $x_i$ derive from standard elements $e_i$ as a result
of independent distortions by additive noise with binary, identically distributed
and independent components. Then criterion (6) could be written in the form:
$$d(X_l) = \arg\min_{d} \min_{k} \min_{v \in V(k,l,d)} \sum_{i=1}^{l} H(x_i, (vR_k(d)E(d))_i), \qquad (7)$$
where $(vR_k(d)E(d))_i = e_i$ is the element with number $i$ in sequence $vR_k(d)E(d)$ defined
by relationship (3), and $H(x_i, e_i)$ is the Hamming distance between $x_i$ and $e_i$. As
with phonemic word recognition (1,3), criterion (7) is realized as follows: The
minimum with respect to $v$ is sought by the dynamic programming method, and the minimum
with respect to $k$ and $d$ is found by exhaustive search.
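A minimal Python sketch of how criterion (7) might be evaluated is given below. It is not the authors' algorithm: the function names, the three-component toy vectors and the finite upper bounds on durations are assumptions. The inner minimum over $v$ is found by dynamic programming over segment boundaries, and the outer minimum over speaker and word by trying every model.

```python
def hamming(a, b):
    """Number of positions at which two equal-length binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def alignment_cost(observed, standards, m, M):
    """Minimum over admissible v of the summed Hamming distance in (7).

    standards[s] is the binary vector of the s-th transcription element;
    m[s] and M[s] bound how many observed frames it may cover.
    """
    inf = float("inf")
    n = len(observed)
    # best[i]: minimum cost of covering the first i frames with the elements seen so far.
    best = [0] + [inf] * n
    for element, lo, hi in zip(standards, m, M):
        # prefix[i]: cost of matching frames 0..i-1 against this element.
        prefix = [0]
        for x in observed:
            prefix.append(prefix[-1] + hamming(x, element))
        new_best = [inf] * (n + 1)
        for end in range(n + 1):
            for duration in range(lo, min(hi, end) + 1):
                start = end - duration
                cost = best[start] + prefix[end] - prefix[start]
                if cost < new_best[end]:
                    new_best[end] = cost
        best = new_best
    return best[n]


def recognize(observed, models):
    """Exhaustive search over (speaker, word) keys for the smallest alignment cost."""
    return min(models, key=lambda key: alignment_cost(observed, *models[key]))


# Hypothetical toy elements and two speaker models for one password.
PAUSE, A, B = (0, 0, 0), (1, 0, 0), (0, 1, 1)
models = {
    ("speaker_1", "password"): ([PAUSE, A, B, PAUSE], [0, 1, 1, 0], [4, 4, 4, 4]),
    ("speaker_2", "password"): ([PAUSE, B, A, PAUSE], [0, 1, 1, 0], [4, 4, 4, 4]),
}
observed = [PAUSE, A, A, B, B, B]
winner = recognize(observed, models)
```

When the key phrase is fixed, the dictionary simply contains one entry per speaker, so the search over words disappears.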
In the case when the speaker is recognized by a password or a word combination
$k^*$, minimization with respect to $k$ is not necessary.
- Characteristics of the Model
Different parameters of the generating grammar reflect individual voice characteristics
differently.
Set $E(d)$ primarily expresses the geometric characteristics of the speaker's articulatory
system and the means of articulation of the principal sounds. Phonetic-acoustic
transcriptions $R_k(d)$ depend on $E(d)$ and reflect the individual manner of pronunciation
of coarticulated speech--words in this case. Variables $m(k,s)$, $M(k,s)$ express
the individual rate of pronunciation.
Of the three sets listed above, $E(d)$ would apparently be the least individualized.
In particular it may be suggested that $E(d)$ does not depend on the speaker, and
set $E(d^*)$ of one speaker $d^*$ may be substituted for $E$ as being common to all speakers.
Then only
$$\{R_k\}_{k=1}^{N} \quad \text{and} \quad \{(m(k,s), M(k,s))\}_{k=1,\,s=1}^{N,\,q_k}$$
would be evaluated in the speaker recognition learning mode.
Experimental Results
In the first experiment speakers were recognized on the basis of the key phrase
"I am a person."
Twenty speakers, including four women, took part in the experiment. The phrase was
spoken in the presence of the noise of a BESM-6 computer room with the speech recognition
system operating normally. The signal/noise ratio was 20 dB. An MK-61
microphone was used.
Each speaker repeated the phrase 10 times with an interval of 1 minute between
repetitions. Four out of 10 of the repetitions were used as the speaker learning
sample. Set $E(d)$ was common to all speakers. It was represented by the
set $E(d)$ of one of the male speakers, calculated in previous learning experiments
in the recognition of spoken words. There are a total of 80 elements in $E$.
Transcription $R_{k^*}(d)$ and the variables
$$\{(m(k^*, s, d), M(k^*, s, d))\}_{s=1}^{q_{k^*}}$$
were determined for a given $q_{k^*} = 19$ during the learning process. An algorithm described
in (3) was used for this purpose.
The tables below show the results of recognizing learning and control samples
separately for male and female speakers. Criterion (7) was used in relation to a
fixed k = k*.

                         Men                        Women
                    Total         Total        Total         Total
                    Realizations  Errors       Realizations  Errors
Learning samples    64            1            16            0
Control samples     96            2            24            8
As follows from the tables, the results of speaker recognition were satisfactory
for men. The relatively poorer results for female speakers can be explained by the
fact that set E(d) determined for male speakers poorly approximates the signals of
female speakers. Individual E(d) must be used for each speaker.
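The contrast between male and female speakers can be made explicit by converting the control-sample counts in the table above into error rates. The short Python calculation below uses only figures from the table; the variable names are chosen here for illustration.

```python
# Control-sample counts taken from the table: (realizations, errors).
control = {"men": (96, 2), "women": (24, 8)}

# Error rate in percent for each group.
error_rate = {
    group: 100 * errors / realizations
    for group, (realizations, errors) in control.items()
}

men_rate = round(error_rate["men"], 1)
women_rate = round(error_rate["women"], 1)
```

On the control samples this gives roughly 2 percent errors for men against roughly 33 percent for women.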
Then we introduced recognition refusals (1). The following results were obtained
for a refusal threshold of 12:

                              Men                              Women
                    Total                            Total
                    Realizations  Errors  Refusals   Realizations  Errors  Refusals
Learning samples    64            0       2          16            0       0
Control samples     96            0       3          24            2       6
Thus speaker recognition as a whole is characterized by the following data:
1 percent errors and 5 percent recognition refusals.
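The summary percentages can be checked directly against the counts obtained with the refusal threshold of 12. This is a small arithmetic sketch; the variable names are illustrative and not part of the original work.

```python
# (realizations, errors, refusals) for the experiment with refusals enabled.
with_refusals = [
    (64, 0, 2),   # men, learning samples
    (96, 0, 3),   # men, control samples
    (16, 0, 0),   # women, learning samples
    (24, 2, 6),   # women, control samples
]

total = sum(r for r, _, _ in with_refusals)
errors = sum(e for _, e, _ in with_refusals)
refusals = sum(f for _, _, f in with_refusals)

error_pct = 100.0 * errors / total
refusal_pct = 100.0 * refusals / total
print(f"{total} realizations: {error_pct:.1f}% errors, {refusal_pct:.1f}% refusals")
```

With 200 realizations, 2 errors and 11 refusals, the shares come to 1.0 percent and 5.5 percent, in line with the rounded figures quoted in the text.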
In the second experiment the optimum pair (d,k) minimizing criterion (7) served as
the recognition response. In this case the number of the word and the number of the
speaker are indicated--that is, the word recognition and speaker recognition problems
are solved simultaneously. There was a total of 205 classes in the experiment--
200 words recorded from one speaker and five recorded from male speakers uttering
the key phrase "Listen, computer" for identification purposes. At first the
process of learning to recognize 200 words of one speaker was performed--E, Rk and
(m(k,s), M(k,s)) were evaluated. Then the process of learning to recognize speakers
on the basis of the key phrase was carried out in the presence of a fixed E--Rk*(d)
and (m(k*,s,d), M(k*,s,d)) were evaluated. The resulting significance level of
recognition was 99 percent.
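The joint decision over speaker and word can be pictured as a search for the smallest criterion value over all (d,k) pairs. The sketch below assumes the criterion values have already been computed and are passed in as a table; criterion (7) itself is not reproduced here. The function name `recognize`, the made-up cost values, and the refusal rule (refuse when even the best value exceeds a threshold) are all assumptions made for illustration, not the authors' stated procedure.

```python
def recognize(costs, refusal_threshold=None):
    """Pick the (speaker, word) pair with the smallest criterion value.

    costs: dict mapping (d, k) -> criterion value, smaller meaning a better
           match (hypothetical input; the criterion is not computed here).
    refusal_threshold: if given, return None (a refusal) when even the best
           value exceeds it. This refusal rule is an assumption.
    """
    best_pair = min(costs, key=costs.get)
    if refusal_threshold is not None and costs[best_pair] > refusal_threshold:
        return None
    return best_pair


# Made-up criterion values for two speakers (d) and two words (k).
example_costs = {(1, 1): 14.2, (1, 2): 9.7, (2, 1): 11.3, (2, 2): 10.1}

decision = recognize(example_costs)                        # best pair overall
refused = recognize(example_costs, refusal_threshold=9.0)  # best value too large
print(decision, refused)
```

Fixing k = k* and searching over d alone would correspond to the first experiment, in which only the speaker was identified.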
Conclusion
Resources for learning and phonemic recognition of spoken words, developed and investigated
by the Ukrainian SSR Academy of Sciences Institute of Cybernetics, may be used
successfully to identify speakers on the basis of their pronunciation of passwords
or key phrases.
BIBLIOGRAPHY
1. Vintsyuk, T. K., and Shinkazh, A. G., "Phonemic Recognition of Spoken Words:
Learning and Recognition Algorithms, and Experimental Results," in "Tezisy
dokladov VIII Vsesoyuznogo seminara 'Avtomaticheskoye raspoznavaniye slukhovykh
obrazov'" [Abstracts of Reports at the Eighth All-Union Seminar "Automatic
Recognition of Auditory Patterns"], L'vov, 1974, pp 19-24.
2. Vintsiuk, T. K., Gavrilyuk, O. N., and Shinkazh, A. G., "Phoneme-by-Phoneme
Recognition of Speech Composed of the Given Vocabulary," in "The Proceedings
of the 1976 IEEE International Conference on Acoustics, Speech and Signal
Processing," Philadelphia, 1976.
3. Vintsyuk, T. K., and Shinkazh, A. G., "Automatic Transcription of Patterns on
the Basis of a Learning Sample," in "Obrabotka i raspoznavaniye signalov" [Signal
Processing and Recognition], Kiev, 1975, pp 102-120.
EMOTIONALITY OF THE PERSONALITY AS RELATED TO PSYCHOPHYSIOLOGICAL
AND SPEECH CHARACTERISTICS
N. V. Vitt, L. V. L. B. Yermolayeva-Tomina
The emotionality of an individual has a dual origin--biological and social. Being a
component of temperament--the dynamic aspect of behavior--man's emotionality, as
outwardly expressed, is unavoidably subordinate to socially accepted norms, and
this is manifested especially strongly in speech. All changes in the way the subject
regulates his reactions to emotion-producing situations fall on a bipolar scale
corresponding to the binary principle of the expression of emotions in speech--
voluntary-involuntary expression (1). When studying the expression of emotions
in speech, it is important to find characteristics which would in a sense break
through the regulatory filters, irrespective of the level and individual structure
of the person's emotionality (2).
Within the context of "speech-emotions-personality," there are at least four properties
of human emotionality that are most significant: 1) emotional reactivity (equatable
to V. D. Nebylitsyn's impulsiveness); 2) emotional stability-lability, equatable
to the frequency with which emotions arise; 3) intensity, which can be determined
easily from the EEG and from speech characteristics; 4) duration, manifested as the
persistence of an emotion.
The objective of our research was to comparatively analyze general emotional reactivity
and its expression in speech.
We used a modified variant of P. P. Blonskiy's procedure coupled with simultaneous
EEG recording. The subject was asked to recall and "relive" emotion-producing
situations of different modalities. The latent time of the recall
of emotion-producing situations and the intensity of changes in brain biorhythms
in response to "reliving" such situations were recorded. Spoken communications
were analyzed in terms of two basic factors--formal linguistic and semantic content.
The verbal material included statements by subjects in oral and written form,
and a recording of oral responses to Rorschach inkblots. The experiments were performed
individually with each of 40 subjects (3).
Indicators of general emotionality included the average latent time of recall and
the cumulative changes in biorhythms in response to recall of emotion-producing
situations. A matrix (see Table 1) was drawn up on the basis of an analysis of the
distribution of the frequency with which rhythms increased, decreased or remained
unchanged when the subject recalled each modality--joy, anger, fear and displeasure.
Table 1. Changes in Cumulative Rhythm Energy in Response to Recall of Emotion-
Producing Situations of Different Modalities
Emotional
State Dominant
Reproduced Hemisphere Forehead Occiput
Anger Left Delta Delta
Theta Alpha
Alpha Beta-1
Beta-2 Beta-2
Fear Right Delta
Beta-2 Beta-2
Displeasure Right Delta Delta
Alpha
Beta-2
Joy Left and Delta Delta
right Theta Beta-2
Alpha
Only those changes in rhythm which were observed in more than 50 percent of the
patients were included in this matrix. By using the matrix we were able to monitor
the ease with which emotional states were reproduced and the intensity with which
the emotions were experienced. Local changes in rhythms with respect to 1) left-
right hemisphere and 2) forehead-occiput were superimposed over the concrete changes
recorded in the rhythms of each subject. This made it possible to determine the
expressiveness of emotions of different modalities exhibited by each of the subjects.
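The inclusion rule for the matrix (a change enters only if it was seen in more than 50 percent of those examined) can be sketched as a simple counting step. The data layout, the function name `majority_changes` and the demonstration records below are hypothetical; the paper does not describe how the observations were stored.

```python
from collections import Counter


def majority_changes(observations, n_subjects):
    """Keep the (modality, site, rhythm) changes seen in over half of the subjects.

    observations: iterable of (subject_id, modality, site, rhythm) tuples, one
                  per subject in whom the rhythm changed (hypothetical layout).
    """
    seen = set(observations)  # count each subject at most once per change
    counts = Counter((modality, site, rhythm) for _, modality, site, rhythm in seen)
    return {change for change, n in counts.items() if n > n_subjects / 2}


# Made-up records for four subjects.
demo = [
    (1, "anger", "forehead", "delta"),
    (2, "anger", "forehead", "delta"),
    (3, "anger", "forehead", "delta"),
    (1, "fear", "occiput", "beta-2"),
    (2, "fear", "occiput", "beta-2"),
]
matrix = majority_changes(demo, n_subjects=4)
print(matrix)
```

In this toy case the anger change is kept (3 of 4 subjects), while the fear change is dropped because exactly half of the subjects does not exceed 50 percent.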
For the purposes of comparative analysis we considered the following data in the
spoken communications: total length, number of verbs, the number of attributes,
with subjective evaluations placed in a separate category, and the number of interjections.
In terms of semantic content we considered the modality of the produced
text (that is, whether it was stated categorically or unconfidently) and the nature of
the topic, defined as static or dynamic (that is, dramatized). Temporal characteristics
included the time of the verbal reaction and the length of pauses.
The analysis showed that the level of emotionality and the dominant emotion in the
individual structure of emotionality are expressed more distinctly in the oral form
of verbal communications than the written form.
A comparison of general emotionality and the average time of verbal reactions revealed
a V-shaped dependence. The same sort of V-shaped dependence was discovered
between general emotionality and the indicators of the productivity of creative
verbal communications (determined in relation to the number of associations).
The EEG indicators for emotion dominance correlated with pauses, with the number of
subjective evaluations made and interjections used by the subjects, and dynamic--
that is, dramatized--narration of the plots.
This approach to studying emotionality of the personality in relation to psychophysiological
and speech characteristics and application of the corresponding procedure
revealed that it would be possible to study controlled and uncontrolled
expression of emotions.
BIBLIOGRAPHY
1. Vitt, N. V., "Simulation of Emotional Speech," in "Materialy V-go Vsesoyuznogo
simpoziuma po psikholingvistike i teorii kommunikatsii" [Proceedings of the
Fifth All-Union Symposium on Psycholinguistics and Communication Theory],
Part 2, Moscow, 1975.
2. Ol'shannikova, A. Ye., et al., "Evaluations of Procedures for Diagnosing Emotionality,"
VOPR. PSIKHOLOGII, No 5, 1976.
3. Vitt, N. V., and Yermolayeva-Tomina, L. B., "A Procedure for Revealing Emotionality
in Mnemonic Processes," in "Materialy V-go Vsesoyuznogo s"yezda
psikhologov SSSR" [Proceedings of the Fifth All-Union Congress of USSR
Psychologists], Moscow, 1977.
VARIABILITY OF SPEECH TEMPOS
L. T. Vygonnaya
A large number of linguistic questions (the causes of acoustic variability, the
tempo scale of a particular language, the tempo components of intonation and so on),
as well as questions associated with clarifying the dependence of the rate of speech
on the speaker's state and on the situation, with evaluating the individual's
capacity for voluntarily changing his pronunciation rate and for hearing speech
signals transmitted at different tempos, with selecting an optimum ratio between
the tempo of speech transmission and the speech tempo typical of the listeners,
and so on, may be clarified by determining how the tempo of speech
differs among representatives of different languages.
Interpretation of the tempo of speech as a particular language trait is consistent
with the notion existing among the bearers of a particular language that
speech tempo is a variable characteristic which may be used to distinguish among
representatives of the same language, to compare one's own and foreign speech and to
make rather accurate evaluations and self-evaluations from the standpoint of such
unique features of speech. The limitations imposed on speech in relation to tempo
are not rigid. Speakers can vary it significantly while still remaining comprehensible.
These variations are an indicator of the speaker's individuality, one
of the characteristics by which the listener can assess the nature and genre of
the statement, the style of speech and the emotional state of the speaker, as well
as the rhythmical and intonational structure of phrases, which reveals the content
of the statements.
Our objective was to reveal the tempo characteristics of Belorussian speech in comparison
with Russian. The experiment was performed on 27 native Belorussian speakers
having facility with literary language (22 men and 5 women).
One of the main difficulties in establishing tempo differences that are significant
to a particular language is the ambiguity of variability in tempo--the fact that
characteristics corresponding to the text, to the individuality and to the emotional
state of the speaker are all represented simultaneously in a spoken message. This
is why special research on tempo requires a standard text which may be deemed homogeneous
and neutral, which does not elicit clearly pronounced emotions (of varying
sign and level) in the speaker and which does not motivate him to emphasize, while
reading, speech's function of emotional expression.
In our experiment we used 20 narrative phrases (806 syllables, 1,870 phonemes)
of varying length--from 2Z to 71 syllables. Each phrase had the nature of an
assertion, and it transmitted established facts having no relationship to the speaker
(these were phrases taken from an encyclopedia). Specifically, we used a narrative
statement which, according to published data, is read at a tempo maximally close
to the average tempo of the speaker (1). The material was read by the speakers in a
recording studio, where it was tape-recorded. After first becoming acquainted with
the experimental situation, the speakers--persons with an advanced philological
education--were asked to read the text in a tempo comfortable to them--in a normal
tempo. After this, the same text was read by another speaker in what was, from the
standpoint of each speaker, a slow and a fast tempo. As he read, the speaker had to
keep in mind his own impressions of how different tempos would sound. In one experiment
some of the speakers were asked to read the text at some particular tempo at
different times (within the same month).
Analysis of the recordings revealed the average pronunciation tempo of each continuous
speech excerpt of each speaker in three required tempos. Pauses between phrases
were not considered, since according to (1) pauses in messages, and in narrations
specifically, do not have an influence on the tempo of phrase pronunciation. Three
variants of individual tempo were established for each speaker--normal, fast and
slow.
Given the conditional nature of the speech (we analyzed reading, and not spontaneous
speech, and so on), the differences in the normal tempo of Belorussian speech obtained
for native Belorussian speakers having facility with literary language are
found to be extremely indicative when compared with the corresponding data for the
same three gradations of tempo established for native Russian speakers (2,3). For
the latter, the average duration of sound in normal, fast and slow individual tempo
was, respectively, 65-73 msec, 63-60 msec and 75-85 msec. Thus an individual tempo
that is slow to native Russian speakers is fast to native Belorussian speakers.
In the case of slow pronunciation of text, the average duration of the sound of
Belorussian speech spoken by most subjects (17 speakers) was 91-111 msec; smaller
groups had an average duration of 143 msec (8 speakers) and 200 msec (2 speakers).
In the case of fast pronunciation, 17 persons maintained a sound duration averaging
between 55 and 63 msec, 8 persons had an average from 67 to 71 msec and 2 persons
averaged from 77 to 83 msec.
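Average sound durations can be restated as a speaking rate, which some readers may find easier to compare. The following minimal sketch assumes tempo is simply the reciprocal of average sound duration (pauses excluded, as in the analysis); the function name `sounds_per_second` is introduced here for illustration.

```python
def sounds_per_second(avg_duration_msec):
    """Convert an average sound duration in msec to a rate in sounds per second."""
    return 1000.0 / avg_duration_msec


# Average sound durations (msec) reported for the largest group (17 speakers).
slow_range = (91, 111)
fast_range = (55, 63)

for label, (shortest, longest) in (("slow", slow_range), ("fast", fast_range)):
    # Shorter sounds mean a higher rate, so the bounds swap places.
    low_rate = round(sounds_per_second(longest), 1)
    high_rate = round(sounds_per_second(shortest), 1)
    print(f"{label}: {low_rate} to {high_rate} sounds per second")
```

Under this assumption the slow readings of the largest group correspond to about 9 to 11 sounds per second and the fast readings to about 16 to 18.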
Comparison of the obtained data shows a difference in the significance of the
general tempo characteristics of the Russian and Belorussian languages, as well as
a difference in their significance within the framework of the same language spoken by
different representatives of that language.
All three established gradations of speech tempo are equally typical of speakers
using a given language, and given its great significance, variability in tempo
cannot be thought of as random, as fully arbitrary and unpredictable.
The diverse causes behind variability in tempo require further systematization and
clarification.
We were interested in determining how our speakers would respond to an instruction
to speed up and slow down their speech tempo, and in how much of an increase or
decrease in their speech tempo was perceivable by them. The instruction to "slow
down" the speech tempo produced a rather individual reaction in different speakers.
The speech tempo decreased by 6, 9, 14, 15, 18, 23, 25, 30, 33, 37, 41, 55, 59
percent. Most speakers reduced their tempo by 14-25 percent. The response to a
request to read faster was less variable: All speakers increased their tempo by
not less than 16 percent, with most doing so by either 20-27 percent or 40-41 percent.
The maximum acceleration of tempo noted was 67 percent. For speakers whose individual
normal tempo could be interpreted as slow, acceleration of the tempo by 50 percent
elicited the same phenomena noted in a state of emotional tension: The speakers
increased the number of falsely started words, they omitted certain sounds and
syllables, they misplaced their accents, they made semantically justified word
substitutions, they spoke more loudly and so on.
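A percentage change in tempo can be related to the average sound duration used earlier in the paper. The sketch below assumes that tempo is measured as sounds per unit time, so that a tempo change of p percent divides the average duration by (1 + p/100). The 80 msec starting value is a hypothetical speaker, not a figure from the study, and the function name is illustrative.

```python
def duration_after_tempo_change(duration_msec, tempo_change_percent):
    """Average sound duration after the tempo changes by the given percentage.

    A positive value speeds speech up, a negative one slows it down.
    Assumes tempo is measured as sounds per unit time.
    """
    factor = 1.0 + tempo_change_percent / 100.0
    if factor <= 0:
        raise ValueError("tempo cannot drop by 100 percent or more")
    return duration_msec / factor


# Hypothetical speaker with a normal average sound duration of 80 msec.
faster = duration_after_tempo_change(80.0, +50)   # 50 percent acceleration
slower = duration_after_tempo_change(80.0, -20)   # 20 percent slowdown
print(round(faster, 1), round(slower, 1))
```

Under this assumption a 50 percent acceleration shortens the average sound by one third, which gives a sense of how demanding that instruction was for speakers with a slow normal tempo.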
BIBLIOGRAPHY
1. "Typolohiya intonatsii movlennya" [Typology of Speech Intonations], Kiev, 1978,
pp 151-152.
2. Bondarko, L. V., Verbitskaya, L. A., and Pavlova, L. P., "Acoustic Characteristics
of Russian Speech Depending on Different Pronunciation Tempos," in "Voprosy
fonologii i fonetiki" [Problems in Phonology and Phonetics], Part 1, Moscow,
1971, p 47.
3. Paufoshima, R. F., "Speech Tempo in Some Russian Dialects," in "Russkiye govory.
K izucheniyu fonetiki, grammatiki, leksiki" [Russian Dialects. Analysis of
Phonetics, Grammar and Vocabulary], Moscow, 1975.
MUTUAL CORRELATION BETWEEN PERSONAL AND SPEECH
CHARACTERISTICS IN AN EMOTIONALLY TENSE SITUATION
S. S. Galagudze, G. V. Nikolayeva
Analysis of studies undertaken to establish the influence of an individual's emotional
states on the lexical and grammatic characteristics of his speech shows that
this problem has still not been studied adequately. Soviet and foreign authors
have accumulated a sufficient quantity of experimental facts on the dynamics of the
higher levels of speech in the presence of different emotion-producing situations
(G. Mal', N. V. Vitt, E. L. Nosenko). Nevertheless it must be asserted that
the overwhelming majority of experimental research in this area is plagued by a
narrow empirical approach and by conflicts in the obtained results. Thus Nosenko
(6) attempts to provide physiological and psychological grounds for the laws
behind changes occurring in speech characteristics under the influence of emotional
tension. The author interprets the emotionally grounded features of speech in light
of his analysis of the general psychological characteristics of the way activity is
organized in an emotional situation, characteristics which manifest themselves as
a tendency to simplify speech to permit its more-optimum regulation.
However, this interpretation is clearly in conflict with the facts indicating that
a number of characteristics of the lexical and grammatic level of speech improve
when certain subjects experience emotional tension (3,ei). In particular, we find
the following remark in Vitt's work cited above: "Emotional states have different
stimulatory or inhibitory influences upon speech."
Without a doubt the authors listed above have accumulated sufficiently representative
facts, ones which doubtlessly have pragmatic value; however, they do require theoretical
interpretation and generalization.
The objective of our research was to examine the dynamics of the lexical and grammatic
levels of speech in response to emotionally tense situations, from the standpoint of
the information-energy approach developed in psychology by Vekker (1,2; L. M. Vekker
1976).
With this objective in mind we conducted experiments in which 30 subjects provided
samples of spontaneous oral speech in an emotionally tense situation (their first
examination in their principal specialized subject) and in normal conditions (background).
Then the recorded speech was transcribed into typewritten text, from which
the following lexical and grammatic characteristics were then isolated and analyzed:
the overall length of narration, lexicographic diversity, ratio of the number of
verbs to the number of adjectives, the ratio of the number of abstract words to the
number of concrete words, the quantity of "weed" words, the quantity of grammatically
and logically incomplete sentences, presence of complex sentence structures and
complex subordinate phrases, and the quantity of words having a clearly positive or
a clearly negative connotation.
In order to reveal the dependence of speech dynamics in an emotional situation on
the personal qualities of the subjects, we examined certain personality properties--
introversion-extraversion and neuroticism (Eysenck's method) and anxiety (Cattell's
method).
Hand tremor was used as an objective indicator of mobilization of the body's "energy
resources" in an emotionally tense situation.
The experimental data were analyzed in several stages. In the first we examined the
characteristics of the energy indicators of the subject in a background situation and
in an emotionally tense situation. In this aspect the subjects fell into two groups
with respect to the tremor dynamics indicator: The first group consisted of subjects
for whom the difference between background tremor indicators and the indicators
of tremor in an emotional situation was above average; in the second group this
difference did not reach the average level.
In the next stage of analysis we had to compare the energy characteristics of the
subjects with their speech characteristics. Comparison of energy characteristics,
as defined by tremor indicators, and the speech characteristics stated above revealed
the following laws behind their dynamics in response to an emotionally tense situation.
In the first group of subjects, for whom the indicator of emotional tension exceeded
the sample average, the background indicators of the productivity of speech (total
length of narration, lexicographic diversity, quantity of complex sentences and complex
subordinate phrases) were found to be higher than in an emotionally tense
situation.
In the second group of subjects, for whom the energy indicators in an emotional
situation do not exceed the average, the results were different in nature. Some
of the subjects were typified by the same dynamics of speech activity observed in
the first group: An emotionally tense situation shortens their narration, makes it
more stereotypic and so on (see table), while the other subjects of this group
improved their speech characteristics in a stressful situation in comparison with
the background.
Thus in the final variant we were able to distinguish three groups of subjects
in terms of "energy-information" mutual dependencies:
Group 1 (9 persons)--subjects in an emotionally stressful situation exhibit improvement
of a number of speech characteristics on the background of low energy activation;
Group 2 (12 persons)--subjects exhibit a decrease in a number of productivity indicators
of speech, on the background of low activation;
Group 3 (9 persons)--subjects in an emotionally stressful situation also exhibit
a decrease in a number of productivity indicators of speech, but on a background of
high energy activation.
Distribution of the Indicators of Lexical and Grammatic Level of
Spontaneous Oral Speech in Background and Stressful Situations
Experienced by Isolated Groups of Subjects
                              Group 1                Group 2                Group 3
                              Low Activation         Low Activation         High Activation
                              (9 Subjects)           (12 Subjects)          (9 Subjects)
Indicator                     Background  Stress     Background  Stress     Background  Stress
1. Tremor energy                 111.5     124.9        100.0     112.8        101.2     188.5
   (arbitrary units)
2. Length of narration           148.8     110.6        200.2     116.6        128.2      69.2
   (words)
3. Lexicographic diversity         2.8       3.5          3.8       3.1          4.0       3.1
   (index)
4. Verbs/adjectives                2.8       2.9          2.7       3.4          1.7       2.8
5. Abstract/concrete               0.82      0.63         6.51      0.35         0.31      0.37
6. "Weed" words (index)            0.04      0.06         0.06      0.08         0.11      0.23
7. Complex sentences and           0.77      0.92         0.73      0.35         0.52      0.37
   complex subordinate
   phrases (index)
8. Words with positive and         0.06      0.09         0.08      0.18         0.08      0.23
   negative connotation (index)
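The split into low and high activation can be retraced from the tremor rows of the table. The sketch below treats the tabulated values as group means and compares each column's rise in tremor (stress minus background) with the size-weighted average rise, which equals the sample average rise if the table entries are indeed means. Columns are identified by position rather than by group number, the variable names are illustrative, and the first digit of the third column's background tremor is damaged in the source, so 101.2 is an assumption.

```python
# Column values (background, stress) copied from the table; n is the group size.
columns = [
    {"label": "first", "n": 9, "tremor": (111.5, 124.9), "length": (148.8, 110.6)},
    {"label": "second", "n": 12, "tremor": (100.0, 112.8), "length": (200.2, 116.6)},
    # Leading digit of 101.2 is illegible in the source; "1" is assumed here.
    {"label": "third", "n": 9, "tremor": (101.2, 188.5), "length": (128.2, 69.2)},
]

total_n = sum(c["n"] for c in columns)
rises = [c["tremor"][1] - c["tremor"][0] for c in columns]

# Size-weighted average rise in tremor across all 30 subjects.
mean_rise = sum(c["n"] * r for c, r in zip(columns, rises)) / total_n

high_activation = [c["label"] for c, r in zip(columns, rises) if r > mean_rise]

# Relative change in narration length under stress, in percent.
length_change_pct = [
    100.0 * (c["length"][1] - c["length"][0]) / c["length"][0] for c in columns
]

print(round(mean_rise, 2), high_activation)
print([round(x, 1) for x in length_change_pct])
```

Only the third column rises above the average, which matches the single high-activation group of 9 subjects; narration length falls in every column, most sharply in that one.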
As may be deduced from the characteristics of the lexical and grammatic level of
spontaneous oral speech, an emotionally tense situation does not have a destructive
influence on statements spoken by subjects in the first group. In general the speech
of subjects experiencing a stressful situation remained at the same level in regard
to grammatic, syntactic and semantic structure as in a normal situation; moreover
certain subjects even exhibited greater smoothness, better structure and greater
expressiveness of statements in an emotionally tense situation.
At the same time, analysis of the spoken statements of subjects in the second and
third groups revealed certain characteristics in the organization of speech in an
emotionally tense situation, such as reduction of the total length of narration, a
decrease in lexical diversity, an increase in the number of "weed" words, more frequent
use of verbs than adjectives, growth in the number of concrete words, an increase in
the number of logically and grammatically incomplete phrases, a decrease in the
number of complex sentences and complex subordinate clauses, and an increase in the
number of words having positive and negative connotation.
These laws agree well with published data (5,6) on the dynamics of emotionally de-
pendent speech.
The next stage of analysis of the obtained data presupposed comparison of the speech
characteristics of the subjects with their personality features. We found it
interesting to compare the first and second groups of subjects in relation to their
speech and personality features, since totally opposite tendencies in the nature of
changes in speech activity were observed in these groups on the background of low
energy activation in an emotionally tense situation.
We attempted to reveal the dependence between the indicators of extraversion-intro-
version, neuroticism and anxiety on one hand and the successfulness of speech in an
emotional situation on the other hand by the method of tetrachoric correlations.
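For readers unfamiliar with tetrachoric correlation, a common quick estimate from a 2x2 table is the cosine-pi approximation. The sketch below is not the authors' computation: the paper does not say how the coefficient was obtained, the function name is introduced here, and the cell counts are invented (chosen only so that they sum to the 21 subjects of the first and second groups).

```python
import math


def tetrachoric_cos_pi(a, b, c, d):
    """Approximate tetrachoric correlation for a 2x2 table [[a, b], [c, d]].

    Uses the cosine-pi approximation r = cos(pi / (1 + sqrt(a*d / (b*c)))).
    a and d are the concordant cells, b and c the discordant ones.
    """
    if b * c == 0:
        # No discordant pairs: perfect association, unless the table is degenerate.
        return 1.0 if a * d > 0 else float("nan")
    return math.cos(math.pi / (1.0 + math.sqrt((a * d) / (b * c))))


# Invented counts: rows = high/low anxiety, columns = successful/unsuccessful speech.
r = tetrachoric_cos_pi(3, 8, 7, 3)
print(round(r, 2))
```

A negative value of this kind is what a negative relation between anxiety and successful speech under stress would look like in a 2x2 table.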
We found that the successfulness of speech in an emotionally tense situation is
negatively correlated with the anxiety indicator (r = 0.52 at pler, especially in the high frequency range. When the tension of the vocal
cords decreases, the oscillation amplitude increases while the frequency drops
within the limits of two octaves. The duty factor (the ratio of the time the vocal cords remain
in contact with one another to the length of the oscillation period) grows at
first and then declines. As the duty factor increases, the high frequency oscillation
spectrum grows, while the oscillations themselves retain their previous shape.
Because this is so, the ratio between the levels of the spectral components of the
voice source in the high and low ranges at first grows somewhat in response to an
increase in tension, and then declines significantly. The relationship of
these changes in the tension of the vocal cords to the individual's actual anatomical
and physiological potentials is not exactly known; however, it may be hypothesized
that tension may increase within rather broad limits, while its decrease is apparently
limited.
Figure 2 and Table 2 show data illustrating the dependence of the volume flow rate
beyond the vocal cords at different values for the distance between the vocal cords
(the initial equilibrium distance separating the vocal cords in the absence of
vocal cord pressure). A negative distance means that before starting their work,
POSSIBILITIES FOR EVALUATING INTENSITY OF A SPEAKER'S EMOTIONAL
TENSION ON THE BASIS OF CHANGES IN CHARACTERISTICS OF HIS SPEECH
E. L. Nosenko
One approach to examining speech, in which it is viewed as one form of complex
intellectual activity, presupposes representation of its organization as a "hierarchical
multilevel substructure" (N. A. Bernshteyn, 1966), the levels of which
differ in complexity and in the subject's awareness of them. As we know, the individual's
awareness of individual components of polystructural activity increases
in the course of this activity as we ascend from one level to the next. While the
speaker is "actually" aware (using A. N. Leont'yev's term, 1947) of the dominant
semantic level of speech and his awareness of operations associated with the lexical
and grammatical structure of a statement is limited to conscious monitoring--that is,
he is not fully conscious of them--motor realization of the statement is at an even
lower level in the hierarchy of awareness of speech phenomena--the unconscious level,
or the level of "unconscious control" (A. N. Leont'yev, 1965).
This communication will attempt to justify the possibility for evaluating the intensity
of emotional tension experienced by the individual on the basis of a consideration of
which components of speech are responsible for various difficulties in speech.
The hypothesis is suggested that the appearance of mistakes and difficulties in speech of
which the speaker is unaware (which he cannot correct) or of which he is aware but
finds it hard to surmount, is an indication of a high degree of emotional tension in
the speaker. This pertains to those elements of speech which are at higher levels of
awareness in the hierarchy of the levels of organization of the spoken statement,
and which can consequently be controlled more meticulously in speech proceeding in a
normal state. This hypothesis is based on the experimentally established fact that
in a state of emotional tension, conscious control over the quality of activity
weakens.
Apparently the more intensively the speaker experiences emotional tension, the more
his capacity for maintaining effective conscious control over the quality of his
activity is disturbed, which is what leads to mistakes and difficulties not only at
levels of organization of a statement requiring distribution of attention between
the intent of the statement and its concrete linguistic realization, but even at the
highest level of speech, the semantic level.
To test this hypothesis we subjected, to comparative psycholinguistic analysis, the
spoken statements of the same subjects in a state of emotional tension elicited by
different emotion-producing factors (an examination, questioning, anticipation of
surgery and so on) and in a normal state.
Special attention was devoted to examining indecisiveness phenomena in speech and
statements containing the speaker's self-assessment of the quality of his own statement.
We also used the method of having a subject listen to a tape recording of his
speech when he was in a state of emotional tension, and then asking him to comment
on what he heard.
Our observations may be summarized as follows.
1. Mistakes and difficulties arise in the speech of subjects in a state of emotional
tension primarily at the level of the grammatical structure of statements.
This may be explained by the following. Owing to the mechanism of "conscious control,"
a speaker in a normal state selects linguistic units and links them together in a
syntactical scheme efficiently and without mistakes, all the more so because in per-
forming this operation, the speaker need consider only the relationships between
linguistic signs, and he need not relate them to extralinguistic objects or concepts,
as in the case of selection of the syntactic structure itself, or in the process of
choosing words adequate to a given goal of communication.
In a state of emotional tension of even an insignificant degree, it becomes more
difficult to distribute attention between the semantic level of speech and its linguistic
structure, which is what leads to errors in syntax, to "awkwardness" of
composition and so on. As a rule the speaker does not even notice these mistakes.
On hearing recordings of their own speech, subjects are bewildered by the fact that
they may have, for example, declined a noun improperly without noticing this slip of
the tongue. An example would be "...po sravneniyu s 1965 godu" (instead of "godom").
2. The appearance of mistakes (corrected by the speaker!) in the choice of words and selection
of the syntactic scheme of a statement appropriate to the given goal of communication
attests to greater weakening of conscious control over the quality of activity
in a state of emotional tension, and consequently to greater intensity of this state.
The fact is that for the speaker to understand his choice of a certain word or syntactic
scheme for a statement, he must consider the intent of the statement. This operation
proceeds under a greater degree of control of the speaker's voluntary attention than
does observation of the rules of grammar.
Therefore if the speaker, even when he is able to concentrate his voluntary
attention on his speech, makes errors such as using an inappropriate adjective
and fails to recognize such errors, we would have adequate grounds for suggesting
that he is experiencing a state of severe emotional tension. The plausibility of
this hypothesis is confirmed by the fact that even after becoming aware of the
inadequacy of his choice of a particular word, a speaker in a state of emotional tension
is unable to efficiently find an adequate substitute. Evidence of this can be found
in the numerous "false starts" in speech, analysis of which would show that they
unambiguously signal the appearance of difficulties in word choice. For example when asked
the question: "What geometric figures do you see in Figure 3?", a traffic controller
in a state of emotional tension replied: "The pho... the photograph bears a
a star, a circle and a triangle." The false start "The pho..." attests to the fact
that the decision to use the word "photograph" instead of the word "figure" does not
satisfy the speaker himself, but he is unable to effectively find an adequate
substitute. A speaker's self-assessment of the quality of his own speech can make it
especially clear that he is having difficulties in word choice. For example: "In the
second row I see a (the speaker pauses for 1.8 seconds)...(the speaker stutters)...
a whatchamacallit (a pause of 1.2 seconds) a (a pause of 0.3 seconds) triangle, how
silly of me!"
3. In a state of severe emotional tension (which we observe, for example, in subjects
prior to surgery), speech changes occur even at the level of programming the intent
of a statement.
The grounds for this assertion are that when subjects are permitted to hear recordings
of their own speech, they note the overemphasized positive or negative connotation of
the words they choose as being "unnatural," "atypical of them in a normal state." For
example "...my lab results are v~ terrible."
Disturbances in the dominance of the names of things in units of speech beyond phrase
length (in contrast to speech in a calm state, the speaker forgets whether or not
certain objects and subjects presently under discussion had been mentioned earlier)
also attests to weakening of control over the quality of speech even at its dominant
level, the one of which the speaker is fully aware in the course of his speech.
This communication offers a classification of different changes in the characteristics
of speech in a state of emotional tension from the standpoint of their usefulness
in identifying different degrees of intensity of this state.
164
. FOR OFF[CIAL USE ONLY
APPROVED FOR RELEASE: 2007/02/09: CIA-RDP82-00850R000500030032-8
APPROVED FOR RELEASE: 2007/02/49: CIA-RDP82-00850R440500030032-8
FOR OFFICIAL USE ONLY
SOME FLOWCHARTS FOR ANALYSIS OF THE STATE OF AN INDIVIDUAL
ON THE BASIS OF CHARACTERISTICS OF HIS SPEECH
E. L. Nosenko, O. N. Karpov, A. A. Chugay, G. N. Bordovskiy
Analysis of the state of an individual doing work presupposes effective acquisition of
information about his state.
The state of a human operator is usually evaluated on the basis of measurements of
a number of physiological parameters (pulse, heart beat, respiration and so on)
made by contact sensors.
This report examines new set-ups for monitoring changes in the state of an operator,
based on recording changes in the characteristics of his speech.
The authors obtained experimental material confirming the informativeness of a large
number of speech parameters to be used as indicators of emotional tension that may
arise in an operator in critical work situations and lead to work failure. In contrast
to previous research in which the vocal communication channel was used as a
source of information on the state of the human operator (P. V. Simonov, M. V. Frolov,
L. N. Luk'yanov, V. A. Popov; C. E. Williams and K. N. Stevens, etc.), the authors of
this communication have developed a means for monitoring changes in the state of a
speaker that requires not comparison of the intonational contours of the same standard
words or phrases, but analysis of the flow of coherent speech.
Changes in the characteristics of speech associated with particular features of the
neurophysiological mechanisms of emotional state may be classified as follows:
1. Changes in the characteristics of speech in a state of emotional tension stemming
from the characteristics of autonomic reactions inherent to this state.
2. Changes in the characteristics of speech reflecting the particular features of
the sensory and mental processes occurring in a state of emotional tension.
3. Changes in the characteristics of speech associated with certain motor reactions
occurring in a state of emotional tension.
Considerable tensing of the muscles of the speech-forming apparatus, including the
vocal cords, in a state of emotional tension causes change in the frequency of the
voice's fundamental tone. Consequently it would be suitable to use, as indicators
of emotional tension, characteristics of the frequency of the fundamental tone such
as the range of its variations, the swiftness with which zones appear in the flow of
speech in which the frequency of the fundamental tone significantly surpasses the
mean frequency typical of the given speaker, and so on.
Changes in breathing rhythms, which have an effect on the temporal characteristics of
speech, are typical of emotional tension: The number of pauses in the flow of speech
increases, their duration grows longer, and the locations of pauses change.
On this basis, the following could be used as objective indicators by which to
identify states through speech characteristics:
1. Fluctuations in the frequency of the voice's fundamental tone.
2. Fluctuations in the loudness of speech (increases or decreases in comparison with
speech in a normal state).
3. Changes in the tempo of articulation (in the absolute tempo of speech).
4. Fluctuations in the general tempo of speech from maximum to minimum (the range
of variation of the speech tempo, the rate of change of speech tempo).
5. Change in the average length of a passage of speech uttered without pauses due
to indecisiveness (a decrease in comparison with speech in a normal state).
Our objective was to create an electronic speech analyzer to be used in a diagnostic
system recognizing an individual's emotional state on the basis of his speech
characteristics.
The following were used as informative parameters: change in frequency of the voice's
fundamental tone and changes in temporal characteristics of speech.
To simplify the circuitry of the analyzer, we settled on digital representation of
amplitude and temporal parameters.
The speech analyzer is based on series K-155 integrated microcircuits assembled into
a set of counters accumulating information on the temporal characteristics of speech
within 10 seconds of current time, and information on changes in the frequency of
the fundamental tone during each second of the tonal signal.
The fundamental tone analysis circuit represents a digital filter, and the parameter
it measures is the per-second distribution of the number of periods in the spoken
signal on a frequency axis.
Change in signal intensity is determined by comparison of the envelope of the
wideband signal with the thresholds of normal intensity--that is, normal, loud and
soft bands are isolated.
The parameter measured here is the number of times a given threshold is exceeded in
a particular time interval.
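The threshold-counting idea can be sketched in software as follows. This is only an illustration, not the authors' circuit: the envelope samples, the two threshold values and the function name count_threshold_crossings are all assumptions made for the example.

```python
def count_threshold_crossings(envelope, threshold):
    """Count how many times the envelope rises above the threshold.

    A crossing is counted each time a sample exceeds the threshold
    after a sample that did not exceed it.
    """
    crossings = 0
    above = False
    for value in envelope:
        if value > threshold and not above:
            crossings += 1
        above = value > threshold
    return crossings


# Hypothetical envelope samples for one time interval (arbitrary units).
envelope = [0.1, 0.4, 0.9, 0.5, 0.2, 0.8, 1.3, 0.7, 0.1]

# Assumed limits separating the soft, normal and loud bands.
soft_limit, loud_limit = 0.3, 1.0

soft_crossings = count_threshold_crossings(envelope, soft_limit)
loud_crossings = count_threshold_crossings(envelope, loud_limit)
```

Counting only upward crossings keeps a long loud stretch from being counted more than once, which seems closest to "the number of times a given threshold is exceeded."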
[Block diagram of the analyzer; the figure itself is not reproduced. Its numbered labels are listed in the key.]
Figure 1
Key:
1. Pause block
2. Msec
3. Temporal characteristics block
4. Articulation tempo
5. Pause duration
6. Number of pauses
7. Intensity measuring block
8. Simulation block
9. Low frequency filter
10. Intonational characteristics block
11. Duration-to-voltage converter
12. TSh [not further identified]
13. Pulse counters
14. Print-out block
15. Tonal signal envelope block
16. Print-out control block
17. Time sensor block
Change in the parameters of speech with respect to time provides a complete dynamic
picture of changes in the individual's functional state.
The functional layout of the analyzer consists of the following blocks (Figure 1):
temporal characteristics isolating block;
intonational characteristics block;
print-out control block;
print-out block;
pause simulation block;
time sensor block;
intensity measuring block.
The analyzer works as follows. In the pause block the spoken signal is transformed
into an envelope indicating presence and absence of a signal; two types of envelopes
are produced in this case: with a pause duration T1 >= 30 msec and with a pause
duration T2 >= 250 msec. The envelope with T1 pauses is used to count the number of pauses
and their duration within each 10 second interval, and the envelope with T2 pauses
is used together with the envelope of the tonal signal to obtain the characteristics
of articulation tempo. Concurrently the vocal signal passes from the pause block
through the low frequency filter, with f_ave = 400 Hz, into the intonational characteristics
formation block.
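The two pause envelopes can be illustrated with a short software sketch. It assumes the signal has already been reduced to a sequence of presence/absence flags at a fixed sampling step; the 10 msec step, the sample data and the function name find_pauses are assumptions for the example, and only the 30 msec and 250 msec limits come from the text.

```python
def find_pauses(active, sample_ms, min_pause_ms):
    """Return durations (msec) of silent runs lasting at least min_pause_ms.

    active: sequence of booleans, True where a signal is present.
    Silence at the very start or end of the sequence is also counted.
    """
    pauses = []
    run = 0  # length of the current silent run, in samples
    for flag in active:
        if flag:
            if run * sample_ms >= min_pause_ms:
                pauses.append(run * sample_ms)
            run = 0
        else:
            run += 1
    if run * sample_ms >= min_pause_ms:
        pauses.append(run * sample_ms)
    return pauses


# Hypothetical presence/absence flags sampled every 10 msec:
# speech, a 50 msec gap, speech, a 300 msec gap, speech.
active = [True] * 20 + [False] * 5 + [True] * 10 + [False] * 30 + [True] * 15

t1_pauses = find_pauses(active, sample_ms=10, min_pause_ms=30)   # T1 envelope
t2_pauses = find_pauses(active, sample_ms=10, min_pause_ms=250)  # T2 envelope

pause_count = len(t1_pauses)        # number of pauses in the interval
pause_total_ms = sum(t1_pauses)     # their combined duration
```

In this sketch the shorter limit picks up both gaps, while the longer limit keeps only the 300 msec gap, mirroring the two envelopes described above.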
This block measures the duration of each period of the frequency-modulated signal of
the fundamental tone, and depending on the duration of the period, a value of one is
added to one of the eight accumulating counters. As a result the eight counters
provide the pulse frequency distribution for a 1 second interval of the fundamental
tone. Intensity is measured in the intensity measuring block and compared with eight
thresholds. The parameter measurements are fed to the print-out block, which is
controlled by the time sensor block and by signals from an I~-16 high-speed type-
writer.
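The eight-counter scheme amounts to building a histogram of fundamental tone periods. The following sketch shows the idea under stated assumptions: the band edges in BAND_EDGES_HZ and the sample periods are invented for illustration, since the report does not give the actual band limits.

```python
from bisect import bisect_right

# Assumed upper edges (Hz) of the first seven bands; the eighth band is
# open-ended. These values are illustrative, not taken from the report.
BAND_EDGES_HZ = [100, 140, 180, 220, 260, 300, 340]


def period_histogram(periods_ms):
    """Add one to the counter whose band contains each period's frequency."""
    counters = [0] * 8
    for period in periods_ms:
        frequency = 1000.0 / period  # period in msec -> frequency in Hz
        counters[bisect_right(BAND_EDGES_HZ, frequency)] += 1
    return counters


# Hypothetical period durations (msec) measured within one second:
# 80, 125, 125, 200, 250 and 400 Hz respectively.
periods = [12.5, 8.0, 8.0, 5.0, 4.0, 2.5]
counters = period_histogram(periods)
```

Each counter then holds the number of periods that fell into its band during the interval, which is one way to read "the pulse frequency distribution for a 1 second interval."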
Independent verification of the analyzer is achieved by the simulation block, which
generates pulses of variable frequency within the limits of the fundamental tone
frequency, and tonal signal envelopes with pauses T1 >= 30 msec and T2 >= 250 msec.
SOME CHARACTERISTICS OF EMOTIONAL WHISPERED SPEECH
E. A. Nushikyan, T. A. Brovchenko, S. N. Kolymba
Whispered speech has been an object of analysis several times at both the perception
level and the acoustic level. It has been studied mainly with respect to individual
sounds, syllables and words (1,2,3). Interest in studying whispered speech has recently
grown, but we were unable to find any descriptions, in the linguistic literature, of
the acoustic characteristics of emotional whispered speech, which is of considerable
interest to a number of disciplines, including linguistics.
The objective of our research was to determine the acoustic characteristics of emotional
whispered speech. Our research material consisted of 76 emotionally colored
phrases expressing anger, amazement, irony and approval, and neutral phrases corresponding
to the former. In keeping with our objective, we had a speaker read each
sentence four times--the emotional and the neutral variants were each read in a whisper
and at normal loudness. Emotionality was imparted to the phrases by their pronunciation
in context. To achieve neutral phrases, we left out the emotionally
colored context. Recordings of whispered and loud pronunciation of the phrases by
the speakers were subjected to listener analysis to determine how identifiable the
emotional states were by native listeners.
Only those phrases which, according to not less than 80 percent of the listeners,
expressed the expected emotional connotations were selected for electro-acoustic
analysis.
The next stage of the work was electro-acoustic analysis of the selected phrases,
performed with an intonograph at the laboratory of experimental phonetics of Odessa
State University. Electro-acoustic analysis was performed with respect to the following
characteristics: the envelope of the fundamental tone frequency for the entire
phrase, the frequency range of the phrase in pt [not further identified], the frequency
interval of the principal stressed vowel in pt, the peak value of the fundamental
tone frequency, phrase duration in msec, average syllable duration in msec,
and the speech rate--the number of syllables uttered per second.
Phrases were subjected to comparative analysis in two planes: emotionality-neutrality,
loud-whispered speech.
As far as the first opposition is concerned, it was examined in previous works by
the authors (4) in relation to loud speech; our present objective was to study this
opposition in relation to whispered speech.
As we know from the appropriate literature, the sounds of whispered speech are created
by the passage of air through constricted passageways of the whisper triangle and the
glottis with the vocal cords not vibrating. The resulting noise is the principal source
of the sounds of whispered speech, as opposed to vocal phonation (2).
Obviously when whispered speech supplants vocal phonation the envelope of the
fundamental tone can still be isolated, as is confirmed by other studies. There are
indications in the linguistic literature that not only can the fundamental tone be
heard in whispered speech, but also it can be voluntarily modulated (3). Visual
analysis of the fundamental tone frequency envelope of our intonograms revealed that
in comparison with voiced speech, the periods of the fundamental tone of whispered
speech, complicated by noise components, exhibit an irregular nature.
Calculations showed that the percentage of the signal falling within the fundamental
tone channel was on the order of 30 percent, as compared to 95 percent for vocal
speech. Graphs revealing the dynamics of the fundamental tone frequency distinctly
showed retention of the configuration of the fundamental tone curve in relation to
both whispered and voiced speech.
A decrease in overall frequency was found to be specific to whispering in both emo-
tional and neutral pronunciation. Thus in regard to the opposition "whispered-voiced
speech," the maximum frequency of whispered speech was found to be 5 pt lower in rela-
tion to neutral speech and 3 pt lower in relation to emotional speech. In whispered
speech, emotional phrases were distinguished by a maximum frequency 5 pt higher than
that of the neutral variant.
The van der Waerden test, which does not require knowledge of the distribution function
and which may be used with a small number of variants, was applied to reveal differences
in the compared neutrality-emotionality oppositions of whispered speech.
Interpreting the peak frequency of the fundamental tone (for female voices) in whispered
neutral speech as a manifestation of random variable X and the corresponding characteristic
of emotionally whispered speech as a manifestation of a random variable Y,
and identifying the latter with serial numbers r and considering that n1=n2=14 and
that n=28, we calculate X using the formula
$X = \sum_{r} \Psi\left(\frac{r}{n+1}\right) = 8.27$ (5)
Turning to the van der Waerden test table, we find $X_{05}$ and $X_{01}$:
$X > X_{05} > X_{01}$
8.27   6.09   4.69
The calculated value confirms the presence of significant differences between the
average values of the peak fundamental tone frequency for emotional and neutral
phrases uttered in whispered speech. It may be concluded that these two general sets
are unconditionally different.
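For readers who wish to reproduce this kind of calculation, the statistic can be sketched as follows. This is a minimal illustration of the standard van der Waerden statistic (the sum of inverse-normal scores of the ranks of one sample within the pooled sample), not the authors' own computation; the function name and the sample values are hypothetical, and tied values are not handled.

```python
from statistics import NormalDist


def van_der_waerden_statistic(sample_x, sample_y):
    """Sum psi(r / (n + 1)) over the ranks r of sample_y in the pooled sample.

    psi is the inverse of the standard normal distribution function.
    This sketch assumes all pooled values are distinct (no ties).
    """
    pooled = sorted(sample_x + sample_y)
    n = len(pooled)
    rank = {value: position + 1 for position, value in enumerate(pooled)}
    psi = NormalDist().inv_cdf
    return sum(psi(rank[value] / (n + 1)) for value in sample_y)


# Hypothetical peak-frequency values (arbitrary units), not the article's data.
neutral = [1.0, 2.0, 3.0]
emotional = [4.0, 5.0, 6.0]

statistic = van_der_waerden_statistic(neutral, emotional)
reversed_statistic = van_der_waerden_statistic(emotional, neutral)
```

The absolute value of the statistic would then be compared with the tabulated critical value for the given n and n1 - n2; swapping the roles of the two samples only changes its sign.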
Presence of differences in the peak fundamental tone frequency of emotional and
neutral whispered speech can also be seen in relation to male voices, though it would
be difficult to conclude that the general sets are unconditionally different, inasmuch
as
Xos