ABSTRACTS OF THE CONFERENCE ON MACHINE TRANSLATION (MAY 15-21, 1958)
Document Type:
Collection:
Document Number (FOIA) /ESDN (CREST):
CIA-RDP68-00069A000100200007-9
Release Decision:
RIFPUB
Original Classification:
K
Document Page Count:
91
Document Creation Date:
December 9, 2016
Document Release Date:
September 24, 1998
Sequence Number:
7
Case Number:
Publication Date:
July 22, 1958
Content Type:
REPORT
File:
Attachment | Size |
---|---|
CIA-RDP68-00069A000100200007-9.pdf | 6.5 MB |
Body:
Approved For Release 2000/08/24: CIA-RDP68-00069A000100200007-9
22 July 1958
JPRS/DC-241.
ABSTRACTS
OF THE
CONFERENCE ON MACHINE TRANSLATION
(MAY 15-21, 1958)
PHOTOCOPIES OF THIS REPORT
MAY BE PURCHASED FROM THE
PHOTODUPLICATION SERVICE
LIBRARY OF CONGRESS
WASHINGTON 25, D. C.
U. S. JOINT PUBLICATIONS
RESEARCH SERVICE
Main Office:
Room 1125
205 E 42nd Street
New York 17, N. Y.
D. C. Office:
Second Floor
1636 Connecticut Ave., N.W.
Washington 9, D. C.
JPRS/DC-241
CSO DC-2026
Ministry of Higher Education, USSR
First Moscow State Pedagogical Institute of Foreign Languages
ABSTRACTS
OF THE
CONFERENCE ON MACHINE TRANSLATION
(May 15-21, 1958)
MOSCOW, 1958
TABLE OF CONTENTS
PLENARY SESSION
1. Andreyev, N. D. (Leningrad), A Metalanguage of Machine
Translation
2. Bel'skaya, I. K. (Moscow), Some General Problems in
Machine Translation
3. Bokarev, Ye. A. (Moscow), An Intermediary Language and
Artificial International Languages
4. Dobrushin, R. L. (Moscow), The Value of Mathematical
Methods in Linguistics
5. Ivanov, V. V. (Moscow), Conversion of Communications
and Conversion of Codes
6. Kuznetsov, P. S. (Moscow), The Sequence in Building a
Language System
7. Lyapunov, A. A. and Kulagina, O. S. (Moscow), Machine
Translation Studies in the Mathematical Institute
of the Academy of Sciences, USSR
8. Mel'chuk, I. A. (Moscow), An Intermediary Language Model
for Machine Translation
9. Steblin-Kamenskii, M. I. (Leningrad), The Significance
of Machine Translation for Linguistics
10. Revzin, I. I. (Moscow), The "Active" and "Passive"
Grammar of L. V. Shcherba and the Problems of
Machine Translation
11. Rozentsveig, V. Yu. and Revzin, I. I. (Moscow), A
General Theory of Translation in Connection With
Machine Translation
THEORETICAL SECTION
12. Artemov, V. A. and Zimnyaya, I. A. (Moscow), Spectra
of Phonemes and Their Use in Machine Translation
13. Vinogradova, O. S. and Luriya, A. R. (Moscow), An
Objective Investigation of Meaning Associations
14. Grigor'yev, V. I. (Moscow), The Treatment of Certain
Concepts in Structuralism
15. Grigoryan, V. M. (Yerevan), The Significance of Frequency
as a Factor in Determining the Stylistic Function of
Words
16. Dobrushin, R. L. (Moscow), An Experiment to Define the
Concept of Grammatical Category
17. Dolgopol'skii, A. B. (Moscow), The Theory of Probability
and Determination of Linguistic Relationship
18. Zinov'yev, A. A. (Moscow), A General Theory of Definition
and the Possibility of Applying It to the Theory of
Translation Devices
19. Ivanov, V. V. (Moscow), Linguistic Problems Connected
With Poetry Translation
20. Ivanov, V. V. (Moscow), Hegel's Theorem and Linguistic
Paradoxes
21. Iliya, L. I. (Moscow), Methods of Breaking Down a
Syntactic Whole
22. Kolshanskii, G. V. (Moscow), The Logical Nature of
Context
23. Kotov, R. G. (Moscow), Linguistic Statistics From
Russian Texts
24. Kulagina, O. S. (Moscow), A Method of Defining
Grammatical Categories
25. Revzin, I. I. (Moscow), A Formal Theory of the Sentence
26. Reformatskii, A. A. (Moscow), Translation sub specie
structuralismi
27. Rozentsveig, V. Yu. (Moscow), A System of Recording
Speech for Oral Translation
28. Sokolyanskii, I. A. (Moscow), Language Training For
Blind Deaf-Mutes
29. Strelkovskii, G. M. (Moscow), Some General Principles in
Compiling Glossaries For Machine Translation
30. Toporov, V. N. (Moscow), Some Analogies to the Problems
and Methods of Contemporary Theoretical Linguistics
in Ancient Indian Grammatical Works
31. Udartseva, M. G. (Petrozavodsk), The Frequency of Lexical
Units in English Geological Literature
32. Finn, V. N. and Lakhuti, D. G. (Moscow), One Approach to
Logical Semantics
33. Frumkina, R. M. (Moscow), Some Problems Connected With
Alternating Stems in Constructing an Algorithm of
Machine Translation For Spanish
34. Shaumyan, S. K. (Moscow), A Logical Analysis of the
Concept of Language Structure
35. Shevoroshkin, V. (Moscow), Ancient Texts and Machine
Translation
SECTION ON ALGORITHMS OF MACHINE TRANSLATION
36. Agrayev, V. A. (Gorki), An Algorithm for Translating
French into Russian Electronically
37. Andreyev, N. D. (Leningrad), Principles in the
Construction of Electric Reading Devices
38. Andreyev, N. D. (Leningrad), Work on an Indonesian-Russian
Algorithm of Machine Translation
39. Andreyev, N. D., Batova, D. A., and Panfilov, V. S.
(Leningrad), Work on a Vietnamese-Russian Algorithm
of Machine Translation
40. Babintsev, A. A. (Leningrad), Work on a Japanese-Russian
Algorithm of Machine Translation
41. Bagrinovskaya, G. P. and Gavrilova, G. L. (Moscow), The
Programming of Translation From English into Russian
42. Belokrinitskaya, S. S. (Moscow), Principles in Compiling
a German-Russian Glossary of Polysemants for Machine
Translation
43. Bel'skaya, I. K. (Moscow), Main Features of the Glossary
and Grammatical Programs for English-Russian Machine
Translation
44. Berkov, V. P. (Leningrad), Work on a Norwegian-Russian
Algorithm of Machine Translation
45. Bratchikov, I. L., Fitialov, S. Ya., and Tseitin, G. S.
(Leningrad), Glossary Structure and Information
Coding for Machine Translation
46. Vinogradova, V. N. (Moscow), Gender as a Superfluous
Category of the Russian Verb
47. Volotskaya, Z. M. (Moscow), The Synthesis of Russian
Verb Forms in Machine Translation
48. Volotskaya, Z. M., Paducheva, Ye. V., Shelimova, I. N.,
and Shumilina, A. L. (Moscow), Russian Syntagmas
49. Volotskaya, Z. M., and Shumilina, A. L. (Moscow),
Synthesis of the Russian Clause
50. Voronin, V. A. (Moscow), Grammatical Analysis for Machine
Translation of Chinese into Russian
51. Grigor'yev, V. I. and Belonogov, G. G. (Moscow), Application
of Machine Translation Methods to the Lexical Coding
of Telegraphic and Telephonic Communications
52. Yefimov, M. B. (Moscow), Some Problems in Machine
Translation From Japanese into Russian
53. Zasorina, L. N. (Leningrad), Work on the Russo-English
Algorithm of Machine Translation
54. Katenina, T. Ye. (Leningrad), Work on a Hindustani
(Hindi)-Russian Algorithm of Machine Translation
55. Komissarova, N. V. (Gorki), An Algorithm for Translating
English Texts on Radio Engineering into Russian
56. Kulagina, O. S. (Moscow), Automatization of Translation
Programming
57. Kulagina, O. S. (Moscow), A French-Russian Translation
Algorithm
58. Langleben, M. M. (Moscow), Determination of Syntactic
Connections for Formulas in Russian Mathematical
Texts
59. Langleben, M. M. and Paducheva, Ye. V. (Moscow),
Elimination of Morphological and Syntactic Homonymy
in Analyzing English Texts
60. Leont'yeva, N. N. and Vavilova, G. N. (Moscow), The
Superfluousness of Russian Adjective Inflection
61. Moloshnaya, T. N. (Moscow), An Algorithm of Machine
Translation from English into Russian
62. Muratov, R. S. (Sverdlovsk), A Device for the Reading of
Ordinary Printed Material by the Blind
63. Nikolayeva, T. N. (Moscow), Analysis of Punctuation Marks
During Machine Translation From Russian
64. Paducheva, Ye. V. (Moscow), Some Problems Connected With
the Analysis of Complex Sentences and Clauses With
Similar Members
65. Parshin, V. V. (Moscow), Machine Translation of Compound
Nouns From German into Russian
66. Superanskaya, A. V. (Moscow), Proper Nouns in Machine
Translation
67. Timofeyeva, O. (Leningrad), Work on a Burmese-Russian
Algorithm of Machine Translation
68. Frolova, O. B. (Leningrad), Work on an Arabic-Russian
Algorithm of Machine Translation
69. Chekova, G. V. (Moscow), Experimental Translations From
French into Russian
70. Shelimova, I. N. (Moscow), Establishment of Syntactic
Cues for Prepositional Phrases
71. Shumilina, A. L. (Moscow), Correlativity of 3rd Person
Personal Pronouns and the Nouns for Which They
Substitute
1. THE METALANGUAGE OF MACHINE TRANSLATION
AND ITS APPLICATION
N. D. Andreyev (Leningrad)
1. We call a metalanguage any linear system of signs used for the
written designation of the elements in a particular system of ideas and the
relations between these elements.
2. The class of metalanguages at the present time comprises mathematics,
physics, chemistry, formal genetics, and symbolic logic.
3. The preparation of algorithms for machine translation requires the
development of a special metalanguage in the symbols of which may be described
the facts and relationships of the language systems that are subject to
equivalent comparison.
4. The symbols used in the metalanguage of machine translation are
regarded as metalanguage words and grouped in categories analogous to the
parts of speech.
5. Types of commands in M.T. [machine translation] are regarded as
metamoods [META-NAKLONENIYA].
6. The use of metalanguage in the analytic part of algorithms.
7. The use of metalanguage in the transformational part of algorithms.
8. The use of metalanguage in the synthetic part of algorithms.
9. The possibility and value of a general theory of metalinguistic
systems.
10. A comparative analysis of the class of metalanguages and the class
of spoken languages may serve as a basis for elucidating the relations between
formal logical semeiotics and general linguistics.
2. SOME GENERAL PROBLEMS IN MACHINE TRANSLATION [M.T.]
I. K. Bel'skaya (Moscow)
1. Experience gained in preparing experimental routines for machine
translation from English, German, Chinese, Japanese, and Russian in the ITM
and VT [Institut tochnoi mekhaniki i vychislitel'noi tekhniki / Institute of
Precision Mechanics and Computer Engineering] of the Academy of Sciences,
USSR, confirms the assumption that translation, even in such an unusual form
as machine translation, is, as far as content is concerned, a linguistic
problem.
2. The development of linguistic methods of solving M.T. problems may
be achieved on the basis of so-called "traditional linguistics" and the
results of such work may be of definite interest to linguistics.
The systematization of language phenomena that accompanies M.T. research
should help to eliminate the well known contradictions and diffuseness in
the definitions of certain linguistic categories accepted at the present time.
3. A distinction between the lexical and grammatical aspects of the
translation problem seems essential. The difference in quality and degree
of lexical and grammatical abstraction emerges in the system of machine
translation with unusual clarity.
Rules of lexical character are recorded in a glossary. Grammatical
rules are not included in the glossary and form the content of so-called
"translation routines".
4. An M.T. glossary must be so constructed that its various parts can
expand unevenly.
An M.T. glossary may be divided into 2 main sections:
I single-meaning glossary, and
II multiple-meaning glossary.
Each of these is in turn subdivided into:
Ia glossary of technical terms,
Ib glossary of words in general use;
IIa glossary of full-meaning words,
IIb glossary of auxiliary words.
An M.T. glossary is accompanied by several auxiliary routines (comprising
one cycle in the translation routine) in order that the lexical
analysis of a sentence may be performed without human intervention:
1. Routine of dividing a sentence into words. Routine 1 is not
essential for all languages, only for such as Chinese, Japanese, Arabic,
etc., where the sentence is written down in the form of an unbroken succession
of signs with no spaces between the words.
2. Routine of obtaining the glossary form of a word.
3. Grammatical analysis of "unknown words".
4. Syntactic analysis of "formulas".
5. Routine of distinguishing homonyms.
6. Routine of analysis of polysemy.
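Routine 1 in the list above (dividing an unspaced sentence into words) can be illustrated with a minimal sketch. The abstract does not say how the routine works; the sketch assumes a greedy longest-match search against the glossary, and the function name `segment`, the `max_len` limit, and the toy Japanese glossary are all hypothetical.

```python
def segment(text, glossary, max_len=4):
    """Greedy longest-match division of an unspaced sign sequence into words.

    Assumption: at each position the longest glossary entry wins; a sign that
    matches nothing is emitted alone so that it can be passed on as an
    "unknown word".
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in glossary or size == 1:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy glossary: "machine", "translation", "machine translation",
# the particle "no", and "problem".
toy_glossary = {"機械", "翻訳", "機械翻訳", "の", "問題"}
print(segment("機械翻訳の問題", toy_glossary))  # ['機械翻訳', 'の', '問題']
```

Longest match is only one possible choice; it prefers the compound entry over its parts, which is why the first four signs come out as a single word here.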
5. The basic problems of an M.T. glossary--size and polysemy--are
satisfactorily solved by combining the following two methods:
(a) division of the glossary into a series of "special glossaries"
corresponding to various spheres of human activity (in our case - correspond-
ing to the various branches of science);
(b) contextual (functional - semantic) analysis of the words.
6. The main features of an M.T. glossary are that it:
(a) contains a systematized description of each word that is
capable of ensuring the subsequent grammatical analysis of the word in the
sentence (the "invariant characteristics of the word");
(b) provides for a genuine correspondence between two lexical
systems, registering the "relevant meanings" of words;
(c) takes cognizance of "zero meanings" of words, i.e. instances
where a word must not be translated into another language as a separate
lexical unit.
For the rest, an M.T. glossary may be arranged on the same principles
as those underlying existing bilingual dictionaries. In particular, there
is no need to convert an M.T. glossary into a "glossary of stems". Moreover,
a glossary of words has definite advantages for M.T. too.
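The glossary layout described in points 4 to 6 can be sketched as a small data structure. This is an illustration under stated assumptions, not the Institute's actual format: the names `Entry`, `SINGLE`, `MULTIPLE`, and `look_up`, the context labels, the category assigned to "the", and the sample English words with transliterated Russian equivalents are all hypothetical. A translation of `None` stands for a "zero meaning".

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    category: str                                  # invariant characteristic of the word
    meanings: dict = field(default_factory=dict)   # context label -> translation (None = zero meaning)


# Section I: single-meaning glossary (hypothetical sample entry).
SINGLE = {"oxide": Entry("substantive", {"*": "okis'"})}

# Section II: multiple-meaning glossary, including an auxiliary word
# whose only "meaning" is the zero meaning.
MULTIPLE = {
    "the": Entry("particle", {"*": None}),
    "field": Entry("substantive", {"physics": "pole", "*": "oblast'"}),
}


def look_up(word, context="*"):
    """Return (category, translation) for a word in a given context.

    (None, None) means the word is not in the glossary at all and would be
    left to the grammatical analysis of "unknown words".
    """
    entry = SINGLE.get(word) or MULTIPLE.get(word)
    if entry is None:
        return None, None
    # Fall back to the default meaning "*" when the context has no special entry.
    return entry.category, entry.meanings.get(context, entry.meanings.get("*"))


print(look_up("field", "physics"))  # ('substantive', 'pole')
print(look_up("the"))               # ('particle', None)
```

Keeping the two sections as separate tables reflects the requirement that the parts of the glossary be able to expand unevenly.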
7. The solution of the problem of grammatical analysis in M.T. is
connected with the realization of a logical, structural description of
language. Hence, conclusions drawn from solving this problem may have a
certain general linguistic interest.
8. Following the grammatical analysis of 5 linguistic systems--English,
German, Chinese, Japanese, and Russian--for M.T., it seemed possible to use
a consistent system of dividing words into the following 9 lexico-grammatical
categories:
1. verbs,
2. substantives,
3. numerals,
4. adjectives,
5. adverbs,
6. prepositions [Chinese and Japanese postpositions may be classified
as prepositions on the basis of their resemblance to them],
7. conjunctions,
8. particles,
9. parenthetic words.
The principle of dividing words into these classes is similar to that
underlying the division of words into parts of speech. Hence, there is no
need to do away with the traditional names of the parts of speech. Only a
bit more precision is required.
Thus, the classes of numerals, adjectives, and adverbs have been changed.
Pronouns are not isolated in a separate class, but the pronominal category
differs for such parts of speech as substantives, adjectives, and adverbs.
Systematization of grammatical categories within each part of speech
resulted in differentiating between the variant (contextual) and invariant
grammatical characteristics of the words.
9. The grammatical processing of sentences by the translation routines
breaks down into two independent steps:
Analysis of sentence to be translated, and
Synthesis of translated sentence.
We call analysis routines that system of rules whereby the linguistic
analysis of a sentence to be translated can be performed in such a way as
to produce the information needed for the grammatical structure of the
translated sentence.
In the M.T. variant developed at the Institute of Precision Mechanics
and Computer Engineering of the Academy of Sciences, USSR, the analysis
routines include the following 8 routines in cycle II:
1. functional analysis of punctuation marks;
2. breakdown of sentences into clauses and more precise definition
of parenthetical phrases in clauses;
3. syntactic analysis of clauses;
4. "verb" routine;
5. "numeral" routine;
6. "substantive" routine;
7. "adjective" routine;
8. "changing word order in translated sentence" routine.
10. We call synthesis routines that system of rules whereby the
grammatical structure of the translated clause can be formed.
As of now 4 synthesis routines for the Russian sentence have been
worked out:
1. word-forming routine;
2. "verb" routine;
3. "adjective" routine;
4. "substantive" routine.
It is proposed to develop a routine for editing the style of translated
Russian sentences as well as synthesis routines for several other languages,
particularly Chinese and English.
This would make it possible to produce multilingual machine translation
(from many languages into many languages), using Russian, it is suggested,
as an intermediary language.
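The economy of routing multilingual translation through an intermediary language can be made concrete with a short sketch. It rests on an assumption not spelled out in the abstract: that each language needs one set of analysis routines and one set of synthesis routines, whereas direct translation needs a separate algorithm for every ordered pair of languages. The function names are hypothetical.

```python
def direct_pairs(n_languages):
    """Ordered (source, target) pairs needing their own algorithm without a pivot."""
    return n_languages * (n_languages - 1)


def pivot_routine_sets(n_languages):
    """One analysis set plus one synthesis set per language (assumed)."""
    return 2 * n_languages


def route(source, target, pivot="Russian"):
    """List the translation hops from source to target via the pivot language."""
    if source == target:
        return []
    if pivot in (source, target):
        return [(source, target)]          # a single hop when the pivot is an endpoint
    return [(source, pivot), (pivot, target)]


# The five languages named in the abstract.
print(direct_pairs(5), pivot_routine_sets(5))   # 20 10
print(route("Chinese", "English"))              # [('Chinese', 'Russian'), ('Russian', 'English')]
```

Under these assumptions the saving grows with the number of languages, since the direct count is quadratic and the pivot count linear.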
3. AN INTERMEDIARY LANGUAGE AND ARTIFICIAL
INTERNATIONAL LANGUAGES
Ye. A. Bokarev (Moscow)
1. Creation of an intermediary language for machine translation or
an artificial Esperanto-type international language requires the solution
of several problems, the main one being the need to establish correspondences
between the lexical and grammatical units of languages that differ in their
structural characteristics.
2. International languages based on natural languages use everything
that is essential for communication and reject what is non-essential or of
little value (exceptions of various kinds, polytypic declensions and
conjugations, etc.). The most consistent in this respect are the autonomistic
languages (Esperanto and Ido). Languages of another kind - the naturalistic
(Interlingua and Occidental) - retain certain of the unjustified complications
and inconsistencies of natural languages.
3. The most important problems in the field of grammar are: indication
of the parts of speech, expression of subject-object relations, and word
order in sentences.
4. In the field of word formation there is the problem of productivity
of word-forming affixes and use of established patterns.
5. Some of these problems may be solved in various ways when an
intermediary language or an artificial language for international relations is
created. Nevertheless, there are many problems that can be solved in similar
fashion.
4. THE VALUE OF MATHEMATICAL METHODS IN LINGUISTICS
R. L. Dobrushin (Moscow)
1. Uses of linguistics as a justification for its existence. Classical
fields of use: teaching of languages and application to problems in history.
2. Demands on language research imposed by classical fields of
application of linguistics.
3. Newest fields of application of linguistics: mechanical translation
and use for transmission of information in the form of written and oral
linguistic material.
4. Problems and methods of linguistic research dictated by the newest
fields of linguistic applications.
5. Mathematical methods of linguistic investigation:
(a) methods used in theory of numbers applied to investigation
of the grammatical structure of language;
(b) investigation of language structure by methods used in the
theory of information;
(c) linguistic statistics.
6. Interrelations between classical and modern linguistic techniques.
Potential for the development of mathematical methods.
5. CONVERSION OF COMMUNICATIONS AND CONVERSION OF CODES
V. V. Ivanov (Moscow)
1. In theoretical investigations dealing with automatization of
linguistic processes, it is advisable to distinguish the conversion of
communications (texts) from the conversion of codes (sign systems).
2. By communication conversion we understand the translation of a
communication from one code into another (recoding) while retaining the
invariant information. When speech is transmitted at a distance, the
linguistic structure of the text is kept, which makes this case very simple.
When sentences are converted within a single language the linguistic
structure of the text is partially transformed. This transformation may,
therefore, be regarded as a first approach to machine translation. In
translating from one concrete language into another concrete language or
into an intermediary language, it is possible to preserve the characteristics
of the linguistic structure of the text, which are directly reflected on the
structure of the text in the other language. In translating into the logical,
abstract language of an information machine, only the logical structure of
the text can be preserved. The increasing degree of difficulty of each of
these tasks is determined by the complexity of the rules for converting a
communication, which vary with the extent to which the information appearing
as an invariant during the conversions can be formalized.
3. By code conversion we understand the translation of one code into
another while retaining the code pattern. An intermediary language for
machine translation and an abstract machine language for an information
machine may be regarded as abstract systems, which are represented by the
concrete language of scientific and technical texts. Therefore, to develop
these abstract systems we require a formal analysis of the individual
concrete languages in order to reveal their common patterns. An abstract
machine language may be constructed by converting concrete languages derived,
in turn, from interpreting an abstract language. The general theory of code
conversion may be used for the deductive derivation of one scientific system
from another. In this connection it is necessary to investigate code
isomorphism in the various sciences (and code isomorphism in a single science
at various stages in its history). At the same time a general theory of
code conversion makes it possible to formulate with greater precision the
concepts of comparative and historical linguistics due to the fact that
comparative-historical calculation is a special case of code calculation.
6. THE SEQUENCE IN BUILDING A LANGUAGE SYSTEM
P. S. Kuznetsov (Moscow)
1. Any language is a system of simple units of various orders so
interlinked by hierarchical relations that each elemental unit is in some
respect indivisible (without loss of some of its properties) and at the
same time consists of a certain number of units of a lower order.
2. The simple units of one order form what is called a level, stage,
or layer in a language system. Thus, one level is formed by such elemental
units as phonemes, another by morphemes, which consist of phonemes, a third
by lexemes (words), which consist of morphemes, etc.
3. When we build any language system, apparently the simplest way should
be to define in succession the units of the lowest order and then pass on
to the units of the next higher order, the units and relations in which
they must be defined in accordance with concepts already defined for the
next lower order. Thus, having defined the concept of phoneme, we may
define the morpheme, which always consists of a certain number of phonemes.
4. But if we proceed in this fashion, we shall not be able to
construct an internally consistent system, since at certain stages along the
way we will meet up with vicious circles (in the logical sense).
5. The reason is that a system of units in any single order requires
certain concepts lying outside itself for its own construction or, in other
words, forming with respect to it meta-concepts [META-PONYATIYA]. These
meta-concepts relate in part to the system of units in a lower order (with
respect to the order in question) and they may relate in part also to the
system of units in a higher order (with respect to the order in question).
Thus, the definition of phonemes and their interrelations (in the phonological
sense, to which I subscribe; I have often set forth in print the case for
this view, so there is no need for me to go into it again here) are based
not only on concepts from the field of phonetics, but also on some concepts
from the field of morphology, i.e., they relate to the level of morphemes.
6. A more complicated method of constructing a language system is
outlined on the basis of the foregoing. In some cases it is necessary to
proceed directly from the system of the lower (e.g. first) order not to the
next higher (in the given case, second) order, but to the following (in
our case, third) order; and having constructed it without utilizing the
concepts of the second order, to proceed to this last; and then to return to
the system of the third order and finish constructing it, now also making
use of the concepts relating to the system of the second order.
7. MACHINE TRANSLATION STUDIES IN THE MATHEMATICAL
INSTITUTE OF THE ACADEMY OF SCIENCES, USSR
A. A. Lyapunov and O. S. Kulagina (Moscow)
I. Introduction
1. Electronic computers are a highly efficient means of processing
information.
2. It is practical to use electronic computers as an auxiliary tool
for intellectual work.
3. Human speech as a means of transmitting information.
4. The importance of making it possible for machines to use human
speech.
5. Machine translation as a first step in instructing machines to
work with a language.
II. Brief Description of Work Done
6. French-Russian translation. Empirical formulation of rules.
Construction of an algorithm suited to the machine's capabilities.
Elaboration of problems connected with coding and information conversion
in the machine memory and with the organization of programs to increase the
efficiency of machine operation. Utilization of scales. Work on improving
the algorithm and programs on the basis of experimental translations.
7. English-Russian translation. Use of structural-syntactic analysis
of English. Classification of English and Russian words on the basis of
formal criteria. Grammatical configurations of English and Russian, a
comparison. Problems in eliminating homonymy. Use of experience with
French-Russian translation in problems connected with coding, program
construction, and Russian sentence analysis.
8. Problems in automatizing translation programming. Operational
description of translation algorithms. Compiling program, constructing the
translating program according to its operational description. Significance
of experience gained in programming French-Russian translation.
9. Theory of numbers approach to the construction of a formal grammar.
Classification of words, identification of configurations, determination of
relations between words. Possibilities of using a similar approach to
syntax and phonetics.
10. Basic principles of operation: advance by "ledges" [USTUPAMI];
maximal theoretical interpretation of each step; planning of work based on
interrelations between machine and thought; close contact between groups
working on different languages; joint work of mathematicians and linguists
at all stages starting with the formulation of translation rules.
III. Problems
11. Linguistic problems in machine translation.
(a) Development of precise system of linguistic concepts, their
operation in translation algorithms as a criterion of usefulness.
(b) Development of methods of constructing translation algorithms
for different languages.
(c) Intermediary languages, construction and use.
(d) Problems in linguistic statistics.
(e) Investigation of language structure on the basis of translation
12. Technical problems in machine translation.
(a) Elaboration of effective designs for translation machines.
(b) Establishment of operational systems for these machines.
(c) Elaboration of special memory devices (large capacity with
swift retrieval).
(d) Design of special input and output devices.
13. Mathematical problems in machine translation.
(a) Development of effective means of coding information at the
various stages of operation.
(b) Increasing the output of algorithms.
(c) Investigation of abstract language models and translation
models.
(d) Elaboration of a mathematical language to describe translation
algorithms.
(e) Automatization of programming of translation algorithms.
14. Combined cybernetic problems.
(a) Machine output of algorithms.
(b) Machine production of linguistic statistics.
(c) Machine construction of models of concrete languages on the
basis of limited text materials.
IV. Problems Connected with Work in the
Sphere of Machine Translation
15. Need to elaborate different approaches to the problem by different
research groups maintaining close contact among themselves. Value of
cooperation in machine translation. Need to establish systematic exchange of
information between groups working in different cities.
16. Need for representatives of the various fields of specialization
to participate in the work on machine translation: mathematicians, linguists,
and engineers constantly cooperating at all stages of the work from formulation
of rules to study of experimental translations.
8 e AN INTERJi IARY LANGUAGE MODEL FOR MAC ?I NE TRANS LAT IO N
Io A. Mellchuk (Moscow)
The following represents one of the possible solutions to the problem
of machine translation from many languages into many languagess
la Two sets of rules are worked out for each languages
(a) The rules of analysis which, with the help of appropriate
glossaries an cNa a', effect the transfer of a text into
a conventional numerical code in such a way that each word
in a given form and given syntactic function is matched
one-for-one with a chain of figures called set of infor-
mation for the word. The series of sets of information
developed is broken down into paired typical combinations
with which the relations existing in each given pair of
information sets have been matched one for-one a The fixed
relation between the-two sets of information (containing
the syntactic relation between the corresponding words) is
called a 'wconfiguration"o One member of the pair which
satisfies the given configuration is called the "governing"
and the other the "governed" members The total number of
configurations is-not very large (in a specialized text
no mare than 2OO) o
As a result of the analysis, each word in the text
to be translated is replaced by a set of information and
each set contains an indication of what configuration it
satisfieR. and which member it is,
(b) The rules of synthesis permit transition from the numerical
code, i.e. from a series of sets of information, to
words, to the actual text. This operation is the reverse
of the analysis described above.
Each configuration contains an indication of what
form a word that satisfies the configuration in question
as either member of the pair must have. Therefore, if
we know the stem of a word, the kind of configuration,
and exactly how the word satisfies it, we can synthesize
the necessary form.
Both analysis and synthesis are effected in complete
independence of the translation.
2. A special system of rules and charts is being worked out,
determining correspondences between the conventional numerical codes of
different languages (identical correspondences are not essential; rules for
choice may be used). These correspondences are established on 3 levels:
(a) lexical correspondences (i.e. lexical transfer of stems);
(b) grammatical correspondences (transfer of so-called "extra-syntactic"
categories as, for example, number in nouns or tense and mood in verbs);
(c) syntactic correspondences (correspondences between configurations,
or syntactic relations, of different languages as well as correspondences
between groups of configurations: clauses and various types of phrases).
This abstract system of correspondences is also called
an intermediary language which does not exist, therefore,
as any real or artificial language but represents a unique
calculus.
3. The translation process consists of three steps:
analysis -- transition from a text in the source language to a
series of configurations;
transition -- from a series of configurations in the source
language to a series of configurations in the target language;
synthesis -- transition from a series of configurations in
the target language to a genuine text in it.
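The three steps can be sketched in code. What follows is only a minimal illustration under stated assumptions, not the author's actual rules or tables: the `InfoSet` record, every numerical code, and the two-word Russian and English glossaries are invented for the example. Only the division into analysis, transition, and synthesis is taken from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InfoSet:
    """Hypothetical "set of information" for one word."""
    stem: int            # numerical code of the stem (invented)
    configuration: int   # number of the configuration the word satisfies
    governing: bool      # True = governing member, False = governed member


# Toy glossaries and charts; all entries are assumptions for illustration.
ANALYSIS_RU = {
    "krasnyi": InfoSet(stem=1, configuration=10, governing=False),
    "dom": InfoSet(stem=2, configuration=10, governing=True),
}
STEM_RU_TO_EN = {1: 101, 2: 102}      # lexical correspondences
CONFIG_RU_TO_EN = {10: 20}            # syntactic correspondences
SYNTHESIS_EN = {
    (101, 20, False): "red",
    (102, 20, True): "house",
}


def analyze(words):
    """Analysis: source text -> series of sets of information."""
    return [ANALYSIS_RU[w] for w in words]


def transfer(info_sets):
    """Transition: source-language codes -> target-language codes."""
    return [
        InfoSet(STEM_RU_TO_EN[s.stem], CONFIG_RU_TO_EN[s.configuration], s.governing)
        for s in info_sets
    ]


def synthesize(info_sets):
    """Synthesis: target-language codes -> target text."""
    return [SYNTHESIS_EN[(s.stem, s.configuration, s.governing)] for s in info_sets]


def translate(words):
    return synthesize(transfer(analyze(words)))


print(translate(["krasnyi", "dom"]))
```

Because `analyze` and `synthesize` each consult only the tables of their own language, adding a further language in this sketch would mean adding one analysis table, one synthesis table, and correspondence tables, which is the economy the intermediary-language approach aims at.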
4. Underlying the translation is a syntactic analysis: establishment
of configurations, i.e., ascertaining the relations between words in
the source language and expressing these relations by the most suitable means
in the target language. Such morphological data as case, number, and person
of a verb (also the use of auxiliary words is provisionally included here)
are used only as aids while ascertaining the syntactic relations.
5. During the course of syntactic analysis both the functions
of words in the sentence ("sentence members") and the interdependence of
words are established. The latter factor is especially important, since
the interdependence of words makes it possible during synthesis to regulate
their arrangement, i.e. to achieve the best word order.
6. The model of an intermediary language that has been worked
out for machine translation includes for the present Russian, English,
Chinese, French, and Hungarian. The purpose is to develop a system of
formulating rules and the best method of recording and arranging the material.
9. THE SIGNIFICANCE OF MACHINE TRANSLATION
FOR LINGUISTICS
M. I. Steblin-Kamenskii (Leningrad)
Besides promoting cooperation with representatives of the precise
sciences and thereby instilling linguists with the need for greater accuracy
in their research and formulations, work on machine translation is important
for linguistics in three respects:
(1) It is critical of all the traditional grammatical concepts,
primarily those like the "parts of speech", "members of a clause", "clause",
etc. Based, as it is, on practical considerations, this criticism will be
more objective and effective than purely theoretical criticism.
(2) It makes clear that the same linguistic fact may be described
in various ways depending on what general definitions or terminological
conventions are used, with the result that all the dogmas established in
the individual branches of linguistics need to be reviewed.
(3) It will aid in overcoming linguistic "semantism" [semantizma],
i.e. the practice whereby linguists follow the line of least resistance and
study meanings, not the structure of language. Language differs from other
sign systems not by the existence of meanings (which are not peculiar to
language), but by the structure of expression.
10. THE "ACTIVE" AND "PASSIVE" GRAMMAR OF L. V. SHCHERBA
AND THE PROBLEM OF MACHINE TRANSLATION
I. I. Revzin (Moscow)
1. The polysemantic term "grammar" (either "grammatical structure of
a language" or "description of the grammatical structure of a language")
is one cause of the erroneous conception that a given language has only a
single grammatical structure, that there is only one correct "grammar" (as
a description of a system).
2. The description of a language system depends on the goal that an
investigator sets for himself. This notion was the core of the remarkable
theory of L. V. Shcherba on "passive" and "active" grammar, which has
suffered undeserved oblivion.
3. "Passive grammar studies the functions and meanings of structural
elements in a language on the basis of their forms, i.e. the external side.
Active grammar teaches the use of these forms." (L. V. Shcherba) The
purpose of instruction in passive grammar is to teach one to understand a
text in the language. The purpose of instruction in active grammar is to
teach one to express thoughts in the language.
4. One of the dangers pointed out in connection with L. V. Shcherba's
ideas is the assumption of a "denudation of thought" or "existence of
thought without language" in passing from form to pure meaning and from pure
meaning to form. However, no cognizance was taken of the fact that a
thought need not be registered in a concrete language; it may be registered
in an abstract, artificial language where there is a simple, reciprocal
correspondence between the designator and the thing designated.
5. Machine translation assumes precisely such an abstract language,
namely an intermediary language that must be implicitly present in any
machine program and will apparently be described in the near future. If
cybernetic analogies are adequately grounded, one may assume that the
analogue of such an intermediary language is present in any translation (and,
generally, in any form of logical activity).
6. Machine translation has demonstrated the correctness and need of
a separate approach to the problem of text analysis ("passive" grammar in
L. V. Shcherba's terminology) and to the problem of text synthesis ("active"
grammar).
7. The first problem was effectively solved by purely formal means.
The limits of machine translation depend on a full solution of the second
problem (the compilation of a list of synonyms -- by synonymy we understand
the presence of several units corresponding to a single unit in an abstract
language or, what amounts to the same thing, a single unit of thought -- and
an algorithm for retrieving an equivalent under the given logical conditions).
8. Experience with machine translation has shown that, generally
speaking, an inverse ratio is observable between the "active" and "passive"
grammar of a language: the more complex the "passive" grammar, the simpler
the "active", and vice versa. Hence, for a number of languages emphasis
wholly on passive grammar might considerably alleviate the language curricula
in schools.
9. L. V. Shcherba's ideas on the distinction between active and passive
grammar, as strengthened and enriched by experience with machine translation,
must ultimately find application in foreign language teaching (in secondary
schools as well as in colleges and universities).
10. Secondary schools should make wide use of the methods of passive
grammar, which are not only unusually effective for analyzing an unfamiliar
text, but correspond to the habits of logical thinking developed in
mathematics classes. Moreover, interest in learning the grammar of a
foreign language can be heightened by introducing exercises in translating
sentences "by machine". This would also serve the interests of polytechnical
instruction.
11. The same considerations apply as well to language teaching in
the natural science departments of universities and in the higher technical
institutions where little use of the well developed formal-logical habits
of students has been made up to now in foreign language teaching.
12. Creating a scientific theory of "active grammar" would not only
push forward the frontiers of machine translation, but assist instruction
in language schools where grammar is still taught in undifferentiated
fashion. This is of particular concern to translation departments where
necessity dictated the conversion of a theory of translation into a theory
of active grammar.
11. A GENERAL THEORY OF TRANSLATION IN CONNECTION
WITH MACHINE TRANSLATION
V. Yu. Rosentsveig and I. I. Revzin (Moscow)
1. The possibility of creating a scientific theory of translation is
still being argued by a number of specialists, both linguists and literary
critics. Nor has there been any final answer to the question of whether a
theory of translation concerns scientific linguistics or belongs to the
field of literature.
2. The polysemantic term "translation" also awaits a definition. The
historical paramountcy of artistic translation has resulted in the conceiving
of every translation as an artistic production, as a creative achievement
in the realm of language. Meanwhile, the development of new types
of translation activity, chiefly in the field of scientific and technical
literature, has made another conception of translation urgent, i.e. as a
process of establishing principles of correspondence between the structures
of two languages.
3. Disclosure of the possibility of translating texts by a machine
and development of a theory of machine translation have shown that
distinguishing between the fields of translation makes limitation of both
concepts logically inexorable:
(a) "translation 1" is translation as a form of creative activity
and
(b) "translation 2" is translation as the establishment of strict
correspondences.
Translation as a form of creative activity is an object of study for
theorists of literature. Translation as the establishment of strict
correspondences is an object of study for linguists.
5. A linguistic theory of translation must regard translation
("translation 2") as a special kind of decoding with subsequent encoding into another
system of symbols. The distinctive feature of this transformation of in-
formation is in the irreversibility of the process. The reason is that
simple, reciprocal correspondences between language systems are lacking.
Hence, rules for correspondence in translation are complicated by the need
to formulate a number of restrictive conditions. Determination of these
conditions is a proper object for a linguistic theory of translation. A
general linguistic theory of translation studies ideal types and routines
for matching systems of language symbols; a particular theory of trans-
lation analyzes the correspondences between the two languages. A general
theory of translation is chiefly a deductive discipline, while a particular
theory of translation is inductive.
6. Thus, the methodology of a linguistic theory of translation
comprises:
(a) methods of structural comparative analysis or, in other words,
analysis of the synchronous stages of various languages;
(b) methods of linguistic statistics;
(c) methods of logical semantics, more precisely general semiology.
The very listing of these methods shows the main difference between the
linguistic and literary theories of translation. The latter requires:
(a) a study of the era;
(b) world outlook and creative method of the writer and literary
school;
(c) peculiarities of his individual artistic style.
7. From the semantic point of view "translation 2" is a certain
reflection in itself (a system of elemental meanings is assumed to be invariant).
"Translation 1", from this point of view, is not a reflection in itself, since
pragmatic meaning, which plays a major role in "translation 1", does not
coincide in two languages.
8. Having marked off the object and methods of a linguistic theory
of translation, we can not only ascertain the limits of machine translation,
but also create a well structured, definitive theory of translation,
that is to say a separate, scientific linguistic discipline.
Creation of this discipline can help to perfect methods of training
translators. It will undoubtedly find application in the teaching of foreign
languages as well.
12. SPECTRA OF PHONEMES AND THEIR USE IN
MACHINE TRANSLATION
V. A. Artemov and I. A. Zimnyaya (Moscow)
1. Oral information and translation machines must, among other things,
be accessible to people with varying physical characteristics of speech.
Therefore, their system of signalling must be based on the phonemic in-
variants of sounds or, in other words, on the spectra of phonemes.
2. Three aspects of the spectral analysis of speech sounds must be
distinguished: (1) syntactic (phonologic), (2) semantic (phonetic), and
(3) pragmatic (technical communicative).
3. A syntactic investigation of spectra of phonemes is based on contrasts
within the sound system of a given language. A semantic investigation
relates the spectra of phonemes to word meanings and grammatical forms.
A pragmatic investigation of the spectra of speech sounds originates in and
services practical needs.
4. A syntactic and semantic investigation of spectra of phonemes pro-
vides an exhaustive analysis of their physical properties which form
structures bearing a comparative and systematic character.
5. A pragmatic investigation of spectra of phonemes requires the
determination of their minimal characteristics, which permit of their full
or partial restoration, i.e. it becomes a compression of the spectra of
phonemes. A pragmatic investigation of spectra of phonemes becomes their
compandor, including the compression and expansion of amplitude.
6. The Laboratory of Experimental Phonetics and Speech Psychology
(LEF and PR) [Laboratoriya eksperimental'noi fonetiki i psikhologii rechi]
of the First Moscow State Pedagogical Institute of Foreign Languages
(MGPIIya) [Moskovskii gosudarstvennyi pedagogicheskii institut inostrannykh
yazykov] conducted investigations of the spectra of 5 vocalic phonemes of
the a, o, u, i, e type in the following languages: (1) Russian (V. A. Artemov
and I. A. Zimnyaya), (2) Georgian (T. G. Tsibadse), (3) Armenian (A. M.
Aramyan and A. A. Khachatryan), (4) Lettish (I. A. Zimnyaya), (5) Albanian
(I. A. Zimnyaya), (6) Bulgarian (I. A. Zimnyaya), (7) Czech (I. A. Zimnyaya),
(8) German (L. P. Blokhina and I. A. Zimnyaya), (9) French (K. K. Barashnikova
and V. S. Sokolova), (10) English (I. A. Zimnyaya). In addition, data on
English were drawn from the works of Paget, Green and Potter, Petterson,
and Kopp for purposes of comparison with the studies of the LEF and PR.
7. All the material was recorded with a basic tone of 120-150 cycles
per second at a level of 65-70 db. The pronunciation of each speaker was
representative of the literary speech of the various languages.
8. A comparison of the quantitative and graphic data shows that the
following pragmatic rules are observable within each language:
(a) The a-type vowel is characterized by a wide formant region
(600-1200 cycles) with gradually increasing intensity of
the components in the direction of high frequencies
(1200-2500 cycles).
(b) The o-type vowel is characterized by a central formant region
somewhat shifted down to 400-1000 cycles per second.
(c) The u-type vowel is characterized by a somewhat narrower
central formant region shifted still further toward the low
frequencies of 300-800 cycles per second with a maximal
elevation of amplitude in the range of 300-350 cycles per
second.
(d) The i-type vowel is characterized by two main formant regions.
The first is in the range of lower frequencies and almost
coincides with the range of maximal intensification in the
main formant of the u-type vowel, as Paget has pointed out.
But a gentle falling-off is observed in amplitude of the
u-type, and a steep falling-off in the i-type.
(e) The e-type vowel is distinguished from the i-type by the
formants shifted more to the center. The broader the e,
the closer the formants come together.
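The numerical regions in items (a) to (c) can be tabulated to show how far they overlap. The sketch below is only an illustration: the lookup table copies the figures given above, while the function name `candidate_vowels` and the idea of testing a single peak frequency against the regions are assumptions introduced for the example. The i- and e-type vowels are left out because the abstract gives no figures for them.

```python
# Central formant regions in cycles per second, copied from items (a)-(c).
# The i- and e-type vowels are omitted: the text states no numeric ranges.
FORMANT_REGIONS = {
    "a": (600, 1200),
    "o": (400, 1000),
    "u": (300, 800),
}


def candidate_vowels(peak_hz):
    """Return the vowel types whose stated region contains peak_hz."""
    return sorted(
        vowel
        for vowel, (low, high) in FORMANT_REGIONS.items()
        if low <= peak_hz <= high
    )


for peak in (350, 700, 1100):
    print(peak, candidate_vowels(peak))
```

A peak at 700 cycles falls inside all three regions, so a single frequency cannot separate the vowel types.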
9. The above-mentioned acoustical properties of the vowels completely
correspond to the position and operation of the resonance chambers of the
vocal apparatus, as stated in several reports of the LEF and PR as well as
by Paget and Yakobson.
10. These studies indicate that the spectra of vowels on the syntactic
and semantic plane have a structural character. V. A. Artemov suggested a
means of determining these structures. It consists of separating from the
vowel spectrum all the areas of relative intensification and establishing
correlations between them, taking the lowest of them as 1.
11. At the same time a comparison of the spectra of the 5 types of
vowels studied indicates that a structural correlation between the areas
of intensification is retained within definite limits in the languages in-
vestigated. In this connection it is possible to speak about a certain
structural and comparative invariant of these types of vowel spectra in the
various languages, which is essential for signalling technique in trans-
lation machines.
13. AN OBJECTIVE INVESTIGATION OF MEANING ASSOCIATIONS
O. S. Vinogradova and A. R. Luriya (Moscow)
1. An objective investigation of the association of meanings that are
aroused in man by some word or other is a basic necessity for psychology as
well as for linguistics.
Despite the considerable progress achieved by modern linguistics,
information theory, and psychological investigation of the development of the
meaning of words in children, objective research techniques both of potential
associations aroused by words and of the dynamics of these associations still
remain to be worked out.
2. The use of different variations of the conditioned reflex method
may play a vital role in elaborating objective ways of investigating
meaning associations. By combining the showing of a word with some kind
of involuntary reflex response (vascular, cutaneous-galvanic, etc. reaction)
and then showing other words, the investigator is in a position to establish
objectively which group of words shown elicits similar reactions and is
consequently, to some extent, the equivalent of a previously shown word, and
at the same time he is in a position to trace both the structure and the
dynamics of these associations.
3. The report discusses the results obtained from an objective
investigation of the system of associations by registering the specific
and non-specific conditions of vascular reactions. Conclusions are drawn
concerning certain factors that may determine the structure and dynamics
of these associations in normal and abnormal experimental subjects.
14. THE TREATMENT OF CERTAIN CONCEPTS IN STRUCTURALISM
V. T. Grigor'yev (Moscow)
1. Interest has grown of late in the methods and concepts of the
structuralist approach in linguistics due to the development of machine
translation and other branches of applied linguistics. However, recent
articles have treated certain structuralist concepts in an excessively
one-sided manner and, in essence, incorrectly.
2. Phonemes are treated as though they were connecting elements lacking
physical reality. The physical character of the differential signs of
phonemes is denied. Real speech sounds are represented as something
external with respect to language. Meanings, which are also removed from
language, receive the same treatment. This method of handling speech sounds
and meanings reflects only the views of L. Yel'mslev's group and is not to be
ascribed to structuralism in general.
3. Actually, the structuralist method of investigating speech sounds
takes into account their acoustic and articulative properties. The func-
tional criterion used by the structuralists in phonetics makes it possible
to isolate from the entire diverse mass of phonetic material the physical
(acoustic and articulative) properties that carry the functional load and,
consequently, are of prime importance to the linguist. The functional
criterion ensures a differentiated (from the viewpoint of language structure)
approach to the varied and changing properties of phonetic material. Using
the functional criterion, linguists may be very helpful to engineers in
solving practical problems confronting the several branches of engineering;
contrariwise, orientation on pure relationship elements would prevent the
linguists from solving practical problems and do away with the possibility
of cooperation between them and the engineers.
4. The attitude of the structuralists toward meaning was determined
by their interest in working out an objective method of investigating
language. The striving to escape from the inadequacies of traditional
linguistics led the structuralists to refuse in general to consider meaning
as a solid criterion of linguistic form. However, this refusal to take
account of meaning in research methodology does not determine the
structuralists' theoretical treatment of meaning. In many cases it exists
harmoniously side by side with the acknowledgment of meaning as a basic element
in the functioning of language. It must be admitted, however, that rejection
of the semantic criterion imposed severe limitations on this school of
linguistics. In practice, the field of semantics remained outside structural
analysis.
5. The meaning of a word is the linguistic form of expressing an idea.
Meaning cannot be separated from language simply because it does not exist
prior to or apart from language. At the same time, meaning is a basic factor
of language, determining its structure. It is important for the further
development of applied linguistics that objective methods of semantic analysis
be worked out. Naturally, in solving this problem full use will have to be
made of the experience gained by the representatives of structuralism in
their objective investigations of language.
6. A critical exploitation of the experience of structuralism is
scientifically advisable. It is an indispensable preliminary stage in the
task of introducing mathematical research methods into linguistics.
15. THE SIGNIFICANCE OF FREQUENCY AS A FACTOR IN
D I OFD` m
V. M. Grigoryan (Yerevan)
1. A comparative study of modern Russian-Russian dictionaries reveals
contradictory data. Thus, in various dictionaries (e.g. monotypic 4-volume
works) one and the same word may be defined in different ways from the
viewpoint of the language's limitations with respect to stylistic usage,
and it is often possible to find inconsistencies in the order in which
meanings are arranged. These (and other) contradictions make things
difficult for the reader who seeks information in order to determine the
operative norm for a given linguistic fact.
2. Since the norms, as a rule, are correlated with the factor of
frequency, statistical data are extremely essential in many cases, if
maximal precision is to be attained. Some considerations supported by
Russian language data (with due regard for strict synchronousness) are
cited by way of illustrating this contention.
3. The plan proposed by us is not original: it agrees in principle
with that employed in several frequency dictionaries published abroad
(Harry H. Josselson, The Russian Word Count, Detroit, 1953;
Victor Garcia Hoz, Vocabulario usual, vocabulario comun y vocabulario
fundamental, Madrid, 1953).
They are usually constructed on the basis of the familiar correlation
between style and genre. Adapting this plan on the whole, we propose to
set up 4 categories: (1) [illegible], (2) speech in dialogues, (3) speech in
monologues, using material from fiction, (4) non-fiction literature:
newspapers, documents, etc. It is obvious that statistical data reflecting
the frequency of usage of a specific word in each of the 4 categories must
be selected on the basis of equal conditions. Clearly, these equal
conditions will be ensured if the frequency of a given word is derived from
an equal number of words in all 4 categories. If we designate the
categories by a, b, c, and d, respectively, the total preliminary number of
words in category a must be equal to the total preliminary number of words
in category b, etc. This word total, it seems to us, can be advantageously
determined by using the Hoz method. In addition, selections must be made
from purely random material (but within the given category); the more varied
the material, the more accurate will be the information.
The resultant data can be used to determine stylistic functions.
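The "equal conditions" requirement can be illustrated with a short sketch. It assumes that each category is available as a list of running words and that the common word total is simply passed in as `sample_size`; how that total should be chosen is not reproduced here. The function name and the word-by-word random sampling are likewise assumptions made for the example.

```python
import random
from collections import Counter


def category_frequencies(corpora, word, sample_size, seed=0):
    """Count `word` in an equal-sized random sample from every category.

    corpora maps a category label (e.g. "a", "b", "c", "d") to a list of
    running words; sample_size is the common word total (an assumption).
    """
    rng = random.Random(seed)
    counts = {}
    for label, tokens in corpora.items():
        if len(tokens) < sample_size:
            raise ValueError(f"category {label!r} has fewer than {sample_size} words")
        # Random selection within the category, same size for all categories.
        sample = rng.sample(tokens, sample_size)
        counts[label] = Counter(sample)[word]
    return counts


# Toy data, invented for illustration only.
corpora = {
    "a": ["dom"] * 3 + ["i"] * 7,
    "b": ["dom"] * 1 + ["i"] * 9,
}
print(category_frequencies(corpora, "dom", sample_size=10))
```

Since every category contributes the same number of running words, the raw counts can be compared directly across categories without further normalization.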
16. EXPERIMENT TO DEFINE THE CONCEPT OF
GRAMMATICAL CATEGORY
R. L. Dobrushin (Moscow)
A given finite number of words is examined. A finite, ordered
aggregate of words is called a sentence. The division of all sentences into
two non-crossing classes is assumed to be given: a class of grammatically
valid sentences and a class of grammatically invalid sentences. Word a is
called subordinate to word b, if a valid sentence containing word a remains
valid after a is replaced by b. Two words a and b are called equivalent,
if a is subordinate to b and b is subordinate to a. All words are divided
into non-crossing classes of words equivalent to one another. Class
A is subordinate to class B, if all words entered into class A are
subordinate to words entered into class B. The system of classes and
subordinations thus obtained -- called the basic grammatical structure of
the language -- is examined. The result is a definition of the concept
of grammatical category.
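These definitions lend themselves to a direct computational sketch. The code below is an illustration under stated assumptions rather than the author's procedure: sentences are represented as tuples of words, replacement is applied to one occurrence at a time, and the five-word vocabulary with its four valid sentences is invented for the example.

```python
def subordinate(a, b, valid):
    """True if every valid sentence stays valid when an occurrence
    of word a is replaced by word b (one occurrence at a time)."""
    for sentence in valid:
        for i, word in enumerate(sentence):
            if word == a:
                replaced = sentence[:i] + (b,) + sentence[i + 1:]
                if replaced not in valid:
                    return False
    return True


def equivalence_classes(words, valid):
    """Group words that are subordinate to one another in both directions."""
    classes = []
    for word in words:
        for cls in classes:
            representative = cls[0]
            if subordinate(word, representative, valid) and subordinate(
                representative, word, valid
            ):
                cls.append(word)
                break
        else:
            classes.append([word])
    return classes


# Toy language, invented for illustration only.
WORDS = ["cat", "dog", "sleeps", "runs", "the"]
VALID = {
    ("the", "cat", "sleeps"),
    ("the", "dog", "sleeps"),
    ("the", "cat", "runs"),
    ("the", "dog", "runs"),
}
print(equivalence_classes(WORDS, VALID))
```

In this toy language the classes that emerge correspond to what would ordinarily be called nouns, verbs, and the article, although the procedure never refers to meaning. Adding a sentence such as `("the", "dog", "barks")` to `VALID` would leave "cat" subordinate to "dog" but not the reverse, which shows that subordination need not be symmetric.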
17. THE THEORY OF PROBABILITY AND DETERMINATION
OF LINGUISTIC RELATIONSHIP
A. B. Dolgopol'skii (Moscow)
The proposed method of determining the relationship of language families
by applying the theory of probability is, in broad outline, as follows:
1. On the basis of linguistic experience, those semantic points
are isolated in which maximum historical stability of morphemes (without
borrowing) is observed.
2. A determination is made in each group of languages under
consideration as to which morphemes possessing a given meaning may with greater
probability be regarded as the older. The usual techniques of comparative
historical research as well as the method of internal reconstruction are
used for this purpose.
3. We cannot speak about phonetic correspondences between language
families being compared before the fact of relationship has been established.
Hence, at this initial stage of investigation we must rely wholly on phonetic
resemblances. More precisely, we rely here on the following probability
correlations. Following a comparison of cognate languages, it appears that the
n-sound is the most probable of all the sounds in any single related language
that correspond etymologically to the n-sound of another related language.
The same may be said of the m-sound. But, possibly, not of the s-sound.
At any rate, among all the sounds that correspond in one language to the s,
š-sounds in another related language, the most probable, apparently, are
the sounds of the same s, š-group. This would also seem to be true of the
l, r-group, the b, p, f-group, the t, d-group, the k, g, k, h-group, etc.
In this connection, we perhaps cannot say anything about vowels or laryngeals.
Starting with these probability considerations, we may be able (leaving
aside the vowels and laryngeals) to base our subsequent discussions on the
data of consonant coincidences between various morphemes in the different
languages. We will term "consonant coincidence" the correspondence between
consonants that remain within one of the above mentioned groups. These groups
must be chosen in such a way that phonetic shifts of these sounds are no
more probable than retention of the sound (retention within the group).
The groups cited here are obviously only for illustrative purposes. Actually,
comparative-historical phonetic materials from all possible language families
must be used to establish the most probable sound correspondences (one of
these correspondences is the most probable for several sounds -- the
correspondence of a sound to itself). As a result, we may select, let us say,
10 or 7 different sound units which will constitute the material for
phonetic comparisons.
4. Comparing the equivalent morphemes in the different families,
we note the phonetic coincidences (cf. para. 3). We then use appropriate
formulas from the theory of probability to measure the probability of the
accidental coincidence of a certain number of morphemes in so many languages
from so many comparable items, taking into account the number of old
synonymous morphemes for each semantic point of each language group as well
as the total number of consonants distinguished during the comparison.
(Cf. para. 2).
If the probability of accidental coincidence proves to be quite low,
it will be a significant argument in favor of the relationship of the languages
in question.
Use of the theory of probability will enable us to test the evidence
from comparisons between the various languages cited in numerous works
dealing with the problem of language family relationship (e.g. Trombetti,
Winkler, etc.).
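The abstract does not state which formulas are meant. As one possible reading, the sketch below treats each semantic point as an independent trial and uses a binomial tail to estimate how likely a given number of coincidences would be by chance alone. The independence assumption, the assumption that all sound groups are equally likely, and both function names are introduced for this illustration and are not the author's.

```python
from math import comb


def match_probability(n_groups, n_consonants=1, n_synonyms=1):
    """Chance that one semantic point shows a coincidence by accident.

    Assumes n_groups equally likely sound groups, n_consonants compared
    per morpheme, and n_synonyms old synonymous morphemes to choose from.
    """
    single = (1 / n_groups) ** n_consonants
    return 1 - (1 - single) ** n_synonyms


def chance_coincidence_probability(n_items, n_matches, p_match):
    """Probability of at least n_matches coincidences among n_items
    independent semantic points, each matching with probability p_match."""
    return sum(
        comb(n_items, k) * p_match**k * (1 - p_match) ** (n_items - k)
        for k in range(n_matches, n_items + 1)
    )


# Hypothetical figures: 10 sound groups, 15 semantic points, 5 coincidences.
p = match_probability(n_groups=10)
print(chance_coincidence_probability(n_items=15, n_matches=5, p_match=p))
```

With these hypothetical figures the tail probability is a little over one percent, the kind of low value that the abstract would count as an argument for relationship. Allowing more synonyms per semantic point raises `p_match` and weakens the evidence accordingly.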
A. A. Zinov'yev (Moscow)
1. The process of translating from one language into another may be
described as a language consisting exclusively of definitions. Breakdown
of the language into elements is here assumed to be effected. It is
possible to model the formal apparatus of definitions, one may suppose,
by means of a special device. Having determined all possible definition-type
relations at least between a selected part of the elements of one
language and a selected part of the elements of another language, we can
use the modelling device to produce in standard form at least partial
translation (if only as an initial approximation).
2. A general theory of definitions is constructed as part of a
theory of symbols. Several variants are possible, depending on the original
concepts in the statement and on the formal apparatus for constructing the
theory. The suggested variant is characterized by the initial concept
"Choice", [illegible] "Term" and "Definition". The formal [illegible] is
constructed on the basis of the [illegible] ("Each") and ("One and only
one of") [illegible].
3. A general theory of definitions may contain proof of rules for
definitions, elicitation of the conditions governing their use, rules for
deduction and their interconnections.
19. LINGUISTIC PROBLEMS CONNECTED WITH POETRY TRANSLATION
V. V. Ivanov (Moscow)
1. The distinction between the poetic model of a text and this text
may serve as a convenient starting point in solving the problem of poetry
translation. Translation makes it possible to recreate the same poetic
model by means of another language while retaining the relation between
the model and the text. On the other hand, the direct conversion of a
poetic text in one language into a poetic text in another is impossible.
2. The amount of information contained in a text is determined by
the extent of deviation of this text from the statistical norms of ordinary
language and from the statistical norms of the poetic language of a given
era. A violation of the statistical norms of ordinary language may become
the norm of poetic language, which results in decreasing the amount of
information contained in poetic texts. Poetry translation assumes the
transmission of the statistical characteristics of a text in conformity
with the language into which the translation is made.
3. The sound structure of verse is determined by the phonological
structure of the language, as was first pointed out by R. Yakobson. It
follows from this that transmission of the phonetic characteristics of
the text structure is possible only when the corresponding elements in the
phonological systems of the two languages coincide. The non-translatability
of a poetic text is to a very large degree determined by the fact that in
poetic language the plane of content is functionally connected with the
plane of expression; insofar as the plane of expression is in principle
untranslatable, the plane of content appears partially untranslatable. This
limitation may also apply to the poetic model of the text, if (as with
Khlebnikov) units from the plane of expression are included in this model.
4. Phonetic coincidences of parts of words are used to organize a
poetic text chiefly in cases where they are superfluous from the
morphological point of view. Conversely, morphologically essential phonetic
resemblances contain the least amount of information from the viewpoint of
poetic organization of speech (cf. the problem of verbal rhythm in Russian
poetry). Consequently, the possibility of transmitting phonetic repeats
[POVTOROV] depends not only on the phonological, but also on the
morphological resemblances between the two languages in question.
5. The predominance of syntagmatic connections between words over
paradigmatic connections is a peculiarity of poetic text on the plane of
content. We may see in this the result of transforming language text in
accordance with a poetic model (the relation between "emotion and text"
[PORYV I TEKST], to use O. Mandel'shtam's term). This transformation
can be effected not only in the original but also in the translation.
6. A line-by-line analysis of text does not yield satisfactory
results because it reveals little about the structure of long
passages which are real units of [illegible] (cf. the definition of
a period as "wave length" in Milton's verse, as suggested by T. S. Eliot).
If a line-by-line translation approach is possible, then for a translation
based on a poetic model of the text it is considerably more important
to translate the major rhythmic and syntactic units into which
the work is divided; as an example, [illegible] translation of VYKHOZHU ODIN
YA NA DOROGU [I go out alone onto the road] is analyzed. The continuity of
an invariant [illegible] and its ability, in principle, to be formalized
[illegible] the possibility of automatic translation of
poetry by means of computers.
20. GÖDEL'S THEOREM AND LINGUISTIC PARADOXES
V. V. Ivanov (Moscow)
1. The resemblance between mathematics and linguistics also applies
to the trends of these sciences as they develop in the 20th century. The
theoretical foundations of the sciences are being investigated in
anticipation of practical applications; the results of these investigations
will eventually prove vital for practical purposes.
2. Gödel's theorem, according to which the non-contradictoriness of
a theory cannot be demonstrated within the formalized theory itself, may be
extended to linguistic theories by means of the generalization of the
theorem, which comes down to an affirmation of the incompleteness of any
system of symbols (including language). However, it would be essential
not to restrict oneself to this formulation in investigating the
foundations of structural linguistics, but to examine the conclusions
resulting from a linguistic analogue of Gödel's theorem.
3. The most severely formalized theories of language that examine
"constructional linguistic objects" were developed within the framework of
distributive analysis, which assumes the possibility of describing the
elements of a language on the basis of their distribution. It is not
difficult to show that the logical application of this principle leads to
linguistic paradoxes (e.g. in the distributive separation of phonemes,
word classes, meanings of polysemantic words, etc.). The distribution of
elements turns out to be impossible if these elements were not given
previously. But the axiomatic introduction of language elements contradicts
not only the principles of distributive investigation, but also the
requirements of automatic analysis of written and oral speech. The axiomatic
introduction of a class of regular sentences appears to be unsatisfactory
for purposes of synthesis.
4. For the reasons given above, it is possible at the present time
to fashion a formal theory, which can be used to construct a program of
automatic analysis, only for a maximally simplified approximation to a
real language. We have in mind cases with simple correspondence between
form and substance: on the plane of expression, for a system of standard,
typical variants of phonemes; on the plane of content, for a standard
language of science. The absence of paradoxes when these cases are analyzed
does not, however, permit extending the results obtained to ordinary
language, the metalanguage for which (unlike the cases mentioned) cannot
be formalized (this applies both to the phonological and to the semantic
metalanguage). Automatic analysis of real language requires the employment
of linguistic methods other than those considered above and the use of
self-teaching type machines (with probability elements).
21. METHODS OF BREAKING DOWN A SYNTACTIC WHOLE
L. I. Iliya (Moscow)
1. Linguists representing the most different schools use as a starting
point in their methods of analysis the possibility -- objectively existing
in any language -- of isolating a certain "whole" as a maximal unit that
can be broken down into similar segments, i.e. comparable in any respect
whatsoever. This "whole", which has been variously called "utterance",
"sentence", or "clause", belongs simultaneously to all the planes or "levels"
of a language -- phonological, grammatical, and semantic -- and is
characterized by the fact that its borders coincide in all three planes,
which makes this segment a maximally complete or basic unit for any
decomposition.
2. The breakdown of a "whole", due to its complexity of structure,
is done on the basis of criteria that differ for each plane. As a result,
it yields segments the boundaries of which do not always coincide or, as
they say, are not "commensurable". Semantic decomposition is to a certain
extent independent of the grammatical, and it fails to establish a fixed
correlation between the boundaries created by rhythmic-intonation
decomposition and the boundaries of morphemes, words, and groups of words.
3. Modern linguists have attempted to eliminate the incommensurability
of the planes by seeking a single principle common to all stages of analysis.
However, unity of principle is achieved in some theories by ignoring some
aspect of language structure (e.g., meaning is excluded in Harris' method
and in the rhythmic-intonation decomposition of Trager and Smith, while
grammatical structure is ignored in Shcherba's intonation-semantic
decomposition). Orderliness of method is attained at the price of
simplifying linguistic analysis, which therefore cannot be regarded as
adequate for research in all its complexities. However, new methods of
analysis focussing on formal criteria have been used to study them deeply,
and modern techniques of measuring such language units as phonemes,
morphemes, and words have reached a high degree of precision.
4. The task of linguistic analysis is not only to isolate the basic
linguistic units, but to determine the relations between the units, that
[illegible] all of semantic relations. The contemporary school of linguistics
acknowledges as "structural", i.e. as relations with which linguistic
analysis deals, only those relations to which definite forms of expression,
"signals", correspond.
Two main trends in the investigation of syntactic relations can be
discerned at the present time: (a) the comparatively recent theory of
"direct constituents" (Bloomfield, Pike, Wells), which bases sentence
decomposition on the relations of a logical hierarchy of subordination that
links all the parts of a sentence into a single whole, and (b) the theory
which may be provisionally called the theory of "members of the sentence".
It has a long tradition and many opponents, but finds support among the
major representatives of contemporary linguistics (Kurilovich, Bazel in
part, Diederichsen). The theory considers the sentence a whole, the parts
of which are linked together by functional relations.
5. The direct constituent method, which is based on a single type of
relationship--the heterogeneity of functions of the constituents--leaves
the general problem of determination of syntactic relations open and
investigates for the most part the combinability of constituents and typical
patterns.
On the other hand, for the theory of "members of the sentence" the
problem of syntactic relations is fundamental. Formerly, these relations
were all too frequently distinguished purely on the basis of meaning, not
of formal criteria, although the inclusion of such criteria in the
principle is desirable and feasible (Friese, Togeby). The study of basic
syntactic relations requires for its own continuing development that all
modern methods of linguistic analysis be utilized, particularly the
technique of distributive analysis.
The "direct constituents" and "members of the sentence" methods do
not exclude one another. Rather, they are complementary, as they permit
the sentence to be studied in various respects.
22. THE LOGICAL NATURE OF CONTEXT
G. V. Kolshanskii (Moscow)
1. The term "context", given the polysemy of language forms, may be
defined from the linguistic point of view as a combination of conditions
determining the simple, concrete identification of any linguistic phenomenon
(lexical, grammatical, etc.). By "simple" we are to understand the display
of only one of the many possible properties of the form under the given
conditions (e.g. one meaning of a word, one word order, one intonation, etc.).
In this report we are considering cases of determination of meaning in
polysemantic words regardless of the method of origination (metaphor,
metonymy, homonymy, etc.).
2. Contextual conditions may be found within the language itself,
but they may also comprise indications lying outside language material.
Among the language conditions it is necessary to distinguish between
indications included within a single sentence and textual indications.
Among the external conditions it is necessary to differentiate between
situation, object, and graphic indications.
3. The combination of possible conditions called context may be realized
while the precise meaning of a sentence is being formulated in language only
through a definite, active, logical process since indications by them-
selves are inert and can influence the meaning of a linguistic form only
as a starting point in the functional process of achieving a result that
makes sense.
Since the method of search by context is effective in the semantic
area of language, it is in essence a speculative, logical process of
reasoning about the meanings of language forms.
This rational search for the essential and uniquely correct (in the
ideal approach to a solution of this problem) result is a process of
constructing a syllogism or chains of syllogisms where the answer needed to
establish the true meaning of the word and sentence is the final deduction.
4. A syllogism is constructed by searching for the appropriate
premise of a universal hypothetical syllogism (if ... then) or by
interpreting a hypothetical-disjunctive judgment, the complexity of which
depends on the character of the indication underlying the premise.
While searching for the unknown meaning through external
extralinguistic indications, a syllogism is formed in accordance with the
nearest indication contained in the context (e.g. determination of the
meaning of "table" as a piece of furniture in the sentence "He has a good
table" is based on situation: the meaning of "table" may be either A or B;
here it is not B; therefore it is A, i.e. a piece of furniture).
5. Mention of the subject is sufficient for the major premise in order
to determine the meaning through objective context (e.g. determination of
the meaning of the word "solution" in chemistry and electrical engineering
is made in similar fashion). The form of writing in a written text may
also serve as a starting point for a syllogism about the true meaning of
a word (e.g. a foreign spelling).
6. The method of searching for the determining factor through lexical
environment is the most familiar way of determining the meaning of
linguistic forms. The premise is based on the immediately adjacent word
(starting of a Sputnik, starting of a motor) and a word standing in any
position in the appropriate group (an effective operation to destroy ... a
hostile garrison, vermin, tumors, etc., where all the semantic variants
are [illegible] to members in the major premise of a
hypothetical-disjunctive syllogism).
7. If there are insufficient indications within the sentence, the true
meaning of words is sought by forming several syllogisms to search for the
premise of the last conclusion, on the basis of the entire paragraph or text
(e.g. "We did not allow our house to [illegible]" receives the following
logical interpretation: If it is not a question here of a [illegible] house
and family, then the word "house" must be understood to mean [illegible].
After examination of the text, the first [illegible] are set aside and the
meaning "company" [illegible] according to the rule for a disjunctive
syllogism. [illegible] "All the wheels are standing still" [illegible]
traffic [illegible]).
8. The process of ascertaining the true meaning is logically
carried out by a hypothetical-disjunctive syllogism, but depending on the
nature of the desired result the conclusion may be reached either by
eliminating parts of the disjunctive judgment (given the possibility of
complete enumeration of all the meanings of a word) or by first forming a
disjunctive judgment (meaning: the word A may be either ... or ...). It should
also be kept in mind that each operation is subject to rechecking.
9. Due to the fact that analysis of context is essentially a rational
logical process, it can in principle be theoretically formulated as an
ordinary logical operation and be performed by a machine. The feasibility
and advisability of any arrangement in connection with machine translation
is the decisive factor in a given case.
For simple formulas in a context the formalized operation to search
for the necessary meaning may be worked out by introducing a simple
quantor (a thematic quantor). When the meaning of a word is being
interpreted on the basis of immediate environment, a virtually disjunctive
premise may be set up, obviously consisting of up to 3 words occurring
before and after a polysemantic word, provided, however, that preliminary
linguistic analysis determines all the cases where the meaning of the
given word depends on words capable of being associated with it. At the
present stage this work can be performed only for a limited group of words
in certain texts.
10. If the context extends beyond a sentence, the [illegible] of a
disjunctive syllogism is practically impossible, since one cannot formally
mention the indications on the basis of which the parts even of a fully set
up disjunctive judgment will be eliminated.
Likewise formally insoluble is the problem of ascertaining the limits
of the operation to analyze context (both within a sentence and, much more
so, outside it). This is the realm of active, creative thought. Thus, a
complete, practical solution to the problem of mechanical determination
of word meaning by context is excluded. Formalization of the rules for
interpretation of context in machine translation clearly requires the
application of statistical methods for probability determination of the
contextual meaning of the words.
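
The elimination procedure described in theses 4, 8 and 9 above (a disjunctive judgment over the meanings of a word, reduced by indications found within up to 3 words on either side) can be sketched roughly as follows. This is only an illustration under stated assumptions: the sense table for "operation", the cue words, and the function names are hypothetical and do not come from the report, and a real system would need the preliminary linguistic analysis the author calls for.

```python
from typing import Dict, List, Set

WINDOW = 3  # words inspected on each side, following thesis 9


def context_window(tokens: List[str], index: int, size: int = WINDOW) -> Set[str]:
    """Return the words occurring up to `size` positions before and after `index`."""
    left = tokens[max(0, index - size):index]
    right = tokens[index + 1:index + 1 + size]
    return set(left) | set(right)


def eliminate(tokens: List[str], index: int,
              excluded_by: Dict[str, Set[str]]) -> List[str]:
    """Disjunctive syllogism: start from 'sense A or B or C' and drop every
    sense that some word in the window rules out. Returns the survivors."""
    window = context_window(tokens, index)
    return sorted(s for s, cues in excluded_by.items() if not (cues & window))


# Hypothetical sense table (illustrative only): each sense of "operation"
# lists the neighbouring words that would exclude it.
OPERATION = {
    "military": {"tumors", "vermin"},
    "surgical": {"garrison", "vermin"},
    "extermination": {"garrison", "tumors"},
}

tokens = "an effective operation to destroy tumors".split()
print(eliminate(tokens, tokens.index("operation"), OPERATION))
```

When the deciding word lies more than 3 positions away, no sense is eliminated and the function returns the full disjunction, which mirrors the limitation the abstract points out for contexts that extend beyond the immediate environment.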
23. LINGUISTIC STATISTICS FROM RUSSIAN TEXTS
R. G. Kotov (Moscow)
1. The development of machine translation and the application of
methods of analysis and synthesis to communications technology have created
sound conditions for expanding cooperation between linguists and engineers.
In this connection there has arisen a need to introduce into linguistics
objective research methods permitting mathematical handling of the data.
Linguistic statistics, which operates with quantitative values, offers wide
possibilities for linguistic research. Linguistic statistical data are
used to solve a number of problems in machine translation and communications
technology. In addition, they may be successfully exploited for
lexicographical purposes and for foreign language teaching.
2. The current statistical investigation of Russian language texts
aims at preparing preliminary data in connection with constructing the
program of lexical coding of telegraphic messages. The work was first done
by hand on specimens of texts containing a total of 20,000 words. Methods
of analysis were determined by the existing possibilities and research
goals. The texts to be analyzed were entered in order on index cards in
the form of two-member word combinations, which made various types of
calculation possible.
It is proposed to use in the future machine methods for several
tabulations, e.g. word frequency.
3. The treatment of the material has yielded thus far a frequency
glossary, glossary of stable word combinations, and data on the frequency
of endings. Some principles governing the statistical distribution of
the glossary for the texts examined are elucidated on this basis.
4. Redundancy (superfluousness) in Russian texts of the type investigated
is being determined by taking cognizance of probability correlations in the
glossary. A theoretical limit to the savings expected from lexical coding
is being ascertained. Lexical coding is regarded here as a particular case
of decorrelation [DEKORRELATSII] of messages by consolidation [UKRUPNENIYE].
5. Work is going on to: elucidate the main types of two-member word
combinations the sequence of which makes up a text, ascertain the conditional
probabilities of endings, and eliminate the uncertainty of choice of
grammatical form in relation to the preceding word. Data obtained on the
material of two-member word combinations are assumed to apply to
multi-member word combinations and to the sentence as a whole.
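
The tabulations named in theses 2 to 5 above (a frequency glossary, counts of two-member word combinations, and the probability of an ending given the preceding word) could be approximated by machine along the following lines. This is a minimal sketch, not the method used in the investigation: the sample tokens are invented, and treating the last two letters of a word as its "ending" is a crude assumption standing in for real morphological analysis.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List


def frequency_glossary(tokens: List[str]) -> Counter:
    """Count how often each word occurs."""
    return Counter(tokens)


def two_member_combinations(tokens: List[str]) -> Counter:
    """Count adjacent word pairs, the 'two-member word combinations'."""
    return Counter(zip(tokens, tokens[1:]))


def ending(word: str, length: int = 2) -> str:
    """Crude stand-in for a grammatical ending: the last `length` letters."""
    return word[-length:]


def ending_probabilities(tokens: List[str]) -> Dict[str, Dict[str, float]]:
    """Estimate P(ending of next word | preceding word) from adjacent pairs."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][ending(nxt)] += 1
    return {prev: {e: n / sum(c.values()) for e, n in c.items()}
            for prev, c in counts.items()}


def entropy(dist: Dict[str, float]) -> float:
    """Shannon entropy in bits: the uncertainty of choice within `dist`."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


# Invented telegraphic sample in transliteration (illustrative only).
tokens = "srochno vyshlite dengi srochno vyshlite dokumenty".split()
probs = ending_probabilities(tokens)
print(frequency_glossary(tokens).most_common(2))
print(entropy(probs["vyshlite"]))
```

An entropy of zero after a given word means the ending that follows is fully predictable from that word in the sample, which is the kind of probability correlation that thesis 4 relies on when estimating the savings from lexical coding.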
24. A METHOD OF DEFINING GRAMMATICAL CONCEPTS
O. S. Kulagina (Moscow)
1. Inconvenience of existing grammatical systems for machine trans-
lation and need to elaborate precise definitions of concepts.
2. Initial base of undefined concepts: word, sentence and marked
[OTMECHENNAYA] sentence, environment.
3. Breakdown of sets of words into subsets [PODMNOZHESTVA],
consolidation of breakdowns.
4. Concept of B-equivalence, amalgamation of B-equivalent subsets.
Derived breakdown. Theorem concerning the impossibility of secondary
amalgamation by equivalence.
5. Sequence of amalgamation of words: families, classes, types.
Concept of a simple language. Two definitions of type and their equivalence.
6. Determination of configuration, resultant element, ranks of
configurations.
Concept of subordination of configurations.
Determination of relations between elements of configurations.
25. A FORMAL THEORY OF THE SENTENCE
I. I. Revzin (Moscow)
1. More than 200 different definitions of "sentence" make, on the
one hand, a deductive development of syntax impossible and show, on the
other, that the approach to the problem of defining basic linguistic units
requires greater precision.
2. Any definition of a language element is a metalinguistic expression
(explicit or implicit). "Sentence" as a language word is, by its nature,
different from "sentence" as a metalanguage word. Therefore, the aim to
include in the definition everything that is intuitively understood when
the sound complex "sentence" is pronounced is scarcely realizable. A term
in linguistics, like an expression in metalinguistics, may reflect only
certain intrinsic features corresponding to the usage of the word.
3. An analysis of existing definitions of "sentence" makes it
possible to divide them into two unequal groups. The overwhelming number
of definitions are connected with the purpose of the sentence, i.e. they
include mention of the fact that "sentence" is a language unit serving to
convey a "more or less complete thought". Only a few definitions are
based on particularly formal criteria.
4. The defect of "sense" definitions lies chiefly in the fact that
they violate the principle of homogeneity: they depart from the sphere of
language as a system and assume or sanction the dissolving of an object of
linguistics in an object of logic or psychology. Moreover, phrases like
"more or less complete thought" and even simply "complete thought" are not
defined more or less strictly in logic itself. The linguists are thereby
doomed to waiting passively for the progress of logical semantics, which,
it is easy to demonstrate, cannot itself develop without greater precision
of linguistic concepts.
5. The defect of existing "formal" definitions as compared with
"sense" definitions is that they lack the idea of syntactic coherence
(according to Aidukevich), i.e. what is most important in this unit of
language for a linguist.
Sentence "coherence" is reflected, as a rule, in the "sense" definitions,
but it is reflected functionally through the coherence of the judgment.
6. The formal definitions of "sentence" [PREDLOZHENIYA] coincide in
substance with the definitions of "sentence" [FRAZY]. Meanwhile, the
linguist is acutely aware of the need to distinguish between the two
concepts.
7. The theory-of-numbers conception of language created by Soviet
mathematicians is a completely explicit metalanguage of linguistics in
which the basic linguistic categories may be rigorously defined.
8. In particular, FRAZA [sentence], i.e. the ordered succession of
smaller units, is taken as the original, undefined concept (the aggregate
of meaningful or correctly constructed sentences in a certain language is
considered given).
9. Introduction of the concept of configuration, strictly defined in
metalinguistic terms, makes it possible to describe a relation of syntactic
dependency, while the isolation of regular configurations enables us to
obtain the complete analogue of a "syntagma" or "word combination".
10. The individual elements (parts) of the syntagma (they are described
in the formal system as S-groups or relatemes [RELYATEMY]) may be regulated
by the relationship of syntactic subordination. It is in these terms that
the concept of coherence is formulated.
11. A sentence may be called a cohesive number of S-groups (or
relatemes) such that for each S-group A there is one and only one S-group
B such that B is syntactically subordinate to A.
12. Calling two S-groups linked by mutual syntactic subordination a
predicative pair leads to the following theorem: a sentence has one and
only one predicative pair.
13. The suggested definition meets all the requirements set forth at
the beginning of this report: it is formal, reflects the idea of coherence,
and is sufficiently close to the intuitive conception of the term "sentence".
It also permits us to derive deductively the idea of predicativity.
14. Exclusion of the so-called "single-constituent sentences" is
justifiable on two main grounds: first, whatever may be our definition
of sentence, "single-constituent sentences" cannot in general be taken into
consideration because the method of configurations is not applicable to
them. Second, the problem of correlation of "single-constituent sentences"
with a judgment cannot be completely solved. And it is important for us
that the "sentence", determined by particularly formal means, may be placed
in mutually well-defined congruence with the judgment. Thus, a strictly
formal definition of the sentence is important even for logic.
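
The condition stated in thesis 11 and the notion of a predicative pair in thesis 12 can be checked mechanically on a small example. The sketch below is an assumption-laden illustration: it reads the translated wording literally (a pair (a, b) means "b is syntactically subordinate to a"), interprets "cohesive" as connectedness of the groups when direction is ignored, and uses placeholder group names. It only tests the condition and lists mutual pairs; it does not prove the theorem.

```python
from typing import FrozenSet, Set, Tuple

# (a, b) is read as "b is syntactically subordinate to a", following the
# wording of thesis 11. Group names are hypothetical placeholders, and
# every name used in a pair is assumed to belong to `groups`.
Pair = Tuple[str, str]


def is_cohesive(groups: Set[str], pairs: Set[Pair]) -> bool:
    """True if all groups form one connected whole, ignoring direction."""
    if not groups:
        return False
    neighbours = {g: set() for g in groups}
    for a, b in pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen: Set[str] = set()
    stack = [next(iter(groups))]
    while stack:
        g = stack.pop()
        if g not in seen:
            seen.add(g)
            stack.extend(neighbours[g] - seen)
    return seen == groups


def satisfies_definition(groups: Set[str], pairs: Set[Pair]) -> bool:
    """Each group A has exactly one B subordinate to it, and the whole is cohesive."""
    one_each = all(sum(1 for a, _ in pairs if a == g) == 1 for g in groups)
    return one_each and is_cohesive(groups, pairs)


def predicative_pairs(pairs: Set[Pair]) -> Set[FrozenSet[str]]:
    """Pairs of distinct groups joined by mutual subordination."""
    return {frozenset((a, b)) for a, b in pairs if a != b and (b, a) in pairs}


groups = {"g1", "g2", "g3"}
pairs = {("g1", "g2"), ("g2", "g1"), ("g3", "g1")}
print(satisfies_definition(groups, pairs))
print(len(predicative_pairs(pairs)))
```

In this example the only mutual subordination is between g1 and g2, so exactly one predicative pair is found, in line with what thesis 12 asserts for sentences.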
26. TRANSLATION sub specie structuralismi
A. A. Reformatskii (Moscow)
1. Translation results from the variety of languages and the consequent
lack of mutual understanding between their speakers. The purpose of
translation is to supply necessary information (business, scientific,
artistic, etc.) in language comprehensible to a given user of the information.
2. What is the "theory of translation" and can there be a special
science of translation?
Criticism of "literary expansion" (L. N. Sobolev and, in part, I.
Etkind). Where A. V. Fedorov is wrong in including the "theory of
translation" in linguistics. The "theory of translation" not as a science,
but as an object of science, even various sciences. The role of linguistics
in the "theory of translation".
3. Types of translation. Narrowing of "scope of translation" in the
usual view. Where L. N. Sobolev is wrong in considering translation
limited to three types. How "type of translation" is defined. A given text
and the goal of translation. What is the structure of a given text in its
known linguistic features and in its social trends. The linguistic features
of a given text as determining the type of translation. Relevancy and
non-equivalence of translation elements in various types of translation.
4. Translation as information and interpretation. What the structure
of "translation" as a whole consists in. Initial data of translation, act
of translation, results of translation in the structural sense. Where Z.
Klemensevich and I. Etkind are right and I. Kashkin is wrong. Various types
of relations between original and translation.
5. The problem of "translatability" and "non-translatability". What
"lack of mutual understanding" consists in. Why Humboldt is right and Kashkin
is wrong. The unwarranted claims of A. V. Fedorov and others. What is
adequacy of translation in connection with analysis of translation elements
and understanding of translation type.
6. Methods and circumstances of translation dictating various
solutions of the translation problem. Ad hoc translations, translations
as the "task of a lifetime"; lexicography. Informative translations,
technical and scientific translations, artistic translations, translations
of philosophical texts, machine translation. Cooperation of sciences and
talents in diversity of translation activity.
27. A SYSTEM OF RECORDING SPEECH FOR ORAL TRANSLATION
V. Yu. Rozentsveig (Moscow)
1. Oral translation differs from translation of a written document
in that the words to be translated are perceived by the ear, transformed,
stored in the memory, and later delivered orally. These operations take
place more or less simultaneously (depending on the kind of oral trans-
lation).
2. The limited capacity of the "short" memory of man results in
considerable losses of information when large segments of speech are
translated. Moreover, overloading the memory makes the analysis and synthesis
of a spoken communication difficult. It is necessary to work out a system
of recording speech constructed in such a way that it would interact with
the "short" human memory, thus ensuring reliable storage of information,
facilitating perception, and recreating the oral message.
3. A phonetic (alphabetic) writing system has not been devised for
the recording of foreign speech. Stenography is unacceptable for oral
translation because it registers the words in toto (including redundant
and unnecessary words) and requires too much time to decipher. The system
of shorthand worked out empirically in the University of Geneva's School
for Translators largely meets the needs involved in recording speech for
oral translation (Rozan's work). However, it does not solve our problem
owing to its unsystematic nature and internal contradictions.
4. The task of developing an efficient system of recording speech for
oral translation amounts to the creation of a unique elementary information
language requiring the solution of several logical, psychological, and
linguistic problems, to wit:
(a) logical analysis of speech, isolation of semantic fulcral
points and systems of connecting them;
(b) identification of the properties and action mechanism of
the "short" human memory;
(c) determination of linguistic redundancies in common (stereotyped)
word combinations and sentences capable of being reduced to
symbols, the most efficient techniques of designating
morphemes and syntactic connections in the system of the
complex whole. In addition, we must keep in mind the necessity
of working out a recording system that will be applicable to
a pair of languages and easily mastered by those studying to
become translators.
28. LANGUAGE TRAINING FOR BLIND DEAF-MUTES
I. A. Sokolyanskii (Moscow)
1. The simultaneous lack of visual and aural analyzers and thereby
of the speech analyzer is an exceedingly unusual condition for a child.
The unusualness consists in the fact that the deaf-dumb-blind child is
completely normal as far as neural and cerebral structure is concerned and
therefore retains potentially the full capacity for intellectual
development like that of any normal child. Nevertheless, using just his own
efforts and without outside help he cannot make initial contact with the
external environment surrounding him.
Lo Development of a deaf-du -bllind child's first contacts with his
en7ironment is an extre ly ,3ouaplex problem that can only solved by
selecting a rigorous system of initial signals," This is achieved by special
teaching and a special grammar. Or?diia rry general (particularly "aohooll")
gra aar, as presented in, general courses cannot be used,
3o If the system of initial signal contacts is developed in close
conformity with the logic of the external physical enviro nt, formation
of the second aignxalling system on the basis of the first is not parti.ou-
larlly complex and is chiefly a technical problem., The heart of the matter
lies, therefore, not in the esaand9 but, on the first signalling systeno
4. The second signalling system (language) in teaching a deaf-dumb-
blind child takes the following forms: gesticulatory, dactylic, touch (Braille),
written, oral. The second signals must be strictly used in the same order
as listed above.
5. A text is a basic link in the second signalling system--but not
separate words or separate sentences. Hence, the language instruction of
a deaf-dumb-blind child must begin with texts, not separate words or
sentences.
6. The beginning texts are short, consisting altogether of 3-4 so-
called "simple" non-extended sentences (two-member). Five or six of these
are enough, after which the child can pass to texts composed of "simple"
extended sentences which, according to the rules, must include objects
(direct or indirect indiscriminately).
The remaining syntactic constructions--even such difficult ones as
complex clauses--are assimilated with the "simple" extended sentences in
the series of texts. What general grammar calls a "simple" sentence is
not at all "simple" as far as the teaching of deaf-dumb-blind children is
concerned.
29. SOME GENERAL PRINCIPLES IN COMPILING GLOSSARIES
FOR MACHINE TRANSLATION
G. M. Strelkowskii (Moscow)
1. The word as a basic unit of language. "Every word (speech)
generalizes" (Lenin, Philosophical Notebooks), since ideas originate
simultaneously with words and are expressed through words. The very
possibility of logical thought is created solely by language. The unity
of language and thought is organic, i.e. language can neither arise nor
exist without thought, nor thought without language. However, words are
not identical to ideas. Words may have several meanings, i.e. they may
express different ideas and, vice versa, one idea may be expressed by
several words. A word may contain not only the expression of an idea, but
also the relation of the speaker to the object designated by the given
word (KHOLOD [cold], KHOLODISHCHE [extreme cold], KHOLODOK [light cold], etc.).
2. In this connection one should mention the impossibility of de-
scribing language without referring to meaning (the weakness in the theories
of American structuralists and their followers). The unsoundness of
theories reducing language to a system of pure relationships (Hjelmslev).
3. In accordance with the considerations set forth above, algorithms
for machine translation must be based on a dictionary of meanings.
4. The principles of word choice for a machine dictionary.
(a) Significant and auxiliary words.
(b) Division of significant words into technical terms and words
in common use.
(c) Need to ascertain the minimum of international words required
for comprehension of technical texts.
(d) Choice of subjects (Electronics, particularly the section
dealing with automatic control, since this field is now in-
cluded in all branches of industry and science, and is a
basis for machine translation itself. Competence of author).
(e) Problem of compound words and word formation. Regular and
irregular translation of compound words. Glossary of stems
and program therefor, or information referring to tables of
suffixes and paradigms of word changes (including stem changes,
e.g. stem forms of strong verbs).
5. Word combinations and phrases. Providing words in the glossary
with an index indicating possible stock phrases. Translation of lexical
homonyms by the method of analyzing word combinations.
6. Methods of work in compiling dictionaries. Choice of articles,
reading them, writing out all words, except the commonest helping words
(auxiliary verbs, pronouns, prepositions, etc.) on index cards; alphabetic
arrangement of cards. Numbering sentences in the text and corresponding
index on the cards for ready location of possible occurrences of the
word.
7. Statistical conclusions. Alphabetic arrangement of words. Percentage
of technical terms. Repetitiousness of nontechnical terms.
8. Methods of expanding the glossary with and without the machine.
Reading of other materials on the given subject and enrichment of
glossary with common words.
Inclusion within the glossary of all technical terms already selected
in the special glossaries of technical terms on the given subject. Treatment
of new texts by the machine with separation of words not known to it
and presenting them untranslated, or simply a selection of new words.
9. A selected glossary as a foundation for constructing a translation
algorithm without the creation of some metalanguage.
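The card-file procedure of point 6 and the separation of unknown words in point 8 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the author's procedure: the list of helping words, the function names build_card_index and unknown_words, and the sample sentences are all invented for the example.

```python
import re
from collections import defaultdict

# Hypothetical list of the "commonest helping words" that get no card.
HELPING_WORDS = {"the", "a", "of", "is", "in", "it", "to", "and"}

def build_card_index(sentences):
    """Return {word: [sentence numbers]} with the cards in alphabetic order."""
    cards = defaultdict(list)
    for number, sentence in enumerate(sentences, start=1):
        for word in re.findall(r"[a-z]+", sentence.lower()):
            if word not in HELPING_WORDS and number not in cards[word]:
                cards[word].append(number)
    return dict(sorted(cards.items()))

def unknown_words(text, glossary):
    """Words of a new text that the glossary lacks, in order of first occurrence."""
    seen = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if word not in glossary and word not in HELPING_WORDS and word not in seen:
            seen.append(word)
    return seen

sentences = [
    "The relay closes the circuit.",
    "The circuit feeds the amplifier.",
]
index = build_card_index(sentences)
```

Each card keeps the numbers of the sentences in which the word occurs, so any occurrence can be located again; unknown_words stands in for the machine's separation of words not known to it.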
30. SOME ANALOGIES TO THE PROBLEMS AND METHODS OF CONTEMPORARY
LINGUISTICS IN ANCIENT INDIAN GRAMMATICAL WORKS
V. N. Toporov (Moscow)
1. Linguistics has perhaps never been so independent and complacent
as it is today. This is undoubtedly due to the fact that the real object
of the science has been found. On the other hand, the connection between
linguistics and other sciences has never before been so strongly felt.
But this connection is effected not on the earlier basis, when attempts
were made to apply the methods of one science to another, but on a new
one. It is characterized by some ideas common to a number of sciences.
These ideas developed (often independently) on the soil of the various
sciences. The isomorphism of certain fundamental concepts (of "structure",
"field", "invariant", etc.), the similarity of individual problems and
methods of solution. It is becoming increasingly evident that certain
common ideas and methods are being superimposed, as it were, on the material
of the particular sciences and transformed in accordance with the nature
of the material, the possibility of giving it a strictly formal interpretation,
the scientific traditions in the given field, etc. For this
reason the prospects for a new synthesis of various sciences on a new basis
are now being carefully assessed (cf. International Encyclopedia of
Unified Science, vol. I, 1938-1939; B. Hansa no The concept of field as
a synthesis of natural science and humanities traditions in sociology.
Vestnik istorii mirovoi kul'tury [Herald of the History of World Culture],
no. 4, etc.).
At this time when linguistics is very clearly aware of its place
among the other sciences and the new direction in linguistics is inter-
preted as being something broader than simple opposition to old ideas, it
is natural that there should be growing interest in the outlook for the
development of linguistics, the nature of its connections with other
sciences, and the ultimate fate of these connections.
When one examines these problems, it is difficult to avoid thinking
about certain striking analogies to modern linguistic problems that may be
found in the history of ancient Indian science, particularly linguistics,
and which are attracting the attention of modern scholars with increasing
frequency (L. Bloomfield, Emeneau, Brough, Allen, Renou, and others).
3. Let us list the most important analogies in the light of contemporary
problems.
(a) Formal principle of language description ("descriptivism"):
exclusion of meaning in analysis, if we disregard the very small number
of Sutra-interpretations that sometimes deal with the determination of
connections of semantic (according to Morris) order; fullness of
description, including differentiation between the obvious and the
non-obvious.
(b) Elements of a systematic approach to language; clear
distinction between class and member of class with fixed place; hence,
on the one hand, the concept of zero, on the other, potential forms,
hypergrammaticisms, false variation (often supported by the striving for
conciseness in exposition); contrast of sphota-sabda; negative characteristics
of members in relationship; Prabhakara's teaching on semantics;
schools on relation of word and sentence and the dependence of the former
on the latter; distinction between signum-designatum-denotatum.
(c) The metalanguage of Indian grammatical treatises; symbols
(sign-index, sign-symbol types); metalanguage grammar in cognition.
(d) In connection with these features of ancient Indian linguistics,
mention must also be made of similar phenomena in other fields: the esthetic
code in ancient Indian art, particularly in the drama; the concept of dhvani
(an analogy to sphota); some analogies in the works of ancient Indian logicians
and philosophers (categories of relation, time; "nominalism"); characteristics
of Indian historiography; the concept of zero among the mathematicians of
ancient India, etc. A comparison with ancient Greek science enhances the
significance of the specific features of ancient Indian grammatical literature,
which in many respects resembles modern linguistics.
31. THE FREQUENCY OF LEXICAL UNITS IN ENGLISH
M. G. Udartseva (Petrozavodsk)
1. We undertook a study of frequency of lexical units in English
geological literature in connection with the compilation of a minimal glossary
for students in geological institutions. As material we selected articles
on the various branches of geology as well as on the allied sciences. In
addition, for the sake of objectivity in the tally, we included a considerable
number of authors from several English-speaking countries. The final listing
of sources comprised 33 works containing a total of 250,000 words, of which
28 are articles from 14 periodicals published in the United States, Great
Britain, Canada, India, and Australia, while 5 were excerpts from monographs.
2. The literature dealt with problems related to the following branches
of geology: mineralogy, crystallography, petrography, petrography of
sedimentary, igneous, and metamorphic rocks, petrology, stratigraphy,
paleontology, lithology, tectonics and structural geology, origin, distribution,
and exploitation of mineral resources, geology of oil and coal deposits,
geophysical methods, prospecting for mineral deposits, radioactive methods
of determining the age of rocks, quaternary geology, geomorphology and
glaciology, dynamic geology, geology of the ocean bottom, and regional
geology.
3. Individual words, phrases, and verbs plus postpositions were used
in the count. Each additional meaning of a word was handled as a separate
item. For example, the word "face" was regarded as four separate words
corresponding to the meanings of "side", "face" (of crystal), "surface",
"to put something in front of a person".
4. Each lexical item encountered again was entered on a separate
index card where all secondary usages with indication of author were noted.
If the word occurred more than 100 times in different authors, no further
entries were made. Such words as "that", "which", "it", etc. were handled
similarly.
5. The count resulted in a determination of the frequency for 7535
words. We entered into the minimal glossary 2373 lexical items consisting
of 546 verbs, 954 nouns, 327 adjectives, 235 adverbs, and 310 other kinds
of words. Of this number 176 words are specialized terms; more than 200
words have another meaning in geological literature, while the remainder
are ordinary words. About 4000 of the 7535 words are technical terms.
6. The minimal glossary was tested by taking several random pages
of diverse literary, general political, and geology material and
calculating the percentage of words from each text that were lacking in the
minimal glossary. It turned out that a page of geological text contained
1-1.5% "unfamiliar" words, general political text 8-10%, and literature
(Dickens) 16-18%.
7. The minimal glossary was also collated with the Thorndike
dictionary. Significant discrepancies were noted even in determining the
first 500 words.
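The test described in point 6 amounts to a simple coverage count, which might be sketched in Python as below. The abstract does not say whether running words or distinct words were counted; the sketch assumes running words, and the function name unfamiliar_share, the toy glossary, and the sample page are hypothetical.

```python
import re

def unfamiliar_share(page_text, glossary):
    """Percentage of running words on a page that are missing from the glossary."""
    words = re.findall(r"[a-z]+", page_text.lower())
    if not words:
        return 0.0
    missing = sum(1 for word in words if word not in glossary)
    return 100.0 * missing / len(words)

# Toy data, for illustration only.
glossary = {"the", "rock", "is", "of", "igneous", "origin"}
page = "The rock is of igneous origin, the dyke is later"
share = unfamiliar_share(page, glossary)   # 2 of the 10 words are missing
```

Running the same function over pages of different kinds of text gives figures comparable in form to the percentages reported above.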
32. ONE APPROACH TO LOGICAL SEMANTICS
V. K. Finn and D. G. Lakhuti (Moscow)
1. Our approach to logical semantics can be summed up as follows:
(a) some language of science with minimal pragmatics is selected
as the investigated language (e.g., the language of synthetic
organic chemistry, formal genetics, classical mechanics, etc.);
(b) an artificial language is constructed for the investigated
language I and it consists of a glossary (class of basic
technical terms and syntactic-functors) and a class of indexes
for the glossary as well as a formal syntax in which are formulated
the rules for building sentences consisting of the
indexes. A correctly formed sentence in language I is
determined with the help of an algorithm constructed in the
formal syntax.
(c) Language I is expanded into language II consisting of language
I, a list of descriptions of types of sentences in language I
(examples of such types of sentences for the language of
synthetic organic chemistry will be: sentences conveying information
about compounds, sentences conveying information
about reactions, sentences conveying information about reaction
conditions) and a list of combinations of indexes corresponding
to the types of sentences formed.
(d) In accordance with the types of sentences algorithms are
constructed in language II that discern the meaning of these
sentences.
If sentence F is correctly constructed and all the
indexes are replaced by dictionary signs and if the combination
of indexes corresponding to F coincides with the combination
of indexes of some of the sentence types in language I, the
algorithm will convert F into sign "S", if all the predicates
of the corresponding description are satisfied for F; if even
one predicate of the description is not satisfied for F, the
algorithm will convert F into an empty word.
In the first case we will say that "F has meaning in
language I", in the second "F does not have meaning in language
I". If, however, the algorithm is not applied to F, we will
say that the meaning of F is not determined in language I.
A descriptive syntax is formulated in language II. It
consists of suitable algorithms to discern the meanings of
sentences and a list of rules according to which meaningful
sentences are derived from meaningful sentences.
(e) Language II is subsequently expanded into language III in which
definitions with reference to the properties of language I and
its relations to the investigated language are formulated.
Language III consists of language II and a list of definitions.
Language III contains definitions of the concepts of the
semantic completeness of language I, translatability (full or
partial) of the investigated language into language I, interpretation
of language I within the amalgamation of language
II and the investigated language, explicitness of language I,
and other semantic concepts.
If it is possible to construct a series of languages I,
II, III for the investigated language, we will say that the
"semantic analysis of the investigated language" has been
realized. If the investigated language is at least partially
translatable into language I, it is suggested that "semantic
analysis of the investigated language" can be effected by an
automatic machine.
"Semantic analysis" is in the experimental stage, and that
is why we speak about an "approach" to semantics, and not the
construction of a deductive system of semantics.
However, the deductive construction of a system of seman-
tics is possible on the basis of experimental investigations
of the "languages of science" (with minimal pragmatics).
In the preparation of this paper we have used the ideas and results
of research in semantics by A. Tarski, K. Ajdukiewicz, L. Chwistek,
Y. Bar-Hillel, H. Curry, W. Quine, N. Goodman, and R. Carnap.
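The three outcomes of the meaning-discerning algorithm in point (d) can be illustrated with a small Python sketch. It is a toy under stated assumptions rather than the authors' construction: the index names COMPOUND and FUNCTOR, the single sentence type, the predicate reactants_differ, and the use of None for "algorithm not applied" are all hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

Sentence = Sequence[Tuple[str, str]]        # (dictionary sign, index) pairs
Predicate = Callable[[Sentence], bool]

def reactants_differ(sentence: Sentence) -> bool:
    """Toy predicate: the two compounds of a reaction sentence must differ."""
    return sentence[0][0] != sentence[2][0]

# Hypothetical list of sentence types: index combination -> description predicates.
SENTENCE_TYPES = {
    ("COMPOUND", "FUNCTOR", "COMPOUND"): [reactants_differ],
}

def discern_meaning(sentence: Sentence) -> Optional[str]:
    """Return "S", the empty word "", or None when the algorithm does not apply."""
    pattern = tuple(index for _, index in sentence)
    predicates = SENTENCE_TYPES.get(pattern)
    if predicates is None:
        return None            # meaning of F is not determined
    if all(predicate(sentence) for predicate in predicates):
        return "S"             # F has meaning
    return ""                  # F does not have meaning
```

For instance, a sentence whose indexes match the listed combination and whose two compounds differ is converted into "S"; the same combination with identical compounds yields the empty word.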
33. SOME PROBLEMS CONNECTED WITH THE HANDLING OF
VERBS WITH ALTERNATING FORMS
(A Statistical Inquiry)
R. M. Frumkina (Moscow)
The compilation of a dictionary of stems is a necessary stage in the
task of constructing an algorithm of machine translation. By stem we
understand the graphically invariant part of a word. However, there are a
number of languages in which the graphically invariant part of certain
words, principally verbs with alternate forms, consists of one or two letters,
an inconvenience resulting in homonymy of stems. It is therefore necessary
to separate only the purely standard endings (person, number, etc.), and
assume that a given word has several stems.
There are two possible ways of solving the problem:
(1) Enter into the dictionary all the stem variants of each word
with plural stems, e.g. perfective and imperfective aspect, present and
past tense stems, etc. We thereby increase (and sometimes considerably)
the size of the dictionary.
(2) Select the most frequently occurring variants and enter
them into the dictionary; for the other stems, furnish the rules by which
they are in some manner to be identified or formed according to the stems
listed in the dictionary. This would enable us markedly to reduce the
size of the dictionary, but at the price of complicating the program.
In order to determine the more efficient method, it will be necessary
above all to carry out a statistical inquiry concerning words with plural
stem variants and their frequency. We are now analyzing the frequency of
verbs with alternating forms in a Spanish scientific (mathematical) text.
On the basis of data in the frequency dictionary of V. Garcia Hoz, all
Spanish words with a frequency of more than 460 were first divided into
classes depending on the types of alternation. Then the frequency both of
classes and of individual morphological forms was determined from con-
secutive material in mathematical texts.
The data thus obtained clarify the principles governing the distri-
bution of classes and alternating forms and enable us to make certain
recommendations in compiling a dictionary and rules for handling stems.
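The trade-off between the two solutions can be made concrete with a short Python sketch. It is an illustration only: the two Spanish verbs, the choice of which stem is listed, and the two alternation rules (ue/o and ie/e) are assumptions for the example and are not taken from the statistical data of the paper.

```python
# Solution (1): every stem variant is its own dictionary entry.
ALL_STEMS = {"cont": "contar", "cuent": "contar",
             "pens": "pensar", "piens": "pensar"}

# Solution (2): one listed stem per verb plus rules that undo the alternation.
BASE_STEMS = {"cont": "contar", "pens": "pensar"}
ALTERNATIONS = [("ue", "o"), ("ie", "e")]   # form met in text -> form in dictionary

def lookup_with_rules(stem):
    """Find a verb by its listed stem, or by undoing one alternation."""
    if stem in BASE_STEMS:
        return BASE_STEMS[stem]
    for alternate, basic in ALTERNATIONS:
        if alternate in stem:
            candidate = stem.replace(alternate, basic, 1)
            if candidate in BASE_STEMS:
                return BASE_STEMS[candidate]
    return None
```

The second dictionary is half the size of the first, but every lookup may now pass through the rule loop, which is the complication of the program mentioned above.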
34. A LOGICAL ANALYSIS OF THE CONCEPT OF
LANGUAGE STRUCTURE
S. K. Shaumyan (Moscow)
1. Modern structural linguistics interprets language structure on
the Gestalt plane, i.e., as a whole, the elements of which are connected by
definite relations.
2. If we consider that language elements interact on two axes--
syntagmatic and paradigmatic--an interpretation of language structure on
the Gestalt plane must be regarded as one-sided: we encounter wholes,
the elements of which are connected by definite relations, only on the
syntagmatic axis (such wholes, for example, are syllables in phonology or
sentences in grammar). However, on the paradigmatic plane we deal not
with wholes, but with classes of ordered elements: the elements of these
classes are interlinked by definite relations, but the classes can not be
identified in any way with the wholes.
3. There arises the need of defining language structure in such a
way that the definitions may be applied to the interaction of language
elements not only on the syntagmatic, but also on the paradigmatic axis.
4. The new definition of the concept of language structure is based
on the general concept of structure in modern symbolic logic where it is
defined thus: the structure of a given relation is the property of being
isomorphic with the given relation.
5. Modern structural linguistics, as we know, distinguishes two planes
in language: the plane of expression and the plane of content (phonology
is included in the former, grammar and lexicology in the latter). Since
isomorphism exists between both planes, we may rely on the definition of
the general concept of structure in symbolic logic and define language
structure thus: language structure is the property of the relations of
elements on the plane of expression and of the relations of elements on the
plane of content to be isomorphic with one another. This definition of
language structure is in complete accord with the research techniques of
structural linguistics at its present stage of development.
6. A logical analysis of the concept of language structure requires
an operational approach to this concept. Accordingly, the report states
how we should set up empiric operations by means of which language structure,
as an abstraction, can be linked to genuine linguistic activity.
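The logical notion of structure used in point 4 rests on isomorphism of relations, which can be checked mechanically. The Python sketch below is a minimal illustration: the function name is_isomorphism and the toy relations (a voiceless-voiced pairing on the plane of expression, set against an arbitrary pairing of grammatical categories) are assumptions made for the example and carry no linguistic claim.

```python
def is_isomorphism(mapping, relation_a, relation_b):
    """True if `mapping` is one-to-one and carries relation_a exactly onto relation_b."""
    if len(set(mapping.values())) != len(mapping):
        return False                      # two elements share an image
    image = {(mapping[x], mapping[y]) for x, y in relation_a}
    return image == set(relation_b)

# Toy relations, for illustration only.
expression = {("p", "b"), ("t", "d")}
content = {("singular", "plural"), ("present", "past")}
mapping = {"p": "singular", "b": "plural", "t": "present", "d": "past"}
```

Two relations have the same structure in this sense when at least one such mapping exists; here the given mapping carries every ordered pair of the first relation onto an ordered pair of the second and misses none.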
35. ANCIENT TEXTS AND MACHINE TRANSLATION
(A formulation of the problem)
V. Shevoroshkin (Moscow)
1. There is no doubt that a great many philosophers, historians,
ethnographers, and even specialists in literature have an acute need of
Russian translations of a large number of ancient texts.
2. The available translations are a drop in the ocean compared with
the mass of ancient literary monuments.
3. Texts in dead languages have one feature that distinguishes them
from texts in modern languages, namely, the frequent impossibility of
proving that the original author had in mind precisely what we "read into"
the text.
4. The feature of ancient texts noted above has produced and is con-
tinuing to produce numerous commentaries on these texts.
5. The translator of ancient texts is in essence a commentator. Even
the translator who strives for maximum objectivity inevitably introduces
into his work many subjective elements, which vary in degree with the depth
of his erudition.
6. An investigator who requires the translation of an ancient text
may also need a commentary, but his primary need is for a maximally objective
translation. When reading such a translation, he should confront
the same difficulties that are mastered by a person who reads the text
in the original. However, a translation done by a human being does not
meet these needs for the reasons mentioned in (5) above.
7. Machine translation of ancient texts will enable a student to
obtain exactly what he needs. "Interpretation" of a text by a machine is
excluded. The more "elementary", the better.
8. Thus, machine translation of ancient texts is particularly im-
portant, for the machine is not merely a substitute for a live translator,
but - in this respect alone - it does what a person can't do.
9. Certain characteristics of the ancient Indo-European languages
enable us to assert that these languages are more accessible to machine
translation than are the living languages. These characteristics include:
comparatively greater transparency of morphology and simplicity of syntax,
numerous trite phrases, etc. This problem will be considered in detail
on Sanskrit material.
10. For the reasons set forth above machine translation of ancient
texts into Russian is a problem that deserves detailed elaboration.
SECTION ON ALGORITHMS OF MACHINE TRANSLATION
36. AN ALGORITHM FOR TRANSLATING FRENCH INTO RUSSIAN
ELECTRONICALLY
V. A. Agrayev
(Gorki)
The algorithm was designed for use in connection with an electronic
computer of the GIFTI [Gor'kovskii issledovatel'skii fiziko-tekhnicheskii
institut/Gorky Research Institute of Physics and Technology] possessing a
limited memory capacity. The aim was to determine the translation capa-
bilities of the machine as well as to check the operation of the algorithm
with limited glossary and rules.
The algorithm includes lexical routines: a glossary of stems, a glossary
of phrases, and charts for translating polysemants. The stem glossary con-
tains about 500 words. In addition, we prepared a large glossary (about
1200 words) containing the full, original forms. The amount of grammatical
information included with the words varies in the two glossaries: less is
given in the stem glossary. The phrase search is based on the semantically
pivotal word. The translation routines of polysemants contain tests for
contextual environment and the required meaning is selected accordingly.
Analyzing rules determine the meaning of French inflections and,
depending on the governing words, establish the necessary grammatical forms
of the other words.
In the synthesis routines Russian word forms are constructed on the
basis of grammatical information derived from the glossary and developed
during the process of analysis. Synthesis is effected with regard for its
applicability also to translating English radio engineering texts.
Statistically chosen data were used in constructing the algorithm.
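The translation routines for polysemants, which test the contextual environment and select a meaning accordingly, might look roughly like the following Python sketch. The chart for the French word "tension", its trigger words, and the default meaning are hypothetical and serve only to show the shape of such a routine.

```python
# Hypothetical chart for one French polysemant; the context words are illustrative.
POLYSEMANT_CHARTS = {
    "tension": [
        ({"électrique", "circuit", "volt"}, "напряжение"),   # electrical sense
        ({"mécanique", "ressort", "fil"}, "натяжение"),      # mechanical sense
    ],
}
DEFAULT_MEANINGS = {"tension": "напряжение"}

def translate_polysemant(word, sentence_words):
    """Pick the meaning whose context test matches; otherwise use the default."""
    context = set(sentence_words)
    for trigger_words, meaning in POLYSEMANT_CHARTS.get(word, []):
        if context & trigger_words:
            return meaning
    return DEFAULT_MEANINGS.get(word)
```

With a limited memory the number of trigger words per chart has to stay small, which is one reason a default meaning is included in this sketch.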
37. PRINCIPLES IN THE CONSTRUCTION OF ELECTRIC READING DEVICES
N. D. Andreyev
(Leningrad)
1. The problem of electric reading devices (EChU) [elektrochitayushchiye
ustroistva] arises because of the slowness in preparing a text for machine
translation, which is inevitable when a human being does this work
(particularly in oriental language texts).
2. An electric reading device must be adapted for machine sensing of
scripts of varying size, slant, proportion, and graphic shape.
3. The different sizes, slants, and proportions of scripts may be
reduced to a single standard by using the three-set system of varying curve
mirrors [TREKHKOMPLEKTNOI SISTEMY ZERKAL PEREMENNOI KRIVIZNY].
4. Scripts of different shapes may be adapted for machine sensing
by using the principle of key identification points [KLYUCHEVYKH
OPOZNAVATEL'NYKH TOCHEK], the number of which cannot exceed 50 for Cyrillic
and Latin; it may reach 100 for Arabic, Devanagari and their derivatives,
and about 300 for Chinese and Japanese.
5. The set of key points is individualized for each of the graphic
signs and is interpreted for each language in accordance with a special
program that constitutes the introductory part of the analysis in the
appropriate algorithm.
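The principle of key identification points can be illustrated by a small Python sketch in which each graphic sign is recognized from a few fixed positions on a normalized grid. The 3x3 grid, the two sample letters, and the chosen points are hypothetical; the abstract gives only upper bounds on the number of points per script.

```python
# Hypothetical key points on a 3x3 normalized grid: (row, column) -> ink expected?
KEY_POINTS = {
    "L": {(0, 0): True, (2, 0): True, (2, 2): True, (0, 2): False},
    "T": {(0, 0): True, (0, 2): True, (2, 1): True, (2, 0): False},
}

def identify(glyph):
    """Return the characters whose key points all agree with the glyph bitmap."""
    matches = []
    for character, points in KEY_POINTS.items():
        if all(bool(glyph[r][c]) == inked for (r, c), inked in points.items()):
            matches.append(character)
    return matches

GLYPH_L = [[1, 0, 0],
           [1, 0, 0],
           [1, 1, 1]]
GLYPH_T = [[1, 1, 1],
           [0, 1, 0],
           [0, 1, 0]]
```

Only the listed points are inspected, so the rest of the bitmap may vary with the shape of the script; the sketch presupposes that size, slant, and proportion have already been reduced to a standard.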
38. WORK ON AN INDONESIAN-RUSSIAN ALGORITHM
OF MACHINE TRANSLATION
N. D. Andreyev
(Leningrad)
1. The Indonesian language requires preliminary treatment of the
words in order to strip their roots. Stripping of the root by direct
resort to a dictionary appears to be impossible.
2. Three factors make it difficult to strip the roots: (1) the
presence of initial and secondary prefix and suffix; (2) internal sandhi,
i.e. the phonetic interaction of morphological elements; (3) the presence
of root reduplicators and polyreduplicators, which occur in two graphic
variants.
3. Much preliminary work was required for the statistical and
structural investigation of Indonesian words. Different versions of the
root-stripping program were based on this work.
4. Processing the words in the root-stripping program makes it
possible to proceed to morphological analysis, which is effected by
a special morphological program that is often realized in a purely
analytic way, i.e., without resorting to the output language, but by
substituting words in their code hieroglyphic.
5. Based on a certain working hypothesis concerning the structure of
the Indonesian sentence, it seems possible to construct a standard analysis
constituting the principal part of the syntactic program; it is only for a
minor portion of the sentences that we need a nonstandard analysis forming
a more complicated but much less frequently used part of this program.
6. The homonym and phraseology programs are operated after the first
three programs are completed, relying on the hieroglyphic analysis effected
therein.
7. The propositional and glossary program works chiefly by conversion,
i.e., according to the output language.
8. Tables of pseudoroots and typical sets of morphological information
are being developed as necessary supplements to the main
glossary.
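The three difficulties named in point 2 (affixes, internal sandhi, and the two graphic variants of reduplication) can be illustrated by a simplified Python sketch. It is not the root-stripping program described here: the toy root list, the handful of suffixes, and the table of meN- prefix variants are assumptions, and the sketch confirms each candidate against the root list, which the actual program may well avoid.

```python
import re

ROOTS = {"tulis", "baca", "buku", "ajar"}          # toy root glossary (assumption)
SUFFIXES = ("kan", "an", "i")
# meN- prefix variants paired with the root-initial consonant sandhi may have dropped.
PREFIX_SANDHI = [("meny", "s"), ("meng", "k"), ("meng", ""), ("mem", "p"),
                 ("mem", ""), ("men", "t"), ("men", ""), ("me", "")]

def undo_reduplication(word):
    """Reduce both graphic variants, 'buku-buku' and 'buku2', to 'buku'."""
    if word.endswith("2"):
        return word[:-1]
    match = re.fullmatch(r"(\w+)-\1", word)
    return match.group(1) if match else word

def strip_root(word):
    """Return the root of `word` if affix stripping leads to a known root."""
    word = undo_reduplication(word.lower())
    candidates = [word] + [word[:-len(s)] for s in SUFFIXES if word.endswith(s)]
    for candidate in candidates:
        if candidate in ROOTS:
            return candidate
        for prefix, restored in PREFIX_SANDHI:
            if candidate.startswith(prefix):
                root = restored + candidate[len(prefix):]
                if root in ROOTS:
                    return root
    return None
```

In this sketch "menulis" is traced back to "tulis" by restoring the consonant lost through sandhi, and "buku2" and "buku-buku" both reduce to "buku".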
39. WORK ON A VIETNAMESE-RUSSIAN ALGORITHM
OF MACHINE TRANSLATION
N. D. Andreyev, D. A. Batova, and
V. S. Panfilov (Leningrad)
1. The Vietnamese-Russian algorithm of machine translation includes
the following programs:
(a) Glossary of binomials,
(b) Glossary of roots,
(c) Glossary of idioms,
(d) Supporting program [OPORNAYA PROGRAMMA],
(e) Syntactic program,
(f) Homonymic program.
2. The glossary of binomials assumes the stripping of two-syllable
Vietnamese words with their grammatical information.
The glossary of roots includes monosyllabic words and their
grammatical information. The existence of two glossaries is due to the
problem of word boundary in isolating languages.
The glossary of idioms contains idioms, phrase combinations, and
hard-to-translate expressions.
The supporting program serves to differentiate between parts of
speech in those cases where the appropriate grammatical information cannot
be precisely indicated either in the glossary of roots or in the glossary
of binomials.
The syntactic program provides for an analysis of Vietnamese syntactic
constructions.
The homonymic program is designed to solve the problem of lexical
homonymy within any single part of speech. The program deals principally with
monosyllabic words, since homonymy is not characteristic of binomials.
3. In connection with the adoption of a syntactic standard, which
consists in utilizing syntactic analysis to determine the parts of speech,
the range of application of the supporting program is narrowed to exceptions
to standard cases.
4. Besides utilization of the supporting program, exceptions to
standard cases may be solved by inserting appropriate corrections into the
syntactic program.
6. The supporting program is characterized by:
(a) The ability of individual words to occur in a sentence as a
substantive and a verb.
(b) The fact that such words stand closer to the verb than to the
substantive. Therefore, when used as substantives, they often receive various
grammatical indicators that are peculiar to substantives.
(c) A number of verbs may be brought into the category of substantives
by means of appropriate auxiliary elements.
(d) What has been set forth above explains the impossibility of
accurately indicating in the glossary the part of speech of the words in
question. The part of speech may be indicated only disjunctively.
(e) Determination of the part of speech to which words of the
type in question belong may be made in each specific case with the help of
carriers of grammatical data located in the supporting program.
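The use of two glossaries to settle word boundaries can be illustrated by a short Python sketch that tries the binomial glossary before the glossary of roots. The order of lookup, the glossary entries, and the part-of-speech tags are assumptions made for the example; the abstract states only that both glossaries exist because of the word-boundary problem.

```python
# Toy glossaries; the entries and part-of-speech tags are illustrative assumptions.
BINOMIALS = {("năng", "lượng"): "noun", ("nguyên", "tử"): "noun"}
ROOTS = {"là": "verb", "lớn": "adjective"}

def segment(syllables):
    """Split a syllable sequence into words, trying the binomial glossary first."""
    words, position = [], 0
    while position < len(syllables):
        pair = tuple(syllables[position:position + 2])
        if pair in BINOMIALS:
            words.append((" ".join(pair), BINOMIALS[pair]))
            position += 2
        else:
            syllable = syllables[position]
            words.append((syllable, ROOTS.get(syllable, "unknown")))
            position += 1
    return words
```

A word found in neither glossary is tagged "unknown" here; in the algorithm described above such cases, and words whose part of speech can be indicated only disjunctively, would fall to the supporting and syntactic programs.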
40. WORK ON A JAPANESE-RUSSIAN ALGORITHM
OF MACHINE TRANSLATION
A. A. Babintsev
(Leningrad)
1. Work on a Japanese-Russian algorithm was begun at the end of
December 1957, using atomic energy texts. At this stage analysis of
material is limited to the simple sentence.
2. Due to the fact that no reading devices are available for
ideographic text, the Japanese must be transcribed into Russian before it
is put into the machine.
3. The structure of Japanese--agglutination (substantive and verb
in part) and inflection (verb in part and adjective) with the stress on
agglutination--is responsible for the effectiveness and adequacy of the
standard morphological analysis and determines the primacy of the program
of standard morphological analysis in the set of programs.
4. The set of programs for the Japanese-Russian algorithm at the
present time is as follows:
(1) A program of standard morphological analysis (with referral
to the glossary-"address" and withdrawal therefrom of certain grammatical
information).
(2) A program of standard syntactic analysis (based on a "working
hypothesis").
(3) A program of non-standard syntactic analysis (cases that do
not fit the "working hypothesis").
(4) A homonymic program.
(5) A glossary of idioms.
(6) A synthesizing program.
5. The minimum of information to be derived from text analysis is:
for a substantive--case and, in certain instances, number; for a verb--tense,
voice, mood, finiteness; for an adjective--tense.
6. The "working hypothesis", which is based on the laws of Japanese
sentence structure, in broad outline consists of the following:
(1) The first substantive in the nominative or principal case is
the subject.
(2) The last word before a stop sign is the final predicate; a
verb in non-finite form is the middle predicate.
(3) The direct object immediately precedes the predicate; the
indirect object is found at some distance from the predicate.
(4) A substantive in the genitive case, adjective and verb in
the finite form preceding the substantive are attributes.
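The four rules of the "working hypothesis" can be tried out on a toy sentence. The following is only a minimal sketch in Python, not the Leningrad program: the Word record, the role names, the order in which the rules are applied, and the transliterated example sentence are all assumptions made for illustration, and case particles are assumed to have been absorbed into the case sign by the preceding morphological analysis.

```python
from dataclasses import dataclass
from typing import List, Optional

PREDICATES = ("final predicate", "middle predicate")

@dataclass
class Word:
    text: str
    pos: str                        # "substantive", "verb" or "adjective"
    case: Optional[str] = None      # substantives only
    finite: Optional[bool] = None   # verbs only
    role: Optional[str] = None      # filled in by the hypothesis

def apply_working_hypothesis(words: List[Word]) -> List[Word]:
    """Assign roles to the words of one simple sentence (stop sign excluded)."""
    if not words:
        return words
    # Rule 2: the last word before the stop sign is the final predicate,
    # and a verb in non-finite form is a middle predicate.
    words[-1].role = "final predicate"
    for w in words[:-1]:
        if w.pos == "verb" and w.finite is False:
            w.role = "middle predicate"
    # Rule 1: the first substantive in the nominative case is the subject.
    for w in words:
        if w.role is None and w.pos == "substantive" and w.case == "nominative":
            w.role = "subject"
            break
    # Rule 4: a genitive substantive, an adjective, or a finite verb
    # standing directly before a substantive is an attribute.
    for w, nxt in zip(words, words[1:]):
        if w.role is None and nxt.pos == "substantive":
            if (w.pos == "adjective" or (w.pos == "verb" and w.finite)
                    or (w.pos == "substantive" and w.case == "genitive")):
                w.role = "attribute"
    # Rule 3: a remaining oblique substantive is a direct object if it
    # immediately precedes a predicate, otherwise an indirect object.
    for w, nxt in zip(words, words[1:]):
        if (w.role is None and w.pos == "substantive"
                and w.case not in ("nominative", "genitive")):
            w.role = "direct object" if nxt.role in PREDICATES else "indirect object"
    return words

# Hypothetical input: "gakusha ga genshi no kozo o kenkyu-suru"
sentence = [
    Word("gakusha", "substantive", case="nominative"),
    Word("genshi", "substantive", case="genitive"),
    Word("kozo", "substantive", case="accusative"),
    Word("kenkyu-suru", "verb", finite=True),
]
roles = [w.role for w in apply_working_hypothesis(sentence)]
print(roles)
```

Sentences that such rules mislabel would be the ones left to the program of non-standard syntactic analysis.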
7. We should like to direct attention to one of the numerous problems
that have arisen in connection with our work on the algorithm. After
analyzing a Japanese text, from which information on number can be obtained
only sporadically, it turns out that difficulties due to the inadequacy of
information on grammatical number appear in the synthesizing program during
formation of the output text. A solution to the problem of number in the
synthesizing program is exceptionally important for a number of "oriental"--
Russian algorithms.
41. THE PROGRAMMING OF TRANSLATION FROM
ENGLISH INTO RUSSIAN
G. P. Bagrinovskaya and
G. L. Gavrilova (Moscow)
Program of translation, constituent parts, order of operation.
Arrangement of glossary, difference in coding used in English section
of glossary from coding in French section of glossary. Size of glossary.
Glossary of phrases.
Choice of homonyms, construction of complex index scales and omitted
index scales.
Operation of analysis program ("rolling up" formulas) [FORMULY SVERTKI].
Program of synthesis of structures on the basis of formulas of synthesis.
Morphological treatment of results of synthesis.
Russian part of program of translation from English into Russian
(utilization of programs prepared for Russian part of French-Russian
translation). Agreement in codings.
42. PRINCIPLES IN COMPILING A GERMAN-RUSSIAN
GLOSSARY OF POLYSEMANTS FOR MACHINE TRANSLATION
S. S. Belokrinitskaya (Moscow)
Determination of the meaning of a polysemantic word that is appropriate
in a given context constitutes one of the basic problems in machine trans-
lation.
This problem is being solved by compiling a glossary of polysemants
which will make it possible to obtain the relative meaning of a word by
an analysis of the surrounding context. In most cases it is sufficient
to examine context within the boundaries of a sentence.
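A glossary entry of this kind can be pictured as an ordered list of context tests, each paired with a relative meaning, plus a default. The sketch below, in Python, is an illustration under assumptions and not the author's routine: the entry for the German word "Körper", the trigger words, and the Russian equivalents "pole" (field) and "telo" (body) were chosen here for the example, and the context examined is a single sentence.

```python
from typing import Callable, Dict, List, Tuple

# A test looks at the words of the sentence and the position of the polysemant.
Test = Callable[[List[str], int], bool]

def sentence_has(candidates: set) -> Test:
    """Build a test that succeeds if any word of the sentence is in `candidates`."""
    def test(words: List[str], i: int) -> bool:
        return any(w.lower() in candidates for w in words)
    return test

# Hypothetical glossary of polysemants: word -> (ordered tests, default meaning).
GLOSSARY: Dict[str, Tuple[List[Tuple[Test, str]], str]] = {
    "körper": (
        [(sentence_has({"algebraisch", "erweiterung", "charakteristik"}), "pole")],
        "telo",
    ),
}

def relative_meaning(words: List[str], i: int) -> str:
    """Return the meaning of words[i] selected by the first test that succeeds."""
    tests, default = GLOSSARY[words[i].lower()]
    for test, meaning in tests:
        if test(words, i):
            return meaning
    return default

algebra = "Der Körper K besitzt eine endliche Erweiterung".split()
mechanics = "Der Körper rotiert um die Achse".split()
print(relative_meaning(algebra, 1), relative_meaning(mechanics, 1))
```

Ordering the tests from the most specific to the most general keeps each entry short, which is presumably one reason a shared general rule can replace part of a system of tests.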
A considerable number of words that have multiple meanings in
the usual literary language have but a single meaning in mathematical
texts, and the system of meanings for a number of polysemants is simplified.
However, many German words, even in a mathematics text, have a large number
of relative meanings, the determination of which requires a rather
complicated system of tests.
The most numerous are prepositions and a group of verbs which are
used with separable prefixes and which also form a large number of phrases.
The principal method of determining the relative meaning of a
polysemant is structural--semantic analysis of the surrounding context.
In some cases grammatical forms of the given word or its environment are
also analyzed.
It is possible to isolate certain groups with a monotypic system of
meanings, thereby simplifying the glossary and replacing in some cases the
system of tests (or part of the system) by reference to the appropriate
general rule.
We have also isolated a group of words united according to the principle
of identical effect on the translation of prepositions and some verbs with
extremely many meanings, which likewise permits of simplification of the
routine.
Methods of glossary treatment of different types of idioms and phrases
have been worked out.
The routines of polysemants also contain cases of lexical homonymy that
are not excluded from the system that differentiates between the meanings
of polysemants.
The determination of relative meanings of polysemants by means of the
glossary just described is not free from difficulties (in some cases a
single sentence does not provide sufficient context, the translation of
complex words, etc.). However, these difficulties can, as a rule, be
overcome.
A check of the text shows that a complete satisfactory translation of
the mathematical corpus can be achieved with the help of the above-described
glossary of polysemants.
43. MAIN FEATURES OF THE GLOSSARY AND
GRAMMATICAL
I. K. Bel'skaya (Moscow)
1. The basic components of a system of machine translation from
English to Russian as worked out in the ITM and VT [see No. 2 for
expansion and meaning of abbreviations], Academy of Sciences, USSR, are a
specialized bilingual glossary and three cycles of translation routines:
glossary routines, routine for analysis of input sentence and routine for
synthesis of output sentence.
2. The Anglo-Russian M.T. glossary now available has been designed
for the translation of scientific literature dealing with problems of
applied mathematics: the solution of systems of linear, algebraic, and
transcendental equations, calculation of the proper values of matrices,
approximation of functions by means of polynomials as well as by
trigonometric functions, expansion of functions into series, numerical
differentiation and integration, numerical solution of differential
equations, and other problems of numerical analysis.
The glossary contains 2300 words. Several works by English authors
were used for compilation and checking.
Text checking of the glossary for translation of mathematical
literature yielded satisfactory results. Some 3000 sentences consisting of
more than 100 connected passages from the material of different authors
were used as the corpus.
3. A glossary for the machine translation of scientific literature
may be usefully divided into a series of independent "specialized"
glossaries. Further specialization down to relatively independent fields
within a given science--mathematics, physics, and chemistry--is also
worthwhile.
This division serves two purposes: it reduces the necessary bulk of
the glossary to the completely manageable number of 3000-3500 words and,
even more important, considerably reduces the amount of polysemy.
The structure of the Anglo-Russian glossary for M.T. is such that
its several sections may be expanded independently.
The glossary has two main sections:
I Single-meaning glossary and
II Multiple-meaning glossary.
Each section is divided into two subsections:
Ia - glossary of terms,
Ib - glossary of words in general use,
IIa - glossary of words with complete meaning,
IIb - glossary of auxiliary words.
In size, the multiple-meaning glossary takes up about 1/5 of the entire
glossary which, in this instance, amounts to 458 words.
5. The problem of polysemy is satisfactorily solved by combining
two methods: (a) narrow specialization of a series of glossaries for M.T.
and (b) contextual (functional-semantic) analysis of words in the sentence.
Experience shows that it is virtually unnecessary in scientific and
technological texts to go beyond the "small context" (i.e. one sentence).
6. In order that the lexical analysis of the words be effected
automatically (without human intervention), the M.T. glossary is accompanied
by a series of special glossary routines that make up cycle I in the over-
all system of translation routines.
These include:
1. A routine for obtaining the glossary form of the words,
2. A routine for the grammatical analysis of "unknown words",
3. A routine for the grammatical analysis of "formulas",
4. A routine for distinguishing homonyms,
5. A routine for the analysis of polysemy.
The last routine is the most important from both the theoretical and
the practical points of view.
7. The lexical analysis, which is performed by means of the glossary
and glossary routines, precedes the grammatical analysis and provides it
with the necessary initial information in the form of the so-called
"invariant characteristics" of each "known" word (i.e. entered in the M.T.
glossary) and the syntactic characteristics of all the "unknown" words
(not entered in the M.T. glossary) and the "formulas".
8. The grammatical analysis of input sentences is performed by means
of a series of routines in cycle II in the following order:
1. Analysis of verbs ("verb" routine);
2. Analysis of punctuation marks;
3. Syntactic analysis of sentences: division of sentence into
clauses and more precise definition of parenthetical phrases
in clauses (we define a sentence as that segment of text which
is included between full stops (period, exclamation or
interrogation point); a clause is a simple sentence, i.e. such
that it contains no more than one heterogeneous predicate);
4. Analysis of substantives and numerals;
5. Analysis of adjectives;
6. Modification of word order in the translated sentence.
The "verb" routine is the key routine in the first half of the analysis
of English sentences; however, the syntactic analysis of sentences (routine
3) is the basis of operation for the second half of the analysis and
determines the boundaries of those segments within which the subsequent analysis
is effected.
9. The routines in cycle III use the results of the preceding routines
in such a way that the Russian sentence obtains its grammatical form in
accordance with the rules of Russian grammar.
The synthesis routines go into operation just at the time when the
variant (contextual) grammatical signs for all variable words in the output
sentence are obtained and the steps taken to adjust the word order to
Russian norms.
In the place of the Russian numbers, which represented Russian words
up to this time, Russian equivalents are selected from the glossary, after
which the variable words (verbs, substantives, numerals, and adjectives)
are handled by the synthesis routines: a word ending is changed whenever
the desired word form does not coincide with the dictionary form of the
word.
10. Synthesis routines operate in the following order:
1. Word-forming routine;
2. "Verb" routine;
3. "Adjective" routine;
4. "Substantive" routine.
Changes in the numerals are effected partly in the "substantive"
routine, partly in the "adjective" routine.
The word-forming routine occupies a special place: it provides for
various cases going beyond word changes while inserting the grammatical
signs of the Russian word derived from analysis of the foreign sentence.
44. WORK ON A NORWEGIAN-RUSSIAN ALGORITHM OF
MACHINE TRANSLATION
V. P. Berkov (Leningrad)
I. The projected set of programs is: A. Analytic part: (1)
morphological program; (2) program for distinguishing homonyms; (3) syntactic
program; B. Glossary part: (4) glossary-address; (5) regular glossary;
(6) glossary of phrases and idioms; (7) program for compound words; (8)
prepositional program; (9) program for unification of orthography; C.
Synthesizing part.
II. Two methods of analysis, different in principle, were initially
contemplated:
(a) To begin with a search for words in the glossary;
(b) To begin by extracting grammatical information from the text
before referring to the glossary on the basis of a supporting
program (lists of indisputable endings, word-forming suffixes,
supporting words, etc.). Due to the extensive amount of
grammatical homonymy in Norwegian, the second method seemed
very cumbersome and, in some cases, practically unsound. It
has therefore been rejected.
III. The fact that the functioning of the algorithm--which begins
with a search for the words in the glossary and withdrawal of the
information located there into the operative memory--leads to clogging the
latter with information that is as a rule temporarily superfluous (in
some cases this is general) suggested the idea of creating typical sets
of information.
IV. Programs (1) and (2) are now (beginning of March 1958) ready in
rough form. An ending obtained by stripping the dictionary stem from the
text form of the word is compared with the list of endings; if the given
word has a single grammatical meaning, an information suffix is attached
to it and no further action is taken on the word at this stage. Cases of
grammatical homonymy are handled by a series of special programs (2).
On extracting all the grammatical information from the text, linear
transfers of words are made in order to impart a standard appearance to the
items derived by "unrolling" [RAZV...TE]; this is done by a part of
program (3).
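The ending comparison described in thesis IV may be sketched as follows. This Python fragment is a sketch under assumptions rather than the program itself: the table of endings is a tiny illustrative fragment, the labels for grammatical information are invented here, and the stem is assumed to have been found in the glossary already.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical fragment of the list of endings: ending -> grammatical meanings.
ENDINGS: Dict[str, List[str]] = {
    "": ["dictionary form"],
    "en": ["substantive, definite singular"],
    "ene": ["substantive, definite plural"],
    "et": ["substantive, definite singular", "verb, past tense"],
}

Result = Tuple[str, Union[str, List[str]]]

def analyze(text_form: str, stem: str) -> Result:
    """Strip the dictionary stem from the text form and classify the ending."""
    if not text_form.startswith(stem):
        raise ValueError("stem does not match the text form")
    ending = text_form[len(stem):]
    meanings = ENDINGS.get(ending)
    if meanings is None:
        return ("unknown ending", ending)
    if len(meanings) == 1:
        # Single grammatical meaning: attach the information and stop here.
        return ("resolved", meanings[0])
    # Grammatical homonymy: leave the decision to the special programs (2).
    return ("homonymy", meanings)

print(analyze("bilen", "bil"))
print(analyze("arbeidet", "arbeid"))
```

Only the second call is passed on for further work, which matches the remark that no further action is taken at this stage on a word with a single grammatical meaning.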
V. The program for the unification of orthography is the specifically
Norwegian part of the algorithm. The need for this program is dictated
by the considerable amount of inconsistency in Norwegian orthography, even
in scientific texts; without the program, the glossary would necessarily
be overloaded with many pairs of words.
VI. The program for unification of orthography will be used as a
basis on which to construct an adjusting program in connection with the
use of this algorithm for Danish.
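The abstract does not say how the unification of orthography is carried out. One possible reading, given here only as a sketch in Python, is a normalization step applied to each text word before the glossary search; the letter rule and the three variant pairs are illustrative assumptions chosen for this example, not data from the report.

```python
from typing import Dict, List, Tuple

# Hypothetical letter-level rules, applied first (older "aa" for "å").
LETTER_RULES: List[Tuple[str, str]] = [("aa", "å")]

# Hypothetical word-level variant pairs: variant spelling -> glossary spelling.
VARIANTS: Dict[str, str] = {
    "efter": "etter",
    "sprog": "språk",
    "nu": "nå",
}

def unify(word: str) -> str:
    """Reduce a text word to the single spelling kept in the glossary."""
    word = word.lower()
    for old, new in LETTER_RULES:
        word = word.replace(old, new)
    return VARIANTS.get(word, word)

print([unify(w) for w in ["Efter", "sprog", "paa", "bil"]])
```

With such a step the glossary needs one entry per word instead of one per spelling, and an adjusting program for Danish could in principle be built by extending the same two tables.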
45. GLOSSARY STRUCTURE AND INFORMATION CODING
I. L. Bratchikov, S. Ya. Fitialov,
and G. S. Tseitin (Leningrad)
1. Consideration is being given to the problem of introducing a
glossary on tape into the machine to search for coincidences in the event
that the glossary does not fit into the operational memory.
2. A glossary structure is proposed that will accelerate searching
and decrease the size of the indispensable portion of the memory for the
size of the glossary under consideration.
3. The previously suggested process of "rolling up the codes"
[SVERTYVANIYA KODOV] is now in use. The rolled up code is directly
utilized to obtain the address of information on words in the glossary.
We have provided for cases of coincidences of addresses thus obtained
("rolling up" [SVERTOCHNAYA] homonymy) by differentiating routines
included in dictionary compartments, the addresses of which are not
addresses of the words.
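In present-day terms, "rolling up" a code to obtain an address resembles hashing, and a coincidence of addresses is a collision. The Python sketch below illustrates that idea only: the folding formula, the table size, and the handling of collisions by a list of (word, information) pairs compared in turn are assumptions, whereas the authors place differentiating routines in separate dictionary compartments.

```python
from typing import Dict, List, Optional, Tuple

Compartment = List[Tuple[str, str]]

def roll_up(word: str, size: int) -> int:
    """Fold the character codes of a word into an address (assumed scheme)."""
    address = 0
    for ch in word:
        address = (address * 31 + ord(ch)) % size
    return address

def build_glossary(entries: Dict[str, str], size: int) -> List[Compartment]:
    """Place each word's information in the compartment given by its address."""
    compartments: List[Compartment] = [[] for _ in range(size)]
    for word, info in entries.items():
        compartments[roll_up(word, size)].append((word, info))
    return compartments

def look_up(compartments: List[Compartment], word: str) -> Optional[str]:
    """Go straight to the rolled-up address, then differentiate coincidences."""
    for stored, info in compartments[roll_up(word, len(compartments))]:
        if stored == word:
            return info
    return None

entries = {"bil": "car", "hus": "house", "bok": "book",
           "tall": "number", "ligning": "equation", "rot": "root"}
table = build_glossary(entries, size=4)   # 6 words, 4 addresses: coincidences are certain
print(look_up(table, "ligning"), look_up(table, "matrise"))
```

The lookup inspects one compartment instead of searching the whole glossary, which is the acceleration the proposed structure is aiming at.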
4. Theoretical probability considerations have enabled us to obtain
results which, based on the given number of words in the glossary and the
volume of lexical information, make it possible to estimate the necessary
size of the memory to accommodate the glossary.
5. Methods are also suggested for programming certain operators
encountered in the algorithms of machine translation.
46. GENDER AS A SUPERFLUOUS CATEGORY OF
V. N. Vinogradova (Moscow)
It is very important for machine translation to discover the
grammatical categories of a language concerning which there is no need to
give information insofar as translation can be effected without taking
them into account. Certain general considerations suggest that gender
in the Russian verb--an uncharacteristic phenomenon expressed only
in -l forms, the singular of the past tense and of the conditional mood--
is one of these categories.
We tested this assumption on a mathematical text [I. G. Petrovskii,
Discourses on the Theory of Differential Equations, 1954], where the
number of verbs with gender expressed turns out to constitute only 4%
(93) of the total number of verbs (1970). We then selected linguistic
(history of language) [A. Shakhmatov, Historic Morphology of the
Russian Language, 195?, pages 9-61] and historic [Grekov, Kievan
Russia, State Publishing House of Political Literature, 1953, pages ...]
texts in order to have a large number of diversified examples and
found that [...] of verbs
used. It appears that in most sentences the verb may be related only to
the subject--a single substantive in the nominative case. Doubts may arise
only in the case of transitive verbs where there is an object in the
accusative case that coincides in form with the nominative, of the type:
"Equation (6) yielded the general integral of this equation over the
entire surface except for the start of the coordinate" [Uravneniye (6)
davalo obshchii integral etovo uravneniya vo vsei ploskosti za isklyucheniyem
nachala koordinat]. Since we have a grammatical indication for both the
subject and the object purely in the past tense, and even then only when
the gender of the noun-subject differs from that of the noun-object, there
remains no other way of determining which is which than the word order of
the sentence: the rule that the subject comes first holds in the
overwhelming majority of cases. A rearrangement is, of course, possible for
the sake of logical emphasis, e.g.: "in the Russian language preponderance
has received the accent of the nominative plural" [V russkom yazyke
pereves poluchilo udareniye imenitel'novo mnozhestvennovo]. The phrase
poluchit' pereves ["to receive the preponderance", "to predominate"] will
evidently have to be listed in the glossary as a phrase combination.
It is possible to conceive of more complicated cases (we didn't
find any examples, but we paraphrase one of the sentences of the type
described above): "Change ... caused a shift of e to o before a hard
consonant" [Izmeneniye ... vyzval perekhod e v o pered tverdoi soglasnoi].
Such a sentence is almost impossible with a predicate in the present tense
(or is very badly written: "Izmeneniye ... vyzyvayet perekhod ..." will
clearly be misunderstood); even in the past tense it is awkward. Apparently,
rare instances of this kind will be edited; so too the following case in a
complex sentence: "The bishop asserted that his church land went along the
Lisichii ford, which was in the time of Prince Yuri" [Yepiskop utverzhdal,
chto yevo tserkovnaya zemlya idet po Lisichii brod, chto byl pri knyaze Yurii].
In the absence of information on the gender of the verb byt', it is
impossible to determine whether the last clause modifies brod "ford" (brod,
chto byl ... = kotoryi byl ..., an obsolete meaning, according to Ushakov's
Tolkovyi Slovar' [Dictionary]) or is a subordinate conjunctive clause
relating to the entire preceding clause. This ambiguity cannot be resolved
here by formal signs.
With the exception of the last example, the texts studied did not con-
tain a single instance where the lack of information on the gender of verbs
would have resulted in confusion. This permits the conclusion that as
far as machine translation is concerned gender in the Russian verb may
well be ignored.
47. THE SYNTHESIS OF RUSSIAN VERB FORMS IN MACHINE TRANSLATION
Z. M. Volotskaya (Moscow)
1. For the synthesis of Russian verb forms in machine translation it
is proposed to list in the glossary of stems only the stem of the
imperfective aspect of each verb. All the forms of the present, past, and
future tense, perfective and imperfective aspect (personal as well as
impersonal) are formed from this stem in accordance with definite rules.
2. It is suggested that three types of operation are sufficient to
make all possible verbal forms from the single stem: (a) discarding the
final letter or letters, (b) adding a letter or letters to the stem on the
right, and (c) adding a letter or letters to the stem on the left.
All the individual letters and combinations of letters which are
joined to the stem on the left and on the right are assigned by a list
and arranged in tables in accordance with a definite system.
3. All the verbs are classified in three groups depending on the
method of producing: (a) the forms of the present tense, (b) the forms
of the past tense, and (c) the stem of the perfective aspect from the
stem of the imperfective aspect.
By class of verbs we mean the total number of verbs that construct
a given form in the same way.
4. The information for each verb stem contains the class number of
the stem, which indicates the way in which a given form is to be
constructed.
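The three operations of thesis 2 and the class numbers of theses 3 and 4 can be combined in a small table-driven routine. The following Python sketch rests on assumptions: the transliterated stems "chita" and "risova", their class numbers, and the five rules are illustrative entries invented here, and only one form per group is shown.

```python
from typing import Dict, Tuple

# A rule is (letters to discard at the end, letters added left, letters added right).
Rule = Tuple[int, str, str]

# Hypothetical rule tables, indexed by (form, class number).
RULES: Dict[Tuple[str, int], Rule] = {
    ("present 1sg", 1): (0, "", "yu"),
    ("present 1sg", 2): (3, "", "uyu"),
    ("past masc sg", 1): (0, "", "l"),
    ("perfective stem", 1): (0, "pro", ""),
    ("perfective stem", 2): (0, "na", ""),
}

# Hypothetical glossary of imperfective stems with a class number per group.
STEMS: Dict[str, Dict[str, int]] = {
    "chita": {"present 1sg": 1, "past masc sg": 1, "perfective stem": 1},
    "risova": {"present 1sg": 2, "past masc sg": 1, "perfective stem": 2},
}

def apply_rule(stem: str, rule: Rule) -> str:
    """Discard final letters, then add letters on the left and on the right."""
    discard, left, right = rule
    base = stem[:len(stem) - discard] if discard else stem
    return left + base + right

def synthesize(stem: str, form: str) -> str:
    """Build the requested form from the single stem listed in the glossary."""
    return apply_rule(stem, RULES[(form, STEMS[stem][form])])

for stem in STEMS:
    print(stem, [synthesize(stem, form) for form in STEMS[stem]])
```

Adding a verb then costs one glossary line of class numbers, and a new rule is needed only when a verb forms something in a way no listed class covers.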
48. RUSSIAN SYNTAGMAS
(on the basis of mathematical texts)
Z. M. Volotskaya, Ye. V. Paducheva,
I. N. Shelimova, and A. L. Shumilina
(Moscow)
1. This report discusses the basic types of two-word combinations in
subordinate relationship (syntagmas) as found in mathematical texts and
by means of which it is possible to construct the rules of formal text
analysis (for machine translation).
2. The syntagmas were based on specific word combinations drawn from
the texts.
Syntagmas are considered to differ from each other in type of
syntactic relations between their component parts. Therefore, not all the
morphological and syntactic signs of the words that form the given
combinations served as criteria for relating these combinations to the
various syntagmas.
3. A syntagma consists of two components, "governing" and "governed",
each of which is accompanied in the list of syntagmas by certain information.
As a rule, the "syntactic group" is the essential information for the
"governing", the "morphological form" for the "governed" component.
4. Words are divided into "syntactic groups" on the basis of the
following principle of marking words according to the sign of a common
syntactic connection: first, those words which have a single common
syntactic connection are separated from the mass of words into one group;
then, those words which have another syntactic connection are separated
from the same mass, etc. The same words may fall into different groups
which consequently appear to be crossing each other.
The separation of syntactic groups not only according to one but
according to a combination of signs should lead to a significant increase
in the number of syntactic groups and correspondingly, in the number of
syntagmas.
5. The report includes a list of syntagmas, description, and
discussion of possible ways of using them in text analysis.
49. SYNTHESIS OF THE RUSSIAN CLAUSE
Z. M. Volotskaya and A. L. Shumilina
(Moscow)
1. Sentence synthesis in machine translation consists of combining
words into clauses and clauses into sentences according to the requirements
for sentence building in a given language.
2. The aggregate of syntagmas in each sentence that are obtained by
analyzing the language from which a translation is made does not constitute
an adequate basis for synthesizing sentences of the language into which
the translation is made. Correspondences must be established between the
languages in question not only on the syntagmatic level but also on the
sentence level.
3. A clause is synthesized by inserting a syntagma, i.e., one syntagma
as it were overlays and draws into itself another.
4. Each word in the clause of the output language obtains, in addition
to the information necessary for translation (number of stem in the output
language, number, tense, etc.), the following signs: (a) number of the
syntagma into which it is entering as a governing word (by the first
method, cf. below) or as a governed word (by the second method); (b)
ordinal numbers of the words (from the input language) with which the given
word forms syntagmas.
In combining words into clauses it is more convenient to use the ordinal
numbers of words from the sentence of the input language and not the numbers
of the output stems because using only the latter might lead to mistakes
inasmuch as the sentence may contain several identical lexemes or different
ones, but with the same stem.
5. There are two possible ways of synthesizing a clause by means of
syntagmas:
(a) Isolating the pivotal syntagmas (predicatives) and successively
expanding each component at the expense of the governed words.
(b) Synthesizing a clause by successively combining syntagmas until
they are reduced to the predicative. Moreover, each syntagma enters as a
single group into a higher rank syntagma as a governed, expanded component.
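Method (a) can be illustrated on a five-word clause. The Python sketch below is a simplified reading and not the authors' routine: a syntagma is reduced to a pair of ordinal numbers (governing word, governed word), the predicative is given separately as (subject, predicate), the syntagmas are assumed to form a tree, and the example clause is invented here.

```python
from typing import List, Tuple

Syntagma = Tuple[int, int]   # (ordinal of governing word, ordinal of governed word)

def expand(head: int, syntagmas: List[Syntagma]) -> List[int]:
    """Collect a word and everything it governs, directly or indirectly."""
    group = [head]
    for governing, governed in syntagmas:
        if governing == head:
            group.extend(expand(governed, syntagmas))
    return sorted(group)

def synthesize_clause(predicative: Tuple[int, int],
                      syntagmas: List[Syntagma]) -> Tuple[List[int], List[int]]:
    """Start from the pivotal syntagma and expand each of its components."""
    subject, predicate = predicative
    return expand(subject, syntagmas), expand(predicate, syntagmas)

# Hypothetical input clause, words numbered in input order:
# 1 given, 2 function, 3 satisfies, 4 this, 5 equation
syntagmas = [(2, 1), (3, 5), (5, 4)]
subject_group, predicate_group = synthesize_clause((2, 3), syntagmas)
print(subject_group, predicate_group)
```

The groups are keyed by the ordinal numbers of the input words, as thesis 4 recommends, so two occurrences of the same lexeme in one sentence are never confused.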
50. GRAMMATICAL ANALYSIS FOR MACHINE TRANSLATION
OF CHINESE INTO RUSSIAN
V. A. Voronin (Moscow)
The system of grammatical analysis for machine translation of Chinese
into Russian was based on materials from contemporary scientific and
technological texts in mathematics, electrical engineering and construction.
It utilized the fundamental works of Soviet and Chinese authors on the
modern Chinese language. The system was tested on mathematics articles from
the Chinese periodicals Shusyue syuebao (Mathematics Herald) and Shusyue
tsin'chzhan' (Successes of the Mathematical Sciences). In constructing the
system we did not have the task of solving the extensive and manifold
grammatical problems connected with machine translation of literary and
socio-political literature. However, we did take cognizance of
grammatical phenomena characteristic of Chinese as a whole.
Treatment of the Chinese sentence according to the system of
grammatical analysis starts after operation of the glossary and glossary
supplement is completed, as a result of which words in the sentence enter
the system with concrete relevant meaning and complete lexical characteris-
tics, i.e. with the set of necessary signs.
The special grammatical structure of Chinese possesses an extremely
small number of formal means by which one can identify the full morpholog-
ical properties of the Russian equivalent for the Chinese word within a
given lexical unit. Therefore, a Chinese sentence cannot be processed for
machine translation without an analysis of the syntactic structure of the
sentence to be translated, which was predetermined by the general principles
underlying the system.
The system, operated in the form of routines, consists of two main
parts: (1) syntactic analysis of sentences, and (2) production of the mor-
phological characteristics of the Russian equivalent. The entire system
includes 9 interrelated, successively functioning routines.
The first part has 4 routines in which the following stages of
syntactic analysis are effected in corresponding order:
(1) Breakdown of the input sentence into simple clauses.
(2) Separation of attribute + attributed word groups.
(3) and (4). Separation of other (than attributive) syntactic
components of the clause.
The second part of the system has 5 routines of which 4, on the basis
of existing syntactic signs, produce the morphological characteristics
for the Russian equivalents of all the words in the Chinese sentence. The
classes of words mentioned below are handled in the order given:
(1) Numeral,
(2) Substantive
(3) Verb
(4) Adjective
The operation of the fifth routine consists of changing Chinese word
order in accordance with the norms of Russian word order.
The system as a whole comes down to producing the formal signs that
reflect in the first part the syntactic function of the word and in the
second part the morphological features of the Russian equivalent of the
Chinese word.
An adequate, readable translation is ensured by performing a combined
lexico--grammatical analysis of the Chinese text put into the machine.
51. APPLICATION OF MACHINE TRANSLATION METHODS TO
OF TELEGRAPHIC
V. I. Grigor'yev and G. G. Belonogov (Moscow)
1. Men have been searching from ancient times for the most effective
utilization of the channels of communication. Up to now the main efforts
of engineers and communications experts have been aimed at perfecting the
communication channel proper and at seeking ways of transforming the
signal so as to secure the maximum suitability of the signal to the given
channel. The contents of communications meanwhile remained unchanged.
However, the possibilities have now for the most part become exhausted so
that the problem of finding means of reducing the size of messages
transmitted is becoming increasingly urgent.
2. The size of a telegram may be shortened 3-4 times if a lexical
code is used instead of a literal code. A telegraphic communication
that uses lexical coding differs from an ordinary printed letter
communication only in that it sends not code groups designating letters of
the alphabet, but a code combination designating the ordinal number of
the word according to the dictionary in the memory device plus certain
items containing grammatical information about the word transmitted.
3. The principle of lexical coding of messages has been known
since ancient times. It is employed in various kinds of signal tables, in
the international radio code, and elsewhere. However, in all these cases
coding is done manually, requiring great effort and considerable
expenditure of time. The development of computer technology has now made possible
automatization of the process of lexical coding and its wide use in
communications.
4. Lexical coding is based on an analysis of the message at the
transmitting end and its subsequent synthesis at the reception end of the
line of communication. This lexical analysis and synthesis of a message
is essentially a simplified form of the analysis and synthesis of a text
produced by machine translation. It is therefore worthwhile, when
preparing an algorithm for lexical coding, to make full use of the method of
text analysis and synthesis used for machine translation.
5. Lexical coding has, in addition, several peculiarities. Text
analysis and synthesis in the case of machine translation is aimed at
securing the operation of hieroglyphic conversion--a basic operation in
machine translation. Elimination of hieroglyphic conversion would
lead to considerable simplification of the routines of analysis and
synthesis in the case of lexical coding. On the other hand, with lexical
coding the demand for code economy is pushed to the foreground, whereas it
is of purely secondary significance as far as machine translation is
concerned. Lexical coding must rest to a large degree on speech statistics.
In particular, due to the interlinking of the analyzer with the devices of
the channel of communication, the dictionary cannot conveniently be made
as large. Available statistics permit limitation of the dictionary of the
lexical analyzer to a maximum of 4000 words in ordinary use, which generally
make up 97.6% of a literary text. Rare words not found in the dictionary
may be transmitted letter-by-letter.
6. Application of the principles of lexical coding to telephonic
communication may help greatly in solving the problem of achieving maximum
compression.
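
The coding scheme of points 2 and 5 can be sketched roughly as follows. This is only an illustration under stated assumptions: the six-word DICTIONARY, the "W"/"L" markers, and the function names are hypothetical, and the grammatical information that the abstract says accompanies each word number is omitted.

```python
from typing import List, Tuple, Union

# Hypothetical toy dictionary; the abstract speaks of up to about 4000 words.
DICTIONARY = ["the", "train", "arrives", "at", "noon", "tomorrow"]
WORD_TO_INDEX = {word: index for index, word in enumerate(DICTIONARY)}

# ("W", n) is the ordinal number of a dictionary word;
# ("L", s) is a rare word sent letter-by-letter.
Code = Tuple[str, Union[int, str]]


def encode(message: str) -> List[Code]:
    """Replace each dictionary word by its number; spell out the rest."""
    codes: List[Code] = []
    for word in message.lower().split():
        if word in WORD_TO_INDEX:
            codes.append(("W", WORD_TO_INDEX[word]))
        else:
            codes.append(("L", word))
    return codes


def decode(codes: List[Code]) -> str:
    """Rebuild the message at the receiving end from the same dictionary."""
    words = [DICTIONARY[value] if kind == "W" else value for kind, value in codes]
    return " ".join(words)
```

Both ends of the line must hold the same dictionary; a word absent from it, such as a place name, simply travels as letters.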
52. SOME PROBLEMS IN MACHINE TRANSLATION FROM
JAPANESE INTO RUSSIAN
M. B. Yefimov (Moscow)
The purpose of this communication is to set forth some principles
involved in analyzing Japanese sentences for machine translation, the
principles being characteristic of the Japanese language alone.
A. The primary problem with which we have to deal in analyzing a
Japanese sentence is its division into separate words. This is typical
chiefly of languages with an ideographic form of script (Japanese, Chinese,
etc.). The fact is that words are not separated in a written Japanese
text and, consequently, identification of their role in a sentence is quite
difficult.
We shall try to show in this report how we made the division in our
work.
We began with the fact that the Japanese script uses the signs of a
syllabary (kana) along with ideograms.
Thus, the division of a Japanese sentence into separate words breaks
down into 3 main steps:
(1) Analysis of portions of sentences containing both ideograms
and syllabary.
(2) Analysis of ideographic part.
(3) Analysis of syllabary part.
This operation is closely linked to the operation of the existing
Japanese glossary and is, so to speak, one of its parts.
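
A preliminary to the three steps above, separating the ideographic and syllabary portions of a sentence, might be sketched as follows. This is an assumption-laden illustration rather than the author's procedure: the function names are hypothetical, and the script of each character is judged from its Unicode block, a device that postdates the report.

```python
from itertools import groupby
from typing import List, Tuple


def script_of(ch: str) -> str:
    """Classify one character by its Unicode block."""
    code = ord(ch)
    if 0x4E00 <= code <= 0x9FFF:   # CJK Unified Ideographs
        return "ideogram"
    if 0x3040 <= code <= 0x30FF:   # Hiragana and Katakana
        return "kana"
    return "other"


def split_by_script(sentence: str) -> List[Tuple[str, str]]:
    """Cut an unspaced sentence into runs of ideograms and runs of kana."""
    return [(kind, "".join(chars)) for kind, chars in groupby(sentence, key=script_of)]
```

The runs are not yet words. Each run would still have to be matched against the glossary, which is why the abstract treats division as part of the glossary operation.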
B. Breaking down a sentence into its individual clauses is no less
important a problem in Japano-Russian translation and has both practical
and theoretical interest.
In this work we are relying chiefly on the rigid structure of the
Japanese sentence in which either a verb or a predicate adjective always
stands at the end. This enables us infallibly to determine the end of the
sentence.
The beginning of the sentence is determined by searching for the
subject.
Thus, the entire operation consists of two steps:
1. Determination of the end of the sentence, and
2. Determination of the beginning of the sentence.
C. As is true of all languages, the verb constitutes the greatest
difficulty in translating from Japanese into Russian.
The strongly developed affixation that is characteristic of Japanese
is most clearly marked in the verb.
This determined the cyclical nature of our operation.
We used the fundamental rules of traditional grammar for the analysis
of verb endings, relying mainly on the five stems of the Japanese verb.
We have been successful in establishing the necessary grammatical
and syntactic criteria for all verbs.
53. WORK ON THE RUSSO-ENGLISH ALGORITHM OF
L. N. Zasorina (Leningrad)
1. Limitation of problems and scope of work. Choice of mathematical
text as being most limited in stylistic peculiarities.
Determination of set of programs for Russo-English algorithm.
Exclusion of program of differentiating homonyms due to synthetic structure
of Russian. Simultaneous work on glossary and morphology program.
2. Combined investigation of short text. Compilation of glossary in
which the grammatical form and syntactic relations of the words are
registered. Recording of statistical data.
3. Investigation of individual parts of speech, division of words into
classes, and preliminary detection of homonymy between the parts of speech.
4. Verb and grammatical information derived from personal forms and
nominal forms. Homonymy of participles and adjectives distinguished by
taking into account suffixes of full and short forms of participles.
Lack of formal-graphic separation of auxiliary and modal verbs from the
verb class.
Adjective class comprising adjectives, adverbs in -o, -e, -ski,
ordinal numerals, words in the status category. Arrangement in
non-specified subclasses. The substantive class, including nouns,
substantivized words and cardinal numerals (other than odin [one], dva [two],
tri [three], chetyre [four]), is distinguished by the abundance of homonymic
case forms: intraclass homonymy and interclass homonymy. Separation of
non-specified subclasses. Triliteral word class. Class of invariable words
is characterized by negative separability in the text.
4. Advisability of introducing stem-stripping program. Planning of
groups of commands for the individual classes. Many-sided investigation
of homonymic coincidences of separable affixes.
5. Problems connected with differentiating grammatical data derived
from homonymic affixes. Tables of separable, restrictive lists of letters
that precede the separable affixes. Successive separation of affixes from
stem (endings and form-constructing suffixes) and storage of grammatical
information derived.
Table for verifying matching of preliminary information obtained
from affixes and stem glossary.
Method of multistage depositing of grammatical information derived
from the glossary and stem-stripping program.
Attempt at dividing grammatical data into two non-crossing fields
to reduce the number of tests of possible grammatical forms.
6. Compilation of stem glossary. Determination of general size
and limits of glossary. "Lexical article" plan, taking into account
input and output information and list of possible forms.
Obtaining pseudostems.
Problems in contracting the glossary by separating word-building
suffixes and prefixes.
7. General routine for processing words: stem-stripping program,
stem glossary, morphology program. Obtaining input information for
the syntactic program.
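
The interplay of affix table, stem glossary, and homonymic endings outlined in points 4-7 might look roughly like this. It is a minimal sketch under stated assumptions: the ENDINGS table, the two-entry STEM_GLOSSARY, and the transliterated word forms are hypothetical toy data, not the Leningrad group's tables.

```python
from typing import Dict, List, Tuple

# Hypothetical table of separable endings and the grammatical readings
# each one allows; "y" is homonymic (two possible readings).
ENDINGS: Dict[str, List[str]] = {
    "ami": ["instrumental plural"],
    "y": ["genitive singular", "nominative plural"],
    "a": ["nominative singular"],
}

# Hypothetical stem glossary (transliterated Russian stems).
STEM_GLOSSARY: Dict[str, str] = {"teorem": "theorem", "formul": "formula"}


def strip_stem(word: str) -> List[Tuple[str, str, List[str]]]:
    """Separate an ending and keep the split only if the stem is in the glossary."""
    analyses = []
    # Longer endings are tried first so that "ami" is tested before "a".
    for ending in sorted(ENDINGS, key=len, reverse=True):
        if word.endswith(ending):
            stem = word[: len(word) - len(ending)]
            if stem in STEM_GLOSSARY:
                analyses.append((stem, ending, ENDINGS[ending]))
    return analyses
```

A homonymic ending leaves more than one reading attached to the stem, and these are the cases that later routines would have to resolve.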
54. WORK ON A HINDUSTANI (HINDI) - RUSSIAN ALGORITHM
T. Ye. Katenina (Leningrad)
1. The development of a Hindi-Russian algorithm is very important
for similar work in the field of Indian languages--both Indo-Aryan and
Dravidian. The structure of Hindustani is in the main analytical,
although the traces of ancient inflection and agglutinative elements--a
new synthesis--play a definite role. The scientific style of Hindi prose is
characterized by a more or less definite word order close to that of the
Dravidian languages. Numerous phrases containing non-conjugated verb
forms, equivalent to subordinate clauses, constitute the main difficulty
for machine translation. Scientific texts are characterized by an abun-
dance of Sanskritisms which are frequently translated loan words of in-
ternational (European) terms.
2. Hindi writing, phonetic for the most part, is therefore especially
convenient for an electric reading device. To record texts we worked out
a mechanical transcription based on the Russian alphabet without complicated
signs and diacritics. In addition statistics justified our combining
several Hindustani sounds.
3. The set of programs for machine translation is as follows:
(1) glossary of stems; (2) morphology program; (3) postposition program;
(4) syntactic program; (5) program for differentiating homonyms; (6) list
of idioms; (7) a translation program for compound words, which may be
required for some kinds of scientific texts.
4. In order to avoid superfluous information we adopted the following
hypothesis for the syntactic analysis of a simple sentence: (1) the first
noun substantive in a direct or active case is the subject; (2) the verb
in the last place in the sentence is the predicate; (3) if the verb is not
a copula, the noun substantive in the next-to-last place in the sentence
with the postposition ko or in the direct case (not the subject) is the
direct object.
We have determined the necessary minimum of morphological information
(which, however, requires statistical confirmation in individual cases)
to be: (1) for the noun substantive--number, case (direct, active,
indirect); (2) for the nominal adjective--number (may be important to
determine the number of noun substantives with zero ending of direct case
plural number); (3) for the verb--tense, mood, number (to determine the
number of the same noun substantives), voice. A check of the text showed
that the overwhelming majority of simple sentences as well as the
constituent parts of complex sentences may be analyzed in accordance with
these rules.
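
The three-part hypothesis of point 4 can be expressed as a short routine. This is a sketch only: the Token fields, the function name, and the transliterated example words in the usage note are assumptions introduced for illustration, and morphological analysis is taken as already done.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Token:
    form: str
    pos: str                             # "noun", "verb", ...
    case: Optional[str] = None           # "direct", "active", "indirect"
    postposition: Optional[str] = None   # e.g. "ko"
    copula: bool = False


def analyze(tokens: List[Token]) -> Dict[str, Optional[str]]:
    """Assign subject, predicate and direct object by the three rules."""
    roles: Dict[str, Optional[str]] = {"subject": None, "predicate": None, "object": None}
    # Rule 1: the first noun in a direct or active case is the subject.
    subject = next(
        (t for t in tokens if t.pos == "noun" and t.case in ("direct", "active")),
        None,
    )
    if subject is not None:
        roles["subject"] = subject.form
    # Rule 2: the verb in the last place is the predicate.
    last = tokens[-1] if tokens else None
    if last is not None and last.pos == "verb":
        roles["predicate"] = last.form
        # Rule 3: unless the verb is a copula, the noun in next-to-last place
        # with postposition "ko" or in the direct case is the direct object.
        if not last.copula and len(tokens) >= 2:
            candidate = tokens[-2]
            if (candidate.pos == "noun" and candidate is not subject
                    and (candidate.postposition == "ko" or candidate.case == "direct")):
                roles["object"] = candidate.form
    return roles
```

For a sentence tokenized as larka / kitab / parhta hai (roughly "the boy reads a book", with the verb complex treated as one token), the routine returns larka as subject, parhta hai as predicate, and kitab as direct object.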
6. Among the basic problems requiring a solution for subsequent
work in constructing a Hindustani-Russian algorithm are: (1) elucidation
of rules for analyzing complex sentences and equivalent phrases with
non-conjugated verb forms, (2) clarification on a statistical basis of the
need to design a program analyzing compound words that would be compulsory
for all kinds of texts.
55. AN ALGORITHM FOR TRANSLATING ENGLISH
TMUI-U 0 E I
K. Y. Komissarova (Gorki)
The translation rules and glossary have been worked out with regard
for the characteristics of English texts dealing with radio engineering.
The translation process is divided into 2 main parts: analysis of
English sentences and synthesis of Russian sentences.
Analysis of an English text is based on a syntactic analysis of the
sentence. The grammatical function of a word is determined by morphological
and syntactic analysis according to rules grouped by the parts of speech.
The glossary contains more than 500 words in general use and specialized
technical terms.
56. AUTOMATIZATION OF TRANSLATION PROGRAMMING
O. S. Kulagina (Moscow)
1. Long, tedious process of constructing translation programs causes
need to automatize programming. Requirements of translation programs and
impossibility of using existing programming programs. Formulation of
problem of automatizing translation programming.
2. Breakdown of translation algorithms into operators. Types of
operators and functions of each. Parameters of operators.
3. Preparation of translation algorithm for translation: recording
of algorithm in the form of sequence of simple rules, transition from this
recording to operator recording, automatic construction of translation
program according to operator recording of algorithm by means of compiling
program.
4. Compiling program, its structure. Some features of structure of
programs obtained by the method described.
57. A FRENCH-RUSSIAN TRANSLATION ALGORITHM
O. S. Kulagina (Moscow)
(1) Formulation of problem: translation of mathematical texts. Demands
for quality in translation, cases requiring editing.
(2) Structure of glossary for machine translation: features. Glossary
information and purpose. Glossary of phrases.
(3) Principles in constructing translation algorithm. Structure of
algorithm and order of operation. Word look-up in glossary. Treatment
of phrases. Differentiation of homonyms and analysis of polysemants: order
of operation of rules for differentiating homonyms. Analysis of French
sentence: problems. Sequence of handling parts of speech during analysis.
Character of information obtained through analysis. Change of word order
in translation. Synthesis of Russian sentence: order of operation of
synthesizing rules and how they differ from analyzing rules.
(4) Supplementing and correcting algorithm on the basis of experimental
translations (greater precision in rules for differentiating homonyms,
change in handling of adjectives, separation of morphological from syntactic
analysis).
58. DETERMINATION OF SYNTACTIC CONNECTIONS FOR FORMULAS
IN RUSSIAN WgWf= TNT$
M. M. Langleben (Moscow)
1. We call "formulas" all text elements not found in a mechanical
glossary in processing a text (surnames, mathematical formulas, foreign
references, neologisms, etc.). "Formulas", like words to be translated,
require the ascertaining of syntactic connections in the text to be
analyzed, i.e., the identification of formulas that form part of one of
the previously given syntagmas.
2. The analysis of a "formula" is broken down into 2 parts:
(A) testing the formula proper for the presence of any word-changing
suffixes, the sequence of tests being determined by
frequency of the cases.
(B) analysis of its environment (words and punctuation marks).
This begins only after all the "formulas" contained in the
given segment of text have passed through part A.
3. The following order of ascertaining the possible syntactic
connections for "formulas" is advisable in that it eliminates the
possibility of establishing false syntagmas:
(a) the formula acts as an adjective for a substantive standing
on the right;
(b) the formula is a name with a substantive standing on the left;
(c) the formula forms part of a prepositional phrase;
(d) the formula forms part of a syntagma with an adjective requiring
the dative case (RAVNYI [equal], KRATNYI [multiple]);
(e) the formula forms part of a syntagma with an adjective in the
comparative degree replacing a substantive in the genitive case;
(f) the formula replaces a governing substantive in an "adjective +
substantive" syntagma;
(g) the formula acts as a predicative combination.
These last are used to check various syntagmas with a verb; the function
of a formula with a verb is chiefly determined by its position on the right
or left of the formula, not by the form of the verb.
4. Since the analysis of "formulas" is a basic part of the routine
developed for the language as a whole, it will be performed piecemeal at
various stages of the total analysis.
59. ELIMINATION OF MORPHOLOGICAL AND SYNTACTIC
H
M. M. Langleben and Ye. V. Paducheva (Moscow)
1. Those words in a dictionary of stems that cannot be identified as
a fixed part of speech, e.g. "attempt" (verb, noun), "cool" (adjective,
verb), "further" (adverb, adjective), etc., are handled as follows:
If a word can be a noun and a verb or an adjective and a verb, it is
inserted in a dictionary of substantives or verbs, respectively. Those
word-changing suffixes that can readily identify the part of speech to
which a word belongs (-ed, -ing, but not -s) are listed in a table of
word-building suffixes, i.e., if the word has one of these endings, the
part of speech will be revealed after morphological analysis. However,
homonymic stems do not require any changes in the analysis routine provided
for the other words. (This method is based on a suggestion by A. I.
Smirnitskii, who defined conversion as word building by means of paradigms.)
2. If the part of speech cannot be readily determined by morphological
analysis (zero ending in word stems: "they attempt", "the attempt";
homonymic ending: "he attempts", "the attempts"; or the parts of speech
which have no word-changing forms are homonymic: "further" (adjective,
adverb))--
the word is assigned several syntactic functions corresponding to the
possibilities of the word to enter a syntagma as a substantive, verb, etc.
The possible functions are examined in a definite order and a syntagma is
established for the given word, depending on whether certain words are
present in the sentence; thereafter all the remaining functions listed are
dropped out except that for which the syntagma was found.
3. Similarly, homonymy in -ing forms, -ed forms, etc. is eliminated
by successive tests for the presence of certain syntagmas in the sentence.
60. THE SUPERFLUOUSNESS OF RUSSIAN ADJECTIVE INFLECTION
N. N. Leont'yeva and G. N. Vavilova (Moscow)
1. In machine translation from Russian the procedures for handling
the inflection of adjectives are quite cumbersome. The machine has to
perform a double task: first, to investigate the inflection of the
adjective, then to search for the substantive with which the adjective
agrees.
There is an easier way of relating an adjective to the substantive
with which it agrees, a way that ignores inflection in most cases.
2. When a Russian text is analyzed, it usually turns out that adjective
inflection is superfluous as far as translation is concerned. It merely
indicates the agreement of the given adjective with a certain substantive.
3. An adjective may be related to the substantive with which it agrees
without analysis of its inflection by using the adjective's position in the
sentence.
An attributive adjective most frequently occupies a definite position
with respect to the substantive with which it agrees: it stands either
before this substantive or after it, following a comma.
Accordingly, it is possible to formulate two rough rules for relating
an adjective to its substantive:
(a) Relate the adjective to the nearest substantive on the right;
(b) If there is no substantive on the right, relate the adjective
to a substantive that is followed by a comma.
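
The two rough rules can be sketched as a search over part-of-speech tags. This is an illustration only: the tag names and the function are hypothetical, and for rule (b) the sketch assumes that the nearest qualifying substantive to the left is meant, which the abstract does not state.

```python
from typing import List, Optional


def relate_adjective(tags: List[str], i: int) -> Optional[int]:
    """Return the index of the substantive for the adjective at position i.

    tags holds one tag per word or mark: "ADJ", "SUBST", "COMMA" or "OTHER".
    """
    # Rule (a): the nearest substantive on the right.
    for j in range(i + 1, len(tags)):
        if tags[j] == "SUBST":
            return j
    # Rule (b): otherwise a substantive that is followed by a comma
    # (assumed here to be the nearest such substantive on the left).
    for j in range(i - 1, -1, -1):
        if tags[j] == "SUBST" and j + 1 < len(tags) and tags[j + 1] == "COMMA":
            return j
    return None
```

A result of None marks a case the position rules cannot settle, where the inflection would have to be analyzed after all.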
4. However, relating an adjective to a substantive in accordance with
these rules alone may turn out to be incorrect.
Therefore, a number of individual tests must be performed before
finally deciding the problem of relating an adjective: is the adjective
part of the nominal constituent of the predicate, is it included in a
formula, does it govern the following noun with or without a preposition
(VYVEDENNYI IZ FORMULY [deduced from the formula], RAVNYI NULYU [equal to
zero]).
5. After these checks the machine either relates the adjective to the
substantive without regard to its inflection or, if it cannot dispense with
it, analyzes the inflection of the adjective.
6. An analysis of mathematical texts shows that without investigation
of inflection it is possible to relate more than 85% of all adjectives to
the appropriate substantives. The remaining 10-15% of the adjectives
require an analysis of the inflections.
7. In calculating the number of adjectives we excluded short
adjectives, the relative KOTORYI [which], cases where the adjective is part
of a formula, cases of ellipsis (the adjective is present, but not the
noun with which it agrees, e.g. OTLICHAYETSYA OT RASSMOTRENNYKH V ETOM
PARAGRAFE [it differs from the (things) considered in this paragraph]).
8. The practicability of a method to ascertain the possibility of
ignoring adjective inflection has still not been proved. This will
require further work on texts as well as more experience with machine
translation, taking cognizance of technical difficulties.
Nevertheless, the suggested routine for relating an adjective to its
substantive by position criteria will retain its value, even if the
necessity for investigating the inflections of all adjectives is demon-
strated, since inflection is merely one of the factors that control the
correct relating of an adjective to its substantive by position criteria.
61. AN ALGORITHM OF MACHINE TRANSLATION FROM
ENGLISH INTO RUSSIAN
T. N. Moloshnaya (Moscow)
I. (1) Different possibilities for formalizing linguistic data in
different languages.
(2) Advantages of a structural-syntactic analysis of English.
II. (1) Classification of English and Russian words according to
formal criteria.
(2) Grammatical configurations constructed from isolated
classes of words.
III. Analysis of English sentence structure according to grammatical
configurations.
(1) Replacement of grammatical configuration by its chief member.
(2) Sequence of ascertaining grammatical configurations in the sentence
to be analyzed.
IV. Synthesis of Russian sentence structure according to grammatical
configurations.
(1) Substitution of the English grammatical configuration used by the
corresponding Russian configuration.
(2) Morphological formation of Russian sentence structure.
(3) Definition of grammatical forms of words in the Russian sentence.
V. Elimination of lexico-grammatical homonymy in the English sentence
on the basis of:
(1) morphological data,
(2) syntactic data.
VI. Tests of machine analysis of English sentence structure.
62. A DEVICE FOR THE READING OF ORDINARY
PRTMED R Y THE B=
R. S. Muratov (Sverdlovsk)
1. Conversion of the graphic form of letters in a printed text into
electrical signals is achieved by breaking down the group of photosensitive
elements as they move along the line of text.
2. Electrical impulses generated when photosensitive elements are
blacked out switch on electronic relays which, in turn, switch on a tactile
or phonic signalling instrument.
3. The form of the signals (of successive formation of elementary
signals corresponding to each zone of disintegration) expresses the graphic
peculiarities of the letters and other marks in the text.
4. Correct reading of the signals requires preliminary instruction
by a reader.
63. ANALYSIS OF PUNCTUATION MARKS DURING MACHINE
TRANSLATION FROM RUSSIAN
T. N. Nikolayeva (Moscow)
1. The purpose of this operation is to obtain the distinguishing
features of punctuation marks during machine translation.
2. In translation from Russian each word in the sentence must receive
definite morphological and syntactic signs. The required signs are obtained
in different ways for each part of speech. In particular, in order to
determine the case and number of substantives it is necessary to know the
correlative position of the parts of speech within the limits of the closed
sentence [ZAMKNUTOGO PREDLOZHENIYA]. However, most Russian sentences are
complicated by parenthetical and setoff [i.e., by commas--OBOSOBLENNYE]
constructions, subordinate clauses, etc.
Hence, to obtain the precise grammatical signs it is necessary to break
down a complex sentence into simpler components, dividing the main clause
from the subordinate clauses and separating the setoff and parenthetical
phrases.
Thus, the final goal of the analysis of punctuation marks is to:
(a) separate simple clauses from the body of the complex sentence,
to find the boundaries of the simple clause within the sentence;
(b) separate similar members of the clause;
(c) help the subsequent elucidation of interrelations between
the individual parts of the punctuated complex sentence;
(d) determine a group of similar members.
2. [sic] The analysis is made within a single complex sentence.
Accordingly, "simple" and "multipurpose" punctuation marks are
distinguished. The simple ones (period, exclamation point, question mark,
and dots) serve as the boundaries of a complex sentence.
Multipurpose marks (comma, dash, and colon) unite simple clauses into
a complex clause, introduce subordinate clauses, and separate parenthetical
and set-off constructions.
We are devoting the bulk of our attention to the multipurpose marks.
In a clause they may serve, according to Prof. A. B. Shapiro's terminology,
to "divide" or to "separate". We are also paying special attention to the
problem of distinguishing between single and non-single punctuation marks
(e.g. those used at the end of a setoff phrase and the beginning of a
subordinate clause, etc.).
3. As a result of the analysis, all the multipurpose punctuation
marks receive one of the following signs:
(1) parenthetical (i.e. separating parenthetical words and phrases);
(2) setting off (separating participial and verbal-adverb phrases
as well as setoff attributives and appositives);
(3) similar-simple (dividing similar members of a sentence);
(4) similar-complex (demarcating the parts of a compound sentence);
(5) dissimilar (i.e. introducing a subordinate clause).
4. Separation of the simple clauses occurs within the limits of the
complex whole according to our data.
The entire process of analyzing punctuation marks can be divided into
3 stages:
(1) Separation of the purely parenthetical constructions takes
place in the analysis glossary where the words that may be
used parenthetically or that are a basic part of a parenthet-
ical phrase undergo special analysis, after which the
punctuation marks that separate them receive an appropriate
information sign.
(2) Processing of punctuation marks by the "Punctuation Marks"
routine, where the basic analysis of all the punctuation
marks takes place.
(3) Breakdown of the sentence into its constituent parts--
separation of parenthetical and setoff constructions,
dividing of simple clauses, etc. Here the occurrence of a
"non-single" mark is extremely important. This routine also
provides for insertion of a sentence-demarcating punctuation
mark where necessary.
5. The "Analysis of Punctuation Marks" routine consists of several
parts, each of which corresponds roughly to a given punctuation mark.
Within each part several checks are made on a number of individual
factors that determine the function of the multipurpose punctuation marks.
These factors include the presence of verbs with the sign "LF" (LICHNAYA
FORMA) [personal form] on both sides of a given mark (or on one side of it),
the presence of verbs with the sign "NELICHNAYA FORMA" [non-personal form],
the place of a substantive with the sign FS ("FORMA SLOVARNAYA") [dictionary
form] in respect to the given mark, the separation of words belonging to a
given lexical group, etc.
6. As a result of our investigation, all the punctuation marks are
provided with the requisite distinguishing features and the analysis is
performed accordingly within the separated simple units.
64. SOME PROBLEMS CONNECTED WITH THE ANALYSIS OF
COMA' SENTENCES AND C U H
Ye. V. Paducheva (Moscow)
1. The following problems must be solved in connection with the
syntactic analysis of complex sentences and clauses with similar members:
(a) To distinguish between a syntagma with similar members and
clause coordination (the difficulties in solving this problem
are explained by the fact that most of the co-ordinating
conjunctions (I, ILI, NO) [and, or, but] may connect both similar
members of clauses and entire clauses and therefore they
cannot serve as a trustworthy sign either of clause boundary or
of syntagma with co-ordinating connective [SOCHINITEL'NOI
SVYAZ'YU]);
(b) To separate words interlinked by a co-ordinating connective,
having divided them from the words governed by them.
2. For this purpose we propose the following method of analyzing
sentences with co-ordinating conjunctions (only 2-member combinations are
considered for the time being): The sentence is cut up into "chunks", the
limits of which are co-ordinating conjunctions, and the syntactic analysis
is performed within the chunks; if after completion of syntactic analysis
within the chunk no words remain without a governor, it means that the
conjunction connects two clauses; if, however, such words remain, it means
that the sentence contains similar members. Words lacking a governor are,
for the most part, members of a co-ordinating syntagma.
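
The chunk method of point 2 might be sketched as below. Everything here is a stand-in labeled as assumption: the conjunction list, the toy GOVERNORS and PREDICATES tables, and the has_governor test replace the full syntactic analysis within the chunk, and the predicate of a clause is counted as needing no governor.

```python
from typing import Callable, List, Tuple

CONJUNCTIONS = {"i", "ili", "no"}   # hypothetical list: and, or, but


def split_into_chunks(words: List[str]) -> List[List[str]]:
    """Cut the sentence at every co-ordinating conjunction."""
    chunks: List[List[str]] = [[]]
    for word in words:
        if word in CONJUNCTIONS:
            chunks.append([])
        else:
            chunks[-1].append(word)
    return chunks


def classify(chunks: List[List[str]],
             has_governor: Callable[[str, List[str]], bool]) -> Tuple[str, List[str]]:
    """Decide between clause coordination and similar members."""
    leftover = [w for chunk in chunks for w in chunk if not has_governor(w, chunk)]
    if not leftover:
        return ("clause coordination", [])
    return ("similar members", leftover)


# Toy stand-in for syntactic analysis: each word's governor, if any.
GOVERNORS = {"teorema": "dokazana", "lemma": "dokazana",
             "teoremu": "rassmotrim", "lemmu": "rassmotrim"}
PREDICATES = {"dokazana", "rassmotrim"}


def toy_has_governor(word: str, chunk: List[str]) -> bool:
    """A predicate needs no governor; other words need theirs inside the chunk."""
    return word in PREDICATES or GOVERNORS.get(word) in chunk
```

With these toy tables, "rassmotrim teoremu i lemmu" leaves lemmu without a governor in its chunk, so the sentence is judged to contain similar members, whereas "teorema dokazana i lemma dokazana" is judged a coordination of two clauses.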
3. When words are combined into a co-ordinating syntagma, the concept
of "sameness of form" [RAVNOOFORMLENNOST'] is used. "Sameness of form"
is the coincidence of several of their morphological and syntactic signs.
The same form is sought beyond the chunk for a word that lacks a governor
within the chunk and a co-ordinating syntagma is thereby established.
(This must be refined somewhat due to the possible absence of agreement
in number for words with the chunk, etc.).
4. This method of analysis is feasible for Russian because a word
normally contains all the information regarding the possible syntactic
connections for it (with some exceptions--compare, e.g., the homonymy
of cases, which may make the syntactic function of a word in the
chunk indefinite). This method is impracticable in English (e.g. the
syntactic functions of a substantive are determined wholly by its position
after a transitive verb, before another substantive, etc.; therefore,
superfluous "subjects" would appear after the division into chunks is made).
However, some of the difficulties mentioned for Russian disappear in
English during the analysis of a sentence with co-ordinating conjunctions
due to the rigid word order and preferential position of the governed word
after the governor. English syntagmas with co-ordinating connectives are
determined at the same time as the others during the course of syntactic
analysis.
5. Some methods of fixing the boundaries of a simple clause inside
a complex clause are indicated.
65. MACHINE TRANSLATION OF COMPOUND NOUNS FROM
GERMAN INTO RUSSIAN
V. V. Parshin (Moscow)
1. The extensive use of compounds in German, particularly in scientific
and technical literature, has made it necessary to work out universal rules
for their translation.
Formulation of such rules makes possible a significant reduction in
the size of the dictionary and the translation of compounds, provided that
the components are known.
Universal rules for the translation of compounds are deduced from a
structural-semantic analysis of the constituent words. Determination of
semantic connections between them ensures an adequate translation.
The author's investigations do not pretend to be a complete and
final solution to the problem of translating compounds. They are merely
an initial, empirical attempt at working out the basic principles and
methods that would permit of a more or less successful translation at the
first stage.
2. The existence of the following types of connections between the
stems of compounds has been demonstrated by an analysis of concrete
linguistic material (individual original works on mathematics and a
German-Russian polytechnical dictionary):
1. Relation of the sum to the constituents,
2. Relation of a part to the whole,
3. Object or subject of an action to the action,
4. Object of the bearer of a quality to the quality,
5. Object of a determiner to the thing determined.
Translation of the first component of compound words, the internal
connections of whose components relate to the first four types, is
effected by producing a Russian equivalent in the genitive case.
If the last type of connection is present, the first component is
translated in two ways: by an adjective and by the production of a Russian
equivalent in the genitive case.
Polysemia causes a certain type of connection for each meaning of the
word. Therefore, a semantic analysis of the components is necessary to
differentiate the types of relations between the constituent elements.
Differentiating the relations of a part to the whole and the relation
of a determiner to the thing determined is the most difficult of all.
3. A special case is the translation of compounds consisting of three
components. It is important here to establish the co-subordination of
determining stems to the determined stem, which is done by subjecting them
to analysis in pairs.
Three-component words are translated in accordance with the rules for
translating two-stem words.
4. Compounds of the input text are broken down into constituent
stems by the superposition of stems included in the dictionary taking
into account connecting consonants and rejected endings.
5. The principles and methods of translating German compounds
into Russian, as set forth above, can serve as the basis for a definitive,
detailed solution of one of the most complicated lexicographical problems
in German.
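The decomposition described in point 4 can be sketched as follows. This is only an illustrative sketch, not the author's procedure: the stem list `STEMS`, the linking elements in `LINKING`, and the function name `split_compound` are assumptions made for the example, and rejected endings are not handled.

```python
from typing import List, Optional

# Hypothetical stem dictionary and linking elements, chosen only for illustration.
STEMS = {"differential", "gleichung", "system", "zahl", "zahlen", "folge"}
LINKING = ("", "s", "es", "n", "en")  # "" means the stems join directly


def split_compound(word: str, stems=STEMS, linking=LINKING) -> Optional[List[str]]:
    """Split `word` into dictionary stems, skipping linking elements between them.

    Returns the list of stems, or None if no complete segmentation exists.
    """
    word = word.lower()
    if word in stems:
        return [word]
    # Try the longest leading stem first so that longer dictionary entries win.
    for end in range(len(word) - 1, 0, -1):
        head = word[:end]
        if head not in stems:
            continue
        rest = word[end:]
        for link in linking:
            # The linking element must be followed by at least one more letter.
            if rest.startswith(link) and len(rest) > len(link):
                tail = split_compound(rest[len(link):], stems, linking)
                if tail is not None:
                    return [head] + tail
    return None


print(split_compound("Gleichungssystem"))
print(split_compound("Differentialgleichungssystem"))
```

A three-stem word comes back as a flat list of stems; the pairwise analysis of point 3 would then decide how the determining stems are subordinated.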
66. PROPER NOUNS IN MACHINE TRANSLATION
A. V. Superanskaya (Moscow)
1. Proper nouns are unavoidably present in every scientific text.
2. In the present state of development the machine translates a
text, but leaves proper nouns just the way they are, printing them in
Latin letters.
3. Since the number of proper nouns increases as one proceeds from
selective to continuous translation, the question of the desirability of
automatising the process of transcribing proper nouns arises.
4. Proper nouns are not always written, read, and pronounced in
all languages in accordance with the rules for common nouns.
5. Proper nouns are international. The same nouns are encountered
among peoples of different nationality. People move from country to
country and publish their papers in different countries in different
languages. That is the reason for the difficulty in determining the
nationality of a noun and, accordingly, the rules by which it should be
transcribed.
6. There is much inconsistency in the current transcription of
nouns. The need to unify the transcription and eliminate the lack of
uniformity is long overdue.
7. Due to the limitless memory potentialities of the machine and the
difficulty of mechanical analytical transcription, it is more efficient
to store proper nouns as a whole in the machine's memory. Consequently,
if it encountered such a noun in a text, the machine would locate it in
the glossary and deliver the answer (simple or in several variants,
depending on the linguistic origin of the noun and on existing traditions).
This would help to make transcription uniform, and it could be accompanied
by a printed glossary to match.
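The glossary lookup proposed in point 7 might look like the sketch below. The glossary layout, the name `transcribe`, and the two sample entries are assumptions made for illustration only; the abstract does not specify how the variants are stored.

```python
from typing import Dict, List

# Hypothetical glossary: Latin-alphabet spelling -> transcription variants keyed
# by linguistic origin or by tradition. The entries are illustrative only.
GLOSSARY: Dict[str, Dict[str, str]] = {
    "Huxley": {"English": "Хаксли", "traditional": "Гексли"},
    "Euler": {"German": "Эйлер"},
}


def transcribe(name: str, glossary: Dict[str, Dict[str, str]] = GLOSSARY) -> List[str]:
    """Return every stored variant for `name`; fall back to the Latin spelling."""
    variants = glossary.get(name)
    if variants is None:
        # Not in the glossary: leave the name as it is, in Latin letters.
        return [name]
    return list(variants.values())


print(transcribe("Euler"))
print(transcribe("Huxley"))
```

Storing whole names avoids letter-by-letter transcription rules, at the price of a glossary that must be extended whenever a new name appears.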
67. WORK ON A BURMESE-RUSSIAN ALGORITHM OF
MACHINE TRANSLATION
O. A. Timofeyeva (Leningrad)
1. The syllabic nature of Burmese writing requires the elaboration
of a special program by which an electrical reading device can handle a
Burmese text.
2. We are compelled to restrict the algorithm to the literary form
of Burmese speech owing to the sharp divergences between the written and
contemporary spoken languages.
3. A highly developed word-building root structure that crosses
with a form-building root structure makes it necessary to have a special
word-building program, the purpose of which is to separate lexical from
morphological phenomena.
4. The development of agglutination and the rudiments of internal
inflection require the construction of a complicated morphological
program for handling the abundant and varied grammatical information
contained in the Burmese word.
5. The absence of a rigid order for nominal members of the Burmese
sentence complicates the syntactic program, which cannot be effected
without the preliminary operation of the morphological program.
68. WORK ON AN ARABIC-RUSSIAN ALGORITHM OF
MACHINE TRANSLATION
O. B. Frolova (Leningrad)
I. Items from newspapers are used as texts in machine translation from
Arabic to Russian.
II. The main principles in working on an Arabic-Russian algorithm of
machine translation, as contrasted with those of traditional grammar,
are as follows:
(a) Only the written form of the language with the infixes consonants
and long vowels is considered, whereas all the existing grammars take into
account the short vowels, which are not normally noted in writing. For
Arabic two algorithms, differing in principle, are necessary: one for the
spoken language, the other for the written; the two variants are not
reducible to each other.
(b) The traditional dictionary arranged by roots is replaced by a
dictionary arranged by stems.
(c) For convenience in transliterating Arabic letters into Russian
letters, the latter are used with no additional signs of any kind.
III. The programs making up the algorithm are as follows: (1)
stem-stripping; (2) address; (3) morphological; (4) syntactic; (5) dictionary
of stems; (6) table of prepositions; (7) glossary of idioms and phrases; (8)
program for distinguishing homonyms.
IV. Work on the stem-stripping program:
(a) Initial variations of this program provided for cutting off the
stems, prefixes, and suffixes; the glossary increased considerably,
however, due to pseudostems.
(b) An important factor in simplifying this program was the idea of
a reject [OTKAZ] glossary, which was later developed into the idea of an
address used in other algorithms too.
(c) The stem-stripping program includes the following rules:
(1) Out of the 28 letters of the Arabic alphabet 10 letters may
be joined as non-radicals to the beginning of a word: these
are certain conjunctions and prepositions, the definite
article, and verbal prefixes.
(2) In the case of words that do not contain initial non-radical
letters, it is necessary to refer at once to the address;
endings and suffixes are automatically stripped upon
comparing the words with the stems found in the address.
(3) Some of these initial non-radical letters, which when cut
off reveal an insignificant number of pseudostems, are first
transferred to the end of the words and converted into
suffixes, which are kept apart; the words are then sought in the
address.
(4) Words with remaining initial non-radical letters, which if
cut off would result in a large number of pseudostems, are
first checked in the address; if they are not found there,
the non-radicals are transferred to the end of the words,
and the words are again looked up in the address. Checking
for their presence in the address is not equivalent to
extracting from the address all the information relating to
the stem.
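Rules (2) to (4) of the stem-stripping program can be sketched as below. This is a hedged illustration, not the author's program: the names `Hit`, `find_stem`, and `strip_and_look_up` are invented, the address is modeled as a plain set of transliterated stems, and the two letter sets passed in are placeholders, since the abstract does not say which of the 10 letters fall into which group. The detached letter is returned separately rather than literally moved to the end of the word.

```python
from typing import NamedTuple, Optional, Set


class Hit(NamedTuple):
    stem: str    # stem found in the address
    prefix: str  # detached initial non-radical letter ("" if none)
    ending: str  # material stripped from the end of the word


def find_stem(form: str, address: Set[str], prefix: str = "") -> Optional[Hit]:
    """Compare `form` with the address; whatever follows the stem is stripped."""
    for end in range(len(form), 0, -1):  # longest stem first
        if form[:end] in address:
            return Hit(form[:end], prefix, form[end:])
    return None


def strip_and_look_up(word: str, address: Set[str],
                      few_pseudostems: Set[str],
                      many_pseudostems: Set[str]) -> Optional[Hit]:
    first = word[:1]
    if first in few_pseudostems:
        # Rule 3: detach the letter first, then search the address.
        return find_stem(word[1:], address, first)
    if first in many_pseudostems:
        # Rule 4: check the word as it stands; detach only if that fails.
        return find_stem(word, address) or find_stem(word[1:], address, first)
    # Rule 2: no initial non-radical letter, go straight to the address.
    return find_stem(word, address)


# Illustrative data only: transliterated consonant skeletons and letter sets.
address = {"ktb", "byt"}
print(strip_and_look_up("wktb", address, {"w"}, {"b"}))
print(strip_and_look_up("byt", address, {"w"}, {"b"}))
print(strip_and_look_up("bktb", address, {"w"}, {"b"}))
```

Checking the unstripped word first in rule 4 is what keeps a genuine initial radical from being cut off and producing a pseudostem.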
69. EXPERIMENTAL TRANSLATIONS FROM FRENCH INTO RUSSIAN
G. V. Chekova (Moscow)
Devising of algorithms for translation from French to Russian.
Sequence of operations for translation programs. Changes in programs
and coding of glossary on the basis of experimental translations produced
by the machine.
Utilization of scales in translation programs.
Programming characteristics, scope of programs and glossary; operations
utilized in translation programs; numerical characteristics of translation
programs.
Basic demands made of a special translation machine.
Examples of translations produced by the STRELA machine in 1957-1958.
70. ESTABLISHMENT OF SYNTACTIC CUES FOR
PREPOSITIONAL PHRASES
I. N. Shelimova (Moscow)
1. The object in making a syntactic analysis of prepositional phrases
consisting of either a preposition and substantive standing to the right of
it or a preposition and pronoun immediately adjacent to it on the right is
to include these prepositional phrases in syntagmas. It is necessary,
therefore, to find a word in the sentence with which the prepositional
phrase forms a syntagma.
2. There are no complications in drawing up the rules for the
formal analysis of prepositional phrases if a word that belongs to a
class of words capable of forming a syntagma with the prepositional
phrase is found immediately to the left of the prepositional phrase. The
only exception is a case where a noun stands next to the prepositional
phrase. Thus, if there is any verbal form (infinitive, participle (short
or full), verbal adverb), adjective (full or short), or a word from the
special group of invariable words on the left of the prepositional phrase,
the prepositional phrase forms a syntagma with this particular word.
3. If on the left of the prepositional phrase is a word that belongs
to a class of words with which the prepositional phrase does not generally
form a syntagma (pronouns, adverbs, particles, conjunctions) or the
prepositional phrase stands at the very beginning of the sentence, then the
word with which the prepositional phrase forms a syntagma must be searched
for in the following order:
(a) Search to the left for the next word with which the
prepositional phrase may become a syntagma, excluding a noun, i.e. search for
any form of verb, adjective or special kind of invariable word. A
prepositional phrase may unite in a syntagma with several of the classes of
words listed after it fulfills a series of conditions.
(b) Search to the right for the next word belonging to the class
of verbs (except the full participle and verbal adverb) or a word from
the special group of invariable words or a short adjective. Actually, while
searching for a word on the right with which the prepositional phrase may
form a syntagma, we are looking for a word in the predicate of the sentence.
4. If a prepositional phrase stands next to a noun (immediately to
the left of the noun), the rule for establishing the syntagma constituted
by this phrase is not general for prepositional phrases with different
prepositions.
5. Therefore, any of the following may be significant in determining
the rules for analyzing prepositional phrases with a number of prepositions:
(a) The lexical composition of the prepositional phrase itself;
(b) Does the prepositional phrase have on its left a noun which
by virtue of its syntactic or lexical properties is such that its connection
with the prepositional phrase must be regarded as certain?
(c) Does the sentence have any verbal form that by virtue of
syntactic or lexical properties must be regarded as necessarily connected
with a given prepositional phrase?
6. The structure of the sentence is particularly important in
establishing the rules of syntactic analysis for prepositional phrases
with several other prepositions (e.g. v, na with the prepositional case
and dlya). In order to determine the regular syntactic cues for the
prepositional phrases mentioned, it is necessary in certain cases to
know if the prepositional phrase stands before or after the predicate or
which syntagma contains the noun that is followed by the prepositional
phrase. Sometimes it is important to know whether or not this noun in
turn forms a prepositional phrase with certain prepositions (e.g.
v rezul'tate [as a result of], posle [after], etc.) because in such a
case a prepositional phrase with the prepositions mentioned cannot be
related to this noun.
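The general search order of points 2 and 3 can be sketched as follows. The part-of-speech tags, the two tag sets, and the function `find_host` are assumptions made for this illustration; the "series of conditions" in point 3(a) and the preposition-specific rules of points 4 to 6 are not modeled, so the function returns `None` whenever a noun stands immediately to the left of the phrase.

```python
from typing import List, Optional

# Hypothetical part-of-speech tags, assumed to come from an earlier
# morphological pass.
LEFT_HOSTS = {"VERB", "INF", "PART_FULL", "PART_SHORT", "VERBAL_ADV",
              "ADJ_FULL", "ADJ_SHORT", "INVAR"}
# On the right, full participles and verbal adverbs are excluded (point 3b).
RIGHT_HOSTS = {"VERB", "INF", "PART_SHORT", "ADJ_SHORT", "INVAR"}


def find_host(tags: List[str], pp_start: int, pp_end: int) -> Optional[int]:
    """Return the index of the word the prepositional phrase attaches to.

    `tags[pp_start:pp_end]` is the prepositional phrase. None means the
    general rules do not decide (for example, a noun stands next to the phrase).
    """
    if pp_start > 0:
        left = tags[pp_start - 1]
        if left in LEFT_HOSTS:   # point 2: host immediately on the left
            return pp_start - 1
        if left == "NOUN":       # points 4-6: preposition-specific rules apply
            return None
    # Point 3(a): search further left for a verb, adjective or invariable word.
    for i in range(pp_start - 2, -1, -1):
        if tags[i] in LEFT_HOSTS:
            return i
    # Point 3(b): search right for a word in the predicate.
    for i in range(pp_end, len(tags)):
        if tags[i] in RIGHT_HOSTS:
            return i
    return None


print(find_host(["VERB", "ADV", "PREP", "NOUN"], 2, 4))
print(find_host(["PREP", "NOUN", "PRON", "VERB"], 0, 2))
```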
71. CORRELATION BETWEEN 3RD PERSON PERSONAL PRONOUNS
AND THE SUBSTANTIVES FOR WHICH THEY ARE USED
A. L. Shumilina (Moscow)
1. In machine translation the 3rd person personal pronouns of one
language cannot be mechanically substituted for the corresponding
pronouns of another language since gender is not an inherent sign of every
pronoun, but depends on the gender of the corresponding noun, which is
accidental as far as they are concerned and specific for the different
languages.
2. The following formal data must be obtained first if the correlation
between a pronoun and the corresponding substantive is to be established:
(a) The boundaries of the clauses (no cognizance is taken of the
differences between the boundaries of clauses within sentences and sentence
boundaries);
(b) The grammatical properties of the substantives and 3rd
person personal pronouns (gender, number, case);
(c) The syntactic relations and specific syntactic functions
of the substantives;
(d) The order of substantives in the clauses;
(e) Certain sequences of syntactically related words (e.g.
expanded attributes).
3. A substantive for which a given pronoun is used must "correspond
grammatically" to this pronoun. By grammatical correspondence we mean the
correspondence between substantive and pronoun in number (correspondence
in number will in several cases differ from the conventional) and gender (in
the singular).
4. The way to determine the corresponding ("unknown") substantive
is, for the most part, as follows:
The search for the grammatically corresponding word is made only to
the left of the given pronoun (omitting the previously determined elements
in the clause).
A. Within the zero clause. (Clauses subject to analysis are
numbered: zero is the clause within which the given pronoun is found,
first (1) is the next clause to the left of the zero clause, second (2) is
the next clause to the left of the first, etc.)
(a) For pronouns in the nominative case, the only possible un-
known substantive may be one with a sign of the "grammatical subject"
(this concept is defined beforehand).
(b) For pronouns in other than the nominative case, the unknown
word is the substantive that is closest to the given pronoun, but with
certain restrictions (e.g. the unknown substantive must not form a single
word combination with the given pronoun, nor must it be the middle word or
word on the extreme right in a chain of genitive cases, if the word on the
extreme left satisfies the sign of "grammatical correspondence", etc.)
B. Within the first, second, ... nth clause. (The analysis is made
successively within the 1st, 2nd, ... nth clause until the word that satisfies
our requirements is found.)
For pronouns both in the nominative and in other cases, a word with
a sign of the "grammatical subject" is considered first; in the event
that there is no grammatical correspondence between the pronoun and the
"grammatical subject" found, we pass on to a word with a sign of the
"grammatical direct object", then to the substantive that is closest to
the right boundary of the 1st or nth clause (taking into account the
various restrictions already determined).
5. Similar work in the future may, with appropriate additions
(animateness in nouns and other criteria), be significant from the point
of view of practical stylistics, i.e. it may create the possibility of
determining certain purely formal rules for using 3rd person personal
pronouns on the basis of the laws of the language itself.
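The search order of point 4 can be sketched as below, under several stated assumptions: the `Noun` record, the role labels, and the function names are invented for the example; the restrictions mentioned in A(b) (genitive chains, single word combinations) are not modeled; and continuing to the next candidate when the nearest one fails to agree is an assumption of this sketch rather than something the abstract states.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Noun:
    word: str
    gender: str          # "m", "f" or "n"
    number: str          # "sg" or "pl"
    role: str = "other"  # "subject", "direct_object" or "other"


def agrees(noun: Noun, gender: str, number: str) -> bool:
    """Grammatical correspondence: number, plus gender in the singular."""
    if noun.number != number:
        return False
    return number == "pl" or noun.gender == gender


def find_antecedent(clauses: List[List[Noun]], gender: str, number: str,
                    nominative: bool) -> Optional[Noun]:
    """clauses[0] holds the nouns left of the pronoun in its own (zero) clause,
    clauses[1] those of the next clause to the left, and so on; each list is
    in left-to-right text order."""
    zero = clauses[0] if clauses else []
    if nominative:
        # A(a): only the grammatical subject qualifies within the zero clause.
        candidates = [n for n in zero if n.role == "subject"]
    else:
        # A(b): the noun nearest to the pronoun is tried first.
        candidates = list(reversed(zero))
    for n in candidates:
        if agrees(n, gender, number):
            return n
    # B: earlier clauses, nearest clause first.
    for clause in clauses[1:]:
        ordered = ([n for n in clause if n.role == "subject"]
                   + [n for n in clause if n.role == "direct_object"]
                   + list(reversed(clause)))  # starting at the right boundary
        for n in ordered:
            if agrees(n, gender, number):
                return n
    return None


# Illustrative transliterated Russian nouns: "mashina" (f.), "tekst" (m.).
first_clause = [Noun("mashina", "f", "sg", "subject"),
                Noun("tekst", "m", "sg", "direct_object")]
print(find_antecedent([[], first_clause], "m", "sg", nominative=True))
```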
US JPRS/DC
DUPONT 7-4240
Approved For Release 2000/08/24: CIA-RDP68-00069A000100200007-9