ABSTRACTS OF THE CONFERENCE ON MACHINE TRANSLATION (MAY 15-21, 1958)

Document Type:

CREST

Collection:

General CIA Records

Document Number (FOIA) /ESDN (CREST):

CIA-RDP68-00069A000100200007-9

Release Decision:

RIFPUB

Original Classification:

Document Page Count:

Document Creation Date:

December 9, 2016

Document Release Date:

September 24, 1998

Sequence Number:

Case Number:

Publication Date:

July 22, 1958

Content Type:

REPORT


Body:

Approved For Release 2000/08/24: CIA-RDP68-00069A000100200007-9

22 July 1958
JPRS/DC-241

ABSTRACTS OF THE CONFERENCE ON MACHINE TRANSLATION (MAY 15-21, 1958)

PHOTOCOPIES OF THIS REPORT MAY BE PURCHASED FROM THE PHOTODUPLICATION SERVICE, LIBRARY OF CONGRESS, WASHINGTON 25, D. C.

U. S. JOINT PUBLICATIONS RESEARCH SERVICE
Main Office: Room 1125, 205 E 42nd Street, New York 17, N. Y.
D. C. Office: Second Floor, 1636 Connecticut Ave., N.W., Washington 9, D. C.

JPRS/DC-241
CSO DC-2026

Ministry of Higher Education, USSR
First Moscow State Pedagogical Institute of Foreign Languages

ABSTRACTS OF THE CONFERENCE ON MACHINE TRANSLATION (May 15-21, 1958)

MOSCOW, 1958

TABLE OF CONTENTS

PLENARY SESSION

1. Andreyev, N. D. (Leningrad), A Metalanguage of Machine Translation
2. Bel'skaya, I. K. (Moscow), Some General Problems in Machine Translation
3. Bokarev, Ye. A. (Moscow), An Intermediary Language and Artificial International Languages
4. Dobrushin, R. L. (Moscow), The Value of Mathematical Methods in Linguistics
5. Ivanov, V. V. (Moscow), Conversion of Communications and Conversion of Codes
6. Kuznetsov, P. S. (Moscow), The Sequence in Building a Language System
7. Lyapunov, A. A. and Kulagina, O. S. (Moscow), Machine Translation Studies in the Mathematical Institute of the Academy of Sciences, USSR
8. Mel'chuk, I. A. (Moscow), An Intermediary Language Model for Machine Translation
9. Steblin-Kamenskii, M. I. (Leningrad), The Significance of Machine Translation for Linguistics
10. Revzin, I. I. (Moscow), The "Active" and "Passive" Grammar of L. V. Shcherba and the Problems of Machine Translation
11. Rozentsveig, V. Yu. and Revzin, I. I. (Moscow), A General Theory of Translation in Connection With Machine Translation

THEORETICAL SECTION

12. Artemov, V. A.
and Zimnyaya, I. A. (Moscow), Spectra of Phonemes and Their Use in Machine Translation
13. Vinogradova, O. S. and Luriya, A. R. (Moscow), An Objective Investigation of Meaning Associations
14. Grigor'yev, V. I. (Moscow), The Treatment of Certain Concepts in Structuralism
15. Grigoryan, V. M. (Yerevan), The Significance of Frequency as a Factor in Determining the Stylistic Function of Words
16. Dobrushin, R. L. (Moscow), An Experiment to Define the Concept of Grammatical Category
17. Dolgopol'skii, A. B. (Moscow), The Theory of Probability and Determination of Linguistic Relationship
18. Zinov'yev, A. A. (Moscow), A General Theory of Definition and the Possibility of Applying It to the Theory of Translation Devices
19. Ivanov, V. V. (Moscow), Linguistic Problems Connected With Poetry Translation
20. Ivanov, V. V. (Moscow), Hegel's Theorem and Linguistic Paradoxes
21. Iliya, L. I. (Moscow), Methods of Breaking Down a Syntactic Whole
22. Kolshanskii, G. V. (Moscow), The Logical Nature of Context
23. Kotov, R. G. (Moscow), Linguistic Statistics From Russian Texts
24. Kulagina, O. S. (Moscow), A Method of Defining Grammatical Categories
25. Revzin, I. I. (Moscow), A Formal Theory of the Sentence
26. Reformatskii, A. A. (Moscow), Translation sub specie structuralismi
27. Rozentsveig, V. Yu. (Moscow), A System of Recording Speech for Oral Translation
28. Sokolyanskii, I. A. (Moscow), Language Training For Blind Deaf-Mutes
29. Strelkovskii, G. M. (Moscow), Some General Principles in Compiling Glossaries For Machine Translation
30. Toporov, V. N. (Moscow), Some Analogies to the Problems and Methods of Contemporary Theoretical Linguistics in Ancient Indian Grammatical Works
31.
Udartseva, M. G. (Petrozavodsk), The Frequency of Lexical Units in English Geological Literature
32. Finn, V. N. and Lakhuti, D. G. (Moscow), One Approach to Logical Semantics
33. Frumkina, R. M. (Moscow), Some Problems Connected With Alternating Stems in Constructing an Algorithm of Machine Translation For Spanish
34. Shaumyan, S. K. (Moscow), A Logical Analysis of the Concept of Language Structure
35. Shevoroshkin, V. (Moscow), Ancient Texts and Machine Translation

SECTION ON ALGORITHMS OF MACHINE TRANSLATION

36. Agrayev, V. A. (Gorki), An Algorithm for Translating French into Russian Electronically
37. Andreyev, N. D. (Leningrad), Principles in the Construction of Electric Reading Devices
38. Andreyev, N. D. (Leningrad), Work on an Indonesian-Russian Algorithm of Machine Translation
39. Andreyev, N. D., Batova, D. A., and Panfilov, V. S. (Leningrad), Work on a Vietnamese-Russian Algorithm of Machine Translation
40. Babintsev, A. A. (Leningrad), Work on a Japanese-Russian Algorithm of Machine Translation
41. Bagrinovskaya, G. P. and Gavrilova, G. L. (Moscow), The Programming of Translation From English into Russian
42. Belokrinitskaya, S. S. (Moscow), Principles in Compiling a German-Russian Glossary of Polysemants for Machine Translation
43. Bel'skaya, I. K. (Moscow), Main Features of the Glossary and Grammatical Programs for English-Russian Machine Translation
44. Berkov, V. P. (Leningrad), Work on a Norwegian-Russian Algorithm of Machine Translation
45. Bratchikov, I. L., Fitialov, S. Ya., and Tseitin, G. S. (Leningrad), Glossary Structure and Information Coding for Machine Translation
46. Vinogradova, V. N. (Moscow), Gender as a Superfluous Category of the Russian Verb
47. Volotskaya, Z. M. (Moscow), The Synthesis of Russian Verb Forms in Machine Translation
48. Volotskaya, Z. M., Paducheva, Ye.
V., Shelimova, I. N., and Shumilina, A. L. (Moscow), Russian Syntagmas
49. Volotskaya, Z. M., and Shumilina, A. L. (Moscow), Synthesis of the Russian Clause
50. Voronin, V. A. (Moscow), Grammatical Analysis for Machine Translation of Chinese into Russian
51. Grigor'yev, V. I. and Belonogov, G. G. (Moscow), Application of Machine Translation Methods to the Lexical Coding of Telegraphic and Telephonic Communications
52. Yefimov, M. B. (Moscow), Some Problems in Machine Translation From Japanese into Russian
53. Zasorina, L. N. (Leningrad), Work on the Russo-English Algorithm of Machine Translation
54. Katenina, T. Ye. (Leningrad), Work on a Hindustani (Hindi)-Russian Algorithm of Machine Translation
55. Komissarova, N. V. (Gorki), An Algorithm for Translating English Texts on Radio Engineering into Russian
56. Kulagina, O. S. (Moscow), Automatization of Translation Programming
57. Kulagina, O. S. (Moscow), A French-Russian Translation Algorithm
58. Langleben, M. M. (Moscow), Determination of Syntactic Connections for Formulas in Russian Mathematical Texts
59. Langleben, M. M. and Paducheva, Ye. V. (Moscow), Elimination of Morphological and Syntactic Homonymy in Analyzing English Texts
60. Leont'yeva, N. N. and Vavilova, G. N. (Moscow), The Superfluousness of Russian Adjective Inflection
61. Moloshnaya, T. N. (Moscow), An Algorithm of Machine Translation from English into Russian
62. Muratov, R. S. (Sverdlovsk), A Device for the Reading of Ordinary Printed Material by the Blind
63. Nikolayeva, T. N. (Moscow), Analysis of Punctuation Marks During Machine Translation From Russian
64. Paducheva, Ye. V. (Moscow), Some Problems Connected With the Analysis of Complex Sentences and Clauses With Similar Members
65. Parahin, V. V. (Moscow), Machine Translation of Compound Nouns From German into Russian
66.
Superanskaya, A. V. (Moscow), Proper Nouns in Machine Translation
67. Timofeyeva, O. (Leningrad), Work on a Burmese-Russian Algorithm of Machine Translation
68. Frolova, O. B. (Leningrad), Work on an Arabic-Russian Algorithm of Machine Translation
69. Chekova, G. V. (Moscow), Experimental Translations From French into Russian
70. Shelimova, I. N. (Moscow), Establishment of Syntactic Cues for Prepositional Phrases
71. Shumilina, A. L. (Moscow), Correlativity of 3rd Person Personal Pronouns and the Nouns for Which They Substitute

1. THE METALANGUAGE OF MACHINE TRANSLATION AND ITS [illegible]

N. D. Andreyev (Leningrad)

1. We call a metalanguage any linear system of signs used for the written designation of the elements in a particular system of ideas and the relations between these elements.
2. The class of metalanguages at the present time comprises mathematics, physics, chemistry, formal genetics, and symbolic logic.
3. The preparation of algorithms for machine translation requires the development of a special metalanguage in the symbols of which may be described the facts and relationships of the language systems that are subject to equivalent comparison.
4. The symbols used in the metalanguage of machine translation are regarded as metalanguage words and grouped in categories analogous to the parts of speech.
5. Types of commands in M.T. (machine translation) are regarded as metamoods [META-NAKLONENIYA].
6. The use of metalanguage in the analytic part of algorithms.
7. The use of metalanguage in the transformational part of algorithms.
8. The use of metalanguage in the synthetic part of algorithms.
9. The possibility and value of a general theory of metalinguistic systems.
10. A comparative analysis of the class of metalanguages and the class of spoken languages may serve as a basis for elucidating the relations between formal logical semiotics and general linguistics.

2. SOME GENERAL PROBLEMS IN MACHINE TRANSLATION [M.T.]

I. K. Bel'skaya (Moscow)

1. Experience gained in preparing experimental routines for machine translation from English, German, Chinese, Japanese, and Russian in the ITM and VT [Institut tochnoi mekhaniki i vychislitel'noi tekhniki, Institute of Precision Mechanics and Computer Engineering] of the Academy of Sciences, USSR confirms the assumption that translation, even in such an unusual form as machine translation, is, as far as content is concerned, a linguistic problem.
2. The development of linguistic methods of solving M.T. problems may be achieved on the basis of so-called "traditional linguistics" and the results of such work may be of definite interest to linguistics. The systematization of language phenomena that accompanies M.T. research should help to eliminate the well known contradictions and diffuseness in the definitions of certain linguistic categories accepted at the present time.
3. A distinction between the lexical and grammatical aspects of the translation problem seems essential. The difference in quality and degree of lexical and grammatical abstraction emerges in the system of machine translation with unusual clarity. Rules of lexical character are recorded in a glossary. Grammatical rules are not included in the glossary and form the content of so-called "translation routines".
4. An M.T. glossary must be so constructed that its various parts can expand unevenly. An M.T.
glossary may be divided into 2 main sections: I, single-meaning glossary, and II, multiple-meaning glossary. Each of these is in turn subdivided into:
   Ia glossary of technical terms,
   Ib glossary of words in general use,
   IIa glossary of full-meaning words,
   IIb glossary of auxiliary words.
An M.T. glossary is accompanied by several auxiliary routines (comprising one cycle in the translation routine) in order that the lexical analysis of a sentence may be performed without human intervention:
   1. Routine of dividing a sentence into words. (Routine 1 is not essential for all languages, only for such as Chinese, Japanese, Arabic, etc., where the sentence is written down in the form of an unbroken succession of signs with no spaces between the words.)
   2. Routine of obtaining the glossary form of a word.
   3. Grammatical analysis of "unknown words".
   4. Syntactic analysis of "formulas".
   5. Routine of distinguishing homonyms.
   6. Routine of analysis of polysemants.
5. The basic problems of an M.T. glossary, size and polysemy, are satisfactorily solved by combining the following two methods:
   (a) division of the glossary into a series of "special glossaries" corresponding to various spheres of human activity (in our case, corresponding to the various branches of science);
   (b) contextual (functional-semantic) analysis of the words.
6. The main features of an M.T. glossary are that it:
   (a) contains a systematized description of each word that is capable of ensuring the subsequent grammatical analysis of the word in the sentence (the "invariant characteristics of the word");
   (b) provides for a genuine correspondence between two lexical systems, registering the "relevant meanings" of words;
   (c) takes cognizance of "zero meanings" of words, i.e. instances where a word must not be translated into another language as a separate lexical unit.
For the rest, an M.T.
glossary may be arranged on the same principles as those underlying existing bilingual dictionaries. In particular, there is no need to convert an M.T. glossary into a "glossary of stems". Moreover, a glossary of words has definite advantages for M.T. too.
7. The solution of the problem of grammatical analysis in M.T. is connected with the realization of a logical, structural description of language. Hence, conclusions drawn from solving this problem may have a certain general linguistic interest.
8. Following the grammatical analysis of 5 linguistic systems (English, German, Chinese, Japanese, and Russian) for M.T., it seemed possible to use a consistent system of dividing words into the following 9 lexico-grammatical categories:
   1. verbs,
   2. substantives,
   3. numerals,
   4. adjectives,
   5. adverbs,
   6. prepositions (Chinese and Japanese postpositions may be classified as prepositions on the basis of their resemblance to prepositions),
   7. conjunctions,
   8. particles,
   9. parenthetic words.
The principle of dividing words into these classes is similar to that underlying the division of words into parts of speech. Hence, there is no need to do away with the traditional names of the parts of speech. Only a bit more precision is required. Thus, the classes of numerals, adjectives, and adverbs have been changed. Pronouns are not isolated in a separate class, but the pronominal category is distinguished for such parts of speech as substantives, adjectives, and adverbs. Systematization of grammatical categories within each part of speech resulted in differentiating between the variant (contextual) and invariant grammatical characteristics of the words.
9. The grammatical processing of sentences by the translation routines breaks down into two independent steps: Analysis of sentence to be translated, and Synthesis of translated sentence.
We call analysis routines that system of rules whereby the linguistic analysis of a sentence to be translated can be performed in such a way as to produce the information needed for the grammatical structure of the translated sentence. In the M.T. variant developed at the Institute of Precision Mechanics and Computer Engineering of the Academy of Sciences, USSR, the analysis routines include the following 8 routines in cycle II:
   1. functional analysis of punctuation marks;
   2. breakdown of sentences into clauses and more precise definition of parenthetical phrases in clauses;
   3. syntactic analysis of clauses;
   4. "verb" routine;
   5. "numeral" routine;
   6. "substantive" routine;
   7. "adjective" routine;
   8. "changing word order in translated sentence" routine.
10. We call synthesis routines that system of rules whereby the grammatical structure of the translated clause can be formed. As of now 4 synthesis routines for the Russian sentence have been worked out:
   1. word-forming routine;
   2. "verb" routine;
   3. "adjective" routine;
   4. "substantive" routine.
It is proposed to develop a routine for editing the style of translated Russian sentences as well as synthesis routines for several other languages, particularly Chinese and English. This would make it possible to produce multilingual machine translation (from many languages into many languages), using Russian, it is suggested, as an intermediary language.

3. AN INTERMEDIARY LANGUAGE AND ARTIFICIAL INTERNATIONAL LANGUAGES

Ye. A. Bokarev (Moscow)

1. Creation of an intermediary language for machine translation or an artificial Esperanto-type international language requires the solution of several problems, the main one being the need to establish correspondences between the lexical and grammatical units of languages that differ in their structural characteristics.
2. International languages based on natural languages use everything that is essential for communication and reject what is non-essential or of little value (exceptions of various kinds, polytypic declensions and conjugations, etc.). The most consistent in this respect are the autonomistic languages (Esperanto and Ido). Languages of another kind, the naturalistic (Interlingua and Occidental), retain certain of the unjustified complications and inconsistencies of natural languages.
3. The most important problems in the field of grammar are: indication of the parts of speech, expression of subject-object relations, and word order in sentences.
4. In the field of word formation there is the problem of productivity of word-forming affixes and use of established patterns.
5. Some of these problems may be solved in various ways when an intermediary language or an artificial language for international relations is created. Nevertheless, there are many problems that can be solved in similar fashion.

4. THE VALUE OF MATHEMATICAL METHODS IN LINGUISTICS

R. L. Dobrushin (Moscow)

1. Uses of linguistics as a justification for its existence. Classical fields of use: teaching of languages and application to problems in history.
2. Demands on language research imposed by classical fields of application of linguistics.
3. Newest fields of application of linguistics: mechanical translation and use for transmission of information in the form of written and oral linguistic material.
4. Problems and methods of linguistic research dictated by the newest fields of linguistic applications.
5. Mathematical methods of linguistic investigation:
   (a) methods used in theory of numbers applied to investigation of the grammatical structure of language;
   (b) investigation of language structure by methods used in the theory of information;
   (c) linguistic statistics.
6. Interrelations between classical and modern linguistic techniques. Potential for the development of mathematical methods.

5. CONVERSION OF COMMUNICATIONS AND CONVERSION OF CODES

V. V. Ivanov (Moscow)

1. In theoretical investigations dealing with automatization of linguistic processes, it is advisable to distinguish the conversion of communications (texts) from the conversion of codes (sign systems).
2. By communication conversion we understand the translation of a communication from one code into another (recoding) while retaining the invariant information. When speech is transmitted at a distance, the linguistic structure of the text is kept, which makes this case very simple. When sentences are converted within a single language, the linguistic structure of the text is partially transformed. This transformation may, therefore, be regarded as a first approach to machine translation. In translating from one concrete language into another concrete language or into an intermediary language, it is possible to preserve the characteristics of the linguistic structure of the text, which are directly reflected in the structure of the text in the other language. In translating into the logical, abstract language of an information machine, only the logical structure of the text can be preserved. The increasing degree of difficulty of each of these tasks is determined by the complexity of the rules for converting a communication, which vary with the extent to which the information appearing as an invariant during the conversions can be formalized.
3. By code conversion we understand the translation of one code into another while retaining the code pattern.
An intermediary language for machine translation and an abstract machine language for an information machine may be regarded as abstract systems, which are represented by the concrete language of scientific and technical texts. Therefore, to develop these abstract systems we require a formal analysis of the individual concrete languages in order to reveal their common patterns. An abstract machine language may be constructed by converting concrete languages derived, in turn, from interpreting an abstract language. The general theory of code conversion may be used for the deductive derivation of one scientific system from another. In this connection it is necessary to investigate code isomorphism in the various sciences (and code isomorphism in a single science at various stages in its history). At the same time a general theory of code conversion makes it possible to formulate with greater precision the concepts of comparative and historical linguistics, due to the fact that comparative-historical calculation is a special case of code calculation.

6. THE SEQUENCE IN BUILDING A LANGUAGE SYSTEM

P. S. Kuznetsov (Moscow)

1. Any language is a system of simple units of various orders so interlinked by hierarchical relations that each elemental unit is in some respect indivisible (without loss of some of its properties) and at the same time consists of a certain number of units of a lower order.
2. The simple units of one order form what is called a level, stage, or layer in a language system. Thus, one level is formed by such elemental units as phonemes, another by morphemes, which consist of phonemes, a third by lexemes (words), which consist of morphemes, etc.
3. When we build any language system, apparently the simplest way should be to define in succession the units of the lowest order and then pass on to the units of the next higher order, whose units and relations must be defined in accordance with concepts already defined for the next lower order. Thus, having defined the concept of phoneme, we may define the morpheme, which always consists of a certain number of phonemes.
4. But if we proceed in this fashion, we shall not be able to construct an internally consistent system, since at certain stages along the way we will meet up with vicious circles (in the logical sense).
5. The reason is that a system of units in any single order requires certain concepts lying outside itself for its own construction or, in other words, forming with respect to it meta-concepts [META-PONYATIYA]. These meta-concepts relate in part to the system of units in a lower order (with respect to the order in question) and they may relate in part also to the system of units in a higher order (with respect to the order in question). Thus, the definition of phonemes and their interrelations (in the phonological sense, to which I subscribe; I have often set forth in print the case for this view, so there is no need for me to go into it again here) is based not only on concepts from the field of phonetics, but also on some concepts from the field of morphology, i.e., they relate to the level of morphemes.
6. A more complicated method of constructing a language system is outlined on the basis of the foregoing. In some cases it is necessary to proceed directly from the system of the lower (e.g.
first) order not to the next higher (in the given case, second) order, but to the following (in our case, third) order; and having constructed it without utilizing the concepts of the second order, to proceed to this last; and then to return to the system of the third order and finish constructing it, now also making use of the concepts relating to the system of the second order.

7. MACHINE TRANSLATION STUDIES IN THE MATHEMATICAL INSTITUTE OF THE ACADEMY OF SCIENCES, USSR

A. A. Lyapunov and O. S. Kulagina (Moscow)

I. Introduction

1. Electronic computers are a highly efficient means of processing information.
2. It is practical to use electronic computers as an auxiliary tool for intellectual work.
3. Human speech as a means of transmitting information.
4. The importance of making it possible for machines to use human speech.
5. Machine translation as a first step in instructing machines to work with a language.

II. Brief Description of Work Done

6. French-Russian translation. Empirical formulation of rules. Construction of an algorithm suited to the machine's capabilities. Elaboration of problems connected with coding and information conversion in the machine memory and with the organization of programs to increase the efficiency of machine operation. Utilization of scales. Work on improving the algorithm and programs on the basis of experimental translations.
7. English-Russian translation. Use of structural-syntactic analysis of English. Classification of English and Russian words on the basis of formal criteria. Grammatical configurations of English and Russian, a comparison. Problems in eliminating homonymy. Use of experience with French-Russian translation in problems connected with coding, program construction, and Russian sentence analysis.
8. Problems in automatizing translation programming.
Operational description of translation algorithms. Compiling program, constructing the translating program according to its operational description. Significance of experience gained in programming French-Russian translation.
9. Theory of numbers approach to the construction of a formal grammar. Classification of words, identification of configurations, determination of relations between words. Possibilities of using a similar approach to syntax and phonetics.
10. Basic principles of operation: advance by "ledges" [USTUPAMI]; maximal theoretical interpretation of each step; planning of work based on interrelations between machine and thought; close contact between groups working on different languages; joint work of mathematicians and linguists at all stages starting with the formulation of translation rules.

III. Problems

11. Linguistic problems in machine translation.
   (a) Development of precise system of linguistic concepts, their operation in translation algorithms as a criterion of usefulness.
   (b) Development of methods of constructing translation algorithms for different languages.
   Intermediary languages, construction and use.
   Problems in linguistic statistics.
   Investigation of language structure on the basis of translation
12. Technical problems in machine translation.
   (a) Elaboration of effective designs for translation machines.
   (b) Establishment of operational systems for these machines.
   (c) Elaboration of special memory devices (large capacity with swift retrieval).
   (d) Design of special input and output devices.
13. Mathematical problems in machine translation.
   (a) Development of effective means of coding information at the various stages of operation.
   (b) Increasing the output of algorithms.
   (c) Investigation of abstract language models and translation models.
   (d) Elaboration of a mathematical language to describe translation algorithms.
   (e) Automatization of programming of translation algorithms.
14. Combined cybernetic problems.
   (a) Machine output of algorithms.
   (b) Machine production of linguistic statistics.
   (c) Machine construction of models of concrete languages on the basis of limited text materials.

IV. Problems Connected with Work in the Field of Machine Translation

15. Need to elaborate different approaches to the problem by different research groups maintaining close contact among themselves. Value of cooperation in machine translation. Need to establish systematic exchange of information between groups working in different cities.
16. Need for representatives of the various fields of specialization to participate in the work on machine translation: mathematicians, linguists, and engineers constantly cooperating at all stages of the work from formulation of rules to study of experimental translations.

8. AN INTERMEDIARY LANGUAGE MODEL FOR MACHINE TRANSLATION

I. A. Mel'chuk (Moscow)

The following represents one of the possible solutions to the problem of machine translation from many languages into many languages:
1. Two sets of rules are worked out for each language:
(a) The rules of analysis which, with the help of appropriate glossaries and charts, effect the transfer of a text into a conventional numerical code in such a way that each word in a given form and given syntactic function is matched one-for-one with a chain of figures called a set of information for the word.
The series of sets of information developed is broken down into paired typical combinations with which the relations existing in each given pair of information sets have been matched one-for-one. The fixed relation between the two sets of information (containing the syntactic relation between the corresponding words) is called a "configuration". One member of the pair which satisfies the given configuration is called the "governing" and the other the "governed" member. The total number of configurations is not very large (in a specialized text no more than 200). As a result of the analysis, each word in the text to be translated is replaced by a set of information and each set contains an indication of what configuration it satisfies and which member it is.
(b) The rules of synthesis permit transition from the numerical code, i.e., from a series of sets of information, to words, to the actual text. This operation is the reverse of the analysis described above. Each configuration contains an indication of what form a word that satisfies the configuration in question as either member of the pair must have. Therefore, if we know the stem of a word, the kind of configuration, and exactly how the word satisfies it, we can synthesize the necessary form.
Both analysis and synthesis are effected in complete independence of the translation.
2. A special system of rules and charts is being worked out, determining correspondences between the conventional numerical codes of different languages (identical correspondences are not essential; rules for choice may be used). These correspondences are established on 3 levels:
(a) lexical correspondences (i.e.
lexical transfer of stems);

(b) grammatical correspondences (transfer of so-called "extra-syntactic" categories as, for example, number in nouns or tense and mood in verbs);

(c) syntactic correspondences (correspondences between configurations, i.e. syntactic relations, of different languages as well as correspondences between groups of configurations: clauses and various types of phrases).

This abstract system of correspondences is also called an intermediary language, which does not exist, therefore, as any real or artificial language but represents a unique calculus.

3. The translation process consists of three steps: analysis -- transition from a text in the source language to a series of configurations; transition -- from a series of configurations in the source language to a series of configurations in the target language; synthesis -- transition from a series of configurations in the target language to a genuine text in it.

4. Underlying the translation is a syntactic analysis: establishment of configurations, i.e., ascertaining the relations between words in the source language and expressing these relations by the most suitable means in the target language. Such morphological data as case, number, and person of a verb (also the use of auxiliary words is provisionally included here) are used only as aids while ascertaining the syntactic relations.

5. During the course of syntactic analysis both the functions of words in the sentence ("sentence members") and the interdependence of words are established. The latter factor is especially important, since the interdependence of words makes it possible during synthesis to regulate their arrangement, i.e. to achieve the best word order.

6. The model of an intermediary language that has been worked out for machine translation includes for the present Russian, English, Chinese, French, and Hungarian. The purpose is to develop a system of formulating rules and the best method of recording and arranging the material.
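The three steps of paragraph 3 (analysis, transition, synthesis) can be illustrated with a minimal sketch. It is not the author's system. The glossary entries, the configuration names `ATTR` and `ATTR_POSTPOSED`, the word-order chart, and the use of readable strings in place of the numerical code are all assumptions made for illustration. Each word becomes a "set of information" holding its stem, its configuration, and its role as governing or governed member. The correspondences are then applied at the lexical and syntactic levels, and synthesis picks the form and order required by the target configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InfoSet:
    """A 'set of information' for one word (strings stand in for numeric codes)."""
    stem: str    # stem code of the word
    config: str  # configuration the word takes part in
    role: str    # "governing" or "governed" member of the pair


# Hypothetical analysis glossary: source word form -> (stem, configuration, role).
SOURCE_GLOSSARY = {
    "red": ("RED", "ATTR", "governed"),
    "house": ("HOUSE", "ATTR", "governing"),
}

# Hypothetical correspondences playing the part of the intermediary language.
LEXICAL = {"RED": "ROUGE", "HOUSE": "MAISON"}
SYNTACTIC = {"ATTR": "ATTR_POSTPOSED"}

# Hypothetical synthesis chart: (stem, configuration, role) -> target word form.
TARGET_FORMS = {
    ("MAISON", "ATTR_POSTPOSED", "governing"): "maison",
    ("ROUGE", "ATTR_POSTPOSED", "governed"): "rouge",
}

# Hypothetical word order prescribed by each target configuration.
TARGET_ORDER = {"ATTR_POSTPOSED": ("governing", "governed")}


def analyze(text):
    """Analysis: source text -> series of sets of information."""
    return [InfoSet(*SOURCE_GLOSSARY[word]) for word in text.split()]


def transfer(info_sets):
    """Transition: source-language sets -> target-language sets."""
    return [
        InfoSet(LEXICAL[s.stem], SYNTACTIC[s.config], s.role)
        for s in info_sets
    ]


def synthesize(info_sets):
    """Synthesis: target-language sets -> text, ordered by configuration."""
    ordered = sorted(
        info_sets, key=lambda s: TARGET_ORDER[s.config].index(s.role)
    )
    return " ".join(TARGET_FORMS[(s.stem, s.config, s.role)] for s in ordered)


def translate(text):
    return synthesize(transfer(analyze(text)))


print(translate("red house"))  # maison rouge
```

The sketch handles a single configuration pair per input. The abstract's remark that analysis and synthesis are independent of the translation shows up here in that `analyze` and `synthesize` each consult only one language's charts, while `transfer` alone uses the correspondences.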
9. THE SIGNIFICANCE OF MACHINE TRANSLATION FOR LINGUISTICS

M. I. Steblin-Kamenskii (Leningrad)

Besides promoting cooperation with representatives of the precise sciences and thereby instilling in linguists the need for greater accuracy in their research and formulations, work on machine translation is important for linguistics in three respects:

(1) It is critical of all the traditional grammatical concepts, primarily those like the "parts of speech", "members of a clause", "clause", etc. Based, as it is, on practical considerations, this criticism will be more objective and effective than purely theoretical criticism.

(2) It makes clear that the same linguistic fact may be described in various ways depending on what general definitions or terminological conventions are used, with the result that all the dogmas established in the individual branches of linguistics need to be reviewed.

(3) It will aid in overcoming linguistic "semantism" [SEMANTIZMA], i.e. the practice whereby linguists follow the line of least resistance and study meanings, not the structure of language. Language differs from other sign systems not by the existence of meanings (which are not peculiar to language), but by the structure of expression.

10. THE "ACTIVE" AND "PASSIVE" GRAMMAR OF L. V. SHCHERBA AND THE PROBLEM OF MACHINE TRANSLATION

I. I. Revzin (Moscow)

1. The polysemantic term "grammar" (either "grammatical structure of a language" or "description of the grammatical structure of a language") is one cause of the erroneous conception that a given language has only a single grammatical structure, that there is only one correct "grammar" (as a description of a system).

2. The description of a language system depends on the goal that an investigator sets for himself. This notion was the core of the remarkable theory of L. V.
Shcherba on "passive" and "active" grammar, which has suffered undeserved oblivion.

3. "Passive grammar studies the functions and meanings of structural elements in a language on the basis of their forms, i.e. the external side. Active grammar teaches the use of these forms." (L. V. Shcherba) The purpose of instruction in passive grammar is to teach one to understand a text in the language. The purpose of instruction in active grammar is to teach one to express thoughts in the language.

4. One of the dangers pointed out in connection with L. V. Shcherba's ideas is the assumption of a "denudation of thought" or "existence of thought without language" in passing from form to pure meaning and from pure meaning to form. However, no cognizance was taken of the fact that a thought need not be registered in a concrete language; it may be registered in an abstract, artificial language where there is a simple, reciprocal correspondence between the designator and the thing designated.

5. Machine translation assumes precisely such an abstract language, namely an intermediary language that must be implicitly present in any machine program and will apparently be described in the near future. If cybernetic analogies are adequately grounded, one may assume that the analogue of such an intermediary language is present in any translation (and, generally, in any form of logical activity).

6. Machine translation has demonstrated the correctness and need of a separate approach to the problem of text analysis ("passive" grammar in L. V. Shcherba's terminology) and to the problem of text synthesis ("active" grammar).

7.
The first problem was effectively solved by purely formal means. The limits of machine translation depend on a full solution of the second problem (the compilation of a list of synonyms -- by synonymy we understand the presence of several units corresponding to a single unit in an abstract language or, what amounts to the same thing, a single unit of thought -- and an algorithm for retrieving an equivalent under the given logical conditions).

8. Experience with machine translation has shown that, generally speaking, an inverse ratio is observable between the "active" and "passive" grammar of a language: the more complex the "passive" grammar, the simpler the "active", and vice versa. Hence, for a number of languages emphasis wholly on passive grammar might considerably alleviate the language curricula in schools.

9. L. V. Shcherba's ideas on the distinction between active and passive grammar, as strengthened and enriched by experience with machine translation, must ultimately find application in foreign language teaching (in secondary schools as well as in colleges and universities).

10. Secondary schools should make wide use of the methods of passive grammar, which are not only unusually effective for analyzing an unfamiliar text, but correspond to the habits of logical thinking developed in mathematics classes. Moreover, interest in learning the grammar of a foreign language can be heightened by introducing exercises in translating sentences "by machine". This would also serve the interests of polytechnical instruction.

11. The same considerations apply as well to language teaching in the natural science departments of universities and in the higher technical institutions where little use of the well developed formal-logical habits of students has been made up to now in foreign language teaching.

12.
Creating a scientific theory of "active grammar" would not only push forward the frontiers of machine translation, but assist instruction in language schools where grammar is still taught in undifferentiated fashion. This is of particular concern to translation departments where necessity dictated the conversion of a theory of translation into a theory of active grammar.

11. A GENERAL THEORY OF TRANSLATION IN CONNECTION WITH MACHINE TRANSLATION

V. Yu. Rozentsveig and I. I. Revzin (Moscow)

1. The possibility of creating a scientific theory of translation is still being argued by a number of specialists, both linguists and literary critics. Nor has there been any final answer to the question of whether a theory of translation concerns scientific linguistics or belongs to the field of literature.

2. The polysemantic term "translation" also awaits a definition. The historical paramountcy of artistic translation has resulted in the conceiving of every translation as an artistic production, as a creative achievement in the realm of language. Meanwhile, the development of new types of translation activity, chiefly in the field of scientific and technical literature, has made another conception of translation urgent, i.e. as a process of establishing principles of correspondence between the structures of two languages.

3. Disclosure of the possibility of translating texts by a machine and development of a theory of machine translation has shown that distinguishing between the fields of translation makes limitation of both concepts logically inexorable: (a) "translation 1" is translation as a form of creative activity and (b) "translation 2" is translation as the establishment of strict correspondences.

4. Translation as a form of creative activity is an object of study for theorists of literature. Translation as the establishment of strict correspondences is an object of study for linguists.
5. A linguistic theory of translation must regard translation ("translation 2") as a special kind of decoding with subsequent encoding into another system of symbols. The distinctive feature of this transformation of information is in the irreversibility of the process. The reason is that simple, reciprocal correspondences between language systems are lacking. Hence, rules for correspondence in translation are complicated by the need to formulate a number of restrictive conditions. Determination of these conditions is a proper object for a linguistic theory of translation. A general linguistic theory of translation studies ideal types and routines for matching systems of language symbols; a particular theory of translation analyzes the correspondences between the two languages. A general theory of translation is chiefly a deductive discipline, while a particular theory of translation is inductive.

6. Thus, the methodology of a linguistic theory of translation comprises: (a) methods of structural comparative analysis or, in other words, analysis of the synchronous stages of various languages; (b) methods of linguistic statistics; (c) methods of logical semantics, more precisely general semasiology. The very listing of these methods shows the main difference between the linguistic and literary theories of translation. The latter requires: (a) a study of the era; (b) the world outlook and creative method of the writer and literary school; (c) peculiarities of his individual artistic style.

7. From the semantic point of view "translation 2" is a certain reflection in itself (a system of elemental meanings is assumed to be invariant). "Translation 1", from this point of view, is not a reflection in itself, since pragmatic meaning, which plays a major role in "translation 1", does not coincide in two languages.

8.
Having marked off the object and methods of a linguistic theory of translation, we can not only ascertain the limits of machine translation, but also create a well structured, definitive theory of translation, that is to say a separate, scientific linguistic discipline. Creation of this discipline can help to perfect methods of training translators. It will undoubtedly find application in the teaching of foreign languages as well.

12. SPECTRA OF PHONEMES AND THEIR USE IN MACHINE TRANSLATION

V. A. Artemov and I. A. Zimnyaya (Moscow)

1. Oral information and translation machines must, among other things, be accessible to people with varying physical characteristics of speech. Therefore, their system of signalling must be based on the phonemic invariants of sounds or, in other words, on the spectra of phonemes.

2. Three aspects of the spectral analysis of speech sounds must be distinguished: (1) syntactic (phonologic), (2) semantic (phonetic), and (3) pragmatic (technical communicative).

3. A syntactic investigation of spectra of phonemes is based on contrasts within the sound system of a given language. A semantic investigation relates the spectra of phonemes to word meanings and grammatical forms. A pragmatic investigation of the spectra of speech sounds originates in and services practical needs.

4. A syntactic and semantic investigation of spectra of phonemes provides an exhaustive analysis of their physical properties, which form structures bearing a comparative and systematic character.

5. A pragmatic investigation of spectra of phonemes requires the determination of their minimal characteristics, which permit of their full or partial restoration, i.e. it becomes a compression of the spectra of phonemes.
A pragmatic investigation of spectra of phonemes becomes their compandor, including the compression and expansion of amplitude.

6. The Laboratory of Experimental Phonetics and Speech Psychology (LEF and PR) [Laboratoriya eksperimental'noi fonetiki i psikhologii rechi] of the First Moscow State Pedagogical Institute of Foreign Languages (MGPIIYa) [Moskovskii gosudarstvennyi pedagogicheskii institut inostrannykh yazykov] conducted investigations of the spectra of 5 vocalic phonemes of the a, o, u, i, e type in the following languages: (1) Russian (V. A. Artemov and I. A. Zimnyaya); (2) Georgian (T. G. Tsibadse); (3) Armenian (A. M. Aramyan and A. A. Khachatryan); (4) Lettish (I. A. Zimnyaya); (5) Albanian (I. A. Zimnyaya); (6) Bulgarian (I. A. Zimnyaya); (7) Czech (I. A. Zimnyaya); (8) German (L. P. Blokhina and I. A. Zimnyaya); (9) French (K. K. Barashnikova and V. S. Sokolova); (10) English (I. A. Zimnyaya). In addition, data on English were drawn from the works of Paget, Green and Potter, Petterson, and Kopp for purposes of comparison with the studies of the LEF and PR.

7. All the material was recorded with a basic tone of 120-150 cycles per second at a level of 65-70 db. The pronunciation of each speaker was representative of the literary speech of the various languages.

8. A comparison of the quantitative and graphic data shows that the following pragmatic rules are observable within each language:

(a) The a-type vowel is characterized by a wide formant region (600-1200 cycles) with gradually increasing intensity of the components in the direction of high frequencies (1200-2500 cycles).

(b) The o-type vowel is characterized by a central formant region somewhat shifted down to 400-1000 cycles per second.
(c) The u-type vowel is characterized by a somewhat narrower central formant region shifted still further toward the low frequencies of 300-800 cycles per second with a maximal elevation of amplitude in the range of 300-350 cycles per second.

(d) The i-type vowel is characterized by two main formant regions. The first is in the range of lower frequencies and almost coincides with the range of maximal intensification in the main formant of the u-type vowel, as Paget has pointed out. But a gentle falling-off is observed in the amplitude of the u-type, and a steep falling-off in the i-type.

(e) The e-type vowel is distinguished from the i-type by the formants shifted more to the center. The broader the e, the closer the formants come together.

9. The above-mentioned acoustical properties of the vowels completely correspond to the position and operation of the resonance chambers of the vocal apparatus, as stated in several reports of the LEF and PR as well as by Paget and Yakobson.

10. These studies indicate that the spectra of vowels on the syntactic and semantic plane have a structural character. V. A. Artemov suggested a means of determining these structures. It consists of separating from the vowel spectrum all the areas of relative intensification and establishing correlations between them, taking the lowest of them as 1.

11. At the same time a comparison of the spectra of the 5 types of vowels studied indicates that a structural correlation between the areas of intensification is retained within definite limits in the languages investigated. In this connection it is possible to speak about a certain structural and comparative invariant of these types of vowel spectra in the various languages, which is essential for signalling technique in translation machines.

13. AN OBJECTIVE INVESTIGATION OF MEANING ASSOCIATIONS

O. S.
Vinogradova and A. R. Luriya (Moscow)

1. An objective investigation of the association of meanings that are aroused in man by some word or other is a basic necessity for psychology as well as for linguistics. Despite the considerable progress achieved by modern linguistics, information theory, and psychological investigation of the development of the meaning of words in children, objective research techniques both of potential associations aroused by words and of the dynamics of these associations still remain to be worked out.

2. The use of different variations of the conditioned reflex method may play a vital role in elaborating objective ways of investigating meaning associations. By combining the showing of a word with some kind of involuntary reflex response (vascular, cutaneous-galvanic, etc. reaction) and then showing other words, the investigator is in a position to establish objectively what group of words shown elicits similar reactions and is consequently, to some extent, the equivalent of a previously shown word, and at the same time he is in a position to trace both the structure and the dynamics of these associations.

3. The report discusses the results obtained from an objective investigation of the system of associations by registering the specific and non-specific conditions of vascular reactions. Conclusions are drawn concerning certain factors that may determine the structure and dynamics of these associations in normal and abnormal experimental subjects.

14. THE TREATMENT OF CERTAIN CONCEPTS IN STRUCTURALISM

V. T. Grigor'yev (Moscow)

1. Interest has grown of late in the methods and concepts of the structuralist approach in linguistics due to the development of machine translation and other branches of applied linguistics. However, recent articles have treated certain structuralist concepts in an excessively one-sided manner and, in essence, incorrectly.
2. Phonemes are treated as though they were connecting elements lacking physical reality. The physical character of the differential signs of phonemes is denied. Real speech sounds are represented as something external with respect to language. Meanings, which are also removed from language, receive the same treatment. This method of handling speech sounds and meanings reflects only the views of L. Hjelmslev's group and is not to be ascribed to structuralism in general.

3. Actually, the structuralist method of investigating speech sounds takes into account their acoustic and articulative properties. The functional criterion used by the structuralists in phonetics makes it possible to isolate from the entire diverse mass of phonetic material the physical (acoustic and articulative) properties that carry the functional load and, consequently, are of prime importance to the linguist. The functional criterion ensures a differentiated (from the viewpoint of language structure) approach to the varied and changing properties of phonetic material. Using the functional criterion, linguists may be very helpful to engineers in solving practical problems confronting the several branches of engineering; contrariwise, orientation on pure relationship elements would prevent the linguists from solving practical problems and do away with the possibility of cooperation between them and the engineers.

4. The attitude of the structuralists toward meaning was determined by their interest in working out an objective method of investigating language. The striving to escape from the inadequacies of traditional linguistics led the structuralists to refuse in general to consider meaning as a solid criterion of linguistic form.
However, this refusal to take account of meaning in research methodology does not determine the structuralists' theoretical treatment of meaning. In many cases it exists harmoniously side by side with the acknowledgment of meaning as a basic element in the functioning of language. It must be admitted, however, that rejection of the semantic criterion imposed severe limitations on this school of linguistics. In practice, the field of semantics remained outside structural analysis.

5. The meaning of a word is the linguistic form of expressing an idea. Meaning cannot be separated from language simply because it does not exist prior to or apart from language. At the same time, meaning is a basic factor of language, determining its structure. It is important for the further development of applied linguistics that objective methods of semantic analysis be worked out. Naturally, in solving this problem full use will have to be made of the experience gained by the representatives of structuralism in their objective investigations of language.

6. A critical exploitation of the experience of structuralism is scientifically advisable. It is an indispensable preliminary stage in the task of introducing mathematical research methods into linguistics.

15. THE SIGNIFICANCE OF FREQUENCY AS A FACTOR IN [illegible]

V. M. Grigoryan (Yerevan)

1. A comparative study of modern Russian-Russian dictionaries reveals contradictory data. Thus, in various dictionaries (e.g. monotypic 4-volume works) one and the same word may be defined in different ways from the viewpoint of the language's limitations with respect to stylistic usage, and it is often possible to find inconsistencies in the order in which meanings are arranged.
These (and other) contradictions make things difficult for the reader who seeks information in order to determine the operative norm for a given linguistic fact.

2. Since the norms, as a rule, are correlated with the factor of frequency, statistical data are extremely essential in many cases, if maximal precision is to be attained. Some considerations supported by Russian language data (with due regard for strict synchronousness) are cited by way of illustrating this contention.

3. The plan proposed by us is not original; it agrees in principle with that employed in several frequency dictionaries published abroad (Harry H. Josselson, The Russian Word Count, Detroit, 1953; Victor Garcia Hoz, Vocabulario usual, vocabulario comun y vocabulario fundamental, Madrid, 1953). They are usually constructed on the basis of the familiar correlation between style and genre. Adapting this plan on the whole, we propose to set up 4 categories: (1) [illegible]; (2) speech in dialogues; (3) speech in monologues, using material from fiction; (4) non-fiction literature: newspapers, documents, etc. It is obvious that statistical data reflecting the frequency of usage of a specific word in each of the 4 categories must be selected on the basis of equal conditions. Clearly, these equal conditions will be ensured if the frequency of a given word is derived from an equal number of words in all 4 categories. If we designate the categories by a, b, c, and d, respectively, the total preliminary number of words in category a must be equal to the total preliminary number of words in category b, etc. This word total, it seems to us, can be advantageously determined by using the Hoz method. In addition, selections must be made from purely random material (but within the given category); the more varied the material, the more accurate will be the information. The resultant data can be used to determine stylistic functions.

16. EXPERIMENT TO DEFINE THE CONCEPT OF GRAMMATICAL CATEGORY

R.
L. Dobrushin (Moscow)

A given finite number of words is examined. A finite, ordered aggregate of words is called a sentence. The division of all sentences into two non-crossing classes is assumed to be given: a class of grammatically valid sentences and a class of grammatically invalid sentences. Word a is called subordinate to word b, if a valid sentence containing word a remains valid after a is replaced by b. Two words a and b are called equivalent, if a is subordinate to b and b is subordinate to a. All words are divided into two non-crossing classes of words equivalent to one another. Class A is subordinate to class B, if all words entered into class A are subordinate to words entered into class B. The system of classes and subordinations thus obtained -- called the basic grammatical structure of the language -- is examined. The result is a definition of the concept of grammatical category.

17. THE THEORY OF PROBABILITY AND DETERMINATION OF LINGUISTIC RELATIONSHIP

A. B. Dolgopol'skii (Moscow)

The proposed method of determining the relationship of language families by applying the theory of probability is, in broad outline, as follows:

1. On the basis of linguistic experience, those semantic points are isolated in which maximum historical stability of morphemes (without borrowing) is observed.

2. A determination is made in each group of languages under consideration as to which morphemes possessing a given meaning may with greater probability be regarded as the older. The usual techniques of comparative historical research as well as the method of internal reconstruction are used for this purpose.

3. We cannot speak about phonetic correspondences between language families being compared before the fact of relationship has been established.
Hence, at this initial stage of investigation we must rely wholly on phonetic resemblances. More precisely, we rely here on subsequent probability correlations. Following a comparison of cognate languages, it appears that the n-sound is the most probable of all the sounds in any single related language that correspond etymologically to the n-sound of another related language. The same may be said of the m-sound. But, possibly, not of the s-sound. At any rate, among all the sounds that correspond in one language to the s, s-sounds in another related language, the most probable, apparently, are the sounds of the same s, s-group. This would also seem to be true of the l, r-group, the b, p, f-group, the t, d-group, the k, g, k, h-group, etc. In this connection, we perhaps can't say anything about vowels or laryngeals. Starting with these probability considerations, we may be able (leaving aside the vowels and laryngeals) to base our subsequent discussions on the data of consonant coincidences between various morphemes in the different languages. We will term "consonant coincidence" the correspondence between consonants that remain within one of the above-mentioned groups. These groups must be chosen in such a way that phonetic shifts of these sounds are no more probable than retention of the sound (retention within the group). The groups cited here are obviously only for illustrative purposes. Actually, comparative-historical phonetic materials from all possible language families must be used to establish the most probable sound correspondences (one of these correspondences is the most probable for several sounds -- the correspondence of a sound to itself). As a
result, we may select, let us say, 10 or 7 different sound units which will constitute the material for phonetic comparisons.

4. Comparing the equivalent morphemes in the different families, we note the phonetic coincidences (cf. para. 3). We then use appropriate formulas from the theory of probability to measure the probability of the accidental coincidence of a certain number of morphemes in so many languages from so many comparable items, taking into account the number of old synonymous morphemes for each semantic point of each language group as well as the total number of consonants distinguished during the comparison (cf. para. 2). If the probability of accidental coincidence proves to be quite low, it will be a significant argument in favor of the relationship of the languages in question.

Use of the theory of probability will enable us to test the evidence from comparisons between the various languages cited in numerous works dealing with the problem of language family relationship (e.g. Trombetti, Winkler, etc.).

A. A. Zinov'yev (Moscow)

1. The process of translating from one language into another may be described as a language consisting exclusively of definitions. Breakdown of the language into elements is here assumed to be effected. It is possible to model the formal [illegible] of definitions, one may suppose, by means of a special device. Having determined all possible definition-type relations at least between a selected part of the elements of one language and a selected part of the elements of another language, we can use the modelling to produce in standard form at least a partial translation (if only as an initial approximation).

2. A general theory of definitions is constructed as part of a theory of symbols; several variants are possible depending on the original concepts in the statement and on the formal apparatus for constructing the theory. The suggested variant is constructed
by the initial concept "Choice" [illegible] the concepts "Symbol", "Term", and "Definition". The formal [illegible] is constructed on the basis of the factors ("Each") and ("One and only one of") [illegible].

3. A general theory of definitions may contain proof of rules for definitions, elicitation of the conditions governing their use, rules for deduction and their interconnections.

19. LINGUISTIC PROBLEMS CONNECTED WITH POETRY TRANSLATION

V. V. Ivanov (Moscow)

1. The distinction between the poetic model of a text and this text may serve as a convenient starting point in solving the problem of poetry translation. Translation makes it possible to recreate the same poetic model by means of another language while retaining the relation between the model and the text. On the other hand, the direct conversion of a poetic text in one language into a poetic text in another is impossible.

2. The amount of information contained in a text is determined by the extent of deviation of this text from the statistical norms of ordinary language and from the statistical norms of the poetic language of a given era. A violation of the statistical norms of ordinary language may become the norm of poetic language, which results in decreasing the amount of information contained in poetic texts. Poetry translation assumes the transmission of the statistical characteristics of a text in conformity with the language into which the translation is made.

3. The sound structure of verse is determined by the phonological structure of the language, as was first pointed out by R. Yakobson.
It follows from this that transmission of the phonetic characteristics of the text structure is possible only when the corresponding elements in the phonological systems of the two languages coincide. The non-translatability of a poetic text is to a very large degree determined by the fact that in poetic language the plane of content is functionally connected with the plane of expression; insofar as the plane of expression is in principle untranslatable, the plane of content appears partially untranslatable. This limitation may also apply to the poetic model of the text if (as with Khlebnikov) units from the plane of expression are included in this model.

4. Phonetic coincidences of parts of words are used to organize a poetic text chiefly in cases where they are superfluous from the morphological point of view. Conversely, morphologically essential phonetic resemblances contain the least amount of information from the viewpoint of poetic organization of speech (cf. the problem of verbal rhythm in Russian poetry). Consequently, the possibility of transmitting phonetic repeats [POVTOROV] depends not only on the phonological, but also on the morphological resemblances between the two languages in question.

5. The predominance of syntagmatic connections between words over paradigmatic connections is a peculiarity of poetic text on the plane of content. We may see in this the result of transforming language text in accordance with a poetic model (the relation between "emotion and text", to use [name illegible]'s term).
This transformation [transliterated Russian term illegible] can be effected not only in the original but also in the translation.

6. A line-by-line analysis of text does not yield satisfactory results because it reveals little about the structure of long passages, which are real units (cf. the definition of a period as "wave length" in Milton's verse, as suggested by T. S. Eliot). If a line-by-line translation approach is [illegible], then for a translation based on a poetic model of the [illegible] it is considerably more important to translate the major rhythmic and syntactic units into which the work is divided. As an example, a [illegible] translation of VYKHOZHU ODIN YA NA DOROGU [I go out alone onto the road] is analyzed. The continuity of an invariant [illegible] and its [illegible], in principle, to be formalized [illegible] the possibility of automatic translation of poetry by means of computers.

20. GÖDEL'S THEOREM AND LINGUISTIC PARADOXES
V. V. Ivanov (Moscow)

1. The resemblance between mathematics and linguistics also applies to the trends of these sciences as they develop in the 20th century. The theoretical foundations of the sciences are being investigated in anticipation of practical applications; the results of these investigations will eventually prove vital for practical purposes.

2. Gödel's theorem, according to which the non-contradictoriness of a theory cannot be demonstrated within the formalized theory itself, may be extended to linguistic theories by means of the generalization of the theorem, which comes down to an affirmation of the incompleteness of any system of symbols (including language). However, it would be essential not to restrict oneself to this formulation in investigating the foundations of structural linguistics, but to examine the conclusions resulting from a linguistic analogue of Gödel's theorem.

3. The most severely formalized theories of language that examine "constructional linguistic objects" were developed within the framework of distributive analysis, which assumes the possibility of describing the elements of a language on the basis of their distribution. It is not difficult to show that the logical application of this principle leads to linguistic paradoxes (e.g., in the distributive separation of phonemes, word classes, meanings of polysemantic words, etc.). The distribution of elements turns out to be impossible if these elements were not given previously. But the axiomatic introduction of language elements contradicts not only the principles of distributive investigation, but also the requirements of automatic analysis of written and oral speech. The axiomatic introduction of a class of regular sentences appears to be [un?]satisfactory for purposes of synthesis.

4. For the reasons given above, it is possible at the present time to fashion a formal theory, which can be used to construct a program of automatic analysis, only for a maximally simplified approximation to a real language. We have in mind cases with simple correspondence between form and substance: on the plane of expression, for a system of standard, typical variants of phonemes; on the plane of content, for a standard language of science. The absence of paradoxes when these cases are analyzed does not permit, however, of extending the results obtained to ordinary language, the metalanguage for which (unlike the cases mentioned) cannot be formalized (this applies both to the phonological and to the semantic metalanguage). Automatic analysis of real language requires the employment of linguistic methods other than those considered above and the use of self-teaching type machines (with probability elements).

21. METHODS OF BREAKING DOWN A SYNTACTIC WHOLE
L. I. Iliya (Moscow)

1. Linguists representing the most different schools use as a starting point in their methods of analysis the possibility, objectively existing in any language, of isolating a certain "whole" as a maximal unit that can be broken down into similar segments, i.e., segments comparable in any respect whatsoever. This "whole", which has been variously called "utterance", "sentence", or "clause", belongs simultaneously to all the planes or "levels" of a language (phonological, grammatical, and semantic) and is characterized by the fact that its borders coincide in all three planes, which makes this segment a maximally complete or basic unit for any decomposition.

2. The breakdown of a "whole", due to its complexity of structure, is done on the basis of criteria that differ for each plane. As a result, it yields segments the boundaries of which do not always coincide or, as they say, are not "commensurable". Semantic decomposition is to a certain extent independent of the grammatical, and it fails to establish a fixed correlation between the boundaries created by rhythmic-intonation decomposition and the boundaries of morphemes, words, and groups of words.

3. Modern linguists have attempted to eliminate the incommensurability of the planes by seeking a single principle common to all stages of analysis. However, unity of principle is achieved in some theories by ignoring some aspect of language structure (e.g., meaning is excluded in Harris' method and in the rhythmic-intonation decomposition of Trager and Smith, while grammatical structure is ignored in Shcherba's intonation-semantic decomposition). Orderliness of method is attained at the price of simplifying linguistic analysis, which therefore cannot be regarded as adequate for research in all its complexities.
However, new methods of analysis focussing on formal criteria have been used to study them deeply, and modern techniques of measuring such language units as phonemes, morphemes, and words have reached a high degree of precision.

4. The task of linguistic analysis is not only to isolate the basic linguistic units, but to determine the relations between the units, that is, [first] of all semantic relations. The contemporary school of linguistics acknowledges as "structural", i.e., as pertaining to linguistic analysis, only those relations to which definite forms of expression ("signals") correspond. Two main trends in the investigation of syntactic relations can be discerned at the present time: (a) the comparatively recent theory of "direct constituents" (Bloomfield, Pike, Wells), which bases sentence decomposition on the relations of a logical hierarchy of subordination that links all the parts of a sentence into a single whole, and (b) the theory which may be provisionally called the theory of "members of the sentence". The latter has a long tradition and many opponents, but finds support among the major representatives of contemporary linguistics (Kurilovich, Bazel in part, Diederichsen). The theory considers the sentence a whole, the parts of which are linked together by functional relations.

5. The direct constituent method, which is based on a single type of relationship (the heterogeneity of functions of the constituents), leaves the general problem of determination of syntactic relations open and investigates for the most part the combinability of constituents and typical patterns. On the other hand, for the theory of "members of the sentence" the problem of syntactic relations is fundamental.
Formerly, these relations were all too frequently distinguished purely on the basis of meaning, not of formal criteria, although the inclusion of such criteria in the principle is desirable and feasible (Fries, Togeby). The study of basic syntactic relations requires for its own continuing development that all modern methods of linguistic analysis be utilized, particularly the technique of distributive analysis. The "direct constituents" and "members of the sentence" methods do not exclude one another. Rather, they are complementary, as they permit the sentence to be studied in various respects.

22. THE LOGICAL NATURE OF CONTEXT
G. V. Kolshanskii (Moscow)

1. The term "context", given the polysemy of language forms, may be defined from the linguistic point of view as a combination of conditions determining the simple, concrete identification of any linguistic phenomenon (lexical, grammatical, etc.). By "simple" we are to understand the display of only one of the many possible properties of the form under the given conditions (e.g., one meaning of a word, one word order, one intonation, etc.). In this report we are considering cases of determination of meaning in polysemantic words regardless of the method of origination (metaphor, metonymy, homonymy, etc.).

2. Contextual conditions may be found within the language itself, but they may also comprise indications lying outside language material. Among the language conditions it is necessary to distinguish between indications included within a single sentence and textual indications. Among the external conditions it is necessary to differentiate between situation, object, and graphic indications.

3. The combination of possible conditions called context may be realized while the precise meaning of a sentence is being formulated in language only through a definite, active, logical process, since indications by themselves are inert and can influence the meaning of a linguistic form only as a starting point in the functional process of achieving a result that makes sense. Since the method of search by context is effective in the semantic area of language, it is in essence a speculative, logical process of reasoning about the meanings of language forms. This rational search for the essential and uniquely correct (in the ideal approach to a solution of this problem) result is a process of constructing a syllogism or chains of syllogisms where the answer needed to establish the true meaning of the word and sentence is the final deduction.

4. A syllogism is constructed by searching for the appropriate premise of a universal hypothetical syllogism (if ... then) or by interpreting a hypothetical-disjunctive judgment, the complexity of which depends on the character of the indication underlying the premise. While searching for the unknown meaning through external extra-linguistic indications, a syllogism is formed in accordance with the nearest indication contained in the context (e.g., determination of the meaning of "table" as a piece of furniture in the sentence "He has a good table" is based on situation: the meaning of "table" may be either A or B; here it is not B; therefore it is A, i.e., a piece of furniture).

5. Mention of the subject is sufficient for the major premise in order to determine the meaning through objective context (e.g., determination of the meaning of the word "solution" in chemistry and electrical engineering is made in similar fashion). The form of writing in a written text may also serve as a starting point for a syllogism about the true meaning of a word (e.g., a foreign spelling).

6. The method of searching for the determining factor through lexical environment is the most familiar way of determining the meaning of linguistic forms. The premise is based on the immediately adjacent word (starting of a Sputnik, starting of a motor) and a word standing in any position in the appropriate group (an effective operation to destroy ... a hostile garrison, vermin, tumors, etc., where all the semantic variants are [illegible] to members in the major premise of a hypothetical-disjunctive syllogism).

7. If there is insufficient indication within the sentence, the true meaning of a word is sought by forming several syllogisms to search for the premise of the last conclusion on the basis of the entire paragraph or text (e.g., the sentence "[illegible] did not allow our house to be harmed" receives the following logical interpretation: if it is not a question here of a house and family, then the word "house" must be understood to mean [illegible]. After examination of the text, the first [illegible] are set aside and the meaning "company" [illegible] according to the rule for a disjunctive syllogism. Compare, in German, "Alle Räder stehen still" [All the wheels are standing still], [illegible] a traffic [illegible]).

8. The process of ascertaining the true meaning is logically carried out by a hypothetical-disjunctive syllogism, but depending on the nature of the desired result the conclusion may be reached either by eliminating parts of the disjunctive judgment (given the possibility of complete enumeration of all the meanings of a word) or by first forming a disjunctive judgment (meaning: the word A may be either ... or ...). It should also be kept in mind that each operation is subject to rechecking.

9. Due to the fact that analysis of context is essentially a rational, logical process, it can in principle be theoretically formulated as an ordinary logical operation and be performed by a machine. The feasibility and advisability of any arrangement in connection with machine translation is the decisive factor in a given case. For simple formulas in a context the formalized operation to search for the necessary meaning may be worked out by introducing a simple quantor (a thematic quantor). When the meaning of a word is being interpreted on the basis of immediate environment, a virtually disjunctive premise may be set up, obviously consisting of up to 3 words occurring before and after a polysemantic word, provided, however, that preliminary linguistic analysis determines all the cases where the meaning of the given word depends on words capable of being associated with it. At the present stage this work can be performed only for a limited group of words in certain texts.

10. If the context extends beyond a sentence, the [illegible] of a disjunctive syllogism is practically impossible, since one cannot formally mention the indications on the basis of which the parts even of a fully set up disjunctive judgment will be eliminated.
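The elimination procedure of paragraphs 8 and 9 above can be pictured with a minimal sketch in Python. It is an illustration under stated assumptions, not the author's procedure: the sense inventory `SENSES`, its cue words, and the function name `disambiguate` are hypothetical. Only the window of up to 3 words on each side and the idea of striking out members of a disjunctive judgment come from the abstract.

```python
# Hypothetical sense inventory (an assumption for illustration): each sense of
# a polysemantic word lists context words that are compatible with it.
SENSES = {
    "starting": {
        "launching": {"sputnik", "rocket"},
        "switching on": {"motor", "engine"},
    },
}

WINDOW = 3  # words examined before and after the word, as in paragraph 9


def disambiguate(tokens, index, senses=SENSES, window=WINDOW):
    """Return the senses of tokens[index] that survive elimination.

    The word means "either A or B or ..."; every alternative that no word in
    the window supports is struck out. If nothing in the window bears on the
    word, all alternatives are returned and the case stays undecided.
    """
    word = tokens[index].lower()
    alternatives = senses.get(word)
    if alternatives is None:
        return []  # word is not in the (hypothetical) inventory
    left = tokens[max(0, index - window):index]
    right = tokens[index + 1:index + 1 + window]
    context = {t.lower() for t in left + right}
    surviving = [sense for sense, cues in alternatives.items() if cues & context]
    return surviving or list(alternatives)
```

Calling `disambiguate("the starting of a motor".split(), 1)` leaves only the sense "switching on", while a cue word lying more than three positions away leaves both alternatives standing. The sketch covers only the sentence-internal case; it does nothing for context that extends beyond the sentence.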
Likewise formally insoluble is the problem of ascertaining the limits of the operation to analyze context (both within a sentence and, much more so, outside it). This is the realm of active, creative thought. Thus, a complete, practical solution to the problem of mechanical determination of word meaning by context is excluded. Formalization of the rules for interpretation of context in machine translation clearly requires the application of statistical methods for probability determination of the contextual meaning of the words.

23. LINGUISTIC STATISTICS FROM RUSSIAN TEXTS
R. G. Kotov (Moscow)

1. The development of machine translation and the application of methods of analysis and synthesis to communications technology have created sound conditions for expanding cooperation between linguists and engineers. In this connection there has arisen a need to introduce into linguistics objective research methods permitting mathematical handling of the data. Linguistic statistics, which operates with quantitative values, offers wide possibilities for linguistic research. Linguistic statistical data are used to solve a number of problems in machine translation and communications technology. In addition, they may be successfully exploited for lexicographical purposes and for foreign language teaching.

2. The current statistical investigation of Russian language texts aims at preparing preliminary data in connection with constructing the program of lexical coding of telegraphic messages. The work was first done by hand on specimens of texts containing a total of 20,000 words.
Methods of analysis were determined by the existing possibilities and research goals. The texts to be analyzed were entered in order on index cards in the form of two-member word combinations, which made various types of calculation possible. It is proposed to use machine methods in the future for several tabulations, e.g., word frequency.

3. The treatment of the material has yielded thus far a frequency glossary, a glossary of stable word combinations, and data on the frequency of endings. Some principles governing the statistical distribution of the glossary for the texts examined are elucidated on this basis.

4. Redundancy in Russian texts of the type investigated is being determined by taking cognizance of probability correlations in the glossary. A theoretical limit to the savings expected from lexical coding is being ascertained. Lexical coding is regarded here as a particular case of decorrelation [DEKORRELYATSIYA] of messages by consolidation [UKRUPNENIYE].

5. Work is going on to elucidate the main types of two-member word combinations the sequence of which makes up a text, ascertain the conditional probabilities of endings, and eliminate the uncertainty of choice of grammatical form in relation to the preceding word. Data obtained on the material of two-member word combinations are assumed to apply to multi-member word combinations and to the sentence as a whole.

24. A METHOD OF DEFINING GRAMMATICAL CONCEPTS
O. S. Kulagina (Moscow)

1. Inconvenience of existing grammatical systems for machine translation and need to elaborate precise definitions of concepts.

2. Initial base of undefined concepts: word, sentence and marked [OTMECHENNAYA] sentence, environment.

3. Breakdown (partition) of sets of words into subsets [PODMNOZHESTVA], consolidation of breakdowns.

4. Concept of B-equivalence, amalgamation of B-equivalent subsets.
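Paragraphs 2 and 3 above take the word and the marked sentence as undefined notions and then break the set of words into subsets. The outline does not state the criterion for the breakdown, so the following Python sketch assumes one plausible reading: two words fall into the same subset when each can replace the other in every marked sentence without leaving the set of marked sentences. The toy sentences in `MARKED` and the names `substitutable` and `partition_by_substitution` are hypothetical and serve only as illustration.

```python
# Toy language (an assumption): the marked sentences are given as a finite set.
MARKED = {
    ("the", "dog", "sleeps"), ("the", "cat", "sleeps"),
    ("the", "dog", "runs"), ("the", "cat", "runs"),
    ("a", "dog", "sleeps"), ("a", "cat", "sleeps"),
    ("a", "dog", "runs"), ("a", "cat", "runs"),
}


def substitutable(a, b, marked):
    """True if replacing any single occurrence of a by b keeps the sentence marked."""
    for sentence in marked:
        for i, word in enumerate(sentence):
            if word == a and sentence[:i] + (b,) + sentence[i + 1:] not in marked:
                return False
    return True


def partition_by_substitution(marked):
    """Break the vocabulary into subsets of mutually substitutable words."""
    vocabulary = sorted({word for sentence in marked for word in sentence})
    subsets = []
    for word in vocabulary:
        for subset in subsets:
            representative = subset[0]
            # Mutual substitutability means both words occur in exactly the
            # same contexts, so comparing with one representative suffices.
            if substitutable(word, representative, marked) and substitutable(
                representative, word, marked
            ):
                subset.append(word)
                break
        else:
            subsets.append([word])
    return subsets
```

For the toy sentences, `partition_by_substitution(MARKED)` yields three subsets: `["a", "the"]`, `["cat", "dog"]`, and `["runs", "sleeps"]`. Whether this matches the definitions used in the report itself cannot be decided from the outline alone.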
Derived breakdown. Theorem concerning the impossibility of secondary amalgamation by equivalence.

5. Sequence of amalgamation of words: families, classes, types. Concept of a simple language. Two definitions of type and their equivalence.

6. Determination of configuration, resultant element, ranks of configurations. Concept of subordination of configurations. Determination of relations between elements of configurations.

25. A FORMAL THEORY OF THE SENTENCE
I. I. Revzin (Moscow)

1. More than 200 different definitions of "sentence" make, on the one hand, a deductive development of syntax impossible and show, on the other, that the approach to the problem of defining basic linguistic units requires greater precision.

2. Any definition of a language element is a metalinguistic expression (explicit or implicit). "Sentence" as a language word is, by its nature, different from "sentence" as a metalanguage word. Therefore, the aim to include in the definition everything that is intuitively understood when the sound complex "sentence" is pronounced is scarcely realizable. A term in linguistics, like an expression in metalinguistics, may reflect only certain intrinsic features corresponding to the usage of the word.

3. An analysis of existing definitions of "sentence" makes it possible to divide them into two unequal groups. The overwhelming number of definitions are connected with the purpose of the sentence, i.e., they include mention of the fact that "sentence" is a language unit serving to convey a "more or less complete thought". Only a few definitions are based on particularly formal criteria.

4. The defect of "sense" definitions lies chiefly in the fact that they violate the principle of homogeneity: they depart from the sphere of language as a system and assume or sanction the dissolving of an object of linguistics in an object of logic or psychology. Moreover, phrases like "more or less complete thought" and even simply "complete thought" are not defined more or less strictly in logic itself. The linguists are thereby doomed to waiting passively for the progress of logical semantics, which, it is easy to demonstrate, cannot itself develop without greater precision of linguistic concepts.

5. The defect of existing "formal" definitions as compared with "sense" definitions is that they lack the idea of syntactic coherence (according to Aidukevich), i.e., what is most important in this unit of language for a linguist. Sentence "coherence" is reflected, as a rule, in the "sense" definitions, but it is reflected functionally through the coherence of the judgment.

6. The formal definitions of "sentence" [PREDLOZHENIYA] coincide in substance with the definitions of "sentence" [FRAZY]. Meanwhile, the linguist is acutely aware of the need to distinguish between the two concepts.

7. The set-theoretic conception of language created by Soviet mathematicians is a completely explicit metalanguage of linguistics in which the basic linguistic categories may be rigorously defined.

8. In particular, the FRAZA [sentence], i.e., the ordered succession of smaller units, is taken as the original, undefined concept (the aggregate of meaningful or correctly constructed sentences in a certain language is considered given).

9. Introduction of the concept of configuration, strictly defined in metalinguistic terms, makes it possible to describe a relation of syntactic dependency, while the isolation of regular configurations enables us to obtain the complete analogue of a "syntagma" or "word combination".

10. The individual elements (parts) of the syntagma (they are described in the formal system as S-groups or relatemes [RELYATEMY]) may be regulated by the relationship of syntactic subordination. It is in these terms that the concept of coherence is formulated.

11. A sentence may be called a cohesive set of S-groups (or relatemes) such that for each S-group A there is one and only one S-group B such that B is syntactically subordinate to A.

12. Calling two S-groups joined by mutual syntactic subordination a predicative pair leads to the following theorem: a sentence has one and only one predicative pair.

13. The suggested definition meets all the requirements set forth at the beginning of this report: it is formal, reflects the idea of coherence, and is sufficiently close to the intuitive conception of the term "sentence". It also permits us to derive deductively the idea of predicativity.

14. Exclusion of the so-called "single-constituent sentences" is justifiable on two main grounds. First, whatever may be our definition of sentence, "single-constituent sentences" cannot in general be taken into consideration because the method of configurations is not applicable to them. Second, the problem of correlation of "single-constituent sentences" with a judgment cannot be completely solved. And it is important for us that the "sentence", determined by particularly formal means, may be placed in mutually well-defined congruence with the judgment. Thus, a strictly formal definition of the sentence is important even for logic.

26. TRANSLATION SUB SPECIE STRUCTURALISMI
A. A. Reformatskii (Moscow)

1. Translation results from the variety of languages and the consequent lack of mutual understanding between their speakers. The purpose of translation is to supply necessary information (business, scientific, artistic, etc.) in language comprehensible to a given user of the information.

2. What is the "theory of translation" and can there be a special science of translation? Criticism of "literary expansion" (L. N. Sobolev and, in part, I. Etkind). Where A. V. Fedorov is wrong in including the "theory of translation" in linguistics. The "theory of translation" not as a science, but as an object of science, even of various sciences. The role of linguistics in the "theory of translation".

3. Types of translation. Narrowing of "scope of translation" in the usual view. Where L. N. Sobolev is wrong in considering translation limited to three types. How "type of translation" is defined. A given text and the goal of translation. What is the structure of a given text in its known linguistic features and in its social trends. The linguistic features of a given text as determining the type of translation. Relevancy and non-equivalence of translation elements in various types of translation.

4. Translation as information and interpretation. What the structure of "translation" as a whole consists in. Initial data of translation, act of translation, results of translation in the structural sense. Where Z. Klemensevich and I. Etkind are right and I. Kashkin is wrong. Various types of relations between original and translation.

5. The problem of "translatability" and "non-translatability". What "lack of mutual understanding" consists in. Why Humboldt is right and Kashkin is wrong. The unwarranted claims of A. V. Fedorov and others. What is adequacy of translation in connection with analysis of translation elements and understanding of translation type.

6. Methods and circumstances of translation dictating various solutions of the translation problem.
Ad hoc translations, translations as the "task of a lifetime", lexicography. Informative translations, technical and scientific translations, artistic translations, translations of philosophical texts, machine translation. Cooperation of sciences and talents in diversity of translation activity.

27. A SYSTEM OF RECORDING SPEECH FOR ORAL TRANSLATION
V. Yu. Rozentsveig (Moscow)

1. Oral translation differs from translation of a written document in that the words to be translated are perceived by the ear, transformed, stored in the memory, and later delivered orally. These operations take place more or less simultaneously (depending on the kind of oral translation).

2. The limited capacity of the "short" memory of man results in considerable losses of information when large segments of speech are translated. Moreover, overloading the memory makes the analysis and synthesis of a spoken communication difficult. It is necessary to work out a system of recording speech constructed in such a way that it would interact with the "short" human memory, thus ensuring reliable storage of information, facilitating perception, and recreating the oral message.

3. A phonetic (alphabetic) writing system has not been devised for the recording of foreign speech. Stenography is unacceptable for oral translation because it registers the words in toto (including redundant and unnecessary words) and requires too much time to decipher. The system of shorthand worked out empirically in the University of Geneva's School for Translators largely meets the needs involved in recording speech for oral translation (Rozan's work). However, it does not solve our problem owing to its unsystematic nature and internal contradictions.

4. The task of developing an efficient system of recording speech for oral translation amounts to the creation of a unique elementary information language requiring the solution of several logical, psychological, and linguistic problems, to wit:

(a) logical analysis of speech, isolation of semantic focal points and systems of connecting them;

(b) identification of the properties and action mechanism of the "short" human memory;

(c) determination of linguistic redundancies in common (stereotyped) word combinations and sentences capable of being reduced to symbols, the most efficient techniques of designating morphemes and syntactic connection in the system of the complex whole.

In addition, we must keep in mind the necessity of working out a recording system that will be applicable to a pair of languages and easily mastered by those studying to become translators.

28. LANGUAGE TRAINING FOR BLIND DEAF-MUTES
I. A. Sokolyanskii (Moscow)

1. The simultaneous lack of visual and aural analyzers and thereby of the speech analyzer is an exceedingly unusual condition for a child. The unusualness consists in the fact that the deaf-dumb-blind child is completely normal as far as neural and cerebral structure is concerned and therefore retains potentially the full capacity for intellectual development like that of any normal child. Nevertheless, using just his own efforts and without outside help he cannot make initial contact with the external environment surrounding him.

2. Development of a deaf-dumb-blind child's first contacts with his environment is an extremely complex problem that can only be solved by selecting a rigorous system of initial signals. This is achieved by special teaching and a special grammar.
Or?diia rry general (particularly "aohooll") gra aar, as presented in, general courses cannot be used, 3o If the system of initial signal contacts is developed in close conformity with the logic of the external physical enviro nt, formation of the second aignxalling system on the basis of the first is not parti.ou- larlly complex and is chiefly a technical problem., The heart of the matter lies, therefore, not in the esaand9 but, on the first signalling systeno 4, The second sign .lling system (language) in teaching a deaf-dumb- blind child has wa r?h our. for li g, estioulatorya da?tyliofl touch (Braille), writtona oral. The second signals must be strictly used in the saner order as listed above. A text is a basio link In the second signalling system--but no separate words or separate :sentenoes o Hence, the language instruction of a deaf-d b-bli,rrd child must begin with texts, not separate words or sentences M ._ Approved For Release 2000/08/24: CIA-RDP68-00069AO00100200007-9 Approved For Release 2000/08/24: CIA-RDP68-00069A00010020D007-9 6. The beginning texts are short, consisting altogether of 3-4 so- called "simple" non-extended sentences (two-member). Five or six of these are enough, after which the child can pass to texts composed of "simple" extended sentences which, according to the rules, must include objects (direct or indirect indiscriminately). The remaining syntactic constructions a even such difficult ones as complex clauses m are assimilated with the "simple" extended sentences in the series of texts. What general grammar calls a "simple" sentence is not at all "simple" as far as the teaching of deaf-dumb-blind children is concerned. 29. SOME GENERAL PRINCIPLES IN COMPILING GLOSSARIES VACHTNE TRANSLATION G. M. Strelkowskii (Moscow) 1. The word as a basic unit of language. "Every word (speech) generalizes" (Lenin Philosophical Notebooks), Since ideas originate simultaneously with words and are expressed through words. 
The very possibility of logical thought is created solely by language. The unity of language and thought is organic, i.e. language can neither arise nor exist without thought, nor thought without language. However, words are not identical to ideas. Words may have several meanings, i.e. they may express different ideas and, vice versa, one idea may be expressed by several words. A word may contain not only the expression of an idea, but also the relation of the speaker to the object designated by the given word (KHOLOD [cold], KHOLODISHCHE [extreme cold], KHOLODOK [light cold], etc.).
2. In this connection one should mention the impossibility of describing language without referring to meaning (the weakness in the theories of American structuralists and their followers). The unsoundness of theories reducing language to a system of pure relationships (Hjelmslev).
3. In accordance with the considerations set forth above, algorithms for machine translation must be based on a dictionary of meanings.
4. The principles of word choice for a machine dictionary.
(a) Significant and auxiliary words.
(b) Division of significant words into technical terms and words in common use.
(c) Need to ascertain the minimum of international words required for comprehension of technical texts.
(d) Choice of subjects (electronics, particularly the section dealing with automatic control, since this field is now included in all branches of industry and science, and is a basis for machine translation itself. Competence of author).
(e) Problem of compound words and word formation. Regular and irregular translation of compound words. Glossary of stems and program therefor, or information referring to tables of suffixes and paradigms of word changes (including stem changes, e.g. stem forms of strong verbs).
5. Word combinations and phrases.
Providing words in the glossary with an index indicating possible stock phrases. Translation of lexical homonyms by the method of analyzing word combinations.
6. Methods of work in compiling dictionaries. Choice of articles, reading them, writing out all words, except the commonest helping words (auxiliary verbs, pronouns, prepositions, etc.), on index cards; alphabetic arrangement of cards. Numbering sentences in the text and corresponding index on the cards for ready location of possible occurrences of the word.
7. Statistical conclusions. Alphabetic arrangement of words. Percentage of technical terms. Repetitiousness of nontechnical terms.
8. Methods of expanding the glossary with and without the machine. Reading of other materials on the given subject and enrichment of glossary with common words. Inclusion within the glossary of all technical terms already selected in the special glossaries of technical terms on the given subject. Treatment of new texts by the machine with separation of words not known to it and presenting them untranslated, or simply a selection of new words.
9. A selected glossary as a foundation for constructing a translation algorithm without the creation of some metalanguage.

30. SOME ANALOGIES TO THE PROBLEMS AND METHODS OF CONTEMPORARY LINGUISTICS IN ANCIENT INDIAN GRAMMATICAL WORKS
V. N. Toporov (Moscow)
1. Linguistics has perhaps never been so independent and complacent as it is today. This is undoubtedly due to the fact that the real object of the science has been found. On the other hand, the connection between linguistics and other sciences has never before been so strongly felt. But this connection is effected not on the earlier basis, when attempts were made to apply the methods of one science to another, but on a new one. It is characterized by some ideas common to a number of sciences.
These ideas developed (often independently) on the soil of the various sciences. The isomorphism of certain fundamental concepts (of "structure", "field", "invariant", etc.), the similarity of individual problems and methods of solution. It is becoming increasingly evident that certain common ideas and methods are being superimposed, as it were, on the material of the particular sciences and transformed in accordance with the nature of the material, the possibility of giving it a strictly formal interpretation, the scientific traditions in the given field, etc. For this reason the prospects for a new synthesis of various sciences on a new basis are now being carefully assessed (cf. International Encyclopedia of Unified Science, vol. I, 1938-1939; B. Hansa no, The concept of field as a synthesis of natural science and humanities traditions in sociology, Vestnik istorii mirovoi kul'tury [Herald of the History of World Culture], no. 4, etc.).
At this time when linguistics is very clearly aware of its place among the other sciences and the new direction in linguistics is interpreted as being something broader than simple opposition to old ideas, it is natural that there should be growing interest in the outlook for the development of linguistics, the nature of its connections with other sciences, and the ultimate fate of these connections. When one examines these problems, it is difficult to avoid thinking about certain striking analogies to modern linguistic problems that may be found in the history of ancient Indian science, particularly linguistics, and which are attracting the attention of modern scholars with increasing frequency (L. Bloomfield, Emeneau, Brough, Allen, Renou, and others).
3. Let us list the most important analogies in the light of contemporary problems.
(a) Formal principle of language description ("descriptivism"): exclusion of meaning in analysis, if we disregard the very small number of sutra interpretations that sometimes deal with the determination of connections of semantic (according to Morris) order; fullness of description, including differentiation between the obvious and the non-obvious.
(b) Elements of a systematic approach to language; clear distinction between class and member of class with fixed place; hence, on the one hand, the concept of zero, on the other, potential forms, hypergrammaticisms, false variation (often supported by the striving for conciseness in exposition); contrast of sphota and sabda; negative characteristics of members in relationship; Prabhakara's teaching on semantics; schools on relation of word and sentence and the dependence of the former on the latter; distinction between signum, designatum, and denotatum.
(c) The metalanguage of Indian grammatical treatises; symbols (sign-index, sign-symbol types); metalanguage grammar in cognition.
(d) In connection with these features of ancient Indian linguistics, mention must also be made of similar phenomena in other fields: the esthetic code in ancient Indian art, particularly in the drama; the concept of dhvani (an analogy to sphota); some analogies in the works of ancient Indian logicians and philosophers (categories of relation, time; "nominalism"); characteristics of Indian historiography; the concept of zero among the mathematicians of ancient India, etc.
A comparison with ancient Greek science enhances the significance of the specific features of ancient Indian grammatical literature, which in many respects resembles modern linguistics.

31. THE FREQUENCY OF LEXICAL UNITS IN ENGLISH
M. G. Udartseva (Petrozavodsk)
1.
We undertook a study of frequency of lexical units in English geological literature in connection with the compilation of a minimal glossary for students in geological institutions. As material we selected articles on the various branches of geology as well as on the allied sciences. In addition, for the sake of objectivity in the tally, we included a considerable number of authors from several English-speaking countries. The final listing of sources comprised 33 works containing a total of 250,000 words, of which 28 are articles from 14 periodicals published in the United States, Great Britain, Canada, India, and Australia, while 5 were excerpts from monographs.
2. The literature dealt with problems related to the following branches of geology: mineralogy, crystallography, petrography, petrography of sedimentary, igneous, and metamorphic rocks, petrology, stratigraphy, paleontology, lithology, tectonics and structural geology, origin, distribution, and exploitation of mineral resources, geology of oil and coal deposits, geophysical methods, prospecting for mineral deposits, radioactive methods of determining the age of rocks, quaternary geology, geomorphology and glaciology, dynamic geology, geology of the ocean bottom, and regional geology.
3. Individual words, phrases, and verbs plus postpositions were used in the count. Each additional meaning of a word was handled as a separate item. For example, the word "face" was regarded as four separate words corresponding to the meanings of "side", "face" (of crystal), "surface", "to put something in front of a person".
4. Each lexical item encountered again was entered on a separate index card where all secondary usages with indication of author were noted. If the word occurred more than 100 times in different authors, no further entries were made. Such words as "that", "which", "it", etc. were handled similarly.
5. The count resulted in a determination of the frequency for 7535 words. We entered into the minimal glossary 2373 lexical items consisting of 546 verbs, 954 nouns, 327 adjectives, 235 adverbs, and 310 other kinds of words. Of this number 176 words are specialized terms; more than 200 words have another meaning in geological literature, while the remainder are ordinary words. About 4000 of the 7535 words are technical terms.
6. The minimal glossary was tested by taking several random pages of diverse literary, general political, and geology material and calculating the percentage of words from each text that were lacking in the minimal glossary. It turned out that a page of geological text contained 1-1.5% "unfamiliar" words, general political text 8-10%, and literature (Dickens) 16-18%.
7. The minimal glossary was also collated with the Thorndike dictionary. Significant discrepancies were noted even in determining the first 500 words.

32. ONE APPROACH TO LOGICAL SEMANTICS
V. K. Finn and D. G. Lakhuti (Moscow)
1. Our approach to logical semantics can be summed up as follows:
(a) Some language of science with minimal pragmatics is selected as the investigated language (e.g., the language of synthetic organic chemistry, formal genetics, classical mechanics, etc.).
(b) An artificial language (language I) is constructed for the investigated language; it consists of a glossary (class of basic technical terms and syntactic functors) and a class of indexes for the glossary, as well as a formal syntax in which are formulated the rules for building sentences consisting of the indexes. A correctly formed sentence in language I is determined with the help of an algorithm constructed in the formal syntax.
(c) Language I is expanded into language II, which consists of language I together with the following.
A list of descriptions of types of sentences in language I (examples of such types of sentences for the language of synthetic organic chemistry will be: sentences conveying information about compounds, sentences conveying information about reactions, sentences conveying information about reaction conditions) and a list of combinations of indexes corresponding to the types of sentences formed.
(d) In accordance with the types of sentences, algorithms are constructed in language II that discern the meaning of these sentences. If sentence F is correctly constructed and all the indexes are replaced by dictionary signs, and if the combination of indexes corresponding to F coincides with the combination of indexes of some of the sentence types in language I, the algorithm will convert F into sign "S" if all the predicates of the corresponding description are satisfied for F; if even one predicate of the description is not satisfied for F, the algorithm will convert F into an empty word. In the first case we will say that "F has meaning in language I", in the second "F does not have meaning in language I". If, however, the algorithm is not applied to F, we will say that the meaning of F is not determined in language I. A descriptive syntax is formulated in language II. It consists of suitable algorithms to discern the meanings of sentences and a list of rules according to which meaningful sentences are derived from meaningful sentences.
(e) Language II is subsequently expanded into language III, in which definitions with reference to the properties of language I and its relations to the investigated language are formulated. Language III consists of language II and a list of definitions.
Language III contains definitions of the concepts of the semantic completeness of language I, translatability (full or partial) of the investigated language into language I, interpretation of language I within the amalgamation of language II and the investigated language, explicitness of language I, and other semantic concepts.
If it is possible to construct a series of languages I, II, III for the investigated language, we will say that the "semantic analysis of the investigated language" has been realized. If the investigated language is at least partially translatable into language I, it is suggested that "semantic analysis of the investigated language" can be effected by an automatic machine.
"Semantic analysis" is in the experimental stage, and that is why we speak about an "approach" to semantics, and not the construction of a deductive system of semantics. However, the deductive construction of a system of semantics is possible on the basis of experimental investigations of the "languages of science" (with minimal pragmatics).
In the preparation of this paper we have used the ideas and results of research in semantics by A. Tarski, K. Ajdukiewicz, L. Chwistek, Y. Bar-Hillel, H. Curry, W. Quine, N. Goodman, and R. Carnap.

33. SOME PROBLEMS CONNECTED WITH THE HANDLING OF VERBS WITH ALTERNATING (A Statistical Inquiry)
R. M. Frumkina (Moscow)
The compilation of a dictionary of stems is a necessary stage in the task of constructing an algorithm of machine translation. By stem we understand the graphically invariant part of a word. However, there are a number of languages in which the graphically invariant part of certain words, principally verbs with alternate forms, consists of one or two letters, an inconvenience resulting in homonymy of stems.
It is therefore necessary to separate only the purely standard endings (person, number, etc.), and assume that a given word has several stems. There are two possible ways of solving the problem:
(1) Enter into the dictionary all the stem variants of each word with plural stems, e.g. perfective and imperfective aspect, present and past tense stems, etc. We thereby increase (and sometimes considerably) the size of the dictionary.
(2) Select the most frequently occurring variants and enter them into the dictionary; for the other stems, furnish the rules by which they are in some manner to be identified or formed according to the stems listed in the dictionary. This would enable us markedly to reduce the size of the dictionary, but at the price of complicating the program.
In order to determine the more efficient method, it will be necessary above all to carry out a statistical inquiry concerning words with plural stem variants and their frequency.
We are now analyzing the frequency of verbs with alternating forms in a Spanish scientific (mathematical) text. On the basis of data in the frequency dictionary of V. Garcia Hoz, all Spanish words with a frequency of more than 460 were first divided into classes depending on the types of alternation. Then the frequency both of classes and of individual morphological forms was determined from consecutive material in mathematical texts.
The data thus obtained clarify the principles governing the distribution of classes and alternating forms and enable us to make certain recommendations in compiling a dictionary and rules for handling stems.

34. A LOGICAL ANALYSIS OF THE CONCEPT OF LANGUAGE STRUCTURE
S. K. Shaumyan (Moscow)
1. Modern structural linguistics interprets language structure on the Gestalt plane, i.e., as a whole, the elements of which are connected by definite relations.
2.
If we consider that language elements interact on two axes--syntagmatic and paradigmatic--an interpretation of language structure on the Gestalt plane must be regarded as one-sided: we encounter wholes, the elements of which are connected by definite relations, only on the syntagmatic axis (such wholes, for example, are syllables in phonology or sentences in grammar). However, on the paradigmatic plane we deal not with wholes, but with classes of ordered elements; the elements of these classes are interlinked by definite relations, but the classes cannot be identified in any way with the wholes.
3. There arises the need of defining language structure in such a way that the definitions may be applied to the interaction of language elements not only on the syntagmatic, but also on the paradigmatic axis.
4. The new definition of the concept of language structure is based on the general concept of structure in modern symbolic logic, where it is defined thus: the structure of a given relation is the property of being isomorphic with the given relation. Modern structural linguistics, as we know, distinguishes two planes in language: the plane of expression and the plane of content (phonology is included in the former, grammar and lexicology in the latter). Since isomorphism exists between both planes, we may rely on the definition of the general concept of structure in symbolic logic and define language structure thus: language structure is the property of the relations of elements on the plane of expression and of the relations of elements on the plane of content to be isomorphic with one another. This definition of language structure is in complete accord with the research techniques of structural linguistics at its present stage of development.
6. A logical analysis of the concept of language structure requires an operational approach to this concept.
Accordingly, the report states how we should set up empiric operations by means of which language structure, as an abstraction, can be linked to genuine linguistic activity.

35. ANCIENT TEXTS AND MACHINE TRANSLATION (A formulation of the problem)
V. Shevoroshkin (Moscow)
1. There is no doubt that a great many philosophers, historians, ethnographers, and even specialists in literature have an acute need of Russian translations of a large number of ancient texts.
2. The available translations are a drop in the ocean compared with the mass of ancient literary monuments.
3. Texts in dead languages have one feature that distinguishes them from texts in modern languages, namely, the frequent impossibility of proving that the original author had in mind precisely what we "read into" the text.
4. The feature of ancient texts noted above has produced and is continuing to produce numerous commentaries on these texts.
5. The translator of ancient texts is in essence a commentator. Even the translator who strives for maximum objectivity inevitably introduces into his work many subjective elements, which vary in degree with the depth of his erudition.
6. An investigator who requires the translation of an ancient text may also need a commentary, but his primary need is for a maximally objective translation. When reading such a translation, he should confront the same difficulties that are mastered by a person who reads the text in the original. However, a translation done by a human being does not meet these needs for the reasons mentioned in (5) above.
7. Machine translation of ancient texts will enable a student to obtain exactly what he needs. "Interpretation" of a text by a machine is excluded. The more "elementary", the better.
8.
Thus, machine translation of ancient texts is particularly important, for the machine is not merely a substitute for a live translator, but, in this respect alone, it does what a person cannot do.
9. Certain characteristics of the ancient Indo-European languages enable us to assert that these languages are more accessible to machine translation than are the living languages. These characteristics include: comparatively greater transparency of morphology and simplicity of syntax, numerous trite phrases, etc. This problem will be considered in detail on Sanskrit material.
10. For the reasons set forth above, machine translation of ancient texts into Russian is a problem that deserves detailed elaboration.

SECTION ON ALGORITHMS OF MACHINE TRANSLATION

36. AN ALGORITHM FOR TRANSLATING FRENCH INTO RUSSIAN ELECTRONICALLY
V. A. Agrayev (Gorki)
The algorithm was designed for use in connection with an electronic computer of the GIFTI [Gor'kovskii issledovatel'skii fiziko-tekhnicheskii institut / Gorky Research Institute of Physics and Technology] possessing a limited memory capacity. The aim was to determine the translation capabilities of the machine as well as to check the operation of the algorithm with limited glossary and rules.
The algorithm includes lexical routines: a glossary of stems, a glossary of phrases, and charts for translating polysemants. The stem glossary contains about 500 words. In addition, we prepared a large glossary (about 1200 words) containing the full, original forms. The amount of grammatical information included with the words varies in the two glossaries; less is given in the stem glossary. The phrase search is based on the semantically pivotal word. The translation routines of polysemants contain tests for contextual environment, and the required meaning is selected accordingly.
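The lookup in a glossary of stems mentioned above might be sketched as follows. This is only an illustration under stated assumptions, not the routine of the abstract: the glossary entries, the transliterated Russian stems, the name `find_stem`, and the longest-match rule are all hypothetical.

```python
# Hypothetical miniature stem glossary: French stem -> (Russian stem, part of speech).
# The entries are invented for illustration; the abstract gives no glossary contents.
STEM_GLOSSARY = {
    "calcul": ("vychisl", "verb"),
    "machin": ("mashin", "noun"),
    "traduct": ("perevod", "noun"),
}


def find_stem(word, glossary=STEM_GLOSSARY):
    """Return (stem, ending, entry) for the longest glossary stem that begins `word`.

    Returns None when no prefix of the word is listed in the glossary.
    """
    word = word.lower()
    # Try the whole word first, then drop one final letter at a time.
    for cut in range(len(word), 0, -1):
        stem = word[:cut]
        if stem in glossary:
            return stem, word[cut:], glossary[stem]
    return None
```

Under these assumptions, `find_stem("machines")` separates the ending `es` from the stem `machin`; the ending would then be passed to the rules that interpret French inflections.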
Analyzing rules determine the meaning of French inflections and, depending on the governing words, establish the necessary grammatical forms of the other words.
In the synthesis routines Russian word forms are constructed on the basis of grammatical information derived from the glossary and developed during the process of analysis. Synthesis is effected with regard for its applicability also to translating English radio engineering texts.
Statistically chosen data were used in constructing the algorithm.

37. PRINCIPLES IN THE CONSTRUCTION OF ELECTRIC READING DEVICES
N. D. Andreyev (Leningrad)
1. The problem of electric reading devices (EChU) [elektrochitayushchiye ustroistva] arises because of the slowness in preparing a text for machine translation, which is inevitable when a human being does this work (particularly in oriental language texts).
2. An electric reading device must be adapted for machine sensing of scripts of varying size, slant, proportion, and graphic shape.
3. The different sizes, slants, and proportions of scripts may be reduced to a single standard by using the three-set system of varying curve mirrors [TREKHKOMPLEKTNOI SISTEMY ZERKAL PEREMENNOI KRIVIZNY].
4. Scripts of different shapes may be adapted for machine sensing by using the principle of key identification points [KLYUCHEVYKH OPOZNAVATEL'NYKH TOCHEK], the number of which cannot exceed 50 for Cyrillic and Latin; it may reach 100 for Arabic, Devanagari and their derivatives, and about 300 for Chinese and Japanese.
5. The set of key points is individualized for each of the graphic signs and is interpreted for each language in accordance with a special program that constitutes the introductory part of the analysis in the appropriate algorithm.

38. WORK ON AN INDONESIAN-RUSSIAN ALGORITHM OF MACHINE TRANSLATION
N. D. Andreyev (Leningrad)
1.
The Indonesian language requires preliminary treatment of the words in order to strip their roots. Stripping of the root by direct resort to a dictionary appears to be impossible.
2. Three factors make it difficult to strip the roots: (1) the presence of initial and secondary prefix and suffix; (2) internal sandhi, i.e. the phonetic interaction of morphological elements; (3) the presence of root reduplicators and polyreduplicators, which occur in two graphic variants.
3. Much preliminary work was required for the statistical and structural investigation of Indonesian words. Different versions of the root-stripping program were based on this work.
4. Processing the words in the root-stripping program makes it possible to proceed to morphological analysis, which is effected by a special morphological program that is often realized in a purely analytic way, i.e., without resorting to the output language, but by substituting words in their code hieroglyphic.
5. Based on a certain working hypothesis concerning the structure of the Indonesian sentence, it seems possible to construct a standard analysis constituting the principal part of the syntactic program; it is only for a minor portion of the sentences that we need a nonstandard analysis forming a more complicated but much less frequently used part of this program.
6. The homonym and phraseology programs are operated after the first three programs are completed, relying on the hieroglyphic analysis effected therein.
7. The propositional and glossary program works chiefly by conversion, i.e., according to the output language.
8. Tables of pseudoroots and typical sets of morphological information are being developed as necessary supplements to the main glossary.

39. WORK ON A VIETNAMESE-RUSSIAN ALGORITHM OF MACHINE TRANSLATION
N. D. Andreyev, D. A. Batova, and V. S.
Panfilov (Leningrad)
1. The Vietnamese-Russian algorithm of machine translation includes the following programs:
(a) Glossary of binomials,
(b) Glossary of roots,
(c) Glossary of idioms,
(d) Supporting program [OPORNAYA PROGRAMMA],
(e) Syntactic program,
(f) Homonymic program.
2. The glossary of binomials assumes the stripping of two-syllable Vietnamese words with their grammatical information.
The glossary of roots includes monosyllabic words and their grammatical information. The existence of two glossaries is due to the problem of word boundary in isolating languages.
The glossary of idioms contains idioms, phrase combinations, and hard-to-translate expressions.
The supporting program serves to differentiate between parts of speech in those cases where the appropriate grammatical information cannot be precisely indicated either in the glossary of roots or in the glossary of binomials.
The syntactic program provides for an analysis of Vietnamese syntactic constructions.
The homonymic program is designed to solve the problem of lexical homonymy within any single part of speech. The program deals principally with monosyllabic words, since homonymy is not characteristic of binomials.
3. In connection with the adoption of a syntactic standard, which consists in utilizing syntactic analysis to determine the parts of speech, the range of application of the supporting program is narrowed to exceptions to standard cases.
4. Besides utilization of the supporting program, exceptions to standard cases may be solved by inserting appropriate corrections into the syntactic program.
5. The supporting program is characterized by:
(a) The ability of individual words to occur in a sentence as a substantive and a verb.
(b) The fact that such words stand closer to the verb than to the substantive.
Therefore, when used as substantives, they often receive various grammatical indicators that are peculiar to substantives.
(c) A number of verbs may be brought into the category of substantives by means of appropriate auxiliary elements.
(d) What has been set forth above explains the impossibility of accurately indicating in the glossary the part of speech of the words in question. The part of speech may be indicated only disjunctively.
(e) Determination of the part of speech to which words of the type in question belong may be made in each specific case with the help of carriers of grammatical data located in the supporting program.

40. WORK ON A JAPANESE-RUSSIAN ALGORITHM OF MACHINE TRANSLATION
A. A. Babintsev (Leningrad)
1. Work on a Japanese-Russian algorithm was begun at the end of December 1957, using atomic energy texts. At this stage analysis of material is limited to the simple sentence.
2. Due to the fact that no reading devices are available for ideographic text, the Japanese must be transcribed into Russian before it is put into the machine.
3. The structure of Japanese--agglutination (substantive and verb in part) and inflection (verb in part and adjective), with the stress on agglutination--is responsible for the effectiveness and adequacy of the standard morphological analysis and determines the primacy of the program of standard morphological analysis in the set of programs.
4. The set of programs for the Japanese-Russian algorithm at the present time is as follows:
(1) A program of standard morphological analysis (with referral to the glossary "address" and withdrawal therefrom of certain grammatical information).
(2) A program of standard syntactic analysis (based on a "working hypothesis").
(3) A program of non-standard syntactic analysis (cases that do not fit the "working hypothesis").
(4) A homonymic program.
(5) A glossary of idioms.
(6) A synthesizing program.
5. The minimum of information to be derived from text analysis is: for a substantive--case and, in certain instances, number; for a verb--tense, voice, mood, finiteness; for an adjective--tense.
6. The "working hypothesis", which is based on the laws of Japanese sentence structure, in broad outline consists of the following:
(1) The first substantive in the nominative or principal case is the subject.
(2) The last word before a stop sign is the final predicate; a verb in non-finite form is the middle predicate.
(3) The direct object immediately precedes the predicate; the indirect object is found at some distance from the predicate.
(4) A substantive in the genitive case, adjective and verb in the finite form preceding the substantive are attributes.
7. We should like to direct attention to one of the numerous problems that have arisen in connection with our work on the algorithm. After analyzing a Japanese text, from which information on number can be obtained only sporadically, it turns out that difficulties due to the inadequacy of information on grammatical number appear in the synthesizing program during formation of the output text. A solution to the problem of number in the synthesizing program is exceptionally important for a number of "oriental"-Russian algorithms.

41. THE PROGRAMMING OF TRANSLATION FROM ENGLISH INTO RUSSIAN
G. P. Bagrinovskaya and G. L. Gavrilova (Moscow)
Program of translation, constituent parts, order of operation.
Arrangement of glossary, difference in coding used in English section of glossary from coding in French section of glossary. Size of glossary. Glossary of phrases. Choice of homonyms, construction of complex index scales and omitted index scales.
Operation of analysis program ("rolling up" formulas) [FORMULY SVERTKI].
Program of synthesis of structures on the basis of formulas of synthesis. Morphological treatment of results of synthesis. Russian part of program of translation from English into Russian (utilization of programs prepared for Russian part of French-Russian translation). Agreement in codings.

42. PRINCIPLES IN COMPILING A GERMAN-RUSSIAN GLOSSARY OF POLYSEMANTS FOR MACHINE TRANSLATION

S. S. Belokrinitskaya (Moscow)

Determination of the meaning of a polysemantic word that is appropriate in a given context constitutes one of the basic problems in machine translation. This problem is being solved by compiling a glossary of polysemants which will make it possible to obtain the relative meaning of a word by an analysis of the surrounding context. In most cases it is sufficient to examine context within the boundaries of a sentence.

A considerable number of words that have multiple meanings in the usual literary language have but a single meaning in mathematical texts, and the system of meanings for a number of polysemants is simplified. However, many German words, even in a mathematics text, have a large number of relative meanings, the determination of which requires a rather complicated system of tests. The most numerous are prepositions and a group of verbs which are used with separable prefixes and which also form a large number of phrases.

The principal method of determining the relative meaning of a polysemant is structural-semantic analysis of the surrounding context.
In some cases grammatical forms of the given word or its environment are also analyzed. It is possible to isolate certain groups with a monotypic system of meaning, thereby simplifying the glossary and replacing in some cases the system of tests (or part of the system) by reference to the appropriate general rule. We have also isolated a group of words united according to the principle of identical effect on the translation of prepositions and some verbs with extremely many meanings, which likewise permits of simplification of the routine. Methods of glossary treatment of different types of idioms and phrases have been worked out. The routines of polysemants also contain cases of lexical homonymy that are not excluded from the system that differentiates between the meanings of polysemants.

The determination of relative meanings of polysemants by means of the glossary just described is not free from difficulties (in some cases a single sentence does not provide sufficient context, the translation of complex words, etc.). However, these difficulties can, as a rule, be overcome. A check of the text shows that a completely satisfactory translation of the mathematical corpus can be achieved with the help of the above-described glossary of polysemants.

43. MAIN FEATURES OF THE GLOSSARY AND GRAMMATICAL [...]

I. K. Bel'skaya (Moscow)

1. The basic components of a system of machine translation from English to Russian as worked out in the ITM and VT [see No. 2 for expansion and meaning of abbreviations], Academy of Sciences, USSR, are a specialized bilingual glossary and three cycles of translation routines: glossary routines, routine for analysis of input sentence and routine for synthesis of output sentence.
2. The Anglo-Russian M.T.
glossary now available has been designed for the translation of scientific literature dealing with problems of applied mathematics: the solution of systems of linear, algebraic, and transcendental equations, calculation of the proper values of matrices, approximation of functions by means of polynomials as well as by trigonometric functions, expansion of functions into series, numerical differentiation and integration, numerical solution of differential equations, and other problems of numerical analysis.
The glossary contains 2300 words. Several works by English authors were used for compilation and checking.
Text checking of the glossary for translation of mathematical literature yielded satisfactory results. Some 3000 sentences consisting of more than 100 connected passages from the material of different authors were used as the corpus.
3. A glossary for the machine translation of scientific literature may be usefully divided into a series of independent "specialized" glossaries. Further specialization down to relatively independent fields within a given science--mathematics, physics, and chemistry--is also worthwhile. This division serves two purposes: it reduces the necessary bulk of the glossary to the completely manageable number of 3000-3500 words and, even more important, considerably reduces the amount of polysemy.
The structure of the Anglo-Russian glossary for M.T. is such that its several sections may be expanded independently. The glossary has two main sections: I, Single-meaning glossary, and II, Multiple-meaning glossary. Each section is divided into two subsections: Ia - glossary of terms, Ib - glossary of words in general use, IIa - glossary of words with complete meaning, IIb - glossary of auxiliary words.
In size, the multiple-meaning glossary takes up about 1/5 of the entire glossary, in this instance 458 words.
5. The problem of polysemy is satisfactorily solved by combining two methods: (a) narrow specialization of a series of glossaries for M.T. and (b) contextual (functional-semantic) analysis of words in the sentence. Experience shows that it is virtually unnecessary in scientific and technological texts to go beyond the "small context" (i.e. one sentence).
6. In order that the lexical analysis of the words be effected automatically (without human intervention), the M.T. glossary is accompanied by a series of special glossary routines that make up cycle I in the overall system of translation routines. These include:
1. A routine for obtaining the glossary form of the words,
2. A routine for the grammatical analysis of "unknown words",
3. A routine for the grammatical analysis of "formulas",
4. A routine for distinguishing homonyms,
5. A routine for the analysis of polysemy.
The last routine is the most important from both the theoretical and the practical points of view.
7. The lexical analysis, which is performed by means of the glossary and glossary routines, precedes the grammatical analysis and provides it with the necessary initial information in the form of the so-called "invariant characteristics" of each "known" word (i.e. entered in the M.T. glossary) and the syntactic characteristics of all the "unknown" words (not entered in the M.T. glossary) and the "formulas".
8. The grammatical analysis of input sentences is performed by means of a series of routines in cycle II in the following order:
1. Analysis of verbs ("verb" routine);
2. Analysis of punctuation marks;
3.
Syntactic analysis of sentences: division of sentence into clauses and more precise definition of parenthetical phrases in clauses (we define a sentence as that segment of text which is included between full stops (period, exclamation or interrogation point); a clause is a simple sentence, i.e. such that it contains no more than one heterogeneous predicate);
4. Analysis of substantives and numerals;
5. Analysis of adjectives;
6. Modification of word order in the translated sentence.
The "verb" routine is the key routine in the first half of the analysis of English sentences; however, the syntactic analysis of sentences (routine 3) is the basis of operation for the second half of the analysis and determines the boundaries of those segments within which the subsequent analysis is effected.
9. The routines in cycle III use the results of the preceding routines in such a way that the Russian sentence obtains its grammatical form in accordance with the rules of Russian grammar. The synthesis routines go into operation just at the time when the variant (contextual) grammatical signs for all variable words in the output sentence are obtained and the steps taken to adjust the word order to Russian norms. In the place of the Russian numbers, which represented Russian words up to this time, Russian equivalents are selected from the glossary, after which the variable words (verbs, substantives, numerals, and adjectives) are handled by the synthesis routines: a word ending is changed whenever the desired word form does not coincide with the dictionary form of the word.
10. Synthesis routines operate in the following order:
1. Word-forming routine;
2. "Verb" routine;
3. "Adjective" routine;
4. "Substantive" routine.
Changes in the numerals are effected partly in the "substantive" routine, partly in the "adjective" routine.
The word-forming routine occupies a special plane: it provides for various cases going beyond word changes while inserting the grammatical signs of the Russian word derived from analysis of the foreign sentence.

44. WORK ON A NORWEGIAN-RUSSIAN ALGORITHM OF MACHINE TRANSLATION

V. P. Berkov (Leningrad)

I. The projected set of programs is:
A. Analytic part: (1) morphological program; (2) program for distinguishing homonyms; (3) syntactic program;
B. Glossary part: (4) glossary-address; (5) regular glossary; (6) glossary of phrases and idioms; (7) program for compound words; (8) prepositional program; (9) program for unification of orthography;
C. Synthesizing part.
II. Two methods of analysis, different in principle, were initially contemplated: (a) To begin with a search for words in the glossary; (b) To begin by extracting grammatical information from the text before referring to the glossary on the basis of a supporting program (lists of indisputable endings, word-forming suffixes, supporting words, etc.). Due to the extensive amount of grammatical homonymy in Norwegian, the second method seemed very cumbersome and, in some cases, practically unsound. It has therefore been rejected.
III. The fact that the functioning of the algorithm--which begins with a search for the words in the glossary and withdrawal of the information located there into the operative memory--leads to clogging the latter with information that is as a rule temporarily superfluous (in some cases this is general) suggested the idea of creating typical sets of information.
IV. Programs (1) and (2) are now (beginning of March 1958) ready in rough form.
An ending obtained by stripping the dictionary stem from the text form of the word is compared with the list of endings; if the given word has a single grammatical meaning, an information suffix is attached to it and no further action is taken on the word at this stage. Cases of grammatical homonymy are handled by a series of special programs (2). On extracting all the grammatical information from the text, linear transfers of words are made in order to impart a standard appearance to the items derived by "unrolling" [RAZV...]; this is done by a part of program (3).
V. The program for the unification of orthography is the specifically Norwegian part of the algorithm. The need for this program is dictated by the considerable amount of inconsistency in Norwegian orthography, even in scientific texts; without the program, the glossary would necessarily be overloaded with many pairs of words.
VI. The program for unification of orthography will be used as a basis on which to construct an adjusting program in connection with the use of this algorithm for Danish.

45. GLOSSARY STRUCTURE AND INFORMATION CODING

I. L. Bratchikov, S. Ya. Fitialov, and G. S. Tseitin (Leningrad)

1. Consideration is being given to the problem of introducing a glossary on tape into the machine to search for coincidences in the event that the glossary does not fit into the operational memory.
2. A glossary structure is proposed that will accelerate searching and decrease the size of the indispensable portion of the memory for the size of the glossary under consideration.
3. The previously suggested process of "rolling up the codes" [SVERTYVANIYA KODOV] is now in use. The rolled up code is directly utilized to obtain the address of information on words in the glossary.
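Read in present-day terms, the "rolling up" of codes in item 3 resembles hashing: the letters of a word are folded into a short number that serves directly as an address. The sketch below is an interpretation, not the authors' routine. The folding rule in `roll_up`, the `Glossary` class and the sample entries are hypothetical, and coinciding addresses are resolved here by a simple list in each cell rather than by the differentiating routines the abstract mentions.

```python
from typing import Any, List, Optional, Tuple


def roll_up(word: str, table_size: int) -> int:
    """Fold the letter codes of a word into one address (hypothetical rule)."""
    code = 0
    for ch in word:
        code = (code * 31 + ord(ch)) % table_size
    return code


class Glossary:
    """Glossary whose entries are found through the rolled-up code."""

    def __init__(self, table_size: int = 64) -> None:
        self.table_size = table_size
        # One cell per address; a cell holds every word that rolls up to it.
        self.cells: List[List[Tuple[str, Any]]] = [[] for _ in range(table_size)]

    def add(self, word: str, info: Any) -> None:
        cell = self.cells[roll_up(word, self.table_size)]
        for i, (stored, _) in enumerate(cell):
            if stored == word:          # word already present: replace its information
                cell[i] = (word, info)
                return
        cell.append((word, info))

    def lookup(self, word: str) -> Optional[Any]:
        # Words with coinciding addresses are told apart by comparing spellings.
        for stored, info in self.cells[roll_up(word, self.table_size)]:
            if stored == word:
                return info
        return None


if __name__ == "__main__":
    glossary = Glossary(table_size=8)
    glossary.add("uravneniye", {"part_of_speech": "substantive"})
    glossary.add("davat'", {"part_of_speech": "verb"})
    print(glossary.lookup("uravneniye"))
    print(glossary.lookup("integral"))
```

The number of cells plays the role of the memory estimate discussed in item 4: fewer cells save storage but make coinciding addresses more frequent.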
We have provided for cases of coincidences of addresses thus obtained (rolling up [SVERTOCHNAYA] homonymy) by differentiating routines included in dictionary compartments, the addresses of which are not addresses of the words.
4. Theoretical probability considerations have enabled us to obtain results which, based on the given number of words in the glossary and the volume of lexical information, make it possible to estimate the necessary size of the memory to accommodate the glossary.
5. Methods are also suggested for programming certain operators encountered in the algorithms of machine translation.

46. GENDER AS A SUPERFLUOUS CATEGORY OF [...]

V. N. Vinogradova (Moscow)

It is very important for machine translation to discover the grammatical categories of a language concerning which there is no need to give information insofar as translation can be effected without taking them into account. Certain general considerations suggest that gender in the Russian verb--an uncharacteristic phenomenon expressed only in -l forms, the singular of the past tense and of the conditional mood--is one of these categories. We tested this assumption on a mathematical [I. G. Petrovskii, Discourses on the Theory of Differential Equations, 1954] text where the number of verbs with gender expressed turns out to constitute only 4% (93) of the total number of verbs (1970). We then selected linguistic (history of language) [A. Shakhmatov, Historic Morphology of the Russian Language, 195?, pages 9-61] and historic
[Grekov, Kievan Russia, State Publishing House of Political Literature, 1953, pages ...] texts in order to have a large number of diversified examples and found that [...] of verbs used.

It appears that in most sentences the verb may be related only to the subject--a single substantive in the nominative case. Doubts may arise only in the case of transitive verbs where there is an object in the accusative case that coincides in form with the nominative, of the type: "Equation (6) yielded the general integral of this equation over the entire plane except for the origin of coordinates" [Uravneniye (6) davalo obshchii integral etovo uravneniya vo vsei ploskosti za isklyucheniyem nachala koordinat]. Since we have a grammatical indication for both the subject and the object only in the past tense, and even then only when the gender of the noun-subject differs from that of the noun-object, there remains no other way of determining which is which than the word order of the sentence: the rule that the subject comes first holds in the overwhelming majority of cases. A rearrangement is, of course, possible for the sake of logical emphasis, e.g.: "in the Russian language preponderance has received the accent of the nominative plural" [V russkom yazyke pereves poluchilo udareniye imenitel'novo mnozhestvennovo]. The phrase poluchit' pereves [receive the preponderance, predominate] will evidently have to be listed in the glossary as a phrase combination.

It is possible to conceive of more complicated cases (we didn't find any examples, but we paraphrase one of the sentences of the type described above): "Change ... caused a shift of e to o before a hard consonant" [Izmeneniye ... vyzval perekhod e v o pered tverdoi soglasnoi]. Such a sentence is almost impossible with a predicate in the present tense (or is very badly written: "Izmeneniye ... vyzyvayet perekhod..."
will clearly be misunderstood); even in the past tense it is awkward. Apparently, rare instances of this kind will be edited; so too the following case in a complex sentence: "The bishop asserted that his church land went along the Lisichii ford, which was in the time of Prince Yuri" [Yepiskop utverzhdal, chto yevo tserkovnaya zemlya idet po Lisichii brod, chto byl pri knyaze Yurii]. In the absence of information on the gender of the verb byt', it is impossible to determine whether the last clause modifies brod [ford] (brod, chto byl... = brod, kotoryi byl..., an obsolete meaning, according to Ushakov's Tolkovyi Slovar' [Dictionary]) or is a subordinate conjunctive clause relating to the entire preceding clause. This ambiguity cannot be resolved here by formal signs.

With the exception of the last example, the texts studied did not contain a single instance where the lack of information on the gender of verbs would have resulted in confusion. This permits of the conclusion that as far as machine translation is concerned gender in the Russian verb may well be ignored.

47. THE SYNTHESIS OF RUSSIAN VERB FORMS IN MACHINE TRANSLATION

Z. M. Volotskaya (Moscow)

1. For the synthesis of Russian verb forms in machine translation it is proposed to list in the glossary of stems only the stem of the imperfective aspect of each verb. All the forms of the present, past, and future tense, perfective and imperfective aspect (personal as well as impersonal) are formed from this stem in accordance with definite rules.
2. It is suggested that three types of operation are sufficient to make all possible verbal forms from the single stem: (a) discarding the final letter or letters, (b) adding a letter or letters to the stem on the right, and (c) adding a letter or letters to the stem on the left.
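The three operations in item 2 can be pictured with a small modern sketch. It is only an illustration under stated assumptions: the `FormRule` record, the `apply_rule` function, the transliterated stem `pisa` and the three sample rules are hypothetical and are not taken from the report, whose actual tables and verb classes are not reproduced here.

```python
from typing import NamedTuple


class FormRule(NamedTuple):
    """One hypothetical table entry: how to build a verb form from the glossary stem."""
    drop: int = 0      # (a) number of final letters to discard
    suffix: str = ""   # (b) letters added to the stem on the right
    prefix: str = ""   # (c) letters added to the stem on the left


def apply_rule(stem: str, rule: FormRule) -> str:
    """Apply the three operations, in the order (a), (b), (c), to one stem."""
    if not 0 <= rule.drop <= len(stem):
        raise ValueError("drop must lie between 0 and the length of the stem")
    base = stem[: len(stem) - rule.drop]
    return rule.prefix + base + rule.suffix


# Illustrative rules for the transliterated imperfective stem "pisa" (to write).
SAMPLE_RULES = {
    "present_1sg": FormRule(drop=2, suffix="shu"),
    "past_masc": FormRule(suffix="l"),
    "perfective_past_masc": FormRule(suffix="l", prefix="na"),
}

if __name__ == "__main__":
    for name, rule in SAMPLE_RULES.items():
        print(name, apply_rule("pisa", rule))
```

In such a scheme the class number stored with each stem would simply select which table of rules applies to that stem.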
All the individual letters and combinations of letters which are joined to the stem on the left and on the right are assigned by a list and arranged in tables in accordance with a definite system.
3. All the verbs are classified in three groups depending on the method of producing: (a) the forms of the present tense, (b) the forms of the past tense, and (c) the stem of the perfective aspect from the stem of the imperfective aspect. By class of verbs we mean the total number of verbs that construct a given form in the same way.
4. The information for each verb stem contains the class number of the stem, which indicates the way in which a given form is to be constructed.

48. RUSSIAN SYNTAGMAS (on the basis of mathematical texts)

Z. M. Volotskaya, Ye. V. Paducheva, I. N. Shelimova, and A. L. Shumilina (Moscow)

1. This report discusses the basic types of two word combinations in subordinate relationship (syntagmas) as found in mathematical texts and by means of which it is possible to construct the rules of formal text analysis (for machine translation).
2. The syntagmas were based on specific word combinations drawn from the texts. Syntagmas are considered to differ from each other in type of syntactic relations between their component parts. Therefore, not all the morphological and syntactic signs of the words that form the given combinations served as criteria for relating these combinations to the various syntagmas.
3. A syntagma consists of two components, "governing" and "governed", each of which is accompanied in the list of syntagmas by certain information. As a rule, the "syntactic group" is the essential information for the "governing", the "morphological form" for the "governed" component.
4.
Words are divided into "syntactic groups" on the basis of the following principle of marking words according to the sign of a common syntactic connection: first, those words which have a single common syntactic connection are separated from the mass of words into one group; then, those words which have another syntactic connection are separated from the same mass, etc. The same words may fall into different groups which consequently appear to be crossing each other. The separation of syntactic groups not only according to one but according to a combination of signs should lead to a significant increase in the number of syntactic groups and, correspondingly, in the number of syntagmas.
6. The report includes a list of syntagmas, description, and discussion of possible ways of using them in text analysis.

49. SYNTHESIS OF THE RUSSIAN CLAUSE

Z. M. Volotskaya and A. L. Shumilina (Moscow)

1. Sentence synthesis in machine translation consists of combining words into clauses and clauses into sentences according to the requirements for sentence building in a given language.
2. The aggregate of syntagmas in each sentence that are obtained by analyzing the language from which a translation is made does not constitute an adequate basis for synthesizing sentences of the language into which the translation is made. Correspondences must be established between the languages in question not only on the syntagmatic level but also on the sentence level.
3. A clause is synthesized by inserting a syntagma, i.e., one syntagma as it were overlays and draws into itself another.
4.
Each word in the clause of the output language obtains, in addition to the information necessary for translation (number of stem in the output language, number, tense, etc.), the following signs: (a) number of the syntagma into which it is entering as a governing word (by the first method, cf. below) or as a governed word (by the second method); (b) ordinal numbers of the words (from the input language) with which the given word forms syntagmas. In combining words into clauses it is more convenient to use the ordinal numbers of words from the sentence of the input language and not the numbers of the output stems, because using only the latter might lead to mistakes inasmuch as the sentence may contain several identical lexemes or different ones, but with the same stem.
5. There are two possible ways of synthesizing a clause by means of syntagmas:
(a) Isolating the pivotal syntagmas (predicatives) and successively expanding each component at the expense of the governed words.
(b) Synthesizing a clause by successively combining syntagmas until they are reduced to the predicative. Moreover, each syntagma enters as a single group into a higher rank syntagma as a governed, expanded component.

50. GRAMMATICAL ANALYSIS FOR MACHINE TRANSLATION OF CHINESE INTO RUSSIAN

V. A. Voronin (Moscow)

The system of grammatical analysis for machine translation of Chinese into Russian was based on materials from contemporary scientific and technological texts in mathematics, electrical engineering and construction. It utilized the fundamental works of Soviet and Chinese authors on the modern Chinese language. The system was tested on mathematics articles from the Chinese periodicals Shusyue syuebao (Mathematics Herald) and Shusyue tsin'chzhan' (Successes of the Mathematical Sciences).
In constructing the system we did not have the task of solving the extensive and manifold grammatical problems connected with machine translation of literary and socio-political literature. However, we did take cognizance of grammatical phenomena characteristic of Chinese as a whole.

Treatment of the Chinese sentence according to the system of grammatical analysis starts after operation of the glossary and glossary supplement is completed, as a result of which words in the sentence enter the system with concrete relevant meaning and complete lexical characteristics, i.e. with the set of necessary signs.

The special grammatical structure of Chinese possesses an extremely small number of formal means by which one can identify the full morphological properties of the Russian equivalent for the Chinese word within a given lexical unit. Therefore, a Chinese sentence cannot be processed for machine translation without an analysis of the syntactic structure of the sentence to be translated, which was predetermined by the general principles underlying the system.

The system, operated in the form of routines, consists of two main parts: (1) syntactic analysis of sentences, and (2) production of the morphological characteristics of the Russian equivalent. The entire system includes 9 interrelated, successively functioning routines.

The first part has 4 routines in which the following stages of syntactic analysis are effected in corresponding order:
(1) Breakdown of the input sentence into simple clauses.
(2) Separation of attribute + attributed word groups.
(3) and (4). Separation of other (than attributive) syntactic components of the clause.
The second part of the system has 5 routines of which 4, on the basis of existing syntactic signs, produce the morphological characteristics for the Russian equivalents of all the words in the Chinese sentence. The classes of words mentioned below are handled in the order given:
(1) Numeral
(2) Substantive
(3) Verb
(4) Adjective
The operation of the fifth routine consists of changing Chinese word order in accordance with the norms of Russian word order.

The system as a whole comes down to producing the formal signs that reflect in the first part the syntactic function of the word and in the second part the morphological features of the Russian equivalent of the Chinese word.

An adequate, readable translation is ensured by performing a combined lexico-grammatical analysis of the Chinese text put into the machine.

51. APPLICATION OF MACHINE TRANSLATION METHODS TO [...] OF TELEGRAPHIC [...]

V. I. Grigor'yev and G. G. Belonogov (Moscow)

1. Men have been searching from ancient times for the most effective utilization of the channels of communication. Up to now the main efforts of engineers and communications experts have been aimed at perfecting the communication channel proper and at seeking ways of transforming the signal so as to secure the maximum suitability of the signal to the given channel. The contents of communications meanwhile remained unchanged. However, the possibilities have now for the most part become exhausted so that the problem of finding means of reducing the size of messages transmitted is becoming increasingly urgent.
2. The size of a telegram may be shortened 3-4 times if a lexical code is used instead of a literal code.
A telegraphic communication that uses lexical coding differs from an ordinary printed letter communication only in that they send not code groups designating letters of the alphabet, but a code combination designating the ordinal number of the word according to the dictionary in the memory device plus certain items containing grammatical information about the word transmitted.
3. The principle of lexical coding of messages has been known since ancient times. It is employed in various kinds of signal tables, in the international radio code, and elsewhere. However, in all these cases coding is done manually, requiring great effort and considerable expenditure of time. The development of computer technology has now made possible automatization of the process of lexical coding and its wide use in communications.
4. Lexical coding is based on an analysis of the message at the transmitting end and its subsequent synthesis at the reception end of the line of communication. This lexical analysis and synthesis of a message is essentially a simplified form of the analysis and synthesis of a text produced by machine translation. It is therefore worthwhile, when preparing an algorithm for lexical coding, to make full use of the method of text analysis and synthesis used for machine translation.
5. Lexical coding has, in addition, several peculiarities. Text analysis and synthesis in the case of machine translation is aimed at securing the operation of hieroglyphic conversion--a basic operation in machine translation. Elimination of hieroglyphic conversion would lead to considerable simplification of the routines of analysis and synthesis in the case of lexical coding. On the other hand, with lexical coding the demand for code economy is pushed to the foreground, whereas it is of purely secondary significance as far as machine translation is concerned. Lexical coding must rest to a large degree on speech statistics.
In particular, due to the interlinking of the analyzer with the devices of the channel of communication, the dictionary cannot conveniently be made very large. Available statistics permit limitation of the dictionary of the lexical analyzer to a maximum of 4000 words in ordinary use, which generally make up 97.6% of a literary text. Rare words not found in the dictionary may be transmitted letter-by-letter.
6. Application of the principles of lexical coding to telephonic communication may help greatly in solving the problem of maximum closeness of compression.

52. SOME PROBLEMS IN MACHINE TRANSLATION FROM JAPANESE INTO RUSSIAN

M. B. Yefimov (Moscow)

The purpose of this communication is to set forth some principles involved in analyzing Japanese sentences for machine translation, the principles being characteristic of the Japanese language alone.

A. The primary problem with which we have to deal in analyzing a Japanese sentence is its division into separate words. This is typical chiefly of languages with an ideographic form of script (Japanese, Chinese, etc.). The fact is that words are not separated in a written Japanese text and, consequently, identification of their role in a sentence is quite difficult. We shall try to show in this report how we made the division in our work. We begin with the fact that the Japanese script uses the signs of a syllabary (kana) along with ideograms. Thus, the division of a Japanese sentence into separate words breaks down into 3 main steps:
(1) Analysis of portions of sentences containing both ideograms and syllabary.
(2) Analysis of ideographic part.
(3) Analysis of syllabary part.
This operation is closely linked to the operation of the existing Japanese glossary and is, so to speak, one of its parts.

B.
Breaking down a sentence into its individual clauses is no less important a problem in Japano-Russian translation and has both practical and theoretical interest. In this work we are relying chiefly on the rigid structure of the Japanese sentence in which either a verb or a predicate adjective always stands at the end. This enables us infallibly to determine the end of the sentence. The beginning of the sentence is determined by searching for the subject. Thus, the entire operation consists of two steps:
1. Determination of the end of the sentence, and
2. Determination of the beginning of the sentence.

C. As is true of all languages, the verb constitutes the greatest difficulty in translating from Japanese into Russian. The strongly developed affixation that is characteristic of Japanese is most clearly marked in the verb. This determined the cyclical nature of our operation. We used the fundamental rules of traditional grammar for the analysis of verb endings, relying mainly on the five stems of the Japanese verb. We have been successful in establishing the necessary grammatical and syntactic criteria for all verbs.

53. WORK ON THE RUSSO-ENGLISH ALGORITHM OF MACHINE TRANSLATION

L. N. Zasorina (Leningrad)

1. Limitation of problems and scope of work. Choice of mathematical text as being most limited in stylistic peculiarities. Determination of set of programs for Russo-English algorithm. Exclusion of program of differentiating homonyms due to synthetic structure of Russian. Simultaneous work on glossary and morphology program.
2. Combined investigation of short text. Compilation of glossary in which the grammatical form and syntactic relations of the words are registered. Recording of statistical data.
3. Investigation of individual parts of speech, division of words into classes, and preliminary detection of homonymy between the parts of speech.
4. Verb and grammatical information derived from personal forms and nominal forms. Homonymy of participles and adjectives distinguished by taking into account suffixes of full and short forms of participles. Lack of formal-graphic separation of auxiliary and modal verbs from the verb class. Adjective class comprising adjectives, adverbs in -o, -e, -ski, ordinal numerals, words in the status category. Arrangement in non-specified subclasses. The substantive class, including nouns, substantivized words and cardinal numerals (other than odin [one], dva [two], tri [three], chetyre [four]), is distinguished by the abundance of homonymic case forms: intraclass homonymy and interclass homonymy. Separation of non-specified subclasses. Triliteral word class. Class of invariable words is characterized by negative separability in the text.

4. Advisability of introducing stem-stripping program. Planning of groups of commands for the individual classes. Many-sided investigation of homonymic coincidences of separable affixes.

5. Problems connected with differentiating grammatical data derived from homonymic affixes. Tables of separable, restrictive lists of letters that precede the separable affixes. Successive separation of affixes from stem (endings and form-constructing suffixes) and storage of grammatical information derived. Table for verifying matching of preliminary information obtained from affixes and stem glossary. Method of multistage depositing of grammatical information derived from the glossary and stem-stripping program. Attempt at dividing grammatical data into two non-crossing fields to reduce the number of tests of possible grammatical forms.

6. Compilation of stem glossary. Determination of general size and limits of glossary. "Lexical article" plan, taking into account input and output information and list of possible forms.
Obtaining pseudostems. Problems in contracting the glossary by separating word-building suffixes and prefixes.

7. General routine for processing words: stem-stripping program, stem glossary, morphology program. Obtaining input information for the syntactic program.

54. WORK ON A HINDUSTANI (HINDI) - RUSSIAN ALGORITHM

T. Ye. Katenina (Leningrad)

1. The development of a Hindi-Russian algorithm is very important for similar work in the field of Indian languages, both Indo-Aryan and Dravidian. The structure of Hindustani is in the main analytical, although the traces of ancient inflection and agglutinative elements (a new synthesis) play a definite role. The scientific style of Hindi prose is characterized by a more or less definite word order close to that of the Dravidian languages. Numerous phrases containing non-conjugated verb forms, equivalent to subordinate clauses, constitute the main difficulty for machine translation. Scientific texts are characterized by an abundance of Sanskritisms which are frequently translated loan words of international (European) terms.

2. Hindi writing, phonetic for the most part, is therefore especially convenient for an electric reading device. To record texts we worked out a mechanical transcription based on the Russian alphabet without complicated signs and diacritics. In addition, statistics justified our combining several Hindustani sounds.

3. The set of programs for machine translation is as follows:
(1) glossary of stems
(2) morphology program
(3) postposition program
(4) syntactic program
(5) program for differentiating homonyms
(6) list of idioms
(7) a translation program of compound words may be required for some kinds of scientific texts.

4.
In order to avoid superfluous information we adopted the following hypothesis for the syntactic analysis of a simple sentence:
(1) the first noun substantive in a direct or active case is the subject;
(2) the verb in the last place in the sentence is the predicate;
(3) if the verb is not a copula, the noun substantive in the next-to-last place in the sentence with the postposition ko or in the direct case (not the subject) is the direct object.

We have determined the necessary minimum of morphological information, which still requires statistical confirmation in individual cases, to be:
(1) for the noun substantive: number, case (direct, active, indirect);
(2) for the nominal adjective: number (may be important to determine the number of noun substantives with zero ending of direct case plural number);
(3) for the verb: tense, mood, number (to determine the number of the same noun substantives), voice.

A check of the text showed that the overwhelming majority of simple sentences as well as the constituent parts of complex sentences may be analyzed in accordance with these rules.

6. Among the basic problems requiring a solution for subsequent work in constructing a Hindustani-Russian algorithm are: (1) elucidation of rules for analyzing complex sentences and equivalent phrases with non-conjugated verb forms, (2) clarification on a statistical basis of the need to design a program analyzing compound words that would be compulsory for all kinds of texts.

55. AN ALGORITHM FOR TRANSLATING ENGLISH [illegible]

K. Y. Komissarova (Gorki)

The translation rules and glossary have been worked out with regard for the characteristics of English texts dealing with radio engineering. The translation process is divided into 2 main parts: analysis of English sentences and synthesis of Russian sentences.
Analysis of an English text is based on a syntactic analysis of the sentence. The grammatical function of a word is determined by morphological and syntactic analysis according to rules grouped by the parts of speech. The glossary contains more than 500 words in general use and specialized technical terms.

56. AUTOMATIZATION OF TRANSLATION PROGRAMMING

O. S. Kulagina (Moscow)

1. Long, tedious process of constructing translation programs causes need to automatize programming. Requirements of translation programs and impossibility of using existing programming programs. Formulation of problem of automatizing translation programming.

2. Breakdown of translation algorithms into operators. Types of operators and functions of each. Parameters of operators.

3. Preparation of translation algorithm for translation: recording of algorithm in the form of sequence of simple rules, transition from this recording to operator recording, automatic construction of translation program according to operator recording of algorithm by means of compiling program.

4. Compiling program, its structure. Some features of structure of programs obtained by the method described.

57. A FRENCH-RUSSIAN TRANSLATION ALGORITHM

O. S. Kulagina (Moscow)

(1) Formulation of problem: translation of mathematical texts. Demands for quality in translation, cases requiring editing.

(2) Structure of glossary for machine translation, features. Glossary information and purpose. Glossary of phrases.

(3) Principles in constructing translation algorithm. Structure of algorithm and order of operation. Word look-up in glossary. Treatment of phrases. Differentiation of homonyms and analysis of polysemants, order of operation of rules for differentiating homonyms. Analysis of French sentence, problems. Sequence of handling parts of speech during analysis.
Character of information obtained through analysis. Change of word order in translation. Synthesis of Russian sentence: order of operation of synthesizing rules and how they differ from analyzing rules.

(4) Supplementing and correcting algorithm on the basis of experimental translations (greater precision in rules for differentiating homonyms, change in handling of adjectives, separation of morphological from syntactic analysis).

58. DETERMINATION OF SYNTACTIC CONNECTIONS FOR FORMULAS IN RUSSIAN [illegible] TEXTS

M. M. Langleben (Moscow)

1. We call "formulas" all text elements not found in a mechanical glossary in processing a text (surnames, mathematical formulas, foreign references, neologisms, etc.). "Formulas", like words to be translated, require the ascertaining of syntactic connections in the text to be analyzed, i.e., the identification of formulas that form part of one of the previously given syntagmas.

2. The analysis of a "formula" is broken down into 2 parts:
(A) testing the formula proper for the presence of any word-changing suffixes, the sequence of tests being determined by frequency of the cases.
(B) analysis of its environment (words and punctuation marks). This begins only after all the "formulas" contained in the given segment of text have passed through part A.

3.
The following order of ascertaining the possible syntactic connections for "formulas" is advisable in that it eliminates the possibility of establishing false syntagmas:
(a) the formula acts as an adjective for a substantive standing on the right;
(b) the formula is a name with a substantive standing on the left;
(c) the formula forms part of a prepositional phrase;
(d) the formula forms part of a syntagma with an adjective requiring the dative case (RAVNYI [equal], KRATNYI [multiple]);
(e) the formula forms part of a syntagma with an adjective in the comparative degree replacing a substantive in the genitive case;
(f) the formula replaces a governing substantive in an "adjective + substantive" syntagma;
(g) the formula acts as a predicative combination.

These last are used to check various syntagmas with a verb; the function of a formula with a verb is chiefly determined by its position on the right or left of the formula, not by the form of the verb.

4. Since the analysis of "formulas" is a basic part of the routine developed for the language as a whole, it will be performed piecemeal at various stages of the total analysis.

59. ELIMINATION OF MORPHOLOGICAL AND SYNTACTIC H[OMONYMY]

M. M. Langleben and Ye. V. Paducheva (Moscow)

1. Those words in a dictionary of stems that cannot be identified as a fixed part of speech, e.g. "attempt" (verb, noun), "cool" (adjective, verb), and "further" (adverb, adjective), etc., are handled as follows: If a word can be a noun and a verb or an adjective and a verb, it is inserted in a dictionary of substantives or verbs, respectively. Those word-changing suffixes that can readily identify one part of speech to which a word belongs (-ed, -ing, but not -s) are listed in a table of word-building suffixes, i.e., if the word has one of these endings, the part of speech will be revealed after morphological analysis.
However, homonymic stems do not require any changes in the analysis routine provided for the other words. (This method is based on a suggestion by A. I. Smirnitskii, who defined conversion as word building by means of paradigms.)

2. If the part of speech cannot be readily determined by morphological analysis (zero ending in word stems: "they attempt", "the attempt"; homonymic ending: "he attempts", "the attempts") or the parts of speech which have no word-changing forms are homonymic ("further": adjective, adverb), the word is assigned several syntactic functions corresponding to the possibilities of the word to enter a syntagma as a substantive, verb, etc. The possible functions are examined in a definite order and a syntagma is established for the given word, depending on whether certain words are present in the sentence; thereafter all the remaining functions listed are dropped except that for which the syntagma was found.

3. Similarly, homonymy in -ing forms, -ed forms, etc. is eliminated by successive tests for the presence of certain syntagmas in the sentence.

60. THE SUPERFLUOUSNESS OF RUSSIAN ADJECTIVE INFLECTION

N. N. Leont'yeva and G. N. Vavilova (Moscow)

1. In machine translation from Russian the procedures for handling the inflection of adjectives are quite cumbersome. The machine has to perform a double task: first, to investigate the inflection of the adjective, then to search for the substantive with which the adjective agrees. There is an easier way of relating an adjective to the substantive with which it agrees, a way that ignores inflection in most cases.

2. When a Russian text is analyzed, it usually turns out that adjective inflection is superfluous as far as translation is concerned. It merely indicates the agreement of the given adjective with a certain substantive.

3.
An adjective may be related to the substantive with which it agrees without analysis of its inflection by using the adjective's position in the sentence. An attributive adjective most frequently occupies a definite position with respect to the substantive with which it agrees: it stands either before this substantive or after it, following a comma. Accordingly, it is possible to formulate two rough rules for relating an adjective to its substantive:
(a) Relate the adjective to the nearest substantive on the right;
(b) If there is no substantive on the right, relate the adjective to a substantive that is followed by a comma.

4. However, relating an adjective to a substantive in accordance with these rules alone may turn out to be incorrect. Therefore, a number of individual tests must be performed before finally deciding the problem of relating an adjective: is the adjective part of the nominal constituent of the predicate, is it included in a formula, does it govern the following noun with or without a preposition (VYVEDENNYI IZ FORMULY [deduced from the formula], RAVNYI [equal to])?

5. After these checks the machine either relates the adjective to the substantive without regard to its inflection or, if it cannot dispense with it, analyzes the inflection of the adjective.

6. An analysis of mathematical texts shows that without investigation of inflection it is possible to relate more than 85% of all adjectives to the appropriate substantives. The remaining 10-15% of the adjectives require an analysis of the inflections.

7. In calculating the number of adjectives we excluded short adjectives, the relative KOTORYI [which], cases where the adjective is part of a formula, cases of ellipsis (the adjective is present, but not the noun with which it agrees, e.g.
OTLICHAYETSYA OT RASSMOTRENNYKH V ETOM PARAGRAFE [it differs from the (things) considered in this paragraph]).

8. The practicability of a method to ascertain the possibility of ignoring adjective inflection has still not been proved. This will require further work on texts as well as more experience with machine translation, taking cognizance of technical difficulties.

Nevertheless, the suggested routine for relating an adjective to its substantive by position criteria will retain its value, even if the necessity for investigating the inflections of all adjectives is demonstrated, since inflection is merely one of the factors that control the correct relating of an adjective to its substantive by position criteria.

61. AN ALGORITHM OF MACHINE TRANSLATION FROM ENGLISH INTO RUSSIAN

T. N. Moloshnaya (Moscow)

I. (1) Different possibilities for formalizing linguistic data in different languages.
(2) Advantages of a structural-syntactic analysis of English.

II. (1) Classification of English and Russian words according to formal criteria.
(2) Grammatical configurations constructed from isolated classes of words.

III. Analysis of English sentence structure according to grammatical configurations.
(1) Replacement of grammatical configuration by its chief member.
(2) Sequence of ascertaining grammatical configurations in the sentence to be analyzed.

IV. Synthesis of Russian sentence structure according to grammatical configurations.
(1) Substitution of the English grammatical configuration used by the corresponding Russian configuration.
(2) Morphological formation of Russian sentence structure.
(3) Definition of grammatical forms of words in the Russian sentence.

V. Elimination of lexico-grammatical homonymy in the English sentence on the basis of:
(1) morphological data,
(2) syntactic data.

VI. Tests of machine analysis of English sentence structure.
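Section III of the abstract above describes analyzing a sentence by recognizing grammatical configurations in a fixed sequence and replacing each by its chief member. The following is only a minimal sketch of that kind of reduction. The word-class labels, the configuration table, and its ordering are hypothetical illustrations; the abstract does not list the author's actual classes or configurations.

```python
# Hypothetical configurations: (sequence of word classes, class of chief member).
# The order of the table stands in for the "sequence of ascertaining
# grammatical configurations"; the real sequence is not given in the abstract.
CONFIGURATIONS = [
    (("ADJ", "NOUN"), "NOUN"),           # attributive group -> its noun
    (("DET", "NOUN"), "NOUN"),           # determiner group -> its noun
    (("ADV", "VERB"), "VERB"),           # adverbial group -> its verb
    (("NOUN", "VERB", "NOUN"), "VERB"),  # subject + verb + object -> the verb
]


def reduce_classes(classes):
    """Repeatedly replace the first matching configuration by its chief member.

    Returns the reduced class sequence and the list of (pattern, position)
    reductions that were applied, in order.
    """
    classes = list(classes)
    steps = []
    changed = True
    while changed:
        changed = False
        for pattern, chief in CONFIGURATIONS:
            width = len(pattern)
            for i in range(len(classes) - width + 1):
                if tuple(classes[i:i + width]) == pattern:
                    classes[i:i + width] = [chief]
                    steps.append((pattern, i))
                    changed = True
                    break
            if changed:
                break  # restart from the first configuration in the table
    return classes, steps


# Classes for a sentence such as "the new machine quickly translates the text".
sentence = ["DET", "ADJ", "NOUN", "ADV", "VERB", "DET", "NOUN"]
reduced, applied = reduce_classes(sentence)
print(reduced)       # ['VERB']
print(len(applied))  # 5
```

Every reduction shortens the sequence, so the loop always terminates. The record of applied reductions is the sort of structural information that a synthesis stage could, under these assumptions, map onto corresponding target-language configurations.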
62. A DEVICE FOR THE READING OF ORDINARY PRINTED [TEXT] BY THE BLIND

R. S. Muratov (Sverdlovsk)

1. Conversion of the graphic form of letters in a printed text into electrical signals is achieved by breaking down the group of photosensitive elements as they move along the line of text.

2. Electrical impulses generated when photosensitive elements are blacked out switch on electronic relays which, in turn, switch on a tactile or phonic signalling instrument.

3. The form of the signals (of successive formation of elementary signals corresponding to each zone of disintegration) expresses the graphic peculiarities of the letters and other marks in the text.

4. Correct reading of the signals requires preliminary instruction by a reader.

63. ANALYSIS OF PUNCTUATION MARKS DURING MACHINE TRANSLATION FROM RUSSIAN

T. N. Nikolayev (Moscow)

1. The purpose of this operation is to obtain the distinguishing features of punctuation marks during machine translation.

2. In translation from Russian each word in the sentence must receive definite morphological and syntactic signs. The required signs are obtained in different ways for each part of speech. In particular, in order to determine the case and number of substantives it is necessary to know the correlative position of the parts of speech within the limits of the closed sentence [ZAMKNUTOGO PREDLOZHENIYA]. However, most Russian sentences are complicated by parenthetical and set-off [i.e., by commas: OBOSOBLENNYE] constructions, subordinate clauses, etc. Hence, to obtain the precise grammatical signs it is necessary to break down a complex sentence into simpler components, dividing the main clause from the subordinate clauses and separating the set-off and parenthetical phrases.
Thus, the final goal of the analysis of punctuation marks is to:
(a) separate simple clauses from the body of the complex sentence, to find the boundaries of the simple clause within the sentence;
(b) separate similar members of the clause;
(c) help the subsequent elucidation of interrelations between the individual parts of the punctuated complex sentence;
(d) determine a group of similar members.

2. [sic] The analysis is made within a single complex sentence. Accordingly, "simple" and "multipurpose" punctuation marks are distinguished. The simple ones (period, exclamation point, question mark, and dots) serve as the boundaries of a complex sentence. Multipurpose marks (comma, dash, and colon) unite simple clauses into a complex clause, introduce subordinate clauses, and separate parenthetical and set-off constructions. We are devoting the bulk of our attention to the multipurpose marks. In a clause they may serve, according to Prof. A. B. Shapiro's terminology, to "divide" or to "separate". We are also paying special attention to the problem of distinguishing between single and non-single punctuation marks (e.g. those used at the end of a set-off phrase and the beginning of a subordinate clause, etc.).

3. As a result of the analysis, all the multipurpose punctuation marks receive one of the following signs:
(1) parenthetical (i.e. separating parenthetical words and phrases);
(2) setting off (separating participial and verbal-adverb phrases as well as set-off attributives and appositives);
(3) similar-simple (dividing similar members of a sentence);
(4) similar-complex (demarcating the parts of a compound sentence);
(5) dissimilar (i.e. introducing a subordinate clause).

4. Separation of the simple clauses occurs within the limits of the complex whole according to our data.
The entire process of analyzing punctuation marks can be divided into 3 stages:
(1) Separation of the purely parenthetical constructions takes place in the analysis glossary, where the words that may be used parenthetically or that are a basic part of a parenthetical phrase undergo special analysis, after which the punctuation marks that separate them receive an appropriate information sign.
(2) Processing of punctuation marks by the "Punctuation Marks" routine, where the basic analysis of all the punctuation marks takes place.
(3) Breakdown of the sentence into its constituent parts: separation of parenthetical and set-off constructions, dividing of simple clauses, etc. Here the occurrence of a "non-single" mark is extremely important. This routine also provides for insertion of a sentence-demarcating punctuation mark where necessary.

5. The "Analysis of Punctuation Marks" routine consists of several parts, each of which corresponds roughly to a given punctuation mark. Within each part several checks are made on a number of individual factors that determine the function of the multipurpose punctuation marks. These factors include the presence of verbs with the sign "LF" (LICHNAYA FORMA) [personal form] on both sides of a given mark (or on one side of it), the presence of verbs with the sign "NELICHNAYA FORMA" [non-personal form], the place of a substantive with the sign FS ("FORMA SLOVARNAYA") [dictionary form] in respect to the given mark, the separation of words belonging to a given lexical group, etc.

6. As a result of our investigation, all the punctuation marks are provided with the requisite distinguishing features and the analysis is performed accordingly within the separated simple units.

64. SOME PROBLEMS CONNECTED WITH THE ANALYSIS OF COMPLEX SENTENCES AND [CLAUSES WITH SIMILAR MEMBERS]

Ye. V. Paducheva (Moscow)

1.
The following problems must be solved in connection with the syntactic analysis of complex sentences and clauses with similar members:
(a) To distinguish between a syntagma with similar members and clause coordination (the difficulties in solving this problem are explained by the fact that most of the co-ordinating conjunctions (I, ILI, NO [and, or, but]) may connect both similar members of clauses and entire clauses, and therefore they cannot serve as a trustworthy sign either of clause boundary or of syntagma with co-ordinating connective [SOCHINITEL'NOI SVYAZ'YU]).
(b) To separate words interlinked by a co-ordinating connective, having divided them from the words governed by them.

2. For this purpose we propose the following method of analyzing sentences with co-ordinating conjunctions (only 2-member combinations are considered for the time being): The sentence is cut up into "chunks", the limits of which are co-ordinating conjunctions, and the syntactic analysis is performed within the chunks; if after completion of syntactic analysis within the chunk no words remain without a governor, it means that the conjunction connects two clauses; if, however, such words remain, it means that the sentence contains similar members. Words lacking a governor are, for the most part, members of a co-ordinating syntagma.

3. When words are combined into a co-ordinating syntagma, the concept of "sameness of form" [RAVNOOFORMLENNOST'] is used. "Sameness of form" is the coincidence of several of their morphological and syntactic signs. The same form is sought beyond the chunk for a word that lacks a governor within the chunk, and a co-ordinating syntagma is thereby established. (This must be refined somewhat due to the possible absence of agreement in number for words within the chunk, etc.)
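The chunk method of items 2 and 3 above can be sketched roughly as follows. This is an illustration under stated assumptions, not the author's routine: the `Word` record, the class labels, the `case` feature, and above all the `ungoverned` function (a toy stand-in for the real in-chunk syntactic analysis) are hypothetical, and the example sentences are transliterated Russian chosen for the sketch.

```python
from dataclasses import dataclass

# Transliterated co-ordinating conjunctions (and, or, but).
CONJUNCTIONS = {"i", "ili", "no"}


@dataclass(frozen=True)
class Word:
    form: str
    pos: str        # hypothetical class labels: "N" noun, "V" finite verb
    case: str = ""  # hypothetical feature used for "sameness of form"


def split_chunks(words):
    """Cut the sentence into chunks bounded by co-ordinating conjunctions."""
    chunks, current = [], []
    for word in words:
        if word.form in CONJUNCTIONS:
            chunks.append(current)
            current = []
        else:
            current.append(word)
    chunks.append(current)
    return chunks


def ungoverned(chunk):
    """Toy stand-in for in-chunk syntactic analysis (an assumption):
    a finite verb is the clause root and governs every other word in its
    chunk; in a chunk without a verb, no word finds a governor."""
    if any(word.pos == "V" for word in chunk):
        return []
    return list(chunk)


def same_form(a, b):
    """Simplified 'sameness of form': same word class and same case."""
    return a.pos == b.pos and a.case == b.case


def analyse(words):
    chunks = split_chunks(words)
    loose = [(i, w) for i, chunk in enumerate(chunks) for w in ungoverned(chunk)]
    if not loose:
        return "clause coordination", []
    pairs = []
    for i, word in loose:
        for j, other in enumerate(chunks):
            if j == i:
                continue
            # Look beyond the chunk for the nearest word of the same form.
            match = next((o for o in reversed(other) if same_form(o, word)), None)
            if match is not None:
                pairs.append((match.form, word.form))
                break
    return "similar members", pairs


# "The boy reads a book and a magazine": similar members.
s1 = [Word("malchik", "N", "nom"), Word("chitaet", "V"), Word("knigu", "N", "acc"),
      Word("i", "CONJ"), Word("zhurnal", "N", "acc")]
# "The boy reads and the girl writes": two co-ordinated clauses.
s2 = [Word("malchik", "N", "nom"), Word("chitaet", "V"),
      Word("i", "CONJ"), Word("devochka", "N", "nom"), Word("pishet", "V")]
print(analyse(s1))  # ('similar members', [('knigu', 'zhurnal')])
print(analyse(s2))  # ('clause coordination', [])
```

The sketch handles only the two-member case that the abstract considers, and its notion of sameness of form ignores the refinements the parenthetical remark in item 3 says are needed.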
4. This method of analysis is feasible for Russian because a word normally contains all the information regarding the possible syntactic connections for it (with some exceptions; compare, e.g., the homonymy of cases, which may make the syntactic function of a word in the chunk indefinite). This method is impracticable in English (e.g. the syntactic functions of a substantive are determined wholly by its position after a transitive verb, before another substantive, etc.; therefore, superfluous "subjects" would appear after the division into chunks is made). However, some of the difficulties mentioned for Russian disappear in English during the analysis of a sentence with co-ordinating conjunctions due to the rigid word order and preferential position of the governed word after the governor. English syntagmas with co-ordinating connectives are determined at the same time as the others during the course of syntactic analysis.

5. Some methods of fixing the boundaries of a simple clause inside a complex clause are indicated.

65. MACHINE TRANSLATION OF COMPOUND NOUNS FROM GERMAN INTO RUSSIAN

V. V. Parshin (Moscow)

1. The extensive use of compounds in German, particularly in scientific and technical literature, has made it necessary to work out universal rules for their translation. Formulation of such rules makes possible a significant reduction in the size of the dictionary and the translation of compounds, provided that the components are known. Universal rules for the translation of compounds are deduced from a structural-semantic analysis of the constituent words. Determination of semantic connections between them ensures an adequate translation. The author's investigations do not pretend to be a complete and final solution to the problem of translating compounds.
They are merely an initial, empirical attempt at working out the basic principles and methods that would permit a more or less successful translation at the first stage.

2. The existence of the following types of connections between the stems of compounds has been demonstrated by an analysis of concrete linguistic material (individual original works on mathematics and a German-Russian polytechnical dictionary):
1. Relation of the sum to the constituents,
2. Relation of a part to the whole,
3. Object or subject of an action to the action,
4. Object of the bearer of a quality to the quality,
5. Object of a determiner to the thing determined.

Translation of the first component of compound words, the internal connections of whose components relate to the first four types, is effected by producing a Russian equivalent in the genitive case. If the last type of connection is present, the first component is translated in two ways: by an adjective and by the production of a Russian equivalent in the genitive case. Polysemy causes a certain type of connection for each meaning of the word. Therefore, a semantic analysis of the components is necessary to differentiate the types of relations between the constituent elements. Differentiating the relation of a part to the whole and the relation of a determiner to the thing determined is the most difficult of all.

3. A special case is the translation of compounds consisting of three components. It is important here to establish the co-subordination of determining stems to the determined stem, which is done by subjecting them to analysis in pairs. Three-component words are translated in accordance with the rules for translating two-stem words.

4.
Compounds of the input text are broken down into constituent stems by the superposition of stems included in the dictionary, taking into account connecting consonants and rejected endings.

5. The principles and methods of translating German compounds into Russian, as set forth above, can serve as the basis for a definitive, detailed solution of one of the most complicated lexicographical problems in German.

66. PROPER NOUNS IN MACHINE TRANSLATION

A. V. Superanskaya (Moscow)

1. Proper nouns are unavoidably present in every scientific text.

2. In the present state of development the machine translates a text, but leaves proper nouns just the way they are, printing them in Latin letters.

3. Since the number of proper nouns increases as one proceeds from selective to continuous translation, the question of the desirability of automatizing the process of transcribing proper nouns arises.

4. Proper nouns are not always written, read, and pronounced in all languages in accordance with the rules for common nouns.

5. Proper nouns are international. The same nouns are encountered among peoples of different nationality. People move from country to country and publish their papers in different countries in different languages. That is the reason for the difficulty in determining the nationality of a noun and, accordingly, the rules by which it should be transcribed.

6. There is much inconsistency in the current transcription of nouns. The need to unify the transcription and eliminate the lack of uniformity is long overdue.

7. Due to the limitless memory potentialities of the machine and the difficulty of mechanical analytical transcription, it is more efficient to store proper nouns as a whole in the machine's memory.
Consequently, if it encountered such a noun in a text, the machine would locate it in the glossary and deliver the answer (simple or in several variants depending on the linguistic origin of the noun and on existing traditions). This would help to make transcription uniform, and it could be accompanied by a printed glossary to match.

67. WORK ON A BURMESE-RUSSIAN ALGORITHM OF MACHINE TRANSLATION

O. A. Timofeyeva (Leningrad)

1. The syllabic nature of Burmese writing requires the elaboration of a special program by which an electrical reading device can handle a Burmese text.

2. We are compelled to restrict the algorithm to the literary form of Burmese speech owing to the sharp divergences between the written and contemporary spoken languages.

3. A highly developed word-building root structure that crosses with a form-building root structure makes it necessary to have a special word-building program, the purpose of which is to separate lexical from morphological phenomena.

4. The development of agglutination and the rudiments of internal inflection require the construction of a complicated morphological program for handling the abundant and varied grammatical information contained in the Burmese word.

5. The absence of a rigid order for nominal members of the Burmese sentence complicates the syntactic program, which cannot be effected without the preliminary operation of the morphological program.

68. WORK ON AN ARABIC-RUSSIAN ALGORITHM OF MACHINE TRANSLATION

O. B. Frolova (Leningrad)

I. Items from newspapers are used as texts in machine translation from Arabic to Russian.

II.
The main principles in working on an Arabic-Russian algorithm of machine translation, as contrasted with those of traditional grammar, are as follows:
(a) Only the written form of the language with the infixes, consonants, and long vowels is considered, whereas all the existing grammars take into account the short vowels, which are not normally noted in writing. For Arabic two algorithms, differing in principle, are necessary: one for the spoken language, the other for the written; the two variants are not reducible to each other.
(b) The traditional dictionary arranged by roots is replaced by a dictionary arranged by stems.
(c) For convenience in transliterating Arabic letters into Russian letters, the latter are used with no additional signs of any kind.

III. The programs making up the algorithm are as follows: (1) stem-stripping (2) address (3) morphological (4) syntactic (5) dictionary of stems (6) table of prepositions (7) glossary of idioms and phrases (8) program for distinguishing homonyms.

IV. Work on the stem-stripping program:
(a) Initial variations of this program provided for cutting off the stems, prefixes, and suffixes; the glossary increased considerably, however, due to pseudostems.
(b) An important factor in simplifying this program was the idea of a reject [OTKAZNYI] glossary, which was later developed into the idea of an address used in other algorithms too.
(c) The stem-stripping program includes the following rules:
(1) Out of the 28 letters of the Arabic alphabet, 10 letters may be joined as non-radicals to the beginning of a word; these are certain conjunctions and prepositions, the definite article, and verbal prefixes.
(2) In the case of words that do not contain initial non-radical letters, it is necessary to refer at once to the address; endings and suffixes are automatically stripped upon comparing the words with the stems found in the address.

(3) Some of these initial non-radical letters, which when cut off reveal an insignificant number of pseudostems, are first transferred to the end of the words and, converted into suffixes, are kept apart; the words are then sought in the address.

(4) Words with remaining initial non-radical letters, which if cut off would result in a large number of pseudostems, are first checked in the address; if they are not found there, the non-radicals are transferred to the end of the words, and the words are again looked up in the address. Checking for their presence in the address is not equivalent to extracting from the address all the information relating to the stem.

69. EXPERIMENTAL TRANSLATIONS FROM FRENCH INTO RUSSIAN

G. V. Chekova (Moscow)

Devising of algorithms for translation from French to Russian. Sequence of operations for translation programs. Changes in programs and coding of glossary on the basis of experimental translations produced by the machine. Utilization of scales in translation programs. Programming characteristics, scope of programs and glossary; operations utilized in translation programs; numerical characteristics of translation programs. Basic demands made of a special translation machine. Examples of translations produced by the STRELA machine in 1957-1958.

70. ESTABLISHMENT OF SYNTACTIC CUES FOR PREPOSITIONAL PHRASES

I. N. Shelimova (Moscow)

1.
The object in making a syntactic analysis of prepositional phrases consisting of either a preposition and substantive standing to the right of it or a preposition and pronoun immediately adjacent to it on the right is to include these prepositional phrases in syntagmas. It is necessary, therefore, to find a word in the sentence with which the prepositional phrase forms a syntagma.

2. There are no complications in drawing up the rules for the formal analysis of prepositional phrases if a word that belongs to a class of words capable of forming a syntagma with the prepositional phrase is found immediately to the left of the prepositional phrase. The only exception is a case where a noun stands next to the prepositional phrase. Thus, if there is any verbal form (infinitive, participle (short or full), or verbal adverb), an adjective (full or short), or a word from the special group of invariable words on the left of the prepositional phrase, the prepositional phrase forms a syntagma with this particular word.

3. If on the left of the prepositional phrase is a word that belongs to a class of words with which the prepositional phrase does not generally form a syntagma (pronouns, adverbs, particles, conjunctions), or the prepositional phrase stands at the very beginning of the sentence, then the word with which the prepositional phrase forms a syntagma must be searched for in the following order:

(a) Search to the left for the next word with which the prepositional phrase may become a syntagma, excluding a noun, i.e. search for any form of verb, adjective, or special kind of invariable word. A prepositional phrase may unite in a syntagma with several of the classes of words listed after it fulfills a series of conditions.
(b) Search to the right for the next word belonging to the class of verbs (except the full participle and verbal adverb), or a word from the special group of invariable words, or a short adjective. Actually, while searching for a word on the right with which the prepositional phrase may form a syntagma, we are looking for a word in the predicate of the sentence.

4. If a prepositional phrase stands next to a noun (immediately to the left of the noun), the rule for establishing the syntagma constituted by this phrase is not general for prepositional phrases with different prepositions.

5. Therefore, any of the following may be significant in determining the rules for analyzing prepositional phrases with a number of prepositions:

(a) The lexical composition of the prepositional phrase itself;

(b) Does the prepositional phrase have on its left a noun which by virtue of its syntactic or lexical properties is such that its connection with the prepositional phrase must be regarded as certain?

(c) Does the sentence have any verbal form that by virtue of syntactic or lexical properties must be regarded as necessarily connected with a given prepositional phrase?

6. The structure of the sentence is particularly important in establishing the rules of syntactic analysis for prepositional phrases with several other prepositions (e.g. v, na with the prepositional case, and dlya). In order to determine the regular syntactic cues for the prepositional phrases mentioned, it is necessary in certain cases to know if the prepositional phrase stands before or after the predicate, or which syntagma contains the noun that is followed by the prepositional phrase. Sometimes it is important to know whether or not this noun in turn forms a prepositional phrase with certain prepositions (e.g.
v rezul'tate [as a result of], posle [after], etc.), because in such a case a prepositional phrase with the prepositions mentioned cannot be related to this noun.

71. CORRELATION BETWEEN 3RD PERSON PERSONAL PRONOUNS AND THE SUBSTANTIVES FOR WHICH THEY ARE USED

A. L. Shumilina (Moscow)

1. In machine translation the 3rd person personal pronouns of one language cannot be mechanically substituted for the corresponding pronouns of another language, since gender is not an inherent sign of every pronoun but depends on the gender of the corresponding noun, which is accidental as far as the pronouns are concerned and specific for the different languages.

2. The following formal data must be obtained first if the correlation between a pronoun and the corresponding substantive is to be established:

(a) The boundaries of the clauses (no cognizance is taken of the differences between the boundaries of clauses within sentences and sentence boundaries);

(b) The grammatical properties of the substantives and 3rd person personal pronouns (gender, number, case);

(c) The syntactic relations and specific syntactic functions of the substantives;

(d) The order of substantives in the clauses;

(e) Certain sequences of syntactically related words (e.g. expanded attributes).

3. A substantive for which a given pronoun is used must "correspond grammatically" to this pronoun. By grammatical correspondence we mean the correspondence between substantive and pronoun in number (correspondence in number will in several cases differ from the conventional) and gender (in the singular).

4. The way to determine the corresponding ("unknown") substantive is, for the most part, as follows: the search for the grammatically corresponding word is made only to the left of the given pronoun (omitting the previously determined elements in the clause).

A.
Within the zero clause. (Clauses subject to analysis are numbered: zero is the clause within which the given pronoun is found; first (1) is the next clause to the left of the zero; second (2) is the next clause to the left of the first; etc.)

(a) For pronouns in the nominative case, the only possible unknown substantive may be one with a sign of the "grammatical subject" (this concept is defined beforehand).

(b) For pronouns in other than the nominative case, the unknown word is the substantive that is closest to the given pronoun, but with certain restrictions (e.g. the unknown substantive must not form a single word combination with the given pronoun, nor must it be the middle word or the word on the extreme right in a chain of genitive cases if the word on the extreme left satisfies the sign of "grammatical correspondence", etc.).

B. Within the first, second, ... nth clause. (The analysis is made successively within the 1st, 2nd, ... nth clause until the word that satisfies our requirements is found.)

For pronouns both in the nominative and in other cases, a word with a sign of the "grammatical subject" is considered first; in the event that there is no grammatical correspondence between the pronoun and the "grammatical subject" found, we pass on to a word with a sign of the "grammatical direct object", then to the substantive that is closest to the right boundary of the 1st or nth clause (taking into account the various restrictions already determined).

5. Similar work in the future may, with appropriate additions (animateness in nouns and other criteria), be significant from the point of view of practical stylistics, i.e. it may create the possibility of determining certain purely formal rules for using 3rd person personal pronouns on the basis of the laws of the language itself.
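
The search order described in points 3 and 4 of abstract 71 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the author's program: the `Word` record, its field names, the role labels "subject" and "object", and the functions `corresponds` and `resolve` are all hypothetical, and the additional restrictions mentioned in 4.A(b) (word combinations, chains of genitives) are deliberately left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    """Hypothetical record for one analyzed word (field names are assumptions)."""
    text: str
    kind: str = "other"   # "noun", "pronoun", or "other"
    gender: str = ""      # "m", "f", "n"
    number: str = "sg"    # "sg" or "pl"
    case: str = "nom"
    role: str = ""        # "subject", "object", or "" (assumed pre-assigned)


def corresponds(noun: Word, pron: Word) -> bool:
    """Point 3: agreement in number, and in gender when singular."""
    if noun.kind != "noun" or noun.number != pron.number:
        return False
    return pron.number != "sg" or noun.gender == pron.gender


def search_zero_clause(left_part: List[Word], pron: Word) -> Optional[Word]:
    """Point 4.A: look only to the left of the pronoun, nearest word first."""
    candidates = [w for w in reversed(left_part) if corresponds(w, pron)]
    if pron.case == "nom":
        # 4.A(a): a nominative pronoun may refer only to the grammatical subject.
        candidates = [w for w in candidates if w.role == "subject"]
    return candidates[0] if candidates else None


def search_earlier_clause(clause: List[Word], pron: Word) -> Optional[Word]:
    """Point 4.B: subject first, then direct object, then the substantive
    closest to the right boundary of the clause."""
    for role in ("subject", "object"):
        for w in clause:
            if w.role == role and corresponds(w, pron):
                return w
    for w in reversed(clause):
        if corresponds(w, pron):
            return w
    return None


def resolve(clauses: List[List[Word]], clause_idx: int, word_idx: int) -> Optional[Word]:
    """Find the substantive for the pronoun at clauses[clause_idx][word_idx]."""
    pron = clauses[clause_idx][word_idx]
    found = search_zero_clause(clauses[clause_idx][:word_idx], pron)
    # Clauses are then examined successively, moving leftward (1st, 2nd, ... nth).
    i = clause_idx - 1
    while found is None and i >= 0:
        found = search_earlier_clause(clauses[i], pron)
        i -= 1
    return found


if __name__ == "__main__":
    first = [Word("kniga", "noun", "f", "sg", "nom", "subject"), Word("lezhit"),
             Word("na"), Word("stole", "noun", "m", "sg", "prep")]
    zero = [Word("ona", "pronoun", "f", "sg", "nom"), Word("staraya")]
    match = resolve([first, zero], 1, 0)
    print(match.text if match else None)
```

In this toy input the pronoun "ona" has no substantive to its left in the zero clause, so the search passes to the first clause and stops at its grammatical subject. A masculine pronoun in the same position would fail to correspond with that subject and would fall through to the substantive nearest the right boundary of the clause.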
