DATA AUTOMATION PROGRAM RECORDS - GENERAL RECORDS SCHEDULE NO. 20

Document Type: 
Collection: 
Document Number (FOIA) /ESDN (CREST): 
CIA-RDP74-00005R000100010007-6
Release Decision: 
RIFPUB
Original Classification: 
K
Document Page Count: 
33
Document Creation Date: 
December 9, 2016
Document Release Date: 
May 4, 2001
Sequence Number: 
7
Case Number: 
Publication Date: 
April 28, 1972
Content Type: 
REGULATION
File: 
CIA-RDP74-00005R000100010007-6.pdf (1.54 MB)
Body: 
Approved For Release 2001/08/30 : CIA-RDP74-00005R000100010007-6

FPMR 101-11.4
April 28, 1972

DATA AUTOMATION PROGRAM RECORDS - GENERAL RECORDS SCHEDULE NO. 20

Introduction

This schedule covers machine readable records, related documentation required for their servicing, and files related to the automatic data processing (ADP) procurement, operations, and management functions.

The decision table format, rather than the columnar format, is used for two reasons: 1) footnote requirements are greatly reduced with this format as compared to the columnar format of the first 19 schedules, and 2) the number of times a given file of logical records has been processed is often more important than the name assigned to it. For example, in an update system, the last created version of an interim master file becomes a final master file after the sponsor declares it error free. The only difference between it and its predecessors is the version number. There may be many versions of a given file created during the course of a processing cycle.

Failure to promptly return unneeded tapes to the inventory will lead to excessive requirements for tape. For this reason it is imperative that the creator of machine readable records assign file retention times at the outset, that is to say, at the time of the original system design effort.

The principal machine readable and supporting records common to more than one agency have been divided into four categories. These classes of records correspond roughly to the typical organizational and functional structure found in most ADP installations and their parent organizations.

Data automation planning and operational records (part I) are normally those created during the life cycle of individual computer installations. They deal with planning for, managing, procuring, selecting, utilizing, and accounting for the physical facility investment of the ADP installation and supporting activities.
Documentation required for servicing machine readable records (part II) is defined as the organized series of descriptive documents required to initiate, develop, operate, and maintain specific applications of ADP systems. These include project documentation, system specifications, test data and procedures, file and user documentation, and the various installation procedures and standards used in daily operations.

Erasable media (part III) covers all devices which store machine readable records in an erasable mode. At present, only magnetic media are commonly used for such purposes. However, future technological developments may provide the same characteristics (nonvolatility and easy reusability) now found on magnetic tape.

Since magnetic records may be destroyed by overwriting, a variety of protective devices and techniques have been developed over the years to preclude inadvertent erasure of records. The earliest technique, still in use, consists of a mechanical interlock device known as a "write protection ring," inserted or left out of a reel of tape. With the later development of computer-manufacturer-supplied "operating systems," an additional safeguard was inserted into the software. It consists of writing file identification and expiration dates on a label record at the beginning of each reel of tape. Other magnetic media, such as disks, depend almost solely on such software devices.

Nonerasable media (part IV) covers such media as punched cards and paper tape. Most ADP installations use media other than magnetic for a variety of roles and functions, but for the most part they are temporary. However, punched cards are sometimes used as documents such as checks, savings bonds, and requisition forms. In such cases, the functional retention period, developed in other records schedules, will apply.

Procedural analysis of data processing systems (part V) is a guide for archivists, records officers, and auditors in determining secondary uses for data files. Unlike paper systems, computers create more working copies, which should be erased promptly. But the secondary value, such as furnishing data for audit trails and statistical analysis, must be recognized when appraising machine readable records. Many systems, in becoming more automated upon procurement of newer ADP equipment, drop certain manual controls. Since many systems are dynamic, they change due to corresponding changes in legislation and other factors. Thus, nonoperational programs may have to be kept for site-audit records.

Part I. Data Automation Planning and Operational Records

Covering documentation relating to objectives, concepts, policies, and plans providing overall aspects of data automation data needs and systems design of management supporting systems and operational supporting systems, including equipment selection and statistics.

Item 1
File designation: Planning documents
Consisting of: master plan, feasibility studies with associated charts and diagrams, and supporting data that reflect on the characteristics of the data automation activity
Which are: graphic, narrative, and tabular information relating to the present and/or planned ADP composition and requirements of the data automation activity
Then: disposal not authorized by this schedule.

Item 2
File designation: Program management
Consisting of: development of plans, policy, and procedures governing the conversion to electrical machine operations and the supervision, control, coordination, and operation of the mechanization program
Which are: maintained at policy determination level
Then: disposal not authorized by this schedule.

Item 3
File designation: Hardware selection
Consisting of: agency requirements, specifications for hardware, software, and support capabilities of vendors of complete installations or of major peripheral [equipment]
Which are: selection criteria for procurements in the establishment or modification of an ADP installation
Then: dispose of 2 years after specific configuration of equipment is discontinued.

Item 4
File designation: Standardization
Consisting of: data elements and codes, standardization requests, and justification for all data systems
Which are: promulgated Federal or national [standards] (except record copies at National Bureau of [Standards])
Then: dispose of when superseded or obsolete.

Item 5
Which are: other standards; e.g., developed by agency
Then: disposal not authorized by this schedule.

Item 6
File designation: Utilization [illegible]
Consisting of: forms or cards that equipment operators and maintenance [personnel] complete relative to machine use, nonuse, or maintenance
Then: dispose of after 3 years.

Item 7
Consisting of: daily detail cards, intermediate summary decks, related magnetic files, and machine listings
Which are: used for daily management of operations
Then: dispose of after 90 days.

Item 8
File designation: [illegible]ability
Consisting of: monthly summary of cost and utilization reports
Which are: card decks, magnetic tape files, and machine listings
Then: dispose of after 3 years.

Item 9
Consisting of: documents concerning the management of ADPE
Which are: original records maintained at data-processing installation
Then: dispose of 2 years after the date equipment is discontinued.

Item 10
Consisting of: requirements for cards, paper and magnetic tape reels, and inventory of ADPE supplies
Then: dispose of after 1 year.

Item 11
Consisting of: contractor's invoices for rental and other charges incurred for use of ADPE
Then: dispose of after 3 years.

Item 12
File designation: Magnetic tape library control records
Consisting of: library transaction records
Which are: card decks and magnetic tape files
Then: dispose of [illegible] fourth update cycle [illegible].

Item 13
Which are: machine listings
Then: dispose of after 90 days.

Item 14
Which are: transaction slips
Then: dispose of after 90 days or when no longer needed.

Item 13. Machine listings of library transactions are often produced daily. Quite often the transaction listings provide audit trails of the last recording made on a specific reel and may be useful in retrieving a lost file or in determining how a file may have been inadvertently scratched. Accordingly, some installations keep some copies of these listings for as long as 1 year.

Item 14. Transaction slips for military-classified or other sensitive records have longer retention periods. These retention periods are generally specified as a matter of agency policy or regulation.
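The fixed-period dispositions for items 13 and 14 lend themselves to a simple date computation. What follows is a minimal sketch in Python, not part of the schedule: the names RETENTION_DAYS and earliest_disposal_date are hypothetical, and it assumes that "90 days" means calendar days counted from the date the listing or slip was created, a starting point the schedule itself does not define.

```python
from datetime import date, timedelta

# Hypothetical encoding of two Part I rows, keyed by item number.
# Assumption: periods are calendar days counted from the record's creation date.
RETENTION_DAYS = {
    13: 90,  # machine listings of library transactions
    14: 90,  # transaction slips (or when no longer needed)
}


def earliest_disposal_date(item: int, created: date, extra_days: int = 0) -> date:
    """Return the first date on which a record of the given item may be disposed of.

    extra_days models a longer local retention, such as an installation that
    keeps transaction listings for as long as 1 year (see the note on item 13).
    """
    if item not in RETENTION_DAYS:
        raise KeyError(f"item {item} has no fixed retention period in this sketch")
    if extra_days < 0:
        raise ValueError("extra_days must not shorten the scheduled period")
    return created + timedelta(days=RETENTION_DAYS[item] + extra_days)


if __name__ == "__main__":
    # A listing produced on the schedule's own date, April 28, 1972.
    print(earliest_disposal_date(13, date(1972, 4, 28)))  # 1972-07-27
```

Items whose disposition depends on an event rather than a fixed period (for example, discontinuance of equipment or completion of an update cycle) are deliberately left out of the lookup, since they cannot be derived from a creation date alone.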
[Part II. Documentation Required for Servicing Machine Readable Records]

Documentation covering the organized series of descriptive documents relating to all aspects of system development and operation. These include system planning documents, ADP systems specifications, application program manuals, systems operating instructions, and various management aids.

Item 1
File designation: Specific data systems planning records
Consisting of: documents containing definition of the system including the system objectives, request for the system, authorizing directives, source data, detailed studies reflecting advantages and disadvantages of alternate solutions, equipment requirements, tangible benefits, output requirements, and schedule for completion
Which are: at departmental level headquarters
Then: disposal not authorized by this schedule; review after 5 years.

Item 2
Which are: supplementary files at ADP unit level
Then: dispose of 5 years after final action.

Item 3
File designation: System test documentation
Consisting of: system test specifications, test runs, machine listings of test data, and test results
Which are: an approved system
Then: dispose of 1 year after discontinuance of the system.

Item 4
Which are: a disapproved proposed system
Then: dispose of 1 year after final action.

Item 5
File designation: Systems design specifications
Consisting of: documents containing operating procedures for implementation of a specific data system, including policies, instructions, details of computer technique, logic charts, and input/output document flow data
Which are: for systems for which related magnetic tape data is authorized for blanking
Then: dispose of at time final magnetic tape records produced by system have been blanked.

Item 6
Which are: for systems for which the related magnetic tape data is not authorized for blanking
Then: retain with the related magnetic tape.

Item 7
File designation: File(s) specifications
Consisting of: narrative description of the source and functional characteristics of the file(s), a definition of the content of each record in terms of the relative position, name, length, and type of each data element in a field (run layout), explanation of the coding system, and a cross reference code manual of every code used together with all their values
Which are: for systems for which the related magnetic tape data is authorized for blanking
Then: dispose of at time final magnetic tape records produced by system have been blanked.

Item 8
Which are: for systems for which the related magnetic tape data is not authorized for blanking
Then: retain with the related magnetic tape.

Item 9
File designation: Input specifications
Consisting of: detailed description of each transaction that generated some activity in the system in the form they appear at the time they enter the computer system; identification title, recording media, purpose, frequency, volume, and source; detailed description of the contents of each input to the basic record file and a graphic illustration of each
Which are: for systems for which the related magnetic tape data is authorized for blanking
Then: dispose of at time final magnetic tape records produced by system have been scratched.

Item 10
Which are: for systems for which the related magnetic tape data is not authorized for blanking
Then: retain with the related magnetic tape.

Item 11
File designation: Output (report forms) specifications
Consisting of: detailed descriptions of products of the system that are to be used outside the computer center
Which are: a listing of the outputs by sequence, name, media, purpose, frequency, volume and distribution; a detailed record description; and samples of output in the form of layouts or copies, keyed to names and numbers in the output listings
Then: dispose of on termination of system by either obsolescence, update, or discontinuance.

Item 12
File designation: Application program manual
Consisting of: documents reflecting the latest information for a general description of the function, use, and methodology of the program
Which are: a description of input, files, and output; source and object code listings and flow diagrams showing the logic of the program; description of program output messages; and coding information, test plan, program test, and operating instructions
Then: dispose of on termination of system by either obsolescence, update, or discontinuance.

Item 13
File designation: User guides
Consisting of: information used in training or explaining overall system
Which are: handbooks, guides to data availability, and procedures for querying files
Then: retain with system specification.

Item 14
File designation: System operating procedures
Consisting of: user oriented instructions 1) to prepare input data, 2) for control and interpretation of output reports, and 3) for processing work on the computer
Which are: for systems for which the related magnetic tape data is authorized for disposition
Then: dispose of at the time magnetic tape reels are scratched.

Item 15
Which are: for systems that require retention of related magnetic tape data
Then: retain with file (systems) specifications.

Item 16
File designation: Report
Consisting of: printed final report containing the statistical tabulation and an analysis of the findings of a study or survey including a narrative description of methodology employed
Which are: for systems which require retention of related magnetic tape data
Then: retain one copy of the printed report with related file specifications.

Part III.
Erasable Media

The term erasable media refers to tape (analog, digital), drums, disks, disk packs, data cells, and other devices that store data in an erasable mode. The term "dispose of" in column 4 is synonymous with the terms "scratch," "erase," and "blank."

Item 1
File designation: Scratch tape (blank tape)
Consisting of: temporary magnetic tape used by the console operators or tape handlers to facilitate general computer runs such as sort and merge runs
Which are: new tape or tape not included in a tape library control or files whose retention dates have expired
Then: available for immediate use or reuse.

Item 2
File designation: Test tape
Consisting of: magnetic tape used in testing a proposed system
Which are: used by programmer for individual run testing and not under library control
Then: dispose of after system has been accepted or discontinued, whichever is sooner.

Item 3
Which are: system debugging test data
Then: retain until related program is discontinued.

Item 4
Which are: system acceptance test data
Then: [illegible]

Item 5
File designation: Program tape or disk pack
Consisting of: tapes (disk packs) containing sequence of instructions required to accomplish the processing of data or solving a problem
Which are: updated
Then: dispose of after third update cycle.

Item 6
Which are: the last update of specific EDP application used in a terminated system
Then: dispose of after agency has exhausted its use of the tape.

Item 7
Which are: required in audit trail
Then: dispose of in accordance with functional guidelines provided by GAO.

Item 8
File designation: Raw data input
Consisting of: magnetic tapes containing data abstracted from source documents or other media and entered into the system for the first time
Which are: used for updating with existing program and required to support reconstruction of master file
Then: dispose of first generation data upon successful completion of fourth processing machine pass.

Item 9
Which are: not required to support reconstruction of master file and/or used as input for a one-time study or survey
Then: dispose of after raw data is processed into final data and proved satisfactory.
Item 10
Which are: officially designated to replace or serve as the basic source data in lieu of the "hard copy" or other input source document
Then: dispose of in accordance with instructions applicable to the "hard copy" or other files documenting the same process, transaction, or case.

Item 11
File designation: Working tape (input/output)
Consisting of: magnetic tape containing output or control within or from one run to a subsequent run that manipulates, sorts and/or moves data through the systems; includes checkpoint, edit, correction, unmatched data lists, reject, eliminating error, and rerun tapes
Which are: used in an updated system
Then: dispose of after subsequent magnetic tapes that contain the accepted detail data have been created and proved satisfactory.

Item 12
Which are: used in a one-time study or survey
Then: dispose of after master data tape has been proved to be satisfactory.

Item 13
File designation: Valid transaction file
Consisting of: magnetic tapes containing valid file of items used with a master data tape input file for creation of master data tape output
Which are: partially valid transaction after all outstanding items are liquidated from current status tapes
Then: dispose of after third update cycle.

Item 14
Which are: valid transaction after cumulative final master tape is prepared and determined to be successful, and there is no necessity for statistical analysis
Then: dispose of after third update cycle.

Item 15
Which are: used in additional statistical analysis
Then: disposal not authorized by this schedule.

Item 16
File designation: Information retrieval system master reference file
Consisting of: magnetic media containing data created by the merging of prior master file with valid transaction data to create a new master file (including the security copy tape of data on disk packs)
Which are: a cumulative index to scientific and technical publications, and bibliographic and other nonrecord material
Then: dispose of after third update cycle.

Item 17
Which are: an index to record material such as correspondence, legal hearings and decisions, patents and trademarks, and record copy of publications
Then: disposal not authorized by this schedule.

Item 18
File designation: Federal loan and grant program master file
Consisting of: magnetic media containing data created by the merging of prior master file with valid transaction data to create a new master file (initial data includes excerpts from forms placed in case files)
Which are: cumulative data of funds made available through federally supported loan and grant programs
Then: dispose of after third update cycle.

Item 19
Which are: noncumulative periodic file of status of Federal loan and grant activity
Then: disposal not authorized by this schedule.

Item 20
File designation: "Housekeeping systems" master data file
Consisting of: magnetic media containing data for such "housekeeping systems" as fiscal accountability, supply management, and payroll administration
Which are: not required for GAO site audit
Then: dispose of in accordance with instructions applicable to the hard copy or other files documenting the same process, transaction, or case.

Item 21
Which are: required for GAO site audit
Then: dispose of in accordance with functional guidelines provided by GAO.
Item 22
File designation: Economic statistics master file
Consisting of: magnetic media containing data created by the merging of prior master file with valid transaction data to create a new master file
Which are: cumulative data such as status of banks and insurance institutions; production, consumption, and monetary status of industry and agriculture; value of foreign commerce and other economic indicators such as construction of houses and buildings; motor, rail, and air travel; communications, including broadcasting, telephone, and telegraph
Then: [illegible]

Item 23
Which are: noncumulative data used to prepare reports covering a limited period of time
Then: disposal not authorized by this schedule.

Item 24
Which are: noncumulative recurring periodic surveys including wholesale and consumer price indexes, annual industry, housing vacancy, and other economic indicators
Then: disposal not authorized by this schedule.

Item 25
Which are: noncumulative economic census taken during 5-year intervals
Then: disposal not authorized by this schedule.

Item 26
File designation: Social statistics master file
Consisting of: magnetic media containing data created by the merging of prior master file with valid transaction data to create a new master file
Which are: cumulative social and demographic data concerning births, deaths, and marriages; income taxes paid; social security accounts; employment information; law enforcement, crime and civil disturbance, and other social indicators
Then: dispose of after third update cycle.

Item 27
Which are: noncumulative data used to prepare reports covering a limited period of time
Then: disposal not authorized by this schedule.

Item 28
Which are: noncumulative recurring periodic surveys including current population statistics, annual industry, housing vacancy, voter participation, and statistics of income sample
Then: disposal not authorized by this schedule.

Item 29
Which are: noncumulative demographic censuses
Then: disposal not authorized by this schedule.

Item 30
File designation: Natural resources master file
Consisting of: continuously updated magnetic media containing data created by the merging of prior master file with valid transaction data to create a new master file
Which are: cumulative data on characteristics, use, and ownership of natural resources such as land, water, minerals, and timber
Then: dispose of after third update cycle.

Item 31
Which are: noncumulative data used to prepare reports covering a limited period of time
Then: disposal not authorized by this schedule.

Item 32
File designation: Longitudinal studies master data file
Consisting of: magnetic tape containing data recorded over time from one or more sources
Which are: a series of observations relating to individual units (persons, places, things)
Then: disposal not authorized by this schedule.

Item 33
File designation: Scientific data files
Consisting of: magnetic media source data recordings received from experimental sensor instruments for scientific measurements such as outer space orbiting spacecraft, oceanographic and geophysical phenomena, and medical research (including analog tape)
Which are: converted to raw data digital magnetic tape media
Then: dispose of after meaningful data has been analyzed.

Item 34
Which are: not converted or converted only in part to raw data digital magnetic tape media
Then: dispose of after determination has been made that the data will not be converted to raw data digital magnetic tape media.
Item 35
Consisting of: magnetic media containing data created either from analog magnetic tape or recorded directly on magnetic digital tape for scientific measurements of astronomic, outer space, and oceanographic phenomena; air and water quality; and medical research measurements
Which are: held in national data centers
Then: disposal not authorized by this schedule.

Item 36
Which are: not duplicated in national data centers
Then: disposal not authorized by this schedule.

Item 37
Which are: duplicated in national data centers
Then: dispose of after determination is made that data is not required outside the data centers.

Item 38
Which are: not calibrated or validated
Then: dispose of after subsequent magnetic tapes containing the accepted data have been created and proved satisfactory.

Item 39
File designation: Summary data file
Consisting of: magnetic tape containing aggregates of individual observations from valid transaction or master data files
Which are: substantially unpublished, such as tapes containing data that are disclosure free that are disaggregates of published data
Then: disposal not authorized by this schedule.

Item 40
File designation: Publication tape
Consisting of: magnetic tape containing source output data extracted from the system (without destroying the source tapes)
Which are: reproduced and disseminated as a publication or used for reproducing a printed publication
Then: record copy not authorized for disposal by this schedule.

Item 41
File designation: Print tape
Which are: used for producing required printouts of tabulations, ledgers, tables, registers, and reports
Then: dispose of after output has been released and approved.

Item 42
File designation: Reformatted data file
Consisting of: magnetic tape containing essentially duplicate data from the master data file but which is created for use with other computer hardware systems
Which are: created for the specific purpose of information interchange
Then: dispose of as provided for master data tape.

Item 43
Which are: of specific application for agency computer hardware systems
Then: dispose of when determination is made that such format is unnecessary.

Item 44
File designation: Sample and subsample data files
Consisting of: magnetic tape containing individual observations selected from a larger census or survey file such as stratified or pure random sample files with or without weighting factors
Which are: disclosure free or useful in statistical analysis or policy formulation models and simulation studies
Then: disposal not authorized by this schedule.

Item 45
File designation: Security backup file
Consisting of: magnetic tape that is identical in format to master tape retained as security in case master tape is damaged or inadvertently erased
Which are: updated
Then: dispose of after third update cycle.

Item 46
Which are: a one-time study or survey
Then: dispose of or retain in accordance with standards for scratching of corresponding master file.

Item [47]
File designation: Other agency files
Consisting of: magnetic tape created by other agencies
Which are: not altered substantially by the receiving agency
Then: dispose of when no longer needed.

Items 3 and 4. This type of data is differentiated from simple debugging test data in that the data set is used to exercise all possible data system options within the complete set of programs. System debugging test data means data used to debug individual programs or groups of programs prior to final acceptance testing. It must be retained until the related program is discontinued. Acceptance test data may also be a contractually defined specification or item in software systems being procured and it or a listing of it may have to be kept with the contract file. For details in this case, see General Records Schedule 3, item 4.
In other cases, particularly in systems where accounting for funds is involved, it may be required that the files be kept until a particular version of a system has been audited and approved by the General Accounting Office. Retention periods in this case will be in accordance with the specific functional file in one of the other general records schedules. This means that specific acceptance test data sets might have to be kept for the life of the particular version of a software system or until all records produced under that system have been disposed of.

Item 7. Just as the acceptance test data may need to be kept beyond its useful life for auditing purposes, programs which processed that data may also be kept for audit purposes beyond the operational life of the particular system. Disk packs are relatively expensive for long-term storage and there is usually a backup copy of the system on magnetic tape. In these cases, the tape copy of the program together with all relevant documentation may be used in lieu of the disk pack version. Either source or object versions of the system may be used for this purpose.

Items 16, 18, 22, 26, and 30. "Cumulative data" implies no earlier data is deleted in the present pass.

[Part IV. Nonerasable Media]

Nonerasable media refers to ADP punched cards, paper tape, and other nonerasable, machine readable media.

Item 1
File designation: ADP program card files
Consisting of: punched cards containing common language source program data (source deck)
Which are: processed with a processor or utility program to produce a machine-coded object program
Then: dispose of individual cards when replaced by new ones; dispose of program deck after program has been removed from system. See note in part III, item 4.

Item 2
Consisting of: machine-punched cards containing coded machine language instructions arranged in proper sequence (object deck)
Which are: read into computer memory before running a program to cause the computer to perform data-processing functions
Then: dispose of after successful completion of a program revision or [illegible] after program has been removed from system. See note, part III, item 7.

Item 3
Consisting of: prepunched utility or processor program card decks
Which are: used to update installation systems software
Then: dispose of after receipt and successful use of new cards from the manufacturer or programmer, or 1 year after discontinuance of program or system.

Item 4
Consisting of: job stream (job stack, job control) card decks
Which are: used to activate program-processing modules performing a data-processing job
Then: dispose of individual cards or sets of cards when replaced by new cards and when necessary changes (if any) have been made to appropriate data-processing manual.

Item [5]
File designation: ADP program control cards
Consisting of: punched cards containing data for program control
Which are: pertinent to a specific run or cycle generated by the producer or user
Then: dispose of individual cards or sets of cards when replaced by new cards and when necessary changes (if any) have been made to appropriate data-processing manual.

Item [6]
Which are: for repetitive use and updated either by ADP or user
Then: dispose of individual cards after replacement by new cards; destroy control deck 1 year after program has been removed from system, or after system has been discontinued.
Item 7
File designation: ADP source data cards (or paper tape as applicable)
Consisting of: punched cards or paper tape containing data abstracted from source documents and used for conversion to magnetic tape or processing on electric accounting machine (EAM) equipment created after January 1, 1970
Which are: retained by ADP operational elements as backup to magnetic tape or disk
Then: dispose of when related magnetic tape file has been proven to be satisfactory and has grandfather backup.

Item 8
Which are: EAM output listings and reports
Then: dispose of after 180 days if used in processing without being converted to magnetic tape.

Item 9
Which are: on magnetic tape
Then: dispose of after verification of data on related magnetic tape.

Item 10
Consisting of: punched cards that contain original entry data with film or written inserts
Which are: source documents
Then: dispose of in accordance with instructions applicable to the hard copy or other files documenting the same process, transaction, or case.

Items 5 and 6. These items refer to parameter cards associated with the execution of various options of operational programs. These include date cards, periodic (monthly or quarterly) options executed only occasionally, and queries to information retrieval systems. They do not include card decks for generalized interpreter systems used with computer simulation software packages such as SIMSCRIPT, GPSS, DYNAMO, and similar systems. These decks have the status of program source decks. Similarly, all except report generation decks in file management systems are considered to be source program decks and should be retained or destroyed in accordance with the criteria of items 5-7 of part III.

Part V.
Procedural Analysis of Data Processing Systems--Guidelines for Appraising Files and Data Sets for Permanent Retention

This section is a guide to ADP systems analysts, records officers, and archivists for determining the nature of data files (also called data sets) generated by computers. Factors that influence the selection of specific data files for permanent retention in machine readable form (chiefly on magnetic tape) are indicated and explained here.

An examination of documentation files for a variety of ADP systems found substantial differences in the use of technical terms among agencies and, in some cases, within agencies. These differences are being resolved by several vocabulary standardization groups, among them Federal Information Processing (FIP) task group 5 and its successors and the American National Standards Institute (ANSI) X.3.5 committee on vocabulary. However, the definitions in the vocabulary have not been standardized to the extent that flow chart symbols have been in ANSI Standard X3.12-1968, Flowchart Symbols and Their Usage in Information Processing. Accordingly, better guidance for appraising data and documentation files can usually be achieved by studying the high-level system flow charts in addition to the narrative description found in the system documentation files. The system files are enumerated and described in part II of the schedule.

This section is based on the observation that virtually all ADP systems are composed of a small number of basic procedure types connected in sequences that can be called modules. The text and charts in the following sections are organized around this concept. Almost all existing ADP systems can be analyzed into portions or groupings of these charts.

2. The Elements of Data-Processing Systems

Data processing systems are composed of four basic classes: hardware, software, peopleware, and data files.
The hardware consists of the central processing unit and all of its peripheral devices and recording media. The software consists of the machine instructions that direct the hardware to perform the processing. Peopleware is listed in parts I and II and consists of specifications, hard copy documentation, and user manuals for all personnel involved in running a system. The data files themselves are listed and described in part III. Appraisal criteria for them will constitute the bulk of sections 3 and 4, below.

2.1 Hardware

Computer hardware and recording media are still undergoing relatively rapid evolution, and this presents a problem in finding equipment that can successfully read some older machine readable files. Files to be retained permanently may have to be recopied periodically onto newer media or totally converted in format and most other physical characteristics. Since costs for this type of work are declining, this situation presents no undue burden to the holder of the data. In general, the property value and conversion costs of machine readable records are less than one-tenth of 1 percent of the data collection and editing costs of the information recorded on them. Upon consultation, the Office of the National Archives, National Archives and Records Service, will recommend procedures and techniques needed for the physical preservation of the record content beyond the life of the recording medium.

2.2 Software

Software is divided into two main types: systems software and application software. System software is furnished by the computer manufacturer and is designed primarily to manage the available resources of the computer complex in an efficient manner.
The computer complex consists of the central processor and its attached peripheral devices, such as card readers, magnetic tape drives, high speed printers, and other equipment. In general, this type of software is not related to any specific file or record maintained in an installation. It is, therefore, of no permanent value except to the history of the development of computer science. Selected portions of systems software specifications are useful for reading files produced on one computer with another equipment configuration. However, this information can usually be recorded in less than one page and does not require extensive documentation. Subclasses of system software include utility, operating system, sort, merge, and compiler software.

An exception is application software written in one of the standardized machine independent programming languages. COBOL, FORTRAN, and PL-1 are the three most widely used of such languages. In most cases, application software written in these languages may be considered for retention with the related files. However, only a small portion of the total software written for an application need be retained permanently. For example, a file that has been closed off and covers a specific period of time will not be updated. Therefore, the update software is unlikely ever to be required again and is disposable.

If the file is a large complex data base designed to service many inquiries, retention of the query software may be warranted. However, much good query software is commercially available to handle the problems of file inquiry. Therefore, retention of this software is less important than retention of user documentation described in section 2.3 below.

The final class of software that may have permanent value is that used in computer simulation work.
There are several software systems that have been used in policy formulation and evaluation work for high-level management in agencies. The three best known such software systems are SIMSCRIPT, DYNAMO, and GPSS (General Purpose Systems Simulator). Like COBOL and FORTRAN, these systems are available for most computers on the market. It is also likely that they will continue to be available for the foreseeable future. What is important to save in such applications are the source program decks. The policy alternatives and much of the information on a project are contained in these decks and they often constitute records of intrinsic historical value. Economic and financial projection models and war game software are typical examples.

2.3 Peopleware

A wide variety of hard copy documentation is produced in data processing systems. Peopleware is that documentation required by the personnel involved in the design, development, operation and maintenance of ADP systems. The files are listed in parts I and II of this schedule. Of interest in this section are primarily those files required for the direct servicing of files declared permanent.

The basic concept to grasp in data processing is that the record constitutes a representation of an event and not the event itself. As such, the representation or record may have been recorded by a sensor (as in scientific measurement) or may have been transcribed and encoded from some other document or document group as in all transaction reporting. In either case, a researcher needs to know what kind of transformations occurred between the actual event and its representation on magnetic tape. This knowledge is in the documentation described in part II of this schedule.

For example, most housekeeping systems usually encode events using elaborate code tables rather than narrative fields on the record. A payroll system may have dozens of deduction code possibilities as well as an equal number of pay plans.
Typical codes would represent bond deductions, local tax rates for States and municipalities, charity deductions, overtime and premium shift differential rates, etc. In scientific work, instrument readings represent observations of physical phenomena and other occurrences.

Each time a transaction is encoded or an instrument reading is made, there is a possibility of an error or distortion taking place in the process. The errors may be simple random occurrences such as digit transposition by keypunch operators or transcribers, or systematic because of some bias in the recording instrument or observer. In general, the scientist attempts to calibrate his instruments and adjusts instrument readings for other known factors, and the systems accountant devises consistency checks, batch totals, and clerical training programs to assure "accurate" recording of his data.

Permanent records of this class include the file and input specifications (items 7-10 of part II) along with the final version of the related tape file. They tell a future user of the probable quality and coverage of the file and, for those with much encoding such as accounting files, the meaning of all of the descriptive data fields along with the bias and judgement that went into transcribing a record of an event into a coded element. Some portions of system-operating procedures and user guides (items 13-15 of part II) are also useful for later reference work. These records are essential for determining how the related data files were used for operations and research and must be retained even if the related software is disposed of.

3. Data-Processing Systems Flow Charts and Their Use in File Appraisal

Data records in ADP systems are processed both manually and mechanically before finally residing in a file as a correct record.
This section presents typical systems charts found in the high-level documentation of most such applications. These charts should be used by an appraiser for determining which files among many are most useful for permanent retention.

Data processing systems are categorized by two sets of terms. One breakdown is between continuing and one-time systems; the second is between real-time and batch-processing systems. Real-time systems handle one transaction at a time and complete the function of posting and validation before going on to the next transaction. These operations occur at the time the actual real world event occurs or, at the latest, soon afterward. Batch-processing systems perform one stage of processing for a group (or batch) of transactions. These operations occur after the real world event took place. The delays may range from hours in some cases to months in others.

Continuing systems are those which are run periodically with a repetition rate ranging from a few hours to a year. Most housekeeping systems are of this type. The most familiar applications are for payrolls, inventory control, and financial management. Although the file contents are continually changing, such systems have high continuity from one period to the next and are well documented for auditing and operational purposes. One-time systems are less well documented than continuing systems. There is usually pressure to deliver the results in the form of reports within a time constraint. Many undocumented ad hoc decisions are made during the course of these projects to meet project deadlines, and these result in files containing data errors that may not reflect the contents of published results. Surveys, simulation projects, and censuses fall into this category.
[Chart: Generalized Input Update (I). Elements shown: source document; mechanical conversion; raw data input file (disk); subsidiary master files; code tables; edit phase, sorts and validity checks; job deck; manufacturer software; application; error listing.]

Various combinations of these attributes are found in ADP systems. Real-time systems are used in continuous housekeeping systems, the most notable being military command and control and airline reservation applications. Batch processing is used for both one-time and continuous systems and constitutes the bulk of existing data-processing applications.

The files in all types of systems are increasingly being put on mass storage devices such as disks rather than tape. However, even in real-time systems, backup and recovery procedures dictate that magnetic tape copies of the file be created. These are usually called either "file dumps" or "safe data dumps." They are also created for running off summary reports since total file scanning of disk files is inefficient. These files can be appraised in the same manner as tape resident files.

3.1 Input and Update Subsystem Phases

These two phases are common to all data-processing systems that involve file maintenance. Two typical flow charts are shown and labelled "Generalized Input Update." They show the processing steps taken to record, convert, check, edit, and post a record to a file for later use.

3.1.1 Source Data Conversion Phase

Data can be converted to machine readable form by several methods. Formerly, data were transcribed from source documents onto transcript sheets. They were keypunched, converted to magnetic tape, and then processed.
More recent methods either record machine readable data onto source documents (turnaround documents) or accept input directly into computers through keyboard-driven terminals (source data automation).

The data control function is closely interwoven with the mechanical conversion process. One part of data control consists of keeping count of the documents in each batch to be processed and keeping control totals of one or more quantity fields. Examples would be dollar totals and counts of checks or invoices. The other part of data control is the manual editing of source documents. Such editing consists of checking codes and resolving errors.

The typical sequence is shown on the chart entitled "Generalized Input Update I." A manual-handling phase is followed by a media conversion step. The machine readable transaction is further validated by a series of computer runs. Errors may be introduced and detected during each stage of the process. Correction procedures depend on the stage at which the error is detected, the type of error, and the conversion hardware used. For example, in keypunch-oriented systems, verification is used to minimize conversion errors while the computer passes are used to catch logical and transcription errors. Some systems combine error detection and correction processes by attaching the conversion equipment to a computer.

However many steps and cycles occur in the process, the end product is usually called a raw data tape. Other names used are unsorted transactions, partially edited transactions, sorted preliminary update file, and the like.

Raw data tapes are seldom of any permanent value since they contain erroneous and duplicate records.
A possible exception is the case of real-time systems, where the tape may be named a "logging file" in such applications as message switching or production control systems. Usually such tapes are kept for a short period of time as backup to recreate a real-time file. The sole other usage is for system test data or some simple transaction counting for real-time system work load studies.

3.1.2 Edit Phase--Sorts and Validity Checks

Many tapes with records of temporary value are produced at this stage. Another common designation for these tapes is work tapes. This phase or module processes transaction files against various editing and validation criteria. These criteria may be found in a computer program, such as a table of valid transaction codes, or in a subsidiary master file, such as valid account numbers or a name and address file. Other common checks performed here are for numeric characters in quantity fields, transaction batch totals, transaction counts, and consistency checks. The output of such a phase is a file of partially validated transactions.

Two methods are used in handling errors at this time. In one, invalid transaction items are listed on a printer along with the error indicators for immediate correction on a batch basis. In the other, erroneous records are coded to indicate the presence of certain errors, but are not deleted from the transaction file. Instead, they are kept in the transaction file for still further checks in the update program itself. This gives a consolidated error listing at one time for a given batch of transactions. The most common additional tests performed on the data would be tests against the master file key itself. Examples are transactions that attempt to delete a nonexistent record or insert a duplicate record into a file; others may be quantities that are checked for "reasonableness."

3.1.3 Update Master File or Data Base Phase

The edit phase's output is the updating run's input.
This is shown in "Generalized Input Update II." If the file contains only one application, such as accounts payable or receivable, it is normally called a "master file." If it contains data from a series of applications, or summary data from a variety of sources, then it is a "data base." Under certain circumstances, an individual transaction and status report is the direct concern of an organization's top management. Examples are status reports for an important research and development project, construction job, or loan--all of which might have enduring value.

[Chart: Generalized Input Update (II), Update Master File Phase. Notes: 1. [text lost] or work tape. 2. A series of these files may be merged into a continuous history file. 3. May be cumulative or noncumulative file. 4. Until final version is approved, prior versions are interim master files.]

There are several types of master files. In some systems, they include only currently active records. Purged master file records are periodically transferred to dormant account (history) files. Personnel and payroll files are typical, with the purge generally occurring at year's end. Many other files are cumulative and continue to grow in size depending upon the application. As in the case of paper records, frequency of use is the major criterion when deciding the length of time "detailed transactions" will remain on the master file.

In the case of periodically updated files, where transactions are deleted, purged records are often merged into historical files. These files are valuable. However, they may lack data found on the master record file. Therefore, both the merged periodic transaction file and the master file should be retained.
Items 13-15 of part III denote the types of transaction files created. Items 16-32 of part III furnish disposition criteria for master files.

Master files are seldom updated for a given period in a single update pass. Some errors cannot be detected until the actual posting attempt is made. This creates a series of "interim" master files. The only valid file would be the one from which the periodic output was run. Usually, processing deadlines determine which version is "valid." Interim master files are usually retained for short periods as backup tapes for the final master file. (See items 44 and 45 of part III.) This retention plan is called the "grandfather system." In appraising master files for permanent retention, it is preferable to retain the "as of" date from the official files. While this is usually possible, there are many cycle-billing systems in which the master file is never completely purged of detail transactions--thus never complete. In such cases, it may be preferable to retain extracts of the master file made for reporting purposes and audit trails rather than the master file itself. Items 39-43 of part III describe alternative selections to master record tapes.

3.2 Report Generation Phase

This section describes the files, processing, and software used to produce output from ADP systems. The chart labeled "Report Generation Phase" shows the usual processing sequence in such modules from the machine readable record to the final printed report or listing. Since mass, random-access storage is increasingly used, the chart shows tape and disk files used interchangeably, although in practice one or the other medium will predominate. As indicated previously, unless its usefulness or transferability to future computing systems is assured, it is unnecessary to retain output-oriented software. This evaluation should be ad hoc.
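The "grandfather system" described in section 3.1.3 above can be illustrated with a minimal sketch. It assumes that each version of a master file carries a version number and that the newest few versions are held as backup while older versions become eligible for return to the tape inventory. The function name split_generations and the default of three generations are assumptions made for illustration only; the schedule itself does not fix a number, and the governing retention periods are those of items 44 and 45 of part III.

```python
def split_generations(versions, generations=3):
    """Split master file versions into (retained, releasable) lists.

    versions: version numbers of one master file, in any order.
    generations: how many of the newest versions to hold as backup.
    The default of 3 is an illustrative assumption, not a schedule value.
    """
    if generations < 1:
        raise ValueError("at least one generation must be retained")
    ordered = sorted(versions)
    retained = ordered[-generations:]     # newest versions, kept as backup
    releasable = ordered[:-generations]   # older versions, tapes may be reused
    return retained, releasable


retained, releasable = split_generations([1, 2, 3, 4, 5])
print(retained)    # [3, 4, 5]
print(releasable)  # [1, 2]
```

When fewer versions exist than the number of generations to be held, every version is retained and nothing is released.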
3.2.1 Report Data Extract and Format Phase

If the printed report and the master file are in identical sequence, the data selection, tabulation, and printing phase may occur in one program.

[Chart: Report Generation Phase. Legible labels: stub description file and software; selection criteria and formatting.]

This is characteristic of billing, payroll, and most housekeeping systems. However, often it is necessary to print the report in a sequence different from that of the master file. The use of a sort will resequence the selected records as desired, creating a series of intermediate work files between the master file and the printed output. The flowchart labeled "Report Generation Phase" shows both tapes and disk files in the processing sequence.

Newer computers and operating systems seldom produce work tapes except for the largest multireel files. The intermediate files reside on disk as transient files within the "job stream." (See item 4, part IV.) Only the input file, the job control deck, and the final printed output are visible to the uninformed. Thus, seldom is there need to retain intermediate files because they can always be recreated from the master file.

In one case, extract files are useful and should be retained. Files that contain "statistical samples" of the entire body of data often have long-term value when the methodology is documented. These sample files, along with appropriate weighting factors and stripped of identifying information disclosing individual persons or establishments, are immediately releasable for public research.

The next problem is to determine which of several work files to retain.
In general, this depends upon the degree of decoding stub descriptors required to interpret the file. Heavily encoded files with little or no narrative description are suitable provided that the stub descriptor files and tables required for human reading and interpretation are of reasonable length. When the code is a Federal Information Processing Standard, the length of the code table is unimportant. An example is the table of State and county codes of the United States with more than 3,000 entries. Tables of less than 200 entries developed for individual agencies or one-time studies may be reasonably left encoded as they can be decoded by simple computer programs. For large code tables stub descriptions are preferable for long-term preservation.

3.2.2 One or More Sorting Runs

Extract files are often in the wrong sequence for producing reports or required tabulations. In fact, the same file may be sorted into as many as 10 different sequences for different types of analysis and tabulation. The criteria for retention of sorted work tapes are the same as for extract tapes described above. The output of this phase is a sorted work tape or file ready for tabulation, summarization, and editing. This type of file on tape is often a useful research file, particularly if there has been some editing and if interpreting has been performed. In general, continuing administrative systems have relatively few processing steps between the first extract run and the final output pass. This is different from one-time reports as described in section 4 below.

3.2.3 Tabulation Run

The inputs to the tabulation run are the sorted work files, usually together with a stub descriptor file. The stub descriptor file is invariably used when a very large list of codes must be displayed in plain text.
If it is on magnetic tape, one or more sorting runs are typically required of the extract tape in order to apply the descriptors. If on disk, most of this decoding can be performed during the tabulation run. When files are considered for retention, the information necessary to decode such data elements must be retained. This may either be a hard copy document as described in section 2.3 above or a machine readable file.

The final output of a tabulation run may consist of either summary data files or a print tape (items 39 and 41 of part III). Summary data files may also serve as publication tapes (item 40 of part III) when they are reproduced and disseminated to the public and/or Federal agencies. Summary data files are occasionally used as input for published and widely disseminated printed reports.

Many installations do not use print tapes when they produce a computer listing or report. Instead, the data are temporarily transcribed on disks until the printed report is complete. However, tapes can be created upon specific request if there is a known demand and further use for the same information in machine readable form. This procedure is often followed by producers of general purpose statistics. When a tape file may be classified under more than one of these three categories, disposal is not authorized by this schedule.

4. One-Time Surveys and Report Generation Systems

The sequence of operations in one-time surveys, censuses, and tabulations is shown in the following two charts. When the flow process charts are compared to the typical continuous file maintenance system, the similarities are evident. The basic difference between continuously running systems and one-time jobs is the much higher amount of manual editing and encoding required. Unless the job is a very large effort with many thousands of observations, the forms used allow somewhat more variability in field entries than accounting type documents.
Since line respondents to these surveys rarely have an opportunity to correct the inputs, much more manual editing and encoding is required to correct (clean up) a file prior to its use in tabulations. As the second sheet shows, there is a file buildup process which occurs with no changes occurring to individual records after they have entered the file. Where a multiplicity of systems and sources feed the file, the individual records are usually of variable length to minimize storage requirements. Documentation for such systems contains complex record formats but includes few of the elaborate codes found in administrative systems.

[Chart: Typical One-Time Processing Sequence (I). Legible labels: edit criteria; edit criteria software.]

[Chart: Typical One-Time Processing Sequence (II). Legible labels: sort; system program; report generation; adjustment process; printed report; master file.]

When these files are retained, it is important to identify the original sources of information, the instructions to respondents for filling out the forms, together with sample forms, and the directions given to the response form editors for proper interpretation and secondary usage of these files. Most such files are described in part II of this schedule.

5. Adjustments in One-Time Jobs

A variety of additions, changes, and deletions can occur in either individual or groups of entries within a file at any stage of processing. They can occur for a variety of reasons and lead to magnetic tape files of different accessibility and validity.
If the files are to be retained permanently, it is important to document the corrections and adjustments. This record of changes constitutes the equivalent of the accountant's audit trail for evaluating the accuracy of a financial file.

If a payroll record contains an error, the originating office usually hears about it in short order, particularly when an employee is short changed. In sample surveys, respondents seldom correct reporting and transcription errors unless elaborate procedures have been established for a review of the machine-prepared record. This correction and review process almost always occurs in accounting systems, while cost usually precludes this process in most one-time jobs. Therefore, such files of recorded observations contain a variety of errors which in summary tabulations are nonsense entries. For example, male widows may appear in a tabulation. Such errors arise from a number of causes. Correction of the tabulations can be made at any step in the process between the final survey file and the printed report. The accompanying flow chart shows the points in the procedure where this is usually made.

If the error is thought to be a random event, the illogical counts are generally distributed to all other possible categories and deleted from the tabulation array. This would lead to a discrepancy between the published table and the final master file. The illogical records would remain in the file uncorrected.

Systematic errors also occur frequently in encoding and processing. In these cases, the summary file may be corrected by moving the entire nonsense count to the correct location in the table. These errors can also be corrected in the final master file using the computer.

Another common adjustment operation occurs when a tabulation discloses individual confidential information.
Confidentiality is protected in one of three ways: 1) by deleting the entry on the summary file and combining with enough other tabular entries to eliminate individual disclosure, 2) by correcting the print tape, and 3) by correcting only the printed report. In the first method the summary file is releasable to the public. In the second, the summary file is not releasable to the public, but the print file is.

[Chart: Where Observations and Summaries Get Adjusted in One-Time Jobs. Stages shown: first extract; extract file (note 1); sorting and tabulation programs; summary file at low level (notes 2, 3); final tabulation and edit (4); correct low level sums. Notes: 1. Original file still contains errors if only this file is corrected. 2. Extract file still contains errors. 3. Disclosure of individuals can be deleted in this file.]
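The random-error adjustment described in section 5, in which an illogical count is distributed to the other possible categories and deleted from the tabulation array, can be illustrated with a minimal sketch. The schedule does not say how the count is to be apportioned; the sketch assumes it is spread in proportion to the existing counts. The function name redistribute_illogical and the sample figures are hypothetical.

```python
def redistribute_illogical(table, illogical_key):
    """Spread the count in an illogical cell over the remaining cells.

    table: mapping of category name to count for one tabulation row.
    illogical_key: the nonsense category to be removed.
    Apportioning in proportion to existing counts is an assumption made
    for illustration; the schedule does not prescribe a method.
    """
    stray = table[illogical_key]
    valid = {k: v for k, v in table.items() if k != illogical_key}
    total_valid = sum(valid.values())
    if total_valid == 0:
        raise ValueError("no valid cells to receive the illogical count")
    # Each valid cell receives a share of the stray count equal to its
    # share of the valid total; the illogical cell is dropped.
    return {k: v + stray * v / total_valid for k, v in valid.items()}


# Hypothetical tabulation of males by marital status with 10 "male widows".
males = {"single": 400, "married": 500, "divorced": 100, "widow": 10}
adjusted = redistribute_illogical(males, "widow")
print(adjusted)  # {'single': 404.0, 'married': 505.0, 'divorced': 101.0}
```

The adjusted table keeps the original total of 1,010, but the 10 illogical records remain in the master file, which is the discrepancy between published table and file that section 5 warns must be documented.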