In data processing it is necessary to store information on particular subjects, for example customers, suppliers, or personnel, and such information needs to be structured so that it is readily controllable and accessible to the user. In traditional data processing systems, each of these ‘topics’ of information is allocated a file. A file is a collection of related data stored within a computer system. A file name is used to identify this collection of data, and it can be handled as a single unit, e.g. it can be opened, closed, copied or deleted.

Master File

This is the principal source of data for an application. It is used for the storage of permanent or semi-permanent data which is used in applications such as stock control, sales order processing or payroll. Some of the fields tend to contain data which is fairly static, for example, customer name and address, whilst data in some fields is continually changing, for example, customer balance as transactions are applied to the file. Such updating is carried out either through the direct entry (on-line) of individual transactions, or from an accumulated set of entries stored on a transaction file (batch processing).

Transaction File

This is a temporary file which only exists to allow the updating of master files. It contains data which is to be added to the master file, or used to change details on the master file. Each transaction record contains the key field value of the master record it is to update (to allow correct matching with its relevant master record), together with data gathered from source documents, e.g. invoice amounts which update the balance field in customer accounts.

Files, Records and Fields

A file is a collection of related records stored within a computer system, e.g. a customer file, which has one record per customer.

A record is a collection of related data items, and is treated as a single unit for processing, e.g. a customer details record, which has fields such as customer name, customer number and customer address.

A field is part of a record which holds a single data item of a specified type, e.g. customer name. Fields can be of differing data types – some fields will hold characters, and some fields will be numeric and used in calculations. Each field is referred to by a field name.
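The relationship between file, record and field can be sketched in Python. This is only an illustration: the class name `CustomerRecord`, the field names and the sample values are assumptions made for the example, and the "file" is held as a list in memory rather than on backing storage.

```python
from dataclasses import dataclass


@dataclass
class CustomerRecord:
    """One record: a group of related fields treated as a single unit."""
    customer_number: int    # numeric field
    customer_name: str      # character field
    customer_address: str   # character field
    balance: float          # numeric field used in calculations


# The file: a collection of related records, one per customer.
customer_file = [
    CustomerRecord(1001, "A. Jones", "1 High Street", 250.00),
    CustomerRecord(1002, "B. Patel", "7 Mill Lane", 0.00),
]

# Each field is referred to by its field name.
total_owed = sum(record.balance for record in customer_file)
```

Each field has a name and a data type, and a numeric field such as `balance` can be used directly in a calculation.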

Record identification - primary keys

In most systems it will be necessary to identify each record uniquely. In a Personnel File, for example, it might be thought that it is possible to identify each individual record simply by the employee’s Surname and this would be satisfactory as long as no two employees had the same surname. In reality, many organisations will have several employees with the same surnames, so to ensure uniqueness, each employee is assigned a unique Works Number. The works number field is then used as the primary key in the filing system, each individual having his or her own unique Works Number and so a unique primary key. A primary key is used to uniquely identify a record in a file.
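The surname problem can be checked with a short Python sketch. The records, the field names `works_number` and `surname`, and the helper `is_unique_key` are all hypothetical and exist only to illustrate the idea.

```python
# Illustrative personnel records; two employees share a surname.
personnel_file = [
    {"works_number": 101, "surname": "Evans"},
    {"works_number": 102, "surname": "Evans"},
    {"works_number": 103, "surname": "Khan"},
]


def is_unique_key(records, field_name):
    """Return True if no two records share a value in the given field."""
    values = [record[field_name] for record in records]
    return len(values) == len(set(values))


surname_is_unique = is_unique_key(personnel_file, "surname")
works_number_is_unique = is_unique_key(personnel_file, "works_number")

# Because the works number is unique, it can identify exactly one record.
by_works_number = {record["works_number"]: record for record in personnel_file}
employee_102 = by_works_number[102]
```

Here the surname field fails the uniqueness test while the works number passes it, which is why the works number is the one suitable as a primary key.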

Other fields in a record may be defined as secondary keys. These fields are not unique to each record, but may be used to quickly locate a group of records. For example, in an Employee file, Department could be defined as a secondary key, so that all the employees in a department could be viewed as a group.

Examples of Application Systems

An electronics factory uses a computerised payroll system that produces payslips each month. There is an employee master file and a monthly transaction file. Each record on the employee master file has the fields:

• Employee id – primary key
• Employee name and address
• Department
• Job title
• Basic salary
• Total pay to date
• Total tax to date

Each record on the transaction file has fields such as:

• Employee id – primary key
• Employee’s hours worked that month

During the update process, the “total pay to date” and “total tax to date” fields on the master file will be updated.

A gas company uses a computerised billing system that includes a master file and a transaction file. The records in the master file have the following fields:

• Customer id – primary key
• Customer name and address
• Customer payment details
• Date of last bill
• Current outstanding balance
• Previous meter reading
• Method of payment

The records in the transaction file have the following fields:

• Customer id – primary key
• Date of meter reading
• Current meter reading

During the update process, the “date of last bill” on the master file will be updated, and the “previous meter reading” will be replaced by the “current meter reading” from the transaction file.
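The gas billing update for a single customer might look like the following Python sketch. It makes several assumptions: the function name `apply_meter_reading` and the dictionary keys are hypothetical, the dates and readings are invented sample values, only the fields involved in the update are shown, and the date of the meter reading is used as the new date of last bill.

```python
import datetime


def apply_meter_reading(master_record, transaction):
    """Apply one transaction record to its matching master record.

    Returns the units used since the previous reading.
    """
    # The key field is what matches a transaction to its master record.
    if master_record["customer_id"] != transaction["customer_id"]:
        raise ValueError("transaction does not match this master record")

    units_used = transaction["current_reading"] - master_record["previous_reading"]

    # Assumption: the bill is dated with the date of the meter reading.
    master_record["date_of_last_bill"] = transaction["date_of_reading"]
    # The current reading becomes the previous reading for the next bill.
    master_record["previous_reading"] = transaction["current_reading"]
    return units_used


master = {
    "customer_id": 2001,
    "date_of_last_bill": datetime.date(2024, 1, 15),
    "previous_reading": 4520,
}
transaction = {
    "customer_id": 2001,
    "date_of_reading": datetime.date(2024, 4, 15),
    "current_reading": 4795,
}

units = apply_meter_reading(master, transaction)
```

The units used must be worked out before the previous reading is overwritten, otherwise the information needed for the bill is lost.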

Fixed and variable length records

The length of the record is the number of bytes allocated to it within the file.

Fixed length records have the same number of fields, and each record has the same number of bytes.

Variable length records have different numbers of fields in each record, and each record can have a different number of bytes.

Advantages of fixed length records

• Fixed length records are quicker to process (they load and save faster), because the start and end points of each record can be readily identified by the number of character positions. If a record has a fixed length of 80 character positions, then the second record starts at the 81st character position, the third at the 161st character position and so on.

• They are easier to program, because the amount of storage required is known beforehand.

• Fixed length records allow an accurate estimation of file storage requirements. For example, a file containing 1000 records, each of fixed 80 characters length, will take approximately 80000 characters of storage.

• Where direct access files are being used, fixed length records can be readily updated 'in situ' (in other words the updated record overwrites the old version in the same position on the storage medium). As the new version will have the same number of characters as the old, any changes to a record will not change its physical length. On the other hand, a variable length record may increase in length after updating, preventing its return to its home location.
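The position arithmetic and the in situ update can be tried out in Python. This is a sketch under stated assumptions: the 80-byte layout (a 4-byte number, a 30-byte name and a 46-byte address) and the helper names are invented for the example, and an in-memory `io.BytesIO` object stands in for a direct access file on disk.

```python
import io
import struct

# Assumed layout: 4-byte number + 30-byte name + 46-byte address = 80 bytes.
RECORD = struct.Struct("<I30s46s")


def pack_record(number, name, address):
    # struct pads short strings with null bytes and truncates long ones.
    return RECORD.pack(number, name.encode("ascii"), address.encode("ascii"))


def unpack_record(raw):
    number, name, address = RECORD.unpack(raw)
    return (number,
            name.rstrip(b"\0").decode("ascii"),
            address.rstrip(b"\0").decode("ascii"))


def offset_of(n):
    """Byte offset of the n-th record, counting records from 1."""
    return (n - 1) * RECORD.size


data_file = io.BytesIO()  # stands in for a direct access file
for sample in [(1, "Jones", "1 High Street"),
               (2, "Patel", "7 Mill Lane"),
               (3, "Smith", "3 Park Road")]:
    data_file.write(pack_record(*sample))

# Jump straight to the third record without reading the first two.
data_file.seek(offset_of(3))
third = unpack_record(data_file.read(RECORD.size))

# Update the second record in situ: same length, same position.
data_file.seek(offset_of(2))
data_file.write(pack_record(2, "Patel", "12 New Street"))

file_size = len(data_file.getvalue())  # records x record length
```

Byte offsets count from 0, so the record described above as starting at the 81st character position is at offset 80. The cost of the fixed layout is visible too: every name occupies 30 bytes however short it is, and a longer one would be truncated.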

Advantages of variable length records

Storage space is saved as no blank spaces are stored when the data is short.

Truncation is avoided, as each field can extend to store any number of characters.

Variable length records are worth using when the saving in storage space justifies the more complex file handling techniques they require.

Examples of files which would have variable length records

Variable length records may be used in files which have storage requirements that vary greatly, e.g. in a personnel file, each record may contain details of previous jobs held and as the number of previous jobs may vary considerably from one employee to another, so the number of fields would be similarly varied. Variable length records could save storage space for employees with few previous jobs.
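One common way to store variable length records is to write each record's length in front of it, so the program knows where the record ends. The Python sketch below uses that technique as an illustration; it is not the only possible scheme. The 2-byte length prefix, the "|" field separator (assumed never to appear inside a field) and the sample personnel data are all assumptions made for the example.

```python
import io
import struct

LENGTH_PREFIX = struct.Struct("<H")  # assumed 2-byte length before each record


def write_record(f, fields):
    """Write one record with as many fields as it needs."""
    body = "|".join(fields).encode("utf-8")
    f.write(LENGTH_PREFIX.pack(len(body)))
    f.write(body)


def read_records(f):
    """Read every record from the current position to the end of the file."""
    records = []
    while True:
        prefix = f.read(LENGTH_PREFIX.size)
        if not prefix:  # end of file
            break
        (length,) = LENGTH_PREFIX.unpack(prefix)
        records.append(f.read(length).decode("utf-8").split("|"))
    return records


personnel_file = io.BytesIO()  # stands in for a file on backing storage
# Works number, surname, then any number of previous jobs.
write_record(personnel_file, ["101", "Evans"])
write_record(personnel_file, ["102", "Khan", "Clerk", "Supervisor", "Manager"])

personnel_file.seek(0)
records = read_records(personnel_file)
field_counts = [len(record) for record in records]
```

No space is wasted on the employee with no previous jobs, but the start of the second record can only be found by reading the length of the first, which is the extra file handling complexity mentioned above.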

File Operations

Data is usually held as one or more files on backing storage. If the file is a new one, then it must first be created, and this might include specifying its maximum size and how it will be organised. Before it can be used, the file must be opened, and when access to the file is no longer required it must be closed. A file may be deleted from backing storage when it is no longer required. Reading is the operation of taking a copy of a data item from a file, into main memory for use. Writing is the operation of saving any changes to a file. Updating is altering an existing data item already written to the file. Inserting is adding a new data item to an existing file. Appending is adding a new data item at the end of an existing file.
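These operations map closely onto Python's built-in file handling, as the sketch below shows. The file name `readings.txt` and its contents are invented for the example, and the file is created in a temporary folder so that nothing is left behind.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "readings.txt")

    # Create the file, open it for writing, write two records, close it.
    with open(path, "w") as f:
        f.write("1001,4520\n")
        f.write("1002,3310\n")

    # Append: add a new data item at the end of the existing file.
    with open(path, "a") as f:
        f.write("1003,2875\n")

    # Read: copy the data from the file into main memory.
    with open(path) as f:
        lines = f.read().splitlines()

    # Update: alter an existing item in memory, then write the file back.
    lines[1] = "1002,3350"
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")

    with open(path) as f:
        updated = f.read().splitlines()

    # Delete the file from backing storage when it is no longer required.
    os.remove(path)
    deleted = not os.path.exists(path)
```

Each `with` statement opens the file and closes it automatically at the end of the block, so every open is matched by a close.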

File Access

The way in which a file is to be accessed by users influences the way in which it is stored.

Serial files store the records in the order in which they were created, i.e. in no particular key order. The file is always processed as a complete file. This type of file can be stored on tape or disk. An example of a serial file is an unsorted transaction file, e.g. a meter readings file for a gas company, as the readings are not taken in any particular order.

New records are added to a serial file by appending them to the end of the file, as with electricity meter readings.
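A serial file can be modelled in Python as a list that only ever grows at the end. In this sketch the names `serial_file`, `append_reading` and `find`, and the sample readings, are assumptions made for illustration.

```python
# Serial file: records are kept in the order in which they arrive.
serial_file = []


def append_reading(customer_id, reading):
    """Add a new record at the end of the serial file."""
    serial_file.append({"customer_id": customer_id, "reading": reading})


append_reading(1042, 5310)
append_reading(1007, 2288)
append_reading(1025, 9140)

arrival_order = [record["customer_id"] for record in serial_file]


def find(customer_id):
    """Search from the start, one record at a time, until a match is found."""
    for position, record in enumerate(serial_file, start=1):
        if record["customer_id"] == customer_id:
            return position, record
    return None


# Sorting on the key field turns the serial file into a sequential one.
sequential_file = sorted(serial_file, key=lambda record: record["customer_id"])
key_order = [record["customer_id"] for record in sequential_file]
```

Because the records are in no key order, the only way to find one is to read from the beginning, which is why serial files are normally processed from start to finish.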

Sequential files store the records in order, usually in key value order. It is not possible to jump straight to a particular record. This type of file can be stored on tape or disk. An example of a sequential file is a sorted transaction file, or a master file, such as a customer file. Sequential file access is used for master files as the records need to be in order of the key field for efficient processing.

A new record is added to a sequential file by:

• Copying the records from the old file to a new file until the point of insertion is reached.
• Writing the new record to the new file.
• Copying the remaining records from the old file to the new file.
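The insertion procedure can be sketched in Python, with lists standing in for the old and new files. The function name `insert_record`, the key name `customer_id` and the sample records are hypothetical.

```python
def insert_record(old_file, new_record, key="customer_id"):
    """Return a new sequential file with new_record in key order.

    old_file must already be in key order; it is not changed.
    """
    new_file = []
    inserted = False
    for record in old_file:
        # The point of insertion is just before the first larger key.
        if not inserted and new_record[key] < record[key]:
            new_file.append(new_record)
            inserted = True
        new_file.append(record)  # copy the old record across
    if not inserted:
        # The new key is larger than every existing key: add it at the end.
        new_file.append(new_record)
    return new_file


old_file = [
    {"customer_id": 1001, "name": "Jones"},
    {"customer_id": 1003, "name": "Smith"},
    {"customer_id": 1004, "name": "Davies"},
]
new_file = insert_record(old_file, {"customer_id": 1002, "name": "Patel"})
new_keys = [record["customer_id"] for record in new_file]
```

Every record in the old file has to be copied even though only one record is new, which is the main cost of inserting into a sequential file.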

Updating Sequential Files

• The master file is stored as a sequential file.
• The transactions are placed on a serial file and then sorted on to a sequential file in the same order as the master file, just before updating starts.
• The system reads the old master file and writes out a completely new master file.
• This method is preferred when the hit rate (the proportion of master file records to be updated) is high.
• The process leaves us with an old master file and an updated master file.

Examples of a sequential file update are:

• Producing electricity/gas bills.
• Payroll production.
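A sequential master file update can be sketched in Python as a single pass through both files. The sketch rests on several assumptions: the function name `update_master`, the field names `customer_id`, `balance` and `amount`, and the sample data are all invented, and transactions whose key matches no master record are simply collected for reporting.

```python
def update_master(old_master, serial_transactions, key="customer_id"):
    """Read the old master file and write out a completely new one."""
    # Sort the serial transaction file into the same order as the master.
    transactions = sorted(serial_transactions, key=lambda t: t[key])
    new_master, unmatched = [], []
    hits = 0
    t = 0
    for record in old_master:
        # Transactions with a smaller key have no master record to update.
        while t < len(transactions) and transactions[t][key] < record[key]:
            unmatched.append(transactions[t])
            t += 1
        updated = dict(record)  # copy, so the old master is left untouched
        was_hit = False
        while t < len(transactions) and transactions[t][key] == record[key]:
            updated["balance"] += transactions[t]["amount"]
            was_hit = True
            t += 1
        hits += was_hit
        new_master.append(updated)
    unmatched.extend(transactions[t:])  # keys beyond the last master record
    hit_rate = hits / len(old_master) if old_master else 0.0
    return new_master, unmatched, hit_rate


old_master = [
    {"customer_id": 1, "balance": 100},
    {"customer_id": 2, "balance": 0},
    {"customer_id": 3, "balance": 50},
    {"customer_id": 4, "balance": 75},
]
serial_transactions = [
    {"customer_id": 3, "amount": 20},
    {"customer_id": 1, "amount": 10},
    {"customer_id": 3, "amount": 5},
    {"customer_id": 9, "amount": 1},
]

new_master, unmatched, hit_rate = update_master(old_master, serial_transactions)
new_balances = [record["balance"] for record in new_master]
```

Because both files are in the same key order, neither has to be read more than once. The old master file is left unchanged alongside the new one, and the hit rate is the fraction of master records that received at least one transaction.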
