In computing, a file system (often also written as filesystem) is a method for storing and organizing computer files and the data they contain to make it easy to find and access them. File systems may use a data storage device such as a hard disk or CD-ROM and involve maintaining the physical location of the files; they might provide access to data on a file server by acting as clients for a network protocol (e.g., SMB or 9P clients); or they may be virtual and exist only as an access method for virtual data (e.g., procfs). More formally, a file system is a set of abstract data types that are implemented for the storage, hierarchical organization, manipulation, navigation, access, and retrieval of data. File systems share much in common with database technology, but it is debatable whether a file system can be classified as a special-purpose database (DBMS).
Needs of file systems
The most familiar file systems make use of an underlying data storage device that offers access to an array of fixed-size blocks, sometimes called sectors, generally 512 bytes each. The file system software is responsible for organizing these sectors into files and directories, and for keeping track of which sectors belong to which file and which are not being used. However, file systems need not make use of a storage device at all. A file system can be used to organize and represent access to any data, whether it is stored or dynamically generated (e.g. from a network connection). Whether or not the file system has an underlying storage device, file systems typically have directories which associate file names with files, usually by connecting the file name to an index into a file allocation table of some sort, such as the FAT in an MS-DOS file system, or an inode in a Unix-like file system. Directory structures may be flat, or allow hierarchies where directories may contain subdirectories. In some file systems, file names are structured, with special syntax for filename extensions and version numbers. In others, file names are simple strings, and per-file metadata is stored elsewhere.
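The bookkeeping described above can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: the class `ToyFileSystem`, its method names, and the inode layout are hypothetical and do not correspond to FAT, a Unix file system, or any real implementation.

```python
BLOCK_SIZE = 512  # bytes per block, the typical sector size mentioned above


class ToyFileSystem:
    """Hypothetical model: directory -> inode index -> list of block numbers."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]
        self.free = set(range(num_blocks))  # blocks not used by any file
        self.inodes = []                    # each inode: {"size": int, "blocks": [int]}
        self.directory = {}                 # file name -> index into self.inodes

    def write_file(self, name, data):
        if name in self.directory:
            raise FileExistsError(name)
        needed = (len(data) + BLOCK_SIZE - 1) // BLOCK_SIZE  # round up
        if needed > len(self.free):
            raise OSError("no space left on device")
        block_numbers = []
        for i in range(needed):
            block_no = min(self.free)  # assumption: take the lowest free block
            self.free.remove(block_no)
            chunk = data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
            self.blocks[block_no] = chunk.ljust(BLOCK_SIZE, b"\0")
            block_numbers.append(block_no)
        self.inodes.append({"size": len(data), "blocks": block_numbers})
        self.directory[name] = len(self.inodes) - 1

    def read_file(self, name):
        inode = self.inodes[self.directory[name]]
        raw = b"".join(self.blocks[b] for b in inode["blocks"])
        return raw[:inode["size"]]  # drop the padding in the last block


fs = ToyFileSystem(num_blocks=8)
fs.write_file("notes.txt", b"x" * 700)  # 700 bytes occupy two 512-byte blocks
```

The directory stores only a name and an index; the size and block list live in the inode, which is one way of keeping per-file metadata separate from the file name.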
Types of file organization
There are various types of files in which the records are collected and maintained. They are categorized as:
- Master file
- Transaction file
- Table file
- Report file
- Back-up file
- Archival file
- Dump file
- Library file
Master File: Master files are the most important type of file, and most design activity concentrates on them. In a business application they are considered very significant because they contain the essential records for maintaining the organization's business. A master file can be further categorized. It may be a reference master file, in which the records are static or unlikely to change frequently. For example, a product file containing descriptions and codes, or a customer file containing names, addresses and account numbers, is a reference file. Alternatively, it may be a dynamic master file, which holds records that are frequently changed (updated) as a result of transactions or other events. These two types of master file may be kept as separate files or may be combined, for example in a sales ledger file containing reference data, such as name, address and account number, together with the current transactions and outstanding balance for each customer.
Transaction File: A transaction file is a temporary file used for two purposes. First, it is used to accumulate data about events as they occur. Second, it helps in updating master files to reflect the results of current transactions. The term transaction refers to any business event that affects the organization and about which data is captured. Examples of common transactions in an organization are making purchases, hiring workers and recording sales.
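The relationship between a transaction file and a dynamic master file can be sketched as a simple batch update. The function name `apply_transactions`, the account numbers, and the record layout (account number mapped to a balance) are assumptions made for illustration only.

```python
def apply_transactions(master, transactions):
    """Apply accumulated transactions to a copy of the master records.

    master:       dict mapping account number -> current balance
    transactions: list of (account number, amount) pairs, in arrival order
    Returns the updated master and the transactions that matched no account.
    """
    updated = dict(master)  # leave the original master untouched
    rejected = []
    for account, amount in transactions:
        if account in updated:
            updated[account] += amount
        else:
            rejected.append((account, amount))  # no matching master record
    return updated, rejected


master = {"A100": 50, "B200": 0}
transactions = [("A100", 25), ("B200", 10), ("Z999", 5), ("A100", -5)]
updated, rejected = apply_transactions(master, transactions)
```

Once the master file has been updated, the transaction file has served its purpose, which is why it is described as temporary.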
Table File: A table file is a special type of master file included in many systems to meet specific requirements where data must be referenced repeatedly. Table files are permanent files containing reference data used in processing transactions, updating master files or producing output. As the name implies, these files store reference data in tabular form. Table files conserve memory space and make program maintenance easier by storing in a file data that would otherwise be included in programs or master file records.
A sequential file contains records organized in the order they were entered. The order of the records is fixed. The records are stored and sorted in physical, contiguous blocks, and within each block the records are in sequence. Records in these files can only be read or written sequentially. Once stored in the file, a record cannot be made shorter or longer, or deleted. However, the record can be updated if the length does not change. (This is done by replacing the records by creating a new file.) New records always appear at the end of the file. If the order of the records in a file is not important, sequential organization will suffice, no matter how many records you have. Sequential output is also useful for report printing or for the sequential reads that some programs prefer.
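The rules above (append at the end, read in order, update only when the length is unchanged) can be sketched with fixed-length records. The 16-byte record layout and the function names below are hypothetical, and an in-memory `io.BytesIO` buffer stands in for a real file.

```python
import io
import struct

# Hypothetical layout: 4-byte unsigned id + 12-byte name, little-endian, no padding.
RECORD = struct.Struct("<I12s")


def append_record(f, rec_id, name):
    """New records always go at the end of the file."""
    f.seek(0, io.SEEK_END)
    f.write(RECORD.pack(rec_id, name.encode("ascii")))


def read_all(f):
    """Read every record in the order it was written."""
    f.seek(0)
    records = []
    while True:
        chunk = f.read(RECORD.size)
        if len(chunk) < RECORD.size:
            break
        rec_id, raw = RECORD.unpack(chunk)
        records.append((rec_id, raw.rstrip(b"\0").decode("ascii")))
    return records


def update_record(f, position, rec_id, name):
    """Overwrite a record in place; possible only because the length is fixed."""
    f.seek(position * RECORD.size)
    f.write(RECORD.pack(rec_id, name.encode("ascii")))


f = io.BytesIO()
append_record(f, 1, "pen")
append_record(f, 2, "ink")
update_record(f, 0, 1, "pencil")  # same record length, so in-place update works
```

Deleting a record or changing its length is not offered here, in keeping with the restriction described above.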
Line-sequential files are like sequential files, except that the records can contain only characters as data. Line-sequential files are maintained by the native byte stream files of the operating system. In the COBOL environment, line-sequential files that are created with WRITE statements with the ADVANCING phrase can be directed to a printer as well as to a disk.
Key searches are improved by indexing too. The single-level indexing structure is the simplest one: the index is a file whose records are (key, pointer) pairs, where the pointer is the position in the data file of the record with the given key. Only a subset of the records, evenly spaced along the data file, is indexed, in order to mark intervals of data records. A key search is performed as follows: the search key is compared with the index keys to find the highest index key that does not come after the search key. A linear search is then performed from the record that this index key points to, until the search key is matched or until the record pointed to by the next index entry is reached. Although this sort of search requires double file access (index plus data), the reduction in access time is significant compared with sequential file searches. The scheme can be extended hierarchically, since an index is itself a sequential file and can in turn be indexed by a second-level index, and so on. Exploiting this hierarchical decomposition of searches further decreases the access time and pays increasing dividends in reduced processing time. There is, however, a point at which this advantage starts to be eroded by the increased cost of storage, which in turn increases the index access time.
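The single-level search just described can be sketched as follows. This is a minimal illustration, assuming the data file is a list of (key, value) records already sorted by key; the function names and the spacing parameter `step` are hypothetical.

```python
import bisect


def build_sparse_index(data, step):
    """Index every `step`-th record: a list of (key, position) pairs."""
    return [(data[pos][0], pos) for pos in range(0, len(data), step)]


def indexed_search(data, index, search_key):
    """Find the interval via the index, then scan linearly inside it."""
    index_keys = [key for key, _ in index]
    # Highest index entry whose key is not greater than the search key.
    slot = bisect.bisect_right(index_keys, search_key) - 1
    if slot < 0:
        return None  # search key is smaller than every key in the file
    start = index[slot][1]
    # Stop at the record the next index entry points to, or at end of file.
    end = index[slot + 1][1] if slot + 1 < len(index) else len(data)
    for pos in range(start, end):
        if data[pos][0] == search_key:
            return data[pos][1]
    return None


data = [(k, f"rec{k}") for k in range(0, 100, 5)]  # keys 0, 5, ..., 95
index = build_sparse_index(data, step=4)           # indexes keys 0, 20, 40, 60, 80
```

With this spacing, a lookup inspects at most four data records after consulting the index, instead of scanning up to twenty. A second-level index could be built by calling `build_sparse_index` on the index itself.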
In file organization, an inverted list file is a file that is indexed on many of the attributes of the data itself. The inverted list method has a single index for each key type. The records are not necessarily stored in sequence. They are placed in the data storage area, and the indexes are updated with the record keys and locations.
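A minimal sketch of the inverted list idea, with one index per key type, is shown below. The record fields (`name`, `city`) and the function names are assumptions chosen for illustration.

```python
from collections import defaultdict


def build_inverted_indexes(records, attributes):
    """Build one index per attribute: attribute value -> record positions."""
    indexes = {attr: defaultdict(list) for attr in attributes}
    for position, record in enumerate(records):
        for attr in attributes:
            indexes[attr][record[attr]].append(position)
    return indexes


def lookup(records, indexes, attr, value):
    """Fetch all records whose attribute `attr` equals `value`."""
    return [records[pos] for pos in indexes[attr].get(value, [])]


# Records are stored in arrival order, not sorted on any key.
records = [
    {"name": "Ann", "city": "Oslo"},
    {"name": "Bob", "city": "Rome"},
    {"name": "Cy", "city": "Oslo"},
]
indexes = build_inverted_indexes(records, ["name", "city"])
oslo_records = lookup(records, indexes, "city", "Oslo")
```

Every insertion has to update each index, which is the price paid for being able to search on any indexed attribute.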
Direct or hashed access
With direct or hashed access, a portion of disk space is reserved and a “hashing” algorithm computes the record address. This kind of file therefore requires additional space in the store. Records are placed randomly throughout the file and are accessed by addresses that specify their disk location. This type of file organization also requires disk storage rather than tape. It has excellent search and retrieval performance, but care must be taken to maintain the indexes. If the indexes become corrupt, what is left may as well go to the bit bucket, so it is wise to keep regular backups of this kind of file, just as for all valuable stored data.
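The following sketch shows how a hashing algorithm can turn a key into a record address within a reserved area. It is an illustration only: the class `HashedFile` is hypothetical, a Python list stands in for the reserved disk space, CRC-32 is an arbitrary choice of hash, and linear probing is an assumed way of handling collisions that the text above does not specify.

```python
import zlib


class HashedFile:
    """Hypothetical hashed file: a fixed number of slots reserved up front."""

    def __init__(self, num_slots):
        self.slots = [None] * num_slots  # the reserved space

    def _home_slot(self, key):
        # The hashing algorithm: key -> slot number (the record "address").
        return zlib.crc32(key.encode("utf-8")) % len(self.slots)

    def put(self, key, record):
        start = self._home_slot(key)
        for offset in range(len(self.slots)):
            slot = (start + offset) % len(self.slots)  # assumed linear probing
            if self.slots[slot] is None or self.slots[slot][0] == key:
                self.slots[slot] = (key, record)
                return slot
        raise OSError("hashed file is full")

    def get(self, key):
        start = self._home_slot(key)
        for offset in range(len(self.slots)):
            slot = (start + offset) % len(self.slots)
            entry = self.slots[slot]
            if entry is None:
                return None  # an empty slot ends the probe: key is absent
            if entry[0] == key:
                return entry[1]
        return None


hashed = HashedFile(num_slots=8)
hashed.put("A100", {"name": "Ann"})
hashed.put("B200", {"name": "Bob"})
```

Because the slot is computed directly from the key, a lookup usually touches one or two slots regardless of file size, but the reserved area must be larger than the data it holds, which is the additional space mentioned above.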