OS file structure. File Operations


Laboratory work №10

OUTPUT TO DISK AND PRINTER

Displaying information on the screen is used in almost every program, but its capabilities are limited. Even pausing to let the user read all the messages does not completely solve the problem: once a message scrolls off the screen, it cannot be read again without restarting the program.

Moreover, the values assigned to variables are retained only while the program is running. Once the program ends, all entered information is lost. If, for example, you entered information about your CD collection into an array of structure variables, it is lost when the program exits, and the next time you run the program you will have to enter all the data again.

To keep information for yourself or to show other people the results of your program, you need to print the results on paper. And to be able to access data at any time once it has been entered, you need to save it in a file on disk.

What is a file structure

Output data is not sent to the disk or printer immediately when the corresponding output instruction is executed. Instead, it first goes to an area of memory, called a buffer, that is designed for temporary storage of information. Only when the buffer is full is the data sent to the disk or printer (Fig. 1). Data read from disk also goes into a buffer first, from where it can be displayed on the screen or assigned to a variable.

In order to send data to or receive data from a buffer, you need some kind of link between your program and the computer's operating system. This link is the file structure.

When a program opens a file, it creates a special structure in memory. This structure contains the information that your program and the computer need to output data to a file, to input data from a file, and to print information on a printer.

Fig. 1. Data is stored in a buffer for some time

For example, the structure contains the address of the file buffer, so that the computer knows where to find the information you want to output to disk or where to put the data you want to read from disk. In addition, this structure stores the number of characters remaining in the buffer, as well as the position of the next character to be read from or written to the buffer (Fig. 2).

Fig. 2. The file structure stores information necessary for the normal execution of file operations

Almost all C and C++ compilers keep the declarations needed to work with files in the header file stdio.h (STDIO.H under MS-DOS). This file defines the constants needed for file operations and also describes the file structure. To use the file functions, a program should start with the instruction

#include <stdio.h>

which makes the file constants and the file structure description available when the program is compiled.

When you input data from a disk file, it is copied into the computer's memory; the information on the disk does not change. For this reason, programmers call this kind of input reading data from the file. When data is output to disk, a copy of the data stored in memory is placed in a file. This procedure is called writing to disk.

Pointer to file

Input from and output to files is performed through a so-called file pointer, which is a pointer to a file structure in memory. When writing information to a file or reading from a file, the program obtains the necessary information from this structure. The file pointer is defined as follows:

FILE *file_pointer;

The structure name FILE tells the compiler that the variable being defined is a pointer to a file structure. The asterisk makes the variable with the given name a pointer.

If you are going to use multiple files at the same time, you need pointers for each of them. For example, if you are writing a program that copies the contents of one file to another, you need two pointers to the files. Two pointers are also required if you want to read information from the disk and print it on a printer:

FILE *infile, *outfile;

How to open a file

The connection between the program and the file is established using the fopen() function, the syntax of which is shown in Fig. 3.

This function assigns the address of a file structure to a pointer. The first parameter of this function is the file name, which must be specified according to certain rules. For example, in the MS-DOS operating system a file name can be at most eight characters long, plus an optional extension of up to three characters. If you want to output information to a printer rather than to a disk file, specify "PRN" as the file name, in quotes. In this case, the data is sent directly to the printer.

Fig. 3. Syntax of the fopen() function

The second parameter to the function is the file access mode, that is, a message about what operations the user intends to perform with the file. In C and C++, the parameter that determines the access mode is also enclosed in quotes. The following options are possible:

· r - indicates that information will be read from the file into the computer's memory. If the file does not exist on disk, fopen() fails and returns NULL.

· w - indicates that data will be written to disk or output to a printer. If the file does not exist, it is created. If the file already exists on disk, all information currently in it is destroyed.

· a - indicates that information will be added to the end of the file. If the file does not exist, it is created. If it exists, the new output is appended to the end of the file without destroying the current content.

For example, if you wanted to create a file named CD.DAT to store a catalog of CD collections, you would use the following instructions:

FILE *cdfile;
cdfile = fopen("CD.DAT", "w");

If your program needs to read from a file rather than write to it, use the following notation:

FILE *cdfile;
cdfile = fopen("CD.DAT", "r");

Note that both the file name and the symbol defining the access mode are enclosed in double quotes. This is because they are passed to fopen() as strings. The file name can be entered from the keyboard as the value of a string variable, and then the name of that variable can be used as an argument, without quotes.
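As a minimal sketch of passing a file name held in a string variable, the helper below opens the named file and reports a failure. The name open_named and the message text are assumptions made for this illustration; only fopen() itself comes from the text above.

```c
#include <stdio.h>

/* Hypothetical helper: opens the file whose name is stored in the string
   `filename`. The variable is passed to fopen() without quotes.
   Returns the file pointer, or NULL if the file could not be opened. */
FILE *open_named(const char *filename, const char *mode)
{
    FILE *fp = fopen(filename, mode);

    if (fp == NULL)
        printf("Unable to open file %s\n", filename);
    return fp;
}
```

A call such as open_named(name, "r") returns NULL when no file with that name exists, so the caller can ask for another name instead of continuing with an unusable pointer.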

If you want to print information about your collection on a printer, use the following sequence of instructions:

FILE *cdfile;
cdfile = fopen("PRN", "w");

Please note that information output to the printer is only possible with the “w” access mode.

How C/C++ works with files

C stores information about the current read and write position in a file using a special pointer.

When reading information from a file, the pointer specifies the next data to be read from disk. When a file is opened using the "r" access mode, the pointer is placed at the first character of the file. When a read operation is performed, the pointer moves to the next piece of data to be read. The size of the step depends on the amount of information read at one time (Fig. 4). If only one character is read at a time, the pointer moves to the next character; if an entire structure is read, the pointer moves to the next structure. Once all the information has been read from the file, the pointer reaches the end of the file. Some systems mark this position with a special code called the end-of-file character, but the presence of such a character is not actually necessary. An attempt to continue reading after the end of the file has been reached produces no new data: the read function returns a special value, such as EOF or NULL, instead.
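The movement of the pointer through a file can be sketched with a function that reads one character at a time until the end of the file. The name count_chars is an assumption for this example; fgetc() and EOF are standard.

```c
#include <stdio.h>

/* Counts the characters in an open file by reading them one at a time.
   fgetc() returns an int so that EOF can be told apart from real data. */
long count_chars(FILE *fp)
{
    long n = 0;
    int ch;

    while ((ch = fgetc(fp)) != EOF)
        n++;                /* each read moves the position one character */
    return n;
}
```

Once the function has returned, a further call to fgetc() on the same pointer simply returns EOF again, and feof() reports that the end of the file has been reached.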

If a file is opened with access mode "w", the pointer is also placed at the beginning of the file, so that the first data written is placed at the beginning of the file. If the file already exists when it is opened with the "w" access mode, its previous contents are discarded and can no longer be read; the data written by the program takes their place. Thus, any attempt to write to an existing file using access mode "w" destroys the information currently stored in it. This happens even if the file is simply opened and closed without writing any data.

If a file is opened using access mode "a", the pointer is placed at the end of the file. New data written to the file is placed after the existing data.
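The difference between the "w" and "a" access modes can be checked with a short sketch. The helper names put_text and file_length are assumptions made for this example.

```c
#include <stdio.h>

/* Writes `text` to `path` using the given access mode ("w" or "a").
   Returns 0 on success, -1 if the file could not be opened. */
int put_text(const char *path, const char *mode, const char *text)
{
    FILE *fp = fopen(path, mode);

    if (fp == NULL)
        return -1;
    fputs(text, fp);
    fclose(fp);
    return 0;
}

/* Returns the number of characters currently stored in `path`,
   or -1 if the file cannot be opened for reading. */
long file_length(const char *path)
{
    FILE *fp = fopen(path, "r");
    long n = 0;

    if (fp == NULL)
        return -1;
    while (fgetc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}
```

Writing "one\n" with "w" and then "two\n" with "a" leaves eight characters in the file. Opening the same file once more with "w" and writing an empty string leaves it with a length of zero, which is the destructive behaviour described above.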

How to close a file

After you finish writing to or reading from a file, you must close it, that is, break the connection between the file and the program. This is done using the instruction

fclose(file_pointer);

Closing the file guarantees that all the information in the buffer is actually written to the file. If program execution is interrupted before the file is closed, some of the information may still be in the buffer and never reach the disk, so it is lost.

It should be added that closing a file frees the pointer, after which it can be used with another file or to perform other operations on the same file. As an example, let's say you want to create a file, write data to it, and then verify that the information is written correctly. To do this, the program can use the structure shown in Listing 1.

Listing 1. Using one file pointer in two operations.

FILE *cdfile;

if ((cdfile = fopen("CD.DAT", "w")) == NULL) {
    puts("Unable to open file");
    exit(1);
}
/* Instructions for writing to the file should go here */
fclose(cdfile);

if ((cdfile = fopen("CD.DAT", "r")) == NULL) {
    puts("Unable to open file");
    exit(1);
}
/* Instructions for reading from the file should go here */
fclose(cdfile);

Here the file is first opened using the "w" access mode, then data is written to it. The second time, the file is opened using the "r" access mode, which allows the data to be read and displayed on the screen.

You can also make sure that all data is written to a file without closing it by flushing the buffer with the function

fflush(file_pointer);

This function writes all the data in the buffer to disk, or sends it to the printer, without closing the file.
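The effect of buffering can be observed with the standard fflush() function. The following is only a sketch: the names visible_length and flush_demo are assumptions, and the "before" value relies on the short message fitting into the stream's buffer, which setvbuf() requests explicitly here.

```c
#include <stdio.h>

/* Returns the number of characters that a second reader can currently
   see in the file `path`, or -1 if it cannot be opened. */
static long visible_length(const char *path)
{
    FILE *fp = fopen(path, "r");
    long n = 0;

    if (fp == NULL)
        return -1;
    while (fgetc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}

/* Writes a short message and records how much of it has reached the
   file before and after the buffer is flushed. */
void flush_demo(const char *path, long *before, long *after)
{
    FILE *fp = fopen(path, "w");

    if (fp == NULL) {
        *before = *after = -1;
        return;
    }
    setvbuf(fp, NULL, _IOFBF, BUFSIZ);  /* request full buffering */
    fputs("hello", fp);                 /* goes into the buffer */
    *before = visible_length(path);     /* usually still 0 */
    fflush(fp);                         /* buffer is written to the file */
    *after = visible_length(path);      /* now 5 */
    fclose(fp);
}
```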

Input and output functions

There are several ways to pass data to and from a file, depending on the function used:

· character-by-character writing of data to a file or outputting it to a printer using the putc() or fputc() function;

· character-by-character reading of data from a file using the getc() or fgetc() function;

· writing data line by line to a file or outputting it to a printer using the fputs() function;

· line-by-line reading of data from a file using the fgets() function;

· formatted output of characters, strings or numbers to disk or to a printer using the fprintf() function;

· formatted input of characters, strings or numbers from a file using the fscanf() function;

· writing an entire structure using the fwrite() function;

· reading an entire structure using the fread() function.

Working with Symbols

Character-by-character data transfer is the most basic form of file operations. Although it is not widely used in practice, it illustrates the basic principles of working with files well. The program below reads characters from the keyboard and writes them to a file one at a time until the Enter key is pressed:

/*fputc.c*/
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *fp;
    int letter;

    if ((fp = fopen("MYFILE", "w")) == NULL) {
        puts("Unable to open file");
        exit(1);
    }
    do {
        letter = getchar();
        if (letter != EOF)
            fputc(letter, fp);
    } while (letter != '\n' && letter != EOF);
    fclose(fp);
    return 0;
}

The file is opened with access mode "w". If a file named MYFILE does not exist when the program runs, it is created. The do loop uses the getchar() function to input a sequence of characters, each of which is then written to the file using the fputc() function. The syntax of fputc() is:

fputc(char_variable, file_pointer);

The putc() function can also be used with the same arguments.

The loop runs until the Enter key is pressed, which produces the newline code (\n). After that, the file is closed.

Working with Strings

Instead of working with individual characters, you can read entire lines of text from a file and write them to it. Line-by-line writing and reading are done using the fputs() and fgets() functions.

The fputs() function has the following syntax:

fputs(string_variable, file_pointer);

This function writes data line by line to a file or outputs to a printer, but does not add a "new line" code. In order for each line to be written to disk (or printed to a printer) truly as a separate line, you must enter the "newline" code manually. For example, the following program creates a names file:

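/*fputs.c*/
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *fp;
    int flag, ch;
    char name[12];    /* buffer size is an assumption */

    if ((fp = fopen("MYFILE", "w")) == NULL) {
        puts("Unable to open file");
        exit(1);
    }
    flag = 'y';
    while (flag != 'n') {
        puts("Enter a name");
        gets(name);   /* unbounded; gets() was removed in C11 */
        fputs(name, fp);
        fputs("\n", fp);
        printf("Would you like to enter a different name?");
        flag = getchar();
        /* skip the rest of the answer, including Enter */
        for (ch = flag; ch != '\n' && ch != EOF; ch = getchar())
            ;
        putchar('\n');
    }
    fclose(fp);
    return 0;
}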

The while loop continues until n is entered at the prompt. In this loop, the name is entered from the keyboard using the gets() function, after which the name is written to disk using the fputs() function. Next, the "new line" code is written to the file, and finally the program asks the user if he wants to continue entering names.

If your compiler provides the strlen() function (declared in string.h), you can simplify the input procedure somewhat by using the following instructions:

printf("Please enter a name: ");
gets(name);
while (strlen(name) > 0) {
    fputs(name, fp);
    fputs("\n", fp);
    printf("Please enter a name: ");
    gets(name);
}

The characters you type on the keyboard are assigned to the string variable name, and the length of the string is then checked. If you press Enter immediately at the prompt, the string has a length of zero and the loop stops. If you type at least one character before pressing Enter, the line and the "new line" code are written to disk.

Some compilers allow you to further simplify the string input algorithm, for example, like this:

printf("Please enter a name: ");
while (strlen(gets(name)) > 0) {
    fputs(name, fp);
    fputs("\n", fp);
    printf("Please enter a name: ");
}

where the line input is done inside the while condition.
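The gets() function used in these loops cannot limit the number of characters it stores, and it was removed from the C standard in C11. A bounded alternative can be sketched with fgets(), which is described in the section on reading lines. The helper name read_line is an assumption made for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical replacement for gets(): reads one line from `in` into
   `buf`, storing at most size - 1 characters, and removes the trailing
   newline. Returns buf, or NULL at end of file. */
char *read_line(char *buf, int size, FILE *in)
{
    if (fgets(buf, size, in) == NULL)
        return NULL;
    buf[strcspn(buf, "\n")] = '\0';     /* cut the string at the newline */
    return buf;
}
```

With this helper, the input loop could be written as while (read_line(name, sizeof name, stdin) != NULL && strlen(name) > 0). If a typed line is longer than the buffer, the rest of it stays in the input and is returned by the next call.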

To print a string on a printer instead of writing it to disk, use the file name "prn". To open the file, specify:

if ((fp = fopen("prn", "w")) == NULL)

In the print program, the line length is set to 81 characters so that a line can fill the full width of the screen before the Enter key is pressed. Listing 2 is a program that demonstrates how a simple word processor can be written. A line is not sent to the printer until the Enter key is pressed, which allows you to use the Backspace key to correct typing errors.

Listing 2. Program for outputting a string to a printing device.

/*wp.c*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    FILE *fp;
    char line[81];

    if ((fp = fopen("prn", "w")) == NULL) {
        puts("Printer not ready for work");
        exit(1);
    }
    puts("Enter text, press Enter after each line\n");
    puts("To stop typing, press Enter at the beginning of a new line\n");
    gets(line);
    while (strlen(line) > 0) {
        fputs(line, fp);
        fputs("\n", fp);
        gets(line);
    }
    fclose(fp);
    return 0;
}

Reading lines

Reading lines from a file is done using the fgets() function. Function syntax:

fgets(string_variable, length, file_pointer);

The function reads a string up to and including the newline character, provided that its length does not exceed the value of the length parameter minus one character. A longer line is read in pieces of at most length minus one characters. The length parameter is an integer constant or variable that specifies the maximum number of characters per line.

Below is a program that reads names from the file created in the previous example:

/*fgets.c*/
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *fp;
    char name[12];

    if ((fp = fopen("MYFILE", "r")) == NULL) {
        puts("Unable to open file");
        exit(1);
    }
    while (fgets(name, 12, fp) != NULL) {
        printf(name);
    }
    fclose(fp);
    return 0;
}

Reading is performed inside a while loop until fgets() returns NULL. When the end of the file is reached, fgets() returns NULL instead of a pointer to the string. When reading from a file line by line, NULL is always used to detect the end of the file; when reading character by character, EOF is used.

If you are writing a program to read any text file, specify the length argument to be 80.

By the way, note that the printf() function is used in this example to print the contents of a string variable without format specifiers. Each line read from the file includes the newline code that was written to the file by the fputs("\n", fp); statement, so no additional newline codes need to be included in the parameters of the printf() function. This works only as long as the string contains no % characters, which printf() would interpret as format specifiers; writing printf("%s", name) avoids that problem.
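The role of the length argument can be illustrated by counting how many fgets() calls are needed to read a file. This is a sketch only; the function name count_reads and its 256-character internal buffer are assumptions.

```c
#include <stdio.h>

/* Counts how many fgets() calls are needed to read the whole file when
   each call may store at most length - 1 characters.
   Returns -1 if the file cannot be opened or length is out of range. */
int count_reads(const char *path, int length)
{
    char buf[256];
    int reads = 0;
    FILE *fp;

    if (length < 2 || length > (int)sizeof buf)
        return -1;
    if ((fp = fopen(path, "r")) == NULL)
        return -1;
    while (fgets(buf, length, fp) != NULL)
        reads++;
    fclose(fp);
    return reads;
}
```

For a file that contains the two lines Alexander and Bo, a length of 80 needs two reads, one per line. A length of 6 needs three, because the first line is returned in two pieces of at most five characters each.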

Listing 3. Formatted output.

/*fprintf.c*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    FILE *fp;
    char name[20];    /* buffer size is an assumption */
    int quantity;
    float cost;

    if ((fp = fopen("MYFILE", "w")) == NULL) {
        puts("The file cannot be opened");
        exit(1);
    }
    printf("Enter the product name: ");
    gets(name);
    while (strlen(name) > 0) {
        printf("Enter the product price: ");
        scanf("%f", &cost);
        printf("Enter the number of product units: ");
        scanf("%d", &quantity);
        getchar();    /* discard the Enter left in the input by scanf() */
        fprintf(fp, "%s %f %d\n", name, cost, quantity);
        printf("Enter product name: ");
        gets(name);
    }
    fclose(fp);
    return 0;
}

Note that the next name is entered in the last line of the loop. This allows you to end the loop simply by pressing the Enter key. Some novice programmers would probably write the loop like this:

do {
    printf("Enter the product name: ");
    gets(name);
    printf("Enter the price: ");
    scanf("%f", &cost);
    printf("Enter the number of product units: ");
    scanf("%d", &quantity);
    getchar();    /* discard the Enter left in the input by scanf() */
    fprintf(fp, "%s %f %d\n", name, cost, quantity);
} while (strlen(name) > 0);

and this program would also work, except that ending the loop would take more than a single press of the Enter key: after an empty name is entered, the program would still ask for the price and quantity of the product, and a record with an empty name would be written to the file.

Inside the while loop, the price and quantity of each item are entered using the scanf() function and then written to disk using the instruction

fprintf(fp, "%s %f %d\n", name, cost, quantity);

Note that the newline code is written to the file at the end of each line. If you view the contents of the file using the TYPE command of the MS-DOS operating system, each line of the inventory list begins on a new line on the screen:

Floppy disks 1.120000 100
tape 7.340000 150
cartridge 75.000000 3

If the newline code had not been written to disk, the text would have been printed consecutively, on one line on the screen, and would have looked something like this:

Floppy disks 1.120000 100tape 7.340000 150cartridge 75.000000 3

Please note that there is no space between the number of units of one product and the name of the next one. Even a file written this way can still be read, since fscanf() is able to distinguish the end of a numeric value from the beginning of a string. But what happens if the last value for each product is a string with the name of the manufacturer? The information in the file will look something like this:

Floppy disks 1.120000 Memoryex tape 7.340000 Okaydata cartridge 75.000000 HP

and then, when reading data from a file, the program will append the beginning of the data about the next product to the end of the description of the previous one. For example, data on the first product name would look like this:

Floppy disks 1.120000 Memoryex tape

All data written to disk in this way, even values of type int or float, is stored as text characters. We will talk about this a little later.

Listing 4. Reading formatted text from a file.

/*fscanf.c*/
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *fp;
    char name[20];    /* buffer size is an assumption */
    int quantity;
    float cost;

    if ((fp = fopen("MYFILE", "r")) == NULL) {
        puts("Unable to open file");
        exit(1);
    }
    /* fscanf() returns the number of values it managed to store */
    while (fscanf(fp, "%19s%f%d", name, &cost, &quantity) == 3) {
        printf("Product name: %s\n", name);
        printf("Price: %.2f\n", cost);
        printf("Number of units: %d\n", quantity);
    }
    fclose(fp);
    return 0;
}
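The %s specifier stops at the first space, so a name such as "Floppy disks" cannot be read back as a single value. One possible workaround, sketched here under the assumption that records are written with semicolons between the fields, is to read a whole line and split it with sscanf(). The record layout and the name parse_record are hypothetical and are not used elsewhere in this text.

```c
#include <stdio.h>

/* Parses one record of the assumed form  name;cost;quantity
   where the name may contain spaces (at most 19 characters).
   Returns 1 if all three fields were read, 0 otherwise. */
int parse_record(const char *line, char name[20], float *cost, int *quantity)
{
    /* %19[^;] reads everything up to, but not including, the semicolon */
    return sscanf(line, "%19[^;];%f;%d", name, cost, quantity) == 3;
}
```

A line obtained with fgets(), for example "Floppy disks;1.12;100", can be passed to this function directly.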

Working with structures

One way to overcome the limitations of fscanf() is to combine data elements into a structure so that entire structures can be input and output. A structure can be written to disk using the fwrite() function and read from a file using the fread() function.

The fwrite() function syntax is:

fwrite(&structure_variable, structure_size, number_of_structures, file_pointer);

At first glance, this instruction looks a little intimidating, but in fact it is very easy to use:

· &structure_variable - the name of a structure variable preceded by the address-of operator (&), which gives the function the starting address of the information that we want to write to disk;

· structure_size is the number of bytes in the structure. You don't have to calculate it yourself; you can use the sizeof operator, written like this:

sizeof(structure_variable)

which automatically determines the size of the specified structure;

· number_of_structures is an integer that determines the number of structures that we want to write at one time; The number 1 should always be specified here, unless you are going to create an array of structures and write it as one big block;

· file_pointer - pointer to the file.

As an example, let's say you want to write information about your CD collection to disk. Using the CD structure, which we discussed in detail earlier, we write the instruction:

fwrite(&disc, sizeof(disc), 1, fp);

The execution of this instruction is illustrated in Fig. 5.

The program that enters data into a CD structure and then saves it to disk is shown in Listing 5. The gets() function is used to enter the name of the file to be created. The variable that stores the file name is then passed to the fopen() function to open the file.

Information about each CD structure is entered from the keyboard, after which the entire structure is written to the disk.



Fig. 5. Syntax of the fwrite() function in the CD structure write instruction

Listing 5. Recording the CD structure.

/*fwrite.c*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    FILE *fp;
    struct CD {              /* array sizes are assumptions */
        char name[20];
        char description[40];
        char category[12];
        float cost;
        int number;
    } disc;
    char filename[80];

    printf("Enter the name of the file you want to create: ");
    gets(filename);
    /* "b" selects binary mode: bytes are written without translation */
    if ((fp = fopen(filename, "wb")) == NULL) {
        printf("Unable to open file %s\n", filename);
        exit(1);
    }
    puts("Enter disk information\n");
    printf("Enter disk name: ");
    gets(disc.name);
    while (strlen(disc.name) > 0) {
        printf("Enter description: ");
        gets(disc.description);
        printf("Enter category: ");
        gets(disc.category);
        printf("Enter price: ");
        scanf("%f", &disc.cost);
        printf("Enter cell number: ");
        scanf("%d", &disc.number);
        getchar();    /* discard the Enter left in the input by scanf() */
        fwrite(&disc, sizeof(disc), 1, fp);
        printf("Enter name: ");
        gets(disc.name);
    }
    fclose(fp);
    return 0;
}

Reading structures

Structures are read from a file using the fread() function, which has the following syntax:

fread(&structure_variable, structure_size, number_of_structures, file_pointer);

With the exception of the function name, this instruction is exactly the same as the fwrite() function. The program that reads the CD structure from a file is shown in Listing 6. To read the data, use a while loop:

while (fread(&disc, sizeof(disc), 1, fp) == 1)

The fread() function returns the number of structures successfully read. Since we specified in the function argument that one structure should be read, the function returns the value 1 on success. The while loop is executed as long as structures are read from disk successfully. If a structure cannot be read, for example because the end of the file has been reached, the function returns 0 and the loop terminates.

Listing 6. Reading the CD structure from disk.

/*fread.c*/
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *fp;
    struct CD {              /* array sizes are assumptions */
        char name[20];
        char description[40];
        char category[12];
        float cost;
        int number;
    } disc;
    char filename[80];

    printf("Enter name of the file you want to open: ");
    gets(filename);
    if ((fp = fopen(filename, "rb")) == NULL) {
        printf("Unable to open file %s\n", filename);
        exit(1);
    }
    while (fread(&disc, sizeof(disc), 1, fp) == 1) {
        puts(disc.name);
        putchar('\n');
        puts(disc.description);
        putchar('\n');
        puts(disc.category);
        putchar('\n');
        printf("%f", disc.cost);
        putchar('\n');
        printf("%d", disc.number);
        putchar('\n');
    }
    fclose(fp);
    return 0;
}

Table 1 collects all the described methods of data input and output and shows the value that each function returns when it is impossible to continue reading or writing data.
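The number_of_structures argument and the return values of fwrite() and fread() can be seen together in a short round trip. This is a sketch under several assumptions: the shortened CD structure, the helper names save_cds and load_cds, and the binary access modes "wb" and "rb" are chosen for the example only.

```c
#include <stdio.h>

struct CD {                 /* shortened structure, sizes assumed */
    char name[20];
    float cost;
    int number;
};

/* Writes `count` structures with a single fwrite() call.
   Returns the number of structures actually written. */
size_t save_cds(const char *path, const struct CD *discs, size_t count)
{
    size_t written;
    FILE *fp = fopen(path, "wb");

    if (fp == NULL)
        return 0;
    written = fwrite(discs, sizeof discs[0], count, fp);
    if (fclose(fp) != 0)
        return 0;
    return written;
}

/* Reads up to `max` structures, one per fread() call.
   Returns the number of structures read. */
size_t load_cds(const char *path, struct CD *discs, size_t max)
{
    size_t n = 0;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return 0;
    while (n < max && fread(&discs[n], sizeof discs[0], 1, fp) == 1)
        n++;
    fclose(fp);
    return n;
}
```

Passing an array of two structures to save_cds() writes both in one block. load_cds() then stops after the second structure, because the next fread() call returns 0 at the end of the file.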

Table 1. Functions for input to and output from a file.

Reading into an array

All of the example programs shown so far have read data from a file and displayed the input on the screen. However, if you read data into variables, you can perform any operations on them, for example, use them to write to an array.

Listing 7 shows the text of a program that reads information from a file containing data about a collection of CDs into an array of CD structures (assuming no more than 20 of them). The index is used to ensure that each structure read from a file is stored in a separate element of the disc array. After the next structure is read and displayed, the cost of the next disk is added to the amount reflecting the total cost of the collection, and the index and counter are incremented by executing the following instructions:

total = total + disc[index].cost;
index++;
count++;

If we were interested only in the total cost and the number of items in the collection, we could read the data into a single structure variable without using an array and simply accumulate the values of the total and count variables. However, once the data has been read into an array, you can access any of the structures at will and print any information.

Note that the program repeats the prompt for a file name until you enter a file name that can actually be opened.

Listing 7. Reading a structure into an array.

/*rarray.c*/
#include <stdio.h>

int main(void)
{
    FILE *fp;
    struct CD {              /* array sizes are assumptions */
        char name[20];
        char description[40];
        char category[12];
        float cost;
        int number;
    } disc[20];              /* no more than 20 structures */
    int index, count;
    float total;
    char filename[80];

    count = 0;
    total = 0;
    printf("Enter data file name: ");
    gets(filename);
    while ((fp = fopen(filename, "rb")) == NULL) {
        printf("Unable to open file %s\n", filename);
        printf("Enter data file name: ");
        gets(filename);
    }
    index = 0;
    while (index < 20 && fread(&disc[index], sizeof(disc[0]), 1, fp) == 1) {
        puts(disc[index].name);
        putchar('\n');
        puts(disc[index].description);
        putchar('\n');
        puts(disc[index].category);
        putchar('\n');
        printf("%f", disc[index].cost);
        putchar('\n');
        printf("%d", disc[index].number);
        putchar('\n');
        total = total + disc[index].cost;
        index++;
        count++;
    }
    fclose(fp);
    printf("The total value of the collection is %.2f\n", total);
    printf("The collection contains %d disks\n", count);
    return 0;
}

Listing 8. Program for copying file contents.

/*filecopy.c*/
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *fp1, *fp2;
    char infile[80], outfile[80];    /* buffer sizes are assumptions */
    int letter;

    printf("Enter file name to read: ");
    gets(infile);
    if ((fp1 = fopen(infile, "r")) == NULL) {
        printf("Unable to open file %s", infile);
        exit(1);
    }
    printf("Enter the file name to write: ");
    gets(outfile);
    /* open the output file; opening infile with "w" would erase it */
    if ((fp2 = fopen(outfile, "w")) == NULL) {
        printf("Unable to open file %s", outfile);
        fclose(fp1);
        exit(1);
    }
    while ((letter = fgetc(fp1)) != EOF) {
        putchar(letter);
        fputc(letter, fp2);
    }
    fclose(fp1);
    fclose(fp2);
    return 0;
}

The first file is opened with access mode "r" so that data can be read from it. If the file cannot be opened, the program ends. The second file is opened with the "w" access mode, which allows data to be written to it. If the second file cannot be opened, the first file is closed before the program exits. This guarantees that the first file, if it was successfully opened, will not be damaged when the program exits.

Fig. 6. The fprintf() function writes numeric values as text characters

The fprintf() function writes all data as text. For example, if you use fprintf() to write the number 34.23 to disk, five characters will be written as shown in Fig. 6. If the fscanf() function is subsequently used to read data from a file, the characters will be converted to a numeric value and written to a variable in this form.

Because the fprintf() function writes data as text, reading from a file can also be done using the getc(), fgetc(), or fgets() functions. However, these functions read the information as "printable" characters. For example, if you use the fgets() function, the numbers are read as characters that are part of a string. Data read using fgets() or fgetc() can be displayed or printed, but you cannot perform arithmetic operations on its individual elements without first converting them to numeric values.

Binary format

The fwrite() function saves numeric variables in binary format. Data written in this way takes up the same amount of space on disk as in memory. If we view the contents of such a file using the TYPE command, we will see meaningless letters and symbols in place of the numerical values. These are the characters whose codes happen to match the bytes written to the file.

To read a file written with fwrite(), you should use the fread() function. The data should be read into a structure whose layout corresponds to the previously saved data. The structure may have a different name, and the names of its members may also differ, but the order, types and sizes of the members of both structures must be the same.
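The difference between the two formats can be measured by writing the same float value once with fprintf() and once with fwrite() and comparing the file sizes. This is a sketch only; the helper names are assumptions, and the "%.2f" format is chosen so that 34.23 is written as the five characters mentioned above.

```c
#include <stdio.h>

/* Returns the size of the file in bytes, or -1 if it cannot be opened. */
static long size_of_file(const char *path)
{
    long n = 0;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    while (fgetc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}

/* Writes `value` as text and returns the resulting file size. */
long text_size(const char *path, float value)
{
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return -1;
    fprintf(fp, "%.2f", value);
    fclose(fp);
    return size_of_file(path);
}

/* Writes `value` in binary form and returns the resulting file size. */
long binary_size(const char *path, float value)
{
    FILE *fp = fopen(path, "wb");

    if (fp == NULL)
        return -1;
    fwrite(&value, sizeof value, 1, fp);
    fclose(fp);
    return size_of_file(path);
}
```

The text file grows with the number of digits, while the binary file always occupies sizeof(float) bytes, the same amount the variable takes in memory.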

Print data

From a technical point of view, you can output data to a printer using any output function: character-by-character, line-by-line, formatted strings or structures. The only thing required is to specify the file name "prn" and the access mode "w".

However, "structural" printing using the fwrite() function is practically not used, since the numeric data will be printed in binary format as cryptic characters. Instead, the fprintf() function is used to print structures, as shown in Listing 9. This program opens two files: the disk file is opened for reading, and the printer file is opened for output.

Listing 9. Reading and printing the contents of a disk file.

Each structure is entered as a whole by the fread() function, after which the individual members of the structure are printed using the fprintf() function. The fread() function can read strings that include spaces, so it is preferable to using the fscanf() function.

The instruction

fprintf(ptr, "\n\n");

outputs two empty lines between individual CD structures.
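A minimal sketch of the approach described above, not the original Listing 9: each structure is read as a whole with fread() and its members are printed as text with fprintf(). The array sizes in the CD structure and the function name print_cds are assumptions. The output pointer ptr could refer to a stream opened on "prn" or to any other file opened for writing.

```c
#include <stdio.h>

struct CD {                 /* array sizes are assumptions */
    char name[20];
    char description[40];
    char category[12];
    float cost;
    int number;
};

/* Reads CD structures from `fp` and prints their members as text to
   `ptr`. Returns the number of structures printed. */
int print_cds(FILE *fp, FILE *ptr)
{
    struct CD disc;
    int count = 0;

    while (fread(&disc, sizeof disc, 1, fp) == 1) {
        fprintf(ptr, "%s\n", disc.name);
        fprintf(ptr, "%s\n", disc.description);
        fprintf(ptr, "%s\n", disc.category);
        fprintf(ptr, "%.2f\n", disc.cost);
        fprintf(ptr, "%d\n", disc.number);
        fprintf(ptr, "\n\n");   /* two empty lines between structures */
        count++;
    }
    return count;
}
```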

Program design

Knowing how to write to and read from a disk file opens up the possibility of creating complex applications. All programs that demonstrated data entry from a disk file read the entire file. But you can imagine a situation where you want to deal with the data in some other way.

For example, you may need to scan a disk file for a specific entry. In this case, you should open the file with access mode "r" and then use a loop to read the data step by step, structure by structure or line by line, depending on what type of information is written in the file. During each iteration of the loop, the values just read are compared with the values being sought. To compare string values, use the strcmp() function, if your compiler provides it, of course. Once the required data is found, it is displayed on the screen, after which the file is closed.
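The search described above can be sketched for a file of structures written with fwrite(). The shortened CD structure, the function name find_cd and the binary access mode "rb" are assumptions made for this example.

```c
#include <stdio.h>
#include <string.h>

struct CD {                 /* shortened structure, sizes assumed */
    char name[20];
    float cost;
    int number;
};

/* Scans the file structure by structure and copies the first entry whose
   name equals `wanted` into *found. Returns 1 if found, 0 otherwise. */
int find_cd(const char *path, const char *wanted, struct CD *found)
{
    struct CD disc;
    int hit = 0;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return 0;
    while (!hit && fread(&disc, sizeof disc, 1, fp) == 1) {
        if (strcmp(disc.name, wanted) == 0) {   /* 0 means equal strings */
            *found = disc;
            hit = 1;
        }
    }
    fclose(fp);
    return hit;
}
```

The loop stops at the first match, so the rest of the file is not read. The caller can then display the members of the structure that was found.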

To understand the principles by which computer systems operate, it is not enough simply to interact with the operating system on a visual level. To fully understand what is happening, you should clearly understand what a file and a file structure are. The discussion below explains why this is necessary.

Files and file structure

First you need to decide on the most important terms and concepts. The key here is the concept of a file, which determines the operating mechanisms of the system in software terms.

So, a file is an object containing certain information. To understand data, file structures and their interaction, it is better to give an example from life, say, compare these concepts with an ordinary book.

Everyone knows that almost any book has a cover, pages, a table of contents, chapters and sections. In the simplest terms, the cover is the file system as a whole, the pages are folders (directories) in which individual files are stored, the table of contents is the file manager, and the chapters and sections are files containing specific information.

As a rule (though not always), the designation of an object called a file consists of two parts: a name and an extension. The name can be completely arbitrary and written in different languages. The extension is a special designation of three or more Latin letters that indicates the type of the file. Simply put, by the extension you can tell which program the file is associated with, whether it is a system file, and so on.

By default, a file is opened in most operating systems by double-clicking it with the mouse. However, not everything can be opened this way. The simplest example: executable files in Windows with the extension .exe can be launched this way, but dynamic libraries with the extension .dll, although they also contain executable code, cannot. This is because their contents are accessed through other software components, or the code is called by specialized components of the operating system itself.

Files (objects) that do not correspond to any program will not be so easy to open. Roughly speaking, not a single “OS” will understand which opening tool needs to be launched. In the best case, you will be asked to select the appropriate program from the provided list of probable solutions.

Files and file structure: computer science at the dawn of the development of computer technology

Now let's see what information technologies were like when they first appeared. The main system used on personal computers at that time is generally considered to be DOS, primitive by modern standards, in which specialized commands had to be typed to access functions.

With the advent of Norton Commander, this need did not disappear entirely (some commands still had to be typed), but it decreased considerably. It is this file manager that, in our example, can be called the table of contents, since all the data stored on the hard drive or an external storage device was presented in a clearly structured form.

Files and folders

As is already clear, any system has several main types of objects. The file structure, in addition to its main element (the file), is inseparable from the concept of a folder. This term is sometimes also rendered as “directory” or “catalog”. Essentially, it is a section in which individual components are stored.

Leaving book pages aside, the concept of a folder is easiest to picture as a chest of drawers with many drawers in which something lies. This “something” is files, and the drawers are directories.

The simplest examples of file search

Based on the above, we can draw a conclusion about quickly searching for information. Any existing operating system has tools for this purpose. In the same file manager (for example, Windows Explorer), in a special field, you just need to enter at least part of the file name, after which the system will display all objects containing the entered string.

However, for a more precise search you sometimes need to know exactly where the file you are looking for is located. Roughly speaking, you need to select the specific drawer in the chest of drawers that holds the item you need. The search itself is performed using standard means in the file manager, but you can also use a key combination such as Ctrl + F, which brings up the search bar.

What is a file system?

Files and file structures cannot be imagined without understanding the file system. Note that file structure and file system are not the same thing. Structure is the main type of organizing files, or, if you like, systematizing data, but the file system is a method that determines the operation of the structure. In other words, this is the principle of data processing in terms of its placement on a hard drive or any other storage medium.

Today you can find quite a lot of file systems. For example, the best-known systems for Windows over the history of computer technology have been the FAT systems with 8, 16, 32 and 64 bit architecture, NTFS and ReFS. The file system, the file structure and the method of ordering data are closely related. But now a few words about the systems themselves.

Without going into technical details, the main practical difference between them is that FAT is oriented toward smaller volumes and quick access to small files, while NTFS and ReFS are optimized for large amounts of data and for reading them from the hard drive at maximum speed.

File Operations

Now let us look from the other side at the file operations that any OS provides; in general, they do not differ much from system to system.

The main ones include creating a file, opening, viewing, editing, saving, renaming, copying, moving, deleting, etc. Such actions are standard for all existing systems. However, there are also some specific functions.
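Several of these operations are available directly from the standard C library. The sketch below is a minimal illustration, not a definitive implementation: copy_file is a hypothetical helper written for this example, while fopen(), fread(), fwrite(), rename() and remove() are standard functions declared in stdio.h.

```c
#include <assert.h>
#include <stdio.h>

/* Copies the file src to dst through a small buffer.
   Returns 0 on success, -1 on failure. */
int copy_file(const char *src, const char *dst)
{
    FILE *in, *out;
    char buf[4096];
    size_t n;
    int status = 0;

    assert(src != NULL && dst != NULL);

    in = fopen(src, "rb");              /* open the source for reading */
    if (in == NULL)
        return -1;
    out = fopen(dst, "wb");             /* create (or truncate) the copy */
    if (out == NULL) {
        fclose(in);
        return -1;
    }
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            status = -1;                /* write error, e.g. disk full */
            break;
        }
    }
    if (ferror(in))
        status = -1;
    fclose(in);
    if (fclose(out) != 0)               /* buffered data is flushed here */
        status = -1;
    return status;
}
```

Renaming or moving a file is then a call such as rename("old.txt", "new.txt"), and deleting it is remove("new.txt"); both return 0 on success. The file names here are placeholders.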

Data archiving

Among the specific functions, first of all, we can highlight the compression of files and folders, called archiving, as well as the reverse process - extracting data from the archive. At the time of the development of DOS, the creation of archival data types was mainly limited to using the ARJ standard.

But with the advent of ZIP archiving technologies, such processes received a new development. Subsequently, the universal RAR archiver appeared. ZIP support is now built into most operating systems, without the need to install additional software. From the point of view of the file structure and file operations, archiving is a form of compression: the archiver re-encodes the data so that it occupies less space on disk. The information content of a file or folder itself does not change during archiving and is restored in full on extraction.

Controlling the display of objects

The concepts of “file”, “file structure”, etc. should also be considered from the point of view of whether the objects themselves can be seen. It's no secret that almost all users of modern PCs have come across the term “hidden files and folders.”

What is it? It only means that the system restricts the display of certain objects (for example, system files and folders, so that the user does not accidentally delete them). That is, they do not physically disappear from the hard drive; the file manager just does not show them.

To display all hidden objects in Explorer, use the “View” menu and, on the corresponding tab, check the option for displaying all files. After this option is enabled, hidden objects are shown with translucent icons.

Finding hidden objects can also be difficult. When the display of such objects is disabled, entering a file name or extension, even with a specific location, returns no result (the system does not see them). To reach them, you can use an environment variable, whose name is written with the % symbol at the beginning and at the end. For example, to open the AppData directory, which is hidden and located in the profile folder of a specific user, use the string %USERPROFILE%\AppData. Only in this case will the hidden object be found.

Conclusion

Here is a brief summary of all that concerns understanding the basic terms. In principle, it is not so difficult to understand what a file and file structure are using elementary examples. Finally, if you want, you can define these terms as the bricks and the wall that make it up. A brick is a file, a wall is a file structure, where each brick occupies a strictly defined place, assigned only to it.

Some technical aspects and classical definitions accepted in programming and computer technology were deliberately left out above so that the reader can understand the material at an elementary level.

All programs and data are stored in the computer's long-term memory as files.

Definition 1

A file is a named collection of data recorded on a medium. Any file has a name consisting of two parts separated by a dot: the name itself and the extension. When choosing a file name, it is desirable that it indicate either the contents of the file or the author.

The extension indicates the type of information stored in the file. The file name is given by the user, and the file type is usually set automatically by the program when it is created.

Figure 1.

The file name can contain up to 255 characters, including the extension. The file name can consist of English and Russian letters, numbers and other symbols.

It is forbidden to use the following characters in file names:

\ / : * ? " < > |

Extension of some file types:

Figure 2.

In addition to the name and type, the file parameters also include: file size, date and time of creation, icon (an elementary graphic object, by which you can find out in what environment the file was created or what type it is).

Figure 3.

Classification of file icons

Figure 4.

Definition 2

File structure– a set of files and the relationship between them.

A single-level file structure is used for disks with a small number of files and is a linear sequence of file names.

A multi-level file structure is used if the disk contains thousands of files grouped into folders. Multi-level means a system of nested folders with files.

Each disk has a logical name, denoted by a Latin letter followed by a colon:

  • C:, D:, E: etc. – hard and optical drives,
  • A:, B: – floppy disks.

The top-level folder of a disk is the root folder, which in Windows is denoted by adding the “\” character to the disk name; for example, D:\ designates the root folder of disk D:.

Example file structure:

Figure 5.

A catalog (directory) is a folder in which files and other directories are placed.

A directory that is not a subdirectory of any other directory is called the root directory. It is at the top level of the hierarchy of all directories. In Windows, each drive has its own root directory (C:\, D:\, E:\).

Directories in Windows are divided into system and user directories. Examples of system directories: "Desktop", "Network", "Recycle Bin", "Control Panel".

Figure 6. Windows system directories. From left to right: the Recycle Bin system folder, the My Documents folder, a shortcut to the My Documents folder

A directory and a folder are physically the same thing.

The path to the file is its address.

The path to the file always begins with the logical drive name (C:\, D:\, E:\), followed by the sequence of names of the folders nested within each other, the last of which contains the desired file. The path to the file together with the file name is called the full file name; for example, D:\My Documents\Literature\Essay.doc is the full name of the file Essay.doc.
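The relationship between a full file name, its path and the file name itself can be illustrated with a short C sketch. The function names file_name_part and folder_part are hypothetical and were chosen only for this example; the code assumes Windows-style paths that use the backslash as the separator.

```c
#include <assert.h>
#include <string.h>

/* Returns a pointer to the file-name part of a full file name,
   i.e. the text after the last backslash (or the whole string
   if it contains no backslash). */
const char *file_name_part(const char *full_name)
{
    const char *slash;

    assert(full_name != NULL);
    slash = strrchr(full_name, '\\');
    return slash ? slash + 1 : full_name;
}

/* Copies the path part (everything up to and including the last
   backslash) into dir. Returns 0 on success, -1 if dir is too small. */
int folder_part(const char *full_name, char *dir, size_t size)
{
    size_t len = (size_t)(file_name_part(full_name) - full_name);

    if (len >= size)
        return -1;
    memcpy(dir, full_name, len);
    dir[len] = '\0';
    return 0;
}
```

For the full name D:\My Documents\Literature\Essay.doc, file_name_part() yields Essay.doc and folder_part() yields D:\My Documents\Literature\ (in C source the backslashes must be written doubled inside string literals).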

Figure 7. Directory and file tree

Schematically, the file structure of a disk is represented as a tree.

Figure 8. File structure of drive Z:

  • Z:\box\box1 – full name of the folder (directory) box1
  • Z:\box\box.txt – full name of the box.txt file
  • Z:\box\box2\box3\box1 - full name of the folder (directory) box1
  • Z:\box\box2\box3\box.txt - full name of the box.txt file

L 5.1. OS ARCHITECTURE

Keywords: file, file name extension, file attributes, file structure, directory (folder), file path, formatting, sector, track, cylinder, file allocation table (FAT), cluster, file system, FAT 16, FAT 32, NTFS, MFT, CDFS, OS commands, desktop, taskbar, object icon and shortcut, Windows main menu, Windows window, title bar, toolbar, drag-and-drop, dragging, "Explorer", clipboard, "Norton Commander", templates for selecting and searching files.

An operating system is a complex of system and service software. On the one hand, it relies on the basic computer software included in the BIOS (Basic Input/Output System); on the other hand, it itself supports software at higher levels: application programs and most service applications. Programs designed to work under the control of a given operating system are customarily called applications of that operating system.

The main function of all operating systems is mediation. It consists of providing several types of interface:

· interface between the user and computer hardware (user interface);

· interface between software and hardware (hardware-software interface);

· interface between different types of software (software interface).

Even for a single hardware platform, such as the IBM PC, there are several operating systems (OS). As examples, let us look at the file structure, main objects and management techniques of the most common operating systems: MS DOS and Windows XP.

File structure of a personal computer. When storing data, two problems are solved: how to store data in the most compact form, and how to provide convenient and fast access to it (if access is not provided, it is not storage). To ensure access, the data must have an ordered structure. This gives rise to address data, without which it is impossible to access the data elements included in the structure.

The unit of data storage is a variable-length object called a file.

A file is a named sequence of bytes of arbitrary length. Since a file can have zero length, creating a file consists of giving it a name and registering it in the file system; this is one of the functions of the OS.

Usually a single file stores data of one type. In this case, the data type determines the file type.

Since there is no size limit in the file definition, one can imagine a file having 0 bytes (empty file), and a file having any number of bytes.



When defining a file, special attention is paid to the name. It actually carries address data, without which the data stored in the file will not become information due to the lack of a method to access it. In addition to addressing-related functions, a file name can also store information about the type of data contained in it. This is important for automatic tools for working with data, because based on the file name (or rather, its extension), they can automatically determine an adequate method for extracting information from the file.

By naming method, a distinction is made between "short" names (8 characters for the file name and 3 characters for its extension) and "long" names (up to 255 characters). The file name and its extension are separated by a dot. The extension is optional and may be absent.

In MS DOS, the name (no more than 8 characters) and the extension (no more than 3 characters) can consist of uppercase and lowercase Latin letters, digits and the symbols:

- _ $ # & @ ! % ( ) { } ' ~ ^

It should be remembered that in operating systems of the MS DOS family:

· a dot is placed between the name and the extension and belongs to neither the name nor the extension;

· the file name can be typed in any case, because the system treats all letters as uppercase;

· the following characters cannot be used in file names:

* = + \ ; : , . < > / ?

Device names cannot be used as file names:

AUX - name of additional input/output device;

CON - name of the keyboard for input or display for output;

LPT1 ... LPT3 - names of parallel ports;

COM1 ... COM3 - names of serial ports;

PRN - printing device name;

NUL is the name of a dummy device that emulates output operations without actual output.
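The MS DOS naming rules listed above can be collected into a small checking function. This is a sketch under stated assumptions: is_valid_dos_name and same_name are hypothetical helper names, and only the rules given in the text are checked (length 8 + 3, a single dot, the listed forbidden characters, and the listed device names).

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Case-insensitive comparison; returns 1 if the strings are equal. */
static int same_name(const char *a, const char *b)
{
    while (*a && *b) {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Returns 1 if name satisfies the 8.3 rules from the text, 0 otherwise. */
int is_valid_dos_name(const char *name)
{
    static const char *reserved[] = { "AUX", "CON", "PRN", "NUL",
        "LPT1", "LPT2", "LPT3", "COM1", "COM2", "COM3" };
    static const char forbidden[] = "*=+\\;:,<>/?";
    const char *dot;
    size_t base_len, ext_len, i;
    char base[9];

    assert(name != NULL);
    dot = strchr(name, '.');                 /* first dot, if any */
    base_len = dot ? (size_t)(dot - name) : strlen(name);
    ext_len = dot ? strlen(dot + 1) : 0;

    if (base_len < 1 || base_len > 8 || ext_len > 3)
        return 0;
    /* A second dot or any forbidden character makes the name invalid. */
    for (i = 0; name[i] != '\0'; i++) {
        if (name + i == dot)
            continue;
        if (name[i] == '.' || strchr(forbidden, name[i]) != NULL)
            return 0;
    }
    /* The part before the dot must not be a device name. */
    memcpy(base, name, base_len);
    base[base_len] = '\0';
    for (i = 0; i < sizeof reserved / sizeof reserved[0]; i++)
        if (same_name(base, reserved[i]))
            return 0;
    return 1;
}
```

With these rules, AUTOEXEC.BAT is accepted, while longfilename.txt (name longer than 8 characters), file.html (extension longer than 3 characters) and con.txt (device name) are rejected.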

With the advent of Windows 95, the concept of a "long" name was introduced. Such a name can be up to 255 characters long, which is quite enough to create meaningful file names. A "long" name can contain any characters except nine special ones:

\ / : * ? " < > |

Spaces and multiple periods are allowed in the name. The name extension includes all characters after the last dot.

Along with the "long" name, Windows 95/98/Me/2000/XP also creates a short file name, which is needed to work with the file on workstations running older operating systems.

The use of "long" file names in recent versions of Windows has a number of features.

1. If a "long" file name includes spaces, it must be enclosed in quotation marks in service operations. It is recommended not to use spaces but to replace them with underscores.

2. It is not advisable to store files with long names in the root folder of the disk (at the top level of the hierarchical file structure) - unlike other folders, the number of storage units is limited in it (the longer the names, the fewer files can be placed in the root folder).

3. In addition to the limit on the length of the file name (255 characters), there is a much stricter limit on the length of the full file name (which includes the access path to the file, starting from the top of the hierarchical structure). The full name cannot be longer than 260 characters.

4. Characters of any alphabet, including Russian, may be used, but if the document is being prepared for transfer, you need to agree with the customer on whether files with such names can be handled on their equipment.

5. Uppercase and lowercase letters are not distinguished by the OS. The names Letter.txt and letter.txt refer to the same file.

6. Programmers have long learned to use file name extensions to convey to the operating system, the executing program, or the user information about what type of data the file contains and the format in which it is written. System applications offer to select only the main part of the name and specify the file type, and the corresponding name extension is assigned automatically.

Depending on the extension, all files are divided into two large groups: executable and non-executable.

Executable files are files that can be run on their own, i.e. they do not require any special program to run them. They have the following extensions:

· exe - a file ready for execution (winrar.exe; winword.exe);

· com - an operating system file (command.com);

· sys - an operating system file (io.sys), usually a driver for an external device;

· bat - an MS DOS batch file (autoexec.bat).

Non-executable files require the installation of special programs to run. So, for example, in order to view a text document, you need to have some kind of text editor. By the extension of a non-executable file, you can judge the type of data stored in this file. Here are some standard extensions and the names of programs designed to work with files of the specified extensions:

ASM - program text in assembly language;

AVI, MPEG, MPG, WMV etc. - various video file formats, which can be viewed, for example, in Windows Media Player - data type: video;

BAK - old version of a file;

BAS - program text in BASIC;

BMP - a document created in a graphics editor, for example, Paint - data type: image;

C - program text in C;

CDR - a document created in CorelDraw - data type: image;

CPP - program text in C++;

DBF - a database file created, for example, in the FoxPro DBMS;

DOC - a document created in the word processor Microsoft Word - data type: text;

DWG, DXF - graphic files created in AutoCAD;

HTML - a document intended for publication on the Internet;

LIB - library (usually of object modules);

MDB - a database file created in the Microsoft Access DBMS;

MP3, MID, WMA, WAV - various audio file formats - data type: audio;

OBJ - object module;

PAS - program text in Pascal;

PDF - a document created for viewing in Adobe Reader;

PPT - a presentation file created in Microsoft PowerPoint;

PSD - a graphic file created in the graphics editor Adobe Photoshop;

RAR - an archive file created by the archiver program WinRAR;

RTF - a document created in the text editor WordPad;

TIF, GIF, JPG - various graphic file formats;

TMP - temporary file;

TXT - a text file created, for example, in Notepad;

XLS - a workbook created in the spreadsheet program Microsoft Excel - data type: characters (text or numbers);

ZIP - an archive file created by the archiver program WinZip.
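The way an automatic tool can judge the type of data from the extension can be sketched in C as a simple table lookup. The table below contains only a few of the extensions listed above, and the names ext_info, known and data_kind are hypothetical, introduced purely for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

struct ext_info {
    const char *ext;    /* extension in uppercase, without the dot */
    const char *kind;   /* type of data stored in the file */
};

/* A small sample of the extensions from the list above. */
static const struct ext_info known[] = {
    { "BMP", "image" }, { "JPG", "image" }, { "GIF", "image" },
    { "DOC", "text" },  { "TXT", "text" },
    { "MP3", "audio" }, { "WAV", "audio" },
    { "AVI", "video" }, { "ZIP", "archive" }, { "RAR", "archive" }
};

/* Returns the data type for a file name, or "unknown". */
const char *data_kind(const char *file_name)
{
    const char *dot;
    char ext[8];
    size_t i, len;

    assert(file_name != NULL);
    dot = strrchr(file_name, '.');      /* extension = text after the LAST dot */
    if (dot == NULL || dot[1] == '\0')
        return "unknown";
    len = strlen(dot + 1);
    if (len >= sizeof ext)
        return "unknown";
    for (i = 0; i < len; i++)           /* compare without regard to case */
        ext[i] = (char)toupper((unsigned char)dot[1 + i]);
    ext[len] = '\0';
    for (i = 0; i < sizeof known / sizeof known[0]; i++)
        if (strcmp(ext, known[i].ext) == 0)
            return known[i].kind;
    return "unknown";
}
```

A real operating system keeps such associations in its own configuration rather than in a fixed table; the sketch only shows the principle of choosing a handler from the extension.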

In addition to the name and extension, the operating system stores for each file the date of its creation (modification) and several flag values called file attributes. Attributes are additional parameters that define the properties of a file. The operating system allows you to inspect and change them. The state of the attributes is taken into account when automatic operations are performed on files.

There are four main attributes:

· Read-only;

· Hidden;

· System;

· Archive.

The "Read-only" attribute limits what can be done with the file. Setting it means that the file is not intended to be modified.

The "Hidden" attribute signals to the operating system that the file should not be displayed on the screen during file operations. This is a measure of protection against accidental damage to the file.

The "System" attribute marks files that perform important functions in the operation of the operating system itself. Its distinctive feature is that it cannot be changed by means of the operating system. As a rule, most files that have the "System" attribute also have the "Hidden" attribute.

The "Archive" attribute was used in the past by backup programs. Any program that modified a file was supposed to set this attribute automatically, and the backup tool was supposed to reset it. Thus, only files with this attribute set were included in the next backup. Modern backup programs use other means to determine whether a file has changed, so this attribute is no longer taken into account, and changing it manually by means of the operating system has no practical significance.

File storage is organized in a hierarchical structure, which in this case is called the file structure (Fig. 1).

Fig. 1. Hierarchical disk structure

File structure - the hierarchical structure in which the operating system displays files and directories (folders).

The top of the structure is the name of the medium on which the files are saved. Below it, files are grouped into directories (folders), within which nested directories can be created (Fig. 1).

Names of external storage media. The disks on which information is stored on the computer have their own names: each disk is named with a letter of the Latin alphabet followed by a colon. Floppy disks are always assigned the letters A: and B:. The logical drives of the hard drive are named starting with the letter C:. The names of the logical drives are followed by the names of the CD drives. For example, suppose a computer has a floppy drive, a hard drive divided into 3 logical drives, and a CD drive. The letters of the storage media are then: A: - floppy disk drive; C:, D:, E: - logical drives of the hard drive; F: - CD drive.

Directory (folder) - a place on the disk (a special system file) that stores service information about files (name, extension, creation date, size, etc.). Directories at lower levels are nested within directories at higher levels and are subdirectories of them. The higher-level directory is called the parent directory in relation to its lower-level directories. The top level of nesting of the hierarchical structure is the root directory of the disk (Fig. 1). The directory that the user is currently working with is called the current directory.

The rules for naming a directory are no different from the rules for naming a file, although it is not customary to specify name extensions for directories. When writing a file access path through a system of subdirectories, all intermediate directories are separated by a specific symbol. Many operating systems use "\" (backslash) as this character.

The requirement for a unique file name is obvious: without it, unambiguous access to data cannot be guaranteed. In computer systems, the uniqueness of names is enforced automatically: neither the user nor an automated tool can create a file whose full name is identical to an existing one.

When a file is used that is not in the current directory, the program accessing the file needs to indicate where exactly the file is located. This is done by specifying the path to the file.

The path to a file is the name of the medium (disk) followed by a sequence of directory names, separated by the “\” character in Windows (the “/” character is used in UNIX-family operating systems). This path specifies the route to the directory in which the desired file is located.

Two different methods are used to specify the path to a file. In the first, each file is given an absolute path name (full file name), consisting of the names of all directories from the root to the one that contains the file, and the name of the file itself. For example, the path C:\Abby\Doc\otchet.doc means that the root directory of disk C: contains a directory Abby, which in turn contains a subdirectory Doc, where the file otchet.doc is located. Absolute path names always begin with the name of the medium and the root directory, and they are unique. A relative path name is also used. It is used together with the concept of the current directory. The user can designate one of the directories as the current working directory. In this case, all path names that do not begin with a delimiter character are considered relative and are counted from the current directory. For example, if the current directory is C:\Abby, then the file with the absolute path C:\Abby\Doc\otchet.doc can be referred to as Doc\otchet.doc.
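How a relative path name is turned into a full file name can be sketched in C. The function names full_name and is_absolute are hypothetical, and the sketch makes a simplifying assumption: a path is treated as absolute only if it starts with a drive letter and a colon; paths that begin with a bare backslash are not handled specially.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if path starts with a drive name such as "C:". */
static int is_absolute(const char *path)
{
    return path[0] != '\0' && path[1] == ':';
}

/* Builds the full file name for path, interpreting a relative
   path against current_dir. Returns 0 on success, -1 if the
   result does not fit into out. */
int full_name(const char *current_dir, const char *path,
              char *out, size_t size)
{
    int n;
    size_t len;

    assert(current_dir != NULL && path != NULL && out != NULL);
    if (is_absolute(path)) {
        n = snprintf(out, size, "%s", path);
    } else {
        len = strlen(current_dir);
        /* Add a separator only if current_dir does not end with one. */
        if (len > 0 && current_dir[len - 1] == '\\')
            n = snprintf(out, size, "%s%s", current_dir, path);
        else
            n = snprintf(out, size, "%s\\%s", current_dir, path);
    }
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

With the current directory C:\Abby, the relative name Doc\otchet.doc resolves to C:\Abby\Doc\otchet.doc, matching the example in the text.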

File systems. Each file on the disk has its own address. To understand the principle of accessing information stored in a file, you need to know how data is recorded on storage media.

All modern disk operating systems provide for the creation of a file system designed to store data on disks and provide access to them. The principle of organizing a file system is tabular. The surface of the hard drive is treated as a three-dimensional matrix whose dimensions are the numbers of the surface, the cylinder and the sector.

Before use, the disk is divided into tracks and sectors (formatted). From a hardware point of view, formatting is the process of writing service information to the medium that marks the beginning and end of each sector.

Sectors are the blocks in which data is stored. They are numbered starting from one. In addition to user information, sectors contain service information, for example, their own number.

A track is a concentric circle along which the read-write heads pass when reading, writing or searching for data. Tracks are numbered from zero. Track zero is the outermost track on the disk.

The typical sector size is 512 bytes. On a 3.5-inch 1.44 MB floppy disk, for example, there are 80 tracks on one side, and each track contains 18 sectors.

A cylinder is the set of all tracks that belong to different surfaces and are located at an equal distance from the axis of rotation. The physical structure of data storage is shown in Figure 2.

Fig. 2. Physical structure of information storage

Data about where on the disk a particular file is recorded is stored in the system area of the disk in special file allocation tables (FAT tables). Since damage to the FAT table makes it impossible to use the data recorded on the disk, it is subject to special reliability requirements: it exists in two copies, whose identity is regularly checked by operating system tools.

The smallest physical unit of information storage is a sector. Because the size of the FAT table is limited, for disks larger than 32 MB it is not possible to address each individual sector. For this reason, groups of sectors are notionally combined into clusters. A cluster is the smallest unit of addressing information. The cluster size, unlike the sector size, is not fixed and depends on the disk capacity.
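The dependence of the cluster size on the disk capacity can be illustrated with a small calculation in C. This is a simplified sketch, not the rule used by any particular formatting program: the function name min_cluster_size is hypothetical, and the limit of 65,536 addressable clusters is an assumption corresponding to a table with 16-bit entries (a real FAT 16 volume reserves a few values, so its limit is slightly lower).

```c
#include <assert.h>

/* Returns the smallest cluster size in bytes (the sector size
   multiplied by a power of two) for which the disk needs no more
   than max_clusters clusters. */
unsigned long min_cluster_size(unsigned long long disk_bytes,
                               unsigned long sector_bytes,
                               unsigned long max_clusters)
{
    unsigned long cluster = sector_bytes;

    assert(sector_bytes > 0 && max_clusters > 0);
    /* Double the cluster until the whole disk can be addressed. */
    while ((disk_bytes + cluster - 1) / cluster > max_clusters)
        cluster *= 2;
    return cluster;
}
```

Under these assumptions a 32 MB disk can still be addressed sector by sector (65,536 × 512 bytes = 32 MB), a 64 MB disk needs 1 KB clusters, and a 2 GB disk needs 32 KB clusters.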

As mentioned earlier, information on disks is written in sectors of fixed length, and the location of each physical record (sector) on the disk is uniquely identified by three numbers: the numbers of the disk surface, the cylinder and the sector on the track. The disk controller works with the disk in exactly these terms. The user, however, wants to work not with sectors, cylinders and surfaces, but with files and directories. Therefore, operations on files and directories must somehow be translated into actions the controller understands: reading and writing particular sectors of the disk. To do this, rules for the translation must be established, that is, first of all, it must be determined how information is to be stored and organized on disks. The set of these rules is called the file system.

A file system is a set of conventions that define the organization of data on storage media. These conventions allow the operating system, other programs and users to work with files and directories, and not just with areas (sectors) of disks. The file system defines:

· how files and directories are stored on the disk;

· what information about files and directories is stored;

· how can you find out which parts of the disk are free and which are not;

· format of directories and other service information on disk.

To use disks formatted with a particular file system, the operating system or a special program must support that file system.

The file system most common on IBM PC-compatible computers was introduced back in the early 1980s in MS DOS 1.0 and 2.0. This file system is quite primitive, since it was created to store data on floppy disks. It is usually called FAT, since its most important data structure is the file allocation table on the disk, abbreviated FAT. This table contains information about which areas (clusters) of the disk are free and about the chains of clusters that form files and directories.

In the FAT file system, file and directory names must be no more than 8 characters long, plus three characters for the name extension. FAT also leads to significant losses (up to 20%) of disk space because of the large cluster sizes on high-capacity disks. This is because the last cluster of each file ends with unused space, on average equal to half a cluster, and on large disks the FAT cluster size can reach 32 KB. Thus, a 2 GB disk holding 20,000 files loses about 320 MB (20,000 × 16 KB), that is, about 16%. Finally, the FAT file system has low performance, especially on large disks, and is poorly suited to multitasking (all operations require access to the file allocation table, so another operation cannot start before the current one is completed).
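The estimate of lost space can be reproduced with two small C functions. The names allocated_bytes and expected_slack are hypothetical; the second function simply encodes the assumption stated in the text that each file wastes half a cluster on average.

```c
#include <assert.h>

/* Space actually occupied by a file on disk: its size rounded
   up to a whole number of clusters. */
unsigned long long allocated_bytes(unsigned long long file_bytes,
                                   unsigned long cluster_bytes)
{
    assert(cluster_bytes > 0);
    return (file_bytes + cluster_bytes - 1) / cluster_bytes
           * cluster_bytes;
}

/* Expected total loss for file_count files, assuming each file
   wastes half a cluster on average. */
unsigned long long expected_slack(unsigned long file_count,
                                  unsigned long cluster_bytes)
{
    return (unsigned long long)file_count * (cluster_bytes / 2);
}
```

For 20,000 files and 32 KB clusters, expected_slack() gives 327,680,000 bytes, that is, 320,000 KB, which is the figure of roughly 320 MB quoted above. The first function shows where the loss comes from: a file of a single byte still occupies a whole 32 KB cluster.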

During development Windows 95 firm Microsoft decided not to introduce a new file system, but to patch the existing file system FAT, allowing you to assign files and directories long names. This file system became known as FAT 32. Adopted in Windows 95 The good thing about this approach is that it allows you to use old disks with a file system FAT- long names simply begin to be written on them. But still, this solution is very artificial, and many programs are for repairing the disk file system, “compressing” disks, backup, etc. - can lead to the loss of long names on disk. FAT 32 Supports smaller cluster sizes, allowing for more efficient use of disk space.

When developing an operating system Windows NT a new file system was created - NTFS. It was aimed at large-capacity disks containing many files, and took significant measures to ensure efficient data storage and access control. This file system supports long file names. On logical drives with a capacity of 1-2 GB, the file system NTFS allows you to store on average 10-15% more information, how FAT. And accessing files in it is noticeably faster, especially in a multitasking environment.

When an NTFS volume is formatted, the formatting program creates the Master File Table (MFT) file and other areas for storing metadata. NTFS uses this metadata to implement the file structure. The first 16 entries in the MFT are reserved by NTFS itself. The location of the metadata files is recorded in the boot sector of the disk. If the first entry in the MFT is damaged, NTFS reads the second entry to find a copy of the first. A complete copy of the boot sector is located at the end of the volume. The MFT stores metadata such as a copy of its first four entries (which guarantees access to the MFT if the first sector is damaged), information about the volume (label and version number), a table of attribute names and descriptions, the root directory, and so on. The remaining MFT entries describe every file and directory located on the volume.

The developers of NTFS, without forgetting about efficiency, also tried to ensure the reliability of the file system and the recoverability of data after failures. In particular, NTFS duplicates all critically important information and records every change on disk in a special log file, remembering for each change how to undo it. As a result, NTFS recovers automatically after almost any failure. Unlike FAT, NTFS can also work with logical drives and files larger than 2 GB: the maximum size of logical drives and files is 4×10^18 bytes.
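The idea of logging each change together with the way to undo it can be shown with a toy example. This is only a sketch of the general technique, not the actual NTFS log format: the class `JournaledStore`, its methods and the in-memory dictionary that stands in for the disk are all assumptions made for the illustration.

```python
class JournaledStore:
    """Key-value store that remembers how to undo each uncommitted change."""

    def __init__(self):
        self.data = {}
        # Each log entry is (key, key_existed_before, old_value).
        self.log = []

    def write(self, key, value):
        # Record the undo information first, then apply the change.
        self.log.append((key, key in self.data, self.data.get(key)))
        self.data[key] = value

    def commit(self):
        # The changes are now permanent; undo information is not needed.
        self.log.clear()

    def rollback(self):
        # Undo the uncommitted changes in reverse order.
        while self.log:
            key, existed, old_value = self.log.pop()
            if existed:
                self.data[key] = old_value
            else:
                self.data.pop(key, None)


store = JournaledStore()
store.write("report.txt", "version 1")
store.commit()
store.write("report.txt", "version 2")   # interrupted before commit
store.write("notes.txt", "draft")
store.rollback()
print(store.data)   # {'report.txt': 'version 1'}
```

After a failure, replaying the undo records returns the store to its last committed state, which is the property the text attributes to the NTFS log.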

Comparative characteristics of the file systems are presented in Table 1. If the file system on a disk is not supported by a given operating system, all the information on that disk is inaccessible (while working in that operating system, of course). Such logical drives may either have no letter assigned at all (that is, the drive cannot be accessed), or any access to the drive will produce an error message.

A special file system has been developed for CDs (CD-ROM). This turned out to be necessary because the physical structure of CDs differs from that of hard drives and floppy disks: information is recorded not in concentric tracks but in a single spiral track (as on audio CDs). This file system is called CDFS.

Table 1

Comparative characteristics of file systems

Supported operating systems
    NTFS: Windows NT with Service Pack 4, Windows 2000, Windows XP
    FAT32 and FAT: MS-DOS, Windows 95 OSR2, Windows 98, Windows Millennium Edition, Windows NT, Windows 2000, Windows XP

Possible logical drive sizes
    NTFS: The recommended minimum size of a logical disk (volume) is approximately 10 MB. Volume sizes larger than 2 TB are allowed. Cannot be used for floppy disks
    FAT32: Logical disk (volume) with a capacity from 512 MB to 2 TB. Can be used for floppy disks
    FAT: Logical disk (volume) up to 4 GB. Can be used for floppy disks

Possible sizes of stored files
    NTFS: The maximum file size is limited only by the volume size
    FAT32: Maximum file size is 4 GB
    FAT: Maximum file size is 2 GB
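The file size limits from Table 1 can be expressed as a small check. This is a sketch under simplifying assumptions: the names `MAX_FILE_SIZE` and `file_fits` are hypothetical, the key "FAT32" stands for the FAT 32 column, the limits are taken from the table as whole gigabytes, and free space already used by other files is ignored.

```python
GB = 1024 ** 3

# Maximum file size per file system, as given in Table 1.
# None means "limited only by the volume size".
MAX_FILE_SIZE = {
    "FAT": 2 * GB,
    "FAT32": 4 * GB,
    "NTFS": None,
}


def file_fits(file_system, file_size, volume_size):
    """Return True if a file of this size can be stored on the volume."""
    limit = MAX_FILE_SIZE[file_system]
    if limit is not None and file_size > limit:
        return False
    return file_size <= volume_size


print(file_fits("FAT", 3 * GB, 4 * GB))       # False
print(file_fits("FAT32", 3 * GB, 100 * GB))   # True
print(file_fits("NTFS", 10 * GB, 100 * GB))   # True
```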





