A description of a four-disk RAID 10 array, and why RAID 5 is not a “must have”


We are faced with the problem that most servers purchased by users of our programs arrive with the disk array already configured as RAID 5. System administrators then either do not want to spend time reconfiguring it or are simply afraid to change anything in a machine that is already set up and working. As a result, the performance of a database hosted on such a server turns out to be lower than on the old machine that had been running at the company for three or four years. The suppliers' preference for RAID level 5 can probably be explained by the desire to impress the customer with a huge amount of disk space, while system administrators often simply lack sufficient knowledge of how a RAID array of one level or another actually works. The purpose of this article is to answer two questions:

Why can't I use RAID 5 for a database server?

How to optimally configure a RAID controller to host a Firebird server database?

Let us immediately make a reservation that the conclusions drawn in this article do not apply to those rare cases when the database is used exclusively (or mainly) for read-only purposes.

How does RAID 5 work?

Let's look at a simplified diagram of how an array of four disks works. One disk is allocated to storing checksums; three are available for data. In the figure below, the disks holding useful data are named A, B and C, and drive D stores the checksums.

The minimum amount of information that the controller reads from or writes to a single disk is called a strip. The settings of most controllers we have encountered specify not the strip size but the stripe size – a block of information that is distributed across all disks of the array. In the figure below, one stripe is highlighted in a darker color:


The stripe size is equal to the strip size multiplied by the number of disks in the array. That is, with four disks and a stripe size of 64K, the minimum amount of information the controller can write to or read from a single disk is 64 / 4 = 16K.

The checksum that goes to disk D is calculated using the following formula:

D = A xor B xor C

Because xor is associative and every value is its own inverse under xor, if one of the data disks fails, its contents can be restored by xoring the data of the remaining disks, including the checksum disk. Suppose, for example, that drive B has failed.


When requesting a block of information from disk B, the controller will restore it using the formula:

B = A xor C xor D
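As a minimal illustration of this property, here is a short Python sketch (not tied to any particular controller): the parity strip is the xor of the data strips, and any single lost strip can be rebuilt by xoring everything that remains. The strip contents are made up for the example.

```python
from functools import reduce
from operator import xor

def xor_strips(*strips: bytes) -> bytes:
    """Byte-wise xor of several equally sized strips."""
    return bytes(reduce(xor, column) for column in zip(*strips))

# Three data strips (disks A, B, C) and the parity strip D = A xor B xor C.
A = bytes([0x11, 0x22, 0x33, 0x44])
B = bytes([0x55, 0x66, 0x77, 0x88])
C = bytes([0x99, 0xAA, 0xBB, 0xCC])
D = xor_strips(A, B, C)          # what the controller stores on the checksum disk

# Disk B fails: rebuild its strip from the survivors, B = A xor C xor D.
rebuilt_B = xor_strips(A, C, D)
assert rebuilt_B == B
```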

The Firebird server exchanges data pages with the disk subsystem. The optimal page size in most cases is 8K, which is much smaller than the stripe size and in most cases even smaller than the strip size. Situations where consecutive pages are written to disk are also quite rare. Thus, if in our example information is written to disk A, the controller has to perform the following operations:

  1. Read strip data from drives B and C. Two read operations.
  2. Calculate a new checksum. Two xor operations.
  3. Write information to disk A and checksum to disk D. Two write operations.

In total: two reads, two writes and two xor operations. It would be surprising if overall performance did not drop with that much work per write. It now becomes obvious why RAID 5 is a poor fit for hosting a database file.
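The same bookkeeping can be sketched in a few lines of Python, following the simplified four-disk scheme described above (a real controller may instead use a read-modify-write of the old data and old parity, but the cost is comparable):

```python
def raid5_small_write_cost(n_disks: int = 4) -> dict:
    """I/O cost of writing a single strip in the simplified scheme above:
    read the sibling data strips, recompute the checksum, write data + checksum."""
    data_disks = n_disks - 1
    return {
        "reads":  data_disks - 1,   # strips B and C in the four-disk example
        "xors":   data_disks - 1,   # combine the new strip with the strips just read
        "writes": 2,                # the data strip itself plus the checksum strip
    }

def raid10_small_write_cost() -> dict:
    """The same logical write on RAID 10: the strip and its mirror copy."""
    return {"reads": 0, "xors": 0, "writes": 2}

print(raid5_small_write_cost())    # {'reads': 2, 'xors': 2, 'writes': 2}
print(raid10_small_write_cost())   # {'reads': 0, 'xors': 0, 'writes': 2}
```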

An important feature of RAID 5 is the significant drop in performance when one of the disks in the array fails: to reconstruct the information from that disk, the data of all the other disks has to be read and xored together.

However, like any rule, ours has an exception. The performance of a RAID 5 array will not degrade if the controller's nonvolatile (battery-backed) cache is comparable in size to the database file. For example, with 512 MB of cache memory it is quite possible to use a level-5 array for databases up to 1-1.5 GB, provided that the server is dedicated to the database and does not perform other tasks.

It is worth noting that the RAID 5 scheme described above is, for methodological reasons, seriously simplified. In reality, the controller distributes the stripes cyclically across all disks of the array, so there is no dedicated checksum disk: every disk stores both data and the checksums of different stripes, which evens out the load on them.

Which RAID level should I choose?

If RAID 5 is not suitable, what level should be chosen to host the database file? With fewer than four disks, the only option is a mirror (RAID 1). If the array has four or more disks, then RAID 10 – a stripe (RAID 0) across several mirrors (RAID 1), sometimes written as RAID 1+0 – is optimal in terms of performance and reliability. The figure below shows a RAID 10 array of four drives; the data of one stripe is highlighted in a dark tone, and the shading marks the duplicate of that stripe.

Note also that while a RAID 5 array can survive the loss of only one disk, a RAID 10 of m two-disk mirrors can survive the loss of anywhere from one to m disks, provided that no more than one disk fails in each mirror.
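A small sketch that checks this rule by brute force for an array of m two-disk mirrors: the array survives as long as no mirror loses both of its disks. The disk numbering is our own convention for the example.

```python
from itertools import combinations

def raid10_survives(m_mirrors: int, failed: set) -> bool:
    """Disks 2*i and 2*i+1 form mirror i; the array survives if every
    mirror still has at least one working disk."""
    return all(not {2 * i, 2 * i + 1} <= failed for i in range(m_mirrors))

m = 2  # four disks, two mirrors
for k in range(1, 2 * m + 1):
    patterns = list(combinations(range(2 * m), k))
    ok = sum(raid10_survives(m, set(p)) for p in patterns)
    print(f"{k} failed disk(s): {ok}/{len(patterns)} failure patterns survive")
# 1 failed disk(s): 4/4 failure patterns survive
# 2 failed disk(s): 4/6 failure patterns survive (the two fatal patterns kill a whole mirror)
```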

Let's try to compare RAID 5 and RAID 10 arrays quantitatively, each built from n disks, where n is even. Take the size of the read/write data block to be equal to the strip size. The table below shows the required number of read, write and xor operations.


It is clear that the RAID 10 array not only has higher write performance, but also avoids the overall performance degradation when one of the drives fails.

How to configure a RAID controller?

Cache size

The bigger, the better. The main thing is that the controller has a battery to preserve the contents of the cache in the event of a power outage. For many controllers the battery is not included as standard and has to be ordered separately. Without a battery, the write cache will be disabled.

RAID level

RAID 10. If the number of disks is less than four, then RAID 1 (mirror). Why? Read the article from the very beginning.

Stripe size

The database page size multiplied by the number of mirrors in the array. For example, if the array has 8 disks, combined into four mirrors of two disks each, and the database page size is 8K, then the stripe size should be set to 8 * 4 = 32K.
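Expressed as a trivial helper (the 8-disk, 8K-page figures are just the example from the text):

```python
def recommended_stripe_kb(page_size_kb: int, total_disks: int) -> int:
    """Rule of thumb above for RAID 10: stripe size = database page size
    multiplied by the number of two-disk mirrors in the array."""
    mirrors = total_disks // 2
    return page_size_kb * mirrors

print(recommended_stripe_kb(page_size_kb=8, total_disks=8))   # 32 (KB)
```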

Read Ahead

Since sequential access to database pages is very rare, and they themselves, as a result of fragmentation, may be located in different places on the disk, read-ahead should be disabled, or the adaptive mode should be used (read-ahead in the case of sequential access to two consecutive pages).

Write cache policy

Select write back mode. The data will be cached and then written to disk. The write operation will be considered complete as soon as the data is placed in the cache.

Spare disk

If the controller allows it, it is recommended to include a spare disk in the array. Such a disk stays in standby during normal operation; if one of the working hard drives fails, the spare is automatically added to the array.

All modern motherboards are equipped with an integrated RAID controller, and top models even have several of them. How much home users actually need integrated RAID controllers is a separate question; in any case, a modern motherboard gives the user the ability to create a RAID array from several disks. However, not every home user knows how to create a RAID array, which array level to choose, or has much idea of the pros and cons of using RAID arrays.
In this article we will give brief recommendations on creating RAID arrays on home PCs and, using a specific example, demonstrate how you can test the performance of a RAID array yourself.

History of creation

The term “RAID array” first appeared in 1987, when the American researchers Patterson, Gibson and Katz from the University of California at Berkeley, in their article “A Case for Redundant Arrays of Inexpensive Disks (RAID)”, described how several low-cost hard drives can be combined into one logical device so that the resulting capacity and performance of the system are increased, while the failure of individual drives does not lead to failure of the entire system.

More than 20 years have passed since that article was published, but the technology of building RAID arrays has not lost its relevance today. The only thing that has changed since then is the meaning of the RAID acronym: since RAID arrays soon ceased to be built on cheap disks at all, the word Inexpensive was changed to Independent, which was closer to the truth.

Operating principle

So, RAID is a redundant array of independent disks (Redundant Array of Independent Disks), whose task is to ensure fault tolerance and increase performance. Fault tolerance is achieved through redundancy: part of the disk capacity is set aside for housekeeping purposes and becomes inaccessible to the user.

Increased performance of the disk subsystem comes from the simultaneous operation of several disks, and in this sense the more disks in the array (up to a certain limit), the better.

The joint operation of disks in an array can be organized using either parallel or independent access. With parallel access, the disk space is divided into blocks (strips) for recording data, and the information to be written is divided into the same blocks. When writing, individual blocks go to different disks, and several blocks are written to different disks simultaneously, which increases performance on write operations. The required information is likewise read in separate blocks simultaneously from several disks, which also increases performance in proportion to the number of disks in the array.

It should be noted that the parallel access model is implemented only if the size of the data write request is larger than the size of the block itself. Otherwise, parallel recording of several blocks is almost impossible. Let's imagine a situation where the size of an individual block is 8 KB, and the size of a request to write data is 64 KB. In this case, the source information is cut into eight blocks of 8 KB each. If you have a four-disk array, you can write four blocks, or 32 KB, at a time. Obviously, in the example considered, the write and read speeds will be four times higher than when using a single disk. This is only true for an ideal situation, but the request size is not always a multiple of the block size and the number of disks in the array.
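The arithmetic of this example can be sketched in a few lines of Python; the round-robin block placement below is the simplest possible model of striping, not a description of any specific controller.

```python
def stripe_request(request_bytes: int, block_bytes: int, n_disks: int) -> dict:
    """Split a request into blocks and assign them round-robin to disks.
    Returns, for every disk, the list of block indices it receives."""
    n_blocks = request_bytes // block_bytes
    layout = {disk: [] for disk in range(n_disks)}
    for block in range(n_blocks):
        layout[block % n_disks].append(block)
    return layout

print(stripe_request(request_bytes=64 * 1024, block_bytes=8 * 1024, n_disks=4))
# {0: [0, 4], 1: [1, 5], 2: [2, 6], 3: [3, 7]}
# Eight 8 KB blocks: four of them (32 KB) can be written in one parallel pass,
# so the whole 64 KB request takes two passes instead of eight single-disk writes.
```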

If the size of the recorded data is less than the block size, then a fundamentally different model is implemented - independent access. Moreover, this model can also be used when the size of the data being written is larger than the size of one block. With independent access, all data from a single request is written to a separate disk, that is, the situation is identical to working with one disk. The advantage of the independent access model is that if several write (read) requests arrive simultaneously, they will all be executed on separate disks independently of each other. This situation is typical, for example, for servers.

Depending on the type of access used, there are different types of RAID arrays, usually characterized by RAID levels. Besides the type of access, RAID levels differ in the way redundant information is placed and generated. Redundant information can either be placed on a dedicated disk or distributed across all disks. There are many ways to generate this information; the simplest is complete duplication (100 percent redundancy), or mirroring. In addition, error correction codes are used, as well as parity calculations.

RAID levels

Currently, there are several RAID levels that can be considered standardized - these are RAID 0, RAID 1, RAID 2, RAID 3, RAID 4, RAID 5 and RAID 6.

Various combinations of RAID levels are also used, which allows you to combine their advantages. Typically this is a combination of some kind of fault-tolerant level and a zero level used to improve performance (RAID 1+0, RAID 0+1, RAID 50).

Note that all modern RAID controllers support the JBOD (Just a Bunch Of Disks) function, which is not intended for creating arrays – it provides the ability to connect individual disks to the RAID controller.

It should be noted that the RAID controllers integrated on motherboards for home PCs do not support all RAID levels. Dual-port RAID controllers only support levels 0 and 1, while RAID controllers with more ports (for example, the 6-port RAID controller integrated into the southbridge of the ICH9R/ICH10R chipset) also support levels 10 and 5.

In addition, motherboards based on Intel chipsets implement the Intel Matrix RAID function, which allows RAID arrays of several levels to be created simultaneously on the same hard drives, with part of the disk space allocated to each of them.

RAID 0

RAID level 0, strictly speaking, is not a redundant array and accordingly does not provide reliable data storage. Nevertheless, this level is actively used where high disk subsystem performance is required. When a RAID level 0 array is created, information is divided into blocks (sometimes called stripes) that are written to separate disks, that is, a system with parallel access is created (if, of course, the block size allows it). By allowing simultaneous I/O on several disks, RAID 0 provides the fastest data transfer and maximum disk space efficiency, since no space is needed for checksums. The implementation of this level is very simple. RAID 0 is mainly used in areas that require fast transfer of large amounts of data.

RAID 1 (Mirrored disk)

RAID Level 1 is an array of two disks with 100 percent redundancy. That is, the data is simply completely duplicated (mirrored), due to which a very high level of reliability (as well as cost) is achieved. Note that to implement level 1, it is not necessary to first partition the disks and data into blocks. In the simplest case, two disks contain the same information and are one logical disk. If one disk fails, its functions are performed by another (which is absolutely transparent to the user). Restoring an array is performed by simple copying. In addition, this level doubles the speed of reading information, since this operation can be performed simultaneously from two disks. This type of information storage scheme is used mainly in cases where the cost of data security is much higher than the cost of implementing a storage system.

RAID 5

RAID 5 is a fault-tolerant disk array with distributed checksum storage. When writing, the data stream is divided into blocks (stripes), which are written to all disks of the array simultaneously in cyclic order.

Suppose the array contains n disks and the stripe size is d. For each portion of n–1 stripes, a checksum p is calculated.

Stripe d1 is written to the first disk, stripe d2 to the second, and so on up to stripe dn–1, which is written to the (n–1)-th disk. Then the checksum pn is written to the n-th disk, and the process repeats cyclically from the first disk, on which stripe dn is written.

The (n–1) stripes and their checksum are written simultaneously to all n disks.

The checksum is calculated using a bitwise exclusive-or (XOR) operation applied to the data blocks being written. So, if there are n hard drives and d is a data block (stripe), the checksum is calculated by the following formula:

pn = d1 xor d2 xor ... xor dn–1.

If any disk fails, the data on it can be restored using the control data and the data remaining on the working disks.

As an illustration, consider blocks of four bits, and let there be five disks for storing data and recording checksums. For the bit sequence 1101 0011 1100 1011, divided into blocks of four bits, the checksum is calculated with the following bitwise operation:

1101 xor 0011 xor 1100 xor 1011 = 1001.

Thus, the checksum written to the fifth disk is 1001.

If one of the disks fails, say the fourth, then the block d4 = 1011 becomes unavailable for reading. However, its value is easily restored from the checksum and the values of the remaining blocks using the same exclusive-or operation:

d4 = d1 xor d2 xor d3 xor p5.

In our example we get:

d4 = 1101 xor 0011 xor 1100 xor 1001 = 1011.
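The same arithmetic, using the exact four-bit blocks of the example, fits in a few lines of Python (integers stand in for the blocks):

```python
# The four data blocks from the example and the checksum computed from them.
d1, d2, d3, d4 = 0b1101, 0b0011, 0b1100, 0b1011
p5 = d1 ^ d2 ^ d3 ^ d4
print(f"{p5:04b}")             # 1001 -> written to the fifth disk

# The fourth disk fails: restore its block from the survivors and the checksum.
restored = d1 ^ d2 ^ d3 ^ p5
print(f"{restored:04b}")       # 1011 == d4
assert restored == d4
```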

In the case of RAID 5, all disks in the array are the same size, but the total capacity of the disk subsystem available for writing becomes exactly one disk smaller. For example, if five disks are 100 GB in size, then the actual size of the array is 400 GB because 100 GB is allocated for control information.

RAID 5 can be built on three or more hard drives. As the number of hard drives in an array increases, its redundancy decreases.

RAID 5 has an independent access architecture, which allows multiple reads or writes to be performed simultaneously.

RAID 10

RAID level 10 is a combination of levels 0 and 1. The minimum requirement for this level is four drives. In a four-drive RAID 10 array the drives are combined in pairs into level 0 arrays, and these two arrays, as logical drives, are combined into a level 1 array. Another approach is also possible: the disks are first combined into mirrored level 1 arrays, and then the logical drives based on these arrays are combined into a level 0 array.

Intel Matrix RAID

The RAID 5 and RAID 1 arrays just described are rarely used at home, primarily because of the high cost of such solutions. Most often, home PCs use a level 0 array on two disks. As we have already noted, RAID 0 does not provide reliable data storage, so end users face a choice: create a fast but unreliable RAID 0 array or, by doubling the cost of disk space, a RAID 1 array that provides reliable storage but no significant performance benefit.

To solve this problem, Intel developed Intel Matrix Storage Technology, which combines the benefits of level 0 and level 1 arrays on just two physical disks. And to emphasize that in this case we are talking not simply about a RAID array but about an array combining both physical and logical disks, the name of the technology uses the word “matrix” instead of “array”.

So, what is a two-disk RAID matrix using Intel Matrix Storage technology? The basic idea is that if the system has several hard drives and a motherboard with an Intel chipset that supports Intel Matrix Storage Technology, it is possible to divide the disk space into several parts, each of which will function as a separate RAID array.

Let's look at a simple example of a RAID matrix consisting of two disks of 120 GB each. Any of the disks can be divided into two logical disks, for example 40 and 80 GB. Next, two logical drives of the same size (for example, 40 GB each) can be combined into a RAID level 1 matrix, and the remaining logical drives into a RAID level 0 matrix.

In principle, using two physical disks, it is also possible to create just one or two RAID level 0 matrices, but it is impossible to obtain only level 1 matrices. That is, if the system has only two disks, then Intel Matrix Storage technology allows you to create the following types of RAID matrices:

  • one level 0 matrix;
  • two level 0 matrices;
  • level 0 matrix and level 1 matrix.

If the system has three hard drives, the following types of RAID matrices can be created:

  • one level 0 matrix;
  • one level 5 matrix;
  • two level 0 matrices;
  • two level 5 matrices;
  • level 0 matrix and level 5 matrix.

If the system has four hard drives, then it is additionally possible to create a RAID matrix of level 10, as well as combinations of level 10 and level 0 or 5.

From theory to practice

When it comes to home computers, the most popular arrays are RAID levels 0 and 1. The use of RAID arrays of three or more disks in home PCs is rather the exception. The reason is that, on the one hand, the cost of a RAID array grows in proportion to the number of disks in it, and on the other hand, for home computers it is the capacity of the disk array that matters most, not its performance and reliability.

Therefore, in what follows we will consider only RAID levels 0 and 1 built from two disks. The objective of our testing will be to compare the performance and functionality of RAID 0 and RAID 1 arrays created on several integrated RAID controllers, and to study how the speed characteristics of a RAID array depend on the stripe size.

The point is that although in theory the read and write speed of a RAID level 0 array should double, in practice the gain is much more modest and varies from one RAID controller to another. The same is true for RAID level 1: although in theory the read speed should double, in practice things are not so smooth.

For our comparative testing of RAID controllers we used a Gigabyte GA-EX58A-UD7 motherboard. This board is based on the Intel X58 Express chipset with the ICH10R southbridge, whose integrated six-port SATA II RAID controller supports RAID levels 0, 1, 10 and 5 with the Intel Matrix RAID function. In addition, the board integrates the GIGABYTE SATA2 RAID controller, which has two SATA II ports and can organize RAID arrays of levels 0 and 1 as well as JBOD.

The GA-EX58A-UD7 board also has an integrated Marvell 9128 SATA III controller, which provides two SATA III ports with the ability to organize RAID arrays of levels 0 and 1 as well as JBOD.

Thus, the Gigabyte GA-EX58A-UD7 board has three separate RAID controllers, on each of which RAID 0 and RAID 1 arrays can be created and compared with one another. Recall that the SATA III standard is backward compatible with SATA II, so the Marvell 9128 controller, which supports SATA III drives, can also be used to create RAID arrays from drives with a SATA II interface.

The testing stand had the following configuration:

  • processor - Intel Core i7-965 Extreme Edition;
  • motherboard - Gigabyte GA-EX58A-UD7;
  • BIOS version - F2a;
  • hard drives - two Western Digital WD1002FBYS drives and one Western Digital WD3200AAKS drive;
  • integrated RAID controllers:
  • ICH10R,
  • GIGABYTE SATA2,
  • Marvell 9128;
  • memory - DDR3-1066;
  • memory capacity - 3 GB (three modules of 1024 MB each);
  • memory operating mode - DDR3-1333, three-channel operating mode;
  • video card - Gigabyte GeForce GTS295;
  • power supply - Tagan 1300W.

Testing was carried out under Microsoft Windows 7 Ultimate (32-bit). The operating system was installed on the Western Digital WD3200AAKS drive, connected to a port of the SATA II controller integrated into the ICH10R southbridge. The RAID arrays were assembled on the two WD1002FBYS drives with a SATA II interface.

To measure the speed characteristics of the created RAID arrays, we used the IOmeter utility, which is the industry standard for measuring the performance of disk systems.

IOmeter utility

Since we intended this article as a kind of user guide for creating and testing RAID arrays, it would be logical to start with a description of the IOmeter (Input/Output meter) utility, which, as we have already noted, is a kind of industry standard for measuring the performance of disk systems. This utility is free and can be downloaded from http://www.iometer.org.

The IOmeter utility is a synthetic benchmark that can work with hard drives that have no logical partitions on them, which makes it possible to test disks independently of the file structure and to reduce the influence of the operating system to zero.

During testing it is possible to create a specific access model, or “pattern”, which defines exactly which operations the hard drive will perform. When creating a specific access model, the following parameters can be changed:

  • size of the data transfer request;
  • random/sequential distribution (in %);
  • distribution of read/write operations (in %);
  • the number of individual I/O operations running in parallel.

The IOmeter utility does not require installation on a computer and consists of two parts: IOmeter itself and Dynamo.

IOmeter is the monitoring part of the program, with a graphical user interface that allows all the necessary settings to be made. Dynamo is a load generator with no interface of its own. Every time you run IOmeter.exe, the Dynamo.exe load generator starts automatically.

To start working with the IOmeter program, just run the IOmeter.exe file. This opens the main window of the IOmeter program (Fig. 1).

Fig. 1. Main window of the IOmeter program

Note that the IOmeter utility can test not only local disk systems (DAS) but also network-attached storage (NAS). For example, it can be used to test the performance of a file server's disk subsystem using several network clients. That is why some of the tabs and tools in the IOmeter window relate specifically to the program's network settings. Clearly, we will not need these capabilities when testing disks and RAID arrays, so we will not explain the purpose of every tab and tool.

So, when you start the IOmeter program, a tree structure of all running load generators (Dynamo instances) will be displayed on the left side of the main window (in the Topology window). Each running Dynamo load generator instance is called a manager. Additionally, the IOmeter program is multi-threaded and each individual thread running on a Dynamo load generator instance is called a Worker. The number of running Workers always corresponds to the number of logical processor cores.

In our example, we use only one computer with a quad-core processor that supports Hyper-Threading technology, so only one manager (one instance of Dynamo) and eight (according to the number of logical processor cores) Workers are launched.

Actually, to test disks in this window there is no need to change or add anything.

If you select the computer name in the tree of running Dynamo instances, the Disk Targets tab of the Target window displays all disks, disk arrays and other drives (including network drives) installed in the computer. These are the drives IOmeter can work with. The media may be marked in yellow or blue: logical partitions of media are marked in yellow, and physical devices without logical partitions are marked in blue. A logical partition may or may not be crossed out. The point is that before the program can work with a logical partition, the partition must first be prepared by creating a special file on it equal in size to the capacity of the entire partition. If the logical partition is crossed out, it has not yet been prepared for testing (it will be prepared automatically at the first stage of testing); if it is not crossed out, a file completely ready for testing has already been created on it.

Note that although the program can work with logical partitions, it is best to test drives that are not divided into logical partitions. Deleting a logical disk partition is very simple – through the Disk Management snap-in. To access it, right-click the Computer icon on the desktop and select Manage in the menu that opens. In the Computer Management window, select Storage on the left side, and within it Disk Management. After that, the right side of the Computer Management window will show all connected drives. Right-click the desired drive, select Delete Volume... in the menu that opens, and the logical partition will be deleted from the physical disk. Remember that deleting a logical partition deletes all the information on it without any possibility of recovery.

In general, using the IOmeter utility you can only test blank disks or disk arrays. That is, you cannot test a disk or disk array on which the operating system is installed.

So, let's return to the description of the IOmeter utility. On the Disk Targets tab of the Target window, select the disk (or disk array) to be tested. Then open the Access Specifications tab (Fig. 2), where the testing scenario can be defined.

Fig. 2. Access Specifications tab of the IOmeter utility

The Global Access Specifications window contains a list of predefined test scenarios that can be assigned to the load manager. We will not need these scenarios, so all of them can be selected and deleted (there is a Delete button for this). Then click the New button to create a new test scenario. In the Edit Access Specification window that opens, you can define the load scenario for a disk or RAID array.

Suppose we want to find out how the speed of sequential (linear) reading and writing depends on the size of the data transfer request block. To do this, we need to create a series of load scenarios in sequential read mode with different block sizes, and then a series of load scenarios in sequential write mode with different block sizes. Block sizes are usually chosen as a series in which each member is twice the previous one, starting from 512 bytes; that is: 512 bytes, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512 KB and 1 MB. There is no point in making the block size larger than 1 MB for sequential operations, since at such large block sizes the speed of sequential operations no longer changes.

So, let's create a load scenario in sequential read mode with a block size of 512 bytes.

In the Name field of the Edit Access Specification window, enter a name for the load scenario, for example Sequential_Read_512. Then, in the Transfer Request Size field, set the data block size to 512 bytes. Move the Percent Random/Sequential Distribution slider (the ratio between sequential and random operations) all the way to the left, so that all operations are sequential, and move the slider that sets the ratio between read and write operations all the way to the right, so that all operations are reads. The other parameters in the Edit Access Specification window do not need to be changed (Fig. 3).

Fig. 3. The Edit Access Specification window when creating a sequential read load scenario with a data block size of 512 bytes

Click OK, and the first scenario we created will appear in the Global Access Specifications window on the Access Specifications tab of the IOmeter utility.

Scenarios for the remaining block sizes are created in the same way, but to make the work easier it is better not to create each scenario from scratch with the New button; instead, select the last scenario created and click Edit Copy. The Edit Access Specification window will open again with the settings of that scenario, and it is enough to change only the name and the block size. After doing this for all the other block sizes, you can start creating the sequential write scenarios in exactly the same way, except that the Percent Read/Write Distribution slider, which sets the ratio between read and write operations, must be moved all the way to the left.
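Before clicking through a dozen Edit Copy dialogs, it may help to generate the full list of scenario names and block sizes in advance, for example with a few lines of Python. The naming below simply extends the Sequential_Read_512 example from the text; IOmeter itself does not require any particular naming convention.

```python
# Block sizes: 512 bytes, then doubling up to 1 MB (12 sizes in total).
block_sizes = [512 * 2 ** i for i in range(12)]

def label(size_bytes: int) -> str:
    if size_bytes >= 2 ** 20:
        return f"{size_bytes // 2 ** 20}MB"
    if size_bytes >= 1024:
        return f"{size_bytes // 1024}KB"
    return f"{size_bytes}B"

scenarios = [f"Sequential_{op}_{label(s)}"
             for op in ("Read", "Write")
             for s in block_sizes]
print(len(scenarios))      # 24 scenarios to enter in the Edit Access Specification dialog
print(scenarios[:3])       # ['Sequential_Read_512B', 'Sequential_Read_1KB', 'Sequential_Read_2KB']
```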

Similarly, you can create scenarios for random (selective) reading and writing.

Once all the scenarios are ready, they need to be assigned to the load manager, that is, you need to specify which scenarios Dynamo will run.

To do this, check once again that the computer name (that is, the load manager on the local PC) is highlighted in the Topology window, and not an individual Worker; this ensures that the load scenarios are assigned to all Workers at once. Then, in the Global Access Specifications window, select all the load scenarios we created and click the Add button. All the selected scenarios will be added to the Assigned Access Specifications window (Fig. 4).

Fig. 4. Assigning the created load scenarios to the load manager

After this, go to the Test Setup tab (Fig. 5), where you can set the execution time of each scenario we created. To do this, set the execution time of a load scenario in the Run Time group; 3 minutes is quite enough.

Fig. 5. Setting the execution time of a load scenario

In addition, in the Test Description field you should specify the name of the whole test. In principle, this tab has many other settings, but they are not needed for our purposes.

After all the necessary settings have been made, it is recommended to save the created test by clicking on the button with the image of a floppy disk on the toolbar. The test is saved with the extension *.icf. Subsequently, you can use the created load scenario by running not the IOmeter.exe file, but the saved file with the *.icf extension.

Now you can start testing directly by clicking on the button with a flag. You will be asked to specify the name of the file containing the test results and select its location. Test results are saved in a CSV file, which can then be easily exported to Excel and, by setting a filter on the first column, select the desired data with test results.
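If Excel is not at hand, the same first-column filtering can be done with a short script. Note that the exact layout of the IOmeter results CSV differs between versions, so the marker value below is a placeholder to adjust after inspecting the file.

```python
import csv

def filter_rows(path: str, first_column_value: str) -> list:
    """Return the rows of the results CSV whose first column equals the given marker."""
    with open(path, newline="") as f:
        return [row for row in csv.reader(f) if row and row[0] == first_column_value]

# Hypothetical usage: keep only the rows of interest.
# rows = filter_rows("results.csv", "SOME_MARKER")   # pick the marker after looking at the file
```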

During testing, intermediate results can be seen on the Results Display tab, and the Access Specifications tab shows which load scenario they belong to: in the Assigned Access Specifications window the running scenario is shown in green, completed scenarios in red, and scenarios not yet run in blue.

So, we have covered the basic techniques for working with the IOmeter utility that are needed to test individual disks or RAID arrays. Note that we have not described all of IOmeter's capabilities – that is beyond the scope of this article.

Creating a RAID array based on the GIGABYTE SATA2 controller

So, we begin by creating a RAID array of two disks using the GIGABYTE SATA2 RAID controller integrated on the board. Of course, Gigabyte itself does not produce chips, so under the GIGABYTE SATA2 name hides a relabeled chip from another company; as the driver INF file reveals, it is a JMicron JMB36x series controller.

Access to the controller setup menu is possible at the system boot stage, for which you need to press the Ctrl+G key combination when the corresponding inscription appears on the screen. Naturally, first in BIOS settings you need to define the operating mode of two SATA ports belonging to the GIGABYTE SATA2 controller as RAID (otherwise access to the RAID array configurator menu will not be possible).

The setup menu of the GIGABYTE SATA2 RAID controller is quite simple. As we have already noted, the controller has two ports and allows RAID arrays of level 0 or 1 to be created. Through the controller settings menu you can delete or create a RAID array. When creating an array, you can specify its name, select its level (0 or 1), set the stripe size for RAID 0 (128, 64, 32, 16, 8 or 4K), and determine the size of the array.

Once the array is created, no changes to it are possible; that is, you cannot later change, for example, its level or stripe size. To do that you first have to delete the array (losing the data) and then create it anew. This is not unique to the GIGABYTE SATA2 controller: the inability to change the parameters of an already created RAID array is a property of all controllers and follows from the very principle of how a RAID array is implemented.

Once an array based on the GIGABYTE SATA2 controller has been created, its current information can be viewed using the GIGABYTE RAID Configurer utility, which is installed automatically along with the driver.

Creating a RAID array based on the Marvell 9128 controller

The Marvell 9128 RAID controller can be configured only through the BIOS settings of the Gigabyte GA-EX58A-UD7 board. It must be said that the Marvell 9128 configurator menu is somewhat crude and can mislead inexperienced users; we will return to these minor shortcomings a little later, but for now let's look at the main functionality of the Marvell 9128 controller.

So, although this controller supports SATA III drives, it is also fully compatible with SATA II drives.

The Marvell 9128 controller allows a RAID array of level 0 or 1 to be created from two disks. For a level 0 array you can set the stripe size to 32 or 64 KB and specify the name of the array. There is also an option called Gigabyte Rounding, which needs explanation. Despite the name's similarity to the name of the board manufacturer, the Gigabyte Rounding function has nothing to do with it. Nor is it in any way related to RAID level 0 arrays, even though the controller settings let you define it for an array of that level – this is the first of the configurator shortcomings we mentioned. Gigabyte Rounding is meaningful only for RAID level 1: it allows the use of two drives (for example, from different manufacturers or of different models) whose capacities differ slightly from each other. In effect, the function sets the allowed difference in size between the two disks used to build the RAID level 1 array; in the Marvell 9128 controller it can be set to 1 or 10 GB.

Another flaw in the Marvell 9128 controller configurator is that when creating a RAID level 1 array, the user has the ability to select the stripe size (32 or 64 KB). However, the concept of stripe is not defined at all for RAID level 1.

Creating a RAID array based on the controller integrated into the ICH10R

The RAID controller integrated into the ICH10R southbridge is the most common. As already noted, this RAID controller is 6-port and supports not only the creation of RAID 0 and RAID 1 arrays, but also RAID 5 and RAID 10.

Access to the controller setup menu is possible at the system boot stage, for which you need to press the key combination Ctrl + I when the corresponding inscription appears on the screen. Naturally, first in the BIOS settings you should define the operating mode of this controller as RAID (otherwise access to the RAID array configurator menu will be impossible).

The RAID controller setup menu is quite simple. Through it you can delete or create a RAID array. When creating an array, you can specify its name, select its level (0, 1, 5 or 10), set the stripe size for RAID 0 (128, 64, 32, 16, 8 or 4K), and determine the size of the array.

RAID performance comparison

To test RAID arrays using the IOmeter utility, we created sequential read, sequential write, selective read, and selective write load scenarios. The data block sizes in each load scenario were as follows: 512 bytes, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512 KB, 1 MB.

On each of the RAID controllers, we created a RAID 0 array with all allowable stripe sizes and a RAID 1 array. In addition, in order to be able to evaluate the performance gain obtained from using a RAID array, we also tested a single disk on each of the RAID controllers.

So, let's look at the results of our testing.

GIGABYTE SATA2 Controller

First of all, let's look at the results of testing RAID arrays based on the GIGABYTE SATA2 controller (Fig. 6-13). In general, this controller turned out to be something of a mystery, and its performance was simply disappointing.

Fig. 6. Speed of sequential and selective operations for a Western Digital WD1002FBYS disk

Fig. 7. Speed of sequential and selective operations for RAID 0 with a stripe size of 128 KB (GIGABYTE SATA2 controller)

Fig. 12. Speed of sequential and selective operations for RAID 0 with a stripe size of 4 KB (GIGABYTE SATA2 controller)

Fig. 13. Speed of sequential and selective operations for RAID 1 (GIGABYTE SATA2 controller)

If you look at the speed characteristics of one disk (without a RAID array), the maximum sequential read speed is 102 MB/s, and the maximum sequential write speed is 107 MB/s.

When creating a RAID 0 array with a stripe size of 128 KB, the maximum sequential read and write speed increases to 125 MB/s, an increase of approximately 22%.

With stripe sizes of 64, 32, or 16 KB, the maximum sequential read speed is 130 MB/s, and the maximum sequential write speed is 141 MB/s. That is, with the specified stripe sizes, the maximum sequential read speed increases by 27%, and the maximum sequential write speed increases by 31%.

In fact, this is not much for a level 0 array, and one would like the maximum speed of sequential operations to be higher.

With a stripe size of 8 KB, the maximum speed of sequential operations (reading and writing) remains approximately the same as with a stripe size of 64, 32 or 16 KB, however, there are obvious problems with selective reading. As the data block size increases up to 128 KB, the selective read speed (as it should) increases in proportion to the data block size. However, when the data block size is more than 128 KB, the selective read speed drops to almost zero (to approximately 0.1 MB/s).

With a stripe size of 4 KB, not only the selective read speed drops when the block size is more than 128 KB, but also the sequential read speed when the block size is more than 16 KB.

Using a RAID 1 array on a GIGABYTE SATA2 controller does not significantly change the sequential read speed (compared to a single drive), but the maximum sequential write speed is reduced to 75 MB/s. Recall that for a RAID 1 array, the read speed should increase, and the write speed should not decrease compared to the read and write speed of a single disk.

Based on the results of testing the GIGABYTE SATA2 controller, only one conclusion can be drawn. It makes sense to use this controller to create RAID 0 and RAID 1 arrays only if all other RAID controllers (Marvell 9128, ICH10R) are already used. Although it is quite difficult to imagine such a situation.

Marvell 9128 controller

The Marvell 9128 controller demonstrated much higher speed characteristics compared to the GIGABYTE SATA2 controller (Fig. 14-17). In fact, the differences appear even when the controller operates with one disk. If for the GIGABYTE SATA2 controller the maximum sequential read speed is 102 MB/s and is achieved with a data block size of 128 KB, then for the Marvell 9128 controller the maximum sequential read speed is 107 MB/s and is achieved with a data block size of 16 KB.

When creating a RAID 0 array with stripe sizes of 64 and 32 KB, the maximum sequential read speed increases to 211 MB/s, and sequential write speed increases to 185 MB/s. That is, with the specified stripe sizes, the maximum sequential read speed increases by 97%, and the maximum sequential write speed increases by 73%.

There is no significant difference in the speed performance of a RAID 0 array with a stripe size of 32 and 64 KB, however, the use of a 32 KB stripe is more preferable, since in this case the speed of sequential operations with a block size of less than 128 KB will be slightly higher.

When creating a RAID 1 array on a Marvell 9128 controller, the maximum sequential operation speed remains virtually unchanged compared to a single disk. So, if for a single disk the maximum speed of sequential operations is 107 MB/s, then for RAID 1 it is 105 MB/s. Also note that for RAID 1, selective read performance degrades slightly.

In general, it should be noted that the Marvell 9128 controller has good speed characteristics and can be used both to create RAID arrays and to connect single disks to it.

Controller ICH10R

The RAID controller built into the ICH10R turned out to be the fastest of all those we tested (Fig. 18-25). When working with a single drive (without a RAID array), its performance is virtually the same as that of the Marvell 9128 controller: the maximum sequential read and write speed is 107 MB/s and is reached at a data block size of 16 KB.

Fig. 18. Speed of sequential and selective operations for a Western Digital WD1002FBYS disk (ICH10R controller)

As for the RAID 0 array on the ICH10R controller, the maximum sequential read and write speed does not depend on the stripe size and is 212 MB/s; the stripe size affects only the data block size at which that maximum is reached. The test results show that for RAID 0 on the ICH10R controller it is optimal to use a 64 KB stripe: in that case the maximum sequential read and write speed is reached at a data block size of only 16 KB.

So, to summarize, we once again emphasize that the RAID controller built into the ICH10R significantly exceeds all other integrated RAID controllers in performance. And given that it also has greater functionality, it is optimal to use this particular controller and simply forget about the existence of all the others (unless, of course, the system uses SATA III drives).

There are a lot of articles on the Internet describing RAID; this one, for example, describes everything in great detail. But, as usual, there is no time to read everything, so you need something short to decide whether you need it at all and what is better to use for work with a DBMS (InterBase, Firebird or anything else – it does not really matter). What you have before you is exactly that kind of material.

To a first approximation, RAID is the combination of several disks into one array. SATA, SAS, SCSI, SSD – it does not matter. Moreover, almost every normal motherboard now supports SATA RAID. Let's go through the list of what RAID levels exist and why. (Note right away that identical drives should be combined into a RAID; combining disks from different manufacturers, different models from the same manufacturer, or disks of different sizes is an indulgence for someone sitting at a home computer.)

RAID 0 (Stripe)

Roughly speaking, this is the sequential combination of two (or more) physical disks into one "physical" disk. It is suitable only for building huge disk spaces, for example for people who work with video editing. There is no point in keeping databases on such disks: indeed, if your database is 50 gigabytes, why did you buy two 40 gigabyte disks instead of one 80 gigabyte disk? The worst thing is that in RAID 0 the failure of either disk makes the whole array completely inoperable, because data is written alternately to both disks and RAID 0 has no means of recovery after a failure.

Of course, RAID 0 provides faster performance due to read/write striping.

RAID 0 is often used to host temporary files.

RAID 1 (Mirror)

Disk mirroring. If Shadow in IB/FB is software mirroring (see Operations Guide.pdf), then RAID 1 is hardware mirroring, and nothing more. Heaven forbid you use software mirroring done by the OS or third-party software: it has to be either a "hardware" RAID 1 or a shadow.

If a failure occurs, carefully check which disk has actually failed. The most common cause of data loss on RAID 1 is incorrect actions during recovery (the wrong disk is specified as the "good" one).

As for performance: the gain on writes is zero, on reads it may be up to 1.5x, since reading can be done "in parallel" (alternately from the two disks). For a database the speed-up is small, but when different (!) parts (files) on the disk are accessed in parallel there will definitely be a speed-up.

RAID 1+0

By RAID 1+0 they mean the RAID 10 option, when two RAID 1s are combined into RAID 0. The option when two RAID 0s are combined into RAID 1 is called RAID 0+1, and “outside” it is the same RAID 10.

RAID 2-3-4

These RAID levels are rare because they use Hamming codes, or byte-level striping plus checksums, and so on. The general summary: these levels provide only reliability, with zero performance gain and sometimes even a performance loss.

RAID 5

It requires a minimum of 3 disks. Parity data is distributed across all disks in the array.

It is usually said that "RAID5 uses independent disk access, so requests to different disks can be executed in parallel." It should be borne in mind that we are, of course, talking about parallel queries for input/output. If such requests go sequentially (in SuperServer), then of course you will not get the effect of parallelizing access on RAID 5. Of course, RAID5 will give a performance boost if the operating system and other applications work with the array (for example, it will contain virtual memory, TEMP, etc.).

In general, RAID 5 used to be the most commonly used disk array for working with a DBMS. Nowadays such an array can be built on SATA drives, and it will be significantly cheaper than on SCSI; prices and controllers can be seen in the articles below. Also pay attention to the capacity of the disks you buy: in one of the articles mentioned, for example, a RAID 5 is assembled from four 34 gigabyte disks, and the resulting "disk" is 103 gigabytes.

Testing five SATA RAID controllers - http://www.thg.ru/storage/20051102/index.html.

Adaptec SATA RAID 21610SA in RAID 5 arrays - http://www.ixbt.com/storage/adaptec21610raid5.shtml.

Why RAID 5 is bad - https://geektimes.ru/post/78311/

Attention! When purchasing disks for RAID5, they usually take 3 disks, at a minimum (most likely because of the price). If suddenly, over time, one of the disks fails, then a situation may arise when it is not possible to purchase a disk similar to the ones used (no longer produced, temporarily out of stock, etc.). Therefore, a more interesting idea seems to be purchasing 4 disks, organizing a RAID5 of three, and connecting the 4th disk as a backup (for backups, other files and other needs).

The volume of a RAID5 disk array is calculated using the formula (n-1)*hddsize, where n is the number of disks in the array, and hddsize is the size of one disk. For example, for an array of 4 disks of 80 gigabytes, the total volume will be 240 gigabytes.
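The same formula, together with the analogous ones for the other levels discussed in this text, as a small helper (capacities are in whatever unit you pass in):

```python
def usable_capacity(level: str, n_disks: int, disk_size: float) -> float:
    """Usable capacity of an array built from n identical disks.
    RAID 0: everything; RAID 1/10: half; RAID 5: (n-1)*size; RAID 6: (n-2)*size."""
    if level == "raid0":
        return n_disks * disk_size
    if level in ("raid1", "raid10"):
        return n_disks * disk_size / 2
    if level == "raid5":
        return (n_disks - 1) * disk_size
    if level == "raid6":
        return (n_disks - 2) * disk_size
    raise ValueError(f"unknown level: {level}")

print(usable_capacity("raid5", n_disks=4, disk_size=80))   # 240.0 GB, as in the example
```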

There is a question about the "unsuitability" of RAID 5 for databases. At the very least, it can be viewed from the standpoint that to get good RAID 5 performance you need a specialized controller, not whatever comes by default on the motherboard.

Article RAID-5 must die. And more about data loss on RAID5.

Note. As of 09/05/2005, the cost of a Hitachi 80Gb SATA drive is $60.

RAID 10, 50

Next come combinations of the listed options. For example, RAID 10 is RAID 0 + RAID 1. RAID 50 is RAID 5 + RAID 0.

Interestingly, the RAID 0+1 combination turns out to be worse in terms of reliability than RAID 5. The database repair service has on record a case of a single disk failing in a RAID 0 (3 disks) + RAID 1 (3 more of the same disks) system; the RAID 1 could not bring the mirrored half back up, and the database was damaged beyond any chance of repair.

RAID 0+1 requires 4 drives, and RAID 5 requires 3. Think about it.

RAID 6

Unlike RAID 5, which uses parity to protect data against single failures, RAID 6 uses the same kind of parity to protect against double failures. Accordingly, it needs a more powerful processor than RAID 5 and requires not 3 but at least 5 disks (three data disks and 2 parity disks). Moreover, the number of disks in RAID 6 does not have the same flexibility as in RAID 5 and must be equal to a prime number (5, 7, 11, 13 and so on).

That is, it covers the case when two disks fail at the same time – although such a case is very rare.

I have not seen (and have not looked for) data on RAID 6 performance, but it may well be that, because of the extra checksum handling, performance ends up at about the level of RAID 5.

Rebuild time

Any RAID array that remains operational if one drive fails has a concept called rebuild time. Of course, when you replace a dead disk with a new one, the controller must organize the functioning of the new disk in the array, and this will take some time.

While a new disk is being "connected", for RAID 5 for example, the controller can keep the array available. But the speed of the array in this case will be very low, if only because even when the new disk is filled "linearly" with information, writing to it will distract the controller and the disk heads from the synchronization operations with the rest of the array's disks.

The time it takes for the array to return to normal operation depends directly on the size of the disks. For example, a Sun StorEdge 3510 FC Array with an array size of 2 terabytes does a rebuild in exclusive mode in 4.5 hours (with a hardware price of about $40,000). So when building an array and planning disaster recovery, you need to think about the rebuild time first of all. If your database and backups take up no more than 50 gigabytes and the growth per year is 1-2 gigabytes, it hardly makes sense to assemble an array of 500 gigabyte disks. 250 GB disks will be enough, and even with RAID 5 that gives at least 500 GB of space – enough to hold not only the database but also movies. And the rebuild time for 250 GB disks will be roughly half that of 500 GB disks.

Summary

It turns out that the most sensible choice is either RAID 1 or RAID 5. However, the most common mistake, made by almost everyone, is to use RAID "for everything": they install a RAID, pile everything they have onto it, and at best get reliability, but no performance improvement.

The write cache is also often left disabled, with the result that writing to the RAID is slower than writing to an ordinary single disk. For most controllers this option is off by default, because it is considered that enabling it requires at least a battery on the RAID controller, as well as a UPS.

The old recommendation, now declared obsolete, was as follows: the old hddspeed.htm article (and doc_calford_1.htm) shows how significant performance gains can be obtained by using several physical disks, even IDE ones; accordingly, if you set up a RAID, put the database on it and keep everything else (temp, the OS, the page file) on other hard drives, since the RAID itself is still one "disk", even if a more reliable and faster one.

All of the above has a right to exist on RAID 5. However, before such a placement you need to find out how you will back up and restore the operating system and how long that will take, how long it will take to restore a "dead" disk, whether a replacement disk is (or will be) at hand, and so on – that is, you need to know in advance the answers to the most basic questions in case of a system failure.

I still recommend keeping the operating system on a separate SATA drive or, if you prefer, on two SATA drives combined in a RAID 1. In any case, when placing the operating system on a RAID, you must plan what you will do if the motherboard suddenly stops working: sometimes moving the drives of a RAID array to another motherboard (chipset, RAID controller) is impossible due to incompatibility of the default RAID parameters.

Placement of the base, shadow and backup

Despite all the advantages of RAID, it is strongly discouraged, for example, to make backups onto the same logical drive. Not only does this hurt performance, it can also lead to running out of free space (on large databases): depending on the data, the backup file can be as large as the database, or even larger. Backing up to the same physical disk is still tolerable, although the best option is a backup to a separate hard drive.

The explanation is very simple. Backup is reading data from a database file and writing to a backup file. If all of this is physically happening on one drive (even RAID 0 or RAID 1), then performance will be worse than if reading from one drive and writing to another. The benefit from this separation is even greater when backup is done while users are working with the database.

The same applies to shadow - there is no point in putting shadow, for example, on RAID 1, in the same place as the database, even on different logical drives. If shadow is present, the server writes data pages to both the database file and the shadow file. That is, instead of one write operation, two are performed. When dividing the base and shadow across different physical disks, write performance will be determined by the slowest disk.

Now let's see what types there are and how they differ.

The University of California at Berkeley introduced the following levels of the RAID specification, which have been adopted as the de facto standard:

  • RAID 0 - a high-performance striped disk array, without fault tolerance;
  • RAID 1 - a mirrored disk array;
  • RAID 2 - reserved for arrays that use Hamming code;
  • RAID 3 and 4 - striped disk arrays with a dedicated parity disk;
  • RAID 5 - a striped disk array with a “non-dedicated parity disk” (parity is distributed across all disks);
  • RAID 6 - a striped disk array using two checksums calculated in two independent ways;
  • RAID 10 - a RAID 0 array built from RAID 1 arrays;
  • RAID 50 - a RAID 0 array built from RAID 5 arrays;
  • RAID 60 - a RAID 0 array built from RAID 6 arrays.

A hardware RAID controller can support several different RAID arrays simultaneously, as long as the total number of hard drives does not exceed the number of connectors for them. At the same time, a controller built into the motherboard often has only two states in the BIOS settings (enabled or disabled), so a new hard drive connected to an unused connector of a controller in RAID mode may be ignored by the system until it is defined as another JBOD (spanned) array consisting of a single disk.

RAID 0 (striping - “alternation”)

The mode that gives maximum performance. Data is distributed evenly across the disks of the array; the disks are combined into one volume, which can in turn be divided into several. Distributed read and write operations significantly increase speed, since several disks read/write their portion of the data simultaneously. The user has the full combined capacity of the disks available, but this reduces the reliability of data storage: if one of the disks fails, the array is usually destroyed and data recovery is practically impossible. Typical applications are those that require high disk throughput, such as video capture and video editing. Recommended for use with highly reliable drives.
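
The striping idea can be shown with a tiny sketch that maps a logical block number to a disk and to an offset on that disk; the disk count is an arbitrary assumption for illustration.

    # RAID 0 striping sketch: logical blocks are handed out to disks round-robin.
    N_DISKS = 4

    def locate(logical_block: int) -> tuple[int, int]:
        """Return (disk index, block offset on that disk) for a logical block."""
        return logical_block % N_DISKS, logical_block // N_DISKS

    for block in range(8):
        print(block, "->", locate(block))
    # Blocks 0..3 land on disks 0..3, block 4 returns to disk 0, and so on,
    # so sequential transfers are spread across all four spindles.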

RAID 1 (mirroring - “mirroring”)

An array of two disks that are complete copies of each other. Not to be confused with RAID 1+0, RAID 0+1 and RAID 10 arrays, which use more than two drives and more complex mirroring mechanisms.

Provides acceptable write speed and gains in read speed when parallelizing queries.

It has high reliability: it keeps working as long as at least one disk in the array is functioning. The probability of both disks failing at once equals the product of the probabilities of failure of each disk, i.e. it is significantly lower than the probability of failure of a single disk. In practice, if one of the disks fails, immediate action must be taken to restore redundancy; for this, hot-spare disks are recommended with any RAID level (except zero).
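
A minimal sketch of this reliability argument, with an arbitrary illustrative failure probability (not a measured value) and the assumption that the two disks fail independently:

    p_disk = 0.03                  # assumed chance that a given disk fails within a year

    p_single = p_disk              # a plain single disk
    p_mirror = p_disk * p_disk     # both members of the mirror fail

    print(f"single disk: {p_single:.4f}")   # 0.0300
    print(f"RAID 1 pair: {p_mirror:.4f}")   # 0.0009
    # In reality failures can be correlated (same batch, same power event),
    # so treat the product rule as an optimistic estimate.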

A variant of data distribution across disks similar to RAID 10 that allows an odd number of disks to be used (the minimum is 3); this layout is usually called RAID 1E.

RAID 2, 3, 4

Various distributed data storage schemes with disks dedicated to parity codes and with different block sizes. They are currently practically unused because of their low performance and the need to dedicate a large share of disk capacity to storing ECC and/or parity codes.

The main disadvantage of RAID levels 2 to 4 is the inability to perform parallel write operations, since a separate, dedicated disk is used to store the parity information. RAID 5 does not have this disadvantage: data blocks and checksums are written cyclically to all disks of the array, and there is no asymmetry in the disk configuration. A checksum here means the result of an XOR (exclusive or) operation. XOR has a property that makes it possible to replace any operand with the result and, by applying XOR again, obtain the missing operand. For example, if a xor b = c (where a, b and c are three disks of the array) and a fails, we can recover it by putting c in its place and XOR-ing c with b: c xor b = a. The same holds regardless of the number of operands: a xor b xor c xor d = e; if c fails, e takes its place, and performing the XOR gives us c back: a xor b xor e xor d = c. This method is essentially what gives level 5 its fault tolerance. Storing the XOR result requires only one disk, whose size is equal to the size of any other disk in the array.
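
The recovery trick is easy to demonstrate with a minimal sketch that uses small byte strings in place of whole disks (the data itself is random and purely illustrative):

    import os

    def xor_bytes(*blocks: bytes) -> bytes:
        """XOR several equal-length byte blocks together."""
        result = bytearray(len(blocks[0]))
        for block in blocks:
            for i, byte in enumerate(block):
                result[i] ^= byte
        return bytes(result)

    strip = 16                                        # tiny strip size, for the demo
    a, b, c = (os.urandom(strip) for _ in range(3))   # "data disks"
    d = xor_bytes(a, b, c)                            # "parity disk": d = a xor b xor c

    # Disk B "fails"; rebuild it from the survivors and the parity.
    recovered_b = xor_bytes(a, c, d)
    assert recovered_b == b
    print("disk B recovered successfully")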

Advantages

RAID 5 has become widespread primarily because of its cost-effectiveness. The capacity of a RAID 5 disk array is calculated by the formula (n-1)*hddsize, where n is the number of disks in the array and hddsize is the size of the smallest disk. For example, for an array of four 80-gigabyte disks the total usable volume is (4 - 1) * 80 = 240 gigabytes. Writing to a RAID 5 volume requires additional resources and performance drops, since extra calculations and write operations are needed; when reading, however (compared to a single hard drive), there is a gain, because the data streams from several disks of the array can be processed in parallel.
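
The capacity formula can be written as a small helper (the disk sizes below are just examples):

    def raid5_capacity(disk_sizes_gb: list[int]) -> int:
        """Usable RAID 5 capacity: (n - 1) * size of the smallest disk."""
        return (len(disk_sizes_gb) - 1) * min(disk_sizes_gb)

    print(raid5_capacity([80, 80, 80, 80]))   # 240, matching the example above
    print(raid5_capacity([250, 250, 250]))    # 500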

Flaws

The performance of RAID 5 is noticeably lower, especially on operations such as random write, where performance drops by 10-25% compared to RAID 0 (or RAID 10), because more disk operations are required: every write, with the exception of so-called full-stripe writes, is turned by the RAID controller into four operations - two reads and two writes. The disadvantages of RAID 5 also show up when one of the disks fails: the entire volume goes into degraded mode, all write and read operations are accompanied by additional manipulations, and performance drops sharply. The reliability level then falls to that of RAID 0 with the corresponding number of disks, i.e. n times lower than the reliability of a single disk. If, before the array is fully restored, another disk fails or an unrecoverable read error occurs on at least one more disk, the array is destroyed and the data on it cannot be recovered by conventional means. It should also be taken into account that the RAID reconstruction process (restoring the data through redundancy) after a disk failure places an intensive read load on the disks for many hours continuously, which can cause one of the remaining disks to fail during this least protected period of the array's operation, and can also expose previously undetected read errors in cold data (data that is not accessed during normal operation of the array, archived and inactive data), which increases the risk of failure during recovery.

The minimum number of disks used is three.
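
The four-operation write penalty described above translates into a simple back-of-the-envelope IOPS estimate. The per-disk IOPS figure and the read/write mix below are assumptions for illustration, and the model ignores the controller cache, which softens the penalty in practice.

    DISK_IOPS = 150          # assumed random IOPS of one spindle
    N_DISKS = 4
    READ_FRACTION = 0.5      # assumed share of reads in the workload

    def effective_iops(write_penalty: int) -> float:
        """Rule of thumb: raw IOPS divided by the weighted I/O cost per host request."""
        raw = DISK_IOPS * N_DISKS
        cost = READ_FRACTION * 1 + (1 - READ_FRACTION) * write_penalty
        return raw / cost

    print(f"RAID 10 (penalty 2): ~{effective_iops(2):.0f} IOPS")   # ~400
    print(f"RAID 5  (penalty 4): ~{effective_iops(4):.0f} IOPS")   # ~240
    # The same four disks deliver noticeably fewer random IOPS in RAID 5.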

RAID 6 is similar to RAID 5, but has a higher degree of reliability: the capacity of two disks is allocated to checksums, and the two sums are calculated using different algorithms. It requires a more powerful RAID controller. It keeps working after the simultaneous failure of two disks, i.e. it protects against multiple failures. A minimum of 4 disks is required to organize the array. Using RAID 6 typically causes roughly a 10-15% drop in disk group performance relative to RAID 5, due to the larger amount of processing for the controller (the need to calculate a second checksum and to read and rewrite more disk blocks for each block written).

RAID 0+1

RAID 0+1 can mean basically two options:

  • two RAID 0 arrays are combined into a RAID 1;
  • three or more disks are combined into an array, and each data block is written to two disks of this array; thus, with this approach, as in “pure” RAID 1, the useful volume of the array is half the total volume of all the disks (if the disks have the same capacity).

RAID 10 (1+0)

RAID 10 is a mirrored array in which the data is written sequentially to several disks, as in RAID 0. This architecture is a RAID 0 array whose segments are RAID 1 arrays instead of individual disks. Accordingly, an array of this level must contain at least 4 disks (and always an even number). RAID 10 combines high fault tolerance with high performance.

The assertion that RAID 10 is the most reliable storage option is well founded: the array fails only when both drives of the same mirrored pair fail. If one drive fails, the chance that the second failure hits the same pair is 1/3 * 100 = 33%. RAID 0+1, by contrast, fails when two drives fail in different RAID 0 halves. The chance that the next failed drive ends up in the neighbouring half is 2/3 * 100 = 66%; moreover, since the half containing the already failed drive is no longer in use, only the two drives of the other half stay active, so the chance that the next failure takes down the whole array is effectively 2/2 * 100 = 100%.
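
These figures are easy to check with a minimal sketch that enumerates which second disk failure takes down a 4-disk array (the disk numbering and pairing below are an assumed layout):

    from itertools import permutations

    def raid10_dead(failed: set[int]) -> bool:
        """RAID 10: mirror pairs (0,1) and (2,3) striped together."""
        return {0, 1} <= failed or {2, 3} <= failed

    def raid01_dead(failed: set[int]) -> bool:
        """RAID 0+1: stripes (0,1) and (2,3) mirrored; dead once both stripes are hit."""
        return bool(failed & {0, 1}) and bool(failed & {2, 3})

    pairs = list(permutations(range(4), 2))           # 12 ordered (first, second) failures
    for name, dead in (("RAID 10", raid10_dead), ("RAID 0+1", raid01_dead)):
        fatal = sum(dead({first, second}) for first, second in pairs)
        print(f"{name}: second failure is fatal in {fatal}/{len(pairs)} cases")
    # RAID 10: 4/12 (one third, the 33% above); RAID 0+1: 8/12 (two thirds, the 66% above).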

RAID 5EE

An array similar to RAID 5, but in addition to distributed storage of the parity codes, distributed spare areas are used: in effect, a hard drive is used that could be added to the RAID 5 array as a spare (such arrays are called 5+ or 5+spare). In a RAID 5 array the backup disk stands idle until one of the main hard drives fails, whereas in a RAID 5EE array this disk works together with the rest of the HDDs all the time, which has a positive effect on the array's performance. For example, a RAID 5EE array of 5 HDDs can perform about 25% more I/O operations per second than a RAID 5 array of 4 primary disks plus one spare HDD. The minimum number of disks for such an array is 4.

RAID 50

A combination of two (or more, though this is used extremely rarely) RAID 5 arrays into a stripe, i.e. a combination of RAID 5 and RAID 0, which partially corrects the main disadvantage of RAID 5, its low write speed, by using several such arrays in parallel. The total capacity of the array is reduced by the capacity of two disks, but, unlike RAID 6, such an array is guaranteed to survive the failure of only one disk without data loss, and the minimum number of disks required to create a RAID 50 array is 6. Along with RAID 10, this is the most recommended RAID level for applications where high performance combined with acceptable reliability is required.

RAID 60

A combination of two RAID 6 arrays into a stripe. Write speed is approximately doubled compared to RAID 6. The minimum number of disks needed to create such an array is 8. Information is not lost if up to two disks fail in each RAID 6 sub-array.

The problem of increasing the reliability of information storage is always on the agenda. This is especially true for large volumes of data and for the databases on which the operation of complex systems in a wide range of industries depends, and it is particularly important for high-performance servers.

As you know, the performance of modern processors grows constantly, and modern hard disks clearly cannot keep up with them. A single disk, be it SCSI or, even worse, IDE, is no longer able to cope with today's tasks. You need many disks that complement each other, take over when one of them fails, store backup copies, and work efficiently and productively.

However, simply having several hard drives is not enough; you need to combine them into a system that works smoothly and does not allow data loss in the event of any disk-related failure.

You need to take care of creating such a system in advance because, as the well-known proverb about the roast rooster goes, people don't act until trouble actually strikes, and by then you may have lost your data irretrievably.

Such a system can be RAID - a storage virtualization technology that combines several disks into one logical element. RAID stands for “redundant array of independent disks”. It is typically used to improve performance and reliability.

What is needed to create a RAID? At least two hard drives. The number of storage devices used depends on the array level.

What types of raid arrays are there?

There are basic and combined RAID arrays. The University of California at Berkeley proposed dividing RAID into specification levels:

  • Basic:
    • RAID 0;
    • RAID 1;
    • RAID 2;
    • RAID 3;
    • RAID 4;
    • RAID 5;
    • RAID 6.
  • Combined:
    • RAID 10;
    • RAID 01;
    • RAID 50;
    • RAID 05;
    • RAID 60;
    • RAID 06.

Let's look at the most commonly used ones.

Raid 0

RAID 0 is intended to increase read and write speed. It does not increase storage reliability and is therefore not redundant. Its other name is stripe (striping - “alternation”). It usually uses 2 to 4 disks.

The data is divided into blocks, which are written to the disks in turn. Read/write speed increases by a factor proportional to the number of disks. Among the drawbacks is the increased likelihood of data loss with such a scheme. It makes no sense to store databases on such disks: any serious failure will make the whole array inoperable, since there are no recovery mechanisms.

Raid 1

RAID 1 provides mirrored data storage at the hardware level. The array is also called Mirror, which means “mirror”: the data on the disks is duplicated. It can be used with 2 to 4 storage devices.

Read/write speed is practically unchanged, which can be counted among its benefits. The array works as long as at least one of its disks is operational, but the usable volume is equal to the volume of a single disk. In practice, when one of the hard drives fails, you will need to take steps to replace it as quickly as possible.

Raid 2

RAID 2 uses the so-called Hamming code. Data is split across the hard drives in the same way as in RAID 0, and error-correction codes are stored on the remaining drives; in the event of a failure they allow the information to be regenerated. This method makes it possible to detect and then correct failures on the fly.

Read/write speed in this case is higher than with a single disk. The downside is the large number of disks required for the redundancy overhead to be reasonable, usually 7 or more.
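
The idea behind the Hamming code can be shown with a tiny Hamming(7,4) sketch: 4 data bits are protected by 3 parity bits, so a single flipped bit can be located and corrected on the fly. The bit values are arbitrary, and a real RAID 2 spreads such code words across disks rather than using exactly this layout.

    def encode(d: list[int]) -> list[int]:
        """Encode 4 data bits as a 7-bit codeword: [p1, p2, d1, p3, d2, d3, d4]."""
        d1, d2, d3, d4 = d
        p1 = d1 ^ d2 ^ d4          # checks positions 1, 3, 5, 7
        p2 = d1 ^ d3 ^ d4          # checks positions 2, 3, 6, 7
        p3 = d2 ^ d3 ^ d4          # checks positions 4, 5, 6, 7
        return [p1, p2, d1, p3, d2, d3, d4]

    def correct(code: list[int]) -> list[int]:
        """Locate and fix a single flipped bit using the parity syndrome."""
        c = code[:]
        s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
        s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
        s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
        syndrome = s1 + 2 * s2 + 4 * s3    # equals the 1-based position of the error
        if syndrome:
            c[syndrome - 1] ^= 1
        return c

    word = encode([1, 0, 1, 1])
    damaged = word[:]
    damaged[4] ^= 1                        # flip one bit, like a bad read
    assert correct(damaged) == word
    print("single-bit error located and corrected")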

Raid 3

RAID 3 - in this array the data is split across all disks except one, which stores the parity bytes. The scheme is resistant to failures: if one of the disks fails, its information can easily be reconstructed from the parity checksums.

Compared to RAID 2 there is no on-the-fly error correction. This array is distinguished by high performance and the ability to use 3 or more disks.

The main drawback of such a scheme is the increased load on the disk that stores the parity bytes and the low reliability of that disk.

Raid 4

In general, RAID 4 is similar to RAID 3, with the difference that parity data is stored in blocks rather than bytes, which increases the speed of small data transfers.

The weak point of this array is write speed, because the parity for every write is generated on a single dedicated disk, just as in RAID 3.

This seems to be a good solution for those servers where files are read more often than written.

Raid 5

RAID 2 to 4 have drawbacks caused by the inability to parallelize write operations. RAID 5 eliminates this drawback: parity blocks are written across all the disk devices of the array, there is no asymmetry in the data distribution, i.e. the parity is distributed.

The number of hard drives used starts from 3. The array is very common due to its versatility and cost-effectiveness: the more disks are used, the more economically the disk space is spent. Speed is high thanks to data parallelization, but performance is lower than RAID 10 because of the larger number of operations. If one drive fails, reliability drops to the level of RAID 0, and recovery takes a long time.

Raid 6

RAID 6 technology is similar to RAID 5, but reliability is higher because the number of parity disks is increased.

However, at least 5 disks are required, as well as a more powerful processor to handle the increased number of operations; in addition, in some implementations the number of disks must be a prime number: 5, 7, 11 and so on.

Raid 10, 50, 60

Next come combinations of the RAID levels mentioned earlier. For example, RAID 10 is a combination of RAID 1 and RAID 0.

They inherit the advantages of the arrays that make them up in terms of reliability, performance and number of disks, while remaining reasonably cost-efficient.

Creating a raid array on a home PC

The advantages of creating a RAID array at home are not obvious: it is uneconomical, data loss is not as critical as on servers, and information can instead be kept in backup copies made periodically.

For these purposes you will need a RAID controller, which has its own BIOS and its own settings. In modern motherboards the RAID controller may be integrated into the south bridge of the chipset, but even on such boards another controller can be added via a PCI or PCI-E connector. Examples include devices from Silicon Image and JMicron.

Each controller can have its own configuration utility.

Let's look at creating a raid using the Intel Matrix Storage Manager Option ROM.

First move all data off the disks, otherwise it will be erased while the array is being created.

Go into the BIOS Setup of your motherboard and switch the operating mode of your SATA hard drives to RAID.

To launch the utility, restart your PC and press Ctrl+I during the POST procedure. In the program window you will see a list of available disks. Choose the option to create an array, then select the required array level.

Then, following the intuitive interface, enter the array size and confirm the creation of the array.






