Optimal use of MySQL. Basic SQL Concepts


Anyone who has set a goal to master web technologies will sooner or later need to master the SQL language - a structured query language used to create and manage data in relational databases data. If we talk about a modern web application, today almost every application interacts with a DBMS - a database management system.

The most popular DBMS for working with web applications - MySQL. Today, almost all websites that operate on the Internet interact with MySQL DBMS. Of course, MySQL is not the only DBMS, there are a number of others, but it turns out that it is the most popular. Therefore, web application developers need to master the MySQL DBMS and the SQL language.

There is no need to buy or download dozens of books; anyway, you won’t read them right away, and even if you do get through them, they won’t give you a huge result. A result is necessary, we can only get it when we do something, and not constantly read books. Reading books is good, but it’s even better to practice, this is the only way we gain experience and skills.

Choose one good book on learning MySQL. Focus on this book, read it, practice each chapter. If you haven’t mastered something well, practice it, don’t be afraid to go back and read and do it a few more times. I recommend reading Alan Bewley’s book “Learning SQL”; personally, this book helped me a lot. The book is written in simple language, For ordinary people in the book accessible language describes how to install a MySQL server, how operators, filters, etc. work. Read more about this book.

If you do not yet have the skills to work with MySQL, there is no understanding of data types and SQL statements, I do not recommend starting to master phpMyAdmin. This doesn't mean that phpMyAdmin is bad. the main task mastering SQL and MySQL is best done through a console application.

Download the MySQL server for free from the official website, install it on your computer and start working. For example: if you take Alan Bewley’s book “Learning SQL” it contains detailed SQL course and working with MySQL, all examples of working through a console application.

Read only one book, don’t try to grab everything at once, it won’t work, you’ll waste time and won’t learn anything. It is important to master the basics once and then you will only need practice, practice and MySQL documentation, which is located on the official website.

Your tools: selected book, MySQL server, practice, practice and practice again.

Having experience working through a console application, it will not be difficult for a developer to work in the same phpMyAdmin; it will not be difficult for him to write all SQL commands. This is necessary and important because you can manage and create a database in phpMyAdmin without really knowing SQL. But remember SQL - you will need to write queries when developing web applications, there is no escape from this.

Tags: sql, mysql, databases

WORKING WITH A MySQL DATABASE WITH PHP TOOLS

Lecture. Prepared by Prokhorov V.S.


1. CONNECTING PHP SCENARIOS with MySQL tables

Let's look at the most commonly used functions that allow you to work with the database MySQL data by means of RNR.

When PHP and MySQL interact, the program interacts with the DBMS through a set of functions.

1.1 Connection to the server. Function mysql_connect

Before working with the database, you must install network connection, as well as authorize the user. The mysql_connect() function is used for this.

resource mysql_connect(]])

This function establishes a network connection to the MySQL database located on the host $server (by default this is localhost, i.e. current computer) and returns the identifier open connection. All further work is carried out with this identifier. All other functions that take this identifier (handle) as an argument will uniquely identify the selected database. When registering, specify the user name $username and password $password (by default, the user name from which the current process is running - when debugging scripts: root, and an empty password):

$dbpasswd = ""; //Password

//Display a warning

echo("

");

The $dblocation, $dbuser and $dbpasswd variables store the server name, username and password.

1.2 Disconnection from the server. Function mysql_close

The connection to the MySQL server will be automatically closed when the script completes, or when the mysql_close function is called

bool mysql_close()

This function closes the connection to the MySQL server and returns true when successful completion operations and false otherwise. The function takes as an argument the database connection handle returned by the mysql_connect function.

$dblocation = "localhost"; //Server name

$dbuser = "root"; //Username

$dbpasswd = ""; //Password

//Connect to the database server

//Suppress error output with the @ symbol before calling the function

$dbcnx = @ mysql_connect($dblocation, $dbuser, $dbpasswd);

if (!$dbcnx) //If the descriptor is 0, the connection is not established

//Display a warning

echo("

B currently The database server is not available, so the page cannot be displayed correctly.");

if (mysql_close($dbcnx)) //close the connection

echo("Database connection terminated");

echo("We were unable to complete the connection");

1.3 Creating a database. CREATE DATABASE function

The command - create a database is available only to the server administrator, and on most hosting sites it cannot be executed:

CREATE DATABASEDatabaseName

Creates new base data named DatabaseName.

An example of working with this function:

@mysql_query("CREATE DATABASE $dbname");

It is recommended to use apostrophes (“SQL – command”) everywhere as delimiters for lines containing SQL – commands. This ensures that no $ variable is accidentally interpolated (i.e. replaced by its value), and increases script security.

The CREATE DATABASE database creation command is available only to the superuser, and on most hosting to a simple user it is impossible to accomplish it. It is available only to the server administrator.

For experiments, let's create a testbase database by executing an SQL query from the command line. To do this you need to log in MySQL system and enter in command line MySQL:

mysql> create database testbase;

After this you should type:

mysql>use testbase;

Database created:



1.4 Selecting a database. Function mysql_select_db

Before sending the first request MySQL server, we need to indicate which database we are going to work with. The mysql_select_db function is designed for this:

bool mysql_select_db(string $database_name [,resource $link_identifier])

It notifies PHP that future operations on the $link_identifier connection will use the $database_name database.

Using this function is equivalent to calling the use command in an SQL query, i.e. the mysql_select_db function selects a database for further work, and all subsequent SQL queries are applied to the selected database. The function takes as arguments the name of the selected database database_name and the connection descriptor resource. The function returns true if the operation was successful and false otherwise:

//Database connection code

if (! @mysql_select_db($dbname, $dbcnx))

//Display a warning

echo("

The database is currently unavailable, so the page cannot be displayed correctly.");

1.5 Error handling

If errors occur while working with MySQL (for example, the parentheses in the query are not balanced or there are not enough parameters), then the error message and its number can be obtained using the two functions described below.

It is important to use these functions carefully and promptly, because otherwise debugging scripts can become more difficult.

● Function:

int mysql_errno()

returns the number of the last reported error. The connection identifier $link_identifier can be omitted if only one connection was established during the script's execution.

● Function:

string mysql_error()

returns not a number, but a string containing the text of the error message. It is convenient to use for debugging purposes. Usually mysql_error is used together with the or die() construct, for example:

@mysql_connect("localhost", "user", "password")

or die("Error connecting to the database: ".mysql_error());

The @ operator, as usual, serves to suppress the standard warning that might occur in the event of an error.

IN latest versions PHP warnings are not logged in MySQL functions by default.

1.6 Automation of connection to MySQL. File ( config.php )

Typically, there are several scripts on a site that need access to the same database.

It is recommended to separate the code responsible for connecting to MySQL into separate file, and then connect using the include function to the necessary scripts.

It makes sense to place the functions for connecting, selecting and creating a database in the same file (config.php) where the variables with server name $dblocation, username $dbuser, password $dbpasswd and database name $dbname are declared:

config.php listing:

//config.php code of the file containing parameters for connecting to the server and selecting a database

//outputs connection error messages to the browser

$dblocation = "localhost"; //Server name

$dbname = "insert database name" //Database name: being created or existing

$dbuser = "root"; //Database username

$dbpasswd = ""; //Password

//Connect to the database server

//Suppress error output with the @ symbol before calling the function

$dbcnx=@mysql_connect($dblocation,$dbuser,$dbpasswd);

if (!$dbcnx) //If the descriptor is 0, the connection to the database server is not established

//Display a warning

echo("

The database server is currently unavailable, so the page cannot be displayed correctly.

");

//Create the database $dbname - only a superuser can do this

//If the database already exists, there will be a non-fatal error

@mysql_query("CREATE DATABASE if not exists $dbname’);

//Connection code to the database: we make an unambiguous selection of a newly created database or an already existing database

//Suppress error output with the @ symbol before calling the function

if(!@mysql_select_db($dbname, $dbcnx)) //If the descriptor is 0, the connection to the database is not established

//Display a warning

echo("

The database is currently unavailable, so the page cannot be displayed correctly.

");

//Small auxiliary function, which displays a message

//error message in case of database query error

function puterror($message)

echo("");



2. EXECUTION OF DATABASE QUERIES

2.1 Creating a table. Function CREATE TABLE:

CREATE TABLE TableName (FieldName type, FieldName type,)

This command creates a new table in the database with columns (fields) defined by their names (FieldName) and specified types. After creating the table, you can add records to it consisting of the fields listed in this command.

Listing test_11.php. The program that creates new table in the database:

include "config.php";//Connect to the server and select a database

mysql_query("CREATE TABLE if not exists people

id INT AUTO_INCREMENT PRIMARY KEY,

or die("MySQL error: ".mysql_error());



This script creates a new table people with two fields. The first field has type INT (integer) and name id. The second is the TEXT type (text string) and the name name.

If the table exists, the or die() construct will work.

The optional if not exists clause, when specified, tells the MySQL server that it should not generate an error message if a table with the specified name already exists in the database.

Database operations very often become a bottleneck when implementing a web project. Programmers need to properly structure tables, write optimal queries and more productive code. This article provides a small list of techniques for optimizing work with MySQL for programmers.

Optimization issues in such cases concern not only the database administrator.

1. Optimize your queries for the query cache.

Most MySQL servers use query caching. This is one of the effective performance improvement techniques that is performed by the database engine in the background. If a query is executed many times, the cache begins to be used to obtain the result and the operation is completed much faster.

The problem is that it is so simple and at the same time hidden from the developer, and most programmers ignore such a great opportunity to improve the performance of the project. Some actions may actually prevent the query cache from being used during execution.

// Query cache NOT WORKING $r = mysql_query("SELECT username FROM user WHERE signup_date >= CURDATE()"); // The query cache is WORKING! $today = date("Y-m-d"); $r = mysql_query("SELECT username FROM user WHERE signup_date >= "$today"");

The reason the query cache doesn't work in the first case is because of the use of the function CURDATE(). This approach is used for all non-deterministic functions, for example, NOW(), RAND(), etc. Since the function's return result may change, MySQL decides not to cache the query. All that is needed to fix the situation is to add an extra line of PHP code before the request.

2. Use EXPLAIN for your SELECT queries

Using a keyword can help paint a picture of what MySQL is doing to fulfill your query. This picture makes it easy to identify bottlenecks and other problems in queries or table structure.

The result of an EXPLAIN query shows which indexes are used, how the table is scanned and sorted, and so on.

Let's take a SELECT query (preferably a complex one, with a JOIN), and add the EXPLAIN keyword before it. You can use PhpMyAdmin for this. Such a query will output the result in a nice table. Let's say we forgot to add an index on a column that is used for a JOIN:

After adding an index for the group_id field:

Now instead of scanning 7883 rows, only 9 and 16 rows from two tables will be scanned. A good way to evaluate performance is to multiply all the numbers in the “rows” column. The result is approximately proportional to the amount of data being processed.

3. Use LIMIT 1 if you need to get a unique string

Sometimes, while using a query, you already know that you are looking for only one row. You can get a unique record or simply check for the existence of any number of records that satisfy the WHERE clause.

In such a case, adding LIMIT 1 to your query may improve performance. Under this condition, the database engine stops scanning records as soon as it finds one and does not scan the entire table or index.

// Is there any user from Alabama? // Don't do this: $r = mysql_query("SELECT * FROM user WHERE state = "Alabama""); if (mysql_num_rows($r) > 0) ( // ... ) // This will be much better: $r = mysql_query("SELECT 1 FROM user WHERE state = "Alabama" LIMIT 1"); if (mysql_num_rows($r) > 0) ( // ... )

4. Index your search fields

Index more than just primary and unique keys. If any columns in your table are used for search queries, then they need to be indexed.

As you can see, this rule also applies to searching by part of a string, for example, “last_name LIKE ‘a%’”. When the start of a string is used for a search, MySQL can use the index of the column being searched.

You should also understand what types of searches you can't use regular indexing for. For example, when searching for the word (“WHERE post_content LIKE ‘%apple%’”), the benefits of indexing will not be available. In such cases, it is better to use or build your own solutions based on indexing.

5. Indexing and using the same types for linked columns

If your application contains many JOIN queries, you need to index the columns that are linked in both tables. This has the effect of internally optimizing bind operations in MySQL.

Also, the columns being linked must be of the same type. For example, if you associate a DECIMAL column with an INT column from another table, MySQL will not be able to use an index on at least one of one table. Even the character encoding must be the same for identical string type columns.

// Search for a company from a specific state $r = mysql_query("SELECT company_name FROM users LEFT JOIN companies ON (users.state = companies.state) WHERE users.id = $user_id"); // both columns for the state name must be indexed // and both must be the same type and character encoding // or MySQL will perform a full table scan

6. Don't use ORDER BY RAND()

This is one of those tricks that looks cool, and many new programmers fall into its trap. They cannot even imagine what a terrible problem they are creating for themselves by starting to use this expression in their queries.

If you really need to randomize the rows in your query result, there are many better ways to do this. Of course, this will be implemented with additional code, but you will be saved from a problem that grows exponentially with the volume of data. The thing is, MySQL performs a RAND() operation (which takes up CPU time) on every single row in the table before sorting it and giving you just one row.

// DO NOT do this: $r = mysql_query("SELECT username FROM user ORDER BY RAND() LIMIT 1"); // This will work better: $r = mysql_query("SELECT count(*) FROM user"); $d = mysql_fetch_row($r); $rand = mt_rand(0,$d - 1); $r = mysql_query("SELECT username FROM user LIMIT $rand, 1");

This way you get a random number that is less than the number of rows in the query result and use it as an offset in the LIMIT clause.

7. Avoid using SELECT *

The more data is read from the table, the slower the query will be. Such operations also take time to complete disk operations. And if the database server is separate from the web server, then delays will also be caused by data transfer over the network between the servers.

It's a good habit to specify a column when doing a SELECT.

// Bad: $r = mysql_query("SELECT * FROM user WHERE user_id = 1"); $d = mysql_fetch_assoc($r); echo "Welcome ($d["username"])"; // This is better: $r = mysql_query("SELECT username FROM user WHERE user_id = 1"); $d = mysql_fetch_assoc($r); echo "Welcome ($d["username"])"; // The difference becomes significant with large amounts of data

8. Try to use the id field everywhere

A good practice is to use an id field in each table that has the PRIMARY KEY, AUTO_INCREMENT properties set, and is of type from the INT family. Preferably - UNSIGNED, since in this case the value cannot be negative.

Even if your table has a field with a unique username, don't make it the primary key. VARCHAR fields are slow to work as primary keys. Also, the structure of your database will be better if it internally uses links to records based on id.

In addition, the MySQL engine uses primary keys for its internal tasks, and the use of the id field creates optimal conditions for solving them.

One possible exception to this rule is “associative tables,” which are used for many-to-many relationships between two other tables. For example, the “posts_tags” table contains 2 columns: post_id, tag_id. They are used to describe the relationship between two tables “post” and “tags”. The table described may have a primary key that contains both id fields.

9. Use ENUM instead of VARCHAR

// Create a prepared expression if ($stmt = $mysqli->prepare("SELECT username FROM user WHERE state=?")) ( // Bind the parameters $stmt->bind_param("s", $state); // Execute $stmt->execute(); // Bind the result variables $stmt->bind_result($username); // Get the values ​​$stmt->fetch(); printf("%s is from %s\n", $username , $state); $stmt->close(); )

13. Unbuffered requests

Typically, when you execute a request from a script, the script is interrupted until the request is completed. This behavior can be changed using unbuffered queries.

Excellent explanation of the function from the PHP documentation:

“mysql_unbuffered_query() sends a SQL query to the MySQL server without automatically retrieving and buffering the result rows as the mysql_query() function does. This way, a certain amount of memory is saved by SQL queries that produce a large result set, and you can start working on the result set immediately after receiving the first row, without waiting for the SQL query to be completely executed.”

However, there are several limitations. You must either read all lines or call before executing the next query. You also cannot use or to type a result.

14. Store the IP address as UNSIGNED INT

Many programmers create a VARCHAR(15) field to store an IP address, without even thinking about the fact that they will store an integer value in this field. If you use INT, the field size will be reduced to 4 bytes, and it will have a fixed length.

You must use the UNSIGNED INT type because the IP address uses all 32 bits of an unsigned integer.

$r = "UPDATE users SET ip = INET_ATON("($_SERVER["REMOTE_ADDR"])") WHERE user_id = $user_id";

15. Tables with a fixed record length (Static) are faster

When each individual column in a table has a fixed length, then the entire table as a whole is treated as . Examples of column types that do not have a fixed length: VARCHAR, TEXT, BLOB. If you include at least one column of this type, the table will no longer be considered "static" and will be processed differently by the MySQL engine.

"static" tables are processed faster by the MySQL engine when searching for records. When you need to read a specific record in a table, its position is quickly calculated. If the row size is not fixed, then determining the position of the record requires time to search and compare with the index of the main key.

Such tables are also easier to cache and easier to recover from failures. But they can take up more space. For example, if you convert a VARCHAR(20) field to a CHAR(20) field, then 20 bytes will always be occupied, regardless of whether they are in use or not.

Using the Vertical Splitting technique makes it possible to separate variable-length columns into a separate table.

16. Vertical separation

Vertical partitioning is the act of dividing a table structure vertically for the purpose of optimization.

Example 1: You have a table that contains home addresses that are rarely used in the application. You can split your table and store the addresses in a separate table. This will reduce the size of the main user table. And as you know, a smaller table is processed faster.

Example 2: You have a field “last_login” in your table. It is updated every time a user registers on the site. But every table update causes the query to be cached, which can create system overload. You can separate this field into another table to make updates to the user table less frequent.

But you need to be sure that you don't need to permanently link the two tables you just split, as this could lead to poor performance.

17. Separate large DELETE or INSERT queries

If you need to perform a large DELETE or INSERT query on a live site, you need to be careful not to disrupt traffic. When a large query is running, it can lock your tables and cause your application to stop.

Apache runs many parallel processes/threads. for this reason, it works more efficiently when the script finishes execution as quickly as possible, thus the server does not use too many open connections and processes consuming resources, especially memory.

If you lock tables for an extended period of time (for example, 30 seconds or more) on a heavily loaded web server, you can cause a backlog of processes and requests that will require significant cleanup time or even cause your web server to stop.

If you have a script that deletes a large number of records, simply use the LIMIT clause to break it into small batches to avoid this situation.

While (1) ( mysql_query("DELETE FROM logs WHERE log_date<= "2009-10-01" LIMIT 10000"); if (mysql_affected_rows() == 0) { // выполняем удаление break; } // вы можете сделать небольшую паузу usleep(50000); }

18. Small columns are processed faster

For a database engine, disk is the most important bottleneck. The desire to make everything smaller and smaller usually has a positive impact on productivity by reducing the amount of data being moved.

The MySQL documentation contains a list for all types.

If the table will only contain a few rows, then there is no reason to make the primary key type INT rather than MEDIUMINT, SMALLINT, or even TINYINT. if you only need the date, use DATE instead of DATETIME.

You just need to remember about the opportunities for growth.

19. Choose the right storage mechanism

There are two main data storage engines for MySQL: MyISAM and InnoDB. Each has its own advantages and disadvantages.

MyISAM is great for read-heavy applications, but it doesn't scale well when there are a lot of records. Even if you update one field in one row, the entire table will be locked and no process will be able to read anything until the query completes. MyISAM performs calculations quickly for queries like SELECT COUNT(*).

InnoDB is a more complex storage engine, and it can be slower than MyISAM for most small applications. But it supports row locking, which is better for scaling tables. It also supports some additional features such as transactions.

20. Use object-relational mapping

Using an object-relational mapping (ORM - Object Relational Mapper) provides a number of advantages. Everything that can be done in an ORM can be done manually, but with more effort and higher developer level requirements.

ORM is great for lazy loading. This means that values ​​can be retrieved when they are needed. But you need to be careful because you can create a lot of small queries that will reduce performance.

An ORM can also combine your queries into transactions, which are significantly faster than individual database queries.

For PHP you can use ORM.

21. Be careful with persistent connections

Persistent connections are designed to reduce the cost of restoring connections to MySQL. When a persistent connection is created, it remains open even after the script completes. Since Apache reuses child processes, the process is executed for the new script, and it also uses the MySQL connection.

This sounds great in theory. But in reality, this feature is not worth a penny due to the problems. It can cause serious problems with connection limits, memory overflow, and so on.

Apache works on concurrency principles and creates many child processes. This is the reason why persistent connections do not work as expected on this system. Before using the mysql_pconnect() function, consult your system administrator.

» Why use MySQL?

Navigation through the Tutorial: 1.1 What is MySQL? 1.2 Why use MySQL? 1.3 How stable is MySQL? 1.4 How big can MySQL tables be? 1.5 MySQL, MySQL AB, MySQL-MAX: what is it? 1.6 What operating systems does MySQL run on? 1.7 MySQL Distributions 1.8 MySQL Command Line Prompts 2.1 Introduction to MySQL 2.2 Connecting to a MySQL Server 2.3 Entering Queries in MySQL 2.4 Creating and Using Databases 2.5 Creating a MySQL Database 2.6 Creating a MySQL Table 2.7 Loading Data into a MySQL Table 2.8 Selecting All Data from a MySQL Table 2.9 Selecting specific rows from a MySQL table 2.10 Selecting arbitrary columns from a MySQL table 2.11 Sorting rows from a MySQL table 2.12 Calculating dates in a MySQL table 2.13 Working with NULL values ​​in a MySQL table 2.14 Pattern matching. SQL templates. 2.15 Counting rows in SQL templates. COUNT() function 2.16 Using multiple tables in one SQL query 2.17 Obtaining information about MySQL databases and tables 2.18 Examples of common queries in MySQL 2.19 Maximum value for a MySQL column 2.20 Which row stores the maximum of a certain MySQL column 2.21 Maximum of a column in a MySQL group 2.22 V Which MySQL line contains the maximum value for a group? 2.23 Using user variables in MySQL 2.24 Using the MySQL client in batch mode 3.1 Rows in MySQL 3.2 Numbers in MySQL. How to write numbers in MySQL? 3.3 Hexadecimal Values ​​in MySQL 3.4 NULL Values ​​in MySQL 3.5 Database, Table, Index, Column, and Alias ​​Names in MySQL 3.6 Case Sensitivity in MySQL Names 3.7 User Variables in MySQL 3.8 Comments in MySQL 3.9 MySQL Reserved Words 4.1 Backing Up MySQL Databases 4.2 BACKUP TABLE Syntax in MySQL 4.3 RESTORE TABLE Syntax in MySQL 4.4 CHECK TABLE Syntax in MySQL 4.5 REPAIR TABLE Syntax in MySQL 4.6 OPTIMIZE TABLE Syntax in MySQL 4.7 ANALYZE TABLE Syntax in MySQL 4.8 FLUSH Syntax in MySQL 4.9 K Syntax ILL in MySQL 4.10 SHOW Syntax MySQL 4.11 SHOW TABLE STATUS syntax in MySQL 4.12 SHOW STATUS syntax in MySQL 4.13 SHOW VARIABLES syntax in MySQL 4.14 back_log 4.15 character_set, character_sets, concurrent_inserts 4.16 connect_timeout, delay_key_write, delayed_insert_limit 4.17 delayed_insert_timeout, delayed_queue_ size, flush_time 4.18 have_raid, have_ssl, init_file 4.19 interactive_timeout, join_buffer_size , key_buffer_size 4. 20 language, log_bin, long_query_time 4.21 lower_case_table_names, max_allowed_packet, max_binlog_cache_size 4.22 max_connections, max_connect_errors, max_delayed_threads 4.23 max_join_size, max_sort_length, max_user_connections 4.24 max_tmp_tables, max_write_lock_count, myisam _sort_buffer_size 4.25 mуisam_max_extra_sоrt_file_size, myisam_max_sort_file_size, net_buffer_length 4.26 net_read_timeout, net_retry_count, net_write_timeout 4.27 open_files_limit, port, record_buffer 4.28 protocol_version , record_rnd_buffer, query_buffer_size 4.29 safe_show_databases, skip_networking, skip_show_databases 4.30 socket, sort_buffer, skip_show_databases 4.31 thread_cache_size, tmp_table_size, wait_timeout 4.32 SHOW PROCESSLIST syntax in MySQL 4.33 SHOW GRAN syntax TS in MySQL 4.34 Syntax SHOW CREATE TABLE in MySQL 4.35 Options file my.cnf in MySQL 5.1 Column types in MySQL 5.2 Numeric types in MySQL 5.3 Date and time types in MySQL 5.4 Y2K (2000) problem and Date types in MySQL 5.5 DATETIME, DATE and TIMESTAMP types in MySQL 5.6 TIME type in MySQL 5.7 YEAR type in MySQL 5.8 String types CHAR and VARCHAR in MySQL 5.9 BLOB and TEXT string types in MySQL 5.10 ENUM string type in MySQL 5.11 SET string type in MySQL 5.12 Choosing the right type for a MySQL column 5.13 Using column types from other DBMSs for MySQL 5.14 Memory requirements of MySQL columns 6.1 Functions for MySQL usage in SELECT and WHERE 6.2 Untyped operator Brackets in MySQL 6.3 Untyped Comparison operator in MySQL 6.4 Logical operators in MySQL 6.5 Branching functions in MySQL 6.6 String functions in MySQL

MySQL is very fast, reliable and easy to use. MySQL also has a very practical set of features, developed in very close collaboration with end users. MySQL was originally designed to handle very large databases much faster than existing solutions and has been used successfully in highly demanding industrial applications.

With ongoing development, MySQL today offers a rich and very useful set of features. Simplicity, speed and reliability make MySQL very suitable for accessing databases from the Internet. Supporting truly enormous amounts of data is very important. There is a known case of using MySQL with 60,000 tables, totaling about 5000000000 rows.

MySQL is a client-server system

This means that MySQL consists of one SQL server that supports various functions (in fact, it does all the database work), several different client programs (they provide only the interface between the user and the server), administrative tools, and several programming interfaces. Clients communicate with the server using their own network protocol.

This scheme allows a large number of users to simultaneously work with the server. Including, jointly processing the same data without interfering with each other. This interaction will be discussed in detail in subsequent chapters.

Usually the client and server are used on different computers, but it is acceptable to install both parts of the package on the same computer. Even within a single computer, a network is still used to communicate between client and server. By the way, it should be noted that the client and server can run under different operating systems, this does not interfere with anything. MySQL is already available on more than a dozen operating systems.

MySQL is a fully multi-threaded system

This means that the package can easily use many processors, if any. Many server systems already have several processors installed. On such systems, MySQL can strictly execute several database queries at once (one per processor) in parallel.

MySQL is an open source package

This means that the package can easily be modified to suit your needs by anyone who is well versed in programming. Having written an important amendment to a package, such a specialist can send it to the main package site for possible inclusion in the next version. It is thanks to this development scheme that MySQL develops very quickly and takes into account the needs of end users, and all errors are eliminated in the shortest possible time. This property was perhaps the first reason for the package’s high popularity: with proper qualifications, it can be tailored to your needs.

MySQL is free in most cases

MySQL is intended for non-commercial use. If you use the package to organize a forum, guest book system or blogs, then you do not have to pay for any licenses. You just need to download the package from his website and install it on your server. But if you need technical support, then you have to pay for it. Note: The fee is charged specifically for technical support, and not for using the program itself. It is this property that has largely made this DBMS so popular.

Now we will consider in more detail the absolutely necessary terminology that will be needed in the subsequent presentation, as well as the basic principles of database operation, so that you at least have a general idea of ​​what you are dealing with. At first glance, the list of terms is long, but they will all be needed.

Since all data is stored in the form of tables, you should clearly define the type of tables and the corresponding terminology. Table comprises cells, each of which can store some data. If a cell does not currently store any data, it is called empty.

The table consists of vertical columns and horizontal rows into which cells are grouped. Table strictly rectangular, that is, all rows have the same number of cells, and all columns have the same number of rows.

Each row in the table is called recording. Each cell in the record is named record field. Since fields can store a wide variety of data, to simplify working with them, the concept type. Each cell has its own type and can only store data of the corresponding type, which must be understood when working with the system.

In MySQL, a data type is assigned to a column in a table and applies to all cells in that column. That is, if the first column is assigned a certain type (for example, char), this means that the first field of any record in this table will always be of the char type and no other.

It is very important! Most of the time you'll be working with columns because they define the data type and structure of the table. The rows in the table are not numbered; they exist only for ease of working with data. In most cases, you shouldn't care about the number of records in a table at all. You work with data, and its structure is determined by the types and order of the columns. The number of records is only important for the server.

A database can consist of several different tables. Their exact number is limited mainly by computer memory. The server can store many databases. It should be noted that you can mix tables from different databases in a single query.

The table also has type And attributes. The set of capabilities available when working with this table, as well as the logic of its processing by the server, depends on this. This will be discussed in detail later.

Now let's talk about the data in the fields. To do something with some data, you first need to find it. This is what the SQL language is used for. Queries are written on it, during the processing of which the server scans the database tables, finds the required data based on the criteria specified in the request, and does something with it (what exactly depends on the request).

The data used to search for a record is called key. If a record has several fields, then you can find it using different keys, each for its own field. The key by which a record is most likely to be searched is called primary key(PRIMARYKEY). The key can relate not only to one field, but to several at once.

One or more records can be found using one key. If only one specific record can be found using a certain key, such a key is called unique(UNIQUE). A key that is suitable for searching multiple different records in a table is called non-unique.

The next concept is index. Imagine how a database server works. Having received a request, it begins to scan the tables to find the relevant data. The scan is performed sequentially, line by line. To make sure that all the data is collected, the server must look at all the rows in the table (or tables, if the search is carried out across several tables at once).

But it's easy to say: view. What if the table contains ten thousand records? What if it's a million? The server, of course, will look at it, but how long will it take him? Quite a lot. And you need to get an answer as quickly as possible.

This is where indexes come to the rescue. Index is something like a table of contents that lists in which records certain data appears. It can be compared to searching on the Internet, where search engines store data about which pages contain certain words. Indexes in MySQL do the same thing. The index must be specially created; it is not built by itself (how exactly to create it will be discussed later).

The process of building an index takes a lot of time, but after its creation, the server no longer scans the entire table in search of data: It looks for data using an index, where for each value found in the table it is indicated in which cells of the table the necessary data can be found. This significantly speeds up the work (tens of times).

Different queries look for different data. It is often necessary to have not one, but several different indexes on a table, so that different types of queries work each with their own index, optimally created and suitable specifically for this type of query. For example, imagine a phone book. When searching for a subscriber's last name, the index must indicate in which records to look for a particular last name. But if the search is carried out by name, then the index should be built not by last names, but by the names of subscribers.

It turns out that sometimes you need to have several different indexes for each table. In MySQL you can have up to 32 indexes per table. Maximum index length(that is, the length of each entry in the index) is up to 500 bytes. An index can include data from multiple columns at once (in the current MySQL implementation, the maximum number of columns in each index is 15).

Since indexes are built on the data in the fields of records, and each field can be considered as a key, it is said that the index is built by keys. An index built on primary keys will be used in searches with the highest probability and is called primary index.

All columns have default values. This means that if a field is not explicitly given a value, it will be automatically set to some default value specific to each column type. The default value can be changed when creating a column.

Each column in the table has unique name. Tables and databases also have names. In a number of cases, which will be discussed below, a column must have several names. In this case, only the name assigned to the column when it was created is considered the name; the rest are called aliases. There are also aliases for table names.

Columns are not necessarily created with tables. The table can be changed as needed by adding or removing columns or changing their types. In SQL statements, you can access tables from different databases using the syntax Database_name.Table_name. This long name allows you to uniquely identify the table and is called fully qualified table name.

You can also refer to a column in any table by specifying the syntax Database_name.Table_name.Column_name. This design allows you to uniquely identify the column and is called complete column name.

In addition to name and type, columns can also have optional attributes, which determine exactly how the server processes a particular column, change the logic of how the server works with this column. Assigning an attribute to a column is equivalent to assigning that attribute to the column's corresponding field in each record in the table. Note that columns of the same type, but with different attributes, are processed in different ways.

When working with a DBMS, some database is always considered active or current. There is a certain active table. It is with the active table in the current database that all actions will be performed, unless you explicitly indicate otherwise.

When accessing a column in the active table, you can specify only the column name instead of its full name (indicating the database and table). It will automatically be supplemented with the name of the active table of the current database, which significantly speeds up the input process. Also, if you are accessing a column from another table in the current database, you can specify its name as Table_name.Column_name, omitting part Database_name. it will be set to the current database.

In the following discussion you will come across the concept of flow. Flow in this represents your connection to the database server. All commands you enter apply only to this thread. Please note that if you have created several different connections to the same database (for example, by opening several copies of the client), then each connection will be a completely independent thread, in no way dependent on the others. This is the big difference between a user and a thread: one user can be represented by several independent threads. Later we will show what pitfalls this can lead to.

Wildcards or wildcards represent symbols that can correspond to a certain number of other symbols. For example, when working in Windows, you've most likely encountered the asterisk (*) character, which denotes any string of arbitrary characters. This method is widely used when working with groups of files. This is the wildcard.

Regular Expressions represent certain sequences of characters, some or all of which are wildcards. For example, when working in Windows, you most likely came across notations like *.doc (matches all files with a doc extension in the current directory) or *.* (matches all files in the directory). These are regular expressions, only they are called differently in Windows. As you can see, there is nothing complicated about them. In MySQL, you will come across such expressions more than once (though a little more complex). It is with their help that the criteria for searching for information are set.

Now let's think about this complex problem: a server allows multiple users to work simultaneously on the same database (even the same table). But how to ensure data integrity? If one user writes some data, and another at the same time tries to change this particular data, they will destroy the entire database. How to avoid this?

There are two ways to set partitioning, and MySQL uses both, albeit on different types of tables. The first one is transaction model, second atomic modifications.

Transaction refers to a certain database operation that cannot be divided into several separate operations. A full description of the two concepts listed above is very complex and seems unnecessary in this book, so we will consider them only from the user's point of view.

So what does working with a transaction look like? The server presents it this way: first, it is determined what exactly will be changed in the table, then a mark is made that these cells are changing in some way, then the changes are actually made, and then a mark is made that the modification has been completed. At the same time, while changes are made, the table either returns old data or returns nothing at all, but in any case, it does not allow two users to simultaneously make changes to the same data. First, the first one finishes the modification, then the next one takes over.

The difference between a transaction and an atomic modification, all scientific theory aside, for the end user is that in the case of a transactional model, the user can, after executing a query that changes some data in a table, decide exactly how he wants to complete the transaction: should whether save changes (commit) or Refuse them (call rewind, rollback), thereby returning the table to the state it had before calling the query. It should be noted that there is a mode for automatically saving all changes (AUTO_COMMIT). In the atomic model, changes cannot be undone: they are entered into the table immediately. The transaction currently in progress is called active.

MySQL uses its own table format by default: MyISAM. Previously, the proprietary ISAM format was also used, but it has now been declared obsolete and is supported solely for compatibility with older users. If you have tables in this format, we highly recommend converting them to the new MyISAM format. It works more reliably and faster.

Since we are talking about supported table formats, we should immediately clarify that MySQL supports several different formats. Some work on a transaction basis, others on atomic modifications, so that the end user always has a choice of what to use. It should be noted that the transactional model is much more complex to implement than the atomic one, and therefore support for transactions in MySQL appeared much later. MySQL formats (MyISAM and ISAM) use atomic modifications, but they implement them so well that in terms of reliability they are in no way inferior to the transactional access model. Table 1.1. lists the table types supported by MySQL with their brief descriptions.

Table 1.1. Brief descriptions of MySQL supported table types.

Please note that not all server versions support all tables listed! The ISAM and MyISAM types are always supported, but there are options for the rest. So take a look at http://www.mysql.com to see what exactly you need.

A dump is an intermediate representation of a database. Moreover, it is usually specified which data representation is meant. For example, dump to text file means that the entire database is converted and saved as a text file.

Why is this necessary? Dumps are very convenient for backup, as well as for transferring data between different servers. It may be necessary to apply the dump within the same server.

Table 1.1 shows the different types of tables. Each of them has its own internal data storage format. The easiest way to convert one to another comes down to dumps.

Database operations very often become a bottleneck when implementing a web project. Optimization issues in such cases concern not only the database administrator. Programmers need to properly structure tables, write optimal queries and more productive code. This article provides a small list of techniques for optimizing work with MySQL for programmers.

1. Optimize your queries for the query cache.

Most MySQL servers use query caching. This is one of the effective performance improvement techniques that is performed by the database engine in the background. If a query is executed many times, the cache begins to be used to obtain the result and the operation is completed much faster.

The problem is that it is so simple and at the same time hidden from the developer, and most programmers ignore such a great opportunity to improve the performance of the project. Some actions may actually prevent the query cache from being used during execution.

// Query cache NOT WORKING $r = mysql_query("SELECT username FROM user WHERE signup_date >= CURDATE()"); // The query cache is WORKING! $today = date("Y-m-d"); $r = mysql_query("SELECT username FROM user WHERE signup_date >= "$today"");

The reason the query cache doesn't work in the first case is because of the use of the function CURDATE(). This approach is used for all non-deterministic functions, for example, NOW(), RAND(), etc. Since the function's return result may change, MySQL decides not to cache the query. All that is needed to fix the situation is to add an extra line of PHP code before the request.

2. Use EXPLAIN for your SELECT queries

Using the EXPLAIN keyword can help paint a picture of what MySQL is doing to complete your query. This picture makes it easy to identify bottlenecks and other problems in queries or table structure.

The result of an EXPLAIN query shows which indexes are used, how the table is scanned and sorted, and so on.

Let's take a SELECT query (preferably a complex one, with a JOIN), and add the EXPLAIN keyword before it. You can use PhpMyAdmin for this. Such a query will output the result in a nice table. Let's say we forgot to add an index on a column that is used for a JOIN:

After adding an index for the group_id field:

Now instead of scanning 7883 rows, only 9 and 16 rows from two tables will be scanned. A good way to evaluate performance is to multiply all the numbers in the “rows” column. The result is approximately proportional to the amount of data being processed.

3. Use LIMIT 1 if you need to get a unique string

Sometimes, while using a query, you already know that you are looking for only one row. You can get a unique record or simply check for the existence of any number of records that satisfy the WHERE clause.

In such a case, adding LIMIT 1 to your query may improve performance. Under this condition, the database engine stops scanning records as soon as it finds one and does not scan the entire table or index.

// Is there any user from Alabama? // Don't do this: $r = mysql_query("SELECT * FROM user WHERE state = "Alabama""); if (mysql_num_rows($r) > 0) ( // ... ) // This will be much better: $r = mysql_query("SELECT 1 FROM user WHERE state = "Alabama" LIMIT 1"); if (mysql_num_rows($r) > 0) ( // ... )

4. Index your search fields

Index more than just primary and unique keys. If any columns in your table are used for search queries, then they need to be indexed.

As you can see, this rule also applies to searching by part of a string, for example, “last_name LIKE ‘a%’”. When the start of a string is used for a search, MySQL can use the index of the column being searched.

You should also understand what types of searches you can't use regular indexing for. For example, when searching for the word (“WHERE post_content LIKE ‘%apple%’”), the benefits of indexing will not be available. In such cases, it is better to use mysql full-text search or build your own solutions based on indexing.

5. Indexing and using the same types for linked columns

If your application contains many JOIN queries, you need to index the columns that are linked in both tables. This has the effect of internally optimizing bind operations in MySQL.

Also, the columns being linked must be of the same type. For example, if you associate a DECIMAL column with an INT column from another table, MySQL will not be able to use an index on at least one of the two tables. Even the character encoding must be the same for identical string type columns.

// Search for a company from a specific state $r = mysql_query("SELECT company_name FROM users LEFT JOIN companies ON (users.state = companies.state) WHERE users.id = $user_id"); // both columns for the state name must be indexed // and both must be the same type and character encoding // or MySQL will perform a full table scan

6. Don't use ORDER BY RAND()

This is one of those tricks that looks cool, and many new programmers fall into its trap. They cannot even imagine what a terrible problem they are creating for themselves by starting to use this expression in their queries.

If you really need to randomize the rows in your query result, there are many better ways to do this. Of course, this will be implemented with additional code, but you will be saved from a problem that grows exponentially with the volume of data. The thing is, MySQL performs a RAND() operation (which takes up CPU time) on every single row in the table before sorting it and giving you just one row.

// DO NOT do this: $r = mysql_query("SELECT username FROM user ORDER BY RAND() LIMIT 1"); // This will work better: $r = mysql_query("SELECT count(*) FROM user"); $d = mysql_fetch_row($r); $rand = mt_rand(0,$d - 1); $r = mysql_query("SELECT username FROM user LIMIT $rand, 1");

This way you get a random number that is less than the number of rows in the query result and use it as an offset in the LIMIT clause.

7. Avoid using SELECT *

The more data is read from the table, the slower the query will be. Such operations also take time to complete disk operations. And if the database server is separate from the web server, then delays will also be caused by data transfer over the network between the servers.

It's a good habit to specify a column when doing a SELECT.

// Bad: $r = mysql_query("SELECT * FROM user WHERE user_id = 1"); $d = mysql_fetch_assoc($r); echo "Welcome ($d["username"])"; // This is better: $r = mysql_query("SELECT username FROM user WHERE user_id = 1"); $d = mysql_fetch_assoc($r); echo "Welcome ($d["username"])"; // The difference becomes significant with large amounts of data

8. Try to use the id field everywhere

A good practice is to use an id field in each table that has the PRIMARY KEY, AUTO_INCREMENT properties set, and is of type from the INT family. Preferably - UNSIGNED, since in this case the value cannot be negative.

Even if your table has a field with a unique username, don't make it the primary key. VARCHAR fields are slow to work as primary keys. Also, the structure of your database will be better if it internally uses links to records based on id.

In addition, the MySQL engine uses primary keys for its internal tasks, and the use of the id field creates optimal conditions for solving them.

One possible exception to this rule is “associative tables,” which are used for many-to-many relationships between two other tables. For example, the “posts_tags” table contains 2 columns: post_id, tag_id. They are used to describe the relationship between two tables “post” and “tags”. The table described may have a primary key that contains both id fields.

9. Use ENUM instead of VARCHAR

// Create a prepared expression if ($stmt = $mysqli->prepare("SELECT username FROM user WHERE state=?")) ( // Bind the parameters $stmt->bind_param("s", $state); // Execute $stmt->execute(); // Bind the result variables $stmt->bind_result($username); // Get the values ​​$stmt->fetch(); printf("%s is from %s\n", $username , $state); $stmt->close(); )

13. Unbuffered requests

Typically, when you execute a request from a script, the script is interrupted until the request is completed. This behavior can be changed using unbuffered queries.

Excellent explanation of the mysql_unbuffered_query() function from the PHP documentation:

“mysql_unbuffered_query() sends a SQL query to the MySQL server without automatically retrieving and buffering the result rows as the mysql_query() function does. This way, a certain amount of memory is saved by SQL queries that produce a large result set, and you can start working on the result set immediately after receiving the first row, without waiting for the SQL query to be completely executed.”

However, there are several limitations. You must either read all rows or call mysql_free_result() before executing the next query. You also cannot use mysql_num_rows() or mysql_data_seek() to set the result.

14. Store the IP address as UNSIGNED INT

Many programmers create a VARCHAR(15) field to store an IP address, without even thinking about the fact that they will store an integer value in this field. If you use INT, the field size will be reduced to 4 bytes, and it will have a fixed length.

You must use the UNSIGNED INT type because the IP address uses all 32 bits of an unsigned integer.

$r = "UPDATE users SET ip = INET_ATON("($_SERVER["REMOTE_ADDR"])") WHERE user_id = $user_id";

15. Tables with a fixed record length (Static) are faster

When each individual column in a table has a fixed length, then the table as a whole is considered “static” or “fixed record length”. Examples of column types that do not have a fixed length: VARCHAR, TEXT, BLOB. If you include at least one column of this type, the table will no longer be considered "static" and will be processed differently by the MySQL engine.

"static" tables are processed faster by the MySQL engine when searching for records. When you need to read a specific record in a table, its position is quickly calculated. If the row size is not fixed, then determining the position of the record requires time to search and compare with the index of the main key.

Such tables are also easier to cache and easier to recover from failures. But they can take up more space. For example, if you convert a VARCHAR(20) field to a CHAR(20) field, then 20 bytes will always be occupied, regardless of whether they are in use or not.

Using the Vertical Splitting technique makes it possible to separate variable-length columns into a separate table.

16. Vertical separation

Vertical partitioning is the act of dividing a table structure vertically for the purpose of optimization.

Example 1: You have a table that contains home addresses that are rarely used in the application. You can split your table and store the addresses in a separate table. This will reduce the size of the main user table. And as you know, a smaller table is processed faster.

Example 2: You have a field “last_login” in your table. It is updated every time a user registers on the site. But every table update causes the query to be cached, which can create system overload. You can separate this field into another table to make updates to the user table less frequent.

But you need to be sure that you don't need to permanently link the two tables you just split, as this could lead to poor performance.

17. Separate large DELETE or INSERT queries

If you need to perform a large DELETE or INSERT query on a live site, you need to be careful not to disrupt traffic. When a large query is running, it can lock your tables and cause your application to stop.

Apache runs many parallel processes/threads. for this reason, it works more efficiently when the script finishes execution as quickly as possible, thus the server does not use too many open connections and processes consuming resources, especially memory.

If you lock tables for an extended period of time (for example, 30 seconds or more) on a heavily loaded web server, you can cause a backlog of processes and requests that will require significant cleanup time or even cause your web server to stop.

If you have a script that deletes a large number of records, simply use the LIMIT clause to break it into small batches to avoid this situation.

While (1) ( mysql_query("DELETE FROM logs WHERE log_date<= "2009-10-01" LIMIT 10000"); if (mysql_affected_rows() == 0) { // выполняем удаление break; } // вы можете сделать небольшую паузу usleep(50000); }

18. Small columns are processed faster

For a database engine, disk is the most important bottleneck. The desire to make everything smaller and smaller usually has a positive impact on productivity by reducing the amount of data being moved.

The MySQL documentation contains a list of data storage standards for all types.

If the table will only contain a few rows, then there is no reason to make the primary key type INT rather than MEDIUMINT, SMALLINT, or even TINYINT. if you only need the date, use DATE instead of DATETIME.

You just need to remember about the opportunities for growth.

19. Choose the right storage mechanism

There are two main data storage engines for MySQL: MyISAM and InnoDB. Each has its own advantages and disadvantages.

MyISAM is great for read-heavy applications, but it doesn't scale well when there are a lot of records. Even if you update one field in one row, the entire table will be locked and no process will be able to read anything until the query completes. MyISAM performs calculations quickly for queries like SELECT COUNT(*).

InnoDB is a more complex storage engine, and it can be slower than MyISAM for most small applications. But it supports row locking, which is better for scaling tables. It also supports some additional features such as transactions.

20. Use object-relational mapping

Using an object-relational mapping (ORM - Object Relational Mapper) provides a number of advantages. Everything that can be done in an ORM can be done manually, but with more effort and higher developer level requirements.

ORM is great for lazy loading. This means that values ​​can be retrieved when they are needed. But you need to be careful because you can create a lot of small queries that will reduce performance.

An ORM can also combine your queries into transactions, which are significantly faster than individual database queries.

For PHP you can use ORM Doctrine.

21. Be careful with persistent connections

Persistent connections are designed to reduce the cost of restoring connections to MySQL. When a persistent connection is created, it remains open even after the script completes. Since Apache reuses child processes, the process is executed for the new script, and it also uses the MySQL connection.

This sounds great in theory. But in reality, this feature is not worth a penny due to the problems. It can cause serious problems with connection limits, memory overflow, and so on.

Apache works on concurrency principles and creates many child processes. This is the reason why persistent connections do not work as expected on this system. Before using the mysql_pconnect() function, consult your system administrator.







2024 gtavrl.ru.