HostMonster Web Hosting Help

Optimizing MySQL: Queries and Indexes Article 4 of 4

Optimizing MySQL: Queries and Indexes Article 4 of 4

Courtesy of:Ian Gilfillan

Most systems need to be highly optimized for selects - take a news site which performs millions of queries per day, but where the data arrives in large batches of text files. So for parts of the day, inserts need to be optimal, without noticeably affecting the millions trying to access the data...

Assuming a nicely formatted '|' delimited text file that we want to insert into the table above, take this piece of PHP code:

if (!($fp = fopen("datafile.txt","r"))) {               // open the file for reading
          print "nUnable to open datafile.txt for writing";    // display error
          exit();                                       // end the running of the program
        }
        
        
        while (!feof ($fp)) {                   // loop through the file line by line
          $sline = fgets($fp, 4096);            // put the data into the variable $sline
          $sline = chop($sline);                        // remove the newline
          list($eno,$fname,$sname,$telno,$salary) = split("|",$code);
                                // split the line on "|", populating the ind. variables
          $db->query("insert into employee(employee_number,firstname,surname,
        tel_no, salary 
           values($eno,'$fname','$sname','$tel_no', $salary)");
        }                                               // end while loop
        

This would work, but would be very slow. The index buffer would be flushed after every insert. Until recently, MyISAM tables (The MySQL default) did not allow data to be inserted at the same time as being read. The new format does, but only if there are no deleted records present (highly unlikely in a heavily used system). So the entire table is locked for the duration of each insert. Fine on a low volume site, but when you're getting hundreds or thousands of queries per second, you'll soon notice the backlog!

There's a solution however - the best way to insert the data is to use MySQL's "LOAD DATA INFILE". This is much faster (20 times according to MySQL), and the only way some systems I've seen are still hanging in there!

The syntax is simple, and the code becomes a lot simpler too:

$db->query("LOAD DATA INFILE 'datafile.txt' INTO TABLE employee (employee_number,firstname,surname,tel_no,salary) FIELDS TERMINATED BY '|'");

LOAD DATA INFILE has defaults of:

FIELDS TERMINATED BY 't' ENCLOSED BY '' ESCAPED BY ''

if you don't specify any of these clauses. And, just as with an ordinary insert, you need to specify a field list if the order of the fields is different, or, as in the example above, you're not inserting data for every field. Always specifying a field list is good practice for all queries anyway - if someone adds a field to the table at a later stage, you don't want to go back and have to fix all your previous INSERT and SELECT * statements.

If you can't get this to work properly, have a look at the format of your text file - every problem I've seen with LOAD DATA has been because of a corrupted text file. Every field in every row must be delimited correctly!

You may not always be inserting from a text file - perhaps your application needs to do many unrelated inserts continually. There are ways to make sure the mass of users selecting are not badly affected... The first is to use INSERT LOW PRIORITY. This waits until there are no more reads waiting to happen, waiting for the gap, and not pushing in as it were. Of course, if your database is a rush hour special, there may never be a gap, and the client performing the INSERT LOW PRIORITY may start to grow cobwebs! An alternative here is INSERT DELAYED. The client is immediately freed, and the insert put into a queue (with all the other INSERT DELAYED's still waiting for the queue to end). This means that there can be no meaningful information passed back to the client, (such as the auto_increment value), as the INSERT has not been processed when the client is freed. Also, be aware that a catastrophe such as an unexpected power failure here will result in the queued INSERT's being lost. For neither of these methods do you have any idea when the data will be inserted, if at all, so I suggest you use with caution.

Conclusion

It's not only getting the data in that needs to be quick - sometimes you need to get it out quickly too. (Say you've accidentally loaded yesterday's classified ads, for example). Don't do a:

DELETE FROM classifieds;

Rather, use:

TRUNCATE TABLE classifieds;

The difference here is that DELETE drops records one by one, and that can be 1 million one by one's too slow! Note that this does not apply before version 4.0 of MySQL. At time of writing, most of you will still be using 3.x versions (if you do a DELETE FROM tablename on a non-empty table, and get 0 records back as a result, you're running an earlier version. To fix this problem, MySQL made DELETE remove records one by one so as to return the number of records deleted, but TRUNCATE still did the quick delete. Also, earlier versions than 3.23.33 used TRUNCATE tablename, not TRUNCATE TABLE tablename)

This has only been a brief introduction to optimizing queries and indexes. Once you've mastered these tips, you'll want to look at your more complex joins that just never seem to perform quickly. The MySQL site has lots of useful information, and there are a few good books out there as well. But don't forget EXPLAIN! Often the best way is to try and rewrite the query in as many different ways as possible, and see which one runs more efficiently. You can learn a lot by trying to figure out why one alternative ran faster than the other. Good luck!

resource list:
www.mysql.com - complete documentation for MySQL

Knowledgebase Article 59,671 views bookmark tags: cpu exceeded index indexes mysql optimize optimizing queries query (updated 966 days ago)


Was this resource helpful?

Did this resolve your issue?


Please add any other comments or suggestions about this content:





Recommended Help Content

Optimizing MySQL: Queries and Indexes Article 3 of 4 Courtesy of: Ian Gilfillan Ordering by surname is a common requirement, so it would make sense to create an index on surname. But in this example o (updated 967 days ago)
Knowledgebase Article 63,400 views tags: cpu exceeded index indexes mysql optimize optimizing queries query

Optimizing MySQL: Queries and Indexes Article 1 of 4 Courtesy of: Ian Gilfillan Badly defined or non-existent indexes are one of the primary reasons for poor performance, understanding and then fixing (updated 576 days ago)
Knowledgebase Article 82,718 views tags: indexes mysql optimize optimizing queries query

Optimizing MySQL: Queries and Indexes Article 2 of 4 Courtesy of: Ian Gilfillan Some knowledge of how indexes work allows you to use them more efficiently. Firstly, note that when you update a table w (updated 966 days ago)
Knowledgebase Article 58,626 views tags: indexes mysql optimize optimizing queries query

Related Help Content

Why does my site get throttled, or why did my account get suspended for performance? (updated 44 days ago)
Knowledgebase Article 128,220 views tags: cpu database mysql optimize query script slow throttling

How to repair and optimize your database using phpMyAdmin: (updated 654 days ago)
Knowledgebase Article 111,509 views tags: database mysql optimizing repairing slow speed

A guide on performing a standard SQL query using the phpMyAdmin interface (updated 462 days ago)
Knowledgebase Article 6,011 views tags: databases mysql phpmyadmin

How to restore MySQL databases using the Site Backup & Restore tool (updated 855 days ago)
Knowledgebase Article 20,531 views tags: backup database mysql restore site

How to backup all of your MySQL databases using the Site Backup & Restore tool (updated 989 days ago)
Knowledgebase Article 32,625 views tags: backup database mysql restore site

How to restore MySQL databases using the Site Backup & Restore tool (updated 628 days ago)
Knowledgebase Article 31,172 views tags: backup database mysql restore site

How can I optimize my images? (updated 1661 days ago)
Knowledgebase Article 67,018 views tags: images optimize

I am trying to rebuild my index for Entropy Search and it stops before 100%. (updated 1662 days ago)
Knowledgebase Article 49,323 views tags: index search