Archive for June, 2008

Sort by a list of IDs queried using MySQL's IN() function

Thursday, June 26th, 2008

I run into this problem from time to time, and I've finally put enough energy into it that I want to commit it to memory through the blog. The most common occurrence of this in my world is when performing a search using something like Lucene or Sphinx. Here's what usually happens:

I perform a search using my engine of choice. Sphinx, for example, will give me back a hash of results formatted like this:

  Array
  (
    [matches] => Array
    (
      [10546] => Array
      (
        [weight] => 3
        [attrs] => Array
        (
          [feed_id] => 624
          [published_at] => 1213713545
        )
      )
      [14154] => Array
      (
        [weight] => 3
        [attrs] => Array
        (
          [feed_id] => 583
          [published_at] => 1213801410
        )
      )
      ...
    )
    [time] => 0.000
    ...
  )

Since I do not store my full-text fields in my index, I then have to take the IDs from the matches array and query the database to get at my data. The query is simple:

SELECT * FROM items WHERE id IN (10546, 14154, …);

Unfortunately, that query will not necessarily return the rows in the order I provided. This is an easy fix if I'd like to sort my results on a column in my database, since I can just add a quick ORDER BY clause on say, the date the item was created. But if I searched the database for results sorted by relevance, the database really doesn't know anything about how to sort that way. I need a way to tell MySQL to sort the results in the order I've supplied my IDs, which leads us to at least two functions: FIND_IN_SET and FIELD.

Both functions work the same way: Given a string (or column) as the first argument, they return the 1-based index — or zero if not found — of the argument in either the list of additional arguments (FIELD) or the second argument, a comma-delimited string (FIND_IN_SET). For my purposes, then, my query looks like this:

# Using FIELD
SELECT * FROM items WHERE id IN (10546, 14154, …) ORDER BY FIELD(id, 10546, 14154, …);

# Using FIND_IN_SET
SELECT * FROM items WHERE id IN (10546, 14154, …) ORDER BY FIND_IN_SET(id, '10546,14154,…');

Both functions will return the same results, sorted the same way. I went so far as to profile both, too, just to make sure I was using the best function. Granted, when I profiled this using MySQL's built-in profiler, my query was pretty simple (as it should be, in my opinion). Your mileage may vary:

  +----------+------------+------------------------------+
  | Query_ID | Duration   | Sorting result | Query       |
  +----------+------------+------------------------------+
  |        1 | 0.01572400 | 0.001129       | FIND_IN_SET |
  |        2 | 0.00241400 | 0.00127        | FIELD       |
  |        3 | 0.00194700 | 0.00114        | FIND_IN_SET |
  |        4 | 0.00193100 | 0.001121       | FIELD       |
  |        5 | 0.00192300 | 0.001116       | FIND_IN_SET |
  |        6 | 0.00192100 | 0.001101       | FIELD       |
  +----------+------------+------------------------------+

Note that the query result is a hybrid between show profiles and show profile for query [id], and I've ripped out the actual query because all I care about is the sorting function being used (the query is the same otherwise). The duration on the whole was lower in each subsequent query because of MySQL's query cache, but since I really wasn't interested in that anyway — just the sorting duration — it doesn't really matter.

The profiling I did was ridiculously contrived, but it did go to show me, based on the miniscule differences, that it most likely will not matter which function I use (although FIND_IN_SET ultimately won out over my exhaustive six-query profile).

Awesome recruiter email

Thursday, June 19th, 2008

Holy hell, this is amazing:

Hello Allan,

I am vara, recruiter form Tekforcecorp.
I saw your resume on Internet and we have a immediate requirement for PHP developer.
please see the requirement and let me know if you are interested. Please get back to me with updated rseume and salary info.This is a fulltime position.

Responsibilities:

  • Develop web applications and services
  • Apply industry best-practice software standards and technology to complex business problems
  • Help set-up and maintain development, test and production environments

Qualifications:

  • Bachelors degree in CS or related field, or equivalent experience
  • 2+ years experience in design and development of commercial web applications on multiple platforms
  • Previous professional experience with PHP required (PHP5 preferred)
  • Current experience with MySQL required
  • Basic understanding of OOP principles and practices is required
  • Experience with Java, Ruby on Rails, and/or object oriented C++ a plus
  • Experience with AJAX a plus
  • Ability to work on a small team, in a start-up environment
  • Ability to handle multiple concurrent activities and have a flexible, positive attitude
  • Excellent verbal, written and communication skills is a must
  • Must be team-oriented, with an interest and willingness to help develop and mentor the engineering team as it grows

Vara
Tekforce Corp
2420 Camino Ramon , Suite 212
San Ramon, CA 94583
925 866 8200 ext. 234
925 866 8219 Fax
vara@tekforcecorp.com