Social networking type queries with NDB (part 2)

Recently, I talked about how to optimize social networking type queries for the NDB storage engine using IN clause statements. In clauses are great but they have one fundamental limitation, they work only on one column (Actually, this is not true, I discovered, thanks to Roland’s comment, that MySQL supports multiple columns IN clause). What if the primary key is a composite of let’s say “region_id”, “application_id” and “user_id”? Recently, while on site with a client, Brian Morin showed me a very clever way of dealing these type of primary keys in an IN clause. The main problem is that you cannot return a binary or varbinary from a function. So the idea was to used the return values of a stored proc. First we need to compose the varbinary from the actual values with this stored proc:
.

delimiter $$
drop procedure if exists compose_user_account_key $$

create procedure compose_user_account_key(
    in p_region tinyint,
    in p_application_id int,  
    in p_user_id int,
    out p_key varbinary(255))
  deterministic
  sql security definer
sproc:begin

  SET @dyn_sql = CONCAT(
    'set @user_account_key = 0x', 
    lpad(hex(p_region), 2, '0'), 
    lpad(hex(p_application_id), 8, '0'), 
    lpad(hex(p_user_id), 8, '0'), ';');
  PREPARE s1 from @dyn_sql; 
  EXECUTE s1;

  select @dyn_sql;

  select @user_account_key into p_key;

end$$
delimiter ;

A “select @dyn_sql” has been added in the stores procedure for debugging, it can be removed when in production. Running the stored proc gives the following output:

mysql> call compose_user_account_key(1,2,3,@out);
+-----------------------------------------------+
| @dyn_sql                                      |
+-----------------------------------------------+
| set @user_account_key = 0x010000000200000003; |
+-----------------------------------------------+
1 row in set (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

mysql> select hex(@out);
+--------------------+
| hex(@out)          |
+--------------------+
| 010000000200000003 |
+--------------------+
1 row in set (0.00 sec)

From there, we need decompose functions to extract the values

delimiter $$
drop function if exists decompose_user_account_region $$
create function decompose_user_account_region(
  p_key varbinary(255))
  returns tinyint
  deterministic
  sql security definer
  no sql
sproc:begin
  return ascii(substring(p_key, 1, 1));
end$$

drop function if exists decompose_user_account_application_id $$
create function decompose_user_account_application_id(
  p_key varbinary(255))
  returns int
  deterministic
  sql security definer
  no sql
sproc:begin

  declare l_ret int;

  set l_ret = ascii(substring(p_key, 2, 1));
  set l_ret = (l_ret * 256)  + ascii(substring(p_key, 3, 1));
  set l_ret = (l_ret * 256)  + ascii(substring(p_key, 4, 1));
  set l_ret = (l_ret * 256)  + ascii(substring(p_key, 5, 1));
  
  return l_ret;
end$$

drop function if exists decompose_user_account_user_id $$
create function decompose_user_account_user_id(
  p_key varbinary(255))
  returns int 
  deterministic
  sql security definer
  no sql
sproc:begin
  declare l_ret int;

  set l_ret = ascii(substring(p_key, 6, 1));
  set l_ret = (l_ret * 256)  + ascii(substring(p_key, 7, 1));
  set l_ret = (l_ret * 256)  + ascii(substring(p_key, 8, 1));
  set l_ret = (l_ret * 256)  + ascii(substring(p_key, 9, 1));
  
  return l_ret;
end$$
delimiter ;

Using these funtions with the previous “@out” value gives the following:

mysql> select decompose_user_account_region(@out) as region, decompose_user_account_application_id(@out) as application_id, decompose_user_account_user_id(@out) as user_id;
+--------+----------------+---------+
| region | application_id | user_id |
+--------+----------------+---------+
|      1 |              2 |       3 |
+--------+----------------+---------+
1 row in set (0.00 sec)

From there, we have the tools to build our queries, using “hex(@out)” for use in the IN clause. Probably the best way to that trick would be to use a stored procedure to build dynamic queries with IN clause like (IN (0×010000000200000003, …). Apart from these store proc and functions, another way to resolve this problem is to encode numbers in hexadecimal or better, in base64 and use varchar primary keys.

About Yves Trudeau

I work as a senior consultant in the MySQL professional services team at Sun. My main areas of expertise are DRBD/Heartbeat and NDB Cluster. I am also involved in the WaffleGrid project.
This entry was posted in HA, mysql, NDB Cluster, yves. Bookmark the permalink.

4 Responses to Social networking type queries with NDB (part 2)

  1. Hi!

    “In clauses are great but they have one fundamental limitation, they work only on one column.”

    No, this is not true. This syntax is perfectly valid according to the SQL standard, and it’s implemented by MySQL:

    SELECT *
    FROM film
    WHERE (language_id, rating) IN (
    (1, ‘G’)
    , (2,’PG’)
    );

    Perhaps you mean the optimizer isn’t really smart in dealing with this?

  2. Yves Trudeau says:

    wow… awesome. I was not aware of this one. Thanks Roland!

  3. Hi!

    np, glad to.

    Perhaps you can edit the post and use one of them strike-through thingies to make it clear there is a mistaken assumption in the post…It may help future readers.

    regards,

    Roland

  4. Yves Trudeau says:

    I just added a statement to correct the facts.