Friday, April 27, 2007

MySQL Pseudo-Partitioning, Part II

On advice of the MySQL She-BA, I wanted to see what performance I could get with MERGE tables. In my particular application I'm going to need the RI I get with InnoDB, so it's not a solution, but as long as I am investigating anyway...

Seems simple enough, let me the MERGE table over my existing tables:

mysql> create table xyz_mt (
-> `id` bigint(10) NOT NULL default '0',
-> `report_dt` date default NULL,
-> `acct_id` varchar(8) default NULL,
-> `some_text` varchar(222) default NULL,
-> PRIMARY KEY (`id`)
-> ) ENGINE=MERGE UNION=(xyz_2007_01,xyz_2007_02,xyz_2007_03) ;
Query OK, 0 rows affected (0.08 sec)

mysql> select id, report_dt, acct_id
-> from xyz_mt
-> where report_dt ='2007-02-17'
-> and acct_id = 'X0000741';
ERROR 1168 (HY000): Unable to open underlying table which is differently defined or of non-MyISAM type or doesn't exists

Doh! That's right, got to change to MyISAM.

mysql> alter table xyz_2007_01 engine=MyISAM;
Query OK, 1000000 rows affected (25.42 sec)
Records: 1000000 Duplicates: 0 Warnings: 0

mysql> alter table xyz_2007_02 engine=MyISAM;
Query OK, 1000000 rows affected (17.82 sec)
Records: 1000000 Duplicates: 0 Warnings: 0

mysql> alter table xyz_2007_03 engine=MyISAM;
Query OK, 1000000 rows affected (18.84 sec)
Records: 1000000 Duplicates: 0 Warnings: 0

Lets see how it works now.

mysql> select id, report_dt, acct_id
-> from xyz_mt
-> where report_dt ='2007-02-17' and acct_id = 'X0000741';
+----------+------------+----------+
| id | report_dt | acct_id |
+----------+------------+----------+
| 20000741 | 2007-02-17 | X0000741 |
| 20967398 | 2007-02-17 | X0000741 |
+----------+------------+----------+
2 rows in set (2.43 sec)

Whoa, that's pretty good, I could live with that. Lets compare the inline view of MyISAM tables vs. InnoDB tables:

mysql> select * from (
-> select id, report_dt, acct_id from xyz_2007_01
-> union all
-> select id, report_dt, acct_id from xyz_2007_02
-> union all
-> select id, report_dt, acct_id from xyz_2007_03) t
-> where report_dt ='2007-02-17' and acct_id = 'X0000741';
+----------+------------+----------+
| id | report_dt | acct_id |
+----------+------------+----------+
| 20000741 | 2007-02-17 | X0000741 |
| 20967398 | 2007-02-17 | X0000741 |
+----------+------------+----------+
2 rows in set (15.17 sec)

Slightly faster than InnoDB, but not what I would call Significant. Definitely a viable option for partitioning in v5.0. Thanks to Sheeri for the top.

No comments: