Last modified: 2013-08-09 22:44:39 UTC
During my work with the hook "SpecialRecentChangesQuery" (for code and a detailed analysis see [1]) I found a reproducible problem which arises only under the following conditions:

- MySQL >= 5.0.12, AND
- the hook function for SpecialRecentChangesQuery adds table(s) to $tables[].

Analysis:

The code in [1] modifies the main Recent Changes SQL statement to this:

    SELECT * FROM `recentchanges` FORCE INDEX (rc_timestamp),`page`
    LEFT JOIN `tag_summary` ON ((ts_rc_id=rc_id))
    WHERE (rc_timestamp >= '20100211000000') AND rc_bot = '0'
    ORDER BY rc_timestamp DESC LIMIT 50

This throws the error

    Unknown column 'rc_id' in 'on clause' (localhost)

(MySQL >= 5.0.12, due to the new JOIN processing).

This ad-hoc modification works (parentheses added around the table names outside the JOIN):

    SELECT * FROM (`recentchanges` FORCE INDEX (rc_timestamp),`page`)
    LEFT JOIN `tag_summary` ON ((ts_rc_id=rc_id))
    WHERE (rc_timestamp >= '20100211000000') AND rc_bot = '0'
    ORDER BY rc_timestamp DESC LIMIT 50

I attach an ad-hoc and very hacky, purely experimental patch which corrects the problem in a certain case for [1]. The patch is not meant for SVN submission.

Citing [2]:

Beginning with MySQL 5.0.12, natural joins and joins with USING, including outer join variants, are processed according to the SQL:2003 standard. The goal was to align the syntax and semantics of MySQL with respect to NATURAL JOIN and JOIN ... USING according to SQL:2003. However, these changes in join processing can result in different output columns for some joins. Also, some queries that appeared to work correctly in older versions must be rewritten to comply with the standard.

Citing [3]:

    SELECT * FROM t1, t2 JOIN t3 ON (t1.i1 = t3.i3);

Previously, the SELECT was legal due to the implicit grouping of t1,t2 as (t1,t2). Now the JOIN takes precedence, so the operands for the ON clause are t2 and t3. Because t1.i1 is not a column in either of the operands, the result is an Unknown column 't1.i1' in 'on clause' error.

To allow the join to be processed, group the first two tables explicitly with parentheses so that the operands for the ON clause are (t1,t2) and t3:

    SELECT * FROM (t1, t2) JOIN t3 ON (t1.i1 = t3.i3);

Alternatively, avoid the use of the comma operator and use JOIN instead:

    SELECT * FROM t1 JOIN t2 JOIN t3 ON (t1.i1 = t3.i3);

[1] http://www.mediawiki.org/wiki/Extension:OnlyRecentRecentChanges
[2] MySQL Manual, Join Processing Changes in MySQL 5.0.12: http://dev.mysql.com/doc/refman/5.0/en/join.html
[3] Bug #19053, MySQL Unknown column in 'on clause': http://bugs.mysql.com/bug.php?id=19053
Created attachment 7157: ad-hoc and hackish patch (experimental, shows a solution for a specific query, not to be committed to SVN)
*** CORRECTION ***

The failing statement is:

    SELECT * FROM `recentchanges` FORCE INDEX (rc_timestamp),`page`
    LEFT JOIN `tag_summary` ON ((ts_rc_id=rc_id))
    WHERE (rc_timestamp >= '20100211000000') AND rc_bot = '0' AND (page_latest=rc_this_oldid)
    ORDER BY rc_timestamp DESC LIMIT 50

This throws the error

    Unknown column 'rc_id' in 'on clause' (localhost)

(MySQL >= 5.0.12, due to the new JOIN processing).

This ad-hoc modification works (parentheses added around the table names outside the JOIN):

    SELECT * FROM (`recentchanges` FORCE INDEX (rc_timestamp),`page`)
    LEFT JOIN `tag_summary` ON ((ts_rc_id=rc_id))
    WHERE (rc_timestamp >= '20100211000000') AND rc_bot = '0' AND (page_latest=rc_this_oldid)
    ORDER BY rc_timestamp DESC LIMIT 50
T. Gries: In case this is still an issue, willing to put that patch into Gerrit? Adding "patch-reviewed" as it says "not to commit".
(In reply to comment #3)
> T. Gries: In case this is still an issue, willing to put that patch into
> Gerrit?
>
> Adding "patch-reviewed" as it says "not to commit".

Uh, this patch is old, from 2010. Leave this open; perhaps one of the database experts can check my observations, which are explained in detail here.
Hi. I started investigating and found that the problem still exists.

The reason is that the MySQL JOIN syntax changed in MySQL 5.0.12 (!), see
- http://bugs.mysql.com/bug.php?id=19053
- http://dev.mysql.com/doc/refman/5.0/en/join.html

Corresponding changes have never been made in $IP/includes/database/db.php, or rather in the MySQL driver.

Suggested solution: add parentheses around the tables, FROM (recentchanges .., page), in all database statements for MySQL.
This does NOT work:

    SELECT rc_id,rc_timestamp,rc_cur_time,rc_user,rc_user_text,rc_namespace,rc_title,rc_comment,rc_minor,rc_bot,rc_new,rc_cur_id,rc_this_oldid,rc_last_oldid,rc_type,rc_patrolled,rc_ip,rc_old_len,rc_new_len,rc_deleted,rc_logid,rc_log_type,rc_log_action,rc_params,wl_user,wl_notificationtimestamp,ts_tags
    FROM `recentchanges` FORCE INDEX (rc_timestamp),`page`
    LEFT JOIN `watchlist` ON (wl_user = '1' AND (wl_title=rc_title) AND (wl_namespace=rc_namespace))
    LEFT JOIN `tag_summary` ON ((ts_rc_id=rc_id))
    WHERE (rc_timestamp >= '20130802000000') AND rc_bot = '0' AND (page_latest=rc_this_oldid)
    ORDER BY rc_timestamp DESC LIMIT 50

With the correct parentheses (see the FROM (...)) it DOES work:

    SELECT rc_id,rc_timestamp,rc_cur_time,rc_user,rc_user_text,rc_namespace,rc_title,rc_comment,rc_minor,rc_bot,rc_new,rc_cur_id,rc_this_oldid,rc_last_oldid,rc_type,rc_patrolled,rc_ip,rc_old_len,rc_new_len,rc_deleted,rc_logid,rc_log_type,rc_log_action,rc_params,wl_user,wl_notificationtimestamp,ts_tags
    FROM (`recentchanges` FORCE INDEX (rc_timestamp),`page`)
    LEFT JOIN `watchlist` ON (wl_user = '1' AND (wl_title=rc_title) AND (wl_namespace=rc_namespace))
    LEFT JOIN `tag_summary` ON ((ts_rc_id=rc_id))
    WHERE (rc_timestamp >= '20130802000000') AND rc_bot = '0' AND (page_latest=rc_this_oldid)
    ORDER BY rc_timestamp DESC LIMIT 50
The above is generated by my extension as published in http://www.mediawiki.org/wiki/Extension:OnlyRecentRecentChanges#Code

Basically:

    $dir = dirname( __FILE__ );
    $wgExtensionMessagesFiles['onlyrecentrecentchanges'] = $dir . '/OnlyRecentRecentChanges.i18n.php';
    $wgHooks['SpecialRecentChangesQuery'][] = 'onSpecialRecentChangesQuery';

    // see http://www.mediawiki.org/wiki/Manual:Hooks/SpecialRecentChangesQuery
    function onSpecialRecentChangesQuery( &$conds, &$tables, &$join_conds, $opts, &$query_options = array(), &$select = array() ) {
        if ( !in_array( 'page', $tables ) ) $tables[] = 'page';
        $conds[] = 'page_latest=rc_this_oldid';
        return true;
    }
tl;dr:

Suggested solution:
===================
Add parentheses around the tables, FROM (recentchanges .., page), in all database statements for MySQL at the last stage before the query is sent to the database.
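The parenthesisation suggested above could look roughly like the following sketch. It is not MediaWiki's actual query builder: the function name buildFromClause() and both of its parameters are hypothetical, and it assumes that the comma-separated tables and the JOIN clauses have already been rendered as SQL fragments.

```php
<?php
/**
 * Hypothetical helper, not part of MediaWiki core.
 *
 * @param string[] $commaTables Already-quoted tables combined with the comma operator
 * @param string[] $joinClauses Already-rendered "LEFT JOIN ... ON (...)" fragments
 * @return string The text that follows the FROM keyword
 */
function buildFromClause( array $commaTables, array $joinClauses ) {
	$from = implode( ',', $commaTables );

	// Parentheses are only needed when the comma operator is mixed with
	// explicit JOINs, because JOIN binds tighter than the comma operator
	// since MySQL 5.0.12.
	if ( count( $commaTables ) > 1 && count( $joinClauses ) > 0 ) {
		$from = '(' . $from . ')';
	}

	if ( count( $joinClauses ) > 0 ) {
		$from .= ' ' . implode( ' ', $joinClauses );
	}

	return $from;
}

// Example with the tables from this report
echo 'FROM ' . buildFromClause(
	array( '`recentchanges` FORCE INDEX (rc_timestamp)', '`page`' ),
	array( 'LEFT JOIN `tag_summary` ON ((ts_rc_id=rc_id))' )
) . "\n";
```

With a single table, or without any JOIN clause, the sketch leaves the table list unchanged, so statements that do not mix the comma operator with JOIN would stay as they are.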
Source: http://dev.mysql.com/doc/refman/5.0/en/join.html
http://i.imgur.com/wVjBBqY.png

Previously, the comma operator (,) and JOIN both had the same precedence, so the join expression t1, t2 JOIN t3 was interpreted as ((t1, t2) JOIN t3). Now JOIN has higher precedence, so the expression is interpreted as (t1, (t2 JOIN t3)). This change affects statements that use an ON clause, because that clause can refer only to columns in the operands of the join, and the change in precedence changes interpretation of what those operands are.

Example:

    CREATE TABLE t1 (i1 INT, j1 INT);
    CREATE TABLE t2 (i2 INT, j2 INT);
    CREATE TABLE t3 (i3 INT, j3 INT);
    INSERT INTO t1 VALUES(1,1);
    INSERT INTO t2 VALUES(1,1);
    INSERT INTO t3 VALUES(1,1);
    SELECT * FROM t1, t2 JOIN t3 ON (t1.i1 = t3.i3);

Previously, the SELECT was legal due to the implicit grouping of t1,t2 as (t1,t2). Now the JOIN takes precedence, so the operands for the ON clause are t2 and t3. Because t1.i1 is not a column in either of the operands, the result is an Unknown column 't1.i1' in 'on clause' error.

******** IMPORTANT
To allow the join to be processed, group the first two tables explicitly with parentheses so that the operands for the ON clause are (t1,t2) and t3:

    SELECT * FROM (t1, t2) JOIN t3 ON (t1.i1 = t3.i3);
**********

Alternatively, avoid the use of the comma operator and use JOIN instead:

    SELECT * FROM t1 JOIN t2 JOIN t3 ON (t1.i1 = t3.i3);

This change also applies to statements that mix the comma operator with INNER JOIN, CROSS JOIN, LEFT JOIN, and RIGHT JOIN, all of which now have higher precedence than the comma operator.
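The second option quoted above (avoid the comma operator and use JOIN) might be applied to the hook roughly as follows. This is an untested sketch: it assumes that $join_conds accepts entries of the form array( join type, condition ), as the join conditions of MediaWiki's Database::select() do, and the function name onSpecialRecentChangesQueryJoin is made up for illustration.

```php
<?php
// Sketch only: adds `page` as an explicit INNER JOIN instead of a comma table.
function onSpecialRecentChangesQueryJoin( &$conds, &$tables, &$join_conds, $opts,
	&$query_options = array(), &$select = array()
) {
	if ( !in_array( 'page', $tables ) ) {
		$tables[] = 'page';
		// Assumed format: table name => array( join type, ON condition )
		$join_conds['page'] = array( 'INNER JOIN', 'page_latest=rc_this_oldid' );
	} else {
		// `page` was already added elsewhere; only restrict the result set
		$conds[] = 'page_latest=rc_this_oldid';
	}
	return true;
}
```

If the query builder renders tables that have a join condition with JOIN syntax, the comma operator would no longer appear in the FROM clause, so the precedence change should not matter for this query.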
Update to comment #6 https://bugzilla.wikimedia.org/show_bug.cgi?id=22613#c6

This does NOT work:

    SELECT rc_id,rc_timestamp,rc_cur_time,rc_user,rc_user_text,rc_namespace,rc_title,rc_comment,rc_minor,rc_bot,rc_new,rc_cur_id,rc_this_oldid,rc_last_oldid,rc_type,rc_patrolled,rc_ip,rc_old_len,rc_new_len,rc_deleted,rc_logid,rc_log_type,rc_log_action,rc_params,wl_user,wl_notificationtimestamp,ts_tags
    FROM `recentchanges` FORCE INDEX (rc_timestamp),`page`
    LEFT JOIN `watchlist` ON (wl_user = '1' AND (wl_title=rc_title) AND (wl_namespace=rc_namespace))
    LEFT JOIN `tag_summary` ON ((ts_rc_id=rc_id))
    WHERE (rc_timestamp >= '20130802000000') AND rc_bot = '0' AND (page_latest=rc_this_oldid)
    ORDER BY rc_timestamp DESC LIMIT 50

With the correct parentheses (see the FROM (...)) it DOES work:

    SELECT rc_id,rc_timestamp,rc_cur_time,rc_user,rc_user_text,rc_namespace,rc_title,rc_comment,rc_minor,rc_bot,rc_new,rc_cur_id,rc_this_oldid,rc_last_oldid,rc_type,rc_patrolled,rc_ip,rc_old_len,rc_new_len,rc_deleted,rc_logid,rc_log_type,rc_log_action,rc_params,wl_user,wl_notificationtimestamp,ts_tags
    FROM (`recentchanges` FORCE INDEX (rc_timestamp),`page`)
    LEFT JOIN `watchlist` ON (wl_user = '1' AND (wl_title=rc_title) AND (wl_namespace=rc_namespace))
    LEFT JOIN `tag_summary` ON ((ts_rc_id=rc_id))
    WHERE (rc_timestamp >= '20130802000000') AND rc_bot = '0' AND (page_latest=rc_this_oldid)
    ORDER BY rc_timestamp DESC LIMIT 50

The following also works. It is an alternative which does not require a core code change, i.e. this version does NOT require additional parentheses. I swapped the order: the additional table `page` is listed as the first FROM table name, so that `recentchanges` is the direct left operand of the LEFT JOINs:

    SELECT rc_id,rc_timestamp,rc_cur_time,rc_user,rc_user_text,rc_namespace,rc_title,rc_comment,rc_minor,rc_bot,rc_new,rc_cur_id,rc_this_oldid,rc_last_oldid,rc_type,rc_patrolled,rc_ip,rc_old_len,rc_new_len,rc_deleted,rc_logid,rc_log_type,rc_log_action,rc_params,wl_user,wl_notificationtimestamp,ts_tags
    FROM `page`,`recentchanges` FORCE INDEX (rc_timestamp)
    LEFT JOIN `watchlist` ON (wl_user = '1' AND (wl_title=rc_title) AND (wl_namespace=rc_namespace))
    LEFT JOIN `tag_summary` ON ((ts_rc_id=rc_id))
    WHERE (rc_timestamp >= '20130802000000') AND rc_bot = '0' AND (page_latest=rc_this_oldid)
    ORDER BY rc_timestamp DESC LIMIT 50

This is done by changing

    function onSpecialRecentChangesQuery( &$conds, &$tables, &$join_conds, $opts, &$query_options = array(), &$select = array() ) {
        if ( !in_array( 'page', $tables ) ) $tables[] = 'page';
        $conds[] = 'page_latest=rc_this_oldid';
        return true;
    }

to

    function onSpecialRecentChangesQuery( &$conds, &$tables, &$join_conds, $opts, &$query_options = array(), &$select = array() ) {
        if ( !in_array( 'page', $tables ) ) array_unshift( $tables, 'page' );
        $conds[] = 'page_latest=rc_this_oldid';
        return true;
    }
The problem for the extension http://www.mediawiki.org/wiki/Extension:OnlyRecentRecentChanges#Code is solved, so I am closing this bug report, even though the general statement still applies.