Last modified: 2014-07-23 16:35:34 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T14974, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 12974 - The newline added to a template, magic word, variable, or parser function that returns line-start wikicode formatting (*#:; {|) causes unexpected parsing
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: Parser
Version: unspecified
Hardware: All All
Importance: High major with 25 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://test.wikipedia.org/wiki/Newlin...
Keywords: newparser
Duplicates: 5590 8199 10687 10781 11262 13378 14036 19144 19302 20574 20592 22086 23033 23355 26000 35129 36215 38697 40294 52548 56562 60444 60827
Depends on: 22880 529
Blocks: 52548
Reported: 2008-02-08 18:51 UTC by AJF
Modified: 2014-07-23 16:35 UTC (History)
44 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
testcases (2.95 KB, text/html)
2010-01-26 00:27 UTC, Happy-melon

Description AJF 2008-02-08 18:51:24 UTC
Assume there is a template named color whose content is #002255 (a normal color definition). If it is transcluded this way:

<span style="color:{{color}};">test</span>

It works perfectly and gives <span style="color:#002255;">test</span>

But if you use it in a table (or anywhere else that is not inside a tag attribute), it breaks:

{| style="color:{{color}};"
|-
| test
|-
|}

gives:

<table>
<ol><li>002255;"
</li></ol>

<tr>
<td> test
</td></tr>
</table>

The same happens with:

<p>test {{color}} test</p>

gives:

<p>test 

<ol><li>002255 test</p>
</li></ol>

This has even broken the tag nesting!
Comment 1 Christian Neubauer 2008-02-08 21:15:32 UTC
The pound sign (#) is getting interpreted as an ordered list.  Like doing this:

# first item
# second item
# etc

Do you have an extra new line in your template?  If not, try removing the # or using a hex value or a named color.
Comment 2 AJF 2008-02-08 21:20:20 UTC
Sure, I did. But that does not change the fact that the example shows the parser interpreting such templates inconsistently.
Comment 3 Christian Neubauer 2008-02-08 21:43:14 UTC
Hmm, can't duplicate this bug on 1.11 or 1.12.  Do you have any third party extensions installed?
Comment 4 Christian Neubauer 2008-02-08 21:45:09 UTC
No I was wrong.  It does cause the output you described in 1.11.  In 1.12 it seems to strip the style attribute out of a table at least.
Comment 5 Christian Neubauer 2008-02-14 15:28:10 UTC
Okay, in 1.11, the relevant section is in Parser->braceSubstitution():

# If the template begins with a table or block-level
# element, it should be treated as beginning a new line.
if (!$piece['lineStart'] && preg_match('/^(?:{\\||:|;|#|\*)/', $text)) /*}*/{
	$text = "\n" . $text;
}

In 1.12, the same section says:

# Bug 529: if the template begins with a table or block-level
# element, it should be treated as beginning a new line.
# This behaviour is somewhat controversial.
if (!$piece['lineStart'] && preg_match('/^(?:{\\||:|;|#|\*)/', $text)) /*}*/{
	$text = "\n" . $text;
}

See bug 529.  You can work around this by putting a space before the # in the template.
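
The effect of that check can be reproduced outside MediaWiki. The following is a minimal Python sketch, not the parser's actual code: the function name `prepend_newline_if_block_start` is hypothetical, and the pattern is a transliteration of the PHP regex quoted above.

```python
import re

# Transliteration of the PHP pattern '/^(?:{\\||:|;|#|\*)/' quoted above.
LINE_START_MARKUP = re.compile(r"^(?:\{\||:|;|#|\*)")


def prepend_newline_if_block_start(text: str, line_start: bool) -> str:
    """Mimic the check: force a new line when the expansion begins with
    line-start markup and the call itself was not at the start of a line."""
    if not line_start and LINE_START_MARKUP.match(text):
        return "\n" + text
    return text


# Template "color" containing "#002255", transcluded in the middle of a line.
expanded = prepend_newline_if_block_start("#002255", line_start=False)
print(repr('{| style="color:' + expanded + ';"'))

# The workaround mentioned above: a leading space defeats the pattern.
print(repr(prepend_newline_if_block_start(" #002255", line_start=False)))
```

With the newline in place, the "#" sits at the start of a line and is read as an ordered-list marker, which matches the output in the description.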
Comment 6 Mormegil 2008-11-12 21:15:37 UTC
This is a much more general problem, and is much worse: it affects not only templates but also parser functions (e.g. {{#if:}}), and it affects not only HTML colors, but everything which starts with colon, semicolon, asterisk, hashmark, or the “{|” table syntax. Check the linked URL for some examples where this breaks stuff.

The problem: when the result of a template, or parser function call starts with {|, :, ;, #, or *, a newline is prepended to it, forcing this character to be a syntax element, even though the author might not have intended it so (and wanted to just use the plain character). Especially in the case of parser functions, this is quite understandable.

This bug is caused by the fix to bug 529, which I believe is wrong, and should be reverted, even though a compromise version is possible – force the newline only for the table syntax. Table syntax is rare in other uses than tables (while colons and semicolons are perfectly normal in plain text), and tables seem to be the primary use case for that original fix.

(Changing summary, and marking as bug, not a feature request.)
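
The compromise proposed in this comment can be sketched as follows. This is an illustration in Python under stated assumptions, not a patch: `CURRENT_PATTERN` transliterates the regex quoted in comment 5, while `TABLE_ONLY_PATTERN` and `expand` are hypothetical names for the narrowed rule.

```python
import re

# Current behaviour: any of {| : ; # * triggers the forced newline.
CURRENT_PATTERN = re.compile(r"^(?:\{\||:|;|#|\*)")
# Proposed compromise: only the table-opening syntax triggers it.
TABLE_ONLY_PATTERN = re.compile(r"^\{\|")


def expand(text: str, pattern: re.Pattern) -> str:
    """Prepend a newline when a mid-line expansion matches the pattern."""
    return "\n" + text if pattern.match(text) else text


samples = ['{| class="wikitable"', ":Talk:Foo", ";Foo", "#002255", "*Foo"]
for sample in samples:
    # Columns: original text, current rule, table-only rule.
    print(repr(sample),
          repr(expand(sample, CURRENT_PATTERN)),
          repr(expand(sample, TABLE_ONLY_PATTERN)))
```

Under the table-only rule, plain text that merely starts with a colon, semicolon, hash or asterisk is left untouched, while templates that open a table still get their line start.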
Comment 7 Splarka 2008-11-12 21:56:04 UTC
Note also page-title magic words like {{PAGENAME}} on a page starting with such a character (like * or ;) cause much breakage, and I can't think of a workaround.

Try [[Special:Prefixindex/{{FULLPAGENAME}}|prefix search]] on a page like [[;Foo]] or [[*Foo]].
Comment 8 Mormegil 2008-11-12 22:04:51 UTC
(In reply to comment #7)
> Try [[Special:Prefixindex/{{FULLPAGENAME}}|prefix search]] on a page like
> [[;Foo]] or [[*Foo]].

Cool! It also explains the totally broken noarticletext display at e.g. http://en.wikipedia.org/wiki/*Foo
Comment 9 Mormegil 2008-12-18 17:45:41 UTC
*** Bug 13378 has been marked as a duplicate of this bug. ***
Comment 10 Happy-melon 2009-04-22 19:26:07 UTC
This also breaks the colon delimiters in magic words:

*Test:

{{SUBJECTPAGENAME{{#if:yes|:Talk:Foo/bar}}}}

*Expected: 

Foo/bar

*Actual:

{{SUBJECTPAGENAME
:Talk:Foo/bar}}

This wouldn't be a problem if the parser then recognised the split material as a complete magic word call, but of course it doesn't.  This is ugly.
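
The split can be reproduced with a few lines of Python. This is a sketch only: `expand_inner` is a hypothetical stand-in for the newline check from comment 5, and the outer call is modelled as plain string concatenation.

```python
import re

LINE_START_MARKUP = re.compile(r"^(?:\{\||:|;|#|\*)")


def expand_inner(result: str) -> str:
    """Apply the newline rule to the output of a mid-line inner call."""
    return "\n" + result if LINE_START_MARKUP.match(result) else result


# {{#if:yes|:Talk:Foo/bar}} evaluates to ":Talk:Foo/bar".
inner = expand_inner(":Talk:Foo/bar")

# The outer magic word now sees a newline before its colon delimiter.
outer = "{{SUBJECTPAGENAME" + inner + "}}"
print(outer)
```

The printed text is the two-line "Actual" result shown above: the colon no longer follows the magic word name directly, so it is not recognised as the delimiter.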
Comment 11 Siebrand Mazeland 2009-04-25 11:07:59 UTC
adding keyword i18n. Also influences PLURAL and GENDER handling in messages. Raising priority.
Comment 12 Splarka 2009-05-13 18:52:42 UTC
*** Bug 10687 has been marked as a duplicate of this bug. ***
Comment 13 Splarka 2009-05-13 18:53:44 UTC
*** Bug 11262 has been marked as a duplicate of this bug. ***
Comment 14 Splarka 2009-05-13 18:53:45 UTC
*** Bug 8199 has been marked as a duplicate of this bug. ***
Comment 15 Splarka 2009-05-13 18:53:50 UTC
*** Bug 14036 has been marked as a duplicate of this bug. ***
Comment 16 Splarka 2009-05-13 18:58:22 UTC
Update summary to catch more dupes
Comment 17 Splarka 2009-06-10 11:03:38 UTC
*** Bug 19144 has been marked as a duplicate of this bug. ***
Comment 18 Splarka 2009-09-10 05:07:10 UTC
*** Bug 20574 has been marked as a duplicate of this bug. ***
Comment 19 Splarka 2009-09-11 12:00:03 UTC
*** Bug 20592 has been marked as a duplicate of this bug. ***
Comment 20 RockMFR 2009-09-13 18:51:25 UTC
On a page named "*", "a{{PAGENAME}}a" gives "a<ul><li>*</li></ul>a" instead of "a*a". This particular regression was caused by r29205. So I'm assuming that this whole class of bugs are all being tracked in this one bug?

We ran into this particular problem on enwiki at [[MediaWiki:Histlegend]]. No workaround yet.
Comment 21 Le Chat 2009-10-29 14:22:48 UTC
Can this behaviour not at least be disabled for such pseudo-templates as PAGENAME?
Comment 22 P.Copp 2010-01-12 15:33:40 UTC
*** Bug 22086 has been marked as a duplicate of this bug. ***
Comment 23 Happy-melon 2010-01-25 22:01:15 UTC
This behaviour is unjustifiable.  The original bug has a trivial workaround: judicious use of newlines where appropriate.  The 'solution' creates problems with no reasonable workarounds, such as noted in comments 6, 7, 10, 11, 20 above.  However longstanding the feature, this functionality is broken.  

Unless there are serious counterarguments, I intend to undo the newline-insertion added for bug529, WONTFIXing that and FIXing this.  CCing Tim for parsery-ness.
Comment 24 Tim Starling 2010-01-25 22:49:28 UTC
You realise it will break many, many templates if it's removed, right?
Comment 25 Happy-melon 2010-01-25 23:18:07 UTC
In the same way the original fix presumably broke many templates, given all these unexpected side effects, yes.  However, those breakages can be fixed, unlike some of the breakages it causes.  And since the syntax without the bug529 is valid regardless, templates can be fixed any time, before or after they become broken.  No one can fix {{talkpage}} on [[Talk:*-algebra]] (http://en.wikipedia.org/w/index.php?title=Talk:*-algebra&oldid=340022974) with this parsing in place.
Comment 26 Tim Starling 2010-01-25 23:24:34 UTC
No, the original fix did not break many templates. It was 2004, there weren't many templates to break back then. 

It's not as easy as you make out to produce line starts without the bug 529 hack. Look at what happens when an extension breaks it:

http://lists.wikimedia.org/pipermail/mediawiki-l/2010-January/033103.html

If the problem is parser function and variable output, then we can fix that specifically and leave template output as it is.
Comment 27 Happy-melon 2010-01-26 00:27:14 UTC
Created attachment 7019
testcases

It's generally trivial: just add a linebreak in the calling table:

{|
|-
|
{{template-with-block-level-wikimarkup}}
|-
| {{template-without-block-level-wikimarkup}}
|}

If the outer markup expects block-level content, it should be on a new line.  The comment you refer to is just putting the cart before the horse to try and fix this in the subtemplate; templates generating extra whitespace is a big enough problem as it is. Of course, their specific problem is with the newline position of the #ask: parser function getting lost somewhere, but their implementation puts the contents of the #ask on a newline whether or not that's desired.  Linestart status should be decided from the top down, where people can actually see what the transclusions are doing, not blind-guessed from the inside out.  However, as was pointed out in that thread, adding anything; be it an nbsp, <nowiki/> tag, etc, reproduces the effect they wanted.

In the testcases attached, the existing implementation (with the hack) breaks cases 5, 6, 7 & 8.  Without the hack, cases 1 and 3 break, assuming that block-start functionality is always desired.  If the inner template should sometimes exhibit block-level functionality and sometimes not, of course, there's no way to produce that with the hack in place, although that's an unlikely situation.
Comment 28 Tim Starling 2010-01-26 00:42:54 UTC
We could add a bug 529 tracking category to the parser output to determine how the hack is being used on Wikimedia wikis.
Comment 29 Alexandre Emsenhuber [IAlex] 2010-04-05 11:08:07 UTC
*** Bug 23033 has been marked as a duplicate of this bug. ***
Comment 30 Alexandre Emsenhuber [IAlex] 2010-04-05 18:36:21 UTC
*** Bug 5590 has been marked as a duplicate of this bug. ***
Comment 31 Bawolff (Brian Wolff) 2010-05-01 06:32:13 UTC
*** Bug 23355 has been marked as a duplicate of this bug. ***
Comment 32 Helder 2010-05-03 20:44:11 UTC
Is the code for headers (=) a "line-start" code too?

Undesired line breaks are appearing in the headers here:
http://en.wikipedia.org/w/index.php?title=Special:ExpandTemplates&input=__TOC__%0D{{:User:Heldergeovane/Test/Template+for+titles%0D|Title+1%0D|Title+2%0D}}

and here:
http://en.wikipedia.org/w/index.php?title=Special:ExpandTemplates&input=__TOC__%0D{{:User:Heldergeovane/Test/Template+for+titles2%0D|Title+1%0D|Title+2%0D}}

The code of [[User:Heldergeovane/Test/Template for titles]] is:
----
{{#if:{{{1|}}}|<h1>{{{1}}}</h1>}}
== Section 1.1 ==
Text 1.1
{{#if:{{{2|}}}|<h1>{{{2}}}</h1>}}
== Section 2.1 ==
Text 2.1
----

And the code of [[User:Heldergeovane/Test/Template for titles2]] is:
----
{{#if:{{{1|}}}|={{{1}}}=}}
== Section 1.1 ==
Text 1.1
{{#if:{{{2|}}}|={{{2}}}=}}
== Section 2.1 ==
Text 2.1
----

Helder
Comment 33 Mormegil 2010-05-03 21:36:40 UTC
(In reply to comment #32)
> Is the code for headers (=) a "line-start" code too?
No

> Undesired line breaks are appearing in the headers here:
> http://en.wikipedia.org/w/index.php?title=Special:ExpandTemplates&input=__TOC__%0D{{:User:Heldergeovane/Test/Template+for+titles%0D|Title+1%0D|Title+2%0D}}

I fail to see the “undesired line breaks”. The only line breaks I see there are those you explicitly added yourself (and I do not even think they present any problem). If you remove them from the input, they disappear from the output:

http://en.wikipedia.org/w/index.php?title=Special:ExpandTemplates&input=__TOC__%0D{{:User:Heldergeovane/Test/Template+for+titles|Title+1|Title+2%7D%7D

http://en.wikipedia.org/w/index.php?title=Special:ExpandTemplates&input=__TOC__%0D{{:User:Heldergeovane/Test/Template+for+titles2|Title+1|Title+2%7D%7D
Comment 34 Helder 2010-05-03 23:59:08 UTC
I forgot to mention that the example is based on a case where more than 100 parameters are used, so it is desirable to put each one on its own line (as in the example). The problem is that the line breaks are breaking the TOC, which shows
----
* 1 Section 1.1
* 2 Section 2.1
----
instead of 
----
* Title 1
** 1 Section 1.1
* Title 2
** 2 Section 2.1
----
This should not be happening, I mean:
<h1>Title
</h1>
should also make "Title" appear in the TOC, as it does in
<h1>Title</h1>

Besides this, the code
----
{{:User:Heldergeovane/Test/Template with parameters
|FIRST
|SECOND
}}
----
should result in the same output as this:
----
{{:User:Heldergeovane/Test/Template with parameters
|1=FIRST
|2=SECOND
}}
----
without any undesired line breaks. Here is a link showing the differences:
http://bit.ly/bVLioV

Helder
Comment 36 Bawolff (Brian Wolff) 2010-11-20 02:28:26 UTC
*** Bug 26000 has been marked as a duplicate of this bug. ***
Comment 37 Umherirrender 2010-12-12 12:59:10 UTC
In some cases, you can use <code>&#35;</code> for #, because the entity is replaced after the braceSubstitution.
Comment 38 Happy-melon 2010-12-12 13:02:45 UTC
(In reply to comment #37)
> In some cases, you can use <code>&#35;</code> for #, because the entity is
> replaced after the braceSubstitution.

Ew, god, please no.  Escaped entities are escaped entities, they should never be being interpreted as wikimarkup; if they are, that's a separate bug.
Comment 39 Umherirrender 2010-12-12 13:30:16 UTC
(In reply to comment #38)
> (In reply to comment #37)
> > In some cases, you can use <code>&#35;</code> for #, because the entity is
> > replaced after the braceSubstitution.
> Ew, god, please no.  Escaped entities are escaped entities, they should never
> be being interpreted as wikimarkup; if they are, that's a separate bug.

That is why I said "in some cases". For Template:Color (comment 0) it is possible, because the # is not wikimarkup there. Using 

{{SUBJECTPAGENAME{{#if:yes|&#58;Talk:Foo/bar}}}}

does not work (comment 10), because the &#58; is for wikimarkup (for the parser function)
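
Why the entity helps in the color case can be sketched in Python. This is an illustration under assumptions, not the parser's code: `brace_substitution` is a hypothetical stand-in for the newline check from comment 5, and `html.unescape` merely models whatever later step turns the entity back into a character.

```python
import html
import re

LINE_START_MARKUP = re.compile(r"^(?:\{\||:|;|#|\*)")


def brace_substitution(text: str) -> str:
    """Hypothetical stand-in for the newline check on a mid-line expansion."""
    return "\n" + text if LINE_START_MARKUP.match(text) else text


# A literal "#" triggers the newline; the entity "&#35;" does not,
# because the text starts with "&" at the time of the check.
literal = brace_substitution("#002255")
escaped = brace_substitution("&#35;002255")

# The entity is decoded only afterwards (modelled here with html.unescape).
print(repr(literal))
print(repr(html.unescape(escaped)))
```

The same reasoning explains the colon case: the magic word needs a real ":" as its delimiter at parse time, and an undecoded entity is not one.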
Comment 40 Phillip Patriakeas 2010-12-12 14:36:39 UTC
As HM said in comment 27, all you need generally is a preceding nbsp or <nowiki/>:

style="color:<nowiki/>{{#if:yes|#000;|#fff;}}"

...or something to that effect.
Comment 41 Happy-melon 2011-01-17 10:35:16 UTC
I fixed this in r80430.  The newline is now only added when the brace construct begins with a wikitable element {|
Comment 42 Brion Vibber 2011-01-26 01:20:41 UTC
I've provisionally reverted this in r81012. As noted in code review comments, this alters various existing edge cases, and causes unexpected changes in behavior for constructs that are already in use in pages and tables.

As we're in the middle of settling down work on trunk into the 1.17 deployment and release, I'd strongly recommend revisiting this in a few weeks when things have settled down.

Definitely recommend going ahead and testing things and checking to see what the best machine strategies for fixing up old code are, if these are the correct changes to make.
Comment 43 Mark A. Hershberger 2011-04-12 16:22:27 UTC
Punting this to the new parser Brion has under development.
Comment 44 DieBuche 2011-04-14 20:41:28 UTC
*** Bug 10781 has been marked as a duplicate of this bug. ***
Comment 45 Bergi 2011-05-03 14:06:32 UTC
*** Bug 19302 has been marked as a duplicate of this bug. ***
Comment 46 P.Copp 2012-03-10 18:30:33 UTC
*** Bug 35129 has been marked as a duplicate of this bug. ***
Comment 47 db [inactive,noenotif] 2012-05-05 19:35:37 UTC
*** Bug 36215 has been marked as a duplicate of this bug. ***
Comment 48 db [inactive,noenotif] 2012-07-29 12:36:49 UTC
*** Bug 38697 has been marked as a duplicate of this bug. ***
Comment 49 Danny B. 2012-07-29 14:02:29 UTC
Bumping the importance - so many dupes and so many broken/non-working things because of this.
Comment 50 Fomafix 2012-09-18 06:20:02 UTC
*** Bug 40294 has been marked as a duplicate of this bug. ***
Comment 51 Mormegil 2012-10-24 14:41:03 UTC
Note that the worst case of this problem, unusability of things like {{PAGENAME}} on pages like “*Foo” was specifically solved in Bug 26781 with commits r80511 and r80512.
Comment 52 Mr. Stradivarius 2013-04-09 16:36:44 UTC
This bug is causing unexpected behaviour in {{#invoke:}} as well. In my case, I found this when writing [[Module:UrlToWiki]] that converts URLs into interwiki links. When the module generates the text for a link that uses the colon trick, the parser generates an unwanted new line. I made a demonstration module as well:

https://test2.wikipedia.org/wiki/Module:User:Mr._Stradivarius/colonbug
https://test2.wikipedia.org/wiki/User:Mr._Stradivarius/colonbug

Allow me to join the voices of those calling for this to be fixed. There would be a simple workaround for templates which rely on this behaviour should it be fixed, but as it is the bug makes certain things impossible.
Comment 53 Brad Jorsch 2013-08-05 15:45:39 UTC
*** Bug 52548 has been marked as a duplicate of this bug. ***
Comment 54 db [inactive,noenotif] 2013-11-16 20:58:30 UTC
*** Bug 56562 has been marked as a duplicate of this bug. ***
Comment 55 Andre Klapper 2013-12-04 15:25:17 UTC
Happy-melon: This issue has been assigned to you in January 2010.
Could you please provide a status update and inform us whether you are still working (or still plan to work) on this issue? 
Only in case you do not plan to work on this issue anymore, should the assignee be set back to default? Thanks.
Comment 56 Happy-melon 2013-12-04 15:44:16 UTC
(In reply to comment #55)
> Happy-melon: This issue has been assigned to you in January 2010.
> Could you please provide a status update and inform us whether you are still
> working (or still plan to work) on this issue? 

I pretty much *had finished* working on this, deployed a fix, etc; then Brion reverted.  Essentially my code was rejected.  

> Only in case you do not plan to work on this issue anymore, should the
> assignee
> be set back to default? Thanks.

I'd say this should either be WONTFIXed if it's actually not going to happen, or my fix should be reinstated (I'd expect it still applies fairly cleanly, the Parser code is *very* stable).  There's no reason for anyone *else* to be working on it.
Comment 57 Gerrit Notification Bot 2013-12-04 16:14:55 UTC
Change 99133 had a related patch set uploaded by Bartosz Dziewoński:
Stop prepending newlines to templates starting with *#;:

https://gerrit.wikimedia.org/r/99133
Comment 58 Bartosz Dziewoński 2013-12-04 16:22:44 UTC
This behavior has annoyed me for long enough. I say we should break the wikis and fix it (after appropriate community nudging is done, and probably after we run a diff of the resulting HTML on a large enough subset of pages; the Parsoid testing infrastructure can probably help here a lot).

I tried reapplying Happy-melon patch from r80430 in Ifc6080cb linked above, fixing a few minor merge conflicts on the way and one larger one, and hopefully not breaking too many unrelated things in the process.

It is naturally failing parser tests right now due to how many other things changed in these three years, but that's nothing insurmountable, I can fix the tests myself if there is any chance of this actually getting merged again someday.
Comment 59 Gabriel Wicke 2013-12-05 22:08:36 UTC
Changes to this behavior will also very likely break a lot of existing content for a small gain in usability. This is also the reason why it was reverted in the past. 

Removing the newline insertion would also make life for Parsoid harder. Example case:

{{random}}{{echo|* foo}}

The newline context of * foo now depends on the expansion of the random template. This makes independent parsing, correct WYSIWYG and efficient updates for template expansions very difficult to impossible. The newline insertion hack happens to help us here, even if the original author probably didn't think about future parser development affected by this.
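
The dependency described here can be made concrete with a small Python sketch. It is a simplified model, not Parsoid's logic: `starts_list_item` is a hypothetical helper, `preceding` stands for whatever the earlier template expanded to, and `force_newline` models the bug 529 behaviour.

```python
def starts_list_item(preceding: str, expansion: str, force_newline: bool) -> bool:
    """Return True if `expansion` would be parsed as starting a list item.

    `preceding` is the text already emitted before the expansion;
    `force_newline` models the newline insertion discussed in this bug.
    """
    if not expansion.startswith("*"):
        return False
    if force_newline:
        # With the hack, the answer does not depend on `preceding`.
        return True
    # Without it, "*" is list markup only at the start of a line.
    return preceding == "" or preceding.endswith("\n")


for preceding in ["", "some text", "some text\n"]:
    print(repr(preceding),
          starts_list_item(preceding, "* foo", force_newline=True),
          starts_list_item(preceding, "* foo", force_newline=False))
```

With the insertion, {{echo|* foo}} can be handled without knowing what {{random}} produced; without it, the result changes with the preceding expansion.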

My preference is to focus efforts on better DOM-based templating rather than spending a lot of time moving sideways with wikitext templating.
Comment 60 Brad Jorsch 2014-01-07 18:45:22 UTC
I don't think having the ability to have template output beginning with "#" and similar characters is a "small" gain in usability. This sort of thing seems to come up on enwiki every few months as new people run into this bug.


(In reply to comment #56)
> 
> I pretty much *had finished* working on this, deployed a fix, etc; then Brion
> reverted.  Essentially my code was rejected.  

Looking at the history on this bug and the comment on r81012, it doesn't seem so much "no" as "not right now, we're trying to release 1.17" and then it never got followed up on after. And then Brion was supposed to be working on a new parser, etc.

Possibly the biggest help would be to identify what exactly on the wikis would be broken by making this change.
Comment 61 Bartosz Dziewoński 2014-01-26 15:57:59 UTC
*** Bug 60444 has been marked as a duplicate of this bug. ***
Comment 62 Brad Jorsch 2014-02-04 14:35:24 UTC
*** Bug 60827 has been marked as a duplicate of this bug. ***
Comment 63 Jesús Martínez Novo (Ciencia Al Poder) 2014-03-29 15:27:36 UTC
Someone hit this problem today on IRC (discussed privately).

A template with a link to IRC (although it will affect any protocol) and a parameter to supply the port.

Example:

[irc://{{{server|}}}{{#if:{{{port|}}}|:{{{port}}}}}/{{{channel}}} #{{{channel}}}]

If you specify a port, the colon in the port breaks the link, as it's being interpreted as a definition list.

But if you try to escape it wrapping the colon inside nowiki tags, the link is broken anyway since the < character is interpreted as the end of the link.

See this test https://www.mediawiki.org/w/index.php?oldid=943856
Comment 64 Phillip Patriakeas 2014-03-29 16:10:38 UTC
I'm pretty sure that particular instance could be worked around by using [[Template:Colon]] on wikis that've created it, but that's very much not an ideal solution.
Comment 65 Greg Grossmeier 2014-06-24 17:39:25 UTC
Brad gave a pretty good summary in comment 60, with this as a good next step:

(In reply to Brad Jorsch from comment #60)
> Possibly the biggest help would be to identify what exactly on the wikis
> would be broken by making this change.

Matma: Can you do this? The current behavior also annoys you, so you might be inclined to help move this bug forward.

For the record, as of today there are 23 duplicate bugs for this issue.

Setting status to Assigned (from patchtoreview) since the next step isn't necessarily reviewing the (old) patch, but working out what will break if it's merged.
Comment 66 Bartosz Dziewoński 2014-06-24 20:22:24 UTC
(In reply to Greg Grossmeier from comment #65)
> Matma: Can you do this? The current behavior also annoys you, so you might
> be inclined to help move this bug forward.

I'd love to help, but I don't think I can test a representative sample of all articles in all Wikimedia wikis on the hardware and the Internet connection available to me.

Isn't there a whole infrastructure for Parsoid testing and comparing the results to current parser? I think it'd make sense to use that instead.
Comment 67 Nemo 2014-07-02 15:07:21 UTC
(In reply to Bartosz Dziewoński from comment #66)
> Isn't there a whole infrastructure for Parsoid testing and comparing the
> results to current parser? I think it'd make sense to use that instead.

There is [[mw:Parsoid/Setup]] and mediawiki/services/parsoid/tests/dumpGrepper.js, which could be run on any Wikimedia Labs instance over the Wikimedia projects dumps, but last time I tried I wasn't able to make it work and I ended up using bzgrep instead. :-)

It may get easier if someone familiar with Parsoid improves the docs and/or testing infrastructure for this sort of thing, but in the current state assessing the effects of such a whitespace change certainly is not a few hours' job.
