Last modified: 2014-05-28 18:46:32 UTC
Try this URL: http://localhost:8000/huwiki/Vegy%C3%BCletek_%C3%B6sszegk%C3%A9plet-t%C3%A1bl%C3%A1zata?oldid=14167011 It puts Parsoid into a coma. Discovered via production logs after the Parsoid cluster load spiked yesterday, leaving most Parsoid processes stuck.
It does look like a tokenizer issue:

[subbu@earth tests] ./fetch-wt.js --prefix huwiki 14167011 > inf.loop.wt
[subbu@earth tests] node parse --trace peg --prefix huwiki < inf.loop.wt
... some tokens emitted ...
... stuck ...
Change 135611 had a related patch set uploaded by GWicke:
Bug 65812: Speed up processing of huge sync token chunks

https://gerrit.wikimedia.org/r/135611
(In reply to Gerrit Notification Bot from comment #2)
> Change 135611 had a related patch set uploaded by GWicke:
> Bug 65812: Speed up processing of huge sync token chunks
>
> https://gerrit.wikimedia.org/r/135611

Sorry for the spam, this was actually intended for bug 65812.
(In reply to Gabriel Wicke from comment #3)
> Sorry for the spam, this was actually intended for bug 65812.

Never mind.
{{:Sablon:összegtáblázat}} is the transclusion in question on huwiki. Prior to the fix it generated a 408K-token chunk in the tokenizer, and the async token transform manager slowed down drastically after roughly 128K of those tokens had been processed. We traced this to a slowdown in concatenation once the accumulator size crossed a threshold.
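For context, that kind of slowdown is the classic quadratic cost of rebuilding a growing accumulator on every append. The sketch below is illustrative only (it is not the Parsoid token-accumulator code or the actual patch in change 135611; `chunks`, `accumSlow`, and `accumFast` are made-up names), but it shows why processing falls off a cliff once the accumulator gets large, and the linear-time alternative:

// accum-sketch.js -- illustrative sketch, not Parsoid's actual code.
// Simulate appending many token chunks (~400K tokens total) to an accumulator.
const chunks = [];
for (let i = 0; i < 2000; i++) {
    chunks.push(new Array(200).fill({ type: 'TEXT' }));
}

// Quadratic pattern: concat copies the entire accumulator on every append,
// so total work grows with the square of the accumulated token count.
let accumSlow = [];
for (const chunk of chunks) {
    accumSlow = accumSlow.concat(chunk);
}

// Linear pattern: append in place (or buffer the chunks and flatten once at
// the end), so each token is copied only a constant number of times.
const accumFast = [];
for (const chunk of chunks) {
    for (const tok of chunk) {
        accumFast.push(tok);
    }
}

Running the two loops side by side makes the threshold effect visible: the concat version gets progressively slower as the accumulator grows, while the push version stays roughly constant per chunk. See the gerrit change for what was actually done in Parsoid.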
Change 135611 merged by jenkins-bot:
Bug 65812: Speed up processing of huge sync token chunks

https://gerrit.wikimedia.org/r/135611
Fixed by https://gerrit.wikimedia.org/r/135611, and further improved by https://gerrit.wikimedia.org/r/135723 to the point where this huge test case now parses in about 66 seconds.