Last modified: 2014-11-21 00:23:18 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T75662, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 73662 - Server does not accept MIME RFC2231 encoded header values
Status: RESOLVED INVALID
Product: MediaWiki
Classification: Unclassified
Component: API
Version: unspecified
Hardware: All All
Importance: Unprioritized normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2014-11-20 16:33 UTC by Fabian
Modified: 2014-11-21 00:23 UTC (History)

Description Fabian 2014-11-20 16:33:25 UTC
Values in MIME headers which are encoded using RFC 2231 are not accepted. As the only header value which may be non-ASCII (and thus might need to be encoded) is the filename in ApiUpload, the server currently only accepts uploads whose filename contains no non-ASCII chars.

Another interpretation of this bug is to ask why there is a 'filename' in the header of the chunk/file entry of the MIME request at all, as the server apparently ignores it. But that would just mask the underlying issue: there is no way to get Unicode data in the header values to the server (except by just using the encoding of the server, but as far as I can see that is not MIME compliant, and Python 3's library doesn't support it).
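
For illustration, the workaround mentioned in parentheses (sending the filename in the server's encoding) would look roughly like this on the wire. This is a minimal sketch that assumes the server expects UTF-8, which this report does not state; `build_part_header` is a hypothetical helper, not part of any library, and it does no escaping of quotes in the filename.

```python
def build_part_header(field: str, filename: str, encoding: str = "utf-8") -> bytes:
    """Return a Content-Disposition line with the filename as raw encoded bytes.

    Hypothetical helper for illustration only; the UTF-8 default is an
    assumption about the server, not something this report confirms.
    """
    line = f'Content-Disposition: form-data; name="{field}"; filename="{filename}"'
    # The header is no longer pure ASCII once the filename contains "Ü".
    return line.encode(encoding) + b"\r\n"


header = build_part_header("file", "Ü.jpg")
print(header)
```

The resulting line carries the bytes 0xC3 0x9C for "Ü" unencoded inside the quoted string, which is exactly what makes it questionable from a MIME point of view.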
Comment 1 Brad Jorsch 2014-11-20 17:53:54 UTC
I note that one of the header examples you gave in bug 73661,

 Content-disposition: =?utf-8?b?Zm9ybS1kYXRhOyBuYW1lPSJmaWxlIjsgZmlsZW5hbWU9?=   
   =?utf-8?b?IsOcMi5qcGci?=

is not actually valid. See RFC 2047 section 5.
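
To see what that header was meant to say, the two base64 payloads can be decoded by hand. The following is a small Python sketch (variable names are illustrative only). It shows that the entire structured value, parameters included, was wrapped in encoded-words, whereas RFC 2047 section 5 says an encoded-word must not be used in a parameter of a Content-Disposition field.

```python
import base64

# The two base64 payloads taken from the encoded-words in the header above.
encoded_words = [
    "Zm9ybS1kYXRhOyBuYW1lPSJmaWxlIjsgZmlsZW5hbWU9",
    "IsOcMi5qcGci",
]

# Decode each payload as UTF-8 and join them, since adjacent
# encoded-words are concatenated after decoding.
decoded = "".join(base64.b64decode(w).decode("utf-8") for w in encoded_words)
print(decoded)
```

This prints `form-data; name="file"; filename="Ü2.jpg"`, i.e. the disposition type and both parameters were encoded as if they were free text.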

However, the other one,

 Content-disposition: form-data; name="file"; filename*=utf-8''%C3%9C.jpg

is also not correctly recognized.

But that doesn't have anything to do with MediaWiki, as PHP itself is not correctly handling such encoded parameters when populating $_POST and $_FILES. If this gets fixed in PHP, MediaWiki should accept it fine.
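
For reference, the second header form can be produced and parsed back with Python's standard `email` package. This is only a sketch of what an RFC 2231 aware sender and receiver do; it says nothing about how PHP populates $_FILES.

```python
from email.message import Message
from email.utils import encode_rfc2231

filename = "Ü.jpg"

# Percent-encode the UTF-8 bytes and prefix the charset and an empty language tag.
ext_value = encode_rfc2231(filename, charset="utf-8")

# The trailing '*' on the parameter name marks the value as RFC 2231 encoded.
header_value = 'form-data; name="file"; filename*=' + ext_value

# Parse the header back the way an RFC 2231 aware receiver would.
msg = Message()
msg["Content-Disposition"] = header_value
parsed = msg.get_filename()

print(header_value)
print(parsed)
```

The first line printed matches the header value quoted above, and the second is the original filename.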
Comment 2 John Mark Vandenberg 2014-11-21 00:23:18 UTC
(In reply to Brad Jorsch from comment #1)
> I note that one of the header examples you gave in bug 73661,
> 
>  Content-disposition:
> =?utf-8?b?Zm9ybS1kYXRhOyBuYW1lPSJmaWxlIjsgZmlsZW5hbWU9?=   
>    =?utf-8?b?IsOcMi5qcGci?=
> 
> is not actually valid. See RFC 2047 section 5.

Yes, and it is noted as garbage in that bug. We should find the Python 2 bug for that.

> However, the other one,
> 
>  Content-disposition: form-data; name="file"; filename*=utf-8''%C3%9C.jpg
> 
> is also not correctly recognized.
> 
> But that doesn't have anything to do with MediaWiki, as PHP itself is not
> correctly handling such encoded parameters when populating $_POST and
> $_FILES. If this gets fixed in PHP, MediaWiki should accept it fine.

Shouldn't this be filed as a bug against PHP, and this report tracked as an 'upstream' bug? I couldn't find a PHP bug about this, but it is very possible I have missed it because I'm not familiar with the terms PHP uses.

https://www.mediawiki.org/wiki/API:Upload only says the following about these fields:

file - File contents
chunk - Chunk contents

paraminfo says type 'upload'; that is all.

https://en.wikipedia.org/w/api.php?action=paraminfo&modules=upload

API:Upload suggests it should look like

Content-Disposition: form-data; name="file"; filename="Apple.gif"

But that doesn't address non-US-ASCII filenames.

It looks like we can send any value as the filename in Content-disposition.
The following is copying my rough analysis on https://gerrit.wikimedia.org/r/#/c/174677/ (would appreciate any corrections or historical titbits from mediawiki devs):

fwiw, this filename value is exposed to MediaWiki extensions via WebRequestUpload method getName.

http://git.wikimedia.org/blob/mediawiki%2Fcore.git/c1826209e739d51359bcea37ff4116eed9bd971c/includes%2FWebRequest.php#L1173

($fileInfo comes from $_FILES which is http://php.net/manual/en/reserved.variables.files.php)

Interestingly, Safari sends a Unicode filename to the server using HTML encoding (probably numeric character references of the form &#123;), which is decoded by Sanitizer.php: http://git.wikimedia.org/blob/mediawiki%2Fcore.git/c1826209e739d51359bcea37ff4116eed9bd971c/includes%2FSanitizer.php#L32

WebRequestUpload method getName does not appear to be used in the current MediaWiki codebase, but it is used (badly) by some (probably broken) MediaWiki extensions. I quickly checked the v1.16 codebase, and can't see any use of getName to be concerned about.
