Last modified: 2014-11-01 18:31:39 UTC
This might seem ridiculous at first glance, but it would be incredibly useful for writing Commons transfer scripts (similar in concept to CommonsHelper, but calling the API from JavaScript). It may be as simple as adding upload.wikimedia.org to $wgCopyUploadsDomains in InitialiseSettings.php. However, I don't know if the server configuration will allow this to work straight away.
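For illustration, a minimal sketch of what such a change to InitialiseSettings.php might look like, assuming the usual per-wiki array structure for configuration variables there; the 'commonswiki' key and the Flickr entry are purely illustrative, not the actual deployed config:

```php
// Hypothetical InitialiseSettings.php fragment -- not the deployed config.
// Allow upload-by-URL fetches from our own media host in addition to
// whatever external domains are already whitelisted.
'wgCopyUploadsDomains' => array(
	'default' => array(),
	'commonswiki' => array(
		'*.flickr.com',          // illustrative existing entry
		'upload.wikimedia.org',  // the addition proposed in this bug
	),
),
```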
See also my comment at bug 14919 comment 5
Adding dependency to tracking bug 37883 (Wikimedia Commons features). Adding reedy as CC to get feedback on the server configuration issue.
See also bug 20512
[feature => severity "enhancement"]
Adding Ryan; he's the man, apparently
One issue with this is that the proxy server currently handling upload-by-URL requests can't do HTTPS. So we would either need to fix that bug, or give some warning that HTTPS requests will error out.
Is there already a bug "add HTTPS capability to the proxy server"? If so, please add a dependency.
HTTPS upload by URL feature request: https://bugzilla.wikimedia.org/show_bug.cgi?id=40596
Could we now enable this feature or is there another blocker?
(In reply to comment #9)
> Could we now enable this feature or is there another blocker?

I guess it should be enabled on testwiki and confirmed to work first...
Could someone please go ahead and enable this on testwiki?
(In reply to comment #11)
> Could someone please go ahead and enable this on testwiki?

https://gerrit.wikimedia.org/r/47299
Thanks; however, it doesn't seem to work for me. I ran a test from test2wiki (this was easier because my JS code is set up for CORS):

HTTP POST to http://test.wikipedia.org/w/api.php
action=upload
filename=0.28589522187660577.png
text=this is a test file
comment=upload comment
token=<VALID EDIT TOKEN>
url=http%3A%2F%2Fupload.wikimedia.org%2Fwikipedia%2Ftest2%2F5%2F53%2F0.28589522187660577.png
ignorewarnings=true
format=json
origin=http%3A%2F%2Ftest2.wikipedia.org

This is the response:

{"servedby":"srv193","error":{"code":"http-bad-status","info":"Error fetching file from remote source","0":"403","1":"Forbidden"}}
(In reply to comment #13)
> {"servedby":"srv193","error":{"code":"http-bad-status","info":"Error fetching
> file from remote source","0":"403","1":"Forbidden"}}

acl to-wikimedia dst 208.80.152.0/22
acl to-wikimedia dst 91.198.174.0/24
acl to-wikimedia dst 10.0.0.0/16
acl to-wikimedia dst 10.64.0.0/16
# Do not allow any fetches from our own IP ranges
http_access deny to-wikimedia

I'm not sure if the answer is to make Squid serve those requests, or to add a list of sites that shouldn't use $wgCopyUploadProxy. Suspect that's a question for ops, whether they're OK with letting the proxy read from the cluster...
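To make the first option concrete, here is a hedged sketch of how the Squid ACL above could be relaxed for the public media host only. The acl/http_access directives are standard Squid syntax, but the ACL name is invented here and the real WMF proxy configuration may be organised quite differently:

```
# Hypothetical exception: permit fetches from the public media host,
# placed before the blanket deny of our own IP ranges. Squid applies
# http_access rules first-match, so the allow takes precedence.
acl to-upload-wm dstdomain upload.wikimedia.org
http_access allow to-upload-wm
http_access deny to-wikimedia
```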
No, an upload-by-url proxy is the wrong way to do it. If we want to copy files within the upload.wm.org realm, then we should use efficient server-side copies (e.g. Swift's X-Copy-From header), not go through the application servers and upload-by-URL proxies.

Moreover, copying files internally seems wrong to me in general. It's probably okay if it's a limited use case, but if it's something that's going to get popular, then some other way of referencing the same file multiple times should be found, rather than having the same contents copied over and over in the media storage backends.
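For readers unfamiliar with the Swift mechanism mentioned above: Swift can duplicate an object entirely inside the cluster with a zero-byte PUT carrying an X-Copy-From header, so the bytes never leave the storage backend. A sketch of building such a request follows; the account, container, and token values are illustrative placeholders, not real WMF names:

```python
# Sketch of a Swift server-side copy request (OpenStack Swift object API).
# A PUT with an empty body and an X-Copy-From header tells Swift to copy
# the source object to the destination path internally.

def build_swift_copy_request(account, src_container, src_object,
                             dest_container, dest_object, auth_token):
    """Return (method, path, headers) for a Swift server-side copy."""
    path = "/v1/%s/%s/%s" % (account, dest_container, dest_object)
    headers = {
        "X-Auth-Token": auth_token,
        # Source is given as /container/object within the same account.
        "X-Copy-From": "/%s/%s" % (src_container, src_object),
        # Empty request body; Swift moves the data server-side.
        "Content-Length": "0",
    }
    return ("PUT", path, headers)

# Illustrative placeholder values only.
method, path, headers = build_swift_copy_request(
    "AUTH_mediawiki", "wikipedia-test2-local-public", "5/53/example.png",
    "wikipedia-commons-local-public", "5/53/example.png", "dummy-token")
```

The resulting triple could then be sent with any HTTP client; Swift also supports a dedicated COPY verb with a Destination header, which is equivalent in effect.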
Maybe so. However, Commons transfer has always been done by a download-upload process (this is what CommonsHelper on toolserver does, for example). Fixing this bug would allow this tried-and-true approach to continue at a faster rate. Or, we could wait an indefinite amount of time for the file storage backend to be complexified, convoluted, etc...
(In reply to comment #14)
> Suspect that's a question for ops whether they're ok with letting the proxy
> read from the cluster..

Were ops ever contacted about this?
(In reply to comment #17)
> Were ops ever contacted about this?

See answer in comment 15 by Faidon.
(In reply to comment #18)
> See answer in comment 15 by Faidon.

My bad, I didn't realise Faidon was part of the ops team. It seems we've reached a stalemate: ops is refusing to fulfil the request, but no alternative is being suggested.

(In reply to comment #15)
> It's probably okay if it's a limited use case, but if it's something that's
> going to get popular

Just so you are aware, Faidon... I daresay hundreds of thousands of files have already been copied from WMF wikis to Commons, already leading to massive duplication on the servers. So this process is already rather popular, and this bug is a way to streamline it.

To be clear, I would welcome an alternative internal approach, or a rationalisation of the file storage backend, but I don't see those things happening anytime soon. Reconfiguring the proxy can be done now (as far as I can tell) and would make the process as it already exists a lot simpler.
[CC'ing Fabrice as this covers Uploading/Multimedia]
*** Bug 62820 has been marked as a duplicate of this bug. ***
An RfC is running at Commons: https://commons.wikimedia.org/wiki/Commons:Requests_for_comment/Allow_transferring_files_from_other_Wikimedia_Wikis_server_side

I didn't conceal that this may not be implemented, *but* I hope that a strong consensus and some of the comments by the community will motivate the responsible people to reconsider their position. The way files are currently transferred likely puts more load on the WMF servers than if the proxies were allowed to fetch from WMF directly.
Status update: on [[Commons:Commons:Requests for comment/Allow transferring files from other Wikimedia Wikis server side]], we have a unanimous consensus.

(In reply to Faidon Liambotis from comment #15)
> No, an upload-by-url proxy is the wrong way to do it. If we want to copy
> files within the upload.wm.org realm, then we should use efficient
> server-side copies (e.g. Swift's X-Copy-From header), not go through the
> application servers and upload-by-URL proxies.
>
> Moreover, copying files internally seems wrong to me in general. It's
> probably okay if it's a limited use case, but if it's something that's going
> to get popular, then some other way of referencing the same file multiple
> times should be found, rather than having the same contents copied over and
> over in the media storage backends.

Actually, we already do that with manual bots and tools that transfer media from local Wikimedia wikis to Commons once they have been cleared as freely licensed or in the public domain. So I propose to enable this, as it won't create more copies than we currently have, and then open a new bug to work on a better solution.