Last modified: 2013-11-15 07:20:41 UTC
From bug 36580:
<Robin_Watts> That looks a lot like you're rendering to JPEG - the ringing artifacts etc.
<chrisl> hexmode: the "heavily compressed" effect is, as Robin_Watts mentioned, because it's jpeg compressed - the solution is: don't use jpeg.....
Created attachment 10534
using png
Switching to png output instead of jpeg, using the command in bug 36580 comment 4, results in a smaller file size as well:
$ gs -sDEVICE=png16m -sOutputFile=after_gs.png -dFirstPage=1 -dLastPage=1 -r150 -dBATCH -dNOPAUSE -q Welcome2WP_English_082310.pdf
gives me a file size of 59394 instead of 143024.
Created attachment 10535
downscaled png after convert; compare to attachment 10524 (https://bugzilla.wikimedia.org/attachment.cgi?id=10524)
Gerrit change: https://gerrit.wikimedia.org/r/6802
The GS default quality for the jpeg device is 75 (-dJPEGQ=75). It would be better to first try a saner value like -dJPEGQ=95 and compare the output size and quality between png and jpg before switching to png. Switching to png could have a huge impact on Wikisource, which uses PDF files.
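To make the suggested comparison concrete, here is a minimal sketch that builds the two Ghostscript command lines (jpeg at -dJPEGQ=95 versus png16m) from the flags already used in this bug. The helper name gs_render_command and its parameters are hypothetical, and it only builds the argument list; running it and comparing sizes is left to the caller.

```python
def gs_render_command(pdf_path, out_path, device="jpeg", page=1, dpi=150,
                      jpeg_quality=None):
    """Build a Ghostscript argument list that renders one page of a PDF.

    Hypothetical helper: the flags mirror the command quoted in this bug.
    """
    args = [
        "gs",
        f"-sDEVICE={device}",
        f"-sOutputFile={out_path}",
        f"-dFirstPage={page}",
        f"-dLastPage={page}",
        f"-r{dpi}",
        "-dBATCH",
        "-dNOPAUSE",
        "-q",
    ]
    # JPEGQ only applies to the jpeg device; leave it out for png16m.
    if device == "jpeg" and jpeg_quality is not None:
        args.append(f"-dJPEGQ={jpeg_quality}")
    args.append(pdf_path)
    return args


# The two candidates to compare for the same page.
jpeg_cmd = gs_render_command("Welcome2WP_English_082310.pdf", "after_gs.jpg",
                             device="jpeg", jpeg_quality=95)
png_cmd = gs_render_command("Welcome2WP_English_082310.pdf", "after_gs.png",
                            device="png16m")
```

Each list can be passed to subprocess.run(cmd, check=True), after which os.path.getsize on the two output files gives the size comparison; the visual quality still has to be judged by eye.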
png is better for low-color pages, jpg is better for wide-color pages.
Copy-pasted from a comment I made on Gerrit change #6802:
----------------------------------------------------------
We could add a parameter to the thumb syntax to let the user choose the rendered format. Something like:
[[File:foo.pdf|thumb|png]]
[[File:foo.pdf|thumb|jpg]]
And have the default set by a global configuration variable such as $wgPdfThumbOutputFormat or something. Would get us the best of both worlds :-]
----------------------------------------------------------
That is definitely an easy change to the current patchset, and I will be more than happy to review it :)
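As a rough illustration of the proposal above (not MediaWiki code), the lookup could work like this: take an explicit png or jpg option from the thumb syntax if one is given, otherwise fall back to the configured default. All names here are hypothetical, and DEFAULT_PDF_THUMB_FORMAT merely stands in for the suggested $wgPdfThumbOutputFormat.

```python
ALLOWED_FORMATS = ("jpg", "png")

# Stand-in for the proposed $wgPdfThumbOutputFormat (hypothetical).
DEFAULT_PDF_THUMB_FORMAT = "jpg"


def resolve_thumb_format(options, default=DEFAULT_PDF_THUMB_FORMAT):
    """Return the first explicit format among the link options, else the default."""
    for opt in options:
        name = opt.strip().lower()
        if name in ALLOWED_FORMATS:
            return name
    return default


def thumb_format_from_wikitext(link, default=DEFAULT_PDF_THUMB_FORMAT):
    """Very simplified parse of [[File:foo.pdf|thumb|png]] style links."""
    inner = link.strip()
    if inner.startswith("[[") and inner.endswith("]]"):
        inner = inner[2:-2]
    # The first field is the file target; the rest are options.
    _target, *options = inner.split("|")
    return resolve_thumb_format(options, default)
```

With this sketch, [[File:foo.pdf|thumb|png]] resolves to png, while [[File:foo.pdf|thumb]] falls back to whatever default the wiki has configured.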
It would be better to calculate the number of colors in each PDF page, because one PDF file may combine low-color text pages and colorful illustrations. Or, if the color count is too expensive to calculate, one could compress the same page to both lossless png and 95% jpeg and choose whichever image is smaller. For wide-color images, lossless png will be MUCH larger than high-quality 95% jpeg.
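Both per-page strategies can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, pixels are taken to be an iterable of (r, g, b) tuples from an already rendered page, and the 256-color cutoff is an arbitrary placeholder rather than a tuned value.

```python
def count_colors(pixels, limit=256):
    """Count distinct colors, stopping early once `limit` is exceeded."""
    seen = set()
    for px in pixels:
        seen.add(px)
        if len(seen) > limit:
            break  # no need to scan the rest of the page
    return len(seen)


def pick_by_color_count(pixels, limit=256):
    """Strategy 1: png for low-color pages, jpg for wide-color pages."""
    return "png" if count_colors(pixels, limit) <= limit else "jpg"


def pick_smaller(png_bytes, jpeg_bytes):
    """Strategy 2: encode the page both ways and keep the smaller result."""
    if len(png_bytes) <= len(jpeg_bytes):
        return "png", png_bytes
    return "jpg", jpeg_bytes
```

The second strategy costs two renders per page but needs no threshold; the first needs only one render once the color count is known.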
I have abandoned Gerrit change #6802 pending a proper design choice which should be happening in this bug report.
[Patch in Gerrit got reviewed (and abandoned), hence resetting keyword]
Changing state back to 'new' since the previous patch was abandoned some time ago.
I'm interested in following up on this bug, particularly for PDFs of vector graphs generated by data analysis software (like R or Mathematica), which researchers (including myself) routinely upload to Commons. The quality of JPEG thumbnails for these PDF graphs is abysmal compared to thumbnails of a native PNG.
Original files:
https://commons.wikimedia.org/wiki/File:Active_Editors_arwiki.pdf
https://commons.wikimedia.org/wiki/File:Active_Editors_arwiki_2.png
Thumbnails:
https://upload.wikimedia.org/wikipedia/commons/thumb/3/39/Active_Editors_arwiki.pdf/page1-1004px-Active_Editors_arwiki.pdf.jpg
https://upload.wikimedia.org/wikipedia/commons/0/06/Active_Editors_arwiki_2.png
The only other way to avoid these compression artifacts for vector plots is to upload them as SVG (which renders as PNG). However, in many cases PDF is the default export option and the most common format for scientific media that people will consider donating to Commons.