How to create small PDF files?

Posted: Thu Jul 24, 2008 11:11 pm
by ma.kwonghua
Thanks for uploading the Beethoven Piano Concerto No. 2. Your PDF files are really small! Just over 1MB for 26 pages (first movement). How did you scan and compress the scores, and how did you fit them into a PDF?

I followed

http://imslp.org/wiki/IMSLP:Scanning_music_scores

and scanned a 100-page score, saving each page as a 600x600 dpi black and white (no grayscale) TIFF file. Then I concatenated all the TIFFs into a single TIFF file with CCITT Group 4 compression by:

tiffcp -c g4 *.tiff output.tiff

Finally I converted the concatenated TIFF file into a PDF by:

tiff2pdf output.tiff > output.pdf

However the resultant pdf file is 22MB!

Thanks!

Posted: Fri Jul 25, 2008 12:57 am
by imslp
How large is output.tiff? If the TIFF is about the same size as the PDF, then you have done nothing wrong. ~250kb/page is large, but within expectations for a 600dpi monochrome scan (I had a few ~200kb/page files myself), if the paper size is A4/letter.
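
As a rough sanity check on these figures, the per-page average and the implied compression ratio can be estimated with plain shell arithmetic. This is only a sketch: the A4 dimensions (8.27 x 11.69 in), the reading of 22MB as 22000kB, and all variable names are assumptions, not values taken from the posts.

```shell
# Rough size estimate for a 1-bit A4 scan (integer arithmetic, so approximate).
# Assumed: A4 = 8.27 x 11.69 in, 22MB = 22000kB, 1kB = 1000 bytes.
dpi=600
pages=100
total_kb=22000

width_px=$(( dpi * 827 / 100 ))            # page width in pixels
height_px=$(( dpi * 1169 / 100 ))          # page height in pixels
raw_bytes=$(( width_px * height_px / 8 ))  # uncompressed, 1 bit per pixel

per_page_kb=$(( total_kb / pages ))
ratio=$(( raw_bytes / (per_page_kb * 1000) ))

echo "uncompressed: ${raw_bytes} bytes per page"
echo "compressed:   ${per_page_kb} kB per page (about ${ratio}:1)"
```

Under those assumptions the uncompressed bitmap is a little over 4MB per page, so about 220kB per page already reflects substantial compression.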

Posted: Fri Jul 25, 2008 1:33 am
by Vivaldi
Yes, the file size increases with the resolution, even with CCITT 4 compression. So a file size of 100-200kb per page for a monochrome 600dpi scan (A4 size) after CCITT 4 compression is acceptable.

Posted: Fri Jul 25, 2008 8:30 am
by ma.kwonghua
Yes, for a 600dpi monochrome TIFF scan with CCITT4 compression, it was about 200kB per page, and yes, the TIFF is about the same size as the PDF.

However for IMSLP submissions, should I still insist on 600dpi monochrome or should I drop to 300dpi (for A4 orchestral scores)?

Thanks!

Posted: Fri Jul 25, 2008 9:42 am
by Vivaldi
As per the music scanning guidelines, it is recommended to scan the scores at 600dpi in black and white.

Posted: Fri Jul 25, 2008 10:58 am
by ma.kwonghua
So does it mean that it is okay to have 200-250kB per page?

Posted: Fri Jul 25, 2008 12:20 pm
by daphnis
I'm in the process of re-writing the scanning page. 600dpi isn't necessarily the best minimum sampling rate. 300 will be fine in most cases if your score is not a miniature or pocket score. Always scan in black & white/monochrome/1-bit color mode and adjust the threshold so that no black artifacts appear around the edges of the paper and no "salt and pepper" effect occurs anywhere on the page.

Posted: Fri Jul 25, 2008 6:36 pm
by imslp
Like Daphnis says, it does depend on the score, but when in doubt I would prefer 600dpi as a safe bet. I say this because 600dpi scans can be downsampled to 300dpi without quality loss, whereas the reverse is obviously not true. A4 piano scores should be fine at 300dpi, but some A4 orchestral scores might need 600dpi (it depends very much on the size of the notes, text, etc).

So the short answer is, yes, 200-250kb/page is large but acceptable. However, if the score is long you might want to break it up into movements.

Posted: Sat Jul 26, 2008 4:24 am
by Vivaldi
Feldmahler, would it be better to use at least 600dpi to scan miniature scores since the font sizes are smaller?

Posted: Sat Jul 26, 2008 1:55 pm
by Lyle Neff
Is there a free program that will do CCITT4 on TIFF files? I know Gimp does that, but it can't seem to handle my TIFFs consistently with loading and saving (editing seems to be okay), and my desktop works very slowly with it.

A program that does just the compression would be great.
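
One possibility, going by the tiffcp command quoted earlier in this thread, is to run that same tool over each file in turn. The following is a minimal sketch, not a tested recipe: the function name compress_g4_dir, its two arguments, and the output directory layout are hypothetical; only "tiffcp -c g4" comes from the thread. It assumes libtiff's tiffcp is installed and that the input TIFFs are already 1-bit, since Group 4 applies only to bilevel images.

```shell
# Sketch: recompress every TIFF in a directory with CCITT Group 4.
# compress_g4_dir and its arguments are hypothetical names for illustration.
# Assumes tiffcp (libtiff) is on the PATH and the inputs are already 1-bit.
compress_g4_dir() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    for f in "$src_dir"/*.tiff "$src_dir"/*.tif; do
        [ -e "$f" ] || continue    # skip a pattern that matched nothing
        # Write the recompressed copy under the same file name in out_dir.
        tiffcp -c g4 "$f" "$out_dir/$(basename "$f")" || return 1
    done
}
```

It could be called as: compress_g4_dir scans scans-g4 (both directory names are placeholders). The originals are left untouched, so a failed run loses nothing.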

Posted: Sat Jul 26, 2008 4:16 pm
by carmar1791
To convert to CCITT in GIMP (and, I think, in any program), your image must be indexed and reduced to 2 colors.

http://docs.gimp.org/en/gimp-images-out.html

You can open/convert a set of images in GIMP with the "David's batch" plug-in.

Posted: Mon Oct 13, 2008 10:52 am
by j077y_r0g3r
For b/w I use PNG. The whole PDF will be a little slow to open, but PNG is lossless, and it is the best when you need the highest b/w quality in the smallest space.

I only scan at 600 dpi, even if it's slow, because a 600 dpi image can always be downsampled. With a 300 dpi image, especially of old scores or photocopies, it's impossible to recover quality if after 500 pages you find out there is some imperfection!

Posted: Mon Oct 13, 2008 12:16 pm
by Leonard Vertighel
j077y_r0g3r wrote: For b/w I use PNG. The whole PDF will be a little slow to open, but PNG is lossless, and it is the best when you need the highest b/w quality in the smallest space.
Sorry to contradict you, but CCITT group 4 is more efficient on monochrome (1-bit) files than PNG, and it is lossless as well.

Posted: Mon Oct 13, 2008 12:18 pm
by daphnis
And JBIG2 (lossy or lossless) is even more efficient than that.

Posted: Mon Oct 13, 2008 3:17 pm
by j077y_r0g3r
Sorry, I forgot to mention that I use the automatic JBIG2 compression on the PNGs when creating the Acrobat files.