
Sibley Project List of remaining links

Posted: Thu Jun 23, 2011 12:53 am
by icactus
So I came up with a system to produce all the Sibley links to files that have not yet been uploaded. There is the odd exception for files that were uploaded without the 1802.xxxxx tag, but these are few and far between. Also, a small percentage of the links point to non-music items, but the vast majority are scores.

This seems to me like a faster way to avoid hunting for duplicates and constant cross checking of every piece. It's a list of about 10,000 links and we have already uploaded around 4000, so the list has the remaining ones. Here's a random sample of the list I'm talking about:

http://hdl.handle.net/1802/7787
http://hdl.handle.net/1802/7788
http://hdl.handle.net/1802/7789
http://hdl.handle.net/1802/7790
http://hdl.handle.net/1802/7791
http://hdl.handle.net/1802/7792
http://hdl.handle.net/1802/7797
http://hdl.handle.net/1802/7798
http://hdl.handle.net/1802/7800
http://hdl.handle.net/1802/7802
http://hdl.handle.net/1802/7803
http://hdl.handle.net/1802/7804


I don't know a good place to put it because it is so long.... but it is already marked as hyperlinks... any ideas?

Re: Sibley Project List of remaining links

Posted: Thu Jun 23, 2011 12:55 am
by icactus
I should have added that the point of doing it in this format is so that people can remove the links as they finish them, but it only takes a few seconds to repopulate the list in case someone forgets to delete a link.

Re: Sibley Project List of remaining links

Posted: Thu Jun 23, 2011 1:57 am
by imslp
Wow... this was exactly what I had in mind when I introduced the SIBLEY1802.* file names a few years ago. If people have time (this is certainly low priority) I would recommend resubmitting all Sibley scans that do not use the SIBLEY1802.* naming format, so that we can know for sure what files are missing.

@Icactus: What is the format of your system? Is it a script? Is it possible to somehow run a live-updating mini-website without hammering the IMSLP server?

Re: Sibley Project List of remaining links

Posted: Thu Jun 23, 2011 4:26 am
by icactus
I just used IMSLP's search function and a bunch of filtering in Excel to compile the list, then another script to convert the data into links. I don't know how to write the code to make it check in real time. I thought for now it would be a good idea to just make a few wiki pages of the links in numerical order for people to delete as they check through them, but I don't think I have the ability to create new pages.
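
For anyone who wants to reproduce the filtering step without Excel, here is a minimal sketch in Python. It is not the script used above: the file-name pattern (SIBLEY1802. followed by the handle number and a dot), the sample file names, and the function names are all assumptions made for illustration.

```python
import re

# Assumed pattern: the handle number follows "SIBLEY1802." and ends at the next dot.
SIBLEY_ID = re.compile(r"SIBLEY1802\.(\d+)\.")

def uploaded_ids(file_names):
    """Collect the handle numbers found in SIBLEY1802.* file names."""
    ids = set()
    for name in file_names:
        match = SIBLEY_ID.search(name)
        if match:
            ids.add(int(match.group(1)))
    return ids

def remaining_links(candidate_ids, file_names):
    """Return handle links for every candidate number that has no uploaded file."""
    done = uploaded_ids(file_names)
    return [f"http://hdl.handle.net/1802/{n}"
            for n in sorted(set(candidate_ids) - done)]

# Hypothetical sample data, not real IMSLP file names.
sample_files = [
    "SIBLEY1802.7793.abcd-score.pdf",
    "SIBLEY1802.7794.efgh-parts.pdf",
    "Unrelated_file.pdf",
]
links = remaining_links(range(7791, 7796), sample_files)
print("\n".join(links))
```

Files uploaded without the 1802 tag are invisible to this approach, so they would still show up as outstanding.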

Re: Sibley Project List of remaining links

Posted: Thu Jun 23, 2011 7:13 am
by pml
I went looking for a copy of the list over at the wiki and can’t see any instance of you making it available there, unless I’m looking in completely the wrong place. In which case, you haven’t made it obvious!

Assuming it isn’t there yet, are you able to work the list into wiki formatting and link it somewhere from the Sibley project page? If you’re not able to, feel free to send me the spreadsheet and I can easily wikify it.

Cheers, Philip

Re: Sibley Project List of remaining links

Posted: Thu Jun 23, 2011 4:26 pm
by icactus
hooray! i have a MOVE tab now so I should be able to put it up on a page! I just didn't want to add it to the current sibley page as it is very long. I'll break it into a few different pages for ease of editing when people need to delete links after uploading. I can wikify it in excel so no worries.

Re: Sibley Project List of remaining links

Posted: Thu Jun 23, 2011 5:19 pm
by icactus
page is up and linked from Sibley Project page. I think we should get rid of the composer list section as it isn't necessary now that we can track all the links that are outstanding. What do other people think?

http://imslp.org/wiki/IMSLP:Sibley_Mirr ... :Link_List

Re: Sibley Project List of remaining links

Posted: Thu Jun 23, 2011 11:45 pm
by pml
One thing that could be improved is that the links redundantly repeat the address, which doubles the size of the list. That is not inconsiderable when dealing with a list with 2,000 items. Every single character that can be shaved off saves about 2K: here’s a quick method using find and replace methods to save 20 K (from 125 K):

Replace " http://" with " " (the leading space is important!)
Replace "<br>\r" with "\r*" (where \r is shorthand for return)
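
As a rough sketch, the same two replacements could be scripted in Python. The link format in the sample (a bracketed external link whose label repeats the URL, followed by <br>) is an assumption about how the page is written, and the function name is made up for illustration.

```python
def shorten(wikitext, newline="\n"):
    """Apply the two find-and-replace steps to a block of wikitext."""
    # Step 1: drop "http://" from the visible label only. The leading space
    # is what separates the label from the link target inside the brackets.
    text = wikitext.replace(" http://", " ")
    # Step 2: swap each "<br>" + line break for a line break + bullet.
    text = text.replace("<br>" + newline, newline + "*")
    return text

# Assumed link format, two sample lines.
sample = (
    "[http://hdl.handle.net/1802/7787 http://hdl.handle.net/1802/7787]<br>\n"
    "[http://hdl.handle.net/1802/7788 http://hdl.handle.net/1802/7788]"
)
shortened = shorten(sample)
print(shortened)
print(len(sample), "->", len(shortened))
```

Note that under this rule the first line gets no bullet, so it would need one added by hand.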

A template could make much greater savings again by dispensing with the part of the Sibley address which is always the same. (Several of us are pretty handy with concocting these, so don’t panic.)

Feel free to add some other lists, e.g. Link List 2, Link List 3 etc. About 2,000 links per page initially is just on the verge of being manageable.

One other thing to mention in the preamble — some Sibley works have to be uploaded to the US server as life+50 years hasn’t elapsed for Canada — contributors who don’t have access to the US server should flag these for attention by those who do: probably easiest to add "PD-US" after the link.

Cheers Philip

Re: Sibley Project List of remaining links

Posted: Fri Jun 24, 2011 12:25 am
by icactus
I couldn't get your suggestion to show up in the preview so I've left them as is. Adding the " " and removing "http://" caused the link to be written out as text and then separately linked next to itself and written out in full, which looked weird. And I couldn't get a \r to work anywhere.

In any case, I'm still filtering the full list to get rid of all the non-music files (a lot of Dentistry files.... and Gravestones....) so once it's done we'll have a clean list to work from.

Re: Sibley Project List of remaining links

Posted: Fri Jun 24, 2011 12:34 am
by pml
Remember the point of writing the "\r" was that it was a convenient stand-in for the real character, i.e. line break. There's no point searching for "\r" anywhere on that page — you won't find it. Here are a few "real" line breaks:






See the problem?

Anyway, I've devised a template with a few little tricks up its binary sleeves. The same text you were doing as a long link with each URL copied twice is achieved by about a dozen characters:

{{Spill|####}} — where #### is the number after the 1802 part. Each use of the template should go on a fresh line.

(Spill is a backronym for "Sibley project: icactus' link list")

Cheers Philip

PS Special features: an optional second variable can be used for standard messages or info about a given Sibley item: for examples, try out

{{Spill|1234|0}} and {{Spill|1234|1}}

PPS Using {{Spill}} on the first 1500 items did this:
Before: (94,277 bytes)
After: (21,033 bytes)

Each line uses the # symbol which easily allows a running count of how many items are present in a particular numeric range. I've also divided each cohort of 1,000 files so that people can edit individual parts of the page (say, the files numbered in the 3,000s or 4,000s) rather than having to edit the entire page – this should help prevent edit conflicts.

It is trivially easy for me to convert your link format to the Spill template, so I suggest you continue adding more items in exactly the same verbose format as you had used previously, and I will chop them down to prevent the page length becoming too great.
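
For illustration only, a conversion along these lines could be sketched in Python. This is not necessarily how the conversion was actually done: the input format, the function name, and the placement of the # directly before the template are assumptions.

```python
import re

# Matches the handle number in a Sibley link.
HANDLE = re.compile(r"http://hdl\.handle\.net/1802/(\d+)")

def to_spill(lines, prefix="#"):
    """Turn each line containing a Sibley handle link into a Spill template call.

    Lines without a handle link are passed through unchanged.
    """
    converted = []
    for line in lines:
        match = HANDLE.search(line)
        if match:
            converted.append(prefix + "{{Spill|" + match.group(1) + "}}")
        else:
            converted.append(line)
    return converted

# Assumed verbose format with the URL written twice.
verbose = [
    "[http://hdl.handle.net/1802/7787 http://hdl.handle.net/1802/7787]<br>",
    "[http://hdl.handle.net/1802/7788 http://hdl.handle.net/1802/7788]<br>",
]
compact = to_spill(verbose)
print("\n".join(compact))
```

Each output line carries only the number that varies, which is where the size saving comes from.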

Re: Sibley Project List of remaining links

Posted: Fri Jun 24, 2011 2:30 am
by pml
Also in reply to Feldmahler’s post above, wouldn’t the complete results of using the following ListFiles search:

http://imslp.org/wiki/Special:ListFiles ... SIBLEY1802.

allow you to compile exactly the list of Sibley items that aren’t represented on IMSLP just as icactus has done?

What’s more, you could presumably create a search button (like the existing search buttons for IMSLP# and PMLP# in the Wiki sidebar) that for an arbitrary #### does a search for files named SIBLEY1802.####.

Cheers, PML

Re: Sibley Project List of remaining links

Posted: Fri Jun 24, 2011 8:31 pm
by icactus
I've got a program now that automatically fetches the page titles of the Sibley links and reports each one next to its Sibley link (which is what Rochester should have done on their website). This has made it much faster for me to filter the bad links (and scholarly papers and gravestones) from my list of everything not yet uploaded.

Wouldn't it be useful to append this extra title information somewhere? If someone wants a particular piece that hasn't been uploaded from Sibley yet, they could find it through Google on the Sibley Links page, see that it still needs to be uploaded, and follow the link to get it from the Sibley site in the meantime. I know it takes up extra KB but it seems like really useful information to have.

This is what i'm talking about:

http://hdl.handle.net/1802/5487 Chants sacrés : 60 motets avec accompt. d’orgue ou piano pour messes, saluts, mariages, etc.
http://hdl.handle.net/1802/5488 La noce Bretonne. Morceau imitatif. Violon seul.
http://hdl.handle.net/1802/5489 Clair de lune; extrait de la Suite bergamasque [par] Claude Debussy. Transcription pour piano
http://hdl.handle.net/1802/5490 Poems of childhood. Blue pigeon
http://hdl.handle.net/1802/5491 Il pleut des pétales de fleurs. The rose-leaves are falling like rain. [Music by] Henry Hadley. Op. 49
http://hdl.handle.net/1802/5492 Cunnin’ little thing
http://hdl.handle.net/1802/5493 Gedichte, op. 42. Gieb, schönes Kind, mir deine Hand. English & German
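
A minimal sketch of that kind of title fetcher, using only the Python standard library, might look like the following. It is not the program described above. It assumes the item title can be read from the page's <title> element, which may not hold for every Sibley page, and all function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

class TitleParser(HTMLParser):
    """Collects the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def extract_title(html):
    """Return the page title with runs of whitespace collapsed."""
    parser = TitleParser()
    parser.feed(html)
    return " ".join(parser.title.split())

def fetch_title(url, timeout=10):
    """Download a page and return its title (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        html = response.read().decode("utf-8", errors="replace")
    return extract_title(html)

def annotate(url, title):
    """Format one line of the list: link first, then the title."""
    return f"{url} {title}"

# Offline demonstration on a made-up page; fetch_title is not called here.
sample_html = (
    "<html><head><title>La noce Bretonne.\n Morceau imitatif.</title>"
    "</head><body></body></html>"
)
line = annotate("http://hdl.handle.net/1802/5488", extract_title(sample_html))
print(line)
```

When running fetch_title over thousands of links, it would be polite to pause between requests so as not to hammer the Rochester server.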

Re: Sibley Project List of remaining links

Posted: Fri Jun 24, 2011 11:38 pm
by pml
Hi icactus, I can see some immediate uses for that; though its presence on the list of links would perhaps be counter-productive. Might I be so bold as to suggest you e-mail me the entire list thus generated for the 15,000 or so Sibley items? (Unedited, i.e. complete with gravestones and non-musical items!)

How goes the other list of 8,000 items not uploaded?

Cheers, Philip (philip.m.legge AT gmail DOT com)

Re: Sibley Project List of remaining links

Posted: Fri Jun 24, 2011 11:57 pm
by icactus
I just sent you the link list, but did you want the version with some of the titles next to the links? I have 2000 with titles since i started using that method.

Re: Sibley Project List of remaining links

Posted: Sat Jun 25, 2011 12:08 am
by pml
I meant the list with descriptions, actually :-) No harm done with sending the other list though: I presume you’ll be culling it like you have been doing with the links on the Wiki page?

P.