We need one or more volunteers to scan and OCR a book for us. Specifically, we need to scan and OCR _The Phoenix Guards_.

I did _Jhereg_, _Yendi_, and _Teckla_ myself. Each of them took 2-3 hours to scan, OCR, and check everything the spellchecker queried. I was using a somewhat slow flatbed scanner (I chose it for its ability to handle photos, not for its speed; *and* it's several years old) and Abbyy FineReader 6.0 OCR. (Note that TPG is in the range of 2-3 times as big as any of those three, so on this basis TPG might take a bit over 6 hours total.)

(So far as I can see it's legal to OCR a book you own; it's what you do with it *after* that that may be illegal. It's certainly legal to OCR a book *for the copyright holder*, that is, for Steven, and pass the file to him. If you do this for us, we do ask you *not* to distribute the results to *anybody* else!)

Obviously you need a flatbed scanner and OCR software to do this. The package I used is downloadable for a 15-hour trial (hours of actual use, not elapsed hours). You also need to be willing to risk or sacrifice a copy of the book in question; we don't have a budget or a pile of free copies of the books sitting around anywhere.

It was easiest to scan with the pages loose rather than bound, which was pretty easy to arrange -- for $0.50 Kinko's cut the binding off for me on their big guillotine paper cutter. This does have the downside of ruining the book, but a used paperback in poor condition works fine for scanning. Scanning with the pages bound probably ruins the book anyway, through pressing it down flat on the scanner. When we get to the rare books, it may be worth considering alternatives, like photographing the pages with a digital camera.

Anybody interested, please get in touch with me by email, dd-b at dd-b.net (no point tying up the group for it).
I will provide more details about what we do and don't want in the resulting files, and coordinate assigning page number ranges to the various volunteers. I'm hoping those who volunteer can put in a few hours fairly soon -- say, three people each putting in three hours in the next week would get the book done.

This is for the full-text search engine for the books. We hope to make this available in "alpha" test mode to a very few selected people relatively soon (sorry, I've already figured out who I want to ask), and it will be a part of the new public web site when it opens. It'll also be useful to Steven to have an etext of the published version of the book (especially so for TPG, since the manuscript etext appears to have been lost).

Oh, in fairness, I should say there's still some possibility that a problem (political / legal) with making the search engine publicly available *could* crop up, though it now looks unlikely.

We will, of course, credit the people who scanned particular books for us in the credits on the search engine.

I'm currently using Steven's manuscript etext for _Issola_ and _500 Years After_, and will probably start with the manuscript etext for the more recent books where one is available. But having the search engine reflect the actual published text is the long-term goal, so eventually, if this goes ahead, we'll need to scan the rest of the books, too. (Looking at the amount of work the scanning is, it's pretty clearly less work to just do the scan than it is to collate the manuscript etext manually against the published copy and update it.)

-- 
David Dyer-Bennet, dd-b at dd-b.net / New TMDA anti-spam in test
John Dyer-Bennet 1915-2002 Memorial Site http://john.dyer-bennet.net
Book log: http://www.dd-b.net/dd-b/Ouroboros/booknotes/
New Dragaera mailing lists, see http://dragaera.info