FAIRBANKS - “Self-editing is the path to the dark side,” warns writer Eric Benoit. “Self-editing leads to self-delusion, self-delusion leads to missed mistakes, missed mistakes lead to bad reviews.” I entered the dark side when I deluded myself into sending an early draft of last week’s column to the News-Miner, instead of the smoother, more cogent final version that’s available by emailing me at hillofbooks@gmail.com.

Proofreading’s among my biggest complaints about e-books. Up there with “Why can’t I bequeath someone else the e-books I’ve purchased?” is “Why are even newly released e-books riddled with typographical errors?” “C” appears as “e,” “j” as “i,” words are jumbled, etc. The older the work, the worse it is. I bought several 99-cent copies of Montaigne’s “Essays,” published in the 1500s, for my Nook before finding a version that was readable. The others had entire chapters scrambled.

The problem is the same even with newly published e-books, albeit on a lesser scale. In “Why Are E-books Riddled with Typos?,” a Forbes.com article, Tim Worstall identified the surge of self-published books as a major culprit, since “almost no self-publishers are passing their work under the nose of an editor.” The authors often believe they can proof their own work, but the human mind’s wired to breeze through its own writings, mentally transposing the correct words onto existing errors. You can try techniques like reading your work backwards, sentence by sentence, but it’s always better to bring in another brain and set of eyes.

What really irks Worstall and me is when brand new e-books are rife with typos, which happens frequently when print is digitized for computers. This involves optical character recognition, or OCR, software that’s used to scan print books’ pages, recognize the images as words, and then convert the scanned words into text files. A study in D-Lib Magazine, an online publication “dedicated to digital library research and development,” said “most OCR software claims 99 percent accuracy,” while the average standard for OCR accuracy is 90-98 percent. Doesn’t sound like much? There are between 2,500 and 4,000 characters on an 8-by-11-inch page, so at 95 percent accuracy a 4,000-character page could hold 200 OCR errors. Even a 99 percent accuracy rate allows as many as 40 mistakes. On every page. As Worstall complained, the publishers “aren’t paying enough attention to the production process.” The errors in new e-books boil down to publishers not employing enough humans to spot the problems.
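The per-page arithmetic is easy to check. Here is a minimal sketch in Python, assuming the error count is simply the share of characters the software misreads; the function name is illustrative and does not come from the D-Lib study.

```python
def expected_ocr_errors(chars_per_page: int, accuracy_percent: int) -> int:
    """Estimate misread characters on one page at a given OCR accuracy."""
    # Characters read wrongly = page length times the error share.
    return round(chars_per_page * (100 - accuracy_percent) / 100)


# Page sizes and accuracy rates mentioned in the column, plus 95 percent.
for chars in (2500, 4000):
    for accuracy in (90, 95, 98, 99):
        errors = expected_ocr_errors(chars, accuracy)
        print(f"{chars} characters at {accuracy}% accuracy: {errors} errors")
```

On a 4,000-character page this gives 400 errors at 90 percent accuracy, 200 at 95 percent and 40 at 99 percent.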

Not so with Project Gutenberg, which has a system for producing much cleaner e-books, thanks to help from a nonprofit outfit called Distributed Proofreaders. Project Gutenberg is an archive of 44,890 out-of-copyright titles that are readable and free at Gutenberg.org. It’s the brainchild of Michael Hart, the inventor of e-books, whose dream was to preserve the great written works of humankind and make them freely available through the Internet. Hart spoke at Noel Wien Library a few years before he died in 2011, and it was one of the more freewheeling lectures ever given there, which is saying something.

Hart’s brother’s best friend worked on the new Xerox Sigma V mainframe computer at the University of Illinois, one of the nodes of the spanking new Internet. The friend finagled for Michael an unlimited account on the Xerox, an account that has since been estimated to have been worth from $100,000 to $100 million. Hart conceived of Project Gutenberg as a way to “give back” to society by creating a database of “the 10,000 most consulted books.” He did that and more with an army of volunteers from Distributed Proofreaders.

Distributed Proofreaders sends each volunteer proofer a scanned page from a book along with the OCR version of the same page. The volunteer makes any necessary changes to the OCR version and sends it on to another volunteer proofer, who repeats the task. Finally, it’s sent to a “post processor” who reviews the work and compiles the pages into a book. This allows for closer scrutiny than reading the undiluted OCR version alone, and it’s easier to spot errors when reading pages out of context. It also permits scores of people to work on a single book simultaneously.
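The page-by-page relay described above can be pictured with a toy sketch in Python. This is not Distributed Proofreaders’ actual software; the function names, the sample pages and the corrections are all made up for illustration, and each “proofer” is reduced to a list of fixes applied to the OCR text.

```python
def proof_page(ocr_text: str, corrections: dict[str, str]) -> str:
    """One proofing pass: apply a volunteer's corrections to a page."""
    for wrong, right in corrections.items():
        ocr_text = ocr_text.replace(wrong, right)
    return ocr_text


def post_process(pages: list[str]) -> str:
    """The post processor's last step: compile proofed pages into a book."""
    return "\n".join(pages)


# Hypothetical OCR output for a two-page book, one error per page.
ocr_pages = ["Tbe quick brown fox", "iumps over the lazy dog"]

# Each round stands in for one volunteer; the second catches what the first missed.
proofing_rounds = [
    {"Tbe": "The"},
    {"iumps": "jumps"},
]

proofed_pages = []
for page in ocr_pages:  # pages are independent, so many people can work at once
    for corrections in proofing_rounds:
        page = proof_page(page, corrections)
    proofed_pages.append(page)

book = post_process(proofed_pages)
print(book)
```

Because no page depends on any other, the pages could be handed to different volunteers at the same time, which is what lets scores of people share a single book.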

As for future columns, I’ll double-check them with the proofreader I live with before submission, following Shakespeare’s advice: “Be sure of it; give me the ocular proof.”

Greg Hill is the former director of Fairbanks North Star Borough libraries.