I'm working with Unit 1 of the book (86 pages). That's about the same size as the knitting manual, so I have a rough idea of how much work it will involve. I can do the initial proofreading at a rate of ten to twenty pages a sitting; my goal is an average of ten pages a day.
3. Some Special Properties of Matter

17. What is tenacity? By laboratory test we find that a silk thread is stronger than one of cotton, if both have the same diameter, or the same cross-sectional area. A copper wire is more easily broken than one of steel. We say that steel is more tenacious than copper. When we ride in an elevator, our safety depends upon the tenacity of the cable. The tenacity of any material, or its tensile strength, is measured by the force needed to break a rod or wire of that material whose cross-sectional area is unity, one square inch for example. (See Table 7, Appendix B.) It takes a load of 300,000 lb. to break a bar of high-grade steel whose cross-sectional area is one square inch. (See Fig. 6.)

[Figure 006. With its approaches, the George Washington Bridge is 8700 feet in length. The steel cables, which are 36 inches in diameter, support the weight of the bridge. *Courtesy of the Port of New York Authority*]

The steel cables that sustain the weight of the George Washington Bridge are 36 inches in diameter. A single span of the Golden Gate Bridge of San Francisco stretches over 4200 feet of water. (See Fig. 7.)
Primary proofreading is how I refer to my first pass through the text. I'm there to find misrecognized text, rearrange text that was laid out incorrectly, and indicate in some primitive way the large-scale features of the document structure.
As you can see above, I'm already using some conventions, like "empty lines indicate paragraph breaks" and "figure captions go in square brackets". At this point, though, I'm not too concerned with structural markup, and even less with presentation markup. The idea is to eliminate most of the mistakes made during OCR and get the text into a form that will be easy to navigate and compare with the page images when it comes time to add structural markup and text styles.
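Those two conventions are simple enough that a script could pick them up later, when the structural markup gets added. Here is a minimal sketch in Python of how that might look. The function name `split_blocks` and the shortened sample text are hypothetical, chosen for illustration; this is not a description of any tool actually used in this project.

```python
import re

def split_blocks(text):
    """Split proofread text into ("figure" | "paragraph", content) pairs.

    Assumes the two conventions described above: blocks are separated by
    empty lines, and a figure caption is a block wrapped in square brackets.
    """
    blocks = []
    # One or more empty (or whitespace-only) lines separate blocks.
    for raw in re.split(r"\n\s*\n", text.strip()):
        block = " ".join(raw.split())  # collapse line wraps inside a block
        if not block:
            continue
        if block.startswith("[") and block.endswith("]"):
            blocks.append(("figure", block[1:-1].strip()))
        else:
            blocks.append(("paragraph", block))
    return blocks

# Shortened, illustrative sample in the style of the excerpt above.
sample = """We say that steel is more tenacious than copper. (See Fig. 6.)

[Figure 006. With its approaches, the George Washington Bridge
is 8700 feet in length.]

A single span of the Golden Gate Bridge of San Francisco
stretches over 4200 feet of water. (See Fig. 7.)"""

for kind, content in split_blocks(sample):
    print(kind, "|", content)
```

Keeping the conventions this plain during the first pass means nothing has to be decided about the final markup yet; the blocks can be mapped to whatever structure is chosen later.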
I like to have the page images open while I'm doing primary proofreading, but I don't want to be comparing them with the text line by line unless I get to a section that was badly mangled during OCR (like those short columns).