Project Runeberg's Wiki - we help each other sort out the concepts!
Instructions for proofreaders

Summary

  1. Resize your web browser window to maximum size. If you cannot see all the contents, use the scroll bars to the right. Press F11 in Internet Explorer, Mozilla and Opera to maximize your proofing area.
  2. Edit the text in the lower frame so that it matches the facsimile image in the upper frame. If you cannot make out a detail in the image, use the zoom button on the bottom right to increase magnification in the top window.
  3. Use only plain text. Do not use HTML codes or entities.
  4. Do not correct factual errors. Do not correct spelling errors. Reproduce the text as it is printed.
  5. Remove headings and footers, page numbers, and printers' marks.
  6. Leave a blank line between paragraphs. If the first line of each paragraph is indented, then remove the indentation and insert a blank line instead.
  7. If the final word in a line ends with a hyphen, rejoin it with its remainder, moving the hyphenated part to the next line (or page). Do not, however, remove hard hyphens, for instance in compound words or hyphenated phrases.
  8. Rejoin words written in spaced-out type, so that there are no extra spaces between the letters.
  9. Remove emphasis and pronunciation accents in encyclopedic works (that is, remove marks that are not present in the standard spelling).
  10. Mark poems with <poem></poem>
  11. If you want feedback from us, enter your email address (optional) in the lower frame before you press Save. This address will not be exposed on our website nor will it be used for spam.
  12. If you have read the entire page and corrected all the errors you found, click in the box labeled "The whole page is OK now".
  13. When you are done, press Save. This brings you to a "Thank you" page, from where you can either return to the original web page, or continue to proofread the next or previous page in sequence. If you return to the original web page, you might have to reload the page in your browser to see the changes you have made in the page.

This is the summary version. Exceptions and clarifications to these rules are given below.

Background

Proofreading is an important part of [Project Runeberg]'s activities, one that requires many volunteers. Since 2003 it has been done directly on the web pages. This requires both the scanned facsimile pages (see [scanning]) and the computer-generated OCR texts to be available, which is the case with Project Runeberg's "electronic facsimile editions." For our older texts without digitized page images, you must still use e-mail (runeberg@lysator.liu.se) to report mistakes and problems.

Proofing via webpages is a relatively new part of Project Runeberg, and is still under development. We welcome ideas and suggestions on how we can improve. Some other projects that use proofreading via webpages, and where you might find ideas, are

On every page at Project Runeberg that has both a digital page-image and an OCR-created text page, there is a link "Proofread the page now!". If you click the link you will reach another page showing the page-image and the text, but on this page you can scroll through them independently to see the corresponding lines in each simultaneously (most screens are too small to display the whole of both). You can also correct the text. When you correct something, all you have to do is click "Save". The next person to look at the page will see your corrected version, not the original one. If you read the entire page and correct all the errors you find you may click in the box labeled "The whole page is OK now" found at the end of the "save" line.

When all the pages in a chapter have been proofread then an editor from Project Runeberg puts them together to form an HTML-page.

How to Proofread

We have made it easy for you as volunteers to help Project Runeberg proofread the computer-generated text versions of our electronic facsimile editions. The purpose of proofreading is to improve the accuracy of searches in the texts and to make it easier to use them for other purposes, not to reproduce the exact typography of the printed text. Therefore:

If the printed page has a spelling error or presents facts that are incorrect, let them stand. Let old spellings (fv, hv, dt) remain as they are. We try to reproduce older books here, not write new, modernized texts. If a name is used in a different way in a text (Hälsingborg) than what is common today (Helsingborg) do not change that. If you find an obvious spelling error (Helsingbogr) it is permissible to correct it, but in this case write a comment about what you have changed in the "Comment" field.

Remove the headers and footers, if any. If, for instance, all the pages have an author's name, a chapter title, or something similar across the top or bottom edge, remove it. Also remove page numbers and printers' guidenotes and numbers.

Divisions

Leave a blank line between every paragraph in the text. If the first line in every paragraph is indented, remove the indentation and add a blank line instead. If a page begins with a new paragraph, mark it with a blank line at the very top of the page.

In ordinary running text it is not necessary for line breaks to appear in exactly the same place as they do in the digitized images.
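
The indentation rule above can be sketched in Python. This is only an illustration, not a Project Runeberg tool: the function name indent_to_blank_lines is hypothetical, and the sketch assumes that every line starting with whitespace opens a new paragraph, which will not hold for poems or tables.

```python
def indent_to_blank_lines(text: str) -> str:
    """Replace first-line paragraph indentation with a preceding blank line.

    Assumption: any non-blank line that starts with whitespace opens a
    new paragraph.
    """
    out = []
    for line in text.splitlines():
        if line[:1] in (" ", "\t") and line.strip():
            # New paragraph: make sure a blank line precedes it
            # (also at the very top of the page), then drop the indent.
            if not out or out[-1] != "":
                out.append("")
            out.append(line.strip())
        else:
            out.append(line.rstrip())
    return "\n".join(out)
```

A page that opens with an indented line gets a blank line at the very top, which matches the rule for pages that begin with a new paragraph.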

Poems

When it comes to poetry, of course, preserving line-breaks becomes important; the breaks should be in the correct place, and the OCR-program does not always handle them correctly. Those who proofread these pages first will often find it necessary to change the page layout.

The whole poem should begin with a <poem>-tag and end with a </poem>-tag, but remember that both tags must be on the same page, so if the poem continues on the next page you can either move some lines (if there aren't many more) or close the tag on the first page and open (and close) a second set on the next page. If you are logged into the forum you also have a <poem> button.

In poems, certain lines are sometimes indented. These can be marked by putting in a <tab>.

      Original                      Proofed

                                    <poem> 
     A first poem line              A first poem line
        An indented line            <tab>An indented line
     More beautiful poetry          More beautiful poetry
                                    </poem>

In printed poetry, when a line in a poem is too long, the rest of the line will often be printed as an indented line beneath the first part. This type of indentation should not be proofed using <tab>. In this case rejoin the full line by bringing the continuation up at the end of the previous line. Our computer screens are much wider than the average printed book. How can one tell this type of continuation line from a line indented for reasons of style? It will usually be clear from capitalization and rhyme schemes. When in doubt, leave a comment before hitting the Save button.
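
As a rough illustration of the <poem> and <tab> markup, here is a Python sketch. The function name mark_poem is hypothetical, and the sketch assumes that the least-indented line defines the left margin and that every deeper indent is a stylistic one. It cannot tell a stylistic indent from the continuation of an overlong line, so that decision still has to be made by the proofreader.

```python
def mark_poem(lines: list[str]) -> list[str]:
    """Wrap poem lines in <poem> tags and mark indented lines with <tab>.

    Assumption: the least-indented non-blank line defines the left margin,
    and any deeper indentation is a stylistic indent (one <tab>).
    """
    stripped = [ln.rstrip() for ln in lines]
    indents = [len(ln) - len(ln.lstrip()) for ln in stripped if ln]
    margin = min(indents, default=0)
    out = ["<poem>"]
    for ln in stripped:
        if not ln:
            out.append("")  # keep blank lines between stanzas
        elif len(ln) - len(ln.lstrip()) > margin:
            out.append("<tab>" + ln.lstrip())
        else:
            out.append(ln.lstrip())
    out.append("</poem>")
    return out
```

Applied to the three-line example above, this yields the same lines as the "Proofed" column.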

Paragraph spacing

Between every paragraph there should be a blank line. Sometimes a book will have a larger between-paragraph space printed, or some other form of mark:

        *

or

             *
        *         *

or

        --------

When nothing in the style interferes, remove it and replace it with a line that contains only * at the beginning of the line (and add a blank line before and after the line with the *).
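
The replacement can be sketched in Python as follows. The function name normalize_separators is hypothetical, and the sketch assumes that an ornament consists of lines containing only asterisks or only three or more dashes. It does not judge whether "something in the style interferes"; that remains a decision for the proofreader.

```python
import re

# A line made up only of asterisks, or only of three or more dashes.
_SEPARATOR = re.compile(r"^\s*(?:\*(?:\s*\*)*|-{3,})\s*$")


def normalize_separators(text: str) -> str:
    """Replace ornamental paragraph separators with a single '*' line."""
    out = []
    in_separator = False
    for line in text.splitlines():
        if _SEPARATOR.match(line):
            in_separator = True  # swallow the whole ornament
            continue
        if in_separator and not line.strip():
            continue  # blank lines directly after the ornament
        if in_separator:
            # Drop blank lines before the ornament, then emit
            # blank line, '*', blank line.
            while out and out[-1] == "":
                out.pop()
            out.extend(["", "*", ""])
            in_separator = False
        out.append(line.rstrip())
    if in_separator:  # ornament was the last thing on the page
        while out and out[-1] == "":
            out.pop()
        out.extend(["", "*"])
    return "\n".join(out)
```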

Reconnecting hyphenated words

If a word is hyphenated, put it back together.

      Original                     Proofed

      Därför har Stock-            Därför har Stockholm
      holm S:t Erik som            S:t Erik som

This applies even if the word is divided between one page and the next. In that case it is easiest to cut the partial word from the first page and paste it in front of its ending when you go on to the next page. Don't forget to paste the part you cut out back in!
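
Within a single page, the rejoining shown in the example can be sketched in Python. The function name rejoin_hyphenated is hypothetical, and the sketch assumes that every hyphen at the end of a line is a typesetting hyphen. Hard hyphens (compound words, hyphenated phrases) that happen to fall at a line end would be joined wrongly, so they still need a human decision.

```python
def rejoin_hyphenated(lines: list[str]) -> list[str]:
    """Rejoin words that were split by an end-of-line hyphen.

    Assumption: every line-final hyphen is a soft (typesetting) hyphen.
    """
    work = [ln.rstrip() for ln in lines]
    out = []
    skip_next = False
    for i in range(len(work)):
        if skip_next:  # the next line was used up completely
            skip_next = False
            continue
        line = work[i]
        nxt = work[i + 1].lstrip() if i + 1 < len(work) else ""
        if line.endswith("-") and nxt:
            # Move the remainder of the word up from the next line.
            head, _, rest = nxt.partition(" ")
            line = line[:-1] + head
            if rest:
                work[i + 1] = rest
            else:
                skip_next = True
        out.append(line)
    return out
```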

If the work contains pronunciation marks in the form of apostrophes or accents inside a word then remove them.

      Original                     Proofed

      Ta'rtu                       Tartu

How does one tell an accent or stress mark from a regular apostrophe? It doesn't change the meaning of the sentence when it is removed. An apostrophe in a possessive or a contraction cannot be removed without changing the meaning of the sentence or turning it into nonsense. Again, when in doubt, leave a comment before saving the page.

If the work contains spaced-out text please rejoin it.

      Original                     Proofed           
      s p ä r r a d                spärrad

See below on other formatting.
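
Rejoining spaced-out text can be sketched in Python with a regular expression. The function name rejoin_spaced is hypothetical. The sketch assumes that a run of three or more single characters separated by single spaces is one spaced-out word, so it will merge adjacent spaced-out words ("s p a c e d o u t") and can misfire on runs of genuine one-letter words.

```python
import re

# Three or more single characters separated by single spaces,
# for example "s p ä r r a d".
_SPACED = re.compile(r"\b(?:\w ){2,}\w\b")


def rejoin_spaced(text: str) -> str:
    """Remove the extra spaces inside spaced-out words."""
    return _SPACED.sub(lambda m: m.group(0).replace(" ", ""), text)
```

The result is only the rejoined text; any markup for spaced-out type is a separate step.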

Other formatting

When the typeface is italic, mark the text with <i> </i>.

When the typeface is bold, mark the text with <b> </b>.

When the typeface is "s p a c e d o u t", mark the text with <sp>spaced out</sp>.

The <sp> marking began in [February 2005]. Before that <spärr> was used, and this is still found in texts that were proofread earlier. This type of marking continues to work, so there is no reason to modify the older texts. But we now support the use of <sp>. This marking makes sense to many more proofers, as it can be remembered as "spärrad" in Swedish, "spatieret" in Danish, "sperren" in German, and "spaced out" in English.

In its own editions, Project Runeberg does not display spaced-out text as spaced out, because it is more difficult to read, but rather as underlined text. Spaced out text was used as a substitute for italic--one that was simpler to use, as it did not require an extra typeface.

Rather often we see names that are spaced out. When deciding how much of the text to include in the tag, consider what the reader will see underlined:

      Original                    Proofed

      P. A. S p a r r e           <sp>P. A. Sparre</sp>

      s p ä r r!                  <sp>spärr!</sp>

      e t t, t v å och t r e      <sp>ett</sp>, <sp>två</sp> och <sp>tre</sp>

When punctuation marks occur directly next to other formatting, it is not terribly important whether the punctuation is included within the markup or not. Do what is simplest and makes sense.

Comment. According to usual typesetting conventions, a punctuation mark "belongs" to the word to which it is closest. In the case of parentheses, for instance, not (<i>text</i>) but <i>(text)</i>. The same with bold, etc. Actual underlining (rather than that used as a substitute for spaced-out text) almost never includes ending punctuation in the underlining.

Mathematics is a special case. Parentheses in equations should always be set in roman type (non-italic) regardless of whether they are in the middle of an italicized expression or not. /BLW

A <sc>Small Caps</sc> tag has been added as of April 2007, and proofers will see a <sc>-button if they are logged into the forum.

Illustrations

If the text contains illustrations with captions, move the caption to the space between two paragraphs. It doesn't matter whether you move it forward or back. Forward seems the most natural, but if the paragraph doesn't end on the same page it's fine to move the caption before the paragraph in which it's mentioned. If the page has no paragraph breaks on it, move the caption to the end of the page, with a blank line before it.

As of April 2007 it's possible to connect an illustration with its caption.

Do this: highlight the caption and press the <img>-button. This will create an <img>-tag before and a (new) </img>-tag after the caption. If the caption has multiple lines that's fine; every line between the tags will become part of the caption, even if it is a blank line. All of the usual formatting tags (<b>, <i>, etc.) function within the caption. For illustrations without captions you need only insert the tags with nothing between them (<img></img>).

You may also proof

  <img>Caption</img>

as

  <img>
  Caption
  </img>

The result is the same.

You can also determine whether the illustration will be on the left (or right) of the page, so that the remaining text flows on the right (or left) of the illustration. You merely modify the opening tag to be <img l> (or <img r>); the regular tag or <img c> will cause the illustration to be centered on the page, and no text will be displayed around it.

In older projects, illustrations were marked with a single <img>-tag. This still works, but if there are multiple illustrations on a single page it's important to use either the matched or unmatched tags, but not both types.
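
A quick check for mixed tag styles on a page could look like the following Python sketch. The function name img_tag_style is hypothetical, and the sketch assumes that opening tags take at most one of the arguments l, c, or r, as described above.

```python
import re


def img_tag_style(page: str) -> str:
    """Classify how <img> tags are used on one page.

    Returns "none", "unmatched" (old single-tag style), "matched"
    (every <img> has its </img>), or "mixed".
    """
    opens = len(re.findall(r"<img(?:\s+[lcr])?>", page))
    closes = page.count("</img>")
    if opens == 0 and closes == 0:
        return "none"
    if closes == 0:
        return "unmatched"
    if closes == opens:
        return "matched"
    return "mixed"
```

A result of "mixed" means the page combines both styles and should be changed to use only one of them.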

Remember to use the Preview-button to check if everything looks okay before you save your work with the Save-button.

Tables

As of June 2007 we have added table formatting. To use this, it's best first to understand how a table is constructed.

Tables consist of cells, which are arranged in rows and columns. The cells are not necessarily the same size; the size depends on their contents, but all cells in a column will have the same width. In the simplest table the number of cells equals rows x columns. It is also possible to create a table where a cell is larger and spans two, three, or more columns, or two or more rows in height.

<table>-tag

One begins a table with a <table>-tag and ends it with a </table>-tag. Everything between the two tags consists of rows, and every row is divided into cells. NOTE: <table> and </table> must each be on its own line.

Tables at Runeberg can be displayed with or without borders. If you use a <table b>-tag, then there will be visible lines around all the cells (b=border). Using a <table o>-tag will force the table to be shown in the middle of the page. If you use neither "b" nor "o", you will get a table without borders displayed on the left of the page. It is not possible to display the table on the right side of the page, and text will not be formatted to flow in the space around tables.

Text or numbers in each cell can be aligned to the left, right, or center of the cell. If you want most of your cells to have their contents right-aligned, use <table r>; for centered, use <table c>; and for left-aligned, <table l>.

You can use more than one of these options at the same time, but you must have a blank space between them. The order of the fields doesn't matter. The syntax options for the <table>-tag are

  <table [b|o] [l|c|r]>

Examples: <table>; <table b>; <table l>; <table b c>; <table r b>.
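
To make the syntax line concrete, here is a Python sketch that reads the options of a <table>-tag. The function name parse_table_tag and the dictionary keys are hypothetical. The sketch follows the syntax line literally, so it treats b and o as alternatives; whether the real software accepts both together is not stated here, so that restriction is an assumption.

```python
def parse_table_tag(tag: str) -> dict:
    """Parse '<table [b|o] [l|c|r]>' into its options.

    Assumption: at most one of b/o and at most one of l/c/r is given,
    as the syntax line suggests. The order does not matter.
    """
    tag = tag.strip()
    if not (tag.startswith("<table") and tag.endswith(">")):
        raise ValueError(f"not a <table>-tag: {tag!r}")
    result = {"display": None, "align": None}
    for opt in tag[len("<table"):-1].split():
        if opt in ("b", "o"):
            key = "display"  # b = border, o = middle of the page
        elif opt in ("l", "c", "r"):
            key = "align"    # default alignment of cell contents
        else:
            raise ValueError(f"unknown option: {opt!r}")
        if result[key] is not None:
            raise ValueError(f"conflicting options in {tag!r}")
        result[key] = opt
    return result
```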

<td>-tag

Each cell in a table is marked with a <td>-tag (d stands for "data"). For example,

 <table>
 <td>Country   <td>Capital  <td>Language
 <td>England   <td>London   <td>English
 <td>Italy     <td>Rome     <td>Italian
 <td>Australia <td>Canberra <td>English
 </table>

creates a table with 4 rows and 3 columns. The extra blank space after, for example, "Country", has no effect on the table, but makes it easier to proofread. Remember that all rows must begin in the first space, even if the cell is going to be blank.

The <td>-tag can have arguments similar to those used with the <table>-tag. If you want the text on the right in a cell, use <td r>; if centered, <td c>; if left, <td l>. These cell arguments will override the default you defined in the <table> tag.

You can create a cell that is 2 columns wide with <td 2>, and one that is 2 rows tall (vertically) with <td v2>. If you want to have a cell with both double width and double height, write <td 2 v2>. The entire syntax for the <td>-tag is

 <td [#] [v#] [l|c|r]>

Examples: <td>; <td l>; <td 2>; <td 3 v2 c>; <td r>.
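
The <td> arguments can be read the same way. The Python sketch below is only an illustration: the function name parse_td_tag and the key names colspan and rowspan are hypothetical labels for the width and height arguments described above.

```python
def parse_td_tag(tag: str) -> dict:
    """Parse '<td [#] [v#] [l|c|r]>' into width, height and alignment."""
    tag = tag.strip()
    if not (tag.startswith("<td") and tag.endswith(">")):
        raise ValueError(f"not a <td>-tag: {tag!r}")
    cell = {"colspan": 1, "rowspan": 1, "align": None}
    for arg in tag[len("<td"):-1].split():
        if arg.isdecimal():
            cell["colspan"] = int(arg)       # <td 2>: two columns wide
        elif arg.startswith("v") and arg[1:].isdecimal():
            cell["rowspan"] = int(arg[1:])   # <td v2>: two rows tall
        elif arg in ("l", "c", "r"):
            cell["align"] = arg              # overrides the table default
        else:
            raise ValueError(f"unknown argument: {arg!r}")
    return cell
```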

If you are making a table with borderlines, you must have the "right number of cells" in each row, otherwise the table will appear strange in some web browsers.
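
One way to check for the "right number of cells" is to count how many columns each row occupies, including wide cells and tall cells carried over from earlier rows. The Python sketch below is an illustration under assumptions: the function name row_widths is hypothetical, and it assumes one table row per line, as in the example above.

```python
import re

_TD = re.compile(r"<td([^>]*)>")


def row_widths(rows: list[str]) -> list[int]:
    """Return the number of columns each table row occupies.

    Assumption: each string in rows is one table row (one line of
    markup between <table> and </table>).
    """
    widths = []
    carried = []  # (width, rows still to cover) for tall cells
    for row in rows:
        # Tall cells from earlier rows take up space in this row too.
        width = sum(span for span, _ in carried)
        carried = [(span, left - 1) for span, left in carried if left > 1]
        for args in _TD.findall(row):
            colspan, rowspan = 1, 1
            for arg in args.split():
                if arg.isdecimal():
                    colspan = int(arg)
                elif arg.startswith("v") and arg[1:].isdecimal():
                    rowspan = int(arg[1:])
            width += colspan
            if rowspan > 1:
                carried.append((colspan, rowspan - 1))
        widths.append(width)
    return widths
```

If the returned numbers are not all equal, some row has too few or too many cells.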

Some tricks:

Use the preview-button often if you are creating a complex table.

Even if you ultimately want a table without borders, it can be easier to see errors if you begin with the border (<table b>) and then remove the lines when the table is correctly laid out.

Empty cells are often problematic. Putting a <tab>-tag in your empty cells is an easy way to solve some of the common problems.


Last modified 13 March 2008 13:56
