- Project Runeberg -  Welcome to Project Runeberg
Lysator Linköping University
Project Runeberg (runeberg.org) is a volunteer effort to create free electronic editions of classic Nordic (Scandinavian) literature and make them openly available over the Internet.

Project Runeberg, December 2020


December 2020

Danish novels

Project Runeberg aims to cover the literature of the Nordic countries or Scandinavia, but to be honest, most of our content is Swedish. This year, however, we have improved in the area of Danish novels by authors such as: Herman Bang, Carit Etlar (Carl Brosbøll), J. P. Jacobsen, Aage Madelung, Carl Møller, and Fanny Suenssen.

Our 28th anniversary

Project Runeberg was founded on December 13, 1992.

Looking forward to Public Domain Day

Copyright lasts for the author's lifetime and then for 70 full years (life+70), meaning that January 1st (Public Domain Day) is when it expires for those who died 70 whole years earlier. On January 1st, 2021, this happens to authors who died in 1950. So who are they? One easy way to find out is our list of Nordic Authors.
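The life+70 arithmetic can be written down as a small Python sketch. It assumes only the rule stated above (protection runs through the end of the 70th full year after the author's death); the function names are made up for illustration, and it ignores complications such as translators or works with several authors.

```python
def public_domain_year(death_year: int, term_years: int = 70) -> int:
    """Year on whose January 1st (Public Domain Day) the works become free.

    Protection lasts through December 31 of death_year + term_years,
    so the works are free from the following January 1st.
    """
    return death_year + term_years + 1


def death_year_for(public_domain_day_year: int, term_years: int = 70) -> int:
    """Inverse: which death year is released on a given Public Domain Day."""
    return public_domain_day_year - term_years - 1


for year in (1949, 1950):
    print(year, "->", public_domain_year(year))
```

For a translated work, the same calculation would have to be made for the translator as well, and the later of the two years applies.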

Among them are Nobel Prize winners George Bernard Shaw (1925) and Johannes V. Jensen (1944), but in the case of Shaw we also have to consider when the translators died. Other notable names are Harry Blomberg, Anna Branting, Edgar Rice Burroughs (Tarzan!), Ida Bäckman, Ewald Dahlskog, Ossian Elgström, Grenville Grove, Swedish king Gustav V, B. Rudolf Hall, Thorsten Jonsson, Martin Lamm, translator Wendela Leffler, Eva Neander, Ellen Nordenstreng, Oscar Olsson, George Orwell, Gösta Oswald, and Ester Ståhlberg.

Since we do digitize journals that are 70 years old, some of them already include articles by these authors, such as this 1946 article by Shaw about H.G. Wells in Bonniers litterära magasin, translated into Swedish by the journal's editor Georg Svensson (1904-1998).

Looking back at the 1949-ers

On January 1st, 2020, copyright expired for authors who died in 1949, including Nobel Prize winners Maurice Maeterlinck (1911) and Sigrid Undset (1928). By far our greatest effort this year has gone into Sigrid Undset and Swedish writer Elin Wägner.

It is remarkable that we have no works by Vilhelm Ekelund, one of Sweden's most prominent writers. He is, however, well represented at Litteraturbanken. By Anna Lamberg Wåhlin we have no books, but many translations and articles in the journal Ord och Bild, which her husband edited.

We now have several works by Joel Haugard, Aage Madelung, and Axel Munthe, some by Ernst Enochsson, A. Stefan Gustafsson, Akke Kumlien, Ernst Newman, Håkan Theodor Ohlsson, and Ernst Westerberg, but none yet by Johan Harald Kylin, Siffer Lemoine, Gustaf Reinius, Ansgar Roth, Storm P., Nils Evert Taube, Eva Wahlenberg, or Karolina Widerström.

It was fun while it lasted,
but there is a time for everything.

by Lars Aronsson, December 2020

I founded Project Runeberg in December 1992, 28 years ago, and have managed it almost single-handedly since then. It was an early prototype of what the Internet could be used for, what a website could look like, and how a collaborative volunteer (crowdsourcing) project could be organized. It has inspired others to start their own websites, it has inspired literature scholars and librarians to digitize books, it has inspired some aspects of Wikipedia, the free encyclopedia.

Some people will remember how I also started a wiki website, "susning.nu", in October 2001, how it was closed to editing in April 2004, and how it vanished entirely some time later. Project Runeberg has now reached a similar point where it is closed to contributions. Perhaps it will reopen later, but not in the same shape. Luckily, the risk of it vanishing entirely is much smaller, but should not be neglected.

1. To every thing there is a season,
and a time to every purpose under the heaven:
2. A time to be born, and a time to die;
a time to plant, and a time to pluck up that which is planted;
3. A time to kill, and a time to heal;
a time to break down, and a time to build up;
4. A time to weep, and a time to laugh;
a time to mourn, and a time to dance;
Ecclesiastes 3:1–4 (KJV)

1. Allting har sin tid,
och vart företag under himmelen har sin stund.
2. Födas har sin tid, och dö har sin tid.
Plantera har sin tid, och rycka upp det planterade har sin tid.
3. Dräpa har sin tid, och läka har sin tid.
Bryta ned har sin tid, och bygga upp har sin tid.
4. Gråta har sin tid, och le har sin tid.
Klaga har sin tid, och dansa har sin tid.
Predikaren 3:1–4 (1917)

Project Runeberg has not been the same all of the time. It started out as a few text files on a Gopher and FTP server. The web (HTTP) server was added after about a year. When the project was six years old, I started to scan books as facsimile images of entire pages, instead of just presenting the resulting text. Some years later, just after the turn of the millennium, online proofreading through a wiki-like web form was added. Daily statistics of our growth date back to the fall of 2003. In 2005, Project Runeberg started to use UTF-8 characters for new books. The existing collection was converted to UTF-8 in 2012. Until circa 2010, the majority of books were scanned in black-and-white TIFF G4 format. Later, color JPEG has dominated.

From the beginning, Project Runeberg presented small poems and song texts. The first longer poems and complete novels came in the first years, as did the full text of the Bible in the Swedish translation of 1917. Among the first works in facsimile were the collected works in 14 volumes of Viktor Rydberg. The very first years of Wikipedia (and also susning.nu) coincide with the years (2001–2003) when I scanned the Swedish encyclopedia Nordisk familjebok (two editions, 20+38 volumes, 1876–1926), which was followed in 2004–2008 by the Danish encyclopedia Salmonsens konversationsleksikon (26 volumes, 1915–1930). Later, new genres such as complete years of journals and more than a hundred dictionaries have been added.

The first decade of the millennium was also the time when Google announced their intention to scan many millions of books in a decade (Wikipedia: Google Books), followed by similar declarations from national libraries in France and Norway. It was clear that book scanning was now a big thing, no longer an experiment. In Sweden, literature scholars started Litteraturbanken in 2004.

Wikipedia was growing more mature, and in 2007 I helped to organize the Swedish chapter of the Wikimedia Foundation, Wikimedia Sverige. I was a board member for the first five years (2007–2012). During this time I was also an active contributor to Wikipedia and some of its sister projects: Wikisource and Wiktionary. Wikisource is indeed a direct parallel to Project Runeberg, a book scanning and proofreading project. Maybe I could hope that Wikisource would replace Project Runeberg, just like Wikipedia had replaced susning.nu? I gave that thought serious consideration in 2010–2011, but found it far easier to add and proofread books in Project Runeberg than in Wikisource. Wikisource is one project in Swedish, one in Norwegian and another one in Danish, each having very few active contributors in 2020. Only larger languages like English, French, Italian and German have succeeded in building active communities of contributors.

If Project Runeberg were to continue after 2011, it would need to reinvent itself. Some of the software needed to be redesigned and a reliable source of funding would be necessary. To find out what could and needed to be done, I applied for and received a grant from the Swedish Internet Foundation. During 2012 I attended the annual Wikimania conference in Washington DC and also took the time to visit the Internet Archive's scanning center at the University of Toronto and to meet in New York with Greg Newby, head of Project Gutenberg. To my disappointment, there was far less direction and coordination in book digitization than I had hoped to find. It seemed to me that every project was working on the funding they could hope to find to just randomly digitize whatever books they could find, in the vague hope that someone would find them interesting to read at some later date. There was no demonstrable use or benefit from book scanning that could directly motivate investments. This was a depressing insight.

I gave up hope of finding reliable funding for a "real" project and instead purchased a new scanner with support from Wikimedia Sverige. I had collected some books that I could scan and I did so, just adding volume to the existing Project Runeberg, without rewriting any software. Several volunteers teamed up to help in this effort, both scanning books, importing books scanned by others, and proofreading. But nobody volunteered to improve the software or repair the broken web forum or wiki. This was the end of Project Runeberg's slow growth of 50,000 pages per year (2006–2011) and the beginning of 250,000 pages per year (2012–2020). While these numbers seem like a great success, it was still a continued period of stagnation in technical development.

(Was there a conflict of interest when the board I had just left in 2012 gave me support to buy new equipment? I see it the other way around: for a rather small amount of money, I provided that organization with an opportunity to show how they supported successful scanning of books that are useful to the organization's purpose, to improve Wikipedia. Both as a board member and when scanning books, I volunteered my time without salary or compensation. I'm more worried that nothing useful came from the grant I received from the Swedish Internet Foundation.)

In a few years at the beginning of the millennium, new technical functions had been added to Project Runeberg at a fast pace. Not only could we upload and proofread books, check the editing history, compare versions of a text page, and get daily statistics on growth; the codes or markup used when proofreading also evolved into a language of its own, similar to HTML but not entirely, with its own syntax for table layout and poetry. There were also ways to index books, to edit the presentation of the books, to upload new versions of bad scans, to cut out illustrations from book pages and upload them separately. Added to this was our own wiki and a web forum. In all cases, these functions were implemented as rapid prototypes based on a quick idea, never to a written specification, and without any unit tests or security considerations.

I developed some of these functions, but not all of them. And my helpers soon left the project without documenting their features or their limitations. Some volunteer proofreaders learned how to use them, but nobody knew how to repair them when something went wrong.

Some proofreaders wanted to do more than the markup language could offer, and invented ways to use table layout code for things like centered headings, hanging indent and side margin notes. Well, isn't that great, it looks nice and solves the problem, doesn't it? The problem is that we are supposed to proofread the text of the book, and if a reader spots an OCR error, he or she should be able to correct the error. When opening the proofreading form, there should be the text and not a rat's nest of markup code for table syntax. Markup always needs to be minimalistic.

Among the security considerations left out is the ability to monitor and revert abuse. Wikipedia has developed very advanced features for this, and as a side effect they are also available to Wikisource. Pages can be locked for certain categories of editors, editors can be blocked from editing, any edit will be listed in an edit history, and can easily be reverted by an administrator. Project Runeberg has no such roles of editors, no hierarchy of administrators. Essentially, all edits are anonymous. Edits to text pages during proofreading are properly logged in a history and can be reviewed, but there is no quick revert function. Edits to scanned images are not logged at all. Who did what? Nobody knows.
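To make the missing piece concrete, here is a minimal Python sketch of a logged page history with a quick revert. It is not Project Runeberg's or Wikipedia's code; the class, method and field names are hypothetical, and a real system would also need editor accounts, locking and blocking.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    editor: str  # who made the edit (hypothetical field)
    text: str    # full page text after the edit


@dataclass
class PageHistory:
    revisions: list = field(default_factory=list)

    def edit(self, editor: str, text: str) -> None:
        """Log a new revision instead of overwriting the page."""
        self.revisions.append(Revision(editor, text))

    def current(self) -> str:
        """Return the latest text, or an empty string for a new page."""
        return self.revisions[-1].text if self.revisions else ""

    def revert(self, admin: str, steps: int = 1) -> None:
        """Restore the text from `steps` revisions back as a new logged edit."""
        if steps < 1 or steps >= len(self.revisions):
            raise ValueError("no such earlier revision")
        target = self.revisions[-1 - steps]
        self.edit(admin, target.text)


page = PageHistory()
page.edit("proofreader", "Nordisk familjebok")
page.edit("anonymous", "Nordisk famlljebok")  # a bad edit
page.revert("admin")                          # one call undoes it
```

Because the revert is itself appended to the history, the bad edit stays visible in the log and the question "who did what?" keeps an answer.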

For a project to continue like Project Runeberg in the 2010s, all volunteers must be careful and only use the existing functions with moderation. The project is very vulnerable to attacks. Intentional abuse can quickly get out of hand. This is what happened to susning.nu in 2003 and 2004, and led to the site being closed to editing. Fortunately, Project Runeberg has had no cases of intentional abuse, which is amazing. But in a handful of cases, there have been overly enthusiastic proofreading volunteers who can't accept that some undocumented functions (separate uploading of illustrations) have stopped working or that they aren't allowed to use table markup code to make prettier text pages, or who repeatedly make clumsy mistakes that can only be corrected by the only inside administrator (which is me). In most cases, a simple explanation has sufficed to correct them, but a few have been very stubborn. And for me, trying to lead and develop the project, supporting such users has taken an increasing fraction of my own volunteer time, which is why I decided on December 18 to close the project to editing.

We all have plenty of time to think of where this should go next. Readers can continue to read existing texts on Project Runeberg as long as the website stays open, hopefully for ever. Volunteer proofreaders will have to find a new hobby or move to some other project, perhaps Wikisource. Perhaps I will reopen some of the functions, such as simple proofreading, after first making sure that they can be monitored and reverted. But I am very reluctant to spend time on implementing a full security system with administration roles, locking and blocking.

But perhaps we need to take one step further back, and ask again: why are we really scanning and proofreading books? Is there any real use or benefit to it? How can that benefit be measured, and is there a way to use it as a source of funding?

Some of our digitized works are used a lot, such as the encyclopedias I mentioned and some dictionaries. But does it matter that we also have indexed and proofread most of the text? I never hear any praise for this great effort, or lamentation that some other books are not yet proofread. The national library of Norway has during the 2010s digitized all Norwegian books, with good page images and rather good OCR text, but proofread none of it. And Norway is not in a deep crisis because of this. Maybe it's fine to just skip proofreading? The Swedish national library has digitized far fewer of its books, and Sweden is not in a deep crisis because of this. Maybe it's fine not to scan books? These are the key questions that have not yet been answered.

Project Runeberg has now completed three annual fundraisers, each time raising 25,000 SEK (US$ 2900), which is enough for buying new equipment and covering expenses, but insufficient for salaries for administrators and software developers. Even if some software developers were to volunteer their time and skill, there needs to be a coordinator who stays on the project and doesn't just disappear. I estimated in 2012 that a reasonable project with staff and equipment could be operated on an annual budget of 1–2 MSEK (US$ 120–240 thousand). From where could we get that kind of funding, and how would we write the motivation for it?


November 2020

2020/21 Fundraiser

Our fundraiser for the year ran November 9-20. It was our third, and it was run the same way as the previous ones. A small banner was shown on some of our web pages, promoting donations toward our aim of raising 25,000 SEK for the fiscal year 2020/21. The banner was removed when the aim was reached, and will reappear next year. We have long had a link "Donate" in the header of all our web pages. Read more on our donation page.

Lost in Search

In the early days of the Internet, 25 or 20 or even 10 years ago, it was much less monopolized than now. Websites existed side by side, as did search engines and other services. We focused on digitizing books and collecting free electronic editions of literature, but we never developed our own search engine. Instead, we encouraged people to use existing web search engines.

One day, we added a search box to our pages, but all it did was to forward the query to Google only adding +site:runeberg.org to it, as this limited the search results to those found on our website. Beginning in June 2009, to provide fair competition among search engines, our search box randomly sent your query to Google or Yahoo or Bing. In September 2012, Chinese Baidu and Russian Yandex were added, but Yahoo and Baidu were removed soon after, and Yandex was removed in October 2014, because their coverage was too poor and search hits were less useful. In January 2020 DuckDuckGo was added, leaving us with three: Google, Bing, and DuckDuckGo. Google was the best, the others were so-so but not completely useless.
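The forwarding described here can be sketched in a few lines of Python. This is an illustration, not the code behind our search box: the query-URL prefixes in ENGINES and the function names are assumptions, and in this sketch the "+" before "site:" is simply the URL-encoded space that separates the restriction from the query.

```python
import random
from urllib.parse import quote_plus

# Assumed query-URL prefixes; the ones actually used are not shown in the text.
ENGINES = {
    "google": "https://www.google.com/search?q=",
    "bing": "https://www.bing.com/search?q=",
    "duckduckgo": "https://duckduckgo.com/?q=",
}

SITE = "runeberg.org"


def build_search_url(query, engine):
    """Append the site restriction to the query and build the forwarding URL."""
    restricted = f"{query} site:{SITE}"
    return ENGINES[engine] + quote_plus(restricted)


def pick_engine(rng=None):
    """Choose one of the engines uniformly at random."""
    rng = rng or random
    return rng.choice(sorted(ENGINES))


# Forward one query to a randomly chosen engine.
print(build_search_url("nordisk familjebok", pick_engine()))
```

The whole scheme depends on the engine honoring the site: restriction. An engine that ignores it returns hits from anywhere on the web, and nothing on our side can detect or correct that.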

However, in the autumn of 2020, both Bing and DuckDuckGo started to ignore the +site: clause and presented search results from various other websites. This is not what we intended, so both had to be dropped, leaving us again with only Google. This was not Google's fault. They gained their monopoly position through the failures of the competing engines: intentional or not, but obvious failures.

We don't like that Google has a de facto monopoly on web search. We don't like monopolies in general, but we are particularly unhappy with Google being the monopolist here, because they also scan books. Google's coverage of our web pages was much better before the start of their big book scanning operation, known as Google Book Search. Now, if you google for a phrase from some old Scandinavian book, you are likely to get a hit from Google's own scanning even though the same book and text is available from our website. This was not the case 15 or even 10 years ago. Back then, Google would find and index any new book we scanned within a few days. That is not the case anymore.

And yet, we are still an open and well-structured website that Google could easily index completely. That is not the case with all the books, journals and newspapers scanned by the National Libraries of Denmark, Norway or Sweden. Their websites are intentionally (!) designed not to be indexed by general web search engines. Readers of Scandinavian literature would need a better search engine. Unfortunately, the interest in building one appears to be lacking.


March 2020

Happy Women's Day, March 8!

Recently added to our collection are 19 books by Swedish journalist Elin Wägner (whose works entered the public domain this January) and five years of the Norwegian suffragette journal Nylænde (1888-1892).

Medicine

While the whole world is worrying about the new corona virus, we thought that maybe we can derive some wisdom from the history of previous epidemics. The Spanish Flu of 1918 comes to mind. What has been written about it, really? Perhaps the best accounts we have are encyclopedia entries from the 1920s about influenza, such as this one in Nordisk familjebok.

Over the years, we have digitized and gathered quite a few books relating to medicine, but we have only now made a thematic page for this topic. Most of the books found there are in Swedish. We welcome suggestions for more works to digitize. Recently, we have added:


February 2020

Happy Public Domain Day!

Public Domain Day is January 1st. This is when works by a new group of authors enter the public domain because copyright expires when they have been dead for 70 years. We continue to celebrate it, even though it is already February. So who died in 1949? And what have we added so far?

Since we boldly digitize journals and encyclopedias (having numerous contributors) 70 years after they were published, regardless of when each contributor lived, we have also scanned:

Also recently digitized are some works by people who died in 1947 and 1948:


Project Runeberg, 2021-09-17 23:59 (runeberg)
http://runeberg.org/
