

Info


The web archive is a part of the National Collection

The National Library archives Finnish publications according to the Act on Collecting and Preserving Cultural Materials (1433/2007, only in Finnish and Swedish) in cooperation with the publishing industry. Online material has been archived since 2006 when this was enabled in the Copyright Act (404/1961, section 16b). The Copyright Act also lays down provisions on the use of archived online material. A collection plan confirmed by the Ministry of Education and Culture in accordance with the Act on Collecting and Preserving Cultural Materials serves as the basis for searching for and storing resources.

Contents of the web archive

The web archive consists of a representative and diverse range of online material collected since 2006; it is not a complete record of past internet content. New material is added to the archive through an annual Finnish domain harvest, continuous harvesting of online news and Twitter content, and harvests focused on specific themes. By the end of 2020, the size of the web archive was approximately 240 terabytes.

The thematic harvests are described at the collection level in the Finnish National Bibliography.

The annual Finnish domain harvest collects internet content from the .fi and .ax domains and, since 2005, other domains as well, using a language identification tool.

Since 2024, online material has been harvested from Finnish public administration websites and other significant websites of Finnish communities and companies in a continuous quarterly harvest.

The National Library has archived news content on a regular basis since 2009. As a rule, constantly updated newspaper websites have been archived daily since 2011. Since 2015, the archiving of news content behind paywalls has also been piloted. The web archive contains more than 600 different newspaper webpages.

The continuous Twitter harvest started in October 2020. This harvest collects selected Twitter accounts at regular intervals, approximately once a month, focusing on official accounts as well as accounts that are considered important and concentrate on selected topics. As of 2021, this harvest covers approximately 2,500 Twitter accounts. New accounts to be archived are constantly being sought.

The thematic harvests focus on content related to different subjects and events that are often excluded from the annual Finnish domain harvest. Material archived through thematic harvests is, when possible, surveyed together with scholars, experts and the general public to ensure that the most diverse range of material on the theme in question is archived.

Use of the web archive

Customer use of the web archive is based on the Copyright Act. The archived web material is available for use on legal deposit terminals at the six legal deposit libraries in Finland.

The web archive is also available for use at the National Audiovisual Institute and the Library of Parliament.

The web archive and e-legal deposit copies can be used on legal deposit terminals. The material may be printed, sound may be recorded from the loudspeakers, and the display screen may be photographed. Digital copies cannot be made, and the terminals are not connected to the internet. A public index of the web archive is freely available on the internet.

Essential documents related to the web archive

Privacy policy statement (HTML, in Finnish only)

Accessibility statement (HTML, in Finnish only)