Archive for the ‘Digitalisierung’ Category

Low-cost entry-level digitization software docWORKS[e]

Thursday, January 22nd, 2009

CCS is releasing a “small” version of its docWORKS document digitiser – the technology that is currently in use in the large-scale digitisation project at the British Library, and also in the KBNL newspaper project, amongst others.

The developers have not cut back on features or quality in their [e]-version – in fact, docWORKS[e] is only limited by its restriction to a single dual-core computer. All the other features – including METS/ALTO output – are exactly the same as those in the “industrial” version.
This release sees CCS reacting to the challenge that medium-sized and small libraries are presently facing: that of satisfying the demands of their users, who are clamoring for an (at least partly) digitised library.
docWORKS[e] is THE package for prime digitising technology at low cost. It will help libraries produce the professional output they need, using a compact, all-in-one digitiser.
docWORKS[e] will be available from August 2009. For more information: e@content-conversion.com

Codex Sinaiticus, eBook and associative search

Monday, October 20th, 2008

neues on 3Sat: programme of 19 October 2008

Robotics speed up book digitisation

Thursday, August 14th, 2008

Once again, the British Library’s book digiti(sz)ation efforts are of interest. The BL is taking big steps towards becoming a NEXT-LEVEL-LIBRARY.

By the end of this year, 20 million pages of the British Library’s 19th century books will be available electronically. Siân Harris visited the library to see how it is being done.

Research Information: August / September 2008

As I write this – and probably as you read it – six sophisticated machines and their operators are hard at work in a corner of the British Library (BL). These machines are busy turning hundreds of pages of old books into digital files every hour.

The BL’s digitisation of 19th century books is one of many digitisation projects around the world that have been funded by Microsoft. The software giant was originally loading the digitised books onto its Live Search platform and about 40,000 British Library items were available on this site before Microsoft pulled its book project at the end of May.

The Live Search Books programme, which also included libraries such as those at the University of California, the University of Toronto, the New York Public Library, the American Museum of Veterinary Medicine and Cornell University, digitised 750,000 out-of-copyright books to put on the platform. However, Microsoft said that it now believes that the best way for a search engine to make book content available is by crawling content repositories created by publishers and libraries.

The end of Live Search Books does not mean an immediate end to the projects it was funding, however. For the British Library, the Microsoft funding covers 20 million pages, or approximately 80,000 to 100,000 books, a target that the library anticipates reaching by the end of the year, subject to production variables. And Microsoft is encouraging its partners to keep their own digitised copies and carry on their projects. ‘We are removing our contractual restrictions placed on the digitised library content and making the scanning equipment available to our digitisation partners and libraries to continue digitisation programmes,’ said Satya Nadella, Microsoft’s senior vice-president for search, portal and advertising, when the Microsoft decision was announced.

Making whole collections electronic

The 19th century book project is neither the British Library’s first digitisation initiative nor its only one – there are 15 such projects going on currently. However, the speed and sheer number of titles being digitised are far greater than in past initiatives and this is changing the process of picking which books to digitise.

‘One of the big challenges with digitisation is title selection,’ said Neil Fitzgerald, book digitisation project delivery manager for the British Library. ‘Mass digitisation allows us to deal with historical biases by digitising a whole collection.’

The six machines in the BL, which were provided by Kirtas Technologies, USA, can each digitise up to 2,400 pages per hour, although Fitzgerald said that 1,200 pages per hour is more realistic for old and fragile books such as many in the BL’s collections. These machines are being put to work 16 hours per day by digitisation partner Content Conversion Specialists (CCS) of Germany. ‘The original target was to digitise about one million pages per month but it will soon be two million pages per month,’ commented Fitzgerald. This project was piloted last year and full production began in late October/early November 2007. The pilot was essential in deciding the workflow. According to Fitzgerald, the book scanning itself posed fewer challenges than other parts of the process. ‘The actual digitisation is relatively easy. The robotics are new, but we have been digitising materials for 20 years,’ he explained. ‘New approaches are needed, however, to cope with volumes.’
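
To put the quoted figures side by side, here is a rough back-of-the-envelope sketch in Python. The machine count, hourly rates and 16-hour day come from the article; the 30 working days per month and the idea that all six machines run without interruption are assumptions made here purely for illustration, so the result is an upper bound rather than a description of the actual workflow.

```python
# Figures quoted in the article
MACHINES = 6
PEAK_PAGES_PER_HOUR = 2_400        # stated maximum per machine
REALISTIC_PAGES_PER_HOUR = 1_200   # Fitzgerald's figure for old, fragile books
HOURS_PER_DAY = 16

# Assumption (not from the article): 30 working days per month, no downtime
WORKING_DAYS_PER_MONTH = 30


def monthly_capacity(pages_per_hour: int) -> int:
    """Theoretical pages per month for all machines at the given hourly rate."""
    return MACHINES * pages_per_hour * HOURS_PER_DAY * WORKING_DAYS_PER_MONTH


realistic = monthly_capacity(REALISTIC_PAGES_PER_HOUR)
peak = monthly_capacity(PEAK_PAGES_PER_HOUR)

print(f"Realistic ceiling: {realistic:,} pages/month")
print(f"Peak ceiling:      {peak:,} pages/month")

# Compare the two monthly targets mentioned in the article with the ceiling
for target in (1_000_000, 2_000_000):
    share = target / realistic
    print(f"Target of {target:,} pages/month = {share:.0%} of the realistic ceiling")
```

Under these assumptions the two-million-page target sits well below the theoretical ceiling, which fits Fitzgerald’s remark that the scanning itself is the easier part of the process.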


Videos of the Treventus ScanRobot book scanner

Thursday, June 26th, 2008

Here

Koninklijke Bibliotheek begins digitising eight million pages of historical newspapers

Monday, May 26th, 2008

from here

The Hague, 26 May 2008. The Koninklijke Bibliotheek in The Hague has signed an agreement with the German company CCS (Content Conversion Specialists) to digitise eight million historical newspaper pages. The digitised newspapers will be searchable on every word in the text and will be included in the Databank Digitale Dagbladen, a project of the Koninklijke Bibliotheek funded by the Nationaal Programma Grootschalige Onderzoeksfaciliteiten.

To carry out the project, CCS has entered into a partnership with the Dutch company M&R of Kampen. The first newspapers will soon be sent to Kampen, where the scanning takes place. Around 200,000 newspaper pages will be digitised per month. All eight million pages will become available within three years. The first results will be made available online to everyone in early 2009.

Over the past four centuries, more than 7,000 national, regional and local daily newspaper titles have appeared in the Netherlands. Newspapers contain information about the history of society, politics, the economy, art, culture and science. They are an indispensable source for many researchers, from historians to language technologists who use the historical newspapers to study the development of language use. A newspaper brings the news of the day, but the information has lasting value. Because of the fragility of the material (thin, poor-quality paper), an important source for scholarly research is at risk of being lost. A large part of the Dutch collection, held by the Koninklijke Bibliotheek as well as by other heritage institutions, is therefore being digitised and made accessible to everyone on the internet. A scientific advisory committee advises the Koninklijke Bibliotheek on the selection of the most important titles from 1618, when the first newspaper appeared in the Netherlands, up to the twentieth century.

In digitising newspapers from the 20th century, the KB faces a number of restrictions under the current Auteurswet (the Dutch Copyright Act). It is currently in talks about this with the Nederlands Uitgeversverbond and various organisations that represent the interests of freelancers and other rights holders.
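
As a quick sanity check on the volumes in this announcement, the small Python sketch below compares the stated monthly rate with the three-year horizon. The page total, monthly rate and duration are taken from the announcement; treating the monthly rate as constant over the whole project is an assumption, and the announcement itself gives the rate only as an approximate figure.

```python
import math

# Figures from the announcement
TOTAL_PAGES = 8_000_000
PAGES_PER_MONTH = 200_000      # stated as an approximate monthly volume
PLANNED_MONTHS = 3 * 12        # "within three years"

# Assumption: the monthly rate stays constant for the whole project
months_needed = math.ceil(TOTAL_PAGES / PAGES_PER_MONTH)

# Monthly rate that would be needed to finish exactly within the planned period
required_rate = math.ceil(TOTAL_PAGES / PLANNED_MONTHS)

print(f"Months needed at {PAGES_PER_MONTH:,} pages/month: {months_needed}")
print(f"Rate needed to finish in {PLANNED_MONTHS} months: {required_rate:,} pages/month")
```

At a flat 200,000 pages per month the project would take a little over three years, so the three-year figure presumably relies on a somewhat higher average rate or on rounding in the announcement.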