Archive for the ‘THE VALUE OF DIGITIZATION’ Category

Robotics speed up book digitisation

Thursday, August 14th, 2008

Once again, the British Library's book digiti(sz)ation efforts are of interest. The BL is taking big steps towards becoming a NEXT-LEVEL-LIBRARY.

By the end of this year, 20 million pages of the British Library’s 19th century books will be available electronically. Siân Harris visited the library to see how it is being done.

Research Information: August / September 2008

As I write this – and probably as you read it – six sophisticated machines and their operators are hard at work in a corner of the British Library (BL). These machines are busy turning hundreds of pages of old books into digital files every hour.

The BL’s digitisation of 19th century books is one of many digitisation projects around the world that have been funded by Microsoft. The software giant was originally loading the digitised books onto its Live Search platform and about 40,000 British Library items were available on this site before Microsoft pulled its book project at the end of May.

The Live Search Books programme, which also included libraries such as those at the University of California, the University of Toronto, the New York Public Library, the American Museum of Veterinary Medicine and Cornell University, digitised 750,000 out-of-copyright books to put on the platform. However, Microsoft said that it now believes that the best way for a search engine to make book content available is by crawling content repositories created by publishers and libraries.

The end of Live Search Books does not mean an immediate end to the projects it was funding, however. For the British Library, the Microsoft funding covers 20 million pages, or approximately 80,000 to 100,000 books, a target that the library anticipates reaching by the end of the year, subject to production variables. And Microsoft is encouraging its partners to keep their own digitised copies and carry on their projects. ‘We are removing our contractual restrictions placed on the digitised library content and making the scanning equipment available to our digitisation partners and libraries to continue digitisation programmes,’ said Satya Nadella, Microsoft’s senior vice president for search, portal and advertising, when the Microsoft decision was announced.

Making whole collections electronic

The 19th century book project is not the British Library’s first digitisation initiative or its only one – there are 15 such projects going on currently. However, the speed and sheer number of titles being digitised are far greater than past initiatives and this is changing the process of picking which books to digitise.

‘One of the big challenges with digitisation is title selection,’ said Neil Fitzgerald, book digitisation project delivery manager for the British Library. ‘Mass digitisation allows us to deal with historical biases by digitising a whole collection.’

The six machines in the BL, which were provided by Kirtas Technologies, USA, can each digitise up to 2,400 pages per hour, although Fitzgerald said that 1,200 pages per hour is more realistic for old and fragile books such as many in the BL’s collections. These machines are being put to work 16 hours per day by digitisation partner Content Conversion Specialists (CCS) of Germany. ‘The original target was to digitise about one million pages per month but it will soon be two million pages per month,’ commented Fitzgerald. This project was piloted last year and full production began in late October/early November 2007. The pilot was essential in deciding the workflow. According to Fitzgerald, the book scanning itself posed fewer challenges than other parts of the process. ‘The actual digitisation is relatively easy. The robotics are new, but we have been digitising materials for 20 years,’ he explained. ‘New approaches are needed, however, to cope with volumes.’
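The throughput figures quoted above can be sanity-checked with a little arithmetic. The sketch below is illustrative only: the function names are invented here, and it assumes that all six machines run at the same rate for the full 16 hours with no downtime, which the article does not claim.

```python
def pages_per_day(machines: int, pages_per_hour: int, hours_per_day: int) -> int:
    """Upper bound on daily output, assuming every machine runs at the
    same rate for the whole shift (an idealisation, not a BL figure)."""
    return machines * pages_per_hour * hours_per_day


def days_needed(target_pages: int, daily_pages: int) -> float:
    """Production days required to reach a page target at a given daily output."""
    return target_pages / daily_pages


# Figures quoted in the article
MACHINES = 6
HOURS_PER_DAY = 16
REALISTIC_RATE = 1_200  # pages per hour for old and fragile books
MAXIMUM_RATE = 2_400    # pages per hour, rated maximum

realistic_daily = pages_per_day(MACHINES, REALISTIC_RATE, HOURS_PER_DAY)
maximum_daily = pages_per_day(MACHINES, MAXIMUM_RATE, HOURS_PER_DAY)

print(realistic_daily)  # 115200
print(maximum_daily)    # 230400
print(round(days_needed(2_000_000, realistic_daily), 1))  # 17.4
```

Under these assumptions, the two-million-page monthly target corresponds to roughly 17 full production days at the realistic rate, so it appears to sit within the capacity described.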

Award helps safeguard UK’s research journals

Tuesday, August 12th, 2008

Research Information 20 July 2008

The Higher Education Funding Council for England (HEFCE) has announced nearly £10 million of funding for a collaboration between higher education libraries led by Imperial College London and the British Library. The funding will enable the creation of the UK Research Reserve (UKRR).

UKRR is an agreement between higher education and the British Library whereby the British Library will store low-use journals for the HE community and make them accessible to researchers and others using state-of-the-art ordering and delivery systems.

Deborah Shorley, director of library services at Imperial, said: ‘The UKRR is a fantastic example of HEFCE, Imperial and the British Library working together to produce a better and more coherent way to access research material. It addresses the problem of libraries up and down the country with duplicate copies of low use periodicals and will offer a more sophisticated approach to providing information for the UK’s research community.’

Imperial College will be managing the scheme in conjunction with the British Library.

Historical Society awards 3 grants

Saturday, August 9th, 2008

The News Messenger

COLUMBUS — The Ohio Historical Society has been awarded three grants totaling more than $430,000 from public and private sources.
The funding will benefit the society’s digital archives collections, Ohio battlefield preservation efforts and educational outreach to classrooms, respectively.
“As a nonprofit organization, the Ohio Historical Society relies heavily on grants and private contributions to supplement ever-dwindling state funds,” said William K. Laidlaw, OHS executive director and CEO. “The three grants will make a significant impact on these programs and will help our organization to continue its mission to preserve and interpret Ohio’s history.”

The Ohio Newspaper Digitization Project

The National Endowment for the Humanities awarded $353,069 to the Ohio Historical Society to begin digitization of Ohio’s microfilmed newspapers.

The Ohio Newspaper Digitization Project, a part of the National Digital Newspaper Program developed by NEH and the Library of Congress, will digitize 100,000 Ohio newspaper pages published between 1880 and 1922 during the two-year grant period. Newspapers digitized as part of the grant award will be included in the Library of Congress’ Chronicling America database at www.loc.gov/chroniclingamerica.

“The ultimate goal over the next 20 years is to create a national online, searchable resource of historically significant newspapers,” said Angela O’Neal, OHS collections technical services manager who oversees the Historical Society’s digital collections.

The Ohio Newspaper Digitization Project will build upon an earlier NEH initiative, the United States Newspaper Program, which enabled the Society to locate, catalog and microfilm Ohio’s newspapers. As a result, the society holds the most complete Ohio newspaper microfilm collection in the state, comprising some 20,000 volumes of newsprint.

Because the initial project will be limited to a small number of newspapers, an advisory group of journalists, historians, educators, scholars, librarians and archivists will select the titles to be digitized, according to O’Neal. “This is just the beginning,” she said. “The Society will continue to apply for NEH funds in upcoming grant cycles until we can complete the Ohio Newspaper Digitization Project.”

Workflows for Mass Digitisation

Thursday, July 17th, 2008

Author: Claus Gravenhorst
at Colloquium of Library Information Employees of the V4+ Countries

Accessible information is a basic need of society or, to put it another way, of everyone. Usually the original can only be accessed in printed form or on microfilm/microfiche, which means that searching, using and distributing the information is time-consuming, cost-intensive and not available to everyone. The digitisation and conversion of printed items into electronic formats were, until recently, complex and cost-intensive. Insufficient budgets and/or resources prevented extensive transformations to digital repositories. Reliable methods for long-term security and the storage of these enormous data sets were virtually unavailable.

As a result of the METAe project (http://meta-e.uibk.ac.at), funded by the European Commission through the 5th Framework Research Program, CCS Content Conversion Specialists GmbH, Germany, developed a comprehensive software solution that has been available on the market since 2003 under the brand name docWORKS. It is a production tool that offers an integrated workflow for the automated, structured conversion of printed documents into digital objects, which describe the physical and logical document structure through consistent use of international XML standards. These XML documents are equivalent in quality and structure to born-digital documents and can be transferred to digital library systems, portals, document, content and knowledge management systems as well as virtually any media output device.
The main goal achieved through the project was the automatic generation of administrative, descriptive and structural metadata. The advantages of highly structured documents:

  • As “digital original” they meet the requirements for a digital long-term storage in repositories
  • With the use of XML open metadata standards, the data can be transformed and migrated to meet current and future requirements (more…)
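To make the idea of a digital object that describes both physical and logical structure more concrete, here is a minimal sketch using Python's standard library. It is not the docWORKS output format: the text above does not name the XML standards involved, so every element and attribute name below (`digitalObject`, `physicalStructure`, `pageRef` and so on), the function name and the sample data are hypothetical and chosen purely for illustration.

```python
import xml.etree.ElementTree as ET


def build_digital_object(title, pages, chapters):
    """Build a toy XML record with descriptive, physical and logical parts.

    All element names are invented for this example; a real workflow
    would follow a published metadata standard instead.
    """
    root = ET.Element("digitalObject")

    # Descriptive metadata: what the item is
    descriptive = ET.SubElement(root, "descriptiveMetadata")
    ET.SubElement(descriptive, "title").text = title

    # Physical structure: one entry per scanned page image
    physical = ET.SubElement(root, "physicalStructure")
    for number, image_file in enumerate(pages, start=1):
        ET.SubElement(physical, "page", id=f"p{number}", image=image_file)

    # Logical structure: chapters that point back at physical pages
    logical = ET.SubElement(root, "logicalStructure")
    for heading, first, last in chapters:
        chapter = ET.SubElement(logical, "chapter", heading=heading)
        for number in range(first, last + 1):
            ET.SubElement(chapter, "pageRef", target=f"p{number}")

    return root


book = build_digital_object(
    "Sample 19th century volume",
    ["0001.tif", "0002.tif", "0003.tif"],
    [("Preface", 1, 1), ("Chapter I", 2, 3)],
)
xml_text = ET.tostring(book, encoding="unicode")
print(xml_text)
```

Because the record is plain XML, it can be parsed again by any XML-aware tool, which is the property the migration argument above relies on.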

Digital access to information: Concerns.

Tuesday, July 15th, 2008

So, what are the major concerns of libraries, archives or collections when thinking of going digital? First of all:

  • money.

From second to last:

  • Mobile devices
  • Information filtering
  • Digital preservation and archive
  • Interoperability issues
  • Content and knowledge management
  • Information retrieval
  • Grid architectures
  • Information seeking and use
  • Intellectual property
  • Data and information mining
  • Collection development and management
  • User communities
  • Interface and interaction design
  • Security and privacy
  • Multimedia digital libraries
  • Multi-language support
  • Digital libraries in education
  • Document genres and categorization
  • Social media / Web 2.0
  • Metadata and cataloguing
  • Sustainability
  • Open Source tools and systems

The second list looks really insuperable. It is not.