Site visit: British newspaper library

Bound volumes of newspapers at the British Library. Photograph: Martin Argles

It’s traditional to send our trainee on a visit to the British newspaper library in Colindale, and I’d not been since I was one many moons ago, so I tagged along with Nina and our archivist when they visited last week.

The newspaper library is in the process of moving to Boston Spa, the British Library’s base in Yorkshire, and the transition is evident. Gone are the impressive cameras I remember, used for photographing newspaper pages for microfilm. Fragments of binding and aging newsprint litter the floor in the stacks (every bound volume has to be weighed and measured prior to the move north, and if they sweep up the dust, airborne particles could cause damage).

It was really interesting to hear about the move – as well as the northern base there’ll be a dedicated reading room at the British Library on Euston Road (although bound volumes will have to be ordered from Boston Spa). Fascinating, too, to see inside “the pen” where some of the more precious volumes are held, including (to my other half’s delight) volumes of Marvel comics from the 1970s and British counterparts from their heyday in the ’40s.

The newspaper library fulfills my romantic ideal of an archive – the musty smell of weathered paper, cracked spines of long-forgotten tomes like John Bull and The Cherokee Phoenix (an attempt at a reservation paper from the 1830s). I could happily get lost in the stacks.

Times have changed, space is short and there’s a pressing need to preserve old volumes (an ongoing programme of digitisation will ensure safer access to millions of pages of print, though 40m pages is only a small proportion of the overall collection). The move is clearly important to the continuing success of the newspaper archive. But I hope that some of the magic of Colindale will remain at the new facility at Boston Spa.

Martin Belam on the editorial pitfalls when digital and print collide

Martin Belam has flagged up one of the dangers of online reporting over on currybetdotnet.

Yesterday’s Times website headline for the Sean Hoare story, Hacking whistleblower found dead, was unfortunately prepended with the ‘Live’ tag, leading, as Martin says, to the formula “Live: Someone is dead”.

the perfect example of something that wouldn’t be allowed to happen in print, but which hits a magic Venn diagram intersection of technology, editorial and information architecture allowing it to happen digitally.

Martin suggests adding more options for prepends – ‘Breaking’ or ‘Latest’ for example, which would remove the unintentional pun in the headline for such a tragic story.

It’s clear that more consideration needs to be given to traditional page layout when information architects, who are often far removed from the reporting process, are working in the media sphere.

Working week: June 27-June 28

I’m going to keep a work diary every week so I can track what I’ve done, and try to reflect before the start of the next week, to pick out portfolio tasks. I do like lists!

Monday June 27

  • Went through emails (first day back after 9 months, rather a lot of deleting!)
  • From the Archive pieces for July 7 and 8 – found and edited articles on striking dockers (1923) and Brian Jones (1969)
  • Guardian Films query – El Salvador journos/photogs in 1980s/1990s – checked Factiva and our internal text archive for articles with datelines, suggested digital archive for pics

Tuesday June 28

  • Factiva training – haven’t used it for a few years, good reminder of connectors, searching, sources, alerts, introduction to new workspace feature
  • Query about fish and chip shops – statistics, history – Factiva articles search, internal text archive to fill in gaps, Google search for source for stats
  • Query about Sarah Helm – background info, interviews, news stories – Factiva and internal text archive search
  • Had a look over the intranet to see what needs improving, reorganising
  • Updated Afghanistan casualty lists for intranet and Datablog
  • Wrote a blog post to accompany the Brian Jones archive piece next week

Guardian 190: From the archive

I can’t claim any credit for this because I’m on leave, but I was involved in pushing for the From the Archive blog initially so I’m a little bit proud!

The Guardian is celebrating its 190th birthday this month, and has pulled together a bundle of resources, including a rather nifty interactive showing 190 key moments in the Guardian’s development.

As part of that, the research department are blogging an article from each year – in order – on their blog From the Archive. I’m a bit late in highlighting it – they’ve already reached 1896 – but there’s plenty more to come, and you can access the back catalogue on the blog or through the main Guardian 190 microsite.

More on public libraries

This week’s Cilip bulletin included a couple of links to articles on the reinvention of public libraries. According to the Evening Standard, 130 London libraries are at risk of closure due to funding cuts; writers Charlie Higson, Benjamin Zephaniah and Will Self have added their voices to the campaign to save them.

The Standard also reported on Upper Norwood library, which is funded jointly by Croydon and Lambeth councils but since its foundation in 1899 has been run by an independent committee, which drastically reduces running costs by cutting red tape.

On a lighter note, the bulletin also included a link to this Flickr page of 1960s library posters from Enokson. They’d look great on mugs or tea towels – one way of funding libraries once the budget cuts are in force?

Cilip members can sign up for the weekly email bulletin (which is keeping me in the loop while I’m on leave) by following this link.

On the web: DocumentCloud

The Christian Science Monitor librarian (@CSMLibrary) flagged up a blogpost about DocumentCloud from the NewsliBlog today.

DocumentCloud is a free online tool for converting PDF documents into web text that you can annotate, making the contents much more accessible and useable. News organisations in particular have used it to provide readers with official documents that add value to a news story.

As Derek Willis (@derekwillis) says in his blogpost:

It’s a great way to maintain a set of files that anyone from the newsroom can access and annotate, making it a good candidate for long-term project work. And when you’re ready to show that work to the world, you can make any or all of the files public.

I’ve not had time to play around with it, but I’ve struggled in the past with ways of stripping text from a PDF quickly and without too many errors, so anything that helps speed up the process can only be a good thing!

Off the top of my head, I can think of several time-consuming queries it would have helped with: editing From the Archive articles that predate our text archive (anything pre-1984), taking content for the Datablog from government documents that are only released in PDF, or the Russian spies story, for starters.

On the web: data visualisation: howbigreally.com

Something we get asked for fairly regularly in the news library is a size comparison – some journalists like to be able to equate a distance or area in a story to a recognisable place in the UK (recent examples include a piece on nature reserves “covering an area the size of the west Midlands” and a reference to British manoeuvres in Sangin, Afghanistan “to capture an area the size of the Isle of Wight”).

There is a question mark over the validity of such comparisons – how much value do they really add to a story, and how many people have a strong enough grasp of geography to be able to visualise even UK areas? But they show no sign of dropping out of use.

A new online tool developed by the BBC could help media librarians, journalists and readers to draw more relevant, and useful, comparisons in future. BBC Dimensions, found at howbigreally.com, takes a template of a newsworthy event (for example the area affected by the floods in Pakistan, the Twin Towers or the BP oil spill) and lays it over a Google map at a location of your choice.

Dimensions is a prototype and it has its limits – you can’t create a new template so the event you’re covering has to be listed already; you can’t play with the shape of the template so you need a reasonable amount of spatial awareness to be able to compare it to specific areas like counties or countries; and, perhaps most worryingly, the disclaimer at the bottom states, “We make no guarantee as to its accuracy, reliability or performance” – but it is a pretty good starting point for queries of comparison, and it’s a nifty way of using data visualisation to add value to a news story.
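
For the plainer "an area the size of..." query, where all you have is a number, a few lines of Python can do the arithmetic. This is only a rough sketch and has nothing to do with how Dimensions works: the function name and the table of reference places are my own invention, and the areas are approximate figures that should be checked against an authoritative source before they go anywhere near print.

```python
import math

# Approximate areas in square kilometres. These are illustrative figures
# only and should be verified before being used in a story.
REFERENCE_AREAS_KM2 = {
    "Isle of Wight": 380,
    "Greater London": 1_572,
    "Wales": 20_779,
}


def closest_comparison(area_km2):
    """Return (place, ratio) for the reference place closest in scale.

    "Closest" is judged on a log scale, so half the size and twice
    the size count as equally far away.
    """
    if area_km2 <= 0:
        raise ValueError("area must be a positive number")
    place = min(
        REFERENCE_AREAS_KM2,
        key=lambda name: abs(math.log(area_km2 / REFERENCE_AREAS_KM2[name])),
    )
    return place, area_km2 / REFERENCE_AREAS_KM2[place]


# Hypothetical query: a 40,000 sq km area
place, ratio = closest_comparison(40_000)
print(f"Roughly {ratio:.1f} times the size of {place}")
```

Comparing on a log scale is a design choice: it stops a very large reference place winning simply because the raw difference in square kilometres happens to be small.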

More on SEO: STI or STD?

I inadvertently created my own example of the “ground zero mosque” problem just after I wrote about it last week.

Writing on the Datablog, I posted the latest statistics on sexually transmitted infections (STIs) from the Health Protection Agency. STI is the correct, recognised term for things like chlamydia, herpes, gonorrhoea and HIV. The problem is that for years such infections were termed STDs – sexually transmitted diseases – and although the medical world has stopped using that term, the real world hasn’t.

I ummed and aahed about whether I should be accurate, and use STI throughout the article, or go with the more recognisable STD. In the end I used STI in the body but stuck with STD in the headline and, therefore, the URL (“STDs in England: Breakdown by region, gender and ethnicity”). That way, I reasoned, search engines would pick up the term STD but the article stayed true to the recognised term.
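
To show why the headline choice carries through to the URL, here is a minimal sketch of how a content system might turn a headline into a URL slug. I'm assuming the usual lower-case-and-hyphenate approach; the `slugify` function is hypothetical and is not the Guardian's actual code.

```python
import re
import unicodedata


def slugify(headline):
    """Turn a headline into a lower-case, hyphen-separated URL slug."""
    # Reduce accented characters to their closest ASCII equivalents
    ascii_text = (
        unicodedata.normalize("NFKD", headline)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace each run of non-alphanumeric characters with a single hyphen
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower())
    # Trim any hyphen left at the start or end
    return slug.strip("-")


headline = "STDs in England: Breakdown by region, gender and ethnicity"
print(slugify(headline))
# stds-in-england-breakdown-by-region-gender-and-ethnicity
```

Whatever term goes in the headline ends up word for word in the slug, so "stds" is there for search engines to find even though the body text says STI.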

Surprisingly, no one in the comment thread picked up on the use of STDs versus STIs. I’m sure it’s frustrating for sexual health professionals when the media continues to peddle outdated terms but, until the SEO process adapts, we unfortunately need to keep using them if we’re to capture as many readers as possible.

You’ll notice by the way that I refrained from using ‘sex’ rather than ‘gender’ in the headline, which would probably have brought in a lot more…

SEO: not always a good thing

There’s an interesting post from Kelly McBride over on Poynter, discussing the “ground zero mosque” story.

I use the quotation marks because the proposed building isn’t on ground zero and isn’t actually a mosque but an Islamic cultural centre, including, as McBride says, “a pool, community rooms and offices”.

Unfortunately once the “mosque at ground zero” story started circulating, it was quickly picked up and broadcast throughout the media in the US and worldwide. A quick check of UK papers shows 111 articles containing the falsehood (including, perhaps unsurprisingly, articles in today’s Daily Mail and Daily Express that make no attempt to correct the mistake).

Even though the media has (largely) recognised the error, the phrase won’t go away because the dissemination of news online means the temptation is there to tag every related story with “ground zero” “mosque” to pick up readers using those search terms.

As McBride points out:

…now that the story has peaked, now that we know the real facts, can anyone possibly correct the record? Not if Google has anything to say about it.

That’s because accurate or not, people are searching for the term “ground zero mosque.” So if you want to reach people who are looking for information, you have to use that term.

It’s easy enough to do in a story meant to debunk the phrase. All you have to write is, “It’s not a ground zero mosque.” But, what about ongoing coverage? Must you keep using the inaccurate term?

Sadly, the answer is yes, according to people familiar with SEO practices.

McBride also makes the point that, in a world where bloggers and not just media organisations play a role in initiating news stories, fact-checking is increasingly important for journalists. More reason than ever to boost news libraries, not close them!

Assistant Librarian role at the Guardian

My maternity leave starts in three weeks (and counting!) and the deadline for applications for my maternity cover has just been extended to 31 August.

Do you have a library degree? Experience of finding, analysing and presenting data using online resources? Passionate about current affairs? Looking for a part-time role? Then become the new me!

The work is varied and fast-paced, the team is lovely and there’s tea and cake in abundance.

Full job spec here.