Training: searching statistics on ons.gov.uk

5 February 2012, CILIP HQ (organised by CILIP Information Services Group)

Notes on the day

Geoff Davies, Implementation Manager at the ONS, gave a run-through of the navigation of the newly redesigned ons.gov.uk. Recent improvements include new search functionality, additional synonyms and acronyms and better navigation.

  • Several new elements on the homepage will be useful for headline figures – the “carousel” in the centre which announces the latest big releases, and the Key figures panel on the right which is a quick way of accessing the most up-to-date stats for GDP, unemployment etc.
  • The UK Publication Hub (link at bottom of landing page) holds all government data, not just that held by ONS.
  • ONS YouTube videos give explanations of big releases, and the new interactives are a good way of interrogating data.
  • Links to the previous site are obsolete, so if you’ve saved a URL it won’t redirect to the new site, but all the statistical releases have been carried over, so they will be there if you dig deep enough.

Geoff then outlined the basic structure of the ONS site, which is a simple nested hierarchy:

  • Business area (section) folder -> each publication has a folder -> calendar entry for each edition -> edition folder -> all content “nuggets” released on that date, e.g. charts, data tables, summary, statistical bulletin etc.
  • Every edition published to the site has a separate release page, which goes live on the publication date (the release calendar includes future publications). Everything relating to that release is accessible from the page – datasets and reference tables are listed at the bottom of the page, and contact details for a named person responsible for that release are to the right.
  • The redesigned theme pages, which are launching shortly and will be rolled out gradually across each theme, are simplified and easier to understand, and much more visual than the current text-based version. A moving carousel, in the centre, gives the most recent data. They are a work in progress and will be improved as more pages are updated.

Geoff gave a quick run-through of the navigation tabs across the top of the site:

  • Browse by theme – alphabetical index of themes -> individual theme pages, with the most relevant or important content at the top.
  • Publications – chronological list, with filters on the right to narrow down content.
  • Data – chronological list; search here for datasets and reference tables (not available in the publications list).
  • Release calendar – all releases, chronologically, including future releases (the landing page only includes big releases). If you click through to a release page there’s a link to all editions at top right, to access previous data.
  • Guidance and methodology – gives background on the ONS and data collection, classifications etc.
  • Media Centre – includes official statements and releases, and letters correcting misinterpretations of stats in the media.
  • About ONS – most useful is the ad hoc research undertaken by ONS, which isn’t searchable in the publications indexes. Go to Publication Scheme under What We Do, then Published Ad Hoc Data on the left.

Continuing problems with the site

The main issue users have raised since the redesign is difficulty in finding content. The ONS has decentralised publishing, which means each department is responsible for their own releases (around 460 staff contributing to the site). This has led to inconsistency, as some staff are reluctant to change old methods or not interested in web standards, and some are just too busy. The ONS are working on solutions:

  • training staff on how to tag content with six or seven most useful keywords (too few, or too many irrelevant ones, mean weaker search results), and improving the metadata.
  • publishing support team to help departments who are too busy or uninterested.
  • health checks are run on content regularly.
  • there is pressure from management to conform to the new standards.

Practical examples

We ran through some real search queries for tips on searching the site, with assistance from a member of the customer services team (whose name I missed, sorry!). The main advice was to search through the release calendar using filters as necessary (selecting ‘last 5 years’ clears future releases from the list), and to use the ‘all editions’ link on each release page to locate time series data.

Unfortunately, the practical examples just proved that the search functionality of the site still needs improvement (if a roomful of information professionals struggles to find data you have a problem!). Advising users to call the customer services team with any queries is helpful but no use in a high pressure environment where data is needed within hours, not days – what I really needed were ways of finding the stats myself.

Reflections

  • The redesigned ons.gov.uk site is much cleaner and simpler than the old version, and easier to navigate, but it’s still difficult to actually find specific data. It’s a shame the ONS didn’t take advantage of having a room full of information professionals to interrogate the system further and to make notes of improvements needed.
  • Some of the problems the ONS are facing are familiar – they’ve decentralised uploading of content, but some staff are reluctant to adopt new techniques and others are over-keen and tag excessively. This is true of other new technologies being adopted across many library sectors (certainly it applies to social media in the news industry). It’s an issue of good training and perseverance with the new standards, and having support from management is vital.
  • Some issues with the redesign are similar to those we’ve experienced in relaunching our intranet recently – lack of redirects from old pages, decentralising, need for training.

Applying what I learned

  • The key figures and carousel on the front page of ons.gov.uk will be incredibly useful for finding the most recent headline data quickly (a common query).
  • The new theme pages will be very useful once they are launched, as a quick way to access key figures on a topic (another common query).
  • I’ll bookmark the ad hoc data page as an extra location to check for data.
  • The training also offered some good ideas on how to ensure consistently good content and metadata, which we could apply to any new roles that our department undertakes.

Generating a word cloud (or not) from a Twitter hashtag

Word cloud showing most common questions under #askgove

Sample #askgove word cloud created from around 2,500 tweets

Education asked last Tuesday if we could create a word cloud on Friday from the questions asked on Twitter using the #askgove hashtag. One of those jobs that seems simple on the surface but isn’t!

  • Problem one – by Tuesday there were already thousands of tweets, and Twitter will only allow you to search so far back on a keyword.
  • Problem two – they wanted the cloud generated on Friday (when they go to print) so they could include as many #askgove questions as possible, which meant checking for new tweets every couple of hours during the week to compile an immense list.
  • Problem three – because there were so many tweets, it was impossible to go through and weed out all the extraneous words like reply, retweet, favorite, open, askgove before generating the cloud, to say nothing of all the stop words (and, a, the…). They wanted a cloud that highlighted the key questions being asked, so no words relating to usernames, no why/will/what/when… and sadly no swearing!
  • Problem four – I don’t work on Fridays.

I got as far as I could with it – I searched for #askgove on Twitter and pasted the available list of tweets so far into a program called word counter, to generate a list of words ranked by frequency. That weeded out some of the basic stop words. But how to turn that into a Wordle? I could see the most popular terms, but they only occur once in the text generated by the counter so the word cloud would be meaningless.
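For anyone curious about what that counting step involves, here is a minimal Python sketch. It is not the word counter program I used: the stop-word list, the noise-word list and the `count_words` name are all my own assumptions for illustration, and a real list of stop words would be much longer.

```python
import re
from collections import Counter

# Assumed, deliberately short lists for illustration only.
STOP_WORDS = {"a", "and", "the", "of", "to", "in", "is", "about",
              "why", "will", "what", "when"}
NOISE = {"reply", "retweet", "favorite", "open", "askgove"}


def count_words(text):
    """Count words in pasted tweets, skipping usernames, stop words and noise."""
    counts = Counter()
    for token in re.findall(r"[@#]?\w+", text.lower()):
        if token.startswith("@"):
            continue  # drop usernames
        word = token.lstrip("#")  # treat #hashtag as the plain word
        if word in STOP_WORDS or word in NOISE:
            continue
        counts[word] += 1
    return counts


sample = (
    "@bob Why will the EBacc help schools? #askgove Reply Retweet\n"
    "What about schools and ICT? #askgove"
)

# Print one "word count" pair per line, most frequent first.
for word, count in count_words(sample).most_common():
    print(word, count)
```

Swearing and any other unwanted words could go into the noise list in the same way.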

Step forward production, specifically a systems editor, who showed me a nifty bit of code which takes the word counter list and returns each word, repeated as many times as the frequency number next to it. Weed out the words we don’t want (check the ones we’re not sure about – ebacc, ict, hei – on Twitter), paste this into Wordle and voila! a word cloud.
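I don’t have the systems editor’s code to share, but the idea can be sketched in a few lines of Python. This is my own reconstruction, not the original: it assumes the word counter output has one word and one number per line, separated by a space, and the `expand_counts` name and the `unwanted` parameter are made up for the example.

```python
def expand_counts(lines, unwanted=()):
    """Repeat each word as many times as its count, skipping unwanted words."""
    skip = {w.lower() for w in unwanted}
    words = []
    for line in lines:
        parts = line.split()
        # Ignore blank or malformed lines (assumed format: "word count").
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        word, count = parts[0], int(parts[1])
        if word.lower() in skip:
            continue  # weed out words we don't want in the cloud
        words.extend([word] * count)
    return " ".join(words)


counter_output = """schools 3
askgove 5
teachers 2
"""

text = expand_counts(counter_output.splitlines(), unwanted=["askgove"])
print(text)  # schools schools schools teachers teachers
```

The resulting text can then be pasted into Wordle, which sizes each word by how often it appears.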

I showed the process to the art director who works on Education, and mocked up a word cloud using the layout and colours she chose, to see whether it worked on the page.

I wrote detailed instructions for colleagues, and at their request I talked them through the process at my screen, so they could create the cloud without too many difficulties. They started to add to the list of tweets at the end of Wednesday (while I was still in, to check they’d got the process right).

And then…

…the word cloud was dropped from the supplement. This happens fairly often in journalism – a story is superseded by breaking news, the space is needed for advertising or a better alternative presents itself. The reason in this case was space – the word cloud simply didn’t work in the space available on the page. And they let us know early on Thursday, so my colleagues didn’t spend too long on it (sometimes we don’t get told at all).

So was it a waste of time? No. I learnt some valuable lessons, about how to generate word clouds but also about working with different departments (and colleagues) to create something for the paper.

Reflections

  • If something seems impossible at first glance, don’t just dismiss it: there’s usually a solution, though sometimes you have to put a bit of work in.
  • Ask for help if you don’t know how to do something – in such a big organisation there will usually be someone in the building who has the knowhow.
  • Collaboration is key – Education came to us at the beginning with a clear idea of what they wanted but little knowledge of how it could be done; I took it as far as possible, then consulted someone with the technical knowledge; and I collaborated on the design so the editors could make a final decision. Sharing knowledge led to a better end result, even though it wasn’t used.
  • Now I know how to create a word cloud from any volume of text, so if it comes up again it’ll be easy (she says…).
  • Walking colleagues through a complicated process is better than just emailing a list of instructions, which can be confusing (some people learn better with visual aids) and can seem a little superior (not everyone responds well to being told what to do remotely).

I think that last one is the lesson I should really take to heart!

Training: Tweetdeck, 22 November 2011

I’ve been using Twitter for a few years but I’m behind the times with my software! Following a talk about the use of Twitter, I signed up for a Tweetdeck intro session with John Stuttle, one of the Guardian’s systems editors.

Tweetdeck offers much more usability than the basic Twitter feed. Key for me is that it allows you to track more than one account at once (and from more than one social network), to tweet from more than one account simultaneously (say my personal and department ones) and to publish timed tweets (so we could set a tweet to launch the weekend’s From the archive in advance, for example).

John recommended signing up for a bit.ly account as well, which allows you to analyse the statistics on how many people used your link to click through to a story, versus other links, and use that to improve your tweeting. Bit.ly provides various other stats too – to view statistics and graphs just click the Analyze link at the top once you’re signed in, or click on Info Page next to an individual link.

 

CPD23 Thing 18: Jing, screen capture and podcasts

Maria’s Thing 18 post

Jing

I love the idea of Jing, and other screen capture software – being able to record a How to… would be of huge benefit to bossy old me (and any of my colleagues who have had to refer to a seemingly endless list of bullet points I’ve written).

Directing colleagues and users to a short video would save me from running through the same processes time and again, and would be a really useful tool for marketing the department (back to advocacy!). It certainly has potential.

The unable-to-download monster has reared its head again, but yay! there’s a non-downloadable option too (thanks Maria).

Screencast-o-matic

I had a quick go of recording a How to… – how to upload the From the archive column to the website. I made a bit of a schoolboy error – the recording box wasn’t big enough so every menu selection happened off screen. I didn’t record sound either, although I don’t think I’d use sound anyway (far too camera shy!).

The recording process is fairly intuitive though, and I’d definitely consider using it for all our how to… type material, once I’ve practised a bit more.

Podcasts

I’m not ready to record my own (and I’m not sure our users would be ready to listen to me either!) but I’m going to have a listen to the arcadia@cambridge podcasts, and ask around (okay, ask Twitter) if there are other good library podcasts out there. Anyone know any?

CPD23 Thing 17: The medium is the message (Prezi & Slideshare)

Ange’s Thing 17 post

I should say right off that my experience of using slides is very new (as in, I created my first PowerPoint slide this afternoon for a presentation tomorrow – I’ll let you know how it goes in the comments!). Reading Ange’s suggestions, and browsing SlideShare, (hopefully!) helped me to avoid some of the pitfalls of using slides.

Being new to presenting, I’m not sure I’m ready to use advanced software like Prezi. I can see the advantages (and I’m sure the audience would find my talk much more interesting) but I find it hard enough just getting my points in the right order, without playing around with the screen as well.

I will be taking a fuller look at Prezi once I’ve delivered my talk though – as a tool for sharing a presentation online it looks like it’s streets ahead of my boring PowerPoint slides. If I manage to revamp those I’ll post a link.

CPD23 Thing 14: Zotero, Mendeley and citeulike

Isla’s Thing 14 post

As usual I’m hampered by my inability to download software at work (the problem of working for a big company that thinks it can meet all your IT needs, without asking you what they are, grumble grumble*). So I’m skipping Zotero and Mendeley, and focusing on citeulike.

It’s hard to try the site out thoroughly because there isn’t an obvious application at work, but it was quick to get started and seems easy to use. It could also come in handy as a new resource for finding articles, as well as recording ones I’ve found elsewhere.

Reflections

At the moment, I don’t need a citation tool, but I’m really impressed with the ones on offer so if I need one in future (Chartership portfolio?) I’ll definitely take a closer look. Like Isla, I really wish they’d been around when I was at uni!

*I’m being unfair, but it is frustrating when you don’t have control over the tools you use!