Friday, 23 November 2012

A pillar of data journalism: The Guardian

I have long been an avid reader of The Guardian (albeit in its online incarnation, except for Saturdays when you can't beat stretching out on a sofa with the oversized supplements), and have been impressed with their skills at using data to uncover hidden truths that are often concealed behind complexity.

Today, I've stumbled upon not one but two exemplary data visualisations from The Guardian which struck me not for their aesthetic value but for the simplicity of their message.

The first one relates to the EU budget and lets you select your country of origin to see how much you contribute to the EU budget (the average for the country, not YOU personally!) and how much you receive. I never realised how much cash Poland receives from the EU coffers!

The second is again financial data, but this time looking at average earnings in the UK. I was shocked to see that there are only 4 local authority areas in which women out-earn men (funnily enough, one of them is mine!).

What these two visualisations prove is that the key to impact and insight is simplicity. To my mind, it is one of the hardest things to achieve in visualising data. I recall a diagram from Andy Kirk on his Visualising Data course in which he talks about the journey of the data/message getting from the visualisation to the brain of the viewer, and how that journey needs to be as direct and unimpeded as possible. It's something I strive for (though don't always achieve), and something which The Guardian has become very adept at.

Friday, 9 November 2012

Using Pinterest to inspire new ideas

There are so many amazing data visualisation sources out there in the cyber world, be they blogs, portals, websites, or applications, that I find myself struggling to keep up with it all. Like most people, I imagine, I save a lot of stuff into my favourites, but that folder is starting to resemble my wardrobe: overflowing and impossible to find what I'm looking for! I have recently opened an account on Pinterest as I thought it might be easier to store those things that inspire me, that I can generate my own ideas from, and I'm finding it a real revelation. Unlike Favourites, you can store things based on an image/icon, which is much more appropriate for data visualisation links. My own page is - do check it out. If I'm struggling with writer's block, I have a quick look on my Pinterest page and the block becomes unblocked!

Thursday, 20 September 2012

Infographic #3 - an alternative visualisation of Infographic #2

Early on when I was looking at the pharmaceutical data that I used for my first two infographics, I knew I wanted to create some sort of network diagram. I realised early on though that Illustrator wouldn't cut it, and that there was a distinct lack of software that was free, relatively easy to learn, and fit for purpose. Had I known about NodeXL during that period, the final infographic would have looked very different. In the space of two hours, I created a network diagram! I was so amazed! It's an open-source Excel template created by the Social Media Research Foundation, and it took me no more than about half an hour to work out how to use it. The network diagrams are customisable and you can alter colours, sizes, fonts, layout, and most things. There are a couple of drawbacks of course: I couldn't seem to add a legend (so added this in Paint), and also I couldn't label every vertex otherwise it would have been too crowded (though this is a drawback of network diagrams, not of NodeXL). I do think this infographic does a better job of conveying the inter-connectedness of conditions between the medications, whereas #2 I think shows more the sheer variety of conditions that are possible side-effects. What do you think?
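For anyone wanting to try something similar, a network diagram like this starts life as an edge list: one row per link between a drug and a condition. Below is a minimal Python sketch of preparing such a list as CSV text and spotting the conditions that connect more than one drug. The drug and condition names are made-up placeholders (not the real data), and the column headers, including "Relationship", are assumptions about the layout rather than something NodeXL is confirmed to require.

```python
import csv
import io
from collections import defaultdict

# Placeholder data only: these names are invented for illustration,
# they are not taken from the real prescription dataset.
links = [
    ("Drug A", "Condition X", "side-effect"),
    ("Drug A", "Condition Y", "treats"),
    ("Drug B", "Condition X", "treats"),
    ("Drug B", "Condition Z", "side-effect"),
]

def to_edge_csv(rows):
    """Write one edge per row as CSV text (header names are assumed)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Vertex 1", "Vertex 2", "Relationship"])
    writer.writerows(rows)
    return buffer.getvalue()

def shared_conditions(rows):
    """Return the conditions linked to more than one drug, sorted by name."""
    drugs_by_condition = defaultdict(set)
    for drug, condition, _relationship in rows:
        drugs_by_condition[condition].add(drug)
    return sorted(c for c, drugs in drugs_by_condition.items() if len(drugs) > 1)

edge_csv = to_edge_csv(links)
shared = shared_conditions(links)
```

The shared conditions are the vertices that make a network diagram interesting, since they are the ones that tie the medications together.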

Friday, 14 September 2012

Data viz of the day #2

It's been too long since my last post, and I can only blame a mixture of busyness and lack of inspiration. I think it's only a matter of time before you cease to be awestruck by new developments in a particular field of interest, and I have to confess that I have unknowingly developed a British sense of cynicism towards a lot of new data visualisations I've seen in the last few months. Maybe it's saturation, maybe it's lack of innovation in the field, who knows. The point is, I finally saw something today which really took my breath away. It's actually a series of visualisations by a Brazilian fellow called Icaro Doria who works for the Portuguese magazine Grande Reportagem. Essentially, Icaro has used data about each country to allocate data points to a coloured element within a flag to illustrate a particular variable, be it country exports or genital mutilation. The facts have been well chosen to really give a fascinating insight into the economic, political, social, and health situations in each country. Anyway, I beg you to take a look: these are phenomenal and such a great example of the power of visualisation to communicate data.

Thursday, 9 August 2012

The making of Infographic #2

I thought readers may be interested to know how my second infographic evolved, particularly as it changed quite considerably from my initial conception.

A few months back I attended a course by Andy Kirk on data visualisation, in which he highlighted the importance of conceiving your visualisation design, rather than just jumping head first into it. Having already familiarised myself with the data back in Infographic #1, the purpose of the second one was to really highlight the sheer variety of side effects that these five drugs alone could potentially cause. In my mind I already knew that I did not want to use any numbers; they would be kind of meaningless. I also knew that I wanted to show the circle of side-effects caused by one drug and the treatment of that side-effect by another. For example, Omeprazole is used in the "treatment and prevention of...NSAID-related ulcers"*. An NSAID is a non-steroidal anti-inflammatory drug, an example of which is Aspirin. Interestingly, BNF does not list ulcers as a side-effect per se (hence why it is not in the diagram) but chooses to use the less specific description of "gastro-intestinal irritation or gastro-intestinal haemorrhage"; more of a catch-all I suppose. So I had a few conceptions in my mind (and crucially had already ruled out some ideas), but a visualisation sometimes needs to be seen on screen to get a feel for whether it could work or not.

My first idea was as a table, with red pills showing side-effects, green pills showing conditions that are treated/prevented. The five drugs were the columns, the conditions the rows. I quickly realised that there were too many conditions for it to be readable. Perhaps with <20 conditions, this could have been do-able.

My next thought came to me in the middle of the night (they normally do), and I was thinking of Venn diagrams but using body organs instead of circles, with conditions in one circle for treated by, then conditions in another that were side-effects, with the overlap being those conditions that are in both categories. On drawing it out on paper though I quickly discarded that idea: I couldn't figure out how to distinguish the five drugs.
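The Venn idea maps neatly onto set operations, which is a quick way to see how big each region would be before drawing anything. A minimal Python sketch, using made-up placeholder condition names rather than the real data (the function name is also just for illustration):

```python
def venn_regions(treated, side_effects):
    """Split conditions into the three regions of a two-circle Venn diagram."""
    return {
        "treated_only": treated - side_effects,
        "both": treated & side_effects,
        "side_effect_only": side_effects - treated,
    }

# Placeholder condition names, invented for illustration.
treated = {"Condition X", "Condition Y"}
side_effects = {"Condition X", "Condition Z"}

regions = venn_regions(treated, side_effects)
```

This only handles one pair of circles, which is exactly the limitation described above: with five drugs you would need a separate pair for each one.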

Next up was a mindmap idea. This was the most time-consuming as I didn't have any network diagram software which would automatically pick up the connector line with the text/circle (as it does in Microsoft Word when you're creating a mindmap). So I fiddled around with this for ages, having created vector images of body organs (again something that took a LOT of time!!!) and then reluctantly gave up on this, as there were just too many lines. I really should have figured this out from the table I did!!! I knew the text would never be big enough to read, and it didn't have that "wow" factor yet.

My penultimate idea was to have the five drugs at the centre of the infographic each discerned by an oval (so I could make the middle look like a pill just to over-theme the whole thing!). This really started to look promising as I could represent side-effect as a red dot and treated by as a green dot, and it would allow multiple dots for one condition. The one problem was the alignment of the lines leading from the conditions, it was still too jumbled.

So it was then I thought of switching to a circle instead of an oval in order to overcome this, which is what you see in the final version. I then grouped conditions by the body system (seemed most logical to me!). Classifying the conditions itself was a little challenging given that I am not a medical practitioner and that Google can only take you so far, so some of them were my best estimate!

So crucially what lessons have I learned?

  • First and foremost, I've got to keep it 'rough and ready' until pretty much the final draft. What I mean is, I spent ages early on creating these nice vector images of the body organs but they ended up not even being a critical part of the infographic. The main thing should have been to get the outline down on Illustrator, and then the elaboration should come last. What was important was realising how many conditions I had to deal with, and then ruling out visualisations quickly based on that. 
  • Second, never underestimate the onerous task of data collection if the data is unstructured. The data here was from a website, so I had to get it all in an Excel sheet, with one condition per row. I faffed about with it for ages, looking up each condition on Google. I did this for all of the top ten drugs, when I only used the top five. A bit of wasted time, but not disastrous.
  • Third, I've learnt a lot more Adobe Illustrator tricks which I will use in the future. The best one for this was using a pie chart of 88 segments (how many conditions are listed on the infographic) to help align all of the lines. Using the alt-drag to copy shapes, and the eyedropper, and the various align and pathfinder tools certainly helped to speed things up a bit also. 
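The 88-segment pie chart trick in the third lesson boils down to dividing the circle into equal angles, so the same positions can be calculated rather than eyeballed. Here is a minimal Python sketch of that arithmetic. It is not how the infographic itself was built; the function name, the radius of 100 and the 12 o'clock starting point are all assumptions for illustration.

```python
import math

def spoke_endpoints(n_items, radius, start_deg=90.0):
    """Return (x, y) end points for n_items spokes spaced evenly round a circle.

    Spokes run clockwise from start_deg (90 degrees is 12 o'clock).
    """
    step = 360.0 / n_items
    points = []
    for i in range(n_items):
        # Offset by half a step so each spoke sits in the middle of its pie segment.
        angle = math.radians(start_deg - (i + 0.5) * step)
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points

step_deg = 360.0 / 88          # angle of one segment, roughly 4.09 degrees
spokes = spoke_endpoints(88, radius=100.0)
```

Each point is where a condition's line would end on the outer circle; the half-step offset centres the line within its segment rather than placing it on a segment boundary.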

Now on to the next project.........

*From the BNF description of Omeprazole:

Tuesday, 7 August 2012

Infographic #2 - the side effects of the top five prescribed medications in England 2011

So leading on from my previous infographic about the scale and cost of England's prescriptions in 2011, my next infographic looks at the side effects of these drugs. This infographic has been, to put it mildly, a labour of love. It has taken me well over a month of an hour here and an hour there of conceiving it, drafting it, scrapping it, starting again, tweaking, redesigning..... you get the idea. The complexity came from the fact that I wanted to show the connection between the conditions that are treated/prevented by these drugs, but also the side effects, as there were many conditions that popped up on both lists. Showing a many-to-many relationship is not easy; at first I thought of a Venn diagram type thing using body organs instead of circles (yes I like to over-complicate things), then I was thinking of a flow chart, but there were way too many lines, so then I settled on this circle which is the format you see now. I may well do another post showing the stages of this infographic at another time, as it has been a useful learning curve for me that others may find interesting and/or helpful.

Anyway, back to the subject matter at hand. There were a few things in particular that really concerned me about these data. First, as I've sort of mentioned before, these drugs are primarily prescribed to tackle our most prevalent Western diseases: high cholesterol, high blood pressure, cardiovascular disease, hypothyroidism, conditions all largely caused by unhealthy lifestyles. To my amateur eye, it seems that many of these drugs are not only papering over the cracks of these conditions, but also they perversely allow patients to continue with their harmful lifestyles (see this article for a very worrying quote from an Omeprazole patient about how she could indulge her love of pastries after suffering with heartburn for many years!!). Second, I was particularly concerned at the sheer number of varied and sometimes gruesome side effects of these prescriptions. Admittedly, some are rare, but on reading some of the descriptions, I am not sure I would want to take such a risk (just Google rhabdomyolysis or Stevens-Johnson Syndrome - warning it may turn your stomach!).

To me, it seems the only winners in all this are the food and pharmaceutical industries. As long as we keep spending vast amounts of money on alcohol, sugar, and processed foods (and lots of it), there will always be a pharmaceutical company ready with a magic pill to reduce the effects of consumption of these foods. It's a virtuous circle for them, and a vicious one for us.

Tuesday, 24 July 2012

Data viz of the day #1

I'm an avid reader of all things health-related, particularly when it comes to subjects of nutrition and natural health. So this little beauty caught my eye this morning after opening up a Dr Mercola newsletter which popped up in my inbox today. The newsletter contained a link to a recent study by the UK-based Alliance for Natural Health, which looked at the relative risk of death from a diverse set of hazards such as drowning, car accidents and being struck by lightning. Crucially though, this bubble chart shows that adverse reactions to pharmaceuticals are 62,000 times more likely to kill you than food supplements. From a personal perspective though, that scuba diving bubble concerns me somewhat, being a scuba diver myself, eek! Anyway, I digress..... This information, while perhaps not a shock, sits uncomfortably for me as not only is there a relatively high societal risk of fatalities from pharmaceuticals, but also a high individual risk, simply because of the sheer volume of pharmaceuticals that are prescribed each year, which links neatly to my Infographic #1.

Do check out the ANH website for more details; it's full of fascinating data.