Proportional Venn diagrams for data viz

Recently our Senior Analyst, Becky Lien, shared some uses for Venn diagrams based on this recent report from the Pew Research Center, and I’m surprised to say I actually see use for them! The twist with this Venn diagram is that you change the size of each circle, as well as the size of each overlap, to accurately represent the proportions in the data. Sweet, and duh! You can do this fairly accurately in Microsoft Office, or super-duper accurately in R.

Now, I often avoid using circles when I visualize data because people have a hard time estimating area, but this seems like a useful way to visualize data when you have different combinations of responses. Take the following example. We collect data on what type of stop-smoking medication people used, if any, to help them quit using tobacco. Someone might use one type of medication, but they could also use a combination of medications, which can result in a lengthy, and perhaps unclear, bar chart. Becky thought that a proportional Venn diagram might be a great way to show the various patterns of medication use. Here is a diagram that shows the number of participants who used nicotine replacement therapy (NRT) in the form of the patch, gum, and/or lozenge.

[Figure: Patterns of NRT use]

Becky created this in R, but you could also do it in Microsoft PowerPoint, Word, or Excel. You can either go the quick route and eyeball the proportions of the circles, which could be totally fine, or you can revisit your geometry and algebra classes to get more exact circle sizes, which is what I tried out.

I replicated Becky’s work in PowerPoint. I first determined the total number of people represented by each circle by adding up the four regions inside it. For Patch, 661 + 106 + 60 + 40 = 867. For Gum, the four regions add up to 349. I added up the numbers for each remaining circle the same way.
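If you would rather not add these up by hand, here is a minimal Python sketch of the Patch total. The four counts come from the sum above; the region labels are my own guesses for illustration and are not taken from the diagram.

```python
# Counts for the four regions inside the Patch circle (from the post).
# The labels are hypothetical; only the numbers come from the text.
patch_regions = {
    "patch only": 661,
    "patch + gum": 106,
    "patch + lozenge": 60,
    "all three": 40,
}

# A circle's total is the sum of every region that falls inside it.
patch_total = sum(patch_regions.values())
print(patch_total)  # 867
```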

I then created the largest circle, which in this case is Patch. To figure out the size of the Gum circle, I divided the number of people who used Gum (349) by the number of people who used Patch (867), which equals 0.40. So the area of the Gum circle needs to be 40% of the area of the Patch circle.
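One thing that is easy to trip over here: the 40% applies to area, so the height and width of the Gum circle shrink by the square root of that ratio. A small Python sketch, using only the two totals from the post (the variable names are mine):

```python
import math

patch_total = 867  # people represented by the Patch circle
gum_total = 349    # people represented by the Gum circle

# Ratio of the two areas.
area_ratio = gum_total / patch_total
print(f"{area_ratio:.2f}")  # 0.40

# Radius (and so height and width) scales with the square root of area.
radius_scale = math.sqrt(area_ratio)
print(f"{radius_scale:.2f}")  # 0.63
```

So if you resize by dragging or typing a height, the Gum circle should be about 63% as tall as the Patch circle, not 40%.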

Here is where that geometry and algebra come into play. The area of a circle = 3.14 × r², where the radius (r) is half the height or width of the circle. The area of the Patch circle came out to 2.83, so I knew the area of the Gum circle needed to be 40% of that, or 1.13. I solved for r (the radius of the Gum circle) and repeated these steps for each remaining circle.
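Solving for r just means rearranging the area formula to r = square root of (area / 3.14). Here is a rough Python sketch of that step. The 2.83 and the 40% come from the paragraph above, in whatever units PowerPoint reports; the function name `radius_from_area` is something I made up.

```python
import math

def radius_from_area(area):
    """Rearrange area = pi * r**2 to get r."""
    return math.sqrt(area / math.pi)

patch_area = 2.83             # area of the Patch circle, from the post
gum_area = 0.40 * patch_area  # Gum needs 40% of that area

r_patch = radius_from_area(patch_area)
r_gum = radius_from_area(gum_area)

print(f"{gum_area:.2f}")  # 1.13
print(f"{r_patch:.2f}")   # 0.95
print(f"{r_gum:.2f}")     # 0.60
```

Double the radius to get the height and width to type into PowerPoint's size boxes.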

By the way, I could solve for r all day long! Geometry wasn’t really my thing, but I was way into algebra.

The trickier, less exact part is trying to overlap the circles in a proportional way. R will do this for you fairly accurately, but you will have to eyeball it in PowerPoint. I think you could do this well enough though to get the point across.
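If eyeballing bothers you, you can at least compute how far apart two circle centers should sit. This is a sketch of one way to do it, not necessarily what R does under the hood: it uses the standard formula for the area where two circles intersect and a simple bisection search. It assumes the 106 and 40 regions in the Patch total are the ones shared with Gum, which is my assumption for illustration, and the function names are made up.

```python
import math

def overlap_area(r1, r2, d):
    """Area shared by two circles with radii r1, r2 whose centers are d apart."""
    if d >= r1 + r2:
        return 0.0                        # circles do not touch
    if d <= abs(r1 - r2):
        return math.pi * min(r1, r2) ** 2  # smaller circle sits fully inside
    a1 = r1**2 * math.acos((d**2 + r1**2 - r2**2) / (2 * d * r1))
    a2 = r2**2 * math.acos((d**2 + r2**2 - r1**2) / (2 * d * r2))
    tri = 0.5 * math.sqrt(
        (-d + r1 + r2) * (d + r1 - r2) * (d - r1 + r2) * (d + r1 + r2)
    )
    return a1 + a2 - tri

def center_distance(r1, r2, target, tol=1e-9):
    """Bisection: overlap shrinks as d grows, so narrow in on the target area."""
    lo, hi = abs(r1 - r2), r1 + r2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if overlap_area(r1, r2, mid) > target:
            lo = mid  # too much overlap, move the centers apart
        else:
            hi = mid  # too little overlap, move them closer
    return (lo + hi) / 2

patch_area = 2.83                                        # from the post
r_patch = math.sqrt(patch_area / math.pi)
r_gum = math.sqrt(patch_area * 349 / 867 / math.pi)
target = patch_area * (106 + 40) / 867                   # assumed shared regions
d = center_distance(r_patch, r_gum, target)
print(f"Place the centers about {d:.2f} units apart")
```

This only handles one pair of circles at a time. With three circles you usually cannot make every overlap exact at once, so some eyeballing is still in the cards.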

Finally, I added in text boxes for the data labels, and the color fill of the circles is 40% transparent so that you can see the overlap.

What do you think about using Venn diagrams in this way? How would you visualize these data?


 
