Don't neglect your data!

February 2, 2016

This situation comes up a lot, especially when I speak to social scientists.

A student has one year left before they have to submit their PhD thesis. They have written tens of thousands of words. They have read countless articles. They have an outline of their thesis and a few drafts of some of their chapters. They have some good ideas, they have a good knowledge of their field, and they have done all their data collection… But they have done zero data analysis. Sometimes, they won't have transcribed their interviews. Some won't even have listened to the interviews since they conducted them (perhaps several years ago).

This leaves them in a very precarious situation, with only a few months remaining, not knowing whether there is anything useful in their data, and not knowing how to find out. They have to learn how to do the analysis for the first time under enormous pressure, with no opportunity to redo any of the practical work should there be a problem with the data (or to further investigate anything interesting).

It's a nightmare situation, but it's easily avoidable if you start learning how to do data analysis as early as possible. This includes:

  • Learning how to use the software you need (e.g. NVivo)
  • Putting data in an appropriate format (transcription)
  • Basic analytical techniques (coding)

You don't need a full data set to get started. It doesn't even need to be real data. Starting early means that you will be more comfortable with the analysis once you do have a full data set, and having an understanding of the analytical process will help you collect better quality data.

Don't neglect your data, and don't treat the analysis as something you can throw together at the end.

What to do if you have neglected your data until now

First, make sure you know where the data is, then start on whatever formatting needs to be done.

For example, if you have audio recordings of interviews, these will probably need to be transcribed. Many underestimate how long this takes, so start immediately.

Once you have one transcribed file, that's enough to load into whatever software you are using so you can play around with the basics of analysis. If you know someone who has used the software before, ask them nicely if they can show you what their process is for analysing data. If you don't know anyone, find some online tutorials to get you started.

You must then get all your data into a usable state. Until this is done, you don't really have anything to work with. It's time-consuming and can be tedious, but it has to be done. Try to put together a checklist so you have a consistent process to follow. Take note of where you save every file, and ALWAYS keep an unaltered copy of the original raw data.
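If you are comfortable with a little scripting, the "note where every file is saved, and keep an unaltered copy" step can be partly automated. What follows is only a minimal sketch in Python, not a prescribed method: the function names (archive_raw_data, write_inventory) and the folder layout are made up for illustration, and it assumes the archive folder sits outside the raw data folder. It copies each raw file into the archive once, never overwrites an archived copy, and records a checksum so you can tell later whether a working file has been changed.

```python
import csv
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_raw_data(raw_dir, archive_dir):
    """Copy every file under raw_dir into archive_dir and return an inventory.

    Assumption: archive_dir is NOT inside raw_dir.
    Files already in the archive are never overwritten, so the
    archived copy stays as it was when first saved.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    inventory = []
    for source in sorted(p for p in raw_dir.rglob("*") if p.is_file()):
        relative = source.relative_to(raw_dir)
        target = archive_dir / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        if not target.exists():
            shutil.copy2(source, target)  # copy2 also keeps timestamps
        inventory.append({
            "file": str(relative),
            "archived_at": str(target),
            "sha256": sha256_of(target),
            # False means the working file no longer matches the archive
            "unchanged": sha256_of(source) == sha256_of(target),
        })
    return inventory


def write_inventory(inventory, csv_path):
    """Save the inventory as a CSV file you can open in a spreadsheet."""
    fields = ["file", "archived_at", "sha256", "unchanged"]
    with csv_path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields)
        writer.writeheader()
        writer.writerows(inventory)
```

To use it, you would call something like archive_raw_data(Path("interviews"), Path("interviews_archive")) and pass the result to write_inventory along with a path such as Path("inventory.csv"); those folder and file names are placeholders. This does not replace a proper backup on a separate drive or institutional storage, it just gives you a record of what you have and where it is.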

Only once you have the data in an analysable form can you start to figure out whether you have anything valuable. The earlier you do this, the better.

"Box of floppy disks and USB memory stick" by JIP - Own work. Licensed under CC BY-SA 3.0 via Commons.


