9.5 Excel, SPSS, and other data files

A common question I hear is “How can I import an SPSS/Excel/… file into R?”. The first answer to this question I always give is “You shouldn’t”. Shitty Piece of Shitty Shit files can contain information like variable descriptions that R doesn’t know what to do with, and Excel files often contain something, like missing rows or cells with text instead of numbers, that can completely confuse R.

Rather than trying to import SPSS or Excel files directly in R, I always recommend first exporting/saving the original SPSS or Excel files as text .txt. files – both SPSS and Excel have options to do this. Then, once you have exported the data to a .txt file, you can read it into R using read.table().

Warning: If you try to export an Excel file to a text file, it is a good idea to clean the file as much as you can first by, for example, deleting unnecessary columns, making sure that all numeric columns have numeric data, making sure the column names are simple (ie., single words without spaces or special characters). If there is anything ‘unclean’ in the file, then R may still have problems reading it, even after you export it to a text file.

If you absolutely have to read a non-text file into R, check out the package called foreign (install.packages("foreign")). This package has functions for importing Stata, SAS and SPSS files directly into R. To read Excel files, try the package xlsx (install.packages("xlsx")). But again, in my experience it’s always better to convert such files to simple text files first, and then read them into R with read.table().