Statistical analysis software on a budget

When I left my university position and hung out my independent evaluator shingle almost a decade ago, one of the big shocks was the non-academic prices of many of the software packages I had relied on. In particular, I had used SPSS for more than 30 years on university licenses that were either free or very low cost to me. But the current non-academic license for SPSS runs $99 per month (almost $1200 per year) for the base system, with additional features requiring additional monthly amounts.

In my practice, I occasionally do quantitative data analysis, but not enough to justify spending that kind of cash if I don’t have to! So, I’ve been looking at alternatives and wanted to share what I’ve found.

R

I’ve been intending to learn R one of these days, ever since I first heard about it in the late 200o’s. And I still want to. From what my stats friends tell me, it’s really the best environment for doing data analysis. So, one of these days… But in the meantime:

Real Statistics

Real Statistics is a free (donations appreciated) Excel add-on with extensive documentation, developed and maintained by Dr. Charles Zaiontz, a researcher and research administrator who has taught at the University of South Florida and other institutions. It appears that Real Statistics is his labor of love—literally, since he credits his wife’s research as the inspiration for the Excel resource pack he created. To use Real Stats, you download the package, activate Excel’s Solver add-in, and then install the Real Stats resource. From within Excel, you invoke the Real Stats popup menu, which provides access to all of Real Statistics’ functions. Indicate the location in the worksheet of the data you want to analyze, the location where you want the output, select whatever options you need, and a moment later, there’s your analysis. I haven’t tested the output extensively, but have compared the procedures I use most often to that of other statistics programs and have gotten consistent results.

As the Real Statistics website points out, all of the statistical functions actually exist in Excel, so the resource pack is not required to do the supported analyses. However, Real Statistics provides the formulas and output formats needed, so you don’t have to figure them out for yourself. In addition, the Real Statistics website is an excellent resource for understanding the statistical concepts and procedures involved. In fact, the Real Statistics website is a terrific informational resource, even if you don’t use the add-in.

For me, Real Statistics turns Excel into a highly capable and efficient statistical analysis tool.

Colectica

A major barrier to using Excel for statistical analysis is fact that it does not support the importation of SPSS and SAS files, since many data files are provided in one of these formats. In addition, Excel lacks data documentation features that are included in other data file formats. The data definitions that are included in SPSS’s .SAV data files are really helpful, since they provide variable and data labels, coding information, etc. I recently received a .SAV file of data to analyze, and for various reasons needed to covert the data to Excel format. Although I had tried this before, I decided to google “how to import SPSS files into Excel” one more time, without expecting to find anything useful. The search results included a link to an Excel add-in called Colectica, which turns out to be a great find.

This little gem of an add-in provided the features I needed to address both importing SPSS files and documenting variable information. Colectica’s main purpose is to document Excel data files—so it stores the same kinds of data definitions that are part of SPSS data files, directly in Excel .xlsx files. It will even assist in building a coding scheme from coded data. Previously, I used Excel’s comment feature to store variable label and coding information, but this wasn’t always completely satisfactory or convenient, so Colectica’s data documentation features are worth getting for their own sake. The paid version of Colectica adds the ability to import SPSS, SAS, and Stata data files directly into Excel, including both data and data definitions. The free version includes the data documentation features. To activate the ability to import SPSS, SAS, and Stata files, an annual license costs $12 per year, or you can purchase a perpetual license for $49.

Is Excel a credible alternative to SPSS?

I don’t know if a hard-core stats geek would think so, but since I started using the combination of Excel, Real Statistics, and Colectica, I haven’t yet found a need for anything else. So, I’m happy with this.

Other alternatives

Before settling on Excel with the Real Statistics and Colectica add-ins, I found several other helpful alternatives.

PSPP

The GNU PSPP application, developed by the Free Software Foundation is a work-alike that can read and write SPSS data files and whose syntax is similar to SPSS, at least for the features it supports. Donations are requested, but it is free to download and use. If you know how to use SPSS, you’ll be able to use PSPP very quickly. It doesn’t have all of the features of SPSS and the user interface is a little quirky sometimes, but the price is right, and unless you need advanced procedures, it’s an alternative to consider.

You can import Excel files into PSPP, but, at least at present, it’s not possible to export to an Excel file, which I find to be its most frustrating limitation (although, with Colectica, that’s less of a problem than it has been). I haven’t done extensive testing, but have compared the output of some SPSS and PSPP procedures and found that they were the same. So, to my knowledge, PSPP statistical output is accurate.

Websites

There are websites that offer almost every conceivable statistical test. Typically, you either download an app that runs locally on your computer, or upload your data to the website and run the procedure there. In the case of downloaded apps, I always check the files with antivirus software before opening or installing them. So far, I haven’t had any problems with viruses, trojans, etc. but it pays to be security-conscious. Likewise, when uploading data to a statistics website, I always make certain that there’s nothing identifiable in the data, just a list of numbers, none of which are identifiers.

So, I’ve found that there’s a universe of reasonably efficient and accurate statistical software alternatives to the high-priced commercial analytics packages, which is good to know. But, yeah, one of these days I’m still going to get around to learning R…

Leave a Reply

Your email address will not be published. Required fields are marked *