Data analysis is a new name for an old profession. We all perform data analysis daily, but may not realize it because it is such a natural function. This article will show you how to find and use free software to analyze data.
What Is Data Analysis?
Have you ever calculated miles per gallon for your car? Have you ever calculated your GPA or your test average for school? Have you ever averaged a set of numbers or completed a statistics project for school? Then you have performed data analysis.
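To make the point concrete, here is a minimal Python sketch of those two everyday calculations. The fill-up and course figures are made up for illustration, and the function names are hypothetical; any spreadsheet could do the same arithmetic.

```python
# Hypothetical fill-up records: (miles driven, gallons purchased).
fill_ups = [(310.0, 10.0), (280.0, 8.0), (330.0, 12.0)]

def miles_per_gallon(records):
    """Overall MPG: total miles divided by total gallons."""
    total_miles = sum(miles for miles, _ in records)
    total_gallons = sum(gallons for _, gallons in records)
    return total_miles / total_gallons

# Hypothetical course grades: (grade points on a 4.0 scale, credit hours).
courses = [(4.0, 3), (3.0, 4), (2.0, 3)]

def grade_point_average(records):
    """Credit-weighted GPA: total quality points divided by total credits."""
    total_points = sum(points * credits for points, credits in records)
    total_credits = sum(credits for _, credits in records)
    return total_points / total_credits

print(round(miles_per_gallon(fill_ups), 2))
print(round(grade_point_average(courses), 2))
```

Note that the overall MPG divides total miles by total gallons rather than averaging the MPG of each fill-up, which would weight small and large tanks equally.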
We will not go through the different stages of data analysis in detail. This article assumes you have already determined what data you need and have “cleaned” it well enough that your calculations will be as verifiable as possible. Our focus here is on four free applications:
- OpenOffice (spreadsheet program)
- R (statistical software)
- QGIS (geospatial software)
- KNIME (simulation/modeling software)
You can download these programs using links listed in the “Resources” section of this article.
OpenOffice’s spreadsheet program is much like Microsoft Excel, although without the price. Remember one thing as we continue with this article – free does not necessarily mean intuitive. You may have to struggle a little with these free programs, but you have plenty of communities and developers available to help. You can also make recommendations for further enhancements to the programs.
OpenOffice can help you perform elementary data analysis, such as calculating the mean, median and standard deviation. Much like Excel, its spreadsheet program handles almost all of the basic statistical analysis you might want to run on your data.
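For readers who want to check a spreadsheet's answers, the same three statistics can be sketched in a few lines of Python using the standard library's statistics module. The scores below are made-up sample data. In the spreadsheet itself, the corresponding functions are AVERAGE, MEDIAN and STDEV.

```python
import statistics

# Hypothetical test scores, as you might type them into one spreadsheet column.
scores = [72, 85, 90, 66, 85, 78]

mean = statistics.mean(scores)        # arithmetic average
median = statistics.median(scores)    # middle value of the sorted scores
sample_sd = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)

print(f"mean={mean:.2f} median={median} sd={sample_sd:.2f}")
```

The sketch uses the sample standard deviation, which is the usual choice when the data is a sample rather than the whole population; statistics.pstdev gives the population version.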
R Stats Software
This free program, downloaded from the “CRAN” servers located all over the world, is not only highly usable but also extremely flexible, thanks to the different “packages” that can be installed from those servers. These packages can add graphics, geospatial and even data mining capabilities. One of my favorite R packages is Rattle. This GUI-based data mining application for R lets users take existing data and run tests at the touch of a button, including some sophisticated regression analysis and time series graphs.
QGIS Geospatial Software
Like the two previous applications, Quantum GIS (QGIS) is free to download and use. The interesting thing about QGIS is that it can take readily available geospatial maps and use them as part of the overall application. Other GIS applications do the same but can be very expensive. QGIS lets the neophyte GIS specialist develop maps for free and even publish those maps to websites. The main strength of QGIS is the training material available, both on the site (which has a fully functional training manual that you can add to if you choose) and in books.
KNIME Data Modeling
Did you ever wish there was a way to perform myriad evaluations on data in a repeatable fashion? Well, KNIME is the tool for you. I was floored by the possibilities of using a “macro on steroids” to get data cleaned, analyzed, evaluated and reported all in one application. In fact, this application also does some GIS! Modeling, for the uninitiated, means chaining steps together (almost like a conveyor system) so that a series of manual steps runs in an automated fashion.
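The conveyor idea can be illustrated with a short Python sketch. This is not KNIME's interface (KNIME builds workflows graphically from connected nodes); it only shows, under made-up data and hypothetical function names, what it means to run the same clean, analyze and report steps in a fixed order every time.

```python
import statistics

def clean(rows):
    """Drop missing entries and convert the rest to floats."""
    return [float(r) for r in rows if r not in ("", None)]

def analyze(values):
    """Summarize the cleaned values."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }

def report(summary):
    """Format the summary as one line of text."""
    return ", ".join(f"{name}={value:g}" for name, value in summary.items())

def run_pipeline(data, steps):
    """Pass the data through each step in order, like items on a conveyor."""
    for step in steps:
        data = step(data)
    return data

# The "model": an ordered list of steps that can be rerun on any new data set.
PIPELINE = [clean, analyze, report]

raw = ["3.5", "", "4.0", None, "5.5"]  # hypothetical raw readings with gaps
print(run_pipeline(raw, PIPELINE))
```

Because the steps live in one list, rerunning the analysis on next month's data means calling run_pipeline again, with no manual work repeated.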
This table can help you compare the free programs for “one stop shopping.” Happy analyzing!