Statistics is about reviewing and analyzing data, and R provides many built-in functions for statistical analysis, computing and graphing. Most of the statistical functions are part of base R, and they take vectors as input, along with other parameters, to produce their results. We will go through the following basic ones and do some statistical analysis.
- Summary
- Minimum and maximum value
- Mean, median and mode
- Percentiles
- Variance and Standard Deviation
- Linear regression
The following table summarizes the functions and parameters that we can use to compute these statistics.

Let’s use one of the built-in datasets in R to go through some of the functions and their parameters, and to get familiar with statistical analysis.
The data set is called “mtcars” and it contains data from the Motor Trend Car Road Tests. The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models). Remember, if you want more information about a data set, you can type “?” followed by the data set name and R will open its help page.
Go to RStudio and first print the data just by typing the name of the data set, highlight it and click the “Run” button. It will show 32 cars and 11 columns with different information.
Then, run “?mtcars” to get more info about each column.
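The two steps above can be sketched roughly as follows. The head() and dim() calls are additions not mentioned in the text; they are only there to keep the printed output short and to confirm the dimensions.

```r
# mtcars ships with base R (datasets package), so nothing needs to be loaded
mtcars          # prints all 32 cars with their 11 columns
head(mtcars)    # first six rows only, handy for a quick look
dim(mtcars)     # number of rows and columns

# Opens the help page describing each column; uncomment in an interactive session
# ?mtcars
```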
Summary Function
Let’s use the summary() function to get a statistical summary of the data set. As you will see, you get the following for each column’s data:
- Min (minimum value)
- First quartile (25th percentile)
- Median
- Mean
- Third quartile (75th percentile)
- Max (maximum value)
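A minimal sketch of the call, applied first to the whole data frame and then to a single column. The variable name wt_summary is only an illustrative choice.

```r
# Summary of every column in the data frame
summary(mtcars)

# Summary of one column only (weight, in 1000 lbs)
wt_summary <- summary(mtcars$wt)
print(wt_summary)
```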
Minimum and maximum values
I wonder which cars are the heaviest and the lightest, and what those cars’ names are. We will use the max(), min(), which.max(), which.min() and rownames() functions. To access a column of a data frame, we use the “$” sign followed by the name of the column.
So, the maximum weight is 5.424 (in units of 1000 lbs) for the Lincoln Continental and the minimum weight is 1.513 for the Lotus Europa.
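One way to get these values with the functions named above is sketched here. The variable names are illustrative assumptions, not part of any API.

```r
# Largest and smallest weight in the wt column
heaviest <- max(mtcars$wt)
lightest <- min(mtcars$wt)

# which.max()/which.min() return the row position of the extreme value;
# rownames() turns that position into the car's name
heaviest_car <- rownames(mtcars)[which.max(mtcars$wt)]
lightest_car <- rownames(mtcars)[which.min(mtcars$wt)]

cat("Heaviest:", heaviest_car, heaviest, "\n")
cat("Lightest:", lightest_car, lightest, "\n")
```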
Mean, median and mode
What are the mean, median and mode of the weight of the cars? R has the mean() and median() functions to calculate the average (mean) and the median value of the data, but it does not have a function to calculate the statistical mode (the built-in mode() function reports an object's storage type instead). We will create our own function to find it.
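Here is a minimal sketch of such a function. The name get_mode is hypothetical, and this version returns only the first most frequent value when several values tie.

```r
# Hypothetical helper: returns the most frequent value of a vector.
# If several values tie, only the first one encountered is returned.
get_mode <- function(x) {
  unique_values <- unique(x)
  # match() maps each element to its position in unique_values,
  # tabulate() counts how often each position occurs
  counts <- tabulate(match(x, unique_values))
  unique_values[which.max(counts)]
}

wt_mean   <- mean(mtcars$wt)
wt_median <- median(mtcars$wt)
wt_mode   <- get_mode(mtcars$wt)

cat("Mean:", wt_mean, "Median:", wt_median, "Mode:", wt_mode, "\n")
```

Because it relies on unique() and match(), the same helper also works on character vectors.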
Percentiles
A percentile is the value below which a given percentage of the observations fall. With the quantile() function, you can use c() to specify the percentages you want (given as probabilities between 0 and 1); if you do not specify any, you get the 0, 25, 50, 75 and 100% results. Both forms are worth trying.
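Both forms can be sketched as follows. The 10%, 50% and 90% values are arbitrary choices for illustration, and the variable names are assumptions.

```r
# Default call: 0%, 25%, 50%, 75% and 100%
wt_quartiles <- quantile(mtcars$wt)
print(wt_quartiles)

# Custom percentiles, passed as probabilities through the probs argument
wt_selected <- quantile(mtcars$wt, probs = c(0.10, 0.50, 0.90))
print(wt_selected)
```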
Variance and standard deviation
Variance is a measure of variability: it represents the average squared deviation from the mean. Standard deviation is the square root of the variance. Let’s use the same example data set to see the variance and standard deviation. We will use the var() and sd() functions. Note that both compute the sample versions, dividing by n - 1 rather than n.
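A short sketch using the weight column, with a manual calculation alongside to show what var() does. The variable names are illustrative.

```r
wt <- mtcars$wt

wt_var <- var(wt)   # sample variance (divides by n - 1)
wt_sd  <- sd(wt)    # sample standard deviation

# The same variance computed by hand, for comparison
manual_var <- sum((wt - mean(wt))^2) / (length(wt) - 1)

cat("Variance:", wt_var, "\n")
cat("Standard deviation:", wt_sd, "\n")
cat("Manual variance:", manual_var, "\n")
```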
Linear Regression
Linear regression is used to establish a relationship between two variables that are related through an equation. Mathematically, a linear relationship represents a straight line, and the equation looks like this:
y=ax+b (a & b are constants)
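Let’s zoom in on the details, using the weight (wt) and fuel consumption (mpg) columns of mtcars.
Plotting mpg against wt shows the data points together with the fitted line.
The formula for the linear relationship is mpg = 37.285 + (-5.344) * wt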
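A fit like this can be reproduced with lm(), roughly as sketched below. The axis labels, plot title and line color are illustrative choices, not taken from the original plot.

```r
# Fit mpg as a linear function of wt
fit <- lm(mpg ~ wt, data = mtcars)

# Intercept (b) and slope (a) of the fitted line
print(coef(fit))

# Full details: residuals, standard errors, R-squared, and so on
print(summary(fit))

# Scatter plot of the data with the fitted line on top
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon",
     main = "mpg vs. wt")
abline(fit, col = "red")
```

A fitted model can then be used for estimates, for example predict(fit, data.frame(wt = 3)) for a car weighing 3000 lbs.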
That is it! I hope you got a feeling for statistics in R programming. Please use some other data sets to practice more with the statistical functions. In the next post, we will go through accessing, entering and importing data sets.
