Saturday, January 12, 2013

Data Types and Basic Commands in R Software

Before reading this article, please see my previous post on how to install R on your machine.
R has five basic or "atomic" classes of objects:
  • character
  • numeric (real numbers)
  • integer
  • complex
  • logical (TRUE/FALSE)
The most basic object is a vector
  • A vector can only contain objects of the same class
  • BUT: The one exception is a list, which is represented as a vector but can contain objects of different classes.
  • An empty vector can be created with the vector() function.
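As a small sketch of the vector() function mentioned above (the variable names are only illustrative), the mode and length arguments decide what kind of vector you get, and each element starts at the default value for that mode:

```r
# vector() with no arguments gives an empty logical vector
empty <- vector()
length(empty)            # 0

# Ask for a specific mode and length
v <- vector("numeric", length = 3)
print(v)                 # 0 0 0
class(v)                 # "numeric"

s <- vector("character", length = 2)
print(s)                 # "" ""
```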
R objects can have attributes
  • names, dimnames
  • dimension (e.g. matrices, arrays)
  • class
  • length
  • other user-defined attributes/metadata
Attributes of an object can be accessed using the attributes() function.
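Here is a short, illustrative sketch of inspecting attributes. The object names and the "units" attribute are made up for the example. Note that the length of an object is queried with length() and is not part of what attributes() returns.

```r
# A matrix carries a dim attribute
m <- matrix(1:6, nrow = 2, ncol = 3)
attributes(m)            # $dim: 2 3

# A named vector carries a names attribute
x <- c(a = 1, b = 2)
attributes(x)            # $names: "a" "b"

# attr() sets or reads a single (here user-defined) attribute
attr(x, "units") <- "cm"
attr(x, "units")         # "cm"
length(x)                # 2
```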

Factors are used to represent categorical data. Factors can be unordered or ordered. One can think of a factor as an integer vector where each integer has a label.
  • Factors are treated specially by modelling functions like lm() and glm()
  • Using factors with labels is better than using integers because factors are self-describing.
Data frames are used to store tabular data
  • They are represented as a special type of list where every element of the list has to have the same length
  • Unlike matrices, data frames can store different classes of objects in each column
  • Data frames also have a special attribute called row.names
  • Data frames are usually created by calling read.table() or read.csv()
  • Data frames can be converted to a matrix by calling data.matrix()

Here are some basic commands for practicing:

1. Assignment operator
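A minimal example of assignment with the <- operator; the variable names are arbitrary:

```r
x <- 5          # assign the value 5 to x
print(x)        # explicit printing
x               # typing the name at the console auto-prints it

msg <- "hello"  # a character value
y <- 1:10       # the : operator creates an integer sequence
y
```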


2. Create vector of objects with c() function
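A quick sketch of building vectors of each atomic class with c(). The last line shows what happens when classes are mixed: R silently coerces everything to a common class.

```r
num <- c(0.5, 0.6)         # numeric
lgl <- c(TRUE, FALSE)      # logical
chr <- c("a", "b", "c")    # character
int <- 9:12                # integer
cpl <- c(1+0i, 2+4i)       # complex

mixed <- c(1.7, "a")       # numeric and character mixed
class(mixed)               # "character"
```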


3. Coerce from one class to another using as.* functions, if available
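An illustrative example of explicit coercion. When a conversion does not make sense, R returns NA with a warning; suppressWarnings() is used here only to keep the output quiet.

```r
x <- 0:3
class(x)          # "integer"

as.numeric(x)     # 0 1 2 3
as.logical(x)     # FALSE TRUE TRUE TRUE
as.character(x)   # "0" "1" "2" "3"

# Nonsensical coercion produces NA values
bad <- suppressWarnings(as.numeric(c("a", "b")))
bad               # NA NA
```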


4. Create matrices
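A few common ways to create a matrix, shown as a sketch with arbitrary values. Matrices are filled column by column by default.

```r
# matrix() fills column-wise
m <- matrix(1:6, nrow = 2, ncol = 3)
m
dim(m)                    # 2 3

# Adding a dim attribute turns a vector into a matrix
v <- 1:10
dim(v) <- c(2, 5)
v

# cbind() and rbind() bind vectors as columns or rows
cb <- cbind(1:3, 10:12)   # 3 rows, 2 columns
rb <- rbind(1:3, 10:12)   # 2 rows, 3 columns
```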



5. Create list
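A small example of a list holding elements of different classes; the values are arbitrary:

```r
l <- list(1, "a", TRUE, 1+4i)
l

# Each element keeps its own class
sapply(l, class)   # "numeric" "character" "logical" "complex"

# Double brackets extract a single element
l[[2]]             # "a"
```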


6. Create factors
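A sketch of creating a factor from a character vector. By default the levels are sorted alphabetically; the levels argument sets the order explicitly.

```r
f <- factor(c("yes", "yes", "no", "yes", "no"))
f
levels(f)      # "no" "yes"
table(f)       # counts per level
unclass(f)     # underlying integer codes: 2 2 1 2 1

# Set the level order explicitly
f2 <- factor(c("yes", "yes", "no", "yes", "no"),
             levels = c("yes", "no"))
levels(f2)     # "yes" "no"
```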



7. Check for missing values
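An illustrative check for missing values with is.na() and is.nan(). A NaN value counts as NA, but an NA value is not NaN.

```r
x <- c(1, 2, NA, 10, 3)
is.na(x)     # FALSE FALSE TRUE FALSE FALSE
is.nan(x)    # FALSE FALSE FALSE FALSE FALSE

y <- c(1, 2, NaN, NA, 4)
is.na(y)     # FALSE FALSE TRUE TRUE FALSE
is.nan(y)    # FALSE FALSE TRUE FALSE FALSE
```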


8. Create data frames
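A minimal sketch of building a data frame by hand with data.frame(). The column names foo and bar are placeholders.

```r
df <- data.frame(foo = 1:4, bar = c(TRUE, TRUE, FALSE, FALSE))
df
nrow(df)     # 4
ncol(df)     # 2

# Columns can have different classes
sapply(df, class)

# Convert to a matrix; logical values become 0/1
m <- data.matrix(df)
m
```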


9. Name R object
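A short example of giving names to vectors, lists, and matrices; all of the names used here are arbitrary:

```r
# Vectors
x <- 1:3
names(x)                    # NULL
names(x) <- c("foo", "bar", "norf")
x

# Lists
l <- list(a = 1, b = 2, c = 3)
names(l)                    # "a" "b" "c"

# Matrices: row names first, then column names
m <- matrix(1:4, nrow = 2, ncol = 2)
dimnames(m) <- list(c("a", "b"), c("c", "d"))
m
```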




Have fun!