Data Types

R's data types include vectors (numeric, character, logical), matrices, data frames, and lists.

Vectors

a <- c(1,2,5.3,6,-2,4) # numeric vector
b <- c("one","two","three") # character vector
c <- c(TRUE,TRUE,TRUE,FALSE,TRUE,FALSE) # logical vector

Use subscripts to refer to individual elements of a vector.

a <- letters[1:10]
a # contents of the vector a
a[c(2,4)] # 2nd and 4th elements of the vector
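
Besides positive positions, R also accepts negative and logical subscripts. The following is a minimal sketch; the vector name nums is made up for illustration and does not appear elsewhere on this page.

```r
nums <- c(10, 20, 30, 40, 50)  # illustrative numeric vector
nums[2:4]        # elements 2 through 4
nums[-1]         # every element except the first
nums[nums > 25]  # only the elements greater than 25
```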


Matrices

In a matrix, all columns must have the same mode (all numeric, all character, and so on) and the same length. Create a matrix with the matrix( ) function.

mat <- matrix(vector, nrow=r, ncol=c, byrow=FALSE,
  dimnames=list(char_vector_rownames, char_vector_colnames))

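With byrow=TRUE, the matrix is filled row by row: the first row is filled from left to right, then the second row from the left, and so on.
With byrow=FALSE, the matrix is filled column by column: the first column from top to bottom, then the second column. This is the default.
dimnames sets the row and column names.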

# a 2 x 5 numeric matrix
x <- 1:10
x
x.mat <- matrix(x, nrow=2)
x.mat

# fill by rows with byrow=TRUE
x.mat2 <- matrix(x, nrow=2, byrow=TRUE)
x.mat2

# assign row and column names with dimnames
x.mat3 <- matrix(x, nrow=2, dimnames=list(c("r1", "r2"), c("c1", "c2", "c3", "c4", "c5"))) # do not forget list()
x.mat3

# select rows and columns with subscripts
x.mat[2,] # 2nd row only
x.mat[,3] # 3rd column only
x.mat[1:2,2:3] # rows 1,2 of columns 2,3

Arrays

Arrays are similar to matrices but can have more than two dimensions. See help(array) for details.
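
As a small sketch of what "more than two dimensions" looks like in practice, the following builds a three-dimensional array with array( ). The name arr and the chosen dimensions are illustrative only.

```r
arr <- array(1:24, dim = c(2, 3, 4))  # 2 rows, 3 columns, 4 layers
dim(arr)      # the dimensions of the array
arr[1, 2, 3]  # row 1, column 2, layer 3
arr[, , 1]    # the whole first layer, a 2 x 3 matrix
```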

Dataframes

A dataframe is more general than a matrix, in that different columns can have different modes (numeric, character, factor, etc.). This is similar to SAS and SPSS datasets.

d <- c(1,2,3,4)
e <- c("red", "white", "red", NA)
f <- c(TRUE,TRUE,TRUE,FALSE)
mydata <- data.frame(d,e,f)
names(mydata) <- c("ID","Color","Passed") # variable names

There are a variety of ways to identify the elements of a dataframe.

# myframe stands for any dataframe that has these columns
myframe[3:5] # columns 3,4,5 of dataframe
myframe[c("ID","Age")] # columns ID and Age from dataframe
myframe$X1 # variable X1 in the dataframe
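
For a version that can be run as is, the same access patterns can be applied to a small dataframe like the mydata example. This is only a sketch; the last line (selecting rows with a logical column) is an extra illustration that is not part of the list above.

```r
# Rebuild a small dataframe so the example is self-contained
mydata <- data.frame(ID = c(1, 2, 3, 4),
                     Color = c("red", "white", "red", NA),
                     Passed = c(TRUE, TRUE, TRUE, FALSE))

mydata[2:3]                # columns 2 and 3, returned as a dataframe
mydata[c("ID", "Passed")]  # columns selected by name
mydata$Color               # a single column
mydata[mydata$Passed, ]    # only the rows where Passed is TRUE
```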

Lists

A list is an ordered collection of objects (components). A list allows you to gather a variety of (possibly unrelated) objects under one name.

# example of a list with 4 components:
# a string, a numeric vector, a matrix, and a scalar
# (a is assumed to be a numeric vector and y a matrix, defined beforehand)
w <- list(name="Fred", mynumbers=a, mymatrix=y, age=5.3)

# example of combining two existing lists into a single list
v <- c(list1,list2)

Identify elements of a list using the [[]] convention.

mylist[[2]] # 2nd component of the list
mylist[["mynumbers"]] # component named mynumbers in list
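
The difference between [[ ]] and single brackets is easy to miss, so here is a minimal runnable sketch. The list mylist and its contents are made up for illustration.

```r
mylist <- list(name = "Fred", mynumbers = c(1, 2, 5.3), age = 5.3)

mylist[[2]]            # the numeric vector itself
mylist[["mynumbers"]]  # the same component, selected by name
mylist$mynumbers       # shorthand for the line above
mylist[2]              # a list of length 1 that contains the vector
```

Single brackets always return a list, while [[ ]] and $ return the component itself.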

Factors

Tell R that a variable is nominal by making it a factor. The factor stores the nominal values as a vector of integers in the range [1...k] (where k is the number of unique values in the nominal variable), and an internal vector of character strings (the original values) mapped to these integers.

# variable gender with 20 "male" entries and
# 30 "female" entries
gender <- c(rep("male",20), rep("female", 30))
gender <- factor(gender)
# stores gender as 20 2s and 30 1s and associates
# 1=female, 2=male internally (alphabetically)
# R now treats gender as a nominal variable
summary(gender)

An ordered factor is used to represent an ordinal variable.

# variable rating coded as "large", "medium", "small"
rating <- ordered(rating)
# recodes rating to 1,2,3 and associates
# 1=large, 2=medium, 3=small internally
# R now treats rating as ordinal

R will treat factors as nominal variables and ordered factors as ordinal variables in statistical procedures and graphical analyses. You can use options in the factor( ) and ordered( ) functions to control the mapping of integers to strings (overriding the alphabetical ordering). You can also use factors to create value labels. For more on factors see the UCLA page.
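
As a sketch of how those options work, the levels argument sets the integer-to-string mapping explicitly and the labels argument attaches value labels. The data values and the variable name sex below are invented for illustration.

```r
# Ordered factor with an explicit order instead of the alphabetical one
rating <- c("small", "large", "medium", "small", "large")  # illustrative data
rating <- factor(rating, levels = c("small", "medium", "large"), ordered = TRUE)
levels(rating)      # "small" "medium" "large"
as.integer(rating)  # 1 3 2 1 3
rating < "large"    # comparisons follow the level order

# Value labels: numeric codes 1 and 2 shown as "male" and "female"
sex <- factor(c(1, 2, 2, 1), levels = c(1, 2), labels = c("male", "female"))
as.character(sex)   # "male" "female" "female" "male"
```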

Useful Functions

length(object) # number of elements or components
str(object)    # structure of an object
class(object)  # class or type of an object
names(object)  # names

c(object,object,...)       # combine objects into a vector
cbind(object, object, ...) # combine objects as columns
rbind(object, object, ...) # combine objects as rows

object     # prints the object

ls()       # list current objects
rm(object) # delete an object

newobject <- edit(object) # edit copy and save as newobject
fix(object)               # edit in place
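
As a rough illustration of a few of the functions listed above, the following combines two vectors and inspects the results. The object names p, q, m and r are made up for this sketch.

```r
p <- 1:3
q <- 4:6

m <- cbind(p, q)  # 3 x 2 matrix with columns p and q
r <- rbind(p, q)  # 2 x 3 matrix with rows p and q

length(m)    # total number of elements: 6
dim(m)       # 3 2
colnames(m)  # "p" "q"
str(r)       # compact description of the structure
```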