07 - Functions

François Rebaudo, IRD francois.rebaudo@ird.fr

March 2019; PUCE-Quito-Ecuador http://myrbooksp.netlify.com/

CC BY-NC-ND 3.0

Functions

function(argument1 = x, argument2 = y)

Accessing the documentation

help or ?

help(matrix) # equivalent to ?matrix

# --------------------------------------------------------
# Matrices
# --------------------------------------------------------
#
# Description
# --------------------------------------------------------
# 
# matrix creates a matrix from the given set of values.
# 
# as.matrix attempts to turn its argument into a matrix.
# 
# is.matrix tests if its argument is a (strict) matrix.
# 
# Usage
# --------------------------------------------------------
# 
# matrix(data = NA, nrow = 1, ncol = 1, byrow = FALSE,
#        dimnames = NULL)
# 
# as.matrix(x, ...)
## S3 method for class 'data.frame'
# as.matrix(x, rownames.force = NA, ...)
# 
# is.matrix(x)
#
# Arguments
# --------------------------------------------------------
# 
# data  an optional data vector (including a list or expression vector). Non-atomic classed R objects are coerced by as.vector and all attributes discarded.
# nrow  the desired number of rows.
# ncol  the desired number of columns.
# byrow logical. If FALSE (the default) the matrix is filled by columns, otherwise the matrix is filled by rows.
# dimnames  A dimnames attribute for the matrix: NULL or a list of length 2 giving the row and column names respectively. An empty list is treated as NULL, and a list of length one as row names. The list can be named, and the list names will be used as names for the dimensions.
# x an R object.
# ...   additional arguments to be passed to or from methods.
# rownames.force    logical indicating if the resulting matrix should have character (rather than NULL) rownames. The default, NA, uses NULL rownames if the data frame has ‘automatic’ row.names or for a zero-row data frame.
# 
# Details
# --------------------------------------------------------
# 
# If one of nrow or ncol is not given, an attempt is made to infer it from the length of data and the other parameter. If neither is given, a one-column matrix is returned.
# 
# If there are too few elements in data to fill the matrix, then the elements in data are recycled. If data has length zero, NA of an appropriate type is used for atomic vectors (0 for raw vectors) and NULL for lists.
# 
# is.matrix returns TRUE if x is a vector and has a "dim" attribute of length 2 and FALSE otherwise. Note that a data.frame is not a matrix by this test. The function is generic: you can write methods to handle specific classes of objects, see InternalMethods.
# 
# as.matrix is a generic function. The method for data frames will return a character matrix if there is only atomic columns and any non-(numeric/logical/complex) column, applying as.vector to factors and format to other non-character columns. Otherwise, the usual coercion hierarchy (logical < integer < double < complex) will be used, e.g., all-logical data frames will be coerced to a logical matrix, mixed logical-integer will give a integer matrix, etc.
# 
# The default method for as.matrix calls as.vector(x), and hence e.g. coerces factors to character vectors.
# 
# When coercing a vector, it produces a one-column matrix, and promotes the names (if any) of the vector to the rownames of the matrix.
# 
# is.matrix is a primitive function.
# 
# The print method for a matrix gives a rectangular layout with dimnames or indices. For a list matrix, the entries of length not one are printed in the form integer,7 indicating the type and length.
# 
# Note
# --------------------------------------------------------
# 
# If you just want to convert a vector to a matrix, something like
# 
#   dim(x) <- c(nx, ny)
#   dimnames(x) <- list(row_names, col_names)
#   will avoid duplicating x.
# 
# References
# --------------------------------------------------------
# 
# Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
# 
# See Also
# --------------------------------------------------------
# 
# data.matrix, which attempts to convert to a numeric matrix.
# 
# A matrix is the special case of a two-dimensional array.
# 
# Examples
# --------------------------------------------------------
# 
# is.matrix(as.matrix(1:10))
# !is.matrix(warpbreaks)  # data.frame, NOT matrix!
# warpbreaks[1:10,]
# as.matrix(warpbreaks[1:10,])  # using as.matrix.data.frame(.) method
# 
## Example of setting row and column names
# mdat <- matrix(c(1,2,3, 11,12,13), nrow = 2, ncol = 3, byrow = TRUE,
#                dimnames = list(c("row1", "row2"),
#                                c("C.1", "C.2", "C.3")))
# mdat

help.search()

The help.search() function, or its shortcut ??, searches the entire documentation for an expression.
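
As a minimal sketch of such a search, the call below looks for the word "matrix" in the help pages of the base package only. Restricting the search with the package argument and the name searchResult are illustrative choices for this example, not something the slides prescribe.

```r
# Search the help pages of the 'base' package for the word "matrix"
searchResult <- help.search("matrix", package = "base")

# The result is an object of class "hsearch"
class(searchResult)

# In an interactive session, the shortcut searches all installed packages:
# ??matrix
```

In an interactive session, printing searchResult opens the list of matching help pages.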

Viewing the data

str()

The str() function displays the structure of an object, including the type of each of its elements.

str(iris)
## 'data.frame':    150 obs. of  5 variables:
##  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
##  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
##  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
##  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
##  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

head() and tail()

head() and tail() show the first and last elements of an object (six by default).

head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa

tail(iris)
##     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
## 145          6.7         3.3          5.7         2.5 virginica
## 146          6.7         3.0          5.2         2.3 virginica
## 147          6.3         2.5          5.0         1.9 virginica
## 148          6.5         3.0          5.2         2.0 virginica
## 149          6.2         3.4          5.4         2.3 virginica
## 150          5.9         3.0          5.1         1.8 virginica

head(iris, n = 20)
##    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1           5.1         3.5          1.4         0.2  setosa
## 2           4.9         3.0          1.4         0.2  setosa
## 3           4.7         3.2          1.3         0.2  setosa
## 4           4.6         3.1          1.5         0.2  setosa
## 5           5.0         3.6          1.4         0.2  setosa
## 6           5.4         3.9          1.7         0.4  setosa
## 7           4.6         3.4          1.4         0.3  setosa
## 8           5.0         3.4          1.5         0.2  setosa
## 9           4.4         2.9          1.4         0.2  setosa
## 10          4.9         3.1          1.5         0.1  setosa
## 11          5.4         3.7          1.5         0.2  setosa
## 12          4.8         3.4          1.6         0.2  setosa
## 13          4.8         3.0          1.4         0.1  setosa
## 14          4.3         3.0          1.1         0.1  setosa
## 15          5.8         4.0          1.2         0.2  setosa
## 16          5.7         4.4          1.5         0.4  setosa
## 17          5.4         3.9          1.3         0.4  setosa
## 18          5.1         3.5          1.4         0.3  setosa
## 19          5.7         3.8          1.7         0.3  setosa
## 20          5.1         3.8          1.5         0.3  setosa

names()

names(iris)
## [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width" 
## [5] "Species"

cat() and print()

cat(names(iris))
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
print(names(iris))
## [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width" 
## [5] "Species"

Manipulating the data

rank()

The rank() function returns the rank of each element, that is, the position it would occupy if the elements were sorted in increasing order.

vecManip <- c(10, 20, 30, 70, 60, 50, 40)
rank(vecManip)
## [1] 1 2 3 7 6 5 4

vecManip2 <- c(10, 20, 30, 10, 50, 10, 40)
rank(vecManip2)
## [1] 2 4 5 2 7 2 6
rank(vecManip2, ties.method = "first")
## [1] 1 4 5 2 7 3 6
rank(vecManip2, ties.method = "min")
## [1] 1 4 5 1 7 1 6
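
The default for rank() is ties.method = "average", which is why the three values of 10 (sorted positions 1, 2 and 3) all receive rank 2 in the first call. As a small sketch, several methods can be compared side by side; the names tieMethods and ranks are illustrative, and the vector is redefined so the example runs on its own.

```r
vecManip2 <- c(10, 20, 30, 10, 50, 10, 40)

# "max": tied values all receive the largest of their sorted positions
rank(vecManip2, ties.method = "max")

# Compare several methods at once: one column per method
tieMethods <- c("average", "first", "min", "max")
ranks <- sapply(tieMethods, function(m) rank(vecManip2, ties.method = m))
t(ranks)   # transposed so that each row corresponds to one method
```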

order()

The order() function returns the permutation of indices that sorts the elements: the first value is the index of the smallest element, the second is the index of the next smallest, and so on.

print(vecManip2)
## [1] 10 20 30 10 50 10 40
rank(vecManip2)
## [1] 2 4 5 2 7 2 6
order(vecManip2)
## [1] 1 4 6 2 3 7 5
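
One way to relate the two functions: order() answers "which element goes in each sorted position?", while rank() answers "which sorted position does each element get?". The sketch below checks that applying order() twice gives the ranks with ties broken by first occurrence; the name ord is illustrative.

```r
vecManip2 <- c(10, 20, 30, 10, 50, 10, 40)

ord <- order(vecManip2)   # indices of the elements, smallest first
print(ord)

# Ordering the ordering gives the rank of each element (ties resolved by position)
print(order(ord))
all(order(ord) == rank(vecManip2, ties.method = "first"))
```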

sort()

The sort() function sorts the elements of an object (in increasing order by default).

print(vecManip2)
## [1] 10 20 30 10 50 10 40
sort(vecManip2)
## [1] 10 10 10 20 30 40 50
vecManip2[order(vecManip2)]
## [1] 10 10 10 20 30 40 50

print(vecManip2)
## [1] 10 20 30 10 50 10 40
order(vecManip2)
## [1] 1 4 6 2 3 7 5
vecManip2[c(1, 4, 6, 2, 3, 7, 5)]
## [1] 10 10 10 20 30 40 50
sort(vecManip2)
## [1] 10 10 10 20 30 40 50

order()

miDf <- data.frame(
  v1 = c(1, 5, 6, 7, 2, 3), 
  v2 = c("z", "t", "y", "t", "n", "b"))
print(miDf)
##   v1 v2
## 1  1  z
## 2  5  t
## 3  6  y
## 4  7  t
## 5  2  n
## 6  3  b

miDf[order(miDf$v1),]
##   v1 v2
## 1  1  z
## 5  2  n
## 6  3  b
## 2  5  t
## 3  6  y
## 4  7  t

miDf[order(miDf$v2),]
##   v1 v2
## 6  3  b
## 5  2  n
## 2  5  t
## 4  7  t
## 3  6  y
## 1  1  z
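
In the last result the two rows with v2 equal to "t" keep their original relative order. A possible extension, sketched here, passes a second key to order() to decide such ties, and uses decreasing = TRUE to reverse a sort; the name miDfSorted is illustrative.

```r
miDf <- data.frame(
  v1 = c(1, 5, 6, 7, 2, 3),
  v2 = c("z", "t", "y", "t", "n", "b"))

# Sort by v2, then by decreasing v1 within equal values of v2
miDfSorted <- miDf[order(miDf$v2, -miDf$v1), ]
print(miDfSorted)

# Sort the whole data frame by decreasing v1
miDf[order(miDf$v1, decreasing = TRUE), ]
```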

append()

The append() function adds elements to a vector at the position given by the after argument (by default, at the end).

print(vecManip2)
## [1] 10 20 30 10 50 10 40
append(vecManip2, 5)
## [1] 10 20 30 10 50 10 40  5
append(vecManip2, 5, after = 2)
## [1] 10 20  5 30 10 50 10 40

cbind() and rbind()

cbind(vecManip2, vecManip2)
##      vecManip2 vecManip2
## [1,]        10        10
## [2,]        20        20
## [3,]        30        30
## [4,]        10        10
## [5,]        50        50
## [6,]        10        10
## [7,]        40        40

rbind(vecManip2, vecManip2)
##           [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## vecManip2   10   20   30   10   50   10   40
## vecManip2   10   20   30   10   50   10   40

paste() and paste0()

paste(1, "a")
## [1] "1 a"
paste0(1, "a")
## [1] "1a"
paste(1, "a", sep = "_")
## [1] "1_a"

paste0("prefix_", vecManip2, "_suffix")
## [1] "prefix_10_suffix" "prefix_20_suffix" "prefix_30_suffix"
## [4] "prefix_10_suffix" "prefix_50_suffix" "prefix_10_suffix"
## [7] "prefix_40_suffix"

paste(vecManip2, rank(vecManip2), sep = "_")
## [1] "10_2" "20_4" "30_5" "10_2" "50_7" "10_2" "40_6"
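
The calls above return one string per element. When a single string is wanted, paste() also accepts a collapse argument, sketched below; the names oneString and termLabels are illustrative.

```r
vecManip2 <- c(10, 20, 30, 10, 50, 10, 40)

# sep joins the arguments element by element; collapse then joins the results
oneString <- paste(vecManip2, collapse = "-")
print(oneString)

# Both arguments can be combined in one call
termLabels <- paste("v", 1:3, sep = "", collapse = " + ")
print(termLabels)
```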

rev()

print(vecManip2)
## [1] 10 20 30 10 50 10 40
rev(vecManip2)
## [1] 40 10 50 10 30 20 10

%in%

print(vecManip)
## [1] 10 20 30 70 60 50 40
print(vecManip2)
## [1] 10 20 30 10 50 10 40
vecManip %in% vecManip2
## [1]  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE
vecManip2 %in% vecManip
## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE
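
Because %in% returns a logical vector, it can be used directly to filter. A short sketch follows; the names common and onlyInFirst are illustrative.

```r
vecManip <- c(10, 20, 30, 70, 60, 50, 40)
vecManip2 <- c(10, 20, 30, 10, 50, 10, 40)

# Elements of vecManip that also appear in vecManip2
common <- vecManip[vecManip %in% vecManip2]
print(common)

# Elements of vecManip that do not appear in vecManip2
onlyInFirst <- vecManip[!(vecManip %in% vecManip2)]
print(onlyInFirst)

# Positions of the matching elements
which(vecManip %in% vecManip2)
```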

Mathematical functions

vecManip3 <- c(10, 100)
exp(vecManip3)
## [1] 2.202647e+04 2.688117e+43
sqrt(vecManip3)
## [1]  3.162278 10.000000
abs(-vecManip3)
## [1]  10 100

sin(vecManip3)
## [1] -0.5440211 -0.5063656
cos(vecManip3)
## [1] -0.8390715  0.8623189
tan(vecManip3)
## [1]  0.6483608 -0.5872139

log(vecManip3)
## [1] 2.302585 4.605170
log10(vecManip3)
## [1] 1 2
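
Two details are worth keeping in mind when reading these results: log() computes the natural logarithm unless a base is supplied, and the trigonometric functions expect angles in radians. A brief sketch follows; the names logBase10, logBase2 and sineRight are illustrative.

```r
# log() is the natural logarithm; another base can be requested explicitly
logBase10 <- log(100, base = 10)   # same value as log10(100)
logBase2 <- log(8, base = 2)       # same value as log2(8)
print(c(logBase10, logBase2))

# Angles are in radians: a right angle is pi / 2
sineRight <- sin(pi / 2)
print(sineRight)
```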

Descriptive statistics

vecManip3 <- c(1, 5, 6, 8, NA, 45, NA, 14)
mean(vecManip3)
## [1] NA
mean(vecManip3, na.rm = TRUE)
## [1] 13.16667
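
With na.rm = TRUE the missing values are dropped before the mean is computed. The sketch below does the same steps by hand to show what happens; the name validValues is illustrative.

```r
vecManip3 <- c(1, 5, 6, 8, NA, 45, NA, 14)

# Count the missing values
sum(is.na(vecManip3))

# Keep only the non-missing values, then average them
validValues <- vecManip3[!is.na(vecManip3)]
mean(validValues)   # same result as mean(vecManip3, na.rm = TRUE)
```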

sd(vecManip3, na.rm = TRUE)
## [1] 16.16684
max(vecManip3, na.rm = TRUE)
## [1] 45
min(vecManip3, na.rm = TRUE)
## [1] 1

quantile(iris[, 1])
##   0%  25%  50%  75% 100% 
##  4.3  5.1  5.8  6.4  7.9

quantile(iris[, 1], probs = c(0, 0.05, 0.5, 0.95, 1))
##    0%    5%   50%   95%  100% 
## 4.300 4.600 5.800 7.255 7.900

summary(iris[, 1])
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   4.300   5.100   5.800   5.843   6.400   7.900

median(iris[, 1])
## [1] 5.8

length(iris[, 1])
## [1] 150
nrow(iris)
## [1] 150
ncol(iris)
## [1] 5

round(5.98149374)
## [1] 6
round(5.98149374, digits = 2)
## [1] 5.98
ceiling(5.9999)
## [1] 6
ceiling(5.0001)
## [1] 6

floor(5.9999)
## [1] 5
floor(5.0001)
## [1] 5

miDf2 <- data.frame(a = 1:10, b = 2:11, c = 3:12)
print(miDf2)
##     a  b  c
## 1   1  2  3
## 2   2  3  4
## 3   3  4  5
## 4   4  5  6
## 5   5  6  7
## 6   6  7  8
## 7   7  8  9
## 8   8  9 10
## 9   9 10 11
## 10 10 11 12

rowSums(miDf2)
##  [1]  6  9 12 15 18 21 24 27 30 33
colSums(miDf2)
##  a  b  c 
## 55 65 75

rowMeans(miDf2)
##  [1]  2  3  4  5  6  7  8  9 10 11
colMeans(miDf2)
##   a   b   c 
## 5.5 6.5 7.5

head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa
irisNum <- iris[, c(1, 2, 3, 4)]

aggregate(
  irisNum, 
  by = list(iris$Species), 
  FUN = mean)
##      Group.1 Sepal.Length Sepal.Width Petal.Length Petal.Width
## 1     setosa        5.006       3.428        1.462       0.246
## 2 versicolor        5.936       2.770        4.260       1.326
## 3  virginica        6.588       2.974        5.552       2.026

aggregate(iris[, 1], by = list(iris$Species), FUN = summary)
##      Group.1 x.Min. x.1st Qu. x.Median x.Mean x.3rd Qu. x.Max.
## 1     setosa  4.300     4.800    5.000  5.006     5.200  5.800
## 2 versicolor  4.900     5.600    5.900  5.936     6.300  7.000
## 3  virginica  4.900     6.225    6.500  6.588     6.900  7.900
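
aggregate() also has a formula interface, which names the grouping column in the result instead of the generic Group.1. A sketch of the same per-species means written that way; the name meanBySpecies is illustrative.

```r
# Mean sepal length for each species, using the formula interface
meanBySpecies <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
print(meanBySpecies)

# A dot on the left-hand side stands for all the other columns
aggregate(. ~ Species, data = iris, FUN = mean)
```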

range(iris[, 1])
## [1] 4.3 7.9

letras <- sample(letters[1:5], size = 50, replace = TRUE)
print(letras)
##  [1] "e" "c" "d" "e" "c" "d" "b" "b" "d" "b" "d" "e" "e" "d" "c" "c" "b"
## [18] "a" "b" "d" "e" "d" "c" "a" "e" "a" "e" "d" "c" "d" "e" "b" "c" "a"
## [35] "a" "d" "a" "b" "a" "c" "b" "b" "b" "a" "b" "a" "a" "a" "a" "b"
unique(letras)
## [1] "e" "c" "d" "b" "a"

Statistics

rnorm(10, mean = 0, sd = 1)
##  [1] -0.24160589  0.29244455  0.28958534 -1.66755821 -0.42076400
##  [6] -2.11093862 -0.01141307 -0.66444310  1.83721640 -1.21032143
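
Random draws such as these change on every run. To make them reproducible, the random number generator can be seeded with set.seed() first. In this sketch the seed 42 is an arbitrary choice and the names draw1 and draw2 are illustrative.

```r
set.seed(42)   # arbitrary seed chosen for this sketch
draw1 <- rnorm(5, mean = 0, sd = 1)

set.seed(42)   # resetting the same seed reproduces the same draws
draw2 <- rnorm(5, mean = 0, sd = 1)

identical(draw1, draw2)
```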

Distribution                    Probability  Quantile     Density      Random
Beta                            pbeta        qbeta        dbeta        rbeta
Binomial                        pbinom       qbinom       dbinom       rbinom
Cauchy                          pcauchy      qcauchy      dcauchy      rcauchy
Chi-Square                      pchisq       qchisq       dchisq       rchisq
Exponential                     pexp         qexp         dexp         rexp
F                               pf           qf           df           rf
Gamma                           pgamma       qgamma       dgamma       rgamma
Geometric                       pgeom        qgeom        dgeom        rgeom
Hypergeometric                  phyper       qhyper       dhyper       rhyper
Logistic                        plogis       qlogis       dlogis       rlogis
Log Normal                      plnorm       qlnorm       dlnorm       rlnorm
Negative Binomial               pnbinom      qnbinom      dnbinom      rnbinom
Normal                          pnorm        qnorm        dnorm        rnorm
Poisson                         ppois        qpois        dpois        rpois
Student t                       pt           qt           dt           rt
Studentized Range               ptukey       qtukey       -            -
Uniform                         punif        qunif        dunif        runif
Weibull                         pweibull     qweibull     dweibull     rweibull
Wilcoxon Rank Sum Statistic     pwilcox      qwilcox      dwilcox      rwilcox
Wilcoxon Signed Rank Statistic  psignrank    qsignrank    dsignrank    rsignrank

For the studentized range, R provides only ptukey() and qtukey(); there is no density or random-generation function.
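
The four prefixes follow the same pattern for every distribution: p gives the cumulative probability, q inverts it, d gives the density (or the probability mass for discrete distributions) and r draws random values. A sketch with the normal and binomial distributions follows; the names probBelow, quantileBack and densityAtZero are illustrative.

```r
# p: probability that a standard normal value falls below 1.96
probBelow <- pnorm(1.96)
print(probBelow)

# q: the inverse operation, from a probability back to a value
quantileBack <- qnorm(probBelow)
print(quantileBack)

# d: height of the standard normal density at 0, equal to 1 / sqrt(2 * pi)
densityAtZero <- dnorm(0)
print(densityAtZero)

# For a discrete distribution, p is the running sum of d
pbinom(3, size = 10, prob = 0.5)
sum(dbinom(0:3, size = 10, prob = 0.5))
```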

Other useful functions

seq_along()

print(vecManip3)
## [1]  1  5  6  8 NA 45 NA 14
seq_along(vecManip3)
## [1] 1 2 3 4 5 6 7 8
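
seq_along() is often preferred over 1:length(x) when building loop indices, because the two differ for an empty vector. A sketch of that edge case follows; the names emptyVec and loopCount are illustrative.

```r
emptyVec <- numeric(0)

# With an empty vector, 1:length() counts down from 1 to 0
1:length(emptyVec)

# seq_along() returns an empty index vector instead
seq_along(emptyVec)

# A loop over seq_along() therefore does not run at all
loopCount <- 0
for (i in seq_along(emptyVec)) loopCount <- loopCount + 1
print(loopCount)
```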

:

5:10
## [1]  5  6  7  8  9 10

rep()

miVec12 <- c(1, 1, 1, 1, 1, 1, 1, 1, 1)
miVec12 <- rep(1, times = 9)
rep("Hola", times = 3)
## [1] "Hola" "Hola" "Hola"

rep(1:3, times = 3)
## [1] 1 2 3 1 2 3 1 2 3
rep(1:3, length.out = 10)
##  [1] 1 2 3 1 2 3 1 2 3 1
rep(1:3, each = 3)
## [1] 1 1 1 2 2 2 3 3 3

seq()

seq(from = 0, to = 1, by = 0.2)
## [1] 0.0 0.2 0.4 0.6 0.8 1.0
seq(from = 20, to = 10, length.out = 10)
##  [1] 20.00000 18.88889 17.77778 16.66667 15.55556 14.44444 13.33333
##  [8] 12.22222 11.11111 10.00000
letters[seq(from = 1, to = 26, by = 2)]
##  [1] "a" "c" "e" "g" "i" "k" "m" "o" "q" "s" "u" "w" "y"

rep(seq(from = 1, to = 2, by = 0.5), times = 3)
## [1] 1.0 1.5 2.0 1.0 1.5 2.0 1.0 1.5 2.0

getwd()

getwd()
## [1] "C:/Users/nous/Documents/Francois/TRAVAIL/00__EN_COURS/FORMATION/CURSOS_DE_R_2019"

setwd()

oldWd <- getwd()
print(oldWd)
## [1] "C:/Users/nous/Documents/Francois/TRAVAIL/00__EN_COURS/FORMATION/CURSOS_DE_R_2019"
setwd("..")
getwd()
## [1] "C:/Users/nous/Documents/Francois/TRAVAIL/00__EN_COURS/FORMATION"

setwd(oldWd)
getwd()
## [1] "C:/Users/nous/Documents/Francois/TRAVAIL/00__EN_COURS/FORMATION/CURSOS_DE_R_2019"

list.files()

list.files(pattern = "(html)$") # html
## [1] "R00_links.html"        "R01_introduction.html" "R02_calculadora.html" 
## [4] "R03_objet.html"        "R04_editorTexto.html"  "R05_dataType01.html"  
## [7] "R06_dataType02.html"

list.files(pattern = "(pdf)$") # pdf
## character(0)
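
The pattern argument is a regular expression, so "(html)$" matches any file name that ends in the letters html, with or without a dot before them. A stricter pattern is sketched below on a temporary directory; the directory and the file names are hypothetical and created only for this example.

```r
# Hypothetical scratch directory with three empty files
tmpDir <- tempfile("demo_")
dir.create(tmpDir)
file.create(file.path(tmpDir, c("a.html", "b.pdf", "notes_html")))

# Pattern from the slides: also matches "notes_html"
looseMatch <- list.files(tmpDir, pattern = "(html)$")
print(looseMatch)

# Escaping the dot restricts the match to the .html extension
htmlOnly <- list.files(tmpDir, pattern = "\\.html$")
print(htmlOnly)

# Remove the scratch directory
unlink(tmpDir, recursive = TRUE)
```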

ls()

The ls() function lists the objects in the global environment, which are held in the system's RAM and available to R.

ls()
##  [1] "i"         "irisNum"   "letras"    "miDf"      "miDf2"    
##  [6] "miVec12"   "oldWd"     "vecManip"  "vecManip2" "vecManip3"

zzz <- "mi nuevo objeto"
ls()
##  [1] "i"         "irisNum"   "letras"    "miDf"      "miDf2"    
##  [6] "miVec12"   "oldWd"     "vecManip"  "vecManip2" "vecManip3"
## [11] "zzz"

rm()

rm(zzz)
ls()
##  [1] "i"         "irisNum"   "letras"    "miDf"      "miDf2"    
##  [6] "miVec12"   "oldWd"     "vecManip"  "vecManip2" "vecManip3"
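
rm() can also remove several objects at once through its list argument. The sketch below works in a separate, hypothetical environment (demoEnv) so that running it does not delete anything from the global environment.

```r
# Hypothetical scratch environment holding two objects
demoEnv <- new.env()
assign("zzz", "mi nuevo objeto", envir = demoEnv)
assign("yyy", 1:3, envir = demoEnv)
ls(demoEnv)

# Remove every object listed by ls() in that environment
rm(list = ls(demoEnv), envir = demoEnv)
ls(demoEnv)

# The same idea applied to the global environment (use with care):
# rm(list = ls())
```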
