| Main Arguments: 
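file, the name of a file, or a connection.
header, logical indicating whether the file has a header line.
sep, a string indicating how the columns are separated, such as ",".
colClasses, a character vector indicating the class of each column in the dataset.
nrows, the number of rows in the dataset.
comment.char, a character string indicating the comment character.
skip, the number of lines to skip from the beginning.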
stringsAsFactors, should character variables be coded as factors? (The default was TRUE before R 4.0.0 and has been FALSE since.)
Usages:
read.table(file, header = FALSE, sep = "", quote = "\"'", dec = ".",
           numerals = c("allow.loss", "warn.loss", "no.loss"),
           row.names, col.names, as.is = !stringsAsFactors,
           na.strings = "NA", colClasses = NA, nrows = -1, skip = 0,
           check.names = TRUE, fill = !blank.lines.skip,
           strip.white = FALSE, blank.lines.skip = TRUE,
           comment.char = "#", allowEscapes = FALSE, flush = FALSE,
           stringsAsFactors = default.stringsAsFactors(),
           fileEncoding = "", encoding = "unknown", text, skipNul = FALSE)
read.csv(file, header = TRUE, sep = ",", quote = "\"", fill = TRUE, comment.char = "", ...)
read.csv2(file, sep = ";", dec = ",", ...)
read.delim(file, sep = "\t", ...)
read.delim2(file, ...)
Writing Data
Description: write.table prints its required argument x (after converting it to a data frame if it is neither a data frame nor a matrix) to a file or connection.
Main Points:
write.table, writeLines, dump, dput, save, serialize
Usages:
write.table(x, file = "", append = FALSE, quote = TRUE, sep = " ",
            eol = "\n", na = "NA", dec = ".", row.names = TRUE,
            col.names = TRUE, qmethod = c("escape", "double"),
            fileEncoding = "")
write.csv(...)
write.csv2(...)
Reading Large Tables
Read the help page for read.table, which contains many hints.
Make a rough calculation of the memory required to store your dataset. If the dataset is larger than the amount of RAM on your computer, you can probably stop right here.
Set comment.char = "" if there are no commented lines in your file.
Use the colClasses argument. Specifying this option instead of using the default can make read.table run much faster, often twice as fast. To use this option, you have to know the class of each column in your data frame. If all of the columns are "numeric", for example, then you can just set colClasses = "numeric". A quick and dirty way to figure out the classes of each column is the following:
> initial <- read.table("db.txt", nrows = 100, sep = "\t")
> classes <- sapply(initial, class)
> tabAll <- read.table("db.txt", colClasses = classes, sep = "\t")
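The two-pass approach above can be tried end to end without an existing db.txt. The following is a minimal sketch, assuming a small tab-separated file with a header line; the temporary file and the column names id, score, and label are hypothetical and exist only for illustration. Note that classes inferred from the first 100 rows are a guess: if a later row does not fit the inferred class, the second read stops with an error.

```r
## Hypothetical example data: a small tab-separated file in a temporary location
path <- tempfile(fileext = ".txt")
df <- data.frame(id = 1:1000,
                 score = seq(0.5, by = 0.25, length.out = 1000),
                 label = rep(c("a", "b"), 500),
                 stringsAsFactors = FALSE)
write.table(df, path, sep = "\t", row.names = FALSE, quote = FALSE)

## Pass 1: read only the first 100 rows and record each column's class
initial <- read.table(path, header = TRUE, sep = "\t", nrows = 100,
                      stringsAsFactors = FALSE)
classes <- sapply(initial, class)

## Pass 2: read the whole file with the classes fixed up front,
## comment scanning switched off, and a mild overestimate for nrows
tabAll <- read.table(path, header = TRUE, sep = "\t",
                     colClasses = classes, comment.char = "",
                     nrows = 1100)

str(tabAll)    # inspect the resulting column types
unlink(path)   # remove the temporary file
```

Passing the same sep and header values to both calls matters: the classes are matched to columns, so both reads have to split the lines the same way.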
Set nrows. This doesn't make R run faster, but it helps with memory usage. A mild overestimate is okay. You can use the Unix tool wc to calculate the number of lines in a file.
Reading Data Formats
dput and dget
> y <- data.frame(a = 1, b = "a")   ## create a `data.frame` object to pass to `dput`
> dput(y)                           ## print the deparsed representation of `y`
structure(list(a = 1, b = structure(1L, .Label = "a", class = "factor")), .Names = c("a", "b"), row.names = c(NA, -1L), class = "data.frame")
> dput(y, file = 'y.R')             ## write the deparsed `y` to a file named 'y.R'
> new.y <- dget('y.R')              ## read the object back from 'y.R'
> new.y                             ## print the object restored from 'y.R'
  a b
1 1 a
dump
Multiple objects can be deparsed using the dump function and read back in using source.
> x <- "foo"                           ## create the first object
> y <- data.frame(a = 1, b = "a")      ## create the second object
> dump(c("x", "y"), file = "data.R")   ## write both objects to a file called 'data.R'
> rm(x, y)                             ## remove both objects from the workspace
> source("data.R")                     ## restore the objects from 'data.R'
> y                                    ## print the object 'y' restored from 'data.R'
  a b
1 1 a
> x                                    ## print the object 'x' restored from 'data.R'
[1] "foo"
Connections: Interfaces to the Outside World
Data are read in using connection interfaces. Connections can be made to files (most common) or to other, more exotic things.
file, opens a connection to a file
gzfile, opens a connection to a file compressed with gzip
bzfile, opens a connection to a file compressed with bzip2
url, opens a connection to a webpage
> con <- file('db.txt', 'r')
> readLines(con)
> close(con)                           ## close the connection when done
Subsetting
[ always returns an object of the same class as the original; it can be used to select more than one element (there is one exception).
[[ is used to extract elements of a list or a data frame; it can only be used to extract a single element, and the class of the returned object will not necessarily be a list or data frame.
$ is used to extract elements of a list or data frame by name; its semantics are similar to those of [[.
Basic
> x <- c("a", "b", "c", "d", "e")
> x[1]
[1] "a"
> x[2]
[1] "b"
> x[1:3]
[1] "a" "b" "c"
> x[x > "a"]
[1] "b" "c" "d" "e"
> u  <- x>"a"
> u
[1] FALSE  TRUE  TRUE  TRUE  TRUE
> x[u]
[1] "b" "c" "d" "e"
Lists
> x <- list(foo = 1:4, bar = 0.6)
> x[1]
$foo
[1] 1 2 3 4
> x[[1]]
[1] 1 2 3 4
> x[[2]]
[1] 0.6
> x$bar
[1] 0.6
> x$foo
[1] 1 2 3 4
> x[["bar"]]
[1] 0.6
> x["bar"]
$bar
[1] 0.6
> x <- list(foo = 1:4, bar = 0.6, baz = "hello")
> x[c(1,3)]
$foo
[1] 1 2 3 4
$baz
[1] "hello"
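The difference between [ and [[ on a list can be checked directly by looking at what each one returns. This is a small sketch using the same list as above; the variable names sub and elem are introduced here only for illustration.

```r
x <- list(foo = 1:4, bar = 0.6, baz = "hello")

sub  <- x["foo"]     # single bracket: a list of length 1 holding foo
elem <- x[["foo"]]   # double bracket: the integer vector itself

class(sub)           # "list"
class(elem)          # "integer"

## `[` can select several elements at once; the result is still a list
names(x[c("foo", "baz")])   # "foo" "baz"

## `[[` extracts the element itself, so the result can be used
## directly in a computation
sum(x[["foo"]])      # 10
```

In practice, [ is the form to use when a smaller list is wanted, and [[ when the contents of one element are needed.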
> name <- "foo"
> x[[name]]
[1] 1 2 3 4
> x$name          ## `$` takes the name literally; the list `x` has no element called "name", so the result is NULL
NULL
> x$foo
[1] 1 2 3 4
Matrices |