XStringSet-class {Biostrings}    R Documentation

BStringSet, DNAStringSet, RNAStringSet and AAStringSet objects

Description

The BStringSet class is a container for storing a set of BString objects and for making their manipulation easy and efficient.

Similarly, the DNAStringSet (or RNAStringSet, or AAStringSet) class is a container for storing a set of DNAString (or RNAString, or AAString) objects.

All those containers derive directly (and with no additional slots) from the XStringSet virtual class. They are also said to be XStringSet subtypes.

Usage

  ## Constructors:
  BStringSet(x, start=NA, end=NA, width=NA, use.names=TRUE)
  DNAStringSet(x, start=NA, end=NA, width=NA, use.names=TRUE)
  RNAStringSet(x, start=NA, end=NA, width=NA, use.names=TRUE)
  AAStringSet(x, start=NA, end=NA, width=NA, use.names=TRUE)

Arguments

x Either a character vector, or an XString, XStringSet or XStringViews object.
start Either NA, a single integer, or an integer vector of the same length as x specifying how x should be "narrowed" (see ?narrow for the details).
end Either NA, a single integer, or an integer vector of the same length as x specifying how x should be "narrowed" (see ?narrow for the details).
width Either NA, a single integer, or an integer vector of the same length as x specifying how x should be "narrowed" (see ?narrow for the details).
use.names TRUE or FALSE. Should names be preserved?

Details

The BStringSet, DNAStringSet, RNAStringSet and AAStringSet functions are constructors that can be used to "naturally" turn x into an XStringSet object of the desired subtype.

They also allow the user to "narrow" the sequences contained in x via proper use of the start, end and/or width arguments. In this context, "narrowing" means dropping unwanted parts of x located at the beginning (prefix) or end (suffix) of each sequence in x.
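As a minimal sketch of what narrowing does, the snippet below drops a prefix, a suffix, or both from two made-up 8-letter sequences. The input strings and variable names are hypothetical. The reading of a negative end as a position counted from the end of each sequence is an assumption based on ?narrow and on the end=-5 call in the Examples section.

```r
library(Biostrings)

# Hypothetical input: two sequences of 8 letters each.
x <- c("AAACGTTT", "CCCGGAAA")

# Drop the first 3 letters of each sequence (keep positions 4 to 8).
no_prefix <- DNAStringSet(x, start = 4)

# Drop the last 3 letters of each sequence. Assumption: a negative end is
# counted from the end of the sequence, so end = -1 is the last letter.
no_suffix <- DNAStringSet(x, end = -4)

# Keep 2 letters starting at position 4.
middle <- DNAStringSet(x, start = 4, width = 2)

# One start value per sequence: keep all of the first, trim the second.
per_seq <- DNAStringSet(x, start = c(1, 4))

as.character(no_prefix)
as.character(no_suffix)
as.character(middle)
as.character(per_seq)
```

Supplying start, end and width together is only meaningful when the three values are consistent with each other, so in practice at most two of them are usually given.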

The narrow function is a generic function (defined in the IRanges package) with a method for narrowing IRanges objects. Because an XStringSet object is a particular kind of IRanges object (the XStringSet class is a subclass of the IRanges class), an XStringSet object y can be narrowed with narrow(y). Therefore the following two expressions are equivalent:

DNAStringSet(x, start=s, end=e, width=w)
narrow(DNAStringSet(x), start=s, end=e, width=w)

but, besides being more convenient, the former is also more memory efficient on character vectors and would work even if the dropped parts contained letters that are not in the DNA alphabet (see ?DNA_ALPHABET).
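The difference between the two forms can be illustrated with a short sketch. The input vectors raw and clean are hypothetical. The snippet assumes that constructing a DNAStringSet from a string containing "#" signals an error, which is what makes the two-step form fail on the first input.

```r
library(Biostrings)

# Hypothetical input: each string starts with "#", which is not a letter
# of the DNA alphabet.
raw <- c(a = "#ACGT", b = "#TTGACA")

# Narrowing at construction time drops the "#" before the remaining
# letters are checked against the DNA alphabet.
direct <- DNAStringSet(raw, start = 2)

# Constructing first and narrowing afterwards is expected to fail here,
# because the "#" has to be stored before it can be dropped.
two_step <- tryCatch(narrow(DNAStringSet(raw), start = 2),
                     error = function(e) NULL)

# On input made only of DNA letters, both forms give the same sequences.
clean <- c(a = "NACGT", b = "NTTGACA")
same <- identical(as.character(DNAStringSet(clean, start = 2)),
                  as.character(narrow(DNAStringSet(clean), start = 2)))

as.character(direct)
is.null(two_step)
same
```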

Accessor methods

The XStringSet class derives from the IRanges class, hence all the accessor methods defined for an IRanges object can also be used on an XStringSet object. In particular, the following methods are available (in the code snippets below, x is an XStringSet object):

length(x): The number of sequences in x.
width(x): A vector of non-negative integers containing the number of letters for each element in x.
nchar(x): The same as width(x).
names(x): NULL or a character vector of the same length as x containing a short user-provided description or comment for each element in x. These are the only data in an XStringSet object that can safely be changed by the user. All the other data are immutable! As a general recommendation, the user should never try to modify an object by accessing its slots directly.
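The accessors above can be tried on a small object. This is only an illustrative sketch: the sequences and the names seqA, seqB and seqC are made up.

```r
library(Biostrings)

# Hypothetical set of three unnamed sequences.
x <- DNAStringSet(c("ACGT", "TTGACA", "GG"))

n <- length(x)            # number of sequences
w <- width(x)             # number of letters in each sequence
nc <- nchar(x)            # same values as width(x)
names_before <- names(x)  # NULL, because the input vector had no names

# Names are the one part of the object meant to be changed by the user.
names(x) <- c("seqA", "seqB", "seqC")
x
```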

Subsetting and appending

In the code snippets below, x and values are XStringSet objects, and i should be an index specifying the elements to extract.

x[i]: Return a new XStringSet object made of the selected elements.
x[[i]]: Extract the i-th XString object from x.
append(x, values, after=length(x)): Add sequences in values to x.
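A short sketch of the three operations follows. The objects x and values and the names s1 to s4 are hypothetical. Only the default value of after is used.

```r
library(Biostrings)

# Hypothetical named sequences.
x <- DNAStringSet(c(s1 = "ACGT", s2 = "TTGACA", s3 = "GG"))
values <- DNAStringSet(c(s4 = "CCC"))

# Single brackets keep the container: the result is a DNAStringSet
# holding the first and third sequences.
sub <- x[c(1, 3)]

# Double brackets extract one element as a DNAString.
one <- x[[2]]

# With the default after=length(x), values is added after the last
# element of x.
combined <- append(x, values)

sub
one
combined
```

The distinction mirrors the one between [ and [[ on an ordinary list: the first returns a shorter container, the second returns a single element.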

Other methods

In the code snippets below, x is an XStringSet object.

as.character(x, use.names): Convert x to a character vector of the same length as x. use.names controls whether or not names(x) should be used to set the names of the returned vector (default is TRUE).
as.matrix(x, use.names): Return a character matrix containing the "exploded" representation of the strings. This can only be used on an XStringSet object with equal-width strings. use.names controls whether or not names(x) should be used to set the row names of the returned matrix (default is TRUE).
toString(x): Equivalent to toString(as.character(x)).
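The conversions above might be used as in the following sketch. The two equal-width sequences and their names are made up for illustration.

```r
library(Biostrings)

# Hypothetical set of two named sequences of equal width, so that
# as.matrix() can be applied.
x <- DNAStringSet(c(a = "ACGT", b = "TTGA"))

# Character vector, with and without the names of x.
chr_named <- as.character(x)
chr_plain <- as.character(x, use.names = FALSE)

# "Exploded" representation: one row per sequence, one letter per cell.
m <- as.matrix(x)

# A single string built from all the sequences.
s <- toString(x)

chr_named
m
s
```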

Ordering and related methods

In the code snippets below, x is an XStringSet object.

order(x): Return a permutation which rearranges x into ascending or descending order.
sort(x): Sort x into ascending order (equivalent to x[order(x)]).
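The relationship between order and sort can be checked on a small example. The sequences below are hypothetical and use only the letters A, C, G and T.

```r
library(Biostrings)

# Hypothetical unsorted sequences.
x <- DNAStringSet(c("TTGA", "ACGT", "CCAA"))

# Permutation that puts x into ascending order.
o <- order(x)

# Sorting directly, and sorting through the permutation.
sorted <- sort(x)
via_order <- x[o]

# The two results hold the same sequences in the same order.
identical(as.character(sorted), as.character(via_order))
```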

Author(s)

H. Pages

See Also

BString-class, DNAString-class, RNAString-class, AAString-class, XStringViews-class, narrow, DNA_ALPHABET

Examples

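  ## Narrow at construction time: the leading "#" is dropped before the
  ## letters are checked against the DNA alphabet.
  x0 <- c("#TTGA", "#-CTC-N")
  x1 <- DNAStringSet(x0, start=2)
  x1
  names(x1)  # NULL: x0 has no names
  names(x1)[2] <- "seqB"
  x1

  ## The probe sequences come from the drosophila2probe data package,
  ## which must be installed.
  library(drosophila2probe)
  x2 <- DNAStringSet(drosophila2probe$sequence)
  x2

  ## Drop the first letter and the last 4 letters of each sequence.
  RNAStringSet(x2, start=2, end=-5)  # does NOT copy the sequence data!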

[Package Biostrings version 2.10.22 Index]