Overview

An Array is an immutable, one-dimensional sequence of values with a single logical type and a fixed length. Most logical types are handled by the base Array class; there are also subclasses for DictionaryArray, ListArray, StructArray, and other specialized types.

Array Class

Factory Method

Array$create()

Instantiates an Array and returns the appropriate subclass.
x
vector | list | data.frame
An R vector, list, or data.frame
type
DataType
default:"NULL"
An optional data type for x. If omitted, the type will be inferred from the data
my_array <- Array$create(1:10)
my_array$type
# int32

my_array$cast(int8())
# <int8>
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Methods

$IsNull()

Return TRUE if the value at index i is null. Does not perform bounds checking.
i
integer
Zero-based index position
na_array <- Array$create(c(1:5, NA))
na_array$IsNull(0)  # FALSE
na_array$IsNull(5)  # TRUE

$IsValid()

Return TRUE if the value at index i is valid (not null). Does not perform bounds checking.
i
integer
Zero-based index position
na_array <- Array$create(c(1:5, NA))
na_array$IsValid(5)  # FALSE

$length()

The number of elements this array contains.
my_array <- Array$create(1:10)
my_array$length()  # 10

$nbytes()

Total number of bytes consumed by the elements of the array.
my_array <- Array$create(1:10)
my_array$nbytes()

$Equals()

Check if this array is equal to another.
other
Array
Another Array to compare with
na_array <- Array$create(c(1:5, NA))
na_array2 <- na_array
na_array$Equals(na_array2)  # TRUE

$ApproxEquals()

Check if this array is approximately equal to another.
other
Array
Another Array to compare with
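
The difference between $Equals() and $ApproxEquals() matters mostly for floating-point data. The following is a minimal sketch, assuming the library's default tolerance is large enough to treat a difference of 1e-10 as equal; the variable names are illustrative only.

```r
library(arrow)

# Two float64 arrays that differ only by a tiny amount in the last element
a <- Array$create(c(1, 2, 3))
b <- Array$create(c(1, 2, 3 + 1e-10))

# Exact comparison: the last elements are different doubles
exact <- a$Equals(b)

# Approximate comparison: differences within the tolerance are ignored
# (assumption: the default tolerance is well above 1e-10)
approx <- a$ApproxEquals(b)

print(c(exact = exact, approx = approx))
```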

$Diff()

Return a string expressing the difference between two arrays.
other
Array
Another Array to compare with
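
As a rough illustration, $Diff() can be used to see where two arrays disagree. This sketch only prints the returned string and makes no assumption about its exact layout.

```r
library(arrow)

a <- Array$create(c(1L, 2L, 3L))
b <- Array$create(c(1L, 2L, 4L))

# Diff() returns a single string describing how `a` differs from `b`
diff_text <- a$Diff(b)
cat(diff_text)
```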

$data()

Return the underlying ArrayData.

$as_vector()

Convert to an R vector.
my_array <- Array$create(1:10)
my_array$as_vector()
# [1]  1  2  3  4  5  6  7  8  9 10

$ToString()

String representation of the array.

$Slice()

Construct a zero-copy slice of the array with the indicated offset and length.
offset
integer
Starting position (zero-based)
length
integer
default:"NULL"
Number of elements in the slice. If NULL, the slice goes until the end of the array
na_array <- Array$create(c(1:5, NA))
new_array <- na_array$Slice(5)
new_array$offset  # 5
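
When both offset and length are given, the slice covers length elements starting at the zero-based offset. A short sketch with illustrative variable names:

```r
library(arrow)

my_array <- Array$create(1:10)

# Skip the first two elements, then keep the next three
middle <- my_array$Slice(2, 3)

middle$as_vector()  # 3 4 5
middle$offset       # 2
middle$length()     # 3
```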

$Take()

Return an Array containing the values at the given zero-based integer positions.
i
integer vector | Array
Positions to take (R vector or Arrow Array)
my_array <- Array$create(1:10)
my_array$Take(c(0, 2, 4))
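
Because $Take() uses zero-based positions, it is easy to confuse with R's one-based `[` indexing. The sketch below selects the same three elements both ways; it assumes the `[` method for Arrays follows the usual one-based R convention.

```r
library(arrow)

my_array <- Array$create(1:10)

# Zero-based positions 0, 2, 4 select the 1st, 3rd, and 5th elements
taken <- my_array$Take(c(0, 2, 4))

# The equivalent selection with one-based R indexing
indexed <- my_array[c(1, 3, 5)]

taken$as_vector()    # 1 3 5
indexed$as_vector()  # 1 3 5
```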

$Filter()

Return an Array containing the values at positions where the logical vector i is TRUE.
i
logical vector | Array
Logical vector or Arrow boolean Array
keep_na
logical
default:"TRUE"
Whether positions where i is NA produce a null in the result (TRUE) or are dropped (FALSE)
my_array <- Array$create(1:10)
my_array$Filter(c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE))
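
The keep_na argument only matters when the filter itself contains NA. A small sketch of both settings, with illustrative variable names:

```r
library(arrow)

values <- Array$create(1:3)
mask <- c(TRUE, NA, FALSE)

# Default (keep_na = TRUE): the NA position yields a null in the result
kept <- values$Filter(mask)

# keep_na = FALSE: the NA position is dropped entirely
dropped <- values$Filter(mask, keep_na = FALSE)

kept$length()     # 2
kept$null_count   # 1
dropped$length()  # 1
```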

$SortIndices()

Return an Array of integer positions that can be used to rearrange the Array in ascending or descending order.
descending
logical
default:"FALSE"
Whether to sort in descending order
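
The positions returned by $SortIndices() are typically passed to $Take() to obtain the sorted values. A minimal sketch, with illustrative variable names:

```r
library(arrow)

arr <- Array$create(c(30L, 10L, 20L))

# Positions that would put the values in ascending order
idx <- arr$SortIndices()
ascending <- arr$Take(idx)

# The same pattern in descending order
descending <- arr$Take(arr$SortIndices(descending = TRUE))

ascending$as_vector()   # 10 20 30
descending$as_vector()  # 30 20 10
```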

$RangeEquals()

Check whether a range of values in this array is equal to a range of the same length in another array.
other
Array
Another Array to compare with
start_idx
integer
Starting index in this array
end_idx
integer
Ending index in this array (exclusive)
other_start_idx
integer
default:"0"
Starting index in the other array
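
The sketch below compares a sub-range of one array with the start of another. It assumes the indices are zero-based and that end_idx is exclusive, following the convention of the underlying C++ method; the variable names are illustrative.

```r
library(arrow)

a <- Array$create(1:5)             # [1, 2, 3, 4, 5]
b <- Array$create(c(2L, 3L, 9L))   # [2, 3, 9]

# Compare a[1], a[2] (values 2, 3) with b[0], b[1] (values 2, 3)
same_range <- a$RangeEquals(b, start_idx = 1, end_idx = 3, other_start_idx = 0)

# Extending the range by one compares a[3] (value 4) with b[2] (value 9)
longer_range <- a$RangeEquals(b, start_idx = 1, end_idx = 4, other_start_idx = 0)

print(c(same_range = same_range, longer_range = longer_range))
```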

$cast()

Alter the data in the array to change its type.
target_type
DataType
The target data type
safe
logical
default:"TRUE"
Whether to check for overflows or other unsafe conversions
options
CastOptions
default:"cast_options(safe)"
Casting options
my_array <- Array$create(1:10)
my_array$cast(int8())
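
The safe argument decides what happens when a cast would lose information. As a hedged sketch, casting fractional doubles to int32 is expected to fail with the default safe = TRUE and to truncate with safe = FALSE; the "cast refused" marker is just a placeholder for this example.

```r
library(arrow)

dbl_array <- Array$create(c(1.5, 2.5, 3.5))

# safe = TRUE (default): a lossy float-to-integer cast raises an error,
# which is caught here and replaced with a marker string
safe_result <- tryCatch(
  dbl_array$cast(int32()),
  error = function(e) "cast refused"
)

# safe = FALSE: the fractional parts are discarded
unsafe_result <- dbl_array$cast(int32(), safe = FALSE)
unsafe_result$as_vector()  # 1 2 3
```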

$View()

Construct a zero-copy view of this array with the given type.
type
DataType
The data type for the view

$Validate()

Perform validation checks to determine obvious inconsistencies within the array’s internal data. This can be an expensive check, potentially O(length).

Active Bindings

$null_count

The number of null entries in the array.
na_array <- Array$create(c(1:5, NA))
na_array$null_count  # 1

$offset

A relative position into another array’s data, to enable zero-copy slicing.

$type

Logical type of data.
my_array <- Array$create(1:10)
my_array$type
# Int32
# int32

DictionaryArray Class

DictionaryArray is a subclass of Array for dictionary-encoded data, similar to R factors.

Factory Method

DictionaryArray$create()

x
vector | Array
An R vector or Array of integers for the dictionary indices, or an R factor
dict
vector | Array
default:"NULL"
An R vector or Array of dictionary values (like R factor levels). Not needed if x is a factor
# From a factor
factor_array <- DictionaryArray$create(factor(c("a", "b", "a", "c")))

# From indices and dictionary
indices <- c(0L, 1L, 0L, 2L)
dict <- c("a", "b", "c")
dict_array <- DictionaryArray$create(indices, dict)

Methods

$indices()

Return the indices array.

$dictionary()

Return the dictionary array.
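
To illustrate how the two parts fit together, the sketch below builds a DictionaryArray from zero-based indices and a dictionary, then reads both back. Converting the whole array to R is assumed to give a factor, in line with the factor analogy above.

```r
library(arrow)

dict_array <- DictionaryArray$create(c(0L, 1L, 0L, 2L), c("a", "b", "c"))

# Zero-based positions into the dictionary
idx <- dict_array$indices()$as_vector()       # 0 1 0 2

# The distinct values, comparable to factor levels
lvls <- dict_array$dictionary()$as_vector()   # "a" "b" "c"

# Decoded values after conversion to R
decoded <- as.character(dict_array$as_vector())  # "a" "b" "a" "c"
```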

Active Bindings

$ordered

Whether the dictionary is ordered.

StructArray Class

StructArray is a subclass of Array for struct (nested) data.

Factory Method

StructArray$create()

Create a StructArray from named arrays or vectors.
struct_array <- StructArray$create(
  x = c(1, 2, 3),
  y = c("a", "b", "c")
)

Methods

$field()

Extract a field by integer position.
i
integer
Zero-based field index

$GetFieldByName()

Extract a field by name.
name
character
Field name
struct_array <- StructArray$create(x = c(1, 2, 3), y = c("a", "b", "c"))
struct_array$GetFieldByName("x")

$Flatten()

Flatten the struct array into a list of arrays.
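
A short sketch of positional access and flattening, reusing the struct from the earlier examples:

```r
library(arrow)

struct_array <- StructArray$create(x = c(1, 2, 3), y = c("a", "b", "c"))

# Field 0 is the first field, "x"
first_field <- struct_array$field(0)
first_field$as_vector()  # 1 2 3

# Flatten() returns one Array per field
fields <- struct_array$Flatten()
length(fields)            # 2
fields[[2]]$as_vector()   # "a" "b" "c"
```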

ListArray Class

ListArray is a subclass of Array for list data.

Methods

$values()

Return the values array (all list elements flattened).

$value_length()

Return the length of a specific list element.
i
integer
Zero-based index

$value_offset()

Return the offset of a specific list element.
i
integer
Zero-based index

$raw_value_offsets()

Return the raw offsets array.

Active Bindings

$value_type

The data type of the list values.
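
The relationship between the flattened values and the offsets is easiest to see on a small example. This sketch assumes that Array$create() infers a list type from an R list of integer vectors and returns a ListArray.

```r
library(arrow)

# Two list elements: (1, 2, 3) and (4, 5)
list_array <- Array$create(list(1:3, 4:5))

flat <- list_array$values()$as_vector()     # 1 2 3 4 5
first_len <- list_array$value_length(0)     # 3
second_off <- list_array$value_offset(1)    # 3
offsets <- list_array$raw_value_offsets()   # 0 3 5

# Type of the individual list values
list_array$value_type
```

Element i of the list covers the flattened values from offsets[i] up to, but not including, offsets[i + 1], using zero-based positions.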

Helper Functions

as_arrow_array()

Convert an object to an Arrow Array. This is an S3 generic that allows methods to be defined in other packages.
x
object
An object to convert to an Arrow Array
type
DataType
default:"NULL"
A data type for the final Array. If NULL, will be inferred
as_arrow_array(1:5)
as_arrow_array(c("a", "b", "c"))

concat_arrays()

Concatenate zero or more Arrays into a single array. This operation will make a copy of its input.
...
Array
Zero or more Array objects to concatenate
type
DataType
default:"NULL"
An optional type describing the desired type for the final Array
concat_arrays(Array$create(1:3), Array$create(4:5))
# <int32>
# [1, 2, 3, 4, 5]
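
When the inputs have different types, the type argument can name the common result type. A hedged sketch, assuming each input is cast to type before concatenation:

```r
library(arrow)

ints <- Array$create(1:3)          # int32
dbls <- Array$create(c(4.5, 5.5))  # float64 (double)

# Request a float64 result so both inputs fit
combined <- concat_arrays(ints, dbls, type = float64())

combined$as_vector()  # 1.0 2.0 3.0 4.5 5.5
```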

arrow_array()

Alias for Array$create().
x
vector | list | data.frame
An R object representable as an Arrow array
type
DataType
default:"NULL"
An optional data type. If omitted, will be inferred from the data
my_array <- arrow_array(1:10)