Get one or more tables from a rdocx or rpptx object. officer_tables()
returns a list of data frames and officer_table()
returns a single table as
a data frame. These functions are based on example code on extracting Word
document and PowerPoint slides in the officeverse documentation.
Some additional features including the type_convert parameter and the
addition of doc_index values as the default names for the returned list of
tables are based on this blog post by Matt Dray.
Usage
officer_tables(
x,
index = NULL,
has_header = TRUE,
col = NULL,
preserve = FALSE,
...,
stack = FALSE,
type_convert = FALSE,
nm = NULL,
call = caller_env()
)
officer_table(
x,
index = NULL,
has_header = TRUE,
col = NULL,
...,
call = caller_env()
)
Arguments
- x
A rdocx or rpptx object or a data frame created with
officer_summary()
.- index
A index value matching a doc_index value for a table in the summary data frame, Default:
NULL
- has_header
If
TRUE
(default), tables are expected to have implicit headers even if the Word table does not have an explicit header row. IfFALSE
, only explicit header rows will be used as column names.- col
If col is supplied,
officer_table()
passes col and the additional parameters in ... tofill_with_pattern()
. This allows the addition of preceding headings or captions as a column within the data.frame returned byofficer_tables()
. This is an experimental feature and may be modified or removed. Defaults toNULL
.- preserve
If
FALSE
(default), text in table cells is collapsed into a single line. IfTRUE
, line breaks in table cells are preserved as a "\n" character. This feature is adapted fromdocxtractr::docx_extract_tbl()
published under a MIT licensed in the{docxtractr}
package by Bob Rudis.- ...
Additional parameters passed to
fill_with_pattern()
.- stack
If
TRUE
and all tables share the same number of columns, return a single combined data frame instead of a list. Defaults toFALSE
.- type_convert
If
TRUE
, convert columns for the returned data frames to the appropriate type usingutils::type.convert()
.- nm
Names to use for returned list of tables. If
NULL
(default), the names are set to the doc_index values using the pattern "doc_index_<doc_index_number>".- call
The execution environment of a currently running function, e.g.
call = caller_env()
. The corresponding function call is retrieved and mentioned in error messages as the source of the error.You only need to supply
call
when throwing a condition from a helper function which wouldn't be relevant to mention in the message.Can also be
NULL
or a defused function call to respectively not display any call or hard-code a code to display.For more information about error calls, see Including function calls in error messages.
Examples
docx_example <- read_docx_ext(
filename = "example.docx",
path = system.file("doc_examples", package = "officer")
)
officer_tables(docx_example)
#> $doc_index_16
#> Petals Internode Sepal
#> 1 5,621498349 <NA> 2,46210657918,2034091
#> 2 4,994616997 AA 2,429320759
#> 3 4,767504884 <NA> AAA
#> 4 25,9242382 <NA> 2,066051345
#> 5 6,489375001 25,21130805 2,901582763
#> 6 5,7858682 25,52433147 2,655642742
#> 7 5,645575295 Merged cell 2,278691288
#> 8 4,828953215 <NA> 2,238467716
#> 9 6,783500773 <NA> 2,202762147
#> 10 5,395076839 <NA> 2,538375992
#> 11 4,683617783 29,2459239 2,601945544
#> 12 NoteNew line note <NA> <NA>
#> Bract
#> 1 <NA>
#> 2 17,65204912
#> 3 <NA>
#> 4 18,37915478
#> 5 17,3130473717,0721572418,2902189
#> 6 <NA>
#> 7 <NA>
#> 8 19,87376227
#> 9 19,85326662
#> 10 19,56545356
#> 11 18,95335451
#> 12 <NA>
#>
pptx_example <- read_pptx_ext(
filename = "example.pptx",
path = system.file("doc_examples", package = "officer")
)
officer_tables(pptx_example)[[1]]
#> Header 1 Header 2 Header 3
#> 2 A 12.23 blah blah
#> 3 B 1.23 blah blah blah
#> 4 B 9.0 Salut
#> 5 C 6 Hello