Skip to contents

Note: this vignette is pre-computed to utilize the Microsoft Graph login that is only available locally to me as a logged in SharePoint user.

Downloading and reading files from SharePoint

You can use download_sp_item() to download files or folders from SharePoint:

docx_url <- "https://bmore.sharepoint.com/:w:/r/sites/MayorsOffice-DataGovernance/Policy%20Documents/Data%20Classification%20Standard.docx?d=w54a9ae7eaa894e94b6d6d14516f3aaa4&csf=1&web=1&e=ee7ZSX"

download_sp_item(
  path = docx_url,
  new_path = tempdir()
)
#> Loading Microsoft Graph login for default tenant
#> 
ℹ Downloading SharePoint item to
#> '/var/folders/3f/50m42dx1…
#> Error: Path exists and overwrite is FALSE
#> 
✖ Downloading SharePoint item to '/var/folders/3f/50m42dx1…

For files on SharePoint, read_sharepoint() extends download_sp_item() by downloading the selected item to a temporary folder by default and, depending on the file extension, tries to read the file using readr, readxl, officer, or sf.

docx <- read_sharepoint(docx_url)
#> Loading Microsoft Graph login for default tenant
#> 
ℹ Downloading SharePoint item to '/var/folders/3f/50m42dx1…

✔ Downloading SharePoint item to '/var/folders/3f/50m42dx1…
#> 
ℹ Reading item with `officer::read_docx()`

✔ Reading item with `officer::read_docx()` [44ms]

docx
#> rdocx document with 62 element(s)
#> 
#> * styles:
#>                 Normal              heading 1 
#>            "paragraph"            "paragraph" 
#>              heading 2 Default Paragraph Font 
#>            "paragraph"            "character" 
#>           Normal Table                No List 
#>                "table"            "numbering" 
#>         Heading 1 Char             Table Grid 
#>            "character"                "table" 
#>         List Paragraph         Heading 2 Char 
#>            "paragraph"            "character" 
#>                  Title             Title Char 
#>            "paragraph"            "character" 
#>           Normal (Web)                 header 
#>            "paragraph"            "paragraph" 
#>            Header Char                 footer 
#>            "character"            "paragraph" 
#>            Footer Char          markedcontent 
#>            "character"            "character" 
#>            TOC Heading                  toc 1 
#>            "paragraph"            "paragraph" 
#>                  toc 2              Hyperlink 
#>            "paragraph"            "character" 
#>                  toc 3                  toc 4 
#>            "paragraph"            "paragraph" 
#>                  toc 5                  toc 6 
#>            "paragraph"            "paragraph" 
#>                  toc 7                  toc 8 
#>            "paragraph"            "paragraph" 
#>                  toc 9     Unresolved Mention 
#>            "paragraph"            "character" 
#>      FollowedHyperlink               Revision 
#>            "character"            "paragraph" 
#>    Unresolved Mention1   annotation reference 
#>            "character"            "character" 
#>        annotation text      Comment Text Char 
#>            "paragraph"            "character" 
#>     annotation subject   Comment Subject Char 
#>            "paragraph"            "character" 
#>           Balloon Text      Balloon Text Char 
#>            "paragraph"            "character" 
#>          footnote text     Footnote Text Char 
#>            "paragraph"            "character" 
#>     footnote reference 
#>            "character" 
#> 
#> * Content at cursor location:
#>   level num_id text style_name content_type
#> 1    NA     NA       heading 1    paragraph

Writing and uploading files to SharePoint

You can use upload_sp_item() to upload a local file to a SharePoint folder or document library.

folder_url <- "https://bmore.sharepoint.com/:f:/r/sites/MayorsOffice-DataGovernance/Shared%20Documents/RStats?csf=1&web=1&e=S1XxVU"

upload_sp_item(
  file = system.file("gpkg/nc.gpkg", package = "sf"),
  dest = folder_url
)
#> Loading Microsoft Graph login for default tenant
#> 
ℹ Uploading file 'nc.gpkg' to SharePoint drive

✔ File upload complete [1.9s]                 

Using read_sharepoint(), we can confirm that the file has been uploaded:

sp_drive <- get_sp_drive(folder_url)
#> Loading Microsoft Graph login for default tenant

nc <- read_sharepoint(
  "RStats/nc.gpkg",
  drive = sp_drive
)
#> 
ℹ Downloading SharePoint item to
#> '/var/folders/3f/50m42dx1…
#> 
✔ Downloading SharePoint item to '/var/folders/3f/50m42dx1…
#> 
ℹ Reading item with `sf::read_sf()`

✔ Reading item with `sf::read_sf()` [27ms]

plot(nc["AREA"])
plot of chunk nc_plot
plot of chunk nc_plot

write_sharepoint() extends upload_sp_item() by allowing you to pass an R object instead of a file path. Like read_sharepoint() tries to guess the appropriate input function, write_sharepoint() tries to guess the appropriate output function based on the object class.

write_sharepoint(
  mtcars,
  file = "mtcars.csv",
  dest = folder_url
)
#> Loading Microsoft Graph login for default tenant
#> 
ℹ Uploading file 'mtcars.csv' to SharePoint drive

✔ File upload complete [970ms]                   

To wrap up this example, we need to remove the uploaded files from SharePoint to keep a tidy shared file system.

delete_sp_item() supports the option to use a shared item URL to select which file to remove but, in this case, it is easier to set the drive argument along with a relative filepath:

# Remove the file
delete_sp_item(
  file.path("RStats", "nc.gpkg"),
  drive = sp_drive,
  confirm = FALSE
)

delete_sp_item(
  file.path("RStats", "mtcars.csv"),
  drive = sp_drive,
  confirm = FALSE
)

Listing files

If you do not know the URL or file path for an item on SharePoint you can also use the directory info functions to list items typically using the sp_dir_info() function. This function supports recursive listings but this can be slow depending on the number of items in the SharePoint library.

This last example is not computed but it shows how to list and remove empty nested directories left over from a failed manual import;

# List directories
dir_info <- sp_dir_info("<SharePoint Folder URL>", type = "directory", recurse = TRUE)

# Filter to empty directories and sort by depth
empty_dirs <- dir_info |>
  filter(size == 0) |>
  mutate(
    path_depth = str_count(name, "/")
  ) |>
  arrange(desc(path_depth))

# Get drive
drive <- get_sp_drive(
  drive_name = "<SharePoint Document Library Name>",
  site_url = "<SharePoint Site URL>"
)

# Delete empty directories and skip confirmation
walk(
  empty_dirs[["id"]],
  \(x) {
    delete_sp_item(
      item_id = x,
      drive = drive,
      confirm = FALSE
    )
  }
)

Overall, the intent of this package is to maximize flexibility in how and where you read and write files from SharePoint. Suggestions are welcome so please share your own tips on working with the Microsoft SharePoint API and the Microsoft365R package.