Securing Shiny: Safeguarding Links and Downloads

Filip Akkad

Problem context

A while back, my team and I were asked to implement a safe way of displaying sensitive PDFs (containing patients’ data) in our pharma Shiny app. Despite operating within the RStudio Connect environment, ensuring the security of these documents demanded extra precautions beyond standard practices.

The prevailing method involves exposing assets through `addResourcePath`, as demonstrated in the code snippet below:

app_1.r (unsafe)
library(shiny)
 
ui <- fluidPage(
  uiOutput("pdf")
)
 
server <- function(input, output, session) {
  addResourcePath("assets", "./")
  output$pdf <- renderUI({
    # Fetch the example PDF from the external link and save it as sample.pdf
    # in the working directory, which is the directory exposed as "assets"
    pdf_path <- file.path(".", "sample.pdf")
    external_pdf_link <- 'https://www.africau.edu/images/default/sample.pdf'
    download.file(external_pdf_link, pdf_path, mode = "wb")
    tags$iframe(src = "/assets/sample.pdf")
  })
}
 
shinyApp(ui, server, options = list(port = 4444))

However, this seemingly straightforward approach harbors inherent security risks. The issue with the provided code lies in the unrestricted access it grants to the asset once enabled. Even without initiating a new session in Shiny, external users can access the file. For instance, opening the following link in a separate incognito browser tab:

http://localhost:4444/assets/sample.pdf

reveals a serious security vulnerability.

Safer Alternative: Endpoint with Session Token

A more secure approach involves creating a dedicated endpoint within Shiny to serve assets exclusively for specific sessions. This method is commonly employed in functionalities like downloadHandler, albeit hidden beneath the implementation layer.

The example path appears as follows:

http://localhost:4444/session/55099de85c5bfd2839b87488605c9395/dataobj/example-get-api?w=&nonce=853efe9753e75378

One notable feature in this link is the embedded session token, ensuring that the asset is served only for the duration of a specific session.
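To make the structure of such a link concrete, here is a minimal sketch in base R that pulls the session token and the data object name out of the example URL above. The variable names are illustrative only, and the pattern assumes the `/session/<token>/dataobj/<name>` layout visible in the example rather than any documented guarantee.

```r
# Example URL copied from the text above
url <- "http://localhost:4444/session/55099de85c5bfd2839b87488605c9395/dataobj/example-get-api?w=&nonce=853efe9753e75378"

# Capture the hexadecimal session token and the registered object name
pattern <- "/session/([0-9a-f]+)/dataobj/([^?]+)"
parts <- regmatches(url, regexec(pattern, url))[[1]]

session_token <- parts[2]  # the secret part of the link
data_obj_name <- parts[3]  # the name passed to registerDataObj

session_token
data_obj_name
```

Anyone who holds the full link also holds the session token, which is why the link should be treated as a secret.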

The following code snippet demonstrates a practical implementation:

app_2.r
library(shiny)
 
ui <- fluidPage(
  textOutput("link"),
  uiOutput("pdf")
)
 
server <- function(input, output, session) {
  example_get_data_url <- session$registerDataObj(
    name = "example-get-api",
    data = {
      # Fetch the example PDF content from the external link and save it as a temporary file
      temp_pdf_path <- tempfile(fileext = ".pdf")
      external_pdf_link <- 'https://www.africau.edu/images/default/sample.pdf'
      download.file(external_pdf_link, temp_pdf_path, mode = "wb")
      readBin(temp_pdf_path, "raw", file.info(temp_pdf_path)$size)
    },
    filterFunc = function(content, req) {
      httpResponse(200, "application/pdf", content)
    }
  )
  output$pdf <- renderUI(
    tags$iframe(
      src = example_get_data_url
    )
  )
 
  output$link <- renderText({
    paste0(
      session$clientData$url_protocol,
      '//',
      session$clientData$url_hostname,
      ':',
      session$clientData$url_port,
      '/',
      example_get_data_url
    )
  })
}
 
shinyApp(ui, server)

The crucial element here is session$registerDataObj, which registers a new endpoint and efficiently serves the specified data. In this instance, the data is a binary PDF file, and its accessibility is confined to the associated session, thereby enhancing the overall security of the Shiny application.

But..?

Although our PDF delivery mechanism in app_2.r is significantly more secure, there remains a security gap that requires attention.

To demonstrate this, execute the app_2.r script and copy the generated link from the top. Now, open a separate incognito tab and paste the link – voila, the PDF is accessible for an anonymous user. This scenario is far from ideal, especially when dealing with highly sensitive files.

As observed, sharing the link (or more precisely, the session token embedded within it) poses a security risk. Session tokens should be considered a secret. What if the link is accidentally shared or falls into the wrong hands? The recipient gains immediate access to the file. While there are significant mitigating factors (see a discussion), and the link only works while the session is open, there is still room for improvement. 👇

At the time of writing, an open PR from Joe Cheng addresses this vulnerability: PR-3766.

In the interim, you can explore my app-level implementation designed to tackle this issue. Although this solution may become obsolete once the aforementioned PR is merged, I still consider it a worthwhile experiment to implement it independently. 👇


Extra: Endpoint with Session Token and JWT token

To further enhance security, we need a mechanism that goes beyond the session lifespan and introduces additional layers of authentication.

In other words, we have to prove that the browser requesting the asset is the same one that initiated the session. To achieve this, we will leverage JSON Web Tokens (JWTs) to create session access tokens.

The general idea is to generate a token for each session and store it in the browser as a cookie. Then, when the user requests the asset, the token is sent to the server, which validates it and serves the asset if the token is valid. Think of it as a two-factor authentication mechanism, where the session URL is the first factor, and the JWT token is the second factor.

Compared to app_2.r, we need to introduce a couple of helper functions first:

Sys.setenv(SHINY_PEPPER = 'pepper')

We need to set a so-called pepper for the JWT token. The pepper is used to generate a unique token for each application. The pepper should be a secret and should be different for each Shiny application.

get_secret <- function(pepper = Sys.getenv('SHINY_PEPPER'), salt, token) {
  paste0(pepper, salt, token)
}

There is also a salt - a random value that is used to generate a unique token for each session. The salt should be different for each session. In this example, we’re using a random value, but you can use any other value that is unique for each session.
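As a hedged illustration of "any other value that is unique for each session", the sketch below draws the salt from a cryptographic random source instead. It assumes the openssl package is installed (jose depends on it); `generate_salt` is a hypothetical helper that is not part of the original app, and `get_secret` is repeated here only so the snippet runs on its own.

```r
# Same helper as above, repeated so the snippet is self-contained
get_secret <- function(pepper = Sys.getenv('SHINY_PEPPER'), salt, token) {
  paste0(pepper, salt, token)
}

# Hypothetical helper: a hex-encoded salt built from random bytes
generate_salt <- function(n_bytes = 16) {
  paste(as.character(openssl::rand_bytes(n_bytes)), collapse = "")
}

session_token <- "55099de85c5bfd2839b87488605c9395"  # example token from above
salt_a <- generate_salt()
salt_b <- generate_salt()

# The same pepper and session token give different secrets for different salts
secret_a <- get_secret("pepper", salt_a, session_token)
secret_b <- get_secret("pepper", salt_b, session_token)
identical(secret_a, secret_b)
```

Because the salt lives only in the server function, a new session gets a new secret even if the pepper stays the same.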


generate_session_jwt <- function(
    session_token,
    pepper = Sys.getenv('SHINY_PEPPER'),
    salt
) {
  modified_session_token <- get_secret(pepper, salt, session_token)
  key <- paste0('session_', digest::digest(modified_session_token))
  token <- jwt_encode_hmac(secret = modified_session_token)
  list(
    key = key,
    token = token
  )
}

The generate_session_jwt function generates a JWT for a given session token. It returns the token itself and a key, a unique identifier for the token (in case multiple Shiny sessions are open in one browser). The key is used as the name of the cookie that stores the token in the browser. No payload is needed; we only need to know that the token is valid.


validate_session_jwt <- function(
    token,
    session_token,
    pepper = Sys.getenv('SHINY_PEPPER'),
    salt
) {
  jwt_decode_hmac(token, get_secret(pepper, salt, session_token))
}

validate_session_jwt throws an error if the token is invalid.
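The sign-then-verify behaviour can be tried in isolation. The snippet below is a small sketch using the jose package with made-up secrets; `is_valid` is a hypothetical wrapper that turns the error into FALSE, in the same spirit as the tryCatch used later in the filterFunc.

```r
library(jose)

# Made-up secrets; in the app each one is pepper + salt + session token
secret <- paste0("pepper", "0.42", "session-token-a")
other_secret <- paste0("pepper", "0.42", "session-token-b")

# Sign a default claim with the first secret
token <- jwt_encode_hmac(jwt_claim(), secret = secret)

# Hypothetical wrapper: TRUE if the token verifies, FALSE if decoding fails
is_valid <- function(token, secret) {
  tryCatch({
    jwt_decode_hmac(token, secret = secret)
    TRUE
  }, error = function(e) FALSE)
}

is_valid(token, secret)        # token checked against the secret that signed it
is_valid(token, other_secret)  # token checked against a different secret
```

The first call should succeed and the second should fail, since the HMAC signature only verifies against the secret that produced it.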


set_token <- function(session, pepper = Sys.getenv('SHINY_PEPPER'), salt) {
  token_obj <- generate_session_jwt(session$token, pepper, salt)
  session$sendCustomMessage("token", token_obj)
  token_obj
}

The set_token function generates a token for the current session and sends it to the browser in a custom message, where it is stored as a cookie. Naturally, we need to add a corresponding custom message handler to the UI:

add_token_handler <- function() {
  tags$script("
      import 'https://cdn.jsdelivr.net/npm/js-cookie/dist/js.cookie.min.js';
      Shiny.addCustomMessageHandler('token', function({key, token}) {
        Cookies.set(key, token);
      })
    ",
    type = "module"
  )
}

The handler receives a token and a key and stores them in the browser as a cookie (using the js-cookie library).

One issue that remains is that the cookie lacks the HttpOnly attribute (which makes a cookie inaccessible to JavaScript), leaving it vulnerable to theft via Cross-Site Scripting (XSS) attacks. Naturally, this is expected since we’re setting the cookie with JavaScript.


Last but not least:

parse_cookies <- function(cookie_string) {
  cookies <- strsplit(cookie_string, ";")[[1]]
  Reduce(function(prev, curr) {
    cookie <- strsplit(curr, '=')[[1]]
    key <- trimws(cookie[1])
    value <- trimws(cookie[2])
    prev[[key]] <- value
    prev
  }, cookies, init = list())
}

parse_cookies is used in the filterFunc to extract the token from the request: it takes the raw Cookie header and returns a named list of cookies.
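One caveat worth noting: parse_cookies splits each pair on every "=" and keeps only the second piece, so a cookie value that itself contains "=" would be cut short. The JWT stored here should not normally contain that character, but other cookies in the same header might. The following is a sketch of a variant that splits on the first "=" only; `parse_cookies_first_eq` is a hypothetical name, not part of the original app.

```r
# Hypothetical variant of parse_cookies: split each pair on the first "=" only
parse_cookies_first_eq <- function(cookie_string) {
  pairs <- trimws(strsplit(cookie_string, ";")[[1]])
  keys <- sub("=.*$", "", pairs)       # everything before the first "="
  values <- sub("^[^=]*=", "", pairs)  # everything after the first "="
  as.list(setNames(values, keys))
}

# Made-up Cookie header with a value that contains "="
cookies <- parse_cookies_first_eq("theme=dark; session_abc=part1=part2")
cookies[["theme"]]
cookies[["session_abc"]]
```

Like the original helper, it returns a named list, so it could be swapped into the filterFunc without other changes.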

The final code looks like this:

app_3.r
library(shiny)
library(jose)
 
Sys.setenv(SHINY_PEPPER = 'pepper')
 
get_secret <- function(pepper = Sys.getenv('SHINY_PEPPER'), salt, token) {
  paste0(pepper, salt, token)
}
 
generate_session_jwt <- function(
    session_token,
    pepper = Sys.getenv('SHINY_PEPPER'),
    salt
) {
  modified_session_token <- get_secret(pepper, salt, session_token)
  key <- paste0('session_', digest::digest(modified_session_token))
  token <- jwt_encode_hmac(secret = modified_session_token)
  list(
    key = key,
    token = token
  )
}
 
validate_session_jwt <- function(
    token,
    session_token,
    pepper = Sys.getenv('SHINY_PEPPER'),
    salt
) {
  jwt_decode_hmac(token, get_secret(pepper, salt, session_token))
}
 
add_token_handler <- function() {
  tags$script("
      import 'https://cdn.jsdelivr.net/npm/js-cookie/dist/js.cookie.min.js';
      Shiny.addCustomMessageHandler('token', function({key, token}) {
        Cookies.set(key, token);
      })
    ",
    type = "module"
  )
}
 
set_token <- function(session, pepper = Sys.getenv('SHINY_PEPPER'), salt) {
  token_obj <- generate_session_jwt(session$token, pepper, salt)
  session$sendCustomMessage("token", token_obj)
  token_obj
}
 
parse_cookies <- function(cookie_string) {
  cookies <- strsplit(cookie_string, ";")[[1]]
  Reduce(function(prev, curr) {
    cookie <- strsplit(curr, '=')[[1]]
    key <- trimws(cookie[1])
    value <- trimws(cookie[2])
    prev[[key]] <- value
    prev
  }, cookies, init = list())
}
 
ui <- fluidPage(
  add_token_handler(),
  textOutput("link"),
  uiOutput("pdf")
)
 
 
server <- function(input, output, session) {
  salt <- runif(1) * 1000 # Random value
  token_obj <- set_token(session, salt = salt)
  example_get_data_url <- session$registerDataObj(
    name = "example-get-api",
    data = {
      # Fetch the PDF content from the external link and save it as a temporary file
      temp_pdf_path <- tempfile(fileext = ".pdf")
      external_pdf_link <- 'https://www.africau.edu/images/default/sample.pdf'
      download.file(external_pdf_link, temp_pdf_path, mode = "wb")
      readBin(temp_pdf_path, "raw", file.info(temp_pdf_path)$size)
    },
    filterFunc = function(content, req) {
      tryCatch({
        token <- parse_cookies(req$HTTP_COOKIE)[[token_obj$key]]
        print(token)
        validate_session_jwt(token, session_token = session$token, salt = salt)  # Throws an error if invalid
        httpResponse(200, "application/pdf", content)
      },
      error = function(cond) {
        httpResponse(403, 'text/plain', 'not this time my friend')
      })
    }
  )
 
  output$pdf <- renderUI(tags$iframe(src = example_get_data_url))
 
  output$link <- renderText({
    paste0(
      session$clientData$url_protocol,
      '//',
      session$clientData$url_hostname,
      ':',
      session$clientData$url_port,
      '/',
      example_get_data_url
    )
  })
}
 
shinyApp(ui, server, options = list(port = 4444))

Now, try to open the link in a separate incognito tab. If you see error 403 (or a ‘not this time my friend’ message 😁), the request was rejected because that browser does not hold a valid token cookie 👉 we’re good to go!
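The same decision can be reproduced without a browser. The sketch below mimics the filterFunc check with the jose package and returns only the status code; the secret, the cookie name and the `status_for` helper are all made up for illustration, and the cookie parsing is a simplified inline version rather than the parse_cookies helper from the app.

```r
library(jose)

secret <- "pepper0.42session-token"  # made-up pepper + salt + session token
cookie_key <- "session_demo"         # made-up cookie name
token <- jwt_encode_hmac(jwt_claim(), secret = secret)

# Hypothetical helper: the status the endpoint would answer with
# for a given raw Cookie header
status_for <- function(cookie_header) {
  tryCatch({
    pairs <- trimws(strsplit(cookie_header, ";")[[1]])
    values <- setNames(sub("^[^=]*=", "", pairs), sub("=.*$", "", pairs))
    jwt_decode_hmac(values[[cookie_key]], secret = secret)
    200L
  }, error = function(e) 403L)
}

status_for(paste0(cookie_key, "=", token))  # browser that owns the session
status_for(NULL)                            # incognito tab: no cookies sent
status_for(paste0(cookie_key, "=forged"))   # tampered cookie value
```

Only the first request carries a token signed with the session's secret, so it is the only one expected to get the PDF.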

Conclusion

In conclusion, this post delved into the security challenges linked with serving assets in Shiny apps and presented effective mitigation strategies. The least secure method, utilizing addResourcePath, opens the door to unrestricted access to assets. A more robust alternative is establishing a dedicated endpoint within Shiny, limiting asset access to specific sessions.

However, a security gap persists in this approach, as a leaked link could be exploited by unauthorized users. This issue is currently under consideration in an open Pull Request. Until the proposed changes are incorporated, we can address this vulnerability by implementing a system that utilizes JSON Web Tokens (JWTs) to generate session access tokens. This final solution markedly improves security, rendering a leaked asset link insufficient for unauthorized data retrieval. Thus, we ensure that assets are exclusively served for the duration of a specific session to the browser that initiated it.