Writing Streamlit Apps

Writing a streamlit app for Octostar is much like writing any other streamlit app and following its best practices, with a few differences. For streamlit basics, refer to its documentation:

Streamlit Docs

In addition, the Octostar developer should be aware of:

  • Initialization logic
  • Browser Interactions
  • Parallelism & User Session Management
  • As a Back-End Service

Initialization Logic

It is good practice to have some initialization logic in a streamlit application. By convention, apps should follow this skeleton structure:

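import streamlit as st

def initialize():
    # Update the session state in place. Rebinding st.session_state to a
    # plain dict would replace streamlit's per-session object with a dict
    # shared by every session.
    st.session_state.update({
        'state': dict(),
        'input': {
            'user': dict(),
            'workspaces': None
        },
        'initialized': True
    })

def loop():
    st.write("Hello World!")

# initialize() runs once per user session; loop() runs on every rerun
if not st.session_state.get('initialized', False):
    initialize()
if st.session_state.get('initialized', False):
    loop()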

Browser Interactions

Functions that involve the front-end (from streamlit-octostar-research) cannot be called in initialize(); they must be called in loop() instead, behind if-guards. They return data asynchronously: the first call returns None, and when the data changes it causes a rerun of streamlit's code, exactly as if a user had interacted with a widget.

DON'T

import streamlit as st
from streamlit_octostar_research.desktop import get_open_workspace_ids

def initialize():
    st.session_state.update({
        'state': dict(),
        'input': {
            'user': dict(),
            # The first call returns None and initialize() never runs
            # again, so this value stays None
            'workspaces': get_open_workspace_ids()
        },
        'initialized': True
    })

def loop():
    st.write(st.session_state['input']['workspaces'])

DO

import streamlit as st
from streamlit_octostar_research.desktop import get_open_workspace_ids

def initialize():
    st.session_state.update({
        'state': dict(),
        'input': {
            'user': dict(),
            'workspaces': None
        },
        'initialized': True
    })

def loop():
    if not st.session_state['input']['workspaces']:
        # Returns None initially; a rerun follows once a valid value arrives
        st.session_state['input']['workspaces'] = get_open_workspace_ids()
    if not st.session_state['input']['workspaces']:
        st.stop()  # prevent further code execution until we get a valid result
    st.write(st.session_state['input']['workspaces'])
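
To see why the guard in loop() works, the rerun cycle can be imitated without streamlit at all. The sketch below is a stand-in under stated assumptions: session_state is a plain dict, make_fake_frontend_call is a hypothetical substitute for a front-end call that returns None the first time and a value afterwards, and StopScript mimics st.stop(). None of these names exist in streamlit or streamlit-octostar-research.

```python
class StopScript(Exception):
    """Stand-in for st.stop(): aborts the current script run."""


def make_fake_frontend_call(value):
    """Hypothetical front-end call: None on the first call, then `value`."""
    calls = {"count": 0}

    def fake_get_open_workspace_ids():
        calls["count"] += 1
        return None if calls["count"] == 1 else value

    return fake_get_open_workspace_ids


def loop(session_state, get_ids, output):
    # Ask the front-end only while there is still no valid value
    if not session_state["input"]["workspaces"]:
        session_state["input"]["workspaces"] = get_ids()
    # Still nothing: end this run and wait for the next rerun
    if not session_state["input"]["workspaces"]:
        raise StopScript()
    # Stand-in for st.write()
    output.append(session_state["input"]["workspaces"])


def simulate(reruns):
    """Run loop() `reruns` times against one session and collect the output."""
    session_state = {"input": {"workspaces": None}}
    get_ids = make_fake_frontend_call(["ws-1", "ws-2"])
    output = []
    for _ in range(reruns):
        try:
            loop(session_state, get_ids, output)
        except StopScript:
            pass  # the run ended early, nothing was rendered
    return output
```

With this stand-in, simulate(1) renders nothing because the first call yields None, while simulate(2) renders the workspace list on the second run. From the third run on, the stored value is reused and the front-end function is not called again.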

Parallelism & User Session Management

Streamlit does not natively handle parallel computations, and using asyncio is generally awkward and error-prone, so it is strongly discouraged. In particular, if a streamlit rerun is invoked while the event loop is running, the loop will crash and be left in an inconsistent state.

Using threads is a much more robust mechanism. However, since they are detached from the main thread, streamlit will not generally be aware of them. Furthermore, we want to use a shared pool of threads among user sessions to avoid starving the app of resources.

To resolve these issues, there are two modules in streamlit-octostar-utils:

  • threading.async_task_manager: This module defines an AsyncTaskManager, which can be cached globally in an app via st.cache_resource(). Tasks are sent to the pool as a combination of an ID and a callable using the submit() method, and an (intermediate or final) result can be polled via get_result(). Tasks can also be (cooperatively) cancelled via cancel().
  • threading.session_callback_manager: This module defines a singleton object which runs in a separate thread and listens for the following events:
    • user session start
    • user session heartbeat (every 2 seconds)
    • user session end

This can be used in conjunction with the AsyncTaskManager to cancel pending tasks when a user session is ended. It can also be used standalone, e.g. to shut down some service if no user is connected to the streamlit server.
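
The API of streamlit-octostar-utils is not reproduced here. As a rough illustration of the idea behind such a manager, the sketch below builds a shared thread pool on the standard library's concurrent.futures, with tasks keyed by ID, non-blocking polling, and cooperative cancellation through a threading.Event. The class name SimpleTaskManager, the count_up task, and all method signatures are assumptions made for this example; they are not the actual AsyncTaskManager interface.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SimpleTaskManager:
    """Illustrative only; not the streamlit-octostar-utils AsyncTaskManager."""

    def __init__(self, max_workers=4):
        # One pool shared by every user session
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._tasks = {}  # task_id -> (future, cancel_event)
        self._lock = threading.Lock()

    def submit(self, task_id, fn, *args):
        # The callable receives the cancel event first, so it can stop early
        cancel_event = threading.Event()
        future = self._pool.submit(fn, cancel_event, *args)
        with self._lock:
            self._tasks[task_id] = (future, cancel_event)

    def get_result(self, task_id):
        # Non-blocking poll: returns (done, result)
        with self._lock:
            future, _ = self._tasks[task_id]
        if not future.done():
            return False, None
        if future.cancelled():
            return True, None
        return True, future.result()

    def cancel(self, task_id):
        # Cooperative: a running task must check the event itself
        with self._lock:
            future, cancel_event = self._tasks[task_id]
        cancel_event.set()
        future.cancel()  # only takes effect if the task has not started yet


def count_up(cancel_event, limit):
    """Example task: counts to `limit`, checking for cancellation each step."""
    total = 0
    for _ in range(limit):
        if cancel_event.is_set():
            return None
        total += 1
    return total
```

In an app, one such manager could be shared across sessions by creating it inside a function decorated with st.cache_resource, and a session-end callback could call cancel() for the task IDs that belong to the session that just ended.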

As a Back-End Service

Streamlit is a stateful server, which allows heavy computations to live in the same place as the user interface. However, streamlit does not natively offer a way to expose reusable functions as APIs to other services. The recommended alternative is a python REST API server, such as Flask or FastAPI (Writing FastAPI/Flask apps), which are typically used in conjunction with a JS front-end.

If there is a requirement to develop an API together with a streamlit interface, the recommended approach is to deploy two servers (e.g. FastAPI + streamlit) in the main.sh file of the app, coordinated by nginx as a reverse proxy to route incoming requests to the correct server. For example:

## install nginx
apt-get update
apt-get install -y nginx
## run streamlit server on port 8501
streamlit run --server.port 8501 --server.address 127.0.0.1 --server.enableCORS true streamlit_app.py &
## run FastAPI server on port 8502
python fastapi_app.py &
## run nginx to respond to requests on 8080 (default port for apps)
nginx -g 'daemon off;' -c $(pwd)/nginx.conf

And in an nginx.conf file:

events {}

http {
    server {
        listen 8080;

        location /docs {
            proxy_pass http://127.0.0.1:8502/docs;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header Host $host;
        }

        location /openapi.json {
            proxy_pass http://127.0.0.1:8502/openapi.json;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header Host $host;
        }

        location /api {
            proxy_pass http://127.0.0.1:8502/api;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header Host $host;
            client_max_body_size 128M;
        }

        location / {
            proxy_pass http://127.0.0.1:8501;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header Host $host;
        }
    }
}
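
To check which server a given path reaches, recall that among plain prefix location blocks nginx selects the longest matching prefix. The sketch below imitates that rule for the configuration above. The function name route and the upstream labels are hypothetical, and the sketch deliberately ignores regex and exact-match locations, which this configuration does not use.

```python
# Prefix locations from nginx.conf, mapped to the server they proxy to
LOCATIONS = {
    "/docs": "fastapi (127.0.0.1:8502)",
    "/openapi.json": "fastapi (127.0.0.1:8502)",
    "/api": "fastapi (127.0.0.1:8502)",
    "/": "streamlit (127.0.0.1:8501)",
}


def route(path):
    """Return the upstream for `path` using longest-prefix matching."""
    # Every request path starts with "/", so at least "/" always matches
    matches = [prefix for prefix in LOCATIONS if path.startswith(prefix)]
    return LOCATIONS[max(matches, key=len)]
```

For instance, route("/api/items") resolves to the FastAPI server, while route("/some/page") falls through to streamlit. Prefix matching is purely textual, so a path such as /apiary would also be sent to the API server.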