Writing Streamlit Apps

Writing a Streamlit app for Octostar is largely the same as writing any other Streamlit app and following its best practices, with a few differences. For basic Streamlit knowledge, refer to its documentation:

Streamlit Docs

In addition, the Octostar developer should be aware of:

Initialization Logic
Browser Interactions
Parallelism & User Session Management
As a Back-End Service

Initialization Logic

It is good practice to include some initialization logic in a Streamlit application. By convention, apps should follow this skeleton structure:

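import streamlit as st

def initialize():
    # Populate the session state in place. Rebinding st.session_state to a
    # plain dict would replace Streamlit's per-session object.
    st.session_state.update({
        'state': dict(),
        'input': {
            'user': dict(),
            'workspaces': None
        },
        'initialized': True
    })

def loop():
    # Runs on every rerun once initialization has completed.
    st.write("Hello World!")

if not st.session_state.get('initialized', False):
    initialize()
if st.session_state.get('initialized', False):
    loop()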
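
The guard can also be exercised outside Streamlit. The following is a minimal sketch, assuming a plain dict stands in for st.session_state. The helper names initialize_state and run_script are hypothetical and only illustrate that initialization happens once while the loop body runs on every rerun.

```python
from typing import Any, MutableMapping


def initialize_state(state: MutableMapping[str, Any]) -> None:
    """Populate the default keys once (stand-in for initialize())."""
    state.update({
        "state": {},
        "input": {"user": {}, "workspaces": None},
        "initialized": True,
    })


def run_script(state: MutableMapping[str, Any]) -> int:
    """Simulate one top-to-bottom execution of the app script."""
    if not state.get("initialized", False):
        initialize_state(state)
    # Stand-in for loop(): count how many times the script body has run.
    state["state"]["runs"] = state["state"].get("runs", 0) + 1
    return state["state"]["runs"]


session = {}           # hypothetical stand-in for st.session_state
run_script(session)    # first run: initializes, then executes the loop body
run_script(session)    # rerun: skips initialization, keeps existing state
```

Because the second call finds 'initialized' already set, the counter stored under 'state' survives the rerun instead of being reset.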

Browser Interactions

Functions involving the front-end (from streamlit-octostar-research) cannot be placed in initialize(); they must be called in loop() behind if-guards. They return data asynchronously: the first call returns None, and when the data arrives they trigger a rerun of the Streamlit script, exactly as if a user had interacted with a widget.

❌ DON'T

import streamlit as st
from streamlit_octostar_research.desktop import get_open_workspace_ids

def initialize():
    st.session_state.update({
        'state': dict(),
        'input': {
            'user': dict(),
            # Returns None on the first call, so None is what gets stored
            'workspaces': get_open_workspace_ids()
        },
        'initialized': True
    })

def loop():
    st.write(st.session_state['input']['workspaces'])

✅ DO

import streamlit as st
from streamlit_octostar_research.desktop import get_open_workspace_ids

def initialize():
    st.session_state.update({
        'state': dict(),
        'input': {
            'user': dict(),
            'workspaces': None
        },
        'initialized': True
    })

def loop():
    if not st.session_state['input']['workspaces']:
        # Returns None initially; a rerun follows once a valid value arrives
        st.session_state['input']['workspaces'] = get_open_workspace_ids()
    if not st.session_state['input']['workspaces']:
        st.stop()  # prevent further code execution until we get a valid result
    st.write(st.session_state['input']['workspaces'])
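
To make the rerun behaviour concrete, here is a small simulation that needs neither Streamlit nor a browser. FakeFrontend is a hypothetical stand-in for a front-end function such as get_open_workspace_ids(): it returns None on the first call and a value afterwards, which is the behaviour described above. The workspace IDs are made up.

```python
class FakeFrontend:
    """Hypothetical stand-in for an asynchronous front-end call."""

    def __init__(self, value):
        self._value = value
        self.calls = 0

    def __call__(self):
        self.calls += 1
        # First call: the data has not arrived yet, so the result is None.
        return None if self.calls == 1 else self._value


def loop(state, fetch):
    """One script run; returns the text that would be rendered, or None."""
    if not state["input"]["workspaces"]:
        state["input"]["workspaces"] = fetch()
    if not state["input"]["workspaces"]:
        return None  # equivalent of st.stop(): wait for the next rerun
    return f"workspaces: {state['input']['workspaces']}"


state = {"input": {"workspaces": None}}
fetch = FakeFrontend(["ws-1", "ws-2"])  # made-up IDs
first = loop(state, fetch)    # nothing rendered: the value is not available yet
second = loop(state, fetch)   # rendered once the rerun delivers the data
third = loop(state, fetch)    # value is already stored, fetch is not called again
```

The guard around the fetch keeps the front-end call from being repeated once a value has been stored in the session state.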

Parallelism & User Session Management

Streamlit does not natively handle parallel computations, and using asyncio is generally awkward and error-prone, so it is strongly discouraged. Whenever a Streamlit rerun is invoked while the event loop is running, the loop crashes and is left in an inconsistent state.

Using threads is a much more robust mechanism. However, since they are detached from the main thread, Streamlit is generally not aware of them. Furthermore, a pool of threads should be shared among user sessions to avoid starving the app of resources. To resolve these issues, streamlit-octostar-utils provides two modules:

threading.async_task_manager: This module defines an AsyncTaskManager, which can be cached globally in an app via st.cache_resource(). Tasks are sent to the pool as a combination of an ID and a callable using the submit() method, and an intermediate or final result can be polled via get_result(). Tasks can also be cancelled co-operatively via cancel().
threading.session_callback_manager: This module defines a singleton object which runs in a separate thread and listens for the following events:
- user session start
- user session heartbeat (every 2 seconds)
- user session end

This can be used in conjunction with the AsyncTaskManager to cancel pending tasks when a user session is ended. It can also be used standalone, e.g. to shut down some service if no user is connected to the streamlit server.
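
The general pattern (a shared thread pool, tasks keyed by ID, polling, and co-operative cancellation) can be sketched with the standard library alone. This is not the AsyncTaskManager implementation and does not reproduce its signatures: SimpleTaskManager, its wait() helper and the example task are hypothetical names that only mirror the submit/get_result/cancel vocabulary used above.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SimpleTaskManager:
    """Hypothetical illustration of a shared, ID-keyed task pool."""

    def __init__(self, max_workers=4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._lock = threading.Lock()
        self._futures = {}
        self._cancel_flags = {}

    def submit(self, task_id, fn):
        """Run fn(cancel_flag) in the pool under the given ID."""
        flag = threading.Event()
        with self._lock:
            self._cancel_flags[task_id] = flag
            self._futures[task_id] = self._pool.submit(fn, flag)

    def get_result(self, task_id):
        """Return (done, result) without blocking, so it can be polled on reruns."""
        with self._lock:
            future = self._futures[task_id]
        if not future.done():
            return False, None
        return True, future.result()

    def cancel(self, task_id):
        """Request cancellation; the task has to check the flag itself."""
        with self._lock:
            self._cancel_flags[task_id].set()

    def wait(self, task_id, timeout=None):
        """Block until the task finishes (useful in scripts and tests)."""
        with self._lock:
            future = self._futures[task_id]
        return future.result(timeout=timeout)


def count_until_cancelled(cancel_flag, limit=1_000_000):
    """Example task: stops early once cancellation is requested."""
    steps = 0
    while steps < limit and not cancel_flag.is_set():
        cancel_flag.wait(0.001)  # simulate a small unit of work
        steps += 1
    return steps


manager = SimpleTaskManager(max_workers=2)
manager.submit("quick", lambda flag: 21 * 2)
manager.submit("long", count_until_cancelled)
manager.cancel("long")
quick_result = manager.wait("quick", timeout=5)
long_steps = manager.wait("long", timeout=5)
```

In an app, an object like this would be created once inside a function decorated with st.cache_resource so that all user sessions share the same pool, and a session-end callback could call cancel() for the tasks owned by that session.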

As a Back-End Service

Streamlit is a stateful server, which allows heavy computations to live in the same codebase as the user interface. However, Streamlit does not natively offer a way to expose reusable functions as APIs to other services. The recommended alternative is a Python REST API server such as Flask or FastAPI (see Writing FastAPI/Flask apps), which is typically used in conjunction with a JS front-end. If an API must be developed together with a Streamlit interface, the recommended approach is to start two servers (e.g. FastAPI + Streamlit) from the app's main.sh file, with nginx acting as a reverse proxy that routes incoming requests to the correct server. For example:

## install nginx
apt-get update
apt-get install -y nginx
## run streamlit server on port 8501
streamlit run --server.port 8501 --server.address 127.0.0.1 --server.enableCORS true streamlit_app.py &
## run FastAPI server on port 8502
python fastapi_app.py &
## run nginx to respond to requests on 8080 (default port for apps)
nginx -g 'daemon off;' -c "$(pwd)/nginx.conf"

And in an nginx.conf file:

events {}

http {
    server {
        listen 8080;

        location /docs {
            proxy_pass http://127.0.0.1:8502/docs;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header Host $host;
        }

        location /openapi.json {
            proxy_pass http://127.0.0.1:8502/openapi.json;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header Host $host;
        }

        location /api {
            proxy_pass http://127.0.0.1:8502/api;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header Host $host;
            client_max_body_size 128M;
        }

        location / {
            proxy_pass http://127.0.0.1:8501;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header Host $host;
        }
    }
}
