Writing Streamlit Apps
Writing a streamlit app for Octostar is much the same as writing any other streamlit app and following its best practices, with a couple of differences. For basic streamlit knowledge, refer to its documentation:
In addition, the Octostar developer should be aware of:
- Initialization logic
- Browser Interactions
- Parallelism & User Session Management
- As a Back-end Service
Initialization Logic
It is good practice to have some initialization logic in a streamlit application. By convention, apps should follow this skeleton structure:
import streamlit as st

def initialize():
    st.session_state = {
        'state': dict(),
        'input': {
            'user': dict(),
            'workspaces': None
        },
        'initialized': True
    }

def loop():
    st.write("Hello World!")

if not st.session_state.get('initialized', False):
    initialize()
if st.session_state.get('initialized', False):
    loop()
Browser Interactions
Functions that involve the front-end (from streamlit-octostar-research) cannot be called in initialize() and must be called in loop() instead, guarded by if statements. They return data asynchronously (initially they just return None) and trigger a rerun of the streamlit script when their data changes, exactly as if a user had interacted with a widget.
❌ DON'T
import streamlit as st
from streamlit_octostar_research.desktop import get_open_workspace_ids

def initialize():
    st.session_state = {
        'state': dict(),
        'input': {
            'user': dict(),
            'workspaces': get_open_workspace_ids()  # Will be set to None
        },
        'initialized': True
    }

def loop():
    st.write(st.session_state['input']['workspaces'])
✅ DO
import streamlit as st
from streamlit_octostar_research.desktop import get_open_workspace_ids

def initialize():
    st.session_state = {
        'state': dict(),
        'input': {
            'user': dict(),
            'workspaces': None
        },
        'initialized': True
    }

def loop():
    if not st.session_state['input']['workspaces']:
        # Will be None initially; a rerun is triggered once a valid value is available
        st.session_state['input']['workspaces'] = get_open_workspace_ids()
    if not st.session_state['input']['workspaces']:
        st.stop()  # prevent further code execution until we get a valid result
    st.write(st.session_state['input']['workspaces'])
Parallelism & User Session Management
Streamlit does not natively handle parallel computations, and using asyncio for them is generally awkward and error-prone, and therefore strongly discouraged: whenever a streamlit rerun is invoked while the event loop is running, the loop crashes and is left in an inconsistent state.
Using threads is a much more robust mechanism. However, since they are detached from the main thread, streamlit will not generally be aware of them. Furthermore, we want to use a shared pool of threads among user sessions to avoid starving the app of resources.
To resolve these issues, streamlit-octostar-utils provides two modules:
- threading.async_task_manager: defines an AsyncTaskManager which can be cached globally in an app via st.cache_resource(). Tasks are sent to the pool as a combination of an ID and a callable using the submit() method, and an (intermediate or final) result can be polled via get_result(). Tasks can also be (co-operatively) cancelled via cancel().
- threading.session_callback_manager: defines a singleton object which runs in a separate thread and listens for the following events:
  - user session start
  - user session heartbeat (every 2 seconds)
  - user session end
This can be used in conjunction with the AsyncTaskManager to cancel pending tasks when a user session is ended. It can also be used standalone, e.g. to shut down some service if no user is connected to the streamlit server.
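As an illustration, below is a minimal sketch of how an AsyncTaskManager could be used inside the loop() of the skeleton above. The import path, the no-argument constructor, the submit(task_id, callable) argument order, and get_result() returning None until a result is available are assumptions based on the description above; consult streamlit-octostar-utils for the actual signatures.
import time
import streamlit as st
# Import path assumed from the module name above; verify against streamlit-octostar-utils
from streamlit_octostar_utils.threading.async_task_manager import AsyncTaskManager

@st.cache_resource
def get_task_manager():
    # One app-wide AsyncTaskManager, shared across all user sessions
    # (constructor arguments, e.g. pool size, are assumed to be optional)
    return AsyncTaskManager()

def slow_computation():
    time.sleep(10)
    return "done"

def loop():
    manager = get_task_manager()
    if not st.session_state['state'].get('task_submitted'):
        # submit() takes a task ID and a callable (argument order assumed)
        manager.submit('my-task', slow_computation)
        st.session_state['state']['task_submitted'] = True
    # Poll for an intermediate or final result (assumed to be None until available)
    result = manager.get_result('my-task')
    st.write(result if result is not None else "Still running...")
    # Pending tasks can be co-operatively cancelled with manager.cancel('my-task'),
    # e.g. from a session_callback_manager callback when the user session ends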
As a Back-End Service
Streamlit is a stateful server, which allows heavy computations to live in the same codebase as the user interface. However, streamlit does not natively offer a way to expose reusable functions as APIs to other services. The recommended approach is to use a Python REST API server such as Flask or FastAPI (Writing FastAPI/Flask apps), typically in conjunction with a JS front-end.
If there is a requirement to develop an API together with a streamlit interface, the recommended approach is to deploy two servers (e.g. FastAPI + streamlit) in the main.sh file of the app, coordinated by nginx as a reverse proxy to route incoming requests to the correct server. For example:
# install nginx
apt-get update
apt-get install -y nginx

# run the streamlit server on port 8501
streamlit run --server.port 8501 --server.address 127.0.0.1 --server.enableCORS true streamlit_app.py &

# run the FastAPI server on port 8502
python fastapi_app.py &

# run nginx to respond to requests on 8080 (the default port for apps)
nginx -g 'daemon off;' -c $(pwd)/nginx.conf
And in an nginx.conf file:
events {}
http {
    server {
        listen 8080;
        location /docs {
            proxy_pass http://127.0.0.1:8502/docs;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header Host $host;
        }
        location /openapi.json {
            proxy_pass http://127.0.0.1:8502/openapi.json;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header Host $host;
        }
        location /api {
            proxy_pass http://127.0.0.1:8502/api;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header Host $host;
            client_max_body_size 128M;
        }
        location / {
            proxy_pass http://127.0.0.1:8501;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header Host $host;
        }
    }
}
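For reference, the fastapi_app.py launched by main.sh above could look roughly like the following sketch. FastAPI serves /docs and /openapi.json automatically, and the host and port match the proxy_pass targets in nginx.conf; the /api/hello endpoint is purely illustrative.
# fastapi_app.py -- minimal sketch matching the nginx configuration above
from fastapi import FastAPI
import uvicorn

app = FastAPI()  # /docs and /openapi.json are served automatically

@app.get("/api/hello")
def hello():
    # Purely illustrative endpoint, reachable through nginx at /api/hello
    return {"message": "Hello from the API side of the app"}

if __name__ == "__main__":
    # Bind to the loopback address and port that nginx proxies API traffic to
    uvicorn.run(app, host="127.0.0.1", port=8502)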