This page documents how to deploy the JupyterHealth Exchange (JHE) to a Kubernetes cluster on AWS.
Create the Kubernetes Cluster¶
Define the cluster in a configuration file, `cluster.yml`. Specify values that are appropriate for your deployment.
```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: jhe
  region: us-east-2
  version: '1.30'
vpc:
  clusterEndpoints:
    publicAccess: true
    privateAccess: true
  nat:
    gateway: Single
nodeGroups:
  - name: public-nodes
    instanceType: t2.micro
    desiredCapacity: 2
    privateNetworking: false
managedNodeGroups:
  - name: system-nodes
    instanceType: t2.small
    privateNetworking: true
    minSize: 1
    maxSize: 3
```
The configuration will be provided to `eksctl`, which requires the following environment variables to be set:

- `AWS_SECRET_ACCESS_KEY`
- `AWS_ACCESS_KEY_ID`
- `AWS_DEFAULT_REGION`
Create the cluster:

```shell
eksctl create cluster -f cluster.yml
```
Install Cluster Components¶
Install ingress-nginx¶
First, prepare parameters in `ingress-nginx.yaml`:

```yaml
controller:
  service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-type: nlb
      service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
  config:
    use-forwarded-headers: "true"
```
Then run the following:

```shell
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install nginx-ingress ingress-nginx/ingress-nginx -f ingress-nginx.yaml
```
Install cert-manager¶
```shell
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install \
  cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.15.4 \
  --set crds.enabled=true \
  --wait
```
Create a Database¶
This is an example configuration for an Amazon RDS PostgreSQL instance. Use values appropriate for your deployment. The VPC-related values would come from identifiers created when the cluster was created.
Example RDS¶
AWS RDS Configuration
Parameter | Value |
---|---|
Creation method | Standard create |
Engine type | PostgreSQL |
Engine version | 16.3-R3 |
Templates | Dev/Test |
Availability and durability, deployment | Multi-AZ DB Instance |
DB instance identifier | jhe-db-staging-1 |
Credentials management | Self managed, not auto generated |
DB instance class | Burstable classes, db.t3.small |
Storage type | General Purpose SSD (gp2) |
Allocated storage | 100 GiB |
Enable storage autoscaling | yes |
Maximum storage threshold | 1000 GiB |
Compute resource | Don’t connect to an EC2 compute resource |
VPC | eksctl-jhe-cluster/VPC |
DB subnet group | create new db subnet group |
Public access | no |
VPC security group | choose existing |
Existing VPC security groups | default, eks-cluster-jhe-... |
Database authentication | password |
Enable Performance insights | yes |
Retention period | 7 days (free tier) |
AWS KMS key | (default) aws/rds |
Initial database name | jhe |
Note the attributes of the database, e.g.
Database Attributes
Parameter | Value |
---|---|
db identifier | database-1 |
endpoint | database-1...rds.amazonaws.com |
port | 5432 |
master username | postgres |
secret value | (your secret) |
rotation | 365d |
Test the Database¶
Launch a shell in the cluster.
```shell
$ kubectl run postgres-test -it --rm --image=postgres:16.3 -- bash
If you don't see a command prompt, try pressing enter.
root@postgres-test:/#
```
Use the database endpoint, username, and secret to connect to the database you created.
```shell
root@postgres-test:/# psql -h {endpoint} -U {master username} -d postgres
Password for user postgres:
psql (16.3 (Debian 16.3-1.pgdg120+1))
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.
postgres=>
```
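If the `postgres` image is unavailable, basic TCP reachability of the RDS endpoint can also be checked from any pod with Python. This is a minimal sketch; the `can_connect` helper is illustrative and not part of the deployment:

```python
import socket

def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. can_connect("{endpoint}", 5432)
```

A `True` result only confirms network routing and security-group rules; credentials still need to be verified with the `psql` test above.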
Seed Data into the Database¶
Migrate Database¶
Create a Job to migrate the database using our existing ConfigMap.
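This Job (and the seed Job below) reads the application's settings from a ConfigMap named `jhe-config`, which must already exist in the `jhe` namespace. The exact keys depend on the jupyterhealth-exchange settings module, so the following is only a hypothetical sketch of how the database attributes noted earlier might be wired in; the database password should live in a Secret, not a ConfigMap:

```yaml
# Hypothetical sketch of jhe-config; actual key names depend on
# the jupyterhealth-exchange settings module.
apiVersion: v1
kind: ConfigMap
metadata:
  name: jhe-config
  namespace: jhe
data:
  DB_HOST: "{endpoint}"   # the RDS endpoint noted above
  DB_PORT: "5432"
  DB_NAME: jhe
  DB_USER: postgres
```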
```yaml
# job-manage-migrate.yml
apiVersion: batch/v1
kind: Job
metadata:
  name: jhe-manage-migrate
  namespace: jhe
spec:
  template:
    metadata:
      name: jhe-manage-migrate
    spec:
      restartPolicy: Never
      containers:
        - name: jhe-manage-migrate
          image: ryanlovett/jupyterhealth-exchange:a30ad58
          command: ["python", "manage.py", "migrate"]
          envFrom:
            - configMapRef:
                name: jhe-config
```
Run the job.

```shell
kubectl apply -f job-manage-migrate.yml
```
Seed the Database¶
This requires the `seed.sql` file from the jupyterhealth-exchange repository and a new Python script, `jhe/scripts/seed.py`, to import it. `seed.py` is currently available in a pull request to jupyterhealth-exchange.

Ingest them as ConfigMaps by running the following commands from within the working directory of the jupyterhealth-exchange repository:

```shell
kubectl -n jhe create configmap db-seed-sql --from-file=db/seed.sql
kubectl -n jhe create configmap jhe-scripts-seed.py --from-file=jhe/scripts/seed.py
```
Create a Job to seed the database.
```yaml
# job-import-seed.yml
apiVersion: batch/v1
kind: Job
metadata:
  name: import-seed
  namespace: jhe
spec:
  template:
    metadata:
      name: import-seed
    spec:
      containers:
        - name: import-seed
          image: ryanlovett/jupyterhealth-exchange:a30ad58
          command: ["python", "/app/seed.py"]
          envFrom:
            - configMapRef:
                name: jhe-config
          volumeMounts:
            - name: seed-sql
              mountPath: /app/seed.sql
              subPath: seed.sql
            - name: seed-py
              mountPath: /app/seed.py
              subPath: seed.py
      restartPolicy: Never
      volumes:
        - name: seed-sql
          configMap:
            name: db-seed-sql
        - name: seed-py
          configMap:
            name: jhe-scripts-seed.py
```
and run it:

```shell
kubectl apply -f job-import-seed.yml
```
Install the Application¶
Finally, install the application into the cluster. `jhe-example.yml` is provided as an example Kubernetes configuration, although you will need to substitute values appropriate for your deployment.

```shell
kubectl apply -f jhe-example.yml
```
Administering JHE¶
Log in to your JupyterHealth Exchange app at https://jhe.example.org/admin/. Under Django OAuth Toolkit, add an application:

a. Save the client id
b. Add space-separated redirect URIs for the hubs:
   - http://localhost:8000/auth/callback
   - https://jupyterhub.example.org/hub/oauth_callback
   - https://jupyterhub.example.org/services/smart/oauth_callback
   - https://jupyterhub.example.org/user-redirect/smart/oauth_callback
   - https://jhe.example.org/auth/callback
c. Client type: Public
d. Authorization grant type: Authorization code
e. Client secret: {client secret}
f. Hash client secret: yes
g. Skip authorization: yes
h. Algorithm: RSA with SHA-2 256
Authenticating JupyterHub with JHE¶
For users of JupyterHub to have access to JHE, the simplest approach is to use JHE as the OAuth provider for logging in to JupyterHub. Below is the configuration to log in to JupyterHub with JHE as the OAuth provider:
```yaml
hub:
  config:
    JupyterHub:
      # use Exchange as OAuth provider
      authenticator_class: generic-oauth
    GenericOAuthenticator:
      client_id: ${{ saved from JHE }}
      cookie_max_age_days: 1
      authorize_url: https://jhe.example.org/authorize/
      token_url: https://jhe.example.org/o/token/
      userdata_url: https://jhe.example.org/api/v1/users/profile
      username_claim: email
      login_service: JupyterHealth Exchange
      scope:
        - openid
      admin_users:
        - email@example.org
      enable_auth_state: true
      # grant specific users access by email
      allowed_users:
        - user-email@example.org
      # or allow all JHE users to access the Hub with:
      # allow_all: true
      # see other example for group-based access
  extraConfig:
    # add access tokens from auth state to user env
    auth_state_env.py: |
      def auth_state_env(spawner, auth_state):
          if not auth_state:
              spawner.log.warning(f"Missing auth state for user {spawner.user.name}")
              return
          spawner.environment["JHE_TOKEN"] = auth_state["access_token"]

      c.Spawner.auth_state_hook = auth_state_env
singleuser:
  extraEnv:
    JHE_URL: https://jhe.example.org
```
You have three choices for authorizing JHE users to access the Hub:

1. Allow any JHE user to use the Hub, in which case set:

   ```yaml
   GenericOAuthenticator:
     allow_all: true
   ```

2. Allow specific users by email address:

   ```yaml
   GenericOAuthenticator:
     allowed_users:
       - user@example.org
   ```

3. Allow based on organization membership in JHE, which requires a bit more configuration.
Authorizing the Hub via JHE organization¶
To authorize access to the Hub based on JHE organization membership, we need to connect JupyterHub groups with JHE organizations. This lets you manage access to the Hub in the JHE UI by adding/removing users to the authorized groups.
1. [In JHE] Create the organization(s) that you want to grant access to the Hub. Note the integer “organization id” of each organization (they probably look like 2000X).
2. [In JHE] Add users to these organizations.
3. Configure JupyterHub to populate group membership based on JHE organization membership:
```yaml
# hub-jhe-access-groups.yaml
hub:
  config:
    GenericOAuthenticator:
      # grant access based on JHE organization membership
      manage_groups: true
      auth_state_groups_key: "organizations"
      allowed_groups:
        # the integer id (in quotes) in JHE of organizations
        # to allow access to the Hub
        - "2XXXX"
  extraConfig:
    # get organization membership for managed groups:
    managed_organizations.py: |
      from urllib.parse import urlparse

      async def auth_state_hook(authenticator, auth_state):
          if not auth_state:
              return auth_state
          access_token = auth_state["access_token"]
          url = urlparse(authenticator.authorize_url)
          org_url = f"{url.scheme}://{url.netloc}/api/v1/users/organizations"
          organizations = await authenticator.httpfetch(
              org_url, headers={"Authorization": f"Bearer {access_token}"}
          )
          # use string ids for now
          auth_state["organizations"] = [str(org['id']) for org in organizations]
          return auth_state

      c.OAuthenticator.modify_auth_state_hook = auth_state_hook
```
Accessing JHE from the Hub¶
With the above configuration, two environment variables are set when a user starts their server:

```shell
$JHE_URL    # the URL of the Exchange
$JHE_TOKEN  # the user's access token for the Exchange
```
You can use these to make API requests to the Exchange.
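For example, a small helper using only the standard library can call the Exchange API with these variables. The `/users/profile` path is the `userdata_url` endpoint configured above; `build_jhe_request` and `jhe_get` are illustrative names, not part of any JHE client library:

```python
import json
import os
import urllib.request

def build_jhe_request(path: str) -> urllib.request.Request:
    """Build an authenticated request to the Exchange API using
    the $JHE_URL and $JHE_TOKEN variables set by the Hub."""
    return urllib.request.Request(
        f"{os.environ['JHE_URL']}/api/v1{path}",
        headers={"Authorization": f"Bearer {os.environ['JHE_TOKEN']}"},
    )

def jhe_get(path: str) -> dict:
    """GET an Exchange API endpoint and decode the JSON response."""
    with urllib.request.urlopen(build_jhe_request(path)) as resp:
        return json.load(resp)

# e.g. profile = jhe_get("/users/profile")
```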
There is also the `jupyterhealth-client` Python package:

```shell
pip install --pre jupyterhealth-client
```

You can then use the `JupyterHealthClient` class to fetch patient data.