This is an exploratory look at the SIOP 2026 conference programme. It is built from
two pieces of public information for each session: the session title and the
number of people who added that session to their personal Whova agenda ("agenda-adds").
Sessions were tagged with one or more scientific topic labels by an LLM running an
active-learning loop. The topics were then placed on a 2D map by embedding each topic name with a local
multilingual sentence-embedding model (gte-multilingual-base) and projecting the
embeddings to two dimensions with classical MDS on cosine distance. Topics that are semantically similar end up near each other.
Each label is one of 30 topics that emerged from the active-learning tagging loop. Position reflects semantic similarity (closer = more related in meaning). The dropdown "Size by" controls what drives the size of the labels and dots (no colour coding — only size encodes value):
| Size by | What it shows |
|---|---|
| Median session residual log (default) | Median of session residuals after adjusting for day and time slot. Bigger label = more over-indexed (more interest than expected); smaller label = more under-indexed. Median is used (rather than mean) so a single outlier session can't dominate a topic's score. |
| Frequency across sessions | How many sessions carry this topic. |
| Agenda-adds across sessions | Sum of agenda-add counts across the topic's sessions. Not unique people. |
| Median agenda-adds per session | Median number of adds among the topic's sessions (robust to outliers). |
| No scaling | Uniform font size. Use to read the layout itself without size influence. |
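
As a rough illustration of the three count-based "Size by" options, the sketch below computes frequency, summed agenda-adds and median agenda-adds per topic. The session records, the "Selection" topic and the function name `topic_size_metrics` are made up for the example; the dashboard's own code may differ.

```python
from collections import defaultdict
from statistics import median

# Hypothetical sample data: (title, agenda_adds, topics).
sessions = [
    ("AI in selection", 120, ["Artificial Intelligence", "Selection"]),
    ("Burnout at work", 80, ["Well-being"]),
    ("AI and well-being", 200, ["Artificial Intelligence", "Well-being"]),
]

def topic_size_metrics(sessions):
    """Return per-topic frequency, total adds and median adds."""
    adds_by_topic = defaultdict(list)
    for _title, adds, topics in sessions:
        # A session with several topics counts once for each of them,
        # which is why the totals are not unique people.
        for topic in topics:
            adds_by_topic[topic].append(adds)
    return {
        topic: {
            "frequency": len(adds),
            "total_adds": sum(adds),
            "median_adds": median(adds),
        }
        for topic, adds in adds_by_topic.items()
    }

metrics = topic_size_metrics(sessions)
```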
Click any topic — or a combination — to filter. Selected topics highlight in orange; the right-hand panel shows the sessions that match all selected topics, sorted by agenda-adds. Click again or use the pills to remove. Hovering any label or dot shows a tooltip with the underlying numbers.
One row per session. You can search by title, filter by one or more tags (sessions must contain every chosen tag), sort any column by clicking its header, and download the current filtered and sorted view as CSV.
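
The "match every selected tag" rule used by both the map and the table can be sketched as a subset test. This is only an illustration in Python: the record fields (`title`, `tags`, `adds`) and the function names are assumptions, and the CSV column set is simplified.

```python
import csv
import io

def filter_sessions(sessions, selected_tags, query=""):
    """Keep sessions that carry every selected tag and whose title contains the query."""
    selected = set(selected_tags)
    needle = query.lower()
    matches = [
        s for s in sessions
        if selected <= set(s["tags"]) and needle in s["title"].lower()
    ]
    # Sort by agenda-adds, highest first.
    return sorted(matches, key=lambda s: s["adds"], reverse=True)

def to_csv(rows):
    """Serialise the current view to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["title", "tags", "adds"])
    for s in rows:
        writer.writerow([s["title"], "; ".join(s["tags"]), s["adds"]])
    return buf.getvalue()

# Hypothetical sample data.
sample = [
    {"title": "AI in selection", "tags": ["Artificial Intelligence", "Selection"], "adds": 120},
    {"title": "Burnout at work", "tags": ["Well-being"], "adds": 80},
    {"title": "AI and well-being", "tags": ["Artificial Intelligence", "Well-being"], "adds": 200},
]
view = filter_sessions(sample, ["Artificial Intelligence"])
csv_text = to_csv(view)
```

With no tags selected the subset test is always true, so the full list is returned.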
The last column, "session attention ratio", is the session's agenda-adds divided by what the baseline model predicts for a session held on the same day and in the same time slot. The number is colour-coded for quick scanning (deep red = strongly over-indexed, deep blue = strongly under-indexed); this is the only place on the dashboard that uses colour. At the topic level (used for size on the map), the ratio is the exponentiated median of the residuals across the topic's sessions.
Starting from two seed topics ("Artificial Intelligence" and "Well-being"), an LLM (Gemini Flash) iterated through every session, tagged each with all relevant topics, then proposed new candidate topics from sessions that had no match. The loop stopped when no useful new topics were produced. The vocabulary was then filtered to keep scientific subject areas and drop session-format labels (Panel, Workshop, Poster, etc.). Final vocabulary: 30 topics. After excluding the 17 Whova "Posters" listings (which aggregate many independent posters under a single entry) and 2 placeholder "Friday Seminars" listings (no-topic umbrellas), the dashboard covers 473 single-presentation sessions.
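
The control flow of the tagging loop described above can be sketched as follows. The LLM calls are replaced by two pluggable callables, `tag_session` and `propose_topics`, which are hypothetical names; the toy keyword-based stand-ins exist only to make the sketch runnable and say nothing about the actual prompts or about how "useful" topics were judged.

```python
def build_vocabulary(sessions, seeds, tag_session, propose_topics, max_rounds=10):
    """Tag every session, then grow the vocabulary from unmatched sessions.

    Stops when a round proposes nothing new (or after max_rounds, an assumed safety cap).
    """
    vocabulary = list(seeds)
    tags = {}
    for _ in range(max_rounds):
        tags = {title: tag_session(title, vocabulary) for title in sessions}
        untagged = [title for title, found in tags.items() if not found]
        proposed = dict.fromkeys(propose_topics(untagged))  # de-duplicate, keep order
        new_topics = [t for t in proposed if t not in vocabulary]
        if not new_topics:
            break
        vocabulary.extend(new_topics)
    return vocabulary, tags

# Toy stand-ins for the LLM (assumptions for illustration only).
def toy_tagger(title, vocabulary):
    return [t for t in vocabulary if t.lower() in title.lower()]

def toy_proposer(untagged_titles):
    return [title.split()[0] for title in untagged_titles]

titles = [
    "Artificial intelligence in hiring",
    "Well-being at work",
    "Leadership under pressure",
    "Leadership and well-being",
]
vocab, tagged = build_vocabulary(
    titles, ["Artificial Intelligence", "Well-being"], toy_tagger, toy_proposer
)
```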
Each topic name was passed through a local multilingual embedding model
(gte-multilingual-base), producing a 768-dimensional vector that captures meaning.
Vectors were reduced to 2D with multidimensional scaling (MDS) on cosine distance, which tries
to preserve every pairwise distance, so a topic's neighbours on the map reflect its actual
semantic neighbours in embedding space. Even with only 30 points, perfect 2D preservation isn't
possible, but global structure (which topic clusters with which) is faithful and stable across
re-runs.
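
A minimal sketch of the projection step, assuming classical (Torgerson) MDS implemented directly with NumPy. Random vectors stand in for the real gte-multilingual-base embeddings, and the function names are illustrative; the dashboard may have used a library routine instead.

```python
import numpy as np

def cosine_distance_matrix(X):
    """Pairwise cosine distance (1 - cosine similarity) between the rows of X."""
    unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    return np.clip(1.0 - unit @ unit.T, 0.0, None)

def classical_mds(D, n_components=2):
    """Classical MDS: double-centre the squared distances, keep the top eigenvectors."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n           # centring matrix
    B = -0.5 * J @ (D ** 2) @ J                   # Gram matrix implied by D
    eigvals, eigvecs = np.linalg.eigh(B)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Stand-in for 30 topic embeddings of dimension 768 (not real model output).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(30, 768))
coords = classical_mds(cosine_distance_matrix(embeddings))  # shape (30, 2)
```

Because this variant is an eigendecomposition rather than an iterative optimisation, it gives the same layout on every run, up to the sign of each axis.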
A simple linear regression predicts log(agenda-adds + 1) from the day and the
time slot. The residual for each session is what's left over after that prediction,
i.e. how much more (or less) interest the session attracted than its scheduling peers.
Taking attention_ratio = exp(residual) turns that into an intuitive multiplier
(1.50 = 50% more than expected). Aggregated to the topic level, it identifies
topics that over-index on attendee interest.
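
The baseline and the ratio can be sketched with ordinary least squares on dummy-coded day and time slot. This assumes additive main effects with no day-by-slot interaction, which the text does not specify; the function names and sample values are illustrative only.

```python
import numpy as np

def attention_ratios(adds, days, slots):
    """Fit log(adds + 1) ~ day + slot and return (residuals, exp(residuals))."""
    y = np.log(np.asarray(adds, dtype=float) + 1.0)
    day_levels = sorted(set(days))
    slot_levels = sorted(set(slots))
    # Intercept plus one dummy column per non-reference level.
    cols = [np.ones(len(y))]
    cols += [np.array([d == lv for d in days], dtype=float) for lv in day_levels[1:]]
    cols += [np.array([s == lv for s in slots], dtype=float) for lv in slot_levels[1:]]
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    # exp(residual) equals (adds + 1) / exp(predicted log value).
    return residuals, np.exp(residuals)

def topic_ratio(residuals, session_topics, topic):
    """Topic-level ratio: exponentiated median residual of the topic's sessions."""
    vals = [r for r, topics in zip(residuals, session_topics) if topic in topics]
    return float(np.exp(np.median(vals)))

# Hypothetical sample: four sessions over two days and two slots.
days = ["Thu", "Thu", "Fri", "Fri"]
slots = ["AM", "PM", "AM", "PM"]
residuals, ratios = attention_ratios([9, 19, 19, 99], days, slots)
```

A ratio above 1 means the session drew more agenda-adds than its day and slot would predict; below 1 means fewer.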