Inverted Word Clouds Architecture

Case Scenario

Jutta sometimes creates a slide with an interactive word cloud question using www.menti.com or www.ahaslides.com. When she presents an online workshop, she asks the audience to answer the question by giving them a URL such as https://www.menti.com/4yho9tkqux. As audience members answer, the slide updates dynamically, highlighting the majority answers. The Word Cloud tool we create should support the same workflow but generate inverted word clouds that highlight the minority answers. Jutta’s requirements include:

  1. The audience doesn’t need to pay for it or go through a lengthy sign up process. I want to give them a link directly to the page where they enter their words.
  2. The results don’t need to be embeddable as long as I can share the results when I’m presenting on a Website.

Architectural Diagram

Data Structure

The word cloud questions and answers are saved as .json files in the "src/_data/" directory in the GitHub repository:

  • Every word cloud question is saved as an individual .json file named {uuid}-question.json
  • The answers to a word cloud question are saved as an individual .json file named {uuid}-answers.json

The format of a wordle question file

{
    "workshopName": {String},
    "question": {String},
    "entries": {Number},
    "entryMaxLength": {Number},
    "createdTimestamp": {Timestamp},
    "lastModifiedTimestamp": {Timestamp}
}
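
As a rough illustration of how the question file could be used, the sketch below types it and checks a submission against its limits. It rests on assumptions that the format above does not confirm: that "entries" is the maximum number of answers per respondent, that "entryMaxLength" is the maximum number of characters per answer, and that timestamps are stored as strings. The names WordCloudQuestion and validateSubmission are hypothetical.

```typescript
// Hypothetical typing of a {uuid}-question.json file.
// Assumption: timestamps are serialized as strings (e.g. ISO 8601).
interface WordCloudQuestion {
    workshopName: string;
    question: string;
    entries: number;        // assumed: max number of answers per respondent
    entryMaxLength: number; // assumed: max characters per answer
    createdTimestamp: string;
    lastModifiedTimestamp: string;
}

// Returns a list of human-readable problems; an empty list means the
// submission fits the limits declared by the question.
function validateSubmission(question: WordCloudQuestion, answers: string[]): string[] {
    const errors: string[] = [];
    if (answers.length > question.entries) {
        errors.push("Expected at most " + question.entries + " answers, got " + answers.length);
    }
    answers.forEach(function (answer, index) {
        if (answer.length > question.entryMaxLength) {
            errors.push("Answer " + (index + 1) + " is longer than " + question.entryMaxLength + " characters");
        }
    });
    return errors;
}
```

A server endpoint could run a check like this before writing to the answers file, so that oversized submissions never reach the repository.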

The format of a wordle answers file

{
    "{uuid}": {
        "answers": {String[]},
        "createdTimestamp": {Timestamp}
    }
    ....
}
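
To make the "inverted" idea concrete, here is a minimal sketch that tallies an answers file and turns the counts into display weights so that the rarest answers get the largest weight. The weighting formula (max + min - count), the trimming and lower-casing of answers, and the names AnswersFile, countAnswers and invertedWeights are all assumptions for illustration, not the tool's confirmed behaviour.

```typescript
// Hypothetical typing of a {uuid}-answers.json file: one record per submission.
interface AnswersFile {
    [submissionId: string]: {
        answers: string[];
        createdTimestamp: string; // assumed to be a string
    };
}

// Tally how often each answer occurs across all submissions.
// Assumption: answers are compared case-insensitively after trimming.
function countAnswers(file: AnswersFile): Record<string, number> {
    // A prototype-less object keeps words such as "constructor" from
    // colliding with inherited properties.
    const counts: Record<string, number> = Object.create(null);
    for (const id of Object.keys(file)) {
        for (const raw of file[id].answers) {
            const word = raw.trim().toLowerCase();
            if (word === "") continue;
            counts[word] = (counts[word] || 0) + 1;
        }
    }
    return counts;
}

// Flip the counts: the least frequent word receives the highest weight.
function invertedWeights(counts: Record<string, number>): Record<string, number> {
    const words = Object.keys(counts);
    const weights: Record<string, number> = Object.create(null);
    if (words.length === 0) return weights;
    const values = words.map(function (w) { return counts[w]; });
    const max = Math.max.apply(null, values);
    const min = Math.min.apply(null, values);
    for (const word of words) {
        weights[word] = max + min - counts[word];
    }
    return weights;
}
```

With this scheme the weights stay within the same range as the original counts, so a renderer that maps counts to font sizes could take the inverted weights without further scaling.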

Why Use GitHub Branches as the Data Source

An alternative that was considered was to save questions and answers in a third-party database such as Fauna, with Netlify functions interacting with it through the Fauna API.

The final decision was to use GitHub branches because this work will also benefit the implementation of the pluralistic data infrastructure, which was planned to use GitHub branches and pull requests for collecting community-based, user-submitted data.

Lessons Learnt during Development

Ned's comments below describe the original implementation plan. The diagram above shows the technologies used in the final product. Two parts changed during development, and here is why:

  1. Use the GitHub REST API to interact with branches and files in the remote GitHub repository rather than using GitHub Actions.
    1. Why: The API that triggers a GitHub Actions workflow only reports whether the workflow was triggered successfully. It does not report the result of the workflow execution. For example, if the workflow is meant to create a remote GitHub branch and the run fails for any reason, the API response does not indicate the failure. The API always returns success (response code 200) as long as the workflow is triggered.
  2. To poll the answers file, the client page calls a server endpoint that uses the GitHub REST API, rather than polling the file directly from https://raw.githubusercontent.com.
    1. Why: Raw GitHub URLs cannot be refreshed more than once every 5 minutes. Details of the issue with raw GitHub URLs can be found in FLUID-6626.
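
The two changes above can be sketched as request builders for the GitHub REST API. This is an illustration under stated assumptions, not the project's actual code: it uses the "create a reference" endpoint (POST /repos/{owner}/{repo}/git/refs) and the "get repository content" endpoint (GET /repos/{owner}/{repo}/contents/{path}) with the raw media type. The GitHubRequest type, the function names, and the owner, repo, branch and token parameters are hypothetical placeholders.

```typescript
// Hypothetical description of an HTTP request; not part of any GitHub SDK.
interface GitHubRequest {
    method: "GET" | "POST";
    url: string;
    headers: Record<string, string>;
    body?: string;
}

const API_ROOT = "https://api.github.com";

// Change 1: create a branch directly through the REST API.
function createBranchRequest(owner: string, repo: string, branch: string,
                             fromSha: string, token: string): GitHubRequest {
    return {
        method: "POST",
        url: API_ROOT + "/repos/" + owner + "/" + repo + "/git/refs",
        headers: {
            Accept: "application/vnd.github+json",
            Authorization: "Bearer " + token
        },
        body: JSON.stringify({ ref: "refs/heads/" + branch, sha: fromSha })
    };
}

// The response describes the operation itself: 201 means the reference
// was created, anything else means it was not.
function branchWasCreated(statusCode: number): boolean {
    return statusCode === 201;
}

// Change 2: the server endpoint reads the answers file through the REST API
// instead of the client fetching it from raw.githubusercontent.com.
function readAnswersRequest(owner: string, repo: string, branch: string,
                            uuid: string, token: string): GitHubRequest {
    const path = "src/_data/" + uuid + "-answers.json";
    return {
        method: "GET",
        url: API_ROOT + "/repos/" + owner + "/" + repo + "/contents/" + path +
            "?ref=" + encodeURIComponent(branch),
        headers: {
            // Raw media type: the response body is the file content itself.
            Accept: "application/vnd.github.raw+json",
            Authorization: "Bearer " + token
        }
    };
}
```

Keeping the builders free of network calls means a Netlify function could pass the result to whichever HTTP client it already uses, and the token would stay on the server rather than being exposed to the client page.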