
The Whole SHEbang: Sexual Health Education
The Question Bot: Comprehensive Sexual Health Answers to Any Question | Student Capstone Project
Comprehensive sexual health education is a human right, as recognized by the World Health Organization. In Canada, however, few students receive best-practice sexual education or have access to accurate, inclusive, and non-judgmental information resources. Many students turn to Google, ChatGPT, or other large language models for answers, but that information can be unreliable: it may contain misinformation, confusing ideas, or even dangerous content.
To give students a better and more confidential source, the Whole SHEbang: Sexual Health Education tasked a group of UBC Master of Data Science in Computational Linguistics students with creating a front-facing AI tool that can answer any teenager's sexual health question in a way that is scientific, provides necessary definitions, and is thoughtful of social aspects (e.g., consent, relationship context). The students also wanted a dashboard to track metrics about the questions asked (word usage, topic sorting, etc.) for future research applications.
The Whole SHEbang partners with schools to deliver a best-practice sexual health education curriculum to grades 8-12.
The students were provided with lesson plans, videos, glossaries, and a large library of real questions from students in grades 8-12 (2015-2024), along with best-practice answers.
From this, the students built a chatbot called SUE (Sexuality Understanding & Education) that gives confidential, inclusive, and accurate answers about sexual health.
Here is how SUE works. Users talk to SUE, and their messages pass through a reverse proxy, which routes incoming traffic to the appropriate service, handles load balancing, and adds a layer of security by hiding the internal architecture and filtering malicious requests. Three parts run SUE: the frontend, which shows the website to the user; the backend, where SUE searches through the resources the partner provided to find answers and then puts them into friendly, readable language; and ChromaDB, which stores and retrieves the information SUE needs. All three parts run on CentOS Stream 9, an operating system that helps everything work together and run smoothly. The whole system is hosted on a cloud server using AWS Lightsail, which helps keep it stable and ready for use at any time.
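To make the reverse proxy's role concrete, here is a minimal sketch of path-based routing with a basic request filter. It is an illustration only, not the team's actual configuration: the route table, the service addresses, and the function names are all assumptions.

```python
# Hypothetical route table: URL path prefix -> internal service address.
# These addresses are illustrative assumptions, not SUE's real setup.
ROUTES = {
    "/api/": "http://backend:8000",   # chatbot backend
    "/": "http://frontend:3000",      # website shown to the user
}


def is_allowed(path: str) -> bool:
    """Reject an obviously suspicious request (path traversal)."""
    return ".." not in path


def pick_upstream(path: str) -> str:
    """Return the internal service that should handle this path."""
    if not is_allowed(path):
        raise PermissionError(f"blocked request: {path}")
    # Try the longest prefix first so "/api/" wins over the catch-all "/".
    for prefix in sorted(ROUTES, key=len, reverse=True):
        if path.startswith(prefix):
            return ROUTES[prefix]
    raise LookupError(f"no route for {path}")


chat_target = pick_upstream("/api/chat")
page_target = pick_upstream("/index.html")
```

Because the user only ever sees the proxy's address, the internal services stay hidden, which is the security benefit described above.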
The students used a parser that extracts the content from each file the partner provided (PDF, Excel, .txt, HTML, and .mp4) and converts it into JSON format. They then used OpenAI ada-002, a text embedding model, to embed that content and store it in ChromaDB. In addition, SUE works on mobile devices and supports multiple languages.
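The ingest-and-retrieve flow described above can be sketched in a few lines. This is a simplified stand-in under stated assumptions, not the team's code: a word-count vector takes the place of the OpenAI ada-002 embedding, a small in-memory class takes the place of ChromaDB, and every name and sample text is hypothetical.

```python
import json
import math
from collections import Counter


def to_record(source: str, text: str) -> str:
    """Convert parsed file content into a JSON record."""
    return json.dumps({"source": source, "text": text})


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: plain word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Stand-in for a vector database: stores embeddings, ranks by similarity."""

    def __init__(self) -> None:
        self.items = []

    def add(self, record_json: str) -> None:
        record = json.loads(record_json)
        self.items.append((toy_embed(record["text"]), record))

    def query(self, question: str, k: int = 1) -> list:
        q = toy_embed(question)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [record for _, record in ranked[:k]]


# Hypothetical sample documents.
store = ToyVectorStore()
store.add(to_record("glossary.txt", "consent means freely agreeing to an activity"))
store.add(to_record("lesson1.pdf", "puberty brings physical changes to the body"))
best = store.query("what does consent mean")[0]
```

A real embedding model captures meaning rather than exact word overlap, so it would also match questions phrased with different vocabulary. The retrieval step (embed the question, rank stored items by similarity) has the same shape.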
The admin dashboard the students created lets administrators of SUE see, all in one place, the total chat sessions for a given time period, the top keywords used, crisis keyword flags, and the most referenced sources. All of this data can be exported to a database or a .csv file for backup or analysis. The dashboard is also the place where administrators can upload new documents to add to SUE’s database of resources.
One challenge the students faced was making the chatbot sound human. The students wanted SUE to feel like a caring adult, not a robot reading a textbook. At first, the replies were too cold. The students found that adjusting the prompts helped address this challenge.
Another challenge was filtering sensitive content. Since this is about sexual health, the students had to make sure SUE stayed safe, respectful, and age-appropriate.
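As a rough sketch of the kind of metrics described above, the snippet below counts top keywords, flags messages containing crisis terms, and exports the counts as CSV text. The stopword list, the crisis terms, and the function names are illustrative assumptions, not the dashboard's actual logic.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical word lists, chosen only for illustration.
STOPWORDS = {"the", "a", "is", "what", "how", "do", "i", "to", "and", "of", "my"}
CRISIS_TERMS = {"unsafe", "hurt", "abuse"}


def tokenize(message: str) -> list:
    """Lowercase a message and split it into words."""
    return re.findall(r"[a-z']+", message.lower())


def top_keywords(messages: list, n: int = 3) -> list:
    """Return the n most common non-stopword terms with their counts."""
    counts = Counter(
        word for m in messages for word in tokenize(m) if word not in STOPWORDS
    )
    return counts.most_common(n)


def flag_crisis(message: str) -> bool:
    """True if the message contains any crisis term."""
    return any(word in CRISIS_TERMS for word in tokenize(message))


def export_csv(rows: list) -> str:
    """Serialize (keyword, count) rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["keyword", "count"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample messages.
messages = ["What is consent?", "How do I talk about consent?", "I feel unsafe at home"]
keywords = top_keywords(messages, n=1)
flagged = [m for m in messages if flag_crisis(m)]
report = export_csv(keywords)
```

Simple keyword matching like this will miss paraphrases and can raise false alarms, so a flag is best treated as a prompt for human review rather than a verdict.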
While SUE is still learning, the students were able to show how AI can support hard conversations in safe ways.
Parts of this article first appeared on Medium in a post written by the WholeSHEbang capstone team.