AI-Powered Course Discovery

Students search by learning goals. Catalogs organize by department codes. I designed and built a system that bridges that gap: natural-language, ML-powered search across 8,900+ courses.

Timeline

4 weeks, Fall 2025

My role

Solo Design, ML Engineering, Full-Stack Development

Tech stack

Python, Scikit-learn, GPT-4, Streamlit, TF-IDF, K-Means, Random Forest, NLP

The Problem

Planning my spring schedule, I wanted something to build communication skills. Creative, engaging, manageable alongside demanding core courses. I spent an hour searching Duke's catalog; a query for "communication" returned quantum computing and network security courses. I gave up without registering.

Talking to other students, everyone described the same frustration: "I know there are great courses out there, but I can't find them."

The core issue is a mismatch. Students think "I want to learn X with Y workload." Catalogs are organized by department codes and course numbers. Discovery is nearly impossible, and great courses stay chronically under-enrolled.

Learning from Existing Solutions

This isn't just a campus problem. Coursera is actively investing in solving the same challenge at scale: matching learner queries and desires with course content in ways that go beyond what's reflected in course titles or syllabi. Their approach uses skill-based tagging and cross-subject recommendations to surface relevant courses based on learning goals rather than category browsing. LinkedIn Learning takes a similar goal-oriented approach with natural language search and ML-powered personalization.

Both platforms validate the core insight: students want to search by what they'll learn, not by navigating administrative hierarchies. The conversational, goal-focused model works.

The gap is that university catalogs present harder challenges. Unlike curated platforms with standardized metadata, university courses have inconsistent descriptions, no uniform tagging, and students face constraints these platforms don't address: prerequisites, actual workload intensity beyond credit hours, and schedule conflicts.

My approach: combine unsupervised learning to discover thematic structure without manual tagging, supervised learning to predict workload from messy text descriptions, and conversational AI to understand intent and generate clear explanations. The goal was commercial-platform ease at university-catalog scale.

Building the ML Pipeline

I worked with Penn State's complete course catalog: 8,982 courses across 300+ departments. The methodology is university-agnostic and transfers to any catalog.

Phase 1: Technical Validation

Before building anything, I needed to know what information course descriptions actually encode. I tested three supervised learning approaches:

  • Workload prediction (Random Forest): 86% accuracy. Descriptions reliably contain workload signals like "lab," "intensive" vs. "seminar," "discussion"

  • Difficulty prediction (Decision Tree): 53%. Not reliably detectable

  • Department prediction (Logistic Regression): 40%. Too noisy

This shaped the entire architecture: focus on workload prediction where the data is strong, and use clustering for subject discovery where manual labels don't exist.
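The winning approach, Random Forest over TF-IDF features, can be sketched in a few lines of scikit-learn. The descriptions and workload labels below are toy data for illustration, not the actual Penn State catalog:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Toy catalog: descriptions and labels are illustrative, not real data.
# The signal the validation found: "lab"/"intensive" vs. "seminar"/"discussion".
descriptions = [
    "Intensive lab course with weekly programming projects",
    "Discussion seminar on contemporary media",
    "Lecture with problem sets and a required lab component",
    "Reading-based seminar with short response papers",
]
workloads = ["heavy", "light", "heavy", "light"]

# TF-IDF vectorization feeding a Random Forest, mirroring the validated setup
model = make_pipeline(
    TfidfVectorizer(max_features=300),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(descriptions, workloads)

print(model.predict(["Seminar with weekly group discussion"])[0])
```

On the real 8,982-course catalog this pipeline is where the 86% accuracy figure comes from; the toy corpus here only demonstrates the shape of the approach.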

Phase 2: Discovering Natural Structure

I applied K-Means clustering (TF-IDF vectorization, 300 features, K=8) to let the algorithm find natural groupings based on content similarity rather than department labels.

Eight thematic clusters emerged. Courses from Psychology, Statistics, and Information Science clustered together when they taught similar concepts, confirming that content-based recommendation across departments can work at this scale.
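The clustering step is a direct application of scikit-learn's KMeans over TF-IDF vectors. A minimal sketch with a four-course toy corpus (the real run used 8,982 courses, 300 features, and K=8):

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Illustrative descriptions: two statistics-flavored, two psychology-flavored
descriptions = [
    "Statistical inference and regression methods for data analysis",
    "Probability, sampling, and statistical data analysis",
    "Cognitive psychology of human decision making and judgment",
    "Behavioral psychology experiments on judgment and decision making",
]

# Vectorize by content, then let K-Means find groupings without department labels
X = TfidfVectorizer(max_features=300).fit_transform(descriptions)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)
```

Note that nothing here tells the algorithm which department a course belongs to; the statistics-flavored courses end up together purely because their descriptions share vocabulary, which is exactly the cross-department behavior described above.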

Phase 3: The Full Pipeline

The system chains five stages together:

  1. Intent Understanding (GPT-4): Extracts keywords from natural language, expands them semantically ("decision-making" triggers cognitive science, behavioral economics), and detects workload preferences

  2. Thematic Search (K-Means): Narrows from 8,982 courses to ~1,200 by identifying relevant clusters. Runs once at setup, loads instantly

  3. Relevance Ranking (TF-IDF + Cosine Similarity): Ranks by semantic similarity. Courses using different vocabulary but similar concepts still surface

  4. Workload Filtering (Random Forest): Predicts effort level from text and filters to user preference

  5. Personalized Explanations (GPT-4): Generates reasoning connecting each course to user goals
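Stage 3, the relevance ranking, can be sketched as TF-IDF vectors compared by cosine similarity. The mini-catalog and course codes below are hypothetical placeholders, not entries from the real dataset:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical mini-catalog (stand-in for the ~1,200 courses left after Stage 2)
catalog = {
    "COMM 210": "Public speaking, persuasion, and presentation skills",
    "CS 465":   "Network security and applied cryptography",
    "PSY 253":  "Interpersonal communication and group dynamics",
}

# Fit on the catalog, then project the (keyword-expanded) query into the same space
vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(catalog.values())
query_vec = vectorizer.transform(["build communication and presentation skills"])

# Rank courses by cosine similarity to the query
scores = cosine_similarity(query_vec, doc_matrix).ravel()
ranked = sorted(zip(catalog, scores), key=lambda pair: -pair[1])
for course, score in ranked:
    print(f"{course}: {score:.2f}")
```

In the full pipeline the query text would already be GPT-4's expanded keyword set from Stage 1, which is what lets courses using different vocabulary than the raw query still surface.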

Interface Design

Early sketches explored traditional filtering: dropdowns for credits, departments, and difficulty. But conversations with Duke students revealed this still required knowing exactly what to look for.

The insight: students don't want to configure parameters. They want to describe goals.

Design decisions:

  • Single text input: Natural language, no filters. Lower the barrier to zero.

  • Rich result cards: Course details, match percentages, and personalized "Why this course?" reasoning in a scannable format

  • Visual context: Interactive bubble chart plotting courses in workload vs. interest match space, so students can see tradeoffs at a glance

  • Progressive disclosure: Start simple, reveal complexity only when useful

The visual hierarchy prioritized fast evaluation; students need to compare multiple options during stressful registration periods.

Post-Launch Iteration

After deploying the initial version, I tested with Duke students and iterated based on their feedback.

Keyword expansion wasn't broad enough. Students searching for "human behavior" expected psychology, sociology, anthropology, and behavioral economics, but early results were too narrow. I reworked the LLM prompt to expand intent more aggressively across related disciplines.

Results felt generic. The "Why this course?" explanations initially read like they could apply to any student. I refined the prompt to generate reasoning that directly references the user's specific query language, making each explanation feel personally relevant.

Workload signals needed recalibration. I switched TF-IDF from default weighting to binary weighting, focusing on vocabulary diversity rather than term frequency. This slightly reduced Random Forest accuracy (87.2% to 86.0%) but improved real-world recommendation relevance, a tradeoff I made deliberately.
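In scikit-learn this switch is a single parameter: `binary=True` on the vectorizer replaces raw term counts with presence/absence before IDF scaling. A small sketch with made-up descriptions shows the effect:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy descriptions: "lab" repeated to exaggerate the frequency effect
text = ["lab lab lab intensive project", "seminar discussion readings"]

default_vec = TfidfVectorizer().fit(text)
binary_vec = TfidfVectorizer(binary=True).fit(text)  # term presence, not counts

d = default_vec.transform([text[0]]).toarray()[0]
b = binary_vec.transform([text[0]]).toarray()[0]

# Both vectorizers were fit on the same corpus, so they share a vocabulary
lab_idx = default_vec.vocabulary_["lab"]
intensive_idx = default_vec.vocabulary_["intensive"]

# Default weighting: repeated "lab" outweighs "intensive".
# Binary weighting: both terms contribute equally.
print(d[lab_idx], d[intensive_idx])
print(b[lab_idx], b[intensive_idx])
```

With binary weighting, a description that says "lab" once and one that says it five times look the same to the model, which is the "vocabulary diversity over frequency" behavior described above.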

The UI needed polish. Based on feedback about readability during quick scanning, I improved visual hierarchy with better typography, line heights, and gradient backgrounds to help result cards stand out.

Impact

For students: Course selection shifts from hours of frustration to seconds of exploration. The system surfaces courses across departments students would never find through traditional browsing.

For universities: Better course-student fit means increased enrollment in overlooked courses and stronger student satisfaction. The methodology is university-agnostic: built on Penn State's 8,900-course catalog, it transfers to any institution.

What's next: Duke catalog integration, a course comparison tool for side-by-side evaluation, and mobile-responsive design.

Reflections

Let failed models guide architecture. The failed difficulty (53%) and department (40%) predictions weren't wasted work. They told me exactly where the data was reliable and where it wasn't. That negative result shaped a better system than assuming everything would work.

Design the output, not just the input. The "Why this course?" explanations are what make this system useful rather than just technically impressive. Students don't trust a match percentage alone. They need to understand the reasoning to feel confident registering.

UX and ML aren't separate disciplines. Every ML decision had a UX consequence. Switching to binary TF-IDF weighting was a modeling choice, but the reason was a user experience problem: results that felt too similar. The best systems are designed end-to-end.

Let's connect :)

Always happy to chat about design, research, or potential opportunities.

