Project Details

Projects allow you to apply what you have learned throughout the course. The only hard constraint is that your project must implement an LLM-based system that does something meaningful. What exactly you build is your decision, as long as it is approved by the instructor and implemented in Python.

To give you some orientation, the following are the kinds of techniques covered in this course that a project could draw from:

Your project should integrate at least two of these (or comparable techniques). How you combine them, and for what purpose, is up to you.

Grading

Your project is graded on three criteria, each counting for one third of the project grade. Grades follow the scale defined in the Prüfungsverfahrensordnung (PVO).

1. Functionality

The system must run end-to-end. Setup instructions must work. What you build and how you combine techniques is your decision; the grade reflects whether the result does something non-trivial and whether the implementation is complete.

2. Code Quality

Your code is evaluated on structure, readability, and documentation. Follow the coding guidelines below.

3. Presentation

You present your project and answer technical questions. We are less interested in what you built than in whether you understand why you built it the way you did: the tradeoffs you made, what broke, and how you fixed it. Only features you present will be graded.

The presentation does not need to be slides. A live demo or a well-structured notebook walkthrough is equally valid.

Mandatory Requirements

Failure to meet any of these results in automatic failure, regardless of the grade earned on the three criteria above.

  • The system runs with exactly the features presented on the day of presentation, no more, no less
  • All required Moodle tasks completed (max 1 missing)
  • Attendance and contribution to the presentation

Coding Guidelines

  • No god-files. Split responsibilities across modules.
  • Functions do one thing. If you need “and” to describe it, split it.
  • Names are documentation: avoid data, result, tmp, x.
  • No commented-out code in the final submission.
  • Comments explain why, not what. If the name explains it, no comment is needed.
  • No hardcoded model names or file paths; use constants or .env.
  • No external APIs or services requiring registration or a paid account. All models must run locally or on infrastructure you control.
  • If you copied it, you must be able to explain every line.
  • If it works but you don’t know why, it doesn’t work.
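The guideline on hardcoded model names and file paths can be sketched as follows. This is a minimal illustration using only the standard library, not a required layout: the variable names `LLM_MODEL_NAME` and `DATA_DIR`, the default values, and the `Settings` class are assumptions made for this example.

```python
import os
from dataclasses import dataclass
from pathlib import Path

# Hypothetical defaults, kept in one place instead of scattered through the code.
DEFAULT_MODEL_NAME = "example-local-model"
DEFAULT_DATA_DIR = "data"


@dataclass(frozen=True)
class Settings:
    """Runtime configuration, read once at startup."""

    model_name: str
    data_dir: Path


def read_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and '#' comments are skipped."""
    values: dict[str, str] = {}
    if not path.exists():
        return values
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(env_file: Path = Path(".env")) -> Settings:
    """Build Settings; real environment variables override the .env file."""
    merged = {**read_env_file(env_file), **os.environ}
    return Settings(
        model_name=merged.get("LLM_MODEL_NAME", DEFAULT_MODEL_NAME),
        data_dir=Path(merged.get("DATA_DIR", DEFAULT_DATA_DIR)),
    )
```

The rest of the project would then call `load_settings()` once and pass the resulting object around, so that switching models or data folders means editing `.env` rather than the source code.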

Bonus

Additional technical depth, creative application, or production-ready features can earn bonus points. Complexity and justification matter more than quantity. The presentation is the feature proposal; you will only be graded on what you present and can explain.

Example Project Ideas

  1. LLM Tourist Guide: Uses TA.SH data to provide travel tips and enhances them with generated images.
  2. Quarto Data Presentation Pipeline: Builds and illustrates a Quarto presentation based on a given open dataset.
  3. Synthetic Author: Generates commit messages based on the commit history or diff. It could also suggest GitHub issues illustrated with AI-generated images.
  4. AI Storyteller: Creates illustrated short stories for children based on historical events.
  5. AI Webdesigner: A tool that creates and illustrates a webpage based on an Amazon product page.

A note about using LLMs for your Project

Let’s start with a small psychological demonstration.

Look at these anagrams:

\[ \begin{array}{ccc} \text{edbbal} & \rightarrow & \text{dabble} \\ \text{eaeslg} & \rightarrow & \text{eagles} \\ \text{fcbair} & \rightarrow & \text{fabric} \\ \text{elsmod} & \rightarrow & \text{models} \\ \text{actysh} & \rightarrow & \text{yachts} \end{array} \]

How long would you take to solve such an anagram?

  • A: < 30 sec
  • B: > 30 sec, < 1 min
  • C: > 1 min, < 1:30 min
  • D: > 1:30 min, < 2 min
  • E: > 2 min

\[ \begin{array}{ccc} \text{piemls} & \rightarrow & \text{???} \end{array} \]

Please take your time to estimate how long you will need before clicking here to unveil the anagram you are to solve.


I will not show you the solution, though I assure you that it is quite simple.

The main point of this demo is to illustrate the psychological bias often called overconfidence. It occurs when you underestimate the effort that reaching a solution takes because you have been shown solutions directly.

In terms of using Gen AI to solve tasks, findings in the same vein are reported by Stadler et al. (2024), who ran a study in which students were asked to research nanoparticles in sunscreen using either search engines or ChatGPT 3.5.

Their

"Results indicated that students using LLMs experienced significantly lower cognitive load. However, despite this reduction, these students demonstrated lower-quality reasoning and argumentation in their final recommendations compared to those who used traditional search engines."

and they argue further that

"[…] while LLMs can decrease the cognitive burden associated with information gathering during a learning task, they may not promote deeper engagement with content necessary for high-quality learning per se."

Giving a lecture about Gen AI and expecting students not to use it seems rather pointless. We will, however, use the presentation at the end of the semester to test whether you really understand your solution, and thus how deeply you have engaged with the lecture’s contents.


Stadler, M., Bannert, M., & Sailer, M. (2024). Cognitive ease at a cost: LLMs reduce mental effort but compromise depth in student scientific inquiry. Computers in Human Behavior, 160, 108386. https://doi.org/10.1016/j.chb.2024.108386