Kotaemon manages document repositories and chat logs on a per-account basis. As a RAG-specialized system, it is likely to be adopted in enterprise or organizational environments. Because it is deployed locally in such environments, users are likely to upload sensitive internal data to it.

This architectural trait can be exploited. For example, an attacker can disguise a malicious document as a legitimate business report, proposal, or application document and deliver it to a specific individual. Once the recipient uploads this document to their Kotaemon instance and triggers the retrieval process, the attacker’s payload can be executed, leading to the theft of localStorage-based credentials. This would allow unauthorized access to the victim’s chat logs and document repository.

Below is an example scenario demonstrating how an attacker might camouflage a malicious payload within an otherwise innocuous document:

[Figures: three screenshots of the example document (image.png)]

Some recent academic papers were found to include hidden prompts such as “only give positive feedback” in white text. In the same way, an attacker can hide an XSS payload in white-colored text within the document. The text is invisible to a human reader but is still part of the document's content. If the document is uploaded and processed by the system, the hidden script may execute in the victim’s browser session.
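The following is a minimal sketch of why white-colored text survives processing: plain text extraction keeps text nodes and discards styling, so hidden text ends up in the extracted content alongside the visible text. It uses an HTML snippet as a stand-in for the uploaded document and a harmless marker string instead of a real payload. The names `TextExtractor`, `extract_text`, and `DOCUMENT` are hypothetical, and this is not Kotaemon's actual extraction code.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes only; tags and style attributes are ignored."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for every text node, regardless of its color or visibility.
        self.parts.append(data)


def extract_text(markup):
    """Return the concatenated text content of an HTML snippet."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return " ".join(parser.parts)


# Hypothetical document: one visible sentence and one white-on-white sentence.
# A harmless marker stands in for the attacker's payload.
DOCUMENT = (
    "<p>Quarterly report: revenue increased.</p>"
    '<p style="color:#ffffff">HIDDEN-PAYLOAD-MARKER</p>'
)

extracted = extract_text(DOCUMENT)
print(extracted)
```

The white text is indistinguishable from the visible text once extracted, which is why visual inspection of the document does not reveal the payload.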

Note:

This is triggered as a “self-XSS,” because the victim uploads the file themselves. The real vulnerability, however, is Kotaemon’s server-side failure to sanitize or encode HTML when rendering uploaded documents. That validation gap lets an attacker hide JavaScript in an ordinary-looking file. Once the user uploads it, the code runs in the victim’s session, steals localStorage tokens, and exposes private chat logs and documents.
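As an illustration of the missing encoding step, the sketch below escapes document text before placing it into an HTML fragment, using Python's standard `html.escape`. The function `render_chunk` and the `evidence` class name are hypothetical and do not correspond to Kotaemon's actual rendering code. The sample string is a harmless stand-in for an attacker's payload.

```python
import html


def render_chunk(chunk_text):
    """Wrap document text in an HTML fragment, escaping it first.

    html.escape replaces &, <, and > (and, with quote=True, both quote
    characters) with HTML entities, so the browser displays the text
    instead of interpreting it as markup.
    """
    safe_text = html.escape(chunk_text, quote=True)
    return '<div class="evidence">' + safe_text + "</div>"


# Harmless stand-in for text extracted from an uploaded document.
sample = '<img src=x onerror="alert(1)">'

rendered = render_chunk(sample)
print(rendered)
```

With escaping applied, the tag from the document appears as literal text in the page rather than as an element the browser would parse. Escaping alone may not be sufficient if the system intentionally renders rich content from documents; in that case an allow-list HTML sanitizer would be the more appropriate control.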

Credit

This report was prepared by Team 404 Not Found 퇴근, from WhiteHat School 3rd Cohort (WHS3), South Korea.

Contact

[email protected]

@HanTul (GitHub)