Entities.EscapeMode.none question #2223

Open
Arnauec opened this issue Nov 11, 2024 · 0 comments

Hi @jhy, and first of all, thanks for the amazing work on the library.

I'm facing a situation where I would need the Entities.EscapeMode.none option.
In this pull request you say:

If you want plain text output, use one of the .text() methods. If one of those doesn't fit the use case, I'd be happy to hear more about the use case and we can explore ways to improve those.

I was using .text() until now, but I found some unexpected behaviour. If the input is something like:
<script src=\\\"https://xss.cacker.io/\\\">removed script </script>
Then jsoup will include it in the document, and when extracting the text we get:
<script src=\"https://xss.cacker.io/\">removed script </script>

This 1) turns encoded, non-dangerous input into input that could be dangerous, and most importantly 2) modifies user input. My goal is to sanitize input while modifying it as little as possible, and only based on the rules in the Safelists. Data is not always output right after going through jsoup; sometimes it is saved to a DB, where having encoded text is not desirable.
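
To make the first point concrete, here is a minimal stdlib-only sketch of the decode and re-encode round trip. It is not jsoup's code: `decodeBasicEntities` and `escapeBasicEntities` are hypothetical helpers that handle only four entities, and the entity-encoded form of the input string is an assumption about what the original input looked like.

```java
public class EntityRoundTrip {

    // Hypothetical helper, not jsoup's implementation: decodes four basic entities.
    static String decodeBasicEntities(String s) {
        return s.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&amp;", "&"); // decode &amp; last so "&amp;lt;" becomes "&lt;", not "<"
    }

    // Hypothetical helper: the inverse of the method above.
    static String escapeBasicEntities(String s) {
        return s.replace("&", "&amp;") // escape & first so later entities are not double-escaped
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        // Assumed input: inert, entity-encoded text that only looks like a script tag.
        String input =
            "&lt;script src=&quot;https://xss.cacker.io/&quot;&gt;removed script &lt;/script&gt;";

        // Decoding (roughly what plain-text extraction does) yields markup-looking text.
        String decoded = decodeBasicEntities(input);
        System.out.println(decoded);

        // Re-escaping the decoded text restores the original, inert form.
        System.out.println(escapeBasicEntities(decoded).equals(input));
    }
}
```

The sketch only shows why decoded text has to be re-escaped before it is placed back into HTML; it does not cover the second point, keeping the stored value byte-for-byte identical to the user's input.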

I know jsoup tries to do both output encoding (producing valid HTML) and input sanitization, but in my case I only need input sanitization, so it would be really helpful to have a flag that allows that.

What would be a good way to solve this?
Thanks in advance.
