The example above uses combining characters to create an é. Emoji make heavy use of combining characters (👨👨👧👧 is made up of 11 characters: \uD83D\uDC68\u200D\uD83D\uDC68\u200D\uD83D\uDC67\u200D\uD83D\uDC67).
I have seen emoji used as CSS class names in the wild, and I think the character-escaping code does the wrong thing when cssSelector is called: it looks like it escapes every character individually, which breaks these combining sequences.
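To make the "11 characters" count concrete, here is a minimal sketch using only the Java standard library. The class and method names (`EmojiUnits`, `codeUnits`, `codePoints`) are made up for illustration and are not part of jsoup. It shows that the 11 refers to UTF-16 code units, while the same string holds 7 code points, so code that processes one `char` at a time will split the surrogate pairs.

```java
final class EmojiUnits {
    // The family emoji from this issue: four emoji joined by three
    // zero-width joiners (U+200D), written as UTF-16 escapes.
    static final String FAMILY =
        "\uD83D\uDC68\u200D\uD83D\uDC68\u200D\uD83D\uDC67\u200D\uD83D\uDC67";

    // Number of UTF-16 code units, which is what String.length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points; each surrogate pair counts once.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        System.out.println(codeUnits(FAMILY));
        System.out.println(codePoints(FAMILY));
        // Walk by code point rather than by char so surrogate pairs stay intact.
        FAMILY.codePoints().forEach(cp -> System.out.printf("U+%04X%n", cp));
    }
}
```

Iterating with `codePoints()` (or `codePointAt` plus `Character.charCount`) keeps each emoji whole, which is the unit an escaping routine would presumably want to work on.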
samshutchins changed the title from "cssSelector doesn't handle combinding characters correctly" to "cssSelector doesn't handle combining characters correctly" on Aug 4, 2023
Current jsoup: html > body > img.e\́
Chrome: body > p.e\\u0301
I don't think it's incorrect to emit it as a run of characters, and the selector does work in jsoup. We could improve this by escaping the combining form as a \u escape, as Chrome does.
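As a rough illustration of that idea, the sketch below escapes non-ASCII code points numerically instead of putting a backslash in front of each character. It is not jsoup's implementation: `CssEscapeSketch` and `escapeIdent` are hypothetical names, and the output uses the CSS syntax form of a hex escape (backslash, hex digits, then a terminating space) rather than the `\u0301` text shown for Chrome above. It is deliberately simplified and ignores edge cases such as a leading digit or U+0000; whether jsoup's selector parser accepts this form is not verified here.

```java
final class CssEscapeSketch {
    // Hypothetical helper (an assumption for illustration, not a jsoup API):
    // escapes a class name so it can be embedded in a CSS selector.
    static String escapeIdent(String ident) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < ident.length(); ) {
            // Read a whole code point so surrogate pairs are never split.
            int cp = ident.codePointAt(i);
            i += Character.charCount(cp);

            boolean asciiNameChar = cp < 0x80
                && (Character.isLetterOrDigit(cp) || cp == '-' || cp == '_');
            if (asciiNameChar) {
                // Plain ASCII name characters pass through unchanged.
                out.appendCodePoint(cp);
            } else {
                // Everything else becomes a CSS hex escape; the trailing
                // space terminates the escape so following text is not
                // misread as more hex digits.
                out.append('\\').append(Integer.toHexString(cp)).append(' ');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "e" followed by U+0301 COMBINING ACUTE ACCENT, as in this issue.
        System.out.println("img." + escapeIdent("e\u0301"));
    }
}
```

With this approach the combining accent is written out as its own numeric escape, so it no longer appears as a bare combining mark directly after a backslash.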