IDNA Utils #274
Comments
My two cents: I'm worried that attaching additional parsing to this feature will result in an implementation hazard. I was thinking about how to implement domain parsing yesterday and it would be convenient if the URL object contained that information. However, I would assume that the frequent updates to the Public Suffix List would make it hard to maintain compatibility between browsers and versions. The size and complexity of an IP parsing library is nowhere near that of Stringprep, Nameprep, and Punycode. But if it's easy to do, IP parsing is a reasonable request to make of the standard library for a web-centric programming language. WRT to which version of |
@mikewest any new thoughts on all this? I'm mainly asking the other questions since I wonder whether we should introduce a (I don't think we want ToUnicode to be configurable. Each extra bit of API surface just leads to lots of bugs. Better to start out small.) |
Also a host parser can be added to |
const host = new URLHost(rawInput)
host.toString() // probably ASCII, as per usual
host.unicode() // ToUnicode?
host.type // "ipv4", "ipv6", "domain"
Alternatively you could make ToUnicode an argument to |
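For illustration only, here is a rough sketch of how the URLHost idea above might be approximated today in Node. URLHost is a hypothetical name from this thread, not an existing API; the sketch assumes Node's `url.domainToUnicode` and `net.isIPv4` are available and leans on the existing URL parser for host parsing.

```javascript
const { domainToUnicode } = require('url');
const net = require('net');

// Hypothetical class: "URLHost" is only a name floated in this thread, not a real API.
class URLHost {
  constructor(rawInput) {
    // Reuse the URL parser's host parsing by wrapping the input in a dummy URL.
    const parsed = new URL(`https://${rawInput}/`);
    // Reject input that carried more than a bare host (credentials, port, path, ...).
    if (parsed.username || parsed.password || parsed.port ||
        parsed.pathname !== '/' || parsed.search || parsed.hash) {
      throw new TypeError(`Not a bare host: ${rawInput}`);
    }
    this.hostname = parsed.hostname;
  }

  // ASCII (Punycode) serialization, as the URL parser produces it.
  toString() {
    return this.hostname;
  }

  // Unicode form for domains; IP addresses are returned unchanged.
  unicode() {
    return this.type === 'domain' ? domainToUnicode(this.hostname) : this.hostname;
  }

  get type() {
    if (this.hostname.startsWith('[')) return 'ipv6'; // serialized IPv6 is bracketed
    if (net.isIPv4(this.hostname)) return 'ipv4';
    return 'domain';
  }
}

const host = new URLHost('español.com');
console.log(host.toString()); // xn--espaol-zwa.com
console.log(host.unicode());  // español.com
console.log(host.type);       // domain
```

Wrapping the input in a dummy URL is a shortcut, not a real host parser; an actual proposal would presumably expose the spec's host parser directly.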
Well, which "version" of ToUnicode do we want, the standard ICU implementation or what the browser URL bar displays?
I just don't think that overloading |
Thinking this over, I think it should output the standard ToUnicode function, as that's easier to standardize across environments (i.e. Node.js). |
I do think something like this would be useful, and Node's implementation seems like a reasonable justification for paving the cowpath. If WebKit and Mozilla are also interested, I think Blink would follow suit. That said, @sleevi had some concerns in #63 (comment). CCing him here. |
@indolering note that there's no such thing as "standard" ToUnicode. I think we should be using https://url.spec.whatwg.org/#concept-domain-to-unicode which we already use in various places throughout the platform. I don't think we should expose variants, which I think was @sleevi's concern in that other thread. (Also note that our host parser is very specific to DNS already, since it already involves Punycode/Nameprep due to ToASCII which is applied on input.) |
I'll take your word for it! It's my preference for a single implementation to be shared across browsers and Node. AFAIK, this isn't the case when it comes to what's displayed in the URL bar. But IDNA makes me go cross-eyed, so I'll stop inserting myself. |
Also export host parser (already in use by HTML). Fixes #274.
I created a PR for this since we've got interest now from WebKit and Chrome. I'm a little worried about all the incompatibilities we still have with IDNA, but those are also exposed in other ways already. @achristensen07 I'd appreciate review of #288 from you since you said WebKit would be interested in something like this. What should be done before landing:
|
Yeah, I do want to echo the concerns, and I'll loop @mikewest onto some design docs he may not have been aware of when he expressed support :) |
Ticket tracking discussion of restoring the URL.domainToASCII and URL.domainToUnicode functions or implementing something new.
Summary
Processing international domain name labels is tricky, slow, and requires large lookup tables. However, browsers already perform this task (typically using the ICU library) and could expose these functions to JavaScript.
The proposal to add this functionality was nixed because no major browser had implemented it. Node supports the call (<50 lines) and a WebKit developer chimed in saying it would be trivial to add.
One open question is which version of the ToUnicode function should be exposed, and whether other utility functions might be needed, such as subdomain comparison, distinguishing between domains/subdomains/TLD/public suffix, and IP address parsing.
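For reference, a minimal sketch of what the Node calls mentioned above look like in practice. It assumes a Node version whose `url` module exports `domainToASCII` and `domainToUnicode`; the example domains are illustrative only.

```javascript
const url = require('url');

// ToASCII direction: Unicode labels are converted to their Punycode form.
const ascii = url.domainToASCII('español.com');
console.log(ascii); // xn--espaol-zwa.com

// ToUnicode direction: Punycode labels are decoded back to Unicode.
const unicode = url.domainToUnicode('xn--espaol-zwa.com');
console.log(unicode); // español.com

// Invalid input yields an empty string rather than throwing.
const invalid = url.domainToASCII('xn--iñvalid.com');
console.log(JSON.stringify(invalid)); // ""
```

Note that these are module-level functions in Node, not static methods on URL as in the original proposal.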