emoji domain names with the puny package


emojis emo puny package R dns utf8

Typical Sunday night, lost in several Inception layers of “I don’t know how I got here, what I am doing here, or what I was looking for in the first place”.

So, some domain extensions allow arbitrary UTF-8 characters in the domain name, but more importantly you can have arbitrary characters in subdomains. Next thing you know, obviously I’m not going to stop at accents or cedillas: I’m on a mission to spread emojis everywhere.

Emojis are characters, just sequences of UTF-8 encoded bytes. I’ve been playing with the emo package to make it easy to include and extract emojis, and the utf8splain package to extract some information about Unicode runes (aka code points).

(s <- emo::ji_glue( "emojis :party: "))
## emojis 🎉
unclass(s)
## [1] "emojis \U0001f389 "
utf8splain::runes(s)
## utf-8 encoded string with 9 runes
## 
## U+0065             65                              01100101    Latin Small Letter E
## U+006D             6D                              01101101    Latin Small Letter M
## U+006F             6F                              01101111    Latin Small Letter O
## U+006A             6A                              01101010    Latin Small Letter J
## U+0069             69                              01101001    Latin Small Letter I
## U+0073             73                              01110011    Latin Small Letter S
## U+0020             20                              00100000    Space               
## U+1F389   F0 9F 8E 89   11110000 10011111 10001110 10001001    Party Popper        
## U+0020             20                              00100000    Space
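The same facts can be cross-checked with base R alone, without emo or utf8splain. This is only a minimal sketch: the string is typed with a `\U` escape instead of `emo::ji_glue()`, and the variable names are purely illustrative.

```r
# Same string as above, written with a \U escape instead of emo::ji_glue()
s <- "emojis \U0001f389 "

# One integer per code point (rune)
cps <- utf8ToInt(s)
length(cps)
## [1] 9

# The party popper is the 8th rune
sprintf("U+%04X", cps[8])
## [1] "U+1F389"

# ... and it takes four bytes once encoded as UTF-8
bytes <- charToRaw(enc2utf8("\U0001f389"))
bytes
## [1] f0 9f 8e 89
```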

Typically domain names are only made of boring ASCII, but punycode gives a way to encode a string that may contain any Unicode characters into a string that only has ASCII.

There are online tools to perform the encoding and decoding, and after firing up the GitHub search engine, I quickly found the simple enough punycode C library.

I knew this was going to change my day. I only wish we had been able to open the door of my coworking space without the help of a locksmith, but I guess that’s a story for another day, and that I had not agreed to install the updates on my Mac. Anyway, once the elements finally gave me a break, I was able to wrap the library in the puny package and its code and decode functions.

puny::code( "crème brûlée" )
## [1] "crme brle-13ar8s"
puny::code( emo::ji_glue("emojis :party: everywhere") )
## [1] "emojis  everywhere-3q59q"

If you add domain=TRUE, the function adds the xn-- prefix and makes something suitable for a domain name.

puny::code( emo::ji("package"), domain = TRUE )
## [1] "xn--cu8h"
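To get a feel for what happens under the hood, here is a rough pure R sketch of the encoding procedure described in RFC 3492. It is not how puny works (puny wraps a C library), and `puny_sketch()` is a made-up name for illustration. It handles a single label only and does no validation or overflow checks.

```r
# Pure R sketch of the RFC 3492 Punycode encoder (illustration only,
# hypothetical helper, not the puny implementation)
puny_sketch <- function(x, domain = FALSE) {
  base <- 36; tmin <- 1; tmax <- 26; skew <- 38; damp <- 700
  digits <- c(letters, 0:9)   # digit values 0..35 map to a-z then 0-9

  # Bias adaptation (section 6.1 of the RFC)
  adapt <- function(delta, numpoints, first) {
    delta <- if (first) delta %/% damp else delta %/% 2
    delta <- delta + delta %/% numpoints
    k <- 0
    while (delta > ((base - tmin) * tmax) %/% 2) {
      delta <- delta %/% (base - tmin)
      k <- k + base
    }
    k + ((base - tmin + 1) * delta) %/% (delta + skew)
  }

  cp <- utf8ToInt(enc2utf8(x))   # code points of the input string
  basic <- cp[cp < 128]          # ASCII code points are copied as is

  out <- intToUtf8(basic, multiple = TRUE)
  if (length(basic) > 0) out <- c(out, "-")   # delimiter after the ASCII part

  n <- 128; delta <- 0; bias <- 72
  h <- b <- length(basic)        # h counts the code points handled so far

  while (h < length(cp)) {
    m <- min(cp[cp >= n])        # next smallest code point left to encode
    delta <- delta + (m - n) * (h + 1)
    n <- m
    for (ch in cp) {
      if (ch < n) delta <- delta + 1
      if (ch == n) {
        # write delta as a variable-length integer in base 36
        q <- delta
        k <- base
        repeat {
          thr <- min(max(k - bias, tmin), tmax)
          if (q < thr) break
          out <- c(out, digits[thr + (q - thr) %% (base - thr) + 1])
          q <- (q - thr) %/% (base - thr)
          k <- k + base
        }
        out <- c(out, digits[q + 1])
        bias <- adapt(delta, h + 1, h == b)
        delta <- 0
        h <- h + 1
      }
    }
    delta <- delta + 1
    n <- n + 1
  }

  res <- paste(out, collapse = "")
  if (domain) paste0("xn--", res) else res
}

puny_sketch("\U0001f4e6", domain = TRUE)
## [1] "xn--cu8h"
puny_sketch("cr\u00e8me br\u00fbl\u00e9e")
## [1] "crme brle-13ar8s"
```

The ASCII characters are copied first, and each remaining code point is then written as a base 36 number that encodes both its value and its position, which is why the accents and emojis seem to disappear from the front of the string and turn into a suffix.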

So, for example, 📦.purrple.cat is (for now) a skeleton Hugo site generated by blogdown and deployed on Netlify. Eventually, I guess it will contain pkgdown sites for my packages, but I could not get pkgdown to work today, although I have not tried again since the issue was fixed. A nice side effect is that I discovered that pkgdown uses my highlight package, and I was able to offer a pull request because pkgdown was using the interface of highlight from before I broke it (for the greater good) last summer.

Aaaaaaanyway, to do this, I’ve had to use the encoded name (xn--cu8h) in both Netlify:

… and the DNS settings on my registrar:

xn--cu8h 1800 IN CNAME dreamy-hypatia-b7499e.netlify.com.

Unfortunately, not all browsers treat punycode the same way. On Safari, everything looks fine: I can browse to 📦.purrple.cat and that’s what it looks like.

But Chrome (at least the version I have) only 🙍 echoes the encoded subdomain.