Add a "no UTF-8 stripping URL" option #4032

@marcanuy

Description

I am working with Chinese content (using UTF-8). Most of the time Hugo generates the right URL, but sometimes it strips certain Chinese characters from the URL.

Some examples of these characters are: 〇, ○, 〡, 〤, 〢, ⺮ and 〣.

When generating a page for each character, e.g. example.com/post/〇, it generates empty paths such as example.com/post//.

Steps

To reproduce the bug, add

slug: "foo〇○〡〤〢⺮〣21三bar"

in the front matter of any page. Hugo will then generate the following stripped path:

http://localhost:1313/post/foo21三bar/

removing 〇○〡〤〢⺮〣.

*Tested with latest Hugo release: Hugo Static Site Generator v0.30.2 linux/amd64 BuildDate: 2017-10-19T08:34:27-03:00, OS: 4.10.0-37-generic #41-Ubuntu SMP Fri Oct 6 20:20:37 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux Ubuntu 17.04*

(x-post: stackoverflow.com, forum)
