/ Recaf Public

Russian characters in strings are represented as \uXXXX #160

@artemking4

Description

Russian characters in strings are represented as \uXXXX

To Reproduce
Steps to reproduce the behavior:

  1. Open any class that contains non-English characters
  2. See \uXXXX escapes instead of the actual characters

Activity

Col-E (Owner) commented on Jul 25, 2019

It's much easier to modify unicode characters in this escaped representation. A lot of people can't easily create these characters on their machines. Plus then you don't have to worry that a person's machine doesn't support rendering the character (font).

In a read-only context showing the unicode literal would make sense, for instance in the decompiled output. That's up to the decompiler how to print those characters though.

See below

andylizi (Contributor) commented on Jul 25, 2019

> It's much easier to modify unicode characters in this escaped representation

But how can you modify anything if you can't read it in the first place?

> A lot of people can't easily create these characters on their machines.

If you don't have the proper program to input those characters (e.g. an IME), then why would you try to write them? And you only need Backspace to delete them, regardless of language, anyway.
It's not like it's easy to write in \u5929\u6c14\u4e0d\u9519. Who, except a computer, can read or write in that?

> Plus then you don't have to worry that a person's machine doesn't support rendering the character (font).

A Unicode test page

Nope. Welcome to the 21st century.

Col-E (Owner) commented on Jul 25, 2019

I can accept when I'm wrong.

My mindset comes from working off of intentionally obfuscated code abusing unicode characters such as in the attached sample: calc-obf.zip

Moving forward, an option to use either representation of international characters will be available, with the literal characters as the default.
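
For readers following along, here is a minimal sketch of what switching between the two representations could look like. It is not taken from Recaf; the class and method names (`UnicodeEscapes`, `present`, `escapeNonAscii`, `unescapeUnicode`) are hypothetical, and the conversion is deliberately simplified: it works on UTF-16 code units and does not try to distinguish an escaped backslash followed by `u` from a real escape.

```java
// Hypothetical helper, not Recaf's actual code.
final class UnicodeEscapes {
    private UnicodeEscapes() {}

    /** Replaces every char above 0x7E with a backslash-u escape (4 lowercase hex digits). */
    static String escapeNonAscii(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7E) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Turns backslash-u escapes followed by 4 hex digits back into the characters they name. */
    static String unescapeUnicode(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            if (text.startsWith("\\u", i) && i + 6 <= text.length() && isHex(text, i + 2)) {
                sb.append((char) Integer.parseInt(text.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                sb.append(text.charAt(i));
                i++;
            }
        }
        return sb.toString();
    }

    /** Returns true if the 4 chars starting at the given index are all hex digits. */
    private static boolean isHex(String text, int start) {
        for (int i = start; i < start + 4; i++) {
            if (Character.digit(text.charAt(i), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    /** Picks the display form: escaped if the (assumed) user option is on, literal otherwise. */
    static String present(String text, boolean escapeUnicode) {
        return escapeUnicode ? escapeNonAscii(text) : text;
    }
}
```

With the flag off, the string is shown as written; with it on, a reader dealing with obfuscated identifiers still gets the escaped form. Applying `unescapeUnicode` to the escaped example quoted earlier in this thread yields four readable Chinese characters.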

Col-E (Owner) commented on Jul 25, 2019

The way StringEscapeUtils is used in a81a41d will have to change. It was initially added to properly escape/unescape \n, but it also escapes international characters, which is why we're here. This behavior is unintentional.
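
To illustrate the distinction, here is a minimal sketch of an escaper that handles only control characters, quotes, and backslashes, and passes everything else through untouched. It is an assumption about what a fix could look like, not the change that was actually made; `ControlEscaper` and `escapeControl` are hypothetical names. For comparison, Apache Commons' `StringEscapeUtils.escapeJava` also converts non-ASCII characters to `\uXXXX`, which matches the behavior reported in this issue.

```java
// Hypothetical helper, not Recaf's actual code.
final class ControlEscaper {
    private ControlEscaper() {}

    /**
     * Escapes newline, carriage return, tab, double quote, and backslash.
     * All other characters, including non-ASCII ones, are copied as-is.
     */
    static String escapeControl(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

A string containing Cyrillic text and a line break would come out with the line break shown as `\n` and the Cyrillic letters unchanged.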

So for that I'm adding the bug label. As mentioned in the comment just above, I'm also adding enhancement since we'll want to add the ability to opt into this behavior.

Col-E (Owner) commented on Jan 27, 2020