utf 8 - Understanding file encodings
In Eclipse, I have a file in which, at one place, the following is written:
onclick='obj1.help_open_new_window(fn1(), "/redir/url_name")'
In Eclipse's Edit menu -> Set Encoding dialog, I can see the file's current encoding. When I change the encoding to UTF-8 using the same dialog box, the text changes to:
onclick='obj1.help_open_new_window(fn1(),�"/redir/url_name")'
All I know is that when this does not happen, the website works fine. Why is this happening, and how do I prevent it?
I have some knowledge of encodings from "Â and nbsp mystery explained" and "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)", but I still do not understand why this is happening. Feel free to go to the byte level (how the file is stored) to explain it.
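Update: here's what I understand. If the file is encoded in Latin-1, every character is one byte, and the space should be hex(32), i.e. 0x20. When I convert to UTF-8, it should still remain hex(32). Since the text changes anyway, this leads me to believe that in Latin-1 the character is not hex(32) but a combination of 2 bytes. How is that possible?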
The character you have between the comma and the quote appears to be not a normal space but some other whitespace character, probably the famous U+00A0 NO-BREAK SPACE. Since the file is encoded in Latin-1, that character is stored on disk as the single byte \xa0, which on its own does not form a valid character in UTF-8. This means that if you reload the file in the editor using UTF-8, you see the universal replacement character � instead. (The proper UTF-8 encoding of the no-break space is the two bytes \xc2\xa0.)
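The byte-level behavior described here can be checked with a few lines of Python. This is only an illustrative sketch, not what Eclipse does internally; the variable names are made up for the example, and it assumes the character in question really is U+00A0.

```python
# Illustrative sketch: the same bytes on disk, read under two encodings.
NBSP = "\u00a0"   # U+00A0 NO-BREAK SPACE (assumed to be the culprit)
SPACE = "\u0020"  # U+0020 SPACE

# A normal space is the single byte 0x20 in both Latin-1 and UTF-8.
space_latin1 = SPACE.encode("latin-1")
space_utf8 = SPACE.encode("utf-8")

# A no-break space is one byte in Latin-1 but two bytes in UTF-8.
nbsp_latin1 = NBSP.encode("latin-1")
nbsp_utf8 = NBSP.encode("utf-8")

# Reinterpreting the Latin-1 byte as UTF-8: 0xA0 is a continuation byte
# with no lead byte before it, so a lenient decoder substitutes U+FFFD.
misread = nbsp_latin1.decode("utf-8", errors="replace")

print(space_latin1.hex(), space_utf8.hex())
print(nbsp_latin1.hex(), nbsp_utf8.hex())
print(ascii(misread))
```

Note that changing the encoding in the editor dialog only changes how the existing bytes are interpreted; the bytes on disk stay the same until the file is saved again.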
To get rid of the problem, replace the no-break space with a normal space (U+0020). There is no reason why you should use a no-break space in this context, i.e. in program text.
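If the no-break space is hard to spot by eye, a small script can locate and replace it. The following is a minimal sketch under stated assumptions: the helper names find_nbsp, replace_nbsp, and clean_file are hypothetical, and the default encoding of Latin-1 is taken from the situation described in the question, so adjust it if your file is stored differently.

```python
from pathlib import Path

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE


def find_nbsp(text):
    """Return 1-based (line, column) positions of every no-break space."""
    return [
        (line_no, col_no)
        for line_no, line in enumerate(text.split("\n"), 1)
        for col_no, ch in enumerate(line, 1)
        if ch == NBSP
    ]


def replace_nbsp(text):
    """Replace every no-break space with a normal space (U+0020)."""
    return text.replace(NBSP, " ")


def clean_file(path, encoding="latin-1"):
    """Rewrite the file in place; return how many no-break spaces were found.

    The encoding default is an assumption based on the question.
    """
    p = Path(path)
    original = p.read_bytes().decode(encoding)
    cleaned = replace_nbsp(original)
    if cleaned != original:
        p.write_bytes(cleaned.encode(encoding))
    return len(find_nbsp(original))


# The line from the question, with the no-break space written explicitly.
sample = "onclick='obj1.help_open_new_window(fn1(),\u00a0\"/redir/url_name\")'"
print(find_nbsp(sample))
print(ascii(replace_nbsp(sample)))
```

Reading and writing raw bytes (rather than opening the file in text mode) keeps the line endings untouched, so the only change to the file is the replaced character.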