Understanding ASCII encoding across platforms?
TRS-90
2023-12-06 02:10:19 UTC
Could anyone help me understand a text file? I mean the most basic kind, one that works on every system since the 1960s. I realize that might not be possible, and that's why ASCII was invented. I've read about UTF-8, and there are certainly more encodings besides. I honestly don't even know what encoding makes this message, typed on an Apple IIgs, readable on Usenet. Is ASCII the most platform-independent? On modern systems I use VSCode, but I find even that program adds characters that show up as ? marks if I send the file to a IIgs, for example.

Thank you for reading.
fadden
2023-12-06 17:19:24 UTC
Post by TRS-90
Could anyone help me understand a text file?
Run "iconv -l" to get a brief list of character encodings. (I see about 1100 on Ubuntu Linux.)

If you stick to ASCII, your text will be readable everywhere, but most non-English languages can't be represented with the ASCII character set. Modern systems use Unicode, often with UTF-8 encoding, which was designed so that ASCII text "just works".

The Apple IIgs uses a custom locale-specific character set, often Mac OS Roman. It is based on ASCII, but has additional characters for common Latin-derived languages, plus some math symbols.
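One way to track down the characters that turn into ? marks on the IIgs is to look for any byte outside the ASCII range before sending the file. A minimal sketch, assuming a POSIX shell and grep; find_non_ascii is a name made up for this example:

```shell
# find_non_ascii FILE: print each line, prefixed with its line number,
# that contains a byte outside printable ASCII.
# Hypothetical helper for illustration, not a standard tool.
# LC_ALL=C makes grep compare raw bytes, so every byte of a multi-byte
# UTF-8 sequence counts as non-ASCII. Tabs and other control characters
# are flagged too, because the range below runs from space to tilde.
find_non_ascii() {
    LC_ALL=C grep -n '[^ -~]' "$1"
}
```

Calling it as find_non_ascii article.txt (the file name is only an example) lists the lines that need attention. Curly quotes and long dashes pasted in from other sources are often what turns up.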
TRS-90
2023-12-07 00:16:02 UTC
Post by fadden
If you stick to ASCII, your text will be readable everywhere, but most
non-English languages can't be represented with the ASCII character set.
Modern systems use Unicode, often with UTF-8 encoding, which was
designed so that ASCII text "just works".
Thank you. I've been re-typing historical articles from 1800s newspapers about the area I live in, and I'm doing it on a IIgs. I didn't realize there were so many different encodings. Your reply was helpful.
mmphosis
2023-12-08 00:32:02 UTC
That iconv command is super helpful. Thank you!


My serial card is currently in slot 3 on the Apple II:

]IN#3
]0 get a$ : ? a$; : if a$ <> chr$(4) goto 0
]RUN


In the Terminal, on the Linux platform:

./mistral-7b-instruct-v0.1-Q4_K_M-main.llamafile --temp 0.7 -r '\n' \
  -p 'Display the euro symbol.' | tee /dev/tty |
  iconv -f UTF-8 -t ASCII//TRANSLIT | tr '[:lower:]' '[:upper:]' |
  tr '\n' '\r' > /dev/ttyUSB0

Display the euro symbol.
Answer: €


On the Apple II:

DISPLAY THE EURO SYMBOL.
ANSWER: EUR
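
For anyone who wants to try the conversion part of that pipeline without a serial card attached, it can be pulled out into a small function. This is only a sketch: to_apple2 is a made-up name, it assumes an iconv that supports //TRANSLIT (glibc's does), and the reason given for the upper-case step is a guess.

```shell
# to_apple2: hypothetical wrapper around the three conversion stages
# of the pipeline above. Reads UTF-8 on stdin, writes to stdout.
#   1. iconv replaces anything outside ASCII with an ASCII approximation.
#   2. The first tr folds to upper case, presumably for an Apple II
#      that cannot display lower case.
#   3. The second tr turns each newline into a carriage return, which
#      is the line ending the Apple II uses.
to_apple2() {
    iconv -f UTF-8 -t ASCII//TRANSLIT |
        tr '[:lower:]' '[:upper:]' |
        tr '\n' '\r'
}

# Inspect the bytes that would be sent; od -c shows the final \r.
printf 'Answer: ok\n' | to_apple2 | od -c
```

Redirecting the function's output to /dev/ttyUSB0 instead of piping it to od gives the same effect as the original command.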
Colin Leroy-Mira
2024-02-01 21:35:50 UTC
Hi,
Post by mmphosis
./mistral-7b-instruct-v0.1-Q4_K_M-main.llamafile --temp 0.7 -r '\n' \
  -p 'Display the euro symbol.' | tee /dev/tty |
  iconv -f UTF-8 -t ASCII//TRANSLIT | tr '[:lower:]' '[:upper:]' |
  tr '\n' '\r' > /dev/ttyUSB0
Display the euro symbol.
Answer: €
DISPLAY THE EURO SYMBOL.
ANSWER: EUR
On a related note about iconv and Apple II,

1) For international Apple IIs, the charsets are:
French: ISO646-FR1
Spanish: ISO646-ES
Italian: ISO646-IT
German: ISO646-DE

You can use iconv -f UTF-8 -t ISO646-FR1//TRANSLIT in the same manner.
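
To see what that does to accented letters, the output bytes can be dumped with od. A small sketch, assuming an iconv that knows the ISO646-FR1 name (glibc's does); to_fr1 is just an illustrative name. In this charset é takes the code that ASCII uses for {, so the result looks odd on a UTF-8 terminal but should display correctly on a French Apple II.

```shell
# to_fr1: hypothetical helper, converts UTF-8 on stdin to ISO646-FR1,
# approximating characters the charset lacks.
to_fr1() {
    iconv -f UTF-8 -t ISO646-FR1//TRANSLIT
}

# "café" as UTF-8 (é is the two bytes \303\251 in octal).
# od -c prints each output byte; é should come out as the single
# byte that ASCII displays as "{".
printf 'caf\303\251\n' | to_fr1 | od -An -c
```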

2) By the way, glibc 2.39, released yesterday, contains a little patch
of mine that transliterates (some) emojis to ASCII:

***@a2proxy:~# echo "😉" | iconv -f UTF-8 -t ASCII//TRANSLIT
;-)

I wrote it so that my Mastodon client, which relies on a proxy for
network access, json parsing and charset change, could display common
emojis!
--
Colin
https://www.colino.net/