Reading UTF-8 char by char in C
Published at: 12/28/2024
Categories: c, utf8
Author: tallesl
Using wchar_t didn't quite work out in my tests, so I'm handling it on my own:
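#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

// https://stackoverflow.com/a/44776334
// Returns the byte length of the UTF-8 sequence that starts with c,
// or -1 if c cannot start a sequence.
int8_t utf8_length(unsigned char c) {
    // 4-byte character (11110XXX)
    if ((c & 0b11111000) == 0b11110000)
        return 4;

    // 3-byte character (1110XXXX)
    if ((c & 0b11110000) == 0b11100000)
        return 3;

    // 2-byte character (110XXXXX)
    if ((c & 0b11100000) == 0b11000000)
        return 2;

    // 1-byte ASCII character (0XXXXXXX)
    if ((c & 0b10000000) == 0b00000000)
        return 1;

    // Probably a 10XXXXXX continuation byte
    return -1;
}

int main(void)
{
    const char* filepath = "example.txt";

    FILE* file = fopen(filepath, "r");
    if (!file) {
        perror(filepath);
        exit(1);
    }

    // getc returns an int so that EOF can be told apart from a 0xFF byte
    int c;

    for (;;) {
        c = getc(file);
        if (c == EOF)
            break;

        putc(c, stdout);

        // Print the remaining bytes of this character; a stray continuation
        // byte (length -1) is passed through as a single byte
        int8_t length = utf8_length((unsigned char)c);
        while (--length > 0) {
            c = getc(file);
            if (c == EOF)
                break;
            putc(c, stdout);
        }

        // Wait for Enter before showing the next character
        getchar();
    }

    fclose(file);
    return 0;
}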
And here's my test file:
Hello, World! ๐๐
Hello
¡Hola!
Ça va?
你好
こんにちは
안녕하세요
ยฉยฎโขโโ
๐๐ข๐๐ฅโจ
โฌ๐๐ญ