OCR + SwiftUI + Japanese. Quite a training project! 😅

Published at: 3/19/2024
Categories: sideprojects, learning, japanese, mobile
Author: k_romek

Hi! It’s been a while! I’ve been learning a lot of new things lately. I came up with some OCR- and AI-related projects at work, and have been exploring SwiftUI and Flutter in more depth in my spare time. As I’m reaching a big milestone in the development of One a Day (my gratitude and positivity platform), I decided it’s time for a mini-project-sized break! :D

I like to code useful things, so rather than create yet another to-do list app, I wondered what I could make in a short time that would provide value. I’ve had fun with AI and OCR at work, and wanted to dive deeper into them whilst also sticking to mobile apps. Then I remembered a very real pain point that I haven’t been able to resolve with any existing solution.

It's a struggle

When I (try to) read Japanese books or manga, I often can’t fully understand certain kanji, words, or sentences. I can easily translate them using Google Translate, but that doesn’t give me the kanji-by-kanji breakdown I need to fully grasp the sentence structure. I want to learn, not just translate. I have access to great dictionaries and grammar resources, but they’re not very useful when I can’t even look up the problematic kanji because I don’t know its reading. What I often end up doing is this: translate with the Google app, switch to the original text, copy the kanji, paste it into a dictionary, get all the necessary information, and add it to flash cards (if I get that far). That’s quite a lot of steps, and they quickly kill any bit of motivation I have!

A shotgun to kill a fly

Here comes my project idea then! Create an app that will help me with reading native Japanese texts without the need for all those different tools.

Feature requirements:

  • Take a picture / upload a picture of Japanese writing
  • Implement OCR technology to read text from the images
  • Display this generated text with furigana to support me with reading
  • Implement a translation feature for selected words
  • Add an option to save vocabulary for future study

Some quick designs from my other half to solidify the concept and kick off the project:

[Images: three design mockups of the app]

Initial investigation / Expected problems

  1. Even though the OCR APIs I’m familiar with (such as Pen to Print) were more than capable of reading single words, they were not able to read vertical, right-to-left writing. This is going to be trickier than I initially assumed, and I will have to test more solutions. I’ve shortlisted Azure AI Vision and Google Vision, though the setup and pricing might be overkill for a small side project. OCR.Space seems promising, and the first few requests returned good results, though I did notice some small mistakes.

  2. There seem to be plenty of Japanese text analysers, dictionary packages, and learning software, so I naively assumed I would have no problems finding a furigana generator API. Wrong! My initial search left me with a few excellent web tools, which I could scrape; some Python-based open source API projects, which I could translate and host myself; and a very capable, albeit slow, ChatGPT solution I quickly put together. I’ll need to research this some more, although I’m leaning heavily towards AI at the moment, as it would allow me to reach a proof of concept (POC) very quickly.

  3. Furigana requires using ruby characters, which complicates displaying it in a mobile app. From what I’ve seen, this can be achieved using attributed strings with ruby annotations, though not without issues. Moreover, SwiftUI doesn’t support this by default, so I’ll have to look into some custom code magic to bring this to life.

  4. The translation feature should be achievable with the use of openly available XML dictionaries (JMdict for word meanings and KanjiDic for more in-depth information on kanji). It’s honestly such a relief to be able to access dictionary data so easily, especially after the struggles of having to implement Easy Polish News (my other language learning app) through web scraping!
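
To make point 4 a bit more concrete, here is a minimal sketch of what a word lookup over JMdict-style XML could look like. It is written in Python purely for brevity and is not the app’s actual code. The sample data is hand-written in the JMdict entry layout (keb for the kanji form, reb for the kana reading, gloss for the meaning), and the names build_index and lookup are made up for this illustration. The real file is much larger and also carries part-of-speech information, which is left out here.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the JMdict entry layout (an assumption for this
# sketch, not an excerpt of the real file):
#   keb   = kanji form of the word
#   reb   = kana reading
#   gloss = English meaning
SAMPLE = """<JMdict>
  <entry>
    <k_ele><keb>漢字</keb></k_ele>
    <r_ele><reb>かんじ</reb></r_ele>
    <sense><gloss>kanji</gloss><gloss>Chinese character</gloss></sense>
  </entry>
  <entry>
    <k_ele><keb>本</keb></k_ele>
    <r_ele><reb>ほん</reb></r_ele>
    <sense><gloss>book</gloss></sense>
  </entry>
</JMdict>"""


def build_index(xml_text):
    """Map every written form (kanji or kana) to its readings and glosses."""
    index = {}
    for entry in ET.fromstring(xml_text).iter("entry"):
        kanji_forms = [e.text for e in entry.iter("keb")]
        readings = [e.text for e in entry.iter("reb")]
        glosses = [e.text for e in entry.iter("gloss")]
        record = {"readings": readings, "glosses": glosses}
        # Index by kanji and by kana so either spelling can be looked up.
        for form in kanji_forms + readings:
            index.setdefault(form, []).append(record)
    return index


def lookup(index, word):
    """Return all records for a word, or an empty list if it is unknown."""
    return index.get(word, [])
```

With this sample, lookup(build_index(SAMPLE), "本") returns a single record with the reading ほん and the gloss "book". Building the index once and keeping it in memory (or in a local database) keeps each tap-to-translate lookup cheap, which is the main reason for indexing up front rather than scanning the XML every time.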

Not giving up

That’s pretty much it for now! I should have everything I need to get this working, although it will be a bit more difficult than I initially assumed. Still, it’s been fun researching all these concepts and thinking about a language from a technical point of view again. I’m looking forward to working on this. Hopefully I’ll be able to make something that will finally help me tackle those books!
