Logo

dev-resources.site

for different kinds of informations.

How to detect code language in browser

Published at
11/22/2024
Categories
tensorflow
webdev
programming
javascript
Author
ray-d-song
Author
10 person written this
ray-d-song
open
How to detect code language in browser

Repository: https://github.com/ray-d-song/guesslang-js

Demo: https://ray-d-song.github.io/guesslang-js/

Recently, I'm working on a project called EchoRSS, and I have a very wanted feature, which is to intercept external links in subscriptions (read full text, quote, etc.) and display them directly on the current page.

There is a problem that the returned HTML code block loses the language annotation (or the language was not annotated on the pre and code tags in the original code block), so it cannot be highlighted using tools like shiki or prism.js.

I found three solutions to detect code language:

1. linguist

This is a Ruby project deployed on the server, and Github uses it to detect the language composition of the repository. If you need extremely high accuracy and can be calculated on the server, this is the best solution.

2. hljs

highlight.js is a very famous web code highlighting library, and it is also the only library that provides automatic code detection.

The principle is very simple, which is to enumerate the keywords of the language, and then match them one by one with the text, and finally see which one has the highest matching degree.

hljs has four problems.

  • It requires a very long code length, and most languages require at least 300 characters to achieve a relatively good accuracy.
  • The part that detects the language is not a separate module, but tightly coupled with the parser and render, and the code is also very imperative, making it difficult to extract useful parts.
  • If you don't extract the detection module, the original format (line breaks and indentation) of the code will be lost when using hljs to highlight.
  • It requires a lot of regular matching, the performance is poor, and because of reason 2, it cannot be run in a web worker.

3. guesslang

guesslang is a machine learning project based on tensorflow.js.

Microsoft ported this project to node.js in 2021 and added the automatic language detection function to vscode.

A Vietnamese guy hieplpvip three years ago also ported this project to the browser, but there are also three problems:

  • Memory leak, memory leak...
  • Only supports the <script> tag to introduce the umd format, does not support esm, does not support bundle
  • Similarly, because of reason 2, it does not support web worker

The guy has not maintained this project, and the feat request to support esm in March has not been replied.

So I extracted the detection module from hljs, and forked guesslang-js to fix the above problems, and in the end guesslang won, the result is this:
https://github.com/ray-d-song/guesslang-js

I think it's too much to talk about, maybe someone will need it in the future, so I'll post it.

If someone knows tensorflow.js, I hope they can recommend some learning materials, I want to further modify it to web gpu calculation to improve efficiency.

tensorflow Article's
30 articles in total
Favicon
Binary classification with Machine Learning: Neural Networks for classifying Chihuahuas and Muffins
Favicon
Tensorflow on AWS
Favicon
How to Use TensorFlow.js for Interactive Intelligent Web Experiences
Favicon
How to Exit Full Screen on Mac: A Step-by-Step Guide
Favicon
Sound Event Detection
Favicon
# 🚀 Why PyTorch Stole the Spotlight from TensorFlow
Favicon
Why PyTorch Stole the Spotlight from TensorFlow?
Favicon
7 Cutting-Edge AI Frameworks Every Developer Should Master in 2024
Favicon
TensorFlow Learning Paths: Free Online Resources for AI Enthusiasts
Favicon
Using Open Source Models
Favicon
From Data to Deployment
Favicon
How to Authenticate Your Google Cloud Project Without a Migraine
Favicon
Top Python Libraries Every Developer Should Know
Favicon
How to detect code language in browser
Favicon
Binarized Neural Network (BNN) on MNIST Dataset
Favicon
Building a Personalised AI Chatbot with GPT and Gradio
Favicon
Lesson 12 - What is TensorFlow?
Favicon
Homemade GPT JS
Favicon
Creating blurred or virtual backgrounds in real-time video in React apps
Favicon
Neural Networks Simplified: The Future Beyond Traditional ML
Favicon
Build, Innovate & Collaborate: Setting Up TensorFlow for Open Source Contribution! 🚀✨
Favicon
Bridging Machine Learning with TensorFlow: From Python to JavaScript
Favicon
Improve container build time by 70% w/ better caching
Favicon
Unlocking Machine Learning in the Browser with TensorFlow.js
Favicon
Top Data Science Tools in 2024: A Comparative Review of the Best Software
Favicon
TensorFlow vs PyTorch: Which Should You Use?
Favicon
TensorFlow vs. PyTorch: Which Deep Learning Framework is Right for You?
Favicon
Keras: Understanding the Basics with a Detailed Example
Favicon
Guide to VisualGestures.js: Revolutionizing Web Interaction with Hand Gestures
Favicon
Verbose in Machine Learning

Featured ones: