Developing CROML (Crochet Obvious Minimal Language)
Back Story
During my time at Apple Academy, my colleagues and I developed an app about crocheting. The app is designed to help crocheters track their stitches and rows. In the app, we have a feature that lets users take a picture of an Amigurumi pattern and then breaks it down into individual stitches. The goal is to let users know which stitch they need to work at any given moment.
Crochet
For you non-crocheters, let me give you some context. A pattern in crocheting is a "rule" that you follow to create a certain shape; it specifies which type of stitch to use, how many times to work it, and in which row. Amigurumi is a type of crochet. It's basically a doll.
Let me give you an example of an Amigurumi pattern:
Here's another one:
You see, patterns can be very different from one another; they use multiple abbreviations, and sometimes they include natural language.
Challenge
As I mentioned before, one of the features in our app lets the user take a picture or upload a screenshot of a pattern. From that picture, the app should break the pattern down into individual stitches that the user can follow one at a time.
I faced two problems while developing this feature:
- How to extract the crochet pattern string from an image.
- How to convert that extracted string, which is a semi-structured language and sometimes has natural language, to data that a computer can understand.
The first problem, extracting the string from the image, is pretty straightforward. I use the Apple Vision Framework to detect text in an image; it's easy! The second problem is trickier. I tried using Regex and Named Entity Recognition, and it just didn't work, because patterns differ too much from one to the next. Also, the Vision Framework sometimes outputs incorrect data; for example, it sometimes detects "S" as "5".
For that reason, the obvious solution that came to mind was to use a Large Language Model to analyze the pattern and convert it to more structured data.
Why not use JSON?
The first thing that I did was create a prompt that tells the LLM to convert the crochet pattern to a JSON format. It's pretty straightforward, right? No need to process any further, just straight JSON data. Well, maybe it's true, but there are some problems with that.
Here are the first 20 lines (of 456) of the crochet pattern in JSON format. Pretty long, right?
{
  "pattern": {
    "parts": [
      {
        "name": "Head",
        "layers": [
          {
            "id": 1,
            "sequences": [
              {
                "stiches": [
                  {
                    "name": "SC",
                    "times": 6
                  }
                ],
                "repeat": 1
              }
            ]
          }
The problems with generating JSON directly with an LLM are:
- JSON is error-prone when it's generated by an LLM. Many times the JSON data that the LLM output was not valid; sometimes it was missing the closing brace (}) at the end.
- Returning pure JSON data is very token-expensive because crochet patterns can have many rows and stitches, and it also takes a very long time to generate the output.
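The first problem is easy to reproduce. Here is a minimal sketch in Python (used only for illustration; it is not the app's code) that builds one layer in the JSON shape shown above, drops the final closing brace the way a cut-off LLM response might, and checks whether each version still parses. The helper name `is_valid_json` is hypothetical.

```python
import json

# One layer ("layer 1: six single crochets") in the nested JSON shape shown above.
# The key "stiches" mirrors the spelling used in the post's JSON output.
layer_json = json.dumps(
    {"id": 1, "sequences": [{"stiches": [{"name": "SC", "times": 6}], "repeat": 1}]}
)

# The same text with the final closing brace missing.
truncated = layer_json[:-1]


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise (hypothetical helper)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(layer_json))  # True
print(is_valid_json(truncated))   # False
print(len(layer_json))            # characters needed for a single layer
```

A single missing character makes the whole document unparseable, and the character count for just one layer gives a rough feel for why a 456-line output is expensive to generate.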
The other solution is not to use JSON. I tried more minimal formats like TOML and YAML, but they just didn't feel right; they need too much indentation and are not that readable in the context of a crochet pattern.
CROML
For that reason, I decided to create my own data representation that I can easily parse locally. I came up with CROML
(Crochet Obvious Minimal Language). Why, you ask? Here's a quote from Kent Beck.
Make it work, make it right, and make it fast. – Kent Beck
Here's what CROML looks like:
Head:
1:SCx6
2:INCx6
3:(SCx1,INCx1)[r=6]
4:(SCx2,INCx1)[r=6]
5:(SCx3,INCx1)[r=6]
6-8:SCx30
9:SCx6,INCx3,(SCx1,INCx1,SCx1)[r=4],INCx3,SCx6
Ears:
1:SCx6
2:(SCx1,INCx1)[r=3]
3:SCx9
4:(SCx1,INCx1,SCx1)[r=3]
5-8:SCx12
9:(SCx2,DECx1)[r=3]
CROML Rules
1. Parts and Sections
- Each part (e.g., Body, Head, Paws, etc.) is introduced by its name (only a-zA-Z are allowed), followed by a colon (:).
- Each part contains layers. These layers are numbered sequentially and correspond to the crochet steps.
- If a series of layers have the same sequence, they can be represented as a range (e.g., 6-10 for five layers with identical stitches).
2. Stitch Notation
- Each stitch type is represented by its abbreviation (e.g., SC for single crochet, INC for increase, DEC for decrease).
- The number of times a stitch is repeated in a layer is denoted by the format StitchName x Count, where StitchName is the abbreviation and Count is the number of times the stitch is performed (e.g., SCx6 for six single crochets).
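The stitch notation above maps naturally onto a small regular expression. This is a sketch in Python (for illustration only; the post does not say what language the real parser uses), and it assumes abbreviations are uppercase letters only, as in SC, INC, and DEC. The names `STITCH_RE` and `parse_stitch` are hypothetical.

```python
import re

# Assumed token shape: uppercase abbreviation, a literal "x", then a count.
STITCH_RE = re.compile(r"([A-Z]+)x(\d+)")


def parse_stitch(token: str) -> dict:
    """Turn a token such as "SCx6" into {"name": "SC", "times": 6}."""
    match = STITCH_RE.fullmatch(token.strip())
    if match is None:
        raise ValueError(f"not a stitch token: {token!r}")
    return {"name": match.group(1), "times": int(match.group(2))}


print(parse_stitch("SCx6"))  # {'name': 'SC', 'times': 6}
```

Using `fullmatch` rather than `search` means a malformed token such as `SC6` raises an error instead of being silently accepted.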
3. Multiple Stitches in a Sequence
- When multiple stitches occur in the same sequence, group them with parentheses ( ) to show they occur together, separated by commas.
- Example: (SCx1, INCx1) means one single crochet followed by one increase.
4. Repeating Sequences
- If a sequence (group of stitches) is repeated a number of times, append [r=x], where x is the number of repetitions.
- Example: (SCx1, INCx1) [r=6] means to repeat "single crochet 1 stitch, increase 1 stitch" six times in a row.
5. Single Repetitions
- If the sequence is only done once, no [r=x] is necessary. By default, no repeat means it occurs only once.
- Example: SCx6 assumes a default repeat of 1 unless otherwise specified.
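Since the goal of the app is to show one stitch at a time, rules 3 to 5 can be illustrated by expanding a sequence into a flat list of stitches. The following Python sketch is only an illustration under my own assumptions (uppercase abbreviations, no nested parentheses); `SEQUENCE_RE` and `expand_sequence` are hypothetical names, not part of CROML itself.

```python
import re

# A sequence is either "(...)" or a bare stitch token, with an optional "[r=N]".
SEQUENCE_RE = re.compile(
    r"(?:\((?P<group>[^()]*)\)|(?P<bare>[A-Z]+x\d+))\s*(?:\[r=(?P<repeat>\d+)\])?"
)
STITCH_RE = re.compile(r"([A-Z]+)x(\d+)")


def expand_sequence(text: str) -> list[str]:
    """Expand one sequence into the individual stitches, in working order."""
    match = SEQUENCE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a sequence: {text!r}")
    group = match.group("group")
    body = group if group is not None else match.group("bare")
    repeat = int(match.group("repeat") or 1)  # rule 5: no [r=x] means once
    once = []
    for name, times in STITCH_RE.findall(body):
        once.extend([name] * int(times))
    return once * repeat


print(expand_sequence("(SCx1, INCx1) [r=2]"))  # ['SC', 'INC', 'SC', 'INC']
print(expand_sequence("SCx3"))                 # ['SC', 'SC', 'SC']
```

Note that `findall` skips anything inside the parentheses that does not look like a stitch token, so a stricter parser would want to validate the group body as well.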
6. Layer Ranges
- If multiple layers have the same stitch pattern, use a hyphen (-) to represent the range.
- Example: 6-10: SCx20 means layers 6 through 10 are all single crocheted 20 times.
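A layer line can be split into its layer numbers and its stitch body with one regular expression. This is a Python sketch for illustration; `LAYER_RE` and `parse_layer_line` are hypothetical names, and expanding a range into one entry per layer is my assumption about what a consumer of the data would want.

```python
import re

# "<start>[-<end>]: <body>", e.g. "6-10: SCx20" or "1:SCx6".
LAYER_RE = re.compile(r"(\d+)(?:-(\d+))?\s*:\s*(.+)")


def parse_layer_line(line: str) -> tuple[list[int], str]:
    """Return the layer numbers covered by the line and the raw stitch body."""
    match = LAYER_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not a layer line: {line!r}")
    start = int(match.group(1))
    end = int(match.group(2) or start)  # a single layer is a range of one
    if end < start:
        raise ValueError(f"descending layer range: {line!r}")
    return list(range(start, end + 1)), match.group(3)


print(parse_layer_line("6-10: SCx20"))  # ([6, 7, 8, 9, 10], 'SCx20')
```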
7. Order of Stitch Types
- Within parentheses, list stitches in the order they occur.
- When no parentheses are used, the notation assumes a single stitch pattern for that layer.
How About The Parser?
It's pretty straightforward. CROML is agnostic to indentation, so I just use simple regex to parse it and then convert it to JSON locally. That is much faster than waiting for the LLM to finish generating JSON data.
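To give a feel for what such a regex-based parser could look like, here is a minimal sketch in Python. It is not the app's actual parser (the post does not show it); the function names, the regexes, and the choice to expand a range like 6-8 into three separate layers are all my assumptions. The output mirrors the JSON shape shown earlier, including its "stiches" key.

```python
import json
import re

PART_RE = re.compile(r"([A-Za-z]+)\s*:")              # e.g. "Head:"
LAYER_RE = re.compile(r"(\d+)(?:-(\d+))?\s*:\s*(.+)")  # e.g. "6-8:SCx30"
# A sequence is "(...)" with an optional "[r=N]", or a bare stitch token.
SEQ_RE = re.compile(r"\(([^()]*)\)\s*(?:\[r=(\d+)\])?|([A-Z]+x\d+)")
STITCH_RE = re.compile(r"([A-Z]+)x(\d+)")


def parse_stitches(text):
    return [{"name": n, "times": int(t)} for n, t in STITCH_RE.findall(text)]


def parse_sequences(body):
    sequences = []
    for match in SEQ_RE.finditer(body):
        group, repeat, bare = match.groups()
        if bare is not None:
            sequences.append({"stiches": parse_stitches(bare), "repeat": 1})
        else:
            sequences.append(
                {"stiches": parse_stitches(group), "repeat": int(repeat or 1)}
            )
    return sequences


def parse_croml(text):
    parts = []
    for raw in text.splitlines():
        line = raw.strip()  # indentation carries no meaning
        if not line:
            continue
        part = PART_RE.fullmatch(line)
        layer = LAYER_RE.fullmatch(line)
        if part:
            parts.append({"name": part.group(1), "layers": []})
        elif layer and parts:
            start = int(layer.group(1))
            end = int(layer.group(2) or start)
            for layer_id in range(start, end + 1):
                parts[-1]["layers"].append(
                    {"id": layer_id, "sequences": parse_sequences(layer.group(3))}
                )
        else:
            raise ValueError(f"cannot parse line: {line!r}")
    return {"pattern": {"parts": parts}}


sample = """
Head:
1:SCx6
3:(SCx1,INCx1)[r=6]
6-8:SCx30
9:SCx6,INCx3,(SCx1,INCx1,SCx1)[r=4],INCx3,SCx6
"""
result = parse_croml(sample)
print(json.dumps(result["pattern"]["parts"][0]["layers"][0]))
# {"id": 1, "sequences": [{"stiches": [{"name": "SC", "times": 6}], "repeat": 1}]}
```

Because every line is parsed on its own, a malformed line can be reported or skipped individually, whereas a single missing brace invalidates an entire JSON document.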
Conclusion
Telling the LLM to generate CROML significantly reduces token usage and also speeds up generation. While it may still not be error-proof, it's better than hitting a parse error because of a single missing character, which is what happens with JSON.