CJON v1.1 (Compact JSON)
Introduction
With each new wave of IT development, we become more dependent on technology. Adverse events and large-scale incidents in communications, cybersecurity, and related areas expose both that dependence and our helplessness when technology disappears. In modern Russia, we often face limited access to telecommunications technologies, including mobile Internet shutdowns. At the same time, IoT devices, sensors, and mobile applications require data transmission to function. For example, without Internet access you cannot order a Yandex taxi, transmit data from a field weather station, or collect readings from smart-home devices outside the city.
To address this problem, I propose using a backup channel in the form of SMS messages and DTMF-based transmission (Dual-Tone Multi-Frequency—the tone-dialing technology used in telephones and other devices to transmit digital information over phone lines). The idea is to use the voice and SMS channels of the cellular network (including 2G) as fallback transport for sending data to a server that has stable Internet connectivity. To package the data into a reasonably compact form, I propose the following solution.
1. Purpose
CJON is a lightweight, compact, and human-readable format intended for use over constrained communication channels such as SMS, DTMF, and low-speed or legacy links (for example, 2G, voice, or radio channels). Its primary purpose is to transmit structured telemetry or control data in situations where traditional JSON is too bulky and binary formats are impractical or poorly readable.
2. Use Cases
- Remote telemetry for agriculture and industrial equipment;
- Emergency messages and alarms;
- Automation in low-speed or offline connectivity scenarios;
- Mobile devices transmitting structured data over SMS or voice channels;
- DTMF-based data transmission over GSM networks.
3. Syntax Overview
A CJON message consists of a sequence of elements separated by semicolons (;).
- The first element is the message type (mandatory, without a key), for example: T1;
- All remaining elements are key=value pairs; an array value is written as a comma-separated list after the key.
Example of a taxi request (f = from coordinate array, to = destination coordinate array, r = fare class):
T1;f=56.3365,36.7259;to=56.1843,36.9745;r=2
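The layout above can be sketched as a small encoder for flat messages. This is a minimal illustration, not a reference implementation: the function name encode_flat is hypothetical, and it assumes every value is a number, a RAW-safe string, or a list of such primitives.

```python
def encode_flat(msg_type, fields):
    """Build a flat CJON message: type first, then key=value elements."""
    parts = [msg_type]
    for key, value in fields.items():
        if isinstance(value, (list, tuple)):
            # Arrays of primitives become comma-separated lists.
            text = ",".join(str(v) for v in value)
        else:
            text = str(value)
        parts.append(f"{key}={text}")
    # Elements are separated by semicolons.
    return ";".join(parts)


if __name__ == "__main__":
    print(encode_flat("T1", {"f": [56.3365, 36.7259],
                             "to": [56.1843, 36.9745],
                             "r": 2}))
```

Run on the taxi request fields, this sketch reproduces the example message shown above.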
4. Value Encoding Format
RAW String (plain text without base64)
RAW strings use the part of the GSM 7-bit default alphabet that overlaps with basic ASCII, excluding the CJON reserved characters.
- Allowed: Latin letters A-Z and a-z, digits 0-9, space, and safe punctuation !"#$%&'()*+-./:<>?@.
- Forbidden in RAW: separators and service characters ; = , { } ~ and control characters 0x00–0x1F, 0x7F.
The @ symbol is allowed inside a string (for example, for email addresses), but it cannot be the first character of a value because that would conflict with the time marker.
Example:
note=station west gate
email=user@stukalin.com
Base64 String
If a value contains forbidden characters (including commas, curly braces, non-Latin characters, or line breaks), it is base64-encoded and written with the ~ prefix instead of =:
msg~0KHRgtGD0LrQsNC70LjQvSDQkNC70LXQutGB0LXQuQ==
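The choice between a RAW string and a base64 string can be sketched as follows. The names RAW_ALLOWED, is_raw_safe, and encode_string are hypothetical, and the sketch assumes the text is UTF-8 encoded before base64, which the specification does not state explicitly.

```python
import base64
import string

# Characters allowed in RAW values, taken from the list in this section.
RAW_ALLOWED = set(string.ascii_letters + string.digits + " "
                  + "!\"#$%&'()*+-./:<>?@")


def is_raw_safe(value):
    """True if the value may be written as a RAW string."""
    # '@' is allowed inside a value but not as its first character.
    return not value.startswith("@") and set(value) <= RAW_ALLOWED


def encode_string(key, value):
    """Return key=value for RAW-safe text, key~base64 otherwise."""
    if is_raw_safe(value):
        return f"{key}={value}"
    # Assumption: UTF-8 bytes are what gets base64-encoded.
    payload = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return f"{key}~{payload}"


if __name__ == "__main__":
    print(encode_string("note", "station west gate"))
    print(encode_string("k", "a,b"))  # comma forces base64
```

A value containing a comma, a brace, or a non-Latin character falls through to the base64 branch, so the caller never has to decide by hand.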
Number (integer or floating point)
a=123
temp=21.7
Date/time as a Unix timestamp: @ prefix instead of =
ts@1725609120 (2024-09-06T07:52:00Z)
Date without time (ISO 8601, YYYY-MM-DD): ^ prefix instead of =
start^2024-09-06
Array values are represented as comma-separated lists (interpreted as arrays of primitives)
tags=alpha,beta,gamma
loc=56.3365,36.7259
If you need a “real” comma inside a string, use base64.
Note: quotation marks and brackets are not used for values, which saves space and simplifies parsing. In CJON, the value type is determined by the symbol immediately following the key:
= for a RAW string or number; ~ for base64; @ for a timestamp; ^ for an ISO date.
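The type markers can be illustrated with a decoder for a single flat element. This is a sketch under stated assumptions: decode_field is a hypothetical name, base64 payloads are assumed to hold UTF-8 text, timestamps are interpreted as UTC, and a RAW value that looks numeric is returned as a number because the format gives no way to tell the two apart.

```python
import base64
import re
from datetime import date, datetime, timezone

# Key, then the first marker character, then the rest as the value text.
FIELD = re.compile(r"^([^=~@^]+)([=~@^])(.*)$", re.DOTALL)
INT = re.compile(r"-?[0-9]+")
FLOAT = re.compile(r"-?[0-9]+\.[0-9]+")


def _scalar(text):
    """Convert numeric-looking text to int or float, else keep the string."""
    if INT.fullmatch(text):
        return int(text)
    if FLOAT.fullmatch(text):
        return float(text)
    return text


def decode_field(element):
    """Decode one 'key<marker>value' element into (key, Python value)."""
    match = FIELD.match(element)
    if match is None:
        raise ValueError(f"not a CJON element: {element!r}")
    key, marker, text = match.groups()
    if marker == "~":
        return key, base64.b64decode(text).decode("utf-8")
    if marker == "@":
        return key, datetime.fromtimestamp(int(text), tz=timezone.utc)
    if marker == "^":
        return key, date.fromisoformat(text)
    # '=' marker: a scalar, or a comma-separated array of scalars.
    parts = [_scalar(p) for p in text.split(",")]
    return key, parts if len(parts) > 1 else parts[0]


if __name__ == "__main__":
    for item in ("temp=21.7", "loc=56.3365,36.7259", "start^2024-09-06"):
        print(decode_field(item))
```

Because the marker is the first of = ~ @ ^ found after the key, an @ inside a RAW value (as in an email address) or the = padding of a base64 payload does not confuse the decoder.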
Nested Structure Support
CJON supports nested objects in two notations; a serializer should emit whichever is shorter for the data at hand:
- Dot notation for keys (convenient when there are only a few values):
d.id=agro-007;d.b.v=3.7;d.b.s=ok;l.lat=56.3284;l.lon=37.4921;e.temp=23.5;e.hum=61
- Block notation with curly braces (more compact for groups of fields):
d{id=agro-007;b{v=3.7;s=ok}};l{lat=56.3284;lon=37.4921};e{temp=23.5;hum=61}
The parser must correctly understand both notations and treat them as equivalent. For arrays of objects, block notation is used without indexes:
ev{{t=start;c=1};{t=stop;c=2}}
while primitive arrays are comma-separated, as shown above.
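The equivalence of the two notations can be sketched with a small recursive-descent parser. All names here are hypothetical, and the sketch makes simplifying assumptions: values after = are kept as strings (or lists of strings), values with the ~ @ ^ markers are returned undecoded as (marker, text) tuples, and malformed input is only partially checked.

```python
def parse_message(text):
    """Return (message type, nested dict) for a CJON message."""
    msg_type, _, body = text.partition(";")
    fields, pos = _parse_object(body, 0)
    if pos != len(body):
        raise ValueError(f"unexpected {body[pos]!r} at offset {pos}")
    return msg_type, fields


def _parse_object(s, i):
    """Parse elements until '}' or end of input; return (dict, next index)."""
    obj = {}
    while i < len(s) and s[i] != "}":
        j = i
        while j < len(s) and s[j] not in "=~@^{":
            j += 1
        if j == len(s):
            raise ValueError(f"element without a value at offset {i}")
        key, marker = s[i:j], s[j]
        if marker == "{":
            if s.startswith("{{", j):      # array of objects: k{{..};{..}}
                value, i = _parse_array(s, j + 1)
            else:                          # nested object: k{..}
                value, i = _parse_object(s, j + 1)
            i = _expect(s, i, "}")
        else:
            i = j + 1
            while i < len(s) and s[i] not in ";}":
                i += 1
            raw = s[j + 1:i]
            if marker != "=":
                value = (marker, raw)      # typed value, left undecoded
            else:
                value = raw.split(",") if "," in raw else raw
        _assign(obj, key.split("."), value)
        if i < len(s) and s[i] == ";":
            i += 1
    return obj, i


def _parse_array(s, i):
    """Parse '{..};{..}' starting at the first '{'; return (list, index)."""
    items = []
    while i < len(s) and s[i] == "{":
        item, i = _parse_object(s, i + 1)
        i = _expect(s, i, "}")
        items.append(item)
        if i < len(s) and s[i] == ";":
            i += 1
    return items, i


def _expect(s, i, ch):
    if i >= len(s) or s[i] != ch:
        raise ValueError(f"expected {ch!r} at offset {i}")
    return i + 1


def _assign(obj, path, value):
    """Store value under a dotted key path, merging with existing objects."""
    for part in path[:-1]:
        obj = obj.setdefault(part, {})
    last = path[-1]
    if isinstance(value, dict) and isinstance(obj.get(last), dict):
        obj[last].update(value)
    else:
        obj[last] = value
```

With this sketch, the dot form and the block form of the same message parse to the same dictionary, and a message such as E1;ev{{t=start;c=1};{t=stop;c=2}} (E1 is a made-up message type) yields a list of two objects under the key ev.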
5. CJON vs. JSON
CJON removes quotation marks, most brackets, and spaces, which is estimated to save 30% to 50% of characters, depending on the message.
Example JSON (398 characters) -> equivalent CJON (287 characters): a saving of 111 characters (~28%), slightly below that range.
6. Comparison with Existing Approaches
There are a number of formats and methods designed to reduce the size of JSON data, especially over constrained channels (for example, in IoT, mobile, or embedded systems). Below is a brief overview of such approaches.
Binary formats
- MessagePack (msgpack.org) is a binary serialization format that encodes JSON into a compact binary representation. It typically reduces size by 30–60% compared with ordinary JSON. It is used in high-performance APIs, but it is not suitable for text-based channels (SMS, voice, DTMF);
- CBOR (Concise Binary Object Representation) (cbor.io) is an IETF format used in CoAP and other IoT protocols. It provides a high degree of compression, supports tags and extended data types, and requires a parser on the receiving side;
- UBJSON / BSON / Smile are binary alternatives to JSON aimed at specific platforms (MongoDB, Java, and so on). They are not applicable in environments that require text-based transmission or compatibility with low-level channels.
Text formats
MinJSON / SlimJSON... are informal minimalist text formats in which spaces and quotation marks are removed, keys are shortened, and non-standard syntax is used. These approaches are fragmented, not standardized, and do not support nesting.
YAML Flow Style is almost the same as my format, and it could probably have been used instead, but CJON is slightly simpler and shorter—and in this context every byte counts.
7. Extended Nesting (Block-Based Optimization)
To further reduce message size when transmitting nested dictionaries (objects), CJON allows an alternative way to represent nesting using curly braces { }.
Instead of serializing each element of a nested object via dot notation, for example:
batt.voltage=3.7;batt.status=ok
you can use a nested block if it turns out to be more compact:
batt{voltage=3.7;status=ok}
Rules of use:
- Blocks may be nested to any depth;
- Mixing dot notation and block notation is allowed;
- Inside curly braces, the same key=value structure is used, with elements separated by semicolons (;);
- The symbols { and } are reserved, so a string value that contains them must be base64-encoded (this specification defines no escape sequence);
- CJON parsers must accept both a.b=1 and a{b=1} and treat them as equivalent forms.
Example:
S1;device{id=agro-007;batt{voltage=3.7;status=ok}};env{temp=22.5;hum=60}
Equivalent dot notation form:
S1;device.id=agro-007;device.batt.voltage=3.7;device.batt.status=ok;env.temp=22.5;env.hum=60
When serializing nested objects, a CJON serializer may automatically choose between block and dot notation based on the final string length: whichever form produces the shorter message should be used.
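The length-based choice can be sketched as a greedy serializer that decides per nested object. The names serialize and _elements are hypothetical, lengths are compared in characters, and values are assumed to be numbers, RAW-safe strings, lists of such primitives, or nested dictionaries. The per-object decision is a simplification and is not guaranteed to find the global minimum.

```python
def _scalar(value):
    """Render a primitive or a list of primitives as CJON value text."""
    if isinstance(value, (list, tuple)):
        return ",".join(str(v) for v in value)
    return str(value)


def _elements(obj):
    """Return the CJON elements for a dict, picking the shorter nesting form."""
    out = []
    for key, value in obj.items():
        if not isinstance(value, dict):
            out.append(f"{key}={_scalar(value)}")
            continue
        inner = _elements(value)
        block = key + "{" + ";".join(inner) + "}"
        dotted = [f"{key}.{element}" for element in inner]
        # Keep the block form for empty objects so the key is not lost.
        if not inner or len(block) < len(";".join(dotted)):
            out.append(block)
        else:
            out.extend(dotted)
    return out


def serialize(msg_type, fields):
    """Serialize a message type and a nested dict into one CJON string."""
    return ";".join([msg_type] + _elements(fields))


if __name__ == "__main__":
    data = {"device": {"id": "agro-007",
                       "batt": {"voltage": 3.7, "status": "ok"}},
            "env": {"temp": 22.5, "hum": 60}}
    print(serialize("S1", data))
    print(serialize("S1", {"a": {"b": 1}}))  # single field: dot form wins
```

In this sketch an object with two or more fields always comes out in block notation, while an object with a single field is shorter in dot notation (a.b=1 is one character shorter than a{b=1}).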
Conclusion
CJON is a structured but compact format that serves as an alternative to bulkier formats such as JSON or XML in resource-constrained environments. It balances readability, ease of processing, and traffic savings, which makes it well suited to low-power, low-speed, or legacy systems. For JSON documents with many nested objects and text fields that require base64, CJON may end up comparable to, or even larger than, the original JSON because of base64 and nesting overhead. Even so, it retains its advantages for constrained communication channels and remains compatible with SMS use cases. Any slight increase in structure size is offset by the fact that the encoded payload is transmitted in GSM 7-bit (up to 160 characters in a single SMS) rather than UCS-2 (UTF-16, up to 70 characters in a single SMS).
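The GSM 7-bit versus UCS-2 trade-off can be made concrete with a rough SMS cost estimate. This sketch is not part of CJON: the function names are hypothetical, and the limits used (160/153 septets for GSM 7-bit, 70/67 code units for UCS-2, two septets for characters from the GSM extension table) come from the GSM short message conventions rather than from this document. It does not verify that the text actually fits the GSM 7-bit alphabet, and it ignores the rule that an escape pair cannot be split across segments.

```python
import math

# Characters from the GSM 7-bit extension table: escape + char = 2 septets.
GSM_EXTENSION = set("^{}\\[~]|€")


def sms_cost_gsm7(text):
    """Return (septets, segments) for text sent in GSM 7-bit encoding."""
    septets = sum(2 if ch in GSM_EXTENSION else 1 for ch in text)
    # 160 septets fit in one SMS; concatenated parts carry 153 each.
    segments = 1 if septets <= 160 else math.ceil(septets / 153)
    return septets, segments


def sms_cost_ucs2(text):
    """Return (code units, segments) for text sent in UCS-2 / UTF-16."""
    units = len(text.encode("utf-16-be")) // 2
    # 70 code units fit in one SMS; concatenated parts carry 67 each.
    segments = 1 if units <= 70 else math.ceil(units / 67)
    return units, segments


if __name__ == "__main__":
    print(sms_cost_gsm7("T1;f=56.3365,36.7259;to=56.1843,36.9745;r=2"))
    print(sms_cost_gsm7("batt{voltage=3.7;status=ok}"))
```

One consequence worth noting: the characters { } ~ and ^ each occupy two septets on the wire, so a serializer that targets SMS may prefer to compare notations by septet count rather than by character count.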