CJON v1.0 (Compact JSON)
While exploring the idea of taking a technological step backward to solve modern problems of alternative data transmission, I began designing a prototype of an Arduino-based device that would act as both modem and receiver. As I dug deeper into the problem, I realized that the SMS channel could also be used for short data: a single 160-character SMS (GSM 7-bit) is sufficient in many cases, and larger packets can be split into chunks and reassembled on the receiving side.
To link such fragments together, you can add metadata to each message: the sequence number, the total number of chunks, and an identifier. This is similar to the UDH (User Data Header) standard, but because using UDH requires PDU mode—which is often unavailable on Arduino—a simpler approach can be used: a custom pseudo-header format.
In this approach, each fragment begins with a special # character, followed by:
- a message identifier (2–3 characters),
- the current fragment number (1 character, one of 0-9, a-z, A-Z),
- the total number of fragments (1 character, one of 0-9, a-z, A-Z).
Example header: #001Z, where:
- 00 is the message ID (it can be generated, for example, as a compact representation of the current time on the source device),
- 1 is the fragment sequence number,
- Z is the total number of fragments.
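The pseudo-header scheme can be sketched roughly as follows. This is only an illustration under several assumptions that the text does not fix: a two-character message ID, fragment numbers counted from 1, and the counter symbols ordered 0-9, a-z, A-Z (so that Z stands for 61). The names split_message and reassemble are made up for this example.

```python
import string

# Assumed symbol order for the one-character counters: 0-9, a-z, A-Z (62 values).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
SMS_LIMIT = 160   # single-SMS limit for GSM 7-bit text
HEADER_LEN = 5    # '#' + 2-char ID + 1-char fragment number + 1-char total


def split_message(msg_id: str, payload: str) -> list[str]:
    """Split payload into SMS-sized fragments, each prefixed with a pseudo-header."""
    if len(msg_id) != 2:
        raise ValueError("this sketch assumes a 2-character message ID")
    size = SMS_LIMIT - HEADER_LEN
    chunks = [payload[i:i + size] for i in range(0, len(payload), size)] or [""]
    if len(chunks) >= len(ALPHABET):
        raise ValueError("too many fragments for a one-character counter")
    total = ALPHABET[len(chunks)]
    return [f"#{msg_id}{ALPHABET[n]}{total}{chunk}"
            for n, chunk in enumerate(chunks, start=1)]


def reassemble(fragments: list[str]) -> str:
    """Rebuild the payload from the fragments of one message, in any arrival order."""
    parts = {}
    total = None
    for frag in fragments:
        if not frag.startswith("#") or len(frag) < HEADER_LEN:
            raise ValueError("not a fragment")
        index = ALPHABET.index(frag[3])   # fragment number
        total = ALPHABET.index(frag[4])   # total number of fragments
        parts[index] = frag[HEADER_LEN:]
    if total is None or sorted(parts) != list(range(1, total + 1)):
        raise ValueError("missing fragments")
    return "".join(parts[n] for n in range(1, total + 1))
```

A real receiver would also group incoming fragments by message ID and sender before calling reassemble; the sketch leaves that out.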
Data can be transmitted not only via SMS, but also over a voice channel—for example, using DTMF signals (Dual-Tone Multi-Frequency). In that case, the data is encoded as a sequence of DTMF tones representing the message characters and transmitted as audio during a normal phone call. The receiver recognizes the tone sequence, reconstructs the message, and forwards it into the target system.
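The text does not define how characters map to tones, so the following is purely a hypothetical mapping for illustration: each ASCII character is written as two hexadecimal digits, and the 16 DTMF symbols (0-9, A-D, *, #) stand in for the 16 hex digits. It assumes the sender's hardware can generate the A-D tones, which ordinary phone keypads lack.

```python
# Hypothetical mapping: hex digits e and f are represented by the * and # tones.
HEX = "0123456789abcdef"
DTMF = "0123456789ABCD*#"


def to_dtmf(text: str) -> str:
    """Encode ASCII text as a string of DTMF symbols, two symbols per character."""
    return text.encode("ascii").hex().translate(str.maketrans(HEX, DTMF))


def from_dtmf(tones: str) -> str:
    """Decode a DTMF symbol string produced by to_dtmf back into text."""
    return bytes.fromhex(tones.translate(str.maketrans(DTMF, HEX))).decode("ascii")
```

With this mapping every character costs two tones, so a 40-character message becomes 80 tones; a denser mapping is possible but would be harder to read by eye.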
To check the integrity of data transmitted over the voice channel, it makes sense to append a checksum to the end of the message, calculated as the sum of the character codes of all transmitted characters modulo 256. This provides basic error detection. For SMS this is not required.
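A minimal sketch of the modulo-256 checksum is shown below. The sum itself follows the description above; how the checksum is attached is not specified in the text, so appending it as two uppercase hex digits is an assumption made for this example, as are the function names.

```python
def checksum(message: str) -> int:
    """Sum of the character codes of the message, modulo 256."""
    return sum(message.encode("ascii")) % 256


def append_checksum(message: str) -> str:
    """Attach the checksum as two hex digits (an assumed framing, not from the spec)."""
    return f"{message}{checksum(message):02X}"


def verify(frame: str) -> bool:
    """Return True if the last two characters match the checksum of the rest."""
    body, tail = frame[:-2], frame[-2:]
    try:
        return int(tail, 16) == checksum(body)
    except ValueError:
        return False
```

A simple sum like this catches single corrupted characters but not, for example, two characters that swapped places, since the total stays the same.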
For the data itself, I propose a modified and simplified format based on the JSON specification, but streamlined for extreme minimization of payload size (we save every character).
CJON (Compact JSON-like Object Notation) v1.0 Specification
1. Purpose
CJON is a lightweight, compact, and human-readable format intended for use over constrained channels such as SMS, DTMF, and low-speed radio links. Its primary purpose is to transmit structured telemetry or control data in situations where traditional JSON is too bulky and binary formats are impractical or poorly readable.
2. Use Cases
- Remote telemetry for agriculture and industrial equipment
- Emergency messages and alarms
- Automation in low-speed or offline connectivity scenarios
- Mobile devices transmitting structured data over SMS or voice channels
- DTMF-based data transmission over GSM networks
3. Syntax Overview
A CJON message consists of a sequence of elements separated by semicolons (;).
- The first element is the message type (mandatory, no key), for example: T1
- The remaining elements are key=value pairs; a value is either a single scalar or a comma-separated list, which is interpreted as an array.
Example request (ordering a taxi):
T1;f=56.3365,36.7259;to=56.1843,36.9745;r=2
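The top-level structure can be taken apart with a couple of string operations. The sketch below only separates the message type from the raw key and value strings; it does not interpret the values. The function name split_cjon is invented for this example.

```python
def split_cjon(message: str) -> tuple[str, list[tuple[str, str]]]:
    """Return the message type and the list of (key, raw value) pairs."""
    msg_type, *fields = message.split(";")
    pairs = []
    for field in fields:
        # Split on the first '=' only, so a value may still contain
        # Base64 padding characters.
        key, sep, raw = field.partition("=")
        if not sep:
            raise ValueError(f"element without '=': {field!r}")
        pairs.append((key, raw))
    return msg_type, pairs
```

For the taxi request above, this yields the type T1 and three pairs with the keys f, to, and r.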
4. Data Types and Rules
Keys must be alphanumeric (for compactness, it is advisable to shorten keys and keep a mapping in the target systems for proper reconstruction of JSON from CJON). Key names are case-sensitive, which expands the namespace (62 one-character keys at one hierarchy level).
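One possible shape for such a mapping is sketched below. It borrows the abbreviations from the d.b.v example used later in this specification (device, batt, voltage) and assumes, for simplicity, that one table applies to every segment of a dotted key. The table and function names are illustrative only.

```python
# Short segment -> descriptive name, shared by sender and receiver.
SEGMENT_NAMES = {"d": "device", "b": "batt", "v": "voltage"}


def expand_key(short_key: str, names: dict = SEGMENT_NAMES) -> str:
    """Replace each known short segment of a dotted key with its long name."""
    return ".".join(names.get(seg, seg) for seg in short_key.split("."))


def shorten_key(long_key: str, names: dict = SEGMENT_NAMES) -> str:
    """Inverse of expand_key, used on the sending side."""
    reverse = {long: short for short, long in names.items()}
    return ".".join(reverse.get(seg, seg) for seg in long_key.split("."))
```

Segments that are missing from the table pass through unchanged, so d.id expands to device.id.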
Values may be:
- strings, which must not contain spaces, semicolons, equal signs, or commas; text that does contain them, or characters outside the GSM 7-bit alphabet, is packed in Base64 and preceded by the ~ prefix (for example: M1;id=device-01;msg=~0JrQvtC80L7RgtGAINC80LXQvdGC0L7QstCwIDEy;level=3),
- integers or floating-point numbers,
- comma-separated value lists (interpreted as arrays).
Quotation marks and brackets are not used, in order to reduce size.
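The value rules can be turned into a small decoder along these lines. The specification does not give an exact grammar for numbers or name the text encoding inside Base64, so the regular expressions and the use of UTF-8 here are assumptions; decode_value and decode_scalar are names invented for the sketch.

```python
import base64
import re

INT_RE = re.compile(r"-?[0-9]+")
FLOAT_RE = re.compile(r"-?[0-9]+\.[0-9]+")


def decode_scalar(raw: str):
    """Interpret one item as int, float, or plain string."""
    if INT_RE.fullmatch(raw):
        return int(raw)
    if FLOAT_RE.fullmatch(raw):
        return float(raw)
    return raw


def decode_value(raw: str):
    """Decode a raw CJON value: ~Base64 string, comma-separated list, or scalar."""
    if raw.startswith("~"):
        return base64.b64decode(raw[1:]).decode("utf-8")  # UTF-8 is assumed
    if "," in raw:
        return [decode_scalar(item) for item in raw.split(",")]
    return decode_scalar(raw)
```

One consequence of typing by appearance is that an identifier such as 007 would be read as the integer 7; a receiver that needs the exact text has to know from the key that the value is a string.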
Nested structure support:
CJON supports nested objects via dotted key notation (for example, d.b.v=3.7 — device, batt, voltage).
Example:
S1;d.id=agro-007;d.b.v=3.7;d.b.s=ok;l.lat=56.3284;l.lon=37.4921;e.temp=23.5;e.hum=61
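On the receiving side, dotted keys can be folded back into nested objects. The sketch below takes an already parsed flat mapping and builds the nested structure; the function name nest is invented, and rejecting a key that is used both as a value and as a prefix is a design choice of this example rather than a rule from the specification.

```python
def nest(flat: dict) -> dict:
    """Expand keys like 'd.b.v' into nested dictionaries."""
    root = {}
    for dotted, value in flat.items():
        *parents, leaf = dotted.split(".")
        node = root
        for name in parents:
            node = node.setdefault(name, {})
            if not isinstance(node, dict):
                raise ValueError(f"{name!r} is used both as a value and as a prefix")
        node[leaf] = value
    return root
```

Applied to the S1 message above, this produces three top-level objects, d, l, and e, with b nested inside d.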
5. Conversion to JSON
CJON: T1;f=56.3365,36.7259;to=56.1843,36.9745;r=2
Equivalent JSON:
{
"t": "T1",
"f": [56.3365, 36.7259],
"to": [56.1843, 36.9745],
"r": 2
}
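The reverse direction, which a sender needs, can be sketched in the same spirit. The code below assumes the message type is stored under the key t, as in the JSON above, flattens nested objects into dotted keys, and packs any string that is not safe to send as-is into Base64 (UTF-8 assumed). Booleans and null are not defined by the specification, so the sketch rejects them. All function names are illustrative.

```python
import base64


def _is_plain(text: str) -> bool:
    """True if the string can be sent without Base64 packing."""
    return (text != "" and text.isascii() and text.isprintable()
            and not text.startswith("~")
            and not any(ch in " ;=," for ch in text))


def encode_scalar(value) -> str:
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return str(value)
    if isinstance(value, str):
        if _is_plain(value):
            return value
        return "~" + base64.b64encode(value.encode("utf-8")).decode("ascii")
    raise TypeError(f"type not covered by this sketch: {type(value).__name__}")


def flatten(obj: dict, prefix: str = ""):
    """Yield (dotted key, encoded value) pairs for a possibly nested dict."""
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        elif isinstance(value, list):
            yield path, ",".join(encode_scalar(item) for item in value)
        else:
            yield path, encode_scalar(value)


def to_cjon(obj: dict, type_key: str = "t") -> str:
    fields = dict(obj)
    msg_type = fields.pop(type_key)
    return ";".join([msg_type] + [f"{k}={v}" for k, v in flatten(fields)])
```

Note that Python prints very small or very large floats in exponent form, which a strict receiver may not accept; a production encoder would format numbers explicitly.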
6. CJON vs. JSON
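- CJON removes quotation marks, brackets, and spaces, which typically saves about 30% to 50% of characters; messages dominated by long values save less, because the values themselves are not shortened.
- Example: a 398-character JSON document becomes 287 characters of CJON, a saving of 111 characters (~28%).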
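The saving is easy to measure for any concrete message. The sketch below compares the taxi request from section 5 with its JSON equivalent serialized without whitespace, which is the most favorable case for JSON; the function name is invented for the example.

```python
import json


def size_comparison(cjon: str, obj: dict) -> tuple[int, int, float]:
    """Return (CJON length, minified JSON length, fraction saved)."""
    minified = json.dumps(obj, separators=(",", ":"))
    return len(cjon), len(minified), 1 - len(cjon) / len(minified)


cjon = "T1;f=56.3365,36.7259;to=56.1843,36.9745;r=2"
obj = {"t": "T1", "f": [56.3365, 36.7259], "to": [56.1843, 36.9745], "r": 2}
cjon_len, json_len, saved = size_comparison(cjon, obj)
print(f"{cjon_len} vs {json_len} characters, {saved:.1%} saved")
```

For this message the result is 43 characters against 61, a saving of about 30% even when the JSON is already minified.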
7. Limitations
- Maximum message length is limited by the channel; for example, a single SMS carries 140 bytes, which is 160 characters in GSM 7-bit encoding.
- True nested objects and quoted JSON strings are not supported; nesting can only be emulated via dotted keys.
- Reserved characters (=, ;, ,) cannot be used in keys or in plain values; a string that contains them must be packed in Base64 with the ~ prefix.
- The first element is always the message type (without a key).
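These limitations lend themselves to a quick pre-send check. The sketch below is one way to do it, assuming a default limit of 160 characters (the GSM 7-bit figure from the introduction) and treating a dotted sequence of alphanumeric segments as a valid key. It counts characters rather than GSM septets, so it is only an approximation; the function name validate is made up.

```python
import re

KEY_RE = re.compile(r"[A-Za-z0-9]+(?:\.[A-Za-z0-9]+)*")


def validate(message: str, max_len: int = 160) -> list[str]:
    """Return a list of human-readable problems; an empty list means the message passed."""
    problems = []
    if len(message) > max_len:
        problems.append(f"message is {len(message)} characters, limit is {max_len}")
    msg_type, *fields = message.split(";")
    if not msg_type or "=" in msg_type:
        problems.append("first element must be a message type without a key")
    for field in fields:
        key, sep, value = field.partition("=")
        if not sep or not KEY_RE.fullmatch(key):
            problems.append(f"malformed field {field!r}")
        elif not value.startswith("~") and ("=" in value or " " in value):
            problems.append(f"reserved character in plain value of {key!r}")
    return problems
```

Returning a list of problems instead of raising on the first one makes it easier to log everything that is wrong with a rejected message.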
8. Extensibility
Additional key=value pairs may be added to a message without breaking backward compatibility. Applications should ignore unknown keys to support future versions.
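In practice, ignoring unknown keys can be as simple as filtering against the set of keys a handler understands. The sketch below assumes a hypothetical handler for the taxi request that knows only f, to, and r; the extra key p in the usage example is invented to stand for a field added by a newer sender.

```python
# Keys understood by this (hypothetical) version of the taxi handler.
KNOWN_KEYS = {"f", "to", "r"}


def read_known(message: str, known: set = KNOWN_KEYS) -> tuple[str, dict]:
    """Return the message type and only those raw fields whose keys are known."""
    msg_type, *fields = message.split(";")
    accepted = {}
    for field in fields:
        key, _, raw = field.partition("=")
        if key in known:
            accepted[key] = raw   # unknown keys are skipped silently
    return msg_type, accepted


msg_type, fields = read_known("T1;f=56.3365,36.7259;to=56.1843,36.9745;r=2;p=cash")
```

Here the older handler still processes the request and simply never sees p.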
9. Comparison with Existing Approaches
There are many formats and methods designed to reduce the size of JSON data, especially for constrained channels (for example, in IoT, mobile, or embedded systems). Below is a brief overview of such approaches.
Binary formats
- MessagePack (msgpack.org) is a binary serialization format that encodes JSON into a compact binary representation. It reduces size by 30–60% compared with standard JSON. It is used in high-performance APIs, but it is not suitable for text channels (SMS, voice, DTMF).
- CBOR (Concise Binary Object Representation) (cbor.io) is an IETF-standardized format (RFC 8949) used in CoAP and other IoT protocols. It provides a compact binary encoding, supports tags and extended data types, and requires a dedicated parser on the receiving side.
- UBJSON / BSON / Smile are binary alternatives to JSON aimed at specific platforms (MongoDB, Java, and others). They are not applicable where text transmission or compatibility with low-level channels is required.
Text formats
- MinJSON / SlimJSON... are informal minimalist text formats in which spaces and quotation marks are removed, keys are shortened, and non-standard syntax is used. These approaches are fragmented, not standardized, and do not support nesting.
10. Conclusion
CJON is a structured but compact format that serves as an alternative to bulky formats such as JSON or XML in resource-constrained environments. It balances readability, ease of processing, and bandwidth savings, which makes it well suited to low-power, low-speed, or legacy systems. For complex JSON documents with many nested objects and large amounts of text, the CJON form can grow larger than the original, because dotted keys repeat their prefixes and Base64 adds overhead, but it remains compatible with SMS use cases. Because CJON packs text that falls outside the GSM 7-bit alphabet into Base64, this increase is offset by the fact that the SMS can still be sent in GSM 7-bit rather than UCS-2 (UTF-16).
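The trade-off in the last sentence can be put into numbers. A single SMS holds 160 characters in GSM 7-bit but only 70 in UCS-2, and one non-GSM character forces the whole message into UCS-2. The sketch below compares both options for a message with one text field; UTF-8 inside Base64 is an assumption, and the function name and sample field names are illustrative.

```python
import base64

GSM7_LIMIT = 160   # characters per single SMS in GSM 7-bit
UCS2_LIMIT = 70    # characters per single SMS in UCS-2


def sms_lengths(prefix: str, text: str, suffix: str) -> tuple[int, int]:
    """Return (length if sent raw in UCS-2, length if the text is Base64-packed)."""
    raw = prefix + text + suffix
    packed_text = "~" + base64.b64encode(text.encode("utf-8")).decode("ascii")
    packed = prefix + packed_text + suffix
    # Character counts only: '~' belongs to the GSM extension table and
    # occupies two positions on the wire, which this sketch does not add.
    return len(raw), len(packed)


# Six Cyrillic letters written as escapes to keep the source ASCII-only.
raw_len, packed_len = sms_lengths("M1;msg=", "\u041f\u0440\u0438\u0432\u0435\u0442", ";level=3")
print(f"raw: {raw_len}/{UCS2_LIMIT}, packed: {packed_len}/{GSM7_LIMIT}")
```

In this case the raw message uses 21 of 70 UCS-2 characters, while the packed one uses 32 of 160 GSM 7-bit characters, so the packed form consumes a smaller share of the SMS even though it is longer. The advantage shrinks as the share of non-Latin text in the message grows.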