usrnatc/nc_json

nc_json.h

Single-file streaming JSON pull-parser for embedded C++.

About

nc_json.h is a C++ library for extracting fields from JSON one byte at a time. It never owns the input, never allocates, and never needs the document to be complete or contiguous, which makes it suitable for parsing straight off a socket, a ring buffer, or a flash read on a target with a few kilobytes of RAM to spare.

The parser is a single fixed-size nc_json struct. It makes no heap allocations, no system calls, and no recursive calls; nesting is tracked in a fixed array and a bitmask. Paths are hashed at compile time (FNV-1a), so a field match at runtime is a single integer comparison rather than a string walk.
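The compile-time hashing described above can be sketched as follows. This is a minimal illustration that assumes the standard 32-bit FNV-1a constants; `fnv1a32`, `kUserName`, and `is_user_name` are hypothetical names, not part of the header, and the default NCJSON_HASH may differ in detail.

```cpp
#include <cstdint>

// 32-bit FNV-1a, evaluated at compile time (hypothetical helper, not the header's API).
constexpr std::uint32_t fnv1a32(char const* s, std::uint32_t h = 2166136261u) {
    return *s ? fnv1a32(s + 1, (h ^ static_cast<std::uint8_t>(*s)) * 16777619u) : h;
}

// The path literal is folded into a constant by the compiler...
constexpr std::uint32_t kUserName = fnv1a32("user/name");

// ...so the runtime check is one integer comparison, not a string walk.
constexpr bool is_user_name(std::uint32_t current_path_hash) {
    return current_path_hash == kUserName;
}

static_assert(fnv1a32("")  == 0x811c9dc5u, "empty input yields the offset basis");
static_assert(fnv1a32("a") == 0xe40c292cu, "matches the published FNV-1a value for \"a\"");
```

Because the hash of the literal is a constant expression, the string itself does not need to be walked at runtime for the comparison.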

Fields are matched with a small macro DSL that reads like a switch over JSON paths. Arrays are transparent in paths, so one matcher handles every element of an array regardless of its length. Strings longer than the token buffer are streamed in chunks, so arbitrarily long values pass through a small fixed buffer.

The library has no platform dependencies. With NCJSON_NO_STDLIB defined, it drops every standard library include and uses internal MemSet / MemCpy, so it builds in freestanding environments.

Installation

Copy nc_json.h into your project.

In one C++ source file, define NCJSON_IMPLEMENTATION before including the header:

#define NCJSON_IMPLEMENTATION
#include "nc_json.h"

All other files that need the API include the header without the define.

Tip

To confine all symbols to a single translation unit, NCJSON_STATIC can be defined alongside NCJSON_IMPLEMENTATION.

Overrides

The following macros can be defined before including the header to replace default dependencies or tune fixed sizes:

| Macro | Default | Purpose |
| --- | --- | --- |
| NCJSON_BUFFER_MAX_TOKENS | 64 | Token buffer size. Also sets the string-chunk size (N - 1) and the maximum length of keys, numbers, and primitives before truncation |
| NCJSON_MAX_DEPTH | 32 | Maximum object/array nesting depth. Hard ceiling of 32 (depth is tracked in a 32-bit mask) |
| NCJSON_NO_STDLIB | Undefined | Suppresses <stdint.h> / <string.h>; caller must provide u8, u16, u32, u64, i8, i16, i32, i64, b32, b8, f32, f64 typedefs |
| NCJSON_MEMSET(d, v, n) | memset | Memory fill |
| NCJSON_MEMCPY(d, s, n) | memcpy | Memory copy |
| NCJSON_HASH(s) | FNV-1a | Compile-time path-hash function |
| NCJSON_IS_WHITESPACE(c) | space / CR / LF / tab / FF / VT | Whitespace classification |
| NCJSON_IS_DIGIT(c) | 0 - 9 | Digit classification |
| NCJSON_ALLOWED_IN_NUMBER(c) | digits . e E - + | Bytes that continue a number token |
| NCJSON_ALLOWED_IN_PRIMITIVE(c) | a - z, A - Z | Bytes that continue a primitive token |
| NCJSON_ARRAY_COUNT(a) | sizeof-based | Array element-count helper |
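
For a NCJSON_NO_STDLIB build, the typedefs named in the table have to come from the caller. The sketch below shows one way to write them for common 32- and 64-bit toolchains. The widths chosen for b32 and b8 (32-bit and 8-bit integers carrying a boolean) are an assumption based on their names and should be checked against the header.

```cpp
// Caller-provided fixed-width types for a freestanding build (assumed widths).
typedef unsigned char      u8;
typedef unsigned short     u16;
typedef unsigned int       u32;
typedef unsigned long long u64;
typedef signed char        i8;
typedef short              i16;
typedef int                i32;
typedef long long          i64;
typedef i32                b32;  // assumption: boolean stored in 32 bits
typedef i8                 b8;   // assumption: boolean stored in 8 bits
typedef float              f32;
typedef double             f64;

// Fail the build on a toolchain where the built-in types have other sizes.
static_assert(sizeof(u8)  == 1 && sizeof(i8)  == 1, "8-bit types");
static_assert(sizeof(u16) == 2 && sizeof(i16) == 2, "16-bit types");
static_assert(sizeof(u32) == 4 && sizeof(i32) == 4, "32-bit types");
static_assert(sizeof(u64) == 8 && sizeof(i64) == 8, "64-bit types");
static_assert(sizeof(f32) == 4 && sizeof(f64) == 8, "floating-point types");

#define NCJSON_NO_STDLIB
// #include "nc_json.h" would follow here in a real project.
```

The static_assert lines cost nothing at runtime and turn a wrong width into a compile error rather than a silent parsing bug.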

Usage

Initialisation

A freshly zeroed parser is already a valid, initialised parser, so either form works:

nc_json parser = {};
// or
nc_json parser;
nc_json_init(&parser);

Dispatch loop

Feed one byte at a time. NCJSON_DISPATCH opens a block in which matchers are tested. Matchers are independent if statements, so more than one can fire for the same byte, and a matcher that misses costs only an integer comparison. The block must be braced.

for (char const* c = json; *c; ++c) {
    NCJSON_DISPATCH(&parser, *c) {
        // ... matchers ...
    }
}

Paths

A path is a slash-separated chain of keys, hashed at compile time. Two rules govern matching:

Arrays are transparent. "users/bio" matches bio in every object element of the users array; there is no users/0/bio syntax, and the element index is recovered separately.

Value matchers only match object members. A path-matched scalar must be the value of a key inside an object. A bare scalar inside an array (an element of [1, 2, 3]) is not path-reachable and must be caught with a raw event.
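
One way array-transparent matching could work is sketched below. It is not taken from the header: `PathTracker`, `path_hash`, and `child_hash` are hypothetical, and the sketch assumes every path segment is hashed with a leading separator so that a runtime hash can be extended key by key. Only keys add a segment; entering an array or an object just inherits the hash of the key that opened it.

```cpp
#include <cstdint>

constexpr std::uint32_t kFnvBasis = 2166136261u;
constexpr std::uint32_t kFnvPrime = 16777619u;

// Continue a 32-bit FNV-1a hash from state h over the NUL-terminated string s.
constexpr std::uint32_t fnv1a32(char const* s, std::uint32_t h = kFnvBasis) {
    return *s ? fnv1a32(s + 1, (h ^ static_cast<std::uint8_t>(*s)) * kFnvPrime) : h;
}

// Compile-time hash of a whole path such as "users/bio" (hashed as "/users/bio").
constexpr std::uint32_t path_hash(char const* path) {
    return fnv1a32(path, fnv1a32("/"));
}

// Runtime step: extend the enclosing scope's hash by "/" and one key.
constexpr std::uint32_t child_hash(std::uint32_t parent, char const* key) {
    return fnv1a32(key, fnv1a32("/", parent));
}

struct PathTracker {
    static constexpr int kMaxDepth = 32;
    std::uint32_t scope[kMaxDepth + 1] = {kFnvBasis};  // scope[d]: path hash of the container at depth d
    std::uint32_t current = kFnvBasis;                 // path hash of the most recent key
    int           depth   = 0;

    void on_key(char const* k) { current = child_hash(scope[depth], k); }
    // Objects and arrays are entered the same way: neither adds a path segment.
    void on_enter() { if (depth < kMaxDepth) scope[++depth] = current; }
    void on_leave() { if (depth > 0) current = scope[--depth]; }
};

static_assert(child_hash(path_hash("users"), "bio") == path_hash("users/bio"),
              "incremental and whole-path hashing agree");
```

With this scheme, the bio key of every element of users produces the same hash, so a single compile-time constant matches all of them.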

Value matchers

Each typed matcher declares a local of the named type, runs the body only on a match, and leaves the value in the named variable. Numbers fire on the NUMBER event, booleans on PRIMITIVE, strings on STRING. Names must be unique within a dispatch block.

NCJSON_STR ("user/name",  name)  { /* char const*, NUL-terminated, valid this byte */ }
NCJSON_I32 ("user/age",   age)   { /* i32 */ }
NCJSON_I64 ("ts",         ts)    { /* i64 */ }
NCJSON_U32 ("id",         id)    { /* u32 */ }
NCJSON_U64 ("id64",       id64)  { /* u64 */ }
NCJSON_F32 ("ratio",      r)     { /* f32 */ }
NCJSON_F64 ("precise",    p)     { /* f64 */ }
NCJSON_BOOL("enabled",    on)    { /* b32 */ }
NCJSON_NULL("middle_name")       { /* matched a JSON null; no binding */ }

Strings

Three matchers handle string values depending on length and ownership:

NCJSON_STR("bio", s) // value that fits the token buffer
NCJSON_STR_CHUNK("bio", s) // raw streaming piece (manual reassembly)
NCJSON_STR_INTO("bio", buf, sizeof buf, len) // accumulate into a caller buffer

NCJSON_STR delivers the whole value when it fits the buffer. For values that may exceed it, NCJSON_STR_INTO accumulates the chunks into a caller-owned buffer and runs its body once, when the full string is assembled and NUL-terminated. parser.TokenLength holds the valid byte count at every string event. The low-level NCJSON_STR_CHUNK / NCJSON_STR pair is available for streaming without buffering.
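
Manual reassembly from streamed pieces amounts to a bounded append. The helper below is a sketch of that step only; `append_chunk` is a hypothetical name, and truncating on overflow is an assumption of this example, since the README does not say how NCJSON_STR_INTO treats a value larger than the caller buffer.

```cpp
#include <cstddef>
#include <cstring>

// Append one streamed chunk to a caller-owned buffer and keep it NUL-terminated.
// Returns false once the value no longer fits; the buffer then holds a truncated prefix.
inline bool append_chunk(char* dst, std::size_t cap, std::size_t& len,
                         char const* chunk, std::size_t n) {
    if (cap == 0 || len >= cap) return false;  // no room even for the terminator
    std::size_t room = cap - 1 - len;          // bytes left before the terminator slot
    std::size_t take = n < room ? n : room;
    std::memcpy(dst + len, chunk, take);
    len += take;
    dst[len] = '\0';
    return take == n;
}
```

In a dispatch block, each string chunk event would pass parser.TokenBuffer and parser.TokenLength as `chunk` and `n`, and the caller would reset `len` to 0 before the next value.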

Arrays

The element index of the nearest enclosing array is available through nc_json_array_index. The outermost array's index is parser.ArrayCount[0].

NCJSON_U32("channels/eid", eid) {
    u32 channel = nc_json_array_index(&parser);   // nearest array
    u32 meter   = parser.ArrayCount[0];           // outermost array
}

Scopes

NCJSON_ON_ENTER and NCJSON_ON_LEAVE fire on entering or leaving an object or array by path. NCJSON_IN is a stateless query for whether the current position is anywhere under a path, which removes the need to track scope with a manual flag.

NCJSON_ON_ENTER("flags") { /* entered the flags object/array */ }
NCJSON_ON_LEAVE("flags") { /* left it */ }

NCJSON_ON(NCJSON_EVENT_KEY) {
    if (NCJSON_IN("flags"))
        printf("flag: %s\n", parser.TokenBuffer);
}

Raw events

NCJSON_ON(event) matches any event directly. This is how bare array scalars, dynamic keys, and structural boundaries are handled.

NCJSON_ON(NCJSON_EVENT_NUMBER) {
    if (NCJSON_IN("params")) {
        printf(
            "[%u] = %s\n", 
            nc_json_array_index(&parser), 
            parser.TokenBuffer
        );
    }
}

Events: NCJSON_EVENT_OBJECT_START, OBJECT_END, ARRAY_START, ARRAY_END, KEY, STRING, STRING_CHUNK, NUMBER, PRIMITIVE, ERROR.

Errors

NCJSON_ON_ERROR fires once when the parser enters its error state, after which it stops emitting events. The parser carries the reason, the offending byte, and a stream offset that remains correct across separately fed buffers.

NCJSON_ON_ERROR() {
    printf(
        "error %u at offset %llu, byte '%c'\n",
        parser.ErrorCode,
        (unsigned long long) (parser.BytesFed - 1),
        parser.ErrorByte
    );

    return 1;
}

Error codes: NCJSON_ERR_NONE, NCJSON_ERR_UNEXPECTED_BYTE, NCJSON_ERR_EXPECTED_KEY, NCJSON_ERR_EXPECTED_COLON, NCJSON_ERR_UNBALANCED, NCJSON_ERR_MAX_DEPTH, NCJSON_ERR_INTERNAL.
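
The reason a stream offset survives separately fed buffers is that the counter lives in the parser, not in the loop. The stand-in below only balances brackets (it would be fooled by brackets inside strings), but it counts bytes the way BytesFed is described. `MiniFeeder` and `feed_all` are hypothetical names for this illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a byte-at-a-time parser; not the nc_json struct.
struct MiniFeeder {
    std::uint64_t bytes_fed = 0;  // total bytes consumed across all buffers
    int           depth     = 0;
    bool          failed    = false;

    // Returns false when this byte puts the feeder into its error state.
    bool feed(char c) {
        if (failed) return false;
        ++bytes_fed;
        if (c == '{' || c == '[') {
            ++depth;
        } else if (c == '}' || c == ']') {
            if (--depth < 0) failed = true;  // closer without an opener
        }
        return !failed;
    }
};

// Feed one buffer and stop at the first error.
inline bool feed_all(MiniFeeder& f, char const* data, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        if (!f.feed(data[i])) return false;
    return true;
}
```

Feeding `{"a":1` and then `}}` as two buffers fails on the second closing brace, and `bytes_fed - 1` gives 7, its offset in the concatenated stream.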

Parser state

Fields that are useful to read from inside a match:

| Field | Meaning |
| --- | --- |
| TokenBuffer | NUL-terminated text of the current key, string, number, or primitive |
| TokenLength | Valid byte count in TokenBuffer at the current event |
| Depth | Current nesting depth |
| ArrayCount[d] | Element index of the array opened at depth d (ArrayCount[0] is the outermost array) |
| ObjectMask | Bit d is set when the container at depth d is an object rather than an array |
| BytesFed | Count of bytes consumed so far (offset of the current byte is BytesFed - 1) |
| ErrorCode | Reason code, valid after an ERROR event |
| ErrorByte | The byte that triggered the error |
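
How a depth counter, a per-depth element index, and a 32-bit object mask can answer "which element of the nearest array am I in?" is sketched below. The struct mirrors the fields in the table but is hypothetical; the README does not show how nc_json_array_index is implemented, so treat this as one plausible reading.

```cpp
#include <cstdint>

// Hypothetical mirror of the Depth / ArrayCount / ObjectMask fields.
struct Nesting {
    static constexpr int kMaxDepth = 32;
    std::uint32_t object_mask = 0;              // bit d set: container at depth d is an object
    std::uint32_t array_count[kMaxDepth] = {};  // element index per depth
    int           depth = 0;                    // number of open containers

    bool is_object(int d) const { return ((object_mask >> d) & 1u) != 0; }

    void open(bool object) {
        if (depth >= kMaxDepth) return;         // a real parser would raise a max-depth error here
        if (object) object_mask |=  (1u << depth);
        else        object_mask &= ~(1u << depth);
        array_count[depth] = 0;
        ++depth;
    }
    void close() { if (depth > 0) --depth; }

    // Called on a comma: advance the index only if the innermost container is an array.
    void next_element() {
        if (depth > 0 && !is_object(depth - 1)) ++array_count[depth - 1];
    }

    // Walk outward from the innermost container to the first array.
    std::uint32_t nearest_array_index() const {
        for (int d = depth - 1; d >= 0; --d)
            if (!is_object(d)) return array_count[d];
        return 0;
    }
};
```

The one-bit-per-level mask is also why the nesting limit cannot exceed 32 with a 32-bit ObjectMask.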

Examples

Energy meter array

A top-level array of meter objects, each containing a nested channels array. Bare keys (voltage) match only the meter level; the array-transparent channels/voltage matches every channel. The meter index comes from the outermost array, the channel index from the nearest array.

[
  { "eid": 704643328, "activePower": 0.000, "voltage": 237.151, "current": 0.254,
    "channels": [
      { "eid": 1778385169, "voltage": 237.151, "current": 0.254 },
      { "eid": 1778385170, "voltage":   9.994, "current": 0.278 }
    ]
  }
]

nc_json p;

nc_json_init(&p);

for (char const* c = ENERGY; *c; ++c) {
    NCJSON_DISPATCH(&p, *c) {
        NCJSON_U32("eid", mEid) {
            printf(
                "\n=== meter[%u]  eid=%u ===\n", 
                p.ArrayCount[0], 
                mEid
            );
        }

        NCJSON_F64("activePower", mP) { 
            printf("  P = %.3f W\n", mP); 
        }

        NCJSON_F32("voltage", mV) { 
            printf("  V = %.3f V\n", mV); 
        }

        NCJSON_F32("current", mI) { 
            printf("  I = %.3f A\n", mI); 
        }

        NCJSON_U32("channels/eid", cEid) {
            printf(
                "    channel[%u] eid=%u: ", 
                nc_json_array_index(&p), 
                cEid
            );
        }

        NCJSON_F32("channels/voltage", cV) { 
            printf("V=%.3f ", cV); 
        }

        NCJSON_F32("channels/current", cI) { 
            printf("I=%.3f\n", cI); 
        }

        NCJSON_ON_ERROR() { 
            return 1; 
        }
    }
}

=== meter[0]  eid=704643328 ===
  P = 0.000 W
  V = 237.151 V
  I = 0.254 A
    channel[0] eid=1778385169: V=237.151 I=0.254
    channel[1] eid=1778385170: V=9.994 I=0.278

JSON-RPC request

params is a heterogeneous array of bare scalars, which paths cannot reach, so its elements are taken from raw NCJSON_ON events scoped by NCJSON_IN. The id value is longer than the 64-byte token buffer, so it is reassembled with NCJSON_STR_INTO; a plain NCJSON_STR would deliver only its tail.

{
    "jsonrpc": "2.0",
    "id":      "d1acc980-0e4e-11e8-98f0-ab5030b47df4:d1db7aa0-0e4e-11e8-b1d9-5f0ab230c0d9",
    "method":  "example/hello",
    "params":  [ "world", 42 ]
}

nc_json p;

nc_json_init(&p);

char id[128] = {};
u32 id_len = 0;

for (char const* c = RPC; *c; ++c) {
    NCJSON_DISPATCH(&p, *c) {
        NCJSON_STR("jsonrpc", ver) { 
            printf("jsonrpc : %s\n", ver);  
        }

        NCJSON_STR("method",  meth) { 
            printf("method  : %s\n", meth); 
        }

        NCJSON_STR_INTO("id", id, sizeof(id), id_len) {
            printf("id      : %s\n", id);
            id_len = 0;
        }

        NCJSON_ON(NCJSON_EVENT_STRING) {
            if (NCJSON_IN("params")) {
                printf(
                    "param[%u]: \"%s\"  (string)\n",
                    nc_json_array_index(&p), 
                    p.TokenBuffer
                );
            }
        }

        NCJSON_ON(NCJSON_EVENT_NUMBER) {
            if (NCJSON_IN("params")) {
                printf(
                    "param[%u]: %s  (number)\n",
                    nc_json_array_index(&p), 
                    p.TokenBuffer
                );
            }
        }

        NCJSON_ON_ERROR() { 
            return 1; 
        }
    }
}

jsonrpc : 2.0
id      : d1acc980-0e4e-11e8-98f0-ab5030b47df4:d1db7aa0-0e4e-11e8-b1d9-5f0ab230c0d9
method  : example/hello
param[0]: "world"  (string)
param[1]: 42  (number)

ESP32 controller

A controller config with sensors and actuators arrays of heterogeneous objects, some omitting value. Each object's fields are stashed into scratch variables as they stream by, then a complete record is printed when the terminal key (alarm, always last) arrives. The optional value field is tracked with a presence flag. Bare type matches only the top-level device type, never the per-element types.

{
  "device": "Controller",
  "type": "ESP_32",
  "location": "/power/climatisation",
  "sensors": [
    { "name": "temp1", "type": "DS18B20", "temperature": 25,
      "status": "Temperature OK", "alarm": false }
  ],
  "actuators": [
    { "name": "fan1", "type": "PWM", "value": 100, "status": "FAN1 Working!", "alarm": false },
    { "name": "doorOpenAlarm", "type": "PINCONTROL", "status": "Door is CLOSE", "alarm": false }
  ]
}

nc_json p;

nc_json_init(&p);

char sName[32] = {};
char sType[24] = {};
char sStat[48] = {};
int  sTemp = 0;
char aName[32] = {};
char aType[24] = {};
char aStat[48] = {};
int  aValue = 0;
b32  aHasValue = FALSE;

for (char const* c = ESP; *c; ++c) {
    NCJSON_DISPATCH(&p, *c) {
        NCJSON_STR("device", dev) { 
            printf("device   = %s\n", dev); 
        }

        NCJSON_STR("type", typ) { 
            printf("type     = %s\n", typ); 
        }

        NCJSON_STR("location", loc) { 
            printf("location = %s\n\n", loc); 
        }

        NCJSON_STR ("sensors/name", sn)  { 
            snprintf(sName, sizeof sName, "%s", sn); 
        }

        NCJSON_STR ("sensors/type", st) { 
            snprintf(sType, sizeof sType, "%s", st); 
        }

        NCJSON_I32 ("sensors/temperature", st2) { 
            sTemp = st2; 
        }

        NCJSON_STR ("sensors/status", ss) { 
            snprintf(sStat, sizeof sStat, "%s", ss); 
        }

        NCJSON_BOOL("sensors/alarm", sa) {
            printf(
                "sensor[%u]  %-8s %-9s temp=%dC  alarm=%s  (%s)\n",
                nc_json_array_index(&p), 
                sName, 
                sType, 
                sTemp,
                sa ? "ON" : "OFF", 
                sStat
            );
        }

        NCJSON_STR("actuators/name", an) { 
            snprintf(aName, sizeof aName, "%s", an);
            aHasValue = FALSE; 
        }

        NCJSON_STR ("actuators/type", at) { 
            snprintf(aType, sizeof aType, "%s", at); 
        }

        NCJSON_I32 ("actuators/value", av) { 
            aValue = av; 
            aHasValue = TRUE; 
        }

        NCJSON_STR("actuators/status", as) { 
            snprintf(aStat, sizeof aStat, "%s", as); 
        }

        NCJSON_BOOL("actuators/alarm", aa) {
            if (aHasValue) {
                printf(
                    "actuator[%u] %-13s %-10s val=%d  alarm=%s  (%s)\n",
                    nc_json_array_index(&p), 
                    aName, 
                    aType, 
                    aValue,
                    aa ? "ON" : "OFF", 
                    aStat
                );
            } else {
                printf(
                    "actuator[%u] %-13s %-10s          alarm=%s  (%s)\n",
                    nc_json_array_index(&p), 
                    aName, 
                    aType,
                    aa ? "ON" : "OFF", 
                    aStat
                );
            }
        }

        NCJSON_ON_ERROR() {
            return 1; 
        }
    }
}

device   = Controller
type     = ESP_32
location = /power/climatisation

sensor[0]  temp1    DS18B20   temp=25C  alarm=OFF  (Temperature OK)
actuator[0] fan1          PWM        val=100  alarm=OFF  (FAN1 Working!)
actuator[1] doorOpenAlarm PINCONTROL          alarm=OFF  (Door is CLOSE)

Limitations

| Constraint | Value | Defined by |
| --- | --- | --- |
| Nesting depth | 32 | NCJSON_MAX_DEPTH (32-bit ObjectMask) |
| Key / number length | NCJSON_BUFFER_MAX_TOKENS - 1 | NCJSON_BUFFER_MAX_TOKENS |
| String chunk size | NCJSON_BUFFER_MAX_TOKENS - 1 | NCJSON_BUFFER_MAX_TOKENS |

Path matching is by 32-bit hash with no string re-verification, so a hash collision routes a field to the wrong matcher with no error. This is acceptable for fixed, known schemas.
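
For a fixed schema, the collision risk can be ruled out at build time by hashing every path the dispatch block uses and asserting that the hashes are pairwise distinct. The sketch below needs C++14 for the constexpr loops and assumes standard 32-bit FNV-1a; `fnv1a32`, `kPaths`, and `all_hashes_distinct` are hypothetical names, and a project that overrides NCJSON_HASH would substitute its own function.

```cpp
#include <cstddef>
#include <cstdint>

// 32-bit FNV-1a (assumed to match the default path hash; verify against the header).
constexpr std::uint32_t fnv1a32(char const* s) {
    std::uint32_t h = 2166136261u;
    for (; *s; ++s) h = (h ^ static_cast<std::uint8_t>(*s)) * 16777619u;
    return h;
}

// Every path matched in the energy-meter example.
constexpr char const* kPaths[] = {
    "eid", "activePower", "voltage", "current",
    "channels/eid", "channels/voltage", "channels/current",
};

// True when no two paths in the list share a hash.
template <std::size_t N>
constexpr bool all_hashes_distinct(char const* const (&paths)[N]) {
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = i + 1; j < N; ++j)
            if (fnv1a32(paths[i]) == fnv1a32(paths[j])) return false;
    return true;
}

static_assert(all_hashes_distinct(kPaths),
              "path hash collision: rename a field or change the hash function");
```

This only covers the paths the program matches. A key in the incoming document that is not in the list could still collide with one of them.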

Escape sequences are not decoded. \" and \\ are recognised only so that an escaped quote does not end the string; \n, \t, \uXXXX, and the like are not translated, and the character after the backslash is kept as-is.

Scalars directly inside an array are not path-reachable, since value matchers require an object member. Catch them with raw NCJSON_ON events scoped by NCJSON_IN.

Keys and numbers longer than the token buffer are truncated. Strings are the exception and stream in chunks, but an over-long numeric literal parses to the wrong value.

Number conversion uses an internal power-of-ten table and is not a correctly rounded strtod, so a converted value may differ from the nearest representable value in its last bits. Values needing full double precision (large fixed-point energy counters, for example) should be read with NCJSON_F64 rather than NCJSON_F32.
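
Independently of how the parser converts digits, a 32-bit float simply cannot hold every large counter value, which is part of why NCJSON_F64 is the safer choice for them. The check below is a self-contained illustration; `exact_in_f32` is a hypothetical helper and says nothing about the header's own conversion routine.

```cpp
// True when v survives a round trip through a 32-bit float unchanged.
constexpr bool exact_in_f32(double v) {
    return static_cast<double>(static_cast<float>(v)) == v;
}

// A float has a 24-bit significand, so integers are exact only up to 2^24.
static_assert( exact_in_f32(16777216.0), "2^24 is representable");
static_assert(!exact_in_f32(16777217.0), "2^24 + 1 is not");

// The rounded value is the even neighbour, 2^24.
static_assert(static_cast<double>(static_cast<float>(16777217.0)) == 16777216.0,
              "round-to-nearest-even picks 2^24");
```

A counter such as 1778385169 already falls in a range where adjacent floats are 128 apart, so reading it through f32 would change the value.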
