Single-file streaming JSON pull-parser for embedded C++.
nc_json.h is a C++ library for extracting fields from JSON one byte at a time. It never owns the input, never allocates, and never needs the document to be complete or contiguous, which makes it suitable for parsing straight off a socket, a ring buffer, or a flash read on a target with a few kilobytes of RAM to spare.
The parser is a single fixed-size nc_json struct. It makes no heap allocations, no system
calls, and no recursive calls; nesting is tracked in a fixed array and a bitmask. Paths are
hashed at compile time (FNV-1a), so a field match at runtime is a single integer comparison
rather than a string walk.
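The compile-time hashing described above can be sketched as follows. This is a minimal illustration of 32-bit FNV-1a written as a `constexpr` function, not the library's own code. The names `fnv1a`, `RunningHash`, `path_matches`, and `demo_fnv1a` are hypothetical, and the 32-bit width is taken from the limitations listed at the end of this document.

```cpp
#include <cassert>
#include <cstdint>

// 32-bit FNV-1a over a NUL-terminated string. With a literal argument the
// compiler can fold the whole call into a constant.
constexpr std::uint32_t fnv1a(char const* s, std::uint32_t h = 2166136261u) {
    return *s ? fnv1a(s + 1, (h ^ static_cast<std::uint8_t>(*s)) * 16777619u) : h;
}

// Published FNV-1a test vectors, checked at compile time.
static_assert(fnv1a("") == 0x811C9DC5u, "offset basis");
static_assert(fnv1a("a") == 0xE40C292Cu, "single byte");

// Runtime side: fold bytes into the hash as they arrive, one at a time.
struct RunningHash {
    std::uint32_t value = 2166136261u;
    void feed(char c) { value = (value ^ static_cast<std::uint8_t>(c)) * 16777619u; }
};

// Matching is then one integer comparison against a precomputed constant.
inline bool path_matches(char const* streamed_path, std::uint32_t expected) {
    RunningHash h;
    for (; *streamed_path; ++streamed_path) h.feed(*streamed_path);
    return h.value == expected;
}

// Hypothetical usage: the literal is hashed by the compiler, the stream at runtime.
inline void demo_fnv1a() {
    constexpr std::uint32_t kUserName = fnv1a("user/name");
    assert(path_matches("user/name", kUserName));
}
```

Because the expected hash is a constant, no path string needs to exist in the binary for the comparison itself.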
Fields are matched with a small macro DSL that reads like a switch over JSON paths. Arrays are transparent in paths, so one matcher handles every element of an array regardless of its length. Strings longer than the token buffer are streamed in chunks, so arbitrarily long values pass through a small fixed buffer.
The library has no platform dependencies. With NCJSON_NO_STDLIB it drops every standard
library include and uses internal MemSet / MemCpy, so it builds in freestanding environments.
Copy nc_json.h into your project.
In one C++ source file, define NCJSON_IMPLEMENTATION before including the header:
#define NCJSON_IMPLEMENTATION
#include "nc_json.h"

All other files that need the API include the header without the define.
Tip: To confine all symbols to a single translation unit, define NCJSON_STATIC alongside
NCJSON_IMPLEMENTATION.
The following macros can be defined before including the header to replace default dependencies or tune fixed sizes:
| Macro | Default | Purpose |
|---|---|---|
| NCJSON_BUFFER_MAX_TOKENS | 64 | Token buffer size. Also sets the string-chunk size (N - 1) and the maximum length of keys, numbers, and primitives before truncation |
| NCJSON_MAX_DEPTH | 32 | Maximum object/array nesting depth. Hard ceiling of 32 (depth is tracked in a 32-bit mask) |
| NCJSON_NO_STDLIB | Undefined | Suppresses <stdint.h> / <string.h>; caller must provide u8, u16, u32, u64, i8, i16, i32, i64, b32, b8, f32, f64 typedefs |
| NCJSON_MEMSET(d, v, n) | memset | Memory fill |
| NCJSON_MEMCPY(d, s, n) | memcpy | Memory copy |
| NCJSON_HASH(s) | FNV-1a | Compile-time path-hash function |
| NCJSON_IS_WHITESPACE(c) | space / CR / LF / tab / FF / VT | Whitespace classification |
| NCJSON_IS_DIGIT(c) | 0 - 9 | Digit classification |
| NCJSON_ALLOWED_IN_NUMBER(c) | digits . e E - + | Bytes that continue a number token |
| NCJSON_ALLOWED_IN_PRIMITIVE(c) | a - z, A - Z | Bytes that continue a primitive token |
| NCJSON_ARRAY_COUNT(a) | sizeof-based | Array element-count helper |
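For the NCJSON_NO_STDLIB row, a typedef set along the following lines could be supplied before the header is included. This is only a sketch: the widths are inferred from the type names, the underlying types assume a typical 32-bit or 64-bit target where `int` is 32 bits, and the signedness chosen for b32 and b8 is an assumption rather than something this document states.

```cpp
// Hypothetical typedefs for a freestanding build (no <stdint.h> available).
typedef unsigned char      u8;
typedef unsigned short     u16;
typedef unsigned int       u32;
typedef unsigned long long u64;
typedef signed char        i8;
typedef short              i16;
typedef int                i32;
typedef long long          i64;
typedef i32                b32;  // assumption: 32-bit boolean stored as a signed int
typedef i8                 b8;   // assumption: 8-bit boolean stored as a signed char
typedef float              f32;
typedef double             f64;

// Fail the build early if the target's type sizes differ from the assumption.
static_assert(sizeof(u8)  == 1 && sizeof(i8)  == 1, "8-bit types");
static_assert(sizeof(u16) == 2 && sizeof(i16) == 2, "16-bit types");
static_assert(sizeof(u32) == 4 && sizeof(i32) == 4, "32-bit types");
static_assert(sizeof(u64) == 8 && sizeof(i64) == 8, "64-bit types");
static_assert(sizeof(f32) == 4 && sizeof(f64) == 8, "floating-point types");

// With the typedefs in place, the header would then be pulled in like this:
//   #define NCJSON_NO_STDLIB
//   #define NCJSON_IMPLEMENTATION
//   #include "nc_json.h"
```

The static assertions cost nothing at runtime and catch a target whose `int` or `long long` has an unexpected width.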
A freshly zeroed parser is already a valid, initialised parser, so either form works:
nc_json parser = {};
// or
nc_json parser;
nc_json_init(&parser);

Feed one byte at a time. NCJSON_DISPATCH opens a block in which matchers are tested.
Matchers are independent if statements, so more than one can fire for the same byte, and
a matcher that misses costs only a failed comparison. The block must be braced.
for (char const* c = json; *c; ++c) {
    NCJSON_DISPATCH(&parser, *c) {
        // ... matchers ...
    }
}

A path is a slash-separated chain of keys, hashed at compile time. Two rules govern matching:
Arrays are transparent. "users/bio" matches bio in every object element of the users
array; there is no users/0/bio syntax, and the element index is recovered separately.
Value matchers only match object members. A path-matched scalar must be the value of a key
inside an object. A bare scalar inside an array (an element of [1, 2, 3]) is not
path-reachable and must be caught with a raw event.
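One way to picture array transparency is that array levels simply contribute nothing to the hashed path. The sketch below illustrates that idea under stated assumptions: `Segment`, `path_hash`, and `demo_transparent_paths` are hypothetical names, and the library's internal bookkeeping may differ.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kBasis = 2166136261u;
constexpr std::uint32_t kPrime = 16777619u;

// 32-bit FNV-1a of a whole path literal such as "users/bio".
constexpr std::uint32_t fnv1a(char const* s, std::uint32_t h = kBasis) {
    return *s ? fnv1a(s + 1, (h ^ static_cast<std::uint8_t>(*s)) * kPrime) : h;
}

// One step on the way from the document root to the current value.
struct Segment {
    char const*   key;    // member name, or nullptr for an array level
    std::uint32_t index;  // element index, meaningful only for array levels
};

// Hashes the key segments joined by '/', skipping array levels entirely.
inline std::uint32_t path_hash(Segment const* segs, int count) {
    std::uint32_t h = kBasis;
    bool first = true;
    for (int i = 0; i < count; ++i) {
        if (segs[i].key == nullptr) continue;  // array level: transparent
        if (!first) h = (h ^ static_cast<std::uint8_t>('/')) * kPrime;
        first = false;
        for (char const* p = segs[i].key; *p; ++p)
            h = (h ^ static_cast<std::uint8_t>(*p)) * kPrime;
    }
    return h;
}

// users[0].bio and users[7].bio produce the same hash as the literal "users/bio".
inline void demo_transparent_paths() {
    Segment const first_bio[]  = { {"users", 0}, {nullptr, 0}, {"bio", 0} };
    Segment const eighth_bio[] = { {"users", 0}, {nullptr, 7}, {"bio", 0} };
    assert(path_hash(first_bio, 3)  == fnv1a("users/bio"));
    assert(path_hash(eighth_bio, 3) == fnv1a("users/bio"));
}
```

The element index never enters the hash, which is why it has to be read separately.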
Each typed matcher declares a local of the named type, runs the body only on a match, and
leaves the value in the named variable. Numbers fire on the NUMBER event, booleans on
PRIMITIVE, strings on STRING. Names must be unique within a dispatch block.
NCJSON_STR ("user/name", name)  { /* char const*, NUL-terminated, valid only for this byte */ }
NCJSON_I32 ("user/age", age)    { /* i32 */ }
NCJSON_I64 ("ts", ts)           { /* i64 */ }
NCJSON_U32 ("id", id)           { /* u32 */ }
NCJSON_U64 ("id64", id64)       { /* u64 */ }
NCJSON_F32 ("ratio", r)         { /* f32 */ }
NCJSON_F64 ("precise", p)       { /* f64 */ }
NCJSON_BOOL("enabled", on)      { /* b32 */ }
NCJSON_NULL("middle_name")      { /* matched a JSON null; no binding */ }

Three matchers handle string values depending on length and ownership:
NCJSON_STR("bio", s) // value that fits the token buffer
NCJSON_STR_CHUNK("bio", s) // raw streaming piece (manual reassembly)
NCJSON_STR_INTO("bio", buf, sizeof buf, len) // accumulate into a caller buffer

NCJSON_STR delivers the whole value when it fits the buffer. For values that may exceed it,
NCJSON_STR_INTO accumulates the chunks into a caller-owned buffer and runs its body once,
when the full string is assembled and NUL-terminated. parser.TokenLength holds the valid
byte count at every string event. To stream without buffering, handle each intermediate
piece with NCJSON_STR_CHUNK and the final piece with NCJSON_STR.
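The accumulation step can be pictured with a small stand-alone sketch. It does not use the library: `append_chunk`, `stream_value`, and `demo_reassembly` are hypothetical helpers, and silently truncating when the caller buffer is full is an assumption made here for illustration, not documented NCJSON_STR_INTO behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Appends one chunk to dst without writing past cap bytes, keeps dst
// NUL-terminated, and returns the new length. Excess bytes are dropped.
inline std::size_t append_chunk(char* dst, std::size_t cap, std::size_t len,
                                char const* chunk, std::size_t chunk_len) {
    if (cap == 0 || len >= cap) return len;  // no room even for the terminator
    std::size_t const room = cap - 1 - len;
    std::size_t const n = chunk_len < room ? chunk_len : room;
    std::memcpy(dst + len, chunk, n);
    dst[len + n] = '\0';
    return len + n;
}

// Stand-in for a tokenizer with a small buffer: hands the value over in
// pieces of at most chunk_max bytes, the way a long string would stream.
inline std::size_t stream_value(char const* value, std::size_t chunk_max,
                                char* dst, std::size_t cap) {
    std::size_t const total = std::strlen(value);
    std::size_t len = 0;
    if (cap > 0) dst[0] = '\0';
    if (chunk_max == 0) return 0;
    for (std::size_t off = 0; off < total; off += chunk_max) {
        std::size_t const left = total - off;
        len = append_chunk(dst, cap, len, value + off,
                           left < chunk_max ? left : chunk_max);
    }
    return len;
}

// A 73-byte value passes through 63-byte chunks into a 128-byte buffer intact.
inline void demo_reassembly() {
    char const* id =
        "d1acc980-0e4e-11e8-98f0-ab5030b47df4:d1db7aa0-0e4e-11e8-b1d9-5f0ab230c0d9";
    char out[128];
    std::size_t const n = stream_value(id, 63, out, sizeof out);
    assert(n == std::strlen(id));
    assert(std::strcmp(out, id) == 0);
}
```

The chunk size of 63 mirrors the default 64-byte token buffer minus its terminator.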
The element index of the nearest enclosing array is available through nc_json_array_index.
The outermost array's index is parser.ArrayCount[0].
NCJSON_U32("channels/eid", eid) {
    u32 channel = nc_json_array_index(&parser); // nearest array
    u32 meter = parser.ArrayCount[0]; // outermost array
}

NCJSON_ON_ENTER and NCJSON_ON_LEAVE fire on entering or leaving an object or array by
path. NCJSON_IN is a stateless query for whether the current position is anywhere under a
path, which removes the need to track scope with a manual flag.
NCJSON_ON_ENTER("flags") { /* entered the flags object/array */ }
NCJSON_ON_LEAVE("flags") { /* left it */ }
NCJSON_ON(NCJSON_EVENT_KEY) {
    if (NCJSON_IN("flags"))
        printf("flag: %s\n", parser.TokenBuffer);
}

NCJSON_ON(event) matches any event directly. This is how bare array scalars, dynamic keys,
and structural boundaries are handled.
NCJSON_ON(NCJSON_EVENT_NUMBER) {
    if (NCJSON_IN("params")) {
        printf(
            "[%u] = %s\n",
            nc_json_array_index(&parser),
            parser.TokenBuffer
        );
    }
}

Events: NCJSON_EVENT_OBJECT_START, OBJECT_END, ARRAY_START, ARRAY_END, KEY,
STRING, STRING_CHUNK, NUMBER, PRIMITIVE, ERROR.
NCJSON_ON_ERROR fires once when the parser enters its error state, after which it stops
emitting events. The parser carries the reason, the offending byte, and a stream offset that
remains correct across separately fed buffers.
NCJSON_ON_ERROR() {
    printf(
        "error %u at offset %llu, byte '%c'\n",
        parser.ErrorCode,
        (unsigned long long) (parser.BytesFed - 1),
        parser.ErrorByte
    );
    return 1;
}

Error codes: NCJSON_ERR_NONE, NCJSON_ERR_UNEXPECTED_BYTE, NCJSON_ERR_EXPECTED_KEY,
NCJSON_ERR_EXPECTED_COLON, NCJSON_ERR_UNBALANCED, NCJSON_ERR_MAX_DEPTH,
NCJSON_ERR_INTERNAL.
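The claim that the error offset stays correct across separately fed buffers follows from counting bytes in the parser state rather than in the caller's loop. The sketch below shows that bookkeeping with a stand-in "parser" that rejects one chosen byte. `OffsetTracker`, `feed`, and `demo_offsets` are hypothetical names, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Minimal state that survives between feed() calls.
struct OffsetTracker {
    std::uint64_t bytes_fed    = 0;      // total bytes consumed so far
    bool          failed       = false;  // sticky error state
    std::uint64_t error_offset = 0;      // stream offset of the offending byte
    char          error_byte   = 0;
};

// Feeds one buffer. The stand-in rule is "any byte equal to bad is an error".
// After the first error nothing more is consumed, mirroring a sticky error state.
inline void feed(OffsetTracker& t, char const* data, std::size_t n, char bad) {
    for (std::size_t i = 0; i < n && !t.failed; ++i) {
        ++t.bytes_fed;
        if (data[i] == bad) {
            t.failed       = true;
            t.error_offset = t.bytes_fed - 1;  // offset of the current byte
            t.error_byte   = data[i];
        }
    }
}

// Two network-sized pieces of one document: the error sits in the second
// piece, yet the reported offset is relative to the start of the stream.
inline void demo_offsets() {
    OffsetTracker t;
    feed(t, "{\"a\":", 5, '#');
    feed(t, "1,#}", 4, '#');
    assert(t.failed);
    assert(t.error_offset == 7);
    assert(t.error_byte == '#');
}
```

The caller never has to add buffer lengths together; the running counter already holds the stream position.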
Fields that are useful to read from inside a match:
| Field | Meaning |
|---|---|
| TokenBuffer | NUL-terminated text of the current key, string, number, or primitive |
| TokenLength | Valid byte count in TokenBuffer at the current event |
| Depth | Current nesting depth |
| ArrayCount[d] | Element index of the array opened at depth d (ArrayCount[0] is the outermost array) |
| ObjectMask | Bit d is set when the container at depth d is an object rather than an array |
| BytesFed | Count of bytes consumed so far (offset of the current byte is BytesFed - 1) |
| ErrorCode | Reason code, valid after an ERROR event |
| ErrorByte | The byte that triggered the error |
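How a depth counter, a 32-bit object mask, and per-depth array counters can work together without recursion is sketched below. This is an illustration of the general technique only: `NestTracker` and its members are hypothetical, and the real struct layout and update rules may differ.

```cpp
#include <cassert>
#include <cstdint>

struct NestTracker {
    static constexpr int kMaxDepth = 32;       // one mask bit per level
    std::uint32_t object_mask = 0;             // bit d set: level d is an object
    std::uint32_t array_count[kMaxDepth] = {}; // element index per array level
    int           depth = 0;                   // number of open containers

    // Called on '{' (is_object = true) or '['. Fails past the depth ceiling.
    bool open(bool is_object) {
        if (depth >= kMaxDepth) return false;
        if (is_object) {
            object_mask |= (1u << depth);
        } else {
            object_mask &= ~(1u << depth);
            array_count[depth] = 0;            // a fresh array starts at element 0
        }
        ++depth;
        return true;
    }

    // Called on '}' or ']'.
    bool close() {
        if (depth == 0) return false;
        --depth;
        return true;
    }

    bool innermost_is_object() const {
        return depth > 0 && ((object_mask >> (depth - 1)) & 1u) != 0;
    }

    // Called on ','. Only arrays count elements.
    void comma() {
        if (depth > 0 && !innermost_is_object()) ++array_count[depth - 1];
    }

    // Walks outward from the innermost level to the first array.
    std::uint32_t nearest_array_index() const {
        for (int d = depth - 1; d >= 0; --d)
            if (((object_mask >> d) & 1u) == 0) return array_count[d];
        return 0;
    }
};
```

For a document shaped like `[ { "channels": [ {}, {} ] } ]`, the second channel object sees `nearest_array_index()` equal to 1 while `array_count[0]` is still 0, which matches the meter and channel indices used in the example that follows.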
A top-level array of meter objects, each containing a nested channels array. Bare keys
(voltage) match only the meter level; the array-transparent channels/voltage matches
every channel. The meter index comes from the outermost array, the channel index from the
nearest array.
[
  { "eid": 704643328, "activePower": 0.000, "voltage": 237.151, "current": 0.254,
    "channels": [
      { "eid": 1778385169, "voltage": 237.151, "current": 0.254 },
      { "eid": 1778385170, "voltage": 9.994, "current": 0.278 }
    ]
  }
]

nc_json p;
nc_json_init(&p);
for (char const* c = ENERGY; *c; ++c) {
    NCJSON_DISPATCH(&p, *c) {
        // Bare keys match only at the meter level.
        NCJSON_U32("eid", mEid) {
            printf(
                "\n=== meter[%u] eid=%u ===\n",
                p.ArrayCount[0],
                mEid
            );
        }
        NCJSON_F64("activePower", mP) {
            printf(" P = %.3f W\n", mP);
        }
        NCJSON_F32("voltage", mV) {
            printf(" V = %.3f V\n", mV);
        }
        NCJSON_F32("current", mI) {
            printf(" I = %.3f A\n", mI);
        }
        // channels/... matches every element of the nested array.
        NCJSON_U32("channels/eid", cEid) {
            printf(
                " channel[%u] eid=%u: ",
                nc_json_array_index(&p),
                cEid
            );
        }
        NCJSON_F32("channels/voltage", cV) {
            printf("V=%.3f ", cV);
        }
        NCJSON_F32("channels/current", cI) {
            printf("I=%.3f\n", cI);
        }
        NCJSON_ON_ERROR() {
            return 1;
        }
    }
}

=== meter[0] eid=704643328 ===
P = 0.000 W
V = 237.151 V
I = 0.254 A
channel[0] eid=1778385169: V=237.151 I=0.254
channel[1] eid=1778385170: V=9.994 I=0.278

params is a heterogeneous array of bare scalars, which paths cannot reach, so its elements
are taken from raw NCJSON_ON events scoped by NCJSON_IN. The id value is longer than
the 64-byte token buffer, so it is reassembled with NCJSON_STR_INTO; a plain NCJSON_STR
would deliver only its tail.
{
  "jsonrpc": "2.0",
  "id": "d1acc980-0e4e-11e8-98f0-ab5030b47df4:d1db7aa0-0e4e-11e8-b1d9-5f0ab230c0d9",
  "method": "example/hello",
  "params": [ "world", 42 ]
}

nc_json p;
nc_json_init(&p);
char id[128] = {};
u32 id_len = 0;
for (char const* c = RPC; *c; ++c) {
    NCJSON_DISPATCH(&p, *c) {
        NCJSON_STR("jsonrpc", ver) {
            printf("jsonrpc : %s\n", ver);
        }
        NCJSON_STR("method", meth) {
            printf("method : %s\n", meth);
        }
        // Longer than the token buffer, so accumulate it into id[].
        NCJSON_STR_INTO("id", id, sizeof(id), id_len) {
            printf("id : %s\n", id);
            id_len = 0;
        }
        // Bare array scalars are caught by raw events scoped to params.
        NCJSON_ON(NCJSON_EVENT_STRING) {
            if (NCJSON_IN("params")) {
                printf(
                    "param[%u]: \"%s\" (string)\n",
                    nc_json_array_index(&p),
                    p.TokenBuffer
                );
            }
        }
        NCJSON_ON(NCJSON_EVENT_NUMBER) {
            if (NCJSON_IN("params")) {
                printf(
                    "param[%u]: %s (number)\n",
                    nc_json_array_index(&p),
                    p.TokenBuffer
                );
            }
        }
        NCJSON_ON_ERROR() {
            return 1;
        }
    }
}

jsonrpc : 2.0
id : d1acc980-0e4e-11e8-98f0-ab5030b47df4:d1db7aa0-0e4e-11e8-b1d9-5f0ab230c0d9
method : example/hello
param[0]: "world" (string)
param[1]: 42 (number)

A controller config with sensors and actuators arrays of heterogeneous objects, some
omitting value. Each object's fields are stashed into scratch variables as they stream by,
then a complete record is printed when the terminal key (alarm, always last) arrives. The
optional value field is tracked with a presence flag. Bare type matches only the
top-level device type, never the per-element types.
{
  "device": "Controller",
  "type": "ESP_32",
  "location": "/power/climatisation",
  "sensors": [
    { "name": "temp1", "type": "DS18B20", "temperature": 25,
      "status": "Temperature OK", "alarm": false }
  ],
  "actuators": [
    { "name": "fan1", "type": "PWM", "value": 100, "status": "FAN1 Working!", "alarm": false },
    { "name": "doorOpenAlarm", "type": "PINCONTROL", "status": "Door is CLOSE", "alarm": false }
  ]
}

nc_json p;
nc_json_init(&p);
char sName[32] = {};
char sType[24] = {};
char sStat[48] = {};
int sTemp = 0;
char aName[32] = {};
char aType[24] = {};
char aStat[48] = {};
int aValue = 0;
b32 aHasValue = FALSE;
for (char const* c = ESP; *c; ++c) {
    NCJSON_DISPATCH(&p, *c) {
        NCJSON_STR("device", dev) {
            printf("device = %s\n", dev);
        }
        NCJSON_STR("type", typ) {
            printf("type = %s\n", typ);
        }
        NCJSON_STR("location", loc) {
            printf("location = %s\n\n", loc);
        }
        NCJSON_STR ("sensors/name", sn) {
            snprintf(sName, sizeof sName, "%s", sn);
        }
        NCJSON_STR ("sensors/type", st) {
            snprintf(sType, sizeof sType, "%s", st);
        }
        NCJSON_I32 ("sensors/temperature", st2) {
            sTemp = st2;
        }
        NCJSON_STR ("sensors/status", ss) {
            snprintf(sStat, sizeof sStat, "%s", ss);
        }
        // alarm is the last key of each sensor, so the record is complete here.
        NCJSON_BOOL("sensors/alarm", sa) {
            printf(
                "sensor[%u] %-8s %-9s temp=%dC alarm=%s (%s)\n",
                nc_json_array_index(&p),
                sName,
                sType,
                sTemp,
                sa ? "ON" : "OFF",
                sStat
            );
        }
        // name is the first key of each actuator, so reset the presence flag here.
        NCJSON_STR("actuators/name", an) {
            snprintf(aName, sizeof aName, "%s", an);
            aHasValue = FALSE;
        }
        NCJSON_STR ("actuators/type", at) {
            snprintf(aType, sizeof aType, "%s", at);
        }
        NCJSON_I32 ("actuators/value", av) {
            aValue = av;
            aHasValue = TRUE;
        }
        NCJSON_STR("actuators/status", as) {
            snprintf(aStat, sizeof aStat, "%s", as);
        }
        NCJSON_BOOL("actuators/alarm", aa) {
            if (aHasValue) {
                printf(
                    "actuator[%u] %-13s %-10s val=%d alarm=%s (%s)\n",
                    nc_json_array_index(&p),
                    aName,
                    aType,
                    aValue,
                    aa ? "ON" : "OFF",
                    aStat
                );
            } else {
                printf(
                    "actuator[%u] %-13s %-10s alarm=%s (%s)\n",
                    nc_json_array_index(&p),
                    aName,
                    aType,
                    aa ? "ON" : "OFF",
                    aStat
                );
            }
        }
        NCJSON_ON_ERROR() {
            return 1;
        }
    }
}

device = Controller
type = ESP_32
location = /power/climatisation
sensor[0] temp1 DS18B20 temp=25C alarm=OFF (Temperature OK)
actuator[0] fan1 PWM val=100 alarm=OFF (FAN1 Working!)
actuator[1] doorOpenAlarm PINCONTROL alarm=OFF (Door is CLOSE)

| Constraint | Value | Defined by |
|---|---|---|
| Nesting depth | 32 | NCJSON_MAX_DEPTH (32-bit ObjectMask) |
| Key / number length | NCJSON_BUFFER_MAX_TOKENS - 1 | NCJSON_BUFFER_MAX_TOKENS |
| String chunk size | NCJSON_BUFFER_MAX_TOKENS - 1 | NCJSON_BUFFER_MAX_TOKENS |
Path matching is by 32-bit hash with no string re-verification, so a hash collision routes a field to the wrong matcher with no error. This is acceptable for fixed, known schemas.
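For a fixed schema, the collision risk can be ruled out at build time by hashing every path the application matches and asserting that the hashes are pairwise distinct. The following is a sketch under stated assumptions: it uses a stand-alone 32-bit FNV-1a rather than the library's NCJSON_HASH, the helper `all_hashes_distinct` and the `kSchema` list are hypothetical, and the loop-based `constexpr` function needs C++14 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::uint32_t fnv1a(char const* s, std::uint32_t h = 2166136261u) {
    return *s ? fnv1a(s + 1, (h ^ static_cast<std::uint8_t>(*s)) * 16777619u) : h;
}

// True when no two paths in the list share a hash. Quadratic, which is fine
// for the handful of paths a fixed schema contains.
template <std::size_t N>
constexpr bool all_hashes_distinct(char const* const (&paths)[N]) {
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = i + 1; j < N; ++j)
            if (fnv1a(paths[i]) == fnv1a(paths[j])) return false;
    return true;
}

// Every path used by the energy-meter example in this document.
constexpr char const* kSchema[] = {
    "eid", "activePower", "voltage", "current",
    "channels/eid", "channels/voltage", "channels/current",
};

// The build fails here if two schema paths would be indistinguishable.
static_assert(all_hashes_distinct(kSchema), "two schema paths share a hash");
```

The check only covers paths the application declares. Keys in the incoming document that are not in the list can still collide with one of them, so it narrows the risk rather than removing it.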
Escape sequences are not decoded. \" and \\ are handled as delimiters, but \n, \t,
\uXXXX, and the like are stored literally: the character after the backslash is kept as-is.
Scalars directly inside an array are not path-reachable, since value matchers require an
object member. Catch them with raw NCJSON_ON events scoped by NCJSON_IN.
Keys, numbers, and primitives longer than the token buffer are truncated, so an over-long numeric literal parses to the wrong value. Strings are the exception: they stream in chunks instead.
Number conversion uses an internal pow-10 table and is not a correctly-rounded strtod;
values needing full double precision (large fixed-point energy counters, for example) should
be read with NCJSON_F64, and the usual fast-path precision caveats apply.
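The kind of conversion meant here can be illustrated with a generic table-driven parser. This is not the library's routine: `parse_decimal_fast` is a hypothetical function that handles only an optional sign, digits, and one decimal point, with no exponent support. It is exact while the digit string fits in 53 bits, and can be off in the last place beyond that.

```cpp
#include <cassert>

// Accumulates all digits into an integer, then divides by a power of ten
// taken from a table. Fractional digits past the table's range are dropped.
inline double parse_decimal_fast(char const* s) {
    static double const kPow10[] = {
        1e0, 1e1, 1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
        1e8, 1e9, 1e10, 1e11, 1e12, 1e13, 1e14, 1e15,
    };
    constexpr int kMaxFrac = 15;

    bool const negative = (*s == '-');
    if (negative) ++s;

    unsigned long long mantissa = 0;
    int  frac_digits = 0;
    bool seen_dot = false;
    for (; *s; ++s) {
        if (*s == '.' && !seen_dot) { seen_dot = true; continue; }
        if (*s < '0' || *s > '9') break;           // stop at the first non-digit
        if (seen_dot) {
            if (frac_digits == kMaxFrac) continue; // beyond the table: ignore
            ++frac_digits;
        }
        mantissa = mantissa * 10u + static_cast<unsigned>(*s - '0');
    }
    double const value = static_cast<double>(mantissa) / kPow10[frac_digits];
    return negative ? -value : value;
}

// A large counter survives as a double but not as a float, which has only
// 24 bits of mantissa: values near 1.2e8 are spaced 8 apart.
inline void demo_counter_precision() {
    double const wide = parse_decimal_fast("123456789.125");
    assert(wide == 123456789.125);
    assert(static_cast<float>(wide) == 123456792.0f);
}
```

The second assertion is the practical reason to prefer the 64-bit matcher for large counters: the 32-bit result is already wrong in the units digit before any table rounding comes into play.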