diff --git a/CHANGELOG b/CHANGELOG index 1fb18a7..c550dac 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -1,5 +1,14 @@ - (Unreleased) - Add `Context#perform_microtask_checkpoint` to synchronously drain the V8 microtask queue, useful for spec-compliant `dispatchEvent` sequencing inside Ruby callbacks + - Expose V8 ScriptCompiler::CachedData via Context#compile / MiniRacer::Script (#411) + - script = ctx.compile(src, filename:, cached_data:, produce_cache:) → Script handle + - script.run replays compiled bytecode without re-parsing + - script.cached_data persists V8's per-script bytecode cache when produce_cache: true + - script.cache_rejected? reports source/version mismatches + - MiniRacer::V8_CACHED_DATA_VERSION_TAG for cache-key invalidation + - produce_cache defaults to false; passing true from inside a host-fn callback raises (V8's CreateCodeCache corrupts parser state when re-entered, so warm the cache from the top level) + - Cross-process reuse requires both processes to load byte-identical snapshot data via Snapshot#dump / Snapshot.load (Snapshot.new(src) is non-deterministic) and is incompatible with Platform.set_flags!(:single_threaded) (V8 embeds per-process state in the blob under single-threaded mode) + - TruffleRuby shim falls back to source replay (no equivalent in GraalJS) - 0.21.1 - 25-05-2026 - Run `:single_threaded` V8 dispatches on a reusable mini_racer-owned native thread so V8 does not execute on Ruby-owned threads diff --git a/README.md b/README.md index 6c0b0b0..5fd4218 100644 --- a/README.md +++ b/README.md @@ -129,6 +129,91 @@ context.eval("bar()", filename: "a/bar.js") # … ``` +### Bytecode cache for repeated script evaluation + +`Context#compile` returns a `MiniRacer::Script` handle you can run multiple times, +and exposes V8's bytecode cache so subsequent Contexts can skip the parse step. + +In a single process — e.g. 
warming a `Context` pool from one canonical compile: + +```ruby +# Warm the cache once — top-level compile, opt in with produce_cache: true. +warm = MiniRacer::Context.new +warmed = warm.compile(File.read("bundle.js"), + filename: "bundle.js", + produce_cache: true) +warmed.run +blob = warmed.cached_data # ASCII-8BIT String, hold onto it in memory + +# Subsequent Contexts (e.g. a per-request pool) consume the blob and skip parsing. +ctx = MiniRacer::Context.new +script = ctx.compile(File.read("bundle.js"), + filename: "bundle.js", + cached_data: blob) +# script.cache_rejected? is false when V8 accepted the blob. +script.run +``` + +Across processes (e.g. persisting blobs to disk), the consumer must boot from +**byte-identical snapshot data** — two separate `Snapshot.new(src)` calls produce +different blobs even for the same `src`, and V8 will then reject every cached +blob. Use `Snapshot#dump` / `Snapshot.load` to share canonical bytes: + +```ruby +# Build the snapshot once, persist its bytes. +snap_bytes = MiniRacer::Snapshot.new(snapshot_src).dump +File.binwrite("snapshot.bin", snap_bytes) + +# Every process loads the same bytes. +snap = MiniRacer::Snapshot.load(File.binread("snapshot.bin")) +ctx = MiniRacer::Context.new(snapshot: snap) +script = ctx.compile(File.read("bundle.js"), + filename: "bundle.js", + cached_data: File.binread("bundle.js.cache")) +script.run +``` + +`produce_cache` defaults to `false`; pass `true` to ask V8 for the cache blob. +When the supplied `cached_data` is accepted, `script.cached_data` returns `nil` so +callers can skip a redundant copy. When V8 produces a fresh blob (initial compile +with `produce_cache: true`, or a rejection while `produce_cache: true` was also +set), it returns the new bytes. + +`MiniRacer::V8_CACHED_DATA_VERSION_TAG` exposes V8's +`ScriptCompiler::CachedDataVersionTag()` — mix it into your cache key alongside +the source hash so a libv8-node version bump invalidates stale blobs automatically. 
+The constant is populated on first `Context.new` (after `Platform.set_flags!`), +so read it after constructing at least one Context. + +```ruby +key = "#{Digest::SHA256.hexdigest(source)}-#{MiniRacer::V8_CACHED_DATA_VERSION_TAG}" +``` + +Notes: + +- A `Script` is bound to the `Context` that compiled it; reusing it on another + Context isn't supported. +- `Script#dispose` frees the underlying V8 handle eagerly. The Ruby GC finalizer + does not (taking the V8 lock from a finalizer thread risks deadlock), so + long-lived Contexts with many short-lived scripts accumulate handles until + `Context#dispose` clears them. +- `produce_cache: true` is only safe at the top level. From inside a host-fn + callback (i.e., re-entrant compile while a JS → Ruby → JS frame is on the + stack) it raises `MiniRacer::RuntimeError`, because V8's `CreateCodeCache` + walks live isolate state and corrupts the parser when re-entered. Warm the + cache from the top level once and pass it back via `cached_data:` from your + callbacks. +- Cross-process reuse is **incompatible with `MiniRacer::Platform.set_flags!(:single_threaded)`**. + V8's single-threaded mode embeds process-local state in the cache blob, so + every `cached_data` blob is rejected when consumed in a fresh process. Same-process + reuse still works under `:single_threaded`. If you need both cross-process + reuse and `:single_threaded` (e.g. for fork-safety reasons), disable + `:single_threaded` for the path that produces / consumes the cache. +- On TruffleRuby, `Script` is implemented as source replay (GraalJS has no + equivalent per-script bytecode cache reachable from `Polyglot::InnerContext`), + so the `cached_data:` and `produce_cache:` arguments are silently ignored, + `Script#cached_data` always returns `nil`, and `MiniRacer::V8_CACHED_DATA_VERSION_TAG` is `0`. + ### Fork Safety Some Ruby web servers employ forking (for example unicorn or puma in clustered mode).
V8 is not fork safe by default and sadly Ruby does not have support for fork notifications per [#5446](https://bugs.ruby-lang.org/issues/5446). diff --git a/ext/mini_racer_extension/mini_racer_extension.c b/ext/mini_racer_extension/mini_racer_extension.c index 496d011..b135df1 100644 --- a/ext/mini_racer_extension/mini_racer_extension.c +++ b/ext/mini_racer_extension/mini_racer_extension.c @@ -158,6 +158,20 @@ typedef struct Snapshot { VALUE blob; } Snapshot; +// GC-finalizer caveat: script_free cannot send a dispose RPC (would need +// to take rr_mtx without a reliable GVL guarantee). Handles freed here +// rely on State::~State() clearing st.scripts at isolate teardown — so +// long-lived Contexts with many short-lived Scripts accumulate V8 handles +// until the Context is disposed. Call Script#dispose explicitly to free +// eagerly (it erases the v8::Global, which Reset()s the handle). +typedef struct Script { + VALUE context; // parent Context VALUE (kept alive via mark) + VALUE cached_data; // ASCII-8BIT String or Qnil + int32_t handle_id; // 0 if uninitialized or already freed + int cache_rejected; + int disposed; +} Script; + static void context_destroy(Context *c); static void context_free(void *arg); static void context_mark(void *arg); @@ -185,6 +199,19 @@ static const rb_data_type_t snapshot_type = { }, }; +static void script_free(void *arg); +static void script_mark(void *arg); +static size_t script_size(const void *arg); + +static const rb_data_type_t script_type = { + .wrap_struct_name = "mini_racer/script", + .function = { + .dfree = script_free, + .dmark = script_mark, + .dsize = script_size, + }, +}; + static VALUE platform_init_error; static VALUE context_disposed_error; static VALUE parse_error; @@ -196,10 +223,15 @@ static VALUE snapshot_error; static VALUE terminated_error; static VALUE context_class; static VALUE snapshot_class; +static VALUE script_class; static VALUE date_time_class; static VALUE binary_class; static VALUE js_function_class; 
+static ID id_filename; +static ID id_cached_data; +static ID id_produce_cache; + static pthread_mutex_t flags_mtx = PTHREAD_MUTEX_INITIALIZER; static Buf flags; // protected by |flags_mtx| @@ -808,10 +840,13 @@ static void dispatch1(Context *c, const uint8_t *p, size_t n) switch (*p) { case 'A': return v8_attach(c->pst, p+1, n-1); case 'C': return v8_timedwait(c, p+1, n-1, v8_call); + case 'D': return v8_dispose_script(c->pst, p+1, n-1); case 'E': return v8_timedwait(c, p+1, n-1, v8_eval); case 'H': return v8_heap_snapshot(c->pst); + case 'K': return v8_timedwait(c, p+1, n-1, v8_compile); // (K)ompile — 'C' is taken case 'M': return v8_perform_microtask_checkpoint(c->pst); case 'P': return v8_pump_message_loop(c->pst); + case 'R': return v8_timedwait(c, p+1, n-1, v8_run); case 'S': return v8_heap_stats(c->pst); case 'T': return v8_snapshot(c->pst, p+1, n-1); case 'W': return v8_warmup(c->pst, p+1, n-1); @@ -1673,6 +1708,17 @@ static VALUE context_initialize(int argc, VALUE *argv, VALUE self) barrier_wait(&c->early_init); barrier_wait(&c->late_init); } + // Deferred to first Context.new so Platform.set_flags! still has effect + // on the tag (which depends on V8 flags applied during v8_global_init). 
+ { + static int version_tag_defined; + if (!version_tag_defined) { + VALUE m = rb_const_get(rb_cObject, rb_intern("MiniRacer")); + rb_define_const(m, "V8_CACHED_DATA_VERSION_TAG", + UINT2NUM(v8_cached_data_version_tag())); + version_tag_defined = 1; + } + } return Qnil; fail: rb_raise(runtime_error, "Context.initialize: %s: %s", cause, strerror(r)); @@ -1806,11 +1852,179 @@ static VALUE script_error_cause(VALUE self) return rb_iv_get(self, "@cause"); } +static VALUE context_compile(int argc, VALUE *argv, VALUE self) +{ + VALUE a, e, source, filename, cached_data, produce_cache, kwargs; + VALUE script_v, result; + Script *script; + Context *c; + Ser s; + + TypedData_Get_Struct(self, Context, &context_type, c); + rb_scan_args(argc, argv, "1:", &source, &kwargs); + Check_Type(source, T_STRING); + filename = Qnil; + cached_data = Qnil; + produce_cache = Qfalse; + if (!NIL_P(kwargs)) { + filename = rb_hash_aref(kwargs, ID2SYM(id_filename)); + cached_data = rb_hash_aref(kwargs, ID2SYM(id_cached_data)); + produce_cache = rb_hash_aref(kwargs, ID2SYM(id_produce_cache)); + } + if (NIL_P(filename)) + filename = rb_str_new_cstr(""); + Check_Type(filename, T_STRING); + if (!NIL_P(cached_data)) { + Check_Type(cached_data, T_STRING); + // Refuse non-binary encodings so a user reading a cache file without + // 'rb' mode gets a clear error instead of mangled bytes flowing to V8. 
+ if (rb_enc_get(cached_data) != rb_ascii8bit_encoding()) + rb_raise(rb_eEncodingError, + "cached_data must be ASCII-8BIT (binary), got %s", + rb_enc_name(rb_enc_get(cached_data))); + } + ser_init1(&s, 'K'); + ser_array_begin(&s, 4); + add_string(&s, filename); + add_string(&s, source); + if (NIL_P(cached_data)) { + ser_null(&s); + } else { + ser_uint8array(&s, (const uint8_t *)RSTRING_PTR(cached_data), + RSTRING_LEN(cached_data)); + } + ser_bool(&s, RTEST(produce_cache)); + ser_array_end(&s, 4); + a = rendezvous(c, &s.b); + e = rb_ary_pop(a); + handle_exception(e); + result = rb_ary_pop(a); + Check_Type(result, T_ARRAY); + + script_v = rb_obj_alloc(script_class); // skip the raising initialize + TypedData_Get_Struct(script_v, Script, &script_type, script); + script->context = self; + script->handle_id = NUM2INT(rb_ary_entry(result, 0)); + script->cached_data = rb_ary_entry(result, 1); + script->cache_rejected = RTEST(rb_ary_entry(result, 2)); + return script_v; +} + +static VALUE script_alloc(VALUE klass) +{ + Script *s; + + s = ruby_xmalloc(sizeof(*s)); + memset(s, 0, sizeof(*s)); + s->context = Qnil; + s->cached_data = Qnil; + return TypedData_Wrap_Struct(klass, &script_type, s); +} + +static void script_free(void *arg) +{ + // Intentionally does not send a dispose RPC — finalizers can't safely + // take rr_mtx. State::~State() walks st.scripts at isolate teardown so + // we leak nothing across a Context's lifetime; use Script#dispose to + // free eagerly mid-lifetime. 
+ ruby_xfree(arg); +} + +static void script_mark(void *arg) +{ + Script *s = arg; + rb_gc_mark(s->context); + rb_gc_mark(s->cached_data); +} + +static size_t script_size(const void *arg) +{ + const Script *s = arg; + size_t base = sizeof(*s); + if (!NIL_P(s->cached_data)) + base += RSTRING_LENINT(s->cached_data); + return base; +} + +static VALUE script_initialize(int argc, VALUE *argv, VALUE self) +{ + (void)argc; (void)argv; (void)self; + rb_raise(runtime_error, "MiniRacer::Script must be created via Context#compile"); + return Qnil; +} + +static VALUE script_run(VALUE self) +{ + VALUE a, e; + Script *script; + Context *c; + Ser s; + + TypedData_Get_Struct(self, Script, &script_type, script); + if (script->disposed) + rb_raise(runtime_error, "disposed script"); + TypedData_Get_Struct(script->context, Context, &context_type, c); + if (atomic_load(&c->quit)) + rb_raise(context_disposed_error, "disposed context"); + ser_init1(&s, 'R'); + ser_int(&s, script->handle_id); + a = rendezvous(c, &s.b); + e = rb_ary_pop(a); + handle_exception(e); + return rb_ary_pop(a); +} + +static VALUE script_cached_data(VALUE self) +{ + Script *script; + TypedData_Get_Struct(self, Script, &script_type, script); + return script->cached_data; +} + +static VALUE script_cache_rejected_p(VALUE self) +{ + Script *script; + TypedData_Get_Struct(self, Script, &script_type, script); + return script->cache_rejected ? Qtrue : Qfalse; +} + +static VALUE script_dispose(VALUE self) +{ + VALUE e; + Script *script; + Context *c; + Ser s; + + TypedData_Get_Struct(self, Script, &script_type, script); + if (script->disposed) return Qnil; + TypedData_Get_Struct(script->context, Context, &context_type, c); + script->disposed = 1; + // Context already gone? The handle was cleaned by State::~State(). 
+ if (atomic_load(&c->quit)) + return Qnil; + ser_init1(&s, 'D'); + ser_int(&s, script->handle_id); + e = rendezvous(c, &s.b); + handle_exception(e); + return Qnil; +} + +static VALUE script_disposed_p(VALUE self) +{ + Script *script; + TypedData_Get_Struct(self, Script, &script_type, script); + return script->disposed ? Qtrue : Qfalse; +} + __attribute__((visibility("default"))) void Init_mini_racer_extension(void) { VALUE c, m; + id_filename = rb_intern("filename"); + id_cached_data = rb_intern("cached_data"); + id_produce_cache = rb_intern("produce_cache"); + m = rb_define_module("MiniRacer"); c = rb_define_class_under(m, "Error", rb_eStandardError); snapshot_error = rb_define_class_under(m, "SnapshotError", c); @@ -1830,6 +2044,7 @@ void Init_mini_racer_extension(void) c = context_class = rb_define_class_under(m, "Context", rb_cObject); rb_define_method(c, "initialize", context_initialize, -1); rb_define_method(c, "attach", context_attach, 2); + rb_define_method(c, "compile", context_compile, -1); rb_define_method(c, "dispose", context_dispose, 0); rb_define_method(c, "stop", context_stop, 0); rb_define_method(c, "call", context_call, -1); @@ -1841,6 +2056,15 @@ void Init_mini_racer_extension(void) rb_define_method(c, "low_memory_notification", context_low_memory_notification, 0); rb_define_alloc_func(c, context_alloc); + c = script_class = rb_define_class_under(m, "Script", rb_cObject); + rb_define_method(c, "initialize", script_initialize, -1); + rb_define_method(c, "run", script_run, 0); + rb_define_method(c, "cached_data", script_cached_data, 0); + rb_define_method(c, "cache_rejected?", script_cache_rejected_p, 0); + rb_define_method(c, "dispose", script_dispose, 0); + rb_define_method(c, "disposed?", script_disposed_p, 0); + rb_define_alloc_func(c, script_alloc); + c = snapshot_class = rb_define_class_under(m, "Snapshot", rb_cObject); rb_define_method(c, "initialize", snapshot_initialize, -1); rb_define_method(c, "warmup!", snapshot_warmup, 1); diff --git 
a/ext/mini_racer_extension/mini_racer_v8.cc b/ext/mini_racer_extension/mini_racer_v8.cc index 591d5e2..04303c0 100644 --- a/ext/mini_racer_extension/mini_racer_v8.cc +++ b/ext/mini_racer_extension/mini_racer_v8.cc @@ -3,6 +3,7 @@ #include "libplatform/libplatform.h" #include "mini_racer_v8.h" #include +#include #include #include #include @@ -92,6 +93,19 @@ struct State int err_reason; bool verbose_exceptions; std::vector callbacks; + // v8::Global (not Persistent): Global's destructor Reset()s the handle, + // so erase()/clear() actually release the compiled script eagerly. + // Default-traits Persistent has kResetInDestructor=false — destroying it + // is a no-op that leaks the global handle until isolate->Dispose(), which + // would silently defeat Script#dispose. Cleared in ~State() under the + // still-live isolate so each Global can Reset() before isolate->Dispose(). + std::unordered_map> scripts; + int32_t next_script_id; + // Depth counter incremented while v8_api_callback is on the stack. + // CreateCodeCache walks live isolate state and corrupts the parser + // when invoked from within a JS->Ruby->JS frame; see compile()'s + // `produce_cache` handling. + int in_callback; std::unique_ptr allocator; inline ~State(); }; @@ -380,6 +394,12 @@ void v8_api_callback(const v8::FunctionCallbackInfo& info) auto ext = v8::External::Cast(*info.Data()); Callback *cb = static_cast(ext->Value()); State& st = *cb->st; + // RAII counter so re-entrant compile() can refuse CreateCodeCache. 
+ struct CallbackGuard { + State &st; + CallbackGuard(State &s) : st(s) { st.in_callback++; } + ~CallbackGuard() { st.in_callback--; } + } _guard(st); v8::Local request; { v8::Context::Scope context_scope(st.safe_context); @@ -606,6 +626,202 @@ extern "C" void v8_eval(State *pst, const uint8_t *p, size_t n) } } +// request: [filename, source, cached_data|null, produce_cache:Bool] +// response: errback [[handle_id:Int32, cached_data:ArrayBuffer|null, rejected:Bool], err] +// +// CreateCodeCache walks live isolate state in a way that corrupts the parser +// when called from inside a v8_api_callback frame (re-entrant compile from +// host fn). Callers must opt in via produce_cache and only do so from the +// top level; we raise from re-entrant context rather than silently skipping +// so misuse is caught immediately. +extern "C" void v8_compile(State *pst, const uint8_t *p, size_t n) +{ + State& st = *pst; + v8::TryCatch try_catch(st.isolate); + try_catch.SetVerbose(st.verbose_exceptions); + v8::HandleScope handle_scope(st.isolate); + v8::ValueDeserializer des(st.isolate, p, n); + des.ReadHeader(st.context).Check(); + v8::Local<v8::Array> result; + int cause = INTERNAL_ERROR; + { + v8::Local<v8::Value> request_v; + if (!des.ReadValue(st.context).ToLocal(&request_v)) goto fail; + v8::Local<v8::Object> request; + if (!request_v->ToObject(st.context).ToLocal(&request)) goto fail; + v8::Local<v8::Value> filename; + if (!request->Get(st.context, 0).ToLocal(&filename)) goto fail; + v8::Local<v8::Value> source_v; + if (!request->Get(st.context, 1).ToLocal(&source_v)) goto fail; + v8::Local<v8::Value> cached_v; + if (!request->Get(st.context, 2).ToLocal(&cached_v)) goto fail; + v8::Local<v8::Value> produce_v; + if (!request->Get(st.context, 3).ToLocal(&produce_v)) goto fail; + bool produce_cache = produce_v->BooleanValue(st.isolate); + v8::Local<v8::String> source; + if (!source_v->ToString(st.context).ToLocal(&source)) goto fail; + + if (produce_cache && st.in_callback > 0) { + cause = RUNTIME_ERROR; + auto msg = v8::String::NewFromUtf8Literal(st.isolate,
"produce_cache: true is unsafe inside a host-function callback " + "(V8 CreateCodeCache corrupts parser state when re-entered); " + "compile with produce_cache from the top level instead"); + st.isolate->ThrowException(v8::Exception::Error(msg)); + goto fail; + } + + // ser_uint8array on the Ruby side wraps the bytes in an ArrayBuffer + + // Uint8Array view. The view's backing bytes are valid for the whole + // v8_compile call, so BufferNotOwned avoids a copy — the CachedData + // destructor (run when source_obj goes out of scope) leaves them alone. + v8::ScriptCompiler::CachedData *cached_in = nullptr; + if (cached_v->IsArrayBufferView()) { + auto view = cached_v.As(); + int len = static_cast(view->ByteLength()); + if (len > 0) { + auto store = view->Buffer()->GetBackingStore(); + auto bytes = static_cast(store->Data()) + view->ByteOffset(); + cached_in = new v8::ScriptCompiler::CachedData( + bytes, len, v8::ScriptCompiler::CachedData::BufferNotOwned); + } + } + + v8::ScriptOrigin origin(filename); + v8::ScriptCompiler::Source source_obj(source, origin, cached_in); + auto options = cached_in ? v8::ScriptCompiler::kConsumeCodeCache + : v8::ScriptCompiler::kNoCompileOptions; + v8::Local script; + cause = PARSE_ERROR; + if (!v8::ScriptCompiler::Compile(st.context, &source_obj, options) + .ToLocal(&script)) goto fail; + cause = INTERNAL_ERROR; + + bool rejected = (cached_in && source_obj.GetCachedData()->rejected); + v8::Local cache_value = v8::Null(st.isolate); + if (produce_cache && (!cached_in || rejected)) { + std::unique_ptr blob( + v8::ScriptCompiler::CreateCodeCache(script->GetUnboundScript())); + if (blob && blob->length > 0) { + auto backing = v8::ArrayBuffer::NewBackingStore(st.isolate, blob->length); + memcpy(backing->Data(), blob->data, blob->length); + cache_value = v8::ArrayBuffer::New(st.isolate, std::move(backing)); + } + } + + // Ids are monotonic and serialized as Int32 on the wire. 
Refuse to + // wrap at INT32_MAX rather than invoke signed-overflow UB / risk + // aliasing a still-live id (unreachable in practice — each undisposed + // script pins a handle, so the isolate OOMs long before 2^31). + if (st.next_script_id == INT32_MAX) { + cause = INTERNAL_ERROR; + auto msg = v8::String::NewFromUtf8Literal(st.isolate, + "script id space exhausted for this Context"); + st.isolate->ThrowException(v8::Exception::Error(msg)); + goto fail; + } + int32_t id = ++st.next_script_id; + + { + v8::Context::Scope context_scope(st.safe_context); + result = v8::Array::New(st.isolate, 3); + } + // Populate via the goto-fail idiom, not .Check(): v8_compile runs + // under the watchdog ('K' → v8_timedwait), so a timeout can leave the + // isolate terminating here, making Set() return Nothing — .Check() + // would abort the process. The fail path cancels termination and + // replies a proper TERMINATED_ERROR errback instead. + if (!result->Set(st.context, 0, v8::Int32::New(st.isolate, id)).FromMaybe(false)) goto fail; + if (!result->Set(st.context, 1, cache_value).FromMaybe(false)) goto fail; + if (!result->Set(st.context, 2, v8::Boolean::New(st.isolate, rejected)).FromMaybe(false)) goto fail; + + // Register the handle only after the reply array is fully built. If a + // Set above bailed (e.g. watchdog termination), the Ruby side gets an + // error and never learns the id, so it could never erase the entry — + // inserting earlier would orphan an undisposable handle until Context + // teardown. + st.scripts[id].Reset(st.isolate, script); + } + cause = NO_ERROR; +fail: + if (st.isolate->IsExecutionTerminating()) { + st.isolate->CancelTerminateExecution(); + cause = st.err_reason ? st.err_reason : TERMINATED_ERROR; + st.err_reason = NO_ERROR; + } + if (bubble_up_ruby_exception(st, &try_catch)) return; + if (!cause && try_catch.HasCaught()) cause = RUNTIME_ERROR; + v8::Local<v8::Value> result_v = result.IsEmpty() + ?
static_cast<v8::Local<v8::Value>>(v8::Undefined(st.isolate)) + : static_cast<v8::Local<v8::Value>>(result); + auto err = to_error(st, &try_catch, cause); + if (!reply(st, result_v, err)) { + assert(try_catch.HasCaught()); + goto fail; + } +} + +extern "C" void v8_run(State *pst, const uint8_t *p, size_t n) +{ + State& st = *pst; + v8::TryCatch try_catch(st.isolate); + try_catch.SetVerbose(st.verbose_exceptions); + v8::HandleScope handle_scope(st.isolate); + v8::ValueDeserializer des(st.isolate, p, n); + des.ReadHeader(st.context).Check(); + v8::Local<v8::Value> result; + int cause = INTERNAL_ERROR; + { + v8::Local<v8::Value> id_v; + if (!des.ReadValue(st.context).ToLocal(&id_v)) goto fail; + int32_t id; + if (!id_v->Int32Value(st.context).To(&id)) goto fail; + auto it = st.scripts.find(id); + if (it == st.scripts.end()) { + cause = RUNTIME_ERROR; + auto msg = v8::String::NewFromUtf8Literal(st.isolate, "no such script handle"); + st.isolate->ThrowException(v8::Exception::Error(msg)); + goto fail; + } + auto script = v8::Local<v8::Script>::New(st.isolate, it->second); + v8::Local<v8::Value> result_v; + cause = RUNTIME_ERROR; + if (!script->Run(st.context).ToLocal(&result_v)) goto fail; + result = sanitize(st, result_v); + } + cause = NO_ERROR; +fail: + if (st.isolate->IsExecutionTerminating()) { + st.isolate->CancelTerminateExecution(); + cause = st.err_reason ? st.err_reason : TERMINATED_ERROR; + st.err_reason = NO_ERROR; + } + if (bubble_up_ruby_exception(st, &try_catch)) return; + if (!cause && try_catch.HasCaught()) cause = RUNTIME_ERROR; + if (cause) result = v8::Undefined(st.isolate); + auto err = to_error(st, &try_catch, cause); + if (!reply(st, result, err)) { + assert(try_catch.HasCaught()); + goto fail; + } +} + +// Unknown ids are silently ignored — Ruby-side Script#dispose is idempotent.
+extern "C" void v8_dispose_script(State *pst, const uint8_t *p, size_t n) +{ + State& st = *pst; + v8::HandleScope handle_scope(st.isolate); + v8::ValueDeserializer des(st.isolate, p, n); + des.ReadHeader(st.context).Check(); + v8::Local<v8::Value> id_v; + if (des.ReadValue(st.context).ToLocal(&id_v)) { + int32_t id; + if (id_v->Int32Value(st.context).To(&id)) + st.scripts.erase(id); + } + reply_retry(st, v8::String::Empty(st.isolate)); +} + extern "C" void v8_heap_stats(State *pst) { State& st = *pst; @@ -924,6 +1140,11 @@ extern "C" void v8_single_threaded_dispose(struct State *pst) delete pst; // see State::~State() below } +extern "C" uint32_t v8_cached_data_version_tag(void) +{ + return v8::ScriptCompiler::CachedDataVersionTag(); +} + } // namespace anonymous State::~State() @@ -931,6 +1152,7 @@ State::~State() { v8::Locker locker(isolate); v8::Isolate::Scope isolate_scope(isolate); + scripts.clear(); persistent_safe_context.Reset(); persistent_context.Reset(); ruby_exception.Reset(); diff --git a/ext/mini_racer_extension/mini_racer_v8.h b/ext/mini_racer_extension/mini_racer_v8.h index 57f12fb..631a604 100644 --- a/ext/mini_racer_extension/mini_racer_v8.h +++ b/ext/mini_racer_extension/mini_racer_v8.h @@ -39,7 +39,10 @@ struct State *v8_thread_init(struct Context *c, const uint8_t *snapshot_buf, int verbose_exceptions); // calls v8_thread_main void v8_attach(struct State *pst, const uint8_t *p, size_t n); void v8_call(struct State *pst, const uint8_t *p, size_t n); +void v8_compile(struct State *pst, const uint8_t *p, size_t n); +void v8_dispose_script(struct State *pst, const uint8_t *p, size_t n); void v8_eval(struct State *pst, const uint8_t *p, size_t n); +void v8_run(struct State *pst, const uint8_t *p, size_t n); void v8_heap_stats(struct State *pst); void v8_heap_snapshot(struct State *pst); void v8_perform_microtask_checkpoint(struct State *pst); @@ -50,6 +53,7 @@ void v8_low_memory_notification(struct State *pst); void v8_terminate_execution(struct State *pst);
// called from ruby or watchdog thread void v8_single_threaded_enter(struct State *pst, struct Context *c, void (*f)(struct Context *c)); void v8_single_threaded_dispose(struct State *pst); +uint32_t v8_cached_data_version_tag(void); // safe to call after v8_global_init #ifdef __cplusplus } diff --git a/lib/mini_racer/truffleruby.rb b/lib/mini_racer/truffleruby.rb index 8a40048..37aa4cc 100644 --- a/lib/mini_racer/truffleruby.rb +++ b/lib/mini_racer/truffleruby.rb @@ -3,6 +3,10 @@ require_relative 'shared' module MiniRacer + # GraalJS has no equivalent of V8's per-script bytecode cache reachable + # from Polyglot::InnerContext#eval, so the version tag is meaningless + # here. Define 0 as a sentinel callers can detect to skip cache logic. + V8_CACHED_DATA_VERSION_TAG = 0 class Context @@ -388,4 +392,50 @@ def warmup_unsafe!(src) self end end + + # GraalJS has no per-script bytecode cache reachable from + # Polyglot::InnerContext#eval, so cached_data: is silently ignored and + # Script#run replays the source through Context#eval. + class Context + def compile(source, filename: nil, cached_data: nil, produce_cache: false) + raise(ContextDisposedError, 'attempted to call compile on a disposed context!') if @disposed + raise TypeError, "wrong type argument #{source.class} (should be a string)" unless source.is_a?(String) + raise TypeError, "wrong type argument #{filename.class} (should be a string)" unless filename.nil? || filename.is_a?(String) + if cached_data + raise TypeError, "wrong type argument #{cached_data.class} (should be a string)" unless cached_data.is_a?(String) + raise EncodingError, "cached_data must be ASCII-8BIT (binary), got #{cached_data.encoding}" if cached_data.encoding != Encoding::ASCII_8BIT + end + # produce_cache is accepted for API parity but has no effect — the shim + # has no per-script bytecode cache to produce. 
+ Script.send(:new, self, source, filename) + end + end + + class Script + private_class_method :new + + def initialize(ctx, source, filename) + @ctx = ctx + @source = source + @filename = filename + @disposed = false + end + + def run + raise MiniRacer::RuntimeError, 'disposed script' if @disposed + @ctx.eval(@source, filename: @filename) # raises ContextDisposedError if @ctx is disposed + end + + def cached_data; nil; end + def cache_rejected?; false; end + + def dispose + @disposed = true + nil + end + + def disposed? + @disposed + end + end end diff --git a/test/mini_racer_test.rb b/test/mini_racer_test.rb index a9ffe38..fdcf706 100644 --- a/test/mini_racer_test.rb +++ b/test/mini_racer_test.rb @@ -1275,4 +1275,223 @@ def test_exception_message_encoding assert e assert_equal(e.message.encoding.to_s, "UTF-8") end + + def test_v8_cached_data_version_tag + # Triggers v8_once_init which is when the constant is populated. + MiniRacer::Context.new + + assert_kind_of Integer, MiniRacer::V8_CACHED_DATA_VERSION_TAG + if RUBY_ENGINE == "truffleruby" + assert_equal 0, MiniRacer::V8_CACHED_DATA_VERSION_TAG + else + refute_equal 0, MiniRacer::V8_CACHED_DATA_VERSION_TAG + end + # Stable across calls. + assert_equal MiniRacer::V8_CACHED_DATA_VERSION_TAG, MiniRacer::V8_CACHED_DATA_VERSION_TAG + end + + def test_compile_run_roundtrip + ctx = MiniRacer::Context.new + script = ctx.compile("1 + 2 + 3") + assert_kind_of MiniRacer::Script, script + assert_equal 6, script.run + assert_equal 6, script.run # idempotent + end + + def test_compile_filename_in_parse_error + # The TruffleRuby shim is source-replay: it does not parse at compile time + # (Polyglot::InnerContext#eval has no parse-only mode), so a syntax error + # surfaces from Script#run instead, not from compile. 
+ skip "TruffleRuby shim defers parsing to run" if RUBY_ENGINE == "truffleruby" + err = assert_raises(MiniRacer::ParseError) do + MiniRacer::Context.new.compile("function foo(", filename: "bundle.js") + end + assert_includes err.message, "bundle.js" + end + + def test_compile_invalid_source + skip "TruffleRuby shim defers parsing to run" if RUBY_ENGINE == "truffleruby" + assert_raises(MiniRacer::ParseError) do + MiniRacer::Context.new.compile("foo bar baz garbage") + end + end + + def test_compile_runtime_error + ctx = MiniRacer::Context.new + script = ctx.compile("throw new Error('boom')") + err = assert_raises(MiniRacer::RuntimeError) do + script.run + end + assert_includes err.message, "boom" + end + + def test_compile_cached_data_save_restore + skip "TruffleRuby has no equivalent caching API" if RUBY_ENGINE == "truffleruby" + + src = "function sq(x) { return x * x } sq(7)" + ctx_a = MiniRacer::Context.new + s_a = ctx_a.compile(src, filename: "sq.js", produce_cache: true) + blob = s_a.cached_data + assert_kind_of String, blob + assert_equal Encoding::ASCII_8BIT, blob.encoding + assert_operator blob.bytesize, :>, 0 + refute_predicate s_a, :cache_rejected? + assert_equal 49, s_a.run + ctx_a.dispose + + ctx_b = MiniRacer::Context.new + s_b = ctx_b.compile(src, filename: "sq.js", cached_data: blob) + refute_predicate s_b, :cache_rejected? + assert_nil s_b.cached_data, "accepted blob → nil so caller skips redundant persist" + assert_equal 49, s_b.run + end + + def test_compile_cached_data_rejection + skip "TruffleRuby has no equivalent caching API" if RUBY_ENGINE == "truffleruby" + + src = "function sq(x) { return x * x } sq(7)" + corrupt = ("garbage" * 100).b + ctx = MiniRacer::Context.new + script = ctx.compile(src, cached_data: corrupt, produce_cache: true) + assert_predicate script, :cache_rejected? 
+ fresh = script.cached_data + assert_kind_of String, fresh + assert_operator fresh.bytesize, :>, 0 + assert_equal 49, script.run + end + + def test_compile_default_skips_cache_production + skip "TruffleRuby has no equivalent caching API" if RUBY_ENGINE == "truffleruby" + + ctx = MiniRacer::Context.new + script = ctx.compile("1 + 1", filename: "no_cache.js") + assert_nil script.cached_data + refute_predicate script, :cache_rejected? + assert_equal 2, script.run + end + + def test_compile_produce_cache_inside_host_fn_raises + skip "TruffleRuby has no equivalent caching API" if RUBY_ENGINE == "truffleruby" + + ctx = MiniRacer::Context.new + caught = nil + ctx.attach("trigger", lambda { + begin + ctx.compile("var v = 1; v", filename: "inside.js", produce_cache: true) + rescue MiniRacer::RuntimeError => e + caught = e + end + nil + }) + ctx.eval("trigger()") + refute_nil caught, "produce_cache: true from a host-fn callback should raise" + assert_includes caught.message, "host-function callback" + end + + def test_compile_inside_host_fn_default_is_safe + skip "TruffleRuby has no equivalent caching API" if RUBY_ENGINE == "truffleruby" + + # Without produce_cache, repeated re-entrant compiles must not crash — + # this is the path Discourse-style embedders take from their inline-script + # host fns. cached_data stays nil because we skip CreateCodeCache in + # callback frames; the user can warm the cache via top-level compile + # calls with produce_cache: true at startup instead. 
+ ctx = MiniRacer::Context.new + ctx.attach("run_inline", lambda {|label, body| + script = ctx.compile(body, filename: label) + script.run + script.dispose + nil + }) + 3.times do |i| + ctx.eval("run_inline('inline://#{i}.js', 'var x = #{i}; x + 1;')") + end + end + + def test_compile_cached_data_must_be_binary + ctx = MiniRacer::Context.new + assert_raises(EncodingError) do + ctx.compile("1+1", cached_data: "not binary".encode("UTF-8")) + end + end + + def test_compile_cached_data_type_error + ctx = MiniRacer::Context.new + assert_raises(TypeError) do + ctx.compile("1+1", cached_data: 42) + end + end + + def test_script_dispose_idempotent + ctx = MiniRacer::Context.new + script = ctx.compile("1 + 1") + assert_equal 2, script.run + refute_predicate script, :disposed? + script.dispose + assert_predicate script, :disposed? + assert_nil script.dispose + assert_raises(MiniRacer::RuntimeError) { script.run } + end + + def test_script_dispose_releases_v8_handles_eagerly + # Each compiled Script pins a v8::Global. Disposing must + # release it eagerly: if the handle were a default-traits v8::Persistent + # (whose destructor does NOT Reset), erase() would leak it until the + # Context is disposed and disposed_growth would track retained_growth. + skip "TruffleRuby has no equivalent caching API" if RUBY_ENGINE == "truffleruby" + ctx = MiniRacer::Context.new + + base = ctx.heap_stats[:used_global_handles_size] + 300.times do |i| + ctx.compile("var x#{i} = #{i}; x#{i}", filename: "d#{i}.js").tap(&:run).dispose + end + ctx.low_memory_notification + disposed_growth = ctx.heap_stats[:used_global_handles_size] - base + + retained = [] + base2 = ctx.heap_stats[:used_global_handles_size] + 300.times do |i| + s = ctx.compile("var y#{i} = #{i}; y#{i}", filename: "r#{i}.js") + s.run + retained << s # keep alive so the global handles stay live + end + ctx.low_memory_notification + retained_growth = ctx.heap_stats[:used_global_handles_size] - base2 + + # 300 retained Global