Skip to content

Region GC#105

Open
kulisak12 wants to merge 15 commits into
fxpl:regions-mainfrom
kulisak12:regions-gc
Open

Region GC#105
kulisak12 wants to merge 15 commits into
fxpl:regions-mainfrom
kulisak12:regions-gc

Conversation

@kulisak12

Copy link
Copy Markdown

Please check the last commit in particular, I still don't understand the write barrier enough 😅

Comment thread Python/gc.c

/* Create artificial local references to the bridges. */
static int
region_list_add_local_refs(_PyRegionObject *root) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks correct. Where do you remove this RC and LRC that you've added here? I can't see it in this commit

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xFrednet

Copy link
Copy Markdown
Collaborator

For now, I only checked the last commit (based) on the description. I plan to look at the rest later.

The fact that this PR is this small is super impressive and a testament for how well you understood the whole thing and wired it up! I'm looking forward to reviewing it :D

@xFrednet xFrednet left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, the code is super readable and concise. I'm pretty confident that I can modify it or that we can give this to someone else to test different scheduling mechanics. Very well done, thank you for all the work you put into this!

I have some comments, but mainly some clarifying questions. I didn't find any logic bugs :D

Comment thread Objects/cownobject.c
cown_set_value_unchecked(self, Py_None);
// Should be able to release now.
int res = cown_release(self, this_ip);
assert(res == 0);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This produces a warning on non debug builds. It can be suppressed by adding (void)res in addition to the assert:

Suggested change
assert(res == 0);
assert(res == 0);
(void)res;

Comment thread Objects/cownobject.c
Comment on lines +488 to +489
PyThreadState *tstate = PyThreadState_Get();
if (!_PyGC_CanRunRegionGC(tstate)) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs of PyThreadState_Get(); state that it can return NULL. But this code doesn't null check. Does it expect _PyGC_CanRunRegionGC to do the NULL check? If so, can you add a comment about this?

Comment thread Objects/cownobject.c

// Lock without blocking, we only want the cown if it is released.
int res = cown_lock(self, NO_BLOCKING_TIMEOUT, this_ip, true);
if (res != COWN_ACQUIRE_SUCCESS) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this check for COWN_ACQUIRE_ERROR and in that case surface the error to the caller? This is being called from CownObject_release which can still throw an exception.

If you want to explicitly ignore errors, can you add a comment about why this is safe?

Comment thread Python/gc.c
Comment on lines +1229 to 1240
// Regions: No barrier needed.
// If called from the local GC, op is local.
// If called from the region GC, the region is already open.
Py_INCREF(op);
// TODO(Immutable): This is only required until we have the SCC support working.
_Py_CLEAR_IMMUTABLE(op);
(void) clear(op);
if (_PyErr_Occurred(tstate)) {
PyErr_FormatUnraisable("Exception ignored in tp_clear of %s",
Py_TYPE(op)->tp_name);
}
Py_DECREF(op);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm considering if this is correct. If the clear code or a deallocator triggered by the clear code, triggers a region clean on the owner, it will include this RC in the LRC. The DECREF in line 1240 would only dec the RC and not the LRC.

So, I think this needs a write barrier to be safe. (I believe we talked about this previously, but I didn't think about this case until now.)

The barrier would look something like this:

+   // Regions: This barrier should always succeeds since:
    // If called from the local GC, op is local.
    // If called from the region GC, the region is already open.
+   int res = PyRegion_AddLocalRef(op);
+   assert(res == 0)
+   (void)res;
    Py_INCREF(op);
    // TODO(Immutable): This is only required until we have the SCC support working.
    _Py_CLEAR_IMMUTABLE(op);
    (void) clear(op);
    if (_PyErr_Occurred(tstate)) {
        PyErr_FormatUnraisable("Exception ignored in tp_clear of %s",
            Py_TYPE(op)->tp_name);
    }
+   PyRegion_RemoveLocalRef(op;)
    Py_DECREF(op);

Comment thread Python/gc.c
Comment on lines +1979 to +1985
/* Separate child regions from contained objects.
* Finalizers need them to be in the GC list, at the start of the list.
*/
PyGC_Head contained;
gc_list_init(&contained);
gc_region_list_split(&data->gc_list, &contained);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With all of these additional splits and joins I see why you asked if we can split the contained object and child-region lists. For now, I want to keep this structure since it works, but this should definitely be on the "benchmark this" list for when the implementation is complete-ish.

The implementation thus far is super readable, so this change should be easy to in cooperate into your implementation.

Comment thread Python/gc.c
*/
gcstate->region_budget += 3;
if (gcstate->region_budget > gcstate->young.threshold) {
gcstate->region_budget = gcstate->young.threshold;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is one of the places we can modify to adjust the scheduling?

For a steady state heap, the amount of work to do is three times

Where does the "three times" come from, is this a different comment in this file or some benchmarking?

Comment thread Python/gc.c
Comment on lines +2839 to +2840
// TODO(regions): If callback moves op into a region,
// this function would start iterating objects in the region.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this TODO is if op is local?

We can later mark the object as unmovable, when we support this on a per-instance level :)

Comment on lines +45 to +48
def setUp(self):
# Need to run collection multiple times to clean up region chains
while gc.collect() > 0:
pass

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love the fact that you need this setup function :D

Comment on lines +50 to +51
def tearDown(self):
gc.disable()

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This confused me a bit, could you add a comment that the GC is enabled again once all tests in the module are done by tearDownModule

Comment on lines -17 to +60
def test_owned_cycles_are_ignored(self):
def build_region_with_unreachable_cycle(self):

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to see that my old broken test got replaced :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants