MDEV-37009 bulk insert into vector index#5150
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces bulk insert support for hierarchical navigable small world (mhnsw) indexes during table copy operations. The reviewer identified several critical issues, including compilation errors from accessing bulk_insert_active through the wrong pointer, an uninitialized member variable, and an ignored return value during bulk insert initialization. Suggestions were also made to reset the active flag on failure and correct formatting to match the repository's style.
| DBUG_ASSERT(s->hlindexes() == (hlindex != NULL)); | ||
| if (hlindex && hlindex->in_use) | ||
| { | ||
| if (hlindex->bulk_insert_active) | ||
| return mhnsw_bulk_insert_row(this, key_info + s->keys); | ||
| else | ||
| return mhnsw_insert(this, key_info + s->keys); | ||
| } | ||
| return 0; |
There was a problem hiding this comment.
The member bulk_insert_active was added to the TABLE struct in sql/table.h, but here it is accessed via hlindex->bulk_insert_active. Since hlindex is not of type TABLE*, this will cause a compilation error. It should be accessed directly as bulk_insert_active (or this->bulk_insert_active).
Additionally, please update the indentation to 2 spaces to match the repository's implicit style guide.
DBUG_ASSERT(s->hlindexes() == (hlindex != NULL));
if (hlindex && hlindex->in_use)
{
if (bulk_insert_active)
return mhnsw_bulk_insert_row(this, key_info + s->keys);
else
return mhnsw_insert(this, key_info + s->keys);
}
return 0;| int TABLE::hlindexes_bulk_insert_begin(ha_rows rows) | ||
| { | ||
| if (hlindex && hlindex->in_use) | ||
| { | ||
| hlindex->bulk_insert_active= true; | ||
| return mhnsw_bulk_insert_begin(this, key_info + s->keys, rows); | ||
| } | ||
| return 0; | ||
| } | ||
|
|
||
| int TABLE::hlindexes_bulk_insert_end() | ||
| { | ||
| if (hlindex && hlindex->in_use) | ||
| { | ||
| hlindex->bulk_insert_active= false; | ||
| return mhnsw_bulk_insert_end(this, key_info + s->keys); | ||
| } | ||
| return 0; | ||
| } |
There was a problem hiding this comment.
There are three issues here:
bulk_insert_activeis a member ofTABLE, nothlindex, so accessing it viahlindex->bulk_insert_activewill cause a compilation error.- If
mhnsw_bulk_insert_beginfails (returns a non-zero error code),bulk_insert_activeshould be reset tofalseso that subsequent inserts do not incorrectly attempt bulk insertion. - The indentation should be updated to 2 spaces to match the repository's implicit style guide.
int TABLE::hlindexes_bulk_insert_begin(ha_rows rows)
{
if (hlindex && hlindex->in_use)
{
bulk_insert_active= true;
int err= mhnsw_bulk_insert_begin(this, key_info + s->keys, rows);
if (err)
bulk_insert_active= false;
return err;
}
return 0;
}
int TABLE::hlindexes_bulk_insert_end()
{
if (hlindex && hlindex->in_use)
{
bulk_insert_active= false;
return mhnsw_bulk_insert_end(this, key_info + s->keys);
}
return 0;
}| if (!to->s->long_unique_table) | ||
| { | ||
| to->file->ha_start_bulk_insert(from->file->stats.records, | ||
| ignore ? 0 : HA_CREATE_UNIQUE_INDEX_BY_SORT); | ||
| bulk_insert_started= 1; | ||
| to->file->ha_start_bulk_insert(from->file->stats.records, | ||
| ignore ? 0 : HA_CREATE_UNIQUE_INDEX_BY_SORT); | ||
| bulk_insert_started= 1; | ||
|
|
||
| if (to->s->hlindexes()) | ||
| { | ||
| to->hlindexes_bulk_insert_begin(from->file->stats.records); | ||
| hlindex_bulk_started= 1; | ||
| } | ||
| } |
There was a problem hiding this comment.
The return value of to->hlindexes_bulk_insert_begin is currently ignored. If the bulk insert initialization fails, we should handle the error, print it, and avoid setting hlindex_bulk_started to 1 so that we do not proceed with the bulk insert or attempt to end it later.
Additionally, please update the indentation to 2 spaces to match the repository's implicit style guide.
if (!to->s->long_unique_table)
{
to->file->ha_start_bulk_insert(from->file->stats.records,
ignore ? 0 : HA_CREATE_UNIQUE_INDEX_BY_SORT);
bulk_insert_started= 1;
if (to->s->hlindexes())
{
if ((error= to->hlindexes_bulk_insert_begin(from->file->stats.records)))
{
to->file->print_error(error, MYF(0));
}
else
{
hlindex_bulk_started= 1;
}
}
}| if (hlindex_bulk_started && to->hlindexes_bulk_insert_end() && error <= 0) | ||
| { | ||
| if (!thd->is_error()) | ||
| to->file->print_error(my_errno, MYF(0)); | ||
| error= 1; | ||
| } | ||
| hlindex_bulk_started=0; |
There was a problem hiding this comment.
Please update the indentation to 2 spaces and add proper spacing around operators (e.g., hlindex_bulk_started = 0;) to match the repository's implicit style guide.
if (hlindex_bulk_started && to->hlindexes_bulk_insert_end() && error <= 0)
{
if (!thd->is_error())
to->file->print_error(my_errno, MYF(0));
error= 1;
}
hlindex_bulk_started= 0;| */ | ||
| bool alias_name_used; /* true if table_name is alias */ | ||
| bool get_fields_in_item_tree; /* Signal to fix_field */ | ||
| bool bulk_insert_active; /* mhnsw bulk_insert_started flag */ |
There was a problem hiding this comment.
Add support for bulk insertion into MHNSW vector indexes during ALTER TABLE