fix type in featureselection converters by Felipedino · Pull Request #744 · DashAISoftware/dashAI

Felipedino · 2026-06-29T00:01:45Z

Problem

The feature selection converters (SelectKBest, SelectPercentile,
SelectFdr, SelectFpr, SelectFwe, GenericUnivariateSelect and
VarianceThreshold) hardcoded their get_output_type to always return
Float (float64). Since these converters only drop columns and never modify
the values of the retained ones, the declared type was wrong: an integer column
was reported as float64 even though the underlying data stayed integer.

Solution

Preserve each retained column's original type:

- Added to the base class FeatureSelectionConverter a fit that remembers the
  input types and a get_output_type that returns the original type per column
  (falling back to float64 only when the type is unknown). This covers the 6
  scikit-learn selectors.
- Removed the duplicated get_output_type (and the now-unused DashAIDataType
  import) from the 6 selector files, which now inherit the behavior.
- Applied the same fix to VarianceThreshold (same bug, same "only drops
  columns" nature).
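
As a rough illustration of the pattern described above, the sketch below records input types in fit and looks them up per column in get_output_type. It is not the DashAI code: the class name FeatureSelectionSketch, the use of plain strings in place of DashAIDataType objects, the dictionary passed to fit, and the get_output_type signature are all assumptions made for the example.

```python
from typing import Dict, Optional

# Assumption: a plain string stands in for DashAI's Float (float64) type object.
FALLBACK_TYPE = "float64"


class FeatureSelectionSketch:
    """Hypothetical stand-in for the FeatureSelectionConverter base class."""

    def __init__(self) -> None:
        self._input_types: Dict[str, str] = {}

    def fit(self, column_types: Dict[str, str]) -> "FeatureSelectionSketch":
        # Remember the type of every input column before any selection happens.
        self._input_types = dict(column_types)
        return self

    def get_output_type(self, column_name: Optional[str] = None) -> str:
        # Return the original type of the column; fall back to float64
        # only when the column's type was never recorded.
        return self._input_types.get(column_name, FALLBACK_TYPE)


converter = FeatureSelectionSketch().fit({"age": "int64", "score": "float64"})
age_type = converter.get_output_type("age")
unknown_type = converter.get_output_type("not_seen")
```

Because the lookup lives in the base class, a selector subclass in this sketch would not need its own get_output_type, which mirrors the removal of the duplicated methods.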

Key detail

Types are captured in fit, not in transform: scikit-learn
(_SetOutputMixin.__init_subclass__) automatically wraps any transform
defined on a subclass of a sklearn transformer, and the wrapper would coerce the
output back into a pandas DataFrame. fit is never wrapped, so it is the safe
place to record the types, and it always runs before transform.
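
The wrapping behaviour can be imitated in a few lines of plain Python. The sketch below is a simplified stand-in, not scikit-learn's implementation: WrapsTransformMixin, TypedTable and DropColumns are hypothetical names, and the "coercion" is modelled as converting the result to a plain dict, which discards the type metadata much as a container conversion would.

```python
import functools


class TypedTable(dict):
    """Toy table: column name -> values, plus a per-column type mapping."""

    def __init__(self, columns, types):
        super().__init__(columns)
        self.types = dict(types)


class WrapsTransformMixin:
    """Simplified imitation of a hook that wraps transform on subclasses."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Only a transform defined directly on the subclass is wrapped;
        # fit is left untouched.
        if "transform" in cls.__dict__:
            original = cls.__dict__["transform"]

            @functools.wraps(original)
            def wrapped(self, table):
                result = original(self, table)
                # Stand-in for output coercion: a plain dict loses .types.
                return dict(result)

            cls.transform = wrapped


class DropColumns(WrapsTransformMixin):
    def fit(self, table):
        # Safe place to capture types: this method is never wrapped.
        self.input_types_ = dict(table.types)
        return self

    def transform(self, table):
        kept = {name: values for name, values in table.items() if name != "constant"}
        return TypedTable(kept, {name: table.types[name] for name in kept})


table = TypedTable({"a": [1, 2], "constant": [0, 0]}, {"a": "int64", "constant": "int64"})
converter = DropColumns().fit(table)
output = converter.transform(table)
```

In this toy version the TypedTable built inside transform never reaches the caller with its types attached, while the mapping stored by fit is still available afterwards.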

Verification

- End-to-end test: integer columns stay int64, float columns stay float64,
  and the declared type matches the underlying arrow data.
- ruff check passes cleanly.
- Existing converter tests pass (including test_base_converter_metadata.py).
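
The shape of the end-to-end check can be sketched without DashAI or arrow. The example below is only an analogy under stated assumptions: select_by_variance is a toy replacement for VarianceThreshold, columns are plain Python lists, and a column's "type" is taken to be the Python type name of its first value.

```python
from statistics import pvariance


def select_by_variance(columns, threshold=0.0):
    """Toy selector: keep columns whose population variance exceeds threshold."""
    return {
        name: values
        for name, values in columns.items()
        if pvariance(values) > threshold
    }


def column_types(columns):
    """Type name of each column, read from its first value."""
    return {name: type(values[0]).__name__ for name, values in columns.items()}


data = {
    "count": [1, 2, 3],         # integer column, should stay integer
    "ratio": [0.5, 1.5, 2.5],   # float column, should stay float
    "constant": [7, 7, 7],      # zero variance, should be dropped
}

types_before = column_types(data)
selected = select_by_variance(data)
types_after = column_types(selected)

# Every retained column must report the same type it had before selection.
types_preserved = all(types_before[name] == types_after[name] for name in selected)
```

The property under test is the same one the PR verifies: selection removes columns but leaves the type of each surviving column unchanged.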

Modified files

DashAI/back/converters/category/feature_selection.py
DashAI/back/converters/scikit_learn/select_k_best.py
DashAI/back/converters/scikit_learn/select_percentile.py
DashAI/back/converters/scikit_learn/select_fdr.py
DashAI/back/converters/scikit_learn/select_fpr.py
DashAI/back/converters/scikit_learn/select_fwe.py
DashAI/back/converters/scikit_learn/generic_univariate_select.py
DashAI/back/converters/scikit_learn/variance_threshold.py

fix type in converters

da23123

cristian-tamblay changed the base branch from production to develop June 29, 2026 01:17

Merge branch 'develop' into feat/fix-type-featureselect

97eded9

cristian-tamblay approved these changes Jun 29, 2026

cristian-tamblay merged commit 9965e96 into develop Jun 29, 2026
20 checks passed

cristian-tamblay deleted the feat/fix-type-featureselect branch June 29, 2026 02:35

cristian-tamblay merged 2 commits into develop from feat/fix-type-featureselect
