Getting Started with FSGA (Feature Selection via Genetic Algorithm)
Tutorial: Complete beginner's guide to using the FSGA library
Prerequisites
# Install dependencies
cd ~/code/feature-selection-via-genetic-algorithm
uv pip install numpy scikit-learn matplotlib scipy
# Verify installation
uv run python -c "import fsga; print('FSGA imported successfully')"
Part 1: Your First GA Run (5 minutes)
Step 1: Import Required Modules
from fsga.core.genetic_algorithm import GeneticAlgorithm
from fsga.datasets.loader import load_dataset
from fsga.evaluators.accuracy_evaluator import AccuracyEvaluator
from fsga.ml.models import ModelWrapper
from fsga.operators.uniform_crossover import UniformCrossover
from fsga.mutations.bitflip_mutation import BitFlipMutation
from fsga.selectors.tournament_selector import TournamentSelector
Step 2: Load a Dataset
# Load Iris dataset (150 samples, 4 features, 3 classes)
X_train, X_test, y_train, y_test, feature_names = load_dataset('iris', split=True)
print(f"Training set: {X_train.shape}")
print(f"Test set: {X_test.shape}")
print(f"Features: {feature_names}")
Output:
Training set: (112, 4)
Test set: (38, 4)
Features: ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
Step 3: Configure GA Components
# 1. Model: Random Forest classifier
model = ModelWrapper('rf', n_estimators=50, random_state=42)
# 2. Evaluator: Measures fitness (accuracy on validation set)
evaluator = AccuracyEvaluator(X_train, y_train, X_test, y_test, model)
# 3. Selector: Tournament selection (choose 3, pick best)
selector = TournamentSelector(evaluator, tournament_size=3)
# 4. Crossover: Uniform crossover (50/50 mix of parents)
crossover = UniformCrossover()
# 5. Mutation: Bit flip with 1% probability
mutation = BitFlipMutation(probability=0.01)
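To make the three operators configured above concrete, here is a minimal plain-Python sketch of what tournament selection, uniform crossover, and bit-flip mutation typically do. The function names, the toy population, and the fitness values are illustrative assumptions; this is not FSGA's implementation.

```python
import random

def tournament_select(population, fitnesses, k=3, rng=random):
    """Pick k individuals at random and return the fittest of them."""
    contenders = rng.sample(range(len(population)), k)
    winner = max(contenders, key=lambda i: fitnesses[i])
    return population[winner]

def uniform_crossover(parent_a, parent_b, rng=random):
    """Take each gene from either parent with equal probability."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]

def bitflip_mutation(chromosome, probability=0.01, rng=random):
    """Flip each bit independently with the given probability."""
    return [1 - bit if rng.random() < probability else bit for bit in chromosome]

# Toy population of 4-bit feature masks with made-up fitness values
rng = random.Random(0)
population = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1]]
fitnesses = [0.92, 0.71, 0.66, 0.97]

parent_a = tournament_select(population, fitnesses, k=3, rng=rng)
parent_b = tournament_select(population, fitnesses, k=3, rng=rng)
child = bitflip_mutation(uniform_crossover(parent_a, parent_b, rng=rng), 0.01, rng=rng)
print(parent_a, parent_b, child)
```

A larger tournament size increases selection pressure, because weak individuals are less likely to win a draw against more contenders.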
Step 4: Create and Run GA
# Create GA
ga = GeneticAlgorithm(
    num_features=X_train.shape[1],  # 4 features in Iris
    evaluator=evaluator,
    selector=selector,
    crossover_operator=crossover,
    mutation_operator=mutation,
    population_size=30,  # 30 candidate solutions
    num_generations=50,  # Max 50 iterations
    early_stopping_patience=10,  # Stop if no improvement for 10 generations
    verbose=True  # Print progress
)
# Run evolution
results = ga.evolve()
Output (abbreviated):
Generation 1/50: Best=0.8947, Avg=0.7526, Worst=0.6316
Generation 2/50: Best=0.9211, Avg=0.8342, Worst=0.7105
...
Generation 23/50: Best=0.9737, Avg=0.9421, Worst=0.8947
Early stopping triggered at generation 23
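The early-stopping behaviour in the log above can be sketched as a simple patience counter. This assumes the rule is "stop after `patience` consecutive generations without a new best"; FSGA's exact bookkeeping may differ, and the intermediate history values below are made up to match the first, second, and final lines of the log.

```python
def generations_until_stop(best_history, patience):
    """Return the 1-based generation at which early stopping fires, or None."""
    best_so_far = float("-inf")
    stale = 0  # consecutive generations without improvement
    for generation, fitness in enumerate(best_history, start=1):
        if fitness > best_so_far:
            best_so_far = fitness
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return generation
    return None

# Last improvement at generation 13, followed by 10 flat generations
history = [0.8947, 0.9211] + [0.9474] * 10 + [0.9737] * 11
stop_generation = generations_until_stop(history, patience=10)
print(stop_generation)  # 23
```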
Step 5: Analyze Results
# Extract results
best_chromosome = results['best_chromosome']
best_fitness = results['best_fitness']
selected_features = [feature_names[i] for i, bit in enumerate(best_chromosome) if bit == 1]
print(f"\n{'='*60}")
print("RESULTS")
print(f"{'='*60}")
print(f"Best Accuracy: {best_fitness:.4f}")
print(f"Features Selected: {sum(best_chromosome)}/{len(best_chromosome)}")
print(f"Selected Features: {selected_features}")
print(f"Generations Run: {len(results['best_fitness_history'])}")
Output:
============================================================
RESULTS
============================================================
Best Accuracy: 0.9737
Features Selected: 2/4
Selected Features: ['petal length (cm)', 'petal width (cm)']
Generations Run: 23
Interpretation:
- GA found that only 2 features are needed
- Achieved 97.37% accuracy with just petal measurements
- Reduced dimensionality by 50%
Part 2: Visualizing Results (10 minutes)
Visualize Fitness Evolution
from fsga.visualization import plot_fitness_evolution
plot_fitness_evolution(
    results['best_fitness_history'],
    title="GA Convergence on Iris Dataset",
    save_path="fitness_evolution.png"
)
What You'll See:
- Green line climbing from ~0.89 to ~0.97 (the best-fitness values logged in Step 4)
- Plateau around generation 13 (convergence)
- Early stopping at generation 23
Compare with Baseline
# Train model with ALL features
model_all = ModelWrapper('rf', n_estimators=50, random_state=42)
model_all.fit(X_train, y_train)
accuracy_all = model_all.score(X_test, y_test)
print(f"\nComparison:")
print(f" GA (2 features): {best_fitness:.4f}")
print(f" All Features (4): {accuracy_all:.4f}")
print(f" Improvement: {(best_fitness - accuracy_all)*100:.2f}%")
print(f" Feature Reduction: 50%")
Output:
Comparison:
GA (2 features): 0.9737
All Features (4): 0.9211
Improvement: 5.26%
Feature Reduction: 50%
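On this split, the GA reached higher accuracy with half the features. Because the evaluator in Step 3 scores chromosomes on the same test set, the GA figure is optimistic; hold out a separate validation split for an unbiased comparison.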
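It helps to translate these accuracies back into sample counts. With the 38 test samples from Step 2, 0.9737 and 0.9211 correspond to 37 and 35 correct predictions, so the gap is two samples. The sketch below also separates the percentage-point gap (what the "Improvement" line prints) from the relative gain.

```python
n_test = 38  # test-set size from Step 2
ga_correct, all_correct = 37, 35  # counts implied by 0.9737 and 0.9211

ga_acc = ga_correct / n_test
all_acc = all_correct / n_test
print(f"GA: {ga_acc:.4f}, all features: {all_acc:.4f}")  # GA: 0.9737, all features: 0.9211
print(f"Gap: {(ga_acc - all_acc) * 100:.2f} percentage points")  # Gap: 5.26 percentage points
print(f"Relative gain: {(ga_acc / all_acc - 1) * 100:.2f}%")  # Relative gain: 5.71%
```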
Part 3: Multi-Run Stability Analysis (15 minutes)
Why Multiple Runs?
GA is stochastic (random). Running multiple times shows:
1. Stability: Do we always select the same features?
2. Robustness: Is performance consistent?
from fsga.utils.metrics import feature_selection_frequency
import numpy as np
# Run GA 10 times
print("Running GA 10 times for stability analysis...")
all_chromosomes = []
all_accuracies = []
for run in range(10):
    # Re-create the GA for each run. No seed is passed here, so each run
    # is expected to start from a different random population.
    ga = GeneticAlgorithm(
        num_features=X_train.shape[1],
        evaluator=evaluator,
        selector=selector,
        crossover_operator=crossover,
        mutation_operator=mutation,
        population_size=30,
        num_generations=50,
        early_stopping_patience=10,
        verbose=False  # Silent for multiple runs
    )
    results = ga.evolve()
    all_chromosomes.append(results['best_chromosome'])
    all_accuracies.append(results['best_fitness'])
    print(f"  Run {run+1}: Accuracy={results['best_fitness']:.4f}, Features={sum(results['best_chromosome'])}")
# Calculate statistics
mean_acc = np.mean(all_accuracies)
std_acc = np.std(all_accuracies)
frequencies = feature_selection_frequency(all_chromosomes)
print(f"\n{'='*60}")
print("STABILITY ANALYSIS")
print(f"{'='*60}")
print(f"Mean Accuracy: {mean_acc:.4f} ± {std_acc:.4f}")
print(f"\nFeature Selection Frequency:")
for name, freq in zip(feature_names, frequencies):
    print(f"  {name}: {freq*100:.0f}% ({freq*10:.0f}/10 runs)")
Output:
Running GA 10 times for stability analysis...
Run 1: Accuracy=0.9737, Features=2
Run 2: Accuracy=0.9474, Features=2
Run 3: Accuracy=0.9737, Features=2
...
Run 10: Accuracy=0.9737, Features=2
============================================================
STABILITY ANALYSIS
============================================================
Mean Accuracy: 0.9632 ± 0.0121
Feature Selection Frequency:
sepal length (cm): 0% (0/10 runs)
sepal width (cm): 10% (1/10 runs)
petal length (cm): 100% (10/10 runs) ← CORE FEATURE
petal width (cm): 100% (10/10 runs) ← CORE FEATURE
Interpretation:
- Petal measurements selected in 100% of runs → core features
- Sepal measurements almost never selected → redundant
- Very stable selection (low variance)
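The frequencies in the output above are per-feature averages over the best chromosomes of all runs. A minimal sketch of that computation follows, assuming `feature_selection_frequency` does the equivalent; the helper name and the toy run data are illustrative, chosen to reproduce the table.

```python
def selection_frequency(chromosomes):
    """Fraction of runs in which each feature was selected."""
    n_runs = len(chromosomes)
    n_features = len(chromosomes[0])
    return [sum(c[j] for c in chromosomes) / n_runs for j in range(n_features)]

# Nine runs pick both petal features; one run also picks sepal width
runs = [[0, 0, 1, 1]] * 9 + [[0, 1, 1, 1]]
freqs = selection_frequency(runs)
print(freqs)  # [0.0, 0.1, 1.0, 1.0]

# Features at or above an 80% threshold count as "core"
core = [j for j, f in enumerate(freqs) if f >= 0.8]
print(core)  # [2, 3]
```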
Visualize Feature Stability
from fsga.visualization import plot_feature_frequency
plot_feature_frequency(
    frequencies,
    feature_names=feature_names,
    threshold=0.8,  # 80% threshold for "core features"
    title="Feature Selection Stability (10 runs)",
    save_path="feature_stability.png"
)
Part 4: Comparing with Baselines (20 minutes)
Using ExperimentRunner for Systematic Comparison
from fsga.analysis.experiment_runner import ExperimentRunner
from fsga.visualization import plot_method_comparison
# Create experiment runner
runner = ExperimentRunner(
    dataset_name='iris',
    model_type='rf',
    n_runs=10,
    random_state=42
)
# Run GA
print("Running GA...")
runner.run_ga_experiment(
    population_size=30,
    num_generations=50,
    mutation_rate=0.01,
    early_stopping_patience=10,
    verbose=True
)
# Run baselines
print("\nRunning RFE...")
runner.run_baseline_experiment('rfe', verbose=True)
print("\nRunning LASSO...")
runner.run_baseline_experiment('lasso', verbose=True)
print("\nRunning All Features...")
runner.run_all_features_baseline(verbose=True)
# Compare
print("\n" + "="*80)
print("STATISTICAL COMPARISON")
print("="*80)
comparisons = runner.compare_methods(verbose=True)
# Visualize
method_accuracies = {
    method: method_results['accuracies']
    for method, method_results in runner.results.items()
}
plot_method_comparison(
    method_accuracies,
    metric_name="Accuracy",
    title="Method Comparison: Iris Dataset",
    save_path="method_comparison.png"
)
Output:
GA vs RFE:
GA Mean: 0.9632
RFE Mean: 0.9474
Improvement: 0.0158 (1.58%)
p-value: 0.0234
Significant: Yes
Effect Size (Cohen's d): 0.892 (large)
GA vs LASSO:
GA Mean: 0.9632
LASSO Mean: 0.9263
Improvement: 0.0369 (3.69%)
p-value: 0.0012
Significant: Yes
Effect Size (Cohen's d): 1.423 (large)
GA vs All Features:
GA Mean: 0.9632
All Features Mean: 0.9211
Improvement: 0.0421 (4.21%)
p-value: 0.0005
Significant: Yes
Effect Size (Cohen's d): 1.651 (large)
Interpretation:
- GA significantly outperforms all baselines (p < 0.05)
- Large effect sizes (Cohen's d > 0.8)
- 4.21% improvement over using all features
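For reference, the effect size reported above can be computed by hand. The sketch below uses the pooled-standard-deviation form of Cohen's d with the conventional 0.2 / 0.5 / 0.8 cut-offs. Which variant `ExperimentRunner` uses is not stated here, so treat this as an assumption; the accuracy lists are made-up illustrations, not FSGA output.

```python
from math import sqrt
from statistics import mean, stdev

def cohens_d(a, b):
    """Cohen's d with a pooled standard deviation (independent-samples form)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)

def effect_label(d):
    """Conventional interpretation thresholds for |d|."""
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"

# Hypothetical per-run accuracies for two methods
ga_runs = [0.97, 0.95, 0.97, 0.97, 0.95]
rfe_runs = [0.95, 0.95, 0.92, 0.95, 0.97]
d = cohens_d(ga_runs, rfe_runs)
print(f"d = {d:.3f} ({effect_label(d)})")
```

With only 10 runs per method, a p-value and an effect size together say more than either alone: the first addresses whether the difference is likely real, the second how large it is.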
Part 5: Advanced - Custom Fitness Function (15 minutes)
Creating Multi-Objective Fitness
Goal: Balance accuracy AND sparsity (fewer features = better)
from fsga.evaluators.evaluator import Evaluator
import numpy as np
class SparseAccuracyEvaluator(Evaluator):
    """Multi-objective: maximize accuracy, minimize features."""
    def __init__(self, X_train, y_train, X_val, y_val, model, sparsity_weight=0.1):
        super().__init__(X_train, y_train, X_val, y_val)
        self.model = model
        self.sparsity_weight = sparsity_weight  # How much to penalize features
    def evaluate(self, chromosome):
        """Fitness = accuracy - (sparsity_weight × fraction_selected)."""
        # np.asarray keeps the comparison element-wise even if chromosome is a plain list
        selected_indices = np.where(np.asarray(chromosome) == 1)[0]
        # No features selected = 0 fitness
        if len(selected_indices) == 0:
            return 0.0
        # Train on selected features
        X_train_selected = self.X_train[:, selected_indices]
        X_val_selected = self.X_val[:, selected_indices]
        self.model.fit(X_train_selected, self.y_train)
        accuracy = self.model.score(X_val_selected, self.y_val)
        # Calculate sparsity penalty
        fraction_selected = len(selected_indices) / len(chromosome)
        sparsity_penalty = self.sparsity_weight * fraction_selected
        # Combined fitness
        fitness = accuracy - sparsity_penalty
        return fitness
# Use the custom evaluator
sparse_evaluator = SparseAccuracyEvaluator(
    X_train, y_train, X_test, y_test,
    model=ModelWrapper('rf', n_estimators=50, random_state=42),
    sparsity_weight=0.1  # Penalty of 0.1 when 100% of the features are selected
)
sparse_selector = TournamentSelector(sparse_evaluator, tournament_size=3)
ga_sparse = GeneticAlgorithm(
    num_features=X_train.shape[1],
    evaluator=sparse_evaluator,
    selector=sparse_selector,
    crossover_operator=crossover,
    mutation_operator=mutation,
    population_size=30,
    num_generations=50,
    early_stopping_patience=10,
    verbose=True
)
results_sparse = ga_sparse.evolve()
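best_sparse = results_sparse['best_chromosome']
n_selected = sum(best_sparse)
# Recover the raw accuracy by adding the sparsity penalty back onto the fitness
raw_accuracy = results_sparse['best_fitness'] + sparse_evaluator.sparsity_weight * n_selected / len(best_sparse)
print("\nSparse GA Results:")
print(f"  Features Selected: {n_selected}/{len(best_sparse)}")
print(f"  Raw Accuracy: {raw_accuracy:.4f}")
print(f"  Fitness (with sparsity): {results_sparse['best_fitness']:.4f}")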
Expected Behavior:
- Should select fewer features (e.g., 1-2 instead of 2-3)
- Might sacrifice 1-2% accuracy for much better sparsity
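A quick worked example shows how `sparsity_weight` shifts the trade-off. The function mirrors the fitness formula of `SparseAccuracyEvaluator`. The 0.9474 accuracy for a single feature is a hypothetical value used only for illustration.

```python
def sparse_fitness(accuracy, n_selected, n_features, sparsity_weight=0.1):
    """accuracy - sparsity_weight * fraction of features selected (0.0 if none)."""
    if n_selected == 0:
        return 0.0
    return accuracy - sparsity_weight * (n_selected / n_features)

for weight in (0.1, 0.2):
    two = sparse_fitness(0.9737, 2, 4, weight)  # two petal features
    one = sparse_fitness(0.9474, 1, 4, weight)  # hypothetical single-feature accuracy
    print(f"weight={weight}: two features={two:.4f}, one feature={one:.4f}")
# weight=0.1: two features=0.9237, one feature=0.9224
# weight=0.2: two features=0.8737, one feature=0.8974
```

At a weight of 0.1 the two-feature subset still wins; at 0.2 the single feature overtakes it, so the weight should reflect how much accuracy one extra feature is worth.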
Part 6: Troubleshooting Common Issues
Issue 1: All Chromosomes Select All Features
Symptom: Best chromosome = [1, 1, 1, 1] (all features)
Cause: Mutation rate too low, population too small
Fix:
# Increase mutation rate
mutation = BitFlipMutation(probability=0.05)  # Was 0.01
# Increase population diversity
ga = GeneticAlgorithm(
    population_size=50,  # Was 30
    # ... remaining arguments as in Step 4, passing the new mutation operator
)
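To see why a 1% rate can be too low on a 4-bit chromosome, consider how often mutation changes anything at all. Assuming each bit flips independently with probability p, the chance of at least one flip per chromosome is 1 - (1 - p)^n:

```python
def prob_any_flip(n_bits, p):
    """Probability that bit-flip mutation changes at least one of n_bits."""
    return 1 - (1 - p) ** n_bits

n_bits = 4  # Iris has 4 features
for p in (0.01, 0.05):
    print(f"p={p}: expected flips={n_bits * p:.2f}, "
          f"P(at least one flip)={prob_any_flip(n_bits, p):.4f}")
# p=0.01: expected flips=0.04, P(at least one flip)=0.0394
# p=0.05: expected flips=0.20, P(at least one flip)=0.1855
```

At p=0.01 roughly 1 chromosome in 25 is altered per generation, so with 30 individuals only about one mutated chromosome appears each generation.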
Issue 2: GA Converges to Low Accuracy
Symptom: Best fitness < baseline (e.g., 0.75 vs 0.92)
Cause: Early stopping too aggressive, or unlucky initialization
Fix:
ga = GeneticAlgorithm(
    num_generations=100,  # More time to explore
    early_stopping_patience=20,  # More patient
    # ... remaining arguments as in Step 4
)
# Run multiple times and take best
best_overall = max([ga.evolve() for _ in range(5)], key=lambda r: r['best_fitness'])
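The one-liner above calls `evolve()` repeatedly on the same `ga` object; whether FSGA resets the population between calls is not stated here. A sketch that sidesteps the question takes a callable that builds a fresh run each time. `best_of_restarts` and `fake_run` are hypothetical names, and `fake_run` is only a stand-in for "construct a new GeneticAlgorithm and call evolve()".

```python
import random

def best_of_restarts(run_once, n_restarts=5):
    """Call run_once() n_restarts times and keep the result with the highest best_fitness."""
    results = [run_once() for _ in range(n_restarts)]
    return max(results, key=lambda r: r["best_fitness"])

# Stand-in for one independent GA run, returning a results-like dict
rng = random.Random(1)
def fake_run():
    return {"best_fitness": rng.uniform(0.7, 1.0), "best_chromosome": [0, 0, 1, 1]}

best = best_of_restarts(fake_run, n_restarts=5)
print(f"Best of 5 restarts: {best['best_fitness']:.4f}")
```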
Issue 3: No Features Selected
Symptom: Best chromosome = [0, 0, 0, 0]
Cause: Evaluator returns 0 for empty chromosome, GA finds local optimum
Fix:
# In evaluator, force minimum features
def evaluate(self, chromosome):
    if np.sum(chromosome) == 0:
        return -1.0  # Heavily penalize empty
    ...
Next Steps
- Try other datasets: load_dataset('wine'), load_dataset('breast_cancer')
- Experiment with operators: Try SinglePointCrossover, RouletteSelector
- Read module READMEs: fsga/core/README_DEV.md, fsga/visualization/README.md
- Run comprehensive analysis: python experiments/run_comprehensive_analysis.py
- Create custom evaluator: Multi-objective, regression, etc.
Summary
What You Learned:
- How to run a basic GA for feature selection
- How to visualize results
- How to assess stability across runs
- How to compare against baselines
- How to create custom fitness functions
Key Takeaways:
- GA can find good feature subsets automatically
- Running multiple times helps assess stability
- Visualizations help with analysis
- Custom evaluators allow different objectives
For the report:
- Use ExperimentRunner for reproducible experiments
- Generate plots with run_comprehensive_analysis.py
- Include statistical tests and effect sizes (Wilcoxon, Cohen's d)
- Report feature stability (Jaccard index)
For more: See COMPREHENSIVE_ANALYSIS.md and ENHANCEMENT_SUMMARY.md
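The Jaccard index mentioned in the report checklist can be computed directly from the best chromosomes of several runs. Below is a minimal sketch using the mean pairwise Jaccard similarity as the stability score. The helper names are illustrative, and FSGA's own metric may be defined differently.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity between the feature sets encoded by two bit masks."""
    set_a = {i for i, bit in enumerate(a) if bit == 1}
    set_b = {i for i, bit in enumerate(b) if bit == 1}
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0

def mean_pairwise_jaccard(chromosomes):
    """Average Jaccard similarity over all pairs of runs (1.0 = identical selections)."""
    pairs = list(combinations(chromosomes, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# Nine identical runs plus one that also selects sepal width
runs = [[0, 0, 1, 1]] * 9 + [[0, 1, 1, 1]]
stability = mean_pairwise_jaccard(runs)
print(f"Stability (mean pairwise Jaccard): {stability:.4f}")  # 0.9333
```

Here 36 of the 45 pairs are identical (similarity 1) and 9 pairs share 2 of 3 features (similarity 2/3), which gives 42/45.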