Adding Multiple Sorting Interfaces
===================================

If a session was recorded with multiple probes, or the same recording was sorted with different algorithms, you will need to handle multiple spike sorting outputs. This how-to guides you through adding more than one sorting interface to your NWB files using NeuroConv.

Why Unit ID Management Matters
-------------------------------

When you have multiple sorting outputs (e.g., from different algorithms like Kilosort and MountainSort), they might have overlapping unit IDs (0, 1, 2, etc.). If they do, adding them naively in NeuroConv causes the rows of the second sorter to be skipped: NeuroConv treats units with matching IDs as the same unit and does not add them again, to avoid duplicates.

To handle this problem you have two main approaches:

1. **Rename units** to create unique identifiers before merging into the canonical Units table
2. **Keep separate tables for each sorter** in the processing module to maintain original sorter IDs

Setting Up the Example
-----------------------

First, let's create two mock sorting interfaces to demonstrate the concepts:

.. code-block:: python

    from neuroconv.tools.testing.mock_interfaces import MockSortingInterface
    from neuroconv import ConverterPipe

    # Create two sorting interfaces with overlapping unit IDs
    sorting_interface1 = MockSortingInterface(num_units=4)
    sorting_interface2 = MockSortingInterface(num_units=4)

    print("Sorting 1 unit IDs:", sorting_interface1.units_ids)
    print("Sorting 2 unit IDs:", sorting_interface2.units_ids)

Expected output:

.. code-block:: text

    Sorting 1 unit IDs: ['0', '1', '2', '3']
    Sorting 2 unit IDs: ['0', '1', '2', '3']

The units from both sorting interfaces have the same IDs, which will cause conflicts.

Approach 1: Canonical Units Table with Unit Renaming
-----------------------------------------------------

This approach merges all sorting results into the main NWB Units table after renaming units to avoid ID conflicts.

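
Before renaming, it can help to confirm which IDs actually collide and to generate the rename map instead of typing it by hand. The following is a minimal sketch in plain Python that makes no NeuroConv calls; ``find_overlapping_ids`` and ``build_rename_map`` are hypothetical helpers (not part of NeuroConv), and the ``sorter2_unit_`` prefix is only an illustration. The resulting dictionary has the same shape as the mapping passed to ``rename_unit_ids`` in Step 2.

```python
def find_overlapping_ids(first_ids, second_ids):
    """Return the unit IDs present in both sortings, in the order of first_ids."""
    second = set(second_ids)
    return [unit_id for unit_id in first_ids if unit_id in second]


def build_rename_map(unit_ids, prefix):
    """Map each original unit ID to a prefixed ID (hypothetical naming scheme)."""
    return {unit_id: f"{prefix}{unit_id}" for unit_id in unit_ids}


# Unit IDs as reported by the two mock sorting interfaces
sorting1_ids = ["0", "1", "2", "3"]
sorting2_ids = ["0", "1", "2", "3"]

overlap = find_overlapping_ids(sorting1_ids, sorting2_ids)
print("Overlapping IDs:", overlap)

# Any prefix works as long as the renamed IDs are unique within the session
unit_rename_map = build_rename_map(sorting2_ids, prefix="sorter2_unit_")
print(unit_rename_map)
```

If ``overlap`` is empty, no renaming is needed and both sortings can be added to the Units table as they are.
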

Step 1: Add First Sorting to NWB File
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    # Create NWB file with first sorting (no renaming needed)
    nwbfile = sorting_interface1.create_nwbfile()

    print("Units after adding first sorting:")
    print(nwbfile.units.to_dataframe()[['unit_name']])

Expected output:

.. code-block:: text

    Units after adding first sorting:
    id  unit_name
    0   0
    1   1
    2   2
    3   3

Step 2: Rename Units in Second Sorting
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Before adding the second sorting, we need to rename its units to avoid conflicts:

.. code-block:: python

    # Rename with the rename_unit_ids method, using a dictionary
    # that maps each original ID to its new ID
    unit_rename_map = {
        '0': 'sorter2_unit_0',
        '1': 'sorter2_unit_1',
        '2': 'sorter2_unit_2',
        '3': 'sorter2_unit_3'
    }
    sorting_interface2.rename_unit_ids(unit_rename_map)

    print("Sorting 2 unit IDs after renaming:", sorting_interface2.units_ids)

Expected output:

.. code-block:: text

    Sorting 2 unit IDs after renaming: ['sorter2_unit_0', 'sorter2_unit_1', 'sorter2_unit_2', 'sorter2_unit_3']

Step 3: Add Second Sorting to Existing NWB File
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    # Add the renamed sorting to the existing NWB file
    sorting_interface2.add_to_nwbfile(nwbfile=nwbfile)

    print("Units after adding both sortings:")
    units_df = nwbfile.units.to_dataframe()
    print(units_df[['unit_name']])

Expected output:

.. code-block:: text

    Units after adding both sortings:
    id  unit_name
    0   0
    1   1
    2   2
    3   3
    4   sorter2_unit_0
    5   sorter2_unit_1
    6   sorter2_unit_2
    7   sorter2_unit_3

Advantages of This Approach
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- All units are in the canonical Units table, making analysis easier
- Creates session-unique unit identifiers
- Standard location that analysis tools expect

Disadvantages
~~~~~~~~~~~~~

- Requires careful unit ID management
- Original sorter IDs are lost unless preserved in unit properties

Approach 2: Separate Tables in Processing Module
-------------------------------------------------

This approach keeps each sorting in its own table within the processing module, preserving original unit IDs.

.. code-block:: python

    # Create fresh sorting interfaces
    mock_sorting1 = MockSortingInterface(num_units=4)
    mock_sorting2 = MockSortingInterface(num_units=4)

    # Set up data interfaces with descriptive names
    data_interfaces = {
        "kilosort_sorting": mock_sorting1,
        "mountainsort_sorting": mock_sorting2,
    }

    # Create converter with both sortings
    converter = ConverterPipe(data_interfaces=data_interfaces)

    # Configure to write each sorting to separate processing tables
    conversion_options = {
        "kilosort_sorting": {
            "write_as": "processing",
            "units_name": "UnitsKilosort",
            "units_description": "Units detected by Kilosort spike sorting algorithm"
        },
        "mountainsort_sorting": {
            "write_as": "processing",
            "units_name": "UnitsMountainSort",
            "units_description": "Units detected by MountainSort spike sorting algorithm"
        },
    }

    # Create NWB file with separate tables
    nwbfile = converter.create_nwbfile(conversion_options=conversion_options)

    print("Processing module contents:")
    print(list(nwbfile.processing['ecephys'].data_interfaces.keys()))
    # ['UnitsKilosort', 'UnitsMountainSort']

Advantages of This Approach
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Preserves original unit IDs from each sorter
- Clear provenance of which algorithm produced which units
- No risk of ID conflicts

Disadvantages
~~~~~~~~~~~~~

- Analysis tools need to know which table to use
- More complex to work with multiple tables
- Units are not in the standard NWB Units location

Alternative Renaming Approaches
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When renaming units for the canonical Units table (Approach 1), you can also use more descriptive naming schemes:

.. code-block:: python

    # Descriptive naming based on sorting algorithm
    descriptive_map = {
        '0': 'kilosort_cluster_01',
        '1': 'kilosort_cluster_02',
        '2': 'kilosort_cluster_03',
        '3': 'kilosort_cluster_04'
    }

    # Or cell-type based naming
    celltype_map = {
        '0': 'pyramidal_neuron_1',
        '1': 'interneuron_1',
        '2': 'pyramidal_neuron_2',
        '3': 'unclassified_1'
    }

Adding Custom Properties to the Units Table
-------------------------------------------

When using the canonical Units table approach, you may want to add columns that provide context about your units. This is particularly useful when combining units from multiple probes or sorting algorithms.

You can add custom properties using the sorting extractor's ``set_property`` method. Note that if the sorting extractor already pre-loads properties, those are added to the Units table automatically.

Adding Probe Information
~~~~~~~~~~~~~~~~~~~~~~~~

Here's how to add a "probe" column to distinguish units from different probes:

.. code-block:: python

    from neuroconv.tools.testing.mock_interfaces import MockSortingInterface

    # Create two sorting interfaces representing different probes
    probe1_sorting = MockSortingInterface(num_units=4)
    probe2_sorting = MockSortingInterface(num_units=3)

    # Rename units to avoid conflicts (do this first)
    probe1_sorting.rename_unit_ids({
        '0': 'a',
        '1': 'b',
        '2': 'c',
        '3': 'd',
    })
    probe2_sorting.rename_unit_ids({
        '0': 'e',
        '1': 'f',
        '2': 'g',
    })

    # Add probe information as a property for each sorting
    probe1_sorting.sorting_extractor.set_property(
        key="probe",
        values=["probe_A"] * 4,  # All 4 units are from probe A
        ids=["a", "b", "c", "d"]  # Use renamed IDs
    )

    probe2_sorting.sorting_extractor.set_property(
        key="probe",
        values=["probe_B"] * 3,  # All 3 units are from probe B
        ids=["e", "f", "g"]  # Use renamed IDs
    )

    # Create NWB file and add both sortings
    nwbfile = probe1_sorting.create_nwbfile()
    probe2_sorting.add_to_nwbfile(nwbfile=nwbfile)

    # Verify the probe column was added
    units_df = nwbfile.units.to_dataframe()
    print("Units table with probe information:")
    print(units_df[['unit_name', 'probe']])

Expected output:

.. code-block:: text

    Units table with probe information:
    id  unit_name  probe
    0   a          probe_A
    1   b          probe_A
    2   c          probe_A
    3   d          probe_A
    4   e          probe_B
    5   f          probe_B
    6   g          probe_B

Adding Algorithm Provenance Information
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can also add information about which sorting algorithm was used:

.. code-block:: python

    # Create sorting interfaces for different algorithms
    kilosort_sorting = MockSortingInterface(num_units=3)
    mountainsort_sorting = MockSortingInterface(num_units=2)

    # Rename units to avoid conflicts (do this first)
    kilosort_sorting.rename_unit_ids({
        '0': 'a',
        '1': 'b',
        '2': 'c',
    })
    mountainsort_sorting.rename_unit_ids({
        '0': 'd',
        '1': 'e',
    })

    # Add algorithm information
    kilosort_sorting.sorting_extractor.set_property(
        key="algorithm",
        values=["kilosort"] * 3,
        ids=["a", "b", "c"]  # Use renamed IDs
    )

    mountainsort_sorting.sorting_extractor.set_property(
        key="algorithm",
        values=["mountainsort"] * 2,
        ids=["d", "e"]  # Use renamed IDs
    )

    # A sorting can carry several properties; each call adds one column
    kilosort_sorting.sorting_extractor.set_property(
        key="quality_score",
        values=[0.95, 0.87, 0.92],
        ids=["a", "b", "c"]
    )

    mountainsort_sorting.sorting_extractor.set_property(
        key="quality_score",
        values=[0.89, 0.76],
        ids=["d", "e"]  # Use renamed IDs
    )

    # Create NWB file with both sortings
    nwbfile = kilosort_sorting.create_nwbfile()
    mountainsort_sorting.add_to_nwbfile(nwbfile=nwbfile)

    # View the enriched units table
    units_df = nwbfile.units.to_dataframe()
    print("Units table with algorithm and quality information:")
    print(units_df[['unit_name', 'algorithm', 'quality_score']])

Expected output:

.. code-block:: text

    Units table with algorithm and quality information:
    id  unit_name  algorithm     quality_score
    0   a          kilosort      0.95
    1   b          kilosort      0.87
    2   c          kilosort      0.92
    3   d          mountainsort  0.89
    4   e          mountainsort  0.76
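
When more than two sortings are combined, the rename maps and the per-unit provenance values can be derived in one place. The sketch below is plain Python and makes no NeuroConv calls; ``plan_merge`` is a hypothetical helper, and the ``<label>_<id>`` naming scheme is an assumption chosen for illustration. Its output has the shape expected by ``rename_unit_ids`` (a mapping) and by ``set_property`` (parallel ``values`` and ``ids`` lists).

```python
def plan_merge(unit_ids_by_sorter):
    """Build a rename map and provenance values for each sorter label.

    unit_ids_by_sorter maps a sorter label to its original unit IDs.
    Returns a dict: label -> {"rename_map", "ids", "values"}.
    Raises ValueError if the renamed IDs would not be unique.
    """
    plan = {}
    seen = set()
    for label, unit_ids in unit_ids_by_sorter.items():
        # Hypothetical naming scheme: prefix each ID with the sorter label
        rename_map = {unit_id: f"{label}_{unit_id}" for unit_id in unit_ids}
        new_ids = list(rename_map.values())
        # Reject repeated original IDs and collisions with earlier sorters
        if len(rename_map) != len(unit_ids) or seen.intersection(new_ids):
            raise ValueError(f"Renamed IDs are not unique for {label!r}")
        seen.update(new_ids)
        plan[label] = {
            "rename_map": rename_map,
            "ids": new_ids,
            "values": [label] * len(new_ids),
        }
    return plan


plan = plan_merge({"kilosort": ["0", "1", "2"], "mountainsort": ["0", "1"]})
for label, entry in plan.items():
    print(label, entry["ids"], entry["values"])
```

Each entry can then be applied as in the examples above: pass ``entry["rename_map"]`` to ``rename_unit_ids`` first, then pass ``entry["values"]`` and ``entry["ids"]`` to ``set_property`` with ``key="algorithm"``.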