search.selection
The selection package handles candidate selection for peak groups using various algorithms and strategies.
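- alphadia.search.selection.selection: Main candidate selection implementation for DIA data analysis.
- alphadia.search.selection.config_df: Configuration DataFrames for selection parameters.
- alphadia.search.selection.fft: Fast Fourier Transform operations for signal processing.
- alphadia.search.selection.kernel: Kernel-based operations for candidate selection.
- alphadia.search.selection.utils: Utility functions for candidate selection operations.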
Main candidate selection implementation for DIA data analysis.
- class alphadia.search.selection.selection.CandidateSelection(dia_data: TimsTOFTranspose | AlphaRawBase | MzML | Sciex | Thermo, precursors_flat: DataFrame, fragments_flat: DataFrame, config: CandidateSelectionConfig, rt_column: str, mobility_column: str, precursor_mz_column: str, fragment_mz_column: str, fwhm_rt: float = 5.0, fwhm_mobility: float = 0.012)
Bases: object
- __init__(dia_data: TimsTOFTranspose | AlphaRawBase | MzML | Sciex | Thermo, precursors_flat: DataFrame, fragments_flat: DataFrame, config: CandidateSelectionConfig, rt_column: str, mobility_column: str, precursor_mz_column: str, fragment_mz_column: str, fwhm_rt: float = 5.0, fwhm_mobility: float = 0.012) -> None
Select candidates for MS2 extraction based on MS1 features.
- Parameters:
dia_data (DiaData) – DIA data object
precursors_flat (pd.DataFrame) – flattened precursor dataframe
fragments_flat (pd.DataFrame) – flattened fragment dataframe
config (CandidateSelectionConfig) – config object
rt_column (str) – name of the rt column in the precursor dataframe
mobility_column (str) – name of the mobility column in the precursor dataframe
precursor_mz_column (str) – name of the precursor mz column in the precursor dataframe
fragment_mz_column (str) – name of the fragment mz column in the fragment dataframe
fwhm_rt (float, optional) – full width at half maximum in RT dimension for the GaussianKernel, by default 5.0
fwhm_mobility (float, optional) – full width at half maximum in mobility dimension for the GaussianKernel, by default 0.012
Configuration DataFrames for selection parameters.
- class alphadia.search.selection.config_df.CandidateContainer(*args, **kwargs)
Bases: CandidateContainer
- class_type = jitclass.CandidateContainer#75e5dd9f7d10<precursor_idx:array(uint32, 1d, C),rank:array(uint8, 1d, C),score:array(float32, 1d, C),scan_center:array(uint32, 1d, C),scan_start:array(uint32, 1d, C),scan_stop:array(uint32, 1d, C),frame_center:array(uint32, 1d, C),frame_start:array(uint32, 1d, C),frame_stop:array(uint32, 1d, C)>
- class alphadia.search.selection.config_df.CandidateSelectionConfigJIT(*args, **kwargs)
Bases: CandidateSelectionConfigJIT
Numba-compatible config object for the HybridCandidate class. See the documentation of the CandidateSelectionConfig class for more information on the parameters and their default values.
- candidate_count: int64
- center_fraction: float64
- class_type = jitclass.CandidateSelectionConfigJIT#75e5dd9f4fd0<rt_tolerance:float64,precursor_mz_tolerance:float64,fragment_mz_tolerance:float64,mobility_tolerance:float64,isotope_tolerance:float64,peak_len_rt:float64,sigma_scale_rt:float64,peak_len_mobility:float64,sigma_scale_mobility:float64,candidate_count:int64,top_k_precursors:int64,top_k_fragments:int64,exclude_shared_ions:bool,kernel_size:int64,f_mobility:float64,f_rt:float64,center_fraction:float64,min_size_mobility:int64,min_size_rt:int64,max_size_mobility:int64,max_size_rt:int64,group_channels:bool,use_weighted_score:bool,join_close_candidates:bool,join_close_candidates_scan_threshold:float64,join_close_candidates_cycle_threshold:float64,feature_std:array(float64, 1d, C),feature_mean:array(float64, 1d, C),feature_weight:array(float64, 1d, C)>
- f_mobility: float64
- f_rt: float64
- feature_mean: Array(float64, 1, 'C', False, aligned=True)
- feature_std: Array(float64, 1, 'C', False, aligned=True)
- feature_weight: Array(float64, 1, 'C', False, aligned=True)
- fragment_mz_tolerance: float64
- group_channels: bool
- isotope_tolerance: float64
- join_close_candidates: bool
- join_close_candidates_cycle_threshold: float64
- join_close_candidates_scan_threshold: float64
- kernel_size: int64
- max_size_mobility: int64
- max_size_rt: int64
- min_size_mobility: int64
- min_size_rt: int64
- mobility_tolerance: float64
- peak_len_mobility: float64
- peak_len_rt: float64
- precursor_mz_tolerance: float64
- rt_tolerance: float64
- sigma_scale_mobility: float64
- sigma_scale_rt: float64
- top_k_fragments: int64
- top_k_precursors: int64
- use_weighted_score: bool
- class alphadia.search.selection.config_df.PrecursorFlatContainer(*args, **kwargs)
Bases: PrecursorFlatContainer
- candidate_start_idx: Array(uint32, 1, 'C', False, aligned=True)
- candidate_stop_idx: Array(uint32, 1, 'C', False, aligned=True)
- charge: Array(uint8, 1, 'C', False, aligned=True)
- class_type = jitclass.PrecursorFlatContainer#75e5dd9f5750<precursor_idx:array(uint32, 1d, C),frag_start_idx:array(uint32, 1d, C),frag_stop_idx:array(uint32, 1d, C),candidate_start_idx:array(uint32, 1d, C),candidate_stop_idx:array(uint32, 1d, C),charge:array(uint8, 1d, C),rt:array(float32, 1d, C),mobility:array(float32, 1d, C),mz:array(float32, 1d, C),isotopes:array(float32, 2d, C)>
- frag_start_idx: Array(uint32, 1, 'C', False, aligned=True)
- frag_stop_idx: Array(uint32, 1, 'C', False, aligned=True)
- isotopes: Array(float32, 2, 'C', False, aligned=True)
- mobility: Array(float32, 1, 'C', False, aligned=True)
- mz: Array(float32, 1, 'C', False, aligned=True)
- precursor_idx: Array(uint32, 1, 'C', False, aligned=True)
- rt: Array(float32, 1, 'C', False, aligned=True)
- alphadia.search.selection.config_df.candidate_container_to_df(candidate_container: CandidateContainer) -> DataFrame
Convert a CandidateContainer to pd.DataFrame.
Fast Fourier Transform operations for signal processing.
- alphadia.search.selection.fft.convolve_fourier(dense, kernel)
Numba helper function to apply a gaussian filter to a 2d or 3d dense matrix.
- Parameters:
dense (np.ndarray) – Array of shape (..., n_scans, n_frames)
kernel (np.ndarray) – Array of shape (i, j)
- Returns:
Array of shape (..., n_scans, n_frames) containing the filtered dense stack.
- Return type:
np.ndarray
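As a rough illustration of the idea, and not alphadia's actual Numba implementation, Fourier-domain filtering over the last two axes can be sketched with NumPy. The name convolve_fourier_sketch, the zero-padding, and the cropping back to the input shape are assumptions made for this example.

```python
import numpy as np


def convolve_fourier_sketch(dense: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in: linear 2D convolution over the last two axes via FFT."""
    n_scans, n_frames = dense.shape[-2:]
    k_scans, k_frames = kernel.shape

    # Pad to the full linear-convolution size so the FFT does not wrap around.
    full_shape = (n_scans + k_scans - 1, n_frames + k_frames - 1)

    dense_f = np.fft.rfft2(dense, s=full_shape, axes=(-2, -1))
    kernel_f = np.fft.rfft2(kernel, s=full_shape)

    # Multiplication in the Fourier domain equals convolution in the signal domain.
    full = np.fft.irfft2(dense_f * kernel_f, s=full_shape, axes=(-2, -1))

    # Crop the centre so the output has the same shape as the input.
    start_s, start_f = (k_scans - 1) // 2, (k_frames - 1) // 2
    return full[..., start_s:start_s + n_scans, start_f:start_f + n_frames]
```

Leading dimensions (for example a stack of fragment channels) are carried along by broadcasting, so a 2D and a 3D input go through the same code path.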
Kernel-based operations for candidate selection.
- class alphadia.search.selection.kernel.GaussianKernel(dia_data: TimsTOFTransposeJIT | AlphaRawJIT, fwhm_rt: float = 10.0, sigma_scale_rt: float = 1.0, fwhm_mobility: float = 0.03, sigma_scale_mobility: float = 1.0, kernel_height: int = 30, kernel_width: int = 30)
Bases: object
- __init__(dia_data: TimsTOFTransposeJIT | AlphaRawJIT, fwhm_rt: float = 10.0, sigma_scale_rt: float = 1.0, fwhm_mobility: float = 0.03, sigma_scale_mobility: float = 1.0, kernel_height: int = 30, kernel_width: int = 30)
Create a two-dimensional Gaussian filter kernel for the RT and mobility dimensions of a DIA dataset. First, the observed standard deviation is scaled by a linear factor. Second, the standard deviation is scaled by the resolution of the respective dimension.
As a result, sigma_scale is independent of both the resolution of the data and the FWHM of the peaks.
- Parameters:
dia_data (DiaDataJIT) – dia_data jit object.
fwhm_rt (float) – Full width at half maximum in RT dimension of the peaks in the spectrum.
sigma_scale_rt (float) – Scaling factor for the standard deviation in RT dimension.
fwhm_mobility (float) – Full width at half maximum in mobility dimension of the peaks in the spectrum.
sigma_scale_mobility (float) – Scaling factor for the standard deviation in mobility dimension.
kernel_size (int) – Kernel shape in pixel. The kernel will be a square of size (kernel_size, kernel_size). Should be even and will be rounded up to the next even number if necessary.
- determine_mobility_sigma(mobility_resolution: float)
Determine the standard deviation of the Gaussian kernel in mobility dimension. The standard deviation will be scaled to the resolution of the raw data.
- Parameters:
mobility_resolution (float) – Resolution of the mobility dimension in 1/K_0.
- Returns:
Standard deviation of the gaussian kernel in mobility dimension scaled to the resolution of the raw data.
- Return type:
float
- determine_rt_sigma(cycle_length_seconds: float)
Determine the standard deviation of the Gaussian kernel in RT dimension. The standard deviation will be scaled to the resolution of the raw data.
- Parameters:
cycle_length_seconds (float) – Cycle length of the duty cycle in seconds.
- Returns:
Standard deviation of the gaussian kernel in RT dimension scaled to the resolution of the raw data.
- Return type:
float
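The two scaling steps described for GaussianKernel can be sketched as below. The FWHM-to-sigma conversion is the standard Gaussian identity; the helper name sigma_in_pixels and the exact order and form of the scaling are assumptions for illustration and may differ from what determine_rt_sigma and determine_mobility_sigma actually compute.

```python
import math

# For a Gaussian, FWHM = 2 * sqrt(2 * ln 2) * sigma (about 2.3548 * sigma).
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def sigma_in_pixels(fwhm: float, sigma_scale: float, resolution: float) -> float:
    """Hypothetical helper: kernel sigma expressed in data points (scans or cycles).

    fwhm        -- peak width in physical units (seconds or 1/K_0)
    sigma_scale -- linear scaling factor applied to the observed sigma
    resolution  -- size of one data point in the same physical units
    """
    sigma_physical = fwhm * FWHM_TO_SIGMA  # observed standard deviation
    return sigma_physical * sigma_scale / resolution  # scaled, then in data points


# Example: peaks with a 10 s FWHM sampled with a 2 s cycle length.
sigma_rt = sigma_in_pixels(fwhm=10.0, sigma_scale=1.0, resolution=2.0)
```

Under these assumptions, halving the cycle length doubles the sigma in data points, while sigma_scale keeps the same meaning.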
- static gaussian_kernel_2d(size_x: int, size_y: int, sigma_x: float, sigma_y: float)
Create a 2D gaussian kernel with a given size and standard deviation.
- Parameters:
size (int) – Width and height of the kernel matrix.
sigma_x (float) – Standard deviation of the gaussian kernel in x direction. This will correspond to the RT dimension.
sigma_y (float) – Standard deviation of the gaussian kernel in y direction. This will correspond to the mobility dimension.
- Returns:
weights – 2D gaussian kernel matrix of shape (size, size).
- Return type:
np.ndarray, dtype=np.float32
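A minimal sketch of such a kernel, built as the outer product of two 1D Gaussians. The function name, the centring of the pixel grid, the axis order (rows for y, columns for x), and the normalisation to a unit sum are all assumptions of this example, not confirmed behaviour of gaussian_kernel_2d.

```python
import numpy as np


def gaussian_kernel_2d_sketch(size_x: int, size_y: int,
                              sigma_x: float, sigma_y: float) -> np.ndarray:
    """Hypothetical stand-in: separable 2D Gaussian, normalised to sum to 1."""
    # Pixel coordinates centred on zero, e.g. size 4 -> [-1.5, -0.5, 0.5, 1.5].
    x = np.arange(size_x) - (size_x - 1) / 2.0
    y = np.arange(size_y) - (size_y - 1) / 2.0

    gx = np.exp(-0.5 * (x / sigma_x) ** 2)
    gy = np.exp(-0.5 * (y / sigma_y) ** 2)

    # Rows follow y (mobility), columns follow x (RT).
    weights = np.outer(gy, gx)
    return (weights / weights.sum()).astype(np.float32)
```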
- alphadia.search.selection.kernel.multivariate_normal(x: ndarray, mu: ndarray, sigma: ndarray)
Probability density function of the multivariate normal distribution.
This implementation is most likely inefficient. It is only used to create the Gaussian kernel, so it runs only a few times and on small kernels.
- Parameters:
x (np.ndarray) – (N, D,)
mu (np.ndarray) – (1, D,)
sigma (np.ndarray) – (D, D,)
- Returns:
array of shape (N,) with the density at each point
- Return type:
np.ndarray, float32
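For reference, the textbook density with the shapes listed above can be written in a few lines of NumPy. This is a sketch under the assumption that sigma is a covariance matrix; the function name is hypothetical and the code is not taken from alphadia.

```python
import numpy as np


def multivariate_normal_sketch(x: np.ndarray, mu: np.ndarray,
                               sigma: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in: density of N(mu, sigma) at each row of x.

    x: (N, D), mu: (1, D), sigma: (D, D) covariance matrix.
    """
    d = x.shape[1]
    diff = x - mu  # (N, D) by broadcasting
    inv = np.linalg.inv(sigma)
    # Squared Mahalanobis distance of every point, shape (N,).
    maha = np.einsum("nd,de,ne->n", diff, inv, diff)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(sigma))
    return (np.exp(-0.5 * maha) / norm).astype(np.float32)
```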
Utility functions for candidate selection operations.
- alphadia.search.selection.utils.assemble_isotope_mz(mono_mz, charge, isotope_intensity)
Assemble the isotope m/z values from the precursor m/z and the isotope offsets.
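One plausible reading of this, sketched under assumptions: isotope k lies k isotope spacings above the monoisotopic peak, and on the m/z axis each spacing is divided by the charge. The function name, the use of the 13C-12C mass difference as the spacing, and the use of isotope_intensity only to determine the number of isotopes are assumptions of this example.

```python
# Mass difference between 13C and 12C in Da; whether alphadia uses
# exactly this constant is an assumption of this sketch.
ISOTOPE_MASS_DIFF = 1.0033548


def assemble_isotope_mz_sketch(mono_mz: float, charge: int,
                               isotope_intensity: list[float]) -> list[float]:
    """Hypothetical stand-in: one m/z value per entry in isotope_intensity."""
    # Isotope k sits k spacings above the monoisotopic peak;
    # on the m/z axis each spacing shrinks by the charge.
    return [mono_mz + k * ISOTOPE_MASS_DIFF / charge
            for k in range(len(isotope_intensity))]


# Example: a doubly charged precursor at m/z 500 with three isotopes.
mzs = assemble_isotope_mz_sketch(500.0, 2, [0.6, 0.3, 0.1])
```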
- alphadia.search.selection.utils.find_peaks_1d(a: ndarray, top_n: int = 3) -> tuple[ndarray, ndarray, ndarray]
Accepts a dense representation and returns the top peaks (top_n, three by default).
- alphadia.search.selection.utils.find_peaks_2d(a: ndarray, top_n: int = 3) -> tuple[ndarray, ndarray, ndarray]
Accepts a dense representation and returns the top peaks (top_n, three by default).
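To make the 2D case concrete, here is a small peak-picking sketch. The meaning of the three returned arrays is not documented here, so the example assumes row index, column index, and intensity. The peak definition (unique maximum of a 3x3 neighbourhood, borders skipped) and the function name are likewise assumptions, not the library's method.

```python
import numpy as np


def find_peaks_2d_sketch(a: np.ndarray, top_n: int = 3):
    """Hypothetical stand-in: the top_n strongest interior local maxima of a 2D array."""
    rows, cols, values = [], [], []
    # Skip the border so every candidate has a full 3x3 neighbourhood.
    for i in range(1, a.shape[0] - 1):
        for j in range(1, a.shape[1] - 1):
            patch = a[i - 1:i + 2, j - 1:j + 2]
            # A peak is the unique maximum of its 3x3 neighbourhood.
            if a[i, j] == patch.max() and (patch == a[i, j]).sum() == 1:
                rows.append(i)
                cols.append(j)
                values.append(a[i, j])

    order = np.argsort(values)[::-1][:top_n]  # strongest first
    return (np.asarray(rows, dtype=np.int64)[order],
            np.asarray(cols, dtype=np.int64)[order],
            np.asarray(values, dtype=a.dtype)[order])
```

If fewer than top_n peaks exist, the returned arrays are simply shorter; padding to a fixed length would be a separate design choice.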