Introduction

Cdsw optimizes extraction windows for DIA/SWATH so that each window contains approximately the same number of precursors. The optimization is based on spectral library data or on non-redundant .blib files (BiblioSpec).

Prerequisites

Constant width method

data("masses")
cdsw <- Cdsw(masses , nbins = 25, digits = 1)
cdsw$plot()

knitr::kable(cdsw$asTable())
from to mid width counts
349.63 384.62 367.125 34.99 6688
383.62 418.62 401.120 35.00 8357
417.62 452.61 435.115 34.99 9661
451.61 486.61 469.110 35.00 10452
485.61 520.60 503.105 34.99 10725
519.60 554.59 537.095 34.99 10837
553.59 588.59 571.090 35.00 10433
587.59 622.58 605.085 34.99 9750
621.58 656.58 639.080 35.00 9276
655.58 690.57 673.075 34.99 8406
689.57 724.56 707.065 34.99 7848
723.56 758.56 741.060 35.00 7116
757.56 792.55 775.055 34.99 6355
791.55 826.55 809.050 35.00 5666
825.55 860.54 843.045 34.99 4923
859.54 894.53 877.035 34.99 4359
893.53 928.53 911.030 35.00 3807
927.53 962.52 945.025 34.99 3344
961.52 996.52 979.020 35.00 2724
995.52 1030.51 1013.015 34.99 2357
1029.51 1064.50 1047.005 34.99 2042
1063.50 1098.50 1081.000 35.00 1807
1097.50 1132.49 1114.995 34.99 1313
1131.49 1166.49 1148.990 35.00 1088
1165.49 1200.48 1182.985 34.99 881
constError <- cdsw$error()
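
To make the idea behind equal-width windows concrete, here is a minimal base R sketch. It is not the internal implementation of Cdsw; the simulated vector mz is a hypothetical stand-in for the masses data set, and the 1 m/z overlap visible in the table above is ignored.

```r
# Simulated precursor m/z values; a hypothetical stand-in for `masses`.
set.seed(1)
mz <- rlnorm(5000, meanlog = log(600), sdlog = 0.3)
mz <- mz[mz >= 350 & mz <= 1200]

# Equal-width windows: nbins + 1 equally spaced breaks over the observed range.
nbins <- 25
breaks <- seq(min(mz), max(mz), length.out = nbins + 1)

# Count precursors per window; include.lowest keeps the smallest mass.
counts <- as.vector(table(cut(mz, breaks = breaks, include.lowest = TRUE)))

const_windows <- data.frame(
  from   = head(breaks, -1),
  to     = tail(breaks, -1),
  width  = diff(breaks),
  counts = counts
)
```

All windows share one width, so the counts simply follow the precursor density, which is why they vary so strongly in the table above.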

Classical method based on quantiles

Each window contains approximately the same number of MS1 precursors.

cdsw$quantile_breaks()
cdsw$plot()

knitr::kable(cdsw$asTable())
quantile from to mid width counts
0% 349.63 381.03 365.330 31.40 5956
4% 380.03 406.71 393.370 26.68 6131
8% 405.71 429.24 417.475 23.53 6070
12% 428.24 450.05 439.145 21.81 6086
16% 449.05 470.06 459.555 21.01 6095
20% 469.06 488.80 478.930 19.74 6107
24% 487.80 508.12 497.960 20.32 6173
28% 507.12 526.81 516.965 19.69 6150
32% 525.81 545.79 535.800 19.98 6166
36% 544.79 565.29 555.040 20.50 6123
40% 564.29 584.80 574.545 20.51 6139
44% 583.80 605.12 594.460 21.32 6121
48% 604.12 626.34 615.230 22.22 6113
52% 625.34 648.36 636.850 23.02 6108
56% 647.36 672.34 659.850 24.98 6074
60% 671.34 696.53 683.935 25.19 6082
64% 695.53 722.89 709.210 27.36 6054
68% 721.89 751.40 736.645 29.51 6053
72% 750.40 782.43 766.415 32.03 6023
76% 781.43 817.40 799.415 35.97 5982
80% 816.40 857.96 837.180 41.56 6026
84% 856.96 905.62 881.290 48.66 5971
88% 904.62 964.93 934.775 60.31 5943
92% 963.93 1049.48 1006.705 85.55 5903
96% 1048.48 1200.48 1124.480 152.00 5863
quantileError <- cdsw$error()
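
The quantile idea can be sketched in a few lines of base R. This is an illustration under assumptions, not the code behind quantile_breaks(): mz is simulated data, and the window overlap seen in the table above is left out.

```r
# Simulated precursor m/z values; a hypothetical stand-in for `masses`.
set.seed(1)
mz <- rlnorm(5000, meanlog = log(600), sdlog = 0.3)
mz <- mz[mz >= 350 & mz <= 1200]

# Breaks at equally spaced quantiles give (almost) equal counts per window.
nbins <- 25
probs <- seq(0, 1, length.out = nbins + 1)
breaks <- quantile(mz, probs = probs, names = FALSE)

counts <- as.vector(table(cut(mz, breaks = breaks, include.lowest = TRUE)))

quantile_windows <- data.frame(
  from   = head(breaks, -1),
  to     = tail(breaks, -1),
  width  = diff(breaks),
  counts = counts
)
```

The widths now adapt to the precursor density: windows are narrow where precursors are dense and wide in the sparse high-mass tail.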

Adjust windows

With this method, the start and end of each window are shifted into a mass range with as few MS1 peaks as possible.

knitr::kable(cdsw$optimizeWindows(maxbin = 10, plot = TRUE))

from to mid width counts
350.13 380.95 365.54 30.82 5952
380.45 406.35 393.40 25.90 5932
406.05 429.05 417.55 23.00 5948
428.65 449.65 439.15 21.00 5891
449.45 469.65 459.55 20.20 5872
469.45 488.35 478.90 18.90 5860
488.15 508.05 498.10 19.90 6137
507.45 526.45 516.95 19.00 5892
526.15 545.45 535.80 19.30 6022
545.15 565.05 555.10 19.90 5992
564.55 584.45 574.50 19.90 5976
584.15 605.05 594.60 20.90 6035
604.55 626.05 615.30 21.50 5893
625.55 648.15 636.85 22.60 5987
647.55 672.15 659.85 24.60 6023
671.75 696.15 683.95 24.40 5890
695.55 722.55 709.05 27.00 5981
722.25 751.15 736.70 28.90 5932
750.65 782.15 766.40 31.50 5927
781.65 817.15 799.40 35.50 5944
816.65 857.55 837.10 40.90 5901
857.35 905.25 881.30 47.90 5904
905.05 964.65 934.85 59.60 5881
964.35 1049.15 1006.75 84.80 5864
1048.95 1200.05 1124.50 151.10 5843
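
One way to picture the boundary adjustment is the following base R sketch. It assumes a fine histogram with 0.1 m/z bins and a hypothetical helper shift_break() that moves a boundary to the least populated bin within maxbin bins on either side; how optimizeWindows() does this internally is not shown in this document, so treat the details as assumptions.

```r
# Simulated precursor m/z values; a hypothetical stand-in for `masses`.
set.seed(1)
mz <- rlnorm(5000, meanlog = log(600), sdlog = 0.3)
mz <- mz[mz >= 350 & mz <= 1200]

# Initial quantile-based breaks (5 windows keep the example small).
breaks <- quantile(mz, probs = seq(0, 1, length.out = 6), names = FALSE)

# Fine histogram of the precursor density with 0.1 m/z bins.
grid <- seq(floor(min(mz)) * 10, ceiling(max(mz)) * 10) / 10
dens <- hist(mz, breaks = grid, plot = FALSE)

# Hypothetical helper: move one break to the middle of the emptiest bin
# found within +/- maxbin bins of its current position.
shift_break <- function(b, dens, maxbin = 10) {
  i <- findInterval(b, dens$breaks, all.inside = TRUE)
  idx <- max(1, i - maxbin):min(length(dens$counts), i + maxbin)
  best <- idx[which.min(dens$counts[idx])]
  dens$mids[best]
}

# Only interior breaks are moved; the outer limits stay fixed.
inner <- seq(2, length(breaks) - 1)
adjusted <- breaks
adjusted[inner] <- vapply(breaks[inner], shift_break, numeric(1),
                          dens = dens, maxbin = 10)
```

With maxbin = 10 and 0.1 m/z bins, each boundary moves by at most about 1 m/z, so the per-window counts change only slightly.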

Dynamic SWATH windows with constraints

  • The mass range can be specified (mass_range).
  • The maximal window size can be specified (max_window_size). Windows should not be too large, because the collision energy can only be optimal for a limited mass range (personal communication by Bernd R.).
  • The minimal window size can be specified (min_window_size).
  • The target number of windows can be specified (nr_windows).
  • Boundaries between windows are placed in regions where no precursors are observed.

cdsw$sampling_breaks(maxwindow = 100, plot = TRUE)

cdsw$plot()

knitr::kable(cdsw$asTable())
quantile from to mid width counts
0% 349.63 381.72 365.675 32.09 6123
4% 380.72 408.69 394.705 27.97 6392
8% 407.69 432.39 420.040 24.70 6553
12% 431.39 455.08 443.235 23.69 6612
16% 454.08 476.75 465.415 22.67 6698
20% 475.75 497.28 486.515 21.53 6692
24% 496.28 518.27 507.275 21.99 6720
28% 517.27 538.86 528.065 21.59 6726
32% 537.86 560.14 549.000 22.28 6732
36% 559.14 581.33 570.235 22.19 6741
40% 580.33 603.32 591.825 22.99 6625
44% 602.32 626.32 614.320 24.00 6566
48% 625.32 650.13 637.725 24.81 6563
52% 649.13 675.35 662.240 26.22 6453
56% 674.35 701.39 687.870 27.04 6428
60% 700.39 728.90 714.645 28.51 6246
64% 727.90 758.86 743.380 30.96 6202
68% 757.86 790.62 774.240 32.76 6023
72% 789.62 826.39 808.005 36.77 5948
76% 825.39 866.58 845.985 41.19 5783
80% 865.58 911.76 888.670 46.18 5452
84% 910.76 963.46 937.110 52.70 5175
88% 962.46 1026.54 994.500 64.08 4698
92% 1025.54 1101.10 1063.320 75.56 4154
96% 1100.10 1200.48 1150.290 100.38 3103
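
The trade-off between equal counts and a maximal window width can be illustrated with a simple greedy scheme. This is explicitly not the algorithm used by sampling_breaks(); the helper greedy_breaks() and the simulated mz vector are hypothetical, and only the width constraint is modelled.

```r
# Simulated precursor m/z values; a hypothetical stand-in for `masses`.
set.seed(1)
mz <- rlnorm(5000, meanlog = log(600), sdlog = 0.3)
mz <- mz[mz >= 350 & mz <= 1200]

# Hypothetical helper: close a window once it holds `target` precursors
# or once it would become wider than `max_width`, whichever comes first.
greedy_breaks <- function(mz, target, max_width) {
  mz <- sort(mz)
  start <- mz[1]
  breaks <- start
  n_in_window <- 0
  for (m in mz) {
    # Width limit reached: close window(s) at start + max_width.
    while (m - start > max_width) {
      start <- start + max_width
      breaks <- c(breaks, start)
      n_in_window <- 0
    }
    n_in_window <- n_in_window + 1
    # Count target reached: close the window at this precursor.
    if (n_in_window >= target) {
      start <- m
      breaks <- c(breaks, start)
      n_in_window <- 0
    }
  }
  # Close the last, partially filled window at the largest mass.
  if (breaks[length(breaks)] < mz[length(mz)]) {
    breaks <- c(breaks, mz[length(mz)])
  }
  breaks
}

target <- ceiling(length(mz) / 25)
breaks <- greedy_breaks(mz, target = target, max_width = 100)
counts <- as.vector(table(cut(mz, breaks = breaks, include.lowest = TRUE)))

constrained_windows <- data.frame(
  from   = head(breaks, -1),
  to     = tail(breaks, -1),
  width  = diff(breaks),
  counts = counts
)
```

In this sketch the number of windows is a result rather than an input. Windows in the sparse high-mass region hit the width limit before they reach the target count, which mirrors the falling counts in the last rows of the table above.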
knitr::kable(cdsw$optimizeWindows(maxbin = 10, plot = TRUE))

from to mid width counts
350.13 381.35 365.74 31.22 6053
381.05 408.35 394.70 27.30 6304
408.05 432.05 420.05 24.00 6323
431.65 454.65 443.15 23.00 6474
454.45 476.35 465.40 21.90 6508
476.15 496.85 486.50 20.70 6498
496.45 517.85 507.15 21.40 6587
517.45 538.45 527.95 21.00 6492
538.15 560.05 549.10 21.90 6688
559.55 581.05 570.30 21.50 6482
580.55 603.05 591.80 22.50 6526
602.55 626.05 614.30 23.50 6452
625.55 650.15 637.85 24.60 6495
649.55 675.15 662.35 25.60 6286
674.55 701.15 687.85 26.60 6293
700.55 728.55 714.55 28.00 6160
728.25 758.55 743.40 30.30 6119
758.25 790.15 774.20 31.90 5881
789.65 826.25 807.95 36.60 5932
825.65 866.25 845.95 40.60 5666
866.05 911.35 888.70 45.30 5342
911.25 963.15 937.20 51.90 5108
962.75 1026.05 994.40 63.30 4623
1025.95 1100.65 1063.30 74.70 4126
1100.45 1200.05 1150.25 99.60 3091
mixedError <- cdsw$error()

Benchmarking of the methods

We compare the ideal number of MS1 peaks per SWATH window (the same in every window) with the numbers obtained from each of the three implemented methods.

barplot(c(const = constError$score1, quantile = quantileError$score1, mixed = mixedError$score1), ylab = "Manhattan distance")

barplot(c(const = constError$score2, quantile = quantileError$score2, mixed = mixedError$score2), ylab = "Euclidean distance")

Method 3 (the mixed method with constraints) has a relatively small error even though it also fulfills constraints such as a maximum window size.
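
The two distances in the bar plots can be sketched as follows. The helper window_error() is hypothetical and assumes that the ideal count is the mean count per window; the exact definition used by cdsw$error() may differ.

```r
# Hypothetical helper: distance between observed and ideal counts per window.
# Assumption: the ideal count is the mean of the observed counts.
window_error <- function(counts) {
  ideal <- rep(mean(counts), length(counts))
  list(
    manhattan = sum(abs(counts - ideal)),
    euclidean = sqrt(sum((counts - ideal)^2))
  )
}

# Toy counts for three windows, not taken from the tables above.
even   <- window_error(c(20, 20, 20))
uneven <- window_error(c(10, 20, 30))
```

Perfectly balanced windows give a distance of zero under both measures. The Euclidean distance penalizes a few strongly deviating windows more heavily than the Manhattan distance does.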

Session info

## R version 4.1.1 (2021-08-10)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19044)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United States.1252 
## [2] LC_CTYPE=English_United States.1252   
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] prozor_0.3.1
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.7            highr_0.9             pillar_1.6.4         
##  [4] bslib_0.3.1           compiler_4.1.1        jquerylib_0.1.4      
##  [7] tools_4.1.1           digest_0.6.28         docopt_0.7.1         
## [10] tibble_3.1.4          lifecycle_1.0.1       jsonlite_1.7.2       
## [13] evaluate_0.14         memoise_2.0.0         AhoCorasickTrie_0.1.2
## [16] lattice_0.20-44       pkgconfig_2.0.3       rlang_0.4.11         
## [19] Matrix_1.3-4          yaml_2.2.1            pkgdown_1.6.1        
## [22] xfun_0.26             fastmap_1.1.0         stringr_1.4.0        
## [25] knitr_1.36            desc_1.4.0            fs_1.5.0             
## [28] sass_0.4.0            vctrs_0.3.8           systemfonts_1.0.3    
## [31] hms_1.1.1             rprojroot_2.0.2       ade4_1.7-18          
## [34] grid_4.1.1            R6_2.5.1              textshaping_0.3.6    
## [37] fansi_0.5.0           rmarkdown_2.11        tzdb_0.1.2           
## [40] purrr_0.3.4           readr_2.0.1           seqinr_4.2-8         
## [43] magrittr_2.0.1        htmltools_0.5.2       ellipsis_0.3.2       
## [46] MASS_7.3-54           ragg_1.2.0            utf8_1.2.2           
## [49] stringi_1.7.4         cachem_1.0.6          crayon_1.4.2