Pbase example data
Package: Pbase
Authors: Laurent Gatto and Sebastian Gibb
Last compiled: Sun May 15 21:37:04 2016
Last modified: 2016-05-03 14:31:08
This vignette briefly introduces the central data object of the Pbase package, namely Proteins instances, as depicted below. They contain a set of proteins (10 in the figure below), each made up of a protein sequence (grey boxes) and annotation data (table on the left). Each protein links to a set of experimentally observed peptides (also in grey) that are decorated with their own annotation data. The figure also shows the accessors for the different data slots, which are detailed in ?Proteins.
Proteins objects are populated with protein sequences read from a FASTA file; the peptides originate from an LC-MS/MS experiment.
The original data used below is a 10 fmol Peptide Retention Time Calibration Mixture spiked into 50 ng HeLa background, acquired on a Thermo Orbitrap Q Exactive instrument. A restricted set of high-scoring human proteins from the UniProt release 2015_02 was searched using the MSGF+ search engine.
library("Biostrings")
## Loading required package: XVector
fafile <- system.file("extdata/HUMAN_2015_02_selected.fasta",
                      package = "Pbase")
fa <- readAAStringSet(fafile)
fa
## A AAStringSet instance of length 9
## width seq names
## [1] 2602 MPVTEKDLAEDAPWKKIQQNT...LAVKWGEEHIPGSPFHVTVP sp|O75369|FLNB_HU...
## [2] 3374 MSPESGHSRIFEATAGPNKPE...TLSKDSLSNGVPSGRQAEFS sp|A4UGR9|XIRP2_H...
## [3] 2624 MFRRARLSVKPNVRPGVGARG...ATTVSEYFFNDIFIEVDETE sp|A6H8Y1|BDP1_HU...
## [4] 911 MVDYHAANQSYQYGPSSAGNG...VPGALDYKSFSTALYGESDL sp|O43707|ACTN4_H...
## [5] 417 MSLSNKLTLDKLDVKGKRVVM...ASLELLEGKVLPGVDALSNI sp|P00558|PGK1_HU...
## [6] 375 MDDDIAALVVDNGSGMCKAGF...WISKQEYDESGPSIVHRKCF sp|P60709|ACTB_HU...
## [7] 664 METPSQRRATRSGAQASSTPL...SYLLGNSSPRTQSPQNCSIM sp|P02545|LMNA_HU...
## [8] 364 MPYQYPALTPEQKKELSDIAH...PSGQAGAAASESLFVSNHAY sp|P04075|ALDOA_H...
## [9] 418 MARRKPEGSSFNMTHLSMAMA...PSGQAGAAASESLFVSNHAY sp|P04075-2|ALDOA...
library("mzID")
idfile <- system.file("extdata/Thermo_Hela_PRTC_selected.mzid",
                      package = "Pbase")
id <- flatten(mzID(idfile))
## reading Thermo_Hela_PRTC_selected.mzid... DONE!
dim(id)
## [1] 137 29
head(id)
## spectrumid scan number(s)
## 1 index=173 12256
## 1.1 index=173 12256
## 2 index=163 11860
## 2.1 index=163 11860
## 3 index=200 13408
## 3.1 index=200 13408
## spectrum title
## 1 msLevel 2; retentionTime 2094.56706; scanNum 12256; precMz 1137.06665029649; precCharge 2
## 1.1 msLevel 2; retentionTime 2094.56706; scanNum 12256; precMz 1137.06665029649; precCharge 2
## 2 msLevel 2; retentionTime 2039.84424; scanNum 11860; precMz 1136.57450195803; precCharge 2
## 2.1 msLevel 2; retentionTime 2039.84424; scanNum 11860; precMz 1136.57450195803; precCharge 2
## 3 msLevel 2; retentionTime 2258.27868; scanNum 13408; precMz 703.038108542133; precCharge 3
## 3.1 msLevel 2; retentionTime 2258.27868; scanNum 13408; precMz 703.038108542133; precCharge 3
## acquisitionnum passthreshold rank calculatedmasstocharge
## 1 173 TRUE 1 1136.574
## 1.1 173 TRUE 1 1136.574
## 2 163 TRUE 1 1136.574
## 2.1 163 TRUE 1 1136.574
## 3 200 TRUE 1 703.037
## 3.1 200 TRUE 1 703.037
## experimentalmasstocharge chargestate ms-gf:denovoscore ms-gf:evalue
## 1 1137.0667 2 132 2.597097e-18
## 1.1 1137.0667 2 132 2.597097e-18
## 2 1136.5745 2 230 4.942664e-17
## 2.1 1136.5745 2 230 4.942664e-17
## 3 703.0381 3 145 4.080429e-10
## 3.1 703.0381 3 145 4.080429e-10
## ms-gf:rawscore ms-gf:specevalue assumeddissociationmethod isotopeerror
## 1 118 2.276758e-22 CID 1
## 1.1 118 2.276758e-22 CID 1
## 2 186 4.333009e-21 CID 0
## 2.1 186 4.333009e-21 CID 0
## 3 98 3.578068e-14 CID 0
## 3.1 98 3.578068e-14 CID 0
## isdecoy post pre end start accession length
## 1 FALSE C K 134 112 sp|P04075|ALDOA_HUMAN 364
## 1.1 FALSE C K 188 166 sp|P04075-2|ALDOA_HUMAN 418
## 2 FALSE C K 134 112 sp|P04075|ALDOA_HUMAN 364
## 2.1 FALSE C K 188 166 sp|P04075-2|ALDOA_HUMAN 418
## 3 FALSE Y K 173 154 sp|P04075|ALDOA_HUMAN 364
## 3.1 FALSE Y K 227 208 sp|P04075-2|ALDOA_HUMAN 418
## description
## 1 Fructose-bisphosphate aldolase A OS=Homo sapiens GN=ALDOA PE=1 SV=2
## 1.1 Isoform 2 of Fructose-bisphosphate aldolase A OS=Homo sapiens GN=ALDOA
## 2 Fructose-bisphosphate aldolase A OS=Homo sapiens GN=ALDOA PE=1 SV=2
## 2.1 Isoform 2 of Fructose-bisphosphate aldolase A OS=Homo sapiens GN=ALDOA
## 3 Fructose-bisphosphate aldolase A OS=Homo sapiens GN=ALDOA PE=1 SV=2
## 3.1 Isoform 2 of Fructose-bisphosphate aldolase A OS=Homo sapiens GN=ALDOA
## pepseq modified modification
## 1 GVVPLAGTNGETTTQGLDGLSER FALSE <NA>
## 1.1 GVVPLAGTNGETTTQGLDGLSER FALSE <NA>
## 2 GVVPLAGTNGETTTQGLDGLSER FALSE <NA>
## 2.1 GVVPLAGTNGETTTQGLDGLSER FALSE <NA>
## 3 IGEHTPSALAIMENANVLAR FALSE <NA>
## 3.1 IGEHTPSALAIMENANVLAR FALSE <NA>
## idFile spectrumFile
## 1 Thermo_Hela_PRTC_selected.mzid Thermo_Hela_PRTC_selected.mgf
## 1.1 Thermo_Hela_PRTC_selected.mzid Thermo_Hela_PRTC_selected.mgf
## 2 Thermo_Hela_PRTC_selected.mzid Thermo_Hela_PRTC_selected.mgf
## 2.1 Thermo_Hela_PRTC_selected.mzid Thermo_Hela_PRTC_selected.mgf
## 3 Thermo_Hela_PRTC_selected.mzid Thermo_Hela_PRTC_selected.mgf
## 3.1 Thermo_Hela_PRTC_selected.mzid Thermo_Hela_PRTC_selected.mgf
## databaseFile
## 1 HUMAN_2015_02_selected.fasta
## 1.1 HUMAN_2015_02_selected.fasta
## 2 HUMAN_2015_02_selected.fasta
## 2.1 HUMAN_2015_02_selected.fasta
## 3 HUMAN_2015_02_selected.fasta
## 3.1 HUMAN_2015_02_selected.fasta
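The flattened identification table holds one row per peptide-to-protein match, which is why each spectrum above appears twice: once for P04075 and once for its isoform P04075-2. As a minimal sketch of how these columns relate, the snippet below retypes a few columns of the six rows printed above into a small data frame and checks that start and end are inclusive, 1-based coordinates whose span equals the peptide length. The name id_small is made up for illustration; only base R is used.

```r
# Hand-copied subset of the head(id) output above; id_small is an
# illustrative name, not an object created by mzID or Pbase.
id_small <- data.frame(
    spectrumid = rep(c("index=173", "index=163", "index=200"), each = 2),
    accession  = rep(c("sp|P04075|ALDOA_HUMAN", "sp|P04075-2|ALDOA_HUMAN"),
                     times = 3),
    start      = c(112, 166, 112, 166, 154, 208),
    end        = c(134, 188, 134, 188, 173, 227),
    pepseq     = rep(c("GVVPLAGTNGETTTQGLDGLSER", "IGEHTPSALAIMENANVLAR"),
                     times = c(4, 2)),
    stringsAsFactors = FALSE
)

# Peptide length implied by the inclusive coordinates
id_small$width <- id_small$end - id_small$start + 1

# The coordinate span matches the number of residues in each peptide
stopifnot(all(id_small$width == nchar(id_small$pepseq)))

# Number of matches per protein accession
table(id_small$accession)

# Offset between isoform and canonical start coordinates for each spectrum
iso <- grepl("P04075-2", id_small$accession, fixed = TRUE)
id_small$start[iso] - id_small$start[!iso]
```

The constant offset of 54 between the two sets of coordinates equals the difference in length between the isoform (418 residues) and the canonical sequence (364 residues) reported in the length column.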
library("Pbase")
p <- Proteins(fafile)
p <- addIdentificationData(p, idfile)
## Reading 1 identification files:
## 1. /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## done.
p
## S4 class type : Proteins
## Class version : 0.1
## Created : Sun May 15 21:37:17 2016
## Number of Proteins: 9
## Sequences:
## [1] A4UGR9 [2] A6H8Y1 ... [8] P04075-2 [9] P60709
## Sequence features:
## [1] DB [2] AccessionNumber ... [11] Filename [12] npeps
## Peptide features:
## [1] DB [2] AccessionNumber ... [27] acquisitionNum [28] filenames
A Proteins object is composed of a set of protein sequences, accessible with the aa accessor, as well as an optional set of peptide features that are mapped as coordinates along the proteins, available with pranges. The actual peptide sequences can be extracted with pfeatures.
aa(p)
## A AAStringSet instance of length 9
## width seq names
## [1] 3374 MSPESGHSRIFEATAGPNKPE...TLSKDSLSNGVPSGRQAEFS A4UGR9
## [2] 2624 MFRRARLSVKPNVRPGVGARG...ATTVSEYFFNDIFIEVDETE A6H8Y1
## [3] 911 MVDYHAANQSYQYGPSSAGNG...VPGALDYKSFSTALYGESDL O43707
## [4] 2602 MPVTEKDLAEDAPWKKIQQNT...LAVKWGEEHIPGSPFHVTVP O75369
## [5] 417 MSLSNKLTLDKLDVKGKRVVM...ASLELLEGKVLPGVDALSNI P00558
## [6] 664 METPSQRRATRSGAQASSTPL...SYLLGNSSPRTQSPQNCSIM P02545
## [7] 364 MPYQYPALTPEQKKELSDIAH...PSGQAGAAASESLFVSNHAY P04075
## [8] 418 MARRKPEGSSFNMTHLSMAMA...PSGQAGAAASESLFVSNHAY P04075-2
## [9] 375 MDDDIAALVVDNGSGMCKAGF...WISKQEYDESGPSIVHRKCF P60709
pranges(p)
## IRangesList of length 9
## $A4UGR9
## IRanges object with 36 ranges and 28 metadata columns:
## start end width | DB AccessionNumber EntryName
## <integer> <integer> <integer> | <Rle> <character> <character>
## A4UGR9 2743 2760 18 | sp A4UGR9 XIRP2_HUMAN
## A4UGR9 307 318 12 | sp A4UGR9 XIRP2_HUMAN
## A4UGR9 1858 1870 13 | sp A4UGR9 XIRP2_HUMAN
## A4UGR9 1699 1708 10 | sp A4UGR9 XIRP2_HUMAN
## A4UGR9 2622 2637 16 | sp A4UGR9 XIRP2_HUMAN
## ... ... ... ... . ... ... ...
## A4UGR9 20 31 12 | sp A4UGR9 XIRP2_HUMAN
## A4UGR9 1712 1729 18 | sp A4UGR9 XIRP2_HUMAN
## A4UGR9 48 61 14 | sp A4UGR9 XIRP2_HUMAN
## A4UGR9 2082 2094 13 | sp A4UGR9 XIRP2_HUMAN
## A4UGR9 2743 2756 14 | sp A4UGR9 XIRP2_HUMAN
## IsoformName
## <Rle>
## A4UGR9 <NA>
## A4UGR9 <NA>
## A4UGR9 <NA>
## A4UGR9 <NA>
## A4UGR9 <NA>
## ... ...
## A4UGR9 <NA>
## A4UGR9 <NA>
## A4UGR9 <NA>
## A4UGR9 <NA>
## A4UGR9 <NA>
## ProteinName
## <character>
## A4UGR9 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## A4UGR9 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## A4UGR9 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## A4UGR9 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## A4UGR9 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## ... ...
## A4UGR9 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## A4UGR9 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## A4UGR9 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## A4UGR9 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## A4UGR9 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## OrganismName GeneName ProteinExistence SequenceVersion
## <Rle> <Rle> <Rle> <Rle>
## A4UGR9 Homo sapiens XIRP2 Evidence at protein level 2
## A4UGR9 Homo sapiens XIRP2 Evidence at protein level 2
## A4UGR9 Homo sapiens XIRP2 Evidence at protein level 2
## A4UGR9 Homo sapiens XIRP2 Evidence at protein level 2
## A4UGR9 Homo sapiens XIRP2 Evidence at protein level 2
## ... ... ... ... ...
## A4UGR9 Homo sapiens XIRP2 Evidence at protein level 2
## A4UGR9 Homo sapiens XIRP2 Evidence at protein level 2
## A4UGR9 Homo sapiens XIRP2 Evidence at protein level 2
## A4UGR9 Homo sapiens XIRP2 Evidence at protein level 2
## A4UGR9 Homo sapiens XIRP2 Evidence at protein level 2
## Comment spectrumID chargeState rank passThreshold
## <Rle> <factor> <integer> <integer> <logical>
## A4UGR9 <NA> index=124 3 1 TRUE
## A4UGR9 <NA> index=28 2 1 TRUE
## A4UGR9 <NA> index=20 2 1 TRUE
## A4UGR9 <NA> index=187 2 1 TRUE
## A4UGR9 <NA> index=211 3 1 TRUE
## ... ... ... ... ... ...
## A4UGR9 <NA> index=99 2 1 TRUE
## A4UGR9 <NA> index=9 2 1 TRUE
## A4UGR9 <NA> index=122 2 1 TRUE
## A4UGR9 <NA> index=87 2 1 TRUE
## A4UGR9 <NA> index=77 2 1 TRUE
## experimentalMassToCharge calculatedMassToCharge
## <numeric> <numeric>
## A4UGR9 715.0305 715.0308
## A4UGR9 715.9177 715.4117
## A4UGR9 786.9066 786.9081
## A4UGR9 629.8380 629.3386
## A4UGR9 645.3429 645.3511
## ... ... ...
## A4UGR9 619.2888 618.7782
## A4UGR9 1014.0198 1013.5117
## A4UGR9 821.4005 820.8909
## A4UGR9 720.3445 720.3527
## A4UGR9 821.9231 821.9254
## sequence modNum isDecoy post pre
## <factor> <integer> <logical> <factor> <factor>
## A4UGR9 QEITQNKSFFSSVKESQR 0 FALSE D K
## A4UGR9 LPVPKDVYSKQR 0 FALSE N R
## A4UGR9 EQNNDALEKSLRR 0 FALSE L R
## A4UGR9 SLKESSHRWK 0 FALSE E K
## A4UGR9 LKMVPRKQREFSGSDR 0 FALSE G K
## ... ... ... ... ... ...
## A4UGR9 PESGFAEDSAAR 0 FALSE G K
## A4UGR9 QPDAIPGDIEKAIECLEK 1 FALSE A K
## A4UGR9 MARYQAAVSRGDCR 1 FALSE S R
## A4UGR9 TNTSTGLKMAMER 0 FALSE S K
## A4UGR9 QEITQNKSFFSSVK 0 FALSE E K
## start end DatabaseAccess DBseqLength DatabaseSeq
## <integer> <integer> <factor> <integer> <factor>
## A4UGR9 2743 2760 sp|A4UGR9|XIRP2_HUMAN 3374
## A4UGR9 307 318 sp|A4UGR9|XIRP2_HUMAN 3374
## A4UGR9 1858 1870 sp|A4UGR9|XIRP2_HUMAN 3374
## A4UGR9 1699 1708 sp|A4UGR9|XIRP2_HUMAN 3374
## A4UGR9 2622 2637 sp|A4UGR9|XIRP2_HUMAN 3374
## ... ... ... ... ... ...
## A4UGR9 20 31 sp|A4UGR9|XIRP2_HUMAN 3374
## A4UGR9 1712 1729 sp|A4UGR9|XIRP2_HUMAN 3374
## A4UGR9 48 61 sp|A4UGR9|XIRP2_HUMAN 3374
## A4UGR9 2082 2094 sp|A4UGR9|XIRP2_HUMAN 3374
## A4UGR9 2743 2756 sp|A4UGR9|XIRP2_HUMAN 3374
## acquisitionNum
## <numeric>
## A4UGR9 124
## A4UGR9 28
## A4UGR9 20
## A4UGR9 187
## A4UGR9 211
## ... ...
## A4UGR9 99
## A4UGR9 9
## A4UGR9 122
## A4UGR9 87
## A4UGR9 77
## filenames
## <Rle>
## A4UGR9 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## A4UGR9 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## A4UGR9 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## A4UGR9 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## A4UGR9 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## ... ...
## A4UGR9 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## A4UGR9 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## A4UGR9 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## A4UGR9 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## A4UGR9 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
##
## ...
## <8 more elements>
pfeatures(p)
## AAStringSetList of length 9
## [["A4UGR9"]] A4UGR9=QEITQNKSFFSSVKESQR ... A4UGR9=QEITQNKSFFSSVK
## [["A6H8Y1"]] A6H8Y1=EDAEQVALEVDLNQKKRR ...
## [["O43707"]] O43707=QQRKTFTAWCNSHLR ... O43707=VGWEQLLTTIAR
## [["O75369"]] O75369=DLDIIDNYDYSHTVK ... O75369=VQAQGPGLKEAFTNK
## [["P00558"]] P00558=ELNYFAKALESPER P00558=DLMSKAEK ... P00558=GTKALMDEVVK
## [["P02545"]] P02545=METPSQRRATR ... P02545=RATRSGAQASSTPLSPTR
## [["P04075"]] P04075=GVVPLAGTNGETTTQGLDGLSER ...
## [["P04075-2"]] P04075-2=GVVPLAGTNGETTTQGLDGLSER ...
## [["P60709"]] P60709=DLTDYLMKILTER
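Conceptually, the ranges returned by pranges are start/end coordinates into the sequences returned by aa, and pfeatures yields the corresponding subsequences. The base R sketch below illustrates that relationship only and does not use Pbase. The 21 residues are the start of P04075 as printed above, while the two ranges are hypothetical and chosen purely for illustration.

```r
# First 21 residues of P04075, copied from the aa(p) output above
protein <- "MPYQYPALTPEQKKELSDIAH"

# Hypothetical peptide coordinates (1-based, inclusive), for illustration only
ranges <- data.frame(start = c(1, 15), end = c(13, 21))

# Extracting the subsequence for each range mimics what a peptide
# feature is: a slice of the protein sequence
peptides <- substring(protein, ranges$start, ranges$end)
peptides

# The width of each range equals the length of the extracted peptide
stopifnot(all(ranges$end - ranges$start + 1 == nchar(peptides)))
```

Storing coordinates rather than copies of the peptide strings keeps the peptides tied to their position on the protein, which is what makes plotting them along the sequence possible.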
A Proteins instance is further described by general metadata. Protein sequence and peptide feature annotations can be accessed with ametadata and pmetadata (or acols and pcols), respectively.
metadata(p)
## $created
## [1] "Sun May 15 21:37:17 2016"
head(acols(p))
## DataFrame with 6 rows and 12 columns
## DB AccessionNumber EntryName IsoformName
## <Rle> <character> <character> <Rle>
## 1 sp A4UGR9 XIRP2_HUMAN NA
## 2 sp A6H8Y1 BDP1_HUMAN NA
## 3 sp O43707 ACTN4_HUMAN NA
## 4 sp O75369 FLNB_HUMAN NA
## 5 sp P00558 PGK1_HUMAN NA
## 6 sp P02545 LMNA_HUMAN NA
## ProteinName OrganismName GeneName
## <character> <Rle> <Rle>
## 1 Xin actin-binding repeat-containing protein 2 Homo sapiens XIRP2
## 2 Transcription factor TFIIIB component B'' homolog Homo sapiens BDP1
## 3 Alpha-actinin-4 Homo sapiens ACTN4
## 4 Filamin-B Homo sapiens FLNB
## 5 Phosphoglycerate kinase 1 Homo sapiens PGK1
## 6 Prelamin-A/C Homo sapiens LMNA
## ProteinExistence SequenceVersion Comment
## <Rle> <Rle> <Rle>
## 1 Evidence at protein level 2 NA
## 2 Evidence at protein level 3 NA
## 3 Evidence at protein level 2 NA
## 4 Evidence at protein level 2 NA
## 5 Evidence at protein level 3 NA
## 6 Evidence at protein level 1 NA
## Filename
## <Rle>
## 1 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/HUMAN_2015_02_selected.fasta
## 2 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/HUMAN_2015_02_selected.fasta
## 3 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/HUMAN_2015_02_selected.fasta
## 4 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/HUMAN_2015_02_selected.fasta
## 5 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/HUMAN_2015_02_selected.fasta
## 6 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/HUMAN_2015_02_selected.fasta
## npeps
## <integer>
## 1 36
## 2 23
## 3 6
## 4 13
## 5 5
## 6 12
head(pcols(p))
## SplitDataFrameList of length 6
## $A4UGR9
## DataFrame with 36 rows and 28 columns
## DB AccessionNumber EntryName IsoformName
## <Rle> <character> <character> <Rle>
## 1 sp A4UGR9 XIRP2_HUMAN NA
## 2 sp A4UGR9 XIRP2_HUMAN NA
## 3 sp A4UGR9 XIRP2_HUMAN NA
## 4 sp A4UGR9 XIRP2_HUMAN NA
## 5 sp A4UGR9 XIRP2_HUMAN NA
## ... ... ... ... ...
## 32 sp A4UGR9 XIRP2_HUMAN NA
## 33 sp A4UGR9 XIRP2_HUMAN NA
## 34 sp A4UGR9 XIRP2_HUMAN NA
## 35 sp A4UGR9 XIRP2_HUMAN NA
## 36 sp A4UGR9 XIRP2_HUMAN NA
## ProteinName
## <character>
## 1 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## 2 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## 3 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## 4 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## 5 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## ... ...
## 32 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## 33 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## 34 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## 35 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## 36 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## OrganismName GeneName ProteinExistence SequenceVersion
## <Rle> <Rle> <Rle> <Rle>
## 1 Homo sapiens XIRP2 Evidence at protein level 2
## 2 Homo sapiens XIRP2 Evidence at protein level 2
## 3 Homo sapiens XIRP2 Evidence at protein level 2
## 4 Homo sapiens XIRP2 Evidence at protein level 2
## 5 Homo sapiens XIRP2 Evidence at protein level 2
## ... ... ... ... ...
## 32 Homo sapiens XIRP2 Evidence at protein level 2
## 33 Homo sapiens XIRP2 Evidence at protein level 2
## 34 Homo sapiens XIRP2 Evidence at protein level 2
## 35 Homo sapiens XIRP2 Evidence at protein level 2
## 36 Homo sapiens XIRP2 Evidence at protein level 2
## Comment spectrumID chargeState rank passThreshold
## <Rle> <factor> <integer> <integer> <logical>
## 1 NA index=124 3 1 TRUE
## 2 NA index=28 2 1 TRUE
## 3 NA index=20 2 1 TRUE
## 4 NA index=187 2 1 TRUE
## 5 NA index=211 3 1 TRUE
## ... ... ... ... ... ...
## 32 NA index=99 2 1 TRUE
## 33 NA index=9 2 1 TRUE
## 34 NA index=122 2 1 TRUE
## 35 NA index=87 2 1 TRUE
## 36 NA index=77 2 1 TRUE
## experimentalMassToCharge calculatedMassToCharge sequence
## <numeric> <numeric> <factor>
## 1 715.0305 715.0308 QEITQNKSFFSSVKESQR
## 2 715.9177 715.4117 LPVPKDVYSKQR
## 3 786.9066 786.9081 EQNNDALEKSLRR
## 4 629.8380 629.3386 SLKESSHRWK
## 5 645.3429 645.3511 LKMVPRKQREFSGSDR
## ... ... ... ...
## 32 619.2888 618.7782 PESGFAEDSAAR
## 33 1014.0198 1013.5117 QPDAIPGDIEKAIECLEK
## 34 821.4005 820.8909 MARYQAAVSRGDCR
## 35 720.3445 720.3527 TNTSTGLKMAMER
## 36 821.9231 821.9254 QEITQNKSFFSSVK
## modNum isDecoy post pre start end
## <integer> <logical> <factor> <factor> <integer> <integer>
## 1 0 FALSE D K 2743 2760
## 2 0 FALSE N R 307 318
## 3 0 FALSE L R 1858 1870
## 4 0 FALSE E K 1699 1708
## 5 0 FALSE G K 2622 2637
## ... ... ... ... ... ... ...
## 32 0 FALSE G K 20 31
## 33 1 FALSE A K 1712 1729
## 34 1 FALSE S R 48 61
## 35 0 FALSE S K 2082 2094
## 36 0 FALSE E K 2743 2756
## DatabaseAccess DBseqLength DatabaseSeq acquisitionNum
## <factor> <integer> <factor> <numeric>
## 1 sp|A4UGR9|XIRP2_HUMAN 3374 124
## 2 sp|A4UGR9|XIRP2_HUMAN 3374 28
## 3 sp|A4UGR9|XIRP2_HUMAN 3374 20
## 4 sp|A4UGR9|XIRP2_HUMAN 3374 187
## 5 sp|A4UGR9|XIRP2_HUMAN 3374 211
## ... ... ... ... ...
## 32 sp|A4UGR9|XIRP2_HUMAN 3374 99
## 33 sp|A4UGR9|XIRP2_HUMAN 3374 9
## 34 sp|A4UGR9|XIRP2_HUMAN 3374 122
## 35 sp|A4UGR9|XIRP2_HUMAN 3374 87
## 36 sp|A4UGR9|XIRP2_HUMAN 3374 77
## filenames
## <Rle>
## 1 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## 2 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## 3 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## 4 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## 5 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## ... ...
## 32 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## 33 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## 34 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## 35 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## 36 /tmp/RtmpeIjgDV/Rinst4b207074c283/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
##
## ...
## <5 more elements>
Specific proteins can be extracted by index or name using [, and proteins and their peptide features can be plotted with the default plot method.
seqnames(p)
## [1] "A4UGR9" "A6H8Y1" "O43707" "O75369" "P00558" "P02545"
## [7] "P04075" "P04075-2" "P60709"
plot(p[c(1,9)])
More details can be found in ?Proteins. The object generated above is also directly available as data(p).
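As a short sketch of subsetting by name and by index, the snippet below loads the bundled object and selects the two ALDOA entries. It assumes that Pbase is installed and that data(p) provides an object with the same sequence names and order as shown by seqnames(p) in this vignette; the guard skips the example otherwise.

```r
# Names of the canonical ALDOA sequence and its isoform, as listed by seqnames(p)
wanted <- c("P04075", "P04075-2")

if (requireNamespace("Pbase", quietly = TRUE)) {
    library("Pbase")
    data(p)                  # the Proteins object described in this vignette
    aldoa <- p[wanted]       # subset by name
    same  <- p[c(7, 8)]      # the same two proteins, subset by index
    print(seqnames(aldoa))
    print(seqnames(same))
}
```

Subsetting by name is less fragile than subsetting by position if the order of the sequences ever differs from the one printed here.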
sessionInfo()
## R version 3.3.0 (2016-05-03)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 14.04.4 LTS
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] grid stats4 parallel stats graphics grDevices utils
## [8] datasets methods base
##
## other attached packages:
## [1] mzID_1.10.2 Biostrings_2.40.0 XVector_0.12.0
## [4] Pbase_0.12.2 Gviz_1.16.1 GenomicRanges_1.24.0
## [7] GenomeInfoDb_1.8.2 IRanges_2.6.0 S4Vectors_0.10.0
## [10] Rcpp_0.12.5 BiocGenerics_0.18.0 BiocStyle_2.0.2
##
## loaded via a namespace (and not attached):
## [1] Biobase_2.32.0 httr_1.1.0
## [3] vsn_3.40.0 AnnotationHub_2.4.2
## [5] splines_3.3.0 foreach_1.4.3
## [7] Formula_1.2-1 shiny_0.13.2
## [9] interactiveDisplayBase_1.10.3 affy_1.50.0
## [11] latticeExtra_0.6-28 BSgenome_1.40.0
## [13] Rsamtools_1.24.0 impute_1.46.0
## [15] yaml_2.1.13 RSQLite_1.0.0
## [17] lattice_0.20-33 biovizBase_1.20.0
## [19] limma_3.28.4 chron_2.3-47
## [21] digest_0.6.9 RColorBrewer_1.1-2
## [23] colorspace_1.2-6 preprocessCore_1.34.0
## [25] htmltools_0.3.5 httpuv_1.3.3
## [27] Matrix_1.2-6 plyr_1.8.3
## [29] MALDIquant_1.14 XML_3.98-1.4
## [31] biomaRt_2.28.0 zlibbioc_1.18.0
## [33] xtable_1.8-2 scales_0.4.0
## [35] affyio_1.42.0 cleaver_1.10.2
## [37] BiocParallel_1.6.2 ggplot2_2.1.0
## [39] SummarizedExperiment_1.2.2 GenomicFeatures_1.24.2
## [41] nnet_7.3-12 survival_2.39-4
## [43] magrittr_1.5 mime_0.4
## [45] evaluate_0.9 doParallel_1.0.10
## [47] foreign_0.8-66 mzR_2.6.2
## [49] Pviz_1.6.2 BiocInstaller_1.22.2
## [51] tools_3.3.0 data.table_1.9.6
## [53] formatR_1.4 matrixStats_0.50.2
## [55] stringr_1.0.0 MSnbase_1.20.5
## [57] munsell_0.4.3 cluster_2.0.4
## [59] AnnotationDbi_1.34.2 ensembldb_1.4.2
## [61] pcaMethods_1.64.0 RCurl_1.95-4.8
## [63] dichromat_2.0-0 iterators_1.0.8
## [65] VariantAnnotation_1.18.1 bitops_1.0-6
## [67] rmarkdown_0.9.6 gtable_0.2.0
## [69] codetools_0.2-14 DBI_0.4-1
## [71] reshape2_1.4.1 R6_2.1.2
## [73] GenomicAlignments_1.8.0 gridExtra_2.2.1
## [75] knitr_1.13 rtracklayer_1.32.0
## [77] Hmisc_3.17-4 ProtGenerics_1.4.0
## [79] stringi_1.0-1 rpart_4.1-10
## [81] acepack_1.3-3.3