A useful feature of the roma package is its ability to map a query sequence to exact or partial matches. This vignette explores that feature.
Let’s say we have a sequence of interest that we want to map, in this case:
“MNDPSLLGYPNVGPQQQQQQQQQQHAGLLGKGTPNALQQQLHMNQLTGIPPPGLMNNSDVHTSSNNNSRQLLDQLANGNANMLNMNMDNNNNNNNNNNNNNNNGGGSGVMMNASTAAVNSIGMVPTVGTPVNINVNASNPLLHPHLDDPSLLNNPIWKLQLHLAAVSAQSLGQPNIYARQNAMKKYLATQQAQQAQQQAQQQAQQQVPGPFGPGPQAAPPALQPTDFQQSHIAEASKSLVDCTKQALMEMADTLTDSKTAKKQQPTGDSTPSGTATNSAVSTPLTPKIELFANGKDEANQALLQHKKLSQYSIDEDDDIENRMVMPKDSKYDDQLWHALDLSNLQIFNISANIFKYDFLTRLYLNGNSLTELPAEIKNLSNLRVLDLSHNRLTSLPAELGSCFQLKYFYFFDNMVTTLPWEFGNLCNLQFLGVEGNPLEKQFLKILTEKSVTGLIFYLRDNRPEIPLPHER”
We can pass this sequence to the mapSequence() function, which returns a list of targets. From this list we can then construct roma protein objects and query each one for further information, such as its OMA group or its domains. An example response object is shown below.
## [1] "query : character"
## [1] "identified_by : character"
## [1] "targets : list"
## [1] 1
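The listing above can be reproduced along the following lines. This is only a minimal sketch: it uses a hand-built stand-in list instead of a live server response so that it runs offline, the field values are placeholders, and the commented-out `mapSequence(sequence)` call form is an assumption about the function's signature rather than something confirmed here.

```r
# Stand-in for the object returned by mapSequence(). The three field names
# are taken from the listing above; every value is a placeholder.
# With the package loaded, the (assumed) real call would be:
#   response <- mapSequence(sequence)
response <- list(
  query         = "MNDPSLLGYPNVGPQQ",  # shortened placeholder, not the full query
  identified_by = "placeholder",
  targets       = list(list(id = "placeholder target"))
)

# Report each field together with its class, one string per field.
field_summary <- vapply(
  names(response),
  function(field) paste(field, ":", class(response[[field]])),
  character(1)
)
print(unname(field_summary))

# Number of targets returned for the query.
n_targets <- length(response$targets)
print(n_targets)
```

Each element of `targets` would then be the starting point for building a protein object and looking up its group or domains.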
GO annotations can also be obtained directly for a given query sequence. The result is a data frame like the following:
##   Qualifier      GO_ID                                With Evidence
## 1           GO:0003677 Approx:DICDI02796:239.3634208239752      IEA
## 2           GO:0005509 Approx:DICDI02796:239.3634208239752      IEA
## 3           GO:0005634 Approx:DICDI02796:239.3634208239752      IEA
## 4           GO:0006351 Approx:DICDI02796:239.3634208239752      IEA
## 5           GO:0006355 Approx:DICDI02796:239.3634208239752      IEA
## 6           GO:0007275 Approx:DICDI02796:239.3634208239752      IEA
##       Date DB_Object_Type DB_Object_Name Aspect Assigned_By
## 1 20180102        protein                     F OMA_FastMap
## 2 20180102        protein                     F OMA_FastMap
## 3 20180102        protein                     C OMA_FastMap
## 4 20180102        protein                     P OMA_FastMap
## 5 20180102        protein                     P OMA_FastMap
## 6 20180102        protein                     P OMA_FastMap
##                                      GO_name          DB DB.Reference
## 1                                DNA binding OMA_FastMap  OMA_Fun:002
## 2                        calcium ion binding OMA_FastMap  OMA_Fun:002
## 3                                    nucleus OMA_FastMap  OMA_Fun:002
## 4               transcription, DNA-templated OMA_FastMap  OMA_Fun:002
## 5 regulation of transcription, DNA-templated OMA_FastMap  OMA_Fun:002
## 6         multicellular organism development OMA_FastMap  OMA_Fun:002
##   Synonym
## 1
## 2
## 3
## 4
## 5
## 6
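Once the annotations are in a data frame, ordinary base R is enough to post-process them. The sketch below rebuilds three columns of the table above by hand (so it runs without the package or a network connection), splits the `With` field on its colons, and counts annotations per GO aspect (F = molecular function, C = cellular component, P = biological process). The new column names `match_type`, `match_entry` and `match_value` are hypothetical, and reading the third part of `With` as a numeric match value is an assumption based only on how the field looks.

```r
# Rows copied from the printed annotation table (subset of columns).
annotations <- data.frame(
  GO_ID  = c("GO:0003677", "GO:0005509", "GO:0005634",
             "GO:0006351", "GO:0006355", "GO:0007275"),
  With   = rep("Approx:DICDI02796:239.3634208239752", 6),
  Aspect = c("F", "F", "C", "P", "P", "P"),
  stringsAsFactors = FALSE
)

# Split each "With" value on ":" and bind the pieces into a 6 x 3 matrix.
with_parts <- do.call(rbind, strsplit(annotations$With, ":", fixed = TRUE))

# Hypothetical column names; the meaning of each part is an assumption.
annotations$match_type  <- with_parts[, 1]
annotations$match_entry <- with_parts[, 2]
annotations$match_value <- as.numeric(with_parts[, 3])

# Count how many annotations fall under each GO aspect.
aspect_counts <- table(annotations$Aspect)
print(aspect_counts)
```

Keeping the parsed parts as separate columns makes it straightforward to filter or group annotations by the entry they were transferred from.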