Tissue#
lamindb provides access to the following public tissue ontology through lnschema-bionty: Uberon.
Here we show how to access and search Tissue ontologies to standardize new data.
Setup#
!lamin init --storage ./test-tissue --schema bionty
✅ saved: User(uid='DzTjkKse', handle='testuser1', name='Test User1', updated_at=2024-01-24 13:39:15 UTC)
✅ saved: Storage(uid='q62GvPYt', root='/home/runner/work/lamin-usecases/lamin-usecases/docs/test-tissue', type='local', updated_at=2024-01-24 13:39:15 UTC, created_by_id=1)
💡 loaded instance: testuser1/test-tissue
💡 did not register local instance on hub
import lnschema_bionty as lb
import pandas as pd
PublicOntology objects#
Let us create a public ontology accessor with public(), which chooses a default public ontology source from PublicSource.
It's a PublicOntology object, which you can think of as a public registry:
tissues = lb.Tissue.public(organism="all")
tissues
💡 loaded instance: testuser1/test-tissue
PublicOntology
Entity: Tissue
Organism: all
Source: uberon, 2023-09-05
#terms: 15539
.df(): ontology reference table
.lookup(): autocompletion of terms
.search(): free text search of terms
.validate(): strictly validate values
.inspect(): full inspection of values
.standardize(): convert to standardized names
.diff(): difference between two versions
.to_pronto(): Pronto.Ontology object
As for registries, you can export the ontology as a DataFrame:
df = tissues.df()
df.head()
ontology_id | name | definition | synonyms | parents |
---|---|---|---|---|
UBERON:0000000 | processual entity | An Occurrent [Span:Occurrent] That Exists In T... | None | [] |
UBERON:0000002 | uterine cervix | Lower, Narrow Portion Of The Uterus Where It J... | neck of uterus|canalis cervicis uteri|cervical... | [UBERON:0005156, UBERON:0001560] |
UBERON:0000003 | naris | Orifice Of The Olfactory System. The Naris Is ... | None | [UBERON:0000161] |
UBERON:0000004 | nose | The Olfactory Organ Of Vertebrates, Consisting... | nasal sac|peripheral olfactory organ|nose | [UBERON:0000475, UBERON:0004121, UBERON:001031... |
UBERON:0000005 | chemosensory organ | None | chemosensory sensory organ | [UBERON:0000020] |
Unlike registries, you can also export it as a Pronto object via tissues.to_pronto().
Look up terms#
As for registries, terms can be looked up with auto-complete:
lookup = tissues.lookup()
The . accessor provides normalized terms (lower case, containing only alphanumeric characters and underscores):
lookup.alveolus_of_lung
Tissue(ontology_id='UBERON:0002299', name='alveolus of lung', definition='Spherical Outcropping Of The Respiratory Bronchioles And Primary Site Of Gas Exchange With The Blood. Alveoli Are Particular To Mammalian Lungs. Different Structures Are Involved In Gas Exchange In Other Vertebrates[Wp].', synonyms='lung alveolus|pulmonary alveolus|alveolus pulmonis|respiratory alveolus', parents=array(['UBERON:0003215', 'UBERON:0004119'], dtype=object))
To look up the exact original strings, convert the lookup object to a dict and use the [] accessor:
lookup_dict = lookup.dict()
lookup_dict["alveolus of lung"]
Tissue(ontology_id='UBERON:0002299', name='alveolus of lung', definition='Spherical Outcropping Of The Respiratory Bronchioles And Primary Site Of Gas Exchange With The Blood. Alveoli Are Particular To Mammalian Lungs. Different Structures Are Involved In Gas Exchange In Other Vertebrates[Wp].', synonyms='lung alveolus|pulmonary alveolus|alveolus pulmonis|respiratory alveolus', parents=array(['UBERON:0003215', 'UBERON:0004119'], dtype=object))
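The two access styles above can be illustrated with a small standard-library sketch. It is not the library's implementation: `normalize_key` and `build_lookup` are hypothetical helpers, plain dicts stand in for Tissue records, and the normalization rule (lower-case, runs of non-alphanumeric characters collapsed to an underscore) is an assumption based on the description above.

```python
import re
from types import SimpleNamespace

# Two records copied from the outputs on this page; dicts stand in for Tissue objects.
records = [
    {"ontology_id": "UBERON:0002299", "name": "alveolus of lung"},
    {"ontology_id": "UBERON:0000031", "name": "lamina propria of trachea"},
]


def normalize_key(value: str) -> str:
    """Hypothetical rule: lower-case, non-alphanumeric runs become one underscore."""
    return re.sub(r"[^0-9a-z]+", "_", value.lower()).strip("_")


def build_lookup(items, field="name"):
    """Return an attribute-style lookup and a dict keyed by the original strings."""
    exact = {item[field]: item for item in items}
    normalized = SimpleNamespace(**{normalize_key(k): v for k, v in exact.items()})
    return normalized, exact


lookup, lookup_dict = build_lookup(records)
print(lookup.alveolus_of_lung["ontology_id"])          # attribute access, normalized key
print(lookup_dict["alveolus of lung"]["ontology_id"])  # dict access, original string

# Keys can be generated from another field as well.
by_id, _ = build_lookup(records, field="ontology_id")
print(by_id.uberon_0000031["name"])
```

The attribute form is convenient for auto-completion, while the dict form keeps the exact source strings, including spaces and punctuation.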
By default, the name field is used to generate lookup keys. You can specify another field to look up:
lookup = tissues.lookup(tissues.ontology_id)
lookup.uberon_0000031
Tissue(ontology_id='UBERON:0000031', name='lamina propria of trachea', definition='A Lamina Propria That Is Part Of A Respiratory Airway.', synonyms='lamina propria mucosae of trachea|trachea lamina propria mucosae|lamina propria of windpipe|windpipe lamina propria mucosa|lamina propria mucosa of trachea|tracheal lamina propria|trachea lamina propria mucosa|windpipe lamina propria|lamina propria mucosa of windpipe|trachea lamina propria|windpipe lamina propria mucosae|lamina propria mucosae of windpipe', parents=array(['UBERON:0004779'], dtype=object))
Search terms#
Search behaves in the same way as it does for registries:
tissues.search("alveolus lung").head(3)
name | ontology_id | definition | synonyms | parents | __ratio__ |
---|---|---|---|---|---|
alveolus of lung | UBERON:0002299 | Spherical Outcropping Of The Respiratory Bronc... | lung alveolus|pulmonary alveolus|alveolus pulm... | [UBERON:0003215, UBERON:0004119] | 89.655172 |
left lung alveolus | UBERON:0004862 | An Alveolus That Is Part Of A Left Lung [Autom... | alveolus of lobe of left lung|alveolus of left... | [UBERON:0002299] | 76.470588 |
alveolus | UBERON:0003215 | Organ Part That Has The Form Of A Hollow Cavit... | None | [UBERON:0000064] | 76.190476 |
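The `__ratio__` column is a similarity score between the query and each term. As a rough illustration of ranking by string similarity, here is a minimal sketch using `difflib.SequenceMatcher` from the Python standard library. The scorer used by the library is not described on this page and may differ, so the scores printed here should not be expected to reproduce the table for every row; `ratio` and `search` are hypothetical helper names.

```python
from difflib import SequenceMatcher

# Term names taken from the search result above.
names = ["alveolus of lung", "left lung alveolus", "alveolus"]


def ratio(query: str, candidate: str) -> float:
    """Similarity on a 0-100 scale (assumed scorer, not necessarily the library's)."""
    return 100 * SequenceMatcher(None, query.lower(), candidate.lower()).ratio()


def search(query: str, candidates, limit: int = 3):
    """Score every candidate and return the best matches first."""
    scored = [(candidate, ratio(query, candidate)) for candidate in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]


for name, score in search("alveolus lung", names):
    print(f"{name}: {score:.2f}")
```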
By default, search also covers synonyms:
tissues.search("nasal sac").head(3)
name | ontology_id | definition | synonyms | parents | __ratio__ |
---|---|---|---|---|---|
nose | UBERON:0000004 | The Olfactory Organ Of Vertebrates, Consisting... | nasal sac|peripheral olfactory organ|nose | [UBERON:0000475, UBERON:0004121, UBERON:001031... | 100.000000 |
anal sac | UBERON:0008978 | In Carnivores, Either Of Two Sacs Found Betwee... | None | [UBERON:0000062, UBERON:0009856] | 82.352941 |
nasal air sac | UBERON:0013175 | An Air Sac Opening Into The Passage Of The Blo... | blowhole air sac | [UBERON:0004111] | 81.818182 |
You can turn off synonym search by passing synonyms_field=None:
tissues.search("nasal sac", synonyms_field=None).head(3)
name | ontology_id | definition | synonyms | parents | __ratio__ |
---|---|---|---|---|---|
anal sac | UBERON:0008978 | In Carnivores, Either Of Two Sacs Found Betwee... | None | [UBERON:0000062, UBERON:0009856] | 82.352941 |
nasal air sac | UBERON:0013175 | An Air Sac Opening Into The Passage Of The Blo... | blowhole air sac | [UBERON:0004111] | 81.818182 |
nasal muscle | UBERON:0008522 | Any Muscle Organ That Is Part Of An Nose. | muscle of nose | [UBERON:0001577, UBERON:0004121] | 76.190476 |
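The effect of the synonyms switch can be sketched as follows, assuming that synonyms are stored as one pipe-separated string (as in the tables above) and that a term's score is the best score over its name and synonyms. Both the scorer (`difflib.SequenceMatcher`) and the helper names are assumptions for illustration, not the library's implementation.

```python
from difflib import SequenceMatcher

# Three terms from the results above; synonyms are pipe-separated strings or None.
terms = [
    {"name": "nose", "synonyms": "nasal sac|peripheral olfactory organ|nose"},
    {"name": "anal sac", "synonyms": None},
    {"name": "nasal air sac", "synonyms": "blowhole air sac"},
]


def score(query: str, text: str) -> float:
    """Similarity on a 0-100 scale (assumed scorer)."""
    return 100 * SequenceMatcher(None, query.lower(), text.lower()).ratio()


def search(query: str, records, synonyms_field="synonyms"):
    """Rank records by the best score over the name and, optionally, the synonyms."""
    results = []
    for record in records:
        candidates = [record["name"]]
        if synonyms_field is not None and record.get(synonyms_field):
            candidates += record[synonyms_field].split("|")
        results.append((record["name"], max(score(query, c) for c in candidates)))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


print(search("nasal sac", terms)[0][0])                       # best hit with synonyms
print(search("nasal sac", terms, synonyms_field=None)[0][0])  # best hit on names only
```

With synonyms enabled, "nose" wins because one of its synonyms is an exact match; with names only, it drops out of the top position.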
Search another field (default is .name):
tissues.search(
    "alveolus in the lung",
    field=tissues.definition,
).head()
definition | ontology_id | name | synonyms | parents | __ratio__ |
---|---|---|---|---|---|
An Alveolus That Is Part Of A Left Lung [Automatically Generated Definition]. | UBERON:0004862 | left lung alveolus | alveolus of lobe of left lung|alveolus of left... | [UBERON:0002299] | 78.048780 |
The Epithelial Layer Of The Alveoli[Mp]. The Layer Of Cells Covering The Lining Of The Tiny Air Sacs At The End Of The Bronchioles[Bto]. | UBERON:0004821 | pulmonary alveolus epithelium | epithelium of pulmonary alveolus|epithelial ti... | [UBERON:0000487, UBERON:0000115] | 76.923077 |
An Alveolus That Is Part Of A Right Lung [Automatically Generated Definition]. | UBERON:0004861 | right lung alveolus | alveolus of right lung | [UBERON:0002299] | 76.190476 |
An Alveolar Duct That Is Part Of A Left Lung [Automatically Generated Definition]. | UBERON:0003537 | left lung alveolar duct | alveolar duct of left lung | [UBERON:0002173] | 65.217391 |
An Alveolar Duct That Is Part Of A Right Lung [Automatically Generated Definition]. | UBERON:0003536 | right lung alveolar duct | alveolar duct of right lung | [UBERON:0002173] | 63.829787 |
Standardize Tissue identifiers#
Let us generate a DataFrame that stores a number of Tissue identifiers, some of which are corrupted:
df_orig = pd.DataFrame(
    index=[
        # Each entry needs a trailing comma: adjacent string literals
        # without commas are concatenated into a single value.
        "UBERON:0000000",
        "UBERON:0000005",
        "UBERON:0000001",
        "UBERON:0000002",
        "This tissue does not exist",
    ]
)
df_orig
UBERON:0000000
UBERON:0000005
UBERON:0000001
UBERON:0000002
This tissue does not exist
We can check whether any of our values are validated against the ontology reference. Because the index holds ontology IDs, we validate against the ontology_id field:
validated = tissues.validate(df_orig.index, tissues.ontology_id)
df_orig.index[~validated]
❗ 2 terms (40.00%) are not validated: UBERON:0000001, This tissue does not exist
Index(['UBERON:0000001', 'This tissue does not exist'], dtype='object')
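Conceptually, strict validation amounts to an exact membership test of each value against one field of the reference table, returning one boolean per value. The sketch below illustrates that idea with plain Python; the `validate` function here is a hypothetical stand-in, and the reference set contains only the five ontology IDs shown in the `df.head()` output earlier, not the full ontology.

```python
# Reference IDs limited to the five rows shown by df.head() above (assumption:
# a real reference would contain every term of the ontology).
reference_ids = {
    "UBERON:0000000",
    "UBERON:0000002",
    "UBERON:0000003",
    "UBERON:0000004",
    "UBERON:0000005",
}

values = [
    "UBERON:0000000",
    "UBERON:0000005",
    "UBERON:0000001",
    "UBERON:0000002",
    "This tissue does not exist",
]


def validate(candidates, reference):
    """Return one boolean per candidate: True if it matches a reference value exactly."""
    return [candidate in reference for candidate in candidates]


mask = validate(values, reference_ids)
not_validated = [value for value, ok in zip(values, mask) if not ok]
print(mask)
print(not_validated)
```

Because the test is exact, a value that differs only by case, whitespace, or a missing separator counts as not validated.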
Ontology source versions#
For any given entity, we can choose from a number of versions:
lb.PublicSource.filter(entity="Tissue").df()
id | uid | entity | organism | currently_used | source | source_name | version | url | md5 | source_website | created_at | updated_at | created_by_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
25 | 1PY3 | Tissue | all | True | uberon | Uberon multi-species anatomy ontology | 2023-09-05 | http://purl.obolibrary.org/obo/uberon/releases... | abcee3ede566d1311d758b853ccdf5aa | http://obophenotype.github.io/uberon | 2024-01-24 13:39:15.227557+00:00 | 2024-01-24 13:39:15.227566+00:00 | 1 |
26 | 45ES | Tissue | all | False | uberon | Uberon multi-species anatomy ontology | 2023-04-19 | http://purl.obolibrary.org/obo/uberon/releases... | 5611dd1375d5a95ac7d7de8e25e6016f | http://obophenotype.github.io/uberon | 2024-01-24 13:39:15.227656+00:00 | 2024-01-24 13:39:15.227665+00:00 | 1 |
27 | 5RC9 | Tissue | all | False | uberon | Uberon multi-species anatomy ontology | 2023-02-14 | http://purl.obolibrary.org/obo/uberon/releases... | 3f94e22fae4cdde88a555c5cd59c47da | http://obophenotype.github.io/uberon | 2024-01-24 13:39:15.227755+00:00 | 2024-01-24 13:39:15.227764+00:00 | 1 |
28 | 63JG | Tissue | all | False | uberon | Uberon multi-species anatomy ontology | 2022-08-19 | http://purl.obolibrary.org/obo/uberon/releases... | c7c958a1ee48fdce146f2c1763eed27e | http://obophenotype.github.io/uberon | 2024-01-24 13:39:15.227854+00:00 | 2024-01-24 13:39:15.227863+00:00 | 1 |
When instantiating a PublicOntology object, we can choose a source or version:
public_source = lb.PublicSource.filter(
    source="uberon", version="2023-04-19", organism="all"
).one()
tissues = lb.Tissue.public(public_source=public_source)
tissues
❗ loading non-default source inside a LaminDB instance
PublicOntology
Entity: Tissue
Organism: all
Source: uberon, 2023-04-19
#terms: 15499
.df(): ontology reference table
.lookup(): autocompletion of terms
.search(): free text search of terms
.validate(): strictly validate values
.inspect(): full inspection of values
.standardize(): convert to standardized names
.diff(): difference between two versions
.to_pronto(): Pronto.Ontology object
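The filter-then-one pattern used to pick a source can be illustrated without a database. This is a sketch under stated assumptions: `Source`, `filter_sources`, and `one` are hypothetical stand-ins for the registry API, and the four records copy the uid, version, and currently_used values from the Tissue source table above. The point is that `one` only succeeds when the conditions identify exactly one record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Simplified stand-in for a PublicSource record (hypothetical)."""
    uid: str
    entity: str
    source: str
    version: str
    currently_used: bool


# Values copied from the Tissue rows of the source table above.
SOURCES = [
    Source("1PY3", "Tissue", "uberon", "2023-09-05", True),
    Source("45ES", "Tissue", "uberon", "2023-04-19", False),
    Source("5RC9", "Tissue", "uberon", "2023-02-14", False),
    Source("63JG", "Tissue", "uberon", "2022-08-19", False),
]


def filter_sources(records, **conditions):
    """Keep the records whose attributes equal all given conditions."""
    return [
        record
        for record in records
        if all(getattr(record, key) == value for key, value in conditions.items())
    ]


def one(matches):
    """Return the single match, or raise if there are zero or several."""
    if len(matches) != 1:
        raise ValueError(f"expected exactly one match, got {len(matches)}")
    return matches[0]


selected = one(filter_sources(SOURCES, source="uberon", version="2023-04-19"))
default = one(filter_sources(SOURCES, entity="Tissue", currently_used=True))
print(selected.uid, default.version)
```

Filtering on the entity alone would match all four versions, so `one` would raise; adding the version (or currently_used=True) narrows the result to a single record.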
The currently used ontologies can be displayed using:
lb.PublicSource.filter(currently_used=True).df()
id | uid | entity | organism | currently_used | source | source_name | version | url | md5 | source_website | created_at | updated_at | created_by_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 6IUo | Organism | vertebrates | True | ensembl | Ensembl | release-110 | https://ftp.ensembl.org/pub/release-110/specie... | f3faf95648d3a2b50fd3625456739706 | https://www.ensembl.org | 2024-01-24 13:39:15.225088+00:00 | 2024-01-24 13:39:15.225109+00:00 | 1 |
4 | 2Jzh | Organism | bacteria | True | ensembl | Ensembl | release-57 | https://ftp.ensemblgenomes.ebi.ac.uk/pub/bacte... | ee28510ed5586ea7ab4495717c96efc8 | https://www.ensembl.org | 2024-01-24 13:39:15.225433+00:00 | 2024-01-24 13:39:15.225442+00:00 | 1 |
5 | 1kdI | Organism | fungi | True | ensembl | Ensembl | release-57 | http://ftp.ensemblgenomes.org/pub/fungi/releas... | dbcde58f4396ab8b2480f7fe9f83df8a | https://www.ensembl.org | 2024-01-24 13:39:15.225533+00:00 | 2024-01-24 13:39:15.225542+00:00 | 1 |
6 | 2mIM | Organism | metazoa | True | ensembl | Ensembl | release-57 | http://ftp.ensemblgenomes.org/pub/metazoa/rele... | 424636a574fec078a61cbdddb05f9132 | https://www.ensembl.org | 2024-01-24 13:39:15.225634+00:00 | 2024-01-24 13:39:15.225643+00:00 | 1 |
7 | 2XQ6 | Organism | plants | True | ensembl | Ensembl | release-57 | https://ftp.ensemblgenomes.ebi.ac.uk/pub/plant... | eadaa1f3e527e4c3940c90c7fa5c8bf4 | https://www.ensembl.org | 2024-01-24 13:39:15.225734+00:00 | 2024-01-24 13:39:15.225744+00:00 | 1 |
8 | 1Vzs | Organism | all | True | ncbitaxon | NCBItaxon Ontology | 2023-06-20 | s3://bionty-assets/df_all__ncbitaxon__2023-06-... | 00d97ba65627f1cd65636d2df22ea76c | https://github.com/obophenotype/ncbitaxon | 2024-01-24 13:39:15.225835+00:00 | 2024-01-24 13:39:15.225845+00:00 | 1 |
9 | 4yVc | Gene | human | True | ensembl | Ensembl | release-110 | s3://bionty-assets/df_human__ensembl__release-... | 832f3947e83664588d419608a469b528 | https://www.ensembl.org | 2024-01-24 13:39:15.225934+00:00 | 2024-01-24 13:39:15.225943+00:00 | 1 |
11 | 2akp | Gene | mouse | True | ensembl | Ensembl | release-110 | s3://bionty-assets/df_mouse__ensembl__release-... | fa4ce130f2929aefd7ac3bc8eaf0c4de | https://www.ensembl.org | 2024-01-24 13:39:15.226133+00:00 | 2024-01-24 13:39:15.226142+00:00 | 1 |
13 | 2UvD | Gene | saccharomyces cerevisiae | True | ensembl | Ensembl | release-110 | s3://bionty-assets/df_saccharomyces cerevisiae... | 2e59495a3e87ea6575e408697dd73459 | https://www.ensembl.org | 2024-01-24 13:39:15.226335+00:00 | 2024-01-24 13:39:15.226345+00:00 | 1 |
14 | 7llW | Protein | human | True | uniprot | Uniprot | 2023-03 | s3://bionty-assets/df_human__uniprot__2023-03_... | 1c46e85c6faf5eff3de5b4e1e4edc4d3 | https://www.uniprot.org | 2024-01-24 13:39:15.226435+00:00 | 2024-01-24 13:39:15.226444+00:00 | 1 |
16 | 5U7J | Protein | mouse | True | uniprot | Uniprot | 2023-03 | s3://bionty-assets/df_mouse__uniprot__2023-03_... | 9d5e9a8225011d3218e10f9bbb96a46c | https://www.uniprot.org | 2024-01-24 13:39:15.226632+00:00 | 2024-01-24 13:39:15.226641+00:00 | 1 |
18 | 5nkB | CellMarker | human | True | cellmarker | CellMarker | 2.0 | s3://bionty-assets/human_cellmarker_2.0_CellMa... | d565d4a542a5c7e7a06255975358e4f4 | http://bio-bigdata.hrbmu.edu.cn/CellMarker | 2024-01-24 13:39:15.226829+00:00 | 2024-01-24 13:39:15.226837+00:00 | 1 |
19 | 6AFz | CellMarker | mouse | True | cellmarker | CellMarker | 2.0 | s3://bionty-assets/mouse_cellmarker_2.0_CellMa... | 189586732c63be949e40dfa6a3636105 | http://bio-bigdata.hrbmu.edu.cn/CellMarker | 2024-01-24 13:39:15.226928+00:00 | 2024-01-24 13:39:15.226938+00:00 | 1 |
20 | 6cbC | CellLine | all | True | clo | Cell Line Ontology | 2022-03-21 | https://data.bioontology.org/ontologies/CLO/su... | ea58a1010b7e745702a8397a526b3a33 | https://bioportal.bioontology.org/ontologies/CLO | 2024-01-24 13:39:15.227029+00:00 | 2024-01-24 13:39:15.227058+00:00 | 1 |
21 | 6tvq | CellType | all | True | cl | Cell Ontology | 2023-08-24 | http://purl.obolibrary.org/obo/cl/releases/202... | 46e7dd89421f1255cf0191eca1548f73 | https://obophenotype.github.io/cell-ontology | 2024-01-24 13:39:15.227158+00:00 | 2024-01-24 13:39:15.227168+00:00 | 1 |
25 | 1PY3 | Tissue | all | True | uberon | Uberon multi-species anatomy ontology | 2023-09-05 | http://purl.obolibrary.org/obo/uberon/releases... | abcee3ede566d1311d758b853ccdf5aa | http://obophenotype.github.io/uberon | 2024-01-24 13:39:15.227557+00:00 | 2024-01-24 13:39:15.227566+00:00 | 1 |
29 | 6EOm | Disease | all | True | mondo | Mondo Disease Ontology | 2023-08-02 | http://purl.obolibrary.org/obo/mondo/releases/... | 7f33767422042eec29f08b501fc851db | https://mondo.monarchinitiative.org | 2024-01-24 13:39:15.227952+00:00 | 2024-01-24 13:39:15.227961+00:00 | 1 |
33 | 3V9D | Disease | human | True | doid | Human Disease Ontology | 2023-03-31 | http://purl.obolibrary.org/obo/doid/releases/2... | 64f083a1e47867c307c8eae308afc3bb | https://disease-ontology.org | 2024-01-24 13:39:15.228346+00:00 | 2024-01-24 13:39:15.228355+00:00 | 1 |
35 | 6fKX | ExperimentalFactor | all | True | efo | The Experimental Factor Ontology | 3.57.0 | http://www.ebi.ac.uk/efo/releases/v3.57.0/efo.owl | 2ecafc69b3aba7bdb31ad99438505c05 | https://bioportal.bioontology.org/ontologies/EFO | 2024-01-24 13:39:15.228541+00:00 | 2024-01-24 13:39:15.228550+00:00 | 1 |
37 | 6jHz | Phenotype | human | True | hp | Human Phenotype Ontology | 2023-06-17 | https://github.com/obophenotype/human-phenotyp... | 65e8d96bc81deb893163927063b10c06 | https://hpo.jax.org | 2024-01-24 13:39:15.228735+00:00 | 2024-01-24 13:39:15.228744+00:00 | 1 |
40 | 4q5A | Phenotype | mammalian | True | mp | Mammalian Phenotype Ontology | 2023-05-31 | https://github.com/mgijax/mammalian-phenotype-... | be89052cf6d9c0b6197038fe347ef293 | https://github.com/mgijax/mammalian-phenotype-... | 2024-01-24 13:39:15.229027+00:00 | 2024-01-24 13:39:15.229036+00:00 | 1 |
41 | 6Czy | Phenotype | zebrafish | True | zp | Zebrafish Phenotype Ontology | 2022-12-17 | https://github.com/obophenotype/zebrafish-phen... | 03430b567bf153216c0fa4c3440b3b24 | https://github.com/obophenotype/zebrafish-phen... | 2024-01-24 13:39:15.229125+00:00 | 2024-01-24 13:39:15.229134+00:00 | 1 |
43 | 55lY | Phenotype | all | True | pato | Phenotype And Trait Ontology | 2023-05-18 | http://purl.obolibrary.org/obo/pato/releases/2... | bd472f4971492109493d4ad8a779a8dd | https://github.com/pato-ontology/pato | 2024-01-24 13:39:15.229325+00:00 | 2024-01-24 13:39:15.229334+00:00 | 1 |
44 | 48aa | Pathway | all | True | go | Gene Ontology | 2023-05-10 | https://data.bioontology.org/ontologies/GO/sub... | e9845499eadaef2418f464cd7e9ac92e | http://geneontology.org | 2024-01-24 13:39:15.229429+00:00 | 2024-01-24 13:39:15.229437+00:00 | 1 |
46 | 3rm9 | BFXPipeline | all | True | lamin | Bioinformatics Pipeline | 1.0.0 | s3://bionty-assets/bfxpipelines.json | a7eff57a256994692fba46e0199ffc94 | https://lamin.ai | 2024-01-24 13:39:15.229623+00:00 | 2024-01-24 13:39:15.229632+00:00 | 1 |
47 | 3TI0 | Drug | all | True | dron | Drug Ontology | 2023-03-10 | https://data.bioontology.org/ontologies/DRON/s... | 75e86011158fae76bb46d96662a33ba3 | https://bioportal.bioontology.org/ontologies/DRON | 2024-01-24 13:39:15.229722+00:00 | 2024-01-24 13:39:15.229730+00:00 | 1 |
48 | 7CRn | DevelopmentalStage | human | True | hsapdv | Human Developmental Stages | 2020-03-10 | http://aber-owl.net/media/ontologies/HSAPDV/11... | 52181d59df84578ed69214a5cb614036 | https://github.com/obophenotype/developmental-... | 2024-01-24 13:39:15.229819+00:00 | 2024-01-24 13:39:15.229828+00:00 | 1 |
49 | 16tR | DevelopmentalStage | mouse | True | mmusdv | Mouse Developmental Stages | 2020-03-10 | http://aber-owl.net/media/ontologies/MMUSDV/9/... | 5bef72395d853c7f65450e6c2a1fc653 | https://github.com/obophenotype/developmental-... | 2024-01-24 13:39:15.229917+00:00 | 2024-01-24 13:39:15.229926+00:00 | 1 |
50 | 3Tlc | Ethnicity | human | True | hancestro | Human Ancestry Ontology | 3.0 | https://github.com/EBISPOT/hancestro/raw/3.0/h... | 76dd9efda9c2abd4bc32fc57c0b755dd | https://github.com/EBISPOT/hancestro | 2024-01-24 13:39:15.230015+00:00 | 2024-01-24 13:39:15.230023+00:00 | 1 |
51 | 5JnV | BioSample | all | True | ncbi | NCBI BioSample attributes | 2023-09 | s3://bionty-assets/df_all__ncbi__2023-09__BioS... | 918db9bd1734b97c596c67d9654a4126 | https://www.ncbi.nlm.nih.gov/biosample/docs/at... | 2024-01-24 13:39:15.230112+00:00 | 2024-01-24 13:39:15.230121+00:00 | 1 |
!lamin delete --force test-tissue
!rm -r test-tissue
💡 deleting instance testuser1/test-tissue
✅ deleted instance settings file: /home/runner/.lamin/instance--testuser1--test-tissue.env
✅ instance cache deleted
✅ deleted '.lndb' sqlite file
❗ consider manually deleting your stored data: /home/runner/work/lamin-usecases/lamin-usecases/docs/test-tissue