Read OMIM Data
read_omim.Rd
Reads and formats OMIM data copied or manually downloaded from
https://omim.org/, or downloaded with download_omim() (permission required),
and appends columns to speed up subsequent curation activities.
Usage
read_omim(file, keep_mim = c("#", "%"), ...)
Arguments
- file
The path to a file (possibly compressed) containing OMIM data copied and pasted or manually downloaded from https://omim.org/ (see "Manual Input Requirements" for details), or downloaded with download_omim().
- keep_mim
[OMIM search data only] The MIM symbols representing the data types to keep, as a character vector, or NULL to retain all (default: "#" and "%").
The OMIM-defined MIM symbols are:

| MIM symbol | MIM type |
|------------|----------|
| * | gene |
| + | gene, includes phenotype |
| # | phenotype |
| % | phenotype, unknown molecular basis |
| ^ | deprecated |
| none | phenotype, suspected/overlap |

- ...
Arguments passed on to read_delim_auto
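The keep_mim behavior described above can be pictured with a small base R sketch. It is an illustration only, not the package's implementation: the helper filter_mim(), the column names mim_symbol and mim_number, the placeholder numbers, and the use of NA for entries with no symbol are all assumptions made for this example.

```r
# Toy rows standing in for an OMIM search result. Column names and
# MIM numbers are made-up placeholders, not real OMIM data.
toy <- data.frame(
  mim_symbol = c("*", "+", "#", "%", "^", NA),
  mim_number = c("000001", "000002", "000003", "000004", "000005", "000006"),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring the documented keep_mim semantics:
# NULL keeps every row; otherwise only rows whose symbol is listed are kept.
filter_mim <- function(df, keep_mim = c("#", "%")) {
  if (is.null(keep_mim)) {
    return(df)
  }
  df[df$mim_symbol %in% keep_mim, , drop = FALSE]
}

default_kept <- filter_mim(toy)                       # "#" and "%" rows only
all_kept <- filter_mim(toy, keep_mim = NULL)          # everything retained
genes_only <- filter_mim(toy, keep_mim = c("*", "+")) # gene entries only
```

With the real function, the equivalent choice is made through the keep_mim argument of read_omim(), for example keep_mim = NULL to retain all entry types.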
Value
An omim_tbl (tibble) with an omim column containing OMIM CURIEs as formatted
in DO xrefs, followed by complete OMIM data arranged as seen on omim.org for
OMIM entries (where possible). If the omim.org "Download as" button was used
to download the data, the omim_tbl will be additionally modified based on the
download type:
- Search list download: Additional omim_search class and search column containing the search used.
- OMIM phenotypic series titles download: Additional omim_PS_titles class.
- OMIM phenotypic series download: Additional omim_PS class and a row representing the OMIM phenotypic series itself.
Output with the columns typical of OMIM phenotype entries, including omim_PS
output, will have an additional geno_inheritance column containing a best
guess at inheritance from the GENO ontology. This simplifies adding
inheritance as logical subClassOf axioms to support curation.
NOTE: OMIM phenotypic series on https://omim.org/ include the same data as entries, but the columns are ordered differently.
Manual Input Requirements
The file with OMIM data copied or downloaded must include headers at the
top. These data can be left as copied and pasted from omim.org even if they
are not formatted correctly, as read_omim() will process and correct
headers, which includes fixing multi-line or misarranged column headers,
and will trim whitespace.
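As a rough picture of the kind of header cleanup described here, the base R sketch below collapses line breaks and repeated spaces inside header strings and trims their ends. It is not how read_omim() is implemented: clean_headers() is a hypothetical helper and the header strings are made-up examples of messy pasted input.

```r
# Hypothetical helper: replace any run of whitespace (including line breaks)
# with a single space, then trim leading and trailing whitespace.
clean_headers <- function(x) {
  trimws(gsub("[[:space:]]+", " ", x))
}

# Made-up headers imitating copy/paste damage: padding, a trailing newline,
# and a header split across two lines.
raw_headers <- c(" MIM Number ", "Title\n", "Phenotype\nMapping Key")
cleaned <- clean_headers(raw_headers)
```

This sketch only normalizes whitespace; repairing misarranged headers, which read_omim() is documented to handle, would need knowledge of the expected OMIM column layout.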