
Reads and formats OMIM data copied or manually downloaded from https://omim.org/, or downloaded with download_omim() (permission required), and appends columns to speed up subsequent curation activities.

Usage

read_omim(file, keep_mim = c("#", "%"), ...)

Arguments

file

The path to a file (possibly compressed) containing data copied and pasted or manually downloaded from https://omim.org/ (see "Manual Input Requirements" for details), or downloaded with download_omim().

keep_mim

[OMIM search data only] The MIM symbols representing the data types to keep, as a character vector, or NULL to retain all (default: "#" and "%").

The OMIM defined MIM symbols are:

MIM symbol    MIM type
*             gene
+             gene, includes phenotype
#             phenotype
%             phenotype, unknown molecular basis
^             deprecated
none          phenotype, suspected/overlap
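
The effect of keep_mim can be illustrated with a minimal base R sketch. This is not the implementation used by read_omim(); the data frame, its column names (mim_symbol, mim_number), the MIM numbers, and the helper filter_mim() are all hypothetical and exist only to show the filtering semantics described above.

```r
# Made-up search results; column names and MIM numbers are illustrative only.
entries <- data.frame(
  mim_symbol = c("#", "%", "*", "+", "^", NA),  # NA stands in for "none"
  mim_number = c("100001", "100002", "100003", "100004", "100005", "100006"),
  stringsAsFactors = FALSE
)

# Hypothetical helper mimicking the documented keep_mim behavior:
# keep only rows whose symbol is listed, or keep everything when NULL.
filter_mim <- function(df, keep_mim = c("#", "%")) {
  if (is.null(keep_mim)) {
    return(df)
  }
  df[df$mim_symbol %in% keep_mim, , drop = FALSE]
}

kept <- filter_mim(entries)                       # default: "#" and "%" only
all_rows <- filter_mim(entries, keep_mim = NULL)  # NULL retains all rows
```

With the default, only the two phenotype rows remain; passing NULL leaves the input unchanged.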
...

Arguments passed on to read_delim_auto

Value

An omim_tbl (tibble) with an omim column containing OMIM CURIEs as formatted in DO xrefs, followed by complete OMIM data arranged as seen on omim.org for OMIM entries (where possible). If the omim.org "Download as" button was used to download the data, the omim_tbl will be additionally modified based on the download type:

  • Search list download: Additional omim_search class and search column containing the search used.

  • OMIM phenotypic series titles download: Additional omim_PS_titles class.

  • OMIM phenotypic series download: Additional omim_PS class and a row representing the OMIM phenotypic series itself.

Output with columns typical of OMIM phenotype entries, including omim_PS, will have an additional geno_inheritance column containing a best guess at inheritance from the GENO ontology. This simplifies adding inheritance as logical subClassOf axioms supporting curation.

NOTE: OMIM phenotypic series on https://omim.org/ include the same data as entries, but the columns are ordered differently.

Manual Input Requirements

The file with copied or downloaded OMIM data must include headers at the top. The data can be left exactly as copied and pasted from omim.org, even if they are not formatted correctly: read_omim() processes and corrects the headers, including multi-line or misarranged column headers, and trims whitespace.
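
A typical call might look like the sketch below. It assumes that read_omim() is provided by the DO.utils package and that a file of OMIM search results has been saved locally; the file name omim_search.tsv is a placeholder, and the code is skipped if the package or the file is unavailable.

```r
# Placeholder path to OMIM data saved from omim.org (an assumption).
omim_file <- "omim_search.tsv"

# Run only when the package (assumed to be DO.utils) and the file exist.
if (requireNamespace("DO.utils", quietly = TRUE) && file.exists(omim_file)) {
  # Default: keep only "#" and "%" entries (applies to OMIM search data).
  omim <- DO.utils::read_omim(omim_file)

  # Retain entries of every MIM type instead.
  omim_all <- DO.utils::read_omim(omim_file, keep_mim = NULL)

  # The result is documented as an omim_tbl; further classes such as
  # omim_search depend on how the data were downloaded.
  print(class(omim))
  print(inherits(omim, "omim_search"))
}
```

Checking the class with inherits() is one way to tell which kind of download was read before relying on download-specific columns such as search.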