LIRcentral Help

Contents

Definitions
Home
Search
Query Language
Browse
Results Table
Protein Entries


Definitions

LIRcentral contains manually curated entries of LIR-motifs that have been experimentally tested for their ability to bind Atg8 homologs, often in a biologically relevant context. LIR-motifs may have been tested under different experimental conditions, in vitro and/or in vivo. Central to the LIRcentral database are instances of LIR-motifs: a group of curators identifies papers in the literature and extracts relevant information that describes:
  1. The protein sequence that has been investigated in the respective publication, and its UniProt identifier. In cases where it can be inferred from the respective publication that a particular isoform of a UniProt entry is the one experimentally tested, LIRcentral curators try to identify and provide this information.
  2. The location of the LIR-motif in the sequence. We use as a reference the core LIR-motif [WFY]xx[VLI]. In atypical cases we follow the definitions given by the authors (see for example the entry for the atypical motif in positions 134-136 in CACO2_HUMAN).
  3. Unequivocal experimental evidence on the binding properties of the motif in the LDS of at least one Atg8 homolog.
Based on the different cases we have already encountered in the literature, LIRcentral entries can be roughly categorized as follows:

Verified: These motifs have been examined using a variety of experimental methods. Such motifs are labeled in LIRcentral as:
  • YES: A LIR-motif that has been shown to interact with at least one Atg8 homolog, validated by at least one experiment. This is often referred to as a 'functional' LIR-motif.
  • YES (sAIM) (Conditionally functional): This category has been introduced to accommodate the atypical, shuffled LIR-like motifs in the human "CDK5 regulatory subunit-associated protein 3" and its homolog in A. thaliana, which are required for the binding of a functional LIR-motif, even though they cannot independently maintain the interaction.
  • Accessory LIR motif: A LIR-motif that cooperates with other functional LIR-motifs (in the 'YES' category) in the interaction with an Atg8 homolog.
  • NO: A LIR-motif that has been tested against at least one Atg8 homolog with no positive interaction result.
    Note: LIR-motifs that bind to at least one Atg8 homolog cannot belong in this category.
Predicted: Under this category, we list all motifs matching simple regular expression patterns described in the literature, as follows:
  • LIR: matching the [WFY]xx[VLI] pattern of the core LIR-motif.
  • xLIR: matching the extended LIR-motif described in the manuscript by Kalvari et al., 2014.
  • hfAIM: matching any of the five regular expressions described in Xie et al., 2016.
Note: For technically inclined LIRcentral users, the respective regular expressions are given below (they work on "clean" amino acid sequences):
LIR:    [WFY]..[LIV]
xLIR:   [ADEFGLPRSK][DEGMSTV][WFY][DEILQTV][ADEFHIKLMPSTV][ILV]
hfAIM1: .[DE][DE][WFY][ADCQEIGNLMFPSTWYV].[LIV]
hfAIM2: [DE][DE][ADCQEIGNLMFPSTWYV][WFY][ADCQEIGNLMFPSTWYV].[LIV]
hfAIM3: ..[ADCQEIGNLMFPSTWYV][WFY][DE][DE][LIV]
hfAIM4: [DE].[DE][WFY][ADCQEIGNLMFPSTWYV].[LIV]
hfAIM5: ..[DE][WFY][DE].[LIV]
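
For readers who want to try these patterns on their own sequences, the Python sketch below shows one possible way to apply them. It is an illustration only and not the code used by LIRcentral: the function name `scan_motifs`, the whitespace stripping used to "clean" the input, the lookahead trick for reporting overlapping hits, and the 1-based inclusive coordinates are all assumptions made for this example.

```python
import re

# Regular expressions copied verbatim from the list above.
PATTERNS = {
    "LIR": "[WFY]..[LIV]",
    "xLIR": "[ADEFGLPRSK][DEGMSTV][WFY][DEILQTV][ADEFHIKLMPSTV][ILV]",
    "hfAIM1": ".[DE][DE][WFY][ADCQEIGNLMFPSTWYV].[LIV]",
    "hfAIM2": "[DE][DE][ADCQEIGNLMFPSTWYV][WFY][ADCQEIGNLMFPSTWYV].[LIV]",
    "hfAIM3": "..[ADCQEIGNLMFPSTWYV][WFY][DE][DE][LIV]",
    "hfAIM4": "[DE].[DE][WFY][ADCQEIGNLMFPSTWYV].[LIV]",
    "hfAIM5": "..[DE][WFY][DE].[LIV]",
}


def scan_motifs(sequence):
    """Return {pattern name: [(start, end, matched text), ...]}.

    Positions are 1-based and inclusive (an assumption of this sketch).
    """
    # Assumed meaning of a "clean" sequence: no whitespace, upper case.
    seq = "".join(sequence.split()).upper()
    hits = {}
    for name, pattern in PATTERNS.items():
        # A zero-width lookahead lets overlapping matches be reported.
        lookahead = re.compile(f"(?=({pattern}))")
        hits[name] = [
            (m.start() + 1, m.start() + len(m.group(1)), m.group(1))
            for m in lookahead.finditer(seq)
        ]
    return hits
```

Calling `scan_motifs` on a sequence returns one list of hits per pattern, so a single residue range can appear under several pattern names when it satisfies more than one expression.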

The picture below summarizes the LIRcentral biocuration pipeline.

Back to the Contents.

Home

This is the landing page of LIRcentral, providing a brief introduction. Two graphs summarize the current content of the database. On wide displays these graphs are plotted side by side, while on smaller screens (e.g. mobile phones) they are stacked vertically.
Left/Top: The distribution of experimentally verified LIR-motifs per species. Species names are automatically extracted from the underlying UniProt entries.
Right/Bottom: The number of instances of the different types of LIR-motif entries, as defined by LIRcentral curators based on the literature. See LIR definitions.

Back to the Contents.


Search

The "Search" functionality allows generic keyword-based search across all the fields of the LIRcentral database.

On the search page, users can perform keyword search in any field of the database, such as: UniProt ACC (e.g. O15040), UniProt ID (e.g. TCPR2_HUMAN), LIR motif sequence (e.g. YIAV), xLIR motif sequence (e.g. GDYIAV), any one of the five hfAIM motif sequences (e.g. EDEWEVI), species (e.g. Homo sapiens), protein/gene name (e.g. Tectonin beta-propeller repeat-containing protein 2, CALCOCO), motif type (e.g. LIR), Up-stream (e.g. YLTALDTNGD) and Down-stream (e.g. GSSIGMLYLY). It is worth noting that searches are also executed against synonyms for protein/gene-names, as available in the underlying UniProt records.

Note on searching against protein/gene names: The LIRcentral Search functionality performs simple text queries against the relevant information available in UniProt entries, i.e. no semantic similarity or dictionaries of synonymous terms are employed at this point. When searching for a ‘Protein/gene name’, LIRcentral looks for exact keyword matches in the description (DE) and gene name (GN) fields of the underlying UniProt entries, which may include alternative names (e.g. synonyms) for genes and gene products.




Query language: LIRcentral supports keyword-based searches. Some guidelines on how to form queries are given below:
  • Multiple keywords separated by spaces are handled as a "phrase". Example: the query Homo sapiens is looking for entries where "Homo sapiens" is mentioned exactly as typed.
    Note: Multiple spaces are handled literally, i.e. spaces are considered part of the "phrase". In most cases users should refrain from adding multiple spaces in queries.
    Note: Do not use double quotes to define phrases. Example: The query "Homo sapiens" returns no results.
  • Multiple keywords/phrases separated by commas are subject to boolean 'OR' search. Example: Mus musculus, SQSTM returns all entries in the database where either the phrase "Mus musculus" or the keyword "SQSTM" is found.
  • Queries are case insensitive. For example, the queries Homo sapiens, homo sapiens, and homO Sapiens are all equivalent.
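
The guidelines above can be summarized in a short Python sketch. It is not the LIRcentral implementation: the function names `parse_query` and `matches` are hypothetical, and the sketch assumes that whitespace around commas is trimmed, that a phrase matches as a substring of the record text, and that an empty query matches every record.

```python
def parse_query(query):
    """Split a query on commas into lower-cased phrases.

    Spaces inside a phrase are kept exactly as typed; only the whitespace
    around the commas is trimmed (an assumption of this sketch).
    """
    return [part.strip().lower() for part in query.split(",") if part.strip()]


def matches(query, record_text):
    """Return True if any phrase of the query occurs in the record text."""
    phrases = parse_query(query)
    text = record_text.lower()
    # An empty query is assumed to match every record; otherwise the
    # comma-separated phrases are combined with boolean OR.
    return not phrases or any(phrase in text for phrase in phrases)
```

Under these assumptions, double quotes are treated as ordinary characters, which is why a quoted phrase finds nothing unless the record itself contains the quotes.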

More features:

Under the search bar, there are additional options for the user:

  • Verified: Search in all experimentally verified motifs (reviewed or under review).
  • All: Search in all LIR-motifs, both predicted and experimentally verified (reviewed or under review).
  • Predicted: Search in all predicted motifs (LIR, xLIR, hfAIM(1-5)) of proteins that contain at least one experimentally verified (functional or non-functional) LIR-motif.

Note: By clicking the “Search” button without any keywords, all available entries in the database will be retrieved.

To get detailed information on the results returned by a "Search" operation, see Results Table.

Back to the Contents.



Browse

The "Browse" functionality allows targeted search in different fields of the LIRcentral database.

First, users need to select a field from the drop-down list (e.g. "Species") and then, in the text box on the right, type a term that corresponds to the selected field (e.g. Homo sapiens).

Complex queries can be formed by clicking the “Add More” button, enabling the selection of additional field/keyword combinations. By default, additional terms are combined with the "OR" Boolean operator. This behaviour can be altered by selecting the appropriate operator from the drop-down list on the left.



Selecting “AND” will show results that fulfill the first and the second conditions. You can add more rows with more conditions.
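
The way field/keyword rows combine can be illustrated with a small Python sketch. It is a sketch under stated assumptions, not the actual Browse code: the function `browse`, the dictionary-based entries and the left-to-right evaluation of operators are hypothetical choices made for this example.

```python
def browse(entries, conditions):
    """Filter entries with a chain of (operator, field, term) rows.

    The operator of the first row is ignored. Later rows are combined
    left to right, which is an assumption of this sketch.
    """
    selected = []
    for entry in entries:
        result = None
        for operator, field, term in conditions:
            # Case-insensitive text match within the chosen field.
            hit = term.lower() in str(entry.get(field, "")).lower()
            if result is None:
                result = hit
            elif operator == "AND":
                result = result and hit
            else:  # "OR" is the default operator
                result = result or hit
        if result:
            selected.append(entry)
    return selected
```

For example, a first row ("Species", "Homo sapiens") followed by a second row with operator "AND" ("Motif type", "xLIR") keeps only the entries that satisfy both rows, while the default "OR" keeps the entries that satisfy either one.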



To get detailed information on the results returned by a "Browse" operation, see Results Table.

Note on searching against protein/gene names: The LIRcentral Browse functionality performs simple text queries against the relevant information available in UniProt entries, i.e. no semantic similarity or dictionaries of synonymous terms are employed at this point. When searching for a ‘Protein/gene name’, LIRcentral looks for exact keyword matches in the description (DE) and gene name (GN) fields of the underlying UniProt entries, which may include alternative names (e.g. synonyms) for genes and gene products.

Back to the Contents.



Results Table

Results retrieved after a successful "Search" or "Browse" operation are displayed in tabular form as in the figure below:



Note: In wide monitor settings, all columns of the table are displayed, while on smaller screens (e.g. mobile phones) a heuristic is used to display as many columns as possible - the remaining columns can be revealed/hidden by clicking on the +/- control in the first column of each entry.


Description of the results table: Each table row corresponds to a particular instance of a LIR-motif. The first column shows one of three different icons, representing the LIR-motif status:
  • one for "Reviewed" entries, i.e. LIR-motifs that have been curated based on the literature,
  • one for entries "Under Review",
  • one for "Predicted" motifs (see Definitions).

A detailed description of the fields present in the results table is given below:

  • Species: Source species information as retrieved from the respective UniProt entry.
  • UniProt ACC: The UniProt accession number of the protein where this LIR-motif was identified. In cases where the LIRcentral curators can discriminate the exact isoform of a protein, such information is included in the entry (see e.g. the entry for the LIR-motif in the Isoform B of Glycogen [starch] synthase from D. melanogaster, UniProt accession number Q9VFC8-2).
    • By clicking on the external link icon or the UniProt ACC of the protein, the corresponding UniProt page will be opened in a new tab, for getting more information about the respective protein.
    • By clicking on the icon next to "UniProt ACC" of the protein, the Protein Entries page will open.
  • UniProt ID: The UniProt identifier of the protein on which the LIR-motif was identified.
  • Protein name: The protein name as retrieved from the respective UniProt FASTA entry. Synonyms are not shown for clarity.
  • Motif type: LIRcentral currently supports different LIR-motif types as described in Definitions.
  • Upstream: The amino acid sequence immediately upstream of a LIR-motif. By design, LIRcentral displays 10 amino acids before a canonical LIR-motif (see Definitions). Fewer residues may be displayed for LIR-motifs proximal to the N-terminus of the polypeptide chain.
  • Motif: The amino acid sequence of the core LIR-motif. This may not always adhere to the canonical LIR-motif (see Definitions).
  • Downstream: The amino acid sequence immediately downstream of a LIR-motif. By design, LIRcentral displays 10 amino acids after a canonical LIR-motif (see Definitions). Fewer residues may be displayed for LIR-motifs proximal to the C-terminus of the polypeptide chain.
  • Start position: The position of the first amino acid of the core motif.
  • End position: The position of the last amino acid of the core motif.
  • (-2)LIR pssm score: The position-specific scoring matrix (PSSM) score of the hexapeptide ending with the LIR-motif, following the computations in Kalvari et al., 2014.
  • Experimentally Verified (Functional YES/NO): See Definitions.
  • Ref: A link to the PubMed database with the reference(s) of the papers used to annotate this particular LIR-motif.
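
The relationship between the Upstream, Motif, Downstream, Start position and End position fields can be sketched in Python as follows. This is only an illustration of the description above: the function `motif_context` and its `flank` parameter are hypothetical, and positions are assumed to be 1-based and inclusive.

```python
def motif_context(sequence, start, end, flank=10):
    """Return (upstream, motif, downstream) for a motif in a sequence.

    start and end are assumed to be 1-based, inclusive positions of the
    core motif. Flanks are shorter than `flank` residues when the motif
    lies close to the N- or C-terminus.
    """
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("motif positions fall outside the sequence")
    upstream = sequence[max(0, start - 1 - flank):start - 1]
    motif = sequence[start - 1:end]
    downstream = sequence[end:end + flank]
    return upstream, motif, downstream
```

For a motif starting at position 6 of a sequence, only the five preceding residues exist, so the upstream field holds five residues instead of ten.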
Interacting with the Results Table
  • Search results can be copied (as text to the clipboard) or exported to Excel, CSV or PDF files by pressing the corresponding export buttons:

  • By clicking on the "Ref." link (when available in the last column), entries for the papers used to annotate the respective LIR-motif are displayed in the PubMed database (opens in a new tab).
  • On the top left side of the table, the user can select the number of the results that will be shown (10, 25, 50, 100 or all the results).
  • On the top right side of the table, the results can be filtered by providing user-defined keywords.

Back to the Contents.



Protein Entries

"Protein Entries" pages display information on LIR-motifs in the context of the whole polypeptide chain where they reside. This view overlays information available in LIRcentral with features dynamically retrieved from the UniProt database.

Each Protein Entry view is organised in three sections.

  • A header, displaying the Protein name, UniProt accession and Source organism (derived from UniProt).
  • A table displaying instances of LIR-motifs in the particular sequence.
  • A graphical panel with an integrated depiction of LIRcentral data and features retrieved on the fly from UniProt. The visualization is powered by the ProtVista BioJS viewer.

Important note: LIRcentral curators save the exact UniProt sequence version based on which the LIRcentral annotations are derived. In cases where the current UniProt entry differs, a warning message is displayed in the header section; in those cases, only the header and table sections become available, to avoid confusion. Currently (April 2022), two UniProt entries show such discrepancies and are marked by the LIRcentral database:


The header and table sections are always rendered at the top of the page. The relative position of the graphical panel depends on monitor settings: on wide screens the graphical panel appears next to the table, while on small screens (e.g. tablets) it appears below the table.



Most columns in this table correspond to the columns in the Results table. The only additional piece of information here refers to the "Type" of predicted LIR-motifs and is displayed in an additional column (column 6). In this field LIRcentral records the type of regular expression pattern (if any) that matches the particular LIR-motif. Superscripts (if available) in the last columns correspond to comments for this particular protein, displayed below the table. See Definitions for more details.


More features:

  • The table can be copied (as text to the clipboard) or exported to an Excel, CSV or PDF file by pressing the corresponding export buttons:

  • On the top left side of the table, users can select the number of the results that will be shown (10, 25, 50, 100 or all the results).
  • On the top right of the table, the results can be filtered, by providing user-defined keywords.

ProtVista graphical panel

Instances of different types of LIR-motifs are organized in individual "tracks", under "Lir motifs" on the top left of the visual display. Each of them is displayed with a colored box in the respective position under the sequence. Boxes are selectable and display positional and sequence information.


More ProtVista features:

  • On the top left side of the graph there are four buttons:
    • The first button downloads the graph’s data as a JSON file.
    • The second button highlights a region on the graph.
    • The third button resets the graph to its original state.
    • The fourth button zooms in and out of a selected region.

Back to the Contents.