
Welcome to the Textpresso user guide. This document briefly explains the functionality of the Textpresso website.

1) Main Menu

    There are currently nine clickable items aligned horizontally underneath the logo (1). The highlighting indicates the user's current work area. The menu items are:

    About Textpresso & Copyright: Information about us and copyright issues can be found here.

    Downloads: The Textpresso software can be downloaded in this section.

    Feedback: This interface presents a form to leave feedback, questions and suggestions for the Textpresso team.

    Home: The homepage contains news and messages, information about the database and a smaller version of the search interface.

    Ontology: The corpus of literatures is marked up with terms and phrases of an ontology, which are contained in a lexicon. One can browse the terms and test specific terms against the lexicon.

    Query Language: This interface offers the most powerful and versatile search options available in Textpresso.

    Search: Most common search tasks can be performed using this option. It is the most frequented page of the system.

    User Guide: This document.

2) Feedback

    This form allows the user to leave a message for the Textpresso team. Fill out all the given text fields and hit the 'Submit!' button. If any field is left empty, the form is returned to the user and the message is not accepted for submission.

3) Home

    The home page consists of three parts: a simplified search interface, a news and message board, and the database description. The search interface supports keyword searches (1) and/or category searches. The keyword text field accepts phrases (terms that contain more than one word) as well as simple Boolean operators such as 'AND', 'OR', and 'NOT'. The keyword search can be refined by specifying whether the match has to be exact or case sensitive (2). In addition, certain categories of the Textpresso ontology can be required to be present; up to four categories can be specified. Finally, one can restrict searches to specific literatures (4). By default, all criteria entered in this interface have to match within a single sentence. The search scope can be modified in the full search interface, which can be accessed from the main menu. To submit the query, hit the 'Search!' button (5).

    The 'News & Messages' column (6) contains updates about the latest developments and news. The database on which the system relies is described at the bottom of the page (7). All literatures are listed separately, with data counts for particular database fields. The last line summarizes the full database.

4) Ontology

    The ontology page serves two functions. First, one can test a word or phrase for a match in the Textpresso lexicon. After entering a term in the text field (1) and hitting the 'Test!' button (2), the system returns all entries in the lexicon whose regular expressions match the term (3).
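
    The lexicon test can be pictured with a minimal Python sketch. It assumes that each lexicon entry pairs a category with a regular expression and that a term has to match the whole expression; the guide does not show actual lexicon entries, so the patterns and the function name match_lexicon below are purely hypothetical.

```python
import re

# Hypothetical lexicon entries: (category, regular expression).
# The real Textpresso lexicon is not reproduced in this guide.
LEXICON = [
    ("regulation", r"(up|down)[- ]?regulat(e|es|ed|ion)"),
    ("regulation", r"inhibit(s|ed|ion)?"),
    ("cell", r"anchor cells?"),
]

def match_lexicon(term, lexicon=LEXICON):
    """Return every (category, pattern) entry whose pattern matches the whole term."""
    return [(cat, pat) for cat, pat in lexicon if re.fullmatch(pat, term)]

for term in ("anchor cell", "upregulated", "mitosis"):
    print(term, "->", match_lexicon(term))
```

    A term that matches no entry, such as "mitosis" in this toy lexicon, simply yields an empty list.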

    The second function displays the lexicon of a particular category. Choose a category from the pop-up menu (4) and hit the 'Display!' button (5). All entries of that category are then displayed; since the list can be quite long, the response can be somewhat slow.

5) Query Language

    The query language is the most versatile and powerful element of Textpresso. Complicated retrieval tasks can be performed by carefully crafting sets of commands. There are three kinds of commands: the first kind (set, clear) manipulates parameter settings, the second kind (find) performs keyword, phrase, category and attribute searches, and the third kind (and, or, not, display) manipulates search results, which are kept in an entity called a 'variable' here. The syntax of the commands is listed in the explanation box (1).

    The text area (2) accepts commands line by line. The user can either enter one command at a time and press the 'Submit!' button (3) after each, or enter all commands at once (one per line) and have them processed together by hitting the 'Submit!' button (3).

    The set command sets a parameter to a single value or a series of values (example: set literature=elegans, melanogaster). The parameter names are 'literature', 'field', 'exact match', 'case sensitive', 'sentence scope', 'search mode' and 'sorted by'. A parameter name has to be spelled out in full; however, if a name consists of two words, either one of them is sufficient. The values a parameter can take depend on the particular Textpresso database implementation. They can be read off the corresponding input fields (check boxes, popup menus) in the search interface (accessible through the main menu). A value can be abbreviated to the first few letters that identify it uniquely (for example, set literature=ele).
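
    The naming and abbreviation rules of the set command can be illustrated with a short Python sketch. It is not the Textpresso implementation: the function names and the list of literatures are assumptions made for the example, and only the parameter names are taken from this guide.

```python
# Parameter names as listed in this guide.
PARAMETERS = ["literature", "field", "exact match", "case sensitive",
              "sentence scope", "search mode", "sorted by"]

# Hypothetical values; the real ones depend on the installation.
LITERATURES = ["elegans", "melanogaster", "mouse"]
ALLOWED = {"literature": LITERATURES}

def resolve_parameter(name):
    """Accept a full parameter name or either word of a two-word name."""
    wanted = name.strip().lower()
    for full in PARAMETERS:
        if wanted == full or wanted in full.split():
            return full
    raise ValueError(f"unknown parameter: {name!r}")

def resolve_value(abbrev, allowed):
    """Expand an abbreviation to the one allowed value it identifies uniquely."""
    wanted = abbrev.strip().lower()
    exact = [v for v in allowed if v.lower() == wanted]
    if exact:
        return exact[0]
    matches = [v for v in allowed if v.lower().startswith(wanted)]
    if len(matches) != 1:
        raise ValueError(f"{abbrev!r} matches {len(matches)} values: {matches}")
    return matches[0]

def parse_set(line, allowed_values):
    """Split 'set name=value1, value2' into (parameter, [values])."""
    body = line.strip()
    if not body.lower().startswith("set "):
        raise ValueError(f"not a set command: {line!r}")
    name, _, raw = body[4:].partition("=")
    param = resolve_parameter(name)
    values = [resolve_value(v, allowed_values[param])
              for v in raw.split(",") if v.strip()]
    return param, values

print(parse_set("set literature=ele, mel", ALLOWED))
```

    With these example values, 'ele' and 'mel' are unique prefixes, whereas 'm' alone would be rejected because it fits both melanogaster and mouse.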

    The clear command clears the setting of a parameter (example: clear all). The user can clear all parameters by using the word 'all', or clear a specific parameter by naming it, again spelled out completely.

    The find command performs the actual search (example: find keyword egg > 0 -> var_egg). The exact syntax of this command is:

find (keyword | category | attribute) (keyword | "phrase" | category | category:attribute:value) (< | == | >) number -> variable-name

    The second parameter determines what the user wants to search for; the choices are 'keyword', 'category' or 'attribute'. The third parameter specifies the data item the user is searching for, such as a keyword (for example, mitosis), a phrase (for example, "anchor cell"), a category (for example, regulation) or an attribute (for example, regulation:type:positive). The fourth and fifth parameters determine the numerical constraint on how often the data item has to occur in a given search scope (the search scope is set with the set command, see above). Thus, '> 2' means that the item has to be present in the search scope more than two times. The arrow (->) following these two parameters is a fixed character sequence and cannot be changed. It suggestively points to the last parameter, the variable name, in which the search result is stored. Characters and numbers should be used for this name. If the find command is the last one in the text area when the user hits the 'Submit!' button, the search results of this last find command are returned in the result table. The result table is described in detail in the section about the search interface.
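
    To make the structure of a find command concrete, the following Python sketch splits a command into its parts and applies the numerical constraint to an occurrence count. It is only an illustration based on the syntax given above; the regular expression, the FindCommand type and the helper names are assumptions and not part of Textpresso.

```python
import re
from typing import NamedTuple

# One pattern for the whole command, following the syntax line in this guide.
FIND_RE = re.compile(
    r'''^\s*find\s+
        (?P<kind>keyword|category|attribute)\s+
        (?P<item>"[^"]+"|\S+)\s+
        (?P<op><|==|>)\s*
        (?P<count>\d+)\s*->\s*
        (?P<var>\S+)\s*$''',
    re.VERBOSE,
)

class FindCommand(NamedTuple):
    kind: str       # 'keyword', 'category' or 'attribute'
    item: str       # keyword, phrase, category or category:attribute:value
    op: str         # '<', '==' or '>'
    count: int      # number used in the constraint
    variable: str   # name under which the result is stored

def parse_find(line):
    """Parse one find command; raise ValueError if it does not fit the syntax."""
    m = FIND_RE.match(line)
    if m is None:
        raise ValueError(f"not a valid find command: {line!r}")
    return FindCommand(m["kind"], m["item"].strip('"'), m["op"],
                       int(m["count"]), m["var"])

def satisfies(cmd, occurrences):
    """Check the numerical constraint for one search scope (e.g. one sentence)."""
    if cmd.op == "<":
        return occurrences < cmd.count
    if cmd.op == ">":
        return occurrences > cmd.count
    return occurrences == cmd.count

cmd = parse_find('find keyword "anchor cell" > 2 -> ac')
print(cmd)
print(satisfies(cmd, 3), satisfies(cmd, 2))
```

    In this sketch, a sentence containing the phrase three times satisfies '> 2', while a sentence containing it exactly twice does not.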

    The and, or and not commands combine two search results that have previously been obtained using the find command and stored in two variables. The result of this operation is stored in a third variable, which is specified last (example: and gene-result cell-result -> gene-and-cell). Again, if one of these commands happens to be the last in a series of commands in the text area, the result of this operation is returned in the result table.
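
    One way to think about these commands is as set operations on stored results. The Python sketch below treats each variable as a set of matching sentence identifiers; this representation, the identifiers and the function name run_combination are assumptions made for illustration. In particular, the reading of 'not' as "in the first result but not in the second" is an assumption, since this guide does not spell out its semantics.

```python
def run_combination(line, variables):
    """Execute one 'and', 'or' or 'not' command and store its result."""
    tokens = line.split()
    if len(tokens) != 5 or tokens[3] != "->":
        raise ValueError(f"expected '<op> <var1> <var2> -> <target>': {line!r}")
    op, left, right, _, target = tokens
    a, b = variables[left], variables[right]
    if op == "and":
        result = a & b      # present in both results
    elif op == "or":
        result = a | b      # present in at least one result
    elif op == "not":
        result = a - b      # assumption: in the first result but not the second
    else:
        raise ValueError(f"unknown command: {op!r}")
    variables[target] = result
    return result

# Hypothetical stored results, as if produced by two earlier find commands.
variables = {
    "gene-result": {"doc1:s3", "doc2:s1", "doc4:s7"},
    "cell-result": {"doc2:s1", "doc3:s2", "doc4:s7"},
}

run_combination("and gene-result cell-result -> gene-and-cell", variables)
print(sorted(variables["gene-and-cell"]))  # ['doc2:s1', 'doc4:s7']
```

    Because the result is stored under its own name, it can itself serve as input to a later and, or or not command.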

    Finally, the display command displays a search result that was obtained previously but is no longer shown after further operations. The name of the variable to which the search result was assigned has to be provided as a parameter. Only variables within a session can be retrieved, i.e., once the user leaves the query language interface, all search results are lost.

    The following figure shows the search return (1) of a query formulated with the query language interface. Details of the search return table are explained below. Note that the last set of commands is documented in the explanation section (2):

6) Search

    The search page is probably the interface the user will frequent most. It offers a variety of convenient options; however, the most accurate way to formulate a query is to use the query language. Therefore, if the search page does not meet your needs, please use the query language interface.

    The user can enter a keyword or phrase in a text field (1), and multiple keywords and/or phrases can be combined with the usual operators 'AND', 'OR', and 'NOT' in the same text field, using the syntax described below the text field. Exact matches and case sensitivity can be requested by clicking on the corresponding boxes (2). Furthermore, the user can require up to four categories to be present, which are specified with popup menus (3). The search can be restricted to certain literatures (4) and confined to particular sections of the publication, called fields (5). Search scope, search mode and sort option can be modified with popup menus (6). The search scope determines where keyword and category requirements have to be met: on a sentence level, on a field level (abstract, author, body, title, year) or on a document level (anywhere in the document). The search mode determines how the score is calculated. 'Boolean' just adds up occurrences of matched entities in an integer fashion. 'tf·idf' (term frequency times inverse document frequency) gives rare terms more weight. 'Latent themes' emphasizes matches that have semantic contexts similar to the query, but is not implemented yet. The search return can be sorted according to certain output fields, such as year of publication, author, title etc. It can also be sorted according to score. Finally, to submit the search, hit the 'Search!' button (7).
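
    The difference between the 'Boolean' and 'tf·idf' search modes can be sketched in a few lines of Python. This guide does not give the exact formula Textpresso uses, so the sketch assumes the common textbook form, term frequency times log(N/df); the function names and all counts are invented for the example.

```python
import math

def boolean_score(term_counts, query_terms):
    """Add up the occurrences of all matched query terms."""
    return sum(term_counts.get(t, 0) for t in query_terms)

def tfidf_score(term_counts, query_terms, doc_freq, n_docs):
    """Weight each term's frequency by log(n_docs / document frequency).

    Assumption: textbook tf-idf, not necessarily the Textpresso formula.
    """
    score = 0.0
    for t in query_terms:
        tf = term_counts.get(t, 0)
        df = doc_freq.get(t, 0)
        if tf and df:
            score += tf * math.log(n_docs / df)
    return score

# Invented numbers: 'cell' is common in the corpus, 'vulva' is rare.
counts_in_doc = {"cell": 5, "vulva": 1}
doc_freq = {"cell": 900, "vulva": 10}
n_docs = 1000
query = ["cell", "vulva"]

print(boolean_score(counts_in_doc, query))
print(tfidf_score(counts_in_doc, query, doc_freq, n_docs))
```

    Under these assumptions, the single occurrence of the rare term 'vulva' contributes more to the tf·idf score than the five occurrences of the common term 'cell', whereas the Boolean score counts every occurrence equally.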

    The result table of a search return has several features. The user can flip through all pages of the search return by using the page selector (1). The output of the result table is controlled by the display options panel (2). You can switch the display of abstract, accession, author, citation, journal, title, type or year information on or off by clicking on the respective links. You can further toggle the supplemental links and files as well as the presence of links in the text ('textlinks'). Highlighting of search terms that are matched in the text can be switched on or off. The 'matching-sentences' menu allows the user to choose how many sentences around a match should be displayed. The option 'none' does not display the matching sentences at all, while '1' displays just the matching sentence. Finally, the 'entries/page' menu determines how many publications are displayed per page. For speedy returns, choose a low number for both the 'matching-sentences' option and the 'entries/page' option.

    The search summary (3) tells the user how many matches have been found in how many documents, and indicates the search time. The search time does not include the time to produce the webpage. Following the search summary, links that concern all entries of a search return are presented (4): the first link produces an Endnote file for all result entries, and the second link produces a printer-friendly version of the result table. The production of this table might take quite a while when the search return is huge. The publication display (5) shows detailed bibliographical information about each publication, which can be customized in the display options panel. The publication display might contain textlinks to other databases (such as gene or cell report pages). Search terms that match are highlighted if the option is turned on. In the case of phrases, not only the phrase but also its constituents are highlighted. The score of each entry is displayed in the upper right corner (6). Finally, at the bottom of each entry, there are several links to supplemental files and other web pages pertaining to that particular entry. The first link produces an Endnote file for the publication, and the second links to the online text of the publication. The third link leads to PubMed, displaying related articles.

Last update: Thu, February 23, 2006 05:10:50 PM by Hans-Michael Muller.
