HomeStartingEnvironmentDBMSVisualPQLProceduresSQLFormsHost/APIIndex
SQL homecontents start chapter top of pagebottom of pagenext page index SELECT

SELECT

SELECT   [ DISTINCT ]  variable_list
FROM     [ tabfile.  ] table_name,     |
         [ tabfile.  ] view_name ,     |
         [ database.] record_name,     |
         [ database.] CIR ,            |
         path_name ,                   |
         [AS] alias_name,
Keywords:
         CASELIM n
         COMPILE_ONLY
         DBMS [ filename ]
         OUTPUT filename
         RECLIM n
         SAMPLE proportion [ , seed ]
         SELLIM n
Clauses:

FORMAT
GROUP BY
ON
ORDER BY
OUTER
UNION
WHERE

The SELECT command takes data from one or more existing records or rows and creates a new table with the selected data in it. The definition of data in the new table is copied from the existing definition. The number of rows in the new table depends on the number of rows in the input and the particular clauses specified. (See Output from SELECT.)

The variable list and the FROM clause are required. All other clauses are optional.

The result of the SELECT is a new table containing the selected variables. Tables created with the SELECT command are exactly the same as tables created in any other way. The SELECT command also controls the display format of the variables in the created table. The display format of the table can be modified with the display processor. Formatting can be specified on the SELECT with the FORMAT option.

The basic form of the command is:

SELECT variable_list FROM record_list WHERE condition
The variable list specifies the variables to be saved in the new table; the FROM clause, lists the records and tables which contain the input data and the WHERE clause, specifies the conditions under which records or rows are selected.

homecontents start chapter top of pagebottom of pagenext page index

Output

One occurrence of the set of selected variables is one row in the output table. The clauses and keywords specified in the SELECT affect the way in which the output table is constructed. If variables are selected from a single record or table, one output row is written for every individual data record or row which satisfies the WHERE clause.

If variables are selected from multiple records or rows, one output row is written for each individual data record or row which exists and which meets the selection criteria. For example the following results in one output row per combination of employee and review:

SELECT NAME POSITION RATING -
FROM EMPLOYEE REVIEW -
WHERE EMPLOYEE.ID = REVIEW.ID
This is sometimes referred to as an inner join.
There is no output for instances where a record of one type exists but not the other. A join which produces an output row regardless of whether the joined record or row exists is known as an outer join. Specify the OUTER keyword following the FROM clause. Follow OUTER with one or more record or table names. For example;

SELECT ....  FROM A B OUTER B
SELECT ....  FROM A B OUTER A
SELECT ....  FROM A B OUTER A B
All these examples generate an output row for every combination of A and B; the first example adds all occurrences of A where B does not exist; the second example adds all occurrences of B where A does not exist; the third example takes both occurrences of A with no B and occurrences of B with no A. These are sometimes referred to as a RIGHT outer join, a LEFT outer join and a SYMMETRIC outer join.

DISTINCT

The DISTINCT keyword specifies that one output row is written for each distinct or unique set of values selected. For example;

SELECT DISTINCT SALARY FROM EMPLOYEE
lists each salary that one or more people are earning. There is only one output row per value of salary, regardless of how many people earn that salary. The keyword UNIQUE (abbreviation UNQ) is a synonym for DISTINCT.

homecontents start chapter top of pagebottom of pagenext page index

Aggregation

Aggregation functions compute single values from multiple records and change the number of rows that are created. An aggregation function returns a single value to represent a calculation (such as an average) across multiple records. For example:

SELECT AVG(SALARY) FROM EMPLOYEE
This returns one value, the average salary of all the employees; correspondingly, only one row is created in the output table.

The following functions are aggregation functions and alter the number of output rows produced:

SUM
Sum of values selected
AVG
Average of values selected
STD
Standard deviation of values selected
MAX
Maximum value selected
MIN
Minimum value selected
FIRST
First non-missing value selected
LAST
Last non-missing value selected
COUNT
Count number of values

SUM, AVG and STD operate only on numeric data. Other aggregation functions operate on any type of data.

GROUP BY

The GROUP BY clause specifies that one output row is written for each unique value of the variables specified . For example,

SELECT .....  FROM EMPLOYEE GROUP BY CURRPOS
produces one output row per value in the CURRPOS variable. Since this is a grouping of data, individual data items cannot be selected; the only legal expressions that can be used in the variable list when GROUP BY is specified are aggregate data or the variables in the GROUP BY clause. For example, to produce a count of people and a total salary in each position.

SELECT CURRPOS COUNT(SALARY) SUM(SALARY) FROM EMPLOYEE -
GROUP BY CURRPOS
The table that results from aggregation has one entry per aggregation level. To produce a table with individual values for a column plus subtotals and totals (sums, averages, counts, or other statistics) , use the
Format clauses.

homecontents start chapter top of pagebottom of pagenext page index

Variable List

The variable list in the SELECT is a list of variable names and expressions. When a variable name is specified, the definition of that variable is copied from the existing definition. If variables have the same name on different records or tables, qualify the variable name by the name of the record or table. Qualified names are separated by a period.

An asterisk (*) specifies all of the variables in all of the records or rows referenced in the FROM clause. For example,

SELECT ID NAME FROM EMPLOYEE
SELECT * FROM REVIEW
SELECT EMPLOYEE.NAME .....  FROM EMPLOYEE .....

Computing New Values

Expressions may be specified in the variable list. For example, the following query computes the weekly salary by multiplying monthly salary (SALARY) by 12 and dividing by 52.

SELECT ID NAME SALARY*12/52 FROM EMPLOYEE
The name of the computed variable in the output table is the first 32 characters of the expression used to calculate it.

Numeric expressions may use numeric constants, numeric variables, the arithmetic operators ( + , - , / , * , ** ), SQL functions, and parentheses to denote the order of operations.

Character expressions may use quoted strings, variables which are strings, the "+" character to join strings, SQL string functions, and parentheses to denote the order of operation. Enclose strings in single quotes.

Functions

Functions are specified as a keyword followed by one or more arguments enclosed in parentheses. Arguments may be variable names, constants or expressions. The function operates on the arguments and returns a single value for each record selected.

For example, RND is a function which returns a number rounded to the nearest whole integer:

SELECT NAME RND(SALARY*12/52) FROM EMPLOYEE

homecontents start chapter top of pagebottom of pagenext page index

FROM

The FROM clause is required on the SELECT and specifies the records, paths, views and tables to be accessed. Database records may be specified by name or number. If more than one record or table is specified on the FROM clause, a join is performed.

Referencing Multiple Tables

A join retrieves data from two or more records or tables with one query. A join is implied whenever the FROM clause references more than one source for the data.

There must be common values in some corresponding columns between the data sources in the FROM clause. For example, to create a table of employee name and position level for each position ever held by the employee:

SELECT NAME POSITION FROM EMPLOYEE OCCUP -
WHERE  EMPLOYEE.ID = OCCUP.ID
This uses ID to form the relationship between the records in the WHERE clause. Use the record name as prefix to differentiate between column names which are the same in different records or tables. That is, EMPLOYEE.ID refers to the value of ID in the Employee record, while OCCUP.ID refers to the value of ID in the Occup record.

In a case structured database with case mode enabled (the default), there is an automatic relationship between records. A query automatically joins records for the same case. The automatic relationship has the same effect as the WHERE clauses matching each record on the case ID. For example, referencing data from multiple records on the COMPANY database, has the following implicit WHERE clause:

SELECT ...  FROM  EMPLOYEE OCCUP REVIEW -
            WHERE EMPLOYEE.ID = OCCUP.ID -
            AND   EMPLOYEE.ID = REVIEW.ID
There is no need to specify the WHERE clause to join on the case identifier.

homecontents start chapter top of pagebottom of pagenext page index

PATHS

A PATH links one record (or table) to another record (or table) and specifies the manner in which they are to be joined. System defined paths are automatically created by SQL between records in a database which have common keyfields. Common keyfields mean the same variable names of the same type in the same sequence in different records. For example, on the company database there is a system defined path between OCCUP and REVIEW which is equivalent to:

...  WHERE OCCUP.POSITION = REVIEW.POSITION
Each path has a name. Paths are explicitly invoked by naming a record (or table) and a path name on the FROM clause. The path is invoked implicitly by naming both records (or tables) in the FROM clause. If there are multiple paths between two record types, SQL uses the earliest defined path. SQL never automatically creates more than one path between any two records. The
SHOW PATH command displays the path definitions in order.

Create paths with the CREATE PATH command. This names the path, the two records (or tables) to be joined and the variable(s) used to join them. For example,

CREATE PATH MYPATH -
FROM   COMPANY.OCCUP TO COMPANY.REVIEW -
VIA    POSITION

Use the path by naming the path rather than the second record on the FROM clause. For example,

SELECT ....  FROM OCCUP MYPATH 

Path Mode and Case Mode

Records are joined whenever multiple tables or records are referenced in the FROM clause. The way in which records are joined is determined by the WHERE clause, by any PATHs that are referenced in the FROM clause and by the current settings of CASE mode and PATH mode.

If CASE mode is set, records are joined when the case identifier variable in one record is equal to the case identifier in another. This is equivalent to specifying a WHERE clause such as:

...  WHERE RECONE.CASEID = RECTWO.CASEID
If PATH mode is set, any paths between the records referenced in the FROM clause are automatically used. The system defined paths imply joins based on the case identifier and relationships implied in the keyfields.

For example, the default path joining the OCCUP and REVIEW records in the sample database is equivalent to the WHERE clause:

...  WHERE OCCUP.ID EQ REVIEW.ID AND OCCUP.POSITION EQ REVIEW.POSITION
The setting of CASE has no effect on joins if PATH mode is set. (However if you are using a subquery, case mode must be off). The setting of CASE does have an effect if PATHs are cleared (records are still joined within case).

If PATHS are explicitly referenced in the FROM clause, the setting of CASE and PATH have no effect, the PATH is always applied as defined.

homecontents start chapter top of pagebottom of pagenext page index

Joins

The normal type of join and the automatic join performed by case and path mode is an Equi-Join because the comparison operator between the two columns in the two tables is an "Equal" (EQ or =).

A join condition can specify other relationships between columns, such as "greater than" (GT or >), "less than" (LT or <), etc. These are referred to as Non-Equi-Joins. For example, suppose a table is created of minimum and maximum starting salaries by division:

SELECT DIVISION MIN(STARTSAL) MAX(STARTSAL) -
FROM OCCUP -
ON DIVSAL -
GROUP BY DIVISION -
FORMAT COL 2 NAME MINSAL -
FORMAT COL 3 NAME MAXSAL
Then, select anyone whose current salary is greater than the maximum or less than the minimum starting salaries.

SELECT ID NAME SALARY FROM EMPLOYEE REVIEW DIVSAL -
WHERE SALARY GT MAXSAL OR SALARY LT MINSAL
Note that this non-equi-join is a join and not just a test on a column value. This means that a row is produced for every matching condition between the joined rows. Therefore, a row appears once for each time that salary is greater than one of the division maxsal columns or less than one of the division minsal columns.

Non-equi-joins on large tables can produce a tremendous number of rows and the WHERE conditions should be carefully examined to limit the output to the required combinations.

homecontents start chapter top of pagebottom of pagenext page index

Alias

You can define an Alias name for a record or a table in the FROM clause. Use an alias to qualify a variable name when it cannot be done by using the unique record or table name. A record or table name is not unique if the same record or table is on two different databases or tables you are joining, or if you join a record or table to itself.

The alias is defined in the record or table specification in the FROM clause and is used wherever the table or record name would be used in other parts of the SELECT statement.

When an alias is defined, it follows the record name or the optional keyword AS and must be followed by a comma (,) to separate it from any other record names.

For example, suppose a genealogical database where everybody is in a PERSON record, and each person has an ID. In each person's record are the IDs of their father and mother. Because this involves joining a record type to itself, use an alias:

SELECT CHILD.ID FATHER.ID MOTHER.ID -
FROM   PERSON AS CHILD,  PERSON AS FATHER, PERSON AS MOTHER -
WHERE  CHILD.FATHERID EQ FATHER.ID AND -
       CHILD.MOTHERID EQ MOTHER.ID

homecontents start chapter top of pagebottom of pagenext page index

Keywords

The following keywords can be specified on the SELECT command:

CASELIM n
Specifies that the process stops after reading n cases and prompts whether to continue processing. If this is not specified, the default is used. The system default is 1000, which can be changed with the SET command.
COMPILE_ONLY
Specifies that the query is compiled but not executed. This is used to check the syntax of a query without performing the retrieval.
DBMS filename
Specifies that a VisualPQL version of the query is written. This option is intended as a debugging tool. Some modifications to the resultant program may have to be made before use.
OUTPUT filename
Sets the print filename for subsequent PRINT commands. This can also be set by the SET command and by the DISPLAY command.
RECLIM n
Specifies that the process stops after retrieving n records or rows and prompts whether to continue processing. If this is not specified, the default is used. The system default is 1000, which can be changed with the SET command.
Abbreviation: RLIM
SAMPLE proportion [,seed]
Specifies that a random sample is produced. ( 0.0 < proportion < = 1.0 ) The optional seed parameter specifies an odd integer to be used as the seed for the random number generator. This permits the generation of different samples.
SELLIM n
Specifies that the process stops after n rows have been selected and prompts whether to continue. If this is not specified, the default is used. The system default is 1000, which can be changed with the SET command.
Abbreviation: SLIM

homecontents start chapter top of pagebottom of pagenext page index

FORMAT

FORMAT COLUMN column_list  column_format
Use FORMAT clause(s) on a SELECT command to specify the appearance of particular columns. When a table is created, the default settings are used to format columns unless specifically overridden with FORMAT clauses. Specify multiple format clauses on a single select command by repeating the complete FORMAT clause.

The first part of a format clause specifies one or more columns to be formatted. Specify the keyword COLUMN followed by the column_list. The next part of the FORMAT clause specifies the Column Format for the particular columns which is typically a keyword (e.g. WIDTH) and a setting.

You can specify multiple options on a single format clause where the options apply to the specified column(s). For example:

SELECT ID SALARY*1.1 FROM EMPLOYEE -
FORMAT COLUMN 1 NAME EMPLOYEE_ID WIDTH 12 -
FORMAT COLUMN 2 NAME NEWSALARY

homecontents start chapter top of pagebottom of pagenext page index

Column References

Columns can be referenced by column number or by column name. Column number refers to the sequential (left to right) number of the column in the table. Column numbers remain the same regardless of any formatting commands. Each column has a name that can be altered by formatting commands. If the column name is changed then the new column name is used. A column label may be displayed, but the column is always referenced by its name or number.

Specify a list or range of consecutive columns by specifying the start and end columns separated by a colon (:).

Examples of column specifications might be as follows:

FORMAT COLUMN SALARY option
FORMAT COLUMN 1 option
FORMAT COLUMN 3:7 option
FORMAT COLUMN 1 4 8 option
FORMAT COLUMN SALARY option
FORMAT COLUMN BIRTHDAY option

homecontents start chapter top of pagebottom of pagenext page index

Column Formats

Columns have a number of display characteristics which can be specified. The specification can be supplied when the table is created by a SELECT or a CREATE TABLE or can be amended by specific commands.

The column formats are specified as a keyword and a setting as follows:

         [  DATE 'date_map' ]
         [  DPLACES n ]
         [  EXPONENT n ]
         [  LABEL ON | OFF ]
         [  MISSCHAR 'c' ]
         [  NAME column_name ]
         [  NULL 'string']
         [  SEPARATOR 'string' | n BLANKS ]
         [  TIME'time_map' ]
         [  VALLAB ON | OFF ]
         [  WIDTH n ]
         [  ZEROS ON | OFF | 'string']
DATE 'date map'
Sets the date map.
DPLACES n
Sets the number of decimal places to display. Abbreviation: DPL
EXPONENT n
Specifies that exponential notation is used for display or printing. The number specified is the number of decimal places and zero (0) is used to indicate that the field is not displayed in exponential format. For example:
1) To display the number 8,267 as 8.267E+003 specify EXPONENT 3.
2) To display the same number as 8.267000E+003 specify EXPONENT 6.
LABEL
ON specifies that the variable label is used as the column name. OFF specifies the variable name is the column name and is the default. Abbreviation: LAB
MISSCHAR
Sets the single character to display and fill the column when it contains missing values. Abbreviation: MISS
NAME
Sets the column name. Names must obey the SQL name rules or be specified as a string constant in single quotes. For example:
FORMAT col 3 NAME ANNUAL_SALARY
NULL
Sets the string to display if the column contains missing values. If the specified string is longer than can be displayed, it is truncated. NULL takes precedence over any MISSCHAR specified.
SEPARATOR
Sets the separator to the specified string. The separator precedes this column, separating it from the previous column and is typically a number of blanks. The separator may be set to a particular string or to a specified number of BLANKS. Abbreviations: SEP and BL, BLANK. For example:
FORMAT COLUMN salary SEPARATOR 4 BLANKS
TIME
Sets the time map. For example:
FORMAT COLUMN startime TIME 'HH:MM'
VALLAB
Specify VALLAB ON to display value labels instead of data values. Specify VALLAB OFF to display values. Use SET and CLEAR to change the VALLAB setting for all variables in the table. For example:
FORMAT COLUMN marstat VALLAB ON
WIDTH
Sets the column width to the specified number of characters. Column headings that do not fit are wrapped to as many lines as needed. Abbreviation: WID. For example:
FORMAT COLUMN salary WIDTH 12
ZEROS
Enables or disables the display of leading zeros for numeric variables. ZEROS OFF is the default. You can also specify a string to be displayed if the value is zero. For example:
FORMAT COLUMN salary ZEROS ON

homecontents start chapter top of pagebottom of pagenext page index

GROUP BY

GROUP BY specifies that all sets of values selected are grouped together according to their unique values in the value list. This produces a summary table with one entry per group of records. For example, to calculate the average salary for male and female employees:

SELECT VALLAB(GENDER) AVG(SALARY) FROM EMPLOYEE -
GROUP BY GENDER
This produces a table with two entries:

VALLAB(GENDER)  AVG(SALARY)
Male              2745.83
Female            2831.25
The aggregation can be done at additional levels by adding further variables to the GROUP BY clause:

SELECT VALLAB(GENDER) VALLAB(EDUC) AVG(SALARY) -
FROM EMPLOYEE GROUP BY GENDER EDUC

VALLAB(GENDER)        VALLAB(EDUC)              AVG(SALARY)
Male                  Elementary                  2533.33
Male                  High School                 2500.00
Male                  Some University             2550.00
Male                  B.Sc.  or B.A.              3050.00
Male                  M.S.                        2625.00
Male                  Ph.D.                       3350.00
Female                High School                 2600.00
Female                Some University             2700.00
Female                B.Sc.  or B.A.              3533.33
Female                M.S.                        1650.00
Female                Ph.D.                       2400.00
Where there are no records for a level, no row is created in the retrieval. For example, there are no female employees with only elementary education.

Selecting Groups: HAVING Clause

Use the HAVING clause with the GROUP BY clause to select groups according to some condition. For example, to select groups with an average salary greater than 2500.

SELECT VALLAB(EDUC) VALLAB(GENDER) COUNT(SALARY) AVG(SALARY) -
FROM EMPLOYEE GROUP BY EDUC GENDER HAVING AVG(SALARY) > 2500

Case Aggregation

In a case structured database, by default SQL computes aggregate functions within cases. To compute aggregates across cases, turn off case mode with the CLEAR CASE command. This turns off case mode for the rest of the session, or until it is re-enabled. Use the SET CASE command to re-enable case mode. For example; with case mode on, to select the average starting salary for all the positions an employee has had:

SELECT ID AVG(STARTSAL) FROM OCCUP
The result is one row for each employee with any OCCUP records. This gives the average of the starting salaries for each position the employee has had. This same query with cases cleared gives a very different result:

CLEAR CASE
SELECT AVG(STARTSAL) FROM OCCUP
The result is one row, giving the average starting salary for all the positions held by all the employees in the company.

Case mode is equivalent to a GROUP BY clause on the case id for the aggregation functions.

Aggregations with Missing or Undefined Values

Missing or undefined values are ignored in the computation of aggregation functions. For example, if the value of SALARY is missing for an employee, the record is ignored in the computation of the average. The average is a true average of actual values.

COUNT

COUNT counts the number of values selected in a query. It is often used in conjunction with other aggregation functions to count how many records were used in computing the aggregated value.

For example, the following query computes the mean salary and the number of all male employees.

SELECT AVG(SALARY) COUNT(SALARY) FROM EMPLOYEE -
WHERE GENDER = 1
This counts only those records which have a non-missing salary. COUNT can also have the argument * which specifies a count of all selected records regardless of whether the values are valid, missing, or undefined.

homecontents start chapter top of pagebottom of pagenext page index

ON

ON specifies the name of the new table to contain the results of the query.

If an output table is not specified with the ON clause, a table called PREVIOUS_SELECT is used.

All tables are on tabfiles. If a tabfile is not specified, the default is used. If a default tabfile has not been explicitly SET, a tabfile called $SYSTEM is used.

If all defaults are used, the results of a SELECT are stored on the table $SYSTEM.PREVIOUS_SELECT. This table is overwritten as necessary without prompting for confirmation. If you specify a table as output which already exists, you are prompted for confirmation that you want it overwritten before proceeding with the SELECT.

You cannot select ON to a table you are selecting FROM.

homecontents start chapter top of pagebottom of pagenext page index

ORDER BY

Specify ORDER BY to sort the selected rows. The first variable in the ORDER BY list is the major sort key. Specify keys in sequence from major to minor. By default, variables sort in ascending sequence. Follow the variable with the DESC keyword to sort in descending sequence. This only applies to the variable immediately preceding DESC. Missing values sort to the beginning of a set of values regardless as to whether ascending or descending is specified. (The MISS function can be used to retrieve original values for missing values.)

The sort keys may contain variables and expressions. The sort key variables do not have to be in the variable list of the SELECT.

DISPLAY does not resequence rows and ORDER BY must be specified on the SELECT to create the table in a particular sequence if this differs from the source data. For example:

SELECT ID NAME SALARY FROM EMPLOYEE  -
ORDER BY SALARY NAME
Synonyms: ORDER, SORT, SORT BY

homecontents start chapter top of pagebottom of pagenext page index

OUTER

When joining records on a case structured database, OUTER specifies that, if no matching record exist, the SELECT operates as if a record containing undefined values for all variables did exist.

Normally, a join operation creates a row for the resultant table if there exists a record for every record name in the FROM clause. The OUTER option allows the retrieval of some data even when some records may not exist. This operation is called an outer join. The keyword OUTER is specified after all records in the FROM clause and specifies all record names that the OUTER applies to.

SELECT NAME REVDATE FROM EMPLOYEE REVIEW  OUTER REVIEW 

homecontents start chapter top of pagebottom of pagenext page index

UNION

UNION adds the result of a second SELECT clause to the table created by the main SELECT command. With the UNION clause, each SELECT must result in the same number of output columns and each column must correspond in type. The first SELECT command determines the names and the types of the output columns. Numeric variables must correspond to numeric variables, string variables with string variables.

The main SELECT command defines the table that is produced when using the UNION statement. For example, assume two tables, one for current employees and one for ex-employees. A single output table for all employees could be produced with the UNION clause:

SELECT NAME SALARY FROM EMPLOYEE ON ALLEMPLOYEES -
       UNION  SELECT NAME SALARY FROM EXEMPLOYEE
This creates a new table with two columns and a row per employee.

As many SELECT clauses as required may be UNIONed together as long as the rules on number and type of variables are followed.

If sequence of the output table is important, use the ORDER BY clause to specify it. Any such ORDER BY should follow the last clause of the last SELECT in the command.

homecontents start chapter top of pagebottom of pagenext page index

WHERE

WHERE specifies the logical conditions used to select records or rows. Only records or rows meeting the conditions are selected. The WHERE clause can reference any variables, regardless as to whether the variables are in the SELECT variable list or not.

The WHERE clause can reference expressions . Expressions are a combination of variables and operators which produce a new value.

The WHERE clause may contain compound conditions connected by the logical operators AND, OR, XOR and NOT.
AND means both expressions must be true;
OR means either expression must be true;
XOR means one expression must be true but not both;
NOT means the expression must not be true.

The WHERE clause is evaluated in the following order of precedence (parentheses ( ) can be used to denote an explicit order of evaluation):

  1. Expressions
  2. Relational Operators
  3. NOT
  4. AND
  5. OR, XOR

Relational operators

The WHERE clause may include the following operators:

EQ or = or IS
Equal to
NE or ><
Not equal to
LT or <
Less than
LE or <=
Less than or equal to
GT or >
Greater than
GE or >=
Greater than or equal to
BETWEEN expr AND expr
Between or equal to the values of two expressions
IN (expr,......)
Equal to one of the values in a list
LIKE
Matches a specified character pattern
EQ NULL
Is missing
NE NULL
Is not missing

The logical operator NOT can be used to test for the opposite of any condition.

EQ, NE, LT, LE, GT, GE & BETWEEN

These operators test the relationship between two values. If the specified condition is true, the data is selected. NE is provided as a convenient shorthand; it is identical to NOT EQ.
BETWEEN means equal to the end values or any value in between.

IN

Selects records or rows when the value that matches one or more values in a list. For example, to select data for employees 1, 5, and 7.

SELECT ...  FROM EMPLOYEE WHERE ID IN (1,5,7)
To select all records except those in the list use NOT:

SELECT .... WHERE  NOT ( ID IN (1, 5, 7))

LIKE Pattern Matching

A
pattern is a partial string where symbols are used to indicate how that position is to be treated. The pattern consists of symbols plus the string you want to match.

homecontents start chapter top of pagebottom of pagenext page index

Subqueries

Subqueries are used to select rows from a table based on data in other rows. The rows returned by one SELECT statement are used in the WHERE clause of another SELECT statement. The subquery executes first and returns one or more values which are then used by the main SELECT as if it were given a set of constant values. For example, to select the name, gender and education of all employees who have the same education as Mary Black.

SELECT NAME GENDER EDUC FROM EMPLOYEE -
WHERE EDUC = -
(SELECT EDUC FROM EMPLOYEE WHERE NAME = 'Mary Black')
The subquery (the one enclosed in parentheses) returns the value of Mary Black's education. This value is then used as the object of the WHERE clause for the main SELECT. The information selected by the main SELECT consists of the name and education for all employees with the same education as Mary. This set of employees naturally includes Mary.

The data comes from multiple cases and case structure must be off (CLEAR CASE) for this query to operate as required. With a case structured database and subqueries which retrieve data from one case which is used to SELECT other cases, CLEAR CASE mode.

Subqueries can be used wherever a WHERE can be specified.

The previous example shows the most basic use of a subquery - one that returns a single value. If a subquery can return more than one value, specify how the returned values are treated in the WHERE clause with the IN, ANY and ALL functions. ANY tests against any returned value; ALL tests against every returned value. These can be used with the relational operators (EQ, NE, LT, GT, LE, GE). IN tests a value to be equal to a value in the list of returned values and is equivalent to EQ ANY.

Where a subquery can return more than one value, this is equivalent to a list. For example, to select people whose salary is greater than anyone whose current position is in division 1:

SELECT ID NAME SALARY FROM EMPLOYEE -
WHERE SALARY GT ALL  -
(SELECT SALARY FROM EMPLOYEE OCCUP -
WHERE CURRPOS EQ POSITION AND DIVISION EQ 1)
The subquery finds the division in the OCCUP record which matches the current position and tests to be in division 1; the salary of all employees where this is true is retrieved and these are now equivalent to a list of salaries. The salary for each employee is tested against this list to be greater than all entries in the list. A subquery must only return one column to be used to construct the list.

A WHERE clause can contain a combination of conditions and subqueries. Any subquery either returns a single value or a list and can be treated as equivalent to a value or list specified by expressions which are not subqueries. The logical operators (AND, OR, XOR, NOT) are used to connect separate clauses in the WHERE expression.

The logical function EXISTS tests that the subquery returns at least one row. EXISTS returns "True" if at least one row exists, "False" if not. The test can be reversed with the NOT logical operator. For example:

SELECT ID NAME SALARY FROM 1 WHERE -
EXISTS (SELECT * FROM REVIEW WHERE RATING = 5)
When using the EXISTS function, the column returned by the subquery is irrelevant and must be specified as an asterisk (*).

Note: This query could have been performed more easily and more efficiently as a simple join although this would return multiple rows for people who had received multiple ratings of 5:

SELECT ID NAME SALARY FROM EMPLOYEE REVIEW -
          WHERE RATING EQ 5

homecontents start chapter top of pagebottom of pagenext page index