A Data Science Guide

SELECT, FROM & WHERE

The SELECT, FROM, and WHERE statements are the fundamental building blocks for simple queries. All queries will consist of the SELECT and FROM statement, with the WHERE statement appearing in queries that need filtering.

Selecting columns

The SELECT statement is used to select the columns from a table. The FROM statement is used to select the database from which the data is retrieved. In the example below both statements are put together in a query that can be executed.

SELECT
    column_1,
    column_2
FROM table_1;

The query selects column_1 and column_2 from table table_1. If all columns need to be queried from a table the * symbol can be used as follows:

SELECT
	*
FROM table_1;

Filtering columns

Besides, selecting the columns it is also possible to filter the data that is being returned. This can be achieved with the WHERE statement. The WHERE statement always comes after the data is selected with SELECT and FROM and is followed by a condition.

SELECT
	column_1,
	column_2
FROM table_1
WHERE condition;

The condition is based on the data present in one of the columns. Examples of this are:

WHERE column_1 = 100               -- column_1 is equal to 100
WHERE column_1 > 100               -- column_1 is greater than 100
WHERE column_1 >= 100              -- column_1 is greater than or equal to 100
WHERE column_1 <= 100              -- column_1 is smaller than or equal to 100
WHERE column_1 >= 100              -- column_1 is greater than or equal to 100
WHERE column_1 BETWEEN 50 AND 100  -- column_1 is between 50 and 100

WHERE column_2 != 'Yes'            -- column_2 is not equal to the string Yes
WHERE column_2 <> 'Yes'            -- column_2 is not equal to the string Yes
WHERE column_2 IN ('Yes', 'YES')   -- column_2 is equal to "YES" or "Yes"
WHERE column_2 LIKE ''

Combining WHERE with AND, OR, and NOT

AND, OR, and NOT can all be combined with the WHERE statement to have more advanced filter operations and are used as follows:

  • AND: When two conditions are combined with the AND operator both conditions need to be TRUE.
  • OR: When two conditions are combined using the OR operator only on of them needs to be TRUE.
  • NOT: The NOT operator is used to negate a condition, or reverse the outcome.

Here is how they would be used in a query:

-- column_1 is equal to 100 and column_2 is not equal to Yes
WHERE column_1 = 100 AND column_2 != 'Yes';

-- column_1 is equal to 100 or column_1 is equal to 200
WHERE column_1 = 100 OR column_1 = 200;

-- column_2 is not equal to Yes
WHERE NOT column_2 = 'Yes;