Fisseha Berhane, PhD

Data Scientist

443-970-2353 fisseha@jhu.edu CV Resume Linkedin GitHub twitter twitter

Working with TERADATA

Get to know your data in Teradata

One of the first things we should do when we start working with a database is confirm how many tables each database has, and identify the fields contained in each table of the database.

We use different commands to do this in Teradata than we use in MySQL. Instead of using SHOW or DESCRIBE to get a list of columns in a table, use:

HELP TABLE [name of table goes here]

To get information about a single column in a table, we could write:

HELP COLUMN [name of column goes here]

The output of these commands will be a table with several columns. The important columns for you to notice are “Column Name”, which tells you the name of the column, and “Nullable”, which will have a “Y” if null values are permitted in that column and an “N” if null values are not permitted.

One thing that is missing from the information outputted by HELP is whether or not a column is a primary or foreign key. In order to get that information, use a SHOW command: SHOW table [insert name of table here];

However, SHOW does something different in Teradata than it does in MySQL. Teradata uses SHOW to give you the actual code that was written to create the table. You can ignore much of the SHOW output for our purposes, but the end of the create table statement tells you what defaults were set for each column in the table.For example, in the following output:

STORE INTEGER NOT NULL,
CITY CHAR(20) CHARACTER SET LATIN NOT CASESPECIFIC,
STATE CHAR(2) CHARACTER SET LATIN NOT CASESPECIFIC,
ZIP CHAR(5) CHARACTER SET LATIN NOT CASESPECIFIC,

The STORE column was configured so that it would not accept null values, the CITY, STATE, and ZIP columns were configured so that their values would not be case specific, and the column STORE was defined as the primary key of the table.

Look at your raw data

Most of the syntax you use to look at your data in Teradata is the same as you use in MySQL.
One of the main differences is that Terdata uses a TOP operator instead of a LIMIT operator to restrict the length of a query output. Whereas LIMIT comes at the end of a MySQL query, TOP comes immediately after SELECT in a Teradata query. The following statement would select the first 10 rows of the strinfo table as they are stored in the native database

SELECT TOP 10 *
FROM strinfo

The following statement would select the first 10 rows of the strinfo table,ordered in ascending alphabetical order by the city name (you would retrieve names that start with “a”):

SELECT TOP 10 *
FROM strinfo
ORDER BY city ASC

The following statement would select the first 10 rows of the strinfo table,ordered in descending alphabetical order by the city name (you would retrieve names that start with “w” and “y”):

SELECT TOP 10 *
FROM strinfo
ORDER BY city DESC

The following query would retrieve 10 random rows from the strinfo table:

SELECT *
FROM strinfo
SAMPLE 10;

The following query would retrieve a random 10% of the rows from the strinfo table:

SELECT *
FROM strinfo
SAMPLE .10

Other important differences between Teradata and MySQL :

  1. DISTINCT and LIMIT can be used in the same query statement in MySQL, but DISTINCT and TOP cannot be used together in Teradata

  2. MySQL accepts either double or single quotation marks around strings of text in queries,but Teradata will only accept single quotation marks

  3. MySQL will accept the symbols “!=” and “<>” to indicate “does not equal”, but Teradata will only accept “<>” (other operators, like “IN”, “BETWEEN”, and “LIKE” are the same: http://www.teradatawiki.net/2013/09/Teradata-Operators.html)

  • What was the date of the earliest sale in the database where the sale price of the item did not equal the original price of the item, and what was the largest margin (original price minus sale price) of an item sold on that earliest date?
In [ ]:
select top 1 saledate, (orgprice-sprice) as margin
from trnsact
where orgprice<>sprice
order by saledate ASC, margin DESC
  • What register number made the sale with the highest original price and highest sale price between the dates of August 1, 2004 and August 10, 2004? Make sure to sort by original price first and sale price second.
In [ ]:
SELECT TOP 10 register,saledate ,orgprice,sprice
FROM trnsact
WHERE extract(year from saledate)=2004 AND extract(MONTH from saledate)=8 AND extract(DAY from saledate)< 10
ORDER BY orgprice DESC, sprice DESC;
  • What brand names with the word/letters “liz” in them exist in the Dillard’s database?
In [ ]:
SELECT DISTINCT brand
FROM skuinfo
WHERE brand LIKE '%liz%'
  • What is the lowest store number of all the stores in the STORE_MSA table that are in the city of “little rock”,”Memphis”, or “tulsa”?
In [ ]:
SELECT TOP 10 store
FROM store_msa
WHERE city IN ('LITTLE ROCK','MEMPHIS','TULSA')
ORDER BY STORE ASC;
  • What is the sku number of the item in the Dillard’s database that had the highest original sales price?
In [ ]:
SELECT top 1 sku
FROM trnsact
ORDER BY orgprice DESC 
  • What was the highest original price in the Dillard’s database of the item with SKU 3631365?
In [ ]:
SELECT TOP 1 orgprice
from trnsact
WHERE SKU=3631365
ORDER BY orgprice DESC
  • What is the color of the Liz Claiborne brand item with the highest SKU # in the Dillard’s database (the Liz Claiborne brand is abbreviated “LIZ CLAI” in the Dillard’s database)?
In [ ]:
SELECT TOP 1 color
from skuinfo
where brand='LIZ CLAI'
ORDER BY sku DESC
  • How many Dillard's departments start with the letter "e"?
In [ ]:
SELECT COUNT(DISTINCT deptdesc)
FROM deptinfo
WHERE deptdesc LIKE 'e%'
  • According to the strinfo table, in how many states within the United States are Dillard’s stores located? (HINT: the bottom of the SQL scratchpad reports the number of rows in your output)
In [ ]:
SELECT COUNT(DISTINCT state)
FROM strinfo
  • What was the date of the earliest sale in the database where the sale price of the item did not equal the original price of the item, and what was the largest margin (original price minus sale price) of an item sold on that earliest date?
In [ ]:
SELECT TOP 10 orgprice, sprice, orgprice-sprice AS margin, saledate

FROM trnsact

WHERE orgprice<>sprice

ORDER BY saledate ASC, margin DESC




comments powered by Disqus