Group Functions in Oracle
Group functions execute on a group of rows and return one result per group. Group functions ignore null values.
Commonly used group functions:
- COUNT
- SUM
- AVG
- MAX
- MIN
COUNT:
COUNT function has 3 variants.
- COUNT(*)
- COUNT(column or expression)
- COUNT(DISTINCT)
COUNT(*):
COUNT(*) returns the total no of rows from a table including duplicate rows and null values in any of the columns in the table.
SELECT * FROM EMP_SELECT;
Creating duplicate rows:
Insert into EMP_SELECT (EMPLOYEE_ID,FIRST_NAME,LAST_NAME,JOB_ID,HIRE_DATE,SALARY,DEPARTMENT_ID) values (100,'Steven','King','AD_PRES',to_date('17-JUN-2003','DD-MON-YYYY'),24000,90); SELECT * FROM EMP_SELECT ORDER BY EMPLOYEE_ID;
SELECT COUNT(*) FROM EMP_SELECT ORDER BY EMPLOYEE_ID;
COUNT(column or expression):
COUNT function operates on a column or expression of a group of rows or table and returns the total no of not null values of that column or expression.
SELECT COUNT(*) FROM EMP_SELECT WHERE DEPARTMENT_ID IS NULL;
SELECT COUNT(*) FROM EMP_SELECT WHERE DEPARTMENT_ID IS NOT NULL;
So we can see that the count of DEPARTMENT_ID is 109 where DEPARTMENT_ID is not null.
SELECT COUNT(DEPARTMENT_ID) COUNT_DEPT FROM EMP_SELECT;
So the COUNT(Column or Expression) is ignoring the null values.
COUNT(DISTINCT):
COUNT(DISTINCT Column) ignores the duplicate values and null values.
SELECT DISTINCT DEPARTMENT_ID FROM EMP_SELECT ORDER BY DEPARTMENT_ID;
So we can check that there are 11 distinct DEPARTMENT_ID are available excluding null values.
SELECT COUNT(DISTINCT DEPARTMENT_ID) COUNT_DISTINCT_DEPARTMENT_ID FROM EMP_SELECT ORDER BY DEPARTMENT_ID;
COUNT DISTINCT ignores the null values as well.
DECODE WITH COUNT:
We want to extract the number of employees working in JOB_ID IT_PROG, ST_MAN and PU_CLERK.
SELECT * FROM EMP_SELECT WHERE JOB_ID = 'IT_PROG';
SELECT * FROM EMP_SELECT WHERE JOB_ID = 'ST_MAN';
SELECT * FROM EMP_SELECT WHERE JOB_ID = 'PU_CLERK';
SELECT COUNT(DECODE(JOB_ID, 'IT_PROG', JOB_ID, NULL)) COUNT_IT_PROG,
COUNT(DECODE(JOB_ID, 'ST_MAN', JOB_ID, NULL)) COUNT_ST_MAN,
COUNT(DECODE(JOB_ID, 'PU_CLERK', JOB_ID, NULL)) COUNT_PU_CLERK
FROM EMP_SELECT; CASE WITH COUNT:
SELECT COUNT(CASE WHEN JOB_ID = 'IT_PROG' THEN JOB_ID ELSE NULL END) COUNT_IT_PROG,
COUNT(CASE WHEN JOB_ID = 'ST_MAN' THEN JOB_ID ELSE NULL END) COUNT_ST_MAN,
COUNT(CASE WHEN JOB_ID = 'PU_CLERK' THEN JOB_ID ELSE NULL END) COUNT_PU_CLERK
FROM EMP_SELECT; SUM:
SUM function operates on a group of data or entire table and returns the sum of a column or expression excluding null values.
SELECT EMPLOYEE_ID,
FIRST_NAME,
LAST_NAME,
JOB_ID,
HIRE_DATE,
SALARY,
DEPARTMENT_ID
FROM EMP_SELECT
WHERE DEPARTMENT_ID = 30; SELECT SUM(SALARY) TOTAL_SAL FROM EMP_SELECT WHERE DEPARTMENT_ID = 30;
The above query returns the total salary for DEPARTMENT_ID 30.
SELECT SUM(SALARY) TOTAL_SAL FROM EMP_SELECT;
The above query returns the total salary for entire EMP_SELECT table.
DECODE WITH SUM:
SELECT * FROM EMP_SELECT WHERE JOB_ID = 'IT_PROG';
SELECT * FROM EMP_SELECT WHERE JOB_ID = 'ST_MAN';
SELECT SUM(DECODE(JOB_ID, 'IT_PROG', SALARY, 0)) TOTAL_SAL_IT_PROG,
SUM(DECODE(JOB_ID, 'ST_MAN', SALARY, 0)) TOTAL_SAL_ST_MAN,
SUM(DECODE(DEPARTMENT_ID, 30, SALARY, 0)) TOTAL_SAL_DEPT_30
FROM EMP_SELECT; CASE WITH SUM:
SELECT SUM(CASE WHEN JOB_ID = 'IT_PROG' THEN SALARY ELSE 0 END) TOTAL_SAL_IT_PROG,
SUM(CASE WHEN JOB_ID = 'ST_MAN' THEN SALARY ELSE 0 END) TOTAL_SAL_ST_MAN,
SUM(CASE WHEN DEPARTMENT_ID = 30 THEN SALARY ELSE 0 END) TOTAL_SAL_DEPT_30
FROM EMP_SELECT; AVG:
AVG function operates on a group of data or entire table and returns the average of a column or expression excluding null values.
SELECT EMPLOYEE_ID,
FIRST_NAME,
LAST_NAME,
JOB_ID,
HIRE_DATE,
SALARY,
DEPARTMENT_ID
FROM EMP_SELECT
WHERE DEPARTMENT_ID = 30; SELECT SUM(SALARY) TOTAL_SAL,
COUNT(EMPLOYEE_ID) TOTAL_EMPLOYEES,
AVG(SALARY) AVG_SAL,
SUM(SALARY)/COUNT(EMPLOYEE_ID) CALCULATED_AVG_SAL
FROM EMP_SELECT
WHERE DEPARTMENT_ID = 30; In the above example we can see that AVG_SAL is returning average SALARY of DEPARTMENT_ID 30.
DECODE WITH AVG:
SELECT ROUND(AVG(DECODE(JOB_ID, 'IT_PROG', SALARY, NULL)), 0) AVG_SAL_IT_PROG,
ROUND(AVG(DECODE(JOB_ID, 'ST_MAN', SALARY, NULL)), 0) AVG_SAL_ST_MAN,
ROUND(AVG(DECODE(DEPARTMENT_ID, 30, SALARY, NULL)), 0) AVG_SAL_DEPT_30
FROM EMP_SELECT; In the above example, we have used NULL to bypass the other rows apart from those mentioned in DECODE clause.
If we use 0 instead of NULL then the entire result will be calculated on complete data set instead of the filter condition mentioned in DECODE clause.
SELECT ROUND(AVG(DECODE(JOB_ID, 'IT_PROG', SALARY, 0)), 0) AVG_SAL_IT_PROG,
ROUND(AVG(DECODE(JOB_ID, 'ST_MAN', SALARY, 0)), 0) AVG_SAL_ST_MAN,
ROUND(AVG(DECODE(DEPARTMENT_ID, 30, SALARY, 0)), 0) AVG_SAL_DEPT_30
FROM EMP_SELECT; CASE WITH AVG:
SELECT ROUND(AVG(CASE WHEN JOB_ID = 'IT_PROG' THEN SALARY ELSE NULL END), 0) AVG_SAL_IT_PROG,
ROUND(AVG(CASE WHEN JOB_ID = 'ST_MAN' THEN SALARY ELSE NULL END), 0) AVG_SAL_ST_MAN,
ROUND(AVG(CASE WHEN DEPARTMENT_ID = 30 THEN SALARY ELSE NULL END), 0) AVG_SAL_DEPT_30
FROM EMP_SELECT; In the above example, we have used NULL to bypass the other rows apart from those mentioned in CASE.
If we use 0 instead of NULL then the entire result will be calculated on complete data set instead of the filter condition mentioned in CASE.
SELECT ROUND(AVG(CASE WHEN JOB_ID = 'IT_PROG' THEN SALARY ELSE 0 END), 0) AVG_SAL_IT_PROG,
ROUND(AVG(CASE WHEN JOB_ID = 'ST_MAN' THEN SALARY ELSE 0 END), 0) AVG_SAL_ST_MAN,
ROUND(AVG(CASE WHEN DEPARTMENT_ID = 30 THEN SALARY ELSE 0 END), 0) AVG_SAL_DEPT_30
FROM EMP_SELECT; MIN:
MIN function operates on a column or expression of a group of data or entire table and returns the minimum value excluding null. This function can be used with any data type.
MIN with Character data type:
For character data type MIN function returns the alphabetized first data in the list.
SELECT EMPLOYEE_ID,
FIRST_NAME,
LAST_NAME,
JOB_ID,
HIRE_DATE,
SALARY,
DEPARTMENT_ID
FROM EMP_SELECT
WHERE DEPARTMENT_ID = 30; If we apply MIN function on FIRST_NAME column of above group of data then it will return the name Alexander.
SELECT MIN(FIRST_NAME) MIN_FIRST_NAME FROM EMP_SELECT WHERE DEPARTMENT_ID = 30;
SELECT MIN(LAST_NAME) MIN_LAST_NAME FROM EMP_SELECT WHERE DEPARTMENT_ID = 30;
MIN with Number data type:
SELECT MIN(SALARY) MIN_SALARY FROM EMP_SELECT WHERE DEPARTMENT_ID = 30;
MIN with Date data type:
SELECT MIN(HIRE_DATE) MIN_HIRE_DATE FROM EMP_SELECT WHERE DEPARTMENT_ID = 30;
DECODE WITH MIN:
SELECT MIN(DECODE(DEPARTMENT_ID, 30, FIRST_NAME, NULL)) MIN_FIRST_NAME,
MIN(DECODE(DEPARTMENT_ID, 30, LAST_NAME, NULL)) MIN_LAST_NAME,
MIN(DECODE(DEPARTMENT_ID, 30, SALARY, NULL)) MIN_SALARY,
MIN(DECODE(DEPARTMENT_ID, 30, HIRE_DATE, NULL)) MIN_HIRE_DATE
FROM EMP_SELECT; CASE WITH MIN:
SELECT MIN(CASE WHEN DEPARTMENT_ID = 30 THEN FIRST_NAME ELSE NULL END) MIN_FIRST_NAME,
MIN(CASE WHEN DEPARTMENT_ID = 30 THEN LAST_NAME ELSE NULL END) MIN_LAST_NAME,
MIN(CASE WHEN DEPARTMENT_ID = 30 THEN SALARY ELSE NULL END) MIN_SALARY,
MIN(CASE WHEN DEPARTMENT_ID = 30 THEN HIRE_DATE ELSE NULL END) MIN_HIRE_DATE
FROM EMP_SELECT; MAX:
MAX function operates on a column or expression of a group of data or entire table and returns the maximum value excluding null. This function can be used with any data type.
MAX with Character data type:
For character data type MAX function returns the alphabetized last data in the list.
SELECT MAX(FIRST_NAME) MAX_FIRST_NAME,
MAX(LAST_NAME) MAX_LAST_NAME
FROM EMP_SELECT
WHERE DEPARTMENT_ID = 30; MAX with Number data type:
SELECT MAX(SALARY) MAX_SALARY FROM EMP_SELECT WHERE DEPARTMENT_ID = 30;
MAX with Date data type:
SELECT MAX(HIRE_DATE) MAX_HIRE_DATE FROM EMP_SELECT WHERE DEPARTMENT_ID = 30;
DECODE WITH MAX:
SELECT MAX(DECODE(DEPARTMENT_ID, 30, FIRST_NAME, NULL)) MAX_FIRST_NAME,
MAX(DECODE(DEPARTMENT_ID, 30, LAST_NAME, NULL)) MAX_LAST_NAME,
MAX(DECODE(DEPARTMENT_ID, 30, SALARY, NULL)) MAX_SALARY,
MAX(DECODE(DEPARTMENT_ID, 30, HIRE_DATE, NULL)) MAX_HIRE_DATE
FROM EMP_SELECT; CASE WITH MAX:
SELECT MAX(CASE WHEN DEPARTMENT_ID = 30 THEN FIRST_NAME ELSE NULL END) MAX_FIRST_NAME,
MAX(CASE WHEN DEPARTMENT_ID = 30 THEN LAST_NAME ELSE NULL END) MAX_LAST_NAME,
MAX(CASE WHEN DEPARTMENT_ID = 30 THEN SALARY ELSE NULL END) MAX_SALARY,
MAX(CASE WHEN DEPARTMENT_ID = 30 THEN HIRE_DATE ELSE NULL END) MAX_HIRE_DATE
FROM EMP_SELECT; GROUP BY CLAUSE:
SINGLE COLUMN GROUP BY CLAUSE:
GROUP BY clause is used with GROUP function to show the result group wise.
SELECT DEPARTMENT_ID,
SUM(SALARY) TOTAL_SALARY
FROM EMP_SELECT
GROUP BY DEPARTMENT_ID
ORDER BY DEPARTMENT_ID; SELECT DEPARTMENT_ID,
MAX(SALARY) MAX_SALARY,
ROUND(AVG(SALARY), 0) AVG_SALARY,
COUNT(EMPLOYEE_ID) NO_OF_EMPLOYEES
FROM EMP_SELECT
GROUP BY DEPARTMENT_ID
ORDER BY DEPARTMENT_ID; In the above example we have grouped the data by using DEPARTMENT_ID column.
MULTIPLE COLUMN GROUP BY CLAUSE:
SELECT DEPARTMENT_ID,
JOB_ID,
MAX(SALARY) MAX_SALARY,
ROUND(AVG(SALARY), 0) AVG_SALARY,
COUNT(EMPLOYEE_ID) NO_OF_EMPLOYEES
FROM EMP_SELECT
WHERE DEPARTMENT_ID IN (10, 20, 30)
GROUP BY DEPARTMENT_ID, JOB_ID
ORDER BY DEPARTMENT_ID; In the above example, data is grouped by DEPARTMENT_ID first and then JOB_ID of that department.
HAVING CLAUSE:
HAVING clause is used to add further filter criteria in the GROUP BY result.
We cannot use any WHERE clause condition after using GROUP BY clause in a SELECT statement. HAVING is the only option to do that.
With the reference of previous example, if we want to show group result where number of employees is 1 then the query will become with HAVING clause:
SELECT DEPARTMENT_ID,
JOB_ID,
MAX(SALARY) MAX_SALARY,
ROUND(AVG(SALARY), 0) AVG_SALARY,
COUNT(EMPLOYEE_ID) NO_OF_EMPLOYEES
FROM EMP_SELECT
WHERE DEPARTMENT_ID IN (10, 20, 30)
GROUP BY DEPARTMENT_ID, JOB_ID
HAVING COUNT(EMPLOYEE_ID) = 1
ORDER BY DEPARTMENT_ID; SELECT DEPARTMENT_ID,
JOB_ID,
MAX(SALARY) MAX_SALARY,
ROUND(AVG(SALARY), 0) AVG_SALARY,
COUNT(EMPLOYEE_ID) NO_OF_EMPLOYEES
FROM EMP_SELECT
WHERE DEPARTMENT_ID IN (10, 20, 30)
GROUP BY DEPARTMENT_ID, JOB_ID
HAVING (COUNT(EMPLOYEE_ID) = 1 AND SUM(SALARY) > 4400)
ORDER BY DEPARTMENT_ID; RELATED TOPICS: