Group Functions in Oracle
Group functions execute on a group of rows and return one result per group. Group functions ignore null values.
Commonly used group functions:
- COUNT
- SUM
- AVG
- MAX
- MIN
COUNT:
COUNT function has 3 variants.
- COUNT(*)
- COUNT(column or expression)
- COUNT(DISTINCT)
COUNT(*):
COUNT(*) returns the total no of rows from a table including duplicate rows and null values in any of the columns in the table.
SELECT * FROM EMP_SELECT;
Creating duplicate rows:
Insert into EMP_SELECT (EMPLOYEE_ID,FIRST_NAME,LAST_NAME,JOB_ID,HIRE_DATE,SALARY,DEPARTMENT_ID) values (100,'Steven','King','AD_PRES',to_date('17-JUN-2003','DD-MON-YYYY'),24000,90);
SELECT * FROM EMP_SELECT ORDER BY EMPLOYEE_ID;
SELECT COUNT(*) FROM EMP_SELECT ORDER BY EMPLOYEE_ID;
COUNT(column or expression):
COUNT function operates on a column or expression of a group of rows or table and returns the total no of not null values of that column or expression.
SELECT COUNT(*) FROM EMP_SELECT WHERE DEPARTMENT_ID IS NULL;
SELECT COUNT(*) FROM EMP_SELECT WHERE DEPARTMENT_ID IS NOT NULL;
So we can see that the count of DEPARTMENT_ID is 109 where DEPARTMENT_ID is not null.
SELECT COUNT(DEPARTMENT_ID) COUNT_DEPT FROM EMP_SELECT;
So the COUNT(Column or Expression) is ignoring the null values.
COUNT(DISTINCT):
COUNT(DISTINCT Column) ignores the duplicate values and null values.
SELECT DISTINCT DEPARTMENT_ID FROM EMP_SELECT ORDER BY DEPARTMENT_ID;
So we can check that there are 11 distinct DEPARTMENT_ID are available excluding null values.
SELECT COUNT(DISTINCT DEPARTMENT_ID) COUNT_DISTINCT_DEPARTMENT_ID FROM EMP_SELECT ORDER BY DEPARTMENT_ID;
COUNT DISTINCT ignores the null values as well.
DECODE WITH COUNT:
We want to extract the number of employees working in JOB_ID IT_PROG, ST_MAN and PU_CLERK.
SELECT * FROM EMP_SELECT WHERE JOB_ID = 'IT_PROG';
SELECT * FROM EMP_SELECT WHERE JOB_ID = 'ST_MAN';
SELECT * FROM EMP_SELECT WHERE JOB_ID = 'PU_CLERK';
SELECT COUNT(DECODE(JOB_ID, 'IT_PROG', JOB_ID, NULL)) COUNT_IT_PROG, COUNT(DECODE(JOB_ID, 'ST_MAN', JOB_ID, NULL)) COUNT_ST_MAN, COUNT(DECODE(JOB_ID, 'PU_CLERK', JOB_ID, NULL)) COUNT_PU_CLERK FROM EMP_SELECT;
CASE WITH COUNT:
SELECT COUNT(CASE WHEN JOB_ID = 'IT_PROG' THEN JOB_ID ELSE NULL END) COUNT_IT_PROG, COUNT(CASE WHEN JOB_ID = 'ST_MAN' THEN JOB_ID ELSE NULL END) COUNT_ST_MAN, COUNT(CASE WHEN JOB_ID = 'PU_CLERK' THEN JOB_ID ELSE NULL END) COUNT_PU_CLERK FROM EMP_SELECT;
SUM:
SUM function operates on a group of data or entire table and returns the sum of a column or expression excluding null values.
SELECT EMPLOYEE_ID, FIRST_NAME, LAST_NAME, JOB_ID, HIRE_DATE, SALARY, DEPARTMENT_ID FROM EMP_SELECT WHERE DEPARTMENT_ID = 30;
SELECT SUM(SALARY) TOTAL_SAL FROM EMP_SELECT WHERE DEPARTMENT_ID = 30;
The above query returns the total salary for DEPARTMENT_ID 30.
SELECT SUM(SALARY) TOTAL_SAL FROM EMP_SELECT;
The above query returns the total salary for entire EMP_SELECT table.
DECODE WITH SUM:
SELECT * FROM EMP_SELECT WHERE JOB_ID = 'IT_PROG';
SELECT * FROM EMP_SELECT WHERE JOB_ID = 'ST_MAN';
SELECT SUM(DECODE(JOB_ID, 'IT_PROG', SALARY, 0)) TOTAL_SAL_IT_PROG, SUM(DECODE(JOB_ID, 'ST_MAN', SALARY, 0)) TOTAL_SAL_ST_MAN, SUM(DECODE(DEPARTMENT_ID, 30, SALARY, 0)) TOTAL_SAL_DEPT_30 FROM EMP_SELECT;
CASE WITH SUM:
SELECT SUM(CASE WHEN JOB_ID = 'IT_PROG' THEN SALARY ELSE 0 END) TOTAL_SAL_IT_PROG, SUM(CASE WHEN JOB_ID = 'ST_MAN' THEN SALARY ELSE 0 END) TOTAL_SAL_ST_MAN, SUM(CASE WHEN DEPARTMENT_ID = 30 THEN SALARY ELSE 0 END) TOTAL_SAL_DEPT_30 FROM EMP_SELECT;
AVG:
AVG function operates on a group of data or entire table and returns the average of a column or expression excluding null values.
SELECT EMPLOYEE_ID, FIRST_NAME, LAST_NAME, JOB_ID, HIRE_DATE, SALARY, DEPARTMENT_ID FROM EMP_SELECT WHERE DEPARTMENT_ID = 30;
SELECT SUM(SALARY) TOTAL_SAL, COUNT(EMPLOYEE_ID) TOTAL_EMPLOYEES, AVG(SALARY) AVG_SAL, SUM(SALARY)/COUNT(EMPLOYEE_ID) CALCULATED_AVG_SAL FROM EMP_SELECT WHERE DEPARTMENT_ID = 30;
In the above example we can see that AVG_SAL is returning average SALARY of DEPARTMENT_ID 30.
DECODE WITH AVG:
SELECT ROUND(AVG(DECODE(JOB_ID, 'IT_PROG', SALARY, NULL)), 0) AVG_SAL_IT_PROG, ROUND(AVG(DECODE(JOB_ID, 'ST_MAN', SALARY, NULL)), 0) AVG_SAL_ST_MAN, ROUND(AVG(DECODE(DEPARTMENT_ID, 30, SALARY, NULL)), 0) AVG_SAL_DEPT_30 FROM EMP_SELECT;
In the above example, we have used NULL to bypass the other rows apart from those mentioned in DECODE clause.
If we use 0 instead of NULL then the entire result will be calculated on complete data set instead of the filter condition mentioned in DECODE clause.
SELECT ROUND(AVG(DECODE(JOB_ID, 'IT_PROG', SALARY, 0)), 0) AVG_SAL_IT_PROG, ROUND(AVG(DECODE(JOB_ID, 'ST_MAN', SALARY, 0)), 0) AVG_SAL_ST_MAN, ROUND(AVG(DECODE(DEPARTMENT_ID, 30, SALARY, 0)), 0) AVG_SAL_DEPT_30 FROM EMP_SELECT;
CASE WITH AVG:
SELECT ROUND(AVG(CASE WHEN JOB_ID = 'IT_PROG' THEN SALARY ELSE NULL END), 0) AVG_SAL_IT_PROG, ROUND(AVG(CASE WHEN JOB_ID = 'ST_MAN' THEN SALARY ELSE NULL END), 0) AVG_SAL_ST_MAN, ROUND(AVG(CASE WHEN DEPARTMENT_ID = 30 THEN SALARY ELSE NULL END), 0) AVG_SAL_DEPT_30 FROM EMP_SELECT;
In the above example, we have used NULL to bypass the other rows apart from those mentioned in CASE.
If we use 0 instead of NULL then the entire result will be calculated on complete data set instead of the filter condition mentioned in CASE.
SELECT ROUND(AVG(CASE WHEN JOB_ID = 'IT_PROG' THEN SALARY ELSE 0 END), 0) AVG_SAL_IT_PROG, ROUND(AVG(CASE WHEN JOB_ID = 'ST_MAN' THEN SALARY ELSE 0 END), 0) AVG_SAL_ST_MAN, ROUND(AVG(CASE WHEN DEPARTMENT_ID = 30 THEN SALARY ELSE 0 END), 0) AVG_SAL_DEPT_30 FROM EMP_SELECT;
MIN:
MIN function operates on a column or expression of a group of data or entire table and returns the minimum value excluding null. This function can be used with any data type.
MIN with Character data type:
For character data type MIN function returns the alphabetized first data in the list.
SELECT EMPLOYEE_ID, FIRST_NAME, LAST_NAME, JOB_ID, HIRE_DATE, SALARY, DEPARTMENT_ID FROM EMP_SELECT WHERE DEPARTMENT_ID = 30;
If we apply MIN function on FIRST_NAME column of above group of data then it will return the name Alexander.
SELECT MIN(FIRST_NAME) MIN_FIRST_NAME FROM EMP_SELECT WHERE DEPARTMENT_ID = 30;
SELECT MIN(LAST_NAME) MIN_LAST_NAME FROM EMP_SELECT WHERE DEPARTMENT_ID = 30;
MIN with Number data type:
SELECT MIN(SALARY) MIN_SALARY FROM EMP_SELECT WHERE DEPARTMENT_ID = 30;
MIN with Date data type:
SELECT MIN(HIRE_DATE) MIN_HIRE_DATE FROM EMP_SELECT WHERE DEPARTMENT_ID = 30;
DECODE WITH MIN:
SELECT MIN(DECODE(DEPARTMENT_ID, 30, FIRST_NAME, NULL)) MIN_FIRST_NAME, MIN(DECODE(DEPARTMENT_ID, 30, LAST_NAME, NULL)) MIN_LAST_NAME, MIN(DECODE(DEPARTMENT_ID, 30, SALARY, NULL)) MIN_SALARY, MIN(DECODE(DEPARTMENT_ID, 30, HIRE_DATE, NULL)) MIN_HIRE_DATE FROM EMP_SELECT;
CASE WITH MIN:
SELECT MIN(CASE WHEN DEPARTMENT_ID = 30 THEN FIRST_NAME ELSE NULL END) MIN_FIRST_NAME, MIN(CASE WHEN DEPARTMENT_ID = 30 THEN LAST_NAME ELSE NULL END) MIN_LAST_NAME, MIN(CASE WHEN DEPARTMENT_ID = 30 THEN SALARY ELSE NULL END) MIN_SALARY, MIN(CASE WHEN DEPARTMENT_ID = 30 THEN HIRE_DATE ELSE NULL END) MIN_HIRE_DATE FROM EMP_SELECT;
MAX:
MAX function operates on a column or expression of a group of data or entire table and returns the maximum value excluding null. This function can be used with any data type.
MAX with Character data type:
For character data type MAX function returns the alphabetized last data in the list.
SELECT MAX(FIRST_NAME) MAX_FIRST_NAME, MAX(LAST_NAME) MAX_LAST_NAME FROM EMP_SELECT WHERE DEPARTMENT_ID = 30;
MAX with Number data type:
SELECT MAX(SALARY) MAX_SALARY FROM EMP_SELECT WHERE DEPARTMENT_ID = 30;
MAX with Date data type:
SELECT MAX(HIRE_DATE) MAX_HIRE_DATE FROM EMP_SELECT WHERE DEPARTMENT_ID = 30;
DECODE WITH MAX:
SELECT MAX(DECODE(DEPARTMENT_ID, 30, FIRST_NAME, NULL)) MAX_FIRST_NAME, MAX(DECODE(DEPARTMENT_ID, 30, LAST_NAME, NULL)) MAX_LAST_NAME, MAX(DECODE(DEPARTMENT_ID, 30, SALARY, NULL)) MAX_SALARY, MAX(DECODE(DEPARTMENT_ID, 30, HIRE_DATE, NULL)) MAX_HIRE_DATE FROM EMP_SELECT;
CASE WITH MAX:
SELECT MAX(CASE WHEN DEPARTMENT_ID = 30 THEN FIRST_NAME ELSE NULL END) MAX_FIRST_NAME, MAX(CASE WHEN DEPARTMENT_ID = 30 THEN LAST_NAME ELSE NULL END) MAX_LAST_NAME, MAX(CASE WHEN DEPARTMENT_ID = 30 THEN SALARY ELSE NULL END) MAX_SALARY, MAX(CASE WHEN DEPARTMENT_ID = 30 THEN HIRE_DATE ELSE NULL END) MAX_HIRE_DATE FROM EMP_SELECT;
GROUP BY CLAUSE:
SINGLE COLUMN GROUP BY CLAUSE:
GROUP BY clause is used with GROUP function to show the result group wise.
SELECT DEPARTMENT_ID, SUM(SALARY) TOTAL_SALARY FROM EMP_SELECT GROUP BY DEPARTMENT_ID ORDER BY DEPARTMENT_ID;
SELECT DEPARTMENT_ID, MAX(SALARY) MAX_SALARY, ROUND(AVG(SALARY), 0) AVG_SALARY, COUNT(EMPLOYEE_ID) NO_OF_EMPLOYEES FROM EMP_SELECT GROUP BY DEPARTMENT_ID ORDER BY DEPARTMENT_ID;
In the above example we have grouped the data by using DEPARTMENT_ID column.
MULTIPLE COLUMN GROUP BY CLAUSE:
SELECT DEPARTMENT_ID, JOB_ID, MAX(SALARY) MAX_SALARY, ROUND(AVG(SALARY), 0) AVG_SALARY, COUNT(EMPLOYEE_ID) NO_OF_EMPLOYEES FROM EMP_SELECT WHERE DEPARTMENT_ID IN (10, 20, 30) GROUP BY DEPARTMENT_ID, JOB_ID ORDER BY DEPARTMENT_ID;
In the above example, data is grouped by DEPARTMENT_ID first and then JOB_ID of that department.
HAVING CLAUSE:
HAVING clause is used to add further filter criteria in the GROUP BY result.
We cannot use any WHERE clause condition after using GROUP BY clause in a SELECT statement. HAVING is the only option to do that.
With the reference of previous example, if we want to show group result where number of employees is 1 then the query will become with HAVING clause:
SELECT DEPARTMENT_ID, JOB_ID, MAX(SALARY) MAX_SALARY, ROUND(AVG(SALARY), 0) AVG_SALARY, COUNT(EMPLOYEE_ID) NO_OF_EMPLOYEES FROM EMP_SELECT WHERE DEPARTMENT_ID IN (10, 20, 30) GROUP BY DEPARTMENT_ID, JOB_ID HAVING COUNT(EMPLOYEE_ID) = 1 ORDER BY DEPARTMENT_ID;
SELECT DEPARTMENT_ID, JOB_ID, MAX(SALARY) MAX_SALARY, ROUND(AVG(SALARY), 0) AVG_SALARY, COUNT(EMPLOYEE_ID) NO_OF_EMPLOYEES FROM EMP_SELECT WHERE DEPARTMENT_ID IN (10, 20, 30) GROUP BY DEPARTMENT_ID, JOB_ID HAVING (COUNT(EMPLOYEE_ID) = 1 AND SUM(SALARY) > 4400) ORDER BY DEPARTMENT_ID;
RELATED TOPICS: