SQL (Structured Query Language) is a standard language for managing and manipulating relational databases. It allows you to create, retrieve, update, and delete data, as well as define database structures and control access.
Before diving into SQL commands, let's understand some fundamental database concepts:
-- Example: A simple 'Students' table structure
CREATE TABLE Students (
StudentID INT PRIMARY KEY,
FirstName VARCHAR(50),
LastName VARCHAR(50),
Age INT,
City VARCHAR(50)
);
DDL commands are used to define, modify, and delete database structures.
Used to create a new database.
CREATE DATABASE MySchoolDB;
Used to create a new table in the database.
USE MySchoolDB; -- Select the database (syntax might vary by DB system)
CREATE TABLE Courses (
CourseID INT PRIMARY KEY,
CourseName VARCHAR(100) NOT NULL,
Credits INT
);
CREATE TABLE Enrollments (
EnrollmentID INT PRIMARY KEY AUTO_INCREMENT, -- AUTO_INCREMENT for MySQL/MariaDB
StudentID INT,
CourseID INT,
EnrollmentDate DATE,
FOREIGN KEY (StudentID) REFERENCES Students(StudentID),
FOREIGN KEY (CourseID) REFERENCES Courses(CourseID)
);
Used to modify an existing table's structure.
-- Add a new column
ALTER TABLE Students
ADD Email VARCHAR(100);
-- Modify an existing column
ALTER TABLE Students
MODIFY COLUMN Age SMALLINT; -- Syntax varies (e.g., ALTER COLUMN in SQL Server)
-- Drop a column
ALTER TABLE Students
DROP COLUMN City;
Used to delete an existing database.
DROP DATABASE MySchoolDB;
Used to delete an existing table.
DROP TABLE Courses;
Removes all rows from a table, but keeps the table structure. It's faster than DELETE for all rows and resets auto-increment counters.
TRUNCATE TABLE Enrollments;
DML commands are used to manipulate data within database objects.
Used to add new rows of data into a table.
-- Insert all column values (order must match table definition)
INSERT INTO Students (StudentID, FirstName, LastName, Age, Email)
VALUES (1, 'Alice', 'Smith', 20, 'alice.s@example.com');
-- Insert specific column values
INSERT INTO Courses (CourseID, CourseName, Credits)
VALUES (101, 'Introduction to SQL', 3);
-- Multiple rows (syntax varies slightly by DB)
INSERT INTO Students (StudentID, FirstName, LastName, Age, Email) VALUES
(2, 'Bob', 'Johnson', 22, 'bob.j@example.com'),
(3, 'Charlie', 'Brown', 21, 'charlie.b@example.com');
Used to retrieve data from one or more tables.
-- Select all columns from a table
SELECT * FROM Students;
-- Select specific columns
SELECT FirstName, LastName, Age FROM Students;
-- Select with a WHERE clause (filtering rows)
SELECT * FROM Students
WHERE Age > 20;
SELECT CourseName FROM Courses
WHERE Credits = 3;
Used to modify existing data in a table.
-- Update a single column for a specific row
UPDATE Students
SET Email = 'alice.smith@example.com'
WHERE StudentID = 1;
-- Update multiple columns
UPDATE Students
SET Age = 21, City = 'New York'
WHERE StudentID = 1;
-- Update all rows (USE WITH CAUTION!)
UPDATE Students
SET City = 'Unknown';
Used to delete rows from a table.
-- Delete specific rows
DELETE FROM Students
WHERE StudentID = 3;
-- Delete all rows (USE WITH CAUTION!)
DELETE FROM Students;
SQL provides various operators for comparisons, logic, and arithmetic.
SELECT * FROM Students WHERE Age = 20;
SELECT * FROM Students WHERE Age <> 20; -- Not equal to
SELECT * FROM Students WHERE Age >= 21;
SELECT * FROM Students WHERE Age > 20 AND City = 'New York';
SELECT * FROM Students WHERE Age < 20 OR City = 'Boston';
SELECT * FROM Students WHERE NOT Age = 20;
-- LIKE (pattern matching)
SELECT * FROM Students WHERE FirstName LIKE 'A%'; -- Starts with 'A'
SELECT * FROM Students WHERE LastName LIKE '%son'; -- Ends with 'son'
SELECT * FROM Students WHERE FirstName LIKE '%li%'; -- Contains 'li'
-- IN (match any value in a list)
SELECT * FROM Students WHERE City IN ('New York', 'Boston', 'Chicago');
-- BETWEEN (range)
SELECT * FROM Students WHERE Age BETWEEN 20 AND 25;
-- IS NULL / IS NOT NULL
SELECT * FROM Students WHERE Email IS NULL;
SELECT * FROM Students WHERE Email IS NOT NULL;
SQL offers various built-in functions for data manipulation and aggregation.
SELECT COUNT(*) FROM Students; -- Total number of students
SELECT SUM(Credits) FROM Courses; -- Total credits of all courses
SELECT AVG(Age) FROM Students; -- Average age of students
SELECT MIN(Age) FROM Students; -- Minimum age
SELECT MAX(Age) FROM Students; -- Maximum age
SELECT UPPER(FirstName) FROM Students; -- Convert to uppercase
SELECT LOWER(CourseName) FROM Courses; -- Convert to lowercase
SELECT LENGTH(Email) FROM Students; -- Length of email string (or LEN in SQL Server)
SELECT ROUND(3.14159, 2); -- Round to 2 decimal places
SELECT NOW(); -- Current date and time (or GETDATE(), CURRENT_TIMESTAMP)
GROUP BY
groups rows that have the same values in specified columns into a summary row. HAVING
is used to filter groups based on a condition.
-- Count students by age
SELECT Age, COUNT(*) AS NumberOfStudents
FROM Students
GROUP BY Age;
-- Count students by city, only for cities with more than 1 student
SELECT City, COUNT(*) AS NumberOfStudents
FROM Students
GROUP BY City
HAVING COUNT(*) > 1;
Sorts the result set of a query.
SELECT * FROM Students
ORDER BY LastName ASC; -- Ascending order (default)
SELECT * FROM Students
ORDER BY Age DESC, FirstName ASC; -- Sort by Age descending, then FirstName ascending
Limits the number of rows returned by a query and can be used for pagination.
-- Limit to first 5 rows (MySQL/PostgreSQL)
SELECT * FROM Students
LIMIT 5;
-- Skip first 5 rows, then get next 10 rows (MySQL/PostgreSQL)
SELECT * FROM Students
LIMIT 10 OFFSET 5;
-- Top 5 rows (SQL Server)
SELECT TOP 5 * FROM Students;
Joins are used to combine rows from two or more tables based on a related column between them.
Returns only the rows that have matching values in both tables.
SELECT S.FirstName, S.LastName, C.CourseName
FROM Students AS S
INNER JOIN Enrollments AS E ON S.StudentID = E.StudentID
INNER JOIN Courses AS C ON E.CourseID = C.CourseID;
Returns all rows from the left table, and the matching rows from the right table. If there's no match, NULLs are returned for the right table's columns.
-- Get all students and the courses they are enrolled in (if any)
SELECT S.FirstName, S.LastName, C.CourseName
FROM Students AS S
LEFT JOIN Enrollments AS E ON S.StudentID = E.StudentID
LEFT JOIN Courses AS C ON E.CourseID = C.CourseID;
Returns all rows from the right table, and the matching rows from the left table. If there's no match, NULLs are returned for the left table's columns.
-- Get all courses and the students enrolled in them (if any)
SELECT C.CourseName, S.FirstName, S.LastName
FROM Courses AS C
RIGHT JOIN Enrollments AS E ON C.CourseID = E.CourseID
RIGHT JOIN Students AS S ON E.StudentID = S.StudentID;
Returns all rows when there is a match in one of the tables. If there is no match, the rows from the non-matching side will have NULLs.
-- (Note: FULL JOIN is not supported in MySQL prior to 8.0, often simulated with UNION of LEFT and RIGHT JOINs)
SELECT S.FirstName, S.LastName, C.CourseName
FROM Students AS S
FULL OUTER JOIN Enrollments AS E ON S.StudentID = E.StudentID
FULL OUTER JOIN Courses AS C ON E.CourseID = C.CourseID;
Returns the Cartesian product of the rows from the tables in the join. Each row from the first table is combined with each row from the second table.
SELECT S.FirstName, C.CourseName
FROM Students S
CROSS JOIN Courses C;
A subquery is a query embedded inside another SQL query. It can be used in SELECT
, INSERT
, UPDATE
, and DELETE
statements, as well as with operators like IN
, EXISTS
, ANY
/ALL
.
-- Select students whose age is greater than the average age
SELECT FirstName, LastName, Age
FROM Students
WHERE Age > (SELECT AVG(Age) FROM Students);
-- Select courses that have no enrollments
SELECT CourseName
FROM Courses
WHERE CourseID NOT IN (SELECT DISTINCT CourseID FROM Enrollments);
-- Using a subquery in SELECT clause (correlated subquery)
SELECT
FirstName,
LastName,
(SELECT COUNT(*) FROM Enrollments WHERE Enrollments.StudentID = Students.StudentID) AS NumberOfCourses
FROM Students;
A view is a virtual table based on the result-set of an SQL query. It does not store data itself but rather a query that is executed when the view is accessed.
-- Create a view of active students
CREATE VIEW ActiveStudents AS
SELECT StudentID, FirstName, LastName, Email
FROM Students
WHERE Age > 18;
-- Query the view
SELECT * FROM ActiveStudents;
-- Drop a view
DROP VIEW ActiveStudents;
Indexes are special lookup tables that the database search engine can use to speed up data retrieval. They are like an index in a book.
-- Create an index on a column
CREATE INDEX idx_lastname
ON Students (LastName);
-- Create a unique index (ensures no duplicate values in the indexed column(s))
CREATE UNIQUE INDEX idx_email
ON Students (Email);
-- Drop an index
DROP INDEX idx_lastname ON Students; -- Syntax varies (e.g., DROP INDEX Students.idx_lastname in SQL Server)
A transaction is a sequence of operations performed as a single logical unit of work. Transactions ensure data integrity and reliability, typically adhering to ACID properties (Atomicity, Consistency, Isolation, Durability).
-- Start a transaction
START TRANSACTION; -- or BEGIN TRANSACTION / BEGIN;
-- Perform operations
UPDATE Accounts SET Balance = Balance - 100 WHERE AccountID = 1;
UPDATE Accounts SET Balance = Balance + 100 WHERE AccountID = 2;
-- Commit the transaction if all operations are successful
COMMIT;
-- Rollback if any operation fails or an error occurs
-- ROLLBACK;
CTEs provide a way to define a temporary named result set that you can reference within a single SQL statement. They improve readability and modularity of complex queries.
WITH StudentCourseCount AS (
SELECT
StudentID,
COUNT(CourseID) AS NumCoursesEnrolled
FROM Enrollments
GROUP BY StudentID
)
SELECT S.FirstName, S.LastName, SCC.NumCoursesEnrolled
FROM Students AS S
JOIN StudentCourseCount AS SCC ON S.StudentID = SCC.StudentID
WHERE SCC.NumCoursesEnrolled > 1;
Perform calculations across a set of table rows that are related to the current row. Unlike aggregate functions, window functions do not collapse rows into a single output row.
-- RANK: Assign a rank within each partition
SELECT
StudentID,
CourseID,
EnrollmentDate,
RANK() OVER (PARTITION BY StudentID ORDER BY EnrollmentDate) AS EnrollmentRank
FROM Enrollments;
-- ROW_NUMBER: Assign a sequential integer to each row within its partition
SELECT
FirstName,
LastName,
Age,
ROW_NUMBER() OVER (ORDER BY Age DESC) AS RankByAge
FROM Students;
A stored procedure is a prepared SQL code that you can save, so the code can be reused over and over again. It can accept parameters and return values.
-- Example (syntax varies greatly by DB system - this is a simplified example)
DELIMITER //
CREATE PROCEDURE GetStudentsByAge (IN minAge INT, IN maxAge INT)
BEGIN
SELECT FirstName, LastName, Age
FROM Students
WHERE Age BETWEEN minAge AND maxAge;
END //
DELIMITER ;
-- Call the stored procedure
CALL GetStudentsByAge(20, 25);
A trigger is a special type of stored procedure that automatically runs when an event occurs in the database (e.g., INSERT, UPDATE, DELETE).
-- Example (syntax varies greatly by DB system)
-- This trigger logs changes to the Students table
DELIMITER //
CREATE TRIGGER AfterUpdateStudentAge
AFTER UPDATE ON Students
FOR EACH ROW
BEGIN
IF OLD.Age <> NEW.Age THEN
INSERT INTO AuditLog (TableName, Operation, OldValue, NewValue, ChangeDate)
VALUES ('Students', 'UPDATE_AGE', OLD.Age, NEW.Age, NOW());
END IF;
END //
DELIMITER ;
Use clear, consistent names for tables, columns, views, and procedures (e.g., PascalCase for tables, camelCase or snake_case for columns).
-- Good
CREATE TABLE CustomerOrders (
OrderID INT PRIMARY KEY,
CustomerID INT,
OrderDate DATE
);
-- Bad
CREATE TABLE co (
id INT,
cid INT,
dt DATE
);
Unless you explicitly intend to modify or delete all rows, always use a WHERE
clause to target specific records.
-- Always confirm your WHERE clause before executing
DELETE FROM Products WHERE ProductID = 123;
UPDATE Users SET Status = 'Inactive' WHERE LastLogin < '2024-01-01';
Organize tables and columns to minimize data redundancy and improve data integrity. This typically involves following database normalization forms (1NF, 2NF, 3NF, BCNF).
Example: Instead of storing customer address repeatedly in an orders table, create a separate 'Addresses' table and link it with a foreign key.
Makes queries more readable, especially when joining multiple tables or tables with long names.
SELECT S.FirstName, C.CourseName
FROM Students AS S
JOIN Enrollments AS E ON S.StudentID = E.StudentID
JOIN Courses AS C ON E.CourseID = C.CourseID;
Explicitly list the columns you need. This improves performance (retrieves less data), makes queries more robust to schema changes, and enhances readability.
-- Good
SELECT ProductID, ProductName, Price FROM Products;
-- Bad
SELECT * FROM Products;
Especially for complex queries, stored procedures, or schema definitions, comments explain the purpose and logic.
-- This query retrieves the top 10 most expensive products
SELECT ProductName, Price
FROM Products
ORDER BY Price DESC
LIMIT 10;
When performing multiple related DML operations that must succeed or fail together, wrap them in a transaction.