MySQL - My Structured Query Language

When building any website, we need to know about 2 things:

1. language to code your webpage in = usually php

2. language to build a database of objects = usually mysql (mysql is really a client/server software, which uses SQL language to allow client to talk to server)

Think of database as a big file that stores all your data. Usually this data gets very big (i.e a website that has a list of all items, their price, description, etc). If we store all this data in 1 big file, then just reading or writing to this massive file every second will be very slow. This is because appr location in file needs to be searched for and then data written over there.  Also not feasible, since updates will be happening to this file like every micro second, whenever a customer places an order, or prices change on items, etc. This will make operations to file very slow. So, a special software comes to our rescue called database management software (DBMS). DBMS still store data as one big file (we have options to break it in smaller files), but their real value comes from their ability to be able to do quick searches, maintain transaction atomicity, prevent datafile from getting corrupted, etc.

There are many varieties of DBMS. One of them particularly popular is RDBMS (Relational DBMS). This is called relational database because all the data is stored into different tables and relations are established using primary keys or other keys known as Foreign Keys.

In DBMS terminology, database refers to a collection of tables with relation b/w them. Tables are just a simple data matrix (like excel sheet) with rows and columns.

MySQL: MySQL is a fast, easy-to-use RDBMS. It's open source, and can support large databases (upto millions of Terabytes).. SQL is a data language, and MySQL uses that language to access the database it stores. MySQL cmds can be accessed via functions in PhP, Perl, Python, etc, so it's really easy to work with MySQL, w/o even explicitly learning SQL cmds. MySQL is wriiten as capital M, S, Q, L, and small y, but below we'll just write it as mysql frequently to promote my laziness.

MySQL pgm has client server architecture. This is very common for pgms, which don't always run on local m/c, and can be accessed from anywhere. MySQL pgm has 2 main pgms => server pgm and client pgm. client pgm runs on local m/c when you type "mysql" on unix promot. Server pgm (called mysqld) runs automatically whenever the m/c is started (provided mysql is installed). As you can see server and client pgm can run on any m/c, and don't have to be the same m/c.

MySQL: good doc here: https://www.tutorialspoint.com/mysql/mysql-quick-guide.htm

One other more comprehensive place is: http://www.mysqltutorial.org/

Offical mysql website also has decent tutorials: https://dev.mysql.com/

Install MySQL: The below link on mysql offical website shows steps in detail on how to install it for apt, yum, etc.

https://dev.mysql.com/downloads/

These are the cmds for installation on linux mint (or any linux derivative using apt):

1. sudo apt-get update

2. sudo apt-getinstall mysql-server => This installs mysql server, mysql client and many other files. Here it installs distribution=5.7, which may not be the latest (there are additional cmds listed which will get you the latest). We do not need to install both client and server, as many times, we don't even need client/server on same m/c. So, we can choose and install only the part that we need. For our purpose, we install both client and server on our local m/c.

After installation, mysql server automatically starts. These are the 2 main binaries that get installed:

A. mysqld => server pgm that contains database. It manages access to the actual databases on disk or in the memory. Binary is /usr/sbin/mysqld. When MySQL server starts, it listens for network connections from client programs and manages access to databases on behalf of those clients. If we see list of processes running (using ps -ef), we will see mysqld running

ex:  /usr/sbin/mysqld --daemonize --pid-file=/run/mysqld/mysqld.pid => "ps -ef | grep mysqld" shows mysqld daemon running with shown args. mysqld has many options that can be specified at startup.

B. mysql   => client pgm (cmd line) that connects to mysql server to manipulate the information in the databases that the server manages. Binary is /usr/bin/mysql. This doesn't start automatically. This is the pgm that we invoke all the time (we won't talk about mysql server pgm anymore, as it's not relevant to most users). Whenever we say "mysql", we almost always means mysql client pgm.

After installation, verify that mysql is installed:

mysql --version => shows mysql client version = mysql  Ver 14.14 Distrib 5.7.27, for Linux (x86_64)

mysqld --version => shows mysqld client version = mysqld  Ver 5.7.27-0ubuntu0.18.04.1 for Linux on x86_64 ((Ubuntu))

mysqladmin --version => shows "mysql admin" script version = mysqladmin  Ver 8.42 Distrib 5.7.27, for Linux on x86_64. mysqladmin is additional pgm included with mysql package to simplify admin tasks.

mysql --help => shows all options for mysql client pgm

MySQL server allows only certain users on certain machines to connect to it for security purposes (just like ssh, etc). It has a list of usernames and passwords stored in one of it's table, along with the name of host machines from where they can connect, and only those users are allowed to connect. Else we get "Error: 28000: access denied" from the server. During mysql installation, a "root" user account with no password on host "localhost" is defined and added to this table.This means a user  "root" on local machine (or on ip addr: 127.0.0.1) can connect to mysql server on that machine using mysql client. Any other user from any other m/c is not allowed to connect to mysql server. So, if we want to run "mysql" client pgm on the same m/c to connect to mysql server, we need to change user to "root" on our machine, meaning we need to use a "sudo" cmd preceeding mysql cmds (or run "sudo -i" to change to root, and then use cmds w/o sudo. su wouldn't work as root a/c is disabled by default in many linux distro for security reasons),

So now we know, that anyone logged in the m/c as root, or using cmds with sudo in front (which gives them same rights as root) can connect to mysql server. mysql server by defaut doesn't require a separate password for root user  to connect to it. This is risky. To fix this, we'll set  password for root (for mysql server user "root").

If we look in /etc/passwd file, we will see list of all users on that machine. We'll see user "root" and user "LOGIN_USER", along with other users. "LOGIN_USER" here is the name of username with which we log into our linux machine (for ex: for username "Rakesh", LOGIN_USER is Rakesh, so we'll see username "root" and "Rakesh" in passwd file). After installing mysql, we also see a new user "mysql" added to this file, as mysql pre-install script adds a user "mysql" using "adduser" cmd.

The mysql.user grant table defines the initial MySQL user account and its access privileges. Installation of MySQL creates only a 'root'@'localhost' superuser account that has all privileges and can do anything. If the root account has an empty password (which is the default), your MySQL installation is unprotected: Anyone can connect to the MySQL server as root without a password and be granted all privileges (although only on local m/c). So, we assign a password to "root" user using any of 2 scripts below, before invoking mysql client. NOTE: we use "sudo" before each cmd, since only "root" user is allowed connecting to mysql server.

1. sudo mysql_secure_installation => this brings up the script to tighten up security for mysql. Most important thing to enter here is your mysql client "root" account's new password.  More details here: https://www.computerbeginnersguides.com/blog/2017/08/02/install-and-configure-mysql-server-5-7-on-linux-mint-18-2/

2. sudo mysqladmin -u root password "new_password"; => this gets new passowrd assigned to "root" a/c on mysql server

cmds to start/stop mysql server: (we use service cmds here, check in linux server notes). All cmds preceeded by sudo.

1. sudo service mysql status => shows mysql is active (running). As mysql automatically starts running on installation

2. sudo service mysql stop => stops mysql server

3. sudo service mysql start => starts mysql server. For restart, we cn also use "mysql restart"

starting MySQL client:

You can connect to your MySQL server through the MySQL client by using the mysql command on cmd line. We have to specify the sql server m/c ip addr (or name), user name who is connecting to it and password for that user. If you haven't assigned any password to "root" account on mysql server, then you can connect to mysql server w/o giving any password as by default it will be set as blank. Also, we'll need to connect as user "root" since no other user is allowed to connect.

> sudo mysql => typing this makes mysql monitor for client appear, and it connects to mysql server locally (Since default host name is localhost). The default user name is unix login user, but by typing "sudo", we become "root" for purpose of executing that cmd. This logins as user "root@localhost", as sudo changes user to "root". IF we don't use "sudo" or are not "root" user, well get access denied error, as user will be unix login user, who doesn't have access rights for mysql server.

cmd line options for mysql client:  Most of the cmd line options can be given in short form as "-o <NAME>" or in long form as "--option=<NAME>". We can also specify all these options in an option file instead of typing them on cmd line.

-h <hostname> => specifies host name, default is localhost. Can also be written as --host=<hostname>

-P <port_number> => specifies which port on the host to connect to. NOTE: capital "P" Mysql server pgm is bound to TCP port number 3306 by default. Since MySQL server can connect to any remote client, it has to be assigned a port number from which it grabs all it's input. This is true for TCP/IP protocol which works that way. It can be any one of 65K ports assigned for connections using TCP/IP. This port is just a chunk of memory (buffer) where incoming data from internet connection is dumped so that specific pgms continuously monitoring this port can start consuming data in this buffer. This port buffer will have additional info on which ip addr the data is coming from, user name, permissions, etc for mysql server to use. That is why when we access mysql server locally, we would expect it to be coming from an ip addr 127.0.0.1(i.e  "localhost"), from port 3306. However in reality, localhost doesn't use TCP/IP protocol, and instead uses unix socket file to transfer data. Socket file are special files used for Inter process communication by the OS. API for unix socket is similar to that of Internet socket, but all communication occurs entirely within kernel, and filenames are used as addr instead of IP address (since communication is all local). So, -P has no meaning for localhost and is ignored.

-u <username> => specifies username, default is unix login user

-p<passwd> => specifies password for that user, NOTE: there is no space b/w -p and password. Also, if we use just -p with no password, then mysql prompts for password on next line, which is safer, since password is not visible in plain text. Also, all processes along with their options can be seen by typing "ps -ef" by any user on that m/c, so anyone can see user's password, if password was entered on cmd line. So, always use "-p" with no password. 

<db_name> => we can optionally specify name of database to open (with no - option) that we want to open. If we don't specify database name, we connect to sql server but can't access any database, until we select a database using "USE" SQL cmd on sql client prompt.

ex:

> mysql -h example.com -u rajat -p => here we connect as user "rajat" on host "example.com on default port 3306. If rajat is not an autorized user in mysql server table on that host, then he can't connect. Since no password provided after -p, then password is prompted on net line.

Enter password:*******

mysql client prompt appears at this time, and we can start typing mysql cmds.

 mysql dir structure: After installation, we'll have many dir with mysql files in them. Details of source code and dir structure is here (very good descr, should read this): https://dev.mysql.com/doc/internals/en/

Here is how the flow works:

  1. User invokes mysql client by typing "mysql" with appr options. client connects to the server, and does initial handshake. Encyption and authentication methods are established (SSL can be used for both encryption and authentication).
    • In a classic client/server, the server has a main thread which is always listening for incoming requests from new clients. Once it receives such a request, it assigns resources which will be exclusive to that client. In particular, the main thread will spawn a new thread just to handle the connection. Then the main server will loop and listen for new connections.
    • MySQL has a great variety of mutexes (mutual exclusion object) that it uses to keep actions of all the threads from conflicting with each other.
    • For each thread, there is an infinite loop. The loop repeatedly gets and does commands received from sql client. When it ends, the connection closes. At that point, the thread will end and the resources for it will be deallocated.  
    • All communication b/w client/server is done via packets. Packets are formatted messages that client/server send over tcp/ip lines. Packets will have a header, an identifier, and a length, followed by the message contents.  
  2. User types any SQL cmd on mysql cmd prompt. client performs initial checks. Source code for this sits in client dir of mysql s/w. As an ex, we enter SQL cmd INSERT =>  mysql> INSERT ,,,
  3. The mysql client then passes this edited/checked cmd over TCP/IP to sql server @ip_address and to default port for sql server. If server is local, then ip addr is 127.0.0.1, and file socket used. These cmds are passed in form of  packets.
  4. Then via the vio routines (in C), the server grabs this cmd, Low level mysql routines are called to parse sql (source code in sql dir), and doing what's necessary (Source code in mysys dir)
  5. Finally, one of the ha (handler) programs in the sql directory will dispatch to an appropriate handler for storage. Depending on the storage engine, handler may be in myisam or Innodb dir.

Few of the important dir are:

1. /usr/bin, /usr/sbin => These directories contain mysql/mysqld binaries

1. /etc/mysql/ => This has many configuration files. Most important is database server configuration file: /etc/mysql/mysql.conf.d/mysqld.cnf. This file has settings for where all log files, data files etc are stored. Data file for our database is stored as 1 big file here:  /var/lib/mysql/ibdata1

2. /var/lib/mysql => this dir has all logs, data, etc. We see ibdata1, which is the some data file that stores databases. More data files would be named ibdata2, ibdata3, etc. Note sure what they store ?? FIXME??. Inside this dir, we see subdir for all databases (as mysql, sys, etc). If we create a new database, a new dir with that name gets created with file db.opt in it. It is a 2 line file, with charset defined for that db. When we create new tables using "create table customer" then 2 new files for that database get created - customer.frm and customer.ibd. These are both binary files.Each table has it's own .frm and .ibd files, along with lot of other files. In other database dir as "mysql", we see lot of other files.

.frm => every MySQL table you create is represented on disk by a .frm file that describes the table's format (that is, the table definition). We can do hexdump to see contents of this file. It shows header which stores various info about the table.

hexdump -v -C customer.frm

.ibd => These store actual data, i.e table values inserted. Data is stored in compact byte form, Each data assigned storage space in file based on it's data type. This is much larger file, as it stores data.


SQL cmds:

We can type SQL cmds directly on mySQL client prompt, or put all the cmds in a file, and then run that file on mySQL client.

  • Each SQL cmd needs to be terminated by ";".
  • SQL is not case sensitive, but convention is to hav capital letters for SQL cmd and small letter for arguments.
  • 3 style of comments in SQL. # and -- treats anything from # or -- to end of line as a comment. -- needs to be followed by atleast one whitespace for it to be recognized as a valid comment. C style /* ... */ is used for in line as well as multiline comment.



1. Databases: A mysql server can keep several databases for different websites or projects, all in same server. This is accomplished by giving separate names to databases. Each of these databases can have multiple tables to store the actual data. The complex security mechanism ensures that individual databases and tables can be accessed only by authorized users. That is why we have to provide a username when starting mysql client, so that mysql client can check which databases and tables are allowed to be accessed by that user. Database related cmds:

- SHOW DATABASES; => shows all databases.By default, mutiple databases are already there. "mysql" is one of the databases that is automatically built on installation that stores user info.

- USE mysql; => this selects a database to work with. This cmd is essential to start working with database, else no database is selected, and so no database related SQL cmds will work. We can also select a database when connecting to mysql client via -D option, i.e mysql -u robert -D mydb -p => selects mydb database
 
- SELECT DATABASE(); => This shows all selected databases.

- CREATE DATABASE test_db; => creates a new database named test_db, This creates a dir named test_db in /var/lib/mysql. Inside test_db will be a *.opt file which stores various configuration for this db. We can only select 1 database at a time, so "USE test_db;" will select this database instead of any previously selected one. In mysql, SCHEMA is the synonym for DATABASE, so they can be used interchangeably.

- DROP DATABASE test_db; This deletes the said db.

 

2.Tables: Tables are most important part of a database. Here it is where all all the data in a database is finally stored. There are various table types also called storage engines that you can choose from. These storage engines are internal way used by a database server to store data. Uuually we should not care about storage engine, as we do not need to change it. The default storage engine from version version 5.5 an above is InnoDB, while before that it was MyISAM. Besides these, there are many other storage engines as MERGE, ARCHIVE, CSV, etc.

A. InnoDB: The InnoDB tables fully support ACID-compliant and transactions. They are also optimal for performance. InnoDB table supports foreign keys, commit, rollback, roll-forward operations. The size of an InnoDB table can be up to 64TB.

B. MyISAM: The MyISAM tables are optimized for compression and speed. However, they are not transaction safe.

Tables store data as a table with rows and columns. We specify how many columns each row has, what is the data type that each entry in a particular column, and that's it. The data types are of 3 categories:

I. Numeric: This includes INT (signed or unsigned upto 11 digits), BIGINT (upto 20 digits), FLOAT, SMALLINT, etc.

II. String: This includes BLOB or TEXT (max of 65K char), VARCHAR (from 1 to 256 char, num of char needs to be specified in brackets as VARCHAR(5) means 5 char), CHAR (CHAR is a string with default char length as 1, but can be defined as CHAR(5) indicating it can have up to 5 char), LONGBLOB or LONGTEXT, TINYBLOB or TINYTEXT, ENUM, etc. The difference b/w BLOB and TEXT is that sort and comparison on BLOB data is case sensitive, but not so for TEXT. BLOB is Binary large object and is used to store large amount of binary data too. ENUM is enumeration or a fancy term for list. Just as in other languages, it allows only those values put in the list to be allowed as valid. ex: ENUM( 'ABC', '12') will allow only ABC, 123 and NULL as valid values for that field.

III. DATE and TIME: This includes DATE (stored in YYYY-MM-DD format), TIME (stored in HH:MM:SS format), DATETIME (stroed in YYYY-MM-DD HH:MM:SS format,)

NOTE: NULL is a special data type which represents "nothing", and it can be applied to any of 3 data types above, whenever we do not want to store a value. Any of above 3 data types above can have NULL value.

These are few table related cmds.

1.CREATE TABLE table1( ... ); => This creates tables with specified columns. Each column has data name, data type and optional data attributes. ex:

CREATE TABLE customer (

Idx INT AUTO_INCREMENT, #This is 1st col. It has field called "Idx" which is of type Integer, and it's value is auto incremented, whenever a new row is created. AUTO_INCREMENT is an optional attribute. Field Attribute AUTO_INCREMENT tells MySQL to go ahead and add the next available number to the id field.

name VARCHAR(236) NOT NULL, #VARCHAR(236) says it can have upto 236 char. "NOT NULL" attr specifically states that "name" is not allowed to have "NULL" value, so if we try to create name with "NULL" value, mysql will raise error.

start_date DATE,

description TEXT,

PRIMARY KEY (Idx) #NOTE: no comma for last field. Keyword PRIMARY KEY is used to define a column as a primary key. You can use multiple columns separated by a comma to define a primary key.
)  ENGINE=INNODB; #Engine name is optional. NOTE: semicolon at end of this line to indicate end of SQL stmt. So, above cmd is 1 long cmd spread on multiple lines for clarity.

2. INSERT INTO table1 (field1, filed2, ...) VALUES (value1, value2, ...); #after creating tables, we use this cmd to insert values into fields of table. We use this cmd repeatedly to insert as many rows as we want to the table. ex:

INSERT INTO customer (description, name, start_date) #Idx is not inserted into table, as Idx has AUTO_INCREMENT attr, which means it's incremented by 1 and assigned to "Idx" for this row, whenever we insert a new row.

VALUES ("Learn PHP", "John Poul", NOW()); #Here NOW() is a function which returns date.

3.UPDATE, DELETE, etc used to update/delete values in fields.

4. ALTER TABLE => Just as we can add, delete. modify values in rows, we can change columns for a given table. i.e add columns, remove columns using this cmd.

5. SHOW TABLES; # shows all tables in selected database.

6. SHOW COLUMNS from table_name; #This shows all the info for all columsn from the given table.

The 3 cmds, SHOW DATABSES, SHOW TABLES and SHOW COLUMNS can give all the info about whatever is present in any data on a sql server. Databases have tables, and tables have columns. Columns are the lowest hier that store the data. To prevent anyone from accessing all this info, mysql provides privileges at finer level up to the table. So, only privileged users can access a certain database and certain table, thus providng security to data stored. Database "mysql" stored by default in mysql server has one of the tables as "user". If we list columns of user, we will see these privileges.

ex: SHOW COLUMNS FROM user; -- Shows all colums as User, Host,  various privileges, etc (total of 45 columns) from table user in database named mysql.

+------------------------+-----------------------------------+------+-----+-----------------------+-------+
| Field                  | Type                              | Null | Key | Default               | Extra |
+------------------------+-----------------------------------+------+-----+-----------------------+-------+
| Host                   | char(60)                          | NO   | PRI |                       |       |
| User                   | char(32)                          | NO   | PRI |                       |       |
| Select_priv       | enum('N','Y')                   | NO   |         | N                   |       |

7. SHOW VARIABLES; => shows all variables and their values in use. variables are current database configuration settings (set by default or modified by user). It shows that default port for mysql server pgm running is 3306.

8.SELECT => This is the most used cmd to select fields from a given table. We can use a lot of complicated search criteria (by using WHERE clause) and get only the rows and columns meeting that search criteria.

ex: SELECT * FROM user; => This shows all fields (columns) for all rows from table "user" in database "mysql" (assuming mysql db was selected before). * denotes "all".
ex: SELECT name, salary FROM tutorials_tbl WHERE tutorial_author = 'Sanjay'; => HEre we select 2 fields (columns) from table "tutorials_tbl". WHERE clause adds extra filtering criteria, saying the field "tutorial_author" needs to be "Sanjay". So, only those rows will be selected which match the WHERE clause, and only 2 columns "name" and "salary" would be displayed. If * was used, all columns would be displayed. Many other keywords as AND, OR, <, !=, etc available for doing a match.

 3. Misc cmd => There are tons of other cmds for doing doing lot of other things with databases.

Adding new user to db:

So far we did everything on mysql db as "root" user. However, we'll want to allow valid users besides root to access mysql db. This can be done via creating a user, and then granting it privileges, Done via following cmds:


mysql > CREATE USER 'ashish'@'localhost' IDENTIFIED BY 'password'; #here, we created a new user 'ashish', but no privileges assigned, so we can't login with this user yet

mysql> GRANT ALL PRIVILEGES ON * . * TO 'ashish'@'localhost'; #grant all privileges to this new user when logged from localhost. First * refers to database while second * refers to  table. So, here we grant all privileges to user ashish on all db and all tables. Not a safe thing to do, as now user 'ashish' can modify anything in any db.                                          

mysql> FLUSH PRIVILEGES; #To reload all the privileges so they take effect immediately

 mysql> use mysql; #To check ifnew user has been added, we check in "mysql" database, and "user" table.

mysql> select * from user; # this shows a row for new user "ashish"

support for mysql in other languages:

We can connect to mysql server via mysql client by invoking it from terminal. However, the real value in mysql db is when we can connect to the db from within some other pgm. For this reason, popular web languages as php, python, perl, etc provide functions to connect to mysql server, rd/wrt to db, and basically do everything from within the language. php language provides extensive support for mysql. Details of this approach are explained in php tutorial.

Migration of Database: Copying/Moving MySQL Database from 1 server to other is not difficult. We know that database contents are stored in *.idb file (for InnoDB). However, moving these files by themselves is not going to work, as they rely on *.frm files and host of other files. Also, if the storage engine is different, then a conversion would be required from one format to other. To prevent users from all this hassle, mysql provides various utilities to copy/move the database.

The mysqldump program is used to copy or back up tables and databases. It can write the table output either as a Raw Datafile or as a set of INSERT statements that recreate the records in the table.

1. As raw datafile: ex:

> mysqldump -u root -p --no-create-info  --tab=/tmp/dir1 tutorials_DB tutorials_TABLE => this dumps table=tutorials_TABLE from database=tutorials_DB to file in /tmp/dir1 dir

2. in sql format: we can also dump selected tables or entire database in sql format in a separate file. Here, we write out output file as series of sql CREATE/INSERT cmds, that will recreate the original table.

> mysqldump -u root -p tutorials_DB tutorials_TABLE > dump.txt => same as above except that --tab option is not used. dump.txt is a regular sql file, that has "CREATE TABLE tutorials_TABLE" and "INSERT INTO tutorials_TABLE". To dump multiple tables from the database, provide name of each table separated by space (i.e tutorials_DB tutorials_TABLE1  tutorials_TABLE2 > dump.txt)

> mysqldump -u root -p TUTORIALS_DB > database_dump.txt  => this dumps entire database instead of tables

So, to move copy database from 1 server to other, use these 3 cmds below:

  • mysqldump -u root -p MY_OLD_DB > database_dump.sql => dump all database in this file on old server from where you want to transfer
  • mysqladmin -u root -p create MY_NEW_DB => create new database named MY_NEW_DB on new server where you want to transfer files to
  • mysql -u root -p MY_NEW_DB < database_dump.sql =>Now run the dumped sql file on this new DB, which will create/insert all tables into this new db, using cmds from *.sql file.