Tuesday 25 November 2014

Linux: awk - Group By Count File Data

While it's pretty easy to do a 'Group By' at the database level, 'awk' lets us do the same at the file level.

Consider a file as below:

# cat test.csv
anita 111
rama 555
david 555
raj 111
shaik 222
naren 222


Here, I want to know how many names in the 1st column share the same value in the second column.
That's nothing but a simple GROUP BY in MySQL or any other database.
So, you could load the file into a table and run the query.

But if the file has crores of rows, the MySQL route would be a costly operation.

That's where the 'awk' command comes to your help.

Below is the command:

# awk  '{arr[$2]++;}END{for(i in arr)print i, arr[i] ;}' test.csv
555 2
111 2
222 2


Here, awk uses the second column as an array key and counts how many times each value appears.

If the file is comma-separated (CSV), use the below command:



# awk  -F, '{arr[$2]++;}END{for(i in arr)print i, arr[i] ;}' test.csv
555 2
111 2
222 2
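As a cross-check, the same group-by count can be produced with sort and uniq instead of awk arrays. Below is a quick sketch on the sample data from above (the /tmp path is just for illustration):

```shell
# Recreate the sample data: name and group value, space separated
printf '%s\n' 'anita 111' 'rama 555' 'david 555' 'raj 111' 'shaik 222' 'naren 222' > /tmp/test_grp.csv

# Take the 2nd column, sort it, let 'uniq -c' count each distinct value,
# then reorder the output as "value count" to match the awk version
awk '{print $2}' /tmp/test_grp.csv | sort | uniq -c | awk '{print $2, $1}'
```

Unlike the awk-array version, whose output order depends on the internal hash, this pipeline comes back sorted by value.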


Monday 24 November 2014

MySQL: Configuration File

When you install MySQL through RPM, before starting the MySQL process, make sure /etc/my.cnf exists. If it doesn't, create one with the basic configuration values.

Below is a sample my.cnf file for a server with InnoDB as the default engine:


[mysqld]
performance_schema
skip-name-resolve
server-id                       = 1
innodb_file_per_table
max_connections                 = 1000
max_allowed_packet             = 512M

slow-query-log                  = 1
slow_query_log_file             = /var/lib/mysql/myServer_slow.log

sort_buffer_size                = 32M
read_buffer_size                = 32M
read_rnd_buffer_size            = 16M
join_buffer_size                = 32M

# innodb parameters
tmpdir                          = /var/lib/mysql
datadir                         = /var/lib/mysql
innodb_data_home_dir            = /var/lib/mysql
innodb_data_file_path           = ibdata1:10M:autoextend
innodb_log_group_home_dir       = /var/lib/mysql
innodb_log_files_in_group       = 2
innodb_buffer_pool_size         = 40G
innodb_additional_mem_pool_size = 64M
innodb_log_buffer_size          = 32M
innodb_log_file_size            = 2047M
innodb_change_buffering         = inserts
innodb_flush_method             = O_DIRECT


# parameters added for thread cache size, wait time out, interactive timeout and table open cache
wait_timeout                    = 28800
interactive_timeout            = 28800
table_open_cache               = 2000
thread_cache_size              = 1000

[mysql]
max_allowed_packet              = 512M



I'll explain these variables in the next posts.
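Until then, here's a quick awk sketch to list just the settings a given section defines. It runs against a tiny sample file; the /tmp path and its contents are only for illustration:

```shell
# Write a tiny sample config (hypothetical, just to demo the parsing)
cat > /tmp/my_sample.cnf <<'EOF'
[mysqld]
max_connections = 1000
innodb_buffer_pool_size = 40G

[mysql]
max_allowed_packet = 512M
EOF

# Track which section we're in; print key=value lines only for [mysqld]
awk '/^\[/ {insec = ($0 == "[mysqld]"); next} insec && NF {gsub(/ /, ""); print}' /tmp/my_sample.cnf
```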

Friday 7 November 2014

MySQL: User Management

User Management in MySQL is one of the fundamental admin tasks, and it's quite easy as well.

So, here are the basic commands regarding this:


** To list all the MySQL users: Users are stored in the system table mysql.user. So, you need to query it to fetch the users:


mysql> select user,host,password from mysql.user;

Output:

+-----------------+--------------+-------------------------------------------+
| user            | host         | password                                  |
+-----------------+--------------+-------------------------------------------+
| root            | localhost    | *0EF40B7834E1725C32C7EC640FC474ED72254B69 |
| dump            | 127.0.0.1    | *E1A36C05EDBEB86BF19531672613ECFC88F3DB2F |
| root            | 127.0.0.1    | *0EF40B7834E1725C32C7EC640FC474ED72254B69 |

| readonly        | %            | *9E1258A6F6806487E2BFA236472D4D28236F6215 |
+-----------------+--------------+-------------------------------------------+

4 rows in set (0.009 sec)


Note that the password is stored encrypted here. You have to take care of this while setting a password for any user, by using the system function PASSWORD().


** To create a new user and grant privileges: There are 3 ways to create a user:

1. Using CREATE command:

CREATE USER username@'host' IDENTIFIED BY 'password';
GRANT SELECT ON db.table TO 'username'@'host';
FLUSH PRIVILEGES;


2. Using GRANT command:

GRANT SELECT ON db.table TO 'username'@'host' IDENTIFIED BY 'password';
FLUSH PRIVILEGES;

The above command will create the user if it is not already present.


3. Inserting into mysql.user table:

INSERT INTO mysql.user(user,host,password) values ('username','host','encrypted_password');

GRANT SELECT ON db.table TO 'username'@'host' IDENTIFIED BY 'password';
FLUSH PRIVILEGES;

However, this is not a RECOMMENDED way. Make sure the password is encrypted using the below query:

SELECT PASSWORD('password');
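For what it's worth, the hash PASSWORD() produces (the classic mysql_native_password scheme) is just '*' plus the uppercase hex of a double SHA-1, which you can reproduce at the shell. A sketch, assuming the openssl CLI is available:

```shell
# PASSWORD('...') in mysql_native_password terms:
# '*' + UPPER(HEX(SHA1(SHA1(plaintext))))
pw='password'
hash=$(printf '%s' "$pw" | openssl dgst -sha1 -binary | openssl dgst -sha1 | awk '{print toupper($NF)}')
echo "*${hash}"
# → *2470C0C06DEE42FD1618BB99005ADCA2EC9D1E19, same as SELECT PASSWORD('password');
```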



**Removing User/Access:


To Revoke access:

REVOKE SELECT ON db.table FROM 'username'@'host';
FLUSH PRIVILEGES;


To Remove user:

DROP USER 'username'@'host';
FLUSH PRIVILEGES;

or

DELETE FROM mysql.user WHERE USER='username' and host='host';
FLUSH PRIVILEGES;


FLUSH PRIVILEGES -- This command reloads the privileges of all users (from the mysql.user table).

MongoDB: Start a Shard Process

Here's a sample command to start a MongoDB process that is part of a replica set.

* Go to 'bin' directory, where MongoDB is installed
* Execute the below command to start MongoDB process:

./mongod --shardsvr --replSet r1 --port 27010 --quiet --journal --fork --dbpath /path_of_mongo/rs --logpath /path_of_mongo/logs/rs.log


Description:

./mongod --> invokes the MongoDB server binary

--shardsvr --> option to run this mongod process as a shard

--replSet r1 --> option to start the shard as part of a replica set (r1 is the set name)

--port 27010 --> option to run the process on port 27010

--quiet --> option to log less

--journal --> option to enable journaling, which helps in crash recovery (acts as a re-do log)

--fork --> option to run the process in the background

--dbpath --> option to specify the path of the data directory

--logpath --> option to specify the log file path

Thursday 16 October 2014

MongoDB: Change opLog Size

If you have a MongoDB replication setup, you must be aware of the oplog..!

Just like the transaction log in MS SQL Server or the binary log in MySQL, all write operations on a MongoDB shard are recorded in a separate capped collection, called the oplog. The oplog has a fixed size, which means that as soon as it reaches the max size, the oldest entries get overwritten.

So, if there is a huge number of transactions, you may need to increase its size.
To check your current opLog size:

rs.printReplicationInfo()


To change its size, below is the plan of execution:

* If the shard is Primary, make it secondary
* Shut down this secondary shard
* Start it on a different port -- basically, we're isolating it from the replica set
* Take dump of opLog -- for safety purpose
* Create a temporary collection 
* Insert the last entry of opLog to this temporary collection
* Drop the existing opLog
* Create a new opLog with custom size
* Insert the record from temporary collection into this new opLog
* Re-start the shard with the original port
* You're all set to go..!

Here are the steps with commands:

Check size:
> rs.printReplicationInfo()


Make primary as secondary:
> use admin
> rs.stepDown()


Shutdown the server:
> use admin

> db.shutdownServer()


Start with dummy port:
# ./mongod --port 37010 --dbpath /path/data/rs --logpath /path/logs/rs.log


Create Dump of opLog:
# ./mongodump --db local --collection 'oplog.rs' --host localhost --port 37010 


Create a temp collection:
> use local

> db = db.getSiblingDB('local')
local

> db.temp.drop()

false

> db.temp.save( db.oplog.rs.find( { }, { ts: 1, h: 1 } ).sort( {$natural : -1} ).limit(1).next() )

> db.temp.find()
{ "_id" : ObjectId("5436532b1cd0a62fea32625b"), "ts" : Timestamp(1412841535, 2), "h" : NumberLong("6772852693461355875") }



Drop and create a new opLog (here I'm creating it with 20 GB):
> db = db.getSiblingDB('local')

local

> db.runCommand({create: "oplog.rs", capped: true, size:(20 * 1024 * 1024 * 1024)})
{ "ok" : 1 }

> db.oplog.rs.save(db.temp.findOne())

> db.oplog.rs.find()

{ "_id" : ObjectId("5436532b1cd0a62fea32625b"), "ts" : Timestamp(1412841535, 2), "h" : NumberLong("6772852693461355875") }
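The size argument to the create command is in bytes, so the 20 GB figure above can be sanity-checked with plain shell arithmetic:

```shell
# 20 GB expressed in bytes, same arithmetic as the capped-collection size above
echo $((20 * 1024 * 1024 * 1024))   # → 21474836480
```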


Shut down again, then re-start with the original port as earlier:
> use admin;

> db.shutdownServer()


That's all !

Sunday 5 October 2014

MySQL : Replication Error 1062

It's been a pretty long time without posts. So, there's a lot to blog about!

One of the common errors in MySQL Replication is 'Duplicate Entry' - Error 1062. This usually pops up only when there has been some manual intervention.
The SQL thread stops with this error. Check the record mentioned in the error on both master and slave. If you understand the exact cause, you may delete the record on the slave and restart replication. Or else, just skip the offending statement using the below commands:


mysql> STOP SLAVE;
mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1; 
mysql> START SLAVE; 

This will skip 1 statement and continue replication. If you're getting a lot of such errors again and again, just add the below entry in the slave's my.cnf and restart MySQL:

[mysqld]

slave_skip_errors    = 1062


This will keep on skipping duplicate entries whenever it occurs.

Please note that you can use this for any other error as well.

Friday 21 February 2014

MySQL: SELECT / LOAD in Batch

Here are the MySQL commands to extract data from a table and to load data into a table.
Unlike other DBMSs, MySQL offers the SELECT..INTO OUTFILE and LOAD DATA commands, which are quite fast and convenient. Data is fetched and loaded in batches.


Sample query to extract data from a table:

SELECT *
FROM tmp.my_table
INTO OUTFILE '/file_path/filename.csv'
FIELDS TERMINATED BY '|$|'
LINES TERMINATED BY '\n';

You can select particular columns instead of '*'.

OUTFILE should be given the file path and name. By default, the file will be created inside the data directory.
An important thing to note here is that the file gets generated on the server host itself.
That means, if you're executing the above query from server S2, fetching data from S1, the file will be generated on S1 itself.

Also, executing the above command requires the FILE privilege for the MySQL user, and the OS-level mysql user should have write permission on the destination directory.

Below is the query that can be used to load data in batches:

LOAD DATA LOCAL 
INFILE '/file_path/filename.csv' 
INTO TABLE tmp.my_table 
CHARACTER SET utf8 
FIELDS TERMINATED BY '|$|' 
LINES TERMINATED BY '\n';

You can use 'IGNORE INTO TABLE' instead of 'INTO TABLE' to ignore duplicate records.
CHARACTER SET is mentioned explicitly because the data being loaded contains local languages (like Kannada, Hindi, Telugu, ...). Without it, the data won't be readable - it gets scrambled.

Thursday 20 February 2014

Linux: Concatenate In Linux

Here's how to concatenate text around each line of a file in Linux.

Consider a file as below:

[root@myserver 6048]# cat test.txt
12
0
123445456
234565443
3883
33


My requirement is to generate an update/delete statement for each of these values. Think of a file with millions of records.

My update statement would be:

update mytable set col1='Yes' where col2='value in the file';

Here's the solution:

[root@myserver 6048]# cat test.txt | sed "s/^/update mytable set col1='Yes' where col2='/g;s/$/';/g"
update mytable set col1='Yes' where col2='12';
update mytable set col1='Yes' where col2='0';
update mytable set col1='Yes' where col2='123445456';
update mytable set col1='Yes' where col2='234565443';
update mytable set col1='Yes' where col2='3883';
update mytable set col1='Yes' where col2='33';



You may redirect the output to some other file, as below:

[root@myserver 6048]# cat test.txt | sed "s/^/update mytable set col1='Yes' where col2='/g;s/$/';/g" > updates_test.sql


You may also use the below command for the purpose:


[root@myserver 6048]# sed  "s/^/update mytable set col1='Yes' where col2='/g;s/$/';/g"  test.txt > updates_test.sql


Ping me for any doubts.. +Kiran Yadagere  | facebook.com/kiranyadagere



Linux: Adding Numbers in a File

Here's a Linux command to add numbers inside a file:


[root@myserver misc]# cat num.test
1
2
3
4
5



This is a sample file with only 5 lines. Think of a file with millions of lines. You could do it in MS Excel, which may hang(!!), or use the Linux 'awk' command, which is powerful and easier:

[root@myserver misc]# awk '{s+=$1} END {print s}' num.test
15


As simple as that..!!

Wednesday 19 February 2014

MongoDB: mongoexport Command - Sample

Here's a simple MongoDB 'mongoexport' command to back up a collection/database. This command can also back up based on particular conditions ('where' clause) -

Go to MongoDB bin  directory and execute as below:

./mongoexport --host localhost --port 10000 -d db_name -c coll_name -q '{"_id":"12345687899asdc"}' -o /path/file_name.csv

This will export a single record which is matching the given value for field '_id'.

Sunday 16 February 2014

Android: Read Kannada/Other Local Languages in your mobile

Here's how to read Kannada, other Indian languages or any Unicode characters on your Android mobile:
(Mainly for Facebook ..!!)

This one is mainly for old Android sets which don't have support for Unicode. All of today's Android mobiles display it correctly by default.




Here are the steps I followed to read Kannada on my Android phone:

1. Download and install 'Opera Mini' from Android Market/Google Play Store

2. Open Opera Mini, type the below words in the address bar, and hit Enter:

about:config

3. This will open a settings page. Scroll down until you see the below line:

Use bitmap fonts for complex scripts

4. There should be a drop-down list; select 'Yes' and save it.

5. Now open a new tab and test it. You can type the below address, which is a Kannada website:

sampada.net

You should be able to see the Kannada letters clearly.

Saturday 15 February 2014

Geek: Write Kannada/Other Local Languages


How to write Kannada/Hindi or any other language on any website:
  






There are many ways, but here is one of the easiest. (It's for PC/laptop; it doesn't work on mobile.)
Follow the steps below:

1. Download and install Mozilla Firefox browser using below link if you don't have already:

http://www.mozilla.org/en-US/firefox/new/

2. Open Firefox and press Alt to see the menu bar. Go to Tools -> Add-ons.
Or
Press Ctrl + Shift + A

3. It opens a new tab, 'Add-ons Manager'. In the upper right corner there will be a search bar. Type 'Lipikaar' and press Enter.

4. It will search for the add-on 'Lipikaar' (see the attached image). Now click on the Install button. It prompts a dialogue box - click on Accept Install.
It downloads the add-on (176 KB).

  






5. After installation, click on Restart Now.

6. After restarting, in the same 'Add-ons Manager' tab, click on 'Extensions' (see the image). It shows all the installed add-ons. Under Lipikaar, click on the Options button.
Select your preferred language and click OK (see the image).





7. The last but most important step: whenever you want to type in the language you selected in Lipikaar, press Ctrl + Alt + L.
Use the same shortcut to switch back to English.

Any doubts?!

Geek: Download Video from Web (YouTube)

So, here comes an interesting topic.

How to download videos from any website (not only YouTube)?

There are many ways. Below is one of the easiest:

For this, Mozilla Firefox comes to our help, using an add-on called 'Video DownloadHelper'.

1. Download and install Mozilla Firefox browser using below link if you don't have already:

http://www.mozilla.org/en-US/firefox/new/


2. Open Firefox and press Alt to see the menu bar. Go to Tools -> Add-ons.
Or
Press Ctrl + Shift + A

3. It opens a new tab, 'Add-ons Manager'. In the upper right corner there will be a search bar. Type 'Video DownloadHelper' and press Enter.

4. It will search for the add-on 'Video DownloadHelper' (see the attached image). Now click on the Install button. It prompts a dialogue box - click on Accept Install.
It downloads the add-on (756 KB).








5. After installation, click on Restart Now.

6. After restarting, in the same 'Add-ons Manager' tab, click on 'Extensions'. It shows all the installed add-ons. Under DownloadHelper, click on the Options button.
Go to 'Services -> Download -> Storage Directory' to set the download location and click OK (see the image).













7. Now open any website and play a video. As the video starts playing, at the top left corner, just before the address bar, the Video DownloadHelper icon starts rotating. Click on its drop-down arrow, which gives you options to download in different formats.

I would suggest the .mp4 format. Select Quick Download from it. (See the image.)















8. Your favourite video starts downloading now. No need to keep playing the video.
But don't close Mozilla Firefox.


Thursday 13 February 2014

Linux: Sorting a File

Here's a simple 'sort' command that will remove duplicate entries from a file and sort it in ascending order:

Consider a file with few numbers:

[root@myhost tmp]# cat testSort.txt
23
4
56
001
34
3
 


To sort it:

[root@myhost tmp]# sort -u testSort.txt > sortd_testSort.txt
 

wherein, -u --> Unique


It's sorted:

[root@myhost tmp]# cat sortd_testSort.txt
001
23
3
34
4
56

Oops! Though it worked, it didn't sort correctly, because the file contains numeric data and sort compared it as text.

So use the below:

[root@myhost tmp]# sort -u -n testSort.txt > numeric_sortd_testSort.txt 
wherein, -n --> for numeric data

 

And, now its correct:

[root@myhost tmp]# cat numeric_sortd_testSort.txt
001
3
4
23
34
56

However, this works only for small files - smaller than the space available in the '/tmp' partition.

If you want to sort huge files, use the below:

[root@myhost lib]# sort -u -n -T. testSort.txt > numeric_sortd_testSort_new.txt 
wherein, -T. --> -T specifies the location used as temporary space (by default it's '/tmp'), and '.' indicates the current directory.

Wednesday 12 February 2014

MySQL: RPM Installation

Before going through this post, I would recommend you have a glance at my previous post related to this - 'MySQL: Before Installation'.

So, here are the steps to install MySQL through RPMs:

* First of all, download the Server and Client RPMs.
You can download the latest GA release from the MySQL site, or the version you want from Oracle E-Delivery.

Below are the RPMs, I downloaded:

[root@localhost Desktop]# ls -lhrt
total 70M
-rwxrw-rw-. 1 theray theray 52M Feb 12 05:14 MySQL-server-5.6.16-1.el6.x86_64.rpm
-rwxrw-rw-. 1 theray theray 18M Feb 12 05:14 MySQL-client-5.6.16-1.el6.x86_64.rpm


* Make sure that there is no previously installed MySQL, using the below command:

[root@localhost Desktop]# rpm -qa | grep -i mysql
mysql-libs-5.1.47-4.el6.x86_64

If one is already there, use the below command to remove it:

[root@localhost Desktop]# rpm -e rpm_package_name


* Install RPMs:


--Installing Client:

[root@localhost Desktop]# rpm -ivh MySQL-client-5.6.16-1.el6.x86_64.rpm
Preparing...                ########################################### [100%]
   1:MySQL-client           ########################################### [100%]


--Installing Server:

[root@localhost Desktop]# rpm -ivh --force MySQL-server-5.6.16-1.el6.x86_64.rpm
Preparing...                ########################################### [100%]
   1:MySQL-server           ########################################### [100%]
2014-02-12 05:39:06 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2014-02-12 05:39:06 32621 [Note] InnoDB: Using atomics to ref count buffer pool pages
2014-02-12 05:39:06 32621 [Note] InnoDB: The InnoDB memory heap is disabled
2014-02-12 05:39:06 32621 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
.
.
.


* Now, create the /etc/my.cnf file (the basic configuration file) with the necessary configurations in it:

[root@localhost Desktop]# cat /etc/my.cnf
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid




* Start MySQL process :

[root@localhost Desktop]# /etc/init.d/mysql start
Starting MySQL......................................................................................................
.......... SUCCESS!

* Once it's started, use the below command to remove anonymous users and make the installation secure:


[root@localhost Desktop]# /usr/bin/mysql_secure_installation


Now MySQL is ready for use!

MySQL: Slave Status - SECONDS_BEHIND_MASTER



We often execute the below command to check the replication status:

mysql> SHOW SLAVE STATUS \G

This will show all the details regarding replication, out of which the field SECONDS_BEHIND_MASTER is the curious one.

It is a time in seconds, indicating the actual amount of lag between master and slave.

In actual words, it's the difference in seconds between the timestamp of the latest event in the relay log (already executed on the master and queued on the slave) and the current timestamp on the slave.

So, if the clocks on the master's host and the slave's host are different, you'll see a scary value for this field.

Tuesday 11 February 2014

techGeek: Create Password less PDF !

Nowadays, almost all PDF docs are password protected! Uff.. sometimes I get irritated by it and try to make them password-free!

And here's how you can make it:

My PDF with password:





No worries! Just close it and do as below:

* Open the PDF with Google Chrome:






* It will prompt for the password; give it:



* Once the PDF is open, click on the Print button at the bottom right (or press Ctrl + P) and select 'Save as PDF'. Click on Save:



It's done!
Now the PDF is saved without a password!!






Monday 10 February 2014

Linux: Crontab - Brief

I always get confused whenever I want to set up a new cron job. The confusion is with regard to the options to be set!
For those who are new to 'cron': it's nothing but the event scheduler in Linux. That means you can schedule any script to run at any time you want. It's just that the system/server should be up and running!

Cron jobs are specific to each user in Linux/Unix. So, one can't see another user's cron unless the necessary privileges or sudo/root access are given.

Anyway, here are the options in cron:


To check cron jobs:

[root@localhost kiran]# crontab -l
no crontab for root


To set cron jobs:

[root@localhost kiran]# crontab -e


After adding, here is how it looks:

[root@localhost kiran]# crontab -l
##Script to test
00 */2 1-31 * 0,2,3   sh /home/kiran/test.sh >> /dev/null


Every cron job should be given 5 scheduling fields:


  • minute -> 0-59
  • hour -> 0-23
  • day of month -> 1-31
  • month -> 1-12 
  • day of week -> 0-7 (0 is Sunday )

In the above example:

00 -- 0th Minute
*/2 -- Every 2 hours
1-31 -- Every day (1 to 31)
* -- Every Month
0,2,3 -- Sunday,Tuesday,Wednesday
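To double-check an entry before saving it, you can split a crontab line into its five scheduling fields with awk. A small sketch using the example line above:

```shell
# The example entry from above; everything after field 5 is the command itself
line='00 */2 1-31 * 0,2,3   sh /home/kiran/test.sh >> /dev/null'
echo "$line" | awk '{printf "min=%s hour=%s dom=%s mon=%s dow=%s\n", $1, $2, $3, $4, $5}'
# → min=00 hour=*/2 dom=1-31 mon=* dow=0,2,3
```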




Friday 7 February 2014

MySQL: Clustered Index


Clustered Index is the InnoDB term for a primary key index. 

InnoDB table storage is organized based on the values of the primary key columns, to speed up queries and sorts involving the primary key columns. 

If you do not define a PRIMARY KEY for your table, MySQL locates the first UNIQUE index where all the key columns are NOT NULL and InnoDB uses it as the clustered index.

If the table has no PRIMARY KEY or suitable UNIQUE index, InnoDB internally generates a hidden clustered index on a synthetic column containing row ID values. 


The rows are ordered by the ID that InnoDB assigns to the rows in such a table. The row ID is a 6-byte field that increases monotonically as new rows are inserted.
Thus, the rows ordered by the row ID are physically in insertion order. 


All indexes other than the clustered index are known as secondary indexes.
In InnoDB, each record in a secondary index contains the primary key columns for the row, as well as the columns specified for the secondary index. 


InnoDB uses this primary key value to search for the row in the clustered index.


Physical Structure:
All InnoDB indexes are B-trees where the index records are stored in the leaf pages of the tree. 
The default size of an index page is 16KB. 
When new records are inserted, InnoDB tries to leave 1/16 of the page free for future insertions and updates of the index records.

The page size can be specified for all InnoDB tablespaces in a MySQL instance by setting the innodb_page_size configuration option before creating the instance.
 

Once the page size for a MySQL instance is set, you cannot change it.
Supported sizes are 16KB, 8KB, and 4KB, corresponding to the option values 16k, 8k, and 4k.

A MySQL instance using a particular InnoDB page size cannot use data files or log files from an instance that uses a different page size.

MySQL: B-Tree Index

 When any CREATE TABLE is executed, InnoDB internally creates a clustered B-tree index  representing the table. 
If no primary key is explicitly given, internally a primary key is added. 

Each node of B-tree is represented by one page (which is by default is of size 16KB). 


Below is a useful set of info, copied from another blog
(guptavikas.wordpress.com/2012/12/17/b-tree-index-in-mysql/):


B-Tree is the default index type for most storage engines in MySQL.

The general idea of a B-Tree is that all the values are stored in order, and each leaf page is the same distance from the root.

A B-Tree index speeds up data access because the storage engine doesn’t have to scan the whole table to find the desired data. 
Instead, it starts at the root node. The slots in the root node hold pointers to child nodes, and the storage engine follows these pointers.

It finds the right pointer by looking at the values in the node pages, which define the upper and lower bounds of the values in the child nodes. 

Eventually, the storage engine either determines that the desired value doesn’t exist or successfully reaches a leaf page. Leaf pages are special, because they have pointers to the indexed data instead of pointers to other pages.


MySQL: InnoDB doublewrite buffer

That's a useful piece of information, so I did a copy-paste from the MySQL site:

InnoDB uses a novel file flush technique called doublewrite. 

Before writing pages to the data files, InnoDB first writes them to a contiguous area called the doublewrite buffer. Only after the write and the flush to the doublewrite buffer have completed, does InnoDB write the pages to their proper positions in the data file. 

If the operating system crashes in the middle of a page write, InnoDB can later find a good copy of the page from the doublewrite buffer during crash recovery.

Although data is always written twice, the doublewrite buffer does not require twice as much I/O overhead or twice as many I/O operations. Data is written to the buffer itself as a large sequential chunk, with a single fsync() call to the operating system.

To turn off the doublewrite buffer, specify the option innodb_doublewrite=0.



Wednesday 29 January 2014

MySQL: SUBSTRING_INDEX - Select Patterns

Consider, a MySQL table having values in a column like below:

SELECT location  FROM geo LIMIT 3;

"location"
"India.Karnataka.Shimoga.Gopala"
"India.Karnataka.Bengaluru.BTM"
"India.Karnataka.Chikmaglore.Koppa"

My requirement is to take only the 4th value from each of the rows (such as Gopala, BTM, Koppa).
I don't want to display the remaining values.
It's the same as what the 'cut' command does in Linux.

For this, we can use SUBSTRING_INDEX function.

SELECT SUBSTRING_INDEX(location,'.',-1)  from geo LIMIT 3;
"location"
"Gopala"
"BTM"
"Koppa"

Syntax: SUBSTRING_INDEX(string, delimiter, count)
Here, count is the number of delimited fields to keep.
A negative value indicates that the fields are counted from the right side.

So, if I give '-2' instead of  '-1':

SELECT SUBSTRING_INDEX(location,'.',-2)  from geo LIMIT 3;
"location"
"Shimoga.Gopala"
"Bengaluru.BTM"
"Chikmaglore.Koppa"
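Since the post compares this to the 'cut' command, here's the same last-field extraction at the shell level with awk - a sketch on the sample values (the /tmp path is just for illustration):

```shell
# Sample rows mirroring the 'location' column
printf '%s\n' 'India.Karnataka.Shimoga.Gopala' 'India.Karnataka.Bengaluru.BTM' > /tmp/geo.txt

# Equivalent of SUBSTRING_INDEX(location, '.', -1): the last '.'-separated field
awk -F. '{print $NF}' /tmp/geo.txt

# Equivalent of SUBSTRING_INDEX(location, '.', -2): the last two fields
awk -F. '{print $(NF-1) "." $NF}' /tmp/geo.txt
```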

Linux: awk - Grouping Data in a file

Consider the below file:

# cat test.csv
aa 1 qwer
ab 2 tyui
aa 3 poiu
ab 2 mnb
bb 1 njio
ba 2 njtwe

test.csv  is a tab separated file with 3 columns. 

Here, I want to segregate the whole lines with matching 1st and 2nd columns into separate files.

Like below:
# cat file_bb_1.csv
bb 1 njio

# cat file_ba_2.csv
ba 2 njtwe

# cat file_ab_2.csv
ab 2 tyui
ab 2 mnb

# cat file_aa_3.csv
aa 3 poiu

# cat file_aa_1.csv
aa 1 qwer


Though you could do this manually, think of a file with more than a million lines.

Here, 'awk', being a powerful data-manipulation tool, comes to our help.
Below is the command we can use:


# awk '{a=$1; b=$2; print $0 >> ("file_" a "_" b ".csv")}' test.csv


[You can give any name instead of 'file_']
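To see it end to end, here's the same split run on a small sample in a scratch directory (the /tmp path is just for illustration):

```shell
# Work in a scratch directory so the generated files don't clutter anything
mkdir -p /tmp/awk_split && cd /tmp/awk_split
printf '%s\n' 'aa 1 qwer' 'ab 2 tyui' 'aa 3 poiu' 'ab 2 mnb' > test.csv

# Append each whole line ($0) to a file named after columns 1 and 2
awk '{print $0 >> ("file_" $1 "_" $2 ".csv")}' test.csv

cat file_ab_2.csv
# → ab 2 tyui
#   ab 2 mnb
```

Note that awk appends (>>), so re-running it on the same directory will duplicate lines; start from a clean directory each time.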