Well, the year is getting underway and the IT events and meetups are starting to appear again. Last Thursday the first MySQL Meetup Montevideo took place, sponsored by Percona and Guruhub.
About 25 people got together to do a bit of networking and then talk about MySQL and related technologies.
The first talk, by Marcos Albe and Fernando Ipar, was about pt-query-digest, a tool from percona-toolkit for analyzing MySQL logs.
After that talk came mine: Introduction to Sphinx. The goal of this talk was to give a first overview of this full-text search engine, basically touch on configuration topics, and look at some practical cases from projects where we have used Sphinx.
The talk flowed very well, with great participation and questions.
Click on the image above to see the presentation online; you can download the talk and the examples from GitHub (fork it!).
We are already preparing a hands-on lab for the next meetup. Everyone is invited!
Thanks to the sponsors:
Thanks to the support of Case Inc.
Another write-up of the meetup: http://picandocodigo.net/2012/meetup-reunion-mysql-montevideo/
2012-02-13
2011-08-11
Sphinx Search - why and how to use sphinx search delta indexes
Problem:
Anybody who has used Sphinx search knows that reindexing big indexes takes a long time. The main problem is that the whole index is recreated every time you reindex.
Solution:
The way to handle this is with delta index updates plus index merging. The idea is to have two indexes:
- the "main", big index for the old (unchanged) data, and
- a small "delta" index for the new (recently changed) data. So, instead of reindexing the "main" index, you only reindex the "delta" one every few minutes.
This is called the "main+delta" scheme.
How-to:
Having a table "documents" with fields "id", "title", and "body", we create the Sphinx search index as follows:
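The original configuration listing did not survive in this copy of the post; a minimal sphinx.conf sketch for this source might look like the following (host, credentials, and paths are placeholders):

```
source main
{
    type      = mysql
    sql_host  = localhost
    sql_user  = user
    sql_pass  = pass
    sql_db    = mydb

    sql_query = SELECT id, title, body FROM documents
}

index main
{
    source = main
    path   = /var/lib/sphinx/main
}
```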
As you can see, the main sql_query gives us all the documents from the documents table. The idea of the "main+delta" scheme is that every time you reindex main, you store the last id processed somewhere, so delta can start from there and process only a small number of records.
So first, create this table in MySQL to store that id:
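The CREATE TABLE statement is missing from this copy of the post; the helper table from the standard Sphinx main+delta example looks like this:

```sql
CREATE TABLE sph_counter
(
    counter_id INTEGER PRIMARY KEY NOT NULL,
    max_doc_id INTEGER NOT NULL
);
```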
Then in sphinx.conf:
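The configuration listing is also missing here; following the standard Sphinx main+delta pattern, it would look roughly like this (the main source gains a pre-fetch query, and delta inherits from it):

```
source main
{
    # ... connection settings as before ...
    sql_query_pre = REPLACE INTO sph_counter SELECT 1, MAX(id) FROM documents
    sql_query     = SELECT id, title, body FROM documents \
                    WHERE id <= (SELECT max_doc_id FROM sph_counter WHERE counter_id = 1)
}

source delta : main
{
    # delta does not touch sph_counter, so the pre-fetch query is cleared
    sql_query_pre =
    sql_query     = SELECT id, title, body FROM documents \
                    WHERE id > (SELECT max_doc_id FROM sph_counter WHERE counter_id = 1)
}

index delta : main
{
    source = delta
    path   = /var/lib/sphinx/delta
}
```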
How does this work? As you can see, the main source specification includes a pre-fetch query, sql_query_pre. That query is executed before the main query (you can define several of them).
REPLACE INTO sph_counter SELECT 1, MAX(id) FROM documents

Our pre-fetch query updates a record in the sph_counter table to be used later by delta. This record stores the MAX(id) from our documents table at the moment of indexing, so the main query gets documents with an id less than or equal to that maximum, and the delta query gets those with ids greater than it.
In brief, you reindex main once in a while and delta very often. main will get ALL the documents up to the time you reindex it, and delta will get all the new ones.
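In practice this scheduling is often done with cron; a hypothetical crontab (the intervals are just an example) could be:

```
# reindex delta every 5 minutes; --rotate tells a running searchd to swap in the new files
*/5 * * * * indexer delta --rotate
# rebuild main once a day, at 3 AM
0 3 * * * indexer main --rotate
```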
In your code you will have to search in both indexes:
$sphinxClient->Query("this is my search query", "main delta");

Indexing main will still take a long time, but your search will stay pretty much "live" because delta reindexes very quickly.
Instead of reindexing main, you can merge both indexes and only reindex delta. I will write about merging in a future post.
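For reference, Sphinx's indexer can do that merge in a single command, folding delta into main (again, --rotate assumes searchd is running):

```
indexer --merge main delta --rotate
```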
2011-07-13
How to reset the MySQL root password on Ubuntu
Just run:

sudo dpkg-reconfigure mysql-server-5.1

Works on Debian too.
2011-06-04
Removing duplicate rows in mysql
In our recent project called Sfter we have a daemon that does a lot of inserts into a MySQL table.
It had a bug that duplicated those rows a LOT (more than 300 copies of every record)... this morning I realized we had around 3 million records in that table. Damn!
So I had to figure out how to remove those duplicates directly with a few SQL commands.
From the MySQL manual:
- IGNORE is a MySQL extension to standard SQL. It controls how ALTER TABLE works if there are duplicates on unique keys in the new table or if warnings occur when strict mode is enabled. If IGNORE is not specified, the copy is aborted and rolled back if duplicate-key errors occur. If IGNORE is specified, only the first row is used of rows with duplicates on a unique key, the other conflicting rows are deleted. Incorrect values are truncated to the closest matching acceptable value.
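The actual statement is missing from this copy of the post; based on the quoted manual text, a dedup using ALTER IGNORE would look like this (the table and column names are hypothetical, and note that ALTER IGNORE TABLE only exists in MySQL 5.x — it was removed in 5.7):

```sql
-- adding a UNIQUE index with IGNORE silently deletes the duplicate rows
ALTER IGNORE TABLE events
  ADD UNIQUE INDEX idx_dedup (source_id, created_at);
```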
Pretty simple, eh?
2011-01-28
Tips and Tricks - MySQL - how to dump a database
So you want to back up your MySQL database? MySQL has many ways of doing it; I will explain the easiest one: mysqldump.
Strictly speaking, mysqldump is not a "backup utility": it creates a script of CREATE TABLE and INSERT INTO statements that you can run on your server to re-create your database. mysqldump ships with MySQL, so everybody should have it installed.
Using mysqldump from the command line is very straightforward; just type this in your terminal (using your username and password):
mysqldump -u USERNAME -pPASSWORD OUR_DATABASE > filename.sql

This will give us a file (filename.sql) containing the CREATE TABLE and INSERT INTO statements for OUR_DATABASE.
Sometimes you need to dump ALL the databases on your MySQL server (say, you are moving your dev machine):
mysqldump -u USERNAME -pPASSWORD --all-databases > filename.sql

If you are using MAMP (on macOS), your mysqldump binary is located in /Applications/MAMP/Library/bin/mysqldump.
There are a lot more options in the official MySQL documentation: http://dev.mysql.com/doc/refman/5.1/en/mysqldump.html
To restore our database, we have to run this script on our MySQL server. You can do it with any utility (phpMyAdmin, for instance) or just (again) the command line:
mysql -u USERNAME -pPASSWORD DATABASE_NAME < filename.sql

So, good backup! ;-)