SQLAlchemy gevent Mysql python drivers comparison

sqlalchemy-gevent-mysql-drivers-comparison

Compare different mysql drivers working with SQLAlchemy and gevent, see if they support cooperative multitasking using coroutines.

The main purpose of this test is to find which mysql driver has best concurrency performance when we use SQLAlchemy and gevent together.
So we won’t test umysql, which isn’t compatible with DBAPI 2.0.

Code is here. Thank you CMGS.

Result example

100 sessions, 20 concurrency, each session take 0.5 seconds (by select sleep)

mysql://root:@localhost:3306/mysql_drivers_test total 100 (20) 50.5239 seconds
mysql+pymysql://root:@localhost:3306/mysql_drivers_test total 100 (20) 2.6847 seconds
mysql+oursql://root:@localhost:3306/mysql_drivers_test total 100 (20) 50.4289 seconds
mysql+mysqlconnector://root:@localhost:3306/mysql_drivers_test total 100 (20) 2.6682 seconds

With greenify ptached MySQLdb (mysql-python).

mysql://root:@localhost:3306/mysql_drivers_test total 100 (20) 2.5790 seconds
mysql+pymysql://root:@localhost:3306/mysql_drivers_test total 100 (20) 2.6618 seconds
mysql+oursql://root:@localhost:3306/mysql_drivers_test total 100 (20) 50.4437 seconds
mysql+mysqlconnector://root:@localhost:3306/mysql_drivers_test total 100 (20) 2.6340 seconds

Pure python driver support gevent’s monkey patch, so they support cooperative multitasking using coroutines.
That means the main thread won’t be block by MySQL calls when you use PyMySQL or mysql-connector-python.

1000 sessions, 100 concurrency, each session only have 1 insert and 1 select

mysql://root:@localhost:3306/mysql_drivers_test total 1000 (100) 10.1098 seconds
mysql+pymysql://root:@localhost:3306/mysql_drivers_test total 1000 (100) 26.8285 seconds
mysql+oursql://root:@localhost:3306/mysql_drivers_test total 1000 (100) 6.4626 seconds
mysql+mysqlconnector://root:@localhost:3306/mysql_drivers_test total 1000 (100) 22.4569 seconds

Oursql is faster than MySQLdb in this case.
In pure python driver, mysql-connector-python is a bit faster than PyMySQL.
use greenify or not won’t affect the testing result in this user scenario.

Setup

mkvirtualenv mysql_drivers_test # workon mysql_drivers_test
pip install --upgrade setuptools pip cython
pip install -r requirements.txt
python mysql_drivers_comparison.py

Test with greenify MySQL-python (on OSX with homebrew).

mkvirtualenv mysql_drivers_test # workon mysql_drivers_test
pip install --upgrade setuptools pip cython
pip install -r requirements.txt
git clone https://github.com/CMGS/greenify
cd greenify
cmake -G 'Unix Makefiles' -D CMAKE_INSTALL_PREFIX=$VIRTUAL_ENV CMakeLists.txt
make & make install
cd ..
export LIBGREENIFY_PREFIX=$VIRTUAL_ENV
pip install git+git://github.com/CMGS/greenify.git#egg=greenify # if you are using zsh, use \#egg=greenify
git clone https://github.com/CMGS/mysql-connector-c
cd mysql-connector-c
export DYLD_LIBRARY_PATH=$VIRTUAL_ENV/lib
cmake -G 'Unix Makefiles' -D GREENIFY_INCLUDE_DIR=$VIRTUAL_ENV/include -D GREENIFY_LIB_DIR=$VIRTUAL_ENV/lib -D WITH_GREENIFY=1 -D CMAKE_INSTALL_PREFIX=$VIRTUAL_ENV CMakeLists.txt
make & make install
cd ..
git clone https://github.com/CMGS/MySQL-python.git
cd MySQL-python
export DYLD_LIBRARY_PATH=$VIRTUAL_ENV/lib
export LIBRARY_DIRS=$VIRTUAL_ENV/lib
export INCLUDE_DIRS=$VIRTUAL_ENV/include
unlink /usr/local/lib/libmysqlclient.18.dylib
ln -s $VIRTUAL_ENV/lib/libmysql.16.dylib /usr/local/lib/libmysqlclient.18.dylib
python setup.py install
brew switch mysql [version] # brew switch mysql 5.6.15 on my env, brew info mysql to check which version is available on your env
cd ..
python mysql_drivers_comparison.py

If the greenify doesn’t work for you, you can use otool -L _mysql.so in your $VIRTUAL_ENV/lib/python2.7/site-packages MySQL-python folder. Can’t find otool even after you installed XCode’s command line tools? Follow this link.

I need say thank you to CMGS. He guided me how to install greenify and how it works, he also help me triage the issues I met (include the otool part). Make greenify and mysql work on OSX make no sense, you shold do it on your application server which probably will be a linux, hope you will figure out how to.

我真的很讨厌Python的缩进语法,我真的不明白它为啥会存在。没啥美感可言,带来…

我真的很讨厌Python的缩进语法,我真的不明白它为啥会存在。没啥美感可言,带来的是无尽的烦恼。你经常要担心python解释器是否因为我的输入错误产生一些诡异的问题(虽然大部分不是因为这个问题)。不过这增加了很多无形的压力。I have python’s force indentation, it make me sick!

最近被开发团队的成员blame的几个错误: 一个从老系统导入新系统的脚本,其中…

最近被开发团队的成员blame的几个错误:

  • 一个从老系统导入新系统的脚本,其中一些是订单导入,还涉及到相关订单的支付问题。结果这个脚本创建订单和支付的操作没有声明事务,而是默认的在它们自己的事务完成后自动提交。结果其中一些因为帐户余额不足无法支付的订单被创建了,造成账目对不上。这个错误的教训就是在复用系统逻辑的时候一定要用心察看事务边界,而且对于导入脚本这样的脚手架也一定要对“事务”有个弦,不能忽略。
  • 一个Django的项目,在重构阶段,一些包正在逐渐被移动到心的地方,我们测试环境中manage.py runserver这几日都正常,但是部署的时候发现mod_python部署报告一个模型无法Import,而后同事又换了mod_wsgi,但是问题依旧。他们用bisect的方法找到了开始出问题的revision,这个revision是我提交的(我那天solo),但是仔细看了diff也很难找到问题,经过多次尝试发现是某一个import后出问题。先开始李明同学怀疑是我用textmate的时候混用了tab和space,但是后来发现问题不它。他们最后定位到这个诡异的问题是循环依赖造成的,虽然说不出具体的问题在哪里,但是经过一些包移动后问题解决了。这个教训是不要太相信Django和脆弱的Python的提示,遇到问题要多怀疑python脆弱的包依赖管理。当然这不是主要问题,对于这个问题也许我们要做的是频繁的部署,这样我们就能早些发现这个问题了。

Python中的method_missing和*get*钩子方法族

刚才实验了一下,发现Python里面声明类的时候是否选择继承objects还是有很大区别的。只有继承了objects,才可以使用钩子方法如’__get__’, ‘__set__’, ‘__getattr__’, ‘__getattribute__’这些方法。也就是说这些有用的钩子方法是所谓的new object里面的东西。今天我在项目的代码里面尝试了一下类似Ruby method_missing的写法,实验在Python里面加点元编程的东西。发现很相似的三个方法’__get__’, ‘__getattr__’, ‘__getattribute__’方法区别挺大。注意,一定要继承object才可以享用三个钩子方法。

  • 访问对象方法的时候首先会访问’__getattribute__’,它是访问类里面所有属性的时候都要经过的方法,包括创建对象的时候访问’__init__’, ‘_meta’这些都回经过’__getattribute__’访问。如果你什么异常都不抛出,它就不会访问’__getattr__’方法。
  • 如果’__getattribute__’方法抛出’AttributeError’,那么会继续尝试访问’__getattr__’方法。再有异常抛出,那么这个类就没有钩子再接住异常了。所以从这个角度来说它的工作方式非常相似于Ruby的method_missing。
  • 我本以为’__getattr__’是系统内建函数hasattr(object, property)优先访问的方法。不过实验证明,实际上还是先走’__getattribute__’后走’__getattr__’的。也就是说hasattr这样的函数没有优先绑定’__getattr__’。
  • ‘__get__’方法是用来监视自己的类作为其它类的成员的时候被访问的钩子。对应的是’__set__’,是相应属性被赋值时的钩子。这个方法与’__getattr__’和’__getattribute__’完全不是一会儿事。刚才看《Python核心编程一书》完全没有解释清楚。《How-To Guide for Descriptors》这篇文章对解释’__get__’帮助很大,有兴趣可以看看,不过我倒是没有想到什么是合理的应用场合。

我目前还没有调用方法的method_missing,目前只是访问一些属性。我们实际处理的是一个可以直接用属性名读取/修改对象里面的持久化json属性的方法,就是类持有一个{‘property’: ‘value’}的json文本属性,我们就可以直接用Model.property访问和修改里面的方法,而不用特别的去生命json结构过来。是一个在Python中做meta programming的尝试。

测试刚才说的几个*get*方法的测试如下:

# -*- coding: utf8 -*-
import unittest

class A(object):
    def __init__(self):
        print 'init A'

    def __get__(self, *args):
        print '__get__ A', args

    def __set__(self, *args):
        print '__set__ A', args

class B(object):
    a = A()
    
    def __init__(self):
        print 'init B'

    def __getattr__(self, *args):
        print '__getattr__ B', args

    def __getattribute__(self, *args):
        print '__getattribute__ B', args
        raise AttributeError

    def __get__(self, *args):
        print '__get__ B', args

    def __set__(self, *args):
        print '__set__ B', args

class MeTest(unittest.TestCase):
    def test_simple(self):
        b = B()
        print b.a
        b.a = A()
        b.c
        hasattr(b, 'e')