Notes on Deploying Pyspider on Alibaba Cloud ECS

I. Overview

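I recently wrote a crawler with pyspider and needed to deploy it to a cloud server, so I bought an Alibaba Cloud ECS instance to finish the deployment. I ran into a few problems along the way, so this post records them and summarizes the steps.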

II. Deployment

Main commands used (git-bash):

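1. Connect to the server over ssh: ssh root@host
2. Copy the public key to the server: scp ~/.ssh/id_rsa.pub root@host:~/.ssh/authorized_keys (create the ~/.ssh directory on the server first; note that this overwrites any existing authorized_keys file)
3. Check the current OS version: cat /etc/redhat-release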
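
Because the scp command in step 2 replaces the whole authorized_keys file, appending the key may be safer on a server that already trusts other keys. A minimal sketch to run on the server, assuming a POSIX shell; the function name install_pubkey and its optional directory argument are made up for illustration:

```shell
# Hypothetical helper: append a public key read from stdin to authorized_keys.
# The optional first argument overrides the target directory (default: ~/.ssh).
install_pubkey() {
    ssh_dir="${1:-$HOME/.ssh}"
    # sshd expects the directory to be private to its owner.
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    key="$(cat)"                      # the key arrives on stdin
    [ -n "$key" ] || return 1         # refuse to append an empty line
    auth="$ssh_dir/authorized_keys"
    touch "$auth" && chmod 600 "$auth" || return 1
    # Append only if this exact line is not already present.
    grep -qxF -- "$key" "$auth" || printf '%s\n' "$key" >> "$auth"
}
```

For example, after copying the key to a temporary path on the server, `install_pubkey < /tmp/id_rsa.pub` would add it without touching existing entries. Where the ssh-copy-id utility is available on the client, it does much the same job in one step.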

Install Python 3 (the system default is Python 2):

1. yum install wget gcc make
# wget downloads the source tarball; gcc and make are needed to compile it
2. wget https://www.python.org/ftp/python/3.5.4/Python-3.5.4.tgz
# [Python source releases](https://www.python.org/downloads/source/)
3. tar -zxvf Python-3.5.4.tgz
4. cd Python-3.5.4
5. ./configure
6. make
7. make install
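
One thing worth checking after a source build: Python's optional modules such as ssl and zlib are only compiled when their development headers are present at configure time (on CentOS these usually come from the openssl-devel and zlib-devel packages), and pip3 depends on them. A small sketch for verifying the result; the helper name py_has_modules is hypothetical:

```shell
# Hypothetical helper: check that python3 can import every module named
# on the command line; report the first one that is missing.
py_has_modules() {
    for mod in "$@"; do
        python3 -c "import $mod" 2>/dev/null || {
            echo "missing: $mod"
            return 1
        }
    done
}
```

Running `py_has_modules ssl zlib` right after `make install` shows whether the build picked up both libraries. If one is reported missing, installing the matching -devel package and repeating steps 5 to 7 is the usual remedy.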

Install MySQL:

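This is where I hit the most problems. I first tried MySQL 5.7, but installing MySQL Server kept failing. From what I read, CentOS 6.7 lacks some of the dependencies it needs; the .el7 suffix in the package names below also shows that these packages were built for EL7 rather than EL6. The error output: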
--> Finished Dependency Resolution
Error: Package: mysql-community-server-5.7.20-1.el7.x86_64 (mysql57-community)
Requires: libsasl2.so.3()(64bit)
Error: Package: mysql-community-client-5.7.20-1.el7.x86_64 (mysql57-community)
Requires: libstdc++.so.6(GLIBCXX_3.4.15)(64bit)
Error: Package: mysql-community-libs-5.7.20-1.el7.x86_64 (mysql57-community)
Requires: libc.so.6(GLIBC_2.14)(64bit)
Error: Package: mysql-community-server-5.7.20-1.el7.x86_64 (mysql57-community)
Requires: systemd
Error: Package: mysql-community-server-5.7.20-1.el7.x86_64 (mysql57-community)
Requires: libstdc++.so.6(GLIBCXX_3.4.15)(64bit)
Error: Package: mysql-community-client-5.7.20-1.el7.x86_64 (mysql57-community)
Requires: libc.so.6(GLIBC_2.14)(64bit)
Error: Package: mysql-community-server-5.7.20-1.el7.x86_64 (mysql57-community)
Requires: libc.so.6(GLIBC_2.17)(64bit)
You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest
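
The mismatch behind these errors can be spotted before installing by comparing the OS major version with the .elN tag in the package name. A rough sketch, assuming the usual /etc/redhat-release wording ("... release 6.7 ...") and RPM naming; both helper names are hypothetical:

```shell
# Hypothetical helper: extract the major version from a redhat-release line,
# e.g. "CentOS release 6.7 (Final)" gives 6.
el_major() {
    printf '%s\n' "$1" | sed -n 's/.*release \([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: extract N from the ".elN." tag in an RPM package name,
# e.g. "mysql-community-server-5.7.20-1.el7.x86_64" gives 7.
pkg_el() {
    printf '%s\n' "$1" | sed -n 's/.*\.el\([0-9][0-9]*\)\..*/\1/p'
}

# Example comparison (left commented out so the block has no side effects):
# [ "$(el_major "$(cat /etc/redhat-release)")" = "$(pkg_el "$pkg")" ] \
#     || echo "package was built for a different EL release"
```

If the two numbers differ, the configured yum repository most likely targets another EL release than the one the server runs.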

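So I reinstalled an older version of MySQL instead, and after some trial and error the service finally started.
Later I also tried installing MySQL 5.6 and managed to replace 5.1 with 5.6; the steps are in the references below.
my.cnf configuration (save the file with ANSI encoding, otherwise the mysqld service fails to start):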

[mysqld]
datadir=/data/mysql
socket=/tmp/mysql.sock
symbolic-links=0
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
innodb_buffer_pool_size = 64M
max_connections = 100
wait_timeout = 1000

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

[client]
socket=/tmp/mysql.sock
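
On the encoding remark: one plausible cause (an assumption on my part, not something confirmed in this post) is a UTF-8 byte order mark that some Windows editors write at the start of the file, which the option-file parser does not expect. A sketch for detecting it; the helper name has_utf8_bom is hypothetical:

```shell
# Hypothetical helper: succeed if the file starts with the UTF-8 byte order
# mark (bytes EF BB BF). Assumes head -c and od are available.
has_utf8_bom() {
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}
```

Running `has_utf8_bom /etc/my.cnf && echo "BOM found"` tells whether the file needs to be re-saved. With GNU sed, something like `sed -i '1s/^\xEF\xBB\xBF//' /etc/my.cnf` should strip the mark in place.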

Change the MySQL root login password:

use mysql;
update user set password=password('root') where user = 'root';
flush privileges;

Disable anonymous logins in MySQL:

use mysql;
delete from user where user = '';
flush privileges;

Create the pyspider MySQL user:

CREATE USER 'pyspider'@'%' IDENTIFIED BY 'pyspider';
GRANT ALL ON *.* TO 'pyspider'@'%'; # Skip this line; it grants every privilege and allows remote connections from outside
FLUSH PRIVILEGES;

Create the pyspider databases:

create database taskdb;
create database projectdb;
create database resultdb;

Grant privileges to the user:

GRANT SELECT, INSERT, UPDATE, REFERENCES, DELETE, CREATE, DROP, ALTER, INDEX, TRIGGER, CREATE VIEW, SHOW VIEW, EXECUTE, ALTER ROUTINE, CREATE ROUTINE, CREATE TEMPORARY TABLES, LOCK TABLES, EVENT ON `taskdb`.* TO 'pyspider'@'%';
GRANT SELECT, INSERT, UPDATE, REFERENCES, DELETE, CREATE, DROP, ALTER, INDEX, TRIGGER, CREATE VIEW, SHOW VIEW, EXECUTE, ALTER ROUTINE, CREATE ROUTINE, CREATE TEMPORARY TABLES, LOCK TABLES, EVENT ON `projectdb`.* TO 'pyspider'@'%';
GRANT SELECT, INSERT, UPDATE, REFERENCES, DELETE, CREATE, DROP, ALTER, INDEX, TRIGGER, CREATE VIEW, SHOW VIEW, EXECUTE, ALTER ROUTINE, CREATE ROUTINE, CREATE TEMPORARY TABLES, LOCK TABLES, EVENT ON `resultdb`.* TO 'pyspider'@'%';
FLUSH PRIVILEGES;
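
The three GRANT statements differ only in the database name, so they can also be generated with a loop. A sketch under the same user and database names as above; the function name pyspider_grants is hypothetical:

```shell
# Hypothetical helper: print one GRANT statement per pyspider database,
# followed by FLUSH PRIVILEGES.
pyspider_grants() {
    privs='SELECT, INSERT, UPDATE, REFERENCES, DELETE, CREATE, DROP, ALTER, INDEX, TRIGGER, CREATE VIEW, SHOW VIEW, EXECUTE, ALTER ROUTINE, CREATE ROUTINE, CREATE TEMPORARY TABLES, LOCK TABLES, EVENT'
    for db in taskdb projectdb resultdb; do
        # %% prints a literal percent sign for the host wildcard.
        printf "GRANT %s ON \`%s\`.* TO 'pyspider'@'%%';\n" "$privs" "$db"
    done
    printf 'FLUSH PRIVILEGES;\n'
}
```

Piping the output into the client, for example `pyspider_grants | mysql -u root -p`, applies the same grants as the statements written out by hand.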

Common status commands:

# List all connections: show full processlist;
# Show server status or variables: show status; / show variables;
# Kill a connection by id: kill 2246

Install redis:

yum install redis
Start the service: service redis start
Stop the service: service redis stop
Restart the service: service redis restart
Check the status: service redis status

Install pyspider:

pip3 install pyspider
config.json settings:
{
"taskdb": "mysql+taskdb://pyspider:pyspider@127.0.0.1:3306/taskdb",
"projectdb": "mysql+projectdb://pyspider:pyspider@127.0.0.1:3306/projectdb",
"resultdb": "mysql+resultdb://pyspider:pyspider@127.0.0.1:3306/resultdb",
"message_queue": "redis://127.0.0.1:6379/0",
"webui": {
"port": 5050,
"username": "pyspider",
"password": "pyspider",
"need-auth": true
}
}
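
A stray comma or quote in config.json is easy to miss, so it may be worth validating the file before handing it to pyspider. A minimal sketch that relies on the json.tool module shipped with the Python 3 installed earlier; the helper name check_json is hypothetical:

```shell
# Hypothetical helper: succeed only if the file parses as JSON.
# Assumes python3 is on PATH; parse errors are printed to stderr.
check_json() {
    python3 -m json.tool "$1" > /dev/null
}
```

For instance, `check_json /pyspider/config.json && echo "config.json is valid JSON"`. This only checks the syntax, not whether the database and Redis URLs are reachable.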

Create a system user named pyspider:

useradd -md /pyspider pyspider
Create the config.json described above: vi /pyspider/config.json
chown pyspider:pyspider /pyspider/config.json
chmod 400 /pyspider/config.json
vi /pyspider/pyspider_err.log # create the log file
vi /pyspider/pyspider.log # create the log file

Install supervisor:

yum install supervisor
vi /etc/supervisord.conf
[program:pyspider]
command=/usr/local/bin/pyspider -c /pyspider/config.json
autorestart=true
autostart=true
log_stdout=true
log_stderr=true
stderr_logfile=/pyspider/pyspider_err.log
stdout_logfile=/pyspider/pyspider.log

Start the supervisor service: service supervisord start
Reload the configuration: supervisorctl reload
Start the supervised process: supervisorctl start pyspider
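
Once supervisor reports the process as started, a quick way to confirm that the web UI is actually listening is to probe its TCP port (5050 in the config.json above). A sketch that reuses the Python 3 installed earlier; the helper name port_open is hypothetical:

```shell
# Hypothetical helper: succeed if a TCP connection to host:port can be opened
# within 3 seconds. Assumes python3 is on PATH.
port_open() {
    python3 -c 'import socket, sys
s = socket.socket()
s.settimeout(3)
# connect_ex returns 0 on success and an error code otherwise
sys.exit(0 if s.connect_ex((sys.argv[1], int(sys.argv[2]))) == 0 else 1)' "$1" "$2"
}
```

For example, `port_open 127.0.0.1 5050 && echo "webui is up"`. If the port stays closed, `supervisorctl status pyspider` and the two log files under /pyspider are the first places to look.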

With all of the services above configured, pyspider finally started.

III. References

pyspider deployment
Basic vi operations
A first look at supervisor
Managing processes with supervisor
Installing and configuring MySQL 5.6 on CentOS 6.5
