前言:
本文档主要描述通过Oracle Grid Infrastructure Agents (XAG)基于Oracle RAC实现GoldenGate(OGG)软件高可用的实施操作
环境信息:
源端 | 目标端 | ||
节点一IP 节点二IP | 192.168.1.84 192.168.1.86 | 节点一IP 节点二IP | 192.168.1.200 192.168.1.210 |
VIP | 192.168.1.88 | VIP | 192.168.1.204 |
数据库版本 | 11.2.0.4 | 数据库版本 | 12.2.0.1 |
OGG版本 | 19.1.0.0.4 | OGG版本 | 19.1.0.0.4 |
XAG版本 | 10.2 | XAG版本 | 7.1.0 |
EXTRACT进程 | E_BSS_1 | REPLICAT进程 | R_ING_1 |
PUMP进程 | P_BSR_1 |
源端OGG高可用配置:
1 配置vip资源:
--添加vip资源,这里使用集群新建的network 2网络,可以使用默认的network 1,同网段即可 #network is the network number that you want to use. #ip is the IP address provided by your system administrator for the new Application VIP. This IP address must be in the same subnet as determined above. #gg_vip_source is the name of the application VIP that you will create. /u01/app/11.2.0/grid/bin/appvipcfg create -network=2 \ -ip=192.168.1.88 \ -vipname=gg_vip_source \ -user=root --root授权Oracle用户有启动vip的权限 /u01/app/11.2.0/grid/bin/crsctl setperm resource gg_vip_source -u user:oracle:r-x --Oracle用户启动资源 /u01/app/11.2.0/grid/bin/crsctl start resource gg_vip_source --验证资源 [oracle@rac1 ~]$ /u01/app/11.2.0/grid/bin/crsctl status resource gg_vip_source NAME=gg_vip_source TYPE=app.appvip_net2.type TARGET=ONLINE STATE=ONLINE on rac2
2 配置ACFS资源:
--配置ACFS,在节点一操作即可 --添加vol卷 su -grid ASMCMD> volcreate -G DATA -s 5G acfsvol --查看acfs生成的卷组 su - grid ASMCMD> volinfo -G DATA acfsvol Diskgroup Name: DATA Volume Name: ACFSVOL Volume Device: /dev/asm/acfsvol-119 State: ENABLED Size (MB): 5120 Resize Unit (MB): 32 Redundancy: UNPROT Stripe Columns: 4 Stripe Width (K): 128 Usage: Mountpath: --挂载/ogg 文件系统 --确认asm下面生成acfs卷组 # ls /dev/asm --创建挂载目录 su - root mkdir /ogg --格式划acfs su - root # /sbin/mkfs -t acfs /dev/asm/acfsvol-119 --加入集群管理 su - root /u01/app/11.2.0/grid/bin/srvctl add filesystem -d /dev/asm/acfsvol-119 -g 'DATA' -v ACFSVOL -m /ogg -u oracle --挂载 /u01/app/11.2.0/grid/bin/srvctl start filesystem -d /dev/asm/acfsvol-119 --查看是否online以及mount su - grid crsctl stat res -t |grep -i acfs ora.data.acfsvol.acfs ora.registry.acfs df -h --如果没有online,再手动mount su - root # mount.acfs -o all --授予目录权限 su - root # chown oracle.oinstall /ogg
3 安装ogg软件:
- --安装ogg软件,要安装在acfs所在的目录/ogg里面
- cd fbo_ggs_Linux_x64_shiphome/
- cd Disk1/
- ./runInstaller
4 ogg开启ddl支持:
- sqlplus / as sysdba
- @marker_setup.sql
- @ddl_setup.sql
- @role_setup.sql
- GRANT GGS_GGSUSER_ROLE TO ogg_owner;
- @ddl_enable.sql
- @?/rdbms/admin/dbmspool.sql
- @ddl_pin ogg_owner
5 配置MGR进程:
- --需要注意配置的参数是autostart,因为高可用是依靠mgr去自动启动进程的
- GGSCI (rac1) 3> edit params mgr
- port 7809
- DYNAMICPORTLIST 7940-8100
- AUTOSTART ER *
- AUTORESTART ER *,RETRIES 10, WAITMINUTES 1, RESETMINUTES 60
- PURGEOLDEXTRACTS /ogg/dirdat/*,USECHECKPOINTS,MINKEEPDAYS 5
- LAGREPORTHOURS 1
- LAGINFOMINUTES 30
- LAGCRITICALMINUTES 45
- --启动mgr进程
- GGSCI (rac1) 5> start mgr
6 源端添加表级附加日志:
- --连接数据库
- GGSCI>dblogin userid ogg password ogg
- --添加表级附加日志
- GGSCI>add trandata test.*
- --显示日志添加情况
- GGSCI>info trandata test.*
7 配置EXTRACT进程:
--添加extract进程 add extract E_BSS_1 tranlog,begin now,threads 2 add exttrail ./dirdat/es,extract E_BSS_1,megabytes 1000 --需要注意的是要通过tns去连接数据库,而不是ORACLE_SID方式,并且连接的tns需要可以访问到全部节点,最好连scan_ip --配置ogg连接数据库的tns ORCL = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = rac-scan)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = orcl) ) ) --配置extract进程 GGSCI (rac1) 3> edit params e_bss_1 extract E_BSS_1 SETENV (ORACLE_HOME = "/u01/app/oracle/product/11.2.0/db_1") SETENV (NLS_LANG = "AMERICAN_AMERICA.ZHS16GBK") userid ogg@orcl, password ogg exttrail ./dirdat/es gettruncates TRANLOGOPTIONS INCLUDEREGIONID,EXCLUDEUSER ogg TRANLOGOPTIONS BUFSIZE 2048000 TRANLOGOPTIONS DBLOGREADER,DBLOGREADERBUFSIZE 2048000 DISCARDFILE ./dirrpt/e_bss_1.dsc,APPEND,MEGABYTES 1000 DISCARDROLLOVER AT 6:00 REPORTROLLOVER AT 6:00 REPORTCOUNT EVERY 1 HOURS,RATE DDL INCLUDE MAPPED , OBJTYPE 'TABLE' & INCLUDE MAPPED OBJTYPE 'INDEX' DDLOPTIONS ADDTRANDATA RETRYOP RETRYDELAY 10 MAXRETRIES 10 DDLOPTIONS REPORT FETCHOPTIONS MISSINGROW ABEND STATOPTIONS REPORTFETCH WARNLONGTRANS 1H,CHECKINTERVAL 10m DYNAMICRESOLUTION TABLE test.test;
8 配置PUMP进程:
--添加投递进程 add extract P_BSR_1,exttrailsource ./dirdat/es add RMTTRAIL ./dirdat/rs,ext P_BSR_1,megabytes 1000 --需要注意的是这里远程的IP是目标端的VIP,这样才能在failover之后,依然可以正常投递 --配置投递进程参数 GGSCI (rac1) 3> edit params P_BSR_1 extract P_BSR_1 rmthost 192.168.1.204, mgrport 7809 rmttrail ./dirdat/rs passthru DISCARDFILE ./dirrpt/p_bsr_1.dsc,APPEND,MEGABYTES 1000 DISCARDROLLOVER AT 6:00 REPORTROLLOVER AT 6:00 REPORTCOUNT EVERY 1 HOURS,RATE TABLE TEST.TEST;
9 添加源端OGG高可用服务实例:
++++++++++++++++++++++++++++++++++++++++++++++++参数说明 agctl add goldengate --help Adds Goldengate instance to Oracle Clusterware. <instance_name> Instance name --gg_home <GoldenGate_Home> GoldenGate home directory --instance_type <instance_type> {source | target | dual} --server_pool <server_pool> Name of the Server pool where instance can be run --nodes <node>[,...] List of nodes where instance can be run --vip_name <vip_name> Virtual IP name --network <network_number> Network number --ip <ip_address> IP address --user <user> Operating System user name that owns the instance --group <group> Name of the group to which the Operating System user belongs --oracle_home <oracle_home> ORACLE_HOME location --databases <database>[,...] List of database instance dependencies --db_services <db_service>[,...] List of database service dependencies --filesystems <filesystem>[,...] List of file system resource dependencies --attribute <name>=<value>[,...] Attributes that can be applied --environment_vars <var=value>[,...] Additional environment variables to set --monitor_extracts <ext>[,...] List of EXTRACT groups to monitor --monitor_replicats <rep>[,...] List of REPLICAT groups to monitor --critical_extracts <exta>[,...] List of critical EXTRACT groups to monitor and failover --critical_replicats <repa>[,...] List of critical REPLICAT groups to monitor and failover --dataguard_autostart <yes|no> Start GoldenGate on DataGuard role transition to PRIMARY --jagent_autostart <yes|no> Start JAgent on GoldenGate startup --agctl添加高可用source_ogg实例,oracle用户执行以及管理 --注意这里源端VIP不是必需的,因为源端不需要用到VIP --注意这里的database也可以不用设置,这里的设置主要是强关联database,这样当节点的database关闭会触发ogg failover agctl add goldengate source_ogg --gg_home /ogg \ --instance_type source \ --nodes rac1,rac2 \ --vip_name gg_vip_source \ --filesystems ora.data.acfsvol.acfs\ --databases ora.orcl.db \ --oracle_home /u01/app/oracle/product/11.2.0/db_1 \ --monitor_extracts E_BSS_1,P_BSR_1 --查看实例配置 [oracle@rac1 ogg]$ agctl config goldengate source_ogg Instance name: source_ogg Application GoldenGate location is: /ogg Goldengate MicroServices Architecture environment: no GoldenGate instance type is: source EXTRACT groups to monitor: E_BSS_1,P_BSR_1 REPLICAT groups to monitor: Critical EXTRACT groups: Critical REPLICAT groups: Autostart on DataGuard role transition to PRIMARY: no Autostart JAgent: no Configured to run on Nodes: rac1 rac2 ORACLE_HOME location is: /u01/app/oracle/product/11.2.0/db_1 Databases needed: ora.orcl.db File System resources needed: ora.data.acfsvol.acfs VIP name: gg_vip_source
10 启动源端OGG高可用服务实例:
--开启source_ogg实例,oracle用户执行 [oracle@rac1 ogg]$ agctl start goldengate source_ogg --node rac1 --会把mgr进程启动,mgr会把extract进程,pump进程启动 [oracle@rac1 ogg]$ ./ggsci Oracle GoldenGate Command Interpreter for Oracle Version 19.1.0.0.4 OGGCORE_19.1.0.0.0_PLATFORMS_191017.1054_FBO Linux, x64, 64bit (optimized), Oracle 11g on Oct 17 2019 23:13:12 Operating system character set identified as UTF-8. Copyright (C) 1995, 2019, Oracle and/or its affiliates. All rights reserved. GGSCI (rac1) 1> info all Program Status Group Lag at Chkpt Time Since Chkpt MANAGER RUNNING EXTRACT RUNNING E_BSS_1 00:00:00 00:00:03 EXTRACT RUNNING P_BSR_1 00:00:00 00:00:00 --查看集群服务状态,当前服务在节点一 crsctl stat res -t Cluster Resources -------------------------------------------------------------------------------- gg_vip_source 1 ONLINE ONLINE rac1 ora.LISTENER_SCAN1.lsnr 1 ONLINE ONLINE rac1 ora.bkrac1-vip.vip 1 ONLINE ONLINE rac1 ora.bkrac2-vip.vip 1 ONLINE ONLINE rac2 ora.cvu 1 ONLINE ONLINE rac1 ora.oc4j 1 ONLINE ONLINE rac2 ora.orcl.db 1 ONLINE ONLINE rac1 Open 2 ONLINE ONLINE rac2 Open ora.rac1.vip 1 ONLINE ONLINE rac1 ora.rac2.vip 1 ONLINE ONLINE rac2 ora.scan1.vip 1 ONLINE ONLINE rac1 xag.source_ogg.goldengate 1 ONLINE ONLINE rac1 GGSCI (rac1) 2> --关闭source_ogg实例,用oracle用户执行 agctl stop goldengate source_ogg --手动切换source_ogg实例到去其他节点,用oracle用户执行 agctl relocate goldengate source_ogg --node rac2
目标端OGG高可用配置:
1 配置VIP,ACFS,OGG软件安装参照源端配置即可
2 配置MGR进程:
- --需要注意配置的参数是autostart,因为高可用是依靠mgr去自动启动进程的
- GGSCI (rac1) 3> edit params mgr
- port 7809
- DYNAMICPORTLIST 7940-8100
- AUTOSTART ER *
- AUTORESTART ER *,RETRIES 10, WAITMINUTES 1, RESETMINUTES 60
- PURGEOLDEXTRACTS /ogg/dirdat/*,USECHECKPOINTS,MINKEEPDAYS 5
- LAGREPORTHOURS 1
- LAGINFOMINUTES 30
- LAGCRITICALMINUTES 45
- --启动mgr进程
- GGSCI (rac1) 5> start mgr
3 配置replicat应用进程:
--创建检查表 GGSCI> dblogin userid ogg password ogg GGSCI > add checkpointtable ogg.rep_ogg_ckpt --添加应用进程 GGSCI> add replicat R_ING_1,exttrail ./dirdat/rs,checkpointtable ogg.rep_ogg_ckpt --需要注意的是要通过tns去连接数据库,而不是ORACLE_SID方式,并且连接的tns需要可以访问到全部节点,最好连scan_ip --配置ogg连接数据库的tns ORCL = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = rac-scan)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = orcl) ) ) --配置应用进程 GGSCI (rac1) 3> edit params r_ing_1 REPLICAT R_ING_1 SETENV (ORACLE_HOME = "/oracle/app/oracle/product/12.2/db_1") SETENV (NLS_LANG = "AMERICAN_AMERICA.AL32UTF8") userid ogg@orcl, password ogg ASSUMETARGETDEFS DBOPTIONS DEFERREFCONST DBOPTIONS SUPPRESSTRIGGERS DISCARDFILE ./dirrpt/r_ing_1.dsc, APPEND, MEGABYTES 1000 DISCARDROLLOVER AT 6:00 REPERROR (DEFAULT, ABEND) REPERROR (24344, DISCARD) gettruncates DDL INCLUDE MAPPED , OBJTYPE 'TABLE' & INCLUDE MAPPED OBJTYPE 'INDEX' DDLOPTIONS REPORT
4 添加目标端OGG高可用服务实例:
--agctl添加高可用target_ogg实例,用oracle用户执行以及管理 --注意这里的database也可以不用设置,这里的设置主要是强关联database,这样当节点的database关闭会触发ogg failover agctl add goldengate target_ogg --gg_home /ogg \ --instance_type target \ --nodes rac1,rac2 \ --vip_name gg_vip_source \ --filesystems ora.mgmt.acfsvol.acfs \ --databases ora.orcl.db \ --oracle_home /oracle/app/oracle/product/12.2/db_1 \ --monitor_replicats R_ING_1 --查看实例配置 [oracle@rac1 ogg]$ agctl config goldengate target_ogg GoldenGate location is: /ogg GoldenGate instance type is: target Configured to run on Nodes: rac1 rac2 ORACLE_HOME location is: /oracle/app/oracle/product/12.2/db_1 Databases needed: ora.orcl.db File System resources needed: ora.mgmt.acfsvol.acfs VIP name: gg_vip_source EXTRACT groups to monitor: E_BSS_1,P_BSR_1 REPLICAT groups to monitor: R_ING_1 Critical EXTRACT groups: Critical REPLICAT groups: Autostart on DataGuard role transition to PRIMARY: no Autostart JAgent: no
5 启动目标端OGG高可用服务实例:
--开启source_ogg实例,用oracle用户执行 [oracle@rac1 ogg]$ agctl start goldengate target_ogg --node rac1 --会把mgr进程启动,mgr会把extract进程,pump进程启动 [oracle@rac1 ogg]$ ./ggsci [oracle@rac1 ogg]$ ./ggsci Oracle GoldenGate Command Interpreter for Oracle Version 19.1.0.0.4 OGGCORE_19.1.0.0.0_PLATFORMS_191017.1054_FBO Linux, x64, 64bit (optimized), Oracle 12c on Oct 18 2019 01:38:51 Operating system character set identified as UTF-8. Copyright (C) 1995, 2019, Oracle and/or its affiliates. All rights reserved. GGSCI (rac1) 1> info all Program Status Group Lag at Chkpt Time Since Chkpt MANAGER RUNNING REPLICAT RUNNING R_ING_1 00:00:00 00:00:00 --查看集群服务状态,当前服务在节点一 crsctl stat res -t Cluster Resources -------------------------------------------------------------------------------- gg_vip_source 1 ONLINE ONLINE rac1 STABLE ora.LISTENER_SCAN1.lsnr 1 ONLINE ONLINE rac2 STABLE ora.MGMTLSNR 1 ONLINE ONLINE rac1 169.254.179.17 192.1 68.2.10,STABLE ora.asm 1 ONLINE ONLINE rac1 Started,STABLE 2 ONLINE ONLINE rac2 Started,STABLE 3 OFFLINE OFFLINE STABLE ora.cvu 1 ONLINE ONLINE rac2 STABLE ora.mgmtdb 1 ONLINE ONLINE rac1 Open,STABLE ora.orcl.db 1 ONLINE ONLINE rac1 Open,HOME=/oracle/ap p/oracle/product/12. 2/db_1,STABLE 2 ONLINE ONLINE rac2 Open,HOME=/oracle/ap p/oracle/product/12. 2/db_1,STABLE ora.qosmserver 1 ONLINE ONLINE rac2 STABLE ora.rac1.vip 1 ONLINE ONLINE rac1 STABLE ora.rac2.vip 1 ONLINE ONLINE rac2 STABLE ora.scan1.vip 1 ONLINE ONLINE rac2 STABLE xag.target_ogg.goldengate 1 ONLINE ONLINE rac1 STABLE -------------------------------------------------------------------------------- GGSCI (rac1) 2> --关闭source_ogg实例,用oracle用户执行 agctl stop goldengate target_ogg --手动切换source_ogg实例到去其他节点,用oracle用户执行 agctl relocate goldengate target_ogg --node rac2
测试OGG高可用配置:
1 源端OGG节点主机重启:
--在源端开启会话,模拟表操作 declare v_count number(10):=0; begin for i in 1..1000000 loop insert into test.test select * from dba_objects where rownum<4; delete from test.test where rownum<2; update test.test set owner='aaa' where rownum<2; v_count:=v_count+1; if v_count>=1 THEN commit; v_count:=0; end if; end loop; commit; end; / --当前源端ogg所在节点主机rac1 [oracle@rac1 ~]$ agctl status goldengate source_ogg Goldengate instance 'source_ogg' is running on rac1 --重启主机rac1 --ogg实例服务会自动转移到节点二rac2 [oracle@rac2 ogg]$ agctl status goldengate source_ogg Goldengate instance 'source_ogg' is running on rac2 [oracle@rac2 ogg]$ ./ggsci Oracle GoldenGate Command Interpreter for Oracle Version 19.1.0.0.4 OGGCORE_19.1.0.0.0_PLATFORMS_191017.1054_FBO Linux, x64, 64bit (optimized), Oracle 11g on Oct 17 2019 23:13:12 Operating system character set identified as UTF-8. Copyright (C) 1995, 2019, Oracle and/or its affiliates. All rights reserved. GGSCI (rac2) 1> info all Program Status Group Lag at Chkpt Time Since Chkpt MANAGER RUNNING EXTRACT RUNNING E_BSS_1 00:00:02 00:00:10 EXTRACT RUNNING P_BSR_1 00:00:00 00:00:02 GGSCI (rac2) 2> --目标端进程无感知 --源端和目标端两边的数据同步一致
2 目标端OGG节点主机重启:
--当前目标端ogg所在节点主机rac1 [oracle@rac1 ogg]$ agctl status goldengate target_ogg Goldengate instance 'target_ogg' is running on rac1 --重启主机rac1 --ogg实例服务会自动转移到节点二rac2 [oracle@rac2 ~]$ agctl status goldengate target_ogg Goldengate instance 'target_ogg' is running on rac2 GGSCI (rac2) 1> info all Program Status Group Lag at Chkpt Time Since Chkpt MANAGER RUNNING REPLICAT RUNNING R_ING_1 00:00:00 00:00:08 GGSCI (rac2) 2> --源端的投递进程会短暂abend,因为vip要从目标节点一漂移到节点二,所以会短暂的连接不上,但由于配置了autorestart参数,所以mgr进程会尝试重新start投递进程 GGSCI (rac2) 11> info all --出现abend Program Status Group Lag at Chkpt Time Since Chkpt MANAGER RUNNING EXTRACT RUNNING E_BSS_1 00:00:02 00:00:06 EXTRACT ABENDED P_BSR_1 00:00:00 00:01:39 --自己恢复正常 GGSCI (rac2) 12> info all Program Status Group Lag at Chkpt Time Since Chkpt MANAGER RUNNING EXTRACT RUNNING E_BSS_1 00:00:02 00:00:00 EXTRACT RUNNING P_BSR_1 00:00:00 00:00:04 --源端和目标端两边的数据同步一致 source: SQL> select count(*) from test.test; COUNT(*) ---------- 294962 SQL> target: select count(*) from test1.test COUNT(*) ---------- 294962
联系客服