ORA-01591故障处理

news/2024/6/18 1:58:17 标签: sqlserver, oracle, microsoft, crm, 数据库, 文档
 

早晨到办公室听同事说表被锁了,一试,发现表中某字段为1111111的行都被锁了,SELECT都不行。报错误ORA-01591,打开TOAD的KnowledgeeXpert,描述很少,只是说由于分布式事务错误而造成锁定。询问同事,昨天通过一个调用另一个存储过程出了错误,而后者通过透明网关insert一些数据到数据库
立即想到打开OEM,谁知道大失所望,进入锁,根本没发现相关的对象被锁定,开始有点郁闷。转而检查会话,该用户有5个会话,都是INACTIVE,不管三七二十一,全部杀掉。结果依旧,并且锁也没有出现。远程登陆上主机,发现和都正常,也没有发现透明网关进程挂死(之前曾发现TG4SQL在无业务量时也会出现25%左右的CPU,挂死)。
突然想到看看alert.log,经过仔细搜索,终于发现:

WedNov1700:00:042004
Errorsinfiled:“oracle“admin“xdcj“udump“xdcj_j006_3020.trc:
ORA-12012:自动执行作业82出错
ORA-01591:锁定已被有问题的分配事务处理6.5.887985挂起
ORA-06512:在line6

这正是出错的地方,往前追溯:

TueNov1617:35:042004
Error28500trappedin2PContransaction6.5.887985.Cleaningup.
Errorstackreturnedtouser:
ORA-02054:事务处理6.5.887985有问题
ORA-28500:连接ORACLE到非Oracle系统时返回此:
[TransparentgatewayforMSSQL]
ORA-02063:紧接着2lines(源于ZSMOS_CRM)
TueNov1617:35:042004
DISTRIBTRANQDCJ.US.ORACLE.COM.5ae32328.6.5.887985
islocaltran6.5.887985
insertpendingpreparedtran,scn=6606197672830
TueNov1617:35:072004
Errorsinfiled:“oracle“admin“xdcj“bdump“xdcj_reco_3024.trc:
ORA-28500:connectionfromORACLEtoanon-Oraclesystemreturnedthismessage:
[TransparentgatewayforMSSQL][Microsoft][ODBCSQLServerDriver][SQLServer]用户'RECOVER'登录失败。
ORA-02063:preceding2linesfromZSMOS_CRM

TueNov1617:35:122004
Errorsinfiled:“oracle“admin“xdcj“bdump“xdcj_reco_3024.trc:
ORA-28500:connectionfromORACLEtoanon-Oraclesystemreturnedthismessage:
[TransparentgatewayforMSSQL][Microsoft][ODBCSQLServerDriver][SQLServer]用户'RECOVER'登录失败。
ORA-02063:preceding2linesfromZSMOS_CRM

这就是事发地点了。看来是昨天下午远程事务失败,但是又没有返回造成分布式事务挂死,从而锁定了行。终于找到了详细的错误ORA-02054,进入TOAD一查,说是要等待或者提交该事务,可是怎么操作呢。还是打开官方文档搜索相关内容,在AdminstratorGuide中发现如下内容:
DiscoveringProblemswithaTwo-PhaseCommit
Theuserapplicationthatcommitsadistributedtransactionisinformedofaproblembyoneofthefollowingerrormessages:

ORA-02050:transactionIDrolledback,
someremotedbsmaybein-doubt
ORA-02051:transactionIDcommitted,
someremotedbsmaybein-doubt
ORA-02054:transactionIDin-doubt


Arobustapplicationshouldsaveinformationaboutatransactionifitreceivesanyoftheaboveerrors.Thisinformationcanbeusedlaterifmanualdistributedtransactionrecoveryisdesired.

Noactionisrequiredbytheadministratorofanynodethathasoneormorein-doubtdistributedtransactionsduetoanetworkorsystemfailure.TheautomaticrecoveryfeaturesofOracletransparentlycompleteanyin-doubttransactionsothatthesameoutcomeoccursonallnodesofasessiontreeafterthenetworkorsystemfailureisresolved.

Inextendedoutages,however,youcanforcethecommitorrollbackofatransactiontoreleaseanylockeddata.Applicationsmustaccountforsuchpossibilities.

DeterminingWhethertoPerformaManualOverride
Overrideaspecificin-doubttransactionmanuallyonlywhenoneofthefollowingsituationsexists:

Thein-doubttransactionlocksdatathatisrequiredbyothertransactions.ThissituationoccurswhentheORA-01591errormessageinterfereswithusertransactions.
Anin-doubttransactionpreventstheextentsofarollbacksegmentfrombeingusedbyothertransactions.Thefirstportionofanin-doubtdistributedtransaction'slocaltransactionIDcorrespondstotheIDoftherollbacksegment,aslistedbythedatadictionaryviewsDBA_2PC_PENDINGandDBA_ROLLBACK_SEGS.
Thefailurepreventingthetwo-phasecommitphasestocompletecannotbecorrectedinanacceptabletimeperiod.Examplesofsuchcasesincludeatelecommunicationnetworkthathasbeendamagedoradamageddatabasethatrequiresalongrecoverytime.
Normally,youshouldmakeadecisiontolocallyforceanin-doubtdistributedtransactioninconsultationwithadministratorsatotherlocations.Awrongdecisioncanleadtodatabaseinconsistenciesthatcanbedifficulttotraceandthatyoumustmanuallycorrect.

Iftheconditionsabovedonotapply,alwaysallowtheautomaticrecoveryfeaturesofOracletocompletethetransaction.Ifanyoftheabovecriteriaaremet,however,consideralocaloverrideofthein-doubttransaction.

看来是建议差不多,后面Oracle总是试图登录SQlServer就是要自动恢复,可是总不成功。察看视图DBA_2PC_PENDING确实发现了该事务的痕迹。要怎样操作呢?

ManuallyCommittinganIn-DoubtTransaction
Beforeattemptingtocommitthetransaction,ensurethatyouhavetheproperprivileges.Notethefollowingrequirements:

Ifthetransactionwascommittedby...Thenyoumusthavethisprivilege...
You
FORCETRANSACTION

Anotheruser
FORCEANYTRANSACTION


CommittingUsingOnlytheTransactionID
ThefollowingSQLstatementcommitsanin-doubttransaction:

COMMITFORCE'transaction_id';


Thevariabletransaction_idistheidentifierofthetransactionasspecifiedineithertheLOCAL_TRAN_IDorGLOBAL_TRAN_IDcolumnsoftheDBA_2PC_PENDINGdatadictionaryview.

Forexample,assumethatyouqueryDBA_2PC_PENDINGanddeterminethatLOCAL_TRAN_IDforadistributedtransactionis1:45.13.

YouthenissuethefollowingSQLstatementtoforcethecommitofthisin-doubttransaction:

COMMITFORCE'1.45.13';

CommittingUsinganSCN
Optionally,youcanspecifytheSCNforthetransactionwhenforcingatransactiontocommit.Thisfeatureallowsyoutocommitanin-doubttransactionwiththeSCNassignedwhenitwascommittedatothernodes.

Consequently,youmaintainthesynchronizedcommittimeofthedistributedtransactionevenifthereisafailure.SpecifyanSCNonlywhenyoucandeterminetheSCNofthesametransactionalreadycommittedatanothernode.

Forexample,assumeyouwanttomanuallycommitatransactionwiththefollowingglobaltransactionID:

SALES.ACME.COM.55d1c563.1.93.29

First,querytheDBA_2PC_PENDINGviewofaremotedatabasealsoinvolvedwiththetransactioninquestion.NotetheSCNusedforthecommitofthetransactionatthatnode.SpecifytheSCNwhencommittingthetransactionatthelocalnode.Forexample,iftheSCNis829381993,issue:

COMMITFORCE'SALES.ACME.COM.55d1c563.1.93.29',829381993;

SeeAlso:
Oracle9iSQLReferenceformoreinformationaboutusingtheCOMMITstatement


ManuallyRollingBackanIn-DoubtTransaction
Beforeattemptingtorollbackthein-doubtdistributedtransaction,ensurethatyouhavetheproperprivileges.Notethefollowingrequirements:

Ifthetransactionwascommittedby...Thenyoumusthavethisprivilege...
You
FORCETRANSACTION

Anotheruser
FORCEANYTRANSACTION


ThefollowingSQLstatementrollsbackanin-doubttransaction:

ROLLBACKFORCE'transaction_id';


Thevariabletransaction_idistheidentifierofthetransactionasspecifiedineithertheLOCAL_TRAN_IDorGLOBAL_TRAN_IDcolumnsoftheDBA_2PC_PENDINGdatadictionaryview.

Forexample,torollbackthein-doubttransactionwiththelocaltransactionIDof2.9.4,usethefollowingstatement:

ROLLBACKFORCE'2.9.4';

于是登陆数据库
COMMITFORCE'6.5.887985';
然后查看DBA_2PC_PENDING发现状态已经改为'COMMITFORCE',SELECT该表相关行,一切正常。至此,故障解决。
总体来看,直接INSERT...TABLENAME@SQLDBLK还是很危险的,遇上不能正常返回就出问题了。Oracle的文档是推荐使用包或者存储过程来解决,此后建议同事改用此方法,目前已经测试通过。


http://www.niftyadmin.cn/n/1552489.html

相关文章

一次web 服务器无法连接上oracle 数据库的故障处理

今天早上维护人员打来电话说某移动的114 web server 无法连接到数据库,web server 报一大堆jdbc 的错误,最后报 sql error,但是并没有明显的ORA- 的错误,第一反映应该不是oracle 数据库的问题,估计是web server 与数据库连接出现了…

分析共享池脚本

分析共享池脚本 参考《oracle性能优化实务》 SQL> col "avg size" format a30 truncate; SQL> col siz format 999999999999 SQL> SELECT KSMCHCLS CLASS, COUNT(KSMCHCLS) NUM, SUM(KSMCHSIZ) SIZ,2 To_char( ((SUM(KSMCHSIZ)/COUNT(KSMCHCLS)/1024)),999…

oracle数据库常见故障处理

一、定位数据库故障原因。定位原因大概可以分三步走:1、如果有oracle错误号或者alert日志中有详细的出错信息,则可以根据这些去定位数据库故障原因;2、如果没有,则可以运行awr工具或者statspack工具生成报告,根据报告去…

共享池碎片化分析脚本

共享池逐渐碎片化是正常现象,oracle有自动合并内存的机制来解决碎片化,如果这个机制解决不了问题,那么考虑业务少的时候刷新共享池(alter system flush shared_pool;)或重启实例。 SQL> set line 200 SQL> col s…

drop user cascade出现ORA-04043问题的解决

问题: SQL> drop user hbylinit cascade;drop user hbylinit cascadeORA-00604: 递归 SQL 级别 1 出现错误ORA-04043: 对象 SYS_YOID0000104160$ 不存在分析:ora-4043就是提示对象不存在,一般在写错对象名的时候都会报这个错误。推测出…

优化sql解析

使用绑定变量调整CURSOR_SHARING 1)、EXACT:通常来说,exact值是Oracle推荐的,也是默认的,它要求SQL语句在完全相同时才会重用,否则会被重新执行硬解析操作。 2)、SIMILAR:similar是在Oracle认为某条SQL语句…

共享池抖动分析脚本

参考自《oracle性能优化实务》 SQL> set line 200 SQL> alter session set nls_date_formatyyyy-mm-dd hh24:mi:ss;Session altered.SQL> col component format a40 truncate; SQL> select component,oper_type, oper_mode, start_time, end_time, trunc(target_si…

数据库缩小表空间

案例环境:今天启动应用程序,程序报错如下:ExceptionMessageORA-01653: 表 HBXNB_CS.BZ29 无法通过 1024 (在表空间 USERS 中) 扩展ORA-06512: 在 "HBXNB_CS.DBMS_CONTEXT", line 40ORA-01653: 表 HBXNB_CS.BZ29 无法通过 1024 (在表…