30 March 2007
Sachin, broadcasting live from his blog Oracle Online Help, has posted a nice primer on the differences between three join types that anyone who has ever look at a query plan will be familiar with: nested loops, hash joins, and sort-merge joins.
I definitely sleep a little better knowing what these mean. Now if only I knew what everything else meant.
Hopefully I can carry the torch and keep the flame burning and whatever other metaphors one can think of.
Also, both of my readers may have noticed that I haven't posted that much this week. Alas, my duties as president of the Manitowoc Ultimate Frisbee Concern have been pulling me away from the blogging. We're organizing a tournament that we're hoping will be a success, and I've been focusing on drafting a sponsorship proposal to get funds and goodies for the players. Hopefully this thing is a hit with everyone and we rock it every year.
26 March 2007
I've become so spoiled with Oracle's system of read consistency and multiversioning that nothing else comes close to making any sense. Tom doesn't name any names, but I'd like to know what DBMSes employ some of the insane logic that he describes "others" doing.
I've added Jeremy's blog to my blogroll now, and he's had some great write-ups. So he's got a better blog, appears to be younger, fitter, and much more knowledgable about Oracle than me. But I bet he doesn't have a son that makes awesome horse sounds. BEAT THAT SCHNEIDER.
And now to tell you why "wait" ... the Oracle 10g wait interface!!!!!11 Feel my excitement. I just recently read some articles by Robin Schumacher about the 10g wait interface to analyze possible reasons for why a query might take so long. Is it waiting for disk I/O? Is it waiting for CPU? Well Oracle 10g provides some great views to see exactly what is waiting on what. The Schumacher articles are "Response Time Analysis Made Easy in Oracle Database 10g" and "Exploring the Oracle Database 10g Release 1 Wait Interface". The latter has the "only applies to Release 1" disclaimer, so caveat emptor. The articles do overlap a bit, so perhaps the first article will suit you enough.
I have to admit that a lot of it has left my brain already. I'll have to take timeouts and do random checks. I did run the query to get database wait time and CPU time ratios. Let's just say that my production database was nowhere near the great ratios that Robin was seeing. Hopefully some of the adjustments I plan to put in next month will bump things in my favor.
22 March 2007
More often than not it will be because one or more of the objects they are trying to access has a blocking lock on it from another user session. This is usually the result of an uncommitted transaction, which can be a sign of poor application design. Transactions should be committed or rolled back when they are done. Sometimes this is not done, as when an error occurs in one of the statements in the transaction. In those cases, the individual statement will roll back, but the transaction will remain uncommitted. Applications need to catch this and (probably) roll back the transaction as well.
But back to the original task of detecting a blocking lock! There is a great OraFAQ article on the subject. I encourage you to read that, and I'll just cheat and post the meaty query they use to get the goods:
As you can see, this will tell you the user/host of all blocking locks, and the user/host of who they are blocking. Now you can venture forth to see what the blocking session is doing and begin the process of fixing some bad transactional application code. Or you can just kill it.SQL> select s1.username || '@' || s1.machine 2 || ' ( SID=' || s1.sid || ' ) is blocking ' 3 || s2.username || '@' || s2.machine || ' ( SID=' || s2.sid || ' ) ' AS blocking_status 4 from v$lock l1, v$session s1, v$lock l2, v$session s2 5 where s1.sid=l1.sid and s2.sid=l2.sid 6 and l1.BLOCK=1 and l2.request > 0 7 and l1.id1 = l2.id1 8 and l1.id2 = l2.id2 ; BLOCKING_STATUS ---------------------------------------------------------------------------- BULKLOAD@yttrium ( SID=422 ) is blocking BULKLOAD@yttrium ( SID=479 ) 1 row selected.
20 March 2007
you probably want to double-check your java executable:
# java -versionIf you see the gcj, kill it with fire and install the Sun version (or remove the gcj-based /usr/bin/java that was ahead of your Sun executable in the $PATH):
java version "1.4.2"
gcj (GCC) 3.4.6 20060404 (Red Hat 3.4.6-3)
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
# java -versionThen all will be well in the world again.
java version "1.4.2_13"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_13-b06)
Java HotSpot(TM) Client VM (build 1.4.2_13-b06, mixed mode)
19 March 2007
Julien Lamarche at the Pythian blog lists some tips for new DBAs.
15 March 2007
Fast forward to today. I decide to finally set up Oracle's EM dbconsole web UI for administration. I had done this in the past on some other instances with absolutely no problems. So imagine my surprise when it fails to start the dbconsole after installation. When I tried to start it manually, I get this message:
The agentTZRegion value (US/Central) in /u00/app/oracle/product/10.2.0/db_1/foo.world_dev64/sysman/config/emd.properties does not match the current environment TZ setting(US/Central).This error makes no sense at all. US/Central doesn't match US/Central? After some googling I found a note that this occurs when the Oracle software isn't patched, but the operating system is, which is the case here. So Oracle was thinking US/Central (CST), but the operating system was saying US/Central (CDT). CURSES!
So I ended up downloading the latest Oracle one-off patch for 10.2.0.2 (with the version 4 timezone fixes, HI CANADA!) and, after a pretty painless patching process, all is well. Of course I did ignore the "best practice" of having separate ORACLE_HOMEs for your ASM and RDBMS installations, so I did have to shutdown both, but since this RDBMS instance is the only one using this ASM instance, it really didn't matter. It also helps that this is my test 64-bit migration server and no one was using it today but me. :p
I thought I could thumb my nose at Congress and their silly DST changes, but my love of the GUI brought me down. ;_;
Look for a site at tbcsu.info in the coming weeks. Classes will be starting soon.
14 March 2007
09 March 2007
P.S. -- Just in case you didn't know, "Tom" is Tom Kyte.
Out of a frothy mixture of hope and curiosity, I checked out his upcoming shows, and was pleasantly surprised to see Wisconsin listed! But ... PLOVER?!?!?! I only live in Manitowoc and even I think Plover is in the middle of nowhere.
Jonathan reminds me of an ex-coworker Pascal, who has an album out of his own. They both wear glasses and have shaggy hair (well Pascal did when I knew him around 11 years ago). There are some other similarities but I really just wanted to give Pascal a plug since he's a nice guy.
Anyway, I'll leave you with this clip of Jonathan performing the aforementioned Code Monkey. Enjoy.
So I went about setting sga_target and sga_max_size to 8 GB (8192M) and restarted the instance ... or attempted to. The instance refused to start again, giving me this:
ORA-27102: out of memory
Linux-x86_64 Error: 28: No space left on device
Some searching around led me to this post from Paul Moen over at Pythian (friends of this blog ... more on that later). It seems we had undervalued our shmall as do a lot of people. As Paul wrote:
The real godfather, the wizard behind the curtain is shmall. Its value determines the maximum amount of memory that ALL shared memory can take.Another review of puschitz (puschitz real good) confirmed things. After some tinkering by my SA, we now have an 8 GB SGA. I did originally set it to 16 GB, which might be a goal in production, but probably best to start modest with 8 GB for a multi-instance development box.
Just to make it fun, the actual setting is derived…
the maximum amount of memory = shmall * pagesize
where pagesize = getconf PAGE_SIZE and shmall = cat /proc/sys/kernel/shmall
Making shmall larger than free RAM is a recipe for paging hell and much gnashing of teeth. Oracle recommends half the RAM, we pushed the envelope and chose 75% as 8 gigabytes of free for OS and cache is just wasteful.
Especially given Oracle is already caching hot blocks in its memory.
08 March 2007
I should say "fixed" because it's only a problem if you want non-oracle/oinstall users to use utilities like sqlplus locally. Not totally unheard-of by any means, but some people really lock down their servers, and would prefer the tight perms.
Anyway, the script is $ORACLE_HOME/install/changePerm.sh, and it will loosen things up for normal OS users.
For the unwashed masses (from their homepage):
cx_Oracle is a Python extension module that allows access to Oracle databases and conforms to the Python database API specification.Makes me realize how I wish I could find more stuff to use python in. Most of my reporting is done in perl, and not worth converting. Most of my maintenance scripts (rman, etc.) are bash scripts calling the Oracle executables themselves.
Also be thankful that the Hoff is there in case you have fat fingers!
07 March 2007
[10:59] <otay> <rizzo> excuse me public in channel or i nuke you with my bomber net. NOW.
06 March 2007
However be aware that if the listener is started after the Oracle instance(s), then it may take the instance PMON up to 60 seconds to re-try. If you don't feel like waiting for what may seem like an eternity, you can manually tell your instance to register immediately with:
ALTER SYSTEM REGISTER;
The Oracle Database Net Services Administrator's Guide states it pretty plainly:
If the listener is not running when an instance starts, PMON is not able to register the service information. PMON attempts to connect to the listener periodically, however, it may take up to 60 seconds before PMON registers with the listener after it has been started. To initiate service registration immediately after the listener is started, use the SQL statementMy exam prep guide, written during 10gR1, seems to imply that Oracle instances will not automatically re-try to register, and that ALTER SYSTEM REGISTER is required if the listener was not already up:
ALTER SYSTEM REGISTER. This is especially useful in high-availability configurations.
You will need to re-register your instance with the listener with "alter system" if you have restarted the listener, or if you started the database instance before starting the listener.Perhaps the 60-second retry was introduced in 10gR2, or perhaps I'm just reading too much into that sentence. It does indeed work on 10gR2, as my ASM instance demonstrated last week. I don't have a 9i or 10gR1 instance handy to test it on though.
asmcmd basically provides a command-line interface to the ASM "file system". Since ASM uses raw disk, traditional OS tools are powerless to view the precious files within. asmcmd lets you navigate the contents of your ASM instance's disk groups with commands that should be familiar to any Linux user, including wildcard matching.
RMAN-00571: ===========================================================But otherwise RMAN said the duplication was finished and the database was opened. I could log in locally as SYSDBA just fine, but logins through the listener gave me those two PL/SQL errors again, as did any attept to use PL/SQL.
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-12005: error during channel cleanup
ORA-06544: PL/SQL: internal error, arguments: , , , , , , , 
ORA-06553: PLS-801: internal error 
Thankfully, my blogging rival hali in #oracle suggested ultirp.sql, which is one of the many provided scripts in your $ORACLE_HOME/rdbms/admin/ directory, along with its brother, utlrp.sql. From the comments of utlirp.sql:
This script can be used to invalidate and all pl/sql modules (procedures, functions, packages, types, triggers, views) in a database.That second paragraph described me to a tee. So I kicked it off (have to start the instance in upgrade mode):
This script must be run when it is necessary to regenerate the compiled code because the PL/SQL code format is inconsistent with the Oracle executable (e.g., when migrating a 32 bit database to a 64 bit database or vice-versa).
Please note that this script does not recompile invalid objects automatically. You must restart the database and explicitly invoke utlrp.sql to recompile invalid objects.
SQL> shutdown immediate;Once that finishes, it tells you to then restart in normal mode and run utlrp.sql to recompile the objects it just invalidated.
SQL> startup upgrade;
SQL> shutdown immediate;and VOILA! Errors gone. Of course, we still have to do the rest of the application testing! Thanks to hali for the tip, and for agreeing to let me blog about it this time.