Tuesday, January 31, 2006

Setup of dual-node NFS failover Sun cluster

Here's the procedure, tested on Solaris 10 01/06 on both SPARC and x86 with SMI-labelled disk devices (smaller than 1TB). It still doesn't work with EFI-labelled ones; it looks like a resurrection of bug 5073220.
Your feedback is quite welcome.
UPD: SunCluster 3.1 doesn't support EFI-labelled disks
Thanks to Kristien, Peter, Tim, Matthias, Steve, Sanjay, Tony, Alex and Robert for their help.
SunCluster Solaris x86 SVM NFS

Monday, January 30, 2006

Almost 27...

Does it mean Elnino has 26.6 of them ?

Saturday, January 28, 2006

Crash anatomy

1.Entered kaif
2.Panic'ed
3.Ass Failed
[0]> $c
kaif_enter(1816400, 1823040, 108ac00, 0, 1816400, 184f000)
panicsys+0x408(1868550, 18553d0, 180c000, 0, 20040, 184b000)
vpanic+0xcc(1274d80, 2a1039a94e8, 0, ffffffffffffff01, ff, ff)
panic+0x1c(1274d80, 7b631810, 7b631828, 26b, 1, 0)
assfail+0x74(7b631810, 7b631828, 26b, 184f000, 1274c00, 0)
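
For anyone who wants to read the trace rather than the joke, here is a minimal sketch that turns the `$c` output above into call order and pulls out the line number of the failed assertion. It assumes the usual assfail(assertion, file, line) argument order, so the third argument (26b) is treated as a hex line number; the helper name parse_frames is mine, not part of mdb.

```python
import re

# mdb "$c" output copied from the entry above; the newest frame comes first.
STACK = """\
kaif_enter(1816400, 1823040, 108ac00, 0, 1816400, 184f000)
panicsys+0x408(1868550, 18553d0, 180c000, 0, 20040, 184b000)
vpanic+0xcc(1274d80, 2a1039a94e8, 0, ffffffffffffff01, ff, ff)
panic+0x1c(1274d80, 7b631810, 7b631828, 26b, 1, 0)
assfail+0x74(7b631810, 7b631828, 26b, 184f000, 1274c00, 0)
"""

# function name, optional +0xoffset, then the argument list in parentheses
FRAME_RE = re.compile(r"^(\w+)(?:\+0x[0-9a-f]+)?\((.*)\)$")


def parse_frames(text):
    """Hypothetical helper: return [(function, [args...]), ...] per frame."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            args = [a.strip() for a in m.group(2).split(",")]
            frames.append((m.group(1), args))
    return frames


frames = parse_frames(STACK)

# Reverse the list to get the order in which the functions were called.
call_order = [name for name, _ in reversed(frames)]

# Assumption: assfail's third argument is the source line, printed in hex.
assfail_args = dict(frames)["assfail"]
assert_line = int(assfail_args[2], 16)

print(" -> ".join(call_order))
print("assertion failed at line", assert_line)
```

Read bottom-up, the order is the reverse of the 1-2-3 list: the assertion failed first, then the panic, then the drop into kaif.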
SunCluster

Friday, January 27, 2006

Day without Guinness...


...is a day with Murphy's - see Cyril's Rainy evening at Murphy's
It was a very interesting discussion on OpenSolaris, software architecture and caboose hunting.
OpenSolaris Caboose

Wednesday, January 25, 2006

I found the problem!!! :)

The problem that consumed 100% of my CBU (core brain unit) is now solved. It was caused by this:
When MPxIO is enabled, SVM cannot work with disks that have long device names (64-byte target name length).
UPD1: replies are in this thread
UPD2: this bug has been known since 21-DEC-2004
UPD3: April 9, 2007 - still not fixed
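
To see how long such a name really is, here is a minimal sketch that splits a cXtYdZ device name and measures the target field. The device name is the one from the metaset test case in the January 22 entry; the helper split_ctd is hypothetical, and treating the target as hex-encoded ASCII is my assumption based on how this particular name looks.

```python
import re

# Device name taken from the metaset test case in the January 22 entry.
DEVICE = "c4t5849562D494E4320584956313033302020202020313031343020202020202033d0"


def split_ctd(name):
    """Hypothetical helper: split cNtTARGETdN into (controller, target, disk)."""
    m = re.fullmatch(r"c(\d+)t([0-9A-Fa-f]+)d(\d+)", name)
    if m is None:
        raise ValueError("not a cXtYdZ device name: " + name)
    return int(m.group(1)), m.group(2), int(m.group(3))


controller, target, disk = split_ctd(DEVICE)

# Assumption: the target field is ASCII text written out as hex digits.
decoded = bytes.fromhex(target).decode("ascii")

print("controller:", controller, "disk:", disk)
print("target length in characters:", len(target))
print("decoded target:", repr(decoded))
```

The length it reports is the number of characters in the target field, which is the part of the name that grows when MPxIO is enabled.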
Sun Cluster SVM Solaris10 Solarisx86 OpenSolaris

Sunday, January 22, 2006

SVM metaset problem - I was so close...IMHO...:(

...still trying to find a solution to the bug that totally blocked my project, I found the solution (according to my wishful thinking) in the following InfoDoc:
If the host is attached to disks without unique serial numbers ( for example, non-Sun qualified disks ), sd.conf needs to be modified so that the serial number is not used to generate a DevID for these disks:
sd-config-list="VendorID", "unsupported-hack"; unsupported-hack=1,0x8,0,0,0,0,0;


Our storage doesn't support the INQUIRY 0x80 page ("Unit Serial Number"), so the missing serial number may well be the issue. I did the following:
1.cleaned the metadb and added the following to /kernel/drv/sd.conf:

sd-config-list="myvendorid", "unsupported-hack";
unsupported-hack=1, 0x8, 0, 0, 0, 0, 0;


rebooted and ran the test case - no success, got
# metaset -s w1 -a c4t5849562D494E4320584956313033302020202020313031343020202020202033d0
metaset: myhost: /: No such file or directory

2.cleaned the metadb and added the following to /kernel/drv/md.conf:

md_devid_destroy=1;
md_keep_repl_state=1;

to disable the DevID according to this doc,
rebooted and ran the test case - no success, got the same error

I was almost sure this was my case, but it looks like it's not :(
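
When a config tweak has no effect, it is worth ruling out a typo in the file first. Below is a minimal sketch that parses the two fragments from the steps above and prints what they actually set. The parse_conf helper is hypothetical and only understands simple `name=value;` entries with `#` comments, not the full driver.conf grammar.

```python
def parse_conf(text):
    """Hypothetical helper: return {property: value string} for 'name=value;' entries."""
    props = {}
    # Drop '#' comments, then split what is left on the ';' terminators.
    body = " ".join(line.split("#", 1)[0] for line in text.splitlines())
    for entry in body.split(";"):
        if "=" in entry:
            name, value = entry.split("=", 1)
            props[name.strip()] = value.strip()
    return props


# Fragments copied from steps 1 and 2 above.
SD_CONF = '''
sd-config-list="myvendorid", "unsupported-hack";
unsupported-hack=1, 0x8, 0, 0, 0, 0, 0;
'''
MD_CONF = '''
md_devid_destroy=1;
md_keep_repl_state=1;
'''

sd = parse_conf(SD_CONF)
md = parse_conf(MD_CONF)

# The second string in sd-config-list names the property that holds the tuple.
vendor, prop_name = [s.strip().strip('"') for s in sd["sd-config-list"].split(",")]

# Base 0 lets int() accept both decimal and 0x-prefixed values.
hack_values = [int(v, 0) for v in sd[prop_name].split(",")]

print("vendor id:", vendor)
print(prop_name, "=", hack_values)
print("md_devid_destroy =", md["md_devid_destroy"])
print("md_keep_repl_state =", md["md_keep_repl_state"])
```

In practice the text would be read from /kernel/drv/sd.conf and /kernel/drv/md.conf instead of the inline strings.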
OpenSolaris SVM Solaris10

Saturday, January 21, 2006

Re: New Community Proposal: Naming Services

During the 1st IOSUG meeting LK raised some very painful questions about Solaris/Windows name services interoperability. Yesterday a new OpenSolaris community was proposed: Name Services. The guys are talking about the same things: LDAP, Name Service Switch (NSS), Active Directory interoperability, user identity and authentication, ...
+1
OpenSolaris

Biggest Jewish family business was sold

BCF, the legendary Monroe Milstein's family business, was sold for $2B

Friday, January 20, 2006

new features/fixes of sd/ssd driver

If this patch really fixes these problems we'll love it.
driver config files need an "exclude" option
x86 needs to support large LUNs
Only part of disk is usable by fdisk or format on Solaris
Cannot revert EFI labeled disk back to VTOC
SCSI driver (sd) needs to cope with >2Tb

UPD: Now the EFI label can be converted to VTOC(SMI)

Tuesday, January 17, 2006

Casper who?

In the bug description related to the sd driver appears the name "Casper Disk".
Guess who will be mentioned in the st driver bug?
Technorati: OpenSolaris

Thursday, January 12, 2006

Turbocharging an NFS server

When out-of-the-box S10/x86 showed mediocre NFS performance (measured by the SPEC SFS benchmark), the following measures more than doubled the results:
1.increase the dnlc size (set ncsize = 0x100000 in /etc/system)
2.increase the number of concurrent NFS requests (NFSD_SERVERS=128 1024 in /etc/default/nfs)
3.increase the size of the VxFS inode table (yes, I am using VxFS)
4.increase the FC HBA queue depth to 256 commands
The dual AMD Opteron 252 box reached 11200 ops/sec with response time 2.4 msec with 24 processes on one load generator.
Kudos to Cyril who helped me to tune the machine.
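
A quick sanity check on these numbers, as a minimal sketch: it converts the ncsize value to decimal and applies Little's law (mean requests in flight = throughput x response time) to the benchmark result. All figures are the ones quoted above; the interpretation per load process is my own rough estimate, not something SPEC SFS reports.

```python
# Figures quoted in the entry above.
NCSIZE = 0x100000        # DNLC size set in /etc/system
OPS_PER_SEC = 11200      # SPEC SFS throughput
RESPONSE_MSEC = 2.4      # average response time in milliseconds
LOAD_PROCESSES = 24      # processes on the single load generator


def in_flight(ops_per_sec, response_msec):
    """Little's law: mean outstanding requests = throughput * latency."""
    return ops_per_sec * response_msec / 1000.0


outstanding = in_flight(OPS_PER_SEC, RESPONSE_MSEC)

print("ncsize entries:", NCSIZE)
print("mean requests in flight: %.2f" % outstanding)
print("per load process: %.2f" % (outstanding / LOAD_PROCESSES))
```

If the estimate holds, the server sees only a few dozen requests in flight at this load level.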
Solaris10 Solarisx86

Sunday, January 08, 2006

Wanna develop OpenSolaris and make some $$? Go to China!

"As Software Engineering Manager you will be responsible for a team of Software Development engineers or a team of Test Suite software engineers who will work on the design and development of UNIX / Open Solaris operating system including the Java Desktop System and related applications. You and your team of engineers will have advanced Open Source, Java and Solaris application development skills. You will proactively develop and execute plans to budget by providing strong leadership and management to your team."

Company: Executive Marque
Type: Full-time
Experience: Mid-Senior level
Function: Management
Industry: Computer Software
Location: Beijing (China)
Salary: US$50,000 to 120,000
Job Code: L-811
Date Posted: January 7, 2006
Technorati: OpenSolaris

Thursday, January 05, 2006

motd

"I am using vi on solaris but it is not as simple as vi on linux."
[solarisx86] vi

Wednesday, January 04, 2006

Bummer

The Adaptec SATA Host Controller 1205SA worked fine under S10 03/05, but now, after an upgrade to S10 01/06, it causes a lot of problems. I've spent too much time trying to make the system work. I am tired.
Solaris10 Solarisx86